It is a deep learning optimization library that makes distributed training easy, efficient, and effective. It can train DL models with over a hundred billion parameters on the current generation of GPU clusters while achieving more than a 5x improvement in system performance over the state of the art. Early adopters of DeepSpeed have already produced a language model (LM) with over 17B parameters called Turing-NLG, establishing a new SOTA in the LM category. | It is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend, both for efficiency and to take advantage of reverse-mode auto-differentiation to define and compute the gradients of complex functions. |
Distributed Training with Mixed Precision; Model Parallelism; Memory and Bandwidth Optimizations; Simplified training API; Gradient Clipping; Automatic loss scaling with mixed precision; Simplified Data Loader; Performance Analysis and Debugging | Perform feature detection; Perform data augmentation in the GPU; Perform image filtering and edge detection; Differentiable computer vision library |
Statistics | |
GitHub Stars - | GitHub Stars 10.8K |
GitHub Forks - | GitHub Forks 1.1K |
Stacks 11 | Stacks 14 |
Followers 16 | Followers 6 |
Votes 0 | Votes 0 |
Integrations | |

Cloudinary is a cloud-based service that handles the entire image and video management workflow for websites and mobile applications: uploads, storage, administration, manipulations, and delivery.

imgix is the leading platform for end-to-end visual media processing. With robust APIs, SDKs, and integrations, imgix empowers developers to optimize, transform, manage, and deliver images and videos at scale through simple URL parameters.
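As a rough illustration of what "transform through simple URL parameters" can look like, the sketch below appends resize and crop parameters to an image URL using only the Python standard library. The domain `example.imgix.net` and the image path are hypothetical placeholders, and `with_params` is a helper made up for this example, not part of any imgix SDK. The parameter names `w`, `h`, and `fit` are assumed here to follow imgix's rendering API; check the official documentation before relying on them.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_params(base_url, **params):
    """Return base_url with the given parameters encoded as its query string."""
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical source domain and image path, used only for illustration.
base = "https://example.imgix.net/photos/cat.jpg"

# Request a 400x300 rendition, cropped to fit (assumed parameter names).
url = with_params(base, w=400, h=300, fit="crop")
print(url)
```

The point of the design is that the transformation lives entirely in the URL, so the same source image can be requested in different sizes without any server-side code changes.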

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.
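The data flow graph idea can be sketched in a few lines of plain Python. This is a toy model written for this comparison, not TensorFlow's API: the `Node` class, `constant`, and `evaluate` are hypothetical names, and plain floats stand in for tensors. Each node is an operation, and its `inputs` are the edges that carry values into it.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Node:
    """One operation in the graph; `inputs` are the edges feeding it."""
    name: str
    op: Callable[..., float]
    inputs: Tuple["Node", ...] = ()


def constant(name, value):
    # A source node with no incoming edges.
    return Node(name, lambda: value)


def evaluate(node, cache=None):
    """Evaluate `node` after evaluating the nodes on its incoming edges."""
    if cache is None:
        cache = {}
    if node.name not in cache:
        args = [evaluate(parent, cache) for parent in node.inputs]
        cache[node.name] = node.op(*args)
    return cache[node.name]


# Build the graph for (a + b) * b; nothing is computed until evaluate() runs.
a = constant("a", 2.0)
b = constant("b", 3.0)
total = Node("add", lambda x, y: x + y, (a, b))
product = Node("mul", lambda x, y: x * y, (total, b))

result = evaluate(product)
print(result)
```

Separating graph construction from evaluation is what lets a runtime decide where each operation runs, which is the property the description above attributes to TensorFlow's architecture.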

OpenCV was designed for computational efficiency and with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. Enabled with OpenCL, it can take advantage of the hardware acceleration of the underlying heterogeneous compute platform.

scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

PyTorch is not a Python binding into a monolithic C++ framework. It is built to be deeply integrated into Python. You can use it naturally, just as you would use NumPy, SciPy, scikit-learn, etc.

ImageKit offers a real-time URL-based API for image & video optimization, streaming, and 50+ transformations to deliver perfect visual experiences on websites and apps. It also comes integrated with a Digital Asset Management solution.

Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on TensorFlow or Theano. https://keras.io/

The Kubeflow project is dedicated to making Machine Learning on Kubernetes easy, portable, and scalable by providing a straightforward way to spin up best-of-breed OSS solutions.

Use flexible and intuitive APIs to build and train models from scratch using the low-level JavaScript linear algebra library or the high-level layers API.