Easy-to-use and general-purpose machine learning in Python
DMTK provides a parameter server based framework for training machine learning models on big data with numbers of machines. It is currently a standard C++ library and provides a series of friendly programming interfaces. | This feature store is built to simplify machine learning (ML) feature management and improve developer productivity. It automatically computes your feature values and joins them to your training data, using point-in-time-correct semantics to avoid data leakage, and supports materializing and deploying your features for use online in production. |
DMTK Framework: a flexible framework that supports unified interface for data parallelization, hybrid data structure for big model storage, model scheduling for big model training, and automatic pipelining for high training efficiency.;
LightLDA, an extremely fast and scalable topic model algorithm, with a O(1) Gibbs sampler and an efficient distributed implementation.;
Distributed (Multisense) Word Embedding, a distributed version of (multi-sense) word embedding algorithm. | Define features;
Get those features by their names;
Share features;
Rich UDF support |
Statistics | |
GitHub Stars 2.7K | GitHub Stars - |
GitHub Forks 559 | GitHub Forks - |
Stacks 4 | Stacks 10 |
Followers 18 | Followers 11 |
Votes 0 | Votes 0 |
Integrations | |
| No integrations available | |

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.

scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

PyTorch is not a Python binding into a monolothic C++ framework. It is built to be deeply integrated into Python. You can use it naturally like you would use numpy / scipy / scikit-learn etc.

Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on TensorFlow or Theano. https://keras.io/

The Kubeflow project is dedicated to making Machine Learning on Kubernetes easy, portable and scalable by providing a straightforward way for spinning up best of breed OSS solutions.

Use flexible and intuitive APIs to build and train models from scratch using the low-level JavaScript linear algebra library or the high-level layers API

An enterprise-grade open source platform for building, training, and monitoring large scale deep learning applications.

It is the app framework specifically for Machine Learning and Data Science teams. You can rapidly build the tools you need. Build apps in a dozen lines of Python with a simple API.

MLflow is an open source platform for managing the end-to-end machine learning lifecycle.

H2O.ai is the maker behind H2O, the leading open source machine learning platform for smarter applications and data products. H2O operationalizes data science by developing and deploying algorithms and models for R, Python and the Sparkling Water API for Spark.