Help developers discover the tools you use. Get visibility for your team's tech choices and contribute to the community's knowledge.
Atlas is one foundation to manage and provide visibility to your servers, containers, VMs, configuration management, service discovery, and additional operations services. | It is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data. |
One command to develop any application: vagrant up;One command to deploy any application: vagrant push | Datasets (Tables) schema and usage frequency/popularity; Users bookmark, owner, frequent user; Dashboard popularity, lineage to datasets |
Statistics | |
Stacks 33 | Stacks 17 |
Followers 125 | Followers 42 |
Votes 0 | Votes 0 |
Integrations | |
| No integrations available | |

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

You can use AWS CloudFormation’s sample templates or create your own templates to describe the AWS resources, and any associated dependencies or runtime parameters, required to run your application. You don’t need to figure out the order in which AWS services need to be provisioned or the subtleties of how to make those dependencies work.

Distributed SQL Query Engine for Big Data

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Packer automates the creation of any type of machine image. It embraces modern configuration management by encouraging you to use automated scripts to install and configure the software within your Packer-made images.

Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.

It is an open-source data version control system for data lakes. It provides a “Git for data” platform enabling you to implement best practices from software engineering on your data lake, including branching and merging, CI/CD, and production-like dev/test environments.

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Scalr is a remote state & operations backend for Terraform with access controls, policy as code, and many quality of life features.

Pulumi is a cloud development platform that makes creating cloud programs easy and productive. Skip the YAML and just write code. Pulumi is multi-language, multi-cloud and fully extensible in both its engine and ecosystem of packages.