
What is Delta Lake?

An open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads.
Delta Lake is a tool in the Big Data Tools category of a tech stack.
Delta Lake is an open source tool with 6.9K GitHub stars and 1.6K GitHub forks. Its source repository is hosted on GitHub.

Who uses Delta Lake?

Companies
9 companies reportedly use Delta Lake in their tech stacks, including XTRM-Data, Peak-AI, and Compile Inc.

Developers
86 developers on StackShare have stated that they use Delta Lake.

Delta Lake Integrations

Amazon S3, Apache Spark, Hadoop, Databricks, and StreamSets are some of the popular tools that integrate with Delta Lake. In total, 8 tools integrate with Delta Lake.

Decisions about Delta Lake

Here are some stack decisions, common use cases and reviews by companies and developers who chose Delta Lake in their tech stack.

We are building a cloud-based analytical app. Most of the data for the UI is supplied from SQL Server to Delta Lake, and then from Delta Lake to Azure Cosmos DB as JSON using Databricks, so that the API can send it to the front end. Sometimes we get larger documents while transforming table rows into JSON, and they exceed the 2 MB document size limit of Cosmos DB. What is the best solution for replacing Cosmos DB?
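One way to see how often the problem above occurs is to measure each row's serialized size before it is written. The following is a minimal sketch, assuming rows are already available as Python dictionaries and that the 2 MB figure from the question is the relevant threshold. The names `MAX_DOC_BYTES` and `split_by_size` are hypothetical, and the target store may count size differently from a plain UTF-8 JSON encoding.

```python
import json

# Assumed threshold: the 2 MB per-document limit mentioned in the question.
MAX_DOC_BYTES = 2 * 1024 * 1024


def split_by_size(rows, max_bytes=MAX_DOC_BYTES):
    """Partition rows into (fits, oversized) by UTF-8 encoded JSON size."""
    fits, oversized = [], []
    for row in rows:
        payload = json.dumps(row, ensure_ascii=False).encode("utf-8")
        # Route each row by the byte length of its serialized form.
        (fits if len(payload) <= max_bytes else oversized).append(row)
    return fits, oversized
```

Rows that land in the oversized group could then be split, compressed, or sent to a different store, depending on which replacement is chosen.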


Delta Lake's Features

  • ACID Transactions
  • Scalable Metadata Handling
  • Time Travel (data versioning)
  • Open Format
  • Unified Batch and Streaming Source and Sink
  • Schema Enforcement
  • Schema Evolution
  • 100% Compatible with Apache Spark API
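To make a few of these features concrete, here is a toy in-memory model of all-or-nothing appends, schema enforcement, and reading an older version. It is an illustrative sketch only: the class `ToyVersionedTable` and its methods are hypothetical, it does not use or resemble the Delta Lake API, and it says nothing about how Delta Lake implements these features on disk.

```python
class ToyVersionedTable:
    """Toy in-memory model of an append-only commit log. Not Delta Lake's API."""

    def __init__(self, schema):
        # schema maps column name -> expected Python type
        self.schema = dict(schema)
        self._commits = []  # each commit is a list of rows

    def append(self, rows):
        """Validate every row first, then commit all of them or none."""
        rows = [dict(r) for r in rows]
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"columns {sorted(row)} do not match schema")
            for col, typ in self.schema.items():
                if not isinstance(row[col], typ):
                    raise TypeError(f"column {col!r} expects {typ.__name__}")
        self._commits.append(rows)
        return len(self._commits) - 1  # version number of this commit

    def read(self, version=None):
        """Return all rows as of a version (default: latest)."""
        if version is None:
            version = len(self._commits) - 1
        return [row for commit in self._commits[: version + 1] for row in commit]
```

In this sketch, a batch containing a single bad row is rejected as a whole, and `read(version=0)` returns the table as it stood after the first commit, regardless of later appends.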

Delta Lake Alternatives & Comparisons

What are some alternatives to Delta Lake?
Snowflake
Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn.
Apache Spark
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
JavaScript
JavaScript is best known as the scripting language for Web pages, but it is also used in many non-browser environments, such as Node.js or Apache CouchDB. It is a prototype-based, multi-paradigm scripting language that is dynamic and supports object-oriented, imperative, and functional programming styles.
Git
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
GitHub
GitHub is the best place to share code with friends, co-workers, classmates, and complete strangers. Over three million people use GitHub to build amazing things together.

Delta Lake's Followers
312 developers follow Delta Lake to keep up with related blogs and decisions.