lakeFS

Open source data version control system for data lakes

What is lakeFS?

lakeFS is an open-source data version control system for data lakes. It provides a “Git for data” platform that lets you apply software engineering best practices to your data lake, including branching and merging, CI/CD, and production-like dev/test environments.
lakeFS is a tool in the Big Data Tools category of a tech stack.
lakeFS is an open source tool with 4.5K GitHub stars and 367 GitHub forks. Its source repository is hosted on GitHub.

lakeFS Integrations

Python, Amazon S3, Kafka, Airflow, and Presto are some of the popular tools that integrate with lakeFS. In total, 18 tools integrate with lakeFS.
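As a rough illustration of the Python and Amazon S3 integrations, the sketch below assumes lakeFS's S3-compatible endpoint, where the repository name takes the place of the bucket and the object key starts with the branch (or other ref) name. The helper `lakefs_s3_location` and the names `example-repo`, `main`, and `raw/events.csv` are hypothetical and used only for illustration.

```python
def lakefs_s3_location(repository: str, ref: str, path: str) -> tuple[str, str]:
    """Hypothetical helper: map (repository, ref, path) to an S3 (bucket, key) pair.

    Assumption: the S3-compatible endpoint treats the repository as the bucket
    and expects the key to be prefixed with the branch or ref name.
    """
    key = f"{ref}/{path.lstrip('/')}"
    return repository, key


# Illustrative names only; none of these come from a real deployment.
bucket, key = lakefs_s3_location("example-repo", "main", "/raw/events.csv")
s3_uri = f"s3://{bucket}/{key}"
print(s3_uri)

# With an S3 client such as boto3, the pair would typically be passed as
# Bucket=bucket and Key=key, with endpoint_url pointing at the lakeFS server
# instead of AWS. That call is not made here, so the sketch runs offline.
```

Reading the same path from a different branch would then only require changing the `ref` argument.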
Pros of lakeFS

  • Full reproducibility (2 votes)
  • Easy integration with other tools (2 votes)
  • Cloud agnostic (2 votes)
  • Scalability (2 votes)
  • Open Source (2 votes)
  • Format agnostic (2 votes)
  • Highly Scalable (2 votes)
  • Inexpensive (2 votes)
  • Available On prem (2 votes)
  • Doesn't require local copies of the data (2 votes)
  • Easy to use (2 votes)
  • Big Data Scale (2 votes)
  • Strong Team (2 votes)
  • Scales to big data (2 votes)
  • Cloud agnostics (2 votes)
  • Supports unstructured data (2 votes)
  • SaaS (2 votes)
  • Highly performant (1 vote)
  • Great Git integration (1 vote)
  • Supports both data engineering and data science (1 vote)

lakeFS's Features

  • Zero copy version management
  • Any data format: structured, unstructured, open table formats, etc.
  • Scales to petabytes and millions of objects with negligible performance impact
  • Seamless integration with your entire data stack
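
The idea behind zero copy version management can be sketched with a toy model: each branch is a map of pointers into a shared object store, so creating a branch copies pointers but no data. This is a simplified illustration under that assumption, not the lakeFS API or its actual implementation; `ToyRepo` and all names in it are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Toy model of zero-copy versioning; NOT the lakeFS API or implementation."""

    # Shared object store: object id -> bytes. Each write is stored once.
    objects: dict = field(default_factory=dict)
    # Branch name -> {path: object id}. Branches hold pointers, not data.
    branches: dict = field(default_factory=lambda: {"main": {}})

    def put(self, branch: str, path: str, data: bytes) -> None:
        oid = f"obj-{len(self.objects)}"
        self.objects[oid] = data
        self.branches[branch][path] = oid

    def get(self, branch: str, path: str) -> bytes:
        return self.objects[self.branches[branch][path]]

    def create_branch(self, name: str, source: str) -> None:
        # Only the pointer map is copied; no object data is duplicated.
        self.branches[name] = dict(self.branches[source])

    def merge(self, source: str, dest: str) -> None:
        # Naive merge: source pointers overwrite dest, with no conflict detection.
        self.branches[dest].update(self.branches[source])


repo = ToyRepo()
repo.put("main", "events/day1.parquet", b"v1")
repo.create_branch("experiment", "main")
stored_after_branch = len(repo.objects)  # branching added no objects

repo.put("experiment", "events/day1.parquet", b"v2")
main_view = repo.get("main", "events/day1.parquet")  # main is isolated from the change

repo.merge("experiment", "main")
merged_view = repo.get("main", "events/day1.parquet")
```

In this model a branch costs one small dictionary regardless of how much data it references, which is the property the feature list describes; a real system also needs commits, conflict handling, and garbage collection, all omitted here.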

lakeFS Alternatives & Comparisons

What are some alternatives to lakeFS?
MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
PostgreSQL
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
Redis
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.
Amazon S3
Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web.

lakeFS's Followers
3 developers follow lakeFS to keep up with related blogs and decisions.