What is Apache Parquet?
It is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language.
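To make "columnar" concrete, here is a minimal pure-Python sketch. It is only a toy model of the idea, not Parquet's on-disk layout or any library's API; the sample records and the helper name `to_columns` are assumptions made up for illustration.

```python
# Toy illustration of row-oriented vs. columnar layout.
# This is NOT Parquet's file format; `to_columns` is a hypothetical helper.

rows = [
    {"user": "ann", "country": "DE", "clicks": 3},
    {"user": "bob", "country": "AR", "clicks": 7},
    {"user": "cho", "country": "DE", "clicks": 5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# A query that needs only one field can read a single contiguous list
# instead of touching every full record.
total_clicks = sum(columns["clicks"])
print(total_clicks)
```

Storing values of the same field together is also what makes per-column encoding and compression practical, since neighbouring values share a type and often repeat.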
Apache Parquet is a tool in the Big Data Tools category of a tech stack.
Apache Parquet is an open source tool with 953 GitHub stars and 833 GitHub forks.
Who uses Apache Parquet?
9 companies reportedly use Apache Parquet in their tech stacks, including Plista GmbH, Grandata, and Yotpo.
8 developers on StackShare have stated that they use Apache Parquet.
Apache Parquet Integrations
Java, Hadoop, Apache Hive, Apache Impala, and Apache Thrift are some of the popular tools that integrate with Apache Parquet; 6 integrations are listed in total.
Apache Parquet's Features
- Columnar storage format
- Type-specific encoding
- Pig integration
- Cascading integration
- Crunch integration
- Apache Arrow integration
- Scrooge integration
- Adaptive dictionary encoding
- Predicate pushdown
- Column stats
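Three of the features above (dictionary encoding, column stats, and predicate pushdown) can be sketched with a small pure-Python toy model. This is an illustration under stated assumptions, not how any Parquet implementation is written: the names `dictionary_encode`, `RowGroup`, and `scan_greater_than` are hypothetical, the "adaptive" part of dictionary encoding is not modelled, and real files keep statistics in their metadata rather than in Python objects.

```python
# Toy model of dictionary encoding, column stats, and predicate pushdown.
# All names here are hypothetical and do not correspond to a Parquet library API.
from dataclasses import dataclass


def dictionary_encode(values):
    """Replace repeated values with small integer codes into a dictionary."""
    dictionary = []
    index_of = {}
    codes = []
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index_of[value])
    return dictionary, codes


@dataclass
class RowGroup:
    """A chunk of one integer column plus its min/max statistics."""
    values: list
    min_value: int
    max_value: int

    @classmethod
    def from_values(cls, values):
        values = list(values)
        return cls(values=values, min_value=min(values), max_value=max(values))


def scan_greater_than(row_groups, threshold):
    """Return matching values and how many row groups were skipped via stats."""
    matches = []
    skipped = 0
    for group in row_groups:
        if group.max_value <= threshold:
            skipped += 1  # the stats prove no value in this group can match
            continue
        matches.extend(v for v in group.values if v > threshold)
    return matches, skipped


dictionary, codes = dictionary_encode(["DE", "AR", "DE", "DE", "AR"])
groups = [RowGroup.from_values(chunk) for chunk in ([1, 4, 2], [10, 12, 11], [3, 20, 5])]
matches, skipped = scan_greater_than(groups, threshold=9)
print(dictionary, codes, matches, skipped)
```

The point of the sketch is the early `continue`: when a filter can be checked against stored minimum and maximum values, whole chunks of a column can be skipped without reading their contents.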
Apache Parquet Alternatives & Comparisons
What are some alternatives to Apache Parquet?
Apache Avro is a row-oriented remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format.
A new addition to the open source Apache Hadoop ecosystem, Kudu completes Hadoop's storage layer to enable fast analytics on fast data.
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.