Druid vs Pachyderm: What are the differences?
Druid: Fast column-oriented distributed data store. Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. It excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets, and it supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
Pachyderm: MapReduce without Hadoop. Analyze massive datasets with Docker. Pachyderm is an open source MapReduce engine that uses Docker containers for distributed computations.
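For readers unfamiliar with the MapReduce model mentioned in the Pachyderm description, here is a minimal single-process sketch of the pattern in plain Python. It does not use Pachyderm's or Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the word-count task are assumptions chosen purely for illustration. In a distributed engine, each phase would run in parallel across workers (Docker containers, in Pachyderm's case).

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    # Map: emit one (key, value) pair per word in the record.
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: collapse the values for one key into a single result.
    return key, sum(values)


def map_reduce(records):
    # Run the three phases in sequence over an iterable of text records.
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records, invented for this example.
    logs = ["druid query", "pachyderm pipeline", "druid ingest"]
    print(map_reduce(logs))
```

The map and reduce steps are kept as independent pure functions because that independence is what lets a real engine distribute them across many machines.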
Druid and Pachyderm can be categorized as "Big Data" tools.
Druid and Pachyderm are both open source tools. With 8.31K GitHub stars and 2.08K forks, Druid appears to have broader adoption than Pachyderm, which has 3.81K stars and 369 forks.
Pros of Druid
- Real Time Aggregations (15)
- Batch and Real-Time Ingestion (6)
- OLAP (5)
- OLAP + OLTP (3)
- Combining stream and historical analytics (2)
- OLTP (1)
Pros of Pachyderm
- Containers (3)
- Versioning (1)
- Can run on GCP or AWS (1)
Cons of Druid
- Limited SQL support (3)
- Joins are not supported well (2)
- Complexity (1)
Cons of Pachyderm
- Recently acquired by HPE, uncertain future (1)