Druid vs Pig: What are the differences?

What is Druid? Fast column-oriented distributed data store. Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. It supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
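
To make the ideas of column-oriented storage, filtered aggregates, and exact versus approximate calculations concrete, here is a minimal sketch in plain Python. It does not use Druid or its query API; the table, the column names, and the helper functions are all hypothetical, and the approximate count uses a K-minimum-values estimator chosen purely for illustration, not because Druid is known to use it.

```python
import hashlib
import heapq

# Hypothetical columnar table: one list per column, rows aligned by index.
columns = {
    "country": ["US", "DE", "US", "FR", "US", "DE"],
    "user":    ["a", "b", "a", "c", "d", "b"],
    "clicks":  [3, 5, 2, 7, 1, 4],
}

def filtered_sum(cols, dim, value, metric):
    """Sum `metric` over rows whose `dim` column equals `value`.

    Only the two columns involved are read, which is the main
    benefit of a column-oriented layout for aggregate queries.
    """
    return sum(m for d, m in zip(cols[dim], cols[metric]) if d == value)

def exact_distinct(values):
    """Exact distinct count: memory grows with the number of distinct values."""
    return len(set(values))

def approx_distinct(values, k=256):
    """Approximate distinct count with a K-minimum-values estimator.

    Keeps only the k smallest distinct hash values (scaled to [0, 1)),
    so memory is bounded by k regardless of input size.
    """
    heap = []        # max-heap (stored negated) of the k smallest hashes
    members = set()  # hashes currently in the heap, to skip duplicates
    for v in values:
        digest = hashlib.sha256(str(v).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big") / 2**64
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heapreplace(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: count is exact
    return int((k - 1) / -heap[0])

us_clicks = filtered_sum(columns, "country", "US", "clicks")  # 3 + 2 + 1 = 6
unique_users = exact_distinct(columns["user"])                # 4
```

The approximate version trades a small error for bounded memory, which is the usual reason an analytics store offers both kinds of calculation.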

What is Pig? Platform for analyzing large data sets. Pig is a dataflow programming environment for processing very large files. Pig's language is called Pig Latin. A Pig Latin program consists of a directed acyclic graph where each node represents an operation that transforms data. Operations are of two flavors: (1) relational-algebra style operations such as join, filter, project; (2) functional-programming style operators such as map, reduce.
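
The dataflow idea can be sketched in plain Python, with each step standing in for one node of the graph. This is not Pig Latin and does not run on Pig; the data, field positions, and helper names are hypothetical. The helpers loosely mirror the relational-style operations (filter, project, join) and a reduce-style aggregation.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical inputs: (user, page, seconds) and (user, country).
visits = [
    ("alice", "/home", 3),
    ("bob", "/cart", 12),
    ("alice", "/cart", 7),
    ("carol", "/home", 1),
]
users = [("alice", "US"), ("bob", "DE"), ("carol", "US")]

def filter_rows(rows, pred):
    """Relational filter: keep rows matching the predicate."""
    return [r for r in rows if pred(r)]

def project(rows, *indexes):
    """Relational projection: keep only the given field positions."""
    return [tuple(r[i] for i in indexes) for r in rows]

def join(left, right, lkey, rkey):
    """Inner hash join on left[lkey] == right[rkey]."""
    index = defaultdict(list)
    for r in right:
        index[r[rkey]].append(r)
    return [l + r for l in left for r in index.get(l[lkey], [])]

def group_reduce(rows, key_idx, val_idx, fn, initial):
    """Group by one field, then fold each group's values with `fn`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key_idx]].append(r[val_idx])
    return {k: reduce(fn, vs, initial) for k, vs in groups.items()}

# Each assignment is one node; its inputs are the edges of the graph.
long_visits = filter_rows(visits, lambda r: r[2] >= 3)
slim = project(long_visits, 0, 2)            # (user, seconds)
joined = join(slim, users, 0, 0)             # (user, seconds, user, country)
seconds_by_country = group_reduce(joined, 3, 1, lambda a, b: a + b, 0)
# seconds_by_country == {"US": 10, "DE": 12}
```

Because every step only consumes the outputs of earlier steps, the pipeline forms a directed acyclic graph, which is what lets a system like Pig plan and distribute the work.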

Druid and Pig can be categorized as "Big Data" tools.

Druid and Pig are both open source tools. Judging by GitHub activity, Druid (8.31K stars, 2.08K forks) appears to have wider adoption than Pig (583 stars, 449 forks).

Airbnb, Instacart, and Dial Once are some of the popular companies that use Druid, whereas Pig is used by Netflix, Outbrain, and Cobrain. Druid is also listed more widely: it appears in 24 company stacks and 12 developer stacks, compared with 9 company stacks and 4 developer stacks for Pig.

What are some alternatives to Druid and Pig?

HBase
Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop.
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Prometheus
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats, and Logstash make up the Elastic Stack (sometimes called the ELK Stack).