Druid vs Pig: What are the differences?
What is Druid? Fast column-oriented distributed data store. Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
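To make "column-oriented" and "filtered aggregate query" concrete, here is a minimal plain-Python sketch. It does not use Druid or its API; the table, the column names, and the helper functions `filtered_sum` and `exact_distinct` are all hypothetical and exist only to illustrate the idea of storing one array per column and scanning only the columns a query needs.

```python
# Toy column-oriented table: one list per column, rows aligned by index.
# All names and values here are made up for illustration.
columns = {
    "country": ["US", "DE", "US", "FR", "US"],
    "page": ["home", "home", "docs", "docs", "home"],
    "clicks": [3, 5, 2, 7, 4],
}


def filtered_sum(table, filter_col, filter_val, measure_col):
    """Sum measure_col over rows where filter_col equals filter_val.

    Only the two named columns are read; the others are never touched,
    which is the main benefit of a columnar layout for aggregates.
    """
    return sum(
        measure
        for value, measure in zip(table[filter_col], table[measure_col])
        if value == filter_val
    )


def exact_distinct(table, col):
    """Exact distinct count of one column (no approximation)."""
    return len(set(table[col]))


us_clicks = filtered_sum(columns, "country", "US", "clicks")
distinct_pages = exact_distinct(columns, "page")
print(us_clicks, distinct_pages)
```

A real column store adds compression, indexing, and distribution on top of this layout, and an approximate algorithm would trade the exact `set` for a fixed-size sketch structure.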
What is Pig? Platform for analyzing large data sets. Pig is a dataflow programming environment for processing very large files. Pig's language is called Pig Latin. A Pig Latin program consists of a directed acyclic graph in which each node represents an operation that transforms data. Operations come in two flavors: (1) relational-algebra style operations such as join, filter, and project; (2) functional-programming style operators such as map and reduce.
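The two flavors of operation described above can be sketched as a small dataflow in plain Python. This is not Pig Latin and does not use Pig; the relations `users` and `visits`, their fields, and the helpers `filter_rows`, `join`, and `project` are hypothetical stand-ins chosen to show how each step consumes the output of an earlier one, forming a directed acyclic graph.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input relations, invented for this example.
users = [
    {"user_id": 1, "name": "ana", "age": 34},
    {"user_id": 2, "name": "bo", "age": 17},
    {"user_id": 3, "name": "cy", "age": 52},
]
visits = [
    {"user_id": 1, "url": "/home"},
    {"user_id": 1, "url": "/docs"},
    {"user_id": 3, "url": "/home"},
    {"user_id": 2, "url": "/home"},
]


# Relational-algebra style operations.
def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """Inner equi-join of two relations on a shared key field."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index[l[key]]]


def project(rows, fields):
    """Keep only the named fields of each row."""
    return [{f: row[f] for f in fields} for row in rows]


# The dataflow: each node reads the result of an earlier node.
adults = filter_rows(users, lambda row: row["age"] >= 18)
joined = join(adults, visits, "user_id")
projected = project(joined, ["name", "url"])

# Functional-programming style operators.
urls = list(map(lambda row: row["url"], projected))
counts = reduce(lambda acc, url: {**acc, url: acc.get(url, 0) + 1}, urls, {})

print(counts)
```

In Pig, a comparable pipeline would be written as a sequence of named relations, and the engine would plan and run the resulting graph over distributed data instead of in-memory lists.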
Druid and Pig can be categorized as "Big Data" tools.
Druid and Pig are both open source tools. Judging by GitHub activity, Druid appears to be more widely adopted: it has 8.31K stars and 2.08K forks, compared with 583 stars and 449 forks for Pig.
Airbnb, Instacart, and Dial Once are some of the popular companies that use Druid, whereas Pig is used by Netflix, Outbrain, and Cobrain. Druid is also listed more often, appearing in 24 company stacks and 12 developer stacks, compared with 9 company stacks and 4 developer stacks for Pig.