

Druid vs HBase: What are the differences?

What is Druid? A fast, column-oriented, distributed data store. Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

What is HBase? The Hadoop database: a distributed, scalable big data store. Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop.
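As a rough illustration of the two access patterns these descriptions point to, the sketch below contrasts an aggregate query over a column-oriented layout with a point lookup by row key, the access pattern Bigtable-style stores are built around. It is plain Python with no Druid or HBase client involved; the `events` data, field names, and helper functions are all hypothetical, and the code does not reflect either system's actual storage format or API.

```python
from collections import defaultdict

# Hypothetical event rows; the field names are illustrative only.
events = [
    {"row_key": "user1#001", "country": "US", "clicks": 3},
    {"row_key": "user2#001", "country": "DE", "clicks": 5},
    {"row_key": "user1#002", "country": "US", "clicks": 2},
]


def to_columns(rows):
    """Column-oriented layout: one list of values per field."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def sum_by(columns, group_col, value_col):
    """Aggregate query: reads only the two columns it needs."""
    totals = defaultdict(int)
    for group, value in zip(columns[group_col], columns[value_col]):
        totals[group] += value
    return dict(totals)


def build_key_index(rows):
    """Key-value layout: row key -> full row, for point lookups."""
    return {row["row_key"]: row for row in rows}


columns = to_columns(events)
clicks_by_country = sum_by(columns, "country", "clicks")  # analytics-style scan

index = build_key_index(events)
point_row = index["user2#001"]  # single-row fetch by key
```

The aggregate touches every value in two columns and ignores the rest, while the point lookup fetches one complete row without scanning; this is only a toy model of why the two tools tend to be chosen for different workloads.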

Druid belongs to the "Big Data Tools" category of the tech stack, while HBase is primarily classified under "Databases".

"Real Time Aggregations" is the primary reason developers cite for choosing Druid over its competitors, whereas "Performance" is the key factor cited for picking HBase.

Druid and HBase are both open source tools. With 8.31K GitHub stars and 2.08K forks, Druid appears to be more popular than HBase, which has 2.91K stars and 2.01K forks.

Pinterest, HubSpot, and hike are some of the popular companies that use HBase, whereas Druid is used by Airbnb, Instacart, and Dial Once. HBase has broader approval, being mentioned in 54 company stacks and 18 developer stacks, compared to Druid, which is listed in 24 company stacks and 12 developer stacks.

Pros of Druid
  • Real Time Aggregations (14 votes)
  • Batch and Real-Time Ingestion (5 votes)
  • OLAP (4 votes)
  • OLAP + OLTP (3 votes)
  • Combining stream and historical analytics (2 votes)
  • OLTP (1 vote)

Pros of HBase
  • Performance (9 votes)
  • OLTP (5 votes)
  • Fast Point Queries (1 vote)


Cons of Druid
  • Limited SQL support (3 votes)
  • Joins are not supported well (2 votes)
  • Complexity (1 vote)

Cons of HBase
  • None listed yet




What are some alternatives to Druid and HBase?

MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

Prometheus
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.

Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats, and Logstash make up the Elastic Stack (sometimes called the ELK Stack).

ClickHouse
ClickHouse allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.