
ClickHouse vs Druid: What are the differences?

Introduction

ClickHouse and Druid are both powerful analytical databases that are designed to handle large volumes of data and provide fast query performance. While they share some similarities, there are several key differences between the two.

  1. Architecture: ClickHouse is a columnar database optimized for online analytical processing (OLAP) workloads. It uses a shared-nothing architecture: data is spread across multiple nodes, and each node stores and processes its own share. Druid is also a distributed, column-oriented OLAP database, but it splits ingestion, querying, and coordination into separate processes and keeps a durable copy of its data segments in shared deep storage (for example S3 or HDFS). Separating storage from compute in this way lets each part scale horizontally on its own.

  2. Data Model: ClickHouse uses a relational data model with tables, columns, and rows, queried through its own SQL dialect. Druid organizes each datasource around a timestamp, dimensions, and metrics, which suits time series data. It offers a JSON-based native query language as well as Druid SQL.

  3. Data Ingestion: ClickHouse supports batch and real-time data ingestion through methods such as bulk inserts, file imports, and Kafka integration. Materialized views can transform data as it is inserted. Druid is designed for real-time data streaming and supports high-speed ingestion from sources like Kafka and AWS Kinesis. It also supports batch ingestion for historical data.

  4. Scalability: ClickHouse scales horizontally by adding servers to the cluster. Data is split into shards, and a sharding key determines which shard receives each row, so queries can run on all shards in parallel. Druid also scales horizontally by adding nodes; its data is divided into segments that are spread across the nodes serving queries.

  5. Query Performance: ClickHouse is known for fast analytical queries, especially aggregations and complex calculations over many rows. A single server can scan on the order of hundreds of millions of rows per second, helped by data compression, sparse primary indexes, and vectorized execution. Druid is designed for real-time analytics and provides low-latency queries on large data sets, using caching and bitmap indexes on dimensions to speed up filtering.

  6. Use Cases: ClickHouse is commonly used for OLAP workloads, ad-hoc analytics, and business intelligence applications. It is popular in industries like e-commerce, finance, and telecommunications. Druid, on the other hand, is often used for real-time analytics, monitoring, and visualization of time series data. It is used in applications like user behavior tracking, log analytics, and IoT analytics.
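The data model difference in item 2 can be made concrete with a small sketch. It builds the same hourly aggregation twice: once as a Druid native (JSON) timeseries query and once as ClickHouse SQL. The table/datasource name `events` and the columns `ts`, `country`, and `clicks` are hypothetical, chosen only for illustration; nothing here is sent to a server.

```python
import json


def druid_timeseries_query(datasource, start, end, country):
    """Build a Druid native timeseries query as a plain dict.

    The datasource and field names are hypothetical examples.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "hour",
        # ISO-8601 interval, applied to Druid's primary timestamp column
        "intervals": [f"{start}/{end}"],
        "filter": {"type": "selector", "dimension": "country", "value": country},
        "aggregations": [
            {"type": "longSum", "name": "clicks", "fieldName": "clicks"}
        ],
    }


def clickhouse_sql(table, start, end, country):
    """Build a roughly equivalent ClickHouse SQL statement.

    Values are interpolated directly to keep the sketch short; real code
    should use the client's query parameters instead.
    """
    return (
        f"SELECT toStartOfHour(ts) AS hour, sum(clicks) AS clicks "
        f"FROM {table} "
        f"WHERE country = '{country}' AND ts >= '{start}' AND ts < '{end}' "
        f"GROUP BY hour ORDER BY hour"
    )


if __name__ == "__main__":
    print(json.dumps(druid_timeseries_query("events", "2024-01-01", "2024-01-02", "US"), indent=2))
    print(clickhouse_sql("events", "2024-01-01", "2024-01-02", "US"))
```

The Druid version describes the query as structured data (query type, interval, filter, aggregations), while the ClickHouse version expresses the same intent as a SQL statement over a table.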

In summary, ClickHouse and Druid differ in their architecture, data model, data ingestion methods, scalability, query performance, and use cases. Each database has its own strengths, and the choice between them should follow from the specific requirements of the use case.
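As a rough illustration of the horizontal scaling described under Scalability, the sketch below spreads rows across shards by hashing a key. This is a generic hash-partitioning example, not the actual placement logic of ClickHouse or Druid; the function names and the `user_id` field are hypothetical, and CRC32 is used only because it is a stable hash available in the standard library.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (CRC32) modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def distribute(rows, key_field, num_shards):
    """Group rows into per-shard lists based on the value of key_field."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(str(row[key_field]), num_shards)].append(row)
    return shards


if __name__ == "__main__":
    rows = [{"user_id": i % 10, "clicks": i} for i in range(100)]
    for index, shard in enumerate(distribute(rows, "user_id", 4)):
        print(f"shard {index}: {len(shard)} rows")
```

Rows with the same key always land on the same shard, so each node can aggregate its own share independently before the partial results are combined.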

Pros of ClickHouse
  • Fast, very very fast (21)
  • Good compression ratio (11)
  • Horizontally scalable (7)
  • Utilizes all CPU resources (6)
  • RESTful (5)
  • Open-source (5)
  • Great CLI (5)
  • Great number of SQL functions (4)
  • Buggy (4)
  • Server crashes its normal :( (3)
  • Highly available (3)
  • Flexible connection options (3)
  • Has no transactions (3)
  • ODBC (2)
  • Flexible compression options (2)
  • In IDEA data import via HTTP interface not working (1)

Pros of Druid
  • Real Time Aggregations (15)
  • Batch and Real-Time Ingestion (6)
  • OLAP (5)
  • OLAP + OLTP (3)
  • Combining stream and historical analytics (2)
  • OLTP (1)


Cons of Clickhouse
Cons of Druid
  • Slow insert operations (5)
  • Limited sql support (3)
  • Joins are not supported well (2)
  • Complexity (1)


What is ClickHouse?

ClickHouse allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.

What is Druid?

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

What are some alternatives to ClickHouse and Druid?
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
InfluxDB
InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.