

Clickhouse vs TimescaleDB: What are the differences?

Introduction:

ClickHouse and TimescaleDB are both popular database systems used for time-series data analysis and processing. While they share some similarities, there are several key differences between the two. This article aims to highlight these differences and provide a clear understanding of which database may be more suitable for specific use cases.

  1. Architecture: ClickHouse is a columnar database, meaning it stores data in columnar format which allows for efficient compression and better query performance for analytical workloads. On the other hand, TimescaleDB is an extension of PostgreSQL, using a row-oriented storage model with hypertables for time-series data. This allows for easy integration with existing PostgreSQL infrastructure and tools.

  2. Scalability: ClickHouse is designed for massive scalability and can handle high volumes of data and concurrent queries efficiently. It uses a distributed architecture that allows for horizontal scaling across multiple servers. TimescaleDB, on the other hand, is designed primarily to scale vertically on a single server, although it can also be deployed in a multi-node cluster to handle larger workloads.

  3. Query Language: ClickHouse uses its own SQL dialect called ClickHouse SQL, which is optimized for analytical queries and supports a wide range of analytical functions and operations. TimescaleDB, being an extension of PostgreSQL, uses standard SQL with additional time-series specific functions and extensions like time_bucket and continuous aggregates.

  4. Data Model: ClickHouse is not schemaless: every table is created with explicitly named and typed columns. Schema changes are lightweight, though, and columns can be added or removed with ALTER TABLE without downtime. TimescaleDB likewise follows a strict schema, inherited from PostgreSQL, where tables are defined with predefined columns and data types.

  5. Data Ingestion: ClickHouse provides various methods for data ingestion, including native INSERT statements, bulk loading from formats like CSV and JSON, and streaming ingestion from Apache Kafka, with ingested data replicated across nodes in distributed setups. TimescaleDB also supports several ingestion methods, including native inserts and the COPY command, and data can be replicated using logical or streaming replication.

  6. Data Partitioning: ClickHouse supports automatic data partitioning based on a user-defined partition key, allowing for efficient data storage and retrieval. It can partition data based on time intervals, hash values, or other user-defined keys. TimescaleDB uses hypertables and automatic time-based partitioning by default, making it easy to store and query time-series data efficiently.
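The time-based grouping behind functions like time_bucket and behind time-based partitioning can be illustrated with a rough sketch in plain Python. This is not TimescaleDB's or ClickHouse's implementation: the names `bucket_start` and `downsample` are made up for illustration, and aligning buckets to the Unix epoch is an assumption of this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption of this sketch: buckets are aligned to the Unix epoch (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime, width: timedelta) -> datetime:
    """Floor a timezone-aware timestamp to the start of its fixed-width bucket."""
    return ts - (ts - EPOCH) % width


def downsample(readings, width: timedelta) -> dict:
    """Average (timestamp, value) pairs per bucket, keyed by bucket start."""
    groups = defaultdict(list)
    for ts, value in readings:
        groups[bucket_start(ts, width)].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(groups.items())}


if __name__ == "__main__":
    utc = timezone.utc
    readings = [
        (datetime(2024, 1, 1, 0, 1, tzinfo=utc), 10.0),
        (datetime(2024, 1, 1, 0, 4, tzinfo=utc), 20.0),
        (datetime(2024, 1, 1, 0, 7, tzinfo=utc), 30.0),
    ]
    # Two 5-minute buckets: 00:00 (mean 15.0) and 00:05 (mean 30.0).
    for start, mean in downsample(readings, timedelta(minutes=5)).items():
        print(start.isoformat(), mean)
```

The same flooring step gives a partition key: rows whose timestamps fall in the same bucket would land in the same time-based partition.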

In summary, ClickHouse and TimescaleDB differ in their architecture, scalability, query language, data model, data ingestion methods, and data partitioning techniques. Choosing the right database depends on the specific requirements of the use case, with ClickHouse being suitable for high-performance analytics and large-scale deployments, while TimescaleDB provides easier integration with existing PostgreSQL infrastructure and a more traditional SQL experience for time-series data analysis.
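For the bulk ingestion paths mentioned above, a client typically groups rows into batches before writing, since many small inserts tend to cost more than a few large ones. The following is a minimal sketch of that batching step only. The `insert_batch` callable is a hypothetical placeholder for whatever client call actually writes the rows, and the default batch size of 1000 is an arbitrary assumption, not a recommendation from either project.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of up to `size` rows, preserving input order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def ingest(rows: Iterable[T], insert_batch: Callable[[List[T]], None], size: int = 1000) -> int:
    """Pass rows to insert_batch (hypothetical writer) in batches; return the number of calls."""
    calls = 0
    for batch in batched(rows, size):
        insert_batch(batch)
        calls += 1
    return calls


if __name__ == "__main__":
    sent = []  # stands in for a database; each element is one bulk write
    calls = ingest(range(2500), sent.append, size=1000)
    print(calls, [len(b) for b in sent])
```

In practice the batch size would be tuned against the target database and the row width.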

Advice on Clickhouse and TimescaleDB
Needs advice on InfluxDB, MongoDB, and TimescaleDB

We are building an IoT service with heavy write throughput and fewer reads (we need to downsample records). We want good data reliability and policy-based data retention.

So, we are looking for the best underlying DB for ingesting a lot of data and querying it easily.

Replies (3)
Yaron Lavi recommends PostgreSQL

We had a similar challenge. We started with DynamoDB, Timescale, and even InfluxDB and Mongo, and eventually settled on PostgreSQL. Assuming the inbound data pipeline is queued (for example, Kinesis/Kafka -> S3 -> and some Lambda functions), PostgreSQL gave us better performance by far.

Recommends Druid

Druid is amazing for this use case and is a cloud-native solution that can be deployed on any cloud infrastructure or on Kubernetes.

- Easy to scale horizontally
- Column-oriented database
- SQL to query data
- Streaming and batch ingestion
- Native search indexes

It can work as a time-series database or a data warehouse, and it has time-optimized partitioning.

Ankit Malik, Software Developer at CloudCover (3 upvotes, 322.2K views), recommends Google BigQuery

If you want a serverless solution with a lot of storage capacity and SQL-style querying, Google BigQuery is the best solution for that.

Decisions about Clickhouse and TimescaleDB
Benoit Larroque, Principal Engineer at Sqreen (2 upvotes, 133.7K views)

I chose TimescaleDB to be the backend of our production monitoring system. We needed to be able to keep track of multiple high-cardinality dimensions.

The drawback of this decision is that our monitoring system is a bit more ad hoc than it used to be (with New Relic Insights).

We are combining this with Grafana for display and Telegraf for data collection.

Pros of Clickhouse (votes in parentheses)
  • Fast, very very fast (19)
  • Good compression ratio (11)
  • Horizontally scalable (6)
  • Great CLI (5)
  • Utilizes all CPU resources (5)
  • RESTful (5)
  • Buggy (4)
  • Open-source (4)
  • Great number of SQL functions (4)
  • Server crashes, it's normal :( (3)
  • Has no transactions (3)
  • Flexible connection options (2)
  • Highly available (2)
  • ODBC (2)
  • Flexible compression options (2)
  • In IDEA data import via HTTP interface not working (1)

Pros of TimescaleDB (votes in parentheses)
  • Open source (9)
  • Easy Query Language (8)
  • Time-series data analysis (7)
  • Established PostgreSQL API and support (5)
  • Reliable (4)
  • Paid support for automatic Retention Policy (2)
  • Chunk-based compression (2)
  • Postgres integration (2)
  • High-performance (2)
  • Fast and scalable (2)
  • Case studies (1)


Cons of Clickhouse
  • Slow insert operations (5)

Cons of TimescaleDB
  • Licensing issues when running on managed databases (5)


What is Clickhouse?

It allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.

What is TimescaleDB?

TimescaleDB: An open-source database built for analyzing time-series data with the power and convenience of SQL — on premise, at the edge, or in the cloud.


What are some alternatives to Clickhouse and TimescaleDB?
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
InfluxDB
InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
Druid
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.