Clickhouse vs Scylla: What are the differences?

Introduction

ClickHouse and Scylla are both popular database management systems, but they target different kinds of applications. While they share some similarities, several key differences set them apart. This comparison highlights the main differences between ClickHouse and Scylla.

  1. Data Model and Query Language: ClickHouse is a columnar database designed to handle analytical workloads efficiently. It uses a SQL dialect that supports complex analytical queries, including transformations and aggregations over large datasets. Scylla, on the other hand, is a distributed database that is API-compatible with Apache Cassandra. It is queried with CQL (Cassandra Query Language) and follows a wide-column model in which rows are located by partition key, so in practice efficient access is key-based. This means that Scylla is optimized for high-throughput transactional workloads rather than complex analytics.

  2. Replication and Consistency: ClickHouse supports both synchronous and asynchronous replication methods, allowing users to choose the level of consistency they require for their data. It provides ways to replicate data across different servers and data centers to ensure high availability and fault tolerance. In contrast, Scylla has a built-in distributed architecture that automatically replicates data across multiple nodes. It provides high availability and fault tolerance by replicating data within the same data center or across different data centers, depending on the configuration.

  3. Data Storage and Compression: ClickHouse uses a columnar storage format, which means that data is stored in a column-wise manner rather than row-wise. This allows for efficient compression techniques like dictionary and run-length encoding, resulting in reduced storage space and improved query performance for analytical workloads. Scylla, on the other hand, uses a row-based storage format that is optimized for write-heavy workloads. It incorporates compression techniques like LZ4 and Snappy to reduce the storage footprint of data.

  4. Data Consistency and Durability: ClickHouse provides eventual consistency for data replication, which means that changes made to the data are eventually propagated to all replicas in the cluster. It also provides durability by storing data on disk and supports configurable storage policies for data retention. Scylla, being based on Apache Cassandra, provides tunable consistency levels for data replication. It ensures durability by writing data to disk and also provides the option of replicating data to multiple data centers for increased fault tolerance.

  5. Scalability and Performance: ClickHouse is known for its exceptional performance when it comes to complex analytical queries on large datasets. It can handle high concurrency and provides efficient data compression and caching mechanisms. Scylla, on the other hand, is designed for high-throughput transactional workloads and can handle a massive number of read and write operations in real-time. It provides low-latency responses and supports horizontal scalability by adding more nodes to the cluster.

  6. Community and Ecosystem: ClickHouse has a growing community and a rich ecosystem of tools and integrations that have been developed around it. It is widely adopted by companies for data analytics and reporting purposes. Scylla, being based on Cassandra, also has a large community and ecosystem. It benefits from the existing tools and integrations available for Cassandra and provides seamless integration with other Cassandra-compatible systems.
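The storage point above (item 3) can be illustrated with a small sketch. This is a toy model in Python, not ClickHouse's actual codec implementation: the sample rows and the helper names `to_columns` and `run_length_encode` are assumptions made up for illustration. It shows why storing values column by column tends to compress well: similar values end up next to each other.

```python
from itertools import groupby

# Hypothetical sample rows: (country, status, latency_ms)
rows = [
    ("DE", "ok", 12),
    ("DE", "ok", 15),
    ("DE", "error", 90),
    ("US", "ok", 11),
    ("US", "ok", 14),
]


def to_columns(rows):
    """Transpose row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
country_rle = run_length_encode(columns[0])
print(country_rle)  # [('DE', 3), ('US', 2)]
```

In a row-wise layout the same country values are interleaved with statuses and latencies, so runs like these do not form.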

In summary, ClickHouse is a columnar database optimized for analytical workloads with a SQL-like query language, while Scylla is a distributed, Cassandra-compatible database designed for high-throughput transactional workloads. ClickHouse excels at complex analytics and has a growing community, while Scylla provides high availability, low latency, and scalability for real-time transactional workloads.
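The tunable consistency levels mentioned for Scylla (item 4) are usually reasoned about with a simple overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas acknowledging the write plus the number consulted by the read exceeds the replication factor. The sketch below only does this arithmetic; the function names are hypothetical and nothing here calls a real driver.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for a given replication factor."""
    return replication_factor // 2 + 1


def replicas_overlap(write_acks: int, read_acks: int, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return write_acks + read_acks > replication_factor


rf = 3
# Quorum writes plus quorum reads always overlap on at least one replica.
print(replicas_overlap(quorum(rf), quorum(rf), rf))  # True
# Writing to one replica and reading from one may miss the latest write.
print(replicas_overlap(1, 1, rf))  # False
```

Lower levels trade this guarantee for latency and availability, which is the sense in which consistency is "tunable".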

Advice on Clickhouse and ScyllaDB
Vinay Mehta
Needs advice on Cassandra and ScyllaDB

The problem I have is that we need to process and change (update/insert) 55M data every 2 minutes, and the updated data must be available to a REST API for filtering and selection. The response time of the REST API should be less than 1 second.

The most important factor for me is the 2-minute window for processing and storing. There need to be two views of the data: one for selection and one for changed data.

Replies (4)
Recommends ScyllaDB

Scylla can handle 1M events per second with a simple data model quite easily. The query API is CQL; we do have a REST API, but that is for control and monitoring.

Alex Peake
Recommends Cassandra

Cassandra is quite capable of the task, in a highly available way, given appropriate scaling of the system. Remember that updates are only inserts, and that efficient retrieval is only by key (which can be a complex key). Talking of keys, make sure that the keys are well distributed.
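To see why well-distributed keys matter, here is a simplified sketch. It assumes a plain hash-modulo placement over four hypothetical nodes; real Cassandra and Scylla clusters use a token ring (Murmur3 by default in Cassandra), so this only illustrates the effect, not the actual placement algorithm. The key formats and function names are made up for the example.

```python
import hashlib
from collections import Counter


def node_for(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index with a stable hash (simplified)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def load_per_node(keys, node_count: int) -> Counter:
    """Count how many writes land on each node."""
    return Counter(node_for(key, node_count) for key in keys)


# Low-cardinality key: 1000 writes, but only two distinct partitions.
skewed = load_per_node(["country=DE"] * 900 + ["country=US"] * 100, node_count=4)

# High-cardinality key: 1000 writes across 1000 distinct partitions.
spread = load_per_node([f"user={i}" for i in range(1000)], node_count=4)

print("skewed:", dict(skewed))
print("spread:", dict(spread))
```

With the low-cardinality key, at least 900 of the 1000 writes hit a single node no matter how many nodes are added, while the high-cardinality key spreads the load.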

Pankaj Soni
Chief Technical Officer at Software Joint
Recommends Cassandra

I love Scylla for pet projects; however, its license, which is based on a per-server model, is an issue. Thus I recommend Cassandra.

Recommends ScyllaDB

By 55M do you mean 55 million entity changes per 2 minutes? That is relatively high: it means almost 460k changes per second. If I had to choose between Scylla and Cassandra, I would opt for Scylla, as it promises better performance for simple operations. However, it may be worth considering yet another alternative technology. Take the required consistency, reliability, and high availability into consideration, and you may realize that there are more suitable ones. The REST API should not be the main driver, because you can always develop the API yourself if the chosen technology does not support it.
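The "almost 460k per second" figure follows directly from the numbers in the question, assuming 55M really means 55 million individual changes per 2-minute window. A minimal check (the function name is hypothetical):

```python
def required_rate(changes: int, window_seconds: int) -> float:
    """Average sustained write rate needed to finish within the window."""
    return changes / window_seconds


rate = required_rate(55_000_000, 2 * 60)
print(f"{rate:,.0f} changes per second")  # 458,333 changes per second
```

This is an average; bursts within the window or retries would push the peak rate higher.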

Decisions about Clickhouse and ScyllaDB
Tom Klein

The Gentlent Tech Team made lots of updates within the past year. The biggest one being our database:

We decided to migrate our #PostgreSQL-based database systems to a custom implementation of #Cassandra. This allows us to integrate our product data perfectly in a system that just makes sense. High availability and scalability are supported out of the box.

Pros of Clickhouse
  • Fast, very very fast (21)
  • Good compression ratio (11)
  • Horizontally scalable (7)
  • Utilizes all CPU resources (6)
  • RESTful (5)
  • Open-source (5)
  • Great CLI (5)
  • Great number of SQL functions (4)
  • Buggy (4)
  • Server crashes its normal :( (3)
  • Highly available (3)
  • Flexible connection options (3)
  • Has no transactions (3)
  • ODBC (2)
  • Flexible compression options (2)
  • In IDEA data import via HTTP interface not working (1)

Pros of ScyllaDB
  • Replication (2)
  • Fewer nodes (1)
  • Distributed (1)
  • Scale up (1)
  • High availability (1)
  • Written in C++ (1)
  • High performance (1)


Cons of Clickhouse
  • Slow insert operations (5)

Cons of ScyllaDB
  • None listed


What is Clickhouse?

It allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.

What is ScyllaDB?

ScyllaDB is the database for data-intensive apps that require high performance and low latency. It enables teams to harness the ever-increasing computing power of modern infrastructures, eliminating barriers to scale as data grows.


What are some alternatives to Clickhouse and ScyllaDB?

Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).

MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

InfluxDB
InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server-side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.

Druid
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.