ClickHouse vs Vertica: What are the differences?

Introduction

ClickHouse and Vertica are both columnar database management systems that are designed for high-performance analytics. While they share similarities in terms of providing fast query processing and scalability, they also have distinct differences that set them apart.

  1. Architecture: Both systems are distributed MPP databases, but they organize storage differently. ClickHouse follows a shared-nothing design in which each node stores and processes its own portion of the data on local disks. Vertica offers two deployment modes: Enterprise Mode, which is also shared-nothing, and Eon Mode, which separates compute from storage and keeps data in shared communal storage (for example, an object store) accessible to all nodes. The choice affects how data is distributed, replicated, and processed, and therefore performance and fault tolerance.

  2. Data Compression: Both systems compress data column by column, but with different mechanisms. ClickHouse applies a general-purpose codec to each column (LZ4 by default, with ZSTD available for higher ratios) and can chain specialized codecs such as Delta, DoubleDelta, and Gorilla in front of it; the LowCardinality type adds dictionary encoding. Vertica selects an encoding per column within each projection, including Run-Length Encoding (RLE), delta-based encodings, and block dictionary encoding, and it can operate on some encoded data directly. These choices affect both the storage requirements and the query performance of the system.

  3. Indexing: Neither system relies on traditional B-tree secondary indexes. ClickHouse's MergeTree table engines store data sorted by the table's ORDER BY key and keep a sparse primary index over it, so queries can skip blocks of rows that cannot match; partitioning and optional data-skipping indexes narrow scans further. This layout suits time-series data well, though it is not limited to it. Vertica uses projections: physical, sorted copies of a table's columns, segmented across nodes, that the optimizer chooses among at query time. These strategies determine how much data a query has to read.

  4. Data Distribution: Both systems combine sharding with replication, but expose them differently. In ClickHouse, sharding is configured explicitly: a Distributed table routes rows to shards by a sharding key, and ReplicatedMergeTree engines copy data parts between the replicas of each shard. In Vertica, projections are segmented across nodes by a hash of the chosen columns, and K-safety keeps redundant copies on other nodes so the cluster tolerates node failures. These mechanisms affect query execution, data access patterns, and fault tolerance.

  5. Query Execution Model: Both engines are column-oriented and process data in batches rather than one row at a time. ClickHouse uses vectorized execution: operators work on blocks of column values, which makes good use of CPU caches and SIMD instructions and yields high throughput. Vertica's MPP execution engine likewise requests blocks of rows at a time and can evaluate some operations directly on encoded data. The details of each engine's implementation affect query latency and throughput.

  6. SQL Compatibility: ClickHouse implements its own SQL dialect aimed at analytical workloads. Recent versions support window functions, joins, and user-defined functions, but the dialect departs from the SQL standard in places, and ClickHouse has no general-purpose multi-statement transactions; updates and deletes have traditionally been handled as asynchronous mutations. Vertica provides a more conventional SQL implementation with ACID transactions, window (analytic) functions, and user-defined functions (UDFs). This difference may affect the ease of migration and compatibility with existing SQL-based applications.
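
To make the delta and run-length encodings named under Data Compression more concrete, here is a minimal Python sketch. It is a toy model under simplifying assumptions (integer columns held in memory), not the implementation used by either product, and the function names are hypothetical.

```python
from itertools import groupby


def delta_encode(values):
    """Keep the first value, then store each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum over the deltas."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(v, len(list(run))) for v, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the full sequence."""
    return [v for v, n in pairs for _ in range(n)]


# Evenly spaced timestamps become a long run of identical deltas,
# which run-length encoding then stores as a single pair.
timestamps = [1000, 1010, 1020, 1030, 1040]
deltas = delta_encode(timestamps)   # [1000, 10, 10, 10, 10]
packed = rle_encode(deltas)         # [(1000, 1), (10, 4)]
assert delta_decode(rle_decode(packed)) == timestamps
```

Sorted or slowly changing columns benefit most from this kind of chaining, which is one reason the sort order of a table or projection matters for compression in columnar stores.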

In summary, ClickHouse and Vertica differ in their architectural approach, compression techniques, indexing strategies, data distribution mechanisms, query execution models, and SQL compatibility. These differences impact various aspects of the systems' performance, scalability, and functionality.
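
As a rough illustration of the sorting-based indexing discussed above: ClickHouse's MergeTree keeps rows sorted by key and indexes them sparsely, storing one index entry per granule of rows (8192 rows by default) rather than one per row. The Python sketch below models that idea on a plain sorted list. The granule size of 4 and the function names are assumptions chosen for readability, not ClickHouse internals.

```python
from bisect import bisect_left, bisect_right

GRANULE_SIZE = 4  # illustrative only; far smaller than a real granule


def build_sparse_index(sorted_keys, granule_size=GRANULE_SIZE):
    """Keep only the first key of every granule (the index 'marks')."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_size)]


def granules_for_range(marks, lo, hi):
    """Return indices of granules that may contain keys in [lo, hi]."""
    # The granule just before the first mark >= lo may still hold lo.
    first = max(bisect_left(marks, lo) - 1, 0)
    # Granules whose first key is greater than hi cannot contain a match.
    last = bisect_right(marks, hi)
    return list(range(first, last))


keys = list(range(32))               # 32 sorted keys -> 8 granules of 4
marks = build_sparse_index(keys)     # [0, 4, 8, 12, 16, 20, 24, 28]
hits = granules_for_range(marks, 9, 13)
assert hits == [2, 3]                # only rows 8..15 need to be scanned
```

The index stays small enough to hold in memory, and a range filter on the sort key reads a few granules instead of the whole column. The result is conservative: a selected granule may turn out to contain no matching rows.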

Pros of ClickHouse (upvotes in parentheses)
  • Fast, very very fast (21)
  • Good compression ratio (11)
  • Horizontally scalable (7)
  • Utilizes all CPU resources (6)
  • RESTful (5)
  • Open-source (5)
  • Great CLI (5)
  • Great number of SQL functions (4)
  • Buggy (4)
  • Server crashes its normal :( (3)
  • Highly available (3)
  • Flexible connection options (3)
  • Has no transactions (3)
  • ODBC (2)
  • Flexible compression options (2)
  • In IDEA data import via HTTP interface not working (1)

Pros of Vertica (upvotes in parentheses)
  • Shared nothing or shared everything architecture (3)
  • Reduce costs as reduced hardware is required (1)
  • Offers users the freedom to choose deployment mode (1)
  • Flexible architecture suits nearly any project (1)
  • End-to-End ML Workflow Support (1)
  • All You Need for IoT, Clickstream or Geospatial (1)
  • Freedom from Underlying Storage (1)
  • Pre-Aggregation for Cubes (LAPS) (1)
  • Automatic Data Marts (Flatten Tables) (1)
  • Near-Real-Time Analytics in pure Column Store (1)
  • Fully automated Database Designer tool (1)
  • Query-Optimized Storage (1)
  • Vertica is the only product which offers partition prun (1)
  • Partition pruning and predicate push down on Parquet (1)


Cons of ClickHouse (upvotes in parentheses)
  • Slow insert operations (5)

Cons of Vertica
  • None listed yet


What is ClickHouse?

ClickHouse allows analysis of data that is updated in real time. It returns results almost instantly in most cases: the data is often processed faster than it takes to write the query.

What is Vertica?

Vertica is positioned as a unified analytics platform designed to remain independent of the underlying infrastructure.


What are some alternatives to ClickHouse and Vertica?

  • Cassandra: Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
  • Elasticsearch: Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
  • MySQL: The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
  • InfluxDB: InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server-side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
  • Druid: Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.