ClickHouse vs Microsoft SQL Server

ClickHouse vs Microsoft SQL Server: What are the differences?

Introduction

In this article, we will compare ClickHouse and Microsoft SQL Server (MSSQL) and highlight key differences between the two database management systems.

  1. Storage and Data Representation: ClickHouse uses a columnar storage model, which is optimized for analytics over large volumes of data. Because values from the same column are stored together, they compress well and a query reads only the columns it references, which speeds up scans and aggregations. MSSQL, on the other hand, stores tables row by row by default, which suits transactional systems; it also offers columnstore indexes for analytical workloads.

  2. Scalability: ClickHouse is designed to scale horizontally, allowing users to add more servers to handle increasing data volumes and query loads. It leverages distributed computing and data replication techniques to achieve high scalability. In contrast, MSSQL has traditionally relied on vertical scaling, where a single server is upgraded with more resources such as CPU and RAM.

  3. Query Language: ClickHouse uses its own SQL dialect, which is largely similar to standard SQL and adds functions and syntax aimed at analytical queries and complex aggregations. MSSQL uses Transact-SQL (T-SQL), Microsoft's extension of standard SQL, and provides a wide range of features for both transactional and analytical workloads.

  4. Integration and Ecosystem: MSSQL has been around for a longer time and has a mature ecosystem with a wide range of tools, libraries, and integrations available. It has better support for common BI and reporting tools and offers seamless integration with other Microsoft products. ClickHouse, on the other hand, has a smaller ecosystem but is gaining popularity in the analytics and big data space.

  5. Data Replication and High Availability: ClickHouse supports asynchronous replication at the table level, so that replicas can continue serving queries if a node fails. This gives it built-in mechanisms for data redundancy and fault tolerance, although replicated tables still need to be configured, including a coordination service. MSSQL likewise requires configuration for replication and high availability, and offers options such as Always On Availability Groups, log shipping, and database mirroring (which Microsoft has deprecated in favor of Availability Groups).

  6. Data Partitioning and Indexing: ClickHouse lets users partition a table by an expression (such as month or region) and sort the data within each part by a key that serves as a sparse primary index, so queries can skip partitions and ranges that cannot match. MSSQL also supports table partitioning and a broad set of index types, but may require more manual tuning and optimization.
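
To make the storage point (item 1) concrete, here is a toy sketch in Python. It is not how either engine is implemented; the table, the column names, and the "values touched" counter are all made up for illustration. It only shows why an aggregation over one column reads less data when the values of each column are stored together.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# The table and all names are hypothetical; neither engine works this simply.

# Row-oriented: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "eu", "amount": 10.0},
    {"id": 2, "region": "us", "amount": 25.5},
    {"id": 3, "region": "eu", "amount": 4.5},
]

# Column-oriented: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_row_store(table):
    """Sum one field; every full row is visited, so all fields are touched."""
    total = 0.0
    values_touched = 0
    for row in table:
        values_touched += len(row)
        total += row["amount"]
    return total, values_touched


def sum_amount_column_store(table):
    """Sum one field; only the 'amount' column is read."""
    amounts = table["amount"]
    return sum(amounts), len(amounts)


row_total, row_touched = sum_amount_row_store(rows)
col_total, col_touched = sum_amount_column_store(columns)
print(row_total, row_touched)
print(col_total, col_touched)
```

Both functions return the same total, but the columnar version touches one value per row instead of every field, which is the effect the comparison attributes to columnar storage.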

In summary, ClickHouse and MSSQL differ in storage models, scalability approaches, query languages, ecosystem maturity, data replication capabilities, and data partitioning/indexing strategies. These differences make each database management system suitable for specific use cases and workloads.
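
The partitioning difference mentioned above can also be sketched in a few lines of Python. This is a simplified illustration of partition pruning under assumed data: the events, the monthly partition key, and the function names are hypothetical and do not correspond to either product's API.

```python
from collections import defaultdict
from datetime import date

# Hypothetical events as (event_date, value) pairs.
events = [
    (date(2024, 1, 5), 3),
    (date(2024, 1, 20), 7),
    (date(2024, 2, 2), 11),
    (date(2024, 3, 15), 13),
]


def partition_key(d):
    """Partition by calendar month, e.g. (2024, 1)."""
    return (d.year, d.month)


# Group the events into one bucket per month.
partitions = defaultdict(list)
for event_date, value in events:
    partitions[partition_key(event_date)].append((event_date, value))


def sum_between(start, end):
    """Sum values in [start, end], skipping partitions outside the range."""
    low, high = partition_key(start), partition_key(end)
    total = 0
    scanned = 0
    for key, part in partitions.items():
        if key < low or key > high:
            continue  # pruned: no row in this partition can match
        scanned += 1
        total += sum(v for d, v in part if start <= d <= end)
    return total, scanned


total, scanned = sum_between(date(2024, 2, 1), date(2024, 2, 29))
print(total, scanned)
```

A query restricted to February scans one of the three monthly partitions here; the other two are skipped without looking at their rows.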

Advice on ClickHouse and Microsoft SQL Server

I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:

  1. I need to use either MySQL or PostgreSQL on a Linux based OS. Which would be better for this application?
  2. I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.
Replies (6)

Hi Erin,

Honestly both databases will do the job just fine. I personally prefer Postgres.

Much more important is how you store the audio. While you could technically use a blob type column, it's really not ideal to be storing audio files which are "several hours long" in a database row. Instead consider storing the audio files in an object store (hosted options include backblaze b2 or aws s3) and persisting the key (which references that object) in your database column.
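
As a rough sketch of that layout: the example below uses SQLite from Python's standard library only so that it runs anywhere, and the same idea should carry over to MySQL or PostgreSQL. The table name, column names, sample row, and object key are all made up for illustration, and the upload to the object store itself is not shown.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL or PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: searchable metadata plus the key of the audio object.
conn.execute(
    """
    CREATE TABLE recordings (
        id               INTEGER PRIMARY KEY,
        title            TEXT NOT NULL,
        recorded_at      TEXT NOT NULL,       -- ISO 8601 date and time
        location         TEXT,
        duration_seconds INTEGER,
        object_key       TEXT NOT NULL UNIQUE -- key of the file in the object store
    )
    """
)

# The audio file lives in the object store; only its key is stored here.
conn.execute(
    "INSERT INTO recordings "
    "(title, recorded_at, location, duration_seconds, object_key) "
    "VALUES (?, ?, ?, ?, ?)",
    (
        "Evening session",
        "2021-06-01T19:30:00",
        "Austin, TX",
        3 * 3600,
        "recordings/2021/06/evening-session.flac",
    ),
)
conn.commit()


def find_keys_by_location(connection, location):
    """Return the object keys of all recordings made at a location."""
    cursor = connection.execute(
        "SELECT object_key FROM recordings "
        "WHERE location = ? ORDER BY recorded_at",
        (location,),
    )
    return [row[0] for row in cursor]


keys = find_keys_by_location(conn, "Austin, TX")
print(keys)
```

The search runs against the small metadata rows, and the application fetches the audio from the object store only for the keys it gets back.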

Aaron Westley recommends PostgreSQL:

Hi Erin, Chances are you would want to store the files in a blob type. Both MySQL and Postgres support this. Can you explain a little more about your need to store the files in the database? It may be more effective to store the files on a file system or something like S3. To answer your question, based on what you are describing I would slightly lean towards PostgreSQL, since it tends to be a little better on the data warehousing side.

Christopher Wray, Web Developer at Soltech LLC (3 upvotes · 417.3K views), recommends Directus:

Hey Erin! I would recommend checking out Directus before you start work on building your own app for them. I just stumbled upon it, and so far I am extremely happy with its functionality. If your client is just looking for a simple web app for their own data, then Directus may be a great option. It offers "database mirroring", so that you can connect it to any database and set up functionality around it!

Julien DeFrance, Principal Software Engineer at Tophatter (3 upvotes · 416.9K views), recommends Amazon Aurora:

Hi Erin! First of all, you'd probably want to go with a managed service. Don't spin up your own MySQL installation on your own Linux box. If you are on AWS, they have different offerings for database services: standard RDS vs. Aurora. Aurora would be my preferred choice given the benefits it offers, the storage optimizations it comes with, etc. Such managed services easily allow you to apply new security patches and upgrades, set up backups, replication, etc. Doing this on your own would either be risky or inefficient, or you might just give up. As for which database to choose, you'll have the choice between PostgreSQL, MySQL, MariaDB, SQL Server, etc. I personally would recommend MySQL (latest version available), as the official tooling for it (MySQL Workbench) is great, stable, and moreover free. Other database services exist; I'd recommend you also explore DynamoDB.

Regardless, you'd certainly keep only high-level records and metadata in the database, and the actual files most likely in S3, so that you can keep all options open in terms of what you'll do with them.

Recommends PostgreSQL:

Hi Erin,

  • Coming from "Big" DB engines, such as Oracle or MSSQL, go for PostgreSQL. You'll get all the features you need with PostgreSQL.
  • Your case seems to point to a "NoSQL" or document database use case. PostgreSQL covers this too, since it achieves excellent performance on JSON-based objects, which is a second reason to choose it. MongoDB might be an excellent option as well if you need "sharding" and excellent map-reduce mechanisms for very massive data sets. You really should investigate the NoSQL option for your use case.
  • Starting with AWS Aurora is excellent advice, since "vendor lock-in" is limited, but I did not check for JSON-based object / NoSQL features.
  • If you stick to Linux server, the PostgreSQL or MySQL provided with your distribution are straightforward to install (i.e. apt install postgresql). For PostgreSQL, make sure you're comfortable with the pg_hba.conf, especially for IP restrictions & accesses.

Regards,

Klaus Nji, Staff Software Engineer at SailPoint Technologies (1 upvote · 417K views), recommends PostgreSQL:

I recommend Postgres as well. Superior performance overall and a more robust architecture.

Pros of ClickHouse
  • Fast, very very fast (19)
  • Good compression ratio (11)
  • Horizontally scalable (6)
  • Great CLI (5)
  • Utilizes all CPU resources (5)
  • RESTful (5)
  • Buggy (4)
  • Open-source (4)
  • Great number of SQL functions (4)
  • Server crashes, it's normal :( (3)
  • Has no transactions (3)
  • Flexible connection options (2)
  • Highly available (2)
  • ODBC (2)
  • Flexible compression options (2)
  • In IDEA data import via HTTP interface not working (1)

Pros of Microsoft SQL Server
  • Reliable and easy to use (139)
  • High performance (102)
  • Great with .net (95)
  • Works well with .net (65)
  • Easy to maintain (56)
  • Azure support (21)
  • Full Index Support (17)
  • Always on (17)
  • Enterprise manager is fantastic (10)
  • In-Memory OLTP Engine (9)
  • Easy to setup and configure (2)
  • Security is forefront (2)
  • Faster Than Oracle (1)
  • Decent management tools (1)
  • Great documentation (1)
  • Docker Delivery (1)
  • Columnstore indexes (1)

Cons of ClickHouse
  • Slow insert operations (5)

Cons of Microsoft SQL Server
  • Expensive Licensing (4)
  • Microsoft (2)

What is ClickHouse?

ClickHouse is a column-oriented database management system that allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.

What is Microsoft SQL Server?

Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.

What are some alternatives to ClickHouse and Microsoft SQL Server?
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
InfluxDB
InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
Druid
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.