Microsoft SQL Server vs TimescaleDB: What are the differences?
Introduction
In this article, we will explore the key differences between Microsoft SQL Server and TimescaleDB. Both databases are widely used in the industry, but they have distinct features and purposes.
Data Model: Microsoft SQL Server follows a relational data model, storing data in tables with predefined schemas and relying on structured query language (SQL) for data retrieval and manipulation. TimescaleDB, on the other hand, is built on top of PostgreSQL and extends it with native support for time-series data. It introduces the concept of hypertables, which automatically partition and scale time-series data, making storage and querying of that data more efficient.
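As a rough illustration, converting a regular PostgreSQL table into a hypertable is a one-line call (the `conditions` table and its columns are hypothetical, used only for this sketch):

```sql
-- Ordinary PostgreSQL table holding time-series readings (hypothetical schema).
CREATE TABLE conditions (
    time        TIMESTAMPTZ      NOT NULL,
    device_id   TEXT             NOT NULL,
    temperature DOUBLE PRECISION
);

-- TimescaleDB's create_hypertable() turns it into a hypertable,
-- after which new rows are routed into time-based chunks automatically.
SELECT create_hypertable('conditions', 'time');
```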
Scalability: While Microsoft SQL Server can scale vertically by adding more resources to a single server, TimescaleDB focuses on horizontal scalability. It allows data to be distributed across multiple servers, enabling better performance for large-scale deployments. TimescaleDB achieves this through automatic data partitioning and parallel query execution, making it suitable for handling massive volumes of time-series data.
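Assuming the hypothetical `conditions` hypertable from the sketch above, the chunking that drives this partitioning can be inspected and tuned directly:

```sql
-- List the chunks TimescaleDB has created behind the hypertable.
SELECT show_chunks('conditions');

-- Optionally adjust how much time each chunk covers (the default is 7 days).
SELECT set_chunk_time_interval('conditions', INTERVAL '1 day');
```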
Performance: Microsoft SQL Server is optimized for general-purpose workloads, providing excellent performance for complex queries across different types of data. TimescaleDB, on the other hand, is designed specifically for time-series data and offers high-performance features tailored to time-based analytical queries. Its automatic partitioning and indexing strategies ensure faster query execution on time-series data.
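For example, a typical time-based rollup over the hypothetical `conditions` hypertable above can lean on TimescaleDB's `time_bucket` function (a sketch, not a benchmark):

```sql
-- Average temperature per device in 15-minute buckets over the last day.
SELECT time_bucket('15 minutes', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temp
FROM conditions
WHERE time > now() - INTERVAL '1 day'
GROUP BY bucket, device_id
ORDER BY bucket DESC;
```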
Data Storage: In terms of data storage, Microsoft SQL Server typically uses a single-node architecture, where one server stores and manages all the data and provides transactional consistency. TimescaleDB, in contrast, can also run in a distributed (multi-node) configuration, spreading hypertable chunks across multiple nodes; this approach improves data resilience, fault tolerance, and the ability to handle large volumes of data.
Community and Ecosystem: Microsoft SQL Server has a long-standing presence in the industry and a large user community. It offers extensive documentation, community support, and a wide range of tools and integrations. TimescaleDB, being built on PostgreSQL, benefits from the existing PostgreSQL ecosystem and community, inheriting client libraries for many programming languages, the mature PostgreSQL query planner, and a rich set of extensions.
Cost: Another significant difference is the cost aspect. Microsoft SQL Server is a commercial database, and licensing costs may apply based on server capacity and features. In contrast, TimescaleDB is an open-source extension built on PostgreSQL, making it a cost-effective choice for organizations seeking efficient time-series data handling without additional licensing costs.
In summary, Microsoft SQL Server follows a relational data model with a focus on general-purpose workload management, while TimescaleDB is specifically designed for time-series data with features like automatic partitioning and support for hypertables. TimescaleDB emphasizes horizontal scalability, high-performance time-series data handling, and is an open-source alternative to commercial databases.
Developing a solution that collects telemetry data from different devices: at least 1,000 devices and up to 12,000, with each device sending 2 packets per second. This is time-series data, while the data definitions and various reports (building information, maintenance records, etc.) are stored in PostgreSQL. The telemetry itself is raw, without the definitions and information kept in PostgreSQL, and it is needed for math and ML algorithms. I want to know the best solution for storing it. Initially I went with TimescaleDB because of its PostgreSQL support, but as the number of sites increased I started facing many issues with TimescaleDB in terms of flexibility of storing data.
A major requirement is also replication of the database for reporting and other purposes. You may suggest options other than Druid and Cassandra; an open-source solution is appreciated.
Hi Umair, did you try MongoDB? We are using MongoDB in a production environment, collecting data from devices in a scenario like yours. We have a MongoDB cluster with three replicas: data from devices is written to the primary node, and the real-time dashboard UI uses the secondary nodes for read operations. With this setup, write operations are not affected by reads either.
We are building an IoT service with heavy write throughput and fewer reads (we need to downsample records). We want good reliability when it comes to data, and we want data retention based on policies.
So we are looking for the best underlying DB for ingesting a lot of data while still being easy to query.
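If the underlying store ends up being TimescaleDB (one of the options compared on this page), downsampling and policy-based retention could look roughly like the sketch below; the `sensor_data` table and column names are hypothetical and the API calls are from TimescaleDB 2.x:

```sql
-- Hypothetical hypertable for the raw IoT ingest.
CREATE TABLE sensor_data (
    time      TIMESTAMPTZ NOT NULL,
    device_id TEXT        NOT NULL,
    value     DOUBLE PRECISION
);
SELECT create_hypertable('sensor_data', 'time');

-- Downsampling: a continuous aggregate that maintains hourly averages.
CREATE MATERIALIZED VIEW sensor_data_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       avg(value) AS avg_value
FROM sensor_data
GROUP BY bucket, device_id;

-- Policy-based retention: drop raw chunks older than 30 days.
SELECT add_retention_policy('sensor_data', INTERVAL '30 days');
```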
We had a similar challenge. We started with DynamoDB, Timescale, and even InfluxDB and Mongo, to eventually settle on PostgreSQL. Assuming the inbound data pipeline is queued (for example, Kinesis/Kafka -> S3 -> some Lambda functions), PostgreSQL gave us better performance by far.
Druid is amazing for this use case and is a cloud-native solution that can be deployed on any cloud infrastructure or on Kubernetes.
- Easy to scale horizontally
- Column-oriented database
- SQL to query data
- Streaming and batch ingestion
- Native search indexes
It can work as a time-series DB or data warehouse, and it has time-optimized partitioning.
If you want a serverless solution with plenty of storage and SQL-like query capability, then Google BigQuery is the best option for that.
I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:
- I need to use either MySQL or PostgreSQL on a Linux based OS. Which would be better for this application?
- I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.
Hi Erin,
Honestly both databases will do the job just fine. I personally prefer Postgres.
Much more important is how you store the audio. While you could technically use a blob-type column, it's really not ideal to store audio files that are "several hours long" in a database row. Instead, consider storing the audio files in an object store (hosted options include Backblaze B2 or AWS S3) and persisting the key (which references that object) in your database column.
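A minimal sketch of that split in PostgreSQL (the table, columns, and key format are hypothetical):

```sql
-- Searchable metadata lives in the database; the audio itself stays in object storage.
CREATE TABLE recordings (
    id           BIGSERIAL PRIMARY KEY,
    title        TEXT NOT NULL,
    recorded_at  TIMESTAMPTZ,
    location     TEXT,
    duration_sec INTEGER,
    object_key   TEXT NOT NULL  -- e.g. the S3 or B2 key that points at the audio file
);
```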
Hi Erin, chances are you would want to store the files as a blob type. Both MySQL and Postgres support this. Can you explain a little more about your need to store the files in the database? It may be more effective to store the files on a file system or something like S3. To answer your question based on what you are describing, I would lean slightly towards PostgreSQL, since it tends to be a little better on the data warehousing side.
Hey Erin! I would recommend checking out Directus before you start building your own app for them. I just stumbled upon it, and so far I'm extremely happy with its functionality. If your client is just looking for a simple web app for their own data, then Directus may be a great option. It offers "database mirroring", so you can connect it to any database and set up functionality around it!
Hi Erin! First of all, you'd probably want to go with a managed service. Don't spin up your own MySQL installation on your own Linux box. If you are on AWS, they have different offerings for database services: standard RDS vs. Aurora. Aurora would be my preferred choice given the benefits it offers, the storage optimizations it comes with, etc. Such managed services easily allow you to apply new security patches and upgrades, set up backups, replication, and so on. Doing this on your own would either be risky, inefficient, or you might just give up. As far as which database to choose, you'll have the choice between PostgreSQL, MySQL, MariaDB, SQL Server, etc. I personally would recommend MySQL (latest version available), as the official tooling for it (MySQL Workbench) is great, stable, and moreover free. Other database services exist; I'd recommend you also explore DynamoDB.
Regardless, you'd certainly only keep high-level records, meta data in Database, and the actual files, most-likely in S3, so that you can keep all options open in terms of what you'll do with them.
Hi Erin,
- Coming from "big" DB engines such as Oracle or MSSQL, go for PostgreSQL. You'll get all the features you need with it.
- Your case seems to point to a "NoSQL" or document-database use case. PostgreSQL has you covered here too, achieving excellent performance on JSON-based objects (see the small sketch after this list), which is a second reason to choose it. MongoDB might be an excellent option as well if you need sharding and good map-reduce mechanisms for very large data sets. You really should investigate the NoSQL option for your use case.
- Starting with AWS Aurora is excellent advice, since "vendor lock-in" is limited, but I did not check its JSON-based object / NoSQL features.
- If you stick to a Linux server, the PostgreSQL or MySQL packages provided with your distribution are straightforward to install (e.g. apt install postgresql). For PostgreSQL, make sure you're comfortable with pg_hba.conf, especially for IP restrictions and access control.
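As a small, hypothetical illustration of the JSON point above (table, column, and index names are made up):

```sql
-- Semi-structured data in a JSONB column, indexed for containment queries.
CREATE TABLE documents (
    id   BIGSERIAL PRIMARY KEY,
    data JSONB NOT NULL
);

CREATE INDEX documents_data_idx ON documents USING GIN (data);

-- Find documents whose JSON contains {"genre": "jazz"}; the GIN index serves the @> operator.
SELECT id FROM documents WHERE data @> '{"genre": "jazz"}';
```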
Regards,
I recommend Postgres as well. Superior performance overall and a more robust architecture.
I chose TimescaleDB to be the backend of our production monitoring system. We needed to be able to keep track of multiple high-cardinality dimensions.
The drawback of this decision is that our monitoring system is a bit more ad hoc than it used to be (previously New Relic Insights).
We are combining this with Grafana for display and Telegraf for data collection
Pros of Microsoft SQL Server
- Reliable and easy to use (139)
- High performance (101)
- Great with .NET (95)
- Works well with .NET (65)
- Easy to maintain (56)
- Azure support (21)
- Always On (17)
- Full index support (17)
- Enterprise Manager is fantastic (10)
- In-Memory OLTP engine (9)
- Easy to set up and configure (2)
- Security is forefront (2)
- Great documentation (1)
- Faster than Oracle (1)
- Columnstore indexes (1)
- Decent management tools (1)
- Docker delivery (1)
- Max number of connections is 14,000 (1)
Pros of TimescaleDB
- Open source (9)
- Easy query language (8)
- Time-series data analysis (7)
- Established PostgreSQL API and support (5)
- Reliable (4)
- Paid support for automatic retention policy (2)
- Chunk-based compression (2)
- Postgres integration (2)
- High performance (2)
- Fast and scalable (2)
- Case studies (1)
Cons of Microsoft SQL Server
- Expensive licensing (4)
- Microsoft (2)
- Data pages are only 8 KB (1)
- AlwaysOn can lose data in asynchronous mode (1)
- Replication can lose data (1)
- The maximum number of connections is only 14,000 (1)
Cons of TimescaleDB
- Licensing issues when running on managed databases (5)