Microsoft SQL Server vs Scylla: What are the differences?
Introduction
Microsoft SQL Server and Scylla are both popular database management systems used in various applications. However, there are key differences between the two that make them suitable for different use cases. In this article, we will explore and compare these differences.
Data Model: Microsoft SQL Server is a relational database management system (RDBMS) that organizes data into tables with predefined schemas, where relationships between tables are defined by foreign keys. Scylla is a NoSQL database that uses a wide-column data model: data is still organized into tables declared in CQL, but rows are grouped by partition key, each row only stores the columns it actually has values for, and there are no joins or foreign keys.
Scalability: Scylla is designed for high scalability, with a distributed architecture that scales horizontally by adding more nodes to the cluster. Microsoft SQL Server scales mainly vertically (a larger server); readable secondary replicas can offload reads, but spreading writes across servers generally means sharding at the application level.
Performance: Scylla is known for strong performance on write-intensive workloads. It uses a log-structured merge (LSM) tree storage engine, which turns writes into sequential appends that are merged in the background. Microsoft SQL Server is also performant, but its general-purpose transactional storage engine is not tuned specifically for sustained, very high write rates in the way Scylla is.
Consistency vs Availability: In terms of the CAP theorem (Consistency, Availability, Partition Tolerance), Microsoft SQL Server favors consistency. Transactions are ACID, and in a replicated setup only the primary accepts writes, so a network partition reduces availability rather than letting replicas diverge. Scylla, being a NoSQL database, focuses more on availability and partition tolerance. Its consistency level is chosen per operation, which means it can give up some consistency guarantees in favor of high availability and fault tolerance when the application allows it.
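The trade-off in Cassandra-style databases such as Scylla can be made concrete with a little arithmetic. With a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas only when R + W > RF. The following is a minimal sketch of that rule; the function names are illustrative and not part of any driver API.

```python
def quorum(rf: int) -> int:
    """Number of replicas that form a majority for replication factor rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return read_replicas + write_replicas > rf


RF = 3
# Quorum writes plus quorum reads: 2 + 2 > 3, so reads see the latest write.
print(quorum(RF), read_overlaps_write(RF, quorum(RF), quorum(RF)))
# Writing to one replica and reading from one: 1 + 1 <= 3, so a read may be stale.
print(read_overlaps_write(RF, 1, 1))
```

Lower values of R and W keep the cluster answering when replicas are unreachable, at the price of possibly stale reads.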
Data Distribution: Microsoft SQL Server replicates data from a primary to one or more secondary replicas (for example with Always On availability groups). The primary handles write operations, and the secondaries can serve reads and act as failover targets. In contrast, Scylla has no primary node: every node is a peer, a gossip protocol shares cluster state between nodes, and rows are spread across the cluster by hashing the partition key, with each row stored on several replicas for fault tolerance.
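How a peer-to-peer cluster decides where a row lives can be illustrated with a toy hash ring. This is a sketch of the general consistent-hashing idea only, not Scylla's implementation: the `Ring` class, the node names, and the use of MD5 as a stand-in for the database's own partitioner are all assumptions made for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    # Stand-in hash for illustration; a real cluster uses its own partitioner.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    """Toy token ring: each node owns several points (virtual nodes) on the ring."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key: str, rf: int = 3):
        """Walk clockwise from the key's token, collecting rf distinct nodes."""
        start = bisect.bisect_right(self._tokens, token(partition_key))
        owners = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == rf:
                break
        return owners


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("recording:42"))  # three distinct nodes hold this partition
```

Because placement depends only on the hash of the key, any node can work out which peers hold a row without asking a central coordinator.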
Cost: Microsoft SQL Server is a commercial product, which means it comes with licensing fees that can be quite significant, especially for larger deployments (free Express and Developer editions exist, with usage restrictions). Scylla has been distributed as open source under the AGPL for most of its history, alongside a commercial enterprise edition, so it can be a more cost-effective option. Its licensing terms have changed over time, so check the current license before committing, and note that hardware and operational costs still apply.
In summary, Microsoft SQL Server is a relational database management system that prioritizes consistency and transactional guarantees, while Scylla is a NoSQL database with a focus on scalability, high write performance, and availability. Additionally, Microsoft SQL Server comes with licensing costs, whereas Scylla can be run without comparable fees, subject to its current license terms.
I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:
- I need to use either MySQL or PostgreSQL on a Linux based OS. Which would be better for this application?
- I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.
Hi Erin,
Honestly both databases will do the job just fine. I personally prefer Postgres.
Much more important is how you store the audio. While you could technically use a blob type column, it's really not ideal to be storing audio files which are "several hours long" in a database row. Instead consider storing the audio files in an object store (hosted options include backblaze b2 or aws s3) and persisting the key (which references that object) in your database column.
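A rough sketch of that layout, for illustration only: it uses Python's built-in sqlite3 so it runs without a server, but the same table design carries over to PostgreSQL or MySQL. The table name, column names, and object key below are made-up examples, and the upload to the object store itself is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE recordings (
        id               INTEGER PRIMARY KEY,
        title            TEXT NOT NULL,
        recorded_at      TEXT NOT NULL,        -- ISO 8601 date and time
        location         TEXT NOT NULL,
        duration_seconds INTEGER NOT NULL,
        object_key       TEXT NOT NULL UNIQUE  -- points at the file in the object store
    )
    """
)

# Only searchable metadata and the object key go into the row; the audio stays outside.
conn.execute(
    "INSERT INTO recordings (title, recorded_at, location, duration_seconds, object_key) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Evening session", "2021-06-04T19:30:00", "Lisbon", 3 * 3600,
     "recordings/2021/evening-session.flac"),
)


def find_by_location(connection, location):
    """Return (title, object_key) pairs for one location, oldest first."""
    return connection.execute(
        "SELECT title, object_key FROM recordings WHERE location = ? ORDER BY recorded_at",
        (location,),
    ).fetchall()


print(find_by_location(conn, "Lisbon"))
```

The application then uses the returned key to fetch or stream the file from the object store, so the database only ever handles small rows.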
Hi Erin, Chances are you would want to store the files in a blob type. Both MySQL and Postgres support this. Can you explain a little more about your need to store the files in the database? It may be more effective to store the files on a file system or something like S3. To answer your question based on what you are describing, I would slightly lean towards PostgreSQL since it tends to be a little better on the data warehousing side.
Hey Erin! I would recommend checking out Directus before you start work on building your own app for them. I just stumbled upon it, and so far I am extremely happy with its functionality. If your client is just looking for a simple web app for their own data, then Directus may be a great option. It offers "database mirroring", so that you can connect it to any database and set up functionality around it!
Hi Erin! First of all, you'd probably want to go with a managed service. Don't spin up your own MySQL installation on your own Linux box. If you are on AWS, they have different offerings for database services: standard RDS vs. Aurora. Aurora would be my preferred choice given the benefits it offers, the storage optimizations it comes with, etc. Such managed services easily allow you to apply new security patches and upgrades, set up backups, replication, etc. Doing this on your own would either be risky or inefficient, or you might just give up. As far as which database to choose, you'll have the choice between PostgreSQL, MySQL, MariaDB, SQL Server, etc. I personally would recommend MySQL (latest version available), as the official tooling for it (MySQL Workbench) is great, stable, and moreover free. Other database services exist; I'd recommend you also explore DynamoDB.
Regardless, you'd certainly keep only high-level records and metadata in the database, and the actual files most likely in S3, so that you can keep all options open in terms of what you'll do with them.
Hi Erin,
- Coming from "Big" DB engines, such as Oracle or MSSQL, go for PostgreSQL. You'll get all the features you need with PostgreSQL.
- Your case seems to point to a "NoSQL" or document database use case. PostgreSQL covers this as well, with excellent performance on JSON-based objects, which is a second reason to choose it. MongoDB might be an excellent option too if you need "sharding" and excellent map-reduce mechanisms for very massive data sets. You really should investigate the NoSQL option for your use case.
- Starting with AWS Aurora is excellent advice, since "vendor lock-in" is limited, but I did not check for JSON-based object / NoSQL features.
- If you stick to Linux server, the PostgreSQL or MySQL provided with your distribution are straightforward to install (i.e. apt install postgresql). For PostgreSQL, make sure you're comfortable with the pg_hba.conf, especially for IP restrictions & accesses.
Regards,
I recommend Postgres as well. Superior performance overall and a more robust architecture.
The problem I have is this: we need to process and change (update/insert) 55M records every 2 minutes, and the updated data must be available to a REST API for filtering and selection. The response time for the REST API should be less than 1 second.
The most important factor for me is fitting the processing and storing into the 2-minute window. There need to be two views of the data: one for selection and one for changed data.
Scylla can handle 1M events/s with a simple data model quite easily. The API for queries is CQL; we have a REST API, but that's for control and monitoring.
Cassandra is quite capable of the task, in a highly available way, given appropriate scaling of the system. Remember that updates are only inserts, and that efficient retrieval is only by key (which can be a complex key). Talking of keys, make sure that the keys are well distributed.
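To illustrate what "updates are only inserts" means, here is a toy model of the write path: every write to a key is an insert carrying a timestamp, and the value with the newest timestamp wins. This is a deliberately simplified sketch (the `UpsertTable` class is hypothetical, and tie-breaking and deletes are not modeled), not how either database is implemented.

```python
class UpsertTable:
    """Toy last-write-wins store: an 'update' is just another insert for the same key."""

    def __init__(self):
        self._rows = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self._rows.get(key)
        # No read-before-write decision by the caller: the newest timestamp wins.
        if current is None or timestamp > current[0]:
            self._rows[key] = (timestamp, value)

    def read(self, key):
        row = self._rows.get(key)
        return None if row is None else row[1]


table = UpsertTable()
table.write(("sensor-1", "2024-01-01"), "first", timestamp=1)
table.write(("sensor-1", "2024-01-01"), "second", timestamp=2)  # acts as an update
table.write(("sensor-1", "2024-01-01"), "late", timestamp=1)    # older write loses
print(table.read(("sensor-1", "2024-01-01")))  # second
```

The tuple key stands in for a composite ("complex") key; lookups by anything other than that key would need a different table or an index.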
By 55M do you mean 55 million entity changes per 2 minutes? That is relatively high: almost 460k per second. If I had to choose between Scylla and Cassandra, I would opt for Scylla, as it promises better performance for simple operations. However, it may be worth considering yet another technology. Take into account the required consistency, reliability and high availability, and you may realize that there are more suitable ones. The REST API should not be the main driver, because you can always develop the API yourself if the chosen technology does not provide one.
I love Scylla for pet projects, but its license, which is based on a server model, is an issue. Thus I recommend Cassandra.
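The arithmetic behind that estimate, as a quick sketch. The per-node capacity used for the sizing line is a purely hypothetical figure for illustration; real numbers depend on hardware, data model, and consistency level and have to be measured.

```python
import math


def required_rate(changes: int, window_seconds: int) -> float:
    """Sustained writes per second needed to fit all changes into the window."""
    return changes / window_seconds


rate = required_rate(55_000_000, 2 * 60)
print(round(rate))  # 458333 writes per second

# Hypothetical capacity of one node, NOT a benchmark result.
ASSUMED_WRITES_PER_NODE = 100_000
print(math.ceil(rate / ASSUMED_WRITES_PER_NODE))  # 5 nodes, before replication
```

Replication multiplies the physical write load, so the node count from this calculation is a lower bound rather than a sizing recommendation.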
The Gentlent Tech Team made lots of updates within the past year. The biggest one being our database:
We decided to migrate our #PostgreSQL -based database systems to a custom implementation of #Cassandra . This allows us to integrate our product data perfectly in a system that just makes sense. High availability and scalability are supported out of the box.
Pros of Microsoft SQL Server
- Reliable and easy to use (139)
- High performance (102)
- Great with .net (95)
- Works well with .net (65)
- Easy to maintain (56)
- Azure support (21)
- Full Index Support (17)
- Always on (17)
- Enterprise manager is fantastic (10)
- In-Memory OLTP Engine (9)
- Easy to setup and configure (2)
- Security is forefront (2)
- Faster Than Oracle (1)
- Decent management tools (1)
- Great documentation (1)
- Docker Delivery (1)
- Columnstore indexes (1)
Pros of ScyllaDB
- Replication (2)
- Fewer nodes (1)
- Distributed (1)
- Scale up (1)
- High availability (1)
- Written in C++ (1)
- High performance (1)
Cons of Microsoft SQL Server
- Expensive Licensing (4)
- Microsoft (2)