RocksDB vs Scylla: What are the differences?
Introduction:
Here we discuss the key differences between RocksDB and Scylla. Both are popular database systems, but they solve different problems: one is an embedded storage engine, the other a distributed database. Understanding these differences helps users choose the one that best fits their requirements.
Storage Model: RocksDB is an embedded key-value store, built on a log-structured merge-tree, that runs as a library inside the application process. It is optimized for fast storage and retrieval of key-value pairs and handles both read-heavy and write-heavy workloads efficiently. Scylla, on the other hand, is a wide-column store that is compatible with Apache Cassandra and reimplemented in C++. It is known for handling large volumes of data with high write and read performance.
Consistency Model: RocksDB is a single-node database with strict consistency and atomicity guarantees. Writes, including multi-key write batches, are applied atomically, and reads see a consistent view of the data. In contrast, Scylla is a distributed database with tunable consistency: each query chooses a consistency level (such as ONE, QUORUM, or ALL), trading latency and availability against how up to date reads are. At low consistency levels replicas converge eventually, while quorum reads and writes give stronger guarantees and still tolerate node failures.
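As a rough illustration of how consistency is tuned in Cassandra-style systems such as Scylla, the sketch below checks the usual overlap rule: a read is guaranteed to reach at least one replica that holds the latest acknowledged write when the number of write acknowledgements plus the number of read acknowledgements exceeds the replication factor. The function names are illustrative only and are not part of any driver API.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for the given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when any set of read replicas must overlap any set of write replicas."""
    return write_acks + read_acks > replication_factor


rf = 3
# Quorum writes plus quorum reads overlap on at least one replica.
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))
# Single-replica writes and reads may miss each other, so a read can be stale.
print(reads_see_latest_write(rf, 1, 1))
```

With a replication factor of 3, quorum is 2, so quorum reads and writes overlap while single-replica reads and writes do not.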
Replication and Scalability: RocksDB does not provide built-in support for replication or horizontal scaling. It is, however, commonly embedded as the local storage engine inside distributed systems (for example TiKV or Kafka Streams), which implement replication and sharding themselves. Scylla, on the other hand, is designed for large-scale deployments and provides built-in replication and horizontal scalability. It uses a masterless architecture that distributes and replicates data automatically across multiple nodes.
Data Model: RocksDB stores keys and values as opaque byte strings and does not support complex data types or secondary indexes out of the box. Keys are kept in sorted order, so ordered iteration and prefix scans are possible, but any richer structure must be built by the application. Scylla, on the other hand, supports a wide range of data types through CQL and allows the creation of secondary indexes. It also supports more advanced queries, including range scans within a partition and aggregations.
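To make the difference in data models concrete, here is a minimal in-memory sketch of how an application might layer row-like records on top of a sorted key-value store by encoding the row and column into the key. `SortedKV` is a hypothetical stand-in written for this example, not the RocksDB API, and the `user:<id>:<column>` key layout is an assumption chosen for illustration.

```python
import bisect


class SortedKV:
    """Toy ordered key-value store with byte-string keys and values."""

    def __init__(self):
        self._keys = []      # keys kept in sorted order
        self._values = {}    # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: bytes):
        return self._values.get(key)

    def scan_prefix(self, prefix: bytes):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


store = SortedKV()
store.put(b"user:42:name", b"Ada")
store.put(b"user:42:email", b"ada@example.com")
store.put(b"user:7:name", b"Bob")

# All "columns" of the "row" user 42, read with one ordered prefix scan.
for key, value in store.scan_prefix(b"user:42:"):
    print(key, value)
```

A wide-column store such as Scylla provides this grouping natively through partition and clustering keys, whereas with a plain key-value store the key encoding and any secondary indexes are the application's responsibility.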
Concurrency Control: RocksDB supports concurrent reads and writes from multiple threads within a single process. Snapshots, implemented with sequence numbers, give readers a consistent view of the data without blocking writers. Scylla, being a distributed database, partitions data by token across nodes (and across CPU cores within each node), so operations on different partitions proceed independently. Conflicting writes to the same cell are resolved by timestamp (last write wins), and lightweight transactions are available for conditional updates that need stronger isolation.
Performance: RocksDB is known for its high performance and low-latency data access. It is optimized for fast storage and retrieval and can handle high write and read workloads efficiently. Scylla, on the other hand, is designed to provide scalable throughput and low latency for large-scale deployments. It is capable of handling millions of operations per second and can scale horizontally to handle massive amounts of data efficiently.
In summary, RocksDB is an embedded key-value store optimized for fast storage and retrieval on a single node, while Scylla is a wide-column store designed for high-performance handling of large volumes of data in distributed environments.
The problem I have is this: we need to process and change (update/insert) 55M records every 2 minutes, and the updated data must be available through a REST API for filtering and selection. The response time of the REST API should be less than 1 second.
The most important factor for me is the 2-minute window for processing and storing. There need to be two views of the data: one for selection and one for changed data.
Scylla can handle 1M events per second quite easily with a simple data model. The query API is CQL; we do have a REST API, but that is for control and monitoring.
Cassandra is quite capable of the task, in a highly available way, given appropriate scaling of the system. Remember that updates are written as inserts, and that efficient retrieval is only by key (which can be a composite key). Speaking of keys, make sure that they are well distributed.
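The advice about well-distributed keys can be checked before loading any data. The sketch below hashes candidate partition keys onto a fixed number of nodes and counts how many land on each. It is a simplification under stated assumptions: Cassandra and Scylla assign token ranges using a Murmur3-based partitioner, whereas this example uses MD5 with a modulo purely as a stand-in, and the `device-<n>` and date keys are made up for illustration.

```python
import hashlib
from collections import Counter


def node_for(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index (stand-in for a real partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def load_per_node(keys, node_count: int) -> list:
    """Count how many keys land on each node."""
    counts = Counter(node_for(key, node_count) for key in keys)
    return [counts.get(node, 0) for node in range(node_count)]


# High-cardinality key: one partition per device spreads across all nodes.
even = load_per_node((f"device-{i}" for i in range(60_000)), 6)
# Low-cardinality key: every row shares one date, so one node takes all the load.
skewed = load_per_node(("2024-01-01" for _ in range(60_000)), 6)

print("per-device key:", even)
print("per-day key:   ", skewed)
```

A key with many distinct values produces roughly equal counts per node, while a key shared by most rows concentrates the whole write load on a single node and its replicas.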
By 55M, do you mean 55 million entity changes every 2 minutes? That is relatively high: almost 460k per second. If I had to choose between Scylla and Cassandra, I would opt for Scylla, as it promises better performance for simple operations. However, it may be worth considering other technologies as well. Take the required consistency, reliability, and high availability into account, and you may find that there are more suitable ones. The REST API should not be the main driver, because you can always develop the API yourself if the chosen technology does not provide one.
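The throughput figure follows directly from the numbers in the question. The short calculation below reproduces it and compares it with the 1M events per second mentioned in the earlier reply; that figure is taken from the reply as stated and is not a measured benchmark.

```python
records_per_batch = 55_000_000   # changes per processing window (from the question)
window_seconds = 2 * 60          # the 2-minute window

required_rate = records_per_batch / window_seconds
print(f"required write rate: {required_rate:,.0f} per second")

# Share of the 1M events/s quoted in the earlier reply that this workload would use.
quoted_capacity = 1_000_000
print(f"share of quoted capacity: {required_rate / quoted_capacity:.0%}")
```

This leaves roughly half of the quoted capacity as headroom, before accounting for read traffic from the REST API or for replication.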
I love Scylla for pet projects; however, its license, which is based on a server model, is an issue. Thus I recommend Cassandra.
The Gentlent Tech Team made lots of updates within the past year. The biggest one was our database:
We decided to migrate our PostgreSQL-based database systems to a custom implementation of Cassandra. This allows us to integrate our product data perfectly in a system that just makes sense. High availability and scalability are supported out of the box.
Pros of RocksDB
- Very fast (5)
- Made by Facebook (3)
- Consistent performance (2)
- Ability to add logic to the database layer where needed (1)
Pros of ScyllaDB
- Replication (2)
- Fewer nodes (1)
- Distributed (1)
- Scale up (1)
- High availability (1)
- Written in C++ (1)
- High performance (1)