Cassandra vs Hazelcast

Overview

Cassandra

Stacks: 3.6K
Followers: 3.5K
Votes: 507
GitHub Stars: 9.5K
Forks: 3.8K

Hazelcast

Stacks: 428
Followers: 474
Votes: 59
GitHub Stars: 6.4K
Forks: 1.9K

Cassandra vs Hazelcast: What are the differences?

Introduction: Cassandra and Hazelcast are both widely used distributed data systems, but they target different problems: Cassandra is a disk-backed distributed database, while Hazelcast is an in-memory data grid. They also differ in how they handle data distribution, scalability, and fault tolerance. Understanding these differences can help businesses choose the right solution for their specific needs.

Data Distribution Strategy: Cassandra employs a masterless architecture where data is distributed across multiple nodes using a peer-to-peer model. Every node in the cluster is equal, and any node can act as the coordinator for a client request. Hazelcast is also peer-to-peer rather than master-slave: every member holds a share of the data, and clients talk to the members that own it. The oldest member in the cluster takes on extra bookkeeping, such as assigning partitions to members, but if it fails the next-oldest member takes over, so it is not a single point of failure.
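Consistency and Availability: Cassandra offers tunable consistency: each read or write can specify a consistency level (for example ONE, QUORUM, or ALL), trading consistency against latency and availability. Hazelcast replicates its in-memory structures, such as the distributed map, from a partition owner to backup replicas. Reads and writes normally go to the owner, but after a member failure or network partition the backups may lag behind, so temporary inconsistencies are possible. Recent Hazelcast versions also include a separate CP Subsystem for data structures that need stronger guarantees.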
Partitioning Strategy: Cassandra uses consistent hashing to distribute data across nodes in a cluster. Each row's partition key is hashed to a token, and each node owns one or more ranges of tokens on a ring. Hazelcast divides its data into a fixed number of partitions (271 by default); a key is hashed to a partition ID, and each partition is assigned to a member, with assignments rebalanced as members join or leave.
Querying Language: Cassandra uses its own query language, CQL (Cassandra Query Language), which resembles SQL but omits features such as joins and is built around access by partition key. Hazelcast is an in-memory data grid that is accessed mainly through client APIs in various programming languages; it also supports predicate-based queries on maps, and newer versions add a SQL interface.
Data Model: Cassandra is a wide-column (partitioned row) store. With CQL, tables have a declared schema, but rows are sparse: a row need not set a value for every column, and columns can be added without rewriting existing data. Hazelcast is at its core a key-value store built around a distributed map, alongside other distributed structures such as queues, sets, lists, and topics.
Integration with Other Systems: Cassandra integrates with Apache Hadoop and Apache Spark (the latter through a Spark connector), making it suitable for big data analytics workflows. Hazelcast provides connectors and integrations for various systems and frameworks, including Apache Kafka, Apache Camel, Spring, and Hibernate.
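
The two partitioning styles described above can be compared with a small, library-free sketch. It is only an illustration under stated assumptions: the class names TokenRing and FixedPartitions are hypothetical, MD5 stands in for the hash functions the real systems use, and the round-robin partition table is a simplification of how a real cluster assigns and rebalances partitions. The partition count of 271 is Hazelcast's documented default.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """64-bit hash derived from MD5; a stand-in for the real partitioner hash."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class TokenRing:
    """Ring-style placement: each node owns the hash ranges ending at its tokens."""

    def __init__(self, nodes, tokens_per_node=8):
        # Several tokens per node (virtual nodes) spread each node around the ring.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key: str) -> str:
        # First token >= hash(key), wrapping around to the start of the ring.
        idx = bisect.bisect_left(self._tokens, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


class FixedPartitions:
    """Fixed-partition placement: key -> partition ID -> owning member."""

    def __init__(self, members, partition_count=271):
        self.partition_count = partition_count
        # Simplified round-robin table; real clusters rebalance it on membership change.
        self._table = {p: members[p % len(members)] for p in range(partition_count)}

    def partition_id(self, key: str) -> int:
        return stable_hash(key) % self.partition_count

    def owner(self, key: str) -> str:
        return self._table[self.partition_id(key)]
```

In the ring model, adding a node only moves the keys that fall into the new node's token ranges. In the fixed-partition model, a key's partition ID never changes; only the partition-to-member table does.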

In summary, Cassandra and Hazelcast differ in their data distribution strategy, consistency and availability models, partitioning strategies, querying languages, data models, and integration capabilities. Understanding these differences can help businesses make informed decisions when selecting the right distributed data solution for their needs.
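
The consistency trade-off on the Cassandra side comes down to simple arithmetic: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below illustrates that rule only; the function names are hypothetical, only a few consistency levels are modeled, and it does not talk to a cluster.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    needed = levels[level]
    if needed > replication_factor:
        raise ValueError(f"{level} needs {needed} replicas, only {replication_factor} exist")
    return needed


def read_overlaps_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, writing and reading at QUORUM touches 2 + 2 replicas, so the sets must overlap; writing and reading at ONE touches 1 + 1, so a read may hit a replica that has not yet received the write.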

Advice on Cassandra, Hazelcast

Vinay

Head of Engineering

Sep 19, 2019

Needs advice

The problem I have is this: we need to process and change (update/insert) 55M records every 2 minutes, and the updated data must be available to a REST API for filtering and selection. The REST API response time should be less than 1 second.

The most important factor for me is completing the processing and storing within the 2-minute window. There need to be 2 views of the data: one for selection, and one for changed data.

Detailed Comparison

Cassandra	Hazelcast
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.	With its various distributed data structures, distributed caching capabilities, elastic nature, memcache support, integration with Spring and Hibernate and, more importantly, so many happy users, Hazelcast is a feature-rich, enterprise-ready, and developer-friendly in-memory data grid solution.
-	Distributed implementations of java.util.{Queue, Set, List, Map}; Distributed implementation of java.util.concurrent.locks.Lock; Distributed implementation of java.util.concurrent.ExecutorService; Distributed MultiMap for one-to-many relationships; Distributed Topic for publish/subscribe messaging; Synchronous (write-through) and asynchronous (write-behind) persistence; Transaction support; Socket level encryption support for secure clusters; Second level cache provider for Hibernate; Monitoring and management of the cluster via JMX; Dynamic HTTP session clustering; Support for cluster info and membership events; Dynamic discovery, scaling, partitioning with backups and fail-over
Statistics
GitHub Stars 9.5K	GitHub Stars 6.4K
GitHub Forks 3.8K	GitHub Forks 1.9K
Stacks 3.6K	Stacks 428
Followers 3.5K	Followers 474
Votes 507	Votes 59
Pros & Cons
Pros: Distributed (119), High performance (98), High availability (81), Easy scalability (74), Replication (53). Cons: Reliability of replication (3), Size (2), Updates (1).	Pros: High availability (11), Distributed compute (6), Distributed locking (6), Sharding (5), Load balancing (4). Cons: License needed for SSL (4).
Integrations
No integrations available	Java, Spring

What are some alternatives to Cassandra, Hazelcast?

MongoDB

MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

Redis

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.

MySQL

The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

PostgreSQL

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

Microsoft SQL Server

Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.

SQLite

SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

MariaDB

Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry. MariaDB is designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.

RethinkDB

RethinkDB is built to store JSON documents, and scale to multiple machines with very little effort. It has a pleasant query language that supports really useful queries like table joins and group by, and is easy to set up and learn.

ArangoDB

A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
