Cassandra vs Google Cloud Bigtable

Overview

Cassandra

Stacks: 3.6K
Followers: 3.5K
Votes: 507
GitHub Stars: 9.5K
Forks: 3.8K

Google Cloud Bigtable

Stacks: 173
Followers: 363
Votes: 25

Cassandra vs Google Cloud Bigtable: What are the differences?

Introduction

Cassandra and Google Cloud Bigtable are both NoSQL databases, designed for handling large amounts of data with high scalability and performance. However, they have some key differences that set them apart from each other.

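Data Model: Both are wide-column stores. Cassandra organizes data into tables declared through CQL, where a partition key determines placement and clustering columns order the rows within a partition; columns can be added to a table later with ALTER TABLE. Google Cloud Bigtable exposes a sparse, distributed, persistent multidimensional sorted map: each value is addressed by a row key, a column (family and qualifier), and a timestamp. Rows are kept sorted by row key, and columns that are unused in a row take up no space.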
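Consistency Model: Cassandra offers tunable consistency: each read or write specifies a consistency level (such as ONE, QUORUM, or ALL), so users can trade latency for stronger guarantees per operation. Google Cloud Bigtable is strongly consistent within a single cluster. When an instance is replicated across clusters, replication is eventually consistent by default, so a read served by another cluster may briefly return older data; routing an application's requests to a single cluster provides read-your-writes or strong consistency.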
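Concurrency Control: Cassandra resolves conflicting concurrent updates with "last write wins": every cell carries a write timestamp, and the value with the highest timestamp prevails. For compare-and-set semantics it also offers lightweight transactions (IF conditions in CQL), which are based on Paxos. Google Cloud Bigtable makes mutations to a single row atomic and provides conditional (check-and-mutate) and read-modify-write operations on a row, but it has no multi-row transactions. Cells are versioned by timestamp, and a read can return the most recent version.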
Storage Architecture: Cassandra employs a peer-to-peer architecture with no primary node; data is partitioned and replicated across the nodes of the cluster. Each node buffers writes in an in-memory memtable backed by a commit log and flushes them to immutable SSTable files on its own disks. Google Cloud Bigtable separates compute from storage: tables are split into tablets (contiguous row ranges) that are stored as SSTable files on Colossus, Google's distributed file system, while Bigtable nodes serve tablets without holding the data locally.
Query Language: Cassandra uses CQL (Cassandra Query Language), whose syntax resembles SQL but which has no joins and expects most queries to be driven by the primary key. Google Cloud Bigtable has traditionally been accessed through client libraries and an HBase-compatible API, using row-key lookups, range scans, and filters rather than a query language; Google has more recently added SQL query support.
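Scaling: Both Cassandra and Google Cloud Bigtable are designed for horizontal scalability. Cassandra hashes partition keys onto a token ring and redistributes token ranges as nodes are added or removed, though operators provision and manage the nodes themselves. Google Cloud Bigtable also shards automatically: it splits, merges, and rebalances tablets across nodes without manual sharding, and capacity is changed by adding or removing nodes, either manually or through autoscaling. In both systems, a poorly chosen partition key or row key can still create hotspots.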
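
Automatic partitioning of the kind described under Scaling is commonly built on consistent hashing. The sketch below is a minimal illustration of that idea, not Cassandra's implementation: Cassandra's default partitioner uses a Murmur3 hash and multiple virtual nodes per machine, whereas this example assumes MD5 and a single token per node for brevity. The names HashRing, token, and the node and key names are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is a stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with one token per node."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        position = token(node)
        bisect.insort(self._tokens, position)
        self._owners[position] = node

    def remove_node(self, node: str) -> None:
        position = token(node)
        self._tokens.remove(position)
        del self._owners[position]

    def owner(self, key: str) -> str:
        """Return the node whose token is the first one at or after the key's token."""
        if not self._tokens:
            raise ValueError("ring has no nodes")
        index = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        return self._owners[self._tokens[index]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {key: ring.owner(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.owner(key) for key in keys}

# Only keys that now belong to the new node change owner; the rest stay put.
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved, all to node-d:",
      all(after[key] == "node-d" for key in moved))
```

The point of the design is that adding or removing a machine relocates only the keys in the affected token range, which is what lets a cluster grow without a full reshuffle of the data.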

In summary, Cassandra and Google Cloud Bigtable differ in their data model, consistency model, concurrency control, storage architecture, query language, and scaling mechanisms.
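
Of the differences summarized above, Cassandra's tunable consistency is the one most easily reduced to arithmetic. With a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > RF. The sketch below checks that rule for a few combinations; the function names are illustrative and are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that forms a majority of the replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)

# QUORUM reads with QUORUM writes: the replica sets always overlap.
print("QUORUM/QUORUM:", reads_see_latest_write(q, q, rf))
# ONE reads with ONE writes: a read may hit a replica the write has not reached.
print("ONE/ONE:", reads_see_latest_write(1, 1, rf))
# ONE reads with ALL writes: every replica already has the write.
print("ONE/ALL:", reads_see_latest_write(1, rf, rf))
```

Lower consistency levels reduce latency and tolerate more unavailable replicas, at the cost of possibly reading stale data until replicas converge.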


Advice on Cassandra and Google Cloud Bigtable

Vinay

Head of Engineering

Sep 19, 2019

Needs advice

The problem I have is this: we need to process and change (update/insert) 55M records every 2 minutes, and the updated data must be available to a REST API for filtering and selection. The REST API response time should be under 1 second.

The most important factor for me is keeping processing and storage within the 2-minute window. There need to be two views of the data: one for selection, and one for the changed data.
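
To put the requirement in perspective, 55M changes every 2 minutes works out to roughly 458,000 writes per second sustained. The sketch below does that arithmetic and divides the load evenly across a cluster; the node count is a purely hypothetical figure chosen for illustration, and the calculation ignores replication overhead and key skew.

```python
def required_rate(records: int, window_seconds: float) -> float:
    """Sustained records per second needed to finish within the window."""
    return records / window_seconds


def rate_per_node(total_rate: float, node_count: int) -> float:
    """Even share of the load per node, ignoring replication and skew."""
    return total_rate / node_count


# Figures from the question: 55M records every 2 minutes.
total = required_rate(55_000_000, 120)
print(f"{total:,.0f} records/s overall")

# Assumption for illustration only; not taken from the question.
HYPOTHETICAL_NODE_COUNT = 12
print(f"{rate_per_node(total, HYPOTHETICAL_NODE_COUNT):,.0f} records/s per node")
```

Any candidate database would need to be benchmarked against this sustained rate with realistic record sizes before the 2-minute window can be considered achievable.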


Detailed Comparison

Cassandra

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

Google Cloud Bigtable

Google Cloud Bigtable offers you a fast, fully managed, massively scalable NoSQL database service that's ideal for web, mobile, and Internet of Things applications requiring terabytes to petabytes of data. Unlike comparable market offerings, Cloud Bigtable doesn't require you to sacrifice speed, scale, or cost efficiency when your applications grow. Cloud Bigtable has been battle-tested at Google for more than 10 years; it's the database driving major applications such as Google Analytics and Gmail.

Key features of Google Cloud Bigtable (none are listed for Cassandra)

Unmatched Performance: Single-digit millisecond latency and over 2X the performance per dollar of unmanaged NoSQL alternatives.
Open Source Interface: Because Cloud Bigtable is accessed through the HBase API, it is natively integrated with much of the existing big data and Hadoop ecosystem and supports Google’s big data products. Additionally, data can be imported from or exported to existing HBase clusters through simple bulk ingestion tools using industry-standard formats.
Low Cost: By providing a fully managed service and exceptional efficiency, Cloud Bigtable’s total cost of ownership is less than half the cost of its direct competition.
Security: Cloud Bigtable is built with a replicated storage strategy, and all data is encrypted both in-flight and at rest.
Simplicity: Creating or reconfiguring a Cloud Bigtable cluster is done through a simple user interface and can be completed in less than 10 seconds. As data is put into Cloud Bigtable the backing storage scales automatically, so there’s no need to do complicated estimates of capacity requirements.
Maturity: Over the past 10+ years, Bigtable has driven Google’s most critical applications. In addition, the HBase API is an industry-standard interface for combined operational and analytical workloads.

Statistics

	Cassandra	Google Cloud Bigtable
GitHub Stars	9.5K	-
GitHub Forks	3.8K	-
Stacks	3.6K	173
Followers	3.5K	363
Votes	507	25

Pros & Cons

Cassandra pros: 119 Distributed; 98 High performance; 81 High availability; 74 Easy scalability; 53 Replication
Cassandra cons: 3 Reliability of replication; 2 Size; 1 Updates
Google Cloud Bigtable pros: 11 High performance; 9 Fully managed; 5 High scalability

Integrations

Cassandra: No integrations available
Google Cloud Bigtable: Heroic, Hadoop, Apache Spark

What are some alternatives to Cassandra and Google Cloud Bigtable?

MongoDB

MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

MySQL

The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

PostgreSQL

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

Microsoft SQL Server

Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.

SQLite

SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

MariaDB

Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry. MariaDB is designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.

RethinkDB

RethinkDB is built to store JSON documents, and scale to multiple machines with very little effort. It has a pleasant query language that supports really useful queries like table joins and group by, and is easy to set up and learn.

Amazon DynamoDB

With Amazon DynamoDB, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.

ArangoDB

A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
