HBase vs RethinkDB: What are the differences?
Developers describe HBase as "The Hadoop database, a distributed, scalable, big data store". Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop. RethinkDB, on the other hand, is described as "JSON. Scales to multiple machines with very little effort. Open source". RethinkDB is built to store JSON documents and to scale to multiple machines with very little effort. It has a pleasant query language that supports useful queries such as table joins and group by, and it is easy to set up and learn.
HBase and RethinkDB both belong to the "Databases" category of the tech stack.
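To make the contrast between the two data models concrete, here is a minimal, in-memory Python sketch. It does not use the HBase or RethinkDB client APIs; the class `VersionedTable`, the function `group_count`, and the sample row keys and fields are all hypothetical names chosen for illustration. The first part mimics a versioned, column-oriented store (row key, column, timestamp), and the second mimics a group-by over JSON-like documents.

```python
from collections import defaultdict


class VersionedTable:
    """Toy Bigtable-style table: row key -> column -> {timestamp: value}.

    Illustration only; this is not how HBase is implemented or accessed.
    """

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        # Each write keeps its own timestamped version instead of overwriting.
        self._rows[row][column][ts] = value

    def get(self, row, column, ts=None):
        # Return the newest version at or before `ts` (newest overall if None).
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if ts is None or t <= ts]
        return versions[max(candidates)] if candidates else None


def group_count(docs, field):
    """Count JSON-like documents per value of `field` (a group-by sketch)."""
    counts = defaultdict(int)
    for doc in docs:
        counts[doc[field]] += 1
    return dict(counts)


# Versioned, column-oriented access (hypothetical row key and column name).
table = VersionedTable()
table.put("user1", "info:city", "Paris", ts=1)
table.put("user1", "info:city", "Berlin", ts=2)
latest = table.get("user1", "info:city")         # "Berlin"
older = table.get("user1", "info:city", ts=1)    # "Paris"

# Document-style grouping (hypothetical documents).
docs = [
    {"name": "ann", "city": "Paris"},
    {"name": "bob", "city": "Berlin"},
    {"name": "cy", "city": "Paris"},
]
by_city = group_count(docs, "city")              # {"Paris": 2, "Berlin": 1}
```

In a real deployment the versioned lookups would go through an HBase client and the grouping would be expressed in RethinkDB's query language; the sketch only shows the shape of the data each system is designed around.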
"Performance" is the top reason why over 7 developers like HBase, while over 46 developers mention "Powerful query language" as the leading cause for choosing RethinkDB.
HBase and RethinkDB are both open source tools. RethinkDB, with 22.4K GitHub stars and 1.74K forks, appears to be more popular on GitHub than HBase, which has 2.91K stars and 2.01K forks.
Pinterest, HubSpot, and Yammer are some of the popular companies that use HBase, whereas RethinkDB is used by miDrive, Runbook, and The Control Group. HBase appears in more company stacks (54, compared to 37 for RethinkDB), while RethinkDB is listed in more developer stacks (25, compared to 18 for HBase).
We initially chose RethinkDB because of its schema-less document store features and a better durability and resilience story than MongoDB. In the end, it didn't work out quite as we expected: there are plenty of scalability issues, it's nearly impossible to run analytical workloads, and the small community makes working with Rethink a challenge. We're in the process of migrating all our workloads to PostgreSQL and, hopefully, we will be able to decommission our RethinkDB deployment soon.
The final output is inserted into HBase to serve the experiment dashboard. We also load the output data into Redshift for ad-hoc analysis. For real-time experiment data processing, we use Storm to tail Kafka, process the data as it arrives, and insert metrics into MySQL, so that we can identify group allocation problems and send out real-time alerts and metrics.
High-speed update-aware storage used in our region server infrastructure; provides a good middle layer for storage of rapidly modified information.
Main database, using it in multiple datacenters in an active-active configuration.