Alternatives to FoundationDB

CockroachDB, MongoDB, Cassandra, Redis, and Couchbase are the most popular alternatives and competitors to FoundationDB.

What is FoundationDB and what are its top alternatives?

FoundationDB is a distributed database designed to be scalable, fault-tolerant, and performant under heavy workloads. It uses a distributed transaction layer to keep data consistent across the nodes of a cluster. Key features include ACID transactions, multi-model data support, a distributed architecture, and automatic data sharding. One limitation is that FoundationDB may not be as feature-rich or as widely adopted as some other databases on the market.
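To make the idea of ACID transactions over a key-value store more concrete, here is a minimal in-memory sketch of optimistic concurrency control: a transaction records what it read and aborts at commit time if any of those keys changed in the meantime. This is an illustration only, not FoundationDB's client API or implementation; `ToyStore`, `Transaction`, and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when a key read by the transaction was modified before commit."""


class Transaction:
    def __init__(self, store, read_version):
        self._store = store
        self._read_version = read_version  # store version when we started
        self._reads = set()                # keys this transaction has read
        self._writes = {}                  # buffered writes, applied on commit

    def get(self, key, default=None):
        if key in self._writes:
            return self._writes[key]
        self._reads.add(key)
        return self._store._data.get(key, default)

    def set(self, key, value):
        self._writes[key] = value

    def commit(self):
        store = self._store
        # Abort if any key we read was written by a later commit.
        for key in self._reads:
            if store._versions.get(key, 0) > self._read_version:
                raise ConflictError(key)
        store._version += 1
        for key, value in self._writes.items():
            store._data[key] = value
            store._versions[key] = store._version


class ToyStore:
    """Hypothetical single-process store; stands in for a real cluster."""

    def __init__(self):
        self._data = {}      # key -> value
        self._versions = {}  # key -> version of the commit that last wrote it
        self._version = 0

    def begin(self):
        return Transaction(self, self._version)


store = ToyStore()
setup = store.begin()
setup.set("balance", 100)
setup.commit()

# Two concurrent transactions read the same key and try to update it.
t1, t2 = store.begin(), store.begin()
t1.set("balance", t1.get("balance") - 10)
t2.set("balance", t2.get("balance") - 20)
t1.commit()
try:
    t2.commit()
    outcome = "committed"
except ConflictError:
    outcome = "conflict"  # the caller would normally retry the transaction
final_balance = store.begin().get("balance")
```

The second transaction is rejected rather than silently overwriting the first one's update, which is the property that isolation is meant to provide.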

  1. CockroachDB: CockroachDB is a distributed SQL database that offers horizontal scalability, strong consistency, and ACID transactions. It is compatible with PostgreSQL, making it easy for developers to migrate their existing applications.

  2. TiDB: TiDB is an open-source distributed SQL database that combines the scalability of NoSQL systems with the ACID compliance of traditional RDBMS. It is highly scalable and can handle large volumes of data with ease.

  3. Apache Cassandra: Apache Cassandra is a distributed NoSQL database that is designed for high availability and scalability. It can handle large amounts of data across multiple nodes and data centers.

  4. Amazon DynamoDB: Amazon DynamoDB is a fully managed NoSQL database service offered by AWS. It is highly scalable, durable, and provides low-latency access to data.

  5. Google Cloud Spanner: Google Cloud Spanner is a globally distributed, horizontally scalable database service that offers strong consistency and high availability. It can handle both OLTP and OLAP workloads effectively.

  6. ScyllaDB: ScyllaDB is a highly scalable, distributed NoSQL database that is compatible with Apache Cassandra. It offers low-latency, high-throughput data access and can handle petabyte-scale workloads.

  7. YugabyteDB: YugabyteDB is a distributed SQL database that is compatible with PostgreSQL and offers high availability, scalability, and fault tolerance. It is designed for cloud-native applications.

  8. MongoDB: MongoDB is a popular NoSQL database that is known for its flexibility, scalability, and ease of use. It offers document-based storage and is suitable for a wide range of applications.

  9. Azure Cosmos DB: Azure Cosmos DB is a globally distributed, multi-model NoSQL database service offered by Microsoft Azure. It provides low latency, automatic scaling, and multiple consistency levels.

  10. RocksDB: RocksDB is an embeddable, persistent key-value store for fast storage. It is designed for high-performance applications that require low-latency data access and can be used as a building block for distributed systems.

Top Alternatives to FoundationDB

  • CockroachDB

    CockroachDB is a distributed SQL database that can be deployed in serverless, dedicated, or on-prem environments. It offers elastic scale, multi-active availability for resilience, and low-latency performance. ...

  • MongoDB

    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding. ...

  • Cassandra

    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL. ...

  • Redis

    Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. ...

  • Couchbase

    Developed as an alternative to traditionally inflexible SQL databases, the Couchbase NoSQL database is built on an open source foundation and architected to help developers solve real-world problems and meet high scalability demands. ...

  • PostgreSQL

    PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. ...

  • HBase

    Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop. ...

  • VoltDB

    VoltDB is a fundamental redesign of the RDBMS that provides unparalleled performance and scalability on bare-metal, virtualized and cloud infrastructures. VoltDB is a modern in-memory architecture that supports both SQL + Java with data durability and fault tolerance. ...
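
The Cassandra entry above describes data being spread across machines and repartitioned as machines join or leave. A rough way to picture this is a consistent-hash ring, sketched below. It is a toy, not Cassandra's partitioner: the `HashRing` class, the node names, the number of virtual nodes, and the use of MD5 for tokens are all assumptions made for illustration.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a ring position (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._tokens = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, node):
        for i in range(self.vnodes):
            t = token(f"{node}#{i}")
            self._owner[t] = node
            bisect.insort(self._tokens, t)

    def remove_node(self, node):
        self._tokens = [t for t in self._tokens if self._owner[t] != node]
        self._owner = {t: n for t, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the node owning the first token at or after the key's token."""
        if not self._tokens:
            raise LookupError("ring is empty")
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owner[self._tokens[i % len(self._tokens)]]


ring = HashRing()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

ring.remove_node("node-d")
restored = {k: ring.node_for(k) for k in keys}
```

When a node is added, only the keys that now fall to the new node change owner; every other key stays where it was, which is why the application does not need to know about the repartitioning.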

FoundationDB alternatives & related posts

CockroachDB

212
339
0
A distributed SQL database that scales fast, survives disaster, and thrives everywhere

      MongoDB

      93.4K
      80.6K
      4.1K
      The database for giant ideas
      PROS OF MONGODB
      • Document-oriented storage (827)
      • NoSQL (593)
      • Ease of use (553)
      • Fast (464)
      • High performance (410)
      • Free (255)
      • Open source (218)
      • Flexible (180)
      • Replication & high availability (145)
      • Easy to maintain (112)
      • Querying (42)
      • Easy scalability (39)
      • Auto-sharding (38)
      • High availability (37)
      • Map/reduce (31)
      • Document database (27)
      • Easy setup (25)
      • Full index support (25)
      • Reliable (16)
      • Fast in-place updates (15)
      • Agile programming, flexible, fast (14)
      • No database migrations (12)
      • Easy integration with Node.js (8)
      • Enterprise (8)
      • Enterprise Support (6)
      • Great NoSQL DB (5)
      • Support for many languages through different drivers (4)
      • Schemaless (3)
      • Aggregation Framework (3)
      • Drivers support is good (3)
      • Fast (2)
      • Managed service (2)
      • Easy to Scale (2)
      • Awesome (2)
      • Consistent (2)
      • Good GUI (1)
      • ACID compliant (1)
      CONS OF MONGODB
      • Very slow for connected models that require joins (6)
      • Not ACID compliant (3)
      • Proprietary query language (2)

      related MongoDB posts

      Jeyabalaji Subramanian

      Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

      We set ourselves the following criteria for the optimal tool that would do this job:

      - The data replication must be near real-time, yet it should NOT impact the production database
      - The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient

      Based on the above criteria, we selected the following tools to perform the end to end data replication:

      We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

      We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

      In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

      Next, we wrote a minimal microservice in Python to listen to the message events on SQS, pick up the data payload, and mirror the DB changes onto the target data warehouse. We implemented source-to-target data translation by modelling target table structures through SQLAlchemy. We deployed this microservice on AWS Lambda with Zappa. With Zappa, deploying your services as event-driven, horizontally scalable Lambda services is dumb-easy.

      In the end, we implemented a highly scalable, near-real-time Change Data Replication service that "works" and deployed it to production in a matter of a few days!
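
The consumer side of a pipeline like the one described in this post can be sketched as a function that applies one change message to a target table. Everything here is an assumption for illustration: the message shape (`op`, `id`, `doc`), the `customers` table, and the use of an in-memory SQLite database from the standard library in place of PostgreSQL and SQLAlchemy. It is not the author's service.

```python
import json
import sqlite3

# Hypothetical target table; the post models its real tables with SQLAlchemy.
SCHEMA = "CREATE TABLE IF NOT EXISTS customers (id TEXT PRIMARY KEY, name TEXT, email TEXT)"
COLUMNS = ("name", "email")


def apply_change(conn, message_body):
    """Mirror one change message (assumed JSON shape) onto the target table."""
    event = json.loads(message_body)
    op, doc_id = event["op"], event["id"]
    if op in ("insert", "update", "replace"):
        doc = event["doc"]
        values = [doc.get(col) for col in COLUMNS]
        # INSERT OR REPLACE (SQLite syntax) acts as an upsert, so a message
        # that is delivered twice leaves the table in the same state.
        conn.execute(
            "INSERT OR REPLACE INTO customers (id, name, email) VALUES (?, ?, ?)",
            [doc_id, *values],
        )
    elif op == "delete":
        conn.execute("DELETE FROM customers WHERE id = ?", (doc_id,))
    else:
        raise ValueError(f"unsupported operation: {op}")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
messages = [
    '{"op": "insert", "id": "1", "doc": {"name": "Ada", "email": "ada@example.com"}}',
    '{"op": "update", "id": "1", "doc": {"name": "Ada L.", "email": "ada@example.com"}}',
    '{"op": "insert", "id": "2", "doc": {"name": "Bob", "email": "bob@example.com"}}',
    '{"op": "delete", "id": "2"}',
]
for body in messages:
    apply_change(conn, body)
rows = conn.execute("SELECT id, name, email FROM customers ORDER BY id").fetchall()
```

Writing each operation as an upsert or a delete by key keeps the handler idempotent, which matters because standard SQS queues deliver messages at least once.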

      Robert Zuber

      We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.

      As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).

      When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.
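
The rate-limiting use of Redis mentioned in this post is often built as a counter per key and time window, for example with the Redis `INCR` and `EXPIRE` commands. The sketch below shows the same fixed-window idea with an in-memory dictionary standing in for Redis; `FixedWindowLimiter`, the `"github"` key, and the limits are hypothetical, and this is not the author's implementation.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock     # injectable so the example can control time
        self._counters = {}    # (key, window index) -> count; stands in for Redis keys

    def allow(self, key):
        window_index = int(self.clock() // self.window)
        # Drop counters from earlier windows (Redis would do this via key expiry).
        self._counters = {
            bucket: count
            for bucket, count in self._counters.items()
            if bucket[1] >= window_index
        }
        bucket = (key, window_index)
        count = self._counters.get(bucket, 0) + 1
        self._counters[bucket] = count
        return count <= self.limit


now = [0.0]  # fake clock so the example is deterministic
limiter = FixedWindowLimiter(limit=2, window=60, clock=lambda: now[0])
first_window = [limiter.allow("github") for _ in range(3)]
now[0] = 61.0  # move into the next window
next_window = limiter.allow("github")
```

A fixed window is the simplest variant; it can let through up to twice the limit around a window boundary, so a sliding window or token bucket may fit better when the partner API is strict.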

      Cassandra

      3.6K
      3.5K
      507
      A partitioned row store. Rows are organized into tables with a required primary key.
      PROS OF CASSANDRA
      • Distributed (119)
      • High performance (98)
      • High availability (81)
      • Easy scalability (74)
      • Replication (53)
      • Reliable (26)
      • Multi datacenter deployments (26)
      • Schema optional (10)
      • OLTP (9)
      • Open source (8)
      • Workload separation (via MDC) (2)
      • Fast (1)
      CONS OF CASSANDRA
      • Reliability of replication (3)
      • Size (1)
      • Updates (1)

      related Cassandra posts

      Thierry Schellenbach
      Shared insights on Redis, Cassandra, RocksDB

      Stream 1.0 leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

      Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

      RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSDs. Nowadays RocksDB is a project of its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology, it’s much simpler than Cassandra.

      This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

      #InMemoryDatabases #DataStores #Databases

      Trying to establish a data lake (or maybe puddle) for my org's Data Sharing project. The idea is that outside partners would send cuts of their PHI data, regardless of format/variables/systems, to our Data Team, who would then harmonize the data, create data marts, and eventually use it for something. End-to-end, I'm envisioning:

      1. Ingestion -> Secure, role-based, self-service portal for users to upload data (1a. bonus points if it can perform basic validations/masking)
      2. Storage -> Amazon S3 seems like the cheapest. We probably won't need very much, even at full capacity. Our current storage is a secure Box folder that has ~4GB with several batches of test data, code, presentations, and planning docs.
      3. Data Catalog -> AWS Glue? Azure Data Factory? Snowplow? Is the main difference basically based on the vendor? We will also have Data Dictionaries/Codebooks from submitters. Where would they fit in?
      4. Partitions -> I've seen Cassandra and YARN mentioned, but have no experience with either
      5. Processing -> We want to use SAS if at all possible. What will work with SAS code?
      6. Pipeline/Automation -> The check-in and verification processes that have been outlined are rather involved. Some sort of automated messaging or approval workflow would be nice
      7. I have very little guidance on what a "Data Mart" should look like, so I'm going with the idea that it would be another "experimental" partition. Unless there's an actual mart-building paradigm I've missed?
      8. An end user might use the catalog to pull certain de-identified data sets from the marts. Again, role-based access and a self-service GUI would be preferable.

      I'm the only full-time tech person on this project, but I'm mostly an OOP, HTML, JavaScript, and some SQL programmer. Most of this is out of my repertoire. I've done a lot of research, but I can't be an effective evangelist without hands-on experience. Since we're starting a new year of our grant, they've finally decided to let me try some stuff out. Any pointers would be appreciated!

      Redis

      59.3K
      45.6K
      3.9K
      Open source (BSD licensed), in-memory data structure store
      PROS OF REDIS
      • Performance (886)
      • Super fast (542)
      • Ease of use (513)
      • In-memory cache (444)
      • Advanced key-value cache (324)
      • Open source (194)
      • Easy to deploy (182)
      • Stable (164)
      • Free (155)
      • Fast (121)
      • High-Performance (42)
      • High Availability (40)
      • Data Structures (35)
      • Very Scalable (32)
      • Replication (24)
      • Great community (22)
      • Pub/Sub (22)
      • "NoSQL" key-value data store (19)
      • Hashes (16)
      • Sets (13)
      • Sorted Sets (11)
      • NoSQL (10)
      • Lists (10)
      • Async replication (9)
      • BSD licensed (9)
      • Bitmaps (8)
      • Integrates super easy with Sidekiq for Rails background (8)
      • Keys with a limited time-to-live (7)
      • Open Source (7)
      • Lua scripting (6)
      • Strings (6)
      • Awesomeness for Free (5)
      • Hyperloglogs (5)
      • Transactions (4)
      • Outstanding performance (4)
      • Runs server side Lua (4)
      • LRU eviction of keys (4)
      • Feature Rich (4)
      • Written in ANSI C (4)
      • Networked (4)
      • Data structure server (3)
      • Performance & ease of use (3)
      • Don't save data if no subscribers are found (2)
      • Automatic failover (2)
      • Easy to use (2)
      • Temporarily kept on disk (2)
      • Scalable (2)
      • Existing Laravel Integration (2)
      • Channels concept (2)
      • Object [key/value] size each 500 MB (2)
      • Simple (2)
      CONS OF REDIS
      • Cannot query objects directly (15)
      • No secondary indexes for non-numeric data types (3)
      • No WAL (1)

      related Redis posts

      Russel Werner
      Lead Engineer at StackShare · 32 upvotes · 2.8M views

      StackShare Feed is built entirely with React, Glamorous, and Apollo. One of our objectives with the public launch of the Feed was to enable a Server-side rendered (SSR) experience for our organic search traffic. When you visit the StackShare Feed, and you aren't logged in, you are delivered the Trending feed experience. We use an in-house Node.js rendering microservice to generate this HTML. This microservice needs to run and serve requests independent of our Rails web app. Up until recently, we had a mono-repo with our Rails and React code living happily together and all served from the same web process. In order to deploy our SSR app into a Heroku environment, we needed to split out our front-end application into a separate repo in GitHub. The driving factor in this decision was mostly due to limitations imposed by Heroku specifically with how processes can't communicate with each other. A new SSR app was created in Heroku and linked directly to the frontend repo so it stays in-sync with changes.

      Related to this, we needed a way to "deploy" our frontend changes to various server environments without building & releasing the entire Ruby application. We built a hybrid Amazon S3 and Amazon CloudFront solution to host our Webpack bundles. A new CircleCI script builds the bundles and uploads them to S3. The final step in our rollout is to update some keys in Redis so our Rails app knows which bundles to serve. The results of these efforts were significant. Our frontend team now moves independently of our backend team, our build & release process takes only a few minutes, we are now using an edge CDN to serve JS assets, and we have pre-rendered React pages!

      #StackDecisionsLaunch #SSR #Microservices #FrontEndRepoSplit

      Simon Reymann
      Senior Fullstack Developer at QUANTUSflow Software GmbH · 30 upvotes · 11M views

      Our whole DevOps stack consists of the following tools:

      • GitHub (incl. GitHub Pages/Markdown for Documentation, GettingStarted and HowTo's) as collaborative review and code management tool
      • Git as revision control system
      • SourceTree as Git GUI
      • Visual Studio Code as IDE
      • CircleCI for continuous integration (automating the development process)
      • Prettier / TSLint / ESLint as code linter
      • SonarQube as quality gate
      • Docker as container management (incl. Docker Compose for multi-container application management)
      • VirtualBox for operating system simulation tests
      • Kubernetes as cluster management for docker containers
      • Heroku for deploying in test environments
      • nginx as web server (preferably used as facade server in production environment)
      • SSLMate (using OpenSSL) for certificate management
      • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
      • PostgreSQL as preferred database system
      • Redis as preferred in-memory database/store (great for caching)

      The main reason we have chosen Kubernetes over Docker Swarm is related to the following artifacts:

      • Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
      • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
      • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
      • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
      • Scalability: All-in-one framework for distributed systems.
      • Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), huge community among container orchestration tools, it is an open source and modular tool that works with any OS.

      Couchbase

      478
      603
      110
      Document-Oriented NoSQL Database
      PROS OF COUCHBASE
      • High performance (18)
      • Flexible data model, easy scalability, extremely fast (18)
      • Mobile app support (9)
      • You can query it with ANSI-92 SQL (7)
      • All nodes can be read/write (6)
      • Equal nodes in cluster, allowing fast, flexible changes (5)
      • Both a key-value store and document (JSON) db (5)
      • Open source, community and enterprise editions (5)
      • Automatic configuration of sharding (4)
      • Local cache capability (4)
      • Easy setup (3)
      • Linearly scalable, useful to large number of tps (3)
      • Easy cluster administration (3)
      • Cross data center replication (3)
      • SDKs in popular programming languages (3)
      • Elasticsearch connector (3)
      • Web based management, query and monitoring panel (3)
      • Map reduce views (2)
      • DBaaS available (2)
      • NoSQL (2)
      • Buckets, Scopes, Collections & Documents (1)
      • FTS + SQL together (1)
      CONS OF COUCHBASE
      • Terrible query language (3)

      related Couchbase posts

      Gabriel Pa

      We implemented our first large scale EPR application from naologic.com using CouchDB.

      Very fast, replication works great, doesn't consume much RAM, queries are blazing fast but we found a problem: the queries were very hard to write, it took a long time to figure out the API, we had to go and write our own @nodejs library to make it work properly.

      It lost most of its support. Since then, we migrated to Couchbase and the learning curve was steep but all worth it. Memcached indexing out of the box, full text search works great.

      Ilias Mentzelos
      Software Engineer at Plum Fintech · 9 upvotes · 241.4K views
      Shared insights on MongoDB, Couchbase

      Hey, we want to build a referral campaign mechanism that will probably contain millions of records within the next few years. We want fast read access based on IDs or some indexes, and isolation is crucial as some listeners will try to update the same document at the same time. What's your suggestion between Couchbase and MongoDB? Thanks!


      PostgreSQL

      98K
      82K
      3.5K
      A powerful, open source object-relational database system
      PROS OF POSTGRESQL
      • Relational database (763)
      • High availability (510)
      • Enterprise class database (439)
      • SQL (383)
      • SQL + NoSQL (304)
      • Great community (173)
      • Easy to set up (147)
      • Heroku (131)
      • Secure by default (130)
      • PostGIS (113)
      • Supports Key-Value (50)
      • Great JSON support (48)
      • Cross platform (34)
      • Extensible (33)
      • Replication (28)
      • Triggers (26)
      • Multiversion concurrency control (23)
      • Rollback (23)
      • Open source (21)
      • Heroku Add-on (18)
      • Stable, Simple and Good Performance (17)
      • Powerful (15)
      • Let's be serious, what other SQL DB would you go for? (13)
      • Good documentation (11)
      • Scalable (9)
      • Free (8)
      • Reliable (8)
      • Intelligent optimizer (8)
      • Transactional DDL (7)
      • Modern (7)
      • One stop solution for all things SQL no matter the OS (6)
      • Relational database with MVCC (5)
      • Faster Development (5)
      • Full-Text Search (4)
      • Developer friendly (4)
      • Excellent source code (3)
      • Free version (3)
      • Great DB for Transactional system or Application (3)
      • Relational database (3)
      • Search (3)
      • Open-source (3)
      • Text (2)
      • Full-text (2)
      • Can handle up to petabytes worth of size (1)
      • Composability (1)
      • Multiple procedural languages supported (1)
      • Native (0)
      CONS OF POSTGRESQL
      • Table/index bloat (10)

      HBase

      462
      494
      15
      The Hadoop database, a distributed, scalable, big data store
      PROS OF HBASE
      • Performance (9)
      • OLTP (5)
      • Fast Point Queries (1)

        related HBase posts

        I am researching different querying solutions to handle ~1 trillion records of data (in the realm of a petabyte). The data is mostly textual. I have identified a few options: Milvus, HBase, RocksDB, and Elasticsearch. I was wondering if there is a good way to compare the performance of these options (or if anyone has already done something like this). I want to be able to compare the speed of ingesting and querying textual data from these tools. Does anyone have information on this or know where I can find some? Thanks in advance!

        Hi, I'm building a machine learning pipeline to store image bytes and image vectors in the backend.

        So, when users query for the random access image data (key), we return the image bytes and perform machine learning model operations on it.

        I'm currently considering going with Amazon S3 (in the future, maybe adding a Redis caching layer) as the backend system to store the information (S3 buckets with sharded prefixes).

        S3 latency is 100-200 ms (get/put), and it has a high throughput of 3,500 puts/sec and 5,500 gets/sec for a given bucket/prefix. If I need to reduce the latency in the future, I can add a Redis cache.

        Also, S3 costs are far lower than HBase (on Amazon EC2 instances with a 3x replication factor).

        I have not personally used HBase before, so can someone help me determine whether I'm making the right choice here? I'm not aware of HBase latencies, and I have learned that the MOB feature on HBase has to be turned on if we have to store image bytes in one of the column families, as the average image is 240 KB.
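
The sharded-prefix idea in this post comes down to simple arithmetic: divide the target request rate by the per-prefix figure and spread keys evenly over that many prefixes. The sketch below uses the 3,500 puts/sec and 5,500 gets/sec figures quoted in the post; the target rates, the function names, and the key layout are assumptions for illustration only.

```python
import hashlib
import math

PUTS_PER_PREFIX = 3500  # per-prefix figure quoted in the post
GETS_PER_PREFIX = 5500  # per-prefix figure quoted in the post


def prefixes_needed(target_puts, target_gets):
    """Smallest prefix count that covers both the write and the read rate."""
    return max(
        1,
        math.ceil(target_puts / PUTS_PER_PREFIX),
        math.ceil(target_gets / GETS_PER_PREFIX),
    )


def sharded_key(image_id, num_prefixes):
    """Derive a stable prefix from a hash of the id (hypothetical key layout)."""
    digest = hashlib.sha256(image_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_prefixes
    return f"{shard:04d}/{image_id}"


# Hypothetical workload: 10,000 puts/sec and 40,000 gets/sec.
count = prefixes_needed(10_000, 40_000)
example_key = sharded_key("img-000123", count)
```

Hashing the id rather than using a date or sequence number as the prefix keeps load spread evenly, at the cost of losing cheap range listings by prefix.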


        VoltDB

        18
        72
        18
        In-memory relational DBMS capable of supporting millions of database operations per second
        PROS OF VOLTDB
        • SQL + Java (5)
        • In-memory database (4)
        • A brainchild of Michael Stonebraker (4)
        • Very Fast (3)
        • NewSQL (2)