Hadoop vs MongoDB vs MySQL


Hadoop

1.9K
1.8K
54
MongoDB

51.3K
41K
4K
MySQL

67.9K
52.3K
3.7K
Decisions about Hadoop, MongoDB, and MySQL
Sergey Rodovinsky

We were looking at several alternative databases that would support the following architectural requirements:
- very quick prototyping for an unknown domain
- ability to support large amounts of data
- native ability to replicate and fail over
- full-stack approach for Node.js development

After careful consideration MongoDB came out on top, and 3 years later we are still very happy with that decision. Currently we keep almost 2TB of data in our cluster and are starting to think about sharding.

Gabriel Pa

After using Couchbase for over 4 years, we migrated to MongoDB and that was the best decision ever! I'm very disappointed with Couchbase's technical performance. Even though we received enterprise support and were a listed Couchbase Partner, the experience was horrible. With every contact, the sales team tried to get me onto a $7k+ license for access to features that all other open source NoSQL databases provide for free.

Here's why you should not use Couchbase:

Full-text search queries: The full-text search often returns a different number of results if you run the same query multiple times.

N1QL queries: Configuring the indexes correctly is next to impossible. It's poorly documented and nobody seems to know what to do; even the Couchbase support engineers have no clue what they are doing.

Community support: I posted several problems on the forum and never once received a useful answer.

Enterprise support: It's very expensive ($7k+). The team constantly tried to get me to buy even though the community edition wasn't working well.

Autonomous Operator: It's actually just a poorly configured Kubernetes role that I couldn't get to work no matter what I did. The support team was useless, with the same lack of documentation. If you do get it to work, you need at least 6 servers to meet their minimum requirements.

Couchbase Cloud: Typical for Couchbase, the user experience is awful and I could never get it to work.

Minimum requirements: The minimum requirements in production are 6 servers. On AWS the calculated monthly cost would be ~$600. We achieved better performance using a $16 MongoDB instance on MongoDB Atlas.

Writing queries is a nightmare: N1QL is similar to SQL, so it is supposed to be easier to write because of the familiarity, but that isn't entirely true. The "smart index" that Couchbase advertises is not smart at all. If you create an index with 5 fields and a query uses only 4 of them, Couchbase won't use that index, so you have to create a new one.

Couchbase UI: The UI that comes with every database deployment is full of bugs and barely functional, and the developer experience is poor. When I asked Couchbase about it, they basically said they don't care because real developers use SQL directly from code.

Consumes too much RAM: Couchbase is shipped with a smaller Memcached instance to handle the in-memory cache. Memcached ends up using 8 GB of RAM for 5000 documents! I'm not kidding! We had fewer than 5000 docs and fewer than 20 indexes on a Couchbase instance, and RAM consumption was always over 8 GB.

Memory allocations are useless: I asked the Couchbase team a question: if a bucket has 1 GB allocated, what happens when I have more than 1 GB stored? Does it overflow? Does it cache somewhere? Do I get an error? I always received the same answer: if you buy Couchbase Enterprise, then we can guide you.
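
The indexing complaint above (an index on 5 fields that is not picked up when a query touches only 4 of them) is easier to reason about with the leftmost-prefix rule that MongoDB and MySQL document for compound indexes. The Python sketch below models only that rule. It is not Couchbase's query planner or any real optimizer, and the function name and field names are hypothetical.

```python
def usable_prefix(index_fields, query_fields):
    """Return the leading index fields that an equality query can use.

    Models the leftmost-prefix rule only: the index is walked in its
    declared order and the walk stops at the first field the query
    does not constrain.
    """
    wanted = set(query_fields)
    prefix = []
    for field in index_fields:
        if field not in wanted:
            break
        prefix.append(field)
    return prefix


# Hypothetical 5-field compound index.
index = ["tenant", "status", "type", "created", "owner"]

# Query on the first four fields: the whole leading run is usable.
leading = usable_prefix(index, ["tenant", "status", "type", "created"])

# Query that skips the first index field: no usable prefix at all.
skipped = usable_prefix(index, ["status", "type", "created", "owner"])
```

Under this rule, whether a 4-field query can share a 5-field index depends on which field is left out and on the order in which the index was declared.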

Omran Jamal
CTO & Co-founder at Bonton Connect | 4 upvotes · 122.5K views

We actually use both Mongo and SQL databases in production. Mongo excels in both speed and developer friendliness when it comes to geospatial data and queries on that data, but we also like ACID compliance, so most of our other data (except on-site logs) is stored in a SQL database (MariaDB for now).
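
For readers unfamiliar with the geospatial side: MongoDB stores locations as GeoJSON and can query them with operators such as `$near` on a field that has a `2dsphere` index. The sketch below only builds the document and the filter as plain Python dictionaries. The field name `location`, the helper name, and the sample coordinates are assumptions for illustration, and sending the filter to a database through a driver is not shown.

```python
def near_filter(longitude, latitude, max_meters, field="location"):
    """Build a MongoDB-style $near filter around a GeoJSON point.

    GeoJSON lists coordinates as [longitude, latitude], and
    $maxDistance is given in meters for 2dsphere queries.
    """
    return {
        field: {
            "$near": {
                "$geometry": {
                    "type": "Point",
                    "coordinates": [longitude, latitude],
                },
                "$maxDistance": max_meters,
            }
        }
    }


# Hypothetical stored document with a GeoJSON point.
shop = {
    "name": "example shop",
    "location": {"type": "Point", "coordinates": [90.4125, 23.8103]},
}

# Filter for documents within 500 meters of the same point.
query = near_filter(90.4125, 23.8103, 500)
```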

Pros of Hadoop
  • Great ecosystem (38)
  • One stack to rule them all (11)
  • Great load balancer (4)
  • Java syntax (1)

Pros of MongoDB
  • Document-oriented storage (822)
  • NoSQL (585)
  • Ease of use (544)
  • Fast (462)
  • High performance (404)
  • Free (251)
  • Open source (212)
  • Flexible (177)
  • Replication & high availability (139)
  • Easy to maintain (107)
  • Querying (39)
  • Easy scalability (35)
  • Auto-sharding (34)
  • High availability (33)
  • Map/reduce (29)
  • Document database (26)
  • Full index support (24)
  • Easy setup (24)
  • Reliable (15)
  • Fast in-place updates (14)
  • Agile programming, flexible, fast (13)
  • No database migrations (11)
  • Easy integration with Node.js (7)
  • Enterprise (7)
  • Enterprise Support (5)
  • Great NoSQL DB (4)
  • Support for many languages through different drivers (3)
  • Aggregation Framework (3)
  • Driver support is good (3)
  • Easy to Scale (2)
  • Schemaless (2)
  • Fast (2)
  • Awesome (2)
  • Managed service (2)
  • Consistent (1)

Pros of MySQL
  • SQL (789)
  • Free (674)
  • Easy (557)
  • Widely used (527)
  • Open source (485)
  • High availability (180)
  • Cross-platform support (158)
  • Great community (103)
  • Secure (77)
  • Full-text indexing and searching (75)
  • Fast, open, available (25)
  • SSL support (14)
  • Robust (13)
  • Reliable (13)
  • Enterprise Version (8)
  • Easy to set up on all platforms (7)
  • Relational database (1)
  • Sequel Pro (best SQL GUI) (1)
  • Replica Support (1)
  • NoSQL access to JSON data type (1)
  • Easy, light, scalable (1)


Cons of Hadoop
  • (none listed)

Cons of MongoDB
  • Very slow for connected models that require joins (5)
  • Not ACID compliant (3)
  • Proprietary query language (1)

Cons of MySQL
  • Owned by a company with their own agenda (13)
  • Can't roll back schema changes (1)


    What is Hadoop?

    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
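
One of the "simple programming models" Hadoop is best known for is MapReduce. The sketch below runs the three phases (map, shuffle, reduce) of a word count inside a single Python process, purely to show the shape of the model. It does not use any Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["big data", "Big cluster"])
```

In a real cluster the map and reduce calls would run on different machines next to the data they process; here everything runs locally so the data flow is easy to follow.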

    What is MongoDB?

    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
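
To make "documents that can vary in structure" concrete, the sketch below keeps two differently shaped documents in the same in-memory list and filters them by field. It is a stand-in written with plain Python dictionaries, not the MongoDB driver API; the collection contents and the `find` helper are hypothetical.

```python
def find(collection, criteria):
    """Return documents whose fields equal every key/value in criteria.

    Documents that lack a requested field simply do not match, so
    documents with different shapes can live in the same collection.
    """
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Two documents in one collection, each with its own set of fields.
products = [
    {"_id": 1, "name": "laptop", "specs": {"ram_gb": 16}, "tags": ["work"]},
    {"_id": 2, "name": "ebook", "download_url": "https://example.com/book"},
]

laptops = find(products, {"name": "laptop"})
with_url = [doc for doc in products if "download_url" in doc]
```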

    What is MySQL?

    The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
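
As a small illustration of the relational, SQL-based model, the sketch below creates two related tables and joins them. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server; MySQL's dialect and data types differ in details, and the table and column names are hypothetical.

```python
import sqlite3


def order_totals():
    """Create two related tables in memory and total orders per user."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(
            """
            CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
            CREATE TABLE orders (
                id INTEGER PRIMARY KEY,
                user_id INTEGER NOT NULL REFERENCES users(id),
                total REAL NOT NULL
            );
            INSERT INTO users VALUES (1, 'ada'), (2, 'lin');
            INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0);
            """
        )
        # The join links each order to its user through the foreign key.
        rows = conn.execute(
            """
            SELECT u.name, SUM(o.total)
            FROM users AS u
            JOIN orders AS o ON o.user_id = u.id
            GROUP BY u.name
            ORDER BY u.name
            """
        ).fetchall()
        return rows
    finally:
        conn.close()


totals = order_totals()
```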


    What are some alternatives to Hadoop, MongoDB, and MySQL?
    Cassandra
    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
    Elasticsearch
    Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
    Splunk
    Splunk provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data.
    Snowflake
    Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS), with no infrastructure to manage and no knobs to turn.
    Apache Spark
    Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
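
Of the alternatives above, Cassandra's automatic repartitioning is the idea that benefits most from a sketch. The Python below implements a simplified token ring (consistent hashing): each key belongs to the first node whose token is at or after the key's token, wrapping around at the end. This is an illustration under stated assumptions, not Cassandra's implementation: Cassandra's default partitioner uses Murmur3 and many virtual nodes per machine, while this sketch uses MD5 from the standard library and one token per node. The node and key names are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a 64-bit integer position on the ring (MD5-based)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """A token ring with one token per node."""

    def __init__(self, nodes):
        # Sorted (token, node) pairs define each node's position.
        self._tokens = sorted((token(node), node) for node in nodes)

    def owner(self, key):
        """Return the first node at or after the key's token, wrapping around."""
        index = bisect.bisect_left(self._tokens, (token(key), ""))
        return self._tokens[index % len(self._tokens)][1]


keys = [f"user:{i}" for i in range(200)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

# Keys whose owner changed after node-d joined the ring.
moved = [k for k in keys if before.owner(k) != after.owner(k)]
```

The point of the ring layout is that adding a node only reassigns the keys that fall into the new node's token range; every other key stays where it was.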