
Hadoop vs RabbitMQ: What are the differences?

Hadoop: Open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

RabbitMQ: A messaging broker - an intermediary for messaging. RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received.

Hadoop can be classified as a tool in the "Databases" category, while RabbitMQ is grouped under "Message Queue".

"Great ecosystem" is the top reason why over 34 developers like Hadoop, while over 202 developers mention "It's fast and it works with good metrics/monitoring" as the leading cause for choosing RabbitMQ.

Hadoop and RabbitMQ are both open source tools. Judging by GitHub activity, Hadoop (9.18K stars, 5.74K forks) appears to have more adoption than RabbitMQ (5.88K stars, 1.73K forks).

Reddit, MIT, and SendGrid are some of the popular companies that use RabbitMQ, whereas Hadoop is used by Slack, Shopify, and SendGrid. RabbitMQ has broader approval, being mentioned in 921 company stacks and 532 developer stacks, compared to Hadoop, which is listed in 237 company stacks and 116 developer stacks.

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
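
As a rough illustration of the "simple programming models" mentioned above, here is a minimal, single-process sketch of the MapReduce pattern in plain Python. It does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. On a real cluster the framework would run the map and reduce steps on many machines and handle the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each map call sees only one line and each reduce call sees only one key, the work can in principle be split across machines without the functions knowing about each other.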

What is RabbitMQ?

RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received.
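
To make "a safe place to live until received" a little more concrete, the following is a toy in-memory sketch of a broker that holds each message until a consumer acknowledges it. It is not the RabbitMQ API or its implementation; `ToyBroker` and its methods are hypothetical names used only for illustration, and a real broker would add persistence, networking, and routing on top.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class ToyBroker:
    """In-memory stand-in for a message broker (illustrative only)."""

    def __init__(self) -> None:
        self._ready: Deque[Tuple[int, str]] = deque()  # waiting for a consumer
        self._unacked: Dict[int, str] = {}             # delivered, not yet acknowledged
        self._next_tag = 1

    def publish(self, body: str) -> None:
        """Store a message until some consumer fetches it."""
        self._ready.append((self._next_tag, body))
        self._next_tag += 1

    def get(self) -> Optional[Tuple[int, str]]:
        """Deliver the oldest ready message and remember it as unacknowledged."""
        if not self._ready:
            return None
        tag, body = self._ready.popleft()
        self._unacked[tag] = body
        return tag, body

    def ack(self, tag: int) -> None:
        """Confirm processing; only now is the message forgotten."""
        del self._unacked[tag]

    def requeue_unacked(self) -> None:
        """Put unacknowledged messages back at the front, e.g. after a consumer crash."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft((tag, self._unacked.pop(tag)))
```

The point of the acknowledgement step is that a message delivered to a consumer that dies before finishing is not lost; it can be handed out again.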

    What are some alternatives to Hadoop and RabbitMQ?
    Cassandra
    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
    MongoDB
    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
    Elasticsearch
    Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
    Splunk
    Splunk Inc. provides the leading platform for Operational Intelligence. Customers use Splunk to search, monitor, analyze and visualize machine data.
    HBase
    Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop.
    Decisions about Hadoop and RabbitMQ
    James Cunningham, Operations Engineer at Sentry · 18 upvotes · 112.8K views
    Tools: RabbitMQ, Celery
    #MessageQueue

    As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.

    Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.

    StackShare Editors
    Tools: Apache Thrift, Kotlin, Presto, HHVM (HipHop Virtual Machine), gRPC, Kubernetes, Apache Spark, Airflow, Terraform, Hadoop, Swift, Hack, Memcached, Consul, Chef, Prometheus

    Since the beginning, Cal Henderson has been the CTO of Slack. Earlier this year, he commented on a Quora question summarizing their current stack.

    Apps
    • Web: a mix of JavaScript/ES6 and React.
    • Desktop: Electron, to ship it as a desktop application.
    • Android: a mix of Java and Kotlin.
    • iOS: written in a mix of Objective-C and Swift.
    Backend
    • The core application and the API written in PHP/Hack that runs on HHVM.
    • The data is stored in MySQL using Vitess.
    • Caching is done using Memcached and MCRouter.
    • The search service takes help from SolrCloud, with various Java services.
    • The messaging system uses WebSockets with many services in Java and Go.
    • Load balancing is done using HAProxy with Consul for configuration.
    • Most services talk to each other over gRPC, with some Thrift and JSON-over-HTTP.
    • Voice and video calling service was built in Elixir.
    Data warehouse
    • Built using open source tools including Presto, Spark, Airflow, Hadoop and Kafka.
    Tools: RabbitMQ, Kafka

    The question of which message queue to use mentioned "availability, distributed, scalability, and monitoring". I don't think that this excludes many options. It does not sound like you would take advantage of Kafka's strengths (replayability, based on an event sourcing architecture). You could pick one of the AMQP options.

    I would recommend the RabbitMQ message broker, which not only implements the AMQP 0.9.1 standard (it can support 1.x or other protocols as well) but also has several very useful extensions built in. It ticks the boxes you mentioned, and on top of that you get a very flexible system that lets you build the architecture and pick the options and trade-offs that suit your case best.

    For more information about RabbitMQ, please have a look at the linked markdown I assembled. The second half explains many configuration options. It also contains links to managed hosting and to libraries (though it is missing Python's - which should be Puka, I assume).
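
The "replayability" contrast drawn in this answer can be sketched with two toy data structures. This is only an illustration of the idea, not Kafka's or RabbitMQ's actual behavior or API; `ToyQueue` and `ToyLog` are hypothetical names, and real systems add acknowledgements, retention limits, partitions, and much more.

```python
from collections import deque
from typing import Deque, List


class ToyQueue:
    """Queue semantics: a message is gone once it has been consumed."""

    def __init__(self) -> None:
        self._items: Deque[str] = deque()

    def publish(self, msg: str) -> None:
        self._items.append(msg)

    def consume_all(self) -> List[str]:
        """Hand out every pending message and drop it from the queue."""
        out = list(self._items)
        self._items.clear()
        return out


class ToyLog:
    """Log semantics: records are kept, and each reader chooses its own offset."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, msg: str) -> int:
        """Store a record and return its offset."""
        self._records.append(msg)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[str]:
        """Return all records at or after the offset; reading removes nothing."""
        return self._records[offset:]
```

With the log, a consumer can re-read from offset 0 to rebuild its state; with the queue, a second read finds nothing left. If that replay ability is not needed, the simpler queue model may be enough, which is the argument made above.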

    Frédéric MARAND, Core Developer at OSInet · 2 upvotes · 92.2K views
    Tools: RabbitMQ, Beanstalkd, Kafka

    I used Kafka originally because it was mandated as part of the top-level IT requirements at a Fortune 500 client. What I found was that it was orders of magnitude more complex... and more powerful than my daily Beanstalkd, and far more flexible, resilient, and manageable than RabbitMQ.

    So for any case where utmost flexibility and resilience are part of the deal, I would use Kafka again. But due to the complexities involved, for any time where this level of scalability is not required, I would probably just use Beanstalkd for its simplicity.

    I tend to find RabbitMQ to be in an uncomfortable middle place between these two extremes.

    Michael Mota, CEO & Founder at AlterEstate · 4 upvotes · 12K views
    Tools: Django, RabbitMQ, Celery

    Automations are what make a CRM powerful. With Celery and RabbitMQ we've been able to build powerful automations that truly work for our clients, such as automatic daily reports, reminders for their activities, important notifications regarding their client activities and actions on the website, and more.

    We use Celery for basically everything that needs to be scheduled for the future, and using RabbitMQ as our queue broker is amazing since it fully integrates with Django and Celery, storing the results of completed tasks in our database so we can see immediately if anything fails.
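
The pattern described here (schedule work for later, then record each task's outcome so failures are visible) can be sketched with the standard library alone. This is not Celery's API or how Celery stores results; `ToyScheduler`, `schedule`, `run_due`, and the `results` dictionary are hypothetical stand-ins for the scheduler and the result database.

```python
import heapq
import itertools
from typing import Callable, Dict, List, Tuple


class ToyScheduler:
    """Runs tasks once their due time has passed and records the outcome."""

    def __init__(self) -> None:
        # Heap entries: (due time, insertion counter, task name, callable).
        # The counter breaks ties so callables are never compared.
        self._heap: List[Tuple[float, int, str, Callable[[], object]]] = []
        self._counter = itertools.count()
        self.results: Dict[str, str] = {}  # stand-in for a result store

    def schedule(self, name: str, run_at: float, task: Callable[[], object]) -> None:
        """Register a task to run at or after the given time."""
        heapq.heappush(self._heap, (run_at, next(self._counter), name, task))

    def run_due(self, now: float) -> None:
        """Execute every task whose due time is not later than `now`."""
        while self._heap and self._heap[0][0] <= now:
            _, _, name, task = heapq.heappop(self._heap)
            try:
                task()
                self.results[name] = "SUCCESS"
            except Exception as exc:  # record the failure instead of crashing
                self.results[name] = f"FAILURE: {exc}"
```

A caller might schedule a daily report and a reminder, call `run_due` periodically with the current time, and inspect `results` to spot any task that failed.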

    Reviews of Hadoop and RabbitMQ
    Review of RabbitMQ

    I developed one of the largest queue-based medical results delivery systems in the world, with 18,000+ queues and still growing over a decade later, all using MQSeries, later called WebSphere MQ. When I left that company I started using RabbitMQ after doing some research on free offerings. It works brilliantly and is incredibly flexible, from small-scale single-instance use to large-scale multi-server, multi-site architectures.

    If you can think in queues then RabbitMQ should be a viable solution for integrating disparate systems.

    How developers use Hadoop and RabbitMQ
    Pinterest uses Hadoop

    The MapReduce workflow starts to process experiment data nightly when data of the previous day is copied over from Kafka. At this time, all the raw log requests are transformed into meaningful experiment results and in-depth analysis. To populate experiment data for the dashboard, we have around 50 jobs running to do all the calculations and transforms of data.

    Yelp uses Hadoop

    In 2009 we open-sourced mrjob, which allows any engineer to write a MapReduce job without contending for resources. We’re only limited by the number of machines in an Amazon data center (which is an issue we’ve rarely encountered).

    Cloudify uses RabbitMQ

    The poster child for scalable messaging systems, RabbitMQ has been used in countless large scale systems as the messaging backbone of any large cluster, and has proven itself time and again in many production settings.

    Chris Saylor uses RabbitMQ

    Rabbit acts as our coordinator for all actions that happen during game time. All worker containers connect to rabbit in order to receive game events and emit their own events when applicable.

    Clarabridge Engage uses RabbitMQ

    Used as the central message broker: off-loading tasks to be executed asynchronously, as a communication tool between different microservices, as a tool to handle peaks in incoming data, etc.

    Analytical Informatics uses RabbitMQ

    RabbitMQ is the enterprise message bus for our platform, providing infrastructure for managing our ETL queues, real-time event notifications for applications, and audit logging.

    Packet uses RabbitMQ

    RabbitMQ is an all purpose queuing service for our stack. We use it for user facing jobs as well as keeping track of behind the scenes jobs.

    Pinterest uses Hadoop

    The massive volume of discovery data that powers Pinterest and enables people to save Pins, create boards and follow other users, is generated through daily Hadoop jobs...

    Robert Brown uses Hadoop

    Importing/Exporting data, interpreting results. Possible integration with SAS

    Rohith Nandakumar uses Hadoop

    TBD. Good to have I think. Analytics on loads of data, recommendations?
