What is ZeroMQ and what are its top alternatives?
Top Alternatives to ZeroMQ
RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received. ...
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. ...
MQTT was designed as an extremely lightweight publish/subscribe messaging transport. It is useful for connections with remote locations where a small code footprint is required and/or network bandwidth is at a premium. ...
Redis is an open source, BSD licensed, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets. ...
Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License. ...
nanomsg is a socket library that provides several common communication patterns. It aims to make the networking layer fast, scalable, and easy to use. Implemented in C, it works on a wide range of operating systems with no further dependencies. ...
gRPC is a modern open source high performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking... ...
Unlike traditional enterprise messaging systems, NATS has an always-on dial tone that does whatever it takes to remain available. This forms a great base for building modern, reliable, and scalable cloud and distributed systems. ...
ZeroMQ alternatives & related posts
related RabbitMQ posts
As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.
Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.
We've been using RabbitMQ as Zulip's queuing system since we needed a queuing system. What I like about it is that it scales really well and has good libraries for a wide range of platforms, including our own Python. So aside from getting it running, we've had to put basically 0 effort into making it scale for our needs.
However, there are several things that could be better about it:
* Its error messages are absolutely terrible; if ever one of our users ends up getting an error with RabbitMQ (even for simple things like a misconfigured hostname), they always end up needing to get help from the Zulip team, because the error logs are just inscrutable. As an open source project, we've handled this issue by really carefully scripting the installation to be a failure-proof configuration (in this case, setting the RabbitMQ hostname to 127.0.0.1, so that no user-controlled configuration can break it). But it was a real pain to get there and the process of determining we needed to do that caused a significant amount of pain to folks installing Zulip.
* The pika library for Python takes a lot of time to start up a RabbitMQ connection; this means that Zulip server restarts are more disruptive than would be ideal.
* It's annoying that you need to run the rabbitmqctl management commands as root.
But overall, I like that it has clean, clear semantics and high scalability, and haven't been tempted to do the work to migrate to something like Redis (which has its own downsides).
related Kafka posts
The algorithms and data infrastructure at Stitch Fix are housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.
Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).
At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize the models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
For more info:
- Our Algorithms Tour: https://algorithms-tour.stitchfix.com/
- Our blog: https://multithreaded.stitchfix.com/blog/
- Careers: https://multithreaded.stitchfix.com/careers/
#DataScience #DataStack #Data
As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.
We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.
related MQTT posts
related Redis posts
We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.
As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).
When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.
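As an illustration of the rate limiting of partner API requests mentioned above, here is a minimal sketch of a fixed-window limiter. It is not the poster's actual implementation: the class name `FixedWindowRateLimiter`, its parameters, and the `"github"` key are hypothetical, and an in-memory dict stands in for Redis so the example runs without a server. With a real Redis instance, the per-window counter would typically be kept with `INCR` and expired with `EXPIRE`.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key.

    Assumption: a plain dict plays the role of Redis. In Redis the counter
    would usually be incremented with INCR and given a TTL with EXPIRE so
    that stale windows disappear on their own.
    """

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # key -> (window index, number of calls seen in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        """Return True if a call for `key` fits in the current window."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters.get(key, (window, 0))
        if seen_window != window:
            # A new window has started, so the old count no longer applies.
            count = 0
        if count >= self.limit:
            return False
        self._counters[key] = (window, count + 1)
        return True


if __name__ == "__main__":
    limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
    for attempt in range(3):
        print(attempt, limiter.allow("github"))
```

A fixed window is the simplest scheme and can let through up to twice the limit around a window boundary; a sliding window or token bucket smooths that out at the cost of more state.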
I'm working as one of the engineering leads at RunaHR. As our platform is a SaaS, we thought it'd be good to have an API (we chose Ruby and Rails for this) and a SPA (built with React and Redux) connected. We started the SPA with Create React App since it's pretty easy to start.
We use Jest as the testing framework and react-testing-library to test React components. In Rails we make tests using RSpec.
Our main database is PostgreSQL, but we also use MongoDB to store some types of data. We started to use Redis for caching and other time-sensitive operations.
We have a couple of extra projects: one is an Employee app built with React Native and the other is an internal back office dashboard built with Next.js for the client and Python on the backend.
related ActiveMQ posts
I want to choose a message queue with the following features: highly available, distributed, scalable, and with monitoring. I have RabbitMQ, ActiveMQ, Kafka and Apache RocketMQ in mind, but I am confused about which one to choose.
I use ActiveMQ because RabbitMQ has stopped supporting AMQP 1.0 and above, and the earlier version of AMQP doesn't provide the functionality to support OAuth.
If OAuth is not required and we can go with AMQP 0.9, then I still recommend RabbitMQ.
related nanomsg posts
related gRPC posts
By mid-2015, Uber’s rider growth coupled with its cadence of releasing new services, like Eats and Freight, was pressuring the infrastructure. To allow the decoupling of consumption from production, and to add an abstraction layer between users, developers, and infrastructure, Uber built Catalyst, a serverless internal service mesh.
Uber decided to build its own serverless solution, rather than using something like AWS Lambda, citing speed for its global production environments as well as introspectability.