What is Google Cloud Pub/Sub and what are its top alternatives?
Top Alternatives to Google Cloud Pub/Sub
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. ...
RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received. ...
Firebase is a cloud service designed to power real-time, collaborative applications. Simply add the Firebase library to your application to gain access to a shared data structure; any changes you make to that data are automatically synchronized with the Firebase cloud and with other clients within milliseconds. ...
It enables real-time bidirectional event-based communication. It works on every platform, browser or device, focusing equally on reliability and speed. ...
Pusher is the category leader in delightful APIs for app developers building communication and collaboration features. ...
SignalR allows bi-directional communication between server and client. Servers can now push content to connected clients instantly as it becomes available. SignalR supports Web Sockets, and falls back to other compatible techniques for older browsers. SignalR includes APIs for connection management (for instance, connect and disconnect events), grouping connections, and authorization. ...
PubNub makes it easy for you to add real-time capabilities to your apps, without worrying about the infrastructure. Build apps that allow your users to engage in real-time across mobile, browser, desktop and server. ...
Unlike traditional enterprise messaging systems, NATS has an always-on dial tone that does whatever it takes to remain available. This forms a great base for building modern, reliable, and scalable cloud and distributed systems. ...
Google Cloud Pub/Sub alternatives & related posts
related Kafka posts
The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka, and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (s3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for adhoc queries and dashboards.
Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).
At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated systems. That requires serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientist a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
For more info:
- Our Algorithms Tour: https://algorithms-tour.stitchfix.com/
- Our blog: https://multithreaded.stitchfix.com/blog/
- Careers: https://multithreaded.stitchfix.com/careers/
#DataScience #DataStack #Data
As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.
We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.
related RabbitMQ posts
As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.
Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.
We've been using RabbitMQ as Zulip's queuing system since we needed a queuing system. What I like about it is that it scales really well and has good libraries for a wide range of platforms, including our own Python. So aside from getting it running, we've had to put basically 0 effort into making it scale for our needs.
However, there's several things that could be better about it:
* It's error messages are absolutely terrible; if ever one of our users ends up getting an error with RabbitMQ (even for simple things like a misconfigured hostname), they always end up needing to get help from the Zulip team, because the errors logs are just inscrutable. As an open source project, we've handled this issue by really carefully scripting the installation to be a failure-proof configuration (in this case, setting the RabbitMQ hostname to
127.0.0.1, so that no user-controlled configuration can break it). But it was a real pain to get there and the process of determining we needed to do that caused a significant amount of pain to folks installing Zulip.
pika library for Python takes a lot of time to startup a RabbitMQ connection; this means that Zulip server restarts are more disruptive than would be ideal.
* It's annoying that you need to run the
rabbitmqctl management commands as root.
But overall, I like that it has clean, clear semanstics and high scalability, and haven't been tempted to do the work to migrate to something like Redis (which has its own downsides).
related Firebase posts
This is my stack in Application & Data
My Utilities Tools
Google Analytics Postman Elasticsearch
My Devops Tools
Git GitHub GitLab npm Visual Studio Code Kibana Sentry BrowserStack
My Business Tools
Fontumi focuses on the development of telecommunications solutions. We have opted for technologies that allow agile development and great scalability.
Firebase and Node.js + FeathersJS are technologies that we have used on the server side. Vue.js is our main framework for clients.
Our latest products launched have been focused on the integration of AI systems for enriched conversations. Google Compute Engine , along with Dialogflow and Cloud Firestore have been important tools for this work.
Git + GitHub + Visual Studio Code is a killer stack.
related Socket.IO posts
I use Socket.IO because the application has 2 frontend clients, which need to communicate in real-time. The backend-server handles the communication between these two clients via websockets. Socket.io is very easy to set up in Node.js and ExpressJS.
In the research project, the 1st client shows panoramic videos in a so called cave system (it is the VR setup of our research lab, which consists of three big screens, which are specially arranged, so the user experience the videos more immersive), the 2nd client controls the videos/locations of the 1st client.
related Pusher posts
Which messaging service (Pusher vs. PubNub vs. Google Cloud Pub/Sub) to use for IoT?
Recently we finished long research on chat tool for our students and mentors. In the end we picked Mattermost Team Edition as the cheapest and most feature complete option. We did consider building everything from scratch and use something like Pusher or Twilio on a backend, but then we would have to implement all the desktop and mobile clients and all the features oursevles. Mattermost gave us flexible API, lots of built in or easy to install integrations and future-proof feature set. We are still integrating it with our main platform but so far the team, existing mentors and students are very happy.