We've been using RabbitMQ as Zulip's queuing system since we needed a queuing system. What I like about it is that it scales really well and has good libraries for a wide range of platforms, including our own Python. So aside from getting it running, we've had to put basically 0 effort into making it scale for our needs.
However, there are several things that could be better about it:
* Its error messages are absolutely terrible; if one of our users ever hits an error with RabbitMQ (even for simple things like a misconfigured hostname), they always end up needing help from the Zulip team, because the error logs are just inscrutable. As an open source project, we've handled this issue by really carefully scripting the installation to be a failure-proof configuration (in this case, setting the RabbitMQ hostname to 127.0.0.1, so that no user-controlled configuration can break it). But it was a real pain to get there, and the process of determining we needed to do that caused a significant amount of pain to folks installing Zulip.
* The pika library for Python takes a lot of time to start up a RabbitMQ connection; this means that Zulip server restarts are more disruptive than would be ideal.
* It's annoying that you need to run the rabbitmqctl management commands as root.
But overall, I like that it has clean, clear semantics and high scalability, and I haven't been tempted to do the work to migrate to something like Redis (which has its own downsides).
Elasticsearch's built-in visualization tool, Kibana, is robust and the appropriate tool in many cases. However, it is geared specifically towards log exploration and time-series data, and we felt that its steep learning curve would impede adoption among data scientists accustomed to writing SQL. The solution was to create something that would replicate some of Kibana's essential functionality while hiding Elasticsearch's complexity behind SQL-esque labels and terminology ("table" instead of "index", "group by" instead of "sub-aggregation") in the UI.
Elasticsearch's API is really well-suited for aggregating time-series data, indexing arbitrary data without defining a schema, and creating dashboards. For the purpose of a data exploration backend, Elasticsearch fits the bill really well. Users can send an HTTP request with aggregations and sub-aggregations to an index with millions of documents and get a response within seconds, thus allowing them to rapidly iterate through their data.
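To make the "group by" instead of "sub-aggregation" idea a bit more concrete, here is a minimal sketch of how a list of SQL-esque group-by columns could be turned into a nested terms aggregation body. This is not the tool's actual code: the function name `build_search_body`, the `by_`/`avg_` bucket labels, and the field names are all made up for illustration. The resulting dictionary would be sent as the JSON body of a request to the index's `_search` endpoint.

```python
import json


def build_search_body(group_by, metric_field=None, bucket_size=10):
    """Turn SQL-esque "group by" columns into nested terms aggregations.

    Hypothetical helper: names and labels are illustrative assumptions.
    """
    if not group_by:
        raise ValueError("group_by needs at least one field")

    # Start from the innermost level: an optional avg metric on one field.
    inner = None
    if metric_field is not None:
        inner = {"avg_" + metric_field: {"avg": {"field": metric_field}}}

    # Wrap each group-by column around the previous level, last column first,
    # so the first column ends up as the outermost aggregation.
    for field in reversed(group_by):
        level = {"terms": {"field": field, "size": bucket_size}}
        if inner is not None:
            level["aggs"] = inner
        inner = {"by_" + field: level}

    # "size": 0 asks for aggregation buckets only, not the matching documents.
    return {"size": 0, "aggs": inner}


if __name__ == "__main__":
    # Roughly: SELECT country, device, AVG(latency_ms) ... GROUP BY country, device
    body = build_search_body(["country", "device"], metric_field="latency_ms")
    print(json.dumps(body, indent=2))
```

The point of a layer like this is that the user only ever sees "group by country, device", while the nesting of sub-aggregations stays hidden in the backend.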
Nexmo vs Twilio?
Back in the early days at SmartZip Analytics, that evaluation had - for whatever reason - been made by Product Management. Some developers might have been consulted, but we hadn't made the final call and some key engineering aspects of it were omitted.
When revamping the platform, I made sure to flip the decision process to how it should be: Business provides input, but Engineering leads the way and has the final say on all implementation matters. My engineers and I decided to re-evaluate the criteria and vendor selection. Not only did we need SMS support, but we were now thinking about #VoiceAndSms support as the use cases evolved.
Also, from an engineering standpoint, SDKs mattered. Nexmo didn't have any; Twilio did. No one would ever want to rebuild from scratch the integration layers that vendors should naturally provide to their customers.
Twilio won on all fronts, including costs and implementation timelines. No one even noticed the vendor switch.
Many years later, Twilio demonstrated its position as a leader by holding conferences in the Bay Area and announcing features like Twilio Functions. It even acquired Authy, which we also used for 2FA. Twilio's growth has been amazing, and its recent acquisition of SendGrid continues to show it.
I think the next step could be to use Koa, but I am not sure.
In my opinion, PostgreSQL totally beats MongoDB: it not only works with structured data, SQL, and strict types, but also has excellent support for unstructured data as a separate data type (you can store arbitrary JSON, and it can also be queryable, depending on which of the formats you choose). Both writes and reads are much faster than in Mongo. So you can get the best of document NoSQL and SQL in a single database.
The formal downside of PostgreSQL is clustering scalability. There's no simple way to build a distributed cluster. However, two points:
1) It will take much longer before you actually need to scale, thanks to PG's efficiency. And if you follow the database-per-service pattern, you may never need to, because handling a few billion records on a single machine is an option for PG.
2) When you do need to, you can do it the way you need, including as part of the app's logic (e.g. sharding by key, or a PG-based clustering solution with a strict model). Scalability will be very transparent, much more obvious than Mongo's "cluster just works (but then fails)" replication.
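For what "sharding by key as part of the app's logic" could look like, here is a minimal sketch, assuming a fixed list of PostgreSQL instances and a stable hash of the record key. The connection strings, `SHARD_DSNS`, and both function names are hypothetical; a real setup would also need a plan for changing the shard count, which this simple modulo scheme does not handle.

```python
import hashlib

# Hypothetical connection strings, one per PostgreSQL instance.
SHARD_DSNS = [
    "postgresql://db0.example.internal/app",
    "postgresql://db1.example.internal/app",
    "postgresql://db2.example.internal/app",
]


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a record key to a shard index in [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Use a cryptographic digest rather than the built-in hash(), which is
    # randomized per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def dsn_for_key(key: str) -> str:
    """Pick the connection string of the shard that owns this key."""
    return SHARD_DSNS[shard_for_key(key, len(SHARD_DSNS))]


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "->", dsn_for_key(user_id))
```

Because the routing is a plain function in the app, it is easy to see exactly which machine holds which record, which is the transparency argued for above.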