What is Amazon SQS?
What is Kafka?
In the beginning we thought we wanted to start using something like RabbitMQ, Kafka, or ActiveMQ. Back then we only had a few developers and no ops people. That has changed now, but we didn't look forward to setting up a queuing cluster and making sure it all works.
Instead, we looked at the services Amazon offers to see whether we could build our own messaging system on top of them. We wrote some clients in Ruby that handle the entire orchestration for us, and we run all our messaging on SNS and SQS. With Amazon Simple Notification Service (SNS) you create topics, and you subscribe SQS queues to those topics. That's all you need for a messaging system. You don't have to worry about scalability at all. That's what really appealed to us.
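A minimal sketch of the topic-and-queue wiring described above, written in Python rather than the Ruby clients mentioned (which are not shown here). It assumes boto3-style SNS and SQS clients are passed in; the function name wire_topic_to_queue and the topic and queue names are hypothetical.

```python
import json


def wire_topic_to_queue(sns, sqs, topic_name, queue_name):
    """Create an SNS topic and an SQS queue, then subscribe the queue to the topic.

    `sns` and `sqs` are assumed to be boto3 clients, for example
    boto3.client("sns") and boto3.client("sqs").
    """
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    queue_url = sqs.create_queue(QueueName=queue_name)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # The queue's access policy has to allow this topic to send messages to it,
    # otherwise SNS deliveries are rejected.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }
    sqs.set_queue_attributes(
        QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)}
    )

    # Every message published to the topic is now copied into the queue.
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
    return topic_arn, queue_url
```

Calling this once per consumer, each with its own queue name, gives every consumer its own copy of each message published to the topic.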
Front-end messages are logged to Kafka by our API and application servers. We have batch processing (on the middle-left) and real-time processing (on the middle-right) pipelines to process the experiment data. For batch processing, after the daily raw logs get to S3, we start our nightly experiment workflow to figure out experiment user groups and experiment metrics. We use our in-house workflow management system, Pinball, to manage the dependencies of all these MapReduce jobs.
This isn't exactly low-latency (tens to hundreds of milliseconds), but it has good throughput and a simple API. It is reliable, and no configuration is necessary to get up and running. A hosted queue is important when trying to move fast.
SQS is the bridge between our new Lambda services and our incumbent Rails applications. Extremely easy to use when you're already using other AWS infrastructure.
Building out a real-time streaming server to present data insights to Coolfront Mobile customers and internal sales and marketing teams.
Primary message queue. Enqueueing operations revert to a local file-system-based queue when SQS is unavailable.
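One way the fallback described above could look, as a sketch only: the original implementation is not shown. Here `send` stands in for whatever call delivers a message to SQS, and the names enqueue, flush_spool, and spool_dir are hypothetical.

```python
import json
import os
import time
import uuid


def enqueue(send, body, spool_dir):
    """Try the remote queue first; if that fails, write the message to disk.

    Returns "remote" or "local" depending on where the message ended up.
    """
    try:
        send(body)
        return "remote"
    except Exception:  # real code would catch the client's specific errors
        os.makedirs(spool_dir, exist_ok=True)
        name = f"{time.time_ns()}-{uuid.uuid4().hex}.json"
        tmp_path = os.path.join(spool_dir, name + ".tmp")
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump({"body": body}, f)
        # Rename last, so a reader never sees a half-written file.
        os.replace(tmp_path, os.path.join(spool_dir, name))
        return "local"


def flush_spool(send, spool_dir):
    """Resend spooled messages, stopping at the first failure.

    Returns the number of messages successfully resent.
    """
    if not os.path.isdir(spool_dir):
        return 0
    sent = 0
    # Timestamp-prefixed names sort roughly oldest first.
    for name in sorted(n for n in os.listdir(spool_dir) if n.endswith(".json")):
        path = os.path.join(spool_dir, name)
        with open(path, encoding="utf-8") as f:
            body = json.load(f)["body"]
        try:
            send(body)
        except Exception:
            break  # remote queue still unavailable; try again later
        os.remove(path)
        sent += 1
    return sent
```

With boto3, `send` could be something like `lambda body: sqs.send_message(QueueUrl=queue_url, MessageBody=body)`, with flush_spool run periodically once SQS is reachable again.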
I can't afford to lose data if Dynamo throttles my writes, so everything goes into a message queue first.
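A rough sketch of the consumer side of this queue-first pattern, assuming a boto3-style SQS client and JSON message bodies. The function drain_once and the write_item callback are hypothetical names; the key point is that a message is deleted only after the write succeeds, so a throttled write leaves it on the queue to be retried.

```python
import json


def drain_once(sqs, queue_url, write_item, max_messages=10):
    """Receive one batch from the queue and write each message to the database.

    `sqs` is assumed to be a boto3 SQS client. `write_item` is any callable
    that raises on failure (for example, when the write is throttled).
    Returns the number of messages written and deleted.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
        WaitTimeSeconds=10,  # long polling
    )
    written = 0
    for msg in resp.get("Messages", []):
        try:
            write_item(json.loads(msg["Body"]))
        except Exception:  # real code would catch the client's specific errors
            # Leave the message alone; it becomes visible again after the
            # queue's visibility timeout and will be retried.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        written += 1
    return written
```

With a boto3 DynamoDB Table resource, `write_item` could be `lambda item: table.put_item(Item=item)`. Because a message can be delivered more than once, the write should be safe to repeat.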