Amazon Kinesis vs Amazon SQS: What are the differences?
Developers describe Amazon Kinesis as "Store and process terabytes of data each hour from hundreds of thousands of sources". Amazon Kinesis can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data. On the other hand, Amazon SQS is detailed as "Fully managed message queuing service". Transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available. With SQS, you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price for only what you use.
Amazon Kinesis belongs to "Real-time Data Processing" category of the tech stack, while Amazon SQS can be primarily classified under "Message Queue".
Some of the features offered by Amazon Kinesis are:
- Real-time Processing- Amazon Kinesis enables you to collect and analyze information in real-time, allowing you to answer questions about the current state of your data, from inventory levels to stock trade frequencies, rather than having to wait for an out-of-date report.
- Easy to use- You can create a new stream, set the throughput requirements, and start streaming data quickly and easily. Amazon Kinesis automatically provisions and manages the storage required to reliably and durably collect your data stream.
- High throughput. Elastic.- Amazon Kinesis seamlessly scales to match the data throughput rate and volume of your data, from megabytes to terabytes per hour. Amazon Kinesis will scale up or down based on your needs.
On the other hand, Amazon SQS provides the following key features:
- A queue can be created in any region.
- The message payload can contain up to 256KB of text in any format. Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests.
- Messages can be sent, received or deleted in batches of up to 10 messages or 256KB. Batches cost the same amount as single messages, meaning SQS can be even more cost effective for customers that use batching.
Medium, Lyft, and Coursera are some of the popular companies that use Amazon SQS, whereas Amazon Kinesis is used by Instacart, Lyft, and Zillow. Amazon SQS has a broader approval, being mentioned in 384 company stacks & 103 developers stacks; compared to Amazon Kinesis, which is listed in 132 company stacks and 25 developer stacks.
What is Amazon Kinesis?
What is Amazon SQS?
Need advice about which tool to choose?Ask the StackShare community!
Sign up to add, upvote and see more prosMake informed product decisions
What are the cons of using Amazon Kinesis?
Sign up to get full access to all the companiesMake informed product decisions
Sign up to get full access to all the tool integrationsMake informed product decisions
In order to accurately measure & track user behaviour on our platform we moved over quickly from the initial solution using Google Analytics to a custom-built one due to resource & pricing concerns we had.
While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.
In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.
#ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService
We are in the process of building a modern content platform to deliver our content through various channels. We decided to go with Microservices architecture as we wanted scale. Microservice architecture style is an approach to developing an application as a suite of small independently deployable services built around specific business capabilities. You can gain modularity, extensive parallelism and cost-effective scaling by deploying services across many distributed servers. Microservices modularity facilitates independent updates/deployments, and helps to avoid single point of failure, which can help prevent large-scale outages. We also decided to use Event Driven Architecture pattern which is a popular distributed asynchronous architecture pattern used to produce highly scalable applications. The event-driven architecture is made up of highly decoupled, single-purpose event processing components that asynchronously receive and process events.
To build our #Backend capabilities we decided to use the following: 1. #Microservices - Java with Spring Boot , Node.js with ExpressJS and Python with Flask 2. #Eventsourcingframework - Amazon Kinesis , Amazon Kinesis Firehose , Amazon SNS , Amazon SQS, AWS Lambda 3. #Data - Amazon RDS , Amazon DynamoDB , Amazon S3 , MongoDB Atlas
To build #Webapps we decided to use Angular 2 with RxJS
#Devops - GitHub , Travis CI , Terraform , Docker , Serverless
In the beginning we thought we wanted to start using something like RabbitMQ or maybe Kafka or maybe ActiveMQ. Back then we only had a few developers and no ops people. That has changed now, but we didn't really look forward to setting up a queuing cluster and making sure that all works.
What we did instead was we looked at what services Amazon offers to see if we can use those to build our own messaging system within those services. That's basically what we did. We wrote some clients in Ruby that can basically do the entire orchestration for us, and we run all our messaging on both SNS and SQS. Basically what you can do in Amazon services is you can use Amazon Simple Notification Service, so SNS, for creating topics and you can use queues to subscribe to these topics. That's basically all you need for a messaging system. You don't have to worry about scalability at all. That's what really appealed to us.
This isn't exactly low-latency (10s to 100s of milliseconds), but it has good throughput and a simple API. There is good reliability, and there is no configuration necessary to get up and running. A hosted queue is important when trying to move fast.
SQS is the bridge between our new Lambda services and our incumbent Rails applications. Extremely easy to use when you're already using other AWS infrastructure.
Primary message queue. Enqueueing operations revert to a local file-system-based queue when SQS is unavailable.
I can't afford to lose data if Dynamo throttles my writes, so everything goes into a message queue first.