Amazon DynamoDB vs Kafka

Amazon DynamoDB vs Kafka: What are the differences?

What is Amazon DynamoDB? A fully managed NoSQL database service. All data items are stored on solid-state drives (SSDs) and replicated across three Availability Zones for high availability and durability. With DynamoDB, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.

What is Kafka? A distributed, fault-tolerant, high-throughput pub-sub messaging system. Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
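To make the "partitioned commit log" idea concrete, here is a toy in-memory sketch. It is not Kafka's client API: the `PartitionedLog` class and its methods are hypothetical, `crc32` is only a stand-in for a real partitioner, and replication across brokers is left out entirely. It shows the two properties that matter for the comparison: records with the same key land in the same partition in order, and reading does not remove anything, so consumers can replay from any offset.

```python
from zlib import crc32


class PartitionedLog:
    """Toy in-memory stand-in for a partitioned, append-only commit log."""

    def __init__(self, num_partitions=3):
        # One append-only list of (key, value) records per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append a record and return (partition, offset).

        Records with the same key always go to the same partition.
        """
        partition = crc32(key.encode("utf-8")) % len(self.partitions)
        records = self.partitions[partition]
        records.append((key, value))
        return partition, len(records) - 1

    def read(self, partition, offset=0):
        """Return a copy of all records from `offset` onward; nothing is deleted."""
        return list(self.partitions[partition][offset:])


log = PartitionedLog(num_partitions=3)
partition, first_offset = log.append("user-1", "page_view")
log.append("user-1", "click")

# Two independent consumers read the same partition from different offsets.
consumer_a = log.read(partition, first_offset)
consumer_b = log.read(partition, first_offset + 1)
```

Because the log keeps every record, `consumer_a` can be re-run later and will see the same events again, which a traditional queue that deletes messages on delivery does not allow.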

Amazon DynamoDB and Kafka are primarily classified as "NoSQL Database as a Service" and "Message Queue" tools respectively.

Some of the features offered by Amazon DynamoDB are:

  • Automated Storage Scaling: There is no limit to the amount of data you can store in a DynamoDB table, and the service automatically allocates more storage as you store more data using the DynamoDB write APIs.
  • Provisioned Throughput: When creating a table, simply specify how much request capacity you require. DynamoDB allocates dedicated resources to your table to meet your performance requirements, and automatically partitions data over a sufficient number of servers to meet your request capacity. If your throughput requirements change, simply update your table's request capacity using the AWS Management Console or the Amazon DynamoDB APIs. You are still able to achieve your prior throughput levels while scaling is underway.
  • Fully Distributed, Shared Nothing Architecture: Amazon DynamoDB scales horizontally and can seamlessly scale a single table over hundreds of servers.
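
The provisioned-throughput workflow in the list above can be sketched as the request parameters for DynamoDB's CreateTable and UpdateTable operations. This is a minimal sketch under stated assumptions: the table name `usage_events`, the key `device_id`, the capacity numbers, and both helper functions are hypothetical. The commented lines show how the dictionaries might be passed to boto3, assuming it is installed and AWS credentials are configured.

```python
def create_table_request(table_name, hash_key, read_units, write_units):
    """Build keyword arguments for a CreateTable call in provisioned mode."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


def update_throughput_request(table_name, read_units, write_units):
    """Build keyword arguments for an UpdateTable call that changes capacity."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical table and capacity values, for illustration only.
create_args = create_table_request("usage_events", "device_id", 5, 5)
update_args = update_throughput_request("usage_events", 50, 20)

# With boto3 installed and credentials configured, these could be sent as:
#   import boto3
#   client = boto3.client("dynamodb")
#   client.create_table(**create_args)
#   client.update_table(**update_args)
```

Keeping the request construction separate from the network call makes the capacity settings easy to review and test before anything is provisioned.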

On the other hand, Kafka provides the following key features:

  • Written at LinkedIn in Scala
  • Used by LinkedIn to offload processing of all page and other views
  • Defaults to using persistence, uses OS disk cache for hot data (has higher throughput than any of the above having persistence enabled)

"Predictable performance and cost" is the primary reason why developers consider Amazon DynamoDB over the competitors, whereas "High-throughput" is stated as the key factor in picking Kafka.

Kafka is an open source tool with 12.7K GitHub stars and 6.81K GitHub forks.

According to the StackShare community, Kafka has broader approval, being mentioned in 509 company stacks and 470 developer stacks, compared to Amazon DynamoDB, which is listed in 444 company stacks and 187 developer stacks.

What are some alternatives to Amazon DynamoDB and Kafka?
  • Google Cloud Datastore: Use a managed, NoSQL, schemaless database for storing non-relational data. Cloud Datastore automatically scales as you need it and supports transactions as well as robust, SQL-like queries.
  • MongoDB: MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
  • Amazon SimpleDB: Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest. Behind the scenes, Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability. Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results. Your data is automatically indexed, making it easy to quickly find the information that you need. There is no need to pre-define a schema or change a schema if new data is added later. And scale-out is as simple as creating new domains, rather than building out new servers.
  • Amazon S3: Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web.
  • MySQL: The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
Decisions about Amazon DynamoDB and Kafka
Conor Myhrvold
Tech Brand Mgr, Office of CTO at Uber Technologies · 5 upvotes · 123.3K views
Tools: Kafka Manager, Kafka, GitHub, Apache Spark, Hadoop

Why we built Marmaray, an open source generic data ingestion and dispersal framework and library for Apache Hadoop:

Built and designed by our Hadoop Platform team, Marmaray is a plug-in-based framework built on top of the Hadoop ecosystem. Users can add support to ingest data from any source and disperse it to any sink using Apache Spark. The name, Marmaray, comes from a tunnel in Turkey connecting Europe and Asia. Similarly, we envisioned Marmaray within Uber as a pipeline connecting data from any source to any sink depending on customer preference:

https://eng.uber.com/marmaray-hadoop-ingestion-open-source/

(Direct GitHub repo: https://github.com/uber/marmaray)

Roman Bulgakov
Senior Back-End Developer, Software Architect at Chemondis GmbH · 3 upvotes · 10.5K views
Tools: Kafka

I use Kafka because it has almost infinite scalability in terms of processing events (it can be scaled to process hundreds of thousands of events) and great monitoring (all sorts of metrics are exposed via JMX).

Downsides of using Kafka are:

  • You have to deal with ZooKeeper.
  • You have to implement advanced routing yourself (unlike RabbitMQ, it has no advanced routing).
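
As an illustration of what implementing routing yourself could involve, here is a small sketch of AMQP-style topic matching done in application code before a producer picks a Kafka topic. It is not part of the original post and not a Kafka or RabbitMQ API: the `matches` and `topics_for` helpers, the `ROUTES` table, and all topic names are hypothetical. The wildcard rules follow the convention used by AMQP topic exchanges, where `*` stands for exactly one dot-separated word and `#` for zero or more words.

```python
def matches(pattern, routing_key):
    """Return True if a dot-separated routing key matches the pattern.

    '*' matches exactly one word; '#' matches zero or more words.
    """
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # Try letting '#' consume 0, 1, ..., len(k) words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical routing table: (pattern, destination topic name).
ROUTES = [
    ("orders.*.created", "orders-created"),
    ("orders.#", "orders-all"),
    ("#", "audit"),
]


def topics_for(routing_key):
    """List every destination topic whose pattern matches the routing key."""
    return [topic for pattern, topic in ROUTES if matches(pattern, routing_key)]


created_topics = topics_for("orders.eu.created")
refund_topics = topics_for("orders.eu.refund.issued")
payment_topics = topics_for("payments.failed")
```

A producer would then publish the same event to each returned topic, which is the kind of fan-out logic a RabbitMQ topic exchange provides out of the box.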

Tools: RabbitMQ, Kafka

The question of which message queue to use mentioned "availability, distributed, scalability, and monitoring". I don't think that this excludes many options. It does not sound like you would take advantage of Kafka's strengths (replayability, based on an event sourcing architecture). You could pick one of the AMQP options.

I would recommend the RabbitMQ message broker, which not only implements the AMQP 0.9.1 standard (it can support 1.x or other protocols as well) but also has several very useful extensions built in. It ticks the boxes you mentioned, and on top of that you get a very flexible system that lets you build the architecture and pick the options and trade-offs that suit your case best.

For more information about RabbitMQ, please have a look at the linked markdown I assembled. The second half explains many configuration options. It also contains links to managed hosting and to libraries (though it is missing Python's - which should be Puka, I assume).

Frédéric MARAND
Core Developer at OSInet · 2 upvotes · 91.9K views
Tools: RabbitMQ, Beanstalkd, Kafka

I used Kafka originally because it was mandated as part of the top-level IT requirements at a Fortune 500 client. What I found was that it was orders of magnitude more complex... and powerful than my daily Beanstalkd, and far more flexible, resilient, and manageable than RabbitMQ.

So for any case where utmost flexibility and resilience are part of the deal, I would use Kafka again. But due to the complexities involved, for any time where this level of scalability is not required, I would probably just use Beanstalkd for its simplicity.

I tend to find RabbitMQ to be in an uncomfortable middle place between these two extremes.

Doru Mihai
Solution Architect · 4 upvotes · 454 views
Tools: Amazon DynamoDB

I use Amazon DynamoDB because it integrates seamlessly with other AWS SaaS solutions. If cost is the primary concern early on, it is a better choice than AWS RDS or any other solution that requires creating an HA cluster of IaaS components, which cost money just for being there, with costs that are not driven primarily by usage.

How developers use Amazon DynamoDB and Kafka
Pinterest uses Kafka

http://media.tumblr.com/d319bd2624d20c8a81f77127d3c878d0/tumblr_inline_nanyv6GCKl1s1gqll.png

Front-end messages are logged to Kafka by our API and application servers. We have batch processing (on the middle-left) and real-time processing (on the middle-right) pipelines to process the experiment data. For batch processing, after the daily raw logs get to S3, we start our nightly experiment workflow to figure out experiment user groups and experiment metrics. We use our in-house workflow management system Pinball to manage the dependencies of all these MapReduce jobs.

Karma uses Amazon DynamoDB

For most of the stuff we use MySQL. We just use Amazon RDS. But for some stuff we use Amazon DynamoDB. We love DynamoDB. It's amazing. We store usage data in there, for example. I think we have close to seven or eight hundred million records in there and it's scaled like you don't even notice it. You never notice any performance degradation whatsoever. It's insane, and the last time I checked we were paying 150 bucks for that.

Volkan Özçelik uses Amazon DynamoDB

zerotoherojs.com's user base and course details are stored in DynamoDB tables.

The good thing about AWS DynamoDB is that, for the amount of traffic that I have, it is free. It is highly scalable, it is managed by Amazon, and it is pretty fast.

It is, again, one less thing to worry about (when compared to managing your own MongoDB elsewhere).

CloudRepo uses Amazon DynamoDB

We store customer metadata in DynamoDB. We decided to use Amazon DynamoDB because it was a fully managed, highly available solution. We didn't want to operate our own SQL server and we wanted to ensure that we built CloudRepo on high availability components so that we could pass that benefit back to our customers.

nrise uses Amazon DynamoDB

Some logs are currently written to AWS DynamoDB. We plan to move them to MongoDB as part of an improvement effort. It is not bad for accumulating very simple data. However, querying is very limited. Be sure to check DynamoDB's specifications before using it.

Coolfront Technologies uses Kafka

Building out a real-time streaming server to present data insights to Coolfront Mobile customers and internal sales and marketing teams.

HyperTrack uses Amazon DynamoDB

To store device health records as it allows super fast writes and range queries.
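
A range query of the kind mentioned here could look like the following request for DynamoDB's Query operation. This is a sketch under assumptions, since HyperTrack's actual schema is not described: the table `device_health`, the partition key `device_id`, the numeric sort key `recorded_at`, and the helper function are all hypothetical. The commented lines show how the dictionary might be passed to boto3, assuming it is installed and credentials are configured.

```python
def device_health_query(table_name, device_id, start_ts, end_ts):
    """Build keyword arguments for a Query over one device's time range.

    Assumes a table with partition key `device_id` (string) and
    sort key `recorded_at` (number, e.g. a Unix timestamp).
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": (
            "device_id = :d AND recorded_at BETWEEN :lo AND :hi"
        ),
        "ExpressionAttributeValues": {
            ":d": {"S": device_id},
            # The low-level API transmits numbers as strings.
            ":lo": {"N": str(start_ts)},
            ":hi": {"N": str(end_ts)},
        },
    }


# Hypothetical device ID and time window, for illustration only.
query_args = device_health_query("device_health", "dev-42", 1700000000, 1700003600)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   items = boto3.client("dynamodb").query(**query_args)["Items"]
```

The query touches only one partition key and a contiguous slice of the sort key, which is the access pattern DynamoDB range queries are designed for.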

ShareThis uses Kafka

We are using Kafka as a message queue to process our widget logs.

Christopher Davison uses Kafka

Used for communications and triggering jobs across ETL systems.

theskyinflames uses Kafka

Used as integration middleware for exchanging messages.
