StackShareStackShare
Follow on
StackShare

Discover and share technology stacks from companies around the world.

Follow on

© 2025 StackShare. All rights reserved.

Product

  • Stacks
  • Tools
  • Feed

Company

  • About
  • Contact

Legal

  • Privacy Policy
  • Terms of Service
  1. Stackups
  2. Application & Data
  3. Infrastructure as a Service
  4. Cloud Storage
  5. Amazon S3 vs Kafka

Amazon S3 vs Kafka

OverviewDecisionsComparisonAlternatives

Overview

Amazon S3
Amazon S3
Stacks55.1K
Followers40.2K
Votes2.0K
Kafka
Kafka
Stacks24.2K
Followers22.3K
Votes607
GitHub Stars31.2K
Forks14.8K

Amazon S3 vs Kafka: What are the differences?

  1. Amazon S3: Amazon Simple Storage Service (S3) is a scalable storage service offered by Amazon Web Services (AWS). It provides developers with the ability to store and retrieve any amount of data at any time, from anywhere on the web. S3 is designed to provide durability, availability, and scalability for data storage needs.

  2. Kafka: Apache Kafka is a distributed streaming platform that is used for building real-time data pipelines and streaming applications. It is designed to handle high volumes of data, provide fault-tolerance, and support real-time processing of data streams. Kafka is widely used for building event-driven architectures and processing real-time data feeds.

  3. Storage vs Messaging: The key difference between S3 and Kafka lies in their primary use cases. S3 is primarily used for storage purposes, offering a scalable and durable solution for storing and retrieving objects like files, images, and videos. On the other hand, Kafka is designed for messaging and real-time data streaming, providing a publish-subscribe model to handle streams of data.

  4. Data Persistence: While S3 offers long-term data persistence and storage, Kafka is more focused on processing real-time data streams. S3 ensures durability and availability of stored objects, allowing them to be accessed at any time. In contrast, Kafka focuses on the real-time processing of data streams, allowing for high-throughput and fault-tolerant streaming applications.

  5. Retrieval and Querying: In S3, objects are stored and retrieved using APIs with simple key-value semantics. It provides a simple and efficient way to store and retrieve data, but querying the data requires additional processing and tools. In Kafka, data is stored in topics and can be retrieved by consumers subscribing to specific topics. Kafka provides powerful features for real-time data processing and querying, making it suitable for streaming applications.

  6. Event-driven Architecture: Kafka is commonly used in event-driven architectures, where it acts as a messaging system that enables the flow of data between different components and services. S3, on the other hand, is not designed for real-time event-driven architectures but rather for reliable and scalable storage of objects.

In Summary, Amazon S3 is a scalable storage service primarily used for storing and retrieving objects, while Kafka is a distributed streaming platform focused on real-time data processing and messaging. S3 provides long-term data persistence and retrieval with simple key-value semantics, while Kafka enables high-throughput real-time streaming and messaging in event-driven architectures.

Share your Stack

Help developers discover the tools you use. Get visibility for your team's tech choices and contribute to the community's knowledge.

View Docs
CLI (Node.js)
or
Manual

Advice on Amazon S3, Kafka

viradiya
viradiya

Apr 12, 2020

Needs adviceonAngularJSAngularJSASP.NET CoreASP.NET CoreMSSQLMSSQL

We are going to develop a microservices-based application. It consists of AngularJS, ASP.NET Core, and MSSQL.

We have 3 types of microservices. Emailservice, Filemanagementservice, Filevalidationservice

I am a beginner in microservices. But I have read about RabbitMQ, but come to know that there are Redis and Kafka also in the market. So, I want to know which is best.

933k views933k
Comments
Ishfaq
Ishfaq

Feb 28, 2020

Needs advice

Our backend application is sending some external messages to a third party application at the end of each backend (CRUD) API call (from UI) and these external messages take too much extra time (message building, processing, then sent to the third party and log success/failure), UI application has no concern to these extra third party messages.

So currently we are sending these third party messages by creating a new child thread at end of each REST API call so UI application doesn't wait for these extra third party API calls.

I want to integrate Apache Kafka for these extra third party API calls, so I can also retry on failover third party API calls in a queue(currently third party messages are sending from multiple threads at the same time which uses too much processing and resources) and logging, etc.

Question 1: Is this a use case of a message broker?

Question 2: If it is then Kafka vs RabitMQ which is the better?

804k views804k
Comments
Mohammad
Mohammad

Aug 30, 2020

Needs adviceonBackblaze B2 Cloud StorageBackblaze B2 Cloud StoragePHPPHPLaravelLaravel

Hello! I have a mobile app with nearly 100k MAU, and I want to add a cloud file storage service to my app.

My app will allow users to store their image, video, and audio files and retrieve them to their device when necessary.

I have already decided to use PHP & Laravel as my backend, and I use Contabo VPS. Now, I need an object storage service for my app, and my options are:

  • Amazon S3 : It sounds to me like the best option but the most expensive. Closest to my users (MENA Region) for other services, I will have to go to Europe. Not sure how important this is?

  • DigitalOcean Spaces : Seems like my best option for price/service, but I am still not sure

  • Wasabi: the best price (6 USD/MONTH/TB) and free bandwidth, but I am not sure if it fits my needs as I want to allow my users to preview audio and video files. They don't recommend their service for streaming videos.

  • Backblaze B2 Cloud Storage: Good price but not sure about them.

  • There is also the self-hosted s3 compatible option, but I am not sure about that.

Any thoughts will be helpful. Also, if you think I should post in a different sub, please tell me.

180k views180k
Comments

Detailed Comparison

Amazon S3
Amazon S3
Kafka
Kafka

Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web

Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.

Write, read, and delete objects containing from 1 byte to 5 terabytes of data each. The number of objects you can store is unlimited.;Each object is stored in a bucket and retrieved via a unique, developer-assigned key.;A bucket can be stored in one of several Regions. You can choose a Region to optimize for latency, minimize costs, or address regulatory requirements. Amazon S3 is currently available in the US Standard, US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), South America (Sao Paulo), and GovCloud (US) Regions. The US Standard Region automatically routes requests to facilities in Northern Virginia or the Pacific Northwest using network maps.;Objects stored in a Region never leave the Region unless you transfer them out. For example, objects stored in the EU (Ireland) Region never leave the EU.;Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.;Options for secure data upload/download and encryption of data at rest are provided for additional data protection.;Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.;Built to be flexible so that protocol or functional layers can easily be added. The default download protocol is HTTP. A BitTorrent protocol interface is provided to lower costs for high-scale distribution.;Provides functionality to simplify manageability of data through its lifetime. Includes options for segregating data by buckets, monitoring and controlling spend, and automatically archiving data to even lower cost storage options. These options can be easily administered from the Amazon S3 Management Console.;Reliability backed with the Amazon S3 Service Level Agreement.
Written at LinkedIn in Scala;Used by LinkedIn to offload processing of all page and other views;Defaults to using persistence, uses OS disk cache for hot data (has higher throughput then any of the above having persistence enabled);Supports both on-line as off-line processing
Statistics
GitHub Stars
-
GitHub Stars
31.2K
GitHub Forks
-
GitHub Forks
14.8K
Stacks
55.1K
Stacks
24.2K
Followers
40.2K
Followers
22.3K
Votes
2.0K
Votes
607
Pros & Cons
Pros
  • 590
    Reliable
  • 492
    Scalable
  • 456
    Cheap
  • 329
    Simple & easy
  • 83
    Many sdks
Cons
  • 7
    Permissions take some time to get right
  • 6
    Takes time/work to organize buckets & folders properly
  • 6
    Requires a credit card
  • 3
    Complex to set up
Pros
  • 126
    High-throughput
  • 119
    Distributed
  • 92
    Scalable
  • 86
    High-Performance
  • 66
    Durable
Cons
  • 32
    Non-Java clients are second-class citizens
  • 29
    Needs Zookeeper
  • 9
    Operational difficulties
  • 5
    Terrible Packaging

What are some alternatives to Amazon S3, Kafka?

RabbitMQ

RabbitMQ

RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received.

Celery

Celery

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

Amazon SQS

Amazon SQS

Transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available. With SQS, you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price for only what you use.

NSQ

NSQ

NSQ is a realtime distributed messaging platform designed to operate at scale, handling billions of messages per day. It promotes distributed and decentralized topologies without single points of failure, enabling fault tolerance and high availability coupled with a reliable message delivery guarantee. See features & guarantees.

Amazon EBS

Amazon EBS

Amazon EBS volumes are network-attached, and persist independently from the life of an instance. Amazon EBS provides highly available, highly reliable, predictable storage volumes that can be attached to a running Amazon EC2 instance and exposed as a device within the instance. Amazon EBS is particularly suited for applications that require a database, file system, or access to raw block level storage.

ActiveMQ

ActiveMQ

Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License.

Google Cloud Storage

Google Cloud Storage

Google Cloud Storage allows world-wide storing and retrieval of any amount of data and at any time. It provides a simple programming interface which enables developers to take advantage of Google's own reliable and fast networking infrastructure to perform data operations in a secure and cost effective manner. If expansion needs arise, developers can benefit from the scalability provided by Google's infrastructure.

ZeroMQ

ZeroMQ

The 0MQ lightweight messaging kernel is a library which extends the standard socket interfaces with features traditionally provided by specialised messaging middleware products. 0MQ sockets provide an abstraction of asynchronous message queues, multiple messaging patterns, message filtering (subscriptions), seamless access to multiple transport protocols and more.

Apache NiFi

Apache NiFi

An easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.

Azure Storage

Azure Storage

Azure Storage provides the flexibility to store and retrieve large amounts of unstructured data, such as documents and media files with Azure Blobs; structured nosql based data with Azure Tables; reliable messages with Azure Queues, and use SMB based Azure Files for migrating on-premises applications to the cloud.

Related Comparisons

Bootstrap
Materialize

Bootstrap vs Materialize

Laravel
Django

Django vs Laravel vs Node.js

Bootstrap
Foundation

Bootstrap vs Foundation vs Material UI

Node.js
Spring Boot

Node.js vs Spring-Boot

Liquibase
Flyway

Flyway vs Liquibase