StackShare

Discover and share technology stacks from companies around the world.

© 2025 StackShare. All rights reserved.
Kafka

#1 in Background Jobs · 68 discussions · 22.3k followers

What is Kafka?

Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.

Kafka is a tool in the Background Jobs category of a tech stack.
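The "distributed, partitioned, replicated commit log" idea can be sketched in a few lines of illustrative Python (an in-memory stand-in for the concept, not the real client API): each partition is an append-only log addressed by offset, and every consumer keeps its own cursor into it.

```python
class Partition:
    """An append-only log: messages are kept in order and addressed by offset."""
    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # the offset assigned to this message

    def read(self, offset):
        """Return all messages at or after `offset`."""
        return self.log[offset:]


class Consumer:
    """Each consumer tracks its own offset, so many readers share one log."""
    def __init__(self, partition):
        self.partition = partition
        self.offset = 0

    def poll(self):
        messages = self.partition.read(self.offset)
        self.offset += len(messages)  # commit: advance past what we've seen
        return messages


p = Partition()
p.append(b"page_view:1")
p.append(b"page_view:2")

c1, c2 = Consumer(p), Consumer(p)
print(c1.poll())  # both consumers independently see the full log
print(c2.poll())
```

Because offsets belong to consumers rather than to the broker, many independent readers can share one retained stream — which is what distinguishes this design from a queue that deletes each message on delivery.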

Key Features

  • Written at LinkedIn in Scala
  • Used by LinkedIn to offload processing of all page views and other views
  • Defaults to persistence; uses the OS disk cache for hot data (higher throughput than comparable systems with persistence enabled)
  • Supports both online and offline processing

Kafka Pros & Cons

Pros of Kafka

  • ✓ High-throughput
  • ✓ Distributed
  • ✓ Scalable
  • ✓ High-performance
  • ✓ Durable
  • ✓ Publish-subscribe
  • ✓ Simple to use
  • ✓ Open source
  • ✓ Written in Scala and Java; runs on the JVM
  • ✓ Message broker + streaming system

Cons of Kafka

  • ✗ Non-Java clients are second-class citizens
  • ✗ Needs ZooKeeper
  • ✗ Operational difficulties
  • ✗ Terrible packaging

Kafka Alternatives & Comparisons

What are some alternatives to Kafka?

RabbitMQ

RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received.

Amazon SQS

Transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available. With SQS, you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price for only what you use.

Celery

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

MQTT

It was designed as an extremely lightweight publish/subscribe messaging transport. It is useful for connections with remote locations where a small code footprint is required and/or network bandwidth is at a premium.

ActiveMQ

Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License.

Apache NiFi

An easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.

Kafka Integrations

Kafka Manager, Apache Flink, ContainerShip, Netuitive, MapD and 7 more are some of the popular tools that integrate with Kafka. Here's a list of all 12 tools that integrate with Kafka.

  • Kafka Manager
  • Apache Flink
  • ContainerShip
  • Netuitive
  • MapD
  • K8Guard
  • Kubeless
  • Interana
  • DoctorKafka
  • Woopra
  • Honeycomb
  • Cilium

Kafka Discussions

Discover why developers choose Kafka. Read real-world technical decisions and stack choices from the StackShare community. Showing 4 of 5 discussions.

Marc Bollinger

Infra & Data Eng Manager at Lumosity

Dec 3, 2018

Needs advice on Node.js, Ruby, and Kafka

Lumosity is home to the world's largest cognitive training database, a responsibility we take seriously. For most of the company's history, our analysis of user behavior and training data has been powered by an event stream: first a simple Node.js pub/sub app, then a heavyweight Ruby app with stronger durability. Both supported decent throughput and latency, but they lacked some major features supported by existing open-source alternatives: replaying existing messages (also lacking in most message-queue-based solutions), scaling out many different readers for the same stream, the ability to leverage existing solutions for reading and writing, and, possibly most importantly, the ability to hire someone externally who already had expertise.

We ultimately migrated to Kafka in early-to-mid 2016, citing both industry trends among companies we'd talked to with similar durability and throughput needs and Kafka's extremely strong documentation and community. We pored over Kyle Kingsbury's Jepsen post (https://aphyr.com/posts/293-jepsen-Kafka), as well as Jay Kreps' follow-up (http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen), talked at length with Confluent folks and community members, and still wound up running parallel systems for quite a long time, but ultimately we've been very, very happy. Understanding the internals and the proper levers takes some commitment, but it's taken very little maintenance once configured. Since then, the Confluent Platform community has grown and grown; we've gone from doing most development using custom Scala consumers and producers to being 60/40 Kafka Streams/Connect.

We originally looked into Apache Storm / Heron, and we'd moved on from Redis pub/sub. Heron looks great, but we already had a programming model across services that was more akin to message consumers than to a topology of bolts, etc. Heron had also just come out while we were starting to migrate things, and the community momentum and direction of Kafka felt more substantial than the older Storm's. If we were to start the process over again today, we might check out Apache Pulsar, although the ecosystem is much younger.

To find out more, read our 2017 engineering blog post about the migration!
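The replay capability called out above falls naturally out of the log model: a consumer is only a cursor over retained messages, so rewinding it re-delivers them. A toy sketch of the idea (illustrative Python, not the Kafka client API, where the equivalent operation is an offset seek):

```python
class ReplayableConsumer:
    """A consumer over an append-only log that can rewind to any retained offset."""
    def __init__(self, log):
        self.log = log    # shared, append-only list of messages
        self.offset = 0

    def poll(self):
        messages = self.log[self.offset:]
        self.offset = len(self.log)
        return messages

    def seek(self, offset):
        """Move the cursor; messages are re-delivered on the next poll."""
        self.offset = offset


log = [b"signup", b"train", b"score"]
c = ReplayableConsumer(log)
first = c.poll()     # reads everything once
c.seek(0)            # rewind to the beginning
replayed = c.poll()  # the same messages are delivered again
print(first == replayed)
```

A pub/sub system that pushes messages and then forgets them cannot do this, which is the gap Marc describes in the earlier Node.js and Ruby systems.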

Serhii Almazov

VP of Architecture / Deputy CTO

Nov 30, 2018

Needs advice on .NET, C#, and Kubernetes

I started using .NET in the early 2000s. Ever since version .NET 3.5 (and even .NET 2.0 if we take a proper generics implementation into account), C# was dominating in the feature battle against its rival, yet wasn't advancing significantly in the product coverage due to its platform dependency.

Thus I was very excited to hear the news about plans to develop an open-sourced cross-platform .NET Core framework. We started using .NET Core in production from version 1.1, and a global decision to migrate the entire solution to .NET Core was made with the release of .NET Core 2.0. Now we have more than 100 .NET Core (micro)services running on Linux containers inside Kubernetes, using Kafka for reactive communications and a number of open-source relational and NoSQL storage engines.

Nick Rockwell

SVP, Engineering at The New York Times

Sep 24, 2018

Needs advice on MySQL, PHP, and React

When I joined NYT there was already broad dissatisfaction with the LAMP (AngularJS MySQL PHP) Stack and the front end framework, in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember from being outside the company when that was called MIT FIVE when it had launched. And been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?

So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.

React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.

Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.
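A serving store that "reads off of a Kafka pipeline", as described above, is essentially a materialized view: change events are consumed in log order and folded into a table. A minimal in-memory sketch with invented event shapes (the actual Bodega service is not public):

```python
def materialize(events):
    """Fold an ordered stream of change events into a key/value view.

    Each event is (key, value); later events for the same key win,
    which is exactly why log ordering matters.
    """
    view = {}
    for key, value in events:
        if value is None:
            view.pop(key, None)  # a None value acts as a delete (tombstone)
        else:
            view[key] = value
    return view


events = [
    ("article:1", {"headline": "Draft"}),
    ("article:2", {"headline": "Other"}),
    ("article:1", {"headline": "Final"}),  # update supersedes the draft
    ("article:2", None),                   # tombstone removes the article
]
print(materialize(events))
# {'article:1': {'headline': 'Final'}}
```

Replaying the topic from offset zero rebuilds the view from scratch, which is why a log-backed store can be recovered or re-derived at any time.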

Dan Robinson

Sep 13, 2018

Needs advice on Heap, Citus, and PostgreSQL

At Heap, we searched for an existing tool that would allow us to express the full range of analyses we needed, index the event definitions that made up the analyses, and was a mature, natively distributed system.

After coming up empty on this search, we decided to compromise on the “maturity” requirement and build our own distributed system around Citus and sharded PostgreSQL. It was at this point that we also introduced Kafka as a queueing layer between the Node.js application servers and Postgres.

If we could go back in time, we probably would have started using Kafka on day one. One of the biggest benefits in adopting Kafka has been the peace of mind that it brings. In an analytics infrastructure, it’s often possible to make data ingestion idempotent.

In Heap’s case, that means that, if anything downstream from Kafka goes down, we won’t lose any data – it’s just going to take a bit longer to get to its destination. We also learned that you want the path between data hitting your servers and your initial persistence layer (in this case, Kafka) to be as short and simple as possible, since that is the surface area where a failure means you can lose customer data. We learned that it’s a very good fit for an analytics tool, since you can handle a huge number of incoming writes with relatively low latency. Kafka also gives you the ability to “replay” the data flow: it’s like a commit log for your whole infrastructure.
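The idempotent-ingestion property described above can be sketched as deduplication on a stable event id, so that messages re-delivered after a downstream outage are applied at most once. A toy example with invented field names:

```python
class IdempotentSink:
    """Ingest events at most once by remembering stable event ids."""
    def __init__(self):
        self.seen = set()
        self.rows = []

    def ingest(self, event):
        """Apply an event; redelivery of the same id is a harmless no-op."""
        if event["id"] in self.seen:
            return False
        self.seen.add(event["id"])
        self.rows.append(event)
        return True


sink = IdempotentSink()
batch = [{"id": "e1", "user": "a"}, {"id": "e2", "user": "b"}]
for e in batch + batch:  # the whole batch is replayed after a failure
    sink.ingest(e)
print(len(sink.rows))    # 2: the replay did not duplicate anything
```

With a sink like this downstream, Kafka's at-least-once delivery is safe: losing a consumer just delays data rather than corrupting it, which is the "peace of mind" Dan describes.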

#MessageQueue #Databases #FrameworksFullStack


Adoption on StackShare

Companies: 1.68k
Developers: 21.9k