Heron

Realtime, distributed, fault-tolerant stream processing engine from Twitter

What is Heron?

Heron is a realtime analytics platform developed by Twitter. It is the direct successor of Apache Storm, built to be backwards compatible with Storm's topology API but with a wide array of architectural improvements.
Heron is a tool in the Stream Processing category of a tech stack.
Heron is an open source tool with 3.4K GitHub stars and 606 GitHub forks.

Who uses Heron?

Companies

Developers
9 developers on StackShare have stated that they use Heron.

Why do developers like Heron?

Here's a list of reasons why companies and developers use Heron
Heron Reviews

Here are some stack decisions, common use cases and reviews by companies and developers who chose Heron in their tech stack.

Marc Bollinger
Infra & Data Eng Manager at Lumosity | 4 upvotes 78.1K views
Node.js
Ruby
Kafka
Scala
Apache Storm
Heron
Redis
Pulsar

Lumosity is home to the world's largest cognitive training database, a responsibility we take seriously. For most of the company's history, our analysis of user behavior and training data has been powered by an event stream--first a simple Node.js pub/sub app, then a heavyweight Ruby app with stronger durability. Both supported decent throughput and latency, but they lacked some major features supported by existing open-source alternatives: replaying existing messages (also lacking in most message queue-based solutions), scaling out many different readers for the same stream, the ability to leverage existing solutions for reading and writing, and possibly most importantly: the ability to hire someone externally who already had expertise.

We ultimately migrated to Kafka in early- to mid-2016, citing both industry trends among companies we'd talked to with similar durability and throughput needs and the extremely strong documentation and community. We pored over Kyle Kingsbury's Jepsen post (https://aphyr.com/posts/293-jepsen-Kafka), as well as Jay Kreps' follow-up (http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen), talked at length with Confluent folks and community members, and still wound up running parallel systems for quite a long time, but ultimately, we've been very, very happy. Understanding the internals and proper levers takes some commitment, but it's taken very little maintenance once configured. Since then, the Confluent Platform community has grown and grown; we've gone from doing most development using custom Scala consumers and producers to being 60/40 Kafka Streams/Connect.

We originally looked into Storm/Heron, and we'd moved on from Redis pub/sub. Heron looks great, but we already had a programming model across services that was more akin to consuming messages with consumers than to one that required a topology of bolts, etc. Heron also had just come out while we were starting to migrate things, and the community momentum and direction of Kafka felt more substantial than the older Storm. If we were to start the process over again today, we might check out Pulsar, although the ecosystem is much younger.

To find out more, read our 2017 engineering blog post about the migration!

Heron Alternatives & Comparisons

What are some alternatives to Heron?
Apache Flink
Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.
Apache Storm
Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
Kafka Streams
It is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology.
Apache NiFi
An easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
Confluent
It is a data streaming platform based on Apache Kafka: a full-scale streaming platform, capable of not only publish-and-subscribe, but also the storage and processing of data within the stream.

Heron's Followers
21 developers follow Heron to keep up with related blogs and decisions.
Nurullah Özdemir
NING WANG
Matt Niedelman
Lenville Leo
afifrizqiawan
harishks
Jon Bock
Jake Foster
tepsl
Koushik Saha