Needs advice
on
Apache FlinkApache Flink
and
Kafka StreamsKafka Streams

We currently have 2 Kafka Streams topics that have records coming in continuously. We're looking into joining the 2 streams based on a key with a window of 5 minutes based on their timestamp.

Should I consider kStream - kStream join or Apache Flink window joins? Or is there any other better way to achieve this?

READ LESS
4 upvotes·199.6K views
Replies (2)
Recommends
on
Apache Flink

Hi !

I would say it depends on three things:

  • the volume (number of events by seconds)
  • the versatility you need.
  • the vendor lock on

If volume is an issue then maybe flink may be a better option. Flink is distributed and is architectured as such.

If you are ok with having everything based on Kafka then maybe Kafka streams is a better option, but if at some point you need to integrate with several systems maybe you should consider flink.

If you are okay with sticking with confluent tools it’s okay, but beware of technology. Flink is more generalist than Kafka streams.

READ MORE
5 upvotes·2 comments·3.6K views
Eran Levy
Eran Levy
·
January 12th 2022 at 4:58AM

Hi, If you are already working with Kafka Streams, I suggest to continue with that otherwise there is capability that not exists there or not fitting your needs..

·
Reply
S C
S C
·
January 12th 2022 at 6:32PM

Thank you for your input. We're talking ~ 6mil records per minute so a significant amount of data. We also want to join records within a small window and push them to a different topic. We're okay with something out of confluent so looks like Fink is a better option here.

·
Reply
Team Lead at Extenda Retail AS·
Recommends
on
Kafka Streams

Hi I would recommend to continue with Kafka tools. Kafka Streams integrates seamlessly with Kafka, and you also get the benefits of automatic schema management if you choose to use Schema Registry. I see that some recommend Flink with high volumes, but Kafka Streams is perfectly capable of doing high volumes and windowed operations as well. I guess the vendor lock in, that is also mentioned, is probably done equally on the messaging side as with your processing application.

Further details can be found here: https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/

Good luck with your choice!

READ MORE
4 upvotes·344 views
Avatar of Laurent Valdes

Laurent Valdes

Architect