Kafka vs MySQL

Kafka vs MySQL: What are the differences?

What is Kafka? A distributed, fault-tolerant, high-throughput pub-sub messaging system. Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
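
To make the "partitioned commit log" idea concrete, here is a minimal in-memory sketch in Python. It is not Kafka's API or implementation: the class and method names (`PartitionedLog`, `append`, `read`) are hypothetical, replication is left out, and hashing the key to pick a partition is an assumption made for illustration.

```python
import zlib


class PartitionedLog:
    """Toy append-only log split into partitions (illustrative only)."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: list[list[tuple[str, str]]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Stable hash so the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str) -> tuple[int, int]:
        """Append a record and return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition: int, offset: int = 0) -> list[tuple[str, str]]:
        """Return every record in a partition from `offset` onward."""
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=3)
p, first = log.append("user-1", "signed_up")
log.append("user-1", "logged_in")

# Reading never removes records, so a consumer can replay from any offset.
print(log.read(p, first))
print(log.read(p, first + 1))
```

Because reads are just slices by offset, two consumers can read the same partition at different positions without affecting each other, which is the property that distinguishes a log from a queue that deletes messages on delivery.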

What is MySQL? The world's most popular open source database. The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

Kafka and MySQL are primarily classified as "Message Queue" and "Databases" tools respectively.

"High-throughput", "Distributed" and "Scalable" are the key factors why developers consider Kafka, whereas "SQL", "Free" and "Easy" are the primary reasons why MySQL is favored.

Kafka and MySQL are both open source tools. Judging by GitHub activity, Kafka (12.5K stars, 6.7K forks) seems to have more adoption than MySQL (3.91K stars, 1.54K forks).

According to the StackShare community, MySQL has broader approval: it is mentioned in 2965 company stacks and 2945 developer stacks, compared to Kafka, which is listed in 501 company stacks and 451 developer stacks.


What are some alternatives to Kafka and MySQL?
ActiveMQ
Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License.
RabbitMQ
RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received.
Amazon Kinesis
Amazon Kinesis can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.
Apache Spark
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
Akka
Akka is a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM.
Decisions about Kafka and MySQL
StackShare Editors
MySQL, Kafka

The data-informed culture of Airbnb has been central to its success, and they have developed a sophisticated infrastructure to provide a reliable and scalable system for its users.

The basic flow sends source data into the system from two sources: a Kafka event stream and MySQL dumps delivered through Sqoop. ETL and quality checks then happen in two separate Hive clusters, which are separated to isolate compute and storage resources and to provide “disaster recovery assurances if there ever was to be an outage.”

Partitioned Hive tables help ensure the data is immutable and reproducible, and Presto is used for almost all ad hoc queries on Hive managed tables.

StackShare Editors
Presto, Apache Spark, Scala, MySQL, Kafka

Slack’s data team works to “provide an ecosystem to help people in the company quickly and easily answer questions about usage, so they can make better and data informed decisions.” To achieve that goal, they rely on a complex data pipeline.

An in-house tool called Sqooper scrapes MySQL backups and pipes them to S3. Job queue and log data are sent to Kafka and then persisted to S3 using an open source tool called Secor, which was created by Pinterest.

For compute, Amazon’s Elastic MapReduce (EMR) creates clusters preconfigured for Presto, Hive, and Spark.

Presto is then used for ad-hoc questions, validating data assumptions, exploring smaller datasets, and creating visualizations for some internal tools. Hive is used for larger data sets or longer time series data, and Spark allows teams to write efficient and robust batch and aggregation jobs. Most of the Spark pipeline is written in Scala.

Thrift binds all of these engines together with a typed schema and structured data.

Finally, the Hive Metastore serves as the ground truth for all data and its schema.

Nick Rockwell
CTO at NY Times · 27 upvotes · 262.8K views
at The New York Times
Apache HTTP Server, Kafka, Node.js, GraphQL, Apollo, React, PHP, MySQL

When I joined NYT there was already broad dissatisfaction with the LAMP (Linux Apache HTTP Server MySQL PHP) Stack and the front end framework, in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember from being outside the company when that was called MIT FIVE when it had launched. And been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?

So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.

React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.

Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.

Conor Myhrvold
Tech Brand Mgr, Office of CTO at Uber · 5 upvotes · 79.5K views
at Uber Technologies
Python, MySQL, PostgreSQL

Our most popular (& controversial!) article to date on the Uber Engineering blog in 3+ yrs. Why we moved from PostgreSQL to MySQL. In essence, it was due to a variety of limitations of Postgres at the time. Fun fact -- earlier in Uber's history we'd actually moved from MySQL to Postgres before switching back for good, & though we published the article in Summer 2016 we haven't looked back since:

The early architecture of Uber consisted of a monolithic backend application written in Python that used Postgres for data persistence. Since that time, the architecture of Uber has changed significantly, to a model of microservices and new data platforms. Specifically, in many of the cases where we previously used Postgres, we now use Schemaless, a novel database sharding layer built on top of MySQL (https://eng.uber.com/schemaless-part-one/). In this article, we’ll explore some of the drawbacks we found with Postgres and explain the decision to build Schemaless and other backend services on top of MySQL:

https://eng.uber.com/mysql-migration/

Conor Myhrvold
Tech Brand Mgr, Office of CTO at Uber · 4 upvotes · 100.6K views
at Uber Technologies
Kafka Manager, Kafka, GitHub, Apache Spark, Hadoop

Why we built Marmaray, an open source generic data ingestion and dispersal framework and library for Apache Hadoop:

Built and designed by our Hadoop Platform team, Marmaray is a plug-in-based framework built on top of the Hadoop ecosystem. Users can add support to ingest data from any source and disperse to any sink leveraging the use of Apache Spark. The name, Marmaray, comes from a tunnel in Turkey connecting Europe and Asia. Similarly, we envisioned Marmaray within Uber as a pipeline connecting data from any source to any sink depending on customer preference:

https://eng.uber.com/marmaray-hadoop-ingestion-open-source/

(Direct GitHub repo: https://github.com/uber/marmaray)

Khauth György
CTO at SalesAutopilot Kft. · 11 upvotes · 81.4K views
AWS CodePipeline, Jenkins, Docker, vuex, Vuetify, Vue.js, jQuery UI, Redis, MongoDB, MySQL, Amazon Route 53, Amazon CloudFront, Amazon SNS, Amazon CloudWatch, GitHub

I'm the CTO of a marketing automation SaaS. Because of the continuously increasing load, we moved to the AWS Cloud. We are using more and more features of AWS: Amazon CloudWatch, Amazon SNS, Amazon CloudFront, Amazon Route 53 and so on.

Our main database is MySQL, but for the hundreds of GB of document data we use MongoDB more and more. We started to use Redis for caching and other time-sensitive operations.

On the front end we use jQuery UI + Smarty, but we are now refactoring our app to use Vue.js with Vuetify. Because our app is relatively complex, we need to use vuex as well.

On the development side we use GitHub as our main repo, Docker for local and server environments, and Jenkins and AWS CodePipeline for Continuous Integration.

Roman Bulgakov
Senior Back-End Developer, Software Architect at Chemondis GmbH · 3 upvotes · 10.5K views
Kafka

I use Kafka because it has almost infinite scalability in terms of processing events (it can be scaled to process hundreds of thousands of events) and great monitoring (all sorts of metrics are exposed via JMX).

Downsides of using Kafka are:

  • You have to deal with ZooKeeper.

  • You have to implement advanced routing yourself (compared to RabbitMQ, it has no advanced routing).
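
A consumer that wants RabbitMQ-style routing on top of Kafka has to filter or dispatch records in its own code. The sketch below shows one way that could look, assuming AMQP-style topic patterns (`*` matches exactly one dot-separated word, `#` matches zero or more words). The function names and the sample bindings are hypothetical; none of this is part of Kafka or RabbitMQ.

```python
def matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more words."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern: list[str], words: list[str]) -> bool:
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # '#' may absorb zero words, or consume one word and try again.
        return _match(rest, words) or (bool(words) and _match(pattern, words[1:]))
    if not words:
        return False
    if head == "*" or head == words[0]:
        return _match(rest, words[1:])
    return False


def route(bindings: dict[str, str], routing_key: str) -> list[str]:
    """Return the handler names whose pattern matches the routing key."""
    return [name for pattern, name in bindings.items() if matches(pattern, routing_key)]


# Hypothetical bindings: pattern -> handler name.
bindings = {
    "orders.*.created": "billing",
    "orders.#": "audit",
    "payments.#": "ledger",
}

print(route(bindings, "orders.eu.created"))
print(route(bindings, "orders.eu.item.cancelled"))
```

In practice the routing key would come from a field on each consumed record (for example its key or a header), and the handler names would map to functions or downstream topics.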

RabbitMQ, Kafka

The question of which message queue to use mentioned "availability, distributed, scalability, and monitoring". I don't think that this excludes many options already. It does not sound like you would take advantage of Kafka's strengths (replayability, based on an event sourcing architecture). You could pick one of the AMQP options.

I would recommend the RabbitMQ message broker, which not only implements the AMQP standard 0.9.1 (it can support 1.x or other protocols as well) but also has several very useful extensions built in. It ticks the boxes you mentioned, and on top of that you get a very flexible system that allows you to build the architecture and pick the options and trade-offs that suit your case best.

For more information about RabbitMQ, please have a look at the linked markdown I assembled. The second half explains many configuration options. It also contains links to managed hosting and to libraries (though it is missing Python's - which should be Puka, I assume).

Julien DeFrance
Full Stack Engineering Manager at ValiMail · 16 upvotes · 281.2K views
at SmartZip
Amazon DynamoDB, Ruby, Node.js, AWS Lambda, New Relic, Amazon Elasticsearch Service, Elasticsearch, Superset, Amazon Quicksight, Amazon Redshift, Zapier, Segment, Amazon CloudFront, Memcached, Amazon ElastiCache, Amazon RDS for Aurora, MySQL, Amazon RDS, Amazon S3, Docker, Capistrano, AWS Elastic Beanstalk, Rails API, Rails, Algolia

Back in 2014, I was given an opportunity to re-architect the SmartZip Analytics platform and its flagship product, SmartTargeting. This is a SaaS product that helps real estate professionals keep up with their prospects and leads in a given neighborhood/territory, find out (thanks to predictive analytics) who is most likely to list/sell their home, and run cross-channel marketing automation against them: direct mail, online ads, email... The company also provides Data APIs to Enterprise customers.

I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.

For the SaaS product, we kept working with Rails, as this was what my team knew best. We did, however, break up the monolith and decouple the front-end application from the backend through the use of Rails API, so we'd get independently scalable micro-services from then on.

Our various applications could now be deployed using AWS Elastic Beanstalk, so we wouldn't waste any more effort writing time-consuming Capistrano deployment scripts, for instance. We combined this with Docker so each application would run within its own container, independently of the underlying host configuration.

Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side, we started with Amazon RDS / MySQL and ultimately migrated to Amazon RDS for Aurora / MySQL when it was released. Once again, here you need a managed service your cloud provider handles for you.

Future improvements / technology decisions included:

  • Caching: Amazon ElastiCache / Memcached

  • CDN: Amazon CloudFront

  • Systems Integration: Segment / Zapier

  • Data-warehousing: Amazon Redshift

  • BI: Amazon Quicksight / Superset

  • Search: Elasticsearch / Amazon Elasticsearch Service / Algolia

  • Monitoring: New Relic

As our usage grew, patterns changed, and our business needs evolved, my role as Engineering Manager and then Director of Engineering was also to ensure my team kept on learning and innovating while delivering business value.

One of these innovations was to get ourselves into Serverless: adopting AWS Lambda was a big step forward. At the time, it was only available for Node.js (not Ruby), but it was a great way to handle cost efficiency, unpredictable traffic, sudden bursts of traffic... Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.

Frédéric MARAND
Core Developer at OSInet · 2 upvotes · 88.6K views
RabbitMQ, Beanstalkd, Kafka

I used Kafka originally because it was mandated as part of the top-level IT requirements at a Fortune 500 client. What I found was that it was orders of magnitude more complex ...and powerful than my daily Beanstalkd, and far more flexible, resilient, and manageable than RabbitMQ.

So for any case where utmost flexibility and resilience are part of the deal, I would use Kafka again. But due to the complexities involved, for any time where this level of scalability is not required, I would probably just use Beanstalkd for its simplicity.

I tend to find RabbitMQ to be in an uncomfortable middle place between these two extremes.

Ajit Parthan
CTO at Shaw Academy · 1 upvote · 5K views
MongoDB, MySQL

Initial storage was traditional MySQL. The pace of change in startup mode made it very difficult to keep a clean and consistent schema. Large portions ended up as unstructured data stuffed into CLOBs and BLOBs.

Moving to MongoDB definitely made this part much easier.

Accessing data for analysis is a little bit of a challenge - especially for people coming from the world of SQL Workbench. But with tools like Exploratory this is becoming less of a problem.

#NosqlDatabaseAsAService

Alex A
Founder at PRIZ Guru · 6 upvotes · 8.4K views
PostgreSQL, MySQL

One of our battles at the very beginning of the road was choosing the right database. In fact, our first prototype was built on MySQL, and back then nothing else was even under consideration (don't ask me why). At some point, I was working on a project that ran on PostgreSQL, and only then did I understand its full power. We have over a billion records in our production instance, and we are able to optimize it to run fast and reliably. Well, now my default DB is PostgreSQL :)

Tor Hagemann
at Socotra · 2 upvotes · 2.2K views
Amazon DynamoDB, PostgreSQL, MySQL

Much of our data model is relational, which makes MySQL or PostgreSQL (and family) a good fit for the APIs we need to build in order to meet the needs of our customers.

Sometimes the flexibility of a NoSQL store like Amazon DynamoDB is very useful, but the lack of consistency really impacts usability and performance long-term, compared with viable alternatives. At our current scale, we've seen huge benefits from moving some of our tables out of Dynamo and doing more in SQL.

There will always be use cases for NoSQL and key-value stores, but if your model is well understood in your business/industry, relational databases are the way to go after finding product-market fit. Always understand the trade-offs (and a few intimate details) of any data store before you add it to your company's stack!

Joseph Irving
DevOps Engineer at uSwitch · 8 upvotes · 6.6K views
Go, PostgreSQL, MySQL, Kubernetes, Vault

At uSwitch we use Vault to generate short-lived database credentials for our applications running in Kubernetes. We wanted to move from an environment where we had 100 databases with a variety of static passwords being shared around to one where each pod would have credentials that last only for its lifetime.

We chose Vault because:

  • It had built-in Kubernetes support, so we could use service accounts to control which pods could access which database.

  • It had a Terraform provider, so we could configure both our RDS instances and their Vault configuration in one place.

  • It supported a variety of database providers, including MySQL and PostgreSQL (our most common databases).

  • It had a good API and Go SDK, so we could build tooling around it to simplify our development workflow.

  • It had other features we would use, such as PKI.
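
The idea of per-pod credentials that expire on their own can be illustrated with a toy lease model. This is not Vault's API: `CredentialIssuer`, its methods, the pod name, and the TTL value are all hypothetical, and in a real setup Vault would create and revoke the actual database users.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    username: str
    password: str
    expires_at: float


class CredentialIssuer:
    """Toy issuer of short-lived, per-pod database credentials."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the example can simulate time passing
        self.leases: dict[str, Lease] = {}

    def issue(self, pod_name: str) -> Lease:
        # Each pod gets its own random username and password instead of a shared static one.
        lease = Lease(
            username=f"{pod_name}-{secrets.token_hex(4)}",
            password=secrets.token_urlsafe(16),
            expires_at=self.clock() + self.ttl_seconds,
        )
        self.leases[lease.username] = lease
        return lease

    def is_valid(self, username: str, password: str) -> bool:
        lease = self.leases.get(username)
        if lease is None or self.clock() >= lease.expires_at:
            self.leases.pop(username, None)  # forget unknown or expired leases
            return False
        return secrets.compare_digest(lease.password, password)


now = [0.0]  # simulated clock, in seconds
issuer = CredentialIssuer(ttl_seconds=60, clock=lambda: now[0])
lease = issuer.issue("checkout-pod")
print(issuer.is_valid(lease.username, lease.password))  # checked within the TTL
now[0] = 61.0
print(issuer.is_valid(lease.username, lease.password))  # checked after the TTL
```

The point of the model is that a leaked password stops working once its lease expires, which is what removes the need to share and rotate static passwords by hand.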

MongoDB, MySQL, .NET Core, C#

Hi! I need to choose a full stack of tools for a web drop shipping site, without the payment process, for a family startup project. It will be fed from several web services (JSON), and I'm expecting 4,200 articles at most. It is for web use only and for a few clients at the beginning.

I'm considering C# with .NET Core 3.0, as it is the one language I'm starting to learn. For the database I haven't made up my mind yet, but it could be MySQL or MongoDB. Any advice is welcome, as I'm getting back to programming after a year away from this awesome world. Thanks

How developers use Kafka and MySQL
Pinterest uses Kafka

http://media.tumblr.com/d319bd2624d20c8a81f77127d3c878d0/tumblr_inline_nanyv6GCKl1s1gqll.png

Front-end messages are logged to Kafka by our API and application servers. We have batch processing (on the middle-left) and real-time processing (on the middle-right) pipelines to process the experiment data. For batch processing, after the daily raw logs get to S3, we start our nightly experiment workflow to figure out experiment user groups and experiment metrics. We use our in-house workflow management system, Pinball, to manage the dependencies of all these MapReduce jobs.

Rajeshkumar T uses MySQL
  • We used a MySQL database to build an online food ordering system.

    • It supports normalization and joins well (restaurant details and ordering, customer management, food menu, order transactions, and food delivery).
    • It is good for performance and for structuring the data.
    • It helps store the instant updates received from the food delivery app (updating the real-time driver GPS location).
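
As a rough illustration of the normalization and joins mentioned above, the sketch below builds a tiny ordering schema and joins it back together. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without a MySQL server; the table names, column names, and sample rows are hypothetical, and MySQL's dialect differs in details such as data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customers   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        restaurant_id INTEGER NOT NULL REFERENCES restaurants(id),
        customer_id   INTEGER NOT NULL REFERENCES customers(id),
        total_cents   INTEGER NOT NULL
    );
    INSERT INTO restaurants VALUES (1, 'Noodle Bar'), (2, 'Pizza Place');
    INSERT INTO customers   VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 1, 1, 1250), (2, 2, 1, 900), (3, 1, 2, 700);
    """
)

# Each fact is stored once; the join reassembles it per order.
rows = conn.execute(
    """
    SELECT c.name, r.name, o.total_cents
    FROM orders AS o
    JOIN customers   AS c ON c.id = o.customer_id
    JOIN restaurants AS r ON r.id = o.restaurant_id
    ORDER BY o.id
    """
).fetchall()

for row in rows:
    print(row)
```

Renaming a restaurant here means updating one row in `restaurants`, and every order picks up the change through the join, which is the practical payoff of keeping the schema normalized.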
Srinivas Adireddi uses MySQL

1. It's very popular. I heard about it in database class.
2. The most comprehensive set of advanced features, management tools and technical support to achieve the highest levels of MySQL scalability, security, reliability, and uptime.
3. MySQL is an open-source relational database management system. Its name is a combination of "My", the name of co-founder Michael Widenius's daughter, and "SQL", the abbreviation for Structured Query Language.

ShadowICT uses MySQL

We use MySQL and variants thereof to store the data for our projects such as the community. MySQL being a well established product means that support is available whenever it is required along with an extensive list of support articles all over the web for diagnosing issues. Variants are also used where needed when, for example, better performance is needed.

shridhardalavi uses MySQL

MySQL is a freely available open source Relational Database Management System (RDBMS) that uses Structured Query Language (SQL). SQL is the most popular language for adding, accessing and managing content in a database. It is most noted for its quick processing, proven reliability, ease and flexibility of use.

John Galbraith uses MySQL

I am not using this DB for blog posts or data stored on the site. I am using it to track IP addresses and fully qualified domain names of attacker machines that either posted spam on my website, ping flooded me, or had more than a certain number of failed SSH attempts.

Coolfront Technologies uses Kafka

Building out real-time streaming server to present data insights to Coolfront Mobile customers and internal sales and marketing teams.

ShareThis uses Kafka

We are using Kafka as a message queue to process our widget logs.

Christopher Davison uses Kafka

Used for communications and triggering jobs across ETL systems

theskyinflames uses Kafka

Used as an integration middleware for exchanging messages.
