
Alternatives to Debezium

Kafka, Slick, Sequel Pro, PostGIS, and Spring Data are the most popular alternatives and competitors to Debezium.

What is Debezium and what are its top alternatives?

Debezium is an open source distributed platform for change data capture. Start it up, point it at your databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps commit to your databases. It is durable and fast, so your apps can respond quickly and never miss an event, even when things go wrong.
Debezium is a tool in the Database Tools category of a tech stack.
Debezium is an open source tool with 2.8K GitHub stars and 717 GitHub forks.
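
To make "responding to inserts, updates, and deletes" concrete, here is a minimal Python sketch of how an app might interpret a single change event. It assumes the event has already been read from wherever it was delivered and is available as a JSON string holding only the event envelope (in some configurations the envelope sits under a "payload" key instead). The `describe_change` function, the sample events, and the `customers` table are hypothetical and only for illustration; the `op`, `before`, `after`, and `source` fields follow the change event envelope Debezium documents.

```python
import json

# Op codes used in the change event envelope.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def describe_change(raw_event: str) -> str:
    """Summarize one change event given as a JSON string (envelope only)."""
    event = json.loads(raw_event)
    op = OP_NAMES.get(event.get("op"), "unknown")
    # Deletes carry the row in "before"; inserts and updates carry it in "after".
    row = event.get("after") or event.get("before") or {}
    table = event.get("source", {}).get("table", "?")
    return f"{op} on {table}: {row}"


# Hypothetical events for a made-up "customers" table.
sample_events = [
    '{"op": "c", "before": null, "after": {"id": 1, "name": "Ann"},'
    ' "source": {"table": "customers"}}',
    '{"op": "d", "before": {"id": 1, "name": "Ann"}, "after": null,'
    ' "source": {"table": "customers"}}',
]

for raw in sample_events:
    print(describe_change(raw))
```

In a real deployment the events would typically arrive through a message broker rather than a hard-coded list, and the handler would do something more useful than build a string, such as updating a cache or a search index.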

Debezium alternatives & related posts


Kafka

5K
4.5K
492
Distributed, fault tolerant, high throughput pub-sub messaging system

Related Kafka posts

Eric Colson
Chief Algorithms Officer at Stitch Fix · 19 upvotes · 622.1K views
Kafka, PostgreSQL, Amazon S3, Apache Spark, Presto, Python, R Language, PyTorch, Docker, Amazon EC2 Container Service
#AWS #Etl #ML #DataScience #DataStack #Data

The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3-based data warehouse. Apache Spark on YARN is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling YARN clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.

Beyond data movement and ETL, most #ML-centric jobs (e.g. model training and execution) run in a similarly elastic environment, as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in-house and open sourced (see https://github.com/stitchfix/flotilla-os).

At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into our systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan gives our data scientists the ability to quickly productionize the models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn) by automatically packaging them as Docker containers and deploying them to Amazon ECS. This gives our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.

For more info:

#DataScience #DataStack #Data

John Kodumal
CTO at LaunchDarkly · 16 upvotes · 378.1K views
Amazon RDS, PostgreSQL, TimescaleDB, Patroni, Consul, Amazon ElastiCache, Amazon EC2, Redis, Amazon Kinesis, Kafka

As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.

We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, and we are shifting to Amazon Kinesis instead of Kafka.


Slick

7.9K
80
0
Database query and access library for Scala

Sequel Pro

247
171
63
MySQL database management for Mac OS X

PostGIS

197
153
28
Open source spatial database

Spring Data

142
94
0
Provides a consistent approach to data access – relational, non-relational, map-reduce, and beyond

Open PostgreSQL Monitoring

139
110
0
Oversee and Manage Your PostgreSQL Servers

Microsoft SQL Server Management Studio

137
87
0
An integrated environment for managing any SQL infrastructure

DataGrip

136
76
0
A database IDE for professional SQL developers