Google Cloud Dataflow
By Google Cloud Platform

#15 in Background Jobs
Discussions: 4
Followers: 497

What is Google Cloud Dataflow?

Google Cloud Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Cloud Dataflow frees you from operational tasks like resource management and performance optimization.

Google Cloud Dataflow is a tool in the Background Jobs category of a tech stack.

Key Features

  • Fully managed
  • Combines batch and streaming with a single API
  • High performance with automatic workload rebalancing
  • Open source SDK
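
The "single API" feature means a transform is written once and applied to bounded (batch) and unbounded (streaming) inputs alike. The sketch below illustrates that idea in plain Python with generators. It is not the Dataflow or Apache Beam SDK, and `count_words` and `endless_lines` are hypothetical names chosen for the illustration.

```python
import itertools
from typing import Dict, Iterable, Iterator, Tuple


def count_words(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield a running (word, count) pair for every word as it arrives."""
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.lower().split():
            counts[word] = counts.get(word, 0) + 1
            yield word, counts[word]


# Bounded (batch) input: the source ends, so the final counts can be collected.
batch_input = ["to be or not to be"]
batch_result = dict(count_words(batch_input))


def endless_lines() -> Iterator[str]:
    """Hypothetical unbounded source that never stops producing lines."""
    while True:
        yield "to be"


# Unbounded (streaming) input: the same transform, consumed incrementally.
stream = count_words(endless_lines())
first_four = list(itertools.islice(stream, 4))

print(batch_result)
print(first_four)
```

A real pipeline would express the same logic with the SDK's own transforms and let the managed service handle scaling. The point here is only that the transform does not need to know whether its input ever ends.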

Google Cloud Dataflow Pros & Cons

Pros of Google Cloud Dataflow

  • ✓ Unified batch and stream processing
  • ✓ Autoscaling
  • ✓ Fully managed
  • ✓ Throughput transparency

Cons of Google Cloud Dataflow

No cons listed yet.

Google Cloud Dataflow Alternatives & Comparisons

What are some alternatives to Google Cloud Dataflow?

Amazon Kinesis

Amazon Kinesis can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.

Amazon Kinesis Firehose

Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture and automatically load streaming data into Amazon S3 and Amazon Redshift, enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today.

Twister2

Twister2 is a high-performance data processing framework that handles both streaming and batch data. It can run on high-performance clusters as well as cloud services to process data efficiently.

Google Cloud Dataflow Integrations

Google Cloud Healthcare API, Google AutoML Tables, Google AI Platform, Aviatrix, Cloud AI Platform Pipelines and 2 more are some of the popular tools that integrate with Google Cloud Dataflow. Here's a list of all 7 tools that integrate with Google Cloud Dataflow.

  • Google Cloud Healthcare API
  • Google AutoML Tables
  • Google AI Platform
  • Aviatrix
  • Cloud AI Platform Pipelines
  • Sematic
  • WhyLabs

Google Cloud Dataflow Discussions

Discover why developers choose Google Cloud Dataflow. Read real-world technical decisions and stack choices from the StackShare community.

Andrea Latorre
Jan 2, 2023

Needs advice on Google Cloud Data Fusion, Google BigQuery, Google Cloud Dataflow

I am currently launching 50 pipelines in a Google Cloud Data Fusion version 6.4 instance. These pipelines are launched daily and transport data from a MySQLServer database to Google BigQuery. The cost is becoming very high and I was wondering if the costs with Google Cloud Dataflow decrease for the same rows transported.
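
One rough way to approach this comparison is to estimate what the 50 daily runs would cost under resource-based billing, where a job is charged for the vCPU and memory its workers consume while running. The sketch below assumes that model. `JobProfile`, `estimate_daily_cost`, the worker sizes, and both rates are hypothetical placeholders, not published Google Cloud prices, and the estimate ignores other line items such as disk and shuffled data.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical resource profile of a single pipeline run."""
    workers: int
    vcpus_per_worker: int
    memory_gb_per_worker: float
    hours_per_run: float


def estimate_daily_cost(job: JobProfile, runs_per_day: int,
                        vcpu_hour_rate: float, gb_hour_rate: float) -> float:
    """Estimate daily cost when billing is per vCPU-hour and per GB-hour."""
    vcpu_hours = job.workers * job.vcpus_per_worker * job.hours_per_run
    gb_hours = job.workers * job.memory_gb_per_worker * job.hours_per_run
    cost_per_run = vcpu_hours * vcpu_hour_rate + gb_hours * gb_hour_rate
    return cost_per_run * runs_per_day


# Placeholder numbers only: replace them with measured run times and the
# current price list for the region in use.
profile = JobProfile(workers=2, vcpus_per_worker=4,
                     memory_gb_per_worker=16, hours_per_run=0.25)
daily = estimate_daily_cost(profile, runs_per_day=50,
                            vcpu_hour_rate=0.06, gb_hour_rate=0.004)
print(f"Estimated daily cost: {daily:.2f}")
```

Running the same calculation with measured figures and current prices for each service gives a like-for-like daily number to compare.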

Vishal Yadav
Dec 26, 2022

Needs advice on AWS Glue, Google Cloud Dataflow, Google Cloud Data Fusion

Will Dataflow be the right replacement for AWS Glue? Are there any unforeseen exceptions, such as certain proprietary transformations not being supported in Google Cloud Dataflow, gaps in the connector ecosystem, or data quality and data cleansing not being supported in Dataflow?

Also, how about Google Cloud Data Fusion as a replacement, in terms of no-code/low-code? Since basic use cases in Glue are supported through the UI, CDF may be the right choice.

What would be the best choice?

Sung Won Chung
Jun 5, 2019

Needs advice on Google Cloud Dataflow, Java

I use Google Cloud Dataflow because it has great templates for plug-and-play use.

I haven't invested in the Apache Beam framework because you need to know Java to take full advantage of it. The Python API is a second-class citizen.

Nick Rockwell
SVP, Engineering at The New York Times
Sep 24, 2018

Needs advice on Google BigQuery, Google Cloud Pub/Sub, Google Cloud Dataflow

We really drank the Google Kool-Aid on analytics. So, everything's going into Google BigQuery, and almost everything is going straight into Google Cloud Pub/Sub and then getting some processing in Google Cloud Dataflow before ending up in BigQuery. We still do too much processing and augmentation on the front end before it goes into Pub/Sub. And that's using some kind of stuff we pulled together using Amazon DynamoDB and so on. And it's very brittle, actually. Dynamo throttling is one of our biggest headaches. So, I want all of that to go away, do all our augmentation in BigQuery after the data's been collected, and have it just go straight into Pub/Sub. So, we're working on that. And it'll happen, sometime. #Analytics #AnalyticsPipeline


Adoption on StackShare

Companies: 74
Developers: 150