Google BigQuery vs Apache Spark: What are the differences?

Developers describe Google BigQuery as "Analyze terabytes of data in seconds". Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease: bulk load your data using Google Cloud Storage or stream it in. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries for languages such as Java, PHP or Python. On the other hand, Apache Spark is detailed as "Fast and general engine for large-scale data processing". Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Google BigQuery and Apache Spark are primarily classified as "Big Data as a Service" and "Big Data" tools respectively.

Some of the features offered by Google BigQuery are:

  • All behind the scenes: Your queries can execute asynchronously in the background, and can be polled for status.
  • Import data with ease: Bulk load your data using Google Cloud Storage or stream it in bursts of up to 1,000 rows per second.
  • Affordable big data: The first terabyte of data processed each month is free.
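
The "execute asynchronously and poll for status" pattern in the first feature can be sketched in plain Python. This is only an illustration of the pattern: `FakeQueryJob`, `wait_for_job`, and the `"RUNNING"`/`"DONE"` strings are hypothetical stand-ins, not part of any BigQuery client library.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeQueryJob:
    """Hypothetical stand-in for a background query job (not a BigQuery class)."""

    polls_until_done: int
    polls_seen: int = 0

    def status(self) -> str:
        # Each call simulates one status request; the job reports DONE
        # once it has been polled `polls_until_done` times.
        self.polls_seen += 1
        return "DONE" if self.polls_seen >= self.polls_until_done else "RUNNING"


def wait_for_job(job, poll_interval=0.01, max_polls=100):
    """Poll job.status() until it reports DONE; return the number of polls used."""
    for attempt in range(1, max_polls + 1):
        if job.status() == "DONE":
            return attempt
        time.sleep(poll_interval)
    raise TimeoutError(f"job still running after {max_polls} polls")


job = FakeQueryJob(polls_until_done=3)
print(wait_for_job(job))
```

A real client would replace `FakeQueryJob.status()` with a request to the service; the loop, the sleep between polls, and the upper bound on attempts carry over unchanged.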

On the other hand, Apache Spark provides the following key features:

  • Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk
  • Write applications quickly in Java, Scala or Python
  • Combine SQL, streaming, and complex analytics

"High Performance" is the top reason why over 17 developers like Google BigQuery, while over 45 developers mention "Open-source" as the leading cause for choosing Apache Spark.

Apache Spark is an open source tool with 22.3K GitHub stars and 19.3K GitHub forks.

According to the StackShare community, Apache Spark has broader approval, being mentioned in 262 company stacks and 111 developer stacks, compared to Google BigQuery, which is listed in 156 company stacks and 39 developer stacks.

What is Google BigQuery?

Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease. Bulk load your data using Google Cloud Storage or stream it in. Easy access. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries such as Java, PHP or Python.
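
As a rough illustration of the REST access path mentioned above, the sketch below builds (but does not send) a request for the BigQuery v2 `jobs.query` method using only Python's standard library. The project ID, the access token, and the `build_query_request` helper are placeholders assumed for this example; a real call needs a valid OAuth 2.0 token, and most users would reach for an official client library instead.

```python
import json
import urllib.request

# Assumed placeholders: substitute a real project ID and OAuth 2.0 access token.
PROJECT_ID = "my-project"
ACCESS_TOKEN = "placeholder-token"


def build_query_request(project_id, sql, token):
    """Build, without sending, a POST request for BigQuery's jobs.query method."""
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/queries"
    # useLegacySql=False asks for standard SQL rather than the legacy dialect.
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_query_request(PROJECT_ID, "SELECT 1 AS x", ACCESS_TOKEN)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; with the placeholder token above, the service would reject it as unauthenticated.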

What is Apache Spark?

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
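
To make the "batch processing (similar to MapReduce)" comparison concrete, here is a minimal word count in the map-then-reduce style, written in plain Python. It is not Spark code and uses none of Spark's APIs; the function names and sample lines are invented for illustration, and everything runs on one machine rather than across a cluster.

```python
from collections import Counter
from functools import reduce


def map_phase(line):
    # Map step: turn one input line into partial word counts.
    return Counter(line.lower().split())


def reduce_phase(left, right):
    # Reduce step: merge two partial counts into one.
    return left + right


def word_count(lines):
    """Apply the map step to every line, then fold the partial results together."""
    return reduce(reduce_phase, map(map_phase, lines), Counter())


lines = ["spark is fast", "spark is general", "hadoop is batch"]
counts = word_count(lines)
print(counts["spark"], counts["is"])
```

In a distributed engine the map step runs in parallel over partitions of the data and the reduce step merges results across machines; the shape of the computation is the same.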

What are some alternatives to Google BigQuery and Apache Spark?
Google Cloud Bigtable
Google Cloud Bigtable offers you a fast, fully managed, massively scalable NoSQL database service that's ideal for web, mobile, and Internet of Things applications requiring terabytes to petabytes of data. Unlike comparable market offerings, Cloud Bigtable doesn't require you to sacrifice speed, scale, or cost efficiency when your applications grow. Cloud Bigtable has been battle-tested at Google for more than 10 years—it's the database driving major applications such as Google Analytics and Gmail.
Amazon Redshift
Amazon Redshift is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
Hadoop
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
Snowflake
Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn.
Google Analytics
Google Analytics lets you measure your advertising ROI as well as track your Flash, video, and social networking sites and applications.
How developers use Google BigQuery and Apache Spark
ShareThis uses Google BigQuery

BigQuery allows our team to pull reports quickly using SQL-like queries against our large store of data about social sharing. We use the information throughout the company, to do everything from making internal product decisions based on usage patterns to sharing certain kinds of custom reports with our publishers.

Lyndon Wong uses Google BigQuery

Aggregation of user events and traits across a marketing website, SaaS web application, user account provisioning backend and Salesforce CRM. Enables full-funnel analysis of campaign ROI, customer acquisition, engagement and retention at both the user and target account level.

Wei Chen uses Apache Spark

Spark is good at managing parallel data processing. We wrote a neat program to handle the terabytes of data we get every day.

Blue Shell Games uses Google BigQuery

Google's insanely fast, feature-rich, zero-maintenance column store. Used for real-time customer data queries.

Ralic Lo uses Apache Spark

Used the Spark DataFrame API on SparkR for big data analysis.

Kalibrr uses Apache Spark

We use Apache Spark in computing our recommendations.

Dotmetrics uses Apache Spark

Big data analytics and nightly transformation jobs.

brenoinojosa uses Apache Spark

Data retrieval and analysis of Cassandra.
