Apache Drill: 72 stacks · 171 followers · 16 votes

Google BigQuery: 1.7K stacks · 1.5K followers · 152 votes

Apache Drill vs Google BigQuery: What are the differences?

Introduction

Apache Drill and Google BigQuery are both powerful data analysis tools that provide developers with the ability to query and analyze large datasets. While they have similar goals, there are several key differences between Apache Drill and Google BigQuery that make each unique.

Flexibility and Data Source Support: Apache Drill supports a wider range of data sources than Google BigQuery and can query them in place. It reads structured and semi-structured data in formats such as JSON, Parquet, and Avro without requiring a schema to be defined first. Google BigQuery, by contrast, works primarily on data loaded into its own managed storage, although it can also query external data held in sources such as Google Cloud Storage or Google Drive.
Cost Structure: The cost structure of Apache Drill and Google BigQuery differs significantly. Apache Drill is an open-source project that can be downloaded, installed, and used without license fees, although you still pay for the hardware and operations needed to run it. In contrast, Google BigQuery is part of Google Cloud Platform and has a usage-based pricing model: users are charged for the amount of data their queries process (or for reserved capacity) and for the storage they use.
Scalability: While both Apache Drill and Google BigQuery can handle large volumes of data, the underlying architecture and scalability options differ. Apache Drill scales horizontally by running its own daemons (Drillbits) across a cluster, optionally alongside Apache Hadoop, and processes data in parallel on those nodes; you provision and manage that cluster yourself. Google BigQuery, on the other hand, is a fully managed service that automatically scales to handle massive datasets without requiring manual configuration or infrastructure management.
Query Language Support: Apache Drill supports ANSI SQL queries, making it easy for developers familiar with SQL to interact with the data, and extends SQL to work with complex nested data structures. Google BigQuery also uses SQL: its current dialect, GoogleSQL, is ANSI-compliant and adds extensions such as syntax and functions for nested and repeated fields, and an older legacy SQL dialect is still available.
Integration with Ecosystem: Apache Drill integrates well with the Apache Hadoop ecosystem and can leverage other tools such as Apache Hive, Apache HBase, and more. This allows developers to easily combine the capabilities of these tools with Apache Drill for efficient data analysis. Google BigQuery, on the other hand, is tightly integrated with other Google Cloud Platform services, providing seamless integration with storage, compute, and analytics services offered by Google.
Performance Optimization: Apache Drill provides developers with fine-grained control over query execution and optimization, allowing them to tune performance according to their specific requirements. Google BigQuery, being a fully managed service, automatically optimizes query execution behind the scenes. While this may simplify query optimization for users, it limits the level of control developers have over the performance tuning process.
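
To make the points on data source support and query language more concrete, here is a minimal Python sketch of submitting a SQL query over a raw JSON file to Apache Drill through its REST endpoint (`/query.json`). It assumes a Drillbit running locally with its web port at the default 8047 and the `dfs` storage plugin enabled; the file path and the field names (`name`, `address.city`) are hypothetical and serve only as an illustration.

```python
import json
import urllib.request

# Assumption: a Drillbit runs locally and exposes its REST API on the default port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical file path and field names, for illustration only.
# Drill reads nested fields with dotted paths on a table alias (here: t).
SQL = (
    "SELECT t.name, t.address.city AS city "
    "FROM dfs.`/tmp/customers.json` t "
    "LIMIT 10"
)


def build_drill_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql, url=DRILL_URL):
    """Send the query to a running Drillbit and return the result rows."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as response:
        return json.load(response)["rows"]


if __name__ == "__main__":
    # Only builds and prints the request, so this runs without a Drill cluster.
    request = build_drill_request(SQL)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Calling `run_drill_query(SQL)` against a live Drillbit would return the matching rows; no table definition or load step is needed beforehand, which is the in-place querying described above.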

In summary, Apache Drill provides more flexibility in terms of data source support, offers a cost advantage as an open-source project, and has better integration with the Apache Hadoop ecosystem. On the other hand, Google BigQuery is tightly integrated with Google Cloud Platform services, automatically scales to handle large datasets, and offers a simplified query optimization process.

Decisions about Apache Drill and Google BigQuery

Julien Lafont

CTO at TabMo · Sep 19, 2020 | 4 upvotes · 186.6K views

The cloud data warehouse is the centerpiece of a modern data platform. Choosing the most suitable solution is therefore fundamental.

Our benchmark was conducted over BigQuery and Snowflake. These solutions seem to match our goals but they have very different approaches.

BigQuery is notably the only 100% serverless cloud data warehouse, requiring absolutely NO maintenance: no re-clustering, no compression, no index optimization, no storage management, no performance management. Snowflake requires setting up (paid) reclustering processes, managing the performance allocated to each profile, and so on. We can also mention Redshift, which we eliminated because it requires even more operational work.

BigQuery can therefore be set up at almost zero cost in human resources. Its on-demand pricing is particularly well suited to small workloads: there is no cost when the solution is not used, and you pay only for the queries you run. As usage grows, however, switching to slots (with a monthly or per-minute commitment) drastically reduces the cost. We cut the cost of our nightly batches by a factor of 10 by using flex slots.

Finally, a major advantage of BigQuery is its almost perfect integration with Google Cloud Platform services: Cloud Functions, Dataflow, Data Studio, etc.

BigQuery is still evolving very quickly. The next milestone, BigQuery Omni, will allow running queries over data stored on an external cloud platform (Amazon S3, for example). It will be a major breakthrough in the history of cloud data warehouses. Omni will compensate for a weakness of BigQuery: transferring data in near real time from S3 to BQ is not easy today. That was simpler to implement with Snowflake's Snowpipe solution.

We also plan to use the Machine Learning features built into BigQuery to accelerate our deployment of Data-Science-based projects. This is an opportunity offered only by BigQuery.

Pros of Apache Drill

NoSQL and Hadoop (4 votes)
Free (3 votes)
Lightning speed and simplicity in face of data jungle (3 votes)
Well documented for fast install (2 votes)
SQL interface to multiple datasources (1 vote)
Nested Data support (1 vote)
Read Structured and unstructured data (1 vote)
V1.10 released - https://drill.apache.org/ (1 vote)

Pros of Google BigQuery

High Performance (28 votes)
Easy to use (25 votes)
Fully managed service (22 votes)
Cheap Pricing (19 votes)
Process hundreds of GB in seconds (16 votes)
Big Data (12 votes)
Full table scans in seconds, no indexes needed (11 votes)
Always on, no per-hour costs (8 votes)
Good combination with fluentd (6 votes)
Machine learning (4 votes)
Easy to manage (1 vote)
Easy to learn (0 votes)

Cons of Apache Drill

No cons listed.

Cons of Google BigQuery

You can't unit test changes in BQ data (1 vote)

What is Apache Drill?

Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel.

What is Google BigQuery?

Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease: bulk load your data using Google Cloud Storage or stream it in. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries for languages such as Java, PHP or Python.
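
Because BigQuery's on-demand pricing is based on the data a query processes, a dry run through the REST API is a common way to see how many bytes a query would scan before paying for it. The sketch below builds the request body for the `jobs.query` method with `dryRun` set and turns the `totalBytesProcessed` field of a response into a rough cost estimate. Authentication and the HTTP call itself are left out, and the price constant is a placeholder assumption, not a quoted rate; check the current pricing page before relying on it.

```python
# Assumption: placeholder on-demand price in USD per TiB scanned.
# This is NOT an authoritative rate; look up the current price for your region.
ASSUMED_USD_PER_TIB = 6.25

TIB = 2 ** 40  # bytes in one tebibyte


def build_dry_run_body(sql):
    """Request body for the jobs.query REST method, asking only for a dry run."""
    return {"query": sql, "useLegacySql": False, "dryRun": True}


def estimate_on_demand_cost(response, usd_per_tib=ASSUMED_USD_PER_TIB):
    """Estimate the on-demand cost from a dry-run response.

    The REST API returns totalBytesProcessed as a string-encoded integer,
    so it is converted with int() before doing arithmetic.
    """
    bytes_processed = int(response["totalBytesProcessed"])
    return bytes_processed / TIB * usd_per_tib


if __name__ == "__main__":
    # Hypothetical table name and a hand-written response, for illustration only.
    body = build_dry_run_body("SELECT user_id FROM my_dataset.events")
    fake_response = {"totalBytesProcessed": str(3 * TIB)}
    print(body)
    print(f"Estimated cost: ${estimate_on_demand_cost(fake_response):.2f}")
```

The same estimate does not apply to capacity-based (slot) pricing, where cost depends on the reserved compute rather than on bytes scanned.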

What tools integrate with Apache Drill?

What tools integrate with Google BigQuery?

DbSchema

Blog Posts

Cultivating your Data Lake · Segment · Aug 28 2019 at 3:10AM

The Growth Stacks of 2019 · Segment · Jul 2 2019 at 9:34PM

Dubsmash: Scaling To 200 Million Users With 3 Engineers · Dubsmash · Dec 14 2017 at 10:02AM

How Sentry Receives 20 Billion Events Per Month While Preparin... · Sentry · Nov 8 2017 at 5:09PM

The Stack That Helped Opendoor Buy and Sell Over $1B in Homes · Opendoor · Mar 9 2017 at 8:02AM

How imgix Built A Stack To Serve 100,000 Images Per Second · imgix · Aug 28 2015 at 9:58AM

What are some alternatives to Apache Drill and Google BigQuery?

Presto

Distributed SQL Query Engine for Big Data

Apache Spark

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Apache Calcite

Apache Calcite is an open source framework for building databases and data management systems. It includes a SQL parser, an API for building expressions in relational algebra, and a query planning engine.

Apache Impala

Impala is a modern, open source, MPP SQL query engine for Apache Hadoop. Impala is shipped by Cloudera, MapR, and Amazon. With Impala, you can query data, whether stored in HDFS or Apache HBase – including SELECT, JOIN, and aggregate functions – in real time.

Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
