
Amazon Athena vs Druid: What are the differences?

What is Amazon Athena? Query S3 Using SQL. Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

What is Druid? Fast column-oriented distributed data store. Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
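To make the "fast aggregate queries" and "approximate algorithms" description concrete, here is a minimal sketch of an hourly aggregation expressed in Druid SQL and wrapped as the JSON body that Druid's SQL HTTP endpoint (`/druid/v2/sql`) accepts. The datasource name `clickstream`, the column `user_id`, and the helper `build_druid_sql_payload` are hypothetical and chosen only for illustration; the snippet builds the request body and does not contact a cluster.

```python
import json


def build_druid_sql_payload(datasource, user_column, interval_days):
    """Build the JSON body for a Druid SQL request (hypothetical helper)."""
    sql = (
        # Bucket rows by hour using Druid's built-in __time column.
        "SELECT TIME_FLOOR(__time, 'PT1H') AS hour_bucket, "
        # Exact row count per bucket.
        "COUNT(*) AS events, "
        # Approximate distinct count, one of Druid's approximate aggregators.
        f"APPROX_COUNT_DISTINCT({user_column}) AS approx_users "
        f'FROM "{datasource}" '
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{interval_days}' DAY "
        "GROUP BY 1 ORDER BY 1"
    )
    return json.dumps({"query": sql})


# Assumed names: a "clickstream" datasource with a "user_id" column.
payload = build_druid_sql_payload("clickstream", "user_id", 1)
print(payload)
```

The resulting string would be sent as the body of a POST request with `Content-Type: application/json`; the host, port, and any authentication depend on the deployment and are not shown here.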

Amazon Athena and Druid can be primarily classified as "Big Data" tools.

"Use SQL to analyze CSV files" is the top reason cited by more than 9 developers for liking Amazon Athena, while more than 3 developers cite "Real Time Aggregations" as their main reason for choosing Druid.

Druid is an open source tool with 8.31K GitHub stars and 2.08K GitHub forks.

According to the StackShare community, Amazon Athena has broader approval: it is mentioned in 50 company stacks and 18 developer stacks, compared to Druid, which is listed in 24 company stacks and 12 developer stacks.

Advice on Amazon Athena and Druid

Hi all,

Currently, we need to ingest data from Amazon S3 into a database, either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is over 200 GB in size. Query performance in Athena/Redshift is not up to the mark: queries time out and are too slow compared to Google BigQuery. How can I optimize performance and query result time? Can anyone please help me out?
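For readers unfamiliar with the setup described in this question, pipe-separated files in S3 are usually exposed to Athena through an external table whose field delimiter is `|`. The sketch below only generates such a DDL statement as a string; the table name, column list, bucket path, and the helper `build_psv_table_ddl` are hypothetical, and it illustrates the starting point of the question rather than a tuned solution.

```python
def build_psv_table_ddl(table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for pipe-delimited text files.

    Hypothetical helper: `columns` is a list of (name, hive_type) pairs.
    """
    column_defs = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_defs}\n"
        ")\n"
        # The pipe character is the field separator in .PSV files.
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'"
    )


# Assumed example values; replace with the real schema and bucket.
ddl = build_psv_table_ddl(
    "events",
    [("event_id", "string"), ("amount", "double")],
    "s3://example-bucket/events/",
)
print(ddl)
```

Because Athena bills per query based on the data it reads, a plain-text table like this one is scanned in full for most queries, which is one reason large delimited datasets tend to feel slow.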

    What are some alternatives to Amazon Athena and Druid?
    Presto
    Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.
    Amazon Redshift Spectrum
    With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond data stored on local disks in your data warehouse to query vast amounts of unstructured data in your Amazon S3 “data lake” -- without having to load or transform any data.
    Amazon Redshift
    It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
    Cassandra
    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
    Spectrum
    The community platform for the future.