Amazon EMR vs Amazon Redshift: What are the differences?

Developers describe Amazon EMR as "Distribute your data and processing across Amazon EC2 instances using Hadoop". Amazon EMR is used in a variety of applications, including log analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics. Customers launch millions of Amazon EMR clusters every year. Amazon Redshift, on the other hand, is described as a "Fast, fully managed, petabyte-scale data warehouse service". Redshift makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.

Amazon EMR and Amazon Redshift both belong to the "Big Data as a Service" category of the tech stack.

Some of the features offered by Amazon EMR are:

  • Elastic: Amazon EMR enables you to quickly and easily provision as much capacity as you need and add or remove capacity at any time. You can deploy multiple clusters or resize a running cluster.
  • Low Cost: Amazon EMR is designed to reduce the cost of processing large amounts of data. Features that keep costs down include low hourly pricing, Amazon EC2 Spot integration, Amazon EC2 Reserved Instance integration, elasticity, and Amazon S3 integration.
  • Flexible Data Stores: With Amazon EMR, you can use multiple data stores, including Amazon S3, the Hadoop Distributed File System (HDFS), and Amazon DynamoDB.
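
The elasticity and Spot integration described above can be sketched as a cluster request. This is a minimal sketch, not a definitive setup: the function name `build_emr_cluster_request`, the instance types, the release label, the group names, and the `s3://example-bucket/` log location are all illustrative assumptions. The dictionary is shaped like the keyword arguments accepted by boto3's EMR `run_job_flow` call, but nothing here contacts AWS.

```python
from typing import Any, Dict


def build_emr_cluster_request(name: str, core_nodes: int, spot_task_nodes: int) -> Dict[str, Any]:
    """Build keyword arguments shaped for boto3's EMR `run_job_flow` call.

    All concrete values (instance types, release label, bucket) are
    placeholders chosen for illustration, not recommendations.
    """
    instance_groups = [
        # One primary node and a fixed set of core nodes at On-Demand pricing.
        {"Name": "primary", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": 1},
        {"Name": "core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
    ]
    if spot_task_nodes > 0:
        # Optional task nodes on the Spot market for cheaper burst capacity.
        instance_groups.append(
            {"Name": "task-spot", "InstanceRole": "TASK", "Market": "SPOT",
             "InstanceType": "m5.xlarge", "InstanceCount": spot_task_nodes}
        )
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "InstanceGroups": instance_groups,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        "LogUri": "s3://example-bucket/emr-logs/",  # hypothetical bucket
    }


request = build_emr_cluster_request("log-analysis", core_nodes=2, spot_task_nodes=4)
print([g["Market"] for g in request["Instances"]["InstanceGroups"]])
```

With credentials configured, a request like this would typically be passed as `boto3.client("emr").run_job_flow(**request)`, and a running cluster's group sizes can later be changed with `modify_instance_groups`.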

On the other hand, Amazon Redshift provides the following key features:

  • Optimized for Data Warehousing: Redshift uses columnar storage, data compression, and zone maps to reduce the amount of I/O needed to perform queries. It has a massively parallel processing (MPP) architecture that parallelizes and distributes SQL operations to take advantage of all available resources.
  • Scalable: With a few clicks in the AWS Management Console or a simple API call, you can scale the number of nodes in your data warehouse up or down as your performance or capacity needs change.
  • No Up-Front Costs: You pay only for the resources you provision. You can choose On-Demand pricing with no up-front costs or long-term commitments, or obtain significantly discounted rates with Reserved Instance pricing.
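
To illustrate how zone maps reduce I/O, the sketch below stores a column in fixed-size blocks and keeps the minimum and maximum value of each block. A range query can then skip any block whose min/max range cannot contain a match. This is a simplified model for illustration only, not Redshift's actual implementation; the names `Block`, `build_blocks`, and `scan_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    values: List[int]
    min_value: int  # zone map entry: smallest value in the block
    max_value: int  # zone map entry: largest value in the block


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return values in [low, high] and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Skip the block when its min/max range cannot overlap the query range.
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


blocks = build_blocks(list(range(100)), block_size=10)
matches, blocks_read = scan_range(blocks, 42, 47)
print(matches, f"read {blocks_read} of {len(blocks)} blocks")
```

The skipping only pays off when values are clustered within blocks, as they are in this sorted example. On randomly ordered data most blocks would overlap the query range and still have to be read.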

"On demand processing power" is the top reason cited by more than 13 developers who like Amazon EMR, while more than 27 developers mention "Data Warehousing" as the leading reason for choosing Amazon Redshift.

Lyft, PedidosYa, and Zapier are some of the popular companies that use Amazon Redshift, whereas Amazon EMR is used by SendGrid, Vine Labs, and Etsy. Amazon Redshift has broader adoption, being mentioned in 267 company stacks and 63 developer stacks, compared to Amazon EMR, which is listed in 93 company stacks and 18 developer stacks.

What is Amazon EMR?

Amazon EMR is used in a variety of applications, including log analysis, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics.

What is Amazon Redshift?

Amazon Redshift is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
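
As a rough worked example of the pricing claim, the sketch below turns the quoted figures into an upper bound for a given dataset size. The two constants come straight from the description above; they are marketing figures, not current pricing, and the function names are hypothetical.

```python
QUOTED_MAX_USD_PER_TB_YEAR = 1_000  # "less than $1,000 per terabyte per year"
TRADITIONAL_MULTIPLIER = 10         # "a tenth the cost" of traditional solutions


def annual_cost_ceiling(terabytes: float) -> float:
    """Upper bound on yearly cost implied by the quoted per-terabyte figure."""
    return terabytes * QUOTED_MAX_USD_PER_TB_YEAR


def implied_traditional_cost(terabytes: float) -> float:
    """Yearly cost of a traditional warehouse implied by the 'a tenth' claim."""
    return annual_cost_ceiling(terabytes) * TRADITIONAL_MULTIPLIER


size_tb = 50
print(f"{size_tb} TB: under ${annual_cost_ceiling(size_tb):,.0f} per year, "
      f"versus roughly ${implied_traditional_cost(size_tb):,.0f} implied for a traditional warehouse")
```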
      What are some alternatives to Amazon EMR and Amazon Redshift?
      Amazon EC2
      Amazon EC2 is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
      Hadoop
      The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
      Amazon DynamoDB
      With Amazon DynamoDB, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.
      Azure HDInsight
      Azure HDInsight is a cloud-based service from Microsoft for big data analytics that helps organizations process large amounts of streaming or historical data.
      Google BigQuery
      Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease. Bulk load your data using Google Cloud Storage or stream it in. Easy access. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries such as Java, PHP or Python.
      Decisions about Amazon EMR and Amazon Redshift
      Ankit Sobti, CTO at Postman Inc

      Tools: Looker, Stitch, Amazon Redshift, dbt

      We recently moved our Data Analytics and Business Intelligence tooling to Looker. It's already helping us create a solid process for reusable SQL-based data modeling, with consistent definitions across the entire organization. Looker allows us to collaboratively build these version-controlled models and, with a lean team, push the limits of what we've traditionally been able to accomplish with analytics.

      For Data Engineering, we're in the process of moving from maintaining our own ETL pipelines on AWS to a managed ELT system on Stitch. We're also evaluating the command-line tool dbt to manage data transformations. Our hope is that Stitch + dbt will streamline the ELT bit, allowing us to focus our energies on analyzing data rather than managing it.
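
The ELT pattern mentioned above (load raw data first, then transform it with SQL inside the warehouse) can be sketched as follows. This is only an illustration of the pattern: an in-memory SQLite database stands in for the warehouse, it does not use Stitch, dbt, or Amazon Redshift, and the table and function names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def run_elt(raw_events: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Load raw rows unchanged, then transform them with SQL in the database."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse

    # Extract + Load: copy the source records as-is into a raw table.
    conn.execute("CREATE TABLE raw_events (user_id TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", raw_events)

    # Transform: build a modeled table from the raw one using SQL only.
    conn.execute(
        """
        CREATE TABLE user_totals AS
        SELECT user_id, SUM(amount_cents) / 100.0 AS total_usd
        FROM raw_events
        GROUP BY user_id
        """
    )
    rows = conn.execute(
        "SELECT user_id, total_usd FROM user_totals ORDER BY user_id"
    ).fetchall()
    conn.close()
    return rows


print(run_elt([("a", 150), ("b", 200), ("a", 50)]))
```

Keeping the raw table untouched means a transformation can be changed and rerun later without extracting the data from the source again.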

      How developers use Amazon EMR and Amazon Redshift
      Olo uses Amazon Redshift

      Aggressive archiving of historical data to keep the production database as small as possible, using our in-house, soon-to-be-open-sourced ETL library, SharpShifter.

      Andrew La Grange uses Amazon EMR

      We use Amazon EMR for all our Hadoop workloads.

      Christian Moeller uses Amazon Redshift

      Connected to BI (Pentaho)

      Kovid Rathee uses Amazon Redshift

      OLAP and BI
