Amazon EMR

#30 in Databases
Stacks: 546
Followers: 682
Discussions: 1

What is Amazon EMR?

Amazon EMR is used in a variety of applications, including log analysis, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics.

Amazon EMR is a tool in the Databases category of a tech stack.

Key Features

  • Elastic: Amazon EMR enables you to quickly and easily provision as much capacity as you need and add or remove capacity at any time. Deploy multiple clusters or resize a running cluster.
  • Low Cost: Amazon EMR is designed to reduce the cost of processing large amounts of data. Some of the features that make it low cost include low hourly pricing, Amazon EC2 Spot integration, Amazon EC2 Reserved Instance integration, elasticity, and Amazon S3 integration.
  • Flexible Data Stores: With Amazon EMR, you can leverage multiple data stores, including Amazon S3, the Hadoop Distributed File System (HDFS), and Amazon DynamoDB.
  • Hadoop Tools: EMR supports powerful and proven Hadoop tools such as Hive, Pig, and HBase.
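As a rough illustration of the features above, the following Python sketch builds the keyword arguments one might pass to boto3's `run_job_flow` call to launch a cluster with on-demand primary and core nodes, optional Spot task nodes, and the Hadoop tools named in the list. The helper name `build_cluster_request`, the bucket `example-bucket`, the `m5.xlarge` instance type, and the `emr-6.15.0` release label are assumptions chosen for illustration, not values from this page. The sketch only constructs the request; it makes no AWS calls.

```python
from typing import Any, Dict, List


def build_cluster_request(name: str, core_nodes: int, spot_task_nodes: int) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR run_job_flow call (hypothetical helper)."""
    instance_groups: List[Dict[str, Any]] = [
        {
            "Name": "Primary",
            "InstanceRole": "MASTER",
            "Market": "ON_DEMAND",
            "InstanceType": "m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
        },
        {
            "Name": "Core",
            "InstanceRole": "CORE",
            "Market": "ON_DEMAND",
            "InstanceType": "m5.xlarge",
            "InstanceCount": core_nodes,
        },
    ]
    # Task nodes hold no HDFS data, so they are a common place to use Spot capacity.
    if spot_task_nodes > 0:
        instance_groups.append(
            {
                "Name": "Task",
                "InstanceRole": "TASK",
                "Market": "SPOT",
                "InstanceType": "m5.xlarge",
                "InstanceCount": spot_task_nodes,
            }
        )
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "LogUri": "s3://example-bucket/emr-logs/",  # hypothetical bucket
        "Applications": [{"Name": "Hive"}, {"Name": "Pig"}, {"Name": "HBase"}],
        "Instances": {
            "InstanceGroups": instance_groups,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


if __name__ == "__main__":
    request = build_cluster_request("demo-cluster", core_nodes=2, spot_task_nodes=3)
    for group in request["Instances"]["InstanceGroups"]:
        print(group["InstanceRole"], group["Market"], group["InstanceCount"])
    # To launch for real (requires boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("emr").run_job_flow(**request)
```

Keeping request construction separate from the AWS call makes the configuration easy to inspect before anything is provisioned. Resizing a running cluster would go through a separate API call and is not shown here.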

Amazon EMR Pros & Cons

Pros of Amazon EMR

  • On-demand processing power
  • No need to maintain a Hadoop cluster yourself
  • Hadoop tools
  • Elastic
  • Backed by Amazon
  • Economical: pay as you go, with an easy-to-use CLI and SDKs
  • Flexible
  • No need for a dedicated Ops group
  • Great support
  • Massive data handling

Cons of Amazon EMR

No cons listed yet.

Amazon EMR Alternatives & Comparisons

What are some alternatives to Amazon EMR?

Google BigQuery

Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease. Bulk load your data using Google Cloud Storage or stream it in. Easy access. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries such as Java, PHP or Python.

Amazon Redshift

Amazon Redshift is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.

Snowflake

Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn.

Stitch

Stitch is a simple, powerful ETL service built for software developers. Stitch evolved out of RJMetrics, a widely used business intelligence platform. When RJMetrics was acquired by Magento in 2016, Stitch was launched as its own company.

Cloudera Enterprise

Cloudera Enterprise includes CDH, the world’s most popular open source Hadoop-based platform, as well as advanced system management and data management tools plus dedicated support and community advocacy from our world-class team of Hadoop developers and experts.

Dremio

Dremio, the data lake engine, operationalizes your data lake storage and speeds up your analytics processes with a high-performance, high-efficiency query engine, while also democratizing data access for data scientists and analysts.

Amazon EMR Integrations

SignalFx, AWS Glue, Eucalyptus, AWS Outposts, Amazon Managed Workflows for Apache Airflow are some of the popular tools that integrate with Amazon EMR. Here's a list of all 5 tools that integrate with Amazon EMR.

  • SignalFx
  • AWS Glue
  • Eucalyptus
  • AWS Outposts
  • Amazon Managed Workflows for Apache Airflow

Amazon EMR Discussions

Discover why developers choose Amazon EMR. Read real-world technical decisions and stack choices from the StackShare community.

Showing 1 of 3 discussions.

Sung Won Chung

Jun 5, 2019

Needs advice on AWS Glue and Amazon EMR

I used AWS Glue because I thought it was worth all the hype in Fall 2018. However, you had to use Python 2.7 with no pandas support, and cold starts lasted as long as 15 minutes. Also, setting up a dev environment for iterative development was nearly impossible at the time.

It was a terrible experience for me. I recommend using Amazon EMR instead. Even a friend who works at Amazon told me they use EMR instead of Glue for internal Spark workloads. Just because a company makes something doesn't mean they use that something :/

Adoption

On StackShare

Companies: 178
Developers: 369