Amazon EMR vs Amazon RDS

Amazon EMR vs Amazon RDS: What are the differences?

What is Amazon EMR? Distribute your data and processing across Amazon EC2 instances using Hadoop. Amazon EMR is used in a variety of applications, including log analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics. Customers launch millions of Amazon EMR clusters every year.

What is Amazon RDS? Set up, operate, and scale a relational database in the cloud. Amazon RDS gives you access to the capabilities of a familiar MySQL, Oracle or Microsoft SQL Server database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS. Amazon RDS automatically patches the database software and backs up your database, storing the backups for a user-defined retention period and enabling point-in-time recovery. You benefit from the flexibility of being able to scale the compute resources or storage capacity associated with your Database Instance (DB Instance) via a single API call.
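
As a rough illustration of the "single API call" mentioned above, the sketch below assumes the boto3 SDK for Python and its RDS `modify_db_instance` operation. The helper names `build_scale_request` and `scale_db_instance`, and identifiers such as `example-db`, are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Optional


def build_scale_request(
    db_instance_id: str,
    instance_class: Optional[str] = None,
    allocated_storage_gib: Optional[int] = None,
    apply_immediately: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for the RDS ModifyDBInstance operation.

    Only the settings that are actually being changed are included.
    """
    if instance_class is None and allocated_storage_gib is None:
        raise ValueError("specify a new instance class, a new storage size, or both")

    params: Dict[str, Any] = {
        "DBInstanceIdentifier": db_instance_id,
        # When False, the change waits for the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        params["DBInstanceClass"] = instance_class  # compute resources
    if allocated_storage_gib is not None:
        params["AllocatedStorage"] = allocated_storage_gib  # storage, in GiB
    return params


def scale_db_instance(rds_client: Any, **kwargs: Any) -> Dict[str, Any]:
    """Send the scaling request through a boto3 RDS client (assumed)."""
    return rds_client.modify_db_instance(**build_scale_request(**kwargs))
```

With boto3 installed and AWS credentials configured, a call such as `scale_db_instance(boto3.client("rds"), db_instance_id="example-db", instance_class="db.m5.large")` would submit the change. Keeping the request builder separate from the client call makes the parameters easy to inspect before anything is sent.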

Amazon EMR belongs to the "Big Data as a Service" category of the tech stack, while Amazon RDS is primarily classified under "SQL Database as a Service".

Some of the features offered by Amazon EMR are:

  • Elastic: Amazon EMR enables you to quickly and easily provision as much capacity as you need and add or remove capacity at any time. You can deploy multiple clusters or resize a running cluster.
  • Low Cost: Amazon EMR is designed to reduce the cost of processing large amounts of data. Features that keep costs down include low hourly pricing, Amazon EC2 Spot integration, Amazon EC2 Reserved Instance integration, elasticity, and Amazon S3 integration.
  • Flexible Data Stores: With Amazon EMR, you can leverage multiple data stores, including Amazon S3, the Hadoop Distributed File System (HDFS), and Amazon DynamoDB.
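
To make the "resize a running cluster" point concrete, here is a minimal sketch that assumes the boto3 SDK for Python and its EMR `modify_instance_groups` operation. The helper names and the cluster and instance group IDs are hypothetical placeholders, not values from this page.

```python
from typing import Any, Dict


def build_resize_request(
    cluster_id: str, instance_group_id: str, instance_count: int
) -> Dict[str, Any]:
    """Build keyword arguments for the EMR ModifyInstanceGroups operation."""
    if instance_count < 0:
        raise ValueError("instance_count must not be negative")
    return {
        "ClusterId": cluster_id,
        "InstanceGroups": [
            {
                "InstanceGroupId": instance_group_id,
                # Target size of the group after the resize.
                "InstanceCount": instance_count,
            }
        ],
    }


def resize_cluster(
    emr_client: Any, cluster_id: str, instance_group_id: str, instance_count: int
) -> None:
    """Ask EMR to grow or shrink one instance group of a running cluster."""
    emr_client.modify_instance_groups(
        **build_resize_request(cluster_id, instance_group_id, instance_count)
    )
```

Assuming boto3 and credentials are available, `resize_cluster(boto3.client("emr"), "j-EXAMPLE", "ig-EXAMPLE", 10)` would request ten instances in that group. The same call can shrink the group again once a job finishes, which is the elasticity described in the list above.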

On the other hand, Amazon RDS provides the following key features:

  • Pre-configured Parameters
  • Monitoring and Metrics
  • Automatic Software Patching

"On demand processing power" is the primary reason developers cite for choosing Amazon EMR over its competitors, whereas "Reliable failovers" is the key factor cited for picking Amazon RDS.

According to the StackShare community, Amazon RDS has broader approval: it is mentioned in 1408 company stacks and 509 developer stacks, compared to Amazon EMR, which is listed in 93 company stacks and 18 developer stacks.

      What are some alternatives to Amazon EMR and Amazon RDS?
      Amazon EC2
      It is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
      Amazon DynamoDB
      With it, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.
      Hadoop
      The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
      Amazon Redshift
      It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
      Azure HDInsight
      It is a cloud-based service from Microsoft for big data analytics that helps organizations process large amounts of streaming or historical data.
      Decisions about Amazon EMR and Amazon RDS
      Tim Specht
      Co-Founder and CTO at Dubsmash · 13 upvotes · 58K views
      at Dubsmash
      Tools: Amazon RDS for Aurora, Redis, Amazon DynamoDB, Amazon RDS, Heroku, PostgreSQL

      Over the years we have added a wide variety of different storages to our stack including PostgreSQL (some hosted by Heroku, some by Amazon RDS) for storing relational data, Amazon DynamoDB to store non-relational data like recommendations & user connections, or Redis to hold pre-aggregated data to speed up API endpoints.

      Since we started running Postgres ourselves on RDS instead of only using the managed offerings of Heroku, we've gained additional flexibility in scaling our application while reducing costs at the same time.

      We are also heavily testing Amazon RDS for Aurora in its Postgres-compatible version and will also give the new release of Aurora Serverless a try!

      #SqlDatabaseAsAService #NosqlDatabaseAsAService #Databases #PlatformAsAService

      Julien DeFrance
      Principal Software Engineer at Tophatter · 16 upvotes · 392.6K views
      at SmartZip
      Tools: Amazon DynamoDB, Ruby, Node.js, AWS Lambda, New Relic, Amazon Elasticsearch Service, Elasticsearch, Superset, Amazon Quicksight, Amazon Redshift, Zapier, Segment, Amazon CloudFront, Memcached, Amazon ElastiCache, Amazon RDS for Aurora, MySQL, Amazon RDS, Amazon S3, Docker, Capistrano, AWS Elastic Beanstalk, Rails API, Rails, Algolia

      Back in 2014, I was given an opportunity to re-architect the SmartZip Analytics platform and its flagship product, SmartTargeting. This is a SaaS product that helps real estate professionals keep up with their prospects and leads in a given neighborhood/territory, find out (thanks to predictive analytics) who is most likely to list/sell their home, and run cross-channel marketing automation against them: direct mail, online ads, email... The company also provides Data APIs to Enterprise customers.

      I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.

      For the SaaS product, we kept on working with Rails as this was what my team had the most knowledge in. We've however broken up the monolith and decoupled the front-end application from the backend thanks to the use of Rails API so we'd get independently scalable micro-services from now on.

      Our various applications could now be deployed using AWS Elastic Beanstalk, so we wouldn't waste any more effort writing time-consuming Capistrano deployment scripts, for instance. We combined this with Docker so that each application would run within its own container, independently of the underlying host configuration.

      Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side, we started with Amazon RDS / MySQL and ultimately migrated to Amazon RDS for Aurora / MySQL when it was released. Once again, this is a case where you want a managed service that your cloud provider handles for you.

      Future improvements / technology decisions included:

        • Caching: Amazon ElastiCache / Memcached
        • CDN: Amazon CloudFront
        • Systems Integration: Segment / Zapier
        • Data-warehousing: Amazon Redshift
        • BI: Amazon Quicksight / Superset
        • Search: Elasticsearch / Amazon Elasticsearch Service / Algolia
        • Monitoring: New Relic

      As our usage grew, patterns changed, and our business needs evolved, my role as Engineering Manager and then Director of Engineering was also to ensure my team kept on learning and innovating while delivering business value.

      One of these innovations was to get ourselves into serverless: adopting AWS Lambda was a big step forward. At the time, it was only available for Node.js (not Ruby), but it was a great way to handle cost efficiency, unpredictable traffic, sudden bursts of traffic... Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.

      How developers use Amazon EMR and Amazon RDS
      Pathwright uses Amazon RDS

      While we initially started off running our own Postgres cluster, we evaluated RDS and found it to be an excellent fit for us.

      The failovers, manual scaling, replication, Postgres upgrades, and pretty much everything else have been super smooth and reliable.

      We'll probably need something a little more complex in the future, but RDS performs admirably for now.

      AngeloR uses Amazon RDS

      We are using RDS for managing PostgreSQL and legacy MSSQL databases.

      Unfortunately, while RDS works great for managing the PostgreSQL systems, MSSQL is very much a second-class citizen and they don't offer very much capability. In fact, in order to upgrade instance storage for MSSQL we actually have to spin up a new cluster and migrate the data over.

      Wirkn Inc. uses Amazon RDS

      Our PostgreSQL servers, where we keep the bulk of Wirkn data, are hosted on the fantastically easy and reliable AWS RDS platform.

      Digital2Go uses Amazon RDS

      We use Aurora for our OLTP database. It provides significant speed increases on top of MySQL without the need to manage it.

      fadingdust uses Amazon RDS

      RDS allows us to replicate the development databases locally as well as making it available to CircleCI.

      Andrew La Grange uses Amazon EMR

      We use Amazon EMR for all our Hadoop workloads.
