Amazon DynamoDB vs Presto

Amazon DynamoDB vs Presto: What are the differences?

What is Amazon DynamoDB? Fully managed NoSQL database service. All data items are stored on Solid State Drives (SSDs), and are replicated across 3 Availability Zones for high availability and durability. With DynamoDB, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.

What is Presto? Distributed SQL Query Engine for Big Data. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Amazon DynamoDB and Presto are primarily classified as "NoSQL Database as a Service" and "Big Data" tools respectively.
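The difference between the two categories can be illustrated with a rough sketch. It does not use DynamoDB or Presto: a plain dictionary stands in for key-based item lookups, and Python's built-in `sqlite3` module stands in for a SQL engine running an aggregate query. The table name, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Key-value style access (the NoSQL side): fetch one item by its key.
# A dict is only a stand-in for a managed key-value store.
users_by_id = {
    "u1": {"name": "Ada", "country": "DE"},
    "u2": {"name": "Lin", "country": "US"},
}


def get_user(user_id):
    """Return the item stored under user_id, or None if it is absent."""
    return users_by_id.get(user_id)


# Analytic SQL style access (the query-engine side): scan many rows
# and aggregate them. sqlite3 is only a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("u1", "DE", 10.0), ("u2", "US", 5.0), ("u1", "DE", 2.5)],
)

totals = conn.execute(
    "SELECT country, SUM(amount) FROM events GROUP BY country ORDER BY country"
).fetchall()

print(get_user("u1"))
print(totals)
```

The first access pattern touches a single item by key; the second reads every row to compute a summary, which is the kind of workload an analytic SQL engine targets.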

"Predictable performance and cost" is the primary reason why developers consider Amazon DynamoDB over the competitors, whereas "Works directly on files in s3 (no ETL)" was stated as the key factor in picking Presto.

Presto is an open source tool with 9.22K GitHub stars and 3.12K GitHub forks.

According to the StackShare community, Amazon DynamoDB has broader approval, being mentioned in 433 company stacks and 173 developer stacks, compared to Presto, which is listed in 19 company stacks and 11 developer stacks.

Amazon DynamoDB has no public GitHub repository.

    What are some alternatives to Amazon DynamoDB and Presto?
    Google Cloud Datastore
    Use a managed, NoSQL, schemaless database for storing non-relational data. Cloud Datastore automatically scales as you need it and supports transactions as well as robust, SQL-like queries.
    MongoDB
    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
    Amazon SimpleDB
    Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest. Behind the scenes, Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability. Amazon SimpleDB provides a simple web services interface to create and store multiple data sets, query your data easily, and return the results. Your data is automatically indexed, making it easy to quickly find the information that you need. There is no need to pre-define a schema or change a schema if new data is added later. And scale-out is as simple as creating new domains, rather than building out new servers.
    MySQL
    The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.
    Amazon S3
    Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web.
    Decisions about Amazon DynamoDB and Presto
    StackShare Editors
    3 upvotes · 12K views
    at Uber Technologies
    Presto, Apache Spark, Hadoop

    Around 2015, the growing use of Uber’s data exposed limitations in the ETL and Vertica-centric setup, not to mention the increasing costs. “As our company grew, scaling our data warehouse became increasingly expensive. To cut down on costs, we started deleting older, obsolete data to free up space for new data.”

    To overcome these challenges, Uber rebuilt their big data platform around Hadoop. “More specifically, we introduced a Hadoop data lake where all raw data was ingested from different online data stores only once and with no transformation during ingestion.”

    “In order for users to access data in Hadoop, we introduced Presto to enable interactive ad hoc user queries, Apache Spark to facilitate programmatic access to raw data (in both SQL and non-SQL formats), and Apache Hive to serve as the workhorse for extremely large queries.”

    StackShare Editors
    4 upvotes · 24.3K views
    at Uber Technologies
    Presto, Apache Spark, Hadoop

    To improve platform scalability and efficiency, Uber transitioned from JSON to Parquet, and built a central schema service to manage schemas and integrate different client libraries.

    While the first generation big data platform was vulnerable to upstream data format changes, “ad hoc data ingestions jobs were replaced with a standard platform to transfer all source data in its original, nested format into the Hadoop data lake.”

    These platform changes enabled Uber to meet the scaling challenges it was facing around that time: “On a daily basis, there were tens of terabytes of new data added to our data lake, and our Big Data platform grew to over 10,000 vcores with over 100,000 running batch jobs on any given day.”

    StackShare Editors
    Presto, Apache Spark, Scala, MySQL, Kafka

    Slack’s data team works to “provide an ecosystem to help people in the company quickly and easily answer questions about usage, so they can make better and data informed decisions.” To achieve that goal, they rely on a complex data pipeline.

    An in-house tool called Sqooper scrapes MySQL backups and pipes them to S3. Job queue and log data are sent to Kafka, then persisted to S3 using an open source tool called Secor, which was created by Pinterest.

    For compute, Amazon’s Elastic MapReduce (EMR) creates clusters preconfigured for Presto, Hive, and Spark.

    Presto is then used for ad-hoc questions, validating data assumptions, exploring smaller datasets, and creating visualizations for some internal tools. Hive is used for larger data sets or longer time series data, and Spark allows teams to write efficient and robust batch and aggregation jobs. Most of the Spark pipeline is written in Scala.

    Thrift binds all of these engines together with a typed schema and structured data.

    Finally, the Hive Metastore serves as the ground truth for all data and its schema.

    StackShare Editors
    Apache Thrift, Kotlin, Presto, HHVM (HipHop Virtual Machine), gRPC, Kubernetes, Apache Spark, Airflow, Terraform, Hadoop, Swift, Hack, Memcached, Consul, Chef, Prometheus

    Since the beginning, Cal Henderson has been the CTO of Slack. Earlier this year, he commented on a Quora question summarizing their current stack.

    Apps
    • Web: a mix of JavaScript/ES6 and React.
    • Desktop: Electron, to ship it as a desktop application.
    • Android: a mix of Java and Kotlin.
    • iOS: written in a mix of Objective-C and Swift.
    Backend
    • The core application and the API are written in PHP/Hack and run on HHVM.
    • The data is stored in MySQL using Vitess.
    • Caching is done using Memcached and MCRouter.
    • The search service takes help from SolrCloud, with various Java services.
    • The messaging system uses WebSockets with many services in Java and Go.
    • Load balancing is done using HAProxy with Consul for configuration.
    • Most services talk to each other over gRPC, with some Thrift and JSON-over-HTTP.
    • The voice and video calling service was built in Elixir.
    Data warehouse
    • Built using open source tools including Presto, Spark, Airflow, Hadoop and Kafka.
    Etc
    Eric Colson
    Chief Algorithms Officer at Stitch Fix · 19 upvotes · 286.4K views
    Amazon EC2 Container Service, Docker, PyTorch, R, Python, Presto, Apache Spark, Amazon S3, PostgreSQL, Kafka
    #AWS #Etl #ML #DataScience #DataStack #Data

    The algorithms and data infrastructure at Stitch Fix are housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on YARN is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling YARN clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.

    Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).

    At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.

    For more info:

    Doru Mihai
    Solution Architect · 4 upvotes · 454 views
    Amazon DynamoDB

    I use Amazon DynamoDB because it integrates seamlessly with other AWS SaaS solutions. If cost is the primary concern early on, it is a better choice than AWS RDS or any other solution that requires creating an HA cluster of IaaS components, which cost money just for being there, largely regardless of usage.

    How developers use Amazon DynamoDB and Presto
    Karma uses Amazon DynamoDB

    For most of the stuff we use MySQL. We just use Amazon RDS. But for some stuff we use Amazon DynamoDB. We love DynamoDB. It's amazing. We store usage data in there, for example. I think we have close to seven or eight hundred million records in there and it's scaled like you don't even notice it. You never notice any performance degradation whatsoever. It's insane, and the last time I checked we were paying $150 for that.

    Volkan Özçelik uses Amazon DynamoDB

    zerotoherojs.com’s userbase and course details are stored in DynamoDB tables.

    The good thing about AWS DynamoDB is: For the amount of traffic that I have, it is free. It is highly-scalable, it is managed by Amazon, and it is pretty fast.

    It is, again, one less thing to worry about (when compared to managing your own MongoDB elsewhere).

    CloudRepo uses Amazon DynamoDB

    We store customer metadata in DynamoDB. We decided to use Amazon DynamoDB because it was a fully managed, highly available solution. We didn't want to operate our own SQL server and we wanted to ensure that we built CloudRepo on high availability components so that we could pass that benefit back to our customers.

    nrise uses Amazon DynamoDB

    Some logs are currently recorded in AWS DynamoDB. We plan to move them to MongoDB as part of an improvement effort. It is not bad for accumulating very simple data. However, queries are very limited. Be sure to check DynamoDB's specifications before using it.

    HyperTrack uses Amazon DynamoDB

    We use it to store device health records, as it allows super fast writes and range queries.
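The "range queries" mentioned here rest on DynamoDB's composite primary key: items that share a partition key are kept ordered by a sort key, so a query can ask for a contiguous range within one partition. The toy in-memory class below sketches that idea only. It does not call AWS, and `ToyTable`, `put`, `query_between`, the device IDs, and the record fields are all hypothetical names chosen for illustration.

```python
from bisect import bisect_left, bisect_right


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        # partition key -> (sorted list of sort keys, items in the same order)
        self._partitions = {}

    def put(self, partition_key, sort_key, item):
        """Insert an item, keeping its partition ordered by sort key."""
        keys, items = self._partitions.setdefault(partition_key, ([], []))
        pos = bisect_right(keys, sort_key)
        keys.insert(pos, sort_key)
        items.insert(pos, item)

    def query_between(self, partition_key, low, high):
        """Return items in one partition with low <= sort key <= high."""
        keys, items = self._partitions.get(partition_key, ([], []))
        start = bisect_left(keys, low)
        end = bisect_right(keys, high)
        return items[start:end]


# Hypothetical device health records keyed by device ID and timestamp.
table = ToyTable()
table.put("device-1", 1000, {"battery": 91})
table.put("device-1", 1120, {"battery": 88})
table.put("device-1", 1060, {"battery": 90})
table.put("device-2", 1030, {"battery": 55})

recent = table.query_between("device-1", 1050, 1200)
print(recent)
```

Because each partition stays sorted, a range lookup is two binary searches plus a slice, and records from other devices are never scanned.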
