What are some alternatives to Presto?

What is Presto and what are its top alternatives?

Distributed SQL Query Engine for Big Data

Presto is a tool in the Big Data Tools category of a tech stack.

Presto is an open source tool; its source repository is hosted on GitHub.

Top Alternatives to Presto

Apache Spark
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning. ...
Stan
A state-of-the-art platform for statistical modeling and high-performance statistical computation. Used for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business. ...
Apache Impala
Impala is a modern, open source, MPP SQL query engine for Apache Hadoop. Impala is shipped by Cloudera, MapR, and Amazon. With Impala, you can query data, whether stored in HDFS or Apache HBase – including SELECT, JOIN, and aggregate functions – in real time. ...
Snowflake
Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn. ...
Apache Drill
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel. ...
Druid
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. ...
MySQL
The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software. ...
PostgreSQL
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. ...

Presto alternatives & related posts

Apache Spark

Fast and general engine for large-scale data processing

Stacks: 3K

Votes: 140

PROS OF APACHE SPARK

Open-source (61)
Fast and flexible (48)
One platform for every big data problem (8)
Great for distributed SQL-like applications (8)
Easy to install and to use (6)
Works well for most data science use cases (3)
Interactive query (2)
Machine learning library, streaming in real time (2)
In-memory computation (2)

CONS OF APACHE SPARK

Speed (4)

Stan

A Probabilistic Programming Language

Stacks: 64

Votes: 0

Apache Impala

Real-time Query for Hadoop

Stacks: 145

Votes: 18

PROS OF APACHE IMPALA

Super fast (11)
Massively Parallel Processing (1)
Load Balancing (1)
Replication (1)
Scalability (1)
Distributed (1)
High Performance (1)
Open Source (1)

Snowflake

The data warehouse built for the cloud

Stacks: 1.1K

Votes: 27

PROS OF SNOWFLAKE

Public and Private Data Sharing (7)
Multicloud (4)
Good Performance (4)
User Friendly (4)
Great Documentation (3)
Serverless (2)
Economical (1)
Usage based billing (1)
Innovative (1)

Apache Drill

Schema-Free SQL Query Engine for Hadoop and NoSQL

Stacks: 72

Votes: 16

PROS OF APACHE DRILL

NoSQL and Hadoop (4)
Free (3)
Lightning speed and simplicity in face of data jungle (3)
Well documented for fast install (2)
SQL interface to multiple datasources (1)
Nested Data support (1)
Read Structured and unstructured data (1)
V1.10 released - https://drill.apache.org/ (1)

Druid

Fast column-oriented distributed data store

Stacks: 382

Votes: 32

PROS OF DRUID

Real Time Aggregations (15)
Batch and Real-Time Ingestion (6)
OLAP (5)
OLAP + OLTP (3)
Combining stream and historical analytics (2)
OLTP (1)

CONS OF DRUID

Limited SQL support (3)
Joins are not supported well (2)
Complexity (1)

MySQL

The world's most popular open source database

Stacks: 126.1K

Votes: 3.8K

PROS OF MYSQL

SQL (800)
Free (679)
Easy (562)
Widely used (528)
Open source (490)
High availability (180)
Cross-platform support (160)
Great community (104)
Secure (79)
Full-text indexing and searching (75)
Fast, open, available (26)
Reliable (16)
SSL support (16)
Robust (15)
Enterprise Version (9)
Easy to set up on all platforms (7)
NoSQL access to JSON data type (3)
Relational database (1)
Easy, light, scalable (1)
Sequel Pro (best SQL GUI) (1)
Replica Support (1)

CONS OF MYSQL

Owned by a company with their own agenda (16)
Can't roll back schema changes (3)

PostgreSQL

A powerful, open source object-relational database system

Stacks: 98.8K

Votes: 3.5K

PROS OF POSTGRESQL

Relational database (764)
High availability (510)
Enterprise class database (439)
SQL (383)
SQL + NoSQL (304)
Great community (173)
Easy to set up (147)
Heroku (131)
Secure by default (130)
PostGIS (113)
Supports Key-Value (50)
Great JSON support (48)
Cross platform (34)
Extensible (33)
Replication (28)
Triggers (26)
Multiversion concurrency control (23)
Rollback (23)
Open source (21)
Heroku Add-on (18)
Stable, Simple and Good Performance (17)
Powerful (15)
Let's be serious, what other SQL DB would you go for? (13)
Good documentation (11)
Scalable (9)
Free (8)
Reliable (8)
Intelligent optimizer (8)
Transactional DDL (7)
Modern (7)
One stop solution for all things SQL no matter the OS (6)
Relational database with MVCC (5)
Faster Development (5)
Full-Text Search (4)
Developer friendly (4)
Excellent source code (3)
Free version (3)
Great DB for Transactional system or Application (3)
Relational database (3)
Search (3)
Open-source (3)
Text (2)
Full-text (2)
Can handle up to petabytes worth of size (1)
Composability (1)
Multiple procedural languages supported (1)
Native (0)

CONS OF POSTGRESQL

Table/index bloat (10)

related PostgreSQL posts

Simon Reymann

Senior Fullstack Developer at QUANTUSflow Software GmbH · Apr 27, 2020 | 30 upvotes · 11.9M views

Our whole DevOps stack consists of the following tools:

GitHub (incl. GitHub Pages/Markdown for documentation, getting-started guides and how-tos) as collaborative review and code management tool
Git as revision control system
SourceTree as Git GUI
Visual Studio Code as IDE
CircleCI for continuous integration (automating the development process)
Prettier / TSLint / ESLint as code linter
SonarQube as quality gate
Docker as container management (incl. Docker Compose for multi-container application management)
VirtualBox for operating system simulation tests
Kubernetes as cluster management for docker containers
Heroku for deploying in test environments
nginx as web server (preferably used as facade server in production environment)
SSLMate (using OpenSSL) for certificate management
Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
PostgreSQL as preferred database system
Redis as preferred in-memory database/store (great for caching)

The main reasons we chose Kubernetes over Docker Swarm are the following:

Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
Scalability: All-in-one framework for distributed systems.
Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), has a huge community among container orchestration tools, and is an open source, modular tool that works with any OS.

Jeyabalaji Subramanian

CTO at FundsCorner · Jan 30, 2019 | 25 upvotes · 3.4M views

Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

We set ourselves the following criteria for the optimal tool that would do this job:

- The data replication must be near real-time, yet it should NOT impact the production database
- The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient

Based on the above criteria, we selected the following tools to perform the end to end data replication:

We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pick up the data payload & mirror the DB changes onto the target data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy. We deployed this micro-service as AWS Lambda with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda services is dumb-easy.
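The mirroring step described above can be sketched roughly as follows. This is not the author's code: the post does not show the message payload or the table models, so the event shape (`op`, `id`, `doc`), the `customers` table, and the function names are all hypothetical. To keep the sketch self-contained, an in-memory SQLite database stands in for the PostgreSQL warehouse, plain SQL stands in for the SQLAlchemy models, and a list of JSON strings stands in for messages received from SQS.

```python
import json
import sqlite3

# Hypothetical columns of the target table; the real models are not shown in the post.
ALLOWED_FIELDS = ("name", "city")


def make_target():
    """Create an in-memory stand-in for the target warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, city TEXT)")
    return conn


def apply_change(conn, event):
    """Mirror one change event (insert / update / delete / replace) onto the target."""
    op = event["op"]
    doc = event.get("doc", {})
    if op in ("insert", "replace"):
        # Insert and replace both write the full row, keyed by id.
        conn.execute(
            "INSERT OR REPLACE INTO customers (id, name, city) VALUES (?, ?, ?)",
            (event["id"], doc.get("name"), doc.get("city")),
        )
    elif op == "update":
        # Only touch the fields present in the event; names come from an allowlist.
        fields = [f for f in ALLOWED_FIELDS if f in doc]
        if fields:
            assignments = ", ".join(f"{f} = ?" for f in fields)
            params = [doc[f] for f in fields] + [event["id"]]
            conn.execute(f"UPDATE customers SET {assignments} WHERE id = ?", params)
    elif op == "delete":
        conn.execute("DELETE FROM customers WHERE id = ?", (event["id"],))
    else:
        raise ValueError(f"unsupported operation: {op}")


def handle_batch(conn, messages):
    """Apply a batch of JSON message bodies in order, then commit once."""
    for body in messages:
        apply_change(conn, json.loads(body))
    conn.commit()


# Simulated queue messages (assumed shape, for illustration only).
conn = make_target()
handle_batch(conn, [
    json.dumps({"op": "insert", "id": "a1", "doc": {"name": "Asha", "city": "Chennai"}}),
    json.dumps({"op": "insert", "id": "b2", "doc": {"name": "Ravi", "city": "Pune"}}),
    json.dumps({"op": "update", "id": "a1", "doc": {"city": "Mumbai"}}),
    json.dumps({"op": "delete", "id": "b2"}),
])
rows = conn.execute("SELECT id, name, city FROM customers ORDER BY id").fetchall()
```

Applying the events strictly in arrival order keeps the sketch simple; a real deployment would also need to consider message ordering and duplicate delivery on the queue, which the post does not discuss.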

In the end, we implemented a highly scalable, near real-time Change Data Replication service that "works" and deployed it to production in a matter of a few days!
