Hadoop vs MySQL vs PostgreSQL


Hadoop: 1.9K stacks · 1.8K followers · 54 votes
MySQL: 68.3K stacks · 52.6K followers · 3.7K votes
PostgreSQL: 51.6K stacks · 39.8K followers · 3.5K votes

Decisions about Hadoop, MySQL, and PostgreSQL

Backend:

  • Considering that our main app functionality involves data processing, we chose Python as the programming language because it offers many powerful math libraries for data-related tasks. We will use Flask for the server due to its good integration with Python. We will use a relational database because it has good performance and we are mostly dealing with CSV files that have a fixed structure. We originally chose SQLite, but after realizing the limitations of file-based databases, we decided to switch to PostgreSQL, which has better compatibility with our hosting service, Heroku.
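
As a rough illustration of the fixed-structure CSV workflow described above, here is a minimal sketch of loading CSV rows into a relational table and querying them. It uses Python's standard-library `sqlite3` module only so that it runs without a database server; with PostgreSQL the same SQL would go through a PostgreSQL driver instead. The table name, column names, and sample data are hypothetical and not taken from the original post.

```python
import csv
import io
import sqlite3

# Hypothetical CSV with a fixed structure (column names invented for this sketch).
SAMPLE_CSV = "sensor_id,reading\n1,20.5\n2,19.0\n1,22.5\n"


def load_csv(conn, text):
    """Create a fixed-schema table and insert every CSV row into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "sensor_id INTEGER NOT NULL, reading REAL NOT NULL)"
    )
    rows = [
        (int(row["sensor_id"]), float(row["reading"]))
        for row in csv.DictReader(io.StringIO(text))
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return len(rows)


def average_by_sensor(conn):
    """Return (sensor_id, average reading) pairs ordered by sensor_id."""
    cur = conn.execute(
        "SELECT sensor_id, AVG(reading) FROM readings "
        "GROUP BY sensor_id ORDER BY sensor_id"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
inserted = load_csv(conn, SAMPLE_CSV)
averages = average_by_sensor(conn)
```

Because every row has the same columns, the schema can be declared once with types and `NOT NULL` constraints, and malformed rows fail at load time rather than at query time.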
Anthony Simon
Lead Engineer at Stylight | 20 upvotes · 28.2K views

I try to follow an 80/20 distribution when it comes to my choice of tools. This means my stack consists of about 80% software I already know well, but I do allow myself 20% of the stack to explore tech I have less experience with.

The exact ratio is not what’s important here; what matters is that you lean towards proven technologies.

I wrote more about this in my blog post on Choosing Boring Technology: https://panelbear.com/blog/boring-tech/

Sergey Rodovinsky

We were looking at several alternative databases that would support the following architectural requirements:

  • very quick prototyping for an unknown domain
  • ability to support large amounts of data
  • native ability to replicate and fail over
  • full-stack approach for Node.js development

After careful consideration MongoDB came out on top, and 3 years later we are still very happy with that decision. We currently keep almost 2TB of data in our cluster and are starting to think about sharding.

Pros of Hadoop

  • 38 · Great ecosystem
  • 11 · One stack to rule them all
  • 4 · Great load balancer
  • 1 · Java syntax

Pros of MySQL

  • 789 · Sql
  • 674 · Free
  • 557 · Easy
  • 527 · Widely used
  • 485 · Open source
  • 180 · High availability
  • 158 · Cross-platform support
  • 103 · Great community
  • 77 · Secure
  • 75 · Full-text indexing and searching
  • 25 · Fast, open, available
  • 14 · SSL support
  • 13 · Robust
  • 13 · Reliable
  • 8 · Enterprise Version
  • 7 · Easy to set up on all platforms
  • 1 · Easy, light, scalable
  • 1 · Relational database
  • 1 · NoSQL access to JSON data type
  • 1 · Sequel Pro (best SQL GUI)
  • 1 · Replica Support

Pros of PostgreSQL

  • 755 · Relational database
  • 506 · High availability
  • 437 · Enterprise class database
  • 379 · Sql
  • 299 · Sql + nosql
  • 171 · Great community
  • 145 · Easy to setup
  • 129 · Heroku
  • 128 · Secure by default
  • 111 · Postgis
  • 48 · Supports Key-Value
  • 46 · Great JSON support
  • 32 · Cross platform
  • 29 · Extensible
  • 25 · Replication
  • 24 · Triggers
  • 22 · Rollback
  • 21 · Multiversion concurrency control
  • 20 · Open source
  • 17 · Heroku Add-on
  • 14 · Stable, Simple and Good Performance
  • 13 · Powerful
  • 12 · Lets be serious, what other SQL DB would you go for?
  • 9 · Good documentation
  • 7 · Scalable
  • 7 · Intelligent optimizer
  • 6 · Transactional DDL
  • 6 · Modern
  • 6 · Reliable
  • 5 · One stop solution for all things sql no matter the os
  • 5 · Free
  • 4 · Relational database with MVCC
  • 3 · Full-Text Search
  • 3 · Developer friendly
  • 3 · Faster Development
  • 2 · Excellent source code
  • 2 · Great DB for Transactional system or Application
  • 1 · Free version
  • 1 · Text
  • 1 · Open-source
  • 1 · search
  • 1 · Full-text


Cons of Hadoop

  • No cons listed

Cons of MySQL

  • 13 · Owned by a company with their own agenda
  • 1 · Can't roll back schema changes

Cons of PostgreSQL

  • 9 · Table/index bloatings


    What is Hadoop?

    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
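
To make "simple programming models" a little more concrete, the sketch below imitates the map, shuffle, and reduce steps of the MapReduce model in a single Python process. It is not the Hadoop API and does nothing distributed; the function names and the word-count task are assumptions chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big clusters", "data everywhere"])
```

In a real cluster the map calls would run on the machines that hold each block of input, and the grouped keys would be spread across reducers; the per-record logic stays as small as the two functions above.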

    What is MySQL?

    The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

    What is PostgreSQL?

    PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.
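
Two of the features listed above, transactions and foreign keys, can be sketched together: a foreign-key violation inside a transaction undoes every statement in that transaction. The example uses Python's standard-library `sqlite3` module as a stand-in so that it runs without a server; it is not PostgreSQL-specific code, and the `authors`/`books` schema and the helper name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on;
# PostgreSQL enforces declared foreign keys without any such switch.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
    "author_id INTEGER NOT NULL REFERENCES authors(id))"
)
conn.commit()


def insert_author_and_book(conn, name, title, book_author_id):
    """Insert an author and a book in one transaction; return True if both were stored."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute("INSERT INTO authors (name) VALUES (?)", (name,))
            conn.execute(
                "INSERT INTO books (title, author_id) VALUES (?, ?)",
                (title, book_author_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = insert_author_and_book(conn, "Ada", "Notes", 1)        # author id 1 exists
failed = insert_author_and_book(conn, "Bob", "Ghost", 999)  # no author with id 999
author_count = conn.execute("SELECT COUNT(*) FROM authors").fetchone()[0]
book_count = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```

The second call inserts "Bob" and then fails on the book row, so the rollback removes "Bob" as well and only the first author and book remain.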


    Blog Posts

    • Dec 8 2020, DigitalOcean: GitHub, MySQL, MongoDB (+11 more tools)
    • (date and company not shown): MySQL, Kafka, Apache Spark (+6 more tools)
    • Nov 20 2019, OneSignal: PostgreSQL, Redis, Ruby (+8 more tools)
    • Aug 28 2019, Segment: Python, Java, Amazon S3 (+16 more tools)

    What are some alternatives to Hadoop, MySQL, and PostgreSQL?
    Cassandra
    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
    MongoDB
    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
    Elasticsearch
    Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
    Splunk
    Splunk provides a platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data.
    Snowflake
    Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn.
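
The partitioning and automatic repartitioning mentioned in the Cassandra entry above can be illustrated with a small consistent-hashing ring. This is a simplified sketch of the general idea, not Cassandra's implementation: the MD5-based `token` function, the `Ring` class, the node names, and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 is an arbitrary choice for this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy hash ring: each node owns several positions (virtual nodes)."""

    def __init__(self, nodes, vnodes=8):
        self.vnodes = vnodes
        self.tokens = []  # sorted ring positions
        self.owners = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place the node's virtual nodes on the ring."""
        for i in range(self.vnodes):
            position = token(f"{node}#{i}")
            bisect.insort(self.tokens, position)
            self.owners[position] = node

    def node_for(self, key):
        """Return the node owning the first ring position after the key's token."""
        index = bisect.bisect_right(self.tokens, token(key)) % len(self.tokens)
        return self.owners[self.tokens[index]]


keys = [f"user:{n}" for n in range(200)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {key: ring.node_for(key) for key in keys}

ring.add("node-d")  # a machine joins the cluster
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

The application only ever asks `node_for(key)`, which is what makes the placement transparent to it. When `node-d` joins, every key that changes owner moves to `node-d`; keys are never shuffled between the existing nodes.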