Hadoop vs PostgreSQL



Hadoop vs PostgreSQL: What are the differences?

Differences between Hadoop and PostgreSQL

Introduction: Hadoop and PostgreSQL are both widely used technologies in the field of data management and analytics. While they share similarities in terms of handling large volumes of data, there are several key differences between them that make them suitable for different use cases.

  1. Data Format and Structure: Hadoop is designed to handle unstructured and semi-structured data, such as text, images, and log files. It stores data in the Hadoop Distributed File System (HDFS) and processes it using the MapReduce framework. In contrast, PostgreSQL is a relational database management system (RDBMS) that is optimized for structured data. It uses a table-based schema to organize and query data.

  2. Scalability and Performance: Hadoop scales horizontally: it distributes data across the nodes of a cluster, which allows parallel processing and efficient handling of large datasets. PostgreSQL, on the other hand, is primarily designed to run on a single server, although it can scale reads through techniques like replication. In terms of performance, Hadoop excels at batch processing of big data, while PostgreSQL is better suited to low-latency queries and transactional workloads.

  3. Data Processing Paradigm: Hadoop follows a batch processing paradigm, where data is processed in large batches and results are generated at the end of the processing. This makes it suitable for applications like data mining, log analysis, and machine learning. PostgreSQL, on the other hand, supports real-time data processing and provides features like triggers, stored procedures, and support for complex SQL queries, making it suitable for applications that require immediate response and transactional consistency.

  4. Data Storage and Indexing: Hadoop stores data in a distributed file system and does not provide traditional indexing mechanisms. Instead, it relies on data locality and parallel processing to optimize data retrieval. PostgreSQL, being a relational database, uses indexing techniques like B-trees and hash indexes to provide fast data retrieval based on key values. This makes it more suited for applications that require fast read and write operations on specific data subsets.

  5. Data Consistency and Durability: Hadoop does not provide transactional guarantees out of the box. It focuses on high availability and fault tolerance through data replication and automatic recovery mechanisms. PostgreSQL, being an ACID-compliant database, ensures atomicity, consistency, isolation, and durability, making it suitable for applications that require strict data integrity.

  6. Data Integration and Ecosystem: Hadoop has a rich ecosystem of tools and frameworks that support various data processing and analytics tasks. It integrates well with other big data technologies like Apache Spark, Hive, and Pig. PostgreSQL, on the other hand, has a more traditional ecosystem of tools and frameworks suited to relational data management and analytics. It integrates well with ETL (Extract, Transform, Load) tools and Business Intelligence (BI) systems.

In summary, Hadoop is a scalable and efficient solution for handling large volumes of unstructured and semi-structured data, while PostgreSQL is a reliable and feature-rich relational database system optimized for structured data processing and real-time applications.
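
To make the batch paradigm described in the list above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases in plain Python. It only imitates the MapReduce programming model and does not use Hadoop's API; the sample log lines and function names are made up for illustration.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical log lines standing in for files stored in HDFS.
log_lines = [
    "error disk full",
    "warning disk slow",
    "error network down",
]

word_counts = reduce_phase(shuffle_phase(map_phase(log_lines)))
print(word_counts["error"], word_counts["disk"])
```

On a real cluster the map and reduce functions would run in parallel on different nodes, and results only become available once the whole job finishes, which is why this style suits batch workloads rather than interactive queries.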

Advice on Hadoop and PostgreSQL
Needs advice on ArangoDB and PostgreSQL

Hello All, I'm building an app that will enable users to create documents using the CKEditor or TinyMCE editor. The data is then stored in a database and retrieved to display to the user; these docs can also contain image data. The number of pages generated for a single document can go up to 1000, so by design each page is stored as a separate JSON document. I'm wondering which database is the right one to choose between ArangoDB and PostgreSQL. Your thoughts and advice, please. Thanks, Kashyap

Replies (2)
Recommends MongoDB

Try MongoDB first.

Attila Fulop
Recommends

Which Graph DB features are you planning to use?

Wassim Ben Jdida
Needs advice on Golang, MySQL and PostgreSQL

I am building a fintech startup with a friend, and we decided to use Go for its performance and friendly syntax. We want to know whether we should use a web framework or just the plain net/http library. For the database, we have PostgreSQL and MySQL on the table, and we want to know which one is better, from community support to the best open-source implementation.

Replies (3)
Shubham Chadokar
Software Engineer Specialist at Kaleyra · 7 upvotes · 75.6K views
Recommends Golang and PostgreSQL

Postgres is a better option to consider compared to MySQL. With respect to performance, Postgres has an edge over MySQL. Don't use net/http for production. Read this: https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779 I prefer gorilla/mux as it is simple and provides all the basic features. Other libraries seem like overhead if you just need basic routing.

Carlos Iglesias
Recommends

MySQL and PostgreSQL are both great, with strong support and communities. Either will be good. PostgreSQL has a few small advantages.

Rafael Breno de Vasconcellos Santos
Recommends Elixir

I recommend Elixir. Even though I work at a fintech that uses Go, Elixir is a functional programming language, and in my opinion immutability is an important consideration when working with money.

Needs advice on Hadoop, MarkLogic and Snowflake

For a property and casualty insurance company, we currently use MarkLogic and Hadoop for our raw data lake. We are trying to figure out how Snowflake fits into the picture. Does anybody have good suggestions or best practices for when to use each platform and what data to store in MarkLogic versus Snowflake versus Hadoop? Or are all three of these platforms redundant with one another?

Replies (1)
Ivo Dinis Rodrigues
none of you bussines at Marklogic · 1 upvotes · 18K views
Recommends

As I see it, you can use Snowflake as your data warehouse and MarkLogic as a data lake. You can add all your raw data to ML and curate it into a company data model, then supply that to Snowflake. You could try to implement the DW functionality on MarkLogic, but it will just cost you a lot of time. If you are using the AWS version of Snowflake, you can use the ML Spark connector to access the data. As an extra, you can also use ML as an operational reporting system if you pair it with a reporting tool like Power BI. With extra APIs you can also provide data to other systems with ML as the source.

Needs advice on MongoDB, MySQL and PostgreSQL

I'm planning to build a freelance marketplace website using tools like Next.js, Firebase Authentication, and Node.js, but I need to know which type of database offers good performance and powerful features. I'm trying to figure out what the best stack is for this project. If anyone has advice, I’d love to hear more details. Thanks.

Replies (3)
Reza Malek
at Meam Software Engineering Group · 9 upvotes · 170.7K views
Recommends PostgreSQL

Postgres and MySQL are very similar, but Mongo differs in terms of storage type and where it sits with respect to the CAP theorem. For your requirements, I prefer Postgres (or MySQL) over MongoDB. Mongo gives you no schema, which is not always good. On the other hand, it is more common in the Node.js community, so you may find more articles about Node-Mongo stuff. I suggest staying with an RDBMS if possible.

Recommends MySQL and PostgreSQL

This comes down partly to experience. PostgreSQL is fine. You can use either a relational table structure or a JSON structure.

Ruslan Rayanov
Recommends MySQL

We have a ready-made engine for the online exchange and marketplace. To customize it, you only need to know SQL. Connecting any database is not a problem. https://falconspace.site/list/solutions

Dennis Kraaijeveld
Needs advice on ExpressJS, MongoDB and PostgreSQL

For learning purposes, I am trying to design a dashboard that displays the total revenue from all connected webshops/marketplaces, displaying incoming orders, total orders, etc.

So I will need to get the data (using Node backend) from the Shopify and marketplace APIs, storing this in the database, and get the data from the back end.

My question is:

What kind of database should I use? Is MongoDB fine for storing this kind of data? Or should I go with a SQL database?

Replies (3)
Arash JalaliGhalibaf
Software Engineer at Cafe Bazaar · 10 upvotes · 231K views
Recommends PostgreSQL

Postgres is a solid database with a strong track record. On the relational side of database design, I see Postgres as a given; the arguments and conflicts come in when talking about NoSQL data types. The truth is that jsonb in Postgres is efficient and gives good performance and storage. Compared with MongoDB on the same resources (such as RAM and CPU), and with better tools and community, I think you should go for Postgres and use jsonb for some of the data. All in all, don't use a NoSQL database just because your data type matches that tech; you can have both SQL and NoSQL at the same time.

Recommends MongoDB

I have found MongoDB easier to work with. Postgres, and SQL in general, are harder to work with in my experience. While Postgres does provide data consistency, MongoDB provides flexibility. I've found the MongoDB ecosystem to be really great with a good community. I've worked with MongoDB in production and it's been great. I really like the aggregation system and using query operators such as $in, $pull, $push.

While my opinion may be unpopular, I have found MongoDB really great for relational data, using aggregations from a code perspective. In general, data types are also more flexible with MongoDB.

Luciano Bustos
Senior Software Developer · 1 upvotes · 220.9K views
Recommends PostgreSQL

I would use PostgreSQL because it has more powerful features for data aggregation and views. The raw data from Shopify and the other sources could be stored as is, and views could then produce the different kinds of reports, unless you want to create those aggregations/views in Node.js code. HTH
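
As a rough illustration of the "store the raw data, report through views" idea, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in so it runs without a server; the table, column, and shop names are hypothetical, and the `CREATE VIEW` statement itself is also valid PostgreSQL.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id    INTEGER PRIMARY KEY,
        shop  TEXT NOT NULL,   -- hypothetical source, e.g. a webshop name
        total REAL NOT NULL    -- order value
    );
    INSERT INTO orders (shop, total) VALUES
        ('shopify', 40.0), ('shopify', 60.0), ('marketplace', 25.0);

    -- The report lives in the database as a view over the stored orders.
    CREATE VIEW revenue_per_shop AS
        SELECT shop, COUNT(*) AS order_count, SUM(total) AS revenue
        FROM orders
        GROUP BY shop;
""")

report = conn.execute(
    "SELECT shop, order_count, revenue FROM revenue_per_shop ORDER BY shop"
).fetchall()
print(report)
```

The dashboard backend then only needs to select from `revenue_per_shop`; the aggregation logic stays in one place instead of being repeated in application code.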

Krunal Shah
Full Stack Developer at Infynno Solutions · 7 upvotes · 243.1K views
Needs advice on MongoDB and PostgreSQL

I want to store the data retrieved from multiple APIs and perform some analytics on it. The data stored in the DB will never or hardly ever change. At first I thought it would be better to retrieve the data and create table columns for it, but some data might have different columns than others. So I thought about storing the JSON response from each API directly in the table and using it from there. Which database will be the better choice, PostgreSQL or MongoDB?

Replies (6)
Nikhil Gurnani
Sr. Backend Engineer at Grappus · 8 upvotes · 235.4K views
Recommends MongoDB

Hey Krunal, your requirement sounds pretty clear and specific about what you want to do with that data. My recommendation would be to use MongoDB. Since schema-less I/O is faster in MongoDB, your general speed of reading from and writing to the database would be quick. Additionally, the aggregation framework is very powerful with large data, so that is also something you can use in computing your analytics.

Maxim Ryakhovskiy
Recommends MongoDB and Mongoose

I suggest you go with MongoDB, because it is schema-less, i.e., it permits you to easily change the shape of a collection. If you want to add a field, it can be done without much effort. Moreover, MongoDB can deal with more types of data, since data is stored as key-value pairs. I do not know what kind of analysis you are going to do, but NoSQL is not the best choice if you are going to use complex queries. In addition, if you are working with a huge amount of data and you are interested in optimising performance, I suggest PostgreSQL. Since you are talking about APIs and JSON, I guess that you may be using Node.js to fetch from the API. I suggest you try Mongoose, which facilitates the use of MongoDB with Node.js.

Tarun Batra
Senior Software Developer at Okta · 3 upvotes · 231.4K views
Recommends MongoDB and PostgreSQL

Looks like the use case is to store JSON data. MongoDB and Postgres differ in many aspects, like scaling and consistency. Postgres now has excellent JSON support, with the power of SQL. MongoDB is good at handling schemaless data. However, in this case it seems these differences don’t matter that much. I’d recommend you go with what you are most comfortable with.
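
For readers wondering what "JSON support with the power of SQL" can look like, here is a minimal sketch of storing API responses with differing fields in one table and aggregating over them. It assumes Python's sqlite3 module with SQLite's JSON functions available, used here purely as a stand-in: in PostgreSQL the column would be `jsonb` and the lookups would use the `->>` operator. The table name and payloads are hypothetical.

```python
import json
import sqlite3

# Assumption: SQLite (with its JSON functions) stands in for PostgreSQL's jsonb.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_responses (id INTEGER PRIMARY KEY, source TEXT, payload TEXT)"
)

# Two hypothetical API responses whose fields do not match exactly.
responses = [
    ("api_a", {"user": "ann", "amount": 12.5}),
    ("api_b", {"user": "bob", "amount": 7.5, "currency": "EUR"}),
]
conn.executemany(
    "INSERT INTO api_responses (source, payload) VALUES (?, ?)",
    [(source, json.dumps(body)) for source, body in responses],
)

# Analytics over a field that every payload shares.
total_amount = conn.execute(
    "SELECT SUM(json_extract(payload, '$.amount')) FROM api_responses"
).fetchone()[0]

# A field present in only some payloads comes back as NULL (None) for the others.
currencies = conn.execute(
    "SELECT source, json_extract(payload, '$.currency') FROM api_responses ORDER BY id"
).fetchall()
print(total_amount, currencies)
```

The point of the sketch is that varying response shapes do not force a schema change: the raw JSON is stored as is, and fields are pulled out at query time.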

Bob Bass
President & Full Stack Enginee at Narro, LLC · 3 upvotes · 231.3K views
Recommends MySQL and PostgreSQL

This is largely a matter of opinion. I see that someone else responded and recommended MongoDB but since you are doing data analytics, I highly recommend you go with SQL. You're going to have a really hard time normalizing the data when you can't manipulate relationships and bulk edit with a nice update query.

I'm much more experienced with MySQL than any other database and I am having a hard time getting on board with noSQL entirely because it's really hard to query complex data with relationships using noSQL. I'm using Firestore with one of my apps and MongoDB with another app but they both use MySQL for the heavy lifting and then a document database for things like permissions, caching, etc.

It sounds like the type of problem you need to reverse engineer. I'm sure you can imagine what the data sets would look like if you use MongoDB or Postgres. I suspect that putting in a little more work up front will pay high dividends in productivity once the data is normalized.

Again - it's largely a matter of preference but I prefer SQL almost every time.

Luiz H. Rapatão
Tech Lead at rapatao.com · 3 upvotes · 231.4K views
Recommends MongoDB

I don't have a firm opinion regarding your use case. I tend to pick MongoDB since it is schemaless, which avoids null columns whose usage you cannot always predict (it depends on the source of the data). The only drawback I can think of is query complexity in MongoDB; it is sometimes a bit tricky compared to traditional SQL queries.

Recommends MongoDB

MongoDB should be better for unstructured/less structured data.

Needs advice on MongoDB and PostgreSQL

I need urgent advice from you all! I am making a web-based food ordering platform which includes 3 different ordering methods (dine-in using QR code scanning, takeaway, and home delivery) and a table reservation system. We are using React for the front-end, and I need your advice on whether I should use NestJS or ExpressJS for the backend. And regarding the database, which should I use, MongoDB or PostgreSQL? Which combination will be better? PS. We want to follow the microservice architecture, as scalability, reliability, and usability are the most important non-functional requirements. Expert advice is needed, please. A load of thanks in advance. Kind Regards, Miqdad

Replies (3)
Stephen Badger | Vital Beats
Senior DevOps Engineer at Vital Beats · 9 upvotes · 250.4K views
Recommends PostgreSQL

I can't speak for the NestJS vs ExpressJS discussion, but I can give a viewpoint on databases.

The main thing to consider around database choice is what "shape" the data will be in, and the kind of read/write patterns you expect of that data. The blog example shows up so much for DBMSs like MongoDB because it showcases what NoSQL / document storage is very scalable and performant at: mostly isolated documents with a few views / ways to order and filter them. In your case, I can imagine a number of "relations" already, which suggests a more traditional SQL solution would work well: you have restaurants, they have maybe a few menus (regular, gluten-free etc.), with menu items in them, which have different prices over time (25% discount on Christmas food just after Christmas, 50% off pizzas on Wednesdays). Then there's a whole different set of "relations" for people ordering, like showing them past orders, which need to refer to the restaurant etc., and credit card transaction information for refunds etc. That to me suggests PostgreSQL, which will scale quite well if your database design is okay.

PostgreSQL also offers you some extensions, which are just amazing for your use case. PostGIS (https://postgis.net/), for example, will let you query for restaurants based on location, without the big cost that comes from constantly using something like the Google Maps API to work out which restaurants are near someone ordering. Partitioning and window functions will be great for your own internal use too, like answering questions such as "What types of takeaways perform the best for us, Italian, Mexican?" or, in combination with PostGIS, "What kind of takeaways do we need to market to, to improve our selection?".
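
To give a feel for the window-function part, here is a minimal sketch that ranks restaurants by revenue within each cuisine. It is only an illustration: sqlite3 is assumed as a stand-in so the example runs without a server (it needs an SQLite version with window-function support), and the schema, names, and figures are hypothetical. The query itself uses standard SQL that PostgreSQL also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurants (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        cuisine TEXT NOT NULL
    );
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        restaurant_id INTEGER NOT NULL REFERENCES restaurants(id),
        total         REAL NOT NULL
    );
    INSERT INTO restaurants VALUES
        (1, 'Roma', 'Italian'), (2, 'Napoli', 'Italian'), (3, 'Azteca', 'Mexican');
    INSERT INTO orders (restaurant_id, total) VALUES
        (1, 30.0), (1, 20.0), (2, 80.0), (3, 45.0);
""")

ranking = conn.execute("""
    WITH revenue AS (
        -- Total revenue per restaurant.
        SELECT r.cuisine, r.name, SUM(o.total) AS revenue
        FROM restaurants AS r
        JOIN orders AS o ON o.restaurant_id = r.id
        GROUP BY r.id, r.cuisine, r.name
    )
    SELECT cuisine,
           name,
           revenue,
           -- Window function: rank restarts at 1 for every cuisine.
           RANK() OVER (PARTITION BY cuisine ORDER BY revenue DESC) AS rank_in_cuisine
    FROM revenue
    ORDER BY cuisine, rank_in_cuisine
""").fetchall()

for row in ranking:
    print(row)
```

Unlike a plain `GROUP BY`, the window function keeps one row per restaurant while still comparing it against the others in the same cuisine.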

While these things can all be implemented in MongoDB, you tend to lose some of the convenience of ACID or have to deal with things like eventual consistency, which requires more thinking on the part of your engineers. PostgreSQL offers decent (if more complex) scalability and redundancy solutions, and is honestly very well proven, with plenty of documentation on optimising queries.
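
The "convenience of ACID" can be sketched as follows: two updates either both take effect or neither does. This is a minimal example under stated assumptions, again using sqlite3 as a stand-in for a transactional SQL database; the `accounts` table, the account names, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("customer", 50), ("restaurant", 0)]
)
conn.commit()

def transfer(conn, payer, payee, amount):
    """Move `amount` from payer to payee atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, payee),
            )
            # This debit violates the CHECK constraint if the payer lacks funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, payer),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = transfer(conn, "customer", "restaurant", 30)    # enough funds
second = transfer(conn, "customer", "restaurant", 100)  # would overdraw
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(first, second, balances)
```

In the failed transfer the credit to the restaurant has already been executed when the debit is rejected, yet the rollback undoes it, so no money appears out of nowhere. Without transactions, the application code would have to detect and repair that half-finished state itself.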

Anis Zehani
Recommends MongoDB

Hello, I build microservice systems using Angular and Spring (Java), so I can't help with your backend choice, but I definitely advise you to use a NoSQL database: MongoDB of course, or even Cassandra if you're looking for extreme scalability with no single point of failure. In my view, NoSQL is much faster than SQL (in your case PostgreSQL). Everything you want to do with SQL can also be done with NoSQL (not the other way round, of course). I also advise you to use Docker containers with Kubernetes to orchestrate them if you need scalability and replication; that way your app can support auto-scaling (in case your user numbers go up). Best of luck

Carlos Iglesias
Recommends

About PostgreSQL vs MongoDB, the short answer: both are great. Choose what you like the most. Only if you expect millions of users would I lean towards MongoDB.

Dimelo Waterson
Needs advice on MySQL, PostgreSQL and SQLite

I need to add a DBMS to my stack, but I don't know which. I'm tempted to learn SQLite since it would be useful to me with its focus on local access without concurrency. However, doing so feels like I would be defeating the purpose of trying to expand my skill set since it seems like most enterprise applications have the opposite requirements.

To be able to apply what I learn to more projects, what should I try to learn? MySQL? PostgreSQL? Something else? Is there a comfortable middle ground between high applicability and ease of use?

Replies (3)
Recommends SQLite

You can easily start with SQLite. It's really easy to start up since it is self-contained and doesn't require you to install any additional software. It has interfaces in almost any language and also GUIs. Start by learning SQL basics and simpler data models and structures. There are many tutorials, some of them available on the official website. From there you can easily migrate to another database. MySQL could be next, since it's easier to learn at first and has more resources available. PostgreSQL is less widespread, more challenging, and has fewer resources, but once you have some experience with MySQL it is really easy to learn as well. All these technologies are widely used across the industry, so you won't make a wrong decision with any of them.

Stephen Badger | Vital Beats
Senior DevOps Engineer at Vital Beats · 6 upvotes · 265.9K views

A question you might want to think about is "What kind of experience do I want to gain by using a DBMS?". If your aim is to have experience with SQL and any related libraries and frameworks for your language of choice (Python, I think?), then it doesn't matter too much which you pick. As others have said, SQLite would offer you the ability to very easily get started, and would give you a reasonably standard (if a little basic) SQL dialect to work with.

If your aim is actually to have a bit of "operational" experience, in terms of things like what command line tools might be available as standard for the DBMS, understanding how the DBMS handles multiple databases, when to use multiple schemas vs multiple databases, some basic privilege management etc., then I would recommend PostgreSQL. SQLite's simplicity actually avoids most of these experiences, which is not helpful to you if that is what you hope to learn. MySQL has a few "quirks" in how it manages things like multiple databases, which may lead you to make worse decisions if you try to carry your experience over to a different DBMS, especially in bigger enterprise roles. PostgreSQL is kind of a happy middle ground here, with the ability to start PostgreSQL servers via Docker or docker-compose making the actual day-to-day management pretty easy, while still giving you experience of the kinds of considerations I have listed above.

At Vital Beats we make use of PostgreSQL, largely because it offers us a happy balance between good management and backup of data, and good standard command line tools, which is essential for us where we are deploying our solutions within Kubernetes / docker, and so more graphical tools are not always appropriate for us. PostgreSQL is also pretty universally supported in terms of language libraries and frameworks, without having to make compromises on how we want to store and layout our data.

Julien DeFrance
Principal Software Engineer at Tophatter · 1 upvotes · 257.4K views
Recommends MySQL

MySQL is very popular, easy to install, and also available as a managed service across most popular cloud offerings. The support and default tooling (such as MySQL Workbench) is certainly a little more baked than what you'll find for Postgres.

https://dev.mysql.com/downloads/workbench/

Needs advice on Hadoop, InfluxDB and Kafka

I have a lot of data that's currently sitting in a MariaDB database: a lot of tables that weigh 200 GB with indexes. Most of the large tables have a date column which is always filtered, but there are usually 4-6 additional columns that are filtered and used for statistics. I'm trying to figure out the best tool for storing and analyzing large amounts of data, preferably self-hosted or a cheap solution. The current problem I'm running into is speed. Even with pretty good indexes, if I'm trying to load a large dataset, it's pretty slow.

Replies (1)
Recommends Druid

Druid could be an amazing solution for your use case. My understanding and assumption is that you are looking to export your data from MariaDB for an analytical workload. Druid can be used as a time-series database as well as a data warehouse, and it can be scaled horizontally as your data grows. It's pretty easy to set up in any environment (cloud, Kubernetes, or a self-hosted *nix system). Some important features make it a good fit for your use case:

  1. It can do streaming ingestion (Kafka, Kinesis) as well as batch ingestion (files from local and cloud storage, or databases like MySQL and Postgres). In your case that would be MariaDB, which uses the same drivers as MySQL.

  2. It is a columnar database, so you can query just the fields that are required, which makes queries faster.

  3. Druid partitions data based on time, and time-based queries are significantly faster than in traditional databases.

  4. You can scale up or down by just adding or removing servers, and Druid automatically rebalances. The fault-tolerant architecture routes around server failures.

  5. It provides a centralized UI to manage data sources, queries, and tasks.
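
The benefit of time-based partitioning for a workload that always filters on a date column can be sketched in a few lines. This is a toy model written for illustration only, not how Druid is implemented: the `TimePartitionedStore` class, its methods, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from datetime import date

class TimePartitionedStore:
    """Toy store that keeps rows in one bucket (partition) per day."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, day, row):
        self.partitions[day].append(row)

    def query(self, start, end, predicate=lambda row: True):
        """Return matching rows and the number of partitions that were scanned."""
        scanned = 0
        matches = []
        for day, rows in self.partitions.items():
            if start <= day <= end:  # pruning: days outside the range are skipped
                scanned += 1
                matches.extend(row for row in rows if predicate(row))
        return matches, scanned

store = TimePartitionedStore()
for day_of_month in range(1, 31):
    store.insert(date(2024, 1, day_of_month), {"status": "ok", "bytes": day_of_month})

rows, partitions_scanned = store.query(date(2024, 1, 10), date(2024, 1, 12))
print(len(rows), partitions_scanned, len(store.partitions))
```

Only 3 of the 30 daily partitions are read for the three-day query; the other filter columns then only need to be checked inside those partitions, which is the intuition behind the speed-up for date-filtered analytics.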

Decisions about Hadoop and PostgreSQL
Micha Mailänder
CEO & Co-Founder at Dechea · 14 upvotes · 76.6K views

Fauna is a serverless database where you store data as JSON. It also has a built-in HTTP GraphQL interface with a full authentication and authorization layer, which means you can skip your backend and call it directly from the frontend. Because you can write data transformation functions within Fauna in its own language, FQL, we're getting a blazing fast application.

Fauna also takes care of scaling and backups (all data is sharded across three different locations around the globe). That means we can fully focus on writing business logic and no longer have to worry about infrastructure.


As an advanced user, I prefer Postgres over MySQL. MySQL was the first database I learned from my institute. I always have to undergo that infamous date and time dilemma many Java devs know. Both are adequate for a small project. When I worked on a project with a date and time-intensive data, I spent a lot of time dealing with the conversion and transition, leaving me frustrated. I tried Postgres to see how well it can perform. To my surprise, all became a breeze, and the transactions were faster too. I've been using Postgres ever since, and no more dilemma.


We started using PostgreSQL because there's no need to upgrade to an enterprise plan to access certain essential features. Postgres is essentially plug-and-play; you download it, install it, and there you go!

Another benefit of using Postgres is that you get to use SQL (Structured Query Language)—which isn't for everyone, but I enjoy how flexible and versatile it is.

Postgres also has point-in-time recovery, which you can export wherever you want—This means you can restore data from any given point in time. With this in mind, if you delete something accidentally, you can go back in time and grab said data without restoring the whole database.

Not to mention Postgres is remarkably fast with several thorough benchmarks comparing it to MongoDB, where Postgres mostly came out on top.

Tjerk W
Founder at Impulz Technologies · 13 upvotes · 67.4K views

As a startup, managing my own database, backups, and even the schemas/migrations is all overhead. On top of that, I needed ways to write to the database from both the backend and the frontend. With Firebase this is possible, and it saved us some time: some API calls were not needed because I could fetch data directly in the frontend.

Offline support and realtime data updates are also supported out of the box. No need to write your own websockets.

Once the startup grows, moving to a different, relational database might make sense. But in a pre-product-market-fit startup, Firebase is a good, and cheaper, fit!

The pricing model of Firebase Firestore is a bit risky, but it saves a lot of time in getting to market quickly.


We will be getting data in the form of CSVs. Because the data in a CSV is highly structured, it will be easy to create schemas, and it fits a SQL database better than NoSQL. For a SQL database, both MySQL and Postgres are very viable options. Both are highly performant, definitely enough for our application, even if we needed to scale drastically. Postgres does include some extra features over MySQL, such as table inheritance and function overloading. However, the extra features are not advantageous to us given our database use case. Because both databases seemed to suit our use case perfectly, we chose MySQL simply because it is the more familiar tech within our team.
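
A minimal sketch of why fixed-structure CSV data maps so directly onto a SQL schema: each CSV column becomes a typed table column, and the rows can be loaded and queried right away. The CSV content, table, and column names below are hypothetical, and sqlite3 is assumed as a stand-in for MySQL or PostgreSQL so the example runs without a server.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be read from a file.
csv_text = "sensor_id,reading,unit\n1,20.5,C\n2,21.0,C\n1,19.5,C\n"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " sensor_id INTEGER NOT NULL,"
    " reading   REAL NOT NULL,"
    " unit      TEXT NOT NULL)"
)

# DictReader yields one dict per row, keyed by the header line,
# which matches the named placeholders in the INSERT statement.
reader = csv.DictReader(io.StringIO(csv_text))
conn.executemany(
    "INSERT INTO readings (sensor_id, reading, unit)"
    " VALUES (:sensor_id, :reading, :unit)",
    reader,
)

averages = conn.execute(
    "SELECT sensor_id, AVG(reading) FROM readings GROUP BY sensor_id ORDER BY sensor_id"
).fetchall()
print(averages)
```

Both MySQL and PostgreSQL also ship their own bulk-loading commands for CSV files, which would be the more usual route for large imports.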


One of our biggest technical pillars is to "let the pros manage it", so we settled on Heroku PostgreSQL to manage our SQL cluster. We can take advantage of the free tier, and requests will be fast since it is integrated into Heroku. PostgreSQL also supports full-text search, which can come in handy when manually searching through the tables.


All the benefits of relational joins and constraints, with JSON field types in Postgres to allow for flexibility like MongoDB. Objection ORM makes query building seamless and abstracts away a lot of the complexity of SQL queries.

MongoDB tends to get slow with scale and requires a lot of code to maintain consistency across collections as foreign keys and other constraints are harder to implement. PostgreSQL also has a vibrant community with battle tested stability and horizontal scalability when needed.

Nithin f

I was looking into PostgreSQL for a database option solely for the reason that it was popular, had good community support, and was used by many companies planning to develop social media platforms similar to Calosmic.

  1. However, I was very unfamiliar with relational databases and had only gotten acquainted with the basics through technologies like SQLite3.

  2. Furthermore, I had already been using MongoDB, a document-based database, in a previous project so I was looking for options similar to the aforementioned technology.

  3. Last but not least, I wasn't all too into having to manage my database; I wanted to have a place to store my data, and be able to effectively query, and mutate the data without the hassle of learning SQL or maintaining an entire database. I found out about FaunaDB a couple of weeks ago and was very excited about the native GraphQL support, a combination of both document-based and relational database models, and the low-maintenance structure of the database. I am currently experimenting with using FaunaDB in my stack :)

  • One disadvantage I noticed while using FaunaDB and GraphQL is the lack of certain features that one expects when using the latter. Even though FaunaDB has native support for GraphQL it seems as if it's missing numerous features that are commonplace in the language such as unions and interfaces.
Usman Sadiq
Student at University of Toronto · 8 upvotes · 113.2K views
Migrated from PostgreSQL to MongoDB

MongoDB's document-oriented paradigm is nicely suited to the results of our ML model. We felt that this compatibility offered some time savings on figuring out and implementing an extensive data formatting and processing system. MongoDB's flexible schemas (due to it being non-relational) were also attractive as a source of additional agility for our development process. The MongoDB ecosystem also has great GUI tools to simplify testing.


Backend:

  • Considering that our main app functionality involves data processing, we chose Python as the programming language because it offers many powerful math libraries for data-related tasks. We will use Flask for the server due to its good integration with Python. We will use a relational database because it has good performance and we are mostly dealing with CSV files that have a fixed structure. We originally chose SQLite, but after realizing the limitations of file-based databases, we decided to switch to PostgreSQL, which has better compatibility with our hosting service, Heroku.
Pros of Hadoop
  • 39
    Great ecosystem
  • 11
    One stack to rule them all
  • 4
    Great load balancer
  • 1
    Amazon AWS
  • 1
    Java syntax
Pros of PostgreSQL
  • 762
    Relational database
  • 510
    High availability
  • 439
    Enterprise class database
  • 383
    SQL
  • 304
    SQL + NoSQL
  • 173
    Great community
  • 147
    Easy to setup
  • 131
    Heroku
  • 130
    Secure by default
  • 113
    PostGIS
  • 50
    Supports Key-Value
  • 48
    Great JSON support
  • 34
    Cross platform
  • 32
    Extensible
  • 28
    Replication
  • 26
    Triggers
  • 23
    Rollback
  • 22
    Multiversion concurrency control
  • 21
    Open source
  • 18
    Heroku Add-on
  • 17
    Stable, Simple and Good Performance
  • 15
    Powerful
  • 13
    Let's be serious, what other SQL DB would you go for?
  • 11
    Good documentation
  • 8
    Intelligent optimizer
  • 8
    Free
  • 8
    Scalable
  • 8
    Reliable
  • 7
    Transactional DDL
  • 7
    Modern
  • 6
    One-stop solution for all things SQL, no matter the OS
  • 5
    Relational database with MVCC
  • 5
    Faster Development
  • 4
    Developer friendly
  • 4
    Full-Text Search
  • 3
    Free version
  • 3
    Great DB for Transactional system or Application
  • 3
    Relational database
  • 3
    search
  • 3
    Open-source
  • 3
    Excellent source code
  • 2
    Full-text
  • 2
    Text
  • 0
    Native

Cons of Hadoop
    No cons listed yet
Cons of PostgreSQL
    • 10
      Table/index bloat

    What is Hadoop?

    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
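
The "simple programming models" mentioned above usually refers to MapReduce. The following is a minimal single-process sketch of that model in plain Python, counting words across input lines. It does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the map and reduce steps would run in parallel on many machines, and the framework would perform the grouping step.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big cluster", "data on a cluster"]
    print(word_count(sample))
```

Because each line is mapped independently and each key is reduced independently, both phases can be spread across machines, which is what lets this style of job scale out.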

    What is PostgreSQL?

    PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

    What are some alternatives to Hadoop and PostgreSQL?
    Cassandra
    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
    MongoDB
    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
    Elasticsearch
    Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).
    Splunk
    Splunk provides a platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data.
    Snowflake
    Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a data warehouse as a service running on Amazon Web Services (AWS), with no infrastructure to manage and no knobs to turn.