What is MongoDB?
What is MySQL?
Want advice about which of these to choose?Ask the StackShare community!
Sign up to add, upvote and see more prosMake informed product decisions
Sign up to get full access to all the companiesMake informed product decisions
Sign up to get full access to all the tool integrationsMake informed product decisions
Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.
We set ourselves the following criteria for the optimal tool that would do this job: - The data replication must be near real-time, yet it should NOT impact the production database - The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient
Based on the above criteria, we selected the following tools to perform the end to end data replication:
We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using stitch triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.
We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB stitch offers integration with AWS services.
In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.
Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pickup the data payload & mirror the DB changes on to the target Data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy . We deployed this micro-service as AWS Lambda with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda service is dumb-easy.
In the end, we got to implement a highly scalable near realtime Change Data Replication service that "works" and deployed to production in a matter of few days!
Database is at the heart of any technology stack. It is no wonder we spend a lot of time choosing the right database before we dive deep into product building.
When we were faced with the question of what database to choose, we set the following criteria: The database must (1) Have a very high transaction throughput. We wanted to err on the side of "reads" but not on the "writes". (2) be flexible. I.e. be adaptive enough to take - in data variations. Since we are an early-stage start-up, not everything is set in stone. (3) Fast & easy to work with (4) Cloud Native. We did not want to spend our time in "ANY" infrastructure management.
Based on the above, we picked PostgreSQL and MongoDB for evaluation. We tried a few iterations on hardening the data model with PostgreSQL, but realised that we can move much faster by loosely defining the schema (with just a few fundamental principles intact).
Thus we switched to MongoDB. Before diving in, we validated a few core principles such as: (1) Transaction guarantee. Until 3.6, MongoDB supports Transaction guarantee at Document level. From 4.0 onwards, you can achieve transaction guarantee across multiple documents.
(2) Primary Keys & Indexing: Like any RDBMS, MongoDB supports unique keys & indexes to ensure data integrity & search ability
(3) Ability to join data across data sets: MongoDB offers a super-rich aggregate framework that enables one to filter and group data
(4) Concurrency handling: MongoDB offers specific operations (such as findOneAndUpdate), which when coupled with Optimistic Locking, can be used to achieve concurrency.
Above all, MongoDB offers a complete no-frills Cloud Database as a service - MongoDB Atlas. This kind of sealed the deal for us.
Looking back, choosing MongoDB with MongoDB Atlas was one of the best decisions we took and it is serving us well. My only gripe is that there must be a way to scale-up or scale-down the Atlas configuration at different parts of the day with minimal downtime.
Initial storage was traditional MySQL. The pace of changes during a startup mode made it very difficult to have a clean and consistent schema. Large portions ended up as unstructured data stuffed into CLOBs and BLOBs.
Moving to MongoDB definitely made this part much easier.
Accessing data for analysis is a little bit of a challenge - especially for people coming from the world of SQL Workbench. But with tools like Exploratory this is becoming less of a problem.
We use postgresql for the merge between sql/nosql. A lot of our data is unstructured JSON, or JSON that is currently in flux due to some MVP/interation processes that are going on. PostgreSQL gives the capability to do this.
At the moment PostgreSQL on amazon is only at 9.5 which is one minor version down from support for document fragment updates which is something that we are waiting for. However, that may be some ways away.
Other than that, we are using PostgreSQL as our main SQL store as a replacement for all the MSSQL databases that we have. Not only does it have great support through RDS (small ops team), but it also has some great ways for us to migrate off RDS to managed EC2 instances down the line if we need to.
PostgreSQL combines the best aspects of traditional SQL databases such as reliability, consistent performance, transactions, querying power, etc. with the flexibility of schemaless noSQL systems that are all the rage these days. Through the powerful JSON column types and indexes, you can now have your cake and eat it too! PostgreSQL may seem a bit arcane and old fashioned at first, but the developers have clearly shown that they understand databases and the storage trends better than almost anyone else. It definitely deserves to be part of everyone's toolbox; when you find yourself needing rock solid performance, operational simplicity and reliability, reach for PostgresQL.
Relational data stores solve a lot of problems reasonably well. Postgres has some data types that are really handy such as spatial, json, and a plethora of useful dates and integers. It has good availability of indexing solutions, and is well-supported for both custom modifications as well as hosting options (I like Amazon's Postgres for RDS). I use HoneySQL for Clojure as a composable AST that translates reliably to SQL. I typically use JDBC on Clojure, usually via org.clojure/java.jdbc.
Used MongoDB as primary database. It holds trip data of NYC taxis for the year 2013. It is a huge dataset and it's primary feature is geo coordinates with pickup and drop off locations. Also used MongoDB's map reduce to process this large dataset for aggregation. This aggregated result was then used to show visualizations.
MongoDB fills our more traditional database needs. We knew we wanted Trello to be blisteringly fast. One of the coolest and most performance-obsessed teams we know is our next-door neighbor and sister company StackExchange. Talking to their dev lead David at lunch one day, I learned that even though they use SQL Server for data storage, they actually primarily store a lot of their data in a denormalized format for performance, and normalize only when they need to.
We are used MySQL database to build the Online Food Ordering System
- Its best support normalization and all joins ( Restaurant details & Ordering, customer management, food menu, order transaction & food delivery).
- Best for performance and structured the data.
- Its help to stored the instant updates received from food delivery app ( update the real-time driver GPS location).
1.It's very popular. Heared about it in Database class 2. The most comprehensive set of advanced features, management tools and technical support to achieve the highest levels of MySQL scalability, security, reliability, and uptime. 3. MySQL is an open-source relational database management system. Its name is a combination of "My", the name of co-founder Michael Widenius's daughter, and "SQL", the abbreviation for Structured Query Language.
We use MySQL and variants thereof to store the data for our projects such as the community. MySQL being a well established product means that support is available whenever it is required along with an extensive list of support articles all over the web for diagnosing issues. Variants are also used where needed when, for example, better performance is needed.
Nearly all of our backend storage is on MongoDB. This has also worked out pretty well. It's enabled us to scale up faster/easier than if we had rolled our own solution on top of PostgreSQL (which we were using previously). There have been a few roadbumps along the way, but the team at 10gen has been a big help with thing.
PostgreSQL is responsible for nearly all data storage, validation and integrity. We leverage constraints, functions and custom extensions to ensure we have only one source of truth for our data access rules and that those rules live as close to the data as possible. Call us crazy, but ORMs only lead to ruin and despair.
MySQL is a freely available open source Relational Database Management System (RDBMS) that uses Structured Query Language (SQL). SQL is the most popular language for adding, accessing and managing content in a database. It is most noted for its quick processing, proven reliability, ease and flexibility of use.
Tried MongoDB - early euphoria - later dread. Tried MySQL - not bad at all. Found PostgreSQL - will never go back. So much support for this it should be your first choice. Simple local (free) installation, and one-click setup in Heroku - lots of options in terms of pricing/performance combinations.
I am not using this DB for blog posts or data stored on the site. I am using to track IP addresses and fully qualified domain names of attacker machines that either posted spam on my website, pig flooded me, or had more that a certain number of failed SSH attempts.
We are testing out MongoDB at the moment. Currently we are only using a small EC2 setup for a delayed job queue backed by
agenda. If it works out well we might look to see where it could become a primary document storage engine for us.