ClickHouse vs MongoDB: What are the differences?
ClickHouse and MongoDB are two popular database management systems. Let's explore the key differences between them.
Data Model: ClickHouse is a columnar database that stores data in columns rather than rows. It is optimized for analytical queries and is well-suited for handling large volumes of data. On the other hand, MongoDB is a document-oriented database that stores data in flexible JSON-like documents. It is designed for high availability and scalability, making it suitable for a wide range of applications.
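To make the storage difference concrete, here is a minimal Python sketch of the same three records held record by record (as a document store would) and column by column (as a columnar store would). It is not either database's actual implementation, and the field names and values are invented for illustration.

```python
# Illustrative only: the field names and values are invented for this sketch.
rows = [
    {"user": "ann", "country": "DE", "amount": 12.5},
    {"user": "bob", "country": "US", "amount": 7.0},
    {"user": "cat", "country": "DE", "amount": 3.5},
]

# Row/document layout: each record is stored together, so summing one
# field still walks every whole record.
total_from_rows = sum(record["amount"] for record in rows)

# Columnar layout: each field is stored as its own contiguous array.
columns = {
    "user": [r["user"] for r in rows],
    "country": [r["country"] for r in rows],
    "amount": [r["amount"] for r in rows],
}

# An analytical aggregate reads only the one column it needs.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both layouts give the same answer; the difference is how much data has to be read to get it, which is why the columnar layout tends to suit wide analytical scans.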
Query Language: ClickHouse uses its own query language called ClickHouse SQL, which is based on SQL but optimized for analytical queries. It supports a wide range of SQL functions and aggregations. MongoDB, on the other hand, uses a query language called MongoDB Query Language (MQL), which is specifically designed for working with JSON-like documents. It supports flexible queries and powerful aggregation pipelines.
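As a rough illustration of the two query styles, the sketch below writes the same question ("total amount per country for orders over 5, largest first") once as SQL text and once as a MongoDB aggregation pipeline. The `orders` table/collection and its fields are hypothetical, and nothing here is sent to a database; a plain-Python function shows what both queries are meant to compute.

```python
from collections import defaultdict

# Hypothetical table/collection "orders" with fields "country" and "amount".
clickhouse_sql = """
SELECT country, sum(amount) AS total
FROM orders
WHERE amount > 5
GROUP BY country
ORDER BY total DESC
"""

# The same question as a MongoDB aggregation pipeline (a list of stages).
mongodb_pipeline = [
    {"$match": {"amount": {"$gt": 5}}},
    {"$group": {"_id": "$country", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]


def totals_by_country(rows):
    """Plain-Python equivalent of both queries, for reference."""
    totals = defaultdict(float)
    for row in rows:
        if row["amount"] > 5:  # WHERE / $match
            totals[row["country"]] += row["amount"]  # GROUP BY / $group
    # ORDER BY total DESC / $sort
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


sample = [
    {"country": "DE", "amount": 12.5},
    {"country": "US", "amount": 7.0},
    {"country": "DE", "amount": 3.5},
]
print(totals_by_country(sample))
```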
Data Replication: ClickHouse supports synchronous and asynchronous data replication mechanisms, allowing data to be replicated across multiple servers for high availability and fault tolerance. It also supports sharding for distributing data across multiple servers. MongoDB also supports replication and sharding for scalability and high availability. It uses a replica set architecture for replication and a sharded cluster architecture for sharding.
Indexing: ClickHouse's MergeTree family of table engines stores data sorted by a primary key and uses a sparse primary index, which is optimized for efficient data compression and fast analytical queries. It also supports secondary (data-skipping) indices for filtering on specific columns. MongoDB, on the other hand, supports various types of indexes, including single field, compound, multi-key, geospatial, and text indexes. These indexes improve query performance and enable efficient data retrieval.
Durability and Atomicity: ClickHouse is optimized for fast data ingestion and analytical queries but does not provide full ACID (Atomicity, Consistency, Isolation, Durability) compliance. It is not designed around general-purpose transactions or rollback mechanisms. MongoDB, on the other hand, guarantees atomicity for operations on a single document, and since version 4.0 it also supports multi-document transactions, which allow multiple operations to be executed atomically and provide data consistency and durability.
Scalability: ClickHouse is designed to scale horizontally by adding more servers to a cluster. It can handle large-scale distributed analytics workloads and easily scale to petabytes of data. MongoDB also supports horizontal scalability through sharding, allowing data to be distributed across multiple servers. It can handle high write and read loads and scale to handle large datasets.
In summary, ClickHouse is a columnar database optimized for analytical queries, while MongoDB is a document-oriented database suited for flexible data storage and high availability. ClickHouse excels in handling large-scale analytics workloads, while MongoDB provides rich querying capabilities and ACID compliance.
Hello, I am developing a new project with an internal chat between users. Also, there are complex relationships between the other project entities, but I would like to build something scalable and fast, and right now I am designing the data model. What kind of database would you recommend to manage all application data? Relational like MySQL, non-relational like MongoDB, or a mixed one? Thank you
In MongoDB, a write operation is atomic on the level of a single document, so it's harder to deal with consistency without transactions.
MongoDB supports horizontal scaling through sharding, distributing data across several machines and facilitating high-throughput operations with large sets of data. ... Sharding allows you to add additional instances to increase capacity when required.
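The idea of distributing data across machines can be sketched as a function that maps a shard-key value to a shard. The snippet below is a simplified illustration using a stable hash and a modulo; it is not MongoDB's own routing logic, and the user ids and shard count are hypothetical.

```python
import hashlib


def shard_for(shard_key: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index with a stable hash."""
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical user ids routed across 3 shards.
placement = {user_id: shard_for(user_id, 3) for user_id in ["u1", "u2", "u3", "u4"]}
print(placement)
```

A plain modulo remaps most keys whenever the shard count changes, which is one reason real systems track ranges or chunks of keys and migrate them instead.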
If you are dealing with "complex relationships", give ArangoDB and graph databases a chance. Their database structures allow you to do this with faster and simpler queries. The database is not as strict as others and allows arbitrary data. The data model is really like a neural network and you will never need foreign key tables anymore. On Udemy there is a free course about it to get started.
The most important question is where are you planning to host? On-premise, or in the cloud?
Particularly if you are planning to host in either AWS or Azure, then your first point of call should be the PaaS (Platform as a Service) databases supplied by these vendors, as you will find yourself requiring a lot less effort to support them, much easier Disaster Recovery options, and also, depending on how PAYG the database is that you use, potentially also much cheaper costs than having a dedicated database server.
Your question regarding 'relational or not' is obviously key, and you need to consider your required data structure, the ACID requirements of your application model, and the non-functional requirements: scalability, resilience, whether you want security authorisation at the highest application tier or right down to 'row' level in the database, etc. However, please don't fall into the trap of considering 'NoSQL' as a single category. MongoDB, with its document-store type solution, is a very different model from key-value-pair stores (like AWS DynamoDB), column stores (like AWS Redshift), Entity Graph Stores for more complex data relationships (like AWS Neptune), or stores designed for tokenisation and text search (ElasticSearch).
Also critical in all this is how many items you believe you need to index by. RDBMS/SQL stores are great for having as many indexes as you want, other than the slow-down in write speed, whereas databases like Amazon DynamoDB provide blisteringly fast read/write performance, but are very limited on key indexing capabilities.
It feels like you have most experience with SQL/RDBMS technologies, so for the simplest learning curve, and if your application fits it, then I'd personally start by looking at AWS Aurora https://aws.amazon.com/rds/aurora/ .
I think it depends on your project type and your skills. MySQL is good and simple to maintain, but MongoDB needs more skills and knowledge. If you work on a small project, use MySQL. For your project type, MySQL is enough; afterwards you can migrate to PostgreSQL.
Firstly, it may help if you explain what you mean by "complex relationships between project entities". Secondly, you can build a fast and scalable solution using either. With that said, however, the data sounds relational so I would recommend MySQL.
I am going to work on a real estate project and have to decide on a database. Now, SQL databases can be very efficient if appropriately designed. More relations between the data and less redundancy. But with a #NoSQL database, the development time is reduced, and it is easy to query. Since this is my first time working on the real estate domain, I would like to pick a database that would be efficient in the long run.
I recommend PostgreSQL as it’s the most powerful out of the 3 databases you mentioned. It supports JSON objects so you can mimic the MongoDB functionality, but I would also argue that SQL is actually quite powerful and in many cases significantly easier to work with than with NoSQL databases.
Stay away from foreign keys, keep it fast and simple. Define your data structures well in advance. Try to model your data structures based on your system’s vision; based on where it’s going and not based solely on what you currently need it to do. This will help you avoid drastic changes to your database after your system is launched. Populate the database with fake data and run tests. PostgreSQL allows you to create Views from multiple tables. Try to create those views and make sure you can easily create useful views from multiple tables. Run an Explain on those view queries to make sure you created your indexes correctly. Make sure it’s fast!
Any of those three databases are going to be efficient, scalable, and reliable in the long term if you configure and use them correctly. They all also have solid hosting solutions.
All things being equal, I would agree with other posters that Postgres is my preference among the three, but there are caveats.
MongoDB and MySQL have better support for multi-region replication in your big three cloud environments. Azure recently bought Citus Data, which was a best-in-class Postgres replication solution, so they might be the only one I trust to provide cross-region replication at the moment.
If you have a single region deployment and are on AWS, I can't recommend Aurora Postgres highly enough. It's a very good implementation and extremely performant.
I'll second another piece of advice. PostgreSQL's JSON columns are a dream when it comes to productivity and I use them frequently with our Rails application. In these cases, no migration is required to change schema. We store payloads with dozens or hundreds of keys and performance has not been an issue. We also have a lot of relational tables, so the joins we get with SQL are very important to us and hard to replicate with a NoSQL solution.
That really depends on where you see your application in the long run. For any application, any of those choices is excellent. You could argue about good support for JSON binaries, but even MySQL has excellent support for that in the latest versions.
In the long run, when your application gets hundreds of thousands of requests per second, you might start thinking about how many inputs you will have in the database compared to the outputs. PostgreSQL is excellent at giving you outputs, but table corruption can happen when you start receiving this massive number of inputs (which was the reason Uber switched from Postgres to MySQL).
On our OPS Platform at CTO.ai, we decided to use Postgres because we need a reliable and agile way to send the output to our users, so that was our best choice in the long run for our product.
I am one of those who believes that MongoDB can be used for everything, this thanks to the advertising of MongoDB.
We are creating an e-commerce platform, we know that it has many relationships, but with MongoDB we can avoid some, but in the end, some relationships have to exist.
We have a single developer to create two native applications in Flutter, a web application with React, and the backend with multiple microservices hosted on Google Cloud Run. PostgreSQL can be heavy because it should be used with an ORM; with MongoDB, on the contrary, you can avoid some relationships and avoid an ORM/ODM.
We need advice from someone who has the experience and has had to choose between these two databases for an e-commerce site.
The real question here is not about the technology but rather your real needs and your data. Do you need to manage data that has core concepts and relations ? (such as a family, with parents and children) or do you need to manage a basic collection of similar data (such as blog entries)? PostgreSQL is definitely a relational database for managing entities and their relationships whereas MongoDB (I may be strongly opinionated here ;-) ) is more targeted at managing collection of entities (such as the blog entries). For an e-commerce site (with some products, products categories, user ratings and comments, prices, bundles...) I would go for PostgreSQL as it will support/guide you in creating a structured data set with all your products, organized in categories and with user ratings/comments attached to them. HTH
Had exactly the same question when selecting data storage for our new product. Not e-commerce though, rather interactive and content-focused HR SaaS for SME.
The key arguments for PostgreSQL:
- It gives you the opportunity to use relationships where you really need them and just go with key-value tables where you don't.
- With the JSONB datatype you can store documents/objects/arrays as JSON, then use JSON elements in queries and even indexes.
- There are more tools/integrations working with PostgreSQL which you can use out of the box, e.g. Hasura.
I am in your spot, exactly. A few months ago, I had decided to use Postgres because, since version 9, it has shown a lot of progress toward being a high-availability database. However, frankly, I didn't want to model all my data statically, since I have several distinct schemas (for different product types, for example) and I wanted the flexibility to add or remove fields as I saw fit.
One of the main challenges of evaluating a NoSQL database when you are used to the SQL way of doing things is that it's easy to look for "analogies" to what makes SQL useful, like relationship enforcement, transactions, and cascading deletes, updates and inserts. That limits your vision a lot when analyzing a tool like Mongo, especially in a microservices pattern.
Nowadays, I have really found my solution in Mongo. Not just because it is NoSQL, but because of all the support I find in the Node.js community through packages and utilities that make it dead easy to use for several use cases. Whatever Postgres offers, Mongo does it a little easier and better, like text search and geo-queries.
What you need to do is model your data in a way that makes sense with Mongo. For instance, I've got a User service that holds all auth-related information for a user. But then I have the same user in the Profile service, with the same id but totally different fields. You have two de facto ways to connect data, by reference and by embedding, and both have big uses in e-commerce: for example, a reference to relate a User to a Profile, and an embed to relate a Product to an Order. There's even a third, albeit slightly more "manual", option: the graph relationship, in which you can easily model event-driven documents, like a Purchase that goes from "a customer" to "a store", which you can later use for much easier and deeper analytics than with the classical SQL approach.
MariaDB has it readily available, and also has many improvements over MySQL and Postgres, especially for NoSQL features and scalability. Sadly it is just seen as a MySQL clone, but it offers more than that (although its documentation could be improved). Using Mongo in a microservice environment is even better because your models can be smaller, meaning less burden on relationships, although you do compensate with a bit of duplication, but a well-designed schema will have minimal impact on that. Any tool might do the job, but I want to cheer on the newer generation. Hope it helps.
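The two ways of connecting data mentioned in the post above, by reference and by embedding, might look roughly like this as document shapes. This is only a sketch with plain Python dicts standing in for MongoDB documents; every field name and value is invented for illustration.

```python
# Hypothetical document shapes; field names are invented for this sketch.

# Reference: the profile lives in its own collection/service and shares
# the user's id, so either side can look the other one up.
user = {"_id": "u1", "email": "ann@example.com"}
profile = {"_id": "u1", "display_name": "Ann", "bio": "Runner"}

# Embed: the order carries a snapshot of each product it contains, so it
# can be read in one lookup and is unaffected by later catalog changes.
order = {
    "_id": "o1",
    "user_id": user["_id"],  # reference to the user
    "items": [
        {"sku": "p1", "name": "Shoes", "price": 59.0, "qty": 1},
        {"sku": "p2", "name": "Socks", "price": 4.5, "qty": 2},
    ],
}


def order_total(order_doc):
    """Total computed from the embedded items alone, no join needed."""
    return sum(item["price"] * item["qty"] for item in order_doc["items"])


# Resolving a reference is a second lookup done by the application.
profiles_by_id = {profile["_id"]: profile}
print(order_total(order), profiles_by_id[order["user_id"]]["display_name"])
```

The trade-off is the duplication the post mentions: embedded product data is copied into every order, while referenced data needs an extra lookup.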
Hello,
I am trying to design an online ordering app similar to Doordash or Uber Eats. I'm having a hard time trying to finalise what database (or mixture of databases) to use. I'm leaning towards using a relational database like MySQL or PostgreSQL. But, when the application grows, I don't want to join on 20 tables to get data. Any help would be greatly appreciated. Thank you for your time.
Hello Suhas, we built our product www.voilacabs.com, which is along the same lines as yours, but we have used a combination of MySQL and MongoDB. When using MySQL, I would recommend doing the following:
1. Use MySQL for storage only; for real-time updates we recommend MongoDB.
2. Don't try to join more than 3 tables. The moment you reach 3 joins, stop there and try to denormalize the database.
3. Never, or only very rarely, use auto-increment keys; in MySQL we recommend always using UUIDs instead. If you are using PostgreSQL, then I would suggest you check this: https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c There is a stored procedure that generates unique keys instead of auto-increment keys, and that will help you shard or cluster the database without sync errors.
4. For MongoDB, if you can put a layer of Redis cache in front, that will boost your API performance under large loads.
5. Use Node.js, as it works asynchronously.
Let me know if you still need any suggestions. Thanks & Regards, Rupen Makhecha, CTO @ Voila Cab's, www.voilacabs.com
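For readers curious what a sharding-friendly alternative to auto-increment keys can look like, here is a small Python sketch of the bit layout described in the Instagram article linked above: 41 bits of milliseconds since a custom epoch, 13 bits of logical shard id, and 10 bits of a per-shard sequence. The article implements this as a PostgreSQL stored procedure; this is a Python stand-in, and the epoch value and function names are choices made for this sketch.

```python
import time

CUSTOM_EPOCH_MS = 1_314_220_021_721  # assumption: any fixed past instant works
SHARD_BITS = 13
SEQUENCE_BITS = 10


def make_id(shard_id: int, sequence: int, now_ms=None) -> int:
    """Pack time, shard id and a per-shard sequence into one integer."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    elapsed = now_ms - CUSTOM_EPOCH_MS
    return (
        (elapsed << (SHARD_BITS + SEQUENCE_BITS))
        | (shard_id << SEQUENCE_BITS)
        | (sequence % (1 << SEQUENCE_BITS))
    )


def split_id(value: int):
    """Recover (timestamp_ms, shard_id, sequence) from a packed id."""
    sequence = value & ((1 << SEQUENCE_BITS) - 1)
    shard_id = (value >> SEQUENCE_BITS) & ((1 << SHARD_BITS) - 1)
    now_ms = (value >> (SHARD_BITS + SEQUENCE_BITS)) + CUSTOM_EPOCH_MS
    return now_ms, shard_id, sequence


new_id = make_id(shard_id=5, sequence=7)
print(new_id, split_id(new_id))
```

Because the timestamp occupies the high bits, ids generated later sort after earlier ones regardless of shard, and no two shards need to coordinate to stay unique.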
I would recommend a mixture of MySQL and MongoDB. Using MongoDB for the Content Distribution Network (CDN) will make it easy to store high volume incoming data. MySQL is recommended to be used for business logic. PostgreSQL is not recommended since you will be faced with inefficient database replication features and constant migration from one PostgreSQL version to another.
I'm currently developing an app that ranks trending stuff ( such as games, memes or movies, etc. ) or events in a particular country or region. Here are the specs: My app does not require registration and requires cookies and localStorage to track users. Users can add new entries to each trending category provided that their country of origin is recorded in cookies. If each category contains more than 100 items then the oldest items get deleted. The question is: what kind of database should I use for managing this app? Thanks in advance
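The "at most 100 items per category, oldest deleted first" rule in the question can be sketched independently of the database choice. The example below keeps the data in memory with a bounded `collections.deque`; the country and category names are hypothetical, and a real app would apply the same rule in whichever database it ends up using.

```python
from collections import defaultdict, deque

MAX_ITEMS_PER_CATEGORY = 100  # limit stated in the question

# One bounded queue per (country, category); names are hypothetical.
trending = defaultdict(lambda: deque(maxlen=MAX_ITEMS_PER_CATEGORY))


def add_entry(country: str, category: str, title: str) -> None:
    """Append an entry; the deque drops the oldest one once the cap is hit."""
    trending[(country, category)].append(title)


for n in range(105):
    add_entry("DE", "games", f"game-{n}")

games = trending[("DE", "games")]
print(len(games), games[0], games[-1])
```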
I think your best and cheapest choice is going to be MongoDB. Although Postgres is probably going to be the more scalable approach, you likely have a good idea of how you want to present your data, and the app seems small enough that you shouldn't need to worry about scaling issues. It also sounds like your app can grow in a linear capacity based on the number of users and the amount of data, which is the perfect use case for NoSQL databases (linear, predictable scaling).
Correct me if I have any of these assumptions wrong. 1. You're looking to have a relatively high-read with a lower write volume 2. Your app is essentially a list of objects that can belong to a category 3. users can create objects in this list.
I think Mongo is going to be what you're looking for on the following basis: 1. You absolutely need a database that is shared by all users of your app, therefore IndexedDB is out of the question. 2. You have semi-structured data. 3. You probably want the cheapest solution.
I think Postgres is wrong for the following reasons: 1. Your app is pretty simple in concept; SQL databases will add unnecessary complexity to your system, either through ORMs or SQL queries (use an ORM if you go with SQL). 2. Hosting SQL databases for production is not cheap! The cheapest solution I know of for Postgres is ElephantSQL. It provides 20MB for free with 5 concurrent connections; you should be okay to manage these limitations if you decide to go with Postgres in the end. MongoDB Atlas, on the other hand, has some great free-tier options.
Although your data might be easier to model in Postgres, you can certainly model your data as a single list of items that have a category attached.
I don't want to officially recommend another tool, but you should really check out Prisma, Firebase, Amplify, or Azure App Services for this app! Just go completely backend-less: [Firebase] https://firebase.google.com/ [Amplify] https://aws.amazon.com/amplify/ [Prisma] https://www.prisma.io/ [Azure App Services] https://azure.microsoft.com/en-us/services/app-service/?v=18.51
Hi everybody, I'm developing an application to be used in a gym setting where athletes fill out a health survey, and coaches can analyze the results. However, due to the dynamic nature of some aspects of the app and more static aspects of the other, I am wondering if/how I would integrate MongoDB with my existing PostgreSQL database. I would like to store things like registrations, license information, and club information in Postgres, while I am thinking about moving things like user surveys, logging, and user settings information over to MongoDB. Some fields on the survey are integers, some large blocks of text, and some are arrays. My thought is, if I moved that data to MongoDB, it would give us greater flexibility in terms of adding and removing fields and data to them, and it would scale a lot easier than Postgres. Not to mention it will be easier to organize that kind of data. Is that overkill or am I approaching this issue the right way? Thank you!
You can have your cake and eat it too. If you really need the flexibility of a document store, Postgresql's JSONB support allows you to mix and match relational data and document data within the same database/table. You can just as easily run analytical queries against JSONB data in Postgresql as you can against "normal" relational data. MongoDB comes with a significant operational overhead and cost (hello replica sets), so unless you really need MongoDB's sharding capabilities (which you shouldn't until you get to extreme scaling numbers), then just stick with Postgresql and use JSONB where you need it.
With PostgreSQL you could easily integrate JSON or array type columns and develop a simple interface to add columns in your application. However, handling all the data this way will require some intermediate skill with the PostgreSQL dialect and a mix and match of syntaxes for your analytical queries. You will also need a good design for your backend to handle all this. MongoDB will handle all this in a more natural way and I believe it will be more easily integrated with a Node.js backend.
How are you managing your PostgreSQL schema? It doesn't have to be hard to add or remove fields. We're working on a SQL database client at BaseDash that lets you add/remove columns in a couple clicks.
If you decide to migrate some of your data to MongoDB, you can definitely manage the two databases in parallel. For any records that need to be linked, you can treat it just like a foreign key by creating a column that points to an ID in the other database. For example, you might store user settings in MongoDB, and include a UserId field that points to your User record in your Postgres database.
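A cross-database link like the one described above has to be resolved in application code, since neither database knows about the other. The sketch below uses in-memory stand-ins for the Postgres table and the MongoDB collection; apart from the `UserId` field named in the post, all names and values are hypothetical.

```python
# Stand-ins for the two stores; in practice these would be a Postgres
# table and a MongoDB collection. All names here are hypothetical.
postgres_users = {
    1: {"id": 1, "email": "athlete@example.com", "club_id": 10},
}
mongo_user_settings = [
    {"UserId": 1, "theme": "dark", "survey_reminders": True},
]


def settings_with_user(user_id: int) -> dict:
    """Join the two stores in application code via the shared id."""
    user = postgres_users[user_id]
    settings = next(
        (doc for doc in mongo_user_settings if doc["UserId"] == user_id),
        {},  # no settings document yet for this user
    )
    return {"user": user, "settings": settings}


result = settings_with_user(1)
print(result["user"]["email"], result["settings"]["theme"])
```

Unlike a real foreign key, nothing enforces this link, so the application also has to decide what happens when a user is deleted on one side only.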
Those types of things should fit fine in a postgres json column. You'll actually have more flexibility with postgres because you can have a field as a normal column or in a json column, and you can have constraints and indexes on fields within a json column (or not).
We are building an IoT service with heavy write throughput and fewer reads (we need downsampling of records). We prefer to have good reliability when it comes to data, and data retention based on policies.
So, we are looking for the best underlying DB for ingesting a lot of data and querying it easily.
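For context on the two requirements in the question, downsampling and policy-based retention, here is a minimal pure-Python sketch of what each means. Time-series databases typically offer these as built-in features; the bucket size, retention window, and sensor readings below are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(readings, bucket_seconds):
    """Average (timestamp, value) readings into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def apply_retention(readings, now, max_age_seconds):
    """Keep only readings that fall inside the retention window."""
    return [(ts, v) for ts, v in readings if now - ts <= max_age_seconds]


# Hypothetical sensor readings: (unix_seconds, temperature).
raw = [(0, 20.0), (30, 22.0), (60, 21.0), (90, 23.0), (120, 25.0)]
print(downsample(raw, 60))
print(apply_retention(raw, now=120, max_age_seconds=60))
```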
We had a similar challenge. We started with DynamoDB, Timescale, and even InfluxDB and Mongo, to eventually settle on PostgreSQL. Assuming the inbound data pipeline is queued (for example, Kinesis/Kafka -> S3 -> and some Lambda functions), PostgreSQL gave us better performance by far.
Druid is amazing for this use case and is a cloud-native solution that can be deployed on any cloud infrastructure or on Kubernetes.
- Easy to scale horizontally
- Column-oriented database
- SQL to query data
- Streaming and batch ingestion
- Native search indexes
It has features to work as a time-series DB or data warehouse, and has time-optimized partitioning.
If you want a serverless solution with a lot of storage capacity and SQL-like capability, then Google BigQuery is the best solution for that.
We have thousands of .pdf docs generated from the same form but with lots of variability. We need to extract data from open text and, more importantly, from tables inside the docs. The output of Couchbase/Mongo will be one row per document for backend processing. Adobe renders the tables in an unusable form.
I prefer MongoDB due to my own experience migrating an old archive of PDFs and metadata to a new "archive". The biggest advantage is the speed of filter output (the new archive is way faster and more reliable than the old one), but also the easy programming of MongoDB, with many code snippets and examples available. I have no personal experience so far with Couchbase. From the architecture point of view both options are OK, so go for the one you like.
I would like to suggest MongoDB or ArangoDB (can't choose both, so ArangoDB). MongoDB is more mature, but ArangoDB is more interesting if you need to bring graph database ideas to the solution. For example, if some data or some documents are interlinked, then ArangoDB is probably the best solution.
To process tables we used Abbyy software stack. It's great on table extraction.
If you can select text with a mouse drag in the PDF, use pdftotext; it is fast! You can install it on a server with the command "apt-get install poppler-utils". Use it like "pdftotext -layout /path-to-your-file". It will create a text file in the same folder with the content line by line. There are also a few classes on git stacks that you can use.
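If the extraction needs to run over thousands of files, the command quoted above can be wrapped in a small script. This is a hedged sketch: it assumes `pdftotext` from poppler-utils is installed and on the PATH, and the function names are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path: str) -> list:
    """Build the command quoted above: pdftotext -layout <file>."""
    return ["pdftotext", "-layout", pdf_path]


def extract_text(pdf_path: str) -> str:
    """Run pdftotext and return the text file it writes next to the PDF."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils first")
    subprocess.run(pdftotext_command(pdf_path), check=True)
    text_path = Path(pdf_path).with_suffix(".txt")
    return text_path.read_text(encoding="utf-8", errors="replace")


# Usage (needs a real PDF and pdftotext installed):
#   text = extract_text("/path-to-your-file")
print(pdftotext_command("form.pdf"))
```

The `-layout` flag keeps the physical layout of the page, which tends to help when table columns need to be recovered from the text afterwards.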
Hello everyone,
Well, I want to build a large-scale project, but I do not know which ORDBMS to choose. The app should handle real-time operations, not chatting, but things like future scheduling or reminders. It should be also really secure, fast and easy to use. And last but not least, should I use them both. I mean PostgreSQL with Python / Django and MongoDB with Node.js? Or would it be better to use PostgreSQL with Node.js?
*The project is going to use React for the front-end and GraphQL is going to be used for the API.
Thank you all. Any answer or advice would be really helpful!
If you want safety (no data corruption/loss), PostgreSQL is the way to go. You can use PostgreSQL with Python/Django but also with Node. And as a bonus, PostgreSQL performance should match that of MongoDB if properly tuned...
I mean, what you use really comes down to your team. I would normally pick a relational database over an object store for reporting and other things down the road that are much harder to do with a DB like MongoDB. As for language, I would pick Elixir as it does these things really well, but that's because it is what I know best. Python or Node.js are also good choices. Whoever is building should probably pick the language, framework, and database.
I'd recommend using PostgreSQL and the built-in row level security it offers. You can easily make multi-tenant real-time systems using it and https://www.graphile.org/postgraphile/ which gives you a GraphQL API guaranteed to be in sync with your database for free. Also, the transactional support in Postgres shines in comparison to MongoDB.
At Localazy, we have decided for PostgreSQL, because with its JSONB data type, it effectively combines both relational and document store. The relational model is still better in many situations. In combination with ACID transactions and other features, it provides you with something you are not about to get with Mongo. The ability to store JSONB gives you flexibility where you need it. Also, the SQL queries are something great and allow you to unload a lot of logic to the database server.
I think the point of difference is very clear in the sense that if you're looking to manage relationships in your application, you might as well do it with a database designed and architected to do so. MongoDB is great, but managing complex relationships becomes application logic which introduces more technical debt, in my personal opinion. No matter what programming language you end up using (Postgres works well with both Python and Node.js), you should also look up how easy it will be to manage code and database designs once you begin development. Hope this helps you!
If you have a lot of real-time operations and not-so-high variability of data, then Postgres is the best choice. MongoDB is less suitable for high-performance projects but is good for semi-structured data and non-real-time projects. For example, we use MongoDB as the DB for analytics, but we use Postgres for any high-load projects.
I'd have recommended MongoDB, but since you're considering security, then PostgreSQL it is. PostgreSQL is also easy to use and stable, with good documentation and a huge support community out there. PostgreSQL also handles concurrency well.
Postgres has never disappointed me. I wouldn't choose MongoDB unless I had strong reasons to choose it.
Unless you have need for consensus writes or cross-region replication, I generally prefer Postgres over Mongo.
You can use Postgres with either language and it can be performant if you use an ORM or are thoughtful about your query construction.
Given your need for security, I'd definitely interact with the database through a mature ORM, use well-vetted frameworks, and consider going through a rigorous security evaluation every so often.
I personally think that Postgres is easier to reason about than Mongo because of its relational nature and comfort with constraints. Maintaining any sort of relational features in Mongo tends to require a lot of application code for features Postgres offers out of the box.
We built factoring software for a company in Germany and relied on PostgreSQL as the main banking and transactional database. It works flawlessly. It's very fast, reliable and well documented. It's a very good product.
My advice for your current needs would be to take a look at https://hasura.io/. I like the service and the fact that it's open source, too. It ships with a PostgreSQL database and gives you the power to map your database relations to a GraphQL endpoint. GraphQL works really well with a React frontend. The Hasura getting-started docs are well written and they also provide nice use-case and hands-on coding tutorials to get you started. Check if it matches your needs. Here are the features: https://hasura.io/all-features Here is their tutorial to kick it off: https://hasura.io/learn/graphql/hasura/introduction/ Here's an article about scaling their service for GraphQL to 1 million active subscriptions (live queries): https://hasura.io/blog/1-million-active-graphql-subscriptions/
Have a nice day! Cheers, Chris
I have been using this stack of PostgreSQL and Python/Django for a large-scale project and I am happy with my decision, as it is easily scalable, quick and secure.
We actually use both Mongo and SQL databases in production. Mongo excels in both speed and developer friendliness when it comes to geospatial data and queries on the geospatial data, but we also like ACID compliance hence most of our other data (except on-site logs) are stored in a SQL Database (MariaDB for now)
MySQL has a lot of strengths working for it. It's simple and easy to set up and use. Its JSON engine is also really good these days. Mongo is also simple to set up and use, and its speed as a document-object storage engine is first class.
Where Postgres has both beat is in its combining of all of the features that make both MySQL and Mongo great, while adding enterprise-grade scalability and replication. It's Postgres' stability and robustness, while still fulfilling the roles of its contemporaries extremely well, that give Postgres the edge for me.
When I was new to web development, I was using PHP for the backend and MySQL for the database. But after improving my JS skills, I chose Node.js, for many reasons including npm, Express, the community, fast coding, etc. MongoDB is very good to use with Node.js. If your JS skills are good enough, I recommend migrating to Node.js and MongoDB.
My data was inherently hierarchical, but there was not enough content in each level of the hierarchy to justify a relational DB (SQL) with a one-to-many approach. It was also far easier to share data between the frontend (Angular), backend (Node.js) and DB (MongoDB) as they all pass around JSON natively. This allowed me to skip the translation layer from relational to hierarchical. You do need to think about correct indexes in MongoDB, and make sure the objects have finite size. For instance, an object in your DB shouldn't have a property which is an array that grows over time, without limit. In addition, I did use MySQL for other types of data, such as a catalog of products which (a) has a lot of data, (b) flat and not hierarchical, (c) needed very fast queries.
We used Mongo for the first iterations of our app, but the relational nature of our data was an awkward fit for a database that is not relational. We sorely lacked relational database integrity features that needed to be done on the application side (poorly) and it was a huge relief when we managed to port our application over to Postgres, which performs great and never gives us trouble, while having very user friendly extensions like JSON and PubSub that made the transition easy.
We wanted a JSON datastore that could save the state of our bioinformatics visualizations without destructive normalization. As a leading NoSQL data storage technology, MongoDB has been a perfect fit for our needs. Plus it's open source, and has an enterprise SLA scale-out path, with support of hosted solutions like Atlas. Mongo has been an absolute champ. So much so that SQL and Oracle have begun shipping JSON column types as a new feature for their databases. And when Fast Healthcare Interoperability Resources (FHIR) announced support for JSON, we basically had our FHIR datalake technology.
In the field of bioinformatics, we regularly work with hierarchical and unstructured document data. Unstructured text data from PDFs, image data from radiographs, phylogenetic trees and cladograms, network graphs, streaming ECG data... none of it fits into a traditional SQL database particularly well. As such, we prefer to use document oriented databases.
MongoDB is probably the oldest component in our stack besides Javascript, having been in it for over 5 years. At the time, we were looking for a technology that could simply cache our data visualization state (stored in JSON) in a database as-is without any destructive normalization. MongoDB was the perfect tool; and has been exceeding expectations ever since.
Trivia fact: some of the earliest electronic medical records (EMRs) used a document-oriented database called MUMPS as early as the 1960s, prior to the invention of SQL. MUMPS is still in use today in systems like Epic and VistA, and stores upwards of 40% of all medical records at hospitals. So, we saw MongoDB as something of a 21st century version of the MUMPS database.
Pros of Clickhouse
- Fast, very very fast (21)
- Good compression ratio (11)
- Horizontally scalable (7)
- Utilizes all CPU resources (6)
- RESTful (5)
- Open-source (5)
- Great CLI (5)
- Great number of SQL functions (4)
- Buggy (4)
- Server crashes its normal :( (3)
- Highly available (3)
- Flexible connection options (3)
- Has no transactions (3)
- ODBC (2)
- Flexible compression options (2)
- In IDEA data import via HTTP interface not working (1)
Pros of MongoDB
- Document-oriented storage (828)
- No sql (593)
- Ease of use (553)
- Fast (464)
- High performance (410)
- Free (255)
- Open source (218)
- Flexible (180)
- Replication & high availability (145)
- Easy to maintain (112)
- Querying (42)
- Easy scalability (39)
- Auto-sharding (38)
- High availability (37)
- Map/reduce (31)
- Document database (27)
- Easy setup (25)
- Full index support (25)
- Reliable (16)
- Fast in-place updates (15)
- Agile programming, flexible, fast (14)
- No database migrations (12)
- Easy integration with Node.Js (8)
- Enterprise (8)
- Enterprise Support (6)
- Great NoSQL DB (5)
- Support for many languages through different drivers (4)
- Schemaless (3)
- Aggregation Framework (3)
- Drivers support is good (3)
- Fast (2)
- Managed service (2)
- Easy to Scale (2)
- Awesome (2)
- Consistent (2)
- Good GUI (1)
- Acid Compliant (1)
Cons of Clickhouse
- Slow insert operations (5)
Cons of MongoDB
- Very slowly for connected models that require joins (6)
- Not acid compliant (3)
- Proprietary query language (2)