Azure Cosmos DB vs Neo4j: What are the differences?
Azure Cosmos DB is a globally distributed, multi-model database service provided by Microsoft, while Neo4j is a graph database management system. Below are the key differences between the two.
Data Model: One significant difference between Azure Cosmos DB and Neo4j is their data models. Azure Cosmos DB supports multiple data models, including key-value, document, column-family, and graph models. In contrast, Neo4j exclusively focuses on the graph data model, making it ideal for complex connections and relationships between data.
Scalability: When it comes to scalability, Azure Cosmos DB's design allows it to scale horizontally across multiple regions and data centers. It offers both manual and automatic scaling options, providing flexibility to handle variable workloads effectively. On the other hand, while Neo4j does support clustering to distribute the graph database over multiple machines, its scalability is comparatively limited.
Query Language: Azure Cosmos DB's native API (the API for NoSQL) uses a SQL-like query language, which makes it approachable for developers who already know SQL; its graph API is queried with Gremlin rather than SQL. In contrast, Neo4j uses Cypher, a query language designed specifically for graph data. Developers who are not familiar with Cypher will need some additional effort to learn it.
Indexing: In Azure Cosmos DB, indexing is automatic by default, ensuring high performance for queries on various data models. It offers multiple indexing options to optimize query performance. On the other hand, in Neo4j, although indexing is supported, it needs to be explicitly defined by developers for properties they wish to search frequently, which may require additional effort and maintenance.
Consistency Models: Azure Cosmos DB offers multiple consistency models: strong, bounded staleness, session, consistent prefix, and eventual consistency. Developers can choose the desired consistency level based on their application's requirements. In contrast, Neo4j provides strong consistency as the default option but lacks the flexibility to choose different consistency models.
Deployment Options: Azure Cosmos DB is offered as a fully-managed service in the cloud, providing high availability, automatic backups, and seamless scaling without infrastructure management. It can be integrated with other Azure services and deployed across multiple regions globally. Neo4j, on the other hand, can be deployed in various ways, including on-premises, virtual machines, containers, or in the cloud, offering more deployment flexibility, albeit with additional management responsibilities.
In summary, Azure Cosmos DB offers multiple data models, automatic scaling, SQL querying, flexible indexing, various consistency models, and a fully-managed cloud deployment option. On the other hand, Neo4j focuses exclusively on the graph data model, supports clustering for scalability, uses the specialized Cypher query language, requires explicit indexing, offers strong consistency by default, and provides deployment flexibility.
Hi, I want to create a social network for students, and I was wondering which of these three graph-oriented DBs you would recommend. I plan to implement machine learning algorithms such as k-means and others to give recommendations and do some basic data analysis; also, everything is going to be hosted in the cloud, so I expect the DB to be hosted there. I want the queries to be as fast as possible, and I would like good tools to monitor my data. I would appreciate any recommendations or thoughts.
Context:
I released the MVP 6 months ago and got almost 600 users just from my university in Colombia, but now I want to expand it all over my country. I am expecting more or less 20,000 users.
I have not used the others, but I agree: ArangoDB should meet your needs. If you have worked with RDBMS and SQL before, Arango will be an easy transition. AQL is simple yet powerful, and a deployment can be as small or as large as you need. I love the fact that for local development I can run it as a Docker container as part of my project, and for production I can have multiple machines in a cluster. The project is also under active development, and with the latest round of funding I feel comfortable that it will be around for a while.
Hi Jaime. I've worked with Neo4j and ArangoDB for a few years, and I prefer ArangoDB because its query syntax (AQL) is easier. I've built a network topology with both databases, and ArangoDB is now the database for that network topology. Also, ArangoDB has ArangoML, which may help you with your recommendation algorithms.
Hi Jaime, I have worked with Arango quite a lot for about 3 years. Before that, I did some investigation and chose ArangoDB over Neo4j because of its multi-model design, its speed, and its clustering (though we do not use clustering now). Now we have relational and graph data working together. As others said, AQL is quite easy, but you can also use some of the drivers, such as the one for Java Spring, which take you to another level. If you prefer more copy-paste with little rework, perhaps Neo4j can do the job for you, because there is a bigger community around it. But I have had to solve some issues with the ArangoDB community, and it is also fast, so I would prefer ArangoDB. By the way, there is a super easy Foxx Microservice tool on Arango that can help you solve basic things faster than writing a robust back end.
We have an experiment management system built in-house. We produce samples as input to the next step, which could then produce one sample (1-1) or many samples (1-many). There are many steps like this. So far, we are tracking genealogy (limited tracking) in a MySQL database, and it is becoming hard to trace back to the original material or sample (I can give more details if required). So, we are considering a graph database. I am requesting advice from the experts.
- Is a graph database the right choice, or can we manage with RDBMS?
- If RDBMS, which RDBMS, which feature, or which approach could make this manageable or sustainable?
- If a graph database (Neo4j, OrientDB, Azure Cosmos DB, Amazon Neptune, ArangoDB), which one is good, and what are the best practices?
I am sorry that this might be a loaded question.
You have not given much detail about the data generated, the depth of such a graph, and the access patterns (queries). However, it is very easy to track all samples and materials if you traverse this graph using a graph database. Here you can use any of the databases mentioned. OrientDB and ArangoDB are also multi-model databases where you can still query the data in a relational way using joins - you retain full flexibility.
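To make the traversal idea concrete, here is a minimal in-memory sketch in Python. It is not tied to any of the databases mentioned; the sample IDs, the `PARENTS` mapping, and the function names are all hypothetical, and it assumes each sample records the sample(s) it was produced from.

```python
from collections import deque

# Hypothetical lineage data: each sample maps to the sample(s) it was produced from.
PARENTS = {
    "S1": [],        # original material, no parent
    "S2": ["S1"],    # 1-1 step: S1 -> S2
    "S3": ["S2"],    # 1-many step: S2 -> S3 and S4
    "S4": ["S2"],
    "S5": ["S4"],
}


def trace_ancestors(sample, parents):
    """Return all ancestors of `sample`, nearest first (breadth-first walk)."""
    seen = []
    queue = deque(parents.get(sample, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen


def trace_origins(sample, parents):
    """Return the ancestors that have no parents of their own."""
    return [s for s in trace_ancestors(sample, parents) if not parents.get(s)]
```

Calling `trace_origins("S5", PARENTS)` walks S5 -> S4 -> S2 -> S1 and reports S1 as the original material. A graph database would run the same kind of walk inside the engine, expressed in its own query language.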
In SQL, you can use Common Table Expressions (CTEs) and use them to write a recursive query that reads all parent nodes of a tree.
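A minimal sketch of the recursive CTE approach, using Python's built-in `sqlite3` module so it runs without a server. The `sample` table, its `id` and `parent_id` columns, and the sample IDs are assumptions made for illustration, and it assumes each sample has a single parent. MySQL has supported `WITH RECURSIVE` since version 8.0, so a similar query should carry over, but treat that as something to verify against your own schema.

```python
import sqlite3

# Hypothetical schema: one row per sample, pointing at the sample it came from.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (id TEXT PRIMARY KEY, parent_id TEXT REFERENCES sample(id))"
)
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("S1", None), ("S2", "S1"), ("S3", "S2"), ("S4", "S2"), ("S5", "S4")],
)

# Start at the requested sample, then repeatedly join to the parent row.
ANCESTORS_SQL = """
WITH RECURSIVE lineage(id, parent_id, depth) AS (
    SELECT id, parent_id, 0 FROM sample WHERE id = ?
    UNION ALL
    SELECT s.id, s.parent_id, l.depth + 1
    FROM sample AS s
    JOIN lineage AS l ON s.id = l.parent_id
)
SELECT id, depth FROM lineage WHERE depth > 0 ORDER BY depth
"""


def ancestors(connection, sample_id):
    """Return the IDs of all parent samples, nearest parent first."""
    return [row[0] for row in connection.execute(ANCESTORS_SQL, (sample_id,))]
```

With this data, `ancestors(conn, "S5")` returns the chain from the direct parent back to the original material. If a sample can have several parents, the single `parent_id` column would need to become a separate edge table.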
I would recommend ArangoDB if your samples also have disparate or nested attributes so that the document model (JSON) fits, and you have many complex graph queries that should be performed as efficiently as possible. If not - stay with an RDBMS.
Pros of Azure Cosmos DB
- Best-of-breed NoSQL features (28)
- High scalability (22)
- Globally distributed (15)
- Automatic indexing over flexible JSON data model (14)
- Tunable consistency (10)
- Always on with 99.99% availability SLA (10)
- JavaScript language integrated transactions and queries (7)
- Predictable performance (6)
- High performance (5)
- Analytics Store (5)
- Rapid Development (2)
- NoSQL (2)
- Auto Indexing (2)
- Ease of use (2)
Pros of Neo4j
- Cypher – graph query language (69)
- Great graphdb (61)
- Open source (33)
- REST API (31)
- High-Performance Native API (27)
- ACID (23)
- Easy setup (21)
- Great support (17)
- Clustering (11)
- Hot Backups (9)
- Great Web Admin UI (8)
- Powerful, flexible data model (7)
- Mature (7)
- Embeddable (6)
- Easy to Use and Model (5)
- Highly-available (4)
- Best Graphdb (4)
- It's awesome, I wanted to try it (2)
- Great onboarding process (2)
- Great query language and built in data browser (2)
- Used by Crunchbase (2)
Cons of Azure Cosmos DB
- Pricing (18)
- Poor NoSQL query support (4)
Cons of Neo4j
- Comparably slow (9)
- Can't store a vertex as JSON (4)
- Doesn't have a managed cloud service at low cost (1)