What are some alternatives to Neo4j?

What is Neo4j and what are its top alternatives?

Neo4j stores data in nodes connected by directed, typed relationships with properties on both, also known as a Property Graph. It is a high performance graph store with all the features expected of a mature and robust database, like a friendly query language and ACID transactions.
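
The property graph model described above can be illustrated with a small in-memory sketch. This is not Neo4j's storage engine or API; the class names (`Node`, `Relationship`, `PropertyGraph`) and the sample data are hypothetical and only show nodes and directed, typed relationships that both carry properties.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    node_id: int
    labels: set[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int      # id of the node the relationship leaves
    end: int        # id of the node the relationship points to
    rel_type: str   # relationship type, for example "KNOWS"
    properties: dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: nodes plus an adjacency list of outgoing relationships."""

    def __init__(self) -> None:
        self.nodes: dict[int, Node] = {}
        self.outgoing: dict[int, list[Relationship]] = {}

    def add_node(self, node_id: int, labels: list[str], **properties: Any) -> Node:
        node = Node(node_id, set(labels), properties)
        self.nodes[node_id] = node
        self.outgoing.setdefault(node_id, [])
        return node

    def relate(self, start: int, rel_type: str, end: int, **properties: Any) -> Relationship:
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before they can be related")
        rel = Relationship(start, end, rel_type, properties)
        self.outgoing[start].append(rel)
        return rel

    def neighbours(self, node_id: int, rel_type: str) -> list[Node]:
        # Follow only outgoing relationships of the requested type.
        return [self.nodes[r.end] for r in self.outgoing[node_id] if r.rel_type == rel_type]


graph = PropertyGraph()
graph.add_node(1, ["Person"], name="Alice")
graph.add_node(2, ["Person"], name="Bob")
graph.relate(1, "KNOWS", 2, since=2019)
friends = [n.properties["name"] for n in graph.neighbours(1, "KNOWS")]
```

Because relationships are directed, `graph.neighbours(2, "KNOWS")` is empty here: only Alice has an outgoing `KNOWS` relationship.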

Neo4j is a tool in the Graph Databases category of a tech stack.

Neo4j is an open source tool with 13.4K GitHub stars and 2.4K GitHub forks.

Top Alternatives to Neo4j

Titan
Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time. ...
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding. ...
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL. ...
OrientDB
It is an open source NoSQL database management system written in Java. It is a Multi-model database, supporting graph, document, key/value, and object models, but the relationships are managed as in graph databases with direct connections between records. ...
JanusGraph
It is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. It is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time. ...
Dgraph
Dgraph's goal is to provide Google production level scale and throughput, with low enough latency to be serving real time user queries, over terabytes of structured data. Dgraph supports GraphQL-like query syntax, and responds in JSON and Protocol Buffers over GRPC and HTTP. ...
ArangoDB
A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions. ...
Neptune
It brings organization and collaboration to data science projects. All the experiment-related objects are backed up and organized, ready to be analyzed, reproduced and shared with others. Works with all common technologies and integrates with other tools. ...
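
The Cassandra entry in the list above mentions application-transparent partitioning and automatic repartitioning when machines join or leave. A minimal sketch of that idea is a consistent-hash ring. This is a simplification, not Cassandra's partitioner: the MD5-based `token` function, the `HashRing` class, and the node names are assumptions made for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node owns several positions to spread load more evenly.
        for i in range(self.vnodes):
            t = token(f"{node}#{i}")
            self._owners[t] = node
            bisect.insort(self._tokens, t)

    def remove_node(self, node):
        self._tokens = [t for t in self._tokens if self._owners[t] != node]
        self._owners = {t: n for t, n in self._owners.items() if n != node}

    def node_for(self, partition_key):
        if not self._tokens:
            raise LookupError("the ring has no nodes")
        # The first ring position at or after the key's token owns the key.
        idx = bisect.bisect_left(self._tokens, token(partition_key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys in `moved` change owner when `node-d` joins, and all of them move to `node-d`; the application keeps calling `node_for` without knowing the cluster changed.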

Neo4j alternatives & related posts

Titan

Distributed Graph Database

Stacks: 38

Followers: 56

Votes: 0

MongoDB

The database for giant ideas

Stacks: 93.4K

Followers: 80.6K

Votes: 4.1K

PROS OF MONGODB

Document-oriented storage (827)
NoSQL (593)
Ease of use (553)
Fast (464)
High performance (410)
Free (255)
Open source (218)
Flexible (180)
Replication & high availability (145)
Easy to maintain (112)
Querying (42)
Easy scalability (39)
Auto-sharding (38)
High availability (37)
Map/reduce (31)
Document database (27)
Easy setup (25)
Full index support (25)
Reliable (16)
Fast in-place updates (15)
Agile programming, flexible, fast (14)
No database migrations (12)
Easy integration with Node.js (8)
Enterprise (8)
Enterprise Support (6)
Great NoSQL DB (5)
Support for many languages through different drivers (4)
Schemaless (3)
Aggregation Framework (3)
Drivers support is good (3)
Fast (2)
Managed service (2)
Easy to Scale (2)
Awesome (2)
Consistent (2)
Good GUI (1)
ACID compliant (1)

CONS OF MONGODB

Very slow for connected models that require joins (6)
Not ACID compliant (3)
Proprietary query language (2)

Related MongoDB posts

Jeyabalaji Subramanian

CTO at FundsCorner · Jan 30, 2019 | 25 upvotes · 3.3M views

Shared insights

Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

We set ourselves the following criteria for the optimal tool that would do this job:
- The data replication must be near real-time, yet it should NOT impact the production database
- The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient

Based on the above criteria, we selected the following tools to perform the end to end data replication:

We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pick up the data payload & mirror the DB changes on to the target data warehouse. We implemented source-to-target data translation by modelling target table structures through SQLAlchemy. We deployed this micro-service as an AWS Lambda function with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda services is dumb-easy.
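
The consumer side of such a pipeline might look roughly like the sketch below. It is not the author's service: to stay self-contained it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL and SQLAlchemy, and the `customers` table, its columns, and the message body layout (`operationType`, `documentKey`, `fullDocument`) are assumptions. Only the outer `Records` / `body` shape follows the event that Lambda passes to a function subscribed to an SQS queue.

```python
import json
import sqlite3

UPSERT_OPS = {"insert", "update", "replace"}


def apply_change(conn, change):
    """Mirror one change event onto the (hypothetical) customers table."""
    op = change["operationType"]
    doc_id = str(change["documentKey"]["_id"])
    if op in UPSERT_OPS:
        # Assumes the producer always ships the full document with the event.
        doc = change["fullDocument"]
        conn.execute(
            "INSERT OR REPLACE INTO customers (id, name, email) VALUES (?, ?, ?)",
            (doc_id, doc.get("name"), doc.get("email")),
        )
    elif op == "delete":
        conn.execute("DELETE FROM customers WHERE id = ?", (doc_id,))
    else:
        raise ValueError(f"unsupported operation: {op}")


def handle_sqs_event(event, conn):
    """Apply every message in an SQS-style batch inside one transaction."""
    applied = 0
    with conn:  # commits on success, rolls back if any message fails
        for record in event.get("Records", []):
            apply_change(conn, json.loads(record["body"]))
            applied += 1
    return applied


def message(op, doc_id, doc=None):
    # Helper that builds a fake SQS record for the demo below.
    body = {"operationType": op, "documentKey": {"_id": doc_id}, "fullDocument": doc}
    return {"body": json.dumps(body)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT, email TEXT)")

event = {
    "Records": [
        message("insert", "1", {"name": "Asha", "email": "asha@example.com"}),
        message("update", "1", {"name": "Asha K", "email": "asha@example.com"}),
        message("insert", "2", {"name": "Ravi", "email": "ravi@example.com"}),
        message("delete", "2"),
    ]
}
applied = handle_sqs_event(event, conn)
rows = conn.execute("SELECT id, name, email FROM customers ORDER BY id").fetchall()
```

Writing inserts and updates as an upsert keeps the handler idempotent, which matters because a queue may deliver the same message more than once.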

In the end, we implemented a highly scalable, near real-time Change Data Replication service that "works" and deployed it to production in a matter of a few days!

Robert Zuber

CTO at CircleCI · Jul 24, 2019 | 24 upvotes · 3.2M views

Shared insights

We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.

As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).
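
Rate-limiting outbound calls to a partner API, as mentioned above, is often done with a per-window counter. The sketch below is not CircleCI's implementation: it keeps the counters in a Python dict where a Redis-backed version would typically increment a key and let it expire, and the class name, limit, and window size are hypothetical.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per key in each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the demo can control time
        self._counters = {}         # (key, window index) -> calls seen

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        # Drop counters from earlier windows; a key expiry would do this in Redis.
        self._counters = {k: v for k, v in self._counters.items() if k[1] == window}
        count = self._counters.get((key, window), 0) + 1
        self._counters[(key, window)] = count
        return count <= self.limit


now = [0.0]  # fake clock value, in seconds
limiter = FixedWindowRateLimiter(limit=3, window_seconds=60, clock=lambda: now[0])

first_window = [limiter.allow("github") for _ in range(5)]
now[0] = 61.0  # move into the next 60-second window
next_window = limiter.allow("github")
```

In this run the first three calls in the window are allowed, the next two are rejected, and the call in the following window is allowed again.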

When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.

Update: How CircleCI Processes Over 30 Million Builds Per Month - CircleCI Tech Stack

Cassandra

A partitioned row store. Rows are organized into tables with a required primary key.

Stacks: 3.6K

Followers: 3.5K

Votes: 507

PROS OF CASSANDRA

Distributed (119)
High performance (98)
High availability (81)
Easy scalability (74)
Replication (53)
Reliable (26)
Multi datacenter deployments (26)
Schema optional (10)
OLTP (9)
Open source (8)
Workload separation (via MDC) (2)
Fast (1)

CONS OF CASSANDRA

Reliability of replication (3)
Size (1)
Updates (1)

Related Cassandra posts

Thierry Schellenbach

CEO at Stream · Sep 13, 2018 | 17 upvotes · 1.1M views

Shared insights

Stream 1.0 leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSD. Nowadays RocksDB is a project on its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology it’s much simpler than Cassandra.

This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

#InMemoryDatabases #DataStores #Databases

Stream & Go: News Feeds for Over 300 Million End Users - Stream Tech Stack | StackShare

kew44

Nov 10, 2022 | 6 upvotes · 96.2K views

Shared insights

I'm trying to establish a data lake (or maybe puddle) for my org's Data Sharing project. The idea is that outside partners would send cuts of their PHI data, regardless of format/variables/systems, to our Data Team, who would then harmonize the data, create data marts, and eventually use it for something. End-to-end, I'm envisioning:

Ingestion -> Secure, role-based, self-service portal for users to upload data (1a. bonus points if it can perform basic validations/masking)
Storage -> Amazon S3 seems like the cheapest. We probably won't need very much, even at full capacity. Our current storage is a secure Box folder that has ~4GB with several batches of test data, code, presentations, and planning docs.
Data Catalog -> AWS Glue? Azure Data Factory? Snowplow? Is the main difference basically based on the vendor? We also will have Data Dictionaries/Codebooks from submitters. Where would they fit in?
Partitions -> I've seen Cassandra and YARN mentioned, but have no experience with either
Processing -> We want to use SAS if at all possible. What will work with SAS code?
Pipeline/Automation -> The check-in and verification processes that have been outlined are rather involved. Some sort of automated messaging or approval workflow would be nice
I have very little guidance on what a "Data Mart" should look like, so I'm going with the idea that it would be another "experimental" partition. Unless there's an actual mart-building paradigm I've missed?
An end user might use the catalog to pull certain de-identified data sets from the marts. Again, role-based access and a self-service GUI would be preferable. I'm the only full-time tech person on this project, but I'm mostly an OOP, HTML, JavaScript, and some SQL programmer. Most of this is out of my repertoire. I've done a lot of research, but I can't be an effective evangelist without hands-on experience. Since we're starting a new year of our grant, they've finally decided to let me try some stuff out. Any pointers would be appreciated!
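
For the "basic validations/masking" step mentioned under Ingestion, a first experiment could be as small as the sketch below. It is only a starting point under stated assumptions, not a vetted de-identification method: the field names, the choice of which fields to drop or pseudonymize, and the secret key are all hypothetical.

```python
import hashlib
import hmac

REQUIRED_FIELDS = {"patient_id", "visit_date"}   # hypothetical schema
PSEUDONYMIZED_FIELDS = {"patient_id"}            # replaced by a keyed hash
DROPPED_FIELDS = {"name", "address"}             # removed outright


def pseudonymize(value: str, secret: bytes) -> str:
    """Return a stable keyed hash so the same input always maps to the same token."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def validate(record: dict) -> list:
    """Return the sorted names of required fields that are missing or empty."""
    present = {k for k, v in record.items() if v not in (None, "")}
    return sorted(REQUIRED_FIELDS - present)


def mask_record(record: dict, secret: bytes) -> dict:
    masked = {}
    for key, value in record.items():
        if key in DROPPED_FIELDS:
            continue
        if key in PSEUDONYMIZED_FIELDS:
            masked[key] = pseudonymize(str(value), secret)
        else:
            masked[key] = value
    return masked


secret = b"replace-with-a-managed-secret"  # placeholder, not a real key
record = {
    "patient_id": "P-001",
    "name": "Jane Doe",
    "address": "1 Main St",
    "visit_date": "2022-11-01",
    "diagnosis_code": "E11.9",
}
missing = validate(record)
masked = mask_record(record, secret)
```

Using a keyed hash rather than a plain hash means records from the same patient can still be linked across uploads, while someone without the key cannot recompute the token from a known identifier.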