Druid vs Oracle: What are the differences?

Introduction

This comparison covers the key differences between Druid and Oracle.

  1. Data Model: Druid is a columnar data store optimized for time-series data and fast aggregations. It uses a denormalized, column-oriented storage format and can optionally pre-aggregate (roll up) data at ingestion time. Oracle, by contrast, is a relational database management system (RDBMS) that stores data row by row by default. It supports complex data models with joins, constraints, and normalization.

  2. Scalability: Druid is designed to scale horizontally by adding more nodes to the cluster, allowing for high volume and high-speed data ingestion. It can handle large amounts of data and supports real-time data ingestion. Oracle, on the other hand, can also scale horizontally but requires additional configuration and management. It is better suited for traditional transactional workloads.

  3. Processing Speed: Druid is built to provide near real-time analytics and fast query response times. Its architecture and indexing structure enable quick aggregations on large volumes of data. Oracle, while capable of handling analytical workloads, may not provide the same level of performance as Druid for large-scale data analytics.

  4. Query Language: Druid supports two query interfaces: Druid SQL and a native JSON-based query language sent over HTTP. Both are geared toward time-series data and provide functions for aggregations, filtering, and advanced analytics. Oracle, on the other hand, supports SQL, PL/SQL, and other procedural languages. It provides a broader range of features for complex queries and supports a wide range of data types.

  5. Data Consistency: Druid offers eventual consistency for queries, which means that query results may not reflect the most recent updates immediately. This is because Druid is optimized for speed rather than strict consistency. Oracle, being a traditional RDBMS, provides strong consistency guarantees by enforcing ACID properties for transactions.

  6. Cost: Druid is an open-source project and is available for free. Its underlying infrastructure can be deployed on commodity hardware or cloud infrastructure, reducing the overall cost of ownership. Oracle, on the other hand, is a commercial product that comes with licensing costs. Additionally, the hardware requirements for running Oracle may be higher, resulting in higher infrastructure costs.
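The pre-aggregation at ingestion time mentioned under Data Model can be pictured as grouping raw events by a truncated timestamp and their dimension values, then summing the metrics. The following is a minimal Python sketch of that idea, not Druid's ingestion code; the `rollup` function, the event fields (`ts`, `country`, `clicks`), and the hourly granularity are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def rollup(events, dimensions, metric):
    """Pre-aggregate raw events by (hour, dimension values), summing one metric.

    Illustrative only: a real system would also handle several metrics,
    configurable granularities, and late-arriving data.
    """
    totals = defaultdict(int)
    for event in events:
        # Truncate the timestamp to the hour; this is the rollup granularity.
        hour = datetime.fromisoformat(event["ts"]).replace(
            minute=0, second=0, microsecond=0
        )
        key = (hour.isoformat(),) + tuple(event[d] for d in dimensions)
        totals[key] += event[metric]
    return dict(totals)


# Hypothetical raw click events.
raw_events = [
    {"ts": "2024-01-01T10:05:00", "country": "US", "clicks": 1},
    {"ts": "2024-01-01T10:40:00", "country": "US", "clicks": 2},
    {"ts": "2024-01-01T10:59:00", "country": "DE", "clicks": 5},
    {"ts": "2024-01-01T11:01:00", "country": "US", "clicks": 4},
]

rolled = rollup(raw_events, dimensions=["country"], metric="clicks")
for key, clicks in sorted(rolled.items()):
    print(key, clicks)
```

Four raw rows collapse into three stored rows here. The trade-off is that individual events can no longer be recovered once they are rolled up, which is why this kind of storage suits aggregate analytics rather than transactional lookups.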

In summary, Druid and Oracle differ in their data models, scalability, processing speed, query language, data consistency, and cost. While Druid excels in time-series data analytics and performance, Oracle provides a broader range of features for complex queries and supports strong consistency for transactions.
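To make the query-language point more concrete, Druid can be queried over HTTP either with SQL or with a native JSON query. The sketch below only builds the two request bodies for an hourly click total and does not send them. The datasource name `page_views`, the `clicks` column, and the date range are hypothetical; the endpoint paths in the comments are the ones commonly documented for Druid, so check them against the version you run.

```python
import json


def build_sql_request(datasource: str) -> dict:
    """Body for a Druid SQL query (typically POSTed to /druid/v2/sql)."""
    query = (
        "SELECT TIME_FLOOR(__time, 'PT1H') AS hour_start, "
        "SUM(clicks) AS total_clicks "
        f'FROM "{datasource}" '
        "WHERE __time >= TIMESTAMP '2024-01-01 00:00:00' "
        "AND __time < TIMESTAMP '2024-01-02 00:00:00' "
        "GROUP BY 1 ORDER BY 1"
    )
    return {"query": query}


def build_native_request(datasource: str) -> dict:
    """Body for an equivalent native timeseries query (typically POSTed to /druid/v2)."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "hour",
        "intervals": ["2024-01-01/2024-01-02"],
        "aggregations": [
            {"type": "longSum", "name": "total_clicks", "fieldName": "clicks"}
        ],
    }


# "page_views" is a made-up datasource name used only for this example.
print(json.dumps(build_sql_request("page_views"), indent=2))
print(json.dumps(build_native_request("page_views"), indent=2))
```

An Oracle client would express the same aggregation in standard SQL against a table, usually inside a session that can also run transactional statements, which is where the two systems diverge most.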

Decisions about Druid and Oracle
Daniel Moya
Data Engineer at Dimensigon · 4 upvotes · 492.3K views

We have chosen Tibero over Oracle because we want to offer PL/SQL-as-a-Service that users can deploy in any cloud from our website, without concerns and at a standard cost. With Oracle Database, developers would have to worry about what they implement and the related cost of each feature, whereas Tibero's licensing model has a single price with all features included, so neither we nor the developers using our SQLaaS have to worry. PostgreSQL would be the open source option, but we need to offer an SQLaaS with encryption and more enterprise features in the background, and the best-value option we found for PL/SQL-based applications was Tibero Database.

We wanted a JSON datastore that could save the state of our bioinformatics visualizations without destructive normalization. As a leading NoSQL data storage technology, MongoDB has been a perfect fit for our needs. Plus it's open source, and has an enterprise SLA scale-out path, with support of hosted solutions like Atlas. Mongo has been an absolute champ. So much so that SQL and Oracle have begun shipping JSON column types as a new feature for their databases. And when Fast Healthcare Interoperability Resources (FHIR) announced support for JSON, we basically had our FHIR datalake technology.

In the field of bioinformatics, we regularly work with hierarchical and unstructured document data. Unstructured text data from PDFs, image data from radiographs, phylogenetic trees and cladograms, network graphs, streaming ECG data... none of it fits into a traditional SQL database particularly well. As such, we prefer to use document oriented databases.

MongoDB is probably the oldest component in our stack besides Javascript, having been in it for over 5 years. At the time, we were looking for a technology that could simply cache our data visualization state (stored in JSON) in a database as-is without any destructive normalization. MongoDB was the perfect tool; and has been exceeding expectations ever since.

Trivia fact: some of the earliest electronic medical records (EMRs) used a document oriented database called MUMPS as early as the 1960s, prior to the invention of SQL. MUMPS is still in use today in systems like Epic and VistA, and stores upwards of 40% of all medical records at hospitals. So, we saw MongoDB as something of a 21st century version of the MUMPS database.

Pros of Druid
  • Real Time Aggregations (15)
  • Batch and Real-Time Ingestion (6)
  • OLAP (5)
  • OLAP + OLTP (3)
  • Combining stream and historical analytics (2)
  • OLTP (1)

Pros of Oracle
  • Reliable (44)
  • Enterprise (33)
  • High Availability (15)
  • Hard to maintain (5)
  • Expensive (5)
  • Maintainable (4)
  • Hard to use (4)
  • High complexity (3)

Cons of Druid
  • Limited SQL support (3)
  • Joins are not supported well (2)
  • Complexity (1)

Cons of Oracle
  • Expensive (14)

What is Druid?

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.
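The difference between exact calculations and approximate algorithms can be illustrated with distinct counting. The sketch below uses a k-minimum-values (KMV) estimator, chosen here only because it is short; it is not Druid's implementation, and the function names and the `k=1024` default are assumptions for this example. The idea is to hash every value into the interval [0, 1), keep only the `k` smallest hashes, and infer the number of distinct values from how small the k-th hash is.

```python
import hashlib
import heapq


def hash_to_unit_interval(value: str) -> float:
    """Map a string to a pseudo-random float in [0, 1) using SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_distinct(values, k=1024):
    """Estimate the number of distinct values while storing at most k hashes."""
    heap = []        # max-heap (stored negated) of the k smallest hashes seen
    members = set()  # the same hashes, for fast duplicate checks
    for value in values:
        h = hash_to_unit_interval(str(value))
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # The new hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heapreplace(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct values seen: the count is exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


# Hypothetical workload: 60,000 visits from 20,000 distinct users.
visits = [f"user-{i % 20000}" for i in range(60000)]
print("exact:", len(set(visits)))
print("approximate:", approx_distinct(visits))
```

The exact count needs memory proportional to the number of distinct users, while the estimator's memory is bounded by `k`, at the cost of a relative error of roughly a few percent for `k=1024`.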

What is Oracle?

Oracle Database is an RDBMS. An RDBMS that implements object-oriented features such as user-defined types, inheritance, and polymorphism is called an object-relational database management system (ORDBMS). Oracle Database has extended the relational model to an object-relational model, making it possible to store complex business models in a relational database.

What are some alternatives to Druid and Oracle?
HBase
Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop.
MongoDB
MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Prometheus
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).