InfluxDB vs Oracle: What are the differences?
Introduction
InfluxDB and Oracle are both popular databases used for different purposes. While InfluxDB is designed for time-series data and is used mainly in IoT and monitoring applications, Oracle is a general-purpose relational database that is widely used in enterprise applications. Here are the key differences between InfluxDB and Oracle.
Data Model: InfluxDB uses a time-series data model, where data is stored as a series of timestamped values. This makes it highly optimized for storing and querying time-series data efficiently. On the other hand, Oracle uses a relational data model, where data is organized in tables with predefined schemas, allowing for complex relationships between entities.
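The contrast is easy to see side by side. Below is a minimal sketch of the same CPU measurement expressed as an InfluxDB line-protocol point versus a relational row; the helper function and all names (`cpu`, `host`, `usage`) are illustrative, not the official client API (real clients such as `influxdb-client` handle this formatting and escaping for you).

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Render one time-series point as an InfluxDB line-protocol string:
    measurement,tag=value field=value timestamp"""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

point = to_line_protocol("cpu", {"host": "server01"}, {"usage": 0.64},
                         1700000000000000000)
print(point)  # cpu,host=server01 usage=0.64 1700000000000000000

# The relational equivalent is a row in a table whose schema was defined
# up front, e.g.:
#   INSERT INTO cpu_metrics (host, usage, recorded_at)
#   VALUES ('server01', 0.64, TIMESTAMP '2023-11-14 22:13:20');
```

Note that in the line protocol the schema is implicit in each point (tags and fields can vary per write), while the relational row must match a predeclared table definition.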
Scalability: InfluxDB is designed to handle large volumes of time-series data and can ingest millions of writes per second; clustering and sharding for horizontal scalability are available, though only in its commercial offering (the open-source edition runs on a single node). In contrast, Oracle requires manual configuration and tuning to scale for large workloads, and vertical scaling (adding more resources to a single machine) is the primary method of increasing performance.
Query Language: InfluxDB uses its own query language called InfluxQL, which is specifically designed for working with time-series data. It provides functions and operators optimized for time-series analysis and supports SQL-like syntax for querying data. On the other hand, Oracle uses SQL (Structured Query Language), which is a standardized language for querying relational databases. SQL provides a wide range of features and capabilities for working with structured data.
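To make the comparison concrete, here is an illustrative pair of queries for "average CPU usage per host over the last hour", one in InfluxQL and one in Oracle-style SQL. The measurement, table, and column names are made up for the example.

```python
# InfluxQL: time is a built-in column, and relative time ranges like
# now() - 1h are first-class syntax.
influxql = (
    "SELECT MEAN(usage) FROM cpu "
    "WHERE time > now() - 1h GROUP BY host"
)

# Rough SQL equivalent against a hypothetical relational table: the
# timestamp is an ordinary column, and the interval arithmetic is explicit.
sql = (
    "SELECT host, AVG(usage) FROM cpu_metrics "
    "WHERE recorded_at > SYSTIMESTAMP - INTERVAL '1' HOUR "
    "GROUP BY host"
)

print(influxql)
print(sql)
```

The shapes are similar (which is why InfluxQL is described as SQL-like), but the time-series-specific conveniences, like `now() - 1h` and grouping by `time(...)` windows, are where InfluxQL earns its keep.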
Indexing: InfluxDB uses an inverted index over series metadata (measurements and tags), combined with time-ordered storage, which together make queries that filter by tags and time ranges highly efficient. This indexing approach is purpose-built for time-series data. In contrast, Oracle uses various indexing techniques, including B-tree, bitmap, and bitmap join indexes, to optimize query performance for different types of data and query patterns.
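A toy version of the inverted-index idea can be sketched in a few lines (this illustrates the concept only, not InfluxDB's actual implementation): each tag key/value pair maps to the set of series that carry it, so a tag filter becomes a dictionary lookup instead of a scan over every series.

```python
from collections import defaultdict

# (tag_key, tag_value) -> set of series ids carrying that tag
index = defaultdict(set)

# Hypothetical series metadata: three CPU series on two hosts.
series = {
    1: {"host": "server01", "region": "eu"},
    2: {"host": "server02", "region": "eu"},
    3: {"host": "server01", "region": "us"},
}

# Build the inverted index once, at write time.
for sid, tags in series.items():
    for k, v in tags.items():
        index[(k, v)].add(sid)

# "Which series match host=server01?" is now a direct lookup.
print(sorted(index[("host", "server01")]))  # [1, 3]
```

Intersecting the sets for several tag filters (e.g. `host=server01 AND region=eu`) stays cheap, which is exactly the access pattern monitoring queries hit constantly.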
Performance: InfluxDB is designed for high-performance time-series data storage and retrieval, with write and query performance being its primary focus. It can handle millions of writes per second and provides fast query response times, making it suitable for real-time monitoring and analytics applications. Oracle, being a general-purpose database, offers a broader range of capabilities but may not provide the same level of performance for time-series data.
Cost: In terms of licensing and maintenance costs, there is a significant difference between InfluxDB and Oracle. InfluxDB is an open-source database and has a community edition available for free, while Oracle is a commercial database and requires a paid license. The cost of Oracle licenses can be substantial, especially for enterprise deployments, making it less accessible for smaller projects or organizations with budget constraints.
In summary, InfluxDB and Oracle differ in their data models, scalability, query languages, indexing mechanisms, performance focus, and cost. InfluxDB is optimized for time-series data, offers high write and query performance, and has a lower cost of ownership than Oracle. Oracle, on the other hand, is a versatile relational database with broader capabilities, but it typically requires more manual configuration and carries higher licensing costs.
I have a lot of data currently sitting in a MariaDB database: many tables that weigh 200 GB with indexes. Most of the large tables have a date column that is always filtered on, plus usually 4-6 additional columns that are filtered and used for statistics. I'm trying to figure out the best tool for storing and analyzing large amounts of data, preferably self-hosted or a cheap solution. The current problem I'm running into is speed: even with pretty good indexes, loading a large dataset is pretty slow.
Druid could be an amazing solution for your use case. My understanding (and assumption) is that you're looking to export your data from MariaDB for an analytical workload. Druid can serve as a time-series database as well as a data warehouse, and can be scaled horizontally as your data grows. It's pretty easy to set up in any environment (cloud, Kubernetes, or a self-hosted *nix system). Some important features that make it a good fit for your use case:
1. It supports streaming ingestion (Kafka, Kinesis) as well as batch ingestion (files from local or cloud storage, or databases like MySQL and Postgres). MariaDB uses the same drivers as MySQL, so it's covered.
2. It's a columnar database, so you can query just the fields you need, which automatically makes queries faster.
3. Druid intelligently partitions data by time, so time-based queries are significantly faster than in traditional databases.
4. Scale up or down by just adding or removing servers; Druid rebalances automatically, and its fault-tolerant architecture routes around server failures.
5. It gives you an amazing centralized UI to manage data sources, queries, and tasks.
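For a feel of what Druid batch ingestion looks like, here is a hedged sketch of a minimal `index_parallel` ingestion spec built as a Python dict. The field names follow Druid's documented spec structure, but the data source name, paths, and columns are all made-up placeholders; treat this as a starting point to adapt against the Druid docs, not a drop-in config.

```python
import json

# Hypothetical batch-ingestion spec: load local CSV files into a
# datasource called "orders", partitioned into daily segments.
spec = {
    "type": "index_parallel",
    "spec": {
        "dataSchema": {
            "dataSource": "orders",  # placeholder name
            "timestampSpec": {"column": "created_at", "format": "iso"},
            "dimensionsSpec": {"dimensions": ["country", "status"]},
            "granularitySpec": {"segmentGranularity": "day"},
        },
        "ioConfig": {
            "type": "index_parallel",
            "inputSource": {"type": "local",
                            "baseDir": "/data",          # placeholder path
                            "filter": "*.csv"},
            "inputFormat": {"type": "csv",
                            "findColumnsFromHeader": True},
        },
    },
}

print(json.dumps(spec, indent=2))
```

The spec is submitted to the Druid overlord as JSON; the key point is that the timestamp column and time-based segment granularity are declared up front, which is what enables the time-partitioned layout described above.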
We are building an IoT service with heavy write throughput and fewer reads (we need downsampled records). We want good reliability when it comes to data, and we'd prefer data retention based on policies.
So we are looking for the best underlying DB for ingesting a lot of data while keeping queries easy.
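For reference, if the underlying DB ends up being InfluxDB 1.x, policy-based retention and downsampling map directly onto retention policies and continuous queries. The sketch below shows both as InfluxQL statements; the database name `iot`, the policy names, and the measurement/field names are all illustrative.

```python
# Keep raw points for one week, then let InfluxDB expire them automatically.
retention_policy = (
    'CREATE RETENTION POLICY "one_week" ON "iot" '
    "DURATION 7d REPLICATION 1 DEFAULT"
)

# Downsample raw points into hourly means stored under a longer-lived
# policy (assumed here to be named "one_year"), preserving all tags (*).
continuous_query = (
    'CREATE CONTINUOUS QUERY "downsample_1h" ON "iot" BEGIN '
    'SELECT MEAN("value") INTO "iot"."one_year"."sensor_1h" '
    'FROM "sensor" GROUP BY time(1h), * END'
)

print(retention_policy)
print(continuous_query)
```

With this pattern, heavy raw writes age out on their own while the downsampled series stays queryable long-term, which matches the write-heavy / read-light profile described above.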
We had a similar challenge. We started with DynamoDB, Timescale, and even InfluxDB and Mongo - to eventually settle on PostgreSQL. Assuming the inbound data pipeline is queued (for example, Kinesis/Kafka -> S3 -> some Lambda functions), PostgreSQL gave us better performance by far.
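A minimal sketch of the kind of PostgreSQL layout that suits this queued, append-only ingest pattern: a narrow time-series table plus a BRIN index on the timestamp, which stays tiny on naturally time-ordered data. The table and column names here are made up, not from the answer above.

```python
# Hypothetical DDL for append-only ingest from a queue; a BRIN index
# summarizes block ranges instead of indexing every row, so on data
# inserted in time order it is far smaller than a B-tree while still
# pruning most of the table for time-range scans.
ddl = """
CREATE TABLE device_readings (
    device_id   bigint      NOT NULL,
    recorded_at timestamptz NOT NULL,
    payload     jsonb       NOT NULL
);

CREATE INDEX device_readings_time_brin
    ON device_readings USING brin (recorded_at);
"""

print(ddl)
```

The `jsonb` payload column keeps the schema flexible for heterogeneous device messages, while the indexed `recorded_at` column serves the time-window queries that dominate this workload.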
Druid is amazing for this use case and is a cloud-native solution that can be deployed on any cloud infrastructure or on Kubernetes.
- Easy to scale horizontally
- Column-oriented database
- SQL to query data
- Streaming and batch ingestion
- Native search indexes
It can work as a time-series DB or a data warehouse, and has time-optimized partitioning.
If you want a serverless solution with a lot of storage and SQL-style query capability, Google BigQuery is the best fit.
We have chosen Tibero over Oracle because we want to offer a PL/SQL-as-a-Service that users can deploy in any cloud from our website, without concerns, at a standard cost. With Oracle Database, developers would have to worry about what they implement and the licensing cost of each feature; Tibero's licensing model is a single price with all features included, so neither we nor the developers using our SQLaaS have to worry. PostgreSQL would be the open-source option, but we need to offer an SQLaaS with encryption and more enterprise features in the background, and the best-value option we found for PL/SQL-based applications was Tibero.
We wanted a JSON datastore that could save the state of our bioinformatics visualizations without destructive normalization. As a leading NoSQL data storage technology, MongoDB has been a perfect fit for our needs. Plus it's open source, and has an enterprise SLA scale-out path, with support for hosted solutions like Atlas. Mongo has been an absolute champ - so much so that SQL Server and Oracle have begun shipping JSON column types as a new feature for their databases. And when Fast Healthcare Interoperability Resources (FHIR) announced support for JSON, we basically had our FHIR datalake technology.
In the field of bioinformatics, we regularly work with hierarchical and unstructured document data. Unstructured text data from PDFs, image data from radiographs, phylogenetic trees and cladograms, network graphs, streaming ECG data... none of it fits into a traditional SQL database particularly well. As such, we prefer to use document oriented databases.
MongoDB is probably the oldest component in our stack besides Javascript, having been in it for over 5 years. At the time, we were looking for a technology that could simply cache our data visualization state (stored in JSON) in a database as-is, without any destructive normalization. MongoDB was the perfect tool, and it has been exceeding expectations ever since.
Trivia fact: some of the earliest electronic medical records (EMRs) used a document-oriented database called MUMPS as early as the 1960s, prior to the invention of SQL. MUMPS is still in use today in systems like Epic and VistA, and stores upwards of 40% of all medical records at hospitals. So we saw MongoDB as something of a 21st-century version of the MUMPS database.
I chose TimescaleDB to be the backend of our production monitoring system. We needed to keep track of multiple high-cardinality dimensions.
The drawback of this decision is that our monitoring system is a bit more ad hoc than it used to be (we previously used New Relic Insights).
We are combining this with Grafana for display and Telegraf for data collection
Pros of InfluxDB
- Time-series data analysis (59)
- Easy setup, no dependencies (30)
- Fast, scalable & open source (24)
- Open source (21)
- Real-time analytics (20)
- Continuous Query support (6)
- Easy Query Language (5)
- HTTP API (4)
- Out-of-the-box, automatic Retention Policy (4)
- Offers Enterprise version (1)
- Free Open Source version (1)
Pros of Oracle
- Reliable (44)
- Enterprise (33)
- High Availability (15)
- Hard to maintain (5)
- Expensive (5)
- Maintainable (4)
- Hard to use (4)
- High complexity (3)
Cons of InfluxDB
- Instability (4)
- Proprietary query language (1)
- HA or Clustering is only in paid version (1)
Cons of Oracle
- Expensive (14)