
Dattell

Verified

Apache Kafka & Elasticsearch Managed Services, Support, and Training

dattell.com
5 Tools · 10 Decisions · 9 Followers

Tech Stack

Utilities (1 tool)

  • Elasticsearch

DevOps (1 tool)

  • Kibana

Team Members

Maria Hatfield, Managing Partner


Stack Decisions

Dattell: Your AI & Data Engineering Partner

Jun 6, 2025

Elastic Stack getting expensive? You’re not imagining it. Many teams are surprised by the rising costs of enterprise Elastic—between license fees, cloud markups, and bundled features they don’t need.

This post breaks down:

  • Where the costs come from
  • When it makes sense to switch to third-party managed Elasticsearch
  • Why OpenSearch is gaining traction

→ Read the full comparison: https://dattell.com/data-architecture-blog/why-is-enterprise-elastic-so-expensive-and-what-are-the-alternatives/

3 views
Dattell: Your AI & Data Engineering Partner

Jun 6, 2025

Running Kafka through Confluent? You’re not alone. Many teams are reconsidering Confluent as costs climb—especially when only a fraction of the platform’s features are used.

This post breaks down why some are switching to open-source Apache Kafka with managed support:

  • 30–60% cost savings
  • No licensing or cloud markups
  • Full control with enterprise-grade SLAs

→ Read the breakdown: https://dattell.com/data-architecture-blog/why-teams-are-leaving-confluent-for-open-source-apache-kafka/

4 views
Dattell: Your AI & Data Engineering Partner

Jun 6, 2025

Considering Apache Pulsar for your stack? Many teams default to StreamNative for managed Pulsar—but that decision often comes with hidden costs, limited flexibility, and cloud-native messaging that doesn’t hold up under scrutiny.

We break down how Dattell compares:

  • Transparent, flat-rate pricing
  • Deep Pulsar ops experience
  • Real-world clarity on scaling, SLAs, and support
  • 99.99% uptime for all engagements, with 15-minute response times

→ Read the full comparison: https://dattell.com/data-architecture-blog/comparing-dattell-and-streamnative/

4 views
Dattell: Your AI & Data Engineering Partner

May 12, 2025

We wrote a guide to help teams evaluate and select the right Kafka service provider by outlining eight critical factors that influence performance, reliability, and long-term value. The guide compares Dattell, Amazon MSK, and Confluent Cloud, highlighting key differences in pricing models, scalability, support, and architecture.

The guide emphasizes the importance of consistent engineering support, recommending providers that assign a dedicated engineer familiar with your systems, rather than rotating staff. It also advises choosing providers that guarantee rapid response times—ideally 15 minutes for production issues—and 99.99% uptime to minimize operational disruptions.

Security is another crucial consideration; the guide suggests deploying Kafka within your own environment to maintain full control over data and compliance, rather than relying on third-party hosting. Additionally, it highlights the value of proactive maintenance and real-time monitoring to identify and resolve issues before they escalate.

For a comprehensive understanding, read the full guide here: https://dattell.com/kafka-service-provider-comparison-guide

6 views
Dattell: Your AI & Data Engineering Partner

May 12, 2025

How to stream Kafka data into Elasticsearch with millisecond latency

We wrote a guide that details the construction of a low-latency data pipeline for a fintech client requiring near-real-time search capabilities. The team opted for Kafka Connect with the Elasticsearch Sink Connector, favoring its simplicity and configurability over custom consumers. They fine-tuned parameters like flush.timeout.ms, linger.ms, and batch.size to optimize throughput.
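A setup like the one described can be sketched as the JSON body you would POST to the Kafka Connect REST API. This is a minimal illustration, not the client's actual configuration: the connector name, topic, connection URL, and tuning values below are hypothetical placeholders.

```python
import json

# Illustrative Elasticsearch Sink Connector configuration. The connector
# class is Confluent's ES sink; every value here is an example, not a
# recommendation tuned for any particular workload.
connector_config = {
    "name": "es-sink-events",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "events",
        "connection.url": "http://localhost:9200",
        # How long the connector waits for in-flight requests before failing.
        "flush.timeout.ms": "10000",
        # Batching knobs: trade a little latency for bulk-request throughput.
        "linger.ms": "5",
        "batch.size": "2000",
        "key.ignore": "true",
        "schema.ignore": "true",
    },
}

print(json.dumps(connector_config, indent=2))
```

Raising `linger.ms` and `batch.size` lets the connector pack more records into each bulk request, which is usually the first lever for throughput before resorting to a custom consumer.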

To ensure consistent serialization and minimize parsing overhead, the pipeline utilized JSON or Avro formats with schemas managed by Confluent Schema Registry. Data enrichment was handled through Elasticsearch ingest pipelines, adding geo-tags, parsing user agents, and flattening nested data structures.

Monitoring was integral, focusing on Kafka consumer lag, event-to-index latency, and Elasticsearch ingest and refresh rates to maintain performance and meet SLAs. Indexing efficiency was further enhanced by setting the refresh_interval to 5 seconds, employing index templates with fast analyzers, and maintaining shard sizes under 30GB.

This comprehensive approach enabled the client to achieve millisecond-level latency across millions of daily events. For more details, read the full article here: https://dattell.com/data-architecture-blog/how-we-stream-kafka-data-into-elasticsearch-with-millisecond-latency/

8 views
Dattell: Your AI & Data Engineering Partner

May 12, 2025

Speeding up Elasticsearch

We wrote a guide for diagnosing and resolving common performance issues in Elasticsearch clusters. It identifies key problem areas and provides actionable solutions:

Unoptimized Queries: Queries using wildcards or regular expressions on analyzed fields, targeting too many shards, or employing deep pagination (using 'from' and 'size') are computationally expensive. To optimize, use filters where possible, limit the number of shards in queries, and replace deep pagination with 'search_after'.
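The pagination point can be made concrete with two Elasticsearch query bodies, written here as Python dicts. The field names (`timestamp`, `id`) and sort values are illustrative only.

```python
# Deep pagination: cost grows with the offset, because every shard must
# collect and then discard `from` documents on each request.
deep_page = {"from": 10000, "size": 50, "query": {"match_all": {}}}

# search_after: a stateless cursor keyed on the sort values of the last
# hit from the previous page; cost stays flat no matter how deep you go.
# Sorting needs a unique tiebreaker field ("id" here is hypothetical).
first_page = {
    "size": 50,
    "query": {"match_all": {}},
    "sort": [{"timestamp": "asc"}, {"id": "asc"}],
}
next_page = dict(first_page, search_after=["2025-06-06T00:00:00Z", "doc-123"])
```

The `search_after` values are simply the sort key of the last document returned, so the client carries the cursor and the cluster keeps no per-query state.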

Disk or Heap Pressure: Issues like garbage collection pauses or disk I/O bottlenecks can impair performance. Solutions include using SSDs, monitoring with '_nodes/stats', tuning garbage collection, and adjusting heap sizing based on node roles.
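Monitoring heap via `_nodes/stats` can be sketched in a few lines. The payload below is a trimmed, made-up sample shaped like the real response, not output from an actual cluster, and the 85% alert threshold is an illustrative choice.

```python
# Trimmed, illustrative _nodes/stats payload (only the heap field kept).
stats = {
    "nodes": {
        "node-a": {"jvm": {"mem": {"heap_used_percent": 78}}},
        "node-b": {"jvm": {"mem": {"heap_used_percent": 91}}},
    }
}

# Flag nodes whose heap usage is high enough that long GC pauses are likely.
HEAP_ALERT_PCT = 85
hot_nodes = [
    name
    for name, node in stats["nodes"].items()
    if node["jvm"]["mem"]["heap_used_percent"] >= HEAP_ALERT_PCT
]
print(hot_nodes)  # ["node-b"] under this sample payload
```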

Misconfigured Refresh Intervals and Merge Policies: Frequent refreshes or default merge settings not aligned with the workload can increase segment count and reduce query performance. For heavy ingest workloads, increasing the 'refresh_interval' to 30 seconds or more and reviewing merge throttling settings is recommended.
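The refresh-interval change amounts to a small settings body PUT to `/<index>/_settings`; shown here as a Python dict, with the 30s value taken from the recommendation above.

```python
# Index-settings body to relax the refresh cadence on an ingest-heavy index.
settings_body = {
    "index": {
        # Default is 1s; 30s batches more documents per segment and cuts
        # refresh overhead, at the cost of slightly staler search results.
        "refresh_interval": "30s",
    }
}
```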

The article emphasizes that many Elasticsearch performance issues are addressable with informed configuration and monitoring practices. For teams needing assistance, Dattell offers 24x7 Elasticsearch support and consulting services.

10 views
Dattell: Your AI & Data Engineering Partner

Dec 7, 2023

Migrating to OpenSearch Without Incurring Downtime

A zero downtime migration to OpenSearch ensures that your services remain available to users throughout the migration process. This is especially crucial for businesses that operate 24/7.

One important step to avoid losing data during the migration is data synchronization. You can do this by running a real-time data sync and/or applying incremental updates.
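One common real-time sync pattern is dual writes: every new document goes to both clusters while a background job backfills history. The sketch below uses in-memory stand-ins for the two clusters; the `index` interface is hypothetical, not a specific client library's API.

```python
# Minimal dual-write sketch. InMemoryIndex stands in for the old and new
# search clusters; a real setup would use the respective client libraries.
class InMemoryIndex:
    def __init__(self):
        self.docs = {}

    def index(self, doc_id, body):
        self.docs[doc_id] = body

old_cluster, new_cluster = InMemoryIndex(), InMemoryIndex()

def dual_write(doc_id, body):
    # Writes land in both clusters, so the new cluster stays current
    # while an incremental backfill catches up on historical data.
    old_cluster.index(doc_id, body)
    new_cluster.index(doc_id, body)

dual_write("order-1", {"status": "shipped"})
assert old_cluster.docs == new_cluster.docs
```

Once the backfill completes and the two clusters agree, reads can be cut over to the new cluster with no downtime.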

Check out our instruction guide for migrating to OpenSearch without downtime for more details and tips.

12 views
Dattell: Your AI & Data Engineering Partner

Jul 25, 2023

Apache Pulsar vs RabbitMQ

Both RabbitMQ and Pulsar can handle high-throughput message traffic and ensure reliable communication between various apps and other components of complex data systems.

However, there are important differences between them. Many of those differences stem from RabbitMQ's focus on simplicity and Apache Pulsar's support for more sophisticated messaging models. The drawback of Pulsar's modular architecture is increased complexity: running Pulsar means also installing the Pulsar broker, BookKeeper, and ZooKeeper.

Another important distinction is speed. RabbitMQ is fastest at low throughputs, while Pulsar pulls significantly ahead as throughput increases.

Click the link below to continue reading about Pulsar vs RabbitMQ.

212 views
Dattell: Your AI & Data Engineering Partner

May 9, 2023

Message Order Guarantees With Apache Kafka

Producers, partitions, and consumers each play a role in how Kafka guarantees message order.

Kafka Producers

Data is sent to Kafka from producers, which are often applications that generate messages. Each message sent to Kafka is assigned a key.

Kafka Partitions

Kafka stores messages within topics, and each topic has one or more partitions. A partition is a logical unit for parallelizing work across multiple Kafka instances: with three partitions, three Kafka brokers can process data across three machines.

Kafka Consumers

Consumers read messages from the partitions. Within a consumer group, each partition is read by only a single consumer, but a consumer can read from multiple partitions.
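That invariant can be sketched with a simplified round-robin assignment of partitions to group members. Kafka's real assignors (range, round-robin, sticky) are more involved; this toy version only illustrates that each partition maps to exactly one consumer while a consumer may own several partitions.

```python
# Toy partition assignment for a consumer group: round-robin partitions
# across consumers so no partition is ever read by two group members.
def assign(partitions, consumers):
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

result = assign(["p0", "p1", "p2"], ["c0", "c1"])
print(result)  # {"c0": ["p0", "p2"], "c1": ["p1"]}
```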

Click the link below to read the full blog post explaining how Kafka guarantees message order.

92 views
Dattell: Your AI & Data Engineering Partner

Feb 27, 2023

Did you know that partitions play an important role in how Kafka guarantees message order?

Messages sent to Kafka from producers are assigned a key; for instance, a username, email address, or phone number could serve as a key. Messages with the same key are always sent to the same partition.

A second important piece of the partitions' role in message ordering is that a partition delivers messages to only a single consumer within a consumer group.

Together, these restrictions help guarantee order by preventing a slow or unavailable partition or consumer from causing messages to be processed out of order.
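The key-to-partition mapping can be sketched in a few lines. Kafka's default partitioner hashes keys with murmur2; the md5-based stand-in below is illustrative only, but it preserves the property that matters here: the same key always maps to the same partition.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for Kafka's murmur2-based default partitioner: hash the key
    # deterministically, then take it modulo the partition count.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same key always lands in the same partition, so all of one user's
# messages keep their relative order for the consumer reading it.
assert partition_for("user@example.com", 3) == partition_for("user@example.com", 3)
```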

Click the article below to learn more.

101 views