Grafana or Kibana - Help me decide
The observability of applications grows in importance every day for software development teams. More observable applications improve the productivity of software teams and of software organizations as a whole. The benefits of observable applications include:
- Less time debugging, because more debug information is already available.
- Resolving issues and incidents faster.
- Improved awareness of changes in the environment, from operational load to customer behavior.
Two approaches for creating observable applications are monitoring and log analysis.
The monitoring of applications is usually performed by analyzing the changes in metrics: discrete data points describing the state of the system at a given moment. Metrics are usually submitted directly to the monitoring system by the running instance of an application. That instance can be a database instance, a web server, or any other part of the web service. Monitoring systems are generally focused on real-time metrics.
Logs are information about the specific events that took place at a certain moment in time. Log analysis is a post-event inquiry into the log entries, and therefore past events, that a running application produced. Due to the decreasing latency in log processing over the past years, you can now accomplish log analysis in near-real-time.
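To make the distinction concrete, here is a minimal sketch of how a metric data point and a log event might be modeled. The class names, field names, and sample values (`MetricPoint`, `LogEvent`, `http_requests_in_flight`) are hypothetical and are not taken from Grafana or Kibana.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricPoint:
    """A discrete numeric sample describing system state at one moment."""
    name: str
    value: float
    timestamp: float  # seconds since the Unix epoch


@dataclass(frozen=True)
class LogEvent:
    """A record of one specific event, usually free-form text."""
    timestamp: float
    message: str


# Hypothetical samples: the metric is already a number, while the
# log line still has to be parsed before it can be charted.
metric = MetricPoint(name="http_requests_in_flight", value=12.0, timestamp=1700000000.0)
event = LogEvent(timestamp=1700000000.0, message="method=get api=books result=200")
```

The metric can be plotted as is, whereas the log event carries richer context that a log analysis tool has to extract first.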
In this Stackup we look at one tool from each of the two sides: Grafana, a monitoring solution, and Kibana, a log analysis solution that is part of the Elasticsearch, Logstash, and Kibana stack, or ELK.
At their core, Grafana and Kibana cover two different use cases and sets of functionality.
Grafana is a monitoring tool, and its functionality is optimized for monitoring tasks and time series data. The data sources it supports are those most commonly used for storing application metrics, and Grafana produces alerts in real time.
Kibana is a data visualization tool. It was created to facilitate log analysis in combination with the popular Elasticsearch and Logstash. Together, the three tools allow you to query and parse relevant information out of the collected logs and display it in different ways.
What's the difference between the two use cases? Grafana focuses on efficiently displaying a defined set of metrics in real time. Kibana focuses on the exploration of available data and the flexibility of extracting metrics from raw log lines.
Both Grafana and Kibana support Elasticsearch as a data source.
Apart from Elasticsearch, Grafana supports sourcing metrics from:
- MySQL, PostgreSQL, Microsoft SQL Server
- AWS Cloudwatch
Kibana focuses on Elasticsearch and doesn't support any other data sources. However, Kibana offers more functionality for the Elasticsearch source, like exploring available data and performing a full-text search on the logs.
With Kibana, you query log lines to produce the metrics that you are looking for. For example, suppose the log lines contain information on HTTP requests:
method=post api=books result=201
method=get api=books result=200
method=get api=bookshelves result=404
If you want to present the number of successful HTTP requests versus those that didn't return valid results, you do the following:
- On the machine that produces the example logs above, set up Logstash to process the logs and write them to Elasticsearch.
- In Kibana, create a time series view that looks for the items that have your desired HTTP statuses.
A full breakdown of HTTP requests by status, country, OS and other factors in Kibana. Source: elastic.co
Every time the dashboard needs to update, the query runs and produces the most recent counts for the different HTTP statuses.
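The idea of deriving a metric from raw log lines can be sketched in plain Python. This is only an illustration of the counting step, not how Logstash, Elasticsearch, or Kibana implement it. The helper names `parse_line` and `count_outcomes` are hypothetical, and treating every 2xx status as a success is an assumption.

```python
from collections import Counter

# The example log lines from the text above.
LOG_LINES = [
    "method=post api=books result=201",
    "method=get api=books result=200",
    "method=get api=bookshelves result=404",
]


def parse_line(line: str) -> dict:
    """Split a 'key=value key=value' log line into a dictionary."""
    return dict(pair.split("=", 1) for pair in line.split())


def count_outcomes(lines) -> Counter:
    """Count 2xx statuses as successes and everything else as failures."""
    counts = Counter()
    for line in lines:
        status = int(parse_line(line)["result"])
        counts["success" if 200 <= status < 300 else "failure"] += 1
    return counts


outcomes = count_outcomes(LOG_LINES)
print(outcomes["success"], outcomes["failure"])  # prints "2 1"
```

Because the counts are computed from the raw lines each time, changing the definition of "success" only requires changing the query, not the application.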
The main area of the Kibana user interface includes a search box where you can try any Elasticsearch queries, visualize the results, and save the queries that produce the results you are looking for to dashboards.
On dashboards, it is possible to refine the set of data presented by using additional search parameters introduced via a search box (another Elasticsearch query).
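Refining a dashboard with an extra search term can be pictured as an Elasticsearch `bool` query in which both conditions must match. The sketch below only builds such a request body. The field names `api` and `result` assume that the example log lines were indexed as separate fields, `build_query` is a hypothetical helper, and the request Kibana actually sends may look different.

```python
import json


def build_query(dashboard_query: str, extra_filter: str) -> dict:
    """Return a search body in which both query strings must match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": dashboard_query}},
                    {"query_string": {"query": extra_filter}},
                ]
            }
        }
    }


# The dashboard shows the books API; the search box narrows it to status 200.
body = build_query("api:books", "result:200")
print(json.dumps(body, indent=2))
```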
Grafana's interface is optimized not for exploring data but for setting up dashboards once and using them for a long time. It is built around time series graphs, the most common visualization type in monitoring systems.
A Grafana dashboard. Source: grafana.org
Like Kibana, Grafana allows you to narrow down the content of the dashboards with variables, a pre-set list of values you can use to filter the output of the visualizations.
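Conceptually, a dashboard variable is a placeholder that is substituted into a panel's query before the query runs. The sketch below imitates that idea with Python's `string.Template`, which uses a similar `$name` placeholder style. The query text, the variable name `api`, and the allowed values are made up for illustration, and this is not Grafana's implementation.

```python
from string import Template

# Hypothetical panel query with a $api placeholder.
PANEL_QUERY = Template("SELECT count(*) FROM requests WHERE api = '$api'")

# Pre-set list of values the dashboard user may pick from.
ALLOWED_VALUES = ["books", "bookshelves"]


def render_query(selected: str) -> str:
    """Substitute the selected value, rejecting values not in the list."""
    if selected not in ALLOWED_VALUES:
        raise ValueError(f"unknown value: {selected!r}")
    return PANEL_QUERY.substitute(api=selected)


print(render_query("books"))
# prints "SELECT count(*) FROM requests WHERE api = 'books'"
```

Restricting the variable to a fixed list keeps every panel on the dashboard filtered consistently by the same selection.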
Both Grafana and Kibana offer multiple types of data visualizations which you can use on dashboards. While both systems offer visualizations for most common use cases, Kibana goes further and also provides specialized visualizers like maps and tag clouds. Kibana also allows you to embed graphs created with the Vega framework.
You can find the most common visualization types and their availability in both Grafana and Kibana in the table below.
| Visualization type | Grafana | Kibana |
|---|---|---|
| Map / geospatial data | No | Yes |
Find more details about the supported visualizations in the Grafana and Kibana docs.
Grafana has a built-in alerting engine. You can configure alerts for any metric displayed as a time series, and you define the alert condition with a query like this:
avg() OF query(A, 5m, now) IS BELOW 14
A references a metric available in Grafana.
The engine allows handling of special cases like no data available or a failed database connection. If the alert is triggered, Grafana can notify Slack, PagerDuty and other services, or send a generic webhook.
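The logic behind a rule such as `avg() OF query(A, 5m, now) IS BELOW 14` can be sketched as follows. This is a simplified illustration, not Grafana's alerting engine. The function name `evaluate_alert`, the state names, and the sample data are hypothetical.

```python
from statistics import mean


def evaluate_alert(samples, now: float, window_s: float = 300.0,
                   threshold: float = 14.0) -> str:
    """Return 'alerting', 'ok', or 'no_data' for (timestamp, value) samples."""
    # Keep only the samples that fall inside the last `window_s` seconds.
    recent = [value for ts, value in samples if now - window_s <= ts <= now]
    if not recent:
        return "no_data"  # special case: nothing was reported in the window
    return "alerting" if mean(recent) < threshold else "ok"


now = 1_000.0
samples = [(100.0, 50.0), (800.0, 10.0), (900.0, 16.0)]
print(evaluate_alert(samples, now))  # prints "alerting"
```

Here only the samples at 800 and 900 seconds fall inside the five-minute window, and their average of 13 is below the threshold of 14. An empty window is reported as a separate state rather than as a healthy system.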
You can find out more about alerting in Grafana in the docs.
Kibana doesn't handle alerts directly but requires you to configure them in Elasticsearch via data watchers. Watchers are functions that run a query periodically and act on the result. You can currently only configure watchers via the API.
Kibana and Elasticsearch currently offer limited documentation on configuring watchers that integrate with third-party services for alerting.
You can find more details about the Elasticsearch Watcher APIs in the documentation.
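As a rough sketch of what "run a query periodically and act on the result" means, the snippet below builds a watch definition as a Python dictionary. The overall shape (trigger, input, condition, actions) follows the Watcher API as commonly documented, but the index pattern `logstash-*`, the field names, the threshold, and the action name `log_404s` are assumptions, so verify the details against the documentation for your Elasticsearch version. The body would be sent to the Watcher put-watch endpoint; no request is made here.

```python
import json

# Hypothetical index pattern and field names; adjust to your own data.
watch = {
    # Run the watch every five minutes.
    "trigger": {"schedule": {"interval": "5m"}},
    # Search recent log documents whose HTTP result was 404.
    "input": {
        "search": {
            "request": {
                "indices": ["logstash-*"],
                "body": {
                    "query": {
                        "bool": {
                            "filter": [
                                {"term": {"result": "404"}},
                                {"range": {"@timestamp": {"gte": "now-5m"}}},
                            ]
                        }
                    }
                },
            }
        }
    },
    # Act only when more than ten matching documents were found.
    "condition": {"compare": {"ctx.payload.hits.total": {"gt": 10}}},
    # Write a message to the Elasticsearch log; other action types exist.
    "actions": {
        "log_404s": {
            "logging": {"text": "Found {{ctx.payload.hits.total}} 404 responses"}
        }
    },
}

print(json.dumps(watch, indent=2))
```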
While monitoring and log analysis solutions contribute to the observability of applications, the tools from the two camps solve different problems and are complementary.
Collecting metrics allows the teams responsible for applications to gain visibility into the current state of a system in real time. The application needs to submit these metrics, and changing the exact metrics submitted generally requires application changes. Collecting metrics is not always possible for legacy or closed-source applications where the team operating the system doesn't have access to the code. But if you can build metrics collection into your application, then collecting and visualizing metrics is where Grafana excels.
Log analysis makes it possible to analyze events produced by the application, which is sometimes the only way to gain insight into the state of a closed system that does not produce relevant metrics. For applications that do produce metrics, log analysis can allow operators to find new trends in the system behavior and iterate on the metrics quickly without application changes. When used as part of the ELK stack, this is where Kibana excels.
Grafana vs Kibana: What are the differences?
Grafana: Open source Graphite & InfluxDB Dashboard and Graph Editor. Grafana is a general purpose dashboard and graph composer. It's focused on providing rich ways to visualize time series metrics, mainly through graphs, but it supports other ways to visualize data through a pluggable panel architecture. It currently has rich support for Graphite, InfluxDB and OpenTSDB, and supports other data sources via plugins.
Kibana: Explore & Visualize Your Data. Kibana is an open source (Apache Licensed), browser based analytics and search dashboard for Elasticsearch. Kibana is a snap to set up and start using. Kibana strives to be easy to get started with, while also being flexible and powerful, just like Elasticsearch.
Grafana and Kibana belong to the "Monitoring Tools" category of the tech stack.
Some of the features offered by Grafana are:
- Create, edit, save & search dashboards
- Change column spans and row heights
- Drag and drop panels to rearrange
On the other hand, Kibana provides the following key features:
- Flexible analytics and visualization platform
- Real-time summary and charting of streaming data
- Intuitive interface for a variety of users
"Beautiful" is the primary reason why developers consider Grafana over the competitors, whereas "Easy to setup" was stated as the key factor in picking Kibana.
Grafana and Kibana are both open source tools. Grafana, with 29.7K GitHub stars and 5.63K forks, appears to have more adoption on GitHub than Kibana, with 12.4K stars and 4.8K forks.
Airbnb, DigitalOcean, and 9GAG are some of the popular companies that use Kibana, whereas Grafana is used by Uber Technologies, DigitalOcean, and 9GAG. Kibana has broader approval, being mentioned in 907 company stacks and 479 developer stacks, compared to Grafana, which is listed in 577 company stacks and 325 developer stacks.
One size definitely doesn’t fit all when it comes to open source monitoring solutions, and executing generally understood best practices in the context of unique distributed systems presents all sorts of problems. Megan Anctil, a senior engineer on the Technical Operations team at Slack, gave a talk at an O’Reilly Velocity Conference sharing pain points and lessons learned from wrangling known technologies such as Icinga, Graphite, Grafana, and the Elastic Stack to best fit the company’s use cases.
At the time, Slack used a few well-known monitoring tools, since its Technical Operations team wasn’t large enough to build an in-house solution for all of these. Nor did the team think it was sustainable to throw money at the problem, given the volume of information processed and the not-insignificant price and rigidity of many vendor solutions. With thousands of servers across multiple regions and millions of metrics and documents being processed and indexed per second, the team had to figure out how to scale these technologies to fit Slack’s needs.
On the backend, they experimented with multiple clusters in both Graphite and ELK, distributed Icinga nodes, and more. At the same time, they tried to build usability into Grafana that reflects the team’s mental models of the system, and they found ways to make alerts from Icinga more insightful and actionable.