What is collectd and what are its top alternatives?
Top Alternatives to collectd
- StatsD
It is a network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over UDP or TCP and sends aggregates to one or more pluggable backend services (e.g., Graphite). ...
- Nagios
Nagios is a host/service/network monitoring program written in C and released under the GNU General Public License. ...
- Ganglia
It is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. ...
- Fluentd
Fluentd collects events from various data sources and writes them to files, RDBMS, NoSQL, IaaS, SaaS, Hadoop and so on. Fluentd helps you unify your logging infrastructure. ...
- Prometheus
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true. ...
- Munin
Munin is a networked resource monitoring tool that can help analyze resource trends and "what just happened to kill our performance?" problems. It is designed to be very plug and play. A default installation provides a lot of graphs with almost no work. ...
- Telegraf
It is an agent for collecting, processing, aggregating, and writing metrics. Design goals are to have a minimal memory footprint with a plugin system so that developers in the community can easily add support for collecting metrics. ...
- Metricbeat
Collect metrics from your systems and services. From CPU to memory, Redis to NGINX, and much more, It is a lightweight way to send system and service statistics. ...
collectd alternatives & related posts
- Open source9
- Single responsibility7
- Efficient wire format5
- Handles aggregation3
- Loads of integrations3
- Many implementations1
- Scales well1
- Simple to use1
- NodeJS1
- No authentication; cannot be used over Internet1
related StatsD posts
We use collectd because of it's low footprint and great capabilities. We use it to monitor our Google Compute Engine machines. More interestingly we setup collectd as StatsD replacement - all our Clojure services push application-level metrics using our own metrics library and collectd pushes them to Stackdriver
A huge part of our continuous deployment practices is to have granular alerting and monitoring across the platform. To do this, we run Sentry on-premise, inside our VPCs, for our event alerting, and we run an awesome observability and monitoring system consisting of StatsD, Graphite and Grafana. We have dashboards using this system to monitor our core subsystems so that we can know the health of any given subsystem at any moment. This system ties into our PagerDuty rotation, as well as alerts from some of our Amazon CloudWatch alarms (we’re looking to migrate all of these to our internal monitoring system soon).
Nagios
- It just works53
- The standard28
- Customizable12
- The Most flexible monitoring system8
- Huge stack of free checks/plugins to choose from1
related Nagios posts
Why we spent several years building an open source, large-scale metrics alerting system, M3, built for Prometheus:
By late 2014, all services, infrastructure, and servers at Uber emitted metrics to a Graphite stack that stored them using the Whisper file format in a sharded Carbon cluster. We used Grafana for dashboarding and Nagios for alerting, issuing Graphite threshold checks via source-controlled scripts. While this worked for a while, expanding the Carbon cluster required a manual resharding process and, due to lack of replication, any single node’s disk failure caused permanent loss of its associated metrics. In short, this solution was not able to meet our needs as the company continued to grow.
To ensure the scalability of Uber’s metrics backend, we decided to build out a system that provided fault tolerant metrics ingestion, storage, and querying as a managed platform...
(GitHub : https://github.com/m3db/m3)
I am new to DevOps and looking for training in DevOps. Some institutes are offering Nagios while some Prometheus in their syllabus. Please suggest which one is being used in the industry and which one should I learn.
related Ganglia posts
- Open-source11
- Easy9
- Great for Kubernetes node container log forwarding9
- Lightweight9
related Fluentd posts
Prometheus
- Powerful easy to use monitoring47
- Flexible query language38
- Dimensional data model32
- Alerts27
- Active and responsive community23
- Extensive integrations22
- Easy to setup19
- Beautiful Model and Query language12
- Easy to extend7
- Nice6
- Written in Go3
- Good for experimentation2
- Easy for monitoring1
- Just for metrics12
- Bad UI6
- Needs monitoring to access metrics endpoints6
- Not easy to configure and use4
- Supports only active agents3
- Written in Go2
- TLS is quite difficult to understand2
- Requires multiple applications and tools2
- Single point of failure1
related Prometheus posts
Grafana and Prometheus together, running on Kubernetes , is a powerful combination. These tools are cloud-native and offer a large community and easy integrations. At PayIt we're using exporting Java application metrics using a Dropwizard metrics exporter, and our Node.js services now use the prom-client npm library to serve metrics.
Why we spent several years building an open source, large-scale metrics alerting system, M3, built for Prometheus:
By late 2014, all services, infrastructure, and servers at Uber emitted metrics to a Graphite stack that stored them using the Whisper file format in a sharded Carbon cluster. We used Grafana for dashboarding and Nagios for alerting, issuing Graphite threshold checks via source-controlled scripts. While this worked for a while, expanding the Carbon cluster required a manual resharding process and, due to lack of replication, any single node’s disk failure caused permanent loss of its associated metrics. In short, this solution was not able to meet our needs as the company continued to grow.
To ensure the scalability of Uber’s metrics backend, we decided to build out a system that provided fault tolerant metrics ingestion, storage, and querying as a managed platform...
(GitHub : https://github.com/m3db/m3)
Munin
- Good defaults3
- Extremely fast to install2
- Alerts can trigger any command line program2
- Adheres to traditional Linux standards2
- Easy to write custom plugins1
related Munin posts
- One agent can work as multiple exporter with min hndlng5
- Cohesioned stack for monitoring5
- Open Source2
- Metrics2
- Supports custom plugins in any language1
- Many hundreds of plugins1
related Telegraf posts
- Simple2
- Easy to setup1
related Metricbeat posts
Hi, We have a situation, where we are using Prometheus to get system metrics from PCF (Pivotal Cloud Foundry) platform. We send that as time-series data to Cortex via a Prometheus server and built a dashboard using Grafana. There is another pipeline where we need to read metrics from a Linux server using Metricbeat, CPU, memory, and Disk. That will be sent to Elasticsearch and Grafana will pull and show the data in a dashboard.
Is it OK to use Metricbeat for Linux server or can we use Prometheus?
What is the difference in system metrics sent by Metricbeat and Prometheus node exporters?
Regards, Sunil.