What is OpenTracing and what are its top alternatives?
Top Alternatives to OpenTracing
It helps gather timing data needed to troubleshoot latency problems in service architectures. Features include both the collection and lookup of this data. ...
Datadog is the leading service for cloud-scale monitoring. It is used by IT, operations, and development teams who build and operate applications that run on dynamic or hybrid cloud infrastructure. Start monitoring in minutes with Datadog! ...
Jaeger, a Distributed Tracing System
Fluentd collects events from various data sources and writes them to files, RDBMS, NoSQL, IaaS, SaaS, Hadoop and so on. Fluentd helps you unify your logging infrastructure. ...
It is a set of libraries for various languages that allow you to collect application metrics and distributed traces, then transfer the data to a backend of your choice in real time. This data can be analyzed by developers and admins to understand the health of the application and debug problems. ...
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true. ...
It is a fast, private and secure web browser for PC and mobile. It blocks ads and trackers. It prevents you from being tracked by sneaky advertisers, malware and pop-ups. ...
It provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data. ...
OpenTracing alternatives & related posts
related Zipkin posts
related Datadog posts
Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage routing any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had time to invest in using features like Anomaly Detection. We’ve started using Honeycomb for some targeted debugging of complex production issues and we are liking what we’ve seen. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible.
We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running PostgreSQL; this is available for analytics and dashboard creation through Looker.
We are looking for a centralised monitoring solution for our application deployed on Amazon EKS. We would like to monitor using metrics from Kubernetes, AWS services (NeptuneDB, AWS Elastic Load Balancing (ELB), Amazon EBS, Amazon S3, etc) and application microservice's custom metrics.
We are expected to use around 80 microservices (not replicas). I think a total of 200-250 microservices will be there in the system with 10-12 slave nodes.
We tried Prometheus but it looks like maintenance is a big issue. We need to manage scaling, maintaining the storage, and dealing with multiple exporters and Grafana. I felt this itself needs few dedicated resources (at least 2-3 people) to manage. Not sure if I am thinking in the correct direction. Please confirm.
You mentioned Datadog and Sysdig charges per host. Does it charge per slave node?
related Jaeger posts
related Fluentd posts
related OpenCensus posts
related Prometheus posts
Why we spent several years building an open source, large-scale metrics alerting system, M3, built for Prometheus:
By late 2014, all services, infrastructure, and servers at Uber emitted metrics to a Graphite stack that stored them using the Whisper file format in a sharded Carbon cluster. We used Grafana for dashboarding and Nagios for alerting, issuing Graphite threshold checks via source-controlled scripts. While this worked for a while, expanding the Carbon cluster required a manual resharding process and, due to lack of replication, any single node’s disk failure caused permanent loss of its associated metrics. In short, this solution was not able to meet our needs as the company continued to grow.
To ensure the scalability of Uber’s metrics backend, we decided to build out a system that provided fault tolerant metrics ingestion, storage, and querying as a managed platform...
(GitHub : https://github.com/m3db/m3)
We have Prometheus as a monitoring engine as a part of our stack which contains Kubernetes cluster, container images and other open source tools. Also, I am aware that Sysdig can be integrated with Prometheus but I really wanted to know whether Sysdig or sysdig+prometheus will make better monitoring solution.
related Brave posts
related Splunk posts
I use Kibana because it ships with the ELK stack. I don't find it as powerful as Splunk however it is light years above grepping through log files. We previously used Grafana but found it to be annoying to maintain a separate tool outside of the ELK stack. We were able to get everything we needed from Kibana.