What is Grafana OnCall and what are its top alternatives?
Top Alternatives to Grafana OnCall
PagerDuty is an alarm aggregation and dispatching service for system administrators and support teams. It collects alerts from your monitoring tools, gives you an overall view of all of your monitoring alarms, and alerts an on-duty engineer if there's a problem. ...
OpsGenie is a cloud-based service for dev & ops teams, providing reliable alerts, on-call schedule management, and escalations. OpsGenie integrates with monitoring tools & services and ensures that the right people are notified at the right time. ...
VictorOps is a real-time incident management platform that combines the power of people and data to embolden DevOps teams so they can handle incidents as they occur and prepare for the next one. ...
Healthchecks.io is a monitoring service for your cron jobs, background services and scheduled tasks. It works by listening for HTTP "pings" from your services. You can set up various alert methods: email, Slack, Telegram, PagerDuty, etc. ...
Bigpanda helps you manage and respond to ops incidents faster. All your alerts: organized, assignable, trackable, snoozeable, and updated in real-time. ...
Monitoring systems are often complex and require a strong sysadmin background to properly configure and maintain. Cronitor replaces all this with a simple service that anyone can set up. Receive email/SMS notifications if your jobs don't run, run too slowly, or finish too quickly. ...
It is an alert aggregation and incident management service for IT and DevOps teams. It is a real-time SaaS platform that combines collaboration with alert management so you can handle critical incidents as they occur. Quick escalations deliver the right alerts to the right people, making your team more agile. The mobile app and integrations send alerts through SMS, push notifications, and email so you never miss a critical alert. ...
Spike.sh is a simple incident alerting platform built for growing teams. Spike.sh integrates with your monitoring tools and sends alerts on phone call, SMS, Slack and MS Teams. You can create flexible on-call schedules with ease. ...
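The HTTP "ping" model described above for Healthchecks.io (and the missed-run alerts described for Cronitor) can be sketched roughly as follows. This is a minimal illustration, not either service's official client: the `PING_URL` value and the `http_ping` and `run_with_ping` helpers are hypothetical names introduced here, and a real check would use the unique ping URL issued by the monitoring service. The idea is that a scheduled job sends a request only after it succeeds, so the service can alert when an expected ping does not arrive.

```python
import urllib.request
from typing import Callable

# Hypothetical placeholder; a real check has its own unique ping URL.
PING_URL = "https://example.invalid/ping/nightly-backup"


def http_ping(url: str, timeout: float = 10.0) -> None:
    """Send a GET request to the ping URL.

    Network errors are swallowed so that a monitoring outage
    never causes the job itself to fail.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        pass


def run_with_ping(
    job: Callable[[], None],
    url: str = PING_URL,
    ping: Callable[[str], None] = http_ping,
) -> bool:
    """Run `job` and ping `url` only if the job finished without raising.

    Returns True when the job succeeded (and a ping was attempted),
    False when the job raised and no ping was sent.
    """
    try:
        job()
    except Exception:
        # No ping on failure: the missing ping is what triggers the alert.
        return False
    ping(url)
    return True


if __name__ == "__main__":
    def nightly_backup() -> None:
        print("backing up...")

    run_with_ping(nightly_backup)
```

The `ping` parameter is injectable so the wrapper can be exercised without network access, for example by passing a function that records the URLs it receives.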
Grafana OnCall alternatives & related posts
- Just works (54)
- Easy configuration (23)
- Awesome alerting hub (14)
- Fantastic alert aggregation and on-call management (11)
- User-customizable alerting modes (9)
- Awesome tool for alerting and monitoring. Love it (4)
- Most reliable out of the three and it isn't even close (3)
- Ugly UI (3)
related PagerDuty posts
Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage the routing of any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had time to invest in using features like Anomaly Detection. We’ve started using Honeycomb for some targeted debugging of complex production issues and we are liking what we’ve seen. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible.
We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running PostgreSQL; this is available for analytics and dashboard creation through Looker.
Data science and engineering teams at Lyft maintain several big data pipelines that serve as the foundation for various types of analysis throughout the business.
Apache Airflow sits at the center of this big data infrastructure, allowing users to “programmatically author, schedule, and monitor data pipelines.” Airflow is an open source tool, and “Lyft is the very first Airflow adopter in production since the project was open sourced around three years ago.”
There are several key components of the architecture. A web UI allows users to view the status of their queries, along with an audit trail of any modifications to the query. A metadata database stores things like job status and task instance status. A multi-process scheduler handles job requests and triggers the executor to execute those tasks.
Airflow supports several executors, though Lyft uses CeleryExecutor to scale task execution in production. Airflow is deployed to three Amazon Auto Scaling Groups, each associated with a Celery queue.
Audit logs supplied to the web UI are powered by the existing Airflow audit logs as well as Flask signal.
Datadog, Statsd, Grafana, and PagerDuty are all used to monitor the Airflow system.
- Two-way Slack integration (5)
- Solid scheduling and team management support (4)
- Strong API (4)
- Strong, easy, fast, fits (3)
- Two-way Nagios integration (3)
- Complete Incident Response Orchestration Platform (2)
- Free tier (2)
related OpsGenie posts
- The transmogrifier is a game changer (7)
- Great Team, Great Product (6)
- Free tier (5)
- Much better than ANY of the alternatives. Todd is GREAT (3)
- Great tiered escalation management (3)
- Android app with Wear integration (2)
- On-call routing and the timeline is brilliant (2)
- Awesome Team always updating (1)
- Nice UI (1)
related VictorOps posts
- Can be self-hosted (3)
- Great value (2)
- Free tier (2)
- Easy to understand (2)
related Healthchecks.io posts
- User interface, easy setup, analytics, integrations (7)
- Consolidates many systems into one (6)
- Correlation engine (2)
- Quick setup (1)
related Bigpanda posts
- Quick and helpful support (2)
- Simple and direct (1)