PagerDuty is an alarm aggregation and dispatching service for system administrators and support teams. It collects alerts from your monitoring tools, gives you an overall view of all of your monitoring alarms, and alerts an on-duty engineer if there's a problem. | Runbook is a SaaS application that monitors your servers and performs automated tasks when your monitors fail. Use Runbook to automatically recover from application crashes and unexpected failures without interrupting your service or your well-earned sleep! |
Alerting that works (and wakes you up): When your systems go down, PagerDuty will wake you up. You choose how you want to be alerted: by phone, SMS, or email, to multiple numbers, with retries.;Integrate all your existing monitoring tools: PagerDuty works with almost all monitoring tools, including Nagios (and Icinga), Keynote, New Relic, Pingdom, Circonus, Red Gate SQL Monitor, Server Density, Zenoss, Monit, Munin, SolarWinds, and many others. If it can send email, it will work with PagerDuty.;Native apps with push notifications: iOS and Android native apps with push notifications and a cross-platform mobile website ensure you can respond to alerts wherever you are.;On-call duty scheduling: Easily set up schedules to fairly share on-call duty responsibilities with your team.;Automatic escalation of alerts: If you're paged but don't respond in time, the alert is auto-escalated to a team member, so nothing slips through the cracks.;Reliable, distributed architecture: PagerDuty's infrastructure is fully replicated in multiple data centers, with fast failover when problems occur.;Works internationally: Phone alerts can be delivered to over 170 countries and territories, and SMS alerts are available virtually worldwide. | Monitors are used to check the status of your environment. They can be webhooks that call the Runbook RESTful API, Datadog alerts, or ping requests. Or you can set up our custom TCP port check to validate connectivity.;Reactions are automated tasks that are called when Monitors fail. A Reaction can be anything from starting or restarting servers on AWS, DigitalOcean, or elsewhere, to running a custom script or executing a command: all the first things you try when you get a 4am wake-up call.;Integrated with the tools you use today: Heroku, Salt, Rackspace, DigitalOcean, Logentries |
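The Monitor and Reaction relationship described for Runbook can be pictured with a small sketch. This is not Runbook's code or API: the `Monitor` class, its `run` method, and the `restart_service` reaction are hypothetical names chosen for illustration, and the checks are stubbed with lambdas rather than real webhooks or TCP probes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Monitor:
    """A named health check; `check` returns True when the target is healthy."""
    name: str
    check: Callable[[], bool]
    reactions: List[Callable[[str], None]] = field(default_factory=list)

    def run(self) -> bool:
        """Run the check once and call every reaction if it fails."""
        try:
            healthy = bool(self.check())
        except Exception:
            # A check that raises is treated as a failed check.
            healthy = False
        if not healthy:
            for reaction in self.reactions:
                reaction(self.name)
        return healthy


# Hypothetical reaction: record which monitor would trigger a restart.
restarted: List[str] = []


def restart_service(monitor_name: str) -> None:
    restarted.append(monitor_name)


# Stubbed checks: "web" is failing, "db" is healthy.
web = Monitor("web", check=lambda: False, reactions=[restart_service])
db = Monitor("db", check=lambda: True, reactions=[restart_service])
results = [web.run(), db.run()]
```

Only the failing monitor calls its reaction, so `restarted` ends up containing just `"web"`. A real reaction would call a cloud provider or run a script instead of appending to a list.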
Statistics | |
GitHub Stars - | GitHub Stars 193 |
GitHub Forks - | GitHub Forks 54 |
Stacks 1.0K | Stacks 6 |
Followers 703 | Followers 21 |
Votes 119 | Votes 0 |
Pros & Cons | |
Pros
Cons
| No community feedback yet |
Integrations | |

StackStorm is a platform for integration and automation across services and tools. It ties together your existing infrastructure and application environment so you can more easily automate that environment -- with a particular focus on taking actions in response to events.

VictorOps is a real-time incident management platform that combines the power of people and data to embolden DevOps teams so they can handle incidents as they occur and prepare for the next one.

OpsGenie is a cloud-based service for dev & ops teams, providing reliable alerts, on-call schedule management, and escalations. OpsGenie integrates with monitoring tools & services and ensures that the right people are notified at the right time.

End-to-end incident management platform for SRE, DevOps, Network Operations, Infrastructure, and Security Operations teams.

BigPanda helps you manage and respond to ops incidents faster. All your alerts: organized, assignable, trackable, snoozeable, and updated in real time.

Spike.sh is an incident response platform built for modern teams. Spike.sh integrates with your monitoring tools and alerts you via phone call, SMS, Slack, MS Teams, WhatsApp, and Telegram.

Healthchecks.io is a monitoring service for your cron jobs, background services and scheduled tasks. It works by listening for HTTP "pings" from your services. You can set up various alert methods: email, Slack, Telegram, PagerDuty, etc.
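The ping-listening model above amounts to a heartbeat check: a job is considered healthy as long as its last ping is recent enough. The sketch below is one way to express that idea and is not Healthchecks.io's implementation. The function name `check_status`, the status labels, and the exact boundary rules are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def check_status(last_ping: Optional[datetime], now: datetime,
                 period: timedelta, grace: timedelta) -> str:
    """Classify a heartbeat check from the time of its most recent ping."""
    if last_ping is None:
        return "new"       # no ping has ever been received
    elapsed = now - last_ping
    if elapsed <= period:
        return "up"        # pinged within the expected interval
    if elapsed <= period + grace:
        return "late"      # overdue, but still inside the grace window
    return "down"          # overdue past the grace window: time to alert


# Example: an hourly job with a ten-minute grace window.
now = datetime(2024, 1, 1, 12, 0)
period = timedelta(hours=1)
grace = timedelta(minutes=10)

statuses = [
    check_status(None, now, period, grace),
    check_status(now - timedelta(minutes=30), now, period, grace),
    check_status(now - timedelta(minutes=65), now, period, grace),
    check_status(now - timedelta(minutes=90), now, period, grace),
]
```

In this sketch an alert would be sent only on the transition to `"down"`, which keeps a slightly delayed job from paging anyone.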

It is an end-to-end incident response platform that helps tech teams adopt SRE best practices to maximize service reliability, accelerate innovation velocity and deliver outstanding customer experiences.

Monitoring systems are often complex and require a strong sysadmin background to properly configure and maintain. Cronitor replaces all this with a simple service that anyone can set up. Receive email/SMS notifications if your jobs don't run, run too slowly, or finish too quickly.
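The three alert conditions named above (a job that does not run, runs too slowly, or finishes too quickly) can be expressed as a simple classification over a run's duration. This is a minimal sketch under assumed names, not Cronitor's API: `classify_run`, its parameters, and the result labels are hypothetical.

```python
from typing import Optional


def classify_run(duration_s: Optional[float], min_s: float, max_s: float) -> str:
    """Classify one scheduled run by its duration in seconds.

    `duration_s` is None when the job never reported in its window.
    """
    if duration_s is None:
        return "missed"     # the job did not run
    if duration_s < min_s:
        return "too_fast"   # finished suspiciously quickly
    if duration_s > max_s:
        return "too_slow"   # ran longer than expected
    return "ok"


# Example: a backup job expected to take between 60 and 600 seconds.
outcomes = [classify_run(d, min_s=60, max_s=600) for d in (None, 5, 300, 900)]
```

Any outcome other than `"ok"` would be a reason to notify someone. Checking the lower bound matters because a job that exits almost immediately has often failed silently.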

If your application is spread across multiple servers, you are probably connecting to them via SSH and running the same commands over and over: clearing caches, restarting services, running backups, checking health. Wouldn't it be cool if you could do that from a browser or smartphone? Gunnery is here for you!