StackShare

Discover and share technology stacks from companies around the world.

© 2025 StackShare. All rights reserved.

PagerDuty

www.pagerduty.com?utm_source=lsio

PagerDuty is an alarm aggregation and dispatching service for system administrators and support teams. It collects alerts from your monitoring tools, gives you a single view of all your monitoring alarms, and alerts an on-duty engineer if there's a problem.

15 tools
5 decisions
349 followers
Tech Stack

Stack by Layer

  • Application & Data: 1 tool (7%)
  • Utilities: 5 tools (33%)
  • DevOps: 5 tools (33%)
  • Business Tools: 4 tools (27%)

Application & Data

1
Amazon EC2

Utilities

5
Braintree, Slack, Mailgun, Twilio, Meldium

DevOps

5
Datadog, GitHub, Jira, Chef, Sumo Logic

Business Tools

4
Zendesk, Salesforce Sales Cloud, iDoneThis, Olark

Latest from Engineering

StackShare Editors

Sep 3, 2016

Distributed Task Scheduling with Akka, Kafka, Cassandra

To solve the problem of scheduling and executing arbitrary tasks in its distributed infrastructure, PagerDuty created an open-source tool called Scheduler. Scheduler is written in Scala and uses Cassandra for task persistence. It also adds Apache Kafka to handle task queuing and partitioning, with Akka to structure the library’s concurrency.

A service schedules a task by passing it to Scheduler's Scala API, which serializes the task metadata and enqueues it into Kafka. Scheduler then consumes the tasks and persists them to Cassandra to prevent data loss.
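
The flow described above (serialize, enqueue, consume, persist) can be sketched in a few lines of Python. This is only an illustration of the pattern, not Scheduler's actual API: the real library is written in Scala, and the `Task` fields, `InMemoryLog` (standing in for a Kafka topic), and `InMemoryStore` (standing in for a Cassandra table) are all hypothetical names invented for this example.

```python
import json
from collections import deque
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Task:
    # Hypothetical task shape; the text does not list Scheduler's real fields.
    task_id: str
    run_at: float  # Unix timestamp in seconds
    payload: dict


class InMemoryLog:
    """Stand-in for a Kafka topic: an ordered queue of serialized messages."""

    def __init__(self):
        self._messages = deque()

    def enqueue(self, message: bytes) -> None:
        self._messages.append(message)

    def drain(self):
        # Yield messages in arrival order until the queue is empty.
        while self._messages:
            yield self._messages.popleft()


class InMemoryStore:
    """Stand-in for a Cassandra table keyed by task id."""

    def __init__(self):
        self.rows = {}

    def persist(self, task: Task) -> None:
        self.rows[task.task_id] = task


def schedule(task: Task, log: InMemoryLog) -> None:
    # Serialize the task metadata and enqueue it.
    log.enqueue(json.dumps(asdict(task)).encode("utf-8"))


def consume_and_persist(log: InMemoryLog, store: InMemoryStore) -> int:
    # Consume queued tasks and write each one to durable storage.
    count = 0
    for message in log.drain():
        task = Task(**json.loads(message.decode("utf-8")))
        store.persist(task)
        count += 1
    return count
```

In a real deployment the queue and the store would be separate networked systems, so persisting only after consumption is what lets a restarted consumer pick up where it left off.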

427k views
StackShare Editors

Oct 15, 2014

Throwing more hardware at Cassandra and no more multi-tenancy

On June 3, 2014, PagerDuty experienced a major incident: its Cassandra pipeline stopped processing events and refused new ones. The resulting outage lasted 3 hours, with a further period of degraded performance.

"Cassandra seems to have two modes: fine and catastrophe," said one of the PagerDuty engineers, after a seemingly routine repair cascaded into a very bad situation. Constant memory pressure and underprovisioned RAM were identified among the factors pointing to weaknesses in how the cluster was set up.

After the outage, each node in the Cassandra cluster was replaced with an m2.2xlarge EC2 instance with 4 cores and 32GB of RAM. PagerDuty also moved away from a multi-tenant Cassandra setup at that point, to help isolate future failures.

28.3k views
StackShare Editors

May 9, 2014

Using build artifacts to improve mobile app packaging

In 2014, PagerDuty struggled to safely release reliable mobile applications to users because of issues with how the code was packaged and handled.

PagerDuty's mobile apps were hybrid and used Cordova to share code between platforms. Coding was straightforward but packaging was not, because a separate Gulp-based build process was also in use. The PagerDuty team took a page from Java and started creating software artifacts.

Rather than checking in transformed code or publishing modules to NPM, the team started creating zipped-up build artifacts, which fit well with GitHub's Releases feature, introduced in 2013. So although JavaScript lacks a standard packaged app format like a JAR, PagerDuty was still able to improve the build times and sizes of its mobile apps.
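
As a rough illustration of the zipped build artifact idea, the sketch below packs a build output directory into a single versioned zip file. It is not PagerDuty's tooling (their build was Gulp-based); the function name `package_build`, the directory layout, and the `name-version.zip` naming scheme are assumptions made for this example.

```python
import zipfile
from pathlib import Path


def package_build(build_dir: Path, out_dir: Path, name: str, version: str) -> Path:
    """Zip every file under build_dir into out_dir/<name>-<version>.zip.

    out_dir should live outside build_dir, otherwise the archive
    would try to include itself.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    artifact = out_dir / f"{name}-{version}.zip"
    with zipfile.ZipFile(artifact, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Sort paths so the archive has a stable, reproducible file order.
        for path in sorted(build_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to build_dir so the zip unpacks cleanly.
                zf.write(path, arcname=path.relative_to(build_dir).as_posix())
    return artifact
```

The resulting file is the kind of single, versioned unit that can be attached to a release; the upload step itself is not shown here.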

111k views
StackShare Editors

Nov 7, 2013

Chef at PagerDuty

In late 2013, the Operations Engineering team at PagerDuty consisted of 4 engineers, all generalists, each with one or two areas of depth. Although the Operations Engineering team ran its own on-call rotation, every engineering team at PagerDuty also carried the pager.

The Operations Engineering team owned 150+ servers spanning multiple cloud providers, and used Chef to automate its infrastructure across them with a mix of fully custom cookbooks and customized community cookbooks.

Custom cookbooks were managed by Berkshelf, and each custom cookbook contained its own tests based on ChefSpec 3, coupled with RSpec.

Jenkins was used to monitor GitHub for new changes and to run unit tests on those changes.

308k views

Tools Owned

PagerDuty
Verified
930 followers
1,010 stacks

Team on StackShare

7
chrisgagne
angchappy
dbirck34
baskarp
timarmandpour
dshack
afolson