How CircleCI Processes 4.5 Million Builds Per Month

CircleCI
CircleCI’s continuous integration and delivery platform helps software teams rapidly release code with confidence by automating the build, test, and deploy process. CircleCI offers a modern software development platform that lets teams ramp quickly, scale easily, and build confidently every day.

By Rob Zuber, CTO at CircleCI.



CircleCI Workflow


Background

CircleCI is a platform for continuous integration and delivery. Thousands of engineers trust us to run tests and deploy their code, so they can focus on building great software. That trust rests on a solid stack of software that we use to keep people shipping and delivering value to their users.

As CTO at CircleCI, I help make the big technical decisions and keep our teams happy and out of trouble. Before this, I was CTO of Copious, where I learned a lot of important lessons about tech in service of building a consumer marketplace. I like snowboarding, Funkadelic, and viscous cappuccino.

The Teams

Engineers are people. People work better in small groups. So we’ve divided our team into several functional units, inspired by Spotify’s pods. We’re much smaller, so we’ve adapted their ideas to meet our needs, while maintaining the core principle that each team has the resources they need to implement a feature across the stack.

But we think of these teams as more of a guideline than actual rules, so folks are free to move around if it means they’ll be more engaged in the work. Flexibility is a key value at CircleCI: it has to be, with the majority of our engineers working remotely across multiple time zones. To keep everyone on the same page, we use Zoom for videoconferencing and screensharing and update statuses in Pingboard to keep track of who’s “in the office”.

We use JIRA to create consistency in our processes across teams. This consistency lets us stay more nimble if engineers ever need or want to switch teams. We use GitHub for version control and Slack for Giphy control. In addition to chat, we use Slack-based integrations with tools like Hubot, PagerDuty, and Looker to give us central access to many day-to-day tasks.

But you didn’t come here to read about how many Slack channels we have (241); you’re here to read about...

The Stack

Languages

Most of CircleCI is written in Clojure. It’s been this way since almost the beginning. While there were some early spikes in Rails, the passion of a sole developer won out; by the time CircleCI was released to the market, it was written entirely in Clojure, which has been at our platform’s core ever since.

Our frontend used to be in CoffeeScript, but when Om made a single-page ClojureScript application viable, we opted for consistency and unification. This choice wasn’t that hard to make, given how much we enjoy using Clojure. Having a lingua franca also helps reduce overhead when engineers want to move between layers of the stack.

That doesn’t mean we won’t sharpen other tools when warranted. The build agent for our recently launched 2.0 platform is written in Go, which lets us quickly inject a multi-platform static binary into environments where we can’t lean on a bunch of dependencies. We also use Go for CLI tools where static dependency compilation and fast start-up are more important than our love of Clojure.

But as we pull microservices out of our monolith, Clojure remains our weapon of choice. We’ve already got over ten microservices, and that number is growing rapidly. A major part of this velocity stems from using Clojure, which ensures developers can rapidly move between teams and projects without climbing a huge learning curve.

The Frontend

Our web app’s UI is written in ClojureScript. Specifically, we’re using the framework Om, a ClojureScript interface to Facebook’s React. This is currently in some flux, since we’re upgrading to Om Next, an Om reboot which fixes a lot of its quirks. You can read more about why we’re so excited in this deep dive by one of our engineers, Peter Jaros.


CircleCI Screenshot


The Backend

Two Pools, Both Alike in Dignity

There are two major pools of machines: the first hosts our own services — the systems that serve our site, manage jobs, send notifications, etc. These services are deployed within Docker containers orchestrated in Kubernetes. In 2012, this configuration wasn’t really an option. As functional programmers, though, we were big believers in immutable infrastructure, so we went all in on baking AMIs and rolling them on code changes.

However, long boot times and charges rounded up to the hour made full VMs slow and expensive to roll; rolling deploys in Docker with Kubernetes are much more efficient. Kubernetes’ ecosystem and toolchain made it an obvious choice for our fairly statically-defined processes: the rate of change of job types or how many we need in our internal stack is relatively low.

On the other hand, our customers’ jobs are changing constantly. It’s challenging to dynamically predict demand, what types of jobs we’ll be running, and the resource requirements of each of those jobs. We found that Nomad excelled in this area. With a fast, flexible scheduler built-in, Nomad distributes customer jobs across our second pool of machines, reserved specifically for scheduling purposes.

While we did evaluate both Kubernetes and Nomad to do All These Things, neither tool was optimized for such an all-inclusive job. And we treat Nomad’s scheduling role as more a piece of our software stack than as a part of the management or ops layer. So we use Kubernetes to manage the Nomad servers.

We’ve also recently started using Helm to make it easier to deploy new services into Kubernetes. We’ve had to build a couple of small services to string the full CD process together with Helm while also keeping Kubernetes locked down, but the results have been great. We create a chart (i.e. package) for each service. This lets us easily roll back new software and gives us an audit trail of what was installed or upgraded.

Infrastructure

For the last five years, we’ve run our infrastructure on AWS. It started simply because our architecture was simple but evolved into a necessarily complex stack of Linked Accounts, VPCs, Security Groups, and everything else AWS offers to help partition and restrict resources. We’re also running across multiple regions. Our deep investment in AWS led to increasing assumptions in our code about how the software was being managed.

When we introduced CircleCI Enterprise (our on-prem offering), we started supporting a number of different deployment models. We also started separating ourselves further from the system by packaging our code in Docker containers and using cloud-agnostic Kubernetes to manage resources and distribution.

With a much lower level of vendor lock-in, we’ve gained the flexibility to push part of our workload to Google Cloud Platform (GCP) when it suits us. We chose GCP because it’s particularly well-suited for short-lived VMs. Today, if you use our machine executor to run a job, it will run in GCP. This executor type allocates a full VM for tasks that need it.

We’ve also wrapped GCP in a VM service that preallocates machines, then tears everything down once you’re finished. Using an entire VM means you have full control over a much faster machine. We’re pretty happy with this architecture since it smooths out future forays into other platforms: we can just drop in the Go build agent and be on our merry way.

Communication with Frontend

When the frontend needs to talk to the backend, it does so via a dedicated tier of API hosts. These API hosts are also managed by Kubernetes, albeit in a separate cluster to increase isolation. Nearly all our APIs are public, which means we’re using the same interfaces available to our customers. The value of dogfooding your APIs can’t be overstated: it’s enabled us to keep the APIs clean and spot errors before our users find them.

If you’re interacting with our web application, then all of your requests are hitting the API hosts. The majority of our authentication is handled via OAuth from GitHub or Bitbucket. Once you’ve authenticated, you can also generate an API token to get programmatic access to everything we expose in the UI.

Our API hosts once accepted webhooks from GitHub and Bitbucket, but we’ve recently extracted that into its own service. Using a cleanly-separated service that dumps hooks into RabbitMQ allows us to more easily respond to a large array of operational issues. When version control system (VCS) providers are recovering from their own issues, we’ve seen massive spikes in hooks. Now we’re well equipped to deal with that.
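To illustrate why a queue between hook intake and hook processing helps with spikes, here is a minimal sketch in Python. It is not CircleCI's service code: the standard-library `queue.Queue` stands in for RabbitMQ, and the names `receive_hook` and `drain` are hypothetical.

```python
import queue

# Stand-in for a RabbitMQ queue; a real service would publish to a broker.
hook_queue = queue.Queue()


def receive_hook(payload):
    """Accept a webhook quickly: enqueue it and return without processing."""
    hook_queue.put(payload)
    return 202  # hypothetical "accepted" status returned to the VCS provider


def drain(handler, max_items):
    """Process up to max_items queued hooks at the consumer's own pace."""
    processed = 0
    while processed < max_items:
        try:
            payload = hook_queue.get_nowait()
        except queue.Empty:
            break
        handler(payload)
        processed += 1
    return processed


# A burst of hooks arrives, but the consumer only takes what it can handle.
for i in range(5):
    receive_hook({"repo": "example/repo", "push": i})

seen = []
first_batch = drain(seen.append, max_items=3)
second_batch = drain(seen.append, max_items=3)
```

Intake stays cheap during a burst, and the backlog is worked off in batches sized to what the consumer can handle.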

Data! Data! Data!

Our primary datastore is MongoDB. We made this decision in CircleCI’s early days — lured like so many others by the simplicity of “schemaless” storage and rapid iteration. Having peaked at over 10TB of bloated storage in MMAP, along with painful, outage-inducing DB-level locks in Mongo 2.4, we’re happy to see progress being made in WiredTiger. Our operations have greatly improved, but we’re still suffering from a legacy of poorly-enforced schemas on a dataset too large to clean efficiently.

So we’re retreating to the structure of PostgreSQL. We’ve got a great opportunity for this migration as we build microservices with their own datastores. We’re also using Redis to cache data we’d never store permanently, as well as to rate-limit our requests to partners’ APIs (like GitHub).
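One common way to rate-limit outbound requests with a shared counter is a fixed-window limiter. The sketch below is an assumption about the general technique, not a description of CircleCI's implementation: a plain dict stands in for Redis, and the class name, limit, and window size are all hypothetical.

```python
class FixedWindowLimiter:
    """Fixed-window rate limiter; a dict stands in for counters kept in Redis."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # (key, window index) -> request count. With Redis, each counter
        # would be a key with an expiry so that old windows clean themselves up.
        self.counters = {}

    def allow(self, key, now):
        """Return True if a request for `key` at time `now` is under the limit."""
        window = int(now // self.window_seconds)
        bucket = (key, window)
        count = self.counters.get(bucket, 0) + 1
        self.counters[bucket] = count
        return count <= self.limit


# Two requests per 60-second window to a hypothetical "github" partner.
limiter = FixedWindowLimiter(limit=2, window_seconds=60)
results = [limiter.allow("github", now=t) for t in (0, 10, 20, 61)]
```

The third call falls in the same window as the first two and is refused; the fourth lands in a new window and is allowed again.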

When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We’re well beyond the scale where we could just dump this kind of stuff in a database. We handle any side effects of S3’s eventual consistency model within our code to ensure that we deal with user requests correctly while writes are in progress.
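The article does not say how these side effects are handled. One generic approach to reading from an eventually consistent store is to retry with backoff until the object becomes visible. The sketch below shows only that general idea; `read_with_retry` and `LaggingStore` are hypothetical names, and the fake store replaces any real S3 client.

```python
import time


def read_with_retry(fetch, key, attempts=5, delay_seconds=0.0):
    """Call fetch(key) until it returns a value or the attempts run out."""
    for attempt in range(attempts):
        value = fetch(key)
        if value is not None:
            return value
        # Simple exponential backoff between attempts.
        time.sleep(delay_seconds * (2 ** attempt))
    return None


class LaggingStore:
    """Fake store whose writes become visible only after a few reads."""

    def __init__(self, lag_reads):
        self.lag_reads = lag_reads
        self.data = {}
        self.reads = 0

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        self.reads += 1
        if self.reads <= self.lag_reads:
            return None  # the write is not visible yet
        return self.data.get(key)


store = LaggingStore(lag_reads=2)
store.put("build-1/log", b"done")
value = read_with_retry(store.get, "build-1/log")
```

A caller that still gets `None` after the last attempt can tell the user the data is not ready yet instead of reporting it as missing.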

A Build is Born

When we process a webhook from GitHub/Bitbucket telling us that a user pushed some new code, we use the information to create a new build or workflow representation in our datastores, then queue it for processing. In order to get promoted out of this first queue, the organization needs to have enough capacity in its plan to run the build/workflow.

If you’re a customer using all your containers, no new builds or workflows are runnable until enough containers free up. When that happens, we’ll pass the definition of the work to be performed to Nomad, which is responsible for allocating hardware for the work’s duration.
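The capacity gate described above can be sketched as a per-organization queue that promotes work only when the plan has free containers. This is a minimal illustration under stated assumptions: strict first-in, first-out ordering is assumed, and `OrgQueue` and its methods are hypothetical names, not CircleCI's internals.

```python
from collections import deque


class OrgQueue:
    """Holds builds for one organization until its plan has free containers."""

    def __init__(self, plan_containers):
        self.plan_containers = plan_containers
        self.in_use = 0
        self.waiting = deque()

    def enqueue(self, build_id, containers_needed):
        self.waiting.append((build_id, containers_needed))

    def promote(self):
        """Return the builds, in order, that fit in the remaining capacity."""
        runnable = []
        while self.waiting:
            build_id, needed = self.waiting[0]
            if self.in_use + needed > self.plan_containers:
                break  # head of the queue waits for containers to free up
            self.waiting.popleft()
            self.in_use += needed
            runnable.append(build_id)  # would be handed to the scheduler here
        return runnable

    def finish(self, containers_released):
        self.in_use -= containers_released


org = OrgQueue(plan_containers=2)
org.enqueue("build-1", 1)
org.enqueue("build-2", 1)
org.enqueue("build-3", 1)
first = org.promote()   # fills both containers
org.finish(1)           # one build completes
second = org.promote()  # the waiting build can now run
```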

Running the Build

The gritty details of processing a build are executed by the creatively named build agent. It parses configuration, executes commands, and synthesizes actions that create artifacts and test results. Most builds run in a Docker container, or set of containers, which is defined by the customer for a completely tailored build environment.


CircleCI Screenshot


The build agent streams the results of its work over gRPC to the output processor, a secure facade that understands how to write to all our internal systems. This facade approach allows our 1.0 and 2.0 platforms to coexist.

To get this live streaming data to your browser, we use WebSockets managed by Pusher. The same channel delivers state change notifications to the browser, e.g. when a build completes. Meanwhile, we store small segments of output temporarily in Redis until we have collected enough to write permanently to S3.
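The buffer-then-flush pattern behind that last step can be sketched as follows. This is an illustration only: a Python list stands in for the temporary storage in Redis, a callback stands in for the write to S3, and the `OutputBuffer` name and the byte threshold are hypothetical.

```python
class OutputBuffer:
    """Collects small output segments and flushes them in larger chunks."""

    def __init__(self, flush_bytes, write_permanent):
        self.flush_bytes = flush_bytes
        self.write_permanent = write_permanent  # stand-in for a write to S3
        self.segments = []  # stand-in for temporary storage in Redis
        self.size = 0

    def append(self, segment):
        self.segments.append(segment)
        self.size += len(segment)
        if self.size >= self.flush_bytes:
            self.flush()

    def flush(self):
        """Write everything buffered so far as one chunk, then reset."""
        if self.segments:
            self.write_permanent(b"".join(self.segments))
            self.segments = []
            self.size = 0


chunks = []
buf = OutputBuffer(flush_bytes=10, write_permanent=chunks.append)
for piece in (b"step 1\n", b"step 2\n", b"ok\n"):
    buf.append(piece)
buf.flush()  # final flush when the build finishes
```

Batching small segments this way trades a short delay in permanence for far fewer, larger writes to the blob store.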

A Hubot Postscript

We have added very little to the CoffeeScript Hubot application: just enough to allow it to talk to our Hubot workers. The Hubot workers implement our operational management functionality and expose it to Hubot so we can get chat integration for free. We’ve also tailored the authentication and authorization code of Hubot to meet the needs of roles within our team.

For larger tasks, we’ve got an internal CLI written in Go that talks to the same API as Hubot, giving access to the same functionality we have in Slack, with the addition of scripting, piping, and all of our favorite Unix tools. When the Hubot worker recognizes the CLI is in use, it logs the commands to Slack to maintain visibility of operational changes.

Analytics & Monitoring

Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and an integration with PagerDuty to route alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had the time to invest in figuring out Datadog’s more anomalous features, nor the willingness to trust that they will just work for us. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert it into a Datadog metric, to keep Rollbar as clean as possible. We’re also using LaunchDarkly to safely deploy new and/or incomplete features behind feature flags.

We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running Postgres; this is available for rapid analytics and dashboard creation through Looker. Many engineers who want to do their own analysis use tools they’re comfortable with, which includes sed and awk but also Pandas and R.

TL;DR

One of the great things about being a CI/CD company is that we get to practice what we preach. Instead of long dry spells between releases, we push several changes per day to keep our feedback loops short and our codebase clean. We’re small enough that we can move quickly, but large enough that our teams have the resources they need.

This is our stack today. As our customers deal with more complex problems, we’ll adapt and adopt new tools to deal with emerging tech. It’s all very exciting, and we can’t wait to see what the future holds.

While we wait for the future, though, there’s no reason you should be waiting for good code. Start building on CircleCI today and ship your code faster. Or come work with us and help us ship our own code faster.

P.S. If you're already a CircleCI customer, head over to our community site and share your stack to get some free swag.

