How CircleCI Processes 4.5 Million Builds Per Month

CircleCI
CircleCI’s continuous integration and delivery platform helps software teams rapidly release code with confidence by automating the build, test, and deploy process. CircleCI offers a modern software development platform that lets teams ramp quickly, scale easily, and build confidently every day.

By Rob Zuber, CTO at CircleCI.



CircleCI Workflow


Background

CircleCI is a platform for continuous integration and delivery. Thousands of engineers trust us to run tests and deploy their code, so they can focus on building great software. That trust rests on a solid stack of software that we use to keep people shipping and delivering value to their users.

As CTO of Engineering at CircleCI, I help make the big technical decisions and keep our teams happy and out of trouble. Before this, I was CTO of Copious, where I learned a lot of important lessons about tech in service of building a consumer marketplace. I like snowboarding, Funkadelic, and viscous cappuccino.

The Teams

Engineers are people. People work better in small groups. So we’ve divided our team into several functional units, inspired by Spotify’s pods. We’re much smaller, so we’ve adapted their ideas to meet our needs, while maintaining the core principle that each team has the resources they need to implement a feature across the stack.

But we think of these teams as more of a guideline than actual rules, so folks are free to move around if it means they’ll be more engaged in the work. Flexibility is a key value at CircleCI: it has to be, with the majority of our engineers working remotely across multiple time zones. To keep everyone on the same page, we use Zoom for videoconferencing and screensharing and update statuses in Pingboard to keep track of who’s “in the office”.

We use JIRA to create consistency in our processes across teams. This consistency lets us stay more nimble if engineers ever need or want to switch teams. We use GitHub for version control and Slack for Giphy control. In addition to chat, we use Slack-based integrations with tools like Hubot, PagerDuty, and Looker to give us central access to many day-to-day tasks.

But you didn’t come here to read about how many Slack channels we have (241). You’re here to read about...

The Stack

Languages

Most of CircleCI is written in Clojure. It’s been this way since almost the beginning. While there were some early spikes in Rails, the passion of a sole developer won out; by the time CircleCI was released to the market, it was written entirely in Clojure, which has been at our platform’s core ever since.

Our frontend used to be in CoffeeScript, but when Om made a single-page ClojureScript application viable, we opted for consistency and unification. This choice wasn’t that hard to make, given how much we enjoy using Clojure. Having a lingua franca also helps reduce overhead when engineers want to move between layers of the stack.

That doesn’t mean we won’t sharpen other tools when warranted. The build agent for our recently launched 2.0 platform is written in Go, which lets us quickly inject a multi-platform static binary into environments where we can’t lean on a bunch of dependencies. We also use Go for CLI tools where static dependency compilation and fast start-up are more important than our love of Clojure.

But as we pull microservices out of our monolith, Clojure remains our weapon of choice. We’ve already got over ten microservices, and that number is growing rapidly. A major part of this velocity stems from using Clojure, which ensures developers can rapidly move between teams and projects without climbing a huge learning curve.

The Frontend

Our web app’s UI is written in ClojureScript. Specifically, we’re using the framework Om, a ClojureScript interface to Facebook’s React. This is currently in some flux, since we’re upgrading to Om Next, an Om reboot which fixes a lot of its quirks. You can read more about why we’re so excited in this deep dive by one of our engineers, Peter Jaros.


CircleCI Screenshot


The Backend

Two Pools, Both Alike in Dignity

There are two major pools of machines: the first hosts our own services — the systems that serve our site, manage jobs, send notifications, etc. These services are deployed within Docker containers orchestrated in Kubernetes. In 2012, this configuration wasn’t really an option. As functional programmers, though, we were big believers in immutable infrastructure, so we went all in on baking AMIs and rolling them on code changes.

However, slow boot times and charges rounded up to the hour made using full VMs slow and expensive; rolling deploys in Docker with Kubernetes is much more efficient. Kubernetes’ ecosystem and toolchain made it an obvious choice for our fairly statically-defined processes: the rate of change of job types or how many we need in our internal stack is relatively low.

On the other hand, our customers’ jobs are changing constantly. It’s challenging to dynamically predict demand, what types of jobs we’ll be running, and the resource requirements of each of those jobs. We found that Nomad excelled in this area. With a fast, flexible scheduler built-in, Nomad distributes customer jobs across our second pool of machines, reserved specifically for scheduling purposes.
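To illustrate what placing jobs by resource requirements means, here is a minimal first-fit sketch. It is not Nomad's actual algorithm or API; `Machine`, `place`, and the CPU and memory units are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    cpu_free: int  # hypothetical unit: MHz
    mem_free: int  # hypothetical unit: MB
    jobs: list = field(default_factory=list)


def place(job_id, cpu, mem, machines):
    """Put the job on the first machine with enough free CPU and memory."""
    for machine in machines:
        if machine.cpu_free >= cpu and machine.mem_free >= mem:
            machine.cpu_free -= cpu
            machine.mem_free -= mem
            machine.jobs.append(job_id)
            return machine.name
    return None  # nothing fits right now; the job stays pending


fleet = [
    Machine("m1", cpu_free=2000, mem_free=4096),
    Machine("m2", cpu_free=4000, mem_free=8192),
]

a = place("job-a", cpu=1500, mem=2048, machines=fleet)
b = place("job-b", cpu=1000, mem=1024, machines=fleet)
c = place("job-c", cpu=4000, mem=1024, machines=fleet)
```

Here `job-a` lands on `m1`, `job-b` no longer fits there and goes to `m2`, and `job-c` has to wait because no machine has 4000 MHz free. A real scheduler weighs many more constraints, but the core problem is the same: requirements are only known when the job arrives.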

While we did evaluate both Kubernetes and Nomad to do All These Things, neither tool was optimized for such an all-inclusive job. And we treat Nomad’s scheduling role as more a piece of our software stack than as a part of the management or ops layer. So we use Kubernetes to manage the Nomad servers.

We’ve also recently started using Helm to make it easier to deploy new services into Kubernetes. We’ve had to build a couple small services to string the full CD process together with Helm, while also keeping Kubernetes locked down — but the results have been great. We create a chart (i.e. package) for each service. This lets us easily roll back new software and gives us an audit trail of what was installed or upgraded.

Infrastructure

For the last five years, we’ve run our infrastructure on AWS. It started simply because our architecture was simple but evolved into a necessarily complex stack of Linked Accounts, VPCs, Security Groups, and everything else AWS offers to help partition and restrict resources. We’re also running across multiple regions. Our deep investment in AWS led to increasing assumptions in our code about how the software was being managed.

When we introduced CircleCI Enterprise (our on-prem offering), we started supporting a number of different deployment models. We also started separating ourselves further from the system by packaging our code in Docker containers and using cloud-agnostic Kubernetes to manage resources and distribution.

With a much lower level of vendor lock-in, we’ve gained the flexibility to push part of our workload to Google Cloud Platform (GCP) when it suits us. We chose GCP because it’s particularly well-suited for short-lived VMs. Today, if you use our machine executor to run a job, it will run in GCP. This executor type allocates a full VM for tasks that need it.

We’ve also wrapped GCP in a VM service that preallocates machines, then tears everything down once you’re finished. Using an entire VM means you have full control over a much faster machine. We’re pretty happy with this architecture since it smooths out future forays into other platforms: we can just drop in the Go build agent and be on our merry way.
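The preallocate-then-tear-down pattern can be sketched as a small pool that keeps a few machines booted ahead of demand. This is an illustration under assumptions, not the actual VM service: `VMPool`, `boot_vm`, and `destroy_vm` are hypothetical, and the callbacks stand in for real cloud API calls.

```python
import itertools


class VMPool:
    """Keeps `target_idle` machines booted ahead of demand."""

    def __init__(self, target_idle, boot_vm, destroy_vm):
        self.target_idle = target_idle
        self.boot_vm = boot_vm        # hypothetical: asks the cloud for a new VM
        self.destroy_vm = destroy_vm  # hypothetical: deletes a VM
        self.idle = []
        self.refill()

    def refill(self):
        """Boot machines until the idle pool is back at its target size."""
        while len(self.idle) < self.target_idle:
            self.idle.append(self.boot_vm())

    def acquire(self):
        """Hand out an already-booted VM if one exists, then top the pool up."""
        vm = self.idle.pop(0) if self.idle else self.boot_vm()
        self.refill()
        return vm

    def release(self, vm):
        """VMs are never reused: tear the machine down once the job is done."""
        self.destroy_vm(vm)


ids = itertools.count(1)
destroyed = []
pool = VMPool(
    target_idle=2,
    boot_vm=lambda: f"vm-{next(ids)}",
    destroy_vm=destroyed.append,
)

vm = pool.acquire()
pool.release(vm)
```

The job pays no boot time because `vm-1` was already waiting, and destroying it on release means the next job always starts from a clean machine.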

Communication with Frontend

When the frontend needs to talk to the backend, it does so via a dedicated tier of API hosts. These API hosts are also managed by Kubernetes, albeit in a separate cluster to increase isolation. Nearly all our APIs are public, which means we’re using the same interfaces available to our customers. The value of dogfooding your APIs can’t be overstated: it’s enabled us to keep the APIs clean and spot errors before our users find them.

If you’re interacting with our web application, then all of your requests are hitting the API hosts. The majority of our authentication is handled via OAuth from GitHub or Bitbucket. Once you’ve authenticated, you can also generate an API token to get programmatic access to everything we expose in the UI.

Our API hosts once accepted webhooks from GitHub and Bitbucket, but we’ve recently extracted that into its own service. Using a cleanly-separated service that dumps hooks into RabbitMQ allows us to more easily respond to a large array of operational issues. When version control system (VCS) providers are recovering from their own issues, we’ve seen massive spikes in hooks. Now we’re well equipped to deal with that.
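A rough sketch of why a separate intake service helps during a spike: the receiver only validates and enqueues, and a consumer drains at its own pace. Python's in-memory `queue.Queue` stands in for RabbitMQ here, and `HookIntake`, `receive`, and `drain` are hypothetical names, not the real service's interface.

```python
import json
import queue


class HookIntake:
    """Accepts webhook payloads and enqueues them without processing them."""

    def __init__(self):
        # Stand-in for a durable broker queue (assumption: RabbitMQ in production).
        self._queue = queue.Queue()

    def receive(self, raw_body: str) -> bool:
        """Check that the body is JSON, enqueue it, and return immediately."""
        try:
            payload = json.loads(raw_body)
        except json.JSONDecodeError:
            return False  # reject malformed hooks at the edge
        self._queue.put(payload)
        return True

    def backlog(self) -> int:
        return self._queue.qsize()

    def drain(self, handler, max_items: int) -> int:
        """Process up to max_items hooks; the rest wait in the queue."""
        processed = 0
        while processed < max_items:
            try:
                payload = self._queue.get_nowait()
            except queue.Empty:
                break
            handler(payload)
            processed += 1
        return processed


intake = HookIntake()
# Simulate a spike of 1,000 hooks arriving at once.
for i in range(1000):
    intake.receive(json.dumps({"repo": "example/app", "push_id": i}))

seen = []
handled = intake.drain(seen.append, max_items=100)
```

After the drain, 100 hooks have been handled and 900 are still queued. The spike turns into backlog rather than load on the API hosts.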

Data! Data! Data!

Our primary datastore is MongoDB. We made this decision in CircleCI’s early days, lured like so many others by the simplicity of “schemaless” storage and rapid iteration. After peaking at over 10TB of bloated storage in MMAP and enduring painful, outage-inducing DB-level locks in Mongo 2.4, we’re happy to see progress being made in WiredTiger. Our operations have greatly improved, but we’re still suffering from a legacy of poorly-enforced schemas on a dataset too large to clean efficiently.

So we’re retreating to the structure of PostgreSQL. We’ve got a great opportunity for this migration as we build microservices with their own datastores. We’re also using Redis to cache data we’d never store permanently, as well as to rate-limit our requests to partners’ APIs (like GitHub).
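One common way to rate-limit outbound calls with a shared counter is a fixed window: count requests per key per time window and refuse once the limit is reached. The sketch below assumes that approach and uses a plain dictionary where a shared store such as Redis would hold the counters; `FixedWindowLimiter` and its parameters are hypothetical, and the article does not say which algorithm is actually used.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps (key, window index) -> count. Assumption: in a shared store this
        # would be a counter with an expiry; here old windows are dropped by hand.
        self._counts = {}

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        # Forget counters from earlier windows so the dict does not grow.
        self._counts = {k: v for k, v in self._counts.items() if k[1] == window}
        count = self._counts.get((key, window), 0)
        if count >= self.limit:
            return False
        self._counts[(key, window)] = count + 1
        return True


# A fake clock keeps the example deterministic.
now = [0.0]
limiter = FixedWindowLimiter(limit=3, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("github") for _ in range(5)]
now[0] = 61.0  # move into the next window
after_reset = limiter.allow("github")
```

The first three calls pass, the next two are refused, and the call in the following window passes again.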

When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We’re well beyond the scale where we could just dump this kind of stuff in a database. We handle any side-effects of S3’s eventual consistency model within our code to ensure that we deal with user requests correctly while writes are in process.
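One simple way to cope with a store where a fresh write may not be readable yet is to retry the read with exponential backoff. The sketch below assumes that strategy; `NotYetVisible`, `read_with_retry`, and `flaky_fetch` are hypothetical names, the fetch function is a fake rather than a real S3 client, and the article does not describe the exact handling used.

```python
import time


class NotYetVisible(Exception):
    """Raised when an object was written but cannot be read yet."""


def read_with_retry(fetch, key, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call fetch(key), retrying with exponential backoff while it is missing."""
    for attempt in range(attempts):
        try:
            return fetch(key)
        except NotYetVisible:
            if attempt == attempts - 1:
                raise  # give up and let the caller decide what to show
            sleep(base_delay * (2 ** attempt))


# A fake store whose object only becomes visible on the third read.
calls = {"n": 0}


def flaky_fetch(key):
    calls["n"] += 1
    if calls["n"] < 3:
        raise NotYetVisible(key)
    return b"build log contents"


delays = []  # record the waits instead of actually sleeping
data = read_with_retry(flaky_fetch, "logs/build-1", sleep=delays.append)
```

Injecting `sleep` keeps the example fast and makes the backoff schedule easy to inspect.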

A Build is Born

When we process a webhook from GitHub/Bitbucket telling us that a user pushed some new code, we use the information to create a new build or workflow representation in our datastores, then queue it for processing. In order to get promoted out of this first queue, the organization needs to have enough capacity in its plan to run the build/workflow.

If you’re a customer using all your containers, no new builds or workflows are runnable until enough containers free up. When that happens, we’ll pass the definition of the work to be performed to Nomad, which is responsible for allocating hardware for the work’s duration.
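The capacity check described above can be sketched as a per-organization queue that only promotes work when the plan has free containers. This is a simplified illustration: `PlanQueue` is a hypothetical name, and strict first-in-first-out ordering is an assumption the article does not confirm.

```python
from collections import deque


class PlanQueue:
    """Holds builds for one organization until its plan has free containers."""

    def __init__(self, plan_containers):
        self.plan_containers = plan_containers
        self.in_use = 0
        self.waiting = deque()  # (build_id, containers_needed), in arrival order

    def enqueue(self, build_id, containers_needed=1):
        self.waiting.append((build_id, containers_needed))

    def promote(self):
        """Return the builds that can start now, in arrival order."""
        runnable = []
        while self.waiting:
            build_id, needed = self.waiting[0]
            if self.in_use + needed > self.plan_containers:
                break  # head of the queue does not fit; everything behind it waits
            self.waiting.popleft()
            self.in_use += needed
            runnable.append(build_id)  # here the work would go to the scheduler
        return runnable

    def finish(self, containers_freed=1):
        self.in_use = max(0, self.in_use - containers_freed)


org = PlanQueue(plan_containers=2)
for build_id in ("build-1", "build-2", "build-3"):
    org.enqueue(build_id)

first = org.promote()   # two containers are available
second = org.promote()  # the plan is full
org.finish()            # one build completes
third = org.promote()
```

With a two-container plan, the first two builds start immediately and the third is promoted only after a container frees up.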

Running the Build

The gritty details of processing a build are executed by the creatively named build agent. It parses configuration, executes commands, and synthesizes actions that create artifacts and test results. Most builds run in a Docker container, or set of containers, which is defined by the customer for a completely tailored build environment.


CircleCI Screenshot


The build agent streams the results of its work over gRPC to the output processor, a secure facade that understands how to write to all our internal systems. This facade approach allows our 1.0 and 2.0 platforms to coexist.

In order to get this live streaming data to your browser, we use WebSockets managed by Pusher. We also use this channel to deliver state change notifications to the browser, e.g. when a build completes. We also store small segments temporarily in Redis while we collect enough to write permanently to S3.
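The collect-then-write step can be sketched as a buffer that accumulates small segments and flushes them as one larger blob once a size threshold is reached. In this illustration a Python list stands in for the temporary store and a callback stands in for the permanent write; `SegmentBuffer`, `flush_bytes`, and `write_blob` are hypothetical names.

```python
class SegmentBuffer:
    """Collects small output segments and flushes them as one larger object."""

    def __init__(self, flush_bytes, write_blob):
        self.flush_bytes = flush_bytes
        self.write_blob = write_blob  # hypothetical callable for permanent storage
        self._segments = []           # stand-in for the temporary store
        self._size = 0

    def append(self, segment: bytes):
        self._segments.append(segment)
        self._size += len(segment)
        if self._size >= self.flush_bytes:
            self.flush()

    def flush(self):
        """Write everything collected so far as a single blob."""
        if not self._segments:
            return
        self.write_blob(b"".join(self._segments))
        self._segments = []
        self._size = 0


blobs = []
buf = SegmentBuffer(flush_bytes=10, write_blob=blobs.append)
for line in (b"make\n", b"test\n", b"ok\n"):
    buf.append(line)
buf.flush()  # build finished: write whatever is left
```

Batching this way trades a short delay before data is permanent for far fewer, larger writes to blob storage.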

A Hubot Postscript

We have added very little to the CoffeeScript Hubot application, just enough to allow it to talk to our Hubot workers. The Hubot workers implement our operational management functionality and expose it to Hubot so we can get chat integration for free. We’ve also tailored the authentication and authorization code of Hubot to meet the needs of roles within our team.

For larger tasks, we’ve got an internal CLI written in Go that talks to the same API as Hubot, giving access to the same functionality we have in Slack, with the addition of scripting, piping, and all of our favorite Unix tools. When the Hubot worker recognizes the CLI is in use, it logs the commands to Slack to maintain visibility of operational changes.

Analytics & Monitoring

Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage routing any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had the time to invest in figuring out Datadog’s more anomalous features, nor the willingness to trust that they will just work for us. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible. We’re also using LaunchDarkly to safely deploy new and/or incomplete features behind feature flags.
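For readers new to feature flags, the core idea behind a gradual rollout can be sketched with stable hashing: each user maps to a fixed bucket, and the feature is on when that bucket falls below the rollout percentage. This is a generic illustration, not LaunchDarkly's SDK or algorithm; `bucket`, `is_enabled`, and the flag name are hypothetical.

```python
import hashlib


def bucket(flag: str, user_key: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_key}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_key: str, rollout_percent: int) -> bool:
    """A user sees the feature when their bucket is below the rollout percentage."""
    return bucket(flag, user_key) < rollout_percent


users = [f"user-{i}" for i in range(1000)]
enabled_at_10 = {u for u in users if is_enabled("new-dashboard", u, 10)}
enabled_at_50 = {u for u in users if is_enabled("new-dashboard", u, 50)}
```

Because buckets are stable, everyone who had the feature at 10% still has it at 50%, so raising the percentage never takes it away from a user who already had it.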

We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running Postgres; this is available for rapid analytics and dashboard creation through Looker. Many engineers who want to do their own analysis use tools they’re comfortable with, which includes sed and awk but also Pandas and R.

TL;DR

One of the great things about being a CI/CD company is that we get to practice what we preach. Instead of long dry spells between releases, we push several changes per day to keep our feedback loops short and our codebase clean. We’re small enough that we can move quickly, but large enough that our teams have the resources they need.

This is our stack today. As our customers deal with more complex problems, we’ll adapt and adopt new tools to deal with emerging tech. It’s all very exciting, and we can’t wait to see what the future holds.

While we wait for the future, though, there’s no reason you should be waiting for good code. Start building on CircleCI today and ship your code faster. Or come work with us and help us ship our own code faster.

P.S. If you're already a CircleCI customer, head over to our community site and share your stack to get some free swag.

