How LaunchDarkly Serves Over 4 Billion Feature Flags Daily

Serving over 10 billion feature flags daily to help software teams build better software, faster. LaunchDarkly helps eliminate risk for developers and operations teams from the software development cycle.

Editor's note: By John Kodumal, CTO, LaunchDarkly

LaunchDarkly Platform


Feature flagging (wrapping a feature in a flag that’s controlled outside of deployment) is a technique for effective continuous delivery. For example, you can wrap a new signup form in a feature flag and then control which users see that form, all without having to redeploy code or modify a database. Engineering-driven companies (think Google, Facebook, Twitter) invest heavily in custom-built feature flag management systems to roll features out to whom they want, when they want. Smaller companies build and maintain their own feature flagging infrastructure or use simple open source projects that often don't even have a UI. I was previously an engineering manager at Atlassian, where I’d seen a team work on an internal feature flagging system, so I was aware of the complexity of the problem and the investment required to build a product that addressed the needs of larger development teams and enterprises. That’s where we saw an opportunity to start LaunchDarkly.
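
To make the signup-form example concrete, here is a minimal sketch in Go. The `FlagStore` type, its methods, and the `new-signup-form` flag key are hypothetical names invented for illustration; they are not the LaunchDarkly SDK, and a real flag service would receive its settings from a dashboard rather than from application code.

```go
package main

import "fmt"

// FlagStore is a hypothetical in-memory stand-in for a flag service.
// It maps a flag key to the set of user keys that should see the feature.
type FlagStore struct {
	enabledFor map[string]map[string]bool
}

// NewFlagStore returns an empty store in which every flag is off.
func NewFlagStore() *FlagStore {
	return &FlagStore{enabledFor: make(map[string]map[string]bool)}
}

// Enable turns a flag on for one user. In a real system this change
// would arrive from outside the deployed code.
func (s *FlagStore) Enable(flagKey, userKey string) {
	if s.enabledFor[flagKey] == nil {
		s.enabledFor[flagKey] = make(map[string]bool)
	}
	s.enabledFor[flagKey][userKey] = true
}

// IsEnabled reports whether the flag is on for the user. Unknown flags
// and unknown users fall back to off.
func (s *FlagStore) IsEnabled(flagKey, userKey string) bool {
	return s.enabledFor[flagKey][userKey]
}

// SignupForm picks which form to render based on the flag.
func SignupForm(s *FlagStore, userKey string) string {
	if s.IsEnabled("new-signup-form", userKey) {
		return "new signup form"
	}
	return "old signup form"
}

func main() {
	flags := NewFlagStore()
	flags.Enable("new-signup-form", "alice")
	fmt.Println("alice sees:", SignupForm(flags, "alice"))
	fmt.Println("bob sees:", SignupForm(flags, "bob"))
}
```

The point of the pattern is that both code paths ship in the same deployment, and the choice between them is data that can change at runtime.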

We're currently serving over 4 billion feature flag requests per day for companies like Microsoft, Atlassian, Ten-X, and CircleCI. Many of our customers report that we’ve changed the way they do development-- we de-risk new feature launches, eliminate the need for painful long-lived branches, and empower product managers, QA, and others to use feature flags to improve their users’ experience.

General Architecture

You can think of LaunchDarkly as being split up into three pieces: a monolithic web application, a streaming API that serves feature flags, and an analytics processing pipeline that's structured as a set of microservices. We've written almost all of this in Go.

Go has really worked well for us. We love that our services compile from scratch in seconds, and produce small statically linked binaries that can be deployed easily and run in a small footprint. I'd done a lot with Scala at Atlassian, but I'd grown frustrated with the slow compilation times and overhead of the JVM. Our monolith has about a 6MB memory footprint-- try that on the JVM!

I'm generally not a fan of large web frameworks like Django or Rails. Too much "magic" for me. I prefer to build on top of smaller libraries that serve specific needs. To that end, both our monolith and our microservices rely heavily on a home-built framework layer that uses libraries like Gorilla Mux.

Our framework makes it trivial to add a new resource to our REST API and get a ton of essential functionality out of the box-- with a few lines of code, you get authentication, APM with New Relic, metrics pumped to Graphite, CORS support, and more.
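
A framework layer like this is often built from composable handler wrappers. The sketch below is an assumption about the general shape, not our actual framework: it uses only the standard `net/http` package rather than Gorilla Mux, and the names `Middleware`, `Chain`, `RequireToken`, `AllowOrigin`, `NewAPI`, and `StatusFor`, along with the fixed `secret` token, are all hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with one cross-cutting concern.
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares so that the first one listed runs outermost.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// RequireToken rejects requests whose Authorization header does not match.
// A fixed shared token is an illustrative assumption, not a real auth scheme.
func RequireToken(token string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.Header.Get("Authorization") != token {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// AllowOrigin sets a CORS header on every response, including errors
// produced by middlewares that run after it.
func AllowOrigin(origin string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Access-Control-Allow-Origin", origin)
			next.ServeHTTP(w, r)
		})
	}
}

// listFlags is the resource-specific code; everything else is shared.
func listFlags(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, `["new-signup-form"]`)
}

// NewAPI registers one resource with the shared middleware stack.
func NewAPI() http.Handler {
	mux := http.NewServeMux()
	mux.Handle("/flags", Chain(http.HandlerFunc(listFlags),
		AllowOrigin("*"), RequireToken("secret")))
	return mux
}

// StatusFor sends an in-memory GET request and returns the status code.
func StatusFor(h http.Handler, path, token string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	req.Header.Set("Authorization", token)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	api := NewAPI()
	fmt.Println("with token:", StatusFor(api, "/flags", "secret"))
	fmt.Println("without token:", StatusFor(api, "/flags", ""))
}
```

Metrics and APM hooks would slot in as further `Middleware` values in the same chain, which is what keeps adding a new resource down to a few lines.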

The web application monolith has a pretty standard architecture. Some of the technologies we use include:

  • MongoDB -- as our core application data store. It's popular to make fun of Mongo these days, but we've found it to be a great database technology as long as you don't store too many things in it. Anything you can count on your fingers and toes should be fine.
  • ElasticSearch -- handles user search and segmentation.
  • Redis -- caching, of course.
  • HAProxy -- as a load balancer.

LaunchDarkly Architecture

Serving feature flags, fast

One of the cool and novel parts of LaunchDarkly is our streaming architecture, which allows us to serve feature flag changes instantly. Think of it like a real-time, in-memory database containing feature flag settings. The closest comparison would be something like Firebase, except Firebase is really more focused on the client-side web and mobile, whereas we do that and the server-side.

We use several technologies to drive our streaming API. The most important is Pushpin / Fanout. These technologies abstract away the management of long-lived streaming connections and let us focus on building simple REST APIs.

We also use Fastly as a CDN. Fastly is perfect for us-- we can use VCL to write custom caching rules, and can purge content in milliseconds. If you're caching dynamic content (as opposed to, say, cat GIFs), or you find yourself needing to purge content programmatically, or you want the flexibility of Varnish in addition to the global network of POPs a CDN can provide, Fastly is the best choice out there. Their support team is also fantastic.

When assembled together, these technologies allow our customers to change their feature flag settings on our dashboard and have their new rollout settings streamed to thousands of servers in a hundred milliseconds or less.
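
The fan-out idea can be illustrated with a small in-process sketch. This is not how Pushpin or Fanout work internally; it only shows the shape of the data flow, with Go channels standing in for long-lived streaming connections. The `Hub`, `Update`, and `Client` types and the buffer size of 16 are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Update is one flag change pushed to connected clients.
type Update struct {
	Key string
	On  bool
}

// Hub fans each published update out to every subscriber.
type Hub struct {
	mu   sync.Mutex
	subs []chan Update
}

// Subscribe registers a client and returns its update channel.
// The buffer size is arbitrary for this sketch.
func (h *Hub) Subscribe() <-chan Update {
	h.mu.Lock()
	defer h.mu.Unlock()
	ch := make(chan Update, 16)
	h.subs = append(h.subs, ch)
	return ch
}

// Publish offers u to every subscriber and returns how many accepted it.
// A subscriber whose buffer is full is skipped rather than blocking
// the publisher.
func (h *Hub) Publish(u Update) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.subs {
		select {
		case ch <- u:
			delivered++
		default:
		}
	}
	return delivered
}

// Client keeps a local copy of flag state, so evaluating a flag is a
// map lookup rather than a network call.
type Client struct {
	flags map[string]bool
}

// Apply merges one streamed update into the local copy.
func (c *Client) Apply(u Update) {
	c.flags[u.Key] = u.On
}

func main() {
	hub := &Hub{}
	stream := hub.Subscribe()
	client := &Client{flags: map[string]bool{}}

	hub.Publish(Update{Key: "new-signup-form", On: true})
	client.Apply(<-stream)
	fmt.Println("new-signup-form on:", client.flags["new-signup-form"])
}
```

In this picture the dashboard is the publisher and each customer server is a `Client` holding its own in-memory copy of the flag settings.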

Analytics at scale

The other huge component of LaunchDarkly is our analytics processing pipeline. Our customers request over 4 billion feature flags per day, and we use analytics data from these requests to power a lot of the features in our product. A/B testing is an obvious example, but we also do things like determine when a feature flag has stopped being requested, so that you can manage technical debt and clean up old flags.
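
Detecting that a flag has stopped being requested comes down to tracking the most recent request per flag. The following is a hedged sketch of that idea only; `UsageTracker` and its methods are hypothetical, and timestamps are plain Unix seconds so the example does not depend on a clock.

```go
package main

import (
	"fmt"
	"sort"
)

// UsageTracker remembers when each flag was last requested.
type UsageTracker struct {
	lastSeen map[string]int64
}

// NewUsageTracker returns a tracker with no recorded flags.
func NewUsageTracker() *UsageTracker {
	return &UsageTracker{lastSeen: make(map[string]int64)}
}

// Record notes a request for flagKey at Unix time ts, keeping only the
// latest timestamp seen for that flag.
func (u *UsageTracker) Record(flagKey string, ts int64) {
	prev, ok := u.lastSeen[flagKey]
	if !ok || ts > prev {
		u.lastSeen[flagKey] = ts
	}
}

// Stale returns, in sorted order, the flags whose last request is more
// than maxAge seconds before now.
func (u *UsageTracker) Stale(now, maxAge int64) []string {
	var stale []string
	for key, ts := range u.lastSeen {
		if now-ts > maxAge {
			stale = append(stale, key)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	tracker := NewUsageTracker()
	tracker.Record("old-checkout", 1000)
	tracker.Record("new-signup-form", 9000)

	// With now=10000 and a one-hour window, only old-checkout is stale.
	fmt.Println("candidates for cleanup:", tracker.Stale(10000, 3600))
}
```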

Our current pipeline involves an HTTP microservice that writes analytics data to DynamoDB. If we need to do any further processing (say, for A/B testing), then we enqueue another job into SQS. Another microservice reads jobs off of the SQS queue and processes them. Right now, we're actively evolving this pipeline. We've found that when we're under heavy load, we need to buffer calls to DynamoDB while we expand capacity instead of trying to process them immediately. Kafka is perfect for this-- so we're splitting that HTTP microservice into a smaller HTTP service that simply queues events to Kafka, and another service that processes Kafka queues.
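
The buffering idea is roughly this: accept events cheaply, then write them to the store in batches, leaving them queued when the store pushes back. The sketch below uses an in-memory slice as a stand-in for Kafka and a plain function as a stand-in for DynamoDB; `Queue`, `Event`, `Drain`, and `ErrThrottled` are hypothetical names, and no Kafka or AWS client is involved.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrThrottled simulates the store refusing writes under heavy load.
var ErrThrottled = errors.New("store is throttled")

// Event is one analytics record for a flag request.
type Event struct {
	FlagKey string
	UserKey string
}

// Queue is an in-memory stand-in for a durable log.
type Queue struct {
	events []Event
}

// Enqueue accepts an event without touching the store.
func (q *Queue) Enqueue(e Event) {
	q.events = append(q.events, e)
}

// Len reports how many events are waiting.
func (q *Queue) Len() int {
	return len(q.events)
}

// Drain passes up to max events to write as one batch. If write fails,
// the events stay queued so a later call can retry them.
func (q *Queue) Drain(max int, write func([]Event) error) (int, error) {
	n := max
	if n > len(q.events) {
		n = len(q.events)
	}
	if n <= 0 {
		return 0, nil
	}
	if err := write(q.events[:n]); err != nil {
		return 0, err
	}
	q.events = q.events[n:]
	return n, nil
}

func main() {
	q := &Queue{}
	for i := 0; i < 5; i++ {
		q.Enqueue(Event{FlagKey: "new-signup-form", UserKey: fmt.Sprintf("user-%d", i)})
	}

	throttled := true
	stored := 0
	write := func(batch []Event) error {
		if throttled {
			return ErrThrottled
		}
		stored += len(batch)
		return nil
	}

	_, err := q.Drain(3, write)
	fmt.Println("first attempt:", err, "| still queued:", q.Len())

	throttled = false // capacity has been expanded
	for q.Len() > 0 {
		if _, err := q.Drain(3, write); err != nil {
			break
		}
	}
	fmt.Println("stored:", stored, "| still queued:", q.Len())
}
```

Splitting the HTTP service in two follows the same line: the front half only enqueues, and the back half owns the retry and batching logic.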

We actually use LaunchDarkly to control this evolution. We have a feature flag that controls whether a request goes through our old analytics pipeline, or the new Kafka-based pipeline we're rolling out. Once the new pipeline is enabled for all customers, we can clean up the code and switch over completely to the Kafka pipeline. This is a use case that surprises a lot of customers-- they think of feature flags in terms of controlling user-visible features (release toggles), but they are extremely valuable for other use cases like ops toggles, experiments, and permission management.
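
An ops toggle like this usually needs a gradual rollout, where a stable share of customers goes through the new path. One common way to do that, sketched here as an assumption rather than a description of LaunchDarkly's actual algorithm, is to hash the flag key and customer key into a bucket from 0 to 99 and compare it to a rollout percentage. The flag key `kafka-analytics-pipeline` and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pipelineFlag is a hypothetical flag key for the migration.
const pipelineFlag = "kafka-analytics-pipeline"

// bucket maps a flag and customer to a stable number in [0, 100).
func bucket(flagKey, customerKey string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(flagKey + "/" + customerKey))
	return h.Sum32() % 100
}

// UseNewPipeline reports whether the customer falls inside the rollout.
// Raising rolloutPercent only ever adds customers; none are removed.
func UseNewPipeline(customerKey string, rolloutPercent uint32) bool {
	return bucket(pipelineFlag, customerKey) < rolloutPercent
}

// Route names the pipeline a customer's analytics request should take.
func Route(customerKey string, rolloutPercent uint32) string {
	if UseNewPipeline(customerKey, rolloutPercent) {
		return "kafka pipeline"
	}
	return "old pipeline"
}

func main() {
	for _, pct := range []uint32{0, 50, 100} {
		fmt.Printf("rollout %3d%%: acme -> %s\n", pct, Route("acme", pct))
	}
}
```

Because the bucket depends only on the keys, a customer stays on the same side of the toggle between requests, and setting the percentage to 100 is the signal that the old code path can be deleted.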

As we scaled this service out to handle tens of thousands of requests per second, we learned an important lesson about microservice construction. When we first built many of these services, we thought in terms of building a separate service per concern. For example, we’d build a service that would read in analytics events and serve the autocomplete functionality on the site. The web application would make a sub-request to this service when it had an autocomplete request from the site.

We quickly learned that the need for fault tolerance and isolation trumps the conceptual neatness of having a service per concern. With fault tolerance in mind, we sliced our services along a different axis-- separating high-throughput analytics writes from the lower-volume read requests coming from the site. This shift dramatically improved the performance of our site, as well as our ability to evolve and scale the huge write load we see on the analytics side.


As you might have inferred, we use AWS as our hosting provider. We’re fairly conservative when it comes to adopting new technologies-- deployment for us consists of a set of Ansible scripts that spin up EC2 boxes for our various services. We don’t yet use ECS or Docker containers-- which by extension means we don’t use anything for container orchestration. A long while back, we spiked a migration to Mesosphere but we ran into enough issues that we didn’t proceed forward. We do think that these technologies are the future, but that future is not now, at least for us.

So maturity is one issue that prevents us from adopting some of the latest whiz-bang ops technology. There are other technologies that we find interesting, like Amazon’s API Gateway, but the pricing models just don’t work for us-- at tens of thousands of requests per second, they’re non-starters.

Other services

For customer communications and support, we use Intercom, Slack, and GrooveHQ. We also recently started using elevio, and we've found it's a great way to turn Intercom questions into trackable support tickets.

We use for our product and developer API documentation, GitHub holds all our code hostage, and CircleCI helps us integrate continuously.

What’s next?

We’re constantly evolving our service to improve efficiency and scale. Besides the Kafka switchover, we’re looking at using Cassandra for some of the work that DynamoDB is doing right now. We also are keenly interested in Disque as a queuing solution, especially because we’ve had so much positive experience with Redis.

More aspirationally, we might try spiking some of our new services in Rust. I’m a functional programmer at heart, and while I am appreciative of the speed and tooling around Go, it would be nice to regain some of the expressiveness and elegance of a functional language while retaining what we like about Go (the fast compilation times, ease of deployment). If we do try it out, we’ll do so in a cautious manner, and isolate the trial to a new microservice somewhere.
