How LaunchDarkly Serves Over 4 Billion Feature Flags Daily

LaunchDarkly
Serving over 200 billion feature flags daily to help software teams build better software, faster. LaunchDarkly helps eliminate risk for developers and operations teams from the software development cycle.

Editor's note: By John Kodumal, CTO, LaunchDarkly



LaunchDarkly Platform


Background

Feature flagging (wrapping a feature in a flag that’s controlled outside of deployment) is a technique for effective continuous delivery. For example, you can wrap a new signup form in a feature flag and then control which users see that form, all without having to redeploy code or modify a database. Engineering-driven companies (think Google, Facebook, Twitter) invest heavily in custom-built feature flag management systems to roll features out to whom they want, when they want. Smaller companies build and maintain their own feature flagging infrastructure or use simple open source projects that often don't even have a UI. I was previously an engineering manager at Atlassian, where I’d seen a team work on an internal feature flagging system, so I was aware of the complexity of the problem and the investment required to build a product that addressed the needs of larger development teams and enterprises. That’s where we saw an opportunity to start LaunchDarkly.
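
To make the signup form example concrete, here is a minimal sketch in Go. It is not the LaunchDarkly SDK: FlagStore, Enable, IsEnabled, and the "new-signup-form" key are hypothetical names, and the store is a plain in-memory map standing in for a real flag service.

```go
package main

import "fmt"

// FlagStore is a hypothetical in-memory stand-in for a flag service.
// It maps a flag key to the set of user keys that should see the feature.
type FlagStore struct {
	enabledFor map[string]map[string]bool
}

// NewFlagStore returns an empty store in which every flag is off.
func NewFlagStore() *FlagStore {
	return &FlagStore{enabledFor: make(map[string]map[string]bool)}
}

// Enable turns a flag on for one user without touching application code.
func (s *FlagStore) Enable(flagKey, userKey string) {
	if s.enabledFor[flagKey] == nil {
		s.enabledFor[flagKey] = make(map[string]bool)
	}
	s.enabledFor[flagKey][userKey] = true
}

// IsEnabled reports whether the flag is on for the user.
// Unknown flags and unknown users default to off.
func (s *FlagStore) IsEnabled(flagKey, userKey string) bool {
	return s.enabledFor[flagKey][userKey]
}

// SignupForm picks which form to render based on the flag.
func SignupForm(s *FlagStore, userKey string) string {
	if s.IsEnabled("new-signup-form", userKey) {
		return "new signup form"
	}
	return "old signup form"
}

func main() {
	flags := NewFlagStore()
	flags.Enable("new-signup-form", "alice")
	fmt.Println(SignupForm(flags, "alice")) // new signup form
	fmt.Println(SignupForm(flags, "bob"))   // old signup form
}
```

The point of the pattern is that the branch in SignupForm ships once, and who sees which side of it is decided by data that changes independently of deployment.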


We're currently serving over 4 billion feature flag requests per day for companies like Microsoft, Atlassian, Ten-X, and CircleCI. Many of our customers report that we’ve changed the way they do development-- we de-risk new feature launches, eliminate the need for painful long-lived branches, and empower product managers, QA, and others to use feature flags to improve their users’ experience.

General Architecture

You can think of LaunchDarkly as being split up into three pieces: a monolithic web application, a streaming API that serves feature flags, and an analytics processing pipeline that's structured as a set of microservices. We've written almost all of this in Go.

Go has really worked well for us. We love that our services compile from scratch in seconds, and produce small statically linked binaries that can be deployed easily and run in a small footprint. I'd done a lot with Scala at Atlassian, but I'd grown frustrated with the slow compilation times and overhead of the JVM. Our monolith has about a 6MB memory footprint— try that on the JVM!

I'm generally not a fan of large web frameworks like Django or Rails. Too much "magic" for me. I prefer to build on top of smaller libraries that serve specific needs. To that end, both our monolith and our microservices rely heavily on a home-built framework layer that uses libraries like Gorilla Mux.

Our framework makes it trivial to add a new resource to our REST API and get a ton of essential functionality out of the box-- with a few lines of code, you get authentication, APM with New Relic, metrics pumped to Graphite, CORS support, and more.

The web application monolith has a pretty standard architecture. Some of the technologies we use include:

  • MongoDB -- as our core application data store. It's popular to make fun of Mongo these days, but we've found it to be a great database technology as long as you don't store too many things in it. Anything you can count on your fingers and toes should be fine.
  • ElasticSearch -- handles user search and segmentation.
  • Redis -- caching, of course.
  • HAProxy -- as a load balancer.


LaunchDarkly Architecture


Serving feature flags, fast

One of the cool and novel parts of LaunchDarkly is our streaming architecture, which allows us to serve feature flag changes instantly. Think of it like a real-time, in-memory database containing feature flag settings. The closest comparison would be something like Firebase, except Firebase is really more focused on client-side web and mobile, whereas we cover both the client side and the server side.

We use several technologies to drive our streaming API. The most important is Pushpin / Fanout. These technologies free us from managing long-lived streaming connections ourselves and let us focus on building simple REST APIs.

We also use Fastly as a CDN. Fastly is perfect for us-- we can use VCL to write custom caching rules, and can purge content in milliseconds. If you're caching dynamic content (as opposed to, say, cat GIFs), or you find yourself needing to purge content programmatically, or you want the flexibility of Varnish in addition to the global network of POPs a CDN can provide, Fastly is the best choice out there. Their support team is also fantastic.

When assembled together, these technologies allow our customers to change their feature flag settings on our dashboard and have their new rollout settings streamed to thousands of servers in a hundred milliseconds or less.
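
From the point of view of one customer server, the "real-time, in-memory database" idea looks roughly like the sketch below. It makes no claim about the actual wire protocol or SDK: a Go channel stands in for the streaming connection, and FlagChange and FlagCache are hypothetical types.

```go
package main

import (
	"fmt"
	"sync"
)

// FlagChange is a hypothetical update message pushed over the stream.
type FlagChange struct {
	Key string
	On  bool
}

// FlagCache is the in-memory copy of flag settings held by one server.
type FlagCache struct {
	mu    sync.RWMutex
	flags map[string]bool
}

// NewFlagCache returns an empty cache in which every flag reads as off.
func NewFlagCache() *FlagCache {
	return &FlagCache{flags: make(map[string]bool)}
}

// Apply records one change under the write lock.
func (c *FlagCache) Apply(ch FlagChange) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.flags[ch.Key] = ch.On
}

// On reads from memory, so evaluating a flag never waits on the network.
func (c *FlagCache) On(key string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.flags[key]
}

// Consume applies changes in arrival order until the stream closes.
func (c *FlagCache) Consume(stream <-chan FlagChange) {
	for ch := range stream {
		c.Apply(ch)
	}
}

func main() {
	cache := NewFlagCache()
	stream := make(chan FlagChange) // stands in for the streaming connection
	done := make(chan struct{})
	go func() {
		cache.Consume(stream)
		close(done)
	}()
	stream <- FlagChange{Key: "new-signup-form", On: true}
	close(stream)
	<-done
	fmt.Println(cache.On("new-signup-form")) // true
}
```

The design choice worth noting is that reads and updates are decoupled: flag checks hit local memory, while changes arrive asynchronously and are applied as they come in.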

Analytics at scale

The other huge component of LaunchDarkly is our analytics processing pipeline. Our customers request over 4 billion feature flags per day, and we use analytics data from these requests to power a lot of the features in our product. A/B testing is an obvious example, but we also do things like determine when a feature flag has stopped being requested, so that you can manage technical debt and clean up old flags.

Our current pipeline involves an HTTP microservice that writes analytics data to DynamoDB. If we need to do any further processing (say, for A/B testing), then we enqueue another job into SQS. Another microservice reads jobs off of the SQS queue and processes them. Right now, we're actively evolving this pipeline. We've found that when we're under heavy load, we need to buffer calls to DynamoDB while we expand capacity instead of trying to process them immediately. Kafka is perfect for this-- so we're splitting that HTTP microservice into a smaller HTTP service that simply queues events to Kafka, and another service that processes Kafka queues.
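
The split described above (a thin service that only enqueues, and a separate consumer that writes at its own pace) can be sketched as follows. A bounded Go channel stands in for Kafka purely for illustration; unlike Kafka it is neither durable nor distributed, and Event, Ingest, and Drain are hypothetical names.

```go
package main

import "fmt"

// Event is a hypothetical analytics record.
type Event struct {
	FlagKey string
	UserKey string
}

// Ingest is the thin front end: it only enqueues and never writes to the
// data store. It reports false when the buffer is full so that the caller
// can shed load or retry.
func Ingest(queue chan<- Event, e Event) bool {
	select {
	case queue <- e:
		return true
	default:
		return false
	}
}

// Drain is the separate consumer: it collects up to batchSize events that
// are already waiting, hands them to write, and returns how many it took.
func Drain(queue <-chan Event, batchSize int, write func([]Event)) int {
	batch := make([]Event, 0, batchSize)
collect:
	for len(batch) < batchSize {
		select {
		case e := <-queue:
			batch = append(batch, e)
		default:
			break collect // nothing more is waiting
		}
	}
	if len(batch) > 0 {
		write(batch)
	}
	return len(batch)
}

func main() {
	queue := make(chan Event, 3) // small buffer to make overflow visible
	accepted := 0
	for i := 0; i < 4; i++ {
		e := Event{FlagKey: "new-signup-form", UserKey: fmt.Sprint("user-", i)}
		if Ingest(queue, e) {
			accepted++
		}
	}
	fmt.Println("accepted:", accepted) // accepted: 3

	write := func(batch []Event) { fmt.Println("wrote batch of", len(batch)) }
	for Drain(queue, 2, write) > 0 {
		// keep draining until the queue is empty
	}
}
```

The write function is where the data store call would go; because it sees batches pulled at the consumer's own rate, a burst of incoming requests fills the buffer instead of hammering the store.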

We actually use LaunchDarkly to control this evolution. We have a feature flag that controls whether a request goes through our old analytics pipeline, or the new Kafka-based pipeline we're rolling out. Once the new pipeline is enabled for all customers, we can clean up the code and switch over completely to the Kafka pipeline. This is a use case that surprises a lot of customers-- they think of feature flags in terms of controlling user-visible features (release toggles), but they are extremely valuable for other use cases like ops toggles, experiments, and permission management.
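
An ops toggle of this kind might look like the sketch below. The percentage-based rollout and the hashing scheme are assumptions chosen for illustration (the text above only says a flag selects the pipeline), and bucket, UseNewPipeline, and Route are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a customer key to a stable number in [0, 100).
func bucket(customerKey string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(customerKey))
	return h.Sum32() % 100
}

// UseNewPipeline reports whether the customer falls inside the rollout.
// The same customer always lands in the same bucket, so raising the
// percentage only ever adds customers to the new pipeline.
func UseNewPipeline(customerKey string, rolloutPercent uint32) bool {
	return bucket(customerKey) < rolloutPercent
}

// Route names the pipeline that should handle this customer's events.
func Route(customerKey string, rolloutPercent uint32) string {
	if UseNewPipeline(customerKey, rolloutPercent) {
		return "kafka"
	}
	return "legacy"
}

func main() {
	fmt.Println(Route("acme", 0))   // legacy: nobody is in a 0% rollout
	fmt.Println(Route("acme", 100)) // kafka: everybody is in a 100% rollout
	fmt.Println(Route("acme", 50))  // depends on the customer's bucket
}
```

Once the rollout reaches 100 percent and stays there, the legacy branch in Route is dead code and can be deleted along with the flag.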

As we scaled this service out to handle tens of thousands of requests per second, we learned an important lesson about microservice construction. When we first built many of these services, we thought in terms of building a separate service per concern. For example, we’d build a service that would read in analytics events and serve the autocomplete functionality on the site. The web application would make a sub-request to this service when it had an autocomplete request from the site.

We quickly learned that the need for fault tolerance and isolation trumps the conceptual neatness of having a service per concern. With fault tolerance in mind, we sliced our services along a different axis-- separating high-throughput analytics writes from the lower-volume read requests coming from the site. This shift dramatically improved the performance of our site, as well as our ability to evolve and scale the huge write load we see on the analytics side.

Infrastructure

As you might have inferred, we use AWS as our hosting provider. We’re fairly conservative when it comes to adopting new technologies-- deployment for us consists of a set of Ansible scripts that spin up EC2 boxes for our various services. We don’t yet use ECS or Docker containers-- which by extension means we don’t use anything for container orchestration. A long while back, we spiked a migration to Mesosphere, but we ran into enough issues that we didn’t proceed. We do think that these technologies are the future, but that future is not now, at least for us.

So maturity is one issue that prevents us from adopting some of the latest whiz-bang ops technology. Pricing is another: there are other technologies that we find interesting, like Amazon’s API Gateway, but the pricing models just don’t work for us-- at tens of thousands of requests per second, they’re non-starters.

Other services

For customer communications and support, we use Intercom, Slack, and GrooveHQ. We also recently started using elevio, and we've found it's a great way to turn Intercom questions into trackable support tickets.

We use ReadMe.io for our product and developer API documentation, GitHub holds all our code hostage, and CircleCI helps us integrate continuously.

What’s next?

We’re constantly evolving our service to improve efficiency and scale. Besides the Kafka switchover, we’re looking at using Cassandra for some of the work that DynamoDB is doing right now. We also are keenly interested in Disque as a queuing solution, especially because we’ve had so much positive experience with Redis.

More aspirationally, we might try spiking some of our new services in Rust. I’m a functional programmer at heart, and while I am appreciative of the speed and tooling around Go, it would be nice to regain some of the expressiveness and elegance of a functional language while retaining what we like about Go (the fast compilation times, ease of deployment). If we do try it out, we’ll do so in a cautious manner, and isolate the trial to a new microservice somewhere.
