Prevent Technical Debt by Understanding Feature Flag and Experiment Lifetimes

By Asa Schachar

As your organization uses more feature flags and experiments, it’s paramount to understand that some of these changes are meant to last for a short time and should be removed from your codebase. Without a plan for removal, they will become outdated and add technical debt and complexity.

One measure you can track is how long a feature flag or experiment has been in your system and how many different states (i.e., on/off, different configurations, different experimental versions) it has been in. While you’re iterating on your feature, it may go through many different states before reaching the final, optimized version, at which point it is ready for removal. If the feature flag has been in your system for a long time and all of your users have the same state of the feature flag, then the flag should likely be removed and the feature code merged into your codebase.
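The heuristic above can be sketched in a few lines of Python. This is a minimal illustration, not any flag platform's API: the `FlagSnapshot` record, its field names, and the 90-day threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FlagSnapshot:
    """Hypothetical record of a flag; not tied to any real flag service."""
    key: str
    created_on: date
    states_seen: int          # distinct states so far (tracked for reporting only)
    current_user_states: set  # states currently served across all users


def is_removal_candidate(flag: FlagSnapshot, today: date,
                         max_age: timedelta = timedelta(days=90)) -> bool:
    """An old flag whose users all share one state is likely ready for removal."""
    is_old = today - flag.created_on >= max_age  # 90 days is an assumed threshold
    is_uniform = len(flag.current_user_states) == 1
    return is_old and is_uniform


today = date(2024, 6, 1)
new_checkout = FlagSnapshot("new_checkout", date(2024, 1, 10), 5, {"on"})
search_test = FlagSnapshot("search_ranking_test", date(2024, 1, 10), 3,
                           {"control", "variant_b"})

print(is_removal_candidate(new_checkout, today))  # True
print(is_removal_candidate(search_test, today))   # False
```

A result of `True` only marks the flag as a candidate; the flag’s purpose still decides whether it should actually go.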

However, it’s smart to evaluate the purpose of a flag or experiment before removing it, since the real lifetime of an experiment or feature flag depends on its use case. More on that below.

In my experience as an engineer and an engineering manager, I’ve seen several different types of flags and experiments. Below, I’ll walk through the examples from my free e-book Ship Confidently with Progressive Delivery and Experimentation and highlight which feature flags should be removed and which ones will likely stay in your codebase based on their use case. Let’s dive in!

01 Temporary Flag and Experiment Types to Set a Removal Timeline

If a feature is designed to be rolled out to everyone or you are using the feature for a short-term experiment, then you’ll want to ensure you have a ticket tracking its removal as soon as the feature has been fully launched to your users or the experiment has concluded. These temporary flags may last weeks, months, or quarters. Examples include:

🚪 Painted-door experiments: These experiments show the smallest amount of UI for a feature to determine whether there is customer interest, and they are usually used only in the early phases of the software development lifecycle. They aren’t intended to be kept in the product after they have validated or invalidated the experiment hypothesis.
🛩 Performance experiments: These experiments are intended to put two different implementations against each other in a live, real-world performance experiment. Once enough data has been gathered to determine the more performant solution, it’s usually best to move all traffic over to the higher performing variation.
🏗 Large-scale refactors: When moving between frameworks, languages, or implementation details, it’s useful to deploy these rather risky changes behind a feature flag so that you have confidence they will not negatively impact your users or business. However, once the refactor is done, you hopefully won’t go back in the other direction and can remove the feature flag.
🎨 Product re-brands: If your business decides to change the look and feel of your product for brand purposes, it’s useful to have a rollout to gracefully move to the new branding and measure the customer engagement. After the new branding is established, it’s a good idea to remove the feature flag powering the switch.
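One way to make sure every temporary flag has a ticket tracking its removal is to check for it when flags are defined. The sketch below assumes a hypothetical `FlagDefinition` record with a `kind` label and a `removal_ticket` field; the ticket IDs and flag keys are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlagDefinition:
    """Hypothetical flag metadata; field names are assumptions."""
    key: str
    kind: str                             # "temporary" or "permanent"
    removal_ticket: Optional[str] = None  # e.g. an issue-tracker ID


def missing_removal_tickets(flags):
    """Return the keys of temporary flags that have no removal ticket."""
    return [f.key for f in flags
            if f.kind == "temporary" and not f.removal_ticket]


flags = [
    FlagDefinition("painted_door_wishlist", "temporary", "ENG-101"),
    FlagDefinition("orm_refactor", "temporary"),
    FlagDefinition("enterprise_sso", "permanent"),
]

print(missing_removal_tickets(flags))  # ['orm_refactor']
```

A check like this could run in code review or CI so that a temporary flag never ships without a removal plan.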

02 Permanent Flag and Experiment Types to Review and Document

If a feature is designed to have different states for different customers, or you want to control its configuration for operational processes or continuous experimentation and optimization purposes, then it’s likely the flag will stay in your product for a longer period of time. Examples of these flags and experiments include:

💰 Promotional flags: These flags are useful if your product or company wants to offer a special promotion at a regular cadence. For example, you may run a promotional campaign every winter that provides free shipping of your products to your customers. By implementing a feature flag that can turn on and configure these promotions, your team can alleviate the need for engineering resources to turn these regular promotions on or off at regular intervals.
🔒 Permission flags: These flags are useful if you have different permission levels in your product like read-only that don’t allow edit access to the feature. They are also useful if you have modular pricing like an inexpensive “starter plan” that doesn’t have the feature, but a more costly “enterprise plan” that does have the feature.
📦 Operational flags: These flags control the operational knobs of your application. For example, these flags can control whether you batch events sent from your application to minimize the number of outbound requests. You could also use them to control the number of machines that are used for horizontally scaling your application. In addition, they can be used to disable a computationally expensive, non-essential service or to allow for a graceful switchover from one third-party service to another during an outage.
📝 Configuration-based software: For any software or product that is powered by a config file, this is a great place to seamlessly insert experimentation that has a low cost to maintain and still allows endless product experimentation. For example, some companies may have their product layout powered by a config file that describes in abstract terms whether different modules are included and how they are positioned on the screen. With this architectural setup, even if you aren’t running an experiment right now, you can still enable future product experimentation.
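To make the configuration-based idea concrete, here is a minimal sketch of a layout config that doubles as an experiment definition. The config shape, the module names, and the hash-based bucketing are all assumptions for illustration; a real experimentation platform would handle user assignment for you.

```python
import hashlib

# Hypothetical layout config: each variation lists modules in display order.
LAYOUT_CONFIG = {
    "experiment": "homepage_layout",
    "variations": {
        "control": ["hero", "featured_products", "reviews"],
        "reviews_first": ["hero", "reviews", "featured_products"],
    },
}


def pick_variation(user_id: str, config: dict) -> str:
    """Deterministically bucket a user by hashing experiment name + user ID."""
    names = sorted(config["variations"])
    seed = f"{config['experiment']}:{user_id}".encode()
    digest = hashlib.sha256(seed).hexdigest()
    return names[int(digest, 16) % len(names)]


def layout_for(user_id: str, config: dict = LAYOUT_CONFIG) -> list:
    """Return the ordered module list for this user's variation."""
    return config["variations"][pick_variation(user_id, config)]


print(layout_for("user-42"))
```

Because the layout already comes from config, adding or retiring a variation is a config change rather than a code change, which keeps the long-term maintenance cost low.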

Note that even if a flag is meant to be permanent, it’s still paramount to regularly review these flags and their surrounding code in case they are obsolete or should be deprecated. Otherwise, keeping these permanent flags may add technical debt to your codebase.

Some organizations integrate their task tracking system with their feature flag and experiment service to manage this cleanup process. If the state of feature flags and experiments can be synced with a ticket tracking tool, then an engineering manager can query for all feature flags and experiments whose state has not changed in the past 30 days and ask their owners to review them. This strategy is especially useful at an organization with a centralized team helping review feature flag best practices. At Optimizely, we use the Optimizely JIRA integration to help ensure we aren’t accruing technical debt.
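The 30-day query described above might look like the following sketch. It assumes flag state has already been exported as a list of dictionaries with hypothetical `key`, `owner`, and `last_changed` fields; it does not call the Optimizely or Jira APIs.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical export of flag state synced from a flag service.
flags = [
    {"key": "new_checkout", "owner": "dana", "last_changed": date(2024, 3, 1)},
    {"key": "winter_promo", "owner": "lee", "last_changed": date(2024, 5, 25)},
    {"key": "orm_refactor", "owner": "dana", "last_changed": date(2024, 4, 15)},
]


def stale_flags_by_owner(flags, today, max_idle=timedelta(days=30)):
    """Group flags whose state has not changed within max_idle by owner."""
    stale = defaultdict(list)
    for flag in flags:
        if today - flag["last_changed"] > max_idle:
            stale[flag["owner"]].append(flag["key"])
    return dict(stale)


print(stale_flags_by_owner(flags, today=date(2024, 6, 1)))
# {'dana': ['new_checkout', 'orm_refactor']}
```

Grouping by owner gives the reviewer a ready-made list of people to follow up with.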

Other organizations have a recurring feature flag and experiment removal day in which engineers review the oldest items in the list at a regular cadence. This strategy is especially useful as a lightweight way to get different teams who own their own feature flags to think about regular removal.

Let me know what you think!

How do you keep track of feature flags and whether to remove them? Message me in our Slack community or find me on Twitter at @asametrical.

This is part of a series of best practices to help your company successfully implement progressive delivery and experimentation to ship faster with confidence.

If you like this content, check out my free e-book: Ship Confidently with Progressive Delivery and Experimentation which offers more best practices from just getting started to scaling these technologies organization-wide.

And if you are looking for a platform to get started, check out Optimizely’s free offering.