Feature flags
Overview
Our Rails applications use configuration values to support per-environment configuration. This allows us to create new configuration values which control whether a specific application feature is enabled in an environment, so that we can iterate on a large feature over time while rolling it out. These are referred to as “feature flags”, and are nothing more than boolean configuration values which are used in code to gate user and logic pathways.
Lifecycle of a feature flag
Adding a new feature flag
Most code changes don’t require a feature flag, since the code that’s being merged is self-contained and should be expected to go live immediately for all users in all environments.
Feature flags are best suited for large features which are planned to be implemented across multiple pull requests over a longer period of time, usually anywhere from a few weeks to several months of development. They can also be used as a faster alternative to incremental deploys when managing 50/50 state changes.
Consider the following when adding a new feature flag:
- Timing:
- A feature flag should be implemented as one of the very first code changes of a large feature development. It is not uncommon for the first pull request of a new large feature development to contain only code changes for adding the feature flag.
- Naming:
- As with any configuration value, the name of a feature flag should be clear, and it should be obvious to someone outside the team implementing the feature which user flows it impacts. A verbose-but-clear feature flag name is better than a succinct-but-ambiguous name.
- Feature flag names are typically suffixed with _enabled, indicating that the flag is a boolean controlling whether something is enabled or not.
- Default Value:
- The simplest option is to enable a new feature flag by default, and disable it in the production section of the config. The production section is used as the default in all deployed environments. The feature can then be enabled in specific deployed environments using app-s3-secret.
# config/application.yml.default
new_feature_enabled: true

production:
  new_feature_enabled: false
- Pros:
- This helps ensure broad test coverage for your new feature in continuous integration (CI) builds.
- It is easier to update existing tests which fail due to changed expectations of how a user proceeds through the application.
- Cons:
- Enabling by default introduces a risk that continuous integration will not run tests against what users experience in the production environment. This can be offset by ensuring branch test coverage for the disabled state in the affected code's specs (see Test coverage).
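The default-plus-override behavior described above can be sketched in plain Ruby. This is a minimal illustration, not the application's actual configuration loader: resolve_flag and EXAMPLE_CONFIG are hypothetical names, and the sketch assumes that top-level keys act as defaults which a section named after the environment can override.

```ruby
require 'yaml'

# Hypothetical example config mirroring the snippet above.
EXAMPLE_CONFIG = <<~YAML
  new_feature_enabled: true

  production:
    new_feature_enabled: false
YAML

# Hypothetical helper (an assumption for illustration only): returns the value
# from the environment's section if present, otherwise the top-level default.
def resolve_flag(yaml_text, key, env)
  config = YAML.safe_load(yaml_text)
  env_overrides = config.fetch(env, {}) || {}
  # Hash#fetch with a block keeps an explicit `false` override intact,
  # which `env_overrides[key] || config[key]` would not.
  env_overrides.fetch(key) { config.fetch(key) }
end

puts resolve_flag(EXAMPLE_CONFIG, 'new_feature_enabled', 'test')
puts resolve_flag(EXAMPLE_CONFIG, 'new_feature_enabled', 'production')
```

Under these assumptions the flag resolves to enabled in the test environment and disabled in production, which is why CI exercises the new feature while deployed environments do not until they opt in.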
Rolling out a feature flag
A new feature flag is expected to be disabled by default in the production section of the config. You can then enable it in specific deployed environments using app-s3-secret.
Refer to Environment Descriptions to understand the purpose of each environment, and when it might make sense to enable a feature in that environment.
The recommended process is:
- Enable the feature flag in local development and testing, as described in "Adding a new feature flag"
- Enable the feature flag in dev
- This environment is primarily used by internal team members for testing, which limits the impact and risk of less stable features.
- Enable the feature flag in int and/or staging
- The int environment has a wider usage base, including partners testing and developing their integrations, and those accessing the partner portal. This has tradeoffs in that it increases potential impact, but also offers a better test prior to deploying a feature to prod.
- staging is similar to dev in that it is very low traffic, but it can be useful when the feature involves an integration with a third-party vendor. The staging environment is configured to integrate with live endpoints and is therefore closer to what a production user will experience.
- Enable the feature flag in prod
- Before enabling a feature flag in the prod environment, make sure that the feature is enabled in continuous integration (CI) tests, to increase confidence that there are no bugs in the code.
Removing a feature flag
Most feature flags should be added with the expectation that they’ll eventually be removed after the feature is live and stable in production.
The only reason to keep a feature flag after it has been enabled in production is to retain the ability to quickly disable the feature if a problem is discovered. While this can lead to the temptation of keeping a feature flag forever so that it can be disabled if needed, this is almost always a bad idea:
- It increases maintenance burden and technical debt, because we need to support both the enabled and disabled versions of the code.
- Because of code drift and future assumptions that a feature is enabled in production, the disabled version of the code may behave unpredictably, or not work at all!
Consider removing a feature flag within a month of the feature being enabled in production.
Developing with feature flags
Refer to “Secrets and Configuration” for more information about adding a new configuration value.
Using a feature flag in code
A feature flag is usually used to create a branching path in your code, with if and else handling the enabled and disabled states respectively.
def create
  if IdentityConfig.store.example_feature_enabled
    # Handle what happens when the feature is enabled
  else
    # Handle what happens when the feature is disabled
  end
end
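For experimenting with this pattern outside the application, the following is a self-contained sketch. FakeConfigStore and ExampleHandler are hypothetical stand-ins (they are not part of the application); the sketch assumes only that the config store responds to a boolean example_feature_enabled method, as IdentityConfig.store does in the snippet above.

```ruby
# Hypothetical stand-in for IdentityConfig.store so the sketch runs on its own.
FakeConfigStore = Struct.new(:example_feature_enabled, keyword_init: true)

# Hypothetical controller-like class. The config store is passed in so that
# both flag states can be exercised without touching real configuration.
class ExampleHandler
  def initialize(config_store)
    @config_store = config_store
  end

  def create
    if @config_store.example_feature_enabled
      :new_flow # what happens when the feature is enabled
    else
      :legacy_flow # what happens when the feature is disabled
    end
  end
end

enabled_handler = ExampleHandler.new(FakeConfigStore.new(example_feature_enabled: true))
disabled_handler = ExampleHandler.new(FakeConfigStore.new(example_feature_enabled: false))

puts enabled_handler.create
puts disabled_handler.create
```

Keeping the flag check at a single branching point like this tends to make later removal simpler, since deleting the flag reduces to deleting the else branch and the condition.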
Test coverage
As with any if and else statement in code, you will want good branch coverage, with tests for both the enabled and disabled states of the feature.
You can use RSpec to stub the response of the IdentityConfig.store object method. This is best used in a before block within an RSpec context grouping:
it 'does something' do
  # Assert expected behaviors in the default state
end

context 'when example feature is disabled' do
  before do
    allow(IdentityConfig.store).to receive(:example_feature_enabled).and_return(false)
  end

  it 'does something different' do
    # Assert expected behaviors when the feature is the opposite of its default value
  end
end