Feature flags

Overview

Our Rails applications use configuration values to vary behavior per environment. This lets us add configuration values that control whether a specific application feature is enabled in an environment, so that we can iterate on a large feature over time while rolling it out. These are referred to as “feature flags”, and are nothing more than boolean configuration values used in code to gate user flows and logic paths.

Lifecycle of a feature flag

Adding a new feature flag

Most code changes don’t require a feature flag, since the code that’s being merged is self-contained and should be expected to go live immediately for all users in all environments.

Feature flags are best suited for large features which are planned to be implemented across multiple pull requests over a longer period of time, usually anywhere from a few weeks to several months of development. They can also be used as a faster alternative to incremental deploys when managing 50/50 state changes.

Consider the following when adding a new feature flag:

Timing:
- A feature flag should be implemented as one of the very first code changes of a large feature development. It is not uncommon for the first pull request of a new large feature development to contain only code changes for adding the feature flag.
Naming:
- As with any configuration value, the name of a feature flag should be clear, and it should be obvious to someone outside the team implementing the feature which user flows it affects. A verbose-but-clear feature flag name is better than a succinct-but-ambiguous name.
- Feature flag names are typically suffixed with _enabled, indicating that the value is a boolean controlling whether something is enabled or not.
Default Value:
- The simplest option is to enable a new feature flag by default, and disable it in the production section of the config. The production section is used as the default in all deployed environments. The feature can then be enabled in specific deployed environments using app-s3-secret.
```
# config/application.yml.default
new_feature_enabled: true

production:
  new_feature_enabled: false
```
- Pros:
  - This helps ensure broad test coverage for your new feature in continuous integration (CI) builds.
  - It is easier to update existing tests which fail due to changed expectations of how a user proceeds through the application.
- Cons:
  - Enabling by default introduces a risk that continuous integration will not run tests against what users experience in the production environment. This can be offset by ensuring branch test coverage for the disabled state in the affected code’s specs (see Test coverage).
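
The way the default and the production override combine can be illustrated with a minimal sketch. It assumes that top-level keys act as defaults and that the section named after the current environment overrides them; the resolve_config helper and the CONFIG_YAML constant are hypothetical names used only for illustration, not part of the application's actual configuration loader.

```ruby
require 'yaml'

# Hypothetical stand-in for the contents of config/application.yml.default.
CONFIG_YAML = <<~YAML
  new_feature_enabled: true

  production:
    new_feature_enabled: false
YAML

# Assumed resolution rule: top-level scalar keys are defaults, and the
# section named after the environment (if any) overrides them.
def resolve_config(yaml_text, environment)
  raw = YAML.safe_load(yaml_text)
  defaults = raw.reject { |_key, value| value.is_a?(Hash) }
  overrides = raw[environment] || {}
  defaults.merge(overrides)
end

resolve_config(CONFIG_YAML, 'test')['new_feature_enabled']       # => true
resolve_config(CONFIG_YAML, 'production')['new_feature_enabled'] # => false
```

Under this assumption, local development and CI see the enabled state, while deployed environments start from the disabled state until the flag is explicitly enabled for them.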

Rolling out a feature flag

A new feature flag is expected to be disabled by default in the production section of the config. You can then enable it in specific deployed environments using app-s3-secret.

Refer to Environment Descriptions to understand the purpose of each environment, and when it might make sense to enable a feature in that environment.

The recommended process is:

1. Enable the feature flag in local development and testing, as described in “Adding a new feature flag”.
2. Enable the feature flag in dev.
   - This environment is primarily used by internal team members for testing, which limits the impact and risk of less stable features.
3. Enable the feature flag in int and/or staging.
   - The int environment has a wider usage base, including partners testing and developing their integrations, and those accessing the partner portal. This has tradeoffs in that it increases potential impact, but also offers a better test prior to deploying a feature to prod.
   - staging is similar to dev in that it is very low traffic, but it can be useful when the feature involves an integration with a third-party vendor. The staging environment is configured to integrate with live endpoints and is therefore closer to what a production user will experience.
4. Enable the feature flag in prod.
   - Before enabling a feature flag in the prod environment, make sure that the feature is enabled in continuous integration (CI) tests, to increase confidence that there are no bugs in the code.

Removing a feature flag

Most feature flags should be added with the expectation that they’ll eventually be removed after the feature is live and stable in production.

The only reason to keep a feature flag after the feature has been enabled in production is to retain the ability to quickly disable the feature if a problem is discovered. While it can be tempting to keep a feature flag forever so that the feature can be disabled if needed, this is almost always a bad idea:

- It increases maintenance burden and technical debt, because we need to support both the enabled and disabled versions of the code.
- Because of code drift and future assumptions that a feature is enabled in production, the disabled version of the code may behave unpredictably, or not work at all!

Consider removing a feature flag within a month of the feature being enabled in production.
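
When removing a flag, it can help to confirm that no references remain in code, specs, or configuration. The following is a rough sketch of such a check; the files_referencing_flag helper is a hypothetical name, not an existing tool in the codebase.

```ruby
require 'find'

# Hypothetical helper: return the sorted paths of all files under `root`
# whose contents still mention `flag_name`.
def files_referencing_flag(root, flag_name)
  matches = []
  Find.find(root) do |path|
    next unless File.file?(path)
    # Read as binary so non-text files do not raise encoding errors.
    matches << path if File.binread(path).include?(flag_name)
  end
  matches.sort
end
```

For example, calling files_referencing_flag('app', 'example_feature_enabled') would list the files under app that still need to be updated before the flag can be deleted from the configuration.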

Developing with feature flags

Refer to “Secrets and Configuration” for more information about adding a new configuration value.

Using a feature flag in code

A feature flag is usually used to create a branching path in your code, with if and else handling the enabled and disabled states respectively.

```
def create
  if IdentityConfig.store.example_feature_enabled
    # Handle what happens when the feature is enabled
  else
    # Handle what happens when the feature is disabled
  end
end
```
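
To see both paths run outside of a Rails application, the sketch below replaces IdentityConfig.store with a hypothetical stand-in (FakeConfigStore) and uses a hypothetical ExampleController that receives its config as an argument. Neither name comes from the application itself, and the return values are placeholders for real behavior.

```ruby
# Hypothetical stand-in for IdentityConfig.store so the sketch runs without Rails.
FakeConfigStore = Struct.new(:example_feature_enabled, keyword_init: true)

# Hypothetical controller-like class; the config is passed in rather than
# read from a global, purely to keep the example self-contained.
class ExampleController
  def initialize(config)
    @config = config
  end

  def create
    if @config.example_feature_enabled
      :new_flow # the feature is enabled
    else
      :existing_flow # the feature is disabled
    end
  end
end

enabled = ExampleController.new(FakeConfigStore.new(example_feature_enabled: true))
disabled = ExampleController.new(FakeConfigStore.new(example_feature_enabled: false))

enabled.create  # => :new_flow
disabled.create # => :existing_flow
```

Keeping the flag check at a single branching point like this also makes eventual removal simpler: once the flag is deleted, only the body of the enabled branch needs to remain.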

Test coverage

As with any if and else statement in code, you will want good branch coverage, with tests for both the enabled and the disabled state of the feature.

You can use RSpec to stub the flag’s method on the IdentityConfig.store object. This works best in a before block within an RSpec context group:

```
it 'does something' do
  # Assert expected behaviors in the default state
end

context 'when example feature is disabled' do
  before do
    allow(IdentityConfig.store).to receive(:example_feature_enabled).and_return(false)
  end

  it 'does something different' do
    # Assert expected behaviors when the feature is the opposite of its default value
  end
end
```