
Feature Flags Pattern: Decouple Deployments from User Exposure

Control activation by cohort, tenant, or region without redeploying application code.

Abstract Algorithms · 12 min read

TLDR: Feature flags separate deploy from exposure. They are operationally valuable when you need cohort rollout, instant kill switches, or entitlement control without rebuilding or redeploying the service. They help only when treated like production configuration, with ownership, expiry, and observability; otherwise they become a second codebase hidden behind conditionals.

Operator note: Incident reviews usually do not blame “feature flags” in the abstract. They blame stale flags no one owned, conflicting flag combinations no one tested, or kill switches that depended on a remote control plane during the outage they were supposed to fix.

A feature flag is a runtime boolean: when the targeting rule evaluates true, the new code path runs; when false, the stable path runs instead. During Facebook's 2019 infrastructure incident, engineers reportedly disabled a problematic caching layer in under two minutes by toggling a feature flag — no deployment, no rollback pipeline, no waking a second team. Without the flag, the only option would have been an emergency deploy under active incident conditions.

If you ship production services, feature flags are the mechanism that separates “code is deployed” from “users are affected” and give you the fastest possible kill switch.

Worked example — flag evaluation at request time with a cached local snapshot:

```python
# No per-request network call — evaluated from a local config snapshot
if flags.get("new_checkout_flow", user_id=user.id, default=False):
    return new_checkout(cart)   # enabled for this cohort
return legacy_checkout(cart)    # safe fallback for everyone else
```

Disabling this globally takes one control-plane toggle — no redeploy, no incident bridge, no database change.

📖 When Feature Flags Actually Help

Feature flags are best when the deployment artifact and the exposure decision need to move at different speeds.

Use them for:

  • controlled rollout by cohort, tenant, or region,
  • kill switches for risky integrations or expensive features,
  • entitlement and plan-based access control,
  • safe migration paths where new and old behavior must coexist briefly.
| Use case | Why flags fit |
| --- | --- |
| Enable new billing UI for internal users first | Exposure can change without redeploy |
| Turn off a failing recommendation backend fast | Kill switch reduces blast radius immediately |
| Roll out by premium tenant or geography | Cohort control is more precise than traffic weights |
| Keep old and new write path side by side temporarily | Behavior can be switched gradually during migration |

🔍 When Not to Use Feature Flags

Flags are a poor substitute for basic code and architecture discipline.

Avoid using them when:

  • the flag is really a permanent configuration constant,
  • the code path should never be active in production,
  • the feature needs irreversible data migration before exposure,
  • multiple flags would create a combinatorial test matrix that nobody can own.
| Constraint | Better alternative |
| --- | --- |
| Permanent environment setting | Static config or service config |
| Release safety for infrastructure only | Canary or blue-green |
| One-off debugging path | Temporary admin switch with explicit removal plan |
| Large data migration with no coexistence window | Expand-contract migration first |

⚙️ How Flags Work in Production

Good flag systems have two planes:

  1. A control plane where owners define targeting rules, defaults, expiry, and audit history.
  2. A data plane where the application evaluates the flag locally or with a cached config snapshot.

The production sequence usually looks like this:

  1. Define the flag with owner, default, and removal date.
  2. Ship dormant code behind the flag.
  3. Expose to internal or low-risk cohorts first.
  4. Compare metrics by variation.
  5. Expand gradually or turn it off instantly if risk appears.
  6. Remove dead flag code once the rollout is complete.
| Control point | What to decide | Why it matters |
| --- | --- | --- |
| Default value | Safe state if control plane is unavailable | Prevents outage during config failure |
| Evaluation mode | Server-side, client-side, or hybrid | Changes latency and security trade-offs |
| Targeting rules | Cohort, tenant, region, percent, plan | Controls blast radius precisely |
| Cache behavior | TTL and bootstrap snapshot | Keeps kill switch usable during control-plane issues |
| Lifecycle | Owner and expiry date | Prevents permanent flag debt |
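The two planes can be reduced to a minimal in-process evaluator. This is a sketch with hypothetical names (`FlagClient`, `max_age_s`), not any real SDK: the data plane answers every request from the last cached snapshot, and a hard-coded safe default wins whenever the flag is unknown or the snapshot is too stale to trust.

```python
import time

class FlagClient:
    """Data-plane sketch: evaluates flags from a locally cached snapshot.

    A background refresher (the control-plane side, not shown) would call
    update(); evaluation itself never makes a network call.
    """

    def __init__(self, snapshot, fetched_at=None, max_age_s=300):
        self.snapshot = snapshot          # {flag_key: {"enabled": bool}}
        self.fetched_at = fetched_at if fetched_at is not None else time.time()
        self.max_age_s = max_age_s        # how stale a snapshot we still trust

    def update(self, snapshot):
        self.snapshot = snapshot
        self.fetched_at = time.time()

    def is_enabled(self, key, default=False):
        # Safe default wins if the snapshot is too old or the flag is unknown.
        if time.time() - self.fetched_at > self.max_age_s:
            return default
        rule = self.snapshot.get(key)
        if rule is None:
            return default
        return bool(rule.get("enabled", default))

client = FlagClient({"new_checkout_flow": {"enabled": True}})
client.is_enabled("new_checkout_flow")   # True while the snapshot is fresh
client.is_enabled("unknown_flag")        # False: safe default wins
```

The staleness check is what keeps the kill switch honest: a client that keeps serving an ancient snapshot forever can silently diverge from the control plane during exactly the incident the flag was meant to handle.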

🛠️ Unleash, LaunchDarkly OSS, and Flipt: Feature Flag Platforms in Practice

Unleash is the leading open-source feature flag platform with a Java SDK, a rich strategy engine (gradual rollout, user targeting, custom constraints), A/B variant support, and a self-hostable control plane. Flipt is a lightweight, GitOps-friendly open-source flag server with a gRPC API. OpenFeature is a CNCF-incubated vendor-neutral SDK standard that decouples flag evaluation code from the backing provider.

These tools solve the feature flag problem by providing a proper two-plane architecture: a control plane stores targeting rules, defaults, and audit history; a data plane evaluates flags locally from a cached snapshot so evaluation stays fast and resilient even during control-plane disruptions.

The full Unleash Java integration with UnleashConfig, FeatureDecisions, and RiskScoringService is shown in the 🏗️ Enterprise Java Example section below. Here is the minimal wiring to get started with Unleash in any Spring Boot service:

```java
import io.getunleash.DefaultUnleash;
import io.getunleash.Unleash;
import io.getunleash.UnleashContext;
import io.getunleash.util.UnleashConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FeatureFlagConfig {

    @Bean
    public Unleash unleash() {
        // SDK polls the control plane on a fixed interval and caches rules locally.
        // Evaluation never makes a live network call — the local cache answers.
        return new DefaultUnleash(
            UnleashConfig.builder()
                .appName("checkout-service")
                .instanceId(System.getenv().getOrDefault("HOSTNAME", "local"))
                .unleashAPI(System.getenv("UNLEASH_URL"))
                .apiKey(System.getenv("UNLEASH_TOKEN"))
                .build()
        );
    }
}
```

```java
// Usage in any Spring bean — pass user/tenant context for targeting
boolean enabled = unleash.isEnabled(
    "new-checkout-flow",
    UnleashContext.builder()
        .userId(userId)
        .addProperty("plan", plan)
        .addProperty("region", region)
        .build(),
    false   // safe default if SDK cannot resolve the flag
);
```

Flipt offers the same evaluation semantics with a self-contained binary, gRPC API, and GitOps-native flag definitions — no separate database required for small teams. OpenFeature wraps either provider with a vendor-neutral Client interface so teams can swap backends without touching flag evaluation code.

For a full deep-dive on Unleash, LaunchDarkly OSS, and Flipt feature flag platforms, a dedicated follow-up post is planned.

🏗️ Enterprise Java Example: Rolling Out checkout-risk-v2

Scenario: your checkout service has a new fraud/risk engine (v2). You want to expose it only to enterprise tenants in eu-west at first, ramp gradually, and retain instant rollback.

### 1) Isolate the flag boundary in a dedicated component

```java
package com.acme.checkout.flags;

import io.getunleash.DefaultUnleash;
import io.getunleash.Unleash;
import io.getunleash.util.UnleashConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FlagConfig {

  @Bean
  public Unleash unleash() {
    UnleashConfig config = UnleashConfig.builder()
        .appName("checkout-service")
        .instanceId(System.getenv().getOrDefault("HOSTNAME", "checkout-1"))
        .unleashAPI(System.getenv("UNLEASH_API_URL"))
        .apiKey(System.getenv("UNLEASH_API_TOKEN"))
        .build();

    return new DefaultUnleash(config);
  }
}
```
### 2) Pass enterprise context into flag evaluation

```java
package com.acme.checkout.flags;

import io.getunleash.Unleash;
import io.getunleash.UnleashContext;
import org.springframework.stereotype.Component;

@Component
public class FeatureDecisions {

  private final Unleash unleash;

  public FeatureDecisions(Unleash unleash) {
    this.unleash = unleash;
  }

  public boolean useRiskEngineV2(String userId, String tenantId, String plan, String region) {
    UnleashContext context = UnleashContext.builder()
        .userId(userId)
        .addProperty("tenant", tenantId)
        .addProperty("plan", plan)
        .addProperty("region", region)
        .build();

    // `false` is the safe default when flag state cannot be resolved.
    return unleash.isEnabled("checkout-risk-v2", context, false);
  }
}
```

Control-plane targeting rule for this scenario:

  • Strategy 1: internal users = on
  • Strategy 2: plan=enterprise AND region=eu-west with gradual rollout (5% -> 25% -> 50% -> 100%)
  • Global fallback: off

### 3) Use a stable fallback path in business logic

```java
package com.acme.checkout.risk;

import com.acme.checkout.flags.FeatureDecisions;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class RiskScoringService {

  private final FeatureDecisions featureDecisions;
  private final RiskEngineV1 riskEngineV1;
  private final RiskEngineV2 riskEngineV2;
  private final MeterRegistry meterRegistry;

  public RiskScoringService(
      FeatureDecisions featureDecisions,
      RiskEngineV1 riskEngineV1,
      RiskEngineV2 riskEngineV2,
      MeterRegistry meterRegistry
  ) {
    this.featureDecisions = featureDecisions;
    this.riskEngineV1 = riskEngineV1;
    this.riskEngineV2 = riskEngineV2;
    this.meterRegistry = meterRegistry;
  }

  public RiskDecision score(RiskRequest request) {
    boolean useV2 = featureDecisions.useRiskEngineV2(
        request.userId(),
        request.tenantId(),
        request.plan(),
        request.region()
    );

    String variant = useV2 ? "v2" : "v1";
    Timer.Sample sample = Timer.start(meterRegistry);

    try {
      if (useV2) {
        return riskEngineV2.score(request);
      }
      return riskEngineV1.score(request);
    } catch (RuntimeException ex) {
      // Fail-safe behavior keeps checkout available even if the new path fails.
      meterRegistry.counter("checkout.risk.fallback_total", "reason", "v2_exception").increment();
      return riskEngineV1.score(request);
    } finally {
      sample.stop(Timer.builder("checkout.risk.latency")
          .tag("variant", variant)
          .register(meterRegistry));
    }
  }
}
```

🧠 Deep Dive: What Incident Reviews Usually Reveal First

| Failure mode | Early symptom | Root cause | First mitigation |
| --- | --- | --- | --- |
| Kill switch does not work during incident | App cannot fetch fresh flag values | Data plane depended on live control-plane availability | Add cached local evaluation and safe defaults |
| Old feature path keeps breaking months later | No one remembers which flags are still active | Missing owner and expiry discipline | Add flag inventory with review dates |
| User reports inconsistent behavior across sessions | Targeting rule is unstable or client-side evaluation differs | Sticky assignment rules are missing | Use deterministic bucketing |
| Metrics look healthy overall, one cohort is broken | Variation analysis is aggregated too broadly | No cohort-by-variation dashboard | Break metrics down by flag variant |
| Testing becomes impossible | Too many overlapping flags | Flag system replaced design decisions | Cap concurrent high-impact flags in one path |

Field note: the fastest way to turn flags into operational debt is to keep “temporary” release flags after rollout. Every stale flag becomes hidden branch logic that on-call engineers must rediscover under pressure.
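The "deterministic bucketing" mitigation from the table above can be sketched with generic hashing; this is not any particular SDK's algorithm, just the common shape. Hashing the user ID together with the flag key gives each user a stable bucket per flag, so assignment is sticky across sessions and independent between flags.

```python
import hashlib

def bucket(user_id: str, flag_key: str) -> int:
    """Map a user to a stable bucket in [0, 100) for one flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100

def in_rollout(user_id: str, flag_key: str, percent: int) -> bool:
    """Enabled iff the user's bucket falls under the rollout percentage."""
    return bucket(user_id, flag_key) < percent
```

Because the bucket never changes, ramping 5% -> 25% -> 50% only ever adds users to the enabled cohort; no one flips back and forth between variations as the rollout widens.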

The Internals: Control Plane, Data Plane, and Evaluation Boundary

Good flag systems separate two planes: a control plane that stores targeting rules, defaults, and audit history, and a data plane where the application evaluates flags locally from a cached snapshot. Separating them keeps evaluation fast and resilient — the data plane can answer flag questions even when the control plane is temporarily unreachable. The critical implementation rule is a hard-coded safe default that activates if the local snapshot is stale or if the SDK cannot bootstrap at startup.

Performance Analysis: Evaluation Latency and Kill-Switch Reliability

On the hot request path, flag evaluation costs microseconds — the decision reads from an in-process cache with no network round trip. The performance risk is at the cache refresh boundary: if the control plane degrades during an incident, evaluation must fall back to the last snapshot and the configured safe default. Per-variation latency and error-rate metrics are essential; aggregate metrics hide degradation in the enabled cohort while the disabled cohort remains healthy.
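The aggregation risk is easy to demonstrate with a toy per-variant counter (hypothetical names, standing in for real Micrometer or Prometheus tags): a small enabled cohort can be badly broken while the blended error rate still looks healthy.

```python
from collections import defaultdict

class VariantMetrics:
    """Tracks requests and errors per flag variation so a broken
    enabled cohort cannot hide inside a healthy aggregate."""

    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)

    def record(self, variant: str, ok: bool) -> None:
        self.requests[variant] += 1
        if not ok:
            self.errors[variant] += 1

    def error_rate(self, variant: str) -> float:
        total = self.requests[variant]
        return self.errors[variant] / total if total else 0.0

m = VariantMetrics()
for _ in range(98):
    m.record("disabled", ok=True)   # large healthy cohort
m.record("enabled", ok=False)       # tiny enabled cohort, failing
m.record("enabled", ok=True)
# Aggregate is 1 error in 100 requests (1%), while the
# enabled cohort alone is failing half the time (50%).
```

This is why the per-variation dashboards mentioned above are a prerequisite for ramping, not a nice-to-have.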

📊 Feature Flag Evaluation Flow

```mermaid
flowchart TD
    A[Request arrives] --> B[Load cached flag configuration]
    B --> C[Evaluate flag rule for user, tenant, or region]
    C --> D{Flag on?}
    D -->|Yes| E[Execute new behavior]
    D -->|No| F[Execute stable behavior]
    E --> G[Emit metrics with flag variation]
    F --> G
    H[Control plane update] --> B
```

🧪 Concrete Config Example: Flag Definition with Ownership

```json
{
  "key": "billing_ui_v2",
  "type": "release",
  "default": false,
  "owner": "billing-platform",
  "expires_at": "2026-06-30",
  "kill_switch": true,
  "rules": [
    {
      "match": { "segment": "internal" },
      "variation": true
    },
    {
      "match": { "plan": "enterprise" },
      "rollout": 25,
      "variation": true
    }
  ]
}
```

Why this matters operationally:

  • default must be the safe behavior if the flag service is unreachable.
  • owner and expires_at turn the flag into an owned operational asset.
  • Rule-based rollout keeps exposure aligned with business cohorts, not only percent traffic.
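A minimal evaluator for this illustrative definition shows how first-match rules and the rollout percentage combine. This is a sketch of the schema above, not any real platform's semantics; `ctx` carries user attributes and `bucket` is the user's stable 0-99 rollout bucket.

```python
def evaluate(flag: dict, ctx: dict, bucket: int) -> bool:
    """First matching rule wins; otherwise the safe default applies."""
    for rule in flag.get("rules", []):
        match = rule.get("match", {})
        if all(ctx.get(k) == v for k, v in match.items()):
            rollout = rule.get("rollout", 100)   # no rollout key = 100%
            if bucket < rollout:
                return rule.get("variation", flag["default"])
    return flag["default"]

flag = {
    "key": "billing_ui_v2",
    "default": False,
    "rules": [
        {"match": {"segment": "internal"}, "variation": True},
        {"match": {"plan": "enterprise"}, "rollout": 25, "variation": True},
    ],
}
evaluate(flag, {"segment": "internal"}, bucket=99)   # True: internal always on
evaluate(flag, {"plan": "enterprise"}, bucket=10)    # True: inside the 25% ramp
evaluate(flag, {"plan": "enterprise"}, bucket=80)    # False: outside the ramp
```

Note that an unknown user with no matching attributes falls straight through to the default, which is exactly the behavior you want if the context is malformed.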

🌍 Real-World Applications: What to Instrument and What to Alert On

| Signal | Why it matters | Typical alert |
| --- | --- | --- |
| Variation-specific error rate | Shows whether the new behavior is actually safe | Candidate variation error spike |
| Variation-specific p95/p99 latency | Detects hidden cost of enabled path | Tail latency regression for enabled cohort |
| Evaluation cache age | Shows if data plane is running on stale config | Cache too old during control-plane incident |
| Flag debt count | Measures how many flags should have been removed | Expired flags still active |
| Targeting distribution | Verifies exposure matches intent | Too much or too little cohort exposure |
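The "flag debt count" signal reduces to a simple inventory scan. The inventory shape here is hypothetical, mirroring the `owner`/`expires_at` fields from the config example earlier:

```python
from datetime import date

def expired_flags(inventory: list, today: date) -> list:
    """Return keys of flags that are past expiry but still active."""
    debt = []
    for flag in inventory:
        expires = date.fromisoformat(flag["expires_at"])
        if expires < today and flag.get("active", True):
            debt.append(flag["key"])
    return debt

inventory = [
    {"key": "billing_ui_v2", "expires_at": "2026-06-30", "active": True},
    {"key": "old_search", "expires_at": "2024-01-31", "active": True},
]
expired_flags(inventory, date(2025, 1, 1))   # ['old_search']
```

Wiring this into a scheduled job that pages the owning team is usually enough to keep the debt count near zero.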

What breaks first:

  1. Evaluation availability during control-plane problems.
  2. Missing per-variation dashboards.
  3. Flag sprawl in the most critical request paths.

⚖️ Trade-offs & Failure Modes: Pros, Cons, and Alternatives

| Category | Practical impact | Mitigation |
| --- | --- | --- |
| Pros | Decouples deploy from exposure | Use for staged rollout and kill switches |
| Pros | Enables tenant and cohort targeting | Keep targeting rules deterministic |
| Cons | Adds branch logic and test complexity | Remove flags quickly after rollout |
| Cons | Requires reliable config delivery and audit | Cache config locally and log changes |
| Risk | Flag debt becomes permanent complexity | Enforce expiry and ownership reviews |
| Risk | Teams use flags instead of sound migration design | Keep data compatibility decisions separate |

🧭 Decision Guide for Release Control

| Situation | Recommendation |
| --- | --- |
| Need user or tenant exposure control | Use feature flags |
| Need traffic-based confidence in a new binary | Use canary |
| Need instant environment-level rollback | Use blue-green |
| Need both deployment safety and exposure control | Combine canary or blue-green with flags deliberately |

If a flag cannot be assigned an owner and removal date, it should probably not be created.

📚 Interactive Review: Flag Readiness Checklist

Before enabling a flag beyond the first cohort, ask:

  1. What is the safe default if the control plane is unreachable?
  2. Which dashboard compares enabled vs disabled behavior directly?
  3. How are users or tenants assigned consistently across sessions?
  4. What exact event retires the flag and removes the code path?
  5. Can on-call disable the feature without waiting for a deploy or database change?

Scenario question: if the new billing path is healthy for internal users but causes latency only for enterprise tenants with large invoices, do you keep the flag on globally, restrict the cohort, or redesign the targeting rule?

📌 TLDR: Summary & Key Takeaways

  • Feature flags are release-control tools, not free-form branching systems.
  • Safe defaults, local evaluation, and ownership matter more than UI polish in the flag platform.
  • Per-variation metrics are essential for reliable rollout decisions.
  • Expiry dates and code cleanup prevent flag debt from becoming architecture debt.
  • Use flags for exposure control, not as a shortcut around migration or rollout design.

📝 Practice Quiz

  1. What is the main operational value of a feature flag?

A) It guarantees zero bugs
B) It separates deployment from exposure and enables fast disablement
C) It removes the need for testing

Correct Answer: B

  2. Which design choice matters most during a control-plane outage?

A) The color of the admin dashboard
B) Safe defaults and local cached evaluation behavior
C) The number of active experiments company-wide

Correct Answer: B

  3. What is the clearest sign of flag debt?

A) A rollout flag still active months after the feature is fully launched
B) A flag has an owner and expiry date
C) A flag is used for one staged rollout

Correct Answer: A

  4. Open-ended challenge: two interacting flags affect the same checkout path and only one cohort is failing. How would you simplify the targeting model before the next rollout?