
Canary Testing

Overview

Canary Deployment is a gradual rollout strategy where a new version of a service is released to a small subset of users first. If no issues are detected, the deployment is incrementally expanded to more users until it replaces the old version entirely.

Think of a miner sending a canary into a coal mine to detect danger before the humans go in — in software, we use this approach to detect issues early, safely.

Advantages

  • Reduced risk: Only a small percentage of users are affected by potential issues.
  • Faster feedback: Real usage reveals bugs that testing may miss.
  • Safe rollback: Easy to revert if issues are detected early.
  • Supports A/B testing: You can collect metrics and compare performance or behaviour.

Drawbacks

  • Traffic routing complexity: Requires infrastructure that can split traffic (e.g., load balancer, feature flag, service mesh).
  • Monitoring is essential: You need solid observability to spot regressions or anomalies quickly.
  • State/data mismatches: Care must be taken with shared databases or APIs to ensure compatibility across versions.

How It Works (Step-by-Step)

  1. Deploy the new version alongside the current one (usually on a separate instance or container).
  2. Route a small % of traffic (e.g. 5%) to the new version (canary).
  3. Monitor metrics (errors, latency, usage, etc.) for that group.
  4. If everything looks good, gradually increase traffic to the new version (e.g. 25%, 50%, 100%).
  5. If problems occur, roll back by redirecting traffic entirely to the old version.
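The routing decision in step 2 can be sketched in C#. This is an illustration, not a specific Azure API: `CanaryRouter` and its hashing scheme are assumptions made for the example.

```csharp
// Sketch of a percentage-based traffic split. A stable hash of the
// user ID keeps each user pinned to the same version across requests.
public static class CanaryRouter
{
    public static bool RouteToCanary(string userId, int canaryPercent)
    {
        // Simple deterministic hash (string.GetHashCode is randomized
        // per process in .NET, so we avoid it here).
        int hash = 0;
        foreach (char c in userId)
        {
            hash = unchecked(hash * 31 + c);
        }

        // Bucket into 0..99 and compare against the rollout percentage.
        int bucket = (hash & 0x7fffffff) % 100;
        return bucket < canaryPercent;
    }
}
```

For example, `CanaryRouter.RouteToCanary("user-42", 5)` sends roughly 5% of user IDs to the canary; raising the percentage in step 5 is just a config change, and rolling back means setting it to 0.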

Common Use Cases

  • Deploying API changes
  • Testing new UI features
  • Introducing performance optimizations
  • Releasing ML models into production

Azure Example

With Azure App Service Deployment Slots

  1. Create two slots: production and canary
  2. Deploy new version to canary
  3. Use traffic routing to send a percentage to the canary slot
  4. Monitor using Application Insights
  5. Gradually increase traffic as confidence grows
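Assuming an app named `my-app` in resource group `my-rg` (placeholder names), the slot workflow above maps onto the Azure CLI roughly as follows; this is a sketch of the commands, not a full pipeline:

```shell
# Steps 1-2: create the canary slot and deploy the new version to it.
az webapp deployment slot create --name my-app --resource-group my-rg --slot canary

# Step 3: route 5% of traffic to the canary slot.
az webapp traffic-routing set --name my-app --resource-group my-rg --distribution canary=5

# Step 5: increase the share as confidence grows...
az webapp traffic-routing set --name my-app --resource-group my-rg --distribution canary=50

# ...and clear the routing rule once the canary becomes production.
az webapp traffic-routing clear --name my-app --resource-group my-rg
```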

With Azure Front Door / Application Gateway

  • Use rules or percentage-based routing to split traffic
  • Monitor using Log Analytics, Azure Monitor, etc.

Example Scenario

  • You deploy a new version of your web API that introduces new endpoints.
  • Route 5% of traffic to the new version using Azure Front Door.
  • Watch for any 500 errors, slow response times, or odd usage patterns.
  • Gradually increase to 50%, then 100%.
  • At 100%, the new version becomes the full production service.
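The monitoring step can be expressed as an Application Insights / Log Analytics query. A sketch using the standard `requests` schema; adjust the grouping to however your canary is identified (role name, slot, custom dimension):

```kusto
// 500-class errors and tail latency over the last hour, per role.
requests
| where timestamp > ago(1h)
| summarize
    failures = countif(resultCode startswith "5"),
    total = count(),
    p95DurationMs = percentile(duration, 95)
  by cloud_RoleName
```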

C# Pseudo-code Analogy (Feature Flag Controlled)

Imagine you use a feature flag to enable a new function:

// Controller action guarded by a feature flag: users in the canary
// cohort get the new data source, everyone else stays on the legacy one.
public IActionResult GetData()
{
    if (FeatureFlags.IsEnabled("UseNewDataSource"))
    {
        // Canary path: the new implementation under evaluation.
        return NewDataSource.Fetch();
    }

    // Stable path: the proven implementation, and the rollback target.
    return LegacyDataSource.Fetch();
}

You release this behind a feature flag to 5% of users, observe behaviour, and slowly roll it out to more.
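If the flag is driven by the Microsoft.FeatureManagement library (the "Feature Management" integration mentioned below), the 5% cohort can be configured declaratively with its built-in percentage filter. A sketch of the `appsettings.json` shape, assuming the `FeatureFlags` helper reads this configuration:

```json
{
  "FeatureManagement": {
    "UseNewDataSource": {
      "EnabledFor": [
        {
          "Name": "Microsoft.Percentage",
          "Parameters": { "Value": 5 }
        }
      ]
    }
  }
}
```

Rolling out further is then a matter of raising `Value`, with no redeploy.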

Comparison to Other Patterns

| Pattern    | Traffic Approach            | Rollback Ease        | Risk Level  |
| ---------- | --------------------------- | -------------------- | ----------- |
| Canary     | Gradual user % rollout      | ✅ Easy              | 🔽 Low      |
| Blue/Green | Full switch (manual)        | ✅ Very easy         | 🔽 Low      |
| Shadow     | Duplicated traffic          | N/A (no live output) | 🔽 Very low |
| Rolling    | Update instance by instance | ⚠️ Moderate          | ⚠️ Medium   |

Summary

| Key Concept       | Description                                                                 |
| ----------------- | --------------------------------------------------------------------------- |
| What it is        | Gradual rollout of a new version to a subset of users                        |
| Risk              | Low (if monitored well)                                                      |
| Rollback          | Simple: redirect traffic to the previous version                             |
| Ideal for         | Incremental updates, real-world testing, confident releases                  |
| Azure Integration | App Service Slots, Azure Front Door, Application Gateway, Feature Management |