Deployment & Release Management

Standard ID: DEPLOYMENT | Version: 1.0 | Effective: 2026-04-03 | Owner: CDO

1. Environments

| Environment | Purpose | Deployment | Access |
|---|---|---|---|
| Sandbox | Merchant integration testing | On push to feature branch | External (merchants) |
| Dev | Internal development and smoke testing | Auto on merge to main | Internal only |
| Test | QA, regression, performance testing | On release candidate tag | Internal only |
| Prod | Live traffic (PK, BD, NP, IQ, EG) | Promoted from Test after approval | External (merchants + consumers) |

2. Deployment Strategies

| Strategy | When to Use |
|---|---|
| Blue/green | Default for stateless Go services behind KrakenD |
| Canary | High-risk changes, new payment flows, gateway integrations |
| Rolling | Infrastructure components, configuration changes |

Blue/Green Process

  1. Deploy new version to inactive (green) environment.
  2. Run health checks and synthetic transactions against green.
  3. Switch KrakenD routing from blue to green.
  4. Monitor error rates for 5 minutes.
  5. If healthy, decommission blue. If not, switch back immediately.
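The blue/green steps above can be sketched as a small control loop. This is a minimal illustration, not the pipeline's actual implementation: the `Router` interface, the `Checks` struct, and `memRouter` are hypothetical names, and no real KrakenD API is called. The demo uses a shortened monitoring window; step 4 calls for 5 minutes.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Router abstracts the traffic switch. In this standard that role is played
// by KrakenD routing; the interface itself is a hypothetical stand-in.
type Router interface {
	SwitchTo(color string) error
}

// Checks bundles the probes used before and after the switch (hypothetical).
type Checks struct {
	Healthy     func(color string) bool // health checks and synthetic transactions
	ErrorRateOK func() bool             // live error rate compared with baseline
}

// BlueGreen runs steps 2-5: verify green, switch, monitor for the given
// window, and switch back to blue on the first failed error-rate check.
// A nil return means blue can be decommissioned by the caller.
func BlueGreen(r Router, c Checks, window, poll time.Duration) error {
	if !c.Healthy("green") {
		return errors.New("green failed pre-switch checks; blue keeps serving")
	}
	if err := r.SwitchTo("green"); err != nil {
		return fmt.Errorf("switch to green: %w", err)
	}
	deadline := time.Now().Add(window)
	for {
		if !c.ErrorRateOK() {
			if err := r.SwitchTo("blue"); err != nil {
				return fmt.Errorf("switch back to blue failed: %w", err)
			}
			return errors.New("error rate regressed; switched back to blue")
		}
		if !time.Now().Before(deadline) {
			break
		}
		time.Sleep(poll)
	}
	return nil
}

// memRouter is an in-memory Router used only for the demo below.
type memRouter struct{ Active string }

func (m *memRouter) SwitchTo(color string) error { m.Active = color; return nil }

func main() {
	r := &memRouter{Active: "blue"}
	c := Checks{
		Healthy:     func(string) bool { return true },
		ErrorRateOK: func() bool { return true },
	}
	// Shortened window for the demo; production would use 5 minutes.
	err := BlueGreen(r, c, 50*time.Millisecond, 10*time.Millisecond)
	fmt.Println("active:", r.Active, "err:", err)
}
```

Keeping the switch behind an interface means the same loop can be exercised in tests with an in-memory router before it is wired to the real gateway.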

Canary Process

  1. Route 5% of traffic to canary.
  2. Monitor error rate, latency, and success rate for 15 minutes.
  3. Promote to 25%, then 50%, then 100% at 15-minute intervals.
  4. Roll back automatically if the canary error rate exceeds the baseline by more than 1%.
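As a rough sketch of the canary progression, the following walks the 5% → 25% → 50% → 100% stages and rolls back on the error-rate rule. It assumes the "1%" threshold means one percentage point above baseline, which the standard does not spell out. `Sample`, `RunCanary`, and the callback parameters are hypothetical names. Latency and success rate are monitored in step 2 but have no stated rollback threshold, so they are not modelled here.

```go
package main

import "fmt"

// stages are the traffic percentages from steps 1 and 3.
var stages = []int{5, 25, 50, 100}

// maxErrorDelta is an assumption: "exceeds baseline by 1%" is read as
// one percentage point, expressed here as a fraction.
const maxErrorDelta = 0.01

// Sample is one observation window (15 minutes in the process above).
// Rates are fractions of failed requests, e.g. 0.012 means 1.2%.
type Sample struct {
	CanaryErrorRate   float64
	BaselineErrorRate float64
}

// shouldRollback applies the step 4 rule to a single sample.
func shouldRollback(s Sample) bool {
	return s.CanaryErrorRate-s.BaselineErrorRate > maxErrorDelta
}

// RunCanary shifts traffic stage by stage. After each shift it calls observe,
// which is expected to block for the observation window and return metrics.
// It returns the final canary percentage and whether the rollout completed.
func RunCanary(setWeight func(percent int), observe func(percent int) Sample) (int, bool) {
	for _, p := range stages {
		setWeight(p)
		if shouldRollback(observe(p)) {
			setWeight(0) // send all traffic back to the stable version
			return 0, false
		}
	}
	return 100, true
}

func main() {
	final, ok := RunCanary(
		func(p int) { fmt.Printf("canary weight set to %d%%\n", p) },
		func(p int) Sample {
			// Demo data: the canary degrades once it takes half the traffic.
			if p >= 50 {
				return Sample{CanaryErrorRate: 0.030, BaselineErrorRate: 0.004}
			}
			return Sample{CanaryErrorRate: 0.005, BaselineErrorRate: 0.004}
		},
	)
	fmt.Println("final weight:", final, "completed:", ok)
}
```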

3. Zero-Downtime Requirement

Zero-downtime deployment is mandatory for all payment services. There are no maintenance windows, so every deployment must complete without interrupting traffic.

  • Application code must handle graceful shutdown (drain in-flight requests).
  • Database migrations must be backward-compatible (see DATABASE-SCHEMA-CHANGE-STANDARD).
  • KrakenD configuration changes applied via hot-reload, not restart.
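For the graceful-shutdown requirement, a Go service can lean on the standard library's `http.Server.Shutdown`, which stops accepting new connections and waits for in-flight requests to finish. The sketch below is illustrative only: the `serveWithDrain` helper, the `:8080` address, the `/healthz` path, and the 30-second drain budget are all assumptions, not values from this standard.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// serveWithDrain serves on addr until stop is closed, then stops accepting
// new connections and waits up to drainSeconds for in-flight requests.
func serveWithDrain(addr string, h http.Handler, stop <-chan struct{}, drainSeconds int) error {
	srv := &http.Server{Addr: addr, Handler: h}
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		return err // the listener failed before any stop signal arrived
	case <-stop:
	}

	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(drainSeconds)*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err // drain budget exceeded or shutdown failed
	}
	// After Shutdown, ListenAndServe returns http.ErrServerClosed.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	// SIGTERM is what most orchestrators send before stopping a container.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
	defer cancel()

	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	// Address, path, and 30-second drain budget are illustrative assumptions.
	if err := serveWithDrain(":8080", mux, ctx.Done(), 30); err != nil {
		log.Fatal(err)
	}
}
```

The drain budget should be shorter than whatever termination grace period the platform allows, otherwise the process may be killed while requests are still in flight.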

4. Rollback

  • Automated rollback: triggered if health checks fail within 5 minutes post-deployment.
  • Manual rollback: available via pipeline for up to 1 hour post-deployment.
  • Rollback plan: documented in every PR for production deployments.
  • Database rollback: see DATABASE-SCHEMA-CHANGE-STANDARD for migration rollback procedures.

5. Feature Flags

Use PostHog for feature flag management.

| Flag Type | Use Case |
|---|---|
| Percentage rollout | Gradual feature release (5% → 25% → 100%) |
| Per-merchant targeting | Enable features for specific merchants |
| Kill switch | Instantly disable a feature in production |

Rules:

  • All new merchant-facing features launch behind a feature flag.
  • Flags must be removed within 30 days of full rollout.
  • Flag naming: {product}-{feature} (e.g., payin-webhook-v2).
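The naming and 30-day removal rules lend themselves to a simple lint check, for example in CI. The sketch below is purely local and does not call PostHog. The exact character set allowed in a flag name is an assumption (lowercase letters, digits, and hyphens), and `ValidFlagName`, `Overdue`, and `mustDate` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// flagName matches {product}-{feature}: a lowercase product, then one or more
// hyphen-separated feature segments. The permitted character set is assumed.
var flagName = regexp.MustCompile(`^[a-z][a-z0-9]*(-[a-z0-9]+)+$`)

// ValidFlagName reports whether name follows the {product}-{feature} pattern.
func ValidFlagName(name string) bool {
	return flagName.MatchString(name)
}

// maxFlagAge is the 30-day removal window from the rules above.
const maxFlagAge = 30 * 24 * time.Hour

// Overdue reports whether a flag that reached full rollout at fullRollout
// should already have been removed as of now.
func Overdue(fullRollout, now time.Time) bool {
	return now.Sub(fullRollout) > maxFlagAge
}

// mustDate parses a YYYY-MM-DD date in UTC and panics on malformed input.
func mustDate(s string) time.Time {
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	fmt.Println(ValidFlagName("payin-webhook-v2")) // example name from the rules
	fmt.Println(ValidFlagName("PayIn_Webhook"))    // wrong case and separator
	fmt.Println(Overdue(mustDate("2026-04-01"), mustDate("2026-05-15")))
}
```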

6. Release Cadence

| Environment | Cadence |
|---|---|
| Dev | Continuous (every merge to main) |
| Test | Continuous (every release candidate) |
| Prod | Weekly release train (Tuesday 10:00 UTC) or on-demand for critical fixes |

Emergency releases may bypass the weekly train with CDO approval.
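Tooling that schedules or announces the weekly train needs to compute the next Tuesday 10:00 UTC slot. A minimal sketch follows; `NextTrain` and `mustUTC` are hypothetical helpers, and on-demand and emergency releases are not modelled.

```go
package main

import (
	"fmt"
	"time"
)

// NextTrain returns the first Tuesday 10:00 UTC strictly after now.
func NextTrain(now time.Time) time.Time {
	now = now.UTC()
	// Days until the coming Tuesday; 0 when today is Tuesday.
	days := (int(time.Tuesday) - int(now.Weekday()) + 7) % 7
	// time.Date normalises day overflow past the end of the month.
	t := time.Date(now.Year(), now.Month(), now.Day()+days, 10, 0, 0, 0, time.UTC)
	if !t.After(now) {
		// Today is Tuesday and the 10:00 slot has already passed.
		t = t.AddDate(0, 0, 7)
	}
	return t
}

// mustUTC parses an RFC 3339 timestamp and panics on malformed input.
func mustUTC(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustUTC("2026-04-03T12:00:00Z") // the standard's effective date
	fmt.Println("next train:", NextTrain(now).Format(time.RFC3339))
}
```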

7. Change Management

| Category | Approval | Example |
|---|---|---|
| Standard | Auto-approved by CI | Dependency updates, lint fixes, documentation |
| Significant | CDO review required | New features, API changes, config changes |
| Emergency | Deploy immediately, post-hoc review within 24h | P1 incident fix, security patch |

8. Pre-Deployment Checklist

Before promoting to Prod:

  • All tests pass (unit, integration, contract).
  • Security scan — no high/critical findings.
  • OpenAPI spec linted and published.
  • No open P1 bugs against this release.
  • Rollback plan documented in PR.
  • Database migrations tested on anonymised Prod clone.
  • Feature flags configured for gradual rollout.
  • Monitoring dashboards and alerts verified.

9. Post-Deployment Verification

Within 5 minutes of production deployment:

  1. Health checks — all service endpoints return 200.
  2. Synthetic transactions — automated test payments through each product (Pay-In, Pay-Out).
  3. Error rate — must not exceed pre-deployment baseline by more than 0.5%.
  4. Latency — P95 latency must not increase by more than 10%.
  5. OpenTelemetry traces — verify traces flow end-to-end through KrakenD to services.
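The two numeric gates (checks 3 and 4) can be expressed as a small comparison against the pre-deployment baseline. This sketch assumes the 0.5% error-rate allowance means 0.5 percentage points and that the 10% latency allowance is relative to the baseline P95; the standard does not state either explicitly. `Metrics` and `Verify` are hypothetical names. Health checks, synthetic transactions, and trace verification would be separate probes.

```go
package main

import "fmt"

// Metrics is a snapshot taken before or after a deployment.
type Metrics struct {
	ErrorRate float64 // fraction of failed requests, e.g. 0.012 means 1.2%
	P95Millis float64 // P95 latency in milliseconds
}

const (
	// Assumption: "0.5%" is read as 0.5 percentage points above baseline.
	maxErrorRateIncrease = 0.005
	// Assumption: "10%" is relative to the baseline P95.
	maxP95Increase = 0.10
)

// Verify compares current metrics with the baseline and returns one message
// per violated gate. An empty result means both gates passed.
func Verify(baseline, current Metrics) []string {
	var failures []string
	if delta := current.ErrorRate - baseline.ErrorRate; delta > maxErrorRateIncrease {
		failures = append(failures, fmt.Sprintf(
			"error rate up %.2f points (limit %.2f)", delta*100, maxErrorRateIncrease*100))
	}
	if limit := baseline.P95Millis * (1 + maxP95Increase); current.P95Millis > limit {
		failures = append(failures, fmt.Sprintf(
			"P95 %.0f ms exceeds limit %.0f ms", current.P95Millis, limit))
	}
	return failures
}

func main() {
	baseline := Metrics{ErrorRate: 0.010, P95Millis: 200}
	current := Metrics{ErrorRate: 0.020, P95Millis: 230}
	for _, f := range Verify(baseline, current) {
		fmt.Println("FAIL:", f)
	}
}
```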

10. Database Migrations

  • Applied before application deployment.
  • Must be forward-only and backward-compatible.
  • See DATABASE-SCHEMA-CHANGE-STANDARD for full process.

11. Artefact Management

  • Container images tagged with: {service}:{git-sha}-v{semver} (e.g., payin-api:a1b2c3d-v2.3.0).
  • Images stored in private container registry.
  • Images are immutable — never overwrite a tag.
  • Retention: keep last 20 images per service, plus all release-tagged images.
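The tag format and retention rule can be illustrated with a few helpers. This is a sketch under assumptions: the regular expression's allowed characters and SHA length (7 to 40 hex digits) are guesses, the retention rule is read as "the 20 newest images plus any release-tagged image regardless of age", and `Image`, `ImageTag`, `ParseImageTag`, and `Prune` are hypothetical names that do not correspond to any registry API.

```go
package main

import (
	"fmt"
	"regexp"
)

// ImageTag builds a reference in the {service}:{git-sha}-v{semver} format.
func ImageTag(service, gitSHA, semver string) string {
	return fmt.Sprintf("%s:%s-v%s", service, gitSHA, semver)
}

// tagPattern validates the format; character classes and SHA length are assumed.
var tagPattern = regexp.MustCompile(`^([a-z0-9][a-z0-9-]*):([0-9a-f]{7,40})-v(\d+\.\d+\.\d+)$`)

// ParseImageTag splits a reference into its parts; ok is false when the
// reference does not follow the format (for example, a mutable "latest" tag).
func ParseImageTag(ref string) (service, sha, semver string, ok bool) {
	m := tagPattern.FindStringSubmatch(ref)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

// Image is a registry entry for one service (hypothetical shape).
type Image struct {
	Ref     string
	Release bool // true for release-tagged images, which are always kept
}

// Prune returns the images eligible for deletion. Input must be sorted newest
// first; the first keep entries and all release images are retained.
func Prune(newestFirst []Image, keep int) []Image {
	var doomed []Image
	for i, img := range newestFirst {
		if i < keep || img.Release {
			continue
		}
		doomed = append(doomed, img)
	}
	return doomed
}

func main() {
	ref := ImageTag("payin-api", "a1b2c3d", "2.3.0")
	fmt.Println(ref)
	svc, sha, ver, ok := ParseImageTag(ref)
	fmt.Println(svc, sha, ver, ok)
}
```

Because tags are immutable, a build for an existing tag should fail rather than overwrite; that check belongs in the publishing step and is not shown here.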

12. Notifications

| Event | Channel |
|---|---|
| Deployment started | Slack #deployments |
| Deployment succeeded | Slack #deployments |
| Deployment failed | Slack #deployments + #incidents |
| Rollback triggered | Slack #incidents + page on-call |