SDLC Governance Framework

| Field | Value |
| --- | --- |
| Document Number | SP-SOP-SDLC-0704 |
| Version | 2.0 |
| Date | 7 April 2026 |
| Owner | Daniel O'Reilly, Chief Digital Officer |
| Reviewed By | CDO |
| Classification | Class 2 (Confidential) |
| Status | Approved |

Table of Contents

  1. Executive Summary
  2. Objectives
  3. Scope
  4. Governance Principles
  5. Roles and Responsibilities
  6. Delivery Model
  7. Work Hierarchy
  8. SDLC Lifecycle Phases
  9. Hotfix Process
  10. Release Governance
  11. Jira Workflow
  12. Definition of Done
  13. Testing Strategy
  14. Compliance and Audit
  15. Architecture Decision Records
  16. Exception Handling
  17. Metrics
  18. Continuous Improvement
  19. Supersedes
  20. Approval and Adoption

1. Executive Summary

This document defines the Software Development Lifecycle (SDLC) governance framework for Simpaisa Holdings. It replaces the previous SDLC SOP (SP-SOP-SDLC-2603 v1.0) to align with the CDO Division's new organisational model - seven tribes operating under a Kanban-based continuous delivery model with automated quality gates, engineer-owned testing, and daily deployment capability.

The previous SOP was built around two-week sprints, a separate QA team, manual release approvals through a Change Advisory Board (CAB), and waterfall-style Jira gates. None of these reflect how a modern, AI-native payments engineering organisation should operate. This framework codifies the target state: continuous flow, shift-left security, automated pipelines, and measurable engineering performance through DORA metrics.

This framework applies to all software development, integration, and deployment activities across all Simpaisa products and all seven CDO Division tribes.


2. Objectives

  1. Ship daily. Enable every tribe to deploy to production at least once per business day through automated pipelines, removing manual approval bottlenecks.

  2. Engineers own quality. Eliminate the handoff to a separate QA function. Solution Engineers write code, write tests, and are accountable for the quality of what they ship.

  3. Shift security left. Threat modelling and security review happen before development begins, not as a gate after code is written.

  4. Measure what matters. Track DORA metrics (deployment frequency, lead time for changes, mean time to recovery, change failure rate) as first-class engineering KPIs, reported to the CDO and the board.

  5. No projects. Work is organised as a continuous flow of Initiatives, Epics, and Stories through a Kanban system. There are no projects, no project managers in the delivery path, and no sprint commitments.

  6. Regulatory traceability. Maintain a complete, automated audit trail from Jira ticket to pull request to pipeline execution to production deployment, satisfying PCI-DSS and DFSA operational resilience requirements.

  7. AI-native by default. AI tools (code generation, code review assistance, automated test generation, documentation) are standard engineering tools, not exceptions requiring approval.


3. Scope

3.1 Products

This framework governs the SDLC for all Simpaisa software products:

| Product | Description |
| --- | --- |
| Pay-Ins (Collections) | Inbound payment collection flows and partner integrations |
| Pay-Outs (Disbursements) | Outbound payment processing across all corridors |
| Remittances | Cross-border remittance corridor services |
| Portals | Customer-facing and partner-facing portal applications |
| Cards | Card issuance and payment card programme services |

3.2 Tribes

All seven CDO Division tribes operate under this framework:

| Tribe | Head | SDLC Role |
| --- | --- | --- |
| Solution Engineering | CTO | Builds and ships product features. Owns code, tests, and delivery. |
| Platform Engineering | Muhammad Mohsin | Builds and maintains infrastructure, CI/CD pipelines, databases, and developer tooling. |
| Data Engineering | Head of Data (hire) | Builds data pipelines, analytics infrastructure, and reporting systems. |
| Production Engineering | Muhammad Owais Khalid | Owns production reliability, monitoring, incident response, and on-call. |
| Information Security | Danish Hamid (CISO) | Owns threat modelling, security review, penetration testing, and compliance evidence. |
| Product Management | Rizwan Zafar (CPO) | Owns discovery, requirements, acceptance criteria, and prioritisation. |
| Digital Office | CDO | Strategic execution, architecture governance, process improvement, and AI adoption. |

3.3 Markets

This framework applies uniformly across all Simpaisa markets: Pakistan, Bangladesh, Nepal, Iraq, UAE, and planned expansions (Saudi Arabia, broader MENA, Central Asia). Market-specific regulatory requirements are handled through configuration, feature flags, and corridor-specific acceptance criteria - not through separate SDLCs.

3.4 Tooling

| Tool | Purpose |
| --- | --- |
| Jira | Work tracking, Kanban boards, reporting |
| Bitbucket | Source control, pull requests, code review |
| Bitbucket Pipelines / Jenkins | CI/CD pipeline execution |
| Snyk | Dependency vulnerability scanning, container scanning |
| Confluence | Architecture decision records, runbooks, documentation |

4. Governance Principles

4.1 No-Project Model

Simpaisa does not run projects. There are no project charters, no project managers in the delivery path, no project timelines with Gantt charts. Work flows continuously through a Kanban system. The unit of delivery is the Story. The unit of strategic alignment is the Initiative.

The Programme Management Office (PMO) function transitions to portfolio-level coordination, dependency management across tribes, and reporting - not day-to-day delivery management.

4.2 AI-Native Engineering

AI tools are standard engineering equipment, not exceptions. Engineers are expected to use AI-assisted code generation, AI-assisted code review, AI-assisted test writing, and AI-assisted documentation as part of their normal workflow. The Digital Office tracks AI adoption metrics and identifies opportunities to embed AI tooling deeper into the SDLC.

AI-generated code is subject to the same quality gates as human-written code: code review, automated testing, security scanning, and PR approval.

4.3 Automation Over Approval

Manual approval gates are the enemy of deployment frequency. This framework replaces manual approvals with automated quality gates wherever possible. The CI/CD pipeline is the authority on whether code is fit to ship. If the pipeline passes, the code can deploy. Human judgement is reserved for architecture decisions, security threat models, and exception handling - not for routine releases.

4.4 Ownership Over Handoff

Each tribe owns its output end-to-end. Solution Engineers own code quality, test coverage, and deployment. Production Engineering owns production reliability. Information Security owns threat models and security assurance. There is no "throw it over the wall" handoff between development and QA, between QA and operations, or between operations and security.

4.5 Transparency by Default

All work is visible in Jira. All code is visible in Bitbucket. All pipeline results are visible in the CI/CD system. All architecture decisions are recorded in Confluence. There are no shadow boards, no offline tracking spreadsheets, and no verbal-only decisions about technical direction.


5. Roles and Responsibilities

5.1 Solution Engineering

Solution Engineering is the primary delivery tribe. Solution Engineers (the engineers formerly organised as Portal, Pay-Ins, and Pay-Outs development teams) are responsible for:

  • Writing production code for all Simpaisa products
  • Writing unit tests, integration tests, and E2E tests for their own code
  • Creating and maintaining CI/CD pipeline configurations for their services
  • Conducting code reviews (every PR requires two approvals from Solution Engineers)
  • Owning the quality of their output - there is no separate QA sign-off
  • Participating in architecture reviews for significant changes
  • Writing and maintaining technical documentation for their services
  • Using AI tools to accelerate development and improve code quality

Tribe Lead accountability: The CTO is accountable for Solution Engineering throughput, code quality, and deployment frequency.

5.2 Platform Engineering

Platform Engineering (formerly DevOps and Infrastructure, plus Database Administration) is responsible for:

  • Building and maintaining CI/CD pipelines (Jenkins, Bitbucket Pipelines)
  • Managing cloud infrastructure (AWS, Cloudflare)
  • Database administration, performance tuning, and backup/recovery
  • Developer tooling and local development environments
  • Container orchestration and deployment automation
  • Infrastructure-as-code management
  • Automated rollback mechanisms
  • Feature flag platform administration

Tribe Lead accountability: Muhammad Mohsin is accountable for pipeline reliability, infrastructure availability, and deployment tooling.

5.3 Data Engineering

Data Engineering is responsible for:

  • Building and maintaining data pipelines
  • Analytics infrastructure and reporting systems
  • Data quality monitoring and governance
  • Business intelligence data models
  • Regulatory reporting data (transaction reporting, AML data feeds)
  • Data residency compliance per market

Tribe Lead accountability: Head of Data Engineering (hire in progress) is accountable for data pipeline reliability and data quality.

5.4 Production Engineering

Production Engineering (formerly SQA and Service Delivery, reoriented) is responsible for:

  • Production monitoring, alerting, and observability
  • Incident response and on-call rotation
  • SLO definition, measurement, and error budget management
  • Post-incident reviews
  • Service delivery and production handover
  • Canary deployment verification
  • Production environment management
  • Runbook creation and maintenance

Note on transition: The former SQA team members transition into Production Engineering with a focus on production reliability, automated test infrastructure, and monitoring - not manual QA. Engineers who previously performed manual testing will be retrained for production operations, automated test framework development, or redeployed based on skills assessment.

Tribe Lead accountability: Muhammad Owais Khalid is accountable for production uptime, MTTR, and incident response effectiveness.

5.5 Information Security

Information Security is responsible for:

  • Threat modelling for new features and services (before development begins)
  • Security architecture review for significant changes
  • Penetration testing (scheduled and ad hoc)
  • Snyk vulnerability management and triage
  • PCI-DSS compliance evidence and audit support
  • DFSA operational resilience evidence
  • SOC operations and security monitoring
  • Cloud security posture management
  • Security incident response

Tribe Lead accountability: Danish Hamid (CISO) is accountable for security posture, vulnerability remediation SLAs, and compliance evidence.

5.6 Product Management

Product Management is responsible for:

  • Discovery and validation (working with the Digital Office on spikes)
  • Writing Stories with clear acceptance criteria
  • Prioritising the backlog for each product area
  • Defining acceptance criteria that are testable and unambiguous
  • Stakeholder communication and expectation management
  • Corridor-specific requirements (regulatory, market, partner)
  • Product metrics and outcome measurement

Tribe Lead accountability: Rizwan Zafar (CPO) is accountable for product-market fit, roadmap alignment, and backlog quality.

5.7 Digital Office

The Digital Office is the CDO's strategic execution arm, responsible for:

  • Architecture governance (Architecture Review Board secretariat)
  • SDLC process ownership and continuous improvement
  • AI adoption strategy and tooling evaluation
  • Engineering metrics collection, analysis, and reporting
  • Cross-tribe coordination and dependency management
  • Spike facilitation (architecture blueprints for new initiatives)
  • Ways of Work documentation and training
  • Agile coaching and delivery model maturity

Team Lead accountability: The CDO directly oversees the Digital Office. The Agile Coach (Wajih Aslam) leads day-to-day delivery model support.

5.8 Architecture (Cross-Cutting)

The Principal Architect (Maqsood Ali) and Application Architect (Laique Ali) operate as a cross-cutting function within the Digital Office's governance remit:

  • Architecture Review Board (ARB) membership and facilitation
  • Architecture Decision Record (ADR) review and approval
  • System design review for significant changes
  • Technology standards and patterns governance
  • Integration architecture across products and corridors

6. Delivery Model

6.1 Kanban - Not Sprints

Simpaisa operates a Kanban continuous flow delivery model. There are no sprints. There are no sprint commitments. There is no sprint planning ceremony.

Work flows continuously from Backlog through to Done. The system is governed by WIP (Work in Progress) limits, not by time-boxed iterations.

6.2 Ten-Day Cycles

While delivery is continuous, Simpaisa uses a 10-business-day cadence for planning, reflection, and alignment:

| Days | Activity |
| --- | --- |
| Days 1–9 | Continuous delivery. Engineers pull work from the Ready column, develop, review, and deploy. |
| Day 10 | Cycle ceremonies: Retrospective, Demo, and Backlog Grooming. |

Day 10 ceremonies:

  • Retrospective (45 minutes per tribe): What went well, what didn't, what to change. Each tribe runs its own retro. Actions are captured as Stories and enter the backlog.
  • Demo (30 minutes, cross-tribe): Each tribe demonstrates what shipped in the cycle. Stakeholders attend. This is the primary visibility mechanism for Product Management and leadership.
  • Backlog Grooming (60 minutes per tribe): Product Manager and tribe leads review upcoming work, refine Stories, confirm acceptance criteria, and ensure the Ready column is stocked for the next cycle.

6.3 No Daily Standups

There are no daily standup meetings. Status updates are asynchronous:

  • Engineers update their Jira tickets daily (move cards, add comments on blockers)
  • Tribe leads review their Kanban board daily and address blockers
  • The Digital Office publishes a weekly cycle report (automated from Jira data) showing throughput, WIP age, and blockers

If a tribe identifies a need for synchronous coordination, it may hold ad hoc huddles - but these are not mandated, not daily, and not a standing ceremony.

6.4 WIP Limits

Each Kanban column has a WIP limit. These are calibrated per tribe based on team size:

| Column | WIP Limit (per tribe) |
| --- | --- |
| Ready | 2× team size |
| In Progress | 1× team size |
| In Review | 0.5× team size (rounded up) |

When a WIP limit is breached, the tribe must resolve the bottleneck before pulling new work. WIP limit violations are tracked as a metric and reviewed in retrospectives.
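
The calibration table above reduces to a small helper. The following is an illustrative sketch only (function names are ours, not part of any Simpaisa tooling):

```python
import math

def wip_limits(team_size: int) -> dict:
    """Per-column WIP limits from the calibration table:
    Ready = 2x team size, In Progress = 1x team size,
    In Review = 0.5x team size rounded up."""
    return {
        "Ready": 2 * team_size,
        "In Progress": team_size,
        "In Review": math.ceil(team_size / 2),
    }

def wip_breaches(column_counts: dict, team_size: int) -> list:
    """Return the columns whose current card count exceeds the limit."""
    limits = wip_limits(team_size)
    return [col for col, count in column_counts.items() if count > limits[col]]
```

For a tribe of five engineers this gives limits of 10 / 5 / 3, so a board with 11 cards in Ready would flag that column as the bottleneck to resolve before pulling new work.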

6.5 Continuous Flow Principles

  • Pull, don't push. Engineers pull work when they have capacity. Work is not assigned by managers.
  • Finish before starting. Completing in-progress work takes priority over starting new work.
  • Small batches. Stories must be completable within a single 9-day delivery window. If a Story cannot be completed in 9 days, it must be decomposed.
  • Flow efficiency. The ratio of active work time to total lead time is tracked. Target: >40% flow efficiency.
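
The flow-efficiency target above is a simple ratio check. A minimal sketch (names illustrative):

```python
def flow_efficiency(active_hours: float, total_lead_hours: float) -> float:
    """Ratio of active work time to total lead time for a Story."""
    if total_lead_hours <= 0:
        raise ValueError("lead time must be positive")
    return active_hours / total_lead_hours

def meets_flow_target(active_hours: float, total_lead_hours: float,
                      target: float = 0.40) -> bool:
    """True if the Story exceeded the >40% flow-efficiency target."""
    return flow_efficiency(active_hours, total_lead_hours) > target
```

A Story actively worked for 36 hours out of a 72-hour lead time has 50% flow efficiency and meets the target; one active for 8 of 80 hours (10%) does not.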

7. Work Hierarchy

7.1 Initiative

An Initiative is a strategic goal that spans multiple Epics and may take one or more quarters to complete. Initiatives are owned by the CDO or CPO and align to company-level OKRs.

Examples:

  • "Launch Saudi Arabia Pay-Out corridor"
  • "Achieve PCI-DSS Level 1 certification"
  • "Migrate primary infrastructure to Cloudflare"

Jira type: Initiative
Owner: CDO or CPO
Lifespan: Typically 1–3 quarters

7.2 Epic

An Epic is a significant body of work that delivers a measurable outcome. Epics replace the concept of "projects." An Epic contains multiple Stories and is typically completed within 2–6 cycles (20–60 business days).

Examples:

  • "Integrate Saudi Arabia disbursement partner API"
  • "Implement feature flag infrastructure"
  • "Build transaction monitoring dashboard"

Jira type: Epic
Owner: Product Manager (with a Solution Engineering lead assigned)
Lifespan: Typically 2–6 cycles
Required fields: Business justification, acceptance criteria, target completion cycle, assigned tribe

7.3 Story

A Story is the fundamental unit of delivery. A Story describes a single, testable increment of value. Stories must be small enough to complete within one 9-day delivery window, including code, tests, review, and deployment.

Examples:

  • "As a partner, I can view my settlement report filtered by date range"
  • "Implement Snyk scanning in the Pay-Outs pipeline"
  • "Add canary deployment step to portal release pipeline"

Jira type: Story
Owner: Product Manager (definition) / Solution Engineer (delivery)
Lifespan: Must complete within one 9-day cycle
Required fields: Acceptance criteria, tribe, Story points (optional - used for forecasting, not commitment)

7.4 Task

A Task is a technical sub-unit of a Story. Tasks are used by engineers to decompose a Story into implementation steps. Tasks are optional - not every Story needs Tasks. Tasks do not appear on the Kanban board; only Stories flow through the board.

Examples:

  • "Write database migration for new settlement_reports table"
  • "Add unit tests for date range filter logic"
  • "Update API documentation for new endpoint"

Jira type: Sub-task
Owner: Solution Engineer
Lifespan: Hours to days (within the parent Story's cycle)


8. SDLC Lifecycle Phases

Lifecycle Flow

```mermaid
flowchart TD
    classDef planning fill:#4A90D9,stroke:#2C5F8A,color:#fff
    classDef technical fill:#7B68EE,stroke:#4B3DB5,color:#fff
    classDef delivery fill:#27AE60,stroke:#1A7A42,color:#fff

    START([New Initiative / Epic]) --> P1

    P1["Phase 1\nDiscovery and Spike"] --> P2
    P2["Phase 2\nRequirements and\nAcceptance Criteria"] --> ARCH_Q

    ARCH_Q{Architecture\nchange?}
    ARCH_Q -- Yes --> P3
    ARCH_Q -- No --> P4

    P3["Phase 3\nArchitecture Review"] --> P4
    P4["Phase 4\nSecurity Review\n(shift-left)"] --> P5

    HOTFIX([Hotfix Trigger]) -.->|Bypass Phases 1-4| P5

    P5["Phase 5\nDevelopment"] --> P6
    P6["Phase 6\nQuality Gates"] --> P7
    P7["Phase 7\nDeployment"] --> P8
    P8["Phase 8\nPost-Deploy\nVerification"] --> P9
    P9["Phase 9\nProduction\nOperations"] --> DONE([Live in Production])

    class P1,P2 planning
    class P3,P4,P5,P6 technical
    class P7,P8,P9 delivery
```

Jira Workflow

```mermaid
flowchart LR
    classDef active fill:#7B68EE,stroke:#4B3DB5,color:#fff
    classDef blocked fill:#E74C3C,stroke:#A93226,color:#fff
    classDef done fill:#27AE60,stroke:#1A7A42,color:#fff

    BL[Backlog] --> RD[Ready]
    RD --> IP[In Progress]
    IP --> IR[In Review]
    IR --> DN[Done]

    IP -.->|Impediment raised| BK[Blocked]
    BK -.->|Impediment cleared| IP

    class IP,IR active
    class BK blocked
    class DN done
```

Phase 1: Discovery and Spike

Owner: Digital Office + Product Management

When a new Initiative or significant Epic is proposed, the Digital Office conducts a discovery spike to validate feasibility and produce an architecture blueprint.

Activities:

  • Product Manager defines the business problem and desired outcome
  • Digital Office assesses technical feasibility, integration complexity, and regulatory implications
  • Principal Architect or Application Architect produces an architecture blueprint covering: system context, data flows, integration points, security considerations, and infrastructure requirements
  • AI tools are used to accelerate research and prototype validation
  • Spike output is a written document (Confluence) with a go/no-go recommendation

Output: Architecture blueprint (Confluence page)
Duration: 1–5 business days depending on complexity
Gate: CDO or CTO approves the blueprint before work enters the backlog

Phase 2: Requirements and Acceptance Criteria

Owner: Product Management

Once the architecture blueprint is approved, the Product Manager decomposes the Epic into Stories with clear, testable acceptance criteria.

Activities:

  • Product Manager writes Stories in Jira following the standard template: "As a [user], I [action], so that [outcome]"
  • Each Story includes explicit acceptance criteria written as Given/When/Then statements
  • Corridor-specific requirements (e.g., Bangladesh regulatory reporting, UAE DFSA evidence) are captured as acceptance criteria, not as separate documents
  • Product Manager confirms each Story is achievable within a single 9-day cycle
  • Stories are reviewed with the assigned Solution Engineering lead for feasibility

Output: Jira Stories with acceptance criteria in the Backlog column
Quality bar: No Story enters Ready without acceptance criteria approved by both Product Manager and Solution Engineering lead

Phase 3: Architecture Review

Owner: Architecture (Principal Architect) + Digital Office

Significant changes require Architecture Review Board (ARB) review before development begins. "Significant" is defined as:

  • New service or microservice introduction
  • Changes to data models that affect more than one service
  • New third-party integration or corridor partner
  • Infrastructure architecture changes (e.g., new AWS service, Cloudflare migration)
  • Changes affecting PCI-DSS Cardholder Data Environment (CDE) boundaries

Activities:

  • Solution Engineering lead presents the proposed approach to the ARB
  • ARB reviews against architecture principles, security standards, and scalability requirements
  • If approved, an Architecture Decision Record (ADR) is created documenting the decision
  • If rejected, the ARB provides specific feedback and the team revises

Output: ADR (Confluence page) and ARB approval recorded in Jira Epic
Attendees: Principal Architect, Application Architect, CISO (or delegate), relevant tribe leads
Cadence: ARB meets weekly (Wednesdays). Urgent reviews can be conducted asynchronously via Confluence with a 48-hour review period.

Phase 4: Security Review

Owner: Information Security

Security review is shift-left: it happens before development begins, not after code is written.

Activities:

  • Information Security conducts threat modelling for the proposed change using STRIDE methodology
  • Threat model covers: data flows, trust boundaries, authentication/authorisation, encryption at rest and in transit, PCI-DSS implications, and corridor-specific regulatory requirements
  • Snyk policy is confirmed for any new dependencies
  • If the change touches the CDE, a PCI-DSS impact assessment is produced
  • Security requirements are added as acceptance criteria to the relevant Stories

Output: Threat model document (Confluence), security acceptance criteria added to Stories
Duration: 1–3 business days depending on complexity
Gate: CISO (or delegate) signs off the threat model before Stories move to Ready

Exemptions: Changes that do not introduce new data flows, new integrations, or new infrastructure components may be exempted from formal threat modelling. The CISO maintains a standing exemption list (e.g., UI-only changes with no new API endpoints).
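
The exemption rule can be read as a simple decision function. A hedged sketch: the three trigger conditions come from the paragraph above, and the standing exemption list is modelled as a boolean input rather than the real CISO-maintained register:

```python
def threat_model_required(new_data_flows: bool,
                          new_integrations: bool,
                          new_infrastructure: bool,
                          on_exemption_list: bool = False) -> bool:
    """Formal threat modelling is required unless the change introduces
    no new data flows, integrations, or infrastructure components, or
    appears on the CISO's standing exemption list (e.g. UI-only changes
    with no new API endpoints)."""
    if on_exemption_list:
        return False
    return new_data_flows or new_integrations or new_infrastructure
```

In practice the Jira Ready-column entry criteria would call a check like this before a Story with security-relevant flags can leave Backlog.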

Phase 5: Development

Owner: Solution Engineering

Development is where code is written, tested, and reviewed.

Activities:

  • Solution Engineer pulls a Story from the Ready column and moves it to In Progress
  • Engineer writes production code, unit tests, and integration tests
  • Engineer creates a pull request (PR) in Bitbucket
  • PR description references the Jira Story key (e.g., PAY-1234)
  • PR includes: code changes, unit tests, integration tests (where applicable), and updated documentation
  • Two Solution Engineers (not including the author) review and approve the PR
  • Reviewers verify: code quality, test coverage, adherence to architecture patterns, security considerations, and acceptance criteria coverage
  • AI-assisted code review tools may be used to augment (not replace) human review

Constraints:

  • Every PR requires exactly two approvals before merge
  • PR author cannot approve their own PR
  • PRs should be small and focused - one Story per PR where possible
  • PRs must pass all automated pipeline checks before human review begins (see Phase 6)

Output: Approved, merged PR in Bitbucket
Jira transition: In Progress → In Review (when PR is created) → Done (when PR is merged and pipeline passes)

Phase 6: Automated Quality Gates

Owner: Platform Engineering (pipeline) + Solution Engineering (test content)

Every PR triggers an automated CI pipeline. There is no manual QA sign-off for standard releases. The pipeline is the quality authority.

Pipeline stages:

| Stage | Tool | Criteria |
| --- | --- | --- |
| Unit tests | JUnit / Jest / pytest (per service) | All tests pass. Coverage meets threshold (see Section 13). |
| Integration tests | Service-specific | All API contract tests pass. |
| Static analysis | SonarQube or equivalent | No new Critical or High issues. |
| Security scan | Snyk | No new Critical or High vulnerabilities in dependencies or container images. |
| Build | Maven / npm / Docker | Clean build with no errors. |
| Artifact publish | Container registry | Immutable, tagged artifact published. |

Failure handling:

  • If any stage fails, the pipeline stops and the PR is blocked from merge
  • The Solution Engineer who raised the PR is responsible for fixing failures
  • Pipeline failures are not escalated to management - they are normal engineering workflow
  • Persistent pipeline failures (>24 hours unresolved) are flagged in the tribe's daily board review

Output: Green pipeline status on the PR in Bitbucket

Phase 7: Deployment

Owner: Platform Engineering (mechanism) + Solution Engineering (decision to deploy)

Simpaisa targets daily deployment capability for all services. Deployment is automated and does not require manual approval, a CAB, or a release manager.

Deployment process:

  1. PR is merged to the main branch
  2. Merge triggers the deployment pipeline automatically
  3. Deployment pipeline executes:
     • Artifact promotion from CI registry to deployment registry
     • Database migration execution (if applicable)
     • Rolling deployment to staging environment
     • Automated smoke tests against staging
     • Progressive rollout to production (canary → 10% → 50% → 100%)
  4. Feature flags control visibility of new functionality to end users
  5. Automated rollback triggers if error rate exceeds threshold during canary phase

Rollback:

  • Automated rollback completes within 5 minutes of trigger
  • Rollback is triggered automatically if: error rate increases >2% above baseline, latency P99 exceeds 2× baseline, or health check failures exceed threshold
  • Manual rollback can be initiated by any Solution Engineer or Platform Engineer via a single command
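
The automatic rollback triggers can be sketched as a decision function. Two assumptions to flag: ">2% above baseline" is read here as percentage points, and the health-check failure threshold (which this SOP leaves unspecified) is an illustrative parameter:

```python
def should_rollback(error_rate: float, baseline_error_rate: float,
                    p99_latency_ms: float, baseline_p99_ms: float,
                    health_check_failures: int,
                    failure_threshold: int = 3) -> bool:
    """Evaluate the canary against the automated rollback triggers:
    error rate >2 percentage points above baseline, P99 latency above
    2x baseline, or health-check failures over the (illustrative)
    threshold."""
    if error_rate - baseline_error_rate > 0.02:
        return True
    if p99_latency_ms > 2 * baseline_p99_ms:
        return True
    if health_check_failures > failure_threshold:
        return True
    return False
```

A canary at 5% errors against a 1% baseline, or at 250 ms P99 against a 100 ms baseline, would roll back; a 1.2% error rate and 150 ms P99 would not.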

No CAB: There is no Change Advisory Board. The pipeline is the change authority. If the pipeline passes, the change is approved. This is a deliberate decision to remove the bottleneck that prevented daily deployment under the previous SOP.

Deployment windows: Standard deployments occur during business hours (09:00–17:00 PKT, Monday–Thursday). Friday deployments are permitted but discouraged for non-urgent changes. Weekend and out-of-hours deployments require Production Engineering on-call acknowledgement.

Output: Running code in production behind appropriate feature flags

Phase 8: Post-Deployment Verification

Owner: Production Engineering

After deployment, Production Engineering verifies the release in production.

Activities:

  • Canary metrics are monitored for 30 minutes post-deployment (error rate, latency, throughput)
  • SLO compliance is verified against the service's defined SLOs
  • Automated alerting confirms no new alerts have fired
  • Key business metrics are spot-checked (transaction success rate, settlement completion)
  • If verification passes, the canary is promoted to full traffic
  • If verification fails, automated rollback is triggered and the Solution Engineering team is notified

Monitoring stack:

  • Application performance monitoring (APM) for latency and error tracking
  • Infrastructure monitoring for resource utilisation and availability
  • Business metrics dashboards for transaction volumes and success rates
  • Log aggregation for error investigation

Output: Verified production deployment or rollback with incident ticket

Phase 9: Production Operations

Owner: Production Engineering

Once code is in production, Production Engineering owns its operational health.

Activities:

  • 24/7 monitoring via NOC/SOC
  • On-call rotation for P1/P2 incident response (PagerDuty or equivalent)
  • Incident response per the Incident Management playbook (separate document)
  • Post-incident reviews for all P1 and P2 incidents (blameless, focused on systemic improvements)
  • SLO tracking and error budget management
  • Capacity planning and scaling recommendations
  • Runbook maintenance and operational documentation

On-call rotation:

  • Each tribe with production services maintains an on-call rotation
  • On-call engineer has authority to execute rollbacks, scale infrastructure, and page additional engineers
  • On-call handover occurs weekly with a written summary of active issues

Escalation path:

  1. On-call engineer (Production Engineering)
  2. Tribe lead
  3. CTO (for P1 incidents)
  4. CDO (for P1 incidents with business impact)


9. Hotfix Process

A hotfix is an emergency code change that bypasses the standard cycle workflow due to urgency. Hotfixes are reserved for:

  • P1 production incidents (service down or severely degraded)
  • Critical security vulnerabilities with active exploitation risk
  • Regulatory compliance breaches requiring immediate remediation

Hotfix workflow:

  1. Trigger: P1 incident declared or critical security vulnerability identified
  2. Branch: Engineer creates a hotfix branch from main (not from a feature branch)
  3. Fix: Engineer develops the minimal fix. Scope is strictly limited to the immediate issue.
  4. Review: One PR approval (reduced from standard two) from a senior Solution Engineer or tribe lead
  5. Pipeline: Automated quality gates run (unit tests, security scan). Integration tests may be skipped with CDO or CTO written approval.
  6. Deploy: Immediate deployment to production, bypassing canary progressive rollout if urgency requires it
  7. Verify: Production Engineering confirms the fix resolves the incident
  8. Post-incident review: Mandatory within 48 hours. Review covers root cause, timeline, fix effectiveness, and systemic improvements.
  9. Backfill: Any skipped tests or documentation are completed within the current cycle as a follow-up Story

Hotfix approvers: CTO, CDO, or CISO (for security hotfixes)
Audit trail: All hotfixes are tagged in Jira with the hotfix label and linked to the incident ticket


10. Release Governance

10.1 What Is a Release?

A release is a merge to the main branch that passes all automated pipeline stages. There is no separate "release" ceremony, no release manager, and no release approval board.

10.2 Automated Gates Replace Manual Approvals

The following automated gates constitute release approval:

| Gate | Authority |
| --- | --- |
| Unit tests pass | Pipeline |
| Integration tests pass | Pipeline |
| Static analysis clean | Pipeline |
| Snyk scan clean (no new Critical/High) | Pipeline |
| Two PR approvals | Bitbucket |
| Build succeeds | Pipeline |
| Staging smoke tests pass | Pipeline |

If all gates pass, the release is approved. No human signs a release form.
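
The gate table reduces to an all-pass check: a release is approved if and only if every gate reports green. A sketch, with gate identifiers chosen for illustration:

```python
# Gate names mirror the table above; the identifiers themselves are
# illustrative, not taken from any real pipeline configuration.
REQUIRED_GATES = (
    "unit_tests", "integration_tests", "static_analysis",
    "snyk_scan", "pr_approvals", "build", "staging_smoke_tests",
)

def release_approved(gate_results: dict) -> bool:
    """True only if every required gate passed. A gate with no
    recorded result counts as a failure, never as a pass."""
    return all(gate_results.get(gate, False) for gate in REQUIRED_GATES)
```

Treating a missing result as a failure is the conservative default: a pipeline stage that never ran must not be mistaken for one that passed.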

10.3 Feature Flags for Risk Management

Feature flags are the primary mechanism for managing release risk:

  • New features deploy behind flags, disabled by default
  • Product Management controls flag activation (who sees the feature and when)
  • Flags enable instant rollback of individual features without code deployment
  • Flags enable progressive rollout (internal → beta partners → 10% → 50% → 100%)
  • Stale flags (>30 days post-full-rollout) are cleaned up as technical debt Stories
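
One common way to implement the percentage stages above is deterministic user bucketing, so a given user stays in or out of the rollout consistently across requests. The SOP does not prescribe a mechanism, so this is purely an illustrative sketch:

```python
import hashlib

def flag_enabled(user_id: str, rollout_percent: int) -> bool:
    """Deterministically bucket a user into 0-99 by hashing their ID;
    the flag is on for users whose bucket falls below the rollout
    percentage. Same user, same answer, every request."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_percent
```

At 0% nobody sees the feature, at 100% everybody does, and intermediate stages (10%, 50%) expose a stable subset rather than a random sample per request.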

10.4 Release Cadence

There is no fixed release cadence. Releases happen when code is ready. The target is at least one production deployment per tribe per business day. Actual deployment frequency is tracked as a DORA metric.
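
The DORA metrics this cadence feeds reduce to simple arithmetic over pipeline records. An illustrative sketch of two of the four (function names are ours):

```python
def deployment_frequency(deploy_dates: list, business_days: int) -> float:
    """Mean production deployments per business day over the reporting
    window. Target per this section: at least 1.0 per tribe."""
    return len(deploy_dates) / business_days

def change_failure_rate(failed_deployments: int,
                        total_deployments: int) -> float:
    """Share of deployments that caused a failure in production."""
    return failed_deployments / total_deployments
```

Three deployments over two business days gives a frequency of 1.5, which meets the daily target; 1 failed deployment in 20 is a 5% change failure rate.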


11. Jira Workflow

11.1 Board Columns

Every tribe operates a single Kanban board with the following columns:

| Column | Description | Entry Criteria |
|--------|-------------|----------------|
| Backlog | All Stories not yet ready for development | Story created with title and description |
| Ready | Stories refined and ready to be pulled by an engineer | Acceptance criteria approved, security review complete (if required), architecture review complete (if required), sized to fit one cycle |
| In Progress | Story actively being developed | Engineer has pulled the Story and begun work |
| In Review | PR raised, awaiting code review and pipeline completion | PR created in Bitbucket, linked to Jira Story |
| Done | Story complete - code merged, pipeline green, deployed to production | All Definition of Done criteria met (see Section 12) |

11.2 WIP Limits

WIP limits are enforced per column per tribe. Jira is configured to visually flag WIP breaches. WIP limits are set by the tribe lead in consultation with the Agile Coach and reviewed quarterly.

11.3 No Waterfall Gates

There are no approval-gate columns (e.g., "Awaiting QA Sign-Off," "Awaiting CAB Approval," "Awaiting Release"). Stories flow from Backlog to Done through continuous work and automated checks.

11.4 Jira Hygiene

  • Every Story must have a Jira key
  • Every PR must reference its Jira key in the PR title or description
  • Engineers update Story status daily
  • Stories that have been In Progress for >5 business days without movement are flagged by the Agile Coach
  • Stale Stories (Backlog items untouched for >60 days) are archived quarterly
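
The "every PR must reference its Jira key" rule lends itself to an automated check. The sketch below uses Jira's default key format (an uppercase project prefix, a hyphen, and a number); the project prefix shown is invented for the example, and a real check would run as a pipeline or webhook step.

```python
# Sketch: validate that a PR title or description references a Jira key.
# The pattern approximates Jira's default PROJECT-123 key format.
import re

JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def references_jira_key(pr_text: str) -> bool:
    """True if the text contains something shaped like a Jira Story key."""
    return bool(JIRA_KEY.search(pr_text))

print(references_jira_key("PAY-1042: add RAAST settlement retry"))  # True
print(references_jira_key("fix typo in readme"))                    # False
```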

12. Definition of Done

A Story is Done when all of the following are true:

| Criterion | Verification |
|-----------|--------------|
| Code complete | All acceptance criteria implemented |
| Unit tests pass | Pipeline confirms all unit tests green |
| Integration tests pass | Pipeline confirms API contract tests green (where applicable) |
| PR approved ×2 | Two Solution Engineer approvals recorded in Bitbucket |
| Pipeline green | All automated quality gates pass (see Section 8, Phase 6) |
| Feature flag configured | New user-facing functionality is behind a feature flag |
| Documentation updated | API documentation, runbooks, or user-facing docs updated as needed |
| No new Critical/High vulnerabilities | Snyk scan confirms no new Critical or High findings |
| Deployed to production | Code is running in production (behind feature flag if applicable) |
| Jira updated | Story moved to Done, all fields current |

If any criterion is not met, the Story is not Done and remains in its current column.


13. Testing Strategy

13.1 Test Pyramid

Simpaisa follows the test pyramid model. The base of the pyramid (unit tests) is large and fast. The top of the pyramid (E2E tests) is small and targeted.

        /  E2E  \           ← Critical user journeys only
       /----------\
      / Integration \       ← API contracts between services
     /----------------\
    /    Unit Tests     \   ← Every PR, every function
   /____________________\

13.2 Unit Tests

  • Mandatory: Every PR must include unit tests for new or changed logic
  • Coverage target: 80% line coverage per service (enforced in pipeline)
  • Owner: Solution Engineer who writes the code writes the tests
  • Framework: JUnit (Java services), Jest (Node.js/frontend), pytest (Python services)
  • Execution: Every PR, every pipeline run
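
A minimal illustration of the engineer-owned unit-test expectation, using pytest (the framework named above for Python services). The function under test is invented for this example; real tests live beside the service code and run on every PR.

```python
# Sketch: a function and its unit tests, written by the same engineer.
# apply_fee and its semantics are illustrative, not a Simpaisa API.

def apply_fee(amount: int, fee_bps: int) -> int:
    """Deduct a fee quoted in basis points from an amount in minor units."""
    if amount < 0 or fee_bps < 0:
        raise ValueError("amount and fee_bps must be non-negative")
    return amount - (amount * fee_bps) // 10_000

def test_apply_fee_deducts_basis_points():
    # 1.5% (150 bps) fee on 10,000 minor units leaves 9,850
    assert apply_fee(10_000, 150) == 9_850

def test_apply_fee_rejects_negative_amount():
    import pytest
    with pytest.raises(ValueError):
        apply_fee(-1, 150)
```

Tests like these are what the pipeline's "unit tests pass" gate executes; the 80% coverage target is enforced over the aggregate of such tests per service.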

13.3 Integration Tests

  • Scope: API contract tests verifying inter-service communication
  • Coverage target: All public API endpoints covered
  • Owner: Solution Engineering tribe, maintained per service
  • Execution: Every PR pipeline run (using service stubs/mocks for external dependencies)
  • Corridor-specific: Integration tests include corridor-specific scenarios (e.g., Pakistan RAAST, Bangladesh bKash, UAE partner APIs)

13.4 End-to-End Tests

  • Scope: Critical user journeys only - not comprehensive UI testing
  • Examples: Partner onboarding flow, Pay-Out initiation to settlement, Pay-In collection to reconciliation
  • Owner: Production Engineering (test infrastructure) + Solution Engineering (test content)
  • Execution: Nightly against staging environment; not on every PR
  • Principle: E2E tests verify the system works. Unit and integration tests verify the code works. Do not duplicate coverage.

13.5 No Manual QA Sign-Off

There is no manual QA sign-off step for standard releases. The automated pipeline (unit tests + integration tests + static analysis + security scan) is the quality gate.

Manual exploratory testing may be conducted by Product Management during UAT for high-risk features (new corridors, new product launches) - but this is a risk-management activity, not a release gate. Exploratory testing findings are logged as new Stories, not as PR blockers.

13.6 Performance Testing

  • Load testing is conducted before major corridor launches or infrastructure changes
  • Performance baselines are maintained per service
  • Performance regression is detected via APM monitoring in production, not via pre-release performance gates
  • Load test scripts are maintained in the service repository alongside application code

14. Compliance and Audit

14.1 Traceability Chain

Every change in production has a complete, automated audit trail:

Jira Story → Bitbucket PR → Pipeline Execution → Deployment Record → Production Metrics

This chain is maintained automatically through Jira-Bitbucket integration (PR links to Story key) and pipeline logging. No manual audit trail maintenance is required.

14.2 PCI-DSS

For services within the Cardholder Data Environment (CDE):

  • All code changes to CDE services require security review (Phase 4)
  • Snyk scanning is mandatory with zero tolerance for Critical vulnerabilities
  • Access to production CDE environments is restricted to named Production Engineering personnel
  • All CDE deployments are logged with timestamp, deployer identity, and change reference
  • Quarterly access reviews are conducted by Information Security

14.3 DFSA Operational Resilience

For UAE-regulated services, the SDLC provides evidence of:

  • Change management controls (automated pipeline with documented gates)
  • Incident management (Production Engineering on-call, post-incident reviews)
  • Business continuity (automated rollback, feature flags, multi-region capability)
  • Testing adequacy (test pyramid, coverage metrics, deployment verification)
  • Third-party risk management (Snyk dependency scanning, vendor assessment for new integrations)

14.4 Audit Access

Internal and external auditors are granted read-only access to:

  • Jira boards and Story history
  • Bitbucket repositories and PR history
  • Pipeline execution logs
  • Deployment records
  • Snyk vulnerability reports

Audit access is provisioned by Information Security upon request and reviewed quarterly.


15. Architecture Decision Records

15.1 What Is an ADR?

An Architecture Decision Record (ADR) documents a significant technical decision. ADRs create an institutional memory of why decisions were made, preventing future teams from relitigating settled questions or unknowingly reversing intentional choices.

15.2 When to Write an ADR

An ADR is required when:

  • A new service or microservice is introduced
  • A new technology, framework, or language is adopted
  • A significant data model change affects multiple services
  • An infrastructure architecture change is made (new cloud service, region, provider)
  • A security architecture decision is made (new authentication mechanism, encryption approach)
  • The ARB reviews and approves a significant change

An ADR is not required for:

  • Bug fixes
  • Routine feature development within existing patterns
  • Configuration changes
  • Dependency version updates (unless changing major versions with breaking changes)

15.3 ADR Template

All ADRs are stored in Confluence under the Architecture Decision Records space and follow this template:

# ADR-[number]: [Title]

**Date:** [date]  
**Status:** [Proposed | Accepted | Deprecated | Superseded by ADR-XXX]  
**Deciders:** [names]

## Context
What is the issue that we are seeing that motivates this decision?

## Decision
What is the change that we are proposing and/or doing?

## Consequences
What becomes easier or harder as a result of this decision?

## Alternatives Considered
What other options were evaluated and why were they rejected?

15.4 ARB Review Process

  1. Author creates a draft ADR in Confluence
  2. ADR is submitted to the ARB agenda (weekly Wednesday meeting or async review)
  3. ARB members review and provide comments within 48 hours (async) or discuss in meeting
  4. Principal Architect records the decision (Accepted / Rejected / Deferred)
  5. Accepted ADRs are linked to the relevant Jira Epic

16. Exception Handling

16.1 Regulatory Emergencies

When a regulatory authority (DFSA, SBP, Bangladesh Bank, NRB, CBI) issues a directive requiring immediate system changes:

  • CDO or CPO raises an emergency Initiative in Jira with the regulatory-emergency label
  • Normal cycle workflow is suspended for the affected tribe(s)
  • Architecture review and security review are conducted in parallel (not sequentially)
  • Pipeline gates remain enforced - regulatory emergencies do not bypass automated quality checks
  • CDO briefs the CEO within 24 hours on timeline and impact

16.2 Critical Security Vulnerabilities

When a Critical-severity vulnerability is identified (Snyk alert, penetration test finding, CERT advisory):

  • CISO declares a security incident and assigns an owner
  • Hotfix process (Section 9) is invoked
  • Vulnerability is patched within SLA: Critical = 24 hours, High = 72 hours
  • CISO confirms remediation and updates the vulnerability register
  • Post-incident review is conducted if the vulnerability was exploited or had production impact
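
The remediation SLAs above (Critical = 24 hours, High = 72 hours) translate directly into a patch deadline from the moment the incident is declared. A minimal sketch, with the severity-to-hours mapping taken from this section:

```python
# Sketch: compute the patch deadline from the vulnerability SLAs above.
from datetime import datetime, timedelta

SLA_HOURS = {"Critical": 24, "High": 72}

def patch_deadline(severity: str, declared_at: datetime) -> datetime:
    """Latest acceptable remediation time for a declared vulnerability."""
    return declared_at + timedelta(hours=SLA_HOURS[severity])

print(patch_deadline("Critical", datetime(2026, 4, 7, 9, 0)))  # 2026-04-08 09:00:00
print(patch_deadline("High", datetime(2026, 4, 7, 9, 0)))      # 2026-04-10 09:00:00
```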

16.3 P1 Incidents

P1 incidents (complete service outage or severe degradation affecting customers) follow the Incident Management playbook:

  • Production Engineering on-call engineer is paged immediately
  • Incident commander is assigned (CTO for engineering incidents, CISO for security incidents)
  • All-hands war room (virtual) is convened within 15 minutes
  • Resolution actions may invoke the hotfix process
  • Post-incident review within 48 hours is mandatory
  • Post-incident review actions are tracked as Stories in the relevant tribe's backlog

16.4 Requesting an Exception

Any deviation from this SDLC framework (e.g., deploying without two PR approvals, skipping security review, bypassing pipeline gates) requires written approval from the CDO. Exception requests are logged in Jira with the sdlc-exception label and include:

  • What is being bypassed
  • Why it is necessary
  • What risk it introduces
  • What compensating controls are in place
  • When the exception expires

There are no permanent exceptions. All exceptions have an expiry date and are reviewed.


17. Metrics

17.1 DORA Metrics

The following DORA metrics are tracked per tribe and reported fortnightly to the CDO:

| Metric | Definition | Target |
|--------|------------|--------|
| Deployment Frequency | Number of production deployments per business day per tribe | ≥1 per day |
| Lead Time for Changes | Time from first commit to production deployment | <24 hours |
| Mean Time to Recovery (MTTR) | Time from P1 incident declaration to resolution | <1 hour |
| Change Failure Rate | Percentage of deployments causing a production incident | <5% |
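
As a sketch of how two of these metrics might be derived from deployment records, assuming each record carries a first-commit time, a deployment time, and an incident flag (the record shape is an assumption for this example; real figures come from Jira and pipeline data):

```python
# Sketch: Lead Time for Changes and Change Failure Rate from records.
from datetime import datetime

deployments = [
    {"first_commit": datetime(2026, 4, 7, 9, 0),
     "deployed": datetime(2026, 4, 7, 16, 0), "caused_incident": False},
    {"first_commit": datetime(2026, 4, 7, 15, 0),
     "deployed": datetime(2026, 4, 8, 11, 0), "caused_incident": True},
]

def lead_time_hours(deploys) -> float:
    """Mean time from first commit to production deployment, in hours."""
    deltas = [(d["deployed"] - d["first_commit"]).total_seconds() / 3600
              for d in deploys]
    return sum(deltas) / len(deltas)

def change_failure_rate(deploys) -> float:
    """Share of deployments that caused a production incident."""
    return sum(d["caused_incident"] for d in deploys) / len(deploys)

print(lead_time_hours(deployments))      # (7 + 20) / 2 = 13.5 hours
print(change_failure_rate(deployments))  # 1 of 2 deployments = 0.5
```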

17.2 Flow Metrics

| Metric | Definition | Target |
|--------|------------|--------|
| Cycle Time | Time from In Progress to Done per Story | <5 business days |
| WIP Age | Age of the oldest item in In Progress per tribe | <7 business days |
| Throughput | Number of Stories completed per cycle per tribe | Tracked, no fixed target (tribe-specific) |
| Flow Efficiency | Active work time ÷ total lead time | >40% |
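
A worked example of the Flow Efficiency formula above (active work time divided by total lead time); the Story timings are invented for illustration:

```python
# Sketch: Flow Efficiency = active work time / total lead time.
def flow_efficiency(active_hours: float, lead_hours: float) -> float:
    """Fraction of lead time spent actively working (not waiting)."""
    return active_hours / lead_hours

# A Story with 16 active hours inside a 40-hour lead time sits exactly
# at the 40% target boundary:
print(flow_efficiency(16, 40))  # 0.4
```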

17.3 Quality Metrics

| Metric | Definition | Target |
|--------|------------|--------|
| Unit Test Coverage | Line coverage percentage per service | ≥80% |
| Pipeline Pass Rate | Percentage of pipeline runs that pass on first attempt | >85% |
| Snyk Critical/High | Number of open Critical or High vulnerabilities | 0 Critical, <5 High |
| Escaped Defects | Production incidents caused by code changes (per cycle) | <2 per tribe per cycle |

17.4 Reporting

  • The Digital Office produces a fortnightly engineering metrics report from Jira and pipeline data
  • Metrics are reviewed in the CDO's fortnightly leadership meeting
  • Quarterly metrics are included in the CDO's board report
  • Tribes that consistently miss targets receive focused support from the Digital Office and Agile Coach

18. Continuous Improvement

18.1 Quarterly SDLC Retrospective

Every quarter, the Digital Office facilitates a cross-tribe SDLC retrospective:

  • All tribe leads attend
  • Review: DORA metrics trends, flow metrics trends, quality metrics trends
  • Identify: systemic bottlenecks, process friction, tooling gaps
  • Decide: process changes for the next quarter (captured as updates to this document)
  • Owner: CDO

18.2 Process Change Management

Changes to this SDLC framework follow this process:

  1. Anyone can propose a change by raising a Story with the sdlc-improvement label
  2. The Digital Office reviews proposed changes fortnightly
  3. Significant changes are discussed at the quarterly SDLC retrospective
  4. The CDO approves all changes to this document
  5. Updated versions are published to Confluence and communicated to all tribes

18.3 Maturity Model

The Digital Office maintains an SDLC maturity assessment per tribe, rated across:

  • Deployment automation maturity
  • Test coverage and test pyramid adherence
  • Security integration maturity
  • Monitoring and observability maturity
  • Documentation and ADR discipline

Maturity assessments are conducted quarterly and inform investment priorities and coaching focus.


19. Supersedes

This document supersedes and replaces the following:

| Document | Version | Date | Author |
|----------|---------|------|--------|
| SP-SOP-SDLC-2603 | v1.0 | 18 March 2026 | Saqlain Raza (CTO) |

The superseded document is archived in Confluence under "Superseded Documents" for audit trail purposes. It is no longer operative and must not be followed.

Key reasons for replacement:

  1. Organisational misalignment: v1.0 referenced the old team structure (Portal, Pay-Ins, Pay-Outs, SQA, DevOps). The organisation now operates as seven tribes under the CDO Division.
  2. Delivery model change: v1.0 mandated two-week sprints. The organisation has moved to Kanban with 10-day cycles.
  3. Quality model change: v1.0 relied on a separate SQA team for manual QA sign-off. Engineers now own testing.
  4. Release model change: v1.0 required CAB approval for releases. Automated pipeline gates now replace manual approval.
  5. Jira workflow change: v1.0 used waterfall-style gates in Jira. The Kanban board uses five flow columns.
  6. Language and quality: v1.0 contained US English spellings and multiple typographical errors inconsistent with Simpaisa's documentation standards.

20. Approval and Adoption

20.1 Approval

| Role | Name | Date | Signature |
|------|------|------|-----------|
| Owner / Approver | Daniel O'Reilly, CDO | 7 April 2026 | ____ |
| Reviewed | Daniel O'Reilly, CDO | 7 April 2026 | ____ |

20.2 Adoption Timeline

| Milestone | Target Date | Owner |
|-----------|-------------|-------|
| Document published to Confluence | 7 April 2026 | Digital Office |
| All-hands communication to CDO Division | 9 April 2026 | CDO |
| Jira boards reconfigured (Kanban, WIP limits) | 14 April 2026 | Agile Coach |
| Pipeline automated gates operational | 27 May 2026 | Platform Engineering |
| Daily deploy capability for one tribe | 27 May 2026 | CTO + Platform Engineering |
| Feature flag infrastructure operational | 27 May 2026 | Platform Engineering |
| DORA metrics reporting live | 24 June 2026 | Digital Office |
| All tribes operating under v2.0 | 27 June 2026 | CDO |

20.3 Training

The Digital Office and Agile Coach deliver training to all CDO Division staff covering:

  • Kanban principles and WIP limits
  • PR workflow and code review expectations
  • Pipeline gates and how to respond to failures
  • Feature flag usage
  • Incident response and escalation
  • Jira workflow and hygiene standards

Training is completed within 30 days of document publication.

20.4 Version History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 18 March 2026 | Saqlain Raza | Initial SDLC SOP (superseded) |
| 2.0 | 7 April 2026 | Daniel O'Reilly | Complete replacement. New delivery model, org structure, quality model, release model. See Section 19. |

End of document.