W-10: Engineering Ways of Work

Field	Value
Document	W-10
Title	Engineering Ways of Work
Status	Draft
Owner	CTO (Acting)
Created	2026-04-05
Review	Quarterly
Depends On	W-01 (Company Operating Rhythm), STD-GOV-124 (ARB Charter), STD-GOV-125 (Technical Debt Management), GIT-WORKFLOW (Git Workflow & Branch Strategy), Incident Response Playbook

Purpose

Define how Simpaisa's engineering organisation builds, tests, ships, and operates software. This is the single source of truth for engineering process. If it is not in this document, it is not how we work. Where current practice diverges from the target, both states are documented explicitly.

This document applies to all 38 engineers across all five Kanban teams.

Team Structure

CTO (Acting) — Saqlain Raza
├── Pay-In Team (6 engineers)
│   └── Team Lead
│       ├── 2 × Senior Engineers (Go / Java)
│       ├── 2 × Engineers
│       └── 1 × Junior Engineer
│
├── Pay-Out Team (6 engineers)
│   └── Team Lead
│       ├── 2 × Senior Engineers (Go / Java)
│       ├── 2 × Engineers
│       └── 1 × Junior Engineer
│
├── Portal Team (5 engineers)
│   └── Team Lead
│       ├── 1 × Senior Engineer (React / Go)
│       ├── 2 × Engineers
│       └── 1 × Junior Engineer
│
├── DevOps / Infra Team (5 engineers)
│   └── Team Lead
│       ├── 2 × Senior Engineers (Terraform / K8s)
│       ├── 1 × Engineer
│       └── 1 × Junior Engineer
│
└── SQA Team (4 engineers, shared across all teams)
    └── SQA Lead
        ├── 2 × QA Engineers
        └── 1 × QA Automation Engineer

Headcount: ~32 in delivery teams + CTO + team leads + SQA lead ≈ 38

Reporting: All team leads report to the CTO (Acting). The CTO reports to the CDO. Product direction comes from the CPO via Product Managers embedded with each delivery team.

1. Kanban Cadence and Ceremonies

Flow model: Continuous flow with WIP limits. No fixed sprints. Work is pulled from the backlog as capacity becomes available.

WIP limits: Maximum 2 items per engineer in progress at any time. If at limit, finish something before starting something new. WIP limits are enforced on the Jira board.
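
The limit can be sketched as a simple check. This is a minimal illustration, not the Jira board configuration: the `can_pull` helper, the `(ticket, assignee)` pair shape, and the ticket and engineer names are all hypothetical.

```python
from collections import Counter

WIP_LIMIT_PER_ENGINEER = 2  # from the WIP rule above


def can_pull(engineer: str, in_progress: list[tuple[str, str]]) -> bool:
    """Return True if `engineer` is below the WIP limit.

    `in_progress` is a list of (ticket_id, assignee) pairs. This shape is
    an assumption for illustration, not a Jira API structure.
    """
    counts = Counter(assignee for _, assignee in in_progress)
    return counts[engineer] < WIP_LIMIT_PER_ENGINEER


# Made-up board state: amna holds two items, bilal holds one.
board = [("PAY-101", "amna"), ("PAY-102", "amna"), ("PAY-103", "bilal")]
print(can_pull("amna", board))
print(can_pull("bilal", board))
```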

Ceremony	When	Duration	Attendees	Purpose
Weekly Planning	Monday AM	1 hour	Kanban team + Product Manager	Replenish the board. Pull highest-priority items. Review and adjust WIP limits.
Daily Stand-up	Every day, 10:00 local	15 min (hard stop)	Kanban team	Focus on blocked items and WIP. Not a status report — unblock, then move on.
Weekly Demo	Friday AM	30 min	Kanban team + stakeholders + Product Manager	Show what shipped this week. Gather feedback. No slides.
Fortnightly Retrospective	Every other Friday PM	45 min	Kanban team only (no managers unless invited)	What went well, what to improve, agree max 3 action items. Track action completion.
Backlog Refinement	Wednesday PM	1 hour	Kanban team + Product Manager	Refine upcoming stories. Break epics. Clarify acceptance criteria.

Time allocation per week (5 working days):

Activity	Hours/week
Ceremonies (1 hr planning + 1.25 hr stand-ups + 1 hr refinement + 30 min demo + ~22 min retro amortised)	~4
Feature development	~28
Technical debt (per STD-GOV-125)	~7 (1 day/week)
On-call / incident handling / production support	~2
Learning / documentation	~2

Ceremony overhead is approximately 4 hours per week, down from ~12 hours per two-week cycle under the previous Scrum cadence. The 20% technical debt allocation from STD-GOV-125 translates to 1 day per week per engineer. The CTO ensures this capacity is protected during weekly planning. If debt work is consistently deferred for feature work, escalate to the CDO.

2. Backlog Management and Story Writing Standards

Backlog Tool

Jira is the single backlog tool. Every piece of engineering work has a Jira ticket. No work happens without a ticket.

Story Template

Every user story or task in Jira must include:

Title: [Clear, concise description]

As a [persona],
I want [capability],
So that [business outcome].

Acceptance Criteria:
- [ ] [Specific, testable criterion]
- [ ] [Specific, testable criterion]
- [ ] ...

API Specification:
- Link to OpenAPI spec or API design doc (if applicable)

Compliance Requirements:
- [ ] PCI-DSS impact: [Yes/No — detail if yes]
- [ ] PII handling: [Yes/No — fields affected]
- [ ] Regulatory: [Market-specific requirements, e.g. SBP directive]

Technical Notes:
- Dependencies, migration steps, feature flag requirements

Estimation: [Story points — Fibonacci: 1, 2, 3, 5, 8, 13]

Rules:

- Stories estimated at 13 points or above must be broken down before being pulled into work.
- Stories without acceptance criteria are not pulled into work.
- Stories involving new API endpoints require an OpenAPI spec link before being pulled.
- Stories touching PII or payment data must have the compliance section completed.
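
The four rules can be expressed as a readiness check. The sketch below is illustrative only: the `Story` fields and the `readiness_problems` helper are hypothetical names, not Jira field identifiers or an existing automation.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_POINTS_WITHOUT_SPLIT = 13  # stories at 13 points or above must be split


@dataclass
class Story:
    # Field names are illustrative; they are not Jira field identifiers.
    title: str
    points: int
    acceptance_criteria: list[str] = field(default_factory=list)
    adds_api_endpoint: bool = False
    openapi_link: Optional[str] = None
    touches_pii_or_payments: bool = False
    compliance_completed: bool = False


def readiness_problems(story: Story) -> list[str]:
    """Return the reasons a story cannot be pulled; an empty list means ready."""
    problems = []
    if story.points >= MAX_POINTS_WITHOUT_SPLIT:
        problems.append("estimate is 13 points or above: break the story down")
    if not story.acceptance_criteria:
        problems.append("no acceptance criteria")
    if story.adds_api_endpoint and not story.openapi_link:
        problems.append("new API endpoint without an OpenAPI spec link")
    if story.touches_pii_or_payments and not story.compliance_completed:
        problems.append("compliance section not completed")
    return problems
```

Returning the list of problems, rather than a single boolean, lets refinement see everything that is missing in one pass.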

Backlog Hygiene

Product Manager owns prioritisation. Team Lead owns technical feasibility.
Backlog refinement happens weekly (Wednesday session).
Stories in the "Ready" column have been refined, estimated, and have clear acceptance criteria.
Stale tickets (untouched for 6 weeks) are reviewed and either reprioritised or closed.

3. Code Review Process

Platform

All code reviews happen in Bitbucket pull requests. No code reaches main without a pull request.

Approval Requirements

Approval	Who	Required
Peer review	Any team member at same or higher level	Yes (minimum 1)
Lead / Architect review	Team Lead, Platform Lead, or CDO	Yes (minimum 1)
Total minimum approvals		2

Both approvals must be from different people. Self-approval is not permitted.
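
As a rough sketch, the approval rule reduces to two conditions: at least two distinct reviewers other than the author, and at least one of them in a lead or architect role. The role labels and the `can_merge` helper are hypothetical, and the "same or higher level" comparison for peer reviewers is deliberately left out to keep the example short.

```python
from dataclasses import dataclass

# Roles that satisfy the lead / architect approval (labels are illustrative).
LEAD_ROLES = {"team_lead", "platform_lead", "cdo"}


@dataclass(frozen=True)
class Approval:
    reviewer: str
    role: str  # e.g. "engineer" or "team_lead"


def can_merge(author: str, approvals: list[Approval]) -> bool:
    """Check the two-approval rule: distinct people, no self-approval, one lead."""
    # Drop self-approvals and count each reviewer only once.
    by_reviewer = {a.reviewer: a for a in approvals if a.reviewer != author}
    has_lead = any(a.role in LEAD_ROLES for a in by_reviewer.values())
    return has_lead and len(by_reviewer) >= 2
```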

Review SLAs

SLA	Timeframe	Action
First review	Within 1 business day of PR creation	Reviewer picks up the PR
Escalation	At 2 business days without review	Author escalates to Team Lead
Hard escalation	At 3 business days without review	Team Lead escalates to CTO
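
The escalation ladder can be sketched as a function of business days without review. This assumes a Monday to Friday working week and ignores public holidays; the function names are hypothetical.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            days += 1
    return days


def escalation_step(pr_created: date, today: date, reviewed: bool) -> str:
    """Map time without review onto the escalation ladder in the table."""
    if reviewed:
        return "none"
    waiting = business_days_between(pr_created, today)
    if waiting >= 3:
        return "team lead escalates to CTO"
    if waiting >= 2:
        return "author escalates to team lead"
    return "awaiting first review"
```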

Architecture Review Triggers

A PR requires an Architecture Review Board (ARB) review (per STD-GOV-124) if it involves any of the following:

New service or microservice creation
New external dependency or third-party integration
Database schema changes affecting more than one service
Changes to the API gateway or routing layer
New infrastructure components (not just scaling existing ones)
Changes to authentication or authorisation flows
Cross-service data flow changes
Any change touching settlement or reconciliation logic

For changes that trigger ARB review, add the Jira label `arch-review-required` and notify the CDO. The PR does not merge until ARB approval is recorded.

Review Checklist

Reviewers assess against:

4. Definition of Done

A story is Done when all of the following are true:

#	Criterion	Verified By
1	Code merged to `main` via approved PR (2 approvals)	Bitbucket
2	All unit and integration tests pass	Jenkins CI
3	Test coverage has not decreased from baseline	SonarQube
4	Snyk security scan clean (no new critical/high)	Snyk
5	SonarQube quality gate passed	SonarQube
6	Deployed to staging environment	Jenkins CD
7	SQA verification passed on staging	SQA team sign-off in Jira
8	API documentation updated (if API changed)	OpenAPI spec
9	Runbook updated (if operational behaviour changed)	Confluence
10	Product Manager accepts the story	Jira status transition

A story that is merged but not verified by SQA on staging is not Done. It remains "In Review" in Jira.

5. CI/CD Pipeline Stages

Tooling

Source control: Bitbucket
CI/CD: Jenkins
Security scanning: Snyk (dependencies), SonarQube (static analysis)
Artefact registry: Docker images in private registry
Infrastructure as Code: Terraform
Configuration management: Ansible

Pipeline Stages

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  PR Created  │───▶│ Build + Unit │───▶│  SAST +      │───▶│ Integration  │
│  (Bitbucket) │    │ Tests        │    │  Snyk Scan   │    │ Tests        │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
                                                                   │
                                                                   ▼
┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Production  │◀───│  Staging     │◀───│  Docker      │◀───│  Quality     │
│  Deploy      │    │  Deploy      │    │  Image Build │    │  Gate Check  │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘

Stage details:

Stage	Tool	Gate	Failure Action
Build + Unit Tests	Jenkins + Go test / Maven	All tests pass	PR blocked
SAST + Snyk Scan	SonarQube + Snyk	No new critical/high findings	PR blocked
Integration Tests	Jenkins	All integration tests pass	PR blocked
Quality Gate Check	SonarQube	Coverage thresholds met, no new bugs/vulnerabilities	PR blocked
Docker Image Build	Jenkins + Docker	Image builds successfully	Pipeline fails
Staging Deploy	Jenkins + Terraform/Ansible	Health check passes	Rollback, alert team
SQA Verification	Manual + Automated	SQA sign-off	Story stays in review
Production Deploy	Jenkins + Terraform/Ansible	Health check + smoke tests pass	Automatic rollback

Pipeline duration targets:

- PR pipeline (build through quality gate): < 15 minutes
- Full pipeline through staging: < 30 minutes
- Production deploy (including smoke tests): < 15 minutes

Current state: The PR pipeline takes approximately 20-25 minutes against a target of under 15 minutes. Optimisation is a tracked technical debt item.
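
The gating behaviour in the stage table can be sketched as an ordered list that stops at the first failed gate. This is not the Jenkins pipeline definition: `gate_passes` is a hypothetical callback standing in for the real Jenkins, SonarQube, Snyk, and SQA checks.

```python
from typing import Callable

# (stage name, failure action) in pipeline order, taken from the stage table.
STAGES = [
    ("Build + Unit Tests", "PR blocked"),
    ("SAST + Snyk Scan", "PR blocked"),
    ("Integration Tests", "PR blocked"),
    ("Quality Gate Check", "PR blocked"),
    ("Docker Image Build", "Pipeline fails"),
    ("Staging Deploy", "Rollback, alert team"),
    ("SQA Verification", "Story stays in review"),
    ("Production Deploy", "Automatic rollback"),
]


def run_pipeline(gate_passes: Callable[[str], bool]) -> tuple[str, str]:
    """Run stages in order and stop at the first failed gate.

    Returns (last stage reached, outcome). The outcome is the failure
    action from the table, or "Deployed" if every gate passes.
    """
    for stage, failure_action in STAGES:
        if not gate_passes(stage):
            return stage, failure_action
    return STAGES[-1][0], "Deployed"
```

Because the loop returns on the first failure, no later stage runs after a blocked gate, which matches the intent that nothing reaches staging without passing the PR gates.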

6. Testing Requirements and Coverage Targets

Coverage Targets

Metric	Existing Services (Java)	New Services (Go / Phoenix)
Unit test coverage	70% minimum	80% minimum
Integration test coverage	60% minimum	70% minimum
End-to-end test coverage	Critical paths only	Critical paths + happy paths

Coverage is enforced by SonarQube quality gates. A PR that decreases coverage below the threshold is blocked.
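
For illustration, the threshold check amounts to a lookup per service kind. The real enforcement lives in SonarQube quality gate configuration; the `coverage_failures` helper and the `"existing"` / `"new"` keys below are hypothetical.

```python
# Minimum coverage percentages from the table above.
THRESHOLDS = {
    "existing": {"unit": 70.0, "integration": 60.0},  # existing Java services
    "new": {"unit": 80.0, "integration": 70.0},       # new Go / Phoenix services
}


def coverage_failures(service_kind: str, measured: dict[str, float]) -> list[str]:
    """Return the metrics that fall below the minimum for this service kind.

    A metric missing from `measured` is treated as 0% coverage.
    """
    minimums = THRESHOLDS[service_kind]
    return [
        metric
        for metric, minimum in minimums.items()
        if measured.get(metric, 0.0) < minimum
    ]
```

The same measured coverage can pass for an existing Java service and fail for a new Go service, which is the intended effect of the two-column table.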

Testing Pyramid

        /‾‾‾‾‾‾‾‾‾‾‾‾‾\
       /   E2E Tests     \        ← Few, slow, high-confidence
      /   (SQA-owned)     \
     /─────────────────────\
    /   Integration Tests    \     ← Moderate, test service boundaries
   /   (Developer-owned)      \
  /────────────────────────────\
 /       Unit Tests              \  ← Many, fast, developer-owned
/─────────────────────────────────\

Testing Standards

Unit tests: Written by the developer as part of the story. Committed in the same PR as the feature code. No PR without tests.
Integration tests: Test service boundaries, database interactions, and external API contracts. Run in CI against ephemeral test infrastructure.
End-to-end tests: Owned by SQA. Run against staging after deployment. Cover critical payment flows (pay-in, pay-out, settlement, reconciliation).
Contract tests: Required for all inter-service APIs. Consumer-driven contracts where practical.
Performance tests: Required for any change to transaction processing paths. Must not degrade P99 latency by more than 10%.
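
The P99 rule in the last item can be checked along these lines. This is a sketch under stated assumptions: it uses the nearest-rank percentile definition (other definitions interpolate and give slightly different values), and the function names are hypothetical rather than part of any existing test harness.

```python
def p99(latencies_ms: list[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty sample."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) computed with integers to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def degrades_p99(baseline_ms: list[float], candidate_ms: list[float],
                 max_increase: float = 0.10) -> bool:
    """True if the candidate P99 exceeds the baseline P99 by more than 10%."""
    return p99(candidate_ms) > p99(baseline_ms) * (1 + max_increase)
```

The comparison is only meaningful when both samples come from the same load profile, so the baseline run and the candidate run should use the same test scenario.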

Defect SLAs

Severity	Definition	SLA	Action
Critical	Production payment processing blocked, data loss, security breach	Blocks deployment. Fix immediately.	All hands. War room. See Incident Response Playbook.
High	Degraded service, incorrect calculations, compliance gap	3 business days	Pull into current work immediately.
Medium	User-facing bug, non-critical functionality broken	Next weekly planning	Prioritise in next refinement.
Low	Cosmetic, minor UX, non-blocking	Backlog	Fix when capacity allows.

Critical defects halt all deployments until resolved. No exceptions.

7. Deployment Process and Cadence

Target Cadence

Team	Target Frequency	Current State
Portal	Daily	2-3 times per week
Pay-In	Daily	2-3 times per week
Pay-Out	Daily	Weekly
DevOps/Infra	As needed	As needed

Gap: Target is daily deployments for all delivery teams. Portal and Pay-In are close. Pay-Out deploys weekly due to settlement window constraints and additional verification requirements. The path to daily Pay-Out deployments requires decoupling settlement batch processing from deployment — this is a Phoenix programme workstream.

Deployment Process

Merge to main — PR approved and merged.
CI pipeline runs — Build, test, scan, quality gate.
Staging deploy — Automatic on successful pipeline.
SQA verification — Manual and automated tests on staging.
Production deploy — Triggered manually by Team Lead or Senior Engineer.
Smoke tests — Automated post-deploy verification.
Monitor — 30-minute watch window after deploy. Deployer monitors dashboards.

Deployment Rules

Deploy window: 09:00-16:00 local time (Karachi), Monday to Thursday. No Friday deploys unless critical. No weekend deploys unless P1 incident.
Who can deploy: Team Leads and Senior Engineers. Junior engineers may deploy with a Senior Engineer observing.
Rollback: If smoke tests fail or error rates spike above baseline + 5% within 30 minutes, roll back immediately. Do not debug in production.
Feature flags: New user-facing features must be wrapped in feature flags. Deploy dark, then enable via flag. This allows instant rollback without redeployment.
Database migrations: Run before application deployment. Must be backward-compatible (the old code must work with the new schema). Irreversible migrations require ARB approval.
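
The deploy window and rollback rules can be sketched as two small checks. Two assumptions are made here and should be confirmed before anything like this is automated: "baseline + 5%" is read as five percentage points on the error rate, and the "unless critical" exceptions are handled outside the check. The function names are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# Karachi is UTC+5 with no daylight saving time, so a fixed offset is enough.
KARACHI = timezone(timedelta(hours=5))
WINDOW_START, WINDOW_END = time(9, 0), time(16, 0)


def in_deploy_window(moment: datetime) -> bool:
    """True for Monday-Thursday, 09:00-16:00 Karachi time.

    `moment` should be timezone-aware; it is converted to Karachi time first.
    """
    local = moment.astimezone(KARACHI)
    is_mon_to_thu = local.weekday() <= 3  # Monday is 0
    return is_mon_to_thu and WINDOW_START <= local.time() <= WINDOW_END


def should_roll_back(smoke_tests_passed: bool, baseline_error_rate: float,
                     current_error_rate: float) -> bool:
    """Apply the rollback rule during the 30-minute watch window.

    Assumption: rates are fractions (0.01 means 1%) and the threshold is
    the baseline plus five percentage points.
    """
    return (not smoke_tests_passed) or current_error_rate > baseline_error_rate + 0.05
```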

Current state: Feature flags are being rolled out. Not yet consistently used across all teams. Target: all user-facing features behind flags by end of Q3 2026.

8. Incident Response for Engineering

Engineering incident response follows the Incident Response Playbook (see Standards/INCIDENT-RESPONSE-PLAYBOOK.md).

Engineering-Specific Responsibilities

Role	During Incident	After Incident
On-call engineer	First responder. Triage, assess severity, begin mitigation.	Contribute to Post-Incident Review.
Team Lead	Escalation point. Coordinate team response. Decide on rollback.	Own remediation actions.
CTO	Escalation for P1/P2. Cross-team coordination.	Review PIR. Approve systemic changes.
CDO	Notified for P1. Stakeholder communication. Regulatory impact assessment.	Sign off PIR. Track systemic improvements.

Severity to Engineering Action

Severity	Engineering Action
P1 (Critical)	All deployments halted. War room. All-hands until resolved.
P2 (High)	Affected team stops feature work. Fix immediately.
P3 (Medium)	Fix prioritised in next weekly planning.
P4 (Low)	Added to backlog.

Post-Incident Review

Every P1 and P2 incident requires a blameless Post-Incident Review within 3 business days. The PIR follows the standard in Standards/STD-DEVEX-093-POST-INCIDENT-REVIEW-STANDARDS.md.

PIR outputs: timeline, root cause, contributing factors, remediation actions with owners and deadlines. Remediation actions are tracked in Jira with the label `pir-action`.

9. On-Call Rotation

Structure

Each delivery team (Pay-In, Pay-Out, Portal) maintains its own on-call rotation. DevOps/Infra provides a separate infrastructure on-call.

Rotation	Coverage	Escalation
Pay-In on-call	1 engineer, weekly rotation	Team Lead → CTO → CDO
Pay-Out on-call	1 engineer, weekly rotation	Team Lead → CTO → CDO
Portal on-call	1 engineer, weekly rotation	Team Lead → CTO → CDO
Infra on-call	1 engineer, weekly rotation	Team Lead → CTO → CDO

On-Call Rules

Hours: 24/7 for production systems. Response time: 15 minutes for P1, 30 minutes for P2, next business day for P3/P4.
Rotation length: 1 week, rotating through all eligible engineers (Senior Engineer and above).
Handover: Friday end of day. Outgoing on-call briefs incoming on-call on any open issues.
Compensation: On-call allowance per company policy. Incident response outside business hours compensated as overtime or time-in-lieu.
No single points of failure: Each rotation must have a minimum of 3 engineers to prevent burnout. If a team cannot staff 3 engineers, the CTO escalates to the CDO.
Reduced workload: The on-call engineer is not assigned work at full capacity. Time is reserved for on-call duties, reflected in weekly planning by reducing their WIP allocation.
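
A weekly rotation that respects the Friday handover and the three-engineer minimum could be generated as follows. This is a sketch only: the formal schedule is intended to live in PagerDuty or an equivalent tool, and `build_rotation` and the engineer names are hypothetical.

```python
from datetime import date, timedelta

MIN_ROTATION_SIZE = 3  # each rotation needs at least three engineers


def build_rotation(engineers: list[str], first_handover: date,
                   weeks: int) -> list[tuple[date, str]]:
    """Assign one engineer per week, round-robin, starting at a Friday handover.

    Returns (handover date, engineer) pairs. Raises ValueError when the
    rotation is understaffed, which is the point at which the rules above
    call for escalation.
    """
    if len(engineers) < MIN_ROTATION_SIZE:
        raise ValueError("rotation has fewer than 3 engineers: escalate")
    if first_handover.weekday() != 4:  # 4 is Friday
        raise ValueError("handover happens on Friday")
    return [
        (first_handover + timedelta(weeks=i), engineers[i % len(engineers)])
        for i in range(weeks)
    ]
```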

Current state: On-call rotations are informal. Pay-In and Pay-Out have de facto on-call engineers but no formal schedule. Target: formalised rotation with PagerDuty (or equivalent) by end of Q2 2026.

10. Technical Debt Management

Technical debt management follows STD-GOV-125 (Technical Debt Management). Key points for engineering:

Capacity Allocation

20% of engineering capacity is reserved for technical debt reduction. This is not optional. It is not a stretch goal. It is committed capacity.

In practical terms: 1 day per engineer per week is spent on debt reduction items.

Debt Tracking

All technical debt items are logged in Jira with the label `tech-debt`.
Each item is categorised: Code, Architecture, Dependency, Testing, Documentation, Infrastructure.
Each item is scored for impact (1–5) and effort (1–5). High-impact, low-effort items are prioritised.
The CTO reviews the debt register quarterly with the CDO (per W-01 Operating Rhythm).
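
One way to turn the impact and effort scores into an ordering is an impact-to-effort ratio. This is an assumption for illustration: STD-GOV-125 may define a different formula, and the `DebtItem` shape and `prioritise` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    key: str       # Jira key (values used with this sketch are made up)
    category: str  # Code, Architecture, Dependency, Testing, Documentation, Infrastructure
    impact: int    # 1 (low) to 5 (high)
    effort: int    # 1 (low) to 5 (high)


def prioritise(items: list[DebtItem]) -> list[DebtItem]:
    """Order items by impact-to-effort ratio, breaking ties on higher impact."""
    return sorted(items, key=lambda i: (-(i.impact / i.effort), -i.impact))
```

With this ordering, an item scored impact 4 and effort 1 ranks ahead of one scored impact 5 and effort 5, which reflects the preference for high-impact, low-effort work.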

Phoenix Programme

The largest single debt reduction effort is the Phoenix programme — rewriting legacy Spring Boot Java services in Go. This is tracked as a separate programme with its own milestones, not as ad-hoc tech debt.

Legacy Java services remain in maintenance mode (critical bug fixes only).
New features are built in Go.
Migration follows a strangler fig pattern: new Go services sit behind the API gateway alongside legacy services, taking over routes incrementally.
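
The incremental takeover can be pictured as a routing decision at the gateway. The sketch below is conceptual and is not the gateway configuration: the path prefixes and backend names are invented for the example.

```python
# Routes already migrated to Go; every other path stays on the legacy service.
# Prefixes and backend names are invented for this example.
MIGRATED_PREFIXES = ["/v2/payins", "/v2/refunds"]


def choose_backend(path: str) -> str:
    """Route migrated prefixes to Go and fall through to legacy Java."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or a sub-path, not a longer sibling name.
        if path == prefix or path.startswith(prefix + "/"):
            return "go-service"
    return "legacy-java-service"
```

Migrating another route then means adding one prefix to the list, and rolling it back means removing it, while the legacy service keeps serving everything else.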

11. Branch Strategy

Branch strategy follows the Git Workflow & Branch Strategy standard (see Standards/GIT-WORKFLOW-STANDARD.md).

Summary

Strategy: Trunk-based development with short-lived feature branches.
Platform: Bitbucket.
Main branch: main is always deployable. Protected: no direct pushes.
Feature branches: feature/{ticket-id}-brief-description. Maximum lifetime: 3 days. If a branch lives longer than 3 days, it is too large — break it down.
Bug fix branches: fix/{ticket-id}-brief-description.
Release branches: release/v{major}.{minor}.{patch} — created only when a release needs stabilisation.
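
The naming conventions and the three-day limit lend themselves to a simple check. This sketch assumes ticket IDs look like Jira keys (for example `PAY-123`) and that descriptions are lowercase words joined by hyphens; the Git Workflow standard may define these formats differently, and the helper names are hypothetical.

```python
import re
from datetime import date

# Assumed formats: Jira-style ticket key, then a lowercase hyphenated description.
BRANCH_PATTERNS = [
    re.compile(r"^feature/[A-Z][A-Z0-9]*-\d+-[a-z0-9]+(?:-[a-z0-9]+)*$"),
    re.compile(r"^fix/[A-Z][A-Z0-9]*-\d+-[a-z0-9]+(?:-[a-z0-9]+)*$"),
    re.compile(r"^release/v\d+\.\d+\.\d+$"),
]
MAX_FEATURE_BRANCH_DAYS = 3


def is_valid_branch_name(name: str) -> bool:
    """True if the name matches one of the three branch conventions."""
    return any(p.match(name) for p in BRANCH_PATTERNS)


def is_too_old(name: str, created: date, today: date) -> bool:
    """Feature branches older than three days should be broken down."""
    age_days = (today - created).days
    return name.startswith("feature/") and age_days > MAX_FEATURE_BRANCH_DAYS
```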

Branch Rules

Rebase onto main before merging (no merge commits in feature branches).
Squash commits on merge to main (one commit per story).
Delete feature branches after merge.
No long-lived branches apart from main and active release/* branches.

12. Engineering Metrics

We track the four DORA metrics plus Simpaisa-specific operational metrics.

DORA Metrics

Metric	Definition	Current State	Target
Deployment Frequency	How often code is deployed to production	2-3x/week (Portal, Pay-In), 1x/week (Pay-Out)	Daily (all teams)
Lead Time for Changes	Time from commit to production	~3-5 days	< 1 day
Mean Time to Recovery (MTTR)	Time from incident detection to resolution	~2-4 hours (estimated)	< 1 hour for P1
Change Failure Rate	Percentage of deployments causing incidents	Not currently tracked	< 5%

Gap: DORA metrics are not systematically measured today. Target: automated DORA metric collection via Jenkins + Jira integration by end of Q3 2026.
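
Until automated collection exists, three of the four metrics can be derived from a plain list of deployment records. The `Deployment` record shape below is hypothetical, not a Jenkins or Jira schema, and the median is one reasonable choice of summary for lead time; MTTR is omitted because it needs incident timestamps rather than deployment data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class Deployment:
    # Record shape is hypothetical; it is not a Jenkins or Jira schema.
    committed_at: datetime   # commit time of the change
    deployed_at: datetime    # production deploy time
    caused_incident: bool


def deployment_frequency(deploys: list[Deployment], period_days: int) -> float:
    """Production deployments per day over the period."""
    return len(deploys) / period_days


def median_lead_time(deploys: list[Deployment]) -> timedelta:
    """Median time from commit to production (needs at least one record)."""
    return median(d.deployed_at - d.committed_at for d in deploys)


def change_failure_rate(deploys: list[Deployment]) -> float:
    """Share of deployments that caused an incident (needs at least one record)."""
    return sum(d.caused_incident for d in deploys) / len(deploys)
```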

Operational Metrics

Metric	Definition	Tracked In	Reviewed
Throughput	Items completed per week per team	Jira	Fortnightly retrospective
Cycle time	Time from work started to work done	Jira	Fortnightly retrospective
PR review turnaround	Time from PR creation to first review	Bitbucket	Monthly by CTO
Build success rate	Percentage of CI builds that pass	Jenkins	Weekly by DevOps
Test coverage trend	Unit and integration coverage over time	SonarQube	Monthly by CTO
Tech debt ratio	Debt items created vs resolved per week	Jira	Quarterly (CTO + CDO)
Incident count by severity	Number of P1-P4 incidents per month	Incident tracker	Monthly (CTO + CDO)

Metric Review Cadence

Team level: Throughput and cycle time reviewed in fortnightly retrospectives.
Monthly: CTO reviews PR turnaround, build success rate, coverage trends.
Quarterly: CTO + CDO review DORA metrics, tech debt ratio, incident trends. Feed into quarterly technical debt review (per W-01).

13. Appendix: Quick Reference

What Needs What

I want to...	I need...
Merge a PR	2 approvals (1 peer + 1 lead/architect), all CI gates green
Deploy to production	Merged to main, staging verified by SQA, deploy window, Team Lead or Senior Engineer
Create a new service	ARB approval (STD-GOV-124)
Add a new dependency	Snyk scan clean, ARB approval if external service
Change a database schema	Backward-compatible migration, ARB approval if multi-service
Ship a user-facing feature	Feature flag, SQA sign-off, Product Manager acceptance
Skip tech debt allocation	You cannot. Escalate to CDO if pressured.
Deploy on Friday	You need CTO approval and a very good reason

Key Documents

Document	Location
Git Workflow & Branch Strategy	`Standards/GIT-WORKFLOW-STANDARD.md`
Incident Response Playbook	`Standards/INCIDENT-RESPONSE-PLAYBOOK.md`
Technical Debt Management	`Standards/STD-GOV-125-TECHNICAL-DEBT-MANAGEMENT.md`
ARB Charter	`Standards/STD-GOV-124-ARCHITECTURE-REVIEW-BOARD-CHARTER.md`
Post-Incident Review Standards	`Standards/STD-DEVEX-093-POST-INCIDENT-REVIEW-STANDARDS.md`
Service Level Objectives	`Standards/STD-DEVEX-090-SERVICE-LEVEL-OBJECTIVES.md`
Runbook Standards	`Standards/STD-DEVEX-092-RUNBOOK-STANDARDS.md`
Company Operating Rhythm	`Standards/WaysOfWork/W-01-COMPANY-OPERATING-RHYTHM.md`

Compliance with This Document

This document describes how engineering at Simpaisa works today, with clearly marked targets where practice has not yet reached the standard. The CTO is accountable for adherence. The CDO reviews compliance quarterly.

Deviations from this document are permitted only with CTO approval (for tactical exceptions) or CDO approval (for structural changes). All exceptions are time-boxed and tracked.