Simpaisa Group - CDO Strategic Documents

Prepared by: Daniel O'Reilly, Chief Digital Officer
Prepared for: Yassir Pasha, Chief Executive Officer / Executive Leadership Team
Date: 3 April 2026
Classification: Confidential - ELT Only

Document 1: CDO 90-Day Plan

Executive Summary

This plan governs the first 90 days of the CDO function at Simpaisa Group. It is structured in three phases: assess and document (Days 1–30), restructure and implement (Days 31–60), and measure and optimise (Days 61–90). The plan covers technology, security, product, people, and regulatory obligations. Success is measured by specific, time-bound key results for each initiative. The plan is calibrated to the realities of Day 5: the operating model is substantially complete, the team structure is established, and the strategic direction is set. What remains is execution.

The CDO directly owns the outcomes below. Day-to-day delivery is delegated to the CPO (Rizwan Zafar), CISO (Danish Hamid), and CTO (Saqlain Raza), who are accountable to this plan within their respective domains.

Phase 1: Assess and Document (Days 1–30)

Period: 30 March – 6 May 2026

1.1 Operating Model Completion

Objective: Deliver Simpaisa Operating Model v1.0 as the definitive reference architecture for how the group operates, governs, and grows.

Key Result: All 17 sections finalised, reviewed by ELT, and version-controlled by Day 10. Board presentation delivered by Day 30.

Owner: CDO (Daniel O'Reilly)

Deadline: 16 April 2026 (finalisation); 6 May 2026 (board presentation)

Dependencies: None - work is substantially complete (17 files, 1.1 MB of content in OneDrive/OpModel).

Success Criteria:
- Operating model approved by CEO without material revisions
- Board acknowledges receipt and endorses strategic direction
- Document is version-controlled and access-controlled in OneDrive
- Becomes the onboarding baseline for all future senior hires

Notes: This is already in progress and represents the first visible CDO deliverable to the board. Prioritise completion above all else in Week 1.

1.2 Technology Assessment

Objective: Produce an accurate, honest picture of the current AWS infrastructure footprint, its gaps, and its migration candidacy for Cloudflare edge services.

Key Result: Infrastructure audit report delivered to CDO by Day 25, covering all AWS accounts, EC2 inventory, VPC topology, data residency status per country, and an initial list of Cloudflare migration candidates.

Owner: CTO (Saqlain Raza)

Deadline: 1 May 2026

Dependencies: AWS account access, access to infrastructure-as-code repositories (Terraform, Ansible), and DevOps team participation.

Success Criteria:
- All AWS accounts and services inventoried (EC2, VPC, ALB, RDS, ElastiCache, MSK, S3, CloudFront)
- Data residency compliance assessed for Pakistan, Bangladesh, Nepal, Iraq, UAE
- Cloudflare migration candidates identified and categorised (edge routing, WAF, CDN, Workers)
- Monthly AWS spend by account and service line confirmed
- Single-region exposure flagged as a risk with a remediation recommendation

1.3 Security Posture Review

Objective: Establish an accurate baseline of Simpaisa's current information security posture across all nine entities, with particular focus on regulatory-critical frameworks.

Key Result: Security posture review report completed and presented to CDO by Day 28, covering ISO 27001 scope boundaries, PCI DSS CDE definition and control status, and a summary of penetration testing history.

Owner: CISO (Danish Hamid)

Deadline: 4 May 2026

Dependencies: Access to audit evidence, prior pen test reports, ISMS documentation, and relevant regulatory correspondence.

Success Criteria:
- ISO 27001 scope confirmed or revised for the group structure (9 entities)
- PCI DSS CDE boundaries documented and validated; SAQ or QSA status confirmed
- Pen test history reviewed: last test date, critical findings, remediation status
- Gaps identified and risk-rated (Critical / High / Medium / Low)
- Priority remediation items linked to Phase 2 work

1.4 Product Audit

Objective: Establish the CDO's independent view of the product portfolio - what is working, what is not, and where capital should be directed.

Key Result: Product audit report completed and presented to CDO by Day 28, covering roadmap status, product-market fit by corridor, and competitive positioning across Pay-Ins, Pay-Outs, Remittances, Crypto Off-Ramping, and White-Label Wallets.

Owner: CPO (Rizwan Zafar)

Deadline: 4 May 2026

Dependencies: Access to product roadmap, transaction data by corridor, customer metrics, and NPS/retention data where available.

Success Criteria:
- Roadmap reviewed and assessed against CDO strategic priorities
- Revenue and volume by product line and corridor documented
- Product-market fit rated per corridor (Strong / Developing / Weak / Exit candidate)
- Three to five product-level recommendations made for Phase 2 prioritisation
- CDO and CPO aligned on roadmap sequencing for Q3 2026

1.5 Team Assessment

Objective: Build an accurate picture of the capability, capacity, and cultural health of the teams under the CDO's remit, and identify gaps that must be addressed to execute the strategy.

Key Result: Structured 1:1s completed with all three direct reports and their direct reports (estimated 9–12 people) by Day 25. Capability gap analysis documented by Day 30.

Owner: CDO

Deadline: 1 May 2026 (1:1s completed); 6 May 2026 (gap analysis delivered)

Dependencies: Calendar availability; access to current job descriptions and performance review history where it exists.

Success Criteria:
- 1:1s completed with: Rizwan Zafar (CPO), Danish Hamid (CISO), Saqlain Raza (CTO), and their respective direct reports
- Each 1:1 covers: what is working, what is not, what the individual needs to succeed, and their career goals
- Capability gap map produced against the strategy (SRE, data engineering, ML, cloud architecture, security)
- Retention risks identified
- Preliminary view formed on CTO appointment (see 2.6)

1.6 Vendor Audit

Objective: Understand the full vendor landscape - what Simpaisa is paying, what it is getting, where it is locked in, and where it is exposed.

Key Result: Vendor register produced by Day 28 covering all technology, security, and infrastructure vendors, including contract terms, annual spend, SLA performance, and exit risk.

Owner: CTO (Saqlain Raza) in coordination with Finance

Deadline: 4 May 2026

Dependencies: Access to procurement records, vendor contracts, and Finance for spend data.

Success Criteria:
- All vendors catalogued: AWS, Datadog, Cloudflare (if any existing), Eastnets, CyGlass, Snyk, Jenkins tooling, and others
- Annual spend confirmed per vendor
- Contract end dates, auto-renewal clauses, and exit penalties documented
- Lock-in risk rated (High / Medium / Low) with rationale
- Immediate renegotiation or termination candidates flagged

1.7 DFSA Cat 3D Gap Remediation

Objective: Ensure Simpaisa's DFSA Category 3D licence application is not delayed by CDO-domain gaps.

Key Result: All CDO-domain gap items from the DFSA Cat 3D gap analysis (see existing gap analysis document) closed or formally assigned with a committed completion date by Day 20.

Owner: CDO, in coordination with CISO (Danish Hamid) and Legal/Compliance

Deadline: 24 April 2026

Dependencies: Gap analysis document (already complete in OneDrive); DFSA submission timeline from Legal.

Success Criteria:
- All CDO-domain gaps reviewed and triaged
- Each gap has a named owner, a remediation action, and a target date
- No CDO-domain item is on the critical path to submission without a committed date
- CISO confirms technology and security evidence pack is submission-ready

1.8 Quick Wins

Objective: Deliver three to five visible, tangible improvements in the first 30 days that signal the CDO function is active, pragmatic, and executive-grade.

Key Result: At least three quick wins identified by Day 15 and delivered or demonstrably in flight by Day 30, with a brief summary presented to the CEO.

Owner: CDO

Deadline: 21 April 2026 (identification); 6 May 2026 (delivery or confirmed in-flight)

Dependencies: Outputs from technology assessment, security review, and vendor audit.

Success Criteria:
- Each quick win is visible: to the CEO, the board, or end users
- At least one reduces cost or risk immediately
- At least one improves developer or operational experience
- Summary communicated to CEO in writing by Day 30

CDO-Accountable Quick Wins (confirmed, Day 15):
- SDLC v2.0 adoption by one engineering team (CDO to identify and sponsor the pilot team)
- Transaction Lifecycle Architecture CSNO validation (CDO to present design to Bachir Njeim; his sign-off gates implementation)
- Confluence as the living operating model (CDO to migrate operating model into Confluence and establish it as the authoritative reference)
- Daily deploy pilot team identified and sprint planned (CDO to select the team and unblock the Jenkins pipeline work)

1.9 Board Presentation: Operating Model v1.0

Objective: Present the CDO operating model to the board, establish credibility, and secure endorsement for the strategic direction.

Key Result: Board presentation delivered and operating model endorsed by the board by Day 30.

Owner: CDO

Deadline: 6 May 2026

Dependencies: Operating model finalised (1.1); CEO pre-briefed in advance.

Success Criteria:
- Board presentation delivered (30–45 minutes plus Q&A)
- Board endorses the operating model as the group's strategic framework
- Board approves Phase 2 investment outline (Cloudflare POC, SRE foundation, OKR rollout)
- No material objections outstanding at close of session

Phase 2: Restructure and Implement (Days 31–60)

Period: 7 May – 4 June 2026

2.1 Cloudflare Migration: Phase 1 POC

Objective: Validate the Cloudflare edge hypothesis in production conditions - specifically, whether Cloudflare Workers and edge routing can reduce latency, improve WAF coverage, and enable in-country data routing for one corridor.

Key Result: A working POC for one corridor (recommended: Pakistan) running on Cloudflare edge routing is deployed to a staging environment and benchmarked against the current AWS-only baseline by Day 55.

Owner: CTO (Saqlain Raza)

Deadline: 3 June 2026

Dependencies: Vendor audit complete (1.6); Cloudflare enterprise account provisioned; DevOps team allocated for POC sprint; AWS baseline performance data from technology assessment (1.2).

Success Criteria:
- Cloudflare Workers-based edge routing deployed for one API or payment flow
- Latency comparison: Cloudflare vs. AWS baseline documented
- WAF rule parity confirmed (no degradation in security posture)
- In-country data routing validated (traffic does not leave Pakistan jurisdiction)
- Cost comparison: Cloudflare egress vs. AWS data transfer costs modelled
- POC findings presented to CDO by Day 55 with a go / no-go recommendation for Phase 2 production migration
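The latency comparison in the criteria above could be produced along the following lines. This is a minimal sketch, not the agreed benchmarking method: the function names are hypothetical, the nearest-rank percentile definition is an assumption, and the sample values are synthetic placeholders rather than measurements.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def compare_latency(baseline_ms: list[float], candidate_ms: list[float]) -> dict[str, dict[str, float]]:
    """Compare two sets of latency samples at p50, p95, and p99."""
    report = {}
    for pct in (50, 95, 99):
        base = percentile(baseline_ms, pct)
        cand = percentile(candidate_ms, pct)
        report[f"p{pct}"] = {
            "baseline_ms": base,
            "candidate_ms": cand,
            "delta_ms": cand - base,  # negative means the candidate is faster
        }
    return report


# Synthetic placeholder samples, not measurements.
aws_only = [float(ms) for ms in range(101, 201)]
with_edge = [ms - 10.0 for ms in aws_only]
report = compare_latency(aws_only, with_edge)
print(report["p95"])  # {'baseline_ms': 195.0, 'candidate_ms': 185.0, 'delta_ms': -10.0}
```

Comparing tail percentiles rather than averages matters for payment flows, because a small share of slow requests can dominate the partner experience while leaving the mean almost unchanged.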

2.2 SRE Foundation

Objective: Establish the structural foundations of a Site Reliability Engineering model: SLO definitions, error budget policy, and an on-call framework, without a Change Advisory Board.

Key Result: SLOs defined and agreed for all critical business services by Day 45. Error budget policy documented and adopted by Day 55.

Owner: CTO (Saqlain Raza)

Deadline: 21 May 2026 (SLOs); 3 June 2026 (error budget policy)

Dependencies: Technology assessment (1.2) for baseline availability data; agreement from CPO (Rizwan Zafar) on product SLO targets; Datadog or replacement platform to measure SLOs.

Success Criteria:
- SLOs defined for: Pay-Ins, Pay-Outs, Remittance processing, Crypto Off-Ramping, White-Label Wallet API, and core authentication
- Each SLO includes: availability target (e.g. 99.9%), latency target (p95, p99), and measurement window (rolling 30 days)
- Error budget policy published: what happens when a team burns their budget (freeze on feature work until reliability is restored)
- On-call rotation established for at least one team, without CAB dependency
- CAB process formally deprecated and replaced with deployment standards document
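To make the error budget concrete, the arithmetic behind an availability SLO can be sketched as below. The 99.9% target and rolling 30-day window come from the criteria above; the class and method names are hypothetical, and the freeze rule is a simplified reading of the policy, which has yet to be written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """An availability SLO over a rolling window (illustrative structure only)."""
    service: str
    availability_target: float  # e.g. 0.999 for 99.9%
    window_days: int = 30       # rolling 30 days, as in the criteria above

    def error_budget_minutes(self) -> float:
        """Minutes of unavailability the target permits within the window."""
        window_minutes = self.window_days * 24 * 60
        return window_minutes * (1.0 - self.availability_target)

    def budget_remaining(self, downtime_minutes: float) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        return 1.0 - downtime_minutes / self.error_budget_minutes()

    def feature_freeze(self, downtime_minutes: float) -> bool:
        """Assumed policy: freeze feature work once the budget is fully burned."""
        return self.budget_remaining(downtime_minutes) <= 0.0


pay_outs = Slo(service="Pay-Outs", availability_target=0.999)
print(round(pay_outs.error_budget_minutes(), 1))       # 43.2
print(pay_outs.feature_freeze(downtime_minutes=30.0))  # False
print(pay_outs.feature_freeze(downtime_minutes=50.0))  # True
```

At 99.9% over 30 days the budget is about 43 minutes of downtime, which gives the CTO and CPO a shared number to negotiate against when agreeing product SLO targets.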

2.3 Datadog Assessment

Objective: Determine whether Datadog should be retained, replaced, or renegotiated, based on cost, capability, and fit with the Cloudflare-first infrastructure strategy.

Key Result: A business case document for the observability platform decision is presented to the CDO by Day 55, with a clear recommendation and a cost comparison across at least three alternatives.

Owner: CTO (Saqlain Raza) in coordination with CISO (Danish Hamid) for security observability requirements

Deadline: 3 June 2026

Dependencies: Vendor audit (1.6) for current Datadog contract terms; technology assessment (1.2) for current usage profile; CISO input on security monitoring requirements (CyGlass integration, log retention for compliance).

Success Criteria:
- Current Datadog usage profiled: APM coverage, log ingestion volume, custom metrics, dashboard count, active users
- At least three alternatives evaluated: Grafana Cloud, New Relic, and one of Elastic/ELK or Axiom
- Evaluation scored against: cost (3-year TCO), Cloudflare integration quality, on-call alerting capability, log retention (for PCI DSS: at least 12 months of audit log history, with the most recent three months immediately available), compliance reporting, and migration complexity
- Recommendation made with financial justification
- If replacement is recommended, migration timeline proposed for Phase 3

2.4 Daily Deploy Pipeline

Objective: Remove the structural barriers to daily deployment and establish the engineering culture and tooling that makes it safe and normal.

Key Result: At least one team is deploying to production daily by Day 55, using automated QA gates, feature flags, and automated rollback - with no manual approval step.

Owner: CTO (Saqlain Raza)

Deadline: 3 June 2026

Dependencies: CAB deprecation (2.2); Jenkins pipeline access; QA automation coverage baseline; feature flag tooling decision.

Success Criteria:
- One product team identified as the pilot (highest deployment frequency, strongest test coverage)
- Jenkins pipeline for that team updated: automated unit tests, integration tests, and security scan (Snyk) as gates; no manual approval gate
- Feature flag system implemented (LaunchDarkly, Unleash, or similar) for that team
- Rollback automated: failed healthcheck triggers automatic rollback within 5 minutes
- Deployment frequency measured: target is at least one production deployment per working day
- Incident rate monitored: no increase in production incidents vs. weekly deploy baseline
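The automated rollback criterion can be illustrated with a tool-agnostic sketch of the post-deploy watch loop. Everything here other than the 5-minute limit is an assumption: the function name, the 15-second probe interval, and the three-consecutive-failures threshold are placeholders, and the actual Jenkins integration is not shown.

```python
import time
from dataclasses import dataclass
from typing import Callable

ROLLBACK_DEADLINE_S = 5 * 60  # plan criterion: rollback within 5 minutes


@dataclass(frozen=True)
class Decision:
    action: str       # "keep" or "rollback"
    elapsed_s: float  # seconds after deploy when the decision was made


def watch_deployment(
    probe: Callable[[], bool],
    interval_s: float = 15.0,      # assumed probe interval
    failure_threshold: int = 3,    # assumed consecutive failures before rollback
    watch_window_s: float = ROLLBACK_DEADLINE_S,
    sleep: Callable[[float], None] = time.sleep,
) -> Decision:
    """Probe a fresh deployment and decide whether to keep it or roll it back."""
    consecutive_failures = 0
    elapsed = 0.0
    while elapsed < watch_window_s:
        if probe():
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return Decision("rollback", elapsed)
        sleep(interval_s)
        elapsed += interval_s
    return Decision("keep", elapsed)


# Simulated probe results; sleep is stubbed so the example runs instantly.
results = iter([True, False, False, False])
decision = watch_deployment(lambda: next(results), sleep=lambda _s: None)
print(decision)  # Decision(action='rollback', elapsed_s=45.0)
```

Requiring several consecutive failures avoids rolling back on a single transient probe error, while the interval and threshold together keep the worst-case detection time well inside the 5-minute limit.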

2.5 OKR Rollout: Technology Teams Pilot

Objective: Introduce OKR-based performance management to the technology teams as a pilot for the Q3 2026 cycle, replacing or complementing any existing performance management framework.

Key Result: OKRs drafted, reviewed, and adopted for all CDO-reporting teams (product, engineering, security) for Q3 2026 (July–September 2026) by Day 55.

Owner: CDO

Deadline: 3 June 2026

Dependencies: Team assessment completed (1.5); CEO alignment on OKR adoption at group level; HR involvement for performance management integration.

Success Criteria:
- OKR framework defined and documented (Objectives, Key Results, scoring methodology, review cadence)
- Each direct report (CPO, CISO, CTO) has a set of Q3 2026 OKRs agreed with CDO
- Each direct report's team has OKRs cascaded from leadership OKRs
- OKR review cadence established: weekly check-in (15 minutes), monthly score review, end-of-quarter retrospective
- OKR tool or template selected (Notion, Linear, or a dedicated OKR platform)
- CEO briefed on pilot design and asked to endorse for wider rollout in Q4 2026

2.6 CTO Appointment

Status: Complete. Saqlain Raza is appointed as CTO, reporting to the CDO. No further action required.

2.7 Product Roadmap Alignment

Objective: Ensure the CPO's product roadmap reflects CDO strategic priorities - particularly the Cloudflare migration, SRE SLOs, active-active DR, and expansion into Saudi Arabia / MENA / Central Asia.

Key Result: A unified CDO + CPO product and technology roadmap is published for H2 2026 by Day 55, with clear sequencing and dependency mapping.

Owner: CDO and CPO (Rizwan Zafar) jointly

Deadline: 3 June 2026

Dependencies: Product audit (1.4); Cloudflare POC direction (2.1); SRE SLO adoption (2.2); market expansion timeline from CEO.

Success Criteria:
- H2 2026 roadmap published covering product features, infrastructure milestones, and security improvements
- All roadmap items have an owner, a quarter, and a dependency list
- Cloudflare migration, SRE, and active-active DR are treated as first-class roadmap items, not background infrastructure work
- Saudi and MENA expansion technology requirements are included with a readiness assessment
- CPO and CDO have no unresolved conflicts on prioritisation

2.8 Security Improvements from Phase 1

Objective: Begin remediating the highest-priority findings from the Phase 1 security posture review.

Key Result: All Critical-rated findings from the Phase 1 review are in active remediation with a named owner and a confirmed completion date by Day 55. At least 50% of High-rated findings are closed by Day 55.

Owner: CISO (Danish Hamid)

Deadline: 3 June 2026

Dependencies: Security posture review completed (1.3); DevOps and engineering team capacity for remediation work; vendor cooperation where required.

Success Criteria:
- Critical findings: zero unowned items by Day 35; all in active remediation by Day 55
- High findings: 50% closed by Day 55; remainder on a committed schedule
- CISO provides fortnightly status report to CDO
- No Critical finding results in a regulatory notification obligation during Phase 2

CDO Direct Report Weekly Check-In

Format: 30 minutes, standing meeting, Tuesdays

Owner: CDO

Participants: All CDO direct reports (CPO, CISO, CTO)

Purpose: Keep Phase 2 delivery on track without waiting for monthly reviews. Surface blockers early. Align on the week's priorities. Each direct report gives a 5-minute status: what shipped, what is blocked, what they need from the CDO.

Starting: Week of 14 April 2026 (immediately)

2.9 Transaction Lifecycle Implementation - Phase 1

Objective: Ship the transaction lifecycle state machine and audit log designed in the Transaction Lifecycle Architecture document (April 2026). Convert the design from a spec into running infrastructure that Operations, Partners, and Compliance can actually use.

Key Result: State machine and audit log deployed for at least one product (Pay-Out recommended as first) by Day 55. Partner-facing status API live for that product.

Owner: CDO (sponsor); CTO (Saqlain Raza) (delivery owner)

Deadline: 3 June 2026

Dependencies: Transaction Lifecycle Architecture validated by CSNO (Bachir Njeim) - gate for stories; CPO assigns Product Manager to own state definitions; engineering capacity allocated from Phase 2 sprint.

Success Criteria:
- CSNO validation of state definitions completed (gate for engineering start)
- Database schema deployed: transaction_state and transaction_audit_log tables
- State machine live for one product: every transition captured with timestamp, actor, and outcome
- Partner-facing status API live: GET /v1/transactions/{id}/status returning partner states and history
- Internal ops view: time-in-state visible in Datadog or equivalent
- Design for Phase 2 event sourcing migration documented (H2 2026 handoff)
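The state machine and audit log described above might take roughly the following shape. This is an in-memory sketch under stated assumptions, not the design in the Transaction Lifecycle Architecture document: the state names and allowed transitions are hypothetical (the real definitions are gated on CSNO validation), and the response shape of the status call is illustrative only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class State(Enum):
    # Hypothetical states; the real definitions await CSNO validation.
    INITIATED = "initiated"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions (assumed for illustration).
ALLOWED = {
    State.INITIATED: {State.PROCESSING, State.FAILED},
    State.PROCESSING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


@dataclass(frozen=True)
class AuditEntry:
    """One row of the audit log: timestamp, actor, and outcome per transition."""
    transaction_id: str
    from_state: State
    to_state: State
    actor: str
    outcome: str  # "applied" or "rejected"
    timestamp: datetime


@dataclass
class Transaction:
    transaction_id: str
    state: State = State.INITIATED
    audit_log: list[AuditEntry] = field(default_factory=list)

    def transition(self, to_state: State, actor: str) -> bool:
        """Attempt a transition; every attempt is logged, valid or not."""
        ok = to_state in ALLOWED[self.state]
        self.audit_log.append(AuditEntry(
            transaction_id=self.transaction_id,
            from_state=self.state,
            to_state=to_state,
            actor=actor,
            outcome="applied" if ok else "rejected",
            timestamp=datetime.now(timezone.utc),
        ))
        if ok:
            self.state = to_state
        return ok

    def status(self) -> dict:
        """Illustrative payload for a status lookup: current state plus history."""
        return {
            "id": self.transaction_id,
            "state": self.state.value,
            "history": [
                {"from": e.from_state.value, "to": e.to_state.value,
                 "actor": e.actor, "outcome": e.outcome,
                 "timestamp": e.timestamp.isoformat()}
                for e in self.audit_log
            ],
        }


txn = Transaction("txn-001")
txn.transition(State.PROCESSING, actor="payout-service")
txn.transition(State.COMPLETED, actor="payout-service")
txn.transition(State.PROCESSING, actor="ops-user")  # rejected: COMPLETED is terminal
print(txn.status()["state"])  # completed
```

Logging rejected attempts as well as applied ones gives Operations and Compliance a complete trail. Time-in-state for the internal ops view can then be derived from consecutive timestamps in the log.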

2.10 Saudi Arabia Technical Readiness Assessment

Objective: Define what Simpaisa needs technically to launch in Saudi Arabia - infrastructure, compliance, licensing, and staffing - and confirm the feasibility of the planned timeline.

Key Result: Saudi Arabia Technical Readiness document completed and presented to CEO by Day 55, covering SAMA aggregator model requirements, Cloudflare edge deployment for in-country data residency, and the staffing pipeline.

Owner: CDO (Daniel O'Reilly)

Deadline: 3 June 2026

Dependencies: SAMA aggregator model review (existing CDO research); Cloudflare POC direction (2.1) for edge deployment feasibility; market expansion timeline from CEO; talent pipeline work (3.5).

Success Criteria:
- SAMA aggregator licensing requirements documented: what Simpaisa needs vs. what it has
- Technical prerequisites listed: data residency (in-KSA), Arabic localisation, SAMA-required transaction reporting
- Cloudflare-first deployment architecture sketched for Saudi (edge in Riyadh via Cloudflare)
- Staffing requirements defined: minimum viable team for Saudi launch
- Go / no-go criteria defined for the CEO: what must be true before committing to a Saudi launch date
- Document presented to and endorsed by CEO by Day 55

2.11 Data Strategy

Objective: Define the data architecture that Simpaisa needs to operate as a data-driven business - from operational reporting to ML-ready analytics. Pulled forward from Phase 3 to Phase 2 because the CDO can produce this directly without waiting for infrastructure, and because it unblocks AI/ML (3.4) earlier.

Key Result: Data Strategy document completed, presented to CEO, and accepted by Day 55.

Owner: CDO (Daniel O'Reilly)

Deadline: 3 June 2026

Dependencies: Product audit (1.4) for analytics requirements; technology assessment (1.2) for current data infrastructure; KPI baseline work (3.1) for data source inventory.

Success Criteria:
- Current state documented: existing databases (MySQL, PostgreSQL, MongoDB, DocumentDB), reporting tools (Looker, Metabase), and data flows
- Target state defined: cloud data warehouse (Snowflake or BigQuery), dbt transformation layer, and self-service BI
- Build vs. buy vs. partner recommendation made for each component
- Implementation roadmap produced and aligned to Workstreams 4 and 5 of the Digital Transformation Roadmap
- CDO and CPO aligned on data ownership and governance model
- CEO accepts document and approves Phase 3 implementation investment

Phase 3: Measure and Optimise (Days 61–90)

Period: 5 June – 4 July 2026

3.1 KPI Baseline Establishment

Objective: Move from a theoretical KPI framework (the 83 KPIs in the operating model) to a live, data-driven baseline with actual values for each metric.

Key Result: Actual baseline values established for at least 60 of the 83 KPIs in the operating model by Day 85, with a data source and measurement method documented for each.

Owner: CDO, with delivery support from CTO and CPO

Deadline: 1 July 2026

Dependencies: Technology assessment (1.2); product audit (1.4); SRE SLO definitions (2.2); observability platform decision (2.3); data warehouse readiness (linked to Workstream 4 in the Digital Transformation Roadmap).

Success Criteria:
- Each of the 83 KPIs has a status: Live (actual value confirmed), In Progress (data source identified, instrumentation pending), or Deferred (requires data infrastructure not yet built)
- At least 60 KPIs are Live by Day 85
- All SRE-related KPIs (availability, latency, deployment frequency, mean time to recovery) are Live
- KPI dashboard or report format agreed with CEO and board

3.2 Cost Optimisation

Objective: Identify and begin realising material reductions in cloud and tool spend without degrading service quality or reliability.

Key Result: Cost optimisation initiatives worth at least USD 200,000 annualised savings identified by Day 80, with at least USD 50,000 of those savings committed or in flight.

Owner: CTO (Saqlain Raza) in coordination with Finance

Deadline: 26 June 2026

Dependencies: AWS account inventory (1.2); vendor audit (1.6); Datadog assessment (2.3); Cloudflare cost modelling (2.1).

Success Criteria:
- AWS reserved instance and savings plan analysis completed; recommendations made
- EC2 right-sizing recommendations produced (instances over-provisioned by more than 30%)
- Datadog cost vs. alternative modelled and factored into the savings target
- At least one Reserved Instance purchase or Savings Plan commitment made
- Cloudflare migration cost offset modelled for Pakistan corridor
- Monthly cloud spend report established and shared with Finance

3.3 Data Strategy

Moved to Phase 2 as item 2.11. The CDO can produce this document directly without waiting for Phase 3 infrastructure. Pulling it forward unblocks the AI/ML initiative (3.4) by at least six weeks.

See 2.11 Data Strategy for the full item definition.

3.4 AI and ML Opportunities

Objective: Identify the highest-ROI opportunities for artificial intelligence and machine learning in Simpaisa's operations, and initiate the first initiative.

Key Result: AI/ML opportunity assessment completed by Day 80, with at least one initiative (reconciliation automation recommended as highest priority) in active development or a confirmed start date by Day 90.

Owner: CDO, with delivery by CTO and data team

Deadline: 26 June 2026 (assessment); 4 July 2026 (first initiative confirmed)

Dependencies: Data strategy (2.11); product audit (1.4) for operational pain points; access to historical transaction data to assess model-training potential.

Success Criteria:
- Three to five AI/ML use cases identified, sized for effort and ROI
- Each use case assessed for: data availability, feasibility, time to value, and regulatory risk
- Reconciliation automation confirmed as Phase 1 priority (or an alternative justified)
- Phase 1 initiative has a named lead, a defined scope, and a start date
- CEO and CPO briefed on the AI/ML roadmap

3.5 Talent Pipeline: Saudi/MENA Expansion

Objective: Ensure the technology talent required for Saudi Arabia, broader MENA, and Central Asia expansion is identified and being recruited ahead of market entry.

Key Result: A talent plan for the expansion technology team is produced by Day 85, covering roles, locations, cost, and timeline. At least two roles are in active recruitment by Day 90.

Owner: CDO in coordination with HR

Deadline: 1 July 2026 (talent plan); 4 July 2026 (active recruitment confirmed)

Dependencies: Market expansion timeline from CEO; product roadmap alignment (2.7) for scope of expansion technology requirements; budget approval.

Success Criteria:
- Roles defined for Saudi launch: at minimum a senior engineer, a DevOps/SRE engineer, and a security lead (or equivalents)
- Location strategy confirmed (in-country hires vs. remote from existing hubs)
- Compensation benchmarks produced for Saudi and MENA markets
- JDs published and recruitment partner engaged for at least two roles
- CDO confident the technology team can support market entry on the planned timeline

3.6 Board Report: CDO Quarterly Report

Objective: Present the first formal CDO Quarterly Report to the board, demonstrating measurable progress, financial discipline, and strategic clarity.

Key Result: CDO Quarterly Report presented to the board by Day 90.

Owner: CDO

Deadline: 4 July 2026

Dependencies: All Phase 1–3 deliverables; KPI baseline (3.1); cost optimisation results (3.2); Cloudflare POC findings (2.1); AI/ML assessment (3.4).

Success Criteria:
- Report covers: strategic progress vs. the 90-day plan, KPI dashboard with actual values, cost optimisation delivered and pipeline, security posture improvement, product roadmap status, and investment requests for H2 2026
- Investment requests clearly articulated with business cases: data warehouse, Cloudflare production migration, headcount for Saudi expansion
- Board questions addressed confidently and with data
- Board approves H2 2026 investment plan

Phase Summary Table

Item	Owner	Deadline	Phase
1.1 Operating Model Completion	CDO	16 Apr / 6 May	1
1.2 Technology Assessment	CTO	1 May	1
1.3 Security Posture Review	CISO	4 May	1
1.4 Product Audit	CPO	4 May	1
1.5 Team Assessment	CDO	1 May / 6 May	1
1.6 Vendor Audit	CTO	4 May	1
1.7 DFSA Cat 3D Remediation	CDO / CISO	24 Apr	1
1.8 Quick Wins	CDO	21 Apr / 6 May	1
1.9 Board Presentation	CDO	6 May	1
2.1 Cloudflare POC	CTO	3 Jun	2
2.2 SRE Foundation	CTO	21 May / 3 Jun	2
2.3 Datadog Assessment	CTO / CISO	3 Jun	2
2.4 Daily Deploy Pipeline	CTO	3 Jun	2
2.5 OKR Rollout	CDO	3 Jun	2
2.6 CTO Appointment	Complete	-	-
2.7 Product Roadmap Alignment	CDO / CPO	3 Jun	2
2.8 Security Improvements	CISO	3 Jun	2
2.9 Transaction Lifecycle Implementation	CDO (sponsor) / CTO	3 Jun	2
2.10 Saudi Arabia Technical Readiness	CDO	3 Jun	2
2.11 Data Strategy	CDO	3 Jun	2
3.1 KPI Baseline	CDO / CTO / CPO	1 Jul	3
3.2 Cost Optimisation	CTO	26 Jun	3
3.3 Data Strategy	(moved to 2.11)	-	-
3.4 AI/ML Opportunities	CDO / CTO	26 Jun / 4 Jul	3
3.5 Talent Pipeline	CDO / HR	1 Jul / 4 Jul	3
3.6 Board Quarterly Report	CDO	4 Jul	3

---

Document 2: Digital Transformation Roadmap (12–24 Months)

Executive Summary

This roadmap defines how Simpaisa Group will modernise its technology platform, operational model, and data capability over the period July 2026 to June 2028. It is structured around six workstreams that are interdependent but independently executable. The roadmap is calibrated to Simpaisa's current reality: a functioning but infrastructure-heavy AWS estate, a product portfolio with strong corridor economics in Pakistan, and an ambition to expand into Saudi Arabia, broader MENA, and Central Asia.

The strategic architecture target is: Cloudflare at the edge for routing, WAF, and CDN; AWS for compute and persistent data; an event-driven platform on Kafka; a cloud data warehouse for analytics; and ML-powered operations. The cultural target is: SRE model, daily deployments, OKR-driven teams, and a data-first product culture.

This is a living document. It will be reviewed and updated at each quarterly board meeting.

Document Owner: CDO (Daniel O'Reilly)
Review Cadence: Quarterly
Next Review: July 2026 (Q3 2026 Board)

Workstream 1: Infrastructure Modernisation (Cloudflare Migration)

Current State

Simpaisa operates on a per-country AWS deployment model. Each market (Pakistan, Bangladesh, Nepal, Iraq, UAE) has its own AWS region or a shared region with logical separation. There is no active-active disaster recovery posture - the failure of a primary region would cause a service outage. CloudFront is used for some CDN functions. WAF rules are managed in AWS WAF. EC2 instances with Auto Scaling Groups provide compute. ALBs route traffic within each environment. There is no Cloudflare presence in the current stack.

Target State

Cloudflare sits at the edge for all markets. Cloudflare Workers handle edge routing, request authentication, and geo-specific logic. Cloudflare WAF replaces AWS WAF for perimeter security. Cloudflare CDN handles static assets and cacheable responses. AWS retains all compute (EC2/containers) and all persistent data (RDS MySQL/PostgreSQL, DocumentDB, ElastiCache/Redis). In-country data residency is enforced at the Cloudflare edge - no customer data transits outside the country of origin. Active-active DR is achieved through Cloudflare's global anycast network routing between two AWS regions per market.

Milestones

Phase 1 - Q3 2026 (July–September 2026): Proof of Concept
- Cloudflare enterprise account procured and configured
- One corridor selected for POC (Pakistan recommended: highest volume, most mature engineering team)
- Cloudflare Workers deployed for one payment API flow (e.g. Pay-In initiation)
- Performance benchmarked: latency (p50, p95, p99), error rate, throughput
- WAF rules migrated from AWS WAF to Cloudflare WAF for POC scope
- In-country routing validated with Simpaisa Legal and CISO for Pakistani regulatory requirements
- Go / no-go decision made by end of Q3 2026
- Estimated investment: USD 15,000–25,000 (Cloudflare enterprise contract, engineering time)

Phase 2 - Q4 2026 (October–December 2026): Pakistan Production Migration
- All edge routing for Pakistan migrated to Cloudflare Workers (production)
- Cloudflare WAF fully deployed for Pakistan; AWS WAF deprecated for that market
- Cloudflare CDN serving all static assets and cacheable API responses for Pakistan
- Active-active DR pilot: two AWS regions (ap-south-1 and me-south-1 or equivalent) connected via Cloudflare routing
- Operational runbooks updated for Cloudflare-first operations
- Datadog or replacement observability platform integrated with Cloudflare logs and analytics
- Estimated investment: USD 40,000–60,000 (additional Cloudflare capacity, engineering sprint, observability integration)

Phase 3 - Q1 2027 (January–March 2027): Bangladesh, Nepal, Iraq Roll-out - Cloudflare edge deployed for Bangladesh, Nepal, and Iraq using the Pakistan playbook - Data residency validated for each jurisdiction (regulatory requirements differ) - In-country data routing confirmed with Legal per jurisdiction - Active-active DR extended to all three markets - WAF rule sets customised per market threat profile - Estimated investment: USD 30,000–50,000 (incremental Cloudflare capacity, engineering)

Phase 4 - Q2 2027 (April–June 2027): Cloudflare-First for New Markets - Saudi Arabia (new market): Cloudflare-first from day one - no legacy AWS WAF or CloudFront - All new market launches default to the Cloudflare + AWS architecture - UAE Cloudflare migration completed (currently DIFC-regulated - requires Legal sign-off on data routing) - Global Cloudflare configuration management tooling established (Terraform provider for Cloudflare) - Estimated investment: USD 20,000–35,000 (per new market onboarding, Saudi-specific regulatory work)

Dependencies¶

Workstream 2 (SRE): SLOs must be defined before production migration - need a reliability baseline to measure against
Workstream 3 (Observability): Cloudflare logs need to feed into the chosen observability platform; Datadog has a Cloudflare integration but so do alternatives
Legal and Compliance: data residency confirmation required per market before production migration
Workstream 6 (Platform): API Gateway consolidation (6.x) simplifies Cloudflare routing configuration

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Cloudflare Workers limitations for complex payment routing logic	Medium	High	POC specifically designed to expose this; if Workers prove insufficient, fall back to using Cloudflare for WAF and CDN only
Regulatory non-compliance on data routing (Pakistan SECP / SBP, Bangladesh BB)	Low	Critical	Legal review gated before each market migration; CISO sign-off required
Active-active DR complexity introducing split-brain risk	Medium	High	DR architecture design peer-reviewed before implementation; tested in staging for 30 days before production
AWS egress cost increase during hybrid period	Low	Medium	Modelled in Phase 1 cost comparison; Cloudflare egress pricing negotiated upfront
Vendor lock-in to Cloudflare	Low	Medium	All routing logic kept portable; Terraform-managed; annual contract review

Workstream 2: SRE Maturity and Deployment¶

Current State¶

Simpaisa uses Jenkins for CI/CD. Deployments are weekly or less frequent. There is no formal Site Reliability Engineering function, no defined Service Level Objectives, and no error budget framework. Deployment approvals go through a manual process (effectively a Change Advisory Board, or CAB). There is no automated rollback. On-call rotations exist informally but are not structured. Mean Time to Recovery is not measured.

Target State¶

Simpaisa operates a full SRE model. Every critical service has a published SLO. Error budgets are enforced: when a team exhausts its budget, feature work stops until reliability is restored. Deployments happen daily for all teams, gated only by automated tests, security scans, and feature flags. Rollback is automated and completes within five minutes of a failed healthcheck. Chaos engineering is practised monthly. Mean Time to Recovery, deployment frequency, and error budget burn rate are board-level metrics.

Milestones¶

Month 1 (April 2026): SLO Definition
- SLOs defined for all critical business services (Pay-In, Pay-Out, Remittance, Crypto Off-Ramp, White-Label Wallet API, Authentication)
- Each SLO specifies: availability target, latency target (p95 and p99), error rate target, and measurement window
- SLO targets agreed between CDO, CTO, and CPO - product and engineering jointly own reliability
- SLO instrumentation in current observability platform (Datadog or replacement) validated

Month 3 (June 2026): Error Budget Implementation
- Error budget policy published and adopted by all engineering teams
- Error budgets instrumented and visible to engineering leads and CDO in real time
- First error budget review conducted - teams report on budget status and reliability trends
- CAB formally deprecated; deployment standards document published as replacement
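
The relationship between an availability SLO and its error budget can be made concrete with a short calculation. The sketch below is illustrative: the `Slo` class, the function names, and the 99.9% / 30-day figures used in the usage note are assumptions, not agreed Simpaisa targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    service: str
    availability_target: float  # e.g. 0.999 means 99.9%
    window_days: int


def error_budget_minutes(slo: Slo) -> float:
    """Total downtime allowed within the measurement window, in minutes."""
    return (1.0 - slo.availability_target) * slo.window_days * 24 * 60


def budget_remaining(slo: Slo, downtime_minutes: float) -> float:
    """Fraction of the error budget left; a negative value means exhausted."""
    budget = error_budget_minutes(slo)
    if budget <= 0:
        raise ValueError("a 100% target leaves no error budget")
    return (budget - downtime_minutes) / budget
```

With a hypothetical 99.9% target over 30 days, the budget is about 43.2 minutes of downtime. A team that has already used 21.6 minutes has half its budget left, and a negative result is the signal that feature work pauses under the error budget policy.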

Month 4 (July 2026): Daily Deploy - Pilot Team
- One team (highest deployment maturity) deploying to production daily
- Jenkins pipeline fully automated for that team: unit tests, integration tests, Snyk scan, automated rollback
- Feature flags implemented for all new features on that team
- Deployment frequency and incident rate reported weekly to CDO

Month 6 (September 2026): Daily Deploy - All Teams
- All engineering teams deploying to production at least once per working day
- Pipeline standardised across all teams using shared Jenkins library or migration to alternative CI tooling
- On-call rotations formalised: all critical services have a named on-call engineer and an escalation path
- MTTR measured for all P1 and P2 incidents; target: under 30 minutes for P1
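
MTTR can be derived directly from incident timestamps. The following is a minimal sketch, assuming each incident is recorded as a (detected, resolved) pair of `datetime` values. The function name and record shape are hypothetical and would depend on the on-call platform chosen.

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean time to recovery over (detected_at, resolved_at) pairs."""
    if not incidents:
        raise ValueError("no incidents to average")
    total = sum(((end - start) for start, end in incidents), timedelta())
    return total / len(incidents)
```

The result is a `timedelta`, so the P1 target can be checked with a plain comparison such as `mttr(p1_incidents) < timedelta(minutes=30)`.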

Month 9 (December 2026): Chaos Engineering
- Chaos engineering programme initiated: one chaos experiment per sprint per critical service team
- Game Day exercises conducted quarterly
- Failure modes documented for all critical paths
- Resilience targets included in SLO definitions

Month 12 (March 2027): SRE Maturity Assessment
- External or internal SRE maturity assessment conducted
- Target: Level 3 of 5 on a standard SRE maturity model
- Gaps from assessment feed into Year 2 roadmap

Investment Requirements¶

Item	Estimated Annual Cost
SRE tooling (feature flags, chaos engineering platform)	USD 20,000–40,000
On-call management platform (PagerDuty or equivalent)	USD 15,000–25,000
SRE lead hire (if not promoted internally)	USD 80,000–120,000 (salary + benefits)
Engineering time for pipeline automation (internal)	2–3 engineer-months

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Cultural resistance to error budget enforcement	High	Medium	CDO and CTO visibly enforce policy from day one; first breach handled constructively
Daily deploy frequency exposes low test coverage	High	High	Test coverage audit conducted before pipeline gates enforced; coverage targets set and tracked
CAB deprecation creates compliance concern for regulated entities	Medium	High	Legal and CISO confirm that deployment standards document satisfies regulatory requirements

Workstream 3: Observability Platform Replacement¶

Current State¶

Simpaisa uses Datadog for application performance monitoring, log management, and infrastructure metrics. CyGlass provides network behaviour analytics for security. AWS CloudWatch provides basic infrastructure monitoring. There is no unified observability platform. Datadog costs are substantial and growing with data volume. The Cloudflare migration (Workstream 1) will introduce new log sources that must be integrated into whatever platform is chosen.

Target State¶

A unified observability platform covers application performance, infrastructure metrics, logs, security events, and Cloudflare telemetry in a single pane of glass. Cost per gigabyte of logs ingested is materially lower than Datadog. Log retention meets PCI DSS requirements (minimum 90 days online, 12 months archived). On-call alerting is reliable and integrated with the on-call management platform. The platform integrates natively with Cloudflare.

Evaluation Criteria¶

Criterion	Weight	Notes
3-year total cost of ownership	30%	Compare on equivalent log volume and APM coverage
Cloudflare integration quality	20%	Native log push from Cloudflare Logpush; Cloudflare analytics integration
On-call alerting reliability	15%	PagerDuty/Opsgenie integration; alert routing; noise reduction
Log retention and compliance	15%	90-day online minimum; 12-month archive; PCI DSS and DFSA-compatible
APM and distributed tracing	10%	Java/Spring, Node.js, Python trace support
Migration complexity	10%	Dashboard migration, alert migration, integration effort
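
The weighted criteria above can be applied to each candidate with a simple scoring function. This is a sketch only: the criterion keys, the function name, and the 0–5 scoring scale are assumptions introduced for illustration; only the weights come from the table.

```python
# Weights copied from the evaluation criteria table; keys are hypothetical.
WEIGHTS = {
    "tco_3yr": 0.30,
    "cloudflare_integration": 0.20,
    "alerting_reliability": 0.15,
    "retention_compliance": 0.15,
    "apm_tracing": 0.10,
    "migration_complexity": 0.10,
}


def weighted_score(scores):
    """Combine per-criterion scores (0-5 scale assumed) into one number."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

Requiring a score for every criterion prevents a candidate from ranking well simply because a weak area was left unscored.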

Candidates¶

Grafana Cloud
- Managed Grafana, Loki (logs), Tempo (traces), Mimir (metrics)
- Strong Cloudflare integration via Loki log push
- Cost: consumption-based; typically 40–60% lower than Datadog at Simpaisa's scale
- Migration: moderate - dashboards require rebuild; Prometheus exporters widely supported

New Relic
- Full-stack observability; similar feature set to Datadog
- Cloudflare integration available
- Cost: user-seat plus data ingestion model; typically comparable to Datadog
- Migration: low - Datadog parity is high; risk of replacing one expensive platform with another

Elastic / ELK (self-hosted on AWS)
- Elasticsearch, Logstash, Kibana - self-managed on EC2 or EKS
- Highest control; lowest per-unit cost at scale; highest operational overhead
- Cloudflare Logpush supported
- Risk: requires dedicated engineering resource to operate; adds operational complexity

Axiom
- Modern log management platform; excellent cost efficiency for high-volume log ingestion
- Cloudflare native integration (official Cloudflare partner)
- Weaker APM - would need to be combined with a metrics/tracing tool (e.g. Grafana for metrics)
- Best suited if Simpaisa separates log management from APM

Timeline¶

Months 2–3 (May–June 2026): Evaluation
- RFP or structured evaluation conducted for all four candidates
- Datadog usage profiled: log volume, APM instrumented services, custom metrics, dashboard count, active users
- Each candidate evaluated against scoring criteria
- Recommendation and business case presented to CDO by end of Month 3

Months 4–5 (July–August 2026): Proof of Concept
- Winning candidate deployed in parallel with Datadog for one engineering team
- Cloudflare log integration tested
- Alert parity validated: all existing Datadog alerts recreated in new platform
- Cost modelled on actual ingestion during POC

Months 6–9 (September–December 2026): Migration
- All teams migrated to new platform
- Datadog contract terminated or scaled back to minimum
- CyGlass security data integrated into new platform or retained separately
- CloudWatch alerts decommissioned or replaced
- Final cost comparison reported to CDO and Finance

Investment Requirements¶

Scenario	Estimated Annual Cost	vs. Datadog
Grafana Cloud	USD 60,000–100,000	Saving 40–60%
New Relic	USD 120,000–180,000	Broadly similar
Self-hosted ELK	USD 40,000–70,000 (infra) + 1 FTE	Saving 30–50% before headcount
Axiom + Grafana	USD 50,000–80,000	Saving 45–55%
Current Datadog (baseline)	USD 150,000–200,000 (estimated)	Baseline

Estimates are illustrative pending vendor audit (1.6) confirming actual Datadog spend.

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Migration causes observability gap and delayed incident detection	Medium	High	Parallel running for minimum 60 days before Datadog decommission
Chosen platform underperforms on APM for Java/Spring services	Medium	High	POC explicitly tests Java APM instrumentation
PCI DSS log retention requirements not met	Low	Critical	Compliance review of candidate platforms before selection

Workstream 4: Data and Analytics¶

Current State¶

Simpaisa has operational databases per product and market: MySQL and PostgreSQL for transactional data, MongoDB and DocumentDB for document storage, Redis for caching. Reporting is produced from Looker and Metabase connected directly to operational databases - a practice that introduces query performance risk and lacks a governed data model. There is no data warehouse, no dbt or equivalent transformation layer, and no self-service analytics capability. Corridor economics (revenue, volume, margin per corridor) cannot be reported in real time. There is no data platform capable of supporting ML model training.

Target State¶

A cloud data warehouse (Snowflake or BigQuery) sits at the centre of the analytics stack. Operational data is replicated via event streams (Kafka already in place) or CDC (change data capture) into the warehouse. A dbt transformation layer produces clean, governed, business-aligned data models. Metabase or Looker connects to the warehouse (not to operational databases). Self-service analytics is available to product, finance, and operations teams. Real-time corridor economics are available as a live dashboard. The data platform is ML-ready: clean, versioned, documented data is available for model training.

Recommended Stack¶

Ingest: Kafka (already deployed) for real-time event streaming; AWS DMS or Debezium for CDC from MySQL/PostgreSQL
Warehouse: Snowflake (preferred for multi-cloud neutrality and ease of operations) or BigQuery (preferred if GCP is acceptable alongside AWS)
Transformation: dbt Core (open source) or dbt Cloud (managed)
BI: Metabase (already in use - connect to warehouse rather than operational DB) or Looker (if budget permits)
Orchestration: Apache Airflow (self-hosted) or Astronomer (managed Airflow)

Timeline¶

Month 3 (June 2026): Data Warehouse POC
- Snowflake or BigQuery trial account provisioned
- One data domain selected for POC (recommended: transaction and corridor data)
- Kafka-to-warehouse streaming pipeline built for POC domain
- dbt models built for corridor economics
- Metabase connected to warehouse; corridor economics dashboard delivered
- POC reviewed by CDO, CPO, and Finance

Month 6 (September 2026): Production Data Warehouse
- Full production data warehouse deployed
- All operational databases replicated into warehouse (CDC or event streaming)
- dbt transformation layer in production covering: transactions, corridors, products, customers, and financial settlement
- Metabase (or Looker) connected to warehouse; operational databases removed from direct BI connections
- Data governance framework published: ownership, quality standards, access controls

Month 9 (December 2026): Self-Service Analytics
- Self-service analytics enabled for product, finance, and operations teams
- Training conducted: how to build reports in Metabase without engineering support
- Data catalogue published (dbt documentation as the source of truth)
- Real-time corridor economics dashboard live and used in weekly ELT review
- Finance team producing regulatory reports from warehouse (not from operational DBs)

Month 12 (March 2027): ML-Ready Data Platform
- Feature store established for ML model training (Workstream 5 dependency)
- Data quality monitoring in place: dbt tests run daily, failures alert to data engineering team
- Historical data (minimum 2 years) clean and accessible for fraud detection model training
- Data platform cost reviewed and optimised

Investment Requirements¶

Item	Estimated Annual Cost
Snowflake (production, assuming current data volumes)	USD 50,000–100,000
dbt Cloud (if managed)	USD 15,000–30,000
Airflow / Astronomer (orchestration)	USD 10,000–25,000
Data engineering hires (1 senior + 1 mid)	USD 140,000–200,000 (salary + benefits, Dubai/remote)
Metabase Pro (if not already licensed)	USD 10,000–20,000
Total (Year 1)	USD 225,000–375,000

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Data quality issues in operational databases corrupt warehouse	High	High	CDC with validation; dbt tests as quality gate; data quality remediation sprint before warehouse launch
Snowflake costs exceed budget as data volume grows	Medium	Medium	Consumption monitoring from day one; Snowflake credit alerts; query optimisation training
Data governance not adopted by product teams	Medium	Medium	CDO mandates warehouse as single source of truth; BI tools decoupled from operational DBs
Regulatory data handling requirements not met in warehouse	Low	High	Legal and CISO review data residency for warehouse; Snowflake supports regional data residency

Dependencies¶

Workstream 2 (SRE): data pipeline reliability requires SRE practices applied to data engineering
Workstream 5 (AI/ML): ML platform depends on clean, warehouse-accessible data
Workstream 1 (Cloudflare): Cloudflare log data should be routed to warehouse for analytics

Workstream 5: AI and ML for Operations¶

Current State¶

Simpaisa's operations are largely manual at the exception-handling layer. Reconciliation is performed manually: transaction records from payment rails, banking partners, and internal systems are matched by operations staff. Rule-based fraud detection exists but generates a high false-positive rate and misses novel patterns. Settlement is managed manually, with limited cash flow visibility. The cost of manual operations is material and scales linearly with transaction volume - a model that is incompatible with the growth trajectory into Saudi Arabia and MENA.

Target State¶

Reconciliation is automated: machine learning models match 95%+ of transactions automatically, with exceptions queued for human review. Fraud detection uses ML scoring on real-time transaction events, trained on Simpaisa's own historical data, with the false-positive rate materially lower than the current rule-based system. Settlement optimisation uses forecasting models to minimise idle float and reduce settlement costs. The operations team shifts from manual processing to exception management and model oversight.

Phase 1: Reconciliation Automation (Months 4–7)¶

Business Case: Reconciliation is the highest operational cost and the highest-ROI candidate for automation. Manual reconciliation staff cost is estimated at X FTE per Y transactions per day. A 95%+ auto-match rate would leave no more than 5% of transactions as exceptions, reducing this to roughly X/20 FTE for exception handling. The model trains on structured, deterministic data (transaction IDs, amounts, timestamps, reference numbers) - high feasibility, low regulatory risk.

Deliverables:
- Month 4: reconciliation data audit - all sources, formats, match rates, exception categories documented
- Month 5: ML model designed and trained on 12 months of historical reconciliation data
- Month 6: POC deployed in shadow mode - model runs alongside manual process; accuracy benchmarked
- Month 7: production deployment - automated reconciliation for one corridor (Pakistan); exceptions queue managed by reduced operations team
- Month 9: expanded to all corridors; 95%+ auto-match target validated in production

Stack: Python (scikit-learn or LightGBM for matching model), Kafka for real-time transaction event consumption, PostgreSQL or warehouse for reconciliation data, internal API for exception queue management.
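
Before any ML model is trained, a deterministic baseline is useful for measuring how much of the volume matches on exact fields alone. The sketch below is that baseline, not the ML matching model described above. The record shape (`reference`, `amount`, `timestamp` keys), the function name, and the five-minute tolerance are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def reconcile(internal, external, tolerance=timedelta(minutes=5)):
    """Match records on (reference, amount) within a timestamp tolerance.

    Each record is a dict with 'reference', 'amount' and 'timestamp' keys.
    Returns (matched_pairs, exceptions); exceptions go to human review.
    """
    by_key = defaultdict(list)
    for rec in external:
        by_key[(rec["reference"], rec["amount"])].append(rec)

    matched, exceptions = [], []
    for rec in internal:
        candidates = by_key[(rec["reference"], rec["amount"])]
        hit = next(
            (c for c in candidates
             if abs(c["timestamp"] - rec["timestamp"]) <= tolerance),
            None,
        )
        if hit is None:
            exceptions.append(rec)
        else:
            candidates.remove(hit)  # each external record matches only once
            matched.append((rec, hit))
    # External records that nothing claimed are exceptions as well.
    exceptions.extend(c for rest in by_key.values() for c in rest)
    return matched, exceptions
```

The share of records landing in `matched` gives a baseline auto-match rate against which the ML model's improvement can be benchmarked during shadow mode. Amounts are best held as integers in minor units (or `Decimal`) so that equality comparison is exact.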

Phase 2: Fraud Detection (Months 9–12)¶

Business Case: Rule-based fraud detection is brittle and generates operational overhead from false positives. ML fraud scoring on real-time transaction events, trained on Simpaisa's own labelled fraud data, will improve detection accuracy and reduce false positives. This protects revenue, reduces chargeback costs, and supports regulatory compliance.

Deliverables:
- Month 9: fraud data audit - historical fraud labels, transaction features, class imbalance assessment
- Month 10: baseline ML model trained (XGBoost or LightGBM); offline accuracy benchmarked against rule-based baseline
- Month 11: shadow mode deployment - ML score computed alongside rule engine; no automated blocking yet
- Month 12: production deployment for low-risk corridors - ML score gates transactions above risk threshold
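
Shadow mode in Month 11 amounts to recording what the ML score would have decided while the rule engine continues to make the real decision. A minimal sketch of that comparison follows. The function name, the caller-supplied `ml_score` and `rule_blocks` callables, and the 0.8 threshold are hypothetical placeholders.

```python
def shadow_compare(events, ml_score, rule_blocks, threshold=0.8):
    """Tally agreement between a rule engine and an ML score in shadow mode.

    The ML score is only recorded; the rule engine's decision is the one
    that takes effect. The 0.8 threshold is an arbitrary placeholder.
    """
    tally = {"both_block": 0, "rule_only": 0, "ml_only": 0, "both_allow": 0}
    for event in events:
        ml_flag = ml_score(event) >= threshold
        rule_flag = rule_blocks(event)
        if ml_flag and rule_flag:
            tally["both_block"] += 1
        elif rule_flag:
            tally["rule_only"] += 1
        elif ml_flag:
            tally["ml_only"] += 1
        else:
            tally["both_allow"] += 1
    return tally
```

The `rule_only` bucket is where candidate false positives of the current system would surface, and `ml_only` is where potentially missed fraud would appear; both need review against labelled outcomes before the score is allowed to gate transactions.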

Dependencies: Workstream 4 (Data) - ML-ready data platform required; minimum 12 months of labelled fraud data; data privacy review (PDPA compliance in Pakistan; GDPR-equivalent in UAE/DIFC).

Phase 3: Settlement Optimisation (Months 12–18)¶

Business Case: Settlement timing decisions are currently made manually, and cash flow forecasting is limited. ML-based settlement optimisation will minimise idle float across corridors, reduce overdraft fees from banking partners, and improve liquidity management.

Deliverables:
- Month 12: settlement data model built in warehouse; corridor cash flow history analysed
- Month 14: forecasting model built (ARIMA or Prophet baseline; LSTM if data permits)
- Month 16: settlement recommendation engine deployed - suggests optimal settlement timing per corridor
- Month 18: closed-loop optimisation - system executes settlement decisions within pre-approved parameters

Investment Requirements¶

Item	Estimated Cost
ML engineering hire (1 senior ML engineer)	USD 100,000–140,000 per annum
Data infrastructure for ML (feature store, model registry)	USD 20,000–40,000 per annum
Cloud compute for model training (GPU instances, spot pricing)	USD 10,000–30,000 per annum
MLflow or equivalent for model lifecycle management	USD 5,000–15,000 per annum
Total (Year 1)	USD 135,000–225,000

Expected ROI:
- Reconciliation automation: estimated saving of 3–5 FTE operations headcount (USD 60,000–120,000 per annum at local market rates, depending on geography)
- Fraud detection improvement: estimated reduction in chargeback and fraud losses (to be quantified once fraud data audit is complete)
- Settlement optimisation: estimated reduction in idle float costs (to be quantified once treasury data is accessible)

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Insufficient historical data for model training	Medium	High	Data audit in Month 4 establishes feasibility before engineering investment; fallback to rule-based automation if data insufficient
Model deployed without adequate human oversight (regulatory risk)	Medium	Critical	All models deployed in shadow mode for minimum 60 days; human review of all exceptions; model decisions logged and auditable
Data privacy breach from ML training on customer transaction data	Low	Critical	Data privacy review by CISO and Legal before any model training; pseudonymisation where required
Operations team resistance to automation (change management)	Medium	Medium	Change programme led by CDO and CPO; operations team repositioned as exception management specialists

Dependencies¶

Workstream 4 (Data): ML-ready data warehouse is a hard dependency for Phase 2 and 3
Workstream 2 (SRE): ML pipelines subject to SRE SLOs - model drift detection treated as a reliability concern

Workstream 6: Platform Scalability¶

Current State¶

Simpaisa's core platform is composed of services built in Java/Spring, Node.js, Python, and PHP. Several services have characteristics of monolithic design - large codebases with broad responsibilities, tightly coupled to specific databases. Horizontal scaling is achieved via EC2 Auto Scaling Groups. Kafka is in place for asynchronous messaging but is not uniformly adopted across all services. There is no API gateway layer consolidating external-facing APIs. Database scaling relies on vertical scaling (larger instance types) and read replicas rather than sharding or multi-region replication. There is no active-active disaster recovery posture.

Target State¶

The platform is decomposed into well-bounded services where decomposition is justified by scale, team autonomy, or reliability requirements - not for its own sake. Event-driven architecture is the default for asynchronous operations (Kafka). An API gateway consolidates all external-facing APIs with consistent authentication, rate limiting, and observability. Databases for high-volume corridors are designed for horizontal scale (sharding or managed scaling). Critical path services operate in active-active configuration across at least two AWS regions, with Cloudflare routing arbitrating between them (Workstream 1 dependency).

Milestones¶

Month 2 (May 2026): Architecture Review
- Full service map produced: all services, their languages, their databases, their Kafka topic subscriptions, and their external integrations
- Coupling analysis: identify the highest-coupled, highest-risk services
- Decomposition candidates identified: services where decomposition would meaningfully improve scalability, deployability, or team autonomy
- API surface audit: all external APIs catalogued; redundant or inconsistent API endpoints identified
- Database scaling risk assessment: identify databases approaching capacity limits or with problematic query patterns
- Architecture review document presented to CDO and CTO

Month 5 (August 2026): First Service Extraction
- One high-value service extraction completed: recommended candidate is the reconciliation service (which Workstream 5 is also touching) or the notification service
- Extracted service has its own repository, its own deployment pipeline (daily deploy), its own SLO, and its own on-call rotation
- Kafka integration standardised for the extracted service
- Lessons learned documented for subsequent extractions

Month 8 (November 2026): Active-Active for Critical Path
- Pay-In and Pay-Out services operating in active-active configuration across two AWS regions for Pakistan
- Cloudflare routing directing traffic between regions based on health (Workstream 1 dependency)
- Failover tested: simulated region failure with recovery time under 5 minutes and no data loss
- RTO and RPO targets defined and validated: recommended RTO < 5 minutes, RPO = 0 for payment transactions

Month 12 (March 2027): API Gateway Consolidation
- API gateway deployed (AWS API Gateway, Kong, or Cloudflare API Gateway - decision in architecture review)
- All external-facing APIs routed through gateway
- Consistent authentication (JWT or OAuth 2.0), rate limiting, and request logging applied at gateway layer
- API versioning strategy published and enforced

Month 18 (September 2027): Database Sharding for High-Volume Corridors
- Pakistan transaction database sharding strategy implemented for MySQL (or migration to a horizontally scalable alternative)
- Sharding tested at 3x current peak volume
- Database cost optimised: right-sized instances, reserved instances purchased
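
One common way to route rows to shards is a stable hash of a shard key. The sketch below illustrates the idea only: the choice of shard key (for example a merchant or account identifier) and the function name are assumptions, and the actual strategy is to be decided in the sharding design.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be positive")
    # sha256 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Simple modulo placement remaps most keys whenever `shard_count` changes, so a design that expects to add shards over time would typically consider consistent hashing or a lookup-table approach instead.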

Month 24 (March 2028): Platform Maturity Review
- End-of-roadmap architecture review
- Actual vs. target state assessed
- Remaining decomposition candidates prioritised for Year 3 roadmap
- Platform capable of supporting 10x current transaction volume without architectural change

Investment Requirements¶

Item	Estimated Annual Cost
Additional cloud infrastructure for active-active (second region)	USD 80,000–150,000
API gateway platform (if Kong or commercial)	USD 20,000–40,000
Senior platform/backend engineer hires (2 FTE)	USD 160,000–240,000
Architecture consultancy (if used for decomposition design)	USD 30,000–60,000 (one-time)

Risk Register¶

Risk	Likelihood	Impact	Mitigation
Service decomposition creates distributed systems complexity without commensurate benefit	High	Medium	Decomposition is justified per service with a documented business case; not pursued as dogma
Active-active introduces data consistency issues for financial transactions	Medium	Critical	Consistency model designed by senior engineers; tested exhaustively in staging; financial transaction writes use synchronous replication
API gateway becomes a single point of failure	Low	High	API gateway deployed in active-active configuration; Cloudflare provides edge failover
PHP legacy services difficult to integrate with modern architecture	Medium	Medium	PHP services isolated behind internal APIs; migration deprioritised unless they become a blocker

Dependencies¶

Workstream 1 (Cloudflare): active-active architecture requires Cloudflare routing for global load balancing
Workstream 2 (SRE): each extracted service requires SLO definition and error budget before production
Workstream 4 (Data): service extraction may require database separation - needs data engineering input

Cross-Workstream Dependencies¶

The following matrix shows where workstreams have hard dependencies on each other. A blocked dependency means the downstream workstream cannot proceed until the upstream milestone is complete.

Upstream Workstream and Milestone	Upstream Timing	Downstream Workstream	Dependency Type
WS1 (Cloudflare) Phase 1 POC	Q3 2026	WS6 Active-Active (Month 8)	Hard - routing requires Cloudflare
WS2 (SRE) SLO Definition	Month 1	WS1 Production Migration	Hard - need reliability baseline before migration
WS2 (SRE) Daily Deploy	Month 4	WS6 Service Extraction	Enabling - extracted services need daily deploy pipeline
WS3 (Observability) Selection	Month 3	WS1 Cloudflare Migration	Hard - Cloudflare logs must integrate with chosen platform
WS4 (Data) Warehouse Production	Month 6	WS5 Fraud ML (Month 9)	Hard - ML training requires warehouse
WS4 (Data) ML-Ready Platform	Month 12	WS5 Settlement Optimisation	Hard - forecasting requires clean historical data
WS6 (Platform) Architecture Review	Month 2	WS6 Service Extraction	Hard - extraction targets defined in review
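
The hard dependencies in the matrix can be checked mechanically for ordering and for accidental cycles. The sketch below uses the standard-library `graphlib` module (Python 3.9+). The milestone labels are shortened from the table, the "Enabling" row is deliberately left out, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter

# downstream milestone -> set of upstream milestones (hard dependencies only)
HARD_DEPENDENCIES = {
    "WS6 Active-Active": {"WS1 Phase 1 POC"},
    "WS1 Production Migration": {"WS2 SLO Definition"},
    "WS1 Cloudflare Migration": {"WS3 Selection"},
    "WS5 Fraud ML": {"WS4 Warehouse Production"},
    "WS5 Settlement Optimisation": {"WS4 ML-Ready Platform"},
    "WS6 Service Extraction": {"WS6 Architecture Review"},
}


def execution_order(deps):
    """Return milestones in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())
```

Re-running this whenever a dependency is added to the matrix gives an early warning if two workstreams end up blocking each other.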

Aggregate Investment Summary¶

The following table provides estimated investment requirements by workstream and year. All figures are estimates and will be refined as Phase 1 assessments (vendor audit, technology assessment) are completed.

Workstream	Year 1 (Jul 2026–Jun 2027)	Year 2 (Jul 2027–Jun 2028)	Notes
WS1: Cloudflare Migration	USD 85,000–135,000	USD 55,000–90,000	Includes Cloudflare contract; offset by AWS WAF/CDN savings
WS2: SRE and Deployment	USD 115,000–185,000	USD 80,000–130,000	Includes SRE hire, tooling, and on-call platform
WS3: Observability Replacement	USD 60,000–100,000	USD 60,000–100,000	Year 1 includes migration cost; Year 2 is steady-state
WS4: Data and Analytics	USD 225,000–375,000	USD 180,000–280,000	Largest investment; 2 data engineering hires
WS5: AI/ML for Operations	USD 135,000–225,000	USD 100,000–160,000	1 ML engineer + infrastructure; offset by ops FTE saving
WS6: Platform Scalability	USD 290,000–490,000	USD 200,000–340,000	Active-active infra + 2 platform engineers
Total	USD 910,000–1,510,000	USD 675,000–1,100,000
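
The totals row can be reproduced from the per-workstream ranges. The short check below copies the figures from the table above (in USD thousands); the variable and function names are illustrative.

```python
# (low, high) in USD thousands, copied from the table above.
YEAR_1 = {
    "WS1": (85, 135), "WS2": (115, 185), "WS3": (60, 100),
    "WS4": (225, 375), "WS5": (135, 225), "WS6": (290, 490),
}
YEAR_2 = {
    "WS1": (55, 90), "WS2": (80, 130), "WS3": (60, 100),
    "WS4": (180, 280), "WS5": (100, 160), "WS6": (200, 340),
}


def total_range(ranges):
    """Sum the low and high ends of each workstream range."""
    lows, highs = zip(*ranges.values())
    return sum(lows), sum(highs)
```

Keeping the ranges in one place like this makes it straightforward to re-derive the totals as Phase 1 assessments refine individual workstream estimates.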

Notes on Investment:
- Year 1 total is front-loaded due to one-time migration and hiring costs
- Savings from observability replacement (WS3) and operations automation (WS5) partially offset investment in Years 1–2
- Headcount costs assume Dubai-equivalent market rates; regional hiring (Pakistan, Bangladesh) would reduce total materially
- All figures are before any cost savings from AWS right-sizing and reserved instance purchases (3.2 in the 90-day plan); estimated saving of USD 150,000–300,000 annually once optimisation is complete

Headcount Plan¶

Role	Workstream	Timing	Location
SRE Lead	WS2	Q3 2026	Dubai or remote
Data Engineer (Senior)	WS4	Q3 2026	Dubai or remote
Data Engineer (Mid)	WS4	Q4 2026	Dubai or remote
ML Engineer (Senior)	WS5	Q4 2026	Dubai or remote
Platform Engineer (Senior) x2	WS6	Q3–Q4 2026	Dubai or remote
Saudi/MENA Tech Lead	WS1/WS6	Q1 2027	Riyadh
Saudi DevOps/SRE Engineer	WS2	Q1 2027	Riyadh
Saudi Security Lead	WS3/Security	Q1 2027	Riyadh

Total net new headcount (Year 1): 7–8 FTE
Total net new headcount (Year 2): 3–4 FTE (Saudi expansion team)

Quarterly Review Cadence¶

This roadmap is reviewed at each quarterly board meeting. The CDO presents:

Progress against milestones for the current quarter
Actuals vs. estimated investment spend
Risk register update: new risks, closed risks, escalations
Proposed adjustments to the roadmap for the next quarter
Any investment requests above the approved envelope

The roadmap is also reviewed monthly at ELT level by the CDO with the CTO and CPO.

End of Document

Document Classification: Confidential - ELT and Board Only
Version: 1.0
Date: 3 April 2026
Author: Daniel O'Reilly, Chief Digital Officer, Simpaisa Group