Platform Team Charter

Simpaisa Holdings — Technology Group

| Field | Value |
| --- | --- |
| Owner | Daniel O'Reilly, Chief Digital Officer |
| Status | Draft |
| Created | 2026-04-03 |
| Last Updated | 2026-04-03 |
| Review Cadence | Quarterly |

1. Purpose & Vision

Why a Platform Team Exists

Simpaisa processes over 270 million transactions annually, worth more than $1 billion, across five markets (Pakistan, Bangladesh, Nepal, Iraq, and Egypt). As the organisation transitions from a monolithic Java/Spring Boot architecture to a polyglot microservices estate, the complexity of infrastructure, tooling, and operational concerns will grow beyond what individual product teams can reasonably manage alongside their domain work.

A dedicated platform team exists to absorb that complexity — providing a curated, opinionated, self-service platform that product teams consume rather than build. This reduces cognitive load, eliminates duplicated effort, and ensures consistency in observability, security, and deployment practices across every product line.

Vision

Enable product teams to ship faster and more safely through self-service infrastructure, world-class developer tooling, and an agentic AI-augmented software development lifecycle.

The platform team's north star is developer velocity with production confidence: every product engineer should be able to go from idea to production with minimal friction, while the platform guarantees observability, security, and operational excellence by default.


2. Mission Statement

The Platform Team's mission is to:

  1. Build and maintain the shared platform that all product teams build upon — infrastructure, tooling, pipelines, and runtime services.
  2. Reduce cognitive load on product teams by providing paved roads, sensible defaults, and self-service capabilities so they can focus on domain problems.
  3. Enable the agentic AI SDLC across all engineering — making Claude Code, Beads, Engram, and autonomous agent workflows first-class citizens of the development process.
  4. Operate with a product mindset — treating internal developers as customers, measuring satisfaction, and iterating on the platform based on feedback.

3. Organisational Context

CDO Organisation Structure

Chief Digital Officer (Daniel O'Reilly)
├── Product Group
│   ├── Pay-Ins Team
│   ├── Pay-Outs Team
│   ├── Remittances Team
│   └── Cards Team
├── Technology Group
│   ├── Platform Team  ← this charter
│   ├── Portal Team (Iqbal Butt — Dev Manager, Selina Wilson — Portal Lead)
│   └── Engineering (Java/Spring Boot — Karachi)
├── Security Group
│   ├── CISO (Danish)
│   ├── InfoSec Manager (Hamza)
│   ├── Cloud Security Engineer (Khizer)
│   ├── SOC Lead (Khubaib)
│   ├── SOC Analyst (Zain)
│   └── Asst Manager InfoSec (Kamran)
└── Data Group

Where the Platform Team Sits

The Platform Team sits within the Technology Group under the CDO. It is a horizontal function that serves all product teams and collaborates closely with Security and Data.

Key Relationships

| Relationship | Nature |
| --- | --- |
| Product Teams | Primary customers — consume platform services |
| Security Group | Close collaboration on compliance, identity, secrets, threat modelling |
| Data Group | Collaborate on data infrastructure, analytics pipelines, observability |
| Portal Team | Consumer of platform services; migration partner (legacy → new stack) |
| CDO | Accountable executive; architecture decisions escalate here |

4. Team Topology

The Platform Team follows the Team Topologies model (Skelton & Pais) to define clear team types and interaction modes.

Team Types

| Team | Type | Focus |
| --- | --- | --- |
| Platform | Platform | Shared infrastructure, tooling, developer experience |
| Pay-Ins | Stream-Aligned | Payment acceptance and merchant integration |
| Pay-Outs | Stream-Aligned | Disbursements and settlement |
| Remittances | Stream-Aligned | Cross-border money transfer |
| Cards | Stream-Aligned | Card issuance and management |
| Portal | Stream-Aligned | Merchant and admin portal |
| Security/Compliance | Complicated-Subsystem | Security architecture, compliance, SOC |

Interaction Modes

| From → To | Mode | Description |
| --- | --- | --- |
| Platform → Product Teams | X-as-a-Service | Platform provides self-service capabilities |
| Platform → Product Teams | Facilitating | Platform coaches teams on new tools, patterns, stack |
| Platform → Security | Collaborating | Joint work on identity, secrets, compliance tooling |
| Platform → Data | Collaborating | Joint work on data infrastructure and pipelines |
| Product Teams → Platform | X-as-a-Service | Product teams consume platform APIs and tooling |

Thin Interaction Layer

The Platform Team exposes its capabilities through:

  • Self-service APIs and CLIs
  • Golden path templates (service scaffolding, pipeline configs)
  • Documentation on the developer portal
  • Beads issues for requests that require human involvement


5. Platform Team Scope

Owns (Builds, Operates, Supports)

The Platform Team has full ownership of the following. "Owns" means the team builds, deploys, monitors, maintains, and is on-call for these systems.

| Domain | Components |
| --- | --- |
| API Gateway | KrakenD — configuration, deployment, monitoring, rate limiting |
| Observability | OpenTelemetry, Grafana, Jaeger, OpenSearch, PostHog |
| CI/CD Pipelines | Build, test, deploy pipelines for all services |
| Infrastructure as Code | Terraform/Pulumi modules, environment provisioning |
| Developer Tooling | Claude Code configuration, Beads (bd), Engram MCP setup |
| Identity & Access | ControlPlane.com — service identity, RBAC, policy enforcement |
| Edge Infrastructure | Cloudflare — DNS, CDN, WAF, Workers, R2 |
| Database Infrastructure | SurrealDB clusters, MySQL operations, backup, failover |
| Messaging Infrastructure | NSQ — deployment, topic management, monitoring |
| Search Infrastructure | Meilisearch — cluster management, index lifecycle |
| Workflow Engine | Temporal — server operations, namespace management, SDK support |
| SDK Generation Pipeline | OpenAPI → client SDK generation and publication |
| Developer Portal / DevEx | Internal documentation site, onboarding guides, golden paths |
| Secret Management | Vault/secrets infrastructure, rotation policies |
| Container Orchestration | Kubernetes/container runtime, service mesh, scaling policies |

Supports (Consults, Reviews — Does Not Own)

| Domain | Platform Team Role |
| --- | --- |
| Product API Design | Review against API-STANDARDS.md; advisory |
| Security Architecture | Collaborate with Security team; implement tooling they specify |
| Data Architecture | Collaborate with Data team on infrastructure needs |
| Performance Optimisation | Profiling tools, load testing infrastructure, advisory |

Does NOT Own

The following are explicitly outside the Platform Team's scope:

  • Product business logic — owned by stream-aligned product teams
  • Merchant relationships — owned by commercial and product teams
  • Payment channel integrations — owned by product teams (Pay-Ins, Pay-Outs)
  • Compliance policy — owned by Security/Compliance (CISO organisation)
  • Data models and schemas — owned by product teams; platform provides the infrastructure only
  • Product roadmap prioritisation — owned by Product Group

6. Service Catalogue

| # | Service | Description | SLA (Availability) | Response Time | Model | Docs |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | API Gateway (KrakenD) | Central gateway for all APIs — routing, auth, rate limiting | 99.95% | — | Self-Service | TBC |
| 2 | Observability Stack | Metrics, traces, logs — OpenTelemetry, Grafana, Jaeger | 99.9% | — | Self-Service | TBC |
| 3 | CI/CD Pipelines | Build, test, deploy automation for all services | 99.9% | 4 hrs for changes | Self-Service | TBC |
| 4 | Infrastructure Provisioning | New environments, services, databases | — | 24 hrs | Supported | TBC |
| 5 | Developer Tooling | Claude Code, Beads, Engram setup and configuration | — | 4 hrs | Self-Service | TBC |
| 6 | Identity & Access | Service accounts, RBAC policies via ControlPlane.com | 99.95% | 4 hrs | Supported | TBC |
| 7 | Edge / CDN (Cloudflare) | DNS, CDN, WAF, Workers | 99.99% (provider) | 2 hrs | Self-Service | TBC |
| 8 | Database Operations | SurrealDB clusters, MySQL — provisioning, backup, failover | 99.95% | 1 hr (critical) | Supported | TBC |
| 9 | Messaging (NSQ) | Topic/channel provisioning, monitoring | 99.9% | 4 hrs | Self-Service | TBC |
| 10 | Search (Meilisearch) | Index lifecycle, cluster management | 99.9% | 4 hrs | Self-Service | TBC |
| 11 | Workflow Engine (Temporal) | Namespace management, worker deployment, SDK support | 99.9% | 4 hrs | Supported | TBC |
| 12 | SDK Generation | Client SDK generation from OpenAPI specs | — | Automated | Self-Service | TBC |
| 13 | Developer Portal | Internal docs, onboarding, golden path templates | 99.9% | — | Self-Service | TBC |
| 14 | Secret Management | Secret provisioning, rotation, access policies | 99.95% | 2 hrs | Supported | TBC |
| 15 | Container Orchestration | Kubernetes clusters, scaling, service mesh | 99.95% | 1 hr (critical) | Supported | TBC |
| 16 | Security Tooling | SAST/DAST pipeline integration, dependency scanning | 99.9% | 4 hrs | Self-Service | TBC |

Model definitions:

  • Self-Service — product teams use directly via CLI, API, or portal with no platform team involvement for standard operations.
  • Supported — product teams raise a request (Beads issue); platform team provisions or assists within the stated response time.


7. SLAs to Product Teams

Request Response Times

| Request Type | Target Response Time | Notes |
| --- | --- | --- |
| Infrastructure provisioning | 24 hours | New environments, services, databases |
| CI/CD pipeline changes | 4 hours | Modifications to existing pipelines |
| New CI/CD pipeline (new service) | 2 business days | Full pipeline for a new microservice |
| Security review (platform changes) | 3 business days | Coordinated with Security team |
| New service onboarding | 3 business days | Full golden-path setup |
| SDK regeneration (API spec change) | Automated | Triggered on OpenAPI spec merge |
| Secret provisioning | 4 hours | New secrets, rotation |
| Gateway route changes | 2 hours | KrakenD configuration updates |
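The automated SDK regeneration entry above can be illustrated with a minimal trigger check — a hedged sketch only, since the charter does not specify how the pipeline detects spec changes; the function name and the version-comparison rule are assumptions, not the actual pipeline logic:

```python
import json

def should_regenerate(spec_json: str, last_published_version: str) -> bool:
    """Decide whether a merged OpenAPI spec warrants SDK regeneration.

    Illustrative rule (an assumption): regenerate whenever the spec's
    info.version differs from the last published SDK version.
    """
    spec = json.loads(spec_json)
    return spec["info"]["version"] != last_published_version

# Hypothetical spec document for illustration.
spec = json.dumps({"openapi": "3.0.3",
                   "info": {"title": "Pay-Ins API", "version": "1.4.0"}})

print(should_regenerate(spec, "1.3.2"))  # True — version bumped, regenerate
print(should_regenerate(spec, "1.4.0"))  # False — unchanged, skip
```

In practice the real pipeline might key off the merge event itself or a content hash rather than `info.version`; the point is only that the trigger is mechanical, requiring no platform-team involvement.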

Incident Response (Platform Issues)

| Severity | Definition | Response Time | Resolution Target |
| --- | --- | --- | --- |
| P1 | Platform outage affecting production traffic | 15 minutes | 1 hour |
| P2 | Degraded platform service, partial impact | 30 minutes | 4 hours |
| P3 | Non-critical platform issue, workaround available | 2 hours | 1 business day |
| P4 | Cosmetic or minor issue, no business impact | 1 business day | 5 business days |

Availability Targets

| Tier | Target | Applies To |
| --- | --- | --- |
| Tier 1 | 99.95% | API Gateway, Identity, Databases, Containers |
| Tier 2 | 99.9% | Observability, CI/CD, Messaging, Search |
| Tier 3 | 99.5% | Developer Portal, Tooling, SDK Pipeline |
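These percentages translate directly into monthly downtime budgets. A quick sketch of the arithmetic, assuming a 30-day month (the charter does not state the measurement window, so the window is an assumption):

```python
# Convert an availability target into an allowed downtime budget.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in an assumed 30-day month

def downtime_budget_minutes(target_pct: float) -> float:
    """Minutes of allowed downtime per 30-day month for a given target."""
    return MINUTES_PER_MONTH * (1 - target_pct / 100)

for tier, target in [("Tier 1", 99.95), ("Tier 2", 99.9), ("Tier 3", 99.5)]:
    print(f"{tier} ({target}%): {downtime_budget_minutes(target):.1f} min/month")
# Tier 1 (99.95%): 21.6 min/month
# Tier 2 (99.9%): 43.2 min/month
# Tier 3 (99.5%): 216.0 min/month
```

In other words, a single Tier 1 incident that exceeds the P1 resolution target of 1 hour would consume roughly three months of error budget.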

8. Agentic AI SDLC Model

Simpaisa is pioneering an agentic AI-first software development lifecycle. The Platform Team is responsible for enabling and operationalising this model across all engineering teams.

Core Toolchain

| Tool | Purpose | Owner |
| --- | --- | --- |
| Claude Code | Primary development tool — code generation, review, debugging, documentation | Platform |
| Beads (bd) | Issue tracking across all teams — local-first, Dolt-backed | Platform |
| Engram MCP | Persistent cross-session memory for AI agents — SQLite/WAL | Platform |

AI-Augmented Workflows

| Workflow | Automation Level | Human Involvement |
| --- | --- | --- |
| Code generation | AI-primary | Review and approval |
| Code review | AI-assisted | Human reviewer with AI analysis |
| Test generation | AI-primary | Human validation of coverage |
| Documentation | AI-primary | Human review for accuracy |
| Incident triage | AI-assisted | Human decision-making |
| Dependency updates | Autonomous agent | Human approval for major versions |
| SDK regeneration | Fully automated | None (triggered by spec change) |
| Security scanning | Autonomous agent | Human review of findings |
| Infrastructure provisioning | AI-assisted | Human approval via PR |

Human-in-the-Loop Gates

The following activities require human decision-making regardless of AI capability:

  • Architecture decisions (captured as ADRs in ADR/ folder)
  • Security reviews and threat modelling
  • Compliance and regulatory decisions
  • Production deployment approval (for critical services)
  • Incident escalation beyond P3
  • Budget and vendor decisions
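One way to make these gates enforceable rather than aspirational is to encode them as data that agent tooling consults before acting. The sketch below is illustrative only — the activity names and the function are assumptions, not an existing platform API:

```python
# Activities that always require human sign-off, per the charter's gates.
# Names are hypothetical identifiers an agent harness might use.
HUMAN_GATES = {
    "architecture_decision",
    "security_review",
    "compliance_decision",
    "critical_production_deploy",
    "incident_escalation_above_p3",
    "budget_or_vendor_decision",
}

def requires_human(activity: str) -> bool:
    """Return True if an AI agent must hand this activity off to a human."""
    return activity in HUMAN_GATES

print(requires_human("security_review"))    # True — always gated
print(requires_human("dependency_update"))  # False — agent may proceed
```

Keeping the gate list as data means the guardrails can be audited and extended without changing agent code.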

Agent-Platform Interaction Model

AI agents (Claude Code sessions) interact with platform services through:

  1. Beads — agents create, update, and close issues autonomously
  2. Engram — agents persist and retrieve context across sessions
  3. CI/CD — agents trigger pipelines and read results
  4. Observability — agents query metrics and traces for debugging
  5. SDK Pipeline — agents update OpenAPI specs, triggering SDK regeneration
  6. Infrastructure as Code — agents propose infrastructure changes via PRs

Organisational Implications

  • The agentic AI SDLC fundamentally changes the ratio of engineers to output
  • The team will be reorganised as required to optimise for this model
  • Roles shift from "writing code" to "directing AI, reviewing output, making architectural decisions"
  • The Platform Team's role expands to include AI agent operations — ensuring agents have the right context, tools, and guardrails

9. Working Agreements

How Product Teams Request Platform Changes

  1. Self-service first — check the developer portal and service catalogue
  2. Beads issue — create an issue with bd create "Platform: <description>" and tag appropriately
  3. Urgent requests — Slack the platform channel; platform on-call responds per severity matrix
  4. Architecture changes — raise an RFC (see below)

RFC Process

Significant platform changes require a Request for Comments (RFC):

  • New infrastructure component adoption
  • Changes to the golden path or default tooling
  • Breaking changes to platform APIs
  • New technology introduction (reference TECHNOLOGY-RADAR.md)

RFC template lives in the developer portal. RFCs are reviewed at the Architecture Review Board.

ADR Process

All significant architecture decisions are recorded as Architecture Decision Records in the ADR/ folder:

  • Lightweight, one-page format
  • Status: Proposed → Accepted → Superseded/Deprecated
  • Referenced from ARCHITECTURAL-REVIEW.md

On-Call Rotation

  • Platform team maintains a weekly on-call rotation
  • On-call engineer is first responder for all platform P1/P2 incidents
  • Escalation path: On-Call → Platform Lead → CDO
  • On-call handover happens every Monday at 10:00 PKT

Incident Response

  • Platform team participates in all incidents involving platform components
  • Post-incident reviews (blameless) within 48 hours for P1/P2
  • Action items tracked in Beads
  • Incident runbooks maintained in the developer portal

Sprint/Iteration Cadence

| Activity | Frequency | Participants |
| --- | --- | --- |
| Sprint planning | Fortnightly | Platform team |
| Daily stand-up | Daily | Platform team |
| Sprint review / demo | Fortnightly | Platform + product team leads |
| Retrospective | Fortnightly | Platform team |
| Architecture Review Board | Monthly | CDO, Platform Lead, Tech Leads, CISO |
| Technology Radar review | Quarterly | CDO, Platform Lead, Tech Leads |

10. Success Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Developer satisfaction (NPS) | > 40 | Quarterly survey |
| Time to first deploy (new service) | < 2 hours | Pipeline telemetry |
| CI/CD pipeline reliability | > 99.5% | Pipeline success rate |
| Platform availability (Tier 1) | > 99.95% | Uptime monitoring |
| Platform availability (Tier 2) | > 99.9% | Uptime monitoring |
| Mean time to recovery (MTTR) — P1 | < 1 hour | Incident tracking |
| Mean time to recovery (MTTR) — P2 | < 4 hours | Incident tracking |
| Self-service adoption rate | > 80% of requests | Service catalogue usage analytics |
| Manual interventions per sprint | Decreasing trend | Beads issue analysis |
| Golden path adoption | > 90% of new services | Repository analysis |
| AI-augmented development adoption | 100% of engineers | Claude Code usage telemetry |
| SDK generation latency | < 5 minutes | Pipeline telemetry |
| Security scan pass rate (first attempt) | > 85% | CI/CD security gate metrics |

11. Technology Ownership Matrix

| Technology | Owner | Status | Notes |
| --- | --- | --- | --- |
| KrakenD | Platform | Target | API gateway — replacing direct exposure |
| Temporal | Platform | Target | Workflow orchestration |
| SurrealDB | Platform (infra) | Target | Clusters managed by platform; schemas by product |
| MySQL | Platform (infra) | Legacy | Operational management; migration path planned |
| NSQ | Platform | Target | Messaging infrastructure |
| Meilisearch | Platform | Target | Search infrastructure |
| OpenTelemetry | Platform | Target | Observability instrumentation |
| Grafana | Platform | Target | Dashboards and alerting |
| Jaeger | Platform | Target | Distributed tracing |
| OpenSearch | Platform | Target | Log aggregation and search |
| PostHog | Platform | Target | Product analytics and feature flags |
| Cloudflare | Platform | Active | Edge, DNS, CDN, WAF, Workers, R2 |
| ControlPlane.com | Platform | Target | Identity, access, policy |
| Claude Code | Platform | Active | AI development tooling |
| Beads (bd) | Platform | Active | Issue tracking |
| Engram MCP | Platform | Active | Persistent AI memory |
| Go | Product (new) | Target | Primary language for new services |
| Rust | Product/Platform | Target | Performance-critical components |
| TypeScript | Product | Target | Front-end, BFF, scripting |
| Python | Data/Platform | Target | Data pipelines, ML, scripting |
| Astro | Product | Target | Static/SSR sites, developer portal |
| Spring Boot/Java | Product | Legacy | Current stack — migration planned |
| Bitbucket | Platform | Active | Source control and CI/CD |
| Terraform/Pulumi | Platform | Target | Infrastructure as Code |
| Kubernetes | Platform | Target | Container orchestration |

12. Staffing Model

Current State

There is no dedicated platform team today. Platform-adjacent work is handled ad hoc by:

  • Portal Team (Iqbal Butt, Selina Wilson, Atif, Zaid, Jamal, Jahanzeb + 1 front-end developer) — some infrastructure work alongside portal development
  • Pay-Outs Team (Rafi ur Rehman, Rana Waqar) — Java developers focused on product work
  • Security Team (Danish, Kamran, Khizer, Hamza, Khubaib, Zain) — InfoSec and SOC, not platform engineering
  • Other Engineers (Rizwan Zafar, Saqlain Raza, Danish Hamid) — various roles

Infrastructure and tooling work is distributed, uncoordinated, and competes with product delivery for attention.

Target Structure

| Role | Count | Priority | Notes |
| --- | --- | --- | --- |
| Platform Lead | 1 | P0 | Technical leadership, architecture, team management |
| Site Reliability Engineer | 2 | P0 | Infrastructure, observability, incident response |
| DevEx Engineer | 1 | P1 | Developer portal, tooling, golden paths, AI SDLC |
| Platform Engineer | 2 | P1 | CI/CD, IaC, gateway, databases |
| Security Engineer (embedded) | 1 | P1 | From Security team — bridge role |

Total target: 7 people (including 1 embedded from Security)

Hiring Priorities

  1. Platform Lead — must have experience building internal platforms, strong in Go or Rust, comfortable with AI-augmented workflows
  2. SRE — Kubernetes, observability, incident management experience
  3. DevEx Engineer — passion for developer productivity, experience with developer portals and tooling

Skill Gaps and Training Plan

| Current Skill | Target Skill | Training Approach |
| --- | --- | --- |
| Java / Spring Boot | Go | Structured learning + Claude Code pairing |
| Monolithic thinking | Microservices architecture | Architecture workshops, ADR practice |
| Manual deployment | IaC + GitOps | Hands-on with Terraform/Pulumi |
| Traditional testing | AI-augmented testing | Claude Code integration training |
| Ops as afterthought | SRE / observability-first | OpenTelemetry workshops |

AI Augmentation of Roles

The agentic AI SDLC means each platform engineer is significantly more productive:

| Role | AI Amplification |
| --- | --- |
| Platform Lead | AI drafts ADRs, RFCs, architecture docs; human reviews and decides |
| SRE | AI assists incident triage, writes runbooks, analyses traces |
| DevEx Engineer | AI generates docs, templates, onboarding guides |
| Platform Engineer | AI writes IaC, pipeline configs, gateway routes; human reviews |
| Security Engineer | AI runs security scans, analyses findings, drafts remediation |

This AI amplification is why a team of 7 can deliver what traditionally requires 12-15 people.


13. Roadmap

Q2 2026 (April–June) — Foundation

  • Establish platform team — hire Platform Lead
  • Define and publish golden path for new Go services
  • Deploy observability stack (OpenTelemetry + Grafana + Jaeger)
  • Standardise CI/CD pipelines across all repositories
  • Set up Beads and Engram across all engineering teams
  • Publish initial developer portal (Astro-based)
  • Complete technology audit of current infrastructure

Q3 2026 (July–September) — Core Platform

  • Deploy KrakenD API gateway — migrate first services
  • Implement SDK generation pipeline from OpenAPI specs
  • Deploy Temporal for workflow orchestration (pilot with Pay-Outs)
  • Establish SurrealDB clusters for new services
  • Implement ControlPlane.com for service identity
  • Deploy NSQ messaging infrastructure
  • Hire SRE and DevEx Engineer

Q4 2026 (October–December) — Self-Service

  • Self-service infrastructure provisioning via CLI/portal
  • Self-service database provisioning (SurrealDB)
  • Automated security scanning in all pipelines
  • Meilisearch deployment for search services
  • Complete API gateway migration for all services
  • Publish comprehensive golden path documentation
  • Hire remaining Platform Engineers

Q1 2027 (January–March) — Maturity

  • Achieve measured SLAs across all platform services
  • > 80% self-service adoption rate
  • Developer NPS > 40
  • Full Temporal adoption for complex workflows
  • Complete Java → Go migration support tooling
  • Platform retrospective and charter refresh
  • Publish first Technology Radar update with data

14. Governance

Architecture Review Board (ARB)

  • Cadence: Monthly
  • Attendees: CDO, Platform Lead, Product Tech Leads, CISO
  • Purpose: Review significant architecture decisions, approve/reject RFCs, ensure alignment with strategy
  • Format: Lightweight and ADR-driven — the output is documented decisions, not committee process
  • Escalation: CDO has final authority on contested decisions

Technology Radar Reviews

  • Cadence: Quarterly
  • Reference: TECHNOLOGY-RADAR.md
  • Purpose: Assess technology adoption status, identify emerging tools, deprecate outgoing technologies
  • Output: Updated Technology Radar, action items for platform team

Platform Retrospectives

  • Cadence: Fortnightly (aligned with sprint cadence)
  • Focus: Platform team effectiveness, service quality, developer feedback
  • Action items: Tracked in Beads

Budget and Cost Management

  • Platform infrastructure costs tracked and reported monthly
  • Cloud spend optimisation reviewed quarterly
  • New tooling/vendor decisions require CDO approval above defined thresholds
  • Cost allocation model: platform costs are shared across product teams proportionally
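The proportional allocation model above reduces to simple arithmetic. A minimal sketch, assuming allocation by each team's share of a usage metric — the charter leaves the metric itself (requests, compute, headcount) unspecified, and the figures below are illustrative, not real budgets:

```python
def allocate_costs(total_cost: float, usage: dict[str, float]) -> dict[str, float]:
    """Split a platform bill across teams in proportion to their usage share."""
    total_usage = sum(usage.values())
    return {team: total_cost * share / total_usage
            for team, share in usage.items()}

# Hypothetical monthly bill and usage shares for illustration.
bill = allocate_costs(100_000, {"Pay-Ins": 40, "Pay-Outs": 30,
                                "Remittances": 20, "Cards": 10})
print(bill)
# {'Pay-Ins': 40000.0, 'Pay-Outs': 30000.0,
#  'Remittances': 20000.0, 'Cards': 10000.0}
```

Whatever metric is chosen, publishing the formula alongside the monthly cost report keeps the allocation transparent and contestable.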

15. Appendix: RACI Matrix

R = Responsible | A = Accountable | C = Consulted | I = Informed

| Activity | Platform | Product | Security | Data | CDO |
| --- | --- | --- | --- | --- | --- |
| Infrastructure provisioning | R, A | I | C | I | I |
| CI/CD pipeline management | R, A | I | C | I | I |
| API gateway configuration | R, A | C | C | I | I |
| Observability stack operations | R, A | I | I | C | I |
| Secret management | R | I | A | I | I |
| Service identity & access (ControlPlane) | R | I | A | I | I |
| Developer tooling (Claude Code, Beads) | R, A | I | I | I | I |
| Production incident response (platform) | R, A | C | C | I | I |
| Production incident response (product) | C | R, A | C | I | I |
| Architecture decisions (platform) | R | C | C | C | A |
| Architecture decisions (product) | C | R | C | C | A |
| Security architecture | C | I | R, A | I | I |
| Compliance and regulatory | I | I | R, A | I | I |
| Data architecture | C | C | I | R, A | I |
| Product API design | C | R, A | C | C | I |
| Technology adoption decisions | R | C | C | C | A |
| Vendor selection (platform tools) | R | I | C | I | A |
| Budget management (platform) | R | I | I | I | A |
| On-call rotation (platform) | R, A | I | I | I | I |
| Developer onboarding | R | C | C | C | I |
| Golden path maintenance | R, A | C | C | I | I |
| SDK generation pipeline | R, A | C | I | I | I |
| AI SDLC tooling & operations | R, A | I | C | I | I |
| Database operations (infrastructure) | R, A | I | C | I | I |
| Database schema design | C | R, A | I | C | I |
| Edge infrastructure (Cloudflare) | R, A | I | C | I | I |
| Workflow engine operations (Temporal) | R, A | C | I | I | I |
| Messaging infrastructure (NSQ) | R, A | I | I | I | I |
| Search infrastructure (Meilisearch) | R, A | C | I | I | I |

Revision History

| Date | Version | Author | Changes |
| --- | --- | --- | --- |
| 2026-04-03 | 0.1 | Daniel O'Reilly | Initial draft |