Data Architecture & Governance

Document Owner: Daniel O'Reilly, Chief Digital Officer
Organisation: Simpaisa Holdings (UAE) / SimPaisa (Operating Entities)
Classification: Confidential
Version: 1.0
Date: 2 April 2026
Status: Active


Table of Contents

  1. Executive Summary
  2. Data Architecture Principles
  3. Current State: Logical Data Model
  4. Target State: Data Architecture
  5. Data Classification Framework
  6. PII Inventory & Handling
  7. Data Retention Policy
  8. Cross-Border Data Flows
  9. Encryption Standards
  10. Data Quality & Integrity
  11. Analytics & Reporting Architecture
  12. Messaging & Event Architecture
  13. Search Architecture
  14. Backup & Disaster Recovery
  15. Regulatory Compliance Matrix
  16. Data Governance Operating Model
  17. Migration Roadmap
  18. Appendix: Data Dictionary

1. Executive Summary

Current State

Simpaisa processes 270M+ transactions valued at $1B+ across five markets (Pakistan, Bangladesh, Nepal, Iraq, Egypt) through four product lines (Pay-Ins, Pay-Outs, Remittances, Cards). All products operate against a single shared MySQL instance hosted on AWS RDS, with no per-service database isolation, no documented data retention policies, and PII stored in plain text without column-level encryption.

Data Maturity Rating: 1 out of 5 (Initial/Ad Hoc)

Dimension | Rating | Assessment
Data Architecture | 1/5 | Single shared database, no service isolation, no read replicas
Data Governance | 1/5 | No formal ownership model, no classification framework, no retention policy
Data Quality | 1/5 | Numeric status codes, no documented validation rules, no reconciliation automation
Data Security | 1/5 | PII in plain text, credential encryption undocumented, no masking in logs
Regulatory Compliance | 1/5 | No documented compliance mapping across six jurisdictions

Key Risks

# | Risk | Severity | Impact
R1 | Shared database: a schema migration in Pay-Ins could cause downtime across all products | CRITICAL | Full platform outage, $1B+ transaction flow at risk
R2 | PII stored in plain text: MSISDN, account numbers, names unencrypted at column level | HIGH | Regulatory sanctions in all six jurisdictions, reputational damage
R3 | No data retention policy: transaction records, API logs, PII accumulate indefinitely | HIGH | Storage cost escalation, regulatory non-compliance, increased breach surface
R4 | Operator credentials storage: third-party API keys in operator_credentials without documented encryption | HIGH | Credential compromise could expose payment channels
R5 | No read replicas: 270M+ transactions served from a single instance for both OLTP and reporting | MEDIUM | Performance degradation, reporting contention with transactional workloads
R6 | No cross-border data flow documentation: data moves between six jurisdictions without mapped controls | HIGH | Regulatory breach under SBP, Bangladesh Bank, NRB, CBI localisation rules

2. Data Architecture Principles

These principles govern all data architecture decisions across Simpaisa's multi-jurisdiction payment platform.

# | Principle | Rationale | Implication
DA-01 | Data is a product | Each data domain must be owned, documented, and quality-assured like a customer-facing product | Every table/collection has a documented owner, SLA, and quality metrics
DA-02 | Service owns its data | Shared databases create coupling that prevents independent deployment and scaling | Each bounded context (Pay-Ins, Pay-Outs, Remittances, Cards) owns its database
DA-03 | Classify before you store | Data sensitivity must be determined before persistence, not retroactively | All new fields require classification approval before schema change
DA-04 | Encrypt by default | Payment data and PII must be encrypted at rest and in transit across all jurisdictions | Column-level encryption for Confidential/Restricted data; TLS 1.2+ everywhere
DA-05 | Retain with purpose | Every data element must have a defined retention period tied to business or regulatory need | Automated archival and deletion pipelines per data classification level
DA-06 | Data stays where it belongs | Regulatory data localisation requirements vary by jurisdiction | Primary transaction data resides in-country; only aggregated/anonymised data flows to UAE
DA-07 | Separate reads from writes | Reporting and analytics must never contend with transactional workloads | Read replicas for reporting, dedicated analytics stores, CQRS where appropriate
DA-08 | Schema changes are deployments | A schema change to a shared resource is as risky as a code deployment | All schema changes go through PR review, staging validation, and rollback plan
DA-09 | Events as first-class citizens | Asynchronous event-driven communication reduces coupling between services | All state transitions published as events with versioned schemas
DA-10 | Audit everything | Payment platforms require complete audit trails for regulatory compliance | All data access, modification, and deletion logged with actor, timestamp, and reason

3. Current State: Logical Data Model

3.1 Entity Relationship Overview

Pay-Ins Domain (15 Tables)

┌──────────────────────┐     ┌───────────────────┐     ┌─────────────────┐
│ merchant_detail      │────▶│ product_config    │────▶│ product         │
│ (merchantId, name,   │     │ (merchant-operator│     │ (product defs)  │
│  contact, email,     │     │  mappings, fees)  │     │                 │
│  phone)              │     └───────────────────┘     └─────────────────┘
└──────────┬───────────┘               │
           │                           │
           ▼                           ▼
┌──────────────────────┐     ┌───────────────────┐     ┌─────────────────┐
│ transaction          │────▶│ operator          │────▶│ operator_creds  │
│ (AC, amount, msisdn, │     │ (payment ops)     │     │ (API keys,      │
│  status, operatorId, │     └───────────────────┘     │  secrets)       │
│  merchantId)         │               │               └─────────────────┘
└──────────┬───────────┘               │
           │                           ▼
           │                 ┌───────────────────┐
           │                 │ operator_redirect │
           │                 │ _urls             │
           │                 └───────────────────┘
           ▼
┌──────────────────────┐     ┌───────────────────┐     ┌─────────────────┐
│ transaction_refund   │     │ refund_requests   │────▶│ refund_txn_     │
│ (refund records)     │     │ (request tracking)│     │ status          │
└──────────┬───────────┘     └───────────────────┘     └─────────────────┘
           │
           ▼
┌──────────────────────┐     ┌───────────────────┐     ┌─────────────────┐
│ postback             │     │ api_logs          │     │ external_       │
│ (delivery records)   │     │ (request/response │     │ response        │
└──────────────────────┘     │  bodies)          │     │ (error mapping) │
                             └───────────────────┘     └─────────────────┘

┌──────────────────────┐                               ┌─────────────────┐
│ currency             │                               │ webhook         │
│ (supported           │                               │ (endpoint       │
│  currencies)         │                               │  config)        │
└──────────────────────┘                               └─────────────────┘

Pay-Out State Machine

Published → In Review → On Hold → Disbursed
                    ↘            ↗
                     → Stuck ──→ Rejected

6 States: Published, In Review, On Hold, Stuck, Disbursed, Rejected
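
The Pay-Out lifecycle above can be enforced in code with an explicit transition table. The sketch below is illustrative only: the six state names come from this document, but the allowed transitions are one reading of the diagram (in particular, Stuck leading to either Disbursed or Rejected is an assumption) and should be confirmed against the Pay-Outs service before use. `can_transition` and `transition` are hypothetical helper names.

```python
from enum import Enum


class PayOutState(Enum):
    PUBLISHED = "Published"
    IN_REVIEW = "In Review"
    ON_HOLD = "On Hold"
    STUCK = "Stuck"
    DISBURSED = "Disbursed"
    REJECTED = "Rejected"


# Assumed reading of the state diagram; not confirmed by the source system.
ALLOWED_TRANSITIONS = {
    PayOutState.PUBLISHED: {PayOutState.IN_REVIEW},
    PayOutState.IN_REVIEW: {PayOutState.ON_HOLD, PayOutState.STUCK},
    PayOutState.ON_HOLD: {PayOutState.DISBURSED},
    PayOutState.STUCK: {PayOutState.DISBURSED, PayOutState.REJECTED},
    PayOutState.DISBURSED: set(),  # terminal
    PayOutState.REJECTED: set(),   # terminal
}


def can_transition(current: PayOutState, target: PayOutState) -> bool:
    """Return True if the transition table permits current -> target."""
    return target in ALLOWED_TRANSITIONS[current]


def transition(current: PayOutState, target: PayOutState) -> PayOutState:
    """Return the new state, or raise if the move is not permitted."""
    if not can_transition(current, target):
        raise ValueError(
            f"Illegal Pay-Out transition: {current.value} -> {target.value}"
        )
    return target
```

Storing states as named values rather than numeric codes also addresses the data-quality finding in the Executive Summary.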

Remittance State Machine

Published → In Process → In Review → Remitted
                     ↘          ↘
                      → On Hold → AML Review → Reversed
                      → Stuck ──→ Rejected

9 States: Published, In Process, In Review, On Hold, Remitted, Rejected, Stuck, AML Review, Reversed

3.2 Shared Database Architecture

┌────────────────────────────────────────────────────────┐
│                    AWS RDS (MySQL)                     │
│                  SINGLE INSTANCE                       │
│                                                        │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐          │
│  │  Pay-Ins   │ │  Pay-Outs  │ │ Remittances│          │
│  │  Schema    │ │  Schema    │ │  Schema    │          │
│  │ (15 tables)│ │ (tables)   │ │ (tables)   │          │
│  └────────────┘ └────────────┘ └────────────┘          │
│                                                        │
│  ┌────────────┐                                        │
│  │   Cards    │  ← All sharing the same MySQL          │
│  │  Schema    │    instance, same connection pool,     │
│  │ (tables)   │    same failover domain                │
│  └────────────┘                                        │
└────────────────────────────────────────────────────────┘
        │
        │  No read replicas
        │  No per-service isolation
        │  Schema migration = full-platform risk
        ▼
┌────────────────────────────────────────────────────────┐
│           Application Layer (All Services)             │
│  Pay-Ins API │ Pay-Outs API │ Remittance API │ Cards   │
└────────────────────────────────────────────────────────┘

3.3 Table Inventory with Data Sensitivity Classification

Table | Domain | Description | Record Volume (est.) | Data Sensitivity
transaction | Pay-Ins | Core transaction records | 270M+ | Confidential: contains MSISDN, amounts, account codes
merchant_detail | Pay-Ins | Merchant profiles | ~500 | Confidential: contact details, email, phone
product_configuration | Pay-Ins | Merchant-operator mappings, fee config | ~2,000 | Internal: commercial terms
product | Pay-Ins | Product definitions | ~50 | Internal
operator | Pay-Ins | Payment operator definitions | ~30 | Internal
operator_credentials | Pay-Ins | Third-party API keys and secrets | ~30 | Restricted: cryptographic material
operator_redirect_urls | Pay-Ins | Redirect URLs per operator | ~60 | Internal
currency | Pay-Ins | Supported currencies | ~10 | Public
webhook | Pay-Ins | Webhook endpoint configuration | ~500 | Internal: contains endpoint URLs
transaction_refund | Pay-Ins | Refund records | ~5M | Confidential: financial records
refund_requests | Pay-Ins | Refund request tracking | ~5M | Confidential
refund_transaction_status | Pay-Ins | Refund state tracking | ~10M | Internal
postback | Pay-Ins | Postback delivery records | ~270M | Internal: delivery metadata
api_logs | Pay-Ins | API request/response logs | ~500M+ | Confidential: may contain PII in request/response bodies
external_response | Pay-Ins | Error code mapping (3rd party to internal) | ~200 | Internal
Pay-Out tables | Pay-Outs | Disbursement records, beneficiary data | ~50M | Confidential: beneficiary PII, bank accounts
Remittance tables | Remittances | Remittance records, sender/receiver data | ~20M | Confidential: sender/receiver PII, AML flags
Card tables | Cards | Card data, token mappings | ~10M | Restricted: PCI DSS scope, card numbers, tokens

4. Target State: Data Architecture

4.1 Database-per-Service Model

┌──────────────────────────────────────────────────────────────────┐
│                       Target Architecture                        │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │  Pay-Ins     │  │  Pay-Outs    │  │ Remittances  │            │
│  │  MySQL (RDS) │  │  SurrealDB   │  │  SurrealDB   │            │
│  │  (existing,  │  │  (new)       │  │  (new)       │            │
│  │   strangler) │  │              │  │              │            │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘            │
│         │                 │                 │                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │  Cards       │  │  Merchant    │  │  Analytics   │            │
│  │  MySQL (RDS) │  │  SurrealDB   │  │  PostHog +   │            │
│  │  (PCI scope) │  │  (new)       │  │  Grafana     │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
│  Cross-cutting:                                                  │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│  │ Redis       │ │ NSQ         │ │ Meilisearch │ │ OpenSearch  │ │
│  │ (cache)     │ │ (events)    │ │ (search)    │ │ (logs)      │ │
│  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
└──────────────────────────────────────────────────────────────────┘

4.2 SurrealDB for New Services

Aspect | Detail
Why SurrealDB | Multi-model (document + graph + relational) in one engine; real-time subscriptions; built-in permissions; reduces operational overhead vs running separate stores
New services | Pay-Outs v2, Remittances v2, Merchant Portal, Configuration Service
Data model | Document-based with graph relations for entity linking (merchant → operator → channel)
Real-time | LIVE SELECT for webhook status updates, transaction state changes pushed to dashboards
Multi-tenancy | Namespace-per-market, database-per-product for logical isolation
Schema mode | Schemafull mode enforced: all tables require defined fields and types

4.3 MySQL Migration Strategy (Strangler Fig)

Phase | Action | Duration
Phase A | Introduce read replicas for reporting; no application changes | Month 1-2
Phase B | New Pay-Out and Remittance features write to SurrealDB; existing features remain on MySQL | Month 3-6
Phase C | Dual-write for migrating entities: write to both MySQL and SurrealDB, read from SurrealDB | Month 6-9
Phase D | Migrate read traffic from MySQL to SurrealDB for Pay-Outs and Remittances | Month 9-12
Phase E | Decommission MySQL tables for fully migrated domains; Pay-Ins and Cards remain on MySQL | Month 12-18

Strangler fig rules:
- Never rewrite; always wrap and redirect
- Dual-write period must not exceed 3 months per domain
- Rollback plan required before each phase gate
- Data consistency verified via automated reconciliation during dual-write
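
The automated reconciliation required during dual-write could look roughly like the sketch below. It assumes both stores can be read into `{record_id: record}` dictionaries for a given window; the function name `reconcile`, the report keys, and the field names in the usage example are hypothetical, not part of any existing Simpaisa tooling.

```python
def reconcile(mysql_rows, surreal_rows, compare_fields):
    """Compare two {record_id: record} snapshots taken during dual-write.

    Returns the ids missing on either side and the ids whose compared
    fields differ, so a phase gate can fail on any non-empty list.
    """
    report = {"missing_in_surreal": [], "missing_in_mysql": [], "mismatched": []}

    for record_id, source in mysql_rows.items():
        target = surreal_rows.get(record_id)
        if target is None:
            report["missing_in_surreal"].append(record_id)
        elif any(source.get(f) != target.get(f) for f in compare_fields):
            report["mismatched"].append(record_id)

    report["missing_in_mysql"] = [
        record_id for record_id in surreal_rows if record_id not in mysql_rows
    ]

    # Sort for stable, diff-friendly reports.
    for ids in report.values():
        ids.sort()
    return report


if __name__ == "__main__":
    mysql = {1: {"amount": 100, "status": "Disbursed"}, 2: {"amount": 50, "status": "Stuck"}}
    surreal = {1: {"amount": 100, "status": "Disbursed"}, 2: {"amount": 50, "status": "Rejected"}}
    print(reconcile(mysql, surreal, ["amount", "status"]))
```

Comparing an explicit list of fields, rather than whole records, avoids false mismatches from store-specific metadata such as internal ids or timestamps.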

4.4 Read Replica Strategy

Replica | Purpose | Source | Lag Tolerance
Reporting replica | Merchant dashboards, operations reports | MySQL primary | 30 seconds
Analytics replica | PostHog ingestion, Grafana dashboards | MySQL primary | 5 minutes
Audit replica | Compliance queries, regulatory reporting | MySQL primary | 1 minute
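
A minimal sketch of how the lag tolerances above might be checked before serving a read. The workload keys and the `replica_is_fresh` helper are hypothetical, and the measured lag is assumed to come from whatever replication-lag metric the platform exposes.

```python
# Lag tolerances in seconds, taken from the table above.
LAG_TOLERANCE_SECONDS = {
    "reporting": 30,
    "analytics": 300,
    "audit": 60,
}


def replica_is_fresh(workload: str, replica_lag_seconds: float) -> bool:
    """Return True if the replica's measured lag is within tolerance."""
    return replica_lag_seconds <= LAG_TOLERANCE_SECONDS[workload]


if __name__ == "__main__":
    # A reporting query against a replica lagging by 45 seconds is too stale.
    print(replica_is_fresh("reporting", 45))
    print(replica_is_fresh("analytics", 45))
```

The helper only reports freshness; it deliberately does not fall back to the primary, since principle DA-07 says reporting must never contend with transactional workloads. A caller would more likely show a "data delayed" notice or retry.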

4.5 Data Partitioning Approach

Data Type | Partitioning Strategy | Partition Key | Retention per Partition
Transactions | Range partitioning by month | created_at | Active: 12 months; archived until 7 years old
API Logs | Range partitioning by week | log_date | Active: 3 months; archived until 1 year old
Postbacks | Range partitioning by month | created_at | Active: 6 months; archived until 2 years old
Refunds | Range partitioning by quarter | created_at | Active: 24 months; archived until 7 years old
Audit Logs | Range partitioning by month | created_at | Active: 12 months; archived until 10 years old
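
As an illustration of monthly range partitioning on created_at, the sketch below derives a partition name and its exclusive upper bound for a given date, and renders a MySQL-style partition clause. The `pYYYYMM` naming scheme and both function names are assumptions; the clause assumes the table is partitioned with `PARTITION BY RANGE (TO_DAYS(created_at))`.

```python
from datetime import date


def month_partition(d: date):
    """Return (partition_name, exclusive_upper_bound) for d's month."""
    if d.month == 12:
        upper = date(d.year + 1, 1, 1)
    else:
        upper = date(d.year, d.month + 1, 1)
    return f"p{d.year:04d}{d.month:02d}", upper


def partition_clause(d: date) -> str:
    """Render one partition definition for a RANGE (TO_DAYS(created_at)) table."""
    name, upper = month_partition(d)
    return f"PARTITION {name} VALUES LESS THAN (TO_DAYS('{upper.isoformat()}'))"


if __name__ == "__main__":
    print(partition_clause(date(2026, 4, 2)))
```

Note that MySQL requires the partitioning column to be part of every unique key on the table, so the existing primary keys would need review before partitioning is introduced.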

5. Data Classification Framework

5.1 Classification Levels

Level | Definition | Examples | Handling Requirements
Public | Data intended for public disclosure | Currency codes, country codes, product names, API documentation | No restrictions on storage or transmission
Internal | Business data not intended for external disclosure | Fee configurations, operator definitions, status codes, internal error mappings | Access restricted to Simpaisa employees; encrypted in transit
Confidential | Data whose disclosure could cause significant harm | MSISDN, transaction amounts, merchant contact details, email addresses, beneficiary names, bank account numbers | Encrypted at rest and in transit; column-level encryption for PII fields; access logged; masked in non-production environments
Restricted | Data whose disclosure could cause severe operational or legal harm | Operator API keys, card PANs, card CVVs, encryption keys, authentication tokens, AML flags | Encrypted at rest with dedicated keys; access requires MFA and approval; stored in isolated vaults; never logged; PCI DSS controls for card data

5.2 Complete Data Classification Map

Data Element | Classification | Current Storage | Encryption Required | Masking Required
MSISDN (mobile number) | Confidential | Plain text in transaction | Yes: AES-256 column-level | Yes: 03XX-XXXX-567
Account Code (AC) | Confidential | Plain text in transaction | Yes: AES-256 column-level | Yes: last 4 only
Transaction Amount | Internal | Plain text in transaction | Yes: RDS-level encryption | No
Transaction Status | Internal | Numeric code in transaction | No | No
Merchant Name | Internal | Plain text in merchant_detail | No | No
Merchant Contact Name | Confidential | Plain text in merchant_detail | Yes: AES-256 column-level | Yes: first initial + surname
Merchant Email | Confidential | Plain text in merchant_detail | Yes: AES-256 column-level | Yes: d***@simpaisa.com
Merchant Phone | Confidential | Plain text in merchant_detail | Yes: AES-256 column-level | Yes: last 4 digits
Operator API Keys | Restricted | operator_credentials (encryption undocumented) | Yes: AWS KMS envelope encryption | Never displayed
Operator API Secrets | Restricted | operator_credentials (encryption undocumented) | Yes: AWS KMS envelope encryption | Never displayed
Card PAN | Restricted | Cards schema | Yes: AES-256 + tokenisation | Yes: first 6, last 4
Card CVV | Restricted | Cards schema (should never be stored) | Must not be stored post-authorisation | N/A: must not exist
Card Expiry Date | Restricted | Cards schema | Yes: AES-256 | Yes: XX/XX
Cardholder Name | Confidential | Cards schema | Yes: AES-256 column-level | Yes: first initial + surname
Beneficiary Name (Pay-Outs) | Confidential | Pay-Out tables | Yes: AES-256 column-level | Yes: first initial + surname
Beneficiary Bank Account | Confidential | Pay-Out tables | Yes: AES-256 column-level | Yes: last 4 digits
Beneficiary IBAN | Confidential | Pay-Out tables | Yes: AES-256 column-level | Yes: country code + last 4
Sender Name (Remittance) | Confidential | Remittance tables | Yes: AES-256 column-level | Yes: first initial + surname
Receiver Name (Remittance) | Confidential | Remittance tables | Yes: AES-256 column-level | Yes: first initial + surname
AML Review Flags | Restricted | Remittance tables | Yes: AES-256 column-level | Never shown to non-compliance staff
API Request Body | Confidential | Plain text in api_logs | Yes: RDS-level; PII fields redacted before logging | PII fields stripped or masked
API Response Body | Confidential | Plain text in api_logs | Yes: RDS-level; PII fields redacted before logging | PII fields stripped or masked
Webhook URLs | Internal | webhook table | Yes: RDS-level encryption | No
Fee Configuration | Internal | product_configuration | No | No
Currency Codes | Public | currency table | No | No
Error Code Mappings | Internal | external_response | No | No
Redirect URLs | Internal | operator_redirect_urls | No | No

6. PII Inventory & Handling

6.1 Complete PII Field Inventory

Product | Table/Collection | PII Field | Data Type | Current State | Target State
Pay-Ins | transaction | msisdn | Mobile number | Plain text | AES-256 column encryption, searchable via HMAC index
Pay-Ins | transaction | AC (account code) | Account identifier | Plain text | AES-256 column encryption
Pay-Ins | merchant_detail | contact | Person name | Plain text | AES-256 column encryption
Pay-Ins | merchant_detail | email | Email address | Plain text | AES-256 column encryption
Pay-Ins | merchant_detail | phone | Phone number | Plain text | AES-256 column encryption
Pay-Ins | api_logs | requestBody | May contain MSISDN, names | Plain text | PII fields redacted before storage
Pay-Ins | api_logs | responseBody | May contain MSISDN, status | Plain text | PII fields redacted before storage
Pay-Outs | Disbursement tables | Beneficiary name | Person name | Plain text | AES-256 column encryption
Pay-Outs | Disbursement tables | Beneficiary bank account | Bank account number | Plain text | AES-256 column encryption
Pay-Outs | Disbursement tables | Beneficiary IBAN | IBAN | Plain text | AES-256 column encryption
Pay-Outs | Disbursement tables | Beneficiary CNIC/NID | National ID | Plain text | AES-256 column encryption
Remittances | Remittance tables | Sender name | Person name | Plain text | AES-256 column encryption
Remittances | Remittance tables | Sender ID document | Passport/NID | Plain text | AES-256 column encryption
Remittances | Remittance tables | Receiver name | Person name | Plain text | AES-256 column encryption
Remittances | Remittance tables | Receiver MSISDN | Mobile number | Plain text | AES-256 column encryption
Remittances | Remittance tables | Receiver bank account | Bank account number | Plain text | AES-256 column encryption
Cards | Card tables | Card PAN | Card number | Encrypted (PCI requirement) | Tokenised via payment processor; PAN never stored
Cards | Card tables | Cardholder name | Person name | Unknown | AES-256 column encryption
Cards | Card tables | Expiry date | Date | Unknown | AES-256 column encryption

6.2 PII Masking Rules

PII Type | Masking Pattern | Example (Original) | Example (Masked)
MSISDN (Pakistan) | Show first 2 and last 3 digits | 03001234567 | 03XX-XXXX-567
MSISDN (Bangladesh) | Show first 2 and last 3 digits | 01712345678 | 01XX-XXXX-678
Email | Show first char + domain | [email protected] | d***@simpaisa.com
Person name | First initial + surname | Daniel O'Reilly | D. O'Reilly
Bank account | Last 4 digits | 1234567890 | XXXXXX7890
IBAN | Country code + last 4 | PK36SCBL0000001123456702 | PK******************6702
CNIC/NID | Last 4 digits | 35201-1234567-1 | XXXXX-XXXX567-1
Card PAN | First 6 + last 4 | 4111111111111111 | 411111XXXXXX1111
Card expiry | Fully masked | 12/27 | XX/XX
Passport | Last 3 characters | AB1234567 | XXXXXX567
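
Several of the masking rules above are simple enough to express as pure functions. The sketch below covers MSISDN, last-4 fields, card PAN, email, person name and IBAN; all function names are hypothetical, inputs are assumed to be already validated and free of separators, and the MSISDN helper assumes the 11-digit local format used in the examples.

```python
def mask_msisdn(msisdn: str) -> str:
    """Keep the first 2 and last 3 digits (11-digit local format assumed)."""
    hidden = len(msisdn) - 5
    return f"{msisdn[:2]}XX-{'X' * (hidden - 2)}-{msisdn[-3:]}"


def mask_last4(value: str) -> str:
    """Bank accounts and similar: show only the last 4 characters."""
    return "X" * (len(value) - 4) + value[-4:]


def mask_pan(pan: str) -> str:
    """Card PAN: show the first 6 and last 4 digits."""
    return pan[:6] + "X" * (len(pan) - 10) + pan[-4:]


def mask_email(email: str) -> str:
    """Show the first character of the local part plus the full domain."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def mask_name(full_name: str) -> str:
    """First initial plus surname; a single-word name keeps only its initial."""
    parts = full_name.split()
    if not parts:
        return ""
    if len(parts) == 1:
        return f"{parts[0][0]}."
    return f"{parts[0][0]}. {parts[-1]}"


def mask_iban(iban: str) -> str:
    """Country code plus last 4 characters; everything between is starred."""
    return iban[:2] + "*" * (len(iban) - 6) + iban[-4:]
```

Keeping the rules as pure functions makes them reusable across the application points listed in 6.3 (response serialiser, logging middleware, dashboard).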

6.3 Masking Application Points

Application Point | Masking Rule | Implementation
API response to merchant | Mask MSISDN, bank accounts in all responses | Response serialiser middleware
API logging (api_logs) | Redact all PII from request/response bodies before storage | Logging middleware: regex-based PII detection + replacement
Merchant dashboard | Mask PII by default; reveal on explicit request with audit log | Frontend masking component + backend audit endpoint
Internal admin panel | Full PII visible to authorised roles only; all access logged | RBAC with audit trail
Non-production environments | All PII replaced with synthetic data | Data anonymisation pipeline on DB restore
Error messages / stack traces | Never include PII | Error handler sanitisation
NSQ event payloads | Mask PII in events consumed by non-owning services | Event schema validation middleware
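
The regex-based redaction named for API logging could start from something like the sketch below. The two patterns are assumptions chosen to match the examples in this document (email addresses, and 11-digit local MSISDNs beginning with 0); a production rule set would need per-market number formats and the other PII types from 6.1. The `redact` function and placeholder tokens are hypothetical.

```python
import re

# Assumed patterns: email addresses and 11-digit local MSISDNs starting with 0.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
MSISDN_RE = re.compile(r"(?<!\d)0\d{10}(?!\d)")


def redact(body: str) -> str:
    """Replace recognised PII in a request/response body before it is logged."""
    body = EMAIL_RE.sub("[REDACTED_EMAIL]", body)
    return MSISDN_RE.sub("[REDACTED_MSISDN]", body)


if __name__ == "__main__":
    print(redact('{"msisdn": "03001234567", "amount": 500}'))
```

Regex detection is a safety net rather than a guarantee: where the payload schema is known, dropping or masking named fields before serialisation is more reliable than pattern matching on free text.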

6.4 Column-Level Encryption Requirements

Requirement | Specification
Algorithm | AES-256-GCM (authenticated encryption)
Key management | AWS KMS: one CMK per data classification per market
Key rotation | Automatic rotation every 365 days via KMS
Searchable encryption | HMAC-SHA256 blind index for MSISDN lookups (search by exact match without decrypting all rows)
Performance | Encryption/decryption at application layer; KMS data key caching (5-minute TTL) to reduce API calls
Backward compatibility | Migration script encrypts existing plain-text data in batches (10,000 rows/batch, off-peak hours)
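
The HMAC-SHA256 blind index can be sketched with the Python standard library alone. The idea is to store a keyed hash of the normalised MSISDN in a separate indexed column, so that an exact-match lookup hashes the search term and compares hashes without decrypting any row. The function names are hypothetical, and the index key is assumed to be a dedicated secret kept separate from the column encryption key.

```python
import hashlib
import hmac


def normalise_msisdn(msisdn: str) -> str:
    """Strip separators so equivalent inputs hash to the same index value."""
    return "".join(ch for ch in msisdn if ch.isdigit())


def blind_index(msisdn: str, index_key: bytes) -> str:
    """Return the hex HMAC-SHA256 of the normalised MSISDN under index_key."""
    message = normalise_msisdn(msisdn).encode("utf-8")
    return hmac.new(index_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-only-key"  # placeholder; a real key would come from a secret store
    print(blind_index("0300-1234567", key) == blind_index("03001234567", key))
```

Because the index is deterministic, equal MSISDNs produce equal index values. That is what makes lookups possible, but it also reveals which rows share a number, so the index column should be access-controlled like the data it protects.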

7. Data Retention Policy

7.1 Retention Periods by Data Type

Data Type | Active Retention | Archive Retention | Total | Justification
Transaction records | 12 months (hot storage) | 6 years (cold/S3) | 7 years | SBP requires 5 years; Bangladesh Bank 5 years; PCI DSS 1 year; Simpaisa policy adds 2-year buffer
API logs (request/response) | 3 months (hot) | 9 months (S3 Glacier) | 12 months | Operational debugging; no regulatory requirement beyond dispute resolution
Postback records | 6 months (hot) | 18 months (S3) | 24 months | Merchant dispute resolution window
Refund records | 24 months (hot) | 5 years (cold) | 7 years | Aligned with transaction retention
Audit logs (access, changes) | 12 months (hot) | 9 years (S3 Glacier) | 10 years | Regulatory audit trail; SBP and Bangladesh Bank require minimum 5 years
Merchant profiles | Lifetime of relationship | 5 years post-termination | Lifetime + 5 years | KYC/KYB retention requirements
Operator credentials | Lifetime of integration | Securely destroyed on decommission | Immediate deletion | No value in retaining expired credentials; security risk
PII (stand-alone) | Duration of active relationship | Anonymised at end of retention | Per data type above | MSISDN retained with transaction; anonymised when transaction archived
Card data (PAN) | Never stored post-authorisation | N/A | 0 | Simpaisa policy: PAN is not stored and only tokens are retained (PCI DSS requires any stored PAN to be rendered unreadable and prohibits storing CVV after authorisation)
AML review records | 24 months (hot) | 8 years (cold) | 10 years | FATF recommendations; SBP AML/CFT guidelines
Currency / config data | Indefinite | N/A | Indefinite | Reference data; no PII
Error code mappings | Indefinite | N/A | Indefinite | Reference data
Analytics / product data | 24 months (PostHog) | Exported to S3 | 5 years aggregated | Business intelligence; no PII in aggregated form

7.2 Regulatory Requirements by Jurisdiction

Jurisdiction | Regulator | Transaction Retention | AML Records | Audit Trail | Data Localisation
Pakistan | SBP | 5 years minimum | 5 years (AML Act 2010) | 5 years | Transaction data must reside in Pakistan
Bangladesh | Bangladesh Bank | 5 years minimum | 5 years (MLPA 2012) | 5 years | Data must remain in Bangladesh
Nepal | Nepal Rastra Bank | 5 years minimum | 5 years | 5 years | Data must remain in Nepal
Iraq | CBI | 5 years minimum | 5 years | 5 years | Data must remain in Iraq
UAE | DFSA | 6 years | 6 years | 6 years | N/A (holding company only)

7.3 Automated Cleanup and Archival Strategy

Phase | Action | Frequency | Responsible
Identification | Tag records reaching end of active retention period | Daily cron job | Data Platform team
Archival | Move expired records from RDS to S3 (Parquet format, encrypted with KMS) | Weekly batch job (Sunday 02:00 UTC) | Data Platform team
Verification | Verify archived records are readable and complete before deletion from primary | Automated post-archive validation | Data Platform team
Deletion | Purge archived records from primary database after 30-day verification window | Monthly | Data Platform team
Cold archival | Move records from S3 Standard to S3 Glacier after 12 months | Monthly lifecycle policy | AWS S3 lifecycle rule
Final deletion | Permanently delete records exceeding total retention period | Quarterly | Compliance team review + Data Platform execution
Audit | Generate retention compliance report per jurisdiction | Monthly | Compliance team

7.4 Legal Hold Procedure

Step | Action | Owner
1 | Legal/Compliance issues hold notice specifying scope (date range, data types, markets) | General Counsel / Compliance
2 | Data Platform team tags affected records as LEGAL_HOLD = true | Data Platform team
3 | Held records excluded from all automated archival and deletion pipelines | Automated: pipeline checks hold flag
4 | Hold reviewed quarterly; released only by written authorisation from General Counsel | Compliance team
5 | On release, records re-enter normal retention pipeline from current date | Data Platform team
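
The retention periods in 7.1 and the hold steps above combine into a single decision per record: keep it hot, archive it, delete it, or leave it untouched because of a legal hold. The sketch below is one way to express that check. The data-type keys, stage names and function names are hypothetical; the month values are the active and total periods from table 7.1.

```python
from datetime import date

# (active_months, total_months) per data type, from table 7.1.
RETENTION_MONTHS = {
    "transaction": (12, 84),
    "api_log": (3, 12),
    "postback": (6, 24),
    "refund": (24, 84),
    "audit_log": (12, 120),
}


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return months - 1 if end.day < start.day else months


def retention_stage(data_type: str, created: date, today: date,
                    legal_hold: bool = False) -> str:
    """Return 'held', 'active', 'archive' or 'delete' for a record."""
    if legal_hold:
        # Held records are excluded from all archival and deletion pipelines.
        return "held"
    active, total = RETENTION_MONTHS[data_type]
    age = months_between(created, today)
    if age < active:
        return "active"
    if age < total:
        return "archive"
    return "delete"


if __name__ == "__main__":
    print(retention_stage("api_log", date(2026, 1, 1), date(2026, 4, 1)))
```

Checking the hold flag first means a record under legal hold can never be archived or deleted by this path, whatever its age.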

8. Cross-Border Data Flows

8.1 Data Flow Diagram

                        ┌─────────────────────┐
                        │     UAE (DFSA)      │
                        │  Simpaisa Holdings  │
                        │  Aggregated Reports │
                        │  No raw PII         │
                        └──────────┬──────────┘
                                   │
                    Aggregated/anonymised data only
                                   │
         ┌─────────────────────────┼─────────────────────────┐
         │                         │                         │
         ▼                         ▼                         ▼
┌─────────────────┐       ┌─────────────────┐       ┌─────────────────┐
│   Pakistan      │       │   Bangladesh    │       │     Nepal       │
│   (SBP)         │       │(Bangladesh Bank)│       │    (NRB)        │
│                 │       │                 │       │                 │
│ Pay-Ins         │──────▶│ Pay-Outs        │       │ Remittances     │
│ Pay-Outs        │       │ (Disbursements) │       │ (inbound)       │
│ Remittances     │──────▶│ Remittances     │       │                 │
│ Cards           │       │ (inbound)       │       │                 │
└────────┬────────┘       └─────────────────┘       └─────────────────┘
         │
         │
         ▼
┌─────────────────┐
│     Iraq        │
│    (CBI)        │
│                 │
│ Pay-Ins         │
│ (inbound)       │
│                 │
└─────────────────┘

8.2 Cross-Border Data Flow Register

Flow | Source → Destination | Data Types Transferred | Transfer Mechanism | Residency Compliance
Pay-Out initiation | Pakistan → Bangladesh | Beneficiary name, bank account, amount, reference | API call (TLS 1.3) to Bangladesh operator | Transaction data replicated to Bangladesh RDS; Pakistan retains origination record
Pay-Out initiation | Pakistan → Nepal | Beneficiary name, bank account, amount, reference | API call (TLS 1.3) to Nepal operator | Transaction data replicated to Nepal RDS; Pakistan retains origination record
Remittance processing | Pakistan → Bangladesh | Sender/receiver name, amount, MSISDN, bank account | API call (TLS 1.3) to Bangladesh partner | Full record in both jurisdictions; AML data in Pakistan only
Remittance processing | Pakistan → Nepal | Sender/receiver name, amount, bank account | API call (TLS 1.3) to Nepal partner | Full record in both jurisdictions
Pay-In processing | Iraq → Pakistan | Transaction reference, amount, MSISDN | API call (TLS 1.3) from Iraq operator | Transaction data in Iraq RDS; processing record in Pakistan
Holding company reporting | All markets → UAE | Aggregated volumes, revenue, success rates (no PII) | Encrypted batch export (S3 cross-region, KMS) | Only aggregated/anonymised data; no raw PII crosses to UAE
Operational monitoring | All markets → UAE | System metrics, error rates, latency (no PII) | CloudWatch / Grafana (TLS) | Infrastructure metrics only; no transaction-level data

8.3 Data Residency Requirements

Jurisdiction Requirement Simpaisa Implementation
Pakistan SBP mandates transaction data resides in Pakistan; payment system data must be on servers within Pakistan or on approved cloud regions AWS ap-south-1 (Mumbai, India), subject to SBP approval as an approved cloud region, or dedicated Pakistan infrastructure
Bangladesh Bangladesh Bank requires data localisation for mobile financial services Dedicated Bangladesh database instance; no raw data transferred out
Nepal NRB requires financial data to remain in Nepal Dedicated Nepal database instance; processing data stays in-country
Iraq CBI requires financial data residency within Iraq or approved jurisdictions Dedicated Iraq database instance; transaction data remains in-country
UAE DFSA requires adequate protection for data processed in DIFC Holding company receives aggregated data only; no raw PII

8.4 Transfer Safeguards

Safeguard Description
Data minimisation Only data strictly necessary for the transaction is transferred cross-border
Pseudonymisation Internal reference IDs used instead of raw PII where possible
Encryption in transit TLS 1.3 for all cross-border API calls; certificate pinning for partner integrations
Contractual protections Data processing agreements with all cross-border partners specifying handling, retention, breach notification
Audit logging All cross-border data transfers logged with timestamp, data types, source, destination, actor
Aggregation for reporting UAE holding company receives only aggregated metrics; individual transaction data never leaves operating jurisdiction

9. Encryption Standards

9.1 Encryption at Rest

Layer Standard Implementation Key Management
RDS storage AES-256 AWS RDS encryption enabled at instance level AWS-managed KMS key per region
Column-level (PII) AES-256-GCM Application-layer encryption before write; decryption on read Customer-managed CMK in AWS KMS; one per market per classification
S3 archives AES-256 S3 server-side encryption (SSE-KMS) Customer-managed CMK; separate key for archives
S3 Glacier AES-256 SSE-KMS inherited from S3 lifecycle transition Same CMK as S3 Standard
SurrealDB AES-256 Storage-level encryption (filesystem/EBS encryption) + application-layer for PII fields AWS KMS; application-layer keys managed per namespace
Redis AES-256 ElastiCache encryption at rest enabled AWS-managed KMS key
OpenSearch AES-256 Domain-level encryption at rest enabled AWS-managed KMS key
Backups AES-256 RDS automated backups inherit instance encryption; manual snapshots use KMS Same CMK as primary instance

9.2 Encryption in Transit

Channel Standard Configuration
Client → API Gateway TLS 1.2+ (TLS 1.3 preferred) AWS ALB/CloudFront termination; HSTS enabled; minimum TLS 1.2 enforced
API Gateway → Services TLS 1.2+ Internal ALB with TLS; no plain HTTP permitted
Service → Service mTLS Mutual TLS with service-specific certificates; certificate rotation every 90 days
Service → Database TLS 1.2+ RDS require_secure_transport = ON; SurrealDB TLS enforced
Service → Redis TLS 1.2+ ElastiCache in-transit encryption enabled
Service → OpenSearch TLS 1.2+ Node-to-node encryption enabled; HTTPS enforced
Service → NSQ TLS 1.2+ NSQ TLS configured for all producer/consumer connections
Cross-border API calls TLS 1.3 Certificate pinning for known partner endpoints
S3 access TLS 1.2+ Bucket policy enforces aws:SecureTransport condition

9.3 Application-Level Encryption

Use Case Algorithm Key Size Purpose
PII column encryption AES-256-GCM 256-bit Authenticated encryption for PII fields (confidentiality + integrity)
Card data encryption AES-256-GCM 256-bit PCI DSS requirement for card data at rest (where tokenisation not applicable)
API payload signing RSA-2048 (minimum) / RSA-4096 (preferred) 2048/4096-bit Request/response signing for partner integrations; non-repudiation
HMAC for searchable encryption HMAC-SHA256 256-bit Blind index generation for encrypted field lookups (e.g., MSISDN search)
Webhook payload signing HMAC-SHA256 256-bit Merchant webhook verification; signature sent in the X-Simpaisa-Signature header
Token generation CSPRNG 256-bit Idempotency keys, session tokens, API keys
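
The two HMAC-SHA256 rows above can be illustrated with a minimal Python sketch using only the standard library. It shows a blind index for MSISDN lookups and webhook signature verification. The function names, the digit-only normalisation step, and the hex encoding of the signature are assumptions made for illustration; the document does not specify them.

```python
import hashlib
import hmac


def blind_index(msisdn: str, index_key: bytes) -> str:
    """Deterministic HMAC-SHA256 digest stored next to the encrypted MSISDN.

    Equal inputs give equal digests, so the service can search by digest
    without decrypting the column. Normalising to digits is an assumption.
    """
    normalised = "".join(ch for ch in msisdn if ch.isdigit())
    return hmac.new(index_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_webhook(payload: bytes, merchant_secret: bytes) -> str:
    """Hex signature a sender could place in the X-Simpaisa-Signature header."""
    return hmac.new(merchant_secret, payload, hashlib.sha256).hexdigest()


def verify_webhook(payload: bytes, merchant_secret: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_webhook(payload, merchant_secret)
    return hmac.compare_digest(expected, signature)
```

The blind index key should be distinct from the encryption key, so that leaking one does not expose the other; that separation is a common practice rather than something stated in this document.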

9.4 Key Management (AWS KMS)

Aspect Policy
Key hierarchy Master key (CMK) → Data encryption keys (DEK) via envelope encryption
CMK per market Separate CMK for Pakistan, Bangladesh, Nepal, Iraq; keys do not leave region
CMK per classification Separate CMK for Confidential vs Restricted data within each market
Automatic rotation Annual automatic rotation for CMKs; KMS keeps earlier key material, so the key ID is unchanged and existing ciphertext stays decryptable without application changes
Access policy IAM role-based; only the owning service's IAM role can call kms:Decrypt for its CMK
Audit All KMS API calls logged to CloudTrail; alerts on unusual decryption volume
Key deletion 30-day waiting period; requires two authorised approvals
Disaster recovery CMK replicas in secondary region (where permitted by data localisation requirements)

10. Data Quality & Integrity

10.1 Transaction Consistency

Issue Current State Gap Target State Action
Idempotency No documented idempotency mechanism for Pay-In API calls Duplicate transactions possible under network retries Idempotency key required on all payment initiation requests; deduplication window of 24 hours Implement idempotency key table with TTL-based cleanup
Atomic state transitions Status updates as numeric codes (0, 1, etc.) without transactional guarantees documented Potential for inconsistent state between transaction and postback tables Named enum states with database-level constraints; state machine enforced at application layer Replace numeric codes with ENUM columns; add CHECK constraints
Distributed consistency Shared database provides implicit consistency today Moving to database-per-service will break referential integrity Saga pattern for cross-service transactions; eventual consistency with compensating actions Design saga orchestrator for Pay-Out and Remittance flows
Duplicate detection No documented mechanism Unknown duplicate rate across 270M+ transactions Hash-based deduplication (amount + MSISDN + merchant + timestamp window) Implement at API gateway layer
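
The hash-based deduplication in the last row could be sketched as follows in Python. The 60-second bucket size, the use of minor currency units, and the function name are assumptions; the document names the inputs (amount, MSISDN, merchant, timestamp window) but not the window length.

```python
import hashlib

# Assumed bucket size; the document does not specify the timestamp window.
WINDOW_SECONDS = 60


def dedup_fingerprint(amount_minor: int, msisdn: str, merchant_id: str,
                      epoch_seconds: int,
                      window_seconds: int = WINDOW_SECONDS) -> str:
    """Fingerprint of amount + MSISDN + merchant + time bucket.

    Requests that fall in the same bucket with the same fields produce the
    same fingerprint and can be flagged as likely duplicates.
    """
    bucket = epoch_seconds // window_seconds
    material = f"{amount_minor}|{msisdn}|{merchant_id}|{bucket}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Fixed buckets miss duplicates that straddle a bucket boundary, so a real implementation might also check the previous bucket. The fingerprint complements the idempotency key rather than replacing it, since two legitimate identical payments inside one window would collide.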

10.2 Referential Integrity

Relationship Current State Risk Target State
transaction.merchantId → merchant_detail.merchantId Foreign key (shared DB) Integrity maintained while shared DB exists; breaks on service split Event-driven consistency; merchant cache in each service; eventual consistency verified by reconciliation job
transaction.operatorId → operator.id Foreign key (shared DB) Same as above Operator configuration replicated via events; local cache per service
product_configuration → product + operator + merchant_detail Foreign keys (shared DB) Triple dependency: configuration changes must be atomic Configuration service owns all config; publishes change events
transaction_refund → transaction Foreign key (shared DB) Refund without valid parent transaction Maintained in database; additional application-level validation

10.3 Data Validation at Ingestion

Validation Layer Rule Action on Failure
MSISDN format API Gateway Regex per country (PK: ^03\d{9}$, BD: ^01\d{9,10}$) Reject with 400 (invalid MSISDN format)
Amount range API Gateway Amount > 0; amount <= configured maximum per merchant per product Reject with 400 (amount out of range)
Currency code API Gateway Must match ISO 4217 and be in currency table Reject with 400 (unsupported currency)
Merchant ID Service layer Must exist in merchant_detail and be active Reject with 403 (merchant not found or inactive)
Operator ID Service layer Must exist in operator and be active for merchant's market Reject with 400 (operator not available)
Idempotency key Service layer Must be unique within 24-hour window per merchant Return cached response for duplicate key
Card PAN (Luhn) Service layer Luhn algorithm validation Reject with 400 (invalid card number)
IBAN (checksum) Service layer MOD-97 validation per ISO 13616 Reject with 400 (invalid IBAN)
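
Three of the rules above (MSISDN format, Luhn, IBAN MOD-97) are compact enough to sketch in Python. The regular expressions are taken from the table; the function names and the use of full-string matching are illustrative assumptions, not the platform's actual validators.

```python
import re

# Patterns copied from the validation table; other markets are not listed there.
MSISDN_PATTERNS = {
    "PK": re.compile(r"03\d{9}"),
    "BD": re.compile(r"01\d{9,10}"),
}


def valid_msisdn(country: str, msisdn: str) -> bool:
    """True when the MSISDN fully matches the pattern for its country."""
    pattern = MSISDN_PATTERNS.get(country)
    return pattern is not None and pattern.fullmatch(msisdn) is not None


def luhn_valid(pan: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    if not (pan.isascii() and pan.isdigit()):
        return False
    total = 0
    for position, ch in enumerate(reversed(pan)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """MOD-97 check: move the first four characters to the end, map letters
    to numbers (A=10 ... Z=35), and require the remainder to be 1."""
    compact = iban.replace(" ", "").upper()
    if re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{1,30}", compact) is None:
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

The IBAN check here covers only the checksum; country-specific length rules would need a separate lookup.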

10.4 Reconciliation Requirements

Reconciliation Type Parties Frequency Method Tolerance
Merchant ↔ Simpaisa Merchant settlement vs Simpaisa transaction records Daily (T+1) Automated matching on transaction reference + amount Zero tolerance: all mismatches investigated
Channel ↔ Simpaisa Operator/channel settlement vs Simpaisa transaction records Daily (T+1) File-based reconciliation (operator provides settlement file) Zero tolerance
Merchant ↔ Channel ↔ Simpaisa Three-way reconciliation Weekly Automated three-way match; exceptions flagged Zero tolerance: exceptions resolved within 5 business days
Cross-border (Remittance) Source country vs destination country records Daily Reference-based matching across jurisdictions Zero tolerance (AML compliance requirement)
Internal consistency Transaction table vs postback table vs refund table Hourly Count and sum validation; orphan detection Alert if delta > 0
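
The automated matching on transaction reference + amount could look like the following Python sketch. Both sides are modelled as dictionaries keyed by transaction reference; the type and function names are hypothetical, and a production job would read from settlement files and the transaction store rather than in-memory dictionaries.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple


class ReconResult(NamedTuple):
    missing_at_counterparty: List[str]  # recorded by Simpaisa only
    missing_at_simpaisa: List[str]      # reported by the counterparty only
    amount_mismatches: List[str]        # present on both sides, amounts differ


def reconcile(simpaisa: Dict[str, Decimal],
              counterparty: Dict[str, Decimal]) -> ReconResult:
    """Two-way match on transaction reference and amount."""
    ours, theirs = set(simpaisa), set(counterparty)
    mismatches = sorted(
        ref for ref in ours & theirs if simpaisa[ref] != counterparty[ref]
    )
    return ReconResult(sorted(ours - theirs), sorted(theirs - ours), mismatches)


def is_clean(result: ReconResult) -> bool:
    """Zero tolerance: any exception list with entries fails the run."""
    return not any(result)
```

Using Decimal rather than float keeps amount comparison exact, which matters when the tolerance is zero.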

11. Analytics & Reporting Architecture

11.1 Current State

Aspect Current Gap
Monitoring CloudWatch only No product analytics; no custom dashboards; no real-time merchant metrics
Reporting Direct queries against production MySQL Reporting queries contend with transactional workloads; risk of performance degradation
Log analysis OpenSearch for application logs Limited structured analysis; no correlation between business events and system events
Product analytics None No feature usage tracking, no funnel analysis, no conversion metrics

11.2 Target State

┌─────────────────────────────────────────────────────────────────┐
│                     Analytics Architecture                      │
│                                                                 │
│  Transactional        Analytical               Observability    │
│  ┌───────────┐        ┌───────────┐            ┌───────────┐    │
│  │   MySQL   │──CDC──▶│  PostHog  │            │  Grafana  │    │
│  │  Primary  │        │ (product  │            │ (ops      │    │
│  └─────┬─────┘        │ analytics)│            │dashboards)│    │
│        │              └───────────┘            └───────────┘    │
│        │                                             ▲          │
│        ▼                                             │          │
│  ┌───────────┐        ┌───────────┐            ┌─────┴─────┐    │
│  │   MySQL   │        │  S3 Data  │            │OpenSearch │    │
│  │   Read    │──ETL──▶│   Lake    │            │  (logs)   │    │
│  │  Replica  │        │ (Parquet) │            └───────────┘    │
│  └───────────┘        └───────────┘                             │
└─────────────────────────────────────────────────────────────────┘

11.3 Platform Responsibilities

Platform Purpose Data Types Retention Access
PostHog Product analytics: feature usage, funnels, cohorts, A/B tests Anonymised user events, feature flags, session data 24 months Product team, CDO
Grafana Operational dashboards: transaction volumes, success rates, latency, error rates Metrics from Prometheus/CloudWatch; no PII 12 months (metrics) Engineering, Operations, CDO
OpenSearch Log aggregation: application logs, access logs, error investigation Structured logs (PII redacted before ingestion) 3 months (hot), 9 months (warm/cold) Engineering, Security
S3 Data Lake Long-term analytical storage; regulatory reporting; historical analysis Anonymised transaction summaries, aggregated metrics (Parquet) 7 years Data team, Compliance, CDO

11.4 Real-Time vs Batch

Report Type Delivery Platform Refresh Rate
Transaction volume dashboard Real-time Grafana 10-second intervals
Transaction success rate Real-time Grafana 10-second intervals
Merchant settlement summary Batch S3 + reporting API Daily (T+1 by 06:00 local)
Revenue reporting Batch S3 + reporting API Daily (T+1)
Regulatory reporting (per jurisdiction) Batch S3 + manual review Monthly / quarterly per regulator schedule
Product feature analytics Near real-time PostHog 1-minute batches
AML monitoring Near real-time Dedicated AML system 5-minute intervals
Reconciliation reports Batch S3 + notification Daily (T+1)

12. Messaging & Event Architecture

12.1 Current State (Kafka)

Aspect Current Gap
Platform Apache Kafka Operational overhead high; topic structure undocumented; partition strategy undefined
Topics Undocumented No topic naming convention; no schema registry; no documentation of existing topics
Consumer groups Undocumented Unknown consumer group assignments; no dead-letter queue strategy
Partitioning Undocumented Unknown partition counts; potential for ordering issues
Schema No schema registry Message format changes can break downstream consumers without warning
Monitoring Minimal No consumer lag monitoring; no alerting on failed message processing

12.2 Target State (NSQ)

Aspect Target
Platform NSQ: lightweight, operationally simple, horizontally scalable
Topic naming {domain}.{entity}.{event} e.g., payin.transaction.created, payout.disbursement.state_changed
Channel naming {consuming-service}-{purpose} e.g., webhook-service-fanout, analytics-ingestion
Message format JSON with versioned schema; schema_version field in every message
Dead letter Every channel has a corresponding DLQ topic: {topic}.dlq
Retry policy Exponential backoff: 1s, 5s, 30s, 5m, 30m, then DLQ
Ordering NSQ does not guarantee message ordering (requeues and timeouts can reorder delivery even on a single nsqd instance); consumers must tolerate out-of-order and duplicate messages
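
A minimal Python sketch of the naming and retry conventions in this table follows. The delay values and the .dlq suffix come from the table; the helper names and the lowercase-only topic pattern are assumptions made for illustration.

```python
import re
from typing import Optional

# Delays from the retry policy row: 1s, 5s, 30s, 5m, 30m.
RETRY_DELAYS_SECONDS = [1, 5, 30, 300, 1800]

# Assumed shape for {domain}.{entity}.{event}: three lowercase segments.
TOPIC_PATTERN = re.compile(r"[a-z0-9_]+\.[a-z0-9_]+\.[a-z0-9_]+")


def valid_topic(topic: str) -> bool:
    """True when the topic follows the three-segment naming convention."""
    return TOPIC_PATTERN.fullmatch(topic) is not None


def dlq_topic(topic: str) -> str:
    """Dead-letter topic that corresponds to a source topic."""
    return f"{topic}.dlq"


def next_retry_delay(failed_attempts: int) -> Optional[int]:
    """Delay in seconds before the next attempt, or None to route to the DLQ.

    failed_attempts is the number of deliveries that have failed so far.
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts <= len(RETRY_DELAYS_SECONDS):
        return RETRY_DELAYS_SECONDS[failed_attempts - 1]
    return None
```

A consumer would requeue the message with the returned delay, and publish it to dlq_topic(topic) once next_retry_delay returns None.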

12.3 Event Schema Standards

{
  "schema_version": "1.0",
  "event_id": "uuid-v4",
  "event_type": "payin.transaction.created",
  "timestamp": "2026-04-02T12:00:00.000Z",
  "source": "payin-service",
  "market": "PK",
  "correlation_id": "uuid-v4",
  "data": {
    "transaction_id": "TXN-uuid",
    "merchant_id": "MER-uuid",
    "amount": 1500.00,
    "currency": "PKR",
    "status": "PENDING"
  },
  "metadata": {
    "idempotency_key": "uuid-v4",
    "actor": "system"
  }
}

Schema rules:

- All events must include schema_version, event_id, event_type, timestamp, source, correlation_id
- PII must not appear in event payloads consumed by non-owning services (use references/IDs instead)
- Schema changes must be backward-compatible (additive only) or increment major version
- All events must be JSON-serialisable; no binary payloads in standard events
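
The required-field rule can be checked mechanically before an event is published. The Python sketch below is one possible check; the function name and the shape of the returned problem list are assumptions, and it does not attempt the PII or versioning rules, which need knowledge of each schema.

```python
import json
from typing import List

# Fields every event must carry, per the schema rules.
REQUIRED_FIELDS = (
    "schema_version",
    "event_id",
    "event_type",
    "timestamp",
    "source",
    "correlation_id",
)


def validate_event(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the event passes."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["event must be a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]
```

Running this check in the producer, before the message reaches NSQ, keeps malformed events out of downstream consumers.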

12.4 Delivery Guarantees

Guarantee Implementation
At-least-once delivery NSQ provides at-least-once by default; consumers must be idempotent
Idempotent consumers Every consumer tracks processed event_id values in a deduplication store (Redis, 24-hour TTL)
Ordering Not guaranteed by NSQ, within or across nsqd instances; consumers must handle out-of-order delivery via timestamp comparison
Effectively-once (where required) Transactional outbox pattern for critical state changes: write event to outbox table in same DB transaction as state change; relay process publishes to NSQ at least once, and idempotent consumers discard duplicates
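
The outbox and idempotent-consumer rows can be sketched together in Python. This is an illustration under stated assumptions, not the platform's implementation: an in-memory SQLite database stands in for MySQL, a Python set stands in for the Redis deduplication store (so there is no 24-hour TTL), and publish is any callable taking a topic and a payload rather than a real NSQ client. Table and function names are hypothetical.

```python
import json
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    """In-memory stand-in for the service database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE transactions (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            event_id  TEXT PRIMARY KEY,
            topic     TEXT NOT NULL,
            payload   TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def complete_transaction(conn: sqlite3.Connection, transaction_id: str) -> str:
    """Write the state change and its event in one database transaction."""
    event_id = str(uuid.uuid4())
    payload = json.dumps({"event_id": event_id,
                          "transaction_id": transaction_id,
                          "status": "COMPLETED"})
    with conn:  # commits both rows, or rolls both back on error
        conn.execute("INSERT OR REPLACE INTO transactions (id, status) "
                     "VALUES (?, 'COMPLETED')", (transaction_id,))
        conn.execute("INSERT INTO outbox (event_id, topic, payload) VALUES (?, ?, ?)",
                     (event_id, "payin.transaction.completed", payload))
    return event_id


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish unpublished outbox rows; a failed publish leaves the row for retry."""
    rows = conn.execute("SELECT event_id, topic, payload FROM outbox "
                        "WHERE published = 0 ORDER BY rowid").fetchall()
    for event_id, topic, payload in rows:
        publish(topic, payload)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE event_id = ?",
                         (event_id,))
    return len(rows)


class IdempotentConsumer:
    """Skips events whose event_id has already been processed."""

    def __init__(self) -> None:
        self.seen = set()   # stand-in for Redis keys with a TTL
        self.handled = []

    def handle(self, topic: str, payload: str) -> bool:
        event = json.loads(payload)
        if event["event_id"] in self.seen:
            return False    # duplicate delivery, ignored
        self.seen.add(event["event_id"])
        self.handled.append(event)
        return True
```

If the relay crashes after publishing but before marking the row, the event is published again on the next run; the consumer's event_id check is what turns that at-least-once delivery into a single observable effect.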

12.5 Key Event Catalogue

Event Producer Consumers Purpose
payin.transaction.created Pay-In Service Webhook Service, Analytics, Reconciliation New transaction initiated
payin.transaction.completed Pay-In Service Webhook Service, Analytics, Settlement Transaction successfully completed
payin.transaction.failed Pay-In Service Webhook Service, Analytics, Alerting Transaction failed
payin.refund.requested Pay-In Service Refund Processor, Analytics Refund initiated
payin.refund.completed Refund Processor Webhook Service, Analytics, Settlement Refund processed
payout.disbursement.created Pay-Out Service Analytics, Reconciliation Disbursement initiated
payout.disbursement.state_changed Pay-Out Service Webhook Service, Analytics State machine transition
remittance.transfer.created Remittance Service Analytics, AML Service, Reconciliation Remittance initiated
remittance.transfer.state_changed Remittance Service Webhook Service, Analytics, AML Service State machine transition
remittance.aml.flagged AML Service Compliance Dashboard, Alerting AML review triggered
merchant.onboarded Merchant Service All services (config sync), Analytics New merchant activated
merchant.config.updated Merchant Service All services (config sync) Merchant configuration changed

13. Search Architecture

13.1 Current State

Aspect Current Gap
OpenSearch Used for log aggregation only No merchant-facing search; no transaction lookup beyond direct DB queries
Transaction search Direct MySQL queries Slow for large result sets; competes with transactional workload; no full-text search
Merchant search None Merchants cannot search their own transactions efficiently
Bank/Channel search None No search interface for available banks, channels, or operators

13.2 Target State

Platform Use Case Data Source Index Strategy
Meilisearch Merchant-facing transaction search Transaction events (via NSQ) Index per merchant; fields: transaction_id, amount, status, date, masked_msisdn, operator
Meilisearch Bank/channel directory search Operator and channel configuration Single index; fields: bank_name, channel_type, country, currency, status
Meilisearch Merchant directory (internal) Merchant detail events Single index; fields: merchant_name, merchant_id, market, status, products
OpenSearch Application log search Filebeat / Fluentd log shipping Index per day; fields: timestamp, service, level, message, correlation_id, market
OpenSearch Security event search Audit log events Index per week; fields: timestamp, actor, action, resource, market, ip_address

13.3 Meilisearch Design

Aspect Configuration
Index refresh Near real-time: NSQ consumer writes to Meilisearch on each transaction event; ~1-second indexing delay
Searchable attributes transaction_id, merchant_reference, status_name, operator_name, masked_msisdn
Filterable attributes status, operator_id, currency, created_date, amount_range
Sortable attributes created_at, amount
PII handling Only masked PII in search index (MSISDN → 03XX-XXXX-567); full PII retrieved from primary DB on explicit record view
Multi-tenancy Tenant key = merchant_id; Meilisearch tenant tokens enforce merchant can only search their own data
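Pagination Offset/limit or page-based pagination (Meilisearch does not provide cursor-based pagination); maximum 1,000 results per query, matching Meilisearch's default maxTotalHits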
Typo tolerance Enabled for bank/channel names; disabled for transaction IDs and amounts
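
The masking format in the PII handling row (03XX-XXXX-567) can be produced by a small helper. The Python sketch below assumes 11-digit MSISDNs, as in the Pakistan pattern; the function name and the fallback of fully masking anything else are illustrative choices.

```python
def mask_msisdn(msisdn: str) -> str:
    """Keep the first two and last three digits; mask the rest.

    Inputs that are not exactly 11 digits are masked completely, so an
    unexpected format never leaks more than its length.
    """
    digits = "".join(ch for ch in msisdn if ch.isdigit())
    if len(digits) != 11:
        return "X" * len(digits)
    return f"{digits[:2]}XX-XXXX-{digits[-3:]}"
```

For example, mask_msisdn("03001234567") returns "03XX-XXXX-567". The masked value would be written to the index by the NSQ consumer, so the raw MSISDN never reaches Meilisearch.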

13.4 OpenSearch Optimisation

Aspect Current Target
Index lifecycle Manual management ISM (Index State Management) policy: hot (7 days) → warm (23 days) → cold (60 days) → delete (90 days)
Index pattern Single index logs-{service}-{yyyy.MM.dd} per service per day
Shard strategy Default 1 primary + 1 replica per index; target 30-50 GB per shard
Retention Undefined 90 days total (aligned with Section 7 retention policy for API logs)
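
The index naming pattern and lifecycle phases above can be expressed as a short Python sketch, useful for checking that a given index is in the expected phase. The function names are hypothetical, and the phase boundaries are derived from the stated durations (7 days hot, then 23 days warm, then 60 days cold, deleted at 90 days).

```python
from datetime import date


def index_name(service: str, day: date) -> str:
    """Daily index name following logs-{service}-{yyyy.MM.dd}."""
    return f"logs-{service}-{day:%Y.%m.%d}"


def lifecycle_phase(age_days: int) -> str:
    """Phase an index should be in at a given age, per the ISM policy."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if age_days < 7:
        return "hot"
    if age_days < 30:    # 7 hot + 23 warm
        return "warm"
    if age_days < 90:    # + 60 cold
        return "cold"
    return "delete"
```

In practice the ISM policy itself performs the transitions inside OpenSearch; a helper like this would only support monitoring or tests.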

14. Backup & Disaster Recovery

14.1 RDS Backup Strategy

Aspect Current State Target State
Automated snapshots AWS RDS automated backups (assumed enabled) Verified enabled; 35-day retention (maximum)
Snapshot frequency Daily (RDS default) Daily automated + manual pre-migration snapshots
Point-in-time recovery RDS supports 5-minute granularity Documented and tested quarterly
Cross-region copies Not configured Automated cross-region snapshot copy for DR (subject to data localisation)
Snapshot encryption Encrypted (if RDS encryption enabled) Verified encrypted with CMK; cross-region copies re-encrypted with destination CMK
Backup testing Not documented Quarterly restore test to isolated environment; results documented

14.2 RPO/RTO Targets

Data Tier RPO (Recovery Point Objective) RTO (Recovery Time Objective) Strategy
Tier 1: Active transactions 5 minutes 30 minutes Multi-AZ RDS; point-in-time recovery; automated failover
Tier 2: Configuration & merchant data 1 hour 1 hour RDS snapshot + configuration-as-code in Git
Tier 3: API logs & postbacks 24 hours 4 hours Daily RDS snapshot; acceptable to lose up to 1 day of logs
Tier 4: Archived data (S3) 24 hours 24 hours S3 cross-region replication; Glacier retrieval within 12 hours
Tier 5: Analytics data 48 hours 48 hours Rebuild from primary data sources; PostHog has its own backup

14.3 Backup Coverage (All Data Stores)

Data Store Backup Method Frequency Retention DR Location
MySQL (RDS) Automated snapshot + PITR Daily + continuous binary log (binlog) capture 35 days Multi-AZ (same region); cross-region copy weekly
SurrealDB (future) Scheduled surreal export + EBS snapshots 6-hourly export; daily EBS snapshot 30 days Cross-AZ EBS; export to S3
Redis (ElastiCache) Automatic backup Daily 7 days Multi-AZ replication
OpenSearch Automated snapshots to S3 Hourly 14 days S3 cross-region replication
Meilisearch (future) Scheduled dump + EBS snapshot Daily 7 days Rebuildable from primary data source
NSQ No backup (ephemeral by design) N/A N/A Messages replayed from transactional outbox if needed
S3 (archives) Versioning + cross-region replication Continuous Per retention policy Cross-region replication

14.4 Recovery Procedures

Scenario Procedure Estimated Recovery Time
Primary RDS failure Automatic failover to Multi-AZ standby 2-5 minutes (automatic)
Availability Zone failure Multi-AZ failover handles automatically 2-5 minutes (automatic)
Region failure Promote cross-region read replica (when implemented); restore from cross-region snapshot 30-60 minutes
Data corruption (application bug) Point-in-time recovery to moment before corruption; identify and fix root cause 30-60 minutes for restore; variable for root cause
Accidental deletion (table/rows) Point-in-time recovery to pre-deletion timestamp; selective data restoration 30-60 minutes
Ransomware / security breach Isolate affected systems; restore from verified clean snapshot; forensic investigation 2-4 hours for restore; investigation ongoing
SurrealDB data loss Restore from latest surreal export or EBS snapshot 30-60 minutes
Redis failure ElastiCache auto-recovery; application gracefully degrades (cache miss → DB read) 2-5 minutes (automatic)
OpenSearch failure Restore from S3 snapshot; logs continue to buffer in Fluentd 30-60 minutes

15. Regulatory Compliance Matrix

15.1 Compliance Requirements by Jurisdiction

Requirement Pakistan (SBP) Bangladesh (BB) Nepal (NRB) Iraq (CBI) PCI DSS (Cards)
Data localisation Required: data must reside on servers in Pakistan or SBP-approved infrastructure Required: MFS data must remain in Bangladesh Required: financial data within Nepal Required: data within Iraq N/A (global standard)
Transaction retention 5 years minimum 5 years minimum 5 years minimum 5 years minimum 1 year minimum
AML record retention 5 years (AML Act 2010) 5 years (MLPA 2012) 5 years 5 years N/A
Customer data encryption Required (SBP Circular) Required Required Required Required (PCI DSS Req 3)
Breach notification SBP notification required within 24 hours Bangladesh Bank within 24 hours NRB notification required CBI notification required 72 hours to card brands; "without unreasonable delay" to individuals
Audit trail Required for all transactions Required Required Required Required (PCI DSS Req 10)
Access controls Role-based, documented Role-based, documented Role-based Role-based Strict RBAC (PCI DSS Req 7)
Penetration testing Annual (SBP requirement) Annual Annual Annual Annual + after significant changes (PCI DSS Req 11)
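PAN storage N/A N/A N/A N/A Permitted only when rendered unreadable wherever stored, e.g. via tokenisation, truncation or strong encryption (Req 3.4 in PCI DSS v3.2.1; Req 3.5 in v4.0)
CVV storage N/A N/A N/A N/A Prohibited after authorisation, even if encrypted (Req 3.2 in PCI DSS v3.2.1; Req 3.3 in v4.0)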
Network segmentation Recommended Recommended Recommended Recommended Recommended

15.2 Current Compliance Status

Requirement Status Gap Remediation
Data localisation (all markets) Non-compliant Single shared RDS instance; unclear which region hosts which market's data Deploy per-market database instances; document hosting region per regulator
Transaction retention Partially compliant Data retained (never deleted) but no formal policy; no archival process Implement retention policy from Section 7; automate archival
AML record retention Unknown AML records exist in Remittance tables but retention not formalised Formalise AML retention at 10 years (exceeds all jurisdictions)
Customer data encryption Non-compliant PII in plain text; column-level encryption absent Implement column-level encryption per Section 9
Breach notification No documented process No incident response plan with regulatory notification steps Create incident response plan with jurisdiction-specific timelines
Audit trail Partial api_logs captures API calls; no comprehensive audit trail for data access, admin actions Implement audit logging for all data access and administrative actions
Access controls Unknown Database access controls not documented; application-level RBAC unknown Document and enforce RBAC; implement database-level access controls
PCI DSS (Cards) Unknown Card data handling not documented in architectural review Conduct PCI DSS gap assessment; ensure PAN never stored, CVV never stored

15.3 Regulatory Reporting Obligations

Jurisdiction Report Frequency Data Required Current Capability
Pakistan (SBP) Transaction volume and value summary Monthly Aggregated by product, operator, currency Manual extraction from MySQL
Pakistan (SBP) Suspicious transaction report (STR) As required Transaction detail, customer detail, reason for suspicion Manual process
Bangladesh (BB) MFS transaction report Monthly Volume, value, success/failure rates Manual extraction
Nepal (NRB) Remittance inflow report Monthly Sender/receiver country, amount, channel Manual extraction
Iraq (CBI) Payment transaction summary Quarterly Volume, value by operator Manual extraction
All AML/CFT compliance report Quarterly/Annual STR count, SAR count, training records Manual compilation
PCI DSS Self-Assessment Questionnaire (SAQ) or ROC Annual Cardholder data environment documentation Not documented

16. Data Governance Operating Model

16.1 Data Ownership

Data Domain Product Owner Data Steward (Technical) Responsibilities
Pay-In transactions Pay-Ins Product Manager Pay-Ins Lead Engineer Schema changes, data quality, retention compliance, PII handling
Pay-Out / Disbursement data Pay-Outs Product Manager Pay-Outs Lead Engineer Schema changes, beneficiary data quality, state machine integrity
Remittance data Remittances Product Manager Remittances Lead Engineer Schema changes, AML data integrity, cross-border data compliance
Card data Cards Product Manager Cards Lead Engineer PCI DSS compliance, tokenisation, card data lifecycle
Merchant data Merchant Operations Manager Platform Lead Engineer Merchant profiles, KYB data, configuration integrity
Operator/Channel data Integration Manager Platform Lead Engineer Credential security, operator configuration, channel availability
Analytics data CDO (Daniel O'Reilly) Data Team Lead Data pipeline quality, anonymisation, reporting accuracy
Audit and compliance data Compliance Officer Security Engineer Completeness of audit trail, access log integrity

16.2 Data Stewardship Responsibilities

Responsibility Description Frequency
Schema change review Review and approve all schema changes (new tables, columns, type changes) for their domain Per change (PR-based)
Data quality monitoring Monitor data quality metrics (completeness, accuracy, consistency) for their domain Weekly review
Classification review Ensure all new data elements are classified per Section 5 framework Per change
Retention compliance Verify data retention and archival is operating correctly for their domain Monthly audit
PII handling audit Verify PII encryption, masking, and access controls are correctly implemented Quarterly audit
Incident response Act as domain expert during data incidents (breaches, corruption, loss) As needed
Regulatory response Provide domain data for regulatory reporting and audits As needed

16.3 Schema Change Management Process

Step Action Owner Tool
1 Developer creates schema change as migration script Developer Flyway / SurrealDB migrations
2 Developer raises PR with migration script + data classification for new fields Developer Bitbucket PR
3 Data Steward reviews: classification correct, PII handling defined, retention policy assigned Data Steward Bitbucket PR review
4 Security review (if Confidential/Restricted data involved) Security Engineer Bitbucket PR review
5 Migration tested in staging environment against production-like data volume Developer + QA Staging environment
6 Rollback script verified Developer Staging environment
7 Migration scheduled during low-traffic window (if DDL lock required) DevOps Deployment pipeline
8 Migration executed with monitoring DevOps Automated deployment with alerts
9 Post-migration validation: row counts, constraint checks, application health Developer + DevOps Automated checks
10 Data dictionary (Appendix) updated Data Steward Bitbucket PR

16.4 Access Control and Audit Logging

Access Level Who Database Access Application Access Audit
Read-only (production) On-call engineers, support Read replica only; no direct production primary access Application-level search tools All queries logged with actor + timestamp
Read-write (production) Authorised senior engineers (break-glass only) Direct access via approved bastion; session recorded N/A Full session recording; Jira ticket required
Admin (production) DBA / DevOps lead Full access via approved bastion; MFA required N/A Full session recording; two-person approval
Non-production All engineers Full access Full access Standard logging
Compliance / Audit Compliance team Read-only on audit replica Compliance dashboard Access logged
Merchant Merchant users None Merchant portal: own data only (tenant-scoped) All access logged

Audit log fields (minimum):

Field Description
timestamp ISO 8601 with millisecond precision
actor User ID or service account performing the action
action What was done (READ, WRITE, DELETE, EXPORT, LOGIN, SCHEMA_CHANGE)
resource What was accessed (table, record ID, API endpoint)
market Which jurisdiction's data was accessed
source_ip IP address of the actor
result Success or failure
reason Business justification (required for break-glass access)
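
A minimal Python sketch of a record builder for the fields above follows. The field names and action values come from the table; the function name, the SUCCESS/FAILURE result values, and the break_glass flag used to enforce the reason rule are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Optional

ACTIONS = {"READ", "WRITE", "DELETE", "EXPORT", "LOGIN", "SCHEMA_CHANGE"}
RESULTS = {"SUCCESS", "FAILURE"}  # assumed encoding of "Success or failure"


def build_audit_record(actor: str, action: str, resource: str, market: str,
                       source_ip: str, result: str,
                       reason: Optional[str] = None,
                       break_glass: bool = False,
                       now: Optional[datetime] = None) -> dict:
    """Build one audit log entry with the minimum field set.

    break_glass is a hypothetical flag: when set, a reason is mandatory.
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if result not in RESULTS:
        raise ValueError(f"unknown result: {result}")
    if break_glass and not reason:
        raise ValueError("break-glass access requires a reason")
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        # ISO 8601, millisecond precision, UTC
        "timestamp": moment.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "actor": actor,
        "action": action,
        "resource": resource,
        "market": market,
        "source_ip": source_ip,
        "result": result,
        "reason": reason,
    }
```

Rejecting a break-glass entry without a reason at write time means the requirement is enforced by the code path rather than by later review.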

17. Migration Roadmap

Phase 1: Classification and Retention Policies (Months 1-2)

| Action | Owner | Deliverable | Dependency |
|---|---|---|---|
| Classify all existing data fields per Section 5 framework | Data Stewards (all domains) | Complete data classification register | None |
| Document current data volumes and growth rates per table | Data Platform team | Capacity planning report | None |
| Implement data retention policy automation (tagging, archival jobs) | Data Platform team | Automated archival pipeline | Classification register |
| Define legal hold procedures and test with mock scenario | Compliance + Data Platform | Documented and tested legal hold process | None |
| Establish data governance operating model (owners, stewards, RACI) | CDO | Published governance model | None |
| Implement schema change management process | Engineering Leads | PR-based migration workflow in Bitbucket | None |
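The retention automation in Phase 1 comes down to deciding, per record, whether it has passed its retention period and is not under legal hold. A minimal sketch of that decision follows, using the retention periods listed in the data dictionary (7 years, 24 months, 12 months). The function names and the `legal_hold` flag are hypothetical.

```python
import calendar
from datetime import date


def retention_cutoff(today, retention_months):
    """Return the date that lies `retention_months` before `today`."""
    total = today.year * 12 + (today.month - 1) - retention_months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day for shorter months (e.g. 31 March minus 1 month).
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_due_for_archival(created, today, retention_months, legal_hold=False):
    """A record on legal hold is never archived, regardless of age."""
    if legal_hold:
        return False
    return created < retention_cutoff(today, retention_months)


today = date(2026, 4, 2)
print(is_due_for_archival(date(2019, 1, 15), today, retention_months=84))  # transaction: 7 years
print(is_due_for_archival(date(2025, 6, 1), today, retention_months=12))   # api_logs: 12 months
```

Checking the legal hold first keeps the hold procedure independent of the retention arithmetic, which makes the mock-scenario test in the roadmap straightforward to write.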

Phase 2: PII Encryption and Masking (Months 2-4)

| Action | Owner | Deliverable | Dependency |
|---|---|---|---|
| Provision AWS KMS keys (per market, per classification) | DevOps / Security | KMS key hierarchy deployed | Phase 1 classification |
| Implement application-layer encryption library | Platform Engineering | Shared encryption SDK with encrypt/decrypt/HMAC functions | KMS keys provisioned |
| Encrypt PII columns in transaction table (MSISDN, AC) | Pay-Ins team | Encrypted columns with HMAC blind index | Encryption library |
| Encrypt PII columns in merchant_detail (contact, email, phone) | Platform team | Encrypted columns | Encryption library |
| Encrypt PII in Pay-Out and Remittance tables | Pay-Out and Remittance teams | Encrypted columns | Encryption library |
| Implement PII masking in API response serialiser | Platform Engineering | Masking middleware deployed across all services | Masking rules defined |
| Implement PII redaction in logging pipeline | Platform Engineering | API logs no longer contain plain-text PII | Masking rules defined |
| Audit operator_credentials encryption; re-encrypt with KMS if needed | Security team | Verified envelope encryption for all operator credentials | KMS keys provisioned |
| PCI DSS gap assessment for Cards | Cards team + Security | Gap assessment report with remediation plan | None |
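Once MSISDN and AC are encrypted, exact-match lookups need the HMAC blind index named in the deliverable: a keyed hash stored alongside the ciphertext and queried instead of the plain value. The sketch below shows only the index computation with Python's standard library. The key shown is a placeholder (in practice it would be derived from or protected by KMS), and the function names and normalisation rule are assumptions.

```python
import hashlib
import hmac


def normalise_msisdn(msisdn):
    """Strip everything except digits so formatting does not change the index."""
    return "".join(ch for ch in msisdn if ch.isdigit())


def blind_index(value, key):
    """Return a hex HMAC-SHA256 of the value, suitable for an indexed column."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration only; never hard-code real keys.
INDEX_KEY = b"hypothetical-blind-index-key"

stored = blind_index(normalise_msisdn("0300-1234567"), INDEX_KEY)
lookup = blind_index(normalise_msisdn("03001234567"), INDEX_KEY)
print(hmac.compare_digest(stored, lookup))
```

The blind-index key should be distinct from the encryption key, so that compromise of one does not expose both the ciphertext and the lookup values. A blind index supports equality search only, not prefix or range queries.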

Phase 3: Read Replicas and Reporting Segregation (Months 3-5)

| Action | Owner | Deliverable | Dependency |
|---|---|---|---|
| Deploy MySQL read replica for reporting | DevOps | Read replica running with < 30-second lag | None |
| Migrate all reporting/dashboard queries to read replica | Engineering teams | Zero reporting queries against primary | Read replica deployed |
| Deploy PostHog for product analytics | Data team | PostHog instance with initial event tracking | None |
| Deploy Grafana for operational dashboards | DevOps | Grafana with transaction volume, success rate, latency dashboards | Metrics pipeline (Prometheus/CloudWatch) |
| Implement OpenSearch index lifecycle management | DevOps | ISM policies for log rotation and retention | None |
| Implement automated reconciliation (merchant ↔ channel ↔ Simpaisa) | Data Platform team | Daily reconciliation job with exception reporting | Read replica |
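The three-way reconciliation in the last row can be pictured as comparing the same transaction across the merchant's records, the channel's records, and Simpaisa's own ledger, then reporting only the exceptions. The following is a simplified sketch: it assumes each source has already been reduced to a mapping of transaction ID to amount, and the function name and exception labels are hypothetical.

```python
from decimal import Decimal


def reconcile(simpaisa, merchant, channel):
    """Return a list of (transaction_id, kind, detail) exceptions, sorted by ID."""
    exceptions = []
    for txn_id in sorted(set(simpaisa) | set(merchant) | set(channel)):
        amounts = {
            "simpaisa": simpaisa.get(txn_id),
            "merchant": merchant.get(txn_id),
            "channel": channel.get(txn_id),
        }
        missing = [source for source, amount in amounts.items() if amount is None]
        if missing:
            exceptions.append((txn_id, "MISSING", missing))
        elif len(set(amounts.values())) > 1:
            exceptions.append((txn_id, "AMOUNT_MISMATCH", amounts))
    return exceptions


simpaisa = {"T1": Decimal("100.00"), "T2": Decimal("50.00"), "T3": Decimal("75.00")}
merchant = {"T1": Decimal("100.00"), "T2": Decimal("55.00"), "T3": Decimal("75.00")}
channel = {"T1": Decimal("100.00"), "T2": Decimal("50.00")}

for item in reconcile(simpaisa, merchant, channel):
    print(item)
```

Amounts are held as `Decimal` rather than float so that equal monetary values always compare equal. A production job would also compare status and currency, and would read from the reporting replica rather than the primary.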

Phase 4: Database-per-Service and SurrealDB (Months 5-12)

| Action | Owner | Deliverable | Dependency |
|---|---|---|---|
| Deploy SurrealDB infrastructure (per market) | DevOps | SurrealDB clusters with TLS, authentication, backup | Infrastructure provisioning |
| Implement Pay-Outs v2 on SurrealDB | Pay-Outs team | New Pay-Out service with dedicated SurrealDB | SurrealDB infrastructure |
| Implement Remittances v2 on SurrealDB | Remittances team | New Remittance service with dedicated SurrealDB | SurrealDB infrastructure |
| Deploy NSQ for event messaging | DevOps | NSQ cluster with topic/channel conventions | None |
| Migrate Kafka consumers to NSQ | All teams | All event flows on NSQ; Kafka decommissioned | NSQ deployed |
| Deploy Meilisearch for merchant-facing search | Platform team | Meilisearch with transaction search index | NSQ events for index updates |
| Implement transactional outbox pattern for critical events | All teams | Outbox tables + relay process per service | NSQ + database-per-service |
| Dual-write period for Pay-Outs (MySQL + SurrealDB) | Pay-Outs team | Verified data consistency between stores | Pay-Outs v2 deployed |
| Cut over Pay-Outs to SurrealDB (read + write) | Pay-Outs team | Pay-Outs fully on SurrealDB; MySQL tables read-only | Dual-write validation |
| Dual-write period for Remittances | Remittances team | Verified data consistency | Remittances v2 deployed |
| Cut over Remittances to SurrealDB | Remittances team | Remittances fully on SurrealDB | Dual-write validation |
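The transactional outbox pattern listed above writes the business record and its event in one database transaction, and a separate relay publishes the event afterwards. The sketch below illustrates the idea with Python's built-in `sqlite3` as a stand-in for the service database and a plain callable as a stand-in for an NSQ producer. The table layout, the topic name `payout.disbursement.created`, and all function names are hypothetical.

```python
import json
import sqlite3


def create_schema(conn):
    conn.executescript("""
        CREATE TABLE disbursement (
            id TEXT PRIMARY KEY, amount TEXT NOT NULL, state TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def record_disbursement(conn, disbursement_id, amount):
    """Insert the business row and its event atomically."""
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute(
            "INSERT INTO disbursement (id, amount, state) VALUES (?, ?, 'Published')",
            (disbursement_id, amount),
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("payout.disbursement.created",
             json.dumps({"id": disbursement_id, "amount": amount})),
        )


def relay_once(conn, publish):
    """Publish pending events in order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)  # if this raises, the row stays pending and is retried
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
record_disbursement(conn, "D-1", "250.00")
sent = []
print(relay_once(conn, lambda topic, payload: sent.append((topic, payload))))
```

Because an event is marked as published only after the publish call returns, delivery is at-least-once: a crash between publishing and marking can resend an event, so consumers need to be idempotent.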

Phase 5: Cross-Border Compliance Certification (Months 10-18)

| Action | Owner | Deliverable | Dependency |
|---|---|---|---|
| Deploy per-market database instances (where required by data localisation) | DevOps | Separate RDS/SurrealDB instances per jurisdiction | Database-per-service architecture |
| Implement data residency controls (geo-fencing of data writes) | Platform Engineering | Application-level routing ensures data stays in correct jurisdiction | Per-market instances |
| Document cross-border data flows with regulatory mapping | Compliance + Data Platform | Cross-border data flow register (Section 8 fully verified) | Per-market instances |
| SBP audit readiness | Compliance | SBP compliance evidence pack | Phases 1-4 complete |
| Bangladesh Bank audit readiness | Compliance | BB compliance evidence pack | Phases 1-4 complete |
| Nepal Rastra Bank audit readiness | Compliance | NRB compliance evidence pack | Phases 1-4 complete |
| CBI Iraq audit readiness | Compliance | CBI compliance evidence pack | Phases 1-4 complete |
| PCI DSS certification (Cards) | Cards team + Security | SAQ or ROC completion | Phase 2 PCI gap remediation |
| Automated regulatory reporting per jurisdiction | Data Platform + Compliance | Monthly/quarterly automated report generation | Analytics architecture (Phase 3) |
| Annual penetration test covering all jurisdictions | Security team | Penetration test report with remediation | All phases |

Migration Timeline Summary

Month:  1    2    3    4    5    6    7    8    9   10   11   12   ...  18
        ├─────
        Phase 1: Classification & Retention
             ├──────────
             Phase 2: PII Encryption & Masking
                  ├───────────
                  Phase 3: Read Replicas & Reporting
                       ├─────────────────────────────────────
                       Phase 4: Database-per-Service + SurrealDB
                                                   ├───────────────────
                                                   Phase 5: Cross-Border Compliance

18. Appendix: Data Dictionary

18.1 Pay-Ins Domain: transaction Table

| Column | Data Type | Description | Classification | Encrypted | Masked | Retention |
|---|---|---|---|---|---|---|
| id | BIGINT (PK) | Auto-increment primary key | Internal | No | No | 7 years |
| AC | VARCHAR | Account code / reference | Confidential | Target: Yes (currently: No) | Yes (last 4) | 7 years |
| amount | DECIMAL | Transaction amount | Internal | No (RDS-level) | No | 7 years |
| msisdn | VARCHAR | Mobile subscriber number | Confidential | Target: Yes (currently: No) | Yes (03XX-XXXX-567) | 7 years |
| status | INT | Transaction status (0=pending, 1=success) | Internal | No | No | 7 years |
| operatorId | INT (FK) | Reference to operator table | Internal | No | No | 7 years |
| merchantId | INT (FK) | Reference to merchant_detail table | Internal | No | No | 7 years |
| created_at | DATETIME | Record creation timestamp | Internal | No | No | 7 years |
| updated_at | DATETIME | Last modification timestamp | Internal | No | No | 7 years |
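The two masking formats used in this table (last 4 characters visible for AC, and the `03XX-XXXX-567` pattern for MSISDN) can be sketched as follows. The function names are hypothetical, the `X` fill character for AC is an assumption, and the MSISDN rule assumes an 11-digit local number as in the example pattern; other lengths fall back to showing only the last 3 digits.

```python
def mask_last_n(value, visible=4, fill="X"):
    """Replace all but the last `visible` characters with the fill character."""
    if len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


def mask_msisdn(msisdn):
    """Mask an 11-digit MSISDN as 03XX-XXXX-567 (first 2 and last 3 digits kept)."""
    digits = "".join(ch for ch in msisdn if ch.isdigit())
    if len(digits) != 11:
        # Unknown format: reveal as little as possible.
        return mask_last_n(digits, visible=3)
    return f"{digits[:2]}XX-XXXX-{digits[-3:]}"


print(mask_msisdn("03001234567"))
print(mask_last_n("ACC12345678"))
```

Applying these functions in the response serialiser and the logging pipeline, rather than in individual handlers, keeps the masked form consistent everywhere a value is displayed.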

18.2 Pay-Ins Domain: merchant_detail Table

| Column | Data Type | Description | Classification | Encrypted | Masked | Retention |
|---|---|---|---|---|---|---|
| merchantId | INT (PK) | Merchant identifier | Internal | No | No | Lifetime + 5 years |
| contact | VARCHAR | Contact person name | Confidential | Target: Yes (currently: No) | Yes (D. O'Reilly) | Lifetime + 5 years |
| email | VARCHAR | Contact email | Confidential | Target: Yes (currently: No) | Yes (d***@simpaisa.com) | Lifetime + 5 years |
| phone | VARCHAR | Contact phone | Confidential | Target: Yes (currently: No) | Yes (last 4) | Lifetime + 5 years |
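The name and email masks shown in the table (`D. O'Reilly`, `d***@simpaisa.com`) suggest two simple rules: reduce the first name to an initial, and keep only the first character of the email's local part. The sketch below implements that reading; the function names are hypothetical, and the interpretation of the name mask (initial plus full surname) is an assumption drawn from the single example given.

```python
def mask_email(email):
    """Keep the first character of the local part and the full domain."""
    local, separator, domain = email.partition("@")
    if not separator or not local:
        return "***"  # not a recognisable address: reveal nothing
    return f"{local[0]}***@{domain}"


def mask_name(full_name):
    """Reduce the first name to an initial and keep the last name."""
    parts = full_name.split()
    if not parts:
        return ""
    if len(parts) == 1:
        return f"{parts[0][0]}."
    return f"{parts[0][0]}. {parts[-1]}"


print(mask_email("daniel@simpaisa.com"))
print(mask_name("Daniel O'Reilly"))
```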

18.3 Pay-Ins Domain: operator_credentials Table

| Column | Data Type | Description | Classification | Encrypted | Masked | Retention |
|---|---|---|---|---|---|---|
| id | INT (PK) | Primary key | Internal | No | No | Lifetime of integration |
| operatorId | INT (FK) | Reference to operator table | Internal | No | No | Lifetime |
| api_key | VARCHAR | Third-party API key | Restricted | Target: KMS envelope (currently: Unknown) | Never displayed | Destroyed on decommission |
| api_secret | VARCHAR | Third-party API secret | Restricted | Target: KMS envelope (currently: Unknown) | Never displayed | Destroyed on decommission |

18.4 Pay-Ins Domain: api_logs Table

| Column | Data Type | Description | Classification | Encrypted | Masked | Retention |
|---|---|---|---|---|---|---|
| id | BIGINT (PK) | Primary key | Internal | No | No | 12 months |
| requestBody | TEXT | Full API request body | Confidential | No (RDS-level) | Target: PII redacted before storage | 12 months |
| responseBody | TEXT | Full API response body | Confidential | No (RDS-level) | Target: PII redacted before storage | 12 months |
| endpoint | VARCHAR | API endpoint called | Internal | No | No | 12 months |
| created_at | DATETIME | Log timestamp | Internal | No | No | 12 months |

18.5 Pay-Ins Domain: Supporting Tables

| Table | Key Columns | Classification | Retention |
|---|---|---|---|
| product_configuration | merchantId, operatorId, fee_percentage, fee_flat | Internal | Lifetime of configuration |
| product | id, name, description | Internal | Indefinite |
| operator | id, name, country, status | Internal | Indefinite |
| operator_redirect_urls | operatorId, redirect_url, callback_url | Internal | Lifetime of integration |
| currency | code, name, symbol | Public | Indefinite |
| webhook | merchantId, url, events, secret | Internal (URL) / Restricted (secret) | Lifetime of configuration |
| transaction_refund | transactionId, refund_amount, status | Confidential | 7 years |
| refund_requests | transactionId, reason, status, requested_by | Confidential | 7 years |
| refund_transaction_status | refund_id, status, timestamp | Internal | 7 years |
| postback | transactionId, url, status, attempts, response_code | Internal | 24 months |
| external_response | external_code, internal_code, description | Internal | Indefinite |

18.6 Pay-Out Domain: Key Fields

| Field | Classification | Encrypted (Target) | Masked | Retention |
|---|---|---|---|---|
| Disbursement ID | Internal | No | No | 7 years |
| Beneficiary name | Confidential | Yes | Yes (D. O'Reilly) | 7 years |
| Beneficiary bank account | Confidential | Yes | Yes (last 4) | 7 years |
| Beneficiary IBAN | Confidential | Yes | Yes (country + last 4) | 7 years |
| Beneficiary CNIC/NID | Confidential | Yes | Yes (last 4) | 7 years |
| Disbursement amount | Internal | No (RDS-level) | No | 7 years |
| Disbursement status | Internal | No | No | 7 years |
| State (Published/In Review/On Hold/Stuck/Disbursed/Rejected) | Internal | No | No | 7 years |

18.7 Remittance Domain: Key Fields

| Field | Classification | Encrypted (Target) | Masked | Retention |
|---|---|---|---|---|
| Remittance ID | Internal | No | No | 7 years |
| Sender name | Confidential | Yes | Yes (D. O'Reilly) | 7 years |
| Sender ID document (passport/NID) | Confidential | Yes | Yes (last 3) | 10 years (AML) |
| Sender country | Internal | No | No | 7 years |
| Receiver name | Confidential | Yes | Yes (D. O'Reilly) | 7 years |
| Receiver MSISDN | Confidential | Yes | Yes (last 3) | 7 years |
| Receiver bank account | Confidential | Yes | Yes (last 4) | 7 years |
| Remittance amount | Internal | No (RDS-level) | No | 7 years |
| Source currency | Public | No | No | 7 years |
| Destination currency | Public | No | No | 7 years |
| State (9-state machine) | Internal | No | No | 7 years |
| AML review flag | Restricted | Yes | Not shown to non-compliance | 10 years |
| AML review notes | Restricted | Yes | Not shown to non-compliance | 10 years |

18.8 Cards Domain: Key Fields

| Field | Classification | Encrypted (Target) | Masked | Retention |
|---|---|---|---|---|
| Card token (Simpaisa token) | Confidential | Yes | Yes (last 4) | Lifetime of token |
| Card PAN | Restricted | Must not be stored post-authorisation | 411111XXXXXX1111 | 0 (not stored) |
| Card CVV | Restricted | Must never be stored | N/A | 0 (never stored) |
| Card expiry | Restricted | Yes | XX/XX | Lifetime of token |
| Cardholder name | Confidential | Yes | D. O'Reilly | Lifetime of token |
| Card brand | Internal | No | No | Lifetime of token |
| Issuing bank (BIN) | Internal | No | No | Lifetime of token |
| Transaction amount | Internal | No (RDS-level) | No | 7 years |
| Authorisation code | Confidential | Yes | Yes (last 4) | 7 years |
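The PAN mask shown above (`411111XXXXXX1111`) keeps the first six and last four digits and replaces everything in between. A minimal sketch of that rule follows; the function name is hypothetical, and the accepted length range of 13 to 19 digits is an assumption about typical card numbers rather than a rule stated in this document.

```python
def mask_pan(pan):
    """Keep the first 6 and last 4 digits of a PAN; mask the digits between."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    if not 13 <= len(digits) <= 19:
        raise ValueError("PAN must contain 13 to 19 digits")
    return digits[:6] + "X" * (len(digits) - 10) + digits[-4:]


print(mask_pan("4111 1111 1111 1111"))
```

Since the table specifies that the PAN is not stored after authorisation, a mask like this would apply only to transient display or to logging during the authorisation flow.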

Document Revision History

| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2 April 2026 | Daniel O'Reilly (CDO) | Initial version: full data architecture and governance framework |

This document is classified as Confidential and must not be shared outside Simpaisa Holdings without written authorisation from the CDO.