| Field | Value |
| --- | --- |
| Status | Draft |
| Owner | Platform Engineering |
| Last Updated | 2026-04-03 |
| Applies To | Shared platform infrastructure |
1. Overview
This document describes the shared platform services that underpin all Simpaisa product lines. These services provide API gateway routing, workflow orchestration, messaging, data persistence, observability, identity management and edge networking. Product services (pay-in-svc, pay-out-svc, remit-svc, cards-svc) depend on these platform components but do not own them.
```mermaid
graph TB
    subgraph "Edge — Cloudflare"
        CF_CDN["Cloudflare CDN"]
        CF_WAF["Cloudflare WAF"]
        CF_Workers["Cloudflare Workers<br/>(Edge logic, rate limiting)"]
        CF_Pages["Cloudflare Pages<br/>(Merchant Portal MFEs)"]
        CF_DNS["Cloudflare DNS"]
    end
    subgraph "API Gateway"
        KrakenD["KrakenD<br/>API Gateway<br/>Auth validation, routing,<br/>response aggregation"]
    end
    subgraph "Identity & Access"
        ControlPlane["ControlPlane.com<br/>OIDC, RBAC, service identity,<br/>certificate authority"]
    end
    subgraph "Workflow Orchestration"
        Temporal["Temporal Server<br/>Durable workflows<br/>(payouts, remittances, recon)"]
    end
    subgraph "Messaging"
        NSQ_D["nsqd (× 3)<br/>Message broker"]
        NSQ_Lookup["nsqlookupd (× 2)<br/>Discovery"]
        NSQ_Admin["nsqadmin<br/>Admin UI"]
    end
    subgraph "Data Persistence"
        SurrealDB["SurrealDB Cluster<br/>(Primary — PK region)<br/>Transactions, merchants, config"]
        SurrealDB_BD["SurrealDB Cluster<br/>(Secondary — BD region)<br/>Bangladesh data residency"]
        Redis["Redis Cluster<br/>Caching, rate limiting,<br/>OTP, idempotency"]
        Meilisearch["Meilisearch<br/>Transaction search,<br/>merchant search"]
    end
    subgraph "Observability"
        OTelCollector["OpenTelemetry Collector<br/>Traces, metrics, logs ingestion"]
        Grafana["Grafana<br/>Dashboards, alerting"]
        Jaeger["Jaeger<br/>Distributed tracing UI"]
    end
    subgraph "Product Analytics"
        PostHog["PostHog<br/>Feature flags, analytics,<br/>session replay"]
    end
    CF_CDN --> CF_WAF
    CF_WAF --> CF_Workers
    CF_Workers --> KrakenD
    CF_DNS --> CF_CDN
    CF_Pages --> CF_CDN
    KrakenD -- "JWT validation" --> ControlPlane
    KrakenD -- "gRPC" --> ProductServices["Product Services<br/>(pay-in, pay-out, remit, cards)"]
    ProductServices --> Temporal
    ProductServices --> NSQ_D
    ProductServices --> SurrealDB
    ProductServices --> Redis
    ProductServices --> Meilisearch
    ProductServices --> OTelCollector
    ProductServices --> PostHog
    NSQ_D --> NSQ_Lookup
    NSQ_Admin --> NSQ_Lookup
    OTelCollector --> Grafana
    OTelCollector --> Jaeger
```
3. Service Dependency Matrix
| Product Service | KrakenD | Temporal | NSQ | SurrealDB | Redis | Meilisearch | OTel | PostHog | ControlPlane |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pay-in-svc | ✓ | | ✓ | ✓ | ✓ | | ✓ | ✓ | ✓ |
| pay-out-svc | ✓ | ✓ | ✓ | ✓ | | | ✓ | ✓ | ✓ |
| remit-svc | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ | ✓ |
| cards-svc | ✓ | | ✓ | ✓ (CDE) | | | ✓ | ✓ | ✓ |
| merchant-svc | ✓ | | | ✓ | ✓ | ✓ | ✓ | | ✓ |
| notification-svc | | | ✓ | ✓ | | | ✓ | | |
| fx-svc | ✓ | | | | ✓ | | ✓ | | |
| fraud-svc | | | ✓ | ✓ | ✓ | | ✓ | ✓ | |
| recon-svc | | ✓ | ✓ | ✓ | | | ✓ | | |
| settlement-svc | | | ✓ | ✓ | | | ✓ | | |
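The matrix can also be read in the other direction, to estimate which product services are affected when a platform component degrades. The sketch below transcribes the table into a Python mapping; the helper name `dependents_of` is an illustrative assumption, not an existing tool.

```python
# Dependency matrix transcribed from the table in this section.
DEPENDENCIES = {
    "pay-in-svc": {"KrakenD", "NSQ", "SurrealDB", "Redis", "OTel", "PostHog", "ControlPlane"},
    "pay-out-svc": {"KrakenD", "Temporal", "NSQ", "SurrealDB", "OTel", "PostHog", "ControlPlane"},
    "remit-svc": {"KrakenD", "Temporal", "NSQ", "SurrealDB", "Redis", "OTel", "PostHog", "ControlPlane"},
    "cards-svc": {"KrakenD", "NSQ", "SurrealDB", "OTel", "PostHog", "ControlPlane"},
    "merchant-svc": {"KrakenD", "SurrealDB", "Redis", "Meilisearch", "OTel", "ControlPlane"},
    "notification-svc": {"NSQ", "SurrealDB", "OTel"},
    "fx-svc": {"KrakenD", "Redis", "OTel"},
    "fraud-svc": {"NSQ", "SurrealDB", "Redis", "OTel", "PostHog"},
    "recon-svc": {"Temporal", "NSQ", "SurrealDB", "OTel"},
    "settlement-svc": {"NSQ", "SurrealDB", "OTel"},
}


def dependents_of(component: str) -> list:
    """Return the product services that depend on a platform component (hypothetical helper)."""
    return sorted(svc for svc, deps in DEPENDENCIES.items() if component in deps)


if __name__ == "__main__":
    print(dependents_of("Temporal"))
```

For example, `dependents_of("Temporal")` lists the services that run durable workflows, which is the set to check first during a Temporal incident.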
4.1 KrakenD — API Gateway
| Capability | Configuration |
| --- | --- |
| Authentication | JWT validation via ControlPlane.com JWKS |
| Rate limiting | Per-merchant, per-endpoint |
| Request/response | JSON↔gRPC transcoding |
| Aggregation | Merge responses from multiple backends |
| Circuit breaker | Backend failure isolation |
| Telemetry | Export traces to OTel Collector |
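To make the circuit breaker row concrete, the following is a minimal sketch of the general pattern: after a number of consecutive backend errors the circuit opens and requests are rejected until a cool-down elapses. It is not KrakenD's implementation, and the thresholds (`max_errors`, `timeout_s`) are assumed values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CircuitBreaker:
    """Simplified consecutive-error circuit breaker (illustrative only)."""

    max_errors: int = 3      # assumed threshold, not taken from the gateway config
    timeout_s: float = 10.0  # assumed cool-down before a probe request is allowed
    _errors: int = field(default=0, init=False)
    _opened_at: Optional[float] = field(default=None, init=False)

    def allow(self, now: float) -> bool:
        """Return True if a request may be sent to the backend at time `now`."""
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: reject until the cool-down has elapsed, then let a probe through.
        return now - self._opened_at >= self.timeout_s

    def record(self, ok: bool, now: float) -> None:
        """Record the outcome of a backend call made at time `now`."""
        if ok:
            self._errors = 0
            self._opened_at = None  # success closes the circuit
        else:
            self._errors += 1
            if self._errors >= self.max_errors:
                self._opened_at = now  # open (or re-open) the circuit
```

Passing the clock in as `now` keeps the sketch deterministic and easy to test; a real gateway would read its own monotonic clock.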
4.2 Temporal — Workflow Orchestration
| Capability | Usage |
| --- | --- |
| Payout workflows | Validate → debit → transfer → settle |
| Remittance workflows | Quote → AML → disburse → settle |
| Reconciliation | Nightly file download → match → report |
| Retry policies | Per-activity, exponential back-off |
| Persistence | PostgreSQL backend for workflow state |
| Namespaces | One per environment (sandbox, prod) |
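As an illustration of per-activity exponential back-off, the sketch below computes the wait before each retry. The parameter names loosely mirror the fields of a Temporal retry policy (initial interval, back-off coefficient, maximum interval, maximum attempts), but the function itself and all default values are assumptions, not the platform's configured policy.

```python
def backoff_schedule(
    initial_s: float = 1.0,
    coefficient: float = 2.0,
    max_interval_s: float = 60.0,
    max_attempts: int = 5,
) -> list:
    """Return the delay in seconds before each retry.

    With `max_attempts` total attempts there are `max_attempts - 1` retries.
    Each delay grows by `coefficient` and is capped at `max_interval_s`.
    """
    delays = []
    delay = initial_s
    for _ in range(max_attempts - 1):
        delays.append(min(delay, max_interval_s))
        delay *= coefficient
    return delays


if __name__ == "__main__":
    print(backoff_schedule())
```

An activity that calls a slow partner bank might use a larger `initial_s` and cap, while a fast internal lookup would keep both small.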
4.3 NSQ — Messaging
| Capability | Configuration |
| --- | --- |
| Topics | Per-domain event (txn.payin.completed, etc.) |
| Channels | Per-consumer service |
| Replication | 3-node nsqd cluster |
| Message TTL | 7 days (configurable per topic) |
| Max in-flight | 200 per channel |
| Dead letter | After 5 failed attempts → DLQ topic |
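The dead-letter rule can be sketched as a small consumer-side decision: requeue a failed message until it has been attempted 5 times, then publish it to a DLQ topic. The `.dlq` suffix and the function name are hypothetical; the table does not state the DLQ naming convention.

```python
MAX_ATTEMPTS = 5  # from the table: dead-letter after 5 failed attempts


def route_failed_message(topic: str, attempts: int) -> tuple:
    """Decide what to do with a message whose handler just failed.

    `attempts` counts deliveries so far, including the one that failed.
    Returns (action, target_topic).
    """
    if attempts >= MAX_ATTEMPTS:
        # Assumed naming convention: "<topic>.dlq".
        return ("publish", f"{topic}.dlq")
    return ("requeue", topic)


if __name__ == "__main__":
    print(route_failed_message("txn.payin.completed", 5))
```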
4.4 SurrealDB — Primary Data Store
| Capability | Configuration |
| --- | --- |
| Deployment | Clustered, multi-node |
| PK region cluster | Primary — all markets except BD |
| BD region cluster | Bangladesh data residency compliance |
| CDE cluster | Cards — PCI DSS isolated |
| Replication | Synchronous within cluster |
| Encryption | AES-256 at rest, mTLS in transit |
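A service that writes data has to pick one of the three clusters. The sketch below shows one possible routing rule. It assumes that card data always goes to the CDE cluster regardless of market, which the table does not confirm; the function name and the cluster identifiers are also hypothetical.

```python
def select_cluster(market: str, is_card_data: bool = False) -> str:
    """Pick a SurrealDB cluster for a write (illustrative routing rule).

    Assumption: card data takes precedence over market-based routing.
    """
    if is_card_data:
        return "cde"  # PCI DSS isolated cluster
    if market.upper() == "BD":
        return "bd"   # Bangladesh data residency
    return "pk"       # primary cluster for all other markets


if __name__ == "__main__":
    print(select_cluster("BD"))
```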
4.5 Redis — Caching
| Use Case | TTL |
| --- | --- |
| OTP codes | 5 minutes |
| Idempotency keys | 24 hours |
| FX rate cache | 120 seconds |
| Rate limit counters | Sliding window (60 s) |
| Session tokens | 30 minutes |
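The sliding-window counter can be illustrated without a Redis client. The sketch below keeps timestamps in memory and applies the 60 s window from the table; a Redis-backed version would typically store the same timestamps in a sorted set per key. The class name, the key format and the `limit` value are assumptions.

```python
from collections import defaultdict, deque

# TTLs from the table, in seconds.
TTL_SECONDS = {"otp": 300, "idempotency": 86_400, "fx_rate": 120, "session": 1_800}


class SlidingWindowLimiter:
    """In-memory sliding-window rate limiter (illustrative stand-in for Redis)."""

    def __init__(self, limit: int, window_s: float = 60.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        """Return True and record the hit if `key` is under its limit at `now`."""
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A per-merchant, per-endpoint limit would use a composite key such as `"merchant-123:/payin"`, so each merchant and endpoint pair gets its own window.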
4.6 Observability Stack
```mermaid
graph LR
    Services["Product Services"] -- "OTLP gRPC" --> Collector["OTel Collector"]
    Collector -- "Traces" --> Jaeger
    Collector -- "Metrics" --> Prometheus["Prometheus"]
    Collector -- "Logs" --> Loki["Loki"]
    Prometheus --> Grafana
    Loki --> Grafana
    Jaeger --> Grafana
    Grafana -- "Alerts" --> PagerDuty["PagerDuty"]
    Grafana -- "Alerts" --> Slack["Slack"]
```
All services emit traces, metrics and logs via the OpenTelemetry SDK. The OTel Collector routes data to appropriate backends. Grafana provides unified dashboards and alerting.
4.7 PostHog — Product Analytics & Feature Flags
| Capability | Usage |
| --- | --- |
| Feature flags | Progressive rollout of new channels |
| Analytics events | Merchant portal usage tracking |
| Session replay | Debug merchant portal issues |
| Experimentation | A/B test fraud rule thresholds |
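Progressive rollout usually relies on deterministic bucketing, so that a given merchant stays in or out of a rollout as the percentage grows. The sketch below shows the general idea with a SHA-256 hash. It is not PostHog's algorithm, and the function name, flag key and merchant identifier are hypothetical.

```python
import hashlib


def in_rollout(flag_key: str, merchant_id: str, rollout_percent: float) -> bool:
    """Return True if the merchant falls inside the rollout percentage.

    The same (flag, merchant) pair always maps to the same bucket, so
    raising the percentage only ever adds merchants to the rollout.
    """
    digest = hashlib.sha256(f"{flag_key}:{merchant_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return bucket < rollout_percent / 100


if __name__ == "__main__":
    print(in_rollout("new-channel", "merchant-123", 25))
```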
4.8 ControlPlane.com — Identity & Access
| Capability | Usage |
| --- | --- |
| OIDC provider | Merchant Portal SSO |
| Service identity | mTLS certificate issuance |
| RBAC | Role-based access for portal users |
| K8s integration | Workload identity for pods |
5. Architectural Decision Records
Changes to platform services require an ADR in /Standards/ADR/.