Structured Logging Standard
| Owner | Classification | Review Date | Status |
|---|---|---|---|
| Engineering | Internal | April 2027 | Active |
Version: 1.0
Last Updated: 2026-04-03
Owner: Platform Team
Status: Active
Purpose
Define a consistent, machine-parseable logging format across all Simpaisa services. Every log entry must support distributed tracing, payment debugging, and regulatory audit requirements.
Log Format
All services MUST emit JSON-structured logs. One JSON object per line, no multi-line entries.
Required Fields
| Field | Type | Description |
|---|---|---|
| `timestamp` | string | ISO 8601 with milliseconds and timezone: 2026-04-03T14:22:01.123Z |
| `level` | string | One of: TRACE, DEBUG, INFO, WARN, ERROR, FATAL |
| `service` | string | Service name, e.g. payin-svc, payout-svc, remittance-svc |
| `traceId` | string | OpenTelemetry trace ID (32-char hex). Propagated from incoming request |
| `spanId` | string | OpenTelemetry span ID (16-char hex) |
| `message` | string | Human-readable description of the event |
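As an illustration of the required fields, the following is a minimal sketch of a formatter that emits one JSON object per line. The function names `format_timestamp` and `build_log_line` are hypothetical and not part of any shared Simpaisa library; services would normally obtain `traceId` and `spanId` from the OpenTelemetry context rather than pass them by hand.

```python
import json
from datetime import datetime, timezone

LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"}


def format_timestamp(moment: datetime) -> str:
    """Render a timezone-aware datetime as ISO 8601 UTC with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_log_line(level, service, trace_id, span_id, message, moment=None, **context):
    """Return a single-line JSON string containing the six required fields."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    entry = {
        "timestamp": format_timestamp(moment or datetime.now(timezone.utc)),
        "level": level,
        "service": service,
        "traceId": trace_id,
        "spanId": span_id,
        "message": message,
    }
    # Contextual fields are added only when supplied and never override a required field.
    for key, value in context.items():
        if value is not None:
            entry.setdefault(key, value)
    # json.dumps escapes embedded newlines, so the result always stays on one line.
    return json.dumps(entry, separators=(",", ":"))
```

Contextual fields such as `merchantId` or `durationMs` can be passed as keyword arguments; any argument left as `None` is omitted from the output.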
Contextual Fields (when applicable)
| Field | Type | Description |
|---|---|---|
| `merchantId` | string | Merchant identifier |
| `transactionId` | string | Simpaisa transaction reference |
| `channelName` | string | Upstream channel, e.g. jazzcash, easypaisa, bkash |
| `amount` | number | Transaction amount |
| `currency` | string | ISO 4217 currency code: PKR, BDT, NPR, IQD |
| `transactionStatus` | string | Current state per TRANSACTION-LIFECYCLE-STANDARD.md |
| `error` | object | { "code": "string", "message": "string", "upstream": "string" } |
| `durationMs` | number | Operation duration in milliseconds |
Log Levels
| Level | Usage | Examples |
|---|---|---|
| `TRACE` | Fine-grained diagnostics. Never in production | Step-through of signature computation |
| `DEBUG` | Development and sandbox troubleshooting. Disabled in production by default | Full request/response bodies, parsed field values |
| `INFO` | Normal operational events | Transaction initiated, payment completed, webhook delivered |
| `WARN` | Recoverable issues that need attention | Channel retry triggered, rate limit approaching 80%, certificate expiry <30 days |
| `ERROR` | Failures requiring investigation | Channel timeout, signature verification failed, database connection lost |
| `FATAL` | Service cannot continue | Configuration missing, database unreachable on startup, TLS cert invalid |
PII Masking
All Personally Identifiable Information MUST be masked before logging. See PII-HANDLING-STANDARD.md for the complete policy.
| Data Type | Masking Rule | Example |
|---|---|---|
| MSISDN | Show last 4 digits | ****4567 |
| Account number | Show last 4 digits | ****7890 |
| Card number (PAN) | First 6 + last 4 (BIN preserved) | 424242****4242 |
| CNIC/NID | Show last 4 digits | *****4321 |
| OTP values | NEVER logged under any circumstance | — |
| Email | Mask local part | d***@example.com |
Services MUST apply masking at the logger level, not the caller. Use the shared logmask package.
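The masking rules in the table can be sketched as small pure functions. This is only an illustration and does not describe the API of the shared logmask package; the function names and the field keys in `MASKERS` (`msisdn`, `accountNumber`, `pan`, `cnic`, `email`, `otp`) are assumptions made for the example.

```python
import re


def mask_last4(value: str, stars: int = 4) -> str:
    """Keep the last four digits and replace everything before them with asterisks."""
    digits = re.sub(r"\D", "", value)
    if len(digits) <= 4:
        # Too short to reveal anything safely: mask the whole value.
        return "*" * stars
    return "*" * stars + digits[-4:]


def mask_pan(pan: str) -> str:
    """Keep the first six (BIN) and last four digits of a card number."""
    digits = re.sub(r"\D", "", pan)
    if len(digits) < 13:
        return "****"
    return digits[:6] + "****" + digits[-4:]


def mask_email(address: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = address.partition("@")
    if not sep or not local:
        return "***"
    return local[0] + "***@" + domain


# Hypothetical field names; a real service would use its own schema.
MASKERS = {
    "msisdn": mask_last4,
    "accountNumber": mask_last4,
    "pan": mask_pan,
    "cnic": lambda v: mask_last4(v, stars=5),  # the table example shows five asterisks
    "email": mask_email,
}
NEVER_LOGGED = {"otp"}


def mask_entry(entry: dict) -> dict:
    """Return a copy of a log entry with PII masked and OTP values dropped."""
    masked = {}
    for key, value in entry.items():
        if key in NEVER_LOGGED:
            continue
        masker = MASKERS.get(key)
        masked[key] = masker(str(value)) if masker else value
    return masked
```

Calling `mask_entry` inside the logger, just before serialisation, keeps masking out of calling code, which is the intent of the logger-level rule.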
Correlation & Tracing
- Every inbound request at KrakenD generates a `traceId` if none exists (W3C Trace Context `traceparent` header).
- All downstream service calls propagate the same `traceId`.
- Every log line includes `traceId` and `spanId` from the OpenTelemetry context.
- Async operations (webhooks, queue consumers) MUST carry the originating `traceId`.
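The propagation rules above can be sketched as follows. In practice the gateway and the OpenTelemetry SDK propagators handle this; the helper names `trace_context` and `outgoing_traceparent` are hypothetical, and the sketch assumes header names have already been lower-cased.

```python
import re
import secrets

# W3C Trace Context: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def trace_context(headers: dict) -> tuple[str, str]:
    """Return (traceId, spanId) for this hop, reusing an incoming traceId when valid."""
    match = TRACEPARENT.fullmatch(headers.get("traceparent", ""))
    valid = (
        match is not None
        and match.group(1) != "ff"          # version ff is invalid
        and match.group(2) != "0" * 32      # all-zero trace ID is invalid
        and match.group(3) != "0" * 16      # all-zero parent ID is invalid
    )
    trace_id = match.group(2) if valid else secrets.token_hex(16)
    # Each hop gets a fresh span ID; only the trace ID is shared end to end.
    span_id = secrets.token_hex(8)
    return trace_id, span_id


def outgoing_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Build the traceparent header for a downstream call or an async message."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"
```

For async work, the value returned by `outgoing_traceparent` could be stored alongside the queue message or webhook job so that the consumer resumes the originating trace.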
Log Pipeline
Services → OpenTelemetry Collector → OpenSearch → Grafana (dashboards/alerts)
- Services emit logs to stdout (JSON).
- OpenTelemetry Collector scrapes/receives logs, enriches with resource attributes, and forwards.
- OpenSearch is the primary log store and query interface.
- Grafana reads from OpenSearch for dashboards and alerting.
Retention
| Tier | Duration | Storage | Purpose |
|---|---|---|---|
| Hot | 90 days | OpenSearch | Active investigation, dashboards |
| Warm | 1 year | OpenSearch (warm nodes) | Historical debugging, trend analysis |
| Cold | 7 years | Object storage (S3-compatible) | Regulatory compliance (SBP, Bangladesh Bank) |
Log Volume Management
- Production: the default level is `INFO`. `DEBUG` can be enabled per service via a feature flag for time-limited troubleshooting and auto-reverts after 1 hour.
- Never log full request/response bodies at `INFO` or above. Log them at `DEBUG` level only.
- Never log credentials, API keys, or signing keys at any level.
- Batch operations: log a summary (count, success, failure), not individual items.
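Two of the rules above lend themselves to a short sketch: the one-hour auto-revert of `DEBUG` and the batch summary. Both helpers are hypothetical; in particular, how the feature flag records its activation time is an assumption, represented here by a `debug_enabled_at` timestamp.

```python
from datetime import datetime, timedelta, timezone

DEBUG_WINDOW = timedelta(hours=1)


def effective_level(debug_enabled_at, now):
    """Return DEBUG only while the flag is within its one-hour window, else INFO."""
    if debug_enabled_at is not None:
        elapsed = now - debug_enabled_at
        if timedelta(0) <= elapsed < DEBUG_WINDOW:
            return "DEBUG"
    return "INFO"


def summarise_batch(results):
    """Collapse per-item outcomes (True = success) into one summary for a single log line."""
    outcomes = list(results)
    success = sum(1 for ok in outcomes if ok)
    return {"count": len(outcomes), "success": success, "failure": len(outcomes) - success}
```

The summary dictionary can be attached to a single `INFO` entry for the whole batch instead of emitting one entry per item.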
Example Log Entries
Successful Payment
{
"timestamp": "2026-04-03T14:22:01.123Z",
"level": "INFO",
"service": "payin-svc",
"traceId": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4",
"spanId": "1a2b3c4d5e6f7a8b",
"message": "Payment completed successfully",
"merchantId": "MCH-001",
"transactionId": "TXN-20260403-00042",
"channelName": "jazzcash",
"amount": 1500.00,
"currency": "PKR",
"transactionStatus": "COMPLETED",
"durationMs": 2340
}
Failed Payment
{
"timestamp": "2026-04-03T14:22:05.456Z",
"level": "ERROR",
"service": "payin-svc",
"traceId": "b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5",
"spanId": "2b3c4d5e6f7a8b9c",
"message": "Payment failed: insufficient funds",
"merchantId": "MCH-002",
"transactionId": "TXN-20260403-00043",
"channelName": "easypaisa",
"amount": 25000.00,
"currency": "PKR",
"transactionStatus": "FAILED",
"error": { "code": "INSUFFICIENT_FUNDS", "message": "Account balance too low", "upstream": "EP-4012" },
"durationMs": 1820
}
Channel Timeout
{
"timestamp": "2026-04-03T14:22:30.789Z",
"level": "ERROR",
"service": "payin-svc",
"traceId": "c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6",
"spanId": "3c4d5e6f7a8b9c0d",
"message": "Channel request timed out after 30000ms",
"merchantId": "MCH-001",
"transactionId": "TXN-20260403-00044",
"channelName": "bkash",
"transactionStatus": "PROCESSING",
"error": { "code": "CHANNEL_TIMEOUT", "message": "No response within 30s", "upstream": null },
"durationMs": 30000
}
Rate Limit Hit
{
"timestamp": "2026-04-03T14:23:00.111Z",
"level": "WARN",
"service": "krakend-gateway",
"traceId": "d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1",
"spanId": "4d5e6f7a8b9c0d1e",
"message": "Rate limit exceeded for merchant",
"merchantId": "MCH-003",
"error": { "code": "RATE_LIMIT_EXCEEDED", "message": "100 req/min limit reached", "upstream": null }
}
Compliance
- All services MUST pass log format validation in CI (linter checks JSON structure against this schema).
- Log masking is verified by automated tests; any log containing raw PII fails the build.
- Quarterly audit: sample 1000 log entries across services and verify PII masking compliance.
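A CI format check along these lines could validate each emitted line against the required fields. This is a sketch under stated assumptions, not the actual linter: `validate_log_line` is a hypothetical name, and the check assumes lowercase hex IDs as in W3C Trace Context.

```python
import json
import re

LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"}
# ISO 8601 with milliseconds and either Z or a numeric UTC offset.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}(Z|[+-]\d{2}:\d{2})")
HEX32 = re.compile(r"[0-9a-f]{32}")
HEX16 = re.compile(r"[0-9a-f]{16}")

CHECKS = {
    "timestamp": lambda v: TIMESTAMP.fullmatch(v) is not None,
    "level": lambda v: v in LEVELS,
    "service": lambda v: v != "",
    "traceId": lambda v: HEX32.fullmatch(v) is not None,
    "spanId": lambda v: HEX16.fullmatch(v) is not None,
    "message": lambda v: v != "",
}


def validate_log_line(line: str) -> list[str]:
    """Return a list of problems found in one log line; an empty list means it passes."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    for field, check in CHECKS.items():
        value = entry.get(field)
        if not isinstance(value, str):
            problems.append(f"{field}: missing or not a string")
        elif not check(value):
            problems.append(f"{field}: invalid value")
    return problems
```

A CI job could run this over captured stdout from a service's test run and fail the build when any line returns a non-empty list.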