Rate Limiting Policy¶
| Owner | Classification | Review Date | Status |
|---|---|---|---|
| Security | Confidential | April 2027 | Active |
Organisation: Simpaisa Holdings
Document Owner: Daniel O'Reilly, Chief Digital Officer
Classification: Internal
Version: 1.0
Date: 3 April 2026
Status: Active
1. Purpose¶
Simpaisa processes 270M+ transactions worth $1B+ annually across Pakistan, Bangladesh, Nepal, Iraq, and Egypt. Rate limiting is a critical control that:
- Protects payment infrastructure: prevents resource exhaustion that could cause platform-wide outages affecting all merchants and all markets simultaneously.
- Prevents abuse: defends against brute-force attacks on OTP endpoints, credential stuffing, transaction enumeration, and denial-of-service attempts.
- Ensures fair usage: prevents any single merchant from consuming disproportionate resources, ensuring equitable service quality across all integrations.
- Preserves upstream channel availability: respects the rate limits imposed by telco operators (Easypaisa, JazzCash, bKash, etc.) and banking partners, preventing Simpaisa from being throttled or blocked by upstream providers.
- Supports regulatory compliance: provides audit evidence of security controls required by SBP, Bangladesh Bank, Nepal Rastra Bank, and CBI.
Current state: Rate limiting is rated 2/10 (Critical) in the Security Architecture assessment. No documented rate limits exist. This policy defines the target state.
2. Rate Limiting Architecture¶
2.1 Enforcement Point¶
Rate limiting is enforced at the KrakenD API Gateway layer, not within individual services. This is a deliberate architectural decision:
```text
Client Request
       │
       ▼
┌──────────────┐
│  Cloudflare  │ ← L7 DDoS protection, IP reputation, bot management
│    (Edge)    │   Global per-IP limits enforced here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   KrakenD    │ ← Per-merchant, per-endpoint, per-IP rate limiting
│  (Gateway)   │   All rate limit headers injected here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Go Service  │ ← No rate limiting logic in services
│  (Backend)   │   Services trust that gateway has enforced limits
└──────────────┘
```
2.2 Why Gateway-Level, Not Per-Service¶
| Reason | Detail |
|---|---|
| Single enforcement point | Avoids inconsistent limits across services; one configuration governs all |
| Centralised visibility | All rate limit events flow through one component for monitoring and alerting |
| No service code changes | Adding or modifying limits requires gateway configuration changes only |
| Consistent headers | Rate limit response headers are injected uniformly regardless of backend service |
| Upstream protection | Gateway can enforce limits before requests reach services, protecting service resources |
2.3 Defence in Depth¶
Cloudflare provides the first layer of defence at the edge:
| Cloudflare Control | Purpose |
|---|---|
| IP reputation scoring | Block known-bad IPs before they reach the gateway |
| Bot management | Challenge automated traffic with Turnstile |
| L7 DDoS mitigation | Absorb volumetric attacks at the edge |
| Geographic restrictions | Block traffic from non-operational countries if required |
| WAF rules | Block malicious payloads before rate limiting applies |
KrakenD provides the second layer with application-aware limits based on merchant identity, endpoint, and request context.
3. Rate Limit Tiers¶
3.1 Per-Merchant Global Rate Limit¶
| Parameter | Value |
|---|---|
| Default limit | 1,000 requests per minute |
| Scope | All endpoints combined, per merchantId |
| Configurable | Yes — per merchant contract |
| Minimum | 100 requests per minute |
| Maximum | 10,000 requests per minute (requires CDO approval) |
| Identification | merchantId extracted from authenticated request (API key or signed header) |
Merchants exceeding their global limit receive 429 Too Many Requests regardless of which endpoint they are calling.
3.2 Per-Merchant Per-Endpoint Rate Limits¶
| Endpoint Category | Rate Limit | Scope | Rationale |
|---|---|---|---|
| Payment initiation (Pay-In, Pay-Out, Remittance create) | 100 req/min | Per merchantId | Payment creation is expensive — hits upstream channels, creates state |
| Inquiry / status (transaction status, balance check, rate inquiry) | 500 req/min | Per merchantId | Read-heavy; merchants poll for status; must be generous |
| OTP request | 3 req/5 min | Per mobile number per merchantId | Prevents OTP bombing; protects SMS costs |
| OTP verification | 5 attempts/OTP | Per OTP session | Prevents brute-force; locks OTP after 5 failures |
| Webhook management (register, update, delete) | 10 req/hour | Per merchantId | Low-frequency operations; prevents configuration churn |
| Merchant onboarding (KYC submission, document upload) | 20 req/hour | Per merchantId | Protects file upload resources |
| Refund initiation | 50 req/min | Per merchantId | Lower than payment initiation; refunds are less frequent |
| Bulk/batch operations | 10 req/min | Per merchantId | Each request contains multiple items; protects backend processing |
| FX rate inquiry | 200 req/min | Per merchantId | High-frequency for remittance corridors; cached at gateway |
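The OTP verification row above (lock the OTP after 5 failures) can be sketched as follows. This is a minimal Python illustration, not the gateway's implementation: the `OtpSession` class and its fields are hypothetical names, and it assumes a locked session rejects even a correct code.

```python
import hmac


class OtpSession:
    """Hypothetical OTP session that locks after MAX_ATTEMPTS failed verifications."""

    MAX_ATTEMPTS = 5  # from the policy table: 5 attempts per OTP

    def __init__(self, code: str):
        self._code = code
        self.failures = 0

    @property
    def locked(self) -> bool:
        return self.failures >= self.MAX_ATTEMPTS

    def verify(self, submitted: str) -> bool:
        if self.locked:
            return False  # assumption: a locked OTP rejects even the right code
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(submitted, self._code):
            return True
        self.failures += 1
        return False
```

Once `locked` is true the merchant would need to request a fresh OTP, which is itself subject to the 3 requests per 5 minutes limit.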
3.3 Per-IP Rate Limits¶
| Scenario | Rate Limit | Scope | Action on Exceed |
|---|---|---|---|
| Sandbox / unauthenticated | 200 req/min | Per IP address | 429 + Retry-After header |
| Authentication failures | 5 attempts/15 min | Per IP address | Temporary IP block (15 minutes) |
| Global per-IP (authenticated) | 1,000 req/min | Per IP address | 429 + Cloudflare challenge |
| Unauthenticated endpoints (health, docs) | 60 req/min | Per IP address | 429 |
3.4 Burst Allowance¶
| Parameter | Value |
|---|---|
| Burst multiplier | 2x sustained limit |
| Burst window | 10 seconds |
| Behaviour | Permits short traffic spikes (e.g., batch submission) without triggering 429 |
| Recovery | After burst window expires, rate returns to sustained limit; no penalty |
Example: A merchant with a 100 req/min payment initiation limit can send at up to twice the sustained rate (200 req/min, or about 33 requests) within any 10-second window, provided the total over the full minute does not exceed 100.
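The burst parameters can be read as allowing twice the sustained per-second rate for the length of the burst window. Under that reading (an assumption, since the policy does not spell out the formula), the number of requests a burst window admits works out as below. The function name `burst_capacity` is hypothetical.

```python
def burst_capacity(limit_per_min: int, multiplier: int = 2, window_s: int = 10) -> int:
    """Requests admitted in one burst window.

    Assumes the multiplier scales the sustained rate (limit_per_min / 60
    requests per second) for window_s seconds. Integer arithmetic rounds down.
    """
    return (limit_per_min * multiplier * window_s) // 60


print(burst_capacity(100))   # payment initiation, Standard tier
print(burst_capacity(1000))  # global limit, Standard tier
```

For the 100 req/min payment initiation limit this gives 33 requests per 10-second window; the per-minute total is still capped by the sustained limit.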
3.5 Tiered Merchant Plans¶
| Tier | Global Limit | Payment Initiation | Inquiry/Status | Monthly Transaction Volume |
|---|---|---|---|---|
| Standard | 1,000 req/min | 100 req/min | 500 req/min | Up to 1M transactions |
| Professional | 3,000 req/min | 300 req/min | 1,500 req/min | 1M–10M transactions |
| Enterprise | 5,000 req/min | 500 req/min | 3,000 req/min | 10M+ transactions |
| Custom | Per contract | Per contract | Per contract | By agreement |
4. Payment Channel Rate Limits¶
Simpaisa must respect upstream channel limits imposed by telco operators and banking partners. If an upstream channel's rate limit is lower than Simpaisa's default, the more restrictive limit applies.
4.1 Known Channel Limits¶
| Channel | Market | Upstream Limit | Simpaisa Limit | Notes |
|---|---|---|---|---|
| Easypaisa | PK | Document per integration | Per channel agreement | Telenor Microfinance Bank API limits |
| JazzCash | PK | Document per integration | Per channel agreement | Jazz/Mobilink API limits |
| UBL Omni | PK | Document per integration | Per channel agreement | Banking API limits |
| HBL | PK | Document per integration | Per channel agreement | Banking API limits |
| bKash | BD | Document per integration | Per channel agreement | bKash merchant API limits |
| Nagad | BD | Document per integration | Per channel agreement | Nagad API limits |
| eSewa | NP | Document per integration | Per channel agreement | eSewa API limits |
| Khalti | NP | Document per integration | Per channel agreement | Khalti API limits |
4.2 Channel Rate Limit Management¶
| Requirement | Implementation |
|---|---|
| Documentation | Every channel integration must document the upstream provider's rate limits in the Vendor Integration Register |
| Configuration | KrakenD rate limit configuration must include per-channel limits derived from upstream constraints |
| Circuit breaking | If a channel returns throttling responses (HTTP 429 or equivalent), KrakenD must back off and queue requests |
| Monitoring | Track per-channel request volume vs. upstream limit; alert at 80% utilisation |
| Fair allocation | When multiple merchants share a channel, implement fair-share allocation to prevent one merchant exhausting the channel |
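The policy does not say how fair-share allocation should work. One common approach is max-min fairness: split the channel's capacity evenly, and redistribute whatever a low-demand merchant does not need to those still waiting. The sketch below illustrates that approach only; `fair_share` and its arguments are hypothetical names, and demands are requests per window.

```python
def fair_share(capacity: int, demands: dict[str, int]) -> dict[str, int]:
    """Max-min fair split of a channel's capacity across merchants."""
    allocation = {merchant: 0 for merchant in demands}
    unsatisfied = {m: d for m, d in demands.items() if d > 0}
    remaining = capacity
    while unsatisfied and remaining > 0:
        share = remaining // len(unsatisfied)
        if share == 0:
            break  # fewer requests left than merchants; the remainder stays unallocated
        for merchant in list(unsatisfied):
            grant = min(share, unsatisfied[merchant])
            allocation[merchant] += grant
            remaining -= grant
            unsatisfied[merchant] -= grant
            if unsatisfied[merchant] == 0:
                del unsatisfied[merchant]
    return allocation


print(fair_share(100, {"m1": 20, "m2": 50, "m3": 80}))
```

With a channel capacity of 100 and demands of 20, 50, and 80, the small merchant is served in full and the other two split the rest evenly (20, 40, 40).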
5. Response Headers¶
All API responses MUST include rate limit headers. The header names follow the widely used X-RateLimit-* convention; the IETF draft (RateLimit Header Fields for HTTP) defines comparable fields without the X- prefix.
5.1 Standard Headers¶
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum number of requests permitted in the current window | X-RateLimit-Limit: 100 |
| X-RateLimit-Remaining | Number of requests remaining in the current window | X-RateLimit-Remaining: 73 |
| X-RateLimit-Reset | Unix epoch timestamp (seconds) when the current window resets | X-RateLimit-Reset: 1712160000 |
| Retry-After | Seconds until the client should retry (only on 429 responses) | Retry-After: 30 |
5.2 Header Rules¶
| Rule | Detail |
|---|---|
| Always present | Rate limit headers MUST be included on every response (2xx, 4xx, 5xx), not only on 429 |
| Most restrictive | When multiple limits apply (global + per-endpoint), headers reflect the most restrictive limit |
| Reset accuracy | X-RateLimit-Reset MUST reflect the actual window reset time, not an approximation |
| Retry-After on 429 | Every 429 response MUST include Retry-After with the number of seconds to wait |
| No negative values | X-RateLimit-Remaining MUST NOT be negative; minimum value is 0 |
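The header rules above can be sketched as a small function. This is an illustration under stated assumptions, not the KrakenD configuration: `LimitState` and `rate_limit_headers` are hypothetical names, "most restrictive" is taken to mean the limit with the fewest requests remaining, and Retry-After is derived from the reset time.

```python
from dataclasses import dataclass


@dataclass
class LimitState:
    limit: int        # maximum requests in the window
    used: int         # requests counted so far in the window
    reset_epoch: int  # Unix time (seconds) when the window resets


def rate_limit_headers(states: list[LimitState], now: int, denied: bool) -> dict[str, str]:
    # Assumption: "most restrictive" = fewest remaining requests, clamped at zero.
    tightest = min(states, key=lambda s: max(s.limit - s.used, 0))
    headers = {
        "X-RateLimit-Limit": str(tightest.limit),
        "X-RateLimit-Remaining": str(max(tightest.limit - tightest.used, 0)),
        "X-RateLimit-Reset": str(tightest.reset_epoch),
    }
    if denied:
        # Retry-After appears on 429 responses only; wait at least one second.
        headers["Retry-After"] = str(max(tightest.reset_epoch - now, 1))
    return headers
```

Given a global limit with 600 requests remaining and an endpoint limit with 73 remaining, the headers report the endpoint limit.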
5.3 Example Response Headers (Normal Request)¶
```http
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1712160060
X-Request-Id: txn-abc-123-def-456
```
5.4 Example Response Headers (Rate Limited)¶
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712160060
Retry-After: 30
X-Request-Id: txn-abc-123-def-456
```
6. 429 Response Format¶
Rate-limited responses use the unified error schema defined in API-STANDARDS.md:
```json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded the rate limit for this endpoint. Please retry after the period indicated in the Retry-After header.",
    "details": [
      {
        "field": "endpoint",
        "issue": "Payment initiation limit of 100 requests per minute exceeded"
      }
    ]
  },
  "traceId": "abc-123-def-456",
  "timestamp": "2026-04-03T12:00:00Z"
}
```
6.1 Error Codes for Rate Limiting¶
| Error Code | Trigger | Message |
|---|---|---|
| RATE_LIMIT_EXCEEDED | Any rate limit exceeded | "You have exceeded the rate limit for this endpoint" |
| RATE_LIMIT_GLOBAL | Merchant global limit exceeded | "You have exceeded your global rate limit across all endpoints" |
| RATE_LIMIT_BURST | Burst allowance exceeded | "Request rate too high; burst allowance exhausted" |
| RATE_LIMIT_OTP | OTP request limit exceeded | "Too many OTP requests for this mobile number" |
| IP_BLOCKED_TEMPORARY | IP temporarily blocked due to auth failures | "Your IP has been temporarily blocked due to repeated authentication failures" |
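As a rough illustration of how the error codes map onto the 429 body, the sketch below assembles a response in the shape of the example in section 6. The function name `rate_limit_error` and its parameters are hypothetical; the messages are copied from the table, and the full schema is defined in API-STANDARDS.md, not here.

```python
from datetime import datetime, timezone

# Messages copied from the error code table.
MESSAGES = {
    "RATE_LIMIT_EXCEEDED": "You have exceeded the rate limit for this endpoint",
    "RATE_LIMIT_GLOBAL": "You have exceeded your global rate limit across all endpoints",
    "RATE_LIMIT_BURST": "Request rate too high; burst allowance exhausted",
    "RATE_LIMIT_OTP": "Too many OTP requests for this mobile number",
    "IP_BLOCKED_TEMPORARY": "Your IP has been temporarily blocked due to repeated authentication failures",
}


def rate_limit_error(code: str, field: str, issue: str, trace_id: str, now: datetime) -> dict:
    """Build a 429 body with the same keys as the example response."""
    return {
        "error": {
            "code": code,
            "message": MESSAGES[code],
            "details": [{"field": field, "issue": issue}],
        },
        "traceId": trace_id,
        # UTC timestamp in the same format as the example (seconds precision, "Z" suffix).
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```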
7. Rate Limit Storage¶
7.1 Storage Backend¶
| Parameter | Value |
|---|---|
| Backend | Redis (ElastiCache — existing infrastructure) |
| Algorithm | Sliding window log |
| Key format | rl:{scope}:{identifier}:{endpoint} |
| TTL | Window duration + 10 seconds buffer |
| Cluster | Dedicated Redis cluster for rate limiting (not shared with application cache) |
7.2 Why Sliding Window¶
| Algorithm | Pros | Cons | Decision |
|---|---|---|---|
| Fixed window | Simple, low memory | Burst at window boundary (2x actual rate) | Rejected |
| Sliding window log | Accurate, no boundary burst | Higher memory per key | Selected |
| Token bucket | Smooth, configurable burst | Complex state management | Considered for future |
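To make the selected algorithm concrete, here is a minimal in-memory sketch of a sliding window log. It is illustrative only: the production design stores counters in Redis (where this pattern is commonly built on a sorted set of timestamps), and the class name `SlidingWindowLog` is hypothetical. The caller passes the current time so the behaviour is easy to test.

```python
from collections import deque


class SlidingWindowLog:
    """Allow at most `limit` requests per key in any trailing `window_s` seconds."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self._log: dict[str, deque] = {}  # key -> timestamps of admitted requests

    def allow(self, key: str, now: float) -> bool:
        log = self._log.setdefault(key, deque())
        # Evict timestamps that have fallen out of the trailing window.
        while log and log[0] <= now - self.window_s:
            log.popleft()
        if len(log) >= self.limit:
            return False  # denied requests are not logged
        log.append(now)
        return True
```

Because every admitted request keeps its own timestamp, there is no window boundary at which a client can double its rate; the cost is one stored entry per request, which is the memory trade-off noted in the table.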
7.3 Key Schema¶
```text
# Per-merchant global
rl:merchant:{merchantId}:global

# Per-merchant per-endpoint
rl:merchant:{merchantId}:endpoint:{endpointCategory}

# Per-IP
rl:ip:{ipAddress}:global

# Per-mobile (OTP)
rl:otp:{merchantId}:{mobileNumber}
```
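The key schema translates directly into small builder functions. The sketch below is one possible shape; the function names are hypothetical, and the TTL helper applies the "window duration + 10 seconds" rule from section 7.1.

```python
RL_TTL_BUFFER_S = 10  # buffer added to the window duration (section 7.1)


def merchant_global_key(merchant_id: str) -> str:
    return f"rl:merchant:{merchant_id}:global"


def merchant_endpoint_key(merchant_id: str, endpoint_category: str) -> str:
    return f"rl:merchant:{merchant_id}:endpoint:{endpoint_category}"


def ip_key(ip_address: str) -> str:
    return f"rl:ip:{ip_address}:global"


def otp_key(merchant_id: str, mobile_number: str) -> str:
    return f"rl:otp:{merchant_id}:{mobile_number}"


def key_ttl(window_s: int) -> int:
    """TTL in seconds for a key whose rate limit window lasts window_s seconds."""
    return window_s + RL_TTL_BUFFER_S
```

One caveat when consuming these keys: IPv6 addresses contain colons, so tooling that parses a key by splitting on `:` will misread per-IP keys.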
7.4 Failure Mode¶
| Scenario | Behaviour |
|---|---|
| Redis unavailable | Fail open — requests pass through without rate limiting; alert raised immediately |
| Redis latency > 50ms | Log warning; consider circuit breaker to fail open |
| Redis data loss (failover) | All counters reset; merchants get a fresh window; no 429s during recovery |
Rationale for fail-open: For a payment gateway, a false 429 (blocking a legitimate payment) is worse than temporarily allowing excess traffic. Redis unavailability is already a monitored incident and would trigger immediate investigation.
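A fail-open check can be sketched as a thin wrapper around whatever function consults the store. This is an assumption-laden illustration: `check_fail_open` is a hypothetical name, and the built-in `ConnectionError` stands in for whatever error the Redis client actually raises.

```python
import logging

logger = logging.getLogger("ratelimit")


def check_fail_open(check, key: str) -> bool:
    """Return the limiter's decision, or allow the request if the store is unreachable."""
    try:
        return check(key)
    except ConnectionError as exc:  # stand-in for the Redis client's error type
        # Fail open: a false 429 on a legitimate payment is the worse outcome.
        logger.error("rate limit store unavailable, failing open: %s", exc)
        return True
```

Only store failures are swallowed. A normal "deny" decision from the limiter still passes through unchanged, so the wrapper cannot mask a correctly enforced limit.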
8. Exemptions¶
8.1 Exempt Traffic¶
| Traffic Type | Exemption | Rationale |
|---|---|---|
| Webhook delivery retries | Exempt from rate limits | Simpaisa-initiated; retry logic must not be throttled by our own limits |
| Internal service-to-service | Exempt from merchant rate limits | Identified by mTLS certificates; governed by service mesh policies |
| Health check endpoints | Exempt from authenticated limits | /health, /ready must always respond for load balancer probes |
| Cloudflare Workers | Exempt from IP-based limits | Identified by Cloudflare service token header |
8.2 Non-Exempt Traffic¶
| Traffic Type | Rate Limits Apply | Notes |
|---|---|---|
| Sandbox / test environment | Yes — reduced limits | Sandbox uses per-IP limits (200 req/min) |
| Merchant SDK traffic | Yes — standard merchant limits | SDK requests carry merchant credentials |
| Third-party integrator traffic | Yes — standard merchant limits | Integrators operate under the merchant's allocation |
9. Monitoring & Alerting¶
9.1 Grafana Dashboard — Rate Limit Monitoring¶
The rate limiting dashboard MUST display:
| Panel | Description | Refresh |
|---|---|---|
| 429s per merchant | Time series of rate-limited requests per merchantId | 15 seconds |
| 429s per endpoint | Time series of rate-limited requests per endpoint category | 15 seconds |
| Top 10 throttled merchants | Table of merchants with highest 429 counts (rolling 1 hour) | 1 minute |
| Rate limit utilisation | Per-merchant utilisation as percentage of their limit | 1 minute |
| Channel utilisation | Per-channel request volume vs. upstream limit | 30 seconds |
| Redis health | Rate limit Redis cluster latency, memory, connection count | 15 seconds |
9.2 Alerts¶
| Alert | Condition | Severity | Action |
|---|---|---|---|
| Sustained 429s | Merchant receives > 50 rate-limited responses in 5 minutes | Warning | Notify merchant via DevEx portal; investigate if legitimate traffic growth |
| Burst 429s | Merchant receives > 200 rate-limited responses in 1 minute | High | Investigate for potential abuse or misconfigured integration |
| IP block triggered | Any IP blocked due to authentication failures | Medium | Log for security review; check for credential stuffing |
| Channel near limit | Upstream channel at > 80% of its rate limit | Warning | Review traffic distribution; consider queuing |
| Channel at limit | Upstream channel at 100% of its rate limit | High | Activate queuing; notify affected merchants |
| Redis rate limit cluster down | Redis unavailable or latency > 100ms | Critical | Rate limiting fail-open; investigate immediately |
| Anomalous traffic pattern | Single merchant's request volume increases > 5x in 1 hour | Warning | Investigate; may indicate integration error or abuse |
9.3 Metrics (OpenTelemetry)¶
All rate limiting events emit OpenTelemetry metrics:
| Metric | Type | Labels |
|---|---|---|
| simpaisa.ratelimit.requests.total | Counter | merchant_id, endpoint, result (allowed/denied) |
| simpaisa.ratelimit.latency.ms | Histogram | operation (check/increment) |
| simpaisa.ratelimit.utilisation.ratio | Gauge | merchant_id, limit_type |
| simpaisa.ratelimit.redis.errors | Counter | error_type |
10. Merchant Communication¶
10.1 Developer Experience Portal¶
Rate limits MUST be documented in the merchant DevEx portal with:
| Content | Detail |
|---|---|
| Rate limit overview | Explanation of all tiers and how limits are applied |
| Current limits | Per-merchant view showing their contracted limits |
| Usage dashboard | Real-time view of current utilisation against limits |
| Best practices | Guidance on implementing exponential backoff, request batching, caching status responses |
| Code examples | SDK examples showing proper 429 handling in Java, Python, Node.js, PHP, C# |
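As an indication of what the 429-handling guidance might look like on the merchant side, the sketch below honours Retry-After when present and otherwise falls back to exponential backoff with jitter. It is not taken from a Simpaisa SDK: `call_with_backoff` and the `send` callable (returning a status code, a header dict, and a body) are hypothetical, and the delays are illustrative defaults.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    send() must return a (status_code, headers, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)  # the server told us how long to wait
        else:
            # Exponential backoff, capped, with jitter to avoid synchronised retries.
            delay = min(base_delay * 2 ** attempt, max_delay)
            delay *= random.uniform(0.5, 1.0)
        sleep(delay)
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in production code the default `time.sleep` applies.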
10.2 Approaching Limits Notification¶
| Threshold | Action |
|---|---|
| 80% utilisation | DevEx portal displays yellow warning banner |
| 90% utilisation | Email notification to merchant's technical contact |
| 100% utilisation (first 429) | Webhook notification to merchant (if configured) |
| Sustained 429s (> 5 minutes) | Email to merchant's technical contact + account manager notified |
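The utilisation thresholds in the table reduce to a simple mapping. In the sketch below the function name and the returned labels are hypothetical placeholders for whatever the notification service expects; the sustained-429 row is omitted because it depends on duration, not utilisation.

```python
from typing import Optional


def notification_action(used: int, limit: int) -> Optional[str]:
    """Map current utilisation to the notification named in the threshold table."""
    # Integer comparisons avoid floating-point edge cases at exact thresholds.
    if used >= limit:
        return "webhook"        # 100%: first 429, webhook if configured
    if used * 100 >= limit * 90:
        return "email"          # 90%: email to the technical contact
    if used * 100 >= limit * 80:
        return "portal_banner"  # 80%: yellow warning banner in the DevEx portal
    return None
```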
10.3 Sandbox Rate Limit Testing¶
The sandbox environment provides a /v3/test/rate-limit endpoint that merchants can use to:
- Verify their 429 handling logic
- Test Retry-After header parsing
- Validate exponential backoff implementation

This endpoint artificially returns 429 on every other request.
11. Escalation & Limit Increases¶
11.1 Standard Limit Increase¶
| Step | Action | Owner |
|---|---|---|
| 1 | Merchant submits limit increase request via DevEx portal or support ticket | Merchant |
| 2 | Account manager reviews request against merchant's transaction volume and contract | Account Manager |
| 3 | Technical review — assess infrastructure capacity for requested limits | Platform Engineering |
| 4 | Approval and configuration update in KrakenD | Platform Engineering |
| 5 | Merchant notified of new limits; DevEx portal updated | Account Manager |
SLA: Standard limit increase requests processed within 3 business days.
11.2 CDO Approval Required¶
| Scenario | Approval Required |
|---|---|
| Global limit > 5,000 req/min | CDO approval |
| Payment initiation > 500 req/min | CDO approval |
| Custom burst configuration | CDO approval |
| Rate limit exemption for any endpoint | CDO approval |
| Temporary limit increase > 30 days | CDO approval |
11.3 Emergency Limit Increase¶
For production incidents where rate limits are blocking legitimate traffic:
| Step | Action | Timeline |
|---|---|---|
| 1 | On-call engineer verifies the traffic is legitimate (not an attack) | Immediate |
| 2 | Temporary 2x limit increase applied via KrakenD config update | Within 15 minutes |
| 3 | Incident documented and CDO notified | Within 1 hour |
| 4 | Permanent limit adjustment or merchant migration to higher tier | Within 3 business days |
12. Implementation Checklist¶
- KrakenD rate limiting plugin configured with Redis backend
- Redis cluster provisioned (dedicated, not shared with application cache)
- Per-merchant global limits configured (default 1,000 req/min)
- Per-endpoint limits configured for all endpoint categories
- Per-IP limits configured for sandbox and unauthenticated endpoints
- Burst allowance configured (2x sustained, 10-second window)
- Rate limit response headers injected on all responses (not just 429)
- 429 response body follows unified error schema
- Retry-After header included on all 429 responses
- Webhook delivery retries exempted from rate limits
- Internal service-to-service traffic exempted (identified by mTLS)
- Grafana dashboard created with all required panels
- OpenTelemetry metrics emitting for all rate limit events
- Alerts configured for sustained 429s, channel saturation, Redis health
- DevEx portal updated with rate limit documentation
- Approaching-limit notifications configured (80%, 90%, 100%)
- Sandbox rate limit test endpoint deployed
- Upstream channel limits documented in Vendor Integration Register
- Per-channel rate limits configured in KrakenD
- Fail-open behaviour tested and verified
- Load test executed to validate limits under production-like conditions
Cross-References¶
| Document | Relevance |
|---|---|
| API-STANDARDS.md | Section 13: Rate Limiting headers and response format |
| SECURITY-ARCHITECTURE.md | Section 7: API Security Controls — rate limiting requirements |
| INFRASTRUCTURE-STANDARDS.md | Section 7: KrakenD Gateway — rate limiting configuration |
| VENDOR-INTEGRATION-REGISTER.md | Upstream channel rate limits per operator |
| INCIDENT-RESPONSE-PLAYBOOK.md | Escalation procedures for rate-limiting incidents |