Rate Limiting Policy
Organisation: Simpaisa Holdings
Document Owner: Daniel O'Reilly, Chief Digital Officer
Classification: Internal
Version: 1.0
Date: 3 April 2026
Status: Active
Table of Contents
- Purpose
- Rate Limiting Architecture
- Rate Limit Tiers
- Payment Channel Rate Limits
- Response Headers
- 429 Response Format
- Rate Limit Storage
- Exemptions
- Monitoring & Alerting
- Merchant Communication
- Escalation & Limit Increases
- Implementation Checklist
1. Purpose
Simpaisa processes 270M+ transactions worth $1B+ annually across Pakistan, Bangladesh, Nepal, Iraq, and Egypt. Rate limiting is a critical control that:
- Protects payment infrastructure — prevents resource exhaustion that could cause platform-wide outages affecting all merchants and all markets simultaneously.
- Prevents abuse — defends against brute-force attacks on OTP endpoints, credential stuffing, transaction enumeration, and denial-of-service attempts.
- Ensures fair usage — prevents any single merchant from consuming disproportionate resources, ensuring equitable service quality across all integrations.
- Preserves upstream channel availability — respects the rate limits imposed by telco operators (Easypaisa, JazzCash, bKash, etc.) and banking partners, preventing Simpaisa from being throttled or blocked by upstream providers.
- Supports regulatory compliance — provides audit evidence of security controls required by SBP, Bangladesh Bank, Nepal Rastra Bank, and CBI.
Current state: Rate limiting is rated 2/10 (Critical) in the Security Architecture assessment. No documented rate limits exist. This policy defines the target state.
2. Rate Limiting Architecture
2.1 Enforcement Point
Rate limiting is enforced at the KrakenD API Gateway layer, not within individual services. This is a deliberate architectural decision:
Client Request
       │
       ▼
┌──────────────┐
│  Cloudflare  │ ← L7 DDoS protection, IP reputation, bot management
│    (Edge)    │   Global per-IP limits enforced here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   KrakenD    │ ← Per-merchant, per-endpoint, per-IP rate limiting
│  (Gateway)   │   All rate limit headers injected here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Go Service  │ ← No rate limiting logic in services
│  (Backend)   │   Services trust that gateway has enforced limits
└──────────────┘
2.2 Why Gateway-Level, Not Per-Service
| Reason | Detail |
| --- | --- |
| Single enforcement point | Avoids inconsistent limits across services; one configuration governs all |
| Centralised visibility | All rate limit events flow through one component for monitoring and alerting |
| No service code changes | Adding or modifying limits requires gateway configuration changes only |
| Consistent headers | Rate limit response headers are injected uniformly regardless of backend service |
| Upstream protection | Gateway can enforce limits before requests reach services, protecting service resources |
2.3 Defence in Depth
Cloudflare provides the first layer of defence at the edge:
| Cloudflare Control | Purpose |
| --- | --- |
| IP reputation scoring | Block known-bad IPs before they reach the gateway |
| Bot management | Challenge automated traffic with Turnstile |
| L7 DDoS mitigation | Absorb volumetric attacks at the edge |
| Geographic restrictions | Block traffic from non-operational countries if required |
| WAF rules | Block malicious payloads before rate limiting applies |
KrakenD provides the second layer with application-aware limits based on merchant identity, endpoint, and request context.
3. Rate Limit Tiers
3.1 Per-Merchant Global Rate Limit
| Parameter | Value |
| --- | --- |
| Default limit | 1,000 requests per minute |
| Scope | All endpoints combined, per merchantId |
| Configurable | Yes, per merchant contract |
| Minimum | 100 requests per minute |
| Maximum | 10,000 requests per minute (requires CDO approval) |
| Identification | merchantId extracted from authenticated request (API key or signed header) |
Merchants exceeding their global limit receive 429 Too Many Requests regardless of which endpoint they are calling.
3.2 Per-Merchant Per-Endpoint Rate Limits
| Endpoint Category | Rate Limit | Scope | Rationale |
| --- | --- | --- | --- |
| Payment initiation (Pay-In, Pay-Out, Remittance create) | 100 req/min | Per merchantId | Payment creation is expensive: it hits upstream channels and creates state |
| Inquiry / status (transaction status, balance check, rate inquiry) | 500 req/min | Per merchantId | Read-heavy; merchants poll for status; must be generous |
| OTP request | 3 req/5 min | Per mobile number per merchantId | Prevents OTP bombing; protects SMS costs |
| OTP verification | 5 attempts/OTP | Per OTP session | Prevents brute-force; locks OTP after 5 failures |
| Webhook management (register, update, delete) | 10 req/hour | Per merchantId | Low-frequency operations; prevents configuration churn |
| Merchant onboarding (KYC submission, document upload) | 20 req/hour | Per merchantId | Protects file upload resources |
| Refund initiation | 50 req/min | Per merchantId | Lower than payment initiation; refunds are less frequent |
| Bulk/batch operations | 10 req/min | Per merchantId | Each request contains multiple items; protects backend processing |
| FX rate inquiry | 200 req/min | Per merchantId | High-frequency for remittance corridors; cached at gateway |
3.3 Per-IP Rate Limits
| Scenario | Rate Limit | Scope | Action on Exceed |
| --- | --- | --- | --- |
| Sandbox / unauthenticated | 200 req/min | Per IP address | 429 + Retry-After header |
| Authentication failures | 5 attempts/15 min | Per IP address | Temporary IP block (15 minutes) |
| Global per-IP (authenticated) | 1,000 req/min | Per IP address | 429 + Cloudflare challenge |
| Unauthenticated endpoints (health, docs) | 60 req/min | Per IP address | 429 |
3.4 Burst Allowance
| Parameter | Value |
| --- | --- |
| Burst multiplier | 2x sustained limit |
| Burst window | 10 seconds |
| Behaviour | Permits short traffic spikes (e.g., batch submission) without triggering 429 |
| Recovery | After burst window expires, rate returns to sustained limit; no penalty |
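Example: A merchant with a 100 req/min payment initiation limit has a pro-rata sustained rate of roughly 17 requests per 10 seconds. With the 2x burst multiplier, the merchant can send up to about 33 requests within any 10-second window, provided the total over the full minute does not exceed 100.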
3.5 Tiered Merchant Plans
| Tier | Global Limit | Payment Initiation | Inquiry/Status | Monthly Transaction Volume |
| --- | --- | --- | --- | --- |
| Standard | 1,000 req/min | 100 req/min | 500 req/min | Up to 1M transactions |
| Professional | 3,000 req/min | 300 req/min | 1,500 req/min | 1M–10M transactions |
| Enterprise | 5,000 req/min | 500 req/min | 3,000 req/min | 10M+ transactions |
| Custom | Per contract | Per contract | Per contract | By agreement |
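As an illustration of how the tier table could be applied, the sketch below maps a monthly transaction volume to a tier. The names `TierLimits` and `tier_for_monthly_volume` are hypothetical, and the boundary handling (exactly 1M stays Standard, exactly 10M becomes Enterprise) is an assumption, since the table does not say which tier owns the boundary values. Custom is omitted because its limits are set per contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    name: str
    global_per_min: int
    payment_initiation_per_min: int
    inquiry_per_min: int


# Values copied from the tier table above.
STANDARD = TierLimits("Standard", 1_000, 100, 500)
PROFESSIONAL = TierLimits("Professional", 3_000, 300, 1_500)
ENTERPRISE = TierLimits("Enterprise", 5_000, 500, 3_000)


def tier_for_monthly_volume(transactions: int) -> TierLimits:
    """Pick a tier from monthly transaction volume.

    Assumption: "Up to 1M" includes exactly 1M, and "10M+" includes exactly 10M.
    """
    if transactions < 0:
        raise ValueError("transactions must be non-negative")
    if transactions <= 1_000_000:
        return STANDARD
    if transactions < 10_000_000:
        return PROFESSIONAL
    return ENTERPRISE
```

For example, `tier_for_monthly_volume(2_000_000)` returns the Professional limits.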
4. Payment Channel Rate Limits
Simpaisa must respect upstream channel limits imposed by telco operators and banking partners. If an upstream channel's rate limit is lower than Simpaisa's default, the more restrictive limit applies.
4.1 Known Channel Limits
| Channel | Market | Upstream Limit | Simpaisa Limit | Notes |
| --- | --- | --- | --- | --- |
| Easypaisa | PK | Document per integration | Per channel agreement | Telenor Microfinance Bank API limits |
| JazzCash | PK | Document per integration | Per channel agreement | Jazz/Mobilink API limits |
| UBL Omni | PK | Document per integration | Per channel agreement | Banking API limits |
| HBL | PK | Document per integration | Per channel agreement | Banking API limits |
| bKash | BD | Document per integration | Per channel agreement | bKash merchant API limits |
| Nagad | BD | Document per integration | Per channel agreement | Nagad API limits |
| eSewa | NP | Document per integration | Per channel agreement | eSewa API limits |
| Khalti | NP | Document per integration | Per channel agreement | Khalti API limits |
4.2 Channel Rate Limit Management
| Requirement | Implementation |
| --- | --- |
| Documentation | Every channel integration must document the upstream provider's rate limits in the Vendor Integration Register |
| Configuration | KrakenD rate limit configuration must include per-channel limits derived from upstream constraints |
| Circuit breaking | If a channel returns throttling responses (HTTP 429 or equivalent), KrakenD must back off and queue requests |
| Monitoring | Track per-channel request volume vs. upstream limit; alert at 80% utilisation |
| Fair allocation | When multiple merchants share a channel, implement fair-share allocation to prevent one merchant exhausting the channel |
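The "more restrictive limit applies" rule and the 80% utilisation alert can be sketched as two small functions. This is a minimal illustration, not the gateway implementation: the function names are hypothetical, limits are assumed to be expressed in requests per minute, and an undocumented upstream limit is represented as `None`.

```python
from typing import Optional


def effective_limit(simpaisa_limit: int, upstream_limit: Optional[int]) -> int:
    """Return the more restrictive of the Simpaisa and upstream limits."""
    if upstream_limit is None:  # upstream limit not yet documented
        return simpaisa_limit
    return min(simpaisa_limit, upstream_limit)


def channel_alert(requests_per_min: int, upstream_limit: int) -> Optional[str]:
    """Return an alert severity for a channel, or None when below 80%."""
    if upstream_limit <= 0:
        raise ValueError("upstream_limit must be positive")
    utilisation = requests_per_min / upstream_limit
    if utilisation >= 1.0:
        return "high"      # channel at its upstream limit
    if utilisation >= 0.8:
        return "warning"   # channel near its upstream limit
    return None
```

With a Simpaisa default of 100 req/min and an upstream limit of 60 req/min, `effective_limit(100, 60)` returns 60, and 48 req/min on that channel is the point at which the warning fires.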
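5. Response Headers
All API responses MUST include rate limit headers. The X-RateLimit-* names used here are modelled on the IETF draft "RateLimit Header Fields for HTTP" but are not identical to it: the draft's fields carry no X- prefix and express the reset as a number of seconds rather than a Unix timestamp.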
5.1 Standard Headers
| Header | Description | Example |
| --- | --- | --- |
| X-RateLimit-Limit | Maximum number of requests permitted in the current window | X-RateLimit-Limit: 100 |
| X-RateLimit-Remaining | Number of requests remaining in the current window | X-RateLimit-Remaining: 73 |
| X-RateLimit-Reset | Unix epoch timestamp (seconds) when the current window resets | X-RateLimit-Reset: 1712160000 |
| Retry-After | Seconds until the client should retry (only on 429 responses) | Retry-After: 30 |
| Rule | Detail |
| --- | --- |
| Always present | Rate limit headers MUST be included on every response (2xx, 4xx, 5xx), not only on 429 |
| Most restrictive | When multiple limits apply (global + per-endpoint), headers reflect the most restrictive limit |
| Reset accuracy | X-RateLimit-Reset MUST reflect the actual window reset time, not an approximation |
| Retry-After on 429 | Every 429 response MUST include Retry-After with the number of seconds to wait |
| No negative values | X-RateLimit-Remaining MUST NOT be negative; minimum value is 0 |
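The header rules above can be sketched as a single function that takes the state of every applicable limit and emits one set of headers. This is illustrative only: `LimitState` and `rate_limit_headers` are hypothetical names, "most restrictive" is assumed to mean the limit with the fewest remaining requests, and Retry-After is assumed to be derived from the reset time.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class LimitState:
    limit: int        # maximum requests permitted in the window
    used: int         # requests counted so far in the window
    reset_epoch: int  # Unix time (seconds) when the window resets

    @property
    def remaining(self) -> int:
        # Clamp at zero so the header is never negative.
        return max(0, self.limit - self.used)


def rate_limit_headers(
    states: Iterable[LimitState], now_epoch: int, throttled: bool = False
) -> Dict[str, str]:
    """Build headers from the most restrictive of the applicable limits."""
    # Assumption: the tightest limit is the one with the fewest requests left;
    # ties are broken in favour of the smaller limit.
    tightest = min(states, key=lambda s: (s.remaining, s.limit))
    headers = {
        "X-RateLimit-Limit": str(tightest.limit),
        "X-RateLimit-Remaining": str(tightest.remaining),
        "X-RateLimit-Reset": str(tightest.reset_epoch),
    }
    if throttled:  # Retry-After is only sent on 429 responses
        headers["Retry-After"] = str(max(0, tightest.reset_epoch - now_epoch))
    return headers
```

Passing a global state of 200 used out of 1,000 and an endpoint state of 27 used out of 100 yields `X-RateLimit-Limit: 100` and `X-RateLimit-Remaining: 73`, because the endpoint limit is the tighter of the two.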
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1712160060
X-Request-Id: txn-abc-123-def-456
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712160060
Retry-After: 30
X-Request-Id: txn-abc-123-def-456
6. 429 Response Format
Rate-limited responses use the unified error schema defined in API-STANDARDS.md:
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded the rate limit for this endpoint. Please retry after the period indicated in the Retry-After header.",
    "details": [
      {
        "field": "endpoint",
        "issue": "Payment initiation limit of 100 requests per minute exceeded"
      }
    ]
  },
  "traceId": "abc-123-def-456",
  "timestamp": "2026-04-03T12:00:00Z"
}
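A body of this shape could be assembled as follows. This is a sketch that mirrors only the fields visible in the example above; the function name and parameters are hypothetical, and the authoritative schema remains the one in API-STANDARDS.md.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def rate_limit_error_body(
    code: str,
    message: str,
    issue: str,
    trace_id: str,
    now: Optional[datetime] = None,
) -> str:
    """Serialise a rate limit error using the fields shown in the example.

    `now` must be timezone-aware; it defaults to the current UTC time.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    body = {
        "error": {
            "code": code,
            "message": message,
            "details": [{"field": "endpoint", "issue": issue}],
        },
        "traceId": trace_id,
        # ISO 8601 in UTC with a trailing Z, matching the example timestamp.
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(body)
```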
6.1 Error Codes for Rate Limiting
| Error Code | Trigger | Message |
| --- | --- | --- |
| RATE_LIMIT_EXCEEDED | Any rate limit exceeded | "You have exceeded the rate limit for this endpoint" |
| RATE_LIMIT_GLOBAL | Merchant global limit exceeded | "You have exceeded your global rate limit across all endpoints" |
| RATE_LIMIT_BURST | Burst allowance exceeded | "Request rate too high; burst allowance exhausted" |
| RATE_LIMIT_OTP | OTP request limit exceeded | "Too many OTP requests for this mobile number" |
| IP_BLOCKED_TEMPORARY | IP temporarily blocked due to auth failures | "Your IP has been temporarily blocked due to repeated authentication failures" |
7. Rate Limit Storage
7.1 Storage Backend
| Parameter | Value |
| --- | --- |
| Backend | Redis (ElastiCache, existing infrastructure) |
| Algorithm | Sliding window log |
| Key format | rl:{scope}:{identifier}:{endpoint} |
| TTL | Window duration + 10 seconds buffer |
| Cluster | Dedicated Redis cluster for rate limiting (not shared with application cache) |
7.2 Why Sliding Window
| Algorithm | Pros | Cons | Decision |
| --- | --- | --- | --- |
| Fixed window | Simple, low memory | Burst at window boundary (2x actual rate) | Rejected |
| Sliding window log | Accurate, no boundary burst | Higher memory per key | Selected |
| Token bucket | Smooth, configurable burst | Complex state management | Considered for future |
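To make the selected algorithm concrete, here is a minimal in-memory sliding window log. It is a stand-in for illustration only: the policy stores counters in Redis, whereas this sketch keeps timestamps in a local dictionary, and the class name and the choice not to record denied requests are assumptions.

```python
from collections import deque
from typing import Deque, Dict


class SlidingWindowLog:
    """Keeps one timestamp per accepted request and counts those in the window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._log: Dict[str, Deque[float]] = {}

    def allow(self, key: str, now: float) -> bool:
        log = self._log.setdefault(key, deque())
        cutoff = now - self.window
        # Drop timestamps that have aged out of the window.
        while log and log[0] <= cutoff:
            log.popleft()
        if len(log) >= self.limit:
            return False  # denied requests are not recorded (assumption)
        log.append(now)
        return True
```

Because every request is checked against the exact preceding window, a client cannot double its rate by straddling a window boundary, which is the weakness noted for the fixed window approach. The cost is one stored timestamp per accepted request, matching the "higher memory per key" trade-off in the table.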
7.3 Key Schema
# Per-merchant global
rl:merchant:{merchantId}:global
# Per-merchant per-endpoint
rl:merchant:{merchantId}:endpoint:{endpointCategory}
# Per-IP
rl:ip:{ipAddress}:global
# Per-mobile (OTP)
rl:otp:{merchantId}:{mobileNumber}
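The key schema and the TTL rule from section 7.1 can be expressed as small helpers. The function names are hypothetical; the key strings and the 10-second buffer come directly from this policy.

```python
def merchant_global_key(merchant_id: str) -> str:
    return f"rl:merchant:{merchant_id}:global"


def merchant_endpoint_key(merchant_id: str, endpoint_category: str) -> str:
    return f"rl:merchant:{merchant_id}:endpoint:{endpoint_category}"


def ip_key(ip_address: str) -> str:
    return f"rl:ip:{ip_address}:global"


def otp_key(merchant_id: str, mobile_number: str) -> str:
    return f"rl:otp:{merchant_id}:{mobile_number}"


def key_ttl_seconds(window_seconds: int) -> int:
    """TTL is the window duration plus a 10-second buffer."""
    return window_seconds + 10
```

One point the schema leaves open: IPv6 addresses contain colons, which is also the key separator, so any code that splits keys on `:` would need to account for that.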
7.4 Failure Mode
| Scenario | Behaviour |
| --- | --- |
| Redis unavailable | Fail open: requests pass through without rate limiting; alert raised immediately |
| Redis latency > 50ms | Log warning; consider circuit breaker to fail open |
| Redis data loss (failover) | All counters reset; merchants get a fresh window; no 429s during recovery |
Rationale for fail-open: For a payment gateway, a false 429 (blocking a legitimate payment) is worse than temporarily allowing excess traffic. Redis unavailability is already a monitored incident and would trigger immediate investigation.
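The fail-open behaviour can be sketched as a wrapper around the limit check. Everything here is hypothetical scaffolding: `StoreUnavailable` stands in for whatever error the Redis client raises, and `alert` stands in for the real alerting hook.

```python
import logging
from typing import Callable

logger = logging.getLogger("ratelimit")


class StoreUnavailable(Exception):
    """Hypothetical error raised when the rate limit store cannot be reached."""


def check_fail_open(
    check: Callable[[], bool], alert: Callable[[str], None]
) -> bool:
    """Run the limit check; allow the request if the store is unavailable."""
    try:
        return check()
    except StoreUnavailable as exc:
        # Fail open: a false 429 on a legitimate payment is the worse outcome.
        logger.warning("rate limit store unavailable, failing open: %s", exc)
        alert(f"rate limit store unavailable: {exc}")
        return True
```

Only the store-unavailable error is caught; a normal "limit exceeded" result still returns `False` and leads to a 429.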
8. Exemptions
8.1 Exempt Traffic
| Traffic Type | Exemption | Rationale |
| --- | --- | --- |
| Webhook delivery retries | Exempt from rate limits | Simpaisa-initiated; retry logic must not be throttled by our own limits |
| Internal service-to-service | Exempt from merchant rate limits | Identified by mTLS certificates; governed by service mesh policies |
| Health check endpoints | Exempt from authenticated limits | /health, /ready must always respond for load balancer probes |
| Cloudflare Workers | Exempt from IP-based limits | Identified by Cloudflare service token header |
8.2 Non-Exempt Traffic
| Traffic Type | Rate Limits Apply | Notes |
| --- | --- | --- |
| Sandbox / test environment | Yes, with reduced limits | Sandbox uses per-IP limits (200 req/min) |
| Merchant SDK traffic | Yes, with standard merchant limits | SDK requests carry merchant credentials |
| Third-party integrator traffic | Yes, with standard merchant limits | Integrators operate under the merchant's allocation |
9. Monitoring & Alerting
9.1 Grafana Dashboard — Rate Limit Monitoring
The rate limiting dashboard MUST display:
| Panel | Description | Refresh |
| --- | --- | --- |
| 429s per merchant | Time series of rate-limited requests per merchantId | 15 seconds |
| 429s per endpoint | Time series of rate-limited requests per endpoint category | 15 seconds |
| Top 10 throttled merchants | Table of merchants with highest 429 counts (rolling 1 hour) | 1 minute |
| Rate limit utilisation | Per-merchant utilisation as percentage of their limit | 1 minute |
| Channel utilisation | Per-channel request volume vs. upstream limit | 30 seconds |
| Redis health | Rate limit Redis cluster latency, memory, connection count | 15 seconds |
9.2 Alerts
| Alert | Condition | Severity | Action |
| --- | --- | --- | --- |
| Sustained 429s | Merchant receives > 50 rate-limited responses in 5 minutes | Warning | Notify merchant via DevEx portal; investigate if legitimate traffic growth |
| Burst 429s | Merchant receives > 200 rate-limited responses in 1 minute | High | Investigate for potential abuse or misconfigured integration |
| IP block triggered | Any IP blocked due to authentication failures | Medium | Log for security review; check for credential stuffing |
| Channel near limit | Upstream channel at > 80% of its rate limit | Warning | Review traffic distribution; consider queuing |
| Channel at limit | Upstream channel at 100% of its rate limit | High | Activate queuing; notify affected merchants |
| Redis rate limit cluster down | Redis unavailable or latency > 100ms | Critical | Rate limiting fail-open; investigate immediately |
| Anomalous traffic pattern | Single merchant's request volume increases > 5x in 1 hour | Warning | Investigate; may indicate integration error or abuse |
9.3 Metrics (OpenTelemetry)
All rate limiting events emit OpenTelemetry metrics:
| Metric | Type | Labels |
| --- | --- | --- |
| simpaisa.ratelimit.requests.total | Counter | merchant_id, endpoint, result (allowed/denied) |
| simpaisa.ratelimit.latency.ms | Histogram | operation (check/increment) |
| simpaisa.ratelimit.utilisation.ratio | Gauge | merchant_id, limit_type |
| simpaisa.ratelimit.redis.errors | Counter | error_type |
10. Merchant Communication
10.1 Developer Experience Portal
Rate limits MUST be documented in the merchant DevEx portal with:
| Content | Detail |
| --- | --- |
| Rate limit overview | Explanation of all tiers and how limits are applied |
| Current limits | Per-merchant view showing their contracted limits |
| Usage dashboard | Real-time view of current utilisation against limits |
| Best practices | Guidance on implementing exponential backoff, request batching, caching status responses |
| Code examples | SDK examples showing proper 429 handling in Java, Python, Node.js, PHP, C# |
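As an indication of what a portal code example for 429 handling might look like, the sketch below honours Retry-After when present and otherwise falls back to exponential backoff with jitter. It is not taken from a Simpaisa SDK: `send` and `sleep` are injected callables so the logic can be exercised without a network, the base delay and cap are arbitrary, and header lookup assumes the exact name `Retry-After`.

```python
import random
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def retry_delay(
    headers: Mapping[str, str], attempt: int, base: float = 1.0, cap: float = 60.0
) -> float:
    """Prefer the server's Retry-After; otherwise back off exponentially."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)
    # Full jitter: a random delay up to the capped exponential bound.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def send_with_retry(
    send: Callable[[], Response],
    sleep: Callable[[float], None],
    max_attempts: int = 5,
) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        sleep(retry_delay(headers, attempt))
    raise ValueError("max_attempts must be at least 1")
```

In real use `sleep` would be `time.sleep` and `send` would wrap the HTTP call. Jitter matters when many clients are throttled at once, since it stops them all retrying at the same instant.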
10.2 Approaching Limits Notification
| Threshold | Action |
| --- | --- |
| 80% utilisation | DevEx portal displays yellow warning banner |
| 90% utilisation | Email notification to merchant's technical contact |
| 100% utilisation (first 429) | Webhook notification to merchant (if configured) |
| Sustained 429s (> 5 minutes) | Email to merchant's technical contact + account manager notified |
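The thresholds above could be evaluated along these lines. The action names are hypothetical labels, and the sketch assumes the actions are cumulative (a merchant at 100% also still sees the portal banner), which the table does not state explicitly.

```python
from typing import List


def notification_actions(
    utilisation: float, throttled_minutes: float = 0.0
) -> List[str]:
    """Map utilisation (1.0 = 100%) and 429 duration to notification actions."""
    actions: List[str] = []
    if utilisation >= 0.80:
        actions.append("portal_warning_banner")
    if utilisation >= 0.90:
        actions.append("email_technical_contact")
    if utilisation >= 1.00:
        actions.append("webhook_notification")  # only if the merchant configured one
    if throttled_minutes > 5:
        actions.append("email_technical_contact_and_account_manager")
    return actions
```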
10.3 Sandbox Rate Limit Testing
The sandbox environment provides a /v3/test/rate-limit endpoint that artificially returns 429 on every other request. Merchants can use it to:
- Verify their 429 handling logic
- Test Retry-After header parsing
- Validate exponential backoff implementation
11. Escalation & Limit Increases
11.1 Standard Limit Increase
| Step | Action | Owner |
| --- | --- | --- |
| 1 | Merchant submits limit increase request via DevEx portal or support ticket | Merchant |
| 2 | Account manager reviews request against merchant's transaction volume and contract | Account Manager |
| 3 | Technical review: assess infrastructure capacity for requested limits | Platform Engineering |
| 4 | Approval and configuration update in KrakenD | Platform Engineering |
| 5 | Merchant notified of new limits; DevEx portal updated | Account Manager |
SLA: Standard limit increase requests processed within 3 business days.
11.2 CDO Approval Required
| Scenario | Approval Required |
| --- | --- |
| Global limit > 5,000 req/min | CDO approval |
| Payment initiation > 500 req/min | CDO approval |
| Custom burst configuration | CDO approval |
| Rate limit exemption for any endpoint | CDO approval |
| Temporary limit increase > 30 days | CDO approval |
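The approval triggers above lend themselves to a simple pre-check before a request reaches Platform Engineering. This is a sketch with hypothetical names (`LimitChangeRequest`, `cdo_approval_reasons`); the thresholds are the ones listed in the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LimitChangeRequest:
    global_per_min: Optional[int] = None
    payment_initiation_per_min: Optional[int] = None
    custom_burst: bool = False
    endpoint_exemption: bool = False
    temporary_days: Optional[int] = None  # None means a permanent change


def cdo_approval_reasons(req: LimitChangeRequest) -> List[str]:
    """Return every reason the request needs CDO approval (empty if none)."""
    reasons: List[str] = []
    if req.global_per_min is not None and req.global_per_min > 5_000:
        reasons.append("global limit > 5,000 req/min")
    if (
        req.payment_initiation_per_min is not None
        and req.payment_initiation_per_min > 500
    ):
        reasons.append("payment initiation > 500 req/min")
    if req.custom_burst:
        reasons.append("custom burst configuration")
    if req.endpoint_exemption:
        reasons.append("rate limit exemption")
    if req.temporary_days is not None and req.temporary_days > 30:
        reasons.append("temporary increase > 30 days")
    return reasons
```

An empty list means the request can follow the standard process in 11.1 without CDO sign-off.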
11.3 Emergency Limit Increase
For production incidents where rate limits are blocking legitimate traffic:
| Step | Action | Timeline |
| --- | --- | --- |
| 1 | On-call engineer verifies the traffic is legitimate (not an attack) | Immediate |
| 2 | Temporary 2x limit increase applied via KrakenD config update | Within 15 minutes |
| 3 | Incident documented and CDO notified | Within 1 hour |
| 4 | Permanent limit adjustment or merchant migration to higher tier | Within 3 business days |
12. Implementation Checklist
Cross-References