Rate Limiting Policy

Organisation: Simpaisa Holdings
Document Owner: Daniel O'Reilly, Chief Digital Officer
Classification: Internal
Version: 1.0
Date: 3 April 2026
Status: Active


Table of Contents

  1. Purpose
  2. Rate Limiting Architecture
  3. Rate Limit Tiers
  4. Payment Channel Rate Limits
  5. Response Headers
  6. 429 Response Format
  7. Rate Limit Storage
  8. Exemptions
  9. Monitoring & Alerting
  10. Merchant Communication
  11. Escalation & Limit Increases
  12. Implementation Checklist

1. Purpose

Simpaisa processes 270M+ transactions worth $1B+ annually across Pakistan, Bangladesh, Nepal, Iraq, and Egypt. Rate limiting is a critical control that:

  • Protects payment infrastructure — prevents resource exhaustion that could cause platform-wide outages affecting all merchants and all markets simultaneously.
  • Prevents abuse — defends against brute-force attacks on OTP endpoints, credential stuffing, transaction enumeration, and denial-of-service attempts.
  • Ensures fair usage — prevents any single merchant from consuming disproportionate resources, ensuring equitable service quality across all integrations.
  • Preserves upstream channel availability — respects the rate limits imposed by telco operators (Easypaisa, JazzCash, bKash, etc.) and banking partners, preventing Simpaisa from being throttled or blocked by upstream providers.
  • Supports regulatory compliance — provides audit evidence of security controls required by SBP, Bangladesh Bank, Nepal Rastra Bank, and CBI.

Current state: Rate limiting is rated 2/10 (Critical) in the Security Architecture assessment. No documented rate limits exist. This policy defines the target state.


2. Rate Limiting Architecture

2.1 Enforcement Point

Rate limiting is enforced at the KrakenD API Gateway layer, not within individual services. This is a deliberate architectural decision:

Client Request
       │
       ▼
┌──────────────┐
│  Cloudflare  │  ← L7 DDoS protection, IP reputation, bot management
│  (Edge)      │     Global per-IP limits enforced here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   KrakenD    │  ← Per-merchant, per-endpoint, per-IP rate limiting
│  (Gateway)   │     All rate limit headers injected here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Go Service  │  ← No rate limiting logic in services
│  (Backend)   │     Services trust that gateway has enforced limits
└──────────────┘

2.2 Why Gateway-Level, Not Per-Service

| Reason | Detail |
| --- | --- |
| Single enforcement point | Avoids inconsistent limits across services; one configuration governs all |
| Centralised visibility | All rate limit events flow through one component for monitoring and alerting |
| No service code changes | Adding or modifying limits requires gateway configuration changes only |
| Consistent headers | Rate limit response headers are injected uniformly regardless of backend service |
| Upstream protection | Gateway can enforce limits before requests reach services, protecting service resources |

2.3 Defence in Depth

Cloudflare provides the first layer of defence at the edge:

| Cloudflare Control | Purpose |
| --- | --- |
| IP reputation scoring | Block known-bad IPs before they reach the gateway |
| Bot management | Challenge automated traffic with Turnstile |
| L7 DDoS mitigation | Absorb volumetric attacks at the edge |
| Geographic restrictions | Block traffic from non-operational countries if required |
| WAF rules | Block malicious payloads before rate limiting applies |

KrakenD provides the second layer with application-aware limits based on merchant identity, endpoint, and request context.


3. Rate Limit Tiers

3.1 Per-Merchant Global Rate Limit

| Parameter | Value |
| --- | --- |
| Default limit | 1,000 requests per minute |
| Scope | All endpoints combined, per merchantId |
| Configurable | Yes, per merchant contract |
| Minimum | 100 requests per minute |
| Maximum | 10,000 requests per minute (limits above 5,000 require CDO approval; see 11.2) |
| Identification | merchantId extracted from authenticated request (API key or signed header) |

Merchants exceeding their global limit receive 429 Too Many Requests regardless of which endpoint they are calling.

3.2 Per-Merchant Per-Endpoint Rate Limits

| Endpoint Category | Rate Limit | Scope | Rationale |
| --- | --- | --- | --- |
| Payment initiation (Pay-In, Pay-Out, Remittance create) | 100 req/min | Per merchantId | Payment creation is expensive: it hits upstream channels and creates state |
| Inquiry / status (transaction status, balance check, rate inquiry) | 500 req/min | Per merchantId | Read-heavy; merchants poll for status; must be generous |
| OTP request | 3 req/5 min | Per mobile number per merchantId | Prevents OTP bombing; protects SMS costs |
| OTP verification | 5 attempts/OTP | Per OTP session | Prevents brute-force; locks OTP after 5 failures |
| Webhook management (register, update, delete) | 10 req/hour | Per merchantId | Low-frequency operations; prevents configuration churn |
| Merchant onboarding (KYC submission, document upload) | 20 req/hour | Per merchantId | Protects file upload resources |
| Refund initiation | 50 req/min | Per merchantId | Lower than payment initiation; refunds are less frequent |
| Bulk/batch operations | 10 req/min | Per merchantId | Each request contains multiple items; protects backend processing |
| FX rate inquiry | 200 req/min | Per merchantId | High-frequency for remittance corridors; cached at gateway |
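
The OTP verification limit above (5 attempts per OTP session, then lock) is an attempt counter rather than a time window. The following is a minimal in-memory sketch of that behaviour, not the gateway's implementation; the `OtpSession` class, its method, and the result strings are hypothetical names chosen for illustration.

```python
import hmac

MAX_FAILED_ATTEMPTS = 5  # OTP verification limit from the table above


class OtpSession:
    """Tracks failed verification attempts for one issued OTP (hypothetical helper)."""

    def __init__(self, expected_code: str) -> None:
        self._expected_code = expected_code
        self.failed_attempts = 0

    def verify(self, candidate: str) -> str:
        # Once the session is locked, even the correct code is rejected.
        if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
            return "locked"
        # Constant-time comparison avoids leaking how much of the code matched.
        if hmac.compare_digest(candidate.encode(), self._expected_code.encode()):
            return "ok"
        self.failed_attempts += 1
        # The failure that reaches the limit locks the session immediately.
        if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
            return "locked"
        return "invalid"
```

In this sketch the fifth wrong guess returns "locked", and a new OTP would have to be requested (subject to the OTP request limit) before verification can succeed.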

3.3 Per-IP Rate Limits

| Scenario | Rate Limit | Scope | Action on Exceed |
| --- | --- | --- | --- |
| Sandbox / unauthenticated | 200 req/min | Per IP address | 429 + Retry-After header |
| Authentication failures | 5 attempts/15 min | Per IP address | Temporary IP block (15 minutes) |
| Global per-IP (authenticated) | 1,000 req/min | Per IP address | 429 + Cloudflare challenge |
| Unauthenticated endpoints (health, docs) | 60 req/min | Per IP address | 429 |

3.4 Burst Allowance

| Parameter | Value |
| --- | --- |
| Burst multiplier | 2x sustained limit |
| Burst window | 10 seconds |
| Behaviour | Permits short traffic spikes (e.g., batch submission) without triggering 429 |
| Recovery | After burst window expires, rate returns to sustained limit; no penalty |

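Example: A merchant with a 100 req/min payment initiation limit has a sustained rate of roughly 17 requests per 10 seconds. The 2x burst multiplier lets them send at up to twice that rate (about 33 requests) within any 10-second window, provided their total over the full minute does not exceed 100.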

3.5 Tiered Merchant Plans

| Tier | Global Limit | Payment Initiation | Inquiry/Status | Monthly Transaction Volume |
| --- | --- | --- | --- | --- |
| Standard | 1,000 req/min | 100 req/min | 500 req/min | Up to 1M transactions |
| Professional | 3,000 req/min | 300 req/min | 1,500 req/min | 1M–10M transactions |
| Enterprise | 5,000 req/min | 500 req/min | 3,000 req/min | 10M+ transactions |
| Custom | Per contract | Per contract | Per contract | By agreement |

4. Payment Channel Rate Limits

Simpaisa must respect upstream channel limits imposed by telco operators and banking partners. If an upstream channel's rate limit is lower than Simpaisa's default, the more restrictive limit applies.

4.1 Known Channel Limits

| Channel | Market | Upstream Limit | Simpaisa Limit | Notes |
| --- | --- | --- | --- | --- |
| Easypaisa | PK | Document per integration | Per channel agreement | Telenor Microfinance Bank API limits |
| JazzCash | PK | Document per integration | Per channel agreement | Jazz/Mobilink API limits |
| UBL Omni | PK | Document per integration | Per channel agreement | Banking API limits |
| HBL | PK | Document per integration | Per channel agreement | Banking API limits |
| bKash | BD | Document per integration | Per channel agreement | bKash merchant API limits |
| Nagad | BD | Document per integration | Per channel agreement | Nagad API limits |
| eSewa | NP | Document per integration | Per channel agreement | eSewa API limits |
| Khalti | NP | Document per integration | Per channel agreement | Khalti API limits |

4.2 Channel Rate Limit Management

| Requirement | Implementation |
| --- | --- |
| Documentation | Every channel integration must document the upstream provider's rate limits in the Vendor Integration Register |
| Configuration | KrakenD rate limit configuration must include per-channel limits derived from upstream constraints |
| Circuit breaking | If a channel returns throttling responses (HTTP 429 or equivalent), KrakenD must back off and queue requests |
| Monitoring | Track per-channel request volume vs. upstream limit; alert at 80% utilisation |
| Fair allocation | When multiple merchants share a channel, implement fair-share allocation to prevent one merchant exhausting the channel |
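
The policy does not prescribe a fair-share algorithm. One common option is max-min fairness: merchants asking for less than an equal share get what they ask for, and the leftover capacity is split among the rest. The sketch below illustrates that idea only; the `fair_share` function and its inputs (per-merchant demand in req/min against a channel's upstream limit) are assumptions, not part of the KrakenD configuration.

```python
from typing import Dict


def fair_share(capacity: float, demands: Dict[str, float]) -> Dict[str, float]:
    """Max-min fair split of a channel's capacity across merchant demands."""
    allocation: Dict[str, float] = {}
    remaining = capacity
    # Serve the smallest demands first so unused share flows to larger ones.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        equal_share = remaining / len(pending)
        merchant, demand = pending[0]
        if demand <= equal_share:
            # This merchant needs less than an equal share: grant it in full.
            allocation[merchant] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Everyone left wants more than an equal share: split evenly.
            for name, _ in pending:
                allocation[name] = equal_share
            break
    return allocation
```

For example, with a channel limit of 100 req/min and demands of 10, 50, and 80, the first merchant receives 10 and the other two receive 45 each, so no single merchant can exhaust the channel.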

5. Response Headers

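All API responses MUST include rate limit headers. The header set is modelled on the IETF draft "RateLimit header fields for HTTP", but uses the widely deployed X-RateLimit-* names and an epoch-based reset value rather than the draft's unprefixed field names.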

5.1 Standard Headers

| Header | Description | Example |
| --- | --- | --- |
| X-RateLimit-Limit | Maximum number of requests permitted in the current window | X-RateLimit-Limit: 100 |
| X-RateLimit-Remaining | Number of requests remaining in the current window | X-RateLimit-Remaining: 73 |
| X-RateLimit-Reset | Unix epoch timestamp (seconds) when the current window resets | X-RateLimit-Reset: 1712160000 |
| Retry-After | Seconds until the client should retry (only on 429 responses) | Retry-After: 30 |

5.2 Header Rules

| Rule | Detail |
| --- | --- |
| Always present | Rate limit headers MUST be included on every response (2xx, 4xx, 5xx), not only on 429 |
| Most restrictive | When multiple limits apply (global + per-endpoint), headers reflect the most restrictive limit |
| Reset accuracy | X-RateLimit-Reset MUST reflect the actual window reset time, not an approximation |
| Retry-After on 429 | Every 429 response MUST include Retry-After with the number of seconds to wait |
| No negative values | X-RateLimit-Remaining MUST NOT be negative; minimum value is 0 |
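
The "most restrictive" and "no negative values" rules can be sketched as a small selection function. This is an illustrative sketch under one assumption: "most restrictive" is interpreted as the limit with the fewest requests remaining. The `LimitState` type and `rate_limit_headers` function are hypothetical names, not gateway APIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LimitState:
    limit: int        # requests permitted in the window
    used: int         # requests counted so far in the window
    reset_epoch: int  # Unix time (seconds) when the window resets


def rate_limit_headers(states: List[LimitState]) -> Dict[str, str]:
    """Build headers from whichever applicable limit has the least headroom."""

    def remaining(state: LimitState) -> int:
        # Clamp at zero so X-RateLimit-Remaining is never negative.
        return max(state.limit - state.used, 0)

    tightest = min(states, key=remaining)
    return {
        "X-RateLimit-Limit": str(tightest.limit),
        "X-RateLimit-Remaining": str(remaining(tightest)),
        "X-RateLimit-Reset": str(tightest.reset_epoch),
    }
```

With a global limit of 1,000 (200 used) and an endpoint limit of 100 (27 used), the headers report the endpoint limit: Limit 100, Remaining 73.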

5.3 Example Response Headers (Normal Request)

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1712160060
X-Request-Id: txn-abc-123-def-456

5.4 Example Response Headers (Rate Limited)

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712160060
Retry-After: 30
X-Request-Id: txn-abc-123-def-456

6. 429 Response Format

Rate-limited responses use the unified error schema defined in API-STANDARDS.md:

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded the rate limit for this endpoint. Please retry after the period indicated in the Retry-After header.",
    "details": [
      {
        "field": "endpoint",
        "issue": "Payment initiation limit of 100 requests per minute exceeded"
      }
    ]
  },
  "traceId": "abc-123-def-456",
  "timestamp": "2026-04-03T12:00:00Z"
}
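
As an illustration of how the body above and the Retry-After header fit together, the sketch below assembles both for a rate-limited request. It assumes Retry-After is derived from the window reset time with a floor of one second, which the policy does not state explicitly; the `build_429` function and its parameters are hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Tuple


def build_429(issue: str, trace_id: str, reset_epoch: int, now_epoch: int) -> Tuple[Dict[str, str], str]:
    """Return (headers, JSON body) for a rate-limited response."""
    # Assumption: wait until the window resets, and never advertise 0 seconds.
    retry_after = max(reset_epoch - now_epoch, 1)
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(retry_after),
    }
    body = {
        "error": {
            "code": "RATE_LIMIT_EXCEEDED",
            "message": (
                "You have exceeded the rate limit for this endpoint. Please retry "
                "after the period indicated in the Retry-After header."
            ),
            "details": [{"field": "endpoint", "issue": issue}],
        },
        "traceId": trace_id,
        # ISO 8601 UTC timestamp, matching the format in the example body.
        "timestamp": datetime.fromtimestamp(now_epoch, tz=timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        ),
    }
    return headers, json.dumps(body)
```

Calling it 30 seconds before the window resets yields `Retry-After: 30`, consistent with the example headers in 5.4.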

6.1 Error Codes for Rate Limiting

| Error Code | Trigger | Message |
| --- | --- | --- |
| RATE_LIMIT_EXCEEDED | Any rate limit exceeded | "You have exceeded the rate limit for this endpoint" |
| RATE_LIMIT_GLOBAL | Merchant global limit exceeded | "You have exceeded your global rate limit across all endpoints" |
| RATE_LIMIT_BURST | Burst allowance exceeded | "Request rate too high; burst allowance exhausted" |
| RATE_LIMIT_OTP | OTP request limit exceeded | "Too many OTP requests for this mobile number" |
| IP_BLOCKED_TEMPORARY | IP temporarily blocked due to auth failures | "Your IP has been temporarily blocked due to repeated authentication failures" |

7. Rate Limit Storage

7.1 Storage Backend

| Parameter | Value |
| --- | --- |
| Backend | Redis (ElastiCache; existing infrastructure) |
| Algorithm | Sliding window log |
| Key format | rl:{scope}:{identifier}:{endpoint} |
| TTL | Window duration + 10 seconds buffer |
| Cluster | Dedicated Redis cluster for rate limiting (not shared with application cache) |

7.2 Why Sliding Window

| Algorithm | Pros | Cons | Decision |
| --- | --- | --- | --- |
| Fixed window | Simple, low memory | Burst at window boundary (2x actual rate) | Rejected |
| Sliding window log | Accurate, no boundary burst | Higher memory per key | Selected |
| Token bucket | Smooth, configurable burst | Complex state management | Considered for future |
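
To make the selected algorithm concrete, here is a minimal in-memory sketch of a sliding window log for a single key. It is not the Redis-backed gateway implementation; the `SlidingWindowLog` class is a hypothetical illustration, and the caller supplies the current time so the behaviour is easy to follow.

```python
from collections import deque


class SlidingWindowLog:
    """Sliding window log limiter for one key, kept in memory for illustration."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # accepted request times, oldest first

    def allow(self, now: float) -> bool:
        # Drop log entries that have fallen out of the window ending at `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False  # rejected requests are not logged in this sketch
        self._timestamps.append(now)
        return True
```

Because every accepted request is stored with its own timestamp, the window slides continuously and there is no boundary at which a client can send double the limit. The cost is one stored entry per accepted request, which is the "higher memory per key" noted in the table.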

7.3 Key Schema

# Per-merchant global
rl:merchant:{merchantId}:global

# Per-merchant per-endpoint
rl:merchant:{merchantId}:endpoint:{endpointCategory}

# Per-IP
rl:ip:{ipAddress}:global

# Per-mobile (OTP)
rl:otp:{merchantId}:{mobileNumber}
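
The key schema above, together with the TTL rule from 7.1 (window duration plus a 10-second buffer), can be sketched as a few helper functions. The function names are hypothetical, and the sketch does no escaping: identifiers that themselves contain ":" (IPv6 addresses, for example) would need an encoding decision that this policy does not specify.

```python
TTL_BUFFER_SECONDS = 10  # buffer from section 7.1


def merchant_global_key(merchant_id: str) -> str:
    return f"rl:merchant:{merchant_id}:global"


def merchant_endpoint_key(merchant_id: str, endpoint_category: str) -> str:
    return f"rl:merchant:{merchant_id}:endpoint:{endpoint_category}"


def ip_key(ip_address: str) -> str:
    # Assumption: the address is used verbatim; IPv6 would add extra colons.
    return f"rl:ip:{ip_address}:global"


def otp_key(merchant_id: str, mobile_number: str) -> str:
    return f"rl:otp:{merchant_id}:{mobile_number}"


def key_ttl_seconds(window_seconds: int) -> int:
    # Keys outlive their window slightly so late reads still see the counter.
    return window_seconds + TTL_BUFFER_SECONDS
```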

7.4 Failure Mode

| Scenario | Behaviour |
| --- | --- |
| Redis unavailable | Fail open: requests pass through without rate limiting; alert raised immediately |
| Redis latency > 50ms | Log warning; consider circuit breaker to fail open |
| Redis data loss (failover) | All counters reset; merchants get a fresh window; no 429s during recovery |

Rationale for fail-open: For a payment gateway, a false 429 (blocking a legitimate payment) is worse than temporarily allowing excess traffic. Redis unavailability is already a monitored incident and would trigger immediate investigation.
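
The fail-open behaviour can be sketched as a thin wrapper around the limit check. This is a sketch under stated assumptions: `allow_request` and `check_limit` are hypothetical names, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever exception types the real Redis client raises.

```python
import logging
from typing import Callable

logger = logging.getLogger("ratelimit")


def allow_request(check_limit: Callable[[], bool]) -> bool:
    """Return the limiter's decision, or allow the request if the store is down."""
    try:
        return check_limit()
    except (ConnectionError, TimeoutError) as exc:
        # Fail open: a legitimate payment must not be blocked by a store outage.
        # The error log is where an alert would hook in.
        logger.error("rate limit store unavailable, failing open: %s", exc)
        return True
```

Only store failures are swallowed here; a genuine "limit exceeded" decision (`False`) is passed through unchanged.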


8. Exemptions

8.1 Exempt Traffic

| Traffic Type | Exemption | Rationale |
| --- | --- | --- |
| Webhook delivery retries | Exempt from rate limits | Simpaisa-initiated; retry logic must not be throttled by our own limits |
| Internal service-to-service | Exempt from merchant rate limits | Identified by mTLS certificates; governed by service mesh policies |
| Health check endpoints | Exempt from authenticated limits | /health, /ready must always respond for load balancer probes |
| Cloudflare Workers | Exempt from IP-based limits | Identified by Cloudflare service token header |

8.2 Non-Exempt Traffic

| Traffic Type | Rate Limits Apply | Notes |
| --- | --- | --- |
| Sandbox / test environment | Yes, reduced limits | Sandbox uses per-IP limits (200 req/min) |
| Merchant SDK traffic | Yes, standard merchant limits | SDK requests carry merchant credentials |
| Third-party integrator traffic | Yes, standard merchant limits | Integrators operate under the merchant's allocation |

9. Monitoring & Alerting

9.1 Grafana Dashboard — Rate Limit Monitoring

The rate limiting dashboard MUST display:

| Panel | Description | Refresh |
| --- | --- | --- |
| 429s per merchant | Time series of rate-limited requests per merchantId | 15 seconds |
| 429s per endpoint | Time series of rate-limited requests per endpoint category | 15 seconds |
| Top 10 throttled merchants | Table of merchants with highest 429 counts (rolling 1 hour) | 1 minute |
| Rate limit utilisation | Per-merchant utilisation as percentage of their limit | 1 minute |
| Channel utilisation | Per-channel request volume vs. upstream limit | 30 seconds |
| Redis health | Rate limit Redis cluster latency, memory, connection count | 15 seconds |

9.2 Alerts

| Alert | Condition | Severity | Action |
| --- | --- | --- | --- |
| Sustained 429s | Merchant receives > 50 rate-limited responses in 5 minutes | Warning | Notify merchant via DevEx portal; investigate if legitimate traffic growth |
| Burst 429s | Merchant receives > 200 rate-limited responses in 1 minute | High | Investigate for potential abuse or misconfigured integration |
| IP block triggered | Any IP blocked due to authentication failures | Medium | Log for security review; check for credential stuffing |
| Channel near limit | Upstream channel at > 80% of its rate limit | Warning | Review traffic distribution; consider queuing |
| Channel at limit | Upstream channel at 100% of its rate limit | High | Activate queuing; notify affected merchants |
| Redis rate limit cluster down | Redis unavailable or latency > 100ms | Critical | Rate limiting fail-open; investigate immediately |
| Anomalous traffic pattern | Single merchant's request volume increases > 5x in 1 hour | Warning | Investigate; may indicate integration error or abuse |

9.3 Metrics (OpenTelemetry)

All rate limiting events emit OpenTelemetry metrics:

| Metric | Type | Labels |
| --- | --- | --- |
| simpaisa.ratelimit.requests.total | Counter | merchant_id, endpoint, result (allowed/denied) |
| simpaisa.ratelimit.latency.ms | Histogram | operation (check/increment) |
| simpaisa.ratelimit.utilisation.ratio | Gauge | merchant_id, limit_type |
| simpaisa.ratelimit.redis.errors | Counter | error_type |

10. Merchant Communication

10.1 Developer Experience Portal

Rate limits MUST be documented in the merchant DevEx portal with:

| Content | Detail |
| --- | --- |
| Rate limit overview | Explanation of all tiers and how limits are applied |
| Current limits | Per-merchant view showing their contracted limits |
| Usage dashboard | Real-time view of current utilisation against limits |
| Best practices | Guidance on implementing exponential backoff, request batching, caching status responses |
| Code examples | SDK examples showing proper 429 handling in Java, Python, Node.js, PHP, C# |

10.2 Approaching Limits Notification

| Threshold | Action |
| --- | --- |
| 80% utilisation | DevEx portal displays yellow warning banner |
| 90% utilisation | Email notification to merchant's technical contact |
| 100% utilisation (first 429) | Webhook notification to merchant (if configured) |
| Sustained 429s (> 5 minutes) | Email to merchant's technical contact + account manager notified |

10.3 Sandbox Rate Limit Testing

The sandbox environment provides a /v3/test/rate-limit endpoint that artificially returns 429 on every other request. Merchants can use it to:

  • Verify their 429 handling logic
  • Test Retry-After header parsing
  • Validate exponential backoff implementation
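
A client-side handler that could be exercised against this endpoint might look like the sketch below. It is illustrative rather than an official SDK example: `retry_delay` and `call_with_retry` are hypothetical names, the `send` callable stands in for the merchant's HTTP call and returns a (status, headers) pair, and only the delta-seconds form of Retry-After is parsed. A production client would typically also add random jitter to the backoff.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, response headers)


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 0.5, cap: float = 60.0) -> float:
    """Prefer the server's Retry-After; otherwise back off exponentially."""
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `send` until it stops returning 429 or attempts run out."""
    status = 429
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        # No point sleeping after the final attempt.
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    return status
```

Passing a custom `sleep` (for example a list's `append`) makes it possible to check the chosen delays in a test without actually waiting.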

11. Escalation & Limit Increases

11.1 Standard Limit Increase

| Step | Action | Owner |
| --- | --- | --- |
| 1 | Merchant submits limit increase request via DevEx portal or support ticket | Merchant |
| 2 | Account manager reviews request against merchant's transaction volume and contract | Account Manager |
| 3 | Technical review: assess infrastructure capacity for requested limits | Platform Engineering |
| 4 | Approval and configuration update in KrakenD | Platform Engineering |
| 5 | Merchant notified of new limits; DevEx portal updated | Account Manager |

SLA: Standard limit increase requests processed within 3 business days.

11.2 CDO Approval Required

| Scenario | Approval Required |
| --- | --- |
| Global limit > 5,000 req/min | CDO approval |
| Payment initiation > 500 req/min | CDO approval |
| Custom burst configuration | CDO approval |
| Rate limit exemption for any endpoint | CDO approval |
| Temporary limit increase > 30 days | CDO approval |

11.3 Emergency Limit Increase

For production incidents where rate limits are blocking legitimate traffic:

| Step | Action | Timeline |
| --- | --- | --- |
| 1 | On-call engineer verifies the traffic is legitimate (not an attack) | Immediate |
| 2 | Temporary 2x limit increase applied via KrakenD config update | Within 15 minutes |
| 3 | Incident documented and CDO notified | Within 1 hour |
| 4 | Permanent limit adjustment or merchant migration to higher tier | Within 3 business days |

12. Implementation Checklist

  • KrakenD rate limiting plugin configured with Redis backend
  • Redis cluster provisioned (dedicated, not shared with application cache)
  • Per-merchant global limits configured (default 1,000 req/min)
  • Per-endpoint limits configured for all endpoint categories
  • Per-IP limits configured for sandbox and unauthenticated endpoints
  • Burst allowance configured (2x sustained, 10-second window)
  • Rate limit response headers injected on all responses (not just 429)
  • 429 response body follows unified error schema
  • Retry-After header included on all 429 responses
  • Webhook delivery retries exempted from rate limits
  • Internal service-to-service traffic exempted (identified by mTLS)
  • Grafana dashboard created with all required panels
  • OpenTelemetry metrics emitting for all rate limit events
  • Alerts configured for sustained 429s, channel saturation, Redis health
  • DevEx portal updated with rate limit documentation
  • Approaching-limit notifications configured (80%, 90%, 100%)
  • Sandbox rate limit test endpoint deployed
  • Upstream channel limits documented in Vendor Integration Register
  • Per-channel rate limits configured in KrakenD
  • Fail-open behaviour tested and verified
  • Load test executed to validate limits under production-like conditions

Cross-References

| Document | Relevance |
| --- | --- |
| API-STANDARDS.md | Section 13: Rate Limiting headers and response format |
| SECURITY-ARCHITECTURE.md | Section 7: API Security Controls (rate limiting requirements) |
| INFRASTRUCTURE-STANDARDS.md | Section 7: KrakenD Gateway (rate limiting configuration) |
| VENDOR-INTEGRATION-REGISTER.md | Upstream channel rate limits per operator |
| INCIDENT-RESPONSE-PLAYBOOK.md | Escalation procedures for rate-limiting incidents |