Rate Limiting Policy¶

Owner	Classification	Review Date	Status
Security	Confidential	April 2027	Active

Organisation: Simpaisa Holdings
Document Owner: Daniel O'Reilly, Chief Digital Officer
Classification: Internal
Version: 1.0
Date: 3 April 2026
Status: Active

1. Purpose¶

Simpaisa processes 270M+ transactions worth $1B+ annually across Pakistan, Bangladesh, Nepal, Iraq, and Egypt. Rate limiting is a critical control that:

Protects payment infrastructure — prevents resource exhaustion that could cause platform-wide outages affecting all merchants and all markets simultaneously.
Prevents abuse — defends against brute-force attacks on OTP endpoints, credential stuffing, transaction enumeration, and denial-of-service attempts.
Ensures fair usage — prevents any single merchant from consuming disproportionate resources, ensuring equitable service quality across all integrations.
Preserves upstream channel availability — respects the rate limits imposed by telco operators (Easypaisa, JazzCash, bKash, etc.) and banking partners, preventing Simpaisa from being throttled or blocked by upstream providers.
Supports regulatory compliance — provides audit evidence of security controls required by SBP, Bangladesh Bank, Nepal Rastra Bank, and CBI.

Current state: Rate limiting is rated 2/10 (Critical) in the Security Architecture assessment. No documented rate limits exist. This policy defines the target state.

2. Rate Limiting Architecture¶

2.1 Enforcement Point¶

Rate limiting is enforced at the KrakenD API Gateway layer, not within individual services. This is a deliberate architectural decision:

Client Request
       │
       ▼
┌──────────────┐
│  Cloudflare  │  ← L7 DDoS protection, IP reputation, bot management
│  (Edge)      │     Global per-IP limits enforced here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│   KrakenD    │  ← Per-merchant, per-endpoint, per-IP rate limiting
│  (Gateway)   │     All rate limit headers injected here
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Go Service  │  ← No rate limiting logic in services
│  (Backend)   │     Services trust that gateway has enforced limits
└──────────────┘

2.2 Why Gateway-Level, Not Per-Service¶

Reason	Detail
Single enforcement point	Avoids inconsistent limits across services; one configuration governs all
Centralised visibility	All rate limit events flow through one component for monitoring and alerting
No service code changes	Adding or modifying limits requires gateway configuration changes only
Consistent headers	Rate limit response headers are injected uniformly regardless of backend service
Upstream protection	Gateway can enforce limits before requests reach services, protecting service resources

2.3 Defence in Depth¶

Cloudflare provides the first layer of defence at the edge:

Cloudflare Control	Purpose
IP reputation scoring	Block known-bad IPs before they reach the gateway
Bot management	Challenge automated traffic with Turnstile
L7 DDoS mitigation	Absorb volumetric attacks at the edge
Geographic restrictions	Block traffic from non-operational countries if required
WAF rules	Block malicious payloads before rate limiting applies

KrakenD provides the second layer with application-aware limits based on merchant identity, endpoint, and request context.

3. Rate Limit Tiers¶

3.1 Per-Merchant Global Rate Limit¶

Parameter	Value
Default limit	1,000 requests per minute
Scope	All endpoints combined, per `merchantId`
Configurable	Yes — per merchant contract
Minimum	100 requests per minute
Maximum	10,000 requests per minute (any limit above 5,000 requires CDO approval; see Section 11.2)
Identification	`merchantId` extracted from authenticated request (API key or signed header)

Merchants exceeding their global limit receive 429 Too Many Requests regardless of which endpoint they are calling.

3.2 Per-Merchant Per-Endpoint Rate Limits¶

Endpoint Category	Rate Limit	Scope	Rationale
Payment initiation (Pay-In, Pay-Out, Remittance create)	100 req/min	Per merchantId	Payment creation is expensive — hits upstream channels, creates state
Inquiry / status (transaction status, balance check, rate inquiry)	500 req/min	Per merchantId	Read-heavy; merchants poll for status; must be generous
OTP request	3 req/5 min	Per mobile number per merchantId	Prevents OTP bombing; protects SMS costs
OTP verification	5 attempts/OTP	Per OTP session	Prevents brute-force; locks OTP after 5 failures
Webhook management (register, update, delete)	10 req/hour	Per merchantId	Low-frequency operations; prevents configuration churn
Merchant onboarding (KYC submission, document upload)	20 req/hour	Per merchantId	Protects file upload resources
Refund initiation	50 req/min	Per merchantId	Lower than payment initiation; refunds are less frequent
Bulk/batch operations	10 req/min	Per merchantId	Each request contains multiple items; protects backend processing
FX rate inquiry	200 req/min	Per merchantId	High-frequency for remittance corridors; cached at gateway
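
The OTP verification row above describes a per-session lockout after five failures. A minimal Python sketch of that behaviour follows. The `OtpSession` class and its return strings are hypothetical names chosen for illustration; they are not part of the Simpaisa API, and the sketch keeps state in memory rather than in the gateway.

```python
import hmac


class OtpSession:
    """Tracks failed verification attempts for one issued OTP (hypothetical helper)."""

    MAX_FAILURES = 5  # from the policy table: 5 attempts per OTP

    def __init__(self, expected_code: str) -> None:
        self._expected = expected_code
        self.failures = 0

    def verify(self, submitted_code: str) -> str:
        """Return "ok", "invalid", or "locked" for one verification attempt."""
        if self.failures >= self.MAX_FAILURES:
            return "locked"  # even a correct code is refused once locked
        # compare_digest avoids leaking how many leading characters matched
        if hmac.compare_digest(self._expected.encode(), submitted_code.encode()):
            return "ok"
        self.failures += 1
        return "invalid"
```

After the fifth wrong code the session stays locked, so a sixth attempt is refused even if it carries the correct code.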

3.3 Per-IP Rate Limits¶

Scenario	Rate Limit	Scope	Action on Exceed
Sandbox / unauthenticated	200 req/min	Per IP address	429 + Retry-After header
Authentication failures	5 attempts/15 min	Per IP address	Temporary IP block (15 minutes)
Global per-IP (authenticated)	1,000 req/min	Per IP address	429 + Cloudflare challenge
Unauthenticated endpoints (health, docs)	60 req/min	Per IP address	429

3.4 Burst Allowance¶

Parameter	Value
Burst multiplier	2x sustained limit
Burst window	10 seconds
Behaviour	Permits short traffic spikes (e.g., batch submission) without triggering 429
Recovery	After burst window expires, rate returns to sustained limit; no penalty

Example: A merchant with a 100 req/min payment initiation limit (about 1.7 req/s) can send at up to twice that rate, roughly 33 requests, within any 10-second window, provided their total over the full minute does not exceed 100.

3.5 Tiered Merchant Plans¶

Tier	Global Limit	Payment Initiation	Inquiry/Status	Monthly Transaction Volume
Standard	1,000 req/min	100 req/min	500 req/min	Up to 1M transactions
Professional	3,000 req/min	300 req/min	1,500 req/min	1M–10M transactions
Enterprise	5,000 req/min	500 req/min	3,000 req/min	10M+ transactions
Custom	Per contract	Per contract	Per contract	By agreement
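
The tier table can be represented as a small lookup structure. The sketch below is illustrative only: `TierLimits`, `TIERS`, and `suggest_tier` are hypothetical names, the Custom tier is omitted because its values are per contract, and the handling of the exact 1M and 10M boundaries is an assumption the table does not settle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    global_per_min: int
    payment_initiation_per_min: int
    inquiry_per_min: int


# Values copied from the tier table; "Custom" is negotiated per contract.
TIERS = {
    "standard": TierLimits(1_000, 100, 500),
    "professional": TierLimits(3_000, 300, 1_500),
    "enterprise": TierLimits(5_000, 500, 3_000),
}


def suggest_tier(monthly_transactions: int) -> str:
    """Map monthly transaction volume to a tier name.

    Assumption: exactly 1M falls in Standard and exactly 10M in Professional.
    """
    if monthly_transactions < 0:
        raise ValueError("monthly_transactions must be non-negative")
    if monthly_transactions <= 1_000_000:
        return "standard"
    if monthly_transactions <= 10_000_000:
        return "professional"
    return "enterprise"
```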

4. Payment Channel Rate Limits¶

Simpaisa must respect upstream channel limits imposed by telco operators and banking partners. If an upstream channel's rate limit is lower than Simpaisa's default, the more restrictive limit applies.

4.1 Known Channel Limits¶

Channel	Market	Upstream Limit	Simpaisa Limit	Notes
Easypaisa	PK	Document per integration	Per channel agreement	Telenor Microfinance Bank API limits
JazzCash	PK	Document per integration	Per channel agreement	Jazz/Mobilink API limits
UBL Omni	PK	Document per integration	Per channel agreement	Banking API limits
HBL	PK	Document per integration	Per channel agreement	Banking API limits
bKash	BD	Document per integration	Per channel agreement	bKash merchant API limits
Nagad	BD	Document per integration	Per channel agreement	Nagad API limits
eSewa	NP	Document per integration	Per channel agreement	eSewa API limits
Khalti	NP	Document per integration	Per channel agreement	Khalti API limits

4.2 Channel Rate Limit Management¶

Requirement	Implementation
Documentation	Every channel integration must document the upstream provider's rate limits in the Vendor Integration Register
Configuration	KrakenD rate limit configuration must include per-channel limits derived from upstream constraints
Circuit breaking	If a channel returns throttling responses (HTTP 429 or equivalent), KrakenD must back off and queue requests
Monitoring	Track per-channel request volume vs. upstream limit; alert at 80% utilisation
Fair allocation	When multiple merchants share a channel, implement fair-share allocation to prevent one merchant exhausting the channel
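
Two of the requirements above lend themselves to a short worked example: applying the more restrictive of the Simpaisa and upstream limits, and sharing a channel's budget fairly between merchants. The Python sketch below uses max-min fairness as one possible fair-share scheme; the policy does not name an algorithm, so that choice and the function names `effective_limit` and `fair_share` are assumptions.

```python
def effective_limit(simpaisa_limit: int, upstream_limit: int) -> int:
    """The more restrictive of the two limits applies."""
    return min(simpaisa_limit, upstream_limit)


def fair_share(capacity: int, demands: dict[str, int]) -> dict[str, int]:
    """Max-min fair split of a channel's per-window request budget.

    Each round offers every unsatisfied merchant an equal share of what is
    left; merchants that need less return the surplus to the pool. A small
    remainder (fewer requests than pending merchants) may stay unallocated.
    """
    allocation = {merchant: 0 for merchant in demands}
    pending = {m: d for m, d in demands.items() if d > 0}
    remaining = capacity
    while pending and remaining > 0:
        share = remaining // len(pending)
        if share == 0:
            break
        for merchant in sorted(pending):  # sorted() copies, so deletion is safe
            grant = min(share, pending[merchant])
            allocation[merchant] += grant
            remaining -= grant
            pending[merchant] -= grant
            if pending[merchant] == 0:
                del pending[merchant]
    return allocation
```

With a channel budget of 100 and demands of 10, 60, and 80, the small merchant is fully served and the two large merchants split the rest evenly at 45 each.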

5. Response Headers¶

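All API responses MUST include rate limit headers. The names below follow the widely used `X-RateLimit-*` convention and are modelled on the IETF draft "RateLimit header fields for HTTP"; note that the draft itself defines unprefixed fields and expresses reset as a number of seconds rather than an epoch timestamp.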

5.1 Standard Headers¶

Header	Description	Example
`X-RateLimit-Limit`	Maximum number of requests permitted in the current window	`X-RateLimit-Limit: 100`
`X-RateLimit-Remaining`	Number of requests remaining in the current window	`X-RateLimit-Remaining: 73`
`X-RateLimit-Reset`	Unix epoch timestamp (seconds) when the current window resets	`X-RateLimit-Reset: 1712160000`
`Retry-After`	Seconds until the client should retry (only on 429 responses)	`Retry-After: 30`

5.2 Header Rules¶

Rule	Detail
Always present	Rate limit headers MUST be included on every response (2xx, 4xx, 5xx), not only on 429
Most restrictive	When multiple limits apply (global + per-endpoint), headers reflect the most restrictive limit
Reset accuracy	`X-RateLimit-Reset` MUST reflect the actual window reset time, not an approximation
Retry-After on 429	Every 429 response MUST include `Retry-After` with the number of seconds to wait
No negative values	`X-RateLimit-Remaining` MUST NOT be negative; minimum value is `0`
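
The header rules above can be sketched as a single function. This is a minimal illustration, not the gateway's implementation: `LimitState` and `rate_limit_headers` are hypothetical names, and "most restrictive" is assumed here to mean the limit with the fewest requests remaining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitState:
    limit: int        # requests permitted in the window
    used: int         # requests counted so far in the window
    reset_epoch: int  # Unix time (seconds) when the window resets


def rate_limit_headers(states: list[LimitState], now_epoch: int,
                       limited: bool = False) -> dict[str, str]:
    """Build response headers from every limit that applies to the request."""

    def remaining(state: LimitState) -> int:
        return max(0, state.limit - state.used)  # never negative

    # Assumption: the most restrictive limit is the one closest to exhaustion.
    tightest = min(states, key=remaining)
    headers = {
        "X-RateLimit-Limit": str(tightest.limit),
        "X-RateLimit-Remaining": str(remaining(tightest)),
        "X-RateLimit-Reset": str(tightest.reset_epoch),
    }
    if limited:  # Retry-After only accompanies 429 responses
        headers["Retry-After"] = str(max(1, tightest.reset_epoch - now_epoch))
    return headers
```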

5.3 Example Response Headers (Normal Request)¶

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1712160060
X-Request-Id: txn-abc-123-def-456

5.4 Example Response Headers (Rate Limited)¶

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712160060
Retry-After: 30
X-Request-Id: txn-abc-123-def-456

6. 429 Response Format¶

Rate-limited responses use the unified error schema defined in API-STANDARDS.md:

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded the rate limit for this endpoint. Please retry after the period indicated in the Retry-After header.",
    "details": [
      {
        "field": "endpoint",
        "issue": "Payment initiation limit of 100 requests per minute exceeded"
      }
    ]
  },
  "traceId": "abc-123-def-456",
  "timestamp": "2026-04-03T12:00:00Z"
}

6.1 Error Codes for Rate Limiting¶

Error Code	Trigger	Message
`RATE_LIMIT_EXCEEDED`	Any rate limit exceeded	"You have exceeded the rate limit for this endpoint"
`RATE_LIMIT_GLOBAL`	Merchant global limit exceeded	"You have exceeded your global rate limit across all endpoints"
`RATE_LIMIT_BURST`	Burst allowance exceeded	"Request rate too high; burst allowance exhausted"
`RATE_LIMIT_OTP`	OTP request limit exceeded	"Too many OTP requests for this mobile number"
`IP_BLOCKED_TEMPORARY`	IP temporarily blocked due to auth failures	"Your IP has been temporarily blocked due to repeated authentication failures"
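
As an illustration of how the error codes map onto the response body shown in Section 6, the Python sketch below assembles a 429 payload. `build_429_body` and `RATE_LIMIT_MESSAGES` are hypothetical names; the authoritative schema is the one in API-STANDARDS.md, and the default `field` value simply mirrors the example above.

```python
from datetime import datetime, timezone

# Code-to-message mapping copied from the error code table.
RATE_LIMIT_MESSAGES = {
    "RATE_LIMIT_EXCEEDED": "You have exceeded the rate limit for this endpoint",
    "RATE_LIMIT_GLOBAL": "You have exceeded your global rate limit across all endpoints",
    "RATE_LIMIT_BURST": "Request rate too high; burst allowance exhausted",
    "RATE_LIMIT_OTP": "Too many OTP requests for this mobile number",
    "IP_BLOCKED_TEMPORARY": "Your IP has been temporarily blocked due to "
                            "repeated authentication failures",
}


def build_429_body(code: str, issue: str, trace_id: str,
                   now: datetime, field: str = "endpoint") -> dict:
    """Return the JSON-serialisable body for a rate-limited response.

    `now` must be timezone-aware; it is rendered in UTC with a "Z" suffix.
    """
    if code not in RATE_LIMIT_MESSAGES:
        raise ValueError(f"unknown rate limit error code: {code}")
    return {
        "error": {
            "code": code,
            "message": RATE_LIMIT_MESSAGES[code],
            "details": [{"field": field, "issue": issue}],
        },
        "traceId": trace_id,
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```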

7. Rate Limit Storage¶

7.1 Storage Backend¶

Parameter	Value
Backend	Redis (ElastiCache — existing infrastructure)
Algorithm	Sliding window log
Key format	`rl:{scope}:{identifier}:{qualifier}` (full patterns in Section 7.3)
TTL	Window duration + 10 seconds buffer
Cluster	Dedicated Redis cluster for rate limiting (not shared with application cache)

7.2 Why Sliding Window¶

Algorithm	Pros	Cons	Decision
Fixed window	Simple, low memory	Burst at window boundary (2x actual rate)	Rejected
Sliding window log	Accurate, no boundary burst	Higher memory per key	Selected
Token bucket	Smooth, configurable burst	Complex state management	Considered for future
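
To make the selected algorithm concrete, here is a minimal in-memory sliding window log in Python. It is a sketch for explanation only: the production design stores the log in Redis, the class name `SlidingWindowLog` is hypothetical, and the choice not to record denied requests is an assumption.

```python
from collections import deque


class SlidingWindowLog:
    """Allows at most `limit` requests per key in any trailing window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._logs: dict[str, deque] = {}  # key -> timestamps of allowed requests

    def allow(self, key: str, now: float) -> bool:
        log = self._logs.setdefault(key, deque())
        cutoff = now - self.window
        # Drop timestamps that have slid out of the window.
        while log and log[0] <= cutoff:
            log.popleft()
        if len(log) >= self.limit:
            return False  # denied requests are not recorded (assumption)
        log.append(now)
        return True
```

Because every allowed request keeps its own timestamp, there is no window boundary at which a client can send a double burst, at the cost of memory proportional to the limit per key.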

7.3 Key Schema¶

# Per-merchant global
rl:merchant:{merchantId}:global

# Per-merchant per-endpoint
rl:merchant:{merchantId}:endpoint:{endpointCategory}

# Per-IP
rl:ip:{ipAddress}:global

# Per-mobile (OTP)
rl:otp:{merchantId}:{mobileNumber}
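
The key patterns above can be produced by small helper functions, sketched here in Python. The function names are hypothetical; the TTL helper applies the "window duration + 10 seconds" rule from Section 7.1.

```python
def merchant_global_key(merchant_id: str) -> str:
    return f"rl:merchant:{merchant_id}:global"


def merchant_endpoint_key(merchant_id: str, endpoint_category: str) -> str:
    return f"rl:merchant:{merchant_id}:endpoint:{endpoint_category}"


def ip_key(ip_address: str) -> str:
    # Note: IPv6 addresses contain ":" themselves, so keys should be built
    # from parts like this rather than parsed back by splitting on ":".
    return f"rl:ip:{ip_address}:global"


def otp_key(merchant_id: str, mobile_number: str) -> str:
    return f"rl:otp:{merchant_id}:{mobile_number}"


def key_ttl_seconds(window_seconds: int) -> int:
    """Window duration plus the 10-second buffer."""
    return window_seconds + 10
```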

7.4 Failure Mode¶

Scenario	Behaviour
Redis unavailable	Fail open — requests pass through without rate limiting; alert raised immediately
Redis latency > 50ms	Log warning; consider circuit breaker to fail open
Redis data loss (failover)	All counters reset; merchants get a fresh window; no 429s during recovery

Rationale for fail-open: For a payment gateway, a false 429 (blocking a legitimate payment) is worse than temporarily allowing excess traffic. Redis unavailability is already a monitored incident and would trigger immediate investigation.
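
The fail-open rule can be expressed as a thin wrapper around the limit check. The Python sketch below is an assumption-laden illustration: `check_fail_open` is a hypothetical name, and it assumes the storage client signals unavailability with `ConnectionError` or `TimeoutError`, which a real Redis client may wrap in its own exception types.

```python
import logging
from typing import Callable

logger = logging.getLogger("ratelimit")


def check_fail_open(check: Callable[[], bool]) -> bool:
    """Run a rate limit check; allow the request if the backend is unavailable.

    `check` returns True when the request is within its limit.
    """
    try:
        return check()
    except (ConnectionError, TimeoutError) as exc:
        # Fail open: a false 429 on a legitimate payment is the worse outcome.
        logger.error("rate limit backend unavailable, failing open: %s", exc)
        return True
```

A genuine "over limit" answer still results in a denial; only backend failures are converted into an allow decision, and each one is logged so that the alert in Section 9.2 can fire.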

8. Exemptions¶

8.1 Exempt Traffic¶

Traffic Type	Exemption	Rationale
Webhook delivery retries	Exempt from rate limits	Simpaisa-initiated; retry logic must not be throttled by our own limits
Internal service-to-service	Exempt from merchant rate limits	Identified by mTLS certificates; governed by service mesh policies
Health check endpoints	Exempt from authenticated limits	`/health`, `/ready` must always respond for load balancer probes
Cloudflare Workers	Exempt from IP-based limits	Identified by Cloudflare service token header

8.2 Non-Exempt Traffic¶

Traffic Type	Rate Limits Apply	Notes
Sandbox / test environment	Yes — reduced limits	Sandbox uses per-IP limits (200 req/min)
Merchant SDK traffic	Yes — standard merchant limits	SDK requests carry merchant credentials
Third-party integrator traffic	Yes — standard merchant limits	Integrators operate under the merchant's allocation

9. Monitoring & Alerting¶

9.1 Grafana Dashboard — Rate Limit Monitoring¶

The rate limiting dashboard MUST display:

Panel	Description	Refresh
429s per merchant	Time series of rate-limited requests per merchantId	15 seconds
429s per endpoint	Time series of rate-limited requests per endpoint category	15 seconds
Top 10 throttled merchants	Table of merchants with highest 429 counts (rolling 1 hour)	1 minute
Rate limit utilisation	Per-merchant utilisation as percentage of their limit	1 minute
Channel utilisation	Per-channel request volume vs. upstream limit	30 seconds
Redis health	Rate limit Redis cluster latency, memory, connection count	15 seconds

9.2 Alerts¶

Alert	Condition	Severity	Action
Sustained 429s	Merchant receives > 50 rate-limited responses in 5 minutes	Warning	Notify merchant via DevEx portal; investigate if legitimate traffic growth
Burst 429s	Merchant receives > 200 rate-limited responses in 1 minute	High	Investigate for potential abuse or misconfigured integration
IP block triggered	Any IP blocked due to authentication failures	Medium	Log for security review; check for credential stuffing
Channel near limit	Upstream channel at > 80% of its rate limit	Warning	Review traffic distribution; consider queuing
Channel at limit	Upstream channel at 100% of its rate limit	High	Activate queuing; notify affected merchants
Redis rate limit cluster down	Redis unavailable or latency > 100ms	Critical	Rate limiting fail-open; investigate immediately
Anomalous traffic pattern	Single merchant's request volume increases > 5x in 1 hour	Warning	Investigate; may indicate integration error or abuse
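
The merchant-facing alert conditions above reduce to simple threshold checks, sketched here in Python. The function names are hypothetical, and reading "increases > 5x in 1 hour" as a comparison against the previous hour's volume is an assumption.

```python
def merchant_429_alerts(count_last_1_min: int,
                        count_last_5_min: int) -> list[tuple[str, str]]:
    """Return (alert name, severity) pairs triggered by a merchant's 429 counts."""
    alerts = []
    if count_last_5_min > 50:
        alerts.append(("Sustained 429s", "Warning"))
    if count_last_1_min > 200:
        alerts.append(("Burst 429s", "High"))
    return alerts


def is_anomalous_growth(current_hour_requests: int,
                        previous_hour_requests: int) -> bool:
    """True when volume grew more than 5x versus the previous hour (assumed baseline)."""
    if previous_hour_requests <= 0:
        return False  # no baseline to compare against
    return current_hour_requests > 5 * previous_hour_requests
```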

9.3 Metrics (OpenTelemetry)¶

All rate limiting events emit OpenTelemetry metrics:

Metric	Type	Labels
`simpaisa.ratelimit.requests.total`	Counter	`merchant_id`, `endpoint`, `result` (allowed/denied)
`simpaisa.ratelimit.latency.ms`	Histogram	`operation` (check/increment)
`simpaisa.ratelimit.utilisation.ratio`	Gauge	`merchant_id`, `limit_type`
`simpaisa.ratelimit.redis.errors`	Counter	`error_type`

10. Merchant Communication¶

10.1 Developer Experience Portal¶

Rate limits MUST be documented in the merchant DevEx portal with:

Content	Detail
Rate limit overview	Explanation of all tiers and how limits are applied
Current limits	Per-merchant view showing their contracted limits
Usage dashboard	Real-time view of current utilisation against limits
Best practices	Guidance on implementing exponential backoff, request batching, caching status responses
Code examples	SDK examples showing proper 429 handling in Java, Python, Node.js, PHP, C#

10.2 Approaching Limits Notification¶

Threshold	Action
80% utilisation	DevEx portal displays yellow warning banner
90% utilisation	Email notification to merchant's technical contact
100% utilisation (first 429)	Webhook notification to merchant (if configured)
Sustained 429s (> 5 minutes)	Email to merchant's technical contact + account manager notified
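
The notification thresholds above can be read as a cumulative ladder. The Python sketch below maps a utilisation ratio to the actions that apply; the action identifiers are hypothetical labels, and treating each threshold as inclusive is an assumption.

```python
def notification_actions(utilisation: float,
                         sustained_429_minutes: float = 0.0) -> list[str]:
    """Return the notification actions triggered at a given utilisation.

    `utilisation` is current usage divided by the merchant's limit (1.0 = 100%).
    """
    actions = []
    if utilisation >= 0.80:
        actions.append("portal_warning_banner")
    if utilisation >= 0.90:
        actions.append("email_technical_contact")
    if utilisation >= 1.00:
        actions.append("webhook_notification")  # only if the merchant configured one
    if sustained_429_minutes > 5:
        actions.append("email_technical_contact_and_account_manager")
    return actions
```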

10.3 Sandbox Rate Limit Testing¶

The sandbox environment provides a /v3/test/rate-limit endpoint that deliberately returns 429 on every other request. Merchants can use it to:

Verify their 429 handling logic
Test Retry-After header parsing
Validate exponential backoff implementation

11. Escalation & Limit Increases¶

11.1 Standard Limit Increase¶

Step	Action	Owner
1	Merchant submits limit increase request via DevEx portal or support ticket	Merchant
2	Account manager reviews request against merchant's transaction volume and contract	Account Manager
3	Technical review — assess infrastructure capacity for requested limits	Platform Engineering
4	Approval and configuration update in KrakenD	Platform Engineering
5	Merchant notified of new limits; DevEx portal updated	Account Manager

SLA: Standard limit increase requests processed within 3 business days.

11.2 CDO Approval Required¶

Scenario	Approval Required
Global limit > 5,000 req/min	CDO approval
Payment initiation > 500 req/min	CDO approval
Custom burst configuration	CDO approval
Rate limit exemption for any endpoint	CDO approval
Temporary limit increase > 30 days	CDO approval
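
The approval scenarios above can be checked mechanically when a limit change is requested. The Python sketch below is illustrative: `LimitChangeRequest` and `cdo_approval_reasons` are hypothetical names, and the request fields are an assumed shape rather than an existing schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitChangeRequest:
    global_per_min: int
    payment_initiation_per_min: int
    custom_burst: bool = False
    endpoint_exemption: bool = False
    temporary_days: int = 0  # 0 means the change is not temporary


def cdo_approval_reasons(request: LimitChangeRequest) -> list[str]:
    """List every reason the request needs CDO approval (empty if none)."""
    reasons = []
    if request.global_per_min > 5_000:
        reasons.append("global limit above 5,000 req/min")
    if request.payment_initiation_per_min > 500:
        reasons.append("payment initiation above 500 req/min")
    if request.custom_burst:
        reasons.append("custom burst configuration")
    if request.endpoint_exemption:
        reasons.append("rate limit exemption for an endpoint")
    if request.temporary_days > 30:
        reasons.append("temporary increase longer than 30 days")
    return reasons
```

An Enterprise-tier request at exactly 5,000 and 500 req/min returns no reasons, which matches the strict "greater than" wording in the table.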

11.3 Emergency Limit Increase¶

For production incidents where rate limits are blocking legitimate traffic:

Step	Action	Timeline
1	On-call engineer verifies the traffic is legitimate (not an attack)	Immediate
2	Temporary 2x limit increase applied via KrakenD config update	Within 15 minutes
3	Incident documented and CDO notified	Within 1 hour
4	Permanent limit adjustment or merchant migration to higher tier	Within 3 business days

12. Implementation Checklist¶

KrakenD rate limiting plugin configured with Redis backend
Redis cluster provisioned (dedicated, not shared with application cache)
Per-merchant global limits configured (default 1,000 req/min)
Per-endpoint limits configured for all endpoint categories
Per-IP limits configured for sandbox and unauthenticated endpoints
Burst allowance configured (2x sustained, 10-second window)
Rate limit response headers injected on all responses (not just 429)
429 response body follows unified error schema
Retry-After header included on all 429 responses
Webhook delivery retries exempted from rate limits
Internal service-to-service traffic exempted (identified by mTLS)
Grafana dashboard created with all required panels
OpenTelemetry metrics emitting for all rate limit events
Alerts configured for sustained 429s, channel saturation, Redis health
DevEx portal updated with rate limit documentation
Approaching-limit notifications configured (80%, 90%, 100%)
Sandbox rate limit test endpoint deployed
Upstream channel limits documented in Vendor Integration Register
Per-channel rate limits configured in KrakenD
Fail-open behaviour tested and verified
Load test executed to validate limits under production-like conditions

Cross-References¶

Document	Relevance
API-STANDARDS.md	Section 13: Rate Limiting headers and response format
SECURITY-ARCHITECTURE.md	Section 7: API Security Controls — rate limiting requirements
INFRASTRUCTURE-STANDARDS.md	Section 7: KrakenD Gateway — rate limiting configuration
VENDOR-INTEGRATION-REGISTER.md	Upstream channel rate limits per operator
INCIDENT-RESPONSE-PLAYBOOK.md	Escalation procedures for rate-limiting incidents