Status: Accepted
Date: 2026-02-14
Deciders: Reflections Maintainers
Context
The API currently applies hono-rate-limiter middleware with an in-process memory store. This is simple and fast at current traffic levels, but it has a known caveat under horizontal scaling:
- Limits are enforced per-instance, not globally.
- Different instances can each accept requests for the same logical client within the same window.
- Observed 429 behavior can vary with load-balancer distribution.
Current key classes:
- JWT-authenticated traffic (auth:user:<sub>)
- Trusted proxy IP traffic (auth:ip:<ip>, public:ip:<ip>, tools:ip:<ip>)
- Unknown client fallback (auth:path:<path>, public:path:<path>, tools:anon:<hash>)
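As a rough illustration of how the key formats above could be produced, the following is a minimal sketch and not the project's actual code. The names LimiterInput, deriveKey and fnv1a are hypothetical, the precedence order (JWT subject, then trusted proxy IP, then fallback) is an assumption, and hashing the route path for tools:anon is also an assumption, since the ADR does not say what the hash covers.

```typescript
// Hypothetical sketch: only the key formats come from the ADR.
type Scope = "auth" | "public" | "tools";

interface LimiterInput {
  scope: Scope;
  path: string; // route path, e.g. "/auth/login"
  jwtSub?: string; // subject claim of a verified JWT, if present
  trustedIp?: string; // client IP, set only when a trusted proxy supplied it
}

// FNV-1a style 32-bit hash over UTF-16 code units (assumed stand-in for <hash>).
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Assumed precedence: JWT subject, then trusted proxy IP, then fallback.
function deriveKey(req: LimiterInput): string {
  if (req.scope === "auth" && req.jwtSub) {
    return `auth:user:${req.jwtSub}`;
  }
  if (req.trustedIp) {
    return `${req.scope}:ip:${req.trustedIp}`;
  }
  if (req.scope === "tools") {
    // Assumption: the anonymous tool key hashes the route path.
    return `tools:anon:${fnv1a(req.path)}`;
  }
  return `${req.scope}:path:${req.path}`;
}

// Example: an unauthenticated request with no trusted IP falls back to the path key.
const exampleKey = deriveKey({ scope: "auth", path: "/auth/login" }); // "auth:path:/auth/login"
```

Keeping the derivation in one pure function like this makes the fallback order easy to test without standing up the middleware.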
Decision
- Keep the current in-process limiter as the default path.
- Make the multi-instance caveat explicit in architecture documentation (this ADR).
- Strengthen key-derivation test coverage so fallback segmentation is mechanically verified.
- Defer distributed backing store adoption (Redis/Upstash/KV) until scale signals require it.
Migration trigger
Move to a distributed limiter when any of these are true:
- API runs with more than one replica in steady state.
- Rate-limit unfairness is materially impacting reliability (the same client is limited inconsistently).
- Security-sensitive routes require strict global enforcement per key.
- Monitoring shows sustained path-level fallback saturation where per-instance isolation is insufficient.
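The "any of these" rule above could be encoded as a small check that a review script or dashboard evaluates. This is a sketch under stated assumptions: the ScaleSignals shape and its field names are hypothetical, and all but the replica count are reduced to booleans because the ADR does not define numeric thresholds for them.

```typescript
// Hypothetical signal shape: field names are illustrative, not from the codebase.
interface ScaleSignals {
  steadyStateReplicas: number;
  inconsistentLimitingObserved: boolean;
  strictGlobalEnforcementRequired: boolean;
  sustainedPathFallbackSaturation: boolean;
}

// Returns the trigger conditions that currently hold. A non-empty result
// means the migration to a distributed limiter should be scheduled.
function migrationTriggers(signals: ScaleSignals): string[] {
  const reasons: string[] = [];
  if (signals.steadyStateReplicas > 1) {
    reasons.push("more than one replica in steady state");
  }
  if (signals.inconsistentLimitingObserved) {
    reasons.push("same client inconsistently limited");
  }
  if (signals.strictGlobalEnforcementRequired) {
    reasons.push("security-sensitive routes need global enforcement");
  }
  if (signals.sustainedPathFallbackSaturation) {
    reasons.push("sustained path-level fallback saturation");
  }
  return reasons;
}

// Example: a single-replica deployment with no other signals yields no triggers.
const currentTriggers = migrationTriggers({
  steadyStateReplicas: 1,
  inconsistentLimitingObserved: false,
  strictGlobalEnforcementRequired: false,
  sustainedPathFallbackSaturation: false,
}); // []
```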
Consequences
Benefits:
- Lowest operational complexity now.
- Clear and test-backed key derivation behavior.
- Explicitly documented tradeoff instead of implicit behavior.
Tradeoffs:
- Not globally strict across replicas today.
- Some burst tolerance remains tied to load balancer distribution.
- Under multi-instance load, the effective aggregate allowance per key can exceed the nominal single-instance limit, up to that limit multiplied by the number of replicas.
Mitigations:
- Keep low limits on sensitive routes.
- Monitor 429 rates and route-specific key classes.
- Promote to distributed store when trigger conditions are met.
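The aggregate-allowance tradeoff noted above can be illustrated with a small simulation. This is a sketch, not the middleware's implementation: InMemoryWindowLimiter is a hypothetical stand-in for an in-process store, it models a single window with no expiry, and the load balancer is assumed to be strictly round-robin.

```typescript
// Hypothetical stand-in for an in-process store: one counter per key,
// modelling a single rate-limit window (no expiry).
class InMemoryWindowLimiter {
  private counts = new Map<string, number>();
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  allow(key: string): boolean {
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}

// Sends `requests` requests for one key through `replicas` independent
// limiters, distributed round-robin, and counts how many are accepted.
function acceptedAcrossReplicas(replicas: number, limit: number, requests: number): number {
  const instances = Array.from({ length: replicas }, () => new InMemoryWindowLimiter(limit));
  let accepted = 0;
  for (let i = 0; i < requests; i++) {
    if (instances[i % replicas].allow("auth:ip:203.0.113.7")) accepted++;
  }
  return accepted;
}

const singleReplica = acceptedAcrossReplicas(1, 10, 100); // 10
const threeReplicas = acceptedAcrossReplicas(3, 10, 100); // 30
```

With a nominal limit of 10, three replicas accept 30 requests for the same key in one window, which is why low limits on sensitive routes matter while the store stays in-process.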
Implementation notes
- Middleware wiring is in the HTTP policy module.
- Key-derivation tests verify two properties: anonymous tool keys are distinct per tool route, and auth fallback path keys are distinct per endpoint.
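A distinctness property of this kind can be checked with a small helper that groups routes by derived key and reports any group with more than one member. The sketch below is illustrative only: findKeyCollisions is a hypothetical helper, the route names are made up, and the derivation functions passed in are stand-ins for the real key-derivation logic.

```typescript
// Hypothetical test helper: groups routes by derived key and returns
// every group in which two or more routes share the same key.
function findKeyCollisions(
  routes: string[],
  derive: (route: string) => string,
): string[][] {
  const byKey = new Map<string, string[]>();
  for (const route of routes) {
    const key = derive(route);
    const group = byKey.get(key) ?? [];
    group.push(route);
    byKey.set(key, group);
  }
  return Array.from(byKey.values()).filter((group) => group.length > 1);
}

// Illustrative route names (not taken from the codebase).
const authRoutes = ["/auth/login", "/auth/refresh", "/auth/logout"];

// A per-endpoint fallback yields no collisions.
const perEndpoint = findKeyCollisions(authRoutes, (route) => `auth:path:${route}`);

// A regression that drops the path collapses every endpoint into one bucket.
const collapsed = findKeyCollisions(authRoutes, () => "auth:path:unknown");
```

The same helper would apply to the anonymous tool keys by passing the tools:anon derivation and the list of tool routes.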

