Status: Accepted
Date: 2026-02-14
Deciders: Reflections Maintainers

Context

The API currently applies hono-rate-limiter middleware with an in-process memory store. This is simple and fast for current traffic, but it has a known caveat under horizontal scale:
  • Limits are enforced per-instance, not globally.
  • Different instances can each accept requests for the same logical client within the same window.
  • Observed 429 behavior can vary with load balancing distribution.
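The per-instance effect can be illustrated with a small simulation. This is a minimal sketch, not the hono-rate-limiter implementation: FixedWindowLimiter and acceptedAcrossReplicas are hypothetical names, the window algorithm is a plain fixed-window counter, and round-robin load balancing is an assumption.

```typescript
// Hypothetical stand-in for an in-process limiter: one fixed-window counter
// per key, held in this instance's memory only. Assumes limit >= 1.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, { windowStart: number; hits: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is accepted, false when it would get a 429.
  allow(key: string, nowMs: number): boolean {
    const entry = this.counts.get(key);
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.counts.set(key, { windowStart: nowMs, hits: 1 });
      return true;
    }
    if (entry.hits < this.limit) {
      entry.hits += 1;
      return true;
    }
    return false;
  }
}

// Sends `requests` requests for a single key inside one window, spread
// round-robin over `replicas` independent limiter instances, and returns
// how many were accepted in total.
function acceptedAcrossReplicas(replicas: number, limit: number, requests: number): number {
  const instances = Array.from({ length: replicas }, () => new FixedWindowLimiter(limit, 60_000));
  let accepted = 0;
  for (let i = 0; i < requests; i++) {
    const instance = instances[i % replicas];
    if (instance !== undefined && instance.allow("auth:user:alice", 0)) {
      accepted += 1;
    }
  }
  return accepted;
}
```

With a limit of 10 and 100 requests, acceptedAcrossReplicas(1, 10, 100) accepts 10 requests and acceptedAcrossReplicas(3, 10, 100) accepts 30. A less even distribution would land somewhere between those two figures, which is why observed 429 behavior depends on the load balancer.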
At the same time, key derivation is security-sensitive. The limiter needs deterministic key shapes for:
  • JWT-authenticated traffic (auth:user:<sub>)
  • Trusted proxy IP traffic (auth:ip:<ip>, public:ip:<ip>, tools:ip:<ip>)
  • Unknown client fallback (auth:path:<path>, public:path:<path>, tools:anon:<hash>)
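The key shapes above could be produced by a single pure function along these lines. This is a sketch under stated assumptions, not the project's actual code: deriveKey, LimiterInput, RouteScope, and fnv1aHex are hypothetical names; the precedence (JWT subject, then trusted proxy IP, then fallback) and the use of the route path as the hash input are assumptions; and FNV-1a is only a dependency-free placeholder for whatever hash the real implementation uses.

```typescript
type RouteScope = "auth" | "public" | "tools";

// Hypothetical, already-normalized view of a request.
interface LimiterInput {
  scope: RouteScope;
  path: string;
  jwtSub?: string;    // subject claim of a JWT that has already been verified
  trustedIp?: string; // client IP, set only when it was supplied by a trusted proxy
}

// 32-bit FNV-1a, rendered as 8 hex characters. Placeholder only: it is
// deterministic but not collision-resistant enough for security purposes.
function fnv1aHex(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Deterministic key derivation: the same input always yields the same key.
function deriveKey(req: LimiterInput): string {
  // Assumption: only the auth scope keys on the JWT subject.
  if (req.scope === "auth" && req.jwtSub) {
    return `auth:user:${req.jwtSub}`;
  }
  if (req.trustedIp) {
    return `${req.scope}:ip:${req.trustedIp}`;
  }
  // Unknown client: segment by route so one endpoint cannot exhaust another's budget.
  if (req.scope === "tools") {
    return `tools:anon:${fnv1aHex(req.path)}`;
  }
  return `${req.scope}:path:${req.path}`;
}
```

Keeping the function free of I/O and framework types makes the fallback segmentation straightforward to verify in unit tests, for example by checking that two different tool routes never share a tools:anon key.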

Decision

  1. Keep the current in-process limiter as the default path.
  2. Make the multi-instance caveat explicit in architecture documentation (this ADR).
  3. Strengthen key-derivation test coverage so fallback segmentation is mechanically verified.
  4. Defer distributed backing store adoption (Redis/Upstash/KV) until scale signals require it.

Migration trigger

Move to a distributed limiter when any of these is true:
  • API runs with more than one replica in steady state.
  • Rate-limit unfairness materially affects reliability (the same client is limited inconsistently).
  • Security-sensitive routes require strict global enforcement per key.
  • Monitoring shows sustained path-level fallback saturation where per-instance isolation is insufficient.

Consequences

Benefits:
  • Lowest operational complexity now.
  • Clear and test-backed key derivation behavior.
  • Explicitly documented tradeoff instead of implicit behavior.
Costs:
  • Not globally strict across replicas today.
  • The burst a client can get through still depends on how the load balancer distributes its requests.
Risks:
  • Under multi-instance load, the effective aggregate allowance per key can exceed the nominal single-instance limit, by up to a factor of the replica count.
Mitigations:
  • Keep low limits on sensitive routes.
  • Monitor 429 rates and route-specific key classes.
  • Promote to distributed store when trigger conditions are met.
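Monitoring by key class could look roughly like the following. This is an illustrative sketch only: keyClass and RejectionCounter are hypothetical names, a key class is assumed to be the first two colon-separated segments of a key (for example auth:path), and a real deployment would export these counts to its metrics system instead of holding them in memory.

```typescript
// Reduces a limiter key to its class, e.g. "auth:path:/login" -> "auth:path".
// Only the first two segments are kept, so IPv6 addresses (which contain
// colons) do not leak into the class name.
function keyClass(key: string): string {
  return key.split(":").slice(0, 2).join(":");
}

// Counts total and rejected (429) responses per key class.
class RejectionCounter {
  private readonly total = new Map<string, number>();
  private readonly rejected = new Map<string, number>();

  record(key: string, status: number): void {
    const cls = keyClass(key);
    this.total.set(cls, (this.total.get(cls) ?? 0) + 1);
    if (status === 429) {
      this.rejected.set(cls, (this.rejected.get(cls) ?? 0) + 1);
    }
  }

  // Fraction of responses for this class that were 429s; 0 when nothing was recorded.
  rate(cls: string): number {
    const total = this.total.get(cls) ?? 0;
    return total === 0 ? 0 : (this.rejected.get(cls) ?? 0) / total;
  }
}
```

A sustained high rate for a path-level fallback class such as auth:path is the kind of signal the migration trigger refers to.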

Implementation notes

  • Middleware wiring is in the HTTP policy module.
  • The key-derivation logic produces, and its tests verify, distinct anonymous tool keys per tool route and distinct auth fallback path keys per endpoint.