# ADR-0023: API rate-limit store and keying strategy

> Keep the current in-process limiter for now, make key derivation explicit and tested, and define a clear trigger for distributed migration.

<Info>**Status:** Accepted **Date:** 2026-02-14 **Deciders:** Reflections Maintainers</Info>

## Context

The API currently applies `hono-rate-limiter` middleware with an in-process memory store. This is simple and fast for current traffic, but it has a known caveat under horizontal scale:

* Limits are enforced per-instance, not globally.
* Different instances can each accept requests for the same logical client within the same window.
* Observed `429` behavior can vary with load balancing distribution.
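
The per-instance effect can be illustrated with a small simulation. This is a minimal sketch, not the `hono-rate-limiter` store: the `InProcessLimiter` class, the `acceptedAcrossReplicas` helper, the round-robin distribution, and the limit and window values are all assumptions chosen for illustration.

```typescript
// Hypothetical fixed-window counter standing in for an in-process memory store.
class InProcessLimiter {
  private readonly counts = new Map<string, { count: number; resetAt: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request is accepted, false when it would receive a 429.
  allow(key: string, now: number): boolean {
    const entry = this.counts.get(key);
    if (!entry || now >= entry.resetAt) {
      // First request in a fresh window starts a new counter.
      this.counts.set(key, { count: 1, resetAt: now + this.windowMs });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Sends `requests` calls for one key, round-robin, to `replicas` independent limiters.
// All requests share the same timestamp, so they fall in a single window.
function acceptedAcrossReplicas(replicas: number, limit: number, requests: number): number {
  const instances = Array.from({ length: replicas }, () => new InProcessLimiter(limit, 60_000));
  let accepted = 0;
  for (let i = 0; i < requests; i++) {
    if (instances[i % replicas].allow("auth:user:123", 0)) accepted += 1;
  }
  return accepted;
}

console.log(acceptedAcrossReplicas(1, 10, 50)); // 10
console.log(acceptedAcrossReplicas(3, 10, 50)); // 30
```

With one replica the nominal limit holds. With three replicas and an even distribution, the same client is accepted up to three times as often within one window, because no instance sees the others' counters.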

At the same time, key derivation is security-sensitive. The limiter needs deterministic key shapes for:

* JWT-authenticated traffic (`auth:user:<sub>`)
* Trusted proxy IP traffic (`auth:ip:<ip>`, `public:ip:<ip>`, `tools:ip:<ip>`)
* Unknown client fallback (`auth:path:<path>`, `public:path:<path>`, `tools:anon:<hash>`)
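
The key shapes above could be produced by a single pure function, which keeps derivation easy to test. The following is a sketch under stated assumptions: the `deriveRateLimitKey` name, the `KeyInput` fields, the precedence order (JWT subject, then trusted IP, then fallback), and the choice to hash the request path for `tools:anon:<hash>` are hypothetical and not taken from the actual implementation. The FNV-1a hash is a dependency-free stand-in, not a recommendation.

```typescript
type RouteClass = "auth" | "public" | "tools";

// Hypothetical input shape; field names are assumptions for this sketch.
interface KeyInput {
  routeClass: RouteClass;
  path: string;
  jwtSub?: string; // subject claim from an already verified JWT
  trustedIp?: string; // client IP, set only when reported by a trusted proxy
}

// FNV-1a (32-bit) as a stand-in hash so the sketch needs no imports.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Assumed precedence: authenticated user, then trusted IP, then per-class fallback.
function deriveRateLimitKey(input: KeyInput): string {
  const { routeClass, path, jwtSub, trustedIp } = input;
  if (routeClass === "auth" && jwtSub) return `auth:user:${jwtSub}`;
  if (trustedIp) return `${routeClass}:ip:${trustedIp}`;
  if (routeClass === "tools") return `tools:anon:${fnv1a(path)}`; // assumption: hash of the path
  return `${routeClass}:path:${path}`;
}

console.log(deriveRateLimitKey({ routeClass: "auth", path: "/me", jwtSub: "u1" })); // auth:user:u1
console.log(deriveRateLimitKey({ routeClass: "public", path: "/feed" })); // public:path:/feed
```

Because the function takes plain data rather than a request context, the same inputs always yield the same key, and each fallback branch can be asserted directly.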

## Decision

1. Keep the current in-process limiter as the default path.
2. Make the multi-instance caveat explicit in architecture documentation (this ADR).
3. Strengthen key-derivation test coverage so fallback segmentation is mechanically verified.
4. Defer distributed backing store adoption (Redis/Upstash/KV) until scale signals require it.

## Migration trigger

Move to a distributed limiter when any of these are true:

* The API runs with more than one replica in steady state.
* Uneven rate limiting materially affects reliability (the same client is limited inconsistently across requests).
* Security-sensitive routes require strict global enforcement per key.
* Monitoring shows sustained path-level fallback saturation where per-instance isolation is insufficient.

## Consequences

**Benefits:**

* Lowest operational complexity now.
* Clear and test-backed key derivation behavior.
* Explicitly documented tradeoff instead of implicit behavior.

**Costs:**

* Not globally strict across replicas today.
* Some burst tolerance remains tied to load balancer distribution.

**Risks:**

* Under multi-instance load, effective aggregate allowance per key can exceed nominal single-instance limits.

**Mitigations:**

* Keep low limits on sensitive routes.
* Monitor `429` rates per route and per key class.
* Promote to distributed store when trigger conditions are met.
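
One way to watch key classes is to reduce each rejected key to its first two segments and count rejections per class. This is a sketch only: `keyClass` and `count429ByKeyClass` are hypothetical helpers, and how rejected keys are collected (logs, metrics, a limiter hook) is left open here.

```typescript
// Key class = first two colon-separated segments, e.g. "auth:user:42" -> "auth:user".
// Later segments (user id, IP, path, hash) are dropped so cardinality stays low.
function keyClass(key: string): string {
  const [scope, kind] = key.split(":");
  return kind === undefined ? scope : `${scope}:${kind}`;
}

// Counts rejected (429) requests per key class.
function count429ByKeyClass(rejectedKeys: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const key of rejectedKeys) {
    const cls = keyClass(key);
    counts.set(cls, (counts.get(cls) ?? 0) + 1);
  }
  return counts;
}

const counts = count429ByKeyClass([
  "auth:path:/auth/login",
  "auth:path:/auth/login",
  "tools:ip:203.0.113.7",
]);
console.log(counts.get("auth:path")); // 2
console.log(counts.get("tools:ip")); // 1
```

A sustained rise in a `*:path` class would correspond to the path-level fallback saturation named in the migration trigger.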

## Implementation notes

* Middleware wiring is in the HTTP policy module.
* Key derivation logic produces, and its tests verify, distinct anonymous tool keys per tool route and distinct auth fallback path keys per endpoint.
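
The distinctness property can be checked mechanically with a small collision finder. The sketch below is illustrative rather than the project's test suite: `findKeyCollisions`, `authFallbackKey`, and the endpoint paths are hypothetical.

```typescript
// Groups inputs by derived key and returns only the groups that share a key.
// An empty result means every input produced a distinct key.
function findKeyCollisions(derive: (path: string) => string, paths: string[]): string[][] {
  const byKey = new Map<string, string[]>();
  for (const path of paths) {
    const key = derive(path);
    const group = byKey.get(key) ?? [];
    group.push(path);
    byKey.set(key, group);
  }
  return Array.from(byKey.values()).filter((group) => group.length > 1);
}

// Hypothetical fallback derivation for auth routes with no user or trusted IP.
const authFallbackKey = (path: string): string => `auth:path:${path}`;

const collisions = findKeyCollisions(authFallbackKey, [
  "/auth/login",
  "/auth/refresh",
  "/auth/logout",
]);
console.log(collisions.length); // 0
```

A derivation that collapsed every endpoint into one shared key would show up as a single collision group, which is the regression such a test is meant to catch.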

## Related ADRs

* [ADR-0011: Observability and tracing strategy](/decisions/adr-0011)
* [ADR-0012: CI/CD quality and release gates](/decisions/adr-0012)
* [ADR-0017: Public share-link and anonymous session boundary](/decisions/adr-0017)