# Two-plane architecture

> How Reflections separates low-latency voice inference from asynchronous background learning for safety and reliability.

The Reflections platform splits execution into two planes, **realtime** and **background**, because low-latency voice responses and high-throughput knowledge ingestion have fundamentally different reliability and safety requirements.

## Why two planes

Combining read and write paths in a single execution flow creates two problems:

1. **Latency contamination** — background workload contention (LLM extraction, embedding generation, database writes) slows down user-facing voice responses.
2. **Safety risk** — if the realtime path can write to truth tables, unvetted facts could enter the knowledge graph without evaluation.

Separating the planes gives each side clear failure boundaries: a crash in the ingestion pipeline does not affect a live conversation, and the realtime plane cannot accidentally mutate active truth.

## Realtime plane

**Components:** `apps/api` + `packages/brain-core`

The realtime plane handles everything that happens during a live voice conversation:

* **Session bootstrap** — the API creates a signed session URL for the managed voice provider (ElevenLabs Conversations API).
* **Server-tool callbacks** — during a conversation, the voice provider calls back to the API to retrieve knowledge-graph evidence (`POST /v1/tools/retrieve-context`).
* **Webhook receivers** — conversation lifecycle events (e.g., conversation ended) are received and dispatched to the background plane.

<Warning>
  The realtime plane is **read-only against truth tables** (facts, entities, chunks, sources). It
  may read these tables for retrieval but must never write to them directly. This invariant is
  enforced by ESLint import rules, architecture guard scripts, and file-scan tests.
</Warning>

The realtime plane can write to operational tables — conversations, messages, and retrieval traces — but these are session data, not truth.
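
One way the ESLint part of this invariant could be expressed is a `no-restricted-imports` rule scoped to the realtime code. The fragment below is a sketch under assumptions, not the project's actual configuration: the file globs and the message text are hypothetical, and it only blocks the `@reflection/db/queries/admin` module named under "Operational isolation".

```typescript
// Hypothetical ESLint flat-config entry; the globs and message are assumptions.
// In a real setup this object would be one element of the array exported
// from eslint.config.js.
const realtimeReadOnlyRule = {
  files: ["apps/api/**/*.ts", "packages/brain-core/**/*.ts"],
  rules: {
    "no-restricted-imports": [
      "error",
      {
        patterns: [
          {
            // Block the write-capable query module and anything beneath it.
            group: ["@reflection/db/queries/admin", "@reflection/db/queries/admin/*"],
            message: "The realtime plane must not import write-capable queries.",
          },
        ],
      },
    ],
  },
};
```

A lint rule like this only catches static imports, which is presumably why it is paired with guard scripts and file-scan tests.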

<Note>
  There is no in-process agent service. Voice orchestration is fully delegated to ElevenLabs. The
  API serves as a thin coordination layer between the voice provider and the knowledge graph.
</Note>

## Background plane

**Components:** `apps/workers` (Inngest pipeline)

The background plane handles everything that happens after a source is uploaded or a conversation ends:

1. **Ingestion** — new sources (documents, transcripts) are chunked and stored.
2. **Extraction** — an LLM extracts candidate facts, entities, and predicates from chunks.
3. **Evaluation** — candidate facts are scored and evaluated against existing knowledge.
4. **Patch application** — approved facts transition from `candidate` to `active` state, becoming visible to the realtime retrieval path.

Worker orchestration runs on Inngest, which provides durable step functions with built-in retry and idempotency.
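
The four stages above map naturally onto durable steps. The sketch below is not the actual worker code: `StepTools` is a hypothetical interface modeled on the general shape of Inngest's `step.run(id, fn)`, the stage functions are placeholders, and the in-memory store is a simplified stand-in for durable state. It illustrates why memoized steps make retries safe: a step that already completed is replayed from its stored result instead of running again.

```typescript
// Minimal stand-in for a durable step runner (assumption: shaped like step.run(id, fn)).
interface StepTools {
  run<T>(id: string, fn: () => Promise<T>): Promise<T>;
}

type StepStore = Record<string, unknown>;

// Completed step results are kept in `store` and replayed on later attempts.
function createMemoizedSteps(store: StepStore): StepTools {
  return {
    async run<T>(id: string, fn: () => Promise<T>): Promise<T> {
      if (Object.prototype.hasOwnProperty.call(store, id)) return store[id] as T;
      const result = await fn();
      store[id] = result;
      return result;
    },
  };
}

// Hypothetical stage signatures; real stages would call an LLM and the database.
type Stages = {
  ingest: (sourceId: string) => Promise<string[]>; // returns chunk ids
  extract: (chunkIds: string[]) => Promise<string[]>; // returns candidate fact ids
  evaluate: (factIds: string[]) => Promise<string[]>; // returns approved fact ids
  applyPatch: (factIds: string[]) => Promise<number>; // returns number activated
};

async function processSource(step: StepTools, stages: Stages, sourceId: string): Promise<number> {
  const chunks = await step.run("ingest", () => stages.ingest(sourceId));
  const candidates = await step.run("extract", () => stages.extract(chunks));
  const approved = await step.run("evaluate", () => stages.evaluate(candidates));
  return step.run("apply-patch", () => stages.applyPatch(approved));
}
```

If `applyPatch` fails and the function is retried with the same store, the first three steps return their stored results and only the failed step runs again.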

## The learning gate

The learning gate is the core safety mechanism connecting the two planes. It ensures that no extracted information becomes active truth without passing through evaluation.

<Steps>
  <Step title="Candidate creation">
    The extraction step inserts new facts with `candidate` status, linked to a `patch_batch`.
  </Step>

  <Step title="Evaluation">
    Each candidate fact is evaluated — scored for confidence, checked against existing facts for
    contradictions, and assessed for quality.
  </Step>

  <Step title="Approval">Patch batches that pass evaluation are approved for application.</Step>

  <Step title="Application">
    Approved facts transition to `active` status with temporal validity fields (`valid_from`,
    `valid_to`). They become visible to the realtime retrieval path.
  </Step>
</Steps>

Facts that fail evaluation are marked as `rejected` and never enter active truth.
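
The gate can be read as a small state machine over fact status. The sketch below is illustrative only. The `candidate`, `active`, and `rejected` statuses, the `patch_batch` link, and the `valid_from` / `valid_to` fields come from the steps above; the `Fact` shape, the `Evaluation` fields, the threshold value, and the use of `null` for an open-ended `valid_to` are assumptions. It also decides per fact, whereas the real pipeline approves whole patch batches.

```typescript
type FactStatus = "candidate" | "active" | "rejected";

interface Fact {
  id: string;
  status: FactStatus;
  patch_batch: string;
  valid_from: Date | null;
  valid_to: Date | null; // assumption: null means "no end of validity yet"
}

// Hypothetical evaluation output; the real scoring criteria are not specified here.
interface Evaluation {
  confidence: number; // assumed range 0..1
  contradictsActiveFact: boolean;
}

const confidenceThreshold = 0.8; // assumed value, for illustration only

function applyGate(fact: Fact, evaluation: Evaluation, now: Date): Fact {
  // Only candidates pass through the gate; active and rejected facts are returned unchanged.
  if (fact.status !== "candidate") return fact;
  const approved =
    evaluation.confidence >= confidenceThreshold && !evaluation.contradictsActiveFact;
  return approved
    ? { ...fact, status: "active", valid_from: now, valid_to: null }
    : { ...fact, status: "rejected" };
}
```

Because the function ignores anything that is not a `candidate`, a rejected fact cannot be promoted by running the gate a second time.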

## How the planes communicate

The planes communicate through the database and event dispatch — never through direct function calls or shared in-process state.

| Direction              | Mechanism                | Example                                                  |
| ---------------------- | ------------------------ | -------------------------------------------------------- |
| Realtime to background | Event dispatch (Inngest) | Conversation-ended webhook triggers transcript ingestion |
| Background to realtime | Database state           | Newly active facts appear in retrieval queries           |
| Background internal    | Inngest step functions   | Extraction step feeds evaluation step                    |
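
As an illustration of the realtime-to-background row, a webhook receiver might do nothing more than translate the provider's payload into an event and hand it off. Everything named below is hypothetical: the event name `conversation/ended`, the payload fields, and the `EventSender` interface, which only mirrors the general `send({ name, data })` shape of an Inngest client.

```typescript
// Hypothetical event contract between the planes.
interface ConversationEndedEvent {
  name: "conversation/ended";
  data: { conversationId: string; endedAt: string };
}

// Abstracts the event bus so the handler never calls worker code directly.
interface EventSender {
  send(event: ConversationEndedEvent): Promise<void>;
}

// Assumed webhook payload shape; the provider's real schema may differ.
interface ConversationEndedWebhook {
  conversation_id: string;
  ended_at: string;
}

async function handleConversationEnded(
  payload: ConversationEndedWebhook,
  events: EventSender
): Promise<{ status: number }> {
  // No truth-table writes here: the handler only dispatches and returns.
  await events.send({
    name: "conversation/ended",
    data: { conversationId: payload.conversation_id, endedAt: payload.ended_at },
  });
  return { status: 202 };
}
```

Passing the sender in as a parameter keeps the handler free of any import from worker code, which matches the rule that the planes never share in-process state.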

## Operational isolation

Each plane has independent:

* **Failure domains:** a worker crash does not affect live conversations.
* **Scaling characteristics:** the API scales with concurrent sessions, while workers scale with ingestion throughput.
* **Query access:** the realtime plane uses `@reflection/db/queries/read`, while workers use `@reflection/db/queries/admin`.
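
The query split in the last bullet could be modeled as two interfaces, so that realtime code is only ever handed a read surface. This is a sketch under assumptions: the method names, the `FactRow` shape, and the in-memory store are hypothetical and do not describe the real `@reflection/db` modules.

```typescript
interface FactRow {
  id: string;
  text: string;
  status: "candidate" | "active" | "rejected";
}

// Read surface: what the realtime plane would receive.
interface ReadQueries {
  findActiveFacts(): FactRow[];
}

// Admin surface: adds mutations, intended for workers only.
interface AdminQueries extends ReadQueries {
  insertCandidate(id: string, text: string): void;
  setStatus(id: string, status: FactRow["status"]): void;
}

// In-memory stand-in for the database, for illustration only.
function createInMemoryDb(): AdminQueries {
  const rows: FactRow[] = [];
  return {
    findActiveFacts: () => rows.filter((r) => r.status === "active"),
    insertCandidate: (id, text) => {
      rows.push({ id, text, status: "candidate" });
    },
    setStatus: (id, status) => {
      for (const row of rows) if (row.id === id) row.status = status;
    },
  };
}

// Realtime code accepts only ReadQueries, so write methods are not in scope at compile time.
function retrieveContext(db: ReadQueries): string[] {
  return db.findActiveFacts().map((f) => f.text);
}
```

With this shape, a fact inserted by a worker stays invisible to `retrieveContext` until a worker sets its status to `active`, which is the background-to-realtime path through database state.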

## Further reading

* [Boundary matrix](/architecture/boundaries) — full list of allowed and forbidden operations per plane.
* [System invariants](/architecture/invariants) — the non-negotiable rules that govern both planes.
* [Knowledge graph](/architecture/knowledge-graph) — how temporal facts and the patch lifecycle work.
* [ADR-0003](/decisions/adr-0003) — the original decision record for the two-plane architecture.
