ADR-0007: Retrieval pipeline design

Status: Accepted Date: 2026-02-06 Deciders: Reflections Maintainers

Context

Voice turns need grounded responses that combine chunk-level semantic retrieval and structured fact context, with traceability for debugging quality issues.

Decision

Use a staged retrieval pipeline:

Embed user utterance (embedOne) and perform vector search over chunks.
Resolve entity candidates heuristically from user text and prior turn context.
Fetch active graph facts via neighborhood hops from resolved entities.
Build a system append with explicit untrusted-evidence warning.
Persist retrieval traces with fail-soft behavior (warn on trace insert failure, do not fail turn).

Alternatives considered

Alternative 1: Pure vector RAG without entity/fact hops

Pros:

Simpler and fast.

Cons:

Weaker structured memory grounding.
Harder to answer relation-heavy questions consistently.

Alternative 2: Pure graph retrieval without semantic chunk search

Pros:

Strong structured consistency.

Cons:

Misses evidence present only in source text chunks.
Lower recall for long-tail narrative information.

Alternative 3: LLM-only direct answer with no retrieval

Pros:

Lowest engineering complexity.

Cons:

Hallucination risk and poor factual grounding.
No evidence traceability.

Consequences

Benefits:

Balanced recall from both semantic chunks and graph facts.
Better explainability through trace payload persistence.
Safer prompt construction via explicit evidence warning banner.

Costs:

Multi-step retrieval introduces operational complexity.
Heuristic entity extraction can miss variants or noisy names.

Implementation notes

Pipeline orchestration lives in packages/brain-core/src/retrieve.ts.
Vector query behavior uses HNSW indexing for nearest-neighbor search.
Active fact lookup and hop expansion are implemented in the DB queries layer.
Trace persistence fallback behavior is intentionally fail-soft: retrieval continues even if trace writes fail.

ADR-0008: Auth and RBACSupport both service-to-service and end-user auth while enforcing server-derived reflection roles.

​Context

​Decision

​Alternatives considered

​Alternative 1: Pure vector RAG without entity/fact hops

​Alternative 2: Pure graph retrieval without semantic chunk search

​Alternative 3: LLM-only direct answer with no retrieval

​Consequences

​Implementation notes

​Related ADRs

Context

Decision

Alternatives considered

Alternative 1: Pure vector RAG without entity/fact hops

Alternative 2: Pure graph retrieval without semantic chunk search

Alternative 3: LLM-only direct answer with no retrieval

Consequences

Implementation notes

Related ADRs