> ## Documentation Index
> Fetch the complete documentation index at: https://docs.reflections.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# ADR-0007: Retrieval pipeline design

> Build evidence retrieval that combines semantic recall with graph context while remaining low-latency.

<Info>**Status:** Accepted **Date:** 2026-02-06 **Deciders:** Reflections Maintainers</Info>

## Context

Voice turns need grounded responses that combine chunk-level semantic retrieval and structured fact context, with traceability for debugging quality issues.

## Decision

Use a staged retrieval pipeline:

* Embed user utterance (`embedOne`) and perform vector search over chunks.
* Resolve entity candidates heuristically from user text and prior turn context.
* Fetch active graph facts via neighborhood hops from resolved entities.
* Build a system append with explicit untrusted-evidence warning.
* Persist retrieval traces with fail-soft behavior (warn on trace insert failure, do not fail turn).

## Alternatives considered

### Alternative 1: Pure vector RAG without entity/fact hops

Pros:

* Simpler and fast.

Cons:

* Weaker structured memory grounding.
* Harder to answer relation-heavy questions consistently.

### Alternative 2: Pure graph retrieval without semantic chunk search

Pros:

* Strong structured consistency.

Cons:

* Misses evidence present only in source text chunks.
* Lower recall for long-tail narrative information.

### Alternative 3: LLM-only direct answer with no retrieval

Pros:

* Lowest engineering complexity.

Cons:

* Hallucination risk and poor factual grounding.
* No evidence traceability.

## Consequences

**Benefits:**

* Balanced recall from both semantic chunks and graph facts.
* Better explainability through trace payload persistence.
* Safer prompt construction via explicit evidence warning banner.

**Costs:**

* Multi-step retrieval introduces operational complexity.
* Heuristic entity extraction can miss variants or noisy names.

## Implementation notes

* Pipeline orchestration lives in `packages/brain-core/src/retrieve.ts`.
* Vector query behavior uses HNSW indexing for nearest-neighbor search.
* Active fact lookup and hop expansion are implemented in the DB queries layer.
* Trace persistence fallback behavior is intentionally fail-soft: retrieval continues even if trace writes fail.

## Related ADRs

* [ADR-0003: Two-plane system architecture](/decisions/adr-0003)
* [ADR-0005: Temporal fact and patch lifecycle](/decisions/adr-0005)
* [ADR-0011: Observability and tracing strategy](/decisions/adr-0011)
