ADR-0031: Reflection voice prompt-pack runtime and public freshness

Status: Accepted
Date: 2026-03-10
Deciders: Reflections Maintainers

Context

The reflection voice runtime originally inherited much of its behavior from a runtime-assembled prompt wrapper layered around the ElevenLabs provider integration documented in ADR-0019 and the session routing model documented in ADR-0027. That earlier design created several problems:

Prompt authority was split between provider-side agent configuration, stored reflection prompt content, and large dynamic-variable payloads assembled at session start.
Public/share/mobile-public sessions were difficult to reason about because audience, retrieval scope, and prompt surface were only partially separated.
Reflection voice identity was flattened into runtime scaffolding instead of being compiled into a durable artifact.
Public runtime drift could persist after reflection changes or publish-scope changes because session bootstrap trusted stored provider agent ids more than synced prompt state.
Artifact generation existed, but promotion and traceability were not strong enough to justify the repo invariant that learning is gated.

This branch introduces a new persisted voice runtime model and closes the trust-boundary gaps that appeared during the rollout:

prompt packs become the canonical runtime prompt source
public and private audiences become explicit runtime concepts
public sessions preserve real published namespace scope for retrieval
generated voice artifacts follow a candidate → evaluate → promote lifecycle
public provider agents are selected only when their sync state is fresh against the currently promoted public artifact and published namespace scope

Evidence in code/config:

apps/workers/src/functions/ingest-source/phases/reflection-voice-core.ts
apps/workers/src/functions/ingest-source/phases/reflection-voice-dossier.ts
apps/workers/src/functions/ingest-source/phases/reflection-voice-prompt-pack.ts
apps/workers/src/functions/ingest-source/phases/reflection-voice-approval.ts
apps/workers/src/functions/ingest-source/phases/capsule.ts
apps/api/src/lib/reflection-prompt-pack.ts
apps/api/src/lib/elevenlabs-agent-template.ts
apps/api/src/lib/session-orchestrator-bootstrap.ts
apps/api/src/lib/session-orchestrator-core.ts
apps/api/src/lib/session-orchestrator-dynamic-extras.ts
apps/api/src/lib/agent-id.ts
apps/api/src/lib/reflection-runtime-sync-state.ts
apps/api/src/lib/voice-runtime-sync.ts
packages/db/src/queries/conversations.ts
packages/db/src/queries/reflections.ts
packages/schemas/src/voice-prompt.ts
supabase/migrations/20260309210000_reflection_voice_prompt_artifacts.sql
supabase/migrations/20260309224500_reflections_public_agent_id.sql
supabase/migrations/20260309233000_reflection_runtime_sync_state.sql

Decision

1. Reflection voice runtime uses persisted prompt packs as the canonical prompt source

Reflection sessions no longer depend on a runtime-composed persona wrapper as the primary prompt authority. Instead, capsule regeneration persists a voice artifact set composed of:

voice_evidence_pack
reflection_voice_dossier
prompt_packs

At session/runtime sync time:

getReflectionPromptPack() selects the effective prompt pack for private_standard, private_memory, public_standard, private_emerging, or public_emerging
buildElevenLabsAgentTemplate() renders the provider payload from the selected persisted prompt pack
session bootstrap injects only a reduced runtime brief instead of the older large dynamic prompt surface

The prompt pack, not the legacy wrapper, is the steady-state source of reflection voice behavior.
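
The selection step can be pictured with a minimal sketch. Only the five prompt-pack kinds come from this ADR. The type names, the `emerging` flag, and the mapping from audience and surface to a kind are assumptions made for illustration, not the actual logic of getReflectionPromptPack().

```typescript
// Sketch only: type and function names are illustrative assumptions,
// not the contents of apps/api/src/lib/reflection-prompt-pack.ts.
type PromptPackKind =
  | "private_standard"
  | "private_memory"
  | "public_standard"
  | "private_emerging"
  | "public_emerging";

type SessionAudience = "private" | "public";

interface PromptPack {
  kind: PromptPackKind;
  contentHash: string;
  dossierHash: string;
  prompt: string;
}

interface PackSelectionInput {
  audience: SessionAudience;
  surface: "standard" | "memory";
  // Assumption: "emerging" marks a reflection that does not yet qualify
  // for a full pack. The ADR names the kinds but not this rule.
  emerging: boolean;
}

function selectPromptPackKind(input: PackSelectionInput): PromptPackKind {
  if (input.audience === "public") {
    return input.emerging ? "public_emerging" : "public_standard";
  }
  if (input.emerging) return "private_emerging";
  return input.surface === "memory" ? "private_memory" : "private_standard";
}

// Returns undefined rather than substituting another kind, so a public
// request is never served a private pack by accident.
function selectPromptPack(
  packs: readonly PromptPack[],
  input: PackSelectionInput,
): PromptPack | undefined {
  const kind = selectPromptPackKind(input);
  return packs.find((pack) => pack.kind === kind);
}
```

In this sketch a missing pack surfaces as `undefined`, leaving the caller to decide how to fail, which matches the fail-closed posture described later in this ADR.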

2. Public and private are separate runtime audiences

Reflection sessions now carry an explicit sessionAudience: 'private' | 'public' contract. Audience and retrieval scope are separate concepts:

sessionAudience controls public/private runtime behavior
namespaces in conversation metadata remain the exact retrieval allowlist

For public/share/mobile-public sessions:

public prompt packs are derived from the public dossier path
runtime extras are generic and public-safe
recap, user memory, and private capsule signals are not reused
stored namespaces preserve the real published allowlist rather than collapsing to a marker-only ['public'] abstraction

For private sessions:

runtime extras may still use richer recap, memory, and capsule-derived context
private prompt packs remain eligible for reflection and memory surfaces

Legacy audience inference from older conversation metadata remains supported so pre-existing public/member sessions refresh correctly.
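
The separation between audience and retrieval scope can be sketched as follows. The field names mirror the terms used above, but the metadata shape, the helper names, and the legacy inference rule (treating a marker-only ['public'] list as a public session) are assumptions for illustration.

```typescript
type SessionAudience = "private" | "public";

// Hypothetical metadata shape; only sessionAudience and namespaces are
// named in this ADR.
interface ConversationMetadata {
  sessionAudience?: SessionAudience; // absent on pre-existing rows
  namespaces: string[]; // exact retrieval allowlist
}

// Assumption: legacy public sessions stored the marker-only ["public"] list.
function resolveSessionAudience(meta: ConversationMetadata): SessionAudience {
  if (meta.sessionAudience) return meta.sessionAudience;
  const markerOnly =
    meta.namespaces.length === 1 && meta.namespaces[0] === "public";
  return markerOnly ? "public" : "private";
}

// Retrieval scope is decided by the allowlist alone, never by audience.
function canRetrieveFrom(meta: ConversationMetadata, namespace: string): boolean {
  return meta.namespaces.includes(namespace);
}

interface PrivateContext {
  recap?: string;
  userMemory?: string;
  capsuleSignals?: string[];
}

interface RuntimeExtras extends PrivateContext {
  brief: string;
}

// Public sessions get generic extras; private context is dropped entirely
// rather than redacted.
function buildRuntimeExtras(
  audience: SessionAudience,
  privateContext: PrivateContext,
): RuntimeExtras {
  if (audience === "public") return { brief: "public session" };
  return { brief: "private session", ...privateContext };
}
```

Keeping `canRetrieveFrom` independent of audience reflects the point that a public session still retrieves from its real published namespaces, not from a single marker value.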

3. Public runtime is freshness-aware, not merely presence-aware

The existence of elevenlabs_public_agent_id alone is not sufficient to serve public traffic. Each reflection now persists runtime sync state per surface:

standard
memory
public

For each surface, sync state records at minimum:

agent id
prompt-pack kind
prompt-pack content hash
prompt-pack dossier hash
sync timestamp

For the public surface, sync state also records:

published namespace scope hash

Public session bootstrap and public agent resolution now require that the selected public agent be fresh against:

the currently promoted public prompt pack
the currently promoted public dossier hash
the current published namespace scope

If the stored public agent is stale:

the system attempts a lazy syncReflectionPublicRuntime()
if refresh cannot bring the public surface current, bootstrap fails closed with a structured stale/not-ready error instead of silently routing to a private or stale runtime
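
A minimal sketch of the freshness check and the fail-closed path follows. The recorded fields are the ones listed above. The type names, the error class, its `code` value, and the injected `loadState` and `syncPublicRuntime` callbacks (the latter standing in for syncReflectionPublicRuntime()) are assumptions, not the real signatures.

```typescript
interface SurfaceSyncState {
  agentId: string;
  promptPackKind: string;
  promptPackContentHash: string;
  promptPackDossierHash: string;
  syncedAt: string;
  publishedScopeHash?: string; // recorded for the public surface only
}

interface PromotedPublicState {
  promptPackContentHash: string;
  dossierHash: string;
  publishedScopeHash: string;
}

// Presence of an agent id is not enough: all three hashes must match.
function isPublicSurfaceFresh(
  state: SurfaceSyncState | null,
  promoted: PromotedPublicState,
): boolean {
  if (state === null) return false;
  return (
    state.promptPackContentHash === promoted.promptPackContentHash &&
    state.promptPackDossierHash === promoted.dossierHash &&
    state.publishedScopeHash === promoted.publishedScopeHash
  );
}

// Hypothetical structured error; the real error shape is not given here.
class PublicRuntimeNotReadyError extends Error {
  readonly code = "public_runtime_stale";
  constructor(reflectionId: string) {
    super(`public runtime is stale for reflection ${reflectionId}`);
    this.name = "PublicRuntimeNotReadyError";
  }
}

async function resolvePublicAgentId(
  reflectionId: string,
  promoted: PromotedPublicState,
  loadState: () => Promise<SurfaceSyncState | null>,
  syncPublicRuntime: () => Promise<void>,
): Promise<string> {
  let state = await loadState();
  if (!isPublicSurfaceFresh(state, promoted)) {
    try {
      await syncPublicRuntime(); // lazy refresh attempt
    } catch {
      // A failed sync is handled by the re-check below.
    }
    state = await loadState();
  }
  if (state === null || !isPublicSurfaceFresh(state, promoted)) {
    // Fail closed: never route to a private or stale agent.
    throw new PublicRuntimeNotReadyError(reflectionId);
  }
  return state.agentId;
}
```

Note that the function has no branch that returns a private agent id, so a stale public surface can only end in a refresh or an error.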

4. Voice artifacts use a candidate → evaluate → promote lifecycle

Generated voice artifacts are not written directly into the live runtime cache. Capsule regeneration now:

builds one candidate artifact payload
inserts that candidate into the durable artifact ledger
evaluates the exact inserted payload
promotes only approved generations into the live capsule cache

Promotion and live-cache updates are monotonic:

stale generations cannot overwrite newer voice_* cache state
stale generations also cannot overwrite the plain capsule, token_estimate, persona_narrative, or other runtime-facing fields that still shape private runtime context

Rejected generations persist rejection reasons for inspection.
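
The lifecycle and its monotonic guard can be sketched with an in-memory store. The class, the numeric `generation` counter, and the `promoted` flag are assumptions for illustration. The real ledger and live cache are database-backed and their schema is not reproduced here.

```typescript
type LedgerStatus = "candidate" | "approved" | "rejected";

interface VoiceArtifactCandidate {
  generation: number; // assumption: a monotonically increasing counter
  payload: string;
}

interface LedgerEntry extends VoiceArtifactCandidate {
  status: LedgerStatus;
  rejectionReasons: string[];
  promoted: boolean;
}

class VoiceArtifactStore {
  readonly ledger: LedgerEntry[] = [];
  live: { generation: number; payload: string } | null = null;

  submit(
    candidate: VoiceArtifactCandidate,
    evaluate: (payload: string) => string[],
  ): LedgerEntry {
    // 1. Insert the candidate into the durable ledger first.
    const entry: LedgerEntry = {
      ...candidate,
      status: "candidate",
      rejectionReasons: [],
      promoted: false,
    };
    this.ledger.push(entry);

    // 2. Evaluate the exact payload that was inserted.
    const reasons = evaluate(entry.payload);
    if (reasons.length > 0) {
      entry.status = "rejected";
      entry.rejectionReasons = reasons; // kept for inspection
      return entry;
    }
    entry.status = "approved";

    // 3. Promote only if this generation is newer than the live one.
    if (this.live === null || entry.generation > this.live.generation) {
      this.live = { generation: entry.generation, payload: entry.payload };
      entry.promoted = true;
    }
    return entry;
  }
}
```

An approved but stale generation stays in the ledger with `promoted: false`, so the history remains traceable while the live cache only moves forward.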

5. Approval is deterministic and trust-oriented

The promotion gate is intentionally deterministic rather than model-judged. Approval checks include:

schema validity for dossiers, prompt packs, and sync metadata
required prompt-pack coverage
runtime placeholder/tool contract completeness
public/private dossier-hash alignment
public-only provenance checks for public dossier content
required end-call/wrap-up guidance in generated prompt packs

This is not a fuzzy “quality score.” It is a trust boundary and contract gate.
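
A deterministic gate of this kind can be sketched as a pure function that returns rejection reasons. The candidate shape, the required kinds, the placeholder string, and the end-call marker below are all illustrative assumptions. Schema validation is omitted for brevity.

```typescript
interface PromptPackCandidate {
  kind: string;
  dossierHash: string;
  prompt: string;
}

interface ArtifactCandidate {
  privateDossierHash: string;
  publicDossierHash: string;
  publicDossierSourceNamespaces: string[];
  publishedNamespaces: string[];
  promptPacks: PromptPackCandidate[];
}

// Assumptions: these values are placeholders, not the real contract.
const REQUIRED_KINDS = ["private_standard", "private_memory", "public_standard"];
const REQUIRED_PLACEHOLDER = "{{runtime_brief}}";
const END_CALL_MARKER = "end_call";

// Pure function: the same candidate always yields the same reasons.
function evaluateCandidate(candidate: ArtifactCandidate): string[] {
  const reasons: string[] = [];

  // Required prompt-pack coverage.
  const kinds = new Set(candidate.promptPacks.map((pack) => pack.kind));
  for (const kind of REQUIRED_KINDS) {
    if (!kinds.has(kind)) reasons.push(`missing prompt pack: ${kind}`);
  }

  for (const pack of candidate.promptPacks) {
    // Public packs must point at the public dossier, private at the private.
    const expectedHash = pack.kind.startsWith("public_")
      ? candidate.publicDossierHash
      : candidate.privateDossierHash;
    if (pack.dossierHash !== expectedHash) {
      reasons.push(`dossier hash mismatch: ${pack.kind}`);
    }
    if (!pack.prompt.includes(REQUIRED_PLACEHOLDER)) {
      reasons.push(`missing runtime placeholder: ${pack.kind}`);
    }
    if (!pack.prompt.includes(END_CALL_MARKER)) {
      reasons.push(`missing end-call guidance: ${pack.kind}`);
    }
  }

  // Public-only provenance: public dossier content must come from
  // published namespaces.
  const published = new Set(candidate.publishedNamespaces);
  for (const namespace of candidate.publicDossierSourceNamespaces) {
    if (!published.has(namespace)) {
      reasons.push(`public dossier uses unpublished namespace: ${namespace}`);
    }
  }
  return reasons;
}
```

An empty result means the candidate may be promoted. Because every reason is a plain string tied to one rule, a rejection can be reproduced and asserted in CI.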

6. Publish/unpublish and runtime sync prioritize predictable user experience

The user-experience goal is that public sessions are reliably public-safe and private sessions stay current. To support that:

publish-state changes preserve namespace allowlists for runtime retrieval scope
public sync does not block the standard/private sync path, but public bootstrap separately enforces freshness before use
public runtime is never allowed to fall back to the private agent surface once a dedicated public runtime exists

Alternatives Considered

Alternative 1: Keep runtime-assembled reflection prompts

Pros:

Lower migration cost.
Fewer persisted artifacts to reason about.

Cons:

Keeps prompt authority split between runtime wrapper, stored prompt content, and dynamic variables.
Makes public/private trust boundaries harder to enforce consistently.
Preserves the original identity-flattening problem this branch set out to solve.

Alternative 2: Treat public sessions as a redacted view of private runtime state

Pros:

Less code and fewer explicit artifacts.
Easier initial rollout.

Cons:

Too easy to leak private cues through recap, memory, capsule anchors, or prompt-pack identity fields.
Redaction-after-the-fact is weaker than building public artifacts from public-safe inputs only.
Public retrieval scope becomes ambiguous.

Alternative 3: Let stored public agent ids act as the source of truth

Pros:

Simpler bootstrap logic.
Fewer metadata writes.

Cons:

Public runtime can drift after publish-scope or prompt-pack changes.
Session start cannot prove freshness.
Violates the trust goal that public behavior should track currently promoted public artifacts, not whichever provider agent happened to sync last.

Alternative 4: Use a fuzzy or LLM-judged approval step

Pros:

Could potentially score “quality” or naturalness beyond structure.

Cons:

Adds nondeterminism to a trust boundary.
Harder to debug, version, and enforce in CI.
Unnecessary for the branch’s current goals, which are correctness, provenance, and freshness.

Consequences

Benefits:

Prompt authority is clearer and more auditable.
Public/share/mobile-public sessions now have a cleaner and safer trust boundary.
Public runtime selection is freshness-aware and fails closed when stale.
Retrieval scope remains precise because published namespace allowlists are preserved in conversation metadata.
Artifact promotion is now a real durability and safety boundary instead of a documentation claim.
Prompt-pack hashes, dossier hashes, and runtime metadata now describe the promoted and synced state, which makes them more useful for debugging.

Costs:

More moving parts: prompt packs, dossiers, evidence packs, promotion ledger, runtime sync state.
Public bootstrap can now return explicit stale/not-ready errors rather than serving potentially stale public behavior.
The approval gate is stricter, so some artifact generations that were previously tolerated now reject.
Runtime sync and bootstrap semantics are more complex because freshness is checked per surface.

Implementation Notes

Prompt-pack selection and rendering:
- apps/api/src/lib/reflection-prompt-pack.ts
- apps/api/src/lib/elevenlabs-agent-template.ts
Runtime sync freshness:
- apps/api/src/lib/reflection-runtime-sync-state.ts
- apps/api/src/lib/agent-id.ts
- apps/api/src/lib/voice-runtime-sync.ts
Session bootstrap and audience handling:
- apps/api/src/lib/session-orchestrator-bootstrap.ts
- apps/api/src/lib/session-orchestrator-core.ts
- apps/api/src/lib/session-orchestrator-dynamic-extras.ts
- apps/api/src/routes/public.ts
- apps/api/src/routes/mobile.ts
- apps/api/src/routes/mobile-helpers.ts
Artifact generation and approval:
- apps/workers/src/functions/ingest-source/phases/reflection-voice-core.ts
- apps/workers/src/functions/ingest-source/phases/reflection-voice-dossier.ts
- apps/workers/src/functions/ingest-source/phases/reflection-voice-prompt-pack.ts
- apps/workers/src/functions/ingest-source/phases/reflection-voice-approval.ts
- apps/workers/src/functions/ingest-source/phases/capsule.ts
Persistence and monotonic promotion:
- packages/db/src/queries/conversations.ts
- packages/db/src/queries/reflections.ts
- supabase/migrations/20260309210000_reflection_voice_prompt_artifacts.sql
- supabase/migrations/20260309224500_reflections_public_agent_id.sql
- supabase/migrations/20260309233000_reflection_runtime_sync_state.sql
Contract/testing support:
- packages/schemas/src/voice-prompt.ts
- apps/workers/src/functions/audit-elevenlabs-agent-config-core.ts
- elevenlabs/tests.json
