ADR-0027: Three-mode agent routing and session orchestration

Status: Accepted
Date: 2026-03-04
Deciders: Reflections Maintainers

Context

The platform runs voice sessions via ElevenLabs (ADR-0019) with a single agent prompt template. As the product grew to support three distinct session types (onboarding interviews, reflections, and shared sessions), the original routing based on mode and roomPrefix proved ambiguous. Specifically, mode: 'onboarding' could not distinguish a mid-onboarding interview from a late-stage onboarding reflection session, causing the wrong agent behavior and tool permissions.

Additionally, the dynamic variable surface grew to include persona narratives, session recaps, preloaded evidence, user memories, and namespace scoping, all requiring mode-aware assembly with a budget cap to stay within provider limits.

Evidence in code/config:

apps/api/src/lib/session-orchestrator-core.ts (SessionMode type, deriveSessionMode, dynamic variable assembly)
apps/api/src/lib/elevenlabs-agent-template.ts (Handlebars template, tool registration)
apps/api/src/lib/session-orchestrator-dynamic-extras.ts (async extras, budget enforcement)
apps/api/src/lib/session-orchestrator.ts (bootstrap sequence, fallback agent retry)
apps/api/src/routes/sessions/start-session.ts (agent resolution, voice clone gate)
apps/api/src/lib/conversation-bind-sig.ts (HMAC bind signature for write-tool authorization)

Decision

Three Session Modes

SessionMode = 'interview' | 'reflection' | 'share'

Mode derivation (deriveSessionMode) follows a strict priority cascade:

agentType === 'interviewer' → interview
agentType === 'reflection' → reflection
mode === 'onboarding' → interview
roomPrefix === 'share' → share
Default → reflection

The explicit agentType field takes highest priority, resolving the routing ambiguity that previously existed when only mode and roomPrefix were available.
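The cascade above can be sketched as a pure function. This is a minimal illustration reconstructed from the listed priorities, not the actual body of deriveSessionMode; the SessionModeInput shape is an assumption, since the real parameter type is not shown in this ADR.

```typescript
type SessionMode = "interview" | "reflection" | "share";

// Hypothetical input shape (assumption): the three routing signals named in this ADR.
interface SessionModeInput {
  agentType?: string;
  mode?: string;
  roomPrefix?: string;
}

function deriveSessionMode(input: SessionModeInput): SessionMode {
  // Priorities 1 and 2: an explicit agentType overrides every other signal.
  if (input.agentType === "interviewer") return "interview";
  if (input.agentType === "reflection") return "reflection";
  // Priority 3: legacy signal, onboarding without an agentType means interview.
  if (input.mode === "onboarding") return "interview";
  // Priority 4: shared rooms.
  if (input.roomPrefix === "share") return "share";
  // Priority 5: default.
  return "reflection";
}
```

Under this sketch, a late-stage onboarding reflection is routed correctly by passing agentType: 'reflection' together with mode: 'onboarding'; the first matching rule wins.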

Tool Permissions Per Mode

Tool access is enforced at two layers:

Prompt layer (soft): {{allowed_actions}} dynamic variable tells the LLM which tools to call. The agent prompt includes a non-bypassable policy preamble: “Only call actions listed in {{allowed_actions}}.”
Server layer (hard): Write tools (request-source-ingest, propose-fact-correction) require a conversation_bind_sig (HMAC-SHA256) in the request body, verified server-side before any state mutation.

The prompt layer is a UX signal; the server layer is the security boundary.

Permission matrix:

Mode	Role	Allowed Actions
`interview`	any	`retrieve-context`
`reflection`	owner/admin/operator	`retrieve-context`, `request-source-ingest`, `propose-fact-correction`
`reflection`	viewer/guest	`retrieve-context`
`share`	any	Follows role-based rules (typically `retrieve-context` only)

Interview mode is restricted to read-only tools regardless of role. Interview sessions always enable ingestion; a violation fails at runtime with the error interview_requires_ingestion. Share mode disables ingestion by default.
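The permission matrix and the ingestion rule can be expressed as two small functions. This is a hedged sketch: resolveAllowedActionsSketch and assertIngestionPolicy are hypothetical names, the Role union is inferred from the matrix, and treating share mode exactly like reflection mode for role-based rules is an assumption.

```typescript
type SessionMode = "interview" | "reflection" | "share";
type Role = "owner" | "admin" | "operator" | "viewer" | "guest";
type Action = "retrieve-context" | "request-source-ingest" | "propose-fact-correction";

const READ_ONLY: readonly Action[] = ["retrieve-context"];
const READ_WRITE: readonly Action[] = [
  "retrieve-context",
  "request-source-ingest",
  "propose-fact-correction",
];
const WRITE_ROLES: ReadonlySet<Role> = new Set<Role>(["owner", "admin", "operator"]);

// Hypothetical helper mirroring the permission matrix in this ADR.
function resolveAllowedActionsSketch(mode: SessionMode, role: Role): Action[] {
  // Interview mode is read-only for every role.
  if (mode === "interview") return [...READ_ONLY];
  // Assumption: share follows the same role-based rules as reflection.
  return WRITE_ROLES.has(role) ? [...READ_WRITE] : [...READ_ONLY];
}

// Hypothetical guard: interview sessions must have ingestion enabled.
function assertIngestionPolicy(mode: SessionMode, ingestionEnabled: boolean): void {
  if (mode === "interview" && !ingestionEnabled) {
    throw new Error("interview_requires_ingestion");
  }
}
```

The returned list only feeds the {{allowed_actions}} prompt variable; it does not replace the server-side signature check on write tools.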

Agent ID Resolution

Two-tier agent ID resolution:

Per-reflection agent ID from DB (custom branded agents).
Fallback to ELEVENLABS_AGENT_ID environment variable.

Interviewer sessions use ELEVENLABS_INTERVIEWER_AGENT_ID. On HTTP 400/404/422 from ElevenLabs, the system retries with the fallback agent ID (anti-stale-agent-config safety net).
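A minimal sketch of the two-tier resolution and the stale-agent retry follows. The helper names, the injected fetchToken callback, and the convention that provider errors carry a numeric status property are all assumptions made for illustration; only the retryable status codes come from this ADR.

```typescript
// HTTP statuses that trigger a retry with the fallback agent, per this ADR.
const RETRYABLE_STATUSES: ReadonlySet<number> = new Set([400, 404, 422]);

// Tier 1: per-reflection agent ID from the DB. Tier 2: environment default.
function resolveAgentId(dbAgentId: string | null, envAgentId: string): string {
  return dbAgentId !== null && dbAgentId.length > 0 ? dbAgentId : envAgentId;
}

// Assumption: provider errors expose the HTTP status as a numeric `status` field.
function httpStatusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

type TokenFetcher = (agentId: string) => Promise<string>;

async function getTokenWithFallback(
  fetchToken: TokenFetcher,
  primaryAgentId: string,
  fallbackAgentId: string,
): Promise<{ token: string; agentId: string }> {
  try {
    return { token: await fetchToken(primaryAgentId), agentId: primaryAgentId };
  } catch (err) {
    const status = httpStatusOf(err);
    const retryable = status !== undefined && RETRYABLE_STATUSES.has(status);
    // Retrying the same agent ID would only repeat the failure.
    if (!retryable || primaryAgentId === fallbackAgentId) throw err;
    return { token: await fetchToken(fallbackAgentId), agentId: fallbackAgentId };
  }
}
```

Other failures, such as a 5xx response or a network error, are rethrown unchanged in this sketch, since a different agent ID would not be expected to help.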

Dynamic Variable Assembly

Variables are injected into the ElevenLabs Handlebars template at session bootstrap.

Always present: reflection_id, conversation_id, conversation_bind_sig, off_record, actor_type, user_id (SHA-256 truncated to 16 hex chars), allowed_actions, session_access_scope.

Conditionally loaded from DB: reflection_display_name, creator_name, reflection_speaking_style, reflection_persona_narrative, reflection_capsule_json, explore_next, session_recap, preloaded_evidence, reflection_namespaces, user_context.
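As an illustration of the user_id variable, the following sketch hashes a raw user ID with SHA-256 and keeps the first 16 hex characters. The function name hashUserId is hypothetical, and hashing the raw ID directly without a salt or prefix is an assumption; the ADR only states the algorithm and the truncation length.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: pseudonymize a user ID before it reaches the agent prompt.
function hashUserId(rawUserId: string): string {
  return createHash("sha256")
    .update(rawUserId, "utf8")
    .digest("hex")
    .slice(0, 16); // 16 hex chars = the first 64 bits of the digest
}
```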

Budget Enforcement

Total dynamic variable budget: 10,000 characters. An iterative trimming loop assigns survival ranks (0 = trimmed first, 13 = trimmed last). Security-critical variables (user_id, session_access_scope, allowed_actions, actor_type) have the highest ranks and are never trimmed under normal conditions. Content variables (reflection_capsule_json, user_context) are trimmed first.
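One way to picture the trimming loop is the sketch below. It is not applyDynamicExtrasBudget() itself: the function name, the rule that the budget counts only the characters of variable values, the tail-truncation strategy, and the default rank of 0 for unranked variables are all assumptions.

```typescript
const DYNAMIC_VARIABLE_BUDGET = 10000;

// Hypothetical budget pass: lower survival rank = trimmed earlier.
function applyBudgetSketch(
  vars: Record<string, string>,
  ranks: Record<string, number>,
  budget: number = DYNAMIC_VARIABLE_BUDGET,
): Record<string, string> {
  const result: Record<string, string> = { ...vars };
  // Assumption: the budget counts characters of values only, not keys.
  const totalLength = (): number =>
    Object.keys(result).reduce((sum, key) => sum + result[key].length, 0);

  // Visit variables from lowest rank (trimmed first) to highest (trimmed last).
  const trimOrder = Object.keys(result).sort(
    (a, b) => (ranks[a] ?? 0) - (ranks[b] ?? 0),
  );

  for (const key of trimOrder) {
    const overflow = totalLength() - budget;
    if (overflow <= 0) break; // within budget, higher-ranked variables untouched
    const value = result[key];
    // Cut only as much as needed from the tail of this variable.
    result[key] = value.slice(0, Math.max(0, value.length - overflow));
  }
  return result;
}
```

In this sketch the high-ranked security variables are reached only after every lower-ranked variable has been emptied, which matches the "never trimmed under normal conditions" behavior described above.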

Bootstrap Sequence

Race a 12-second timeout against the bootstrap promise.
In parallel: getConversationToken(agentId) (with fallback agent retry) and buildSessionDynamicExtras(...).
After both resolve: bind providerConversationId to DB and set lifecycle to active.
On any failure: markConversationFailed() writes status: 'failed' and captures to Sentry.

Dynamic extras failures are soft-fail: partial extras are returned, and the session proceeds with a session_dynamic_variables_partial warning.
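The sequence above might be structured roughly as follows. This is a sketch under stated assumptions: BootstrapDeps, bindAndActivate, withTimeout, bootstrapSession, and the "bootstrap_timeout" error label are hypothetical, and buildSessionDynamicExtras is assumed to resolve with partial extras rather than reject, in line with the soft-fail rule.

```typescript
const BOOTSTRAP_TIMEOUT_MS = 12000;

// Hypothetical dependency bundle so the flow can be exercised in isolation.
interface BootstrapDeps {
  getConversationToken(agentId: string): Promise<string>;
  buildSessionDynamicExtras(): Promise<Record<string, string>>;
  bindAndActivate(token: string): Promise<void>;
  markConversationFailed(reason: unknown): Promise<void>;
}

// Reject with `label` if `promise` has not settled within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(label)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer); // avoid a dangling timer once the race is decided
  }
}

async function bootstrapSession(
  deps: BootstrapDeps,
  agentId: string,
  timeoutMs: number = BOOTSTRAP_TIMEOUT_MS,
): Promise<{ token: string; extras: Record<string, string> }> {
  const bootstrap = async () => {
    // Token fetch and extras assembly run in parallel.
    const [token, extras] = await Promise.all([
      deps.getConversationToken(agentId),
      deps.buildSessionDynamicExtras(),
    ]);
    // Only after both resolve: bind to the DB and mark the session active.
    await deps.bindAndActivate(token);
    return { token, extras };
  };
  try {
    return await withTimeout(bootstrap(), timeoutMs, "bootstrap_timeout");
  } catch (err) {
    // Any failure, including the timeout, marks the conversation as failed.
    await deps.markConversationFailed(err);
    throw err;
  }
}
```

Note that Promise.race does not cancel the losing promise; in this sketch a bootstrap that outlives the timeout keeps running in the background, so a real implementation would need to decide how to handle a late bind.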

Conversation Bind Signature

computeConversationBindSig() produces an HMAC-SHA256 of the conversation ID using ELEVENLABS_BINDING_SECRET. This signature is:

Injected as {{conversation_bind_sig}} dynamic variable.
Passed through to the client, then sent back by the ElevenLabs agent in write-tool request bodies.
Verified server-side at tool endpoints before writes are allowed.
Rotation-friendly: configuredBindingVerificationSecrets() tries every configured secret, using a constant-time comparison for each.
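The signing and verification steps can be illustrated with Node's crypto module. This is a sketch, not the contents of conversation-bind-sig.ts: the function names computeBindSigSketch and verifyBindSigSketch, the hex encoding of the signature, and passing the secrets as an array are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC-SHA256 over the conversation ID. Hex output is an assumption.
function computeBindSigSketch(conversationId: string, secret: string): string {
  return createHmac("sha256", secret).update(conversationId, "utf8").digest("hex");
}

// Accept the signature if it matches under any configured secret,
// so a previous secret keeps working during rotation.
function verifyBindSigSketch(
  conversationId: string,
  presentedSig: string,
  secrets: readonly string[],
): boolean {
  const presented = Buffer.from(presentedSig, "utf8");
  let matched = false;
  for (const secret of secrets) {
    const expected = Buffer.from(computeBindSigSketch(conversationId, secret), "utf8");
    // timingSafeEqual throws on length mismatch, so check lengths first.
    if (expected.length === presented.length && timingSafeEqual(expected, presented)) {
      matched = true; // no early return: every secret is always checked
    }
  }
  return matched;
}
```

A write-tool endpoint would call the verifier with the conversation ID and the conversation_bind_sig from the request body, and reject the request before any state mutation when it returns false.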

Alternatives Considered

Alternative 1: Single mode with role-based permissions only

Pros:

Simpler routing — no mode derivation needed.
Role alone determines behavior.

Cons:

Cannot distinguish interview behavior from reflection behavior for the same role (owner).
Interview sessions need fundamentally different tool access and ingestion policy regardless of role.
Share sessions need different ingestion defaults regardless of role.

Alternative 2: Separate ElevenLabs agents per mode

Pros:

Complete isolation of prompt and tool configuration per mode.
No runtime tool filtering needed.

Cons:

Triples agent management overhead (creation, updates, config drift monitoring).
Dynamic variables still need mode-aware assembly.
Harder to keep prompts consistent across modes.

Alternative 3: Client-side mode routing

Pros:

Server stays mode-agnostic; client decides which agent to connect to.

Cons:

Tool permissions would be client-enforceable only — security boundary violation.
Ingestion policy must be server-enforced (two-plane invariant).

Consequences

Benefits:

Explicit agentType field eliminates the interview/reflection routing ambiguity.
Two-layer tool enforcement (prompt + HMAC) provides defense-in-depth for write operations.
Budget enforcement prevents dynamic variable overflow regardless of how much context is loaded.
Soft-fail extras loading ensures sessions start even when DB reads partially fail.
Bind signature with key rotation enables zero-downtime secret management.

Costs:

Mode derivation priority cascade must be understood by anyone modifying session creation.
Dynamic variable surface is large (14+ variables) and mode-dependent — testing requires covering the mode matrix.
Two separate enforcement layers (prompt-level allowed_actions and server-level bind signature) must stay aligned.
Mobile vs. web session responses differ (MobileSessionResponse includes providerConversationId; VoiceSessionResponse does not).

Implementation Notes

Session mode type and derivation: apps/api/src/lib/session-orchestrator-core.ts (SessionMode, deriveSessionMode(), buildDynamicVariables()).
Async extras and budget: apps/api/src/lib/session-orchestrator-dynamic-extras.ts (buildSessionDynamicExtras(), applyDynamicExtrasBudget(), resolveAllowedActions()).
Agent template: apps/api/src/lib/elevenlabs-agent-template.ts (buildElevenLabsAgentTemplate(), TOOL_DEFINITIONS, buildInlineTools()).
Main orchestrator: apps/api/src/lib/session-orchestrator.ts (startVoiceConversation(), startMobileConversation(), bootstrap with 12s timeout).
Agent ID resolution: apps/api/src/lib/agent-id.ts (resolveAgentForReflection()).
Bind signature: apps/api/src/lib/conversation-bind-sig.ts (computeConversationBindSig(), verifyConversationBindSig()).
Route layer: apps/api/src/routes/sessions/start-session.ts (resolveSessionAgent(), enforceVoiceCloneGate()).
Session planning (pure): apps/api/src/routes/sessions/planner.ts (decideSessionAccess(), decideProviderBindOutcome()).
Custom reflection template LLM config: Claude Sonnet 4.6, temperature 0.4, max 500 tokens, parallel tool calls disabled.
