Status: Accepted (supersedes ADR-0026)
Date: 2026-03-06
Deciders: Reflections Maintainers

Context

The voice-clone architecture has moved away from conversation-audio creation and outbox-first state management. Current production behavior is sample-first and attempt-driven:
  • clone creation starts from explicit creator samples or validated custom voice IDs
  • durable lifecycle state lives in voice_clone_attempts
  • workers process attempt events through a dedicated internal endpoint
  • runtime readiness requires more than cloneStatus === 'ready'
  • preview completion is part of the launch gate for first-party reflection sessions
ADR-0026 described voice_clone_outbox, conversation-audio clone creation, and voiceConfig-derived status as the primary architecture. That description remains useful as history but is operationally misleading.
Evidence in code/config:
  • supabase/migrations/20260305113000_voice_clone_attempts.sql
  • packages/db/src/queries/voice-clone-attempts.ts
  • apps/workers/src/functions/process-voice-clone-attempt.ts
  • apps/api/src/routes/internal/voice-clone-attempts.ts
  • apps/api/src/lib/voice-clone-gate.ts
  • docs/runbooks/voice-clone-pipeline.md

Decision

Durable Lifecycle Authority

voice_clone_attempts is the canonical runtime authority for voice-clone lifecycle state.
  • Each attempt has durable status, runtime status, preview state, retry counters, provider ids, and timestamps.
  • Reflection voice JSON remains a mirrored presentation/config surface for client reads and lightweight launch checks.
  • Attempt transitions must be driven through the DB/query surface and mirrored into reflection voice; direct config patching is not the primary lifecycle mechanism.
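
The relationship between the attempt row and the mirrored reflection voice JSON can be sketched as follows. This is a minimal illustration, not the actual schema: the interface names, the field names other than cloneStatus, runtimeStatus, and previewRequired, and every status value other than 'ready' and 'synced' are assumptions. The point it shows is the direction of data flow: a transition is applied to the attempt first, and the mirror is always recomputed from the attempt rather than patched on its own.

```typescript
// Hypothetical shapes for illustration; the real schema is defined by the
// voice_clone_attempts migration and may differ.
type CloneStatus = "pending" | "processing" | "ready" | "failed"; // only "ready" appears in this ADR
type RuntimeStatus = "pending" | "synced" | "failed"; // only "synced" appears in this ADR

interface VoiceCloneAttempt {
  id: string;
  reflectionId: string;
  status: CloneStatus;
  runtimeStatus: RuntimeStatus;
  previewRequired: boolean;
  retryCount: number;
  providerVoiceId: string | null;
  updatedAt: string; // ISO 8601 timestamp
}

interface MirroredVoice {
  attemptId: string;
  cloneStatus: CloneStatus;
  runtimeStatus: RuntimeStatus;
  previewRequired: boolean;
}

// Derive the client-facing mirror from the authoritative attempt row.
function mirrorAttempt(attempt: VoiceCloneAttempt): MirroredVoice {
  return {
    attemptId: attempt.id,
    cloneStatus: attempt.status,
    runtimeStatus: attempt.runtimeStatus,
    previewRequired: attempt.previewRequired,
  };
}

// Apply a transition to the attempt, then recompute the mirror from it.
// The input attempt is not mutated.
function transitionAttempt(
  attempt: VoiceCloneAttempt,
  patch: Partial<
    Pick<VoiceCloneAttempt, "status" | "runtimeStatus" | "previewRequired" | "providerVoiceId">
  >,
  now: Date = new Date(),
): { attempt: VoiceCloneAttempt; voice: MirroredVoice } {
  const next: VoiceCloneAttempt = { ...attempt, ...patch, updatedAt: now.toISOString() };
  return { attempt: next, voice: mirrorAttempt(next) };
}
```

In a real implementation the attempt update and the mirror write would presumably go through the DB/query surface together; the sketch only captures that the mirror is a pure function of the attempt.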

Sample-First Product Contract

The supported clone-creation paths are:
  • manual sample upload / recording via POST /v1/reflections/:id/voice/clone
  • validated custom voice-id attachment via PATCH /v1/reflections/:id/voice
Conversation-ended webhook audio and voice-clone outbox relay paths remain compatibility-only. They must not be treated as the primary product flow, primary retry surface, or architecture source of truth.
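
One way to make the contract explicit in code is to map each creation source to its supported route and return nothing for the compatibility-only sources. The sketch below assumes a hypothetical CloneSource union and a hypothetical primaryRouteFor helper; only the two route paths come from this ADR.

```typescript
// Hypothetical source labels; the ADR names the flows but not these identifiers.
type CloneSource =
  | "sample_upload"
  | "custom_voice_id"
  | "conversation_webhook"
  | "outbox_relay";

interface Route {
  method: "POST" | "PATCH";
  path: string;
}

// Return the supported product route for a source, or null when the source
// is compatibility-only and must not be treated as a primary flow.
function primaryRouteFor(source: CloneSource, reflectionId: string): Route | null {
  const id = encodeURIComponent(reflectionId);
  switch (source) {
    case "sample_upload":
      return { method: "POST", path: `/v1/reflections/${id}/voice/clone` };
    case "custom_voice_id":
      return { method: "PATCH", path: `/v1/reflections/${id}/voice` };
    case "conversation_webhook":
    case "outbox_relay":
      return null;
  }
}
```

Returning null rather than throwing keeps the legacy paths callable elsewhere while making it hard to route new product work through them by accident.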

Worker-Driven Processing

Clone attempts are processed through an explicit worker event and internal attempt endpoint:
  • workers receive VOICE_CLONE_PROCESS
  • workers call POST /internal/reflections/:id/voice-clone-attempts/:attemptId/process
  • stale recovery and verification completion continue the same attempt-driven lifecycle instead of spawning parallel state machines
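
The worker step above can be sketched as a pure function that turns a VOICE_CLONE_PROCESS event into the internal request it should send. The event payload shape, the apiBaseUrl parameter, and the buildProcessRequest name are assumptions for illustration; the event name and the endpoint path are taken from this ADR.

```typescript
// Hypothetical event payload; the ADR names the event but not its fields.
interface VoiceCloneProcessEvent {
  name: "VOICE_CLONE_PROCESS";
  data: { reflectionId: string; attemptId: string };
}

interface InternalRequest {
  method: "POST";
  url: string;
}

// Build the request a worker would send to the internal attempt endpoint.
function buildProcessRequest(
  apiBaseUrl: string,
  event: VoiceCloneProcessEvent,
): InternalRequest {
  const base = apiBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const reflectionId = encodeURIComponent(event.data.reflectionId);
  const attemptId = encodeURIComponent(event.data.attemptId);
  return {
    method: "POST",
    url: `${base}/internal/reflections/${reflectionId}/voice-clone-attempts/${attemptId}/process`,
  };
}
```

Because the request is keyed by attemptId, stale recovery and verification completion can reuse the same call for the same attempt instead of introducing a separate processing path.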

Readiness Gate

Reflection-session launch requires all of:
  • cloneStatus === 'ready'
  • runtimeStatus === 'synced'
  • previewRequired !== true
This means:
  • a mirrored ready state is still launch-blocked while runtime sync is incomplete
  • a mirrored ready state is still launch-blocked until preview has been completed when preview is required
  • late completions after reset/retry must not reactivate launch eligibility for inactive attempts
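
The gate can be expressed as a single predicate over the mirrored state. The sketch below is an assumption-laden illustration rather than the contents of voice-clone-gate.ts: the MirroredVoiceState shape, the activeAttemptId field, and both function names are hypothetical. The three conditions are the ones listed above, and the second function shows one way a late completion for an inactive attempt could be ignored.

```typescript
// Hypothetical mirrored state; activeAttemptId is an assumed field.
interface MirroredVoiceState {
  activeAttemptId?: string;
  cloneStatus?: string;
  runtimeStatus?: string;
  previewRequired?: boolean;
}

// All three conditions must hold. Note that previewRequired !== true means
// an absent or false flag does not block launch.
function canLaunchReflectionSession(voice: MirroredVoiceState): boolean {
  return (
    voice.cloneStatus === "ready" &&
    voice.runtimeStatus === "synced" &&
    voice.previewRequired !== true
  );
}

// Ignore completions that belong to an attempt that is no longer active,
// so a reset or retry cannot be overridden by a late result.
function applyCompletion(
  voice: MirroredVoiceState,
  completedAttemptId: string,
): MirroredVoiceState {
  if (voice.activeAttemptId !== completedAttemptId) {
    return voice; // stale completion: leave state untouched
  }
  return { ...voice, cloneStatus: "ready" };
}
```

For example, a state with cloneStatus 'ready' and runtimeStatus 'pending' fails the gate, as does a fully synced state whose previewRequired is still true.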

Consequences

Benefits:
  • one durable attempt model across API, workers, runbooks, and client presentation
  • fewer false-ready states during runtime sync and preview onboarding
  • clearer incident response because stale recovery, verification, and processing all operate on the same authority surface
Costs:
  • reflection voice JSON and attempt rows must stay synchronized
  • docs and operational habits must move away from the legacy outbox-first mental model

Operational Notes

  • Runbooks must start from attempt state, runtime sync, verification completion, and preview gating.
  • Client docs must describe reflection voice as mirrored status, not authoritative lifecycle storage.
  • Legacy outbox events are still accepted for compatibility, but they no longer define the core voice-clone architecture.