Status: Accepted
Date: 2026-02-06
Deciders: Reflections Maintainers

Context

The platform uses multiple AI capabilities: realtime and background LLM generation, plus embedding generation for retrieval. A single-vendor strategy would simplify integrations but increase concentration risk and reduce fit-for-purpose model selection.

Decision

Use a split vendor/model strategy with three provider roles:
  • Generation: Anthropic is the LLM provider for worker JSON/text tasks and brain-core reasoning.
  • Embeddings: OpenAI is the embeddings provider with a fixed v1 embedding dimensionality contract (1536) tied to the DB schema.
  • Voice runtime: ElevenLabs Conversations API is the managed voice orchestration provider. It handles STT, turn orchestration, and TTS, calling back to the Reflections API for knowledge retrieval via server-tool endpoints. See ADR-0019: Voice runtime provider strategy for the full decision record.

Two cross-cutting policies support the split:
  • Configuration: vendor and model identifiers are set via validated environment variables in the shared runtime config.
  • Vendor DI boundary: direct vendor SDK imports are confined to packages/vendors/src/*; the ElevenLabs HTTP client is confined to the API layer (app-level, not a shared package, since only the API consumes it). All other modules depend on wrapper entrypoints for testability and swap control.
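The configuration policy above can be sketched as a small dependency-free validator. Variable names like LLM_MODEL_ID and EMBEDDING_DIM are illustrative, not the repo's actual keys; the real shared config may use a schema library instead.

```typescript
// Hypothetical sketch of validated vendor/model config. Env var names are
// illustrative; only the 1536 dimensionality contract comes from this ADR.
const EXPECTED_EMBEDDING_DIM = 1536; // v1 contract tied to the DB schema

interface VendorConfig {
  llmModelId: string;       // Anthropic generation model
  embeddingModelId: string; // OpenAI embeddings model
  embeddingDim: number;
}

function loadVendorConfig(env: Record<string, string | undefined>): VendorConfig {
  const mustGet = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required env var: ${key}`);
    return value;
  };

  const embeddingDim = Number(mustGet("EMBEDDING_DIM"));
  // Guardrail: fail fast on any dimensionality that breaks the v1 contract,
  // rather than discovering a schema mismatch at insert time.
  if (embeddingDim !== EXPECTED_EMBEDDING_DIM) {
    throw new Error(
      `EMBEDDING_DIM must be ${EXPECTED_EMBEDDING_DIM}, got ${embeddingDim}`
    );
  }

  return {
    llmModelId: mustGet("LLM_MODEL_ID"),
    embeddingModelId: mustGet("EMBEDDING_MODEL_ID"),
    embeddingDim,
  };
}
```

Failing at startup keeps a misconfigured embedding dimension from silently corrupting the retrieval index.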

Alternatives considered

Alternative 1: Single vendor for generation and embeddings

Pros:
  • Fewer SDKs and contracts.
  • Simpler procurement/key management.
Cons:
  • Higher lock-in and outage concentration risk.
  • Reduced ability to optimize per capability.

Alternative 2: Provider-agnostic abstraction layer from day one

Pros:
  • Easier swaps in theory.
Cons:
  • Adds abstraction complexity before proven need.
  • Can hide vendor-specific quality/performance behavior.

Alternative 3: Self-hosted open model stack

Pros:
  • Maximum control over model behavior and cost surface.
Cons:
  • Significant infra and MLOps burden.
  • Higher reliability and latency risk for current team size.

Consequences

Benefits:
  • Better fit-for-purpose selection by capability domain.
  • Reduced single-provider risk.
  • Explicit environment validation for model-critical settings.
Costs:
  • More integration surfaces and credential management (three providers).
  • Need to monitor three provider reliability/cost profiles.
  • Voice runtime vendor lock-in: conversation lifecycle, transcript format, and webhook contract are ElevenLabs-specific.

Implementation notes

  • Embedding dimensionality guardrails are enforced in the shared environment configuration.
  • Retrieval and ingestion paths consume provider wrappers through package entrypoints, not direct SDK usage.
  • Provider wrappers are isolated in packages/vendors/src/*. This boundary is mechanically enforced by ESLint no-restricted-imports (IDE + pre-commit) and simplicity-guard.mjs (CI). See ADR-0022: Lint policy and enforcement layer strategy.
  • ElevenLabs integration is confined to the API layer: HTTP client, server-tool callback, webhook, and reconciliation endpoints. It is not a shared package because only the API service interacts with ElevenLabs directly.
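The ESLint side of the boundary enforcement could look roughly like the fragment below. The file globs, SDK package names, and rule options are assumptions for illustration; the repo's actual eslint config and simplicity-guard.mjs checks may differ.

```typescript
// Illustrative eslint.config.ts fragment enforcing the vendor DI boundary.
// Package names and globs are assumed, not taken from the actual repo config.
import type { Linter } from "eslint";

const config: Linter.Config[] = [
  {
    files: ["**/*.ts"],
    // Wrapper code inside the vendors package may import SDKs directly.
    ignores: ["packages/vendors/src/**"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          paths: [
            {
              name: "@anthropic-ai/sdk",
              message: "Import the wrapper entrypoint from packages/vendors instead.",
            },
            {
              name: "openai",
              message: "Import the wrapper entrypoint from packages/vendors instead.",
            },
          ],
        },
      ],
    },
  },
];

export default config;
```

Running the same rule in the IDE, pre-commit, and CI (per ADR-0022) means a direct SDK import is caught before it reaches review.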