Status: Accepted
Date: 2026-02-06
Deciders: Reflections Maintainers

Context

Source ingestion runs through multiple side-effecting stages (fetch, chunk, embed, entity, fact, eval, apply). Failures and retries are expected, so the system needs idempotency keys, resumable behavior, and explicit status telemetry.

Decision

Use Inngest orchestration with ingest ledgering and patch-aware recovery:
  • Create/update ingest run records keyed by (reflection_id, source_id, content_hash) and idempotency_key.
  • Reuse an existing patch batch where possible; skip duplicate work when the patch is already in a terminal state.
  • Recover from previously approved but unapplied patches by applying and marking source ingested.
  • Record progress states and terminal outcomes (completed, eval_rejected, failed, duplicate_skipped).
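The decision points above can be sketched as a small pure function that a worker step might call before doing any side-effecting work. This is an illustrative sketch only, not the project's implementation: the `PatchStatus` values other than "approved", the `Action` names, and the choice of SHA-256 over the joined key tuple are all assumptions made for the example.

```python
import hashlib
from enum import Enum
from typing import Optional


class PatchStatus(Enum):
    # Hypothetical patch states; only "approved but unapplied" and
    # "terminal" are named in this ADR.
    PENDING = "pending"
    APPROVED = "approved"
    APPLIED = "applied"
    REJECTED = "rejected"


class Action(Enum):
    # Hypothetical names for what the worker does next.
    RUN_PIPELINE = "run_pipeline"
    REUSE_PATCH_BATCH = "reuse_patch_batch"
    APPLY_AND_MARK_INGESTED = "apply_and_mark_ingested"
    SKIP_DUPLICATE = "skip_duplicate"


TERMINAL_PATCH_STATES = frozenset({PatchStatus.APPLIED, PatchStatus.REJECTED})


def idempotency_key(reflection_id: str, source_id: str, content_hash: str) -> str:
    """Derive a stable key from the ledger tuple (assumed derivation)."""
    # A separator byte avoids collisions such as ("ab", "c") vs ("a", "bc").
    raw = "\x1f".join((reflection_id, source_id, content_hash))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def next_action(patch_status: Optional[PatchStatus]) -> Action:
    """Decide what a retried or duplicate ingest run should do."""
    if patch_status is None:
        # No patch batch exists yet for this key: do the full ingest.
        return Action.RUN_PIPELINE
    if patch_status in TERMINAL_PATCH_STATES:
        # The work already reached a final outcome: record duplicate_skipped.
        return Action.SKIP_DUPLICATE
    if patch_status is PatchStatus.APPROVED:
        # Approved but never applied: recover by applying, then mark ingested.
        return Action.APPLY_AND_MARK_INGESTED
    # A non-terminal batch exists: reuse it instead of creating a new one.
    return Action.REUSE_PATCH_BATCH
```

Keeping the decision free of I/O means the recovery rules can be unit-tested without a database or a running orchestrator; the worker only has to look up the current patch status and act on the result.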

Alternatives considered

Alternative 1: Fire-and-forget ingest with no run ledger

Pros:
  • Minimal persistence overhead.
Cons:
  • No reliable retry semantics.
  • Poor observability and manual recovery burden.

Alternative 2: Exactly-once queue semantics only

Pros:
  • Cleaner mental model in theory.
Cons:
  • Hard to guarantee across all external side effects.
  • Still needs stateful compensation logic for partial failures.

Alternative 3: Manual operator-driven retries only

Pros:
  • Operationally explicit.
Cons:
  • Slow recovery and low throughput.
  • Error-prone repeated manual interventions.

Consequences

Benefits:
  • Controlled retries and deterministic duplicate handling.
  • Better failure visibility and pipeline telemetry.
  • Recovery logic tied to authoritative patch status.
Costs:
  • Increased state-machine complexity in worker orchestration.
  • More status transitions to test and maintain.
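One way to keep the transition surface testable is to derive the allowed transitions from a single table and reject everything else. The sketch below is a hedged illustration: the stage names and the four terminal outcomes come from this ADR, but the `queued` start state, the use of stage names as progress states, and the rule that terminal outcomes are never left are assumptions. In particular, whether a `failed` run may be re-entered on retry is a policy choice this record does not state.

```python
STAGES = ("fetch", "chunk", "embed", "entity", "fact", "eval", "apply")
TERMINAL = frozenset({"completed", "eval_rejected", "failed", "duplicate_skipped"})
START = "queued"  # hypothetical initial status


def allowed_next(status: str) -> frozenset:
    """Return the statuses a run may move to from `status`."""
    if status in TERMINAL:
        # Assumption: terminal outcomes are final for a given run record.
        return frozenset()
    if status == START:
        # "apply" is reachable directly to model recovery of an
        # approved-but-unapplied patch.
        return frozenset({STAGES[0], "apply", "duplicate_skipped", "failed"})
    index = STAGES.index(status)  # raises ValueError for unknown statuses
    nxt = {"failed"}  # any stage may fail
    if status == "eval":
        nxt.add("eval_rejected")
    if index + 1 < len(STAGES):
        nxt.add(STAGES[index + 1])
    else:
        nxt.add("completed")
    return frozenset(nxt)


def transition(current: str, new: str) -> str:
    """Validate a status change and return the new status."""
    if new not in allowed_next(current):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

With a table like this, the added test burden reduces to checking `allowed_next` for each status rather than exercising every path through the worker.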

Implementation notes

  • End-to-end ingest orchestration is implemented in the worker functions layer using Inngest.
  • Ingest run persistence and status handling live in the DB queries layer.
  • Schema support for idempotency and statuses is defined across multiple migrations that add idempotency keys, entity constraints, and ingest run status refinements.