Architecture

ACSI · Hospitality · 2024–2025 · Public

Public · Representative · synthetic data
Live diagram — rewrite → hybrid retrieve → rerank → evidence gate → generate → verify → route. Only generate + verify are metered; the gate short-circuits hopeless queries before any spend.

The pipeline

Session load → intent + clarify-check → query rewrite → hybrid retrieve (semantic + keyword) → rerank → evidence-sufficiency gate → generate-with-citations → verify (claim-check + citation confirm + confidence) → route (answer / clarify / retry / escalate). The deterministic stages — retrieval, the evidence gate and routing — run at $0 in both cloud and OSS modes and are authoritative for grounding and hand-off.
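The gate-then-route logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual code: the names (`Passage`, `Verification`, `decide`), the thresholds, the single-retry limit, and the choice to escalate when evidence is insufficient are all assumptions made for the example. What it is meant to show is the ordering: the deterministic gate runs first, and the metered generate/verify step is only invoked when the gate passes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Route(Enum):
    ANSWER = "answer"
    CLARIFY = "clarify"
    RETRY = "retry"
    ESCALATE = "escalate"


@dataclass
class Passage:
    doc_id: str
    score: float  # reranker score; a 0..1 scale is an assumption


@dataclass
class Verification:
    claims_supported: bool      # claim-check result
    citations_confirmed: bool   # citation confirm result
    confidence: float


# Hypothetical thresholds, chosen only for illustration.
MIN_SCORE = 0.5
MIN_PASSAGES = 2
MIN_CONFIDENCE = 0.7
MAX_RETRIES = 1


def evidence_sufficient(passages: List[Passage]) -> bool:
    """Deterministic gate: enough passages above the score threshold."""
    strong = [p for p in passages if p.score >= MIN_SCORE]
    return len(strong) >= MIN_PASSAGES


def decide(
    needs_clarification: bool,
    passages: List[Passage],
    generate_and_verify: Callable[[List[Passage]], Verification],
    retries_used: int = 0,
) -> Route:
    if needs_clarification:
        return Route.CLARIFY
    if not evidence_sufficient(passages):
        # Short-circuit: the metered step is never called.
        # Escalating here is an assumption of this sketch.
        return Route.ESCALATE
    result = generate_and_verify(passages)  # the only metered call
    verified = (
        result.claims_supported
        and result.citations_confirmed
        and result.confidence >= MIN_CONFIDENCE
    )
    if verified:
        return Route.ANSWER
    if retries_used < MAX_RETRIES:
        return Route.RETRY
    return Route.ESCALATE
```

Passing `generate_and_verify` in as a callable keeps the routing decision testable without any model call, which fits the idea that the gate and routing are deterministic and cost nothing to run.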

Dual approach — cloud vs OSS

The metered stages (query rewrite, rerank, generate, verify) run on a live cloud model (claude-haiku-4-5) that is cost-capped and fail-closed: at the cap, the system falls back to the $0 OSS path. The OSS path replays, at $0, the outputs of a self-hosted model recorded on local M4 hardware, exactly as the live campsite systems do. Recorded Cloud-vs-OSS divergences are surfaced honestly in the inspector; an uncaptured case is shown as "not captured", never fabricated.
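A cost cap with a fail-closed fallback might look like the sketch below. Everything named here is hypothetical (`CostCap`, `run_metered_stage`, the per-call estimate, the recordings dictionary keyed by stage and query); the source does not describe how the cap or the replay store is implemented. Amounts are tracked as integer micro-dollars to avoid floating-point drift at the cap boundary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

NOT_CAPTURED = "not captured"


class BudgetExceeded(Exception):
    """Raised when a call would push spend past the cap."""


@dataclass
class CostCap:
    cap_micro_usd: int
    spent_micro_usd: int = 0

    def reserve(self, estimate_micro_usd: int) -> None:
        # Fail closed: refuse the call before spending, not after.
        if self.spent_micro_usd + estimate_micro_usd > self.cap_micro_usd:
            raise BudgetExceeded
        self.spent_micro_usd += estimate_micro_usd


def run_metered_stage(
    stage: str,
    query: str,
    cap: CostCap,
    cloud_call: Callable[[str, str], str],
    recordings: Dict[Tuple[str, str], str],
    estimate_micro_usd: int,
) -> Tuple[str, str]:
    """Return (mode, output), where mode is 'cloud' or 'oss'."""
    try:
        cap.reserve(estimate_micro_usd)
    except BudgetExceeded:
        # At the cap: replay a recorded OSS output at no cost.
        # A missing recording is reported as such, never invented.
        return "oss", recordings.get((stage, query), NOT_CAPTURED)
    return "cloud", cloud_call(stage, query)
```

Reserving the estimated cost before the call, rather than recording actual spend afterwards, is one way to guarantee the cap is never exceeded; a real system would likely also reconcile the estimate against the billed amount.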

Architecture · Enterprise RAG Support Platform · Abhishek Saxena