Business Problem & Solution

ACSI · Hospitality · 2020–Present · Public

Public · Representative · synthetic data
Diagram: campsite attributes → the system → website copy. Labels: region & facilities, suitability, rating, brand voice, two-stage gen (write · check · retry), tagline & description, highlights, nearby POI, grounded, on-brand, multilingual.
Live diagram — structured attributes converge through write-then-check generation into grounded, on-brand, multilingual copy.

Why grounded generation matters

A hospitality marketplace lists tens of thousands of campsites, and every one needs a tagline, a description, a set of highlights and nearby points of interest — ideally in several languages. Writing and maintaining that by hand is slow, uneven and never finished; the moment a facility changes, the copy is stale. But handing it to a raw language model is worse: it writes fluently about a heated pool, a five-star spa or a beach that the property doesn't have, and it drifts off-brand the moment no one is watching.

The requirement is specific. The copy has to be grounded in each property's real structured attributes — region, facilities, suitability, rating — consistent in voice, free of banned superlatives, and it must never claim something the data doesn't support. And it has to do all of that at marketplace scale, at near-zero cost per item, with a human editor signing off before anything goes live.

The trade-off

A pure template engine is safe but lifeless; a pure language model is fluent but ungoverned. This system blends the two. A deterministic, $0 layer owns the brief, the cache, the template and the authoritative quality floor (banned words, required facts, claim-versus-attributes, language and format), while a two-stage model pairing handles the prose: one model writes it and a second model checks it.

The result is copy that reads like a person wrote it but behaves like a rules engine produced it: on-brand, grounded, multilingual, and never published without passing QC and an editor's approval.
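As an illustration of the deterministic quality floor described above, here is a minimal Python sketch of three of its checks: banned words, a required fact, and claim-versus-attributes. Everything named in it is an assumption made for the example, not the system's actual code: the `Campsite` fields, the `BANNED_WORDS` list, the `CLAIM_TO_FACILITY` map, the choice of region as the required fact, and the 300-character format limit. The language check is left out because it would need a language-detection step.

```python
import re
from dataclasses import dataclass, field

# Hypothetical banned superlatives; the real list is not given here.
BANNED_WORDS = {"best", "ultimate", "unbeatable", "world-class"}

# Hypothetical map: a phrase that makes a claim -> the facility that must back it.
CLAIM_TO_FACILITY = {
    "heated pool": "heated_pool",
    "spa": "spa",
    "beach": "beach",
}


@dataclass
class Campsite:
    """Illustrative subset of a property's structured attributes."""
    name: str
    region: str
    facilities: set[str] = field(default_factory=set)


def qc_check(copy: str, site: Campsite, max_chars: int = 300) -> list[str]:
    """Return rule violations; an empty list means the copy passes the floor."""
    text = copy.lower()
    words = set(re.findall(r"[a-z][a-z\-]*", text))
    issues: list[str] = []

    # Banned words: exact word matches only.
    for word in sorted(BANNED_WORDS & words):
        issues.append(f"banned word: {word}")

    # Required fact (assumed): the region must be mentioned.
    if site.region.lower() not in text:
        issues.append(f"missing required fact: region {site.region}")

    # Claim-versus-attributes: a claimed facility must exist in the data.
    for phrase, facility in sorted(CLAIM_TO_FACILITY.items()):
        claimed = re.search(rf"\b{re.escape(phrase)}\b", text) is not None
        if claimed and facility not in site.facilities:
            issues.append(f"unsupported claim: {phrase}")

    # Format (assumed limit): keep the copy within a character budget.
    if len(copy) > max_chars:
        issues.append(f"too long: {len(copy)} > {max_chars} characters")

    return issues
```

Because every rule is plain string and set logic, a check like this costs nothing per item and gives the same verdict on every run, which is what lets it act as the authoritative floor beneath the model-based checker.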

What this demo proves — and what it simplifies

This is a faithful, downscaled reimplementation on the shared synthetic catalog, not the production system. It proves grounded, two-stage, multilingual generation with a deterministic QC floor, a capped retry loop, a content cache, and an editorial human-in-the-loop studio. It deliberately simplifies the breadth of languages and content types, translation (mocked client-side), and full-scale generation infrastructure; all of these simplifications are labelled in Architecture → Out of scope. The employer is named only as context.
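The capped retry loop and content cache mentioned above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the demo's code: `write` and `check` are hypothetical callables standing in for the writer model and the checking stage, the cache is a plain dictionary keyed by a hash of the attributes and language, and the cap of three attempts is an arbitrary choice.

```python
import hashlib
import json
from typing import Callable, Optional

# Hypothetical signatures for the two stages.
Writer = Callable[[dict, str, list], str]   # (attributes, language, feedback) -> draft
Checker = Callable[[str, dict], list]       # (draft, attributes) -> list of issues


def brief_key(attributes: dict, language: str) -> str:
    """Stable cache key: identical attributes and language give the same key."""
    payload = json.dumps({"attrs": attributes, "lang": language}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def generate(
    attributes: dict,
    language: str,
    write: Writer,
    check: Checker,
    cache: dict,
    max_attempts: int = 3,
) -> Optional[str]:
    """Write, check, and retry up to max_attempts; return None if no draft passes."""
    key = brief_key(attributes, language)
    if key in cache:
        return cache[key]  # unchanged attributes: reuse the copy, no model call

    feedback: list = []
    for _ in range(max_attempts):
        draft = write(attributes, language, feedback)
        feedback = check(draft, attributes)
        if not feedback:
            cache[key] = draft  # only passing copy is cached
            return draft

    # Cap reached: the caller decides what happens next.
    return None
```

Keying the cache on the attributes means a changed facility yields a new key and therefore fresh copy, while unchanged properties cost nothing on later runs. Copy returned by a loop like this would still go to the editor for approval before publication.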

Business Problem & Solution · Campsite Content Generation · Abhishek Saxena