The Open-Book Exam
A company rolls out an AI assistant for employee questions. Week one, someone asks: “How many weeks of parental leave do I get?”
The bot answers instantly, warmly, and specifically: “Twelve weeks at full pay.” The actual policy — updated last quarter — is eighteen. The bot has read, in effect, the whole public internet. It has never read the one document that mattered: the company handbook sitting in a folder fifty metres from the server room.
If you've ever asked a chatbot about your order, your contract, your policy — you've met this exact gap.
You own this rollout. The model demonstrably 'knows' a thousand times more than any employee — and none of the two hundred facts that make it useful here. How do you get your handbook into it?
What's the right fix?
Pick one — committing first is what makes the answer stick.
the lesson continues after you choose
“Train it on our data” is the phrase that launches a thousand overpriced projects. It makes sense — school taught us that knowing things means having studied them.
But it confuses two different kinds of memory. What the model learned in training is like fluency and judgment — slow to acquire, general, durable. Your parental-leave policy is a fact on a page — specific, changeable, and needing a citation. Fluency belongs in the brain. Facts belong in a library the brain can search.
The pipeline is called retrieval-augmented generation (RAG), and its magic ingredient is how the search works. Documents are split into chunks, and each chunk is converted into an embedding — a point in a space with hundreds of dimensions, positioned by meaning. Your question gets embedded into the same space, and the system grabs the chunks sitting nearest to it. That's why asking “how long can I stay home with my baby?” finds the parental leave section despite sharing no keywords: nearby in meaning-space is the search key. The retrieved passages are then placed into the model's context, with an instruction that matters: answer from these, and cite them.
Grounding also quietly attacks the hallucination problem from the Limits shelf: a model answering from a provided passage has far less room to invent than one answering from vibes, and the citation gives you a one-click audit. The honest caveats: retrieval can fetch the wrong pages (garbage retrieved, garbage generated), chunking and permissions take real engineering, and for small stable corpora, modern long-context models make option B a legitimate rival. The architecture question is always the same trade: what does the model need to know forever, versus look up right now?
So the bot that botched parental leave was never under-trained — it was under-libraried. The gap between “knows everything” and “knows your business” isn't closed by more knowing at all; it's closed by reading the right page at the right moment. You didn't need a smarter brain. You needed to turn a closed-book exam into an open-book one.
Your rule: ask of any fact an AI system must handle — “if this changed tomorrow, would the system answer correctly tomorrow?” If no, that fact belongs in retrieval, not in weights. And when you evaluate any 'trained on your data' pitch, ask to see the citations; grounded systems can show their pages, and the ones that can't are asking for faith.