Verify before recommending: when the AI cites, you grep
The most dangerous AI output isn't wrong. It's outdated.
Claude will recommend a flag that was removed three months ago, a function that got renamed last quarter, a file path that doesn't exist anymore. Each citation sounds right because it was right.
The default failure mode
This isn't hallucination. Hallucination is the model inventing a plausible-but-fake API. This is worse: the model is repeating something that was true the day it learned it, and has no way to notice it isn't anymore.
Three flavors show up in practice:
- Renamed symbols. A function got refactored last quarter; the assistant still cites the old name.
- Removed flags. A CLI option got deprecated; the assistant still recommends it because it appeared in a doc that's still cached somewhere.
- Moved paths. A file got reorganized; the assistant cites the old import.
None of these look wrong on the page. The Markdown is unchanged. The model emits the answer with the same confidence as a citation it generated this morning.
Why this happens to any system with persistent state
It's tempting to call this a model failure, but the same pattern shows up everywhere a stored snapshot meets a moving codebase:
- Auto-memory. A memory written six weeks ago says the auth flow lives in
src/lib/auth/middleware.ts. The file was renamed two weeks ago. The memory is now a hallucination engine pointed at your project. - RAG over docs. The index was last rebuilt before the rewrite. Queries return confidently-formatted answers from the old API surface.
- Fine-tuned models. Trained on last quarter's repo snapshot. Cites a function that was deleted in the cleanup PR.
The common feature is that the storage layer doesn't know when its contents went stale. The codebase keeps moving; the snapshot does not.
The discipline
The fix is one habit: when the AI names a specific symbol, treat it as a hypothesis. Verify before acting.
"Verify" is almost always a ten-second grep:
- "Use the
--strictflag" → does--strictstill exist? Grep the CLI. - "The session middleware is at
src/lib/auth/session.ts" → does the path exist?lsit. - "Call
validateInput()" → grep for the function. Was it renamed in the last refactor?
Ten seconds of verification catches what would otherwise be a fifteen-minute confusion. The cost-benefit is so lopsided it should be reflexive.
The same habit applies to memory-driven recommendations. If a memory references a path, a function, or a person on a project, that reference is a claim that needs checking — not a fact.
What this doesn't mean
It doesn't mean distrusting the model. The model is right far more often than not. The discipline is narrow: it kicks in on specific named symbols, not on conceptual guidance.
When the AI says "consider extracting this into a helper" — that's advice, no grep needed. When it says "extract this into formatRowForCSV() like the helper in src/utils/csv.ts" — now you grep. The named symbol is the part that ages. The advice doesn't.
This is also why generic "trust but verify" framings underperform. They're too broad to be a habit. Grep before you act on a named symbol is a habit. You can build a reflex around it.
The rule
When the AI cites, you verify. When it remembers, you check.
The codebase is the present. The memory is a snapshot. When they disagree, the codebase wins.
The ten seconds of grep is the cheapest insurance in the workflow.