Dev · May 29, 2026 · ~6 min source read

My Agent Never Said "I Don't Know" — four real bugs and how they were fixed

A product manager built a working game AI Agent, then debugged four concrete hallucination and parameter-fabrication failures. This brief explains each failure, its root cause, and the practical fixes applied to stop plausible but false outputs.

Useful takeaways from this story.

Hard-anchor the agent's identity and other internal globals in the System Prompt so the model never has to guess them.

The failure pattern is consistent: the model detects a gap and fills it. Fixing it requires changes at both the context/KB design level and the prompt-rule level.

A product manager who learned to design the knowledge base, write a Harness, and ship an in-production game AI Agent encountered four concrete bugs during testing. Each bug produced outputs that were plausible-sounding but false: a nonexistent gun, a fabricated player ID, a wrong self-ID, and an invalid activity jump code. The fixes combine KB design, prompt rules, and runtime constraints.

Bug 1 — Recommended a gun that doesn't exist

What happened: During playtesting the Agent recommended a weapon that sounded detailed and real — damage, fire rate — but the knowledge base had no record of that weapon.

Root cause: The Agent used general FPS knowledge to invent a plausible weapon when retrieval returned empty.

  • KB: Added an "absent entities" section listing weapons that are common in similar games but not present in this one. A query for one of those weapons now retrieves an explicit "this item does not exist here" record instead of nothing, so the model has no gap to fill.
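The "absent entities" idea can be sketched as a three-way lookup: found, explicitly absent, or unknown. This is a minimal illustration, not the author's implementation; the weapon names, stats, and the `lookup_weapon` helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical data for illustration only; not taken from the original game.
KNOWN_WEAPONS = {"vector-9": {"damage": 32, "fire_rate": 780}}
ABSENT_WEAPONS = {"ak-47", "m4a1", "desert eagle"}  # common elsewhere, not in this game


@dataclass
class Lookup:
    status: str        # "found", "absent", or "unknown"
    context_note: str  # text to inject into the model's context


def lookup_weapon(name: str) -> Lookup:
    """Return an explicit record for every query, so retrieval is never silently empty."""
    key = name.strip().lower()
    if key in KNOWN_WEAPONS:
        stats = KNOWN_WEAPONS[key]
        note = f"{name}: damage {stats['damage']}, fire rate {stats['fire_rate']} rpm"
        return Lookup("found", note)
    if key in ABSENT_WEAPONS:
        return Lookup("absent", f"{name} does NOT exist in this game. Do not describe or recommend it.")
    return Lookup("unknown", f"No record for {name}. Say you don't know; do not invent stats.")
```

The design choice here is that the "unknown" branch also returns text for the context, so the model always receives an instruction rather than an empty result it might fill from general FPS knowledge.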

Bug 2 — Fabricated a player ID that looked real

What happened: The Agent returned a player unique identifier with correct format and length even though the ID did not exist in the system.

Root cause: The identifier was not reliably injected into the Agent's context. The model knew the format and therefore generated a valid-looking ID.

  • Memory/context restructuring: Place the identifier near the top of the context on every turn so the model always has the real value in front of it.
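One way to picture the context restructuring is a builder that always emits the player identifier as the first section, and emits an explicit "unknown" marker when no identifier is available. The section labels and the `build_context` function are assumptions made for this sketch, not details from the source.

```python
from typing import Optional


def build_context(player_id: Optional[str], retrieved: list[str], history: list[str]) -> str:
    """Assemble the per-turn context with the player ID pinned at the top."""
    if player_id:
        header = f"[PLAYER]\nplayer_id: {player_id}"
    else:
        # Make the gap explicit so the model is told not to generate an ID.
        header = "[PLAYER]\nplayer_id: UNKNOWN (do not guess; say the ID is unavailable)"
    sections = [
        header,
        "[KNOWLEDGE]\n" + "\n".join(retrieved),
        "[HISTORY]\n" + "\n".join(history),
    ]
    return "\n\n".join(sections)
```

For example, `build_context("P123", ["Daily quest resets at 05:00"], [])` yields a context whose first two lines are `[PLAYER]` and `player_id: P123`, regardless of how long the other sections grow.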

Bug 3 — Gave a wrong ID for itself

What happened: When asked "What's your ID?", the Agent produced a plausible-looking assistant ID that did not exist in the backend.

Root cause: No explicit identity anchor was provided, so the model inferred a likely identity string.

  • System Prompt: Add a dedicated, non-overridable identity block with all identity fields hard-coded. The Agent must use those values and cannot infer or override them.
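A dedicated identity block might be rendered from fixed configuration and placed ahead of every other rule, roughly as follows. The field names, the values, and both helper functions are hypothetical. Prompt text alone cannot strictly guarantee that the values are never overridden, so this is a sketch of the anchoring step only.

```python
# Hypothetical identity values; in practice these would come from backend config.
AGENT_IDENTITY = {
    "agent_id": "agent-0001",
    "name": "Scout",
    "role": "in-game assistant",
}


def identity_block(identity: dict[str, str]) -> str:
    """Render identity fields as an explicit block the model is told to copy verbatim."""
    lines = ["IDENTITY (fixed; never infer, alter, or override these values)"]
    lines += [f"{key}: {value}" for key, value in identity.items()]
    lines.append("If asked for an identity field not listed above, say you don't know it.")
    return "\n".join(lines)


def build_system_prompt(task_rules: str) -> str:
    """Put the identity block first so it precedes all task-specific rules."""
    return identity_block(AGENT_IDENTITY) + "\n\n" + task_rules
```

Keeping the values in one dictionary means the same source can also be used to check an outgoing answer against the real `agent_id` before it reaches the player.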

Bug 4 — Gave a jump code for an activity that didn't exist

What happened: The Agent returned a navigation jump code for a nonexistent activity. This was an executable parameter that would have led players to an error page in production.

Root cause: When retrieval returned empty, the Agent filled the required navigation parameter with a plausible code.

  • Activity code whitelist: Only codes on the approved list may appear in Agent responses.
  • Prompt rule: If retrieval returns empty, the Agent must output "no activity found" and must not emit any code.
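The two rules above can also be enforced at runtime, after the model has drafted a reply. The sketch below assumes a hypothetical code format (`ACT_` plus four digits), a hypothetical whitelist, and a `guard_response` helper; none of these come from the source.

```python
import re

APPROVED_CODES = {"ACT_1001", "ACT_1002"}    # hypothetical whitelist
CODE_PATTERN = re.compile(r"\bACT_\d{4}\b")  # hypothetical jump-code format
NO_ACTIVITY = "no activity found"


def guard_response(draft: str, retrieved: list[str]) -> str:
    """Block any reply that carries a jump code not backed by retrieval and the whitelist."""
    if not retrieved:
        # Rule from the prompt: empty retrieval means no code may be emitted.
        return NO_ACTIVITY
    codes = CODE_PATTERN.findall(draft)
    if any(code not in APPROVED_CODES for code in codes):
        return NO_ACTIVITY
    return draft
```

With these assumptions, `guard_response("Go to ACT_9999.", ["some doc"])` returns `"no activity found"`, while a draft that cites `ACT_1001` passes through unchanged. Checking in code as well as in the prompt means a fabricated code is caught even if the model ignores the prompt rule.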

These are practical, targeted fixes for plausible-sounding hallucinations that could misdirect debugging or break production flows.
