Is Your AI Agent Flying Blind?

You paste a task into Claude Code. Something straightforward: "Add a filtering option to the reports table." The agent gets to work. It generates a component, hooks it up, writes reasonable-looking code. Then you review the PR and sigh.

It used inline styles instead of your CSS modules convention. It created a ReportFilter component in the wrong directory. It ignored the existing Zustand store pattern and invented its own state management. It didn't check permissions the way every other server action does. The code works, technically. But it doesn't belong in your codebase.

So you spend twenty minutes fixing what should have taken five, and you think: the AI doesn't know how we build things here.

You are right. It does not. But the poor uninformed model is not the problem here. How was it supposed to know you prefer CSS Modules over Tailwind?

The treasure map nobody drew

Every engineering team accumulates institutional knowledge. The problem is where it lives.

The CLAUDE.md file has some rules in it. Maybe import ordering, maybe a note about naming conventions. It was written three months ago and covers about 40% of what someone actually needs to know.

Slack threads contain half your architectural decisions. Somebody asked "should we use server components or client components for this?" and the tech lead replied with a thoughtful paragraph at 11:47 PM. That paragraph is now the de facto standard. It lives in a thread with 23 replies and some dancing parrot emoji reactions.

Meeting notes, if they exist, captured the decision to use Auth0 for auth and Neon with Kysely for the database. But the why behind those choices? The team picked them because they could be set up in ten minutes through the Vercel marketplace, which mattered more than any technical comparison at the time. That rationale is in someone's head. It is not in any document.

Confluence pages document the deployment process. They were accurate six months ago. The app used to deploy to DigitalOcean. It is on Fly.io now. A new developer spends half a day looking for a deployment pipeline that no longer exists. The Node version got bumped from 18 to 24 and every npm package was upgraded along with it, but the Confluence page still says Node 18. The dev runs the app locally on 18, everything breaks, and nobody spots the problem quickly because the page says 18 is correct.

Tribal knowledge fills the remaining gaps. Senior engineers know that user IDs are strings, not UUIDs, because the auth provider returns them that way. They know the CSS variable for colours uses British spelling. They know the ORM repository files must be named as plural entities. None of this is written down anywhere a new team member, or an AI agent, could find it.

This has always been a problem. But it used to be a manageable one, because the consumers of this knowledge were humans, and humans are remarkably good at piecing together fragments.

Humans improvise. Agents don't.

When a senior developer encounters ambiguity, they do something an AI agent cannot: they walk over to a colleague's desk, scan the existing codebase for patterns, remember a conversation from last Tuesday, and synthesise an answer from five incomplete sources. They fill gaps with judgment built from months of context absorption.

An AI coding agent gets what you give it. A system prompt. Maybe a project rules file. Maybe a few files of context pulled in by the IDE. That's the entire universe it operates in.

When that context is fragmented and incomplete, the agent doesn't ask clarifying questions in a Slack channel. It doesn't recall a meeting it wasn't in. It makes its best guess from whatever it can see, and that guess is often plausible but wrong in ways that only someone with full context would catch.

This is the core tension: AI coding agents are increasingly capable of generating sophisticated, working code. But capability without context produces confident mistakes. The model isn't the bottleneck. The context pipeline is.

What centralised context actually looks like

The fix isn't better prompting. It's better information architecture.

Imagine every architectural decision your team has made lives in a structured, queryable record. Not a paragraph buried in Confluence, but a discrete decision with its status, the alternatives considered, the rationale, and the date it was made. When someone, human or AI, asks "why do we use Supabase instead of Firebase?", the answer exists in one place, in a consistent format.
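As a rough illustration, a decision record like this could be modelled as a small typed structure with a lookup over it. Everything here is hypothetical: the `DecisionRecord` shape, the `findDecisions` helper, and the sample entry (including its date and placeholder rationale) are invented for the sketch and do not describe any particular tool's format.

```typescript
// Hypothetical shape for a structured decision record; field names are illustrative.
type DecisionStatus = "proposed" | "accepted" | "superseded";

interface DecisionRecord {
  id: string;
  title: string;
  status: DecisionStatus;
  decidedOn: string; // ISO date
  alternatives: string[];
  rationale: string;
}

// Sample data, invented for illustration only.
const decisions: DecisionRecord[] = [
  {
    id: "ADR-001",
    title: "Use Supabase instead of Firebase",
    status: "accepted",
    decidedOn: "2024-01-15",
    alternatives: ["Firebase"],
    rationale: "Placeholder: the team's actual reasoning would be recorded here.",
  },
];

// Return accepted decisions whose title or alternatives mention the search term.
function findDecisions(records: DecisionRecord[], term: string): DecisionRecord[] {
  const needle = term.toLowerCase();
  return records.filter(
    (r) =>
      r.status === "accepted" &&
      (r.title.toLowerCase().includes(needle) ||
        r.alternatives.some((a) => a.toLowerCase().includes(needle))),
  );
}
```

The point of the shape is that status, alternatives, rationale, and date are separate fields, so a human or an agent can filter on them instead of rereading a thread.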

Imagine your features aren't just Jira tickets with vague descriptions, but structured specifications with acceptance criteria, tracking events, and explicit relationships to the solution designs that describe how they'll be built.

Imagine your coding conventions (import ordering, naming patterns, state management approach, CSS methodology) aren't scattered across a half-maintained markdown file and the muscle memory of your senior engineers, but captured as explicit, versioned instructions that any consumer can read.
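One way to picture "explicit, versioned instructions" is a small data structure that can be rendered as text for any consumer. This is a minimal sketch: the `ConventionSet` type, the `renderConventions` function, and the version number are assumptions made for illustration, and the rule wording paraphrases the conventions mentioned earlier in this post.

```typescript
// Hypothetical container for team conventions; the version lets consumers detect changes.
interface ConventionSet {
  version: string;
  rules: Record<string, string>;
}

// Rules paraphrased from the examples in this post; the version number is invented.
const conventions: ConventionSet = {
  version: "1.0.0",
  rules: {
    styling: "Use CSS Modules; do not use inline styles.",
    stateManagement: "Follow the existing Zustand store pattern.",
    serverActions: "Check permissions the same way existing server actions do.",
    repositoryFiles: "Name ORM repository files as plural entities.",
  },
};

// Render the conventions as plain text, one rule per line under a versioned header.
function renderConventions(set: ConventionSet): string {
  const lines = Object.entries(set.rules).map(([topic, rule]) => `- ${topic}: ${rule}`);
  return [`Coding conventions v${set.version}`, ...lines].join("\n");
}
```

Keeping the rules as data rather than free prose means the same source can feed a rules file, a review checklist, or an agent's context without drifting apart.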

This isn't documentation for documentation's sake. This is treating specifications as a first-class part of your engineering infrastructure, the same way you treat your database schema or your CI pipeline.

The bridge between specs and agents

Structured specifications become genuinely powerful when agents can consume them directly. This is where protocols like MCP, the Model Context Protocol, change the equation.

Instead of copying and pasting context into a prompt, or hoping the agent infers your conventions from a handful of example files, the agent queries your specifications at the moment it needs them. Starting work on a feature? It pulls the feature spec, the related solution design, the relevant ADRs, and the coding instructions, automatically.

The agent doesn't need to guess your naming conventions. It reads them. It doesn't need to infer your auth pattern from scanning five files. It fetches the explicit instruction. It doesn't write code that technically works but violates three team decisions, because those decisions are structured data it consulted before writing the first line.
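The lookup an agent performs before writing code might look something like the sketch below. It deliberately uses an in-memory store rather than any real MCP SDK; the `SpecStore` and `FeatureSpec` types, the `buildContext` function, the IDs, and the sample acceptance criterion are all hypothetical and exist only to show the shape of the idea.

```typescript
// Hypothetical feature spec with explicit links to its design and related decisions.
interface FeatureSpec {
  id: string;
  title: string;
  acceptanceCriteria: string[];
  designId: string;
  decisionIds: string[];
}

// Hypothetical in-memory stand-in for wherever the specifications actually live.
interface SpecStore {
  features: Map<string, FeatureSpec>;
  designs: Map<string, string>;
  decisions: Map<string, string>;
  instructions: string[];
}

// Gather the spec, its linked design and decisions, and the coding instructions
// into a single text block an agent could read before writing code.
function buildContext(store: SpecStore, featureId: string): string {
  const feature = store.features.get(featureId);
  if (!feature) throw new Error(`Unknown feature: ${featureId}`);
  const design = store.designs.get(feature.designId) ?? "(no solution design linked)";
  const linked = feature.decisionIds
    .map((id) => store.decisions.get(id))
    .filter((d): d is string => d !== undefined);
  return [
    `Feature: ${feature.title}`,
    ...feature.acceptanceCriteria.map((c) => `Acceptance: ${c}`),
    `Design: ${design}`,
    ...linked.map((d) => `Decision: ${d}`),
    ...store.instructions.map((i) => `Instruction: ${i}`),
  ].join("\n");
}

// Sample data, invented for illustration.
const store: SpecStore = {
  features: new Map([
    ["FEAT-1", {
      id: "FEAT-1",
      title: "Add a filtering option to the reports table",
      acceptanceCriteria: ["Users can filter reports by a chosen field"],
      designId: "DES-1",
      decisionIds: ["ADR-001"],
    }],
  ]),
  designs: new Map([["DES-1", "Filter state lives in the existing Zustand store"]]),
  decisions: new Map([["ADR-001", "Style components with CSS Modules"]]),
  instructions: ["Check permissions the same way existing server actions do."],
};
```

Calling `buildContext(store, "FEAT-1")` returns one block of text that starts with the feature title and ends with the coding instructions. In a real setup the store would sit behind a protocol such as MCP instead of a local object.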

This is the difference between an agent that produces code you have to fix and an agent that produces code that fits.

The mindset shift

For years, engineering teams have treated documentation as a tax. Something you do after the real work, reluctantly, knowing it'll be outdated within weeks. That's a rational response when documentation is a write-only medium: when nobody reads it, maintaining it feels pointless.

But AI agents have changed the economics. Documentation now has a direct, measurable consumer. Every specification you write doesn't just help the next human who reads it. It helps every AI-assisted coding session across your entire team, every day, compounding.

The team that captures a decision once and makes it queryable saves ten correction cycles across ten developers across ten weeks. The team that structures their feature specs properly doesn't just have better documentation. They have AI agents that produce better first drafts, fewer review cycles, and less rework.

This isn't about creating more process. It's about recognising that the context gap between what your team knows and what your AI tools know is one of the largest sources of waste in AI-assisted development today.

Your agents are capable. They're just flying blind.

Give them the map.