May 06 2026 7 min

Your Agent Spends 30 Seconds Finding Files - Here's How I Cut It to 3ms

Architecture Performance Workspace Indexing AI OpenClaw

One of the least glamorous bottlenecks in an agent system is file lookup.

It sounds small until you watch the same agent spend real time rediscovering the workspace over and over again. Scan the tree. Grep a memory file. Ask a human where something lives. Repeat next session.

That is fine when the workspace is tiny. It becomes wasteful fast once the system is expected to work repeatedly and at speed.

The fix I ended up with was simple: stop treating the filesystem like a mystery and give the agent an index.

The Real Problem Wasn't Just Speed

Yes, the raw timing mattered. Cutting file lookup from tens or hundreds of milliseconds down to a few milliseconds is useful.

But the bigger problem was semantic. A directory scan tells you that files exist. It does not tell the agent what those files mean. Is this an active idea, an archived issue, a product note, an article brief, or a sidecar file that should be ignored?

Without structure, the agent has to keep doing interpretive work after every lookup. That is the hidden tax.

Why a Compact Index Worked Better

The answer was to precompute meaning once and load it cheaply at session start.

Instead of making the agent walk the workspace every time, I built a generator that scans the workspace out of band, extracts the fields that matter, and writes them into a compact registry. The agent loads that registry once and then queries it in memory.

That changes the interaction model completely. Navigation becomes lookup, not discovery.

Once I had that shift, the numbers followed naturally: a few milliseconds to load, near-zero time to query, and much lower context overhead than asking the agent to rebuild its map from scratch.

The Constraints Were Practical

I did not want a little infrastructure empire just to solve file lookup.

The design constraints were straightforward:

load fast enough that startup actually improves
stay compact enough not to waste context
avoid a dependency chain that turns indexing into its own maintenance problem

That is why the solution stayed boring: a generator, a compact JSON representation, and in-memory queries. No daemon, no service, no premature database for a workspace that did not need one yet.

What Made It Useful

The registry did more than store paths. It stored identity and relationships.

Ideas, issues, products, articles, research, and project notes all had distinct patterns. Once those patterns were encoded, the agent no longer had to guess whether a file was primary, derivative, archived, or noisy. It could query the right object class directly.

That turned a filesystem problem into a data-model problem, which is exactly where it belonged.

The Bugs Were the Best Part

The first version of the indexer got things wrong in exactly the ways a real evolving workspace tends to go wrong.

Sidecar files looked like primary content. Legacy issue naming collided with newer conventions. Nested project files were missed. Duplicate article-related artifacts polluted the registry. In other words: reality showed up.

That was useful, because it proved the actual value of the indexer was not just speed. It was schema enforcement at the boundary. The workspace itself could remain messy and human. The registry was the place where ambiguity had to be resolved once.

That is a much healthier arrangement than forcing every downstream agent query to rediscover the mess independently.

Why This Matters for Agent Systems

Agents are unusually sensitive to navigation friction because they pay for it twice: once in time and once in context.

A human can scan a file tree quickly and carry a rough mental model for the rest of the session. An agent usually cannot. Every rediscovery step burns tokens, introduces ambiguity, and increases the chance of opening the wrong thing.

That is why a compact registry feels like such a large win relative to how simple it is. It removes a class of repetitive uncertainty from the system.

What I'd Generalize From It

If your agent works over any structured workspace, whether that is a codebase, a note system, a documentation vault, or a mixed operating repository, you probably want some form of precomputed registry.

Not because it is fancy. Because it is basic hygiene once the workspace stops being trivial.

The moment I stopped asking the agent to "figure out the workspace" and started handing it a compact map, the whole system felt more deliberate. Faster, yes. But more importantly: less confused.

That is usually the real sign that an infrastructure change was worth making.

Read more technical writing and case-study notes from the archive.