Your Agent Spends 30 Seconds Finding Files - Here's How I Cut It to 3ms
Every time your agent needs to find a file, it does one of three things:
- Scans the entire directory tree: one find per look-up, O(n) disk traversal
- Greps MEMORY.md: a 19KB text file parsed for keywords, with a high false-positive rate
- Asks you: "Where's that file about X again?"
All three are wrong. They treat the file system as a database without an index, and the agent's context window as a cache that gets evicted every session.
Here's the architecture for a solution that loads an index of the entire workspace in about 3ms and costs ~400 tokens to keep in context.
Why Directory Scans Fail at Scale
The naive approach is find at startup. It works for 50 files. It becomes slow at 200. At 1,224 files — which is where this workspace sits — a full directory scan costs 50–200ms of wall time and an unpredictable number of context tokens depending on how deep the agent walks.
More importantly: find returns paths, not semantics. It tells you a file exists. It does not tell you whether that file is an article, an idea, a project, or a resolved postmortem. The agent must then open each candidate file to answer that question.
The alternative — grepping MEMORY.md — is equally flawed. MEMORY.md is a curated memory store at 19KB. It was never designed to be a file system index. Every grep into it is a trade-off: either you build a mental model of the file tree by reading it end-to-end (expensive), or you search for specific terms and hope the term matches a file path (fragile).
The correct architecture is a pre-computed compact index that is loaded once per session and queried in-memory.
The Three Constraints
The design had three hard constraints:
1. Load time under 10ms. The index must be loaded at session start alongside the system prompt. If it cannot load faster than a find, it provides no benefit. Target: sub-5ms.
2. Token cost under 500 tokens. The index is part of the agent's context window for the entire session. Every kilobyte of index is a kilobyte not available for reasoning. Target: ~400 tokens.
3. Zero dependencies. The index generator must be a single Python file with only stdlib imports. No YAML parsers, no database clients, nothing that can break. Output is a single JSON file.
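The first two constraints can be checked mechanically against any candidate payload. The sketch below is one way to do that, not the author's tooling: `check_budget` is a hypothetical helper, and the 4-characters-per-token ratio is a rough heuristic assumed here, not a measured figure for any particular tokenizer.

```python
import json
import time


def check_budget(payload: str, max_load_ms: float = 10.0, max_tokens: int = 500) -> dict:
    """Measure parse time and estimate token cost for a serialized index."""
    start = time.perf_counter()
    json.loads(payload)
    load_ms = (time.perf_counter() - start) * 1000

    # Assumption: roughly 4 characters per token. Real tokenizers vary,
    # and dense JSON usually tokenizes worse than prose.
    est_tokens = len(payload) // 4

    return {
        "load_ms": load_ms,
        "est_tokens": est_tokens,
        "ok": load_ms < max_load_ms and est_tokens < max_tokens,
    }


small = json.dumps({"d": []})          # trivially small index
large = json.dumps(["a" * 4000])       # far over the token budget
small_report = check_budget(small)
large_report = check_budget(large)
print(small_report, large_report)
```

Running a check like this in the daily job would flag the moment the index outgrows its budget, instead of finding out from degraded sessions.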
The Architecture
The system has two components: a generator and a compact serialization format.
Generator
The generator (workspace-indexer.py) scans every section of the workspace once per day. For each directory group — ideas, skills, articles, issues, products, research, automation mining, projects — it walks the relevant subdirectory, reads YAML frontmatter from each file, and extracts structured data: ID, title, status, priority, tags, and cross-references.
The scan is O(n) where n is the total file count (1,224). It runs in ~300ms and produces a ~12KB JSON file. This is acceptable for a daily cron job. The index is regenerated every 24 hours or on demand.
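As a rough illustration of that scan, here is a stdlib-only sketch. It is not the actual workspace-indexer.py: `read_frontmatter` and `index_group` are hypothetical names, the frontmatter parser only handles flat `key: value` lines, and the output shape (`d`, `n`, `e`, `i`) is inferred from the benchmark snippet later in this article.

```python
import json
import tempfile
from pathlib import Path


def read_frontmatter(path: Path) -> dict:
    """Parse flat `key: value` lines between leading '---' markers.

    Deliberately minimal (assumption): no nesting, no multi-line values.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def index_group(root: Path, name: str, id_field: str) -> dict:
    """Walk one directory group and collect entries that carry an ID."""
    entries = []
    for path in sorted(root.rglob("*.md")):
        meta = read_frontmatter(path)
        if id_field not in meta:
            continue  # files without the ID field are skipped
        entries.append({
            "i": meta[id_field],
            "t": meta.get("title", path.stem),
            "s": meta.get("status", ""),
            "f": path.as_posix(),
        })
    return {"n": name, "e": entries}


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "IS-001-demo.md").write_text(
        "---\nidea-id: IS-001\ntitle: Demo idea\nstatus: active\n---\nBody\n",
        encoding="utf-8",
    )
    (root / "IS-001-notes.md").write_text("Sub-file without frontmatter\n", encoding="utf-8")
    group = index_group(root, "Ideas", "idea-id")
    index = {"d": [group]}

print(json.dumps(index, separators=(",", ":")))
```

A hand-rolled parser like this keeps the zero-dependency constraint, at the cost of rejecting anything beyond simple scalar fields.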
Compact Serialization
The JSON index uses single-character field keys. "title" becomes "t", "file" becomes "f", "status" becomes "s". This is not optional cosmetic compression — it is the difference between a 12KB file and a ~3KB compact payload. At this size, the index can be loaded as a one-liner:
```python
import json; REGISTRY = json.load(open("Registry/index.json"))
```
This is the entire setup. There is no caching layer, no daemon, no database connection. The file is on disk. json.load reads it in ~3ms.
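To make the key-shortening concrete, here is a small sketch that serializes one entry both ways and compares sizes. The `t`, `f`, and `s` keys come from the text above and `i` from the benchmark code further down; the `SHORT_KEYS` map, the `compact` helper, and the sample entry are illustrative assumptions.

```python
import json

# Long-to-short key map. "t", "f", "s" are named in the article;
# "i" appears in its benchmark code. The map itself is a sketch.
SHORT_KEYS = {"id": "i", "title": "t", "file": "f", "status": "s"}


def compact(entry: dict) -> dict:
    """Rename known keys to their single-character form; keep others as-is."""
    return {SHORT_KEYS.get(key, key): value for key, value in entry.items()}


# Hypothetical sample entry.
entry = {
    "id": "IS-001",
    "title": "Ideation synthesis",
    "file": "obsidian/ideas/active/IS-001-ideation-synthesis.md",
    "status": "active",
}

verbose = json.dumps(entry)
# Dropping the default ", " and ": " separators removes more padding.
short = json.dumps(compact(entry), separators=(",", ":"))

print(len(verbose), len(short))
```

The saving per entry is small, but it is paid on every entry, which is why it matters for a file that lives in the context window all session.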
Query Pattern
Once loaded, queries are in-memory dictionary lookups:
- "Find idea IS-001": O(1), a lookup by ID in the Ideas group's entries
- "Find all files tagged 'automation'": one lookup in the tags reverse index, then O(k) to collect the results, where k is the number of matching entries
- "Trace IS-001 to its product and article": two dictionary lookups on the cross-reference map
Every query is sub-millisecond. The agent never touches disk for navigation again.
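The three queries above could look like the following sketch. The `d`/`n`/`e`/`i` layout mirrors the benchmark code later in the article; the `tags` and `xref` key names, the sample IDs, and the helper functions are assumptions made for illustration. Note that O(1) lookup by ID needs a dictionary keyed by ID, which is built once after loading.

```python
# Hypothetical in-memory registry, shaped like the loaded JSON.
registry = {
    "d": [
        {"n": "Ideas", "e": [{"i": "IS-001", "t": "Example idea"}]},
        {"n": "Products", "e": [{"i": "PP-001", "t": "Example product"}]},
        {"n": "Articles", "e": [{"i": "ART-001", "t": "Example article"}]},
    ],
    # Tag -> comma-separated entry IDs.
    "tags": {"automation": "IS-001,PP-001"},
    # Idea -> derived product and article.
    "xref": {"IS-001": {"product": "PP-001", "article": "ART-001"}},
}

# Build the ID map once per session; every later lookup is O(1).
by_id = {e["i"]: e for group in registry["d"] for e in group["e"]}


def find_by_tag(tag: str) -> list:
    """Resolve a tag to its entries via the reverse index: O(k)."""
    ids = registry["tags"].get(tag, "")
    return [by_id[i] for i in ids.split(",") if i in by_id]


def trace(idea_id: str) -> dict:
    """Follow an idea's cross-references to the linked entries."""
    links = registry["xref"].get(idea_id, {})
    return {kind: by_id.get(target) for kind, target in links.items()}


print(by_id["IS-001"])
print(find_by_tag("automation"))
print(trace("IS-001"))
```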
What Gets Indexed
The index tracks eight directory groups. The key design decision: each group has a distinct prefix and a single file read pattern.
| Group | Prefix | Source | Entries |
|---|---|---|---|
| Skills | (none) | skills/*/SKILL.md | 17 |
| Ideas | IS- | Frontmatter in obsidian/ideas/active/ | 40 |
| Issues | ISU- | Frontmatter in obsidian/issues/ + obsidian/issue-management/ | 15 |
| Articles | ART- | Frontmatter in obsidian/articles/ | 11 |
| Products | PP- | Frontmatter in obsidian/product/ | 8 |
| Automation | AM- | Frontmatter in obsidian/automation-mining/opps/*/ | 252 |
| Projects | (named) | obsidian/projects/*.md | 9 |
| Research | RS- | Directories in obsidian/research/active/ | 1 |
Cross-references connect ideas to their derived products and articles. The tags index maps each tag to a comma-separated list of entry IDs (only for tags with >= 2 entries, to avoid noise).
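A tags reverse index with that noise threshold can be built in one pass. This is a minimal sketch under assumptions: `build_tag_index` is a hypothetical name, and entries are assumed to carry a `tags` list alongside the `i` ID field.

```python
from collections import defaultdict


def build_tag_index(entries: list, min_entries: int = 2) -> dict:
    """Map each tag to a comma-separated list of entry IDs.

    Tags used by fewer than `min_entries` entries are dropped as noise.
    """
    buckets = defaultdict(list)
    for entry in entries:
        for tag in entry.get("tags", []):
            buckets[tag].append(entry["i"])
    return {
        tag: ",".join(ids)
        for tag, ids in sorted(buckets.items())
        if len(ids) >= min_entries
    }


# Hypothetical sample entries.
sample_entries = [
    {"i": "IS-001", "tags": ["automation", "oneoff"]},
    {"i": "AM-001", "tags": ["automation"]},
    {"i": "ART-001", "tags": []},
]
tag_index = build_tag_index(sample_entries)
print(tag_index)
```

Storing the IDs as one comma-separated string instead of a JSON array saves the quotes and brackets around every ID, in keeping with the compact format.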
The 5 Bugs That Made This Real
The indexer did not work correctly on the first day. Five bugs were discovered and fixed:
1. Phantom idea entries (42 → 40). Files named like IS-001-ideation-synthesis.md but lacking an idea-id field in their frontmatter were indexed as separate ideas. Fix: require the idea-id frontmatter field for ideas; sub-files without it are silently excluded.
2. Article sidecar pollution (17 → 11). Image prompt files (ART-018-image-prompt.md) and brief files (ART-019-issue-management-brief.md) were indexed as separate articles alongside the main article file. Fix: explicit exclusion rules for -image-prompt suffix and deduplication for briefs when a parent article exists.
3. Issue directory fragmentation. Issues were stored across two directories — obsidian/issues/ (old-style, 9 entries) and obsidian/issue-management/active|archive/ (new-style, 12 entries). The indexer only scanned the first. Fix: scan both, deduplicate by ID, prefer the later state.
4. Issue file name collision. Old-style files were named issue-001-diagnostic-failure.md (no ISU prefix), new-style files are ISU-011-thunderbird-mcp-down.md. Both formats needed regex extraction to determine the canonical ID. Fallback: frontmatter → regex → filename stem.
5. Subdirectory project files. Projects are stored as obsidian/projects/{name}/README.md (subdirectory), not obsidian/projects/{name}.md (flat file). The indexer only scanned flat files. Fix: scan both, with subdirectory files taking priority.
These bugs are not edge cases. They are the natural result of a workspace evolving over time without a centralized schema enforcement layer. Each fix required changing the indexer, not the file system — because the indexer is the only component that must be schema-aware. The files themselves are free to use any naming convention.
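Two of these fixes are easy to sketch: the ID fallback chain from bug 4 and the sidecar exclusion from bug 2. The code below is an illustration, not the indexer's actual logic. The frontmatter field name `id`, the regex, and the choice to normalize old-style `issue-001` names to `ISU-001` are all assumptions.

```python
import re
from pathlib import Path

# Matches both ISU-011-... and old-style issue-001-... filenames.
ISSUE_ID = re.compile(r"^(?:ISU|issue)-(\d+)", re.IGNORECASE)


def canonical_issue_id(path: Path, meta: dict) -> str:
    """Resolve an issue ID: frontmatter, then regex, then filename stem."""
    if meta.get("id"):  # hypothetical frontmatter field name
        return meta["id"]
    match = ISSUE_ID.match(path.name)
    if match:
        # Assumption: old-style numbers map onto the ISU- prefix.
        return f"ISU-{int(match.group(1)):03d}"
    return path.stem


def is_article_sidecar(path: Path) -> bool:
    """Image prompt files are not articles in their own right."""
    return path.stem.endswith("-image-prompt")


print(canonical_issue_id(Path("issue-001-diagnostic-failure.md"), {}))
print(canonical_issue_id(Path("ISU-011-thunderbird-mcp-down.md"), {}))
print(is_article_sidecar(Path("ART-018-image-prompt.md")))
```

Keeping these rules in small pure functions makes each historical naming quirk testable on its own, without a workspace on disk.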
The 3ms Query
Here is the actual query time for a cold start — loading the index from disk and finding one entry:
```python
import json, time

t0 = time.perf_counter_ns()
with open("Registry/index.json") as fh:  # cold start: read and parse the index
    r = json.load(fh)
t1 = time.perf_counter_ns()

# Find IS-001: select the "Ideas" group, then scan its entries by ID
entries = [d for d in r["d"] if d["n"] == "Ideas"][0]["e"]
entry = [e for e in entries if e["i"] == "IS-001"][0]
t2 = time.perf_counter_ns()

print(f"Load: {(t1-t0)/1e6:.2f}ms, Query: {(t2-t1)/1e6:.2f}ms")
```
Results: Load: 2.8ms, Query: 0.05ms. Total: 2.85ms.
Compare that to find . -name "IS-001*", which takes ~45ms including shell startup, or to grepping MEMORY.md, which takes ~120ms for a 19KB file.
This is the difference between an architecture designed for an agent and one that happened to work for a human.
When This Pattern Breaks
The compact index architecture works up to approximately 5,000 entries. Beyond that, json.load on a file larger than ~50KB starts to compete with directory scan times, and the single-authority generator becomes a maintenance burden.
At scale, the correct evolution is:
- 10K–50K entries: Partition the index by group into separate files (index.skills.json, index.ideas.json). Load only the partitions you need.
- 50K+ entries: Replace the JSON file with a local SQLite database. The load pattern changes from json.load to sqlite3.connect, but the query pattern remains sub-millisecond.
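For the SQLite stage, the same two queries might look like the sketch below. The schema, table names, and sample rows are assumptions for illustration, and an in-memory database stands in for the on-disk file a real setup would use.

```python
import sqlite3

# In-memory database for the demo; a real setup would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    "id TEXT PRIMARY KEY, grp TEXT, title TEXT, file TEXT, status TEXT)"
)
conn.execute("CREATE TABLE tags (tag TEXT, entry_id TEXT REFERENCES entries(id))")
conn.execute("CREATE INDEX idx_tags_tag ON tags(tag)")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
    [
        ("IS-001", "Ideas", "Example idea", "obsidian/ideas/active/IS-001.md", "active"),
        ("AM-001", "Automation", "Example opportunity", "obsidian/automation-mining/AM-001.md", "open"),
    ],
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("automation", "IS-001"), ("automation", "AM-001")],
)


def find_entry(entry_id: str):
    """Primary-key lookup, the counterpart of the dictionary lookup by ID."""
    return conn.execute(
        "SELECT id, title, file FROM entries WHERE id = ?", (entry_id,)
    ).fetchone()


def find_by_tag(tag: str) -> list:
    """Indexed tag lookup, the counterpart of the tags reverse index."""
    rows = conn.execute(
        "SELECT e.id FROM entries e JOIN tags t ON t.entry_id = e.id "
        "WHERE t.tag = ? ORDER BY e.id",
        (tag,),
    )
    return [row[0] for row in rows]


print(find_entry("IS-001"))
print(find_by_tag("automation"))
```

The main change from the JSON version is that the agent's context no longer holds the index itself, only the results of each query.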
For a single-developer workspace at 1,224 files, the JSON file with a daily cron generator is the right architecture. It is simple, debuggable, and zero-dependency. Replacing it with a database would be premature optimization.
What You Could Build
The registry pattern is general. Any structured workspace — a documentation site, a codebase with multiple modules, a research vault — benefits from a pre-computed index that separates navigation from content.
The pattern:
- A generator script that reads your workspace structure once daily
- A compact output format that loads in under 10ms and costs under 500 tokens of context
- In-memory queries that never touch disk
This is not architecture. It is hygiene. The only question is why most systems don't do it — and the answer is that most systems never measured how much time they spend finding files.