
We Built an Entity Database With Zero UI — Here’s How Agents Talk to It


"Give me 10 Munich startups with less than $2M revenue."

My agent answered in 8 seconds.

Not from Google. Not from a dashboard I'd built. From a database in my basement that no human has ever logged into. No web app. No login screen. No forgot-password flow. The database doesn't speak HTTP to browsers. It speaks MCP to agents.

This is how we built it, and why zero UI was our best design decision.


The Breaking Point

Copy. Paste. Wait. Copy. Paste. Wait.

I was feeding startup names into a chat window one by one. Each name meant the same ritual: open Crunchbase. Open LinkedIn. Open the company website. Scrape. Synthesize. Watch the agent forget half the fields and ask me to clarify. Again.

Fifteen minutes per prospect.

My agent spent 90% of its time scraping and 10% thinking. I spent 100% of my time being the bottleneck.

I told my orchestrator: fix this. Don't optimize the ritual. Kill it.

Two minutes later it returned a diagnosis: "You're not prospecting wrong. You're storing wrong. You need a graph. One database that merges every source into a single queryable spine. Your agent queries the graph — it never scrapes again."

The orchestrator ran a Working Backwards exercise — an Amazon-style process starting from the end-user experience. It tore the initial concept apart:

  1. The name was wrong. "Berlin Stitcher" described nothing.
  2. The pitch fought the wrong enemy. Not competing with Crunchbase — competing with copy-paste.
  3. The persona was split. Serve me, or serve my agent? Pick one. We picked agent.
  4. The metric was vanity. "Time saved" doesn't measure replacement.
  5. The risk assessment was fiction. Two critical unknowns — dataset freshness and LLM stitching quality — weren't even priced in.
  6. Phase 2 was four workstreams. Make it one.
  7. No kill criteria. No thresholds. No escape hatch.

The orchestrator wrote a 2,500-word product critique with a kill decision, renamed the thing EntityScope, and handed a clean brief to the product planning agent.

Seven structural flaws. Found in minutes. By an agent.


The Handoff Chain

The orchestrator didn't write a single line of code. It followed our Agent Coordination Protocol (ACP) — structured handoffs via typed proposals and decisions, not free-form chat.

The chain looked like this:

  • Working Backwards agent — validated the idea, wrote a press release, identified zero kill triggers, flagged two critical risks (both testable in two hours)
  • Product critique — fixed all seven flaws, risk-adjusted ICE from 8.0 → 6.0, added four specific kill gates with hard thresholds
  • Product planner — scoped Phase 1 to one source, one spine, one test. Deferred Phase 2 to one integration (not four). Created the architecture brief.
  • Implementation brief — landed with the coding agent: DuckDB schema, ingestion pipeline, query templates, kill gates.

Each handoff was structured. Typed. Explicit. No ambiguity about who owned what decision.
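
A typed handoff of this kind might look like the sketch below. The actual ACP message format is not reproduced in this post, so the `Proposal` and `Decision` classes, their fields, and the `decide` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"
    REVISE = "revise"
    KILL = "kill"


@dataclass(frozen=True)
class Proposal:
    """What one agent hands to the next (hypothetical shape)."""
    author: str                         # agent that wrote the proposal
    owner: str                          # agent that must decide on it
    summary: str
    success_criteria: tuple[str, ...]   # explicit, checkable criteria


@dataclass(frozen=True)
class Decision:
    """The typed reply to a proposal."""
    proposal: Proposal
    decided_by: str
    verdict: Verdict
    notes: str = ""


def decide(proposal: Proposal, agent: str, verdict: Verdict, notes: str = "") -> Decision:
    # Only the named owner may decide, so ownership is never ambiguous.
    if agent != proposal.owner:
        raise PermissionError(f"{agent} does not own this decision")
    return Decision(proposal, agent, verdict, notes)


brief = Proposal(
    author="orchestrator",
    owner="product_planner",
    summary="Scope Phase 1 to one source, one spine, one test",
    success_criteria=("schema loads", "query returns structured JSON"),
)
decision = decide(brief, "product_planner", Verdict.APPROVE)
```

The point of the sketch is the ownership check: a decision from any agent other than the named owner is rejected rather than silently accepted.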

The key choice: zero UI from day one. No web forms. No dashboards. No auth flows. EntityScope is an invisible intelligence layer — my agent talks to it via MCP, and the entire interaction is structured JSON. When your consumer is an agent, a REST endpoint or MCP tool is a better interface than a web app. Zero UI isn't a shortcut. It's the correct design.
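
To make "the entire interaction is structured JSON" concrete, here is a minimal sketch of the request an agent might send. MCP carries tool invocations as JSON-RPC 2.0 messages with the method `tools/call`; the `query` argument name is an assumption, since the tool's input schema is not shown in this post.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = build_tool_call(
    1,
    "entityscope_query",
    # The argument name "query" is assumed for illustration.
    {"query": "10 Munich startups with less than $2M revenue"},
)
print(request)
```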


Qwen Code Builds the Spine

The architecture brief landed with Qwen Code — our code-generation agent on a separate machine in the home lab. The brief was specific enough for autonomous work:

  • Data spine: DuckDB with the Berlin Business Dataset (304k companies, CC0 license)
  • Entity resolution: HRB registration numbers as the primary join key across sources
  • Query layer: MCP server exposing four tools
  • Ingestion: CSV-based, with multi-source field-level provenance tracking
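
The schema from the brief is not reproduced here, but a spine with field-level provenance could be sketched roughly as follows: one entity table keyed on the HRB number, and one provenance row per field per source. The table names, column names, and sample rows are assumptions. To keep the sketch runnable without installing anything, it uses Python's built-in `sqlite3` as a stand-in for DuckDB; the DDL is plain SQL and should need little change to port.

```python
import sqlite3

SCHEMA = """
CREATE TABLE entity (
    hrb_number TEXT PRIMARY KEY,   -- join key across all sources
    name       TEXT NOT NULL,
    city       TEXT
);
CREATE TABLE field_provenance (
    hrb_number TEXT NOT NULL REFERENCES entity(hrb_number),
    field_name TEXT NOT NULL,
    value      TEXT,
    source     TEXT NOT NULL,
    confidence REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1),
    PRIMARY KEY (hrb_number, field_name, source)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up placeholder data, not taken from any real dataset.
conn.execute(
    "INSERT INTO entity VALUES (?, ?, ?)",
    ("HRB 000000", "Example GmbH", "Berlin"),
)
conn.executemany(
    "INSERT INTO field_provenance VALUES (?, ?, ?, ?, ?)",
    [
        ("HRB 000000", "name", "Example GmbH", "berlin_business_dataset", 0.9),
        ("HRB 000000", "name", "Example", "outreach_contacts", 0.6),
    ],
)

# Every source that reported the "name" field, most trusted first.
rows = conn.execute(
    "SELECT source, confidence FROM field_provenance "
    "WHERE hrb_number = ? AND field_name = ? ORDER BY confidence DESC",
    ("HRB 000000", "name"),
).fetchall()
```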

Qwen built the first working version in a single session. DuckDB came up. The MCP tools registered. The health check returned green.

Then came the moment that would either kill the project or validate it.


The Climax: The Agent Ate a Dataset It Had Never Seen

EntityScope's spine relies on the Berlin Business Dataset: 304k companies, free, public. But we needed a second source: EntityScope's outreach contacts, 394 curated Berlin and Munich startup profiles built through weeks of manual and automated research.

I asked my orchestrator: find open data for Munich startups. Import it.

Here's what happened next.

The agent found a dataset online. It analyzed the schema cold — columns it had never seen, formats it hadn't been trained on. It inferred field mappings. It decided where each column belonged. It resolved duplicates against the 394 existing records. It committed the new entities.

Two sequential turns.

No manual ETL. No data entry. No human opened a CSV. The agent navigated an unfamiliar schema and mapped it to a data model it understood from the architecture brief — because the brief was that clear and the protocol that structured.
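
The agent inferred the mappings itself, and its reasoning is not captured in this post. As a deterministic stand-in for the same two steps (map unfamiliar columns to known fields, then skip records that already exist), a sketch could look like this. The alias table, the legal-suffix list, and every function name are assumptions for illustration, not the agent's method.

```python
import re

# Hypothetical aliases: target field -> column names a source might use.
ALIASES = {
    "name": {"name", "company", "company_name", "firma"},
    "city": {"city", "location", "ort", "stadt"},
    "website": {"website", "url", "homepage"},
}


def map_columns(columns):
    """Return {source_column: target_field} for every recognized column."""
    mapping = {}
    for col in columns:
        key = col.strip().lower().replace(" ", "_")
        for target, names in ALIASES.items():
            if key in names:
                mapping[col] = target
    return mapping


def normalize_name(name):
    """Lowercase, drop common legal suffixes and punctuation for matching."""
    cleaned = re.sub(r"\b(gmbh|ug|ag|se|kg)\b", "", name.lower())
    return re.sub(r"[^a-z0-9]+", " ", cleaned).strip()


def import_rows(rows, existing_names):
    """Map rows to target fields and skip names already in the spine."""
    if not rows:
        return []
    seen = {normalize_name(n) for n in existing_names}
    mapping = map_columns(rows[0].keys())
    imported = []
    for row in rows:
        record = {mapping[c]: v for c, v in row.items() if c in mapping}
        key = normalize_name(record.get("name", ""))
        if key and key not in seen:
            seen.add(key)
            imported.append(record)
    return imported


new_rows = [
    {"Company Name": "Acme Robotics GmbH", "Ort": "München", "Founded": "2021"},
    {"Company Name": "Beta Labs UG", "Ort": "München", "Founded": "2020"},
]
result = import_rows(new_rows, existing_names=["ACME Robotics"])
```

In this example the first row is dropped as a duplicate of an existing record, the second is imported, and the unrecognized `Founded` column is ignored rather than guessed at.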

EntityScope now holds 394 outreach contacts, 1,000 Berlin business records, 1,010 legal entities, and 8,018 field-level provenance entries — all from public or internally curated data. Every record traces back to its source with a confidence score.
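
One way to turn many provenance entries into a single profile is to keep every claim and surface the highest-confidence value per field, together with its source. The sketch below assumes that rule; the `FieldClaim` class, the `resolve` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldClaim:
    field: str
    value: str
    source: str
    confidence: float  # 0.0 to 1.0


def resolve(claims):
    """Pick the highest-confidence claim per field, keeping its source."""
    best = {}
    for claim in claims:
        current = best.get(claim.field)
        if current is None or claim.confidence > current.confidence:
            best[claim.field] = claim
    return {
        name: {"value": c.value, "source": c.source, "confidence": c.confidence}
        for name, c in best.items()
    }


# Made-up claims for illustration only.
claims = [
    FieldClaim("city", "Berlin", "berlin_business_dataset", 0.9),
    FieldClaim("city", "Munich", "outreach_contacts", 0.5),
    FieldClaim("website", "https://example.com", "outreach_contacts", 0.7),
]
profile = resolve(claims)
```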

The agent didn't just query the database. It grew it.


What We Learned

1. Handoff quality determines everything.

The orchestrator → product planner → Qwen Code pipeline worked because each handoff was structured: typed proposals, explicit decisions, clear success criteria. In our experience, agents that communicate through structured protocols instead of free-form natural language make fewer handoff errors. A messy handoff between smart agents produces worse output than a clean handoff between average ones.

2. Kill criteria aren't pessimism — they're leverage.

We defined four conditions that would kill EntityScope before Phase 2. Knowing these early meant we tested the riskiest assumptions — dataset freshness, LLM stitching quality — in two hours instead of discovering them two weeks in. Kill criteria let you fail fast without failing chaotically.
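
A kill gate with a hard threshold is easy to express as data. The sketch below covers only the two risks named above; the metric names and threshold values are placeholders, not the gates we actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillGate:
    name: str
    metric: str
    threshold: float
    higher_is_better: bool = True

    def passes(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.threshold
        return measured <= self.threshold


# Placeholder thresholds for illustration only.
GATES = [
    KillGate("stitching quality", "match_precision", 0.90),
    KillGate("dataset freshness", "median_record_age_days", 365, higher_is_better=False),
]


def evaluate(gates, measurements):
    """Return the names of gates that failed; any failure means stop."""
    return [g.name for g in gates if not g.passes(measurements[g.metric])]


failed = evaluate(GATES, {"match_precision": 0.93, "median_record_age_days": 500})
```

Writing the gates down before measuring anything is what makes them leverage: the threshold cannot drift to fit the result.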

3. Zero UI is a feature, not a limitation.

When the consumer is an agent, an MCP tool is a better interface than a web app. No authentication flows. No frontend maintenance. No design debt. We never built a login screen — and never will. The interface is the protocol.

4. Open data is shockingly good.

The Berlin Business Dataset, CORDIS grants, and public company registries cover 80% of our use cases. The remaining 20% comes from curated internal data. The spine is free, and the agent knows how to find and ingest it. For the open-data spine, the acquisition cost dropped to zero.

5. Multi-model orchestration works when each model does what it's best at.

The orchestrator handled discovery and critique. The product planner handled scoping and risk. Qwen Code handled implementation. Each played to its strengths. This isn't about using more models — it's about using the right model for the right decision.


The Architecture (for the Technical Readers)

If you're building something similar:

[Figure: architecture diagram showing Orchestrator, Orch Instance, and Qwen Code with MCP/SSH and ACP communication layers]

The orchestrator on my workstation talks to the orch instance at 192.X.X.X. EntityScope runs DuckDB with an MCP server layer. The coding agent (Qwen Code) lives on the same instance and receives architecture briefs through the Agent Coordination Protocol.

EntityScope exposes four MCP tools:

  • entityscope_query — Natural-language query → structured results
  • entityscope_get_entity — Full profile by HRB registration ID
  • entityscope_list_sources — Data source freshness and provenance
  • entityscope_health — Server status and ingestion health
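
On the server side, tools like these amount to a name-to-handler table that always answers in JSON. The sketch below is plain Python rather than the MCP SDK, wires up only two of the four tools, and uses placeholder handlers; the `hrb_id` argument name and the response shapes are assumptions.

```python
import json


def health(_arguments):
    # Placeholder: a real handler would check the database and ingestion jobs.
    return {"status": "ok"}


def get_entity(arguments):
    # Placeholder lookup; a real handler would query the database.
    return {"hrb_id": arguments["hrb_id"], "found": False}


TOOLS = {
    "entityscope_health": health,
    "entityscope_get_entity": get_entity,
}


def call_tool(name, arguments):
    """Dispatch a tool call and always return JSON, including for errors."""
    handler = TOOLS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps({"result": handler(arguments)})
    except KeyError as exc:
        return json.dumps({"error": f"missing argument: {exc.args[0]}"})


print(call_tool("entityscope_health", {}))
```

Returning errors as structured JSON matters more for an agent than for a human: the caller can branch on an `error` key instead of parsing a stack trace.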

No user accounts. No rate limits. No paywalls. Just structured data, accessible to my agent, on my hardware.


The Result

EntityScope went from idea to working prototype in under 24 hours across two machines.

The same query that once required five open tabs and fifteen minutes — "show me all Berlin AI startups with EU funding, under 50 employees, with a CORDIS grant" — now takes 8 seconds and returns structured JSON with per-field source confidence.

The orchestrator found open data online, and the agent imported it autonomously — navigating an unfamiliar schema, mapping fields, resolving duplicates, committing records. All without a single manual ETL step.

Here's what I told my agent today: "Give me 10 Munich startups with less than $2M revenue."

It answered in seconds. From a database it had never queried before. On a server across the network. Using a tool it discovered at runtime.

That's the invisible intelligence layer. And it works.


Why This Matters

We're entering an era where the primary consumer of your data product might not be a human.

When that happens, everything changes. The interface becomes a protocol. The user experience becomes structured JSON. The "product" becomes a conversation between agents. The teams that figure out how to design those handoffs — how to let each agent play to its strengths — will build infrastructure that feels like magic.

EntityScope isn't a database. It's a pattern.

If you're building agent-native infrastructure, I'd love to hear your approach. The hardest problem isn't the data or the code — it's designing the handoffs between agents so that the output of one becomes the input of the next without friction.

I'm documenting this architecture and protocol openly. Follow for more on agent orchestration, ACP patterns, and building for an agent-first future.


Why this matters for your stack: The next time you build a data product, ask who the consumer is. If the answer is "an agent," don't build a login screen. Build a protocol.

Working on a similar problem? Let's talk about how I can help your team.
