How My Orchestrator Hands Off Coding Tasks to Qwen — The Agent Coordination Protocol That Makes It Work
I needed a database in my basement. No UI. No login screen. No human would ever touch it.
I didn't write it. My orchestrator didn't write it either. A coding agent — Qwen Code on a different machine — built it in one session.
The orchestrator knew what I needed. Qwen Code knew how to build it. The only thing between them was a three-message protocol.
Here's how it works.
The Problem With Single-Agent Systems
When you start building with AI, the instinct is obvious: one agent, all the tools, let it figure everything out.
It works. Until it doesn't.
The breaking point hits when a task spans skill domains. Discovery, planning, implementation, testing — each needs different reasoning. Your general-purpose orchestrator is terrible at writing DuckDB schemas. It's burning context on code when it should be thinking about product. You try running it on a different machine and everything falls apart.
I hit all of these at once on EntityScope. My orchestrator excelled at discovery and product thinking. It was awful at implementation. The fix wasn't better tools. It was to stop asking it to code.
What Is ACP?
ACP, short for Agent Coordination Protocol (my name for it), is a structured handoff protocol between agents. It is a pattern, not a library. My version is lightweight: typed proposals, explicit decisions, no free-form chat.
┌────────────────┐
│  Orchestrator  │  Has the big picture. Knows what needs to happen.
│ (General LLM)  │  Does NOT write code.
└────────┬───────┘
         │  1. ACP Proposal: "Here's what needs built, here's the brief"
         ▼
┌────────────────┐
│  Coding Agent  │  Receives a structured brief. Implements autonomously.
│  (Qwen Code)   │  Returns: "Done. Here's the output. Here's the status."
└────────┬───────┘
         │  2. ACP Decision: "Approved. Deploy it."
         ▼
┌────────────────┐
│  Orchestrator  │  Verifies. Deploys. Reports back to me.
└────────────────┘
The orchestrator doesn't execute. It directs.
The ACP Handoff Structure
Every handoff follows three stages. No improvisation.
The Proposal (Orchestrator → Specialist):
- Objective — One sentence. What are we building?
- Constraints — Non-negotiables: language, framework, deployment target, performance requirements
- Input — Where to find the data, schemas, or context
- Success criteria — Measurable. Testable.
- Kill gates — Code-level conditions that stop work and escalate back
The Response (Specialist → Orchestrator):
- Plan — What they'll build, in what order
- Risk assessment — What might go wrong
- Estimated effort — How long it'll take
- Dependencies — What they need from the orchestrator or infrastructure
The Decision (Orchestrator):
- Approve — Go ahead
- Revise — Change these things, resubmit
- Escalate — I need to ask the human
- Kill — Stop. The project is cancelled.
That's it. What makes this powerful isn't complexity. It's that both sides agree to the same structure every time.
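The three stages above can be sketched as typed messages. This is a minimal illustration, not the author's implementation: the class names, field names, and the to_wire helper are all assumptions chosen to mirror the bullet lists.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Verdict(str, Enum):
    """The four outcomes a decision may carry (names are illustrative)."""
    APPROVE = "approve"
    REVISE = "revise"
    ESCALATE = "escalate"
    KILL = "kill"


@dataclass(frozen=True)
class Proposal:
    objective: str               # one sentence
    constraints: list[str]       # non-negotiables
    inputs: list[str]            # where data, schemas, or context live
    success_criteria: list[str]  # measurable, testable
    kill_gates: list[str]        # conditions that stop work and escalate


@dataclass(frozen=True)
class Response:
    plan: list[str]
    risks: list[str]
    estimated_effort: str
    dependencies: list[str]


@dataclass(frozen=True)
class Decision:
    verdict: Verdict
    revisions: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A revise verdict is only useful if it says what to change.
        if self.verdict is Verdict.REVISE and not self.revisions:
            raise ValueError("a REVISE decision must list what to change")


def to_wire(message) -> str:
    """Serialize any of the three messages to JSON for transport."""
    return json.dumps(asdict(message), sort_keys=True)
```

Because every message is a fixed shape, either side can reject a malformed handoff before any work starts, for example a Decision(Verdict.REVISE) with no revision list.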
Why Not Just Chat?
The obvious alternative: let them talk. "Hey, can you build the DuckDB thing?" — "Sure, let me look at it."
I tried this. It fails predictably.
Ambiguity compounds every turn. "Build the database" on turn 1 becomes "which database?" on turn 3, "oh, you meant schema-first?" on turn 5, "wait, was this Postgres?" on turn 8. Every sentence plays telephone with the last. I've watched agents spend twenty messages clarifying what "load the data" means while the actual problem sat untouched.
Role drift erodes the architecture. The coder starts asking strategic questions. The orchestrator suggests implementation details. Within twenty turns they've swapped roles — the coder makes product decisions and the orchestrator debugs imports. Two agents, blurred lines, zero accountability. You end up with a database schema designed by someone who can't write SQL and deployment scripts written by someone who doesn't know the product.
Context is inseparable from noise. A fifty-turn conversation about EntityScope's data model tangles deployment concerns, casual remarks, and half-remembered assumptions from three sessions ago. Every handoff to a fresh agent requires reconstructing the thread from a wall of dialogue. The critical constraint you mentioned in message 12? Buried under seventeen irrelevant follow-ups.
There's no kill switch. Without explicit kill gates, agents iterate endlessly. I've watched one spend 45 minutes on a problem a human could solve in 30 seconds — because nothing told it to stop. No circuit breaker. No "this isn't working, escalate." Just polite, persistent, expensive failure.
Structured handoffs solve all of this:
- The proposal is a document, not a chat history. Every fact the specialist needs is in one place, on arrival.
- The decision is one of four explicit verdicts (approve, revise, escalate, kill), not a paragraph to interpret.
- Kill gates are code — executable conditions checked on every status update. If condition X is met, the agent escalates. No heuristics.
Natural-language delegation is a verbal agreement. ACP is a contract with exit clauses. Both communicate intent. Only one survives contact with reality.
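To make "not a paragraph to interpret" concrete, here is a small sketch of a strict decision parser. The wire format it accepts (a bare verdict, or "REVISE: item; item") is an assumption made for illustration, and parse_decision is a hypothetical helper, not part of any library.

```python
VALID_VERDICTS = {"approve", "revise", "escalate", "kill"}


def parse_decision(raw: str) -> tuple[str, list[str]]:
    """Parse 'VERDICT' or 'REVISE: change; change' into (verdict, revisions).

    Anything that is not one of the four verdicts is rejected outright
    instead of being interpreted.
    """
    head, _, tail = raw.strip().partition(":")
    verdict = head.strip().lower()
    if verdict not in VALID_VERDICTS:
        raise ValueError(f"not a recognised verdict: {raw!r}")

    revisions = [item.strip() for item in tail.split(";") if item.strip()]
    if verdict == "revise" and not revisions:
        raise ValueError("revise needs at least one requested change")
    if verdict != "revise" and revisions:
        raise ValueError(f"{verdict} takes no revision list")
    return verdict, revisions
```

A reply such as "Sure, looks good to me" raises ValueError here, which is the point: the specialist never has to guess whether it was approved.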
The Infrastructure: SSH + MCP + the Orchestrator Instance
The protocol is the contract. Here's the transport stack:
The orchestrator lives on my workstation. The coding agent runs on a separate machine at 192.X.X.X. They communicate via three layers:
- SSH for transport — file transfer and command execution
- MCP (Model Context Protocol) for tool access — the coding agent exposes MCP tools the orchestrator can invoke
- ACP for task handoffs — structured proposals and decisions ride over SSH/MCP
The coding agent needs no internet access. Just SSH to the project machine and a clear brief.
This separation is deliberate:
- Security: The coding agent touches only what's on the orchestrator instance, with zero access to my workstation
- Resource isolation: Heavy coding workloads don't compete with the orchestrator's reasoning context
- Session independence: The coding agent runs for hours. The orchestrator checks in, reviews, disconnects
The EntityScope Example: ACP in Action
Here's what the EntityScope handoff looked like:
Proposal (Orchestrator → Qwen Code):
OBJECTIVE:   Build DuckDB spine + MCP server for entity data
CONSTRAINTS: Python 3.11+, DuckDB, MCP SDK via pip
INPUT:       Berlin Business Dataset CSV at ~/data/berlin_businesses.csv
             Provisional schema at PP-003/03-discovery.md
SUCCESS:     1. entityscope_query returns results from DuckDB
             2. entityscope_health returns green
             3. All 4 MCP tools register and respond
KILL:        >2 hours of wall time without a working health check
             Dataset has <50K usable records after dedup
Response (Qwen Code → Orchestrator):
PLAN:   1. Define schema and load CSV
        2. Build MCP server with 4 tools
        3. Test each tool
RISK:   CSV encoding issues (resolved: forced UTF-8)
        MCP SDK version mismatch (resolved: pinned to 0.4.1)
EFFORT: ~45 minutes
DEPS:   Nothing — all data is local
Decision (Orchestrator):
APPROVED — proceed
Three structured exchanges. The coding agent built EntityScope's entire spine — DuckDB schema, MCP server with four tools, data ingestion — in one session.
The same pattern works for any coding task. The orchestrator doesn't need DuckDB internals. It defines the problem, sets constraints, and knows when to escalate.
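A brief in the KEY: value layout shown above is easy to turn into structured data. The following is a rough sketch, assuming that a field starts with an all-caps label followed by a colon and that any other non-empty line continues the previous field. parse_brief is a hypothetical helper written for this example.

```python
import re

# A field starts with an all-caps label and a colon, e.g. "KILL: ...".
FIELD_RE = re.compile(r"^([A-Z]+):\s*(.*)$")
# Leading list numbering such as "1. " is dropped from each entry.
NUMBERING_RE = re.compile(r"^\d+\.\s*")


def parse_brief(text: str) -> dict[str, list[str]]:
    """Split a plain-text brief into {FIELD: [entries]}."""
    fields: dict[str, list[str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            fields[current] = []
            line = match.group(2)
        elif current is None:
            raise ValueError(f"text before the first field: {line!r}")
        if line:
            fields[current].append(NUMBERING_RE.sub("", line))
    return fields


brief = """\
OBJECTIVE: Build DuckDB spine + MCP server for entity data
SUCCESS: 1. entityscope_query returns results from DuckDB
2. entityscope_health returns green
KILL: >2 hours of wall time without a working health check
"""
parsed = parse_brief(brief)
print(parsed["SUCCESS"])
```

Once the brief is a dictionary, the orchestrator can check that every required field is present before sending it.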
Why This Setup Wins
1. Each model plays to its strengths.
The orchestrator handles: discovery, product thinking, risk assessment, context management, human communication. The coding agent handles: implementation, debugging, testing, deployment. Neither tries to do the other's job.
2. Swap models without swapping architecture.
Today it's Qwen Code. Tomorrow it could be Claude Code or GPT-5 Codex. The ACP proposal stays the same. The infrastructure stays the same. Only the agent changes.
3. Long-running tasks don't block the orchestrator.
The orchestrator submits a proposal, approves the plan, and returns to other work. It checks in periodically. Its context stays free of database schema details, leaving room for higher-level thinking.
4. One document, everything the coding agent needs.
The proposal is a complete brief. No hunting through conversation history. Every dependency is in the first message. This sharply reduces errors caused by missing context.
5. Kill gates are code, checked on every status update.
The hard limit is two hours: if the coding agent can't produce a working health check, it escalates. This isn't a guideline — it's an executable condition, evaluated like an assertion on every status check. No infinite loops. No burning $50 on a lost cause. Kill gates are the difference between a system you trust unsupervised and one you babysit.
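Here is one way the two EntityScope kill gates could look as executable conditions. It is a sketch under assumptions: the Status fields, the KillGate class, and check_gates are invented for illustration, and only the thresholds (two hours, 50K records) come from the brief.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Status:
    """Snapshot the coding agent reports on each check-in (assumed shape)."""
    elapsed_seconds: float
    health_check_passing: bool
    usable_records: Optional[int]  # None until dedup has run


@dataclass(frozen=True)
class KillGate:
    name: str
    tripped: Callable[[Status], bool]


GATES = [
    KillGate(
        "over 2 hours of wall time without a working health check",
        lambda s: s.elapsed_seconds > 2 * 3600 and not s.health_check_passing,
    ),
    KillGate(
        "fewer than 50K usable records after dedup",
        lambda s: s.usable_records is not None and s.usable_records < 50_000,
    ),
]


def check_gates(status: Status, gates: list[KillGate] = GATES) -> list[str]:
    """Return the names of every tripped gate; an empty list means keep going."""
    return [gate.name for gate in gates if gate.tripped(status)]
```

The caller escalates as soon as check_gates returns anything, so the stop condition never depends on the agent's own judgment about whether it is making progress.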
How to Build This Yourself
Five steps:
1. Get a dedicated coding machine.
A cheap VPS. A home server. A spare laptop. If it runs Python and accepts SSH, you're set.
2. Install an MCP server.
Use the MCP Python SDK (the mcp package on PyPI) to expose tools: deploy_task, read_file, write_file, run_command. The coding agent gets an MCP server that lets it read, write, and execute on the remote machine.
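As a starting point, the file and command tools can be written as plain functions confined to one workspace directory. This sketch deliberately leaves out registration with the MCP SDK, since that depends on the SDK version you install, and it omits deploy_task because the article does not say what that tool does. make_tools and the workspace layout are assumptions.

```python
import subprocess
from pathlib import Path


def make_tools(workspace: Path) -> dict:
    """Build read/write/run helpers that cannot leave `workspace`."""
    root = workspace.resolve()

    def _inside(relative: str) -> Path:
        # Resolve symlinks and ".." first, then refuse anything outside root.
        target = (root / relative).resolve()
        if not target.is_relative_to(root):
            raise PermissionError(f"{relative!r} is outside the workspace")
        return target

    def read_file(path: str) -> str:
        return _inside(path).read_text(encoding="utf-8")

    def write_file(path: str, content: str) -> int:
        target = _inside(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        return target.write_text(content, encoding="utf-8")

    def run_command(argv: list[str], timeout: float = 60.0) -> dict:
        # argv is a list, so no shell is involved and nothing is interpolated.
        done = subprocess.run(
            argv, cwd=root, capture_output=True, text=True, timeout=timeout
        )
        return {
            "returncode": done.returncode,
            "stdout": done.stdout,
            "stderr": done.stderr,
        }

    return {
        "read_file": read_file,
        "write_file": write_file,
        "run_command": run_command,
    }
```

Each function can then be registered as a tool with whichever MCP server library you use. Note that run_command only sets the working directory; it does not sandbox what the command itself can reach.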
3. Define your handoff protocol.
Start simple. JSON schema. Three fields per stage. Clarity beats completeness — add fields as you find gaps.
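A "three fields per stage" starting point might look like the sketch below. The specific field names and the stage/body envelope are assumptions picked for illustration; swap in whichever three matter most for your tasks.

```python
import json

# Hypothetical starter schema: three required fields per stage.
REQUIRED = {
    "proposal": {"objective", "constraints", "success_criteria"},
    "response": {"plan", "risks", "estimated_effort"},
    "decision": {"verdict", "revisions", "reason"},
}


def validate_message(raw: str) -> dict:
    """Parse a JSON handoff and check it has exactly the agreed fields."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("handoff must be a JSON object")

    stage = message.get("stage")
    if stage not in REQUIRED:
        raise ValueError(f"unknown stage: {stage!r}")

    body = message.get("body")
    if not isinstance(body, dict):
        raise ValueError("body must be a JSON object")

    missing = REQUIRED[stage] - set(body)
    unexpected = set(body) - REQUIRED[stage]
    if missing or unexpected:
        raise ValueError(
            f"{stage}: missing {sorted(missing)}, unexpected {sorted(unexpected)}"
        )
    return message
```

Rejecting unexpected keys keeps the format from drifting back toward free-form chat. When you find a real gap, add the field to REQUIRED so both sides change together.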
4. Wire SSH key-based auth.
No passwords. The orchestrator should be able to run ssh user@host "ls" in one step, with no password prompt. Fix this first. Everything else depends on it.
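A quick way to confirm key-based auth works is to run that command with OpenSSH's BatchMode option, which disables password prompts so a missing key fails fast instead of hanging. The sketch below assumes an OpenSSH client on the PATH; the function names and timeout values are illustrative.

```python
import subprocess


def ssh_check_command(user: str, host: str) -> list[str]:
    """Build a non-interactive `ssh user@host ls` invocation."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt for a password
        "-o", "ConnectTimeout=5",   # give up quickly on an unreachable host
        f"{user}@{host}",
        "ls",
    ]


def key_auth_works(user: str, host: str) -> bool:
    """Return True only if the remote command runs without any prompt."""
    try:
        done = subprocess.run(
            ssh_check_command(user, host),
            capture_output=True,
            text=True,
            timeout=15,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # No ssh client installed, or the connection hung.
        return False
    return done.returncode == 0
```

If key_auth_works returns False, fix the key setup before wiring up anything else, since every later step rides on this connection.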
5. Start with one specialist.
Pick one task — coding, data analysis, research — and set up the ACP handoff for just that. Prove it works. Then expand.
You could have step 1 done tonight.
The Result
EntityScope was built by two agents on two machines, communicating through a structured protocol I designed in a single afternoon.
The orchestrator never wrote SQL. Qwen Code never made a product decision. The handoff was three messages. The database has been running for weeks, serving answers in eight seconds, with zero human intervention.
This is the architecture that works:
- One orchestrator that understands the big picture but doesn't implement
- Specialist agents that implement without needing the big picture
- A structured protocol that makes the handoff unambiguous
- Separate infrastructure that isolates risk and frees up resources
The hardest lesson in building agent systems: don't make one agent do everything. The easiest fix is a protocol.
If you're building multi-agent systems and struggling with handoff quality, I'd love to hear what patterns you've tried. The protocol space is wide open. Every real deployment teaches us something new.
I'm documenting my architecture openly. Follow for patterns on ACP, MCP infrastructure, and agent-native systems.