How to design subagent orchestrations
⚠️ Experimental feature: Running custom agents as subagents is currently experimental. The
`agents` frontmatter property and custom agent subagent invocation may change in future releases.
Subagents are the execution mechanism that makes orchestration practical. Where article 10 teaches you how to design orchestrators—decomposing tasks into phases, defining specialist roles, planning execution flow—this article teaches you how to implement the delegation itself using VS Code’s subagent capabilities.
A subagent is an independent AI agent that performs focused work in its own isolated context window and returns a summary to the main agent. The main agent stays in control, receiving condensed results without the noise of the subagent’s intermediate processing. This isolation is what makes orchestration scalable—each specialist can think deeply about its subtask without bloating the coordinator’s context.
Table of contents
- 🎯 Understanding subagent execution
- 📋 The `runSubagent` tool
- 🔧 Custom agents as subagents
- 🔒 Controlling agent visibility and invocation
- 🎯 Restricting subagent access with `agents`
- 🔀 Orchestration patterns
- ⚡ Parallel execution
- 💰 Token implications
- 🚫 Common pitfalls
- 🎯 Conclusion
- 📚 References
🎯 Understanding subagent execution
How subagents work
When a main agent spawns a subagent, VS Code creates a new execution context:
Main Agent (your orchestrator)
│
├─ Sends task prompt ──→ Subagent
│ │
│ ├─ Gets its own context window
│ ├─ Runs tools autonomously
│ ├─ Processes the subtask
│ │
│ ←── Returns summary ──┘
│
├─ Incorporates result
└─ Continues with next step
Key characteristics
| Aspect | Behavior |
|---|---|
| Context isolation | Each subagent starts with a clean context window—no inherited conversation history |
| Synchronous execution | The main agent waits for subagent results before continuing |
| Parallel support | VS Code can spawn multiple subagents concurrently when tasks are independent |
| Tool inheritance | By default, subagents inherit the main session’s model and tools |
| Custom agent override | When using a custom agent as subagent, the custom agent’s tools, model, and instructions override defaults |
| Result format | Only the subagent’s final result returns to the main agent—not intermediate tool calls |
What the user sees
Subagent execution appears in the chat as a collapsible tool call. By default, it’s collapsed, showing just the agent name and the currently running tool. Users can expand it to see the full details: all tool calls, the prompt passed to the subagent, and the returned result.
This design gives your orchestration transparency without overwhelming the main conversation. Users see that delegation happened and can inspect details on demand.
Subagents vs. new sessions
Subagents maintain a relationship with the parent agent—they do focused work and report back. Creating a new chat session, by contrast, creates an entirely separate conversation with no connection to the current task.
| Aspect | Subagent | New session |
|---|---|---|
| Relationship to parent | Child—reports results back | Independent—no connection |
| Context | Isolated but linked via task prompt | Completely separate |
| Coordination | Main agent synthesizes results | Manual user coordination |
| Use case | Delegating part of a larger task | Starting unrelated work |
📋 The `runSubagent` tool
Enabling subagent support
To allow an agent to delegate work through subagents, include the agent or runSubagent tool in the agent’s tool list:
---
name: My Orchestrator
tools: ['agent', 'read', 'search', 'edit']
---

The `agent` tool is an alias for `runSubagent`—both enable subagent capabilities.
How invocation works
Subagents are typically agent-initiated, not directly invoked by users. The pattern:
- You (or your agent’s instructions) describe a complex task
- The main agent recognizes which parts benefit from isolated context
- The agent spawns a subagent with a focused task prompt
- The subagent works autonomously and returns a summary
- The main agent incorporates the result and continues
You don’t need to explicitly type “run a subagent” in every prompt. The main agent decides when delegation helps, based on its instructions and the task at hand. However, you can guide this behavior by structuring your prompts to suggest isolation:
Implicit delegation hints:
Research the authentication patterns used in this project,
then implement the OAuth 2.0 migration based on your findings.
Explicit delegation instructions:
Run a subagent to research authentication patterns. Then, using
only the research summary, implement the OAuth 2.0 migration.
For consistent delegation behavior, define when to use subagents in your custom agent’s instructions rather than relying on per-prompt hints.
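A delegation policy like this can be sketched directly in an orchestrator's agent file; the agent name and thresholds below are hypothetical, not taken from the official docs:

```markdown
---
name: Context-Aware Orchestrator
tools: ['agent', 'read', 'search', 'edit']
---
Delegation policy:
1. Spawn a subagent for any research task that touches more than a few
   files, and keep only the returned summary in your own context.
2. Handle quick, single-file lookups yourself with the read tool.
3. Before acting on a subagent result, verify it actually answers the
   task prompt you sent.
```

Because the policy lives in the agent definition, every conversation with this orchestrator delegates consistently, without per-prompt hints.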
Invoking subagents from prompt files
To use subagents in a prompt file, include the agent tool in the frontmatter:
---
name: document-feature
tools: ['agent', 'read', 'search', 'edit']
---
Run a subagent to research the new feature implementation details
and return only information relevant for user documentation.
Then update the docs/ folder with the new documentation.

The prompt instructions can hint at delegation by suggesting isolated research or parallel analysis for specific subtasks.
🔧 Custom agents as subagents
Why use custom agents as subagents?
By default, a subagent inherits the main session’s model and tools. It starts with a clean context window but doesn’t have specialized behavior.
Running a custom agent as a subagent unlocks specialization. The custom agent’s tools, model, and instructions override the session defaults. This means:
- A research subagent can use read-only tools (can’t accidentally edit files)
- A performance analysis subagent can use a specific MCP server
- An implementation subagent can use a faster, cheaper model
- Each specialist follows its own behavioral instructions
Example: research with a custom agent
# .github/agents/researcher.agent.md
---
name: Researcher
user-invokable: false
tools: ['read', 'search', 'fetch', 'grep']
model: Claude Sonnet 4.5 (copilot)
---
You are a codebase researcher. When given a research task:
1. Search broadly first to understand the scope
2. Read specific files for detailed understanding
3. Return a structured summary with:
- Files analyzed (paths)
- Patterns found
- Dependencies identified
- Risks or concerns
- Recommendations
Keep your summary concise. The orchestrator needs decisions, not narration.

When the orchestrator runs this agent as a subagent, the researcher operates with read-only tools and its specialized instructions—exactly the behavior you designed.
Prompting for custom agent invocation
To run a specific custom agent as a subagent, reference it in your prompt:
Use the Researcher agent to analyze the current auth implementation.
Or in an orchestrator’s instructions:
For each feature request:
1. Run the Researcher agent as a subagent to gather context
2. Review the research summary
3. Run the Builder agent as a subagent to implement changes
4. Run the Reviewer agent as a subagent to check quality

🔒 Controlling agent visibility and invocation
Two frontmatter properties give you independent control over how agents can be accessed:
user-invokable
Controls whether the agent appears in the agents dropdown in the chat UI.
| Value | Behavior |
|---|---|
| `true` (default) | Agent appears in the dropdown—users can select it directly |
| `false` | Agent is hidden from the dropdown—only accessible as a subagent or programmatically |
When to set user-invokable: false:
- Specialist agents that only make sense within an orchestration workflow
- Internal helper agents that shouldn’t be invoked standalone
- Agents with narrow focus that would confuse users if selected directly
---
name: Internal Validator
user-invokable: false
tools: ['read', 'run_in_terminal']
---
Run validation checks on the specified files. Return a pass/fail
summary with details for any failures.

disable-model-invocation
Controls whether other agents can automatically invoke this agent as a subagent.
| Value | Behavior |
|---|---|
| `false` (default) | Other agents can invoke this agent as a subagent |
| `true` | This agent can only be invoked explicitly by users, not by other agents |
When to set disable-model-invocation: true:
- Agents with destructive capabilities that shouldn’t be auto-delegated
- Top-level orchestrators that should only be triggered by humans
- Agents requiring human judgment to invoke appropriately
---
name: Production Deployer
disable-model-invocation: true
tools: ['run_in_terminal', 'read']
---
Deploy the application to production. This agent requires explicit
human invocation due to the irreversible nature of deployments.

Combining the properties
The two properties create four visibility configurations:
| user-invokable | disable-model-invocation | Result |
|---|---|---|
| `true` | `false` | Full access — user can select it AND agents can auto-invoke it |
| `true` | `true` | User-only — visible in dropdown but protected from auto-invocation |
| `false` | `false` | Subagent-only — hidden from dropdown but available for orchestration |
| `false` | `true` | Locked — neither visible nor auto-invokable (must be explicitly allowed via the `agents` array) |
Note: The `infer` property is deprecated. It previously combined both controls—`infer: true` made the agent both visible and auto-invokable, and `infer: false` hid it from both. Use the two new properties for granular control.
🎯 Restricting subagent access with `agents`
The problem of unrestricted delegation
By default, all custom agents that don’t have disable-model-invocation: true are available as subagents. This means your orchestrator might accidentally invoke an unintended agent if names or descriptions are similar.
For example, a general-purpose “Code Review” agent might get invoked instead of your specialized “Security Review” agent for an orchestration that specifically needs security-focused analysis.
The agents property
The agents frontmatter property restricts which custom agents an orchestrator can invoke as subagents:
---
name: TDD Coordinator
tools: ['agent']
agents: ['Red', 'Green', 'Refactor']
---
Implement features using test-driven development:
1. Use the Red agent to write failing tests
2. Use the Green agent to implement code to pass the tests
3. Use the Refactor agent to improve code quality

Accepted values
| Value | Meaning |
|---|---|
| `['Agent1', 'Agent2']` | Only these specific agents can be used as subagents |
| `*` | All available agents can be used (default behavior) |
| `[]` | No subagent use allowed |
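As a sketch, a leaf specialist that should never delegate further can combine an empty `agents` array with a narrow tool list (the agent name here is hypothetical):

```markdown
---
name: Lint Fixer
user-invokable: false
tools: ['read', 'edit']
agents: []
---
Fix the lint violations listed in the task prompt and return a short
summary of the files changed. Do not delegate to other agents.
```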
Override behavior
Listing an agent in the agents array overrides disable-model-invocation: true. This means you can create agents that are:
- Protected from general subagent use (`disable-model-invocation: true`)
- But still accessible to specific coordinator agents that explicitly allow them
This pattern is useful for sensitive or destructive agents that should only be invoked by trusted orchestrators:
# The deployer is protected from casual invocation
---
name: Deployer
user-invokable: false
disable-model-invocation: true
tools: ['run_in_terminal', 'read']
---
Deploy changes to staging environment.

# But the release coordinator can explicitly invoke it
---
name: Release Coordinator
tools: ['agent', 'read']
agents: ['Tester', 'Deployer', 'Notifier']
---
Coordinate the release process:
1. Run the Tester to verify all tests pass
2. Run the Deployer to push to staging
3. Run the Notifier to alert the team

🔀 Orchestration patterns
Coordinator and worker
The foundational pattern. A coordinator agent manages the overall task and delegates subtasks to specialized workers:
---
name: Feature Builder
tools: ['agent', 'edit', 'search', 'read']
agents: ['Planner', 'Plan Architect', 'Implementer', 'Reviewer']
---
You are a feature development coordinator. For each feature request:
1. Use the Planner agent to break down the feature into tasks.
2. Use the Plan Architect agent to validate the plan against codebase patterns.
3. If the architect identifies reusable patterns, send feedback to the
Planner to update the plan.
4. Use the Implementer agent to write the code for each task.
5. Use the Reviewer agent to check the implementation.
6. If the reviewer identifies issues, use the Implementer agent again to
apply fixes.
Iterate between planning and architecture, and between review and
implementation, until each phase converges.

Each worker defines its own tool access and can specify a faster or more cost-effective model:
---
name: Planner
user-invokable: false
tools: ['read', 'search']
---
Break down feature requests into implementation tasks. Incorporate
feedback from the Plan Architect.

---
name: Implementer
user-invokable: false
model: ['Claude Haiku 4.5 (copilot)', 'Gemini 3 Flash (Preview) (copilot)']
tools: ['edit', 'read', 'search', 'run_in_terminal']
---
Write code to complete assigned tasks.

Multi-perspective review
Use subagents to run independent review perspectives in parallel, then synthesize findings:
---
name: Thorough Reviewer
tools: ['agent', 'read', 'search']
---
Review code through multiple perspectives simultaneously. Run each
perspective as a parallel subagent so findings are independent and unbiased.
When asked to review code, run these subagents in parallel:
- Correctness reviewer: logic errors, edge cases, type issues
- Code quality reviewer: readability, naming, duplication
- Security reviewer: input validation, injection risks, data exposure
- Architecture reviewer: codebase patterns, design consistency
After all subagents complete, synthesize findings into a prioritized
summary. Note which issues are critical vs. nice-to-have.

This pattern works because each subagent approaches the code fresh, without being anchored by what other perspectives found. Each review is genuinely independent.
Tip: For more control, create dedicated custom agents for each review perspective. A security reviewer might use a security-focused MCP server, while a code-quality reviewer might have access to linting CLI tools.
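For instance, the security perspective from the list above could become its own agent file. A minimal sketch with hypothetical naming, restricted to read-only built-in tools:

```markdown
---
name: Security Reviewer
user-invokable: false
tools: ['read', 'search', 'grep']
---
Review the files named in the task prompt for security issues only:
input validation, injection risks, secret handling, and data exposure.
Return a prioritized list of findings with file paths and line numbers.
Ignore style and architecture concerns.
```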
Research before implementation
Separate research from implementation to prevent research context from bloating the implementation phase:
---
name: Implementer
tools: ['agent', 'edit', 'read', 'search']
---
When implementing a feature:
1. First, spawn a subagent to research:
- Current codebase patterns in affected areas
- Relevant test patterns
- Dependencies and potential conflicts
2. Using ONLY the research summary (not your own investigation),
implement the changes.
3. Run tests to verify the implementation.

The research subagent processes potentially dozens of files, but only the structured summary enters the implementer’s context. This keeps the implementation focused and efficient.
Explore and select
For decisions where you want to evaluate multiple approaches before committing:
When the user asks to solve a complex problem:
1. Spawn 2-3 subagents, each exploring a different approach
2. Each subagent should:
- Describe the approach
- Identify pros and cons
- Estimate complexity
- Note potential risks
3. Compare the approaches and recommend the best option
4. Present all options to the user for final decision

Each exploration happens in isolation—one subagent’s reasoning doesn’t influence another’s. This prevents the anchoring effect where the first idea examined becomes disproportionately favored.
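Wrapped into a reusable orchestrator, the explore-and-select pattern might look like this sketch (the agent name is hypothetical):

```markdown
---
name: Approach Explorer
tools: ['agent', 'read', 'search']
---
When given a complex problem, spawn 2-3 parallel subagents, each
exploring one candidate approach in isolation. Require each subagent
to return: the approach, pros and cons, a complexity estimate, and
risks. Compare the summaries, recommend one option, and present all
options to the user for the final decision.
```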
⚡ Parallel execution
When to parallelize
VS Code supports spawning multiple subagents concurrently. The main agent waits for all parallel subagents to complete before continuing.
Good candidates for parallelization:
- Multiple review perspectives (security + performance + quality)
- Independent research tasks (“find auth patterns” AND “find test patterns”)
- Generating alternative approaches for comparison
- Analyzing different parts of the codebase simultaneously
Don’t parallelize when:
- Later tasks depend on earlier results (sequential dependency)
- Tasks might conflict (two subagents editing the same files)
- You need human review between phases
- Order matters for correctness
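Conversely, when order matters, phrase the steps with explicit sequential connectors so the agent runs them one at a time. A hedged sketch of such a prompt:

```markdown
First, run a subagent to migrate the database schema.
Then, once the migration summary confirms success, run a subagent to
update the API handlers.
Finally, run the test suite and report the results.
```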
How to trigger parallel execution
VS Code recognizes parallelization opportunities from your prompt structure. You can signal parallel intent explicitly:
In orchestrator instructions:
Run all three review perspectives as parallel subagents.
Wait for all results before synthesizing.

In structured task descriptions:
Simultaneously analyze:
- Security implications of the auth changes
- Performance impact on the API endpoints
- Backward compatibility with existing clients

The key signals VS Code looks for: “simultaneously,” “in parallel,” “all at once,” or listing multiple independent tasks without sequential connectors (“then,” “after,” “next”).
💰 Token implications
Why subagents reduce costs
Subagents have a counterintuitive benefit: they can actually reduce overall token consumption despite creating additional context windows.
Without subagents (single context)
Research: reads 50 files → 40,000 tokens of context
Implementation: needs 10,000 tokens but inherits 40,000 from research
Review: needs 5,000 tokens but inherits 50,000 from previous phases
Total context pressure: ~95,000 tokens
With subagents (isolated contexts)
Research subagent: reads 50 files → 40,000 tokens (isolated)
Returns: 500-token summary to main agent
Implementation subagent: 10,000 tokens + 500-token research summary
Returns: 300-token result summary
Review subagent: 5,000 tokens + 300-token implementation summary
Total main agent context: ~1,500 tokens of summaries
The main agent processes condensed summaries instead of accumulating raw data. For detailed token budgeting strategies, see How to Optimize Token Consumption During Prompt Orchestrations.
Model selection for subagents
Not every subagent needs the most powerful model. Match model capability to task complexity:
| Task type | Recommended model tier | Why |
|---|---|---|
| Complex reasoning (architecture, planning) | High (Sonnet, GPT-5) | Needs strong reasoning |
| Code implementation | Medium (Haiku, Flash) | Well-defined task, faster |
| Pattern matching (linting, formatting) | Low (fastest available) | Mechanical, rule-based |
| Research and summarization | Medium-High | Needs comprehension but not creativity |
Specify models in specialist agent frontmatter:
---
name: Quick Formatter
user-invokable: false
model: ['Claude Haiku 4.5 (copilot)', 'Gemini 3 Flash (Preview) (copilot)']
tools: ['edit', 'read']
---

The model array is prioritized—VS Code tries each model in order until it finds one that’s available. This provides graceful fallback when specific models are unavailable.
🚫 Common pitfalls
1. Passing too much context to subagents
Subagents receive only the task prompt you provide—they don’t inherit the main agent’s conversation history. This is a feature, not a limitation. But it means you need to include sufficient context in the task prompt.
Too little:
Review the changes.
Too much:
Here's the full conversation history, all 50 files we discussed,
the complete project architecture... now review the changes.
Just right:
Review the changes made to src/auth/login.ts and src/auth/oauth.ts
for security issues. The changes implement OAuth 2.0 replacing the
previous session-based auth. Focus on: token handling, redirect validation,
and state parameter usage.
2. Unbounded subagent depth
Subagents can technically spawn their own subagents. This creates recursion that’s hard to debug and wastes tokens. Set explicit boundaries:
---
name: Specialist
user-invokable: false
disable-model-invocation: true # Can't be auto-invoked
agents: [] # Can't spawn subagents
tools: ['read', 'search']
---

Use the `agents: []` property to prevent specialists from spawning subagents, and `disable-model-invocation: true` to prevent unintended invocation chains.
3. Ignoring subagent failures
Subagent results might be incomplete, incorrect, or empty. Your orchestrator should handle these cases explicitly:
After receiving results from any subagent:
1. Check if the result addresses the original task
2. If the result is empty or irrelevant, retry ONCE with a more
specific prompt
3. If the retry also fails, note the gap and continue with available
information
4. Never retry more than once per subagent

4. Using subagents for trivial tasks
Spawning a subagent has overhead: context window allocation, prompt processing, result summarization. For quick lookups or simple file reads, use tools directly rather than delegating to a subagent.
Don’t subagent: “Read the package.json and tell me the version”
Do subagent: “Research all authentication patterns used across the project, analyze their security properties, and recommend which to keep”
5. Not restricting the agents array
Without explicit agent restrictions, your orchestrator might pick unintended agents based on name or description similarity. Always specify the agents array for production orchestrators.
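A minimal sketch of that restriction (agent names hypothetical); the `agents` line is what turns an open-ended orchestrator into one with an explicit allowlist:

```markdown
---
name: Docs Orchestrator
tools: ['agent', 'read', 'edit']
agents: ['Doc Researcher', 'Doc Writer']
---
Use the Doc Researcher agent to gather context, then the Doc Writer
agent to update the documentation. Do not invoke any other agents.
```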
🎯 Conclusion
Subagents are the mechanism that makes orchestration work. The key takeaways:
- Context isolation is the primary benefit — Subagents prevent context bloat by processing details in their own window and returning only summaries
- Custom agents + subagents = specialization — Combine custom agent definitions (tools, model, instructions) with subagent delegation for focused, constrained specialist behavior
- Control visibility with two properties — `user-invokable` controls the dropdown, `disable-model-invocation` controls auto-invocation; combine them for four distinct access patterns
- Restrict with the `agents` array — Explicitly list allowed subagents in your orchestrator to prevent unintended delegation
- Parallel execution accelerates independent tasks — VS Code can spawn concurrent subagents for truly independent work
- Match models to task complexity — Not every subagent needs the best model; use faster, cheaper models for implementation and rule-based tasks
What’s next
With orchestrator design (article 10) and subagent mechanics (this article) covered, the next articles dive into the data that flows between agents and how to manage it efficiently:
- How to Manage Information Flow — Data contracts, communication pathways, and context window dynamics across the customization stack
- How to Optimize Token Consumption — Token budgeting, context compression, and cost management for multi-agent workflows
📚 References
Official documentation
VS Code: Subagents [📘 Official]
The primary reference for subagent execution. Covers the synchronous execution model, context isolation, orchestration patterns (coordinator-worker, multi-perspective review), parallel execution, custom agent subagent invocation, the agents property, and visibility control with user-invokable and disable-model-invocation.
VS Code: Custom Agents [📘 Official]
Complete reference for custom agent file structure, frontmatter properties, tool configuration, handoffs, and the agents property. Essential for understanding how to define specialist agents used as subagents.
VS Code: Agents Overview [📘 Official]
Overview of all agent types (local, background, cloud, third-party), session management, session hand-offs between agent types, and the agent sessions list.
VS Code: Agent Tools [📘 Official]
Documentation for built-in tools, tool sets, MCP tools, and how tools are resolved. Understanding the tool system is essential for designing specialist agent tool configurations.