How to design subagent orchestrations
⚠️ Experimental feature: Running custom agents as subagents is currently experimental. The
`agents` frontmatter property and custom agent subagent invocation may change in future releases.
Subagents are the execution mechanism that makes orchestration practical. Where article 10 teaches you how to design orchestrators—decomposing tasks into phases, defining specialist roles, planning execution flow—this article teaches you how to implement the delegation itself using VS Code’s subagent capabilities.
A subagent is an independent AI agent that performs focused work in its own isolated context window and returns a summary to the main agent. The main agent stays in control, receiving condensed results without the noise of the subagent’s intermediate processing. This isolation is what makes orchestration scalable—each specialist can think deeply about its subtask without bloating the coordinator’s context.
Table of contents
- 🎯 Understanding subagent execution
- 📋 The `runSubagent` tool
- 🔧 Custom agents as subagents
- 🔒 Controlling agent visibility and invocation
- 🎯 Restricting subagent access with `agents`
- 🔀 Orchestration patterns
- ⚡ Parallel execution
- 💰 Token implications
- 🚫 Common pitfalls
- 🎯 Conclusion
- 📚 References
🎯 Understanding subagent execution
How subagents work
When a main agent spawns a subagent, VS Code creates a new execution context:
Main Agent (your orchestrator)
│
├─ Sends task prompt ──→ Subagent
│ │
│ ├─ Gets its own context window
│ ├─ Runs tools autonomously
│ ├─ Processes the subtask
│ │
│ ←── Returns summary ──┘
│
├─ Incorporates result
└─ Continues with next step
Key characteristics
| Aspect | Behavior |
|---|---|
| Context isolation | Each subagent starts with a clean context window—no inherited conversation history |
| Synchronous execution | The main agent waits for subagent results before continuing |
| Parallel support | VS Code can spawn multiple subagents concurrently when tasks are independent |
| Tool inheritance | By default, subagents inherit the main session’s model and tools |
| Custom agent override | When using a custom agent as subagent, the custom agent’s tools, model, and instructions override defaults |
| Result format | Only the subagent’s final result returns to the main agent—not intermediate tool calls |
What the user sees
Subagent execution appears in the chat as a collapsible tool call. By default, it’s collapsed, showing just the agent name and the currently running tool. Users can expand it to see the full details: all tool calls, the prompt passed to the subagent, and the returned result.
This design gives your orchestration transparency without overwhelming the main conversation. Users see that delegation happened and can inspect details on demand.
Subagents vs. new sessions
Subagents maintain a relationship with the parent agent—they do focused work and report back. Creating a new chat session, by contrast, creates an entirely separate conversation with no connection to the current task.
| Aspect | Subagent | New session |
|---|---|---|
| Relationship to parent | Child—reports results back | Independent—no connection |
| Context | Isolated but linked via task prompt | Completely separate |
| Coordination | Main agent synthesizes results | Manual user coordination |
| Use case | Delegating part of a larger task | Starting unrelated work |
📋 The `runSubagent` tool
Enabling subagent support
To allow an agent to delegate work through subagents, include the agent or runSubagent tool in the agent’s tool list:
---
name: My Orchestrator
tools: ['agent', 'read', 'search', 'edit']
---

The `agent` tool is an alias for `runSubagent`—both enable subagent capabilities.
How invocation works
Subagents are typically agent-initiated, not directly invoked by users. The pattern:
- You (or your agent’s instructions) describe a complex task
- The main agent recognizes which parts benefit from isolated context
- The agent spawns a subagent with a focused task prompt
- The subagent works autonomously and returns a summary
- The main agent incorporates the result and continues
You don’t need to explicitly type “run a subagent” in every prompt. The main agent decides when delegation helps, based on its instructions and the task at hand. However, you can guide this behavior by structuring your prompts to suggest isolation:
Implicit delegation hints:
Research the authentication patterns used in this project,
then implement the OAuth 2.0 migration based on your findings.
Explicit delegation instructions:
Run a subagent to research authentication patterns. Then, using
only the research summary, implement the OAuth 2.0 migration.
For consistent delegation behavior, define when to use subagents in your custom agent’s instructions rather than relying on per-prompt hints.
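A delegation policy like this can be sketched directly in an orchestrator's agent file; the agent name and thresholds below are hypothetical, not taken from the official docs:

```markdown
---
name: Context-Aware Orchestrator
tools: ['agent', 'read', 'search', 'edit']
---
Delegation policy:
1. Spawn a subagent for any research task that touches more than a few
   files, and keep only the returned summary in your own context.
2. Handle quick, single-file lookups yourself with the read tool.
3. Before acting on a subagent result, verify it actually answers the
   task prompt you sent.
```

Because the policy lives in the agent definition, every conversation with this orchestrator delegates consistently, without per-prompt hints.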
Invoking subagents from prompt files
To use subagents in a prompt file, include the agent tool in the frontmatter:
---
name: document-feature
tools: ['agent', 'read', 'search', 'edit']
---
Run a subagent to research the new feature implementation details
and return only information relevant for user documentation.
Then update the docs/ folder with the new documentation.

The prompt instructions can hint at delegation by suggesting isolated research or parallel analysis for specific subtasks.
🔧 Custom agents as subagents
Why use custom agents as subagents?
By default, a subagent inherits the main session’s model and tools. It starts with a clean context window but doesn’t have specialized behavior.
Running a custom agent as a subagent unlocks specialization. The custom agent’s tools, model, and instructions override the session defaults. This means:
- A research subagent can use read-only tools (can’t accidentally edit files)
- A performance analysis subagent can use a specific MCP server
- An implementation subagent can use a faster, cheaper model
- Each specialist follows its own behavioral instructions
Example: research with a custom agent
# .github/agents/researcher.agent.md
---
name: Researcher
user-invokable: false
tools: ['read', 'search', 'fetch', 'grep']
model: Claude Sonnet 4.5 (copilot)
---
You are a codebase researcher. When given a research task:
1. Search broadly first to understand the scope
2. Read specific files for detailed understanding
3. Return a structured summary with:
- Files analyzed (paths)
- Patterns found
- Dependencies identified
- Risks or concerns
- Recommendations
Keep your summary concise. The orchestrator needs decisions, not narration.

When the orchestrator runs this agent as a subagent, the researcher operates with read-only tools and its specialized instructions—exactly the behavior you designed.
Prompting for custom agent invocation
To run a specific custom agent as a subagent, reference it in your prompt:
Use the Researcher agent to analyze the current auth implementation.
Or in an orchestrator’s instructions:
For each feature request:
1. Run the Researcher agent as a subagent to gather context
2. Review the research summary
3. Run the Builder agent as a subagent to implement changes
4. Run the Reviewer agent as a subagent to check quality

🔒 Controlling agent visibility and invocation
Two frontmatter properties give you independent control over how agents can be accessed:
user-invokable
Controls whether the agent appears in the agents dropdown in the chat UI.
| Value | Behavior |
|---|---|
| `true` (default) | Agent appears in the dropdown—users can select it directly |
| `false` | Agent is hidden from the dropdown—only accessible as a subagent or programmatically |
When to set user-invokable: false:
- Specialist agents that only make sense within an orchestration workflow
- Internal helper agents that shouldn’t be invoked standalone
- Agents with narrow focus that would confuse users if selected directly
---
name: Internal Validator
user-invokable: false
tools: ['read', 'run_in_terminal']
---
Run validation checks on the specified files. Return a pass/fail
summary with details for any failures.

disable-model-invocation
Controls whether other agents can automatically invoke this agent as a subagent.
| Value | Behavior |
|---|---|
| `false` (default) | Other agents can invoke this agent as a subagent |
| `true` | This agent can only be invoked explicitly by users, not by other agents |
When to set disable-model-invocation: true:
- Agents with destructive capabilities that shouldn’t be auto-delegated
- Top-level orchestrators that should only be triggered by humans
- Agents requiring human judgment to invoke appropriately
---
name: Production Deployer
disable-model-invocation: true
tools: ['run_in_terminal', 'read']
---
Deploy the application to production. This agent requires explicit
human invocation due to the irreversible nature of deployments.

Combining the properties
The two properties create four visibility configurations:
| user-invokable | disable-model-invocation | Result |
|---|---|---|
| `true` | `false` | Full access — user can select it AND agents can auto-invoke it |
| `true` | `true` | User-only — visible in dropdown but protected from auto-invocation |
| `false` | `false` | Subagent-only — hidden from dropdown but available for orchestration |
| `false` | `true` | Locked — neither visible nor auto-invokable (must be explicitly allowed via the `agents` array) |
Note: The `infer` property is deprecated. It previously combined both controls—`infer: true` made the agent both visible and auto-invokable, and `infer: false` hid it from both. Use the two new properties for granular control.
🎯 Restricting subagent access with `agents`
The problem of unrestricted delegation
By default, all custom agents that don’t have disable-model-invocation: true are available as subagents. This means your orchestrator might accidentally invoke an unintended agent if names or descriptions are similar.
For example, a general-purpose “Code Review” agent might get invoked instead of your specialized “Security Review” agent for an orchestration that specifically needs security-focused analysis.
The agents property
The agents frontmatter property restricts which custom agents an orchestrator can invoke as subagents:
---
name: TDD Coordinator
tools: ['agent']
agents: ['Red', 'Green', 'Refactor']
---
Implement features using test-driven development:
1. Use the Red agent to write failing tests
2. Use the Green agent to implement code to pass the tests
3. Use the Refactor agent to improve code quality

Accepted values
| Value | Meaning |
|---|---|
| `['Agent1', 'Agent2']` | Only these specific agents can be used as subagents |
| `*` | All available agents can be used (default behavior) |
| `[]` | No subagent use allowed |
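As a sketch, a leaf specialist that should never delegate further can combine an empty `agents` array with a narrow tool list (the agent name here is hypothetical):

```markdown
---
name: Lint Fixer
user-invokable: false
tools: ['read', 'edit']
agents: []
---
Fix the lint violations listed in the task prompt and return a short
summary of the files changed. Do not delegate to other agents.
```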
Override behavior
Listing an agent in the agents array overrides disable-model-invocation: true. This means you can create agents that are:
- Protected from general subagent use (`disable-model-invocation: true`)
- But still accessible to specific coordinator agents that explicitly allow them
This pattern is useful for sensitive or destructive agents that should only be invoked by trusted orchestrators:
# The deployer is protected from casual invocation
---
name: Deployer
user-invokable: false
disable-model-invocation: true
tools: ['run_in_terminal', 'read']
---
Deploy changes to staging environment.

# But the release coordinator can explicitly invoke it
---
name: Release Coordinator
tools: ['agent', 'read']
agents: ['Tester', 'Deployer', 'Notifier']
---
Coordinate the release process:
1. Run the Tester to verify all tests pass
2. Run the Deployer to push to staging
3. Run the Notifier to alert the team

🔀 Orchestration patterns
Coordinator and worker
The foundational pattern. A coordinator agent manages the overall task and delegates subtasks to specialized workers:
---
name: Feature Builder
tools: ['agent', 'edit', 'search', 'read']
agents: ['Planner', 'Plan Architect', 'Implementer', 'Reviewer']
---
You are a feature development coordinator. For each feature request:
1. Use the Planner agent to break down the feature into tasks.
2. Use the Plan Architect agent to validate the plan against codebase patterns.
3. If the architect identifies reusable patterns, send feedback to the
Planner to update the plan.
4. Use the Implementer agent to write the code for each task.
5. Use the Reviewer agent to check the implementation.
6. If the reviewer identifies issues, use the Implementer agent again to
apply fixes.
Iterate between planning and architecture, and between review and
implementation, until each phase converges.

Each worker defines its own tool access and can specify a faster or more cost-effective model:
---
name: Planner
user-invokable: false
tools: ['read', 'search']
---
Break down feature requests into implementation tasks. Incorporate
feedback from the Plan Architect.

---
name: Implementer
user-invokable: false
model: ['Claude Haiku 4.5 (copilot)', 'Gemini 3 Flash (Preview) (copilot)']
tools: ['edit', 'read', 'search', 'run_in_terminal']
---
Write code to complete assigned tasks.

Multi-perspective review
Use subagents to run independent review perspectives in parallel, then synthesize findings:
---
name: Thorough Reviewer
tools: ['agent', 'read', 'search']
---
Review code through multiple perspectives simultaneously. Run each
perspective as a parallel subagent so findings are independent and unbiased.
When asked to review code, run these subagents in parallel:
- Correctness reviewer: logic errors, edge cases, type issues
- Code quality reviewer: readability, naming, duplication
- Security reviewer: input validation, injection risks, data exposure
- Architecture reviewer: codebase patterns, design consistency
After all subagents complete, synthesize findings into a prioritized
summary. Note which issues are critical vs. nice-to-have.

This pattern works because each subagent approaches the code fresh, without being anchored by what other perspectives found. Each review is genuinely independent.
Tip: For more control, create dedicated custom agents for each review perspective. A security reviewer might use a security-focused MCP server, while a code-quality reviewer might have access to linting CLI tools.
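For instance, the security perspective from the list above could become its own agent file. A minimal sketch with hypothetical naming, restricted to read-only built-in tools:

```markdown
---
name: Security Reviewer
user-invokable: false
tools: ['read', 'search', 'grep']
---
Review the files named in the task prompt for security issues only:
input validation, injection risks, secret handling, and data exposure.
Return a prioritized list of findings with file paths and line numbers.
Ignore style and architecture concerns.
```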
Research before implementation
Separate research from implementation to prevent research context from bloating the implementation phase:
---
name: Implementer
tools: ['agent', 'edit', 'read', 'search']
---
When implementing a feature:
1. First, spawn a subagent to research:
- Current codebase patterns in affected areas
- Relevant test patterns
- Dependencies and potential conflicts
2. Using ONLY the research summary (not your own investigation),
implement the changes.
3. Run tests to verify the implementation.

The research subagent processes potentially dozens of files, but only the structured summary enters the implementer’s context. This keeps the implementation focused and efficient.
Explore and select
For decisions where you want to evaluate multiple approaches before committing:
When the user asks to solve a complex problem:
1. Spawn 2-3 subagents, each exploring a different approach
2. Each subagent should:
- Describe the approach
- Identify pros and cons
- Estimate complexity
- Note potential risks
3. Compare the approaches and recommend the best option
4. Present all options to the user for final decision

Each exploration happens in isolation—one subagent’s reasoning doesn’t influence another’s. This prevents the anchoring effect where the first idea examined becomes disproportionately favored.
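Wrapped into a reusable orchestrator, the explore-and-select pattern might look like this sketch (the agent name is hypothetical):

```markdown
---
name: Approach Explorer
tools: ['agent', 'read', 'search']
---
When given a complex problem, spawn 2-3 parallel subagents, each
exploring one candidate approach in isolation. Require each subagent
to return: the approach, pros and cons, a complexity estimate, and
risks. Compare the summaries, recommend one option, and present all
options to the user for the final decision.
```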
⚡ Parallel execution
When to parallelize
VS Code supports spawning multiple subagents concurrently. The main agent waits for all parallel subagents to complete before continuing.
Good candidates for parallelization:
- Multiple review perspectives (security + performance + quality)
- Independent research tasks (“find auth patterns” AND “find test patterns”)
- Generating alternative approaches for comparison
- Analyzing different parts of the codebase simultaneously
Don’t parallelize when:
- Later tasks depend on earlier results (sequential dependency)
- Tasks might conflict (two subagents editing the same files)
- You need human review between phases
- Order matters for correctness
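Conversely, when order matters, phrase the steps with explicit sequential connectors so the agent runs them one at a time. A hedged sketch of such a prompt:

```markdown
First, run a subagent to migrate the database schema.
Then, once the migration summary confirms success, run a subagent to
update the API handlers.
Finally, run the test suite and report the results.
```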
How to trigger parallel execution
VS Code recognizes parallelization opportunities from your prompt structure. You can signal parallel intent explicitly:
In orchestrator instructions:
Run all three review perspectives as parallel subagents.
Wait for all results before synthesizing.

In structured task descriptions:
Simultaneously analyze:
- Security implications of the auth changes
- Performance impact on the API endpoints
- Backward compatibility with existing clients

The key signals VS Code looks for: “simultaneously,” “in parallel,” “all at once,” or listing multiple independent tasks without sequential connectors (“then,” “after,” “next”).
💰 Token implications
Why subagents reduce costs
Subagents have a counterintuitive benefit: they can actually reduce overall token consumption despite creating additional context windows.
Without subagents (single context)
Research: reads 50 files → 40,000 tokens of context
Implementation: needs 10,000 tokens but inherits 40,000 from research
Review: needs 5,000 tokens but inherits 50,000 from previous phases
Total context pressure: ~95,000 tokens
With subagents (isolated contexts)
Research subagent: reads 50 files → 40,000 tokens (isolated)
Returns: 500-token summary to main agent
Implementation subagent: 10,000 tokens + 500-token research summary
Returns: 300-token result summary
Review subagent: 5,000 tokens + 300-token implementation summary
Total main agent context: ~1,500 tokens of summaries
The main agent processes condensed summaries instead of accumulating raw data. For detailed token budgeting strategies, see How to Optimize Token Consumption During Prompt Orchestrations.
Model selection for subagents
Not every subagent needs the most powerful model. Match model capability to task complexity:
| Task type | Recommended model tier | Why |
|---|---|---|
| Complex reasoning (architecture, planning) | High (Sonnet, GPT-5) | Needs strong reasoning |
| Code implementation | Medium (Haiku, Flash) | Well-defined task, faster |
| Pattern matching (linting, formatting) | Low (fastest available) | Mechanical, rule-based |
| Research and summarization | Medium-High | Needs comprehension but not creativity |
Specify models in specialist agent frontmatter:
---
name: Quick Formatter
user-invokable: false
model: ['Claude Haiku 4.5 (copilot)', 'Gemini 3 Flash (Preview) (copilot)']
tools: ['edit', 'read']
---

The model array is prioritized—VS Code tries each model in order until it finds one that’s available. This provides graceful fallback when specific models are unavailable.
🚫 Common pitfalls
1. Passing too much context to subagents
Subagents receive only the task prompt you provide—they don’t inherit the main agent’s conversation history. This is a feature, not a limitation. But it means you need to include sufficient context in the task prompt.
Too little:
Review the changes.
Too much:
Here's the full conversation history, all 50 files we discussed,
the complete project architecture... now review the changes.
Just right:
Review the changes made to src/auth/login.ts and src/auth/oauth.ts
for security issues. The changes implement OAuth 2.0 replacing the
previous session-based auth. Focus on: token handling, redirect validation,
and state parameter usage.
2. Unbounded subagent depth
Subagents can technically spawn their own subagents. This creates recursion that’s hard to debug and wastes tokens. Set explicit boundaries:
---
name: Specialist
user-invokable: false
disable-model-invocation: true # Can't be auto-invoked
agents: [] # Can't spawn subagents
tools: ['read', 'search']
---

Use the `agents: []` property to prevent specialists from spawning subagents, and `disable-model-invocation: true` to prevent unintended invocation chains.
3. Ignoring subagent failures
Subagent results might be incomplete, incorrect, or empty. Your orchestrator should handle these cases explicitly:
After receiving results from any subagent:
1. Check if the result addresses the original task
2. If the result is empty or irrelevant, retry ONCE with a more
specific prompt
3. If the retry also fails, note the gap and continue with available
information
4. Never retry more than once per subagent

4. Using subagents for trivial tasks
Spawning a subagent has overhead: context window allocation, prompt processing, result summarization. For quick lookups or simple file reads, use tools directly rather than delegating to a subagent.
Don’t subagent: “Read the package.json and tell me the version”
Do subagent: “Research all authentication patterns used across the project, analyze their security properties, and recommend which to keep”
5. Not restricting the agents array
Without explicit agent restrictions, your orchestrator might pick unintended agents based on name or description similarity. Always specify the agents array for production orchestrators.
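A minimal sketch of that restriction (agent names hypothetical); the `agents` line is what turns an open-ended orchestrator into one with an explicit allowlist:

```markdown
---
name: Docs Orchestrator
tools: ['agent', 'read', 'edit']
agents: ['Doc Researcher', 'Doc Writer']
---
Use the Doc Researcher agent to gather context, then the Doc Writer
agent to update the documentation. Do not invoke any other agents.
```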
🎯 Conclusion
Subagents are the mechanism that makes orchestration work. The key takeaways:
- Context isolation is the primary benefit — Subagents prevent context bloat by processing details in their own window and returning only summaries
- Custom agents + subagents = specialization — Combine custom agent definitions (tools, model, instructions) with subagent delegation for focused, constrained specialist behavior
- Control visibility with two properties — `user-invokable` controls the dropdown, `disable-model-invocation` controls auto-invocation; combine them for four distinct access patterns
- Restrict with the `agents` array — Explicitly list allowed subagents in your orchestrator to prevent unintended delegation
- Parallel execution accelerates independent tasks — VS Code can spawn concurrent subagents for truly independent work
- Match models to task complexity — Not every subagent needs the best model; use faster, cheaper models for implementation and rule-based tasks
What’s next
With orchestrator design (article 10) and subagent mechanics (this article) covered, the next articles dive into the data that flows between agents and how to manage it efficiently:
- How to Manage Information Flow — Data contracts, communication pathways, and context window dynamics across the customization stack
- How to Optimize Token Consumption — Token budgeting, context compression, and cost management for multi-agent workflows
📚 References
Official documentation
VS Code: Subagents [📘 Official]
The primary reference for subagent execution. Covers the synchronous execution model, context isolation, orchestration patterns (coordinator-worker, multi-perspective review), parallel execution, custom agent subagent invocation, the agents property, and visibility control with user-invokable and disable-model-invocation.
VS Code: Custom Agents [📘 Official]
Complete reference for custom agent file structure, frontmatter properties, tool configuration, handoffs, and the agents property. Essential for understanding how to define specialist agents used as subagents.
VS Code: Agents Overview [📘 Official]
Overview of all agent types (local, background, cloud, third-party), session management, session hand-offs between agent types, and the agent sessions list.
VS Code: Agent Tools [📘 Official]
Documentation for built-in tools, tool sets, MCP tools, and how tools are resolved. Understanding the tool system is essential for designing specialist agent tool configurations.