Appendix: Multi-agent orchestration plan - detailed specifications

tech

github-copilot

prompt-engineering

agents

planning

Detailed agent specifications, handoff flow diagrams, communication protocols, and implementation roadmap for the multi-agent orchestration plan

Author

Dario Airoldi

Published

December 26, 2025

Companion to: Multi-Agent Orchestration Plan V2. This appendix contains the detailed per-agent specifications, handoff flow diagrams, communication protocols, data exchange optimization patterns, and the full implementation roadmap.

Agent Specifications

Existing Agents to UPDATE (4 files)

1. `prompt-researcher.agent.md`

Status: 🔄 UPDATE - Add use case challenge capabilities

Current File: .github/agents/prompt-researcher.agent.md (604 lines)

Updates Needed:

| Section | Current | Add/Modify |
|---------|---------|------------|
| Role | Research specialist | Add “Requirements analyst” capability |
| Phase 1 | Requirements Clarification | Add use case challenge generation |
| New Section | N/A | Add “Use Case Challenge Methodology” |
| Output Templates | Basic requirements summary | Add challenge results template |

New Process Step to Add:

### Phase 1a: Use Case Challenge Validation (NEW)

**Goal**: Test goal clarity through realistic scenarios.

**Process**:
1. Determine complexity level (Simple/Moderate/Complex)
2. Generate use cases (3/5/7 based on complexity)
3. Test each scenario against goal:
   - Does the goal provide clear guidance?
   - What gaps or ambiguities are revealed?
   - What tools or boundaries are discovered?
4. Refine goal based on findings

**Use Case Generation Guidelines**:

| Complexity | Indicators | Use Cases |
|------------|------------|-----------|
| Simple | Clear single task, standard role | 3 |
| Moderate | Multiple objectives, domain-specific | 5 |
| Complex | Broad scope, novel workflow | 7 |

**Output: Challenge Results**
```markdown
### Use Case Challenge Results

**Complexity Assessment**: [Simple/Moderate/Complex]
**Use Cases Generated**: [N]

**Use Case 1: [Common Case]**
- Scenario: [realistic situation]
- Test: [question about goal applicability]
- Current Guidance: [what goal says to do]
- Gap Identified: [ambiguity or missing info]
- Refinement: [specific change to goal]

[Repeat for all use cases]

**Validation Status**: [✅ Clear / ⚠️ Needs Refinement / ❌ Critical Gaps]
**Refined Goal**: [updated goal incorporating discoveries]
```
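
The complexity table in the guidelines above maps directly to a lookup. A minimal sketch in Python, assuming a hypothetical helper name `use_case_count`; the appendix does not prescribe an implementation.

```python
# Hypothetical helper: use cases to generate per complexity level,
# taken from the "Use Case Generation Guidelines" table.
USE_CASES_BY_COMPLEXITY = {"simple": 3, "moderate": 5, "complex": 7}


def use_case_count(complexity: str) -> int:
    """Return how many use cases to generate for a complexity level."""
    key = complexity.strip().lower()
    if key not in USE_CASES_BY_COMPLEXITY:
        raise ValueError(f"unknown complexity level: {complexity!r}")
    return USE_CASES_BY_COMPLEXITY[key]
```

Rejecting unknown levels, rather than defaulting to 3, keeps a mistyped complexity from silently reducing the number of scenarios tested.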


2. `prompt-builder.agent.md`

Status: 🔄 UPDATE - Minor enhancements

Current File: .github/agents/prompt-builder.agent.md (620 lines)

Updates Needed:

| Section | Current | Add/Modify |
|---------|---------|------------|
| Tool validation | Basic structure check | Add tool count verification (3-7) |
| Boundaries section | Template-based | Ensure three-tier enforcement |
| Metadata | Basic | Add creation context tracking |

New Validation Step to Add:

```markdown
### Phase 3a: Pre-Save Validation (NEW)

**Before saving file, verify**:
- [ ] Tool count: [count] (MUST be 3-7)
- [ ] Agent mode matches tool types
- [ ] Three-tier boundaries complete
- [ ] Bottom metadata block present
- [ ] Examples included (minimum 2)

**If any check fails**: STOP, report issue, do not save
```
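
The pre-save checklist above can be read as a set of boolean gates. The sketch below is one possible encoding; the `Draft` fields and the function name are hypothetical and stand in for whatever the builder actually inspects.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    """Hypothetical summary of a draft file, filled in before saving."""
    tool_count: int
    mode_matches_tools: bool
    boundary_tiers: int      # completed boundary tiers; three are expected
    has_metadata_block: bool
    example_count: int


def pre_save_failures(draft: Draft) -> list[str]:
    """Return descriptions of failed checks; an empty list means safe to save."""
    checks = [
        ("tool count outside 3-7", 3 <= draft.tool_count <= 7),
        ("agent mode does not match tool types", draft.mode_matches_tools),
        ("three-tier boundaries incomplete", draft.boundary_tiers == 3),
        ("bottom metadata block missing", draft.has_metadata_block),
        ("fewer than 2 examples", draft.example_count >= 2),
    ]
    return [message for message, passed in checks if not passed]
```

A caller would stop and report the returned messages instead of saving whenever the list is non-empty.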

3. `prompt-updater.agent.md`

Status: 🔄 UPDATE - Add change categorization

Current File: .github/agents/prompt-updater.agent.md (699 lines)

Updates Needed:

| Section | Current | Add/Modify |
|---------|---------|------------|
| Change planning | List-based | Add impact categorization |
| Risk assessment | Implicit | Make explicit per-change |

New Change Categorization to Add:

### Change Impact Categories (NEW)

**Classify each change before applying**:

| Category | Description | Approval Needed |
|----------|-------------|-----------------|
| **Structural** | Adds/removes sections, changes phase flow | Yes |
| **Behavioral** | Modifies boundaries, tool access | Yes |
| **Cosmetic** | Formatting, wording improvements | No |
| **Fix** | Corrects errors identified in validation | No |

**Update Plan Template**:
```markdown
### Change 1: [Description]
- **Type**: [Structural/Behavioral/Cosmetic/Fix]
- **Impact**: [High/Medium/Low]
- **Approval**: [Required/Auto-apply]
- **Lines**: [N-M]
- **Before**: [excerpt]
- **After**: [excerpt]
```
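
The approval rule in the category table can be expressed as a small lookup. This is a sketch only; `ChangeType` and `approval_for` are hypothetical names, and the return values mirror the `Approval` field of the plan template.

```python
from enum import Enum


class ChangeType(Enum):
    STRUCTURAL = "Structural"
    BEHAVIORAL = "Behavioral"
    COSMETIC = "Cosmetic"
    FIX = "Fix"


# Per the category table: structural and behavioral changes need approval.
APPROVAL_REQUIRED = {ChangeType.STRUCTURAL, ChangeType.BEHAVIORAL}


def approval_for(change_type: ChangeType) -> str:
    """Return the value for the update plan's Approval field."""
    return "Required" if change_type in APPROVAL_REQUIRED else "Auto-apply"
```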


4. `prompt-validator.agent.md`

Status: 🔄 UPDATE - Add tool alignment checks

Current File: .github/agents/prompt-validator.agent.md (648 lines)

Updates Needed:

| Section | Current | Add/Modify |
|---------|---------|------------|
| Tool validation | Basic presence check | Add count and alignment validation |
| Scoring | Structure-focused | Add tool composition score |

New Validation Checks to Add:

```markdown
### Phase 3a: Tool Composition Validation (NEW)

**Checks**:
1. **Tool Count**: [count] 
   - ✅ 3-7 tools: Optimal
   - ⚠️ <3 tools: May be insufficient
   - ❌ >7 tools: Tool clash risk - FAIL

2. **Agent/Tool Alignment**:
   - `agent: plan` + write tools → ❌ FAIL
   - `agent: agent` + only read tools → ⚠️ Warning (may be intentional)
   
3. **Tool Redundancy**:
   - Check for overlapping capabilities
   - Recommend consolidation if found
```

**Output: Tool Composition Score**
```markdown
### Tool Composition
- **Count**: [N] [✅/⚠️/❌]
- **Alignment**: [agent mode] + [tool types] [✅/❌]
- **Redundancy**: [None/Minor/Significant]
- **Score**: [X]/100
```

New Agents to CREATE (4 files)

5. `agent-researcher.agent.md`

Status: ✨ CREATE NEW

Purpose: Research specialist for agent file requirements and pattern discovery. Parallel to prompt-researcher but specialized for agent concerns.

Full Specification

```yaml
---
name: agent-researcher
description: "Research specialist for agent file requirements and pattern discovery with role challenge validation"
agent: plan
model: claude-sonnet-4.5
tools:
  - semantic_search    # Find similar agents and patterns
  - grep_search        # Search for specific patterns
  - read_file          # Read templates and context files
  - file_search        # Locate agent files
  - list_dir           # Explore agent directory
handoffs:
  - label: "Build Agent"
    agent: agent-builder
    send: false
---
```

Role Definition

# Agent Researcher

You are a **research specialist** focused on analyzing agent requirements and discovering implementation patterns. You excel at challenging role definitions with use cases, identifying tool requirements, and validating agent/tool alignment. You NEVER create or modify files; you only research and report.

## Your Expertise

- **Role Challenge Analysis**: Testing agent roles against realistic scenarios
- **Tool Discovery**: Identifying minimum essential tools from use cases
- **Pattern Recognition**: Finding similar agents and extracting patterns
- **Alignment Validation**: Ensuring agent mode matches tool requirements
- **Scope Definition**: Identifying IN SCOPE vs OUT OF SCOPE boundaries

Key Processes

## Process

### Phase 1: Requirements Clarification with Role Challenge

1. **Understand Primary Role**
   - What specialist persona is needed?
   - What tasks will this agent handle?
   - What mode: read-only analysis (plan) or active modification (agent)?

2. **Challenge Role with Use Cases**
   - Generate 3-7 scenarios based on complexity
   - Test each: Can this role handle it effectively?
   - Identify gaps, overlaps, ambiguities
   - Discover tool requirements from scenarios
   - Find handoff needs (when to delegate)

3. **Validate Tool Requirements**
   - Map responsibilities → capabilities → tools
   - Enforce 3-7 tool limit (decompose if >7)
   - Verify agent/tool alignment:
     - `agent: plan` → ONLY read-only tools
     - `agent: agent` → read + write tools allowed

**Output: Validated Requirements**
```markdown
### Agent Requirements Summary

**Role**: [refined specialist role]
**Mode**: [plan/agent]
**Complexity**: [Simple/Moderate/Complex]

**Use Case Challenge Results**:
- [Use case 1]: [result]
- [Use case 2]: [result]
- [N more...]

**Tool Requirements** (3-7 only):
1. [tool-name] - [justification from use case]
2. [tool-name] - [justification from use case]
...

**Scope Boundaries**:
- IN SCOPE: [what agent handles]
- OUT OF SCOPE: [what's excluded or delegated]

**Handoffs Needed**: [other agents to coordinate with]

**Validation Status**: [✅/⚠️/❌]
```

Phase 2: Pattern Discovery

[Same structure as prompt-researcher Phase 2, but searching .github/agents/]

Phase 3: Structure Definition

[Same structure as prompt-researcher Phase 3, but with agent-specific fields]


Boundaries

```markdown
## ⚠️ CRITICAL BOUNDARIES

### ✅ Always Do
- Challenge EVERY role with at least 3 use cases
- Verify tool count is 3-7 (NEVER approve >7)
- Check agent/tool alignment (plan → read-only only)
- Cross-reference tool-composition-guide.md
- Provide specific justification for each tool
- Identify scope boundaries clearly

### ⚠️ Ask First
- When role seems too broad (suggest decomposition)
- When >7 tools seem needed (MUST decompose)
- When agent/tool alignment is ambiguous

### 🚫 Never Do
- NEVER create or modify files
- NEVER skip role challenge phase
- NEVER approve >7 tools
- NEVER mix `agent: plan` with write tools
- NEVER proceed to building without validated requirements
```

6. `agent-builder.agent.md`

Status: ✨ CREATE NEW

Purpose: File creation specialist for agent files. Parallel to prompt-builder.

Full Specification

---
name: agent-builder
description: "Agent file generator following validated patterns and templates"
agent: agent
model: claude-sonnet-4.5
tools:
  - read_file          # Load templates and context
  - semantic_search    # Find similar patterns
  - create_file        # Create new agent file
  - file_search        # Locate reference files
handoffs:
  - label: "Validate Agent"
    agent: agent-validator
    send: true
---

Role Definition

# Agent Builder

You are an **agent generation specialist** focused on creating high-quality agent files from validated specifications. You excel at implementing role definitions, configuring tools, and establishing clear boundaries. You create new files but do NOT modify existing agents.

## Your Expertise

- **Role Implementation**: Translating role specifications into agent personas
- **Tool Configuration**: Setting up 3-7 essential tools with proper alignment
- **Boundary Definition**: Creating comprehensive three-tier boundaries
- **Pattern Application**: Following repository conventions and templates

Boundaries

## ⚠️ CRITICAL BOUNDARIES

### ✅ Always Do
- Verify specification has validated requirements
- Check tool count is 3-7 before creating
- Verify agent/tool alignment before saving
- Include complete three-tier boundaries
- Add bottom metadata block
- Hand off to validator automatically

### ⚠️ Ask First
- When specification seems incomplete
- When tool count is at boundary (3 or 7)
- When handoff targets don't exist yet

### 🚫 Never Do
- NEVER modify existing agents (updater's role)
- NEVER create agent with >7 tools
- NEVER mix plan mode with write tools
- NEVER skip validation handoff

7. `agent-updater.agent.md`

Status: ✨ CREATE NEW

Purpose: Update specialist for existing agent files. Parallel to prompt-updater.

Full Specification

---
name: agent-updater
description: "Specialized updater for existing agent files with tool alignment preservation"
agent: agent
model: claude-sonnet-4.5
tools:
  - read_file                    # Read current state
  - grep_search                  # Find patterns to update
  - replace_string_in_file       # Single updates
  - multi_replace_string_in_file # Batch updates
handoffs:
  - label: "Re-validate After Update"
    agent: agent-validator
    send: true
---

Key Processes

## Process

### Phase 1: Update Planning

1. **Analyze Update Request**
   - Validation report with issues?
   - User-specified changes?
   - Tool realignment needed?

2. **Read Current Agent**
   - Load complete file
   - Parse YAML (tools, mode, handoffs)
   - Verify line numbers from validation report

3. **Plan Updates with Impact Assessment**
   - Categorize each change (Structural/Behavioral/Cosmetic/Fix)
   - Check: Will the update break agent/tool alignment?
   - Check: Will the update exceed the 7-tool limit?
   - If either: STOP, report issue, ask for guidance

### Phase 2: Apply Updates

[Same as prompt-updater but with agent-specific validations]

### Phase 3: Post-Update Verification

**Before handoff to validator, verify**:
- [ ] Tool count still 3-7
- [ ] Agent/tool alignment preserved
- [ ] Boundaries still three-tier complete
- [ ] Handoff targets still valid

Boundaries

## ⚠️ CRITICAL BOUNDARIES

### ✅ Always Do
- Read complete file before any changes
- Verify agent/tool alignment after changes
- Verify tool count remains 3-7
- Include 3-5 lines context in replacements
- Hand off to validator for re-validation

### ⚠️ Ask First
- Before changes that affect tool count
- Before changes that affect agent mode
- Before removing or adding handoffs

### 🚫 Never Do
- NEVER create new files
- NEVER update without reading first
- NEVER break agent/tool alignment
- NEVER exceed 7-tool limit
- NEVER skip re-validation

8. `agent-validator.agent.md`

Status: ✨ CREATE NEW

Purpose: Quality assurance specialist for agent files. Parallel to prompt-validator with agent-specific checks.

Full Specification

---
name: agent-validator
description: "Quality assurance specialist for agent file validation with tool alignment verification"
agent: plan
model: claude-sonnet-4.5
tools:
  - read_file      # Load files to validate
  - grep_search    # Search for patterns
  - file_search    # Find reference files
---

Key Validation Checks

## Validation Checks

### Agent-Specific Checks (Priority)

1. **Tool Count Validation** (CRITICAL)
   - Count tools in YAML array
   - ❌ FAIL if >7 tools
   - ⚠️ WARN if <3 tools
   - ✅ PASS if 3-7 tools

2. **Agent/Tool Alignment** (CRITICAL)
   - Parse `agent:` field
   - Parse `tools:` array
   - Check alignment:
     - `agent: plan` + ANY write tool → ❌ FAIL
     - `agent: agent` + only read tools → ⚠️ WARN
     - Proper alignment → ✅ PASS
   - Write tools: create_file, replace_string_in_file, multi_replace_string_in_file, run_in_terminal

3. **Handoff Validity**
   - Parse `handoffs:` array
   - For each handoff target:
     - Check if target agent file exists
     - Verify target is valid agent name
   - ❌ FAIL if any target doesn't exist

4. **Role/Persona Validation**
   - Check for clear role definition section
   - Verify expertise areas defined
   - Check for specialist focus (not generic)

### Standard Checks (Same as prompt-validator)
- Structure validation
- Convention compliance  
- Pattern consistency
- Quality assessment
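
The three critical agent-specific checks above are mechanical enough to sketch in Python. The write-tool set is copied from the list in check 2; the function names and the `existing_agents` parameter are hypothetical, and the PASS/WARN/FAIL strings stand in for the ✅/⚠️/❌ markers.

```python
# Write tools as listed under "Agent/Tool Alignment".
WRITE_TOOLS = {
    "create_file",
    "replace_string_in_file",
    "multi_replace_string_in_file",
    "run_in_terminal",
}


def check_tool_count(tools: list[str]) -> str:
    """FAIL above 7 tools, WARN below 3, PASS for 3-7."""
    if len(tools) > 7:
        return "FAIL"
    if len(tools) < 3:
        return "WARN"
    return "PASS"


def check_alignment(mode: str, tools: list[str]) -> str:
    """Plan mode must not carry write tools; agent mode usually has some."""
    has_write = any(tool in WRITE_TOOLS for tool in tools)
    if mode == "plan" and has_write:
        return "FAIL"
    if mode == "agent" and not has_write:
        return "WARN"   # may be intentional
    return "PASS"


def missing_handoff_targets(targets: list[str], existing_agents: set[str]) -> list[str]:
    """Return handoff targets with no matching agent; any entry means FAIL."""
    return [target for target in targets if target not in existing_agents]
```

Passing the set of known agent names in, rather than touching the file system, keeps the checks easy to test in isolation.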

Output Format

### Validation Report: [agent-name]

**Overall Status**: [✅ PASSED / ⚠️ WARNINGS / ❌ FAILED]

**Agent-Specific Checks**:
| Check | Result | Details |
|-------|--------|---------|
| Tool Count | [✅/⚠️/❌] | [N] tools (3-7 required) |
| Agent/Tool Alignment | [✅/⚠️/❌] | [mode] + [tool types] |
| Handoff Validity | [✅/⚠️/❌] | [N] handoffs, [M] valid |
| Role Definition | [✅/⚠️/❌] | [assessment] |

**Standard Checks**:
| Check | Score | Details |
|-------|-------|---------|
| Structure | [X]/100 | [findings] |
| Conventions | [X]/100 | [findings] |
| Patterns | [X]/100 | [findings] |
| Quality | [X]/100 | [findings] |

**Issues by Severity**:
- **Critical** ([N]): [list with line numbers]
- **Moderate** ([N]): [list with line numbers]
- **Minor** ([N]): [list with line numbers]

**Recommendations**:
1. [Specific fix with line reference]
2. [Specific fix with line reference]

Instruction Files Assessment

`.github/instructions/prompts.instructions.md`

Status: ✅ NO CHANGES NEEDED

Assessment: Current file is comprehensive and covers:

- Context engineering principles reference
- Tool selection guidance
- Repository-specific patterns
- Naming conventions
- Template references

`.github/instructions/agents.instructions.md`

Status: ✅ NO CHANGES NEEDED

Assessment: Current file is comprehensive and covers:

- Context engineering principles reference
- Tool selection (3-7 tool rule documented)
- Agent/tool alignment guidance
- Six essential agent patterns
- Multi-agent orchestration guidance

Handoff Flow Diagrams

Prompt Design and Create Flow

User Request
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 1: Requirements                       │
│                                                             │
│  Orchestrator --→ prompt-researcher (challenge with cases) │
│       │                    │                                │
│       ├──------------------+ Requirements Report            │
│       │                                                     │
│       │                                                     │
│  [User Approval Checkpoint]                                 │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 2: Research                           │
│                                                             │
│  Orchestrator --→ prompt-researcher (pattern discovery)    │
│       │                    │                                │
│       ├──------------------+ Pattern Report                 │
│       │                                                     │
│       │                                                     │
│  [User Approval Checkpoint]                                 │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 3: Structure                          │
│                                                             │
│  Orchestrator --→ prompt-researcher (define structure)     │
│       │                    │                                │
│       ├──------------------+ Specification                  │
│       │                                                     │
│       │                                                     │
│  [User Approval Checkpoint]                                 │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 4: Build                              │
│                                                             │
│  Orchestrator --→ prompt-builder (create file)             │
│       │                    │                                │
│       ├──------------------+ File Created                   │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│            PHASE 5-6: Agent Plans (if needed)               │
│                                                             │
│  If agents to update: Create plan --→ agent-review-validate │
│  If agents to create: Create plan --→ agent-design-create   │
│                                                             │
│  [User Approval Checkpoint for each plan]                   │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 7: Validation                         │
│                                                             │
│  Orchestrator --→ prompt-validator (AUTO handoff)          │
│       │                    │                                │
│       ├──------------------+ Validation Report              │
│       │                                                     │
│       │                                                     │
│  [Decision: Pass/Warn/Fail]                                 │
+------------------------------------------------------------+
    ↓
    +----------------------+
    │ If PASS             │ If FAIL
    │                      │
+--------------+    +----------------------------------------+
│   COMPLETE   │    │         PHASE 8: Fix Issues            │
│   │          │    │                                        │
+--------------+    │  Orchestrator --→ prompt-updater       │
                    │       │                    │            │
                    │       ├──------------------+ Updated    │
                    │       │                                 │
                    │       │                                 │
                    │  [Loop back to Phase 7, max 3x]         │
                    +----------------------------------------+
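
The Phase 7 and Phase 8 loop in the diagram above is a bounded retry. A minimal sketch, assuming "max 3x" means at most three fix rounds and that the orchestrator escalates to the user afterwards (the diagram does not say what happens next); `validate` and `fix` are hypothetical callables standing in for the validator and updater handoffs.

```python
from typing import Callable


def validate_with_fixes(
    validate: Callable[[], bool],
    fix: Callable[[], None],
    max_fix_rounds: int = 3,
) -> bool:
    """Validate; on failure apply fixes and re-validate, up to a cap."""
    if validate():
        return True
    for _ in range(max_fix_rounds):
        fix()
        if validate():
            return True
    return False  # still failing after the cap: escalate (assumption)
```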

Agent Design and Create Flow

User Request
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 1: Requirements                       │
│                                                             │
│  Orchestrator --→ agent-researcher (role challenge)        │
│       │                    │                                │
│       ├──------------------+ Requirements (tools 3-7)       │
│       │                                                     │
│       │                                                     │
│  [User Approval + Tool Count Verification]                  │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 2-3: Research + Structure             │
│                                                             │
│  [Same as Prompt flow but with agent-specific checks]       │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 4: Build                              │
│                                                             │
│  Orchestrator --→ agent-builder (create file)              │
│       │                    │                                │
│       ├──------------------+ File + Tool Count Verification │
│       │                                                     │
│       │  ⚠️ GATE: If tools >7, REJECT and go back          │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│            PHASE 5-6: Agent Dependencies                    │
│                                                             │
│  If agents to update --→ agent-review-and-validate         │
│  If agents to create --→ RECURSIVE agent-design-and-create │
│                                                             │
│  ⚠️ Max recursion depth: 2 levels                          │
│                                                             │
│  [User Approval Required for each]                          │
+------------------------------------------------------------+
    ↓
    ↓
+------------------------------------------------------------+
│                 PHASE 7: Validation                         │
│                                                             │
│  Orchestrator --→ agent-validator (AUTO)                   │
│       │                    │                                │
│       ├──------------------+ Report + Tool Alignment Check  │
│       │                                                     │
│       │  ⚠️ CRITICAL: Fail if tools >7 or alignment wrong  │
│       │                                                     │
│  [Decision: Pass/Warn/Fail]                                 │
+------------------------------------------------------------+
    ↓
    +----------------------+
    │ If PASS             │ If FAIL
    │                      │
+--------------+    +----------------------------------------+
│   COMPLETE   │    │         PHASE 8: Fix Issues            │
│   │          │    │                                        │
+--------------+    │  Orchestrator --→ agent-updater        │
                    │       │                                 │
                    │       │  ⚠️ Preserve tool alignment     │
                    │       │  ⚠️ Maintain 3-7 tool count     │
                    │       │                                 │
                    │       │                                 │
                    │  [Loop back to Phase 7, max 3x]         │
                    +----------------------------------------+
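
The recursion guard in Phases 5-6 of the diagram above can be sketched as a depth-limited traversal. This assumes the requested agent sits at depth 0 and that "2 levels" allows dependencies at depths 1 and 2; the function name and the `dependencies` mapping are hypothetical, and the per-agent user approval step is not modeled.

```python
MAX_DEPTH = 2  # "Max recursion depth: 2 levels"


def plan_agent_creation(
    agent: str,
    dependencies: dict[str, list[str]],
    depth: int = 0,
) -> list[str]:
    """Return a creation order (dependencies first), refusing deep chains."""
    if depth > MAX_DEPTH:
        raise RecursionError(f"{agent}: dependency chain deeper than {MAX_DEPTH} levels")
    order: list[str] = []
    for dep in dependencies.get(agent, []):
        for name in plan_agent_creation(dep, dependencies, depth + 1):
            if name not in order:
                order.append(name)
    order.append(agent)
    return order
```

The same guard also stops circular dependencies, since a cycle exceeds the depth limit after a few steps.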

Communication Protocols & Information Contracts

Phase-to-Phase Data Flow

Each phase in the workflow has explicit input/output contracts to ensure reliability and token efficiency:

| Phase | Input Contract | Output Contract | Token Budget |
|-------|----------------|-----------------|--------------|
| Phase 1: Requirements | User request (text) | Validated requirements document | 500-1,000 |
| Phase 2: Research | Requirements + search queries | Research report (patterns, templates) | 2,000-4,000 |
| Phase 3: Structure | Research summary (compressed) | Complete specification (YAML + structure) | 1,000-2,000 |
| Phase 4: Build | Specification + template reference | File path + self-check results | 1,500-3,000 |
| Phase 5: Validation | File path + specification | Validation report (pass/fail + issues) | 1,000-2,000 |
| Phase 6: Update | Validation report + file path | Updated file path + change summary | 1,000-2,000 |
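
One way to make these contracts checkable is to hold each row as a small record. The sketch below encodes the Phase 2 row; `PhaseContract` and `over_budget` are hypothetical names, and how tokens are counted is left to the caller because the appendix does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseContract:
    phase: str
    input_contract: str
    output_contract: str
    budget: tuple[int, int]   # (low, high) token budget from the table

    def over_budget(self, tokens: int) -> bool:
        """True when an output exceeds the upper end of the budget."""
        return tokens > self.budget[1]


RESEARCH = PhaseContract(
    phase="Phase 2: Research",
    input_contract="Requirements + search queries",
    output_contract="Research report (patterns, templates)",
    budget=(2000, 4000),
)
```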

Information Compression Between Phases

Key Principle: Each phase produces a compressed summary for downstream consumption, not full context.

## Phase 2 → Phase 3 Handoff (Research → Structure)

### Phase 2 Full Output (3,000 tokens):
```markdown
## Research Report: Validation Prompts

### Context Summary
Analyzed 5 existing validation prompts...
[Full analysis with examples, anti-patterns, discovery process]
```

Compressed Handoff to Phase 3 (500 tokens):

### Research Findings Summary

**Key Patterns to Apply**:
1. Structured output format (reference: prompt-1.md)
2. Three-tier boundaries (reference: prompt-2.md)
3. Severity categorization (reference: prompt-3.md)

**Template Recommendation**: validation-template.md
**Critical Constraints**: Tool count 3-7, read-only mode
**Anti-patterns to Avoid**: Unbounded loops, missing checkpoints


Handoff Prompt Template (Token-Optimized)

Use this structure for all phase transitions:

```yaml
handoffs:
  - label: "{Action Description}"
    agent: {target-agent}
    send: true
    prompt: |
      {Primary Task Statement - 1 sentence}
      
      **Context from Previous Phase** (reference, not embed):
      - See above for: {what's in conversation history}
      - Specification: {reference Phase 3 output}
      
      **Your Specific Inputs**:
      - {Input 1}: {value or reference}
      - {Input 2}: {value or reference}
      
      **Expected Output Format**:
      {Structured template for this agent's response}
      
      **Success Criteria**:
      - {Criterion 1}
      - {Criterion 2}
```

Example: Phase 4 → Phase 5 Handoff (Build → Validation)

handoffs:
  - label: "Validate Prompt Quality"
    agent: prompt-validator
    send: true
    prompt: |
      Validate the prompt file created above.
      
      **Context from Previous Phases**:
      - Specification: See Phase 3 output in conversation
      - Template used: {template-path from build phase}
      
      **Your Validation Inputs**:
      - File path: {created-file-path}
      - Expected tool count: 3-7
      - Expected mode: {agent-mode from spec}
      
      **Expected Output Format**:
      Validation report with:
      - Overall status (PASS/FAIL/WARN)
      - Check results table
      - Categorized issues (Critical/Moderate/Minor)
      - Recommended fixes for updater
      
      **Success Criteria**:
      - All structural checks pass
      - Tool count within bounds
      - Agent/tool alignment validated
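
The template and example above share a fixed five-part layout, so the prompt text can be assembled from structured inputs. A sketch, assuming a hypothetical `build_handoff_prompt` helper; it produces only the `prompt:` body, not the surrounding YAML.

```python
def build_handoff_prompt(
    task: str,
    context: dict[str, str],
    inputs: dict[str, str],
    output_format: str,
    criteria: list[str],
) -> str:
    """Assemble a handoff prompt with the template's five parts, in order."""
    lines = [task, "", "**Context from Previous Phase** (reference, not embed):"]
    lines += [f"- {key}: {value}" for key, value in context.items()]
    lines += ["", "**Your Specific Inputs**:"]
    lines += [f"- {key}: {value}" for key, value in inputs.items()]
    lines += ["", "**Expected Output Format**:", output_format]
    lines += ["", "**Success Criteria**:"]
    lines += [f"- {criterion}" for criterion in criteria]
    return "\n".join(lines)
```

Context values should stay references ("See Phase 3 output in conversation") instead of embedded content, which is the point of the template.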

Selective Context Passing Rules

For each phase transition, follow these rules:

| Handoff | INCLUDE | EXCLUDE |
|---------|---------|---------|
| Orchestrator → Researcher | User request, initial goal, constraints | N/A (first phase) |
| Researcher → Orchestrator | Pattern summary, template recommendation | Discovery process, all search results |
| Orchestrator → Builder | Specification, template path, boundaries | Research details, use case challenge history |
| Builder → Validator | File path, spec reference, expected mode | Build process, template loading steps |
| Validator → Updater | Issue list with line numbers, fix recommendations | Passing checks, validation methodology |
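
These rules amount to an allow-list per transition. A minimal sketch, in which the dictionary keys (`user_request`, `file_path`, and so on) are hypothetical names for the items in the INCLUDE column:

```python
# Keys are hypothetical names for the items in the INCLUDE column.
INCLUDE = {
    ("orchestrator", "researcher"): {"user_request", "initial_goal", "constraints"},
    ("researcher", "orchestrator"): {"pattern_summary", "template_recommendation"},
    ("orchestrator", "builder"): {"specification", "template_path", "boundaries"},
    ("builder", "validator"): {"file_path", "spec_reference", "expected_mode"},
    ("validator", "updater"): {"issues", "fix_recommendations"},
}


def select_context(sender: str, receiver: str, context: dict) -> dict:
    """Keep only the items the receiving agent is meant to see."""
    allowed = INCLUDE[(sender, receiver)]
    return {key: value for key, value in context.items() if key in allowed}
```

An allow-list drops anything not named, so new intermediate data is excluded by default until someone decides it belongs in a handoff.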

Reliability Checksum Pattern

Before each handoff, the orchestrator verifies that critical data survives:

## Phase Transition Checklist (Orchestrator Internal)

Before Phase {N} → Phase {N+1}:

- [ ] **Goal Preservation**: Refined goal from Phase 1 intact?
- [ ] **Scope Boundaries**: IN/OUT scope carried forward?
- [ ] **Tool Requirements**: Tool list present in handoff?
- [ ] **Critical Constraints**: Boundaries included?
- [ ] **Success Criteria**: Validation criteria defined?

**If ANY fails**: Re-inject missing context before handoff.
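
The checklist and its re-injection rule could look like the following sketch. The field names and both helpers are hypothetical; `record` stands for wherever the orchestrator keeps the authoritative copy of each item.

```python
# Hypothetical field names for the five checklist items.
REQUIRED_FIELDS = ("goal", "scope", "tools", "constraints", "success_criteria")


def missing_fields(handoff: dict) -> list[str]:
    """List checklist items that are absent or empty in the handoff."""
    return [field for field in REQUIRED_FIELDS if not handoff.get(field)]


def reinject(handoff: dict, record: dict) -> dict:
    """Return a copy of the handoff with missing items restored from record."""
    repaired = dict(handoff)
    for field in missing_fields(handoff):
        if field in record:
            repaired[field] = record[field]
    return repaired
```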

📊 Data Exchange Optimization

This section defines token-efficient data exchange strategies for agent handoffs, ensuring maximum reliability while minimizing context overhead.

Token Budget Per Phase

Each phase has a target token budget for incoming/outgoing context:

| Phase | Purpose | Incoming Budget | Outgoing Budget | Notes |
|-------|---------|-----------------|-----------------|-------|
| Phase 0 | Planning | ~500 tokens | ~800 tokens | User intent → plan outline |
| Phase 1 | Requirements | ~300 tokens | ~1500 tokens | Context → detailed requirements |
| Phase 2 | Research | ~800 tokens | ~2000 tokens | Requirements → patterns + references |
| Phase 3 | Architecture | ~1000 tokens | ~1500 tokens | Research → design decisions |
| Phase 4 | Build | ~2000 tokens | ~3000 tokens | Design → complete file |
| Phase 5-6 | Agents | ~1500 tokens | ~2500 tokens | Dependencies → agent files |
| Phase 7 | Validation | ~2500 tokens | ~1000 tokens | File → validation report |
| Phase 8 | Integration | ~500 tokens | ~300 tokens | Status → completion report |

Information Compression Patterns

Reference-First Pattern

Instead of passing full file contents, pass references with key excerpts:

# ❌ Token-Heavy (passes full content)
context: |
  Here is the existing file content:
  [2000 lines of file content...]
  
  Analyze and improve this.

# ✅ Token-Efficient (passes reference + key sections)
context: |
  File: `.github/prompts/validator.prompt.md`
  
  Key Sections to Improve:
  - Goal (lines 15-20): "Validate prompts..."
  - Tools (lines 45-60): Uses 5 tools currently
  - Constraints (lines 80-95): Missing boundary definitions
  
  Use read_file to access full content if needed.
  Focus improvements on identified sections.
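
The token-efficient form above is regular enough to generate. A sketch, assuming hypothetical names `SectionRef` and `reference_context`; the line ranges and notes come from whichever phase identified the sections.

```python
from dataclasses import dataclass


@dataclass
class SectionRef:
    title: str
    start: int   # first line of the section
    end: int     # last line of the section
    note: str    # short excerpt or observation


def reference_context(path: str, sections: list[SectionRef]) -> str:
    """Build a reference-first context: file path plus key sections only."""
    lines = [f"File: `{path}`", "", "Key Sections to Improve:"]
    lines += [f"- {s.title} (lines {s.start}-{s.end}): {s.note}" for s in sections]
    lines += ["", "Use read_file to access full content if needed."]
    return "\n".join(lines)
```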

Progressive Summarization Pattern

Each phase summarizes for the next, not for all future phases:

# Phase 2 → Phase 3 Handoff
context: |
  ## Requirements Summary (from Phase 1)
  Goal: Validate prompt files for structure compliance
  Role: Technical Validator
  Scope: IN=markdown files, OUT=code generation
  
  ## Research Findings (from Phase 2)
  Patterns Discovered:
  1. YAML frontmatter validation (see pattern-a.md)
  2. Section structure checks (see pattern-b.md)
  3. Tool count verification (3-7 range)
  
  ## YOUR TASK (Phase 3)
  Design architecture for implementing above patterns.

Selective Context Rules

| Information Type | Pass Forward? | Reason |
|------------------|---------------|--------|
| User’s original request | ✅ Always | Maintains intent alignment |
| Goal statement | ✅ Always | Core reference for all phases |
| Role description | ✅ Always | Constrains behavior |
| Use case challenge results | ⚠️ Summary only | Full results too verbose |
| Research file paths | ✅ References | Agent can read if needed |
| Research file contents | ❌ Never | Too token-heavy |
| Architecture decisions | ✅ Key decisions | Guides implementation |
| Build output (file) | ✅ Full content | Needed for validation |
| Validation errors | ✅ Full details | Needed for fixes |

Handoff Reliability Patterns

Checksum Pattern for Critical Data

For data that must survive handoff intact:

context: |
  ## Goal (CRITICAL - verify preserved)
  Text: "Analyze and validate prompt files for structural compliance"
  Checksum: goal-hash-7f3a2b
  
  ## Tools (CRITICAL - verify count)
  Count: 5
  List: [read_file, grep_search, list_dir, semantic_search, file_search]
  Checksum: tools-hash-9c4e1d
  
  At end of your phase, confirm:
  - Goal text unchanged ✅/❌
  - Tool count unchanged ✅/❌
  - If changed, document reason
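
The checksum values in the example above are placeholders. If real checksums are wanted, one option is a truncated SHA-256 digest, sketched below; the choice of hash, the six-character length, and all function names are assumptions and do not come from the plan.

```python
import hashlib


def short_hash(text: str, length: int = 6) -> str:
    """Truncated SHA-256 hex digest, short enough to embed in a prompt."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def tools_hash(tools: list[str]) -> str:
    """Hash the tool list independent of ordering."""
    return short_hash(",".join(sorted(tools)))


def verify(text: str, checksum: str) -> bool:
    """True when the received text still matches the recorded checksum."""
    return short_hash(text, len(checksum)) == checksum
```

A truncated digest only detects accidental drift such as a reworded goal or a dropped tool; it is not a security measure.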

Explicit Acknowledgment Pattern

Require agents to acknowledge critical handoff data:

# Orchestrator instruction to receiving agent
context: |
  Before proceeding, confirm you received:
  1. Goal statement: [quote it back]
  2. Role definition: [quote it back]
  3. Scope boundaries: [list IN/OUT items]
  
  If any are unclear or missing, request clarification
  before beginning your task.
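
Whether the receiving agent actually quoted the critical items back can be checked mechanically. The sketch below assumes the agent's acknowledgment is available as plain text; `missing_acknowledgments` is a hypothetical helper, and the case-insensitive substring match is a deliberately simple choice.

```python
def missing_acknowledgments(response: str, critical: dict[str, str]) -> list[str]:
    """Return the names of critical items not quoted back in the response.

    Matching ignores case and whitespace differences, so a quote that was
    re-wrapped across lines still counts as acknowledged.
    """
    normalized = " ".join(response.split()).lower()
    return [
        name
        for name, text in critical.items()
        if " ".join(text.split()).lower() not in normalized
    ]
```

An empty result means every item was echoed; otherwise the orchestrator can re-send the missing items before letting the phase proceed.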

Phase Transition Templates

Standard Handoff Template

## Handoff: Phase {N} → Phase {N+1}

### Summary from Phase {N}
{2-3 sentence summary of what was accomplished}

### Key Outputs
- Output 1: {brief description} [reference: path/file.md]
- Output 2: {brief description} [reference: inline below]

### Critical Data (verify preserved)
- Goal: "{goal statement}"
- Role: "{role statement}"
- Scope: IN=[list], OUT=[list]

### Phase {N+1} Task
{Specific instructions for next phase}

### Success Criteria
1. {Measurable criterion 1}
2. {Measurable criterion 2}
3. {Measurable criterion 3}
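
A filled-in handoff can be checked against this template before it is sent. The sketch below looks only for the section headings and assumes they appear verbatim, one per line; `missing_sections` is a hypothetical helper, not something the plan defines.

```python
def missing_sections(handoff: str, phase: int) -> list[str]:
    """Return the template headings that do not appear in the handoff text."""
    expected = [
        f"### Summary from Phase {phase}",
        "### Key Outputs",
        "### Critical Data (verify preserved)",
        f"### Phase {phase + 1} Task",
        "### Success Criteria",
    ]
    present = {line.strip() for line in handoff.splitlines()}
    return [heading for heading in expected if heading not in present]
```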

Error State Handoff Template

When a phase encounters issues:

## Handoff: Phase {N} → Error Recovery

### Error Encountered
Type: {ambiguity/conflict/missing-info/tool-failure}
Description: {What went wrong}

### Context at Error Point
- Completed Steps: {list}
- Failed Step: {step name}
- Remaining Steps: {list}

### Recovery Options
1. {Option A}: {description + impact}
2. {Option B}: {description + impact}
3. Return to Phase {M}: {if backtrack needed}

### User Decision Required
{Specific question for user to resolve}
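
If error handoffs are produced by tooling rather than written by hand, the template maps naturally onto a small record type. The following is a sketch under that assumption; `ErrorType`, `ErrorHandoff`, and the field names are illustrative, with the four error types taken from the template.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    AMBIGUITY = "ambiguity"
    CONFLICT = "conflict"
    MISSING_INFO = "missing-info"
    TOOL_FAILURE = "tool-failure"


@dataclass
class ErrorHandoff:
    phase: int
    error_type: ErrorType
    description: str
    completed_steps: list[str]
    failed_step: str
    remaining_steps: list[str]
    recovery_options: list[str]
    user_question: str

    def render(self) -> str:
        """Render the record in the layout of the error handoff template."""
        options = "\n".join(
            f"{i}. {option}" for i, option in enumerate(self.recovery_options, 1)
        )
        return "\n".join([
            f"## Handoff: Phase {self.phase} → Error Recovery",
            "",
            "### Error Encountered",
            f"Type: {self.error_type.value}",
            f"Description: {self.description}",
            "",
            "### Context at Error Point",
            f"- Completed Steps: {', '.join(self.completed_steps)}",
            f"- Failed Step: {self.failed_step}",
            f"- Remaining Steps: {', '.join(self.remaining_steps)}",
            "",
            "### Recovery Options",
            options,
            "",
            "### User Decision Required",
            self.user_question,
        ])
```

Using an enum for the error type keeps the set of values closed, so a typo fails when the record is built rather than after the handoff has been sent.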

Anti-Patterns to Avoid

Anti-Pattern	Problem	Solution
Context Dumping	Passing all prior context to each phase	Use progressive summarization
Reference Omission	Not providing file paths for context	Always include actionable references
Implicit Expectations	Assuming agent knows what to return	Explicitly state required outputs
Unbounded Output	No guidance on output length	Specify token/line budgets
Silent Failures	Agent proceeds despite missing data	Require acknowledgment of critical data

Implementation Roadmap

Stage 1: Agent Infrastructure (Week 1, Days 1-3)

Priority: CRITICAL - Agents are dependencies for orchestrators

Day	Task	Output
1	Create `agent-researcher.agent.md`	Agent file with role challenge capabilities
1	Create `agent-validator.agent.md`	Agent file with tool alignment checks
2	Create `agent-builder.agent.md`	Agent file with tool count verification
2	Create `agent-updater.agent.md`	Agent file with alignment preservation
3	Test all 4 agents independently	Test results documented
3	v1.107 Testing: Test agents in background context with work trees	Verify isolation works

Stage 2: Update Existing Prompt Agents (Week 1, Days 3-4)

Priority: HIGH - Enhance existing agents with v2 capabilities

Day	Task	Output
3	Update `prompt-researcher.agent.md`	Added use case challenge methodology
3	Update `prompt-validator.agent.md`	Added tool alignment validation
4	Update `prompt-builder.agent.md`	Added pre-save validation
4	Update `prompt-updater.agent.md`	Added change categorization
4	v1.107 Testing: Test “Continue in” delegation from chat	Verify context transfer

Stage 3: Create Orchestration Prompts (Week 1-2, Days 5-8)

Priority: MEDIUM - Build on agent infrastructure

Day	Task	Output
5	Create `agent-design-and-create.prompt.md`	Full orchestrator with Phase 0-8 (planning mode)
6	Create `agent-review-and-validate.prompt.md`	Validation orchestrator
7	Update `prompt-design-and-create.prompt.md`	Enhanced with Phase 0, 5-6
8	Update `prompt-review-and-validate.prompt.md`	Minor enhancements
8	v1.107 Testing: Test Agent HQ session management	Verify all phases tracked

Stage 4: Integration Testing (Week 2, Days 9-10)

Day	Task	Output
9	End-to-end test: Create test prompt	Working prompt file
9	End-to-end test: Create test agent	Working agent file
9	v1.107 Testing: Test planning mode (Phase 0)	Verify plan generation works
10	End-to-end test: Validation workflows	Both review-validate prompts working
10	End-to-end test: Agent dependencies	Phase 5-6 agent plans work
10	v1.107 Testing: Test background validation with work trees	Verify parallel execution

Stage 5: Documentation and Migration (Week 2, Days 11-12)

Day	Task	Output
11	Add deprecation notices to v2 files	Notices pointing to new workflows
11	Update this planning document	Mark stages complete
11	v1.107 Documentation: Document Agent HQ workflow patterns	Usage guide for session management
12	Team training documentation	Usage guide for new workflows + v1.107 features
12	Update tech/ articles	References to new system and v1.107 capabilities

v1.107 Testing Checklist

Agent HQ Features

⬜ Verify all orchestration phases appear in Agent HQ sessions list
⬜ Test filtering sessions by agent name
⬜ Test archiving completed sessions
⬜ Verify read/unread markers work correctly
⬜ Test side-by-side session view (v1.107.1)

“Continue in” Delegation

⬜ Test “Continue in Background” from orchestrator
⬜ Test “Continue in Cloud” from orchestrator
⬜ Verify context transfer to specialist agents
⬜ Test notification when background session completes
⬜ Verify Agent HQ marks session complete

Work Tree Isolation

⬜ Test background agent creates isolated work tree
⬜ Verify no conflicts with main workspace edits
⬜ Test “Apply” action to merge changes back
⬜ Verify parallel execution (validation while building)
⬜ Test cleanup of work trees after completion

Planning Mode

⬜ Test `agent: plan` in orchestrator YAML
⬜ Verify implementation plan generation
⬜ Test delegation from plan to builder agents
⬜ Verify plan includes agent dependencies
⬜ Test user approval checkpoint after planning

Success Criteria

Reliability Metrics

Metric	Target	Measurement
Agent tool count	100% agents have 3-7 tools	YAML validation
Tool alignment	100% agents pass alignment check	Validator reports
Validation pass rate	90%+ first-time pass	Track Phase 7 results
Issue resolution cycles	≤2 loops to resolution	Track Phase 7→8 iterations
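
The first metric in this table (every agent has 3-7 tools) lends itself to an automated check. The sketch below assumes the `tools` lists have already been parsed out of each agent file's YAML frontmatter into a dictionary; the parsing step and the name `check_tool_count` are not part of the plan.

```python
def check_tool_count(
    agents: dict[str, list[str]], low: int = 3, high: int = 7
) -> dict[str, int]:
    """Return {agent_name: tool_count} for agents outside the low..high range."""
    return {
        name: len(tools)
        for name, tools in agents.items()
        if not low <= len(tools) <= high
    }


agents = {
    "agent-builder": ["read_file", "grep_search", "list_dir"],
    "agent-validator": ["read_file", "grep_search"],
}
violations = check_tool_count(agents)
```

An empty result corresponds to the 100% target; any entry names an agent to fix along with its current tool count.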

Quality Metrics

Metric	Target	Measurement
Use case coverage	100% of roles challenged	Phase 1 completion
Boundary completeness	100% have three-tier boundaries	Validator check
Template compliance	100% follow patterns	Pattern consistency score

Workflow Metrics

Metric	Target	Measurement
Time to create prompt	<15 minutes	Time tracking
Time to create agent	<20 minutes	Time tracking
User approval rate	>90% plans approved as-is	Phase transition tracking

File Checklist Summary

Orchestration Prompts

File	Status	Action
`prompt-design-and-create.prompt.md`	🔄	UPDATE (add Phase 5-6)
`prompt-review-and-validate.prompt.md`	🔄	UPDATE (minor)
`agent-design-and-create.prompt.md`	✨	CREATE NEW
`agent-review-and-validate.prompt.md`	✨	CREATE NEW

Agent Files - Existing (Prompt)

File	Status	Action
`prompt-researcher.agent.md`	🔄	UPDATE (add use case challenge)
`prompt-builder.agent.md`	🔄	UPDATE (add pre-save validation)
`prompt-updater.agent.md`	🔄	UPDATE (add change categorization)
`prompt-validator.agent.md`	🔄	UPDATE (add tool alignment)

Agent Files - New (Agent)

File	Status	Action
`agent-researcher.agent.md`	✨	CREATE NEW
`agent-builder.agent.md`	✨	CREATE NEW
`agent-updater.agent.md`	✨	CREATE NEW
`agent-validator.agent.md`	✨	CREATE NEW

Instruction Files

File	Status	Action
`prompts.instructions.md`	✅	NO CHANGES
`agents.instructions.md`	✅	NO CHANGES

Files to Deprecate (After Validation)

File	Status	Migration Path
`agent-createorupdate-agent-file-v2.prompt.md`	🔄	→ `agent-design-and-create.prompt.md`
`prompt-createorupdate-prompt-file-v2.prompt.md`	🔄	→ `prompt-design-and-create.prompt.md`

Next Steps

Immediate Actions

⬜ Review this plan - Approve architecture and approach
⬜ Create agent-researcher.agent.md - First new agent
⬜ Create agent-validator.agent.md - Second new agent
⬜ Create agent-builder.agent.md - Third new agent
⬜ Create agent-updater.agent.md - Fourth new agent

Upon Plan Approval

Execute Stages 1-5 following the implementation roadmap above.

Document Version: 2.0 (appendix)
Extracted from: Multi-Agent Orchestration Plan V2
Created: 2025-12-14
Last Updated: 2025-12-14
