| Status | Last updated | Parent | Read before this |
|---|---|---|---|
| complete | 2026-04-30 | | |
Skill Pipeline
Current state
A feature travels through a fixed sequence of skills from /setup through /worktree-cleanup. Each skill picks up from the previous one's deliverables and the prior Contract - Phase Outcome. No state is held in the orchestrator's head between skills.
/setup
→ /git-worktrees
→ /requirements ⎯⎯⎯ PREQ approval
→ /technical-plan ⎯⎯⎯ plan approval
→ /develop ⎯⎯⎯ UAT
→ /simplify → /simplify-report
→ /code-validate → /code-tests → /code-fix
→ /e2e-validate → /e2e-tests → /e2e-fix
→ /a11y-validate → /a11y-tests → /a11y-fix
→ /api-validate → /api-tests → /api-fix
→ /security-browser-validate → /security-browser-tests → /security-browser-fix
→ /security-api-validate → /security-api-tests → /security-api-fix
→ /closeout
→ /integrate
→ /worktree-cleanup
The pipeline has three phases:
Planning and implementation runs from /setup through /develop. It produces an approved PREQ, an approved technical plan (one SREQ per feature, merged with the plan), and implemented code that the orchestrator has accepted at UAT.
QA runs from /simplify through the security pipelines. The structure is the same for each domain: a validate skill produces a report; a tests skill generates regression tests, marking the test for each unfixed finding with test.fixme() / @pytest.mark.xfail / t.Skip(); a fix skill applies fixes and promotes those tests to passing. Pairs are independent of each other (e2e-validate doesn't block a11y-validate), but within a pair the order is fixed.
Wrap-up runs /closeout (traceability matrix, MR description, learning analysis), /integrate (rebase, test, merge), and /worktree-cleanup.
The three approval gates are PREQ (after /requirements), plan (after /technical-plan), and UAT (after /develop). Each is a Pending Decision in the corresponding Phase Outcome.
/develop decomposes the approved SREQ into work units (WU-N) at runtime. The decomposition is the lead agent's call, made against the current code state — it doesn't appear in the SREQ artifact and may differ if the skill is re-run. Work units are not issues; they're an internal scheduling concept. Each work unit gets a Test Writer + Implementer pair, organised into waves where dependencies allow parallelism. See Role - Coding Harness Agents.
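Grouping work units into waves amounts to layering a dependency graph. The sketch below shows one way to do that; the function name and the `deps` mapping are illustrative assumptions, not the lead agent's actual procedure.

```python
def plan_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group work units into waves.

    `deps` maps each work unit to the units it depends on. A unit joins
    the earliest wave in which all of its dependencies are already done,
    so units within one wave can run in parallel.
    """
    remaining = {wu: set(d) for wu, d in deps.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        ready = sorted(wu for wu, d in remaining.items() if d <= done)
        if not ready:
            # Nothing can start: a cycle, or a dependency that was never declared.
            raise ValueError(f"unresolvable dependencies among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for wu in ready:
            del remaining[wu]
    return waves


waves = plan_waves({
    "WU-1": set(),
    "WU-2": set(),
    "WU-3": {"WU-1"},
    "WU-4": {"WU-2", "WU-3"},
})
# waves == [["WU-1", "WU-2"], ["WU-3"], ["WU-4"]]
```

Each wave would then get one Test Writer + Implementer pair per work unit, with the next wave starting only once the current one has finished.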
Rationale
The pipeline is fixed rather than dynamic because dynamic pipelines are hard to reason about and harder to debug. Every feature traveling through the same sequence means the orchestrator always knows what comes next and can resume from disk artifacts after any interruption.
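Because the sequence is fixed, resuming reduces to finding the first skill with no recorded outcome. The sketch below assumes a hypothetical on-disk layout with one `<skill>.md` Phase Outcome file per completed skill, and assumes a skipped skill still records an outcome; the real artifact names and locations may differ.

```python
from pathlib import Path
from typing import Optional

QA_DOMAINS = ["code", "e2e", "a11y", "api", "security-browser", "security-api"]

# The fixed order from the diagram above, without the leading slashes.
PIPELINE = (
    ["setup", "git-worktrees", "requirements", "technical-plan", "develop",
     "simplify", "simplify-report"]
    + [f"{d}-{step}" for d in QA_DOMAINS for step in ("validate", "tests", "fix")]
    + ["closeout", "integrate", "worktree-cleanup"]
)


def next_skill(outcome_dir: Path) -> Optional[str]:
    """Return the first skill in pipeline order with no outcome file on disk.

    Returns None when every skill has recorded an outcome.
    """
    done = {p.stem for p in outcome_dir.glob("*.md")}
    for skill in PIPELINE:
        if skill not in done:
            return skill
    return None
```

Nothing in this lookup depends on what the orchestrator remembers, so it gives the same answer after an interruption as it would mid-run.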
The validate/tests/fix triplet exists because the three concerns are genuinely different: one is "what's wrong", one is "how do we keep it from regressing", one is "make it right". Fusing them produces skills that try to do too much; splitting them lets each one stay focused, and lets fix-skills be skipped when there's nothing to fix without losing the regression tests.
QA pairs are ordered within a domain and independent across domains: validate→tests has a real data dependency (tests are generated from the report), but code-validate doesn't depend on e2e-validate. Parallelising across domains would therefore be possible; for now the cost discipline of running them sequentially (and stopping early if a critical one fails) wins.
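The sequential schedule with early stopping could look roughly like the sketch below. The source does not say which domains count as critical or what "fails" means, so several things here are assumptions: `run_skill` is a hypothetical callback returning the number of open findings, the fix skill is skipped when validate reports none, and a critical domain "fails" when findings remain after its fix skill.

```python
from typing import Callable

QA_DOMAINS = ["code", "e2e", "a11y", "api", "security-browser", "security-api"]


def run_qa(run_skill: Callable[[str], int], critical: set[str]) -> list[str]:
    """Run validate -> tests -> fix for each domain in order.

    Returns the names of the skills that were executed.
    """
    executed: list[str] = []

    def run(name: str) -> int:
        executed.append(name)
        return run_skill(name)

    for domain in QA_DOMAINS:
        open_findings = run(f"{domain}-validate")
        run(f"{domain}-tests")  # regression tests are generated from the report
        if open_findings > 0:
            open_findings = run(f"{domain}-fix")
        if open_findings > 0 and domain in critical:
            break  # stop early rather than pay for the remaining domains
    return executed
```

For example, if e2e is treated as critical and one finding survives its fix skill, the a11y, api, and security domains are never started.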
Work units are intentionally ephemeral. The earlier model had per-SREQ child issues; that turned out to be premature decomposition — the right granularity emerges from looking at the actual code, not from a planning artifact. Letting /develop decide on the spot keeps the SREQ stable while letting the implementation adapt.
Open questions
- The validate/tests/fix triplet for code includes code-tests (regression test generation from code-validate findings). Some categories of code-validate finding (naming, dead code, complexity) are non-testable; code-tests skips them with a comment. Whether the skip rate makes code-tests worth its slot is worth revisiting once there's usage data.
- A /qa orchestrator that runs all the pairs in one command was considered and deferred. Will revisit when there's data on which combinations are commonly run together.