MPL
Micro-Phase Loop - Coherence-first autonomous coding pipeline plugin for Claude Code
MPL (Micro-Phase Loop) v0.18.1
Prevention over cure. Specification over debugging.
A Claude Code plugin that decomposes ambitious tasks into micro-phases, each independently planned, executed, and verified in isolation, so context never corrupts and failures never cascade.
Quick Start · Philosophy · How · Pipeline Router · Agents · Under the Hood
AI can build anything in isolation. The hard part is building things that compose.
The longer an AI agent runs, the more it forgets what it promised. MPL doesn't fight this; it embraces it by giving each phase a fresh mind with only the knowledge it needs.
From Chaos to Coherence
"The best way to predict the future is to prevent the past."
Every autonomous coding pipeline faces the same enemy: context pollution. The longer a session runs, the more accumulated state (half-finished ideas, abandoned approaches, stale assumptions) degrades every subsequent decision. By Phase 4, the agent is debugging its own confusion, not your code.
MPL's answer is architectural, not heroic:
| Chaos | Coherence |
|---|---|
| "Build everything at once" | "Build one thing perfectly, then forget" |
| "Fix it later in Phase 5" | "Prevent it in Phase 0" |
| "Trust the agent's memory" | "Trust only written artifacts" |
This isn't a philosophy of caution; it's a philosophy of compound reliability. Each micro-phase is small enough to succeed. Each success is recorded as a State Summary. Each subsequent phase reads only that summary, not the messy history of how it was produced.
The result: Phase 10 runs with the same clarity as Phase 1.
The Two Laws
MPL is built on two empirically validated laws:
Law 1: Invest in specification, eliminate debugging.
Seven experiments proved that Phase 0 investment (API contracts, type policies, error specs) monotonically reduces Phase 5 rework. At full specification, debugging drops to zero:
| Phase 0 Investment | Pass Rate | Progression |
|---|---|---|
| No Phase 0 | 38% | debugging hell |
| + API Contracts | 58% | still painful |
| + Example Patterns | 65% | improving |
| + Type Policy | 77% | almost there |
| + Error Spec | 100% | zero debugging |
Law 2: The orchestrator must never write code.
The moment the orchestrator touches source files, it becomes invested in its own implementation. It defends its code instead of objectively verifying it. MPL enforces separation with a PreToolUse hook that warns the orchestrator when it attempts to edit source files directly. All code flows through mpl-phase-runner agents via Task delegation.
Quick Start
Step 1 - Add marketplace & install:
# Register the MPL marketplace (one-time)
claude plugin marketplace add https://github.com/KyubumShin/MPL.git
# Install the plugin
claude plugin install mpl
Or from inside Claude Code:
/plugin marketplace add https://github.com/KyubumShin/MPL.git
/plugin install mpl
Alternative: Manual installation
# As a git submodule
cd /path/to/your-project
git submodule add https://github.com/KyubumShin/MPL.git
# Load locally for testing without installing
claude --plugin-dir ./MPL
Step 2 - Run setup:
/mpl:mpl-setup
The setup wizard automatically:
- Creates runtime directories (.mpl/)
- Detects available tools (LSP, AST)
- Configures standalone fallbacks if needed
- Optionally enables the HUD statusline
Step 3 - Start building:
mpl add user authentication with OAuth and role-based access
What just happened?
Quick Scope Scan → 8 affected files, 4 test scenarios → pipeline_score 0.72
Hat Selection → PP-proximity scoring → appropriate pipeline depth
PP Interview → 6 Pivot Points extracted (3 CONFIRMED, 3 PROVISIONAL)
Phase 0 Enhanced → API contracts + type policy + error spec generated
Decomposition → 4 micro-phases with interface contracts
Phase Execution → 4 phases × (plan → execute → test → verify)
3H+1A Gate → Hard 1: build+types, Hard 2: tests, Hard 3: PP compliance, Advisory: cross-boundary
RUNBOOK → Full execution log for session continuity
Each phase saw only its own context. No pollution. No cascade.
The Loop
MPL's core is a decompose-execute-verify loop where each iteration is a fresh session:
        ┌──── Phase 0: Specify ────┐
        │  API contracts           │
        │  Type policies           │
        │  Error specs             │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Decompose into N        │
        │  micro-phases            │
        └────────────┬─────────────┘
                     │
┌────────────────────▼───────────────────┐
│ For each phase (fresh session):        │
│   Plan → Phase Runner → Test → Verify  │
│   Output: State Summary only           │
└────────────────────┬───────────────────┘
                     │
        ┌────────────▼─────────────┐
        │  3 Hard + 1 Advisory     │
        │  Hard 1: Build+Types     │
        │  Hard 2: Tests           │
        │  Hard 3: PP Compliance   │
        │  Advisory: Cross-Boundary│
        └────────────┬─────────────┘
                     │
                 Complete
| Step | What Happens | Why It Matters |
|---|---|---|
| Triage | Analyze prompt density, scan scope | Right-size the pipeline |
| Pivot Points | Socratic interview extracts immutable constraints | Prevent scope drift |
| Phase 0 | Pre-specification: contracts, types, errors | Eliminate debugging |
| Decompose | Break into ordered phases with interface contracts | Each phase is independently verifiable |
| Execute | Fresh session per phase, Phase Runner delegation, micro-test cycles | No context pollution |
| 3H+1A Gate | Build+Types (Hard) → Tests (Hard) → PP Compliance (Hard) + Cross-Boundary (Advisory) | Evidence-based completion |
| RUNBOOK | Continuous audit log for human/agent session continuity | Pick up where you left off |
State Summary: The Only Bridge
Between phases, only one artifact passes: the State Summary. It contains what was built, what was decided, and what was verified, and nothing else. No code snippets, no debugging history, no abandoned approaches.
This is the key insight: forgetting is a feature. Each phase starts clean, with only the structured knowledge it needs. The orchestrator manages context assembly (loading the right summaries, the right Phase Decisions, the right impact files) so the Phase Runner operates with perfect information density.
Build-Test-Fix: The Micro-Cycle
Inside each phase, every TODO gets immediate verification:
For each TODO:
Build → Phase Runner implements the change directly
Test → Run affected tests immediately (not at the end)
Fix → Fix on failure (max 2 retries per TODO)
After all TODOs:
Test Agent → Independent test writing (code author ≠ test author)
Cumulative → Full regression check against all prior phases
Batching implementation before testing is forbidden. A bug discovered after 5 TODOs could have been caused by any of them. A bug discovered immediately after TODO #3 was caused by TODO #3.
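The per-TODO cycle above can be sketched as a small loop. This is an illustrative sketch, not MPL's actual code: `run_micro_cycle` and the `build` and `test` callbacks are hypothetical names, and reading "max 2 retries" as two fix attempts after the first failing test is an assumption.

```python
from typing import Callable, Dict, List

MAX_RETRIES = 2  # "max 2 retries per TODO" from the text above


def run_micro_cycle(
    todos: List[str],
    build: Callable[[str, int], None],
    test: Callable[[str], bool],
) -> Dict[str, str]:
    """Build, test, and fix each TODO before moving on to the next one.

    `build(todo, attempt)` and `test(todo)` are hypothetical callbacks that
    stand in for the Phase Runner's implementation and test steps.
    """
    results: Dict[str, str] = {}
    for todo in todos:
        status = "FAIL"
        # Attempt 0 is the initial build; attempts 1..MAX_RETRIES are fixes.
        for attempt in range(MAX_RETRIES + 1):
            build(todo, attempt)
            if test(todo):  # tested immediately, never batched
                status = "PASS"
                break
        # Assumption: a failed TODO is recorded and the loop continues;
        # how MPL reacts to the failure is a separate concern.
        results[todo] = status
    return results
```

Because each TODO is tested right after it is built, a `FAIL` entry in the result points at exactly one change.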
Circuit Break: Graceful Failure
When a phase fails after 3 retries, it doesn't crash; it circuit breaks:
- Preserve PASS TODOs (verified work is never discarded)
- Rollback FAIL TODO files to pre-phase state
- Attempt tier escalation before giving up (see The Router)
- If all tiers exhausted, transition to phase5-finalize (partial completion)
Circuit break does not lead directly to pipeline failure. MPL reports what succeeded and what failed, and partial progress is always preserved.
The Router
The user should never have to judge "is this a small task or a big one?" The system should figure it out, and adapt when it's wrong.
The Hat Model (PP-Proximity)
MPL uses PP-proximity (Pivot Point proximity) to determine pipeline depth. Each task is scored by how close it touches the project's immutable constraints (Pivot Points). Higher proximity means more pipeline rigor.
One entry point. Auto-scoring. Dynamic escalation.
"mpl fix the login bug" → Triage → Hat scores PP-proximity → lightweight pipeline
"mpl add email validation" → Triage → Hat scores PP-proximity → standard pipeline
"mpl refactor the auth system" → Triage → Hat scores PP-proximity → full pipeline
PP-Proximity Score
Triage runs a Quick Scope Scan and computes PP-proximity, a measure of how much the task touches core constraints:
pp_proximity = (pp_impact × 0.40) + (file_scope × 0.25)
             + (contract_change × 0.20) + (risk_signal × 0.15)
pp_impact: How many PPs are directly affected (0.0 ~ 1.0)
file_scope: min(affected_files / 10, 1.0)
contract_change: Whether interface contracts change (0.0 or 1.0)
risk_signal: prompt keyword analysis (0.0 ~ 1.0)
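The formula above translates directly into code. The sketch below is illustrative only: the `ScopeScan` type and its field names are hypothetical, and how `pp_impact` and `risk_signal` are derived is not specified here, so they are taken as already-normalized inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeScan:
    """Hypothetical container for Quick Scope Scan results."""
    pp_impact: float        # share of Pivot Points affected, 0.0 to 1.0
    affected_files: int     # raw file count from the scan
    contract_change: bool   # whether interface contracts change
    risk_signal: float      # prompt keyword analysis, 0.0 to 1.0


def pp_proximity(scan: ScopeScan) -> float:
    """Weighted sum from the formula above; the weights add up to 1.0."""
    file_scope = min(scan.affected_files / 10, 1.0)
    contract = 1.0 if scan.contract_change else 0.0
    return (
        scan.pp_impact * 0.40
        + file_scope * 0.25
        + contract * 0.20
        + scan.risk_signal * 0.15
    )
```

For example, a task with `pp_impact=0.5`, 8 affected files, a contract change, and `risk_signal=0.4` scores roughly 0.20 + 0.20 + 0.20 + 0.06 = 0.66. File counts above 10 saturate `file_scope` at 1.0, so very large changes cannot push the score past 1.0.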
Hat + Floor
| Hat | PP-Proximity | What Runs | Floor (minimum guarantee) | ~Tokens |
|---|---|---|---|---|
| Light | Low | Error Spec → Fix → Gate 1 → Commit | Gate 1 always runs | ~5-15K |
| Standard | Medium | PP(light) → Error Spec → Single Phase → Gate 1+2 | Gate 1+2 always run | ~20-40K |
| Full | High | Full pipeline with all gates | All 3 Hard Gates run | ~50-100K+ |
Dynamic Escalation
When a Hat level fails, it doesn't give up; it grows:
[Light] ──circuit break──▶ [Standard] ──circuit break──▶ [Full]
          │                              │
          ├─ Completed TODOs preserved   ├─ Completed phases preserved
          └─ Failed TODO → single phase  └─ Failed phase → phase5-finalize
Keyword hints still work as manual overrides: "mpl bugfix" → light, "mpl small" → standard.
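The escalation order and the keyword overrides can be sketched as two small functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes the hint is the word immediately following "mpl", which matches the examples but is not confirmed as the actual detection rule.

```python
from typing import Optional

HATS = ["light", "standard", "full"]  # escalation order from the diagram
KEYWORD_OVERRIDES = {"bugfix": "light", "small": "standard"}


def keyword_override(prompt: str) -> Optional[str]:
    """Return the forced Hat when the word after "mpl" is a known hint."""
    words = prompt.lower().split()
    if len(words) >= 2 and words[0] == "mpl":
        return KEYWORD_OVERRIDES.get(words[1])
    return None


def escalate(hat: str) -> Optional[str]:
    """Next Hat after a circuit break, or None once Full is exhausted."""
    index = HATS.index(hat)
    return HATS[index + 1] if index + 1 < len(HATS) else None
```

A `None` from `escalate("full")` corresponds to the point where no tier is left and the pipeline moves to partial completion.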
The Eight Minds
Eight agents, each with a single purpose. Loaded on-demand, never preloaded:
| Agent | Role | Core Principle |
|---|---|---|
| Interviewer | Socratic questioning for Pivot Points + Ambiguity Resolution | "What are you NOT willing to compromise on?" |
| Codebase Analyzer | Project structure analysis (haiku) | "What exists before we plan?" |
| Decomposer | Break into ordered micro-phases + Phase Seed generation | "What depends on what?" |
| Phase Runner | Execute a single phase end-to-end + test writing | "Plan, implement, verify, summarize" |
| Code Reviewer | Quality gate + PP compliance | "Would I approve this PR?" |
| Scout | Lightweight codebase exploration (haiku) | "Find it fast, spend nothing" |
| Compound | Learning extraction and distillation | "What did we learn that future runs should know?" |
| Doctor | Installation diagnostics | "Is everything wired correctly?" |
Agent Separation Principle
The Phase Runner who implements code is never the Test Agent who verifies it. The Decomposer who plans is never the Phase Runner who executes. The Orchestrator who assembles context never touches source files. Each separation eliminates a class of bias.
Verification System
A/S/H Classification
Not all verification is equal. MPL classifies every criterion:
| Type | Name | Verified By | Example |
|---|---|---|---|
| A-item | Agent-Verifiable | Exit code, file exists | npm test exits 0 |
| S-item | Sandbox Testing | BDD scenarios, Given/When/Then | Integration test passes |
| H-item | Human-Required | Side Interview with user | UX judgment, visual review |
3 Hard Gates + 1 Advisory
Three hard gates that block completion, plus one advisory gate:
| Gate | Type | Method | Pass Criteria |
|---|---|---|---|
| Hard 1 | Hard | Build + Type Check (project-wide) | 0 build errors, 0 type errors |
| Hard 2 | Hard | Automated tests (A + S items) | pass_rate >= 95% |
| Hard 3 | Hard | PP compliance + H-item resolution | No violations + all H-items resolved |
| Advisory | Advisory | Cross-boundary contract check | Boundary contract consistency (warns, does not block) |
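The pass criteria in the table can be read as a single check over collected evidence. The following is a sketch, not MPL's gate logic: `GateEvidence` and `evaluate_gates` are hypothetical names, and evaluating all gates rather than stopping at the first failure is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GateEvidence:
    """Hypothetical summary of what the gates inspect."""
    build_errors: int
    type_errors: int
    pass_rate: float           # fraction of A + S items passing, 0.0 to 1.0
    pp_violations: int
    unresolved_h_items: int
    boundary_consistent: bool


def evaluate_gates(e: GateEvidence) -> Tuple[bool, List[str], List[str]]:
    """Return (complete, blocking_failures, warnings) per the gate table."""
    blocking: List[str] = []
    warnings: List[str] = []
    if e.build_errors != 0 or e.type_errors != 0:
        blocking.append("Hard 1")
    if e.pass_rate < 0.95:
        blocking.append("Hard 2")
    if e.pp_violations != 0 or e.unresolved_h_items != 0:
        blocking.append("Hard 3")
    if not e.boundary_consistent:
        warnings.append("Advisory")  # warns, does not block completion
    return (not blocking, blocking, warnings)
```

Only the three hard gates feed the `complete` flag; an inconsistent boundary contract surfaces as a warning alongside a passing result.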
Convergence Detection
Fix loops track pass rate history for automatic decisions:
| Status | Condition | Action |
|---|---|---|
| progressing | delta > min_improvement | Continue fixing |
| stagnating | variance < 5% AND delta < threshold | Change strategy or escalate |
| regressing | delta < -10% | Revert or review Phase 0 artifacts |
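The three statuses can be expressed as a small classifier. This is a sketch under assumptions: the function name is hypothetical, the default `min_improvement` of 0.02 is a placeholder rather than a documented value, the table's "threshold" is taken to be the same value as `min_improvement`, and `delta` and `variance` are passed in as fractions because the table does not say how they are computed from the pass rate history.

```python
def classify_convergence(
    delta: float,
    variance: float,
    min_improvement: float = 0.02,   # placeholder value, not from MPL
    stagnation_variance: float = 0.05,
    regression_delta: float = -0.10,
) -> str:
    """Map a pass-rate delta and variance onto the statuses in the table."""
    if delta < regression_delta:
        return "regressing"
    if delta > min_improvement:
        return "progressing"
    if variance < stagnation_variance and delta < min_improvement:
        return "stagnating"
    # The table does not cover every combination; flag the rest explicitly.
    return "undetermined"
```

Checking regression first means a sharp drop is never mistaken for stagnation, even when the variance happens to be low.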
Under the Hood
8 agents · 8 hooks · 11 skills · 5 protocol files
MPL/
├── agents/ # 8 agent definitions (YAML frontmatter)
│   └── mpl-interviewer.md # PP Interview + ambiguity resolution (opus)
├── commands/ # Orchestration protocols (split for token efficiency)
│   ├── mpl-run.md # Router: which protocol file to load
│   ├── mpl-run-phase0.md # Steps -1 ~ 2.5: Triage, PP, Phase 0
│   ├── mpl-run-decompose.md # Steps 3 ~ 3-F: Decomposition + feedback loop
│   ├── mpl-run-execute.md # Step 4: Execution loop, 3H+1A Gate, Fix loop
│   └── mpl-run-finalize.md # Steps 5 ~ 6: Finalize, Resume
├── prompts/ # 4-Layer template system (F-39)
│   ├── domains/ # 8 domain templates (base layer)
│   ├── subdomains/ # 19 tech-stack templates
│   ├── tasks/ # 6 task-type overlays
│   └── langs/ # 5 language templates
├── hooks/ # 8 hooks across 6 events
│   ├── mpl-write-guard.mjs # Warns orchestrator on source file edits
│   ├── mpl-validate-output.mjs # Validates agent output schemas
│   ├── mpl-phase-controller.mjs # Phase transitions + escalation (F-21)
│   ├── mpl-keyword-detector.mjs # "mpl" keyword → pipeline init
│   ├── mpl-auto-permit.mjs # Learned auto-permission (F-34)
│   ├── mpl-permit-learner.mjs # Permission pattern learning (F-34)
│   ├── mpl-compaction-tracker.mjs # Compaction checkpoint (F-31)
│   ├── mpl-session-init.mjs # Context rotation init (F-38)
│   └── lib/
│       ├── mpl-state.mjs # State management + escalation
│       ├── mpl-scope-scan.mjs # Pipeline score calculation (F-20)
│       ├── mpl-cache.mjs # Phase 0 caching
│       ├── mpl-profile.mjs # Token profiling
│       └── mpl-routing-patterns.mjs # Routing pattern learning (F-22)
├── skills/ # 11 skills
│   ├── mpl/ # Main pipeline (single entry point)
│   ├── mpl-pivot/ # PP interview
│   ├── mpl-status/ # Dashboard
│   ├── mpl-cancel/ # Clean cancellation
│   ├── mpl-resume/ # Resume from checkpoint
│   ├── mpl-doctor/ # Diagnostics
│   └── mpl-setup/ # Setup wizard
└── docs/
    ├── design.md # Full specification
    ├── standalone.md # Standalone mode fallback matrix (F-04)
    └── roadmap/ # Evolution history + future plans
Key internals:
- Hat Model (PP-proximity): Quick Scope Scan + PP-proximity score → Hat-based pipeline depth selection
- Dynamic Escalation (F-21): light → standard → full on circuit break, preserving completed work
- RUNBOOK (F-10): Integrated execution log, auto-updated at 9 pipeline points, enables session resume
- Session Persistence (F-12): <remember priority> tags at phase transitions + RUNBOOK dual safety net
- Run-to-Run Learning (F-11): Orchestrator distills RUNBOOK → .mpl/memory/learnings.md
- Routing Pattern Learning (F-22): Jaccard similarity matching on past execution patterns
- Self-Directed Context (F-24): Phase Runner can Read/Grep within scope-bounded impact files
- Task-based TODO (F-23): TaskCreate/TaskUpdate as primary TODO state manager during execution
- Background Execution (F-13): Independent TODOs dispatched with run_in_background: true
- Hard 1 Build+Type Check: Project-wide build and type checking (consolidates previous Gate 0.5)
- 4-Layer Templates (F-39): Domain + Subdomain + Task Type + Language prompt composition
- Standalone Mode (F-04): Auto-detect tool availability, Grep/Glob fallbacks when LSP/AST unavailable
- Phase 0 Caching: Hash-based cache key, skip entire Phase 0 on cache hit (~8-25K tokens saved)
- 2-Tier PD: Phase Decisions classified Active/Summary per phase for bounded token budget
- Convergence Detection: Stagnation (variance < 5%), regression (delta < -10%), strategy suggestions
- Dangerous Command Detection (T-01, v3.8): Bash safety guard for rm -rf, DROP TABLE, git push --force, etc.
- Core-First Phase Ordering (T-12, v3.8): CORE → EXTENSION → SUPPORT sort within dependency tiers
- Compaction Recovery (F-31, v3.8): Read-side checkpoint loading after context compression
- Post-Execution Review (T-10, v3.9): H-item severity routing: HIGH blocks, MED/LOW defer to Step 5.5
- Phase-Scoped File Lock (T-01 P2, v3.9): Warn on writes outside current phase's declared scope
- Budget Pause & Resume (F-33, v3.9): Auto-pause on context exhaustion, handoff signal for watcher
- Feasibility 2-Layer Defense (T-11, v4.0): INFEASIBLE detection in Stage 2 + RE_INTERVIEW in Decomposer
- Browser QA Gate (T-03, v4.0): Claude in Chrome MCP UI verification (Gate 1.7, non-blocking)
- PR Creation (T-04, v4.0): Auto PR with Gate evidence via gh pr create
- MCP Server Tier 1 (M-01, v0.5.1): Deterministic ambiguity scoring + active state read/write via MCP tools
- 2-Pass Decomposition + Phase Seed (D-01, v0.6.0): JIT seed generation, deterministic TODOs, acceptance mapping
- 2-Level Parallelism (D-01, v0.6.0): TODO parallel graph (within phase) + EXTENSION/SUPPORT phase parallel (between phases)
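To make the Jaccard similarity matching mentioned under Routing Pattern Learning concrete, here is a minimal sketch. It is not taken from MPL: comparing lowercased word sets of prompts is an assumption, and the record fields `prompt` and `hat` are hypothetical, since the actual pattern schema is not described here.

```python
from typing import Dict, List, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace-separated words (an assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0  # convention chosen for this sketch when both sets are empty
    return len(a & b) / len(union)


def closest_pattern(
    prompt: str, history: List[Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Return the past record most similar to the prompt, or None if empty."""
    if not history:
        return None
    query = tokens(prompt)
    return max(history, key=lambda rec: jaccard(query, tokens(rec["prompt"])))
```

With a history containing "fix login bug" (light) and "refactor the auth system" (full), the prompt "fix the login bug" shares three of four distinct words with the first record and so matches it.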
State Directory: .mpl/
| Path | Purpose |
|---|---|
| .mpl/state.json | Unified pipeline + execution state (schema v2, P2-6). Contains run_mode, current_phase, tool_mode, and the execution subtree (task, phase_details, totals, cumulative_pass_rate), formerly split across two files. |
| .mpl/pivot-points.md | Immutable constraints (Pivot Points) |
| .mpl/config.json | User configuration overrides |
| .mpl/mpl/RUNBOOK.md | Integrated execution log for session continuity (F-10) |
| .mpl/mpl/decomposition.yaml | Phase decomposition output |
| .mpl/mpl/phase-decisions.md | Accumulated Phase Decisions (2-Tier) |
| .mpl/mpl/phase0/ | Phase 0 Enhanced artifacts |
| .mpl/mpl/phases/phase-N/ | Per-phase artifacts (mini-plan, state-summary, verification) |
| .mpl/mpl/profile/ | Token/timing profile (phases.jsonl, run-summary.json) |
| .mpl/memory/learnings.md | Run-to-Run accumulated learnings (F-11) |
| .mpl/memory/routing-patterns.jsonl | Past execution patterns for tier prediction (F-22) |
| .mpl/cache/phase0/ | Phase 0 cached artifacts |
HUD (Statusline)
MPL provides a real-time statusline that shows pipeline progress at a glance:
harness_lab | 5h:45%(3h42m) | wk:12%(2d5h) | ctx:67% | 12m
MPL Full | Sprint | TODO:3/7 | Gate:✓-- | Fix:2/10 | tok:45.2K/500.0K
Line 1 - Project & Usage:
- Project folder, API rate limits (5-hour / 7-day from Anthropic OAuth API), context window %, session duration
Line 2 - Pipeline Status (MPL active only):
- Hat level (Light/Standard/Full), current phase
- TODO progress, Gate results (✓/✗/-), Fix loop count
- Token usage vs budget, tool mode
Color coding:
- Rate limits: green <70%, yellow 70-90%, red ≥90% (with reset countdown)
- Context: green <70%, yellow 70-85%, red ≥85%
- Fix loop & tokens: yellow at 50%+, red at 80%+
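The color thresholds above amount to a banding function. The sketch below is illustrative and not the HUD's actual code: all function names are hypothetical, and returning `"default"` below 50% for the fix loop and token budget is an assumption, since the list does not name a color for that range.

```python
def band(pct: float, yellow_at: float, red_at: float, below: str) -> str:
    """Pick a color for a percentage given the yellow and red cutoffs."""
    if pct >= red_at:
        return "red"
    if pct >= yellow_at:
        return "yellow"
    return below


def rate_limit_color(pct: float) -> str:
    return band(pct, 70, 90, "green")


def context_color(pct: float) -> str:
    return band(pct, 70, 85, "green")


def budget_color(used: float, budget: float) -> str:
    """Fix loop count or token usage against its budget (budget must be > 0)."""
    return band(100.0 * used / budget, 50, 80, "default")
```

Applied to the sample statusline, the 45% rate limit and 67% context both come out green, and `Fix:2/10` stays below the yellow cutoff.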
Activate: Run /mpl:mpl-setup → enable HUD, or manually:
// ~/.claude/settings.json
{ "statusLine": { "type": "command", "command": "node <MPL_ROOT>/hooks/mpl-hud.mjs" } }
Usage
# Just say what you want; the system figures out the rest
mpl add user authentication with OAuth # → Full (~80K tokens)
mpl add input validation to signup form # → Standard (~30K tokens)
mpl fix null check in handleSubmit # → Light (~8K tokens)
# Keyword hints for manual override
mpl bugfix missing error handler # → forces Light
mpl small add retry logic # → forces Standard
# Direct skill invocation
/mpl:mpl
# Diagnostics
/mpl:mpl-doctor
Testing
node --test hooks/__tests__/*.test.mjs
Versioning
MPL is pre-1.0 (development stage). Follows 0.MAJOR.PATCH:
| Position | Meaning | Examples |
|---|---|---|
| 0.X.0 | Structural change or major feature batch | MCP server, Stage 2 redesign, new gates |
| 0.0.X | Bug fix, prompt change, skill update, translation | OMC cleanup, Korean→English |
1.0.0 will be assigned after production validation and stabilization.
Design Reference
- Full specification: docs/design.md
- Roadmap: docs/roadmap/overview.md
- Adaptive Router plan: docs/roadmap/adaptive-router-plan.md
- Standalone mode: docs/standalone.md
- Full references: docs/REFERENCES.md
References
MPL draws inspiration from various external projects and articles.
| Area | Source | MPL Application |
|---|---|---|
| Pipeline Router | Ouroboros (Q00): PAL Router 3-tier cost model | Adaptive Pipeline Router (F-20, F-21, F-22) |
| Test Design | SG-Loop (integrated in UAM): experiment-based verification design, Phase 0 specification philosophy (influenced by Hoyeon's test design philosophy) | Phase 0 Enhanced (7 experiments → 4-step specification) |
| Session Memory | QMD (Tobi Lütke): local hybrid search (BM25+vector+reranking) (historical influence; QMD integration removed in v0.14.2/v0.15.3) | Originally Scout QMD integration; Scout functionality absorbed by orchestrator in v2 with grep-based search |
| Grep Is Dead | ArtemXTech: /recall pattern, cross-session context persistence (historical) | Originally Scout 2-layer search; Scout removed in v2 |
| Long-Horizon Tasks | Codex docs pattern: 4-Document mapping | RUNBOOK.md (F-10) |
| Agent Design | Seeing like an Agent (Thariq, Anthropic): self-directed search, progressive disclosure | F-23, F-24, F-16 |
| Software Factory | gstack (Garry Tan): 25-skill sprint lifecycle, design-first approach, cross-model review | Temp roadmap: Safety Guard, Cross-Model Review, Ship Pipeline |
Detailed analysis: docs/REFERENCES.md
"The best debugging session is the one that never happens."
MPL doesn't fix bugs faster; it prevents them from existing.
