Micro-Phase Loop - Coherence-first autonomous coding pipeline plugin for Claude Code

MPL (Micro-Phase Loop) v0.18.1

Prevention over cure. Specification over debugging.

A Claude Code plugin that decomposes ambitious tasks into micro-phases — each independently planned, executed, and verified in isolation — so context never corrupts and failures never cascade.


AI can build anything in isolation. The hard part is building things that compose.

The longer an AI agent runs, the more it forgets what it promised. MPL doesn't fight this — it embraces it by giving each phase a fresh mind with only the knowledge it needs.

From Chaos to Coherence

"The best way to predict the future is to prevent the past."

Every autonomous coding pipeline faces the same enemy: context pollution. The longer a session runs, the more accumulated state — half-finished ideas, abandoned approaches, stale assumptions — degrades every subsequent decision. By Phase 4, the agent is debugging its own confusion, not your code.

MPL's answer is architectural, not heroic:

 Chaos                              Coherence
   🌀                                  🔬
"Build everything at once"  →  "Build one thing perfectly, then forget"
"Fix it later in Phase 5"  →  "Prevent it in Phase 0"
"Trust the agent's memory"  →  "Trust only written artifacts"

This isn't a philosophy of caution — it's a philosophy of compound reliability. Each micro-phase is small enough to succeed. Each success is recorded as a State Summary. Each subsequent phase reads only that summary, not the messy history of how it was produced.

The result: Phase 10 runs with the same clarity as Phase 1.

The Two Laws

MPL is built on two empirically validated laws:

Law 1: Invest in specification, eliminate debugging.

Across seven experiments, each added layer of Phase 0 investment (API contracts, example patterns, type policy, error spec) raised the pass rate and reduced Phase 5 rework. At full specification the pass rate reached 100% and no debugging was needed:

Phase 0 Investment    Pass Rate Progression
────────────────      ──────────────────────
No Phase 0            38% → debugging hell
+ API Contracts       58% → still painful
+ Example Patterns    65% → improving
+ Type Policy         77% → almost there
+ Error Spec          100% → zero debugging

Law 2: The orchestrator must never write code.

The moment the orchestrator touches source files, it becomes invested in its own implementation. It defends its code instead of objectively verifying it. MPL enforces separation with a PreToolUse hook that warns the orchestrator when it attempts to edit source files directly. All code flows through mpl-phase-runner agents via Task delegation.
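A minimal sketch of how such a write guard could decide when to warn. The function name, the list of editing tools, and the rule that only paths under `.mpl/` count as the orchestrator's own files are assumptions for illustration; this is not the logic of the actual `mpl-write-guard.mjs` hook.

```javascript
// Hypothetical write-guard check; names and rules are illustrative assumptions.
const EDIT_TOOLS = new Set(["Edit", "Write"]); // assumed set of file-editing tools

function shouldWarnOrchestrator(toolName, filePath) {
  // Tools that do not edit files are never flagged.
  if (!EDIT_TOOLS.has(toolName)) return false;
  // Normalize separators and a leading "./" so relative paths compare cleanly.
  const normalized = filePath.replace(/\\/g, "/").replace(/^\.\//, "");
  // Assumption: pipeline artifacts under .mpl/ belong to the orchestrator.
  const isPipelineArtifact =
    normalized === ".mpl" || normalized.startsWith(".mpl/");
  return !isPipelineArtifact;
}

console.log(shouldWarnOrchestrator("Edit", "src/auth.ts"));      // true
console.log(shouldWarnOrchestrator("Write", ".mpl/state.json")); // false
```

The sketch only handles project-relative paths; a real guard would also need to resolve absolute paths against the project root.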

Quick Start

Step 1 — Add marketplace & install:

# Register the MPL marketplace (one-time)
claude plugin marketplace add https://github.com/KyubumShin/MPL.git

# Install the plugin
claude plugin install mpl

Or from inside Claude Code:

/plugin marketplace add https://github.com/KyubumShin/MPL.git
/plugin install mpl

Alternative: Manual installation

# As a git submodule
cd /path/to/your-project
git submodule add https://github.com/KyubumShin/MPL.git

# Load locally for testing without installing
claude --plugin-dir ./MPL

Step 2 — Run setup:

/mpl:mpl-setup

The setup wizard automatically:

Creates runtime directories (.mpl/)
Detects available tools (LSP, AST)
Configures standalone fallbacks if needed
Optionally enables the HUD statusline

Step 3 — Start building:

mpl add user authentication with OAuth and role-based access

What just happened?

Quick Scope Scan  → 8 affected files, 4 test scenarios → pipeline_score 0.72
Hat Selection     → PP-proximity scoring → appropriate pipeline depth
PP Interview      → 6 Pivot Points extracted (3 CONFIRMED, 3 PROVISIONAL)
Phase 0 Enhanced  → API contracts + type policy + error spec generated
Decomposition     → 4 micro-phases with interface contracts
Phase Execution   → 4 phases × (plan → execute → test → verify)
3H+1A Gate        → Hard 1: build+types, Hard 2: tests, Hard 3: PP compliance, Advisory: cross-boundary
RUNBOOK           → Full execution log for session continuity

Each phase saw only its own context. No pollution. No cascade.

The Loop

MPL's core is a decompose-execute-verify loop where each iteration is a fresh session:

                    ┌─── Phase 0: Specify ────┐
                    │  API contracts          │
                    │  Type policies          │
                    │  Error specs            │
                    └──────────┬──────────────┘
                               │
                    ┌──────────▼──────────────┐
                    │  Decompose into N       │
                    │  micro-phases           │
                    └──────────┬──────────────┘
                               │
          ┌────────────────────▼────────────────────┐
          │  For each phase (fresh session):        │
          │    Plan → Phase Runner → Test → Verify  │
          │    Output: State Summary only           │
          └────────────────────┬────────────────────┘
                               │
                  ┌────────────▼──────────────┐
                  │  3 Hard + 1 Advisory      │
                  │  Hard 1: Build+Types      │
                  │  Hard 2: Tests            │
                  │  Hard 3: PP Compliance    │
                  │  Advisory: Cross-Boundary │
                  └────────────┬──────────────┘
                               │
                           Complete

Step	What Happens	Why It Matters
Triage	Analyze prompt density, scan scope	Right-size the pipeline
Pivot Points	Socratic interview extracts immutable constraints	Prevent scope drift
Phase 0	Pre-specification: contracts, types, errors	Eliminate debugging
Decompose	Break into ordered phases with interface contracts	Each phase is independently verifiable
Execute	Fresh session per phase, Phase Runner delegation, micro-test cycles	No context pollution
3H+1A Gate	Build+Types (Hard) → Tests (Hard) → PP compliance (Hard) + Cross-boundary (Advisory)	Evidence-based completion
RUNBOOK	Continuous audit log for human/agent session continuity	Pick up where you left off

State Summary: The Only Bridge

Between phases, only one artifact passes: the State Summary. It contains what was built, what was decided, and what was verified — nothing else. No code snippets, no debugging history, no abandoned approaches.

This is the key insight: forgetting is a feature. Each phase starts clean, with only the structured knowledge it needs. The orchestrator manages context assembly — loading the right summaries, the right Phase Decisions, the right impact files — so the Phase Runner operates with perfect information density.
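As an illustration of "only the summary crosses the boundary", here is a small sketch of context assembly. The State Summary field names (`phase`, `built`, `decided`, `verified`) and the `assembleContext` helper are hypothetical; the real artifact format is defined by MPL's protocol files and may differ.

```javascript
// Hypothetical context assembly: a phase sees earlier State Summaries only.
function assembleContext(phaseNumber, summaries) {
  return summaries
    .filter((s) => s.phase < phaseNumber)      // earlier phases only
    .sort((a, b) => a.phase - b.phase)         // in execution order
    // Whitelist the three documented sections; anything else is dropped.
    .map(({ phase, built, decided, verified }) => ({
      phase,
      built,
      decided,
      verified,
    }));
}

const summaries = [
  { phase: 1, built: ["User model"], decided: ["bcrypt"], verified: ["12 tests"] },
  { phase: 2, built: ["OAuth flow"], decided: [], verified: ["8 tests"],
    debugLog: "this field never reaches the next phase" },
];
console.log(assembleContext(3, summaries));
```

Whitelisting fields, rather than deleting known-bad ones, is one way to make "nothing else" the default.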

Build-Test-Fix: The Micro-Cycle

Inside each phase, every TODO gets immediate verification:

For each TODO:
  Build  → Phase Runner implements the change directly
  Test   → Run affected tests immediately (not at the end)
  Fix    → Fix on failure (max 2 retries per TODO)

After all TODOs:
  Test Agent → Independent test writing (code author ≠ test author)
  Cumulative → Full regression check against all prior phases

Batching implementation before testing is forbidden. A bug discovered after 5 TODOs could have been caused by any of them. A bug discovered immediately after TODO #3 was caused by TODO #3.
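The cycle above can be sketched as a small loop. The `build` and `test` callbacks and the result shape are assumptions made for illustration; only the "test after every TODO" ordering and the limit of 2 retries come from the description.

```javascript
// Sketch of the per-TODO Build-Test-Fix cycle (callbacks are hypothetical).
const MAX_RETRIES = 2; // from the text: max 2 retries per TODO

function runTodo(todo, build, test) {
  // One initial attempt plus up to MAX_RETRIES fix attempts.
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    build(todo, attempt); // implement (attempt 0) or fix (attempt > 0)
    if (test(todo)) {
      return { todo, status: "PASS", attempts: attempt + 1 };
    }
  }
  return { todo, status: "FAIL", attempts: MAX_RETRIES + 1 };
}

function runPhaseTodos(todos, build, test) {
  // Each TODO is tested before the next one starts, so a failure
  // always points at exactly one change.
  return todos.map((todo) => runTodo(todo, build, test));
}
```

Because `runTodo` returns before the next TODO begins, a failing test can only be attributed to the TODO that was just built.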

Circuit Break: Graceful Failure

When a phase fails after 3 retries, it doesn't crash — it circuit breaks:

Preserve PASS TODOs (verified work is never discarded)
Rollback FAIL TODO files to pre-phase state
Attempt tier escalation before giving up (see The Router)
If all tiers exhausted, transition to phase5-finalize (partial completion)

A circuit break does not lead directly to pipeline failure: tier escalation is attempted first, and if every tier is exhausted MPL reports what succeeded and what failed. Partial progress is always preserved.
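A rough sketch of the first two steps, under the assumption that each TODO result records a `status` and the `files` it touched (both field names are hypothetical):

```javascript
// Sketch: split TODO results at circuit break (result shape is an assumption).
function circuitBreak(todoResults) {
  const preserved = todoResults.filter((r) => r.status === "PASS");
  const failed = todoResults.filter((r) => r.status !== "PASS");
  // Files touched by failed TODOs are restored to their pre-phase state.
  const rollbackFiles = [...new Set(failed.flatMap((r) => r.files))];
  return { preserved, rollbackFiles };
}

const outcome = circuitBreak([
  { todo: "add User model", status: "PASS", files: ["src/user.ts"] },
  { todo: "add OAuth callback", status: "FAIL", files: ["src/oauth.ts", "src/routes.ts"] },
]);
console.log(outcome.rollbackFiles); // [ 'src/oauth.ts', 'src/routes.ts' ]
```

The sketch does not handle a file touched by both a passing and a failing TODO; how MPL resolves that case is not stated here.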

The Router

The user should never have to judge "is this a small task or a big one?" The system should figure it out — and adapt when it's wrong.

The Hat Model (PP-Proximity)

MPL uses PP-proximity (Pivot Point proximity) to determine pipeline depth. Each task is scored by how closely it touches the project's immutable constraints (Pivot Points). Higher proximity means more pipeline rigor.

One entry point. Auto-scoring. Dynamic escalation.

"mpl fix the login bug"              → Triage → Hat scores PP-proximity → lightweight pipeline
"mpl add email validation"           → Triage → Hat scores PP-proximity → standard pipeline
"mpl refactor the auth system"       → Triage → Hat scores PP-proximity → full pipeline

PP-Proximity Score

Triage runs a Quick Scope Scan and computes PP-proximity — how much the task touches core constraints:

pp_proximity = (pp_impact × 0.40) + (file_scope × 0.25)
             + (contract_change × 0.20) + (risk_signal × 0.15)

pp_impact:        How many PPs are directly affected (0.0 ~ 1.0)
file_scope:       min(affected_files / 10, 1.0)
contract_change:  Whether interface contracts change (0.0 or 1.0)
risk_signal:      prompt keyword analysis (0.0 ~ 1.0)
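The formula translates directly into code. The function and parameter names below are illustrative, and the inputs are assumed to be already normalized as described above:

```javascript
// Weighted PP-proximity score, using the weights from the formula above.
function ppProximity({ ppImpact, affectedFiles, contractChange, riskSignal }) {
  const fileScope = Math.min(affectedFiles / 10, 1.0); // capped at 10 files
  const contract = contractChange ? 1.0 : 0.0;         // binary signal
  return ppImpact * 0.40 + fileScope * 0.25 + contract * 0.20 + riskSignal * 0.15;
}

// Worked example: 0.5*0.40 + 0.8*0.25 + 1.0*0.20 + 0.4*0.15 = 0.66
const score = ppProximity({
  ppImpact: 0.5,
  affectedFiles: 8,
  contractChange: true,
  riskSignal: 0.4,
});
console.log(score.toFixed(2)); // "0.66"
```

The weights sum to 1.0, so the score stays in the 0.0 to 1.0 range as long as each input does.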

Hat + Floor

Hat	PP-Proximity	What Runs	Floor (minimum guarantee)	~Tokens
Light	Low	Error Spec → Fix → Gate 1 → Commit	Gate 1 always runs	~5-15K
Standard	Medium	PP(light) → Error Spec → Single Phase → Gate 1+2	Gate 1+2 always run	~20-40K
Full	High	Full pipeline with all gates	All 3 Hard Gates run	~50-100K+

Dynamic Escalation

When a Hat level fails, it doesn't give up — it grows:

[Light] ──circuit break──→ [Standard] ──circuit break──→ [Full]
                                │                              │
                                ├─ Completed TODOs preserved   ├─ Completed phases preserved
                                └─ Failed TODO → single phase  └─ Failed phase → phase5-finalize

Keyword hints still work as manual overrides: "mpl bugfix" → light, "mpl small" → standard.
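The escalation ladder and the keyword overrides can be sketched as two small lookups. The function names and return shapes are hypothetical; the hat order, the `phase5-finalize` fallback, and the two keywords come from the text above.

```javascript
// Escalation ladder from the diagram: light -> standard -> full.
const HAT_ORDER = ["light", "standard", "full"];

function nextOnCircuitBreak(currentHat) {
  const i = HAT_ORDER.indexOf(currentHat);
  if (i === -1) throw new Error(`unknown hat: ${currentHat}`);
  return i < HAT_ORDER.length - 1
    ? { action: "escalate", hat: HAT_ORDER[i + 1] }
    : { action: "phase5-finalize", hat: currentHat }; // no tier left
}

// Manual overrides: the keyword directly after "mpl" forces a hat.
const KEYWORD_OVERRIDES = new Map([
  ["bugfix", "light"],
  ["small", "standard"],
]);

function hatOverride(prompt) {
  const match = /^mpl\s+(\S+)/.exec(prompt.trim());
  if (!match) return null;
  return KEYWORD_OVERRIDES.get(match[1].toLowerCase()) ?? null;
}

console.log(nextOnCircuitBreak("light").hat);          // "standard"
console.log(hatOverride("mpl small add retry logic")); // "standard"
```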

The Eight Minds

Eight agents, each with a single purpose. Loaded on-demand, never preloaded:

Agent	Role	Core Principle
Interviewer	Socratic questioning for Pivot Points + Ambiguity Resolution	"What are you NOT willing to compromise on?"
Codebase Analyzer	Project structure analysis (haiku)	"What exists before we plan?"
Decomposer	Break into ordered micro-phases + Phase Seed generation	"What depends on what?"
Phase Runner	Execute a single phase end-to-end + test writing	"Plan, implement, verify, summarize"
Code Reviewer	Quality gate + PP compliance	"Would I approve this PR?"
Scout	Lightweight codebase exploration (haiku)	"Find it fast, spend nothing"
Compound	Learning extraction and distillation	"What did we learn that future runs should know?"
Doctor	Installation diagnostics	"Is everything wired correctly?"

Agent Separation Principle

The Phase Runner who implements code is never the Test Agent who verifies it. The Decomposer who plans is never the Phase Runner who executes. The Orchestrator who assembles context never touches source files. Each separation eliminates a class of bias.

Verification System

A/S/H Classification

Not all verification is equal. MPL classifies every criterion:

Type	Name	Verified By	Example
A-item	Agent-Verifiable	Exit code, file exists	`npm test` exits 0
S-item	Sandbox Testing	BDD scenarios, Given/When/Then	Integration test passes
H-item	Human-Required	Side Interview with user	UX judgment, visual review

3 Hard Gates + 1 Advisory

Three hard gates that block completion, plus one advisory gate:

Gate	Type	Method	Pass Criteria
Hard 1	Hard	Build + Type Check (project-wide)	0 build errors, 0 type errors
Hard 2	Hard	Automated tests (A + S items)	pass_rate >= 95%
Hard 3	Hard	PP compliance + H-item resolution	No violations + all H-items resolved
Advisory	Advisory	Cross-boundary contract check	Boundary contract consistency (warns, does not block)
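One way to read the table is as a single evaluation function in which only the hard gates affect the verdict. The input field names are assumptions; the pass criteria are the ones listed above.

```javascript
// Sketch of 3H+1A evaluation: hard gates block, the advisory gate only warns.
function evaluateGates(input) {
  const hard1 = input.buildErrors === 0 && input.typeErrors === 0;
  const hard2 = input.passRate >= 0.95;
  const hard3 = input.ppViolations === 0 && input.unresolvedHItems === 0;
  return {
    hard1,
    hard2,
    hard3,
    passed: hard1 && hard2 && hard3,   // advisory result is not part of this
    warnings: input.boundaryWarnings,  // reported to the user, never blocking
  };
}

const verdict = evaluateGates({
  buildErrors: 0,
  typeErrors: 0,
  passRate: 0.97,
  ppViolations: 0,
  unresolvedHItems: 0,
  boundaryWarnings: ["response shape differs between api/ and web/"],
});
console.log(verdict.passed); // true, despite the advisory warning
```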

Convergence Detection

Fix loops track pass rate history for automatic decisions:

Status	Condition	Action
`progressing`	delta > min_improvement	Continue fixing
`stagnating`	variance < 5% AND delta < threshold	Change strategy or escalate
`regressing`	delta < -10%	Revert or review Phase 0 artifacts
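A sketch of how these statuses could be derived from a pass-rate history. Several details are assumptions, since the table does not define them: `delta` is taken as the change between the last two runs, "variance" is approximated by the spread (max minus min) over a short window, `minImprovement` and `threshold` have no documented defaults and must be supplied by the caller, and `"undetermined"` is a hypothetical fallback for cases the table does not cover.

```javascript
// Sketch of convergence classification; pass rates are fractions in [0, 1].
function classifyConvergence(history, { minImprovement, threshold, window = 3 }) {
  if (history.length < 2) return "undetermined";
  const delta = history[history.length - 1] - history[history.length - 2];
  if (delta < -0.10) return "regressing";
  if (delta > minImprovement) return "progressing";
  // Assumption: "variance" is read as the spread of the recent window.
  const recent = history.slice(-window);
  const spread = Math.max(...recent) - Math.min(...recent);
  if (spread < 0.05 && delta < threshold) return "stagnating";
  return "undetermined";
}

const opts = { minImprovement: 0.02, threshold: 0.02 }; // illustrative values
console.log(classifyConvergence([0.6, 0.7], opts));         // "progressing"
console.log(classifyConvergence([0.80, 0.81, 0.81], opts)); // "stagnating"
console.log(classifyConvergence([0.9, 0.7], opts));         // "regressing"
```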

Under the Hood

8 agents · 8 hooks · 11 skills · 5 protocol files

MPL/
├── agents/                 # 8 agent definitions (YAML frontmatter)
│   └── mpl-interviewer.md   # PP Interview + ambiguity resolution (opus)
├── commands/               # Orchestration protocols (split for token efficiency)
│   ├── mpl-run.md          # Router: which protocol file to load
│   ├── mpl-run-phase0.md   # Steps -1 ~ 2.5: Triage, PP, Phase 0
│   ├── mpl-run-decompose.md # Steps 3 ~ 3-F: Decomposition + feedback loop
│   ├── mpl-run-execute.md  # Step 4: Execution loop, 3H+1A Gate, Fix loop
│   └── mpl-run-finalize.md # Steps 5 ~ 6: Finalize, Resume
├── prompts/                # 4-Layer template system (F-39)
│   ├── domains/            # 8 domain templates (base layer)
│   ├── subdomains/         # 19 tech-stack templates
│   ├── tasks/              # 6 task-type overlays
│   └── langs/              # 5 language templates
├── hooks/                  # 8 hooks across 6 events
│   ├── mpl-write-guard.mjs       # Warns orchestrator on source file edits
│   ├── mpl-validate-output.mjs   # Validates agent output schemas
│   ├── mpl-phase-controller.mjs  # Phase transitions + escalation (F-21)
│   ├── mpl-keyword-detector.mjs  # "mpl" keyword → pipeline init
│   ├── mpl-auto-permit.mjs       # Learned auto-permission (F-34)
│   ├── mpl-permit-learner.mjs    # Permission pattern learning (F-34)
│   ├── mpl-compaction-tracker.mjs # Compaction checkpoint (F-31)
│   ├── mpl-session-init.mjs      # Context rotation init (F-38)
│   └── lib/
│       ├── mpl-state.mjs         # State management + escalation
│       ├── mpl-scope-scan.mjs    # Pipeline score calculation (F-20)
│       ├── mpl-cache.mjs         # Phase 0 caching
│       ├── mpl-profile.mjs       # Token profiling
│       └── mpl-routing-patterns.mjs # Routing pattern learning (F-22)
├── skills/                 # 11 skills
│   ├── mpl/                # Main pipeline (single entry point)
│   ├── mpl-pivot/          # PP interview
│   ├── mpl-status/         # Dashboard
│   ├── mpl-cancel/         # Clean cancellation
│   ├── mpl-resume/         # Resume from checkpoint
│   ├── mpl-doctor/         # Diagnostics
│   └── mpl-setup/          # Setup wizard
└── docs/
    ├── design.md           # Full specification
    ├── standalone.md       # Standalone mode fallback matrix (F-04)
    └── roadmap/            # Evolution history + future plans

Key internals:

Hat Model (PP-proximity) — Quick Scope Scan + PP-proximity score → Hat-based pipeline depth selection
Dynamic Escalation (F-21) — light → standard → full on circuit break, preserving completed work
RUNBOOK (F-10) — Integrated execution log, auto-updated at 9 pipeline points, enables session resume
Session Persistence (F-12) — <remember priority> tags at phase transitions + RUNBOOK dual safety net
Run-to-Run Learning (F-11) — Orchestrator distills RUNBOOK → .mpl/memory/learnings.md
Routing Pattern Learning (F-22) — Jaccard similarity matching on past execution patterns
Self-Directed Context (F-24) — Phase Runner can Read/Grep within scope-bounded impact files
Task-based TODO (F-23) — TaskCreate/TaskUpdate as primary TODO state manager during execution
Background Execution (F-13) — Independent TODOs dispatched with run_in_background: true
Hard 1 Build+Type Check — Project-wide build and type checking (consolidates previous Gate 0.5)
4-Layer Templates (F-39) — Domain + Subdomain + Task Type + Language prompt composition
Standalone Mode (F-04) — Auto-detect tool availability, Grep/Glob fallbacks when LSP/AST unavailable
Phase 0 Caching — Hash-based cache key, skip entire Phase 0 on cache hit (~8-25K tokens saved)
2-Tier PD — Phase Decisions classified Active/Summary per phase for bounded token budget
Convergence Detection — Stagnation (variance < 5%), regression (delta < -10%), strategy suggestions
Dangerous Command Detection (T-01, v3.8) — Bash safety guard for rm -rf, DROP TABLE, git push --force, etc.
Core-First Phase Ordering (T-12, v3.8) — CORE → EXTENSION → SUPPORT sort within dependency tiers
Compaction Recovery (F-31, v3.8) — Read-side checkpoint loading after context compression
Post-Execution Review (T-10, v3.9) — H-item severity routing: HIGH blocks, MED/LOW defer to Step 5.5
Phase-Scoped File Lock (T-01 P2, v3.9) — Warn on writes outside current phase's declared scope
Budget Pause & Resume (F-33, v3.9) — Auto-pause on context exhaustion, handoff signal for watcher
Feasibility 2-Layer Defense (T-11, v4.0) — INFEASIBLE detection in Stage 2 + RE_INTERVIEW in Decomposer
Browser QA Gate (T-03, v4.0) — Claude in Chrome MCP UI verification (Gate 1.7, non-blocking)
PR Creation (T-04, v4.0) — Auto PR with Gate evidence via gh pr create
MCP Server Tier 1 (M-01, v0.5.1) — Deterministic ambiguity scoring + active state read/write via MCP tools
2-Pass Decomposition + Phase Seed (D-01, v0.6.0) — JIT seed generation, deterministic TODOs, acceptance mapping
2-Level Parallelism (D-01, v0.6.0) — TODO parallel graph (within phase) + EXTENSION/SUPPORT phase parallel (between phases)
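To make the Routing Pattern Learning entry (F-22) more concrete, here is a minimal Jaccard similarity sketch. What MPL actually compares is not specified in this list; the word-level tokenization and the `{ prompt, hat }` record shape are assumptions for illustration.

```javascript
// Jaccard similarity: |A ∩ B| / |A ∪ B| over two collections of tokens.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1; // convention for two empty sets
  let intersection = 0;
  for (const x of setA) if (setB.has(x)) intersection++;
  return intersection / (setA.size + setB.size - intersection);
}

// Assumed tokenization: lowercase words split on non-word characters.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Return the stored pattern most similar to the new prompt, or null if none.
function mostSimilar(prompt, patterns) {
  const tokens = tokenize(prompt);
  let best = null;
  for (const pattern of patterns) {
    const score = jaccard(tokens, tokenize(pattern.prompt));
    if (best === null || score > best.score) best = { pattern, score };
  }
  return best;
}

const past = [
  { prompt: "fix the login bug", hat: "light" },
  { prompt: "refactor auth system", hat: "full" },
];
console.log(mostSimilar("fix login bug", past)); // best match: hat "light", score 0.75
```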

State Directory: `.mpl/`

Path	Purpose
`.mpl/state.json`	Unified pipeline + execution state (schema v2, P2-6). Contains `run_mode`, `current_phase`, `tool_mode`, and the `execution` subtree (task, phase_details, totals, cumulative_pass_rate) — formerly split across two files.
`.mpl/pivot-points.md`	Immutable constraints (Pivot Points)
`.mpl/config.json`	User configuration overrides
`.mpl/mpl/RUNBOOK.md`	Integrated execution log for session continuity (F-10)
`.mpl/mpl/decomposition.yaml`	Phase decomposition output
`.mpl/mpl/phase-decisions.md`	Accumulated Phase Decisions (2-Tier)
`.mpl/mpl/phase0/`	Phase 0 Enhanced artifacts
`.mpl/mpl/phases/phase-N/`	Per-phase artifacts (mini-plan, state-summary, verification)
`.mpl/mpl/profile/`	Token/timing profile (phases.jsonl, run-summary.json)
`.mpl/memory/learnings.md`	Run-to-Run accumulated learnings (F-11)
`.mpl/memory/routing-patterns.jsonl`	Past execution patterns for tier prediction (F-22)
`.mpl/cache/phase0/`	Phase 0 cached artifacts

HUD (Statusline)

MPL provides a real-time statusline that shows pipeline progress at a glance:

harness_lab | 5h:45%(3h42m) | wk:12%(2d5h) | ctx:67% | 12m
MPL Full | Sprint | TODO:3/7 | Gate:✓-- | Fix:2/10 | tok:45.2K/500.0K

Line 1 — Project & Usage:

Project folder, API rate limits (5-hour / 7-day from Anthropic OAuth API), context window %, session duration

Line 2 — Pipeline Status (MPL active only):

Hat level (Light/Standard/Full), current phase
TODO progress, Gate results (✓/✗/-), Fix loop count
Token usage vs budget, tool mode

Color coding:

Rate limits: green <70%, yellow 70-90%, red ≥90% (with reset countdown)
Context: green <70%, yellow 70-85%, red ≥85%
Fix loop & tokens: yellow at 50%+, red at 80%+
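The thresholds above map onto three small functions. The function names and the `"default"` value for the unstated below-50% case are assumptions; the cutoffs are taken from the list.

```javascript
// Color thresholds from the list above; inputs are percentages (0-100).
function rateLimitColor(pct) {
  return pct >= 90 ? "red" : pct >= 70 ? "yellow" : "green";
}

function contextColor(pct) {
  return pct >= 85 ? "red" : pct >= 70 ? "yellow" : "green";
}

// Fix loop count and token usage share the same 50% / 80% cutoffs.
function budgetColor(used, budget) {
  const pct = (used / budget) * 100;
  return pct >= 80 ? "red" : pct >= 50 ? "yellow" : "default";
}

// Values from the sample statusline: ctx:67%, Fix:2/10, tok:45.2K/500.0K
console.log(contextColor(67));        // "green"
console.log(budgetColor(2, 10));      // "default"
console.log(budgetColor(45.2, 500));  // "default"
```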

Activate: Run /mpl:mpl-setup → enable HUD, or manually:

// ~/.claude/settings.json
{ "statusLine": { "type": "command", "command": "node <MPL_ROOT>/hooks/mpl-hud.mjs" } }

Usage

# Just say what you want — the system figures out the rest
mpl add user authentication with OAuth        # → Full (~80K tokens)
mpl add input validation to signup form       # → Standard (~30K tokens)
mpl fix null check in handleSubmit            # → Light (~8K tokens)

# Keyword hints for manual override
mpl bugfix missing error handler              # → forces Light
mpl small add retry logic                     # → forces Standard

# Direct skill invocation
/mpl:mpl

# Diagnostics
/mpl:mpl-doctor

Testing

node --test hooks/__tests__/*.test.mjs

Versioning

MPL is pre-1.0 (development stage). Follows 0.MAJOR.PATCH:

Position	Meaning	Examples
0.X.0	Structural change or major feature batch	MCP server, Stage 2 redesign, new gates
0.0.X	Bug fix, prompt change, skill update, translation	OMC cleanup, Korean→English

1.0.0 will be assigned after production validation and stabilization.

Design Reference

Full specification: docs/design.md
Roadmap: docs/roadmap/overview.md
Adaptive Router plan: docs/roadmap/adaptive-router-plan.md
Standalone mode: docs/standalone.md
Full references: docs/REFERENCES.md

References

MPL draws inspiration from various external projects and articles.

Area	Source	MPL Application
Pipeline Router	Ouroboros (Q00) — PAL Router 3-tier cost model	Adaptive Pipeline Router (F-20, F-21, F-22)
Test Design	SG-Loop (integrated in UAM) — experiment-based verification design, Phase 0 specification philosophy (influenced by Hoyeon's test design philosophy)	Phase 0 Enhanced (7 experiments → 4-step specification)
Session Memory	QMD (Tobi Lütke) — local hybrid search (BM25+vector+reranking) (historical influence; QMD integration removed in v0.14.2/v0.15.3)	Originally Scout QMD integration; Scout functionality absorbed by orchestrator in v2 with grep-based search
Grep Is Dead	ArtemXTech — /recall pattern, cross-session context persistence (historical)	Originally Scout 2-layer search; Scout removed in v2
Long-Horizon Tasks	Codex docs pattern — 4-Document mapping	RUNBOOK.md (F-10)
Agent Design	Seeing like an Agent (Thariq, Anthropic) — self-directed search, progressive disclosure	F-23, F-24, F-16
Software Factory	gstack (Garry Tan) — 25-skill sprint lifecycle, design-first approach, cross-model review	Temp roadmap: Safety Guard, Cross-Model Review, Ship Pipeline

Detailed analysis: docs/REFERENCES.md

"The best debugging session is the one that never happens."

MPL doesn't fix bugs faster — it prevents them from existing.
