Claude Guardian
Flight computer for Claude Code — log rotation, watchdog, crash bundles, and MCP self-awareness
Claude Guardian is a local reliability layer that keeps Claude Code sessions healthy. It detects log bloat, disk pressure, and hangs before they cause problems, captures evidence when things go wrong, and exposes an MCP server so Claude can self-monitor mid-session.
What it does
| Command | Purpose |
|---|---|
| preflight | Scan Claude project logs, report oversized dirs/files, optionally auto-fix |
| doctor | Generate a diagnostics bundle (zip) with system info, log tails, journal |
| run -- <cmd> | Launch any command with watchdog monitoring, auto-bundle on crash/hang |
| status | One-shot health check: disk free, log sizes, warnings |
| watch | Background daemon: continuous monitoring, incident tracking, budget enforcement |
| budget | View and manage the concurrency budget (show/acquire/release) |
| mcp | Start MCP server (10 tools) for Claude Code self-monitoring |
Install
npm install -g @mcptoolshop/claude-guardian
Or run directly:
npx @mcptoolshop/claude-guardian preflight
Quick start
Check your environment
claude-guardian status
=== Claude Guardian Preflight ===
Disk free: 607.13GB [OK]
Claude projects: C:\Users\you\.claude\projects
Total size: 1057.14MB
Project directories (by size):
my-project: 1020.41MB
Issues found:
[WARNING] Project log dir is 1020.41MB (limit: 200MB)
[WARNING] File is 33.85MB (limit: 25MB)
[guardian] disk=607.13GB | logs=1057.14MB | issues=2
Auto-fix log bloat
claude-guardian preflight --fix
Rotates old logs (gzip), trims oversized .jsonl/.log files to their last N lines. Every action is logged to a journal file for traceability.
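A minimal sketch of the trim-and-journal step, assuming a plain read-and-rewrite approach. The names `keepLastLines` and `trimFile` and the journal entry fields are hypothetical and not taken from Guardian's source; only the behavior (keep the last N lines, append a journal record) comes from the description above.

```typescript
import * as fs from "node:fs";

// Keep only the last `maxLines` lines of `text`, preserving a trailing newline.
function keepLastLines(text: string, maxLines: number): string {
  if (maxLines <= 0) return "";
  const endsWithNewline = text.endsWith("\n");
  const body = endsWithNewline ? text.slice(0, -1) : text;
  const kept = body.split("\n").slice(-maxLines);
  return kept.join("\n") + (endsWithNewline ? "\n" : "");
}

// Hypothetical journal record; the real journal schema may differ.
interface JournalEntry {
  ts: string;
  action: "trim";
  path: string;
  bytesBefore: number;
  bytesAfter: number;
}

// Trim a log file in place and append one JSON line describing the action.
function trimFile(path: string, maxLines: number, journalPath: string): JournalEntry {
  const original = fs.readFileSync(path, "utf8");
  const trimmed = keepLastLines(original, maxLines);

  // Write to a sibling temp file, then rename it over the original,
  // so an interrupted write is less likely to leave a half-written log.
  const tmpPath = `${path}.tmp`;
  fs.writeFileSync(tmpPath, trimmed);
  fs.renameSync(tmpPath, path);

  const entry: JournalEntry = {
    ts: new Date().toISOString(),
    action: "trim",
    path,
    bytesBefore: Buffer.byteLength(original),
    bytesAfter: Buffer.byteLength(trimmed),
  };
  fs.appendFileSync(journalPath, JSON.stringify(entry) + "\n");
  return entry;
}
```

This version reads the whole file into memory, which is fine for illustration but not for very large logs.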
Generate a crash report
claude-guardian doctor --out ./bundle.zip
Creates a zip containing:
- summary.json: system info, file size report, preflight results
- log-tails/: last 500 lines of each log file
- journal.jsonl: every action the guardian has ever taken
- process.json: snapshot of running Claude processes at bundle time
- timeline.json: reconstructed chronological event timeline
- state.json: current daemon state (if daemon was running)
- incidents.jsonl: incident history (if any)
Run with watchdog
claude-guardian run -- claude
claude-guardian run --auto-restart --hang-timeout 120 -- node server.js
The watchdog:
- Spawns your command as a child process
- Monitors stdout/stderr for activity
- If no activity for --hang-timeout seconds → captures a doctor bundle
- If the process crashes → captures a bundle, optionally restarts with backoff
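The watchdog steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Guardian's implementation: `ActivityTimer`, `runWithWatchdog`, and the `onHang` callback are hypothetical names, and the one-second poll interval is arbitrary. The only library call is Node's `child_process.spawn`.

```typescript
import { spawn } from "node:child_process";

// Tracks time since the last output; the clock is injectable for testing.
class ActivityTimer {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private lastActivity: number;

  constructor(timeoutMs: number, now: () => number = Date.now) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastActivity = now();
  }

  touch(): void {
    this.lastActivity = this.now();
  }

  // Clamp at zero so a backwards clock adjustment never yields negative idle time.
  idleMs(): number {
    return Math.max(0, this.now() - this.lastActivity);
  }

  isHung(): boolean {
    return this.idleMs() >= this.timeoutMs;
  }
}

// Spawn `cmd`, forward its output, and call `onHang` once if it goes silent.
function runWithWatchdog(
  cmd: string,
  args: string[],
  hangTimeoutSec: number,
  onHang: () => void,
) {
  const child = spawn(cmd, args, { stdio: ["inherit", "pipe", "pipe"] });
  const timer = new ActivityTimer(hangTimeoutSec * 1000);

  child.stdout?.on("data", (chunk) => {
    timer.touch();
    process.stdout.write(chunk);
  });
  child.stderr?.on("data", (chunk) => {
    timer.touch();
    process.stderr.write(chunk);
  });

  const poll = setInterval(() => {
    if (timer.isHung()) {
      clearInterval(poll);
      onHang(); // e.g. capture a diagnostics bundle here
    }
  }, 1000);

  child.on("exit", () => clearInterval(poll));
  return child;
}
```

Keeping the timing logic in a small class with an injectable clock makes the hang rule testable without spawning a real process.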
MCP Server (the real unlock)
Register the guardian as a local MCP server so Claude can self-monitor:
Add to ~/.claude.json:
{
  "mcpServers": {
    "guardian": {
      "command": "npx",
      "args": ["@mcptoolshop/claude-guardian", "mcp"]
    }
  }
}
Then Claude can call:
| Tool | What it returns |
|---|---|
| guardian_status | Disk, logs, processes, hang risk, budget, attention level |
| guardian_preflight_fix | Runs log rotation/trimming, returns before/after report |
| guardian_doctor | Creates diagnostics bundle (zip), returns path + summary |
| guardian_nudge | Safe auto-remediation: fix logs if bloated, capture bundle if needed |
| guardian_budget_get | Current concurrency cap, slots in use, active leases |
| guardian_budget_acquire | Request concurrency slots (returns lease ID) |
| guardian_budget_release | Release a lease when done with heavy work |
| guardian_recovery_plan | Step-by-step recovery plan naming exact tools to call |
| guardian_preview_ready | Poll a port until the dev server responds (use after preview_start) |
| guardian_preview_recover | Diagnose stuck preview sessions, classify project type, guide recovery |
This lets Claude say: "Attention is WARN. Running guardian_nudge, then reducing concurrency."
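The budget tools in the table describe a lease model: a fixed cap, slots handed out under a lease ID, and release when the work is done. A minimal in-memory sketch of that idea follows. The class name `ConcurrencyBudget`, the time-to-live parameter, and the cap used in the usage note are assumptions for illustration; the real tools persist state to disk and may behave differently.

```typescript
import { randomUUID } from "node:crypto";

interface Lease {
  id: string;
  slots: number;
  expiresAt: number; // epoch milliseconds
}

class ConcurrencyBudget {
  private readonly cap: number;
  private readonly now: () => number;
  private readonly leases = new Map<string, Lease>();

  constructor(cap: number, now: () => number = Date.now) {
    this.cap = cap;
    this.now = now;
  }

  // Count slots held by leases that have not yet expired.
  slotsInUse(): number {
    let total = 0;
    for (const lease of Array.from(this.leases.values())) {
      if (lease.expiresAt > this.now()) total += lease.slots;
    }
    return total;
  }

  // Returns a lease ID, or null when the request would exceed the cap.
  acquire(slots: number, ttlMs: number): string | null {
    if (slots <= 0 || this.slotsInUse() + slots > this.cap) return null;
    const id = randomUUID();
    this.leases.set(id, { id, slots, expiresAt: this.now() + ttlMs });
    return id;
  }

  // Returns false if the lease is unknown (already released or removed).
  release(id: string): boolean {
    return this.leases.delete(id);
  }

  // Remove expired leases and return them so the caller can journal each one.
  expireStale(): Lease[] {
    const expired: Lease[] = [];
    for (const lease of Array.from(this.leases.values())) {
      if (lease.expiresAt <= this.now()) {
        this.leases.delete(lease.id);
        expired.push(lease);
      }
    }
    return expired;
  }
}
```

With a cap of 4, a caller holding 3 slots blocks a second request for 2 until it releases or its lease expires.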
Preview reliability
The guardian_preview_ready and guardian_preview_recover tools solve the race condition where preview_start returns success before the dev server is actually listening — which causes the browser to land on chrome-error:// and get stuck.
Workflow: preview_start → guardian_preview_ready (wait gate) → preview_snapshot
For non-web projects (Tauri, .NET MAUI, CLI tools), guardian_preview_recover detects the project type and returns "skip preview" guidance, avoiding false positives from the built-in preview verification hook.
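The wait gate in the workflow above amounts to polling a port until something is listening. A minimal sketch, assuming a plain TCP connect is an acceptable readiness signal (the actual tool may check an HTTP response instead). The names `probePort` and `waitForPort` and the default timings are hypothetical.

```typescript
import * as net from "node:net";

interface WaitOptions {
  host?: string;
  timeoutMs?: number; // overall deadline
  intervalMs?: number; // pause between attempts, also the per-attempt timeout
}

// Try one TCP connection; resolves true if the port accepts it.
function probePort(port: number, host: string, attemptTimeoutMs: number): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(attemptTimeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

// Poll until the port accepts a connection or the deadline passes.
async function waitForPort(port: number, options: WaitOptions = {}): Promise<boolean> {
  const { host = "127.0.0.1", timeoutMs = 30_000, intervalMs = 250 } = options;
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await probePort(port, host, intervalMs)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

A caller would await `waitForPort(port)` between starting the dev server and taking the first snapshot, and treat `false` as a signal to run recovery rather than retry blindly.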
Configuration
Three knobs (everything else is hardcoded with sane defaults):
| Flag | Default | Description |
|---|---|---|
| --max-log-mb | 200 | Max project log directory size in MB |
| --hang-timeout | 300 | Seconds of inactivity before declaring a hang |
| --auto-restart | false | Auto-restart on crash/hang |
Plus one hardcoded guardrail:
- Disk free < 5GB → aggressive mode auto-enabled (shorter retention, lower thresholds)
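To make the thresholds concrete, here is a small sketch of how such limits could be evaluated against a snapshot. The 200MB directory limit and 5GB disk guardrail come from this section, and the 25MB per-file limit is taken from the sample status output earlier. The `Snapshot` shape and the `evaluate` function are hypothetical, and the sketch only flags aggressive mode without guessing what retention or thresholds it changes.

```typescript
interface Limits {
  maxLogMb: number; // --max-log-mb
  maxFileMb: number; // per-file limit seen in the sample output
  minDiskFreeGb: number; // hardcoded guardrail
}

const DEFAULT_LIMITS: Limits = { maxLogMb: 200, maxFileMb: 25, minDiskFreeGb: 5 };

// Hypothetical snapshot shape, for illustration only.
interface Snapshot {
  diskFreeGb: number;
  projectDirsMb: Record<string, number>;
  fileSizesMb: number[];
}

interface Evaluation {
  warnings: string[];
  aggressive: boolean;
}

function evaluate(snapshot: Snapshot, limits: Limits = DEFAULT_LIMITS): Evaluation {
  const warnings: string[] = [];

  for (const name of Object.keys(snapshot.projectDirsMb)) {
    const sizeMb = snapshot.projectDirsMb[name];
    if (sizeMb > limits.maxLogMb) {
      warnings.push(`Project log dir is ${sizeMb.toFixed(2)}MB (limit: ${limits.maxLogMb}MB)`);
    }
  }

  for (const sizeMb of snapshot.fileSizesMb) {
    if (sizeMb > limits.maxFileMb) {
      warnings.push(`File is ${sizeMb.toFixed(2)}MB (limit: ${limits.maxFileMb}MB)`);
    }
  }

  // Low disk space switches on aggressive mode regardless of log sizes.
  return { warnings, aggressive: snapshot.diskFreeGb < limits.minDiskFreeGb };
}
```

Because the rules depend only on sizes and fixed limits, the same snapshot always produces the same result.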
Trust model
Claude Guardian is local-only. It has no network listener, no telemetry, and no cloud dependency.
What it reads: ~/.claude/projects/ (log files, sizes, modification times), process list (CPU, memory, uptime, handle counts for Claude-related processes via pidusage).
What it writes: ~/.claude-guardian/ (state.json, budget.json, journal.jsonl, doctor bundles). All files are under the user's home directory.
What it collects in bundles: System info (OS, CPU, memory, disk), log file tails (last 500 lines), process snapshots, and guardian's own journal. No API keys, tokens, credentials, or user content.
Dangerous actions — what Guardian will NOT do:
- Kill processes or send signals (no SIGKILL, no SIGTERM)
- Restart Claude Code or any other process
- Delete files (rotation = gzip, trimming = keep last N lines)
- Make network requests or phone home
- Elevate privileges or access other users' data
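The run watchdog's --auto-restart flag applies only to the child command it launched and is off by default. If process killing or broader auto-restart is ever added, it will be behind an explicit opt-in flag, documented here, and off by default.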
Reliability
Guardian is hardened for continuous daily use:
- Async mutexes: withStateLock and withBudgetLock serialize concurrent file I/O, preventing TOCTOU races between the daemon and MCP tools
- Overlap guard: daemon polls are protected by a pollInProgress flag so slow polls can't stack
- Clock skew protection: all time deltas clamped with Math.max(0, ...) to handle system clock adjustments
- Reverse-seek tail: large log files (>1MB) are tailed by reading chunks from the end, avoiding OOM on 500MB+ logs
- Corruption recovery: corrupt state.json or budget.json files are backed up and reset with a journal entry for forensics
- Process enumeration tracking: enumeration failures are captured in lastEnumerationError instead of silently swallowed
- Full UUID lease IDs: budget leases use full UUIDs for reliable identification
- Lease expiration journaling: expired leases are logged to the action journal for auditability
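The reverse-seek tail item in the list above can be illustrated with a short sketch. It reads fixed-size chunks from the end of the file until it has seen enough newlines, so memory use depends on the tail size rather than the file size. The function name `tailLines` and the 64KB chunk size are assumptions, not Guardian's actual code.

```typescript
import * as fs from "node:fs";

// Return the last `maxLines` lines of a file without reading it from the start.
function tailLines(path: string, maxLines: number, chunkSize = 64 * 1024): string[] {
  if (maxLines <= 0) return [];
  const fd = fs.openSync(path, "r");
  try {
    let position = fs.fstatSync(fd).size;
    const chunks: Buffer[] = [];
    let newlines = 0;

    // Walk backwards until more than `maxLines` newlines are seen or the start is hit.
    // The extra newline guarantees the first kept line is complete.
    while (position > 0 && newlines <= maxLines) {
      const length = Math.min(chunkSize, position);
      position -= length;
      const chunk = Buffer.alloc(length);
      fs.readSync(fd, chunk, 0, length, position);
      for (let i = 0; i < chunk.length; i++) {
        if (chunk[i] === 0x0a) newlines++;
      }
      chunks.unshift(chunk);
    }

    // Decode once after joining, so multi-byte characters split across chunks stay intact.
    const lines = Buffer.concat(chunks).toString("utf8").split("\n");
    if (lines[lines.length - 1] === "") lines.pop(); // drop the piece after a trailing newline
    return lines.slice(-maxLines);
  } finally {
    fs.closeSync(fd);
  }
}
```

Any partial line at the front of the buffer is discarded by the final slice, which is why the loop reads one newline more than strictly needed.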
Design principles
- Evidence over vibes — every action writes a journal entry; crash bundles capture state, not guesses
- Deterministic — no ML, no heuristics beyond file age and size. Decision table you can read in 60 seconds
- Safe by default — rotation = gzip (reversible), trimming = keep last N lines (data preserved), no deletions in v1
- Boring dependencies — commander, pidusage, archiver, @modelcontextprotocol/sdk. That's it.
Development
npm install
npm run build
npm test
Scorecard
| Category | Score | Notes |
|---|---|---|
| A. Security | 10/10 | SECURITY.md, local-only, no telemetry, no cloud |
| B. Error Handling | 10/10 | GuardianError (code+hint+cause), structured MCP errors, exit codes |
| C. Operator Docs | 10/10 | README, CHANGELOG, HANDBOOK, SHIP_GATE, walkthrough |
| D. Shipping Hygiene | 10/10 | CI + tests (203), dep-audit, npm published |
| E. Identity | 10/10 | Logo, translations, landing page, npm listing |
| Total | 50/50 | |
License
MIT
Built by MCP Tool Shop
