Claude Guardian

Flight computer for Claude Code — log rotation, watchdog, crash bundles, and MCP self-awareness

Claude Guardian is a local reliability layer that keeps Claude Code sessions healthy. It detects log bloat, disk pressure, and hangs before they cause problems, captures evidence when things go wrong, and exposes an MCP server so Claude can self-monitor mid-session.

What it does

| Command | Purpose |
| --- | --- |
| `preflight` | Scan Claude project logs, report oversized dirs/files, optionally auto-fix |
| `doctor` | Generate a diagnostics bundle (zip) with system info, log tails, journal |
| `run -- <cmd>` | Launch any command with watchdog monitoring, auto-bundle on crash/hang |
| `status` | One-shot health check: disk free, log sizes, warnings |
| `watch` | Background daemon: continuous monitoring, incident tracking, budget enforcement |
| `budget` | View and manage the concurrency budget (show/acquire/release) |
| `mcp` | Start MCP server (10 tools) for Claude Code self-monitoring |

Install

npm install -g @mcptoolshop/claude-guardian

Or run directly:

npx @mcptoolshop/claude-guardian preflight

Quick start

Check your environment

claude-guardian status

=== Claude Guardian Preflight ===

Disk free: 607.13GB [OK]
Claude projects: C:\Users\you\.claude\projects
Total size: 1057.14MB

Project directories (by size):
  my-project: 1020.41MB

Issues found:
  [WARNING] Project log dir is 1020.41MB (limit: 200MB)
  [WARNING] File is 33.85MB (limit: 25MB)

[guardian] disk=607.13GB | logs=1057.14MB | issues=2
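
To make the report above concrete, here is a minimal sketch of how threshold checks could produce those warning lines and the one-line summary. The `ProjectDir` shape and the function names `findIssues` and `banner` are hypothetical, not Guardian's actual code; only the 200MB and 25MB limits and the message wording come from the sample output.

```typescript
// Hypothetical input shape: one entry per project log directory.
interface ProjectDir {
  name: string;
  sizeMb: number;        // total size of the directory
  largestFileMb: number; // size of its largest single file
}

// Compare each directory against the limits and collect warning lines.
function findIssues(dirs: ProjectDir[], maxLogMb: number = 200, maxFileMb: number = 25): string[] {
  const issues: string[] = [];
  for (const d of dirs) {
    if (d.sizeMb > maxLogMb) {
      issues.push("[WARNING] Project log dir is " + d.sizeMb.toFixed(2) + "MB (limit: " + maxLogMb + "MB)");
    }
    if (d.largestFileMb > maxFileMb) {
      issues.push("[WARNING] File is " + d.largestFileMb.toFixed(2) + "MB (limit: " + maxFileMb + "MB)");
    }
  }
  return issues;
}

// Build the one-line summary shown at the end of the report.
function banner(diskFreeGb: number, totalLogsMb: number, issueCount: number): string {
  return "[guardian] disk=" + diskFreeGb.toFixed(2) + "GB | logs=" + totalLogsMb.toFixed(2) + "MB | issues=" + issueCount;
}

const reportIssues = findIssues([{ name: "my-project", sizeMb: 1020.41, largestFileMb: 33.85 }]);
const reportBanner = banner(607.13, 1057.14, reportIssues.length);
```

With the numbers from the sample, `reportIssues` holds the two warning lines and `reportBanner` matches the final summary line.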

Auto-fix log bloat

claude-guardian preflight --fix

Rotates old logs with gzip and trims oversized .jsonl/.log files to their last N lines. Every action is logged to a journal file for traceability.
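
As an illustration of the trimming step, the sketch below keeps only the last N lines of a log's text. `keepLastLines` is a hypothetical helper operating on an in-memory string; Guardian's real implementation is not shown in this README and may well work on files directly.

```typescript
// Hypothetical helper: return only the last `maxLines` lines of `text`.
// A single trailing newline, if present, is preserved.
function keepLastLines(text: string, maxLines: number): string {
  if (maxLines <= 0) return "";
  const endsWithNewline = text.length > 0 && text.charAt(text.length - 1) === "\n";
  const body = endsWithNewline ? text.slice(0, -1) : text;
  const lines = body.split("\n");
  if (lines.length <= maxLines) return text; // already short enough, leave untouched
  const kept = lines.slice(lines.length - maxLines).join("\n");
  return endsWithNewline ? kept + "\n" : kept;
}

const sample = '{"n":1}\n{"n":2}\n{"n":3}\n{"n":4}\n';
const trimmed = keepLastLines(sample, 2); // '{"n":3}\n{"n":4}\n'
```

Because .jsonl files hold one record per line, trimming on line boundaries keeps every surviving record parseable.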

Generate a crash report

claude-guardian doctor --out ./bundle.zip

Creates a zip containing:

summary.json — system info, file size report, preflight results
log-tails/ — last 500 lines of each log file
journal.jsonl — every action the guardian has ever taken
process.json — snapshot of running Claude processes at bundle time
timeline.json — reconstructed chronological event timeline
state.json — current daemon state (if daemon was running)
incidents.jsonl — incident history (if any)

Run with watchdog

claude-guardian run -- claude
claude-guardian run --auto-restart --hang-timeout 120 -- node server.js

The watchdog:

Spawns your command as a child process
Monitors stdout/stderr for activity
If no activity for --hang-timeout seconds → captures a doctor bundle
If the process crashes → captures a bundle, optionally restarts with backoff
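
The hang check and restart backoff described above could be sketched as follows. All function names are hypothetical. The 300-second default and the `Math.max(0, ...)` clamp come from this README; the exponential backoff shape, its 1-second base, and its 60-second cap are assumptions, since the README only says "backoff".

```typescript
// Seconds since the child last produced output.
// Clamped at zero so a backwards clock adjustment cannot yield a negative idle time.
function idleSeconds(lastActivityMs: number, nowMs: number): number {
  return Math.max(0, nowMs - lastActivityMs) / 1000;
}

// A process counts as hung once it has been silent for the full timeout.
function isHung(lastActivityMs: number, nowMs: number, hangTimeoutSec: number = 300): boolean {
  return idleSeconds(lastActivityMs, nowMs) >= hangTimeoutSec;
}

// Assumed policy: double the delay on each restart attempt, up to a cap.
function restartDelayMs(attempt: number, baseMs: number = 1000, capMs: number = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, Math.max(0, attempt)));
}
```

A watchdog loop would update `lastActivityMs` whenever stdout or stderr emits data, then call `isHung` on a timer.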

MCP Server (the real unlock)

Add to ~/.claude.json:

{
  "mcpServers": {
    "guardian": {
      "command": "npx",
      "args": ["@mcptoolshop/claude-guardian", "mcp"]
    }
  }
}

Then Claude can call:

| Tool | What it returns |
| --- | --- |
| `guardian_status` | Disk, logs, processes, hang risk, budget, attention level |
| `guardian_preflight_fix` | Runs log rotation/trimming, returns before/after report |
| `guardian_doctor` | Creates diagnostics bundle (zip), returns path + summary |
| `guardian_nudge` | Safe auto-remediation: fix logs if bloated, capture bundle if needed |
| `guardian_budget_get` | Current concurrency cap, slots in use, active leases |
| `guardian_budget_acquire` | Request concurrency slots (returns lease ID) |
| `guardian_budget_release` | Release a lease when done with heavy work |
| `guardian_recovery_plan` | Step-by-step recovery plan naming exact tools to call |
| `guardian_preview_ready` | Poll a port until the dev server responds (use after `preview_start`) |
| `guardian_preview_recover` | Diagnose stuck preview sessions, classify project type, guide recovery |

This lets Claude say: "Attention is WARN. Running guardian_nudge, then reducing concurrency."
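
The acquire/release cycle behind the budget tools can be pictured with a small in-memory model. This is a sketch under assumptions: the `ConcurrencyBudget` class is hypothetical, it denies any request that would exceed the cap (the README does not say how over-cap requests are handled), and it uses counter-based lease IDs for brevity where Guardian uses full UUIDs.

```typescript
interface Lease {
  id: string;
  slots: number;
}

// Hypothetical in-memory model of a concurrency budget with leases.
class ConcurrencyBudget {
  private cap: number;
  private leases: Lease[] = [];
  private nextId = 1;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Total slots currently held by active leases.
  inUse(): number {
    let total = 0;
    for (const lease of this.leases) total += lease.slots;
    return total;
  }

  // Returns a lease ID, or null if granting the request would exceed the cap.
  acquire(slots: number): string | null {
    if (slots <= 0 || this.inUse() + slots > this.cap) return null;
    const id = "lease-" + this.nextId++;
    this.leases.push({ id: id, slots: slots });
    return id;
  }

  // Returns true if the lease existed and was released.
  release(id: string): boolean {
    const before = this.leases.length;
    this.leases = this.leases.filter(lease => lease.id !== id);
    return this.leases.length < before;
  }
}
```

The lease ID is what ties a later release back to the original request, which is why the caller has to hold on to it for the duration of the heavy work.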

Preview reliability

The guardian_preview_ready and guardian_preview_recover tools solve the race condition where preview_start returns success before the dev server is actually listening — which causes the browser to land on chrome-error:// and get stuck.

Workflow: preview_start → guardian_preview_ready (wait gate) → preview_snapshot

For non-web projects (Tauri, .NET MAUI, CLI tools), guardian_preview_recover detects the project type and returns "skip preview" guidance, avoiding false positives from the built-in preview verification hook.
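
The wait gate in the workflow above amounts to polling until the server answers or a deadline passes. The sketch below shows that pattern only; it is not the implementation of `guardian_preview_ready`. The names `waitUntilReady` and `WaitOptions` are hypothetical, and the probe, clock, and sleep function are injected so the loop stays independent of any particular runtime.

```typescript
interface WaitOptions {
  timeoutMs: number;                    // give up after this long
  intervalMs: number;                   // pause between probes
  now: () => number;                    // clock, e.g. () => Date.now()
  sleep: (ms: number) => Promise<void>; // e.g. a setTimeout wrapper
}

// Resolves true as soon as `probe` reports the server is listening,
// or false once another attempt would overrun the deadline.
async function waitUntilReady(probe: () => Promise<boolean>, opts: WaitOptions): Promise<boolean> {
  const deadline = opts.now() + opts.timeoutMs;
  while (true) {
    let ready = false;
    try {
      ready = await probe();
    } catch {
      ready = false; // a refused connection just means "not yet"
    }
    if (ready) return true;
    if (opts.now() + opts.intervalMs > deadline) return false;
    await opts.sleep(opts.intervalMs);
  }
}
```

In real use, `probe` might attempt an HTTP request or TCP connection to the dev server's port, and the snapshot step would run only after the function resolves to `true`.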

Configuration

Three knobs (everything else is hardcoded with sane defaults):

| Flag | Default | Description |
| --- | --- | --- |
| `--max-log-mb` | `200` | Max project log directory size in MB |
| `--hang-timeout` | `300` | Seconds of inactivity before declaring a hang |
| `--auto-restart` | `false` | Auto-restart on crash/hang |

Plus one hardcoded guardrail:

Disk free < 5GB → aggressive mode auto-enabled (shorter retention, lower thresholds)
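
One way to picture the guardrail is as a function that derives the effective limits from free disk space. The sketch is illustrative: `effectiveLimits` and the `Limits` shape are hypothetical, and halving the thresholds in aggressive mode is an assumption, since the README says "lower thresholds" without giving numbers. The 5GB trigger and the 200MB default come from this section; the 25MB file limit comes from the sample preflight output.

```typescript
interface Limits {
  maxLogMb: number;
  maxFileMb: number;
  aggressive: boolean;
}

const AGGRESSIVE_DISK_FREE_GB = 5; // the hardcoded guardrail

function effectiveLimits(diskFreeGb: number, maxLogMb: number = 200, maxFileMb: number = 25): Limits {
  const aggressive = diskFreeGb < AGGRESSIVE_DISK_FREE_GB;
  // Assumed scaling factor; the real aggressive-mode values are not documented here.
  const factor = aggressive ? 0.5 : 1;
  return {
    maxLogMb: maxLogMb * factor,
    maxFileMb: maxFileMb * factor,
    aggressive: aggressive,
  };
}
```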

Trust model

Claude Guardian is local-only. It has no network listener, no telemetry, and no cloud dependency.

What it reads: ~/.claude/projects/ (log files, sizes, modification times), process list (CPU, memory, uptime, handle counts for Claude-related processes via pidusage).

What it writes: ~/.claude-guardian/ (state.json, budget.json, journal.jsonl, doctor bundles). All files are under the user's home directory.

What it collects in bundles: System info (OS, CPU, memory, disk), log file tails (last 500 lines), process snapshots, and guardian's own journal. No API keys, tokens, credentials, or user content.

Dangerous actions — what Guardian will NOT do:

Kill processes or send signals (no SIGKILL, no SIGTERM)
Restart Claude Code or any other process
Delete files (rotation = gzip, trimming = keep last N lines)
Make network requests or phone home
Elevate privileges or access other users' data

If process killing or auto-restart is ever added, it will be behind an explicit opt-in flag, documented here, and off by default.

Reliability

Guardian is hardened for continuous daily use:

Async mutexes — withStateLock and withBudgetLock serialize concurrent file I/O, preventing TOCTOU races between the daemon and MCP tools
Overlap guard — daemon polls are protected by a pollInProgress flag so slow polls can't stack
Clock skew protection — all time deltas clamped with Math.max(0, ...) to handle system clock adjustments
Reverse-seek tail — large log files (>1MB) are tailed by reading chunks from the end, avoiding OOM on 500MB+ logs
Corruption recovery — corrupt state.json or budget.json files are backed up and reset with a journal entry for forensics
Process enumeration tracking — enumeration failures are captured in lastEnumerationError instead of silently swallowed
Full UUID lease IDs — budget leases use full UUIDs for reliable identification
Lease expiration journaling — expired leases are logged to the action journal for auditability
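
The reverse-seek tail from the list above can be illustrated with a chunked backwards read. In this sketch, `tailLines` and `RangeReader` are hypothetical names, and `readRange` stands in for a positioned file read. It works on string offsets to stay self-contained; a real byte-oriented version would also need to handle multi-byte characters split across chunk boundaries.

```typescript
// Stand-in for a positioned read: returns the content in [start, end).
type RangeReader = (start: number, end: number) => string;

function countNewlines(s: string): number {
  let n = 0;
  for (let i = 0; i < s.length; i++) {
    if (s.charAt(i) === "\n") n++;
  }
  return n;
}

// Read chunks from the end until enough newlines have been seen,
// so the cost depends on the tail size rather than the file size.
function tailLines(size: number, readRange: RangeReader, maxLines: number, chunkSize: number = 65536): string[] {
  if (maxLines <= 0 || size <= 0) return [];
  let pos = size;
  let buffer = "";
  let newlines = 0;
  // More than maxLines newlines guarantees maxLines complete lines,
  // even when the content ends with a trailing newline.
  while (pos > 0 && newlines <= maxLines) {
    const start = Math.max(0, pos - chunkSize);
    const chunk = readRange(start, pos);
    buffer = chunk + buffer;
    newlines += countNewlines(chunk);
    pos = start;
  }
  const endsWithNewline = buffer.charAt(buffer.length - 1) === "\n";
  const body = endsWithNewline ? buffer.slice(0, -1) : buffer;
  const lines = body.split("\n");
  return lines.slice(Math.max(0, lines.length - maxLines));
}
```

Only the chunks nearest the end are ever held in memory, which is what keeps a 500-line tail cheap on a very large log.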

Design principles

Evidence over vibes — every action writes a journal entry; crash bundles capture state, not guesses
Deterministic — no ML, no heuristics beyond file age and size. Decision table you can read in 60 seconds
Safe by default — rotation = gzip (reversible), trimming = keep last N lines (data preserved), no deletions in v1
Boring dependencies — commander, pidusage, archiver, @modelcontextprotocol/sdk. That's it.
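
A decision table driven only by file age and size might look like the sketch below. It illustrates the table-driven idea and does not reproduce Guardian's actual rules: the rule order, the `decide` function, and the 7-day rotation cutoff are all hypothetical. Only the 25MB file limit appears in this README, in the sample preflight output.

```typescript
type Action = "trim" | "rotate" | "none";

interface LogFile {
  sizeMb: number;
  ageDays: number;
}

interface Rule {
  name: string;
  matches: (file: LogFile) => boolean;
  action: Action;
}

const MAX_FILE_MB = 25;      // file limit shown in the sample preflight output
const ROTATE_AFTER_DAYS = 7; // hypothetical: the README does not state the age cutoff

// First matching row wins, so the order of rows is the whole policy.
const RULES: Rule[] = [
  { name: "oversized", matches: file => file.sizeMb > MAX_FILE_MB, action: "trim" },
  { name: "old", matches: file => file.ageDays > ROTATE_AFTER_DAYS, action: "rotate" },
];

function decide(file: LogFile): Action {
  for (const rule of RULES) {
    if (rule.matches(file)) return rule.action;
  }
  return "none";
}
```

Keeping the policy as data means the same input always maps to the same action, and a reviewer can audit the behavior by reading the rows.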

Development

npm install
npm run build
npm test

Scorecard

| Category | Score | Notes |
| --- | --- | --- |
| A. Security | 10/10 | SECURITY.md, local-only, no telemetry, no cloud |
| B. Error Handling | 10/10 | GuardianError (code+hint+cause), structured MCP errors, exit codes |
| C. Operator Docs | 10/10 | README, CHANGELOG, HANDBOOK, SHIP_GATE, walkthrough |
| D. Shipping Hygiene | 10/10 | CI + tests (203), dep-audit, npm published |
| E. Identity | 10/10 | Logo, translations, landing page, npm listing |
| Total | 50/50 | |

License

MIT

Built by MCP Tool Shop