Claude Hooks
Cross-platform Claude Code hooks for deterministic memory recall (Qdrant + Memory KG) with HyDE, attention decay, dedup, instinct extraction, and OpenWolf integration
claude-hooks
Cross-platform Claude Code hooks that auto-recall from Qdrant + Memory KG on every prompt and write findings back at the end of the turn.
Install once at the user level and every Claude Code session gets deterministic memory recall + storage: no per-project init, no model forgetting. Beyond the core:
- v0.5+ -- transparent `api.anthropic.com` proxy with stats DB + dashboard + behavior canaries
- v0.6+ -- in-process Python AST code-graph (with optional tree-sitter / MCP-server / clustering extras)
- v0.7+ -- session-scoped LSP engine + opt-in ruff PostToolUse hook
- v0.8+ -- Caliber grounding proxy + shared `agent_loop.runner`
- v1.0 -- daemon-first hook execution, stable skill surface, pgvector backup + validity canary stack
- v1.1 -- two LLM-to-LLM advisory features built on `agent_loop.runner`:
  - `/get-advice` -- single-model second-opinion advisor talking to a configured Ollama backend with project-grounded tool access. Multi-turn, effort-budgeted, tool-filtered. See `docs/get-advice.md`.
  - `/consultants` -- multi-agent council (planner → researcher → critic → synthesizer) with full per-role LLM-message-history persistence in `transcript.db`, so a follow-up against a session reopened from disk produces an answer indistinguishable from a still-warm one. Multi-model fan-out at `xmedium`/`xhigh`/`xmax` effort tiers, multi-critic consensus with meta-critic combine at `xmax`, synthesizer failure-fallback model chain, and a degraded-answer composer that surfaces the researcher + critic work even when the synthesizer can't compose. See `docs/consultants.md` for the runbook and `docs/benchmarks/EVALUATION.md` / `docs/benchmarks/` for the cloud-model evaluation suite (smoke + audit-medium + audit-high sweeps across kimi-k2.6, gemma4-31b, glm-5-1, qwen3-5, qwen3-5-397b, minimax-m2-7).
Quickstart
git clone https://github.com/mann1x/claude-hooks.git
cd claude-hooks
python3 install.py
The installer auto-detects your MCP servers, creates the config, and wires
hooks into ~/.claude/settings.json. Open a new Claude Code session and
you'll see:
Started with claude-hooks recall enabled (2 provider(s): Qdrant, Memory KG).
Check ~/.claude/claude-hooks.log to confirm hooks are firing.
For the full playbook -- LAN-shared proxy setup, systemd unit, statusline
wiring, monitoring, uninstall -- see docs/deployment.md.
Releases & versioning
- Current version: v1.1.0 -- see CHANGELOG.md for the full history, or `docs/whats-new.md` for the human-readable v1.1 highlights.
- Tagged releases live on GitHub Releases with auto-generated source code (zip / tar.gz) archives.
- Branch model: `main` is the release branch (every commit shippable, tags live here); `dev` is the working branch (feature work + fixes land here first). See `docs/RELEASING.md` for the cut procedure.
- To track unreleased work: `git log v1.1.0..origin/dev` after fetching.
- Optional self-update check (opt-in via `install.py` or `update_check.enabled = true`): the daemon polls `https://api.github.com/repos/mann1x/claude-hooks/releases/latest` at most once every 24 hours and the Stop hook surfaces a one-line notice when a newer tag is available. Fails silently on timeouts, retries 5× at 5-minute intervals before deferring to the next 24-hour window, caps notifications at 10 per release, and can be disabled at runtime by flipping `update_check.enabled` in `config/claude-hooks.json`.
What it does
user prompt
|
v
[UserPromptSubmit hook] --> HyDE expand --> recall from providers --> decay rank --> inject
|
v
Claude responds (knowing the prior context, deterministically)
|
v
[Stop hook] --> classify --> dedup check --> store --> extract instincts
|
v
[SessionStart on compact] --> full recall re-injection (memory recovery)
Features
Core (v0.1)
- Stdlib only for the core (Qdrant + Memory KG providers, hooks, dispatcher) -- no `pip install` needed. Optional features (pgvector, sqlite-vec, code-graph, MCP server, clustering) pull in their own deps via the `[code-graph]` / `[clustering]` / `[mcp-server]` extras.
- Python 3.9+, runs identically on Linux, macOS, and Windows.
- Auto-detection of MCP servers from `~/.claude.json`
- Plugin model: each memory backend is one file (qdrant, memory_kg, pgvector, sqlite_vec)
- OpenWolf integration: injects Do-Not-Repeat and recent bugs from `.wolf/projects`
- Non-blocking: every hook exits 0 even on failure
Intelligence (v0.2)
- HyDE query expansion -- generates a hypothetical answer via Ollama before searching Qdrant, dramatically improving recall quality. Falls back to raw prompt if Ollama is unavailable.
- Attention decay -- memories that haven't been recalled recently fade; frequently useful ones strengthen. Tracks history in a JSON file.
- Memory dedup -- before storing, checks for near-duplicates using text similarity. Prevents Qdrant from accumulating redundant entries.
- Observation classification -- tags stored memories as `fix`, `preference`, `decision`, `gotcha`, or `general` for better downstream filtering.
- Compact recall -- when Claude Code compacts context, the SessionStart hook re-injects full recalled memory so the model recovers what it lost.
- Instinct extraction -- when a bug-fix pattern is detected (error -> edit), auto-extracts it as a reusable markdown instinct file under `~/.claude/instincts/`.
- Progressive disclosure -- optional: inject only the first line of each memory with a char-count hint, cutting injected context by ~3-5x.
- `/reflect` synthesis -- CLI command that analyzes recent memories for recurring patterns and generates CLAUDE.md rules. Uses Ollama.
- Autonomous consolidation -- CLI command to find duplicates, compress old memories, and prune stale ones. Uses Ollama.
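The attention-decay idea can be illustrated with a half-life formula; this exact weighting is an assumption for illustration (the shipped formula may differ), with `halflife_days` matching the config knob of the same name:

```python
import math

def decayed_score(base: float, days_since_recall: float,
                  recall_count: int, halflife_days: float = 14.0) -> float:
    """Illustrative decay weighting: the score halves every `halflife_days`
    without a recall, while each past recall strengthens the memory
    logarithmically."""
    fade = 0.5 ** (days_since_recall / halflife_days)
    boost = 1.0 + math.log1p(recall_count)
    return base * fade * boost
```

With the default half-life of 14 days, a memory untouched for 28 days keeps a quarter of its weight; setting `halflife_days = 7` makes the fade twice as aggressive.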
Proxy / observability (v0.5+)
- Local HTTP proxy in front of `api.anthropic.com` (docs/proxy.md) that captures what Claude Code hooks can't see on their own. Opt-in via `config/claude-hooks.json`. `install.py` orchestrates the per-OS service -- pick "Use the API proxy?" -- either `[1]` install locally (systemd unit on Linux, `LaunchAgent` on macOS, UAC-elevated scheduled task on Windows) or `[2]` point at an existing proxy on the LAN (writes `ANTHROPIC_BASE_URL` into `~/.claude/settings.json` for you).
- Warmup short-circuit (`proxy.block_warmup: true`) -- drops the subagent-Warmup token drain (anthropics/claude-code#47922) without the all-or-nothing side-effects of `CLAUDE_CODE_DISABLE_BACKGROUND_TASKS`. Returns a spec-compliant stub (JSON or SSE) so CC never sees an error. The proxy recognises two distinct drain patterns and blocks both under the same flag:

  | Pattern | Signature (`claude_hooks/proxy/metadata.py`) | What it is |
  |---|---|---|
  | CLI "Warmup" | `first_user_text == "Warmup"` | The keepalive Claude Code sends every few turns to keep the context hot. Cheap per-request but runs thousands of times per day. Classic token drain. |
  | SDK-CLI subagent priming | `cc_entrypoint == "sdk-cli"` AND `agent_type == "subagent"` AND `num_messages == 1` | The Agent SDK's priming message when a subagent boots. Single user turn, no "Warmup" literal, so it looks like a real prompt to a naïve filter -- but it's the same "init the context" intent, just from the SDK-CLI entrypoint. Historically slipped past the old `first_user_text` check and amplified 300M+ cache reads/day on subagent-heavy workflows. |

  Both map to `is_warmup=True` in request metadata and are blocked identically when `block_warmup` is on. The dashboard's warmups_blocked counter aggregates them; `scripts/proxy_stats.py --show-sidechain` breaks them out.

  Update 2026-04-28 -- the literal `"Warmup"` priming call no longer appears in proxy logs starting with Claude Code 2.1.121 (60 blocks on 04-27 → 0 on 04-28 across 1,300+ requests, on a host with no proxy config change). The detector is unchanged; the traffic itself is absent. We're keeping `block_warmup: true` on as a safety net in case the pattern returns. See `docs/issue-warmup-token-drain.md` for the per-day evidence and the upstream issue thread.
- Live weekly-limit % -- the proxy captures Anthropic's `anthropic-ratelimit-unified-*` headers into a rolling state file; `scripts/statusline_usage.py` reads it for a compact statusline segment, and `scripts/weekly_token_usage.py --current-usage-pct` auto-populates from the same file.
- Structured observations (port from thedotmack/claude-mem) -- `hooks.stop.summary_format: "xml"` stores memories as `<observation><type><title><files_modified>…` so downstream recall can filter by type without prose parsing.
- Metadata-gated rerank -- `hooks.user_prompt_submit.metadata_filter` filters candidates by cwd / type / age / tags before vector rerank.
- Caliber grounding proxy (`docs/caliber-proxy.md`) -- local OpenAI-compat HTTP server that augments caliber with project grounding so `caliber init`/`refresh` cite real `path:line` references instead of hallucinated ones. Paired with `bin/caliber-smart` as a drop-in `caliber` wrapper that falls back to claude-cli when the proxy is down.

  ⚠️ The shipped `bin/caliber-grounding-proxy` defaults `CALIBER_GROUNDING_UPSTREAM=http://192.168.178.2:11433/v1` -- the author's home-LAN Ollama proxy. Override via the systemd drop-in or shell environment for your install (see the linked doc).

  Caliber-proxy listens on port 38090 by default (`caliber_proxy.listen_port` in `config/claude-hooks.json`). The matching `bin/caliber-smart` wrapper makes `caliber refresh`/`caliber init` use the proxy when it's up and fall through to a vanilla `caliber` invocation otherwise. The native-tools agent loop is in `claude_hooks/caliber_proxy/server.py`; design notes for picking a small grounding model live in `docs/gemma4-tool-use-notes.md`.
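The two Warmup drain signatures reduce to a small predicate. Field names follow the README's description of `claude_hooks/proxy/metadata.py`, but the function itself is an illustrative sketch, not the shipped code:

```python
def is_warmup(first_user_text: str, cc_entrypoint: str,
              agent_type: str, num_messages: int) -> bool:
    """Classify a request as one of the two known drain patterns."""
    # Pattern 1: the CLI keepalive, literally the word "Warmup"
    cli_warmup = first_user_text == "Warmup"
    # Pattern 2: the Agent SDK's single-turn subagent priming message
    sdk_priming = (cc_entrypoint == "sdk-cli"
                   and agent_type == "subagent"
                   and num_messages == 1)
    return cli_warmup or sdk_priming
```

When `block_warmup` is on, a request matching either arm would be answered with the stub response instead of being forwarded upstream.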
Stats DB + dashboard + behavior canaries
The proxy is more than a forwarder. Every request is parsed for
metadata (effort, model_requested, model_delivered, service_tier,
beta_features, thinking signature bytes, per-tool counts) and rolled
up into a SQLite stats DB. A read-only HTTP dashboard renders the
state, and an opt-in stop-phrase scanner adds behavior-quality canaries
on top:
- Stats DB -- schema v5 at `~/.claude/claude-hooks-proxy/stats.db`, populated by `scripts/proxy_rollup.py` running every 5 min (`claude-hooks-rollup.timer`). The per-request `requests` table feeds daily / session / model / agent rollups; the same path persists S3 thinking-depth and S4 per-tool-name canary counts.
- Dashboard -- read-only HTTP view on port 38081 (config: `proxy_dashboard.listen_port`). Renders today's request count, cache hit-rate, rate-limit utilisation, thinking metrics, tool-use canaries, behavior canaries, per-day rollups, agent / model breakdowns, beta-feature drift, and the `stop-phrases × effort × date` table that pinned the 2026-05-01 `xhigh` quality regression in `docs/cc-xhigh-regression-issue.md`.

  Run `bin/claude-hooks-dashboard` for a one-shot start, or install the `claude-hooks-dashboard.service` systemd unit (the proxy installer wires it for you). Restart with `systemctl restart claude-hooks-dashboard.service` after a code change.
- In-stream `stop_phrase_guard` (`proxy.scan_stop_phrases: true`) -- scans every assistant turn against the stellaraccident #42796 canary phrases (`config/stop_phrases.yaml`, ~8 categories: ownership-dodging, permission-seeking, premature-stopping, known-limitation labeling, session-length excuses, simplest-fix bias, reasoning-reversal, self-admitted error). Hits land in the `sp_*` columns of the stats DB and roll up by day, by effort, and by category. Rates per-1k requests show whether a route or a model variant has drifted in quality without you noticing the symptoms one turn at a time.
- Daily health line -- `claude-hooks-health.timer` fires once a day (default 09:07 UTC, after rollups have digested the morning's traffic) and runs `scripts/proxy_health_oneliner.py`. The script emits a single line summarising request counts, 5xx / 429 totals, and per-effort `ownD`/`permS` rates with a ↑ arrow when today is ≥ 2× the prior 7-day baseline. Output is appended to `~/.claude/proxy-health-daily.log` and the journal.
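The health line's ≥ 2× rule can be sketched as an illustrative helper (the shipped `proxy_health_oneliner.py` also handles empty days and reports several metrics at once):

```python
def regression_arrow(today_rate: float, prior_week: list[float]) -> str:
    """Return '↑' when today's per-1k rate is at least twice the mean
    of the prior 7-day baseline, else an empty string."""
    baseline = sum(prior_week) / len(prior_week) if prior_week else 0.0
    if baseline > 0 and today_rate >= 2 * baseline:
        return "↑"
    return ""
```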
For the full schema, query patterns, and dashboard route inventory
see docs/proxy.md. For the upstream-facing PR /
incident drafts that came out of the proxy data, see
docs/issue-warmup-token-drain.md
and docs/cc-xhigh-regression-issue.md.
Code graph (v0.6+)
A built-in, file-based code-structure graph (`graphify-out/graph.json` + `GRAPH_REPORT.md`) auto-built per project. Stdlib-only Python `ast` extractor; the opt-in `[code-graph]` extra adds tree-sitter parsing for JS/TS/Go/Rust/Java/Ruby. SessionStart injects a 2-3 KB structural summary; per-Grep `code_graph_lookup_enabled` adds one-line "X is at file:line, N callers" hints when the pattern looks like an identifier.
CLI subcommands (`python -m claude_hooks.code_graph ...`):
| Command | What |
|---|---|
| `build` | Walk the tree, extract symbols + calls + imports, write graph.json + GRAPH_REPORT.md |
| `info` | Print the graph's stats (file/node/edge counts, by-language) |
| `impact <symbol>` | Transitive callers + callees of a symbol (blast radius before refactoring) |
| `changes [--base REF]` | Blast-radius report for the current git diff (pre-commit / PR sanity check) |
| `trace <entrypoint>` | Forward call-chain trace from an entry function ("how does X flow through the system?") |
| `mermaid [--center SYM]` | Render a Mermaid module-map or local subgraph diagram |
| `clusters` | Detect functional communities in the call graph (Louvain when the `[clustering]` extra is installed; file-based fallback otherwise) |
| `companions` | Show detection state for axon + gitnexus + the local code graph |
Optional extras:
- `pip install claude-hooks[code-graph]` -- `tree-sitter-language-pack` for multi-language parsing.
- `pip install claude-hooks[clustering]` -- `python-louvain` + `networkx` for Louvain community detection.
- `pip install claude-hooks[mcp-server]` -- `mcp[cli]` to spin up an MCP server (`python -m claude_hooks.code_graph.mcp_server`) exposing the lookup/impact/changes/trace/mermaid/companions tools to any MCP client (Claude Code, Cursor, etc.).
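The stdlib-`ast` extraction pass at the heart of `build` can be sketched in a few lines. This is a toy version for illustration; the real extractor also records imports, methods, and cross-file call edges:

```python
import ast

def extract_symbols(source: str):
    """Collect (kind, name, line) for top-level defs plus the names of
    directly-called functions from a Python source string."""
    tree = ast.parse(source)
    symbols, calls = [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            symbols.append((type(node).__name__, node.name, node.lineno))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append(node.func.id)  # simple call edge: caller file -> name
    return symbols, calls
```

Running this over every `.py` file and joining the call names back to the symbol table is essentially how a `graph.json` of nodes and edges falls out.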
Companion code-graph engines
When you want richer queries than the built-in code_graph provides,
claude-hooks integrates with two heavier engines as opt-in companion
tools (silent no-op when absent):
- axon (RECOMMENDED for Python/JS/TS) -- `pip install axoniq`, KuzuDB-backed, dead-code detection, file watcher, 7 MCP tools.
- gitnexus (ALTERNATIVE for 14 languages or multi-repo) -- `npm i -g gitnexus`, LadybugDB-backed, hybrid BM25+vector+RRF search, multi-repo `group_*` tools, 16 MCP tools.
claude-hooks detects either via filesystem checks (binary on PATH +
per-project marker dir + global registry), appends an `mcp__axon__*` /
`mcp__gitnexus__*` hint to the SessionStart inject, and spawns the
appropriate analyze on Stop when the turn modified files. Both can
coexist; both reindex paths fire when their respective marker dirs
are present. See COMPANION_TOOLS.md Β§6-7 for
the install + comparison matrix.
The built-in code_graph always runs as the floor; the companions
upgrade specific dimensions (live MCP queries, dead-code detection,
multi-language coverage) when present.
IDE-style feedback loop (v0.7+)
Closes the "I didn't notice the import error until I ran the code" gap. Three complementary layers -- pick one or stack them:
- PostToolUse ruff hook (built-in, on by default) -- runs `ruff check` on every Python file Claude Code edits with `Edit`/`Write`/`MultiEdit`. Diagnostics are injected as `hookSpecificOutput.additionalContext` so the model sees them in the very next prompt -- before claiming the change is done. ~50 ms cold, catches undefined names, unused imports, syntax errors, etc. Config under `hooks.post_tool_use` in `config/claude-hooks.json`. Pairs with a `toml_comment_advisor` that nudges Claude to leave a `# why: …` line above any non-default value when editing hand-edited TOMLs (`.claude-hooks/`, `lsp-engine.toml`) -- config under `hooks.post_tool_use.toml_comment_advisor_enabled` (default on) and `toml_comment_advisor_paths` (default `[".claude-hooks/", "lsp-engine.toml"]`).
- cclsp (recommended companion, opt-in) -- multi-language LSP wrapper that fronts pyright / gopls / rust-analyzer / clangd / OmniSharp via a single MCP server. Gives Claude Code on-demand hover, go-to-definition, find-references, and type diagnostics across Python / Go / Rust / C/C++ / C#. See `docs/lsp-mcp.md` for the install + Linux/Windows config. Pairs with the ruff hook: ruff is the cheap synchronous Python layer, cclsp is the multi-language on-demand layer.
LSP engine (opt-in, v0.7+)
A session-scoped daemon that loads language servers once per
project and follows Claude Code's edits in real time, so
diagnostics queries return in single-digit milliseconds instead of
the 1-3 s pyright cold-start every cclsp call pays. Phases 0-4
shipped (config + lifecycle, daemon + session-affinity locks,
adaptive preload + git watcher, opt-in compile-aware diagnostics,
Windows IPC parity). See docs/lsp-engine.md
for the user guide and docs/PLAN-lsp-engine.md
for the locked design.
| Phase | What |
|---|---|
| Foundations (P0) | TOML config (.claude-hooks/lsp-engine.toml), per-language LspChild wrapper, schema validation. Per-project + per-language opt-in. |
| Daemon + locks (P1) | Long-lived claude_hooks.lsp_engine.daemon per project. UNIX socket IPC (POSIX) / named pipes (Windows). Per-file session-affinity locks serialise multi-session edits cleanly. |
| Preload + git (P2) | Adaptive preload of the code-graph hot set warms the LSP index before the first query. Polling git watcher bulk-refreshes open files on branch switch. |
| Compile-aware (P3) | Opt-in [compile_aware.commands] block merges cargo check / tsc --noEmit / mypy / go vet diagnostics on top of the LSP layer. Run /setup-compile-aware for a guided proposal of the per-language commands. |
| Windows parity + bench (P4) | multiprocessing.connection.Listener(family="AF_PIPE") backend, msvcrt.locking daemon lock, latency benchmarks. 0.25 ms p50 IPC, ~13 ms p99 -- IPC overhead is 0.1 % of pyright's 280 ms analysis time. Run python scripts/bench_lsp_engine.py for a fresh measurement. |
The engine is independent of the PostToolUse ruff hook and the
cclsp MCP server; you can run all three or any subset.
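To get a feel for why IPC overhead is negligible next to analysis time, here is a minimal round-trip benchmark over `multiprocessing.connection` (localhost TCP for portability in this sketch; the engine itself uses UNIX sockets / named pipes, and `scripts/bench_lsp_engine.py` is the real benchmark):

```python
import threading
import time
from multiprocessing.connection import Client, Listener

def bench_ipc(n: int = 200) -> float:
    """Rough p50 (in ms) of a request/response round-trip over the same
    IPC primitive the daemon uses."""
    listener = Listener(("127.0.0.1", 0))  # port 0: pick a free port
    addr = listener.address

    def serve():
        with listener.accept() as conn:
            for _ in range(n):
                conn.send(conn.recv())  # echo server standing in for the daemon

    threading.Thread(target=serve, daemon=True).start()
    samples = []
    with Client(addr) as conn:
        for i in range(n):
            t0 = time.perf_counter()
            conn.send({"op": "ping", "i": i})
            conn.recv()
            samples.append(time.perf_counter() - t0)
    listener.close()
    return sorted(samples)[n // 2] * 1000
```

On a typical machine this lands well under a millisecond per round-trip, which is why a warm daemon beats respawning pyright per query.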
Slash command -- /setup-compile-aware
Proposes a [compile_aware.commands] block for
.claude-hooks/lsp-engine.toml by detecting build tools in the
current project (Cargo.toml → cargo, tsconfig.json → tsc,
pyproject.toml + mypy → mypy, go.mod → go vet, Makefile, …).
Run this once after enabling the engine to wire the compile-aware
layer; it asks for explicit confirmation before writing.
Scripts
| Script | What |
|---|---|
| `scripts/status.py` | At-a-glance dashboard: systemd state, current rate-limit %, today's Warmup-blocked count. `--json` for scripting. |
| `scripts/weekly_token_usage.py` | Per-day token breakdown against a custom weekly-reset window (default Fri 10:00 CEST). Auto-populates %Limit from the proxy. `--show-sidechain` reveals the Warmup share. |
| `scripts/proxy_stats.py` | Ad-hoc proxy-log summaries (per-day requests, Warmup-blocked savings, synthetic-rate-limit detection, per-model counts). `--json` for scripting. |
| `scripts/proxy_rollup.py` | Ingest the proxy's daily JSONL files into stats.db (rollups + per-request rows). Driven by `claude-hooks-rollup.timer` (every 5 min, persistent across reboots). |
| `scripts/proxy_health_oneliner.py` | One-line daily health summary: per-effort ownD/permS rates, model divergences, 4xx/5xx, with ↑ arrows for ≥2× baseline regressions. Driven by `claude-hooks-health.timer`. |
| `scripts/statusline_usage.py` | Compact statusline segment showing live 5h / 7d %. Safe-by-design (never crashes the caller). |
| `scripts/statusline_compose.py` | Stitches the statusline pieces (model, weekly %, recall hit count, …) into the single string Claude Code reads from `statusLine.command`. |
| `scripts/bench_recall.py` | End-to-end recall latency benchmark across the configured providers. p50/p90/p99 + per-stage breakdown. |
| `scripts/bench_lsp_engine.py` | LSP engine vs ruff-only baseline. Measures did_change IPC-only and full round-trip (with diagnostics). Use after a new pyright / engine release. |
| `scripts/migrate_to_pgvector.py` | One-shot dump-and-load from Qdrant or Memory KG into the pgvector backend, with delta sync. See `docs/pgvector-runbook.md`. |
| `scripts/install-caliber-hook.sh` | Installs the Caliber pre-commit hook into the current repo so agent configs stay in sync. |
| `scripts/openwolfstatus.{py,sh,bat}` | OpenWolf status utility. |
bin/ shim reference
The bin/ directory ships small entry-point shims that auto-detect
the conda env and fall back to system Python. Use these from
settings.json hooks, systemd ExecStart lines, or the shell.
| Shim | What |
|---|---|
| `bin/claude-hook` | Hook dispatcher. Called from `~/.claude/settings.json` for every event; routes to the matching handler under `claude_hooks/hooks/`. POSIX (`claude-hook`) and Windows (`claude-hook.cmd`) variants. |
| `bin/claude-hooks-daemon` | Foreground entry to the long-lived hook executor (`claude_hooks.daemon`). Use under systemd or for debugging. |
| `bin/claude-hooks-daemon-ctl` | Daemon ctl: status / restart / kill against the live daemon socket. |
| `bin/claude-hooks-proxy` | Foreground entry to the API proxy (`claude_hooks.proxy.server`). |
| `bin/claude-hooks-dashboard` | Foreground entry to the read-only stats dashboard (port 38081). |
| `bin/claude-hooks-rollup` | One-shot proxy-log → stats.db ingester. Wired to `claude-hooks-rollup.timer`. |
| `bin/claude-hook-pgvector-mcp` | System-wide stdio MCP server that exposes pgvector recall + KG ops. Lets Cursor / Codex / OpenWebUI use the same Postgres store as Claude Code. Registered in `~/.claude.json` by install.py when pgvector is enabled. |
| `bin/caliber-grounding-proxy` | Foreground entry to the Caliber grounding proxy (port 38090). |
| `bin/caliber-smart` | Drop-in caliber wrapper that uses the proxy when up, falls through otherwise. |
| `bin/_resolve_python.sh` | Internal helper sourced by every shim to find the right Python. |
systemd unit reference
systemd/ ships the unit templates the proxy installer drops into
/etc/systemd/system/. Each is User=root by default; adjust the
User= and WorkingDirectory= lines for your install. Linux only;
macOS uses LaunchAgents, Windows uses scheduled tasks (the
installer handles all three).
| Unit | What |
|---|---|
| `claude-hooks-proxy.service` | Long-running proxy on port 38080 (configurable). |
| `claude-hooks-dashboard.service` | Read-only stats dashboard on port 38081. |
| `claude-hooks-rollup.service` + `.timer` | Ingests daily JSONL files into stats.db every 5 min, plus a 1-min boot delay. Persistent=true so a missed tick triggers once on wake. |
| `claude-hooks-health.service` + `.timer` | Daily one-line health summary (default 09:07 UTC). Appends to `~/.claude/proxy-health-daily.log` and the journal. |
| `claude-hooks-daemon.service` | Long-lived per-session hook executor -- lets each hook answer in milliseconds instead of paying the 100-300 ms Python cold-start. |
| `claude-hooks-pgvector-mcp.service` | System-wide stdio MCP server fronting pgvector. Useful when other clients (Cursor, Codex, OpenWebUI) want the same Postgres recall as Claude Code. |
| `caliber-grounding-proxy.service` | Caliber grounding proxy (port 38090) with project-aware tools (survey_project, recall). |
| `axon-host.service` | Optional Axon code-graph engine companion (Python, Neo4j-based). See COMPANION_TOOLS.md. |
Requirements
- Python 3.9+. The recall/store core is stdlib-only; only the proxy and the optional DB-backed providers (pgvector, sqlite-vec) need wheels.
- Claude Code with hooks support.
- At least one memory backend -- pick from the table below. Multiple can run simultaneously; the dispatcher fans out recall in parallel.
- (Optional) Ollama for HyDE, /reflect, /consolidate, and the embedder side of the pgvector / sqlite-vec providers.
Memory backends -- pick at install time
| Backend | Setup | Extra deps | Strengths |
|---|---|---|---|
| Qdrant (HTTP MCP) | Run mcp-server-qdrant (we ship a patched version under vendor/mcp-qdrant/) | none | mature vector search; the historical default |
| Memory KG (HTTP MCP) | Run mcp-memory (npm @modelcontextprotocol/server-memory) | none | typed entity graph + observation keyword search |
| Postgres + pgvector | Local docker stack β see docs/pgvector-runbook.md. install.py handles DSN probe, schema init, embedder pull, and registers a system-wide pgvector-mcp stdio server in ~/.claude.json so other MCP clients (Cursor/Codex/OpenWebUI) can use the same store. | pip install -r requirements-pgvector.txt | single SQL backend that replaces both Qdrant + Memory KG; hybrid recall (vector + BM25 RRF); native KG entities/relations/observations |
| sqlite-vec | Standalone SQLite file at ~/.claude/claude-hooks-memory.db | pip install -r requirements-sqlite-vec.txt | zero-server, single-file, low-footprint |
Conda env + dependency files
The installer creates a claude-hooks conda env (Python 3.11) by default
and pip-installs the requirements files relevant to your enabled
backends. Manual install for reference:
conda create -n claude-hooks python=3.11 -y
conda activate claude-hooks
pip install -r requirements.txt # core (httpx[http2])
pip install -r requirements-pgvector.txt # if pgvector enabled
pip install -r requirements-sqlite-vec.txt # if sqlite-vec enabled
pip install -r requirements-dev.txt # tests + coverage
The bin/claude-hook shim auto-detects this env (POSIX layout, Windows
Scripts/python.exe, MSYS2 hybrid) and falls back to system python3,
so no activation step is needed at hook runtime.
Install
git clone https://github.com/mann1x/claude-hooks.git
cd claude-hooks
python3 install.py
The installer will:
- Detect if you have a conda env and offer to create one (optional -- system Python works fine)
- Scan `~/.claude.json` for MCP servers matching Qdrant and Memory KG
- Verify each server with a real MCP call
- Write `config/claude-hooks.json` with your server URLs
- Merge hook entries into `~/.claude/settings.json` (idempotent, tagged `_managedBy`)
- Drop PATH wrappers for every `bin/*` shim so skill CLIs (`claude-advisor`, `claude-consultants`, …) resolve by bare name from Claude Code's bash subprocess. Locations are platform-specific:
  - POSIX (Linux + macOS): `~/.local/bin/<shim>` -- POSIX sh wrapper that `exec`s the absolute repo path. Almost always already on `PATH`; the installer prints a one-line hint if not.
  - Windows: `%LOCALAPPDATA%\claude-hooks\bin\<shim>` (POSIX sh wrapper for the MSYS bash that Claude Code uses) plus a `<shim>.cmd` sibling for native cmd / PowerShell users. The installer also prepends that directory to `HKCU\Environment\PATH` via `reg add` (NOT `setx`, which silently truncates User PATH to 1024 chars), then broadcasts `WM_SETTINGCHANGE` so new processes pick it up without a logoff. Wrappers carry an install-time tag string in their first comment line so re-runs are idempotent and `--uninstall` removes only the tagged ones -- hand-rolled wrappers of the same name are left alone.
- Ask "Install /consultants engine?". On yes (opt-in, off by default -- declines cleanly): creates a dedicated `claude-hooks-consultants` conda env (Py 3.11), pip-installs the `consultants/` package with its LangGraph + LangServe stack, and wires the per-OS service. Two modes:
  - Always-on (default): systemd / launchd / Task Scheduler unit keeps the engine resident, ~250 MB steady-state RAM. First-turn latency is sub-second.
  - Smart-start (opt-in): the daemon spawns the engine on demand and reaps it after `idle_timeout_seconds` (default 30 min). Zero RAM idle, ~5-10 s cold start on first request after a quiet period.

  Conda is required -- install.py aborts with a clear message pointing at Miniconda if it's missing, no silent fallback to bare venv. Everything goes through the dedicated env so the LangGraph dep tree never leaks into the main `claude-hooks` conda env that the test suite runs in.
- Ask "Use the API proxy?". On yes:
  - `[1]` Local install -- pip-installs `httpx[http2]>=0.27` into the chosen Python env, then drops the per-OS service:
    - Linux -- `claude-hooks-proxy.service` + `rollup.service` + `rollup.timer` + `dashboard.service` in `/etc/systemd/system/`, `daemon-reload` + `enable --now`.
    - macOS -- `~/Library/LaunchAgents/com.claude-hooks.proxy.plist` (`KeepAlive=true`, `RunAtLoad=true`), loaded via `launchctl`.
    - Windows -- UAC-elevated logon-triggered scheduled task `claude-hooks-proxy` (pythonw + `run_proxy.py` to avoid a persistent cmd window).

    Optionally writes `ANTHROPIC_BASE_URL=http://127.0.0.1:38080` into `~/.claude/settings.json` (LAN listen hosts auto-translate to loopback on the client side).
  - `[2]` Remote URL -- prompts for the proxy URL of an existing host on the LAN (e.g. `http://192.168.178.2:38080`) and writes `ANTHROPIC_BASE_URL` into `~/.claude/settings.json`. No local service.
- Idempotent on re-run: already-installed services are detected and left alone unless you confirm reinstall.
Installer flags
python3 install.py --dry-run # show changes, don't write
python3 install.py --non-interactive # CI-friendly, fail on prompts
python3 install.py --uninstall # remove all claude-hooks entries
python3 install.py --probe # force tool-probe detection
Verify it works
After install, open a new Claude Code session. You should see the
SessionStart status line. Then check the log:
tail -20 ~/.claude/claude-hooks.log
You should see recall entries for each provider on every prompt.
Configuration
After install, config/claude-hooks.json lives in the repo (gitignored).
Full schema with all options: config/claude-hooks.example.json.
v0.2 features (all opt-in via config)
| Feature | Config key | Default | What it does |
|---|---|---|---|
| HyDE query expansion | hooks.user_prompt_submit.hyde_enabled | false | Generates a hypothetical answer via Ollama to improve search recall |
| Attention decay | hooks.user_prompt_submit.decay_enabled | false | Fades old memories, strengthens frequently useful ones. halflife_days = how fast (14 = gentle, 7 = aggressive) |
| Progressive disclosure | hooks.user_prompt_submit.progressive | false | Shows only first line + char count per memory, ~3-5x less context |
| Memory dedup | providers.qdrant.dedup_threshold | 0.0 | Text similarity threshold before storing. Set to 0.85 to skip near-duplicates |
| Observation classification | hooks.stop.classify_observations | true | Tags memories as fix/preference/decision/gotcha/general |
| Compact recall | hooks.session_start.compact_recall | true | Re-injects memories after context compaction so nothing is lost |
| Instinct extraction | hooks.stop.extract_instincts | false | Auto-creates markdown "instinct" files from bug-fix patterns |
| /reflect synthesis | reflect.enabled | true | Requires Ollama. Analyzes memory patterns and generates CLAUDE.md rules |
| Consolidation | consolidate.enabled | false | Requires Ollama. Deduplicates, compresses, and prunes old memories |
| Auto-consolidation | consolidate.trigger | "manual" | "session_start" runs consolidate() automatically every min_sessions_between_runs (default 10) sessions. CLI invocation always works regardless. |
| PreToolUse memory warn | hooks.pre_tool_use.warn_on_tools / warn_on_patterns | ["Bash","Edit","Write"] / ["rm ","DROP TABLE","git reset --hard"] | Match a tool + a substring in its args; recall against that command and inject as advisory additionalContext. Never blocks. |
| PreToolUse file-read gate | hooks.pre_tool_use.file_read_gate / file_read_gate_tools | false / ["Read","Edit","MultiEdit"] | Port 5 from thedotmack/claude-mem. When Read/Edit/MultiEdit touches a path with prior memories, inject those memories regardless of warn_on_patterns. |
| Detached store | hooks.stop.detach_store | false | Fork the dedup-and-store fan-out into a detached subprocess so Stop returns immediately. ~200-500 ms saved per noteworthy turn. See docs/daemon.md. |
| Daemon (long-lived hook executor) | hooks.daemon.enabled (auto via installer) | platform-dependent | Single Python process owns providers + config across hook invocations. Each hook answers in milliseconds instead of 100-300 ms. See docs/daemon.md. |
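The `dedup_threshold` gate can be illustrated with stdlib `difflib` (the provider's actual similarity measure may differ; the behavior at threshold 0.0 matches the default of storing everything):

```python
from difflib import SequenceMatcher

def is_near_duplicate(candidate: str, existing: list[str],
                      threshold: float = 0.85) -> bool:
    """True when `candidate` is too similar to any stored memory.
    A threshold of 0.0 disables the gate entirely."""
    if threshold <= 0.0:
        return False
    return any(SequenceMatcher(None, candidate, memory).ratio() >= threshold
               for memory in existing)
```

A store path would call this before writing: skip the insert when it returns True, otherwise persist the new memory.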
HyDE model
Default: gemma4:e2b with qwen3:4b fallback. Any small Ollama model
works -- it just needs to produce a short hypothetical answer for search
expansion. If Ollama is down, HyDE degrades gracefully to the raw prompt.
Commands Reference
Slash commands (inside Claude Code)
These are available as skills after running the installer. Type the command in the Claude Code prompt.
| Command | Since | Requires | Description |
|---|---|---|---|
/reflect | v0.2 | Ollama | Analyze recent memories for recurring patterns, generate CLAUDE.md rules |
/consolidate | v0.2 | Ollama | Find duplicate memories, compress old entries, prune stale ones |
/wrapup | v0.5 | -- | Produce a restore-ready session state summary before compacting / pausing |
/episodic <query> | v0.6 | episodic-server | Search past Claude Code conversations by semantic query |
/save-learning | v0.7 | -- | Save a user instruction/preference as a persistent learning |
/find-skills | v0.7 | caliber | Search the public skill registry for community skills |
/setup-caliber | v0.7 | caliber | Set up Caliber pre-commit hooks for config drift detection |
/setup-compile-aware | v0.7 | LSP engine | Detect build tools in the current project and propose a [compile_aware.commands] block for .claude-hooks/lsp-engine.toml. Asks for confirmation before writing. |
/get-advice <query> | v1.1 | claude-advisor + Ollama | Multi-turn LLM-to-LLM second-opinion conversation with a configured Ollama advisor. Project tools (read_file, grep, glob, list_files, recall_memory) available to the advisor. See docs/get-advice.md. |
/get-advice--model [name [ctx]] | v1.1 | claude-advisor | Report or set the advisor's Ollama model + optional pinned context length. |
/get-advice--effort [tier] | v1.1 | claude-advisor | Report or set the effort tier (low/medium/high/max) – caps how many fresh advisor sessions Claude may spawn per /get-advice.
/get-advice--tools [csv|all|none] | v1.1 | claude-advisor | Report or set the project-tool list exposed to the advisor. |
/consultants <query> | v1.1 | claude-consultants | Multi-agent council consultation (planner → researcher → critic → synthesizer) with per-role message-history persistence in transcript.db. See docs/consultants.md.
/consultants--config | v1.1 | claude-consultants | Interactive walk-through to toggle roles, change per-role models, set context pins, switch effort tier (low/medium/high/max/xmedium/xhigh/xmax), change service mode (always-on / smart-start). |
/consultants--list | v1.1 | claude-consultants | List past council sessions in this project. |
/consultants--show <sid> | v1.1 | claude-consultants | Print a stored session's synthesizer answer + metadata; --raw dumps transcript.db events. |
/consultants--followup [<sid>] <question> | v1.1 | claude-consultants | Iterate on a prior session – every role inherits its prior message thread from transcript.db. Failed-session-aware: when the most recent session failed (synthesizer flap), offers to chain off the failed sid (researcher + critic threads inherit warm; synthesizer re-runs with the v1.1 fallback chain) or its parent.
CLI commands (outside Claude Code)
Run these from your terminal in the claude-hooks repo directory.
```shell
# Memory analysis
python -m claude_hooks.reflect               # generate CLAUDE.md rules from memory patterns
python -m claude_hooks.reflect --dry-run     # preview without writing
python -m claude_hooks.consolidate           # deduplicate and compress old memories
python -m claude_hooks.consolidate --dry-run

# Installer
python3 install.py                           # interactive install
python3 install.py --dry-run                 # show changes, don't write
python3 install.py --non-interactive         # CI-friendly, no prompts
python3 install.py --uninstall               # remove all claude-hooks entries
python3 install.py --probe                   # force MCP tool-probe detection
python3 install.py --episodic-server         # configure as episodic-memory server
python3 install.py --episodic-client URL     # configure as episodic-memory client

# Episodic server (on the server host)
python3 episodic_server/server.py --host 0.0.0.0 --port 11435
systemctl status episodic-server             # if installed as systemd service
journalctl -u episodic-server -f             # follow server logs

# Episodic API (from any host)
curl "http://SERVER:11435/search?q=bcache&limit=5"  # search conversations
curl http://SERVER:11435/health              # health check
curl http://SERVER:11435/stats               # index statistics
curl -X POST http://SERVER:11435/sync        # trigger re-index
```
```shell
# /get-advice CLI (v1.1)
claude-advisor get-model                        # show configured model + ctx_max
claude-advisor set-model qwen3.5:cloud          # set model (auto-probes ctx_max)
claude-advisor set-model qwen3.5:cloud 32768    # set model + pin context length
claude-advisor get-effort                       # show effort tier + budget
claude-advisor set-effort medium                # low | medium | high | max
claude-advisor get-tools                        # show advisor's project-tool list
claude-advisor set-tools all                    # all known tools
claude-advisor set-tools none                   # tools-off
claude-advisor set-tools read_file,grep         # explicit subset
claude-advisor turn <sid> --first --message "..."  # start a session
claude-advisor turn <sid> --message "..."       # continue a session
claude-advisor reset <sid> --carryover "..."    # forced reset, returns new sid
claude-advisor cleanup                          # prune sessions > 24h old
```
```shell
# /consultants CLI (v1.1)
claude-consultants config show                  # JSON dump of full config
claude-consultants config set-role planner --model gemma4:31b-cloud
claude-consultants config set-role researcher --add-model glm-5.1:cloud   # x-tier extra
claude-consultants config set-role synthesizer --add-model glm-5.1:cloud  # failure fallback
claude-consultants config set-effort medium     # or xmedium / xhigh / xmax
claude-consultants config set-service-mode always-on  # or smart-start
claude-consultants config list-models           # tools-capable Ollama tags upstream
claude-consultants consult --message "..." --cwd "$(pwd)"
claude-consultants consult --message "..." --effort xhigh  # multi-model fan-out
claude-consultants status <sid>                 # poll progress
claude-consultants result <sid>                 # fetch summary + metadata
claude-consultants list                         # past sessions in this project
claude-consultants show <sid>                   # render stored summary
claude-consultants show <sid> --raw             # dump transcript.db events as JSONL
claude-consultants follow-up <parent_sid> --message "..."  # extend prior session
claude-consultants list-open                    # warm sessions in engine memory
claude-consultants reopen <sid>                 # restore evicted session from disk
claude-consultants close <sid>                  # release engine memory (reversible)
```
Per-project opt-out
```shell
touch your-project/.claude-hooks-disable
```
Any directory with this marker file (or any ancestor) will skip all hooks.
The filename can be changed via the top-level disable_marker_filename
config key (default .claude-hooks-disable) if you need a different
sentinel name for your organisation.
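The marker check is an ancestor walk from the working directory to the filesystem root. A minimal sketch (function name is illustrative, not the real internal API):

```python
from pathlib import Path

def hooks_disabled(cwd: str, marker: str = ".claude-hooks-disable") -> bool:
    """Return True if the sentinel file exists in cwd or any ancestor."""
    start = Path(cwd).resolve()
    for directory in (start, *start.parents):
        if (directory / marker).exists():
            return True
    return False
```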
Uninstall
```shell
python3 install.py --uninstall
```
This removes the 4 hook entries tagged _managedBy: "claude-hooks" from
~/.claude/settings.json. Your other hooks and settings are left intact.
Adding a new provider
- Create `claude_hooks/providers/<name>.py` implementing the `Provider` ABC
- Add it to the `REGISTRY` in `claude_hooks/providers/__init__.py`
- Re-run `python3 install.py`

The 4 methods a provider implements (detect, verify, recall, store) are the entire contract. Providers may optionally override batch_recall and batch_store for backends with native bulk operations – the default implementation parallelises single-shot calls.
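A minimal sketch of that contract, assuming illustrative signatures (the real ABC's parameter names and return types may differ) and showing one way the default batch methods could parallelise single-shot calls:

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor

class Provider(ABC):
    """Sketch of the four-method contract; signatures are illustrative."""

    @abstractmethod
    def detect(self) -> bool: ...   # is the backend configured at all?

    @abstractmethod
    def verify(self) -> bool: ...   # can we actually reach it right now?

    @abstractmethod
    def recall(self, query: str, k: int = 5) -> list[str]: ...

    @abstractmethod
    def store(self, text: str, metadata: dict) -> None: ...

    # Optional overrides for backends with native bulk operations;
    # the defaults fan single-shot calls out in parallel.
    def batch_recall(self, queries: list[str], k: int = 5) -> list[list[str]]:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda q: self.recall(q, k), queries))

    def batch_store(self, items: list[tuple[str, dict]]) -> None:
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda item: self.store(*item), items))
```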
Pgvector backend (optional)
For users who'd rather run a single Postgres-backed memory store than Qdrant + Memory KG, claude-hooks ships an opt-in pgvector provider plus a docker stack and a migration script.
The full walkthrough lives at docs/pgvector-runbook.md:
- docker compose at `/shared/config/mcp-pgvector/`
- idempotent migration + delta sync via `scripts/migrate_to_pgvector.py`
- a benchmark harness at `scripts/bench_recall.py`
- the design rationale at `docs/PLAN-pgvector-migration.md`
Bench-driven default embedder pick (since 2026-04-28) is
qwen3-embedding:0.6b (1024 dim, native 32k ctx). It replaces the
earlier nomic-embed-text default after a head-to-head bench showed
tighter cosine distances on niche queries and full 32k context that
eliminates the silent 8k truncation cliff on long Stop summaries.
Speed cost is real (~85 ms p50 embed vs ~38 ms for nomic) but total
recall stays under 100 ms end-to-end on HNSW. Tables are
model-namespaced (memories_<short>) because the embedding dim is
part of the column type – see the runbook's "Swapping the embedding
model" section if you want to change it.
Pgvector runs alongside Qdrant + Memory KG until you decide to retire them – there's no flag day.
Plugin Extraction
Some Claude Code plugins inject additionalContext on every PreToolUse
event, which accumulates context rapidly and can cause premature compaction.
The extract_plugin.py utility extracts the useful parts (skills, agents,
commands) as standalone files and disables the plugin's hooks:
```shell
python3 extract_plugin.py
```
This currently targets code-analysis@mag-claude-plugins, which intercepts
every Grep, Glob, Bash, Read, and Task call with claudemem enrichment.
After extraction, all skills (/code-analysis--investigate,
/code-analysis--deep-analysis, etc.) remain available on-demand β only the
automatic per-tool-call injection is removed.
Re-run after a plugin version bump to pick up new skills.
Vendored MCP servers
vendor/mcp-qdrant – patched mcp-server-qdrant with score threshold
Upstream mcp-server-qdrant
always returns QDRANT_SEARCH_LIMIT results on every qdrant-find call, no
matter how weak the cosine similarity. On a realistic memory store this
injects low-confidence noise into your prompt context on every turn.
vendor/mcp-qdrant/ contains a Dockerfile + idempotent build-time patch that
adds a QDRANT_SCORE_THRESHOLD env var, forwarding Qdrant's native
score_threshold into the MCP server. Set it to e.g. 0.40 and anything
below that similarity is dropped before reaching claude-hooks.
Same image, same endpoints as upstream – just one extra env var. See
vendor/mcp-qdrant/README.md for the full
build/run instructions and how to pick a threshold for your embedding model.
Optional PreToolUse / Stop hooks (opt-in)
Three optional hooks are bundled but disabled by default. Enable them
individually in config/claude-hooks.json after reading the doc for
each one.
stop_guard – force the assistant to keep working
Scans the last assistant message on Stop events for
ownership-dodging phrases ("pre-existing issue", "known limitation"),
session-quitting phrases ("good stopping point", "continue in the
next session"), and permission-seeking mid-task ("should I continue?").
If matched, returns decision: block with a correction so the
assistant resumes working instead of stopping. Respects
stop_hook_active to avoid infinite loops.
```json
"hooks": {
  "stop_guard": { "enabled": true }
}
```
Patterns are opinionated defaults (derived from rtfpessoa's CLAUDE.md
golden rules). Override with your own
patterns: [{"pattern": "regex", "correction": "message"}, ...] in
config. Source: claude_hooks/stop_guard.py.
User-intent wrap-up escape: by default the guard skips its check
when the last user message contains a wrap-up marker (e.g. "wrap up",
"compact the context", "save state", "continue another time",
"/wrapup"). This lets /wrapup and similar explicit hand-off requests
finish cleanly without being blocked. Disable with
skip_on_user_wrap_up: false, or extend the marker list via
user_wrap_up_markers: ["…", …].
Meta-context escape: by default the guard skips its check when the
match is only inside a quoted span ("…", '…', `…`) or the
message contains a meta-marker phrase like "trigger phrase",
"would trigger", "stop_guard", "testing the hook", etc. This avoids
false positives when the assistant is documenting, testing, or quoting
the guard's rules. Turn off with skip_meta_context: false, or
extend the marker list via meta_markers: ["…", …].
safety_scan – ask-before-running on dangerous commands
PreToolUse scanner that matches dangerous patterns anywhere in a
Bash command (after pipes, chains, find -exec, subshells), not just
as a prefix. Emits permissionDecision: "ask" on match so the user
always makes the call; never auto-denies. Complements the
prefix-based allow-list in ~/.claude/settings.json.
```json
"hooks": {
  "pre_tool_use": {
    "safety_scan_enabled": true,
    "safety_log_retention_days": 90
  }
}
```
Default pattern list covers sudo, rm -rf, mkfs, dd,
curl | sh, destructive git operations, npm install -g,
DROP TABLE, and more. See
claude_hooks/safety_patterns.py.
Matches are logged as JSONL under ~/.claude/permission-scanner/
with daily rotation (90-day retention by default).
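The match-anywhere behaviour boils down to unanchored regex search over the whole command string. A sketch with an illustrative pattern subset (the real list lives in `claude_hooks/safety_patterns.py`):

```python
import re

# Illustrative subset of the dangerous-pattern list.
DANGEROUS = [
    r"\bsudo\b",
    r"\brm\s+-[a-z]*r[a-z]*f",      # rm -rf and flag permutations
    r"\bmkfs\b",
    r"curl[^|]*\|\s*(sh|bash)",     # curl | sh pipelines
    r"\bDROP\s+TABLE\b",
]

def scan(command: str) -> dict:
    """Match anywhere in the command (after pipes, chains, subshells),
    not just as a prefix. Ask, never auto-deny."""
    for pattern in DANGEROUS:
        if re.search(pattern, command, re.IGNORECASE):
            return {
                "permissionDecision": "ask",
                "permissionDecisionReason": f"matched {pattern!r}",
            }
    return {}  # empty -> fall through to the normal permission flow
```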
rtk_rewrite – transparent command rewrite for token savings
PreToolUse hook that shells out to rtk
(a Rust CLI) to rewrite verbose find / grep / git log / du
style commands into terser rtk equivalents. rtk-ai claims 60-90%
token savings on matching commands.
```json
"hooks": {
  "pre_tool_use": {
    "rtk_rewrite_enabled": true,
    "rtk_min_version": "0.23.0"
  }
}
```
Requires the rtk binary (>= 0.23.0) on PATH. Install from
https://github.com/rtk-ai/rtk (Homebrew, curl installer, or download
the Windows zip). If rtk is missing or too old, the hook silently
passes the command through β safe to enable on partially-deployed
fleets. Name collision warning: there's an unrelated "Rust Type
Kit" crate also named rtk on crates.io β uninstall it first
(rm $(which rtk) if rtk --version shows 0.1.x without a
rewrite subcommand). Source:
claude_hooks/rtk_rewrite.py.
Safety interaction with rtk – when rtk produces a rewrite, the
hook emits permissionDecision: "allow", which bypasses the
prefix allow-list in ~/.claude/settings.json. To keep that safety
net, rtk_scan_rewrites: true (default) runs the scanner on
rtk-rewritten commands even when safety_scan_enabled: false:
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=false`, `rtk_scan_rewrites=true` (default): rtk rewrites `ls && rm -rf /tmp` → scanner catches `rm -rf` → "ask".
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=false`, `rtk_scan_rewrites=false`: same input → "allow" (user opted out of the safety net).
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=true`: scanner runs on every Bash command, rewritten or not.
Other configurable features
HyDE query expansion (UserPromptSubmit recall)
HyDE (Hypothetical Document Embeddings) rewrites your prompt into a
hypothetical answer before vector search, which usually lands better in
"answer space" than the raw question. Enabled via
hooks.user_prompt_submit.hyde_enabled: true. Tunables:
| Key | Default | Purpose |
|---|---|---|
hyde_model | gemma4:e2b | Primary Ollama model |
hyde_fallback_model | gemma4:e4b | Fallback if primary fails |
hyde_url | http://localhost:11434/api/generate | Ollama endpoint |
hyde_timeout | 30.0 | Per-call timeout (seconds) |
hyde_max_tokens | 150 | Output length cap for the hypothetical answer |
hyde_keep_alive | "15m" | Ollama keep_alive – keeps the model resident between calls so cold-start doesn't hit every prompt
hyde_grounded | true | Two-phase grounded pipeline: query Qdrant raw first, then feed top hits to the LLM as grounding before generating the expansion. Prevents hallucinated domain terms poisoning the search. |
hyde_ground_k | 3 | How many raw hits to use as grounding context |
hyde_ground_max_chars | 1500 | Cap on the grounding context size |
If raw recall finds nothing (garbage query), grounded mode short-circuits and skips HyDE entirely – cheaper than an ungrounded hallucinated expansion.
Precedence with metadata_filter – when both are enabled, the
metadata filter applies first: each provider returns
`recall_k * over_fetch_factor` candidates, the filter trims by
cwd/type/age/tags, and only the survivors form HyDE's grounding pool.
So a too-narrow filter will silently disable grounded HyDE (zero raw
hits → no grounding → HyDE skipped). Loosen the filter before
suspecting HyDE quality. See docs/hyde.md for the
full pipeline.
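The two-phase grounded pipeline reads as a short function. A sketch under stated assumptions: `recall` and `generate` are stand-ins for the provider search and the Ollama call, and the prompt template is illustrative:

```python
def grounded_hyde(prompt: str, recall, generate,
                  ground_k: int = 3, max_chars: int = 1500) -> str:
    """Two-phase grounded HyDE sketch: raw recall first, then feed the
    top hits to the LLM as grounding before generating the expansion."""
    hits = recall(prompt)[:ground_k]        # phase 1: raw vector recall
    if not hits:
        return prompt                       # garbage query: skip HyDE
    grounding = "\n".join(hits)[:max_chars] # cap the grounding context
    hypothetical = generate(
        f"Context:\n{grounding}\n\n"
        f"Write a short hypothetical answer to: {prompt}"
    )
    return hypothetical or prompt           # degrade to the raw prompt
```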
Per-provider dedup_threshold
On Stop, providers that expose cosine similarity (qdrant, pgvector,
sqlite_vec) can skip storing a turn summary if an existing entry is
above the given cosine threshold. Set on the provider entry:
```json
"providers": {
  "qdrant": { "dedup_threshold": 0.85 }
}
```
0.0 disables (the default for most providers). 0.85 is a sensible
floor for "don't bother, we already have this." Higher = stricter.
The threshold is a cosine similarity (range 0.0β1.0, higher = more
similar), computed via the provider's own embedding model on the
truncated summary (first 500 chars). Don't confuse it with `1 - distance`
in some pgvector queries – claude-hooks normalises providers to
similarity-space internally so dedup_threshold always means "skip
storing if any existing memory has cosine ≥ this value."
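The gate itself is a one-line comparison once scores are in similarity-space. A sketch with illustrative names (not the real internal helpers):

```python
def should_store(existing_similarities: list[float],
                 dedup_threshold: float = 0.0) -> bool:
    """Skip storing if any existing memory's cosine similarity is at or
    above the threshold; 0.0 disables the check entirely."""
    if dedup_threshold <= 0.0:
        return True
    return max(existing_similarities, default=0.0) < dedup_threshold

def pgvector_to_similarity(cosine_distance: float) -> float:
    """pgvector's cosine operator returns *distance*; normalise to
    similarity-space before comparing against dedup_threshold."""
    return 1.0 - cosine_distance
```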
classify_observations and instincts extraction (Stop hook)
The Stop hook tags each stored memory with an observation_type
(fix, decision, preference, gotcha, general) so downstream
tooling can filter. Toggle with hooks.stop.classify_observations.
hooks.stop.extract_instincts (opt-in) additionally runs a lightweight
heuristic to pull persistent rules from the assistant's message and
write them to hooks.stop.instincts_dir (default ~/.claude/instincts)
as a sidecar you can promote to CLAUDE.md manually. Experimental.
summary_format: markdown vs XML
hooks.stop.summary_format controls the layout of stored memories:
- `"markdown"` (default) – backward-compatible plain-text bullet list. What every Qdrant corpus written before v0.5 contains.
- `"xml"` – structured `<observation>` block (port from thedotmack/claude-mem). Each field is addressable, so downstream recall can filter by type without prose parsing:

```xml
<observation ts="2026-04-29T12:34:56Z">
  <type>fix</type>
  <title>bcache make-bcache --wipe-bcache rebuild</title>
  <subtitle>/srv/dev-disk-by-label-opt/dev/claude-hooks</subtitle>
  <cwd>/srv/dev-disk-by-label-opt/dev/claude-hooks</cwd>
  <prompt>...truncated 600 chars...</prompt>
  <result>...truncated 1200 chars...</result>
  <files_modified>
    <file>/etc/fstab</file>
    <file>/etc/bcache.conf</file>
  </files_modified>
  <files_read>...</files_read>
  <commands>
    <command>make-bcache --wipe-bcache /dev/sda3</command>
  </commands>
</observation>
```

When summary_format: "xml" is on, the Stop hook also reads the `<type>` tag back to seed metadata.observation_type directly (skipping the heuristic classifier when the model has already declared it).
Mixing formats inside one corpus works but mucks up search ranking –
pick one per corpus and stick with it. The format the entry was
written with is recorded in metadata.summary_format so you can
filter or re-write later.
Claudemem auto-reindex
claudemem is a semantic
code-search tool with its own AST-aware index. Upstream ships a git
post-commit hook (claudemem hooks install), but that doesn't cover
uncommitted mid-session edits. This hook plugs the gap:
- Stop event: if the turn ran any Edit/Write/MultiEdit/NotebookEdit, spawn `claudemem index --quiet` detached so the hook adds no latency.
- SessionStart event: if the index is older than `staleness_minutes` AND any source file is newer than the index, reindex.
All triggers silently no-op if claudemem isn't on PATH or the project
has no .claudemem/ directory β safe to leave enabled on partially-
configured fleets.
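The SessionStart staleness check can be sketched as below. This is illustrative, not the shipped implementation: the index path, the behaviour at the scan cap, and the hardcoded ignore set are assumptions.

```python
import os
import time
from pathlib import Path

def index_is_stale(index_path: str, project_root: str,
                   staleness_minutes: int = 10,
                   max_files_to_scan: int = 2000) -> bool:
    """Reindex only when the index is past the cooldown AND some
    source file is newer than it."""
    index = Path(index_path)
    if not index.exists():
        return False              # no index: silently no-op
    index_mtime = index.stat().st_mtime
    if time.time() - index_mtime < staleness_minutes * 60:
        return False              # within cooldown
    scanned = 0
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Prune a few built-in ignored dirs (subset, for illustration).
        dirnames[:] = [d for d in dirnames
                       if d not in {".git", ".claudemem", "node_modules"}]
        for name in filenames:
            scanned += 1
            if scanned > max_files_to_scan:
                return False      # cap the walk on huge trees (assumption)
            if os.path.getmtime(os.path.join(dirpath, name)) > index_mtime:
                return True
    return False
```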
Config (hooks.claudemem_reindex):
| Key | Default | Purpose |
|---|---|---|
enabled | true | Master toggle |
check_on_stop | true | Reindex on turns that touched files |
check_on_session_start | true | Staleness check when a new session opens |
staleness_minutes | 10 | Cooldown – reindex at most every N min
max_files_to_scan | 2000 | Cap on the stale-scan walk (set higher for monorepos) |
ignored_dirs | [] | Extra dir names to skip (appended to built-in: .git, .claudemem, node_modules, .venv, __pycache__, .wolf, .caliber, dist, build, …)
lock_min_age_seconds | 60 | Cooldown on the .claudemem-reindex.lock file to prevent pile-ups |
For commit-time reindexing on every project, run:
```shell
claudemem hooks install   # in each git repo
```
Credits
The three optional hooks above are Python ports of the Bash hooks in rtfpessoa/code-factory:
- `stop_guard` – `hooks/stop-phrase-guard.sh`
- `safety_scan` – `hooks/command-safety-scanner.sh`
- `rtk_rewrite` – `hooks/rtk-rewrite.sh`
Design changes for claude-hooks: pure-Python implementation (no bash /
jq dependency), pattern lists surfaced as config, integration between
rtk_rewrite and safety_scan so rewrites are still scanned before
auto-approval. See
docs/PLAN-code-factory-integration.md
for the full integration plan.
Scripts
scripts/install-caliber-hook.sh – fast caliber pre-commit hook
Caliber's default pre-commit hook runs caliber refresh and
caliber learn finalize synchronously – each calls an LLM, and on
event-heavy sessions the combined wait can block git commit for
20 minutes or more. This installer replaces that hook with a
portable non-blocking version:
- backgrounds `caliber refresh` so commits return instantly (refreshed docs land in the next commit)
- drops `caliber learn finalize` (the SessionEnd Claude Code hook already runs the `--auto` version, and 240-event LLM passes on the commit path were hitting caliber's internal 600 s timeout)
- bounds the inner Claude CLI call at 60 s via `CALIBER_CLAUDE_CLI_TIMEOUT_MS`
- wraps the refresh in GNU `timeout 30` when available (skipped on Windows Git Bash, where `timeout.exe` is a `sleep` with different semantics)
Measured on a 240-event session: ~20 min → ~0.7 s per commit.
```shell
sh scripts/install-caliber-hook.sh        # install / update
sh scripts/install-caliber-hook.sh --dry  # preview
```
Existing .git/hooks/pre-commit is backed up to
.git/hooks/pre-commit.bak-<timestamp> before being replaced. Re-run
on every machine that clones the repo – git doesn't version-control
.git/hooks/ so the install can't be automatic.
scripts/openwolfstatus – OpenWolf dashboard status
Shows all registered OpenWolf projects, their dashboard/daemon port assignments, and PM2 process status. Warns if the PM2 state hasn't been saved (i.e. new daemons won't survive a reboot).
```shell
# Linux
./scripts/openwolfstatus.sh

# Windows
scripts\openwolfstatus.bat
```
PM2 auto-start on boot
OpenWolf daemons run under PM2. After starting or changing daemons, run
pm2 save to persist the process list. Then set up auto-start:
Linux (systemd):
```shell
pm2 startup   # generates and enables a systemd service (pm2-<user>)
pm2 save      # saves current process list for resurrection
```
This creates /etc/systemd/system/pm2-<user>.service which runs
pm2 resurrect on boot.
Windows:
```shell
npm install -g pm2-windows-startup
pm2-startup install   # adds a registry entry for auto-start on login
pm2 save              # saves current process list
```
This adds a PM2 entry under
HKCU\Software\Microsoft\Windows\CurrentVersion\Run that launches
pm2 resurrect at login.
Important: every time you add or remove an OpenWolf daemon, run
`pm2 save` again. Without it, the new daemon won't be restored after a reboot.
Tests
```shell
pip install -r requirements-dev.txt   # pytest + coverage
python -m pytest tests/ -q            # ~1.5k tests, mostly unit + a handful of integration
```
The exact pass count drifts as new modules ship – run `pytest --collect-only -q | tail -1` for the current total. Use `pytest -k <module>` to scope.
Branch coverage gate (target ≥ 80 %):
```shell
coverage run -m pytest tests/
coverage report
# Re-run `coverage report` for the current figure – the number
# drifts as new modules ship.
```
Test-file map
| File | Module under test | Tests | Notes |
|---|---|---|---|
tests/conftest.py | shared fixtures | – | fake_provider, base_config, fake_transcript, transcript_file, tmp_claude_home |
tests/mocks/ollama.py | Ollama HTTP stubs | – | mock_ollama_generate, mock_ollama_embeddings |
tests/test_fixtures.py | fixture smoke tests | 17 | Sanity-checks every fixture and mock |
tests/test_config.py | config.py | 6 | merge, paths, project-disabled marker |
tests/test_dedup.py | dedup.py | 11 | similarity, should_store, dedup w/ failing provider |
tests/test_decay.py | decay.py | 23 | hash, recency / frequency boost, prune, atomic state I/O |
tests/test_embedders.py | embedders.py | 14 | factory, Ollama / OpenAI clients, error paths |
tests/test_hyde.py | hyde.py | 18 | grounded expansion, fallback model, _format_context cap |
tests/test_recall.py | recall.py | 20 | full pipeline, dedup, OpenWolf injection, HyDE skip |
tests/test_handlers.py | hook handlers | 28 | UserPromptSubmit / SessionStart / SessionEnd / Stop store half |
tests/test_pre_tool_use_handler.py | hooks/pre_tool_use.py | 11 | safety + rtk integration |
tests/test_safety_scan.py | safety_scan.py | 17 | dangerous-command detection, allow-list |
tests/test_rtk_rewrite.py | rtk_rewrite.py | 12 | rewrite, version probe, opt-in policy |
tests/test_stop_guard.py | stop_guard.py | 23 | default patterns, meta-context escape, user-intent wrap-up escape |
tests/test_instincts.py | instincts.py | 13 | bug-fix detection, save / merge w/ frontmatter |
tests/test_reflect.py | reflect.py | 12 | guards, Ollama failure, dedup across providers, append idempotency |
tests/test_consolidate.py | consolidate.py | 16 | merge candidates, compress, state file, should_run cooldown |
tests/test_claudemem_reindex.py | claudemem_reindex.py | 15 | lock cooldown, staleness scan, async spawn |
tests/test_openwolf.py | openwolf.py | 9 | wolf-dir detection, anatomy / cerebrum read |
tests/test_dispatcher.py | dispatcher.py | 6 | event routing |
tests/test_detect.py | detect.py | 6 | MCP server discovery in ~/.claude.json |
tests/test_mcp_client.py | mcp_client.py | 6 | initialize β tools/call round-trip |
tests/test_providers.py | provider registry | 9 | Qdrant / Memory KG signatures |
tests/test_pgvector_integration.py | providers/pgvector.py | 8 (skipped w/o psycopg) | live Postgres |
tests/test_sqlite_vec_integration.py | providers/sqlite_vec.py | 8 (skipped w/o sqlite-vec) | live sqlite-vec |
tests/test_coverage_phase8.py | dispatcher / pre_tool_use / stop / providers / recall / safety_scan / claudemem_reindex | 120 | Phase 8 – error paths, edge cases, module-import failures (lifts coverage from 81 % → 92 %)
tests/test_proxy.py | claude_hooks/proxy/ (P0) | 17 | Pass-through, JSONL logging, Warmup + synthetic detection |
tests/test_proxy_p1.py | SSE tail + rate-limit state + weekly auto-populate | 22 | P1 observability half |
tests/test_proxy_p3.py | block_warmup short-circuit | 7 | Stub builders + upstream-not-called invariant |
tests/test_statusline_usage.py | scripts/statusline_usage.py | 16 | P4 segment rendering, stale detection, CLI safety |
tests/test_proxy_stats.py | scripts/proxy_stats.py | 9 | Aggregation, per-model, JSON output, since/until window |
tests/test_claude_mem_ports.py | ports 1-5 from thedotmack/claude-mem | 37 | XML summary, metadata filter, tag strip, composite hash, file-read gate |
Before merging: run `python -m pytest tests/` (0 failures) and `coverage report` (≥ 80 %). Both are part of the conda-env workflow documented at the top of this section.
Recommended Companion Tools
See COMPANION_TOOLS.md for detailed install instructions, descriptions, and importance rankings for tools that complement claude-hooks.
Documentation map
Runbooks (docs/):
- `deployment.md` – full install playbook (LAN-shared proxy, systemd, statusline, monitoring, uninstall)
- `env-vars.md` – curated reference of `CLAUDE_CODE_*` env vars the framework reads or writes
- `daemon.md` – long-lived hook executor: latency tiers, session lock, control protocol
- `proxy.md` – `claude_hooks/proxy/` runbook: schema, dashboard routes, Warmup detection, stop_phrase_guard config
- `pgvector-runbook.md` – Postgres + pgvector backend setup + system-wide MCP server registration
- `hyde.md` – HyDE query expansion configuration and tuning
- `caliber-proxy.md` – Caliber grounding proxy: native-tools agent, `survey_project`, `recall`
- `gemma4-tool-use-notes.md` – empirical notes on small grounding models for the caliber proxy
- `episodic-server.md` – HTTP front-end for obra/episodic-memory
- `lsp-engine.md` – LSP engine user guide
- `lsp-mcp.md` – cclsp MCP companion install and Linux/Windows config
- `RELEASING.md` – versioning, branch model, cut procedure, hotfix flow
Plans (docs/PLAN-*.md):
- `PLAN-lsp-engine.md` – locked design for the v0.7 LSP engine (Phases 0–4 shipped)
- `PLAN-stats-sqlite.md` – proxy SQLite rollup design (shipped)
- `PLAN-proxy-hook.md` – original proxy mode design (shipped)
- `PLAN-pgvector-migration.md` – Qdrant/Memory-KG → pgvector migration design
- `PLAN-test-coverage.md` – branch-coverage closure plan
- `PLAN-code-factory-integration.md` – code-factory integration design
Issue drafts (docs/, mirrored to upstream trackers):
- `issue-warmup-token-drain.md` – anthropics/claude-code Warmup detection / mitigation evidence
- `cc-xhigh-regression-issue.md` – anthropics/claude-code#55301 (xhigh quality regression filed 2026-05-01)
- `openwolf-managedby-issue.md` – cytostack/openwolf#31 / PR #32 (`_managedBy` tag)
- `doc-audit-2026-05-01.md` – this round's documentation gap report
Where the system listens
Default ports the framework introduces or expects. All bind to 127.0.0.1
by default; the proxy supports a LAN-listen mode for shared installs.
| Port | Service | Configurable as |
|---|---|---|
| 38080 | API proxy | proxy.listen_port |
| 38081 | Stats dashboard | proxy_dashboard.listen_port |
| 38090 | Caliber grounding proxy | caliber_proxy.listen_port |
| 11435 | Episodic-memory HTTP server | episodic_server.listen_port (see docs/episodic-server.md) |
| 11433 | (host-specific) Ollama upstream – used as CALIBER_GROUNDING_UPSTREAM default; override for your install | env / systemd drop-in |
License
Inspiration
- openwolf -- project-anatomy tracking
- claude-mem -- progressive disclosure
- vestige -- HyDE query expansion
- claude-cognitive -- attention decay
- everything-claude-code -- instincts
- claude-diary -- /reflect synthesis
- mnemex -- semantic code search
- caliber -- config drift detection
- episodic-memory -- transcript search
