Claude Hooks
Cross-platform Claude Code hooks for deterministic memory recall (Qdrant + Memory KG) with HyDE, attention decay, dedup, instinct extraction, and OpenWolf integration
claude-hooks
Cross-platform Claude Code hooks that auto-recall from Qdrant + Memory KG on every prompt and write findings back at the end of the turn.
Install once at the user level and every Claude Code session gets deterministic memory recall + storage: no per-project init, no model forgetting. Beyond the core:
- v0.5+ -- transparent `api.anthropic.com` proxy with stats DB + dashboard + behavior canaries
- v0.6+ -- in-process Python AST code-graph (with optional tree-sitter / MCP-server / clustering extras)
- v0.7+ -- session-scoped LSP engine + opt-in ruff PostToolUse hook
- v0.8+ -- Caliber grounding proxy + shared `agent_loop.runner`
- v1.0 -- daemon-first hook execution, stable skill surface, pgvector backup + validity canary stack
- v1.1 -- two LLM-to-LLM advisory features built on `agent_loop.runner`:
  - `/get-advice` -- single-model second-opinion advisor talking to a configured Ollama backend with project-grounded tool access. Multi-turn, effort-budgeted, tool-filtered. See `docs/get-advice.md`.
  - `/consultants` -- multi-agent council (planner → researcher → critic → synthesizer) with full per-role LLM-message-history persistence in `transcript.db`, so a follow-up against a session reopened from disk produces an answer indistinguishable from a still-warm one. Multi-model fan-out at `xmedium`/`xhigh`/`xmax` effort tiers, multi-critic consensus with meta-critic combine at `xmax`, synthesizer failure-fallback model chain, and a degraded-answer composer that surfaces the researcher + critic work even when the synthesizer can't compose. See `docs/consultants.md` for the runbook and `docs/benchmarks/EVALUATION.md` / `docs/benchmarks/` for the cloud-model evaluation suite (smoke + audit-medium + audit-high sweeps across kimi-k2.6, gemma4-31b, glm-5-1, qwen3-5, qwen3-5-397b, minimax-m2-7).
Quickstart
git clone https://github.com/mann1x/claude-hooks.git
cd claude-hooks
python3 install.py
The installer auto-detects your MCP servers, creates the config, and wires
hooks into ~/.claude/settings.json. Open a new Claude Code session and
you'll see:
Started with claude-hooks recall enabled (2 provider(s): Qdrant, Memory KG).
Check ~/.claude/claude-hooks.log to confirm hooks are firing.
For the full playbook -- LAN-shared proxy setup, systemd unit, statusline
wiring, monitoring, uninstall -- see docs/deployment.md.
Releases & versioning
- Current version: v1.1.0 -- see CHANGELOG.md for the full history, or `docs/whats-new.md` for the human-readable v1.1 highlights.
- Tagged releases live on GitHub Releases with auto-generated source code (zip / tar.gz) archives.
- Branch model: `main` is the release branch (every commit shippable, tags live here); `dev` is the working branch (feature work + fixes land here first). See `docs/RELEASING.md` for the cut procedure.
- To track unreleased work: `git log v1.1.0..origin/dev` after fetching.
- Optional self-update check (opt-in via `install.py` or `update_check.enabled = true`): the daemon polls `https://api.github.com/repos/mann1x/claude-hooks/releases/latest` at most once every 24 hours and the Stop hook surfaces a one-line notice when a newer tag is available. Fails silently on timeouts, retries 5× at 5-minute intervals before deferring to the next 24-hour window, caps notifications at 10 per release, and can be disabled at runtime by flipping `update_check.enabled` in `config/claude-hooks.json`.
What it does
user prompt
|
v
[UserPromptSubmit hook] --> HyDE expand --> recall from providers --> decay rank --> inject
|
v
Claude responds (knowing the prior context, deterministically)
|
v
[Stop hook] --> classify --> dedup check --> store --> extract instincts
|
v
[SessionStart on compact] --> full recall re-injection (memory recovery)
Features
Core (v0.1)
- Stdlib only for the core (Qdrant + Memory KG providers, hooks, dispatcher) -- no `pip install` needed. Optional features (pgvector, sqlite-vec, code-graph, MCP server, clustering) pull in their own deps via the `[code-graph]` / `[clustering]` / `[mcp-server]` extras.
- Python 3.9+, runs identically on Linux, macOS, and Windows.
- Auto-detection of MCP servers from `~/.claude.json`
- Plugin model: each memory backend is one file (qdrant, memory_kg, pgvector, sqlite_vec)
- OpenWolf integration: injects Do-Not-Repeat and recent bugs from `.wolf/projects`
- Non-blocking: every hook exits 0 even on failure
Intelligence (v0.2)
- HyDE query expansion -- generates a hypothetical answer via Ollama before searching Qdrant, dramatically improving recall quality. Falls back to raw prompt if Ollama is unavailable.
- Attention decay -- memories that haven't been recalled recently fade; frequently useful ones strengthen. Tracks history in a JSON file.
- Memory dedup -- before storing, checks for near-duplicates using text similarity. Prevents Qdrant from accumulating redundant entries.
- Observation classification -- tags stored memories as `fix`, `preference`, `decision`, `gotcha`, or `general` for better downstream filtering.
- Compact recall -- when Claude Code compacts context, the SessionStart hook re-injects full recalled memory so the model recovers what it lost.
- Instinct extraction -- when a bug-fix pattern is detected (error -> edit), auto-extracts it as a reusable markdown instinct file under `~/.claude/instincts/`.
- Progressive disclosure -- optional: inject only the first line of each memory with a char-count hint, cutting injected context by ~3-5x.
- `/reflect` synthesis -- CLI command that analyzes recent memories for recurring patterns and generates CLAUDE.md rules. Uses Ollama.
- Autonomous consolidation -- CLI command to find duplicates, compress old memories, and prune stale ones. Uses Ollama.
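The attention-decay idea can be illustrated with a half-life formula; this exact weighting is an assumption for illustration (the shipped formula may differ), with `halflife_days` matching the config knob of the same name:

```python
import math

def decayed_score(base: float, days_since_recall: float,
                  recall_count: int, halflife_days: float = 14.0) -> float:
    """Illustrative decay weighting: the score halves every `halflife_days`
    without a recall, while each past recall strengthens the memory
    logarithmically."""
    fade = 0.5 ** (days_since_recall / halflife_days)
    boost = 1.0 + math.log1p(recall_count)
    return base * fade * boost
```

With the default half-life of 14 days, a memory untouched for 28 days keeps a quarter of its weight; setting `halflife_days = 7` makes the fade twice as aggressive.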
Proxy / observability (v0.5+)
- Local HTTP proxy in front of `api.anthropic.com` (docs/proxy.md) that captures what Claude Code hooks can't see on their own. Opt-in via `config/claude-hooks.json`. `install.py` orchestrates the per-OS service -- pick "Use the API proxy?" -- either `[1]` install locally (systemd unit on Linux, `LaunchAgent` on macOS, UAC-elevated scheduled task on Windows) or `[2]` point at an existing proxy on the LAN (writes `ANTHROPIC_BASE_URL` into `~/.claude/settings.json` for you).
- Warmup short-circuit (`proxy.block_warmup: true`) -- drops the subagent-Warmup token drain (anthropics/claude-code#47922) without the all-or-nothing side-effects of `CLAUDE_CODE_DISABLE_BACKGROUND_TASKS`. Returns a spec-compliant stub (JSON or SSE) so CC never sees an error. The proxy recognises two distinct drain patterns and blocks both under the same flag:

  | Pattern | Signature (`claude_hooks/proxy/metadata.py`) | What it is |
  |---|---|---|
  | CLI "Warmup" | `first_user_text == "Warmup"` | The keepalive Claude Code sends every few turns to keep the context hot. Cheap per-request but runs thousands of times per day. Classic token drain. |
  | SDK-CLI subagent priming | `cc_entrypoint == "sdk-cli"` AND `agent_type == "subagent"` AND `num_messages == 1` | The Agent SDK's priming message when a subagent boots. Single user turn, no "Warmup" literal, so it looks like a real prompt to a naïve filter -- but it's the same "init the context" intent, just from the SDK-CLI entrypoint. Historically slipped past the old `first_user_text` check and amplified 300M+ cache reads/day on subagent-heavy workflows. |

  Both map to `is_warmup=True` in request metadata and are blocked identically when `block_warmup` is on. The dashboard's warmups_blocked counter aggregates them; `scripts/proxy_stats.py --show-sidechain` breaks them out.

  Update 2026-04-28 -- the literal `"Warmup"` priming call no longer appears in proxy logs starting with Claude Code 2.1.121 (60 blocks on 04-27 → 0 on 04-28 across 1,300+ requests, on a host with no proxy config change). The detector is unchanged; the traffic itself is absent. We're keeping `block_warmup: true` on as a safety net in case the pattern returns. See `docs/issue-warmup-token-drain.md` for the per-day evidence and the upstream issue thread.
- Live weekly-limit % -- the proxy captures Anthropic's `anthropic-ratelimit-unified-*` headers into a rolling state file; `scripts/statusline_usage.py` reads it for a compact statusline segment, and `scripts/weekly_token_usage.py --current-usage-pct` auto-populates from the same file.
- Structured observations (port from thedotmack/claude-mem) -- `hooks.stop.summary_format: "xml"` stores memories as `<observation><type><title><files_modified>…` so downstream recall can filter by type without prose parsing.
- Metadata-gated rerank -- `hooks.user_prompt_submit.metadata_filter` filters candidates by cwd / type / age / tags before vector rerank.
- Caliber grounding proxy (`docs/caliber-proxy.md`) -- local OpenAI-compat HTTP server that augments caliber with project grounding so `caliber init`/`refresh` cite real `path:line` references instead of hallucinated ones. Paired with `bin/caliber-smart` as a drop-in `caliber` wrapper that falls back to claude-cli when the proxy is down.

  ⚠️ The shipped `bin/caliber-grounding-proxy` defaults `CALIBER_GROUNDING_UPSTREAM=http://192.168.178.2:11433/v1` -- the author's home-LAN Ollama proxy. Override via the systemd drop-in or shell environment for your install (see the linked doc).

  Caliber-proxy listens on port 38090 by default (`caliber_proxy.listen_port` in `config/claude-hooks.json`). The matching `bin/caliber-smart` wrapper makes `caliber refresh`/`caliber init` use the proxy when it's up and fall through to a vanilla `caliber` invocation otherwise. The native-tools agent loop is in `claude_hooks/caliber_proxy/server.py`; design notes for picking a small grounding model live in `docs/gemma4-tool-use-notes.md`.
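The two Warmup drain signatures reduce to a small predicate. Field names follow the README's description of `claude_hooks/proxy/metadata.py`, but the function itself is an illustrative sketch, not the shipped code:

```python
def is_warmup(first_user_text: str, cc_entrypoint: str,
              agent_type: str, num_messages: int) -> bool:
    """Classify a request as one of the two known drain patterns."""
    # Pattern 1: the CLI keepalive, literally the word "Warmup"
    cli_warmup = first_user_text == "Warmup"
    # Pattern 2: the Agent SDK's single-turn subagent priming message
    sdk_priming = (cc_entrypoint == "sdk-cli"
                   and agent_type == "subagent"
                   and num_messages == 1)
    return cli_warmup or sdk_priming
```

When `block_warmup` is on, a request matching either arm would be answered with the stub response instead of being forwarded upstream.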
Stats DB + dashboard + behavior canaries
The proxy is more than a forwarder. Every request is parsed for
metadata (effort, model_requested, model_delivered, service_tier,
beta_features, thinking signature bytes, per-tool counts) and rolled
up into a SQLite stats DB. A read-only HTTP dashboard renders the
state, and an opt-in stop-phrase scanner adds behavior-quality canaries
on top:
- Stats DB -- schema v5 at `~/.claude/claude-hooks-proxy/stats.db`, populated by `scripts/proxy_rollup.py` running every 5 min (`claude-hooks-rollup.timer`). The per-request `requests` table feeds daily / session / model / agent rollups; the same path persists S3 thinking-depth and S4 per-tool-name canary counts.
- Dashboard -- read-only HTTP view on port 38081 (config: `proxy_dashboard.listen_port`). Renders today's request count, cache hit-rate, rate-limit utilisation, thinking metrics, tool-use canaries, behavior canaries, per-day rollups, agent / model breakdowns, beta-feature drift, and the `stop-phrases × effort × date` table that pinned the 2026-05-01 `xhigh` quality regression in `docs/cc-xhigh-regression-issue.md`.

  Run `bin/claude-hooks-dashboard` for a one-shot start, or install the `claude-hooks-dashboard.service` systemd unit (the proxy installer wires it for you). Restart with `systemctl restart claude-hooks-dashboard.service` after a code change.
- In-stream `stop_phrase_guard` (`proxy.scan_stop_phrases: true`) -- scans every assistant turn against the stellaraccident #42796 canary phrases (`config/stop_phrases.yaml`, ~8 categories: ownership-dodging, permission-seeking, premature-stopping, known-limitation labeling, session-length excuses, simplest-fix bias, reasoning-reversal, self-admitted error). Hits land in the `sp_*` columns of the stats DB and roll up by day, by effort, and by category. Rates per-1k requests show whether a route or a model variant has drifted in quality without you noticing the symptoms one turn at a time.
- Daily health line -- `claude-hooks-health.timer` fires once a day (default 09:07 UTC, after rollups have digested the morning's traffic) and runs `scripts/proxy_health_oneliner.py`. The script emits a single line summarising request counts, 5xx / 429 totals, and per-effort `ownD`/`permS` rates with a ↑ arrow when today is ≥ 2× the prior 7-day baseline. Output is appended to `~/.claude/proxy-health-daily.log` and the journal.
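The health line's ≥ 2× rule can be sketched as an illustrative helper (the shipped `proxy_health_oneliner.py` also handles empty days and reports several metrics at once):

```python
def regression_arrow(today_rate: float, prior_week: list[float]) -> str:
    """Return '↑' when today's per-1k rate is at least twice the mean
    of the prior 7-day baseline, else an empty string."""
    baseline = sum(prior_week) / len(prior_week) if prior_week else 0.0
    if baseline > 0 and today_rate >= 2 * baseline:
        return "↑"
    return ""
```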
For the full schema, query patterns, and dashboard route inventory
see docs/proxy.md. For the upstream-facing PR /
incident drafts that came out of the proxy data, see
docs/issue-warmup-token-drain.md
and docs/cc-xhigh-regression-issue.md.
Code graph (v0.6+)
A built-in, file-based code-structure graph (`graphify-out/graph.json` + `GRAPH_REPORT.md`) auto-built per project. Stdlib-only Python `ast` extractor; the opt-in `[code-graph]` extra adds tree-sitter parsing for JS/TS/Go/Rust/Java/Ruby. SessionStart injects a 2-3 KB structural summary; per-Grep `code_graph_lookup_enabled` adds one-line "X is at file:line, N callers" hints when the pattern looks like an identifier.
CLI subcommands (`python -m claude_hooks.code_graph ...`):
| Command | What |
|---|---|
| `build` | Walk the tree, extract symbols + calls + imports, write graph.json + GRAPH_REPORT.md |
| `info` | Print the graph's stats (file/node/edge counts, by-language) |
| `impact <symbol>` | Transitive callers + callees of a symbol (blast radius before refactoring) |
| `changes [--base REF]` | Blast-radius report for the current git diff (pre-commit / PR sanity check) |
| `trace <entrypoint>` | Forward call-chain trace from an entry function ("how does X flow through the system?") |
| `mermaid [--center SYM]` | Render a Mermaid module-map or local subgraph diagram |
| `clusters` | Detect functional communities in the call graph (Louvain when the `[clustering]` extra is installed; file-based fallback otherwise) |
| `companions` | Show detection state for axon + gitnexus + the local code graph |
Optional extras:
- `pip install claude-hooks[code-graph]` -- `tree-sitter-language-pack` for multi-language parsing.
- `pip install claude-hooks[clustering]` -- `python-louvain` + `networkx` for Louvain community detection.
- `pip install claude-hooks[mcp-server]` -- `mcp[cli]` to spin up an MCP server (`python -m claude_hooks.code_graph.mcp_server`) exposing the lookup/impact/changes/trace/mermaid/companions tools to any MCP client (Claude Code, Cursor, etc.).
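The stdlib-`ast` extraction pass at the heart of `build` can be sketched in a few lines. This is a toy version for illustration; the real extractor also records imports, methods, and cross-file call edges:

```python
import ast

def extract_symbols(source: str):
    """Collect (kind, name, line) for top-level defs plus the names of
    directly-called functions from a Python source string."""
    tree = ast.parse(source)
    symbols, calls = [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            symbols.append((type(node).__name__, node.name, node.lineno))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append(node.func.id)  # simple call edge: caller file -> name
    return symbols, calls
```

Running this over every `.py` file and joining the call names back to the symbol table is essentially how a `graph.json` of nodes and edges falls out.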
Companion code-graph engines
When you want richer queries than the built-in code_graph provides,
claude-hooks integrates with two heavier engines as opt-in companion
tools (silent no-op when absent):
- axon (RECOMMENDED for Python/JS/TS) -- `pip install axoniq`, KuzuDB-backed, dead-code detection, file watcher, 7 MCP tools.
- gitnexus (ALTERNATIVE for 14 languages or multi-repo) -- `npm i -g gitnexus`, LadybugDB-backed, hybrid BM25+vector+RRF search, multi-repo `group_*` tools, 16 MCP tools.
claude-hooks detects either via filesystem checks (binary on PATH +
per-project marker dir + global registry), appends an `mcp__axon__*` /
`mcp__gitnexus__*` hint to the SessionStart inject, and spawns the
appropriate analyze on Stop when the turn modified files. Both can
coexist; both reindex paths fire when their respective marker dirs
are present. See COMPANION_TOOLS.md Β§6-7 for
the install + comparison matrix.
The built-in code_graph always runs as the floor; the companions
upgrade specific dimensions (live MCP queries, dead-code detection,
multi-language coverage) when present.
IDE-style feedback loop (v0.7+)
Closes the "I didn't notice the import error until I ran the code" gap. Three complementary layers -- pick one or stack them:
- PostToolUse ruff hook (built-in, on by default) -- runs `ruff check` on every Python file Claude Code edits with `Edit`/`Write`/`MultiEdit`. Diagnostics are injected as `hookSpecificOutput.additionalContext` so the model sees them in the very next prompt -- before claiming the change is done. ~50 ms cold, catches undefined names, unused imports, syntax errors, etc. Config under `hooks.post_tool_use` in `config/claude-hooks.json`. Pairs with a `toml_comment_advisor` that nudges Claude to leave a `# why: …` line above any non-default value when editing hand-edited TOMLs (`.claude-hooks/`, `lsp-engine.toml`) -- config under `hooks.post_tool_use.toml_comment_advisor_enabled` (default on) and `toml_comment_advisor_paths` (default `[".claude-hooks/", "lsp-engine.toml"]`).
- cclsp (recommended companion, opt-in) -- multi-language LSP wrapper that fronts pyright / gopls / rust-analyzer / clangd / OmniSharp via a single MCP server. Gives Claude Code on-demand hover, go-to-definition, find-references, and type diagnostics across Python / Go / Rust / C/C++ / C#. See `docs/lsp-mcp.md` for the install + Linux/Windows config. Pairs with the ruff hook: ruff is the cheap synchronous Python layer, cclsp is the multi-language on-demand layer.
LSP engine (opt-in, v0.7+)
A session-scoped daemon that loads language servers once per
project and follows Claude Code's edits in real time, so
diagnostics queries return in single-digit milliseconds instead of
the 1-3 s pyright cold-start every cclsp call pays. Phases 0-4
shipped (config + lifecycle, daemon + session-affinity locks,
adaptive preload + git watcher, opt-in compile-aware diagnostics,
Windows IPC parity). See docs/lsp-engine.md
for the user guide and docs/PLAN-lsp-engine.md
for the locked design.
| Phase | What |
|---|---|
| Foundations (P0) | TOML config (.claude-hooks/lsp-engine.toml), per-language LspChild wrapper, schema validation. Per-project + per-language opt-in. |
| Daemon + locks (P1) | Long-lived claude_hooks.lsp_engine.daemon per project. UNIX socket IPC (POSIX) / named pipes (Windows). Per-file session-affinity locks serialise multi-session edits cleanly. |
| Preload + git (P2) | Adaptive preload of the code-graph hot set warms the LSP index before the first query. Polling git watcher bulk-refreshes open files on branch switch. |
| Compile-aware (P3) | Opt-in [compile_aware.commands] block merges cargo check / tsc --noEmit / mypy / go vet diagnostics on top of the LSP layer. Run /setup-compile-aware for a guided proposal of the per-language commands. |
| Windows parity + bench (P4) | multiprocessing.connection.Listener(family="AF_PIPE") backend, msvcrt.locking daemon lock, latency benchmarks. 0.25 ms p50 IPC, ~13 ms p99 -- IPC overhead is 0.1 % of pyright's 280 ms analysis time. Run python scripts/bench_lsp_engine.py for a fresh measurement. |
The engine is independent of the PostToolUse ruff hook and the
cclsp MCP server; you can run all three or any subset.
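To get a feel for why IPC overhead is negligible next to analysis time, here is a minimal round-trip benchmark over `multiprocessing.connection` (localhost TCP for portability in this sketch; the engine itself uses UNIX sockets / named pipes, and `scripts/bench_lsp_engine.py` is the real benchmark):

```python
import threading
import time
from multiprocessing.connection import Client, Listener

def bench_ipc(n: int = 200) -> float:
    """Rough p50 (in ms) of a request/response round-trip over the same
    IPC primitive the daemon uses."""
    listener = Listener(("127.0.0.1", 0))  # port 0: pick a free port
    addr = listener.address

    def serve():
        with listener.accept() as conn:
            for _ in range(n):
                conn.send(conn.recv())  # echo server standing in for the daemon

    threading.Thread(target=serve, daemon=True).start()
    samples = []
    with Client(addr) as conn:
        for i in range(n):
            t0 = time.perf_counter()
            conn.send({"op": "ping", "i": i})
            conn.recv()
            samples.append(time.perf_counter() - t0)
    listener.close()
    return sorted(samples)[n // 2] * 1000
```

On a typical machine this lands well under a millisecond per round-trip, which is why a warm daemon beats respawning pyright per query.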
Slash command -- /setup-compile-aware
Proposes a [compile_aware.commands] block for
.claude-hooks/lsp-engine.toml by detecting build tools in the
current project (Cargo.toml → cargo, tsconfig.json → tsc,
pyproject.toml + mypy → mypy, go.mod → go vet, Makefile, …).
Run this once after enabling the engine to wire the compile-aware
layer; it asks for explicit confirmation before writing.
Scripts
| Script | What |
|---|---|
| `scripts/status.py` | At-a-glance dashboard: systemd state, current rate-limit %, today's Warmup-blocked count. `--json` for scripting. |
| `scripts/weekly_token_usage.py` | Per-day token breakdown against a custom weekly-reset window (default Fri 10:00 CEST). Auto-populates %Limit from the proxy. `--show-sidechain` reveals the Warmup share. |
| `scripts/proxy_stats.py` | Ad-hoc proxy-log summaries (per-day requests, Warmup-blocked savings, synthetic-rate-limit detection, per-model counts). `--json` for scripting. |
| `scripts/proxy_rollup.py` | Ingest the proxy's daily JSONL files into stats.db (rollups + per-request rows). Driven by `claude-hooks-rollup.timer` (every 5 min, persistent across reboots). |
| `scripts/proxy_health_oneliner.py` | One-line daily health summary: per-effort ownD/permS rates, model divergences, 4xx/5xx, with ↑ arrows for ≥2× baseline regressions. Driven by `claude-hooks-health.timer`. |
| `scripts/statusline_usage.py` | Compact statusline segment showing live 5h / 7d %. Safe-by-design (never crashes the caller). |
| `scripts/statusline_compose.py` | Stitches the statusline pieces (model, weekly %, recall hit count, …) into the single string Claude Code reads from `statusLine.command`. |
| `scripts/bench_recall.py` | End-to-end recall latency benchmark across the configured providers. p50/p90/p99 + per-stage breakdown. |
| `scripts/bench_lsp_engine.py` | LSP engine vs ruff-only baseline. Measures did_change IPC-only and full round-trip (with diagnostics). Use after a new pyright / engine release. |
| `scripts/migrate_to_pgvector.py` | One-shot dump-and-load from Qdrant or Memory KG into the pgvector backend, with delta sync. See `docs/pgvector-runbook.md`. |
| `scripts/install-caliber-hook.sh` | Installs the Caliber pre-commit hook into the current repo so agent configs stay in sync. |
| `scripts/openwolfstatus.{py,sh,bat}` | OpenWolf status utility. |
bin/ shim reference
The bin/ directory ships small entry-point shims that auto-detect
the conda env and fall back to system Python. Use these from
settings.json hooks, systemd ExecStart lines, or the shell.
| Shim | What |
|---|---|
| `bin/claude-hook` | Hook dispatcher. Called from `~/.claude/settings.json` for every event; routes to the matching handler under `claude_hooks/hooks/`. POSIX (`claude-hook`) and Windows (`claude-hook.cmd`) variants. |
| `bin/claude-hooks-daemon` | Foreground entry to the long-lived hook executor (`claude_hooks.daemon`). Use under systemd or for debugging. |
| `bin/claude-hooks-daemon-ctl` | Daemon ctl: status / restart / kill against the live daemon socket. |
| `bin/claude-hooks-proxy` | Foreground entry to the API proxy (`claude_hooks.proxy.server`). |
| `bin/claude-hooks-dashboard` | Foreground entry to the read-only stats dashboard (port 38081). |
| `bin/claude-hooks-rollup` | One-shot proxy-log → stats.db ingester. Wired to `claude-hooks-rollup.timer`. |
| `bin/claude-hook-pgvector-mcp` | System-wide stdio MCP server that exposes pgvector recall + KG ops. Lets Cursor / Codex / OpenWebUI use the same Postgres store as Claude Code. Registered in `~/.claude.json` by install.py when pgvector is enabled. |
| `bin/caliber-grounding-proxy` | Foreground entry to the Caliber grounding proxy (port 38090). |
| `bin/caliber-smart` | Drop-in caliber wrapper that uses the proxy when up, falls through otherwise. |
| `bin/_resolve_python.sh` | Internal helper sourced by every shim to find the right Python. |
systemd unit reference
systemd/ ships the unit templates the proxy installer drops into
/etc/systemd/system/. Each is User=root by default; adjust the
User= and WorkingDirectory= lines for your install. Linux only;
macOS uses LaunchAgents, Windows uses scheduled tasks (the
installer handles all three).
| Unit | What |
|---|---|
| `claude-hooks-proxy.service` | Long-running proxy on port 38080 (configurable). |
| `claude-hooks-dashboard.service` | Read-only stats dashboard on port 38081. |
| `claude-hooks-rollup.service` + `.timer` | Ingests daily JSONL files into stats.db every 5 min, plus a 1-min boot delay. Persistent=true so a missed tick triggers once on wake. |
| `claude-hooks-health.service` + `.timer` | Daily one-line health summary (default 09:07 UTC). Appends to `~/.claude/proxy-health-daily.log` and the journal. |
| `claude-hooks-daemon.service` | Long-lived per-session hook executor -- lets each hook answer in milliseconds instead of paying the 100-300 ms Python cold-start. |
| `claude-hooks-pgvector-mcp.service` | System-wide stdio MCP server fronting pgvector. Useful when other clients (Cursor, Codex, OpenWebUI) want the same Postgres recall as Claude Code. |
| `caliber-grounding-proxy.service` | Caliber grounding proxy (port 38090) with project-aware tools (survey_project, recall). |
| `axon-host.service` | Optional Axon code-graph engine companion (Python, Neo4j-based). See COMPANION_TOOLS.md. |
Requirements
- Python 3.9+. The recall/store core is stdlib-only; only the proxy and the optional DB-backed providers (pgvector, sqlite-vec) need wheels.
- Claude Code with hooks support.
- At least one memory backend -- pick from the table below. Multiple can run simultaneously; the dispatcher fans out recall in parallel.
- (Optional) Ollama for HyDE, /reflect, /consolidate, and the embedder side of the pgvector / sqlite-vec providers.
Memory backends -- pick at install time
| Backend | Setup | Extra deps | Strengths |
|---|---|---|---|
| Qdrant (HTTP MCP) | Run mcp-server-qdrant (we ship a patched version under vendor/mcp-qdrant/) | none | mature vector search; the historical default |
| Memory KG (HTTP MCP) | Run mcp-memory (npm @modelcontextprotocol/server-memory) | none | typed entity graph + observation keyword search |
| Postgres + pgvector | Local docker stack β see docs/pgvector-runbook.md. install.py handles DSN probe, schema init, embedder pull, and registers a system-wide pgvector-mcp stdio server in ~/.claude.json so other MCP clients (Cursor/Codex/OpenWebUI) can use the same store. | pip install -r requirements-pgvector.txt | single SQL backend that replaces both Qdrant + Memory KG; hybrid recall (vector + BM25 RRF); native KG entities/relations/observations |
| sqlite-vec | Standalone SQLite file at ~/.claude/claude-hooks-memory.db | pip install -r requirements-sqlite-vec.txt | zero-server, single-file, low-footprint |
Conda env + dependency files
The installer creates a claude-hooks conda env (Python 3.11) by default
and pip-installs the requirements files relevant to your enabled
backends. Manual install for reference:
conda create -n claude-hooks python=3.11 -y
conda activate claude-hooks
pip install -r requirements.txt # core (httpx[http2])
pip install -r requirements-pgvector.txt # if pgvector enabled
pip install -r requirements-sqlite-vec.txt # if sqlite-vec enabled
pip install -r requirements-dev.txt # tests + coverage
The bin/claude-hook shim auto-detects this env (POSIX layout, Windows
Scripts/python.exe, MSYS2 hybrid) and falls back to system python3,
so no activation step is needed at hook runtime.
Install
git clone https://github.com/mann1x/claude-hooks.git
cd claude-hooks
python3 install.py
The installer will:
- Detect if you have a conda env and offer to create one (optional -- system Python works fine)
- Scan `~/.claude.json` for MCP servers matching Qdrant and Memory KG
- Verify each server with a real MCP call
- Write `config/claude-hooks.json` with your server URLs
- Merge hook entries into `~/.claude/settings.json` (idempotent, tagged `_managedBy`)
- Drop PATH wrappers for every `bin/*` shim so skill CLIs (`claude-advisor`, `claude-consultants`, …) resolve by bare name from Claude Code's bash subprocess. Locations are platform-specific:
  - POSIX (Linux + macOS): `~/.local/bin/<shim>` -- POSIX sh wrapper that `exec`s the absolute repo path. Almost always already on `PATH`; the installer prints a one-line hint if not.
  - Windows: `%LOCALAPPDATA%\claude-hooks\bin\<shim>` (POSIX sh wrapper for the MSYS bash that Claude Code uses) plus a `<shim>.cmd` sibling for native cmd / PowerShell users. The installer also prepends that directory to `HKCU\Environment\PATH` via `reg add` (NOT `setx`, which silently truncates User PATH to 1024 chars), then broadcasts `WM_SETTINGCHANGE` so new processes pick it up without a logoff. Wrappers carry an install-time tag string in their first comment line so re-runs are idempotent and `--uninstall` removes only the tagged ones -- hand-rolled wrappers of the same name are left alone.
- Ask "Install /consultants engine?". On yes (opt-in, off by default -- declines cleanly): creates a dedicated `claude-hooks-consultants` conda env (Py 3.11), pip-installs the `consultants/` package with its LangGraph + LangServe stack, and wires the per-OS service. Two modes:
  - Always-on (default): systemd / launchd / Task Scheduler unit keeps the engine resident, ~250 MB steady-state RAM. First-turn latency is sub-second.
  - Smart-start (opt-in): the daemon spawns the engine on demand and reaps it after `idle_timeout_seconds` (default 30 min). Zero RAM idle, ~5-10 s cold start on first request after a quiet period.

  Conda is required -- install.py aborts with a clear message pointing at Miniconda if it's missing, no silent fallback to bare venv. Everything goes through the dedicated env so the LangGraph dep tree never leaks into the main `claude-hooks` conda env that the test suite runs in.
- Ask "Use the API proxy?". On yes:
  - `[1]` Local install -- pip-installs `httpx[http2]>=0.27` into the chosen Python env, then drops the per-OS service:
    - Linux -- `claude-hooks-proxy.service` + `rollup.service` + `rollup.timer` + `dashboard.service` in `/etc/systemd/system/`, `daemon-reload` + `enable --now`.
    - macOS -- `~/Library/LaunchAgents/com.claude-hooks.proxy.plist` (`KeepAlive=true`, `RunAtLoad=true`), loaded via `launchctl`.
    - Windows -- UAC-elevated logon-triggered scheduled task `claude-hooks-proxy` (pythonw + `run_proxy.py` to avoid a persistent cmd window).

    Optionally writes `ANTHROPIC_BASE_URL=http://127.0.0.1:38080` into `~/.claude/settings.json` (LAN listen hosts auto-translate to loopback on the client side).
  - `[2]` Remote URL -- prompts for the proxy URL of an existing host on the LAN (e.g. `http://192.168.178.2:38080`) and writes `ANTHROPIC_BASE_URL` into `~/.claude/settings.json`. No local service.
- Idempotent on re-run: already-installed services are detected and left alone unless you confirm reinstall.
Installer flags
python3 install.py --dry-run # show changes, don't write
python3 install.py --non-interactive # CI-friendly, fail on prompts
python3 install.py --uninstall # remove all claude-hooks entries
python3 install.py --probe # force tool-probe detection
Verify it works
After install, open a new Claude Code session. You should see the
SessionStart status line. Then check the log:
tail -20 ~/.claude/claude-hooks.log
You should see recall entries for each provider on every prompt.
Configuration
After install, config/claude-hooks.json lives in the repo (gitignored).
Full schema with all options: config/claude-hooks.example.json.
v0.2 features (all opt-in via config)
| Feature | Config key | Default | What it does |
|---|---|---|---|
| HyDE query expansion | hooks.user_prompt_submit.hyde_enabled | false | Generates a hypothetical answer via Ollama to improve search recall |
| Attention decay | hooks.user_prompt_submit.decay_enabled | false | Fades old memories, strengthens frequently useful ones. halflife_days = how fast (14 = gentle, 7 = aggressive) |
| Progressive disclosure | hooks.user_prompt_submit.progressive | false | Shows only first line + char count per memory, ~3-5x less context |
| Memory dedup | providers.qdrant.dedup_threshold | 0.0 | Text similarity threshold before storing. Set to 0.85 to skip near-duplicates |
| Observation classification | hooks.stop.classify_observations | true | Tags memories as fix/preference/decision/gotcha/general |
| Compact recall | hooks.session_start.compact_recall | true | Re-injects memories after context compaction so nothing is lost |
| Instinct extraction | hooks.stop.extract_instincts | false | Auto-creates markdown "instinct" files from bug-fix patterns |
| /reflect synthesis | reflect.enabled | true | Requires Ollama. Analyzes memory patterns and generates CLAUDE.md rules |
| Consolidation | consolidate.enabled | false | Requires Ollama. Deduplicates, compresses, and prunes old memories |
| Auto-consolidation | consolidate.trigger | "manual" | "session_start" runs consolidate() automatically every min_sessions_between_runs (default 10) sessions. CLI invocation always works regardless. |
| PreToolUse memory warn | hooks.pre_tool_use.warn_on_tools / warn_on_patterns | ["Bash","Edit","Write"] / ["rm ","DROP TABLE","git reset --hard"] | Match a tool + a substring in its args; recall against that command and inject as advisory additionalContext. Never blocks. |
| PreToolUse file-read gate | hooks.pre_tool_use.file_read_gate / file_read_gate_tools | false / ["Read","Edit","MultiEdit"] | Port 5 from thedotmack/claude-mem. When Read/Edit/MultiEdit touches a path with prior memories, inject those memories regardless of warn_on_patterns. |
| Detached store | hooks.stop.detach_store | false | Fork the dedup-and-store fan-out into a detached subprocess so Stop returns immediately. ~200-500 ms saved per noteworthy turn. See docs/daemon.md. |
| Daemon (long-lived hook executor) | hooks.daemon.enabled (auto via installer) | platform-dependent | Single Python process owns providers + config across hook invocations. Each hook answers in milliseconds instead of 100-300 ms. See docs/daemon.md. |
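The `dedup_threshold` gate can be illustrated with stdlib `difflib` (the provider's actual similarity measure may differ; the behavior at threshold 0.0 matches the default of storing everything):

```python
from difflib import SequenceMatcher

def is_near_duplicate(candidate: str, existing: list[str],
                      threshold: float = 0.85) -> bool:
    """True when `candidate` is too similar to any stored memory.
    A threshold of 0.0 disables the gate entirely."""
    if threshold <= 0.0:
        return False
    return any(SequenceMatcher(None, candidate, memory).ratio() >= threshold
               for memory in existing)
```

A store path would call this before writing: skip the insert when it returns True, otherwise persist the new memory.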
HyDE model
Default: gemma4:e2b with qwen3:4b fallback. Any small Ollama model
works -- it just needs to produce a short hypothetical answer for search
expansion. If Ollama is down, HyDE degrades gracefully to the raw prompt.
Commands Reference
Slash commands (inside Claude Code)
These are available as skills after running the installer. Type the command in the Claude Code prompt.
| Command | Since | Requires | Description |
|---|---|---|---|
/reflect | v0.2 | Ollama | Analyze recent memories for recurring patterns, generate CLAUDE.md rules |
/consolidate | v0.2 | Ollama | Find duplicate memories, compress old entries, prune stale ones |
/wrapup | v0.5 | -- | Produce a restore-ready session state summary before compacting / pausing |
/episodic <query> | v0.6 | episodic-server | Search past Claude Code conversations by semantic query |
/save-learning | v0.7 | -- | Save a user instruction/preference as a persistent learning |
/find-skills | v0.7 | caliber | Search the public skill registry for community skills |
/setup-caliber | v0.7 | caliber | Set up Caliber pre-commit hooks for config drift detection |
/setup-compile-aware | v0.7 | LSP engine | Detect build tools in the current project and propose a [compile_aware.commands] block for .claude-hooks/lsp-engine.toml. Asks for confirmation before writing. |
/get-advice <query> | v1.1 | claude-advisor + Ollama | Multi-turn LLM-to-LLM second-opinion conversation with a configured Ollama advisor. Project tools (read_file, grep, glob, list_files, recall_memory) available to the advisor. See docs/get-advice.md. |
/get-advice--model [name [ctx]] | v1.1 | claude-advisor | Report or set the advisor's Ollama model + optional pinned context length. |
/get-advice--effort [tier] | v1.1 | claude-advisor | Report or set the effort tier (low/medium/high/max) – caps how many fresh advisor sessions Claude may spawn per /get-advice.
/get-advice--tools [csv|all|none] | v1.1 | claude-advisor | Report or set the project-tool list exposed to the advisor. |
/consultants <query> | v1.1 | claude-consultants | Multi-agent council consultation (planner → researcher → critic → synthesizer) with per-role message-history persistence in transcript.db. See docs/consultants.md.
/consultants--config | v1.1 | claude-consultants | Interactive walk-through to toggle roles, change per-role models, set context pins, switch effort tier (low/medium/high/max/xmedium/xhigh/xmax), change service mode (always-on / smart-start). |
/consultants--list | v1.1 | claude-consultants | List past council sessions in this project. |
/consultants--show <sid> | v1.1 | claude-consultants | Print a stored session's synthesizer answer + metadata; --raw dumps transcript.db events. |
/consultants--followup [<sid>] <question> | v1.1 | claude-consultants | Iterate on a prior session – every role inherits its prior message thread from transcript.db. Failed-session-aware: when the most recent session failed (synthesizer flap), offers to chain off the failed sid (researcher + critic threads inherit warm; synthesizer re-runs with the v1.1 fallback chain) or its parent.
CLI commands (outside Claude Code)
Run these from your terminal in the claude-hooks repo directory.
```shell
# Memory analysis
python -m claude_hooks.reflect               # generate CLAUDE.md rules from memory patterns
python -m claude_hooks.reflect --dry-run     # preview without writing
python -m claude_hooks.consolidate           # deduplicate and compress old memories
python -m claude_hooks.consolidate --dry-run

# Installer
python3 install.py                           # interactive install
python3 install.py --dry-run                 # show changes, don't write
python3 install.py --non-interactive         # CI-friendly, no prompts
python3 install.py --uninstall               # remove all claude-hooks entries
python3 install.py --probe                   # force MCP tool-probe detection
python3 install.py --episodic-server         # configure as episodic-memory server
python3 install.py --episodic-client URL     # configure as episodic-memory client

# Episodic server (on the server host)
python3 episodic_server/server.py --host 0.0.0.0 --port 11435
systemctl status episodic-server             # if installed as systemd service
journalctl -u episodic-server -f             # follow server logs

# Episodic API (from any host)
curl "http://SERVER:11435/search?q=bcache&limit=5"  # search conversations
curl http://SERVER:11435/health              # health check
curl http://SERVER:11435/stats               # index statistics
curl -X POST http://SERVER:11435/sync        # trigger re-index
```
```shell
# /get-advice CLI (v1.1)
claude-advisor get-model                        # show configured model + ctx_max
claude-advisor set-model qwen3.5:cloud          # set model (auto-probes ctx_max)
claude-advisor set-model qwen3.5:cloud 32768    # set model + pin context length
claude-advisor get-effort                       # show effort tier + budget
claude-advisor set-effort medium                # low | medium | high | max
claude-advisor get-tools                        # show advisor's project-tool list
claude-advisor set-tools all                    # all known tools
claude-advisor set-tools none                   # tools-off
claude-advisor set-tools read_file,grep         # explicit subset
claude-advisor turn <sid> --first --message "..."  # start a session
claude-advisor turn <sid> --message "..."       # continue a session
claude-advisor reset <sid> --carryover "..."    # forced reset, returns new sid
claude-advisor cleanup                          # prune sessions > 24h old
```
```shell
# /consultants CLI (v1.1)
claude-consultants config show                  # JSON dump of full config
claude-consultants config set-role planner --model gemma4:31b-cloud
claude-consultants config set-role researcher --add-model glm-5.1:cloud   # x-tier extra
claude-consultants config set-role synthesizer --add-model glm-5.1:cloud  # failure fallback
claude-consultants config set-effort medium     # or xmedium / xhigh / xmax
claude-consultants config set-service-mode always-on  # or smart-start
claude-consultants config list-models           # tools-capable Ollama tags upstream
claude-consultants consult --message "..." --cwd "$(pwd)"
claude-consultants consult --message "..." --effort xhigh  # multi-model fan-out
claude-consultants status <sid>                 # poll progress
claude-consultants result <sid>                 # fetch summary + metadata
claude-consultants list                         # past sessions in this project
claude-consultants show <sid>                   # render stored summary
claude-consultants show <sid> --raw             # dump transcript.db events as JSONL
claude-consultants follow-up <parent_sid> --message "..."  # extend prior session
claude-consultants list-open                    # warm sessions in engine memory
claude-consultants reopen <sid>                 # restore evicted session from disk
claude-consultants close <sid>                  # release engine memory (reversible)
```
Per-project opt-out
```shell
touch your-project/.claude-hooks-disable
```
Any directory with this marker file (or any ancestor) will skip all hooks.
The filename can be changed via the top-level disable_marker_filename
config key (default .claude-hooks-disable) if you need a different
sentinel name for your organisation.
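The marker check is an ancestor walk from the working directory to the filesystem root. A minimal sketch (function name is illustrative, not the real internal API):

```python
from pathlib import Path

def hooks_disabled(cwd: str, marker: str = ".claude-hooks-disable") -> bool:
    """Return True if the sentinel file exists in cwd or any ancestor."""
    start = Path(cwd).resolve()
    for directory in (start, *start.parents):
        if (directory / marker).exists():
            return True
    return False
```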
Uninstall
```shell
python3 install.py --uninstall
```
This removes the 4 hook entries tagged _managedBy: "claude-hooks" from
~/.claude/settings.json. Your other hooks and settings are left intact.
Adding a new provider
- Create `claude_hooks/providers/<name>.py` implementing the `Provider` ABC
- Add it to the `REGISTRY` in `claude_hooks/providers/__init__.py`
- Re-run `python3 install.py`

The 4 methods a provider implements (detect, verify, recall, store) are the entire contract. Providers may optionally override batch_recall and batch_store for backends with native bulk operations – the default implementation parallelises single-shot calls.
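A minimal sketch of that contract, assuming illustrative signatures (the real ABC's parameter names and return types may differ) and showing one way the default batch methods could parallelise single-shot calls:

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor

class Provider(ABC):
    """Sketch of the four-method contract; signatures are illustrative."""

    @abstractmethod
    def detect(self) -> bool: ...   # is the backend configured at all?

    @abstractmethod
    def verify(self) -> bool: ...   # can we actually reach it right now?

    @abstractmethod
    def recall(self, query: str, k: int = 5) -> list[str]: ...

    @abstractmethod
    def store(self, text: str, metadata: dict) -> None: ...

    # Optional overrides for backends with native bulk operations;
    # the defaults fan single-shot calls out in parallel.
    def batch_recall(self, queries: list[str], k: int = 5) -> list[list[str]]:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda q: self.recall(q, k), queries))

    def batch_store(self, items: list[tuple[str, dict]]) -> None:
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda item: self.store(*item), items))
```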
Pgvector backend (optional)
For users who'd rather run a single Postgres-backed memory store than Qdrant + Memory KG, claude-hooks ships an opt-in pgvector provider plus a docker stack and a migration script.
The full walkthrough lives at docs/pgvector-runbook.md:
- docker compose at `/shared/config/mcp-pgvector/`
- idempotent migration + delta sync via `scripts/migrate_to_pgvector.py`
- a benchmark harness at `scripts/bench_recall.py`
- the design rationale at `docs/PLAN-pgvector-migration.md`
Bench-driven default embedder pick (since 2026-04-28) is
qwen3-embedding:0.6b (1024 dim, native 32k ctx). It replaces the
earlier nomic-embed-text default after a head-to-head bench showed
tighter cosine distances on niche queries and full 32k context that
eliminates the silent 8k truncation cliff on long Stop summaries.
Speed cost is real (~85 ms p50 embed vs ~38 ms for nomic) but total
recall stays under 100 ms end-to-end on HNSW. Tables are
model-namespaced (memories_<short>) because the embedding dim is
part of the column type – see the runbook's "Swapping the embedding
model" section if you want to change it.
Pgvector runs alongside Qdrant + Memory KG until you decide to retire them – there's no flag day.
Plugin Extraction
Some Claude Code plugins inject additionalContext on every PreToolUse
event, which accumulates context rapidly and can cause premature compaction.
The extract_plugin.py utility extracts the useful parts (skills, agents,
commands) as standalone files and disables the plugin's hooks:
```shell
python3 extract_plugin.py
```
This currently targets code-analysis@mag-claude-plugins, which intercepts
every Grep, Glob, Bash, Read, and Task call with claudemem enrichment.
After extraction, all skills (/code-analysis--investigate,
/code-analysis--deep-analysis, etc.) remain available on-demand β only the
automatic per-tool-call injection is removed.
Re-run after a plugin version bump to pick up new skills.
Vendored MCP servers
vendor/mcp-qdrant – patched mcp-server-qdrant with score threshold
Upstream mcp-server-qdrant
always returns QDRANT_SEARCH_LIMIT results on every qdrant-find call, no
matter how weak the cosine similarity. On a realistic memory store this
injects low-confidence noise into your prompt context on every turn.
vendor/mcp-qdrant/ contains a Dockerfile + idempotent build-time patch that
adds a QDRANT_SCORE_THRESHOLD env var, forwarding Qdrant's native
score_threshold into the MCP server. Set it to e.g. 0.40 and anything
below that similarity is dropped before reaching claude-hooks.
Same image, same endpoints as upstream – just one extra env var. See
vendor/mcp-qdrant/README.md for the full
build/run instructions and how to pick a threshold for your embedding model.
Optional PreToolUse / Stop hooks (opt-in)
Three optional hooks are bundled but disabled by default. Enable them
individually in config/claude-hooks.json after reading the doc for
each one.
stop_guard – force the assistant to keep working
Scans the last assistant message on Stop events for
ownership-dodging phrases ("pre-existing issue", "known limitation"),
session-quitting phrases ("good stopping point", "continue in the
next session"), and permission-seeking mid-task ("should I continue?").
If matched, returns decision: block with a correction so the
assistant resumes working instead of stopping. Respects
stop_hook_active to avoid infinite loops.
```json
"hooks": {
  "stop_guard": { "enabled": true }
}
```
Patterns are opinionated defaults (derived from rtfpessoa's CLAUDE.md
golden rules). Override with your own
patterns: [{"pattern": "regex", "correction": "message"}, ...] in
config. Source: claude_hooks/stop_guard.py.
User-intent wrap-up escape: by default the guard skips its check
when the last user message contains a wrap-up marker (e.g. "wrap up",
"compact the context", "save state", "continue another time",
"/wrapup"). This lets /wrapup and similar explicit hand-off requests
finish cleanly without being blocked. Disable with
skip_on_user_wrap_up: false, or extend the marker list via
user_wrap_up_markers: ["…", …].
Meta-context escape: by default the guard skips its check when the
match is only inside a quoted span ("…", '…', `…`) or the
message contains a meta-marker phrase like "trigger phrase",
"would trigger", "stop_guard", "testing the hook", etc. This avoids
false positives when the assistant is documenting, testing, or quoting
the guard's rules. Turn off with skip_meta_context: false, or
extend the marker list via meta_markers: ["…", …].
safety_scan – ask-before-running on dangerous commands
PreToolUse scanner that matches dangerous patterns anywhere in a
Bash command (after pipes, chains, find -exec, subshells), not just
as a prefix. Emits permissionDecision: "ask" on match so the user
always makes the call; never auto-denies. Complements the
prefix-based allow-list in ~/.claude/settings.json.
```json
"hooks": {
  "pre_tool_use": {
    "safety_scan_enabled": true,
    "safety_log_retention_days": 90
  }
}
```
Default pattern list covers sudo, rm -rf, mkfs, dd,
curl | sh, destructive git operations, npm install -g,
DROP TABLE, and more. See
claude_hooks/safety_patterns.py.
Matches are logged as JSONL under ~/.claude/permission-scanner/
with daily rotation (90-day retention by default).
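The match-anywhere behaviour boils down to unanchored regex search over the whole command string. A sketch with an illustrative pattern subset (the real list lives in `claude_hooks/safety_patterns.py`):

```python
import re

# Illustrative subset of the dangerous-pattern list.
DANGEROUS = [
    r"\bsudo\b",
    r"\brm\s+-[a-z]*r[a-z]*f",      # rm -rf and flag permutations
    r"\bmkfs\b",
    r"curl[^|]*\|\s*(sh|bash)",     # curl | sh pipelines
    r"\bDROP\s+TABLE\b",
]

def scan(command: str) -> dict:
    """Match anywhere in the command (after pipes, chains, subshells),
    not just as a prefix. Ask, never auto-deny."""
    for pattern in DANGEROUS:
        if re.search(pattern, command, re.IGNORECASE):
            return {
                "permissionDecision": "ask",
                "permissionDecisionReason": f"matched {pattern!r}",
            }
    return {}  # empty -> fall through to the normal permission flow
```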
rtk_rewrite – transparent command rewrite for token savings
PreToolUse hook that shells out to rtk
(a Rust CLI) to rewrite verbose find / grep / git log / du
style commands into terser rtk equivalents. rtk-ai claims 60-90%
token savings on matching commands.
```json
"hooks": {
  "pre_tool_use": {
    "rtk_rewrite_enabled": true,
    "rtk_min_version": "0.23.0"
  }
}
```
Requires the rtk binary (>= 0.23.0) on PATH. Install from
https://github.com/rtk-ai/rtk (Homebrew, curl installer, or download
the Windows zip). If rtk is missing or too old, the hook silently
passes the command through β safe to enable on partially-deployed
fleets. Name collision warning: there's an unrelated "Rust Type
Kit" crate also named rtk on crates.io β uninstall it first
(rm $(which rtk) if rtk --version shows 0.1.x without a
rewrite subcommand). Source:
claude_hooks/rtk_rewrite.py.
Safety interaction with rtk – when rtk produces a rewrite, the
hook emits permissionDecision: "allow", which bypasses the
prefix allow-list in ~/.claude/settings.json. To keep that safety
net, rtk_scan_rewrites: true (default) runs the scanner on
rtk-rewritten commands even when safety_scan_enabled: false:
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=false`, `rtk_scan_rewrites=true` (default): rtk rewrites `ls && rm -rf /tmp` → scanner catches `rm -rf` → "ask".
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=false`, `rtk_scan_rewrites=false`: same input → "allow" (user opted out of the safety net).
- `rtk_rewrite_enabled=true`, `safety_scan_enabled=true`: scanner runs on every Bash command, rewritten or not.
Other configurable features
HyDE query expansion (UserPromptSubmit recall)
HyDE (Hypothetical Document Embeddings) rewrites your prompt into a
hypothetical answer before vector search, which usually lands better in
"answer space" than the raw question. Enabled via
hooks.user_prompt_submit.hyde_enabled: true. Tunables:
| Key | Default | Purpose |
|---|---|---|
hyde_model | gemma4:e2b | Primary Ollama model |
hyde_fallback_model | gemma4:e4b | Fallback if primary fails |
hyde_url | http://localhost:11434/api/generate | Ollama endpoint |
hyde_timeout | 30.0 | Per-call timeout (seconds) |
hyde_max_tokens | 150 | Output length cap for the hypothetical answer |
hyde_keep_alive | "15m" | Ollama keep_alive – keeps the model resident between calls so cold-start doesn't hit every prompt
hyde_grounded | true | Two-phase grounded pipeline: query Qdrant raw first, then feed top hits to the LLM as grounding before generating the expansion. Prevents hallucinated domain terms poisoning the search. |
hyde_ground_k | 3 | How many raw hits to use as grounding context |
hyde_ground_max_chars | 1500 | Cap on the grounding context size |
If raw recall finds nothing (garbage query), grounded mode short-circuits and skips HyDE entirely – cheaper than an ungrounded hallucinated expansion.
Precedence with metadata_filter – when both are enabled, the
metadata filter applies first: each provider returns
`recall_k * over_fetch_factor` candidates, the filter trims by
cwd/type/age/tags, and only the survivors form HyDE's grounding pool.
So a too-narrow filter will silently disable grounded HyDE (zero raw
hits → no grounding → HyDE skipped). Loosen the filter before
suspecting HyDE quality. See docs/hyde.md for the
full pipeline.
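The two-phase grounded pipeline reads as a short function. A sketch under stated assumptions: `recall` and `generate` are stand-ins for the provider search and the Ollama call, and the prompt template is illustrative:

```python
def grounded_hyde(prompt: str, recall, generate,
                  ground_k: int = 3, max_chars: int = 1500) -> str:
    """Two-phase grounded HyDE sketch: raw recall first, then feed the
    top hits to the LLM as grounding before generating the expansion."""
    hits = recall(prompt)[:ground_k]        # phase 1: raw vector recall
    if not hits:
        return prompt                       # garbage query: skip HyDE
    grounding = "\n".join(hits)[:max_chars] # cap the grounding context
    hypothetical = generate(
        f"Context:\n{grounding}\n\n"
        f"Write a short hypothetical answer to: {prompt}"
    )
    return hypothetical or prompt           # degrade to the raw prompt
```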
Per-provider dedup_threshold
On Stop, providers that expose cosine similarity (qdrant, pgvector,
sqlite_vec) can skip storing a turn summary if an existing entry is
above the given cosine threshold. Set on the provider entry:
```json
"providers": {
  "qdrant": { "dedup_threshold": 0.85 }
}
```
0.0 disables (the default for most providers). 0.85 is a sensible
floor for "don't bother, we already have this." Higher = stricter.
The threshold is a cosine similarity (range 0.0β1.0, higher = more
similar), computed via the provider's own embedding model on the
truncated summary (first 500 chars). Don't confuse it with `1 - distance`
in some pgvector queries – claude-hooks normalises providers to
similarity-space internally so dedup_threshold always means "skip
storing if any existing memory has cosine ≥ this value."
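The gate itself is a one-line comparison once scores are in similarity-space. A sketch with illustrative names (not the real internal helpers):

```python
def should_store(existing_similarities: list[float],
                 dedup_threshold: float = 0.0) -> bool:
    """Skip storing if any existing memory's cosine similarity is at or
    above the threshold; 0.0 disables the check entirely."""
    if dedup_threshold <= 0.0:
        return True
    return max(existing_similarities, default=0.0) < dedup_threshold

def pgvector_to_similarity(cosine_distance: float) -> float:
    """pgvector's cosine operator returns *distance*; normalise to
    similarity-space before comparing against dedup_threshold."""
    return 1.0 - cosine_distance
```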
classify_observations and instincts extraction (Stop hook)
The Stop hook tags each stored memory with an observation_type
(fix, decision, preference, gotcha, general) so downstream
tooling can filter. Toggle with hooks.stop.classify_observations.
hooks.stop.extract_instincts (opt-in) additionally runs a lightweight
heuristic to pull persistent rules from the assistant's message and
write them to hooks.stop.instincts_dir (default ~/.claude/instincts)
as a sidecar you can promote to CLAUDE.md manually. Experimental.
summary_format: markdown vs XML
hooks.stop.summary_format controls the layout of stored memories:
- `"markdown"` (default) – backward-compatible plain-text bullet list. What every Qdrant corpus written before v0.5 contains.
- `"xml"` – structured `<observation>` block (port from thedotmack/claude-mem). Each field is addressable, so downstream recall can filter by type without prose parsing:

```xml
<observation ts="2026-04-29T12:34:56Z">
  <type>fix</type>
  <title>bcache make-bcache --wipe-bcache rebuild</title>
  <subtitle>/srv/dev-disk-by-label-opt/dev/claude-hooks</subtitle>
  <cwd>/srv/dev-disk-by-label-opt/dev/claude-hooks</cwd>
  <prompt>...truncated 600 chars...</prompt>
  <result>...truncated 1200 chars...</result>
  <files_modified>
    <file>/etc/fstab</file>
    <file>/etc/bcache.conf</file>
  </files_modified>
  <files_read>...</files_read>
  <commands>
    <command>make-bcache --wipe-bcache /dev/sda3</command>
  </commands>
</observation>
```

When summary_format: "xml" is on, the Stop hook also reads the `<type>` tag back to seed metadata.observation_type directly (skipping the heuristic classifier when the model has already declared it).
Mixing formats inside one corpus works but mucks up search ranking –
pick one per corpus and stick with it. The format the entry was
written with is recorded in metadata.summary_format so you can
filter or re-write later.
Claudemem auto-reindex
claudemem is a semantic
code-search tool with its own AST-aware index. Upstream ships a git
post-commit hook (claudemem hooks install), but that doesn't cover
uncommitted mid-session edits. This hook plugs the gap:
- Stop event: if the turn ran any Edit/Write/MultiEdit/NotebookEdit, spawn `claudemem index --quiet` detached so the hook adds no latency.
- SessionStart event: if the index is older than `staleness_minutes` AND any source file is newer than the index, reindex.
All triggers silently no-op if claudemem isn't on PATH or the project
has no .claudemem/ directory β safe to leave enabled on partially-
configured fleets.
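The SessionStart staleness check can be sketched as below. This is illustrative, not the shipped implementation: the index path, the behaviour at the scan cap, and the hardcoded ignore set are assumptions.

```python
import os
import time
from pathlib import Path

def index_is_stale(index_path: str, project_root: str,
                   staleness_minutes: int = 10,
                   max_files_to_scan: int = 2000) -> bool:
    """Reindex only when the index is past the cooldown AND some
    source file is newer than it."""
    index = Path(index_path)
    if not index.exists():
        return False              # no index: silently no-op
    index_mtime = index.stat().st_mtime
    if time.time() - index_mtime < staleness_minutes * 60:
        return False              # within cooldown
    scanned = 0
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Prune a few built-in ignored dirs (subset, for illustration).
        dirnames[:] = [d for d in dirnames
                       if d not in {".git", ".claudemem", "node_modules"}]
        for name in filenames:
            scanned += 1
            if scanned > max_files_to_scan:
                return False      # cap the walk on huge trees (assumption)
            if os.path.getmtime(os.path.join(dirpath, name)) > index_mtime:
                return True
    return False
```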
Config (hooks.claudemem_reindex):
| Key | Default | Purpose |
|---|---|---|
enabled | true | Master toggle |
check_on_stop | true | Reindex on turns that touched files |
check_on_session_start | true | Staleness check when a new session opens |
staleness_minutes | 10 | Cooldown – reindex at most every N min
max_files_to_scan | 2000 | Cap on the stale-scan walk (set higher for monorepos) |
ignored_dirs | [] | Extra dir names to skip (appended to built-in: .git, .claudemem, node_modules, .venv, __pycache__, .wolf, .caliber, dist, build, …)
lock_min_age_seconds | 60 | Cooldown on the .claudemem-reindex.lock file to prevent pile-ups |
For commit-time reindexing on every project, run:
```shell
claudemem hooks install   # in each git repo
```
Credits
The three optional hooks above are Python ports of the Bash hooks in rtfpessoa/code-factory:
- `stop_guard` – `hooks/stop-phrase-guard.sh`
- `safety_scan` – `hooks/command-safety-scanner.sh`
- `rtk_rewrite` – `hooks/rtk-rewrite.sh`
Design changes for claude-hooks: pure-Python implementation (no bash /
jq dependency), pattern lists surfaced as config, integration between
rtk_rewrite and safety_scan so rewrites are still scanned before
auto-approval. See
docs/PLAN-code-factory-integration.md
for the full integration plan.
Scripts
scripts/install-caliber-hook.sh – fast caliber pre-commit hook
Caliber's default pre-commit hook runs caliber refresh and
caliber learn finalize synchronously – each calls an LLM, and on
event-heavy sessions the combined wait can block git commit for
20 minutes or more. This installer replaces that hook with a
portable non-blocking version:
- backgrounds `caliber refresh` so commits return instantly (refreshed docs land in the next commit)
- drops `caliber learn finalize` (the SessionEnd Claude Code hook already runs the `--auto` version, and 240-event LLM passes on the commit path were hitting caliber's internal 600 s timeout)
- bounds the inner Claude CLI call at 60 s via `CALIBER_CLAUDE_CLI_TIMEOUT_MS`
- wraps the refresh in GNU `timeout 30` when available (skipped on Windows Git Bash, where `timeout.exe` is a `sleep` with different semantics)
Measured on a 240-event session: ~20 min → ~0.7 s per commit.
```shell
sh scripts/install-caliber-hook.sh        # install / update
sh scripts/install-caliber-hook.sh --dry  # preview
```
Existing .git/hooks/pre-commit is backed up to
.git/hooks/pre-commit.bak-<timestamp> before being replaced. Re-run
on every machine that clones the repo – git doesn't version-control
.git/hooks/ so the install can't be automatic.
scripts/openwolfstatus – OpenWolf dashboard status
Shows all registered OpenWolf projects, their dashboard/daemon port assignments, and PM2 process status. Warns if the PM2 state hasn't been saved (i.e. new daemons won't survive a reboot).
```shell
# Linux
./scripts/openwolfstatus.sh

# Windows
scripts\openwolfstatus.bat
```
PM2 auto-start on boot
OpenWolf daemons run under PM2. After starting or changing daemons, run
pm2 save to persist the process list. Then set up auto-start:
Linux (systemd):
```shell
pm2 startup   # generates and enables a systemd service (pm2-<user>)
pm2 save      # saves current process list for resurrection
```
This creates /etc/systemd/system/pm2-<user>.service which runs
pm2 resurrect on boot.
Windows:
```shell
npm install -g pm2-windows-startup
pm2-startup install   # adds a registry entry for auto-start on login
pm2 save              # saves current process list
```
This adds a PM2 entry under
HKCU\Software\Microsoft\Windows\CurrentVersion\Run that launches
pm2 resurrect at login.
Important: every time you add or remove an OpenWolf daemon, run
`pm2 save` again. Without it, the new daemon won't be restored after a reboot.
Tests
```shell
pip install -r requirements-dev.txt   # pytest + coverage
python -m pytest tests/ -q            # ~1.5k tests, mostly unit + a handful of integration
```
The exact pass count drifts as new modules ship – run `pytest --collect-only -q | tail -1` for the current total. Use `pytest -k <module>` to scope.
Branch coverage gate (target ≥ 80 %):
```shell
coverage run -m pytest tests/
coverage report
# Re-run `coverage report` for the current figure – the number
# drifts as new modules ship.
```
Test-file map
| File | Module under test | Tests | Notes |
|---|---|---|---|
tests/conftest.py | shared fixtures | – | fake_provider, base_config, fake_transcript, transcript_file, tmp_claude_home |
tests/mocks/ollama.py | Ollama HTTP stubs | – | mock_ollama_generate, mock_ollama_embeddings |
tests/test_fixtures.py | fixture smoke tests | 17 | Sanity-checks every fixture and mock |
tests/test_config.py | config.py | 6 | merge, paths, project-disabled marker |
tests/test_dedup.py | dedup.py | 11 | similarity, should_store, dedup w/ failing provider |
tests/test_decay.py | decay.py | 23 | hash, recency / frequency boost, prune, atomic state I/O |
tests/test_embedders.py | embedders.py | 14 | factory, Ollama / OpenAI clients, error paths |
tests/test_hyde.py | hyde.py | 18 | grounded expansion, fallback model, _format_context cap |
tests/test_recall.py | recall.py | 20 | full pipeline, dedup, OpenWolf injection, HyDE skip |
tests/test_handlers.py | hook handlers | 28 | UserPromptSubmit / SessionStart / SessionEnd / Stop store half |
tests/test_pre_tool_use_handler.py | hooks/pre_tool_use.py | 11 | safety + rtk integration |
tests/test_safety_scan.py | safety_scan.py | 17 | dangerous-command detection, allow-list |
tests/test_rtk_rewrite.py | rtk_rewrite.py | 12 | rewrite, version probe, opt-in policy |
tests/test_stop_guard.py | stop_guard.py | 23 | default patterns, meta-context escape, user-intent wrap-up escape |
tests/test_instincts.py | instincts.py | 13 | bug-fix detection, save / merge w/ frontmatter |
tests/test_reflect.py | reflect.py | 12 | guards, Ollama failure, dedup across providers, append idempotency |
tests/test_consolidate.py | consolidate.py | 16 | merge candidates, compress, state file, should_run cooldown |
tests/test_claudemem_reindex.py | claudemem_reindex.py | 15 | lock cooldown, staleness scan, async spawn |
tests/test_openwolf.py | openwolf.py | 9 | wolf-dir detection, anatomy / cerebrum read |
tests/test_dispatcher.py | dispatcher.py | 6 | event routing |
tests/test_detect.py | detect.py | 6 | MCP server discovery in ~/.claude.json |
tests/test_mcp_client.py | mcp_client.py | 6 | initialize β tools/call round-trip |
tests/test_providers.py | provider registry | 9 | Qdrant / Memory KG signatures |
tests/test_pgvector_integration.py | providers/pgvector.py | 8 (skipped w/o psycopg) | live Postgres |
tests/test_sqlite_vec_integration.py | providers/sqlite_vec.py | 8 (skipped w/o sqlite-vec) | live sqlite-vec |
tests/test_coverage_phase8.py | dispatcher / pre_tool_use / stop / providers / recall / safety_scan / claudemem_reindex | 120 | Phase 8 – error paths, edge cases, module-import failures (lifts coverage from 81 % → 92 %)
tests/test_proxy.py | claude_hooks/proxy/ (P0) | 17 | Pass-through, JSONL logging, Warmup + synthetic detection |
tests/test_proxy_p1.py | SSE tail + rate-limit state + weekly auto-populate | 22 | P1 observability half |
tests/test_proxy_p3.py | block_warmup short-circuit | 7 | Stub builders + upstream-not-called invariant |
tests/test_statusline_usage.py | scripts/statusline_usage.py | 16 | P4 segment rendering, stale detection, CLI safety |
tests/test_proxy_stats.py | scripts/proxy_stats.py | 9 | Aggregation, per-model, JSON output, since/until window |
tests/test_claude_mem_ports.py | ports 1-5 from thedotmack/claude-mem | 37 | XML summary, metadata filter, tag strip, composite hash, file-read gate |
Before merging: run `python -m pytest tests/` (0 failures) and `coverage report` (≥ 80 %). Both are part of the conda-env workflow documented at the top of this section.
Recommended Companion Tools
See COMPANION_TOOLS.md for detailed install instructions, descriptions, and importance rankings for tools that complement claude-hooks.
Documentation map
Runbooks (docs/):
- `deployment.md` – full install playbook (LAN-shared proxy, systemd, statusline, monitoring, uninstall)
- `env-vars.md` – curated reference of `CLAUDE_CODE_*` env vars the framework reads or writes
- `daemon.md` – long-lived hook executor: latency tiers, session lock, control protocol
- `proxy.md` – `claude_hooks/proxy/` runbook: schema, dashboard routes, Warmup detection, stop_phrase_guard config
- `pgvector-runbook.md` – Postgres + pgvector backend setup + system-wide MCP server registration
- `hyde.md` – HyDE query expansion configuration and tuning
- `caliber-proxy.md` – Caliber grounding proxy: native-tools agent, `survey_project`, `recall`
- `gemma4-tool-use-notes.md` – empirical notes on small grounding models for the caliber proxy
- `episodic-server.md` – HTTP front-end for obra/episodic-memory
- `lsp-engine.md` – LSP engine user guide
- `lsp-mcp.md` – cclsp MCP companion install and Linux/Windows config
- `RELEASING.md` – versioning, branch model, cut procedure, hotfix flow
Plans (docs/PLAN-*.md):
- `PLAN-lsp-engine.md` – locked design for the v0.7 LSP engine (Phases 0–4 shipped)
- `PLAN-stats-sqlite.md` – proxy SQLite rollup design (shipped)
- `PLAN-proxy-hook.md` – original proxy mode design (shipped)
- `PLAN-pgvector-migration.md` – Qdrant/Memory-KG → pgvector migration design
- `PLAN-test-coverage.md` – branch-coverage closure plan
- `PLAN-code-factory-integration.md` – code-factory integration design
Issue drafts (docs/, mirrored to upstream trackers):
- `issue-warmup-token-drain.md` – anthropics/claude-code Warmup detection / mitigation evidence
- `cc-xhigh-regression-issue.md` – anthropics/claude-code#55301 (xhigh quality regression filed 2026-05-01)
- `openwolf-managedby-issue.md` – cytostack/openwolf#31 / PR #32 (`_managedBy` tag)
- `doc-audit-2026-05-01.md` – this round's documentation gap report
Where the system listens
Default ports the framework introduces or expects. All bind to 127.0.0.1
by default; the proxy supports a LAN-listen mode for shared installs.
| Port | Service | Configurable as |
|---|---|---|
| 38080 | API proxy | proxy.listen_port |
| 38081 | Stats dashboard | proxy_dashboard.listen_port |
| 38090 | Caliber grounding proxy | caliber_proxy.listen_port |
| 11435 | Episodic-memory HTTP server | episodic_server.listen_port (see docs/episodic-server.md) |
| 11433 | (host-specific) Ollama upstream – used as CALIBER_GROUNDING_UPSTREAM default; override for your install | env / systemd drop-in |
License
Inspiration
- openwolf -- project-anatomy tracking
- claude-mem -- progressive disclosure
- vestige -- HyDE query expansion
- claude-cognitive -- attention decay
- everything-claude-code -- instincts
- claude-diary -- /reflect synthesis
- mnemex -- semantic code search
- caliber -- config drift detection
- episodic-memory -- transcript search
