Token Savior
Token-efficient structural codebase MCP server for AI-assisted development
Ask AI about Token Savior
Powered by Claude Β· Grounded in docs
I know everything about Token Savior. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
β‘ Token Savior Recall
The MCP server that turns Claude into the only coding agent hitting 100% on a real benchmark. Structural code navigation + persistent memory. β77% active tokens. β76% wall time. Zero losses.
π mibayy.github.io/token-savior β project site + benchmark landing π§ͺ github.com/Mibayy/tsbench β benchmark source + fixtures
Benchmark β 96 real coding tasks (tiny+v2 default)
| Plain Claude Code | With Token Savior | |
|---|---|---|
| Score | 141 / 180 (78.3%) | 192 / 192 (100.0%) |
| Active tokens / task | 17 221 | 3 929 (β77%) |
| Wall time / task | 110.6 s | 26.6 s (β76%) |
| Wins / Ties / Losses | β | 25 / 65 / 0 (90 paired) |
Perfect (100%) across all 11 categories: audit, bug_fixing,
code_generation, code_review, config_infra, data_analysis,
documentation, explanation, git, navigation, refactoring,
writing_tests. Zero losses against plain Claude β every task is a
win or a tie.
The default config β TS_PROFILE=tiny_plus (15 tools, ~2.5 KT manifest)
TS_CAPTURE_DISABLED=1+ the v2 system prompt that bansAgentsub-agent delegation β reproduces 100% on Opus 4.7 with β54% active tokens vs the legacyleanprofile.
Also validated on Sonnet 4.6 (ts 170/180 = 94.4% vs base 156/180 = 86.7%).
Model: Claude Opus 4.7 Β· Methodology + per-task breakdown: mibayy.github.io/token-savior.
What it does
Claude Code reads whole files to answer questions about three lines, and forgets
everything the moment a session ends. Token Savior Recall fixes both. It
indexes your codebase by symbol β functions, classes, imports, call graph β so
the model navigates by pointer instead of by cat. Measured reduction: 97%
fewer chars injected across 170+ real sessions.
On top of that sits a persistent memory engine. Every decision, bugfix, convention, guardrail and session rollup is stored in SQLite WAL + FTS5 + vector embeddings, ranked by Bayesian validity and ROI, and re-injected as a compact delta at the start of the next session. Contradictions are detected at save time; observations decay with explicit TTLs; a 3-layer progressive-disclosure contract keeps lookup cost bounded.
Token savings
| Operation | Plain Claude | Token Savior | Reduction |
|---|---|---|---|
find_symbol("send_message") | 41M chars (full read) | 67 chars | β99.9% |
get_function_source("compile") | grep + cat chain | 4.5K chars | direct |
get_change_impact("LLMClient") | impossible | 16K chars (154 direct + 492 transitive) | new capability |
get_backward_slice(var, line) | 130 lines | 12 lines | β92% |
memory_index (Layer 1) | n/a | ~15 tokens/result | Layer 1 shortlist |
| 90-task tsbench (Opus baseβts) | 17.2 KT active/task | 3.9 KT active/task | β77% |
| tsbench score (Opus, 96 tasks) | 141/180 (78.3%) | 192/192 (100.0%) | +22 pts |
Full benchmark methodology and per-task results: tsbench.
Memory engine
| Capability | How it works |
|---|---|
| Storage | SQLite WAL + FTS5 + sqlite-vec (optional), 12 observation types |
| Hybrid search | BM25 + vector (all-MiniLM-L6-v2, 384d) fused via RRF, FTS fallback graceful |
| Progressive disclosure | 3-layer contract: memory_index β memory_search β memory_get |
| Citation URIs | ts://obs/{id} β reusable across layers, agent-native pointers |
| Bayesian validity | Each obs carries a validity prior + update rule; stale obs are surfaced, not silently trusted |
| Contradiction detection | Triggered at save time against existing index; flagged in hook output |
| Decay + TTL | Per-type TTL (command 60d, research 90d, note 60d) + LRU scoring 0.4Β·recency + 0.3Β·access + 0.3Β·type |
| Symbol staleness | Obs linked to symbols are invalidated when the symbol's content hash changes |
| ROI tracking | Access count Γ context weight β unused obs age out, high-ROI obs are promoted |
| MDL distillation | Minimum Description Length grouping compresses redundant observations into conventions |
| Auto-promotion | note Γ5 accesses β convention; warning Γ5 β guardrail |
| Hooks | 8 Claude Code lifecycle hooks (SessionStart/Stop/End, PreCompact, PreToolUse Γ2, UserPromptSubmit, PostToolUse) |
| Web viewer | 127.0.0.1:$TS_VIEWER_PORT β htmx + SSE, opt-in |
| LLM auto-extraction | Opt-in TS_AUTO_EXTRACT=1 β PostToolUse tool uses extracted into 0-3 observations via small-model call |
vs claude-mem
Two projects share the goal β persistent memory for Claude Code. The axes below are measured, not marketing.
| Axis | claude-mem | Token Savior Recall |
|---|---|---|
| Bayesian validity | no | yes |
| Contradiction detection at save | no | yes |
| Per-type decay + TTL | no | yes |
| Symbol staleness (content-hash linked obs) | no | yes |
| ROI tracking + auto-promotion | no | yes |
| MDL distillation into conventions | no | yes |
| Code graph / AST navigation | no | yes (90 tools, cross-language) |
| Progressive disclosure contract | no | yes (3 layers, ~15/60/200 tokens) |
| Hybrid FTS + vector search (RRF) | no | yes |
Token Savior Recall is a superset: it ships the memory engine plus the structural codebase server that gave the project its name.
Install
uvx (no venv, no clone)
uvx token-savior-recall
pip
pip install "token-savior-recall[mcp]"
# Optional hybrid vector search:
pip install "token-savior-recall[mcp,memory-vector]"
Claude Code one-liner
claude mcp add token-savior -- /path/to/venv/bin/token-savior
Development
git clone https://github.com/Mibayy/token-savior
cd token-savior
python3 -m venv .venv
.venv/bin/pip install -e ".[mcp,dev]"
pytest tests/ -q
Configure
{
"mcpServers": {
"token-savior-recall": {
"command": "/path/to/venv/bin/token-savior",
"env": {
"WORKSPACE_ROOTS": "/path/to/project1,/path/to/project2",
"TOKEN_SAVIOR_CLIENT": "claude-code"
}
}
}
}
Optional env: TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID (critical-observation
feed), TS_VIEWER_PORT (web viewer), TS_AUTO_EXTRACT=1 + TS_API_KEY
(LLM auto-extraction), TOKEN_SAVIOR_PROFILE (full / core / nav / lean /
ultra β filters advertised tool set to shrink the per-turn MCP manifest).
Tools (90)
Category counts β full catalog is served via MCP tools/list.
| Category | Count |
|---|---|
| Core navigation | 14 |
| Dependencies & graph | 9 |
| Git & diffs | 5 |
| Safe editing | 8 |
| Checkpoints | 6 |
| Test & run | 6 |
| Config & quality | 8 |
| Docker & multi-project | 2 |
| Advanced context (slicing, packing, RWR, prefetch, verify) | 6 |
| Memory engine | 21 |
| Reasoning (plan/decision traces) | 3 |
| Stats, budget, health | 10 |
| Project management | 7 |
Profiles
TOKEN_SAVIOR_PROFILE filters the advertised tools/list payload while
keeping handlers live.
| Profile | Advertised | ~Tokens | Use case |
|---|---|---|---|
full (default) | 67 | ~8 770 | All capabilities |
core | 54 | ~5 800 | Daily coding, no memory engine |
nav | 28 | ~3 100 | Read-only exploration |
lean | 52 | ~6 940 | Memory engine off β used in tsbench v2 |
ultra | 31 | ~4 250 | Hot tools + ts_extended meta-tool |
tiny (new) | 6 | ~1 070 | Defer-loading via ts_search (Tool Attention pattern) |
Bench-mode env vars
For benchmark / cold-start workloads where memory and capture sandboxing add no value, pair the profile with these env vars:
export TOKEN_SAVIOR_PROFILE=lean # or 'tiny' for max trim
export TS_MEMORY_DISABLE=1 # hide memory_* (-300 t)
export TS_CAPTURE_DISABLED=1 # hide capture_*, skip PostToolUse hook
export TS_HOOK_MINIMAL=1 # SessionStart emits Memory Index only
export TS_NO_HINTS=1 # drop _hints / _suggestion (~30-50 t/call)
Measured on tsbench (90 tasks, Claude Opus 4.7):
| Configuration | Active tokens / task | Score |
|---|---|---|
| Plain agent (Read/Grep/Bash, no Token Savior) | 17 221 | 78.3 % |
lean profile (default since v2.9) | 8 928 | 100.0 % |
lean + the 5 env vars above | ~5 500 | 100.0 % |
Defer-loading via ts_search
When the manifest budget is the bottleneck, the new tiny profile
exposes only 6 tools (switch_project, find_symbol,
get_function_source, get_full_context, search_codebase,
ts_search). Other ~60 tools are reachable just-in-time via:
ts_search(query="find dependents of update_user", top_k=5)
# β {"matched_tools": [{"name": "get_dependents", "score": 0.68, ...}, ...]}
Embeddings (Nomic 768d) score every tool description against the query; top-K candidates come back with their full inputSchema so the next turn can call them directly. Mirrors the Tool Attention paper (47.3k β 2.4k tokens / turn at 120 tools, β95 % prefix).
Anthropic API users β pair with native context management
For long agent loops, combine Token Savior with Anthropic's native context primitives (Claude API β₯ 2025-09-19):
client = anthropic.Anthropic(default_headers={
"anthropic-beta": "context-management-2025-06-27,clear-tool-uses-2025-09-19",
})
resp = client.messages.create(
model="claude-opus-4-7",
context_management={"edits": [{
"type": "clear_tool_uses_20250919",
"trigger": {"type": "input_tokens", "value": 30_000},
"keep": {"type": "tool_uses", "value": 4},
"exclude_tools": ["replace_symbol_source", "edit_lines_in_symbol"],
}]},
tools=[...],
messages=[...],
)
Anthropic's cookbook measures β48 % peak context with clearing alone on long agent loops.
Progressive disclosure β memory search
Three layers, increasing cost. Always start at Layer 1. Escalate only if the previous layer paid off. Full contract: docs/progressive-disclosure.md.
| Layer | Tool | Tokens/result | When |
|---|---|---|---|
| 1 | memory_index | ~15 | Always first |
| 2 | memory_search | ~60 | If Layer 1 matched |
| 3 | memory_get | ~200 | If Layer 2 confirmed |
Each Layer 1 row ends with [ts://obs/{id}] β pass it straight to Layer 3.
Links
- Site β https://mibayy.github.io/token-savior/
- Repo β https://github.com/Mibayy/token-savior
- PyPI β https://pypi.org/project/token-savior-recall/
- Benchmark β https://github.com/Mibayy/tsbench
- Changelog β CHANGELOG.md
- Progressive disclosure β docs/progressive-disclosure.md
License
MIT β see LICENSE.
Works with any MCP-compatible AI coding tool. Claude Code Β· Cursor Β· Codex CLI Β· Antigravity Β· Cline Β· Continue Β· Windsurf Β· Aider Β· Gemini CLI Β· Copilot CLI Β· Zed Β· any custom MCP client
