Claude Persistent Memory
Give Claude Code long-term memory that persists across sessions.
Hybrid BM25 + vector semantic search · LLM-driven structuring · Multi-project isolation
English | 中文
Features • Quick Start • Architecture • MCP Tools • Configuration • Contributing
Features
Hybrid Search – BM25 full-text (FTS5) + vector semantic similarity (sqlite-vec), combined ranking (0.7 vector + 0.3 BM25, sketched after this list)
4-Channel Retrieval – Pull (MCP tools on demand) + Push (auto-inject via hooks on user prompt, pre-tool, post-tool)
LLM Structuring – Memories auto-structured into <what>/<when>/<do>/<warn> XML format via Azure OpenAI
Multi-Project Isolation – A single shared embedding server routes requests by dataDir. Each project has its own database; no cross-contamination.
Automatic Clustering – Similar memories are grouped; mature clusters are merged into high-confidence consolidated memories
Confidence Scoring – Memories gain or lose confidence through validation feedback and usage patterns
Local-First – All data is stored locally in SQLite. Your memories never leave your machine.
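The combined ranking is just a weighted sum of the two signals. A minimal sketch (illustration only, not the package's internal code; it assumes both scores are already normalized to [0, 1]):

```js
// Illustration of the hybrid ranking described above -- not the package's
// internal implementation. Assumes the vector similarity and the BM25 score
// have both been normalized to [0, 1] before mixing.
function hybridScore(vectorSimilarity, normalizedBm25) {
  return 0.7 * vectorSimilarity + 0.3 * normalizedBm25;
}

// A memory with strong semantic similarity but a weak keyword match still ranks well.
console.log(hybridScore(0.82, 0.10)); // 0.604
```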
Quick Start
Install
# Set Azure OpenAI credentials (required for LLM structuring)
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_KEY="your-api-key"
# Install in any project
npm install @alex900530/claude-persistent-memory
The postinstall script automatically:
- Generates `.claude-memory.config.js` (project config)
- Configures `.mcp.json` (MCP server registration)
- Configures `.claude/settings.json` (5 lifecycle hooks)
- Downloads and verifies the embedding model (bge-m3, ~2GB)
- Registers background services via launchd/systemd
- Updates `.gitignore`
Open Claude Code in the project directory – memory is ready.
Note: The embedding model (~2GB) is downloaded and verified during install. If the download is interrupted or the model is corrupt, the install will fail. Simply re-run `npm install` to retry.
Configure later
If you skipped Azure credentials during install:
npx claude-persistent-memory
Install from source
git clone https://github.com/MIMI180306/claude-persistent-memory.git
cd claude-persistent-memory
npm install
cp config.default.js config.js
# Edit config.js with your Azure credentials
# Start services
npm run embedding-server # Terminal 1
npm run llm-server # Terminal 2
Then manually configure `.mcp.json` and `.claude/settings.json` – see Configuration.
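For orientation, the two files have roughly the following shapes. Treat the entry name and paths as assumptions for a from-source checkout; during a normal npm install, bin/setup.js writes the real values:

```js
// Illustrative shapes only -- bin/setup.js writes the real entries during a
// normal npm install. The "memory" entry name and the relative paths below
// are assumptions; adjust them to your layout.

// .mcp.json -- registers the MCP server (pull channel)
const mcpJson = {
  mcpServers: {
    memory: {
      command: "node",
      args: ["./services/memory-mcp-server.js"],
    },
  },
};

// .claude/settings.json -- wires one of the five push-channel hooks
// (the installer registers all five listed under Hooks below)
const settingsJson = {
  hooks: {
    UserPromptSubmit: [
      { hooks: [{ type: "command", command: "node ./hooks/user-prompt-hook.js" }] },
    ],
  },
};
```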
Architecture
┌───────────────────────────────────────────────────────────────┐
│                      Claude Code Session                      │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  Pull Channel (on demand)   Push Channels (auto)              │
│  ┌─────────────────────┐    ┌───────────────────────────────┐ │
│  │ MCP Server          │    │ UserPromptSubmit Hook         │ │
│  │ memory_search       │    │ PreToolUse Hook               │ │
│  │ memory_save         │    │ PostToolUse Hook              │ │
│  │ memory_validate     │    │ PreCompact Hook (analysis)    │ │
│  │ memory_stats        │    │ SessionEnd Hook (clustering)  │ │
│  └──────────┬──────────┘    └───────────────┬───────────────┘ │
│             │                               │                 │
│             └───────────────┬───────────────┘                 │
│                             │  dataDir routing                │
│                             ▼                                 │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │          Shared Embedding Server (TCP :23811)           │  │
│  │          bge-m3 model (shared across projects)          │  │
│  │          Database pool (per-project by dataDir)         │  │
│  └─────────────────────────────────────────────────────────┘  │
│                               │                               │
│             ┌─────────────────┼─────────────────┐             │
│             ▼                 ▼                 ▼             │
│       ┌───────────┐     ┌───────────┐     ┌───────────┐       │
│       │ Project A │     │ Project B │     │ Project C │       │
│       │ memory.db │     │ memory.db │     │ memory.db │       │
│       └───────────┘     └───────────┘     └───────────┘       │
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │                 LLM Server (TCP :23812)                 │  │
│  │                 Azure OpenAI GPT-4.1                    │  │
│  └─────────────────────────────────────────────────────────┘  │
│                                                               │
└───────────────────────────────────────────────────────────────┘
Multi-Project Support
The embedding server is shared across all projects. Each request carries a dataDir parameter that routes to the correct project's database:
- Embedding model – loaded once, shared across all projects (~2GB RAM)
- Database connections – pooled per `dataDir`, created on first access (~5ms)
- No cross-contamination – searching in Project A never returns Project B's memories
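Conceptually, every request to the shared server is a small payload that carries the caller's dataDir, and the server resolves that to a pooled database connection. A hypothetical sketch of the round trip (the real client lives in lib/embedding-client.js; the JSON wire format shown here is an assumption):

```js
// Hypothetical illustration of dataDir routing -- the actual wire protocol is
// implemented by lib/embedding-client.js and services/embedding-server.js.
const net = require("node:net");

function embed(text, dataDir) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection({ port: 23811 }, () => {
      // dataDir tells the shared server which project's database (and pooled
      // connection) this request belongs to.
      socket.write(JSON.stringify({ action: "embed", text, dataDir }) + "\n");
    });
    let buffer = "";
    socket.on("data", (chunk) => (buffer += chunk));
    socket.on("end", () => resolve(JSON.parse(buffer)));
    socket.on("error", reject);
  });
}

// Two projects can share the same server process without mixing data.
embed("refresh token rotation", "/work/project-a/.claude-memory").then(console.log);
```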
MCP Tools
| Tool | Description |
|---|---|
| memory_search | Hybrid BM25 + vector search. Params: query, limit?, type?, domain? |
| memory_save | Save a new memory. Params: content, type?, domain?, confidence? |
| memory_validate | Feedback loop – helpful (+0.1) or unhelpful (-0.05). Params: memory_id, is_valid |
| memory_stats | System stats: total memories, type/domain distribution, cluster status |
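To exercise the tools outside Claude Code, any MCP client can drive the stdio server directly. A rough smoke-test sketch using the MCP TypeScript SDK (the server path and the exact argument shapes are assumptions based on the table above):

```js
// smoke-test.mjs -- rough sketch; assumes @modelcontextprotocol/sdk is installed
// and that the server path below points at your installed copy.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "node",
  args: ["node_modules/@alex900530/claude-persistent-memory/services/memory-mcp-server.js"],
});
const client = new Client({ name: "memory-smoke-test", version: "0.0.1" }, { capabilities: {} });
await client.connect(transport);

// Save a memory, then search for it.
await client.callTool({
  name: "memory_save",
  arguments: { content: "We rotate refresh tokens every 24h", type: "decision" },
});
const hits = await client.callTool({
  name: "memory_search",
  arguments: { query: "token rotation policy", limit: 3 },
});
console.log(hits);
await client.close();
```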
Hooks
| Hook | Event | Timeout | What it does |
|---|---|---|---|
| user-prompt-hook.js | UserPromptSubmit | 1500ms | Embeds the user query, searches, injects top memories via stdout |
| pre-tool-memory-hook.js | PreToolUse | 300ms | Embeds the tool context, searches, injects via additionalContext |
| post-tool-memory-hook.js | PostToolUse | 300ms | Embeds the tool context + result, searches, injects via additionalContext |
| pre-compact-hook.js | PreCompact | async | Spawns LLM analysis of the full transcript, extracts memories |
| session-end-hook.js | SessionEnd | async | Incremental transcript analysis + clustering + mature-cluster merging |
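The three fast push hooks all follow the same shape: read the hook event as JSON from stdin, search memory, and emit context before the timeout expires. A stripped-down sketch of that flow (not the shipped hooks/user-prompt-hook.js; searchMemories is a stub standing in for the embedding-server round trip):

```js
#!/usr/bin/env node
// Stripped-down sketch of a UserPromptSubmit-style hook -- not the shipped
// hooks/user-prompt-hook.js. searchMemories() is a stub standing in for the
// embedding + hybrid-search round trip described above.
async function searchMemories(query, limit) {
  return []; // the real hook embeds the query and searches the project DB
}

let input = "";
process.stdin.on("data", (chunk) => (input += chunk));
process.stdin.on("end", async () => {
  const event = JSON.parse(input || "{}");
  const memories = await searchMemories(event.prompt || "", 3);
  if (memories.length === 0) process.exit(0);
  // For UserPromptSubmit, stdout from a hook that exits 0 is injected into context.
  console.log("Relevant memories:\n" + memories.map((m) => "- " + m.content).join("\n"));
});
```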
Memory Types
| Type | Use case |
|---|---|
| fact | Stable facts about the codebase |
| decision | Architectural decisions and rationale |
| bug | Bug fixes and root causes |
| pattern | Recurring code patterns |
| context | Session-specific context |
| preference | User workflow preferences |
| skill | Promoted from mature clusters |
Memory Lifecycle
1. Save – `memory_save` or auto-extract from transcript
2. Structure – LLM converts to <what>/<when>/<do>/<warn> XML
3. Embed – bge-m3 generates a 1024-dim vector
4. Dedupe – Jaccard similarity >= 0.95 → update existing (see the sketch after this list)
5. Search – 0.7 * vectorSimilarity + 0.3 * normalizedBM25
6. Validate – `memory_validate` adjusts confidence ±
7. Cluster – similar memories auto-grouped
8. Merge – mature clusters consolidated into a single memory
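Two of these steps are easy to picture concretely: the dedupe check and the validation feedback. A sketch using the thresholds above (the tokenization and clamping details are assumptions):

```js
// Illustration of the Dedupe and Validate steps -- thresholds come from the
// lifecycle above; tokenization and clamping details are assumptions.
function jaccard(a, b) {
  const A = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const intersection = [...A].filter((t) => B.has(t)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : intersection / union;
}

// Dedupe: a new memory that is >= 0.95 similar updates the existing one.
const isDuplicate =
  jaccard("use pnpm for installs", "use pnpm for installs in CI") >= 0.95;
// false (~0.67): similar, but not a near-duplicate

// Validate: feedback nudges confidence up (+0.1) or down (-0.05).
function applyFeedback(confidence, isValid) {
  const next = confidence + (isValid ? 0.1 : -0.05);
  return Math.min(1, Math.max(0, next)); // keep confidence in [0, 1]
}
```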
Uninstall
npx claude-persistent-memory-uninstall
Or manually: remove memory from `.mcp.json`, remove the memory hooks from `.claude/settings.json`, then `npm uninstall @alex900530/claude-persistent-memory`. The `.claude-memory/` data directory is preserved – delete it manually if no longer needed.
Configuration
All settings live in `config.default.js` (override via `.claude-memory.config.js`):
module.exports = {
embeddingPort: 23811, // TCP port for embedding server
llmPort: 23812, // TCP port for LLM server
dataDir: './data', // memory.db location (per-project)
azure: {
endpoint: process.env.AZURE_OPENAI_ENDPOINT,
apiKey: process.env.AZURE_OPENAI_KEY,
deployment: 'gpt-4-1',
},
embedding: {
model: 'Xenova/bge-m3', // 1024 dimensions, 8192 token context
dimensions: 1024,
},
search: {
maxResults: 3, // top-K results per query
minSimilarity: 0.6, // vector similarity threshold
},
cluster: {
similarityThreshold: 0.70, // min similarity to join a cluster
maturityCount: 5, // memories needed for mature cluster
},
};
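A per-project `.claude-memory.config.js` only needs the fields you want to change. An illustrative override (the values are examples, and it assumes nested sections are merged with the defaults):

```js
// .claude-memory.config.js -- per-project override, generated by postinstall and
// safe to edit. Values here are illustrative; anything omitted is assumed to
// fall back to config.default.js.
module.exports = {
  dataDir: './.claude-memory',   // keep this project's memory.db in-repo (gitignored)
  search: {
    maxResults: 5,               // inject a few more memories per query
    minSimilarity: 0.65,
  },
};
```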
Project Structure
claude-persistent-memory/
├── bin/
│   ├── setup.js                  # postinstall + interactive setup
│   └── uninstall.js              # cleanup script
├── hooks/
│   ├── user-prompt-hook.js       # UserPromptSubmit → memory injection
│   ├── pre-tool-memory-hook.js   # PreToolUse → memory injection
│   ├── post-tool-memory-hook.js  # PostToolUse → memory injection
│   ├── pre-compact-hook.js       # PreCompact → transcript analysis
│   └── session-end-hook.js       # SessionEnd → clustering + merging
├── lib/
│   ├── memory-db.js              # SQLite + FTS5 + sqlite-vec + connection pool
│   ├── embedding-client.js       # TCP client for embedding server
│   ├── llm-client.js             # TCP client for LLM server
│   ├── compact-analyzer.js       # Transcript → memory extraction
│   └── utils.js
├── services/
│   ├── embedding-server.js       # Shared embedding service (bge-m3)
│   ├── llm-server.js             # LLM proxy (Azure OpenAI)
│   └── memory-mcp-server.js      # MCP server (stdio, per-project)
├── config.default.js
└── package.json
Requirements
- Node.js >= 18
- macOS or Linux
- ~2GB RAM for embedding model (bge-m3)
- ~2GB disk for the model cache (`~/.cache/huggingface/transformers-js/`)
- Azure OpenAI API access (for LLM structuring)
Notes
- LLM provider: Currently supports Azure OpenAI only. Modify `services/llm-server.js` for other providers.
- Ports: The embedding and LLM servers default to TCP 23811 / 23812. Change them in the config if they conflict.
- Multi-project: All projects share one embedding server process. The model is loaded once; databases are pooled by `dataDir`.
- Data: The `.claude-memory/` directory (containing `memory.db` and logs) is auto-created and gitignored per project.
Contributing
Contributions welcome! Please read the Contributing Guide before submitting a PR.
