Memoirs
Local-first long-term memory engine for AI agents · MCP + HTTP + CLI · SQLite + sqlite-vec + FTS5 · 100% local, no cloud
memoirs
memoirs /ˈmɛmwɑrz/ · MEM-wahrz · from French mémoires (memories): the durable written record an agent keeps of what it has learned.
Local-first long-term memory for AI agents.
memoirs gives every AI agent on your machine a single, durable memory layer that survives across sessions, IDEs, and even models. It ingests your transcripts, extracts the durable signal (preferences, decisions, projects, tool-calls), curates them with a local LLM, and serves the right ~1K-token context the moment your agent asks.
# In your agent
context = mcp.call("mcp_get_context", {"query": "what's our auth strategy?"})
# → ranked memories, conflict-resolved, in 6.2 ms p50
No cloud. No API keys leaving the box. SQLite + a 2 GB GGUF model: it runs on a laptop.
Why memoirs
| memoirs | |
|---|---|
| 100% local | No data egress. No SaaS. The DB is one file you can cp, rsync, or encrypt with SQLCipher. |
| Universal ingest | Claude Code transcripts, Cursor state.vscdb, ChatGPT zip, Claude.ai export, JSONL, Markdown. One command. |
| Native MCP | 22 tools served over stdio. Drop into Claude Desktop, Claude Code, Cursor, Continue, Cline – pick one or all. |
| Hybrid retrieval | BM25 + dense + RRF, with optional graph multi-hop (HippoRAG-style PPR) and RAPTOR hierarchical summaries. |
| Temporal & explainable | Bi-temporal validity, as_of=t time-travel queries, full provenance chain (memory → candidate → message → conversation → source). |
| LLM curator built-in | Qwen 2.5 3B local for extraction, consolidation, conflict resolution. Auto-detects available models, falls back to heuristics. |
| Auto-prune & forget | Ebbinghaus decay, sleep-time consolidation cron, 4-mode Zettelkasten linking, EXPIRE/ARCHIVE generation. |
| Privacy-first | PII redaction (Presidio + secret scanning), per-memory ACL with visibility tiers, GDPR export/import portable bundles, encryption-at-rest (SQLCipher). |
Install
memoirs ships three install paths. Pick one.
1. PyPI (recommended for users)
# minimal: SQLite + FTS5 + heuristic curator
pip install memoirs
# everything except the local LLM (lighter, ~150 MB)
pip install 'memoirs[all]'
# everything + Qwen 3 4B GGUF curator + bge-reranker (~2 GB on first run)
pip install 'memoirs[full]'
memoirs setup # one-shot: download GGUF, init DB, wire MCP into your IDEs
memoirs setup is idempotent: it auto-detects which curator backends you already have, downloads the missing ones via huggingface-hub, then writes MCP config snippets for Claude Code / Cursor / Continue / Cline.
2. From source
git clone https://github.com/misaelzapata/memoirs && cd memoirs
python3 -m venv .venv && source .venv/bin/activate
pip install -e '.[full,dev]' # editable + tests + curator + reranker
memoirs setup
3. Docker compose (zero-Python install)
git clone https://github.com/misaelzapata/memoirs && cd memoirs
docker compose up -d
# → API on :8283, MCP via docker exec, persistent volume in ./data
Auto-start on login (Linux, optional)
bash scripts/install_systemd_user.sh # creates user-level systemd unit
# memoirs now auto-starts on login. status: systemctl --user status memoirs-api
Verify
memoirs status
# → 12 migrations applied · curator=qwen3 · embedding=sentence-transformers/all-MiniLM-L6-v2
memoirs ingest ~/.claude/projects # pull past chats
memoirs ask "what did we decide about auth?"
That's it: the MCP server is wired into your IDEs.
As an MCP server
// .vscode/mcp.json (or ~/.codex/mcp.json, claude_desktop_config.json, ...)
{
  "servers": {
    "memoirs": {
      "type": "stdio",
      "command": "/path/to/.venv/bin/python3",
      "args": ["-m", "memoirs", "mcp"]
    }
  }
}
As a Python library
from memoirs.db import MemoirsDB
from memoirs.engine.memory_engine import assemble_context
db = MemoirsDB("memoirs.sqlite")
ctx = assemble_context(db, "API rate-limit policy", top_k=10)
print(ctx["context"]) # list of compact lines
print(ctx["token_estimate"]) # ~600
As an HTTP API + web dashboard
memoirs serve --port 8283
# REST: http://localhost:8283/docs
# UI: http://localhost:8283
The web UI surfaces every feature live: memory search, timeline, entity graph, conflicts, point-in-time snapshots with side-by-side diff, procedural-memory inspector, and provenance/attribution audit.
Performance & quality
Internal head-to-head bench (synthetic, 6 engines, 20 queries)
scripts/bench_vs_others.py · artifact: bench_results/bench_others_v11_prf.json
| engine | MRR | Hit@1 | Hit@5 | R@10 | p50 ms | p95 ms | RAM MB |
|---|---|---|---|---|---|---|---|
| memoirs | 1.00 | 1.00 | 1.00 | 1.00 | 21 | 33 | 231 |
| cognee | 0.97 | 0.95 | 1.00 | 1.00 | 974 | 1316 | 1643 |
| mem0 | 0.93 | 0.90 | 1.00 | 1.23* | 533 | 1465 | 865 |
| memori | 0.93 | 0.85 | 1.00 | 1.00 | 339 | 462 | 1841 |
| langmem | 0.90 | 0.80 | 1.00 | 1.00 | 350 | 616 | 1846 |
| llamaindex | 0.90 | 0.80 | 1.00 | 1.00 | 351 | 627 | 1855 |
* mem0's R@10 > 1.00 is due to duplicate gold IDs in their response shape; counted as published.
memoirs leads on every retrieval metric and is 16-46× faster than cloud rivals (no network round trip, no LLM in the hot path). Temporal queries, where memoirs uses valid_from/valid_to bi-temporal validity, sweep at MRR 1.00 vs 0.50-0.88 for everyone else.
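The bi-temporal predicate behind those temporal wins is simple enough to sketch against plain sqlite3 (illustrative schema and data, not memoirs' actual DDL):

```python
import sqlite3

# Illustrative bi-temporal table: a row is valid from valid_from until it is
# superseded (valid_to set); NULL valid_to means "still current".
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE memories (
    id INTEGER PRIMARY KEY, content TEXT,
    valid_from TEXT NOT NULL, valid_to TEXT)""")
con.executemany(
    "INSERT INTO memories (content, valid_from, valid_to) VALUES (?, ?, ?)",
    [("auth = session cookies", "2024-01-01", "2024-06-01"),
     ("auth = OAuth2 + PKCE",   "2024-06-01", None)])

def as_of(t: str) -> list[str]:
    # A memory is visible at time t if t falls inside [valid_from, valid_to).
    rows = con.execute(
        """SELECT content FROM memories
           WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)""",
        (t, t)).fetchall()
    return [r[0] for r in rows]

print(as_of("2024-03-15"))  # ['auth = session cookies']
print(as_of("2024-07-01"))  # ['auth = OAuth2 + PKCE']
```

Because the old row is closed rather than deleted, an `as_of` query reconstructs what the corpus believed at any past timestamp.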
Real-world bench: LoCoMo (Snap Research, 1982 QA pairs across 10 long conversations)
scripts/eval_locomo.py · artifact: bench_results/locomo_full_v1.json · runtime: 873 s
| category | n | MRR | H@1 | H@5 | R@10 | p50 ms |
|---|---|---|---|---|---|---|
| multi-hop | 321 | 0.335 | 0.209 | 0.517 | 0.651 | 202 |
| temporal | 841 | 0.333 | 0.200 | 0.517 | 0.702 | 182 |
| adversarial | 446 | 0.213 | 0.085 | 0.341 | 0.659 | 140 |
| single-hop | 282 | 0.217 | 0.103 | 0.383 | 0.288 | 209 |
| open-domain | 92 | 0.163 | 0.109 | 0.207 | 0.269 | 223 |
| total | 1982 | 0.282 | 0.157 | 0.444 | 0.605 | 182 |
With bge-reranker (MEMOIRS_RERANKER_BACKEND=bge, MEMOIRS_PRF=off) on a 3-conv subset (495 queries, locomo_3_rk.json):
| category | baseline MRR | + reranker | uplift |
|---|---|---|---|
| single-hop | 0.182 | 0.494 | +171% |
| multi-hop | 0.312 | 0.760 | +144% |
| temporal | 0.298 | 0.558 | +87% |
| adversarial | 0.224 | 0.422 | +89% |
| TOTAL | 0.260 | 0.541 | +108% |
Cost: ~5 s/query (bge-reranker is a CPU cross-encoder). Trade-off: reranker off = 65× faster, reranker on = 2× quality.
LongMemEval (xiaowu0162, oracle split, ICLR 2025)
scripts/bench_vs_others.py --longmemeval
| variant | n | MRR | Hit@1 | Hit@5 | R@10 | p50 ms |
|---|---|---|---|---|---|---|
| baseline (PRF on) | 100 | 0.32 | 0.19 | 0.49 | 0.46 | 167 |
| + bge-reranker | 100 | 0.38 | 0.25 | 0.56 | 0.49 | 23,000 |
Engine internals (warm cache, single thread, 4,061-memory production DB)
| primitive | p50 | p95 | notes |
|---|---|---|---|
| bm25 (FTS5) | 2.7 ms | 63 ms | lexical only |
| dense (sqlite-vec) | 12.8 ms | 4.2 s* | dense only |
| hybrid (BM25 + dense + RRF) | 3.9 ms | 10 s* | default |
| hybrid + PRF (multi-hop bridge) | 6.5 ms | 12 s* | MEMOIRS_PRF=on |
| graph (entity PPR) | 2.1 ms | 333 ms | rarely needed |
| embed_text cached | 0.005 ms | – | LRU, ~2,000× speedup |
| migration cold start (1→12) | <200 ms | – | 2 GB DB |
| process cold start (+ Qwen + ST) | 5.7 s | – | one-time |
* p95 outliers come from the cold-embed path; default hot path stays sub-30 ms.
Curator quality (LLM JSON adherence, 20-prompt benchmark)
The local curator emits valid JSON on every contradiction-resolution prompt with Qwen3-4B. Gemma 2B, the previous default, failed all 20, which is why Qwen is now the default backend.
Reproducing the benchmarks yourself
# 1. Internal head-to-head (fastest: 20 synthetic queries × 6 engines, ~3 min)
MEMOIRS_PRF=on python scripts/bench_vs_others.py \
--engines memoirs,mem0,cognee,memori,langmem,llamaindex \
--top-k 10 \
--out bench_results/bench_others_v11.json \
--md-out bench_results/bench_others_v11.md
# 2. LongMemEval (download the oracle split first, ~15 MB)
mkdir -p ~/datasets/longmemeval
curl -L https://github.com/xiaowu0162/LongMemEval/releases/download/v1.0/longmemeval_oracle.json \
-o ~/datasets/longmemeval/longmemeval_oracle.json
MEMOIRS_PRF=on python scripts/bench_vs_others.py \
--engines memoirs --top-k 10 --longmemeval --longmemeval-limit 100 \
--out bench_results/longmemeval_100.json --md-out bench_results/longmemeval_100.md
# 3. LoCoMo (10 long conversations, 1982 QA pairs)
mkdir -p ~/datasets
curl -L https://raw.githubusercontent.com/snap-research/locomo/main/data/locomo10.json \
-o ~/datasets/locomo10.json
MEMOIRS_PRF=on python scripts/eval_locomo.py \
--locomo ~/datasets/locomo10.json \
--top-k 10 --out bench_results/locomo_full.json
# 4. Same LoCoMo, but with bge-reranker on (slower but +108% MRR)
MEMOIRS_RERANKER_BACKEND=bge python scripts/eval_locomo.py \
--locomo ~/datasets/locomo10.json \
--top-k 10 --out bench_results/locomo_reranked.json
Catalog of all 24 external benchmarks (with download recipes + license + estimated effort): docs/external_benchmarks_catalog.md.
Quality
The local curator (Qwen 2.5 3B Q4) emits 20/20 valid JSON on the contradiction-resolution prompt, versus Gemma 2B's 0/20 in our bench. Auto-detected at startup; override with MEMOIRS_CURATOR_BACKEND={qwen,phi,gemma}.
| backend | JSON valid | p50 | tokens |
|---|---|---|---|
| qwen2.5-3b-Q4 | 20/20 | 870 ms | 23 |
| phi-3.5-mini-Q4 | 8/20 raw → 20/20 with parser salvage | | |
| gemma-2-2b-Q4 | 0/20 (chat-template + stop-token interaction) | | |
Every curator function has a tolerant parser that salvages truncated / fenced / bare-string JSON, so the system degrades cleanly even when the model misbehaves. Backend selection is auto: Qwen → Phi → Gemma (whichever GGUF is found first in ~/.local/share/memoirs/models/).
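A salvage chain along these lines (a simplified sketch, not memoirs' actual parser) handles the common failure modes:

```python
import json
import re

def salvage_json(raw: str):
    """Best-effort JSON recovery: strict parse first, then strip markdown
    fences, then extract the first {...} span, else degrade to bare string."""
    for candidate in (
        raw,
        re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip()),  # fenced block
    ):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            pass
    m = re.search(r"\{.*\}", raw, re.DOTALL)  # first object-looking span
    if m:
        try:
            return json.loads(m.group(0))
        except json.JSONDecodeError:
            pass
    return {"text": raw.strip()}  # bare-string fallback

print(salvage_json('```json\n{"action": "MERGE"}\n```'))  # {'action': 'MERGE'}
print(salvage_json('sure! {"action": "EXPIRE"} hope that helps'))
```

The key design point is that every stage returns *something* structured, so a chatty or truncated model response never crashes the consolidation loop.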
Architecture
┌────────────────────────────────────────────────────────────────────┐
│                        IDE / Agent / Script                        │
└───────────────┬─────────────────────┬─────────────────────┬────────┘
                │ MCP stdio           │ HTTP API + Web UI   │ CLI
                ▼                     ▼                     ▼
┌────────────────────────────────────────────────────────────────────┐
│                      memoirs runtime (Python)                      │
│                                                                    │
│  ingest ▶ raw ▶ extract ▶ consolidate ▶ retrieve ▶ serve           │
│  chats /  conv / Qwen 3 / ADD · UPDATE  PRF + MMR /context         │
│  events   msgs   spaCy    MERGE · EXPIRE reranker stream           │
│  sources         noop     ARCHIVE        (opt-in)                  │
│                                                                    │
│  ┌──── sleep-time cron (idle housekeeping) ─────────────────────┐  │
│  │ dedup │ link_rebuild │ prune │ contradictions                │  │
│  │ scheduled snapshots (point-in-time backup)                   │  │
│  └──────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  │ SQLite + sqlite-vec + FTS5
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  memories · candidates · links · entities · summaries · snapshots  │
│                                                                    │
│  Ebbinghaus strength · Zettelkasten edges · RAPTOR hierarchy       │
│  bi-temporal validity · point-in-time backup · provenance_json     │
│  (actor · process · decision · reason → migration 012)             │
└────────────────────────────────────────────────────────────────────┘
Engine layers (each independently testable): raw → extract → consolidate → retrieve → maintain. PRF (MEMOIRS_PRF) and the bge-reranker (MEMOIRS_RERANKER_BACKEND) plug into retrieve as opt-in env knobs.
- Raw: sources, conversations, messages.
- Extract: Qwen-driven candidates with type validation + secret scanning.
- Knowledge graph: entities, relationships, project context.
- Embeddings: sqlite-vec + sentence-transformers (or fastembed).
- Engine: consolidation, scoring (importance × confidence × Ebbinghaus(t,S) × usage × signal), lifecycle, dedup, time-travel, hybrid + graph + RAPTOR retrieval.
- MCP / API / UI: 22 MCP tools, FastAPI with SSE, web inspector at /ui.
Features
Retrieval
- Hybrid BM25 + dense with RRF fusion
- HippoRAG-style multi-hop via Personalized PageRank over the entity + memory graph
- RAPTOR hierarchical summaries for long-context queries
- Streaming SSE: context streams as it ranks (TTFT < 50 ms)
- Time-travel: as_of=t returns the corpus state at any past timestamp
- HyDE / cross-encoder reranker / MMR: opt-in pipeline stages
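The fusion step itself is tiny; a minimal sketch of reciprocal rank fusion (k = 60 is the conventional smoothing constant from the RRF literature, not necessarily memoirs' setting):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
    per document; documents near the top of any list score highest."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25  = ["m7", "m2", "m9"]   # lexical (FTS5) order
dense = ["m2", "m4", "m7"]   # embedding (sqlite-vec) order
print(rrf_fuse([bm25, dense]))  # ['m2', 'm7', 'm4', 'm9']
```

RRF needs no score calibration between the two retrievers, which is why it works well for gluing BM25 and cosine-similarity rankings together.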
Memory lifecycle
- Ebbinghaus forgetting curve: R = e^(-Δt / (S·24h)), strength × 1.5 per access (cap 64)
- Zettelkasten linking: bidirectional memory_links with 4 modes (absolute / topk / adaptive / zscore)
- EXPIRE/ARCHIVE generation: heuristic + LLM curator decide when to retire memories
- Sleep-time consolidation: cron job runs dedup/prune/contradictions when idle
- Versioning: bi-temporal valid_from/valid_to per memory
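The decay rule maps directly to code; a sketch of the formula (helper names are illustrative, not memoirs' API):

```python
import math

def retention(delta_t_hours: float, strength_days: float) -> float:
    """Ebbinghaus retention R = e^(-Δt / (S · 24h)): larger strength S
    (in days) means slower forgetting."""
    return math.exp(-delta_t_hours / (strength_days * 24.0))

def on_access(strength_days: float) -> float:
    """Each retrieval reinforces the memory: strength × 1.5, capped at 64."""
    return min(strength_days * 1.5, 64.0)

s = 1.0
print(round(retention(24.0, s), 3))  # one day later at S=1 → e^-1 ≈ 0.368
s = on_access(s)                     # accessed once: S = 1.5
print(round(retention(24.0, s), 3))  # the same gap now decays slower ≈ 0.513
```

The cap keeps frequently-touched memories from becoming effectively immortal while still letting them dominate idle-time pruning decisions.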
Privacy & ops
- PII redaction with Microsoft Presidio (optional) + 11 always-on secret detectors
- Per-memory ACL with private / shared / org / public visibility tiers
- Encryption-at-rest via SQLCipher (MEMOIRS_ENCRYPT_KEY)
- GDPR export/import as portable zip bundles with sha256 manifests
- Structured JSON logging + OpenTelemetry traces (opt-in)
- Versioned migrations with up()/down() and rollback
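The sha256-manifest idea sketches easily (illustrative bundle layout; memoirs' real manifest schema may differ):

```python
import hashlib
import io
import json
import zipfile

def write_bundle(files: dict[str, bytes]) -> bytes:
    """Write a portable zip bundle plus a manifest.json mapping each
    member to its sha256, so imports can verify integrity."""
    manifest = {name: hashlib.sha256(data).hexdigest()
                for name, data in files.items()}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()

def verify_bundle(blob: bytes) -> bool:
    """Recompute every member's sha256 and compare against the manifest."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        return all(hashlib.sha256(zf.read(n)).hexdigest() == h
                   for n, h in manifest.items())

bundle = write_bundle({"memories.jsonl": b'{"content": "prefers dark mode"}\n'})
print(verify_bundle(bundle))  # True
```

A manifest like this is what makes the bundles safely portable: the importing side can reject any archive whose contents were altered in transit.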
Integrations
- MCP stdio server (Claude Desktop, Claude Code, Cursor, Continue, Cline, Codex)
- HTTP API with FastAPI + SSE streaming
- Inspector UI at /ui: timeline, graph, search, conflict-resolution, edit/pin/forget
- CLI with 30+ subcommands
- Eval harness: LongMemEval / LoCoMo / synthetic suites with precision@k / recall@k / MRR
CLI tour
# Ingestion
memoirs ingest <path|url> # any supported format, auto-detected
memoirs ingest --kind claude-export memories.zip
memoirs watch ~/.claude/projects # real-time
memoirs daemon start # background watcher + extractor + sleep cron
# Retrieval & inspection
memoirs ask "..." # one-shot context query
memoirs why <memory_id> --query "..." # provenance trace
memoirs trace <conversation_id> # source → messages → candidates → memories
memoirs graph entities --html # interactive entity visualization
memoirs ui --port 8284 # web inspector
# Lifecycle
memoirs maintenance # recompute scores + expire
memoirs cleanup # dedup near-duplicates
memoirs links rebuild --mode topk # rebuild Zettelkasten edges
memoirs raptor build # construct hierarchical summaries
memoirs sleep run-once # one cron tick
# Privacy & data
memoirs export --user-id alice --redact-pii --out alice.zip
memoirs import alice.zip --mode merge
memoirs db encrypt --key 'passphrase'
# Eval
memoirs eval --suite synthetic_basic --modes hybrid,bm25,dense
Run memoirs --help for the full surface (35 subcommands).
MCP tools (22)
Raw: ingest_event • ingest_conversation • status
Extraction: extract_pending • consolidate_pending • audit_corpus • record_tool_call • consolidate_with_gemma
Engine: run_maintenance • get_context • summarize_thread • search_memory • add_memory • update_memory • score_feedback • explain_context • forget_memory • list_memories
Graph: index_entities • get_project_context • list_projects
Ops: event_stats • export_user_data • import_user_data
mcp_get_context is the workhorse: every other tool exists to make its output better. It returns ~600-1,500 tokens of ranked, conflict-resolved memory with an optional provenance_chain for explainability.
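Under the hood each tool is invoked with a standard MCP tools/call JSON-RPC request over stdio; a get_context call looks roughly like this (the argument shape is an assumption, not the documented schema):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_context",
    "arguments": { "query": "what's our auth strategy?" }
  }
}
```

Any MCP-capable client (Claude Desktop, Cursor, etc.) emits this shape for you; you only ever see the tool name and its arguments.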
Configuration
Everything is env-var driven so the same binary works in dev, daemon, and CI.
# Storage
MEMOIRS_DB=path/to.sqlite # default .memoirs/memoirs.sqlite
MEMOIRS_ENCRYPT_KEY=<passphrase> # enable SQLCipher
MEMOIRS_SQLITE_MMAP_MB=256 # default 256
MEMOIRS_SQLITE_CACHE_MB=64 # default 64
# Curator (LLM): Qwen 2.5 3B is the default when its GGUF is present.
MEMOIRS_CURATOR_BACKEND=qwen|phi|gemma # default: auto-detect (qwen > phi > gemma)
MEMOIRS_CURATOR_MODEL=/path/to.gguf
MEMOIRS_GEMMA_THREADS=20 # legacy name, applies to whichever curator is loaded
MEMOIRS_GEMMA_GPU_LAYERS=99 # Vulkan offload
# Retrieval
MEMOIRS_RETRIEVAL_MODE=hybrid_graph # dense|bm25|hybrid|graph|hybrid_graph|raptor|hybrid_raptor
MEMOIRS_RETRIEVAL_GEMMA=off # off | on (slow but smarter conflict resolution)
MEMOIRS_RETRIEVAL_GEMMA_MAX=2 # cap LLM calls per query
MEMOIRS_HYDE=off # query expansion
MEMOIRS_RERANKER_BACKEND=none|bge # cross-encoder rerank top-N
MEMOIRS_MMR=on
MEMOIRS_MMR_LAMBDA=0.7
# Embedding
MEMOIRS_EMBED_BACKEND=sentence_transformers|fastembed
# Privacy
MEMOIRS_REDACT=on|off|strict # PII + secret scanning at ingest
MEMOIRS_USER_ID=alice # multi-tenant scope
MEMOIRS_NAMESPACE=work
# Observability
MEMOIRS_LOG_FORMAT=json|text
MEMOIRS_LOG_LEVEL=INFO
MEMOIRS_OTEL_ENDPOINT=http://otelcollector:4317
# Lifecycle
MEMOIRS_ZETTELKASTEN=on|off
MEMOIRS_GEMMA_CURATOR=on|off|auto # consolidation curator
Full list in docs/configuration.md.
Comparison
| engine | Local-first | Hybrid retrieval | Multi-hop bridge | Temporal queries | Native MCP | Auto-curate | Eval harness |
|---|---|---|---|---|---|---|---|
| memoirs | ✅ SQLite | ✅ BM25+dense+RRF | ✅ PRF + PPR | ✅ bi-temporal as_of=t | ✅ 22 tools | ✅ Qwen 3 / Phi / Gemma | ✅ LongMemEval, LoCoMo, synthetic |
| Mem0 | ⚠️ SaaS-default | ❌ | ❌ | partial | ⚠️ wrapper | ✅ OpenAI/Anthropic | ⚠️ in-house only |
| Cognee | ⚠️ Postgres | ❌ | ✅ ontology graph | ❌ | ❌ | ✅ LiteLLM | ⚠️ in-house |
| Letta (MemGPT) | ⚠️ Postgres | ❌ | ❌ | ❌ | ⚠️ wrapper | ✅ OpenAI | partial |
| Zep / Graphiti | ❌ SaaS | ❌ | ✅ entity graph | ✅ | ⚠️ wrapper | ✅ OpenAI | ✅ DMR + LongMemEval |
| LangMem | ✅ in-process | ⚠️ ad hoc | ❌ | ❌ | ⚠️ wrapper | ✅ procedural | ❌ |
| LlamaIndex | ✅ in-process | ✅ | ⚠️ via plugins | ❌ | ⚠️ wrapper | ❌ DIY | ❌ |
| Memori | ✅ SQLite | ❌ | ❌ | ❌ | ❌ | ✅ heuristic | ❌ |
Project status
- 693 tests passing • coverage 60% • 12 schema migrations • 35 CLI subcommands • 22 MCP tools
- Tested on a real 4,196-memory corpus: p50 6.2 ms hybrid_graph
- LongMemEval / LoCoMo head-to-head vs other engines is the next milestone
Documentation
- docs/configuration.md: every env var explained
- docs/architecture.md: schema, layers, data flow
- docs/benchmarks.md: methodology + raw numbers
- docs/mcp.md: MCP tool reference
- docs/cli.md: CLI reference
Contributing
Issues and PRs welcome. Run the suite first:
.venv/bin/pytest tests/ -q # 693 tests, ~90 s
bash scripts/coverage.sh # coverage report → .coverage_html/
CONTRIBUTING.md (TODO) covers the migration cookbook, test conventions, and how to add a new MCP tool.
License
MIT – see LICENSE.
Screenshots
Built against the seeded demo DB (scripts/seed_demo_db.py): 51 memories spanning every memory type, plus a snapshot baseline so the diff is non-empty.
| Dashboard – corpus stats + doughnut of memory types + procedural policies | ![]() |
| Memories – full list with type pills + provenance | ![]() |
| Procedural – ?type=procedural filter | ![]() |
| Search – BM25 + dense + RRF retrieval | ![]() |
| Timeline – bi-temporal view, filter by type | ![]() |
| Conflicts – semantic dups across types | ![]() |
| Snapshots – list + create + auto-cadence stat | ![]() |
| Snapshot diff – side-by-side, added/removed/changed in colored cards | ![]() |
To regenerate:
python scripts/seed_demo_db.py --out data/demo/memoirs_demo.sqlite
memoirs --db data/demo/memoirs_demo.sqlite serve --port 8283 &
bash scripts/take_screenshots.sh