Claudemem
A Claude memory system combining BM25 and vector search, exposed as a stdio MCP server with wrapper skills for Claude to use.
agent-memory
Persistent hybrid-search memory system for AI coding agents. Replaces markdown-based memory with selective retrieval, cross-project search, and agent-scoped memories that scale without context cost.
Architecture
- SQLite -- single-file backing store, portable, zero config
- fastembed-rs -- local embeddings via all-MiniLM-L6-v2 ONNX model (semantic similarity, no API calls)
- Hybrid ranking -- BM25 (FTS5) + cosine similarity combined via Reciprocal Rank Fusion (RRF)
- MCP server -- stdio JSON-RPC server for native Claude Code tool integration
- CLI -- direct command-line interface for humans, scripts, and AI agents
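The hybrid-ranking step above can be sketched in a few lines. This is an illustrative Reciprocal Rank Fusion implementation, not the project's Rust code; the fusion constant k=60 is the conventional default and an assumption here.

```python
# Sketch of Reciprocal Rank Fusion (RRF): each ranked list contributes
# 1/(k + rank) per document, and the fused score is the sum. k=60 is the
# customary constant from the RRF literature (assumption, not the crate's value).

def rrf_fuse(bm25_ranked, vector_ranked, k=60):
    """Combine two ranked ID lists into one fused ordering."""
    scores = {}
    for ranking in (bm25_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["m1", "m2", "m3"]   # BM25 (FTS5) order
vec = ["m3", "m1", "m4"]    # cosine-similarity order
print(rrf_fuse(bm25, vec))  # "m1" wins: strong in both lists
```

Documents that rank well in both lists float to the top without either scorer's raw scale dominating, which is why RRF is a common choice for fusing lexical and semantic rankings.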
Install
Linux / macOS
curl -fsSL https://raw.githubusercontent.com/nitecon/agent-memory/refs/heads/main/install.sh | sudo bash
Installs two binaries to /opt/agentic/bin/ (symlinked into /usr/local/bin/):
- memory -- the main CLI + MCP server
- memory-dream -- offline batch compactor (condense + dedup; see Dream compactor below)
Windows (PowerShell as Administrator)
irm https://raw.githubusercontent.com/nitecon/agent-memory/refs/heads/main/install.ps1 | iex
Installs memory.exe + memory-dream.exe to %USERPROFILE%\.agentic\bin\ and adds the directory to your PATH.
From source
This is a Cargo workspace. The default build produces both binaries:
cargo build --release
# → target/release/memory
# → target/release/memory-dream
(Windows adds the .exe suffix.)
First memory invocation downloads the embedding model (~80MB, cached alongside the database). First memory-dream --pull downloads the gemma3 weights (~2GB, same cache directory).
Release archives
Each release tag produces a single combined archive per platform:
agent-memory-v1.2.0-linux-x86_64.tar.gz # memory + memory-dream
agent-memory-v1.2.0-linux-aarch64.tar.gz
agent-memory-v1.2.0-macos-x86_64.tar.gz
agent-memory-v1.2.0-macos-aarch64.tar.gz
agent-memory-v1.2.0-windows-x86_64.zip
Archive size is ~70MB (candle + tokenizers add weight to memory-dream). The model weights are not shipped in the archive -- they're downloaded on demand via memory-dream --pull. Users who never run memory-dream pay only the ~28MB of disk the binary takes up, with zero cognitive overhead; memory update force-bundles both binaries on every upgrade so install and updater logic stay symmetric.
Database location
The database path is resolved in this order:
| Priority | Condition | Path |
|---|---|---|
| 1 | AGENT_MEMORY_DIR env var is set | $AGENT_MEMORY_DIR/memory.db |
| 2 | ~/.agentic/memory.db exists | ~/.agentic/memory.db (user-local) |
| 3 | Default (Linux/macOS) | /opt/agentic/memory.db (global) |
| 3 | Default (Windows) | %USERPROFILE%\.agentic\memory.db |
The model cache and any auxiliary data are stored alongside the database in the same directory.
This means on a shared Linux/macOS machine, all agents share /opt/agentic/memory.db by default. If you need per-user isolation, create ~/.agentic/memory.db (even an empty file will trigger the user-local path) or set AGENT_MEMORY_DIR.
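The resolution order above can be expressed as a short sketch. The function name and parameters are illustrative; the real resolver is in the Rust binary.

```python
# Sketch of the documented database-path resolution order:
# 1. AGENT_MEMORY_DIR env var, 2. user-local ~/.agentic/memory.db if it
# exists, 3. platform default. Helper shape is an assumption.
import os
from pathlib import Path

def resolve_db_path(env=os.environ, home=Path.home(), windows=False):
    # 1. Explicit override always wins.
    if "AGENT_MEMORY_DIR" in env:
        return Path(env["AGENT_MEMORY_DIR"]) / "memory.db"
    # 2. A user-local file (even an empty one) opts into per-user isolation.
    user_local = home / ".agentic" / "memory.db"
    if user_local.exists():
        return user_local
    # 3. Platform default: global on Linux/macOS, profile dir on Windows.
    return user_local if windows else Path("/opt/agentic/memory.db")
```

Note that step 2 only checks existence, which is why touching an empty `~/.agentic/memory.db` is enough to switch a user off the shared global database.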
Recommended usage: CLI-first
Calling the memory binary directly is the recommended approach. It is just as fast as MCP mode and avoids the overhead of running a persistent server process. The fastest way to teach your agent to use it is the memory setup command -- it bundles an interactive checklist that injects the rules block into your agent rule files and installs a Claude Code skill that auto-advertises the CLI to every session.
Auto-install the agent protocol
memory setup is now a small subcommand family:
| Command | Behavior |
|---|---|
| memory setup | Interactive checklist: shows the install state of each component (rules, skill) and lets you pick which to (re)install |
| memory setup rules [flags] | Inject the <memory-rules> block into known agent rule files (CLAUDE.md, GEMINI.md, AGENTS.md) |
| memory setup rules --remove | Strip the <memory-rules> block and reverse every paired native-memory-disable write (Claude autoMemoryEnabled, Gemini excludeTools: save_memory, Codex [features] memories) |
| memory setup skill [flags] | Install SKILL.md under every known agent frontend -- ~/.claude/skills/agent-memory/ (Claude Code) and ~/.gemini/skills/agent-memory/ (Gemini CLI) -- so each session auto-loads a ~100-token description that nudges the model toward the CLI |
| memory setup skill --remove | Delete the installed SKILL.md from every known target. Missing files are silently skipped -- parity with setup rules --remove |
| memory setup all [-y] | Run rules → skill non-interactively (use -y / --yes to skip confirmation) |
# Bare invocation: 2-item interactive checklist (rules + skill).
memory setup
# Rules only -- detects ~/.claude/CLAUDE.md, ~/.gemini/GEMINI.md,
# ~/.codex/AGENTS.md, ~/.config/codex/AGENTS.md.
memory setup rules # detect + prompt
memory setup rules --all # update every detected file
memory setup rules --target ~/.claude/CLAUDE.md
memory setup rules --dry-run # preview, don't write
memory setup rules --print # emit just the <memory-rules> block
memory setup rules --all --remove # uninstall: strip block + reverse every native-memory-disable write
# Skill only -- installs SKILL.md to every known frontend:
# ~/.claude/skills/agent-memory/SKILL.md (Claude Code)
# ~/.gemini/skills/agent-memory/SKILL.md (Gemini CLI)
memory setup skill
memory setup skill --dry-run
memory setup skill --print
memory setup skill --remove # uninstall: delete SKILL.md from every target
# Everything, scripted.
memory setup all --yes
memory setup rules writes a <memory-rules>…</memory-rules> block (loose-XML markers so it is easy to locate and update) and saves a .bak sibling before each modification. Re-running replaces the block in place -- your agent rule files never accumulate duplicates. If the companion agent-tools setup rules block (<agent-tools-rules>…</agent-tools-rules>) is already present in the file, the memory block is inserted directly after it so the two protocols stay grouped at the top; otherwise it is prepended.
Interaction with native agent memory. The installed rules block directs the agent to route every memory operation through the memory CLI, so leaving each tool's built-in memory system enabled would cause the agent to write into both the tool's native memory surface and this tool's SQLite store simultaneously -- silent duplication the rules block is specifically designed to avoid. For every supported frontend, memory setup rules also merges the matching native-memory-disable setting; --remove reverses each merge by deleting (not forcing) the key so prior state is restored.
| Agent | Target file | Change on install |
|---|---|---|
| Claude | ~/.claude/settings.json | set "autoMemoryEnabled": false (also ./.claude/settings.json for project-scope CLAUDE.md) |
| Gemini | ~/.gemini/settings.json | append "save_memory" to excludeTools (disables Gemini's built-in memory tool) |
| Codex | $CODEX_HOME/config.toml (or ~/.codex/, then ~/.config/codex/) | set [features] memories = false (disables the Chronicle memory feature) |
All three merges are conservative: unrelated keys, tables, and array entries are preserved; corrupt input fails loudly instead of being overwritten; re-runs are no-ops once the target state is reached. Writes are atomic (.new + rename) so a crash mid-write cannot leave a half-serialized file behind.
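The ".new + rename" pattern mentioned above is a standard atomic-write idiom; here is a minimal sketch (the helper name and JSON payload are illustrative, not the binary's actual code):

```python
# Minimal atomic-write sketch: serialize to a ".new" sibling, fsync, then
# rename over the target. A crash at any point leaves either the old file
# or the complete new file -- never a half-serialized one.
import json
import os

def atomic_write_json(path, data):
    tmp = path + ".new"
    with open(tmp, "w") as f:
        json.dump(data, f, indent=2)
        f.flush()
        os.fsync(f.fileno())   # ensure bytes hit disk before the rename
    os.replace(tmp, path)      # atomic on both POSIX and Windows
```

`os.replace` is the key call: unlike a write-in-place, the rename is atomic at the filesystem level, so readers always see a complete file.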
memory setup skill writes the same SKILL.md byte-for-byte to every known agent frontend -- ~/.claude/skills/agent-memory/SKILL.md (Claude Code) and ~/.gemini/skills/agent-memory/SKILL.md (Gemini CLI). Both frontends honor the same YAML frontmatter + Markdown body; Gemini silently ignores the Claude-specific allowed-tools key. The frontmatter description is always loaded into sessions (~100 tokens), pulling the model toward memory context at task start and memory store at task end. The full body only loads on demand when the skill is picked. Install is unconditional -- no auto-detection of whether each agent is installed -- because running memory setup skill is itself the opt-in signal. Re-runs write a .bak sidecar then overwrite, so the command is idempotent.
Manual install (equivalent content)
If you'd rather paste the block yourself, add the following to your global CLAUDE.md, GEMINI.md, or equivalent agent instructions:
<memory-rules>
## Agent Memory -- Mandatory Protocols
### Memory Operations (MANDATORY)
**Binary:** `memory` (installed at `/opt/agentic/bin/memory` on Linux/macOS, `%USERPROFILE%\.agentic\bin\memory.exe` on Windows) -- call directly via Bash. Do NOT use MCP or skills for memory during normal workflow.
**The "Memory First/Last" Rule:** Every task must begin with a `context` or `search` call and end with a `store` call if functionality changed.
### Scope tiers
Every memory is stored under one of two scopes; retrieval boosts both:
| Scope | Boost | When to use |
|----------------------------|--------|------------------------------------------------|
| **Current project** (cwd) | 1.5× | Repo-specific decisions, patterns, bugs |
| **Global** (`__global__`) | 1.25× | Universal user preferences / directives |
| Other project | 1.0× | Surfaces only as prior art via the `hint` field |
`store`, `search`, and `context` auto-detect the current project from the cwd's git remote. A single `context` call returns both current-project and global hits -- no second query needed.
```bash
# Context -- top-K relevant memories for a task (boost cwd + global)
memory context "<task description>" -k <limit>
# Search -- hybrid BM25 + vector search (boost cwd + global)
memory search "<query>" -k <limit>
# Store -- save a new project-scoped memory (cwd auto-detected)
memory store "<content>" -m <type> -t "<tags>"
# Store -- save a universal preference (applies across every repo)
memory store "<content>" -m <type> --scope global -t "<tags>"
# types: user, feedback, project, reference
# Get -- fetch full content for specific IDs (pair with search for two-stage flow)
memory get <id> [<id>...] # 8-char short prefix OK
# Recall -- filter by project/agent/tags/type
memory recall -m <type> -t "<tags>" -p "<project>" -k <limit>
# Projects -- list distinct project idents (spot alias mismatches)
memory projects
# Move -- reassign the project ident on one or many memories
memory move --from "<old>" --to "<new>" [--dry-run]
# Copy -- duplicate memories under a new project ident
memory copy --from "<old>" --to "<new>" [--dry-run]
# Forget -- remove a memory by ID (or by search query)
memory forget --id <uuid>
# Prune -- decay stale/low-access memories
memory prune --max-age-days 90 [--dry-run]
```
### Memory quality gate (MANDATORY)
Store memories only when they will help a future agent work faster. A good memory captures reusable patterns, operational procedures, user preferences, non-obvious constraints, failure causes, or "how to / why" guidance. Write for a cold agent who has not seen this session: the memory should tell them what to do next, which tool or system to use, and why that path is correct.
Prefer updating an existing memory over creating a new one. Before storing, search/recall for related memories in the same project and in global scope. If the new learning refines the same workflow, subsystem, failure mode, user preference, or reusable pattern, update or rewrite the existing memory instead of adding another row. New memories are for distinct reusable knowledge that a future agent should retrieve independently.
Use the applicable overall guidance for state, notes, and tasks: explicit user instructions, AGENTS.md or other repo instructions, project conventions, and the tools actually available in the environment. Git history already records timeline-specific implementation details; canonical docs, issue trackers, task boards, or other user-approved surfaces are the right place for evolving design notes, status, and open questions. Do not invent a note location, create TODO/ADR files, or assume a specific task tool if the user's guidance points elsewhere. Memory should primarily increase knowledge about **how and why** work is done, or point to the canonical system that contains live details. For example, a useful memory may say "filesystem replication decisions are tracked in the project task board; check the active task thread before changing replication behavior because it captures current constraints and open decisions." Do not copy the full note/task content into memory.
Do **not** store facts that can be recovered from git history, repository inspection, CI/release systems, or configured task/comms surfaces. In particular, do not store routine deployment status, version numbers, release events, commit SHAs, branch state, "CI passed", "tag was pushed", or "deployed version X" memories.
Exception: store deployment/version facts only when they explain a failure mode or encode a reusable procedure that prevents future mistakes.
Prefer:
- Dev server `https://foo-dev.nitecon.org` is deployed by Eventic on main branch push; do not manually deploy. Average deploy time is about 2 minutes, so set a timer before checking.
Avoid:
- "Deployed version 1.2.0."
- "Tag v1.2.0 was pushed."
- "Commit abc123 passed CI."
- "Updated pattern 019dc55f with a Mumble/Murmur example."
If a user refers to "patterns", they likely mean gateway-backed `agent-tools patterns` stored under `https://gateway.nitecon.org`. A useful memory says to inspect the current CLI with `agent-tools patterns --help`, then use `agent-tools patterns get/update/check` as appropriate. Do not save a memory that only says a pattern was updated; save the reusable workflow and the reason it matters.
Short locator memories for canonical gateway patterns are allowed when they help a cold agent quickly find and reuse non-obvious guidance. For example, a memory may say that Eventic/Kubernetes deployment pipeline guidance lives in an `agent-tools patterns` record and should be looked up before designing a new pipeline. Keep the memory to the locator, reuse instruction, and why the pattern matters; do not record "created/updated pattern X" as an audit event.
When you notice duplicated, stale, or nonsensical memories in the current project, naturally consolidate them as part of the work: merge clear duplicates, rewrite bloated memories into one stronger entry, or forget entries that fail this quality gate. Keep cleanup local and conservative β do not launch broad memory-cleanup sweeps unless the user asks.
### Rule A -- Pre-action behavior recall (MANDATORY)
Before starting any user-requested task, run one `memory context "<task>"` call first. A single call returns both global directives (1.25× boost) and project-specific directives (1.5× boost). Do not skip for "quick" tasks: directives the user has already stated must never need to be re-stated. If the `hint` field flags zero global-scope matches, pause and reflect -- or ask before acting.
### Rule B -- Post-action scope classification (MANDATORY)
After completing an action, if the user stated or implied any directive, preference, or corrective rule during the session, you MUST store it and MUST classify its scope:
- **Global** (`--scope global`) -- universal preference. Signals: "I always", "I never", "from now on", "I prefer", "don't ever", "whenever we", "in general".
- **Project** (`--scope project`, the default) -- specific to this repo, service, or codebase. Signals: "in this repo", "for this service", "here we".
- **Ambiguous** -- phrasing could reasonably apply either way. You MUST ask the user before storing. Do not silently default.
Example:
```bash
memory store "User never wants PRs opened unless they explicitly ask" \
-m feedback --scope global -t "workflow,pr"
```
</memory-rules>
Prefer memory setup over hand-pasting -- it keeps the block up-to-date with the latest CLI surface and guarantees the markers match what future re-runs look for.
MCP server (optional)
If you prefer MCP integration, register the server:
claude mcp add agent-memory -- /opt/agentic/bin/memory serve
Or add manually to ~/.claude.json:
{
"mcpServers": {
"agent-memory": {
"type": "stdio",
"command": "/opt/agentic/bin/memory",
"args": ["serve"]
}
}
}
This gives Claude Code ten native tools: memory_store, memory_search, memory_recall, memory_forget, memory_prune, memory_context, memory_get, memory_projects, memory_move, memory_copy.
Skills (optional)
Copy the skill directories to your Claude Code skills location:
# Personal skills (available in all projects)
cp -r skills/remember ~/.claude/skills/remember
cp -r skills/recall ~/.claude/skills/recall
This enables /remember and /recall slash commands.
CLI reference
# Store a memory (project auto-detected from cwd's git remote)
memory store "User prefers terse responses" --tags "preference" -m feedback
# Store a universal preference (applies across every repo, 1.25× retrieval boost)
memory store "User never wants PRs opened unless explicitly asked" \
-m feedback --scope global --tags "workflow,pr"
# Hybrid search (BM25 + vector); cwd project is boosted
memory search "how does testing work"
# Fetch full content for specific hits (two-stage retrieval)
# Short 8-char prefix is fine -- resolves via `resolve_id_prefix`.
memory get 4c82c482
memory get <uuid> <uuid>
# Filter by project/agent/tags
memory recall --project myapp --memory-type feedback
# Task-relevant context
memory context "refactoring the auth middleware" -k 5
# Hard filter vs boost
memory search "storage" --only "github.com/acme/infra.git" # only this project
memory search "storage" --no-project-boost # flat ranking, no boost
# Delete by ID (short prefix supported) or by search
memory forget --id 4c82c482
memory forget --query "outdated preference"
# Clean up stale memories
memory prune --max-age-days 90 --dry-run
memory prune --max-age-days 90
# List all memories
memory list -k 50 --project myapp
# List distinct project idents (great for spotting alias mismatches)
memory projects
# Migrate memories from a legacy project name to the canonical git-remote ident
memory move --from "trading-platform-sre" --to "github.com/nitecon/SRE.git" --dry-run
memory move --from "trading-platform-sre" --to "github.com/nitecon/SRE.git"
# Reassign a single memory by ID (pass --to "" to clear the project tag)
memory move --id <uuid> --to "github.com/nitecon/SRE.git"
memory move --id <uuid> --to ""
# Duplicate memories under a new project ident (preserves content + embedding)
memory copy --from "github.com/acme/mono.git" --to "github.com/acme/split.git"
memory copy --id <uuid> --to "github.com/acme/mirror.git"
# Check for updates and install the latest version
memory update
# Setup family -- interactive checklist + per-component subcommands
memory setup # interactive: pick rules and/or skill
memory setup rules # rules only: detect + prompt
memory setup rules --all # rules: update every detected file
memory setup rules --target ~/.claude/CLAUDE.md
memory setup rules --dry-run # rules: preview, don't write
memory setup rules --print # rules: print <memory-rules> block
memory setup rules --all --remove # rules: strip block + reverse every native-memory-disable write
memory setup skill # install SKILL.md under Claude + Gemini skill dirs
memory setup skill --dry-run # skill: preview SKILL.md
memory setup skill --print # skill: print SKILL.md to stdout
memory setup skill --remove # skill: delete SKILL.md from every target
memory setup all --yes # rules → skill, non-interactive
Project & global scope tiers
store, search, and context derive the current project identifier from the working directory's git remote and reduce it to the repository shortname (e.g. git@github.com:nitecon/eventic.git → eventic). SSH and HTTPS for the same repo produce the same ident. Non-git directories fall back to the directory basename. New memories are auto-tagged with this project unless you pass --project explicitly, --no-project, or --scope global.
Shortname is deliberate so auto-derived idents match the hand-written shortnames most agents already use. The trade-off is that two repos with the same basename across different orgs will collide; in that case, tag them explicitly with --project.
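The SSH/HTTPS normalization above can be sketched as follows. This is an illustrative approximation; the real resolver is in Rust and its edge-case handling (unusual URL forms, trailing slashes) may differ.

```python
# Sketch: reduce a git remote URL to its repo shortname so that SSH and
# HTTPS forms of the same repo yield the same ident. Edge cases beyond the
# two documented forms are assumptions.
def repo_shortname(remote_url):
    tail = remote_url.rstrip("/").split("/")[-1]
    tail = tail.split(":")[-1]   # handles scp-style git@host:repo with no slash
    return tail[:-4] if tail.endswith(".git") else tail

# Both forms of the same repo agree:
assert repo_shortname("git@github.com:nitecon/eventic.git") == "eventic"
assert repo_shortname("https://github.com/nitecon/eventic.git") == "eventic"
```

The collision the docs warn about falls out of this directly: `org-a/tools` and `org-b/tools` both reduce to `tools`, so explicit `--project` tagging is the escape hatch.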
Retrieval applies two independent score boosts:
| Scope | Boost | Meaning |
|---|---|---|
| Current project (project == cwd) | 1.5× | Local context -- highest priority |
| Global (project == "__global__") | 1.25× | Universal user preferences -- surface in every repo |
| Other project | 1.0× | Cross-project prior art; flagged via the hint field |
A single context or search call returns hits from all three tiers; the response's cross_project_count, global_scope_count, and hint fields tell models how to weigh them. Strong cross-project matches can still out-rank weak current-project hits -- the boosts tilt ties without hard-filtering prior art.
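The "tilt ties without hard-filtering" behavior can be seen in a small sketch. The base scores below are invented; only the 1.5× / 1.25× / 1.0× multipliers come from the docs.

```python
# Sketch of scope boosting on top of fused relevance scores. Base scores
# are illustrative; multipliers are the documented tier boosts.
def boosted(score, project, cwd_project):
    if project == cwd_project:
        return score * 1.5     # current-project tier
    if project == "__global__":
        return score * 1.25    # global-preference tier
    return score               # cross-project prior art, no boost

hits = [("other-repo", 0.80), ("__global__", 0.70), ("myapp", 0.60)]
ranked = sorted(hits, key=lambda h: boosted(h[1], h[0], "myapp"), reverse=True)
# myapp (0.60 -> 0.90) now leads, but a sufficiently strong other-repo hit
# (say 0.95 raw) would still out-rank it -- boosts bias, they don't filter.
```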
Global scope
Global-scoped memories are stored under the reserved sentinel project ident __global__. Users opt in with --scope global on memory store:
memory store "Never open a PR unless explicitly asked" \
-m feedback --scope global --tags "workflow,pr"
The sentinel is reserved: passing --project __global__ directly (or memory move --to __global__) is rejected with a clear error pointing users to --scope global. This keeps the sentinel load-bearing for retrieval behavior rather than a string users can accidentally collide into. When you run memory projects, __global__ shows up as its own row so you can see how many universal preferences are on file.
Search flags
| Flag | Behavior |
|---|---|
| (none) | Boost cwd project (1.5×) and global sentinel (1.25×); cross-project results still surface |
| -p <ident> | Boost this project (1.5×) instead of cwd; global boost unchanged |
| --only <ident> | Hard filter: only return memories with this project |
| --no-project-boost | Flat ranking; disables both boosts |
Store flags
| Flag | Behavior |
|---|---|
| (none) | Project scope; project auto-detected from cwd |
| --scope project | Explicit project scope; suppresses the reflection hint even on user/feedback stores |
| --scope global | Global scope; stores under the __global__ sentinel |
| --project <ident> | Override the project ident (must NOT equal __global__) |
| --no-project | Store with no project tag (skips cwd auto-detect) |
Migrating project idents
If memories were stored under a legacy project name (e.g. a logical label like trading-platform-sre) but the cwd-resolver now returns the canonical git-remote ident (e.g. github.com/nitecon/SRE.git), search will treat them as cross-project and the hint field will undersell their relevance. Fix it by consolidating idents:
# 1. Inspect the distinct project idents in the database
memory projects
# 2. Preview the affected memories before writing
memory move --from "trading-platform-sre" --to "github.com/nitecon/SRE.git" --dry-run
# 3. Apply the rename
memory move --from "trading-platform-sre" --to "github.com/nitecon/SRE.git"
Use memory copy instead of memory move when you want the memory available under both idents -- for example, when a shared memory applies to two forks of the same codebase. Copies keep the original content, tags, and cached embedding; only the project ident, UUID, and timestamps differ.
Output format (light-XML)
All commands emit light-XML -- grouped section tags with numbered content lines. No JSON. The shape is compact on purpose: tags give the agent a structural signal while the payload stays plain lines so token overhead is minimal.
Content bodies are not entity-escaped. Angle brackets, ampersands, and quotes pass through raw so guidance text like `memory get <id>` renders readably instead of as an escaped entity. The only escape is the double quote inside attribute values (needed so the attribute's "..." delimiter isn't broken).
context / search / recall
<project_memories>
1. PRs required by CodingGuidelines.md [git,standards] (ID:4c82c482)
2. Follow docs/CodingGuidelines.md for PRs [git,standards] (ID:772fd580)
</project_memories>
<general_knowledge>
1. User avoids PRs unless required [git,standards] (ID:372bd79d)
</general_knowledge>
<other_projects>
1. colorithmic: k-means Euclidean beats OKLab on OPT [quantization] (ID:23d0142a)
</other_projects>
<hint>2 of 4 results are global-scope preferences (apply across all projects). Treat them as directives, not suggestions.</hint>
<usage>IDs are 8-char prefixes. Use `memory get <id>` for full content. Sections: project_memories=current repo, general_knowledge=user-wide directives, other_projects=prior art.</usage>
- <project_memories> -- hits tagged with the current (cwd-derived) project.
- <general_knowledge> -- hits tagged with the __global__ sentinel (universal preferences, 1.25× boost).
- <other_projects> -- hits from other projects, prefixed with the originating project ident. Treat as prior art.
- <hint> -- reflection / directive prompt. Only emitted when it has something to say.
- <usage> -- static legend documenting short-ID semantics and section meanings. Emitted unconditionally on every multi-memory read (context, search, recall, list) -- including zero-result runs -- so cold callers always have the key in reach. Positioned at the bottom so structured data comes first.
Empty sections are elided. A query with zero global-scope hits during a scoped retrieval triggers a reflection-style <hint> nudging the agent to confirm no universal preference applies before acting.
Mutations (store / move / copy / forget / prune)
Single self-closing <result> line:
<result status="stored" id="a4936eff" scope="global" project="__global__"/>
<result status="forgot" id="a4936eff"/>
<result status="forgot" count="3"/>
<result status="no_matches"/>
<result status="pruned" count="7"/>
<result status="dry_run" count="7"/>
<result status="moved" id="a4936eff" to_project="github.com/acme/split.git"/>
memory store with memory type user or feedback and no explicit --scope gets one additional <hint>…</hint> line reminding the caller to reclassify to global if the memory applies across repos.
memory get
<memory id="a4936eff" project="agent-memory" type="feedback" tags="workflow,pr">
User never wants PRs opened unless they explicitly ask.
</memory>
Full content is emitted verbatim as element text (XML-escaped). IDs are shown as 8-char short prefixes everywhere β full UUIDs still work as input.
Short-ID resolution
Every command that takes an <id> accepts any prefix of 4 or more hex characters. memory get 4c82c482 expands to the full UUID when unique. When two memories share the same prefix, an <ambiguous> block lists the candidates:
<ambiguous prefix="4c82c482">
1. 4c82c482-c081-4937... [colorithmic,milestone]: colorithmic v1.0.0 milestone 2026-04-20...
2. 4c82c482-d7f2-4a18... [agent-memory,schema]: Schema v3 migration design notes...
Reply with 1..2, or re-run with a longer prefix.
</ambiguous>
The fast-path full-UUID lookup still works β prefix resolution is additive.
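The resolution rules above (4-hex-char minimum, full-UUID fast path, ambiguity surfaced rather than guessed) can be sketched like this. The function shape is illustrative; the binary's `resolve_id_prefix` may report ambiguity differently.

```python
# Sketch of short-ID prefix resolution: exact full-UUID lookup first, then
# unique-prefix expansion; ambiguous or unknown prefixes raise instead of
# guessing. Error types here are assumptions.
def resolve_id_prefix(prefix, known_ids):
    if len(prefix) < 4:
        raise ValueError("prefix must be at least 4 hex characters")
    if prefix in known_ids:
        return prefix                          # fast path: full UUID
    matches = [u for u in known_ids if u.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]                      # unique prefix -> expand
    raise LookupError(f"ambiguous or unknown prefix {prefix!r}: {matches}")

ids = ["4c82c482-c081-4937-9f00-000000000001",
       "4c82c482-d7f2-4a18-9f00-000000000002"]
resolve_id_prefix("4c82c482-c", ids)           # unique -> first full UUID
```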
memory list / memory projects
Plain light-XML blocks optimized for readability:
<memories count="2">
1. *(feedback) agent-memory [workflow,pr] (ID:a4936eff): User never wants PRs opened unless they explicitly ask.
2. (user) colorithmic [setup] (ID:b12c3d4e): Prefer k-means Euclidean over OKLab for OPT quantization.
</memories>
<usage>IDs are 8-char prefixes. Use `memory get <id>` for full content. Sections: project_memories=current repo, general_knowledge=user-wide directives, other_projects=prior art.</usage>
<projects count="3">
*agent-memory (42)
colorithmic (7)
__global__ (3)
</projects>
A leading * marks the current cwd-derived project. An empty list collapses to a self-closing <memories count="0"/> or <projects count="0"/>. memory list also emits the <usage> legend (it's a multi-memory read); memory projects does not (it's a utility listing of idents, not a memory read).
Auto-update
The binary checks for new releases on GitHub once per hour (at most) during normal CLI usage. If a newer version is found, it downloads and replaces the binary automatically. The update check is non-blocking β failures are logged to stderr and never interrupt normal operation.
To disable auto-updates, set the environment variable:
export AGENT_MEMORY_NO_UPDATE=1
You can also trigger an update manually at any time with memory update.
memory update fetches the combined release archive and atomically swaps both binaries (memory + memory-dream) in place. If memory-dream wasn't previously installed, the updater force-bundles it on the next upgrade -- users who never run the compactor pay ~28MB of disk but no cognitive overhead.
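The at-most-hourly check described above amounts to simple rate-limiting of the update probe. A stamp-file sketch of that gating logic (the stamp-file mechanism itself is an assumption; the binary may track the last check differently):

```python
# Sketch: allow an update check at most once per `interval` seconds, using
# a timestamp stamp file. Missing/corrupt stamps count as "never checked".
import time
from pathlib import Path

def should_check_updates(stamp: Path, now=None, interval=3600):
    now = time.time() if now is None else now
    try:
        last = float(stamp.read_text())
    except (FileNotFoundError, ValueError):
        last = 0.0
    if now - last < interval:
        return False                  # checked recently -- skip
    stamp.write_text(str(now))        # record this check
    return True
```

Because a failed probe only skips silently until the next interval, the check stays non-blocking, matching the documented behavior that failures never interrupt normal operation.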
Dream compactor (offline condensation + dedup)
memory-dream is a one-shot batch utility that walks your memory DB and does two things:
- Condenses verbose memories into a shorter factual claim using an in-process gemma3 model (via candle). The original text is preserved in a new content_raw column so nothing is lost -- condensed text replaces content, raw text lives alongside.
- Dedups near-identical memories via cosine similarity on the embeddings. The older of a duplicate pair gets a superseded_by pointer to the newer one; default reads filter superseded rows out, so they stay in the DB for audit but never surface in search / context / list.
It's never a daemon. Each invocation loads the model, processes the DB, and exits. Run it however you like: cron, launchd, Windows Task Scheduler, or just manually after a heavy session.
First-time setup
The default model gemma3 (google/gemma-3-1b-it) is gated on HuggingFace -- you must accept its license and supply an access token before --pull will succeed.
# 1. Visit https://huggingface.co/google/gemma-3-1b-it and accept the license.
# 2. Create an access token at https://huggingface.co/settings/tokens.
# 3. Export the token (HF_TOKEN or HUGGING_FACE_HUB_TOKEN both work).
export HF_TOKEN=hf_xxx_your_token_xxx
# 4. Download. ~2GB, cached under $AGENT_MEMORY_DIR/models/gemma3/.
# Resume-safe: interrupt with Ctrl-C and re-run to continue from the
# same byte offset. Idempotent: subsequent runs with all files present
# exit with `<result status="pull_skipped"/>` and no network activity.
memory-dream --pull
Without a token, memory-dream --pull emits <result status="auth_required" .../> and the three-step remediation above, then exits non-zero.
Smoke-testing the pull pipeline (no auth required)
For CI and contributors without an HF token, the short-name tinyllama resolves to an ungated repo (TinyLlama/TinyLlama-1.1B-Chat-v1.0, ~2GB) so the download plumbing can be exercised end-to-end without credentials:
export AGENT_MEMORY_DIR=/tmp/dream-smoke
memory-dream --pull --model tinyllama
TinyLlama is not wired into the condenser β it exists solely to validate the pull flow. Real condensation still requires gemma3.
Regular use
# Preview: walk the DB and report what would change. No writes.
memory-dream --dry-run
# Apply: full pass. Per-memory BEGIN IMMEDIATE transactions serialize
# with any concurrent `memory store` writes.
memory-dream
# Cap the pass for incremental runs on large DBs.
memory-dream --limit 50
# Swap model (rare β any HF repo id works; short `gemma3` and `tinyllama`
# resolve to canonical repos, everything else passes through unchanged).
memory-dream --model myorg/my-fork
Scheduling examples
cron (Linux/macOS) β daily at 03:00 local time:
0 3 * * * /opt/agentic/bin/memory-dream >> ~/.agentic/dream.log 2>&1
launchd (macOS) β run after login, again daily:
<!-- ~/Library/LaunchAgents/com.agentic.dream.plist -->
<plist version="1.0"><dict>
<key>Label</key><string>com.agentic.dream</string>
<key>ProgramArguments</key>
<array>
<string>/opt/agentic/bin/memory-dream</string>
</array>
<key>StartInterval</key><integer>86400</integer>
<key>RunAtLoad</key><true/>
</dict></plist>
Windows Task Scheduler β daily at 03:00:
$action = New-ScheduledTaskAction -Execute "$env:USERPROFILE\.agentic\bin\memory-dream.exe"
$trigger = New-ScheduledTaskTrigger -Daily -At 3am
Register-ScheduledTask -TaskName "agent-memory-dream" -Action $action -Trigger $trigger
What gets condensed, what gets deduped
A memory needs condensation when it has `content_raw IS NULL` (never processed) OR its `condenser_version` stamp no longer matches the current `<model>:<prompt-hash>` combo (prompt or model changed since last run). Stamping lets future passes detect and re-run stale rows without reprocessing everything.
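A minimal sketch of that staleness check, assuming a hypothetical `condenser_stamp` helper (the real binary's stamp format and hash function may differ; `DefaultHasher` stands in for whatever hash it actually uses):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical sketch: build a `<model>:<prompt-hash>` stamp so a later
// pass can spot rows condensed under an older model/prompt combination.
fn condenser_stamp(model: &str, prompt: &str) -> String {
    let mut h = DefaultHasher::new();
    prompt.hash(&mut h);
    format!("{}:{:016x}", model, h.finish())
}

// A row needs condensation if it was never processed (content_raw IS NULL)
// or its stored stamp no longer matches the current combo.
fn needs_condensation(content_raw: Option<&str>, stamp: Option<&str>, current: &str) -> bool {
    content_raw.is_none() || stamp != Some(current)
}
```

Changing either the model id or the prompt text produces a new stamp, which is why every row gets revisited after an upgrade but skipped on identical re-runs.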
Dedup candidates must share the same `project` AND same `memory_type` AND same `embedding_model`. The cosine threshold defaults to 0.87 (empirically tuned for all-MiniLM-L6-v2). On match, the row with the earlier `created_at` is marked superseded. An exact-match short-circuit runs before the cosine scan so byte-identical inserts don't pay the vector cost.
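The match logic can be sketched as follows (an illustration of the described rules, not the shipped code; the byte-equality short-circuit runs before any vector math):

```rust
// Cosine similarity over two embedding vectors of equal length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// Hypothetical sketch of the dedup decision for two candidate rows that
// already share project, memory_type, and embedding_model.
fn is_duplicate(a_text: &str, b_text: &str, a_vec: &[f32], b_vec: &[f32]) -> bool {
    // Exact-match short-circuit: byte-identical inserts skip the vector cost.
    if a_text == b_text {
        return true;
    }
    cosine(a_vec, b_vec) >= 0.87
}
```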
Safety nets
- Prompt injection defense: the condensation prompt wraps memory content in `<<<MEMORY>>> ... <<<END>>>` and explicitly instructs the model to treat anything inside as data, not instructions. A single few-shot example anchors verbatim preservation of paths / numbers / dates. The response must be JSON (`{"condensed": "..."}`); non-JSON triggers a fallback to the raw memory.
- Length-ratio check: if the model's "condensed" output is longer than the input, it's rejected and the raw memory stays untouched.
- Refusal detection: responses matching `I cannot`, `I'm sorry, but`, `as a language model`, etc. fall back to the raw memory.
- Per-memory error containment: one bad memory can't halt the pass. Errors are logged and the orchestrator moves on.
- `--dry-run` writes nothing: row counts are identical before/after a dry-run pass.
- BEGIN IMMEDIATE transactions: every mutation runs inside a per-memory immediate transaction so concurrent `memory store` calls can't race.
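The refusal and length-ratio fallbacks above compose into a single accept-or-keep-raw decision; a minimal sketch (illustrative only, and the real refusal-pattern list is surely longer):

```rust
// Hypothetical refusal detector: matches a few common refusal prefixes,
// case-insensitively.
fn looks_like_refusal(s: &str) -> bool {
    let l = s.to_lowercase();
    ["i cannot", "i'm sorry, but", "as a language model"]
        .iter()
        .any(|p| l.starts_with(p))
}

// Keep the condensed text only if it is neither a refusal nor longer than
// the input; otherwise the raw memory stays untouched.
fn accept_condensed<'a>(raw: &'a str, condensed: &'a str) -> &'a str {
    if looks_like_refusal(condensed) || condensed.len() > raw.len() {
        raw
    } else {
        condensed
    }
}
```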
MCP tools
| Tool | Purpose |
|---|---|
| memory_store | Save memory with auto-embedding + BM25 indexing |
| memory_search | Hybrid BM25 + vector search, returns ranked results |
| memory_recall | Filter by project/agent/tags/type |
| memory_forget | Remove specific memories |
| memory_prune | Decay stale/low-access memories |
| memory_context | Return top-K relevant memories for a task description |
| memory_get | Fetch full content for one or more memory IDs (full UUID or 4+ char short prefix) |
| memory_projects | List distinct project idents with memory counts (spot alias mismatches) |
| memory_move | Reassign the project ident on one memory (by id) or in bulk (by from/to) |
| memory_copy | Duplicate memories under a new project ident; preserves content + embedding |
Memory types
| Type | Purpose |
|---|---|
| user | Facts about the user -- role, preferences, expertise |
| feedback | How to approach work -- corrections and confirmed approaches |
| project | Ongoing work context -- decisions, deadlines, constraints |
| reference | Pointers to external resources -- URLs, dashboards, systems |
How search works
Every query runs through two retrieval paths simultaneously:
- BM25 (FTS5) -- term-frequency keyword matching, great for exact names and patterns
- Vector (fastembed cosine similarity) -- semantic similarity, great for "I vaguely remember something about..."
Results are combined via Reciprocal Rank Fusion (k=60), which merges ranked lists without requiring score normalization. A memory that ranks well in both paths gets a strong combined score.
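The fusion step can be sketched directly from the RRF formula: each ranked list contributes `1 / (k + rank)` per document (rank starting at 1, `k = 60`), and the contributions are summed. This is an illustration of the technique, not the project's actual code:

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion: merge ranked id lists without score normalization.
// A document ranking well in several lists accumulates a strong combined score.
fn rrf(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            // enumerate() is 0-based, so rank + 1 is the 1-based rank.
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut out: Vec<_> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}
```

With `k = 60`, a document at rank 1 in one list scores 1/61; topping both lists roughly doubles that, which is why agreement between BM25 and the vector path dominates the final ordering.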
Design decisions
- SQLite is the source of truth. FTS5 handles full-text indexing within the same database file.
- Embeddings are brute-force cosine. For a personal memory system (<100K memories), this is fast enough and avoids ANN index complexity.
- Model loads lazily. Commands that don't need embeddings (e.g., `recall`, `forget --id`) skip the ~200ms model load.
- Access counts track usage. Every retrieval increments `access_count`, enabling `prune` to identify stale memories.
- All logging goes to stderr. Stdout is reserved for light-XML results (CLI) or JSON-RPC transport (MCP), so logging never pollutes either channel. MCP tool responses themselves are light-XML strings delivered as a single text content block.
- Global-first storage. `/opt/agentic/memory.db` is shared across all users/agents by default, with `~/.agentic/` as a user-local override.
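The global-first lookup can be sketched as a simple precedence rule, assuming the user-local copy wins when it exists (a hypothetical `resolve_db` helper; the real binary's resolution rules may include env overrides and differ in detail):

```rust
use std::path::PathBuf;

// Hypothetical sketch: prefer a user-local ~/.agentic/memory.db when it
// exists, otherwise fall back to the shared global store.
fn resolve_db(home: &str, user_db_exists: bool) -> PathBuf {
    if user_db_exists {
        PathBuf::from(home).join(".agentic").join("memory.db")
    } else {
        PathBuf::from("/opt/agentic/memory.db")
    }
}
```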
