Engram
A private, local memory layer MCP server for your LLMs
Installation
npx engram
Engram
Event-sourced memory system for AI agents. No LLM in the write path: just reliable episode storage with semantic search.
Why Engram?
Most AI memory systems couple write reliability to LLM availability by performing entity extraction at write time. Engram takes a different approach: store episodes reliably first, search them semantically, and defer any expensive derived structures (knowledge graphs, entity extraction) to an optional second layer.
The result: writes succeed whenever the database is up, search is fast, and you get a single portable binary with no runtime dependencies.
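The "store first, derive later" idea can be sketched in a few lines. This is an illustrative model only, not Engram's schema or code (Engram itself is written in Go): the `Episode`, `EpisodeLog`, and `build_word_index` names are hypothetical, and a toy word index stands in for the optional derived layer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Episode:
    """One stored memory. Field names are illustrative, not Engram's schema."""
    id: int
    content: str
    created_at: datetime


@dataclass
class EpisodeLog:
    """Append-only log: the write path only stores, it never calls a model."""
    episodes: list[Episode] = field(default_factory=list)

    def append(self, content: str) -> Episode:
        episode = Episode(len(self.episodes) + 1, content, datetime.now(timezone.utc))
        self.episodes.append(episode)
        return episode


def build_word_index(log: EpisodeLog) -> dict[str, set[int]]:
    """Derived structure: can be dropped and rebuilt from the log at any time."""
    index: dict[str, set[int]] = {}
    for episode in log.episodes:
        for word in episode.content.lower().split():
            index.setdefault(word, set()).add(episode.id)
    return index


log = EpisodeLog()
log.append("User prefers dark mode")
log.append("User works in Go")
index = build_word_index(log)
```

Because the index is computed purely from the log, losing or changing it never puts stored episodes at risk.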
Features
- Three search modes: find memories by meaning (vector), by exact words (keyword), or both at once (hybrid)
- Graceful fallback: keyword search works even when the embedding service is unavailable; hybrid degrades gracefully
- Fast queries: DuckDB HNSW indexing for sub-100ms vector search
- Zero external APIs: all embeddings generated locally via Ollama
- Single binary: portable across Linux, macOS, and Windows
- MCP native: integrates directly with Claude Desktop, Claude Code, and Cursor
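As a rough illustration of hybrid search with graceful fallback, the sketch below merges keyword and vector rankings and falls back to keyword-only results when the embedding service cannot be reached. The fusion method (reciprocal rank fusion), the `hybrid_search` signature, and the use of `ConnectionError` are assumptions made for the example; the README does not say how Engram combines or degrades results.

```python
from typing import Callable


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score each id by the sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, episode_id in enumerate(ranking, start=1):
            scores[episode_id] = scores.get(episode_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; break ties by id so the order is deterministic
    return sorted(scores, key=lambda eid: (-scores[eid], eid))


def hybrid_search(
    query: str,
    keyword_search: Callable[[str], list[str]],
    vector_search: Callable[[str], list[str]],
) -> list[str]:
    keyword_hits = keyword_search(query)
    try:
        vector_hits = vector_search(query)
    except ConnectionError:
        # Embedding service unreachable: degrade to keyword-only results
        return keyword_hits
    return rrf_merge([keyword_hits, vector_hits])


def keyword(query: str) -> list[str]:
    return ["a", "b"]


def vector(query: str) -> list[str]:
    return ["b", "c"]


def vector_down(query: str) -> list[str]:
    raise ConnectionError("embedding service unavailable")


merged = hybrid_search("dark mode", keyword, vector)         # ["b", "a", "c"]
degraded = hybrid_search("dark mode", keyword, vector_down)  # ["a", "b"]
```

Episode `b` ranks first in the merged list because it appears in both rankings.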
Quick Start
Prerequisites
- Ollama running locally (or remotely) with an embedding model
- Go 1.25+ (only if building from source)
Install
Option A: Download a pre-built binary
Download from the releases page for your platform:
| Platform | Binary |
|---|---|
| macOS (Apple Silicon) | engram-darwin-arm64 |
| macOS (Intel) | engram-darwin-amd64 |
| Linux (x86_64) | engram-linux-amd64 |
| Linux (ARM64) | engram-linux-arm64 |
| Windows | engram-windows-amd64.exe |
# macOS/Linux: make it executable
chmod +x engram-*
mv engram-* engram
Option B: Build from source
git clone https://github.com/OscillateLabsLLC/engram
cd engram
# Using just (recommended; install from https://github.com/casey/just)
just setup # install deps, pull embedding model, build
# Or manually
go build -o engram ./cmd/engram/main.go
Pull the embedding model
ollama pull nomic-embed-text
Run
Start the server:
engram serve
Engram starts on port 3490 and prints the SSE endpoint URL. All MCP clients connect to this single server -- no database locking conflicts. See docs/mcp-integration.md for instructions on running as a background service on macOS, Linux, and Windows.
Configuration
Configure via environment variables:
| Variable | Description | Default |
|---|---|---|
| DUCKDB_PATH | Path to DuckDB database file | ./engram.duckdb |
| OLLAMA_URL | Ollama API endpoint | http://localhost:11434 |
| EMBEDDING_MODEL | Embedding model name | nomic-embed-text |
| ENGRAM_PORT | Server port | 3490 |
| ENGRAM_SERVER_URL | Server URL (used by stdio proxy) | http://localhost:3490 |
See .env.example for a template.
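To make the defaults concrete, here is a small sketch of how a tool or script might resolve the same variables. It mirrors the table above only; the `EngramConfig` and `load_config` names are hypothetical, and this is not how Engram's own Go code loads its settings.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EngramConfig:
    duckdb_path: str
    ollama_url: str
    embedding_model: str
    port: int
    server_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> EngramConfig:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return EngramConfig(
        duckdb_path=env.get("DUCKDB_PATH", "./engram.duckdb"),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        port=int(env.get("ENGRAM_PORT", "3490")),
        server_url=env.get("ENGRAM_SERVER_URL", "http://localhost:3490"),
    )


defaults = load_config({})                          # nothing set: all defaults
custom = load_config({"ENGRAM_PORT": "4000"})       # override a single value
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the function easy to test.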
MCP Client Integration
Engram integrates with Claude Desktop, Claude Code, and Cursor via the Model Context Protocol (MCP).
Quick Setup
1. Start the server (see background service docs for persistent setup):
   engram serve
2. Connect your client. Most clients support SSE directly:
   Cursor (.cursor/mcp.json):
   {
     "mcpServers": {
       "engram-memory": {
         "url": "http://localhost:3490/mcp/sse"
       }
     }
   }
   Claude Desktop (stdio proxy, for clients that require stdio):
   {
     "mcpServers": {
       "engram-memory": {
         "command": "/absolute/path/to/engram",
         "args": ["stdio"],
         "env": {
           "ENGRAM_SERVER_URL": "http://localhost:3490"
         }
       }
     }
   }
For detailed integration instructions, available MCP tools, and troubleshooting, see docs/mcp-integration.md.
Docker & Deployment
Quick Start (Development)
# macOS/Windows
just docker-up
# Linux
just docker-up-linux
For detailed deployment instructions including Docker Compose, Kubernetes, and production configurations, see docs/deployment.md.
Cleanup Patterns
Agents clean up stale memories via update_episode. Engram intentionally does not expose a delete_episode MCP tool because permanent deletion is a deliberate human action, not something agents should do autonomously.
Soft-delete (reversible)
Set expired_at to a past timestamp. The episode is hidden from default search but remains in the store; recover it later by clearing expired_at.
{"tool": "update_episode", "id": "...", "expired_at": "2020-01-01T00:00:00Z"}
Demote (visible but filtered)
Replace the episode's tags to include a marker like deprecated or low-confidence. The episode stays in search results so nothing is lost, but callers can filter at query time.
{"tool": "update_episode", "id": "...", "tags": ["deprecated", "original-topic"]}
Scheduled expiration
Set expired_at to a future timestamp. The episode disappears from default search after that time with no further action.
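The three patterns reduce to two simple checks, sketched below. This is a reading of the behaviour described above rather than Engram's actual query logic: the `is_visible` and `without_tags` helpers are hypothetical, and treating an episode as expired at the exact `expired_at` instant is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def is_visible(expired_at: Optional[datetime], now: datetime) -> bool:
    """Default search shows an episode until its expired_at time has passed."""
    return expired_at is None or expired_at > now


def without_tags(episodes: list[dict], excluded: set[str]) -> list[dict]:
    """Query-time filter a caller might apply to skip demoted episodes."""
    return [e for e in episodes if not excluded & set(e["tags"])]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
soft_deleted = datetime(2020, 1, 1, tzinfo=timezone.utc)  # past timestamp
scheduled = datetime(2030, 1, 1, tzinfo=timezone.utc)     # future timestamp

visibility = {
    "active": is_visible(None, now),
    "soft_deleted": is_visible(soft_deleted, now),
    "scheduled": is_visible(scheduled, now),
}

episodes = [
    {"id": "1", "tags": ["original-topic"]},
    {"id": "2", "tags": ["deprecated", "original-topic"]},
]
current = without_tags(episodes, {"deprecated", "low-confidence"})
```

Clearing `expired_at` (setting it back to `None` here) makes a soft-deleted episode visible again, which is what makes the pattern reversible.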
Architecture
engram/
├── cmd/engram/          # Entry point (serve / stdio subcommands)
├── internal/
│   ├── api/             # HTTP + MCP SSE server
│   ├── db/              # DuckDB operations + VSS
│   ├── embedding/       # Ollama client
│   ├── mcp/             # MCP tool definitions
│   ├── models/          # Data models
│   └── proxy/           # stdio-to-SSE proxy
├── scripts/             # Build and test scripts
├── .github/workflows/   # CI/CD (build + release)
└── Dockerfile           # Container image
- Server-first: engram serve owns DuckDB exclusively, exposes MCP over SSE + REST API
- Thin stdio proxy: engram stdio bridges stdin/stdout to the server for clients that require stdio (e.g., Claude Desktop)
- DuckDB with VSS extension for vector similarity search (HNSW indexing)
- Ollama for local embedding generation (768-dimensional, nomic-embed-text)
For a deeper dive into the architecture, see docs/architecture.md.
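For readers who want to see what local embedding generation looks like on the wire, the sketch below builds a request against Ollama's `/api/embed` endpoint using only the Python standard library. It is an independent illustration, not Engram's Go client under internal/embedding, and the helper names `build_embed_request` and `embed` are hypothetical. Calling `embed` requires a running Ollama instance with the model pulled.

```python
import json
import urllib.request


def build_embed_request(
    text: str,
    ollama_url: str = "http://localhost:11434",
    model: str = "nomic-embed-text",
) -> urllib.request.Request:
    """Prepare a POST to Ollama's embed endpoint without sending it."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{ollama_url.rstrip('/')}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, **kwargs) -> list[float]:
    """Send the request and return the first embedding vector."""
    request = build_embed_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["embeddings"][0]


request = build_embed_request("User prefers dark mode")
```

With nomic-embed-text, the returned vector should have the 768 dimensions mentioned above.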
Design Principles
- Writes never fail (if the database is up)
- No LLM in the write path: embeddings only, and those are retryable
- Episode log is source of truth: everything else is derived
- Simple over clever: vector search covers 80% of use cases
- Portable: single binary, single database file
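One way to read "embeddings only, and those are retryable" is sketched below: the episode is stored first, the embedding is attempted afterwards, and any episode left without one is picked up by a later retry. This is an interpretation under stated assumptions, not Engram's implementation; `MemoryStore`, `retry_pending`, and the in-memory dictionary are all hypothetical stand-ins.

```python
from typing import Callable


class MemoryStore:
    """In-memory stand-in for the episode table; names are illustrative."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self._embed = embed
        self.rows: dict[int, dict] = {}

    def write(self, content: str) -> int:
        episode_id = len(self.rows) + 1
        # Store the episode first so the write does not depend on the embedder
        self.rows[episode_id] = {"content": content, "embedding": None}
        self._try_embed(episode_id)
        return episode_id

    def _try_embed(self, episode_id: int) -> bool:
        row = self.rows[episode_id]
        try:
            row["embedding"] = self._embed(row["content"])
            return True
        except ConnectionError:
            return False  # leave the embedding empty; a later retry can fill it

    def retry_pending(self) -> int:
        """Re-attempt embeddings for rows that are still missing one."""
        pending = [eid for eid, row in self.rows.items() if row["embedding"] is None]
        return sum(self._try_embed(eid) for eid in pending)


service = {"up": False}


def flaky_embedder(text: str) -> list[float]:
    # Toy embedder that fails while the simulated service is down
    if not service["up"]:
        raise ConnectionError("embedding service unavailable")
    return [float(len(text))]


store = MemoryStore(flaky_embedder)
episode_id = store.write("hello")   # succeeds even though embedding failed
service["up"] = True
filled = store.retry_pending()      # backfills the missing embedding
```

Until the retry succeeds, such an episode would still be reachable through keyword search, which matches the graceful-fallback behaviour listed under Features.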
Documentation
- MCP Integration Guide - Client setup, available tools, troubleshooting
- Deployment Guide - Docker Compose, Kubernetes, production deployment
- Architecture - Technical deep dive into system design
Testing
The project includes unit and integration tests:
# Run all tests
just test
# Run with coverage
just test-coverage
Contributing
See CONTRIBUTING.md for development setup, code style, and how to submit pull requests.
License
MIT
