Yantrikdb
Cognitive memory engine for AI agents – temporal decay, contradiction detection, autonomous consolidation, knowledge graph, ANN recall via HNSW. Embeddable Rust library with Python bindings; powers yantrikdb-server (HTTP gateway, MCP server, openraft cluster). AGPL.
YantrikDB – A Cognitive Memory Engine for Persistent AI Systems
The memory engine for AI that actually knows you.
Get Started in 60 Seconds
For AI agents (MCP – works with Claude, Cursor, Windsurf, Copilot)
pip install yantrikdb-mcp
Add to your MCP client config:
{
"mcpServers": {
"yantrikdb": {
"command": "yantrikdb-mcp"
}
}
}
That's it. The agent auto-recalls context, auto-remembers decisions, and auto-detects contradictions – no prompting needed. See yantrikdb-mcp for full docs.
As a Python library
pip install yantrikdb
The engine ships a default embedder (potion-base-2M, ~7 MB, distilled
from BGE-base-en-v1.5) – record_text() / recall_text() work out of
the box. No sentence-transformers install. No first-run model
download. No ONNX runtime. Just one pip install.
import yantrikdb
# Default: bundled embedder, dim=64. Just works.
db = yantrikdb.YantrikDB.with_default("memory.db")
db.record("Alice is the engineering lead", importance=0.8, domain="people")
db.record("Project deadline is March 30", importance=0.9, domain="work")
db.record("User prefers dark mode", importance=0.6, domain="preference")
results = db.recall("who leads the team?", top_k=3)
# → [{"text": "Alice is the engineering lead", "score": 1.0}, ...]
db.relate("Alice", "Engineering", "leads")
db.get_edges("Alice")
db.think() # consolidate, detect conflicts, mine patterns
db.close()
Want higher-quality embeddings?
Three opt-in upgrade paths, in increasing weight:
# 1. Larger bundled variant – downloads on first call, caches under
# your user data dir. Self-hosted from yantrikos/yantrikdb-models;
# no HuggingFace dependency, no rate limits.
db = yantrikdb.YantrikDB("memory.db", embedding_dim=256)
db.set_embedder_named("potion-base-8M") # ~28 MB, ~92% MiniLM
# or: db.set_embedder_named("potion-base-32M") # ~121 MB, ~95% MiniLM
# 2. Bring your own embedder (sentence-transformers, fastembed, custom).
from sentence_transformers import SentenceTransformer
db = yantrikdb.YantrikDB("memory.db", embedding_dim=384)
db.set_embedder(SentenceTransformer("all-MiniLM-L6-v2"))
# 3. Slim build (no bundled embedder, must set_embedder yourself).
# For deployments where the ~7 MB bundle is intolerable.
# Rust: yantrikdb = { version = "0.7", default-features = false }
| Path | Quality vs MiniLM | Size on disk | Install network |
|---|---|---|---|
| Bundled default (with_default) | ~89% | ~7 MB (bundled) | none |
| set_embedder_named("potion-base-8M") | ~92% | ~28 MB (cached) | first call only |
| set_embedder_named("potion-base-32M") | ~95% | ~121 MB (cached) | first call only |
| set_embedder(MiniLM) | 100% (baseline) | ~80 MB | sentence-transformers' own download |
As a Rust crate
[dependencies]
yantrikdb = "0.7"
# Want set_embedder_named() for runtime model upgrades?
# yantrikdb = { version = "0.7", features = ["embedder-download"] }
# Slim build (no bundled embedder, no network code path):
# yantrikdb = { version = "0.7", default-features = false }
The Problem
Current AI memory is:
Store everything → Embed → Retrieve top-k → Inject into context → Hope it helps.
That's not memory. That's a search engine with extra steps.
Real memory is hierarchical, compressed, contextual, self-updating, emotionally weighted, time-aware, and predictive. YantrikDB is built for that.
Why Not Existing Solutions?
| Solution | What it does | What it lacks |
|---|---|---|
| Vector DBs (Pinecone, Weaviate) | Nearest-neighbor lookup | No decay, no causality, no self-organization |
| Knowledge Graphs (Neo4j) | Structured relations | Poor for fuzzy memory, not adaptive |
| Memory Frameworks (LangChain, Mem0) | Retrieval wrappers | Not a memory architecture – just middleware |
| File-based (CLAUDE.md, memory files) | Dump everything into context | O(n) token cost, no relevance filtering |
Benchmark: Selective Recall vs. File-Based Memory
| Memories | File-Based | YantrikDB | Token Savings | Precision |
|---|---|---|---|---|
| 100 | 1,770 tokens | 69 tokens | 96% | 66% |
| 500 | 9,807 tokens | 72 tokens | 99.3% | 77% |
| 1,000 | 19,988 tokens | 72 tokens | 99.6% | 84% |
| 5,000 | 101,739 tokens | 53 tokens | 99.9% | 88% |
At 500 memories, file-based exceeds 32K context windows. At 5,000, it doesn't fit in any context window – not even 200K. YantrikDB stays at ~70 tokens per query. Precision improves with more data – the opposite of context stuffing.
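The Token Savings column follows directly from the measured counts; a quick sanity check, with the row values copied from the table above:

```python
# Recompute the Token Savings column from the measured token counts.
rows = [(100, 1770, 69), (500, 9807, 72), (1000, 19988, 72), (5000, 101739, 53)]
for n, file_tokens, ydb_tokens in rows:
    savings = 100 * (1 - ydb_tokens / file_tokens)
    print(f"{n:>5} memories: {savings:.1f}% fewer tokens")
```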
Architecture
Design Principles
- Embedded, not client-server – single file, no server process (like SQLite)
- Local-first, sync-native – works offline, syncs when connected
- Cognitive operations, not SQL – record(), recall(), relate(), not SELECT
- Living system, not passive store – does work between conversations
- Thread-safe – Send + Sync with internal Mutex/RwLock, safe for concurrent access
Five Indexes, One Engine
+--------------------------------------------------------+
|                    YantrikDB Engine                    |
|                                                        |
|   +----------+----------+----------+----------+        |
|   |  Vector  |  Graph   | Temporal |  Decay   |        |
|   |  (HNSW)  |(Entities)| (Events) |  (Heap)  |        |
|   +----------+----------+----------+----------+        |
|   +----------+                                         |
|   | Key-Value|  WAL + Replication Log (CRDT)           |
|   +----------+                                         |
+--------------------------------------------------------+
- Vector Index (HNSW) – semantic similarity search across memories
- Graph Index – entity relationships, profile aggregation, bridge detection
- Temporal Index – time-aware queries ("what happened Tuesday", "upcoming deadlines")
- Decay Heap – importance scores that degrade over time, like human memory
- Key-Value Store – fast facts, session state, scoring weights
Decoupled Write Path (v0.6.6+)
The vector index is structured as a two-tier LSM: a small mutable
delta and an immutable HNSW cold tier swapped atomically via
ArcSwap. Foreground writes only touch the delta (brief lock,
O(1) push); HNSW work amortizes on a dedicated compactor thread.
This is what eliminated the production wedge where sustained writes
starved readers – see CONCURRENCY.md and
docs/decoupled_write_path_rfc.md.
flowchart LR
subgraph CLIENT["Caller"]
C1["record / record_with_rid"]
C2["recall / recall_with_seq"]
end
subgraph FG["Foreground – P1, brief locks only"]
F1["assign_seq<br/>vec_seq.fetch_add<br/>(or fetch_max for cluster seq)"]
F2["DeltaIndex.append<br/>brief RwLock<Vec> push"]
F3["bump_visible_seq<br/>DashMap + AtomicU64<br/>(lock-free)"]
F4["log_op β SQLite WAL"]
end
subgraph IDX["DeltaIndex (per engine)"]
D1[("delta<br/>RwLock<Vec<DeltaEntry>><br/>cap = delta_max (256)")]
D2[("cold<br/>ArcSwap<HnswIndex><br/>lock-free read")]
end
subgraph BG["Background – P3, dedicated threads"]
B1["Compactor (1s tick)<br/>fires when delta past half-cap<br/>OR oldest entry > max_dirty_age"]
B2["Materializer pool<br/>N = cores / 2<br/>drains pending oplog ops"]
end
subgraph STORE["SQLite (WAL mode, single file)"]
S1["memories"]
S2["oplog"]
S3["entity_edges, sessions, ..."]
end
C1 --> F1
F1 --> F2
F2 --> D1
F1 --> F3
F1 --> F4
F4 --> S2
C2 -.->|"optional<br/>wait_for_visible_seq"| F3
C2 --> D1
C2 --> D2
B1 -->|"seal + clone + ArcSwap.store"| D1
B1 --> D2
B2 --> S2
B2 --> S1
B2 --> S3
The structural invariant. Foreground (P1) and background (P3) do
not share a lock primitive that is held for non-O(1) work. The cold
tier is read lock-free via ArcSwap; the delta's RwLock is held
for the O(1) push only. This is what makes "no single background
task can wedge reads, writes, or recovery" enforceable – see
CONCURRENCY.md Rules 2 and 3 for the names and
failure modes if violated.
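The delta/cold discipline can be sketched in a few lines of Python. This is an illustrative toy, not the engine's API: the real implementation uses RwLock&lt;Vec&lt;DeltaEntry&gt;&gt; and ArcSwap&lt;HnswIndex&gt;; here a plain list and an atomically rebound frozenset stand in, and compaction runs inline instead of on a compactor thread.

```python
import threading

class TwoTierIndex:
    """Illustrative sketch of the delta/cold split (names invented).
    The lock is held only for O(1) work; the cold tier is an
    immutable object swapped by rebinding, so readers never lock it."""

    def __init__(self, delta_max=256):
        self.delta_max = delta_max
        self._lock = threading.Lock()   # held only for the O(1) push / seal
        self._delta = []                # small mutable tier
        self._cold = frozenset()        # immutable tier, lock-free to read

    def append(self, item):
        with self._lock:                # foreground: brief lock, O(1) work
            self._delta.append(item)
            hot = len(self._delta) >= self.delta_max // 2
        if hot:
            self.compact()              # real engine: compactor thread does this

    def compact(self):
        with self._lock:                # seal: swap the delta out under the lock
            sealed, self._delta = self._delta, []
        merged = self._cold | frozenset(sealed)  # heavy work, no lock held
        self._cold = merged             # atomic rebind ~ ArcSwap.store

    def contains(self, item):           # readers check both tiers
        return item in self._cold or item in self._delta
```

The point of the shape: no code path holds the lock while doing work proportional to index size, so sustained appends cannot starve readers.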
Cluster Mode (RFC 010 + Phase 6 RYW)
For multi-node deployments, yantrikdb-server
wraps the engine with openraft
for leader-elected replication. The four cluster-mutation primitives
take the openraft commit-log index as their seq, so all nodes
agree on a single global monotonic sequence – read-your-writes works
across the cluster, not just within a node.
flowchart LR
L["Leader<br/>HTTP request"]
LR["Leader engine<br/>record_with_rid(seq=Some(log_idx))"]
OR["openraft<br/>commit log"]
F1["Follower 1 applier<br/>record_with_rid(seq=Some(log_idx))"]
F2["Follower 2 applier<br/>record_with_rid(seq=Some(log_idx))"]
R["Reader on any node<br/>recall_with_seq(min_seq=log_idx)"]
L --> LR
LR --> OR
OR -->|replicate + apply| F1
OR -->|replicate + apply| F2
F1 -.->|"visible_seq[ns] reaches log_idx"| R
F2 -.->|"visible_seq[ns] reaches log_idx"| R
LR -.->|"visible_seq[ns] reaches log_idx"| R
Each record_with_rid / tombstone_with_rid /
upsert_entity_edge_with_id / delete_entity_edge_with_id accepts
an optional seq: Option<u64>. Single-node callers pass None and
the engine allocates; cluster appliers pass Some(commit_log_index)
and the engine ratchets vec_seq up to at least that value via
fetch_max. After apply, visible_seq[namespace] reaches the
log index, so any subsequent recall_with_seq(min_seq=N) blocks
just long enough for the local node to have applied through index
N – and no longer.
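The seq handshake can be modeled with a condition variable. A toy sketch, with invented names: apply(seq=None) is local allocation, apply(seq=log_idx) is a cluster applier ratcheting forward, and wait_for() stands in for recall_with_seq blocking until the node has applied through min_seq.

```python
import threading

class SeqRatchet:
    """Sketch of the seq ratchet described above (names illustrative).
    The counter only moves forward (fetch_max semantics), so a stale
    commit-log index can never rewind visibility."""

    def __init__(self):
        self._cv = threading.Condition()
        self._visible = 0

    def apply(self, seq=None):
        with self._cv:
            if seq is None:
                self._visible += 1                       # local allocation
            else:
                self._visible = max(self._visible, seq)  # ratchet, never back
            self._cv.notify_all()
            return self._visible

    def wait_for(self, min_seq, timeout=None):
        with self._cv:   # blocks until applied through min_seq, then returns
            return self._cv.wait_for(lambda: self._visible >= min_seq, timeout)
```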
Memory Types (Tulving's Taxonomy)
| Type | What it stores | Example |
|---|---|---|
| Semantic | Facts, knowledge | "User is a software engineer at Meta" |
| Episodic | Events with context | "Had a rough day at work on Feb 20" |
| Procedural | Strategies, what worked | "Deploy with blue-green, not rolling update" |
All memories carry importance, valence (emotional tone), domain, source, certainty, and timestamps – used in a multi-signal scoring function that goes far beyond cosine similarity.
Key Capabilities
Relevance-Conditioned Scoring
Not just vector similarity. Every recall combines:
- Semantic similarity (HNSW) – what's topically related
- Temporal decay – recent memories score higher
- Importance weighting – critical decisions beat trivia
- Graph proximity – entity relationships boost connected memories
- Retrieval feedback – learns from past recall quality
Weights are tuned automatically from usage patterns.
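One way such a blend can look, as a minimal sketch: the weights, the exponential half-life, and the linear combination below are assumptions for illustration, not the engine's actual formula.

```python
def recall_score(similarity, age_days, importance, graph_proximity, feedback,
                 weights=(0.50, 0.20, 0.15, 0.10, 0.05), half_life_days=30.0):
    """Illustrative multi-signal blend (weights and half-life invented).
    The real engine tunes its weights from usage patterns."""
    recency = 0.5 ** (age_days / half_life_days)   # temporal decay signal
    signals = (similarity, recency, importance, graph_proximity, feedback)
    return sum(w * s for w, s in zip(weights, signals))
```

With numbers like these, a fresh high-importance memory can outrank a two-month-old memory that is slightly more similar – which is the behavior pure cosine similarity cannot express.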
Conflict Detection & Resolution
When memories contradict, YantrikDB doesn't guess – it creates a conflict segment:
"works at Google" (recorded Jan 15) vs. "works at Meta" (recorded Mar 1)
→ Conflict: identity_fact, priority: high, strategy: ask_user
Resolution is conversational: the AI asks naturally, not programmatically.
Semantic Consolidation
After many conversations, memories pile up. think() runs:
- Consolidation – merge similar memories, extract patterns
- Conflict scan – find contradictions across the knowledge base
- Pattern mining – cross-domain discovery ("work stress correlates with health entries")
- Trigger evaluation – proactive insights worth surfacing
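The consolidation step can be pictured as a greedy dedup pass. A toy sketch only: the real engine compares embeddings; word-set overlap (Jaccard) stands in here, and the threshold is invented.

```python
def consolidate(memories, threshold=0.8):
    """Toy consolidation pass: drop memories that are near-duplicates
    of one already kept. Jaccard word overlap stands in for embedding
    similarity; the real engine also extracts patterns when merging."""
    def overlap(a, b):
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / len(sa | sb)
    kept = []
    for text in memories:
        if all(overlap(text, k) < threshold for k in kept):
            kept.append(text)
    return kept
```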
Proactive Triggers
The engine generates triggers when it detects something worth reaching out about:
- Memory conflicts needing resolution
- Approaching deadlines (temporal awareness)
- Patterns detected across domains
- High-importance memories about to decay
- Goal tracking ("how's the marathon training?")
Every trigger is grounded in real memory data – not engagement farming.
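The "approaching deadlines" trigger, for instance, is just a horizon filter over due dates. A minimal sketch; the field names ("text", "due") and the dict shape are invented for illustration, not the engine's schema.

```python
from datetime import date, timedelta

def deadline_triggers(memories, horizon_days=7, today=None):
    """Toy sketch of the approaching-deadlines trigger: surface any
    memory whose due date falls inside the horizon window."""
    today = today or date.today()
    horizon = today + timedelta(days=horizon_days)
    return [m["text"] for m in memories
            if m.get("due") and today <= m["due"] <= horizon]
```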
Multi-Device Sync (CRDT)
Local-first with append-only replication log:
- CRDT merging – graph edges, memories, and metadata merge without conflicts
- Vector indexes rebuild locally – raw memories sync, each device rebuilds HNSW
- Forget propagation – tombstones ensure forgotten memories stay forgotten
- Conflict detection – contradictions across devices are flagged for resolution
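The merge semantics can be sketched as an op-log union. A toy only: ops here are (op_id, kind, key, value) tuples deduplicated by id and replayed in id order, so both peers converge on the same state and tombstones permanently remove forgotten keys; the real op format and ordering (per the replication log) are richer than this.

```python
def merge_ops(local_ops, remote_ops):
    """Toy append-only log merge: union by op_id, replay in op_id
    order. Merging is commutative, and a tombstone always wins over
    the record it forgets, so forgotten memories stay forgotten."""
    ops = {op[0]: op for op in local_ops}
    ops.update({op[0]: op for op in remote_ops})
    state = {}
    for op_id, kind, key, value in sorted(ops.values()):
        if kind == "record":
            state[key] = value
        elif kind == "tombstone":
            state.pop(key, None)
    return state
```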
Sessions & Temporal Awareness
sid = db.session_start("default", "claude-code")
db.record("decided to use PostgreSQL") # auto-linked to session
db.record("Alice suggested Redis for caching")
db.session_end(sid)
# → computes: memory_count, avg_valence, topics, duration
db.stale(days=14) # high-importance memories not accessed recently
db.upcoming(days=7) # memories with approaching deadlines
Full API
| Operation | Methods |
|---|---|
| Core | record, record_batch, recall, recall_with_response, recall_refine, forget, correct |
| Knowledge Graph | relate, get_edges, search_entities, entity_profile, relationship_depth, link_memory_entity |
| Cognition | think, get_patterns, scan_conflicts, resolve_conflict, derive_personality |
| Triggers | get_pending_triggers, acknowledge_trigger, deliver_trigger, act_on_trigger, dismiss_trigger |
| Sessions | session_start, session_end, session_history, active_session, session_abandon_stale |
| Temporal | stale, upcoming |
| Procedural | record_procedural, surface_procedural, reinforce_procedural |
| Lifecycle | archive, hydrate, decay, evict, list_memories, stats |
| Sync | extract_ops_since, apply_ops, get_peer_watermark, set_peer_watermark |
| Maintenance | rebuild_vec_index, rebuild_graph_index, learned_weights |
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Core language | Rust | Memory safety, no GC, ideal for embedded engines |
| Architecture | Embedded (like SQLite) | No server overhead, sub-ms reads, single-tenant |
| Bindings | Python (PyO3), TypeScript | Agent/AI layer integration |
| Storage | Single file per user | Portable, backupable, no infrastructure |
| Sync | CRDTs + append-only log | Conflict-free for most operations, deterministic |
| Thread safety | Mutex/RwLock, Send+Sync | Safe concurrent access from multiple threads |
| Query interface | Cognitive operations API | Not SQL – designed for how agents think |
Ecosystem
| Package | What | Install |
|---|---|---|
| yantrikdb | Rust engine | cargo add yantrikdb |
| yantrikdb | Python bindings (PyO3) | pip install yantrikdb |
| yantrikdb-mcp | MCP server for AI agents | pip install yantrikdb-mcp |
Roadmap
- V0 – Embedded engine, core memory model (record, recall, relate, consolidate, decay)
- V1 – Replication log, CRDT-based sync between devices
- V2 – Conflict resolution with human-in-the-loop
- V3 – Proactive cognition loop, pattern detection, trigger system
- V4 – Sessions, temporal awareness, cross-domain pattern mining, entity profiles
- V5 – Multi-agent shared memory, federated learning across users
Worked example: Wirecard (RFC 008 substrate – with honest limits)
For nearly a decade, Wirecard's filings and EY's audit attested to β¬1.9B in Philippine escrow accounts. In June 2020 both banks and the central bank formally denied the accounts existed.
When the source_lineage fields are hand-populated – EY as [wirecard, ey] to capture audit dependence on Wirecard-provided documents, BSP as [bsp, bpi, bdo] to capture restatement of the commercial banks – RFC 008's β operator discounts the dependent claims, and the contest operator's temporal split distinguishes present-tense contradictions from historical state changes. On this hand-populated data, the substrate produces useful annotations.
Honest limits (surfaced by Phase 2 empirical testing, Apr 2026):
- On naturalistic evidence where a real agent populates the fields, the substrate's gates don't reliably fire. Cases B and C of the Phase 2 eval need an extractor/canonicalizer (not yet built) to work; Case A exposed that β is mathematically incapable of flipping decisions at realistic N, regardless of coefficient tuning.
- Current claim: a structured schema for evidence provenance/temporal/conflict annotation, useful for audit and inspection. The dependence-discount operator works on curated inputs but needs replacement before it can drive decisions.
- Not a current claim: "decision-improvement substrate for AGI-capable agents." That framing is withdrawn pending RFC 009.
See docs/showcase/wirecard.md for the full walkthrough including the Phase 2 negative result and the gold-state ablation that partitioned operator failure from extraction failure. Run the hand-populated demonstration directly:
cargo run --example showcase_wirecard
Research & Publications
- U.S. Patent Application 19/573,392 (March 2026): "Cognitive Memory Database System with Relevance-Conditioned Scoring and Autonomous Knowledge Management"
- Zenodo: YantrikDB: A Cognitive Memory Engine for Persistent AI Systems
Author
Pranab Sarkar – ORCID · LinkedIn · developer@pranab.co.in
License
AGPL-3.0. See LICENSE for the full text.
The MCP server is MIT-licensed – using the engine via the MCP server does not trigger AGPL obligations on your code.
