Ket
Content-addressable substrate for multi-agent memory systems. BLAKE3 CAS, Merkle DAG, Dolt SQL, tree-sitter CDOM, MCP server, and Python bindings.
Ket $|\psi\rangle$
Purpose: Content-addressable substrate for multi-agent memory systems – BLAKE3-hashed, immutable storage with Merkle DAG provenance, scoring, and MCP integration. The infrastructure layer for externalizing LLM memory, lineage, and traversal control.
Every artifact (code, reasoning, scores) is BLAKE3-hashed, deduplicated, and stored in an immutable content-addressed store with a queryable SQL mirror powered by Dolt. Built for multi-agent workflows where provenance, lineage, and scoring matter.
Ket implements the substrate architecture described in A Content-Addressed Adaptive Knowledge Substrate for Distributed Epistemic Coordination (Joven, 2026) – a systems-layer approach to LLM reasoning failures that externalizes memory persistence, provenance, and traversal control into a deterministic, content-addressed infrastructure. The paper's core primitives (Merkle DAG nodes, depth scoring, tiered operations, delta chains, and fixed-point convergence) map directly to ket's crate architecture; see the paper's §9.2 for the full mapping.
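The core content-addressing idea can be sketched in a few lines. This is an illustrative stand-in, not ket's implementation: it uses Python's `hashlib.sha256` where ket uses BLAKE3, with a flat directory keyed by hash echoing the `.ket/cas/<hash>` layout described below.

```python
import hashlib
from pathlib import Path

def put(store: Path, data: bytes) -> str:
    """Store a blob under its content hash; identical bytes dedupe to one file."""
    cid = hashlib.sha256(data).hexdigest()  # stand-in for BLAKE3
    blob = store / cid
    if not blob.exists():  # same content, same CID: stored exactly once
        blob.write_bytes(data)
    return cid

def get(store: Path, cid: str) -> bytes:
    """Retrieve a blob and verify it still hashes to its own address."""
    data = (store / cid).read_bytes()
    assert hashlib.sha256(data).hexdigest() == cid, "integrity check failed"
    return data
```

Storing the same bytes twice returns the same CID and writes a single blob, which is the deduplication property the rest of the system builds on.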
Architecture
```
┌─────────────────────────────────────────────────────┐
│                       ket-cli                       │
│             22 commands, --json output              │
├──────────┬───────────┬───────────┬──────────────────┤
│ ket-mcp  │ ket-agent │ ket-score │     ket-cdom     │
│ 16 tools │   tasks   │  4 dims   │   tree-sitter    │
│ JSON-RPC │  routing  │ auto/peer │  Rust + Python   │
├──────────┼───────────┴───────────┴──────────────────┤
│ ket-opt  │ WQS binary search · tier allocation      │
│ calibrate│ Lagrangian relaxation · provenance       │
├──────────┴──────────────────────────────────────────┤
│                       ket-dag                       │
│          Merkle DAG · lineage · soft links          │
├──────────────────────────┬──────────────────────────┤
│         ket-cas          │         ket-sql          │
│  BLAKE3 flat-file blobs  │   Dolt versioned SQL     │
└──────────────────────────┴──────────────────────────┘
```
Dual storage model – CAS is the immutable source of truth; Dolt SQL is the queryable, versioned mirror. `ket repair` reconciles the two if they drift.
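The reconciliation idea behind `ket repair` can be sketched like this. It is a hypothetical illustration only: it walks the CAS directory (the source of truth) and re-inserts any missing or stale entry into a mirror, with a plain in-memory dict standing in for Dolt.

```python
from pathlib import Path

def repair(cas_dir: Path, mirror: dict) -> list:
    """Rebuild the queryable mirror from CAS blobs; CAS wins on any drift."""
    restored = []
    for blob in cas_dir.iterdir():
        cid = blob.name
        size = blob.stat().st_size
        if mirror.get(cid) != size:  # row missing or out of date
            mirror[cid] = size
            restored.append(cid)
    return restored
```

Because CAS blobs are immutable and self-addressed, the mirror can always be regenerated from them; the reverse direction only needs the blob bytes the SQL side has recorded.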
Workspace Crates
| Crate | Purpose |
|---|---|
| `ket-cas` | BLAKE3 content-addressed blob store (`.ket/cas/<hash>`) |
| `ket-dag` | Merkle DAG for provenance: parent chains, soft links, export/import bundles |
| `ket-sql` | Dolt SQL wrapper: 9 tables, versioned commits, lineage queries |
| `ket-mcp` | MCP server (stdio JSON-RPC) exposing 16 tools for Claude and other agents. Dolt is optional: CAS-only tools work without it. |
| `ket-agent` | Multi-agent orchestration: task lifecycle, subprocess spawning, context injection |
| `ket-score` | Scoring engine (correctness, efficiency, style, completeness) with auto-scoring via cargo build/test/clippy |
| `ket-opt` | WQS binary-search optimizer: Lagrangian relaxation for compute-tier allocation across DAG nodes |
| `ket-cdom` | Code Document Object Model: tree-sitter parsing for Rust and Python symbol extraction |
| `ket-cli` | CLI binary with 22 commands |
| `ket-py` | PyO3 Python bindings for CAS and DAG operations |
Getting Started
Three tiers – start minimal, add capabilities when you need them.
Tier 1: Just ket (no dependencies beyond Rust)
Everything you need for content-addressed agent memory: store, DAG, lineage, drift detection, MCP server. 13 of 16 MCP tools work at this tier.
```sh
# Build
cargo build --release

# Initialize a ket store
ket init

# Store a file and get its content ID
ket put myfile.rs

# Create a DAG node with lineage
ket dag create "initial reasoning" --kind reasoning --agent claude

# Track a file for drift detection
ket track add src/main.rs --agent claude
ket drift

# Start the MCP server (for Claude integration)
ket mcp

# Scan code symbols
ket scan src/lib.rs
ket cdom "parse"
```
Prerequisites: Rust (stable, 2021 edition). That's it.
Tier 2: Add Docker (scoring, tasks, SQL queries)
Docker runs Dolt in a container – you never install it directly. This unlocks scoring, task delegation, SQL queries, and calibration: all 16 MCP tools work at this tier.
```sh
# Start the Dolt sidecar
docker compose --profile full up -d dolt

# Sync CAS → SQL
ket repair

# Now scoring and tasks work
ket agent register claude
ket task create "Implement auth module" --by claude
ket scores auto <cid> --agent claude --dir .
ket calibrate run <root_cid> --max-cost 50
```
Prerequisites: Rust + Docker.
Tier 3: Full Docker (no Rust needed)
Run everything in containers. Good for trying ket without installing Rust.
```sh
# Build the image
docker compose build

# Initialize a ket store
docker compose run --rm ket init

# Store a file (mount your project into /data)
docker compose run --rm -v "$PWD":/data/project ket put /data/project/myfile.rs

# DAG operations
docker compose run --rm ket dag create "initial reasoning" --kind reasoning --agent claude
docker compose run --rm ket dag ls
docker compose run --rm ket status
```
The `/data` volume persists your ket store across runs. Add the Dolt sidecar with `docker compose --profile full up -d dolt`.
CLI Commands
Content Store
- `ket init` – Initialize `.ket` directory
- `ket put <file>` – Store file, return CID
- `ket get <cid>` – Retrieve content by CID
- `ket verify <cid>` – Check integrity
- `ket cas-stats` – Store size breakdown
- `ket gc [--delete]` – Garbage-collect orphan blobs
DAG & Lineage
- `ket dag create <content>` – Create node (`--kind`, `--parent`, `--agent`)
- `ket dag ls` / `ket dag show <cid>` – List/inspect nodes
- `ket dag lineage <cid>` – Trace ancestor chain
- `ket dag drift <path> <cid>` – Detect file drift
- `ket link create <from> <to> <rel>` – Soft links (supersedes, contradicts, etc.)
- `ket merge <content> --parents <cid>...` – Multi-parent merge node
- `ket dot [--root <cid>]` – Graphviz DOT visualization
- `ket export <cid>` / `ket import <file>` – Portable DAG bundles
Tasks & Agents
- `ket task create <title>` / `ket task ls` / `ket task assign <id> <agent>`
- `ket agent register <preset>` / `ket agent ls`
- `ket run <task-id>` – Execute task via agent subprocess
Code Intelligence
- `ket scan <path>` – Index symbols (Rust/Python)
- `ket cdom <query> [path]` – Search extracted symbols
- `ket search <text>` – Full-text content search
Scoring
- `ket scores add <cid>` – Record score (`--dim`, `--value`, `--agent`)
- `ket scores show <cid>` – Scores for a node
- `ket scores profile <agent>` – Agent averages
- `ket scores route <dim>` – Best agent for a dimension
- `ket scores auto <cid>` – Auto-score (build/test/clippy)
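The routing idea behind `ket scores route` is simple to sketch: pick the agent with the best historical average on one dimension. The flat `(agent, dimension, value)` rows below are a hypothetical shape for illustration, not ket's actual table layout.

```python
from collections import defaultdict

def route(scores, dim):
    """Return the agent with the highest mean historical score on `dim`.

    `scores` is a list of (agent, dimension, value) tuples -- a hypothetical
    flat shape standing in for ket's scores table.
    """
    by_agent = defaultdict(list)
    for agent, d, value in scores:
        if d == dim:
            by_agent[agent].append(value)
    # Rank agents by their mean score on the requested dimension.
    return max(by_agent, key=lambda a: sum(by_agent[a]) / len(by_agent[a]))
```

With enough scored history per agent, this turns past evaluation into a routing signal: the dimension you care about picks the agent.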
Calibration
- `ket calibrate run <root_cid>` – WQS-optimize tier allocation (`--max-cost`, `--max-depth`, `--max-tier3`)
- `ket calibrate inspect <cid>` – Read back a stored calibration
- `ket calibrate history <root_cid>` – All calibrations for a subtree
Operations
- `ket sql <query>` – Raw SQL against Dolt
- `ket log [-n <count>]` – Mutation log
- `ket status` – Health dashboard
- `ket history` / `ket diff` – Dolt version history
- `ket repair [--dry-run]` – Rebuild SQL from CAS
- `ket track add/ls/rm` – File drift tracking
Global Flags
- `--home <path>` – Override `.ket` directory (env: `KET_HOME`)
- `--json` – Structured JSON output
MCP Integration
Ket exposes 16 tools over MCP (Model Context Protocol) for agent integration.
Dolt is optional. The MCP server starts with CAS alone – 13 of 16 tools work without Dolt. Only scoring, tasks, and calibration require it.
| Tool | What it does | Needs Dolt? |
|---|---|---|
| `ket_put` | Store content, get CID | No |
| `ket_get` | Retrieve content by CID | No |
| `ket_verify` | Check CID integrity | No |
| `ket_dag_link` | Create DAG node with provenance | No |
| `ket_dag_lineage` | Trace ancestry chain | No |
| `ket_dag_ls` | List/filter DAG nodes | No |
| `ket_check_drift` | Detect file changes | No |
| `ket_search` | Full-text content search | No |
| `ket_status` | Substrate health dashboard | No (enhanced with Dolt) |
| `ket_store_reasoning` | Persist reasoning as DAG node | No |
| `ket_get_reasoning` | Retrieve reasoning with context | No |
| `ket_query_cdom` | Search code symbols | No |
| `ket_schema_stats` | Check schema dedup effectiveness | No |
| `ket_score` | Record quality scores | Yes |
| `ket_create_subtask` | Delegate work to agents | Yes |
| `ket_calibrate` | Optimize traversal tiers | Yes |
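Over the wire, each tool invocation is a standard MCP `tools/call` JSON-RPC 2.0 request on stdio. A sketch of building one message follows; the `content` argument name is an assumption for illustration – the real input schema comes from the server's `tools/list` response.

```python
import json

def tool_call(request_id, tool, arguments):
    """Build one newline-delimited JSON-RPC 2.0 message for an MCP tools/call."""
    msg = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(msg) + "\n"

# Hypothetical argument name -- verify against the server's tools/list output.
line = tool_call(1, "ket_put", {"content": "fn main() {}"})
```

An MCP client writes lines like this to the `ket mcp` process's stdin and reads responses from its stdout.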
Add to your Claude MCP config:
```json
{
  "mcpServers": {
    "ket": {
      "command": "ket",
      "args": ["mcp"]
    }
  }
}
```
Design Principles
- Content-addressed everything – same content = same CID. Deterministic, deduped, immutable.
- Provenance by default – every artifact links to its parents via the Merkle DAG.
- Dual storage – CAS for truth, SQL for queries. Either can reconstruct the other.
- Scoring gates routing – historical evaluation across four dimensions lets the system learn which agent is best at what.
- Drift detection – tracked files are re-hashed on demand to prevent stale reasoning context.
- Portable bundles – DAG subgraphs can be exported and imported across instances.
- Schema-linked, not schema-enforced – see below.
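Drift detection from the list above reduces to re-hashing and comparing. A minimal sketch of the idea, again with `hashlib.sha256` standing in for BLAKE3:

```python
import hashlib
from pathlib import Path

def check_drift(tracked):
    """Return tracked paths whose current bytes no longer hash to the recorded CID.

    `tracked` maps file path -> CID recorded when the file was last stored.
    """
    drifted = []
    for path, recorded_cid in tracked.items():
        current = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if current != recorded_cid:
            drifted.append(path)
    return drifted
```

Because the CID is the content, no timestamps or inotify watches are needed: a file has drifted exactly when it stops hashing to its recorded address.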
Schemas and Deduplication
Ket's CAS deduplicates by content hash: identical bytes produce identical CIDs. This is exact dedup, not semantic dedup. Two blobs that mean the same thing but differ by a trailing newline or key ordering get different CIDs.
Schemas address this without pulling ket above the intelligence line.
How it works
A schema is any blob you store in CAS – JSON Schema, a struct definition, a prompt template, a plain-English description. Ket does not interpret it. When creating a DAG node, you attach the schema's CID:
```sh
# Store your schema
SCHEMA_CID=$(ket put my_schema.json)

# Create a node whose output conforms to it
ket dag create "structured observation" \
  --kind memory --agent claude --schema $SCHEMA_CID
```
The `schema_cid` field on the node records what shape the output claims to have. That's the contract. Enforcement is your problem.
Why this helps dedup
Content-hash dedup works when semantically equivalent data produces byte-identical output. Schemas make this achievable by constraining the surface area: sorted keys, canonical formatting, required fields only, no optional noise. If agents conform to the schema, equivalent observations hash the same.
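The effect is easy to demonstrate: two semantically identical JSON objects hash differently until they are serialized canonically (sorted keys, fixed separators). A small sketch with `hashlib.sha256` standing in for BLAKE3:

```python
import hashlib
import json

a = {"name": "auth", "status": "done"}
b = {"status": "done", "name": "auth"}  # same meaning, different key order

def naive_cid(obj):
    """Hash the default serialization -- key order leaks into the CID."""
    return hashlib.sha256(json.dumps(obj).encode()).hexdigest()

def canonical_cid(obj):
    """Sorted keys + fixed separators: equivalent objects yield identical bytes."""
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()
```

`naive_cid(a)` and `naive_cid(b)` differ; `canonical_cid(a)` and `canonical_cid(b)` match. A schema that mandates canonical serialization is what turns content-hash dedup into semantic dedup in practice.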
What ket provides
- `schema_cid` on every node – optional, stored in the DAG, queryable.
- Schema stats – given a schema CID, count total nodes vs. unique output CIDs. If they're equal, the schema isn't producing dedup. If they diverge, it is. This is a pure hash-count query; no semantic understanding needed.
- Propagation via provenance – when a schema evolves, the DAG makes the blast radius visible. Query for all nodes with the old schema CID, trace their lineage, migrate explicitly.
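The dedup-effectiveness check is just hash counting. A sketch under one assumption: the DAG query has been flattened into a hypothetical list of `(schema_cid, output_cid)` pairs.

```python
def schema_stats(nodes, schema_cid):
    """Return (total nodes, unique output CIDs) for one schema.

    `nodes` is a hypothetical flat list of (schema_cid, output_cid) pairs.
    Equal counts mean the schema is producing no dedup; a gap means it is.
    """
    outputs = [out for schema, out in nodes if schema == schema_cid]
    return len(outputs), len(set(outputs))
```

If three nodes under a schema collapse to two unique output CIDs, the schema saved one blob – measurable without ever inspecting the content.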
What ket does NOT provide
- Schema validation at ingest. Ket won't reject non-conforming data.
- Schema format opinions. JSON Schema, protobuf, TOML – ket doesn't care.
- Semantic dedup. If two blobs mean the same thing but have different bytes, they get different CIDs. The schema's job is to prevent that from happening.
The substrate stays below the intelligence line. Schemas are a user-side discipline that makes the content-addressing layer work harder for you.
License
MIT
