OpenCortex
OpenCortex Memory Server — AI agent memory, automatic session capture via hooks, and multi-tenant isolation.
Persistent memory and context infrastructure for AI agents
Overview · Key Concepts · Architecture · Quick Start · Features · API · Repository · Chinese Documentation (中文文档)
What is OpenCortex
LLM agents forget. Session context, user preferences, design decisions, debugging history, and reusable workflows disappear unless they are stored outside the model window.
OpenCortex is the persistence layer for that problem. It combines layered memory storage, intent-aware recall planning, and retrieval tuned for agent workflows, then exposes the result through a single HTTP API backend.
In practice, OpenCortex is built for:
- cross-session memory and project context
- document and conversation ingestion
- retrieval that balances relevance, recency, feedback, and structure
- optional knowledge, insights, and skill-oriented services on the same substrate
- multi-tenant and project-scoped isolation via JWT-backed identity
Key Concepts
Three-layer memory
Each record is stored at multiple levels of detail:
| Layer | Role |
|---|---|
| L0 | Small abstract for cheap indexing and quick confirmation |
| L1 | Structured overview for most recall responses |
| L2 | Full content for deep inspection and audits |
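As a purely illustrative sketch, one record seen through the three layers might look like this (field names are hypothetical, not the server's actual schema):

```python
# Hypothetical record shape, for illustration only — not OpenCortex's schema.
memory_record = {
    "id": "mem_0042",
    "l0": "User prefers pytest over unittest.",   # small abstract: cheap indexing
    "l1": {                                       # structured overview: most recalls
        "topic": "testing",
        "preference": "pytest",
        "reason": "fixtures and parametrize support",
    },
    "l2": "Full conversation excerpt in which the user explained why ...",  # audits
}
```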
Explicit recall planning
OpenCortex does not treat every query as a generic vector search. Queries are classified, routed, and turned into a recall plan that decides whether recall should run, which context to search, and how much detail to return.
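Purely as a sketch of the idea, a recall plan could carry decisions like these (the class and field names are assumptions for illustration, not OpenCortex's actual types):

```python
from dataclasses import dataclass

@dataclass
class RecallPlan:
    """Hypothetical recall plan — illustrative only, not OpenCortex's schema."""
    should_recall: bool    # some turns (e.g. greetings) need no lookup at all
    scope: str             # which context to search, e.g. "project" or "global"
    detail_layer: str      # "L0", "L1", or "L2", matching the memory layers above
    max_results: int = 5   # how much detail to return

# A casual greeting might yield RecallPlan(should_recall=False, ...), while a
# debugging question might request L1 overviews scoped to the current project.
```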
Retrieval beyond embeddings
Search combines multiple signals instead of a single vector score. Depending on configuration and query type, ranking can use semantic search, lexical weighting, rerank gating, explicit feedback, hotness, and cone-style expansion around shared entities.
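As a minimal sketch of what blending several signals can look like, assuming normalized scores and made-up weights (OpenCortex's actual formula, rerank gating, and cone-style expansion are not reproduced here):

```python
def blended_score(semantic: float, lexical: float, feedback: float,
                  recency: float, weights=(0.5, 0.2, 0.2, 0.1)) -> float:
    """Combine normalized [0, 1] ranking signals into a single score.

    Illustrative only: the weights and signal set are assumptions, not
    OpenCortex's configuration.
    """
    w_sem, w_lex, w_fb, w_rec = weights
    return w_sem * semantic + w_lex * lexical + w_fb * feedback + w_rec * recency
```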
Context lifecycle
The central lifecycle endpoint is /api/v1/context. It drives three phases:
- prepare: plan recall and return memory or knowledge context
- commit: record the turn and feedback signals
- end: flush session state and optional post-processing
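A hedged sketch of driving the three phases over HTTP, assuming the phase name travels in the request body (the exact request schema is not documented on this page; see src/opencortex/http/ for the real contract):

```python
import os
import requests

BASE = "http://127.0.0.1:8921"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENCORTEX_TOKEN']}"}

# prepare: plan recall and fetch context for the upcoming turn.
# The "phase"/"query" payload keys are assumptions for illustration.
context = requests.post(f"{BASE}/api/v1/context",
                        json={"phase": "prepare", "query": "How do we run tests?"},
                        headers=HEADERS).json()

# ... the agent completes its turn using the returned context ...

# commit: record the turn and any feedback signals.
requests.post(f"{BASE}/api/v1/context",
              json={"phase": "commit", "turn_summary": "Explained the pytest setup.",
                    "feedback": "helpful"},
              headers=HEADERS)

# end: flush session state and trigger optional post-processing.
requests.post(f"{BASE}/api/v1/context", json={"phase": "end"}, headers=HEADERS)
```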
Shared memory substrate
Core memory, optional knowledge extraction, insights reporting, and the skill engine share the same storage, identity, and retrieval base instead of standing up separate systems.
Architecture Overview
AI client
-> HTTP API
-> FastAPI server
-> CortexMemory
-> ingest pipelines for memory / document / conversation
-> recall planning and retrieval
-> CortexFS + embedded Qdrant storage
-> optional knowledge / insights / skill services
-> optional web console at /console
At a high level, agents and client applications call the FastAPI backend directly over HTTP. The backend coordinates storage, recall, context lifecycle, and optional higher-level analysis services.
Quick Start
Requirements
- Python >= 3.10
- Node.js >= 18 (only for optional console development)
- uv
1. Install
git clone https://github.com/StardustVision/OpenCortex.git
cd OpenCortex
uv sync
2. Start the backend
uv run opencortex-server --host 127.0.0.1 --port 8921
Generate or inspect tokens when needed:
uv run opencortex-token generate
uv run opencortex-token list
3. Call the HTTP API
Create or reuse a token, then send requests directly to the running backend:
uv run opencortex-token generate
export OPENCORTEX_TOKEN="<token printed by opencortex-token>"
curl -H "Authorization: Bearer $OPENCORTEX_TOKEN" http://127.0.0.1:8921/api/v1/memory/health
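The same health check from Python, using the requests library (only the endpoint and auth header come from this page; the rest is standard requests usage):

```python
import os
import requests

resp = requests.get(
    "http://127.0.0.1:8921/api/v1/memory/health",
    headers={"Authorization": f"Bearer {os.environ['OPENCORTEX_TOKEN']}"},
)
resp.raise_for_status()
print(resp.json())
```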
The central agent lifecycle endpoint is /api/v1/context. Memory and content endpoints live under /api/v1/memory and /api/v1/content.
4. Docker option
docker compose up -d
docker compose logs -f
If built frontend assets are present, the console is served at http://127.0.0.1:8921/console.
Core Features
OpenCortex centers on:
- one memory substrate that handles short facts, documents, and conversations
- explicit lifecycle handling through /api/v1/context
- retrieval that can mix semantic, lexical, feedback, recency, and structure-aware signals
- optional knowledge, insights, and skill workflows on the same backend under request-scoped isolation
API Overview
OpenCortex exposes a broader API than this landing page lists. The most important areas are:
- Memory: persistent storage and retrieval for memories, documents, and conversations
- Context and session: the agent lifecycle centered on /api/v1/context
- Content and observability: layered content reads plus health or diagnostics surfaces
- Knowledge / insights / skills: optional higher-level workflows built on the same backend
- Auth / admin: identity, tokens, diagnostics, and administrative maintenance
Concrete next stops for route-level details:
- src/opencortex/http/
- src/opencortex/skill_engine/
- src/opencortex/insights/
Repository Layout
At the top level, the repository is organized around the core backend in src/opencortex/, the optional console in web/, automated verification in tests/, and supporting material under docs/, scripts/, and examples/.
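Rendered as a tree (top-level directories only, as listed above):

```
OpenCortex/
├── src/opencortex/   # core backend
├── web/              # optional console
├── tests/            # automated verification
├── docs/
├── scripts/
└── examples/
```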
Deep Dives
Testing
uv run --group dev pytest
Python Style Gate
OpenCortex uses a repo-level Python style gate based on ruff. The currently enforced subset is the practical baseline for the Google Python Style Guide in this repo: imports, naming, public docstrings, public type annotations, TODO format, exception hygiene, and obvious simplification rules.
The initial ratchet currently targets the transport-facing Python surface:
- src/opencortex/http/*.py
- src/opencortex/skill_engine/http_routes.py
Run it locally before shipping Python changes:
uv run --group dev ruff format --check .
uv run --group dev ruff check .
The gate intentionally starts on a bounded surface and uses a small set of temporary file-level ignores for legacy transition points. Both are debt to be burned down, not a safe place for new code.
Tech Stack
OpenCortex uses a Python/FastAPI backend, CortexFS plus embedded Qdrant for storage, and React/Vite for the optional console.
License
Apache-2.0
