Openclaude MCP
FastMCP 3.2 server plus webapp for using the leaked Claude Code locally with Ollama and open-weight models. Kairos and cat included.
OpenClaude MCP Server
A high-performance control plane for Ollama-based local LLM sessions and background memory consolidation. Optimized for RTX 4090 environments and large-context model orchestration.
🚀 What is this?
If you need a reliable way to manage local LLM sessions, monitor VRAM usage, and maintain a structured memory between tasks without manual overhead—this is for you.
- Background Memory (KAIROS): Automatically consolidates session data into a persistent MEMORY.md using asynchronous background cycles.
- Ollama Control Plane: Comprehensive toolset for model management, health checks, and high-speed local inference.
- Hybrid Planning (ULTRAPLAN): Optional handoff to Claude Opus for complex reasoning, while maintaining 100% local tool execution.
- Fleet Observability: Integrated React dashboard for real-time monitoring of sessions, models, and memory status.
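The background memory idea in the list above can be pictured with a small asyncio loop. This is only an illustrative sketch, not the KAIROS implementation: the `MemoryConsolidator` class, its method names, the 30-second interval, and the bullet format written to the file are all assumptions made for the example.

```python
import asyncio
from pathlib import Path
from typing import List, Optional


class MemoryConsolidator:
    """Hypothetical stand-in for a KAIROS-style background cycle."""

    def __init__(self, memory_file: Path, interval_s: float = 30.0) -> None:
        self.memory_file = memory_file
        self.interval_s = interval_s  # assumed cycle length, not a documented value
        self._pending: List[str] = []
        self._task: Optional[asyncio.Task] = None

    def record(self, note: str) -> None:
        """Buffer a session note until the next cycle."""
        self._pending.append(note)

    def flush(self) -> int:
        """Append buffered notes to the memory file; return how many were written."""
        if not self._pending:
            return 0
        notes, self._pending = self._pending, []
        with self.memory_file.open("a", encoding="utf-8") as fh:
            for note in notes:
                fh.write(f"- {note}\n")
        return len(notes)

    async def _run(self) -> None:
        # Sleep, then flush, forever; cancellation ends the loop.
        while True:
            await asyncio.sleep(self.interval_s)
            self.flush()

    def enable(self) -> None:
        """Start the background loop (call from inside a running event loop)."""
        if self._task is None:
            self._task = asyncio.create_task(self._run())

    async def disable(self) -> None:
        """Stop the loop and write out anything still buffered."""
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
        self.flush()
```

In this sketch `enable()` and `disable()` loosely mirror the `kairos_enable` and `kairos_disable` tools, and the final flush in `disable()` keeps notes from being lost when the cycle is halted.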
Quick Start
# Install dependencies
.\setup.ps1
# Start the MCP server + webapp
.\start.ps1
- SSE Endpoint: http://localhost:10932/sse
- Fleet Dashboard: http://localhost:10933
- Health API: http://localhost:10932/api/health
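Once the server is running, the Health API endpoint can be probed from a script. The sketch below uses only the Python standard library. The URL comes from the list above, but the shape of the response is not documented here, so the helper (a hypothetical `fetch_health`) returns whatever the server sends: parsed JSON when possible, raw text otherwise.

```python
import json
import urllib.request

BASE_URL = "http://localhost:10932"  # port taken from the Quick Start list

# Bypass any system proxy settings: the server is local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_health(base_url: str = BASE_URL, timeout_s: float = 2.0):
    """Return (ok, body). body is parsed JSON if possible, else the raw text."""
    url = f"{base_url.rstrip('/')}/api/health"
    try:
        with _opener.open(url, timeout=timeout_s) as resp:
            raw = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Refused connections, timeouts, and HTTP error statuses
        # (URLError and HTTPError) all derive from OSError.
        return False, None
    try:
        return True, json.loads(raw)
    except json.JSONDecodeError:
        return True, raw
```

Calling `fetch_health()` before opening the dashboard is a quick way to confirm that `.\start.ps1` actually brought the server up. A result of `(False, None)` means the endpoint was unreachable or returned an error status.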
Available Tools
| Tool | Action |
|---|---|
| start_session | Initialize a new Ollama session |
| send_prompt | Execute a prompt in an active session |
| kairos_enable | Activate background memory consolidation |
| kairos_disable | Halt background memory consolidation |
| list_models | Inventory of available Ollama models |
| model_status | Check VRAM and load status |
| ultraplan | Hybrid cloud/local planning cycle |
| fleet_status | Global health of the instance |
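MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the request envelope only. The tool names come from the table, but the argument names used in the usage note (`session_id`, `prompt`) are hypothetical, since the tools' parameters are not listed here.

```python
import itertools
import json
from typing import Optional

_request_ids = itertools.count(1)  # JSON-RPC ids must be unique per session


def build_tool_call(name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP tools/call request for the given tool name."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)
```

For example, `build_tool_call("list_models")` produces a request with empty arguments, and `build_tool_call("send_prompt", {"session_id": "demo", "prompt": "Hello"})` shows where tool parameters would go, assuming those parameter names. A real client also performs the MCP `initialize` handshake before calling tools, and an MCP client library normally handles that along with the transport.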
Architecture
OpenClaude operates as a unified REST bridge for FastMCP, providing a robust integration layer between local LLM runtimes and agentic workflows. It is hardened for Windows environments with uv dependency management and subprocess isolation.
- Memory Management — KAIROS implementation details
- Hybrid Planning — ULTRAPLAN handoff logic
- Reliability & Hardening — Process management and stability
Prerequisites
- Ollama (running locally)
- Node.js (v20+)
- Python 3.13+ with uv
- Local model pulled (e.g., gemma2, llama3.3)
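A quick way to check these prerequisites before running the setup script is a short preflight. This is an illustrative sketch. It assumes the executables are named `ollama`, `node`, and `uv` on the PATH, and it only checks that they are present. It does not verify the Node.js version, that Ollama is running, or that a model has been pulled.

```python
import shutil
import sys

REQUIRED_COMMANDS = ("ollama", "node", "uv")  # assumed executable names


def check_prerequisites(min_python=(3, 13)) -> dict:
    """Map each prerequisite to True/False based on a simple presence check."""
    report = {cmd: shutil.which(cmd) is not None for cmd in REQUIRED_COMMANDS}
    # Checks the interpreter running this script, which may differ from the one uv selects.
    report["python"] = sys.version_info >= min_python
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```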
🛡️ Industrial Quality Stack
This project adheres to SOTA 14.1 industrial standards for high-fidelity agentic orchestration:
- Python (Core): Ruff for linting and formatting. Zero-tolerance for print statements in core handlers (T201).
- Webapp (UI): Biome for sub-millisecond linting. Strict noConsoleLog enforcement.
- Protocol Compliance: Hardened stdout/stderr isolation to ensure crash-resistant JSON-RPC communication.
- Automation: Justfile recipes for all fleet operations (just lint, just fix, just dev).
- Security: Automated audits via bandit and safety.
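The ban on print statements and the stdout/stderr isolation point serve the same goal: when JSON-RPC messages travel over stdout, any stray output on that stream can corrupt the protocol. A minimal sketch of the general technique follows. It assumes nothing about this project's actual code, and `run_isolated` and `write_message` are hypothetical helper names.

```python
import contextlib
import json
import sys


def run_isolated(handler, *args, **kwargs):
    """Run handler with Python-level stdout writes diverted to stderr."""
    with contextlib.redirect_stdout(sys.stderr):
        return handler(*args, **kwargs)


def write_message(message: dict, stream=None) -> None:
    """Emit one JSON-RPC message as a single line on the protocol stream."""
    stream = stream if stream is not None else sys.stdout
    stream.write(json.dumps(message) + "\n")
    stream.flush()
```

Note that `contextlib.redirect_stdout` only rebinds `sys.stdout` inside the Python process. Output written by child processes or by C extensions directly to file descriptor 1 is not captured, which is one reason subprocess isolation is usually handled separately.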
License
MIT
