Substrate AI
Open-source framework for building stateful AI agents with memory, tools, and streaming. Works with any OpenAI-compatible API.
Installation
npx substrate-aiAsk AI about Substrate AI
Powered by Claude Β· Grounded in docs
I know everything about Substrate AI. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
Substrate AI
A production-ready AI agent framework with streaming, memory, tools, and MCP integration.
Built on modern LLM infrastructure with OpenRouter support, PostgreSQL persistence, and extensible tool architecture.
π Quick Start (One-Click Setup!)
Option 1: Automatic Setup (Recommended)
# Clone the repository
git clone https://github.com/your-username/substrate-ai.git
cd substrate-ai
# Run the setup wizard - it does EVERYTHING for you!
python setup.py
The setup script will:
- β Create Python virtual environment
- β Install all backend dependencies
- β Create configuration files
- β Install frontend dependencies
- β Validate your setup
After setup, just add your API key:
# Edit backend/.env and add your OpenRouter API key
# Get one at: https://openrouter.ai/keys
Option 2: Manual Setup
# Backend
cd backend
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Frontend
cd ../frontend
npm install
# Configure
cp backend/.env.example backend/.env
# Edit backend/.env and add OPENROUTER_API_KEY=sk-or-v1-your-key
Start the Application
# Terminal 1: Backend
cd backend
source venv/bin/activate
python api/server.py
# Terminal 2: Frontend
cd frontend
npm run dev
# Open http://localhost:5173 π
π Full guide: See QUICK_START.md
β¨ New users: The repository includes ALEX - a pre-configured example agent. Run python setup_alex.py after configuring your API key to get started immediately!
β¨ Features
Core Capabilities
- π€ Multi-Model Support - OpenRouter integration with 100+ LLMs
- π¬ Streaming Responses - Real-time token streaming with SSE
- π§ Memory System - Short-term (PostgreSQL) + Long-term (ChromaDB embeddings)
- π οΈ Tool Execution - Extensible tool architecture with built-in tools
- π Session Management - Multi-session support with conversation history
- π° Cost Tracking - Real-time token usage and cost monitoring
- β‘ Token Optimized - ~40% fewer context tokens via history limits + auto-summarization
Advanced Features
- π§© MCP Integration - Model Context Protocol for code execution & browser automation
- π PostgreSQL Backend - Scalable conversation & memory persistence
- πΈοΈ Graph RAG - Knowledge graph retrieval (works without Neo4j - uses local DB fallback!)
- π― Vision Support - Gemini Flash integration for image analysis
- π Security Hardened - Sandboxed code execution, rate limiting, domain whitelisting
- π Token Efficiency - 98.7% context window savings via MCP code execution
- π¨ Modern UI - React + TypeScript + Tailwind CSS
π Autonomous Agent Features (NEW!)
- π Heartbeat System - Timer-based agent activation with configurable rules
- π Rooms/Channels - Discord-style message organization (heartbeat-log, task, reflection)
- ποΈ Task Scheduler - Automatic task execution (daily, weekly, monthly, custom intervals)
- π Daemon Mode - 24/7 agent runtime with in-memory caching
π§ Miras Memory Architecture
Based on Google Research Titans/Miras papers:
- π Retention Gates - Dynamic memory decay/boost based on access patterns
- ποΈ Attentional Bias - Multi-factor scoring (semantic + temporal + importance + access)
- ποΈ Hierarchical Memory - 3-tier system (Working β Episodic β Semantic)
- π Online Learning - Hebbian associations + feedback learning during runtime
π Heartbeat & Autonomous Agent System (NEW!)
Always-on agent capabilities for 24/7 autonomous operation:
- Timer-based Heartbeats - Configurable rules with time windows & probability
- Rooms/Channels - Discord-style message organization with threads
- Task Scheduler - Daily, weekly, monthly, or custom interval tasks
- Daemon Mode - 24/7 runtime with hot agent caching
- DST-safe Timezone - Europe/Berlin default with proper timezone handling
π Documentation
Getting Started
- Quick Start Guide - 5-minute setup
- System Structure - Project layout overview
- Example Agents - Pre-configured agent templates
Advanced Topics
- MCP System Overview - Code execution & browser automation architecture
- Miras Memory Architecture - Research-backed memory system
- Heartbeat & Rooms - Autonomous agent features
- PostgreSQL Setup - Database configuration
- Compatibility Guide - System requirements
Testing & Security
- Testing Results - Test coverage & validation
- Security Checklist - Security audit & hardening
ποΈ Architecture
βββββββββββββββββββββββββββββββββββββββββββββββββββ
β Frontend (React) β
β β’ Real-time streaming UI β
β β’ Session management β
β β’ Memory blocks editor β
β β’ Rooms/Channels overview β
β β’ Cost & token tracking β
βββββββββββββββββββ¬ββββββββββββββββββββββββββββββββ
β
β HTTP/SSE/WebSocket
β
βββββββββββββββββββΌββββββββββββββββββββββββββββββββ
β Backend (Python) β
β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β Consciousness Loop β β
β β β’ Model routing (OpenRouter) β β
β β β’ Stream management β β
β β β’ Tool execution β β
β β β’ Memory integration β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β Daemon Mode (24/7 Runtime) β β
β β β’ Timer-based heartbeat β β
β β β’ Task scheduler β β
β β β’ In-memory agent caching β β
β β β’ Graceful shutdown β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β
β βββββββββββββββ βββββββββββββββ β
β β Memory β β Tools β β
β β System β β Registry β β
β β + MIRAS β β β β
β β β’ Core β β β’ Web β β
β β β’ Archival β β β’ Search β β
β β β’ Embedding β β β’ Discord β β
β β β’ Retention β β β’ ArXiv β β
β β β’ Hebbian β β β’ Jina β β
β βββββββββββββββ βββββββββββββββ β
β β
β βββββββββββββββ βββββββββββββββ β
β β Channels β β Tasks β β
β β (Rooms) β β Scheduler β β
β β β’ heartbeat β β β’ daily β β
β β β’ task β β β’ weekly β β
β β β’ reflectionβ β β’ monthly β β
β β β’ threads β β β’ custom β β
β βββββββββββββββ βββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β Graph RAG System β β
β β β’ Knowledge graph retrieval β β
β β β’ Neo4j (optional) or local DB β β
β β β’ Relationship extraction β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββ β
β β MCP Integration β β
β β β’ Code execution sandbox β β
β β β’ Browser automation (Playwright) β β
β β β’ Skills learning system β β
β β β’ Vision analysis (Gemini) β β
β ββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββ¬βββββββββββββββββββββββββββββββ
β
ββββββββββββββΌβββββββββββββ¬βββββββββββββ
β β β β
βββββββΌββββββ βββββΌβββββ ββββββΌββββββ ββββββΌββββββ
βPostgreSQL β βChromaDBβ βMCP Serversβ β Neo4j β
βPersistenceβ βVectors β β(External) β β(Optional)β
βββββββββββββ ββββββββββ ββββββββββββ ββββββββββββ
π§ Tech Stack
Backend
- Python 3.11+ - Core runtime
- Flask - API server with SSE streaming
- PostgreSQL - Primary database (conversation history, memory)
- ChromaDB - Vector embeddings for semantic search
- Neo4j - Graph database for Graph RAG (optional, local DB fallback)
- OpenRouter - Multi-model LLM gateway
- RestrictedPython - Sandboxed code execution
Frontend
- React 18 - UI framework
- TypeScript - Type safety
- Tailwind CSS - Styling
- Vite - Build tool & dev server
MCP Integration
- Playwright - Browser automation (Chromium)
- Gemini 2.0 Flash - Vision analysis (free tier)
- fastmcp - MCP protocol implementation
- MCP Servers - Stdio-based external tools
π οΈ Built-in Tools
Memory Management
core_memory_append- Add to agent's core memorycore_memory_replace- Modify core memoryarchival_memory_insert- Store in long-term memoryarchival_memory_search- Semantic search across memories
Miras Memory Architecture
Advanced memory features based on Google Research:
retention_gate.compute_retention()- Calculate memory retention scoreattentional_bias.compute_attention_score()- Multi-factor relevance scoringhierarchical_memory.store()- Store in tiered memory systemmemory_learner.on_memories_accessed()- Record Hebbian associationsmemory_learner.record_feedback()- Learn from user feedback
Web & Research
fetch_webpage- Retrieve and parse web pagesweb_search- DuckDuckGo searcharxiv_search- Academic paper searchjina_reader- Advanced web content extraction
Integration
discord_send_message- Discord bot integrationspotify_control- Spotify playback controlexecute_code- Sandboxed Python execution (MCP)
Graph RAG
/api/graph/nodes- Get graph nodes/api/graph/edges- Get graph relationships/api/graph/stats- Graph statistics/api/graph/rag- Retrieve context from knowledge graph
Rooms/Channels
/api/channels- List/Create channels (rooms)/api/channels/<id>/messages- Get/Post channel messages/api/messages/<id>/threads- Discord-style thread system- Default channels:
heartbeat-log,task,reflection
Task Scheduler
/api/tasks- List/Create scheduled tasks/api/tasks/<id>- Get/Update/Delete task- Schedules:
daily,weekly,monthly,custom(every N days)
Heartbeat System
/api/heartbeat/config- Get/Update heartbeat configuration/api/heartbeat/rules- Manage heartbeat rules/api/heartbeat/status- Daemon & agent status/api/heartbeat/trigger- Manual heartbeat trigger (testing)/api/heartbeat/templates- Predefined rule templates
Browser Automation (MCP)
navigate- Browser navigationscreenshot- Capture with vision analysisextract_text- DOM text extractionclick/fill_form- Page interactionsearch_google- Google search automation
π¦ Installation
Prerequisites
- Python 3.11+ (3.11 or 3.12 recommended; 3.13 supported but may require additional setup)
- Node.js 18+
- PostgreSQL 14+ (optional, SQLite fallback available)
- OpenRouter API key
Note for Python 3.13 users: Some packages may require newer versions. See Python 3.13 Compatibility Guide for detailed instructions.
Backend Setup
cd backend
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # Mac/Linux
# venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# If using Python 3.13 on Windows and encountering wheel issues, see:
# backend/PYTHON_3.13_COMPATIBILITY.md
# Configure environment
cp config/.env.example .env
# Edit .env with your API keys
# Optional: Install Playwright for MCP browser automation
playwright install chromium
Frontend Setup
cd frontend
# Install dependencies
npm install
# Start dev server
npm run dev
π Security Features
Code Execution Sandbox
- β RestrictedPython compilation (no unsafe operations)
- β 30-second timeout enforcement
- β 512MB memory limit per execution
- β Isolated workspace per session
- β No file system access outside sandbox
- β No network access except via MCP tools
Browser Automation Security
- β Domain whitelist (Wikipedia, GitHub, ArXiv, etc.)
- β Domain blacklist (banking, payments blocked)
- β Rate limiting (10 nav/min, 5 screenshots/min)
- β Headless mode only (no GUI)
- β HTTPS enforcement on sensitive operations
API Security
- β Rate limiting on all endpoints
- β CORS configuration
- β Input sanitization
- β API key validation
π Full audit: See FINAL_SECURITY_CHECK.md
π¦ Usage Examples
Basic Chat
# The agent maintains context across messages
User: "My name is Alex"
Agent: "Nice to meet you, Alex! I've stored that in my memory."
User: "What's my name?"
Agent: "Your name is Alex!"
Tool Usage
# Memory tools
User: "Remember that I'm learning Python"
Agent: *uses core_memory_append*
Agent: "I've added that to my memory about you!"
# Web search
User: "What's the latest on quantum computing?"
Agent: *uses web_search*
Agent: "Here's what I found about quantum computing..."
MCP Code Execution
# Browser automation with vision
User: "Take a screenshot of Wikipedia's homepage and describe it"
Agent: *writes code*
url = "https://en.wikipedia.org"
result = await mcp.browser.screenshot(url, analyze=True)
print(result['analysis'])
Result: Vision analysis returned, 98.7% token savings vs manual browsing
π Full MCP guide: See MCP_SYSTEM_OVERVIEW.md
π Performance
Token Efficiency
- Without MCP: ~100,000 tokens for complex web tasks
- With MCP: ~2,000 tokens (98.7% reduction)
- Streaming: <50ms first token latency
Context Optimization (v1.2.1)
- History Limit: 12 messages (optimized from 20, ~40% fewer context tokens)
- Auto-Summary: Triggers automatically when >30 messages accumulate
- Timeout: 120s (allows large context window processing)
- Max Response: 8192 tokens (full detailed responses, not clipped)
Execution Speed
- Code compilation: <50ms (RestrictedPython)
- Typical execution: 200-500ms
- Max timeout: 30s (enforced)
Memory Performance
- Core memory: O(1) access (PostgreSQL indexed)
- Archival search: <200ms (ChromaDB vector similarity)
- Skills lookup: O(log n) with semantic indexing
π§ͺ Testing
# Backend tests
cd backend
python test_startup.py
# Integration tests
python test_mcp_integration.py
# Full test results
cat ../TESTING_RESULTS.md
πΊοΈ Roadmap
Completed β
- Multi-model OpenRouter integration
- Streaming SSE responses
- PostgreSQL persistence
- Memory system (core + archival)
- Tool execution framework
- MCP code execution sandbox
- Browser automation (Playwright)
- Vision analysis (Gemini)
- Skills learning system
- Cost tracking
- Miras Memory Architecture (December 2025)
- Retention Gates (dynamic memory decay/boost)
- Attentional Bias (multi-factor retrieval scoring)
- Hierarchical Memory (Working β Episodic β Semantic)
- Online Learning (Hebbian associations + feedback)
- Autonomous Agent System (January 2026)
- Heartbeat System (timer-based with probability)
- Rooms/Channels (Discord-style organization)
- Task Scheduler (daily, weekly, monthly, custom)
- Daemon Mode (24/7 runtime with agent caching)
- DST-safe timezone handling
- Context & Token Optimization (January 2026)
- Reduced history_limit (20 β 12) for ~40% token savings
- Auto-summary trigger at >30 messages
- Extended timeout (60s β 120s) for large contexts
- Flexible max_completion_tokens (8192 default)
In Progress π§
- Additional MCP servers (filesystem, database)
- Collaborative skill libraries
- Advanced prompt engineering UI
- Multi-agent orchestration
Planned π―
- Voice interface
- Mobile app
- Cloud deployment templates
- Plugin marketplace
π€ Contributing
This is an open-source project. Contributions welcome!
Development Setup
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
Code Standards
- Python: PEP 8, type hints, docstrings
- TypeScript: ESLint, Prettier, strict mode
- Tests: Add tests for new features
- Docs: Update relevant documentation
π License
See LICENSE for details.
π Acknowledgments
Technologies
- OpenRouter - Multi-model API gateway
- Anthropic MCP - Model Context Protocol architecture
- Playwright - Browser automation framework
- Gemini - Vision analysis (Google)
- PostgreSQL - Database engine
- ChromaDB - Vector embeddings
Research
- Google Titans/Miras - Advanced memory architecture (Retention Gates, Attentional Bias, Online Learning)
- "It's All Connected" - Test-time memorization and retention research
Community
Built with inspiration from:
- Letta (formerly MemGPT) - Memory architecture patterns
- LangChain - Tool execution concepts
- AutoGPT - Agent autonomy ideas
π§ Support
- π Bug Reports: GitHub Issues
- π¬ Questions: GitHub Discussions
- π Documentation: See
/docsfolder - π§ Troubleshooting: See QUICK_START.md
Built for developers who need production-ready AI agents.
Version 1.2.1 | Last Updated: January 2026
π§ Miras Memory Architecture Details
Based on Google Research Titans & Miras papers, this framework implements a 4-phase advanced memory system:
Phase 1: Retention Gates (~490 lines)
File: backend/core/retention_gate.py
Dynamic memory decay/boost based on:
- Importance (35% weight)
- Access count (30% weight)
- Temporal recency (25% weight)
- Base retention (10% weight)
from core.retention_gate import RetentionGate
gate = RetentionGate()
score = gate.compute_retention(memory) # 0.0 - 1.0
action = gate.get_action(score) # BOOST, KEEP, CONSOLIDATE, DECAY, ARCHIVE
Phase 2: Attentional Bias (~610 lines)
File: backend/core/attentional_bias.py
5 attention modes with automatic query analysis:
- STANDARD - Balanced retrieval
- SEMANTIC_HEAVY - Meaning-focused
- TEMPORAL_HEAVY - Time-sensitive ("when did we...")
- IMPORTANCE_HEAVY - Critical information
- EMOTIONAL - Relationship/feeling queries
from core.attentional_bias import QueryAnalyzer, AttentionalBias
analyzer = QueryAnalyzer()
mode = analyzer.analyze("When did we last meet?") # β TEMPORAL
bias = AttentionalBias()
score = bias.compute_attention_score(memory, query, mode)
Phase 3: Hierarchical Memory (~720 lines)
File: backend/core/hierarchical_memory.py
3-tier memory architecture:
- Working Memory - Fast, volatile, LRU eviction (current session)
- Episodic Memory - Medium-term, retention gates (recent history)
- Semantic Memory - Long-term, Graph DB integration (permanent knowledge)
from core.hierarchical_memory import HierarchicalMemory, MemoryItem
hier = HierarchicalMemory()
hier.store(memory_item) # Auto-routes to appropriate tier
hier.consolidate() # Move memories between tiers
Phase 4: Online Learning (~500 lines)
File: backend/core/memory_learner.py
Hebbian learning: "Neurons that fire together, wire together"
- Memories accessed together form associations
- User feedback adjusts importance
- Association decay for unused connections
from core.memory_learner import MemoryLearner, FeedbackType
learner = MemoryLearner()
learner.on_memories_accessed(['mem1', 'mem2'], query="...") # Forms associations
learner.record_feedback('mem1', FeedbackType.HELPFUL) # +0.5 importance
learner.record_feedback('mem2', FeedbackType.NOT_HELPFUL) # -0.2 importance
Total: ~2,320 lines of research-backed memory architecture!
π Full documentation: See docs/MIRAS_TITANS_INTEGRATION.md
π Heartbeat & Autonomous Agent System Details
Always-on agent capabilities for 24/7 autonomous operation.
Daemon Mode
File: backend/core/daemon_mode.py
24/7 runtime with hot agent caching:
- Agents stay in memory (no restart overhead!)
- Connection pooling (PostgreSQL stays warm)
- Graceful shutdown (no data loss)
- Signal handling (SIGTERM, SIGINT)
from core.daemon_mode import SubstrateAIDaemon, create_daemon_from_env
# Create and start daemon
daemon = create_daemon_from_env()
daemon.start()
# Load agent (stays in memory!)
agent = daemon.get_or_create_agent("my-agent", "My Agent Name")
# Check status
daemon.print_status()
Heartbeat System
File: backend/core/daemon_mode.py
Timer-based agent activation with configurable rules:
# Example heartbeat configuration (stored in agent.config)
heartbeat_config = {
"enabled": True,
"timezone": "Europe/Berlin", # DST-safe!
"rules": [
{
"id": "rule-uuid",
"name": "Morning Check-in",
"days": ["monday", "tuesday", "wednesday", "thursday", "friday"],
"start_time": "08:00",
"end_time": "12:00",
"interval_minutes": 60,
"probability": 0.8 # 80% chance to trigger when timer fires
},
{
"id": "rule-uuid-2",
"name": "Evening Reflection",
"days": ["monday", "tuesday", "wednesday", "thursday", "friday"],
"start_time": "18:00",
"end_time": "20:00",
"interval_minutes": 45,
"probability": 0.9
}
]
}
How it works:
- Timer fires at random interval (50-100% of configured interval)
- Probability check determines if heartbeat triggers
- Heartbeat message sent to agent via ConsciousnessLoop
- Agent can respond, use tools, organize memories
- Messages posted to
heartbeat-logchannel AND main thread
Task Scheduler
File: backend/core/task_scheduler.py
Automatic task execution with multiple schedule types:
from core.task_scheduler import TaskScheduler, calculate_next_run
# Schedule types
next_run = calculate_next_run('daily', '09:00') # Every day at 09:00
next_run = calculate_next_run('weekly', '10:00', days_of_week=[0, 2, 4]) # Mon/Wed/Fri
next_run = calculate_next_run('monthly', '12:00') # Monthly at 12:00
next_run = calculate_next_run('custom', '08:00', every_n_days=3) # Every 3 days
# Task structure (stored in PostgreSQL)
task = {
"task_name": "Daily Summary",
"description": "Generate daily activity summary",
"schedule": "daily",
"time": "18:00",
"action_type": "self_task", # Send to agent
"action_template": "Please summarize today's activities.",
"one_time": False # Recurring
}
Features:
- One-time and recurring tasks
- Automatic
next_runcalculation - DST-safe timezone handling
- Integration with ConsciousnessLoop
- Channel-based message organization
Rooms/Channels
File: backend/api/routes_channels.py
Discord-style message organization:
Agent
βββ heartbeat-log (System heartbeats)
βββ task (Scheduled tasks)
βββ reflection (Agent reflections)
βββ [custom] (User-created channels)
Features:
- Default channels auto-created per agent
- Messages posted to BOTH channel AND main thread
- Thread system for message replies
- Discord webhook integration (optional)
API Endpoints
# Heartbeat Configuration
GET /api/heartbeat/config?agent_id=default
PUT /api/heartbeat/config
GET /api/heartbeat/rules?agent_id=default
POST /api/heartbeat/rules
PUT /api/heartbeat/rules/<rule_id>
DELETE /api/heartbeat/rules/<rule_id>
GET /api/heartbeat/status
POST /api/heartbeat/trigger
GET /api/heartbeat/templates
# Tasks
GET /api/tasks?agent_id=default
POST /api/tasks
GET /api/tasks/<task_id>
PUT /api/tasks/<task_id>
DELETE /api/tasks/<task_id>
# Channels
GET /api/channels?agent_id=default
POST /api/channels
GET /api/channels/<channel_id>/messages
POST /api/channels/<channel_id>/messages
π Implementation details: See IMPLEMENTATION_PLAN_HEARTBEAT_ROOMS.md
