# RAG-Anything MCP Server

> Model Context Protocol server for advanced RAG with Knowledge Graphs
>
> Knowledge Graph + Document Processing + Multimodal AI
## Overview

RAG-Anything MCP Server is a production-ready Model Context Protocol (MCP) server that combines:

- **Knowledge Graph Queries** - Multiple query modes (naive, local, global, hybrid, mix, bypass)
- **Document Ingestion** - Text and PDF processing with multimodal content extraction
- **Entity Extraction** - Automatic entity and relationship extraction from documents
- **Hybrid Storage** - Neo4j (graph) + PostgreSQL with pgvector (vectors)
- **Multimodal Support** - Process images, tables, and equations from PDFs
- **MCP Compliant** - Standard Model Context Protocol implementation
## Features

### Core Capabilities

| Feature | Description |
|---|---|
| Document Ingestion | Text and PDF processing with multimodal content extraction |
| Knowledge Graph Queries | Multiple query modes (naive, local, global, hybrid, mix, bypass) |
| Entity Extraction | Automatic entity and relationship extraction from documents |
| Hybrid Storage | Neo4j (graph) + PostgreSQL with pgvector (vectors) |
| MCP Compliant | Standard Model Context Protocol implementation |
| Multimodal Support | Process images, tables, and equations from PDFs |
### Query Modes

- `naive` - Simple keyword search
- `local` - Local entity-based search
- `global` - Global community-based search
- `hybrid` - Combines local and global (recommended)
- `mix` - Mixes multiple strategies
- `bypass` - Direct LLM query without graph
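The mode names above are plain strings, so a typo silently falls through to whatever the server does with unknown input. A small validation sketch (the helper name is hypothetical, not part of the project's API):

```python
# Illustrative sketch only: the six query modes listed above, plus a
# validator that falls back to the recommended default. `normalize_mode`
# is a hypothetical helper, not something the project exports.
QUERY_MODES = {"naive", "local", "global", "hybrid", "mix", "bypass"}

def normalize_mode(mode=None, default="hybrid"):
    """Return a valid query mode, falling back to the recommended default."""
    if mode is None:
        return default
    mode = mode.strip().lower()
    if mode not in QUERY_MODES:
        raise ValueError(f"unknown query mode {mode!r}, expected one of {sorted(QUERY_MODES)}")
    return mode
```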
## Quick Start

### Prerequisites

- Docker & Docker Compose (for database services)
- Python 3.13+
- OpenAI API key

### Option 1: All-in-One Docker (Recommended)

Start everything with Docker Compose:
```bash
# Clone the repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Copy environment template
cp .env.example .env

# Edit .env and set your OpenAI API key
# OPENAI_API_KEY=sk-...

# Start all services (Neo4j + PostgreSQL + MCP Server)
docker-compose up -d

# View logs
docker-compose logs -f rag-mcp

# Stop services
docker-compose down
```
This will start:

- Neo4j 5.23 on `bolt://localhost:7687` (HTTP UI on `http://localhost:7474`)
- PostgreSQL 16 + pgvector on `localhost:5432`
- MCP Server on `http://localhost:8000`
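If you want to script a readiness check after `docker-compose up -d`, a minimal sketch using only the standard library (the service ports are taken from the list above; the helper itself is not part of the project):

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    """Poll a TCP port until it accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# Usage (after `docker-compose up -d`), with the default ports above:
#   for name, port in [("Neo4j", 7687), ("PostgreSQL", 5432), ("MCP Server", 8000)]:
#       print(name, "up" if wait_for_port("localhost", port, 5.0) else "DOWN")
```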
### Option 2: Development Mode

Run the MCP server locally while the databases run in Docker:

**Windows:**

```bash
# One-click startup - handles everything automatically
start-dev.bat
```

**Linux/Mac:**

```bash
# Make executable and run
chmod +x start-dev.sh
./start-dev.sh
```
### Option 3: Manual Setup

```bash
# 1. Start Docker services (databases only)
docker-compose up -d neo4j postgres

# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -e .

# 4. Run the server
python main.py
```
## Configuration

Create a `.env` file in the project root:

```env
# =====================
# Neo4j Configuration
# =====================
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password

# =====================
# PostgreSQL Configuration (for RAG)
# =====================
RAG_DB_HOST=localhost
RAG_DB_PORT=5432
RAG_DB_NAME=rag_anythink
RAG_DB_USER=postgres
RAG_DB_PASSWORD=your_postgres_password

# =====================
# OpenAI Configuration
# =====================
OPENAI_API_KEY=your_openai_api_key

# =====================
# Document Processing
# =====================
# Parser: mineru, docling
KG_PARSER=mineru
# Parse method: auto, ocr, txt
KG_PARSE_METHOD=auto
# Enable image extraction from PDFs
KG_ENABLE_IMAGE=true
# Enable table extraction from PDFs
KG_ENABLE_TABLE=true
# Enable equation extraction from PDFs
KG_ENABLE_EQUATION=true

# =====================
# RAG Configuration
# =====================
# Working directory for RAG output
KG_WORKING_DIR=./rag_output
# Workspace name (production, development, etc.)
KG_WORKSPACE=production
# Context window for LLM (pages before/after for context)
KG_CONTEXT_WINDOW=1
# Maximum concurrent files for processing
KG_MAX_CONCURRENT_FILES=4
# Embedding dimension (depends on model)
KG_EMBEDDING_DIM=3072
# Maximum token size for embeddings
KG_MAX_TOKEN_SIZE=8192
# LLM model for knowledge graph operations
KG_LLM_MODEL=gpt-4o-mini
# Vision model for multimodal processing
KG_VISION_MODEL=gpt-4o
# Embedding model
KG_EMBEDDING_MODEL=text-embedding-3-large
# Default query mode (naive, local, global, hybrid, mix, bypass)
KG_DEFAULT_MODE=hybrid

# =====================
# MCP Configuration
# =====================
# MCP server name
RAG_MCP_NAME=rag-anything-mcp
# MCP server version
RAG_MCP_VERSION=1.0.0
# MCP server host
RAG_MCP_HOST=localhost
# MCP server port
RAG_MCP_PORT=8055
# MCP log level
RAG_MCP_LOG=info
# MCP transport protocol: stdio, http, sse, streamable-http
RAG_MCP_TRANSPORT=streamable-http

# Application log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
# Application log format: json, text
LOG_FORMAT=json
```
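The `.env` format itself is simple: `KEY=VALUE` lines, with blank lines and `#` comments ignored. A minimal parser for illustration only (the project loads settings through its Pydantic configuration layer, not this function):

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments.

    Illustration of the .env file format only -- the project itself
    loads settings via its Pydantic configuration layer.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # keep only lines that actually contain '='
            env[key.strip()] = value.strip()
    return env
```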
## MCP Tools

The server provides the following MCP tools:

| Tool | Description |
|---|---|
| `ingest_document` | Ingest text or PDF documents |
| `query_knowledge_graph` | Query with multiple modes (naive, local, global, hybrid) |
| `query_multimodal` | Query with images, tables, and equations |
| `process_document_file` | Process PDF files with multimodal extraction |
| `insert_content_list` | Insert pre-parsed content |
| `delete_data` | Delete documents by ID |
| `get_graph_statistics` | Get graph statistics |
| `get_config_info` | Get configuration info |
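MCP tools are invoked over JSON-RPC with a `tools/call` request. A sketch of what such a request for `ingest_document` might look like; the argument names (`text`, `metadata`) are assumptions for illustration, so check the server's `tools/list` response for the authoritative schema:

```python
import json

# Sketch of an MCP `tools/call` request for the `ingest_document` tool.
# The argument names below are assumed for illustration; the actual input
# schema is advertised by the server via the `tools/list` method.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "ingest_document",
        "arguments": {
            "text": "Your document text here...",
            "metadata": {"title": "My Document"},
        },
    },
}

payload = json.dumps(request)
```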
## Usage Examples

### Python Client

```python
import asyncio

from src.services.kg_service import KGService

async def main():
    # Initialize service
    service = KGService()
    await service.initialize()

    # Ingest a document
    result = await service.ingest_text(
        text="Your document text here...",
        metadata={"title": "My Document"},
    )

    # Query the knowledge graph
    response = await service.query(
        query_text="What are the main topics?",
        mode="hybrid",
    )
    print(response)

asyncio.run(main())
```
### MCP Client (Claude Desktop)

Add to your Claude Desktop MCP config:

```json
{
  "mcpServers": {
    "rag-anything": {
      "command": "docker-compose",
      "args": ["up", "rag-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key"
      }
    }
  }
}
```
## Project Structure

```
rag-anythink-mcp/
├── src/
│   ├── config/            # Pydantic configuration
│   ├── core/              # Interfaces and models
│   ├── database/          # Database connections (Neo4j, PostgreSQL)
│   │   └── kg/            # Knowledge Graph layer
│   ├── mcp/               # MCP servers
│   │   └── kg/            # RAG MCP server
│   ├── services/          # Business logic
│   ├── utils/             # Utilities
│   └── llm.py             # LLM clients
├── main.py                # Entry point
├── Dockerfile             # Docker image for MCP server
├── docker-compose.yml     # Multi-container orchestration
├── pyproject.toml         # Dependencies (uv)
├── start-dev.bat          # Windows dev startup
├── start-dev.sh           # Linux/Mac dev startup
└── README.md
```
## Development

### Setup Development Environment

```bash
# Clone repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode with all extras
pip install -e ".[full]"
```
### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_kg_service.py
```
### Code Quality

```bash
# Format code
ruff format src/

# Check linting
ruff check src/

# Type checking
pyright src/
```
## Docker Deployment

### Full Stack Deployment

```bash
# Deploy all services
docker-compose up -d

# Check service health
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Stop and remove volumes (clean slate)
docker-compose down -v
```
### Individual Services

```bash
# Only databases (for local dev)
docker-compose up -d neo4j postgres

# Only MCP server (databases must be running)
docker-compose up -d rag-mcp
```
### Health Checks

The services include built-in health checks:

- Neo4j: `cypher-shell` connectivity test
- PostgreSQL: `pg_isready` checks
- MCP Server: depends on healthy databases
## Architecture

### System Components

```
┌─────────────────┐
│   MCP Client    │
│  (Claude, etc)  │
└────────┬────────┘
         │ MCP Protocol
         ▼
┌─────────────────┐
│   MCP Server    │
│    (FastMCP)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   KG Service    │
│ (RAG-Anything)  │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌──────────┐
│ Neo4j │ │PostgreSQL│
│ Graph │ │+ pgvector│
└───────┘ └──────────┘
```
### Data Flow

- **Ingestion**: Documents → Entity Extraction → Neo4j (graph) + PostgreSQL (vectors)
- **Query**: Query Text → Embedding → Vector Search + Graph Traversal → LLM Synthesis
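The query-side flow can be sketched as a toy pipeline: embed the query, rank stored entities by vector similarity, then expand the top hits with their one-hop graph neighbors before handing the combined context to the LLM. Everything below (data, helper names, 2-D "embeddings") is invented for illustration; the real pipeline is implemented by RAG-Anything:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vector store: entity -> embedding (2-D for readability)
vectors = {
    "neo4j": [0.9, 0.1],
    "pgvector": [0.8, 0.3],
    "docker": [0.1, 0.9],
}

# Toy knowledge graph: entity -> one-hop neighbors
graph = {
    "neo4j": ["knowledge graph", "cypher"],
    "pgvector": ["postgresql", "embeddings"],
    "docker": ["compose"],
}

def retrieve(query_vec, top_k=2):
    # 1) Vector search: rank entities by similarity to the query embedding
    ranked = sorted(vectors, key=lambda e: cosine(vectors[e], query_vec), reverse=True)
    hits = ranked[:top_k]
    # 2) Graph traversal: expand each hit with its neighbors
    # 3) The combined context would then go to the LLM for synthesis
    return {e: graph.get(e, []) for e in hits}

context = retrieve([1.0, 0.0])
```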
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`pytest`)
5. Commit your changes (`git commit -m 'Add some amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Development Guidelines

- Write tests for new features
- Follow PEP 8 style guidelines
- Update documentation as needed
- Keep commits atomic and well-described
## FAQ

**How do I change the database passwords?**

Update the passwords in both `.env` and `docker-compose.yml`, and make sure they match.
**Can I use a different embedding model?**

Yes! Set `KG_EMBEDDING_MODEL` and adjust `KG_EMBEDDING_DIM` in your `.env` file.
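The dimension must match the model you configure, since the stored vectors are sized to it. For reference, the default output dimensions of OpenAI's embedding models (the lookup helper itself is just an illustration):

```python
# Default output dimensions of OpenAI's embedding models. KG_EMBEDDING_DIM
# must match the model set in KG_EMBEDDING_MODEL.
EMBEDDING_DIMS = {
    "text-embedding-3-large": 3072,  # project default
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}

def dim_for(model):
    """Look up the embedding dimension for a known model name."""
    try:
        return EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model!r}")
```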
**How do I back up my data?**

```bash
# Neo4j backup
docker exec rag-neo4j neo4j-admin database dump neo4j --to-path=/backups

# PostgreSQL backup
docker exec rag-postgres pg_dump -U postgres rag_anythink > backup.sql
```
**The server won't start - what do I do?**

1. Check Docker is running: `docker ps`
2. Check service logs: `docker-compose logs`
3. Verify environment variables in `.env`
4. Ensure databases are healthy: `docker-compose ps`
## License

This project is licensed under the MIT License - see the LICENSE file for details.
