# RAG-Anything MCP Server

> Model Context Protocol server for advanced RAG with Knowledge Graphs
>
> Knowledge Graph + Document Processing + Multimodal AI
## Overview

RAG-Anything MCP Server is a production-ready Model Context Protocol (MCP) server that combines:

- **Knowledge Graph Queries** - Multiple query modes (naive, local, global, hybrid, mix, bypass)
- **Document Ingestion** - Text and PDF processing with multimodal content extraction
- **Entity Extraction** - Automatic entity and relationship extraction from documents
- **Hybrid Storage** - Neo4j (graph) + PostgreSQL with pgvector (vectors)
- **Multimodal Support** - Process images, tables, and equations from PDFs
- **MCP Compliant** - Standard Model Context Protocol implementation
## Features

### Core Capabilities

| Feature | Description |
|---|---|
| Document Ingestion | Text and PDF processing with multimodal content extraction |
| Knowledge Graph Queries | Multiple query modes (naive, local, global, hybrid, mix, bypass) |
| Entity Extraction | Automatic entity and relationship extraction from documents |
| Hybrid Storage | Neo4j (graph) + PostgreSQL with pgvector (vectors) |
| MCP Compliant | Standard Model Context Protocol implementation |
| Multimodal Support | Process images, tables, and equations from PDFs |
### Query Modes

- `naive` - Simple keyword search
- `local` - Local entity-based search
- `global` - Global community-based search
- `hybrid` - Combines local and global (recommended)
- `mix` - Mixes multiple strategies
- `bypass` - Direct LLM query without graph
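The mode names above are plain strings, so a typo silently falls through to whatever the server does with unknown input. A small validation sketch (the helper name is hypothetical, not part of the project's API):

```python
# Illustrative sketch only: the six query modes listed above, plus a
# validator that falls back to the recommended default. `normalize_mode`
# is a hypothetical helper, not something the project exports.
QUERY_MODES = {"naive", "local", "global", "hybrid", "mix", "bypass"}

def normalize_mode(mode=None, default="hybrid"):
    """Return a valid query mode, falling back to the recommended default."""
    if mode is None:
        return default
    mode = mode.strip().lower()
    if mode not in QUERY_MODES:
        raise ValueError(f"unknown query mode {mode!r}, expected one of {sorted(QUERY_MODES)}")
    return mode
```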
## Quick Start

### Prerequisites

- Docker & Docker Compose (for database services)
- Python 3.13+
- OpenAI API key

### Option 1: All-in-One Docker (Recommended)

Start everything with Docker Compose:
```bash
# Clone the repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Copy environment template
cp .env.example .env

# Edit .env and set your OpenAI API key
# OPENAI_API_KEY=sk-...

# Start all services (Neo4j + PostgreSQL + MCP Server)
docker-compose up -d

# View logs
docker-compose logs -f rag-mcp

# Stop services
docker-compose down
```
This will start:

- Neo4j 5.23 on `bolt://localhost:7687` (HTTP UI on `http://localhost:7474`)
- PostgreSQL 16 + pgvector on `localhost:5432`
- MCP Server on `http://localhost:8000`
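If you want to script a readiness check after `docker-compose up -d`, a minimal sketch using only the standard library (the service ports are taken from the list above; the helper itself is not part of the project):

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    """Poll a TCP port until it accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# Usage (after `docker-compose up -d`), with the default ports above:
#   for name, port in [("Neo4j", 7687), ("PostgreSQL", 5432), ("MCP Server", 8000)]:
#       print(name, "up" if wait_for_port("localhost", port, 5.0) else "DOWN")
```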
### Option 2: Development Mode

Run the MCP server locally while the databases run in Docker:

**Windows:**

```bash
# One-click startup - handles everything automatically
start-dev.bat
```

**Linux/Mac:**

```bash
# Make executable and run
chmod +x start-dev.sh
./start-dev.sh
```
### Option 3: Manual Setup

```bash
# 1. Start Docker services (databases only)
docker-compose up -d neo4j postgres

# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -e .

# 4. Run the server
python main.py
```
## Configuration

Create a `.env` file in the project root:

```env
# =====================
# Neo4j Configuration
# =====================
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password

# =====================
# PostgreSQL Configuration (for RAG)
# =====================
RAG_DB_HOST=localhost
RAG_DB_PORT=5432
RAG_DB_NAME=rag_anythink
RAG_DB_USER=postgres
RAG_DB_PASSWORD=your_postgres_password

# =====================
# OpenAI Configuration
# =====================
OPENAI_API_KEY=your_openai_api_key

# =====================
# Document Processing
# =====================
# Parser: mineru, docling
KG_PARSER=mineru
# Parse method: auto, ocr, txt
KG_PARSE_METHOD=auto
# Enable image extraction from PDFs
KG_ENABLE_IMAGE=true
# Enable table extraction from PDFs
KG_ENABLE_TABLE=true
# Enable equation extraction from PDFs
KG_ENABLE_EQUATION=true

# =====================
# RAG Configuration
# =====================
# Working directory for RAG output
KG_WORKING_DIR=./rag_output
# Workspace name (production, development, etc.)
KG_WORKSPACE=production
# Context window for LLM (pages before/after for context)
KG_CONTEXT_WINDOW=1
# Maximum concurrent files for processing
KG_MAX_CONCURRENT_FILES=4
# Embedding dimension (depends on model)
KG_EMBEDDING_DIM=3072
# Maximum token size for embeddings
KG_MAX_TOKEN_SIZE=8192
# LLM model for knowledge graph operations
KG_LLM_MODEL=gpt-4o-mini
# Vision model for multimodal processing
KG_VISION_MODEL=gpt-4o
# Embedding model
KG_EMBEDDING_MODEL=text-embedding-3-large
# Default query mode (naive, local, global, hybrid, mix, bypass)
KG_DEFAULT_MODE=hybrid

# =====================
# MCP Configuration
# =====================
# MCP server name
RAG_MCP_NAME=rag-anything-mcp
# MCP server version
RAG_MCP_VERSION=1.0.0
# MCP server host
RAG_MCP_HOST=localhost
# MCP server port
RAG_MCP_PORT=8055
# MCP log level
RAG_MCP_LOG=info
# MCP transport protocol: stdio, http, sse, streamable-http
RAG_MCP_TRANSPORT=streamable-http

# Application log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
# Application log format: json, text
LOG_FORMAT=json
```
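The `.env` format itself is simple: `KEY=VALUE` lines, with blank lines and `#` comments ignored. A minimal parser for illustration only (the project loads settings through its Pydantic configuration layer, not this function):

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments.

    Illustration of the .env file format only -- the project itself
    loads settings via its Pydantic configuration layer.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # keep only lines that actually contain '='
            env[key.strip()] = value.strip()
    return env
```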
## MCP Tools

The server provides the following MCP tools:

| Tool | Description |
|---|---|
| `ingest_document` | Ingest text or PDF documents |
| `query_knowledge_graph` | Query with multiple modes (naive, local, global, hybrid) |
| `query_multimodal` | Query with images, tables, and equations |
| `process_document_file` | Process PDF files with multimodal extraction |
| `insert_content_list` | Insert pre-parsed content |
| `delete_data` | Delete documents by ID |
| `get_graph_statistics` | Get graph statistics |
| `get_config_info` | Get configuration info |
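MCP tools are invoked over JSON-RPC with a `tools/call` request. A sketch of what such a request for `ingest_document` might look like; the argument names (`text`, `metadata`) are assumptions for illustration, so check the server's `tools/list` response for the authoritative schema:

```python
import json

# Sketch of an MCP `tools/call` request for the `ingest_document` tool.
# The argument names below are assumed for illustration; the actual input
# schema is advertised by the server via the `tools/list` method.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "ingest_document",
        "arguments": {
            "text": "Your document text here...",
            "metadata": {"title": "My Document"},
        },
    },
}

payload = json.dumps(request)
```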
## Usage Examples

### Python Client

```python
import asyncio

from src.services.kg_service import KGService

async def main():
    # Initialize service
    service = KGService()
    await service.initialize()

    # Ingest a document
    result = await service.ingest_text(
        text="Your document text here...",
        metadata={"title": "My Document"},
    )

    # Query the knowledge graph
    response = await service.query(
        query_text="What are the main topics?",
        mode="hybrid",
    )
    print(response)

asyncio.run(main())
```
### MCP Client (Claude Desktop)

Add to your Claude Desktop MCP config:

```json
{
  "mcpServers": {
    "rag-anything": {
      "command": "docker-compose",
      "args": ["up", "rag-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key"
      }
    }
  }
}
```
## Project Structure

```
rag-anythink-mcp/
├── src/
│   ├── config/            # Pydantic configuration
│   ├── core/              # Interfaces and models
│   ├── database/          # Database connections (Neo4j, PostgreSQL)
│   │   └── kg/            # Knowledge Graph layer
│   ├── mcp/               # MCP servers
│   │   └── kg/            # RAG MCP server
│   ├── services/          # Business logic
│   ├── utils/             # Utilities
│   └── llm.py             # LLM clients
├── main.py                # Entry point
├── Dockerfile             # Docker image for MCP server
├── docker-compose.yml     # Multi-container orchestration
├── pyproject.toml         # Dependencies (uv)
├── start-dev.bat          # Windows dev startup
├── start-dev.sh           # Linux/Mac dev startup
└── README.md
```
## Development

### Setup Development Environment

```bash
# Clone repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode with all extras
pip install -e ".[full]"
```
### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_kg_service.py
```
### Code Quality

```bash
# Format code
ruff format src/

# Check linting
ruff check src/

# Type checking
pyright src/
```
## Docker Deployment

### Full Stack Deployment

```bash
# Deploy all services
docker-compose up -d

# Check service health
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Stop and remove volumes (clean slate)
docker-compose down -v
```
### Individual Services

```bash
# Only databases (for local dev)
docker-compose up -d neo4j postgres

# Only MCP server (databases must be running)
docker-compose up -d rag-mcp
```
### Health Checks

The services include built-in health checks:

- Neo4j: `cypher-shell` connectivity test
- PostgreSQL: `pg_isready` checks
- MCP Server: depends on healthy databases
## Architecture

### System Components

```
┌─────────────────┐
│   MCP Client    │
│  (Claude, etc)  │
└────────┬────────┘
         │ MCP Protocol
         ▼
┌─────────────────┐
│   MCP Server    │
│    (FastMCP)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   KG Service    │
│ (RAG-Anything)  │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌──────────┐
│ Neo4j │ │PostgreSQL│
│ Graph │ │+ pgvector│
└───────┘ └──────────┘
```
### Data Flow

- **Ingestion**: Documents → Entity Extraction → Neo4j (graph) + PostgreSQL (vectors)
- **Query**: Query Text → Embedding → Vector Search + Graph Traversal → LLM Synthesis
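The query-side flow can be sketched as a toy pipeline: embed the query, rank stored entities by vector similarity, then expand the top hits with their one-hop graph neighbors before handing the combined context to the LLM. Everything below (data, helper names, 2-D "embeddings") is invented for illustration; the real pipeline is implemented by RAG-Anything:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vector store: entity -> embedding (2-D for readability)
vectors = {
    "neo4j": [0.9, 0.1],
    "pgvector": [0.8, 0.3],
    "docker": [0.1, 0.9],
}

# Toy knowledge graph: entity -> one-hop neighbors
graph = {
    "neo4j": ["knowledge graph", "cypher"],
    "pgvector": ["postgresql", "embeddings"],
    "docker": ["compose"],
}

def retrieve(query_vec, top_k=2):
    # 1) Vector search: rank entities by similarity to the query embedding
    ranked = sorted(vectors, key=lambda e: cosine(vectors[e], query_vec), reverse=True)
    hits = ranked[:top_k]
    # 2) Graph traversal: expand each hit with its neighbors
    # 3) The combined context would then go to the LLM for synthesis
    return {e: graph.get(e, []) for e in hits}

context = retrieve([1.0, 0.0])
```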
## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`pytest`)
5. Commit your changes (`git commit -m 'Add some amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Development Guidelines

- Write tests for new features
- Follow PEP 8 style guidelines
- Update documentation as needed
- Keep commits atomic and well-described
## FAQ

**How do I change the database passwords?**

Update the passwords in both `.env` and `docker-compose.yml`, and make sure they match.
**Can I use a different embedding model?**

Yes! Set `KG_EMBEDDING_MODEL` and adjust `KG_EMBEDDING_DIM` in your `.env` file.
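The dimension must match the model you configure, since the stored vectors are sized to it. For reference, the default output dimensions of OpenAI's embedding models (the lookup helper itself is just an illustration):

```python
# Default output dimensions of OpenAI's embedding models. KG_EMBEDDING_DIM
# must match the model set in KG_EMBEDDING_MODEL.
EMBEDDING_DIMS = {
    "text-embedding-3-large": 3072,  # project default
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}

def dim_for(model):
    """Look up the embedding dimension for a known model name."""
    try:
        return EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model!r}")
```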
**How do I back up my data?**

```bash
# Neo4j backup
docker exec rag-neo4j neo4j-admin database dump neo4j --to-path=/backups

# PostgreSQL backup
docker exec rag-postgres pg_dump -U postgres rag_anythink > backup.sql
```
**The server won't start - what do I do?**

1. Check Docker is running: `docker ps`
2. Check service logs: `docker-compose logs`
3. Verify environment variables in `.env`
4. Ensure databases are healthy: `docker-compose ps`
## License

This project is licensed under the MIT License - see the LICENSE file for details.
