Multi-Agent RAG System with FastMCP Integration
A multi-agent AI system built with FastAPI that answers FAQ queries using Retrieval-Augmented Generation (RAG), provides weather information, and manages tasks through a FastMCP Todo server. The system uses LangGraph for agent orchestration, ChromaDB for vector storage, and is fully dockerized with a decoupled Streamlit frontend.
Table of Contents
- System Architecture
- System Design
- Tech Stack
- Project Structure
- Setup Instructions
- Docker Setup
- API Documentation
- Testing
- Agents Deep Dive
- Configuration Reference
System Architecture
```
┌──────────────────────────────────────────────────────────────────────┐
│                         STREAMLIT FRONTEND                           │
│                      (Decoupled Microservice)                        │
│              Chat UI │ Weather UI │ Todo Manager UI                  │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │ HTTP/REST (JSON)
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                          FASTAPI BACKEND                             │
│                                                                      │
│  ┌──────────┐  ┌──────────┐  ┌─────────────┐  ┌──────────────────┐   │
│  │ Auth API │  │ Chat API │  │ Weather API │  │     Todo API     │   │
│  │ /auth/*  │  │  /chat   │  │  /weather   │  │     /todos/*     │   │
│  └────┬─────┘  └────┬─────┘  └──────┬──────┘  └────────┬─────────┘   │
│       │             └───────────────┴──────────────────┘             │
│   JWT Auth                          │                                │
│   Middleware                        ▼                                │
│             ┌───────────────────────────────────────────┐            │
│             │   PARENT AGENT (LangGraph Orchestrator)   │            │
│             │                                           │            │
│             │  ┌───────────┐    ┌─────────────────────┐ │            │
│             │  │ Intent    │    │ Conditional Router  │ │            │
│             │  │ Detection │───▶│ faq     → RAG Agent │ │            │
│             │  │ (LLM)     │    │ weather → Tool Agent│ │            │
│             │  └───────────┘    │ todo    → Tool Agent│ │            │
│             │                   └─────────────────────┘ │            │
│             └───────┬───────────────────────┬───────────┘            │
│                     ▼                       ▼                        │
│          ┌─────────────────┐     ┌────────────────────┐              │
│          │    RAG AGENT    │     │     TOOL AGENT     │              │
│          │                 │     │  ┌──────────────┐  │              │
│          │  1. Semantic    │     │  │ Weather Tool │  │              │
│          │     Search      │     │  │ (Open-Meteo) │  │              │
│          │  2. Context     │     │  └──────────────┘  │              │
│          │     Building    │     │  ┌──────────────┐  │              │
│          │  3. LLM Answer  │     │  │  Todo Tools  │  │              │
│          │     Generation  │     │  │  (FastMCP)   │  │              │
│          └────────┬────────┘     │  └──────┬───────┘  │              │
│                   ▼              └─────────┼──────────┘              │
│          ┌─────────────────┐               ▼                         │
│          │    ChromaDB     │     ┌────────────────────┐              │
│          │  Vector Store   │     │    FastMCP Todo    │              │
│          │   (FAQ Data)    │     │ Server (PostgreSQL)│              │
│          └─────────────────┘     └─────────┬──────────┘              │
│                                            ▼                         │
│                                  ┌────────────────────┐              │
│                                  │   PostgreSQL DB    │              │
│                                  │  (Users + Todos)   │              │
│                                  └────────────────────┘              │
└──────────────────────────────────────────────────────────────────────┘
```
System Design
Microservices Architecture
The application follows a microservices architecture with two independently deployable application services, backed by a PostgreSQL database:
| Service | Technology | Port | Responsibility |
|---|---|---|---|
| Backend | FastAPI | 8000 | API server, agent orchestration, RAG, MCP |
| Frontend | Streamlit | 8501 | User interface, communicates via REST API only |
| Database | PostgreSQL | 5432 | Persistent storage for users and todos |
The frontend is loosely coupled: it only knows the backend's URL and communicates exclusively through HTTP REST calls. No shared code, no shared state.
Multi-Agent System (LangGraph)
The agent system is built with LangGraph, implementing a directed state graph:
```
              ┌─────────────────┐
              │  detect_intent  │  (Entry Point)
              │   (LLM-based)   │
              └────────┬────────┘
                       │
              ┌────────▼────────┐
              │   Conditional   │
              │     Router      │
              └──┬──────┬────┬──┘
         "faq"   │      │    │   "unknown"
                 │      │    │
                 │ "weather"/"todo"
                 │      │    │
        ┌────────▼─┐ ┌──▼────────┐ ┌──▼─────────┐
        │ rag_node │ │ tool_node │ │unknown_node│
        └─────┬────┘ └─────┬─────┘ └──────┬─────┘
              │            │              │
              └────────────┴──────────────┘
                           │
                         [END]
```
- Parent Agent (Orchestrator): Receives queries, uses the LLM to classify intent (`faq`, `weather`, `todo`, `unknown`), then routes to the appropriate specialist.
- RAG Agent: Performs semantic search on ChromaDB, builds context from top-k results, and generates grounded answers using the LLM.
- Tool Agent: Dispatches to the weather API or FastMCP todo tools based on the classified intent.
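The detect-and-route step above can be sketched in plain Python, as a stand-in for the LangGraph state graph. The node and intent names follow the diagram; the keyword heuristic replaces the real LLM classifier and is purely illustrative:

```python
from typing import Callable, TypedDict


class AgentState(TypedDict):
    query: str
    intent: str
    answer: str


def detect_intent(state: AgentState) -> AgentState:
    # Stand-in for the LLM classifier: a keyword heuristic, for illustration only.
    q = state["query"].lower()
    if any(w in q for w in ("weather", "temperature", "forecast")):
        state["intent"] = "weather"
    elif any(w in q for w in ("task", "todo", "remind")):
        state["intent"] = "todo"
    elif q:
        state["intent"] = "faq"
    else:
        state["intent"] = "unknown"
    return state


def rag_node(state: AgentState) -> AgentState:
    state["answer"] = f"[RAG] answer for: {state['query']}"
    return state


def tool_node(state: AgentState) -> AgentState:
    state["answer"] = f"[TOOL:{state['intent']}] result"
    return state


def unknown_node(state: AgentState) -> AgentState:
    state["answer"] = "Sorry, I can't help with that."
    return state


# Conditional edges: intent -> specialist node, mirroring the graph diagram.
ROUTES: dict[str, Callable[[AgentState], AgentState]] = {
    "faq": rag_node,
    "weather": tool_node,
    "todo": tool_node,
    "unknown": unknown_node,
}


def run(query: str) -> AgentState:
    state: AgentState = {"query": query, "intent": "", "answer": ""}
    state = detect_intent(state)
    return ROUTES[state["intent"]](state)
```

In the real orchestrator the same shape is expressed with LangGraph's `StateGraph` and `add_conditional_edges`, with the LLM doing the classification.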
RAG Pipeline
```
CSV Data → LangChain Documents → OpenAI Embeddings → ChromaDB Vector Store
                                                            ↓
User Query → Embedding → Similarity Search (top-4) → Context Building → LLM → Answer
```
- Data Source: FAQ CSV file (386 question/answer pairs from BigRock)
- Embeddings: OpenAI `text-embedding-3-small`
- Vector Store: ChromaDB (in-memory + optional persist)
- LLM: GPT-4.1 Mini (configurable)
- Fallback: Returns "I couldn't find an answer" if no relevant context exists
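The context-building and fallback steps can be sketched as follows. The retriever is stubbed out here (in the real pipeline, `similarity_search` against ChromaDB supplies the documents), and the prompt wording is illustrative:

```python
FALLBACK = "I couldn't find an answer to that in the FAQ."


def build_context(docs: list[dict], max_docs: int = 4) -> str:
    """Join the top-k retrieved Q/A pairs into a single context block."""
    return "\n\n".join(
        f"Q: {d['question']}\nA: {d['answer']}" for d in docs[:max_docs]
    )


def answer_query(query: str, docs: list[dict], llm=None) -> str:
    if not docs:
        return FALLBACK  # no relevant context -> grounded refusal
    context = build_context(docs)
    prompt = (
        "Answer ONLY from the context below. If the context does not "
        f"contain the answer, say you don't know.\n\n{context}\n\n"
        f"Question: {query}"
    )
    # `llm` would be the GPT-4.1 Mini call; the raw prompt is returned
    # here so the sketch runs without an API key.
    return llm(prompt) if llm else prompt
```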
FastMCP Integration
The Todo service is implemented as a FastMCP 3.x server with tools exposed via the MCP protocol. Todos are persisted in PostgreSQL via SQLAlchemy:
- `create_task` → Create a new todo item
- `list_tasks` → List all tasks (with filter)
- `get_task` → Retrieve a task by ID
- `update_task` → Modify task title/description/status
- `delete_task` → Remove a task
The backend uses a FastMCP Client with in-memory transport (no network hop for MCP calls within the same process), while also exposing REST API endpoints for direct access. The agent also supports keyword-based task resolution: users can refer to tasks by name instead of ID.
JWT Authentication
All API endpoints (except /auth/* and health checks) are protected by JWT Bearer token authentication.
Flow:
1. `POST /auth/register` or `POST /auth/login` → returns a JWT
2. Include `Authorization: Bearer <token>` in subsequent requests
3. The token is validated on every protected endpoint
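The backend uses python-jose for this, but the HS256 token shape itself can be illustrated with the standard library alone (a sketch of the format, not the project's code):

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_jwt(username: str, secret: str, expire_minutes: int = 30) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(
        {"sub": username, "exp": int(time.time()) + expire_minutes * 60}
    ).encode())
    signing_input = f"{header}.{payload}".encode()
    # HS256 = HMAC-SHA256 over "header.payload" with the shared secret.
    sig = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"
```

Anyone holding `JWT_SECRET_KEY` can verify a token by recomputing the signature, which is why that variable must stay out of version control.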
Data Flow
```
User Input → Streamlit → POST /chat → JWT Validation → Orchestrator
  → Intent Detection (LLM) → Route to Agent → Execute → Return Response
  → Streamlit renders answer
```
Tech Stack
| Component | Technology |
|---|---|
| Backend API | FastAPI, Uvicorn |
| Agent Framework | LangGraph, LangChain |
| LLM | OpenAI GPT-4.1 Mini |
| Embeddings | OpenAI text-embedding-3-small |
| Vector Database | ChromaDB |
| Database | PostgreSQL, SQLAlchemy, Alembic |
| MCP Server | FastMCP 3.x |
| Weather API | Open-Meteo (free, no key) |
| Authentication | JWT (python-jose) |
| Frontend | Streamlit |
| HTTP Client | httpx (async), requests |
| Testing | Pytest |
| Containerization | Docker, Docker Compose |
Project Structure
```
Multi_Agent_RAG_With_FastMCP/
├── docker-compose.yml           # Orchestrates backend + frontend containers
├── README.md
├── data/
│   ├── faqs.csv                 # FAQ dataset (question/answer pairs)
│   └── faqs 3.xlsx              # Original Excel source
│
├── backend/
│   ├── Dockerfile
│   ├── .dockerignore
│   ├── requirements.txt
│   ├── pytest.ini
│   ├── alembic.ini              # Alembic config (URL overridden by env.py)
│   ├── .env                     # Secret variables (gitignored)
│   ├── .env.example             # Template for env vars
│   ├── alembic/                 # Database migrations
│   │   ├── env.py               # Migration environment (reads DATABASE_URL)
│   │   └── versions/            # Migration scripts
│   ├── app/
│   │   ├── main.py              # FastAPI entry point, CORS, lifespan
│   │   ├── config.py            # Settings from env vars
│   │   ├── database.py          # SQLAlchemy engine, session, Base
│   │   │
│   │   ├── models/              # ORM models
│   │   │   ├── __init__.py      # Exports User, Todo
│   │   │   ├── user.py          # User model
│   │   │   └── todo.py          # Todo model
│   │   │
│   │   ├── auth/                # Authentication module
│   │   │   ├── routes.py        # POST /auth/login, /auth/register
│   │   │   ├── controllers.py   # Login/register logic
│   │   │   └── jwt_handler.py   # JWT create/verify, FastAPI dependency
│   │   │
│   │   ├── chat/                # Chat module
│   │   │   ├── routes.py        # POST /chat
│   │   │   └── controllers.py   # Delegates to orchestrator
│   │   │
│   │   ├── weather/             # Weather module
│   │   │   ├── routes.py        # GET /weather
│   │   │   └── controllers.py   # Delegates to tool agent
│   │   │
│   │   ├── todo/                # Todo module
│   │   │   ├── routes.py        # CRUD /todos/*
│   │   │   └── controllers.py   # Proxies to FastMCP client
│   │   │
│   │   ├── agents/              # Multi-agent system
│   │   │   ├── orchestrator.py  # Parent agent (LangGraph state graph)
│   │   │   ├── rag_agent.py     # RAG pipeline agent
│   │   │   └── tool_agent.py    # Weather + Todo agent
│   │   │
│   │   ├── rag/                 # RAG infrastructure
│   │   │   ├── loader.py        # CSV → LangChain Documents
│   │   │   └── vectorstore.py   # ChromaDB build/query
│   │   │
│   │   └── mcp/                 # FastMCP integration
│   │       ├── todo_server.py   # FastMCP server with CRUD tools (PostgreSQL)
│   │       └── client.py        # FastMCP client (in-memory transport)
│   │
│   └── tests/
│       ├── conftest.py          # Shared fixtures
│       ├── test_auth.py         # Auth API tests
│       ├── test_weather.py      # Weather API tests
│       ├── test_todos.py        # Todo API tests
│       └── test_chat.py         # Chat API tests
│
└── frontend/
    ├── Dockerfile
    ├── .dockerignore
    ├── requirements.txt
    ├── .env
    ├── .env.example
    ├── config.py                # Frontend configuration
    ├── api_client.py            # HTTP client for backend API
    ├── app.py                   # Streamlit application
    └── .streamlit/
        └── config.toml          # Streamlit server config
```
Setup Instructions
Prerequisites
- Python 3.12+
- OpenAI API key
- PostgreSQL 16+ (or use Docker which includes it)
- (Optional) Docker & Docker Compose
Local Development (Without Docker)
1. Backend Setup
```bash
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate    # Linux/Mac
# or: venv\Scripts\activate # Windows

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your OPENAI_API_KEY and DATABASE_URL

# Run database migrations
alembic upgrade head

# Run the backend
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
The API docs are available at http://localhost:8000/docs (Swagger UI).
2. Frontend Setup
```bash
cd frontend

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env

# Run the frontend
streamlit run app.py --server.port 8501
```
The frontend is available at http://localhost:8501.
3. Default Credentials
| Username | Password |
|---|---|
| admin | admin123 |
You can also register a new account through the UI or API.
Docker Setup
Quick Start
```bash
# From the project root
cp backend/.env.example backend/.env
# Edit backend/.env with your OPENAI_API_KEY

docker-compose up --build
```
This starts three services:
- PostgreSQL at localhost:5432 (persistent via Docker volume)
- Backend at http://localhost:8000 (API + Swagger UI at `/docs`)
- Frontend at http://localhost:8501
The backend container automatically runs Alembic migrations on startup.
Docker Networking
The docker-compose.yml overrides certain .env values for container networking:
- `DATABASE_URL` → points to `db:5432` instead of `localhost:5432`
- `BACKEND_URL` → frontend uses `http://backend:8000` internally
- `FAQ_CSV_PATH` → `/data/faqs.csv` (mounted from `./data`)
Your .env keeps localhost values so local development (without Docker) still works.
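As a hypothetical excerpt (service names match the table above, but the credentials and exact keys in the real docker-compose.yml may differ), the overrides look like:

```yaml
services:
  backend:
    environment:
      # "db" resolves to the PostgreSQL container on the compose network.
      DATABASE_URL: postgresql://postgres:postgres@db:5432/app
      FAQ_CSV_PATH: /data/faqs.csv
    volumes:
      - ./data:/data
  frontend:
    environment:
      # Service name instead of localhost, since containers are peers.
      BACKEND_URL: http://backend:8000
```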
Stop Services
```bash
docker-compose down

# To also remove volumes (clears DB and ChromaDB data):
docker-compose down -v
```
API Documentation
Once the backend is running, interactive API docs are at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Endpoints Summary
| Method | Endpoint | Auth | Tags | Description |
|---|---|---|---|---|
| POST | /auth/login | No | Authentication | Login and get JWT token |
| POST | /auth/register | No | Authentication | Register new user and get JWT |
| POST | /chat | Yes | Chat | Send query to orchestrator agent |
| GET | /weather | Yes | Weather | Get current weather for a city |
| POST | /todos | Yes | Todos | Create a new task |
| GET | /todos | Yes | Todos | List all tasks |
| GET | /todos/{task_id} | Yes | Todos | Get a specific task |
| PUT | /todos/{task_id} | Yes | Todos | Update a task |
| DELETE | /todos/{task_id} | Yes | Todos | Delete a task |
| GET | / | No | Health | Health check |
| GET | /health | No | Health | Detailed health check |
Example Usage
```bash
# 1. Login (uses form-encoded data for OAuth2 compatibility)
TOKEN=$(curl -s -X POST http://localhost:8000/auth/login \
  -d 'username=admin&password=admin123' \
  | jq -r '.access_token')

# 2. Chat (FAQ query)
curl -X POST http://localhost:8000/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What domains are available from BigRock?"}'

# 3. Weather
curl -X GET "http://localhost:8000/weather?city=London" \
  -H "Authorization: Bearer $TOKEN"

# 4. Create a todo
curl -X POST http://localhost:8000/todos \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"title": "Buy groceries", "description": "Milk, eggs, bread"}'
```
Testing
Run the happy-path test suite:
```bash
cd backend
pytest tests/ -v
```
The test suite covers:
- Auth API: Registration, login, duplicate handling, JWT validation
- Weather API: Default city, custom city, auth enforcement
- Todo API: Create, list, update, delete tasks, auth enforcement
- Chat API: FAQ/weather/todo queries (mocked orchestrator for isolation)
Agents Deep Dive
Parent Agent (Orchestrator)
File: backend/app/agents/orchestrator.py
Uses LangGraph StateGraph with typed state (AgentState). The LLM classifies the user's intent into one of: faq, weather, todo, or unknown. A conditional edge routes to the appropriate specialist node based on the classification.
RAG Agent
File: backend/app/agents/rag_agent.py
- Receives the original query
- Performs `similarity_search` against ChromaDB (top 4 results)
- Formats retrieved documents into a context block
- Sends context + query to GPT-4.1 Mini with strict grounding instructions
- Returns the generated answer (or fallback message)
Tool Agent
File: backend/app/agents/tool_agent.py
Dispatches to:
- Weather Tool: Calls the Open-Meteo API (free, no API key required) using a two-step flow: geocode the city name, then fetch the current weather for those coordinates.
- Todo Tools: Calls the FastMCP Todo server via the in-memory MCP client. Supports keyword-based task resolution, so users can refer to tasks by name/description instead of ID.
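The two-step weather flow can be sketched with stdlib `urllib`. The endpoints are Open-Meteo's public geocoding and forecast APIs; the exact parameters and error handling in the real tool may differ:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_URL = "https://geocoding-api.open-meteo.com/v1/search"
FORECAST_URL = "https://api.open-meteo.com/v1/forecast"


def geocode_url(city: str) -> str:
    """Step 1 URL: resolve a city name to coordinates (best match only)."""
    return f"{GEOCODE_URL}?{urlencode({'name': city, 'count': 1})}"


def forecast_url(lat: float, lon: float) -> str:
    """Step 2 URL: fetch current conditions for the coordinates."""
    params = {"latitude": lat, "longitude": lon, "current_weather": "true"}
    return f"{FORECAST_URL}?{urlencode(params)}"


def current_weather(city: str) -> dict:
    # Step 1: geocode the city name.
    with urlopen(geocode_url(city)) as resp:
        place = json.load(resp)["results"][0]
    # Step 2: fetch the current weather for the resolved coordinates.
    with urlopen(forecast_url(place["latitude"], place["longitude"])) as resp:
        return json.load(resp)["current_weather"]
```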
Configuration Reference
All secrets and config are stored in backend/.env:
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | (required) |
| `OPENAI_MODEL` | LLM model name | `gpt-4.1-mini` |
| `JWT_SECRET_KEY` | JWT signing secret | (required) |
| `JWT_ALGORITHM` | JWT algorithm | `HS256` |
| `JWT_ACCESS_TOKEN_EXPIRE_MINUTES` | Token expiry in minutes | `30` |
| `DATABASE_URL` | PostgreSQL connection string | (required) |
| `WEATHER_DEFAULT_CITY` | Default city for weather queries | `London` |
| `CHROMA_PERSIST_DIR` | ChromaDB data directory | `./chroma_db` |
| `CHROMA_COLLECTION_NAME` | ChromaDB collection name | `faq_collection` |
| `FAQ_CSV_PATH` | Path to FAQ CSV file | `../data/faqs.csv` |
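A minimal settings loader matching the table might look like this (illustrative; the project's `app/config.py` may be structured differently, e.g. with pydantic):

```python
import os


class Settings:
    """Reads the variables from the table above; required ones raise if missing."""

    def __init__(self) -> None:
        self.openai_api_key = self._required("OPENAI_API_KEY")
        self.openai_model = os.getenv("OPENAI_MODEL", "gpt-4.1-mini")
        self.jwt_secret_key = self._required("JWT_SECRET_KEY")
        self.jwt_algorithm = os.getenv("JWT_ALGORITHM", "HS256")
        self.jwt_expire_minutes = int(
            os.getenv("JWT_ACCESS_TOKEN_EXPIRE_MINUTES", "30")
        )
        self.database_url = self._required("DATABASE_URL")
        self.weather_default_city = os.getenv("WEATHER_DEFAULT_CITY", "London")
        self.chroma_persist_dir = os.getenv("CHROMA_PERSIST_DIR", "./chroma_db")
        self.chroma_collection_name = os.getenv(
            "CHROMA_COLLECTION_NAME", "faq_collection"
        )
        self.faq_csv_path = os.getenv("FAQ_CSV_PATH", "../data/faqs.csv")

    @staticmethod
    def _required(name: str) -> str:
        value = os.getenv(name)
        if value is None:
            raise RuntimeError(f"Missing required environment variable: {name}")
        return value
```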
License
See LICENSE for details.
