GuardianMcp
An enterprise-grade AI Fraud Investigator. Uses a zero-trust FastMCP bridge to securely fetch data, a custom PyTorch Autoencoder for mathematical anomaly scoring, and LangGraph for agentic routing. Features a Redis Semantic Cache to slash LLM latency and LangSmith for complete MLOps auditability.
🛡️ Guardian-MCP: Autonomous Financial Fraud Investigator
🚀 The Elevator Pitch
Large Language Models (LLMs) cannot be trusted with raw SQL access to banking databases, nor can they be relied on to detect fraud mathematically without hallucinating.
Guardian-MCP is a Proof-of-Concept enterprise AI agent built to solve this. It acts as a Level-1 Fraud Investigator that uses a Zero-Trust Data Architecture (MCP) to securely fetch tokenized bank records, passes them to a custom PyTorch Autoencoder for mathematical anomaly scoring, and uses a Semantic Cache to drastically cut API latency and token costs.
🧠 Architecture & Tech Stack: How It Works
This project is divided into 5 distinct engineering phases, combining deterministic machine learning with generative AI orchestration.
- The Math Engine (PyTorch): Instead of relying on an LLM prompt to guess whether a transaction is fraudulent, a local deep learning Denoising Autoencoder is trained on tabular transaction data. It produces a deterministic reconstruction error, normalized to 0.0–1.0, that serves as the anomaly score (see the scoring sketch after this list).
- The Secure Data Bridge (FastMCP): To protect sensitive financial data, the agent never touches the database. A local Model Context Protocol (MCP) server acts as a secure bridge, exposing a single tool (get_user_transactions) that retrieves only the requested user's tokenized data (see the server sketch after this list).
- The Brain (LangGraph & Gemini): A state machine orchestrates the investigation. It routes the decision loop: Fetch Data (MCP) ➡️ Score Anomaly (PyTorch) ➡️ Draft Report (Gemini 2.5 Flash-Lite) (see the graph sketch after this list).
- Cost Engineering & Latency (LangChain Caching): An embedded Semantic Cache intercepts duplicate analyst queries. If an analyst investigates the same user twice, the system bypasses the LLM entirely, dropping API token costs to zero.
- Observability (LangSmith): Every tensor shape, tool execution, and LLM reasoning step is traced and logged for strict financial compliance and auditability.
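
As a rough illustration of the Math Engine described above, the scoring idea can be sketched as follows. The layer sizes, noise handling, and error normalization here are assumptions for illustration only; the repository's actual training code may differ.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Small autoencoder over tabular transaction features.

    During training, noise is added to the inputs so the model learns to
    reconstruct typical ("normal") transactions; anomalous ones reconstruct poorly.
    """

    def __init__(self, n_features: int, hidden: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def anomaly_score(model: nn.Module, x: torch.Tensor, max_error: float) -> float:
    """Per-transaction reconstruction error, squashed into [0.0, 1.0]."""
    model.eval()
    with torch.no_grad():
        recon = model(x)
        mse = torch.mean((recon - x) ** 2).item()
    return min(mse / max_error, 1.0)  # max_error calibrated on normal training data
```

Because the score is a deterministic function of the model weights and the input features, the agent never asks the LLM to guess whether a transaction is anomalous.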
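The server sketch below shows roughly how a single-tool FastMCP bridge could be wired up. The tool name get_user_transactions comes from the description above; the server name and the tokenized records are placeholders, and the decorator style assumes FastMCP's standard tool API.

```python
from fastmcp import FastMCP

mcp = FastMCP("guardian-data-bridge")  # illustrative server name

@mcp.tool()
def get_user_transactions(user_id: int) -> list[dict]:
    """Return only the requested user's tokenized transaction records."""
    # Placeholder: the real server would query a tokenized store here,
    # never exposing raw PII or arbitrary SQL to the agent.
    return [
        {"user_id": user_id, "amount_token": "tok_94af", "merchant_token": "tok_1c22"},
    ]

if __name__ == "__main__":
    mcp.run()
```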
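The graph sketch below shows how the Fetch ➡️ Score ➡️ Draft loop maps onto a LangGraph state machine. The state schema, node names, and placeholder node bodies are illustrative assumptions, not the repository's actual code.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class InvestigationState(TypedDict):
    user_id: int
    transactions: list
    anomaly_score: float
    report: str

def fetch_data(state: InvestigationState) -> dict:
    # Calls the MCP tool get_user_transactions through the MCP client.
    return {"transactions": []}          # placeholder

def score_anomaly(state: InvestigationState) -> dict:
    # Runs the PyTorch autoencoder over the fetched transactions.
    return {"anomaly_score": 0.0}        # placeholder

def draft_report(state: InvestigationState) -> dict:
    # Asks Gemini 2.5 Flash-Lite to draft the investigation report.
    return {"report": "..."}             # placeholder

graph = StateGraph(InvestigationState)
graph.add_node("fetch_data", fetch_data)
graph.add_node("score_anomaly", score_anomaly)
graph.add_node("draft_report", draft_report)
graph.set_entry_point("fetch_data")
graph.add_edge("fetch_data", "score_anomaly")
graph.add_edge("score_anomaly", "draft_report")
graph.add_edge("draft_report", END)
app = graph.compile()
```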
⚡ Proof of Concept: Cost Engineering in Action
In enterprise AI, lowering latency and API costs is just as important as accuracy. Below is the LangSmith and UI audit trail proving the efficiency of the Semantic Cache.
Run 1: The Initial Investigation (Cold Cache)
The analyst requests an investigation for User ID 15. The agent executes the full LangGraph workflow, hits the MCP server, runs the PyTorch model, and queries Gemini.
- Latency: ~3.1 seconds
- Cost: Full Token Usage


Run 2: The Semantic Cache Hit
The analyst requests the same investigation. The embedding model recognizes the semantic intent, intercepts the graph execution, and instantly returns the cached decision.
- Latency: ~1.5 seconds (Over 50% Latency Reduction)
- Cost: 0 (100% Cost Reduction)
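
Conceptually, the cache hit above works by comparing the embedding of the new query against embeddings of previously answered queries. The sketch below is a hand-rolled illustration (cosine similarity with an assumed threshold), not the project's actual cache implementation.

```python
import numpy as np

class SemanticCache:
    """Return a stored answer when a new query embeds close to a previous one."""

    def __init__(self, embed_fn, threshold: float = 0.92):
        self.embed_fn = embed_fn      # e.g. a sentence-embedding model
        self.threshold = threshold    # cosine-similarity cutoff (assumed value)
        self.entries: list[tuple[np.ndarray, str]] = []

    def lookup(self, query: str) -> str | None:
        q = self.embed_fn(query)
        for vec, answer in self.entries:
            sim = float(np.dot(q, vec) / (np.linalg.norm(q) * np.linalg.norm(vec)))
            if sim >= self.threshold:
                return answer         # cache hit: skip the LLM entirely
        return None

    def store(self, query: str, answer: str) -> None:
        self.entries.append((self.embed_fn(query), answer))
```

On a hit, graph execution is intercepted and the cached decision is returned immediately, which is what produces the zero-cost Run 2 above.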


🛠️ How to Run Locally
This project is designed to run entirely locally (CPU-only) without requiring cloud database infrastructure.
1. Install Dependencies
pip install torch pandas scikit-learn fastmcp langgraph langchain langchain-core langchain-google-genai langsmith fastapi uvicorn
2. Set Variables
set GOOGLE_API_KEY=your_gemini_key
set LANGSMITH_TRACING=true
set LANGSMITH_API_KEY=your_langsmith_key
set LANGSMITH_PROJECT=Guardian-MCP-Local
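The set commands above are Windows (cmd.exe) syntax; on macOS/Linux, export the same variables instead:
export GOOGLE_API_KEY=your_gemini_key
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=your_langsmith_key
export LANGSMITH_PROJECT=Guardian-MCP-Local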
3. Boot the Architecture (Requires 2 Terminal Windows)
Terminal 1 (Start the MCP Server): python phase2_mcp.py
Terminal 2 (Start the FastAPI Backend): uvicorn phase5_api:app --reload
Navigate to http://127.0.0.1:8000 to access the Analyst Dashboard.
