GuardianMcp
An enterprise-grade AI Fraud Investigator. Uses a zero-trust FastMCP bridge to securely fetch data, a custom PyTorch Autoencoder for mathematical anomaly scoring, and LangGraph for agentic routing. Features a Redis Semantic Cache to slash LLM latency and LangSmith for complete MLOps auditability.
🛡️ Guardian-MCP: Autonomous Financial Fraud Investigator
🚀 The Elevator Pitch
Large Language Models (LLMs) cannot be trusted with raw SQL access to banking databases, nor can they be relied on to detect fraud mathematically without hallucinating.
Guardian-MCP is a Proof-of-Concept enterprise AI agent built to solve this. It acts as a Level-1 Fraud Investigator that uses a Zero-Trust Data Architecture (MCP) to securely fetch tokenized bank records, passes them to a custom PyTorch Autoencoder for mathematical anomaly scoring, and uses a Semantic Cache to drastically cut API latency and token costs.
🧠 Architecture & Tech Stack: How It Works
This project is divided into 5 distinct engineering phases, combining deterministic machine learning with generative AI orchestration.
- The Math Engine (PyTorch): Instead of relying on an LLM prompt to guess whether a transaction is fraudulent, a local deep learning Denoising Autoencoder is trained on tabular transaction data. It produces a deterministic reconstruction error, normalized to 0.0–1.0, that serves as the anomaly score (see the scoring sketch after this list).
- The Secure Data Bridge (FastMCP): To protect sensitive financial data, the agent never touches the database. A local Model Context Protocol (MCP) server acts as a secure bridge, exposing a single tool (get_user_transactions) that retrieves only the requested user's tokenized data (see the server sketch after this list).
- The Brain (LangGraph & Gemini): A state machine orchestrates the investigation. It routes the decision loop: Fetch Data (MCP) ➡️ Score Anomaly (PyTorch) ➡️ Draft Report (Gemini 2.5 Flash-Lite) (see the graph sketch after this list).
- Cost Engineering & Latency (LangChain Caching): An embedded Semantic Cache intercepts duplicate analyst queries. If an analyst investigates the same user twice, the system bypasses the LLM entirely, dropping API token costs to zero.
- Observability (LangSmith): Every tensor shape, tool execution, and LLM reasoning step is traced and logged for strict financial compliance and auditability.
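
As a rough illustration of the Math Engine described above, the scoring idea can be sketched as follows. The layer sizes, noise handling, and error normalization here are assumptions for illustration only; the repository's actual training code may differ.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Small autoencoder over tabular transaction features.

    During training, noise is added to the inputs so the model learns to
    reconstruct typical ("normal") transactions; anomalous ones reconstruct poorly.
    """

    def __init__(self, n_features: int, hidden: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def anomaly_score(model: nn.Module, x: torch.Tensor, max_error: float) -> float:
    """Per-transaction reconstruction error, squashed into [0.0, 1.0]."""
    model.eval()
    with torch.no_grad():
        recon = model(x)
        mse = torch.mean((recon - x) ** 2).item()
    return min(mse / max_error, 1.0)  # max_error calibrated on normal training data
```

Because the score is a deterministic function of the model weights and the input features, the agent never asks the LLM to guess whether a transaction is anomalous.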
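The server sketch below shows roughly how a single-tool FastMCP bridge could be wired up. The tool name get_user_transactions comes from the description above; the server name and the tokenized records are placeholders, and the decorator style assumes FastMCP's standard tool API.

```python
from fastmcp import FastMCP

mcp = FastMCP("guardian-data-bridge")  # illustrative server name

@mcp.tool()
def get_user_transactions(user_id: int) -> list[dict]:
    """Return only the requested user's tokenized transaction records."""
    # Placeholder: the real server would query a tokenized store here,
    # never exposing raw PII or arbitrary SQL to the agent.
    return [
        {"user_id": user_id, "amount_token": "tok_94af", "merchant_token": "tok_1c22"},
    ]

if __name__ == "__main__":
    mcp.run()
```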
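The graph sketch below shows how the Fetch ➡️ Score ➡️ Draft loop maps onto a LangGraph state machine. The state schema, node names, and placeholder node bodies are illustrative assumptions, not the repository's actual code.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class InvestigationState(TypedDict):
    user_id: int
    transactions: list
    anomaly_score: float
    report: str

def fetch_data(state: InvestigationState) -> dict:
    # Calls the MCP tool get_user_transactions through the MCP client.
    return {"transactions": []}          # placeholder

def score_anomaly(state: InvestigationState) -> dict:
    # Runs the PyTorch autoencoder over the fetched transactions.
    return {"anomaly_score": 0.0}        # placeholder

def draft_report(state: InvestigationState) -> dict:
    # Asks Gemini 2.5 Flash-Lite to draft the investigation report.
    return {"report": "..."}             # placeholder

graph = StateGraph(InvestigationState)
graph.add_node("fetch_data", fetch_data)
graph.add_node("score_anomaly", score_anomaly)
graph.add_node("draft_report", draft_report)
graph.set_entry_point("fetch_data")
graph.add_edge("fetch_data", "score_anomaly")
graph.add_edge("score_anomaly", "draft_report")
graph.add_edge("draft_report", END)
app = graph.compile()
```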
⚡ Proof of Concept: Cost Engineering in Action
In enterprise AI, lowering latency and API costs is just as important as accuracy. Below is the LangSmith and UI audit trail proving the efficiency of the Semantic Cache.
Run 1: The Initial Investigation (Cold Cache)
The analyst requests an investigation for User ID 15. The agent executes the full LangGraph workflow, hits the MCP server, runs the PyTorch model, and queries Gemini.
- Latency: ~3.1 seconds
- Cost: Full Token Usage


Run 2: The Semantic Cache Hit
The analyst requests the same investigation. The embedding model recognizes the semantic intent, intercepts the graph execution, and instantly returns the cached decision.
- Latency: ~1.5 seconds (Over 50% Latency Reduction)
- Cost: 0 (100% Cost Reduction)
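
Conceptually, the cache hit above works by comparing the embedding of the new query against embeddings of previously answered queries. The sketch below is a hand-rolled illustration (cosine similarity with an assumed threshold), not the project's actual cache implementation.

```python
import numpy as np

class SemanticCache:
    """Return a stored answer when a new query embeds close to a previous one."""

    def __init__(self, embed_fn, threshold: float = 0.92):
        self.embed_fn = embed_fn      # e.g. a sentence-embedding model
        self.threshold = threshold    # cosine-similarity cutoff (assumed value)
        self.entries: list[tuple[np.ndarray, str]] = []

    def lookup(self, query: str) -> str | None:
        q = self.embed_fn(query)
        for vec, answer in self.entries:
            sim = float(np.dot(q, vec) / (np.linalg.norm(q) * np.linalg.norm(vec)))
            if sim >= self.threshold:
                return answer         # cache hit: skip the LLM entirely
        return None

    def store(self, query: str, answer: str) -> None:
        self.entries.append((self.embed_fn(query), answer))
```

On a hit, graph execution is intercepted and the cached decision is returned immediately, which is what produces the zero-cost Run 2 above.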


🛠️ How to Run Locally
This project is designed to run entirely locally (CPU-only) without requiring cloud database infrastructure.
1. Install Dependencies
pip install torch pandas scikit-learn fastmcp langgraph langchain langchain-core langchain-google-genai langsmith fastapi uvicorn
2. Set Variables
set GOOGLE_API_KEY=your_gemini_key
set LANGSMITH_TRACING=true
set LANGSMITH_API_KEY=your_langsmith_key
set LANGSMITH_PROJECT=Guardian-MCP-Local
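The set commands above are Windows (cmd.exe) syntax; on macOS/Linux, export the same variables instead:
export GOOGLE_API_KEY=your_gemini_key
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=your_langsmith_key
export LANGSMITH_PROJECT=Guardian-MCP-Local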
3. Boot the Architecture (Requires 2 Terminal Windows)
Terminal 1 (Start the MCP Server): python phase2_mcp.py
Terminal 2 (Start the FastAPI Backend): uvicorn phase5_api:app --reload
Navigate to http://127.0.0.1:8000 to access the Analyst Dashboard.
