Asset Aware MCP
Medical RAG with Asset-Aware MCP - AI Agent precisely accesses tables, sections, figures and knowledge graphs from PDF documents
Installation
npx asset-aware-mcp
asset-aware-mcp
Medical RAG with Asset-Aware MCP - Precise PDF asset retrieval (tables, figures, sections) and Knowledge Graph for AI Agents.
繁體中文 (Traditional Chinese)
Why Asset-Aware MCP?
A common misconception is that an AI agent can read image files directly from your computer. It cannot: the image content has to be delivered to the model.
| Method | Can AI analyze image content? | Description |
|---|---|---|
| Provide PNG path | No | AI cannot access the local file system |
| Asset-Aware MCP | Yes | Retrieves the image as Base64 via MCP so the model's vision input can analyze it directly |
Real-world Effect
# After retrieving the image via MCP, the AI can analyze it directly:
User: What is this figure about?
AI: This is the architecture diagram for Scaled Dot-Product Attention:
1. Inputs: Q (Query), K (Key), V (Value)
2. MatMul of Q and K
3. Scale (1/√dₖ)
4. Optional Mask (for decoder)
5. SoftMax normalization
6. Final MatMul with V to get the output
This is the value of Asset-Aware MCP - enabling AI Agents to truly "see" and understand charts and tables in your PDF literature.
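To make the Base64 path concrete, here is a minimal sketch of how raw PNG bytes can be wrapped as an image content item and decoded again on the receiving side. The helper names `to_image_content` and `from_image_content` are hypothetical and not part of asset-aware-mcp; the `type` / `data` / `mimeType` fields follow the MCP image content shape, and the 8-byte PNG signature stands in for a real figure.

```python
import base64

# First 8 bytes of every PNG file (the PNG signature); stands in for a real figure.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_image_content(png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes as a Base64 image content item (hypothetical helper)."""
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }


def from_image_content(content: dict) -> bytes:
    """Decode the Base64 payload back into raw bytes."""
    return base64.b64decode(content["data"])


content = to_image_content(PNG_SIGNATURE)
print(content["data"])  # Base64 text is what travels over MCP, not a file path
```

The point of the sketch is that only text crosses the protocol boundary: the agent never needs access to the local file system.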
Features
- Asset-Aware ETL - PDF → Markdown with dual-engine PDF parsing:
  - PyMuPDF (default) - Fast extraction (~50MB)
  - Marker (optional, `use_marker=True`) - High-precision structured parsing with `blocks.json` (bbox/coordinates)
- Unified Segmentation Export - Normalized `segmentation.json` merges manifest, blocks, reading order, and persisted markdown line spans for downstream tools and extensions.
- Layout Overlay Debugging - Render page overlays from `original.pdf` to inspect bbox, segment type, and reading order visually.
- On-Demand OCR Preprocessing - Optional `ocrmypdf` preprocessing path for scanned PDFs before ETL.
- Section Navigation - Dynamic hierarchy section tree with 5 tools: browse, search, detail, content reading, and block extraction for any depth of headings.
- Async Job Pipeline - Supports asynchronous task processing and progress tracking for large documents.
- Document Manifest - Provides a structured "map" of the document for precise data access by Agents.
- LightRAG Integration - Knowledge Graph + Vector Index, supporting cross-document comparison and reasoning.
- Citation-Aware KG Output - `consult_knowledge_graph` now supports structured answer/reference payloads for downstream agent workflows.
- Docx Editing (DFM) - Edit .docx files in Markdown via Docx-Flavored Markdown format. Supports legacy `.doc`, `.odt`, and `.ods` ingest via LibreOffice auto-conversion. 16 tools: ingest, read, save, list, delete, export, strict round-trip validation, DOCX→PDF, DOCX→DOC, DOCX→ODT, and Docx ↔ A2T bridges.
- DFM Integrity Checker - Automatic validation and auto-repair at every pipeline stage (post-ingest, pre-save, post-save). Catches orphan markers, column mismatches, and format inconsistencies.
- A2T (Anything to Table) - 7 operation-based tools for building professional tables from any source (PDF assets, Knowledge Graph, URLs, user input). Features: Citations (AssetRef), Audit Trail, Schema Evolution, Templates, Drafting, and Token-efficient resumption.
- VS Code Management Extension - Graphical interface for monitoring server status, ingested documents, and A2T tables/drafts with one-click Excel export.
- MCP Server - Exposes tools and resources to Copilot/Claude via FastMCP.
- Medical Research Focus - Optimized for medical literature, supporting Base64 image transmission for Vision AI analysis.
Architecture
┌─────────────────────────────────────────────────────────┐
│                   AI Agent (Copilot)                    │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Tools & Resources)
┌─────────────────────▼───────────────────────────────────┐
│            MCP Server (Modular Presentation)            │
│  ┌───────────────────────────────────────────────────┐  │
│  │ tools/: 59 tools in 7 modules                     │  │
│  │ document (18) │ docx (16) │ section (5)           │  │
│  │ job (4) │ knowledge (3) │ table (7) │ profile (6) │  │
│  └───────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────┐  │
│  │ resources/: 13 resources in 2 modules             │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                   ETL Pipeline (DDD)                    │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │ PyMuPDF  │  │  Asset   │  │ LightRAG │               │
│  │ Adapter  │→ │  Parser  │→ │  Index   │               │
│  └──────────┘  └──────────┘  └──────────┘               │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                      Local Storage                      │
│  ./data/                                                │
│  ├── doc_{id}/         # Document Assets                │
│  ├── docx_{id}/        # Docx IR + DFM + Assets         │
│  ├── tables/           # A2T Tables (JSON/MD/XLSX)      │
│  │   └── drafts/       # Table Drafts (Persistence)     │
│  └── lightrag_db/      # Knowledge Graph                │
└─────────────────────────────────────────────────────────┘
Project Structure (DDD)
asset-aware-mcp/
├── src/
│   ├── domain/          # Domain: Entities, Value Objects, Interfaces
│   ├── application/     # Application: Doc Service, Table Service (A2T), Asset Service
│   ├── infrastructure/  # Infrastructure: PyMuPDF, LightRAG, Excel Renderer
│   └── presentation/    # Presentation: MCP Server (FastMCP)
├── data/                # Document and Asset Storage
├── docs/
│   └── spec.md          # Technical Specification
├── tests/               # Unit and Integration Tests
├── vscode-extension/    # VS Code Management Extension
└── pyproject.toml       # uv Project Config
Architecture Diagrams
Visual overview of the project. All diagrams use a consistent GitHub README style.
| Diagram | Description |
|---|---|
| 01 - System Architecture | Full stack: Telegram → Gateway → MCP Adapter → 3 MCP servers → Ollama |
| 02 - Data Layout | 59 tools organized in 7 categories with asset-aware data tree |
| 03 - PDF Ingestion Pipeline | 7-stage flow from PDF upload to knowledge graph |
| 04 - DOCX Bidirectional Edit | DOCX ingest → TableContext edit → round-trip save workflow |
| 05 - Knowledge Graph Search | Cross-document search with 3 parallel query paths |
| 06 - Installation Steps | 7-step installation from clone to verification |
| 07 - PDF ETL Pipeline | Dual-engine parsing: PyMuPDF + Marker |
| 08 - KG Architecture | lightrag-hku 3-layer KG architecture |
| 09 - Agent Harness Concept | Assistant harness model for stateless agents |
All generation prompts are saved in `docs/diagrams/ALL-PROMPTS.md` for style consistency and regeneration.
Quick Start
# Install dependencies (using uv); the default install skips Marker/torch
uv sync
# v0.6.27: Marker extra is temporarily empty because marker-pdf pins
# Pillow<11 while the secure runtime requires Pillow>=12.2.0.
# Use the default PyMuPDF backend until upstream marker-pdf supports patched Pillow.
# Run MCP Server
uv run python -m src.presentation.server
# Or use the VS Code extension for graphical management
Runtime note:
The VS Code extension prefers a managed Python 3.11 runtime when launching the MCP server via uv or uvx. This avoids native package builds on end-user machines, especially macOS systems without Xcode Command Line Tools, while keeping the project itself compatible with newer Python versions.
Installation scope note:
- The VS Code extension installs once per user (global). The MCP server launched through `uvx asset-aware-mcp` reuses the user uv cache rather than reinstalling per workspace.
- Runtime data stays with your repo: `.env` and `assetAwareMcp.dataDir` default to `./data`, so ingested assets remain scoped to the current workspace.
Marker note:
In v0.6.27 the packaged Marker extra is intentionally on security hold: upstream marker-pdf 1.10.2 requires Pillow<11, while this release pins Pillow>=12.2.0 for patched image-processing security. Default installs use the PyMuPDF backend only. use_marker=True / parse_pdf_structure will report that Marker is unavailable until upstream Marker supports a patched Pillow range.
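The conflict described in the Marker note can be checked with a few lines of plain Python. This is only a sketch: `parse_version` and `satisfies_both` are hypothetical helpers, the comparison handles purely numeric versions (real resolvers follow fuller rules), and the candidate version strings are illustrative rather than a list of actual Pillow releases.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '12.2.0' into (12, 2, 0) for simple numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def satisfies_both(version: str) -> bool:
    """True if a Pillow version meets both constraints from the note above."""
    v = parse_version(version)
    marker_ok = v < parse_version("11")        # marker-pdf: Pillow<11
    runtime_ok = v >= parse_version("12.2.0")  # this release: Pillow>=12.2.0
    return marker_ok and runtime_ok


# Illustrative candidates only; the two ranges do not overlap for any version.
candidates = ["10.4.0", "11.0.0", "12.2.0", "12.3.1"]
compatible = [v for v in candidates if satisfies_both(v)]
print(compatible)
```

Because one range ends below 11 and the other starts at 12.2.0, no single Pillow version can satisfy both, which is why the Marker extra is left empty for now.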
MCP Tools
Document & Asset Tools
| Tool | Purpose |
|---|---|
| `ingest_documents` | Process PDF files with PyMuPDF; `use_marker=True` currently falls back or fails closed while Marker is on security hold |
| `list_documents` | List all ingested documents and their asset counts |
| `delete_document` | Delete an ingested PDF, its local artifacts, and LightRAG index entries when enabled |
| `convert_pdf_to_docx` | Reconstruct a readable DOCX from extracted PDF content |
| `convert_pdf_to_pptx` | Rebuild editable PPTX slides from extracted PDF markdown and figures |
| `inspect_document_manifest` | Inspect document structure before fetching specific assets |
| `fetch_document_asset` | Precisely retrieve tables (MD) / figures (B64) / sections |
| `parse_pdf_structure` | Queue structured parsing work; Marker output is unavailable in v0.6.27 until upstream Marker supports patched Pillow |
| `search_source_location` | Search exact source locations with page + bbox for verification |
| `export_document_segmentation` | Export normalized `segmentation.json` with reading order + line ranges |
| `visualize_document_layout` | Render page overlay images for bbox / type / reading-order inspection |
| `ocr_pdf_document` | Run OCR preprocessing and generate a cleaned PDF for later ETL |
| `find_evidence_spans` | Search citation-ready spans with source revision, locator, hash, and CRAAP scaffold |
| `verify_citation_ref` | Verify span AssetRefs against the current citation index and locator metadata |
| `document` | Operation-based facade over PDF ingest/list/delete/inspect/parse |
| `document_asset` | Operation-based facade over asset fetch and section tree/detail/blocks/search |
| `evidence` | Operation-based facade over citation span find/verify/source-location search |
| `convert_document` | Operation-based facade for PDF, DOCX/DFM, and Markdown conversions |
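A typical flow with these tools is ingest, then inspect the manifest, then fetch one asset. The sketch below builds the corresponding JSON-RPC 2.0 `tools/call` requests that an MCP client would send. The tool names come from the table above; the argument names (`file_paths`, `doc_id`, `asset_type`, `asset_id`) and their values are assumptions for illustration, so check the server's tool schemas for the real parameters.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names and values below are hypothetical placeholders.
requests = [
    tool_call("ingest_documents", {"file_paths": ["./papers/example.pdf"]}),
    tool_call("inspect_document_manifest", {"doc_id": "doc_example"}),
    tool_call(
        "fetch_document_asset",
        {"doc_id": "doc_example", "asset_type": "figure", "asset_id": "fig_1"},
    ),
]

for request in requests:
    print(json.dumps(request))
```

Inspecting the manifest before fetching keeps the agent from guessing asset identifiers.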
Job Management Tools
| Tool | Purpose |
|---|---|
| `get_job_status` | Get async ingestion job progress and final result |
| `list_jobs` | List active or historical ETL jobs |
| `cancel_job` | Cancel a running ETL job |
| `job` | Operation-based facade over job get/list/cancel |
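Since ingestion runs asynchronously, a client usually polls `get_job_status` until the job finishes. Here is a minimal polling sketch. The `state` field and the state names (`queued`, `running`, `completed`) are assumptions, and `fake_get_status` is a stub standing in for the real tool call.

```python
import time
from typing import Callable


def wait_for_job(get_status: Callable[[str], dict], job_id: str,
                 interval: float = 0.5, max_polls: int = 20) -> dict:
    """Poll until the job leaves the in-progress states or polls run out."""
    for _ in range(max_polls):
        status = get_status(job_id)
        if status["state"] not in ("queued", "running"):  # assumed state names
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")


# Stub standing in for a real `get_job_status` tool call.
_fake_states = iter(["queued", "running", "running", "completed"])


def fake_get_status(job_id: str) -> dict:
    return {"job_id": job_id, "state": next(_fake_states)}


result = wait_for_job(fake_get_status, "job_demo", interval=0.0)
print(result["state"])
```

Bounding the number of polls avoids waiting forever on a stuck job; `cancel_job` is the natural follow-up when the limit is hit.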
Knowledge Graph Tools
| Tool | Purpose |
|---|---|
| `consult_knowledge_graph` | Citation-aware knowledge graph query with structured, data, and text response modes |
| `export_knowledge_graph` | Export graph summary / JSON / Mermaid for inspection |
| `knowledge` | Operation-based facade over knowledge graph consult/export |
Knowledge graph note:
- `consult_knowledge_graph` defaults to `response_mode="structured"` and can return `answer`, `references`, `metadata`, `retrieval`, and `counts` for agent-side citation workflows.
- Use `response_mode="data"` when you want retrieval payloads without final answer synthesis, or `response_mode="text"` for legacy plain-text behavior.
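As a rough illustration of an agent-side citation workflow, the sketch below turns a structured response into answer text with numbered references. The top-level keys match the note above; the per-reference fields (`doc_id`, `page`) and the example payload are hand-written assumptions, not actual server output.

```python
def summarize_kg_response(payload: dict) -> str:
    """Render a structured response as answer text plus numbered references."""
    lines = [payload.get("answer", "")]
    for index, ref in enumerate(payload.get("references", []), start=1):
        # The per-reference fields here are assumed for illustration.
        lines.append(f"[{index}] {ref['doc_id']} p.{ref['page']}")
    return "\n".join(lines)


# Hand-written example payload; real responses may carry different fields.
example = {
    "answer": "Both studies report the same primary endpoint.",
    "references": [
        {"doc_id": "doc_a", "page": 4},
        {"doc_id": "doc_b", "page": 7},
    ],
    "metadata": {"response_mode": "structured"},
    "retrieval": {},
    "counts": {"references": 2},
}

print(summarize_kg_response(example))
```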
Section Navigation Tools (Dynamic Hierarchy)
| Tool | Purpose |
|---|---|
| `list_section_tree` | Display complete section hierarchy tree (supports any depth) |
| `get_section_detail` | Get detailed info for a specific section |
| `get_section_blocks` | Extract all blocks from a section with page + bbox |
| `search_sections` | Search section titles |
| `get_section_content` | Read section content via asset service |
Docx Editing Tools (DFM - Docx-Flavored Markdown)
Edit .docx files as Markdown. Formatting, tables, and media are preserved on round-trip.
| Tool | Purpose |
|---|---|
| `ingest_docx` | Import .docx and decompose into DFM blocks |
| `get_docx_content` | Read DFM content of specific blocks |
| `save_docx` | Write DFM edits back to .docx |
| `list_docx_blocks` | List document block structure |
| `list_docx_documents` | List all ingested DOCX/DFM documents |
| `delete_docx` | Delete an ingested DOCX/DFM document and its local artifacts |
| `convert_docx_to_pdf` | Export the current DOCX/DFM state to PDF in fidelity mode |
| `convert_docx_to_doc` | Export the current DOCX/DFM state to DOC in fidelity mode |
| `docx_validate_roundtrip` | 6-dimension round-trip fidelity validation + file-level comparison (SHA-256, ZIP diff) |
| `docx_table_to_context` | Bridge: Docx table → A2T context |
| `docx_table_from_context` | Bridge: A2T table → Docx table |
| `docx_chart_data` | Extract chart data from Docx |
| `export_markdown` | Export Markdown to .docx/.pdf/.doc |
| `convert_docx_to_odt` | Export the current DOCX/DFM state to ODT |
| `docx` | Operation-based facade over DOCX/DFM ingest/get/save/list/delete/blocks/validate |
| `docx_table` | Operation-based facade over DOCX table to_context/from_context/chart_data |
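The file-level comparison mentioned for `docx_validate_roundtrip` (SHA-256 plus ZIP diff) can be pictured with the standard library alone. This is a sketch of the general technique, not the project's implementation: `zip_diff` and `make_zip` are hypothetical helpers, and the in-memory archives stand in for real .docx files, which are ZIP containers.

```python
import hashlib
import io
import zipfile


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def zip_diff(original: bytes, roundtrip: bytes) -> dict:
    """Compare two ZIP archives member by member using content hashes."""
    def member_hashes(blob: bytes) -> dict:
        with zipfile.ZipFile(io.BytesIO(blob)) as archive:
            return {name: sha256_hex(archive.read(name)) for name in archive.namelist()}

    before, after = member_hashes(original), member_hashes(roundtrip)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(n for n in before.keys() & after.keys() if before[n] != after[n]),
    }


def make_zip(members: dict) -> bytes:
    """Build an in-memory ZIP; stands in for a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in members.items():
            archive.writestr(name, text)
    return buffer.getvalue()


a = make_zip({"word/document.xml": "<w:p>old</w:p>", "word/styles.xml": "<s/>"})
b = make_zip({"word/document.xml": "<w:p>new</w:p>", "word/styles.xml": "<s/>"})
print(zip_diff(a, b))
```

Hashing each member separately localizes a round-trip difference to a specific part of the package, which a single whole-file hash cannot do.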
A2T (Anything to Table) Tools - 7 Operation-Based Tools
Agent-friendly design: each tool handles multiple operations via the `operation` parameter. Tables accept any source: PDF assets, KG entities, external URLs, or user input.
| Tool | Operations | Purpose |
|---|---|---|
| `plan_table` | schema / templates / from_template | Schema planning, browse 4 built-in templates, create from template |
| `table_manage` | create / delete / list / preview / resume / render / add_column / remove_column / rename_column | Table lifecycle + Schema evolution |
| `table_data` | add_rows / get_row / update_row / delete_row / get_cell / update_cell / clear_cell | Row & cell CRUD |
| `table_cite` | add / get / remove / cell_history | Citation management with AssetRef (7 source types) |
| `table_history` | changes / tokens | Audit trail & token estimation |
| `table_draft` | create / update / add_rows / resume / commit / list / delete | Draft workflow with persistence |
| `discover_sources` | - | Cross-document source discovery (sections, tables, figures, KG) |
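The operation-based design can be sketched as a single entry point that routes on an `operation` argument. The code below is a simplified stand-in, not the server's actual `table_data` tool: it covers three of the listed operations, keeps rows in memory, and uses assumed parameter names (`table_id`, `rows`, `index`).

```python
from typing import Any, Callable

# In-memory store standing in for real table persistence.
_tables: dict[str, list[dict]] = {}


def _add_rows(table_id: str, rows: list[dict]) -> dict:
    _tables.setdefault(table_id, []).extend(rows)
    return {"row_count": len(_tables[table_id])}


def _get_row(table_id: str, index: int) -> dict:
    return _tables[table_id][index]


def _delete_row(table_id: str, index: int) -> dict:
    return {"deleted": _tables[table_id].pop(index)}


_OPERATIONS: dict[str, Callable[..., Any]] = {
    "add_rows": _add_rows,
    "get_row": _get_row,
    "delete_row": _delete_row,
}


def table_data(operation: str, **params: Any) -> Any:
    """Single entry point that routes to a handler chosen by `operation`."""
    handler = _OPERATIONS.get(operation)
    if handler is None:
        raise ValueError(f"unknown operation: {operation!r}")
    return handler(**params)


table_data("add_rows", table_id="t1", rows=[{"drug": "A"}, {"drug": "B"}])
print(table_data("get_row", table_id="t1", index=1))
```

Grouping related operations behind one tool keeps the tool list short for the agent while still exposing the full CRUD surface.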
ETL Profile Tools
Different journals/formats need different extraction settings. Use these tools to switch profiles.
| Tool | Purpose |
|---|---|
| `list_etl_profiles` | List all available profiles (default, arxiv, nature, ieee, elsevier) |
| `get_etl_profile` | Get detailed configuration of a specific profile |
| `get_current_etl_profile` | Show currently active profile |
| `set_etl_profile` | Switch profile for subsequent document ingestion |
| `load_etl_profile_from_json` | Load custom profile from JSON file |
| `etl_profile` | Operation-based facade over profile list/get/current/set/load |
Tech Stack
| Category | Technology |
|---|---|
| Language | Python 3.10+ |
| Package Manager | uv (all pip/setup-python removed) |
| ETL | PyMuPDF (fitz); Marker is temporarily on security hold |
| RAG | LightRAG (lightrag-hku) |
| MCP | FastMCP |
| Storage | Local filesystem (JSON/Markdown/PNG) |
Documentation
Installation guidance:
- Default install: `uv sync`
- Marker backend: temporarily disabled in v0.6.27 because `marker-pdf` pins vulnerable `Pillow<11`; the `marker`/`pdf` extras are compatibility placeholders until upstream supports patched Pillow.
- VS Code extension: `assetAwareMcp.enableMarkerBackend` is retained as a setting, but the launcher will not install `marker-pdf` while the security hold is active.
- Technical Spec - Detailed technical specification
- Architecture - System architecture
- Constitution - Project principles
- Competitive Analysis - MCP + DOCX ecosystem landscape
