Claude Skill For Multimodal Report Generation
A Claude Skill that converts mixed-format files (documents, images, audio/video) into structured, template-based reports using a custom MCP server.
Installation
npx claude-skill-for-multimodal-report-generation
Document Processing Skill
A Claude Code skill and MCP server for converting scattered meeting materials into structured Word documents.
For a detailed explanation of the design and development of this code, see the related article on Data Science Collective.
Overview
This project provides an automated pipeline for transforming various inputs (audio recordings, handwritten notes, diagrams, digital notes, and supplementary documents) into a single, well-formatted Microsoft Word deliverable.

Project Structure
document-processing-skill/
├── documenting-meetings/          # Claude skill for meeting documentation
│   ├── SKILL.md                   # Main skill specification and workflow
│   ├── EVALUATION.md              # Test evaluation prompts and criteria
│   └── reference/
│       ├── INPUT_FORMATS.md       # Detailed input file handling guide
│       └── OUTPUT_SECTIONS.md     # Output section specification
├── transcription-MCP/             # MCP server for audio/video transcription
│   ├── server.py                  # FastMCP server implementation
│   └── .env                       # Environment configuration (requires setup)
└── sample_data/                   # Example data for testing
    ├── input_documents/           # Meeting materials (audio, images, docs, notes)
    ├── templates/                 # Blank template documents
    └── sample_documents/          # Sample output documents for formatting reference
Components
1. Claude Skill: documenting-meetings
A Claude Code skill that orchestrates the entire meeting documentation workflow.
Trigger Keywords: meeting notes, meeting summary, meeting minutes, meeting documentation, action items from meeting
Supported Input Formats:
| Category | File Types |
|---|---|
| Audio/Video | .mp3, .m4a, .wav, .ogg, .flac, .mp4, .mov, .avi, .mkv, .webm |
| Images | .jpg, .png, .gif, .webp, .bmp, .tiff, .heic |
| Digital Notes | .txt, .md, .rtf, .html |
| Supplementary | .pdf, .pptx, .xlsx, .docx |
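The table above amounts to a lookup from file extension to input category. A minimal sketch of that lookup follows; the category keys, the `classify` function name, and the case-insensitive matching are assumptions for illustration, not taken from the skill itself.

```python
from pathlib import Path

# Extension groups copied from the supported input formats table.
# The dictionary keys are hypothetical labels chosen for this sketch.
CATEGORIES = {
    "audio_video": {".mp3", ".m4a", ".wav", ".ogg", ".flac",
                    ".mp4", ".mov", ".avi", ".mkv", ".webm"},
    "image": {".jpg", ".png", ".gif", ".webp", ".bmp", ".tiff", ".heic"},
    "digital_note": {".txt", ".md", ".rtf", ".html"},
    "supplementary": {".pdf", ".pptx", ".xlsx", ".docx"},
}


def classify(file_name: str) -> str:
    """Return the category for a file name, or "unsupported"."""
    suffix = Path(file_name).suffix.lower()  # assumption: match case-insensitively
    for category, extensions in CATEGORIES.items():
        if suffix in extensions:
            return category
    return "unsupported"
```

For example, `classify("meeting_recording.mp3")` falls into the audio/video group, while a file such as `archive.zip` is reported as unsupported.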
Workflow:
- Collect context (meeting title, output preferences, focus areas)
- Validate input folder structure
- Inventory input files
- Transcribe audio/video recordings via MCP
- Interpret images (handwritten notes, diagrams)
- Read digital notes
- Check for templates and sample documents
- Generate Word deliverable
- Save to input folder
Default Output Structure:
- Meeting Summary (date, attendees, duration)
- Executive Summary
- Decisions Made
- Action Items (table with owner, due date, priority)
- Open Questions
- Follow-up Message (email template)
Sections are omitted if no relevant information exists.
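The omission rule can be pictured as a filter over the default section list. The sketch below is one possible reading, assuming extracted content arrives as a mapping from section name to a list of text entries; `select_sections` and that data shape are hypothetical, not the skill's actual implementation.

```python
from typing import Dict, List

# Section order taken from the default output structure above.
DEFAULT_SECTIONS = [
    "Meeting Summary",
    "Executive Summary",
    "Decisions Made",
    "Action Items",
    "Open Questions",
    "Follow-up Message",
]


def select_sections(content: Dict[str, List[str]]) -> List[str]:
    """Keep default sections, in order, that have at least one non-blank entry."""
    return [
        name
        for name in DEFAULT_SECTIONS
        if any(entry.strip() for entry in content.get(name, []))
    ]
```

Sections with no entries, or only whitespace entries, drop out while the remaining ones keep their default order.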
2. MCP Server: transcription-MCP
A FastMCP server providing audio/video transcription capabilities using the GAIK transcriber library and OpenAI API.
Tool Exposed:
transcribe_audio(file_path: str, enhanced: bool = False) -> str
Parameters:
- file_path (required): Full path to the audio/video file
- enhanced (optional): Return the enhanced transcript if True
Returns: Raw transcription text preserving original flow and structure.
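To illustrate the shape of this tool, here is a rough sketch of the validation a `transcribe_audio` function might perform before transcribing. It is not the project's `server.py`: the `backend` parameter and `placeholder_backend` are hypothetical stand-ins for the GAIK transcriber call, whose API is not shown in this document, and returning error strings instead of raising is an assumption. In a real FastMCP server the function would be registered with the `@mcp.tool()` decorator from `mcp.server.fastmcp`.

```python
from pathlib import Path
from typing import Callable

# Audio/video extensions from the supported input formats table.
MEDIA_EXTENSIONS = {".mp3", ".m4a", ".wav", ".ogg", ".flac",
                    ".mp4", ".mov", ".avi", ".mkv", ".webm"}


def placeholder_backend(path: Path, enhanced: bool) -> str:
    """Hypothetical stand-in for the real transcription call."""
    kind = "enhanced" if enhanced else "raw"
    return f"[{kind} transcript of {path.name}]"


def transcribe_audio(
    file_path: str,
    enhanced: bool = False,
    backend: Callable[[Path, bool], str] = placeholder_backend,  # assumption, not in the documented signature
) -> str:
    """Validate the input, then delegate to the transcription backend."""
    path = Path(file_path)
    if path.suffix.lower() not in MEDIA_EXTENSIONS:
        return f"Error: unsupported file type '{path.suffix}'"
    if not path.is_file():
        return f"Error: file not found: {file_path}"
    return backend(path, enhanced)
```

Returning a readable error string keeps a single bad file from interrupting the calling workflow; raising an exception would be an equally reasonable choice.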
Setup
Prerequisites
- Python 3.8+
- OpenAI API key
- Claude Code with MCP support
- GAIK transcriber library
MCP Server Configuration
- Navigate to the transcription-MCP folder:
  cd transcription-MCP
- Create/update the .env file with your OpenAI API key:
  OPENAI_API_KEY=your_openai_api_key
  OPENAI_API_TYPE=openai
- Install dependencies:
  pip install mcp python-dotenv gaik
- Register the MCP server in your Claude Code configuration.
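Before registering the server, it can help to confirm that the settings from the .env step are actually visible to the process. The sketch below only inspects environment variables; it assumes both variables are required, and `missing_settings` is a hypothetical helper. In the server itself the values would typically be loaded from .env with python-dotenv's `load_dotenv()` first.

```python
import os
from typing import List, Mapping

# Variable names taken from the .env example above.
# Treating both as mandatory is an assumption of this sketch.
REQUIRED_SETTINGS = ("OPENAI_API_KEY", "OPENAI_API_TYPE")


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

Calling `missing_settings()` with no arguments checks the current process environment; an empty list means both settings are present.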
Skill Installation
- Copy the documenting-meetings folder to your Claude Code skills directory
- Ensure the MCP server is registered as gaik-transcriber
- Verify MCP filesystem access is configured
Usage
Basic Usage
Provide a folder containing meeting materials:
I have meeting materials in C:\Meetings\Q3-Roadmap. Please create meeting minutes.
Folder Structure
<your-folder>/
├── input_documents/    # Required: all meeting materials
├── templates/          # Optional: blank template with structure
└── sample_documents/   # Optional: sample showing desired style/format
With Template
Place a .docx template in the templates/ subfolder to use custom formatting and structure.
With Sample Document
Place a completed meeting minutes example in sample_documents/ to guide the style, tone, and length of the output.
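The folder layout described above (a required input_documents/ plus optional templates/ and sample_documents/) lends itself to a quick check before any processing starts. This is a minimal sketch under that reading; `check_layout` and its return shape are hypothetical and not part of the skill.

```python
from pathlib import Path
from typing import Dict

REQUIRED_SUBFOLDER = "input_documents"
OPTIONAL_SUBFOLDERS = ("templates", "sample_documents")


def check_layout(root: str) -> Dict[str, bool]:
    """Report which optional subfolders exist; raise if the required one is missing."""
    base = Path(root)
    required = base / REQUIRED_SUBFOLDER
    if not required.is_dir():
        raise FileNotFoundError(f"Missing required folder: {required}")
    # Optional folders only influence formatting, so their absence is not an error.
    return {name: (base / name).is_dir() for name in OPTIONAL_SUBFOLDERS}
```

The returned mapping tells the caller whether a template or a sample document is available to guide the output format.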
Sample Data
The sample_data/ folder contains example files for testing:
| File | Description |
|---|---|
| input_documents/notes.txt | Digital meeting notes |
| input_documents/meeting_recording.mp3 | Audio recording |
| input_documents/sketch.png | Handwritten notes/diagram |
| input_documents/roadmap-presentation.pptx | PowerPoint slides |
| input_documents/project-budget.xlsx | Budget spreadsheet |
| input_documents/deployment-freeze-policy.pdf | Policy document |
| templates/meeting-template.docx | Blank template |
| sample_documents/sample-meeting-minutes.docx | Example output |
Key Design Principles
- Modular Architecture - Separate MCP server for transcription enables independent scaling
- Fault Tolerant - Continues processing if individual file operations fail
- No Fabrication - Only uses information from provided inputs; marks missing info as "TBD"
- Format Flexible - Adapts output based on template/sample presence
- Path Safe - Handles both Windows and POSIX path formats
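Two of these principles, fault tolerance and path safety, can be sketched in a few lines. The example below is an illustration only: `file_name` and `process_all` are hypothetical helpers, and the approach (parsing every path with `PureWindowsPath`, which accepts both separators, and collecting per-file errors rather than aborting) is one possible design, not necessarily the one the skill uses.

```python
from pathlib import PureWindowsPath
from typing import Callable, Dict, Iterable, List, Tuple


def file_name(raw_path: str) -> str:
    """Extract the final path component from a Windows or POSIX style path."""
    # PureWindowsPath treats both "\\" and "/" as separators, so it parses
    # either style without touching the filesystem.
    return PureWindowsPath(raw_path).name


def process_all(
    paths: Iterable[str],
    handler: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Run handler on every path; collect failures instead of aborting."""
    results: Dict[str, str] = {}
    failures: List[str] = []
    for path in paths:
        try:
            results[file_name(path)] = handler(path)
        except Exception as exc:  # deliberately broad: one bad file must not stop the run
            failures.append(f"{file_name(path)}: {exc}")
    return results, failures
```

One caveat of this shortcut is that a POSIX file name containing a literal backslash would be split incorrectly; a stricter version would choose the parser based on the path's apparent style.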
Dependencies
Skill Dependencies
- MCP filesystem server
- gaik-transcriber MCP server
- Docx skill (for Word document creation)
- PDF/PPTX/XLSX skills (optional, for supplementary documents)
MCP Server Dependencies
- mcp.server.fastmcp
- gaik.building_blocks.transcriber
- python-dotenv
- OpenAI API
