Factory
Create a complete MCP server for your product in 60 seconds. The fastest way to get your product discovered by AI assistants like Claude and ChatGPT.
MCP Factory
Automated generation of Model Context Protocol servers from Windows binaries
Project: USF CSE Senior Design Capstone - Microsoft Sponsored
Objective: Enable AI agents to interact with Windows applications through automated MCP server generation
Team
| Role | Member | GitHub |
|---|---|---|
| Lead & Sections 2-3 | Evan King | @evanking12 |
| Section 4 (MCP Generation) | Layalie AbuOleim | @abuoleim1 |
| Section 4 (MCP Generation) | Caden Spokas | @bustlingbungus |
| Section 5 (Verification) | Thinh Nguyen | @TheNgith |
Business Scenario
Enterprise organizations need AI-powered customer service that can invoke existing internal tools lacking API documentation or modern integration points. MCP Factory bridges this gap by automatically analyzing Windows binaries and generating standards-compliant Model Context Protocol servers.
Azure Resource Status
| Resource | Name / ID | Status |
|---|---|---|
| Subscription ID | abb10328-e7f1-4d4a-9067-c1967fd70429 | ✅ Active |
| Tenant ID | bfddc1f5-1e88-471c-9b5e-44611ddd3c22 | ✅ Active |
| Resource Group | mcp-factory-rg / eastus | ✅ Active |
| Managed Identity | mcp-factory-identity (clientId f70e3ce7…, principalId ef864658…) | ✅ Active |
| Key Vault | mcp-factory-kv / mcp-factory-kv.vault.azure.net | ✅ Active |
| Storage Account | mcpfactorystore / Standard_LRS | ✅ Active |
| Blob (uploads) | mcpfactorystore/uploads | ✅ Active |
| Blob (artifacts) | mcpfactorystore/artifacts | ✅ Active |
| Container Registry | mcpfactoryacr / mcpfactoryacr.azurecr.io | ✅ Active |
| ACA Environment | mcp-factory-env / eastus (Log Analytics 0baaed31…) | ✅ Active |
| ACA Default Domain | icycoast-8ddfa278.eastus.azurecontainerapps.io | ✅ Active (VNet-integrated) |
| ACA App (pipeline) | mcp-factory-pipeline | ✅ Live: mcp-factory-pipeline.icycoast-8ddfa278.eastus.azurecontainerapps.io |
| ACA App (web UI) | mcp-factory-ui | ✅ Live: mcp-factory-ui.icycoast-8ddfa278.eastus.azurecontainerapps.io |
| Azure OpenAI | mcp-factory-openai / https://mcp-factory-openai.openai.azure.com/ | ✅ Active |
| OpenAI Deployment | gpt-4o (model 2024-11-20, 10K TPM) | ✅ Active |
| App Insights | mcp-factory-insights (Log Analytics: mcp-factory-logs) | ✅ Active |
| App Service | mcp-factory-web | Skipped (VM quota = 0, pivoted to ACA) |
Identity & RBAC
All access is via Managed Identity; no keys or secrets are stored in environment variables.
| Role | Resource |
|---|---|
| Key Vault Secrets User + Secrets Officer | mcp-factory-kv |
| Storage Blob Data Contributor | mcpfactorystore |
| AcrPull + AcrPush | mcpfactoryacr |
| Cognitive Services OpenAI User | mcp-factory-openai |
Key Vault Secrets
Wired to ACA containers via `secretref:`:
| Secret name | Used by |
|---|---|
| azure-storage-account | pipeline |
| openai-endpoint | pipeline |
| openai-deployment | pipeline |
| azure-client-id | pipeline |
| appinsights-connection | pipeline |
Deployment
Two containers, auto-deployed on every push to main via .github/workflows/ci-cd.yml (GitHub Actions OIDC, no stored secrets). Manual commands for emergency hot-fixes:
# Build and push
docker build -t mcpfactoryacr.azurecr.io/mcp-factory-pipeline:latest .
docker build -t mcpfactoryacr.azurecr.io/mcp-factory-ui:latest -f Dockerfile.ui .
docker push mcpfactoryacr.azurecr.io/mcp-factory-pipeline:latest
docker push mcpfactoryacr.azurecr.io/mcp-factory-ui:latest
# Update running containers
az containerapp update --name mcp-factory-pipeline --resource-group mcp-factory-rg --image mcpfactoryacr.azurecr.io/mcp-factory-pipeline:latest
az containerapp update --name mcp-factory-ui --resource-group mcp-factory-rg --image mcpfactoryacr.azurecr.io/mcp-factory-ui:latest
- Pipeline (`Dockerfile`): `uvicorn api.main:app` on port 8000. Runs discovery, reads Key Vault secrets via Managed Identity, and writes uploads/artifacts to Blob Storage.
- UI (`Dockerfile.ui`): `uvicorn ui.main:app` on port 8080. Proxies `/api/*` to the pipeline URL via `httpx`. Holds no secrets.
Current Status: Week 10 / 16
Summary: Full end-to-end pipeline is live in Azure. Analyze → generate → chat verified against `calc.exe` and `notepad.exe`. MCP stdio server (`generated/notepad/mcp_stdio.py`) registered in `.vscode/mcp.json`; VS Code Copilot connects natively via `#mcp-notepad`. Azure Monitor Workbook deployed alongside App Insights showing live operational metrics. Self-hosted Windows runner VM provisioned in Bicep for GUI automation CI jobs. All known gaps closed.
- Sections 2-3: Hybrid Discovery Engine - COMPLETE
  - ✅ PE DLL/EXE, .NET, COM/TLB, RPC, CLI, SQL, 9 scripting languages, all §1 legacy protocols
  - ✅ `--target` accepts a file or an installed directory (`C:\Program Files\AppD\`) (§2.a)
  - ✅ `--registry` flag scans HKLM App Paths, Uninstall keys, and COM CLSID registrations (§1.c)
  - ✅ Uniform `{ name, kind, confidence, description, return_type, parameters, execution }` schema
  - ✅ 29/29 demo targets pass across all 10 source-type sections
- Section 4: MCP Generation - COMPLETE (cloud + local + Copilot)
  - ✅ `python mcp_factory.py --target <file>` runs the full pipeline in one command
  - ✅ Flask MCP server with `/tools`, `/invoke`, `/chat`, `/download/invocables`
  - ✅ Chat UI at `http://localhost:5000`; shows tool calls + live execution results
  - ✅ Working demos: Calculator (55 invocables, WinUI3) and Notepad (Win32)
  - ✅ `/api/generate` live in ACA; returns correct tool schema, saved to Blob artifacts
  - ✅ `generated/notepad/mcp_stdio.py`: native MCP JSON-RPC 2.0 stdio server using the `mcp` Python SDK. Same tool registry as the HTTP server; runs `_execute_tool` (pywinauto) in a thread executor so the asyncio loop never blocks.
  - ✅ `.vscode/mcp.json`: VS Code Copilot server registration. Open Copilot Chat → type `#mcp-notepad` → tools appear; ask Copilot to open Notepad and type text → it calls `type_text` through the MCP protocol.
- Section 5: Verification UI - COMPLETE (cloud)
  - ✅ FastAPI web UI (`ui/main.py`): 4-step wizard: Upload → Select → Generate → Chat
  - ✅ Installed-path input field (§2.b): paste `C:\Program Files\AppD\` directly
  - ✅ Chat tab sends `invocables` metadata so the pipeline actually executes tool calls
  - ✅ Download schema JSON button
  - ✅ Optional API-key guard (`UI_API_KEY` env var) for §6 access restriction
- Section 6: Azure Infrastructure - FULLY DEPLOYED
  - ✅ Resource Group, Managed Identity, Key Vault, Storage (uploads + artifacts), ACR, ACA Environment all provisioned
  - ✅ Managed Identity: Storage Blob Data Contributor + Cognitive Services OpenAI User + Key Vault Secrets User + Officer + AcrPull + AcrPush. No secrets/keys in env vars.
  - ✅ Azure OpenAI `mcp-factory-openai`: `gpt-4o` deployment (2024-11-20, 10K TPM) active
  - ✅ Application Insights `mcp-factory-insights` wired to both containers
  - ✅ Docker images pushed to `mcpfactoryacr.azurecr.io`; both ACA apps live
  - ✅ `mcp-factory-pipeline`: revision `0000003`, end-to-end verified
  - ✅ .NET Aspire: `aspire/AppHost/Program.cs` orchestrates both containers; port bindings, App Insights, and `PIPELINE_URL` injection all wired. Run locally with `cd aspire/AppHost && dotnet run` (requires .NET 8 SDK + Docker Desktop).
  - ✅ CI/CD: `.github/workflows/ci-cd.yml` is a four-job pipeline (test → gui-tests (Windows) → build → deploy) using GitHub OIDC (no stored secrets). Triggers on every push to `main`.
  - ✅ ACA scale-to-zero: both container apps scale to 0 replicas when idle, up to 3/2 on HTTP load. Zero cost when unused.
  - ✅ Blob-backed job state: `_register_invocables` persists the invocable map to `artifacts/{job_id}/invocables_map.json`; `_get_invocable` reloads from Blob on cache miss. State survives container recycles and scale-to-zero.
  - ✅ Registry scan wired into API: `_run_discovery` passes `--registry` on Windows, enabling HKLM App Paths / Uninstall / COM CLSID enumeration (§1.c) from the cloud API.
  - ✅ Azure Monitor Workbook: `infra/workbook.bicep` deploys a shared Workbook into `mcp-factory-rg` alongside App Insights. Five tiles: analyses this week, avg invocables/job, tool call success %, avg+P95 latency table, throughput timechart. All KQL queries use `toint(customDimensions[...])` matching the actual telemetry schema.
  - ✅ Self-hosted Windows runner VM: `infra/runner-vm.bicep` provisions a `Standard_D2s_v3` Windows Server 2022 VM with a `CustomScriptExtension` that auto-installs Python, pywinauto, and the GitHub Actions runner service on first boot (`scripts/install-github-runner.ps1`). The `gui-tests` CI job targets `[self-hosted, windows, x64]`.
  - ✅ Azure AI Search: `api/search.py` vector-indexes tool descriptions at generation time; the chat endpoint uses nearest-neighbor retrieval to select the 15 most relevant tools per turn when a server has >15 tools, which prevents context-window exhaustion on large binaries.
- Sponsor Requirements (§6 checklist)
  - ✅ Azure Cloud (compute, storage, networking, OpenAI): live
  - ✅ GitHub + GitHub Copilot: in use; generated MCP server connects natively to Copilot Chat
  - ✅ VS Code: dev environment + `.vscode/mcp.json` for Copilot MCP registration
  - ✅ .NET Aspire app host: `aspire/AppHost/Program.cs`, both containers fully wired
  - ✅ GitHub Codespaces: `.devcontainer/devcontainer.json`
  - ✅ Microsoft docs cited: References section in this README
  - ✅ Budget alert script: `scripts/setup_budget_alert.ps1` ($150/month cap)
  - ✅ FERPA compliance statement: below
Sponsor Demo E2E
The full sponsor demo proof runs in GitHub Actions through Sponsor Demo E2E. The workflow uploads a GitHub Actions artifact named sponsor-demo-e2e.
Canonical green run: 24613173130. Focused Remote DCOM runtime source proof: 24577926238. Hardening integrity proof: 24613434034. Deployed demo UI: mcp-factory-ui.icycoast-8ddfa278.eastus.azurecontainerapps.io.
Static sponsor references:
- Final demo brief is the shortest sponsor/teammate handoff page with the canonical run, deployed UI, trust/evidence summary, and architecture/routing charts.
- Sponsor video demo walkthrough gives the recommended SOAP/WSDL UI demo path, GPT prompts, backend diagram, and teammate error-output refactor assessment.
- Sponsor proof index maps the canonical run to the exact artifact paths and fast proof workflows.
- Sponsor demo caveats states the proof boundary for JSON-RPC, SOAP, CORBA, RPC, JNDI, COM/DCOM, and arbitrary binary recovery.
- Non-Ghidra stretch closeout tracks the same-subnet Remote DCOM proof path without adding Ghidra or undocumented binary recovery scope.
- Non-code sponsor artifacts covers Azure cost posture, FERPA/access control, architecture references, and rerun guidance.
Recommended recorded demo target: SOAP/WSDL through the deployed UI. Use the
Load SOAP/WSDL Showcase button or upload
tests/fixtures/sponsor/contoso_service.wsdl, then analyze, select
GetCustomer or SubmitTicket, generate the MCP schema, and ask GPT to call
the generated tool with a deterministic sentinel. The chat step includes a
Live Proof Trace panel showing the GPT tool_call, backend route, runtime
mode, and tool_result. The walkthrough above has copy-ready prompts and the
exact evidence to show on screen.
Interpret the artifact as the canonical sponsor proof bundle:
- `final-summary.md` is the sponsor-readable report. It includes overall pass/fail, slow-target diagnostics, bridge/session proof, and a requirement-to-proof matrix.
- `sponsor-report.html` is a browsable rendering of the same report for live review.
- `final-summary.json` is the machine-readable form of the same gate.
- GPT `tool_call` + backend `tool_result` sentinel proofs are required for all 13 non-VM cases: OpenAPI/REST, JSON-RPC, SOAP/WSDL, CORBA IDL, RPC IDL, JNDI, SQL, Python, JavaScript, Ruby, PHP, PowerShell, and CMD/BAT.
- JSON-RPC is hosted as a JSON-RPC 2.0 service in the pipeline API. SOAP validates SOAP envelopes and dispatches WSDL-named operations. SQL executes against deterministic SQLite-backed Contoso data. OpenAPI/REST validates declared routes and methods. JNDI uses a controlled LDAPv3-compatible bind/search/lookup runtime, CORBA uses a controlled OmniORB/IIOP runtime for deterministic Contoso IDL, and RPC uses a controlled DCE/RPC-compatible runtime for deterministic RPC IDL. COM/TLB discovery and local COM automation are proven through the Windows bridge. Canonical run `24613173130` imports focused run `24577926238` and proves a controlled same-subnet Remote DCOM fixture through WMI over DCOM. Generalized CORBA estate migration, enterprise directory migration, arbitrary MSRPC estate support, and arbitrary enterprise DCOM estate migration are not claimed. A "provider required" result is now a fallback only when those services are disabled or unreachable, not a passing required sponsor result.
- `ci_artifacts/demo/gpt-format-matrix/summary.json` records `runtime_mode_counts`, `runtime_backed_cases`, `adapter_backed_cases`, per-case `tool_call`/`tool_result` flags, and transcript paths.
- The canonical run above reports `13/13` live execution format proofs and `0` required provider-required cases.
- Target input supports both uploaded files and installed paths/directories. Installed paths must be accessible to the server or Windows bridge VM context performing discovery; `system32_directory` is the required installed-directory proof for requirement `2.a`.
- `cmd_exe` is an optional Windows diagnostic. The required CMD/BAT evidence is the deterministic `.cmd` fixture in the GPT format matrix.
Key artifact paths:
- `ci_artifacts/demo/final-summary.md`
- `ci_artifacts/demo/final-summary.json`
- `ci_artifacts/demo/sponsor-report.html`
- `ci_artifacts/demo/windows/summary.json`
- `ci_artifacts/demo/windows/cmd_exe/cmd_exe.summary.json`
- `ci_artifacts/demo/windows/notepad_exe/notepad_exe.summary.json`
- `ci_artifacts/demo/windows-gpt/summary.json` when the pushback-hardening Windows GPT matrix is run
- `ci_artifacts/demo/repo-ingestion/summary.json` when the repo-ingestion proof is run
These are CI proof artifacts from GitHub Actions. They are separate from the app UI blob downloads served by /api/download/{job_id}/{filename}.
Known Gaps
| Gap | Status |
|---|---|
| Thin GUI descriptions in Linux container | ✅ Closed 2026-03-07: gui-tests CI job runs on self-hosted Windows runner; pywinauto/UIA runs against a live Windows session in CI. |
| CI/CD OIDC activation | ✅ Closed 2026-03-07: Federated credential created; Contributor role assigned on both ACA apps. |
| MCP protocol proof (Copilot tool call) | ✅ Closed 2026-03-07: Confirmed live. Opened Copilot Chat, asked it to open Notepad and type "hello world"; it called file_new then type_text through the MCP stdio protocol. Notepad opened and text appeared. |
| GUI / COM / CLI Windows analysis in cloud | ✅ Closed 2026-03-07: scripts/gui_bridge.py FastAPI worker runs on the Windows runner VM, exposing POST /analyze for all 4 Windows-only source types (GUI pywinauto, COM/TLB pythoncom, Windows EXE CLI, registry scan). api/main.py calls the bridge after static analysis and merges results. Wired into Bicep via guiBridgeUrl / guiBridgeSecret params; VM auto-starts the bridge as a scheduled task on boot. |
Approach: A Hybrid Discovery Engine that intelligently routes any target file to the appropriate analyzers based on detected capabilities, producing a uniform MCP JSON contract that §4 consumes directly.
VS Code Copilot Integration
The generated MCP server connects directly to VS Code Copilot Chat with no extra configuration required.
Prerequisites: .venv activated, mcp>=1.0 installed (pip install -r generated/notepad/requirements.txt).
Steps:
- Open this repo in VS Code. The `.vscode/mcp.json` is already present and points to `generated/notepad/mcp_stdio.py`.
- Open Copilot Chat (`Ctrl+Alt+I`).
- Type: `#mcp-notepad open a new file and type hello world`
- Copilot calls `file_new` then `type_text` through the MCP stdio protocol. Notepad opens on your desktop and "hello world" appears in it.
What this proves: The factory turned a binary into invocables.json, then into mcp_stdio.py, and VS Code Copilot can use that server to control Notepad. That is the complete proof-of-concept end-to-end.
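The status section notes that the stdio server runs `_execute_tool` in a thread executor so the asyncio loop never blocks. Below is a minimal sketch of that pattern using only the standard library. `blocking_tool`, `call_tool`, and the `text` argument are hypothetical stand-ins; the real server drives pywinauto, which this example does not touch.

```python
import asyncio
import time


def blocking_tool(name, arguments):
    """Stand-in for a slow, synchronous GUI call (hypothetical; no pywinauto here)."""
    time.sleep(0.2)
    return {"tool": name, "ok": True, "arguments": arguments}


async def call_tool(name, arguments):
    # Hand the blocking call to the default thread pool and await its result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_tool, name, arguments)


async def main():
    ticks = 0

    async def heartbeat():
        # Keeps counting while the tool runs, showing the loop stays responsive.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    result = await call_tool("type_text", {"text": "hello world"})
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)  # let the cancel settle
    return result, ticks


result, ticks = asyncio.run(main())
print(result["ok"], ticks > 0)
```

If `blocking_tool` were called directly inside the coroutine, the heartbeat would not advance until the call returned; moving it to an executor is what keeps other protocol traffic flowing.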
Run the server manually (for debugging):
.venv\Scripts\Activate.ps1
cd generated\notepad
python mcp_stdio.py
# Send JSON-RPC manually or use scripts\mcp_smoke_test.py
python ..\..\scripts\mcp_smoke_test.py
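For the "send JSON-RPC manually" route, the sketch below builds the three messages an MCP client typically sends first: an `initialize` request, the `notifications/initialized` notification, and a `tools/list` request. The stdio transport expects one JSON message per line. The client name and version are placeholders, and whether the generated server accepts this particular `protocolVersion` string is an assumption.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Build one JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# MCP handshake followed by a tool listing. Client name/version are placeholders.
lines = [
    jsonrpc_request(1, "initialize", {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "manual-debug", "version": "0.0.1"},
    }),
    # Notifications carry no "id" and expect no reply.
    json.dumps({"jsonrpc": "2.0", "method": "notifications/initialized"}),
    jsonrpc_request(2, "tools/list"),
]

for line in lines:
    print(line)
```

Each printed line can be pasted into the terminal where `python mcp_stdio.py` is waiting on stdin; the server's replies arrive on stdout as single-line JSON as well.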
Platform Requirements
This tool is designed around Windows. If you are on a Mac, see docs/mac-compatibility.md.
Prerequisites
Required (install manually on Windows 10/11):
- PowerShell 5.1+ (built into Windows 10+)
- Python 3.8+: download from python.org
- Add Python to PATH during installation (checked by default)
- Git: download from git-scm.com
Installation
Prerequisites: Git, Python 3.8+.
# Clone and run the demo
git clone https://github.com/evanking12/mcp-factory.git
cd mcp-factory
pip install -r requirements.txt
python scripts/demo_all_capabilities.py
Troubleshooting: If you have multiple Python versions installed and python points to Python 2.x, use:
py -3 -m pip install -r requirements.txt
py -3 scripts/demo_all_capabilities.py
Quick Start
1. Capabilities Demo ⚡ (The "It Works" Demo)
What you'll see:
demo_all_capabilities.py runs 29 live analyses across all supported source types and reports per-target results:
MCP FACTORY - ALL CAPABILITIES DEMO
=====================================
[Section 1: Native PE]
kernel32.dll ... exports_mcp.json 1491 invocables
user32.dll ... exports_mcp.json 1037 invocables
zstd.dll ... exports_mcp.json 16 invocables
[Section 2: .NET Assemblies]
System.dll ... dotnet_methods_mcp.json 143 invocables
mscorlib.dll ... dotnet_methods_mcp.json 48 invocables
[Section 3: COM / Type Library]
shell32.dll ... com_objects_mcp.json 482 invocables
oleaut32.dll ... com_objects_mcp.json 12 invocables
stdole2.tlb ... com_objects_mcp.json 50 invocables
[Section 4: CLI Tools]
cmd.exe ... cli_mcp.json 8 invocables
git.exe ... cli_mcp.json 35 invocables
[Section 5: SQL]
sample.sql ... sql_file_mcp.json 14 invocables
sqlite3.dll ... exports_mcp.json 16 invocables
[Section 6: Scripting Languages]
sample.py ... python_script_mcp.json 5 invocables
sample.ps1 ... powershell_script_mcp.json 4 invocables
sample.sh ... shell_script_mcp.json 5 invocables
sample.bat ... batch_script_mcp.json 4 invocables
sample.vbs ... vbscript_mcp.json 4 invocables
sample.rb ... ruby_script_mcp.json 4 invocables
sample.php ... php_script_mcp.json 5 invocables
[Section 7: JavaScript / TypeScript]
sample.js ... javascript_mcp.json 6 invocables
sample.ts ... typescript_mcp.json 5 invocables
[Section 8: RPC Interfaces]
lsass.exe ... rpc_mcp.json 8 invocables
[Section 9: Legacy Protocols & Spec Formats]
sample_openapi.yaml ... openapi_spec_mcp.json 9 invocables
sample_jsonrpc.json ... jsonrpc_spec_mcp.json 5 invocables
sample.wsdl ... wsdl_file_mcp.json 7 invocables
sample.idl ... corba_idl_mcp.json 12 invocables
sample.jndi ... jndi_config_mcp.json 12 invocables
zstd.pdb ... pdb_file_mcp.json 871 invocables
[Section 10: Directory Scan (§2.a installed-instance)]
scripts/ (dir) ... scripts_scan_mcp.json 1184 invocables
Summary: 29 succeeded 0 skipped (29 total)
Why this matters: Every output, regardless of source type, produces the same JSON schema that §4 consumes. A PE export, a Python function, and a SQL stored procedure all look identical to the MCP generator.
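As a rough illustration of what "the same JSON schema" means, the sketch below models one invocable record with the seven fields named in the status section (`name`, `kind`, `confidence`, `description`, `return_type`, `parameters`, `execution`). The `Invocable` class, the field types, the defaults, and the `kind` values are assumptions for illustration; the authoritative contract is docs/schemas/discovery-output.schema.json.

```python
import json
from dataclasses import dataclass, field, asdict

# Confidence tiers named in this README, strongest first.
CONFIDENCE_LEVELS = ("guaranteed", "high", "medium", "low")


@dataclass
class Invocable:
    """Hypothetical in-memory form of one discovery record (types are assumed)."""
    name: str
    kind: str
    confidence: str
    description: str = ""
    return_type: str = "unknown"
    parameters: list = field(default_factory=list)
    execution: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence: {self.confidence!r}")


# Two very different sources end up with the same set of keys.
# The "kind" strings here are made up for the example.
pe_export = Invocable(name="ZSTD_compress", kind="export", confidence="high")
py_function = Invocable(name="send_report", kind="python_function",
                        confidence="guaranteed", return_type="bool")

assert asdict(pe_export).keys() == asdict(py_function).keys()
print(json.dumps(asdict(py_function), indent=2))
```

Because both records serialize to the same keys, a downstream generator can iterate over them without branching on where each one came from.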
Confidence levels:
- guaranteed: Explicit metadata (type annotations, doc comments + return type)
- high: Partial docs or exported with header match
- medium: Pattern-matched, best effort
- low: Minimal information (symbol name only)
2. Live App Demo ⚡ (The "It Controls Windows" Demo)
Start a pre-built server and chat with it:
# Calculator (WinUI3/MSIX, 55 invocables: digit buttons, operators, scientific functions)
python mcp_factory.py --serve calculator-test2
# Notepad (classic Win32: type, save, open, append)
python mcp_factory.py --serve notepad
Open http://localhost:5000 and try:
- "Open the calculator and compute the square root of 144, then multiply by 7"
- "Open notepad, type a short poem, save it as poem.txt, then reopen it and append a title"
The chat UI shows every tool call, its arguments, and the raw result from the live application window.
Run the full pipeline on any binary:
# Discovery → selection TUI → server generation in one command
python mcp_factory.py --target C:\Windows\System32\notepad.exe --description "text editor"
Analyze a Specific File or Installed Directory
# Native DLL
python src/discovery/main.py --target "C:\Windows\System32\kernel32.dll" --out artifacts
# Script file
python src/discovery/main.py --target "path\to\service.py" --out artifacts
# OpenAPI / WSDL / IDL / JNDI / PDB: same command, format auto-detected
python src/discovery/main.py --target "api\openapi.yaml" --out artifacts
# Installed application directory (§2.a): walks tree, analyses every recognized file
python src/discovery/main.py --target "C:\Program Files\MyApp\" --out artifacts
Analyze Windows CLI Tools
python src/discovery/cli_analyzer.py "C:\Windows\System32\ipconfig.exe"
Run the selection UI to review discovered tools and choose what the MCP server exposes:
# Single file
python src/ui/select_invocables.py --target tests/fixtures/vcpkg_installed/x64-windows/bin/zstd.dll
# Installed directory (§2.a): same flag, directory support is transparent
python src/ui/select_invocables.py --target tests/fixtures/scripts/
# With a free-text hint (§2.b): highlights matching rows in the table
python src/ui/select_invocables.py --target zstd.dll --description "compress decompress streaming"
# From an already-generated discovery JSON (skip re-analysis)
python src/ui/select_invocables.py --input artifacts/discovery-output.json
The UI defaults to guaranteed + high confidence ON and medium + low OFF (§3.b). Commands: `<n>` toggle row, `3-10` range, `g` guaranteed+high only, `m` toggle medium, `l` toggle low, `a`/`n` all/none, `f <text>` filter, `done` save → `artifacts/selected-invocables.json`.
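The default selection rule can be sketched in a few lines: keep every invocable whose confidence tier starts switched on. This is an illustration of the rule as stated, not the code in `select_invocables.py`; the `default_selection` helper and the sample records are hypothetical.

```python
# Confidence tiers that start switched ON in the selection UI (per the text above).
DEFAULT_ON = {"guaranteed", "high"}


def default_selection(invocables):
    """Return the names of invocables whose tier is enabled by default."""
    return [inv["name"] for inv in invocables if inv.get("confidence") in DEFAULT_ON]


# Hypothetical discovery records, trimmed to the two fields this sketch needs.
discovered = [
    {"name": "ZSTD_compress", "confidence": "high"},
    {"name": "ZSTD_versionNumber", "confidence": "guaranteed"},
    {"name": "internal_helper", "confidence": "low"},
    {"name": "maybe_callback", "confidence": "medium"},
]

selected = default_selection(discovered)
print(selected)  # ['ZSTD_compress', 'ZSTD_versionNumber']
```

Toggling a tier in the UI (`m` or `l`) would correspond to adding or removing that tier from the enabled set before filtering.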
Repository Structure
mcp-factory/
├── mcp_factory.py                    # Single-command entry point (--target / --serve)
├── scripts/
│   ├── demo_all_capabilities.py      # Main demo: 29 targets, all source types (10 sections)
│   ├── demo_legacy_protocols.py      # Standalone 6-target suite for spec-gap analyzers
│   ├── validate_features.py          # Validation suite
│   └── analyze_json_anomalies.py     # Hygiene verification
├── src/discovery/                    # Sections 2-3: Discovery pipeline
│   ├── main.py                       # CLI orchestrator: single file OR directory walk (§2.a)
│   ├── classify.py                   # File-type detection (22+ source types)
│   ├── exports.py                    # Native PE exports (pefile)
│   ├── pe_parse.py                   # .NET reflection
│   ├── com_scan.py                   # COM registry + TLB scanning
│   ├── cli_analyzer.py               # CLI argument extraction
│   ├── gui_analyzer.py               # GUI discovery: UIA button walk, menu enumeration, app-type detection
│   ├── rpc_scan.py                   # RPC interface scanning
│   ├── sql_analyzer.py               # SQL stored procs, views, tables, triggers
│   ├── script_analyzer.py            # Python, PowerShell, Shell, Batch, VBScript, Ruby, PHP
│   ├── js_analyzer.py                # JavaScript + TypeScript
│   ├── openapi_analyzer.py           # OpenAPI 3.x / Swagger 2.x + JSON-RPC 2.0
│   ├── wsdl_analyzer.py              # SOAP / WSDL 1.1
│   ├── idl_analyzer.py               # CORBA IDL interfaces
│   ├── jndi_analyzer.py              # JNDI bindings (.properties, Spring XML)
│   ├── pdb_analyzer.py               # PDB debug symbols (dbghelp.dll)
│   └── schema.py                     # Unified Invocable schema → MCP JSON
├── src/ui/
│   └── select_invocables.py          # Interactive §3 selection UI (rich table, confidence filter)
├── src/generation/                   # Section 4: MCP server generation
│   ├── section4_select_tools.py
│   └── section4_generate_server.py   # Flask server template + chat UI template
├── generated/                        # Pre-built servers (ready to --serve)
│   ├── calculator-test2/             # Calculator: 55 invocables, WinUI3/MSIX
│   └── notepad/                      # Notepad: classic Win32, type/save/open/append
├── tests/
│   └── fixtures/scripts/             # Sample files for all supported source types
│       ├── sample_openapi.yaml       # OpenAPI 3.0 fixture (9 operations)
│       ├── sample_jsonrpc.json       # JSON-RPC 2.0 fixture (5 methods)
│       ├── sample.wsdl               # WSDL 1.1 fixture (7 operations)
│       ├── sample.idl                # CORBA IDL fixture (12 methods)
│       └── sample.jndi               # JNDI fixture (12 bindings)
├── demo_output/unified/              # Generated demo artifacts (one sub-dir per target)
└── docs/
    ├── adr/                          # Architecture Decision Records (ADR-0001 to ADR-0008)
    ├── copilot-log/entries.md        # Session-by-session development log
    ├── sections-2-3.md               # §2-3 feature coverage reference (29/29 targets)
    └── schemas/                      # JSON schema contracts for Section 4
Team Responsibilities
- Sections 2-3 (Binary Analysis): Evan King - DLL/EXE export discovery, header matching, tiered output
- Section 4 (MCP Generation): Layalie AbuOleim, Caden Spokas - JSON schema generation, tool definitions
- Section 5 (Verification): Thinh Nguyen - Interactive UI, LLM-based validation
- Integration & Deployment: Team effort - Azure deployment, CI/CD, documentation
Section 4: MCP Server Generation
Full pipeline (discovery → selection → server)
# Run the entire pipeline on any target in one command
python mcp_factory.py --target C:\Windows\System32\calc.exe --description "calculator"
# Skip re-discovery β load an existing discovery JSON directly
python mcp_factory.py --input artifacts/discovery-output.json
# Generate the server without auto-launching it
python mcp_factory.py --target zstd.dll --skip-launch
# Suppress the browser auto-open (e.g. headless / CI)
python mcp_factory.py --serve notepad --no-browser
This runs discovery, opens the selection TUI, then generates and starts the server.
| Flag | Description |
|---|---|
| `--target FILE_OR_DIR` | Binary, script, or directory to analyse; runs full discovery |
| `--serve COMPONENT` | Skip pipeline entirely; start a pre-built server from generated/ |
| `--input JSON` | Skip discovery; load an existing discovery-output.json directly |
| `--description TEXT` | Free-text hint that highlights matching rows in the selection TUI |
| `--no-browser` | Do not auto-open the browser after the server starts |
| `--skip-launch` | Stop after generation; do not start the server |
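To make the flag table concrete, here is a hedged `argparse` sketch that mirrors it. This is not the parser from `mcp_factory.py`; in particular, treating `--target`, `--serve`, and `--input` as mutually exclusive entry modes is an assumption drawn from their descriptions, and `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser():
    """Hypothetical argparse layout that mirrors the flag table above."""
    parser = argparse.ArgumentParser(prog="mcp_factory.py")
    # Assumption: exactly one entry mode is chosen per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--target", metavar="FILE_OR_DIR",
                      help="binary, script, or directory to analyse")
    mode.add_argument("--serve", metavar="COMPONENT",
                      help="start a pre-built server from generated/")
    mode.add_argument("--input", metavar="JSON",
                      help="load an existing discovery-output.json")
    parser.add_argument("--description", metavar="TEXT", default="",
                        help="free-text hint for the selection TUI")
    parser.add_argument("--no-browser", action="store_true",
                        help="do not auto-open the browser")
    parser.add_argument("--skip-launch", action="store_true",
                        help="stop after generation")
    return parser


args = build_parser().parse_args(["--target", "zstd.dll", "--skip-launch"])
print(args.target, args.skip_launch, args.no_browser)  # zstd.dll True False
```

Under this layout, combining two entry modes (for example `--target` with `--serve`) makes argparse exit with a usage error instead of silently picking one.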
Start a pre-built server
python mcp_factory.py --serve notepad
python mcp_factory.py --serve calculator-test2
Manual pipeline steps
# 1. Discovery
python src/discovery/main.py --target <file> --out artifacts
# 2. Select invocables (interactive TUI)
python src/ui/select_invocables.py --target <file>
# Writes: artifacts/selected-invocables.json
# 3. Generate server
python src/generation/section4_generate_server.py
# Writes: generated/<name>/server.py + static/index.html
# 4. Run the server
cd generated/<name>
cp .env.example .env # fill in OPENAI_API_KEY
python server.py
Verify with:
curl http://localhost:5000/tools
# Open http://localhost:5000 in browser for chat UI
Data Contract Stability (for Section 4)
Sections 2-3 produce a stable JSON schema that Section 4 teams depend on:
- Schema: docs/schemas/discovery-output.schema.json - Formal JSON Schema
- Versioning: Breaking changes → v2.0. See CHANGELOG.md
- For Section 4 teams: Pin schema version in MCP generation to prevent drift
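One way to pin the schema version is to compare major versions before generating anything. The sketch below assumes the discovery output carries a version string under a key named `schema_version` and that the current major version is 1; both are assumptions for illustration, so check the formal schema and CHANGELOG.md for the real field name and value.

```python
def is_compatible(found_version, pinned_major=1):
    """True when the discovery output's major version matches the pinned one."""
    major = int(str(found_version).lstrip("v").split(".")[0])
    return major == pinned_major


# "schema_version" is a hypothetical key name used only for this sketch.
discovery_output = {"schema_version": "1.3", "invocables": []}

if not is_compatible(discovery_output["schema_version"]):
    raise SystemExit("discovery output uses an incompatible schema major version")
print("schema version accepted")
```

Failing fast on a major-version mismatch keeps a breaking schema change from surfacing later as a malformed generated server.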
Contributing
This is an active capstone project. For development setup and workflow guidelines, see CONTRIBUTING.md.
Documentation
| Document | Description |
|---|---|
| Project Description | Original sponsor requirements (Sections 1-7) |
| Architecture | System design and component overview |
| Sections 2-3 Details | Binary discovery implementation |
| Product Flow | Full pipeline (Sections 2-5) |
| Schemas | JSON schema contracts for Section 4 |
| ADRs | Architecture decision records |
| Troubleshooting | Common issues and solutions |
Sponsored by Microsoft | Mentored by Microsoft Engineers
Last updated: March 7, 2026. Aspire builds clean; CI/CD workflow, App Insights custom telemetry, and KV cleanup script added.
FERPA Compliance Statement
MCP Factory is developed and operated in compliance with the Family Educational Rights and Privacy Act (FERPA) and all other applicable data-privacy regulations.
- No student PII is collected or stored. The system does not collect, process, or retain names, student IDs, email addresses, or any other personally identifiable information.
- Uploaded binaries are ephemeral. Files uploaded through the web UI are written to Azure Blob Storage solely for the duration of the analysis pipeline job. Blobs are stored under a randomized job ID; no filename-to-identity mapping is created. Blob lifecycle management policies delete uploaded files after 24 hours.
- No conversation data is persisted. Chat messages sent to Azure OpenAI through the `/api/chat` endpoint are not logged to persistent storage. Azure OpenAI does not store prompt/completion data by default when accessed via API.
- Access is restricted to the project team. Both Azure Container Apps are deployed with Microsoft Entra ID-backed Managed Identity authentication; no anonymous write access is permitted to storage or AI services. The UI endpoint can be further hardened with a shared API key (`UI_API_KEY` environment variable, see Gap #8 above).
- Azure resources are scoped to the project subscription (`abb10328-e7f1-4d4a-9067-c1967fd70429`) and are not shared with other courses or students.
Questions regarding data handling should be directed to the project sponsor contact.
References β Microsoft Documentation
The following Microsoft Learn pages and official documentation were used in the design and development of this project:
