Windows Desktop Use MCP
An MCP server that gives AI assistants eyes, ears, and hands on the Windows desktop.
windows-desktop-use-mcp
An MCP server for controlling and perceiving Windows 11 from AI assistants. It provides AI with "eyes" (vision), "ears" (hearing), and "limbs" (input control), making the desktop environment accessible from MCP clients like Claude.
Main Features
- Vision (Screen Capture): Capture monitors, specific windows, or arbitrary regions. GPU acceleration supported (enables capture of YouTube/Netflix without black screens).
- Hearing (Audio & Transcription): Record system audio or microphone, with high-quality local transcription using Whisper AI.
- Limbs (Desktop Input): Mouse movement, clicking, dragging, and safe navigation key operations (security restricted).
- Live Monitoring (Streaming): Monitor screen changes in real-time, viewable via HTTP streaming in a browser.
- Analysis: Structural text extraction from windows using UI Automation (Markdown) and AI-optimized visual features.
For Non-Developers (Pre-built .exe)
If you don't have a development environment, you can use the pre-built executable from Releases.
1. Download
- Go to the Releases page
- Download the latest WindowsDesktopUse.zip
- Extract to your preferred location (e.g., C:\Tools\WindowsDesktopUse)
2. Configure Claude Desktop
Option A: Automatic setup
cd C:\Tools\WindowsDesktopUse
WindowsDesktopUse.exe setup
Option B: Manual setup
Add this to your %AppData%\Claude\claude_desktop_config.json (%AppData% already resolves to the user's AppData\Roaming folder):
{
"mcpServers": {
"windows-desktop-use": {
"command": "C:\\Tools\\WindowsDesktopUse\\WindowsDesktopUse.exe",
"args": ["--httpPort", "5000"]
}
}
}
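If the config file already lists other MCP servers, pasting the snippet by hand risks overwriting them. The following is a minimal Python sketch of a non-destructive merge, not the logic used by `WindowsDesktopUse.exe setup`. The function name `merge_server_entry` is hypothetical; the entry name, command, and arguments mirror the JSON above.

```python
import json
from pathlib import Path


def merge_server_entry(config_path, exe_path, http_port=5000):
    """Add or update the windows-desktop-use entry, keeping other servers."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["windows-desktop-use"] = {
        "command": str(exe_path),
        "args": ["--httpPort", str(http_port)],
    }
    # Create the Claude folder if it does not exist yet, then write back.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

On Windows the config path would typically be built as `Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"`.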
3. Restart Claude Desktop
Close and reopen Claude Desktop to load the new MCP server.
For Developers (Build from Source)
1. Build
dotnet build src/WindowsDesktopUse.App/WindowsDesktopUse.App.csproj -c Release
2. Configure Claude Desktop
Option A: Automatic setup
WindowsDesktopUse.exe setup
Option B: Manual setup
Add this to your %AppData%\Claude\claude_desktop_config.json (%AppData% already resolves to the user's AppData\Roaming folder):
{
"mcpServers": {
"windows-desktop-use": {
"command": "C:\\path\\to\\WindowsDesktopUse.exe",
"args": ["--httpPort", "5000"]
}
}
}
3. Verify Installation
WindowsDesktopUse.exe doctor
CLI Commands
doctor - System Diagnostics
Check system compatibility and configuration.
WindowsDesktopUse.exe doctor
WindowsDesktopUse.exe doctor --verbose # Show detailed information
WindowsDesktopUse.exe doctor --json # Output in JSON format
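The `--json` flag makes the diagnostics scriptable. Below is a small Python sketch of calling `doctor` from a script. It assumes only that the command prints valid JSON to stdout; the output schema is not documented here, so the result is returned as-is. The helper names `build_doctor_command` and `run_doctor_json` are hypothetical.

```python
import json
import subprocess


def build_doctor_command(exe_path, verbose=False, as_json=False):
    """Build the argument list for the doctor subcommand."""
    cmd = [exe_path, "doctor"]
    if verbose:
        cmd.append("--verbose")
    if as_json:
        cmd.append("--json")
    return cmd


def run_doctor_json(exe_path):
    """Run `doctor --json` and return the parsed output (schema not assumed)."""
    result = subprocess.run(
        build_doctor_command(exe_path, as_json=True),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```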
setup - Claude Desktop Configuration
Automatically configure Claude Desktop integration.
WindowsDesktopUse.exe setup # Use default config path
WindowsDesktopUse.exe setup --config-path "C:\custom\path.json" # Custom config path
WindowsDesktopUse.exe setup --no-merge # Overwrite existing config
WindowsDesktopUse.exe setup --dry-run # Show config without writing
whisper - Whisper AI Models
Manage Whisper AI models for audio transcription.
WindowsDesktopUse.exe whisper # List available models and check installation
WindowsDesktopUse.exe whisper --list # Show model list only
Available MCP Tools
Vision
- visual_list: List monitors, windows, or all. Switch with the type parameter.
- visual_capture: Capture a monitor, window, or region. Dynamic quality control (Normal=30/Detailed=70).
- visual_watch: Continuous monitoring/streaming. Switch between video, monitor, and unified with the mode parameter.
- visual_stop: Stop all sessions (watch, capture, audio) with a single command.
Hearing
- listen: Record system audio or microphone and transcribe to text using Whisper AI.
Input Control
- input_mouse: Unified mouse operations (move, click, drag) with the action parameter.
- input_window: Window operations (close, minimize, maximize, restore) with the action parameter.
- keyboard_key: Press safe navigation keys (Enter, Tab, arrow keys, etc.). Text typing and modifier keys (Ctrl, Alt, Win) are blocked for security.
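The keyboard restriction described above is an allowlist: anything not explicitly permitted is rejected. The Python sketch below illustrates the idea only and is not the server's implementation. The key name strings and the `is_key_allowed` helper are hypothetical, and the real set of permitted keys may differ.

```python
# Hypothetical key names; the source only confirms Enter, Tab, and arrow keys.
SAFE_KEYS = {"enter", "tab", "up", "down", "left", "right"}


def is_key_allowed(key):
    """Return True only for keys on the allowlist.

    Modifier keys (Ctrl, Alt, Win) and printable characters are rejected
    simply because they are not listed.
    """
    return key.strip().lower() in SAFE_KEYS
```

With an allowlist, a key that nobody thought to block is still rejected, which is the safer default for input automation.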
Utility
- read_window_text: Extract window text as Markdown using UI Automation.
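MCP clients invoke these tools through JSON-RPC 2.0 `tools/call` requests. Claude Desktop builds them automatically, but seeing the shape can help when debugging. The Python sketch below constructs such a request for visual_list. The `type` parameter name comes from the list above; the value `"all"` and the `make_tool_call` helper are assumptions for illustration.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def make_tool_call(name, arguments=None):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumed argument value: "all" (the tool lists monitors, windows, or all).
request = make_tool_call("visual_list", {"type": "all"})
print(json.dumps(request, indent=2))
```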
Documentation Index
- Tools Reference - Detailed command list and usage examples (Japanese).
- Development Guide - Details on build, test, and architecture (Japanese).
- Whisper AI - Information about speech recognition features and models.
- Quality Test Report - Analysis of information quality differences by quality settings.
Requirements
- Windows 11 (or Windows 10 1809+)
- .NET 8.0 Runtime/SDK
- High DPI aware (Uses physical pixel coordinates)
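Because coordinates are in physical pixels, positions taken from a DPI-scaled (logical) coordinate space need converting first. The Python sketch below shows the arithmetic under the assumption of a single, uniform display scale; per-monitor DPI setups are more involved. The `logical_to_physical` helper is hypothetical.

```python
def logical_to_physical(x, y, scale_percent):
    """Convert logical (DPI-scaled) coordinates to physical pixels.

    scale_percent is the Windows display scale, e.g. 150 for 150%.
    Integer arithmetic with round-half-up avoids float rounding surprises.
    """
    def scale(value):
        return (value * scale_percent + 50) // 100

    return scale(x), scale(y)
```

For example, at 150% scaling the logical point (100, 200) maps to the physical point (150, 300).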
License
MIT License. See LICENSE file.
