Visiontest
VisionTest is an MCP (Model Context Protocol) server that provides a standardized way for AI agents and large language models to interact with mobile devices.
VisionTest - MCP Server for Mobile Automation
An MCP server that lets AI agents interact with Android devices and iOS simulators: tap, swipe, type, read UI elements, and launch apps.
What It Does
- Android + iOS automation through a single MCP server
- UI interaction: tap, swipe, type text, find elements, read screen hierarchy
- App management: list, inspect, and launch apps
- Device detection: automatically finds connected Android devices and booted iOS simulators
- Zero-config iOS: uses the pre-built test bundle when it is installed and falls back to a source build otherwise
Prerequisites
- JDK 17 or higher
- macOS or Linux (arm64 or x86_64)
- Android Platform Tools (for Android automation)
- Xcode Command Line Tools (for iOS simulator automation, macOS only)
Installation
Quick Install (Recommended)
curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash
This will:
- Check that Java 17+ is installed
- Download the latest release JAR, Android APKs, and iOS test bundle
- Create a visiontest command in ~/.local/bin/
- Verify all downloads via SHA-256 checksums
You can customize the install directory:
VISIONTEST_DIR="$HOME/my-tools/visiontest" curl -fsSL https://github.com/docer1990/visiontest/releases/latest/download/install.sh | bash
To update, re-run the same command.
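If you prefer to download a release file by hand, the checksum step can be reproduced locally. The following is a minimal sketch of SHA-256 verification in POSIX shell, not the installer's actual code. The helper names sha256_of and verify_sha256 are hypothetical, and the expected digest must come from wherever the release publishes its checksums.

```shell
# Print the SHA-256 hex digest of a file.
# Uses sha256sum (common on Linux) or falls back to shasum -a 256 (common on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed only if the file's digest equals the expected hex string.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  actual=$(sha256_of "$1") || return 1
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

For example, verify_sha256 visiontest.jar "$expected" || echo "checksum mismatch" >&2 would flag a corrupted download, where $expected holds the published digest.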
Configure Your AI Coding Tool
Claude Code
claude mcp add visiontest java -- -jar ~/.local/share/visiontest/visiontest.jar
Claude Desktop
Edit the config file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"visiontest": {
"command": "java",
"args": ["-jar", "/ABSOLUTE/PATH/TO/.local/share/visiontest/visiontest.jar"]
}
}
}
Note: Replace /ABSOLUTE/PATH/TO with your home directory (e.g. /Users/yourname on macOS, /home/yourname on Linux). JSON does not expand ~.
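Because JSON does not expand ~, one way to avoid typing the path by hand is to let the shell expand $HOME and print the entry for you. This is a small sketch under the assumption that the JAR sits in the default install location. The function name print_claude_desktop_config is hypothetical, and the output is printed to stdout so you can merge it into an existing config rather than overwrite it.

```shell
# Print a Claude Desktop "mcpServers" entry with $HOME expanded to an absolute path.
# Assumes the default install location; adjust the path if you set VISIONTEST_DIR.
# Note: a home directory containing double quotes or backslashes would need JSON escaping.
print_claude_desktop_config() {
  jar="$HOME/.local/share/visiontest/visiontest.jar"
  cat <<EOF
{
  "mcpServers": {
    "visiontest": {
      "command": "java",
      "args": ["-jar", "$jar"]
    }
  }
}
EOF
}
```

Running print_claude_desktop_config in a terminal prints the block shown above with your real home directory filled in.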
GitHub Copilot CLI
Add to ~/.copilot/mcp-config.json:
{
"mcpServers": {
"visiontest": {
"command": "java",
"args": ["-jar", "/ABSOLUTE/PATH/TO/.local/share/visiontest/visiontest.jar"],
"type": "stdio"
}
}
}
OpenAI Codex CLI
codex mcp add visiontest -- java -jar ~/.local/share/visiontest/visiontest.jar
Or add to ~/.codex/config.toml:
[mcp_servers.visiontest]
command = "java"
args = ["-jar", "/ABSOLUTE/PATH/TO/.local/share/visiontest/visiontest.jar"]
OpenCode
Add to opencode.json (project root or ~/.config/opencode/opencode.json):
{
"mcp": {
"visiontest": {
"type": "local",
"command": ["java", "-jar", "/ABSOLUTE/PATH/TO/.local/share/visiontest/visiontest.jar"]
}
}
}
Build from Source
For development or contributing, see CONTRIBUTING.md.
Usage
Your AI coding tool discovers all available tools automatically via MCP. Just ask it to interact with a device and it will use the right tools.
Android Workflow
1. install_automation_server: Install APKs (one-time setup)
2. start_automation_server: Start the JSON-RPC server
3. get_interactive_elements: Get interactive elements with tap coordinates
4. android_tap_by_coordinates: Tap using centerX/centerY
5. android_input_text: Type text into focused field
iOS Workflow
1. ios_start_automation_server: Start XCUITest server (pre-built or source build)
2. ios_get_interactive_elements: Get interactive elements with tap coordinates
3. ios_tap_by_coordinates: Tap using centerX/centerY
4. ios_input_text: Type text into focused field
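Your AI tool normally issues these steps for you, but it can help to see what a step looks like on the wire. In MCP, a client invokes a tool with a JSON-RPC request whose method is tools/call. The sketch below only builds such a request as text. The helper name mcp_tool_call is hypothetical, and the argument objects each VisionTest tool accepts are not documented here, so the empty {} default is an assumption; consult the tool schema your client reports.

```shell
# Print a JSON-RPC 2.0 "tools/call" request for an MCP tool.
# Usage: mcp_tool_call <id> <tool-name> [arguments-json]
# The arguments default to an empty object (an assumption, not a documented schema).
mcp_tool_call() {
  args=$3
  [ -n "$args" ] || args='{}'
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/call","params":{"name":"%s","arguments":%s}}\n' \
    "$1" "$2" "$args"
}
```

For instance, mcp_tool_call 1 get_interactive_elements prints the request an MCP client might send for step 3 of the Android workflow.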
Available Tools
Device Management: available_device_android, list_apps_android, info_app_android, launch_app_android, ios_available_device, ios_list_apps, ios_info_app, ios_launch_app
Android Automation: install_automation_server, start_automation_server, automation_server_status, get_ui_hierarchy, get_interactive_elements, find_element, android_tap_by_coordinates, android_swipe, android_swipe_direction, android_swipe_on_element, android_get_device_info, android_input_text, android_press_back, android_press_home
iOS Automation: ios_start_automation_server, ios_automation_server_status, ios_get_ui_hierarchy, ios_get_interactive_elements, ios_find_element, ios_tap_by_coordinates, ios_swipe, ios_swipe_direction, ios_get_device_info, ios_input_text, ios_press_home, ios_stop_automation_server
CLI Usage
The same operations are also available as direct CLI commands, with no MCP client needed:
visiontest automation_server_status -p android
visiontest get_interactive_elements -p ios
visiontest tap_by_coordinates -p android 100 200
visiontest screenshot -p ios --output ./screenshot.png
visiontest swipe_direction -p android up --distance long --speed fast
Every command requires --platform android or --platform ios (alias -p). Run visiontest --help for the full command list, or visiontest <command> --help for per-command usage.
With no arguments, visiontest starts the MCP stdio server.
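The CLI commands compose naturally in scripts. As a rough sketch, the function below chains three of the commands shown above and stops at the first failure. The name tap_flow is hypothetical, and it assumes tap_by_coordinates accepts -p ios in the same form as the Android example.

```shell
# Check the automation server, list interactive elements, then tap a point.
# Usage: tap_flow <android|ios> <x> <y>
# Returns the exit code of the first command that fails.
tap_flow() {
  platform=$1
  x=$2
  y=$3
  visiontest automation_server_status -p "$platform" || return $?
  visiontest get_interactive_elements -p "$platform" || return $?
  visiontest tap_by_coordinates -p "$platform" "$x" "$y"
}
```

Calling tap_flow android 100 200 is then equivalent to running the three commands in sequence by hand.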
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Generic failure |
| 2 | Usage error (missing/invalid args) |
| 3 | Automation server not reachable |
| 4 | Device/simulator not found |
| 5 | Platform not supported for this command |
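In scripts and CI jobs it is often useful to turn these codes back into readable messages. The sketch below mirrors the table above. Both function names are hypothetical helpers, not part of VisionTest.

```shell
# Map a VisionTest CLI exit code to the meaning listed in the table above.
describe_exit_code() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "Generic failure" ;;
    2) echo "Usage error (missing/invalid args)" ;;
    3) echo "Automation server not reachable" ;;
    4) echo "Device/simulator not found" ;;
    5) echo "Platform not supported for this command" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# Run any command, report its exit code in words on stderr, and pass the code through.
run_and_explain() {
  code=0
  "$@" || code=$?
  echo "exit $code: $(describe_exit_code "$code")" >&2
  return "$code"
}
```

For example, run_and_explain visiontest automation_server_status -p android prints a line such as "exit 3: Automation server not reachable" when the server is down, and still returns 3 to the caller.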
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| VISION_TEST_LOG_LEVEL | PRODUCTION | PRODUCTION, DEVELOPMENT, DEBUG |
| VISION_TEST_APK_PATH | (auto-detected) | Explicit path to Android test APK |
| VISION_TEST_IOS_PROJECT_PATH | (auto-detected) | Explicit path to iOS .xcodeproj |
| VISIONTEST_DIR | ~/.local/share/visiontest | Override install directory (must be under $HOME) |
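These variables are read from the environment of the process that launches VisionTest. As an illustration, the hypothetical wrapper below sets VISION_TEST_LOG_LEVEL for a single command and rejects values that are not in the table. Whether VisionTest itself validates the value is not stated here, so the check is only a convenience.

```shell
# Run a command with VISION_TEST_LOG_LEVEL set for that command only.
# Usage: with_log_level <PRODUCTION|DEVELOPMENT|DEBUG> <command> [args...]
with_log_level() {
  level=$1
  shift
  case "$level" in
    PRODUCTION|DEVELOPMENT|DEBUG)
      env VISION_TEST_LOG_LEVEL="$level" "$@"
      ;;
    *)
      echo "unknown log level: $level" >&2
      return 2
      ;;
  esac
}
```

For example, with_log_level DEBUG visiontest get_interactive_elements -p android runs one command with debug logging without changing your shell's environment.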
Ports
- Android: 9008 (requires ADB port forwarding, set up automatically)
- iOS: 9009 (no port forwarding needed, since simulators share the Mac's network)
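When an automation server seems unreachable, a quick first check is whether anything is listening on the expected port. The sketch below uses the port numbers listed above. The function names are hypothetical, and port_is_open assumes a netcat (nc) that supports the -z flag.

```shell
# Print the default automation server port for a platform.
default_port() {
  case "$1" in
    android) echo 9008 ;;
    ios) echo 9009 ;;
    *) echo "unknown platform: $1" >&2; return 2 ;;
  esac
}

# Succeed if something accepts TCP connections on the platform's port on localhost.
# Assumes nc is installed and supports -z (scan without sending data).
port_is_open() {
  port=$(default_port "$1") || return $?
  nc -z 127.0.0.1 "$port" >/dev/null 2>&1
}
```

For Android, adb forward --list additionally shows which port forwards are currently active on the host.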
Future Plans
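- Text input/typing support (already available as android_input_text and ios_input_text)
- Screenshot capture via UIAutomator / XCUITest (a screenshot CLI command is already available)
- CLI mode (direct command-line usage without MCP; already available, see CLI Usage)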
- Long press operations
- Wait/sync operations for E2E testing
- Multi-device coordination
- Generic app install/uninstall
- Clipboard operations (read/write)
- Physical iOS device support
- WebSocket support for real-time updates
- Notification/status bar interaction
- Permission dialog automation
- Video recording of automation sessions
- Separate CLI-only artifact (smaller download, no MCP dependencies)
Contributing
See CONTRIBUTING.md for build-from-source instructions, architecture details, JSON-RPC API reference, testing guide, and how to extend VisionTest.
License
This project is licensed under the MIT License - see the LICENSE file for details.
