Appium Server
Node.js server that exposes Appium mobile automation capabilities as MCP tools for LLM agents.
Installation
npx mcp-appium-server
MCP Appium Server
Model Context Protocol (MCP) Server for Appium: a Node.js server that exposes mobile automation capabilities as MCP tools for LLM agents.
Status: work in progress
What is Model Context Protocol (MCP)?
Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to Large Language Models (LLMs). Think of it as a universal adapter that allows AI agents like GPT-4 or Claude to interact with external tools and services.
Key Concepts
- MCP Server: A program that exposes capabilities (tools, resources, prompts) to LLM agents
- MCP Client: An LLM agent or application that consumes these capabilities
- Tools: Functions that an LLM can call to perform actions (e.g., click a button, type text)
- stdio Transport: Communication over standard input/output (perfect for local execution)
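To make the concepts above concrete, here is a minimal sketch of how a client might frame a tool call for the stdio transport. MCP messages are JSON-RPC 2.0 objects, and a tool invocation uses the `tools/call` method. On stdio each message is written as a single line of JSON. The helper names `buildToolCall` and `frame` are illustrative and not part of this project.

```typescript
// Shape of a JSON-RPC 2.0 request that invokes an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the request object for calling a named tool with arguments.
// (Hypothetical helper, for illustration only.)
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Serialize to one newline-terminated line, as it would be written to the server's stdin.
function frame(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

const line = frame(
  buildToolCall(1, "tap_element", {
    sessionId: "abc123...",
    strategy: "id",
    selector: "com.example:id/login_button",
  })
);
process.stdout.write(line);
```

In practice an MCP client library handles this framing, so the sketch is only meant to show what travels over the pipe.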
Why MCP for Mobile Automation?
Instead of writing imperative test scripts, you describe what you want to test in natural language. The LLM agent:
- Understands your intent
- Calls MCP tools to interact with the mobile app
- Observes the results (screenshots, element states)
- Makes decisions autonomously
- Provides human-readable reports
This is not a test framework; it is an LLM-controlled mobile automation control plane.
Architecture
┌─────────────────────────────────────┐
│     LLM Agent (GPT-4 / Claude)      │
│     "Tap the login button"          │
└────────────┬────────────────────────┘
             │ MCP Protocol
             │ (stdio)
             ▼
┌─────────────────────────────────────┐
│   MCP Appium Server (Node.js)       │
├─────────────────────────────────────┤
│  • Tool Registry (6 tools)          │
│  • Session Manager (stateful)       │
│  • Appium Launcher                  │
│  • Command Executor (WebdriverIO)   │
└────────────┬────────────────────────┘
             │ WebDriver Protocol
             ▼
┌─────────────────────────────────────┐
│      Appium Server (2.x)            │
└────────────┬────────────────────────┘
             │
             ▼
┌─────────────────────────────────────┐
│  Android Emulator / iOS Simulator   │
└─────────────────────────────────────┘
MCP Tools
This server exposes 6 production-ready MCP tools:
1. create_mobile_session
Purpose: Create an Appium session for Android or iOS
Input:
{
"platform": "android | ios",
"capabilities": {
"deviceName": "Pixel_5_API_31",
"app": "/path/to/app.apk",
"automationName": "UiAutomator2"
}
}
Output:
{
"success": true,
"sessionId": "abc123...",
"platform": "android"
}
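The input shape above could be described and checked in TypeScript roughly as follows. This is a sketch under assumptions: the type and function names are hypothetical, and the project's real schemas (in `toolSchemas.ts`) may validate more or differently. Note that `"android | ios"` in the example means one of the two values, not the literal string.

```typescript
type Platform = "android" | "ios";

// Hypothetical type mirroring the documented input of create_mobile_session.
interface CreateSessionInput {
  platform: Platform;
  capabilities: Record<string, unknown>;
}

// Returns a list of problems; an empty list means the input looks usable.
function validateCreateSessionInput(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["input must be an object"];
  }
  const problems: string[] = [];
  const { platform, capabilities } = input as Partial<CreateSessionInput>;
  if (platform !== "android" && platform !== "ios") {
    problems.push('platform must be "android" or "ios"');
  }
  if (typeof capabilities !== "object" || capabilities === null) {
    problems.push("capabilities must be an object");
  }
  return problems;
}

const exampleInput = {
  platform: "android",
  capabilities: {
    deviceName: "Pixel_5_API_31",
    app: "/path/to/app.apk",
    automationName: "UiAutomator2",
  },
};
console.log(validateCreateSessionInput(exampleInput));
```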
2. tap_element
Purpose: Tap/click on a UI element
Input:
{
"sessionId": "abc123...",
"strategy": "id | xpath | accessibilityId",
"selector": "com.example:id/login_button"
}
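One plausible way to handle the `strategy` field is to translate it into an Appium locator strategy name before the element lookup. The sketch below assumes the three strategies listed above map to Appium's `id`, `xpath`, and `accessibility id` locators. The `toLocator` helper is hypothetical and may not match how this server's command executor does it.

```typescript
type Strategy = "id" | "xpath" | "accessibilityId";

// A WebDriver-style locator: which strategy to use and the value to search for.
interface Locator {
  using: string;
  value: string;
}

// Tool-level strategy names mapped to Appium locator strategy names (assumed mapping).
const STRATEGY_TO_USING: Record<Strategy, string> = {
  id: "id",
  xpath: "xpath",
  accessibilityId: "accessibility id",
};

// Translate a tool input into a locator, rejecting unknown strategies early.
function toLocator(strategy: string, selector: string): Locator {
  if (!Object.prototype.hasOwnProperty.call(STRATEGY_TO_USING, strategy)) {
    throw new Error(`Unsupported strategy: ${strategy}`);
  }
  return { using: STRATEGY_TO_USING[strategy as Strategy], value: selector };
}

console.log(toLocator("accessibilityId", "username_field"));
```

Rejecting unknown strategies up front gives the agent a clear error instead of a generic lookup failure.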
3. type_text
Purpose: Type text into an element
Input:
{
"sessionId": "abc123...",
"strategy": "accessibilityId",
"selector": "username_field",
"text": "testuser@example.com"
}
4. navigate_back
Purpose: Press back button (Android) or navigate back (iOS)
Input:
{
"sessionId": "abc123..."
}
5. take_screenshot
Purpose: Capture current screen state
Input:
{
"sessionId": "abc123..."
}
Output:
{
"success": true,
"imageBase64": "iVBORw0KGgoAAAANSUhEU...",
"format": "png"
}
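A client receiving this output will usually want to turn `imageBase64` back into bytes. The following sketch decodes the string and checks the 8-byte PNG signature before saving. The function names are illustrative, not part of the server.

```typescript
import { writeFileSync } from "node:fs";

// Every PNG file starts with these eight bytes.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decode the base64 payload and verify that it looks like a PNG.
function decodePng(imageBase64: string): Buffer {
  const bytes = Buffer.from(imageBase64, "base64");
  if (bytes.length < 8 || !bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error("Decoded data does not start with the PNG signature");
  }
  return bytes;
}

// Write the decoded screenshot to disk (hypothetical helper).
function saveScreenshot(imageBase64: string, filePath: string): void {
  writeFileSync(filePath, decodePng(imageBase64));
}
```

The `iVBORw0KGgo` prefix in the sample output is simply the base64 encoding of that signature, which is why PNG payloads always begin with it.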
6. close_mobile_session
Purpose: Close session and cleanup
Input:
{
"sessionId": "abc123..."
}
Setup
Prerequisites
- Node.js 18+
  node --version  # Should be >= 18.0.0
- Appium 2.x
  npm install -g appium
  appium driver install uiautomator2  # For Android
  appium driver install xcuitest      # For iOS
- Android Setup (for Android testing)
  - Android SDK installed
  - Emulator or physical device available
  - adb devices shows your device
- iOS Setup (for iOS testing)
  - Xcode installed (macOS only)
  - iOS Simulator available
  - Xcode Command Line Tools installed
Installation
- Clone and install dependencies:
  cd mcp-appium-server
  npm install
- Build TypeScript:
  npm run build
- Test the server (optional):
  npm start
  # Server will start and listen on stdio
  # Press Ctrl+C to stop
Usage
Option 1: Direct Execution
node dist/index.js
The server will:
- Start listening on stdio
- Auto-launch Appium when the first session is created
- Handle graceful shutdown on SIGINT/SIGTERM
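Graceful shutdown on SIGINT/SIGTERM is commonly implemented along these lines. This is a sketch, not the project's actual `index.ts`: the `installShutdownHandlers` name and the `cleanup` callback are assumptions, and the emitter parameter exists only so the logic can be exercised without sending real signals.

```typescript
import { EventEmitter } from "node:events";

// Register handlers for SIGINT and SIGTERM that run `cleanup` at most once.
function installShutdownHandlers(
  cleanup: (signal: string) => void | Promise<void>,
  target: EventEmitter = process
): void {
  let shuttingDown = false;
  for (const signal of ["SIGINT", "SIGTERM"]) {
    target.on(signal, () => {
      if (shuttingDown) return; // ignore repeated signals while cleanup is running
      shuttingDown = true;
      const run = async (): Promise<void> => {
        await cleanup(signal);
      };
      run().catch((err) => console.error("Cleanup failed:", err));
    });
  }
}

// Typical use (left commented out so the sketch has no side effects):
// installShutdownHandlers(async (signal) => {
//   console.error(`Received ${signal}, closing sessions and stopping Appium...`);
//   process.exit(0);
// });
```

Guarding with a flag matters because users often press Ctrl+C twice, and closing sessions twice could mask the original error.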
Option 2: MCP Client Configuration
Configure your MCP client (e.g., Claude Desktop, custom LLM agent) to use this server:
Example MCP config (~/.mcp/config.json):
{
"mcpServers": {
"appium": {
"command": "node",
"args": ["/absolute/path/to/mcp-appium-server/dist/index.js"],
"env": {}
}
}
}
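Because the `args` entry must be an absolute path, it can help to generate the config instead of typing it by hand. The sketch below builds the same structure and rejects relative paths. `buildMcpConfig` is a hypothetical helper, and the `./mcp-appium-server` checkout location is only an example.

```typescript
import { isAbsolute, resolve } from "node:path";

// Build an MCP client config entry for this server from an absolute entry-point path.
function buildMcpConfig(entryPath: string) {
  if (!isAbsolute(entryPath)) {
    throw new Error(`Entry path must be absolute: ${entryPath}`);
  }
  return {
    mcpServers: {
      appium: { command: "node", args: [entryPath], env: {} as Record<string, string> },
    },
  };
}

// Resolve the built entry point relative to the current directory (example location).
const entry = resolve("./mcp-appium-server", "dist", "index.js");
console.log(JSON.stringify(buildMcpConfig(entry), null, 2));
```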
Example LLM Conversation
User: "Start an Android session for the Calculator app, then tap the number 5 button"
LLM Agent (internal reasoning):
- Call create_mobile_session with Android capabilities
- Receive sessionId
- Call tap_element with selector for "5" button
- Return success to user
Result: The calculator app opens and "5" is tapped autonomously.
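The same sequence can be sketched as code. Here `callTool` stands in for whatever function your MCP client exposes for invoking tools. It is an assumption, as are the function name and the idea of returning the session ID so the caller can close the session later.

```typescript
type ToolArgs = Record<string, unknown>;
// Hypothetical client function: invokes a tool by name and resolves with its output.
type CallTool = (name: string, args: ToolArgs) => Promise<Record<string, unknown>>;

// Create an Android session, tap one element, and hand the session ID back to the caller.
async function startSessionAndTap(
  callTool: CallTool,
  capabilities: ToolArgs,
  strategy: string,
  selector: string
): Promise<string> {
  const created = await callTool("create_mobile_session", {
    platform: "android",
    capabilities,
  });
  if (created.success !== true) {
    throw new Error("Session creation failed");
  }
  // The session ID from the first call is passed explicitly to every later call.
  const sessionId = String(created.sessionId);
  await callTool("tap_element", { sessionId, strategy, selector });
  return sessionId;
}
```

The selector for the "5" button depends on the calculator app in use, so it is left as a parameter here.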
Project Structure
mcp-appium-server/
├── src/
│   ├── index.ts                 # Entry point, signal handlers
│   ├── mcp/
│   │   ├── server.ts            # MCP server setup & tool registry
│   │   └── tools.ts             # Tool implementations
│   ├── appium/
│   │   ├── appiumLauncher.ts    # Start/stop Appium process
│   │   ├── sessionManager.ts    # Session lifecycle & state
│   │   └── commandExecutor.ts   # WebDriver command mapping
│   └── types/
│       └── toolSchemas.ts       # JSON schemas for tools
├── package.json
├── tsconfig.json
└── README.md
Design Principles
1. MCP-Compliant
- Follows MCP spec for tool registration
- Uses stdio transport
- Structured errors with error codes
2. Stateless Tools, Stateful Sessions
- Tool handlers keep no state of their own between calls
- Session state managed in-memory
- Session IDs passed explicitly
3. Safe for Autonomous Agents
- No destructive commands exposed
- Graceful error handling
- Session validation on every call
4. Production-Quality
- Comprehensive logging (Winston)
- TypeScript strict mode
- Graceful shutdown with cleanup
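The "stateful sessions" and "session validation on every call" principles could be realized with a small in-memory store like the one below. This is a sketch under assumptions: the class name `SessionStore` is hypothetical and the real `sessionManager.ts` may be organized differently. The error text follows the example shown under Error Handling.

```typescript
// In-memory registry of live sessions, keyed by session ID.
class SessionStore<T> {
  private readonly sessions = new Map<string, T>();

  add(sessionId: string, session: T): void {
    this.sessions.set(sessionId, session);
  }

  // Intended to be called at the start of every tool handler that takes a sessionId.
  require(sessionId: string): T {
    const session = this.sessions.get(sessionId);
    if (session === undefined) {
      throw new Error(
        `Session not found: ${sessionId}. Create a session first using create_mobile_session.`
      );
    }
    return session;
  }

  // Returns true if a session was actually removed.
  remove(sessionId: string): boolean {
    return this.sessions.delete(sessionId);
  }

  get size(): number {
    return this.sessions.size;
  }
}

const store = new SessionStore<{ platform: string }>();
store.add("abc123", { platform: "android" });
console.log(store.require("abc123").platform);
```

Keeping state only in this store, and passing the ID on every call, is what lets the tool handlers themselves stay free of hidden state.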
Error Handling
All errors are returned as MCP-compatible structured errors:
{
"error": {
"code": "InternalError",
"message": "Session not found: xyz123. Create a session first using create_mobile_session."
}
}
Common Errors:
- Session not found: Call create_mobile_session first
- Element not found: Check your selector strategy/value
- Appium startup timeout: Ensure Appium is installed and in PATH
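A handler wrapper that produces the structured error shape shown above might look like this. It is a minimal sketch: `toStructuredError` and `safely` are hypothetical names, and only the `InternalError` code from the example is assumed to exist.

```typescript
interface StructuredError {
  error: { code: string; message: string };
}

// Convert any thrown value into the documented error shape.
function toStructuredError(err: unknown, code = "InternalError"): StructuredError {
  const message = err instanceof Error ? err.message : String(err);
  return { error: { code, message } };
}

// Run a tool handler and return either its result or a structured error, never a raw throw.
async function safely<T>(handler: () => Promise<T>): Promise<T | StructuredError> {
  try {
    return await handler();
  } catch (err) {
    return toStructuredError(err);
  }
}

console.log(JSON.stringify(toStructuredError(new Error("Element not found")), null, 2));
```

Returning errors as data gives the agent a message it can read and act on, such as retrying with a different selector.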
Known Limitations
- Local Execution Only: No cloud device farm support (yet)
- stdio Transport Only: No WebSocket/HTTP support (yet)
- Basic Element Interaction: No gestures (swipe, pinch) yet
- No Accessibility Tree Export: Coming in future versions
- No Vision Integration: Screenshot analysis requires external LLM vision
Future Extensions
- Gesture support (swipe, scroll, pinch)
- Accessibility tree export for better element discovery
- Vision API integration for screenshot analysis
- WebSocket transport for remote execution
- Cloud device farm support (BrowserStack, Sauce Labs)
- Enhanced element discovery with AI-powered selectors
Troubleshooting
"Appium startup timeout"
Solution: Ensure Appium is installed globally:
npm install -g appium
appium --version
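If Appium is installed but the timeout persists, it can help to check whether the server ever becomes reachable. The sketch below polls a probe until it succeeds or a deadline passes. It assumes Appium 2.x is listening on its default address `http://127.0.0.1:4723` and uses its `/status` endpoint. The function names and the timeout values are illustrative, and this server's launcher may check readiness differently.

```typescript
type Probe = () => Promise<boolean>;

// Poll `probe` until it resolves true or `timeoutMs` elapses; resolves false on timeout.
async function waitUntilReady(
  probe: Probe,
  timeoutMs: number,
  intervalMs = 500
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    // A failing probe (for example, connection refused) counts as "not ready yet".
    if (await probe().catch(() => false)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Probe for a local Appium server (assumed default host and port).
const appiumStatusProbe: Probe = async () => {
  const response = await fetch("http://127.0.0.1:4723/status");
  return response.ok;
};

// Example use (commented out because it needs a running Appium server):
// waitUntilReady(appiumStatusProbe, 30_000).then((ok) => console.log(ok ? "ready" : "timed out"));
```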
"Session creation failed"
Solution: Check device availability:
# Android
adb devices
# iOS
xcrun simctl list devices
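For Android, the output of `adb devices` can also be checked programmatically. The sketch below parses the text and keeps only the lines that report a device with a recognized state. The parser name is hypothetical, and it handles only the three common states `device`, `offline`, and `unauthorized`.

```typescript
interface AdbDevice {
  serial: string;
  state: string;
}

// Parse the text printed by `adb devices` into a list of devices.
// Header and daemon status lines do not match the pattern and are skipped.
function parseAdbDevices(output: string): AdbDevice[] {
  const devices: AdbDevice[] = [];
  for (const line of output.split(/\r?\n/)) {
    const match = /^(\S+)\s+(device|offline|unauthorized)\b/.exec(line);
    if (match) {
      devices.push({ serial: match[1], state: match[2] });
    }
  }
  return devices;
}

const sampleOutput = "List of devices attached\nemulator-5554\tdevice\n\n";
const usable = parseAdbDevices(sampleOutput).filter((d) => d.state === "device");
console.log(usable);
```

A device listed as `unauthorized` or `offline` is visible to adb but cannot host a session, which is a common cause of session creation failures.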
"Element not found"
Solution: Take a screenshot first to verify element visibility:
{
"tool": "take_screenshot",
"input": { "sessionId": "..." }
}
Contributing
This is a production-grade MCP server designed for LLM-controlled mobile automation. Contributions welcome!
Design Guidelines:
- Keep tools simple and composable
- Never expose raw WebDriver APIs
- Maintain backward compatibility
- Test with real LLM agents
License
ISC
Credits
Built with:
- @modelcontextprotocol/sdk - MCP protocol implementation
- Appium - Mobile automation framework
- WebdriverIO - WebDriver client
- Winston - Logging
Questions? Check out the Model Context Protocol documentation or open an issue.
