Agent Simulator
An iOS simulator in a browser you can inspect
Live iOS Simulator preview with React Native inspector, accessibility-tree driven tools, and an MCP bridge. Drive iOS apps from your browser or an AI agent — see the live screen, click any element to get its real source code, tap / swipe / type without moving the macOS cursor.
Built for three audiences at once:
- RN / Expo devs who want a Figma-style "layers panel" for their running app, with real App.tsx:42 source lookups.
- QA & demo flows that want a scriptable sim without the pain of XCUITest — accessibility-label taps, swipes, keyboard input, screenshots.
- AI agents (Claude Desktop, OpenAI Agents SDK, Cursor, …) that need a first-class tool surface to see what's on the screen and act on it. Everything the browser can do is exposed over MCP.
┌────────────────────────────┐    ┌───────────────────────────────┐
│ Browser UI                 │    │ MCP client                    │
│ / (agent-simulator)        │    │ (Claude Desktop, Codex, …)    │
│ - live MJPEG stream        │    └───────────────┬───────────────┘
│ - iOS a11y layer tree      │                    │ stdio
│ - React component tree     │                    │
│ - props + source window    │    ┌───────────────┴───────────────┐
│ - tap/swipe/drive/type     │    │ mcp/server.mjs                │
└─────────────┬──────────────┘    │ 15 MCP tools over WS          │
              │ WS + HTTP         └───────────────┬───────────────┘
              │                                   │ WS
┌─────────────┴───────────────────────────────────┴───┐
│ server.js (Node)                                    │
│ static · /api/config · /api/tree · /api/source      │
│ WS bridge · Metro /symbolicate proxy                │
└───┬────────┬────────────────────────────────────────┘
    │ stdin  │ WS (opt.)
    │        │
┌───┴──────┐ │   ┌─────────────────────┐
│sim-server│ │   │ Expo / RN app with  │
│ (Rust)   │ │   │ agent-simulator     │
│ MJPEG +  │ └───┤ metro plugin        │
│ axe HID  │     │ (inspector bridge)  │
└────┬─────┘     └─────────────────────┘
     │
 iOS Simulator (axe + simctl)
Features
| Feature | Details |
|---|---|
| Cursor-free input | Every tap / swipe / key goes through axe → FBSimulatorHID → CoreSimulator. No macOS cursor movement, no Simulator.app focus requirement, pixel-exact at any zoom. |
| Drive any iOS app | Taps, drags, mouse-wheel scroll, multitask, home, lock, keyboard passthrough. Works on Settings, Maps, Messages — not just RN apps. |
| iOS accessibility tree | Populated from axe describe-ui on boot. Every on-screen element with its label, role, and frame — before you click anything. |
| React component tree | For RN apps running the metro plugin, inspect any rendered component and get the full fiber stack with real source paths (React 19 _debugStack + Metro /symbolicate). |
| Real source code panel | Select any component and the Properties panel fetches the actual file and renders a code window around the target line. Works for your code, RN's Libraries, and Expo shims. |
| MCP bridge | 15 tools: sim_tree, sim_tap_by_label, sim_swipe, sim_type, sim_inspect, sim_source, sim_screenshot, and more. Every browser capability exposed to AI agents. |
Requirements
- macOS with Xcode + at least one iOS simulator installed.
- Node ≥ 18 (or Bun — scripts are bun-friendly).
- Rust toolchain (for building sim-server once).
- axe — the cursor-free input driver. One-line install: brew install cameroncooke/axe/axe
- A React Native / Expo app with the agent-simulator metro plugin, only if you want React component inspection. Driving, screenshots, and the iOS accessibility tree all work against any iOS app.
Quick start (1 minute)
# 1. Clone & install
git clone https://github.com/jkneen/agent-simulator
cd agent-simulator
bun install
# 2. Build the Rust sim-server (once)
(cd sim-server && cargo build --release)
# 3. Build the web UI (once, or after UI changes)
(cd web-app && bun install && bun run build)
# 4. Boot any simulator yourself, then start the server
xcrun simctl boot "iPhone 17 Pro" || true
open -a Simulator
bun start
# 5. Open the UI
open http://localhost:3200
The UI will attach to whatever iOS simulator is booted. Open Settings, Maps, Messages, or any app β you can drive it immediately. The iOS layer tree on the left is populated from the accessibility API; the React tree is empty until you load an RN app with the inspector bridge (see below).
One-command demo (the fun one)
We ship an Expo demo app and a launcher that starts everything for you.
bun demo
This will:
- Boot the first available iPhone simulator (or reuse the booted one).
- Start Metro for examples/expo-demo on port 8081.
- Launch the demo via exp://127.0.0.1:8081 in Expo Go.
- Start the agent-simulator Node server on :3200.
- Open http://localhost:3200 in your browser.
Ctrl-C cleans up both processes. Once it's running you can:
- Click anywhere in the preview to drive the app (the tap animates with a pulse ring, and the Layers panel expands to the nearest accessibility element).
- Hit I to enter Inspect mode, hover / click a React component, and watch the Properties panel fill with the component name, source file, and actual JSX code.
- Tap the + on the counter — the Code panel opens examples/expo-demo/App.tsx:42 with the <Text style={styles.pillText}>+</Text> line highlighted.
Flags:
bun demo --no-open # skip launching the browser
bun demo --port=3300 # use a different agent-simulator port
bun demo --metro-port=8082 # use a different Metro port
bun demo --device=<UDID> # target a specific simulator
Adding agent-simulator to your own RN / Expo app
If you already have an app, add the metro plugin so the inspector bridge injects at boot:
// metro.config.js
const { getDefaultConfig } = require("expo/metro-config");
const { withAgentSimulator } = require("agent-simulator/runtime/metro-plugin");
module.exports = withAgentSimulator(getDefaultConfig(__dirname));
…then install this package alongside:
# Note: `agent-simulator` is not published to npm yet.
# Install directly from GitHub:
bun add -D github:jasonkneen/agent-simulator
Or, if you don't want to touch Metro config, import the bridge at your app's entry point before registerRootComponent:
// index.ts
import "agent-simulator/runtime/inspector-bridge";
import { registerRootComponent } from "expo";
import App from "./App";
registerRootComponent(App);
When your app launches with either setup, the BRIDGE pill in the toolbar turns green and inspect clicks produce full component stacks with real source locations.
The UI
| Region | What it shows |
|---|---|
| Header | Device chip, stream + bridge status pills, Inspect toggle, Home / Multitask / Lock / Rotate, panel toggles, theme menu. |
| Layers panel (left) | Toggles between the iOS accessibility tree (populated automatically) and the React component tree (populated on inspect). Click any layer to select, hover to highlight, the tree auto-expands ancestors when you tap inside the sim. |
| Simulator canvas | Live MJPEG. In drive mode, the native cursor is hidden and replaced by a round in-preview pointer; taps fire a ping ring, drags become swipes, the mouse wheel scrolls, the keyboard types into the focused field. Inspect mode replaces the pointer with a crosshair and draws selection overlays. |
| Properties panel (right) | Selected layer's accessibility properties, React component name + source, and a highlighted code window around the JSX call site or component definition. |
| Status bar | UDID, mode flags, last-inspect frame count, keyboard shortcut hints. |
Keyboard shortcuts
- I — toggle Inspect (disabled when no RN bridge)
- Esc — cancel Inspect / unpin selection
- Any other keystroke while the preview is focused — forwarded to the sim (works for any text field)
MCP bridge
mcp/server.mjs is a standalone Model Context Protocol server. It speaks stdio to any MCP host and bridges to the running agent-simulator server over WebSocket. All 15 tools:
| Tool | Purpose |
|---|---|
| sim_info | Device UDID / name / stream URL / device-point size. |
| sim_tree | Full iOS accessibility tree. flat: true returns a labelled-element list for cheap LLM context. |
| sim_tap | Tap at (x, y) sim-ratio. |
| sim_tap_by_label | Look up an AX element by label / value (optional type filter) and tap its centre — robust to layout changes. |
| sim_swipe | Single-call native swipe with durationMs. |
| sim_type | Type printable text into the focused TextField. |
| sim_key | Press one USB-HID keycode (Return=40, Backspace=42, Esc=41, arrows 80–83, …). |
| sim_button | home / lock / power / side-button / siri / apple-pay. |
| sim_multitask | Open the app switcher. |
| sim_screenshot | Returns the current frame as MCP image content. |
| sim_inspect | Inspect the RN component at (x, y) — returns a source-symbolicated component stack. |
| sim_source | Read a code window around file:line for any absolute path in the workspace. |
| sim_select_by_source | Given fileName + line, probes a grid until it finds a rendered component whose source matches. |
| sim_subscribe_selections | Streams inspectResult events back as MCP logging notifications. |
| sim_unsubscribe_selections | Stops the above. |
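To make the flat: true option concrete, here is a hedged sketch of the kind of flattening it implies: collapsing the nested accessibility tree into a labelled-element list that fits cheaply into LLM context. The AXNode shape (label / role / frame / children) is an assumption modelled on typical axe describe-ui output, not the server's actual types.

```typescript
interface AXNode {
  label?: string;
  role?: string;
  frame?: { x: number; y: number; width: number; height: number };
  children?: AXNode[];
}

// Walk the tree depth-first; keep only labelled elements, dropping each
// element's subtree so the result is a flat list.
function flattenLabelled(node: AXNode, out: AXNode[] = []): AXNode[] {
  if (node.label) {
    const { children, ...rest } = node; // keep the element, drop its subtree
    out.push(rest);
  }
  for (const child of node.children ?? []) flattenLabelled(child, out);
  return out;
}
```

In the demo app, this style of list is what lets an agent find the "+" button by label and hand it straight to sim_tap_by_label.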
Configure an MCP client
For Claude Desktop, add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"agent-simulator": {
"command": "node",
"args": ["/absolute/path/to/agent-simulator/mcp/server.mjs"],
"env": { "SIM_PREVIEW_URL": "http://localhost:3200" }
}
}
}
Then start the agent-simulator server (bun start or bun demo). The MCP server connects on demand.
For any other MCP host (OpenAI Agents SDK, Cursor, Continue, Codex, custom harnesses): spawn node /path/to/mcp/server.mjs with SIM_PREVIEW_URL pointing at your running server.
Example agent flow
1. sim_tree flat:true → see every element on screen
2. sim_tap_by_label "+" → tap the counter's plus
3. sim_tap_by_label "Your name" type:"TextField"
4. sim_type "hello agent"
5. sim_key 40 → Return
6. sim_inspect x:0.86 y:0.24 → returns App.tsx:42
7. sim_source file:".../App.tsx" line:42 → read the JSX
Architecture
Three binaries, one simulator:
- sim-server (Rust) — captures the simulator screen via simctl io screenshot, encodes MJPEG, serves /stream.mjpeg, /snapshot.jpg, /api/tree, and /health. Reads tap / swipe / type commands from stdin and shells out to axe.
- server.js (Node) — orchestrates sim-server, serves the Vite-built UI, exposes /api/config, /api/tree, and /api/source, bridges browser and MCP WebSocket clients, proxies stack frames through Metro's /symbolicate for React inspection.
- web-app (React + shadcn + Tailwind) — the UI. Dark / light themes, two trees, resizable panels, keyboard capture, drag-to-swipe, wheel-to-scroll.
Touch injection
Clicks from the UI (or MCP) travel:
browser ── WS ──> server.js ── stdin ──> sim-server ── exec ──> axe tap -x … -y … --udid …
                                                                │
                                                                └──> FBSimulatorHID → CoreSimulator
                                                                     (device-point coords, cursor-free)
Coordinates are carried as [0, 1] ratios end-to-end; sim-server converts to device points using the root AXFrame from axe describe-ui (cached with a 5 s TTL, invalidated on rotate). Every gesture is one subprocess call — no per-step stdin traffic.
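The ratio → device-point conversion is simple enough to show. A minimal sketch, assuming the root AXFrame carries x / y / width / height in device points (field names are assumptions, not sim-server's actual Rust types):

```typescript
interface AXFrame { x: number; y: number; width: number; height: number; }

// Convert a [0, 1] ratio pair (as sent by the browser or an MCP client)
// into device points using the root frame from `axe describe-ui`.
function ratioToDevicePoint(
  rx: number,
  ry: number,
  root: AXFrame
): { x: number; y: number } {
  return {
    x: root.x + rx * root.width,
    y: root.y + ry * root.height,
  };
}

// A hypothetical 402×874-point root frame: the centre tap lands at (201, 437)
// regardless of what resolution the MJPEG stream happens to be running at.
const pt = ratioToDevicePoint(0.5, 0.5, { x: 0, y: 0, width: 402, height: 874 });
```

This is why stream resolution never affects click precision: the ratios are resolved against the accessibility frame, not the JPEG.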
/api/tree and /api/source
- GET /api/tree — proxies axe describe-ui --udid <udid> via sim-server and returns the raw JSON. The UI pulls this on boot and on refresh; the MCP sim_tree tool wraps it.
- GET /api/source?file=ABS&line=N&context=M — reads a window of lines around line from an absolute path, returns {file, lines, startLine0, endLine0, targetLine0, language}. Powers the Code panel in the UI and the sim_source MCP tool.
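A sketch of the window extraction behind /api/source. Only the response field names come from the description above; the function body itself is illustrative, assuming a 1-based line parameter and 0-based indices in the response:

```typescript
// Extract `context` lines either side of a 1-based target line.
// Mirrors the documented {lines, startLine0, endLine0, targetLine0} fields.
function sourceWindow(content: string, line: number, context: number) {
  const all = content.split("\n");
  const targetLine0 = line - 1;                           // 1-based request → 0-based index
  const startLine0 = Math.max(0, targetLine0 - context);  // clamp at file start
  const endLine0 = Math.min(all.length - 1, targetLine0 + context); // clamp at file end
  return {
    lines: all.slice(startLine0, endLine0 + 1),
    startLine0,
    endLine0,
    targetLine0,
  };
}
```

The clamping matters: a request for App.tsx:1 with a large context should return the top of the file rather than negative indices.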
React source resolution
React 19 removed fiber._debugSource. Sources now live inside fiber._debugStack, an Error object whose stack string points at bundle-URL line/col pairs. On every inspectResult:
- The inspector-bridge (in the RN app) parses _debugStack, collects up to 12 candidate frames per fiber, and attaches them as bundleFrames: […].
- server.js batch-POSTs every bundle frame to http://localhost:8081/symbolicate in one round trip and picks the first candidate per fiber that isn't inside React's own createElement / jsxDEV plumbing.
- The UI receives a regular stack[i].source with a real absolute file path.
This means every user component, RN internal component (RCTView, ScrollView, β¦), and Expo shim resolves to actual code on disk.
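For intuition, here is one way such a _debugStack string could be mined for bundle-URL frames. This is a hedged sketch, not the bridge's code: the exact frame format varies by JS engine, and the regex and BundleFrame shape are assumptions.

```typescript
interface BundleFrame { file: string; lineNumber: number; column: number; }

// Pull up to `max` "http(s)://…:line:col" frames out of an Error-style stack
// string, e.g. "at App (http://127.0.0.1:8081/index.bundle?platform=ios:42:13)".
function parseBundleFrames(stack: string, max = 12): BundleFrame[] {
  const frames: BundleFrame[] = [];
  const re = /(https?:\/\/[^\s()]+?):(\d+):(\d+)/g;
  let m: RegExpExecArray | null;
  while (frames.length < max && (m = re.exec(stack)) !== null) {
    frames.push({ file: m[1], lineNumber: Number(m[2]), column: Number(m[3]) });
  }
  return frames;
}
```

Frames like these are what get batch-POSTed to Metro's /symbolicate, which maps bundle line/col pairs back to files on disk.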
Stream quality and the FPS ceiling
Click the Quality dropdown in the toolbar. Five presets plus fine-tune sliders for FPS, JPEG quality, scale, and capture mode.
| Preset | Target FPS | Resolution | Mode | Notes |
|---|---|---|---|---|
| Eco | 2 | 300×650 | MJPEG | battery-friendly, background monitoring |
| Balanced | 3 | 400×870 | MJPEG | default — matches the CSS size the browser paints |
| Smooth | up to 10 | 400×870 | MJPEG | good for scrolling |
| Max | up to 30 | 400×870 | MJPEG | upper limit of the MJPEG path |
| Fluid | up to 60 | 400×870 | BGRA + keepalive | experimental, see caveats below |
Changing a preset fires a WebSocket setCapture, which makes the server
respawn sim-server with the new arguments and broadcast a
configChanged event. The browser <img> re-subscribes to the new
stream URL automatically. Total switch time is about half a second.
Click accuracy is decoupled from resolution
Every tap / swipe carries [0, 1] ratios. They get converted to
device points server-side using the root AXFrame from
axe describe-ui, so a 300×650 Eco stream has exactly the same click
precision as a 1206×2622 native-retina stream. Lower resolutions only
cost you pixels on screen, not targeting.
Honest FPS numbers
Measured on an iPhone 17 Pro simulator on an M-series Mac:
| Path | Requested | Actual |
|---|---|---|
| axe screenshot (one-shot) | — | 280–370 ms per call → ~3 fps ceiling |
| axe stream-video --format raw --fps 30 | 30 | ~5–10 fps (axe's screenshot loop is the bottleneck) |
| axe stream-video --format bgra | 60 | First frame, then stalls — FBVideoStreamConfiguration is hard-coded to framesPerSecond: nil, IOSurface callback stays silent on current iOS sims |
So the "up to 60 fps" on the Fluid preset is aspirational: the architecture is in place, but axe as shipped cannot deliver 60 fps today. To unblock real 60 fps you'd need one of:
- Fork axe and pass a concrete framesPerSecond value into FBVideoStreamConfiguration on the BGRA path.
- Replace the axe stream-video subprocess with a Swift helper that links FBSimulatorControl.xcframework directly and drives SimDevice / SimDeviceIOClient ourselves.
Both are real half-day projects. The current wire protocol is already a drop-in replacement — swapping in a faster backend is one file.
Keepalive in Fluid mode
Because BGRA otherwise delivers a single frame and then nothing,
sim-server runs a screenshot keepalive alongside the BGRA reader
in that mode. If no frame arrives for 300 ms, it fires one
axe screenshot, decodes the PNG, re-encodes as JPEG, and broadcasts
it — so the preview is never blank. When BGRA does fire (animation-heavy
apps, scrolling), the keepalive stays silent.
Net effect in practice:
- Idle screen — ~3 fps from keepalive.
- Active scrolling — keepalive silent, BGRA may or may not contribute.
- Worst case is never 0 fps.
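The keepalive decision itself is a small timer. A minimal sketch, assuming the real logic lives in sim-server's Rust loop and these names are purely illustrative:

```typescript
// Fire an `axe screenshot` keepalive whenever the BGRA stream has been
// silent for at least `thresholdMs`; a real BGRA frame resets the timer.
class Keepalive {
  private lastFrameAt: number;

  constructor(private thresholdMs = 300, now = 0) {
    this.lastFrameAt = now;
  }

  onBgraFrame(now: number): void {
    this.lastFrameAt = now; // real frame arrived → no keepalive needed
  }

  due(now: number): boolean {
    if (now - this.lastFrameAt < this.thresholdMs) return false;
    this.lastFrameAt = now; // the keepalive shot itself counts as a frame
    return true;
  }
}
```

With a 300 ms threshold, an idle screen yields one shot roughly every 300 ms — which is where the "~3 fps from keepalive" figure comes from.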
Env-var overrides
The server accepts the same knobs as environment variables so you can pick defaults without the UI:
SP_FPS=10 SP_QUALITY=60 SP_SCALE=0.33 SP_CAPTURE=mjpeg bun start
SP_CAPTURE=bgra bun start # force BGRA path from boot
Tips
- Use axe's simulator conventions (device points, not Mac pixels) if you're writing agent scripts directly. The UI handles the ratio → point conversion.
- axe describe-ui is a great reference: axe describe-ui --udid <UDID> | jq '.[0]'.
- The Metro cache is invalidated when inspector-bridge.js changes. If the bridge version log doesn't advance after an edit, restart Metro with expo start --clear.
- SP_FPS and SP_QUALITY env vars tune MJPEG performance (defaults: fps=3 q=55).
Development
# Re-build the Rust server after Rust edits
(cd sim-server && cargo build --release)
# Re-build the UI after web-app edits
(cd web-app && bun run build)
# Run the MCP server standalone
bun mcp
# Tail MCP traffic with the official inspector
npx @modelcontextprotocol/inspector node mcp/server.mjs
Project layout:
agent-simulator/
├── server.js             Node orchestrator, WS bridge, HTTP
├── scripts/demo.mjs      One-command demo launcher
├── sim-server/           Rust MJPEG + axe driver
├── runtime/              RN metro plugin + inspector bridge
├── web-app/              Vite + React + shadcn UI
├── web/                  Legacy single-file UI (served at /classic)
├── mcp/                  MCP server (15 tools)
├── examples/expo-demo/   Expo app with the plugin pre-configured
└── LICENSE               AGPL-3.0-or-later
License
AGPL-3.0-or-later. See LICENSE.
If you're embedding agent-simulator inside a commercial product, or you need a different license, open an issue β I'm happy to talk about dual-licensing.