Pupil
Let agents perceive, indicate, and act in any application.
Pupil is a Windows MCP server that lets an AI agent perceive your UI as structured data, indicate with an on-screen overlay, and act on the desktop when you accept.
Demo in which the human operator only presses Tab, Tab, Tab...
Click the GIF for the full-quality MP4.
Why Pupil
Today, working with an agent on a real desktop usually means a chat back-and-forth: you describe what you see, the agent describes what to do, you do it, you describe again. It works, but it's slow and a lot gets lost in translation.
Two things make that loop hard:
- agents can't reliably see what's on screen, so context comes from your words (or repeated screenshots sent through the model);
- and for many steps they need you to act (clicking a specific button, typing into a specific field, confirming a dialog) because they don't have hands on your machine.
Pupil turns that chat into something more like working side by side. The agent gets a structured view of the UI instead of guessing from screenshots, and when it needs you it draws an overlay card on the exact control to click or field to fill. You stay in charge: you can accept, skip, or ignore. As a bonus, the agent can also execute the action itself when you let it, so the same channel covers "show me", "do this", and "let me do it for you".
It's not a full autopilot. It's a tighter loop between what the agent sees, what it asks for, and what actually happens on your screen.
Examples
Overlay cards for each indicate type (info, warning, wait, danger, click, action, input):
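As a rough illustration of the seven types listed above, here is a minimal sketch of a guard that checks whether a string is one of them. The names `INDICATE_TYPES` and `isIndicateType` are hypothetical and not part of Pupil; the authoritative values and payload shape are whatever docs/MCP.md defines.

```javascript
// Indicate types as listed in this README; docs/MCP.md is the real contract.
const INDICATE_TYPES = ["info", "warning", "wait", "danger", "click", "action", "input"];

// Hypothetical guard (not a Pupil API): true only for a listed indicate type.
function isIndicateType(value) {
  return typeof value === "string" && INDICATE_TYPES.includes(value);
}

console.log(isIndicateType("click"));
console.log(isIndicateType("hover"));
```

A check like this can be handy on the agent side to reject a mistyped type before sending a request.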
Quick start (Windows)
- From the repo root, run .\scripts\build.ps1, which builds the .NET core, copies pupil-core.exe into app\vendor\win32-x64\, then runs pnpm install and pnpm rebuild electron under app\.
- Point Cursor's MCP config at the Node entrypoint. Replace <path-to-pupil-repo> with the absolute path to your clone (forward slashes are fine on Windows):
{
  "mcpServers": {
    "pupil": {
      "command": "node",
      "args": ["<path-to-pupil-repo>/app/bin/pupil-mcp.js"]
    }
  }
}
- Reload MCP or restart Cursor so the server starts.
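If you would rather generate the config entry than edit it by hand, something like the following sketch could work. `buildPupilMcpConfig` is a hypothetical helper and `C:\dev\pupil\` is only an example clone location; the JSON shape is the one shown in the steps above.

```javascript
// Hypothetical helper (not shipped with Pupil): builds the MCP entry from a clone path.
function buildPupilMcpConfig(repoPath) {
  // Convert Windows backslashes to forward slashes and drop any trailing slash.
  const root = repoPath.replace(/\\/g, "/").replace(/\/+$/, "");
  return {
    mcpServers: {
      pupil: {
        command: "node",
        args: [`${root}/app/bin/pupil-mcp.js`],
      },
    },
  };
}

// Example clone location (an assumption; substitute your own absolute path).
const config = buildPupilMcpConfig("C:\\dev\\pupil\\");
console.log(JSON.stringify(config, null, 2));
```

Normalizing to forward slashes avoids having to escape backslashes in the JSON file.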
If native binaries are missing or locked, run .\scripts\kill.ps1 before rebuilding. .\scripts\smoke.ps1 runs a basic syntax + bridge check. A legacy Python server in mcp/main.py exists for reference; see docs/MCP.md.
How it works
flowchart LR
Agent[AI_Agent] -->|MCP_stdio| Shim[pupil-mcp.js]
Shim -->|IPC| Daemon[Electron_daemon]
Daemon -->|spawn| Core[pupil_core]
- app/bin/pupil-mcp.js: MCP stdio entry; loads app/src/shim and the Electron overlay daemon.
- core/: .NET native sidecar (pupil-core.exe) that does the actual perception.
- mcp/: optional Python server (mcp/main.py), legacy / minimal; most setups use Node only.
- scripts/: Windows build, smoke, and kill helpers.
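To make the first hop in the diagram more concrete: the MCP stdio transport carries JSON-RPC 2.0 messages, one per line. The sketch below shows that framing in isolation. It is not Pupil's shim code; `frameMessage` and `parseChunk` are hypothetical helpers, and the tool name `perceive` is an assumption based on the perceive/indicate wording here (check docs/MCP.md for the real names).

```javascript
// Serialize one JSON-RPC 2.0 message as a single newline-terminated line.
function frameMessage(message) {
  return JSON.stringify(message) + "\n";
}

// Combine leftover text with a new chunk; return complete messages plus the unfinished tail.
function parseChunk(buffered, chunk) {
  const lines = (buffered + chunk).split("\n");
  const rest = lines.pop(); // text after the last newline: incomplete line or ""
  const messages = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
  return { messages, rest };
}

const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "perceive", arguments: {} }, // tool name is an assumption
};

// Simulate the message arriving in two pieces, as can happen on a pipe.
const wire = frameMessage(request);
const first = parseChunk("", wire.slice(0, 20));
const second = parseChunk(first.rest, wire.slice(20));
console.log(first.messages.length, second.messages[0].method);
```

Buffering the tail matters because a read from stdin can end in the middle of a message.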
Documentation
- docs/MCP.md: full MCP & indicator contract (perceive/indicate, types, accept semantics, response shape).
Status & community
Early development, Windows-focused today. This is my first open-source project, so feedback, bug reports, and questions are very welcome via GitHub Issues.
