OpenPocket

An Intelligent Phone That Never Sleeps
Let AI handle your phone-use tasks — locally, privately, autonomously.

Website · Documentation · Quickstart · Discord · Reddit · Contributing

What is OpenPocket?

Imagine having a second phone that works for you around the clock — replying to messages, handling payments, playing games, posting on social media — all while your real phone stays safely in your pocket, untouched.

OpenPocket is an open-source framework that lets AI agents operate an Android phone on your behalf. Tell it what you want in plain language, and it figures out the rest — tapping, scrolling, typing, navigating between apps — just like a real person would.

Local-first — everything runs on your machine; your data never leaves your computer.
Privacy by default — the agent phone and your personal phone are completely isolated. Sensitive actions require explicit human approval.
Emulator + real device — connect to Android emulators or physical phones over ADB. Run multiple agents against multiple targets to build your own local phone farm.
Extensible — add new capabilities through a single SKILL.md file, or build your own agent workflows.

Demos

Category	Demo
Social	Auto-manage social media
Gaming	Play mobile games autonomously
Utility	Payment: handle bills and payments
Studying	Assist with learning tasks
Social	Autopilot your X
Utility	Find the best price or travel option

Highlights

Multi-model — works with OpenAI GPT-5.x, Claude 4.6, Gemini 3.x, DeepSeek, Qwen, GLM, Kimi, MiniMax, Doubao, and more.
Multi-agent — run multiple isolated agents, each with its own config, workspace, target device, and session state.
Scheduled jobs — create cron tasks from chat or CLI in natural language (e.g. "Every day at 8am open Slack and check in").
Human-auth relay — sensitive actions (camera, payments, location) escalate to you for approval through a private local relay.
Channel integrations — receive tasks and results through Telegram, Discord, WhatsApp, or CLI.
Skills framework — extend agent capabilities by dropping a SKILL.md into the skills directory — no code changes needed.

Quick Start

Option A — npm (recommended)

npm install -g openpocket
openpocket onboard

Option B — from source (for contributors)

git clone git@github.com:pockebot/openpocket.git
cd openpocket
npm install
npm run build
./openpocket onboard

Then start the agent gateway:

openpocket gateway start

Or run a one-off task directly:

openpocket agent --model gpt-5.2-codex "Open Chrome and search weather"

For full setup details see the Quickstart guide, Device targets, and Configuration.

Usage

Multi-agent management

openpocket create agent review-bot --type physical-phone --device R5CX123456A
openpocket create agent ops-bot --type emulator
openpocket agents list

Target a specific agent with --agent:

openpocket --agent review-bot gateway start
openpocket --agent review-bot config-show
openpocket --agent review-bot target show
openpocket --agent review-bot channels login --channel discord

Scheduled jobs

Describe a schedule in natural language from chat or the CLI. OpenPocket asks for confirmation before saving the job, and each job runs in its own isolated cron:<jobId> session.

openpocket cron list
openpocket cron add --id daily-slack-checkin \
  --name "Daily Slack Check-in" \
  --cron "0 8 * * *" --tz Asia/Shanghai \
  --task "Open Slack and complete check-in" \
  --channel telegram --to 12345
openpocket cron disable --id daily-slack-checkin
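
The `--cron "0 8 * * *"` value above has five fields, which in the usual cron layout are minute, hour, day of month, month, and day of week, so the job fires at 08:00 every day in the `--tz` time zone. As a minimal sketch (not OpenPocket's own parser), the fields can be split out as follows. `CronFields` and `parseCronFields` are hypothetical names introduced only for illustration.

```typescript
// Illustrative only: splits a five-field cron expression into named parts.
// Not OpenPocket's parser; all names here are hypothetical.
interface CronFields {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
}

function parseCronFields(expr: string): CronFields {
  const parts = expr.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

// The schedule used in the example above: minute 0, hour 8, every day.
const daily = parseCronFields("0 8 * * *");
console.log(daily.hour, daily.minute);
```

The sketch only names the fields; it does not validate ranges or expand lists and steps such as `*/5`.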

Manager dashboard and shared relay

openpocket dashboard manager
openpocket human-auth-relay start

dashboard manager — overview of all agents, targets, channels, and gateway status.
human-auth-relay start — shared relay hub for human-auth approval flows, with optional ngrok public URL.
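
The approval flow behind the relay can be pictured as a gate in front of sensitive actions. The sketch below is an assumption about the general shape, not OpenPocket's implementation. The capability names come from the Highlights list (camera, payments, location), while `SENSITIVE`, `needsHumanApproval`, and `gate` are hypothetical names.

```typescript
// Hypothetical sketch of a human-approval gate; not OpenPocket's actual code.
type Decision = "approved" | "denied";

// Capabilities the README names as sensitive. A real deployment may cover more.
const SENSITIVE: readonly string[] = ["camera", "payments", "location"];

function needsHumanApproval(capability: string): boolean {
  return SENSITIVE.indexOf(capability) !== -1;
}

// Runs `action` only if the capability is non-sensitive or a human approved it.
// `ask` stands in for the round trip through the relay. A real relay call
// would be asynchronous; it is kept synchronous here for brevity.
function gate<T>(
  capability: string,
  ask: (capability: string) => Decision,
  action: () => T,
): T | undefined {
  if (needsHumanApproval(capability) && ask(capability) !== "approved") {
    return undefined; // denied: the action never runs
  }
  return action();
}

// A scroll needs no approval, so the action runs without asking.
console.log(gate("scroll", () => "denied", () => "scrolled"));
```

Putting the check in one wrapper means non-sensitive actions never wait on a human, and a denied action has no side effects.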

Device targets

openpocket target show
openpocket target set --type emulator
openpocket target set --type physical-phone
openpocket target pair --host <device-ip> --pair-port <pair-port> --code <pairing-code> --type physical-phone

Model profiles

Model configuration is per-agent. New agents inherit from the onboard template; each can diverge independently.

openpocket model show
openpocket model list
openpocket model set --name gpt-5.4
openpocket --agent review-bot model set --provider google --model gemini-3.1-pro-preview
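
The inherit-then-diverge behavior described above can be pictured as copying a template when an agent is created, so later edits to one agent never touch another. This is a minimal sketch, assuming a flat profile shape. `ModelProfile`, `createAgentProfile`, and the field names are hypothetical and not OpenPocket's config schema, and the `openai` provider value is a placeholder.

```typescript
// Hypothetical profile shape; OpenPocket's real config schema may differ.
interface ModelProfile {
  provider: string;
  model: string;
}

// Copy the template so each agent owns an independent profile.
// A shallow copy is enough for this flat shape; nested config would need a deep copy.
function createAgentProfile(
  template: ModelProfile,
  overrides: Partial<ModelProfile> = {},
): ModelProfile {
  return { ...template, ...overrides };
}

const onboardTemplate: ModelProfile = { provider: "openai", model: "gpt-5.4" };
const reviewBot = createAgentProfile(onboardTemplate, {
  provider: "google",
  model: "gemini-3.1-pro-preview",
});
const opsBot = createAgentProfile(onboardTemplate); // inherits unchanged

console.log(reviewBot.model, opsBot.model);
```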

Gateway logging

Tune log level, payload redaction, and per-module output in your agent config:

{
  "gatewayLogging": {
    "level": "info",
    "includePayloads": false,
    "maxPayloadChars": 160,
    "modules": {
      "core": true,
      "access": true,
      "task": true,
      "channel": true,
      "cron": true,
      "heartbeat": false,
      "humanAuth": true,
      "chat": false
    }
  }
}
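
To make the fields concrete, the sketch below models the config above as a TypeScript type and shows one plausible reading of how `modules`, `includePayloads`, and `maxPayloadChars` could interact. The interpretation is an assumption drawn from the field names: a disabled module is silent, and payloads are dropped unless `includePayloads` is true, in which case they are cut to `maxPayloadChars`. `formatLogLine` is a hypothetical helper, not OpenPocket's logger.

```typescript
type ModuleName =
  | "core" | "access" | "task" | "channel"
  | "cron" | "heartbeat" | "humanAuth" | "chat";

interface GatewayLogging {
  level: string;
  includePayloads: boolean;
  maxPayloadChars: number;
  modules: Record<ModuleName, boolean>;
}

// Same values as the JSON example above.
const logging: GatewayLogging = {
  level: "info",
  includePayloads: false,
  maxPayloadChars: 160,
  modules: {
    core: true, access: true, task: true, channel: true,
    cron: true, heartbeat: false, humanAuth: true, chat: false,
  },
};

// Assumed semantics: returns the line to emit, or null when the module is off.
function formatLogLine(
  cfg: GatewayLogging,
  mod: ModuleName,
  message: string,
  payload?: string,
): string | null {
  if (!cfg.modules[mod]) return null;
  if (!cfg.includePayloads || payload === undefined) return `[${mod}] ${message}`;
  return `[${mod}] ${message} ${payload.slice(0, cfg.maxPayloadChars)}`;
}

console.log(formatLogLine(logging, "task", "started", "raw payload"));
```

Under this reading, the example config logs task events without their payloads and suppresses heartbeat and chat output entirely.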

Full CLI reference: CLI and Gateway | Filesystem layout

Architecture

flowchart LR
    U["User Surfaces\nCLI / Telegram / Dashboard"] --> M["Manager Layer\nagent registry + ports + target locks"]
    M --> D0["Default Agent\nconfig + workspace + state"]
    M --> D1["Managed Agent A\nconfig + workspace + state"]
    M --> D2["Managed Agent B\nconfig + workspace + state"]

    D0 --> G0["Gateway + Dashboard"]
    D1 --> G1["Gateway + Dashboard"]
    D2 --> G2["Gateway + Dashboard"]

    G0 --> A0["Agent Runtime"]
    G1 --> A1["Agent Runtime"]
    G2 --> A2["Agent Runtime"]

    A0 --> T0["Target A"]
    A1 --> T1["Target B"]
    A2 --> T2["Target C"]

    A0 --> H0["Private Local Relay"]
    A1 --> H1["Private Local Relay"]
    A2 --> H2["Private Local Relay"]

    H0 --> RH["Shared Relay Hub\noptional ngrok"]
    H1 --> RH
    H2 --> RH

Components

#	Component	What it does	Docs
1	Multi-agent manager	Registry, port allocation, and target locks for one default + N managed agents	Multi-agent, Filesystem
2	Gateway orchestration	Per-agent gateway, dashboard, session store, channel credentials, task queue	CLI & Gateway, Runbook
3	Prompting & model loop	System/user prompt composition, context budgeting, model-driven step execution	Prompting, Prompt templates
4	Tool execution	ADB phone actions, coding tools, memory tools, and user-defined scripts	Action schema, Scripts, Skills
5	Human authorization	Capability probe + escalation for sensitive actions, shared relay hub with per-agent state	Remote human auth
6	Device target abstraction	One target per agent, mutual exclusion enforced by the manager	Device targets
7	Persistence & audit	Sessions, memory, screenshots, relay state, artifacts — all inside the agent workspace	Filesystem, Session & memory
8	Runtime operations	Manager dashboard, per-agent dashboards, keep-awake heartbeat, troubleshooting playbooks	Runbook, Troubleshooting, Screen awake
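
The rule in row 6, one target per agent with mutual exclusion enforced by the manager, amounts to a small lock table keyed by target. The sketch below illustrates the idea only. `TargetLocks`, `acquire`, and `release` are hypothetical names, and the real manager also handles the registry and port allocation listed in row 1.

```typescript
// Hypothetical lock table; not OpenPocket's manager implementation.
class TargetLocks {
  // target id -> agent currently holding it (null prototype avoids inherited keys)
  private owners: Record<string, string> = Object.create(null);

  // Returns true if `agent` holds `target` after the call.
  // Repeating the call for the same agent and target is a no-op that succeeds.
  acquire(target: string, agent: string): boolean {
    const owner = this.owners[target];
    if (owner !== undefined && owner !== agent) return false; // held by someone else
    // One target per agent: refuse if the agent already holds a different target.
    for (const t of Object.keys(this.owners)) {
      if (this.owners[t] === agent && t !== target) return false;
    }
    this.owners[target] = agent;
    return true;
  }

  // Only the current holder can release a target.
  release(target: string, agent: string): void {
    if (this.owners[target] === agent) delete this.owners[target];
  }
}

const registry = new TargetLocks();
console.log(registry.acquire("R5CX123456A", "review-bot")); // first claim succeeds
console.log(registry.acquire("R5CX123456A", "ops-bot"));    // second agent is refused
```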

Contributing

We'd love your help making OpenPocket better. Whether it's fixing a typo, improving docs, adding a new skill, or building a major feature — every contribution matters.

Getting started

Fork the repo and create a feature branch.
Install dependencies and run the dev checks:

npm install
npm run check
npm run test
npm run smoke:dual-side
npm run docs:build

Keep changes focused — one concern per PR.
Add or update tests for any behavioral change.
Open a PR with context on what changed, why, and how you verified it.

Where to help

Good first issues — look for issues labeled good first issue in the issue tracker.
Skills — write a new SKILL.md to teach the agent a new phone workflow. See the Skills guide.
Model adapters — add or improve support for new LLM providers.
Channel integrations — improve Telegram, Discord, or WhatsApp support, or add new channels.
Docs — fix gaps, add examples, or translate content.

Read the full guidelines before opening a PR: CONTRIBUTING.md | contribution.md

License

This project is licensed under the MIT License — see LICENSE for details.

Acknowledgments

Special thanks to the open-source projects that make OpenPocket possible:

pi-mono ecosystem by Mario Zechner — @mariozechner/pi-agent-core, @mariozechner/pi-ai, @mariozechner/pi-coding-agent
Messaging SDKs — node-telegram-bot-api, discord.js, baileys
Core runtime — openai, @modelcontextprotocol/sdk, zod, @sinclair/typebox, sharp, qrcode
Docs & tooling — vitepress, mermaid, typescript, tsx

And thanks to all contributors and community members who report issues, share real-world scenarios, and help shape the project. We're building this together.