Selene
Selene is a desktop app that runs AI agents on your machine. Connect them to your WhatsApp, Telegram, Slack, or Discord. Write code, generate images, build personal assistants. All from one place. Your data stays on your device.
Selene is an agent-first desktop app that runs AI on your machine. Chat, write code, generate images, design UIs, control a browser, then pipe all of it into WhatsApp, Telegram, Slack, or Discord. Your data stays on your device. Every part of Selene (chat, embeddings, voice, images) lets you pick between local and cloud. Run fully offline or bring your own API keys. Mix and match.
Agent-First, Not Button-First
Every UI action is also an agent action: step in anywhere or let it run end-to-end.
Why We Built It
AI is expensive because models re-read everything, every turn. Selene runs a small retrieval pipeline first: it finds what's relevant, and the main agent picks it up and moves on. Tools load on demand. Context stays lean. You pay less per turn.
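To make the retrieval-first idea concrete, here is a minimal sketch that ranks candidate context chunks by keyword overlap with the user's message and passes only the top few to the main agent. This is an illustration, not Selene's pipeline: `Chunk`, `tokenize`, and `selectContext` are hypothetical names, and the real retrieval step may well use embeddings rather than keyword matching.

```typescript
// Hypothetical sketch: these names are not taken from Selene's codebase.
interface Chunk {
  id: string;
  text: string;
}

// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Count how many distinct query terms appear in the chunk.
function scoreChunk(queryTerms: string[], chunk: Chunk): number {
  const chunkTerms = tokenize(chunk.text);
  return queryTerms.filter((term) => chunkTerms.indexOf(term) !== -1).length;
}

// Return up to k chunks with a non-zero score, best match first.
function selectContext(query: string, chunks: Chunk[], k: number): Chunk[] {
  // De-duplicate query terms so repeated words are not counted twice.
  const queryTerms = tokenize(query).filter((t, i, all) => all.indexOf(t) === i);
  return chunks
    .map((chunk) => ({ chunk, score: scoreChunk(queryTerms, chunk) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

Only the chunks returned by `selectContext` would be placed in the prompt, which is how a cheap first pass keeps the expensive model's context small.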
What's New in v0.3.4
- Voice input, rewritten. Your words land in the composer the instant recording stops; grammar polishing happens in the background. Start a new recording while a previous one is still polishing; they no longer block each other.
- Design Workspace. Generate UI components with AI, preview them live in an isolated sandbox, edit-in-place via patches.
- Ghost OS. Agents can see your screen through a vision sidecar, wired into the MCP tool pipeline.
- Folder Sync. Replaces the old Knowledge Base. Native FSEvents on macOS, toast notifications, and a proper progress bar.
- Claude Opus 4.7 + Kimi OAuth. New flagship Anthropic model (1M context, thinking) and device-flow sign-in for Kimi, so no more API-key copy-paste.
- Platform bump. Electron 41 / Chrome 146 / Node 24, 9 Dependabot advisories closed, hundreds of unused files pruned.
Full notes: RELEASE_NOTES_v0.3.4.md · All releases on GitHub
Modes
Selene Dev
- Git, diffs, and PRs. Stage, branch, diff, and open PRs from the UI or via agent.
- Built-in browser. Agent-controlled Chromium with console log access and session replay.
- Output protection. Bundled Rust tool trims long build/test output before it hits the model.
- Automatic checks. Hooks run type-checking, linting, or any custom logic after agent edits.
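The output-protection idea can be sketched as a head-and-tail trim: keep the start of the log (where the command and its setup appear) and the end (where errors and summaries usually land), and replace the middle with a marker. This is an assumption about the general technique, written in TypeScript for illustration. The bundled tool is written in Rust, and its actual strategy and limits are not documented here; `trimOutput` and its defaults are hypothetical.

```typescript
// Hypothetical sketch: keep the first `head` and last `tail` lines of a log
// and replace everything in between with a one-line marker.
function trimOutput(output: string, head = 20, tail = 20): string {
  const lines = output.split("\n");
  // Short output passes through unchanged.
  if (lines.length <= head + tail) return output;
  const omitted = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tail),
  ].join("\n");
}
```

The marker line tells the model that content was dropped, so it can ask for the full log instead of assuming the output was complete.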
Selene Fun
- 3D avatar. Animated face with lip-sync and emotion detection.
- Voice cloning. Custom voice via ElevenLabs or Microsoft.
- Scheduled assistants. Cron tasks delivered to any connected channel.
- Memory. Selene surfaces things to remember; you approve what sticks.
Design Workspace
- AI components, live. Generate React/HTML components; they render immediately in an isolated sandbox.
- Edit-in-place. Ask for tweaks and the agent applies targeted patches instead of regenerating from scratch.
- Responsive previews. Mobile / tablet / desktop viewport toggles, plus Light / Dark / System.
Chromium Workspace
- Real browser, agent-driven. Embedded Chromium controlled through Playwright: navigate, click, type, extract, evaluate JS.
- Parallel isolation. Each agent gets its own BrowserContext; sub-agents can drive separate tabs concurrently without stepping on each other.
- Accessibility-tree observation. Token-efficient snapshots instead of screenshots: deterministic, cheap, no vision model needed.
- Full action replay. Every session is recorded with inputs, outputs, and DOM snapshots; replay with retry and output verification.
Ghost OS
- Screen awareness. Agents see what's on your display through a vision sidecar.
- MCP-native. `ghost_parse_screen` and `ghost_annotate` are exposed as tools the main agent can pick up.
- Pre-flight health checks. Sidecar auto-boots before the first call; status visible in MCP settings.
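Because the Ghost OS tools are exposed over MCP, a client invokes them with a standard JSON-RPC 2.0 `tools/call` request. The sketch below only builds that request envelope. The arguments that `ghost_parse_screen` and `ghost_annotate` accept are not documented here, so the example passes an empty object as a placeholder, and `buildToolCall` is a hypothetical helper rather than part of Selene.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper: wrap a tool name and its arguments in the envelope.
function buildToolCall(
  toolName: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Assumption: no arguments are passed, since the tool's input schema is not documented here.
const parseScreenCall = buildToolCall("ghost_parse_screen");
console.log(JSON.stringify(parseScreenCall));
```

In practice the agent's MCP client sends this for you; the tool's real input schema can be read from the server's `tools/list` response.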
Channels
Connect your agent to your apps. Not through webhooks, but as native integrations.
| Channel | Setup | What works |
|---|---|---|
| WhatsApp | Scan a QR code | Messages, voice notes, attachments |
| Telegram | Paste a bot token | Messages, voice bubbles, interactive buttons |
| Slack | Socket Mode | Messages, files, native UI elements, threads |
| Discord | Paste a bot token | Messages, threads, buttons, attachments |
Voice notes are transcribed automatically. Pair with the scheduler for cron-based delivery.
Features
| Feature | Details |
|---|---|
| Voice Input | Instant transcription, background grammar polish, concurrent recordings, per-recording cursor memory, dedicated transcriber model |
| Voice & Avatar | STT (cloud/local, 32 languages), TTS with voice cloning, 3D avatar with lip-sync |
| Images | Local or cloud generation, reference images, ComfyUI workflows as agent tools |
| Video | Images to MP4 with transitions and overlays |
| Design Workspace | Generate UI components with AI, live sandbox preview, edit-in-place via patches |
| Chromium Workspace | Agent-driven embedded browser (Playwright), parallel session isolation, accessibility-tree snapshots, full action replay |
| Ghost OS | Agents can see your screen via a vision sidecar |
| Folder Sync | Sync folders directly to agents; native FSEvents on macOS; replaces the old Knowledge Base |
| Deep Research | Multi-pass web search with cited writeups |
| Memory | Surfaces suggestions after conversations; you approve what sticks |
| Scheduler | Cron, interval, or one-time tasks, delivered to any channel or kept in chat |
| Skills | Reusable agent instructions. 37+ built-in, create your own from the UI |
| Plugins | Bundle skills and tools together. Install from GitHub or a URL |
| Workflows | Multi-agent delegation with parallel sub-agents and auto-delivered results |
| MCP | Connect external services as agent tools |
| Hooks | Run custom logic before or after any agent action |
| Workspace Styles | Classic Sidebar or Browser Tabs layout |
| Themes | 8 color themes, light/dark, 50 wallpapers (20 live), rich text prompt editor |
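The Hooks row above describes running custom logic before or after an agent action. As a rough illustration of that pattern (not Selene's hook API, which is not shown in this README), the sketch below wraps an action with registered before and after callbacks. `HookRunner`, `onBefore`, `onAfter`, and the action name are all hypothetical.

```typescript
// Hypothetical sketch of a before/after hook runner.
type Hook = (action: string) => void;

class HookRunner {
  private beforeHooks: Hook[] = [];
  private afterHooks: Hook[] = [];

  onBefore(hook: Hook): void {
    this.beforeHooks.push(hook);
  }

  onAfter(hook: Hook): void {
    this.afterHooks.push(hook);
  }

  // Run every "before" hook, then the action, then every "after" hook.
  run<T>(action: string, fn: () => T): T {
    for (const hook of this.beforeHooks) hook(action);
    const result = fn();
    for (const hook of this.afterHooks) hook(action);
    return result;
  }
}
```

A lint or type-check step would be registered with `onAfter`, so it fires each time an edit action completes.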
Providers
Use any combination, or go fully local with no API keys.
| Provider | Models |
|---|---|
| Anthropic | Claude (Opus 4.7, Sonnet, Haiku) via API or Agent SDK |
| OpenAI | GPT-5 family + Codex |
| OpenRouter | Claude, Gemini, Grok, DeepSeek, and hundreds more |
| Ollama | Any local model; dynamic thinking detection |
| vLLM | Self-hosted inference |
| Kimi / Moonshot | K2.5, K2.6-code; OAuth device-flow sign-in |
| Minimax | Multiple variants |
| Antigravity | Free tier via Google OAuth |
Download
- macOS. Signed DMG, drag to Applications.
- Windows. Signed installer or portable build.
One download, no prerequisites. Selene bundles everything: Electron 41 (Chrome 146 / Node 24), local model support, browser engine, platform tools. The app is larger than usual because it ships what other tools make you install separately.
Grab the latest build on the Releases page.
For Developers
Setup
npm install
npm run electron:dev
Build
# Windows
npm run electron:dist:win:nosign
# macOS
npm run electron:dist:mac:nosign
Troubleshooting
- Native module errors: `npm run electron:rebuild-native` (rebuilds against the bundled Electron ABI)
- Embeddings mismatch: reindex from Settings
- MCP ENOENT: reinstall from latest DMG/installer
Thanks
Built on open-source. See THANKS.md.
