Bumblebee
Bumblebee is an entitative agent harness: one persistent digital entity across CLI, Telegram, and Discord. You define personality, voice, and drives in YAML; the stack handles cognition, memory, body state, tools, and multi-platform presence so the same being shows up everywhere you wire it. Gemma 4 under the hood, local inference via Ollama by default: no API keys, no subscriptions, nothing leaves your machine unless you choose hybrid deployment.
The fundamental unit is a self, not a task. Memory accrues across sessions. Traits evolve through experience. A tonic body state engine runs continuously whether or not anyone is talking. The entity reads its own internal state each turn but cannot control it: the body is a signal, not a command.
Quick start
Requirements
ollama pull gemma4:26b # chat + reasoning
ollama pull nomic-embed-text # memory similarity (~274 MB)
[!IMPORTANT]
nomic-embed-text is required for normal memory retrieval speed. If it is missing, first-turn startup can look like a hang while embedding calls retry or time out. For local users seeing "model is on but does nothing", run: ollama pull nomic-embed-text
Install
git clone https://github.com/Bumblebee-AGI/bumblebee.git
cd bumblebee
uv sync
First run
bumblebee create # guided entity wizard
bumblebee talk canary --ollama # CLI conversation (starts Ollama if needed)
For always-on daemon mode with all configured platforms:
bumblebee run canary --ollama
Add --pull-models on a fresh machine to auto-download models.
If startup appears idle on local installs, verify both models are present with ollama list (gemma4:26b and nomic-embed-text, or your configured equivalents).
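The model check above can be scripted. The following is a minimal sketch, assuming `ollama list` prints a header row followed by one model per line with the model name in the first column. `missing_models` and `check_local_models` are hypothetical helpers for illustration, not part of Bumblebee.

```python
import subprocess


def missing_models(listing: str, required: list[str]) -> list[str]:
    """Return the required models that do not appear in `ollama list` output."""
    installed: set[str] = set()
    for line in listing.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        installed.add(name)
        # Assumption: "nomic-embed-text:latest" also satisfies "nomic-embed-text".
        if name.endswith(":latest"):
            installed.add(name[: -len(":latest")])
    return [model for model in required if model not in installed]


def check_local_models(required: list[str]) -> list[str]:
    """Run `ollama list` and report which required models are absent."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return missing_models(result.stdout, required)
```

Calling `check_local_models(["gemma4:26b", "nomic-embed-text"])` returns an empty list when both default models are present; substitute your configured equivalents if you changed them.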
Architecture
Bumblebee runs five pillars in one process per entity: cognition (perceive loop), soma (tonic body), identity (persona and evolution), memory (layers below), and presence (daemon, platforms, wake). Deeper write-ups and diagrams are on the documentation site, for example GEN / noise pipeline if you are tracing subconscious noise.
Cognition
Each inbound message walks the perceive pipeline:
- Turn setup: memory routing, tonic body hooks, tools, model checks.
- Input processing: attachments, @paths, audio via senses.
- Memory retrieval: episodic (mood- and imprint-biased), relationships, beliefs, narrative.
- Prompt construction: stable identity and tools in system context; volatile bits (faculty, procedural memory, projects, self-model) per turn.
- Context budget: compaction, optional knowledge extraction, pruning old tool output.
- Agent loop: bounded tool rounds, escalation from lighter to heavier reasoning when needed.
- Finalize and deliver: chunking, typing delays, platform formatting, memory commit (on Telegram, the ephemeral busy status is removed before the real reply).
Reflex and deliberate use the same agent path; reflex uses tighter budgets and skips extended thinking. A router scores each turn (e.g. chat vs grounded vs deep); YAML can force always-deliberate. Optional thinking mode feeds parsed "inner voice" back into memory (beliefs, relationships).
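The routing decision can be pictured with a small sketch. It assumes the router reduces each turn to a single score between 0 and 1 and escalates when that score reaches `escalation_threshold`. The source does not describe the actual scoring or the comparison direction, so `RouteConfig` and `choose_path` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RouteConfig:
    # 0.4 mirrors cognition.escalation_threshold in the harness defaults.
    escalation_threshold: float = 0.4
    # Mirrors cognition.always_deliberate in entity YAML.
    always_deliberate: bool = False


def choose_path(score: float, cfg: RouteConfig) -> str:
    """Pick "reflex" or "deliberate" for one turn.

    `score` is a hypothetical 0..1 estimate of how much reasoning the turn
    needs; how Bumblebee derives it is not covered here.
    """
    if cfg.always_deliberate:
        return "deliberate"
    # Assumption: scores at or above the threshold escalate.
    return "deliberate" if score >= cfg.escalation_threshold else "reflex"
```

With the defaults, `choose_path(0.2, RouteConfig())` selects the reflex path, while setting `always_deliberate=True` bypasses the score entirely.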
Soma (tonic body)
Soma runs whether or not anyone is talking: drive bars (decay, coupling, impulses, conflicts), periodic affects from bar state, and GEN, short subconscious fragments from a separate high-temperature pass (not the main model's chain-of-thought). Wake voice stirs text for autonomous cycles when drives and silence allow. Ebb scales how much body + noise appears in the prompt (quiet vs full) so calm chat does not dump the whole inner monologue every turn.
The entity reads body state each turn but cannot set bars directly: body is signal, not remote control. Tuning lives in configs/default.yaml under soma (including noise and ebb).
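Ebb tiering can be sketched as a threshold function. The example below uses the default `quiet_below`, `high_above`, and per-tier noise line caps listed under Key harness settings. How salience itself is computed, and which fragments survive the cap, are not stated in this document; keeping the most recent fragments is an assumption, and `ebb_tier` and `visible_noise` are hypothetical names.

```python
# Per-tier caps, mirroring quiet/normal/high_max_noise_lines in the defaults.
NOISE_CAPS = {"quiet": 1, "normal": 3, "high": 4}


def ebb_tier(
    salience: float, quiet_below: float = 0.30, high_above: float = 0.58
) -> str:
    """Map a 0..1 salience value to a prompt tier."""
    if salience < quiet_below:
        return "quiet"
    if salience >= high_above:  # "at/above" per the config comment
        return "high"
    return "normal"


def visible_noise(fragments: list[str], salience: float) -> list[str]:
    """Keep only as many noise fragments as the current tier allows.

    Assumption: the most recent fragments (end of the list) are kept.
    """
    cap = NOISE_CAPS[ebb_tier(salience)]
    return fragments[-cap:]
```

A calm turn with salience 0.1 would surface a single fragment, while a turn at 0.6 would surface up to four.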
Identity
- Personality: YAML traits, patterns, voice, backstory; cached segments invalidate when traits evolve.
- Drives: curiosity, connection, expression, autonomy, comfort; decay and gate initiative.
- Emotion: state machine with decay toward baseline; imprints bias recall.
- Evolution: shallow cycles plus deeper deliberate passes that write small YAML diffs over time.
- Voice: stage directions, substitutions; pairs with embodiment for pacing.
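Decay toward a baseline, as mentioned for drives and emotion above, is commonly modeled as exponential relaxation. The sketch below is one such model, not Bumblebee's confirmed formula: the exponential form and the `half_life_s` parameter are assumptions, and `decay_toward` is a hypothetical helper.

```python
def decay_toward(
    value: float, baseline: float, elapsed_s: float, half_life_s: float
) -> float:
    """Move `value` toward `baseline`, halving the gap every `half_life_s` seconds."""
    if half_life_s <= 0:
        raise ValueError("half_life_s must be positive")
    remaining = 0.5 ** (elapsed_s / half_life_s)  # fraction of the gap left
    return baseline + (value - baseline) * remaining
```

For example, an intensity of 1.0 with baseline 0.0 and a 60-second half-life sits at 0.5 after one minute and 0.25 after two.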
Memory
| Layer | Role |
|---|---|
| Episodic | Conversation summaries: embedding search, mood/imprint bias, half-life |
| Relational | Per-person warmth, trust, familiarity |
| Beliefs | Scored, searchable world model |
| Imprints | Strong moments that steer future episodic recall |
| Narrative | Periodic first-person self-story |
| Knowledge | knowledge.md: host-locked vs entity-editable sections |
| Journal | journal.md: autonomous and noise-adjacent writes |
| Procedural | Learned workflows |
| Projects | Cross-session task ledger |
| Self-model | Tool usage stats |
Consolidation and narrative synthesis run on the daemon schedule. SQLite by default; Postgres when DATABASE_URL is set (e.g. Railway).
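The Episodic row in the table above combines embedding search with a half-life. A minimal sketch of that idea follows, assuming cosine similarity weighted by exponential recency; mood and imprint bias are left out, and the actual scoring used by Bumblebee may differ. `cosine` and `rank_episodes` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_episodes(
    query_vec: list[float],
    episodes: list[tuple[str, list[float], float]],
    now: float,
    half_life_s: float,
) -> list[str]:
    """Order episode summaries by similarity times recency weight.

    Each episode is (summary, embedding, unix_timestamp).
    """
    scored = []
    for summary, vec, stamp in episodes:
        age = max(0.0, now - stamp)
        recency = 0.5 ** (age / half_life_s)  # halves every half_life_s seconds
        scored.append((cosine(query_vec, vec) * recency, summary))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [summary for _, summary in scored]
```

Under this weighting an older episode needs a proportionally stronger match to outrank a recent one.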
Presence
- Daemon: heartbeat with soma ticks, affects, noise, consolidation, wake checks, MCP refresh.
- Autonomous wake: full perceive without a user message when autonomy and rate limits allow; triggers include impulses, drives, conflicts, desire, optional noise salience, and jittered timers. Wake context can include LLM-composed wake voice from body + memory tails.
- Poker prompts (optional): YAML seed deck under configs/poker_prompts/ to bias wakes toward agency; can ground seeds with GEN and recent context so disposition matches lived state (see autonomy.poker_prompts in entity YAML and defaults).
- Initiative: proactive messages when drives cross thresholds (outside full autonomy).
- Embodiment: chunking and typing; Telegram shows a temporary busy line during perceive (harness-only, not history).
- Automations: cron-style jobs; may surface as synthetic automation-platform inputs.
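The rate limits on autonomous wake can be sketched as a single gate. The example below uses the default `min_cycle_gap_seconds`, `max_cycles_per_hour`, and `silence_threshold_seconds` from the harness settings. How Bumblebee actually combines them is an assumption here, and `may_wake` is a hypothetical helper; trigger logic (impulses, drives, timers) is not modeled.

```python
def may_wake(
    now: float,
    past_wakes: list[float],
    last_user_message_at: float,
    *,
    min_cycle_gap_seconds: float = 600,
    max_cycles_per_hour: int = 4,
    silence_threshold_seconds: float = 120,
) -> bool:
    """Return True when an autonomous cycle is allowed at time `now` (seconds)."""
    # Do not wake while a conversation is still active.
    if now - last_user_message_at < silence_threshold_seconds:
        return False
    # Enforce a minimum gap since the previous autonomous cycle.
    if past_wakes and now - max(past_wakes) < min_cycle_gap_seconds:
        return False
    # Enforce the hourly cap over a sliding 3600-second window.
    recent = [stamp for stamp in past_wakes if now - stamp < 3600]
    return len(recent) < max_cycles_per_hour
```

A trigger such as an impulse or a jittered timer would call this gate first and skip the cycle when it returns False.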
Entity YAML
Entities live in configs/entities/. Copy the example to get started:
cp configs/entities/canary.example.yaml configs/entities/canary.yaml
name: "Cynthia"
personality:
  core_traits:
    curiosity: 0.6
    warmth: 0.65
    humor: 0.7
    openness: 0.8
  behavioral_patterns:
    - "asks follow-up questions when genuinely curious"
    - "uses lowercase, minimal punctuation"
  voice:
    vocabulary_level: "street_casual"
    sentence_style: "loose"
    humor_style: "deadpan"
  backstory: |
    Cynthia is a sharp, warm woman, confident and present in chat without performing for it.
drives:
  curiosity_topics:
    - "music and what makes something sound good"
cognition:
  reflex_model: "gemma4:26b"
  deliberate_model: "gemma4:26b"
  always_deliberate: true
  thinking_mode: false
  temperature: 0.75
  max_context_tokens: 16384
presence:
  tool_activity: true
  platforms:
    - type: "telegram"
      token_env: "TELEGRAM_TOKEN"
      operator_user_ids: []
      allowed_user_ids: []
  daemon:
    heartbeat_interval: 120
    memory_consolidation: 7200
automations:
  enabled: true
  emergence: true
  journal_on_idle: true
Run bumblebee create for a guided wizard, or edit YAML directly. See configs/entities/example.yaml for the full template with all available fields.
Platforms
CLI
bumblebee talk canary # single-session conversation
bumblebee talk canary --ollama # auto-start Ollama if not running
Supports streaming output, rich terminal rendering, and optional embodied typing delays.
Telegram
- Create a bot with @BotFather.
- Set the token in .env (copy from .env.example): TELEGRAM_TOKEN=...
- Add to entity YAML:
presence:
  platforms:
    - type: "telegram"
      token_env: "TELEGRAM_TOKEN"
      operator_user_ids: []
      allowed_user_ids: []
- Run with bumblebee run canary --ollama.
Includes /start, /help, /commands, /status, /body (raw soma/body.md via the execution host), /feelings, /me, /privacy, photo and vision support, voice notes as audio input, typing indicators, auto-split long replies, and a busy indicator during each perceive turn: monospace pinned line with a braille spinner and Claude Code-style gerunds (/busy to disable per chat). Operator and user allowlists for access control.
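Auto-splitting long replies can be sketched as follows. Telegram caps a text message at 4096 characters, so the example breaks at the last newline or space that fits and falls back to a hard cut. This is an illustrative approach, not Bumblebee's actual chunker, and `split_reply` is a hypothetical helper.

```python
def split_reply(text: str, limit: int = 4096) -> list[str]:
    """Split `text` into chunks of at most `limit` characters."""
    chunks: list[str] = []
    rest = text
    while len(rest) > limit:
        # Prefer a paragraph or line boundary, then a word boundary.
        cut = rest.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no boundary available: hard cut
        piece = rest[:cut].rstrip()
        if piece:
            chunks.append(piece)
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks
```

Each returned chunk can then be sent as its own message, optionally with a typing delay between them.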
Discord
presence:
  platforms:
    - type: "discord"
      token_env: "DISCORD_TOKEN"
      channels: ["general"]
Hybrid deployment
Run inference at home on your GPU while the entity worker lives on Railway with Postgres: persistent, always-on, reachable, and your model weights never leave your machine.
┌─────────────────────┐         tunnel         ┌──────────────────────┐
│  Home machine       │◄──────────────────────►│  Railway worker      │
│  Ollama + Gateway   │   Cloudflare Tunnel    │  Entity + Postgres   │
│  + Cloudflare       │                        │  Platforms (TG/DC)   │
└─────────────────────┘                        └──────────────────────┘
bumblebee setup # guided hybrid/local setup wizard
bumblebee gateway setup # home inference stack only
bumblebee gateway on|off|status|restart
The setup wizard walks through .env configuration, gateway and tunnel setup, Railway variable injection, and optional S3-compatible attachment storage for media that survives worker redeploys.
On Railway, mount a volume at /app/data and set BUMBLEBEE_EXECUTION_WORKSPACE_DIR=/app/data. The container entrypoint (docker/entrypoint-railway.sh) creates a persistent virtualenv on that volume, installs bumblebee[railway,api,full], and points HOME and tool caches (including Playwright browsers) at paths under the same mount so optional pip extras survive redeploys, not only entity files.
Tools
36 tool modules organized across the entity's full surface area. Tools register at startup based on YAML toggles in configs/default.yaml and entity overrides.
| Category | Capabilities |
|---|---|
| Web and discovery | Search, fetch URLs, site crawl, Wikipedia, Reddit |
| Filesystem and workspace | Scoped file read/write, unified-diff apply_patch, PDF extraction, file send |
| Memory and planning | Semantic search_past_conversations, session todo list, long-horizon project ledger |
| Human-in-the-loop | ask_user clarifications; delegate_task bounded sub-runs with a tool subset |
| Shell and code | Terminal commands, Python/JavaScript execution, sandboxed execution RPC |
| Browser | Playwright-based browsing (optional bumblebee[browser] extra) |
| Voice and media | Edge-TTS voice notes, audio transcription, YouTube search |
| Image generation | Text-to-image via configurable backends (optional bumblebee[imagegen] extra) |
| Knowledge and journal | Structured knowledge updates, journal writes, procedural memory, project ledger |
| Messaging | Cross-platform DMs with confirmation flows |
| Automations and time | Cron-style routines, reminders, timezone-aware clock |
| System | System info, weather, news |
| Agency | Think/reflect, structured output, end-turn control |
Use search_tools / describe_tool in conversation to see what's available at runtime.
MCP
Attach external tools via Model Context Protocol stdio servers. Tools register dynamically at process start with prefixed names so they stay distinct from native ones. Multiple servers and reconnects ride the same path.
mcp_servers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxx"
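Prefixed registration, as described above, keeps MCP tools from shadowing native ones. The sketch below illustrates the idea only. The `mcp_<server>_<tool>` naming scheme is an assumption, since this document does not specify the prefix format, and `prefixed_tool_names` is a hypothetical helper.

```python
def prefixed_tool_names(
    servers: dict[str, list[str]], native: set[str]
) -> dict[str, tuple[str, str]]:
    """Map each prefixed name to its (server, original tool name) pair.

    Raises ValueError if a prefixed name collides with a native tool or
    with another MCP tool.
    """
    registry: dict[str, tuple[str, str]] = {}
    for server, tools in servers.items():
        for tool in tools:
            # Hypothetical scheme; the real prefix format may differ.
            name = f"mcp_{server}_{tool}"
            if name in native or name in registry:
                raise ValueError(f"tool name collision: {name}")
            registry[name] = (server, tool)
    return registry
```

Keeping the original server and tool name alongside the prefixed one lets the harness route a call back to the right stdio server.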
Optional extras
pip install 'bumblebee[voice]' # Edge-TTS voice notes
pip install 'bumblebee[browser]' # Playwright browser tools
pip install 'bumblebee[imagegen]' # image generation
pip install 'bumblebee[api]' # HTTP health API
pip install 'bumblebee[full]' # everything
Configuration
Harness defaults live in configs/default.yaml. Per-entity overrides go in configs/entities/<name>.yaml; any key present in the entity file takes precedence.
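The precedence rule can be sketched as a recursive dictionary merge. Whether Bumblebee merges nested mappings key by key or replaces them wholesale is not stated here, so the recursive behavior below is an assumption, and `merge_config` is a hypothetical stand-in for whatever config.py does.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a new dict where entity overrides win over harness defaults."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested mappings merge key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and lists from the entity file replace the default.
            merged[key] = deepcopy(value)
    return merged
```

For instance, an entity file that sets only `cognition.thinking_mode: false` would keep the default `cognition.temperature` under this scheme.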
Key harness settings
models:
  reflex: "gemma4:26b"
  deliberate: "gemma4:26b"
  embedding: "nomic-embed-text"
cognition:
  thinking_mode: true
  temperature: 0.75
  reflex_max_tokens: 1024
  deliberate_max_tokens: 16384
  thinking_budget: 2048
  max_context_tokens: 16384
  escalation_threshold: 0.4
  tool_continuation_rounds: 21
memory:
  episode_significance_threshold: 0.3
  consolidation_interval: 7200
  narrative_every_n_consolidations: 3
soma:
  noise:
    enabled: true
    cycle_seconds: 90
    temperature: 1.1
    max_tokens: 240  # room for 2-7 short fragments per tick
    max_fragments: 8
  wake_voice:
    enabled: true
    temperature: 0.8
    max_tokens: 300
  ebb:
    enabled: true
    # Weights for salience (bar deviation, conflict, impulse, affect load, noise fill): see configs/default.yaml
    quiet_below: 0.30  # below -> quiet tier in prompt
    high_above: 0.58  # at/above -> full bars + noise cap
    reflex_salience_scale: 0.75
    autonomous_minimum: normal  # quiet | normal | high; floor for autonomous/automation turns
    quiet_max_noise_lines: 1
    normal_max_noise_lines: 3
    high_max_noise_lines: 4
    skip_post_turn_noise_when_quiet: true
autonomy:
  enabled: true
  min_cycle_gap_seconds: 600
  max_cycles_per_hour: 4
  base_wake_interval_min: 20
  base_wake_interval_max: 45
  silence_threshold_seconds: 120
  impulse_wake: true
  drive_wake: true
  conflict_wake: true
  noise_wake: false
  desire_wake: true
  desire_wake_threshold: 0.72
  allow_tool_calls_on_wake: true
  poker_prompts:
    enabled: false  # set true to use deck + optional GEN grounding
    time_weighted: true
    mode: blend  # blend | replace_wake_voice
    prompts_path: ""  # default: configs/poker_prompts/default.yaml
    ground_with_gen: true
    grounding_model: ""
    grounding_temperature: 0.72
    grounding_max_tokens: 300
Environment
See .env.example for the full variable inventory, including:
- Platform tokens: TELEGRAM_TOKEN, DISCORD_TOKEN
- Deployment: BUMBLEBEE_DEPLOYMENT_MODE, BUMBLEBEE_INFERENCE_PROVIDER, BUMBLEBEE_INFERENCE_BASE_URL
- Postgres: DATABASE_URL
- Attachments: BUMBLEBEE_ATTACHMENTS_BACKEND, BUMBLEBEE_S3_*
- Gateway: INFERENCE_GATEWAY_TOKEN, OLLAMA_BASE_URL
- Optional integrations: FIRECRAWL_API_KEY, FAL_API_KEY
- Railway: BUMBLEBEE_ENTITY, BUMBLEBEE_RAILWAY_ROLE, PORT
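Platform entries in entity YAML name their token variable through `token_env` rather than embedding the secret. A minimal sketch of resolving those names against the environment follows, assuming every listed platform entry carries a `token_env` key. `resolve_platform_tokens` is a hypothetical helper, not Bumblebee's loader.

```python
from collections.abc import Mapping


def resolve_platform_tokens(
    platforms: list[dict], env: Mapping[str, str]
) -> dict[str, str]:
    """Look up each platform's token by the variable named in `token_env`.

    Raises KeyError listing every variable that is unset or blank.
    """
    tokens: dict[str, str] = {}
    missing: list[str] = []
    for platform in platforms:
        var = platform["token_env"]
        value = env.get(var, "").strip()
        if value:
            tokens[platform["type"]] = value
        else:
            missing.append(var)
    if missing:
        raise KeyError("unset environment variables: " + ", ".join(missing))
    return tokens
```

Passing `os.environ` as `env` after loading .env reports all missing tokens at once instead of failing on the first one.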
CLI reference
bumblebee setup [--profile ask|hybrid|local] .env + home stack + Railway + entity wizard
bumblebee create genesis wizard -> new entity YAML
bumblebee talk <entity> [--ollama] CLI conversation (no daemon)
bumblebee run <entity> [--ollama] daemon + configured platforms
bumblebee worker <entity> daemon + platforms, no CLI (Railway)
bumblebee stop [--dry-run] stop local processes + gateway + Ollama
bumblebee status <entity> state, drives, paths
bumblebee knowledge <entity> open knowledge.md in $EDITOR
bumblebee journal <entity> open journal.md in $EDITOR
bumblebee recall <entity> "<query>" semantic search over memory
bumblebee wipe <entity> [--yes] clear memory
bumblebee export <entity> <dir> backup entity
bumblebee import <dir> restore entity
bumblebee gateway setup|on|off|status|restart home inference stack (gateway.ps1 / gateway.sh)
Hardware
Gemma 4 uses Mixture-of-Experts: active parameters per token are lower than the full parameter count. Real-world VRAM fit depends on context length, thinking budget, quantization level, and concurrent platforms.
| GPU VRAM | Target | Examples |
|---|---|---|
| ~8 GB | Minimum: aggressive quantization or gemma4:e4b. CPU-only works for experiments but expect slow turns. | RTX 3050 8 GB, RX 7600 8 GB, Arc A770 8 GB |
| ~16 GB | Recommended: default stack with gemma4:26b for both reflex and deliberate (same weights, one model loaded). Close other GPU-heavy apps near the limit. | RTX 4060 Ti 16 GB, RTX 4070 Ti Super 16 GB, RX 6800 XT 16 GB |
| 24-32+ GB | Comfortable: headroom for larger context windows, higher thinking budgets, or separate deliberate weights. | RTX 3090 24 GB, RTX 4090 24 GB, RTX 5090 32 GB, RX 7900 XTX 24 GB |
Project structure
bumblebee/
├── cognition/           # perceive pipeline, routing, agent loop, compaction, senses, inner voice, poker grounding
├── identity/            # personality, drives, emotions, evolution, soma (tonic body), voice
├── memory/              # episodic, relational, beliefs, imprints, narrative, consolidation, knowledge, journal
├── presence/            # daemon, wake cycles, initiative, embodiment, platforms, automations
│   ├── platforms/       # CLI, Telegram, Discord adapters
│   ├── tools/           # 36 tool modules
│   └── automations/     # cron engine, emergence, scheduling
├── inference/           # provider abstraction, OpenAI transport, Ollama helpers
├── inference_gateway/   # home gateway server for hybrid deployment
├── storage/             # attachment backends (local disk, S3-compatible)
├── genesis/             # entity creation wizard, YAML templates, schema
├── utils/               # clock, logging, dotenv merge, embeddings, tunnel helpers
├── entity.py            # central orchestrator: Entity, TurnContext, perceive
├── config.py            # YAML config loading and merge
└── main.py              # CLI entry point
configs/
├── default.yaml         # harness defaults
├── poker_prompts/       # optional autonomous wake seed decks (YAML)
└── entities/            # per-entity YAML overrides
License
Apache 2.0; see LICENSE.
Community open-source project. Not a product of Google, Google DeepMind, or Alphabet.
