Devboy Tools
Open Source MCP server for AI coding agents: GitLab, GitHub, ClickUp and Jira
DevBoy tools
A research-driven tool bundle for AI coding agents. A single curated set of dev-workflow tools (GitHub, GitLab, Jira, ClickUp, Confluence, Slack, Fireflies) reachable from any agent (Claude Code, Copilot CLI, Codex, Cursor, Kimi, Gemini, …) through three transports: MCP server, CLI, or installable agent skills. Output goes through a token-aware pipeline that compresses responses by 26–69% per call on the data-shape-friendly endpoints it targets (issues, pipelines, large lists); see the paper 2 measurements on a 144k-event production corpus.
npm install -g @devboy-tools/cli # binary for your platform
devboy onboard # detects your AI agent, installs the right skills
That's it. Verify with devboy doctor.
Why DevBoy
DevBoy isn't another aggregator with a long tool list. Four things that aren't standard elsewhere:
- Research-driven. Every optimisation in the pipeline traces back to a paper grounded in a real corpus: 523 Claude Code sessions, 10,644 MCP tool responses. We don't ship a heuristic without measuring it. See research index.
- Three transports, one bundle. The exact same tool set is reachable as MCP server, CLI for humans / CI, or as agent skills that call individual tools. Pick whichever fits today; layer the rest later.
- Privacy by default. Tokens live in OS keychain (macOS Keychain / Windows Credential Manager / Linux Secret Service) with env-var fallback for CI. No cloud round-trip just to authenticate.
- Multi-project context. One server, many project contexts (different GitHub/GitLab/Jira combinations), instant switch: no respawn, no config edit. Concrete: run devboy context use dashboard, and the same MCP session now talks to a different project's APIs.
| | Generic MCP aggregator | DevBoy |
|---|---|---|
| Output efficiency | Default API JSON | Knapsack-based pagination + format-adaptive encoding (papers 1, 2). Measured per-call savings on the 144k-event corpus: 69% avg on get_issues, 92% top per call, 26% avg on *_pipeline. KV-cache pass on Sonnet 4.5 lifts ~40% of tokens off the input side. |
| Tool catalogue | Static | Dynamic per-project, with provider enrichers (custom field params, enum hints) |
| Transport | MCP only | MCP, CLI, or agent skills (same tools) |
| Onboarding | Manual config + per-agent install | devboy onboard autodetects and bundles |
| Credentials | Cloud / config files | OS keychain, env vars only as fallback |
| Extensibility | Forking | Plugin system (Rust today; WASM, TypeScript planned) |
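To make the "format-adaptive encoding" row more concrete, here is a minimal sketch of the underlying idea: render the same records in several text formats and keep the cheapest one. It is not DevBoy's implementation. The function names are hypothetical, character count stands in for a real token count, and the choice is made once for a flat list rather than per subtree with a multi-choice knapsack as paper 2 describes.

```python
import csv
import io
import json


def encode_json(rows):
    # Compact JSON: the baseline most APIs return.
    return json.dumps(rows, separators=(",", ":"))


def encode_csv(rows):
    # One header line, then one line per record (needs identical keys in every row).
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def encode_key_value(rows):
    # "key: value" lines, records separated by a blank line.
    return "\n\n".join("\n".join(f"{k}: {v}" for k, v in row.items()) for row in rows)


def pick_encoding(rows):
    """Return (format name, text) for the shortest candidate encoding.

    Assumption: len(text) is used as a cheap proxy for token count.
    """
    candidates = {"json": encode_json(rows), "key_value": encode_key_value(rows)}
    same_keys = all(row.keys() == rows[0].keys() for row in rows)
    if rows and same_keys:
        candidates["csv"] = encode_csv(rows)
    name = min(candidates, key=lambda n: len(candidates[n]))
    return name, candidates[name]


issues = [
    {"id": 1, "title": "Fix login", "state": "open"},
    {"id": 2, "title": "Update docs", "state": "closed"},
]
chosen_format, encoded = pick_encoding(issues)
```

For uniform records like the two issues above, the CSV form wins because the keys are written once instead of once per record, which is the kind of shape the table calls "data-shape-friendly".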
Quick start (60 seconds)
# 1. Install
npm install -g @devboy-tools/cli
# 2. Bootstrap: picks your agent + installs a curated skill bundle
devboy onboard
# 3. Configure your first provider (interactive)
devboy init
# 4. Verify
devboy doctor
After this, devboy issues returns your open tickets, your agent has the relevant skills loaded, and the MCP server is registered with whatever client you use.
Install via plugin (Claude Code / Codex CLI)
If you live inside Claude Code or Codex CLI, skip the npm step entirely.
Claude Code:
/plugin marketplace add meteora-pro/devboy-tools
/plugin install devboy@meteora-devboy
Codex CLI reads the same .claude-plugin/marketplace.json (one of the four official marketplace sources), so the install is symmetric:
codex plugin marketplace add meteora-pro/devboy-tools
codex plugin install devboy@meteora-devboy
Either way, the bundled setup skill installs the devboy CLI on first use (npm install -g with a SHA-256-verified GitHub Release tarball as fallback), wires up the MCP server, and runs devboy onboard. After the binary lands, run /reload-plugins (Claude Code) or restart your Codex session once.
OpenCode and Kimi CLI users get the same skills for free: both auto-read ~/.claude/skills/, so installing the Claude Code plugin or running devboy onboard covers them too. See the per-agent guides (Claude Code, Codex) and ADR-018 for the architecture.
If you'd rather pick everything by hand:
Manual install / configuration
# Configure GitHub (replace gitlab/clickup/jira similarly)
devboy config set github.owner meteora-pro
devboy config set github.repo devboy-tools
devboy config set-secret github.token <token> # stored in the OS keychain
# Or via env vars (CI / Docker, where the keychain is unavailable)
export DEVBOY_GITHUB_TOKEN=ghp_...
# Compatibility: GITHUB_TOKEN is read too
# Pick skills explicitly instead of using a profile
devboy skills list
devboy skills install review-mr --agent claude
devboy skills install --all --agent all
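The credential lookup described above (OS keychain first, environment variables as fallback) can be sketched as follows. This is an illustration under assumptions, not the devboy-storage code: the keychain is modelled as a plain dict, the function name is hypothetical, and the precedence of DEVBOY_GITHUB_TOKEN over GITHUB_TOKEN is assumed rather than documented.

```python
import os


def resolve_github_token(keychain, env=None):
    """Return (token, source), preferring the keychain over env vars.

    keychain: dict standing in for the OS keychain (hypothetical stand-in).
    env:      mapping of environment variables; defaults to os.environ.
    """
    env = os.environ if env is None else env

    # 1. Keychain entry, keyed like the `github.token` config secret.
    token = keychain.get("github.token")
    if token:
        return token, "keychain"

    # 2. Env-var fallback; the order of the two names is an assumption.
    for name in ("DEVBOY_GITHUB_TOKEN", "GITHUB_TOKEN"):
        if env.get(name):
            return env[name], name

    return None, None
```

Returning the source alongside the token makes it easy for a diagnostic command to report where a credential came from without printing the secret itself.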
Build from source:
git clone https://github.com/meteora-pro/devboy-tools.git
cd devboy-tools && cargo build --release
./target/release/devboy --version
Skills & onboarding
DevBoy ships a catalogue of skills: one-page Markdown recipes that tell an AI agent how to use the bundle to accomplish a common task. Skills are CLI-first (devboy tools call <name> under the hood), agent-agnostic (Claude Code / Codex / Cursor / Kimi or a vendor-neutral path), and versioned with the binary.
devboy onboard is the fastest path: it scans ~/.claude/, ~/.copilot/, ~/.codex/, ~/.kimi/, Cursor's storage, ~/.gemini/, and ~/.gemini/antigravity/, scores each agent on freshness × volume (recency wins ties), and installs a profile-specific bundle.
devboy onboard # auto-detect + install `dev` bundle
devboy onboard --profile pm # PM bundle (issues + meetings + chat)
devboy onboard --profile oncall # diagnostics + notifications
devboy onboard --agent kimi --yes # explicit agent + non-interactive
devboy agents list # show all detected agents with score
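The "freshness × volume, recency wins ties" ranking might look roughly like the sketch below. Only the shape of the rule comes from the text above. The exponential decay, the 14-day half-life, and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentUsage:
    name: str
    sessions: int           # volume: number of sessions found on disk
    days_since_used: float  # recency: days since the most recent session


def freshness(days_since_used, half_life_days=14.0):
    # Hypothetical decay curve: 1.0 when used today, halving every half_life_days.
    return 0.5 ** (days_since_used / half_life_days)


def rank_agents(agents):
    """Sort by freshness x volume (descending); more recent use wins ties."""
    return sorted(
        agents,
        key=lambda a: (-freshness(a.days_since_used) * a.sessions, a.days_since_used),
    )


detected = [
    AgentUsage("codex", sessions=300, days_since_used=60),
    AgentUsage("claude", sessions=120, days_since_used=1),
    AgentUsage("kimi", sessions=5, days_since_used=0),
]
ranking = [agent.name for agent in rank_agents(detected)]
```

With these made-up numbers, the agent used yesterday outranks the one with more sessions that has been idle for two months, which is the behaviour the multiplication is meant to produce.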
Three profiles ship today; categories below cover the full catalogue.
| Category | Skills |
|---|---|
| self-bootstrap | setup, repair, tools-catalog, pipeline-tune |
| issue-tracking | get-issues, create-issue, update-issue, link-issues, solve-issue |
| code-review | review-mr, fix-review-comments, self-review |
| self-feedback | run-and-verify, daily-report, retro, knowledge-extract, qa-sweep, analyze-usage |
| meeting-notes | meeting-search, meeting-transcript, meeting-to-tasks |
| messenger | chat-search, chat-summary, notify |
Skill installs keep a per-location manifest with SHA-256s so upgrades leave user-modified files alone (ADR-014). Self-feedback skills read session traces from .devboy/sessions/ (ADR-015).
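A minimal sketch of how a SHA-256 manifest can protect user edits during an upgrade, assuming the manifest maps each installed path to the hash recorded at install time. The function names and the three action labels are hypothetical, and files are passed as in-memory bytes to keep the example self-contained. ADR-014 remains the reference for the actual behaviour.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def plan_upgrade(on_disk, manifest, incoming):
    """Decide, per file, what an upgrade may do.

    on_disk:  {path: bytes currently installed}
    manifest: {path: sha256 recorded when the file was installed}
    incoming: {path: bytes shipped by the new version}
    """
    plan = {}
    for path in incoming:
        if path not in on_disk:
            plan[path] = "install"    # new file, nothing to protect
        elif manifest.get(path) == sha256_hex(on_disk[path]):
            plan[path] = "overwrite"  # unchanged since install, safe to replace
        else:
            plan[path] = "keep"       # edited by the user (or untracked): leave alone
    return plan
```

The key point is that the comparison is between the file on disk and the hash recorded at install time, not between the old and new shipped versions, so a local edit is detected even when the upstream file did not change.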
analyze-usage is a featured skill that ships in two parts: a thin baseline (one Markdown file, embedded in the binary) plus a heavier Python backend (~1 MB, sparse-checked-out via curl on first use). It produces graphic monthly / weekly digests of how your AI sessions actually went (biome aquariums, 8-archetype bars, DORA radar, friction markers) plus shareable anonymised parquet bundles. See ./.claude/skills/analyze-usage/.
Three integration modes
The same tool set, three transports: pick what your workflow already uses.
| Mode | When to use | Example |
|---|---|---|
| MCP server | Claude Desktop, Claude Code, any MCP-compatible client | devboy mcp (stdio) |
| CLI | Humans at the terminal, CI jobs, shell scripts | devboy issues, devboy mrs, devboy tools call get_issues '{"limit": 20}' |
| Agent skills | Agents that don't want the full MCP tool-list tax: call only the tools a skill needs | devboy tools call get_issues from inside a skill script |
JSON arguments tip.
devboy tools call <name> takes an optional positional JSON string (defaults to {}). POSIX shells: wrap in single quotes. Windows cmd.exe / PowerShell: escape inner quotes, e.g. devboy tools call get_issues "{\"limit\": 20}".
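When the call is made from a script rather than typed by hand, shell quoting can be sidestepped entirely by passing the JSON as a single argv element. The sketch below only builds the argument list for the devboy tools call <name> [json] form shown above. The helper name is hypothetical and nothing is executed.

```python
import json
import shlex


def tool_call_argv(tool_name, arguments=None):
    """Build the argv for `devboy tools call <name> [json]`."""
    argv = ["devboy", "tools", "call", tool_name]
    if arguments:
        # The JSON argument is optional; omitting it is equivalent to {}.
        argv.append(json.dumps(arguments))
    return argv


argv = tool_call_argv("get_issues", {"limit": 20})
# POSIX-quoted form, handy for logging or copy-pasting into a shell.
print(shlex.join(argv))  # prints: devboy tools call get_issues '{"limit": 20}'
```

Passing the list to subprocess.run(argv, check=True) hands each element to the process unchanged, so the same script works on POSIX shells and on Windows without any escaping of inner quotes.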
Claude Code
The fastest way to get started is devboy onboard, which auto-detects which AI agent you actively use (by scanning ~/.claude/, ~/.copilot/, ~/.codex/, ~/.kimi/, Cursor's storage, ~/.gemini/, ~/.gemini/antigravity/) and installs a curated skill bundle for that agent:
devboy onboard # detect primary agent, install the `dev` bundle
devboy onboard --profile pm # PM bundle (issue tracking + meetings + messenger)
devboy onboard --profile oncall # on-call bundle (diagnostics + notifications)
devboy onboard --agent kimi --yes # explicit agent + non-interactive (CI / dotfiles)
devboy agents list # show all detected agents with sessions / last-used / score
If you'd rather register the MCP server by hand:
claude mcp add devboy -- devboy mcp
claude mcp list
Claude Desktop
~/Library/Application Support/Claude/claude_desktop_config.json:
{ "mcpServers": { "devboy": { "command": "devboy", "args": ["mcp"] } } }
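If the config file already lists other servers, the devboy entry has to be merged in rather than pasted over the whole file. A small sketch of that merge, operating on the file's text so it can be tried without touching a real config; the function name is hypothetical, and reading and writing the file is left to the caller.

```python
import json


def add_devboy_server(config_text):
    """Return config JSON text with a `devboy` entry added under mcpServers.

    Existing servers and unrelated top-level keys are preserved.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["devboy"] = {"command": "devboy", "args": ["mcp"]}
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "other-server", "args": []}}}'
updated = add_devboy_server(existing)
```

An empty or missing file is treated as an empty config, so the same helper covers a first-time setup.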
For Codex / Cursor / Kimi / Copilot CLI / Gemini CLI / Antigravity, devboy onboard autoconfigures the MCP entry; alternatively, follow the agent's docs for adding a stdio MCP server pointing at devboy mcp.
Providers
Seven provider plugins ship today, each with a dedicated client + schema enricher so the tool list adapts to your project's actual fields (custom fields, enum values, status taxonomies):
| Provider | Crate | What you get |
|---|---|---|
| GitHub | devboy-github | Issues, pull requests, comments, branches, repos |
| GitLab | devboy-gitlab | Issues, merge requests, discussions, pipelines, MR diffs |
| Jira | devboy-jira | Issues with custom-field metadata, sprints, transitions, project versions (releases) |
| ClickUp | devboy-clickup | Tasks, custom fields, lists, custom task IDs |
| Confluence | devboy-confluence | Knowledge-base pages, search, spaces, create / update with labels (Server / Data Center, v1 + v2 API) |
| Slack | devboy-slack | Chat search, channel summary, post message |
| Fireflies | devboy-fireflies | Meeting transcripts, search, action items |
Adding a provider is a Rust crate implementing Provider + a ToolEnricher (ADR-007).
Research
Every non-trivial optimisation in the pipeline is backed by a paper grounded in a real corpus: 523 Claude Code sessions, 10,644 MCP responses from production traffic. The full docs/research/INDEX.md tracks methods, datasets, and reproducibility scripts.
| # | Paper | Status | Headline result |
|---|---|---|---|
| 1 | TrimTree: priority-driven pagination (binary knapsack within a token budget, pβ metric) | draft (820-line full draft, all experiments complete) | 3.3× pβ vs uniform on power-law data; FIFO baseline 35% replicated across 3 corpora; KV-cache pass on Sonnet 4.5 ≈ 40% input-side savings (66.5% hit rate) |
| 2 | Format-adaptive tree encoding (multi-choice knapsack picking CSV / table / key:value per subtree) | draft | Per-call savings on the corpus: avg 69% on get_issues (top 92%), avg 26% on *_pipeline; ≥ 20% bucket hits 1.25% of all events but most calls of the shape-friendly endpoints |
| 3 (theory · implementation) | Context Enrichment Hypothesis + tool-aware knapsack with provider value models | draft (prefetch dispatcher merged in v0.22; production telemetry pending) | Pearson r = −0.280 between chars_per_item and follow-up enrichment calls; thin issues (< 200 chars/item) → 43% of turns add a get_issue; rich (1.5 k–4 k) → 2% |
| 4 | Dataset-as-context: large responses become queryable Parquet artefacts the LLM pulls from | draft (early concept, no measurements yet) | Hypothesised 60–80% additional savings on top of TrimTree; evaluation harness not yet built |
Other corpus baselines used across papers (the 523 Claude Code sessions / 10,644 MCP-response sample, paper 1 §B):
- get_merge_request_diffs: P90 = 35 k chars ≈ 10 k tokens; 28% of responses exceed an 8 k-token budget
- get_epics: P90 = 43 k chars ≈ 12 k tokens; 37% exceed budget
- After overflow, agents always produce a text response on the next turn; they never retry / paginate (paper 1 §3, paper-1-trimtree.md:30 and §C)
Paper 3's prefetch dispatcher already runs in the format pipeline; papers 1 and 2 land in the next minor version. Paper 4 is at concept stage, with no production code yet.
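For readers unfamiliar with the "binary knapsack within a token budget" framing of paper 1, the textbook version is sketched below: each item has a token cost and a priority, and the goal is the subset with the highest total priority that still fits the budget. This is the generic 0/1 knapsack dynamic programme, not TrimTree itself. The item names, costs, and priorities are invented for the example.

```python
def select_within_budget(items, budget):
    """0/1 knapsack: maximise total priority subject to a token budget.

    items:  list of (name, tokens, priority) with integer token counts
    budget: integer token budget
    Returns (best total priority, names of the chosen items).
    """
    # best[b] = (total priority, chosen indices) achievable with at most b tokens
    best = [(0, ())] * (budget + 1)
    for index, (_, tokens, priority) in enumerate(items):
        # Walk budgets downwards so each item is used at most once.
        for b in range(budget, tokens - 1, -1):
            candidate = best[b - tokens][0] + priority
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - tokens][1] + (index,))
    total, chosen = best[budget]
    return total, [items[i][0] for i in chosen]


# Hypothetical page of issues: (name, token cost, priority).
page = [
    ("issue-1", 400, 9),
    ("issue-2", 300, 5),
    ("issue-3", 300, 5),
    ("issue-4", 700, 10),
]
total_priority, kept = select_within_budget(page, budget=1000)
```

Here the three smaller issues (total priority 19) beat the pairing that starts from the single highest-priority item (issue-4 plus issue-2, total 15), which is why a knapsack formulation can outperform a simple "take the top items until full" rule.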
Architecture
Crate layout
crates/
├── devboy-core/        Traits (Provider, ToolEnricher), shared types, config
├── devboy-executor/    Tool execution engine + enrichment pipeline
├── devboy-mcp/         MCP server (JSON-RPC over stdio)
├── devboy-cli/         CLI binary (`devboy`)
├── devboy-skills/      Skill catalogue, install/upgrade, manifests, traces
├── devboy-storage/     Credential storage (keychain, env vars)
├── devboy-assets/      File attachments (ADR-010)
└── plugins/
    ├── api/ { github, gitlab, jira, clickup, slack, fireflies }
    └── format-pipeline     The token-aware output pipeline (papers 1, 2, 3)
Multi-project contexts
One server, many contexts. Each context is its own provider config bundle:
┌─ DevBoy MCP / CLI ─────────────────┐
│ context: devboy-tools              │
│   ├── GitHub: meteora-pro/devboy   │
│   └── Slack: #devboy               │
│ context: dashboard                 │
│   ├── GitLab: project #42          │
│   ├── ClickUp: list abc123         │
│   └── Jira: DEV                    │
└────────────────────────────────────┘
Switch with devboy context use <name> (CLI) or the use_context tool (MCP). No respawn: the active session re-reads the new bindings on the next call.
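One way to get "no respawn" switching is to resolve provider bindings at call time instead of caching them at startup. The sketch below illustrates that idea with the two contexts from the diagram. The class and method names are hypothetical and are not taken from the DevBoy codebase.

```python
class ContextRegistry:
    """Holds all contexts and resolves bindings against the active one."""

    def __init__(self, contexts, active):
        self._contexts = contexts  # {context name: {provider: binding}}
        self._active = active

    def use(self, name):
        if name not in self._contexts:
            raise KeyError(f"unknown context: {name}")
        self._active = name

    def binding(self, provider):
        # Looked up on every call, so a switch takes effect on the next tool call.
        return self._contexts[self._active][provider]


CONTEXTS = {
    "devboy-tools": {"github": "meteora-pro/devboy", "slack": "#devboy"},
    "dashboard": {"gitlab": "project #42", "clickup": "list abc123", "jira": "DEV"},
}

registry = ContextRegistry(CONTEXTS, active="devboy-tools")
before = registry.binding("github")
registry.use("dashboard")
after = registry.binding("jira")
```

Because nothing is cached per session, switching is a single assignment, and the long-lived server process never needs to restart.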
Executor + enricher pipeline
Tool call → Executor
1. Enrichers transform args (e.g. cf_story_points → customFields)
2. Provider factory builds the client from ProviderConfig
3. Provider executes API calls → typed ToolOutput
4. Format pipeline encodes output → text (markdown / compact / json)
Three enricher categories, single ToolEnricher trait:
- Provider enrichers: adapt schemas per provider (drop unsupported params, surface custom-field params, populate enums from project metadata).
- Pipeline enrichers: add output-control parameters (format enum, pagination knobs).
- Custom enrichers: third-party plugins.
Architecture details: executor, enrichers, format pipeline.
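The four executor steps can be pictured as a small function pipeline. The sketch below is a toy model in Python, not the Rust executor. The provider is a stub that returns canned data, the exact shape of customFields is an assumption, and all function names are hypothetical.

```python
import json


def custom_field_enricher(args):
    """Step 1: move cf_* arguments into a customFields mapping (assumed shape)."""
    enriched = {k: v for k, v in args.items() if not k.startswith("cf_")}
    custom = {k[len("cf_"):]: v for k, v in args.items() if k.startswith("cf_")}
    if custom:
        enriched["customFields"] = custom
    return enriched


def stub_provider(args):
    """Steps 2-3 stand-in: a real provider would build a client and call the API."""
    return [{"id": 1, "title": "Example", "args_seen": sorted(args)}]


def format_output(rows, fmt="json"):
    """Step 4: encode the typed output as text."""
    if fmt == "json":
        return json.dumps(rows)
    return "\n".join(f"- #{row['id']} {row['title']}" for row in rows)


def execute(args, enrichers, provider, fmt="json"):
    # Enrichers run in order, each receiving the previous one's output.
    for enrich in enrichers:
        args = enrich(args)
    return format_output(provider(args), fmt)


text = execute({"cf_story_points": 3}, [custom_field_enricher], stub_provider, fmt="markdown")
```

Keeping enrichers as plain "args in, args out" functions is what lets provider, pipeline, and custom enrichers share a single interface, as the list above describes.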
Documentation map
- Getting started: docs/guide/getting-started/
- CLI reference (auto-generated): docs/guide/reference/cli.md
- Tool reference (auto-generated): docs/guide/reference/tools.md
- Skills user guide: docs/guide/skills/
- Configuration (env vars, contexts, doctor, proxy, format pipeline): docs/guide/configuration/
- Architecture: docs/guide/architecture/
- ADRs: docs/architecture/adr/INDEX.md (17 decisions logged)
- Research papers: docs/research/INDEX.md
Development
cargo build # debug build
cargo test # runs the workspace test suite
cargo clippy --all-targets # lint (CI uses RUSTFLAGS=-Dwarnings)
cargo fmt --all # format
cargo run -p devboy-cli -- doctor # smoke
The CLI reference is gated in CI: after touching clap definitions, run
cargo run -p devboy-cli -- docs cli --output docs/guide/reference/cli.md
so the committed reference matches the binary. Same idea for devboy tools docs and the tool reference.
See CONTRIBUTING.md for the full guide (commit conventions, branch naming, ADR workflow, release process).
Community
- Issues / feature requests: GitHub Issues
- Design discussions: GitHub Discussions
- Code review tooling: open a PR; CI runs Format, Clippy, Test on macOS / Linux / Windows, Coverage, and the docs drift gate
License
Apache License 2.0: use it, modify it, ship it; if you build something interesting on top, we'd love a heads-up via Discussions.
