Boyce

Deterministic SQL compiler for AI agents. Your agent stops guessing SQL.

Boyce: Semantic Protocol & Safety Layer for Agentic Database Workflows

Boyce connects LLMs to live database context with built-in safety rails.

Named for Raymond F. Boyce, co-inventor of SQL (1974) and co-developer of Boyce-Codd Normal Form (BCNF).

AI agents that query databases without proper context generate unreliable SQL: they work from incomplete schemas, infer column names, and guess join paths. Boyce gives agents the structured database context they need to generate correct, safe SQL, through three interconnected layers:

Layer	What it does
SQL Compiler	`ask_boyce` — NL → StructuredFilter → deterministic SQL. Zero LLM in the SQL builder. Same inputs, same SQL, byte-for-byte, every time.
Database Inspector	`query_database` / `profile_data` — Live Postgres/Redshift adapters let your agent see real schema and real data distributions before writing a single filter.
Query Verification	Pre-flight `EXPLAIN` loops on every generated query. Bad SQL is caught at planning time, not at 2am in your on-call rotation.
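
To make "same inputs, same SQL" concrete, here is a minimal sketch of a deterministic builder. The filter shape (table, columns, where) and the names build_sql and fingerprint are hypothetical and chosen for illustration only; Boyce's actual StructuredFilter format is the one returned by get_schema and may look quite different.

```python
import hashlib

# Hypothetical filter shape for illustration; not Boyce's StructuredFilter schema.
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def build_sql(structured_filter):
    """Render a filter dict to (sql, params) with a layout independent of input order."""
    # Sort predicates so the order they arrive in cannot change the output.
    predicates = sorted(
        structured_filter.get("where", []),
        key=lambda p: (p["column"], p["op"], repr(p["value"])),
    )
    clauses, params = [], []
    for position, predicate in enumerate(predicates, start=1):
        if predicate["op"] not in ALLOWED_OPS:
            raise ValueError("unsupported operator: " + predicate["op"])
        # $1, $2, ... are asyncpg-style placeholders; values never enter the SQL text.
        clauses.append(f"{predicate['column']} {predicate['op']} ${position}")
        params.append(predicate["value"])
    columns = ", ".join(structured_filter["columns"])
    sql = f"SELECT {columns} FROM {structured_filter['table']}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def fingerprint(sql):
    """Hash the SQL text so byte-for-byte equality is easy to compare."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


filter_a = {
    "table": "orders",
    "columns": ["order_id", "revenue"],
    "where": [
        {"column": "status", "op": "=", "value": "paid"},
        {"column": "revenue", "op": ">", "value": 100},
    ],
}
# Same predicates, different input order.
filter_b = {**filter_a, "where": list(reversed(filter_a["where"]))}

sql_a, params_a = build_sql(filter_a)
sql_b, params_b = build_sql(filter_b)
print(sql_a)
print(fingerprint(sql_a) == fingerprint(sql_b))
```

The point of the sketch is that nothing in the builder depends on sampling, clocks, or input ordering, so the fingerprint is stable across runs.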

Why does this matter? → The Null Trap: Your AI Agent's SQL Is Correct. The Answer Is Still Wrong.

Install

Requires Python 3.10+

pip install boyce

# With live Postgres/Redshift adapter (enables EXPLAIN pre-flight + column profiling)
pip install "boyce[postgres]"

# uv (recommended)
uv pip install boyce
uv pip install "boyce[postgres]"

From source:

git clone https://github.com/boyce-io/boyce
uv pip install -e "boyce/"

Quickstart

After installing, run boyce init to configure your MCP host automatically:

boyce init

The wizard detects Claude Desktop, Cursor, Claude Code, and JetBrains (DataGrip, IntelliJ, etc.), and writes the correct config block for each.

Developing from source? The repo includes a setup script:

./quickstart.sh   # detects uv or python, installs package, writes .env template

Configure Your MCP Host

boyce init (see Quickstart) detects your MCP host and writes the config automatically. To configure manually instead, follow one of the two setup paths below, depending on your host:

Path 1 — MCP Hosts (No LLM key required)

If you're using Claude Desktop, Cursor, Claude Code, Codex, Cline, Windsurf, JetBrains (DataGrip, IntelliJ), or any other MCP-compatible host, you do not need to configure an LLM provider for Boyce. The host's own model handles reasoning, while Boyce supplies the schema context and the deterministic SQL compiler via get_schema and ask_boyce. The only variable to set is BOYCE_DB_URL, and even that is optional.

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "boyce": {
      "command": "boyce",
      "env": {
        "BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}

Cursor (.cursor/mcp.json in project root):

{
  "mcpServers": {
    "boyce": {
      "command": "boyce",
      "env": {
        "BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}
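
If the host config file already lists other MCP servers, hand-editing risks clobbering them. The following is a small sketch of merging the boyce entry into an existing file. The helper name add_boyce_server is hypothetical, and omitting the env block when no DSN is given is an assumption based on BOYCE_DB_URL being optional; boyce init remains the supported way to do this.

```python
import json
from pathlib import Path
from typing import Optional


def add_boyce_server(config_path: Path, db_url: Optional[str] = None) -> dict:
    """Add or update the "boyce" entry under "mcpServers", keeping other entries."""
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})

    entry = {"command": "boyce"}
    if db_url:
        # Assumption: the env block can be left out entirely in schema-only mode.
        entry["env"] = {"BOYCE_DB_URL": db_url}
    servers["boyce"] = entry

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Cursor reads .cursor/mcp.json from the project root.
    add_boyce_server(Path(".cursor/mcp.json"), "postgresql://user:pass@host:5432/db")
```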

Path 2 — With Boyce's Built-in NL→SQL

If you're using the CLI (boyce ask), HTTP API, or a non-MCP client (e.g., the VS Code extension), configure Boyce's internal query planner with your LLM provider:

{
  "mcpServers": {
    "boyce": {
      "command": "boyce",
      "env": {
        "BOYCE_PROVIDER": "anthropic",
        "BOYCE_MODEL": "claude-sonnet-4-6",
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "BOYCE_DB_URL": "postgresql://user:pass@host:5432/db"
      }
    }
  }
}

Boyce supports any LLM provider available through LiteLLM: Anthropic, OpenAI, Ollama (local), vLLM (local), Azure, Bedrock, Vertex, Mistral, and more.

BOYCE_DB_URL is optional on both paths. Without it, Boyce runs in schema-only mode: SQL generation still works, but EXPLAIN pre-flight and the live query tools return "status": "unchecked".

Environment Variables

Variable	When needed	Example	Purpose
`BOYCE_PROVIDER`	Path 2 only (CLI/HTTP/non-MCP)	`anthropic`	LiteLLM provider name
`BOYCE_MODEL`	Path 2 only (CLI/HTTP/non-MCP)	`claude-sonnet-4-6`	Model ID passed to LiteLLM
`ANTHROPIC_API_KEY`	When using Anthropic	`sk-ant-...`	Anthropic credentials
`OPENAI_API_KEY`	When using OpenAI	`sk-...`	OpenAI credentials
`BOYCE_DB_URL`	Optional (either path)	`postgresql://user:pass@host:5432/db`	asyncpg DSN — enables EXPLAIN pre-flight + live query tools
`BOYCE_HTTP_TOKEN`	Path 2 HTTP API only	`my-secret-token`	Bearer token for `boyce serve --http`
`BOYCE_STATEMENT_TIMEOUT_MS`	Optional	`30000`	Per-statement timeout in ms (default: 30000, i.e. 30 s)
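
The table above implies a few combinations that are easy to get wrong, such as setting BOYCE_PROVIDER without the matching API key. Below is a rough pre-flight sketch that reads the same variables and reports which setup they describe. The function describe_setup and its return shape are hypothetical; this is not how Boyce itself loads configuration (boyce doctor is the built-in diagnostic).

```python
import os

# Credentials variable per provider, taken from the table above.
PROVIDER_KEYS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


def describe_setup(env=None):
    """Summarise which setup path the given environment variables describe."""
    env = dict(os.environ) if env is None else env
    provider = env.get("BOYCE_PROVIDER")

    missing = []
    if provider:
        if not env.get("BOYCE_MODEL"):
            missing.append("BOYCE_MODEL")
        key_name = PROVIDER_KEYS.get(provider)
        if key_name and not env.get(key_name):
            missing.append(key_name)

    return {
        "planner": "built-in (Path 2)" if provider else "host model (Path 1)",
        "database": "live" if env.get("BOYCE_DB_URL") else "schema-only",
        "statement_timeout_ms": int(env.get("BOYCE_STATEMENT_TIMEOUT_MS", "30000")),
        "missing": missing,
    }


example = describe_setup({"BOYCE_PROVIDER": "anthropic", "BOYCE_MODEL": "claude-sonnet-4-6"})
print(example)
```

Calling describe_setup() with no argument inspects the current process environment instead of an explicit dict.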

MCP Tools

Tool	Description
`ingest_source`	Parse a `SemanticSnapshot` from dbt manifest, dbt project, LookML, DDL, SQLite, Django, SQLAlchemy, Prisma, CSV, or Parquet.
`ingest_definition`	Store a certified business definition — injected automatically at query time.
`get_schema`	Return full schema context + StructuredFilter format docs. Used by MCP hosts so the host LLM can construct queries without an LLM provider key configured for Boyce.
`ask_boyce`	Full NL → SQL pipeline: query planner (LiteLLM) → deterministic kernel → NULL trap check → EXPLAIN pre-flight.
`validate_sql`	Validate hand-written SQL — EXPLAIN pre-flight, Redshift lint, NULL risk — without executing.
`query_database`	Execute a read-only `SELECT` against the live database. Write operations rejected at two independent layers.
`profile_data`	Null %, distinct count, min/max for any column — surface data quality issues before they affect query results.
`check_health`	Operational health check — DB connectivity, snapshot freshness, actionable fix commands. Call when queries fail unexpectedly.
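
As an illustration of why the column statistics from profile_data matter, the sketch below computes null %, distinct count, and min/max for one column, then shows a filter that silently drops NULL rows. It uses an in-memory SQLite database as a stand-in for Postgres, and profile_column is a hypothetical helper, not Boyce's implementation. Whether this exact pattern is what Boyce's NULL trap check targets is an assumption.

```python
import sqlite3


def profile_column(conn, table, column):
    """Return null %, distinct count, and min/max for one column."""
    # Identifiers cannot be bound as parameters, so this sketch assumes
    # `table` and `column` come from a trusted schema, not from user input.
    total, non_null, distinct, lowest, highest = conn.execute(
        f'SELECT COUNT(*), COUNT("{column}"), COUNT(DISTINCT "{column}"), '
        f'MIN("{column}"), MAX("{column}") FROM "{table}"'
    ).fetchone()
    null_pct = 0.0 if total == 0 else 100.0 * (total - non_null) / total
    return {"null_pct": null_pct, "distinct": distinct, "min": lowest, "max": highest}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "paid"), (2, "paid"), (3, "cancelled"), (4, None)],
)

stats = profile_column(conn, "orders", "status")

# NULL != 'cancelled' evaluates to NULL, so order 4 is dropped by this filter
# even though it was never cancelled.
kept = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status != 'cancelled'"
).fetchone()[0]

print(stats)  # null_pct is 25.0 for this data
print(kept)   # 2, not 3
```

Seeing a non-zero null percentage before writing the filter is the cue to decide explicitly how NULL rows should be treated.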

Architecture

SemanticSnapshot (JSON)
        │
        ▼  ingest_source
 ┌─────────────────────────────────────────────┐
 │          SemanticGraph (NetworkX)            │  ← in-memory, loaded per session
 │  nodes = entities (tables/views/dbt models) │
 │  edges = joins  (weighted by confidence)    │
 └─────────────────────────────────────────────┘
        │                         │
        ▼  ask_boyce              ▼  (internal)
  QueryPlanner                 Dijkstra
  (LiteLLM)                    join resolver
  NL → StructuredFilter             │
        │                           │
        └──────────┬────────────────┘
                   ▼
           kernel.process_request()          ← ZERO LLM HERE
           SQLBuilder (dialect-aware)
                   │
                   ▼
           EXPLAIN pre-flight                ← Query Verification
           (PostgresAdapter)
                   │
                   ▼
            SQL + validation result
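
The join resolver step in the diagram can be sketched as a shortest-path search over the entity graph. The version below uses the standard library's heapq rather than NetworkX so it runs without extra dependencies. The cost function (1 - confidence) is an assumption made for illustration, as are the function name resolve_join_path and the sample entities; the diagram only states that edges are weighted by confidence.

```python
import heapq


def resolve_join_path(joins, start, goal):
    """Dijkstra over an undirected join graph.

    joins: iterable of (left_entity, right_entity, confidence in [0, 1]).
    Returns the list of entities on the cheapest path, or None if unreachable.
    """
    graph = {}
    for left, right, confidence in joins:
        cost = 1.0 - confidence  # assumed mapping: higher confidence, lower cost
        graph.setdefault(left, []).append((right, cost))
        graph.setdefault(right, []).append((left, cost))

    best = {start: 0.0}
    queue = [(0.0, start, [start])]
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor, path + [neighbor]))
    return None


joins = [
    ("orders", "customers", 0.95),
    ("customers", "regions", 0.90),
    ("orders", "regions", 0.20),  # a direct but low-confidence join
]
path = resolve_join_path(joins, "orders", "regions")
print(path)
```

With these weights the two-hop route through customers is cheaper than the direct low-confidence edge, which is the behaviour a confidence-weighted resolver is meant to produce.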

Dialect support: redshift, postgres, duckdb, bigquery

Redshift safety rails (safety.py): Automatic linting for LATERAL, JSONB, REGEXP_COUNT, lookahead regex patterns, and numeric cast rewrites for Redshift 1.0 (PG 8.0.2).
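
A lint pass of this kind can be approximated with a handful of regular expressions. The sketch below flags the constructs named above (numeric cast rewrites are left out). It is not the code in safety.py; the rule list and the function lint_for_redshift are hypothetical, and a naive text scan like this will also match inside string literals and comments.

```python
import re

# Illustrative rules only; each pairs a label with a pattern to search for.
REDSHIFT_LINT_RULES = [
    ("LATERAL", re.compile(r"\bLATERAL\b", re.IGNORECASE)),
    ("JSONB", re.compile(r"\bJSONB\b", re.IGNORECASE)),
    ("REGEXP_COUNT", re.compile(r"\bREGEXP_COUNT\s*\(", re.IGNORECASE)),
    # (?= and (?! introduce positive and negative lookahead in a regex.
    ("lookahead regex", re.compile(r"\(\?[=!]")),
]


def lint_for_redshift(sql):
    """Return the labels of every rule that matches the SQL text."""
    return [label for label, pattern in REDSHIFT_LINT_RULES if pattern.search(sql)]


findings = lint_for_redshift(
    "SELECT * FROM a, LATERAL (SELECT 1) b WHERE a.x ~ 'foo(?=bar)'"
)
print(findings)
```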

Scan CLI

# Scan a single file
boyce scan demo/magic_moment/manifest.json

# Scan a directory (auto-detects all parseable sources)
boyce scan ./my-project/ -v

# Save snapshots for MCP server use
boyce scan ./my-project/ --save

10 parsers: dbt manifest, dbt project, LookML, SQLite, DDL, CSV, Parquet, Django, SQLAlchemy, Prisma.

Verify the Install

# Unit tests — no DB required, runs in ~4 seconds
python boyce/tests/verify_eyes.py

# Expected output:
# Ran 15 tests in 3.5s
# OK
# ✅  All checks passed.

SemanticSnapshot Format

The ingest_source tool accepts a SemanticSnapshot JSON dict. Minimal example:

{
  "snapshot_id": "<sha256>",
  "source_system": "dbt",
  "entities": {
    "entity:orders": {
      "id": "entity:orders",
      "name": "orders",
      "schema": "public",
      "fields": ["field:orders:order_id", "field:orders:revenue"]
    }
  },
  "fields": {
    "field:orders:order_id": {
      "id": "field:orders:order_id",
      "entity_id": "entity:orders",
      "name": "order_id",
      "field_type": "ID",
      "data_type": "INTEGER"
    }
  },
  "joins": []
}

See boyce/tests/live_fire/mock_snapshot.json for a complete field/entity example.
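
Because entities and fields reference each other by ID, a quick consistency check before calling ingest_source can catch typos early. The sketch below walks a snapshot dict and reports IDs that point at nothing. The helper find_dangling_refs is hypothetical, and whether Boyce itself rejects such snapshots is an assumption; in the minimal example above, field:orders:revenue is listed on the entity but has no entry under fields, so it is reported.

```python
def find_dangling_refs(snapshot):
    """Return human-readable problems for IDs that reference missing entries."""
    entities = snapshot.get("entities", {})
    fields = snapshot.get("fields", {})
    problems = []

    # Every field ID listed on an entity should exist in the top-level fields map.
    for entity_id, entity in entities.items():
        for field_id in entity.get("fields", []):
            if field_id not in fields:
                problems.append(f"{entity_id} lists undefined field {field_id}")

    # Every field should point back at an entity that exists.
    for field_id, field in fields.items():
        if field.get("entity_id") not in entities:
            problems.append(f"{field_id} references undefined entity {field.get('entity_id')}")

    return problems


snapshot = {
    "snapshot_id": "<sha256>",
    "source_system": "dbt",
    "entities": {
        "entity:orders": {
            "id": "entity:orders",
            "name": "orders",
            "schema": "public",
            "fields": ["field:orders:order_id", "field:orders:revenue"],
        }
    },
    "fields": {
        "field:orders:order_id": {
            "id": "field:orders:order_id",
            "entity_id": "entity:orders",
            "name": "order_id",
            "field_type": "ID",
            "data_type": "INTEGER",
        }
    },
    "joins": [],
}

problems = find_dangling_refs(snapshot)
print(problems)
```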

Project Layout

boyce/                          ← PRIMARY — headless FastMCP server + pip package
├── boyce/
│   ├── server.py               ← MCP entry point (8 tools)
│   ├── kernel.py               ← Deterministic SQL kernel
│   ├── graph.py                ← SemanticGraph (NetworkX)
│   ├── safety.py               ← Redshift compatibility rails
│   ├── types.py                ← Protocol contract (Pydantic)
│   ├── scan.py                 ← Scan CLI (boyce scan)
│   ├── connections.py          ← DSN persistence (ConnectionStore)
│   ├── doctor.py               ← Environment diagnostics (boyce doctor)
│   ├── sql/                    ← SQLBuilder, dialect layer, join resolver
│   ├── parsers/                ← 10 parsers (dbt, lookml, ddl, sqlite, csv, etc.)
│   ├── planner/                ← QueryPlanner (LiteLLM → StructuredFilter)
│   └── adapters/               ← PostgresAdapter (Eyes)
└── tests/
    ├── verify_eyes.py          ← 15-test suite, no DB required
    ├── test_parsers.py         ← Parser tests (all 10 parsers)
    ├── test_scan.py            ← Scan CLI tests
    └── live_fire/              ← Docker Compose integration tests

Status

Capability	Status
NL → SQL (deterministic kernel)	Operational
SemanticGraph (join resolution)	Operational
10 source parsers	Operational
Scan CLI (`boyce scan`)	Operational
PostgresAdapter (read-only)	Operational
EXPLAIN pre-flight validation	Operational
NULL Trap detection	Operational
Redshift 1.0 safety linting	Operational
Snapshot persistence across restarts	Operational
Audit logging (append-only JSONL)	Operational
Business definitions (`ingest_definition`)	Operational
DSN persistence (`ConnectionStore`)	Operational
Environment diagnostics (`boyce doctor` / `check_health`)	Operational
Multi-snapshot merge	Planned

Support

Troubleshooting guide: docs/troubleshooting.md
Local LLM setup (Ollama/vLLM): docs/local-llm-setup.md
Bug reports: GitHub Issues
Setup help: GitHub Issues
Email: will@convergentmethods.com — for issues involving credentials or sensitive config