Graph Indexed Development
Graph-Indexed Development (GID): unified graph-based project management (design to development) for AI agents. Rust core + CLI, TypeScript MCP server + CLI.
Graph Indexed Development (GID)
An operating system for AI-driven software development.
Every AI coding tool fights the same battle: what context should the agent see? Cursor dumps entire files. Devin guesses. Copilot Workspace hopes for the best. GID solves this with a code knowledge graph: your codebase as a queryable structure where agents trace dependencies, assess impact, and receive only the precise context they need.
69,000 lines of Rust. 1,080 tests. Zero hand-waving.
.gid/graph.yml: like .git/ for version control, but for architecture intelligence
What GID Does That Nobody Else Does
1. Understands Code Structure, Not Just Text
Code Graph Engine (11,106 lines): tree-sitter AST parsing for Rust, TypeScript, and Python. Extracts functions, classes, modules, and their real relationships: calls, imports, inheritance, type references. Not regex. Not keyword search. Full structural understanding.
gid extract src/
# → Parses every file, builds a typed dependency graph
# → Functions know what they call, modules know what they import
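To make the idea of a typed dependency graph concrete, here is a minimal sketch. It swaps in Python's standard-library `ast` module for tree-sitter and is not GID's extractor; the function name `extract_edges`, the module label `app`, and the sample source are all hypothetical.

```python
import ast

# Hypothetical sample module used only for this illustration.
SOURCE = '''
import os
from json import loads

def read_config(path):
    with open(path) as fh:
        return loads(fh.read())

def main():
    cfg = read_config(os.environ["APP_CONFIG"])
    return cfg
'''

def extract_edges(source, module="app"):
    """Return (from, relation, to) triples for imports and direct calls."""
    tree = ast.parse(source)
    edges = []
    for node in tree.body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.append((module, "imports", alias.name))
        elif isinstance(node, ast.ImportFrom):
            edges.append((module, "imports", node.module))
        elif isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                # Only plain-name calls such as foo(...) are recorded;
                # attribute calls such as fh.read() are skipped in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.append((f"{module}.{node.name}", "calls", inner.func.id))
    return edges

EDGES = extract_edges(SOURCE)
for edge in EDGES:
    print(edge)
```

The point of the sketch is that edges come from the syntax tree, so `main` is linked to `read_config` because it calls it, not because the two names happen to appear in the same file.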
2. Discovers Architecture Automatically
Infomap Community Detection (8,084 lines): treats your code graph as an information flow network and runs Infomap to discover natural module boundaries. Seven edge-weight strategies (calls=1.0, imports=0.8, type references=0.5, co-citation=0.4, ...) capture different coupling signals. Often finds better module boundaries than humans drew.
gid analyze --dir src/
# → "Component A: auth.rs, session.rs, middleware.rs (cohesion: 0.87)"
# → "Component B: db.rs, models.rs, migrations.rs (cohesion: 0.91)"
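The sketch below shows only the weighting step that would feed a clustering algorithm, not Infomap itself. It assumes that weights of different edge types between the same pair of files are summed, which the text does not state. Only the four weights quoted above are used, and the keys `type_ref` and `co_citation` are hypothetical names.

```python
from collections import defaultdict

# Weights quoted in the text; the remaining strategies are not listed there,
# so they are left out rather than guessed.
EDGE_WEIGHTS = {"calls": 1.0, "imports": 0.8, "type_ref": 0.5, "co_citation": 0.4}

def weighted_graph(typed_edges):
    """Collapse typed edges into one directed weight per (src, dst) pair.

    Assumption: weights of different edge types simply add up.
    """
    weights = defaultdict(float)
    for src, relation, dst in typed_edges:
        if src == dst:
            continue  # self-loops carry no coupling signal between files
        weights[(src, dst)] += EDGE_WEIGHTS.get(relation, 0.0)
    return dict(weights)

# Hypothetical edges between the files named in the example output above.
TYPED_EDGES = [
    ("auth.rs", "calls", "session.rs"),
    ("auth.rs", "imports", "session.rs"),
    ("middleware.rs", "calls", "auth.rs"),
    ("db.rs", "type_ref", "models.rs"),
    ("auth.rs", "calls", "auth.rs"),
]
COUPLING = weighted_graph(TYPED_EDGES)
for pair, weight in sorted(COUPLING.items()):
    print(pair, round(weight, 2))
```

A community detection pass would then run on `COUPLING`, where strongly coupled pairs such as `auth.rs` and `session.rs` are more likely to land in the same component.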
3. Predicts Impact Before You Break Things
Impact Analysis + Working Memory: before changing a function, GID tells you exactly what else is affected: which callers, which modules, which tests. The Working Memory module tracks the blast radius of in-progress changes so agents don't fix A and break B.
gid query impact UserService
# → 12 callers affected across 3 modules
# → 2 hub nodes in the dependency chain
# → 4 test files need updating
gid code-impact auth.py --dir src/
# → Traces through the actual code graph, not guessing
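Impact analysis of this kind can be pictured as a traversal over reversed dependency edges. The following is a minimal sketch under that assumption, not GID's query engine; `impacted_by` and the sample call edges are hypothetical.

```python
from collections import defaultdict, deque

def impacted_by(edges, target):
    """Return every node that directly or transitively depends on `target`."""
    dependents = defaultdict(set)
    for src, dst in edges:  # src depends on (calls or imports) dst
        dependents[dst].add(src)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in dependents[node]:
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen

# Hypothetical call edges: (caller, callee).
CALLS = [
    ("LoginHandler", "UserService"),
    ("SignupHandler", "UserService"),
    ("Router", "LoginHandler"),
    ("test_login", "LoginHandler"),
    ("Billing", "Ledger"),
]
IMPACT = impacted_by(CALLS, "UserService")
print(sorted(IMPACT))
```

Because the traversal follows edges rather than text matches, `Router` and `test_login` are reported even though neither mentions `UserService` directly, while the unrelated `Billing` is left out.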
4. Orchestrates Multi-Agent Execution
Task Harness (10,549 lines): not a todo list. A full execution engine with topological scheduling (parallelize what can run in parallel, serialize what must be serial), critical path analysis, and automatic orphan/cycle detection. Each task gets precise context assembly: the harness resolves the task's graph edges to extract exactly the right design doc sections, requirement GOALs, and project guards.
gid tasks --ready
# → Shows tasks whose dependencies are all satisfied
gid complete fix-auth-bug
# → Marks done, shows what's newly unblocked
# → Updates execution state, logs telemetry
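Topological scheduling with cycle detection can be sketched with Kahn's algorithm, grouping tasks into waves that could run in parallel. This is an illustration of the technique, not the Harness implementation; `schedule` and the task names are hypothetical.

```python
def schedule(deps):
    """Group tasks into waves; every task in a wave has all dependencies met.

    `deps` maps task -> set of tasks it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    remaining = {task: set(d) for task, d in deps.items()}
    # Make sure tasks that only appear as dependencies are scheduled too.
    for d in list(remaining.values()):
        for dep in d:
            remaining.setdefault(dep, set())
    waves = []
    while remaining:
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return waves

# Hypothetical task graph.
TASKS = {
    "design-auth": set(),
    "design-db": set(),
    "impl-auth": {"design-auth", "design-db"},
    "impl-api": {"impl-auth"},
}
WAVES = schedule(TASKS)
for number, wave in enumerate(WAVES, start=1):
    print(number, wave)
```

The first wave corresponds to what `gid tasks --ready` would list at the start, and completing those tasks unblocks the next wave.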
5. Enforces Development Process
Ritual Engine (9,945 lines): a pure-function state machine with 14 states, among them Idle → Triage → Requirements → Design → Review → Plan → Graph → Implement → Verify → Done. The Composer scans your project (Has graph? Has tests? What language?) and dynamically assembles the right ritual phases. Every phase has approval gates, so whatever needs human review gets human review.
Ritual flow (dynamically composed per project):
Triage → Requirements → Design → Review → Plan → Implement → Verify → Done
(side branches: Clarification, Approval Gate)
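A pure-function state machine takes the current state and an event and returns a new state without mutating anything. The sketch below illustrates that shape using the phase names listed above. The event names (`advance`, `approve`) and the choice of which phases are gated are hypothetical, and the remaining states of the real engine are not modeled.

```python
from typing import NamedTuple

PHASES = ["Idle", "Triage", "Requirements", "Design", "Review",
          "Plan", "Graph", "Implement", "Verify", "Done"]
# Hypothetical: which phases wait for a human before moving on.
GATED = {"Design", "Review"}

class Ritual(NamedTuple):
    phase: str
    approved: bool = False

def step(state, event):
    """Pure transition: returns a new state and never mutates the old one."""
    if event == "approve":
        return state._replace(approved=True)
    if event != "advance" or state.phase == "Done":
        return state
    if state.phase in GATED and not state.approved:
        return state  # blocked at the approval gate
    next_phase = PHASES[PHASES.index(state.phase) + 1]
    return Ritual(phase=next_phase)  # approval resets for the new phase

blocked = step(Ritual("Design"), "advance")
released = step(step(Ritual("Design"), "approve"), "advance")
print(blocked, released)
```

Because `step` has no side effects, the same sequence of events always yields the same phase, which makes a ritual easy to replay and test.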
6. Controls Agent Access to Source Code
Tool Gating: with no active ritual, source code directories are write-locked. Agents must go through design → implement → verify before touching production code. Gating is configuration-driven (glob/regex patterns) and can be overridden for specific paths.
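A glob-based write gate could look roughly like the sketch below. The pattern lists and the `may_write` function are hypothetical, since the real configuration keys are not shown here, and the sketch simplifies "active ritual" to a single boolean.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration; GID's actual config format is not shown in this README.
LOCKED = ["src/*", "crates/*"]
ALWAYS_WRITABLE = ["src/generated/*", "*.md"]

def may_write(path, ritual_active):
    """Decide whether an agent may write to `path`."""
    if ritual_active:
        return True  # simplification: any active ritual unlocks writes
    if any(fnmatchcase(path, pattern) for pattern in ALWAYS_WRITABLE):
        return True  # explicit override for specific paths
    return not any(fnmatchcase(path, pattern) for pattern in LOCKED)

for path in ["src/auth/login.rs", "src/generated/api.rs", "README.md"]:
    print(path, may_write(path, ritual_active=False))
```

Note that `fnmatchcase` lets `*` match across `/`, so `src/*` covers nested files as well.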
7. Upgrades Understanding Over Time
Semantify: LLM-assisted graph enrichment. Promotes file-level nodes to named components, assigns architectural layers (API / service / storage), discovers cross-cutting features. Your graph gets smarter the more you use it.
8. Assembles Precise Context: The Killer Feature
This is what makes GID fundamentally different from "just another code search tool."
When a sub-agent implements a task, it doesn't get the whole repo dumped into its context window. GID resolves the task's graph edges:
Task "add-oauth"
  → implements → auth-feature
  → design_doc → .gid/features/auth/design.md § 3.2 (OAuth Flow)
  → satisfies → GOAL-auth.3 from requirements.md
  → project guards → GUARD-1 (no plaintext secrets)
Result: the agent sees only what it needs. Not 50 files. Not the whole repo. The exact design section, the exact requirement, the exact constraints. This is why GID agents produce better code: they're not drowning in irrelevant context.
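The edge resolution above can be sketched as a lookup that pulls one numbered section out of a design doc and pairs it with the referenced goals and guards. This is an illustrative sketch only: the heading format (`## 3.2 ...`), the design doc text, the requirement wording, and the function names are all assumptions.

```python
import re

# Hypothetical design doc; the "## <number> <title>" heading style is an assumption.
DESIGN_DOC = """\
## 3.1 Password Login
Hash with argon2.

## 3.2 OAuth Flow
Redirect to the provider, then exchange the code for a token.

## 3.3 Sessions
Rotate session ids on login.
"""

def section(doc, number):
    """Return the body under the heading numbered `number`, up to the next heading."""
    pattern = rf"^## {re.escape(number)} .*?$\n(.*?)(?=^## |\Z)"
    match = re.search(pattern, doc, flags=re.M | re.S)
    return match.group(1).strip() if match else ""

def assemble_context(task, requirements, guards):
    """Collect only the design section, goals, and guards the task points at."""
    meta = task["metadata"]
    return {
        "task": task["title"],
        "design": section(DESIGN_DOC, meta["design_ref"]),
        "goals": [requirements[goal] for goal in meta["satisfies"]],
        "guards": guards,
    }

TASK = {"id": "add-oauth", "title": "Add OAuth support",
        "metadata": {"design_ref": "3.2", "satisfies": ["GOAL-auth.3"]}}
CONTEXT = assemble_context(
    TASK,
    requirements={"GOAL-auth.3": "Users can sign in with an OAuth provider."},
    guards=["GUARD-1: no plaintext secrets"],
)
print(CONTEXT)
```

The agent receives section 3.2 and nothing from 3.1 or 3.3, which is the selectivity the paragraph above describes.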
Quick Start
# Install
cargo install gid-dev-cli
# Initialize in your project
cd your-project
gid init
# Option A: Top-down (design first)
gid design "E-commerce with auth, payments, orders"
# → Outputs a structured prompt. Feed to LLM, get YAML graph back.
echo "<yaml from LLM>" | gid design --parse
# Option B: Bottom-up (extract from code)
gid extract src/
# → Parses codebase with tree-sitter, builds dependency graph
# Start working
gid tasks --ready # What can I work on?
gid query impact AuthSvc # What breaks if I change this?
gid advise # How healthy is the project?
gid visual --format mermaid # See the architecture
Architecture
┌─────────────────────────────────────────────────┐
│           gid-core (Rust, 59K lines)            │
│                                                 │
│       ┌──────────────┐  ┌──────────────┐        │
│       │  Code Graph  │  │   Infomap    │        │
│       │    Engine    │  │  Clustering  │        │
│       │ (tree-sitter)│  │  (community  │        │
│       │              │  │  detection)  │        │
│       └──────┬───────┘  └──────┬───────┘        │
│              │                 │                │
│       ┌──────▼─────────────────▼───────┐        │
│       │     Graph (.gid/graph.yml)     │        │
│       └──────┬─────────────────┬───────┘        │
│              │                 │                │
│       ┌──────▼───────┐  ┌──────▼───────┐        │
│       │   Harness    │  │    Ritual    │        │
│       │  (execution  │  │    (state    │        │
│       │    engine)   │  │   machine)   │        │
│       └──────┬───────┘  └──────┬───────┘        │
│              │                 │                │
│       ┌──────▼─────────────────▼───────┐        │
│       │   Context Assembly + Gating    │        │
│       └────────────────────────────────┘        │
│                                                 │
└────────────────────┬────────────────────────────┘
                     │
        ┌────────────┼────────────┐
        ▼            ▼            ▼
     gid CLI    MCP Server    Rust embed
    (39 cmds)  (TS wrapper)  (cargo add)
One implementation. One schema. Everywhere. The MCP server is an 850-line thin wrapper: it translates MCP tool calls to gid --json CLI commands. Zero graph logic duplicated.
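The thin-wrapper idea can be illustrated as a table that maps a tool name to CLI arguments. The real wrapper is written in TypeScript; this Python sketch only shows the shape. The tool names and CLI commands appear elsewhere in this README, but the argument names (`node`, `ready`) and the exact mapping are assumptions, and the sketch builds the command line without running it.

```python
# Tool names and CLI commands come from this README; the argument names
# ("node", "ready") and the mapping itself are assumptions for illustration.
TOOL_TO_ARGV = {
    "gid_query_impact": lambda args: ["query", "impact", args["node"]],
    "gid_tasks": lambda args: ["tasks"] + (["--ready"] if args.get("ready") else []),
    "gid_advise": lambda args: ["advise"],
}

def to_cli(tool, arguments):
    """Translate one tool call into a gid command line with --json appended."""
    if tool not in TOOL_TO_ARGV:
        raise KeyError(f"unknown tool: {tool}")
    return ["gid", *TOOL_TO_ARGV[tool](arguments), "--json"]

print(to_cli("gid_query_impact", {"node": "UserService"}))
print(to_cli("gid_tasks", {"ready": True}))
```

Keeping the wrapper to a mapping like this is what lets all graph logic stay in one place: the wrapper never inspects the graph, it only forwards the call and returns the JSON output.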
Packages
| Package | Version | Install |
|---|---|---|
| gid-core | v0.3.1 | cargo add gid-core |
| gid-dev-cli | v0.3.1 | cargo install gid-dev-cli |
| MCP Server | npm | npx graph-indexed-development-mcp |
Feature Flags (gid-core)
[dependencies]
gid-core = "0.3.1" # Graph only (minimal)
gid-core = { version = "0.3.1", features = ["infomap"] } # + community detection
gid-core = { version = "0.3.1", features = ["harness"] } # + task execution engine
gid-core = { version = "0.3.1", features = ["ritual"] } # + development pipeline
gid-core = { version = "0.3.1", features = ["full"] } # Everything
All 39 Commands
Graph Operations
init · read · validate · add-node · remove-node · add-edge · remove-edge · edit-graph
Task Management
tasks · task-update · complete
Code Analysis
extract · analyze · schema · file-summary · code-search · code-snippets · code-failures · code-symptoms · code-trace · code-complexity · code-impact
Graph Queries
query impact · query deps · query path · query topo · query common-cause
AI & Design
design · semantify · advise
History & Refactoring
history list · history save · history diff · history restore · refactor rename · refactor merge · refactor split · refactor extract
Visualization
visual (ASCII, DOT, Mermaid)
All commands support --json for machine-readable output.
MCP Server (Claude, Cursor, VS Code)
Give any MCP-compatible IDE instant access to your architecture:
{
  "mcpServers": {
    "gid": {
      "command": "npx",
      "args": ["graph-indexed-development-mcp"]
    }
  }
}
Then ask your AI:
- "What would break if I change UserService?" → gid_query_impact
- "Show me the project health" → gid_advise
- "Design a notification system" → gid_design
- "What are the ready tasks?" → gid_tasks
The Graph
Every GID project has a .gid/graph.yml:
project:
  name: my-app
nodes:
  - id: auth-service
    title: Authentication Service
    status: in_progress
    node_type: component
    metadata:
      design_doc: ".gid/features/auth/design.md"
  - id: add-oauth
    title: Add OAuth support
    status: todo
    metadata:
      design_ref: "3.2"             # Links to § 3.2 of the design doc
      satisfies: ["GOAL-auth.3"]    # Traces to requirement
edges:
  - from: add-oauth
    to: auth-service
    relation: implements
  - from: add-oauth
    to: user-model
    relation: depends_on
Nodes are tasks, components, features, or code entities. Edges define relationships: depends_on, implements, calls, imports, tested_by.
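As an illustration of what a structural check over such a graph might involve, the sketch below flags edges that point at undeclared nodes or use an unlisted relation. It is not the logic behind `gid validate`; the `validate` function and its message format are hypothetical, and the graph is written as a plain dict to avoid depending on a YAML parser.

```python
# The example graph from above as a Python dict (titles and metadata omitted).
GRAPH = {
    "nodes": [{"id": "auth-service"}, {"id": "add-oauth"}],
    "edges": [
        {"from": "add-oauth", "to": "auth-service", "relation": "implements"},
        {"from": "add-oauth", "to": "user-model", "relation": "depends_on"},
    ],
}
# Relations listed in the text above.
KNOWN_RELATIONS = {"depends_on", "implements", "calls", "imports", "tested_by"}

def validate(graph):
    """Return human-readable problems; an empty list means the graph is consistent."""
    ids = {node["id"] for node in graph["nodes"]}
    problems = []
    for edge in graph["edges"]:
        label = f"edge {edge['from']} -> {edge['to']}"
        for end in ("from", "to"):
            if edge[end] not in ids:
                problems.append(f"{label}: unknown node '{edge[end]}'")
        if edge["relation"] not in KNOWN_RELATIONS:
            problems.append(f"{label}: unknown relation '{edge['relation']}'")
    return problems

PROBLEMS = validate(GRAPH)
print(PROBLEMS)
```

Run against the excerpt as printed, the check reports `user-model`, which is referenced by an edge but not declared in the nodes shown; in a complete graph file that node would presumably be defined.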
How It Compares
| Capability | GID | Cursor/Copilot | Devin | Aider |
|---|---|---|---|---|
| AST-level code graph | Yes: tree-sitter, 3 langs | Text search only | No | No |
| Automatic module discovery | Yes: Infomap clustering | No | No | No |
| Impact analysis | Yes: graph traversal | No | No | No |
| Precise context assembly | Yes: graph-driven | Full files | Repo map | Repo map |
| Multi-agent task orchestration | Yes: Harness engine | No | Partial | No |
| Development pipeline enforcement | Yes: Ritual + Gating | No | No | No |
| Task dependency tracking | Yes: DAG + topo sort | No | No | No |
| Source code access control | Yes: Tool Gating | No | No | No |
FAQ
"Won't long context windows (12M tokens, sparse attention) make GID obsolete?"
No. They make GID more valuable, not less.
The intuition that "if the model can fit the whole codebase, you don't need to select files for it" misunderstands what GID actually does. Long context solves capacity. GID provides deterministic architectural truth. They are different layers, and they compose.
Sparse attention selects tokens inside the model. GID selects files, functions, and dependencies at the application layer.
Concretely:
- Determinism vs stochastic reasoning. gid query impact UserService returns the exact set of affected callers via graph traversal. A 12M-context LLM asked the same question gives a plausible answer that varies between runs and silently misses transitive dependencies. Sparse attention has documented failure modes on multi-hop reasoning, and "what does this change break?" is exactly that kind of multi-hop trace.
- Auditability. Edges in the GID graph are typed (calls, imports, satisfies, tests_for). You can explain why a file entered the agent's context. You cannot explain why an LLM's attention landed where it did.
- Build-time vs run-time. GID indexes once and answers queries in milliseconds. Long-context models re-reason from scratch on every prompt, paying the full prefill cost each time.
- Multi-agent coordination. When you split a task across 8 parallel sub-agents, each one needs a focused context, not the whole repo. gid task-context <task-id> gives each agent precisely what it needs.
The right pipeline is: GID narrows the search space at the application layer → long-context model does focused reasoning inside that space. Two selection layers in series outperform either alone: graph traversal catches the structural dependencies sparse attention misses, and the model handles the semantic reasoning the graph can't.
This is the same pattern that played out with databases and compilers: stronger LLMs didn't kill them; they increased the demand for deterministic structural tools that LLMs can call. Architectural ground truth becomes more valuable, not less, when the model layer becomes more powerful and more stochastic.
"Isn't this just RAG with extra steps?"
RAG retrieves passages by semantic similarity. GID retrieves nodes by structural relationship: the things RAG can't index because they only exist as graph edges (call sites, type references, satisfies-this-requirement, blocks-this-task). Both can coexist; they answer different questions.
"Do I have to use the full ritual / harness pipeline?"
No. The graph engine, code extraction, and impact queries work standalone. The Ritual and Harness layers are opt-in for teams that want to enforce design-before-code or run multi-agent task orchestration.
Monorepo Structure
graph-indexed-development/
├── crates/
│   ├── gid-core/              # Core library (59K lines, 1,080 tests)
│   │   └── src/
│   │       ├── code_graph/    # tree-sitter extraction (Rust/TS/Python)
│   │       ├── infer/         # Infomap community detection
│   │       ├── harness/       # Task execution engine
│   │       ├── ritual/        # Development pipeline state machine
│   │       ├── graph.rs       # Core graph types
│   │       ├── query.rs       # Impact/deps/path queries
│   │       ├── working_mem.rs # Change blast radius tracking
│   │       └── ...
│   └── gid-cli/               # CLI binary (39 commands)
├── packages/
│   └── mcp/                   # MCP server (thin TS wrapper)
├── Cargo.toml
└── package.json
Related
- GID Paper: Formal methodology (Zenodo)
- GID Methodology: Specification and examples
- Artifact Layer: File-first model (identity, default Layout, placeholder vocabulary, relation discovery) (ISS-053)
License
MIT. See LICENSE for details.
Author
Toni Tang (@tonitangpotato)
