code-intelligence
This server indexes your codebase locally to provide fast, semantic, and structure-aware code navigation to tools like Claude Code, OpenCode, Trae, and Cursor.
Installation
npx @iceinvein/code-intelligence-mcp
Code Intelligence MCP Server
Give your AI coding agent a deep understanding of your codebase.
A local code indexing engine that gives LLM agents like Claude Code, Cursor, Trae, and OpenCode semantic search, call graphs, type hierarchies, and impact analysis across your codebase. Written in Rust with Metal GPU acceleration.
Zero config. Runs via npx. Indexes in the background.
Install
Claude Code
claude mcp add code-intelligence -- npx -y @iceinvein/code-intelligence-mcp
Or add manually to ~/.claude.json:
{
"mcpServers": {
"code-intelligence": {
"command": "npx",
"args": ["-y", "@iceinvein/code-intelligence-mcp"],
"env": {}
}
}
}
Cursor
Add to .cursor/mcp.json:
{
"mcpServers": {
"code-intelligence": {
"command": "npx",
"args": ["-y", "@iceinvein/code-intelligence-mcp"]
}
}
}
OpenCode / Trae
Add to opencode.json:
{
"mcp": {
"code-intelligence": {
"type": "local",
"command": ["npx", "-y", "@iceinvein/code-intelligence-mcp"],
"enabled": true
}
}
}
On first launch, the server downloads three models, ~3.2 GB total: the embedding model (Jina Code 1.5b, ~1.5 GB), the description LLM (Qwen2.5-Coder-1.5B, ~1.0 GB), and the cross-encoder reranker (bge-reranker-v2-m3, ~600 MB). Indexing then runs in the background. Models are cached in ~/.code-intelligence/models/.
What It Does
Unlike basic text search (grep/ripgrep), this server builds a local knowledge graph of your code and exposes it through 32 MCP tools.
| Capability | How It Works |
|---|---|
| Hybrid search | BM25 keyword search (Tantivy) + semantic vector search (LanceDB, jina-code-embeddings-1.5b, 1536-dim Matryoshka) merged via Reciprocal Rank Fusion |
| Cross-encoder reranking | bge-reranker-v2-m3 re-scores top candidates (llama.cpp + Metal) for precision tuning |
| On-device LLM descriptions | Qwen2.5-Coder-1.5B generates natural-language summaries for every symbol, bridging the gap between how you search ("auth handler") and how code is named (authenticate_request) |
| Graph intelligence | Call hierarchies, type graphs, dependency trees, and PageRank-based importance scoring |
| Impact analysis | Find all code affected by a change, with optional git co-change history for confidence scoring |
| Smart ranking | Test detection, export boosting, directory semantics, intent detection, edge expansion, framework-pattern injection, score-gap filtering, sub-query coverage |
| Multi-repo | Index and search across multiple repositories simultaneously, including cross-repo dependency exploration |
| Auto-reindex | OS-native file watching (FSEvents) keeps the index fresh as you code |
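The hybrid search row above merges a keyword ranking and a vector ranking with Reciprocal Rank Fusion. The following is a minimal sketch of that general technique in Python, not the server's Rust implementation. The constant `k=60` is the value commonly used with RRF and is an assumption here, and the symbol names are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list using RRF.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical symbol names, for illustration only.
keyword_hits = ["authenticate_request", "AuthConfig", "login"]
vector_hits = ["login", "authenticate_request", "verify_token"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, the BM25 and vector scores never need to be put on a common scale before merging.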
Tools (32)
Upgrade note (3.0.0): search_code no longer assembles a context markdown bundle by default. Pass context: "snippets" for compact per-hit code, or context: "full" to restore the v2 behavior. See Migration below.
Search & Navigation
| Tool | What It Does |
|---|---|
| search_code | Semantic + keyword hybrid search. Handles natural language ("how does auth work?") and structural queries ("class User"). Pass context: "snippets" or "full" to receive source code alongside hits. |
| get_definition | Jump to a symbol's full definition |
| find_references | Find all usages of a function, class, or variable |
| get_call_hierarchy | Upstream callers and downstream callees |
| get_type_graph | Inheritance chains, type aliases, implements relationships |
| explore_dependency_graph | Module-level import/export dependencies |
| get_file_symbols | All symbols defined in a file |
| get_usage_examples | Real-world usage examples from the codebase |
| get_context_bundle | Pre-assembled context bundle (definitions, call chains, tests, similar code) for a task description, in one call |
Analysis
| Tool | What It Does |
|---|---|
| find_affected_code | Reverse dependency analysis: what breaks if this changes? |
| predict_impact | Like find_affected_code but also factors in git co-change history for confidence scoring |
| trace_data_flow | Follow variable reads and writes through the code |
| find_similar_code | Semantically similar code to a given symbol |
| get_similarity_cluster | Symbols in the same semantic cluster |
| find_duplicates | Groups of semantically near-duplicate symbols based on embedding clusters |
| find_dead_code | Symbols with zero incoming references, which are candidates for safe removal |
| explain_search | Scoring breakdown explaining why results ranked as they did |
| summarize_file | File summary with symbol counts and key exports |
| get_module_summary | All exported symbols from a module with signatures |
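To make the graph-based rows concrete, here is a rough Python sketch of the two ideas behind find_affected_code and find_dead_code: walking reference edges in reverse, and listing symbols with no incoming edges. The edge list, symbol names, and the `entry_points` parameter are hypothetical. The server's real analysis runs over its own index and may differ.

```python
from collections import defaultdict, deque


def build_reverse_graph(edges):
    """Map each symbol to the set of symbols that reference it."""
    incoming = defaultdict(set)
    for source, target in edges:
        incoming[target].add(source)
    return incoming


def affected_by(symbol, edges):
    """Return every symbol that transitively depends on `symbol` (BFS)."""
    incoming = build_reverse_graph(edges)
    seen = set()
    queue = deque([symbol])
    while queue:
        current = queue.popleft()
        for dependent in incoming.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(symbol)  # a cycle should not report the symbol itself
    return seen


def dead_code_candidates(symbols, edges, entry_points=()):
    """Return symbols with zero incoming references, minus known entry points."""
    incoming = build_reverse_graph(edges)
    return sorted(
        s for s in symbols if not incoming.get(s) and s not in entry_points
    )


# Hypothetical (caller, callee) edges, for illustration only.
edges = [
    ("main", "handle_login"),
    ("handle_login", "authenticate_request"),
    ("authenticate_request", "verify_token"),
    ("test_auth", "authenticate_request"),
]
symbols = ["main", "handle_login", "authenticate_request",
           "verify_token", "test_auth", "legacy_hash"]

print(affected_by("verify_token", edges))
print(dead_code_candidates(symbols, edges, entry_points={"main", "test_auth"}))
```

Entry points and tests have no callers by design, which is why a zero-reference result is only a candidate for removal rather than proof that the code is unused.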
Testing, Frameworks & Discovery
| Tool | What It Does |
|---|---|
| find_tests_for_symbol | Find tests that cover a given symbol |
| search_todos | Search TODO/FIXME comments |
| search_decorators | Find TypeScript/JavaScript decorators |
| search_framework_patterns | Find framework-specific patterns (routes, middleware, WebSocket handlers) |
| find_undocumented_symbols | Symbols missing LLM-generated descriptions, ranked by importance |
| find_stale_descriptions | Symbols whose LLM descriptions are out of sync with the current code (content-hash mismatch) |
Cross-Repo (standalone mode)
| Tool | What It Does |
|---|---|
| search_across_repos | Run a single query across all indexed repos, merged by score |
| explore_cross_repo_dependencies | Walk dependency edges that cross repo boundaries |
Index Management & Learning
| Tool | What It Does |
|---|---|
| hydrate_symbols | Load full context for a set of symbol IDs |
| report_selection | Feedback loop: tell the server which result was useful |
| report_file_access | Tell the server when a file is viewed/edited; feeds file-affinity ranking |
| refresh_index | Manually trigger re-indexing |
| get_index_stats | Index statistics (files, symbols, edges, last updated) |
Supported Languages
Rust, TypeScript/TSX, JavaScript, Python, Go, Java, C, C++
Standalone Mode (Multi-Client)
By default each MCP client spawns its own server process. If you run multiple clients (e.g. 5 Claude Code sessions across 3 repos), standalone mode loads the models once and shares them:
npx @iceinvein/code-intelligence-mcp-standalone
Then point all clients to http://localhost:3333/mcp:
Claude Code
claude mcp add --transport http code-intelligence http://localhost:3333/mcp
Cursor
{
"mcpServers": {
"code-intelligence": {
"url": "http://localhost:3333/mcp"
}
}
}
OpenCode
{
"mcp": {
"code-intelligence": {
"type": "remote",
"url": "http://localhost:3333/mcp",
"enabled": true
}
}
}
The server auto-detects each client's workspace via the MCP roots capability. Separate indexes are maintained per repo; models are shared.
┌──────────┐  ┌──────────┐  ┌──────────┐
│ Claude A │  │ Cursor B │  │  Trae C  │
└────┬─────┘  └────┬─────┘  └────┬─────┘
     │             │             │
     └──────── POST /mcp ────────┘
                   │
      ┌────────────┴────────────┐
      │    Standalone Server    │
      │  (shared models, once)  │
      ├─────────────────────────┤
      │ Repo A   Repo B   Repo C│
      │ indexes  indexes indexes│
      └─────────────────────────┘
Configuration
Works out of the box with no configuration. All settings are optional environment variables.
Environment variables
Core:
| Variable | Default | Description |
|---|---|---|
| WATCH_MODE | true | Auto-reindex on file changes |
| INDEX_PATTERNS | **/*.ts,**/*.rs,... | Glob patterns to index |
| EXCLUDE_PATTERNS | **/node_modules/**,... | Glob patterns to exclude |
| REPO_ROOTS | (none) | Comma-separated paths for multi-repo |
Embeddings:
| Variable | Default | Description |
|---|---|---|
| EMBEDDINGS_BACKEND | llamacpp | llamacpp or hash (fast testing, no model download) |
| EMBEDDINGS_DEVICE | metal | metal (GPU) or cpu |
Ranking:
| Variable | Default | Description |
|---|---|---|
| HYBRID_ALPHA | 0.7 | Vector vs keyword weight (0 = all keyword, 1 = all vector) |
| RANK_EXPORTED_BOOST | 1.0 | Boost for exported/public symbols |
| RANK_TEST_PENALTY | 0.1 | Penalty multiplier for test files |
| RANK_POPULARITY_WEIGHT | 0.05 | PageRank influence on ranking |
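As a rough intuition for HYBRID_ALPHA, the sketch below treats it as the weight in a convex combination of a vector score and a keyword score. This is an assumption made for illustration: the table only states the endpoints (0 = all keyword, 1 = all vector), and how the server applies the weight internally is not documented here. The function name and the input scores are hypothetical.

```python
import math


def blended_score(vector_score, keyword_score, alpha=0.7):
    """Blend two scores in [0, 1]; alpha is the weight given to the vector side.

    Assumed form: alpha * vector + (1 - alpha) * keyword. This is a sketch of
    one plausible reading of HYBRID_ALPHA, not the server's actual formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


# Hypothetical normalized scores for a single hit.
vector_score, keyword_score = 0.9, 0.2

print(blended_score(vector_score, keyword_score, alpha=0.0))  # keyword only
print(blended_score(vector_score, keyword_score, alpha=1.0))  # vector only
print(math.isclose(blended_score(vector_score, keyword_score), 0.69))
```

Under this reading, lowering the value favors exact identifier matches, while raising it favors natural-language queries.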
Context:
| Variable | Default | Description |
|---|---|---|
| MAX_CONTEXT_TOKENS | 8192 | Token budget for assembled context |
| MAX_CONTEXT_BYTES | 200000 | Byte-based fallback limit |
Learning (off by default):
| Variable | Default | Description |
|---|---|---|
| LEARNING_ENABLED | false | Track user selections to personalize results |
| LEARNING_SELECTION_BOOST | 0.1 | Max boost from selection history |
| LEARNING_FILE_AFFINITY_BOOST | 0.05 | Max boost from file access frequency |
Standalone server config (~/.code-intelligence/server.toml)
[server]
host = "127.0.0.1"
port = 3333
[embeddings]
backend = "llamacpp"
device = "metal"
[repos.defaults]
index_patterns = "**/*.ts,**/*.tsx,**/*.rs,**/*.py,**/*.go"
exclude_patterns = "**/node_modules/**,**/dist/**,**/.git/**"
watch_mode = true
[lifecycle]
warm_ttl_seconds = 300 # How long idle repos stay in memory
Priority: CLI flags > Environment variables > server.toml > Defaults
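The priority order above can be pictured as a layered lookup where the first source that defines a setting wins. The Python sketch below uses `collections.ChainMap` for this. The key names (`host`, `port`, `watch_mode`) are simplified, hypothetical stand-ins for however the server names its settings, and the example values are invented apart from the documented defaults.

```python
from collections import ChainMap

# Defaults taken from the server.toml example; key names are simplified.
DEFAULTS = {"host": "127.0.0.1", "port": 3333, "watch_mode": True}


def resolve_settings(cli_flags, env_vars, toml_values, defaults=DEFAULTS):
    """Resolve settings with priority: CLI flags > env vars > server.toml > defaults.

    ChainMap searches its mappings left to right, so the first mapping that
    contains a key supplies the value.
    """
    return dict(ChainMap(cli_flags, env_vars, toml_values, defaults))


# Hypothetical inputs, already parsed into plain dictionaries.
settings = resolve_settings(
    cli_flags={"port": 4000},
    env_vars={"port": 3500, "watch_mode": False},
    toml_values={"host": "0.0.0.0", "port": 3400},
)
print(settings)
```

Here the port comes from the CLI flag, watch_mode from the environment, and host from server.toml, since no higher-priority source sets them.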
How Ranking Works
The search pipeline runs keyword search (BM25) and semantic vector search in parallel, merges them with Reciprocal Rank Fusion, then applies structural signals:
- Intent detection: "struct User" boosts definitions, "who calls login" triggers graph lookup, "User schema" boosts models 50-75x
- Query decomposition: "authentication and authorization" automatically splits into sub-queries; sub-query coverage ensures each term has at least one matching result
- LLM-enriched index: on-device Qwen2.5-Coder generates descriptions bridging vocabulary gaps between how you search and how code is named
- Cross-encoder reranker: bge-reranker-v2-m3 re-scores top candidates for precision (on by default, disable with RERANKER_ENABLED=false)
- PageRank: graph-based importance scoring identifies central, heavily-used symbols
- Morphological expansion: watch matches watcher, index matches reindex
- Framework-pattern injection: route, middleware, and handler patterns surface alongside symbol matches
- Multi-layer test detection: file paths, symbol names, and AST-level analysis (#[test], mod tests)
- Edge expansion: high-ranking symbols pull in structurally related code (callers, type members)
- Export boost: public API surface ranks above private helpers
- Score-gap detection: drops trailing results that fall off a relevance cliff
- Token-aware truncation: context assembly keeps query-relevant lines within token budgets
For the full deep dive, see System Architecture.
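Of the signals listed above, score-gap detection is the easiest to illustrate in isolation. The sketch below cuts a ranked list at the first hit whose score falls below a fraction of the previous hit's score. The threshold `max_drop_ratio=0.5`, the function name, and the sample scores are assumptions for illustration, and the server's actual cutoff rule may differ.

```python
def drop_after_score_gap(hits, max_drop_ratio=0.5):
    """Keep hits until one scores below max_drop_ratio times its predecessor.

    `hits` is a list of (id, score) pairs sorted by descending positive score.
    """
    if not hits:
        return []
    kept = [hits[0]]
    for previous, current in zip(hits, hits[1:]):
        if current[1] < previous[1] * max_drop_ratio:
            break  # relevance cliff: discard this hit and everything after it
        kept.append(current)
    return kept


# Hypothetical ranked hits, for illustration only.
ranked = [("a", 0.92), ("b", 0.88), ("c", 0.81), ("d", 0.30), ("e", 0.28)]
kept = drop_after_score_gap(ranked)
print([hit_id for hit_id, _ in kept])
```

Comparing each hit to its predecessor, rather than to a fixed minimum score, lets the cutoff adapt to queries whose scores are uniformly high or uniformly low.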
Data Storage
All data lives in ~/.code-intelligence/:
~/.code-intelligence/
├── models/                              # Shared across repos (~3.2 GB total)
│   ├── jina-code-embeddings-1.5b-gguf/  # ~1.5 GB, 1536-dim Matryoshka, Q8_0
│   ├── qwen2.5-coder-1.5b-gguf/         # ~1.0 GB, Q4_K_M, description LLM
│   └── bge-reranker-v2-m3-gguf/         # ~600 MB, Q8_0, cross-encoder reranker
├── repos/
│   ├── registry.json                    # Tracks all known repos
│   └── <hash>/                          # Per-repo (SHA256 of repo path)
│       ├── code-intelligence.db         # SQLite (symbols, edges, metadata, descriptions)
│       ├── tantivy-index/               # BM25 full-text search
│       └── vectors/                     # LanceDB vector embeddings
├── logs/
└── server.toml                          # Standalone config (optional)
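The per-repo directory name is described above as a SHA256 of the repo path. As a minimal sketch of what that could look like, the Python below hashes the path string and joins it under repos/. The exact input (absolute path, UTF-8 bytes, no normalization) and the use of the full hex digest are assumptions, so the result may not match the directory names the server actually creates.

```python
import hashlib
from pathlib import PurePosixPath


def repo_index_dir(repo_path, base="~/.code-intelligence"):
    """Return a per-repo index directory derived from the repo path.

    Assumption: the directory name is the full hex SHA-256 digest of the
    path string encoded as UTF-8.
    """
    digest = hashlib.sha256(repo_path.encode("utf-8")).hexdigest()
    return PurePosixPath(base) / "repos" / digest


# Hypothetical repo paths, for illustration only.
first = repo_index_dir("/home/dev/projects/api")
second = repo_index_dir("/home/dev/projects/web")
print(first)
print(first != second)
```

Keying the directory on the path rather than the repo name keeps two checkouts with the same folder name from sharing an index.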
Development
cargo build --release
cargo test # Full test suite
EMBEDDINGS_BACKEND=hash cargo test # Fast (no model download)
./scripts/start_mcp.sh # Start MCP server
Project structure
src/
├── indexer/     # File scanning, Tree-Sitter parsing, symbol extraction, embeddings, LLM descriptions
├── storage/     # SQLite, Tantivy (BM25), LanceDB (vectors)
├── retrieval/   # Hybrid search, ranking signals, RRF, context assembly, reranker, HyDE
├── graph/       # PageRank, call hierarchy, type graphs, dependency graph
├── handlers/    # MCP tool implementations
├── server/      # MCP protocol routing (embedded + standalone)
├── tools/       # Tool definitions (32 MCP tools)
├── embeddings/  # jina-code-embeddings-1.5b (GGUF via llama.cpp + Metal)
├── llm/         # Qwen2.5-Coder-1.5B (GGUF via llama.cpp + Metal)
├── reranker/    # bge-reranker-v2-m3 cross-encoder (GGUF via llama.cpp + Metal)
└── path/        # UTF-8 path normalization (camino)
Migration: v2 → v3
search_code previously returned both ranked hits and a context markdown bundle (source code for top hits + auto-expanded "Examples" / "Related" symbols). The bundle was always assembled, even when callers only needed the ranked list, and could exceed 30 KB per call.
In v3.0.0, search_code is a discovery tool by default. It returns hits only. Source code is opt-in via the new context parameter:
| context value | What you get | Typical size (limit=5) |
|---|---|---|
| "none" (default) | hits array only: no source code, no graph expansion | ~600 B |
| "snippets" | hits with a snippet field on each (signature + first 8 body lines) | ~2-4 KB |
| "full" | Legacy v2 behavior: context markdown bundle with graph expansion | ~15 KB |
To restore v2 behavior, pass context: "full" on every call.
For most agent workflows, "snippets" is the recommended setting: enough code to ground the next decision, without rendering an entire markdown bundle. Agents that need full source for selected hits should call hydrate_symbols(ids[]) after search_code.
The web UI and cross-repo aggregator continue to request context: "full" internally; only the public MCP search_code tool default has changed.
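For clients that build MCP requests by hand, the sketch below shows roughly what a search_code call with the new parameter could look like as a JSON-RPC `tools/call` message, followed by a hydrate_symbols call for selected hits. The `context`, `limit`, and `ids` names come from this section. The `query` argument name and the sample symbol IDs are assumptions, so check the tool's published schema before relying on them.

```python
import json

VALID_CONTEXT = ("none", "snippets", "full")


def tool_call(name, arguments, request_id=1):
    """Build an MCP tools/call request as a plain dictionary."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def search_code_request(query, context="none", limit=5, request_id=1):
    """Build a search_code call; `query` is an assumed argument name."""
    if context not in VALID_CONTEXT:
        raise ValueError(f"context must be one of {VALID_CONTEXT}")
    arguments = {"query": query, "limit": limit, "context": context}
    return tool_call("search_code", arguments, request_id)


# Discover with compact snippets, then hydrate only the hits that matter.
request = search_code_request("how does auth work?", context="snippets")
followup = tool_call("hydrate_symbols", {"ids": ["sym-1", "sym-2"]}, request_id=2)

print(json.dumps(request, indent=2))
print(json.dumps(followup, indent=2))
```

Omitting `context` entirely is equivalent to "none" in v3, so v2 callers that relied on the bundle need to pass "full" explicitly.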
License
MIT
