QEX
Lightweight MCP server for semantic code search
BM25 + optional dense vectors + tree-sitter chunking
English | 中文
QEX is a high-performance MCP server for semantic code search, built in Rust. It combines BM25 full-text search with optional dense vector embeddings for hybrid retrieval, delivering Cursor-quality search from a single ~19 MB binary. Tree-sitter parsing understands code structure (functions, classes, methods), Merkle DAG change detection enables incremental indexing, and everything runs locally with zero cloud dependencies.
What's New
- Pluggable Embedding Backends: Trait-based abstraction over ONNX Runtime (local) and OpenAI API embedding providers, with env var configuration
- Hybrid Search: BM25 + dense vector search with Reciprocal Rank Fusion, for 48% better accuracy than dense-only retrieval
- 10-Language Support: Python, JavaScript, TypeScript, Rust, Go, Java, C, C++, C#, and Markdown via tree-sitter
- Incremental Indexing: Merkle DAG change detection; only re-indexes what changed
- Optional Dense Vectors: snowflake-arctic-embed-s (33 MB, 384-dim, INT8 quantized) via ONNX Runtime, or OpenAI text-embedding-3-small via API
- MCP Native: plugs directly into Claude Code as a tool server via stdio
Why QEX?
Claude Code uses grep + glob for code search: effective but token-hungry, and without semantic understanding. Cursor uses vector embeddings with cloud indexing (~3.5 GB stack). QEX is the middle ground:
- BM25 + Dense Hybrid: 48% better accuracy than dense-only retrieval (Superlinked 2025)
- Tree-sitter Chunking: Understands code structure (functions, classes, methods), not just lines
- Incremental Indexing: Merkle DAG change detection, only re-indexes what changed
- Zero Cloud Dependencies: Everything runs locally via ONNX Runtime
- MCP Native: Plugs directly into Claude Code as a tool server
Quick Start
Install from crates.io:
cargo install qex-mcp
# Add to Claude Code
claude mcp add qex --scope user -- ~/.cargo/bin/qex
Build from source:
# Build (BM25-only, ~19 MB)
cargo build --release
# Or with dense vector search (~36 MB)
cargo build --release --features dense
# Or with OpenAI embedding support
cargo build --release --features openai
# Or with all embedding backends
cargo build --release --features "dense,openai"
# Install
cp target/release/qex ~/.local/bin/
# Add to Claude Code
claude mcp add qex --scope user -- ~/.local/bin/qex
That's it. Claude now has access to the search_code and index_codebase tools.
Enable Dense Search (Optional)
Dense search adds semantic understanding: finding "authentication middleware" even when the code says verify_token. Two embedding backends are available:
Option A: Local ONNX Model (Recommended)
Requires the dense feature flag. Zero cloud dependencies.
# Download the embedding model (~33 MB)
./scripts/download-model.sh
# Or via MCP tool (after adding to Claude)
# Claude: "download the embedding model"
Model: snowflake-arctic-embed-s (384-dim, INT8 quantized, 512-token max).
When the model is present, search automatically switches to hybrid mode. No configuration needed.
Option B: OpenAI API Embeddings
Requires the openai feature flag and an API key. Supports any OpenAI-compatible API.
# Build with OpenAI support (can combine with dense)
cargo build --release --features "dense,openai"
# Configure
export QEX_EMBEDDING_PROVIDER=openai
export QEX_OPENAI_API_KEY=sk-... # or OPENAI_API_KEY
See Configuration for all options.
Architecture
Claude Code ──(stdio/JSON-RPC)──▶ qex
                                   │
                  ┌────────────────┼────────────────┐
                  ▼                ▼                ▼
            tree-sitter         tantivy        ort + usearch
             Chunking            BM25          Dense Vectors
            (11 langs)          (<1ms)          (optional)
                  │                │                │
                  └────────┬───────┘                │
                           ▼                        │
                    Ranking Engine ◀────────────────┘
                  (RRF + multi-factor)
                           │
                           ▼
                     Ranked Results
How Search Works
- Query Analysis: Tokenization, stop-word removal, intent detection
- BM25 Search: Full-text search via tantivy with field boosts (name, content, tags, path)
- Dense Search (optional): Embed query via pluggable backend (ONNX or OpenAI) → HNSW cosine similarity → top-k vectors
- Reciprocal Rank Fusion: Merge BM25 and dense results: score = Σ 1/(k + rank)
- Multi-factor Ranking: Re-rank by chunk type, name match, path relevance, tags, docstring presence
- Test Penalty: Down-rank test files (0.7×) to prioritize implementation code
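The fusion step above can be sketched in a few lines. This is an illustrative implementation of Reciprocal Rank Fusion, not qex-core's actual code; k = 60 is the constant from the original RRF formulation, and the value QEX uses is not documented here.

```rust
use std::collections::HashMap;

// Illustrative RRF sketch: fuse two ranked lists of chunk IDs.
// `k` dampens the weight of top ranks (assumed value, not QEX's documented constant).
fn rrf_fuse(bm25: &[&str], dense: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [bm25, dense] {
        for (i, id) in list.iter().enumerate() {
            let rank = (i + 1) as f64; // ranks are 1-based
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    // Sort by fused score, highest first.
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

A result that ranks highly in both lists beats one that tops a single list, which is why the fused ordering can differ from either input.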
How Indexing Works
- File Walking: Respects .gitignore, filters by extension
- Tree-sitter Parsing: Language-aware AST traversal, extracts functions/classes/methods
- Chunk Enrichment: Tags (async, auth, database...), complexity score, docstrings, decorators
- BM25 Indexing: 14-field tantivy schema with per-field boosts
- Dense Indexing (optional): Batch embedding via the Embedder trait (ONNX or OpenAI) → HNSW index
- Merkle Snapshot: SHA-256 DAG for incremental change detection
- Dimension Guard: dense_meta.json tracks provider/model/dimensions; mismatches trigger a full re-index
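To illustrate the snapshot idea (this is not qex-core's implementation): with per-file content hashes recorded at the last index, deciding what to re-index is a comparison against the current tree. The sketch uses std's DefaultHasher as a stand-in for SHA-256 to stay dependency-free, and flattens the Merkle DAG to a per-file map; the real DAG also aggregates directory hashes and handles deletions.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for SHA-256 so the sketch needs no external crates.
fn content_hash(content: &str) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    h.finish()
}

/// Compare a previous snapshot (path -> hash) against current file
/// contents; return only the paths that need re-indexing.
fn changed_files(
    snapshot: &HashMap<String, u64>,
    current: &HashMap<String, String>,
) -> Vec<String> {
    let mut changed = Vec::new();
    for (path, content) in current {
        // Re-index when the file is new or its content hash differs.
        if snapshot.get(path) != Some(&content_hash(content)) {
            changed.push(path.clone());
        }
    }
    changed
}
```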
MCP Tools
index_codebase
Index a project for semantic search.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
| force | boolean | no | Force full re-index (default: false) |
| extensions | string[] | no | Only index specific extensions, e.g. ["py", "rs"] |
Returns file count, chunk count, detected languages, and timing.
search_code
Search the indexed codebase with natural language or keywords.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
| query | string | yes | Search query (natural language or keywords) |
| limit | integer | no | Max results (default: 10) |
| extension_filter | string | no | Filter by extension, e.g. "py" |
Auto-indexes if needed. Returns ranked results with code snippets, file paths, line numbers, and relevance scores.
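As a sketch of what a client sends over stdio (the JSON-RPC envelope follows the MCP tools/call shape; the project path here is a hypothetical example), a search_code invocation might look like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "path": "/home/user/myproject",
      "query": "authentication middleware",
      "limit": 5
    }
  }
}
```

In practice Claude Code constructs this request for you; the arguments map directly onto the parameter table above.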
get_indexing_status
Check if a project is indexed and get stats.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
Returns index status, file/chunk counts, languages, and whether dense search is available.
clear_index
Delete all index data for a project.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
download_model
Download the embedding model for dense search. Requires the dense feature.
| Parameter | Type | Required | Description |
|---|---|---|---|
| force | boolean | no | Re-download even if exists (default: false) |
Supported Languages
| Language | Extensions | Chunk Types |
|---|---|---|
| Python | .py, .pyi | function, method, class, module-level, imports |
| JavaScript | .js | function, method, class, module-level |
| TypeScript | .ts, .tsx | function, method, class, interface, module-level |
| Rust | .rs | function, method, struct, enum, trait, impl, macro |
| Go | .go | function, method, struct, interface |
| Java | .java | method, class, interface, enum |
| C | .c, .h | function, struct |
| C++ | .cpp, .cc, .cxx, .hpp | function, method, class, struct, namespace |
| C# | .cs | method, class, struct, interface, enum, namespace |
| Markdown | .md | section, document |
Crates
| Crate | Description |
|---|---|
| qex-core | Core library: chunking, search, indexing, Merkle DAG |
| qex-mcp | MCP server binary (stdio transport via rmcp) |
Project Structure
qex/
├── Cargo.toml                     # Workspace root
├── scripts/
│   └── download-model.sh          # Model download script
├── crates/
│   ├── qex-core/                  # Core library
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── chunk/                  # Tree-sitter chunking engine
│   │       │   ├── tree_sitter.rs      # AST traversal
│   │       │   ├── multi_language.rs   # Language dispatcher
│   │       │   └── languages/          # 11 language implementations
│   │       ├── search/                 # Search engines
│   │       │   ├── bm25.rs             # Tantivy BM25 index
│   │       │   ├── dense.rs            # HNSW vector index (feature: dense)
│   │       │   ├── embedding.rs        # Embedder trait + ONNX backend (feature: dense|openai)
│   │       │   ├── openai_embedder.rs  # OpenAI API backend (feature: openai)
│   │       │   ├── hybrid.rs           # Reciprocal Rank Fusion (feature: dense)
│   │       │   ├── ranking.rs          # Multi-factor re-ranking
│   │       │   └── query.rs            # Query analysis
│   │       ├── index/                  # Incremental indexer
│   │       │   ├── mod.rs              # Main indexing logic
│   │       │   └── storage.rs          # Project storage layout
│   │       ├── merkle/                 # Change detection
│   │       │   ├── mod.rs              # Merkle DAG
│   │       │   ├── change_detector.rs
│   │       │   └── snapshot.rs
│   │       └── ignore.rs               # Gitignore-aware file walking
│   │
│   └── qex-mcp/                   # MCP server binary
│       └── src/
│           ├── main.rs            # Entry point, stdio transport
│           ├── server.rs          # Tool handlers
│           ├── tools.rs           # Parameter schemas
│           └── config.rs          # CLI args
│
└── tests/fixtures/                # Test source files
Storage
All data is stored locally under ~/.qex/:
~/.qex/
├── projects/
│   └── {name}_{hash}/             # Per-project index
│       ├── tantivy/               # BM25 index
│       ├── dense/                 # Vector index (optional)
│       │   ├── dense.usearch          # HNSW index file
│       │   ├── dense_mapping.json     # Chunk ID → vector key mapping
│       │   └── dense_meta.json        # Provider/model/dimensions guard
│       ├── snapshot.json          # Merkle DAG
│       └── stats.json             # Index stats
│
└── models/
    └── arctic-embed-s/            # Embedding model (optional)
        ├── model.onnx             # 33 MB, INT8 quantized
        └── tokenizer.json
Embedding Backends
QEX uses a pluggable Embedder trait to support multiple embedding providers. The backend is selected via the QEX_EMBEDDING_PROVIDER environment variable.
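The trait itself isn't reproduced in this document, so the following is a hypothetical sketch of its shape, consistent with the behavior described here (batch embedding, plus dimension reporting for the mismatch guard); the actual signatures in qex-core may differ.

```rust
/// Hypothetical sketch of a pluggable embedding trait (not qex-core's
/// actual definition). Real backends would wrap ONNX Runtime or the
/// OpenAI API behind this interface.
trait Embedder {
    /// Embed a batch of texts into fixed-size vectors.
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String>;
    /// Vector dimensionality, recorded for the dimension mismatch guard.
    fn dimensions(&self) -> usize;
}

/// Toy backend returning zero vectors, standing in for a real provider.
struct DummyEmbedder {
    dims: usize,
}

impl Embedder for DummyEmbedder {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|_| vec![0.0; self.dims]).collect())
    }
    fn dimensions(&self) -> usize {
        self.dims
    }
}
```

Keeping the trait object-safe like this lets the indexer hold a boxed backend chosen at startup from QEX_EMBEDDING_PROVIDER without caring which provider is behind it.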
ONNX Runtime (default)
Local inference with zero cloud dependencies. Requires the dense feature flag.
| Variable | Default | Description |
|---|---|---|
| QEX_EMBEDDING_PROVIDER | onnx | Set to onnx (or omit) |
| QEX_ONNX_MODEL_DIR | ~/.qex/models/arctic-embed-s | Override model directory |
OpenAI API
Cloud-based embeddings via the OpenAI API (or any compatible API like Ollama, LiteLLM, Azure). Requires the openai feature flag.
| Variable | Default | Description |
|---|---|---|
| QEX_EMBEDDING_PROVIDER | (none) | Set to openai |
| QEX_OPENAI_API_KEY | (none) | API key (also reads OPENAI_API_KEY) |
| QEX_OPENAI_MODEL | text-embedding-3-small | Model name |
| QEX_OPENAI_BASE_URL | https://api.openai.com/v1 | API base URL |
| QEX_OPENAI_DIMENSIONS | auto | Override dimensions for unknown models |
Security features:
- SSRF protection: only HTTPS or http://localhost URLs are allowed for the base URL
- API key sanitization: keys are never leaked in error messages
- Typed retry: exponential backoff (1s, 2s, 4s) on 429/5xx/timeout/connection errors
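The retry policy above reduces to two small decisions: how long to wait, and whether a failure is worth retrying. An illustrative sketch (not qex-core's actual code; timeouts and connection errors surface as transport errors rather than status codes, so only the status side is shown):

```rust
use std::time::Duration;

/// Delay schedule matching the documented backoff: 1s, 2s, 4s
/// for attempts 0, 1, 2. (Illustrative sketch.)
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt)
}

/// Which HTTP statuses warrant a retry, per the docs: 429 and 5xx.
fn is_retryable_status(status: u16) -> bool {
    status == 429 || (500..=599).contains(&status)
}
```

Client errors like 401 fail fast: retrying a bad API key only burns time and rate limit.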
Compatible APIs: Any OpenAI-compatible embeddings endpoint works. Set QEX_OPENAI_BASE_URL to your provider's URL:
# Ollama
export QEX_OPENAI_BASE_URL=http://localhost:11434/v1
# Azure OpenAI
export QEX_OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-model
Dimension Mismatch Guard
When switching embedding providers or models, QEX detects the mismatch via dense_meta.json and automatically triggers a full re-index. This prevents silent search quality degradation from mismatched vector spaces.
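The guard logic is conceptually a structural comparison of the stored metadata against the active backend. A minimal sketch, assuming dense_meta.json deserializes to a struct with these three fields (the real field names and serialization in qex-core may differ):

```rust
/// Hypothetical shape of the dense_meta.json contents.
#[derive(Debug, PartialEq)]
struct DenseMeta {
    provider: String,   // e.g. "onnx" or "openai"
    model: String,      // e.g. "arctic-embed-s"
    dimensions: usize,  // e.g. 384
}

/// A full re-index is required whenever any identity field changes:
/// vectors from different providers, models, or dimensionalities
/// live in incompatible vector spaces.
fn needs_full_reindex(stored: &DenseMeta, active: &DenseMeta) -> bool {
    stored != active
}
```

Comparing all three fields matters: even with identical dimensions, two different models produce vectors that are not meaningfully comparable, so dimension equality alone is not enough.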
Build & Test
# Run tests (BM25-only)
cargo test # 41 tests
# Run tests (with dense search)
cargo test --features dense # 48 tests
# Run tests (with OpenAI embedder)
cargo test --features openai # 50 tests
# Run tests (all features)
cargo test --features "dense,openai" # 55 tests
# Build for release
cargo build --release # ~19 MB binary
cargo build --release --features dense # ~36 MB binary
cargo build --release --features "dense,openai" # All backends
Key Dependencies
| Crate | Version | Purpose |
|---|---|---|
| tantivy | 0.22 | BM25 full-text search |
| tree-sitter | 0.24 | Code parsing (11 languages) |
| rmcp | 0.17 | MCP server framework (stdio) |
| rusqlite | 0.32 | SQLite metadata (bundled) |
| ignore | 0.4 | Gitignore-compatible file walking |
| rayon | 1.10 | Parallel chunking |
| ort | 2.0.0-rc.11 | ONNX Runtime (optional, dense) |
| usearch | 2.24 | HNSW vector index (optional, dense) |
| tokenizers | 0.22 | HuggingFace tokenizer (optional, dense) |
| ureq | 3 | Sync HTTP client (optional, openai) |
Performance
Benchmarked on an Apple Silicon Mac:
| Metric | Value |
|---|---|
| Full index (400 chunks) | ~20s with dense, ~2s BM25-only |
| Incremental index (no changes) | <100ms |
| BM25 search | <5ms |
| Hybrid search | ~50ms (includes embedding) |
| Binary size | 19 MB (BM25) / 36 MB (dense) |
| Model size | 33 MB (INT8 quantized) |