QEX
Lightweight MCP server for semantic code search
BM25 + optional dense vectors + tree-sitter chunking
English | 中文
QEX is a high-performance MCP server for semantic code search, built in Rust. It combines BM25 full-text search with optional dense vector embeddings for hybrid retrieval, delivering Cursor-quality search from a single ~19 MB binary. Tree-sitter parsing understands code structure (functions, classes, methods), Merkle DAG change detection enables incremental indexing, and everything runs locally with zero cloud dependencies.
What's New
- Pluggable Embedding Backends: Trait-based abstraction over ONNX Runtime (local) and OpenAI API embedding providers, with env var configuration
- Hybrid Search: BM25 + dense vector search with Reciprocal Rank Fusion, for 48% better accuracy than dense-only retrieval
- 10-Language Support: Python, JavaScript, TypeScript, Rust, Go, Java, C, C++, C#, and Markdown via tree-sitter
- Incremental Indexing: Merkle DAG change detection; only re-indexes what changed
- Optional Dense Vectors: snowflake-arctic-embed-s (33 MB, 384-dim, INT8 quantized) via ONNX Runtime, or OpenAI text-embedding-3-small via API
- MCP Native: plugs directly into Claude Code as a tool server via stdio
Why QEX?
Claude Code uses grep + glob for code search: effective but token-hungry, and without semantic understanding. Cursor uses vector embeddings with cloud indexing (~3.5 GB stack). QEX is the middle ground:
- BM25 + Dense Hybrid: 48% better accuracy than dense-only retrieval (Superlinked 2025)
- Tree-sitter Chunking: Understands code structure (functions, classes, methods), not just lines
- Incremental Indexing: Merkle DAG change detection, only re-indexes what changed
- Zero Cloud Dependencies: Everything runs locally via ONNX Runtime
- MCP Native: Plugs directly into Claude Code as a tool server
Quick Start
Install from crates.io:
cargo install qex-mcp
# Add to Claude Code
claude mcp add qex --scope user -- ~/.cargo/bin/qex
Build from source:
# Build (BM25-only, ~19 MB)
cargo build --release
# Or with dense vector search (~36 MB)
cargo build --release --features dense
# Or with OpenAI embedding support
cargo build --release --features openai
# Or with all embedding backends
cargo build --release --features "dense,openai"
# Install
cp target/release/qex ~/.local/bin/
# Add to Claude Code
claude mcp add qex --scope user -- ~/.local/bin/qex
That's it. Claude now has access to the search_code and index_codebase tools.
Enable Dense Search (Optional)
Dense search adds semantic understanding: finding "authentication middleware" even when the code says verify_token. Two embedding backends are available:
Option A: Local ONNX Model (Recommended)
Requires the dense feature flag. Zero cloud dependencies.
# Download the embedding model (~33 MB)
./scripts/download-model.sh
# Or via MCP tool (after adding to Claude)
# Claude: "download the embedding model"
Model: snowflake-arctic-embed-s (384-dim, INT8 quantized, 512-token max).
When the model is present, search automatically switches to hybrid mode. No configuration needed.
Option B: OpenAI API Embeddings
Requires the openai feature flag and an API key. Supports any OpenAI-compatible API.
# Build with OpenAI support (can combine with dense)
cargo build --release --features "dense,openai"
# Configure
export QEX_EMBEDDING_PROVIDER=openai
export QEX_OPENAI_API_KEY=sk-... # or OPENAI_API_KEY
See Configuration for all options.
Architecture
Claude Code ──(stdio/JSON-RPC)──▶ qex
                                   │
                  ┌────────────────┼────────────────┐
                  ▼                ▼                ▼
            tree-sitter         tantivy        ort + usearch
             Chunking            BM25          Dense Vectors
            (11 langs)          (<1ms)          (optional)
                  │                │                │
                  └────────┬───────┘                │
                           ▼                        │
                    Ranking Engine ◀────────────────┘
                  (RRF + multi-factor)
                           │
                           ▼
                     Ranked Results
How Search Works
- Query Analysis: Tokenization, stop-word removal, intent detection
- BM25 Search: Full-text search via tantivy with field boosts (name, content, tags, path)
- Dense Search (optional): Embed query via pluggable backend (ONNX or OpenAI) → HNSW cosine similarity → top-k vectors
- Reciprocal Rank Fusion: Merge BM25 and dense results: score = Σ 1/(k + rank)
- Multi-factor Ranking: Re-rank by chunk type, name match, path relevance, tags, docstring presence
- Test Penalty: Down-rank test files (0.7×) to prioritize implementation code
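The fusion step above can be sketched in a few lines. This is an illustrative implementation of Reciprocal Rank Fusion, not qex-core's actual code; k = 60 is the constant from the original RRF formulation, and the value QEX uses is not documented here.

```rust
use std::collections::HashMap;

// Illustrative RRF sketch: fuse two ranked lists of chunk IDs.
// `k` dampens the weight of top ranks (assumed value, not QEX's documented constant).
fn rrf_fuse(bm25: &[&str], dense: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [bm25, dense] {
        for (i, id) in list.iter().enumerate() {
            let rank = (i + 1) as f64; // ranks are 1-based
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    // Sort by fused score, highest first.
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}
```

A result that ranks highly in both lists beats one that tops a single list, which is why the fused ordering can differ from either input.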
How Indexing Works
- File Walking: Respects .gitignore, filters by extension
- Tree-sitter Parsing: Language-aware AST traversal, extracts functions/classes/methods
- Chunk Enrichment: Tags (async, auth, database...), complexity score, docstrings, decorators
- BM25 Indexing: 14-field tantivy schema with per-field boosts
- Dense Indexing (optional): Batch embedding via the Embedder trait (ONNX or OpenAI) → HNSW index
- Merkle Snapshot: SHA-256 DAG for incremental change detection
- Dimension Guard: dense_meta.json tracks provider/model/dimensions; mismatches trigger a full re-index
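To illustrate the snapshot idea (this is not qex-core's implementation): with per-file content hashes recorded at the last index, deciding what to re-index is a comparison against the current tree. The sketch uses std's DefaultHasher as a stand-in for SHA-256 to stay dependency-free, and flattens the Merkle DAG to a per-file map; the real DAG also aggregates directory hashes and handles deletions.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for SHA-256 so the sketch needs no external crates.
fn content_hash(content: &str) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    h.finish()
}

/// Compare a previous snapshot (path -> hash) against current file
/// contents; return only the paths that need re-indexing.
fn changed_files(
    snapshot: &HashMap<String, u64>,
    current: &HashMap<String, String>,
) -> Vec<String> {
    let mut changed = Vec::new();
    for (path, content) in current {
        // Re-index when the file is new or its content hash differs.
        if snapshot.get(path) != Some(&content_hash(content)) {
            changed.push(path.clone());
        }
    }
    changed
}
```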
MCP Tools
index_codebase
Index a project for semantic search.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
| force | boolean | no | Force full re-index (default: false) |
| extensions | string[] | no | Only index specific extensions, e.g. ["py", "rs"] |
Returns file count, chunk count, detected languages, and timing.
search_code
Search the indexed codebase with natural language or keywords.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
| query | string | yes | Search query (natural language or keywords) |
| limit | integer | no | Max results (default: 10) |
| extension_filter | string | no | Filter by extension, e.g. "py" |
Auto-indexes if needed. Returns ranked results with code snippets, file paths, line numbers, and relevance scores.
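As a sketch of what a client sends over stdio (the JSON-RPC envelope follows the MCP tools/call shape; the project path here is a hypothetical example), a search_code invocation might look like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "path": "/home/user/myproject",
      "query": "authentication middleware",
      "limit": 5
    }
  }
}
```

In practice Claude Code constructs this request for you; the arguments map directly onto the parameter table above.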
get_indexing_status
Check if a project is indexed and get stats.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
Returns index status, file/chunk counts, languages, and whether dense search is available.
clear_index
Delete all index data for a project.
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | yes | Absolute path to project directory |
download_model
Download the embedding model for dense search. Requires the dense feature.
| Parameter | Type | Required | Description |
|---|---|---|---|
| force | boolean | no | Re-download even if exists (default: false) |
Supported Languages
| Language | Extensions | Chunk Types |
|---|---|---|
| Python | .py, .pyi | function, method, class, module-level, imports |
| JavaScript | .js | function, method, class, module-level |
| TypeScript | .ts, .tsx | function, method, class, interface, module-level |
| Rust | .rs | function, method, struct, enum, trait, impl, macro |
| Go | .go | function, method, struct, interface |
| Java | .java | method, class, interface, enum |
| C | .c, .h | function, struct |
| C++ | .cpp, .cc, .cxx, .hpp | function, method, class, struct, namespace |
| C# | .cs | method, class, struct, interface, enum, namespace |
| Markdown | .md | section, document |
Crates
| Crate | Description |
|---|---|
| qex-core | Core library: chunking, search, indexing, Merkle DAG |
| qex-mcp | MCP server binary (stdio transport via rmcp) |
Project Structure
qex/
├── Cargo.toml                     # Workspace root
├── scripts/
│   └── download-model.sh          # Model download script
├── crates/
│   ├── qex-core/                  # Core library
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── chunk/                  # Tree-sitter chunking engine
│   │       │   ├── tree_sitter.rs      # AST traversal
│   │       │   ├── multi_language.rs   # Language dispatcher
│   │       │   └── languages/          # 11 language implementations
│   │       ├── search/                 # Search engines
│   │       │   ├── bm25.rs             # Tantivy BM25 index
│   │       │   ├── dense.rs            # HNSW vector index (feature: dense)
│   │       │   ├── embedding.rs        # Embedder trait + ONNX backend (feature: dense|openai)
│   │       │   ├── openai_embedder.rs  # OpenAI API backend (feature: openai)
│   │       │   ├── hybrid.rs           # Reciprocal Rank Fusion (feature: dense)
│   │       │   ├── ranking.rs          # Multi-factor re-ranking
│   │       │   └── query.rs            # Query analysis
│   │       ├── index/                  # Incremental indexer
│   │       │   ├── mod.rs              # Main indexing logic
│   │       │   └── storage.rs          # Project storage layout
│   │       ├── merkle/                 # Change detection
│   │       │   ├── mod.rs              # Merkle DAG
│   │       │   ├── change_detector.rs
│   │       │   └── snapshot.rs
│   │       └── ignore.rs               # Gitignore-aware file walking
│   │
│   └── qex-mcp/                   # MCP server binary
│       └── src/
│           ├── main.rs            # Entry point, stdio transport
│           ├── server.rs          # Tool handlers
│           ├── tools.rs           # Parameter schemas
│           └── config.rs          # CLI args
│
└── tests/fixtures/                # Test source files
Storage
All data is stored locally under ~/.qex/:
~/.qex/
├── projects/
│   └── {name}_{hash}/             # Per-project index
│       ├── tantivy/               # BM25 index
│       ├── dense/                 # Vector index (optional)
│       │   ├── dense.usearch          # HNSW index file
│       │   ├── dense_mapping.json     # Chunk ID → vector key mapping
│       │   └── dense_meta.json        # Provider/model/dimensions guard
│       ├── snapshot.json          # Merkle DAG
│       └── stats.json             # Index stats
│
└── models/
    └── arctic-embed-s/            # Embedding model (optional)
        ├── model.onnx             # 33 MB, INT8 quantized
        └── tokenizer.json
Embedding Backends
QEX uses a pluggable Embedder trait to support multiple embedding providers. The backend is selected via the QEX_EMBEDDING_PROVIDER environment variable.
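The trait itself isn't reproduced in this document, so the following is a hypothetical sketch of its shape, consistent with the behavior described here (batch embedding, plus dimension reporting for the mismatch guard); the actual signatures in qex-core may differ.

```rust
/// Hypothetical sketch of a pluggable embedding trait (not qex-core's
/// actual definition). Real backends would wrap ONNX Runtime or the
/// OpenAI API behind this interface.
trait Embedder {
    /// Embed a batch of texts into fixed-size vectors.
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String>;
    /// Vector dimensionality, recorded for the dimension mismatch guard.
    fn dimensions(&self) -> usize;
}

/// Toy backend returning zero vectors, standing in for a real provider.
struct DummyEmbedder {
    dims: usize,
}

impl Embedder for DummyEmbedder {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        Ok(texts.iter().map(|_| vec![0.0; self.dims]).collect())
    }
    fn dimensions(&self) -> usize {
        self.dims
    }
}
```

Keeping the trait object-safe like this lets the indexer hold a boxed backend chosen at startup from QEX_EMBEDDING_PROVIDER without caring which provider is behind it.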
ONNX Runtime (default)
Local inference with zero cloud dependencies. Requires the dense feature flag.
| Variable | Default | Description |
|---|---|---|
| QEX_EMBEDDING_PROVIDER | onnx | Set to onnx (or omit) |
| QEX_ONNX_MODEL_DIR | ~/.qex/models/arctic-embed-s | Override model directory |
OpenAI API
Cloud-based embeddings via the OpenAI API (or any compatible API like Ollama, LiteLLM, Azure). Requires the openai feature flag.
| Variable | Default | Description |
|---|---|---|
| QEX_EMBEDDING_PROVIDER | (none) | Set to openai |
| QEX_OPENAI_API_KEY | (none) | API key (also reads OPENAI_API_KEY) |
| QEX_OPENAI_MODEL | text-embedding-3-small | Model name |
| QEX_OPENAI_BASE_URL | https://api.openai.com/v1 | API base URL |
| QEX_OPENAI_DIMENSIONS | auto | Override dimensions for unknown models |
Security features:
- SSRF protection: only HTTPS or http://localhost URLs are allowed for the base URL
- API key sanitization: keys are never leaked in error messages
- Typed retry: exponential backoff (1s, 2s, 4s) on 429/5xx/timeout/connection errors
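The retry policy above reduces to two small decisions: how long to wait, and whether a failure is worth retrying. An illustrative sketch (not qex-core's actual code; timeouts and connection errors surface as transport errors rather than status codes, so only the status side is shown):

```rust
use std::time::Duration;

/// Delay schedule matching the documented backoff: 1s, 2s, 4s
/// for attempts 0, 1, 2. (Illustrative sketch.)
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt)
}

/// Which HTTP statuses warrant a retry, per the docs: 429 and 5xx.
fn is_retryable_status(status: u16) -> bool {
    status == 429 || (500..=599).contains(&status)
}
```

Client errors like 401 fail fast: retrying a bad API key only burns time and rate limit.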
Compatible APIs: Any OpenAI-compatible embeddings endpoint works. Set QEX_OPENAI_BASE_URL to your provider's URL:
# Ollama
export QEX_OPENAI_BASE_URL=http://localhost:11434/v1
# Azure OpenAI
export QEX_OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-model
Dimension Mismatch Guard
When switching embedding providers or models, QEX detects the mismatch via dense_meta.json and automatically triggers a full re-index. This prevents silent search quality degradation from mismatched vector spaces.
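The guard logic is conceptually a structural comparison of the stored metadata against the active backend. A minimal sketch, assuming dense_meta.json deserializes to a struct with these three fields (the real field names and serialization in qex-core may differ):

```rust
/// Hypothetical shape of the dense_meta.json contents.
#[derive(Debug, PartialEq)]
struct DenseMeta {
    provider: String,   // e.g. "onnx" or "openai"
    model: String,      // e.g. "arctic-embed-s"
    dimensions: usize,  // e.g. 384
}

/// A full re-index is required whenever any identity field changes:
/// vectors from different providers, models, or dimensionalities
/// live in incompatible vector spaces.
fn needs_full_reindex(stored: &DenseMeta, active: &DenseMeta) -> bool {
    stored != active
}
```

Comparing all three fields matters: even with identical dimensions, two different models produce vectors that are not meaningfully comparable, so dimension equality alone is not enough.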
Build & Test
# Run tests (BM25-only)
cargo test # 41 tests
# Run tests (with dense search)
cargo test --features dense # 48 tests
# Run tests (with OpenAI embedder)
cargo test --features openai # 50 tests
# Run tests (all features)
cargo test --features "dense,openai" # 55 tests
# Build for release
cargo build --release # ~19 MB binary
cargo build --release --features dense # ~36 MB binary
cargo build --release --features "dense,openai" # All backends
Key Dependencies
| Crate | Version | Purpose |
|---|---|---|
| tantivy | 0.22 | BM25 full-text search |
| tree-sitter | 0.24 | Code parsing (11 languages) |
| rmcp | 0.17 | MCP server framework (stdio) |
| rusqlite | 0.32 | SQLite metadata (bundled) |
| ignore | 0.4 | Gitignore-compatible file walking |
| rayon | 1.10 | Parallel chunking |
| ort | 2.0.0-rc.11 | ONNX Runtime (optional, dense) |
| usearch | 2.24 | HNSW vector index (optional, dense) |
| tokenizers | 0.22 | HuggingFace tokenizer (optional, dense) |
| ureq | 3 | Sync HTTP client (optional, openai) |
Performance
Benchmarked on an Apple Silicon Mac:
| Metric | Value |
|---|---|
| Full index (400 chunks) | ~20s with dense, ~2s BM25-only |
| Incremental index (no changes) | <100ms |
| BM25 search | <5ms |
| Hybrid search | ~50ms (includes embedding) |
| Binary size | 19 MB (BM25) / 36 MB (dense) |
| Model size | 33 MB (INT8 quantized) |