Paparats MCP
Paparats-kvetka – a magical flower from Slavic folklore that blooms on Kupala Night and grants power to whoever finds it. Likewise, paparats-mcp helps you find the right code across a sea of repositories.
Semantic code search for AI coding assistants. Give Claude Code, Cursor, Windsurf, Codex, and the rest a deep understanding of your entire codebase – single repo or multi-project workspaces. Search by meaning, not keywords. Keep your index fresh with real-time file watching. Return only relevant chunks instead of full files to save tokens.
Everything runs locally. No cloud. No API keys. Your code never leaves your machine.
Table of Contents
- Why Paparats?
- Quick Start
- Deployment Guides
- How It Works
- Key Features
- Use Cases
- Configuration
- MCP Tools Reference
- Connecting MCP
- CLI Commands
- Docker & Ollama
- Monitoring
- Architecture
- Embedding Model Setup
- Comparison with Alternatives
- Token Savings Metrics
- Contributing
- Links
Why Paparats?
AI coding assistants are smart, but they can only see files you open. They don't know your codebase structure, where the authentication logic lives, or how services connect. Paparats fixes that.
What you get
- Semantic code search – ask "where is the rate limiting logic?" and get exact code ranked by meaning, not grep matches
- Real-time sync – edit a file, and 2 seconds later it's re-indexed. No manual re-runs
- LSP intelligence – go-to-definition, find-references, rename symbols via CCLSP integration
- Token savings – return only relevant chunks instead of full files to reduce context size
- Multi-project workspaces – search across backend, frontend, infra repos in one query
- 100% local & private – Qdrant vector database + Ollama embeddings. Nothing leaves your laptop
- AST-aware chunking – code split by AST nodes (functions/classes) via tree-sitter, not arbitrary character counts (TypeScript, JavaScript, TSX, Python, Go, Rust, Java, Ruby, C, C++, C#; regex fallback for Terraform)
- Rich metadata – each chunk knows its symbol name (from tree-sitter AST), service, domain context, and tags from directory structure
- Symbol graph – find usages and cross-chunk relationships powered by AST-based symbol extraction (defines/uses analysis)
- Git history per chunk – see who last modified a chunk, when, and which tickets (Jira, GitHub) are linked to it
Who benefits
| Use Case | How Paparats Helps |
|---|---|
| Solo developers | Quickly navigate unfamiliar codebases, find examples of patterns, reduce context-switching |
| Multi-repo teams | Cross-project search (backend + frontend + infra), consistent patterns, faster onboarding |
| AI agents | Foundation for product support bots, QA automation, dev assistants – any agent that needs code context |
| Legacy modernization | Find all usages of deprecated APIs, identify migration patterns, discover hidden dependencies |
| Contractors/consultants | Accelerate ramp-up on client codebases, reduce "where is X?" questions |
Quick Start
# 1. Install CLI
npm install -g @paparats/cli
# 2. One-time setup (downloads ~1.6 GB GGUF model, starts Docker containers)
paparats install
# 3. In your project
cd your-project
paparats init # creates .paparats.yml
paparats index # index the codebase
# 4. Keep index fresh with file watching
paparats watch # run in background or separate terminal
# 5. Connect your IDE (Cursor, Claude Code) – see "Connecting MCP" below
Prerequisites
Install these before running paparats install:
| Tool | Purpose | Install |
|---|---|---|
| Docker | Runs Qdrant vector DB + MCP server | docker.com |
| Docker Compose | Orchestrates containers (v2) | Included with Docker Desktop; Linux: apt install docker-compose-plugin |
| Ollama | Local embedding model (on host) | ollama.com (or use --ollama-mode docker) |
The CLI checks that docker and ollama (or docker only in Docker Ollama mode) are available. If missing, it exits with installation links.
Deployment Guides
Paparats supports three deployment modes, each designed for a different use case:
Developer Local Setup
The default mode – for developers using Claude Code, Cursor, or other AI assistants locally.
# Install with local Ollama (default, requires Ollama installed on host)
paparats install --mode developer
# Or with Docker Ollama (no host Ollama needed)
paparats install --mode developer --ollama-mode docker
# Or with an external Qdrant instance (e.g. Qdrant Cloud)
paparats install --mode developer --qdrant-url http://your-qdrant:6333
# Then, in each project:
cd your-project
paparats init # creates .paparats.yml
paparats index # index the codebase
paparats watch # auto-reindex on file changes
What happens:
- Checks Docker (and Ollama if local mode)
- Asks whether to use an external Qdrant instance (or pass --qdrant-url to skip the prompt)
- Generates docker-compose with qdrant + paparats server (+ ollama if docker mode). When using external Qdrant, the Qdrant container is omitted
- Downloads and registers the embedding model (local mode) or uses the pre-baked Docker image (docker mode)
- Auto-configures Cursor MCP if ~/.cursor/ exists
Server / Production Setup
For teams wanting a self-contained Docker stack that auto-indexes repos on a schedule. No IDE integration – headless operation.
# Full stack: qdrant + ollama + paparats server + indexer
paparats install --mode server --repos org/repo1,org/repo2
# With private repos
paparats install --mode server \
--repos org/private-repo,org/other \
--github-token ghp_xxx
# Custom schedule (default: every 6 hours)
paparats install --mode server \
--repos org/repo \
--cron "0 */2 * * *"
# With external Qdrant (e.g. Qdrant Cloud)
paparats install --mode server \
--repos org/repo \
--qdrant-url https://qdrant.example.com \
--qdrant-api-key your-api-key
# All repos in one shared collection
paparats install --mode server \
--repos org/repo1,org/repo2 \
--group shared-index
What happens:
- Checks Docker only (no Ollama check – runs in Docker)
- Asks whether to use an external Qdrant instance (or pass --qdrant-url to skip the prompt)
- Generates docker-compose with all services: qdrant + ollama + paparats + indexer. When using external Qdrant, the Qdrant container is omitted
- Creates ~/.paparats/.env with REPOS, GITHUB_TOKEN, CRON, PAPARATS_GROUP, QDRANT_API_KEY (as applicable)
- Starts all containers
- Indexer clones repos and indexes them on the configured schedule
After setup:
# Trigger immediate reindex
curl -X POST http://localhost:9877/trigger
# Trigger specific repos only
curl -X POST http://localhost:9877/trigger -H 'Content-Type: application/json' \
-d '{"repos": ["org/repo1"]}'
# Check indexer status
curl http://localhost:9877/health
# MCP endpoints for clients
# Coding: http://localhost:9876/mcp
# Support: http://localhost:9876/support/mcp
Support Agent Setup
For support teams and bots that connect to an existing Paparats server – no Docker, no Ollama needed.
# Connect to a running server (default: localhost:9876)
paparats install --mode support
# Connect to a remote server
paparats install --mode support --server http://prod-server:9876
What happens:
- Verifies the server is reachable (health check)
- Configures Cursor MCP with the support endpoint (/support/mcp)
- Configures Claude Code MCP if ~/.claude/ exists
- Prints available tools and endpoint info
Support endpoint tools: search_code, get_chunk, find_usages, health_check, get_chunk_meta, search_changes, explain_feature, recent_changes, impact_analysis
How It Works
Your projects                  Paparats                      AI assistant
                                                       (Claude Code / Cursor)
backend/
  .paparats.yml ────────►┌───────────────────────┐
frontend/                │ Indexer               │        ┌──────────────┐
  .paparats.yml ────────►│  - chunks code        │───────►│  MCP search  │
infra/                   │  - embeds via Ollama  │        │  tool call   │
  .paparats.yml ────────►│  - stores in Qdrant   │        └──────────────┘
                         │  - watches changes    │
                         └───────────────────────┘
Indexing Pipeline
When you run paparats index (or a file changes during paparats watch), each file goes through this pipeline:
Source file
     │
     ▼
┌───────────────────┐
│ 1. File discovery │  Collect files from indexing.paths, apply
│    & filtering    │  gitignore + exclude patterns, skip binary
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 2. Content hash   │  SHA-256 of file content → compare with
│    check          │  existing Qdrant chunks → skip unchanged
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 3. AST parsing    │  tree-sitter parses the file once (WASM)
│    (single pass)  │  → reused for chunking AND symbol extraction
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 4. Chunking       │  AST nodes → chunks at function/class
│                   │  boundaries. Regex fallback for unsupported
│                   │  languages (brace/indent/block strategies)
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 5. Symbol         │  AST queries extract defines (function,
│    extraction     │  class, variable names) and uses (calls,
│                   │  references) per chunk. 10+ languages
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 6. Metadata       │  Service name, bounded_context, tags from
│    enrichment     │  config + auto-detected directory tags
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 7. Embedding      │  Jina Code Embeddings 1.5B via Ollama
│                   │  SQLite cache (content-hash key) → skip
│                   │  already-embedded content
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 8. Qdrant upsert  │  Vectors + payload (content, file, lines,
│                   │  symbols, metadata) → batched upsert
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 9. Git history    │  git log per file → diff hunks → map
│    (post-index)   │  commits to chunks by line overlap →
│                   │  extract ticket refs → store in SQLite
└─────────┬─────────┘
          ▼
┌───────────────────┐
│ 10. Symbol graph  │  Cross-chunk edges: calls → called_by,
│     (post-index)  │  references → referenced_by → SQLite
└───────────────────┘
Search Flow
AI assistant queries via MCP → server detects query type (nl2code / code2code / techqa) → expands query (abbreviations, case variants, plurals) → all variants searched in parallel against Qdrant → results merged by max score → only relevant chunks returned with confidence scores and symbol info.
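The detect-and-merge steps of this flow can be sketched as follows. This is an illustrative outline only – the heuristics and names (detectTaskType, mergeByMaxScore) are assumptions, not the actual searcher.ts API:

```typescript
type TaskType = 'nl2code' | 'code2code' | 'techqa';

// Illustrative heuristic for query-type detection; the real logic may differ.
function detectTaskType(query: string): TaskType {
  const q = query.trim();
  // Code-like input: braces, semicolons, arrows, keywords, call-style parens
  if (/[{};]|=>|\bfunction\b|\bdef\b|\(.*\)/.test(q)) return 'code2code';
  // Question-style natural language -> technical Q&A
  if (/^(how|why|what|when|where)\b/i.test(q) || q.endsWith('?')) return 'techqa';
  return 'nl2code'; // default: natural-language description of code to find
}

interface SearchHit {
  chunkId: string;
  score: number;
}

// Merge results from all query variants: dedupe by chunk id, keep max score.
function mergeByMaxScore(variantResults: SearchHit[][]): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const hits of variantResults) {
    for (const hit of hits) {
      const prev = best.get(hit.chunkId);
      if (!prev || hit.score > prev.score) best.set(hit.chunkId, hit);
    }
  }
  return [...best.values()].sort((a, b) => b.score - a.score); // best first
}
```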
Watching
paparats watch monitors file changes via chokidar with debouncing (1s default). On change, only the affected file re-enters the pipeline. Unchanged content is never re-embedded thanks to the content-hash cache.
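A per-file debounce of this kind can be sketched as below. This is an assumed shape for illustration – the real watcher delegates debouncing to chokidar:

```typescript
// Rapid saves to the same file collapse into a single re-index call;
// different files debounce independently.
function makeDebouncedReindex(
  reindex: (file: string) => void,
  delayMs = 1000,
): (file: string) => void {
  const timers = new Map<string, ReturnType<typeof setTimeout>>();
  return (file: string) => {
    const pending = timers.get(file);
    if (pending) clearTimeout(pending); // restart the window on each event
    timers.set(
      file,
      setTimeout(() => {
        timers.delete(file);
        reindex(file);
      }, delayMs),
    );
  };
}
```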
Key Features
Better Search Quality
Task-specific embeddings – Jina Code Embeddings supports 3 query types (nl2code, code2code, techqa) with different prefixes for better relevance:
- "find authentication middleware" → nl2code prefix (natural language → code)
- "function validateUser(req, res)" → code2code prefix (code → similar code)
- "how does OAuth work in this app?" → techqa prefix (technical questions)
Query expansion – every search generates 2-3 variations server-side:
- Abbreviations: auth → authentication, db → database
- Case variants: userAuth → user_auth → UserAuth
- Plurals: users → user, dependencies → dependency
- Filler removal: "how does auth work" → "auth"
All variants searched in parallel, results merged by max score.
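For illustration, the case-variant expansion could be implemented like this. The helper name is hypothetical – the actual query-expansion.ts code may differ:

```typescript
// Split an identifier into words, then emit camelCase, snake_case, and
// PascalCase forms, e.g. "userAuth" -> userAuth / user_auth / UserAuth.
function caseVariants(term: string): string[] {
  const words = term
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // split camelCase boundaries
    .split(/[\s_-]+/)
    .filter(Boolean)
    .map((w) => w.toLowerCase());
  const snake = words.join('_');
  const camel = words
    .map((w, i) => (i === 0 ? w : w[0].toUpperCase() + w.slice(1)))
    .join('');
  const pascal = words.map((w) => w[0].toUpperCase() + w.slice(1)).join('');
  return [...new Set([camel, snake, pascal])]; // dedupe single-word terms
}
```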
Confidence scores – each result includes a percentage score (≥60% high, 40–60% partial, <40% low) to guide AI next steps.
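The documented thresholds map to buckets like so (the helper name is illustrative). An assistant can use the bucket to decide whether to trust a result or broaden the query:

```typescript
// Bucket a percentage score per the documented thresholds:
// >=60% high, 40-60% partial, <40% low.
function confidenceBucket(scorePercent: number): 'high' | 'partial' | 'low' {
  if (scorePercent >= 60) return 'high';
  if (scorePercent >= 40) return 'partial';
  return 'low';
}
```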
Performance
Embedding cache – SQLite cache with content-hash keys + Float32 vectors. Unchanged code never re-embedded. LRU cleanup at 100k entries.
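A content-hash cache key along these lines makes re-embedding skippable – keying on content rather than path means unchanged code stays cached even if the file moves. The exact key shape is an assumption:

```typescript
import { createHash } from 'node:crypto';

// Derive a cache key from the chunk content and embedding model.
// Key shape (model-prefixed hex digest) is illustrative.
function embeddingCacheKey(chunkContent: string, model: string): string {
  const hash = createHash('sha256').update(chunkContent, 'utf8').digest('hex');
  return `${model}:${hash}`;
}
```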
AST-aware chunking – tree-sitter AST nodes define natural chunk boundaries for 11 languages. Falls back to regex strategies (block-based for Ruby, brace-based for JS/TS, indent-based for Python, fixed-size) for unsupported languages.
Real-time watching – paparats watch monitors file changes with debouncing (1s default). Edit → save → re-index in ~2 seconds.
Integrations
CCLSP (Claude Code LSP) – during paparats init, optionally sets up:
- LSP server for your language (TypeScript, Python, Go, Ruby, etc.)
- MCP config for go-to-definition, find-references, rename
- Typical AI workflow: search_code (semantic) → find_definition (precise navigation) → find_references (impact analysis)
Skip with --skip-cclsp if not needed.
Use Cases
For Developers (Coding)
Connect via the coding endpoint (/mcp):
| Use Case | How |
|---|---|
| Navigate unfamiliar code | search_code "authentication middleware" → exact locations |
| Find similar patterns | search_code "retry with exponential backoff" → examples |
| Trace dependencies | find_usages <chunk_id> --direction both → callers + deps |
| Explore context | get_chunk <chunk_id> --radius_lines 50 → expand around |
For Support Teams
Connect via the support endpoint (/support/mcp):
| Use Case | How |
|---|---|
| Explain a feature | explain_feature "rate limiting" → code locations + changes + modules |
| Recent changes | recent_changes "auth" --since 2024-01-01 → timeline with tickets |
| Impact analysis | impact_analysis "payment processing" → blast radius + service graph |
| Change history | get_chunk_meta <chunk_id> → authors, dates, linked tickets |
Support chatbot example:
User: "How do I configure rate limiting?"
Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references
Configuration
.paparats.yml in your project root:
group: 'my-project-group' # required – maps to Qdrant collection "paparats_my-project-group"
language: ruby # required – or array: [ruby, typescript]
indexing:
paths: ['app/', 'lib/'] # directories to index (default: ["./"])
exclude: ['vendor/**'] # additional excludes (merged with language defaults)
extensions: ['.rb'] # override auto-detected extensions
chunkSize: 1024 # max chars per chunk (default: 1024)
concurrency: 2 # parallel file processing (default: 2)
batchSize: 50 # Qdrant upsert batch size (default: 50)
watcher:
enabled: true # auto-reindex on file changes (default: true)
debounce: 1000 # ms debounce (default: 1000)
embeddings:
provider: 'ollama' # embedding provider (default: "ollama")
model: 'jina-code-embeddings' # Ollama alias (see below)
dimensions: 1536 # vector dimensions (default: 1536)
metadata:
service: 'my-service' # service name (default: project directory name)
bounded_context: 'identity' # domain context (default: null)
tags: ['api', 'auth'] # global tags applied to all chunks
directory_tags: # tags applied to chunks from specific directories
src/controllers: ['controller']
src/models: ['model']
git:
enabled: true # extract git history per chunk (default: true)
maxCommitsPerFile: 50 # max commits to analyze per file (1-500, default: 50)
ticketPatterns: # custom regex patterns for ticket extraction
- 'TASK_\d+'
- 'ISSUE-\d+'
Groups
Projects with the same group name share a search scope and are all indexed into one Qdrant collection. The group name maps to a Qdrant collection with a paparats_ prefix (e.g. group my-fullstack → collection paparats_my-fullstack). This prevents namespace collisions when sharing a Qdrant instance with other applications.
# backend/.paparats.yml
group: 'my-fullstack'
language: ruby
indexing:
paths: ['app/', 'lib/']
# frontend/.paparats.yml
group: 'my-fullstack'
language: typescript
indexing:
paths: ['src/']
Now searching "authentication flow" finds code in both backend and frontend.
Server mode shared group: When using the indexer container with multiple repos, set PAPARATS_GROUP (or --group during install) to index all repos into a single collection:
paparats install --mode server --repos org/repo1,org/repo2 --group shared-index
Metadata
The metadata section enriches each indexed chunk with contextual information that improves search filtering and helps AI assistants understand code ownership.
| Field | Description | Default |
|---|---|---|
service | Service name (e.g., payment-service) | Project directory name |
bounded_context | Domain context (e.g., billing, identity) | null |
tags | Global tags applied to all chunks | [] |
directory_tags | Tags applied to chunks from specific directories | {} |
Tags from directory_tags are matched by path prefix. Additionally, tags are auto-detected from directory structure (e.g., src/controllers/user.ts gets a controllers tag).
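The prefix matching and auto-detection described above can be sketched as follows. The behavior details (trailing-slash handling, precedence) are assumptions, not the actual metadata.ts logic:

```typescript
// Resolve tags for a file: configured directory_tags matched by path
// prefix, plus an auto-detected tag from the parent directory name.
function resolveTags(
  filePath: string,
  directoryTags: Record<string, string[]>,
): string[] {
  const tags = new Set<string>();
  for (const [prefix, dirTags] of Object.entries(directoryTags)) {
    // Normalize the prefix so "src/controllers" matches "src/controllers/..."
    if (filePath.startsWith(prefix.replace(/\/$/, '') + '/')) {
      dirTags.forEach((t) => tags.add(t));
    }
  }
  // Auto-detect: src/controllers/user.ts -> "controllers"
  const parts = filePath.split('/');
  if (parts.length > 1) tags.add(parts[parts.length - 2]);
  return [...tags];
}
```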
Git History
When metadata.git.enabled is true (the default), the server extracts git history after indexing:
- For each indexed file, runs git log to get commit history
- Parses diff hunks to determine which commits affected which line ranges
- Maps commits to chunks by line-range overlap
- Extracts ticket references from commit messages
- Stores results in a local SQLite database (~/.paparats/metadata.db)
- Enriches Qdrant payloads with last_commit_at, last_author_email, ticket_keys
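The line-range overlap mapping above can be sketched as below. The shapes are assumptions for illustration, not the actual git-metadata.ts types:

```typescript
// Assumed shapes: a diff hunk carries its commit and the 1-based,
// inclusive line range it touched; a chunk carries its own range.
interface Hunk {
  commit: string;
  start: number;
  end: number;
}
interface Chunk {
  id: string;
  start: number;
  end: number;
}

// Two inclusive ranges overlap iff each starts before the other ends.
function hunkTouchesChunk(h: Hunk, c: Chunk): boolean {
  return h.start <= c.end && c.start <= h.end;
}

// Map each chunk to the (deduplicated) commits whose hunks overlap it.
function commitsPerChunk(hunks: Hunk[], chunks: Chunk[]): Map<string, string[]> {
  const out = new Map<string, string[]>();
  for (const c of chunks) {
    const commits = [
      ...new Set(hunks.filter((h) => hunkTouchesChunk(h, c)).map((h) => h.commit)),
    ];
    out.set(c.id, commits);
  }
  return out;
}
```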
Built-in ticket patterns:
- Jira: PROJ-123, TEAM-456
- GitHub issues: #42
- GitHub cross-repo: org/repo#99
Custom patterns can be added via metadata.git.ticketPatterns – each entry is a regex string. Use a capture group to extract the ticket key; otherwise the full match is used.
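A sketch of this extraction: built-in patterns approximated from the examples above, plus custom patterns with the capture-group rule. The actual ticket-extractor.ts regexes may differ:

```typescript
// Approximations of the built-in patterns (Jira, GitHub cross-repo, GitHub issue).
const BUILT_IN: RegExp[] = [
  /\b[A-Z][A-Z0-9]+-\d+\b/g,   // Jira: PROJ-123, TEAM-456
  /\b[\w.-]+\/[\w.-]+#\d+\b/g, // GitHub cross-repo: org/repo#99
  /#\d+\b/g,                   // GitHub issue: #42
];

// Extract ticket keys from a commit message. For custom patterns, a capture
// group yields the key; otherwise the full match is used.
function extractTickets(message: string, customPatterns: string[] = []): string[] {
  const patterns = [...BUILT_IN, ...customPatterns.map((p) => new RegExp(p, 'g'))];
  const keys = new Set<string>();
  for (const re of patterns) {
    for (const m of message.matchAll(re)) {
      keys.add((m[1] ?? m[0]).trim());
    }
  }
  return [...keys];
}
```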
| Config | Description | Default |
|---|---|---|
git.enabled | Enable git history extraction | true |
git.maxCommitsPerFile | Max commits to analyze per file | 50 (range: 1-500) |
git.ticketPatterns | Custom regex patterns for ticket extraction | [] |
Git metadata extraction is non-fatal – if a project is not a git repository or git is unavailable, indexing continues normally without git enrichment.
MCP Tools Reference
Paparats exposes 10 tools via the Model Context Protocol on two separate endpoints, each with its own tool set and system instructions:
Coding Endpoint (/mcp)
For developers using Claude Code, Cursor, etc. Focus: search code, read chunks, trace symbol dependencies, manage indexing.
| Tool | Description |
|---|---|
search_code | Semantic search across indexed projects. Returns code chunks with symbol definitions/uses and confidence scores |
get_chunk | Retrieve a chunk by ID with optional surrounding context. Returns code with symbol info |
find_usages | Find symbol relationships: incoming (callers), outgoing (dependencies), or both directions |
health_check | Indexing status: chunks per group, running jobs |
reindex | Trigger full reindex; track progress with health_check |
Support Endpoint (/support/mcp)
For support teams and bots without direct code access. Focus: feature explanations, change history, impact analysis – all in plain language.
| Tool | Description |
|---|---|
search_code | Semantic search across indexed projects |
get_chunk | Retrieve a chunk by ID with optional surrounding context |
find_usages | Find symbol relationships: callers, dependencies, or both |
health_check | Indexing status: chunks per group, running jobs |
get_chunk_meta | Git history and ticket references for a chunk: commits, authors, dates, linked tickets. No code |
search_changes | Semantic search filtered by last commit date. Each result shows when it was last changed |
explain_feature | Comprehensive feature analysis: code locations + recent changes + related modules for a question |
recent_changes | Timeline of changes matching a query, grouped by date with commits, tickets, and affected files. Supports since date filter |
impact_analysis | Dependency impact subgraph: seed chunks + impact grouped by service/context + dependency edges. 1-2 hop graph traversal |
Typical Workflow
Drill-down workflow – start broad, zoom in:
1. search_code "authentication middleware" → find relevant chunks with symbols
2. get_chunk <chunk_id> --radius_lines 50 → expand context around a result
3. find_usages <chunk_id> --direction both → see callers and dependencies
4. get_chunk_meta <chunk_id> → see who modified it, when, linked tickets
5. search_changes "auth" --since 2024-01-01 → find recent auth changes
Single-call workflow – get the full picture in one round-trip:
1. explain_feature "How does authentication work?" → code locations + changes + related modules
2. recent_changes "auth" --since 2024-01-01 → timeline of auth changes with tickets
3. impact_analysis "rate limiting" → blast radius: seed chunks + service graph + edges
4. get_chunk <chunk_id> → drill into any specific chunk for code
Connecting MCP
After paparats install and paparats index, connect your IDE:
Cursor
Create or edit ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):
{
"mcpServers": {
"paparats": {
"type": "http",
"url": "http://localhost:9876/mcp"
}
}
}
For support use case (feature explanations, change history, impact analysis):
{
"mcpServers": {
"paparats-support": {
"type": "http",
"url": "http://localhost:9876/support/mcp"
}
}
}
Restart Cursor after changing config.
Claude Code
# Coding endpoint (default)
claude mcp add --transport http paparats http://localhost:9876/mcp
# Support endpoint (for support bots/agents)
claude mcp add --transport http paparats-support http://localhost:9876/support/mcp
Or add to .mcp.json in project root:
{
"mcpServers": {
"paparats": {
"type": "http",
"url": "http://localhost:9876/mcp"
}
}
}
Verify
- paparats status – check the server is running
- Coding endpoint (/mcp): tools – search_code, get_chunk, find_usages, health_check, reindex
- Support endpoint (/support/mcp): tools – search_code, get_chunk, find_usages, health_check, get_chunk_meta, search_changes, explain_feature, recent_changes, impact_analysis
- Ask the AI: "Search for authentication logic in the codebase"
CLI Commands
| Command | Description |
|---|---|
paparats init | Create .paparats.yml (interactive or --non-interactive) |
paparats install | Set up Docker + Ollama + MCP configuration |
paparats update | Update CLI from npm + pull latest Docker image |
paparats index | Index the current project |
paparats search <query> | Semantic search across indexed projects |
paparats watch | Watch files and auto-reindex on changes |
paparats status | System status (Docker, Ollama, config, server health, groups) |
paparats doctor | Run diagnostic checks |
paparats groups | List all indexed groups and projects |
Most commands support --server <url> (default: http://localhost:9876) and --json for machine-readable output.
Common Options
paparats init
- --force – Overwrite existing config
- --group <name> – Set group (skip prompt)
- --language <lang> – Set language (skip prompt)
- --non-interactive – Use defaults without prompts
- --skip-cclsp – Skip CCLSP language server setup
paparats install
- --mode <mode> – Install mode: developer (default), server, or support
- --ollama-mode <mode> – Ollama deployment: docker or local (default; developer/server mode)
- --ollama-url <url> – External Ollama URL (e.g. http://192.168.1.10:11434). Implies --ollama-mode local. Skips the local Ollama binary check and model setup
- --skip-docker – Skip Docker setup (developer mode)
- --skip-ollama – Skip Ollama model (developer mode)
- --qdrant-url <url> – External Qdrant URL – skips the Qdrant Docker container (developer/server mode)
- --qdrant-api-key <key> – Qdrant API key for authenticated access (e.g. Qdrant Cloud)
- --repos <repos> – Comma-separated repos to index (server mode)
- --github-token <token> – GitHub token for private repos (server mode)
- --cron <expression> – Cron schedule for indexing (server mode, default: 0 */6 * * *)
- --group <name> – Shared Qdrant group – all repos in one collection (server mode). Sets the PAPARATS_GROUP env var
- --server <url> – Server URL to connect to (support mode)
- -v, --verbose – Show detailed output
paparats index
- -f, --force – Force reindex (clear existing chunks)
- --dry-run – Show what would be indexed
- --timeout <ms> – Request timeout (default: 300000)
- -v, --verbose – Show skipped files and errors
- --json – Output as JSON
paparats search <query>
- -n, --limit <n> – Max results (default: 5)
- -p, --project <name> – Filter by project
- -g, --group <name> – Override group from config
- --timeout <ms> – Request timeout (default: 30000)
- -v, --verbose – Show token savings
- --json – Output as JSON
paparats watch
- --dry-run – Show what would be watched
- -v, --verbose – Show file events
- --json – Output events as JSON lines
- --polling – Use polling instead of native watchers (fewer file descriptors; use if EMFILE occurs)
Docker & Ollama
Paparats supports two ways to run Ollama: on the host (local) or in Docker.
Local Ollama
The default mode. Ollama runs on your host machine, and the Docker containers connect to it.
- Qdrant and MCP server run in Docker containers
- Ollama runs on the host (not Docker). The server connects via host.docker.internal:11434 (Mac/Windows)
- On Linux, set OLLAMA_URL=http://172.17.0.1:11434 in ~/.paparats/docker-compose.yml
- Embedding cache (SQLite) persists in the paparats_data Docker volume
paparats install # local Ollama (default)
paparats install --ollama-mode local # explicit
Docker Ollama
Ollama runs in a Docker container using ibaz/paparats-ollama – a custom image with the Jina Code Embeddings model pre-baked (~3 GB). No host Ollama installation needed.
paparats install --ollama-mode docker # Docker Ollama
Benefits:
- Zero host setup – no Ollama binary, no GGUF download
- Model immediately ready on container start
- Consistent across environments
Trade-offs:
- ~1.7 GB Docker image (one-time pull)
- CPU-only – no GPU/Metal acceleration (sufficient for embedding generation, but slower than native Ollama on Mac)
External Ollama
If you run Ollama on a separate machine (e.g. AWS Fargate, a GPU server, or another host on your network), use --ollama-url to point the install at it:
paparats install --ollama-url http://192.168.1.10:11434
# Server mode with external Ollama
paparats install --mode server --ollama-url http://ollama.internal:11434 --repos org/repo1
When --ollama-url is set:
- The Ollama Docker container is omitted from the generated docker-compose.yml
- No local ollama binary is required – GGUF download and model registration are skipped
- The OLLAMA_URL environment variable in the paparats server (and indexer in server mode) points to your external instance
- Implies --ollama-mode local (no Docker Ollama)
This is useful when Docker Ollama is too slow (e.g. CPU-only on Mac, where native Ollama can use Metal GPU acceleration) or when you want to share a single Ollama instance across multiple machines.
External Qdrant
By default, paparats install runs Qdrant as a Docker container. If you already have a Qdrant instance (e.g. Qdrant Cloud, a shared cluster, or a host-level install), you can skip the Qdrant container entirely:
# Via CLI flag
paparats install --qdrant-url http://your-qdrant:6333
# With API key authentication (e.g. Qdrant Cloud)
paparats install --qdrant-url https://xxx.cloud.qdrant.io --qdrant-api-key your-api-key
# Or answer the interactive prompt during install
paparats install
# ? Use an external Qdrant instance? (skip Qdrant Docker container) Yes
# ? Qdrant URL: http://your-qdrant:6333
When --qdrant-url is set:
- The Qdrant Docker service is omitted from the generated docker-compose.yml
- The QDRANT_URL environment variable in the paparats server (and indexer in server mode) points to your external instance
- A health check during install verifies the external Qdrant is reachable
When --qdrant-api-key is set:
- QDRANT_API_KEY is passed to all containers (server + indexer) via docker-compose.yml and ~/.paparats/.env
- Can also be set directly as an environment variable: QDRANT_API_KEY=your-key on the server or indexer process
This works with both --mode developer and --mode server.
Monitoring
Paparats exposes Prometheus metrics for operational visibility. Opt in by setting PAPARATS_METRICS=true in the server's environment:
# In ~/.paparats/docker-compose.yml, under paparats service:
environment:
PAPARATS_METRICS: 'true'
Metrics endpoint
curl http://localhost:9876/metrics
Key metrics
| Metric | Type | Description |
|---|---|---|
paparats_search_total | Counter | Search requests by group and method |
paparats_search_duration_seconds | Histogram | Search latency |
paparats_index_files_total | Counter | Files indexed |
paparats_index_chunks_total | Counter | Chunks indexed |
paparats_query_cache_hit_rate | Gauge | Query result cache hit rate |
paparats_embedding_cache_hit_rate | Gauge | Embedding cache hit rate |
paparats_watcher_events_total | Counter | File watcher events |
Prometheus scrape config
scrape_configs:
- job_name: paparats
scrape_interval: 15s
static_configs:
- targets: ['localhost:9876']
Query cache
Search results are cached in-memory (LRU, default 1000 entries, 5-minute TTL). The cache is automatically invalidated when files change. Configure via environment variables:
- QUERY_CACHE_MAX_ENTRIES – max cached queries (default: 1000)
- QUERY_CACHE_TTL_MS – TTL in milliseconds (default: 300000)
Cache stats are included in GET /api/stats under the queryCache field.
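A minimal sketch of such an LRU + TTL cache – assumed internals for illustration, not the actual query-cache.ts implementation:

```typescript
// LRU with TTL: Map preserves insertion order, so the first key is the
// least recently used; get() re-inserts to refresh recency.
class QueryCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private maxEntries = 1000, private ttlMs = 300_000) {}

  get(key: string): V | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    if (Date.now() > e.expiresAt) {
      this.entries.delete(key); // expired: drop and miss
      return undefined;
    }
    this.entries.delete(key); // refresh recency via re-insertion
    this.entries.set(key, e);
    return e.value;
  }

  set(key: string, value: V): void {
    if (this.entries.size >= this.maxEntries && !this.entries.has(key)) {
      // Evict the least-recently-used entry (first in insertion order)
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```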
Architecture
paparats-mcp/
├── packages/
│   ├── server/                  # MCP server (Docker image: ibaz/paparats-server)
│   │   ├── src/
│   │   │   ├── lib.ts                  # Public library exports (for programmatic use)
│   │   │   ├── index.ts                # HTTP server bootstrap + graceful shutdown
│   │   │   ├── app.ts                  # Express app + HTTP API routes
│   │   │   ├── indexer.ts              # Group-aware indexing, single-parse chunkFile()
│   │   │   ├── searcher.ts             # Search with query expansion, cache, metrics
│   │   │   ├── query-expansion.ts      # Abbreviation, case, plural expansion
│   │   │   ├── task-prefixes.ts        # Jina task prefix detection
│   │   │   ├── query-cache.ts          # In-memory LRU search result cache
│   │   │   ├── metrics.ts              # Prometheus metrics (opt-in)
│   │   │   ├── ast-chunker.ts          # AST-based code chunking (tree-sitter, primary strategy)
│   │   │   ├── chunker.ts              # Regex-based code chunking (fallback for unsupported languages)
│   │   │   ├── ast-symbol-extractor.ts # AST-based symbol extraction (tree-sitter, 10 languages)
│   │   │   ├── ast-queries.ts          # Tree-sitter S-expression queries per language
│   │   │   ├── tree-sitter-parser.ts   # WASM tree-sitter manager
│   │   │   ├── symbol-graph.ts         # Cross-chunk symbol edges
│   │   │   ├── embeddings.ts           # Ollama provider + SQLite cache
│   │   │   ├── config.ts               # .paparats.yml reader + validation
│   │   │   ├── metadata.ts             # Tag resolution + auto-detection
│   │   │   ├── metadata-db.ts          # SQLite store for git commits + tickets
│   │   │   ├── git-metadata.ts         # Git history extraction + chunk mapping
│   │   │   ├── ticket-extractor.ts     # Jira/GitHub/custom ticket parsing
│   │   │   ├── mcp-handler.ts          # MCP protocol – dual-mode (coding /mcp + support /support/mcp)
│   │   │   ├── watcher.ts              # File watcher (chokidar)
│   │   │   └── types.ts                # Shared types
│   │   └── Dockerfile
│   ├── indexer/                 # Automated repo indexer (Docker image: ibaz/paparats-indexer)
│   │   ├── src/
│   │   │   ├── index.ts         # Entry: Express mini-server + cron scheduler
│   │   │   ├── repo-manager.ts  # parseReposEnv(), cloneOrPull() using simple-git
│   │   │   ├── scheduler.ts     # node-cron wrapper
│   │   │   └── types.ts         # IndexerConfig, RepoConfig, RunStatus
│   │   └── Dockerfile
│   ├── ollama/                  # Custom Ollama with pre-baked model (Docker image: ibaz/paparats-ollama)
│   │   └── Dockerfile
│   ├── cli/                     # CLI tool (npm package: @paparats/cli)
│   │   └── src/
│   │       ├── index.ts                     # Commander entry
│   │       ├── docker-compose-generator.ts  # Programmatic YAML generation
│   │       └── commands/                    # init, install, update, index, etc.
│   └── shared/                  # Shared utilities (npm package: @paparats/shared)
│       └── src/
│           ├── path-validation.ts   # Path validation
│           ├── gitignore.ts         # Gitignore parsing
│           ├── exclude-patterns.ts  # Glob exclude normalization
│           └── language-excludes.ts # Language-specific exclude defaults
└── examples/
    └── paparats.yml.*           # Config examples per language
Stack
- Qdrant – vector database (1 collection per group with paparats_ prefix, cosine similarity, payload filtering)
- Ollama – local embeddings via Jina Code Embeddings 1.5B with task-specific prefixes
- SQLite – embedding cache (~/.paparats/cache/embeddings.db) + git metadata store (~/.paparats/metadata.db)
- MCP – Model Context Protocol (SSE for Cursor, Streamable HTTP for Claude Code). Dual endpoints: /mcp (coding) and /support/mcp (support)
- TypeScript monorepo with Yarn workspaces
Integration Examples
Support Chatbot
Use paparats as the knowledge backend for a product support bot. Connect the bot to the support endpoint (/support/mcp) for access to explain_feature, recent_changes, impact_analysis, and other support-oriented tools:
```
User: "How do I configure rate limiting?"

Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references
```
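On the wire, an MCP tool invocation like step 1 is a JSON-RPC 2.0 `tools/call` request. A minimal sketch of constructing one (the argument names `feature` and `group` are illustrative assumptions; a real client would use an MCP SDK with session handling rather than raw envelopes):

```typescript
// Sketch: the JSON-RPC 2.0 envelope for an MCP `tools/call` request,
// as it could be POSTed to the /support/mcp endpoint.
// Argument names (`feature`, `group`) are illustrative assumptions.
function toolCallRequest(id: number, tool: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const req = toolCallRequest(1, "explain_feature", {
  feature: "rate limiting",
  group: "my-app",
});
// e.g. fetch("http://localhost:9876/support/mcp", { method: "POST",
//   headers: { "Content-Type": "application/json" }, body: JSON.stringify(req) })
```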
CI/CD (GitHub Actions)
Re-index on every push to keep the search index fresh:
```yaml
name: Reindex Paparats
on:
  push:
    branches: [main]
jobs:
  reindex:
    runs-on: ubuntu-latest
    services:
      qdrant:
        image: qdrant/qdrant:latest
        ports: ['6333:6333']
    steps:
      - uses: actions/checkout@v4
      - uses: jcarpenter/setup-ollama@v1
      - run: npm install -g @paparats/cli
      - run: paparats install --skip-docker
      - run: paparats index --server http://localhost:9876
```
CI/CD with Indexer Container
For server deployments, trigger the indexer directly via HTTP:
```yaml
name: Trigger Paparats Reindex
on:
  push:
    branches: [main]
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -X POST http://your-server:9877/trigger \
            -H 'Content-Type: application/json' \
            -d '{"repos": ["your-org/your-repo"]}'
```
Code Review Assistant
Combine multiple tools to analyze the impact of a pull request:
```
1. explain_feature("the feature being changed")
   → understand what the code does and how it connects
2. impact_analysis("the changed function or module")
   → blast radius: which services and modules are affected
3. search_changes("related area", since="2024-01-01")
   → recent changes that might conflict or overlap
```
Embedding Model Setup
Default: `jinaai/jina-code-embeddings-1.5b-GGUF` – code-optimized, 1.5B params, 1536 dims, 32k context. It is not in the Ollama registry, so we create a local alias.
Recommended: `paparats install` automates this:
- Local mode (`--ollama-mode local`): downloads the GGUF (~1.65 GB) to `~/.paparats/models/`, creates a Modelfile, and runs `ollama create jina-code-embeddings`
- Docker mode (`--ollama-mode docker`): uses the `ibaz/paparats-ollama` image with the model pre-baked – zero setup
Manual setup:
```bash
# 1. Download GGUF
curl -L -o jina-code-embeddings-1.5b-Q8_0.gguf \
  "https://huggingface.co/jinaai/jina-code-embeddings-1.5b-GGUF/resolve/main/jina-code-embeddings-1.5b-Q8_0.gguf"

# 2. Create Modelfile
cat > Modelfile <<'EOF'
FROM ./jina-code-embeddings-1.5b-Q8_0.gguf
PARAMETER num_ctx 8192
EOF

# 3. Register in Ollama
ollama create jina-code-embeddings -f Modelfile

# 4. Verify
ollama list | grep jina
```
| Spec | Value |
|---|---|
| Parameters | 1.5B |
| Dimensions | 1536 |
| Context | 32,768 tokens (recommended ≤ 8,192) |
| Quantization | Q8_0 (~1.6 GB) |
| Languages | 15+ programming languages |
Task-specific prefixes (nl2code, code2code, techqa) are applied automatically.
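A task-specific prefix is simply an instruction string prepended to the text before it is embedded, so queries and code land in comparable regions of the vector space. A sketch of the idea (the prefix wording below is a placeholder assumption; the model's own documentation defines the exact instruction texts):

```typescript
// Sketch of task-prefixed embedding input. The prefix wording is a
// placeholder assumption, not the model's official instruction strings.
type EmbeddingTask = "nl2code" | "code2code" | "techqa";

const TASK_PREFIXES: Record<EmbeddingTask, string> = {
  nl2code: "Find the most relevant code snippet given the following query:\n",
  code2code: "Find an equivalent code snippet given the following code snippet:\n",
  techqa: "Find the most relevant answer given the following question:\n",
};

function toEmbeddingInput(task: EmbeddingTask, text: string): string {
  return TASK_PREFIXES[task] + text;
}
```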
Comparison with Alternatives
Feature Matrix
Deployment
| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
|---|---|---|---|---|---|---|---|
| Open source | ✅ MIT | ✅ MIT | ✅ MIT | ❌ | ⚠️ Partial | ❌ | ⚠️ 1 |
| Fully local | ✅ | ✅ | ✅ | ⚠️ No 2 | ❌ | ❌ | ✅ |
Search Quality
| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
|---|---|---|---|---|---|---|---|
| Code embeddings | ✅ Jina 3 | ⚠️ 4 | ❌ 5 | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ✅ |
| Vector database | ✅ Qdrant | SQLite | ChromaDB | Propri. | Propri. | pgvector | Qdrant |
| AST chunking | ✅ | ❌ | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ✅ |
| Query expansion | ✅ 6 | ❌ | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ❌ |
Developer Experience
| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
|---|---|---|---|---|---|---|---|
| Real-time watching | ✅ Auto | ❌ | ❌ | ⚠️ CI/CD | ❌ | ⚠️ Partial | ⚠️ Partial |
| Embedding cache | ✅ SQLite | ⚠️ Partial | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ❌ |
| Multi-project | ✅ Groups | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
| One-cmd install | ✅ | ⚠️ Partial | ⚠️ Partial | ❌ | ❌ | ❌ | ❌ |
AI Integration
| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
|---|---|---|---|---|---|---|---|
| MCP native | ✅ | ❌ | ❌ | ❌ | ❌ | ⚠️ API | ❌ |
| LSP integration | ✅ CCLSP | ❌ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ |
| Token metrics | ✅ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ | ❌ |
| Git history | ✅ | ❌ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ |
| Ticket extraction | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Pricing
| | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
|---|---|---|---|---|---|---|---|
| Cost | ✅ Free | ✅ Free | ✅ Free | ❌ Paid | ❌ Paid | ❌ Paid | ⚠️ Archived |
Notes
1. Bloop archived January 2, 2025
2. Augment Context Engine indexes locally but stores vectors in the cloud
3. Jina Code Embeddings 1.5B (1536 dims) with task-specific prefixes (nl2code, code2code, techqa)
4. Vexify supports Ollama models but is limited to specific embeddings (jina-embeddings-2-base-code, nomic-embed-text)
5. SeaGOAT is locked to all-MiniLM-L6-v2 (384 dims, general-purpose)
6. Abbreviations, case variants, plurals, filler word removal
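The query expansion described in note 6 can be pictured as lexical normalization before embedding: drop filler words, add expanded forms of abbreviations. A hypothetical sketch (the word lists and behavior are illustrative, not Paparats's actual expansion logic):

```typescript
// Hypothetical sketch of lexical query expansion: filler-word removal
// plus abbreviation expansion. Word lists are illustrative only.
const FILLERS = new Set(["the", "a", "an", "please", "how", "do", "i"]);
const ABBREVIATIONS: Record<string, string> = {
  auth: "authentication",
  db: "database",
  config: "configuration",
};

function expandQuery(query: string): string[] {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t && !FILLERS.has(t));
  const expanded = new Set(terms);
  for (const t of terms) {
    if (ABBREVIATIONS[t]) expanded.add(ABBREVIATIONS[t]);
  }
  return [...expanded];
}
```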
Token Savings Metrics
What we measure (and what we don't)
Paparats provides estimated token savings to help you understand the order of magnitude of context reduction. These are heuristics, not precise measurements.
Per-search response
```json
{
  "metrics": {
    "tokensReturned": 150,
    "estimatedFullFileTokens": 5000,
    "tokensSaved": 4850,
    "savingsPercent": 97
  }
}
```
| Field | Calculation | Reality check |
|---|---|---|
| `tokensReturned` | `ceil(content.length / 4)` | Based on actual returned content; `/4` is a rough approximation |
| `estimatedFullFileTokens` | `ceil(endLine * 50 / 4)` | Heuristic: assumes 50 chars/line; never loads actual files |
| `tokensSaved` | `estimated - returned` | Derived: difference between two estimates |
| `savingsPercent` | `(saved / estimated) * 100` | Relative: percentage of a heuristic estimate |
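The documented formulas are easy to reproduce. This standalone sketch mirrors the heuristics from the table for illustration (a re-implementation, not Paparats's source):

```typescript
// Re-implementation of the documented token-savings heuristics.
// ~4 chars/token and 50 chars/line are the rough constants from the table.
function searchMetrics(content: string, endLine: number) {
  const tokensReturned = Math.ceil(content.length / 4);
  const estimatedFullFileTokens = Math.ceil((endLine * 50) / 4);
  const tokensSaved = estimatedFullFileTokens - tokensReturned;
  const savingsPercent = Math.round((tokensSaved / estimatedFullFileTokens) * 100);
  return { tokensReturned, estimatedFullFileTokens, tokensSaved, savingsPercent };
}
```

With a 600-character chunk from a 400-line file, this reproduces the numbers in the example response above.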
Cumulative stats
```bash
curl -s http://localhost:9876/api/stats | jq '.usage'
```

```json
{
  "searchCount": 47,
  "totalTokensSaved": 152340,
  "avgTokensSavedPerSearch": 3241
}
```
These are sums of estimates, not measured token counts from a real tokenizer.
License
MIT
Releasing (maintainers)
Full release checklist
```bash
# 1. Commit all changes, clean working tree
git status                          # must be clean

# 2. Bump version, sync to all packages, commit (no tag, no push)
yarn release minor                  # 0.1.x → 0.2.0 (or: patch, major, or explicit version)

# 3. Publish npm packages
npm login                           # if needed
yarn publish:npm                    # publishes @paparats/shared + @paparats/cli

# 4. Tag and push (triggers CI workflows)
yarn release:push                   # creates git tag, pushes branch + tag

# 5. Build and push Docker images
./scripts/release-docker.sh --push  # builds and pushes all 3 images
```
What each step does
| Step | Script | Effect |
|---|---|---|
| `yarn release <ver>` | `scripts/release.js` | Bumps version in root `package.json`, syncs to all packages via `sync-version.js`, commits |
| `yarn publish:npm` | root scripts | Publishes `@paparats/shared` and `@paparats/cli` to npm |
| `yarn release:push` | `scripts/release-push.js` | Creates `v{version}` tag, pushes branch + tag. Triggers `docker-publish.yml` and `publish-mcp.yml` |
| `./scripts/release-docker.sh` | `scripts/release-docker.sh` | Builds `ibaz/paparats-server`, `ibaz/paparats-indexer`, `ibaz/paparats-ollama` with version + `latest` tags. `--push` pushes to Docker Hub |
Docker images
| Image | Source | Size |
|---|---|---|
| `ibaz/paparats-server` | `packages/server/Dockerfile` | ~200 MB |
| `ibaz/paparats-indexer` | `packages/indexer/Dockerfile` | ~200 MB |
| `ibaz/paparats-ollama` | `packages/ollama/Dockerfile` | ~3 GB (includes model) |
Contributing
Contributions welcome! Areas of interest:
- Additional language support (PHP, Elixir, Scala, Kotlin, Swift)
- Alternative embedding providers (OpenAI, Cohere, local GGUF via llama.cpp)
- Performance optimizations (chunking strategies, cache eviction)
- Agent use cases (support bots, QA automation, code analytics)
Open an issue or pull request to get started.
Links
- Jina Code Embeddings β embedding model
- CCLSP β LSP integration for MCP
- Qdrant β vector database
- Ollama β local LLM runtime
- MCP β Model Context Protocol
Star the repo if Paparats helps you code faster!