PindeX
PindeX: MCP Codebase Indexer
Structural codebase indexing for AI coding assistants: significantly fewer tokens for code exploration in medium-to-large projects.
PindeX is an MCP (Model Context Protocol) server that parses your project with tree-sitter and regex-based extractors, stores symbols, imports, and dependency graphs in a local SQLite database, and exposes 14 targeted tools so AI assistants can answer questions about your code, and your documentation, without reading entire files.
Supported languages: TypeScript, JavaScript, Java, Kotlin, Python, PHP, Vue, Svelte, Ruby, C#, Go, Rust
Contents
- How It Works
- When Is PindeX Worth Using?
- Requirements
- Installation
- Quick Start
- Multi-Project & Federation
- MCP Tools
- Environment Variables
- Integrations
- CLI Reference
- Monitoring Dashboard
- Development
- Project Structure
How It Works
Your project files
  │
  ├── .ts/.js ──► tree-sitter AST ──► symbols, imports, dependencies
  │                                        │
  └── .md/.yaml/.txt ──► chunker ──► documents (heading/line chunks)
                                           │
Claude calls save_context(…) ──► context_entries
                                           │
                                           ▼
SQLite (FTS5), stored in ~/.pindex/projects/{hash}/index.db
  ├── files (path, hash, language, token estimate)
  ├── symbols (name, kind, signature, lines) ──► search_symbols
  ├── dependencies (import graph) ──► get_dependencies
  ├── usages (symbol → call sites) ──► find_usages
  ├── documents (text chunks from .md/.yaml/.txt) ──► search_docs
  ├── context_entries (notes saved by Claude mid-session) ──► search_docs
  └── token_log (per-session metrics)
        │
        ▼
14 MCP tools ◄── stdio ──► Claude Code / Goose / any MCP client
Instead of sending full file contents to the AI, PindeX lets it call search_symbols, search_docs, get_context, or get_file_summary, returning only what it actually needs.
Claude can also persist important facts across sessions with save_context, then retrieve them later with search_docs instead of re-reading large files.
Token savings are tracked per session and visible in a live web dashboard.
When Is PindeX Worth Using?
PindeX adds a fixed overhead per API turn (its 14 tool definitions are sent with every request). The overhead pays off when the project is large enough, and the questions complex enough, that the savings from targeted symbol lookups exceed that cost.
Realism benchmark (Sonnet 4.6, N=1, 6 Q&A tasks per codebase)
A/B comparison: vanilla Claude Code (no PindeX MCP server, only native Read/Grep/Glob) vs. Claude Code with PindeX registered, on identical prompts.
| Codebase | Baseline tokens | PindeX tokens | Ratio | Reduction |
|---|---|---|---|---|
| PindeX itself (~50 files) | 742 607 | 603 160 | 0.812 | −19 % |
| microsoft/typescript-eslint (~600 files) | 948 848 | 779 469 | 0.821 | −18 % |
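The Ratio and Reduction columns follow directly from the two token counts in each row. A minimal sketch of that arithmetic, using a hypothetical helper named `compareRuns` that is not part of PindeX:

```typescript
// Illustrative helper (not a PindeX export): derive the table's Ratio and
// Reduction columns from the raw token counts.
interface Comparison {
  ratio: number;        // PindeX tokens / baseline tokens, rounded to 3 decimals
  reductionPct: number; // whole-percent reduction relative to the baseline
}

function compareRuns(baselineTokens: number, pindexTokens: number): Comparison {
  const rawRatio = pindexTokens / baselineTokens;
  return {
    ratio: Math.round(rawRatio * 1000) / 1000,
    reductionPct: Math.round((1 - rawRatio) * 100),
  };
}

// Token counts copied from the two benchmark rows above.
const pindexRepo = compareRuns(742_607, 603_160);
const eslintRepo = compareRuns(948_848, 779_469);
console.log(pindexRepo, eslintRepo);
```

Both rows reproduce the published figures when rounded this way, so the table is internally consistent.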
Where the savings come from: multi-hop / dependency-graph queries (e.g. "which modules does src/indexer/index.ts import?", "explain how processParsedFile interacts with the AST diff engine") deliver per-task ratios of 0.55 / 0.65, where PindeX wins clearly. Single-symbol lookups (e.g. "where is class X defined?") wash to ratio ~1.0; Claude can read the project's CLAUDE.md from the prompt cache and answer directly without any file reads in either condition.
100 % cache-read share in both conditions means the 14-tool overhead is fully amortised by Anthropic's prompt cache after the first turn. There is no latency penalty.
Full report: benchmarks/results/2026-04-25-realism-3.md. Re-run yourself with npm run bench:realism.
Caveats
- N=1 across 12 tasks total. Stochastic variance means the headline number is in the 15–25 % range, not a precise 18.5 %. For marketing-grade numbers, repeat at N=3.
- Q&A workload only. Coding tasks ("implement feature X") are not measured; the cost profile is likely different and possibly less favourable.
- TypeScript-heavy codebases. Both targets are pure TS / TS-monorepo. PindeX's regex extractors for Python / Java / Go / Rust have lower symbol-extraction quality than tree-sitter-typescript; the benefit on those stacks may be smaller.
- Older marketing copy claimed "80–90 % token reduction". That number was aspirational and is not what the realism benchmark produces. The honest figure is ~18–19 %.
Rule of thumb
PindeX is beneficial when:
| Condition | Threshold |
|---|---|
| Number of files | ≥ 40 |
| Average file length | ≥ 150 lines/file |
| Question type | Multi-hop / dependency / impact analysis (single-symbol lookups break even) |
Built-in recommendation: get_project_overview always returns an index_recommendation field:
{
  "index_recommendation": {
    "worthwhile": false,
    "reason": "Small project (25 files, avg ~76 lines/file) - direct reads may be more efficient than index overhead",
    "avgFileLinesEstimate": 76,
    "breakEvenFiles": 40
  }
}
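The rule of thumb can be approximated in a few lines. This is a sketch only: the function name `recommendIndex` is hypothetical, and combining the two thresholds with a logical AND is an assumption. The real logic behind get_project_overview may weigh other factors.

```typescript
// Hypothetical re-implementation of the rule-of-thumb thresholds; the actual
// check inside get_project_overview is not shown in this document.
const BREAK_EVEN_FILES = 40;      // "Number of files" threshold from the table
const BREAK_EVEN_AVG_LINES = 150; // "Average file length" threshold

interface IndexRecommendation {
  worthwhile: boolean;
  avgFileLinesEstimate: number;
  breakEvenFiles: number;
}

function recommendIndex(fileCount: number, totalLines: number): IndexRecommendation {
  const avg = fileCount > 0 ? Math.round(totalLines / fileCount) : 0;
  return {
    // Assumption: both thresholds must be met for the index to pay off.
    worthwhile: fileCount >= BREAK_EVEN_FILES && avg >= BREAK_EVEN_AVG_LINES,
    avgFileLinesEstimate: avg,
    breakEvenFiles: BREAK_EVEN_FILES,
  };
}

const smallProject = recommendIndex(25, 1_900);    // 25 files, ~76 lines each
const largeProject = recommendIndex(600, 120_000); // 600 files, ~200 lines each
console.log(smallProject, largeProject);
```

The first call mirrors the 25-file example above and comes out as not worthwhile; the second clears both thresholds.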
Where PindeX's value is largest: sessions that bounce across many files (impact analysis, refactoring research, onboarding to an unfamiliar codebase). Sessions that are mostly editing one file get little benefit and may pay net overhead.
Requirements
| Dependency | Version |
|---|---|
| Node.js | ≥ 18.0.0 |
| npm | ≥ 8 |
| Operating System | macOS, Linux, Windows (WSL recommended) |
Note: better-sqlite3 ships prebuilt binaries for most platforms. If your environment is unusual, npm install will compile from source, so you'll need python3 and a C++ compiler (build-essential / Xcode CLT).
Installation
Install from npm (recommended)
npm install -g pindex
Install from source
git clone https://github.com/phash/PindeX.git
cd PindeX
npm install
npm run build
npm install -g .
This makes three commands available globally:
| Command | Purpose |
|---|---|
| pindex | CLI: init, federation, status |
| pindex-server | MCP stdio server (Claude Code spawns this automatically) |
| pindex-gui | Aggregated dashboard for all projects |
Quick Start
1. Set up a project
Run pindex (with no arguments) in any project directory:
cd /my/project
pindex
PindeX will:
- Walk upward from your current directory to find the project root (package.json, .git, etc.)
- Assign a dedicated monitoring port for this project
- Write .mcp.json into the project root with absolute paths
- Register the project in ~/.pindex/registry.json
- Inject a PindeX workflow section into CLAUDE.md (created if missing)
- Add a PreToolUse hook to .claude/settings.json to remind Claude to prefer PindeX tools
Output:
╔══════════════════════════════════════════╗
║             PindeX - Ready               ║
╚══════════════════════════════════════════╝
  Project   : /my/project
  Index     : ~/.pindex/projects/a3f8b2c1/index.db
  Port      : 7856
  Config    : .mcp.json (written)
  CLAUDE.md : section added
  Hooks     : created
── Next steps ─────────────────────────────
  1. Restart Claude Code in this directory
  2. Open the dashboard: pindex-gui
2. Restart Claude Code
Claude Code auto-discovers .mcp.json. On the next startup it will spawn pindex-server with the correct PROJECT_ROOT, and the index is built automatically in the background.
3. Use the tools
Once connected, your AI assistant can call tools like:
search_symbols("AuthService")
get_file_summary("src/auth/service.ts")
get_context("src/auth/service.ts", 42, 20)
find_usages("validateToken")
get_dependencies("src/api/routes.ts", "both")
# Documentation and context memory:
search_docs("authentication JWT") # search CLAUDE.md, README.md, …
get_doc_chunk("CLAUDE.md", 2) # read one section only
save_context("Decision: use JWT …", "auth") # store for future sessions
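The calls above are shorthand. On the wire, an MCP client wraps each one in a JSON-RPC 2.0 `tools/call` request and sends it over stdio. The sketch below shows that envelope for search_symbols; `buildToolCall` is an illustrative helper, not a PindeX export, and a real client (Claude Code, Goose) builds these messages for you after completing the MCP initialization handshake.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Illustrative helper (hypothetical name): wrap a tool name and its arguments
// in the request envelope, assigning a fresh request id each time.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "query" and "limit" are the documented search_symbols parameters.
const request = buildToolCall("search_symbols", { query: "AuthService", limit: 5 });

// The stdio transport carries one JSON message per line.
const wire = JSON.stringify(request) + "\n";
console.log(wire);
```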
4. Open the dashboard
pindex-gui
Opens http://localhost:7842, an aggregated dashboard showing token savings, symbol counts, and session stats for all registered projects. The savings figures reflect tokens saved on PindeX tool calls; see the note in How It Works for context.
Multi-Project & Federation
Multiple independent projects
Each project gets its own .mcp.json (pointing to its own PROJECT_ROOT) and its own SQLite database at ~/.pindex/projects/{hash}/index.db. When Claude Code opens Project A, it spawns pindex-server with PROJECT_ROOT=/path/to/project-a, so it never touches Project B's index.
cd /project-a && pindex # registers project-a
cd /project-b && pindex # registers project-b, different port + different DB
Linking repos (federation)
Federate other indexed PindeX projects into the current project so all read-only tools can search across them in one query.
CLI commands:
cd /my/main-project
pindex federate add /path/to/other-repo
pindex federate list
pindex federate remove other-repo
Federated repos appear under stable names (the directory basename, with a hash suffix on collision). Every search/lookup tool returns results tagged with project: <name>.
Scoping queries to federated repos:
To scope a query to a subset of federated repos, use the optional repos parameter in any read-only tool:
// MCP tool call (example)
search_symbols({ "query": "AuthService", "repos": ["main", "auth-service"] })
The repos parameter accepts a list of project names and restricts results to only those repos. Omit the parameter to search all federated repos + the main project.
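The scoping semantics can be illustrated with a small filter over project-tagged results. This is only a sketch of the observable behaviour: `scopeToRepos` is a hypothetical name, and PindeX may well restrict which databases it queries instead of filtering afterwards.

```typescript
// Every federated result carries a `project` tag, per the description above.
interface TaggedSymbol {
  project: string;
  name: string;
}

// Hypothetical filter: keep only results whose project is listed in `repos`.
// Omitting `repos` returns everything (main project plus all federated repos).
function scopeToRepos<T extends { project: string }>(results: T[], repos?: string[]): T[] {
  if (repos === undefined) return results;
  const allowed = new Set(repos);
  return results.filter((r) => allowed.has(r.project));
}

// Made-up sample results for illustration.
const sampleResults: TaggedSymbol[] = [
  { project: "main", name: "AuthService" },
  { project: "auth-service", name: "AuthService" },
  { project: "billing", name: "AuthServiceClient" },
];

const scoped = scopeToRepos(sampleResults, ["main", "auth-service"]);
const unscoped = scopeToRepos(sampleResults);
console.log(scoped.length, unscoped.length);
```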
Which tools support federation:
The 9 read-only tools (search_symbols, find_usages, get_symbol, get_file_summary, get_context, get_dependencies, get_project_overview, search_docs, get_doc_chunk) accept an optional repos: string[] param to scope to specific federated repos.
The 5 write/session tools (reindex, save_context, get_session_memory, start_comparison, get_token_stats) stay strictly local and do not accept the repos param.
Legacy env var:
The FEDERATION_REPOS env var (colon-separated paths) is still supported for backward compatibility and is what the CLI writes to .mcp.json:
FEDERATION_REPOS=/path/to/repo-a:/path/to/repo-b pindex-server
For new setups, prefer the CLI commands (pindex federate add/remove/list) over manually editing the env var.
View all projects:
pindex status
3 registered project(s):
[idle] project-a + 1 federated repo
/home/user/project-a
port: 7856 index: ~/.pindex/projects/a3f8b2c1/
[idle] project-b
/home/user/project-b
port: 7901 index: ~/.pindex/projects/f1e2d3c4/
...
MCP Tools
All 14 tools are available over stdio transport.
Code tools
search_symbols
Full-text search across all indexed symbols (names, signatures, summaries) using SQLite FTS5.
When federation is active, results from linked repos include a project field.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | ✓ | Search term (supports FTS5 syntax) |
| limit | number | | Max results per project (default: 20) |
Returns: List of matching symbols with name, kind, signature, file path, and line number.
get_symbol
Detailed information about a specific symbol including its signature, location, and the files it depends on.
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | ✓ | Symbol name |
| file | string | | Narrow results to a specific file path |
Returns: Symbol record + file-level dependency list.
get_context
Read a slice of a source file centred around a given line. Files are read from disk at call time; only metadata lives in the DB.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | string | ✓ | File path (relative to PROJECT_ROOT) |
| line | number | ✓ | Centre line |
| range | number | | Lines above and below (default: 30) |
Returns: Code snippet with detected language and line numbers.
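A rough sketch of how such a centred window can be computed. The helper name `sliceAround` and the choice of 1-based line numbers are assumptions; the document does not show the actual implementation.

```typescript
interface ContextSlice {
  startLine: number; // first line included (1-based, assumed convention)
  endLine: number;   // last line included
  text: string;
}

// Hypothetical helper: return `range` lines above and below `line`,
// clamped to the bounds of the file.
function sliceAround(source: string, line: number, range = 30): ContextSlice {
  const lines = source.split("\n");
  const startLine = Math.max(1, line - range);
  const endLine = Math.min(lines.length, line + range);
  return {
    startLine,
    endLine,
    text: lines.slice(startLine - 1, endLine).join("\n"),
  };
}

// A made-up 100-line file: "line 1" ... "line 100".
const fakeFile = Array.from({ length: 100 }, (_, i) => `line ${i + 1}`).join("\n");

const middle = sliceAround(fakeFile, 42, 20); // same arguments as the Quick Start call
const nearTop = sliceAround(fakeFile, 5, 20); // window is clamped at line 1
console.log(middle.startLine, middle.endLine, nearTop.startLine, nearTop.endLine);
```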
get_file_summary
High-level overview of a file without loading its full content.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | string | ✓ | File path |
Returns: Language, summary text, all symbols (with kind + signature), imports, and exports.
find_usages
All locations in the codebase where a symbol is referenced.
| Parameter | Type | Required | Description |
|---|---|---|---|
| symbol | string | ✓ | Symbol name to look up |
Returns: List of { file, line, context } entries.
get_dependencies
Import graph for a file: what it imports, what imports it, or both.
| Parameter | Type | Required | Description |
|---|---|---|---|
| target | string | ✓ | File path |
| direction | "imports" \| "imported_by" \| "both" | | Default: "both" |
Returns: Dependency list with resolved file paths and imported symbol names.
get_project_overview
Project-wide statistics; no parameters required. When federation is active, also includes stats for each linked repository.
Returns: Total file count, dominant language, entry points (index, main, app files), module list with symbol counts, an index_recommendation field (see When Is PindeX Worth Using?), and (if federated) per-repo breakdowns.
reindex
Rebuild the index for a single file or the entire project.
| Parameter | Type | Required | Description |
|---|---|---|---|
| target | string | | File path or omit for full project reindex |
Returns: Count of indexed / updated / error files.
get_token_stats
Token usage and savings statistics for a session.
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | string | | Defaults to "default" |
Returns: Total tokens used, estimated tokens without the index, net savings, and savings percentage.
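To make the relationship between these figures concrete, here is a minimal sketch. The field names and the `summarizeTokens` helper are hypothetical, and the cost line assumes the price is applied as USD per million tokens, as TOKEN_PRICE_PER_MILLION suggests.

```typescript
// Hypothetical result shape; the real tool's field names may differ.
interface TokenSummary {
  tokensUsed: number;
  estimatedWithoutIndex: number;
  netSavings: number;
  savingsPct: number;
  estimatedUsdSaved: number;
}

function summarizeTokens(
  tokensUsed: number,
  estimatedWithoutIndex: number,
  pricePerMillionUsd = 3.0, // default of TOKEN_PRICE_PER_MILLION
): TokenSummary {
  const netSavings = estimatedWithoutIndex - tokensUsed;
  // Guard against division by zero for an empty session.
  const savingsPct =
    estimatedWithoutIndex > 0 ? (netSavings * 100) / estimatedWithoutIndex : 0;
  return {
    tokensUsed,
    estimatedWithoutIndex,
    netSavings,
    savingsPct,
    estimatedUsdSaved: (netSavings / 1_000_000) * pricePerMillionUsd,
  };
}

// Made-up numbers for illustration only.
const sessionSummary = summarizeTokens(8_000, 10_000);
console.log(sessionSummary);
```

Note that the "estimated tokens without the index" input is itself a heuristic, so the derived savings are estimates as well.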
start_comparison
Create a labelled A/B testing session to compare indexed vs. baseline token usage.
| Parameter | Type | Required | Description |
|---|---|---|---|
| label | string | ✓ | Human-readable session name |
| mode | "indexed" \| "baseline" | ✓ | Tracking mode |
Returns: session_id and the monitoring dashboard URL.
get_session_memory
Query passive session observations: facts and patterns the system recorded automatically by observing tool calls and file changes. No save_context calls required.
| Parameter | Type | Required | Description |
|---|---|---|---|
| session_id | string | | Filter by session ID |
| file | string | | Filter by file path |
| symbol | string | | Filter by symbol name |
Returns: List of observations with type, content, linked symbol/file, and a stale flag (set when the linked symbol changed since the observation was recorded).
Observations are also surfaced automatically inside get_project_overview, get_symbol, and get_file_summary, so you rarely need to call this tool directly.
Document & context tools
These three tools extend PindeX beyond code: documentation files are indexed automatically alongside source files, and Claude can persist notes to a persistent knowledge store.
What gets indexed as documents:
| File type | Chunking strategy |
|---|---|
| .md / .markdown | Split at # / ## / ### heading boundaries; each section is one chunk |
| .yaml / .yml | Fixed 50-line chunks |
| .txt | Fixed 50-line chunks |
Documents are discovered by indexAll() and kept in sync by the same MD5-hash incremental indexer used for code files.
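The two chunking strategies can be sketched as follows. This is an approximation under stated assumptions: the `Chunk` shape borrows field names from the get_doc_chunk return value, while `chunkMarkdown` and `chunkFixed` are hypothetical names and the heading regex is a guess at what counts as a boundary.

```typescript
interface Chunk {
  index: number;
  heading: string | null;
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  content: string;
}

// Split markdown at #, ##, and ### headings; deeper headings stay inside a chunk.
function chunkMarkdown(text: string): Chunk[] {
  const lines = text.split("\n");
  const chunks: Chunk[] = [];
  let start = 0;
  let heading: string | null = null;

  // Emit lines[start, end) as one chunk unless it is empty.
  const flush = (end: number): void => {
    const content = lines.slice(start, end).join("\n");
    if (content.trim() !== "") {
      chunks.push({ index: chunks.length, heading, startLine: start + 1, endLine: end, content });
    }
  };

  lines.forEach((line, i) => {
    const match = /^#{1,3}\s+(.*)$/.exec(line);
    if (match) {
      flush(i);
      start = i;
      heading = (match[1] ?? "").trim();
    }
  });
  flush(lines.length);
  return chunks;
}

// Fixed-size windows for .yaml / .yml / .txt files.
function chunkFixed(text: string, size = 50): Chunk[] {
  const lines = text.split("\n");
  const chunks: Chunk[] = [];
  for (let i = 0; i < lines.length; i += size) {
    const end = Math.min(i + size, lines.length);
    const content = lines.slice(i, end).join("\n");
    if (content.trim() !== "") {
      chunks.push({ index: chunks.length, heading: null, startLine: i + 1, endLine: end, content });
    }
  }
  return chunks;
}

const mdChunks = chunkMarkdown("intro\n# A\ntext a\n## B\ntext b");
const yamlText = Array.from({ length: 120 }, (_, i) => `key${i}: value`).join("\n");
const yamlChunks = chunkFixed(yamlText);
console.log(mdChunks.length, yamlChunks.length);
```

One limitation of this sketch: a line starting with `# ` inside a fenced code block would also be treated as a heading.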
search_docs
Full-text search (FTS5) across indexed document chunks and saved context entries. Use this instead of loading entire documentation files.
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | ✓ | Search term |
| limit | number | | Max results (default: 20) |
| type | "docs" \| "context" \| "all" | | Filter by source (default: "all") |
Returns: List of matches, each with:
- type: "doc" (from a file) or "context" (saved by Claude)
- content_preview: first 200 characters of the chunk
- file, heading, start_line: for "doc" results, enables precise navigation
- tags, session_id, created_at: for "context" results
get_doc_chunk
Retrieve the full content of one or all chunks of an indexed document.
More token-efficient than get_context for large documentation files because it returns pre-segmented sections.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | string | ✓ | File path (project-relative) |
| chunk_index | number | | Specific chunk to retrieve; omit for all chunks |
Returns: { file, total_chunks, chunks: [{ index, heading, start_line, end_line, content }] }
Typical workflow:
search_docs("authentication JWT")
→ { file: "CLAUDE.md", heading: "Authentication", start_line: 12, chunk_index: 2 }
get_doc_chunk("CLAUDE.md", 2)
→ full text of the Authentication section only
save_context
Persist an important fact, decision, or snippet to the context store.
Entries are searchable across all future sessions via search_docs.
Use this to offload information from the context window: instead of keeping a long summary in the prompt, write it once and retrieve it on demand.
| Parameter | Type | Required | Description |
|---|---|---|---|
| content | string | ✓ | The text to store |
| tags | string | | Comma-separated keywords for better retrieval (e.g. "auth,jwt,security") |
Returns: { id, session_id, created_at }
Example (saving a decision):
save_context(
"JWT expiry: access=1h, refresh=7d. Refresh stored in Redis. See src/auth/tokens.ts.",
"auth,jwt,redis"
)
Example (retrieving it in a later session):
search_docs("JWT expiry", type: "context")
→ { content_preview: "JWT expiry: access=1h, refresh=7d …", tags: "auth,jwt,redis" }
Environment Variables
These are set automatically in the generated .mcp.json, so you rarely need to change them by hand.
| Variable | Default | Description |
|---|---|---|
| PROJECT_ROOT | . | Root directory of the project to index |
| INDEX_PATH | ~/.pindex/projects/{hash}/index.db | Path to the SQLite database |
| LANGUAGES | typescript,javascript | Comma-separated list of languages to index. Supported values: typescript, javascript, java, kotlin, python, php, vue, svelte, ruby, csharp, go, rust |
| AUTO_REINDEX | true | Watch for file changes and reindex automatically |
| MONITORING_PORT | assigned per-project | Port for the live dashboard + WebSocket |
| MONITORING_AUTO_OPEN | false | Open the dashboard in the browser on startup |
| BASELINE_MODE | false | Disable the index entirely (for A/B baseline sessions) |
| GENERATE_SUMMARIES | false | Generate LLM summaries per file and symbol (requires SUMMARIZER_API_KEY) |
| SUMMARIZER_API_KEY | (empty) | API key for the LLM summarization provider |
| SUMMARIZER_BASE_URL | https://api.openai.com/v1 | Base URL for the OpenAI-compatible chat completions API. Works with OpenAI, Ollama, LiteLLM, Anthropic proxy, etc. |
| SUMMARIZER_MODEL | gpt-4o-mini | Model name for summarization requests |
| TOKEN_PRICE_PER_MILLION | 3.00 | USD price per million tokens, used for cost estimates |
| PINDEX_PARSE_WORKERS | (empty) | Number of parse workers (0 = synchronous, empty = auto) |
| PINDEX_BIND_HOST | 127.0.0.1 | Bind host for monitoring/GUI (loopback by default) |
| PINDEX_LSP | true | LSP-based parsing for Python, enabled by default (set to false to force the regex extractor) |
| FEDERATION_REPOS | (empty) | Colon-separated absolute paths to linked repositories |
| DOCUMENT_PATTERNS | **/*.md,**/*.markdown,**/*.yaml,**/*.yml,**/*.txt | Glob patterns for document files to index alongside code |
| OBSERVATION_RETENTION | permanent | How long passive session observations are kept: permanent, session, or Nd (e.g. 30d) |
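OBSERVATION_RETENTION is the only variable in the table with a small grammar of its own (permanent, session, or Nd). A sketch of how such a value could be parsed is shown below; `parseRetention` is a hypothetical helper, and throwing on invalid input is a design choice of this example, not documented PindeX behaviour.

```typescript
type Retention =
  | { kind: "permanent" }
  | { kind: "session" }
  | { kind: "days"; days: number };

// Hypothetical parser for the three documented forms: permanent, session, Nd.
function parseRetention(raw: string | undefined): Retention {
  const value = (raw ?? "permanent").trim().toLowerCase(); // documented default
  if (value === "permanent") return { kind: "permanent" };
  if (value === "session") return { kind: "session" };
  const match = /^(\d+)d$/.exec(value);
  if (match) return { kind: "days", days: Number(match[1]) };
  // Assumption: reject anything else loudly instead of guessing.
  throw new Error(`Invalid OBSERVATION_RETENTION value: ${String(raw)}`);
}

console.log(parseRetention("30d"), parseRetention(undefined));
```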
Integrations
Claude Code
Run pindex in each project you want to index. The command writes .mcp.json automatically:
cd /my/project
pindex
# → .mcp.json written
# restart Claude Code → pindex-server starts automatically
The .mcp.json format (auto-generated, do not edit by hand):
{
"mcpServers": {
"pindex": {
"command": "pindex-server",
"args": [],
"env": {
"PROJECT_ROOT": "/absolute/path/to/project",
"INDEX_PATH": "/home/user/.pindex/projects/a3f8b2c1/index.db",
"MONITORING_PORT": "7856",
"AUTO_REINDEX": "true",
"GENERATE_SUMMARIES": "false",
"MONITORING_AUTO_OPEN": "false",
"BASELINE_MODE": "false",
"TOKEN_PRICE_PER_MILLION": "3.00"
}
}
}
}
With federation (pindex add /other/project):
{
"mcpServers": {
"pindex": {
"command": "pindex-server",
"args": [],
"env": {
"PROJECT_ROOT": "/absolute/path/to/project",
"FEDERATION_REPOS": "/absolute/path/to/other/project",
"..."
}
}
}
}
Goose
Goose reads extensions from ~/.config/goose/config.yaml.
Step 1: Install PindeX:
git clone https://github.com/phash/PindeX.git
cd PindeX && npm install && npm run build && npm install -g .
Step 2: Run pindex in your project to get the assigned hash and port:
cd /my/project && pindex
Step 3: Edit ~/.config/goose/config.yaml:
extensions:
pindex:
name: PindeX
type: stdio
cmd: pindex-server
args: []
envs:
PROJECT_ROOT: /absolute/path/to/project
INDEX_PATH: /home/user/.pindex/projects/{hash}/index.db
LANGUAGES: typescript,javascript
AUTO_REINDEX: "true"
GENERATE_SUMMARIES: "false"
MONITORING_PORT: "{port}"
MONITORING_AUTO_OPEN: "false"
BASELINE_MODE: "false"
TOKEN_PRICE_PER_MILLION: "3.00"
enabled: true
timeout: 300
Replace {hash} and {port} with the values shown by pindex. A ready-to-copy template is available in goose-extension.yaml.
Step 4: Restart Goose:
goose session start
Python (LSP)
PindeX ships with Pyright as an optional dependency. When installed, Pyright's LSP server produces the precise symbol tree instead of the fallback regex extractor. Set PINDEX_LSP=false to opt out. If Pyright is missing (for example when installed with --no-optional), PindeX logs a one-time warning and falls back to the regex path.
CLI Reference
pindex [command] [options]
| Command | Description |
|---|---|
| (no args) / init | Set up this project: write .mcp.json, inject CLAUDE.md section + hooks, register globally |
| reinit | Re-inject PindeX section into CLAUDE.md and .claude/settings.json (e.g. after an update) |
| reinit --force | Replace the existing section with the current template |
| add <path> | Link another repo for cross-repo search (federation) |
| remove | Fully unregister project: remove .mcp.json, CLAUDE.md section, hooks, stop daemon |
| remove <path> | Remove a federated repo link only |
| setup | One-time global setup (autostart config) |
| status | Show all registered projects and their status |
| list | List all registered projects (compact) |
| index [path] | Manually index a directory (default: current directory) |
| index --force | Force full reindex, bypassing MD5 hash checks |
| gui | Open the aggregated monitoring dashboard in the browser |
| stats | Print a short stats summary |
| uninstall | Stop all daemons (data stays in ~/.pindex) |
Examples:
# Set up a new project
cd /my/project && pindex
# Link project-b for cross-repo search
pindex add /my/project-b
# Check all registered projects
pindex status
# Manually force a full reindex
pindex index --force
# Index a Java + Vue project
LANGUAGES=typescript,javascript,java,vue pindex index --force
# Re-inject CLAUDE.md section (e.g. after updating pindex)
pindex reinit --force
# Fully remove pindex from a project
pindex remove
# Open the dashboard
pindex-gui
Monitoring Dashboard
Per-project dashboard
Each pindex-server instance starts a monitoring server on its assigned port. Open it at:
http://localhost:{MONITORING_PORT}
Or let it open automatically on startup:
MONITORING_AUTO_OPEN=true node dist/index.js
Aggregated dashboard (all projects)
pindex-gui
Opens http://localhost:7842, which reads all registered project databases directly and shows:
- Token savings per project (bar chart)
- Indexed file and symbol counts
- Session history
- Average savings % across all projects
The GUI refreshes automatically (default 15 seconds) and works even when no pindex-server is running. Use the slider in the header to adjust the refresh interval from 1–60 seconds.
Dashboard features (both dashboards):
- Real-time chart (Chart.js) of tokens used vs. estimated cost without index
- Per-tool breakdown: which tools are used most and how much they save
- Session comparison: side-by-side indexed vs. baseline A/B data
- REST API at /api/sessions and /api/sessions/:id for programmatic access
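For programmatic access, the REST endpoints can be called with the `fetch` API built into Node.js 18+. The sketch below is hedged: `sessionsUrl` and `fetchSessions` are hypothetical helper names, and because the response schema is not documented here, the result is typed as `unknown`.

```typescript
// Build the URL for the session list, or for one session when an id is given.
function sessionsUrl(port: number, sessionId?: string): string {
  const base = `http://localhost:${port}/api/sessions`;
  return sessionId === undefined ? base : `${base}/${encodeURIComponent(sessionId)}`;
}

// Fetch the session list from a running dashboard. The response shape is an
// unknown here; inspect it before relying on specific fields.
async function fetchSessions(port: number): Promise<unknown> {
  const response = await fetch(sessionsUrl(port));
  if (!response.ok) {
    throw new Error(`Dashboard returned HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (requires a dashboard listening on the given port):
//   fetchSessions(7842).then((sessions) => console.log(sessions));
console.log(sessionsUrl(7842), sessionsUrl(7856, "default"));
```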
Development
Setup
git clone https://github.com/phash/PindeX.git
cd PindeX
npm install
Build
npm run build # compile src/ → dist/
npm run build:watch # watch mode
Tests
npm test # run full test suite (vitest, pool: forks)
npm run test:watch # watch mode
npm run test:coverage # coverage report, threshold: 80%
Tests use pool: 'forks', which is required because better-sqlite3 uses native bindings that cannot share a process with the vitest worker pool.
Lint / Type-check
npm run lint # tsc --noEmit (type errors only, no output files)
Test structure
tests/
├── setup.ts       # global mocks (tree-sitter, chokidar, open)
├── helpers/       # createTestDb(), fixtures, test server
├── db/            # schema, migrations, queries
├── indexer/       # parser, indexer, watcher
├── tools/         # one file per MCP tool + validation tests
├── monitoring/    # estimator, token-logger, Express server
├── cli/           # project-detector, setup, daemon
└── integration/
    ├── mcp-server.test.ts     # MCP server wiring smoke tests
    └── doc-indexing.test.ts   # full document + context memory workflow
Project Structure
src/
├── index.ts                      # Entry point: MCP stdio server + FEDERATION_REPOS
├── server.ts                     # Tool registration (14 tools, Zod validation, FederatedDb)
├── types.ts                      # Shared TypeScript interfaces
│
├── db/
│   ├── schema.ts                 # SQLite schema + FTS5 tables + triggers (v2)
│   ├── queries.ts                # Typed query helpers
│   ├── database.ts               # Connection management
│   └── migrations.ts             # Schema versioning (PRAGMA user_version)
│
├── indexer/
│   ├── index.ts                  # Orchestrator: code + document file discovery
│   ├── parser.ts                 # tree-sitter AST → symbols; text → doc chunks
│   ├── summarizer.ts             # LLM summaries (OpenAI-compatible API, concurrency-limited)
│   └── watcher.ts                # chokidar file watcher → auto-reindex
│
├── tools/                        # One file per MCP tool
│   ├── search_symbols.ts         # FTS5 symbol search, supports federated DBs
│   ├── get_symbol.ts
│   ├── get_context.ts
│   ├── get_file_summary.ts
│   ├── find_usages.ts
│   ├── get_dependencies.ts
│   ├── get_project_overview.ts   # federation-aware stats
│   ├── reindex.ts
│   ├── get_token_stats.ts
│   ├── start_comparison.ts
│   ├── search_docs.ts            # FTS5 across documents + context entries
│   ├── get_doc_chunk.ts          # retrieve specific document section(s)
│   ├── save_context.ts           # persist a fact/decision to context store
│   ├── get_session_memory.ts     # query passive session observations
│   └── schemas.ts                # Zod schemas for runtime input validation
│
├── memory/                       # Passive session memory (v1.1+)
│   ├── ast-diff.ts               # AST diff engine: detects symbol changes
│   ├── observer.ts               # SessionObserver: hooks tool calls + FileWatcher
│   └── anti-patterns.ts          # AntiPatternDetector: dead-ends, thrashing, loops
│
├── monitoring/
│   ├── server.ts                 # Express + WebSocket (per-project instance)
│   ├── token-logger.ts           # Per-call token logging
│   ├── estimator.ts              # "without index" heuristic
│   └── ui/                       # Dashboard HTML / CSS / Chart.js
│
├── gui/
│   ├── index.ts                  # pindex-gui entry point
│   └── server.ts                 # Aggregated Express app (reads all project DBs)
│
└── cli/
    ├── index.ts                  # CLI router
    ├── init.ts                   # initProject(), writeMcpJson(), addFederatedRepo()
    ├── setup.ts                  # One-time setup (pindex setup)
    ├── daemon.ts                 # Per-project PID-file daemon management
    └── project-detector.ts       # getPindexHome(), findProjectRoot(), GlobalRegistry
Key implementation notes
- ES Modules: all relative imports use .js extensions (TypeScript ESM / NodeNext resolution).
- FTS5 sync: symbols, documents, and context_entries are all kept in sync by SQLite AFTER INSERT/UPDATE/DELETE triggers; no application-level bookkeeping needed.
- Incremental reindexing: MD5 hash per file; unchanged files are skipped for both code and document indexing.
- Document chunking: markdown splits at # / ## / ### heading boundaries; all other text files use fixed 50-line windows. Empty chunks are filtered out before storage.
- Context memory: save_context writes to context_entries keyed by session_id. Entries are never scoped to a single session; search_docs always searches the full history, enabling cross-session knowledge retrieval.
- Live context: get_context reads from disk at call time so it always returns the current file state, not a stale cache.
- Testability: createMonitoringApp() (returns the Express app) and startMonitoringServer() (binds the HTTP/WebSocket server) are separate functions so tests can mount the app without binding a port.
- Per-project ports: assigned deterministically as 7842 + (parseInt(hash.slice(0,4), 16) % 2000) and stored in registry.json so they never change.
- pindex-gui reads DBs directly: no running server required; works as a standalone dashboard even when Claude Code is not open.
- Migration: getPindexHome() automatically renames ~/.mcp-indexer → ~/.pindex on first call if the old directory exists.
- Passive session memory: SessionObserver wires into every MCP tool handler and the FileWatcher at startup; no application code in tools needs to know about memory. Observations are linked to symbol names so the staleness engine can cross-reference them with the AST diff output on re-index.
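The per-project port formula from the notes above can be written out directly. The function name `portForHash` and the sample hashes are made up for illustration; only the formula itself comes from this document.

```typescript
const BASE_PORT = 7842; // also the port used by pindex-gui
const PORT_SPAN = 2000;

// Formula from the implementation notes: take the first four hex digits of
// the project hash, reduce them modulo 2000, and offset from the base port.
function portForHash(hash: string): number {
  return BASE_PORT + (parseInt(hash.slice(0, 4), 16) % PORT_SPAN);
}

// Made-up hashes, chosen so the arithmetic is easy to follow by hand.
const lowPort = portForHash("0010abcd");  // 0x0010 = 16
const highPort = portForHash("ffffffff"); // 0xffff = 65535, and 65535 % 2000 = 1535
console.log(lowPort, highPort);
```

Because the offset is always between 0 and 1999, every assigned port falls in the range 7842 to 9841, and two projects whose hashes reduce to the same offset would map to the same port.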
License
MIT
