Telegram Private Search
Local-first MCP server and Kotlin/JVM CLI for indexing private Telegram chats and searching them with hybrid full-text + semantic retrieval.
Ask AI about Telegram Private Search
Powered by Claude · Grounded in docs
I know everything about Telegram Private Search. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
telegram-private-search
Turn your Telegram history into searchable memory for you and your AI tools.
telegram-private-search is a local-first Kotlin/JVM tool and MCP server that turns years of private Telegram chats into a structured, searchable knowledge source.
It is built for a very specific feeling: you know a conversation happened, you remember what it meant, but you do not remember the exact wording well enough to find it with normal search.
With this project, your assistant can stop guessing and start querying your local Telegram archive directly.
Quick start
cp .env.example .env
./gradlew test
./gradlew run --console=plain --args='index --limit-per-chat 500'
./gradlew run --args='mcp'
Then connect an MCP-compatible client to the server and query your Telegram history in natural language.
Why this is interesting
Your Telegram history often contains:
- decisions that were never written down anywhere else
- informal project updates
- agreements, promises, and follow-ups
- travel plans, addresses, links, and recommendations
- context that only exists inside chat threads
This project turns that messy, memory-shaped information into something an MCP-compatible client can actually use.
Instead of scrolling through years of chats and trying random keywords, you can ask higher-level questions such as:
- "Find the last time we discussed Readian progress"
- "When did she mention sending the invoice?"
- "Show messages where he said he was blocked on the backend"
- "Find the conversation where we compared apartment options"
It indexes your private Telegram chats into a local SQLite database, then combines local keyword search, recency-aware ranking, and thread expansion so vague, memory-shaped queries still have a good chance of finding the right message.
What this MCP server is
This project exposes your Telegram search index through an MCP server over stdio.
That means an AI tool that supports MCP can call into this server as a capability instead of you manually copying text around. In practice, this lets your assistant search your local Telegram history as a structured tool, returning relevant messages and context when you ask questions in natural language.
Think of it as a bridge between:
- your local Telegram archive
- a search engine tuned for conversational recall
- an MCP-compatible assistant or client
Why MCP matters here
Without MCP, this project is already useful as a CLI search tool.
With MCP, it becomes part of a larger assistant workflow. Your client can treat Telegram search as a real tool call instead of a manual side task. That makes it possible to:
- ask questions in natural language and get grounded results
- combine Telegram search with other tools in the same assistant session
- reduce hallucinated memory reconstruction
- keep the message index local while still making it usable by AI tooling
Architecture at a glance
Telegram account
|
v
TDLight client
|
v
Message ingestion
|
v
Local SQLite index (Room)
|
+--> keyword search
|
+--> local lexical retrieval
|
v
Ranking and result shaping
|
+--> CLI search
|
+--> MCP server over stdio
|
v
MCP-compatible assistant or client
In short: Telegram messages come in, a local index is built, local retrieval finds relevant results, and the MCP server exposes that capability to tools that can speak MCP.
Why it exists
People rarely remember the exact words used in a chat. They remember intent, context, and fragments:
- "the message where he finally agreed"
- "that time we planned the trip"
- "the last update about the feature rollout"
Traditional search is great when you remember exact text. It is much less helpful when you remember meaning.
telegram-private-search is built for that second case. It helps recover messages from fuzzy memory, long-running conversations, and topic shifts spread across multiple chats.
What it can do
- Authenticate with your own Telegram account
- Index private chat messages into a local SQLite database
- Search with local full-text retrieval and thread expansion
- Rank results with recency-aware heuristics
- Expose search through an MCP-compatible server
- Run as a local CLI for indexing and direct querying
How it works
- The app authenticates against Telegram using your own API credentials.
- It reads private chat messages from your account.
- Messages are stored locally in SQLite via Room.
- Search uses local keyword matching, recency-aware ranking, and thread expansion.
- The MCP server exposes that search capability to external AI clients.
The current ingestion path focuses on text messages from main and archived private chats.
Privacy and data flow
This project keeps indexing, search, and context reconstruction local, and only returns the conversation slices you actually ask for.
- Telegram data is indexed into a local database under
data/ - Your
.envstays local and should never be committed - The MCP server runs locally over
stdio search_messagesreconstructs expanded thread context locally before returning resultssearch_messagesrefreshes the local index for latest-oriented queries and reuses a very recent refresh to avoid duplicate imports- Query interpretation uses local heuristics only
- The server does not call any external LLM or embedding API
- Your current Copilot/AI agent should do the reasoning after calling the MCP tools
If you want deeper reasoning over vague memory-style queries, let the MCP client ask for broader local context and let the current AI agent analyze that returned thread slice. The server itself stays local.
Example use cases
This project is useful when you want to:
- recover the last concrete progress update from a teammate or friend
- find a decision that was made informally in chat
- trace when someone promised, postponed, confirmed, or declined something
- search conversations from fuzzy memory without relying on exact phrasing
- give an MCP-enabled assistant access to your Telegram memory as a tool
Stack
- Kotlin/JVM
- Clean architecture with MVVM-style presentation state
- Room 3 with bundled SQLite
- TDLight for Telegram user-account access
- MCP server over
stdio
Setup
- Copy
.env.exampleto.env. - Fill in your Telegram API credentials from
https://my.telegram.org. - Install OpenSSL 3 on macOS if it is not already installed.
- Run
./gradlew test. - Optionally set
TELEGRAM_PHONE_NUMBERin.envto skip the first login-mode prompt.
Example local configuration:
TELEGRAM_API_ID=
TELEGRAM_API_HASH=
TELEGRAM_PHONE_NUMBER=
TELEGRAM_USE_CONSOLE_LOGIN=true
TELEGRAM_SESSION_DIR=data/telegram-session
DATABASE_PATH=data/telegram-search.db
CLI usage
Build the local index:
./gradlew run --console=plain --args='index --limit-per-chat 500'
Install a runnable distribution:
./gradlew installDist
./build/install/telegram-private-search/bin/telegram-private-search index --limit-per-chat 500
Run a direct search:
./gradlew run --args='search "find last message where he reported progress on Readian" --context-before-messages 12 --context-after-messages 12'
MCP usage
Start the MCP server:
./gradlew run --args='mcp'
Once running, an MCP-compatible client can use the server's search tools to query your indexed Telegram history.
The bundled Kotlin MCP stdio transport uses one JSON-RPC message per line. It does not use Content-Length framing, so manual smoke tests should write newline-delimited JSON objects to stdin and read newline-delimited JSON objects from stdout.
Quick smoke test:
python3 - <<'PY'
import json, subprocess
proc = subprocess.Popen(
['./build/install/telegram-private-search/bin/telegram-private-search', 'mcp'],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
text=True,
bufsize=1,
)
for message in [
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"1.0"}}},
{"jsonrpc":"2.0","method":"notifications/initialized","params":{}},
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}},
]:
proc.stdin.write(json.dumps(message) + "\n")
proc.stdin.flush()
print(proc.stdout.readline().strip())
print(proc.stdout.readline().strip())
proc.terminate()
proc.wait(timeout=5)
PY
The intended architecture is:
- MCP server: local indexing, local search, local thread expansion
- current AI agent: reasoning, reranking, summaries, follow-up investigation
search_messages accepts these optional context controls:
context_before_messages: how many earlier messages to include around each matched anchorcontext_after_messages: how many later messages to include around each matched anchor
Both default to 12, so the tool returns a local conversation slice rather than an isolated chunk. Set them to 0 if you want anchor-only results.
When the query asks for the latest messages, the MCP server refreshes the local Telegram index before searching so the answer reflects recent chat activity. Repeated latest-style queries reuse a very recent refresh instead of launching another full Telegram import right away.
Notes
- The first
indexrun is interactive ifTELEGRAM_USE_CONSOLE_LOGIN=true. - TDLight logs are reduced to keep login prompts visible.
- Search combines local keyword filtering, local heuristics, recency-aware ranking, and local thread expansion.
