Polyglot
Local GPU translation MCP server: TranslateGemma via Ollama, 57 languages, zero cloud dependency
日本語 | 中文 | Español | Français | हिन्दी | Italiano | Português (BR)

Local GPU translation MCP server: 57 languages, zero cloud dependency.
What it does
Translates text between 57 languages using TranslateGemma running locally on your GPU via Ollama. No API keys, no cloud, no rate limits: everything stays on your machine.
Quick Start
1. Install Ollama
Download from ollama.com and start it:
ollama serve
2. Pull a model
ollama pull translategemma:12b # 8.1 GB, best quality/speed balance
# or
ollama pull translategemma:4b # 3.3 GB, faster, lower quality
# or
ollama pull translategemma:27b # 17 GB, highest quality
Tip: You can skip this step, because Polyglot auto-pulls the model on first use.
3. Add to your MCP client
Claude Code / Claude Desktop: add to claude_desktop_config.json or .mcp.json:
{
  "mcpServers": {
    "polyglot": {
      "command": "npx",
      "args": ["-y", "@mcptoolshop/polyglot-mcp"]
    }
  }
}
From source:
git clone https://github.com/mcp-tool-shop-org/polyglot-mcp.git
cd polyglot-mcp
npm install && npm run build
node dist/index.js
That's it. Ask Claude to translate something and it will use the translate tool automatically.
Tools
Polyglot exposes five MCP tools:
translate
Translate text between any supported language pair.
| Parameter | Required | Description |
|---|---|---|
| text | yes | Text to translate |
| from | yes | Source language code or name (e.g., en, English) |
| to | yes | Target language code or name (e.g., ja, Japanese) |
| model | no | Ollama model (default: translategemma:12b) |
| glossary | no | Custom term overrides as {"source": "translation"}; merged with the built-in software glossary |
Long text is automatically split into chunks at paragraph and sentence boundaries, translated in sequence, and reassembled. All translations are validated for quality (empty output, echo detection, truncation, garbled text).
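To make the parameter table concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to invoke the tool. The `tools/call` method and the `name`/`arguments` shape follow the MCP specification; the `TranslateArgs` type and the `buildTranslateCall` helper are hypothetical names used only for illustration and are not part of Polyglot.

```typescript
// Hypothetical argument type mirroring the parameter table above.
interface TranslateArgs {
  text: string;
  from: string;
  to: string;
  model?: string;
  glossary?: Record<string, string>;
}

// Build a JSON-RPC 2.0 request that invokes the `translate` tool over MCP.
function buildTranslateCall(id: number, args: TranslateArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "translate", arguments: args },
  };
}

const request = buildTranslateCall(1, {
  text: "Run the CLI to get started.",
  from: "en",
  to: "ja",
  glossary: { CLI: "CLI" }, // per-request override: keep the term as-is
});

console.log(JSON.stringify(request, null, 2));
```

In practice the MCP client (Claude Code, Claude Desktop) builds this message for you; the sketch only shows which fields end up on the wire.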
translate_markdown
Translate an entire markdown document while preserving structure. Code blocks, HTML elements, badges, URLs, and table formatting are kept intact; only prose content (headings, paragraphs, taglines, table cells) is translated.
| Parameter | Required | Description |
|---|---|---|
| markdown | yes | The full markdown content to translate |
| from | yes | Source language code or name |
| to | yes | Target language code or name |
| model | no | Ollama model (default: translategemma:12b) |
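The structure-preserving idea can be sketched as a two-step process: split the document into segments, then translate only the prose ones. The sketch below handles fenced code blocks only (not HTML, badges, URLs, or tables), and the names `segmentMarkdown` and `translateProse` are assumptions for illustration, not Polyglot's actual API.

```typescript
type Segment = { kind: "code" | "prose"; text: string };

// Built with repeat() so this example does not embed a literal fence.
const FENCE = "`".repeat(3);

// Split markdown into fenced code segments (kept verbatim) and prose
// segments (candidates for translation).
function segmentMarkdown(markdown: string): Segment[] {
  const segments: Segment[] = [];
  let buffer: string[] = [];
  let inCode = false;

  const flush = (kind: Segment["kind"]): void => {
    if (buffer.length > 0) segments.push({ kind, text: buffer.join("\n") });
    buffer = [];
  };

  for (const line of markdown.split("\n")) {
    const isFence = line.trimStart().startsWith(FENCE);
    if (isFence && !inCode) {
      flush("prose"); // close the prose run before the opening fence
      buffer.push(line);
      inCode = true;
    } else if (isFence && inCode) {
      buffer.push(line);
      flush("code"); // the closing fence belongs to the code segment
      inCode = false;
    } else {
      buffer.push(line);
    }
  }
  flush(inCode ? "code" : "prose"); // an unterminated fence stays untranslated
  return segments;
}

// Translate prose segments only, then reassemble in the original order.
function translateProse(markdown: string, translate: (s: string) => string): string {
  return segmentMarkdown(markdown)
    .map((s) => (s.kind === "prose" ? translate(s.text) : s.text))
    .join("\n");
}
```

Because every segment is a run of consecutive lines, joining the segments with newlines reproduces the original document when nothing is translated.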
list_languages
List all 57 supported languages with their codes.
check_status
Check if Ollama is running and which TranslateGemma models are installed. Attempts auto-start if Ollama isn't running.
translate_all
Translate markdown content into multiple languages at once (default: 7 languages: Japanese, Chinese, Spanish, French, Hindi, Italian, Portuguese). Runs translations concurrently with GPU-safe semaphore limiting.
| Parameter | Required | Description |
|---|---|---|
| markdown | yes | The full markdown content to translate |
| from | no | Source language code (default: en) |
| languages | no | Array of target language codes (default: all 7) |
| model | no | Ollama model (default: translategemma:12b) |
| concurrency | no | Max concurrent translations (default: 2, max: 3) |
| navBar | no | Inject language nav bar (default: true) |
Features
Auto-start & Auto-pull
Ollama is automatically started if it isn't running, and the TranslateGemma model is automatically pulled if it isn't installed. Once Ollama itself is installed, no further manual setup is required.
Retry with Exponential Backoff
Transient Ollama failures (network blips, temporary overload) are automatically retried up to 2 times with exponential backoff (1 s, 2 s). Non-retryable errors (bad model name, invalid input) fail immediately.
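The retry policy described above can be sketched as a small wrapper. This is an illustration under assumptions, not Polyglot's implementation: the `withRetry` name, its signature, and the injectable `wait` parameter are hypothetical, and an error is treated as retryable only if it carries a `retryable: true` property.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying up to `maxRetries` times. Delays double each
// time: baseDelayMs, 2 * baseDelayMs, ... (1 s then 2 s with the defaults).
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 2,
  baseDelayMs = 1000,
  wait: (ms: number) => Promise<void> = sleep, // injectable for tests
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Assumed convention: retryable errors expose `retryable: true`.
      const retryable = (error as { retryable?: boolean }).retryable === true;
      if (!retryable || attempt >= maxRetries) throw error;
      await wait(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Errors without the flag (for example a bad model name) are rethrown on the first attempt, which matches the fail-immediately behavior for non-retryable errors.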
Smart Chunking
Long text is split at natural boundaries (paragraphs, then sentences) so translation context is preserved. Chunk sizes adapt to the model: 2K chars for 2B/4B models, 4K for 12B, 6K for 27B.
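A paragraph-first, sentence-second chunker could look roughly like the sketch below. The chunk limits come from the text above; everything else (the `chunkLimitFor`, `pack`, and `chunkText` names, the substring-based model lookup, and the simple sentence rule) is an assumption made for illustration.

```typescript
// Chunk limits from the text above; matching on the tag substring is assumed.
function chunkLimitFor(model: string): number {
  if (model.includes("27b")) return 6000;
  if (model.includes("12b")) return 4000;
  return 2000; // 2B/4B models
}

// Greedily pack pieces into chunks of at most `limit` characters.
// A single piece longer than `limit` is emitted as its own oversized chunk.
function pack(pieces: string[], limit: number, joiner: string): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const piece of pieces) {
    if (current && current.length + joiner.length + piece.length > limit) {
      chunks.push(current);
      current = "";
    }
    current = current ? current + joiner + piece : piece;
  }
  if (current) chunks.push(current);
  return chunks;
}

function chunkText(text: string, limit: number): string[] {
  const units: string[] = [];
  for (const paragraph of text.split(/\n{2,}/)) {
    if (paragraph.length <= limit) {
      units.push(paragraph);
    } else {
      // Paragraph too long: break after sentence-ending punctuation.
      const sentences = paragraph
        .replace(/([.!?])\s+/g, "$1\n")
        .split("\n")
        .filter((s) => s.length > 0);
      units.push(...pack(sentences, limit, " "));
    }
  }
  return pack(units, limit, "\n\n");
}
```

Whole paragraphs are kept together whenever they fit, so the model sees as much surrounding context as the limit allows.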
Segment Cache
Translated segments are cached by content hash (SHA-256 of source text + target language + model). Unchanged segments skip re-translation entirely. Cache lives in .polyglot-cache.json with a 30-day TTL.
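As a rough sketch of how such a cache key and TTL check might work: the SHA-256 inputs and the 30-day TTL are from the text above, while the `CacheEntry` shape, the `cacheKey` and `lookup` names, and the NUL separator between fields are assumptions. The actual layout of .polyglot-cache.json may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real cache file format is not documented here.
interface CacheEntry {
  translation: string;
  createdAt: number; // epoch milliseconds
}

const TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Key = SHA-256 over source text + target language + model.
// The NUL separator (an assumption) keeps field boundaries unambiguous.
function cacheKey(source: string, targetLang: string, model: string): string {
  return createHash("sha256")
    .update([source, targetLang, model].join("\u0000"))
    .digest("hex");
}

// Return a cached translation, or undefined if missing or expired.
function lookup(
  cache: Record<string, CacheEntry>,
  key: string,
  now: number = Date.now(),
): string | undefined {
  const entry = cache[key];
  if (!entry || now - entry.createdAt > TTL_MS) return undefined;
  return entry.translation;
}
```

Because the model name is part of the key, switching from translategemma:12b to translategemma:27b would miss the cache rather than reuse lower-quality output.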
Translation Memory (Fuzzy Cache)
When an exact cache hit isn't found, Polyglot checks for near-miss segments using Levenshtein similarity. If a cached source is ≥85% similar to the current segment, the existing translation is reused. This speeds up retranslation after minor README edits, because lightly edited segments do not go back to the model.
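The fuzzy lookup can be illustrated with a standard Levenshtein distance. The 85% threshold is from the text; normalizing the distance by the longer string's length is one common definition of similarity and is an assumption here, as are the function names.

```typescript
// Classic two-row dynamic-programming Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 minus distance over the longer length.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

interface MemoryEntry {
  source: string;
  translation: string;
}

// Return the translation of the most similar cached source at or above
// the threshold, or undefined when nothing is close enough.
function findFuzzy(source: string, memory: MemoryEntry[], threshold = 0.85): string | undefined {
  let best: { score: number; translation: string } | undefined;
  for (const entry of memory) {
    const score = similarity(source, entry.source);
    if (score >= threshold && (!best || score > best.score)) {
      best = { score, translation: entry.translation };
    }
  }
  return best?.translation;
}
```

A linear scan like this is fine for a README-sized cache; a larger translation memory would likely want a length pre-filter before computing distances.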
Concurrency Semaphore
All Ollama calls are guarded by a counting semaphore (default limit: 1) to prevent GPU OOM on systems with limited VRAM. Override with POLYGLOT_CONCURRENCY:
POLYGLOT_CONCURRENCY=2 npx @mcptoolshop/polyglot-mcp
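For readers unfamiliar with the pattern, a counting semaphore for async work can be sketched in a few lines. The `Semaphore` class and `limitFromEnv` helper below are illustrative assumptions, not the contents of semaphore.ts; only the POLYGLOT_CONCURRENCY variable and the default limit of 1 come from the text.

```typescript
class Semaphore {
  private available: number;
  private waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.available = limit;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    // No permit free: park until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) next(); // pass the permit directly to the next waiter
    else this.available++;
  }

  // Run a task while holding a permit, releasing it even on failure.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Assumed parsing rule: invalid or missing values fall back to 1.
function limitFromEnv(value: string | undefined): number {
  const parsed = parseInt(value ?? "", 10);
  return parsed > 0 ? parsed : 1;
}
```

With a limit of 1, calls queue up and reach the GPU one at a time, which trades throughput for predictable VRAM use.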
MCP Progress Tokens
All tools report progress via MCP notifications/progress when the client provides a progressToken: translate reports per chunk, translate_markdown per segment batch, translate_all per language, and check_status per step.
Software Glossary
A built-in glossary of 12 technical terms (API, CLI, SDK, etc.) ensures consistent translation of software terminology. Custom glossary entries can be passed per-request and are merged with the defaults.
Batch Translation
translateBatch groups multiple segments into a single prompt where possible, reducing round-trips. It falls back to translating segments individually if the model mangles the batch separator.
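The batch-with-fallback idea can be sketched as follows. The separator string, the `Translator` type, and this function's signature are hypothetical; the real translateBatch may build its prompt and detect a mangled separator differently.

```typescript
// Hypothetical separator; the one Polyglot actually uses is not documented here.
const SEPARATOR = "\n<<<SEG>>>\n";

type Translator = (prompt: string) => Promise<string>;

async function translateBatch(segments: string[], translateOne: Translator): Promise<string[]> {
  // One round-trip for the whole batch.
  const joined = await translateOne(segments.join(SEPARATOR));
  const parts = joined.split(SEPARATOR.trim()).map((p) => p.trim());

  // If the model preserved every separator, the counts line up.
  if (parts.length === segments.length) return parts;

  // Separator was mangled: translate each segment on its own.
  const results: string[] = [];
  for (const segment of segments) {
    results.push(await translateOne(segment));
  }
  return results;
}
```

Comparing the number of parts to the number of inputs is a cheap integrity check: it cannot prove the batch was translated well, but it catches the common failure where the model drops or rewrites a separator.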
Configurable Default Model
Set the POLYGLOT_MODEL environment variable to override the default model:
POLYGLOT_MODEL=translategemma:27b npx @mcptoolshop/polyglot-mcp
Structured Errors
All errors use PolyglotError with a machine-readable code (MODEL_NOT_FOUND, OLLAMA_UNAVAILABLE, TRANSLATION_FAILED, etc.), a human-readable message, an optional hint, and a retryable flag.
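A class with the four fields described above might be shaped like the sketch below. The field names follow the description, but the constructor signature, the closed set of three codes (the text says there are more), and the `toToolText` helper are assumptions made for illustration.

```typescript
// Only the three codes named in the text; the real list is longer.
type PolyglotErrorCode = "MODEL_NOT_FOUND" | "OLLAMA_UNAVAILABLE" | "TRANSLATION_FAILED";

class PolyglotError extends Error {
  readonly code: PolyglotErrorCode;
  readonly hint?: string;
  readonly retryable: boolean;

  constructor(
    code: PolyglotErrorCode,
    message: string,
    options: { hint?: string; retryable?: boolean } = {},
  ) {
    super(message);
    this.name = "PolyglotError";
    this.code = code;
    this.hint = options.hint;
    this.retryable = options.retryable ?? false;
  }
}

// Hypothetical helper: render an error as text for an MCP tool response.
function toToolText(error: PolyglotError): string {
  const hint = error.hint ? ` Hint: ${error.hint}` : "";
  return `[${error.code}] ${error.message}${hint}`;
}

const example = new PolyglotError("MODEL_NOT_FOUND", "Model is not installed.", {
  hint: "Run: ollama pull translategemma:12b",
});
console.log(toToolText(example));
```

Carrying `retryable` on the error itself lets retry logic decide what to do without parsing message strings.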
Output Validation
Every translation is automatically validated. Empty output throws a retryable error; source-text echo, severe truncation, hallucination blowup, garbled encoding, and model meta-commentary are detected and reported as warnings in the MCP tool response.
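A few of these checks are simple enough to sketch. The thresholds below (a 0.2 length ratio for truncation, 5 for blowup) and the use of the Unicode replacement character as a garbling signal are illustrative guesses, not the values validate.ts uses, and meta-commentary detection is left out.

```typescript
interface ValidationResult {
  warnings: string[];
}

function validateTranslation(source: string, output: string): ValidationResult {
  const src = source.trim();
  const out = output.trim();

  // Empty output is an error the caller may retry, not a warning.
  if (out.length === 0) {
    throw Object.assign(new Error("Empty translation output"), { retryable: true });
  }

  const warnings: string[] = [];
  if (out === src) warnings.push("echo"); // model returned the source text

  // Assumed thresholds on the output/source length ratio.
  const ratio = out.length / Math.max(src.length, 1);
  if (ratio < 0.2) warnings.push("truncation");
  if (ratio > 5) warnings.push("blowup");

  // U+FFFD usually indicates bytes that failed to decode.
  if (out.includes("\uFFFD")) warnings.push("garbled");

  return { warnings };
}
```

Length-ratio checks are language-dependent (Japanese output is often much shorter than English input), so real thresholds would need to be generous.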
Streaming
OllamaClient.generateStream() yields tokens via NDJSON as Ollama produces them. The translate() function accepts an onToken callback for real-time progress display. Both streaming and non-streaming paths share retry logic.
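Ollama's streaming generate endpoint emits one JSON object per line, each carrying a `response` text fragment and a `done` flag. The sketch below shows one way to turn raw network chunks into tokens when a JSON line may be split across chunks. The `NdjsonTokenParser` class is a hypothetical illustration and not the code inside OllamaClient.

```typescript
// Subset of the fields Ollama sends on each streamed line.
interface GenerateChunk {
  response?: string;
  done?: boolean;
}

class NdjsonTokenParser {
  private buffer = "";

  // Feed a raw text chunk; returns the tokens from every line completed so far.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line

    const tokens: string[] = [];
    for (const line of lines) {
      if (!line.trim()) continue;
      const parsed = JSON.parse(line) as GenerateChunk;
      if (parsed.response) tokens.push(parsed.response);
    }
    return tokens;
  }
}

// Usage: forward tokens to an onToken-style callback as chunks arrive.
const parser = new NdjsonTokenParser();
const received: string[] = [];
const onToken = (token: string): void => { received.push(token); };

parser.push('{"response":"Hel').forEach(onToken);
parser.push('lo","done":false}\n{"response":" world","done":true}\n').forEach(onToken);
console.log(received.join(""));
```

Buffering the trailing partial line is the important detail: network chunk boundaries do not align with line boundaries, so parsing each chunk directly would fail intermittently.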
Supported Languages
Afrikaans, Albanian, Arabic, Bengali, Bulgarian, Catalan, Chinese (Simplified), Chinese (Traditional), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Macedonian, Malay, Malayalam, Maltese, Marathi, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Scottish Gaelic, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh.
Performance
On an RTX 5080 (16 GB VRAM) with TranslateGemma 12B (Q4):
| Metric | Value |
|---|---|
| First translation (cold model load) | ~15 s |
| Subsequent translations | ~600 ms |
| VRAM usage | ~8.1 GB |
| Long text (per chunk) | ~600 ms |
Architecture
MCP Client (Claude Code, etc.)
    │
    │ MCP protocol (stdio)
    ▼
┌──────────────────┐
│ index.ts         │  MCP server with 5 tools: translate, translate_markdown,
│                  │  translate_all, list_languages, check_status
├──────────────────┤
│ translate.ts     │  Prompt building, chunking, batch mode, streaming
├──────────────────┤
│translateMarkdown │  Markdown-aware segmentation, table parsing, reassembly
├──────────────────┤
│ translateAll.ts  │  Multi-language orchestrator with nav bar injection
├──────────────────┤
│ semaphore.ts     │  Counting semaphore for GPU-safe concurrency
├──────────────────┤
│ validate.ts      │  Output validation (empty, echo, truncation, garble)
├──────────────────┤
│ ollama.ts        │  HTTP client: auto-start, auto-pull, retry, streaming
├──────────────────┤
│ cache.ts         │  Segment cache + fuzzy translation memory
├──────────────────┤
│ glossary.ts      │  Software term dictionary
├──────────────────┤
│ polish.ts        │  Post-translation artifact cleanup
├──────────────────┤
│ languages.ts     │  57 language definitions
├──────────────────┤
│ errors.ts        │  PolyglotError structured error class
└──────────────────┘
    │
    │ HTTP (localhost:11434)
    ▼
Ollama + TranslateGemma (GPU)
Security & Data Scope
| Aspect | Detail |
|---|---|
| Data touched | Text sent to local Ollama API (localhost:11434), .polyglot-cache.json segment cache |
| Data NOT touched | No files outside working directory, no browser data, no OS credentials |
| Network | HTTP to localhost:11434 only; zero external/internet egress |
| Telemetry | None collected or sent |
See SECURITY.md for the vulnerability reporting policy.
Development
npm install # install deps
npm run typecheck # type-check without emitting
npm test # run 256 unit tests (vitest)
npm run build # compile TypeScript to dist/
npm run verify # typecheck + test + build + pack (full gate)
License
MIT. See LICENSE.
Built by MCP Tool Shop
