DocuTranslate
简体中文 / English / 日本語 / Tiếng Việt
A lightweight local file translation tool based on Large Language Models.
- ✅ Support Multiple Formats: Translates pdf, docx, xlsx, md, txt, json, epub, srt, ass, and more.
- ✅ Auto-Generate Glossary: Supports automatic glossary generation to ensure term alignment.
- ✅ PDF Table, Formula, and Code Recognition: Uses mineru (online or locally deployed) for PDF parsing, supporting recognition and translation of the tables, formulas, and code commonly found in academic papers.
- ✅ JSON Translation: Supports specifying which values to translate within JSON using paths (jsonpath-ng syntax).
- ✅ Word/Excel Format Preservation: Supports docx and xlsx files (doc and xls are currently not supported) while maintaining the original formatting.
- ✅ Multi-AI Platform Support: Supports most AI platforms, allowing high-performance concurrent AI translation with custom prompts.
- ✅ Async Support: Designed for high-performance scenarios, providing full asynchronous support and interfaces for parallel multi-tasking.
- ✅ LAN & Multi-user Support: Supports simultaneous use by multiple users within a local area network (LAN).
- ✅ Interactive Web Interface: Provides an out-of-the-box Web UI and RESTful API for easy integration and usage.
- ✅ Compact, Portable Packages: Windows and Mac portable packages under 40MB.
QQ Community Group: 1047781902 1081128602
UI Interface:

Paper Translation:

Novel Translation:

Integration Packages
For users who want to get started quickly, we provide integration packages on GitHub Releases. Simply download, unzip, enter your AI platform API Key, and start using it.
Quick Start
Using pip
# Basic installation
pip install docutranslate
# Install mcp extension
pip install docutranslate[mcp]
docutranslate -i
#docutranslate -i --with-mcp
Using uv
# Initialize environment
uv init
# Basic installation
uv add docutranslate
# Install mcp extension
uv add docutranslate[mcp]
uv run --no-dev docutranslate -i
#uv run --no-dev docutranslate -i --with-mcp
Using git
# Initialize environment
git clone https://github.com/xunbu/docutranslate.git
cd docutranslate
uv sync --no-dev
# uv sync --no-dev --extra mcp
# uv sync --no-dev --all-extras
Using docker
docker run -d -p 8010:8010 xunbu/docutranslate:latest
# docker run -it -p 8010:8010 xunbu/docutranslate:latest
# docker run -it -p 8010:8010 xunbu/docutranslate:v1.5.4
Start Web UI and API Service
For ease of use, DocuTranslate provides a fully functional Web Interface and RESTful API.
Start the Service:
docutranslate -i (Start GUI, default local access)
docutranslate -i --host 0.0.0.0 (Allow access from other devices on LAN)
docutranslate -i -p 8081 (Specify port number)
docutranslate -i --cors (Enable default CORS settings)
docutranslate -i --with-mcp (Start GUI with MCP SSE endpoint, shared queue, shared port)
docutranslate --mcp (Start MCP server, stdio mode)
docutranslate --mcp --transport sse (Start MCP server, SSE mode)
docutranslate --mcp --transport sse --mcp-host MCP_HOST --mcp-port MCP_PORT (Start MCP server, SSE mode)
docutranslate --mcp --transport streamable-http (Start MCP server, Streamable HTTP mode)
- Interactive Interface: After starting the service, visit http://127.0.0.1:8010 (or your specified port) in your browser.
- API Documentation: Full API documentation (Swagger UI) is available at http://127.0.0.1:8010/docs.
- MCP: The SSE service endpoint is at http://127.0.0.1:8010/mcp/sse (when started with --with-mcp) or http://127.0.0.1:8000/mcp/sse (when started with --mcp).
MCP Configuration
DocuTranslate can be used as an MCP (Model Context Protocol) server. For detailed documentation, see MCP Documentation.
Supported Environment Variables
| Environment Variable | Description | Required |
|---|---|---|
| DOCUTRANSLATE_API_KEY | AI platform API key | Yes |
| DOCUTRANSLATE_BASE_URL | AI platform base URL | Yes |
| DOCUTRANSLATE_MODEL_ID | Model ID | Yes |
| DOCUTRANSLATE_TO_LANG | Target language (default: Chinese) | No |
| DOCUTRANSLATE_CONCURRENT | Concurrent requests (default: 10) | No |
| DOCUTRANSLATE_CONVERT_ENGINE | PDF conversion engine | No |
| DOCUTRANSLATE_MINERU_TOKEN | MinerU API Token | No |
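When launching the MCP server from a script rather than a shell, the same variables can be set programmatically before the server process starts. A minimal sketch (all values below are placeholders, not real credentials):

```python
import os

# Placeholder values -- replace with your real credentials before use.
config = {
    "DOCUTRANSLATE_API_KEY": "sk-xxxxxx",
    "DOCUTRANSLATE_BASE_URL": "https://api.openai.com/v1",
    "DOCUTRANSLATE_MODEL_ID": "gpt-4o",
    "DOCUTRANSLATE_TO_LANG": "Chinese",  # optional, defaults to Chinese
    "DOCUTRANSLATE_CONCURRENT": "10",    # optional, defaults to 10
}
os.environ.update(config)

# A child process started afterwards (e.g. `docutranslate --mcp` via
# subprocess) inherits these environment variables.
```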
uvx Configuration (No Installation Required)
{
  "mcpServers": {
    "docutranslate": {
      "command": "uvx",
      "args": ["--from", "docutranslate[mcp]", "docutranslate", "--mcp"],
      "env": {
        "DOCUTRANSLATE_API_KEY": "sk-xxxxxx",
        "DOCUTRANSLATE_BASE_URL": "https://api.openai.com/v1",
        "DOCUTRANSLATE_MODEL_ID": "gpt-4o",
        "DOCUTRANSLATE_TO_LANG": "Chinese",
        "DOCUTRANSLATE_CONCURRENT": "10",
        "DOCUTRANSLATE_CONVERT_ENGINE": "mineru",
        "DOCUTRANSLATE_MINERU_TOKEN": "your-mineru-token"
      }
    }
  }
}
SSE Mode Configuration
First start the MCP server in SSE mode:
docutranslate --mcp --transport sse --mcp-host 127.0.0.1 --mcp-port 8000
Then configure the SSE endpoint in your client: http://127.0.0.1:8000/mcp/sse
Usage Examples
Using the Simple Client SDK (Recommended)
The easiest way to get started is using the Client class, which provides a simple and intuitive API for translation:
from docutranslate.sdk import Client

# Initialize the client with your AI platform settings
client = Client(
    api_key="YOUR_OPENAI_API_KEY",  # or any other AI platform API key
    base_url="https://api.openai.com/v1/",
    model_id="gpt-4o",
    to_lang="Chinese",
    concurrent=10,  # Number of concurrent requests
)

# Example 1: Translate plain text files (no PDF parsing engine needed)
result = client.translate("path/to/your/document.txt")
print(f"Translation complete! Saved to: {result.save()}")

# Example 2: Translate PDF files (requires mineru_token or local deployment)
# Option A: Use online MinerU (token required: https://mineru.net/apiManage/token)
result = client.translate(
    "path/to/your/document.pdf",
    convert_engine="mineru",
    mineru_token="YOUR_MINERU_TOKEN",  # Replace with your MinerU Token
    formula_ocr=True,  # Enable formula recognition
)
result.save(fmt="html")

# Option B: Use locally deployed MinerU (recommended for intranet/offline use)
# First start the local MinerU service, see: https://github.com/opendatalab/MinerU
result = client.translate(
    "path/to/your/document.pdf",
    convert_engine="mineru_deploy",
    mineru_deploy_base_url="http://127.0.0.1:8000",  # Your local MinerU address
    mineru_deploy_backend="hybrid-auto-engine",  # Backend type
)
result.save(fmt="markdown")

# Example 3: Translate Docx files (preserving formatting)
result = client.translate(
    "path/to/your/document.docx",
    insert_mode="replace",  # replace/append/prepend
)
result.save(fmt="docx")  # Save as docx format

# Example 4: Export as a Base64-encoded string (for API transmission)
base64_content = result.export(fmt="html")
print(f"Exported content length: {len(base64_content)}")

# You can also access the underlying workflow for advanced operations
# workflow = result.workflow
Client Features:
- Auto-detection: Automatically detects file type and selects the appropriate workflow
- Flexible Configuration: Override any default settings per translation call
- Multiple Output Options: Save to disk or export as Base64 string
- Async Support: Use translate_async() for concurrent translation tasks
Client SDK Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | - | AI platform API key |
| base_url | str | - | AI platform base URL (e.g., https://api.openai.com/v1/) |
| model_id | str | - | Model ID to use for translation |
| to_lang | str | - | Target language (e.g., "Chinese", "English", "Japanese") |
| concurrent | int | 10 | Number of concurrent LLM requests |
| convert_engine | str | "mineru" | PDF parsing engine: "mineru", "mineru_deploy" |
| md2docx_engine | str | "auto" | Markdown to Docx engine: "python" (pure Python), "pandoc" (use Pandoc), "auto" (use Pandoc if installed, otherwise Python), null (do not generate docx) |
| mineru_deploy_base_url | str | - | Local minerU API address (when convert_engine="mineru_deploy") |
| mineru_deploy_parse_method | str | "auto" | Local minerU parsing method: "auto", "txt", "ocr" |
| mineru_deploy_table_enable | bool | True | Enable table recognition for local minerU |
| mineru_token | str | - | minerU API token (when using online minerU) |
| skip_translate | bool | False | Skip translation, only parse document |
| output_dir | str | "./output" | Default output directory for save() |
| chunk_size | int | 3000 | Text chunk size for LLM processing |
| temperature | float | 0.3 | LLM temperature parameter |
| timeout | int | 60 | Request timeout in seconds |
| retry | int | 3 | Number of retry attempts on failure |
| provider | str | "auto" | AI provider type (auto, openai, azure, etc.) |
| force_json | bool | False | Force JSON output mode |
| rpm | int | - | Requests per minute limit |
| tpm | int | - | Tokens per minute limit |
| extra_body | str | - | Additional request body parameters in JSON string format, will be merged into API request |
| thinking | str | "auto" | Thinking mode: "auto", "none", "block" |
| custom_prompt | str | - | Custom prompt for translation |
| system_proxy_enable | bool | False | Enable system proxy |
| insert_mode | str | "replace" | Docx/Xlsx/Txt insertion mode: "replace", "append", "prepend" |
| separator | str | "\n" | Text separator for append/prepend modes |
| segment_mode | str | "line" | Segmentation mode: "line", "paragraph", "none" |
| translate_regions | list | - | Excel translation regions (e.g., "Sheet1!A1:B10") |
| model_version | str | "vlm" | MinerU model version: "pipeline", "vlm" |
| formula_ocr | bool | True | Enable formula OCR for PDF parsing |
| code_ocr | bool | True | Enable code OCR for PDF parsing |
| mineru_deploy_backend | str | "hybrid-auto-engine" | MinerU local backend: "pipeline", "vlm-auto-engine", "vlm-http-client", "hybrid-auto-engine", "hybrid-http-client" |
| mineru_deploy_formula_enable | bool | True | Enable formula recognition for local MinerU |
| mineru_deploy_start_page_id | int | 0 | Start page ID for local MinerU parsing |
| mineru_deploy_end_page_id | int | 99999 | End page ID for local MinerU parsing |
| mineru_deploy_lang_list | list | - | Language list for local MinerU parsing |
| mineru_deploy_server_url | str | - | MinerU local server URL |
| json_paths | list | - | JSONPath expressions for JSON translation (e.g., "$.data.*") |
| glossary_generate_enable | bool | - | Enable auto glossary generation |
| glossary_dict | dict | - | Glossary dictionary (e.g., {"Jobs": "Steve Jobs"}) |
| glossary_agent_config | dict | - | Glossary agent configuration |
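To make the chunk_size parameter concrete: before translation, the document text is split into pieces no larger than the configured size so that each LLM request stays within context limits. The splitter below is a simplified stand-in for illustration only (DocuTranslate's actual chunking logic may differ); it packs paragraphs greedily and hard-splits any single oversized paragraph:

```python
def chunk_text(text: str, chunk_size: int = 3000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most chunk_size characters.

    Simplified illustration only -- not DocuTranslate's real splitter.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single paragraph longer than chunk_size is split hard.
            while len(para) > chunk_size:
                chunks.append(para[:chunk_size])
                para = para[chunk_size:]
            current = para
    if current:
        chunks.append(current)
    return chunks

parts = chunk_text("alpha\n\nbeta\n\ngamma", chunk_size=12)
print(parts)  # → ['alpha\n\nbeta', 'gamma']
```

A larger chunk_size means fewer requests but more tokens per request; the default of 3000 balances throughput against per-request limits.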
Result Methods
| Method | Parameters | Description |
|---|---|---|
| save() | output_dir, name, fmt | Save translation result to disk |
| export() | fmt | Export as Base64 encoded string |
| supported_formats | - | Get list of supported output formats |
| workflow | - | Access underlying workflow object |
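The export() method returns the rendered document as a Base64 string, which is convenient for passing binary or HTML content through JSON APIs. Conceptually the round trip looks like this (a stdlib sketch; the SDK performs the encoding internally):

```python
import base64

# What export(fmt="html") conceptually returns: the rendered
# document bytes, Base64-encoded into a plain ASCII string.
html_bytes = "<h1>你好，世界</h1>".encode("utf-8")
b64 = base64.b64encode(html_bytes).decode("ascii")

# The receiving side decodes it back to the original document bytes.
restored = base64.b64decode(b64)
assert restored == html_bytes
```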
import asyncio
from docutranslate.sdk import Client

async def translate_multiple():
    client = Client(
        api_key="YOUR_API_KEY",
        base_url="https://api.openai.com/v1/",
        model_id="gpt-4o",
        to_lang="Chinese",
    )
    # Translate multiple files concurrently
    files = ["doc1.pdf", "doc2.docx", "notes.txt"]
    results = await asyncio.gather(
        *[client.translate_async(f) for f in files]
    )
    for r in results:
        print(f"Saved: {r.save()}")

asyncio.run(translate_multiple())
Using Workflow API (For Advanced Control)
For more control, use the Workflow API directly. Each workflow follows the same pattern:
# Pattern:
# 1. Create TranslatorConfig (LLM settings)
# 2. Create WorkflowConfig (workflow settings)
# 3. Create Workflow instance
# 4. workflow.read_path(file)
# 5. await workflow.translate_async()
# 6. workflow.save_as_*(name=...) or export_to_*(...)
Available Workflows and Output Methods
| Workflow | Inputs | save_as_* | export_to_* | Key Config Options |
|---|---|---|---|---|
| MarkdownBasedWorkflow | .pdf, .docx, .md, .png, .jpg | html, markdown, markdown_zip, docx | html, markdown, markdown_zip, docx | convert_engine, md2docx_engine, translator_config |
| TXTWorkflow | .txt | txt, html | txt, html | translator_config |
| JsonWorkflow | .json | json, html | json, html | translator_config, json_paths |
| DocxWorkflow | .docx | docx, html | docx, html | translator_config, insert_mode |
| XlsxWorkflow | .xlsx, .csv | xlsx, html | xlsx, html | translator_config, insert_mode |
| SrtWorkflow | .srt | srt, html | srt, html | translator_config |
| EpubWorkflow | .epub | epub, html | epub, html | translator_config, insert_mode |
| HtmlWorkflow | .html, .htm | html | html | translator_config, insert_mode |
| AssWorkflow | .ass | ass, html | ass, html | translator_config |
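The Client's auto-detection can be pictured as a simple extension-to-workflow lookup derived from the table above. This mapping is illustrative only (the SDK's internal dispatch may differ, e.g. .docx can also be routed through MarkdownBasedWorkflow):

```python
from pathlib import Path

# Illustrative mapping based on the workflow table; not the SDK's internal code.
EXT_TO_WORKFLOW = {
    ".pdf": "MarkdownBasedWorkflow", ".md": "MarkdownBasedWorkflow",
    ".png": "MarkdownBasedWorkflow", ".jpg": "MarkdownBasedWorkflow",
    ".txt": "TXTWorkflow",
    ".json": "JsonWorkflow",
    ".docx": "DocxWorkflow",
    ".xlsx": "XlsxWorkflow", ".csv": "XlsxWorkflow",
    ".srt": "SrtWorkflow",
    ".epub": "EpubWorkflow",
    ".html": "HtmlWorkflow", ".htm": "HtmlWorkflow",
    ".ass": "AssWorkflow",
}

def pick_workflow(path: str) -> str:
    """Map a file path to a workflow name by its extension."""
    ext = Path(path).suffix.lower()
    try:
        return EXT_TO_WORKFLOW[ext]
    except KeyError:
        # Legacy formats such as .doc/.xls are not supported.
        raise ValueError(f"Unsupported file type: {ext!r}")

print(pick_workflow("paper.pdf"))  # → MarkdownBasedWorkflow
print(pick_workflow("subs.srt"))   # → SrtWorkflow
```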
Key Configuration Options
Common TranslatorConfig Options:
| Option | Type | Default | Description |
|---|---|---|---|
| base_url | str | - | AI platform base URL |
| api_key | str | - | AI platform API key |
| model_id | str | - | Model ID |
| to_lang | str | - | Target language |
| chunk_size | int | 3000 | Text chunk size |
| concurrent | int | 10 | Concurrent requests |
| temperature | float | 0.3 | LLM temperature |
| timeout | int | 60 | Request timeout (seconds) |
| retry | int | 3 | Retry attempts |
Format-Specific Options:
| Option | Applicable Workflows | Description |
|---|---|---|
| insert_mode | Docx, Xlsx, Html, Epub | "replace" (default), "append", "prepend" |
| json_paths | Json | JSONPath expressions (e.g., ["$.*", "$.name"]) |
| separator | Docx, Xlsx, Html, Epub | Text separator for append/prepend modes |
| convert_engine | MarkdownBased | "mineru" (default), "mineru_deploy" |
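To make the three insert_mode values concrete, here is how a translated text unit combines with the original (illustrative string semantics; the real implementation operates on document nodes such as paragraphs and cells):

```python
def apply_insert_mode(original: str, translated: str,
                      mode: str = "replace", separator: str = "\n") -> str:
    """Illustrate replace/append/prepend semantics for a single text unit."""
    if mode == "replace":
        return translated                         # keep only the translation
    if mode == "append":
        return original + separator + translated  # original first, then translation
    if mode == "prepend":
        return translated + separator + original  # translation first, then original
    raise ValueError(f"Unknown insert_mode: {mode!r}")

print(apply_insert_mode("Hello", "你好", "replace"))         # → 你好
print(apply_insert_mode("Hello", "你好", "append"))          # 'Hello', newline, '你好'
print(apply_insert_mode("Hello", "你好", "prepend", " / "))  # → 你好 / Hello
```

"append" and "prepend" are useful for bilingual output (e.g. subtitles or study material), while "replace" produces a clean monolingual document.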
Example 1: Translate a PDF File (Using MarkdownBasedWorkflow)
This is the most common use case. We will use the minerU engine to convert the PDF to Markdown, and then translate it using an LLM. This example uses asynchronous execution.
import asyncio
from docutranslate.workflow.md_based_workflow import MarkdownBasedWorkflow, MarkdownBasedWorkflowConfig
from docutranslate.converter.x2md.converter_mineru import ConverterMineruConfig
from docutranslate.translator.ai_translator.md_translator import MDTranslatorConfig
from docutranslate.exporter.md.md2html_exporter import MD2HTMLExporterConfig

async def main():
    # 1. Build Translator Configuration
    translator_config = MDTranslatorConfig(
        base_url="https://open.bigmodel.cn/api/paas/v4",  # AI Platform Base URL
        api_key="YOUR_ZHIPU_API_KEY",  # AI Platform API Key
        model_id="glm-4-air",  # Model ID
        to_lang="English",  # Target Language
        chunk_size=3000,  # Text chunk size
        concurrent=10,  # Concurrency level
        # glossary_generate_enable=True,  # Enable auto-glossary generation
        # glossary_dict={"Jobs": "Steve Jobs"},  # Pass in a glossary dictionary
        # system_proxy_enable=True,  # Enable system proxy
    )

    # 2. Build Converter Configuration (Using minerU)
    converter_config = ConverterMineruConfig(
        mineru_token="YOUR_MINERU_TOKEN",  # Your minerU Token
        formula_ocr=True,  # Enable formula recognition
    )

    # 3. Build Main Workflow Configuration
    workflow_config = MarkdownBasedWorkflowConfig(
        convert_engine="mineru",  # Specify parsing engine
        converter_config=converter_config,  # Pass converter config
        translator_config=translator_config,  # Pass translator config
        html_exporter_config=MD2HTMLExporterConfig(cdn=True),  # HTML export config
    )

    # 4. Instantiate Workflow
    workflow = MarkdownBasedWorkflow(config=workflow_config)

    # 5. Read the file and execute translation
    print("Starting to read and translate file...")
    workflow.read_path("path/to/your/document.pdf")
    await workflow.translate_async()
    # Or use the synchronous method
    # workflow.translate()
    print("Translation complete!")

    # 6. Save the results
    workflow.save_as_html(name="translated_document.html")
    workflow.save_as_markdown_zip(name="translated_document.zip")
    workflow.save_as_markdown(name="translated_document.md")  # Markdown with embedded images
    print("Files saved to ./output folder.")

    # Or get the content as strings directly
    html_content = workflow.export_to_html()
    markdown_content = workflow.export_to_markdown()
    # print(html_content)

if __name__ == "__main__":
    asyncio.run(main())
Other Workflows
All workflows follow the same pattern. Import the corresponding config and workflow, then configure:
# TXT: from docutranslate.workflow.txt_workflow import TXTWorkflow, TXTWorkflowConfig
# JSON: from docutranslate.workflow.json_workflow import JsonWorkflow, JsonWorkflowConfig
# DOCX: from docutranslate.workflow.docx_workflow import DocxWorkflow, DocxWorkflowConfig
# XLSX: from docutranslate.workflow.xlsx_workflow import XlsxWorkflow, XlsxWorkflowConfig
# EPUB: from docutranslate.workflow.epub_workflow import EpubWorkflow, EpubWorkflowConfig
# HTML: from docutranslate.workflow.html_workflow import HtmlWorkflow, HtmlWorkflowConfig
# SRT: from docutranslate.workflow.srt_workflow import SrtWorkflow, SrtWorkflowConfig
# ASS: from docutranslate.workflow.ass_workflow import AssWorkflow, AssWorkflowConfig
Key config options:
- insert_mode: "replace", "append", or "prepend" (for docx/xlsx/html/epub)
- json_paths: JSONPath expressions for JSON translation (e.g., ["$.*", "$.name"])
- separator: Text separator for "append"/"prepend" modes
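To show what json_paths selects, here is a toy resolver covering only the two simplest JSONPath forms that appear in the examples ($.key and $.*). The real implementation delegates to jsonpath-ng, which supports far richer syntax (nested paths, filters, slices):

```python
def select_values(data: dict, path: str) -> list:
    """Toy resolver for the simplest JSONPath forms; illustration only."""
    if path == "$.*":
        return list(data.values())  # every top-level value
    if path.startswith("$.") and "." not in path[2:] and "*" not in path:
        key = path[2:]
        return [data[key]] if key in data else []
    raise NotImplementedError(f"Toy matcher cannot handle: {path!r}")

doc = {"name": "DocuTranslate", "stars": 1000, "desc": "translator"}
print(select_values(doc, "$.name"))  # → ['DocuTranslate']
print(select_values(doc, "$.*"))     # → ['DocuTranslate', 1000, 'translator']
```

Only the selected values are sent for translation; keys and unselected values pass through untouched.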
Prerequisites and Detailed Configuration
1. Get Large Model API Key
Translation functionality relies on Large Language Models. You need to obtain a base_url, api_key, and model_id from the corresponding AI platform.
Recommended Models: Volcengine's doubao-seed-1-6-flash and doubao-seed-1-6 series, Zhipu's glm-4-flash, Alibaba Cloud's qwen-plus and qwen-flash, DeepSeek's deepseek-chat, etc.
302.AI 👈 Register via this link to get $1 free credit.
| Platform Name | Get API Key | Base URL |
|---|---|---|
| ollama | http://127.0.0.1:11434/v1 | |
| lm studio | http://127.0.0.1:1234/v1 | |
| 302.AI | Click to Get | https://api.302.ai/v1 |
| openrouter | Click to Get | https://openrouter.ai/api/v1 |
| openai | Click to Get | https://api.openai.com/v1/ |
| gemini | Click to Get | https://generativelanguage.googleapis.com/v1beta/openai/ |
| deepseek | Click to Get | https://api.deepseek.com/v1 |
| Zhipu AI | Click to Get | https://open.bigmodel.cn/api/paas/v4 |
| Tencent Hunyuan | Click to Get | https://api.hunyuan.cloud.tencent.com/v1 |
| Alibaba Bailian | Click to Get | https://dashscope.aliyuncs.com/compatible-mode/v1 |
| Volcengine | Click to Get | https://ark.cn-beijing.volces.com/api/v3 |
| SiliconFlow | Click to Get | https://api.siliconflow.cn/v1 |
| DMXAPI | Click to Get | https://www.dmxapi.cn/v1 |
| Juguang AI | Click to Get | https://ai.juguang.chat/v1 |
2. PDF Parsing Engine (Skip if you don't need to translate PDFs)
2.1 Get minerU Token (Online PDF Parsing, Free, Recommended)
If you choose mineru as the document parsing engine (convert_engine="mineru"), you need to apply for a free Token.
- Visit minerU Website to register and apply for the API.
- Create a new API Token in the API Token Management Interface.
Note: The minerU Token is valid for 14 days. Please recreate it after expiration.
2.2 Locally Deployed MinerU Service
For offline/intranet environments, you can use locally deployed minerU. Set mineru_deploy_base_url to your minerU API endpoint.
Client SDK:
from docutranslate.sdk import Client
client = Client(
    api_key="YOUR_LLM_API_KEY",
    base_url="http://127.0.0.1:11434/v1",  # required; e.g. a local Ollama endpoint
    model_id="llama3",
    to_lang="Chinese",
    convert_engine="mineru_deploy",
    mineru_deploy_base_url="http://127.0.0.1:8000",  # Your minerU API address
)
result = client.translate("document.pdf")
result.save(fmt="markdown")
FAQ
Q: The output is still in the original language?
A: Check the logs for errors. This is usually caused by exhausted API credits or network issues.
Q: Port 8010 occupied?
A: Use docutranslate -i -p 8011 or set DOCUTRANSLATE_PORT=8011.
Q: Scanned PDFs supported?
A: Yes, use mineru engine with OCR capabilities.
Q: Use in intranet/offline? A: Yes. Local translation can be achieved by deploying a local LLM (Ollama/LM Studio/VLLM, etc.). If you need to parse PDFs locally, you also need to deploy MinerU locally.
Q: PDF cache mechanism?
A: MarkdownBasedWorkflow caches parsing results in memory (last 10 parses). Configure via DOCUTRANSLATE_CACHE_NUM.
Q: Enable proxy?
A: Set system_proxy_enable=True in TranslatorConfig.
Star History
Donation Support
Donations to support the author are welcome. Please note the reason for the donation in the comments!
