DeepSeek OCR WebUI
Ready-to-use DeepSeek-OCR Web UI | Modern Interface | 7 Recognition Modes | Batch Processing | Real-time Logging | Fully Responsive
DeepSeek-OCR-WebUI
English | 简体中文 | 繁體中文 | 日本語
Intelligent OCR System · Vue 3 Modern UI · Batch Processing · Multi-Mode Support
Features • Quick Start • Screenshots • Contributors
v4.1 Update: UI Improvements & Model Version Display

Header shows OCR-2 model badge · Footer displays v4.1 · OCR-2
- OCR-2 Model Badge - Header now shows a prominent `OCR-2` badge so users instantly know the model version
- Table Rendering Fix - OCR-detected tables now display with white backgrounds, dark text, and zebra striping for clear readability (previously they appeared as dark, unreadable blocks)
- Health API `model_version` - The `/health` endpoint now returns `"model_version": "DeepSeek-OCR-2"` for programmatic version detection
- Footer Version - Updated to `v4.1 · OCR-2`
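A client could use the new field to confirm which model a deployment is running. The following is a minimal sketch, assuming the service listens on `http://localhost:8001` as in the Quick Start; the helper names are illustrative and only the `model_version` key comes from the notes above.

```python
import json
from typing import Optional
from urllib.request import urlopen


def model_version_from_health(payload: dict) -> Optional[str]:
    """Return the model version reported in a /health payload, if any."""
    # Servers older than v4.1 presumably omit the field, so default to None.
    return payload.get("model_version")


def fetch_health(base_url: str = "http://localhost:8001") -> dict:
    """GET /health and decode the JSON body (requires a running service)."""
    with urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)
```

With a running container, `model_version_from_health(fetch_health())` would be expected to yield `"DeepSeek-OCR-2"` on v4.1.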
v4.0 Update: DeepSeek-OCR-2 Model Upgrade!
Major model upgrade to DeepSeek-OCR-2 (Visual Causal Flow), with better accuracy and higher resolution!
What's New in v4.0
- DeepSeek-OCR-2 Model - Upgraded to the latest DeepSeek-OCR-2 with Visual Causal Flow architecture
- Higher Resolution - Dynamic resolution up to (0-6)×768×768 + 1×1024×1024 (was 640×640)
- Flash Attention 2 - Native `flash_attention_2` support on CUDA for optimal inference speed
- Improved Accuracy - Better document understanding, chart parsing, and text recognition
- Full Backward Compatibility - All 7 recognition modes, REST API, and frontend unchanged
- Docker v4.0 - New all-in-one image with pre-downloaded OCR-2 model (`Dockerfile.v4.0`)
- Unified Tokenizer - Switched from `AutoProcessor` to `AutoTokenizer` (aligned with official OCR-2 API)
Technical Changes
| Component | v3.6 (OCR v1) | v4.0 (OCR-2) |
|---|---|---|
| Model | `deepseek-ai/DeepSeek-OCR` | `deepseek-ai/DeepSeek-OCR-2` |
| `image_size` | 640 | 768 |
| Attention | eager | `flash_attention_2` (CUDA) |
| Tokenizer | `AutoProcessor` | `AutoTokenizer` |
| Resolution | Fixed crops | Dynamic (0-6)×768 + 1×1024 |

All existing features from v3.6 (concurrency, rate limiting, queue management, Vue 3 frontend) are fully preserved.
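The attention row of the table suggests that the backend is chosen per device. The sketch below shows one way such a choice could be expressed. `attn_implementation` is a keyword accepted by Hugging Face Transformers' `from_pretrained`, while the fallback to `eager` on non-CUDA devices and the use of `trust_remote_code=True` are assumptions, not confirmed project behavior.

```python
MODEL_ID = "deepseek-ai/DeepSeek-OCR-2"


def attention_backend(cuda_available: bool) -> str:
    """Pick an attention implementation for the current device."""
    # flash_attention_2 needs CUDA; assume eager everywhere else.
    return "flash_attention_2" if cuda_available else "eager"


def load_kwargs(cuda_available: bool) -> dict:
    """Keyword arguments one might pass to from_pretrained()."""
    return {
        "trust_remote_code": True,  # assumption: model ships custom code
        "attn_implementation": attention_backend(cuda_available),
    }


# Possible usage (downloads the model, so it is left commented out):
# import torch
# from transformers import AutoModel, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
# model = AutoModel.from_pretrained(
#     MODEL_ID, **load_kwargs(torch.cuda.is_available())
# )
```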
v3.6 Update: Backend Concurrency & Rate Limiting!
Performance optimization with smart queue management and rate limiting!
What's New in v3.6
- Backend Concurrency Optimization - Non-blocking inference with ThreadPoolExecutor
- Rate Limiting - Per-client and per-IP request limits (X-Client-ID header support)
- Queue Management - Real-time queue status with position tracking
- Enhanced Health API - Queue depth, status (healthy/busy/full), and rate limit info
- New Languages - Added Traditional Chinese (zh-TW) and Japanese (ja-JP)
- 429 Error Handling - Graceful handling when queue is full or rate limited
Contributors: @cloudman6 (PR #41)
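The combination of non-blocking inference and a bounded queue can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the project's backend code: `InferenceQueue`, `QueueFull`, `fake_ocr`, and the queue size of 2 are all hypothetical names and values chosen for the demonstration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

MAX_QUEUE_SIZE = 2  # illustrative value, not the project's default


class QueueFull(Exception):
    """Raised when the queue is full; an HTTP layer would answer 429."""


class InferenceQueue:
    def __init__(self, max_queue: int = MAX_QUEUE_SIZE) -> None:
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._slot = asyncio.Semaphore(1)  # one inference at a time
        self._max_queue = max_queue
        self.depth = 0  # requests currently waiting or running

    async def submit(self, fn, *args):
        if self.depth >= self._max_queue:
            raise QueueFull("queue full, retry later")
        self.depth += 1
        try:
            async with self._slot:
                loop = asyncio.get_running_loop()
                # The blocking call runs in a worker thread, so the event
                # loop stays free to serve other requests in the meantime.
                return await loop.run_in_executor(self._pool, fn, *args)
        finally:
            self.depth -= 1


def fake_ocr(name: str) -> str:
    time.sleep(0.05)  # stand-in for blocking model inference
    return f"text of {name}"


async def demo() -> list:
    queue = InferenceQueue()
    jobs = [queue.submit(fake_ocr, n) for n in ("a.png", "b.png", "c.png")]
    return await asyncio.gather(*jobs, return_exceptions=True)
```

Running `asyncio.run(demo())` processes the first two jobs in order and rejects the third with `QueueFull`, which is the situation a 429 response would signal to the client.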
v3.5 Major Update: Brand New Vue 3 Frontend!
Complete UI Overhaul with Modern Vue 3 + TypeScript Architecture!
[Screenshots: Home Page | Processing Page]
What's New in v3.5
- Brand New Vue 3 UI - Modern, responsive design with Naive UI components
- TypeScript Support - Full type safety and better developer experience
- Dexie.js Database - Local IndexedDB for offline page management
- Real-time Processing Queue - Visual OCR progress with queue management
- Health Check System - Backend status monitoring with visual indicators
- Enhanced PDF Support - Smooth PDF rendering with page-by-page processing
- i18n Ready - Built-in internationalization (EN/CN/TW/JP)
- E2E Testing - Comprehensive Playwright test coverage
Contributors
Special Thanks to Our Amazing Contributors!
This project is the result of an outstanding collaboration. The Vue 3 frontend was developed through a successful merge of PR #34.
| Contributor | Role | Contribution |
|---|---|---|
| CloudMan | Vue 3 Frontend Lead Developer | 164 commits · Complete UI Rewrite |
| neosun100 | Project Maintainer | Backend · Docker · Integration |

About the Vue 3 Frontend: @cloudman6 contributed an exceptional Vue 3 + TypeScript frontend with 164 commits, including comprehensive E2E tests, modern UI components, and a production-ready architecture. This collaboration transformed DeepSeek-OCR-WebUI into a professional-grade application!
Introduction
DeepSeek-OCR-WebUI is an intelligent document recognition web application powered by the DeepSeek-OCR model family (DeepSeek-OCR-2 as of v4.0). It provides a modern, intuitive interface for converting images and PDFs to structured text with high accuracy.
Core Highlights
| Feature | Description |
|---|---|
| 7 Recognition Modes | Document, OCR, Chart, Find, Freeform, and more |
| Bounding Box Visualization | Find mode with automatic position annotation |
| Batch Processing | Process multiple images/pages sequentially |
| PDF Support | Upload PDFs, auto-convert to images |
| Modern Vue 3 UI | Responsive design with Naive UI |
| Multilingual | EN, 简体中文, 繁體中文, 日本語 |
| Apple Silicon | Native MPS acceleration for M1/M2/M3/M4 |
| Docker Ready | One-command deployment |
| GPU Acceleration | NVIDIA CUDA support |
Features
7 Recognition Modes
| Mode | Description | Use Cases |
|---|---|---|
| Doc to Markdown | Preserve format and layout | Contracts, papers, reports |
| General OCR | Extract all visible text | Image text extraction |
| Plain Text | Pure text without format | Simple text recognition |
| Chart Parser | Recognize charts and formulas | Data charts, math formulas |
| Image Description | Generate detailed descriptions | Image understanding |
| Find & Locate | Find and annotate positions | Invoice field locating |
| Custom Prompt | Customize recognition needs | Flexible tasks |
Vue 3 Frontend Features
| Panel | Features |
|---|---|
| Page Sidebar | Thumbnail List · Drag & Drop Reorder · Batch Selection · Quick Actions |
| Document Viewer | High-res Image Display · OCR Overlay Toggle · Zoom Controls · Status Indicators |
| Processing Queue | Real-time Progress · Cancel/Retry · Health Monitoring |
| Result Panel | Markdown Preview · Word/PDF Export · Copy to Clipboard |
Screenshots
Home Page
Clean, modern landing page with quick access to all features
Processing Interface
Full-featured document processing with sidebar, viewer, and results panel
Quick Start Guide
Step-by-step guide: Import files → Select pages → Choose OCR mode → Get results
Quick Start
Docker (Recommended)
```bash
# Pull and run
docker pull neosun/deepseek-ocr:v4.1
docker run -d \
  --name deepseek-ocr \
  --gpus all \
  -p 8001:8001 \
  --shm-size=8g \
  neosun/deepseek-ocr:v4.1

# Access: http://localhost:8001
```
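After `docker run` returns, the service may need some time before it answers requests. A small polling helper along the following lines could be used in scripts that wait for the container. This is a sketch: the function names, retry counts, and the assumption that `/health` answers with HTTP 200 once ready are illustrative.

```python
import time
from typing import Callable
from urllib.error import URLError
from urllib.request import urlopen


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30,
                     delay: float = 2.0, sleep=time.sleep) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause before the next try
    return False


def http_probe(url: str = "http://localhost:8001/health") -> bool:
    """True once GET /health answers with HTTP 200."""
    try:
        with urlopen(url, timeout=3) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False  # not reachable yet
```

A deployment script might then call `wait_until_ready(http_probe)` before sending the first OCR request.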
Available Docker Tags
| Tag | Description |
|---|---|
| `latest` | Latest stable (= v4.1) |
| `v4.1` | UI improvements & model version display |
| `v4.0` | DeepSeek-OCR-2 model upgrade |
| `v3.6` | Backend concurrency & rate limiting |
| `v3.5` | Vue 3 frontend version |
| `v3.3.1-fix-bfloat16` | BFloat16 compatibility fix |
Mac (Apple Silicon)
```bash
# Clone and setup
git clone https://github.com/neosun100/DeepSeek-OCR-WebUI.git
cd DeepSeek-OCR-WebUI

# Create conda environment
conda create -n deepseek-ocr python=3.11
conda activate deepseek-ocr

# Install dependencies
pip install -r requirements-mac.txt

# Start service
./start.sh

# Access: http://localhost:8001
```
Linux (Native)
```bash
# With NVIDIA GPU
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
./start.sh
```
API & Integration
REST API
```python
import requests

# Single image OCR
with open("image.png", "rb") as f:
    response = requests.post(
        "http://localhost:8001/ocr",
        files={"file": f},
        data={"prompt_type": "ocr"},
    )
print(response.json()["text"])

# PDF OCR (all pages)
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8001/ocr-pdf",
        files={"file": f},
        data={"prompt_type": "document"},
    )
print(response.json()["merged_text"])
```
Endpoints:
- `GET /health` - Health check
- `POST /ocr` - Single image OCR
- `POST /ocr-pdf` - PDF OCR (all pages)
- `POST /pdf-to-images` - Convert PDF to images

Full API Documentation: API.md
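Because the server may answer 429 when its queue is full or a rate limit is hit, a client might identify itself and retry with a backoff. The sketch below assumes the `requests` library on the caller's side (passed in as `post`, so the helper itself has no hard dependency); the function name, retry count, and backoff schedule are illustrative choices, not documented server requirements.

```python
import time


def ocr_with_retry(post, image_bytes: bytes, client_id: str,
                   retries: int = 3, backoff: float = 1.0,
                   sleep=time.sleep) -> dict:
    """POST one image to /ocr, retrying while the server answers 429."""
    for attempt in range(retries + 1):
        resp = post(
            "http://localhost:8001/ocr",
            files={"file": ("image.png", image_bytes)},
            data={"prompt_type": "ocr"},
            headers={"X-Client-ID": client_id},  # used for per-client limits
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        if attempt < retries:
            sleep(backoff * 2 ** attempt)  # exponential backoff
    raise RuntimeError("server still busy after retries")


# Possible usage:
# import requests
# with open("image.png", "rb") as f:
#     result = ocr_with_retry(requests.post, f.read(), "my-client")
```

Passing the HTTP function as a parameter keeps the retry logic easy to exercise without a running server.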
MCP (Model Context Protocol)
Enable AI assistants like Claude Desktop to use OCR:
```json
{
  "mcpServers": {
    "deepseek-ocr": {
      "command": "python",
      "args": ["/path/to/mcp_server.py"]
    }
  }
}
```
MCP Setup Guide: MCP_SETUP.md
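If a client configuration already lists other MCP servers, the entry can be merged in rather than pasted over the file. The helper below is a hypothetical convenience, not part of the project; it only reproduces the structure shown above and leaves reading and writing the actual configuration file to the caller.

```python
def add_mcp_server(config: dict, server_path: str) -> dict:
    """Return a copy of config with the deepseek-ocr entry added."""
    servers = dict(config.get("mcpServers", {}))  # keep existing servers
    servers["deepseek-ocr"] = {"command": "python", "args": [server_path]}
    return {**config, "mcpServers": servers}
```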
Multilingual Support
| Language | Code | Status |
|---|---|---|
| English | en-US | Supported (default) |
| 简体中文 | zh-CN | Supported |
| 繁體中文 | zh-TW | Supported |
| 日本語 | ja-JP | Supported |
Switch language via the selector in the top-right corner.
Version History
v4.1 (2026-02-20) - UI Improvements & Model Version Display
UI & API Enhancements:
- OCR-2 model badge in header for instant version recognition
- Table rendering fix: white background, dark text, zebra striping
- Health API returns `model_version: "DeepSeek-OCR-2"`
- Footer updated to `v4.1 · OCR-2`
v4.0 (2026-02-20) - DeepSeek-OCR-2 Model Upgrade
Major Model Upgrade:
- Upgraded to DeepSeek-OCR-2 (Visual Causal Flow)
- Dynamic resolution: (0-6)×768×768 + 1×1024×1024
- Flash Attention 2 on CUDA for optimal inference speed
- Switched from `AutoProcessor` to `AutoTokenizer`
- `image_size` upgraded from 640 to 768
- New `Dockerfile.v4.0` with pre-downloaded OCR-2 model
- Full backward compatibility with all v3.6 features
v3.6 (2026-01-20) - Backend Concurrency & Rate Limiting
Performance Optimization:
- Non-blocking inference with ThreadPoolExecutor
- Concurrency control with asyncio.Semaphore (OCR: 1, PDF: 2)
- Queue system with MAX_OCR_QUEUE_SIZE and dynamic status
- Per-IP and per-Client-ID rate limiting (X-Client-ID header)
- 429 error handling (queue full, client limit, IP limit)
- Health indicator with 3 status colors (green/yellow/red)
- OCR queue popover with real-time position display

Contributors: @cloudman6 (PR #41)
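Per-IP and per-client limits like the ones listed above can be pictured as two independent counters checked on every request. The sketch below uses a sliding-window limiter; the class name, the window algorithm, and the limits passed in are assumptions made for illustration, since the changelog does not state how the project counts requests.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self._hits = defaultdict(deque)  # key -> timestamps of recent hits

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # drop hits that left the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def check_request(headers: dict, remote_ip: str,
                  client_limiter: SlidingWindowLimiter,
                  ip_limiter: SlidingWindowLimiter) -> int:
    """Return the HTTP status a server might answer with (200 or 429)."""
    client_id = headers.get("X-Client-ID")
    if client_id and not client_limiter.allow(client_id):
        return 429  # client limit reached
    if not ip_limiter.allow(remote_ip):
        return 429  # IP limit reached
    return 200
```

Keeping the two limiters separate lets several clients behind one IP address be limited individually while the address as a whole still has a ceiling.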
v3.5 (2026-01-17) - Vue 3 Frontend
Complete UI Overhaul:
- Vue 3 + TypeScript + Naive UI
- Dexie.js local database
- Real-time processing queue
- Health check monitoring
- E2E test coverage (Playwright)
- GitHub links in header

Contributors: @cloudman6 (164 commits)
v3.3.1 (2025-12-16) - BFloat16 Fix
- Fixed GPU compatibility for RTX 20xx, GTX 10xx
- Auto-detect compute capability
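The idea behind this fix can be sketched as a dtype choice driven by the GPU's compute capability. Native bfloat16 arrives with NVIDIA's Ampere generation (compute capability 8.0), while RTX 20xx (7.5) and GTX 10xx (6.1) cards predate it. The fallback to float16 and the function name are assumptions; the project may choose a different dtype.

```python
def pick_dtype(major: int, minor: int) -> str:
    """Choose a dtype name from a CUDA compute capability (major, minor)."""
    # bfloat16 is natively supported from compute capability 8.0 onwards.
    return "bfloat16" if (major, minor) >= (8, 0) else "float16"


# Possible usage with PyTorch (left commented out, needs a CUDA device):
# import torch
# major, minor = torch.cuda.get_device_capability()
# dtype = getattr(torch, pick_dtype(major, minor))
```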
v3.3 (2025-11-05) - Apple Silicon
- Native MPS backend for Mac M1/M2/M3/M4
- Multi-platform architecture
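A multi-platform setup like this usually comes down to picking a device at startup. The following sketch shows one plausible ordering (CUDA, then MPS, then CPU); the function name and the priority order are assumptions rather than the project's actual logic.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name for the current machine."""
    if cuda_available:
        return "cuda"  # NVIDIA GPU
    if mps_available:
        return "mps"   # Apple Silicon GPU via Metal Performance Shaders
    return "cpu"


# Possible usage with PyTorch (left commented out):
# import torch
# device = pick_device(torch.cuda.is_available(),
#                      torch.backends.mps.is_available())
```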
v3.2 (2025-11-04) - PDF Support
- PDF upload and conversion
- ModelScope auto-fallback
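An automatic fallback between model sources can be expressed as trying a list of loaders in order. The sketch below is generic and hypothetical: `load_with_fallback` is not a project function, and the commented usage assumes that the `huggingface_hub` and `modelscope` packages are installed and that both expose `snapshot_download`.

```python
def load_with_fallback(loaders):
    """Try each (name, loader) pair in order; return the first success."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Possible usage (downloads model files, so it is left commented out):
# from huggingface_hub import snapshot_download as hf_download
# from modelscope import snapshot_download as ms_download
# source, path = load_with_fallback([
#     ("huggingface", lambda: hf_download("deepseek-ai/DeepSeek-OCR")),
#     ("modelscope", lambda: ms_download("deepseek-ai/DeepSeek-OCR")),
# ])
```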
Documentation
| Document | Description |
|---|---|
| API.md | REST API reference |
| MCP_SETUP.md | MCP integration guide |
| DOCKER_HUB.md | Docker deployment |
| CHANGELOG.md | Version history |
Contributing
Contributions welcome! Please:
- Fork this repository
- Create feature branch (`git checkout -b feature/AmazingFeature`)
- Commit changes (`git commit -m 'Add AmazingFeature'`)
- Push to branch (`git push origin feature/AmazingFeature`)
- Open Pull Request
License
This project is licensed under the MIT License.
Acknowledgments
- DeepSeek-AI - DeepSeek-OCR model
- @cloudman6 - Vue 3 frontend development
- All contributors and users
