Web Page Fetcher
Fetch web pages with SSRF protection. Returns clean, readable markdown from any URL.
paimon-mcp-fetch
Give your AI assistant the ability to read any webpage.
A lightweight MCP server that fetches URLs and returns clean, readable markdown. Built with Go: starts in 5ms, uses ~10MB RAM, zero runtime dependencies.
What It Does
You give it a URL; it returns clean markdown.
Good for:
- Reading articles, blog posts, documentation
- Extracting data from news sites, forums, schedules
- Summarizing web content for your AI assistant
- Getting structured output (headings, lists, tables, code blocks preserved)
Not good for:
- Scraping login-protected pages
- Bypassing paywalls
- Replacing a full browser automation tool
Quick Start
1. Install
Pick one method:
# Go (recommended)
go install github.com/paimonchan/paimon-mcp-fetch/cmd/paimon-mcp-fetch@latest
# Homebrew (macOS/Linux)
brew tap paimonchan/tap
brew install paimon-mcp-fetch
# Scoop (Windows)
scoop bucket add paimonchan https://github.com/paimonchan/scoop-bucket
scoop install paimon-mcp-fetch
# Winget (Windows)
winget install paimonchan.paimon-mcp-fetch
# Docker
docker run -i --rm ghcr.io/paimonchan/paimon-mcp-fetch:latest
2. Configure Your AI Assistant
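Add this to your MCP client config. The example uses the OpenCode format; the exact keys for other clients are listed under Supported AI Assistants: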
{
"mcp": {
"paimon-mcp-fetch": {
"type": "local",
"command": ["paimon-mcp-fetch"],
"enabled": true
}
}
}
3. Done
Your AI can now read any URL you give it.
Why This Over Other Fetch Tools?
| Feature | paimon-mcp-fetch | Basic text fetch |
|---|---|---|
| Output | Structured markdown | Plain text |
| Article extraction | Readability algorithm (strips ads, nav, sidebars) | Raw HTML body |
| Images | Optional extraction + processing | None |
| JS rendering | Optional (headless Chrome) | Static only |
| Caching | Built-in LRU cache | None |
| Rate limiting | Per-domain, configurable | None |
| SSRF protection | 7-layer defense | None |
| Startup time | ~5ms | Varies |
| Memory | ~10MB | Varies |
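To make the "per-domain, configurable" rate-limiting row concrete, here is a minimal token-bucket sketch in Go. It is an illustration only, not the project's actual limiter: `DomainLimiter`, `Allow`, and `countAllowed` are hypothetical names, and the rate (5 per second) and burst (10) are the documented defaults.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket holds the token state for a single domain.
type bucket struct {
	tokens float64
	last   time.Time
}

// DomainLimiter is a hypothetical per-domain token-bucket limiter.
type DomainLimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second
	burst   float64 // maximum tokens a bucket can hold
	buckets map[string]*bucket
}

func NewDomainLimiter(ratePerSec float64, burst int) *DomainLimiter {
	return &DomainLimiter{rate: ratePerSec, burst: float64(burst), buckets: make(map[string]*bucket)}
}

// Allow reports whether a request to host may proceed at time now.
func (l *DomainLimiter) Allow(host string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	b, ok := l.buckets[host]
	if !ok {
		// Each domain starts with a full bucket of its own.
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[host] = b
	}
	// Refill according to elapsed time, capped at the burst size.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * l.rate
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// countAllowed fires n back-to-back requests at host at one fixed instant
// and returns how many were let through.
func countAllowed(l *DomainLimiter, host string, n int) int {
	now := time.Unix(0, 0)
	allowed := 0
	for i := 0; i < n; i++ {
		if l.Allow(host, now) {
			allowed++
		}
	}
	return allowed
}

func main() {
	l := NewDomainLimiter(5.0, 10)
	fmt.Println(countAllowed(l, "example.com", 12)) // 10: the burst is exhausted
	fmt.Println(countAllowed(l, "example.org", 3))  // 3: other domains are unaffected
}
```

Keeping one bucket per host means a burst against one site does not slow down requests to another.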
Configuration
Everything is controlled via environment variables. You probably don't need to change anything; the defaults work well for most use cases.
| Variable | Default | What it does |
|---|---|---|
| PAIMON_MCP_FETCH_TIMEOUT_MS | 12000 | Request timeout (ms) |
| PAIMON_MCP_FETCH_MAX_HTML_BYTES | 10485760 | Max page size (10MB) |
| PAIMON_MCP_FETCH_CACHE_TTL_SECS | 300 | Cache lifetime (5 min) |
| PAIMON_MCP_FETCH_RATE_LIMIT_PER_SECOND | 5.0 | Requests/sec per domain |
| PAIMON_MCP_FETCH_RATE_LIMIT_BURST | 10 | Max burst size |
| PAIMON_MCP_FETCH_RETRY_MAX_ATTEMPTS | 3 | Retry on transient errors |
| PAIMON_MCP_FETCH_JS_RENDER_ENABLED | false | Enable headless Chrome |
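As a rough illustration of how these variables and defaults could map onto typed settings, the sketch below loads them in Go. The `Config` struct, `LoadConfig`, and the helper names are hypothetical, and falling back to the default on an unparsable value is an assumption; the server's real parsing behavior may differ.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config is a hypothetical struct mirroring the table above.
type Config struct {
	Timeout          time.Duration
	MaxHTMLBytes     int64
	CacheTTL         time.Duration
	RatePerSecond    float64
	RateBurst        int
	RetryMaxAttempts int
	JSRenderEnabled  bool
}

// lookupFunc matches os.LookupEnv so callers can swap in another source.
type lookupFunc func(key string) (string, bool)

// intOr returns the integer value of key, or def if unset or invalid (assumed behavior).
func intOr(get lookupFunc, key string, def int64) int64 {
	if v, ok := get(key); ok {
		if n, err := strconv.ParseInt(v, 10, 64); err == nil {
			return n
		}
	}
	return def
}

func floatOr(get lookupFunc, key string, def float64) float64 {
	if v, ok := get(key); ok {
		if f, err := strconv.ParseFloat(v, 64); err == nil {
			return f
		}
	}
	return def
}

func boolOr(get lookupFunc, key string, def bool) bool {
	if v, ok := get(key); ok {
		if b, err := strconv.ParseBool(v); err == nil {
			return b
		}
	}
	return def
}

// LoadConfig reads every documented variable, applying the documented defaults.
func LoadConfig(get lookupFunc) Config {
	const p = "PAIMON_MCP_FETCH_"
	return Config{
		Timeout:          time.Duration(intOr(get, p+"TIMEOUT_MS", 12000)) * time.Millisecond,
		MaxHTMLBytes:     intOr(get, p+"MAX_HTML_BYTES", 10485760),
		CacheTTL:         time.Duration(intOr(get, p+"CACHE_TTL_SECS", 300)) * time.Second,
		RatePerSecond:    floatOr(get, p+"RATE_LIMIT_PER_SECOND", 5.0),
		RateBurst:        int(intOr(get, p+"RATE_LIMIT_BURST", 10)),
		RetryMaxAttempts: int(intOr(get, p+"RETRY_MAX_ATTEMPTS", 3)),
		JSRenderEnabled:  boolOr(get, p+"JS_RENDER_ENABLED", false),
	}
}

func main() {
	cfg := LoadConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the loader easy to exercise without touching the process environment.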
Optional Features
Image Processing
Extract and process images from webpages:
go build -tags image -o paimon-mcp-fetch ./cmd/paimon-mcp-fetch/
JS Rendering
For JavaScript-heavy sites (SPAs, dynamic content):
go build -tags jsrender -o paimon-mcp-fetch ./cmd/paimon-mcp-fetch/
PAIMON_MCP_FETCH_JS_RENDER_ENABLED=true ./paimon-mcp-fetch
Note: Requires Chrome or Chromium installed. Slower (~3-5s/page) but handles sites that static fetch can't.
Supported AI Assistants
Works with any MCP-compatible client. Add the configuration below to your client's config file.
OpenCode
Config file: ~/.config/opencode/opencode.json
{
"mcp": {
"paimon-mcp-fetch": {
"type": "local",
"command": ["paimon-mcp-fetch"],
"enabled": true
}
}
}
Claude Desktop
Config file: claude_desktop_config.json
{
"mcpServers": {
"paimon-mcp-fetch": {
"command": "paimon-mcp-fetch"
}
}
}
Cursor
Config file: .cursor/mcp.json (project) or ~/.cursor/mcp.json (global)
{
"mcpServers": {
"paimon-mcp-fetch": {
"command": "paimon-mcp-fetch",
"env": {}
}
}
}
VS Code (GitHub Copilot)
Config file: .vscode/mcp.json (workspace) or user settings
{
"servers": {
"paimon-mcp-fetch": {
"type": "stdio",
"command": "paimon-mcp-fetch"
}
}
}
Cline
Config file: .cline/mcp.json
{
"mcpServers": {
"paimon-mcp-fetch": {
"command": "paimon-mcp-fetch"
}
}
}
Windsurf
Config file: .windsurf/mcp.json
{
"mcpServers": {
"paimon-mcp-fetch": {
"command": "paimon-mcp-fetch"
}
}
}
With Custom Configuration
All clients support environment variables. Example with custom timeout and cache:
{
"mcpServers": {
"paimon-mcp-fetch": {
"command": "paimon-mcp-fetch",
"env": {
"PAIMON_MCP_FETCH_TIMEOUT_MS": "30000",
"PAIMON_MCP_FETCH_CACHE_TTL_SECS": "600"
}
}
}
}
Security
Built with security in mind from day one:
- SSRF Protection: 7-layer defense against server-side request forgery
- Private IP blocking: Can't access localhost or internal networks
- Redirect validation: Every redirect hop re-checked for safety
- Size limits: Stream-based reading, no memory bombs
- Timeouts: All requests have deadlines
- No secrets in logs: API keys and tokens never logged
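The private IP blocking item can be sketched with Go's standard `net/netip` package. This shows only one possible check, not the project's 7-layer implementation; `isBlockedAddr` is a hypothetical helper, and failing closed on unparsable input is an assumption. To match the redirect validation item, a check like this would have to run on the resolved address of every hop.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isBlockedAddr reports whether an IP literal points at an address a
// fetcher should refuse: loopback, private, link-local, multicast, or
// unspecified. Hypothetical helper for illustration only.
func isBlockedAddr(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return true // fail closed on anything that is not a valid IP (assumption)
	}
	// Treat IPv4-mapped IPv6 such as ::ffff:127.0.0.1 as plain IPv4.
	addr = addr.Unmap()
	return addr.IsLoopback() ||
		addr.IsPrivate() ||
		addr.IsLinkLocalUnicast() ||
		addr.IsMulticast() ||
		addr.IsUnspecified()
}

func main() {
	for _, s := range []string{"127.0.0.1", "10.0.0.5", "169.254.169.254", "::1", "8.8.8.8"} {
		fmt.Printf("%-16s blocked=%v\n", s, isBlockedAddr(s))
	}
}
```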
Windows Antivirus
Some antivirus software may flag unsigned Go binaries as a false positive. This is a known industry issue. Solutions:
- Use go install: antivirus sees the signed Go compiler
- Use Docker: no local binary
- Build from source: verify the code yourself
Architecture
Built with Clean Architecture principles:
MCP Server → UseCase → Domain (entities, ports)
↑
Adapters implement interfaces
- Domain: Business rules, zero external dependencies
- UseCase: Orchestration logic
- Adapters: HTTP client, content extractor, cache, rate limiter, image processor, JS renderer
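The layering above might look roughly like the following Go sketch. Every name here (`Page`, `Fetcher`, `Extractor`, `FetchPage`, and the stub adapters) is hypothetical and chosen for illustration; the project's real interfaces are described in its project plan.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// --- Domain: entity and ports, no external dependencies ---

type Page struct {
	URL      string
	Markdown string
}

type Fetcher interface {
	Fetch(url string) (string, error)
}

type Extractor interface {
	ToMarkdown(html string) (string, error)
}

// --- UseCase: orchestration that depends only on the ports ---

type FetchPage struct {
	fetcher   Fetcher
	extractor Extractor
}

func (uc FetchPage) Execute(url string) (Page, error) {
	html, err := uc.fetcher.Fetch(url)
	if err != nil {
		return Page{}, fmt.Errorf("fetch %s: %w", url, err)
	}
	md, err := uc.extractor.ToMarkdown(html)
	if err != nil {
		return Page{}, fmt.Errorf("extract %s: %w", url, err)
	}
	return Page{URL: url, Markdown: md}, nil
}

// --- Adapters: stand-ins for the HTTP client and content extractor ---

// staticFetcher serves canned HTML instead of making network requests.
type staticFetcher map[string]string

func (f staticFetcher) Fetch(url string) (string, error) {
	html, ok := f[url]
	if !ok {
		return "", errors.New("not found")
	}
	return html, nil
}

// h1Extractor is a toy extractor that turns a lone <h1> into a markdown heading.
type h1Extractor struct{}

func (h1Extractor) ToMarkdown(html string) (string, error) {
	title := strings.TrimSuffix(strings.TrimPrefix(html, "<h1>"), "</h1>")
	return "# " + title, nil
}

func main() {
	uc := FetchPage{
		fetcher:   staticFetcher{"https://example.com": "<h1>Hello</h1>"},
		extractor: h1Extractor{},
	}
	page, err := uc.Execute("https://example.com")
	fmt.Println(page.Markdown, err)
}
```

Because the use case sees only the `Fetcher` and `Extractor` interfaces, adapters such as a cache or a JS renderer can be swapped in without touching the orchestration code.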
Full details in the project plan.
License
MIT. Do whatever you want with it.
Built with Go. Zero runtime dependencies. Single binary. ~10MB RAM. Starts in 5ms.
