Cloudflare MCP Server for Static Sites
Turn your static website into an AI-accessible knowledge base. This Cloudflare Worker serves your content over MCP, so AI tools like Claude can search and retrieve it directly.
Cloudflare is well-suited for hosting remote MCP servers: its Workers platform handles the transport layer, and Durable Objects maintain persistent client sessions.
For this approach in action, see my blog post Publishing Your Content to AI Assistants.
- Why This Matters
- How It Works
- Prerequisites
- Quick Start
- MCP Client Setup
- Threat Model
- Adapters
- Configuration
- Development
- Troubleshooting
- Examples
- Author
Why This Matters
AI assistants answer questions based on training data that may be outdated or incomplete. Web search and retrieval-augmented generation (RAG) each put obstacles between the AI and your content: web search depends on the AI choosing to search and on the right results coming back, while RAG can be tricky to deploy and readers can't easily add it to their AI tools.
An MCP server makes your content a native capability of any AI tool that connects to it. The AI discovers and queries your content automatically, in a fast and token-friendly way.
You might use this to:
- Help users find answers in your documentation
- Give AI assistants access to your blog's content
- Let AI tools cite your articles with accurate, up-to-date information
How It Works
┌────────────────────────────────────────────────────────────────────┐
│                          Your Static Site                          │
│                 (Markdown files with frontmatter)                  │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                              Adapter                               │
│           (Astro, Hugo, or Generic – runs at build time)           │
│                                                                    │
│  Scans your content files, extracts metadata from frontmatter,     │
│  and generates a search-index.json file.                           │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                           Cloudflare R2                            │
│                                                                    │
│  Stores the search index. Only your Worker can access it.          │
│  The Worker caches the index in memory for four hours.             │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                         Cloudflare Worker                          │
│                                                                    │
│  Implements the MCP server. Uses Fuse.js for fuzzy search.         │
│  Durable Objects maintain persistent sessions with MCP clients.    │
└─────────────────────────────────┬──────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                            MCP Clients                             │
│            (Claude Desktop, Claude Code, Cursor, etc.)             │
│                                                                    │
│  Tools available to the AI:                                        │
│  • search_<prefix>  – Find content by keywords                     │
│  • get_article      – Retrieve a specific page by URL              │
│  • get_index_info   – Get index statistics                         │
└────────────────────────────────────────────────────────────────────┘
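The search step in this pipeline can be pictured with a dependency-free sketch. The real Worker uses Fuse.js for fuzzy matching; the `pages` data and the keyword-overlap scoring below are illustrative stand-ins, not the actual implementation:

```javascript
// Simplified stand-in for the Worker's search tool. The real server uses
// Fuse.js for fuzzy matching; this sketch scores pages by keyword overlap.
const pages = [
  { url: "/about", title: "About Us", body: "Learn about our team and mission." },
  { url: "/setup", title: "Setup Guide", body: "Install the tools and deploy." },
];

function searchPages(index, query, limit = 5) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return index
    .map((page) => {
      const haystack = `${page.title} ${page.body}`.toLowerCase();
      // Count how many query terms appear anywhere in the page.
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { page, score };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((r) => ({ url: r.page.url, title: r.page.title }));
}

console.log(searchPages(pages, "team mission"));
// → [{ url: "/about", title: "About Us" }]
```

Fuse.js adds typo tolerance and field weighting on top of this basic idea, which is why the Worker returns useful results even for imprecise queries.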
Prerequisites
| Requirement | What It's For |
|---|---|
| Cloudflare account | Hosts the Worker and R2 bucket. The free tier is sufficient. |
| Node.js 18+ or Bun | Runs the adapter that generates your search index. |
| Wrangler CLI | Deploys the Worker and manages R2. Installed via bun install. |
Quick Start
You can follow these steps manually or point an AI coding tool (Claude Code, Cursor, etc.) at this repo and ask it to set things up. Either way, you'll need a Cloudflare account and these details about your site:
- Site name and domain (e.g., "My Blog" and "blog.example.com")
- Content directory path to your markdown files
- Tool prefix for MCP tool names (e.g., "myblog" → search_myblog)
- MCP endpoint domain (e.g., "mcp.example.com")
1. Clone and Install
git clone https://github.com/lennyzeltser/cloudflare-mcp-for-static-sites.git my-site-mcp
cd my-site-mcp
bun install
2. Configure
Edit wrangler.jsonc:
{
"name": "my-site-mcp-server",
"routes": [
{ "pattern": "mcp.example.com", "custom_domain": true }
],
"r2_buckets": [
{ "binding": "SEARCH_BUCKET", "bucket_name": "my-site-mcp-data" }
]
}
3. Create R2 Bucket
bunx wrangler r2 bucket create my-site-mcp-data
4. Generate and Upload Index
Pick an adapter for your site (see Adapters):
bun adapters/generic/generate-index.js \
--content-dir=../my-site/content \
--site-name="My Site" \
--site-domain="example.com" \
--tool-prefix="mysite"
bunx wrangler r2 object put my-site-mcp-data/search-index.json \
--file=./search-index.json \
--content-type=application/json
5. Deploy
bun run deploy
Your MCP server is now running. Connect an MCP client to start searching.
CI/CD: The included GitHub Actions workflow (.github/workflows/deploy.yml) is set to manual trigger only. To deploy via GitHub Actions, go to Actions → Deploy → Run workflow. To enable auto-deploy on push, edit the workflow and add push: branches: [main] to the triggers.
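Assuming the workflow's trigger section currently lists only workflow_dispatch, the change would look something like this (illustrative; check the actual file before editing):

```yaml
# Trigger section of .github/workflows/deploy.yml (illustrative)
on:
  workflow_dispatch:      # existing manual trigger
  push:
    branches: [main]      # added: auto-deploy on pushes to main
```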
MCP Client Setup
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"my-site": {
"command": "npx",
"args": ["-y", "mcp-remote", "https://mcp.example.com/mcp"]
}
}
}
Claude Code
claude mcp add my-site --transport http https://mcp.example.com/mcp --scope user
Cursor
Add to your Cursor mcp.json:
{
"mcpServers": {
"my-site": {
"url": "https://mcp.example.com/mcp"
}
}
}
Other Clients
Use the mcp-remote package to connect via the /mcp endpoint (streamable HTTP, recommended) or /sse endpoint (SSE transport, legacy).
Available Tools
| Tool | Description |
|---|---|
| search_<prefix> | Search by keywords. Returns titles, URLs, dates, and summaries. |
| get_article | Retrieve full content by URL path (e.g., /about). |
| get_index_info | Get page count, generation date, and tool names. |
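Under the hood, MCP clients invoke these tools with JSON-RPC 2.0 messages POSTed to the /mcp endpoint. A sketch of the request body a client might send (the tool name search_mysite reflects an example prefix, not a fixed name):

```javascript
// Build a JSON-RPC 2.0 "tools/call" request body as an MCP client would.
// The tool name "search_mysite" is illustrative; it comes from your toolPrefix.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const payload = buildToolCall(1, "search_mysite", { query: "deployment steps" });
console.log(JSON.stringify(payload, null, 2));
```

You rarely construct these by hand; clients like Claude Desktop and Cursor handle the protocol for you.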
Threat Model
This MCP server is designed for public content only. Consider these security characteristics before deploying:
What's Exposed
| Exposure | Mechanism |
|---|---|
| All indexed content | get_article retrieves full text by URL path |
| Content enumeration | search_* with broad queries reveals page titles and summaries |
| Site metadata | / endpoint and get_index_info reveal page count, domain, and tool names |
Assumptions
- Your content is already public. The indexed pages come from a public website. This server makes them AI-searchable, not newly public.
- R2 is not the security boundary. While the R2 bucket is private, the Worker exposes its contents through MCP tools. Anyone with the endpoint URL can query all indexed content.
- No authentication. The MCP server accepts connections from any client. There's no API key, OAuth, or access control.
Not Designed For
- Private or internal documentation
- Content requiring authentication or authorization
- Partial access control (all-or-nothing visibility)
Recommendations
If you need access control, consider:
- Cloudflare Access for authentication at the Worker level
- A separate private deployment for internal content
- Excluding sensitive pages from the search index
Adapters
An adapter generates the search index from your content. It scans your files, extracts frontmatter metadata, and outputs search-index.json.
Each adapter handles the specifics of a particular static site generator.
Generic (Markdown)
Works with any site that uses markdown files with YAML frontmatter.
bun adapters/generic/generate-index.js \
--content-dir=./content \
--site-name="My Website" \
--site-domain="example.com" \
--tool-prefix="mysite" \
--output=./search-index.json
See adapters/generic/README.md.
Astro
An Astro integration that generates the index at build time.
// astro.config.mjs
import { searchIndexIntegration } from './src/integrations/search-index.mjs';
export default defineConfig({
integrations: [
searchIndexIntegration({
siteName: 'My Blog',
siteDomain: 'blog.example.com',
toolPrefix: 'myblog',
}),
],
});
Hugo
A Node.js script that handles both TOML and YAML frontmatter.
bun adapters/hugo/generate-index.js \
--content-dir=./content \
--site-name="My Hugo Site" \
--site-domain="example.com"
Writing Your Own Adapter
If your static site generator isn't listed, you can write an adapter. It just needs to output JSON in the v3.0 format.
Your adapter should:
- Find your content files (markdown, MDX, HTML, etc.)
- Extract metadata from frontmatter (title, date, tags)
- Extract body text for search
- Map file paths to URLs
- Write search-index.json
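The frontmatter-extraction step above can be as simple as splitting on the `---` delimiters. A minimal sketch, assuming YAML-style frontmatter with flat key: value pairs (a real adapter would use a proper YAML parser such as js-yaml):

```javascript
// Minimal frontmatter splitter for markdown files. Assumes YAML-style "---"
// delimiters and flat "key: value" lines; use a real YAML parser in practice.
function splitFrontmatter(raw) {
  const match = raw.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/);
  if (!match) return { meta: {}, body: raw }; // no frontmatter block found
  const meta = {};
  for (const line of match[1].split(/\r?\n/)) {
    const i = line.indexOf(":");
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return { meta, body: match[2] };
}

const { meta, body } = splitFrontmatter("---\ntitle: About Us\ndate: 2025-01-01\n---\nPage text.");
console.log(meta.title); // → "About Us"
```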
Here's a template:
import { writeFileSync } from 'fs';
const pages = [/* your content processing logic */];
const index = {
version: "3.0",
generated: new Date().toISOString(),
site: {
name: "My Site",
domain: "example.com",
description: "Brief description for the MCP tool",
toolPrefix: "mysite",
},
pageCount: pages.length,
pages: pages.map(page => ({
url: page.url, // Required: starts with /
title: page.title, // Required
abstract: page.summary, // Optional
date: page.date, // Optional: YYYY-MM-DD
topics: page.tags, // Optional: array
body: page.content, // Recommended for search quality
})),
};
writeFileSync("search-index.json", JSON.stringify(index, null, 2));
Validate your index:
bun run validate ./search-index.json
Upload to R2:
bunx wrangler r2 object put my-site-mcp-data/search-index.json \
--file=./search-index.json \
--content-type=application/json
Configuration
wrangler.jsonc
| Field | Description |
|---|---|
| name | Worker name in Cloudflare dashboard |
| routes[].pattern | Your custom domain |
| r2_buckets[].bucket_name | R2 bucket name |
For testing, you can use a workers.dev subdomain instead of a custom domain:
"workers_dev": true,
// Comment out "routes"
Index Format
The search index follows the v3.0 schema:
{
"version": "3.0",
"generated": "2025-01-15T12:00:00.000Z",
"site": {
"name": "My Website",
"domain": "example.com",
"description": "A site about interesting topics",
"toolPrefix": "mysite"
},
"pageCount": 42,
"pages": [
{
"url": "/about",
"title": "About Us",
"abstract": "Learn about our team.",
"date": "2025-01-01",
"topics": ["about", "team"],
"body": "Full page content..."
}
]
}
| Field | Required | Description |
|---|---|---|
| version | Yes | Schema version ("3.0") |
| generated | Yes | ISO 8601 timestamp |
| site.name | Yes | Site name |
| site.domain | Yes | Domain without protocol |
| site.description | No | Shown in MCP tool description |
| site.toolPrefix | No | Tool name prefix (default: website) |
| pageCount | Yes | Number of pages |
| pages[].url | Yes | Path starting with / |
| pages[].title | Yes | Page title |
| pages[].body | No | Full text (recommended) |
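The required fields above can be checked with a small sketch like the one below. The repo's scripts/validate-index.ts (run via bun run validate) is the authoritative validator; this is only an illustration of the rules in the table:

```javascript
// Illustrative check of the required v3.0 fields; the repo's
// scripts/validate-index.ts is the authoritative validator.
function validateIndex(index) {
  const errors = [];
  if (index.version !== "3.0") errors.push("version must be '3.0'");
  if (!index.generated) errors.push("generated timestamp is required");
  if (!index.site?.name) errors.push("site.name is required");
  if (!index.site?.domain) errors.push("site.domain is required");
  if (typeof index.pageCount !== "number") errors.push("pageCount is required");
  for (const [i, page] of (index.pages ?? []).entries()) {
    if (!page.url?.startsWith("/")) errors.push(`pages[${i}].url must start with /`);
    if (!page.title) errors.push(`pages[${i}].title is required`);
  }
  return errors; // empty array means the index passed these checks
}
```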
Development
bun run dev # Local development server
bun run type-check # TypeScript checking
bun run lint:fix # Lint and fix
bun run format # Format code
bun run deploy # Deploy to Cloudflare
Note: This is a template repository. The bun run deploy command is for users who clone this template to deploy their own MCP server. To contribute to this template itself, use standard git workflows (git push).
Troubleshooting
"Search index not found in R2 bucket"
- Check the bucket exists: bunx wrangler r2 bucket list
- Check the file was uploaded: bunx wrangler r2 object list my-site-mcp-data
- Verify the bucket name in wrangler.jsonc matches the bucket you created
MCP client won't connect
- Use the /mcp endpoint (recommended) or /sse for legacy clients
- Visit your worker URL in a browser: you should see JSON
- Make sure the URL includes https://
Search returns no results
- Validate your index: bun run validate ./search-index.json
- Check that pages have body content
- Try broader search terms
Wrong tool names
Tool names come from toolPrefix in your search index. Regenerate and re-upload the index with the correct value.
Local development
You need a local copy of the search index:
mkdir -p .wrangler/state/r2/my-site-mcp-data
cp search-index.json .wrangler/state/r2/my-site-mcp-data/search-index.json
Examples
Two sites using this approach:
REMnux Documentation
MCP server for REMnux, the Linux toolkit for malware analysis. It gives your AI guidance on the REMnux tools, installation steps, and malware analysis workflows.
Repo: github.com/REMnux/remnux-docs-mcp-server
# Claude Code
claude mcp add remnux-docs --transport http https://docs-mcp.remnux.org/mcp --scope user
Lenny Zeltser's Website
MCP server for zeltser.com. It gives your AI guidance on drafting IR reports, shaping cybersecurity product strategy, and other security leadership topics.
# Claude Code
claude mcp add zeltser-search --transport http https://website-mcp.zeltser.com/mcp --scope user
AI Agent Quick Reference
Key Files
| File | Purpose |
|---|---|
| src/index.ts | Worker entry point: MCP server, tool definitions, Fuse.js search, R2 index cache |
| src/types.ts | Shared TypeScript types for the search index schema |
| wrangler.jsonc | Cloudflare deployment config (Worker name, R2 binding, routes) |
| adapters/ | Index generators for Astro, Hugo, and generic markdown sites |
| scripts/validate-index.ts | Validates search-index.json against the v3.0 schema |
Architecture
Markdown Files → Adapter (build time) → search-index.json → R2 → Worker (MCP) → AI Client
- Adapters run at build time to generate search-index.json
- The Worker loads the index from R2 with 4-hour in-memory caching (uses etag revalidation on expiry)
- Fuse.js provides fuzzy search across titles, abstracts, body text, and topics
- Durable Objects manage persistent MCP client sessions
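The 4-hour cache with etag revalidation can be sketched as follows. The function and field names here are hypothetical, not the actual src/index.ts internals:

```javascript
// Hypothetical sketch of a TTL cache with etag revalidation; names are
// illustrative, not the actual src/index.ts internals.
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;
let cache = null; // { index, etag, expiresAt }

async function loadIndex(fetchFromR2, now = Date.now()) {
  if (cache && now < cache.expiresAt) return cache.index; // still fresh
  // Expired (or cold start): revalidate against R2 using the stored etag.
  const { index, etag, notModified } = await fetchFromR2(cache?.etag);
  if (notModified && cache) {
    cache.expiresAt = now + FOUR_HOURS_MS; // unchanged in R2: just extend TTL
  } else {
    cache = { index, etag, expiresAt: now + FOUR_HOURS_MS }; // new index body
  }
  return cache.index;
}
```

The design choice matters for cost and freshness: within the TTL the Worker never touches R2, and on expiry an unchanged index costs only a conditional lookup rather than a full download.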
Common Dev Tasks
bun run dev # Local dev server (needs local search-index.json in .wrangler/)
bun run deploy # Deploy Worker to Cloudflare
bun run type-check # TypeScript checking
bun run validate ./search-index.json # Validate index
Security Notes
- No authentication: any client with the endpoint URL can query all indexed content
- Designed for public content only
- R2 bucket is private but Worker exposes contents via MCP tools
Author
Lenny Zeltser: Builder of security products and programs. Teacher of those who run them.
