Crawl
Async web search, fetch, crawl, and screenshot tooling with SDK, CLI, and MCP interfaces.
crawl
Async web search, fetch, crawl, extraction, and screenshot tooling with SDK, CLI, and MCP entrypoints.
- curl_cffi for HTTP
- nodriver for browser automation
- FastMCP for the compact agent-facing MCP layer
What It Is
crawl is organized into three layers:
- sdk: the full Python capability surface
- cli: direct command-line access to the SDK
- mcp: a smaller workflow-oriented server for AI agents
The SDK is the source of truth. The CLI wraps most of it. The MCP layer intentionally exposes fewer tools so agents do not get flooded with schemas.
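The layering described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the project's code: the `fetch_page` and `detect_feeds` bodies are stand-ins, `run_cli` and `AGENT_TOOLS` are hypothetical names, and mapping `inspect_url` to `fetch_page` is an assumption made for the demo.

```python
import argparse
import asyncio
from typing import Any, Awaitable, Callable


# --- SDK layer (stand-ins, NOT the real crawl.sdk implementations) ---
async def fetch_page(url: str, mode: str = "http") -> dict[str, Any]:
    """Pretend to fetch a page; a real SDK would do network I/O here."""
    await asyncio.sleep(0)
    return {"final_url": url, "mode": mode}


async def detect_feeds(url: str) -> dict[str, Any]:
    """A second SDK capability that the compact agent layer does not expose."""
    await asyncio.sleep(0)
    return {"url": url, "feeds": []}


# --- CLI layer: a thin argparse wrapper that delegates to the SDK ---
def run_cli(argv: list[str]) -> dict[str, Any]:
    parser = argparse.ArgumentParser(prog="demo-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    fetch = sub.add_parser("fetch-page")
    fetch.add_argument("url")
    fetch.add_argument("--mode", default="http")
    args = parser.parse_args(argv)
    return asyncio.run(fetch_page(args.url, mode=args.mode))


# --- Agent layer: a deliberately small registry over the same SDK ---
# (hypothetical mapping; only a subset of SDK functions is registered)
AGENT_TOOLS: dict[str, Callable[..., Awaitable[dict[str, Any]]]] = {
    "inspect_url": fetch_page,
}
```

Because both outer layers only delegate, behavior is defined once in the SDK, and the agent-facing registry can stay small without removing anything from the SDK itself.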
Install
Editable install:
python -m pip install -e .
Pinned dependencies:
python -m pip install -r requirements.txt
Entry Points
Repo-root entrypoints:
python cli.py --help
python server.py
Installed scripts:
crawl-cli --help
crawl-mcp
Documentation
Highlights
- Search providers: google, searxng, auto, hybrid
- Browser-capable SDK and MCP paths support headless
- Consent handling and resource blocking are built into browser workflows
- Structured extraction, article extraction, forms, feeds, contacts, and technology fingerprinting are all in the SDK
- The MCP server exposes a compact tool surface:
  - search_web
  - inspect_url
  - discover_site
  - extract_structured
  - capture_screenshot
Quick Examples
SDK:
import asyncio

from crawl.sdk import fetch_page, websearch


async def main() -> None:
    search_payload = await websearch("python async browser automation", provider="auto")
    page_payload = await fetch_page("https://example.com", mode="browser", headless=True)
    print(search_payload["count"])
    print(page_payload["final_url"])


asyncio.run(main())
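Since the SDK functions are coroutines, several pages can be fetched concurrently. The following is a minimal sketch under stated assumptions: `fetch_many` and `fake_fetch` are hypothetical helpers, not part of crawl, and the fetcher is passed in as a parameter so the example runs without the package or network access.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Any async callable that takes a URL and returns a payload dict.
Fetcher = Callable[[str], Awaitable[dict[str, Any]]]


async def fetch_many(fetch: Fetcher, urls: list[str], limit: int = 3) -> list[dict[str, Any]]:
    """Fetch all URLs concurrently, at most `limit` at a time, preserving input order."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> dict[str, Any]:
        # The semaphore caps how many fetches are in flight at once.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_fetch(url: str) -> dict[str, Any]:
    """Stand-in fetcher (assumption for the demo); it only echoes the URL back."""
    await asyncio.sleep(0.01)
    return {"final_url": url}


if __name__ == "__main__":
    demo_urls = ["https://example.com/a", "https://example.com/b"]
    for page in asyncio.run(fetch_many(fake_fetch, demo_urls)):
        print(page["final_url"])
```

With the real SDK, the fetcher could plausibly be something like `lambda url: fetch_page(url, mode="browser", headless=True)`, reusing the call shown in the example above. A modest `limit` seems sensible for browser mode, where each fetch is likely to be heavier than a plain HTTP request.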
CLI:
python cli.py websearch "python async browser automation" --provider auto --max-results 5 --pages 1
python cli.py fetch-page https://example.com --mode browser --include-html
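When the CLI is driven from a script, it can help to build the argument list programmatically rather than concatenating strings. A small sketch: `build_fetch_page_command` is a hypothetical helper, and it uses only the subcommand and flags shown above.

```python
import shlex
import sys


def build_fetch_page_command(url: str, mode: str = "browser", include_html: bool = True) -> list[str]:
    """Build the argv list for the fetch-page command shown above."""
    command = [sys.executable, "cli.py", "fetch-page", url, "--mode", mode]
    if include_html:
        command.append("--include-html")
    return command


command = build_fetch_page_command("https://example.com")
# Print a shell-safe rendering of the command for logging or copy-paste.
print(shlex.join(command))
# To execute it from the repository root: subprocess.run(command, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with URLs that contain characters such as `&` or `?`.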
MCP:
- run python server.py
- connect your MCP client to the stdio server
- use the compact workflow tools documented in docs/mcp.md
