Source Coop MCP
MCP server for Source Cooperative auto-discovery and data exploration
Installation
uvx source-coop-mcp
Source Cooperative MCP Server
Discover and access 800TB+ of geospatial data through AI agents.
An MCP (Model Context Protocol) server for Source Cooperative - a collaborative repository with datasets from Maxar, Harvard, ESA, USGS, and 90+ organizations.
Architecture Overview
graph TB
subgraph "AI Clients"
A1[Claude Desktop]
A2[Claude Code]
A3[Cursor]
A4[Cline]
A5[Zed]
A6[Continue.dev]
end
subgraph "MCP Server"
MCP[Source Cooperative MCP<br/>FastMCP + obstore]
end
subgraph "6 Available Tools"
T1[list_accounts<br/>94+ orgs]
T2[list_products<br/>hybrid S3+API]
T3[get_product_details<br/>+ README]
T4[list_product_files<br/>tree mode]
T5[get_file_metadata<br/>no download]
T6[search<br/>hybrid fuzzy]
end
subgraph "Data Sources"
S1[HTTP API<br/>source.coop/api]
S2[S3 Direct<br/>opendata.source.coop]
end
A1 -->|JSON-RPC| MCP
A2 -->|JSON-RPC| MCP
A3 -->|JSON-RPC| MCP
A4 -->|JSON-RPC| MCP
A5 -->|JSON-RPC| MCP
A6 -->|JSON-RPC| MCP
MCP --> T1
MCP --> T2
MCP --> T3
MCP --> T4
MCP --> T5
MCP --> T6
T1 --> S2
T2 --> S1
T2 --> S2
T3 --> S1
T3 --> S2
T4 --> S2
T5 --> S2
T6 --> S1
T6 --> S2
style MCP fill:#4CAF50,stroke:#2E7D32,stroke-width:3px,color:#fff
style S1 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
style S2 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
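The JSON-RPC edges in the diagram carry MCP requests. As a rough illustration, the sketch below builds the kind of `tools/call` message a client might send to invoke `search`. It assumes the standard MCP request shape; the request id is arbitrary, and the `query` argument name is taken from the `search(query)` signature listed in this README.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,  # illustrative; any id unique to the session works
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Example: ask the server to run the `search` tool.
print(build_tool_call(1, "search", {"query": "climate"}))
```

In practice the AI client builds and sends these messages for you; the sketch only shows what travels over the wire.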
Key Features:
- Token Optimized - 72% reduction for large datasets
- Smart Partitions - Auto-detects Hive-style patterns
- Fuzzy Search - Handles typos and partial matches
- No Auth - All 800TB+ is public
Quick Start
Install
uvx source-coop-mcp
Configure Your AI Client
Claude Desktop / Claude Code / Cursor / Cline
Add to config file:
- Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
- Claude Code: VS Code settings.json
- Cursor: Cursor settings
- Cline: Cline MCP settings
{
  "mcpServers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}
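If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge using only the standard library. The helper name `add_source_coop_server` is hypothetical, and the demo runs against a temporary file rather than a real client config.

```python
import json
import tempfile
from pathlib import Path


def add_source_coop_server(config_path: Path) -> dict:
    """Add the source-coop entry to an mcpServers config, keeping other entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["source-coop"] = {"command": "uvx", "args": ["source-coop-mcp"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already contains another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "claude_desktop_config.json"
    demo.write_text('{"mcpServers": {"other": {"command": "x", "args": []}}}')
    print(json.dumps(add_source_coop_server(demo), indent=2))
```

Back up the real config before pointing a script like this at it.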
Zed
Add to Zed settings:
{
  "context_servers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}
Continue.dev
Add to Continue config (~/.continue/config.json):
{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "uvx",
          "args": ["source-coop-mcp"]
        }
      }
    ]
  }
}
Restart your AI client and start exploring!
Available Tools
| Tool | Purpose | Performance |
|---|---|---|
| list_accounts() | Find all 94+ organizations | ~850ms |
| list_products() | Hybrid: S3 mode (default) for ALL datasets + file counts | ~240ms |
| list_products(include_unpublished=False) | API mode for published datasets with rich metadata | ~500ms |
| get_product_details() | Get metadata + README automatically | ~650ms |
| list_product_files() | List files with S3/HTTP paths | ~240ms |
| list_product_files(show_tree=True) | Tree view (72% token savings) | ~980ms |
| get_file_metadata() | Get file info without downloading | ~230ms |
| search(query) | Hybrid: Search accounts + products (published + unpublished), top 5 results | ~5-10s |
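To illustrate how file information can be read without downloading the file, here is a minimal sketch built on an HTTP HEAD request from the standard library. It is not the server's implementation (the diagram above indicates it uses obstore). The function names are hypothetical, and the header values in the demo are made up.

```python
from urllib.request import Request, urlopen


def head_request(url: str) -> Request:
    """Build a HEAD request: the response carries headers but no body."""
    return Request(url, method="HEAD")


def summarize_headers(headers) -> dict:
    """Extract size, last-modified time, and ETag from response headers."""
    size = headers.get("Content-Length")
    return {
        "size_bytes": int(size) if size is not None else None,
        "last_modified": headers.get("Last-Modified"),
        "etag": (headers.get("ETag") or "").strip('"') or None,
    }


def fetch_metadata(url: str) -> dict:
    """Send the HEAD request and summarize the reply (needs network access)."""
    with urlopen(head_request(url), timeout=10) as response:
        return summarize_headers(response.headers)


# Offline demonstration with made-up header values.
print(summarize_headers({"Content-Length": "2048", "ETag": '"abc123"'}))
```

Calling `fetch_metadata` with an HTTP URL returned by `list_product_files()` should transfer only the headers, whatever the file size.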
What You Can Do
Discover Data
"List all organizations in Source Cooperative"
→ Returns 94+ organizations: maxar, planet, harvard, etc.
"Find all datasets for harvard-lil"
→ Discovers published + unpublished products
"Search for climate datasets"
→ Smart fuzzy search handles typos and partial matches
Access Files
"List files in harvard-lil/gov-data"
→ Returns S3 paths and HTTP URLs ready for analysis
"Show me the file tree with partition detection"
→ Smart visualization: year={2020,2021,...+5 more}/ [partitioned]
"Get file metadata without downloading"
→ Size, last modified, ETag
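The partition summary shown above can be approximated with a few lines of Python. This is only a sketch of the idea, assuming Hive-style `key=value` directory names. The function name, the `shown` parameter, and the sample directories are illustrative and not part of the server's API.

```python
import re

# A Hive-style partition directory looks like "key=value".
HIVE_SEGMENT = re.compile(r"^([^=/]+)=([^/]+)$")


def summarize_partitions(dir_names: list[str], shown: int = 2) -> list[str]:
    """Collapse sibling key=value directories into one summary line per key."""
    groups: dict[str, list[str]] = {}
    others: list[str] = []
    for name in dir_names:
        match = HIVE_SEGMENT.match(name)
        if match:
            groups.setdefault(match.group(1), []).append(match.group(2))
        else:
            others.append(f"{name}/")
    lines = []
    for key, values in groups.items():
        values = sorted(values)
        head = ",".join(values[:shown])
        hidden = len(values) - shown
        tail = f",...+{hidden} more" if hidden > 0 else ""
        lines.append(f"{key}={{{head}{tail}}}/ [partitioned]")
    return lines + others


# Seven yearly partitions plus one ordinary directory.
dirs = [f"year={y}" for y in range(2020, 2027)] + ["docs"]
print("\n".join(summarize_partitions(dirs)))
```

Seven sibling directories collapse into a single line, which is where the extra token savings for heavily partitioned datasets would come from.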
Smart Search
"Search for climte" (typo)
→ Finds "climate" datasets (fuzzy matching)
"Search for geo" (partial)
→ Finds "geospatial", "geocoding", etc.
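Typo-tolerant and partial matching of this kind can be sketched with Python's `difflib`. The server's actual matching algorithm is not described here, so treat the following as one possible approach: substring hits rank first, then names whose words are similar to the query. The function name, the cutoff, and the dataset names are all made up for the example.

```python
import difflib


def fuzzy_search(query: str, names: list[str], limit: int = 5, cutoff: float = 0.7) -> list[str]:
    """Return up to `limit` names that contain or closely resemble the query."""
    q = query.lower()

    def score(name: str) -> float:
        words = name.lower().replace("_", "-").split("-")
        if any(q in word for word in words):
            return 1.0  # partial match, e.g. "geo" inside "geospatial"
        # Otherwise use the best similarity between the query and any word.
        return max(difflib.SequenceMatcher(None, q, word).ratio() for word in words)

    ranked = sorted(names, key=score, reverse=True)
    return [name for name in ranked if score(name) >= cutoff][:limit]


names = ["climate-data", "geospatial", "geocoding", "gov-data"]  # hypothetical
print(fuzzy_search("climte", names))
print(fuzzy_search("geo", names))
```

The cutoff trades recall for precision: lowering it tolerates heavier typos but also admits weaker matches.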
Features
| Feature | Description |
|---|---|
| Complete Discovery | Finds unpublished products the official API doesn't show |
| No Authentication | All 800TB+ data is public |
| Fast Performance | Rust-backed S3 client (9x faster than boto3) |
| Token Optimized | Tree mode: 72% token reduction for large datasets |
| Smart Partitions | Auto-detects patterns: year={2020,2021,...} |
| Fuzzy Search | Handles typos and partial matches |
| README Integration | Documentation automatically included |
| 800TB+ Data | 94+ organizations, geospatial datasets |
Example Workflow
1. "List all organizations"
→ Get 94+ account names
2. "Show me all datasets from maxar"
→ Discover published + unpublished products
3. "Search for climate data"
→ Smart fuzzy search finds relevant datasets
4. "Get details for harvard-lil/gov-data"
→ Full metadata + README content
5. "List files in this dataset with tree view"
→ Token-optimized tree with partition detection
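The tree view in step 5 saves tokens because each shared directory prefix is printed once instead of once per file. A minimal sketch of such a renderer follows. The function names and sample paths are illustrative, and the server's real output format may differ.

```python
def build_tree(paths: list[str]) -> dict:
    """Nest file paths into a dict of dicts keyed by path component."""
    root: dict = {}
    for path in paths:
        node = root
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return root


def render_tree(node: dict, indent: int = 0) -> list[str]:
    """Render the nested dict as indented lines; directories end with '/'."""
    lines = []
    for name in sorted(node):
        children = node[name]
        suffix = "/" if children else ""
        lines.append("  " * indent + name + suffix)
        lines.extend(render_tree(children, indent + 1))
    return lines


paths = [
    "data/year=2020/part-0.parquet",
    "data/year=2020/part-1.parquet",
    "data/year=2021/part-0.parquet",
    "README.md",
]
print("\n".join(render_tree(build_tree(paths))))
```

The benefit is small for a handful of files and grows as more files share long prefixes.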
Why This Server?
Problem
Source Cooperative has 800TB+ of valuable data, but:
- Official API only shows published products
- No auto-discovery of organizations
- Requires knowing what you're looking for
Solution
This MCP server provides:
- Complete auto-discovery (published + unpublished)
- Smart search with fuzzy matching
- Direct S3 access for all files
- Token-optimized outputs (72% reduction)
- Smart partition detection (10-88% additional savings)
- README documentation included automatically
- No authentication required
Performance
All operations except search complete in under 1 second:
list_accounts(): ~850ms (94+ organizations)
list_products(): ~240ms (S3 mode - ALL datasets + file counts)
list_products(include_unpublished=False): ~500ms (API mode - published with metadata)
list_product_files(): ~240ms (simple list)
list_product_files(show_tree=True): ~980ms (72% token savings)
get_file_metadata(): ~230ms (HEAD only)
search(query): ~5-10s (hybrid search - 1 recursive S3 scan, top 5 enriched)
Token Optimization Impact
| Dataset Size | Without Tree | With Tree | Saved |
|---|---|---|---|
| 10 files | 1,500 tokens | 415 tokens | 72.3% |
| 100 files | 15,000 tokens | 4,150 tokens | 72.3% |
| 1,000 files | 150,000 tokens | 41,500 tokens | 72.3% |
With partition detection (1,000 partitions): 88% total savings!
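The "Saved" column follows directly from the two token counts in each row. A short check of that arithmetic, using only the figures from the table above (the helper name is illustrative):

```python
def percent_saved(without_tree: int, with_tree: int) -> float:
    """Percentage of tokens saved by tree mode, rounded to one decimal."""
    return round(100 * (without_tree - with_tree) / without_tree, 1)


# (tokens without tree, tokens with tree) per dataset size, from the table.
rows = {10: (1_500, 415), 100: (15_000, 4_150), 1_000: (150_000, 41_500)}
for files, (flat, tree) in rows.items():
    print(f"{files:>5} files: {percent_saved(flat, tree)}% saved")
```

Both token counts scale linearly with file count in this table, so the percentage is the same in every row.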
Requirements
- Python: 3.11 or higher
- Package Manager: uv (provides the uvx command)
- Operating Systems: macOS, Linux, Windows
Development
See DEVELOPMENT.md for:
- Architecture details
- Testing instructions
- Contributing guidelines
- Performance benchmarks
- Token optimization details
Support
- Issues: GitHub Issues
License
MIT License - see LICENSE for details.
