Basedosdados MCP
Model Context Protocol server for Base dos Dados (Brazilian open data platform)
Installation
npx basedosdados-mcpAsk AI about Basedosdados MCP
Powered by Claude · Grounded in docs
I know everything about Basedosdados MCP. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
Base dos Dados MCP
A Model Context Protocol (MCP) server that provides AI-optimized access to Base dos Dados, Brazil's largest open data platform.
💡 Create Reports with Real Brazilian Datasets with Claude Desktop
Ask it to replicate a news article:
User:
https://www.gov.br/secom/pt-br/assuntos/noticias/2025/05/no-melhor-abril-do-novo-caged-brasil-gera-257-mil-vagas-com-carteira-assinada
consegue replicar a materia e checar os numeros?
Claude: https://claude.ai/public/artifacts/15d7f5c5-f017-4a96-9380-a93e535001fd
Or a research question
User:
https://claude.ai/share/f8dbbe5b-1c34-4804-9462-1bfb9008558d
Claude: https://claude.ai/share/f8dbbe5b-1c34-4804-9462-1bfb9008558d
✨ Features
- 🔍 Smart Search: Portuguese language support with accent normalization
- 📊 BigQuery Integration: Direct SQL execution and table references
- 🇧🇷 Brazilian Data Focus: IBGE, RAIS, TSE, INEP datasets
- 🤖 AI-Optimized: Single-call comprehensive data retrieval
🚀 Quick Install to Claude Desktop
Just run this line in your terminal
bash -i <(curl -LsSf https://raw.githubusercontent.com/JoaoCarabetta/basedosdados-mcp/refs/heads/main/install.sh)
Add your BigQuery project_id, location and service account.
Restart Claude Desktop and you are good to go!
⚠️ I just tested it in my Mac M-Series
🛠️ Tools
| Tool | Description |
|---|---|
search_datasets | Search datasets with Portuguese support |
get_dataset_overview | Get complete dataset overview with tables |
get_table_details | Get table details with columns and SQL samples |
execute_bigquery_sql | Execute SQL queries directly |
check_bigquery_status | Check BigQuery authentication |
🔧 Development
Quick Development Setup
Set up the MCP server for local development with live code reloading:
git clone https://github.com/JoaoCarabetta/basedosdados-mcp
cd basedosdados-mcp
./dev_install.sh
This script will:
- ✅ Install all dependencies in development mode
- ✅ Create a development wrapper script with live reloading
- ✅ Configure Claude Desktop with
basedosdadosdevserver - ✅ Set up proper environment variables and paths
Manual Development Setup
If you prefer manual setup:
# Clone and install dependencies
git clone https://github.com/JoaoCarabetta/basedosdados-mcp
cd basedosdados-mcp
uv sync --extra dev
# Run development server
uv run basedosdados-mcp-dev
# Or test directly
./run_dev_server.sh
Development Features
- 🔄 Live Code Reloading: Changes to
src/are reflected immediately - 🐛 Debug Logging: Enhanced logging for development
- 🧪 Full Test Suite: Comprehensive pytest-based tests
- 🎯 Separate Server:
basedosdadosdevserver in Claude Desktop - ⚡ Fast Development: No need to reinstall after code changes
🧪 Testing
Install development dependencies and run the comprehensive test suite:
# Install dev dependencies (includes pytest)
uv sync --extra dev
# Run all encoding tests
pytest tests/ -v
# Run specific test categories
pytest tests/ -k "encoding" -v # Encoding tests only
pytest tests/ -k "live" -v # Live server tests only
pytest tests/ -m "not live" -v # Skip live tests (faster)
# Run with coverage
pytest tests/ --cov=basedosdados_mcp --cov-report=html
Test Structure
The pytest-based test suite includes:
-
test_encoding_pytest.py: Core Portuguese character encoding tests- UTF-8 encoding/decoding validation
- JSON serialization integrity
- MCP protocol compatibility
- Parametrized tests for individual Portuguese words
-
test_live_mcp_pytest.py: Live MCP server endpoint tests- Real API response validation
- Backend API encoding verification
- Large response handling
- End-to-end encoding flow
-
conftest.py: Shared fixtures and pytest configuration- Common test data and utilities
- Automatic test marking and organization
Encoding Validation
Portuguese characters tested: população, educação, saúde, região, satélites, políticas, públicas, and 30+ more common Brazilian dataset terms.
Corruption patterns detected: é, á, Ã, ó, ú, ã, ç, ô, ê, à , õ (common UTF-8 corruption)
If you experience encoding issues in Claude Desktop (e.g., seeing satélites instead of satélites), run the tests to verify the issue is in the display layer, not the MCP server itself.
📚 About
Base dos Dados is Brazil's largest open data platform, providing standardized access to Brazilian public datasets through BigQuery.
Data Coverage: Demographics, Economics, Education, Politics, Health, Environment
📄 License
MIT License
