Find Kusto Table MCP
A mcp server for finding the right kusto table when you have thousands of tables across multiple clusters
Installation
npx find-kusto-table-mcpAsk AI about Find Kusto Table MCP
Powered by Claude Β· Grounded in docs
I know everything about Find Kusto Table MCP. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
π Kusto Table Search MCP Server
An experimental Model Context Protocol (MCP) server for Kusto table discovery and query building. Built with FastMCP.
π What Does It Do?
This is an experimental MCP server for AI agents working with Azure Data Explorer (Kusto):
Features
- π Table Search: Find tables using natural language queries
- π‘οΈ Schema Validation: Helps prevent AI hallucination by sampling actual table schemas
- π Jupyter Notebook Integration: Transparent, repeatable investigations with Kqlmagic
- β‘ Caching: Lazy loading and LRU caching for performance
- π§ Azure SDK Integration: Connects to real Kusto clusters with Azure authentication
- οΏ½ Interactive Visualizations: Charts and dashboards directly in notebooks
- π― Query Helpers: Anti-pattern detection and optimization suggestions
π― Use Cases
- Helping AI agents build Kusto queries using actual table schemas
- Exploring large Kusto environments with many tables
- Creating transparent, repeatable investigation notebooks
- Interactive data analysis with Python/KQL integration
- Learning and experimenting with Kusto/KQL
π Quick Start
One-Click Installation for VS Code
Prerequisites
- Python 3.9 or higher
- Azure credentials with Kusto access (Managed Identity, Azure CLI, or Device Auth)
pipxfor isolated installation (recommended)
Installation
Option 1: Using pipx (Recommended)
# Install pipx if you don't have it
python -m pip install --user pipx
python -m pipx ensurepath
# Install the MCP server
pipx install find-kusto-table-mcp
# Run the server
find-kusto-table-mcp
Option 2: Using pip
pip install find-kusto-table-mcp
python -m kusto_mcp
Option 3: From Source (Development)
# Clone the repository
git clone https://github.com/AmeliaRose802/find-kusto-table-mcp.git
cd find-kusto-table-mcp
# Install dependencies
pip install -r requirements.txt
# Setup connection strings (copy from example)
cp connection_strings.json.example connection_strings.json
# Edit connection_strings.json with your cluster details
# Run the server
python kusto_server.py
Configuration for Claude Desktop
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"kusto-table-search": {
"command": "python",
"args": ["/path/to/cache-kusto-info/kusto_server.py"],
"env": {}
}
}
}
Using with FastMCP Client
from fastmcp import FastMCP
from fastmcp.client import Client
import kusto_server
async with Client(kusto_server.mcp) as client:
# Search for tables
result = await client.call_tool(
"search_kusto_tables",
arguments={"query": "wireserver requests", "limit": 5}
)
print(result)
π¨ Key Features Explained
1. Real Kusto Connectivity (NEW!)
Connect to Azure Data Explorer using the official Azure SDK with automatic authentication fallback:
# Automatic authentication chain:
# 1. Managed Identity (for Azure-hosted services)
# 2. Azure CLI (for local development)
# 3. Device Auth (interactive fallback)
### Configuration
```bash
# Configure in cache/connection_strings.json
{
"clusters": {
"mycluster": {
"cluster_url": "https://mycluster.westus.kusto.windows.net",
"use_real_client": true # Enable Azure SDK
}
}
}
# Built-in resilience:
# - Circuit breaker (protects against cascading failures)
# - Exponential backoff retry (3 attempts with jitter)
# - Bulkhead isolation (20 concurrent queries max)
# - Rate limiting (token bucket algorithm)
2. Advanced Analytics Engine (NEW!)
Perform sophisticated analysis on cached query results without polluting context:
# Statistical Analysis
stats = query_handle_statistical_analysis(
handle="qh_abc123",
columns=["ResponseTime", "RequestCount"]
)
# Returns: mean, median, stddev, P50/P75/P90/P95/P99, skewness, kurtosis
# Correlation Detection
correlations = query_handle_correlation_analysis(
handle="qh_abc123",
numeric_columns=["CPU", "Memory", "Latency"]
)
# Returns: Pearson correlation matrix
# Anomaly Detection (3 methods)
anomalies = query_handle_outlier_detection(
handle="qh_abc123",
column="RequestLatency",
method="iqr", # or "zscore" or "modified_zscore"
threshold=1.5
)
# Returns: outlier values, indices, statistics
# Time Series Analysis
trends = query_handle_time_series_analysis(
handle="qh_abc123",
time_column="Timestamp",
value_column="ErrorRate",
window_size=10
)
# Returns: trend (increasing/decreasing/stable), rate_of_change, moving_averages
3. Query Optimization (NEW!)
Intelligent query analysis with anti-pattern detection and automatic optimization:
# Comprehensive Query Analysis
analysis = query_analyze_optimization(
query="""
MyTable
| where EventTime > ago(7d)
| extend ComputedField = tostring(RawData)
| order by EventTime desc
| take 100
"""
)
# Returns:
# - Anti-patterns detected (no time filter, SELECT *, inefficient ordering, etc.)
# - Complexity score (0-100+)
# - Estimated cost (low/medium/high/very high)
# - Optimization suggestions
# - Optimized query
# Index Recommendations
indexes = query_suggest_indexes(
query="MyTable | where UserId == 'user123' and EventTime > ago(1h)",
table_name="MyTable"
)
# Returns: recommended indexes (range, datetime, hash) with justification
# Query Comparison
comparison = query_compare_performance(
query1="MyTable | where EventTime > ago(1d) | take 1000",
query2="MyTable | where EventTime > ago(1d) | summarize count()"
)
# Returns: side-by-side complexity, cost, anti-patterns, recommendation
4. Jupyter Notebook Integration (NEW!)
Create transparent, repeatable investigations using Kqlmagic:
# Create an investigation notebook
create_investigation_notebook(
name="my_investigation",
title="Production Issue Analysis",
description="Investigating errors in ApplicationLogs",
pattern="investigation" # or monitoring, troubleshooting, exploration
)
# Add Kqlmagic connection
add_kusto_connection(
notebook="my_investigation",
cluster="admeus",
database="AdmeusDB",
auth_method="azcli" # Azure CLI authentication
)
# Add KQL query cells
add_query_to_notebook(
notebook="my_investigation",
title="Error Rate Analysis",
query="ApplicationLogs | where Severity >= 3 | summarize count() by bin(EventTime, 5m)",
description="Track error rates over time",
add_python_analysis=True # Adds pandas analysis cell
)
# Add visualization
add_visualization_to_notebook(
notebook="my_investigation",
title="Error Distribution",
query="ApplicationLogs | summarize count() by ErrorType | render piechart"
)
# Add documentation
add_markdown_to_notebook(
notebook="my_investigation",
title="Investigation Summary",
content="## Findings\n\n- Error rate spiked at 3pm...\n- Root cause: ...\n"
)
Benefits:
- β Transparent: All queries visible in cells
- β Repeatable: Re-run entire notebook anytime
- β
Shareable: Export
.ipynband share with team - β Interactive: Mix KQL, Python, visualizations
- β Version-controlled: Commit to git
5. Smart Table Search with Lazy Caching
Traditional approach: Load all schemas upfront (slow, memory-intensive) Our approach: Load schemas on-demand as you search (fast, efficient)
# Search is instant - schemas loaded only when needed
search_kusto_tables(
query="node health monitoring tables",
method="hybrid", # Combines keyword + fuzzy + semantic
limit=10
)
6. Anti-Hallucination Query Building
AI agents love to invent column names. We prevent this:
# Step 1: Sample table to get ACTUAL column names
schema = sample_table_for_query_building(
cluster="admeus",
database="AdmeusDB",
table="NodeHealthEvents"
)
# Step 2: Build query using ONLY the real column names
# β
Uses: EventTime, NodeName, Status (from actual schema)
# β Never invents: Timestamp, Node, State
7. Investigation Workflows
List and discover investigation notebooks:
# List all investigation notebooks
list_investigation_notebooks()
# Returns: List of all notebooks with metadata (pattern, queries, creation date)
# Open notebook in Jupyter
# notebooks/my_investigation.ipynb
π Tool Reference
Core Search Tools
| Tool | Purpose | Key Parameters |
|---|---|---|
search_kusto_tables | Find tables using natural language | query, method, limit |
get_table_details | Get comprehensive table info | cluster, database, table |
sample_table_for_query_building | Anti-hallucination: Get real schema | cluster, database, table |
Query Handle Tools
| Tool | Purpose | Key Parameters |
|---|---|---|
execute_query_with_handle | Execute query, return handle | query, limit |
query_handle_analyze | Analyze cached results | handle, operation, column |
query_handle_list | List all active handles | include_expired |
query_handle_validate | Check if handle is valid | handle |
query_handle_get_sample | Get sample rows from handle | handle, num_rows |
Advanced Analytics Tools (NEW!)
| Tool | Purpose | Key Parameters |
|---|---|---|
query_handle_statistical_analysis | Statistical analysis on cached results | handle, columns |
query_handle_correlation_analysis | Detect correlations between columns | handle, numeric_columns |
query_handle_outlier_detection | Anomaly detection (IQR/Z-score) | handle, column, method, threshold |
query_handle_time_series_analysis | Trend detection and time-series analysis | handle, time_column, value_column, window_size |
Query Optimization Tools (NEW!)
| Tool | Purpose | Key Parameters |
|---|---|---|
query_analyze_optimization | Comprehensive query analysis | query |
query_suggest_indexes | Index recommendations | query, table_name |
query_compare_performance | Side-by-side query comparison | query1, query2 |
Template Tools
| Tool | Purpose | Key Parameters |
|---|---|---|
template_create | Create reusable query template | name, query, parameters |
template_list | List all templates (10 production templates included) | tags, search |
template_render | Generate query from template | name, parameters |
template_get | Get template details | name |
template_delete | Delete custom template | name |
Cache & Performance Tools
| Tool | Purpose | Key Parameters |
|---|---|---|
cache_stats | Get comprehensive cache statistics | - |
cache_clear | Clear expired cache entries | - |
performance_stats | Get performance metrics | operation (optional) |
π‘οΈ Anti-Hallucination Best Practices
β DO: Always Sample Before Querying
# CORRECT WORKFLOW
1. search_kusto_tables("error logs")
2. sample_table_for_query_building(cluster, db, table)
3. Build query using EXACT column names from step 2
4. execute_query_with_handle(query)
β DON'T: Assume Column Names
# WRONG - Assumes "Timestamp" exists
query = "ErrorLogs | where Timestamp > ago(1h)"
# RIGHT - Uses actual column from schema
schema = sample_table_for_query_building(...)
time_col = schema["schema"]["primary_time_column"] # e.g., "EventTime"
query = f"ErrorLogs | where {time_col} > ago(1h)"
οΏ½ Running the Server
Local Development (Default)
# Run with STDIO transport (for Claude Desktop)
python kusto_server.py
Alternative Transports
# Run with HTTP transport
fastmcp run kusto_server.py --transport http --port 8000
# Run with SSE transport
fastmcp run kusto_server.py --transport sse
π§ͺ Testing
# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=html
# Run specific test file
pytest tests/test_fastmcp_server.py -v
# Run integration tests only
pytest tests/test_fastmcp_server.py::TestKustoServerIntegration -v
# Run resilience pattern tests
pytest tests/test_resilience.py -v
# Run real Kusto client tests
pytest tests/test_kusto_client.py -v
π Recent Enhancements
See OVERNIGHT_ENHANCEMENTS_V3.md for comprehensive documentation of recent production-ready improvements:
1. Real Kusto Client Integration
- Native Azure SDK connectivity (azure-kusto-data 4.0+)
- Multi-auth support: Managed Identity β Azure CLI β Device Auth
- Built-in resilience: circuit breakers, retry, bulkheads
- Impact: Production-ready Kusto connectivity with enterprise reliability
2. Advanced Analytics Engine
- Statistical analysis: mean, median, stddev, percentiles (P50-P99)
- Correlation detection: Pearson correlation matrices
- Anomaly detection: IQR, Z-score, Modified Z-score methods
- Time-series analysis: trend detection, rate of change, moving averages
- Data quality scoring: completeness, consistency, validity (0-100 score)
- Impact: Perform sophisticated analysis without context pollution
3. Query Optimization Analyzer
- Anti-pattern detection: 10+ patterns (no time filter, SELECT *, inefficient ordering)
- Complexity scoring: 0-100+ with cost estimation
- Automatic optimization: query rewriting with best practices
- Index recommendations: range, datetime, hash indexes
- Side-by-side comparison: compare alternative queries
- Impact: Write better queries, reduce costs, improve performance
4. Enhanced Error Recovery System
- Circuit breaker: state machine with auto-recovery (5 failure threshold, 60s timeout)
- Exponential backoff retry: 3 attempts with jitter (1s β 2s β 4s)
- Bulkhead isolation: 20 concurrent query limit
- Fallback handlers: graceful degradation
- Rate limiting: token bucket algorithm
- Impact: Production-grade fault tolerance
5. Query Result Caching with Persistence
- Disk-backed storage: survives server restarts
- LRU eviction: memory limit 1000 handles
- Lazy loading: handles loaded only when accessed
- Integrity verification: pickle serialization with validation
- Cache statistics: hit rate, disk I/O, memory usage
- Impact: Efficient memory usage, reliable caching
6. Advanced Query Template Library
- 10 production-ready templates: error analysis, performance, security, cost
- Comprehensive documentation: parameters, tags, execution times
- Common scenarios: monitoring, SLA tracking, threat detection, FinOps
- Impact: Accelerate common analysis patterns
Total Additions: ~3000+ lines of production code, 7 new files, 7 new MCP tools, comprehensive test coverage
π Performance Characteristics
- Table Search: <100ms for cached queries, <500ms for first search
- Schema Cache: <10ms for cached lookups, <2s for fresh fetch
- Query Handles: <50ms for analytics operations, persistent across restarts
- Memory: <200MB typical, <500MB with large caches, LRU eviction at 1000 handles
- Disk I/O: Lazy loading, handles loaded only when accessed
- Resilience: Circuit breaker opens after 5 failures, 60s timeout, auto-recovery
- Concurrency: Bulkhead limits 20 concurrent Kusto queries, prevents resource exhaustion
- Retry: Exponential backoff (1s β 2s β 4s) with jitter, 3 max attempts
ποΈ Architecture
kusto_server.py (FastMCP server - 20+ tools)
βββ @mcp.tool decorators (20+ tools including analytics & optimization)
βββ @mcp.resource (2 help resources)
βββ @mcp.prompt (2 reusable prompts)
src/
βββ core/ # Infrastructure
β βββ logging_config.py # Structured logging
β βββ config.py # Configuration management
β βββ performance.py # Performance monitoring
β βββ exceptions.py # Custom exceptions
β βββ resilience.py # NEW: Circuit breakers, retry, bulkheads, rate limiting
βββ services/ # Business logic
β βββ schema_cache_service.py # Lazy schema caching
β βββ query_handle_service.py # NEW: Disk-backed result caching with LRU
β βββ query_template_service.py # Template management (10 production templates)
β βββ query_templates.json # NEW: Production query template library
β βββ kql_query_builder_service.py # KQL query generation
β βββ analytics_engine.py # NEW: Statistical analysis, correlation, anomalies
β βββ query_optimizer.py # NEW: Anti-pattern detection, query optimization
βββ tools/ # High-level tools
β βββ table_discovery.py # Table search orchestration
β βββ query_sampler.py # Anti-hallucination sampling
βββ utils/ # Utilities
βββ anti_hallucination.py # Schema validation
βββ kusto_client.py # NEW: Real Azure SDK + Mock clients with resilience
βββ helpers.py # Helper functions
tests/ # Comprehensive test suite
βββ test_fastmcp_server.py # Server integration tests
βββ test_resilience.py # NEW: Circuit breaker, retry, bulkhead tests
βββ test_kusto_client.py # NEW: Real & mock client tests
βββ test_ai_query_builder.py # NEW: AI query building tests
βββ test_query_handle_service.py # Query handle tests
βββ test_schema_cache_service.py # Schema cache tests
π€ Contributing
Contributions are welcome! Please see DEVELOPMENT_GUIDE.md for development practices and OVERNIGHT_ENHANCEMENTS_V3.md for recent feature additions.
π License
MIT License - see LICENSE file for details
π Acknowledgments
- Built with FastMCP by @jlowin
- Powered by Azure SDK for Python
- Inspired by the need for trustworthy AI-powered query building
- Special thanks to the Azure Data Explorer team
π Latest Update: Major production-ready enhancements including real Azure SDK integration, advanced analytics, query optimization, enterprise resilience patterns, persistent caching, and 10 production query templates. See OVERNIGHT_ENHANCEMENTS_V3.md for details.
π§ Support
- π Documentation
- π Issue Tracker
- π¬ Discussions
Legacy Files (Pre-FastMCP Migration)
The following files are from the previous MCP SDK implementation and are kept for reference:
mcp_table_search_server.py- Original MCP SDK serverenhanced_mcp_server.py- Previous enhanced server (now replaced bykusto_server.py)
Use kusto_server.py for all new development.
