Rostaing MCP
A universal Model Content Protocol (MCP) for Pandas and Polars DataFrames, designed to be used as a tool by Large Language Models (LLMs).
Rostaing's Model Content Protocol (rostaing-mcp)
A universal Model Content Protocol (MCP) for Pandas and Polars DataFrames, designed to be used as a powerful tool by Large Language Models (LLMs) like GPT-4, Claude, and Mistral.
This package allows LLMs to interact securely with data, performing advanced tasks ranging from simple filtering to statistical hypothesis testing and data visualization (returning viewable images).
Key Features
- Library Agnostic: Seamlessly handles pandas and polars DataFrames.
- Visual Intelligence: Generates Plotly charts returned as Base64 PNG images, allowing LLMs to "see" and generate charts.
- Statistical Suite: Built-in support for T-tests, ANOVA, Chi-Squared, Normality tests (Shapiro), and Survival Analysis (Logrank).
- Smart NLU (Fuzzy Matching): Automatically corrects typos in column names (e.g., understands that "salary" refers to "Salary_USD").
- Stateful Analysis: Filters and modifications persist within the agent's session context.
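The fuzzy column matching mentioned in the feature list can be pictured with a small standard-library sketch. This is not rostaing-mcp's implementation, which is not shown in this document; resolve_column and its cutoff parameter are hypothetical names chosen for illustration, and difflib stands in for whatever matcher the package actually uses.

```python
import difflib
from typing import Optional, Sequence


def resolve_column(requested: str, columns: Sequence[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the real column name closest to `requested`, or None if nothing is close enough.

    Hypothetical helper for illustration only; not part of rostaing-mcp.
    """
    # Compare case-insensitively, but remember the original spelling of each column.
    lowered = {}
    for col in columns:
        lowered.setdefault(col.lower(), col)
    matches = difflib.get_close_matches(requested.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


columns = ["Employee_Name", "Salary_USD", "Department"]
print(resolve_column("salary", columns))     # loose name for an existing column
print(resolve_column("deparment", columns))  # typo in the column name
print(resolve_column("zzz", columns))        # nothing similar enough
```

A similarity cutoff is a trade-off: a lower value forgives more typos but raises the chance of silently picking the wrong column, so a real tool would likely report which column it resolved to.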
Installation
pip install rostaing-mcp
Note: This will also install necessary dependencies like plotly, kaleido (for image generation), pingouin, and scipy.
Quick Start
import pandas as pd
from rostaing_mcp import DataFrameAgent, DataFrameToolHandler
# 1. Create a sample DataFrame
data = {
    'employee_name': ['Rostaing', 'Lucrèce', 'Isnard', 'Charline', 'Dacier', 'Nora'],
    'salary': [100000, 70000, 85000, 60000, 95000, 62000],
    'experience_level': ['Expert', 'Senior', 'Manager', 'Junior', 'Principal', 'Mid-Level'],
    'department': ['AI', 'Sales', 'AI', 'Sales', 'AI', 'Sales']
}
df = pd.DataFrame(data)
# 2. Initialize the core agent (Works with Polars too!)
data_agent = DataFrameAgent(df, source_description="Employee salary data")
# 3. Wrap it in the tool handler
df_tool = DataFrameToolHandler(data_agent)
# --- EXAMPLES OF DIRECT USAGE ---
# A. Inspect Data
print(df_tool.get_schema())
# B. Statistical Test (e.g., T-test between groups)
# Note: Handles fuzzy matching if you type 'Department' instead of 'department'
print(df_tool.perform_t_test(a='salary', group='department'))
# C. Visualization (Returns Base64 Image string)
# The LLM can call this to generate a chart
image_data = df_tool.plot_bar_chart(x='employee_name', y='salary', color='department')
print("Chart generated successfully (Base64 data ready).")
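Since the plotting tools return a Base64 string rather than a file, a caller who wants to view the chart has to decode it first. The sketch below shows one way to do that with the standard library. save_base64_png is a hypothetical helper, not part of rostaing-mcp, and it assumes the string is either raw Base64 or a data:image/png;base64,... URI; the exact return format of plot_bar_chart is not specified here.

```python
import base64
import tempfile
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # every PNG file starts with these 8 bytes


def save_base64_png(image_data: str, path: Path) -> Path:
    """Decode a Base64 string and write the resulting PNG bytes to `path`.

    Hypothetical helper; assumes raw Base64 or a data-URI wrapper around it.
    """
    # Drop a data-URI prefix if one is present.
    if image_data.startswith("data:") and "," in image_data:
        image_data = image_data.split(",", 1)[1]
    raw = base64.b64decode(image_data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    path.write_bytes(raw)
    return path


# Stand-in payload (PNG signature plus filler bytes) in place of a real chart string.
sample = base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
saved = save_base64_png(sample, Path(tempfile.gettempdir()) / "chart.png")
print(saved, saved.stat().st_size)
```

In the Quick Start, the same helper could be called with image_data in place of sample, provided the returned string matches one of the two assumed formats.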
Integration with LLM Agents (e.g., Upsonic, LangChain)
To let an LLM use all available tools, you can pass the methods dynamically or wrap them in a proxy class.
from upsonic import Agent, Task
# Get the list of all callable tools for the LLM
tools_list = df_tool.get_all_tools()
task = Task(
    description="Analyze the salary distribution and plot a bar chart by department.",
    tools=tools_list
)
agent = Agent(model="openai/gpt-4o", name="Data Analyst")
result = agent.do(task)
print(result)
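For frameworks that expect a plain list of callables, "passing the methods dynamically" can be sketched with the standard inspect module. DemoToolHandler and collect_tools below are hypothetical stand-ins written for this example; they are not the rostaing-mcp classes, and get_all_tools() may gather its tools differently.

```python
import inspect


class DemoToolHandler:
    """Hypothetical stand-in for a tool handler; not the rostaing-mcp class."""

    def get_schema(self) -> str:
        """Describe the columns of the data."""
        return "employee_name, salary, experience_level, department"

    def get_summary_statistics(self) -> str:
        """Summarize the numeric columns."""
        return "summary"

    def _internal_helper(self) -> None:
        """Private helper that should not be exposed to the LLM."""


def collect_tools(handler: object) -> list:
    """Return the handler's public bound methods, sorted by name."""
    return [
        method
        for name, method in inspect.getmembers(handler, predicate=inspect.ismethod)
        if not name.startswith("_")
    ]


tools = collect_tools(DemoToolHandler())
for tool in tools:
    # Many agent frameworks use the method name and docstring as the tool description.
    print(tool.__name__, "-", inspect.getdoc(tool))
```

Filtering on the leading underscore keeps internal helpers out of the list handed to the model, which matters because every exposed callable is something the LLM may decide to invoke.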
Available Tools
📊 Visualization
- plot_histogram, plot_bar_chart, plot_line_chart
- plot_scatter_plot, plot_box_plot, plot_violin_plot
- plot_heatmap, plot_pie_chart, plot_3d_scatter, and more.
🧮 Statistics
- get_summary_statistics, get_correlation_matrix
- perform_normality_test (Shapiro-Wilk)
- perform_t_test (Student's t-test)
- perform_anova (One-way ANOVA)
- perform_chi2_test (Independence)
- perform_logrank_test (Survival analysis)
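To make the Student's t-test entry concrete, the sketch below computes the pooled-variance t statistic by hand for the Quick Start salaries, grouped by department. It uses only the standard library and is an illustration of the statistic itself, not of how perform_t_test is implemented; student_t is a hypothetical helper name, and the package may use a different variant of the test.

```python
import math
import statistics

# Salaries from the Quick Start DataFrame, split by department.
ai = [100000, 85000, 95000]
sales = [70000, 60000, 62000]


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its own degrees of freedom.
    pooled = ((n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)) / dof
    std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, dof


t_stat, dof = student_t(ai, sales)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; that is where a dependency such as scipy comes in.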
🛠 Manipulation
- filter_rows (Complex conditions supported)
- sort_values
- select_columns
