Vox MCP
Multi-model AI gateway for MCP clients.
Why
MCP clients like Claude Code, Claude Desktop, and Cursor are locked to their host model. Vox gives them access to every other model (Gemini, GPT, Grok, DeepSeek, Kimi, or your local Ollama) through a single chat tool.
The design is deliberately minimal: prompts go to providers unmodified, responses come back unmodified. No system prompt injection. No response formatting. No behavioral directives. The only value Vox adds is routing and conversation memory; everything else is pure passthrough.
What it does
Send a prompt, optionally attach files or images, pick a model (or let the agent pick), and get back the model's raw response. Conversation threads persist in memory via continuation_id for multi-turn exchanges across any provider: start a thread with Gemini, continue it with GPT. Threads are shadow-persisted to disk as JSONL for durability and can be exported as Markdown.
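The shadow-persistence idea above can be sketched in a few lines. This is an illustrative model, not Vox's actual internals: the turn fields (`role`, `model`, `text`), the one-file-per-thread layout, and the Markdown format are all assumptions.

```python
import json
import tempfile
import uuid
from pathlib import Path

def append_turn(store_dir, continuation_id, turn):
    """Shadow-persist one turn by appending a JSON line to a per-thread file."""
    path = Path(store_dir) / f"{continuation_id}.jsonl"
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(turn) + "\n")

def export_markdown(store_dir, continuation_id):
    """Render a thread's JSONL log as a Markdown transcript."""
    path = Path(store_dir) / f"{continuation_id}.jsonl"
    lines = [f"# Thread {continuation_id}", ""]
    for raw in path.read_text(encoding="utf-8").splitlines():
        turn = json.loads(raw)
        lines.append(f"**{turn['role']}** ({turn['model']}): {turn['text']}")
    return "\n".join(lines)

# A thread can hop providers: start on Gemini, continue on GPT.
store = tempfile.mkdtemp()
cid = str(uuid.uuid4())
append_turn(store, cid, {"role": "user", "model": "gemini-2.5-pro", "text": "Outline the plan."})
append_turn(store, cid, {"role": "assistant", "model": "gpt-5", "text": "Step 1 ..."})
transcript = export_markdown(store, cid)
```

Because each turn is one self-describing JSON line, a crash mid-conversation loses at most the turn being written, and export is a single linear pass.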
3 tools:
| Tool | Description |
|---|---|
| chat | Send prompts to any configured AI model with optional file/image context |
| listmodels | Show available models, aliases, and capabilities |
| dump_threads | Export conversation threads as JSON or Markdown |
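A pair of chat calls using continuation_id might look like the following. The argument names here (`prompt`, `files`, `model`, `continuation_id`) are illustrative assumptions; check the tool's actual schema via your MCP client.

```python
# First turn: new thread, agent-picked model ("auto"), with file context.
first_call = {
    "prompt": "Review this module for race conditions.",
    "files": ["/absolute/path/to/worker.py"],  # hypothetical file path
    "model": "auto",
}

# Follow-up turn: reuse the continuation_id returned with the first response,
# switching providers mid-thread (e.g. Gemini on turn one, GPT on turn two).
followup_call = {
    "prompt": "Now suggest a fix for the issue you found.",
    "model": "gpt-5",
    "continuation_id": "thread-id-from-first-response",
}
```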
8 providers:
| Provider | Env Variable | Example Models |
|---|---|---|
| Google Gemini | GEMINI_API_KEY | gemini-2.5-pro |
| OpenAI | OPENAI_API_KEY | gpt-5.1, gpt-5, o3, o4-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-4-opus, claude-4-sonnet |
| xAI | XAI_API_KEY | grok-3, grok-3-fast |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-chat, deepseek-reasoner |
| Moonshot (Kimi) | MOONSHOT_API_KEY | kimi-k2-thinking-turbo, kimi-k2.5 |
| OpenRouter | OPENROUTER_API_KEY | Any OpenRouter model |
| Custom | CUSTOM_API_URL | Ollama, vLLM, LM Studio, etc. |
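The env variables in the table are all a gateway needs to decide which providers are live. A minimal sketch of that detection, with the selection logic itself being an assumption (Vox's real startup code may differ):

```python
import os

# One env var per provider, taken from the table above.
PROVIDER_KEYS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "xai": "XAI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "moonshot": "MOONSHOT_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "custom": "CUSTOM_API_URL",  # local endpoints: Ollama, vLLM, LM Studio, etc.
}

def active_providers(env=None):
    """Providers whose key (or URL, for custom) is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

os.environ.setdefault("GEMINI_API_KEY", "demo-key")  # simulate one configured key
providers = active_providers()
```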
Quick start
git clone https://github.com/linxule/vox-mcp.git
cd vox-mcp
cp .env.example .env
# Edit .env: add at least one API key
uv sync
uv run python server.py
MCP client configuration
Vox runs as a stdio MCP server. Each client needs to know how to launch it.
Replace /path/to/vox-mcp with the absolute path to your cloned repo.
Claude Code (CLI)
claude mcp add vox-mcp \
-e GEMINI_API_KEY=your-key-here \
-- uv run --directory /path/to/vox-mcp python server.py
Or add to .mcp.json in your project root:
{
"mcpServers": {
"vox-mcp": {
"command": "uv",
"args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Claude Desktop
Add to claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"vox-mcp": {
"command": "uv",
"args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Cursor
Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):
{
"mcpServers": {
"vox-mcp": {
"command": "uv",
"args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"vox-mcp": {
"command": "uv",
"args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Any MCP client
The canonical stdio configuration:
{
"mcpServers": {
"vox-mcp": {
"command": "uv",
"args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Tips:
- Paths must be absolute
- You only need one API key to start; add more providers later via `.env`
- The `.env` file in the vox-mcp directory is loaded automatically, so API keys can go there instead of in the client config
- Use `VOX_FORCE_ENV_OVERRIDE=true` in `.env` if client-passed env vars conflict with your `.env` values
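The override behavior in the last tip amounts to a precedence rule between the two env sources. The sketch below shows the assumed semantics (client-passed vars win by default; the flag flips precedence to `.env`); Vox's exact merge logic may differ.

```python
def merge_env(dotenv_vars, client_vars, force_override=False):
    """Merge env sources: later dict wins on key collisions.

    Default: client-config vars override .env. With
    VOX_FORCE_ENV_OVERRIDE=true, .env values win instead.
    (Assumed semantics, for illustration.)
    """
    if force_override:
        return {**client_vars, **dotenv_vars}
    return {**dotenv_vars, **client_vars}

dotenv = {"GEMINI_API_KEY": "key-from-dotenv"}
client = {"GEMINI_API_KEY": "key-from-client"}
default = merge_env(dotenv, client)          # client config wins
forced = merge_env(dotenv, client, True)     # .env wins
```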
Configuration
Copy .env.example to .env and configure:
- API keys: at least one provider key is required
- `DEFAULT_MODEL`: `auto` (default, agent picks) or a specific model name
- Model restrictions: `GOOGLE_ALLOWED_MODELS`, `OPENAI_ALLOWED_MODELS`, etc.
- `CONVERSATION_TIMEOUT_HOURS`: thread TTL (default: 24h)
- `MAX_CONVERSATION_TURNS`: thread length limit (default: 100)
See .env.example for the full reference.
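A model-restriction variable like `OPENAI_ALLOWED_MODELS` is most naturally a comma-separated allowlist. The parsing below is an assumption for illustration; see `.env.example` for the authoritative format.

```python
import os

def allowed_models(env_var, available, env=None):
    """Filter a provider's model list by a comma-separated allowlist.

    Unset or empty means no restriction. (Parsing details are an
    assumption, not Vox's exact implementation.)
    """
    env = os.environ if env is None else env
    raw = env.get(env_var, "").strip()
    if not raw:
        return list(available)
    allow = {m.strip() for m in raw.split(",") if m.strip()}
    return [m for m in available if m in allow]

models = allowed_models(
    "OPENAI_ALLOWED_MODELS",
    ["gpt-5.1", "gpt-5", "o3", "o4-mini"],
    env={"OPENAI_ALLOWED_MODELS": "gpt-5, o4-mini"},
)
```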
Development
uv sync
uv run python -c "import server" # smoke test
uv run pytest # run tests
See CONTRIBUTING.md for code style, project structure, and how to add providers.
License
Apache 2.0; see LICENSE and NOTICE.
Derived from pal-mcp-server by Beehive Innovations.
