Model ID Cheatsheet
Accurate API model IDs, pricing, and specs for 107 models across 19 AI providers.
Stop your AI coding agent from hallucinating outdated model names. This MCP server gives any AI assistant instant access to accurate, up-to-date API model IDs, pricing, and specs for 107 models across 19 providers.
Built in Go. Single 10MB binary. Zero external calls. Sub-millisecond responses. Auto-updated daily.
- model = "gpt-4-turbo" # Hallucinated - doesn't exist anymore
+ model = "gpt-5.4" # Correct - verified against official docs
- model = "claude-3-opus-20240229" # Deprecated
+ model = "claude-opus-4-6" # Current - latest Anthropic flagship
Quick Start
Pick one option below. You'll be up and running in under a minute.
Option A: Claude Code (one command)
claude mcp add --transport sse --scope user model-id-cheatsheet \
https://universal-model-registry-production.up.railway.app/sse
Verify it works:
claude mcp list
# Should show: model-id-cheatsheet ... Connected
Then start a new Claude Code session and ask "What's the latest OpenAI model?". The agent should call the tools automatically.
Option B: Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"model-id-cheatsheet": {
"url": "https://universal-model-registry-production.up.railway.app/sse"
}
}
}
Restart Cursor to pick up the change.
Option C: Windsurf
Add to Settings > MCP Servers (or edit ~/.codeium/windsurf/mcp_config.json):
{
"mcpServers": {
"model-id-cheatsheet": {
"serverUrl": "https://universal-model-registry-production.up.railway.app/sse"
}
}
}
Option D: Codex CLI
Add to ~/.codex/config.toml:
[mcp_servers.model-id-cheatsheet]
command = "uvx"
args = ["mcp-proxy", "--transport", "sse", "https://universal-model-registry-production.up.railway.app/sse"]
Option E: OpenCode
Add to ~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"model-id-cheatsheet": {
"type": "remote",
"url": "https://universal-model-registry-production.up.railway.app/sse"
}
}
}
Option F: Any MCP Client
Connect to the SSE endpoint directly (no API key, no auth):
https://universal-model-registry-production.up.railway.app/sse
Or use the Streamable HTTP transport:
https://universal-model-registry-production.up.railway.app/mcp
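The JSON snippets in Options B, C, and E differ only in their key names. As a rough illustration, the hypothetical helper below (`client_config` is not part of this project) builds each fragment from the same endpoint so the differences are easy to compare. It only reproduces the shapes shown above and assumes nothing else about each client's config format.

```python
import json

# Endpoint and server name as given in the Quick Start options.
SSE_URL = "https://universal-model-registry-production.up.railway.app/sse"
SERVER_NAME = "model-id-cheatsheet"


def client_config(client: str) -> dict:
    """Return the config fragment shown in this README for one client.

    Hypothetical helper for illustration only.
    """
    if client == "cursor":
        # ~/.cursor/mcp.json uses the key "url"
        return {"mcpServers": {SERVER_NAME: {"url": SSE_URL}}}
    if client == "windsurf":
        # ~/.codeium/windsurf/mcp_config.json uses the key "serverUrl"
        return {"mcpServers": {SERVER_NAME: {"serverUrl": SSE_URL}}}
    if client == "opencode":
        # ~/.config/opencode/opencode.json nests servers under "mcp"
        return {
            "$schema": "https://opencode.ai/config.json",
            "mcp": {SERVER_NAME: {"type": "remote", "url": SSE_URL}},
        }
    raise ValueError(f"no JSON config known for client: {client!r}")


print(json.dumps(client_config("windsurf"), indent=2))
```

If you already have other MCP servers configured, merge the fragment into the existing file by hand rather than overwriting it.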
Verify Your Setup
Once connected, try asking your AI assistant any of these:
- "What's the correct model ID for Claude Opus 4.6?"
- "Is gpt-4o still available?"
- "Compare gpt-5.2 vs claude-opus-4-6"
- "What's the cheapest model with vision?"
If the agent calls a tool like get_model_info or check_model_status before answering, it's working.
How It Works
Your AI agent gains 6 tools that it calls automatically before writing any model ID:
| Tool | What It Does | Example Prompt |
|---|---|---|
| `get_model_info(model_id)` | Full specs: API ID, pricing, context window, capabilities | "What's the model ID for Claude Sonnet?" |
| `list_models(provider?, status?, capability?)` | Browse and filter the registry | "Show me all current Google models" |
| `recommend_model(task, budget?)` | Ranked recommendations for a task | "Best model for coding, cheap budget" |
| `check_model_status(model_id)` | Verify if a model is current, legacy, or deprecated | "Is gpt-4o still available?" |
| `compare_models(model_ids)` | Side-by-side comparison table | "Compare gpt-5.2 vs claude-opus-4-6" |
| `search_models(query)` | Free-text search across all fields | "Search for reasoning models" |
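To make "free-text search across all fields" concrete, here is a minimal Python sketch of how such a lookup could behave. The real server is written in Go and its matching rules are not described in this README, so the case-insensitive substring match, the `MODELS` sample, and the field names are all assumptions for illustration.

```python
# Small sample; names and IDs are taken from the Covered Models table.
# The dictionary layout is an assumption, not the server's data model.
MODELS = [
    {"id": "sonar-reasoning-pro", "name": "Sonar Reasoning Pro", "provider": "Perplexity"},
    {"id": "phi-4-reasoning", "name": "Phi-4 Reasoning", "provider": "Microsoft"},
    {"id": "deepseek-chat", "name": "DeepSeek Chat", "provider": "DeepSeek"},
    {"id": "grok-4", "name": "Grok 4", "provider": "xAI"},
]


def search_models(query: str) -> list[str]:
    """Return IDs of models where any field contains the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [
        model["id"]
        for model in MODELS
        if any(needle in str(value).lower() for value in model.values())
    ]


print(search_models("reasoning"))
```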
Resources
| URI | Description |
|---|---|
| `model://registry/all` | Full JSON dump of all 107 models |
| `model://registry/current` | Only current (non-deprecated) models as JSON |
| `model://registry/pricing` | Pricing table sorted cheapest-first (markdown) |
What Happens Under the Hood
- You ask your agent to write code or answer a model question
- The agent automatically calls the appropriate tool (e.g., `get_model_info`)
- The server responds in under a millisecond with verified data (no external API calls)
- The agent writes code with the correct, current model ID
The server instructions tell the agent: "NEVER use a model ID from your training data without verifying it first." This directs the agent to check the registry before it writes a model ID.
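The verify-before-writing flow can be sketched in a few lines of Python. This is a simplified illustration, not the server's code: `ModelEntry`, `REGISTRY`, `verify_model_id`, and the `replacement` field are hypothetical names, and the three entries are a tiny subset whose statuses come from this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    model_id: str
    status: str  # "current", "legacy", or "deprecated"
    replacement: Optional[str] = None  # hypothetical field for illustration


# Illustrative subset of the registry.
REGISTRY = {
    "gpt-5": ModelEntry("gpt-5", "current"),
    "gpt-4o": ModelEntry("gpt-4o", "deprecated", replacement="gpt-5"),
    "claude-opus-4-6": ModelEntry("claude-opus-4-6", "current"),
}


def verify_model_id(requested: str) -> str:
    """Return an ID that is safe to write into code, or raise if none is known."""
    entry = REGISTRY.get(requested)
    if entry is None:
        raise KeyError(f"unknown model ID: {requested!r}")
    if entry.status == "current":
        return entry.model_id
    if entry.replacement is None:
        raise ValueError(f"{requested!r} is {entry.status} and has no known replacement")
    return entry.replacement


print(verify_model_id("gpt-4o"))
```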
Real-World Examples
Writing an API call:
# You: "Call the OpenAI API with their best coding model"
# Agent calls: get_model_info("gpt-5.4")
response = client.chat.completions.create(
model="gpt-5.4", # Verified via model registry
messages=[...]
)
Catching deprecated models:
# You: "Use gpt-4o for this task"
# Agent calls: check_model_status("gpt-4o")
# Agent: "gpt-4o is deprecated. I'll use gpt-5 instead."
response = client.chat.completions.create(
model="gpt-5", # Updated automatically
messages=[...]
)
Finding the cheapest option:
# You: "Use the cheapest model that supports vision"
# Agent calls: list_models(capability="vision", status="current")
response = client.chat.completions.create(
model="gpt-5-nano", # $0.05/$0.40 per 1M tokens
messages=[...]
)
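The selection step behind that answer might look roughly like the sketch below: filter to current models with the requested capability, then take the lowest price. Only the `gpt-5-nano` price comes from this README. The `example-*` entries and their prices are placeholders, and ranking by input price before output price is an assumption about how "cheapest" is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    status: str
    capabilities: frozenset
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


CATALOG = [
    # Price taken from the example above.
    Model("gpt-5-nano", "current", frozenset({"vision"}), 0.05, 0.40),
    # Placeholder entries with made-up prices, for illustration only.
    Model("example-vision-large", "current", frozenset({"vision"}), 2.00, 8.00),
    Model("example-text-only", "current", frozenset(), 0.01, 0.02),
    Model("example-old-vision", "deprecated", frozenset({"vision"}), 0.01, 0.02),
]


def cheapest_with(capability: str, catalog=CATALOG) -> Model:
    """Return the cheapest current model that has the given capability."""
    candidates = [
        m for m in catalog
        if m.status == "current" and capability in m.capabilities
    ]
    if not candidates:
        raise LookupError(f"no current model supports {capability!r}")
    return min(candidates, key=lambda m: (m.input_price, m.output_price))


print(cheapest_with("vision").model_id)
```

Note that the deprecated and text-only placeholders are cheaper on paper but are filtered out before prices are compared.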
Comparing options:
# You: "Should I use Claude or GPT for this?"
# Agent calls: compare_models(["claude-opus-4-6", "gpt-5.2"])
# Agent gets a side-by-side table and makes a recommendation
Resource Footprint
A common concern: "Will this slow down my agent or eat tokens?"
| Metric | Value |
|---|---|
| Binary size | ~10MB |
| Runtime memory | Minimal (static in-memory map, no database) |
| External API calls | Zero (all data is baked in) |
| Response time | Sub-millisecond |
| Token cost per tool call | ~200-500 tokens (small text response) |
| Tool schema overhead | ~500-800 tokens in system prompt |
For comparison, a single web search typically costs more tokens than all 6 tool schemas combined.
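As a back-of-the-envelope check, the ranges in the table can be combined into a per-session estimate. The sketch below assumes the tool schemas are counted once per session and that each tool call adds one response in the stated range; actual accounting depends on the client.

```python
def token_overhead(tool_calls: int) -> tuple[int, int]:
    """Estimate (low, high) extra tokens for a session with the given number of tool calls."""
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    schema_low, schema_high = 500, 800  # tool schema overhead, from the table
    call_low, call_high = 200, 500      # tokens per tool call, from the table
    return (
        schema_low + call_low * tool_calls,
        schema_high + call_high * tool_calls,
    )


low, high = token_overhead(3)
print(f"3 tool calls: roughly {low}-{high} extra tokens")
```

Under these assumptions, a session with three tool calls adds roughly 1,100 to 2,300 tokens.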
Covered Models (107 total)
Current Models (79)
| Provider | Models | API IDs |
|---|---|---|
| OpenAI (15) | GPT-5.4, GPT-5.4 Pro, GPT-5.3 Instant, GPT-5.2, GPT-5.2 Pro, GPT-5.1, GPT-5.1 Codex, GPT-5.1 Mini, GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1 Mini, GPT-4.1 Nano, o3, o4-mini | gpt-5.4, gpt-5.4-pro, gpt-5.3-chat-latest, gpt-5.2, gpt-5.2-pro, gpt-5.1, gpt-5.1-codex, gpt-5.1-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-4.1-mini, gpt-4.1-nano, o3, o4-mini |
| Anthropic (4) | Claude Opus 4.6, Claude Sonnet 4.6, Claude Sonnet 4.5, Claude Haiku 4.5 | claude-opus-4-6, claude-sonnet-4-6, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001 |
| Mistral (11) | Mistral Large 3, Mistral Medium 3, Mistral Small 3.2, Mistral Saba, Ministral 3B, Ministral 8B, Ministral 14B, Magistral Small 1.2, Magistral Medium 1.2, Devstral 2, Devstral Small 2 | mistral-large-2512, mistral-medium-2505, mistral-small-2506, mistral-saba-2502, ministral-3b-2512, ministral-8b-2512, ministral-14b-2512, magistral-small-2509, magistral-medium-2509, devstral-2512, devstral-small-2512 |
| Amazon (6) | Nova Micro, Nova Lite, Nova Pro, Nova Premier, Nova 2 Lite, Nova 2 Pro | amazon-nova-micro, amazon-nova-lite, amazon-nova-pro, amazon-nova-premier, amazon-nova-2-lite, amazon-nova-2-pro |
| Google (5) | Gemini 3.1 Pro, Gemini 3.1 Flash Lite, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash | gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash |
| Cohere (5) | Command A, Command A Reasoning, Command A Vision, Command A Translate, Command R7B | command-a-03-2025, command-a-reasoning-08-2025, command-a-vision-07-2025, command-a-translate-08-2025, command-r7b-12-2024 |
| xAI (4) | Grok 4, Grok 4.1 Fast, Grok 4 Fast, Grok Code Fast 1 | grok-4, grok-4.1-fast, grok-4-fast, grok-code-fast-1 |
| Microsoft (4) | Phi-4, Phi-4 Multimodal, Phi-4 Reasoning, Phi-4 Reasoning Plus | phi-4, phi-4-multimodal-instruct, phi-4-reasoning, phi-4-reasoning-plus |
| Perplexity (4) | Sonar, Sonar Pro, Sonar Reasoning Pro, Sonar Deep Research | sonar, sonar-pro, sonar-reasoning-pro, sonar-deep-research |
| Moonshot (3) | Kimi K2.5, Kimi K2 Thinking, Kimi K2 (0905) | kimi-k2.5, kimi-k2-thinking, kimi-k2-0905-preview |
| Tencent (3) | Hunyuan TurboS, Hunyuan T1, Hunyuan A13B | hunyuan-turbos, hunyuan-t1, hunyuan-a13b |
| Zhipu (3) | GLM-5, GLM-4.7, GLM-4.7 FlashX | glm-5, glm-4.7, glm-4.7-flashx |
| Meta (2) | Llama 4 Maverick, Llama 4 Scout | llama-4-maverick, llama-4-scout |
| DeepSeek (2) | DeepSeek Reasoner, DeepSeek Chat | deepseek-reasoner, deepseek-chat |
| NVIDIA (2) | Nemotron 3 Nano 30B, Nemotron Ultra 253B | nvidia/nemotron-3-nano-30b-a3b, nvidia/llama-3.1-nemotron-ultra-253b-v1 |
| AI21 (2) | Jamba Large 1.7, Jamba Mini 1.7 | jamba-large-1.7, jamba-mini-1.7 |
| MiniMax (2) | MiniMax M2.5, MiniMax M2.5 Lightning | minimax-m2.5, minimax-m2.5-lightning |
| Kuaishou (1) | KAT-Coder Pro | kat-coder-pro |
| Xiaomi (1) | MiMo V2 Flash | mimo-v2-flash |
Legacy & Deprecated Models (30)
Tracked so your agent can detect outdated model IDs and suggest current replacements:
- OpenAI: `gpt-5.3-codex` (deprecated), `gpt-5.2-codex` (deprecated), `gpt-5.1-codex-mini` (deprecated), `o3-pro` (deprecated), `o3-deep-research` (deprecated), `o3-mini` (legacy), `gpt-4.1` (deprecated), `gpt-4o` (deprecated), `gpt-4o-mini` (deprecated)
- Anthropic: `claude-opus-4-5` (legacy), `claude-opus-4-1` (legacy), `claude-opus-4-0` (legacy), `claude-sonnet-4-0` (legacy), `claude-3-7-sonnet-20250219` (deprecated)
- Google: `gemini-3-pro-preview` (deprecated), `gemini-3-pro-image-preview` (deprecated), `gemini-2.5-flash-lite` (deprecated), `gemini-2.0-flash-lite` (deprecated), `gemini-2.0-flash` (deprecated)
- xAI: `grok-4.1` (deprecated), `grok-3` (legacy), `grok-3-mini` (legacy)
- Mistral: `mistral-small-2503` (legacy), `codestral-2508` (legacy)
- MiniMax: `minimax-m2.1` (legacy), `minimax-01` (deprecated)
- Meta: `llama-3.3-70b` (legacy)
- DeepSeek: `deepseek-r1` (legacy), `deepseek-v3` (deprecated)
- Zhipu: `glm-4.6v` (deprecated)
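A list like this can also be used outside an agent, for example to flag outdated IDs in existing code. The following is a minimal sketch under stated assumptions: `find_outdated_ids` is a hypothetical helper, the lookup table is a small subset of the list above, and only IDs that appear as complete quoted string literals are detected.

```python
import re

# Subset of the legacy and deprecated IDs listed above.
OUTDATED = {
    "gpt-4o": "deprecated",
    "gpt-4o-mini": "deprecated",
    "o3-mini": "legacy",
    "claude-3-7-sonnet-20250219": "deprecated",
    "gemini-2.0-flash": "deprecated",
    "grok-3": "legacy",
}

# Matches the contents of single- or double-quoted literals on one line.
STRING_LITERAL = re.compile(r"""["']([^"'\n]+)["']""")


def find_outdated_ids(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, model ID, status) for each outdated ID found in source."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in STRING_LITERAL.finditer(line):
            literal = match.group(1)
            status = OUTDATED.get(literal)
            if status is not None:
                findings.append((lineno, literal, status))
    return findings


sample = 'client.create(model="gpt-4o")\nfallback = "gpt-5"\nsmall = \'o3-mini\'\n'
for lineno, model_id, status in find_outdated_ids(sample):
    print(f"line {lineno}: {model_id} is {status}")
```

Matching whole literals avoids reporting `gpt-4o` when the code actually says `gpt-4o-mini`, at the cost of missing IDs embedded in longer strings.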
Self-Hosting
If you prefer to run the server locally instead of using the hosted endpoint:
Option 1: Build from Source (recommended for local use)
Requires Go 1.23+.
git clone https://github.com/aezizhu/universal-model-registry.git
cd universal-model-registry/go-server
go build -o model-id-cheatsheet ./cmd/server
Then add it to Claude Code as a local stdio server (no network round trips):
claude mcp add --scope user model-id-cheatsheet -- /path/to/model-id-cheatsheet
Or run in SSE mode for other clients:
MCP_TRANSPORT=sse PORT=8000 ./model-id-cheatsheet
# Endpoint: http://localhost:8000/sse
Option 2: Docker
git clone https://github.com/aezizhu/universal-model-registry.git
cd universal-model-registry
docker build -t model-id-cheatsheet .
docker run -p 8000:8000 model-id-cheatsheet
Your SSE endpoint will be at http://localhost:8000/sse.
Option 3: Deploy to Railway
Deploy manually with the Railway CLI:
railway login
railway init
railway up
Staying Up to Date
Model data is automatically checked and updated daily at 7 PM Pacific Time -- no human intervention needed.
How it works:
- Railway cron runs the updater daily, scraping 6 providers' public documentation pages (no API keys needed)
- Models removed from docs --> auto-deprecated via PR (status changed to `"deprecated"` in code)
- New models detected --> GitHub issue created for review
- CI runs on the auto-generated PR --> if tests pass --> auto-merged into main
- Railway auto-deploys from main
No provider API keys required. The updater reads publicly available documentation pages to detect model changes. Only GITHUB_TOKEN and GITHUB_REPO are needed for creating PRs and issues.
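The core of the update decision described above is a comparison between the registry and the IDs currently found in provider docs. The sketch below illustrates that idea only. The real updater is part of the Go codebase, and `plan_updates` and its inputs are hypothetical names; how the updater treats legacy entries is also an assumption here.

```python
def plan_updates(
    registry: dict[str, str], documented_ids: set[str]
) -> tuple[list[str], list[str]]:
    """Compare registry statuses with IDs scraped from provider docs.

    Returns (to_deprecate, to_review):
    - to_deprecate: IDs in the registry, not yet deprecated, missing from the docs
    - to_review: IDs in the docs that the registry does not know about
    """
    to_deprecate = sorted(
        model_id
        for model_id, status in registry.items()
        if status != "deprecated" and model_id not in documented_ids
    )
    to_review = sorted(documented_ids - set(registry))
    return to_deprecate, to_review


# Illustrative inputs only; not a snapshot of real provider docs.
registry = {"gpt-5": "current", "gpt-4.1": "current", "gpt-4o": "deprecated"}
documented = {"gpt-5", "gpt-5.4"}

to_deprecate, to_review = plan_updates(registry, documented)
print("open a deprecation PR for:", to_deprecate)
print("open a review issue for:", to_review)
```

In the pipeline described above, the first list would feed the auto-generated PR and the second would become GitHub issues for review.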
Auto-Update Pipeline Details
Railway Cron (primary) -- The hosted instance uses a Railway cron service that runs the updater daily. See configs/railway-updater.toml for the configuration.
Required env vars (set in Railway dashboard):
- `GITHUB_TOKEN` -- GitHub personal access token with repo scope
- `GITHUB_REPO` -- Repository in `"owner/repo"` format (e.g. `"aezizhu/universal-model-registry"`)
Providers checked (via public docs):
- OpenAI (via GitHub SDK source), Anthropic, Google, Mistral, xAI, DeepSeek
CI/CD Workflows:
- `.github/workflows/ci.yml` -- runs tests on every PR
- `.github/workflows/auto-merge.yml` -- auto-merges bot PRs (labeled `auto-update`) after CI passes
GitHub Actions (alternative) -- A GitHub Actions workflow is also included at .github/workflows/auto-update.yml for users who self-host without Railway. No API keys needed -- only GITHUB_TOKEN (automatically provided by GitHub Actions).
Security
- Rate limiting: 60 requests/minute per IP
- Connection limits: Max 5 SSE connections per IP, 100 total
- Request body limit: 64KB max
- Input sanitization: All string inputs truncated to safe lengths
- HTTP hardening: ReadTimeout 15s, ReadHeaderTimeout 5s, IdleTimeout 120s, 64KB max headers
- Non-root Docker: Containers run as unprivileged user
- Graceful shutdown: Clean connection draining on SIGINT/SIGTERM
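For readers unfamiliar with per-IP rate limiting, here is one common way to enforce a limit such as 60 requests per minute. This README does not say which algorithm the Go server uses, so the sliding-window approach and the `SlidingWindowLimiter` class below are illustrative assumptions rather than the project's implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client IP."""

    def __init__(self, limit: int = 60, window_seconds: float = 60.0, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
results = [limiter.allow("203.0.113.7") for _ in range(61)]
print(results.count(True), "allowed,", results.count(False), "rejected")
```

Each IP gets its own queue of timestamps, so one noisy client does not consume another client's budget.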
Tech Stack
- Language: Go 1.23
- MCP SDK: `github.com/modelcontextprotocol/go-sdk` v1.3.0 (official)
- Transports: stdio, SSE, Streamable HTTP
- Binary size: ~10MB
- Tests: 156 unit tests
- Security: Per-IP rate limiting, connection limits, input sanitization
- Deploy: Docker (alpine), Railway
Contributing
Contributions are welcome! Whether it's adding a new model, fixing data, or improving the server:
- Fork the repo and clone it locally
- Edit model data in `go-server/internal/models/data.go`
- Update test counts in `go-server/internal/models/data_test.go`
- Run the tests: `cd go-server && go test ./... -v`
- Submit a PR -- we'll review it quickly
If you spot an outdated model or incorrect pricing, opening an issue is just as helpful.
License
MIT
