Omni-Video Studio MCP
Enables AI-powered video editing in IDEs through a pipeline-driven workflow, including proxy ingestion, motion graphics generation, and high-fidelity rendering.
Ask AI about Omni-Video Studio MCP
Powered by Claude Β· Grounded in docs
I know everything about Omni-Video Studio MCP. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
Omni-Video Studio MCP
The Omni-Video Studio MCP is an enterprise-grade, autonomous Model Context Protocol (MCP) server that empowers any LLM-enabled IDE (Cursor, Claude Code, Antigravity) to act as a professional video editor.
It evolves simple transcription-based editing into a deterministic, token-efficient, and pipeline-driven workflow, featuring agent-native motion graphics (Hyperframes), visual metadata proxies, and high-fidelity final renders.
π Key Features
- Metadata Proxy Ingestion: Instead of streaming expensive video tokens to an LLM, this server pre-processes footage to extract a
takes_packed.md(audio mapping) and a Visual Scene Graph. The agent edits using text proxies, cutting costs and accelerating reasoning. - Hyperframes Engine: Forget complex Node.js dependencies (e.g., Remotion). The agent generates deterministic HTML/CSS motion graphics which are instantly rendered to transparent video using Playwright.
- Advanced Rendering Pipeline: Powered by robust FFmpeg filter graphs, the final output supports EDL (Edit Decision List) cuts, overlay rendering, Subtitle burning, LUT color grading, and optional DeepFilterNet AI audio restoration.
- IDE-Agnostic: Because it adheres to the official MCP specification, it drops directly into Cursor, Antigravity, or Claude Desktop without custom plugins.
π¦ Installation
Prerequisites:
python 3.10+ffmpeg(must be installed on your system path)uv(recommended for dependency management)
# Clone the repository
git clone https://github.com/your-org/omni-video-mcp.git
cd omni-video-mcp
# Install dependencies
uv venv
source .venv/bin/activate
uv pip install -e .
# Install Playwright browsers (for Hyperframes)
playwright install chromium
π Configuration
Add the server to your IDE's MCP settings file (e.g., ~/.gemini/antigravity/mcp_config.json, ~/.cursor/mcp.json, or Claude Desktop config):
{
"mcpServers": {
"omni-video-mcp": {
"command": "uv",
"args": [
"run",
"/path/to/omni-video-mcp/server.py"
],
"env": {
"ELEVENLABS_API_KEY": "your_api_key_here"
}
}
}
}
Note: The ELEVENLABS_API_KEY is currently required for high-fidelity word-level transcription mapping during ingestion.
π¬ How it Works (The Agent Pipeline)
When the agent uses this MCP server, it follows a 4-phase architecture:
- Phase 1: Ingestion (
omni_video_ingest) The agent scans your raw.mp4/.movfiles, extracting a packed markdown transcript and an initial Visual Scene Graph. - Phase 2: Director's Cut (
omni_video_preview) The agent uses the transcript to construct an EDL (Edit Decision List) of the best takes. Ambiguous cuts can be visually verified by generating filmstrip PNGs via the preview tool. - Phase 3: VFX (
omni_video_generate_vfx) The agent generates HTML/CSS motion graphics (lower thirds, b-roll layouts) and the server renders them deterministically into transparent.webmvideos via Hyperframes. - Phase 4: Sweetening & Render (
omni_video_render) The agent passes the EDL, VFX timestamps, and render settings to the server, which builds a complex FFmpeg graph to concatenate the footage, grade it, restore the audio, and export a final master.
π€ Contributing
Contributions are welcome! If you're adding new render pipeline capabilities (like auto-tracking or local whisper fallbacks), please open a PR. Ensure that any added Python dependencies are added to the pyproject.toml using uv add <package>.
π License
MIT License
