Cutscene Agent
Companion repositories: cutscene_copilot · CutsceneProvider
An AI-powered agent that creates cutscene animations in Unreal Engine by orchestrating characters, dialogue audio, body animations, and cameras through natural language instructions.
Built on the OpenAI Agents SDK with a dual-MCP (Model Context Protocol) architecture.
Architecture
┌──────────────────────────────────────────────────────────────┐
│                        Cutscene Agent                        │
│  ┌───────────────┐  ┌────────────────┐  ┌─────────────────┐  │
│  │ PromptManager │  │ SubAgentRunner │  │ SessionRecorder │  │
│  └───────┬───────┘  └───────┬────────┘  └─────────────────┘  │
│          │                  │                                │
│  ┌───────┴──────────────────┴────────┐                       │
│  │        Main Director Agent        │                       │
│  │ (plans outline, delegates tasks)  │                       │
│  └────────┬────────────────┬─────────┘                       │
│           │                │                                 │
│    ┌──────┴─────┐   ┌──────┴─────┐                           │
│    │   Scene    │   │ Animation  │   ┌──────────────┐        │
│    │ Specialist │   │ Specialist │   │ Photographer │        │
│    └──────┬─────┘   └──────┬─────┘   └───────┬──────┘        │
└───────────┼────────────────┼─────────────────┼───────────────┘
            │                │                 │
   ┌────────┴────────────────┴─────────────────┴────────┐
   │           MCP Tool Layer (dual servers)            │
   │                                                    │
   │  ┌─────────────────────┐   ┌────────────────────┐  │
   │  │  CutsceneProvider   │   │   AIGCAssetTools   │  │
   │  │  (UE Plugin MCP)    │   │    (Stdio MCP)     │  │
   │  │                     │   │                    │  │
   │  │ • add_character     │   │ • tts_function     │  │
   │  │ • add_animation     │   │ • audio_to_face    │  │
   │  │ • add_camera        │   │   _expression      │  │
   │  │ • apply_camera      │   │ • push_file_to_ue  │  │
   │  │   _template         │   │ • video            │  │
   │  │ • query_assets      │   │   _understanding   │  │
   │  │ • import_dynamic    │   │ • get_available    │  │
   │  │   _asset            │   │   _tone            │  │
   │  │ • move_view         │   │                    │  │
   │  │ • take_screenshot   │   │                    │  │
   │  │ • ...               │   │                    │  │
   │  └─────────────────────┘   └────────────────────┘  │
   └─────────────┬─────────────────────────┬────────────┘
                 │                         │
                 ▼                         ▼
        ┌──────────────────┐     ┌───────────────────┐
        │  Unreal Engine   │     │ External Services │
        │  Level Sequence  │     │  (TTS, FaceAnim,  │
        │                  │     │   Vision LLM)     │
        └──────────────────┘     └───────────────────┘
Dual-MCP Design
| MCP Server | Transport | Role |
|---|---|---|
| CutsceneProvider | Streamable HTTP | UE plugin: manages characters, animations, cameras, asset queries, and dynamic imports inside the editor |
| AIGCAssetTools | Stdio | External content generation: TTS audio, facial expressions, file bridging to UE |
The agent connects to both servers simultaneously. Generated assets (e.g. WAV audio) pass from push_file_to_ue to CutsceneProvider's import_dynamic_asset, which brings them into the UE pipeline.
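The file bridge can be pictured as a base64 round trip: the local file is encoded into a JSON-safe string on the AIGCAssetTools side and decoded again before import. The sketch below is illustrative only. The function names and the payload keys (`asset_type`, `file_name`, `data_base64`) are assumptions, not the actual schema; get_import_guide is the place to look up the fields import_dynamic_asset really expects.

```python
import base64
from pathlib import Path


def build_import_payload(file_name: str, raw: bytes, asset_type: str = "audio") -> dict:
    """Wrap raw file bytes in a JSON-safe payload (hypothetical key names)."""
    return {
        "asset_type": asset_type,
        "file_name": file_name,
        "data_base64": base64.b64encode(raw).decode("ascii"),
    }


def build_import_payload_from_path(file_path: str, asset_type: str = "audio") -> dict:
    """Convenience wrapper that reads the file from disk first."""
    path = Path(file_path)
    return build_import_payload(path.name, path.read_bytes(), asset_type)


def decode_payload(payload: dict) -> bytes:
    """Reverse step, as the receiving side might do before importing."""
    return base64.b64decode(payload["data_base64"])
```

Base64 inflates the payload by roughly a third, which is usually acceptable for short dialogue clips.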
Sub-Agent System
The main Director agent can delegate tasks to specialized sub-agents:
| Sub-Agent | Purpose |
|---|---|
| Scene Specialist | Batch character placement and orientation |
| Animation Specialist | Animation clip selection and timeline choreography |
| Photographer | Vision-based viewport composition (screenshots + iterative adjustment) |
| Custom | Ad-hoc sub-agent with user-defined instructions and tool scope |
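One way to think about a sub-agent is as a set of instructions plus a restricted tool scope. The sketch below shows that idea with plain dataclasses. It is not the actual subagent_runner.py implementation: `SubAgentTemplate`, `scope_tools`, `make_custom`, and the tool sets assigned to each specialist are assumptions inferred from the tool names in this README.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class SubAgentTemplate:
    name: str
    instructions: str
    allowed_tools: FrozenSet[str]


# Tool scopes below are illustrative guesses, not the project's real configuration.
TEMPLATES = {
    "scene_specialist": SubAgentTemplate(
        name="Scene Specialist",
        instructions="Place and orient characters in batches.",
        allowed_tools=frozenset({"add_character", "orient_character_to_center"}),
    ),
    "photographer": SubAgentTemplate(
        name="Photographer",
        instructions="Compose the viewport with screenshots and iterative adjustment.",
        allowed_tools=frozenset(
            {"move_view", "take_editor_screenshot", "take_camera_screenshot"}
        ),
    ),
}


def scope_tools(template: SubAgentTemplate, available: Iterable[str]) -> List[str]:
    """Return only the tools this sub-agent may call, preserving input order."""
    return [tool for tool in available if tool in template.allowed_tools]


def make_custom(instructions: str, tools: Iterable[str]) -> SubAgentTemplate:
    """Build an ad-hoc sub-agent from user-defined instructions and tool scope."""
    return SubAgentTemplate("Custom", instructions, frozenset(tools))
```

Restricting each sub-agent to a small tool set keeps its prompt short and reduces the chance of it calling tools outside its task.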
Prerequisites
- Python 3.10+
- [CutsceneProvider](link upcoming): UE plugin providing the editor-side MCP server (default: http://localhost:8100/mcp)
- An OpenAI-compatible LLM API (OpenAI, Anthropic via proxy, local Ollama, etc.)
- (Optional) TTS service, facial expression generation service, and vision LLM, for the full audio/face pipeline
Installation
# Clone the repository
git clone link-to-repo
cd cutscene_agent
# Install dependencies
pip install -r requirements.txt
# Copy and edit environment config
cp .env.example .env
# Edit .env with your LLM API credentials and UE MCP URL
Configuration
All settings are configured via environment variables or a .env file. See .env.example for the full list:
| Variable | Description | Default |
|---|---|---|
| LLM_BASE_URL | OpenAI-compatible API endpoint | http://localhost:11434/v1 |
| LLM_API_KEY | API key for the LLM service | (required) |
| LLM_MODEL_NAME | Model identifier | claude-sonnet-4-20250514 |
| UE_MCP_URL | CutsceneProvider MCP endpoint | http://localhost:8100/mcp |
| AUTO_MANAGED_MSG_ROLE | Role for auto-managed messages (system / developer) | system |
| TOOLCALL_HISTORY_LENGTH | Max tool-call entries before history compression | 999 |
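As a rough illustration of how these variables and defaults fit together, the sketch below reads them from an environment mapping. The `Settings` class and `load_settings` function are hypothetical names, not the project's own loader, and the sketch does not parse a .env file itself; it only applies the defaults listed in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_base_url: str
    llm_api_key: str
    llm_model_name: str
    ue_mcp_url: str
    auto_managed_msg_role: str
    toolcall_history_length: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping, applying the table defaults."""
    env = os.environ if env is None else env

    # LLM_API_KEY has no default, so fail early with a clear message.
    api_key = env.get("LLM_API_KEY", "")
    if not api_key:
        raise ValueError("LLM_API_KEY is required")

    role = env.get("AUTO_MANAGED_MSG_ROLE", "system")
    if role not in ("system", "developer"):
        raise ValueError(
            f"AUTO_MANAGED_MSG_ROLE must be 'system' or 'developer', got {role!r}"
        )

    return Settings(
        llm_base_url=env.get("LLM_BASE_URL", "http://localhost:11434/v1"),
        llm_api_key=api_key,
        llm_model_name=env.get("LLM_MODEL_NAME", "claude-sonnet-4-20250514"),
        ue_mcp_url=env.get("UE_MCP_URL", "http://localhost:8100/mcp"),
        auto_managed_msg_role=role,
        toolcall_history_length=int(env.get("TOOLCALL_HISTORY_LENGTH", "999")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.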
Quick Start
- Start CutsceneProvider in Unreal Engine (the plugin auto-starts its MCP server).
- Run the agent:
python main.py
- Describe your cutscene in the CLI prompt, e.g.:
> Two characters face each other. Character A greets Character B,
> then B responds with a surprised expression. Camera starts wide,
> then cuts to a close-up of B's reaction.
The agent will plan an outline, query available assets, generate audio, and orchestrate the full sequence in UE.
MCP Tool Reference
AIGCAssetTools (Stdio Server)
Tools provided by mcp_servers/aigc_asset_tools.py:
| Tool | Status | Description |
|---|---|---|
| get_available_tone | 🚧 Stub | List available TTS voice tones |
| tts_function | 🚧 Stub | Generate speech audio from text |
| audio_to_face_expression | 🚧 Stub | Generate facial animation from audio |
| video_understanding | 🚧 Stub | Analyze video via vision LLM |
| push_file_to_ue | ✅ Ready | Bridge local files to UE via base64 import |
Stub tools raise NotImplementedError by default. Implement them by connecting to your own services. See Custom Service Integration below.
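While a service is still unimplemented, a caller may prefer a structured result over an unhandled exception. The sketch below shows one way to do that. Both function names and the `"not_implemented"` status value are assumptions made for illustration; the stub here only imitates the behavior described above and is not the code from aigc_asset_tools.py.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def get_available_tone_stub() -> dict:
    """Stand-in that mirrors the default stub behavior."""
    raise NotImplementedError("Connect get_available_tone to your TTS service")


async def call_tool_safely(tool: Callable[..., Awaitable[dict]], *args: Any, **kwargs: Any) -> dict:
    """Run a tool and turn NotImplementedError into a structured result."""
    try:
        return await tool(*args, **kwargs)
    except NotImplementedError as exc:
        # Hypothetical status value; the project may report this differently.
        return {"status": "not_implemented", "message": str(exc)}


result = asyncio.run(call_tool_safely(get_available_tone_stub))
```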
CutsceneProvider (UE Plugin)
Provided by the CutsceneProvider plugin running inside Unreal Engine. Key tools include:
| Category | Tools |
|---|---|
| Characters | add_character, orient_character_to_center |
| Audio | add_character_audio, add_character_facial_animation |
| Animations | add_character_animation |
| Cameras | add_camera, set_active_camera, apply_camera_template |
| Asset Queries | get_available_characters, get_available_animations, get_available_camera_templates, get_queryable_asset_kinds, get_query_instruction, query_assets |
| Dynamic Import | get_importable_asset_types, get_import_guide, import_dynamic_asset |
| Viewport | move_view, take_editor_screenshot, take_camera_screenshot |
| Sequence | get_sequence_content, clear_sequence, save_sequence_as |
See CutsceneProvider documentation for the full tool reference.
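At the protocol level, invoking one of these tools is an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below only builds that message so its shape is visible. The `build_tool_call` helper and the `character_name` argument are hypothetical; the real argument schema comes from the plugin, and in practice an MCP client library handles the initialization handshake and transport.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC requires each request to carry an id.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names are placeholders, not the plugin's documented schema.
request = build_tool_call("add_character", {"character_name": "CharacterA"})
body = json.dumps(request)
```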
Custom Service Integration
The stub tools in AIGCAssetTools are designed to be replaced with your own service integrations.
TTS (Text-to-Speech)
Edit mcp_servers/aigc_asset_tools.py and replace the tts_function_tool implementation:
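@mcp_app.tool(name="tts_function", ...)
async def tts_function_tool(text, gender, tone, emotion) -> dict:
    # Call your TTS API here (your_tts_api is a placeholder for your own client)
    audio_bytes = await your_tts_api.synthesize(text, voice=tone, emotion=emotion)

    # Save to a local file; identifier is a placeholder for a unique name you generate
    output_path = BASE_DIR / "asset" / "generated_audio" / f"{identifier}.wav"
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_bytes(audio_bytes)

    # duration and sample_rate are placeholders: take them from your TTS
    # response or read them back from the saved WAV file
    return {
        "status": "success",
        "file_path": str(output_path),
        "audio_duration": duration,
        "audio_sample_rate": sample_rate,
        "text": text,
        "gender": gender,
    }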
The returned file_path can then be used with push_file_to_ue to import the audio into UE.
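If your TTS service returns WAV bytes but no metadata, the audio_duration and audio_sample_rate fields can be derived with the standard library's `wave` module. This is a minimal sketch under that assumption (uncompressed PCM WAV); `wav_metadata` and `make_silence` are hypothetical helpers, and `make_silence` exists only to produce test input.

```python
import io
import wave


def wav_metadata(audio_bytes: bytes) -> dict:
    """Read duration (seconds) and sample rate from in-memory WAV data."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        sample_rate = wav.getframerate()
        frame_count = wav.getnframes()
    return {
        "audio_duration": frame_count / sample_rate,
        "audio_sample_rate": sample_rate,
    }


def make_silence(seconds: float, sample_rate: int = 16000) -> bytes:
    """Generate mono 16-bit silent WAV data, useful as a test fixture."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


meta = wav_metadata(make_silence(0.5))
```

An accurate duration matters downstream, since audio and facial animation clips are placed on the sequence timeline.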
Facial Expression Generation
Replace audio_to_face_expression_tool similarly. The output should be a .npy file containing blendshape weights, which is then pushed to UE via push_file_to_ue.
Video Understanding
Replace video_understanding_tool with a call to any vision-capable LLM (e.g. GPT-4o, Gemini).
Project Structure
cutscene_agent/
├── main.py                     # CLI entry point
├── .env.example                # Environment variable template
├── requirements.txt            # Python dependencies
│
├── core/
│   ├── cutscene_agent.py       # Main CutsceneAgent orchestrator
│   ├── subagent_runner.py      # Sub-agent template system & execution
│   ├── session_recorder.py     # Session recording (events, conversation snapshots)
│   └── sub_agents/
│       └── photographer.py     # Vision-based viewport composition agent
│
├── prompt/
│   ├── __init__.py             # Exports: PromptManager, ContextBlock, etc.
│   ├── manager.py              # Priority-driven prompt assembly with token budget
│   ├── elements.py             # SystemInstruction, ContextBlock, TextElement
│   ├── base.py                 # PromptElement base classes
│   ├── utils.py                # Token counting utilities
│   └── templates/
│       ├── cutscene.py         # Cutscene creation rules & workflow prompts
│       └── common.py           # Identity, safety, formatting instructions
│
├── mcp_servers/
│   └── aigc_asset_tools.py     # AIGCAssetTools MCP server (TTS, face, bridge)
│
└── doc/                        # Documentation
Citation
If you find this work useful, please cite our paper:
@article{he2026cutscene,
title={Cutscene Agent: An LLM Agent Framework for Automated 3D Cutscene Generation},
author={He, Lanshan and Pang, Haozhou and Gan, Qi and Shen, Xin and Zhang, Ziwei and Liu, Yibo and Fang, Gang and Liu, Bo and Sheng, Kai and Zeng, Shengfeng and Li, Chaofan and Hui, Zhen and Zhou, Keer and Zhou, Lan and Dai, Shujun},
journal={arXiv preprint arXiv:2604.25318},
year={2026}
}
License
See LICENSE for details.
