Gemini Media MCP
Unified MCP server for AI media generation via the Google Gemini API and Vertex AI
MCP server for generating images and videos using Google Gemini and VEO models.
Quick start
uvx gemini-media-mcp setup
The setup wizard walks you through the whole onboarding flow end-to-end:
- Pick a credential mode: Gemini API (images only, easier) or Vertex AI (images + video).
- Enter your API key, or your Google Cloud project plus a service account JSON (file path or inline paste).
- Choose where generated media should be written (defaults to ~/gemini-media).
- Optionally set a VIDEO_GCS_BUCKET for large video output, and auto-populate GCS_ALLOWED_BUCKETS.
- Validate your credentials by constructing a Google GenAI client.
- Print a ready-to-paste Claude Desktop JSON block. On macOS, the wizard can also merge the block directly into ~/Library/Application Support/Claude/claude_desktop_config.json (existing servers are preserved and the prior file is backed up to .bak).
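The merge behavior in the last step (keep existing servers, back up the prior file) can be sketched as follows. This is a minimal illustration, not the wizard's actual code: merge_server_block is a hypothetical helper, and the exact backup file name (the original name plus ".bak") is an assumption.

```python
import json
import shutil
from pathlib import Path


def merge_server_block(config_path: Path, name: str, block: dict) -> dict:
    """Merge one MCP server entry into a Claude Desktop config file.

    Hypothetical helper: existing servers are kept, and the prior file is
    copied to a sibling "<name>.bak" file before anything is written.
    """
    config: dict = {}
    if config_path.exists():
        # Back up the current file before touching it (assumed naming).
        backup = config_path.with_name(config_path.name + ".bak")
        shutil.copy2(config_path, backup)
        config = json.loads(config_path.read_text(encoding="utf-8"))

    # Only the named entry is added or overwritten; other servers stay.
    servers = config.setdefault("mcpServers", {})
    servers[name] = block

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling it with the name "gemini-media" and a block such as {"command": "uvx", "args": ["gemini-media-mcp"]} leaves any other entries under mcpServers untouched.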
For scripted use, all prompts can be supplied via flags:
uvx gemini-media-mcp setup --non-interactive --mode=gemini --api-key=AIzaSy...
If you prefer to configure everything by hand, the manual steps are below.
Setup
Prerequisites
- For video generation (VEO): Google Cloud project with Vertex AI API enabled and a service account with Vertex AI permissions (setup instructions)
- For image generation only: Gemini API key (setup instructions)
Environment Variables
For Vertex AI (required for VEO video generation):
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=your-project-id
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
See Vertex AI Setup below for detailed instructions.
Alternatively, for Gemini API (image generation only):
export GEMINI_API_KEY=your-api-key
See Gemini API Setup below for detailed instructions.
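Before starting the server, it can help to confirm that the variables for the chosen mode are all present. The sketch below is a hypothetical helper, not part of gemini-media-mcp; in particular, treating only the string "true" (case-insensitive) as enabling Vertex AI is an assumption.

```python
import os
from typing import Mapping

# Variables the Vertex AI setup above exports, besides the mode switch.
VERTEX_REQUIRED = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def check_credentials_env(env: Mapping[str, str]) -> tuple[str, list[str]]:
    """Return (mode, missing_variables) for the two setups described above.

    Hypothetical helper. The "true" comparison is an assumption of this
    sketch, not documented server behavior.
    """
    if env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower() == "true":
        missing = [name for name in VERTEX_REQUIRED if not env.get(name)]
        return "vertex", missing
    missing = [] if env.get("GEMINI_API_KEY") else ["GEMINI_API_KEY"]
    return "gemini", missing


if __name__ == "__main__":
    mode, missing = check_credentials_env(os.environ)
    print(f"mode={mode} missing={missing or 'none'}")
```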
Optional security hardening:
# Restrict gs:// fetches and output_gcs_uri to specific buckets.
# If unset and VIDEO_GCS_BUCKET is not set, gs:// fetches log a warning.
export GCS_ALLOWED_BUCKETS=bucket-a,bucket-b
Local file:// and bare-path inputs are always restricted to DATA_FOLDER.
HTTP(S) fetches reject hosts that resolve to private, loopback, link-local,
or metadata IPs, and downloads are capped at 50 MB.
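The host check for HTTP(S) fetches can be illustrated with the standard ipaddress module. This is a sketch of the general technique under stated assumptions, not the server's own code: is_blocked_address and host_is_allowed are hypothetical names, and the extra reserved, multicast, and unspecified checks go beyond what the text above lists.

```python
import ipaddress
import socket


def is_blocked_address(ip: str) -> bool:
    """True for private, loopback, link-local, and similar non-public IPs.

    Cloud metadata endpoints such as 169.254.169.254 fall in the
    link-local range, so the link-local check covers them.
    """
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved      # extra check, an assumption of this sketch
        or addr.is_multicast     # extra check, an assumption of this sketch
        or addr.is_unspecified   # extra check, an assumption of this sketch
    )


def host_is_allowed(host: str, port: int = 443) -> bool:
    """Resolve host and reject it if any resolved address is blocked."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
    addresses = [info[4][0].split("%")[0] for info in infos]
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

A complete fetcher would also need to connect to the same address it validated, so that a second DNS lookup cannot return a different host; that part is outside this sketch.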
Claude Desktop Configuration
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
  "mcpServers": {
    "gemini-media": {
      "command": "uvx",
      "args": ["gemini-media-mcp"],
      "env": {
        "GOOGLE_GENAI_USE_VERTEXAI": "true",
        "GOOGLE_CLOUD_PROJECT": "your-project-id",
        "GOOGLE_CLOUD_LOCATION": "us-central1",
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/service-account.json"
      }
    }
  }
}
Or using Docker (note: DATA_FOLDER must be set to the host path, with matching volume mount):
{
  "mcpServers": {
    "gemini-media": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "GOOGLE_GENAI_USE_VERTEXAI=true",
        "-e", "GOOGLE_CLOUD_PROJECT=your-project-id",
        "-e", "GOOGLE_CLOUD_LOCATION=us-central1",
        "-e", "GOOGLE_APPLICATION_CREDENTIALS=/credentials.json",
        "-e", "DATA_FOLDER=/Users/yourusername/gemini-output",
        "-v", "/path/to/service-account.json:/credentials.json:ro",
        "-v", "/Users/yourusername/gemini-output:/Users/yourusername/gemini-output",
        "cxoagi/gemini-media-mcp"
      ]
    }
  }
}
This writes files to your host path and returns paths like /Users/yourusername/gemini-output/images/abc.png that Claude Desktop can open directly. The DATA_FOLDER directory will contain images/ and videos/ subdirectories.
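Because the same host path has to appear in both DATA_FOLDER and the volume mount, generating the block from a single value avoids mismatches. The following is a sketch only; docker_server_block is a hypothetical helper that reproduces the arguments shown above.

```python
import json


def docker_server_block(
    data_folder: str,
    credentials_path: str,
    project: str,
    location: str = "us-central1",
) -> dict:
    """Build the Docker server entry with DATA_FOLDER mounted at the same
    path inside the container as on the host (hypothetical helper)."""
    return {
        "command": "docker",
        "args": [
            "run", "--rm", "-i",
            "-e", "GOOGLE_GENAI_USE_VERTEXAI=true",
            "-e", f"GOOGLE_CLOUD_PROJECT={project}",
            "-e", f"GOOGLE_CLOUD_LOCATION={location}",
            "-e", "GOOGLE_APPLICATION_CREDENTIALS=/credentials.json",
            "-e", f"DATA_FOLDER={data_folder}",
            "-v", f"{credentials_path}:/credentials.json:ro",
            # Identical path on both sides, so returned paths exist on the host.
            "-v", f"{data_folder}:{data_folder}",
            "cxoagi/gemini-media-mcp",
        ],
    }


if __name__ == "__main__":
    block = docker_server_block(
        "/Users/yourusername/gemini-output",
        "/path/to/service-account.json",
        "your-project-id",
    )
    print(json.dumps({"mcpServers": {"gemini-media": block}}, indent=2))
```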
Available Tools
generate_image
Generate images using Gemini or Imagen models.
Parameters:
- prompt (required): Text description of the image
- model: Pick by use case.
  GA (stable), preferred in production:
  - gemini-2.5-flash-image (Nano Banana): default; fastest, cheapest, great for conversational editing
  - imagen-4.0-fast-generate-001: cheapest photoreal
  - imagen-4.0-generate-001: balanced photoreal
  - imagen-4.0-ultra-generate-001: highest-fidelity photoreal, precise text rendering
  - imagen-3.0-generate-002: legacy, kept for compatibility
  Preview (newest capabilities, may change without notice):
  - gemini-3.1-flash-image-preview (Nano Banana 2): 4K output, up to 14 reference images, fast
  - gemini-3-pro-image-preview (Nano Banana Pro): 4K, reasoning, thought_signature for multi-turn editing
- image_uri: Input image URI for image-to-image generation
- image_base64: Base64 encoded input image
Gemini 3.x Image Parameters (for gemini-3-pro-image-preview and gemini-3.1-flash-image-preview):
- reference_image_uris: List of up to 14 reference image URIs for multi-image composition
  - Up to 6 object images for high-fidelity inclusion
  - Up to 5 human images for character consistency across scenes
- image_size: Output resolution (1K, 2K, 4K); must use an uppercase K
- thinking_level: Reasoning depth (low for fast, high for complex generation)
- media_resolution: Input image processing quality (MEDIA_RESOLUTION_LOW, MEDIA_RESOLUTION_MEDIUM, MEDIA_RESOLUTION_HIGH)
- thought_signature: For multi-turn editing workflows; pass back the signature from previous responses
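The constraints listed above can be checked on the client side before calling generate_image. This is a sketch under assumptions: check_image_args is a hypothetical helper, and whether the server rejects or merely ignores Gemini 3.x parameters sent to other models is not stated here, so the sketch simply flags that case.

```python
from typing import Any

DEFAULT_MODEL = "gemini-2.5-flash-image"
GEMINI_3_IMAGE_MODELS = {
    "gemini-3-pro-image-preview",
    "gemini-3.1-flash-image-preview",
}
GEMINI_3_ONLY_PARAMS = {
    "reference_image_uris",
    "image_size",
    "thinking_level",
    "media_resolution",
    "thought_signature",
}
IMAGE_SIZES = {"1K", "2K", "4K"}
THINKING_LEVELS = {"low", "high"}
MAX_REFERENCE_IMAGES = 14


def check_image_args(args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the args look consistent."""
    problems: list[str] = []
    if not args.get("prompt"):
        problems.append("prompt is required")

    model = args.get("model", DEFAULT_MODEL)
    used = sorted(GEMINI_3_ONLY_PARAMS & args.keys())
    if used and model not in GEMINI_3_IMAGE_MODELS:
        # Assumption: flag rather than silently drop these parameters.
        problems.append(f"{', '.join(used)} only apply to Gemini 3.x image models")

    if "image_size" in args and args["image_size"] not in IMAGE_SIZES:
        problems.append("image_size must be 1K, 2K, or 4K (uppercase K)")
    if len(args.get("reference_image_uris", [])) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    if "thinking_level" in args and args["thinking_level"] not in THINKING_LEVELS:
        problems.append("thinking_level must be low or high")
    return problems
```

For example, passing image_size "4k" with a lowercase k to gemini-3-pro-image-preview is reported as a problem, while "4K" is accepted.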
generate_video
Generate videos using VEO models (requires Vertex AI).
Parameters:
- prompt (required): Text description of the video
- model: Model to use:
  - veo-3.1-generate-001 (default): Highest quality, 4/6/8s duration, audio support
  - veo-3.1-fast-generate-001: Faster generation with audio support
  - veo-3.1-lite-generate-preview: Most cost-effective, 4/6/8s, audio; no video extension or 4K. Currently served via the Gemini API; Vertex AI projects may return 404 until Google publishes the model on Vertex.
- aspect_ratio: 16:9 (default) or 9:16
- duration_seconds: Video duration (4/6/8s)
- include_audio: Enable audio generation
- audio_prompt: Audio description
- negative_prompt: Things to avoid in the video
- seed: Random seed for reproducibility
- image_uri: First frame image URI for image-to-video generation
Additional Parameters:
- last_frame_uri: Last frame image URI for first+last frame control
  - When combined with image_uri, generates smooth transitions between frames
- reference_image_uris: List of up to 3 reference image URIs for subject preservation
  - Preserves the appearance of a person, character, or product in the output video
  - Note: Only supports 8-second duration (automatically enforced)
  - Cannot be used together with first/last frame inputs
- extend_video_uri: URI of existing VEO-generated video to extend
  - Extends the final second of the video and continues the action
  - Can be chained multiple times for longer videos (up to ~148s total)
  - Note: Cannot be used together with other image inputs
Generation Modes (automatically selected based on inputs):
- text_to_video: Text-only prompt
- image_to_video: First frame image input
- first_last_frame: First and last frame control
- reference_to_video: Reference images for subject preservation (8s only)
- extend_video: Extend existing video
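The mode selection rules above can be written as a small decision function. This is a sketch of how the selection might work, not the server's actual logic: select_generation_mode and effective_duration are hypothetical names, and the behavior when last_frame_uri is given without image_uri is not specified here, so the sketch does not validate that case.

```python
from typing import Optional, Sequence

MAX_VIDEO_REFERENCE_IMAGES = 3


def select_generation_mode(
    image_uri: Optional[str] = None,
    last_frame_uri: Optional[str] = None,
    reference_image_uris: Optional[Sequence[str]] = None,
    extend_video_uri: Optional[str] = None,
) -> str:
    """Pick a generation mode from the inputs, enforcing the documented
    exclusions between input types (hypothetical helper)."""
    refs = list(reference_image_uris or [])

    if extend_video_uri:
        if image_uri or last_frame_uri or refs:
            raise ValueError("extend_video_uri cannot be combined with image inputs")
        return "extend_video"

    if refs:
        if image_uri or last_frame_uri:
            raise ValueError(
                "reference_image_uris cannot be combined with first/last frame inputs"
            )
        if len(refs) > MAX_VIDEO_REFERENCE_IMAGES:
            raise ValueError("at most 3 reference images are allowed")
        return "reference_to_video"

    if last_frame_uri:
        # Not validated: whether image_uri must also be set is unspecified.
        return "first_last_frame"
    if image_uri:
        return "image_to_video"
    return "text_to_video"


def effective_duration(mode: str, duration_seconds: int) -> int:
    """Reference mode only supports 8 seconds; other modes keep the request."""
    return 8 if mode == "reference_to_video" else duration_seconds
```

Checking the exclusive inputs first keeps each later branch simple: by the time the function reaches the frame checks, it already knows no reference images or extension video are present.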
Google Vertex AI and Gemini Access
Vertex AI Setup (Required for VEO Video Generation)
Step 1: Create a Google Cloud Project
- Go to the Google Cloud Console
- Click the project dropdown at the top of the page
- Click "New Project"
- Enter a project name and click "Create"
- Note your Project ID (you'll need this later)
Step 2: Enable Vertex AI API
- In the Cloud Console, go to "APIs & Services" > "Library" (or visit API Library)
- Search for "Vertex AI API"
- Click on "Vertex AI API" in the results
- Click the "Enable" button
- Wait for the API to be enabled (this may take a minute)
Step 3: Create a Service Account
- Go to "IAM & Admin" > "Service Accounts" (or visit Service Accounts)
- Click "Create Service Account" at the top
- Enter a name (e.g., "gemini-media-mcp") and description
- Click "Create and Continue"
- In the "Grant this service account access to project" section:
  - Click the "Select a role" dropdown
  - Search for "Vertex AI User"
  - Select "Vertex AI User" role
- Click "Continue"
- Click "Done" (you can skip the optional "Grant users access" section)
Step 4: Download Service Account Key
- In the Service Accounts list, find the account you just created
- Click the three dots (⋮) in the "Actions" column
- Select "Manage keys"
- Click "Add Key" > "Create new key"
- Select "JSON" as the key type
- Click "Create"
- The JSON key file will automatically download to your computer
- Important: Move this file to a secure location and note the path (e.g., ~/credentials/gemini-media-service-account.json)
- Security Note: Never commit this file to version control or share it publicly
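A quick sanity check on the downloaded key can catch a wrong path or a truncated file before the server is configured. The sketch below is a hypothetical helper, not part of gemini-media-mcp; it only checks a few fields that Google service account key files contain and does not verify the key with Google.

```python
import json
from pathlib import Path

# A subset of the fields present in a service account key file.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str) -> list[str]:
    """Return problems found in a key file; an empty list means none."""
    key_path = Path(path).expanduser()
    if not key_path.is_file():
        return [f"{key_path} is not a file"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in data]
    if data.get("type") not in (None, "service_account"):
        problems.append('"type" is not "service_account"')
    return problems
```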
Step 5: Update Configuration
Use the following values in your configuration:
- GOOGLE_CLOUD_PROJECT: Your Project ID from Step 1
- GOOGLE_CLOUD_LOCATION: us-central1 (or your preferred region)
- GOOGLE_APPLICATION_CREDENTIALS: Full path to the JSON key file from Step 4
Gemini API Setup (Image Generation Only)
For simpler image generation without video capabilities:
- Visit Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy your key (starts with AIzaSy...)
- Set the environment variable:
export GEMINI_API_KEY=your-api-key
Note: The Gemini API does not support VEO video generation. For video capabilities, you must use Vertex AI.
Contributing
Development Setup
uv sync
Running Tests
uv run pytest
Code Quality
# Type checking
uv run basedpyright src/ tests/
# Linting and formatting
uv run ruff check src/ tests/
uv run ruff format src/ tests/
# Pre-commit hooks
uv run prek
Building Docker Image
docker build -t gemini-media-mcp .
# With specific version
docker build --build-arg VERSION=1.0.0 -t gemini-media-mcp:1.0.0 .
License
MIT
