Gemini Media MCP

MCP server for generating images and videos using Google Gemini and VEO models.

Quick start

uvx gemini-media-mcp setup

The setup wizard walks you through onboarding:

Pick a credential mode: Gemini API (images only, easier) or Vertex AI (images + video).
Enter your API key, or your Google Cloud project plus a service account JSON (file path or inline paste).
Choose where generated media should be written (defaults to ~/gemini-media).
Optionally set VIDEO_GCS_BUCKET for large video output; the wizard also auto-populates GCS_ALLOWED_BUCKETS.
Validate your credentials by constructing a Google GenAI client.
Print a ready-to-paste Claude Desktop JSON block. On macOS, the wizard can also merge the block directly into ~/Library/Application Support/Claude/claude_desktop_config.json (existing servers are preserved and the prior file is backed up to .bak).
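
The merge step in the last item can be sketched as follows. This is a minimal illustration, not the wizard's actual code: the function name merge_server_block and the exact backup file name (the original name with .bak appended) are assumptions.

```python
import json
import shutil
from pathlib import Path


def merge_server_block(config_path: Path, name: str, block: dict) -> dict:
    """Add or replace one entry under "mcpServers", keeping every other entry."""
    config = {}
    if config_path.exists():
        # Copy the prior file next to itself before changing anything.
        # Assumption: the backup is named <original file name>.bak.
        backup = config_path.with_name(config_path.name + ".bak")
        shutil.copy2(config_path, backup)
        config = json.loads(config_path.read_text(encoding="utf-8"))

    # Existing servers stay untouched; only the named entry is written.
    config.setdefault("mcpServers", {})[name] = block

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling merge_server_block(path, "gemini-media", {"command": "uvx", "args": ["gemini-media-mcp"]}) on a config that already lists other servers leaves those entries in place.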

For scripted use, all prompts can be supplied via flags:

uvx gemini-media-mcp setup --non-interactive --mode=gemini --api-key=AIzaSy...

If you prefer to configure everything by hand, the manual steps are below.

Setup

Prerequisites

For video generation (VEO): a Google Cloud project with the Vertex AI API enabled, and a service account with Vertex AI permissions (see Vertex AI Setup below)
For image generation only: a Gemini API key (see Gemini API Setup below)

Environment Variables

For Vertex AI (required for VEO video generation):

export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=your-project-id
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

→ See Vertex AI Setup for detailed instructions

Alternatively, for Gemini API (image generation only):

export GEMINI_API_KEY=your-api-key

→ See Gemini API Setup for detailed instructions
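
To show how the two sets of variables relate, here is a small sketch that turns them into keyword arguments for genai.Client from the google-genai package. The helper name genai_client_kwargs, the precedence (Vertex AI first), and the us-central1 fallback are assumptions for illustration; the server may resolve its credentials differently.

```python
import os
from typing import Mapping


def genai_client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Map the environment variables above to genai.Client keyword arguments."""
    if env.get("GOOGLE_GENAI_USE_VERTEXAI", "").lower() == "true":
        # Vertex AI mode: project and location identify the backend.
        # GOOGLE_APPLICATION_CREDENTIALS is read by Google's auth libraries,
        # so it is not passed explicitly here.
        return {
            "vertexai": True,
            "project": env["GOOGLE_CLOUD_PROJECT"],
            # Assumption: fall back to us-central1 when no location is set.
            "location": env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        }
    if env.get("GEMINI_API_KEY"):
        # Gemini API mode: a single API key.
        return {"api_key": env["GEMINI_API_KEY"]}
    raise RuntimeError("Set the Vertex AI variables or GEMINI_API_KEY")


# Usage (requires the google-genai package):
#   from google import genai
#   client = genai.Client(**genai_client_kwargs())
```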

Optional security hardening:

# Restrict gs:// fetches and output_gcs_uri to specific buckets.
# If unset and VIDEO_GCS_BUCKET is not set, gs:// fetches log a warning.
export GCS_ALLOWED_BUCKETS=bucket-a,bucket-b
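
The allow-list behaviour described in the comments above could look roughly like this. It is a sketch under stated assumptions: the function name gcs_uri_allowed is hypothetical, VIDEO_GCS_BUCKET is assumed to count as an allowed bucket, and the unrestricted case is assumed to proceed after logging.

```python
import logging
from typing import Mapping
from urllib.parse import urlparse

log = logging.getLogger("gcs_allowlist_sketch")


def gcs_uri_allowed(uri: str, env: Mapping[str, str]) -> bool:
    """Return True if a gs:// URI points at a permitted bucket."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")

    # GCS_ALLOWED_BUCKETS is a comma-separated list of bucket names.
    raw = env.get("GCS_ALLOWED_BUCKETS", "")
    allowed = {name.strip() for name in raw.split(",") if name.strip()}

    # Assumption: the video output bucket is implicitly allowed.
    if env.get("VIDEO_GCS_BUCKET"):
        allowed.add(env["VIDEO_GCS_BUCKET"])

    if not allowed:
        # Neither variable is set: warn, then allow (assumed behaviour).
        log.warning("GCS_ALLOWED_BUCKETS is unset; gs:// access is unrestricted")
        return True
    return parsed.netloc in allowed
```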

Local file:// and bare-path inputs are always restricted to DATA_FOLDER. HTTP(S) fetches reject hosts that resolve to private, loopback, link-local, or metadata IPs, and downloads are capped at 50 MB.
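
The two checks in the paragraph above can be illustrated with the standard library. The names is_inside and is_blocked_ip are hypothetical, and reading "50 MB" as 50 MiB is an assumption; a real fetcher would apply the address check to every IP a hostname resolves to.

```python
import ipaddress
from pathlib import Path

# Assumption: "50 MB" is interpreted as 50 MiB.
MAX_DOWNLOAD_BYTES = 50 * 1024 * 1024


def is_inside(data_folder: str, candidate: str) -> bool:
    """True if candidate resolves to a location within data_folder."""
    root = Path(data_folder).resolve()
    # Relative paths are joined to the root; absolute paths replace it.
    # resolve() collapses ".." segments and follows symlinks.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def is_blocked_ip(address: str) -> bool:
    """True for private, loopback, and link-local addresses.

    The common cloud metadata address 169.254.169.254 is link-local,
    so it is covered by the same test.
    """
    ip = ipaddress.ip_address(address)
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

For example, is_inside("/data", "../etc/passwd") is False, while is_inside("/data", "images/a.png") is True.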

Claude Desktop Configuration

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "gemini-media": {
      "command": "uvx",
      "args": ["gemini-media-mcp"],
      "env": {
        "GOOGLE_GENAI_USE_VERTEXAI": "true",
        "GOOGLE_CLOUD_PROJECT": "your-project-id",
        "GOOGLE_CLOUD_LOCATION": "us-central1",
        "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/service-account.json"
      }
    }
  }
}

Or using Docker. Set DATA_FOLDER to the host path and mount that path at the same location inside the container, so the paths the server returns also exist on the host:

{
  "mcpServers": {
    "gemini-media": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "GOOGLE_GENAI_USE_VERTEXAI=true",
        "-e", "GOOGLE_CLOUD_PROJECT=your-project-id",
        "-e", "GOOGLE_CLOUD_LOCATION=us-central1",
        "-e", "GOOGLE_APPLICATION_CREDENTIALS=/credentials.json",
        "-e", "DATA_FOLDER=/Users/yourusername/gemini-output",
        "-v", "/path/to/service-account.json:/credentials.json:ro",
        "-v", "/Users/yourusername/gemini-output:/Users/yourusername/gemini-output",
        "cxoagi/gemini-media-mcp"
      ]
    }
  }
}

This writes files to your host path and returns paths like /Users/yourusername/gemini-output/images/abc.png that Claude Desktop can open directly. The DATA_FOLDER directory will contain images/ and videos/ subdirectories.
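
Because DATA_FOLDER and the volume mount must agree, it can help to generate the Docker block from a single value. The sketch below rebuilds the configuration shown above; the function name docker_server_block and its parameters are hypothetical.

```python
import json


def docker_server_block(
    host_data_folder: str,
    host_credentials: str,
    project: str,
    location: str = "us-central1",
    image: str = "cxoagi/gemini-media-mcp",
) -> dict:
    """Build the Docker entry for "mcpServers" from one data folder path."""
    container_credentials = "/credentials.json"
    return {
        "command": "docker",
        "args": [
            "run", "--rm", "-i",
            "-e", "GOOGLE_GENAI_USE_VERTEXAI=true",
            "-e", f"GOOGLE_CLOUD_PROJECT={project}",
            "-e", f"GOOGLE_CLOUD_LOCATION={location}",
            "-e", f"GOOGLE_APPLICATION_CREDENTIALS={container_credentials}",
            # The same path is used for DATA_FOLDER and for both sides of
            # the mount, so paths returned by the server exist on the host.
            "-e", f"DATA_FOLDER={host_data_folder}",
            "-v", f"{host_credentials}:{container_credentials}:ro",
            "-v", f"{host_data_folder}:{host_data_folder}",
            image,
        ],
    }


if __name__ == "__main__":
    block = docker_server_block(
        "/Users/yourusername/gemini-output",
        "/path/to/service-account.json",
        "your-project-id",
    )
    print(json.dumps({"mcpServers": {"gemini-media": block}}, indent=2))
```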

Available Tools

generate_image

Generate images using Gemini or Imagen models.

Parameters:

prompt (required): Text description of the image
model: Pick by use case. GA (stable) — preferred in production:
- gemini-2.5-flash-image (Nano Banana) — default; fastest, cheapest, great for conversational editing
- imagen-4.0-fast-generate-001 — cheapest photoreal
- imagen-4.0-generate-001 — balanced photoreal
- imagen-4.0-ultra-generate-001 — highest-fidelity photoreal, precise text rendering
- imagen-3.0-generate-002 — legacy, kept for compatibility
Preview — newest capabilities, may change without notice:
- gemini-3.1-flash-image-preview (Nano Banana 2) — 4K output, up to 14 reference images, fast
- gemini-3-pro-image-preview (Nano Banana Pro) — 4K, reasoning, thought_signature for multi-turn editing
image_uri: Input image URI for image-to-image generation
image_base64: Base64 encoded input image
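
MCP clients such as Claude Desktop send these parameters for you. For reference, a tool invocation is a JSON-RPC 2.0 request with the method tools/call, and the sketch below builds one by hand. The helper name tool_call_request is hypothetical, and a real session also performs the MCP initialize handshake before any tool call.

```python
import json
from itertools import count

# Each JSON-RPC request needs its own id; a simple counter is enough here.
_request_ids = count(1)


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    print(
        tool_call_request(
            "generate_image",
            {"prompt": "a lighthouse at dusk", "model": "gemini-2.5-flash-image"},
        )
    )
```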

Gemini 3.x Image Parameters (for gemini-3-pro-image-preview and gemini-3.1-flash-image-preview):

reference_image_uris: List of up to 14 reference image URIs for multi-image composition
- Up to 6 object images for high-fidelity inclusion
- Up to 5 human images for character consistency across scenes
image_size: Output resolution (1K, 2K, 4K) - must use uppercase K
thinking_level: Reasoning depth (low for fast, high for complex generation)
media_resolution: Input image processing quality (MEDIA_RESOLUTION_LOW, MEDIA_RESOLUTION_MEDIUM, MEDIA_RESOLUTION_HIGH)
thought_signature: For multi-turn editing workflows - pass back the signature from previous responses
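
The constraints listed above can be checked before a request is sent. The following is a client-side sketch, not the server's validation: the function name check_gemini3_image_args is hypothetical, and treating these parameters as errors on other models (rather than ignoring them) is an assumption.

```python
GEMINI_3_IMAGE_MODELS = {"gemini-3-pro-image-preview", "gemini-3.1-flash-image-preview"}
GEMINI_3_ONLY_PARAMS = (
    "reference_image_uris",
    "image_size",
    "thinking_level",
    "media_resolution",
    "thought_signature",
)
IMAGE_SIZES = {"1K", "2K", "4K"}
THINKING_LEVELS = {"low", "high"}
MEDIA_RESOLUTIONS = {
    "MEDIA_RESOLUTION_LOW",
    "MEDIA_RESOLUTION_MEDIUM",
    "MEDIA_RESOLUTION_HIGH",
}
MAX_REFERENCE_IMAGES = 14


def check_gemini3_image_args(args: dict) -> list:
    """Return a list of human-readable problems; empty means the args look fine."""
    problems = []
    used = [name for name in GEMINI_3_ONLY_PARAMS if name in args]
    # Assumption: Gemini 3.x parameters are rejected for other models.
    if used and args.get("model") not in GEMINI_3_IMAGE_MODELS:
        problems.append(f"{', '.join(used)} require a Gemini 3.x image model")
    if "image_size" in args and args["image_size"] not in IMAGE_SIZES:
        problems.append("image_size must be 1K, 2K, or 4K (uppercase K)")
    if "thinking_level" in args and args["thinking_level"] not in THINKING_LEVELS:
        problems.append("thinking_level must be low or high")
    if "media_resolution" in args and args["media_resolution"] not in MEDIA_RESOLUTIONS:
        problems.append("media_resolution must be a MEDIA_RESOLUTION_* value")
    if len(args.get("reference_image_uris", [])) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    return problems
```

A lowercase size such as "4k" produces one problem entry, which mirrors the uppercase requirement noted for image_size.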

generate_video

Generate videos using VEO models (requires Vertex AI).

Parameters:

prompt (required): Text description of the video
model: Model to use:
- veo-3.1-generate-001 (default): Highest quality, 4/6/8s duration, audio support
- veo-3.1-fast-generate-001: Faster generation with audio support
- veo-3.1-lite-generate-preview: Most cost-effective, 4/6/8s, audio; no video extension or 4K. Currently served via the Gemini API; Vertex AI projects may return 404 until Google publishes the model on Vertex.
aspect_ratio: 16:9 (default) or 9:16
duration_seconds: Video duration (4/6/8s)
include_audio: Enable audio generation
audio_prompt: Audio description
negative_prompt: Things to avoid in the video
seed: Random seed for reproducibility
image_uri: First frame image URI for image-to-video generation

Additional Parameters:

last_frame_uri: Last frame image URI for first+last frame control
- When combined with image_uri, generates smooth transitions between frames
reference_image_uris: List of up to 3 reference image URIs for subject preservation
- Preserves the appearance of a person, character, or product in the output video
- Note: Only supports 8-second duration (automatically enforced)
- Cannot be used together with first/last frame inputs
extend_video_uri: URI of existing VEO-generated video to extend
- Extends the final second of the video and continues the action
- Can be chained multiple times for longer videos (up to ~148s total)
- Note: Cannot be used together with other image inputs
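
To estimate how many chained extend_video_uri calls a target length needs, the arithmetic can be sketched as below. The 7 seconds added per extension is an assumption not stated in this README; it is chosen because an 8-second clip plus 20 such extensions gives 148 seconds, matching the approximate limit above. Adjust both defaults to whatever your model actually produces.

```python
import math


def extensions_needed(
    target_seconds: int,
    base_seconds: int = 8,
    seconds_per_extension: int = 7,  # assumption, not taken from this README
    max_total_seconds: int = 148,
) -> int:
    """Number of chained extensions needed to reach target_seconds."""
    if target_seconds > max_total_seconds:
        raise ValueError(f"target exceeds the ~{max_total_seconds}s limit")
    remaining = target_seconds - base_seconds
    if remaining <= 0:
        return 0  # the initial clip is already long enough
    return math.ceil(remaining / seconds_per_extension)
```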

Generation Modes (automatically selected based on inputs):

text_to_video: Text-only prompt
image_to_video: First frame image input
first_last_frame: First and last frame control
reference_to_video: Reference images for subject preservation (8s only)
extend_video: Extend existing video
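
The mode selection and the exclusivity rules above can be summarized in a short sketch. It illustrates the documented rules and is not the server's implementation: the function names are hypothetical, and rejecting last_frame_uri without image_uri is an assumption.

```python
def select_generation_mode(args: dict) -> str:
    """Pick the generation mode implied by which inputs are present."""
    has_first = bool(args.get("image_uri"))
    has_last = bool(args.get("last_frame_uri"))
    refs = args.get("reference_image_uris") or []
    has_extend = bool(args.get("extend_video_uri"))

    if has_extend:
        # Extension cannot be combined with any image input.
        if has_first or has_last or refs:
            raise ValueError("extend_video_uri cannot be combined with image inputs")
        return "extend_video"
    if refs:
        if has_first or has_last:
            raise ValueError("reference images cannot be combined with first/last frames")
        if len(refs) > 3:
            raise ValueError("at most 3 reference images are allowed")
        return "reference_to_video"
    if has_last:
        # Assumption: a last frame is only meaningful together with a first frame.
        if not has_first:
            raise ValueError("last_frame_uri requires image_uri")
        return "first_last_frame"
    if has_first:
        return "image_to_video"
    return "text_to_video"


def effective_duration(mode: str, requested_seconds: int) -> int:
    """Apply the 8-second rule for reference_to_video; otherwise validate."""
    if mode == "reference_to_video":
        return 8
    if requested_seconds not in (4, 6, 8):
        raise ValueError("duration_seconds must be 4, 6, or 8")
    return requested_seconds
```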

Google Vertex AI and Gemini Access

Vertex AI Setup (Required for VEO Video Generation)

Step 1: Create a Google Cloud Project

Go to the Google Cloud Console
Click the project dropdown at the top of the page
Click "New Project"
Enter a project name and click "Create"
Note your Project ID (you'll need this later)

Step 2: Enable Vertex AI API

In the Cloud Console, go to "APIs & Services" > "Library" (or visit API Library)
Search for "Vertex AI API"
Click on "Vertex AI API" in the results
Click the "Enable" button
Wait for the API to be enabled (this may take a minute)

Step 3: Create a Service Account

Go to "IAM & Admin" > "Service Accounts" (or visit Service Accounts)
Click "Create Service Account" at the top
Enter a name (e.g., "gemini-media-mcp") and description
Click "Create and Continue"
In the "Grant this service account access to project" section:
- Click the "Select a role" dropdown
- Search for "Vertex AI User"
- Select "Vertex AI User" role
- Click "Continue"
Click "Done" (you can skip the optional "Grant users access" section)

Step 4: Download Service Account Key

In the Service Accounts list, find the account you just created
Click the three dots (⋮) in the "Actions" column
Select "Manage keys"
Click "Add Key" > "Create new key"
Select "JSON" as the key type
Click "Create"
The JSON key file will automatically download to your computer
Important: Move this file to a secure location and note the path (e.g., ~/credentials/gemini-media-service-account.json)
Security Note: Never commit this file to version control or share it publicly

Step 5: Update Configuration

Use the following values in your configuration:

GOOGLE_CLOUD_PROJECT: Your Project ID from Step 1
GOOGLE_CLOUD_LOCATION: us-central1 (or your preferred region)
GOOGLE_APPLICATION_CREDENTIALS: Full path to the JSON key file from Step 4
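
Before pointing GOOGLE_APPLICATION_CREDENTIALS at the downloaded file, a quick offline check can catch a wrong file or a project mismatch. This sketch only inspects fields that service account key files contain; it does not contact Google, and the function name check_service_account_key is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str, expected_project: Optional[str] = None) -> list:
    """Return a list of problems found in a service account JSON key file."""
    key_path = Path(path).expanduser()
    if not key_path.is_file():
        return [f"file not found: {key_path}"]
    try:
        data = json.loads(key_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["key file must contain a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    # Compare against the Project ID noted in Step 1, if one was given.
    if expected_project and data.get("project_id") not in (None, expected_project):
        problems.append(
            f"project_id is {data['project_id']!r}, expected {expected_project!r}"
        )
    return problems
```

An empty list means the file has the expected shape; it does not prove the key is still valid or that the account holds the Vertex AI User role.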

Gemini API Setup (Image Generation Only)

For simpler image generation without video capabilities:

Visit Google AI Studio
Sign in with your Google account
Click "Create API Key"
Copy your key (starts with AIzaSy...)
Set the environment variable: export GEMINI_API_KEY=your-api-key

Note: With a Gemini API key, this server generates images only. For VEO video generation, configure Vertex AI.

Contributing

Development Setup

uv sync

Running Tests

uv run pytest

Code Quality

# Type checking
uv run basedpyright src/ tests/

# Linting and formatting
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Pre-commit hooks
uv run prek

Building Docker Image

docker build -t gemini-media-mcp .

# With specific version
docker build --build-arg VERSION=1.0.0 -t gemini-media-mcp:1.0.0 .

License

MIT