Open Webui Tools
Open WebUI Tools is a modular toolkit designed to extend and enrich your Open WebUI instance, turning it into a powerful AI workstation. With a suite of over 15 specialized tools, function pipelines, and filters, this project supports academic research, agentic autonomy, multimodal creativity, workflows, and more.
Open WebUI Tools Collection
A modular collection of tools, function pipes, and filters to supercharge your Open WebUI experience.
Transform your Open WebUI instance into a powerful AI workstation with this comprehensive toolkit. From academic research and image generation to music creation and autonomous agents, this collection provides everything you need to extend your AI capabilities.
What's Inside
This repository contains 20+ specialized tools and functions designed to enhance your Open WebUI experience:
Tools
- arXiv Search - Academic paper discovery (no API key required!)
- Perplexica Search - Web search using Perplexica API with citations
- Pexels Media Search - High-quality photos and videos from Pexels API
- YouTube Search & Embed - Search YouTube and play videos in embedded player
- Native Image Generator - Direct Open WebUI image generation with Ollama model management
- Hugging Face Image Generator - AI-powered image creation
- ComfyUI Image-to-Image (Qwen Edit 2509) - Advanced image editing with multi-image support
- ComfyUI ACE Step 1.5 Audio - Advanced music generation (New)
- ComfyUI ACE Step Audio (Legacy) - Advanced music generation
- ComfyUI Text-to-Video - Generate short videos from text using ComfyUI (default WAN 2.2 workflow)
- Flux Kontext ComfyUI - Professional image editing
- OpenWeatherMap Forecast Tool - Interactive weather widget with current conditions and forecasts
Function Pipes
- Planner Agent v3 - Advanced autonomous agent with agentic planning, multi-agent delegation, and real-time visual execution tracking
- arXiv Research MCTS - Advanced research with Monte Carlo Tree Search
- Multi Model Conversations v2 - Multi-agent discussions with interactive UI, tool support, and improved reasoning handling
- Resume Analyzer - Professional resume analysis
- Mopidy Music Controller - Music server management
- Letta Agent - Autonomous agent integration
- Perplexica Pipe - AI-powered web search with streaming responses and citations
- Google Veo Text-to-Video & Image-to-Video - Generate videos from text or a single image using Google Veo (only one image supported as input)
- MiniMax LLM Pipe - Route chat completions to MiniMax's OpenAI-compatible API with M2.7 and M2.7-highspeed models (204K context)
Filters
- Doodle Paint - Toggleable filter that opens a paint canvas before sending each message
- Prompt Enhancer - Automatic prompt improvement
- Semantic Router - Intelligent model selection
- Full Document - File processing capabilities
- Clean Thinking Tags - Conversation cleanup
- OpenRouter WebSearch Citations - Enable web search for OpenRouter models with citation handling
Quick Start
Option 1: Open WebUI Hub (Recommended)
- Visit https://openwebui.com/u/Haervwe
- Browse the collection and click "Get" for desired tools
- Follow the installation prompts in your Open WebUI instance
Option 2: Manual Installation
- Copy `.py` files from the `tools/`, `functions/`, or `filters/` directories
- Navigate to Open WebUI Workspace > Tools/Functions/Filters
- Paste the code, provide a name and description, then save
Key Features
- Plug-and-Play: Most tools work out of the box with minimal configuration
- Visual Integration: Seamless integration with ComfyUI workflows
- AI-Powered: Advanced features like MCTS research and autonomous planning
- Academic Focus: arXiv integration for research and academic work
- Creative Tools: Music generation and image editing capabilities
- Smart Routing: Intelligent model selection and conversation management
- Document Processing: Full document analysis and resume processing
Prerequisites
- Open WebUI: Version 0.6.0+ recommended
- Python: 3.8 or higher
- Optional Dependencies:
  - ComfyUI (for image/music generation tools)
  - Mopidy (for music controller)
  - Various API keys (Hugging Face, Tavily, etc.)
Configuration
Most tools are designed to work with minimal configuration. Key configuration areas:
- API Keys: Required for some tools (Hugging Face, Tavily, etc.)
- ComfyUI Integration: For image and music generation tools
- Model Selection: Choose appropriate models for your use case
- Filter Setup: Enable filters in your model configuration
Detailed Documentation
Table of Contents
- arXiv Search Tool
- Perplexica Search Tool
- Pexels Media Search Tool
- YouTube Search & Embed Tool
- Native Image Generator
- Hugging Face Image Generator
- Cloudflare Workers AI Image Generator
- SearxNG Image Search Tool
- ComfyUI Image-to-Image Tool (Qwen Image Edit 2509)
- ComfyUI ACE Step 1.5 Audio Tool
- ComfyUI ACE Step Audio Tool (Legacy)
- ComfyUI Text-to-Video Tool
- OpenWeatherMap Forecast Tool
- Flux Kontext ComfyUI Pipe
- Google Veo Text-to-Video & Image-to-Video Pipe
- MiniMax LLM Pipe
- Planner Agent v3
- arXiv Research MCTS Pipe
- Multi Model Conversations v2 Pipe
- Resume Analyzer Pipe
- Mopidy Music Controller
- Letta Agent Pipe
- Perplexica Pipe
- OpenRouter Image Pipe
- OpenRouter WebSearch Citations Filter
- Doodle Paint Filter
- Prompt Enhancer Filter
- Semantic Router Filter
- Full Document Filter
- Clean Thinking Tags Filter
- Using the Provided ComfyUI Workflows
- Installation
- Contributing
- License
- Credits
- Support
Tools
arXiv Search Tool
Description
Search arXiv.org for relevant academic papers on any topic. No API key required!
Configuration
- No configuration required. Works out of the box.
Usage
- Example: Search for recent papers about "tree of thought"
- Returns up to 5 most relevant papers, sorted by most recent.
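A search like the one above can be sketched against the public arXiv Atom API, which needs no API key. This is an illustrative sketch and is not taken from the tool's source: the helper names `build_arxiv_url` and `parse_arxiv_feed` are hypothetical, and the choice of `submittedDate` ordering is an assumption about what "sorted by most recent" means.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import Dict, List

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_arxiv_url(topic: str, max_results: int = 5) -> str:
    """Build a query URL that asks arXiv for the newest matching papers first."""
    params = {
        "search_query": f"all:{topic}",
        "sortBy": "submittedDate",      # assumption: "most recent" = submission date
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(atom_xml: str) -> List[Dict[str, str]]:
    """Extract title, URL, and publication date from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", "", ATOM_NS)
        papers.append({
            "title": " ".join(title.split()),  # collapse line breaks in titles
            "url": entry.findtext("atom:id", "", ATOM_NS),
            "published": entry.findtext("atom:published", "", ATOM_NS),
        })
    return papers
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `parse_arxiv_feed` yields a list that can be formatted for the chat.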
Example arXiv search result in Open WebUI
Perplexica Search Tool
Description
Search the web for factual information, current events, or specific topics using the Perplexica API. This tool provides comprehensive search results with citations and sources, making it ideal for research and information gathering. Perplexica is an open-source AI-powered search engine and alternative to Perplexity AI that must be self-hosted locally. It uses advanced language models to provide accurate, contextual answers with proper source attribution.
Configuration
- `BASE_URL` (str): Base URL for the Perplexica API (default: `http://host.docker.internal:3001`)
- `OPTIMIZATION_MODE` (str): Search optimization mode, "speed" or "balanced" (default: `balanced`)
- `CHAT_MODEL` (str): Default chat model for search processing (default: `llama3.1:latest`)
- `EMBEDDING_MODEL` (str): Default embedding model for search (default: `bge-m3:latest`)
- `OLLAMA_BASE_URL` (str): Base URL for Ollama API (default: `http://host.docker.internal:11434`)
Prerequisites: You must have Perplexica installed and running locally at the configured URL. Perplexica is a self-hosted open-source search engine that requires Ollama with the specified chat and embedding models. Follow the installation instructions in the Perplexica repository to set up your local instance.
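To make the configuration concrete, here is a rough sketch of how these settings could map onto a Perplexica search request body. The field names (`chatModel`, `embeddingModel`, `optimizationMode`, `focusMode`, `query`) follow the search API as Perplexica has documented it, but they vary between Perplexica versions, so treat them as assumptions. The function name `build_search_payload` and the fixed `webSearch` focus mode are hypothetical and not taken from the tool.

```python
from typing import Any, Dict


def build_search_payload(
    query: str,
    chat_model: str = "llama3.1:latest",
    embedding_model: str = "bge-m3:latest",
    optimization_mode: str = "balanced",
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body for a Perplexica search request."""
    if optimization_mode not in ("speed", "balanced"):
        raise ValueError("optimization_mode must be 'speed' or 'balanced'")
    return {
        # Assumed shape: provider + model name, with Ollama serving both models.
        "chatModel": {"provider": "ollama", "name": chat_model},
        "embeddingModel": {"provider": "ollama", "name": embedding_model},
        "optimizationMode": optimization_mode,
        "focusMode": "webSearch",  # assumption: the tool has no valve for this
        "query": query,
    }
```

The resulting dictionary would be sent as JSON to the search endpoint under `BASE_URL`; check the API documentation of your Perplexica version for the exact path and fields.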
Usage
- Example: Search for "latest developments in AI safety research 2024"
- Returns comprehensive search results with proper citations
- Automatically emits citations for source tracking in Open WebUI
- Provides both summary and individual source links
Features
- Web Search Integration: Direct access to current web information
- Citation Support: Automatic citation generation for Open WebUI
- Model Flexibility: Configurable chat and embedding models
- Real-time Status: Progress updates during search execution
- Source Tracking: Individual source citations with metadata
Pexels Media Search Tool
Description
Search and retrieve high-quality photos and videos from the Pexels API. This tool provides access to Pexels' extensive collection of free stock photos and videos, with comprehensive search capabilities, automatic citation generation, and direct image display in chat. Perfect for finding professional-quality media for presentations, content creation, or creative projects.
Configuration
- `PEXELS_API_KEY` (str): Free Pexels API key from https://www.pexels.com/api/ (required)
- `DEFAULT_PER_PAGE` (int): Default number of results per search (default: 5, recommended for LLMs)
- `MAX_RESULTS_PER_PAGE` (int): Maximum allowed results per page (default: 15, prevents overwhelming LLMs)
- `DEFAULT_ORIENTATION` (str): Default photo orientation, one of "all", "landscape", "portrait", or "square" (default: "all")
- `DEFAULT_SIZE` (str): Default minimum photo size, one of "all", "large" (24MP), "medium" (12MP), or "small" (4MP) (default: "all")
Prerequisites: Get a free API key from Pexels API and configure it in the tool's Valves settings.
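As an illustration of how these valves could translate into a Pexels API call, the sketch below builds a photo search request. The endpoint and the `Authorization` header are part of the public Pexels API; the function name `build_photo_search_request` and the way "all" is mapped to "no filter" are assumptions, not the tool's actual code.

```python
import urllib.parse
import urllib.request


def build_photo_search_request(
    api_key: str,
    query: str,
    per_page: int = 5,
    max_per_page: int = 15,
    orientation: str = "all",
    size: str = "all",
) -> urllib.request.Request:
    """Build a Pexels photo search request, clamping the result count."""
    per_page = max(1, min(per_page, max_per_page))
    params = {"query": query, "per_page": per_page}
    # Assumption: "all" means the filter is simply left out of the request.
    if orientation != "all":
        params["orientation"] = orientation
    if size != "all":
        params["size"] = size
    url = "https://api.pexels.com/v1/search?" + urllib.parse.urlencode(params)
    # Pexels expects the raw API key in the Authorization header.
    return urllib.request.Request(url, headers={"Authorization": api_key})
```

Passing the request to `urllib.request.urlopen` returns JSON whose `photos` entries can then be rendered as markdown images.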
Usage
- Photo Search Example: Search for photos of "modern office workspace"
- Video Search Example: Search for videos of "ocean waves at sunset"
- Curated Photos Example: Get curated photos from Pexels
Features
- Three Search Functions: `search_photos`, `search_videos`, and `get_curated_photos`
- Direct Image Display: Images are automatically formatted with markdown for immediate display in chat
- Advanced Filtering: Filter by orientation, size, color, and quality
- Attribution Support: Automatic citation generation with photographer credits
- Rate Limit Handling: Built-in error handling for API limits and invalid keys
- LLM Optimized: Results are limited and formatted to prevent overwhelming language models
- Real-time Status: Progress updates during search execution
YouTube Search & Embed Tool
Description
Search YouTube for videos and display them in a beautiful embedded player directly in your Open WebUI chat. This tool provides comprehensive YouTube search capabilities with automatic citation generation, detailed video information, and a custom-styled embedded player. Perfect for finding tutorials, music videos, educational content, or any video content you need.
Configuration
- `YOUTUBE_API_KEY` (str): YouTube Data API v3 key from https://console.cloud.google.com/apis/credentials (required)
- `MAX_RESULTS` (int): Maximum number of search results to return (default: 5, range: 1-10)
- `SHOW_EMBEDDED_PLAYER` (bool): Show embedded YouTube player for the first result (default: `True`)
- `REGION_CODE` (str): Region code for search results, e.g., "US", "GB", "JP" (default: "US")
- `SAFE_SEARCH` (str): Safe search filter, one of "none", "moderate", or "strict" (default: "moderate")
Prerequisites: Get a free YouTube Data API v3 key from Google Cloud Console and enable the YouTube Data API v3 in your project.
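The valves above correspond closely to parameters of the YouTube Data API v3 `search` endpoint. The following sketch shows one plausible mapping; the helper names `build_search_url` and `embed_url` are hypothetical and the tool's real implementation may differ.

```python
import urllib.parse


def build_search_url(
    api_key: str,
    query: str,
    max_results: int = 5,
    region_code: str = "US",
    safe_search: str = "moderate",
) -> str:
    """Build a YouTube Data API v3 search URL restricted to videos."""
    params = {
        "part": "snippet",
        "type": "video",
        "q": query,
        "maxResults": max(1, min(max_results, 10)),  # documented valve range: 1-10
        "regionCode": region_code,
        "safeSearch": safe_search,
        "key": api_key,
    }
    return "https://www.googleapis.com/youtube/v3/search?" + urllib.parse.urlencode(params)


def embed_url(video_id: str) -> str:
    """Return the standard embeddable player URL for a video ID."""
    return f"https://www.youtube.com/embed/{video_id}"
```

Each item in the JSON response carries the video ID under `id.videoId`, which is what `embed_url` expects.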
Usage
- Search for Videos: Search YouTube for "python tutorial for beginners"
- Play Specific Video: Play YouTube video dQw4w9WgXcQ
- Search with Custom Results: Search YouTube for "cooking recipes" with 10 results
Features
- Two Main Functions: `search_youtube` for searching and `play_video` for playing specific video IDs
- Embedded Player: Beautiful custom-styled YouTube player embedded directly in chat with responsive design
- Safe Search: Built-in content filtering options
- Region Support: Localized search results based on region code
- Direct Links: Provides YouTube links and "Watch on YouTube" buttons
- Rate Limit Handling: Proper error handling for API quota limits
- Real-time Status: Progress updates during search and loading
Getting Started
- Get a YouTube API Key:
  - Visit Google Cloud Console
  - Create a new project or select an existing one
  - Enable the "YouTube Data API v3"
  - Create credentials (API Key)
  - Copy the API key
- Configure the Tool:
  - Open the tool's Valves settings in Open WebUI
  - Paste your API key into the `YOUTUBE_API_KEY` field
  - Adjust other settings as desired (region, max results, etc.)
- Start Searching:
  - Use natural language: "Search YouTube for [topic]"
  - Or use the function directly: `search_youtube("topic")`
Example of YouTube video embedded in Open WebUI chat
Native Image Generator
Description
Generate images using Open WebUI's native image generation middleware configured in admin settings. This tool leverages whatever image generation backend you have configured (such as AUTOMATIC1111, ComfyUI, or OpenAI DALL-E) through Open WebUI's built-in image generation system, with optional Ollama model management to free up VRAM when needed.
Configuration
- `unload_ollama_models` (bool): Whether to unload all Ollama models from VRAM before generating images (default: `False`)
- `ollama_url` (str): Ollama API URL for model management (default: `http://host.docker.internal:11434`)
- `emit_embeds` (bool): Whether to emit HTML image embeds via the `embeds` event so generated images are displayed inline in the chat (default: `True`). When `False`, the tool skips emitting embeds and returns only bare download URLs. If `emit_embeds` is `True` but no event emitter is available, images cannot be displayed inline and only the URLs are returned.
Prerequisites: You must have image generation configured in Open WebUI's admin settings under Settings > Images. This tool works with any image generation backend you have set up (AUTOMATIC1111, ComfyUI, OpenAI, etc.).
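The optional VRAM-freeing step can be sketched against Ollama's HTTP API, where `GET /api/ps` lists the currently loaded models and a `POST /api/generate` with `keep_alive` set to `0` asks Ollama to unload one. This is a minimal sketch of that pattern; the helper name `build_unload_requests` is hypothetical and nothing here is copied from the tool.

```python
from typing import Any, Dict, List, Tuple


def build_unload_requests(
    ollama_url: str, ps_response: Dict[str, Any]
) -> List[Tuple[str, Dict[str, Any]]]:
    """Turn an /api/ps response into (url, json_body) pairs that unload each model."""
    endpoint = f"{ollama_url.rstrip('/')}/api/generate"
    return [
        # keep_alive=0 tells Ollama to drop the model from memory immediately.
        (endpoint, {"model": model["name"], "keep_alive": 0})
        for model in ps_response.get("models", [])
    ]
```

Each pair would be sent as a JSON POST before the image request is issued, so the image backend has the GPU memory to itself.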
Usage
- Example: Generate an image of "a serene mountain landscape at sunset"
- Uses whatever image generation backend is configured in Open WebUI admin settings
- Automatically manages model resources if Ollama unloading is enabled
- Returns markdown-formatted image links for immediate display
Features
- Native Integration: Uses Open WebUI's native image generation middleware without external dependencies
- Backend Agnostic: Works with any image generation backend configured in admin settings (AUTOMATIC1111, ComfyUI, OpenAI, etc.)
- Memory Management: Optional Ollama model unloading to optimize VRAM usage
- Flexible Model Support: You can prompt the agent to change the image generation model, provided you give it the model name.
- Real-time Status: Provides generation progress updates via event emitter
- Error Handling: Comprehensive error reporting and recovery
Hugging Face Image Generator
Description
Generate high-quality images from text descriptions using Hugging Face's Stable Diffusion models.
Configuration
- API Key (Required): Obtain a Hugging Face API key from your Hugging Face account and set it in the tool's configuration in Open WebUI.
- API URL (Optional): Uses Stability AI's SD 3.5 Turbo model as default. Can be customized to use other HF text-to-image model endpoints.
Usage
- Example: Create an image of "beautiful horse running free"
- Multiple image format options: Square, Landscape, Portrait, etc.
Example image generated with Hugging Face tool
Cloudflare Workers AI Image Generator
Description
Generate images using Cloudflare Workers AI text-to-image models, including FLUX, Stable Diffusion XL, SDXL Lightning, and DreamShaper LCM. This tool provides model-specific prompt preprocessing, parameter optimization, and direct image display in chat. It supports fast and high-quality image generation with minimal configuration.
Configuration
- `cloudflare_api_token` (str): Your Cloudflare API Token (required)
- `cloudflare_account_id` (str): Your Cloudflare Account ID (required)
- `default_model` (str): Default model to use (e.g., `@cf/black-forest-labs/flux-1-schnell`)
Prerequisites: Obtain a Cloudflare API Token and Account ID from your Cloudflare dashboard. No additional dependencies beyond requests.
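For orientation, the sketch below shows how the three valves fit into a Workers AI REST call: the account ID and model name form the URL, and the token is sent as a bearer credential. It uses only the standard library rather than `requests`, the function name `build_cloudflare_request` is hypothetical, and the minimal `{"prompt": ...}` body is an assumption since each model accepts its own extra parameters.

```python
import json
import urllib.request


def build_cloudflare_request(
    account_id: str, api_token: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build a POST request that runs a Workers AI text-to-image model."""
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{model}"
    )
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The response format depends on the model, so the decoding step is left out here.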
Usage
- Example: `await tools.generate_image(prompt="A futuristic cityscape at sunset, vibrant colors")` generates an image from a prompt.
- Returns a markdown-formatted image link for immediate display in chat.
Features
- Multiple Models: Supports FLUX, SDXL, SDXL Lightning, DreamShaper LCM
- Prompt Optimization: Automatic prompt enhancement for best results per model
- Parameter Handling: Smart handling of steps, guidance, negative prompts, and size
- Direct Image Display: Returns markdown image links for chat
- Error Handling: Comprehensive error and status reporting
- Real-time Status: Progress updates via event emitter
SearxNG Image Search Tool
Description
Search and retrieve images from the web using a self-hosted SearxNG instance. This tool provides privacy-respecting, multi-engine image search with direct image display in chat. Ideal for finding diverse images from multiple sources without tracking or ads.
Configuration
- `SEARXNG_ENGINE_API_BASE_URL` (str): The base URL for the SearxNG search engine API (default: `http://searxng:4000/search`)
- `MAX_RESULTS` (int): Maximum number of images to return per search (default: 5)
Prerequisites: You must have a running SearxNG instance. See SearxNG documentation for setup instructions.
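A minimal sketch of the underlying request and result formatting follows. It assumes the SearxNG instance has the JSON output format enabled in its settings and that image results expose an `img_src` field, as SearxNG image results normally do. The helper names `build_image_search_url` and `to_markdown` are hypothetical.

```python
import urllib.parse
from typing import Any, Dict, List


def build_image_search_url(base_url: str, query: str) -> str:
    """Build a SearxNG query URL that requests image results as JSON."""
    params = {"q": query, "format": "json", "categories": "images"}
    return f"{base_url}?{urllib.parse.urlencode(params)}"


def to_markdown(results: List[Dict[str, Any]], max_results: int = 5) -> List[str]:
    """Format results that carry an image URL as markdown image links."""
    links = [
        f"![{item.get('title', 'image')}]({item['img_src']})"
        for item in results
        if item.get("img_src")  # skip entries without a usable image URL
    ]
    return links[:max_results]
```

`to_markdown` would be applied to the `results` array of the JSON response.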
Usage
- Example: `await tools.search_images(query="cats", max_results=3)` searches for images of cats.
- Returns a list of markdown-formatted image links for immediate display in chat.
Features
- Privacy-Respecting: No tracking, ads, or profiling
- Multi-Engine: Aggregates results from multiple search engines
- Direct Image Display: Images are formatted for chat display
- Customizable: Choose engines, result count, and more
- Error Handling: Handles connection and search errors gracefully
Function Pipes
Perplexica Pipe
Description
AI-powered web search using Perplexica with streaming responses, intelligent citations, and comprehensive source tracking. This function pipe integrates with your self-hosted Perplexica instance to provide real-time web search capabilities with proper source attribution, making it perfect for research, fact-checking, and staying up-to-date with current events.
Configuration
- `enable_perplexica` (bool): Enable or disable Perplexica search (default: `True`)
- `perplexica_api_url` (str): Perplexica API endpoint (default: `http://localhost:3001/api/search`)
- `perplexica_chat_provider` (str): Provider ID for chat model (default: `550e8400-e29b-41d4-a716-446655440000`)
- `perplexica_chat_model` (str): Chat model to use (default: `gpt-4o-mini`)
- `perplexica_embedding_provider` (str): Provider ID for embeddings (default: `550e8400-e29b-41d4-a716-446655440000`)
- `perplexica_embedding_model` (str): Embedding model to use (default: `text-embedding-3-large`)
- `perplexica_focus_mode` (str): Search focus mode (default: `webSearch`)
- `perplexica_optimization_mode` (str): Optimization mode, "speed" or "balanced" (default: `balanced`)
- `task_model` (str): Model for non-search tasks (default: `gpt-4o-mini`)
- `max_history_pairs` (int): Maximum conversation history pairs to include (default: 12)
- `perplexica_timeout_ms` (int): HTTP socket read timeout in milliseconds (default: 1500)
Prerequisites: You must have Perplexica installed and running locally. Perplexica is an open-source AI-powered search engine that requires setup with Ollama or OpenAI-compatible providers.
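Two of these valves are easier to reason about with a small example: `max_history_pairs` bounds how much conversation is forwarded, and `perplexica_timeout_ms` is expressed in milliseconds while most Python HTTP clients expect seconds. The sketch below illustrates both under the assumption that a "pair" is one user message plus one assistant message; `trim_history` and `timeout_seconds` are hypothetical helper names, not functions from the pipe.

```python
from typing import Dict, List


def trim_history(messages: List[Dict[str, str]], max_pairs: int = 12) -> List[Dict[str, str]]:
    """Keep only the most recent user/assistant turns, up to max_pairs pairs."""
    if max_pairs <= 0:
        return []
    # System messages are dropped here; only conversational turns are counted.
    turns = [m for m in messages if m.get("role") in ("user", "assistant")]
    return turns[-2 * max_pairs:]


def timeout_seconds(timeout_ms: int) -> float:
    """Convert a millisecond valve value into seconds for an HTTP client."""
    return timeout_ms / 1000.0
```

With the defaults, a long chat is cut to its last 24 turns and the read timeout becomes 1.5 seconds.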
Usage
- Example: Investigate the latest news on AI regulation in different regions (US, Europe, China, etc.); make only one tool call
- Automatically routes search queries to Perplexica
- Provides streaming responses with real-time updates
- Emits citations with source metadata for each result
- Handles conversation history for contextual searches
Features
- Streaming Support: Real-time streaming responses for faster interaction
- Smart Citations: Automatic citation generation with metadata (title, URL, content)
- Conversation History: Maintains context from previous messages (configurable)
- Multiple Focus Modes: webSearch, academicSearch, youtubeSearch, and more
- Status Updates: Real-time progress updates during search
- Source Tracking: Comprehensive source metadata with URLs and snippets
- Task Routing: Intelligent routing between search and non-search tasks
- Error Handling: Robust error handling with user-friendly messages
Getting Started
- Install Perplexica:
  - Follow the Perplexica installation guide
  - Set up your chat and embedding providers (Ollama, OpenAI, etc.)
  - Start the Perplexica server (default: http://localhost:3001)
- Configure the Pipe:
  - Open the pipe's Valves settings in Open WebUI
  - Set `perplexica_api_url` to your Perplexica instance URL
  - Configure your chat and embedding providers/models
  - Adjust focus mode and optimization settings as needed
- Start Searching:
  - Select the "Perplexica Pipe" model in Open WebUI
  - Ask questions or request web searches
  - View results with automatic citations and source links
Example of Perplexica pipe search results with citations in Open WebUI
ComfyUI Image-to-Image Tool (Qwen Image Edit 2509)
Description
Edit and transform images using ComfyUI workflows with AI-powered image editing. Features the Qwen Image Edit 2509 model as default, supporting up to 3 images for advanced editing with context, style transfer, and multi-image blending. Also includes Flux Kontext workflow for artistic transformations. Images are automatically extracted from message attachments and rendered as beautiful HTML embeds.
Configuration
- `comfyui_api_url` (str): ComfyUI HTTP API endpoint (default: `http://localhost:8188`)
- `workflow_type` (str): Workflow to use: "Flux_Kontext", "QWen_Edit", or "Custom" (default: `QWen_Edit`)
- `custom_workflow` (Dict): Custom ComfyUI workflow JSON (only used when `workflow_type='Custom'`)
- `max_wait_time` (int): Maximum wait time in seconds for job completion (default: `600`)
- `unload_ollama_models` (bool): Automatically unload Ollama models from VRAM before generating images (default: `False`)
- `ollama_api_url` (str): Ollama API URL for model management (default: `http://localhost:11434`)
- `return_html_embed` (bool): Return a beautiful HTML image embed with comparison view (default: `True`)
Prerequisites: You must have ComfyUI installed and running with the required models and custom nodes:
- For Flux Kontext: Flux Dev model, Flux Kontext LoRA, and required ComfyUI nodes
- For Qwen Edit 2509: Qwen Image Edit 2509 model, Qwen CLIP, VAE, and ETN_LoadImageBase64 custom node
- See the Extras folder for workflow JSON files: `flux_context_owui_api_v1.json` and `image_qwen_image_edit_2509_api_owui.json`
Usage
- Example: Attach one or more images and provide editing instructions, such as "Remove the background", "Change car to red", or "Apply lighting from first image to second image"
Features
- Qwen Edit 2509 (Default): State-of-the-art image editing with precise control and instruction-following
- Multi-Image Support: Qwen Edit workflow accepts 1-3 images for advanced editing with context and style transfer
- Dual Workflow Support: Switch to Flux Kontext for artistic transformations and creative reimagining
- Automatic Image Handling: Images are extracted from messages and passed to the AI automatically
- VRAM Management: Optional Ollama model unloading to free GPU memory before generation
- Beautiful HTML Embeds: Displays results with elegant before/after comparison view
- OpenWebUI Integration: Automatically uploads generated images to OpenWebUI storage
- Flexible Workflows: Use built-in workflows or provide your own custom ComfyUI JSON
Workflow Details
Qwen Edit 2509 (Default):
- Supports 1-3 images with multi-image context and style transfer
- Lightning-fast 4-step generation
- Best for: precise edits, object manipulation, style transfer
Flux Kontext (Alternative):
- Single image input (multi-image support planned)
- 20-step high-quality generation
- Best for: artistic transformations, creative reimagining
Custom Workflow:
- Bring your own ComfyUI workflow JSON
- Full flexibility for advanced users
Getting Started
- Set up ComfyUI:
  - Install ComfyUI
  - Download required models (Flux Dev, Qwen Edit 2509, etc.)
  - Install necessary custom nodes (especially `ETN_LoadImageBase64` for Qwen workflow)
- Import workflows:
  - Load `Extras/flux_context_owui_api_v1.json` or `Extras/image_qwen_image_edit_2509_api_owui.json` in ComfyUI
  - Verify all nodes are recognized (install missing custom nodes if needed)
- Configure the tool:
  - Set `comfyui_api_url` to your ComfyUI server address
  - Choose your preferred workflow type
  - Optionally enable Ollama model unloading if you have limited VRAM
- Start editing:
  - Attach an image (or up to 3 for multi-image editing) to your message
  - Describe your desired transformation in natural language
  - Watch the magic happen!
Note for Custom Workflows: If you're using a custom workflow with different capabilities (e.g., single-image only or different prompting requirements), you should modify the edit_image function's docstring in the tool code. The docstring instructs the AI on how to use the tool and what prompting strategies work best. Adjust it to match your workflow's specific capabilities and requirements.
Multi-Image Support Status:
- Qwen Edit 2509: Full support for 1-3 images (default workflow)
- Flux Kontext: Single image currently; multi-image support planned for future release
- Custom workflows: Depends on your workflow implementation
Example of Qwen Image Edit 2509 transforming a cyberpunk dolphin into a natural mountain scene
ComfyUI ACE Step 1.5 Audio Tool
Description
Generate high-quality music using the improved ACE Step 1.5 model via ComfyUI. This tool builds upon the legacy version with enhanced control over musical elements like key, time signature, BPM, and language. It features the same beautiful embedded player and supports batch generation.
Configuration
- `comfyui_api_url` (str): ComfyUI API endpoint (default: `http://localhost:8188`)
- `model_name` (str): ACE Step 1.5 checkpoint name (default: `ace_step_1.5_turbo_aio.safetensors`)
- `batch_size` (int): Number of tracks to generate per request (default: `1`)
- `max_duration` (int): Maximum song duration in seconds (default: `180`)
- `max_number_of_steps` (int): Maximum allowed sampling steps (default: `50`)
- `max_wait_time` (int): Max wait time for generation in seconds (default: `600`)
- `workflow_json` (str): ComfyUI Workflow JSON (default: `ace_step_1.5_workflow`)
- `checkpoint_node` (str): Node ID for CheckpointLoaderSimple (default: `"97"`)
- `text_encoder_node` (str): Node ID for TextEncodeAceStepAudio1.5 (default: `"94"`)
- `empty_latent_node` (str): Node ID for EmptyAceStep1.5LatentAudio (default: `"98"`)
- `sampler_node` (str): Node ID for KSampler (default: `"3"`)
- `save_node` (str): Node ID for SaveAudioMP3 (default: `"104"`)
- `vae_decode_node` (str): Node ID for VAEDecodeAudio (default: `"18"`)
- `unload_node` (str): Node ID for UnloadAllModels (default: `"105"`)
- `owui_base_url` (str): Open WebUI base URL (default: `http://localhost:3000`)
- `save_local` (bool): Save generated audio to local storage (default: `True`)
- `show_player_embed` (bool): Show the embedded audio player (default: `True`)
- `unload_comfyui_models` (bool): Unload models after generation using ComfyUI-Unload-Model node (default: `False`)
Prerequisites
- ComfyUI-Unload-Model Node: To use the model unloading feature (`unload_comfyui_models`), you must install the ComfyUI-Unload-Model custom node in your ComfyUI instance.
  Note: You can use other model unloading nodes in a custom workflow, but you must correctly configure the `unload_node` valve with the ID of that node.
User Configuration (Per-User Valves)
Users can customize these settings for their individual sessions by clicking the "Valves" icon in the chat interface:
- `generate_audio_codes` (bool): Enable/disable audio code generation. Disabling it (Fast Mode) speeds up generation but may reduce quality (default: `True`)
- `steps` (int): Number of sampling steps for generation. Higher values may improve quality but take longer (default: `8`, capped by Admin `max_number_of_steps`)
- `seed` (int): Random seed for generation. Set to `-1` for random, or a specific number for reproducible results (default: `-1`)
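The interaction between the per-user `steps` value and the admin `max_number_of_steps` cap, and the meaning of a `-1` seed, can be illustrated with a short sketch. The helper names `resolve_steps` and `resolve_seed` and the 32-bit seed range are assumptions made for illustration; the tool may resolve these values differently.

```python
import random
from typing import Optional


def resolve_steps(user_steps: int = 8, max_number_of_steps: int = 50) -> int:
    """Clamp the user's requested step count to the admin-defined maximum."""
    return max(1, min(user_steps, max_number_of_steps))


def resolve_seed(seed: int = -1, rng: Optional[random.Random] = None) -> int:
    """Return the given seed, or draw a random one when seed is -1."""
    if seed == -1:
        generator = rng or random.Random()
        return generator.randint(0, 2**32 - 1)  # assumed seed range
    return seed
```

A user asking for 100 steps with the default cap would therefore get 50, while a fixed seed passes through unchanged so results can be reproduced.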
Usage
- Example: Generate a "cyberpunk, darkwave" song about "AI takeover" in E minor, 140 BPM, duration 60s
- Advanced Features:
  - Control Key Scale (e.g., "C Major", "F# Minor")
  - Set Time Signature (e.g., 4/4, 3/4)
  - Choose Language (e.g., "en", "ja", "zh")
ACE Step 1.5 Audio Player
Features
- New in 1.5: Key scale, time signature, language support, and improved audio quality
- Batch Generation: Generate multiple variations at once
- Embedded Player: Sleek, transparent player with lyrics and waveform visualization
- Customizable: Full control over generation parameters
ComfyUI ACE Step Audio Tool (Legacy)
Description
Generate music using the ACE Step AI model via ComfyUI. This tool lets you create songs from tags and lyrics, with full control over the workflow JSON and node numbers. Features a beautiful, transparent custom audio player with play/pause controls, progress tracking, volume adjustment, and a clean scrollable lyrics display. Designed for advanced music generation and can be customized for different genres and moods.
Configuration
- `comfyui_api_url` (str): ComfyUI API endpoint (e.g., `http://localhost:8188`)
- `model_name` (str): Model checkpoint to use (default: `ACE_STEP/ace_step_v1_3.5b.safetensors`)
- `workflow_json` (str): Full ACE Step workflow JSON as a string. Use `{tags}`, `{lyrics}`, and `{model_name}` as placeholders.
- `tags_node` (str): Node number for the tags input (default: `"14"`)
- `lyrics_node` (str): Node number for the lyrics input (default: `"14"`)
- `model_node` (str): Node number for the model checkpoint input (default: `"40"`)
- `save_local` (bool): Copy the generated song to Open WebUI storage backend (default: `True`)
- `owui_base_url` (str): Your Open WebUI base URL (default: `http://localhost:3000`)
- `show_player_embed` (bool): Show the embedded audio player. If false, only returns download link (default: `True`)
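Because `workflow_json` is a JSON string containing literal `{tags}`, `{lyrics}`, and `{model_name}` markers, substitution has to avoid `str.format` (the JSON braces would confuse it) and has to escape the inserted text. The sketch below shows one way to do that, assuming each placeholder sits inside a quoted JSON string; `fill_workflow` is a hypothetical helper, not the tool's own function.

```python
import json
from typing import Any, Dict


def _json_escape(text: str) -> str:
    """Escape text for use inside an existing JSON string literal."""
    return json.dumps(text)[1:-1]  # drop the surrounding quotes


def fill_workflow(
    workflow_json: str, tags: str, lyrics: str, model_name: str
) -> Dict[str, Any]:
    """Replace the three placeholders and parse the result into a workflow dict."""
    filled = (
        workflow_json
        .replace("{tags}", _json_escape(tags))
        .replace("{lyrics}", _json_escape(lyrics))
        .replace("{model_name}", _json_escape(model_name))
    )
    return json.loads(filled)
```

Escaping matters mostly for lyrics, which commonly contain line breaks and quotation marks that would otherwise produce invalid JSON.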
Usage
- Import the ACE Step workflow:
  - In ComfyUI, go to the workflow import section and load `extras/ace_step_api.json`.
  - Adjust nodes as needed for your setup.
- Configure the tool in Open WebUI:
  - Set the `comfyui_api_url` to your ComfyUI backend.
  - Paste the workflow JSON (from the file or your own) into `workflow_json`.
  - Set the correct node numbers if you modified the workflow.
- Generate music:
  - Provide a song title, tags, and (optionally) lyrics.
  - The tool will return either an embedded audio player or a download link based on your configuration.
- Example: Generate a song about AI and humanity friendship
The sleek, transparent audio player embedded in Open WebUI chat
Features
- Custom Audio Player: Beautiful, semi-transparent player with blur effects
- Full Playback Controls: Play/pause, seek, volume control with SVG icons
- Song Title Display: User-defined song titles prominently shown
- Scrollable Lyrics: Clean lyrics display with custom scrollbar (max 120px height)
- Transparent UI: Integrates seamlessly with any Open WebUI theme
- Toggle Player: Option to show/hide player embed and just return download links
- Local Storage: Optionally saves songs to Open WebUI cache for persistence
Returns an embedded audio player with download link or just the link, depending on configuration. Advanced users can fully customize the workflow for different genres, moods, or creative experiments.
ComfyUI Text-to-Video Tool
Description
Generate short videos from text prompts using a ComfyUI workflow that defaults to the WAN 2.2 text-to-video models. This tool wraps the ComfyUI HTTP + WebSocket API, waits for the job to complete, extracts the produced video, and (optionally) uploads it to Open WebUI storage so it can be embedded in chat.
The default workflow file included in this repository is extras/video_wan2_2_14B_t2v.json and the tool implementation lives at tools/text_to_video_comfyui_tool.py.
Configuration
- `comfyui_api_url` (str): ComfyUI HTTP API endpoint (default: `http://localhost:8188`)
- `prompt_node_id` (str): Node ID in the workflow that receives the text prompt (default: `"89"`)
- `workflow` (json/dict): ComfyUI workflow JSON; if empty, the bundled WAN 2.2 workflow is used
- `max_wait_time` (int): Maximum seconds to wait for the ComfyUI run (default: `600`)
- `unload_ollama_models` (bool): Whether to unload Ollama models from VRAM before running (default: `False`)
- `ollama_api_url` (str): Ollama API URL used when unloading models (default: `http://localhost:11434`)
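The HTTP + WebSocket flow described above can be outlined in three small steps: write the prompt into the configured node, wrap the workflow in the body that ComfyUI's `/prompt` endpoint expects, and watch WebSocket messages for the completion signal. The sketch below follows the pattern used in ComfyUI's own API examples; the helper names are hypothetical, and the `text` input name on the prompt node is an assumption that depends on the workflow.

```python
import copy
import json
import uuid
from typing import Any, Dict, Optional


def inject_prompt(
    workflow: Dict[str, Any],
    prompt: str,
    prompt_node_id: str = "89",
    input_name: str = "text",  # assumption: the node takes its prompt as "text"
) -> Dict[str, Any]:
    """Return a copy of the workflow with the prompt written into one node."""
    if prompt_node_id not in workflow:
        raise KeyError(f"node {prompt_node_id!r} not found in workflow")
    updated = copy.deepcopy(workflow)
    updated[prompt_node_id]["inputs"][input_name] = prompt
    return updated


def build_prompt_payload(
    workflow: Dict[str, Any], client_id: Optional[str] = None
) -> Dict[str, Any]:
    """Build the JSON body for a POST to ComfyUI's /prompt endpoint."""
    return {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}


def is_run_finished(ws_message: str, prompt_id: str) -> bool:
    """Check whether a WebSocket text message signals that the run completed."""
    message = json.loads(ws_message)
    if message.get("type") != "executing":
        return False
    data = message.get("data", {})
    # ComfyUI reports completion as an "executing" event with node set to null.
    return data.get("node") is None and data.get("prompt_id") == prompt_id
```

After completion, the outputs for the run can be looked up through ComfyUI's `/history/{prompt_id}` endpoint.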
Usage
- Import the workflow
  - In ComfyUI, import the workflow JSON `extras/video_wan2_2_14B_t2v.json` if you want to inspect or modify nodes.
- Install / Configure the tool
  - Copy `tools/text_to_video_comfyui_tool.py` into your Open WebUI tools and set the `comfyui_api_url` and other valves as needed in the tool settings.
- Generate a video
  - Call the tool with a prompt (e.g. "A cyberpunk panda skating through neon streets, 3s shot") and wait for the job to complete. The tool emits progress events and will provide an embedded HTML player or a direct ComfyUI URL.
Example:
Generate a 3 second shot of "a cyberpunk panda skating through neon city streets" using the default WAN 2.2 workflow
Example short video generated via ComfyUI WAN 2.2 workflow (thumbnail).
Features
- Uses WAN 2.2 text-to-video model workflow by default (video_wan2_2_14B_t2v.json)
- Submits workflow to ComfyUI and listens on WebSocket for completion
- Extracts produced video files and optionally uploads them to Open WebUI storage for inline embedding
- Optional Ollama VRAM unloading to free memory before runs
- Configurable prompt node and wait timeout
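The prompt-injection step can be sketched as follows. This is a minimal illustration, not the tool's actual code: it assumes the workflow is in ComfyUI's API format (a dict of node ID to node), that the prompt node takes its text in an input named "text" (an assumption about the bundled workflow), and build_prompt_request is a hypothetical helper name. The resulting body is what would be sent to ComfyUI's POST /prompt endpoint.

```python
import copy
import json
import uuid


def build_prompt_request(workflow: dict, prompt_text: str,
                         prompt_node_id: str = "89",
                         input_key: str = "text") -> dict:
    """Return a JSON body for ComfyUI's POST /prompt endpoint.

    input_key is an assumption; check the node in your own workflow.
    """
    if prompt_node_id not in workflow:
        raise KeyError(f"node {prompt_node_id!r} not found in workflow")
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    patched[prompt_node_id].setdefault("inputs", {})[input_key] = prompt_text
    # client_id lets a WebSocket listener match events to this submission
    return {"prompt": patched, "client_id": uuid.uuid4().hex}


# Tiny stand-in workflow; the real one is extras/video_wan2_2_14B_t2v.json.
demo_workflow = {"89": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}}}
body = build_prompt_request(demo_workflow, "a cyberpunk panda skating")
print(json.dumps(body["prompt"], indent=2))
```

If you change the workflow in ComfyUI, node IDs may shift, so prompt_node_id should be re-checked before reusing a helper like this.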
OpenWeatherMap Forecast Tool
Description
Tool that fetches weather forecasts using the OpenWeatherMap API and displays an interactive HTML weather widget with current conditions, hourly, and daily forecasts. Supports both the free 2.5 API and the premium One Call 3.0 API.
Configuration
- openweathermap_api_key (str): Your OpenWeatherMap API key (required)
- api_version (str): API version: '2.5' (free, includes current + 5-day/3h forecast) or '3.0' (One Call API, requires separate subscription) (default: 2.5)
- units (str): Units of measurement: 'metric', 'imperial', or 'standard' (default: metric)
- language (str): Language code for weather descriptions (default: en)
- show_weather_embed (bool): Show the embedded weather widget (default: True)
Usage
- Example:
  What is the weather like in Tokyo, JP?
- Fetches current conditions, hourly forecast, and multi-day daily forecast
- Displays an interactive weather widget and returns a text summary for the LLM
Example OpenWeatherMap Forecast Tool widget
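To make the difference between the two API versions concrete, here is a hedged sketch of how the request URL might be assembled from the valves. build_weather_url is a hypothetical helper, not the tool's code; the endpoints and query parameters (q, lat, lon, appid, units, lang) are those of the public OpenWeatherMap API, and the One Call 3.0 endpoint takes coordinates rather than a city name.

```python
from urllib.parse import urlencode

BASE = "https://api.openweathermap.org/data"


def build_weather_url(api_key, api_version="2.5", units="metric",
                      language="en", city=None, lat=None, lon=None):
    """Build a forecast URL for the chosen API version (illustrative only)."""
    common = {"appid": api_key, "units": units, "lang": language}
    if api_version == "3.0":
        # One Call 3.0 is coordinate-based, so the city must be geocoded first
        if lat is None or lon is None:
            raise ValueError("One Call 3.0 needs lat and lon")
        return f"{BASE}/3.0/onecall?" + urlencode({"lat": lat, "lon": lon, **common})
    if city is None:
        raise ValueError("API 2.5 lookup in this sketch needs a city name")
    return f"{BASE}/2.5/forecast?" + urlencode({"q": city, **common})


url_25 = build_weather_url("YOUR_KEY", city="Tokyo,JP")
url_30 = build_weather_url("YOUR_KEY", api_version="3.0", lat=35.68, lon=139.69)
print(url_25)
print(url_30)
```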
Function Pipes
Flux Kontext ComfyUI Pipe
Description
A pipe that connects Open WebUI to the Flux Kontext image-to-image editing model through ComfyUI. This integration allows for advanced image editing, style transfers, and other creative transformations using the Flux Kontext workflow. Features an interactive /setup command system for easy configuration by administrators.
Configuration
The pipe includes an interactive setup system that allows administrators to configure all settings through chat commands. Most configuration can be done using the /setup command, which provides an interactive form for easy adjustment of parameters.
Key Configuration Options:
- COMFYUI_ADDRESS: Address of the running ComfyUI server (default: http://127.0.0.1:8188)
- COMFYUI_WORKFLOW_JSON: The entire ComfyUI workflow in JSON format
- PROMPT_NODE_ID: Node ID for text prompt input (default: "6")
- IMAGE_NODE_ID: Node ID for Base64 image input (default: "196")
- KSAMPLER_NODE_ID: Node ID for the sampler node (default: "194")
- ENHANCE_PROMPT: Enable vision model-based prompt enhancement (default: False)
- VISION_MODEL_ID: Vision model to use for prompt enhancement
- UNLOAD_OLLAMA_MODELS: Free RAM by unloading Ollama models before generation (default: False)
- MAX_WAIT_TIME: Maximum wait time for generation in seconds (default: 1200)
- AUTO_CHECK_MODEL_LOADER: Auto-detect model loader type for .safetensors or .gguf (default: False)
Usage
Initial Setup
- Import the workflow:
  - In ComfyUI, import extras/flux_context_owui_api_v1.json as a workflow
  - Adjust node IDs if you modify the workflow
- Configure using /setup command (Admin only):
  - Type /setup in the chat to launch the interactive configuration form
  - The form will display all current settings with input fields
  - Adjust any settings you need to change
  - Submit the form to apply and optionally save the configuration
  - Settings can be persisted to a backend config file for permanent storage
- Alternative: Manual configuration:
  - Access the pipe's Valves in Open WebUI's admin panel
  - Set COMFYUI_ADDRESS to your ComfyUI backend
  - Paste the workflow JSON into COMFYUI_WORKFLOW_JSON
  - Configure node IDs and other parameters as needed
Using the Pipe
- Basic image editing:
  - Upload an image to the chat
  - Provide a text prompt describing the desired changes
  - The pipe processes the image through ComfyUI and returns the edited result
- Enhanced prompts (optional):
  - Enable ENHANCE_PROMPT in settings
  - Set a VISION_MODEL_ID (e.g., a multimodal model like LLaVA or GPT-4V)
  - The vision model will analyze the input image and automatically refine your prompt for better results
- Memory management:
  - Enable UNLOAD_OLLAMA_MODELS to free RAM before generation
  - The default workflow includes a Clean VRAM node for VRAM management in ComfyUI
Example - Image editing:
Prompt: "Edit this image to look like a medieval fantasy king, preserving facial features."
[Upload image]
Example of Flux Kontext /setup command interface
Example of Flux Kontext image editing output
MiniMax LLM Pipe
Description
Route chat completions to MiniMax's OpenAI-compatible API (api.minimax.io/v1) directly from Open WebUI. This pipe exposes MiniMax-M2.7 and MiniMax-M2.7-highspeed models (both with 204K context windows) as selectable models in your Open WebUI instance.
Configuration
- MINIMAX_API_KEY (str): Your MiniMax API key (required, get one at https://platform.minimaxi.com)
- ENABLED_MODELS (list): Which MiniMax models to expose (default: all)
- STRIP_THINKING (bool): Strip <think>…</think> blocks from responses (default: True)
- DEFAULT_TEMPERATURE (float): Default temperature when none is specified, 0.01–1.0 (default: 0.7)
Prerequisites: Get a MiniMax API key from MiniMax Platform.
Usage
- Install the pipe: Copy functions/minimax_pipe.py into Open WebUI via Workspace > Functions
- Configure: Set your MINIMAX_API_KEY in the pipe's Valves settings
- Select model: Choose "MiniMax M2.7" or "MiniMax M2.7 Highspeed" from the model dropdown
- Start chatting: The pipe streams responses directly from the MiniMax API
Features
- OpenAI-Compatible Routing: Uses MiniMax's /v1/chat/completions endpoint
- Two Models: MiniMax-M2.7 (full) and MiniMax-M2.7-highspeed (faster), both with 204K context
- Streaming: Real-time streamed responses via chat:message:delta events
- Temperature Clamping: Automatically clamps temperature to MiniMax's accepted range (0.01–1.0)
- Think-Tag Stripping: Strips <think>…</think> reasoning blocks from output (configurable)
- Parameter Forwarding: Passes max_tokens, top_p, and other parameters to the API
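Temperature clamping and think-tag stripping are simple enough to illustrate directly. The sketch below is not taken from functions/minimax_pipe.py; the function names are hypothetical, and it assumes complete (non-streamed) text, whereas a streaming pipe would need to buffer partial tags.

```python
import re

# Non-greedy match so several separate <think> blocks are each removed
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def clamp_temperature(value, low=0.01, high=1.0, default=0.7):
    """Fall back to the default when unset, then clamp into [low, high]."""
    if value is None:
        return default
    return max(low, min(high, float(value)))


def strip_thinking(text: str) -> str:
    """Remove <think>...</think> reasoning blocks from a full response."""
    return THINK_BLOCK.sub("", text).strip()


print(clamp_temperature(0))      # below the accepted range
print(clamp_temperature(None))   # nothing specified
print(strip_thinking("<think>plan the answer</think>\nHello!"))
```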
Google Veo Text-to-Video & Image-to-Video Pipe
Description
Generate high-quality videos from text prompts or a single image using Google Veo via the Gemini API. This pipe enables advanced video generation capabilities directly from Open WebUI, supporting creative and professional use cases. It supports both text-to-video and image-to-video generation.
Note: Only one image is supported as input at this time. Multi-image input is not available.
Configuration
- GOOGLE_API_KEY (str): Google API key for Gemini API access (required)
- MODEL (str): The Veo model to use for video generation (default: "veo-3.1-generate-preview")
- ENHANCE_PROMPT (bool): Use vision model to enhance prompt (default: False)
- VISION_MODEL_ID (str): Vision model to be used as prompt enhancer
- ENHANCER_SYSTEM_PROMPT (str): System prompt for prompt enhancement process
- MAX_WAIT_TIME (int): Max wait time for video generation in seconds (default: 1200)
Prerequisites:
- You must have access to the Google Gemini API and a valid API key.
- Only one image is supported as input for image-to-video generation (Gemini API limitation).
Usage
- Text-to-Video Example:
  Generate a video of "a futuristic city at sunset with flying cars"
- Image-to-Video Example:
  Create a video from this image: [Attach image]
Features
- Text-to-Video: Generate videos from descriptive text prompts
- Image-to-Video: Animate a single image into a video sequence
- High Quality: Leverages Google Veo's advanced video generation models
- Direct Embedding: Returns markdown-formatted video links for display in chat
- Status Updates: Progress and error reporting during generation
Limitations
- Only one image is supported as input for image-to-video generation (Gemini API limitation)
- Multi-image or video editing features are not available
Example Output
Example of Google Veo video generation output in Open WebUI
Planner Agent v3
Advanced autonomous agent with agentic planning, multi-agent delegation, and real-time visual execution tracking.
The Planner Agent v3 is a state-of-the-art autonomous system designed for Open WebUI. It transforms complex user requests into structured, executable plans, delegating specialized tasks to a fleet of subagents while providing interactive feedback and visual progress updates.
Key Features
- Agentic Planning & Self-Correction: Automatically decomposes high-level goals into a dependency-aware task tree with user-in-the-loop approval and adaptive rescheduling.
- Parallel Execution (v15+): Runs tool calls and subagent tasks concurrently using asyncio.gather, so multiple independent tasks can be performed simultaneously.
- Robust State Persistence: Automatically saves and recovers task states, results, and subagent histories across chat turns via attached JSON files.
- Native OWUI Integration:
  - User Skills: Automatically resolves available skills and injects them so the model (Planner and custom Workspace models) can query them.
  - Knowledge Bases & RAG: Direct integration with OWUI knowledge bases, notes, and user memory via the knowledge_agent.
  - Custom Functions & Tools: Full support for user-created Python tools, imported tools, and external OpenAPI/DB tools.
  - MCP Servers: Extended support for Model Context Protocol (MCP) servers with connection deduplication and resilience.
  - Terminal Integration: Full interactive terminal access for shell-based tasks and file management (requires terminal_agent).
  - Native Tool Parity: Intelligently inherits built-in tool capabilities (Web Search, Image Gen, etc.) when specialized subagents are disabled.
- Specialized Built-in Subagents:
  - Web Search Agent: Autonomous research with source synthesis and citation handling.
  - Image Gen Agent: High-quality creation using OWUI's native image middleware.
  - Knowledge Agent: Context-aware RAG from your documents and user memory.
  - Code Interpreter Agent: Secure Python execution for data science and automation.
  - Terminal Agent: Direct system access for technical task execution.
- MCP Resilience System: Full Model Context Protocol (MCP) support with built-in parallelism patches and connection deduplication to prevent deadlocks.
- Interactive UI Modals: Native UI components for ask_user, give_options, and plan_approval allow the agent to request clarification or confirmation.
- Visual Execution Tracker: Real-time HTML interface showing live task status (Pending, In-Progress, Completed, Failed).
Configuration (Valves)
[!IMPORTANT] Model ID & Feature Configuration
- Base Models: Found in Admin Panel > Settings > Models. These are the raw model IDs (e.g., qwen2.5:7b, gpt-4o).
  - Essential for: PLANNER_MODEL (Mandatory).
  - Fallback Support: REVIEW_MODEL, TERMINAL_AGENT_MODEL, and all Virtual Agent Models will fall back to the PLANNER_MODEL if left blank. However, if specified, they must be Base Models (not workspace presets).
- Workspace Models (Presets): Found in Workspace > Models. These are custom presets with specific personas and settings.
  - Used for: SUBAGENT_MODELS. This is where you configure specific Knowledge Base access, custom tool features, skills, and specialized system prompts for your subagents.
Parallel Execution (New)
Planner Agent v3 supports parallel execution of tool calls and subagent calls. This significantly improves performance when multiple independent tasks can be performed simultaneously.
- PARALLEL_TOOL_EXECUTION: When enabled, the planner executes all identified tool calls (including subagent calls) in parallel.
- PARALLEL_SUBAGENT_EXECUTION: When enabled, subagents execute their internal tool calls (search, code interpreter, etc.) in parallel.
[!WARNING] Parallel execution may lead to external race conditions if tools have stateful dependencies within the same turn (e.g., one tool depends on a file created by another tool in the same turn). Use with caution for complex, inter-dependent workflows. Most standard search and generation tasks are independent and safe for parallelism. Subagent task interdependence and the pipe's async state are heavily guarded and safe, but you are responsible for any effects on external services. If you enable full parallelism, you might need an async database to avoid deadlocks and slowdowns with a large number of subagents.
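As a rough illustration of what a parallel-execution toggle changes, the sketch below runs the same stand-in tool calls either concurrently with asyncio.gather or one after another. run_tool and execute_calls are hypothetical names and the sleeps stand in for real tool latency; this is not the planner's actual scheduler.

```python
import asyncio


async def run_tool(name: str, delay: float) -> str:
    """Stand-in for a tool or subagent call; the sleep simulates I/O wait."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def execute_calls(calls, parallel: bool):
    if parallel:
        # All calls start together; results come back in input order.
        # return_exceptions=True keeps one failure from cancelling the rest.
        return await asyncio.gather(
            *(run_tool(name, delay) for name, delay in calls),
            return_exceptions=True,
        )
    # Sequential fallback: each call waits for the previous one
    return [await run_tool(name, delay) for name, delay in calls]


calls = [("web_search", 0.02), ("image_gen", 0.01)]
results = asyncio.run(execute_calls(calls, parallel=True))
print(results)
```

Note that gather preserves the order of its arguments, not the order of completion, which is why the race conditions described above concern external side effects rather than the returned list.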
Model & Subagent Setup
- PLANNER_MODEL: The primary "brain" model for planning and orchestration (Mandatory).
- SUBAGENT_MODELS: Comma-separated list of specialized models or Workspace Model presets for delegation. Best for Knowledge Base access and custom personas.
- WORKSPACE_TERMINAL_MODELS: List of model IDs allowed to use the local terminal environment, overriding the default virtual terminal agent check.
- SUBAGENT_TIMEOUT: Global timeout for subagent and MCP tool calls to prevent bottlenecks.
Interaction & Control
- ENABLE_PLAN_APPROVAL: Pause for user review before starting any tasks.
- YOLO_MODE: Fully autonomous mode: disables iteration limits and confirmation gates.
- TASK_ITERATION_LIMIT: Global safety cap to prevent infinite agentic loops.
- ENABLE_USER_INPUT_TOOLS: Toggle availability of interactive UI modals (ask_user, give_options).
Tool Inheritance & Virtual Agents
The Planner V3 features a smart tool inheritance logic:
- Delegation Mode: If a Virtual Agent (e.g., web_search_agent) is enabled in the Planner Valves, the planner will delegate tasks to that specialized subagent using its own configuration.
- Inherent Mode: If a Virtual Agent is disabled, the Planner itself "inherits" those capabilities (if the Planner's Base Model/Admin tool settings allow it) and performs the task directly without delegation.
Visual Walkthrough
Screencast of Planner V3 in action: Automated planning, subagent execution, and final multi-media synthesis.
Real-time monitoring of subagent tasks and planning progress.
Extensive configuration options to tailor the agentic behavior.
Autonomous agents requesting user choice through interactive UI modals.
Deep visibility into the agent's reasoning process and tool interactions.
Final output synthesis leveraging specialized subagents (e.g., Music Generation & HTML Layout).
arXiv Research MCTS Pipe
Description
Search arXiv.org for relevant academic papers and iteratively refine a research summary using a Monte Carlo Tree Search (MCTS) approach.
Configuration
- model: The model ID from your LLM provider
- tavily_api_key: Required. Obtain your API key from tavily.com
- max_web_search_results: Number of web search results to fetch per query
- max_arxiv_results: Number of results to fetch from the arXiv API per query
- tree_breadth: Number of child nodes explored per MCTS iteration
- tree_depth: Number of MCTS iterations
- exploration_weight: Controls balance between exploration and exploitation
- temperature_decay: Exponentially decreases LLM temperature with tree depth
- dynamic_temperature_adjustment: Adjusts temperature based on parent node scores
- maximum_temperature: Initial LLM temperature (default 1.4)
- minimum_temperature: Final LLM temperature at max tree depth (default 0.5)
Usage
- Example:
  Do a research summary on "DPO laser LLM training"
Example of arXiv Research MCTS Pipe output
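To give a feel for how exploration_weight and the temperature valves could interact, here is a small sketch using the standard UCT selection score and one possible exponential temperature schedule. Both are assumptions for illustration: the pipe's actual scoring and decay formulas are not documented here, and uct_score and temperature_at_depth are hypothetical names.

```python
import math


def uct_score(total_value, visits, parent_visits, exploration_weight=1.0):
    """Standard UCT: average value plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_value / visits
    explore = exploration_weight * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def temperature_at_depth(depth, tree_depth,
                         maximum_temperature=1.4, minimum_temperature=0.5):
    """Geometric interpolation from the maximum (depth 0) to the minimum."""
    if tree_depth <= 0:
        return maximum_temperature
    fraction = min(depth, tree_depth) / tree_depth
    return maximum_temperature * (minimum_temperature / maximum_temperature) ** fraction


for depth in range(4):
    print(depth, round(temperature_at_depth(depth, tree_depth=3), 3))
```

A larger exploration_weight raises the bonus for rarely visited nodes, so the search spreads out; a smaller one concentrates on the best-scoring summaries so far.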
Multi Model Conversations v2 Pipe
Description
An advanced multi-model conversation system that enables interactive, multi-agent discussions with a custom configuration UI. It offers feature parity with the latest Open WebUI capabilities, including tool support, reasoning tag handling (thinking blocks), and dynamic speaker management. Configure up to 5 participants with unique personas and models, and use the optional Group Chat Manager to orchestrate the discussion flow.
Configuration
Version 2 introduces a sophisticated Configuration Overlay that allows you to set up your multi-agent conversation visually. It still supports User Valves for defaults, but the primary way to configure a chat is through the interactive UI.
Key Features:
- Dynamic Speaker Selection: Enables or disables the Group Chat Manager.
- Model-Specific Prompts: Set unique system messages for each participant.
- Tool Integration: Models can now use available tools within the conversation.
- Reasoning Support: Full support for "thinking" models with collapsible reasoning blocks.
Core Settings:
- NUM_PARTICIPANTS: Set the number of participants (1-5)
- ROUNDS_PER_CONVERSATION: Total rounds of replies in the conversation
- UseGroupChatManager: Enable dynamic speaker selection by a manager model
Per-Participant Configuration:
- Participant[1-5]Model: Model for each participant
- Participant[1-5]Alias: Display name for each participant
- Participant[1-5]SystemMessage: Persona and instructions for each participant
Accessing the Configuration UI
To configure the conversation:
- Select the Pipe: Choose "Multi Model Conversations v2 Pipe" as your model.
- Open Configuration: Click the settings icon (list icon in a new message) in the chat input area OR look for the Configuration Overlay that appears when starting a new chat.
- Configure agents: Set your models, aliases and system prompts.
- Save and Start: Click "Start Conversation" to begin the multi-agent session.
Example of Multi Model Conversations User Valves configuration panel
Example of Multi Model Conversations Setup Popup
Usage
- Example:
  Start a conversation between three AI agents about climate change.
Use Cases:
- Debates: Set up opposing viewpoints (optimist vs. skeptic)
- Brainstorming: Multiple creative perspectives on a problem
- Role-playing: Interactive storytelling with multiple characters
- Analysis: Different analytical approaches to the same topic
- Expert Panels: Simulate domain experts discussing a complex issue
Resume Analyzer Pipe
Description
Analyze resumes and provide tags, first impressions, adversarial analysis, potential interview questions, and career advice.
Configuration
- model: The model ID from your LLM provider
- dataset_path: Local path to the resume dataset CSV file
- rapidapi_key (optional): For job search functionality
- web_search: Enable/disable web search for relevant job postings
- prompt_templates: Customizable templates for all steps
Usage
- Requires the Full Document Filter (see below) to work with attached files.
- Example:
Analyze this resume:
[Attach resume file]
Screenshots of Resume Analyzer Pipe output
Mopidy Music Controller
Description
Control your Mopidy music server to play songs from the local library or YouTube, manage playlists, and handle various music commands. This pipe provides an intuitive interface for music playback, search, and playlist management through natural language commands.
Requirements: This pipe requires Mopidy-Iris to be installed for the player interface. Iris provides a beautiful, feature-rich web interface for controlling Mopidy.
Configuration
- model: The model ID from your LLM provider
- mopidy_url: URL for the Mopidy JSON-RPC API endpoint (default: http://localhost:6680/mopidy/rpc) - Iris UI must be installed
- youtube_api_key: YouTube Data API key for search functionality
- temperature: Model temperature (default: 0.7)
- max_search_results: Maximum number of search results to return (default: 5)
- system_prompt: System prompt for request analysis
Prerequisites
- Mopidy Server: Install and configure Mopidy
- Mopidy-Iris: Install the Iris web interface:
  pip install Mopidy-Iris
- Optional Extensions:
  - Mopidy-Local (for local library)
  - Mopidy-YouTube (for YouTube playback)
Usage
- Example:
  Play the song "Imagine" by John Lennon
- Quick text commands: stop, halt, play, start, resume, continue, next, skip, pause
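The quick text commands map naturally onto Mopidy's JSON-RPC playback methods. The sketch below shows one plausible mapping and the request body that would be POSTed to mopidy_url; the grouping of synonyms is an assumption, and QUICK_COMMANDS and build_rpc_payload are hypothetical names rather than the pipe's own.

```python
import json

# Assumed synonym grouping; the core.playback.* methods are Mopidy's core API
QUICK_COMMANDS = {
    "stop": "core.playback.stop", "halt": "core.playback.stop",
    "play": "core.playback.play", "start": "core.playback.play",
    "resume": "core.playback.resume", "continue": "core.playback.resume",
    "next": "core.playback.next", "skip": "core.playback.next",
    "pause": "core.playback.pause",
}


def build_rpc_payload(command: str, request_id: int = 1):
    """Return a JSON-RPC 2.0 body for a quick command, or None if unknown."""
    method = QUICK_COMMANDS.get(command.strip().lower())
    if method is None:
        return None  # not a quick command; let the LLM analyze the request
    return {"jsonrpc": "2.0", "id": request_id, "method": method}


print(json.dumps(build_rpc_payload("Skip")))
print(build_rpc_payload('Play the song "Imagine" by John Lennon'))
```

Anything that is not an exact quick command returns None here, which is where natural-language analysis by the configured model would take over.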
Features
- Natural Language Control: Use conversational commands to control playback
- YouTube Integration: Search and play songs directly from YouTube
- Local Library Support: Access and play songs from your local Mopidy library
- Playlist Management: Create, modify, and manage playlists
- Iris UI Integration: Beautiful, professional web interface with full playback controls
- Seamless Embedding: Iris player embedded directly in Open WebUI chat interface
Example of Mopidy Music Controller Pipe with Iris UI (v0.7.0)
Letta Agent Pipe
Description
Connect with Letta agents, enabling seamless integration of autonomous agents into Open WebUI conversations. Supports task-specific processing and maintains conversation context while communicating with the agent API.
Configuration
- agent_id: The ID of the Letta agent to communicate with
- api_url: Base URL for the Letta agent API (default: http://localhost:8283)
- api_token: Bearer token for API authentication
- task_model: Model to use for title/tags generation tasks
- custom_name: Name of the agent to be displayed
- timeout: Timeout to wait for Letta agent response in seconds (default: 400)
Usage
- Example:
  Chat with the built in Long Term memory Letta MemGPT agent.
Example of Letta Agent Pipe
OpenRouter Image Pipe
Description
An adapter pipe for the OpenRouter API that enables streaming, multi-modal chat completions with built-in websearch and image generation support. This pipe focuses on image generation capabilities and web search integration, with no support for external tools and streaming-only completions. Images are automatically saved to the Open WebUI backend and URLs are emitted for stable access.
Configuration (Valves)
- API_KEY (str): OpenRouter API key (Bearer token)
- ALLOWED_MODELS (List[str]): List of allowed model slugs (only these models can be invoked by the pipe)
- USE_WEBSEARCH (bool): Enable the web search plugin globally or enable per-model by appending :online to the model id
- USE_IMAGE_EMBEDDING (bool): When True the pipe will emit generated images as HTML <img> embeds; otherwise images are emitted as markdown links
Features
- Streaming text deltas to the client in real-time (low-latency partial responses)
- Emits structured reasoning details when available from the model
- Saves base64 image responses to the Open WebUI files backend and returns stable URLs (with cache-busting timestamps)
- Built-in websearch integration for enhanced responses
- Model capability detection (queries OpenRouter models endpoint to find supported modalities and adapts payloads automatically)
- No support for external tools - focused on core image generation and websearch functionality
Usage
Copy functions/openrouter_image_pipe.py into your Open WebUI Functions and enable it in your workspace. The pipe registers ids in the format openrouter-<model>-pipe (for example: openrouter-openai/gpt-4o-pipe). When invoked it will stream messages/events back to the Open WebUI frontend using the event emitter API.
Example:
"Explain this image"
"Web search recent news about Argentina and make an image about it"
Example screenshots
Below are example screenshots showing the pipe in action inside Open WebUI: streaming assistant text, vision-capable model input/output, and generated images.
Example: image generation with websearch integration.
OpenRouter WebSearch Citations Filter
Description
Enables web search for OpenRouter models by adding plugins and options to the request payload. This filter provides a UI toggle to use OpenRouter's native websearch with proper citation handling. It processes web search results and emits structured citation events for proper source attribution in Open WebUI.
Configuration (Valves)
- engine (str): Web search engine - "auto" (automatic selection), "native" (provider's built-in), or "exa" (Exa API)
- max_results (int): Maximum number of web search results to retrieve (1-10)
- search_prompt (str): Template for incorporating web search results. Use {date} placeholder for current date.
- search_context_size (str): Search context size - "low" (minimal), "medium" (moderate), "high" (extensive)
Features
- UI toggle for enabling web search on OpenRouter models
- Automatic citation generation with markdown links using domain names
- Structured citation events for Open WebUI integration
- Flexible search engine selection (auto, native, or Exa)
- Configurable search result limits and context size
- Real-time status updates during search execution
Usage
Copy filters/openrouter_websearch_citations_filter.py into your Open WebUI Filters and enable it in your model configuration. The filter will add web search capabilities to OpenRouter models with proper citation handling.
Example search prompt template:
A web search was conducted on {date}. Incorporate the following web search results into your response.
IMPORTANT: Cite them using markdown links named using the domain of the source.
Example: [nytimes.com](https://nytimes.com/some-page).
The filter processes annotations in the response stream and emits citation events with source URLs, titles, and metadata for each web search result.
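The request-side half of the filter can be pictured as a small payload transformation. The sketch below is an approximation under stated assumptions: add_web_search is a hypothetical helper, and the plugin fields (id "web", engine, max_results, search_prompt) and the web_search_options block follow OpenRouter's web search documentation as I understand it, so verify them against the current API before relying on them.

```python
from datetime import date


def add_web_search(body: dict, engine="auto", max_results=5,
                   search_prompt=None, search_context_size="medium") -> dict:
    """Return a copy of the request body with web search enabled."""
    plugin = {"id": "web", "max_results": max(1, min(10, max_results))}
    if engine != "auto":
        plugin["engine"] = engine  # omit the key to let OpenRouter choose
    if search_prompt:
        # Fill the {date} placeholder with today's date
        plugin["search_prompt"] = search_prompt.replace(
            "{date}", date.today().isoformat())
    patched = dict(body)  # shallow copy so the original request is unchanged
    patched["plugins"] = [*body.get("plugins", []), plugin]
    patched["web_search_options"] = {"search_context_size": search_context_size}
    return patched


request = {"model": "openai/gpt-4o", "messages": []}
patched = add_web_search(request, engine="exa", max_results=25,
                         search_prompt="A web search was conducted on {date}.")
print(patched["plugins"])
```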
Filters
Doodle Paint Filter
Description
Toggleable filter that opens a paint canvas before sending each message, letting you attach a hand-drawn sketch to your prompt. Perfect for visually explaining concepts, requesting changes to UI drafts, or adding a personal touch to your AI interactions.
Features
- Integrated Canvas: Opens a sleek, fullscreen paint canvas directly within your Open WebUI space.
- Rich Tools: Includes a pen, eraser, color palette, custom color picker, brush size adjustment, clear canvas, and undo/redo functionality.
- Native Persistence: Uses Open WebUI's native Chats model so generated doodles permanently attach to the user's message body, persisting seamlessly across the entire conversation history instead of as hacky assistant attachments.
Usage
- Enable the Filter: Turn on the Doodle Paint filter within your model configuration or parameters.
- Send a Message: Type your message and hit send.
- Draw: A beautiful fullscreen Doodle Paint canvas will automatically appear. Draw your sketch!
- Attach: Click "Attach & Send" to append the drawing to your message (or "Skip" to send text-only).
Sending a prompt triggers the doodle paint canvas if active
Fullscreen paint canvas overlay
Final interaction with the AI model
Prompt Enhancer Filter
Description
Uses an LLM to automatically improve the quality of your prompts before they are sent to the main language model.
Configuration
- user_customizable_template: Tailor the instructions given to the prompt-enhancing LLM
- show_status: Displays status updates during the enhancement process
- show_enhanced_prompt: Outputs the enhanced prompt to the chat window
- model_id: Select the specific model to use for prompt enhancement
Usage
- Enable in your model configuration's filters section.
- Toggle the filter on or off as needed in chat settings.
- The filter will automatically process each user message before it's sent to the main LLM.

Semantic Router Filter
Description
Acts as an intelligent model router that analyzes the user's message and available models, then automatically selects the most appropriate model, pipe, or preset for the task. Features vision model filtering, dynamic vision re-routing, conversation persistence, knowledge base integration, and robust file handling with Open WebUI's RAG system.
The filter uses an innovative invisible text marker system to persist routing decisions across conversation turns. When a model is selected, the filter emits zero-width unicode characters in the first assistant message. These markers are invisible to the LLM (stripped before processing) but persist in the chat database, ensuring the same model, tools, and knowledge bases are used throughout the entire conversation without requiring metadata or system message manipulation.
The filter automatically detects when images are added to an existing conversation and intelligently re-routes to a vision-capable model if the current model lacks vision support. This enables seamless transitions from text-only conversations to image-based interactions without manual model switching.

Configuration Valves
- vision_fallback_model_id: Fallback model for image queries when no vision-capable models are available
- banned_models: List of model IDs to exclude from routing selection
- allowed_models: List of model IDs to whitelist (when set, only these models will be considered)
- router_model_id: Specific model to use for routing decisions (leave empty to use current model)
- system_prompt: System prompt for the router model (customizable)
- disable_qwen_thinking: Append /no_think to router prompt for Qwen models
- show_reasoning: Display routing reasoning in chat
- status: Show status updates in chat
- debug: Enable debug logging
Features
- Conversation Persistence: Routes only on first user message, then automatically maintains the selected model throughout the conversation using invisible text markers
- Dynamic Vision Re-Routing: Automatically detects when images are added mid-conversation and re-routes to a vision-capable model if the current model lacks vision support
- Vision Model Filtering: Automatically filters model selection to only vision-capable models when images are detected in the conversation (checks meta.capabilities.vision flag)
- Smart Fallback: Uses vision_fallback_model_id only when no vision models are available in the filtered list
- Knowledge Base Integration: Properly handles files from knowledge collections with full RAG retrieval support
- Tool Preservation: Maintains model-specific tools across conversation turns
- File Structure Compliance: Passes files in correct INPUT format to Open WebUI's get_sources_from_items() for proper RAG processing
- Whitelist Support: Use allowed_models to restrict selection to specific models only
- Cross-Backend Compatibility: Automatically converts payloads between OpenAI and Ollama formats when routing between different backend types
- Automatic Fallback: Gracefully handles errors by falling back to the original model
Usage
- Enable in your model configuration's filters section
- Configure vision_fallback_model_id to specify a fallback model for image queries
- Optionally set allowed_models to create a whitelist of preferred models, or use banned_models to exclude specific ones
- The filter will automatically:
- Route on the first user message only (analyzes task requirements and available models)
- Emit an invisible marker that persists the routing decision in chat history
- Detect and restore routing on subsequent messages in the conversation
- Re-route dynamically when images are added to a conversation if the current model lacks vision capability
- Detect images in conversations and filter to vision-capable models when present
- Preserve the selected model's tools and knowledge bases throughout the conversation
- Attach relevant files from knowledge collections with proper RAG retrieval
- Convert payloads between OpenAI and Ollama formats as needed
How It Works
First Message (Routing):
- Analyzes user message and available models
- Filters to vision-capable models if images are detected
- Routes to the best model for the task
- Emits invisible unicode marker (e.g., model-id wrapped in zero-width characters) in first assistant message
- Preserves model's tools, knowledge bases, and configuration
Subsequent Messages (Persistence):
- Detects invisible marker in conversation history
- Extracts persisted model ID
- Checks if images are present but current model lacks vision capability (NEW)
- If vision mismatch detected, triggers fresh routing with vision filter
- Otherwise, reconstructs full routing (model + tools + knowledge + metadata)
- Strips marker from message content (invisible to LLM)
- Continues conversation with same model and configuration
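The marker mechanism described above can be sketched as an encode, decode, and strip trio. The encoding below (one zero-width character per bit, framed by two other invisible characters) is purely an assumption chosen for illustration; the filter's real marker format may differ, and all names here are hypothetical.

```python
import re

ZERO, ONE = "\u200b", "\u200c"   # zero-width space / zero-width non-joiner
START, END = "\u2060", "\u200d"  # word joiner / zero-width joiner as frame
MARKER = re.compile(f"{START}([{ZERO}{ONE}]+){END}")


def encode_marker(model_id: str) -> str:
    """Encode the model ID as an invisible run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in model_id.encode("utf-8"))
    return START + "".join(ONE if bit == "1" else ZERO for bit in bits) + END


def decode_marker(text: str):
    """Return the persisted model ID, or None if no marker is present."""
    match = MARKER.search(text)
    if match is None:
        return None
    bits = "".join("1" if ch == ONE else "0" for ch in match.group(1))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def strip_marker(text: str) -> str:
    """Remove the marker so the LLM never sees it."""
    return MARKER.sub("", text)


reply = "Quantum physics studies..." + encode_marker("llama3.2:latest")
print(decode_marker(reply))
print(strip_marker(reply) == "Quantum physics studies...")
```

Because the marker lives in the stored message text, it survives reloads of the chat without any extra metadata, which is the property the filter relies on.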
Dynamic Vision Re-Routing Example:
User: "Explain quantum physics"
→ Router selects text model (e.g., llama3.2:latest)
User: "Thanks! Now what's in this picture?" [attaches image]
→ Filter detects: images present + current model lacks vision
→ Automatically triggers re-routing with vision filter
→ Router selects vision model (e.g., llama3.2-vision:latest)
→ Vision model processes image and responds
How Vision Filtering Works
When images are detected in the conversation:
- Filter checks all available models for the meta.capabilities.vision flag
- Only vision-capable models are included in the routing selection
- If no vision models are found, uses vision_fallback_model_id as fallback
- Router model receives images for contextual routing decisions
- If the router model doesn't support vision, automatically switches to the vision fallback for routing
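The candidate selection described above could look roughly like the sketch below. The model dictionary shape (`{"id": ..., "meta": {"capabilities": {"vision": ...}}}`) is assumed from the flag name, and the helper names `has_images`, `is_vision_capable`, and `candidate_models` are hypothetical. Image detection covers OpenAI-style `image_url` content parts and Ollama-style `images` lists.

```python
def has_images(messages: list) -> bool:
    """Return True if any message carries an image (OpenAI or Ollama style)."""
    for message in messages:
        content = message.get("content")
        if isinstance(content, list) and any(
            isinstance(part, dict) and part.get("type") == "image_url"
            for part in content
        ):
            return True
        if message.get("images"):
            return True
    return False


def is_vision_capable(model: dict) -> bool:
    """Check the (assumed) meta.capabilities.vision flag on a model entry."""
    capabilities = (model.get("meta") or {}).get("capabilities") or {}
    return bool(capabilities.get("vision"))


def candidate_models(models: list, messages: list, vision_fallback_model_id: str) -> list:
    """Return the model IDs the router may choose from."""
    if not has_images(messages):
        return [model["id"] for model in models]
    vision_ids = [model["id"] for model in models if is_vision_capable(model)]
    # With no vision-capable model available, fall back to the configured ID.
    return vision_ids or [vision_fallback_model_id]
```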

Full Document Filter
Description
Allows Open WebUI to process entire attached files (such as resumes or documents) as part of the conversation. Cleans and prepends the file content to the first user message, ensuring the LLM receives the full context.
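As a rough illustration of the clean-and-prepend step, the sketch below normalizes whitespace in the extracted file text and places it ahead of the first user message. The cleaning rules and the function names `clean_text` and `prepend_to_first_user_message` are assumptions for this example, and it assumes message content is a plain string; the filter's real cleaning logic may differ.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse runs of spaces/tabs and squeeze repeated blank lines."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


def prepend_to_first_user_message(messages: list, file_text: str) -> list:
    """Return a copy of messages with the cleaned file text injected."""
    cleaned = clean_text(file_text)
    result = [dict(message) for message in messages]
    for message in result:
        if message.get("role") == "user":
            message["content"] = f"{cleaned}\n\n{message['content']}"
            break  # only the first user message receives the document
    return result
```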
Configuration
- priority (int): Priority level for the filter operations (default: 0)
- max_turns (int): Maximum allowable conversation turns for a user (default: 8)
User Valves
- max_turns (int): Maximum allowable conversation turns for a user (default: 4)
Usage
- Enable the filter in your model configuration.
- When you attach a file in Open WebUI, the filter automatically cleans and injects the file content into your message.
- No manual configuration is needed for most users.
- Example:
Analyze this resume: [Attach resume file]
Clean Thinking Tags Filter
Description
Checks if an assistant's message ends with an unclosed or incomplete "thinking" tag. If so, it extracts the unfinished thought and presents it as a user-visible message.
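A minimal sketch of the detection step is shown below. It assumes the tags are `<think>` and `</think>`, which the description does not specify, and the function name `extract_unclosed_thought` is hypothetical. It only covers the unclosed-tag case.

```python
from typing import Optional

# Assumed tag names; the filter may recognize other variants.
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def extract_unclosed_thought(content: str) -> Optional[str]:
    """Return the text after the last opening tag if it was never closed."""
    last_open = content.rfind(OPEN_TAG)
    if last_open == -1:
        return None  # no thinking block at all
    if content.find(CLOSE_TAG, last_open) != -1:
        return None  # the last thinking block is properly closed
    thought = content[last_open + len(OPEN_TAG):].strip()
    return thought or None
```

When the function returns text, a filter along these lines could replace the assistant message content with that text so the user sees the unfinished thought instead of an empty reply.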
Configuration
- No configuration required.
Usage
- Works automatically when enabled.
Using the Provided ComfyUI Workflows
Importing a Workflow
- Open ComfyUI.
- Click the "Load Workflow" or "Import" button.
- Select the provided JSON file (e.g., ace_step_api.json or flux_context_owui_api_v1.json).
- Save or modify as needed.
- Use the node numbers in your Open WebUI tool configuration.
Best Practices
- Always check node numbers after importing, as they may change if you modify the workflow.
- You can create and share your own workflows by exporting them from ComfyUI.
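To check node numbers without opening the editor, a short script can list them from an API-format workflow export, where each top-level key is a node ID and each value holds a `class_type` and `inputs`. The sketch below uses a small inline sample rather than one of the provided files; the sample content and the helper name `list_nodes` are illustrative assumptions.

```python
import json


def list_nodes(workflow: dict) -> list:
    """Return (node_id, class_type) pairs sorted by numeric node ID."""
    nodes = [
        (node_id, node.get("class_type", "?"))
        for node_id, node in workflow.items()
        if isinstance(node, dict)
    ]
    return sorted(
        nodes,
        key=lambda item: (int(item[0]) if item[0].isdigit() else float("inf"), item[0]),
    )


# Illustrative sample only; for a real file use:
#   with open("ace_step_api.json") as handle:
#       workflow = json.load(handle)
sample = json.loads("""
{
  "10": {"class_type": "SaveImage", "inputs": {}},
  "3": {"class_type": "KSampler", "inputs": {}},
  "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "a horse"}}
}
""")

for node_id, class_type in list_nodes(sample):
    print(node_id, class_type)
```

Running this after each workflow change makes it easier to spot when a node ID referenced in the tool configuration no longer matches.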
Why this matters
This approach allows you to leverage state-of-the-art image and music generation/editing models with full control and customization, directly from Open WebUI.
Installation
From Open WebUI Hub (Recommended)
- Visit https://openwebui.com/u/haervwe
- Click "Get" for the desired tool, pipe, or filter.
- Follow the prompts in your Open WebUI instance.
Manual Installation
- Copy .py files from tools/, functions/, or filters/ into Open WebUI via the Workspace > Tools/Functions/Filters section.
- Provide a name and description, then save.
Contributing
Feel free to contribute to this project by:
- Forking the repository
- Creating your feature branch
- Committing your changes
- Opening a pull request
License
MIT License
Credits
- Developed by Haervwe
- Credit to the amazing teams behind:
- And all model trainers out there providing these amazing tools.
Contributors
- Adriaan Knapen
- Ampersandru
- Florian Euler
- Hristo Karamanliev
- iChristGit
- Ikko Eltociear Ashimine
- rahxam
- Tan Yong Sheng
- The JSN
- Zed Unknown
Usage Examples
Academic Research
# Search for recent papers on a topic
Search for recent papers about "large language model training"
# Conduct comprehensive research
Do a research summary on "DPO laser LLM training"
Creative Projects
# Generate images
Create an image of "beautiful horse running free"
# Create music
Generate a song in the style of "funk, pop, soul" with lyrics: "In the shadows where secrets hide..."
# Edit images
Edit this image to look like a medieval fantasy king, preserving facial features
Productivity
# Analyze documents
Analyze this resume: [Attach resume file]
# Plan complex tasks
Create a fully-featured Single Page Application (SPA) for Conway's Game of Life
Multi-Agent Conversations
# Start group discussions
Start a conversation between three AI agents about climate change
Community & Ecosystem
This collection is part of the broader Open WebUI ecosystem. Here's how you can get involved:
- Open WebUI Hub: Discover more tools at openwebui.com
- Documentation: Learn more about Open WebUI at docs.openwebui.com
- Ideas: Share your ideas and feature requests
- Bug Reports: Help improve the tools by reporting issues
- Star the Repository: Show your support by starring this repo
Support
For issues, questions, or suggestions, please open an issue on the GitHub repository.

