Realtime AI Audio for Home Assistant
A Home Assistant custom component that integrates with OpenAI's Realtime API and Google's Gemini Live API for real-time voice and text conversations, with MCP (Model Context Protocol) server support.
🎯 Included Integrations
| Integration | API | Voice Model |
|---|---|---|
| OpenAI Realtime | OpenAI Realtime API | GPT-4o Realtime |
| Gemini Live | Google Gemini Live API | Gemini 2.0 Flash |
Both integrations provide native speech-to-speech capabilities with minimal latency.
Features
Common Features (Both Integrations)
- Real-time Conversations: WebSocket-based low-latency responses
- Native Speech-to-Speech: Direct audio processing without separate STT/TTS pipeline
- Voice Support: Multiple voice options with configurable settings
- Home Assistant Integration: Built-in tools for controlling smart home devices
- Conversation Agent: Works as a Home Assistant conversation agent
- Media Player Entity: Control audio input/output directly
- Binary Sensors: Monitor connection, listening, speaking, and processing states
- Custom Lovelace Card: Browser-based microphone with real-time visualizer
OpenAI Realtime Specific
- MCP Server Integration: Connect to external MCP servers for extended tool capabilities
- Custom STT/TTS Providers: Use Realtime API for speech recognition and synthesis
Gemini Live Specific
- Session Resumption: Automatic session recovery on disconnection
- Image/Audio File Input: Send images and audio files for multimodal conversations
- Google Search Integration: Built-in Google Search tool
Privacy & Personalization
This integration offers an optional "personalization" feature that can improve and tailor AI responses by using conversation content to adapt behavior over time. For privacy reasons, personalization is disabled by default.
- What enabling personalization does: The integration may send additional conversation content or metadata to the external AI service to allow it to provide more personalized responses.
- Default: Off. You must explicitly enable it during configuration and confirm that you accept the privacy implications.
- Recommendation: Keep personalization disabled unless you understand and accept the data handling implications and trust the service provider.
If you enable personalization, review your service provider's privacy policy and data retention practices.
Architecture
Unlike the default Home Assistant voice pipeline (STT → AI → TTS), these integrations use native speech-to-speech APIs:
```
┌─────────────────────────────────────────────────────┐
│                 Default HA Pipeline                 │
│  ┌─────┐   ┌────────────┐   ┌────┐   ┌─────┐        │
│  │ Mic │──▶│ STT Engine │──▶│ AI │──▶│ TTS │──▶ 🔊  │
│  └─────┘   └────────────┘   └────┘   └─────┘        │
└─────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────┐
│            OpenAI / Gemini Live Pipeline            │
│  ┌─────┐   ┌───────────────────────────┐            │
│  │ Mic │──▶│       Realtime API        │──▶ 🔊      │
│  └─────┘   │ (Native Speech-to-Speech) │            │
│            └───────────────────────────┘            │
└─────────────────────────────────────────────────────┘
```
Requirements
- Home Assistant 2024.1.0 or later
- For OpenAI Realtime: OpenAI API key with access to the Realtime API
- For Gemini Live: Google AI API key (Gemini API)
- Python 3.11 or later
NOTE: Detailed configuration and platform-specific documentation has been moved into each integration folder. See:
- `custom_components/openai_realtime` – OpenAI Realtime integration files and component-specific notes.
- `custom_components/gemini_live/GOOGLE_DOC.md` – Gemini Live specific guidance and examples.
The sections below provide a concise quick-config summary for each integration. For full details and advanced options, open the files in the corresponding component folders above.
Quick Configuration (Minimal examples)
Below are compact configuration snippets and key settings you may need when setting up each integration. These are meant as a quick reference; see the component folders for extended examples and edge-case options.
OpenAI Realtime - Quick Config
Core options exposed in the integration UI or via YAML when applicable:
- `api_key`: Your OpenAI API key with Realtime access
- `model`: Realtime model (example: `gpt-4o-realtime-preview`)
- `voice`: Choose a voice (example: `alloy`)
- `temperature`: Float 0.0–2.0
- `mcp_servers`: List of MCP server configs (SSE or Stdio)
Example minimal YAML for an MCP server entry (SSE):
```yaml
# OpenAI Realtime MCP server example
- name: homeassistant
  url: http://localhost:8123/api/mcp
  type: sse
  token: YOUR_LONG_LIVED_ACCESS_TOKEN
```
When using the integration UI, supply your api_key and configure model/voice/temperature there. Add MCP servers through the integration options.
Gemini Live - Quick Config
Core options exposed in the integration UI or via YAML when applicable:
- `api_key` / `google_api_key`: Your Google AI key for Gemini
- `model`: Gemini model (example: `gemini-2.0-flash-exp`)
- `voice`: Voice name (example: `Puck`)
- `ephemeral_token` (optional): Use for client-side auth
- `enable_session_resumption`: true/false
- `enable_affective_dialog`: true/false (v1alpha)
- `enable_proactive_audio`: true/false (v1alpha)
Example minimal settings (UI-oriented):
```yaml
# Gemini Live basic settings (example representation)
model: gemini-2.0-flash-exp
voice: Puck
enable_session_resumption: true
# optional: ephemeral_token: xxxxx
```
For advanced features (session resumption handles, proactive audio, image inputs), open the Gemini docs in the component folder: custom_components/gemini_live/GOOGLE_DOC.md
Installation
HACS (Recommended)
- Open HACS in your Home Assistant
- Click on "Integrations"
- Click the three dots in the top right corner
- Select "Custom repositories"
- Add this repository URL: `https://github.com/your-username/ha-realtime-ai-audio`
- Install "Realtime AI Audio for Home Assistant"
- Restart Home Assistant
Manual Installation
- Download the repository
- Copy both folders to your Home Assistant `custom_components` directory:
  - `custom_components/openai_realtime` – For OpenAI integration
  - `custom_components/gemini_live_audio` – For Gemini integration
- Restart Home Assistant
🔵 OpenAI Realtime Integration
Configuration
- Go to Settings → Devices & Services → Add Integration
- Search for "OpenAI Realtime"
- Enter your OpenAI API key
- Configure the settings:
  - Model: Select the Realtime model (default: `gpt-4o-realtime-preview`)
  - Voice: Choose the voice for audio responses
  - Instructions: Custom system instructions
  - Temperature: Response creativity (0.0–2.0)
  - Max Output Tokens: Maximum response length
- Optionally add MCP servers for extended functionality
MCP Server Configuration
MCP (Model Context Protocol) servers allow you to extend the AI's capabilities with external tools. This integration supports two types of MCP servers:
MCP Server Types
| Type | Description | Use Case |
|---|---|---|
| SSE | HTTP-based Server-Sent Events | Remote servers, cloud-hosted MCP services |
| Stdio | Local subprocess communication | Local tools, CLI-based MCP servers |
SSE Servers (Recommended for HASSIO)
SSE servers communicate over HTTP/HTTPS and are passed directly to OpenAI's Realtime API. This is the recommended approach for Home Assistant OS (HASSIO) installations.
To add an SSE server:
- Go to integration options → Add SSE Server
- Configure:
  - Server Name: A unique identifier (letters, numbers, underscores, hyphens only)
  - Server URL: The HTTP/HTTPS endpoint (e.g., `http://localhost:8123/api/mcp`)
  - Token (optional): Authentication token if required
Stdio Servers
Stdio servers run as local subprocesses and communicate via stdin/stdout. The integration connects to these servers locally and registers their tools as function calls.
⚠️ Important: Stdio servers require the command to be available on the Home Assistant host system.
To add a Stdio server:
- Go to integration options → Add Stdio Server
- Configure:
  - Server Name: A unique identifier
  - Command: The executable to run (e.g., `python`, `node`, `/usr/bin/my-mcp-server`)
  - Arguments: Comma-separated arguments (e.g., `-m,mcp_server,--port,3000`)
  - Environment Variables: Comma-separated key=value pairs (e.g., `API_KEY=xxx,DEBUG=true`)
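Expressed as YAML, a Stdio server entry might look like the following sketch (field names mirror the UI options above; the exact storage format is internal to the integration):

```yaml
# Hypothetical YAML view of a Stdio MCP server entry
- name: fetch_tools
  type: stdio
  command: python
  args: "-m,mcp_server,--port,3000"
  env: "API_KEY=xxx,DEBUG=true"
```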
⚠️ HASSIO / Home Assistant OS Limitations
Node.js (npx, node) commands will NOT work on Home Assistant OS (HASSIO) because:
- HASSIO is a minimal, containerized Linux environment
- Node.js is not pre-installed and cannot be easily added
- The host OS is read-only and doesn't support package installation
Workarounds for HASSIO Users:

1. Use SSE mode instead of Stdio (Recommended)

   Many MCP servers support both modes. Run the server on a separate machine with Node.js and connect via SSE:

   ```bash
   # On a machine with Node.js (not HASSIO)
   npx @anthropic/mcp-server-brightdata --transport sse --port 3000
   ```

   Then configure it as an SSE server with URL `http://your-server-ip:3000/sse`.

2. Use `uvx` with Python-based MCP servers – works in HASSIO!

   If `uv`/`uvx` is not already installed on your Home Assistant OS, you can use an add-on to install it on every boot:

   📦 ha-uv Add-on – Installs uv/uvx on Home Assistant OS

   Once installed, this integration automatically configures the required environment variables for `uv` and `uvx` commands to work in HASSIO:

   ```
   Command: uvx
   Args: mcp-server-fetch
   ```

   The integration automatically sets:

   ```
   UV_TOOL_DIR=/config/.uv/tools
   UV_CACHE_DIR=/config/.uv/cache
   TMPDIR=/config/.uv/tmp
   ```

   This ensures uvx uses the `/config` directory (which has exec permissions) instead of `/tmp` (which is mounted noexec in HASSIO).

3. Use a Python module directly

   If a package is installed in HA's Python environment:

   ```
   Command: python
   Args: -m,mcp_server_filesystem,/config
   ```

4. Run MCP servers in Docker containers

   If running HA in Docker (not HASSIO), add MCP server containers to your compose file:

   ```yaml
   mcp-server:
     image: node:20-alpine
     command: npx @anthropic/mcp-server-example --transport sse --port 3000
     ports:
       - "3000:3000"
   ```

5. Create a Home Assistant Add-on

   Build a custom add-on that includes the MCP server. The add-on runs in its own container with all dependencies.
Example MCP Server Configurations
Home Assistant's Built-in MCP Server (SSE)
Home Assistant has a built-in MCP Server integration that exposes all your entities and services to MCP clients. This is the easiest way to give the AI full access to your smart home.
Step 1: Enable the MCP Server Integration
- Add to your `configuration.yaml`:

  ```yaml
  mcp_server:
  ```

- Restart Home Assistant
- The MCP server will be available at: `http://localhost:8123/api/mcp`

  Or if using HTTPS: `https://localhost:8123/api/mcp`
For more details, see the Home Assistant MCP Server documentation.
Step 2: Configure OpenAI Realtime to Use It
When setting up or configuring the OpenAI Realtime integration:
- Go to Settings → Devices & Services → OpenAI Realtime → Configure
- Add MCP Server with:
  - Name: `homeassistant` (or any name you prefer)
  - URL: `http://localhost:8123/api/mcp`
  - Token: Create a Long-Lived Access Token:
    - Go to your profile (click your name in the sidebar)
    - Scroll to "Long-Lived Access Tokens"
    - Click "Create Token"
    - Copy the token and paste it here
Example Configuration
```yaml
# MCP Server settings in OpenAI Realtime integration
name: homeassistant
url: http://localhost:8123/api/mcp
token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...  # Your long-lived access token
```
What This Enables
With the HA MCP Server connected, the AI gains access to:
- All entity states and attributes
- All available services
- Area and device information
- Much more comprehensive control than the built-in tools alone
Note: The built-in tools (`get_entity_state`, `call_service`, etc.) still work alongside MCP servers. MCP servers provide additional capabilities.
Bright Data MCP Server (SSE - External Machine)
Bright Data MCP Server provides web scraping capabilities.
On a machine with Node.js:
```bash
npx @anthropic/mcp-server-brightdata --transport sse --port 3001
```
In OpenAI Realtime integration:
- Type: SSE
- Name: `bright_data`
- URL: `http://your-nodejs-machine:3001/sse`
- Token: Your Bright Data API key (if required)
Filesystem MCP Server (Stdio - Python)
For HA Core installations with Python available:
- Type: Stdio
- Name: `filesystem`
- Command: `python`
- Args: `-m,mcp_server_filesystem,/config`
Custom MCP Server (Stdio - Local Binary)
If you have a compiled MCP server binary:
- Type: Stdio
- Name: `my_custom_server`
- Command: `/usr/local/bin/my-mcp-server`
- Args: `--config,/config/mcp/config.yaml`
- Env: `DEBUG=true,LOG_LEVEL=info`
Managing MCP Servers
You can manage MCP servers through the integration options:
- Go to Settings → Devices & Services → OpenAI Realtime → Configure
- Choose from:
  - Add SSE Server: Add a new HTTP-based MCP server
  - Add Stdio Server: Add a new subprocess-based MCP server
  - Manage Existing Servers: Edit, enable/disable, or delete servers
- After making changes, the integration will reload automatically
MCP Server Naming Rules
Server names must match the pattern `^[a-zA-Z0-9_-]+$`:
- ✅ `home_assistant`, `bright-data`, `myServer1`
- ❌ `Home Assistant`, `my server`, `서버이름` (names with spaces or non-ASCII characters)
Spaces and special characters in names will be automatically converted to underscores.
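A minimal sketch of that sanitization, assuming a simple regex pass (the integration's actual implementation may differ):

```python
import re

# Valid server names: letters, digits, underscores, hyphens only
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")

def sanitize_server_name(name: str) -> str:
    """Replace any character outside [a-zA-Z0-9_-] with an underscore."""
    if NAME_PATTERN.match(name):
        return name  # already valid, keep as-is
    return re.sub(r"[^a-zA-Z0-9_-]", "_", name)

print(sanitize_server_name("home_assistant"))  # → home_assistant
print(sanitize_server_name("my server"))       # → my_server
```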
Built-in Home Assistant Tools
The integration provides these built-in tools for controlling Home Assistant:
get_entity_state
Get the current state of any Home Assistant entity.
"Turn on the living room light" → Checks `light.living_room` state
call_service
Call any Home Assistant service.
"Set the thermostat to 72 degrees" → `climate.set_temperature`
get_entities_by_domain
List all entities in a domain.
"What lights do I have?" → Lists all light entities
get_area_entities
Get all entities in a specific area.
"What devices are in the bedroom?" → Lists entities in the bedroom area
Usage
As Conversation Agent
- Go to Settings → Voice Assistants
- Create a new assistant or edit an existing one
- Select "OpenAI Realtime" as the conversation agent
- Use with any voice input method (Assist, voice satellites, etc.)
Using the Media Player
The integration creates a media player entity for direct audio control:
- Play: Start listening for audio input
- Stop: Stop audio processing and cancel responses
Binary Sensors
Monitor the state of the realtime connection:
| Sensor | Description |
|---|---|
| `binary_sensor.openai_realtime_connected` | WebSocket connection status |
| `binary_sensor.openai_realtime_listening` | User is speaking (VAD detected) |
| `binary_sensor.openai_realtime_speaking` | Assistant is responding |
| `binary_sensor.openai_realtime_processing` | Request is being processed |
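These sensors can drive ordinary automations. A sketch (the sensor entity IDs are from the table above; the indicator light is hypothetical):

```yaml
# Dim a (hypothetical) indicator light while the assistant is speaking
automation:
  - alias: "Indicate assistant speaking"
    trigger:
      - platform: state
        entity_id: binary_sensor.openai_realtime_speaking
        to: "on"
    action:
      - service: light.turn_on
        target:
          entity_id: light.office_indicator
        data:
          brightness_pct: 30
```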
Services
openai_realtime.send_message
Send a text message and get a response.
```yaml
service: openai_realtime.send_message
data:
  message: "What's the weather like?"
```
openai_realtime.send_audio
Send audio data directly to the API.
```yaml
service: openai_realtime.send_audio
data:
  audio_data: "<base64_encoded_pcm_audio>"
```
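The `audio_data` payload is base64-encoded raw PCM (16-bit, 24 kHz, per the Audio Configuration section below). A sketch of preparing it from a WAV file — `notify.wav` is a hypothetical input, and the file is assumed to already be 16-bit mono at 24 kHz:

```python
import base64
import wave

def wav_to_base64_pcm(path: str) -> str:
    """Read a 16-bit mono 24 kHz WAV and return base64-encoded raw PCM."""
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2, "expected 16-bit samples"
        assert wav.getframerate() == 24000, "expected 24 kHz sample rate"
        pcm = wav.readframes(wav.getnframes())
    return base64.b64encode(pcm).decode("ascii")

# audio_data = wav_to_base64_pcm("notify.wav")  # pass this to the service call
```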
openai_realtime.start_listening
Start the audio session.
```yaml
service: openai_realtime.start_listening
```
openai_realtime.stop_listening
Stop audio processing.
```yaml
service: openai_realtime.stop_listening
```
openai_realtime.add_mcp_server
Add an MCP server at runtime.
```yaml
service: openai_realtime.add_mcp_server
data:
  name: "my_server"
  url: "https://mcp.example.com"
  token: "optional_token"
```
openai_realtime.clear_conversation
Clear the conversation history.
```yaml
service: openai_realtime.clear_conversation
```
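These services compose into ordinary automations. For instance, a sketch that resets the conversation history every night (the trigger time is arbitrary):

```yaml
automation:
  - alias: "Nightly conversation reset"
    trigger:
      - platform: time
        at: "03:00:00"
    action:
      - service: openai_realtime.clear_conversation
```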
Example Commands
- "Turn on the kitchen lights"
- "What's the temperature in the living room?"
- "Set the bedroom thermostat to 68 degrees"
- "Lock all the doors"
- "What lights are on?"
Lovelace Card (Browser Microphone)
This integration includes a custom Lovelace card that captures audio directly from your browser's microphone and streams it to the OpenAI Realtime API.
Step 1: Add Lovelace Resource
The integration tries to register the card automatically, but you may need to add it manually:
- Go to Settings → Dashboards → ⋮ (three dots) → Resources
- Click Add Resource
- Enter:
  - URL: `/openai_realtime/openai-realtime-card.js?v=1` (replace `1` with any number; changing it forces a cache refresh after updates)
  - Resource type: JavaScript Module
- Click Create
Alternatively, add to your configuration.yaml:
```yaml
lovelace:
  resources:
    - url: /openai_realtime/openai-realtime-card.js
      type: module
```
Step 2: Add the Card to Dashboard
Note: This card does not support the visual editor. When you see the error "Visual editor is not supported" or `setConfig is not a function`, use the YAML editor instead.
Using YAML Editor
- Go to your dashboard and click Edit (pencil icon)
- Click + Add Card
- Scroll down and select Manual (or click the three dots and choose "Edit in YAML")
- Paste the following configuration:
```yaml
type: custom:openai-realtime-card
title: OpenAI Realtime Voice
show_transcript: true
show_waveform: true
mute_while_speaking: true
```
- Click Save
Editing an Existing Card
If you need to edit the card later:
- Click the three dots (⋮) on the card
- Select Edit
- If you see "Visual editor is not supported", click Edit in YAML
- Make your changes and save
Card Features
- Push-to-Talk: Hold the microphone button to speak
- Audio Visualization: Real-time waveform display while speaking
- Transcript: Live display of your speech and AI responses
- Audio Playback: Automatic playback of AI voice responses
Card Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `title` | string | `"OpenAI Realtime"` | Card title |
| `show_transcript` | boolean | `true` | Show conversation transcript |
| `show_waveform` | boolean | `true` | Show audio waveform visualization |
| `mute_while_speaking` | boolean | `true` | Mute microphone while the AI is speaking to prevent echo/feedback. Set to `false` to allow interrupting the AI (requires headphones or good hardware echo cancellation) |
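Putting the table together, a card configuration that overrides every option might look like this (values are illustrative):

```yaml
type: custom:openai-realtime-card
title: Kitchen Voice Assistant
show_transcript: true
show_waveform: false
mute_while_speaking: false  # allow barge-in; needs headphones or echo cancellation
```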
Browser Requirements
- Modern browser with Web Audio API support
- Microphone permissions granted
- HTTPS connection (required for microphone access)
Using with Voice Satellites
For ESP-based voice satellites, configure them to use the custom STT/TTS providers created by this integration, or use the direct WebSocket API.
Audio Configuration
The Realtime API uses PCM audio at 24kHz. The integration handles audio conversion automatically when used with Home Assistant's voice pipeline.
Supported Audio Formats
- Input: PCM 16-bit, 24kHz
- Output: PCM 16-bit, 24kHz
OpenAI Voice Options
Available voices:
- `alloy` – Neutral, balanced
- `echo` – Deep, resonant
- `fable` – Warm, storytelling
- `onyx` – Deep, authoritative
- `nova` – Youthful, energetic
- `shimmer` – Clear, expressive
- `coral` – Warm, engaging
OpenAI Pricing
OpenAI Realtime API pricing (per 1M tokens):
| Type | Input | Cached Input | Output |
|---|---|---|---|
| Text | $4.00 | $0.50 | $16.00 |
| Audio | $32.00 | $0.50 | $64.00 |
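As a rough sketch of what an exchange costs at these rates (the token counts in the example are illustrative; actual usage varies):

```python
# Prices in USD per 1M tokens, taken from the table above
PRICES = {
    "text":  {"input": 4.00,  "cached": 0.50, "output": 16.00},
    "audio": {"input": 32.00, "cached": 0.50, "output": 64.00},
}

def cost(kind: str, input_tokens: int, output_tokens: int,
         cached_tokens: int = 0) -> float:
    """Estimated USD cost for one request with the given token counts."""
    p = PRICES[kind]
    return (input_tokens * p["input"]
            + cached_tokens * p["cached"]
            + output_tokens * p["output"]) / 1_000_000

# e.g. a short audio exchange: ~2,000 input and ~1,500 output audio tokens
print(f"${cost('audio', 2000, 1500):.4f}")  # → $0.1600
```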
🟢 Gemini Live Integration
Configuration
- Go to Settings → Devices & Services → Add Integration
- Search for "Gemini Live"
- Enter your Google AI API key
- Configure the settings:
  - Model: Select the model (default: `gemini-2.0-flash-exp`)
  - Voice: Choose the voice for audio responses
  - Instructions: Custom system instructions
Gemini Voice Options
Available voices:
- `Puck` – Playful, energetic
- `Charon` – Deep, mysterious
- `Kore` – Warm, friendly
- `Fenrir` – Strong, confident
- `Aoede` – Clear, melodic
Gemini Lovelace Card
Add Lovelace Resource
- Go to Settings → Dashboards → ⋮ (three dots) → Resources
- Click Add Resource
- Enter:
  - URL: `/gemini_live/gemini-live-card.js?v=1`
  - Resource type: JavaScript Module
- Click Create
Add the Card to Dashboard
```yaml
type: custom:gemini-live-card
title: Gemini Live Voice
```
Card Features
- Push-to-Talk: Click the microphone button to start/stop speaking
- Real-time Visualizer: Live audio level visualization
- Live Transcripts: See your input and AI responses in real-time
- Text Input: Type messages instead of speaking
- File Upload: Send images and audio files for multimodal conversations
- Mute Toggle: Mute microphone while AI is speaking to prevent echo
Card Configuration Options
| Option | Type | Default | Description |
|---|---|---|---|
| `title` | string | `"Gemini Live"` | Card title |
Gemini Services
gemini_live.send_message
Send a text message and get a response.
```yaml
service: gemini_live.send_message
data:
  message: "What's the weather like?"
```
gemini_live.send_audio
Send audio data directly to the API.
```yaml
service: gemini_live.send_audio
data:
  audio_data: "<base64_encoded_pcm_audio>"
```
gemini_live.start_listening
Start the audio session.
```yaml
service: gemini_live.start_listening
```
gemini_live.stop_listening
Stop audio processing.
```yaml
service: gemini_live.stop_listening
```
Gemini Binary Sensors
| Sensor | Description |
|---|---|
| `binary_sensor.gemini_live_connected` | WebSocket connection status |
| `binary_sensor.gemini_live_listening` | User is speaking |
| `binary_sensor.gemini_live_speaking` | Assistant is responding |
| `binary_sensor.gemini_live_processing` | Request is being processed |
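As with the OpenAI sensors, these can feed automations. A sketch that sends a notification when the connection drops (`notify.notify` is Home Assistant's default notify service):

```yaml
automation:
  - alias: "Gemini Live disconnected"
    trigger:
      - platform: state
        entity_id: binary_sensor.gemini_live_connected
        to: "off"
        for: "00:01:00"
    action:
      - service: notify.notify
        data:
          message: "Gemini Live has been disconnected for one minute."
```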
Gemini Pricing
Gemini 2.0 Flash is currently in preview with generous free tier limits. Check Google AI pricing for current rates.
Troubleshooting
Enable Debug Logging
Add to your configuration.yaml:
```yaml
logger:
  default: info
  logs:
    custom_components.openai_realtime: debug
```
Then restart Home Assistant.
Key Log Messages to Look For
| Log Message | Meaning |
|---|---|
| `Connected to OpenAI Realtime API` | WebSocket connected successfully |
| `Session created: sess_XXX` | Session established with OpenAI |
| `Updating session with X tools` | Tools are being registered |
| `Session updated` | Tools registered successfully ✅ |
| `API Error: ...` | Something went wrong ❌ |
| `Function call received` | AI is calling a Home Assistant tool |
| `Function result sent to OpenAI` | Tool execution completed |
| `Registering event handlers including function_call handler` | WebSocket API handlers set up |
Common Issues
API Error: Invalid type for 'session.max_response_output_tokens'
This was a known issue where the token value was sent as a decimal. Update to the latest version.
Tools not working / AI says it did something but nothing happened
- Check logs for `API Error` messages after `Updating session with X tools`
- Look for `Session updated` – if missing, session config failed and tools weren't registered
- Verify entity IDs exist in Home Assistant
- Check for `Function call received` in logs to confirm the AI is trying to call tools
No audio playback
- Ensure your browser allows audio playback
- Check browser console for errors (F12 โ Console)
- Try refreshing the page with Ctrl+Shift+R
Microphone not working
- Ensure HTTPS is enabled (required for microphone access)
- Check browser permissions for microphone access
- Try a different browser (Chrome recommended)
"Not connected to OpenAI Realtime API"
- Check your API key is valid
- Ensure you have access to the Realtime API (not all accounts have it)
- Check your internet connection
Audio playing multiple times / overlapping
- This was fixed in recent versions; update to the latest release
- Clear browser cache and reload
MCP Server Issues
Stdio server not working on HASSIO
- Cause: Node.js (`npx`, `node`) is not available on Home Assistant OS
- Solution: Use SSE mode instead. Run the MCP server on a separate machine and connect via HTTP
MCP server "command not found"
- Cause: The command is not installed or not in PATH
- Solution:
  - Use the full path to the executable (e.g., `/usr/bin/python3` instead of `python`)
  - For Python MCP servers, ensure the module is installed: `pip install mcp-server-xxx`
MCP tools not appearing in AI responses
- Check logs for `Loading X MCP servers from config`
- For stdio servers, look for `Connected to stdio MCP server X, found Y tools`
- Verify the server is enabled in options
- Check for connection errors in logs
MCP call succeeds but no audio response
- Cause: After MCP calls, OpenAI may need a trigger to generate audio
- Solution: This is handled automatically in recent versions. Update to latest and restart.
"Server not found" error when calling MCP tool
- Cause: Server name in function call doesn't match configured server
- Solution: Check the server name for special characters. Names are sanitized (spaces → underscores)
Browser Console Debugging
- Open browser developer tools (F12)
- Go to Console tab
- Look for these messages:
  - `Subscribing to OpenAI Realtime events...` – Card is connecting
  - `Subscribed successfully` – Connection established
  - `Received event:` – Events coming from the backend
  - `Playing audio chunk` – Audio is being played
Check Integration Status
- Go to Settings → Devices & Services
- Find "OpenAI Realtime"
- Check if it shows any errors
View Full Logs
```bash
# In Home Assistant terminal or SSH
tail -f /config/home-assistant.log | grep openai_realtime
```
Test API Connection
Try sending a text message via Developer Tools → Services:

```yaml
service: openai_realtime.send_message
data:
  message: "Hello, can you hear me?"
```
Updating
Updating the Integration
Via HACS:
- Go to HACS → Integrations
- Find "OpenAI Realtime" and click "Update"
- Restart Home Assistant

Manual Update:
- Replace the `custom_components/openai_realtime` folder with the new version
- Restart Home Assistant
Updating the JavaScript Card (Manual Cache Bust)
The JS card version is automatically updated based on file modification time. However, browsers may cache the old version. Here's how to force an update:
Method 1: Update Resource Version in Dashboard Settings (Recommended)
- Go to Settings → Dashboards
- Click the three-dot menu (⋮) → Resources
- Find the resource containing `/openai_realtime/openai-realtime-card.js`
- Click to edit it
- Change the URL version parameter (just change the number to anything different):
  - Before: `/openai_realtime/openai-realtime-card.js?v=1733000000`
  - After: `/openai_realtime/openai-realtime-card.js?v=1733100000`
- Click Update
- Hard refresh your browser: `Ctrl+Shift+R` (Windows/Linux) or `Cmd+Shift+R` (Mac)
Method 2: Hard Refresh Browser
- Windows/Linux: `Ctrl+Shift+R`
- Mac: `Cmd+Shift+R`
- Or open DevTools (F12) → Right-click Refresh → "Empty Cache and Hard Reload"
Method 3: Delete and Re-add Resource
- Go to Settings → Dashboards → Resources
- Delete the OpenAI Realtime card resource
- Reload the integration: Settings → Devices & Services → OpenAI Realtime → ⋮ → Reload
- The resource will be re-added automatically with the new version
Method 4: Clear Browser Cache Completely
- Open browser settings
- Clear cached images and files
- Reload the dashboard
After Updating
- Clear browser cache or hard refresh (`Ctrl+Shift+R`)
- Reload the dashboard
- Check browser console (F12) for any errors
- Test the microphone button
Connection Issues
- Verify your API key has Realtime API access
- Check your network allows WebSocket connections
- Review Home Assistant logs for detailed error messages
MCP Server Issues
- Ensure the MCP server URL is accessible from Home Assistant
- Verify authentication tokens are correct
- Check MCP server logs for connection issues
Audio Issues
- Ensure audio is in the correct format (PCM 24kHz)
- Check voice assistant configuration
- Verify microphone/speaker setup
Development
Local Development
```bash
# Clone the repository
git clone https://github.com/your-username/ha-realtime-ai-audio.git

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Link to Home Assistant custom_components
ln -s $(pwd)/custom_components/openai_realtime ~/.homeassistant/custom_components/
ln -s $(pwd)/custom_components/gemini_live ~/.homeassistant/custom_components/
```
Running Tests
```bash
pytest tests/
```
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests.
Changelog
2.0.0
- Added Gemini Live Audio integration with Google's Gemini Live API
- Gemini features: Session resumption, image/audio input, Google Search
- Renamed project to "ha-realtime-ai-audio"
- Updated README for both integrations
1.0.0
- Initial release
- OpenAI Realtime API integration
- MCP server support
- Home Assistant conversation agent
- Built-in smart home tools
