Modelguide
Open-source AI contact center. An alternative to closed, expensive SaaS like Sierra or Decagon. Stop renting your customer experience!
Installation
npx modelguide
Own your agent stack.
ModelGuide is the open-source orchestration layer for production voice-first agents.
Keep your runtime. Wire up integrations once. Define agent behavior with playbooks, SOPs, and guardrails.
Build → generate tests → simulate → score → improve → ship. A closed feedback loop you own.
No vendor lock-in. Bring your own models, runtimes, channels, and deployment.
Start with a reference implementation → LiveKit · Pipecat · ElevenLabs · Mastra
Quick Start · Reference Implementations · Connect Your Agent · Admin Guide · Build a Connector · Roadmap
The Missing Feedback Loop
Getting an agent to talk is easy. Making it reliable is the hard part.
A bad conversation happens. Someone reviews it manually. A prompt gets tweaked. But no reusable test is created, no eval is added, and the same failure comes back later in a slightly different form.
The missing layer is the feedback loop around the runtime: business tool access, policy enforcement, session history, QA workflows, evals, provisioning, and deployment.
ModelGuide gives you that layer as open source, so you can turn failures into tests, tests into better instructions, and ship voice agents on any stack without rebuilding production infrastructure from scratch. Start with voice. Extend to other customer-facing channels when needed.

What ModelGuide Is
ModelGuide sits between your agent runtime and your business systems. It is not a voice runtime and it is not a hosted black box. It is the orchestration layer you own.
- Connect business systems once over MCP
- Assign the right tools to each agent with confirmation gates and secure credentials
- Compile SOPs and guardrails into agent behavior
- Record sessions with transcripts, tool traces, CSAT, and QA tags
- Run evals and simulations against real workflows
- Provision new organizations from repeatable YAML blueprints
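As a rough illustration of the tool-assignment and confirmation-gate idea in the list above, the sketch below decides whether an agent may call a tool. The type names, the `checkToolCall` function, and the tool names are hypothetical and exist only for this example; ModelGuide's actual schema and API may differ.

```typescript
// Hypothetical shapes for illustration; not ModelGuide's real schema.
type ToolAssignment = {
  toolName: string;              // a tool exposed by a connector over MCP
  requiresConfirmation: boolean; // confirmation gate for sensitive actions
};

type GateDecision = "allow" | "needs_confirmation" | "deny";

// Decide whether an agent may call a tool right now.
function checkToolCall(
  assignments: ToolAssignment[],
  toolName: string,
  customerConfirmed: boolean,
): GateDecision {
  for (const assignment of assignments) {
    if (assignment.toolName !== toolName) continue;
    if (assignment.requiresConfirmation && !customerConfirmed) {
      return "needs_confirmation";
    }
    return "allow";
  }
  return "deny"; // the tool was never assigned to this agent
}

// Example assignment list for a single (made-up) support agent.
const supportAgentTools: ToolAssignment[] = [
  { toolName: "lookup_order", requiresConfirmation: false },
  { toolName: "refund_order", requiresConfirmation: true },
];
```

In this sketch, an unassigned tool is denied outright, while a gated tool is held until the customer confirms.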
Why Builders Use ModelGuide
| Builder need | What ModelGuide gives you |
|---|---|
| Closed feedback loop | Run simulations and evals, turn failed conversations into reusable test cases and evaluators, and recompile better instructions |
| Less production glue code | Connect tools, sessions, SOPs, evals, and operator workflows without rebuilding the harness around every runtime |
| Runtime portability | Keep LiveKit, Pipecat, ElevenLabs, Mastra, or your own runtime. The business layer stays portable. |
| One place for agent context | Manage tools, SOPs, guardrails, confirmation policies, and review workflows from a single control layer |
| Reviewable behavior | Full session records, tool traces, CSAT, QA tags, and eval results that complement your observability stack |
| Self-hostable production infrastructure | Open-source, self-hostable, with multi-tenant auth, encrypted secrets, and row-level security |
ModelGuide focuses on agent behavior and review: transcripts, tool traces, CSAT, QA tags, SOP adherence, and eval results. Keep Langfuse, Datadog, Honeycomb, or OpenTelemetry for lower-level runtime telemetry and infrastructure tracing.
Dashboard screenshots: Connect Tools, Review Conversations, Define Behavior, Write Playbooks, Track Quality, Run Evals.
Quick Start
Prerequisites: Docker 24+, Bun 1.1+, Node 22+
git clone https://github.com/modelguide/modelguide.git
cd modelguide
make quickstart
Then in separate terminals:
make api-dev # API at http://localhost:3000
make ui-dev # Dashboard at http://localhost:3001
Open http://localhost:3001. The seed creates three industry-vertical organizations (retail, medical call center, and B2B industrial), each with Medusa e-commerce and Zendesk helpdesk connectors, two agents, and ~300 realistic sessions. Log in with delivered+admin-glowbox@resend.dev; the magic link is printed to the API console.
Full vertical matrix, dev accounts, and session scenarios: docs/guide/seed-data.md.
How Teams Use ModelGuide
1. Define what your agent should do. Describe the persona, connect your business systems, set the rules and guardrails. ModelGuide keeps that operational context in one place.
2. Generate the instructions your runtime uses. ModelGuide compiles that context into agent instructions and exposes the approved business tools over MCP.
3. Generate test assets automatically. ModelGuide creates synthetic conversations, eval suites, evaluators, and QA workflows to test the agent before it reaches production traffic.
4. Run the feedback loop. ModelGuide runs simulations, scores behavior, and gives your team transcripts, tool traces, CSAT, QA tags, and eval results to review.
5. Tighten the operating context. Use failures to update SOPs, guardrails, persona, tools, and compiled instructions until the automated checks consistently look right.
6. Validate manually before launch. Once the agent passes the automated checks, run manual tests in your runtime and confirm the experience is good enough to ship.
The closed feedback loop is already here: define the context, compile the instructions, generate tests, run simulations, score behavior, and improve the agent from failures. Over time, more of the prompt and context fixes can be automated.
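To make the "failures become tests" step concrete, here is a minimal sketch of turning a reviewed session into a reusable regression case and scoring a simulated run against it. All types, function names, and the scoring rule (fraction of expected tool calls that were made) are assumptions for illustration; ModelGuide's evaluators are described in its ADRs and may work differently.

```typescript
// Hypothetical types for illustration only.
type RecordedSession = {
  id: string;
  customerTurns: string[];
  toolCalls: string[]; // names of tools the agent called
  qaTags: string[];
};

type RegressionCase = {
  sourceSessionId: string;
  customerTurns: string[];
  expectedToolCalls: string[];
};

// Turn a reviewed failure into a reusable test case.
function toRegressionCase(
  session: RecordedSession,
  expectedToolCalls: string[],
): RegressionCase {
  return {
    sourceSessionId: session.id,
    customerTurns: session.customerTurns.slice(),
    expectedToolCalls: expectedToolCalls.slice(),
  };
}

// Score a run as the fraction of expected tool calls the agent actually made.
function scoreRun(testCase: RegressionCase, actualToolCalls: string[]): number {
  if (testCase.expectedToolCalls.length === 0) return 1;
  const hits = testCase.expectedToolCalls.filter(
    (tool) => actualToolCalls.indexOf(tool) !== -1,
  ).length;
  return hits / testCase.expectedToolCalls.length;
}

// A made-up failed session: the agent never looked up the order.
const failedSession: RecordedSession = {
  id: "sess-1",
  customerTurns: ["Where is my order?"],
  toolCalls: [],
  qaTags: ["missed_tool_call"],
};

const wismoCase = toRegressionCase(failedSession, ["lookup_order"]);
const scoreBeforeFix = scoreRun(wismoCase, failedSession.toolCalls); // 0
const scoreAfterFix = scoreRun(wismoCase, ["lookup_order"]);         // 1
```

The point of the sketch is that the case outlives the original conversation: once stored, it can be re-run after every SOP or guardrail change.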
Reference Implementations
The reference implementations prove that the orchestration layer stays portable across runtimes and channels.
Start with the LiveKit implementation for the fastest end-to-end path. Use the Pipecat or ElevenLabs examples if your team already runs there. The Mastra example shows the same orchestration layer extending beyond voice when you need another customer-facing channel.
| Runtime | Why it exists | Path |
|---|---|---|
| LiveKit Agents (flagship) | Fastest path to a production voice agent with telephony, MCP tool wiring, session tracking, eval tests, and deployment docs | examples/agents/livekit-agent/ |
| Pipecat | Same orchestration model for teams already committed to Pipecat | examples/agents/pipecat-agent/ |
| ElevenLabs Conversational AI | Manage platform agent config, tools, and prompts from version-controlled local definitions | examples/agents/elevenlabs-agent/ |
| Mastra | Email "Where Is My Order?" example showing the orchestration layer extends beyond voice when you need another customer-facing channel | examples/agents/mastra-wismo-email-agent/ |
Provisioning an Organization
The mg CLI provisions a new organization from a directory of YAML files in one command: users, connectors, agents with compiled instructions, SOPs, guardrails, and demo sessions. It is safe to re-run against the same directory.
bun run src/cli/mg.ts setup /path/to/my-org/
Full flag reference, per-command usage, and Railway instructions: docs/guide/cli.md.
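The "safe to re-run" property is typically achieved by keying each resource on a stable identifier and upserting. The sketch below shows that general pattern against an in-memory store; it is an assumption about the approach, not a description of the mg CLI's internals, and the `BlueprintAgent` shape and `slug` field are hypothetical.

```typescript
// Hypothetical blueprint entry; the real YAML schema is covered in docs/guide/cli.md.
type BlueprintAgent = { slug: string; displayName: string };
type AgentStore = Record<string, BlueprintAgent>;

// Upsert agents by slug so a second run updates rows instead of duplicating them.
function applyAgents(
  store: AgentStore,
  agents: BlueprintAgent[],
): { created: number; updated: number } {
  let created = 0;
  let updated = 0;
  for (const agent of agents) {
    if (Object.prototype.hasOwnProperty.call(store, agent.slug)) {
      updated += 1;
    } else {
      created += 1;
    }
    store[agent.slug] = agent;
  }
  return { created, updated };
}

const orgStore: AgentStore = {};
const blueprintAgents: BlueprintAgent[] = [
  { slug: "voice-support", displayName: "Voice Support" },
  { slug: "email-wismo", displayName: "Email WISMO" },
];

const firstRun = applyAgents(orgStore, blueprintAgents);  // { created: 2, updated: 0 }
const secondRun = applyAgents(orgStore, blueprintAgents); // { created: 0, updated: 2 }
```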
Roadmap
- 🚧 Sub-agents & Workflow Builder: Compose multi-step agent workflows with branching and handoffs
- 🚧 OTEL + A/B Testing via Langfuse: OpenTelemetry traces, prompt variant experiments, side-by-side comparison
- 🚧 Agentic Insights: Custom funnels tracking agent behavior through business-defined conversion paths
- 🚧 Closed-loop instruction tuning: Turn repeated eval and simulation failures into suggested SOP, guardrail, and instruction fixes
- More Blueprints: Contact center ships first; healthcare intake, field service, B2B sales next
- Connector Marketplace: Community-built integrations
Deployment
Docker Compose for local and staging (make docker-up), Railway for production. The Railway architecture is PostgreSQL + API + UI + Caddy load balancer. The load balancer is the only public-facing service; it routes /api/* and /mcp to the API and everything else to the UI over Railway's internal network. Config is as-code via railway.toml per service. Full setup and deploy steps are in railway/DEPLOY.md.
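The routing rule above can be expressed as a small function. This is only a sketch of the decision the load balancer makes, not the project's Caddy configuration; the `pickUpstream` name is made up, and treating subpaths of /mcp as API traffic is an assumption.

```typescript
type Upstream = "api" | "ui";

// /api/* and /mcp go to the API service; everything else goes to the UI.
function pickUpstream(path: string): Upstream {
  if (/^\/api(\/|$)/.test(path)) return "api";
  // Assumption: subpaths under /mcp are also routed to the API.
  if (/^\/mcp(\/|$)/.test(path)) return "api";
  return "ui";
}
```

Note that the patterns match on a path segment boundary, so a UI route such as /apidocs would still reach the UI.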
Tech Stack
| Layer | Technology |
|---|---|
| API | Hono + Bun.js |
| Agent Protocol | MCP (@modelcontextprotocol/sdk) |
| Database | PostgreSQL 16 + Drizzle ORM |
| Dashboard | TanStack Start + React 19 + Tailwind CSS v4 |
| Auth | JWT + magic links (users) · API keys (agents) |
| API Docs | Scalar (auto-generated from OpenAPI) |
No proprietary components. Every layer is inspectable, replaceable, forkable.
Production foundations include RBAC with separate admin/support/agent auth paths, encrypted secrets, row-level security, and a full CI pipeline running lint, typecheck, unit, integration, and MCP-protocol tests on every PR. See ADR-005 for the SOP primitive, ADR-007 and ADR-009 for the evals engine.
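As a loose sketch of what role-based access control with separate admin, support, and agent paths can look like, the example below checks an action against a per-role permission list. The three role names come from the text above; the action names and the permission matrix are hypothetical and do not reflect ModelGuide's actual policy.

```typescript
type Role = "admin" | "support" | "agent";
type Action = "manage_connectors" | "review_sessions" | "call_tools";

// Hypothetical permission matrix for illustration only.
const rolePermissions: Record<Role, Action[]> = {
  admin: ["manage_connectors", "review_sessions"],
  support: ["review_sessions"],
  agent: ["call_tools"],
};

// Return true when the role's permission list includes the action.
function isAllowed(role: Role, action: Action): boolean {
  return rolePermissions[role].indexOf(action) !== -1;
}
```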
Documentation
| Resource | Description |
|---|---|
| MCP Integration Guide | Connect your AI agent via MCP |
| Admin Guide | Configure connectors, agents, and tools through the dashboard |
| Adding a Connector | Build a new connector manifest, handlers, and tests |
| mg CLI: Provisioning | Provision organizations from YAML |
| Seed Data | Dev accounts, orgs, and session scenarios |
| Architecture Decisions | ADRs for significant design choices |
| Deployment Guide | Railway production deployment |
| Contributing | Setup, workflow, project structure, conventions |
Contributing
Contributions welcome. No CLA. See CONTRIBUTING.md for the full guide.
# Run checks before submitting
make api-test # Unit + integration tests
make ui-test # UI component tests
make api-lint-check # Linting
make api-typecheck # Type checking
Check open issues and look for the good first issue label. Fork → branch → PR with tests.
License
Built by ModelGuide · The open-source orchestration framework for production voice-first agents · 🇵🇱 Poland







