Aura
A production-ready framework for composing AI agents from declarative TOML configuration, with MCP tool integration, RAG pipelines, and an OpenAI-compatible web API.
Ask AI about Aura
Powered by Claude Β· Grounded in docs
I know everything about Aura. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
Aura
A production-ready framework for composing AI agents from declarative TOML configuration, with MCP tool integration, RAG pipelines, and an OpenAI-compatible web API. Built on Rig.rs with reliability and operability enhancements.
Key capabilities:
- Declarative agent composition via TOML with multi-provider LLM support
- Dynamic MCP tool discovery across HTTP, SSE, and STDIO transports
- Automatic schema sanitization for OpenAI function-calling compatibility
- RAG pipeline integration with in-memory and external vector stores
- Embeddable Rust core independent from configuration layer
Looking for orchestration mode? Multi-agent orchestration is available on the
feature/orchestration-modebranch and is currently in open alpha β APIs, behavior, and configuration are changing rapidly as we iterate.The
mainbranch is Aura's production-ready single-agent framework: declarative TOML-driven agents with MCP tool integration, RAG pipelines, multi-provider LLM support, and an OpenAI-compatible streaming API.Issues and feature requests are welcome β we'd love your feedback on both.
Table of Contents
Project Structure
aura/
βββ crates/
β βββ aura/ # Core agent builder library
β βββ aura-config/ # TOML parser and config loader
β βββ aura-web-server/ # OpenAI-compatible HTTP/SSE server
β βββ aura-test-utils/ # Shared testing utilities
βββ compose/ # Docker Compose files for integration testing
βββ examples/ # Example configuration files
βββ development/ # LibreChat and OpenWebUI setup
βββ docs/ # Architecture and protocol documentation
βββ scripts/ # CI and utility scripts
Setup
- Install Rust if needed:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Clone and configure:
cd aura
cp examples/reference.toml config.toml
- Set required environment variables:
export OPENAI_API_KEY="your-api-key"
- Build:
cargo build --release
Security: keep secrets in environment variables and reference them in TOML using {{ env.VAR_NAME }}.
Usage
Web API Server
Run the server:
# Default: reads config.toml
cargo run --bin aura-web-server
# Custom config path
CONFIG_PATH=my-config.toml cargo run --bin aura-web-server
# Host/port override
HOST=0.0.0.0 PORT=3000 cargo run --bin aura-web-server
# Enable Aura custom SSE events
AURA_CUSTOM_EVENTS=true cargo run --bin aura-web-server
Core server options:
| Option | Env Variable | Default | Description |
|---|---|---|---|
--config | CONFIG_PATH | config.toml | Path to TOML config |
--host | HOST | 127.0.0.1 | Bind host |
--port | PORT | 8080 | Bind port |
--streaming-timeout-secs | STREAMING_TIMEOUT_SECS | 900 | Max SSE request duration |
--first-chunk-timeout-secs | FIRST_CHUNK_TIMEOUT_SECS | 30 | Max time to first provider chunk |
--streaming-buffer-size | STREAMING_BUFFER_SIZE | 400 | SSE backpressure buffer |
--aura-custom-events | AURA_CUSTOM_EVENTS | false | Enable aura.* events |
--aura-emit-reasoning | AURA_EMIT_REASONING | false | Enable aura.reasoning |
--tool-result-mode | TOOL_RESULT_MODE | none | Tool result streaming: none, open-web-ui, aura |
--tool-result-max-length | TOOL_RESULT_MAX_LENGTH | 100 | Max chars before truncation (aura events) |
--shutdown-timeout-secs | SHUTDOWN_TIMEOUT_SECS | 30 | Graceful shutdown window |
Tool result modes:
none: spec-compliant; tool results appear only in model summary.open-web-ui: tool results emitted throughtool_callsfor OpenWebUI compatibility.aura: tool results emitted viaaura.tool_completeevents.
API examples:
# Health
curl http://localhost:8080/health
# OpenAI-compatible chat completion
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello"}]}'
# Streaming response
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello"}], "stream": true}'
SSE protocol details, event types, custom events, and client handling are documented in docs/streaming-api-guide.md.
For LibreChat/OpenWebUI integration, see development/README.md.
Configuration
Configuration sections:
[llm]: provider and model configuration.[agent]: system prompt and runtime behavior.[[vector_stores]]: optional RAG/vector store configuration.[mcp]and[mcp.servers.*]: MCP configuration, schema sanitization, and transports.
Supported providers: OpenAI, Anthropic, Bedrock, Gemini, and Ollama.
Supported MCP transports:
http_streamable(recommended for production)ssestdio- for local processes. In production, bridge through mcp-proxy to avoid Rig.rs STDIO lifecycle issues:
mcp-proxy --port=8081 --host=127.0.0.1 npx your-mcp-server
Then point your config at the HTTP/SSE endpoint instead.
headers_from_request can forward incoming request headers to MCP servers for per-request auth. See development/README.md for practical examples.
turn_depth controls how many tool-calling rounds can happen in a single turn. Higher values allow multi-step tool workflows before final response generation. This acts as a failsafe to prevent models from spinning out in unbounded tool-call loops.
The complete starter configuration is in examples/reference.toml. Minimal per-provider configs are in examples/minimal/ and complete agent examples are in examples/complete/.
Minimal example:
[llm]
provider = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
model = "gpt-5.2"
[mcp.servers.my_server]
transport = "http_streamable"
url = "http://localhost:8080/mcp"
headers = { "Authorization" = "Bearer {{ env.MCP_TOKEN }}" }
[agent]
name = "Assistant"
system_prompt = "You are a helpful assistant."
turn_depth = 2
Validate config parsing quickly:
cargo run -p aura-config --bin debug_config
Ollama
Aura supports Ollama, including fallback tool-call parsing for model outputs that emit tool calls as text. Full setup, parameter guidance, and model caveats are in docs/ollama-guide.md.
Observability
OpenTelemetry support is enabled by default via the otel feature in both aura and aura-web-server. Configure your OTLP endpoint using standard environment variables (for example OTEL_EXPORTER_OTLP_ENDPOINT) to export traces.
Aura emits spans using the OpenInference semantic convention (llm.*, tool.*, input.*, output.*) rather than the gen_ai.* conventions. Rig-originated gen_ai.* attributes are automatically translated to OpenInference equivalents at export time. This makes Aura traces natively compatible with Phoenix and other OpenInference-aware observability tools.
Docker Deployment
Aura includes containerized deployment assets at the repo root:
Dockerfile: multi-stage build for the web server.docker-compose.yml: local container deployment wiring.
Run with Docker Compose:
docker compose up --build
Default container port mapping is 3030:3030 in docker-compose.yml. Ensure your config path and API key environment variables are set for the container runtime.
Development and Testing
Quick commands:
# Full local quality checks
make ci
# Individual checks
make fmt
make fmt-check
make test
make lint
# Build targets
make build
make build-release
Test CI pipeline locally before pushing:
./scripts/test-ci.sh
The script mirrors Jenkins checks: format, workspace tests, and clippy with warnings denied.
Testing
Web server integration tests live under crates/aura-web-server/tests/.
Run web server integration test workflow:
./crates/aura-web-server/tests/run_tests.sh
Integration test feature flags (crates/aura-web-server/Cargo.toml):
- Parent flag:
integration - Suite flags:
integration-streaming,integration-header-forwarding,integration-mcp,integration-events,integration-cancellation,integration-progress - Optional suite:
integration-vector(requires external Qdrant setup)
Detailed test guidance: crates/aura-web-server/README.md#running-integration-tests.
Documentation
- CHANGELOG.md: release and version history.
- docs/request-lifecycle.md: request flow diagram, lifecycle, timeout, cancellation, and shutdown behavior.
- docs/streaming-api-guide.md: SSE protocol guide, event taxonomy, tool result modes, custom
aura.*events, and client examples. - docs/rig-tool-execution-order.md: tool execution ordering analysis.
- docs/rig-fork-changes.md: Rig fork changes and rationale.
- development/README.md: LibreChat/OpenWebUI setup and header-forwarding examples.
Architecture
Aura separates concerns across crates:
aura: runtime agent building, MCP integration, tool orchestration, and vector workflows.aura-config: typed TOML parsing and validation.aura-web-server: OpenAI-compatible REST/SSE serving layer.
This separation means:
- Embeddable core - use
auradirectly in any Rust application without config file dependencies. - Flexible config -
aura-configcan be extended to support other formats (JSON, YAML). - Testable boundaries - each crate has focused responsibilities and clear interfaces.
Key architectural characteristics:
- Dynamic MCP tool discovery at runtime.
- Automatic schema sanitization (anyOf, missing types, optional parameters) driven by OpenAI function-calling requirements β MCP tool schemas are transformed at discovery time to conform to OpenAI's strict subset of JSON Schema.
- Header forwarding support (
headers_from_request) for per-request MCP auth delegation. - Config-driven composition with embeddable Rust core.
Request execution and cancellation flow are documented in docs/request-lifecycle.md.
License
Licensed under the Apache License, Version 2.0.
