Aura
A production-ready framework for composing AI agents from declarative TOML configuration, with MCP tool integration, RAG pipelines, and an OpenAI-compatible web API.
Aura is an agentic harness that turns an LLM into a reliable, autonomous service capable of executing real SRE work. Aura provides the guardrails, API servers, state management, authentication, streaming, error handling, and tool integrations necessary to run AI SRE agents safely in production.
Key capabilities:
- Declarative agent composition via TOML with multi-provider LLM support and multi-agent serving
- Dynamic MCP tool discovery across HTTP, SSE, and STDIO transports
- Automatic schema sanitization for OpenAI function-calling compatibility
- RAG pipeline integration with in-memory, Qdrant, and AWS Bedrock Knowledge Base vector stores, using OpenAI or AWS Bedrock embeddings
- Embeddable Rust core, independent from configuration layer
Looking for orchestration mode? Multi-agent orchestration is available on the feature/orchestration-mode branch and is currently in open alpha: APIs, behavior, and configuration are changing rapidly as we iterate.
The main branch is Aura's production-ready single-agent framework: declarative TOML-driven agents with MCP tool integration, RAG pipelines, multi-provider LLM support, and an OpenAI-compatible streaming API.
Issues and feature requests are welcome; we'd love your feedback on both.
Project Structure
aura/
├── crates/
│   ├── aura/              # Core agent builder library
│   ├── aura-config/       # TOML parser and config loader
│   ├── aura-web-server/   # OpenAI-compatible HTTP/SSE server
│   └── aura-test-utils/   # Shared testing utilities
├── compose/               # Docker Compose files for integration testing
├── examples/              # Example configuration files
├── development/           # LibreChat and OpenWebUI setup
├── docs/                  # Architecture and protocol documentation
└── scripts/               # CI and utility scripts
Setup
- Install Rust if needed:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Clone and configure:
cd aura
cp examples/reference.toml config.toml
- Set required environment variables:
export OPENAI_API_KEY="your-api-key"
- Build:
cargo build --release
Security: keep secrets in environment variables and reference them in TOML using {{ env.VAR_NAME }}.
Usage
Web API Server
Run the server:
# Default: reads config.toml
cargo run --bin aura-web-server
# Custom config file
CONFIG_PATH=my-config.toml cargo run --bin aura-web-server
# Config directory (serves multiple agents)
CONFIG_PATH=configs/ cargo run --bin aura-web-server
# Host/port override
HOST=0.0.0.0 PORT=3000 cargo run --bin aura-web-server
# Enable Aura custom SSE events
AURA_CUSTOM_EVENTS=true cargo run --bin aura-web-server
Core server options:
| Option | Env Variable | Default | Description |
|---|---|---|---|
| --config | CONFIG_PATH | config.toml | Path to TOML config file or directory |
| --host | HOST | 127.0.0.1 | Bind host |
| --port | PORT | 8080 | Bind port |
| --streaming-timeout-secs | STREAMING_TIMEOUT_SECS | 900 | Max SSE request duration |
| --first-chunk-timeout-secs | FIRST_CHUNK_TIMEOUT_SECS | 30 | Max time to first provider chunk |
| --streaming-buffer-size | STREAMING_BUFFER_SIZE | 400 | SSE backpressure buffer |
| --aura-custom-events | AURA_CUSTOM_EVENTS | false | Enable aura.* events |
| --aura-emit-reasoning | AURA_EMIT_REASONING | false | Enable aura.reasoning |
| --tool-result-mode | TOOL_RESULT_MODE | none | Tool result streaming: none, open-web-ui, aura |
| --tool-result-max-length | TOOL_RESULT_MAX_LENGTH | 100 | Max chars before truncation (aura events) |
| --shutdown-timeout-secs | SHUTDOWN_TIMEOUT_SECS | 30 | Graceful shutdown window |
Tool result modes:
- none: spec-compliant; tool results appear only in the model summary.
- open-web-ui: tool results emitted through tool_calls for OpenWebUI compatibility.
- aura: tool results emitted via aura.tool_complete events (example below).
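For example, to stream truncated tool results as aura.tool_complete events, combining options from the table above (pairing TOOL_RESULT_MODE=aura with AURA_CUSTOM_EVENTS is an assumption here, since aura.* events are gated by that flag):
# Emit aura.* events with tool results truncated to 500 characters
AURA_CUSTOM_EVENTS=true TOOL_RESULT_MODE=aura TOOL_RESULT_MAX_LENGTH=500 cargo run --bin aura-web-server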
API examples:
# Health
curl http://localhost:8080/health
# List available models (agents)
curl http://localhost:8080/v1/models
# OpenAI-compatible chat completion
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello"}]}'
# Select a specific agent by name or alias via the model field
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "my-agent", "messages": [{"role": "user", "content": "Hello"}]}'
# Streaming response
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Hello"}], "stream": true}'
SSE protocol details, event types, custom events, and client handling are documented in docs/streaming-api-guide.md.
For LibreChat/OpenWebUI integration, see development/README.md.
Configuration
CONFIG_PATH can point to a single TOML file or a directory of .toml files. When pointed at a directory, Aura loads every .toml file and serves each as a selectable agent. Clients choose an agent via the model field in chat completion requests, the same field that tools like LibreChat, OpenWebUI, and CLI clients use to present a model picker.
Multiple Agents
To serve multiple agents, create a directory with one TOML file per agent:
configs/
├── research-assistant.toml
├── devops-agent.toml
└── code-reviewer.toml
CONFIG_PATH=configs/ cargo run --bin aura-web-server
Each agent is identified by its alias (if set) or name. Clients discover available agents via GET /v1/models and select one by passing its identifier as the model field in requests. When no model is specified, the server resolves the agent via DEFAULT_AGENT, or automatically when only one config is loaded.
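For example, assuming each config above sets an alias matching its filename and that DEFAULT_AGENT takes the same identifier clients send as model (both assumptions for illustration):
# Fall back to the devops agent when requests omit the model field
DEFAULT_AGENT=devops CONFIG_PATH=configs/ cargo run --bin aura-web-server
# Explicitly select another agent per request
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "code-reviewer", "messages": [{"role": "user", "content": "Hello"}]}'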
The alias field provides a stable, client-facing identifier that is independent of the agent's display name:
[agent]
name = "DevOps Assistant"
alias = "devops" # clients send "model": "devops"
system_prompt = "You are a DevOps expert."
model_owner = "mezmo" # override owned_by in /v1/models (defaults to LLM provider)
Aliases must be unique across all loaded configs. If two configs share the same name and neither has an alias, loading fails with a validation error.
Configuration Sections
Configuration sections:
- [llm]: provider and model configuration.
- [agent]: identity, system prompt, and runtime behavior.
- [[vector_stores]]: optional RAG/vector store configuration.
- [mcp] and [mcp.servers.*]: MCP configuration, schema sanitization, and transports.
Supported LLM providers: OpenAI, Anthropic, Bedrock, Gemini, and Ollama.
Supported vector stores: in_memory, qdrant, and bedrock_kb (AWS Bedrock Knowledge Bases: managed RAG, no embedding model required). For in_memory and qdrant, supported embedding providers are OpenAI and AWS Bedrock. See the [[vector_stores]] examples in examples/reference.toml.
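A hedged sketch of a Qdrant-backed store with OpenAI embeddings; every field name below is illustrative rather than taken from the real schema, so treat examples/reference.toml as authoritative:
[[vector_stores]]
# Field names are illustrative assumptions; see examples/reference.toml for the actual schema
type = "qdrant"                     # in_memory | qdrant | bedrock_kb
url = "{{ env.QDRANT_URL }}"
embedding_provider = "openai"       # openai | bedrock (not needed for bedrock_kb)
embedding_api_key = "{{ env.OPENAI_API_KEY }}"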
Supported MCP transports:
- http_streamable (recommended for production)
- sse
- stdio: for local processes. In production, bridge through mcp-proxy to avoid Rig.rs STDIO lifecycle issues:
mcp-proxy --port=8081 --host=127.0.0.1 npx your-mcp-server
Then point your config at the HTTP/SSE endpoint instead.
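For instance, a server bridged through the mcp-proxy command above could be configured roughly like this (the /sse path is mcp-proxy's usual default and an assumption here):
[mcp.servers.my_local_server]
transport = "sse"
url = "http://127.0.0.1:8081/sse"   # proxy host/port from the mcp-proxy command; path is an assumption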
headers_from_request can forward incoming request headers to MCP servers for per-request auth. See development/README.md for practical examples.
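A minimal sketch of header forwarding, assuming headers_from_request takes a list of header names on the server entry; the exact shape may differ, and development/README.md has working examples:
[mcp.servers.my_server]
transport = "http_streamable"
url = "http://localhost:8080/mcp"
headers_from_request = ["Authorization", "X-Tenant-Id"]   # forwarded per request; shape is illustrative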
turn_depth controls how many tool-calling rounds can happen in a single turn. Higher values allow multi-step tool workflows before final response generation. This acts as a failsafe to prevent models from spinning out in unbounded tool-call loops.
context_window sets the context window size (in tokens) for the agent, used for usage percentage reporting in aura.session_info streaming events.
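For example, assuming both fields sit under [agent] as turn_depth does in the minimal example below (context_window placement is an assumption here):
[agent]
name = "SRE Agent"
system_prompt = "You are an SRE assistant."
turn_depth = 4            # up to 4 tool-calling rounds per turn
context_window = 128000   # tokens; drives usage percentage in aura.session_info events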
The complete starter configuration is in examples/reference.toml. Minimal per-provider configs are in examples/minimal/ and complete agent examples are in examples/complete/.
Minimal example:
[llm]
provider = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
model = "gpt-5.2"
[mcp.servers.my_server]
transport = "http_streamable"
url = "http://localhost:8080/mcp"
headers = { "Authorization" = "Bearer {{ env.MCP_TOKEN }}" }
[agent]
name = "Assistant"
alias = "my-assistant" # optional: stable client-facing identifier
system_prompt = "You are a helpful assistant."
turn_depth = 2
Validate config parsing quickly:
cargo run -p aura-config --bin debug_config
Ollama
Aura supports Ollama, including fallback tool-call parsing for model outputs that emit tool calls as text. Full setup, parameter guidance, and model caveats are in docs/ollama-guide.md.
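A minimal sketch of an Ollama provider block, assuming the [llm] keys mirror the OpenAI example above; the model name is hypothetical, and examples/minimal/ plus docs/ollama-guide.md are the authoritative references:
[llm]
provider = "ollama"
model = "qwen2.5:14b"   # hypothetical tool-capable local model
# Endpoint configuration beyond the default local Ollama address is covered in docs/ollama-guide.md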
Observability
OpenTelemetry support is enabled by default via the otel feature in both aura and aura-web-server. Configure your OTLP endpoint using standard environment variables (for example OTEL_EXPORTER_OTLP_ENDPOINT) to export traces.
Aura emits spans using the OpenInference semantic convention (llm.*, tool.*, input.*, output.*) rather than the gen_ai.* conventions. Rig-originated gen_ai.* attributes are automatically translated to OpenInference equivalents at export time. This makes Aura traces natively compatible with Phoenix and other OpenInference-aware observability tools.
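For example, exporting traces to a local OTLP collector such as Phoenix (the endpoint value is illustrative; the variable names are standard OpenTelemetry):
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_SERVICE_NAME="aura-web-server"   # optional, standard OTel service-name variable
cargo run --bin aura-web-server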
Docker Deployment
Aura includes containerized deployment assets at the repo root:
- Dockerfile: multi-stage build for the web server.
- docker-compose.yml: local container deployment wiring.
Run with Docker Compose:
docker compose up --build
Default container port mapping is 3030:3030 in docker-compose.yml. Ensure your config path and API key environment variables are set for the container runtime.
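For example (whether docker-compose.yml reads these variables from the host environment or an .env file is an assumption; adjust to your compose wiring):
# Provide the provider key and config path to the container, then verify the mapped port
OPENAI_API_KEY="your-api-key" CONFIG_PATH=config.toml docker compose up --build
curl http://localhost:3030/health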
Development and Testing
Quick commands:
# Full local quality checks
make ci
# Individual checks
make fmt
make fmt-check
make test
make lint
# Build targets
make build
make build-release
Test CI pipeline locally before pushing:
./scripts/test-ci.sh
The script mirrors Jenkins checks: format, workspace tests, and clippy with warnings denied.
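Those checks roughly correspond to the following cargo commands (an approximation of what the script and make targets run, not a verbatim copy):
cargo fmt --all -- --check                  # format check
cargo test --workspace                      # workspace tests
cargo clippy --workspace -- -D warnings     # clippy with warnings denied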
Testing
Web server integration tests live under crates/aura-web-server/tests/.
Run web server integration test workflow:
./crates/aura-web-server/tests/run_tests.sh
Integration test feature flags (crates/aura-web-server/Cargo.toml), with an example invocation after the list:
- Parent flag: integration
- Suite flags: integration-streaming, integration-header-forwarding, integration-mcp, integration-events, integration-cancellation, integration-progress
- Optional suite: integration-vector (requires external Qdrant setup)
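run_tests.sh is the supported entry point; a direct cargo invocation would look roughly like this (the exact feature combinations required per suite are an assumption):
# Run only the streaming suite; feature names come from the list above
cargo test -p aura-web-server --features integration,integration-streaming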
Detailed test guidance: crates/aura-web-server/README.md#running-integration-tests.
Documentation
- CHANGELOG.md: release and version history.
- docs/request-lifecycle.md: request flow diagram, lifecycle, timeout, cancellation, and shutdown behavior.
- docs/streaming-api-guide.md: SSE protocol guide, event taxonomy, tool result modes, custom aura.* events, and client examples.
- docs/rig-tool-execution-order.md: tool execution ordering analysis.
- docs/rig-fork-changes.md: Rig fork changes and rationale.
- development/README.md: LibreChat/OpenWebUI setup and header-forwarding examples.
Architecture
Aura separates concerns across crates:
- aura: runtime agent building, MCP integration, tool orchestration, and vector workflows.
- aura-config: typed TOML parsing and validation.
- aura-web-server: OpenAI-compatible REST/SSE serving layer.
This separation means:
- Embeddable core: use aura directly in any Rust application without config file dependencies.
- Flexible config: aura-config can be extended to support other formats (JSON, YAML).
- Testable boundaries: each crate has focused responsibilities and clear interfaces.
Key architectural characteristics:
- Dynamic MCP tool discovery at runtime.
- Automatic schema sanitization (anyOf, missing types, optional parameters) driven by OpenAI function-calling requirements: MCP tool schemas are transformed at discovery time to conform to OpenAI's strict subset of JSON Schema.
- Header forwarding support (headers_from_request) for per-request MCP auth delegation.
- Config-driven composition with embeddable Rust core.
Request execution and cancellation flow are documented in docs/request-lifecycle.md.
License
Licensed under the Apache License, Version 2.0.
