AI DevOps Actions
🔴 Live example: Open PR #1 (canonical demo) or any recent PR to see the full suite running on itself – context summary, quality score, and root cause hints, all from `GITHUB_TOKEN`.
The full CI/CD layer for AI-native development – nine GitHub Actions covering PR quality, safety, cost, infra, and behavioral testing.
AI-native repos have problems that standard CI/CD doesn't solve. PRs flooded with AI slop. Unchecked LLM spend. Sensitive data leaking through AI outputs. MCP servers shipped without validation. Action tags silently compromised. Agent skills published without schema checks. Behavioral regressions invisible until production.
This suite covers the full stack – nine GitHub Actions that work independently or as a pipeline.
🧠 The AI Reliability Loop
Most CI tells you something broke. This system tells you why – and what to do next.
Three things this loop does that standard CI/CD can't:
| Stage | What it does |
|---|---|
| 🔍 Detect | Catch regressions in AI behavior – not just code |
| 🧠 Explain | Identify the root cause: prompt change, model drift, data shift, cost spike |
| 🔧 Fix | Turn failures into actionable feedback, automatically |
Example output when your AI pipeline breaks:

```
❌ Eval failed

Root Cause Analysis:
→ Knowledge drift (HIGH confidence)
→ RAG corpus changed 2 commits ago
→ Eval suite: 3/12 assertions failed

Suggested fix:
→ Re-run evals with updated embeddings
→ Check: ai-workflow-evals + llm-cost-tracker
```
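The Detect and Explain stages above map onto two of the suite's actions. A minimal sketch of that pairing – step inputs are taken from the full-pipeline example later in this README, and the `if: always()` guard (so hints still run when the evals step fails) is an assumption about how the two compose:

```yaml
jobs:
  reliability-loop:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      # Detect: run the eval suite and fail on behavioral regressions
      - uses: ollieb89/ai-workflow-evals@v1.0.0
        with:
          eval-suite: ./evals   # path to your eval definitions
          fail-on: regression

      # Explain: correlate failures and post root-cause hints on the PR
      - if: always()            # run even when the evals step failed
        uses: ollieb89/ai-root-cause-hints@v1.0.0
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
```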
AI systems don't fail like code – they degrade silently, drift with data, and pass tests while producing worse outputs. This is the CI layer that catches it.
⚡️ 1-minute setup
- Copy this into `.github/workflows/ai-hygiene.yml` in any repo:
```yaml
name: AI PR Hygiene
on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  hygiene:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: ollieb89/pr-context-enricher@v1.0.0
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - uses: ollieb89/ai-pr-guardian@v1.0.0
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          threshold: 60
      - uses: ollieb89/ai-root-cause-hints@v1.0.0
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
```
- Open a PR.
- You'll see:
```
📋 PR Context Summary
Author: @you | Base: main ← feature/my-change
5 files changed (+120/-30) | Risk: low | Complexity: 3/10
Related issues: #42

✅ PR Quality Score: 78/100

🟢 No correlated failure patterns detected.
```
No API keys. No external services. Just `GITHUB_TOKEN`.
Start here
New to the suite? Pick your entry point:
| You want to... | Start with |
|---|---|
| Gate AI-generated PR slop | ai-pr-guardian |
| Give AI reviewers full PR context | pr-context-enricher |
| Stop secrets leaking from AI outputs | ai-output-redacter |
| Lock down your supply chain | actions-lockfile-generator |
| Catch agent behavioral regressions | ai-workflow-evals |
| Validate your MCP server in CI | mcp-server-tester |
| Publish agent skills safely | agent-skill-validator |
| Understand why your AI pipeline broke | ai-root-cause-hints |
| Track LLM spend before it hits your card | llm-cost-tracker |
Each action works standalone. The full pipeline shows how they compose.
The Suite
📋 PR Quality & Context
| Action | What it solves |
|---|---|
| ai-pr-guardian | Scores PR quality 0–100, detects AI-generated slop, gates merges |
| pr-context-enricher | Auto-generates rich context summaries: files, risk level, commit history, ready-to-paste AI reviewer prompt |
🛡️ Safety & Security
| Action | What it solves |
|---|---|
| ai-output-redacter | Scans and redacts API keys, tokens, PII, and secrets from AI-generated outputs before they leave CI |
| actions-lockfile-generator | Pins all `uses:` references to full commit SHAs – prevents supply chain attacks |
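The pinning that actions-lockfile-generator enforces is the standard GitHub Actions hardening pattern: mutable tags are replaced with full commit SHAs, so a compromised or moved tag can't silently change what runs in your CI. A before/after sketch (the SHA below is a placeholder, not a real commit):

```yaml
# Before: a mutable tag – whoever controls the tag controls your CI
- uses: actions/checkout@v4

# After: pinned to a full commit SHA, with the tag kept as a comment
- uses: actions/checkout@0000000000000000000000000000000000000000 # v4
```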
🧪 Testing & Behavioral Validation
| Action | What it solves |
|---|---|
| ai-workflow-evals | Runs eval suites for prompts, agents, and workflows – catches behavioral regressions before merge |
| mcp-server-tester | Validates MCP servers: health, protocol compliance, tool/resource discovery |
| agent-skill-validator | Lints and validates agent skill repos (OpenClaw, Claude Code, Codex, Gemini) |
| ai-root-cause-hints | Explains why your AI pipeline broke – correlates failures across actions and posts root-cause hints |
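For a repo that ships an MCP server, mcp-server-tester can run as its own job. A minimal sketch – the `transport` and `server-command` inputs are the ones shown in the full-pipeline example below, while the build step assumes a Node project that compiles to `dist/`:

```yaml
jobs:
  mcp-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build  # assumption: Node project building to dist/

      # Health, protocol compliance, and tool/resource discovery checks
      - uses: ollieb89/mcp-server-tester@v1.0.0
        with:
          transport: stdio
          server-command: "node dist/server.js"
```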
💰 Infrastructure & Cost
| Action | What it solves |
|---|---|
| llm-cost-tracker | Tracks OpenAI/Anthropic/Gemini spend in CI, alerts on budget overruns |
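A standalone budget gate might look like the sketch below. Only the `provider` and `budget-limit` inputs appear elsewhere in this README; that the limit is denominated in US dollars is an assumption, and any provider credentials the tracker may need are omitted here:

```yaml
jobs:
  cost-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Alert when this run's Anthropic spend exceeds the budget
      - uses: ollieb89/llm-cost-tracker@v1.0.0
        with:
          provider: anthropic   # OpenAI and Gemini are also supported per the table above
          budget-limit: '1.00'  # assumed to be US dollars
```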
Usage paths – start with your problem
Don't know which action to use? Pick your problem:
| My problem | Recommended stack |
|---|---|
| AI outputs regressed, don't know why | AI Debugging Stack – evals + cost tracker + root cause hints |
| PRs are noisy and AI-sloppy | PR Hygiene Stack – context enricher + guardian + lockfile |
| Worried about secrets leaking | AI Safety Stack – output redacter + lockfile + root cause hints |
| Shipping agent skills or MCP servers | agent-skill-validator + mcp-server-tester |
| Want everything | Full pipeline |
→ Detailed "when to use which action" guide
Full Pipeline
```yaml
jobs:
  ai-devops:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Enrich PR with context for AI reviewers
      - id: context
        uses: ollieb89/pr-context-enricher@v1.0.0
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}

      # Gate AI-generated / low-quality PRs before review
      - uses: ollieb89/ai-pr-guardian@v1.0.0
        with:
          threshold: 60
          on-low-quality: comment

      # Scan AI outputs for secrets and PII before they ship
      - uses: ollieb89/ai-output-redacter@v1.0.0
        with:
          path: ./outputs
          mode: enforce

      # Enforce SHA pinning on all workflow action references
      - uses: ollieb89/actions-lockfile-generator@v1.0.0
        with:
          mode: enforce
          github-token: ${{ secrets.GITHUB_TOKEN }}

      # Catch behavioral regressions in prompts and agents
      - uses: ollieb89/ai-workflow-evals@v1.0.0
        with:
          eval-suite: ./evals
          fail-on: regression

      # Track what this run cost in LLM calls
      - uses: ollieb89/llm-cost-tracker@v1.0.0
        with:
          provider: anthropic
          budget-limit: '1.00'

      # Validate MCP server didn't regress
      - uses: ollieb89/mcp-server-tester@v1.0.0
        with:
          transport: stdio
          server-command: "node dist/server.js"

      # Validate agent skills before publish
      - uses: ollieb89/agent-skill-validator@v1.0.0
        with:
          ecosystem: auto
          fail-on: errors
```
Install any action independently
```yaml
# PR Quality & Context
uses: ollieb89/ai-pr-guardian@v1.0.0
uses: ollieb89/pr-context-enricher@v1.0.0

# Safety & Security
uses: ollieb89/ai-output-redacter@v1.0.0
uses: ollieb89/actions-lockfile-generator@v1.0.0

# Testing & Behavioral Validation
uses: ollieb89/ai-workflow-evals@v1.0.0
uses: ollieb89/mcp-server-tester@v1.0.0
uses: ollieb89/agent-skill-validator@v1.0.0

# Infrastructure & Cost
uses: ollieb89/llm-cost-tracker@v1.0.0
```
All actions are MIT licensed, independently versioned, and production-ready.
Suite stats
| Layer | Actions | Total tests |
|---|---|---|
| PR Quality & Context | 2 | 86 |
| Safety & Security | 2 | 127 |
| Testing & Behavioral Validation | 4 | 180 |
| Infrastructure & Cost | 1 | 48 |
| Total | 9 | 441 |
Related tools
- workflow-guardian – Workflow health monitoring and linting
- ghact – CLI toolkit: lint workflows, audit security, check for updates
- workflow-linter-vscode – VS Code extension for real-time workflow linting
