NixOS Dev Quick Deploy
A script that automatically sets up a development environment with tools and packages, geared towards developing NixOS environments and AI-assisted coding.
A production-grade NixOS-first deployment harness that transforms a fresh NixOS host into a fully operational local AI platform with declarative provisioning, host-local inference, multi-agent orchestration, continuous learning, and unified operator visibility.
Table of Contents
- Overview
- Quick Start
- Architecture
- AI Stack Services
- CLI Tools
- Agent Integrations
- MCP Servers
- Skills Library
- Workflows
- Configuration
- Documentation
- License
Overview
What This Is
NixOS-Dev-Quick-Deploy is:
- A Nix-first deployment harness for provisioning NixOS systems (workstations, servers, SBCs)
- A local AI stack framework providing host-local inference, embeddings, retrieval, and orchestration
- An operator control plane with a command center dashboard and programmatic APIs
- An agent-oriented development platform supporting multi-agent workflows and continuous learning
- A single-repository system that bootstraps from a fresh NixOS host to a fully operational AI platform
Core Philosophy
| Principle | Description |
|---|---|
| Declarative-first | Nix modules define system state; runtime scripts are fallbacks only |
| Local-first | Host-local inference and storage by default; hybrid routing to remote models optional |
| Zero bolt-on | All core features auto-enable on deployment; no manual toggles |
| Reproducible | All decisions tracked, git history preserved, rollback-safe |
| Operator-facing | Health, visibility, and control surfaces built-in |
Who This Is For
- NixOS users who want one repository to provision and operate an AI-capable machine
- Developers who want local inference, retrieval, and orchestration without manual setup
- Teams experimenting with agentic development patterns on a reproducible host
- Operators who want clearer health, deployment, and verification workflows
Quick Start
First-time Machine Bootstrap
git clone https://github.com/MasterofNull/NixOS-Dev-Quick-Deploy.git ~/NixOS-Dev-Quick-Deploy
cd ~/NixOS-Dev-Quick-Deploy
chmod +x nixos-quick-deploy.sh
./nixos-quick-deploy.sh --host "$(hostname)" --profile ai-dev
Day-2 Operations
./deploy --help # Show all commands
./deploy health # Run health checks
./deploy ai-stack # AI stack management
./deploy test # Run validation tests
Post-Deploy Verification
curl http://127.0.0.1:8889/api/health # Dashboard health
aq-qa 0 --json # Run QA suite
aq-hints "how do I configure NixOS services" # Get workflow hints
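The same health check can be done programmatically. A minimal Python sketch, assuming the `/api/health` endpoint returns a JSON object with a top-level `status` field (the actual schema is not documented in this README):

```python
import json
import urllib.request

def dashboard_healthy(raw_json: str) -> bool:
    """Return True if the health payload reports an ok/healthy status.

    The 'status' field name is an assumption; adjust to the real schema.
    """
    payload = json.loads(raw_json)
    return str(payload.get("status", "")).lower() in {"ok", "healthy"}

def check_dashboard(url: str = "http://127.0.0.1:8889/api/health") -> bool:
    """Fetch the dashboard health endpoint and evaluate the response."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return dashboard_healthy(resp.read().decode("utf-8"))
```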
Architecture
flowchart TD
subgraph "Deployment Layer"
A[flake.nix + NixOS modules] --> B[nixos-quick-deploy.sh]
A --> C[deploy CLI]
end
subgraph "System Layer"
B --> D[systemd services]
C --> D
end
subgraph "AI Stack"
D --> E[llama-cpp<br/>:8080]
D --> F[Embeddings<br/>:8001]
D --> G[Switchboard<br/>:8085]
D --> H[AIDB<br/>:8002]
D --> I[Hybrid Coordinator<br/>:8003]
D --> J[Ralph Wiggum<br/>:8004]
end
subgraph "Data Layer"
D --> K[PostgreSQL<br/>:5432]
D --> L[Redis<br/>:6379]
D --> M[Qdrant<br/>:6333]
end
subgraph "Operator Layer"
D --> N[Command Center<br/>:8889]
N --> O[Dashboard UI]
N --> P[Health APIs]
end
subgraph "Agent Layer"
G --> Q[Continue.dev]
G --> R[Aider]
G --> S[Claude/Codex/Qwen]
I --> T[Multi-Agent Orchestration]
end
System At A Glance
| Area | What You Get |
|---|---|
| Operating model | Declarative NixOS + systemd runtime |
| Primary bootstrap | nixos-quick-deploy.sh |
| Day-2 operations | deploy CLI |
| Operator surface | Command Center dashboard on 127.0.0.1:8889 |
| Local inference | llama.cpp inference + embeddings (GPU-accelerated) |
| Data layer | PostgreSQL, Qdrant, Redis |
| AI coordination | AIDB, hybrid coordinator, Ralph Wiggum, switchboard |
| Secret handling | SOPS-nix via /run/secrets/* |
AI Stack Services
Inference Layer
| Service | Port | Purpose | Technology |
|---|---|---|---|
| llama-cpp | 8080 | OpenAI-compatible inference API | llama.cpp (CUDA/Vulkan/CPU) |
| Embeddings | 8001 | Sentence transformer embeddings | Qwen3-Embedding-4B |
| Switchboard | 8085 | LLM routing proxy (local/remote hybrid) | FastAPI + profile routing |
| Open WebUI | 3000 | Browser chat interface (optional) | Web-based UI |
Knowledge & Retrieval
| Service | Port | Purpose | Technology |
|---|---|---|---|
| AIDB | 8002 | Knowledge base + tool discovery | PostgreSQL + MCP |
| Hybrid Coordinator | 8003 | Query routing, context augmentation, learning | Python + Qdrant |
| Ralph Wiggum | 8004 | Autonomous loop orchestrator | Task queuing + checkpoints |
| Qdrant | 6333 | Vector search | Qdrant HTTP API |
Data & Monitoring
| Service | Port | Purpose |
|---|---|---|
| PostgreSQL | 5432 | Relational DB (AIDB, memory, learning) |
| Redis | 6379 | Cache + session store |
| Prometheus | 9090 | Metrics collection |
| Command Center | 8889 | Operator dashboard + APIs |
Hardware-Aware Scaling
Services automatically adapt to detected hardware tier:
| Tier | RAM | Concurrency | Model Quantization |
|---|---|---|---|
| nano | <2G | 1 | Q2_K (2-bit) |
| micro | 2-7G | 2 | Q4_K_M |
| small | 8-15G | 4 | Q4_K_M |
| medium | 16-31G | 8 | Q8_0 |
| large | ≥32G | 16 | fp16 |
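The thresholds above can be expressed as a small lookup function. A sketch mirroring the table (the `Tier` dataclass and function name are illustrative, not part of the harness):

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    concurrency: int
    quantization: str

def select_tier(ram_gb: float) -> Tier:
    """Map detected system RAM to a hardware tier, per the table above."""
    if ram_gb < 2:
        return Tier("nano", 1, "Q2_K")
    if ram_gb < 8:
        return Tier("micro", 2, "Q4_K_M")
    if ram_gb < 16:
        return Tier("small", 4, "Q4_K_M")
    if ram_gb < 32:
        return Tier("medium", 8, "Q8_0")
    return Tier("large", 16, "fp16")
```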
CLI Tools
Primary Orchestration: aqd
The main workflow CLI for agent orchestration, skill management, and QA.
# Workflow orchestration
aqd workflows list # List available workflows
aqd workflows project-init --target <dir> # Scaffold new project
aqd workflows brownfield --target <repo> # Improve existing project
aqd workflows primer --target <repo> # Read-only session context
# Skill management
aqd skill validate # Validate skill format
aqd skill init <name> # Create new skill
aqd skill package <path> # Package for distribution
# MCP server management
aqd mcp scaffold <name> [--type python|deno] # Create MCP server
aqd mcp validate <name> # Validate server
aqd mcp test <name> # Run server tests
# Quality assurance
aqd parity advanced-suite # Full QA suite
aqd parity regression-gate --online # Regression testing
aqd parity chaos-smoke # Chaos injection
Core AI Tools (aq-*)
| Tool | Purpose |
|---|---|
| `aq-hints` | Query ranked workflow hints for a task |
| `aq-qa` | Run QA validation suites |
| `aq-report` | Generate system health reports |
| `aq-context-bootstrap` | Classify task and suggest entry points |
| `aq-context-card` | Generate progressive disclosure context |
| `aq-capability-gap` | Detect missing CLI tools or MCP servers |
| `aq-capability-plan` | Generate implementation plan for gaps |
| `aq-capability-remediate` | Auto-apply fixes for gaps |
| `aq-gap-import` | Import external docs into AIDB |
| `aq-llama-debug` | Troubleshoot inference issues |
| `aq-rag-prewarm` | Pre-compute embeddings |
| `aq-autoresearch` | Self-directed autonomous research |
| `aq-patterns` | Extract reusable patterns |
| `aq-collaborate` | Multi-agent task delegation |
| `aq-meta-optimize` | Autonomous parameter tuning |
System PATH vs Repo-Local
System PATH installed (7 tools):
ls /run/current-system/sw/bin/aq-*
# aq-hints, aq-qa, aq-report, aqd, harness-rpc, project-init, workflow-primer
Repo-local scripts (44+ tools):
ls /opt/nixos-quick-deploy/scripts/ai/aq-*
Agent Integrations
Supported Agents
| Agent/IDE | Integration | Features |
|---|---|---|
| Continue.dev | MCP stdio bridge | Hints provider, tool catalog, local inference |
| Aider | MCP wrapper | Git-aware editing, PR generation |
| Claude (API) | Remote delegation | Long-form synthesis, architecture decisions |
| Codex/OpenAI | Switchboard routing | Profile-based context pruning |
| Qwen | Local/remote | Fast implementation tasks |
| Gemini | Remote routing | Research & discovery |
| Ollama | OpenAI-compatible | Additional local models |
Multi-Agent Workflow
The system supports orchestrator/sub-agent patterns:
- Orchestrator (Claude Opus, Codex):
  - Plan: decompose into discrete slices
  - Delegate: route architecture → claude, implementation → qwen
  - Review: validate evidence before acceptance
- Sub-agents (Qwen, Gemini):
  - Execute only assigned slice
  - Return evidence: files changed, commands run, tests passed
  - Never re-scope or finalize acceptance
Agent Profiles
| Profile | Use Case | Agent |
|---|---|---|
| `nixos-systems-architect` | NixOS modules, flakes, hardware | claude (sub) |
| `senior-ai-stack-dev` | AI stack, model selection, observability | claude (sub) |
| `general-coding` | Patches, test scaffolding, runtime logic | qwen |
| `research-synthesis` | Discovery, analysis, documentation | gemini |
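The table can be modelled as a simple routing map. A hypothetical sketch (profile and agent names come from the table; the function itself is illustrative, not the harness's actual API):

```python
# Profile-to-agent routing table, per the Agent Profiles section above.
PROFILE_AGENTS = {
    "nixos-systems-architect": "claude",
    "senior-ai-stack-dev": "claude",
    "general-coding": "qwen",
    "research-synthesis": "gemini",
}

def route_task(profile: str) -> str:
    """Return the agent responsible for a given profile."""
    try:
        return PROFILE_AGENTS[profile]
    except KeyError:
        raise ValueError(f"unknown profile: {profile}") from None
```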
MCP Servers
Core MCP Servers
| Server | Port | Purpose |
|---|---|---|
| hybrid-coordinator | 8003 | Context augmentation, continuous learning, query routing |
| aidb | 8002 | Knowledge base, tool discovery, document lifecycle |
| embeddings-service | 8001 | Sentence transformer API |
| ralph-wiggum | 8004 | Autonomous loop orchestration |
| aider-wrapper | — | IDE integration for git-aware editing |
| health-monitor | — | Service health tracking |
| nixos | — | NixOS option search, module queries |
| container-engine | — | OCI/Podman integration |
MCP Bridge
mcp-bridge-hybrid.py translates MCP stdio protocol to hybrid-coordinator REST API.
Exposed tools:
- `hybrid_search` – semantic search with optional LLM synthesis
- `get_hints` – workflow hints for current task
- `workflow_plan` – phased plan generation
- `workflow_run_start` – guarded workflow execution
- `store_memory` / `recall_memory` – agent memory operations
- `query_aidb` – knowledge base search
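In outline, a bridge of this kind reads a JSON-RPC `tools/call` message from stdin and forwards it to a coordinator REST route. A hypothetical sketch, not the actual `mcp-bridge-hybrid.py`: only the `/workflow/plan` and `/workflow/run/start` routes appear elsewhere in this README; the rest of the mapping is assumed:

```python
import json

# Assumed mapping from MCP tool name to hybrid-coordinator REST route.
TOOL_ROUTES = {
    "hybrid_search": "/search",
    "get_hints": "/hints",
    "workflow_plan": "/workflow/plan",
    "workflow_run_start": "/workflow/run/start",
    "store_memory": "/memory/store",
    "recall_memory": "/memory/recall",
    "query_aidb": "/aidb/query",
}

def build_request(tool_call: str, base: str = "http://127.0.0.1:8003"):
    """Translate one MCP 'tools/call' message into (url, json_body)."""
    msg = json.loads(tool_call)
    params = msg.get("params", {})
    route = TOOL_ROUTES[params["name"]]
    return base + route, params.get("arguments", {})
```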
Skills Library
25+ unified skills in .agent/skills/:
AI/Development
- `ai-stack-qa` – QA & validation workflows
- `ai-model-management` – Model lifecycle
- `nixos-deployment` – Deployment automation
- `mcp-builder` – Create MCP servers
- `skill-creator` – Create new skills
- `security-scanner` – Vulnerability scanning
- `performance-profiler` – System profiling
- `debug-workflow` – Interactive debugging
Data & Knowledge
- `aidb-knowledge` – Query AIDB knowledge base
- `rag-techniques` – RAG implementation patterns
- `xlsx` – Spreadsheet operations
- `pdf` – PDF manipulation
- `pptx` – PowerPoint operations
Design & UI
- `frontend-design` – Web UI design
- `canvas-design` – Visual art (PNG, PDF)
- `web-artifacts-builder` – React/Tailwind artifacts
- `theme-factory` – Theme generation
Operations
- `health-monitoring` – System health tracking
- `project-import` – Import external projects
- `system_bootstrap` – System initialization
Workflows
PRSI Protocol
Plan → Validate → Execute → Measure → Feedback → Compress
# Generate workflow plan
curl -X POST http://127.0.0.1:8003/workflow/plan \
-H "Content-Type: application/json" \
-d '{"q": "implement new feature"}'
# Execute with guardrails
curl -X POST http://127.0.0.1:8003/workflow/run/start \
-H "Content-Type: application/json" \
-d '{"plan_id": "...", "dry_run": false}'
Coordinator-First Prompt Routing
Continue/editor prompt ingress now targets the hybrid coordinator first. The coordinator classifies prompt intent, selects the execution lane, and then uses switchboard as the downstream execution proxy for local/remote model traffic.
Execution lanes:
| Profile | Behavior |
|---|---|
| `default` | Coordinator-selected local-first chat lane |
| `continue-local` | Short prompts, local llama.cpp |
| `remote-free` | Lightweight planning, retrieval, bounded synthesis |
| `remote-reasoning` | Architecture/policy decisions |
| `remote-coding` | Implementation via coding models |
| `remote-tool-calling` | Tool-calling oriented remote execution |
| `embedding-local` | Retrieval-only (no reasoning) |
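Lane selection amounts to classifying prompt intent. A purely illustrative heuristic sketch (the coordinator's real classifier is not shown in this README; the keywords and length cut-off here are invented):

```python
def select_lane(prompt: str) -> str:
    """Pick an execution lane for a prompt. Keywords are illustrative only."""
    text = prompt.lower()
    if any(k in text for k in ("architecture", "policy", "design decision")):
        return "remote-reasoning"
    if any(k in text for k in ("implement", "refactor", "write a function")):
        return "remote-coding"
    if any(k in text for k in ("call the tool", "run the tool")):
        return "remote-tool-calling"
    if len(text) < 200:
        return "continue-local"  # short prompts go to local llama.cpp
    return "default"             # coordinator-selected local-first lane
```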
OpenAI-compatible coordinator ingress:
- `GET /v1/models`
- `POST /v1/chat/completions`
- `POST /v1/completions`
Continuous Learning
The hybrid-coordinator implements autonomous learning:
- Interaction tracking – every query + response recorded
- Pattern extraction – identifies reusable snippets and patterns
- Quality cache – caches high-value interactions (30-50% token savings)
- Federated sync – shares patterns across agents
Autonomous Improvement
Local LLM-driven system optimization (enabled via timer):
- Analyzes system metrics (GPU util, memory, latency)
- Generates optimization hypotheses
- Executes experiments
- Validates improvements
- Records decisions
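The cycle above can be sketched as a loop skeleton with pluggable steps. Everything here is illustrative: the metric names, hypothesis format, and decision record are assumptions, not the harness's implementation:

```python
from typing import Callable

def improvement_cycle(
    read_metrics: Callable[[], dict],      # analyze system metrics
    propose: Callable[[dict], str],        # generate a hypothesis
    run_experiment: Callable[[str], dict], # execute the experiment
    improved: Callable[[dict, dict], bool],# validate against baseline
) -> dict:
    """One analyze -> hypothesize -> experiment -> validate -> record pass."""
    baseline = read_metrics()
    hypothesis = propose(baseline)
    result = run_experiment(hypothesis)
    accepted = improved(baseline, result)
    # Record the decision so it can be audited or rolled back later.
    return {
        "hypothesis": hypothesis,
        "baseline": baseline,
        "result": result,
        "accepted": accepted,
    }
```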
Configuration
System Options
mySystem = {
hostName = "nixos";
primaryUser = "user";
profile = "ai-dev"; # ai-dev, gaming, minimal
hardware = {
gpuVendor = "nvidia"; # amd, nvidia, intel, intel-arc, none
cpuVendor = "amd"; # amd, intel, arm, qualcomm, apple
systemRamGb = 32;
};
};
AI Stack Options
mySystem.aiStack = {
enable = true;
acceleration = "cuda"; # auto, vulkan, cuda, rocm, cpu
llamaCpp = {
port = 8080;
huggingFaceRepo = "Qwen/Qwen3-4B-Instruct-GGUF";
gpuLayers = 99;
};
embeddingServer = {
port = 8001;
huggingFaceRepo = "Qwen/Qwen3-Embedding-4B-GGUF";
};
switchboard = {
port = 8085;
remoteUrl = "https://openrouter.ai/api"; # optional
};
mcpServers = {
aidbPort = 8002;
hybridPort = 8003;
ralphPort = 8004;
};
};
Repository Structure
repo/
├── flake.nix                   # Nix flake entry point
├── nixos-quick-deploy.sh       # Bootstrap script
├── deploy                      # Day-2 operations CLI
├── CLAUDE.md                   # Always-read agent guidance
├── AGENTS.md                   # Agent onboarding (compact)
│
├── nix/
│   ├── modules/
│   │   ├── core/               # Base options, secrets, networking
│   │   ├── services/           # AI stack services
│   │   ├── roles/              # System profiles (ai-stack, desktop, server)
│   │   └── hardware/           # GPU/CPU/storage tuning
│   ├── hosts/                  # Host-specific configurations
│   └── home/                   # Home Manager configurations
│
├── scripts/
│   ├── ai/                     # Core harness CLI (aqd, aq-*, mcp-bridge)
│   ├── governance/             # Repo structure, validation
│   ├── health/                 # Health checks
│   └── testing/                # Test runners
│
├── config/
│   ├── service-endpoints.sh    # Port definitions
│   └── *.yaml                  # Service configurations
│
├── ai-stack/
│   ├── mcp-servers/            # MCP server implementations
│   ├── agents/                 # Agent skill definitions
│   ├── autonomous-improvement/ # Self-optimization
│   └── continue/               # Continue.dev integration
│
├── .agent/
│   ├── skills/                 # 25+ unified skills
│   ├── commands/               # Slash-command implementations
│   └── workflows/              # Workflow state
│
├── dashboard/                  # Command center (React + FastAPI)
│
└── docs/                       # Comprehensive documentation
    ├── agent-guides/           # Progressive disclosure guides
    ├── architecture/           # System design
    └── operations/             # Day-2 operations
Documentation
Getting Started
Architecture
Operations
Agent Guides
Validation
Pre-Commit
scripts/governance/tier0-validation-gate.sh --pre-commit
Pre-Deploy
scripts/governance/tier0-validation-gate.sh --pre-deploy
Health Checks
./deploy health
aq-qa 0 --json
Visual Tour
Screenshots (not reproduced here): the command center overview, host telemetry, and AI stack status panels.