Containarium
Hundreds of LXC containers living together on a single host, to save resources and simplify configuration
Installation
npx containarium
Containarium
Run hundreds of isolated development environments on a single VM. Built with Incus (LXC/QEMU), SSH jump hosts, and cloud-native automation.
- No Kubernetes
- No VM-per-user
- Fast, cheap, isolated Linux environments (Ubuntu, Rocky/RHEL 9)
- Windows Server VMs with RDP access
- GPU passthrough for ML/AI workloads
- Multi-backend: GCP spot VMs + bare-metal GPU nodes
Screenshots: Container Management, Container List View, App Hosting, Network Topology, Traffic Monitoring, Monitoring Dashboard, Alerts, Audit Logs, GPU Node (Multi-Backend), Security Scanning
Live Demo: https://containarium.kafeido.app/webui/demo
Quick Start
Web UI
Access the web-based dashboard at http://your-server:8080/webui/
Features:
- Multi-server management with tabs
- Real-time container metrics (CPU, Memory, Disk)
- Container lifecycle management
- Browser-based terminal access
- Client-side SSH key generation
Authentication
Containarium uses JWT tokens for API authentication. Tokens are generated via CLI only (not exposed via API).
Generate a token:
# On the server running containarium daemon
containarium token generate \
--username admin \
--roles admin \
--expiry 720h \
--secret-file /etc/containarium/jwt.secret
# Or with inline secret (for testing)
containarium token generate \
--username admin \
--roles admin \
--expiry 24h \
--secret 'your-jwt-secret'
Token options:
| Flag | Description |
|---|---|
| --username | Username for the token (required) |
| --roles | Comma-separated roles (default: user) |
| --expiry | Token validity (e.g., 24h, 720h, 0 for no expiry) |
| --secret | JWT secret key |
| --secret-file | Path to file containing JWT secret |
Use the token:
# REST API
curl -H "Authorization: Bearer <token>" http://localhost:8080/v1/containers
# Web UI: Click "Add Server" and paste the token
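Because the tokens are standard JWTs, their header and payload are plain base64url-encoded JSON that can be inspected offline. The sketch below is illustrative only: it decodes the header segment of the sample token used in the REST API examples in this README and does not verify the signature. The helper name `decode_segment` is hypothetical, not part of Containarium.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JWT segment into a dict (no signature check)."""
    padded = segment + "=" * (-len(segment) % 4)  # restore any stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Header segment copied from the sample token in the REST API examples.
header = decode_segment("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9")
print(header)
```

For the sample token, the header declares the HS256 algorithm (HMAC-SHA256), which signs with a shared secret. Anyone holding the JWT secret can therefore mint valid tokens, so keep it private.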
Why Containarium?
Most teams still provision one VM per developer for SSH-based development.
That approach is:
- Expensive
- Slow to provision
- Wasteful (idle CPU, memory, disk)
Containarium replaces that model with multi-tenant system containers (LXC):
One VM → many isolated Linux environments → massive cost savings
In real deployments, this reduces infrastructure costs by up to 90%.
What It Does
Containarium is a multi-backend development environment platform that:
- Hosts many isolated environments on cloud VMs and bare-metal GPU nodes
- Gives each user SSH access to their own container
- Supports multiple OS types: Ubuntu 24.04, Rocky Linux 9 (dev), RHEL 9 (production)
- Runs Windows Server VMs with RDP access via QEMU/KVM
- Provides GPU passthrough (NVIDIA) for ML/AI workloads
- Keeps containers persistent, even across VM restarts and spot preemptions
- Managed via CLI, REST API, gRPC, Web UI, and MCP (Claude Desktop)
Each container behaves like a lightweight VM:
- Full Linux OS (or Windows Server VM)
- User accounts with SSH access
- Docker/Podman support with nested containers
- GPU passthrough for CUDA workloads
- Pre-configured software stacks (Node.js, Python, Go, Rust, GPU/CUDA, Android, Docker, etc.)
Architecture (High Level)
Developer Laptop
|
| ssh / https
v
+---------------------------+
| Sentinel (e2-micro) | sshpiper + reverse proxy
+---------------------------+
| | |
v v v
+---------+ +---------+ +---------+
| GCP Spot| | GPU Node| | GPU Node|
| VM | | (tunnel)| | (tunnel)|
+---------+ +---------+ +---------+
| LXC x19 | | LXC x5 | | LXC x4 |
| Caddy | | RTX 4090| | RTX 3090|
| ZFS | | ZFS | | ZFS |
+---------+ +---------+ +---------+
Key Features
Fast Provisioning
- Create a full Linux environment in seconds
- Pre-configured stacks: Node.js, Python, Go, Rust, Data Science, DevOps, Docker, GPU/CUDA, Android, Full Stack
- Multi-OS: Ubuntu 24.04, Rocky Linux 9, RHEL 9
Strong Isolation
- Unprivileged LXC containers
- Separate users, filesystems, and processes
- SSH jump host with sshpiper (username-based routing)
- ClamAV + Trivy security scanning across all backends
Persistent Storage
- Containers survive:
  - VM restarts
  - Spot/preemptible instance termination
- Backed by ZFS persistent disks with compression
Multi-Backend Architecture
- GCP Spot VMs: Cost-effective cloud backends
- Bare-metal GPU nodes: RTX 3090, RTX 4090, etc. connected via tunnel
- Windows Server VMs: QEMU/KVM with RDP access
- Containers from all backends appear in a single unified dashboard
- Multi-pool: One sentinel can front multiple isolated clusters (e.g., containarium-prod.example.com and containarium-lab.example.com), each with its own primary VM and peers; see docs/MULTI-POOL.md
- See docs/WINDOWS-VM-SETUP.md for Windows VMs
Sentinel HA (Spot Instance Recovery)
- One tiny always-on VM (e2-micro, free tier) monitors all backends
- Detects preemption in ~10s, serves maintenance page instantly
- Restarts spot VMs automatically (~85s total recovery)
- Routes SSH via sshpiper, HTTP via reverse proxy
- See docs/SENTINEL-DESIGN.md for the full design
Monitoring & Observability
- Backend heartbeat dashboard (Grafana status-history panel)
- Per-container metrics: CPU, memory, disk, network
- VictoriaMetrics + Grafana auto-provisioned
- Custom alert rules with webhook notifications
Simple Management
- Single Go binary for all components
- Web UI with real-time updates (SSE)
- REST API + gRPC + MCP (Claude Desktop integration)
- Terraform for infrastructure provisioning
Cost Efficient
Example (illustrative):
| Setup | Monthly Cost |
|---|---|
| 50 VMs (1 per user) | $$$$ |
| 1 VM + 50 LXC containers | $$ |
Why LXC (Not Docker, Not Kubernetes)
Containarium uses LXC system containers because:
- Each container runs a full Linux OS
- Better fit for:
- SSH-based workflows
- Long-running dev environments
- "Feels like a VM" usage
This is not:
- A Kubernetes cluster
- An application container platform
It is intentionally simple.
Use Cases
- Shared developer environments (Linux + Windows)
- Education, bootcamps, workshops
- AI / ML experimentation with GPU passthrough
- Android app development (headless CI or Android Studio via VNC)
- Intern or contractor onboarding
- Cost-sensitive enterprises with SSH workflows
- Security-scanned environments (ClamAV + Trivy)
How It's Different
| Tool | What It Optimizes For |
|---|---|
| Kubernetes | Application orchestration |
| Docker | App packaging |
| Proxmox | General virtualization |
| Codespaces | Browser IDEs |
| Containarium | Cheap, fast, SSH-based dev environments |
Status
- Actively used in production (GCP + bare-metal GPU nodes)
- v0.16.1: multi-pool architecture, GPU-by-PCI, lab pool SSH
- APIs stable (protobuf-defined with gRPC-gateway)
- Contributions and feedback welcome
Getting Started
Note: Currently optimized for Linux hosts and cloud VMs.
System Requirements
Host System (runs on Ubuntu, containers can be any supported OS):
- Ubuntu 24.04 LTS (Noble) or later
- Incus 6.19 or later (required for Docker build support)
  - Ubuntu 24.04 default repos ship Incus 6.0.0, which has an AppArmor bug (CVE-2025-52881)
  - This bug breaks Docker builds in unprivileged containers
  - Solution: use the Zabbly Incus repository for the latest stable builds
- ZFS kernel module (for disk quotas)
- Kernel modules: overlay, br_netfilter, nf_nat (for Docker in containers)
Quick Incus Installation (6.19+):
# Add Zabbly repository (recommended)
curl -fsSL https://pkgs.zabbly.com/key.asc | sudo gpg --dearmor -o /usr/share/keyrings/zabbly-incus.gpg
echo 'deb [signed-by=/usr/share/keyrings/zabbly-incus.gpg] https://pkgs.zabbly.com/incus/stable noble main' | sudo tee /etc/apt/sources.list.d/zabbly-incus-stable.list
sudo apt update
sudo apt install incus incus-tools incus-client
# Verify version
incus --version # Should show 6.19 or later
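A plain string comparison would rank `6.2` above `6.19`, so a provisioning script that gates on the Incus version should compare numeric components instead. A minimal sketch, assuming the version number appears somewhere in the output of `incus --version`; the function names are hypothetical:

```python
import re

MIN_INCUS = (6, 19)  # minimum version stated in the requirements above


def parse_version(text: str) -> tuple:
    """Extract the first dotted version number, e.g. '6.19.1' -> (6, 19, 1)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def incus_is_supported(version_output: str) -> bool:
    """True when the reported version is at least MIN_INCUS."""
    return parse_version(version_output) >= MIN_INCUS


print(incus_is_supported("6.0.0"), incus_is_supported("6.19.1"))
```

In a real script the argument would come from something like `subprocess.check_output(["incus", "--version"], text=True)`.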
Quick Start
Option 1: Manual Installation (Recommended for getting started)
One-command installation on Ubuntu:
curl -fsSL https://raw.githubusercontent.com/footprintai/containarium/main/hacks/install.sh | sudo bash
This installs Containarium, Incus, and all dependencies. See hacks/README.md for details.
Option 2: Terraform Deployment (Recommended for production)
Deploy to GCE with full infrastructure automation:
cd terraform/gce
terraform init
terraform apply
See terraform/gce/README.md for configuration options.
After Installation:
- Start the daemon:
sudo systemctl start containarium
- Create containers:
# Ubuntu (default)
sudo containarium create alice --ssh-key ~/.ssh/id_ed25519.pub
# Rocky Linux 9 (dev/test)
sudo containarium create bob --ssh-key ~/.ssh/id_ed25519.pub --os-type rocky9
# With GPU and software stack
sudo containarium create ml-dev --ssh-key ~/.ssh/id_ed25519.pub --gpu 0 --stack gpu
- Connect via SSH:
ssh alice@container-ip
- Web UI:
http://your-server:8080/webui/
- REST API:
http://your-server:8080/swagger-ui/
See docs/ for detailed setup instructions.
API Access
Containarium provides two APIs for maximum flexibility:
gRPC API (Port 50051)
For programmatic access and the CLI tool. Uses mTLS for authentication.
# Start daemon with mTLS
containarium daemon --mtls
# Use CLI
containarium list
containarium create --username john
REST API (Port 8080)
For HTTP/JSON access, webhooks, and web UIs. Uses Bearer token authentication.
# Start daemon with REST API
containarium daemon --rest
# The daemon will auto-generate and display a JWT secret on startup
JWT Secret Configuration (Priority Order):
1. Environment Variable (Production - Recommended)
export CONTAINARIUM_JWT_SECRET="your-secret-key"
containarium daemon --rest
2. Secret File (Production)
# Generate secret
openssl rand -base64 32 > /etc/containarium/jwt.secret
chmod 600 /etc/containarium/jwt.secret
# Start daemon
containarium daemon --rest --jwt-secret-file /etc/containarium/jwt.secret
3. Command-line Flag (Testing)
containarium daemon --rest --jwt-secret "test-secret"
4. Auto-Generated (Development)
# Just start the daemon - it will generate and print a random secret
containarium daemon --rest
# Output includes:
# JWT Secret (Auto-Generated)
# Kx8jP7yN2wR5vT9mQ3hF6nL4sZ1aE0uC8bV5gX2wY4pM7kR=
# ...
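The priority order above can be read as a simple fall-through: the first configured source wins. The sketch below models that reading only; it is not the daemon's actual code, and `resolve_jwt_secret` and its parameters are hypothetical names.

```python
import os
import secrets
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_jwt_secret(flag_secret: Optional[str] = None,
                       secret_file: Optional[str] = None,
                       env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (secret, source) using the documented priority order."""
    env = os.environ if env is None else env
    if env.get("CONTAINARIUM_JWT_SECRET"):          # 1. environment variable
        return env["CONTAINARIUM_JWT_SECRET"], "environment"
    if secret_file:                                 # 2. secret file
        return Path(secret_file).read_text().strip(), "file"
    if flag_secret:                                 # 3. command-line flag
        return flag_secret, "flag"
    return secrets.token_urlsafe(32), "auto-generated"  # 4. random fallback


print(resolve_jwt_secret(flag_secret="test-secret", env={}))
```

Passing `env` explicitly keeps the function easy to test. Leaving it out makes the sketch read the real process environment.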
Generate API Token:
# Using generated/configured secret
TOKEN=$(containarium token generate \
--username admin \
--roles admin \
--expiry 720h \
--secret "your-jwt-secret")
Use REST API:
# Set token
export TOKEN="eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
# List containers
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:8080/v1/containers
# Create container
curl -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"username": "johndoe",
"resources": {
"cpu": "4",
"memory": "8GB",
"disk": "100GB"
},
"osType": "OS_TYPE_UBUNTU_2404",
"enablePodman": true,
"stack": "nodejs",
"async": true
}' \
http://localhost:8080/v1/containers
# Get container details
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:8080/v1/containers/johndoe
# Delete container
curl -X DELETE \
-H "Authorization: Bearer $TOKEN" \
http://localhost:8080/v1/containers/johndoe
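The same create call can be made from a script without curl. The sketch below builds the request with Python's standard library and leaves sending it as a separate step. The JSON fields and endpoint mirror the curl example above; the function name and default values are illustrative assumptions.

```python
import json
import urllib.request


def build_create_request(base_url: str, token: str, username: str,
                         cpu: str = "4", memory: str = "8GB",
                         disk: str = "100GB", stack: str = "nodejs") -> urllib.request.Request:
    """Build (but do not send) the POST /v1/containers request."""
    body = {
        "username": username,
        "resources": {"cpu": cpu, "memory": memory, "disk": disk},
        "osType": "OS_TYPE_UBUNTU_2404",
        "enablePodman": True,
        "stack": stack,
        "async": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/containers",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_create_request("http://localhost:8080", "your-jwt-token", "johndoe")
print(req.get_method(), req.full_url)
# To actually send it against a running daemon: urllib.request.urlopen(req)
```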
MCP Integration (Claude Desktop)
NEW! Control Containarium directly from Claude Desktop using natural language:
# Build and install MCP server
make build-mcp
make install-mcp
# Generate JWT token for MCP
containarium token generate \
--username mcp-client \
--roles admin \
--expiry 8760h \
--secret-file /etc/containarium/jwt.secret
Configure Claude Desktop (~/.config/claude/claude_desktop_config.json):
{
"mcpServers": {
"containarium": {
"command": "/usr/local/bin/mcp-server",
"env": {
"CONTAINARIUM_SERVER_URL": "http://localhost:8080",
"CONTAINARIUM_JWT_TOKEN": "your-jwt-token"
}
}
}
}
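If the Claude Desktop config file already lists other MCP servers, the containarium entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config has already been loaded as a dict; the `other-tool` entry and the function name are hypothetical:

```python
import json


def add_containarium_server(config: dict, server_url: str, token: str,
                            command: str = "/usr/local/bin/mcp-server") -> dict:
    """Return a copy of the config with the containarium MCP entry set."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    merged.setdefault("mcpServers", {})["containarium"] = {
        "command": command,
        "env": {
            "CONTAINARIUM_SERVER_URL": server_url,
            "CONTAINARIUM_JWT_TOKEN": token,
        },
    }
    return merged


# Hypothetical pre-existing config with one unrelated server.
existing = {"mcpServers": {"other-tool": {"command": "/usr/bin/other"}}}
updated = add_containarium_server(existing, "http://localhost:8080", "your-jwt-token")
print(json.dumps(updated, indent=2))
```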
Use Claude to manage containers:
- "Create a container for alice with 8GB memory"
- "List all running containers"
- "Show metrics for bob's container"
- "Delete charlie's container"
See docs/MCP-INTEGRATION.md for the complete guide.
Interactive API Documentation
Swagger UI is available at: http://localhost:8080/swagger-ui/
Features:
- Interactive API testing
- Complete endpoint documentation
- Request/response examples
- Built-in authentication testing
Available REST Endpoints
Containers:
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/containers | Create a new container |
| GET | /v1/containers | List all containers (all backends) |
| GET | /v1/containers/{username} | Get container details |
| DELETE | /v1/containers/{username} | Delete a container |
| POST | /v1/containers/{username}/start | Start a container |
| POST | /v1/containers/{username}/stop | Stop a container |
| PUT | /v1/containers/{username}/resize | Resize CPU/memory/disk |
| POST | /v1/containers/{username}/install-stack | Install software stack |
| POST | /v1/containers/{username}/cleanup-disk | Free disk space |
Collaborators:
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/containers/{username}/collaborators | Add collaborator |
| DELETE | /v1/containers/{username}/collaborators/{collaborator} | Remove collaborator |
| GET | /v1/containers/{username}/collaborators | List collaborators |
System & Monitoring:
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/system/info | System info (all backends) |
| GET | /v1/system/monitoring | Grafana/VictoriaMetrics URLs |
| GET | /v1/metrics | Container metrics |
| GET | /v1/backends | List backends with health status |
Security:
| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/security/clamav-summary | ClamAV scan summary |
| GET | /v1/security/clamav-reports | Scan reports |
| POST | /v1/security/clamav-scan | Trigger security scan |
Alerts:
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/alerts | Create alert rule |
| GET | /v1/alerts | List alert rules |
| PUT | /v1/system/alerting | Update webhook config |
Documentation
| Guide | Description |
|---|---|
| SENTINEL-DESIGN.md | Sentinel HA architecture |
| WINDOWS-VM-SETUP.md | Windows Server VM with RDP access |
| ANDROID-DEV-SETUP.md | Android development environment (headless + GUI) |
| KUBEFLOW-SETUP.md | Kind + Kubeflow Pipelines for ML workflows |
| CROSS-PEER-FILE-TRANSFER.md | Transfer large files between peer containers |
| ALERTING-SETUP.md | Alert rules, webhooks (Zulip/Slack), troubleshooting |
| MCP-INTEGRATION.md | Claude Desktop MCP integration |
Philosophy
Containarium follows the same principle as Footprint-AI's platform:
Do more with less compute.
- Less idle.
- Less waste.
- Less cost.
License
Apache 2.0
About Footprint-AI
Containarium is an open-source project by Footprint-AI, focused on resource-efficient computing for modern development and AI workloads.
Detailed Architecture Design
Overview
Containarium provides a multi-layer architecture combining cloud infrastructure, container management, and secure access:
+------------------------------------------------------------+
| Users (SSH / HTTP / gRPC)                                  |
+-----------------------------+------------------------------+
                              |
                              v
+------------------------------------------------------------+
| Sentinel VM (e2-micro, always-on)                          |
| • Owns static public IP                                    |
| • iptables DNAT → spot VMs (normal)                        |
| • Maintenance page + status (preemption)                   |
| • Auto-restarts spot VMs on preemption                     |
| • TLS cert sync for valid HTTPS                            |
| • Mgmt SSH: port 2222                                      |
+-----------------------------+------------------------------+
                              | VPC internal
         +--------------------+--------------------+
         v                    v                    v
+------------------+ +------------------+ +------------------+
| Spot VM 1        | | Spot VM 2        | | Spot VM N        |
| • Incus + ZFS    | | • Incus + ZFS    | | • Incus + ZFS    |
| • Caddy (TLS)    | | • Caddy (TLS)    | | • Caddy (TLS)    |
| • Containarium   | | • Containarium   | | • Containarium   |
| • No external IP | | • No external IP | | • No external IP |
+--------+---------+ +--------+---------+ +--------+---------+
         v                    v                    v
  Persistent Disk      Persistent Disk      Persistent Disk
     (ZFS pool)           (ZFS pool)           (ZFS pool)
   50 containers        50 containers        50 containers
Architecture Layers
1. Infrastructure Layer (Terraform + GCE)
- Compute: Spot instances with persistent disks
- Storage: ZFS on dedicated persistent disks (survives termination)
- Network: VPC with firewall rules, Cloud NAT for spot VM outbound
- HA: Single sentinel VM monitors multiple spot VMs, auto-restarts on preemption (~85s recovery), serves maintenance page during outage. See docs/SENTINEL-DESIGN.md
2. Container Layer (Incus + LXC)
- Runtime: Unprivileged LXC containers
- Storage: ZFS with compression (lz4) and quotas
- Network: Bridge networking with isolated namespaces
- Security: AppArmor profiles, resource limits
3. Management Layer (Containarium CLI + REST API)
- Language: Go with Protobuf contracts
- Operations: Create, delete, list, info, resize, export
- APIs:
- Local CLI (default)
- gRPC daemon with mTLS (port 50051)
- REST/HTTP API with JWT auth (port 8080)
- Interactive Swagger UI for REST API
- Automation: Automated container lifecycle
4. Access Layer (SSH)
- Jump Server: SSH bastion host
- ProxyJump: Transparent container access
- Authentication: SSH key-based only
- Isolation: Per-user containers
Component Interaction
+---------------------------------------------------------+
| User Machine                                            |
|                                                         |
|   $ ssh my-dev                                          |
|     |                                                   |
|     +-- ProxyJump via Jump Server                       |
|           |                                             |
|           +-- SSH to Container IP (10.0.3.x)            |
|                 |                                       |
|                 +-- User in isolated Ubuntu container   |
|                     with Docker installed               |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Terraform Workflow                                      |
|                                                         |
|   terraform apply                                       |
|     |                                                   |
|     +-- Create GCE instances (spot + persistent disk)   |
|     +-- Configure ZFS on persistent disk                |
|     +-- Install Incus from official repo                |
|     +-- Setup firewall rules                            |
|     +-- Optional: Deploy containarium daemon            |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Containarium CLI Workflow                               |
|                                                         |
|   containarium create alice --ssh-key ~/.ssh/alice.pub  |
|     |                                                   |
|     +-- Generate container profile (ZFS quota, limits)  |
|     +-- Launch Incus container (Ubuntu 24.04)           |
|     +-- Configure networking (get IP from pool)         |
|     +-- Inject SSH key for user                         |
|     +-- Install Docker and dev tools                    |
|     +-- Return container IP and SSH command             |
+---------------------------------------------------------+
Deployment Topologies
Single Server, no HA (20-50 users, dev/testing)
             Internet
                 |
                 v
+---------------------------------+
| GCE Spot Instance               |
| • n2-standard-8 (32GB RAM)      |
| • 100GB boot + 100GB data disk  |
| • ZFS pool on data disk         |
| • 50 containers @ 500MB each    |
+---------------------------------+
Cost: $98/month | $1.96/user
Availability: ~99% (auto-restart only, ~9min downtime on preemption)
Single Server with Sentinel HA (20-50 users, production recommended)
             Internet
                 |
                 v
+---------------------------------+
| Sentinel VM (e2-micro, free)    |  Owns static IP
| • sshpiper on :22 (SSH proxy)   |  Port 2222: management SSH
| • failtoban (brute-force ban)   |  Port 8888: binary server
| • iptables DNAT (:80,:443,etc)  |
| • TLS cert + SSH key sync       |
| • Maintenance page on preempt   |
+----------------+----------------+
                 | VPC internal
                 v
+---------------------------------+
| Spot VM (c3d-highmem-8)         |  No external IP
| • Caddy reverse proxy           |  Cloud NAT for outbound
| • Containarium daemon           |
| • 50 containers @ 500MB each    |
| • ZFS on persistent disk        |
+---------------------------------+
Cost: ~$98/month | Recovery: ~85s
Availability: ~99.5% (auto-restart + maintenance page)
Horizontal Scaling (100-250 users)
             Load Balancer
             (SSH / HTTP)
                   |
                   v
        +----------------------+
        | Sentinel VM          |  e2-micro (free tier)
        | • sshpiper on :22    |  Monitors all spot VMs
        | • DNAT (non-SSH)     |  Auto-restarts on preemption
        +----------+-----------+
                   | VPC internal
     +-------------+-------------+
     v             v             v
 Spot VM-1     Spot VM-2     Spot VM-3
(50 users)    (50 users)    (50 users)
     |             |             |
     v             v             v
Persistent-1  Persistent-2  Persistent-3
(500GB ZFS)   (500GB ZFS)   (500GB ZFS)
Cost: ~$312/month | $2.08/user (150 users)
Sentinel VM: free (e2-micro free tier)
Availability: ~99.5% (auto-restart + maintenance page per spot VM)
One sentinel monitors all spot VMs in the cluster. Each spot VM is monitored independently: if one is preempted, the sentinel auto-restarts it while the others continue serving. The sentinel owns the static public IP and routes traffic to the appropriate spot VM.
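To put the availability figures in perspective, they can be converted into monthly downtime budgets. This is plain arithmetic under a 30-day-month assumption, not measured data: 99% allows about 432 minutes of downtime per month, 99.5% allows about 216 minutes, and roughly 152 recoveries of ~85s each would fit inside the 99.5% budget. The function names below are illustrative.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per month allowed at a given availability."""
    return MINUTES_PER_MONTH * (1 - availability_pct / 100)


def recoveries_within_budget(availability_pct: float, recovery_seconds: float) -> int:
    """How many recoveries of a given length fit inside the monthly budget."""
    return int(downtime_budget_minutes(availability_pct) * 60 // recovery_seconds)


for pct in (99.0, 99.5):
    print(f"{pct}% -> {downtime_budget_minutes(pct):.0f} min/month")
print(recoveries_within_budget(99.5, 85), "recoveries of ~85s fit in the 99.5% budget")
```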
Data Flow
Container Creation Flow
1. User: containarium create alice --ssh-key alice.pub
2. CLI: Read SSH public key from file
3. CLI: Call Incus API to launch container
4. Incus: Pull Ubuntu 24.04 image (cached after first use)
5. Incus: Create ZFS dataset with quota (default 20GB)
6. Incus: Assign IP from pool (10.0.3.x)
7. CLI: Wait for container network ready
8. CLI: Inject SSH key into container
9. CLI: Install Docker and dev tools
10. CLI: Return IP and connection info
SSH Connection Flow
Containarium supports two SSH connection methods. The choice depends on whether the sentinel can reach the container's IP directly.
Method 1: Direct via sshpiper (recommended)
The simplest setup: sshpiper on the sentinel routes by username, and containarium-shell on the backend host proxies into the container. This works regardless of network topology, since the sentinel only needs to reach the backend host, not the container IP:
~/.ssh/config:
Host my-dev
HostName containarium.example.com ← sentinel address
User alice ← username = routing key
IdentityFile ~/.ssh/containarium
Flow:
1. User: ssh my-dev
2. SSH: Connect to sentinel:22 (sshpiper)
3. sshpiper: Match username "alice", route to backend host
4. Host sshd: Authenticate sentinel upstream key
5. containarium-shell: sudo incus exec alice-container -- su -l alice
6. User: Interactive shell in container
(If auth fails 20x → sshpiper bans client IP for 1h)
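Since Method 1 needs only one Host entry per user, an admin could generate the stanza instead of writing it by hand. A minimal sketch that renders the entry shown above; the function name is hypothetical, and the default key path is the one used in this section's example:

```python
def render_ssh_config(alias: str, sentinel_host: str, username: str,
                      identity_file: str = "~/.ssh/containarium") -> str:
    """Render a Method 1 (direct via sshpiper) Host entry for ~/.ssh/config."""
    lines = [
        f"Host {alias}",
        f"    HostName {sentinel_host}",   # sentinel address
        f"    User {username}",            # username doubles as the routing key
        f"    IdentityFile {identity_file}",
    ]
    return "\n".join(lines) + "\n"


print(render_ssh_config("my-dev", "containarium.example.com", "alice"))
```

The user appends the result to `~/.ssh/config` and connects with `ssh my-dev`.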
Method 2: ProxyJump with container IP
Uses the sentinel as a TCP tunnel to reach the container's sshd directly. Requires that the container IP (on the Incus bridge) is routable from the sentinel, which is only the case when the backend host is on the same network (e.g., same VPC). Does not work when the backend is behind a firewall, NAT, or connected via tunnel:
~/.ssh/config:
Host containarium-jump
HostName containarium.example.com
User alice
IdentityFile ~/.ssh/containarium
Host my-dev
HostName 10.0.3.100 ← container IP on Incus bridge
User alice
IdentityFile ~/.ssh/containarium
ProxyJump containarium-jump
Flow:
1. User: ssh my-dev
2. SSH: ProxyJump through sentinel:22 (sshpiper)
3. sshpiper: TCP-forward to backend host
4. Backend host: Forward TCP to container IP (10.0.3.100:22)
5. Container sshd: Authenticate alice's key
6. User: Shell access in container
Method Comparison:
| | Method 1 (Direct) | Method 2 (ProxyJump) |
|---|---|---|
| Config complexity | Simple (1 Host entry) | Requires jump host + container IP |
| Same-network backends | Yes | Yes |
| Firewalled/NAT backends | Yes | No (container IP not routable) |
| ssh host "command" | Interactive shell only | Full command execution |
| Container IP needed | No | Yes |
Security Architecture:
- Separate accounts: Each user has their own account on the backend host
- containarium-shell: Login shell proxies into the user's container via incus exec (no host shell access)
- Same key: Users use one key for both sentinel auth and container access
- Admin isolation: Only admin can access host shell directly
- Audit trail: Each user's connections logged separately
- Brute-force protection: sshpiper failtoban bans IPs after 3 failed auth attempts for 1h
- Zero trust: Users cannot see other containers or inspect the host system
Spot Instance Recovery Flow
1. GCE: Spot instance terminated (preemption)
2. GCE: Persistent disk detached (data preserved)
3. GCE: Instance restarts (within 5 minutes)
4. Startup: Mount persistent disk to /var/lib/incus
5. Startup: Import existing ZFS pool (incus-pool)
6. Incus: Auto-start containers (boot.autostart=true)
7. Total downtime: 2-5 minutes
8. Data: 100% preserved
Use Cases
- Development Teams: Isolated dev environments for each developer (100+ users)
- Training & Education: Spin up temporary environments for students
- CI/CD Runners: Ephemeral build and test environments
- Testing: Isolated test environments with Docker support
- Multi-Tenancy: Safe isolation between users, teams, or projects
Cost Comparison
| Users | Traditional VMs | Containarium | Savings |
|---|---|---|---|
| 50 | $1,250/mo | $98/mo | 92% |
| 150 | $3,750/mo | $312/mo | 92% |
| 250 | $6,250/mo | $508/mo | 92% |
How?
- LXC containers: 10x more density than VMs
- Spot instances: 76% cheaper than regular VMs
- Persistent disks: Survive spot termination
- Single infrastructure: No VM-per-user overhead
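The savings column can be re-derived from the two cost columns, which also exposes the implied per-user prices. The sketch below uses only the figures from the table above; the traditional-VM column works out to $25 per user per month, and all three rows round to 92%.

```python
# (users, traditional $/month, Containarium $/month), copied from the table above
ROWS = [(50, 1250, 98), (150, 3750, 312), (250, 6250, 508)]


def savings_pct(traditional: float, containarium: float) -> float:
    """Percentage saved relative to the traditional VM cost."""
    return (1 - containarium / traditional) * 100


for users, traditional, containarium in ROWS:
    print(f"{users} users: ${traditional / users:.2f} -> "
          f"${containarium / users:.2f} per user, "
          f"{savings_pct(traditional, containarium):.1f}% saved")
```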
Quick Start
1. Deploy Infrastructure
Choose your deployment size:
Small Team (20-50 users):
cd terraform/gce
cp examples/single-server-spot.tfvars terraform.tfvars
vim terraform.tfvars # Add your project_id and SSH keys
terraform init
terraform apply
Medium Team (100-150 users):
cp examples/horizontal-scaling-3-servers.tfvars terraform.tfvars
vim terraform.tfvars # Configure
terraform apply
Large Team (200-250 users):
cp examples/horizontal-scaling-5-servers.tfvars terraform.tfvars
terraform apply
2. Build and Deploy CLI
Option A: Deploy for Local Mode (SSH to server)
# Build containarium CLI for Linux
make build-linux
# Copy to jump server(s)
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium
Option B: Setup for Remote Mode (Run from anywhere)
# Build containarium for your platform
make build # macOS/Linux on your laptop
# Deploy binary to server
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium
# Install systemd service (generates JWT secret, writes service file, starts daemon)
sudo containarium service install
# The daemon auto-detects PostgreSQL and Caddy from Incus containers,
# and loads persisted config (base-domain, ports) from PostgreSQL.
# After VM recreation, just re-run the two commands above.
3. Create Containers
Option A: Local Mode (SSH to server)
# SSH to jump server
ssh admin@<jump-server-ip>
# Create container for a user
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
# Output:
# Creating container for user: alice
# [1/7] Creating container...
# [2/7] Starting container...
# [3/7] Creating jump server account (proxy-only)...
# Jump server account created: alice (no shell access, proxy-only)
# [4/7] Waiting for network...
# Container IP: 10.0.3.100
# [5/7] Installing Docker, SSH, and tools...
# [6/7] Creating user: alice...
# [7/7] Adding SSH keys (including jump server key for ProxyJump)...
# Container alice-container created successfully!
#
# Container Details:
# Name: alice-container
# User: alice
# IP: 10.0.3.100
# Disk: 50GB
# Auto-start: enabled
#
# Jump Server Account (Secure Multi-Tenant):
# Username: alice
# Shell: /usr/sbin/nologin (proxy-only, no shell access)
# SSH ProxyJump: enabled
#
# SSH Access (via ProxyJump):
# ssh alice-dev # (after SSH config setup)
# List containers
sudo containarium list
# +------------------+---------+----------------------+------+-----------+
# | NAME | STATE | IPV4 | TYPE | SNAPSHOTS |
# +------------------+---------+----------------------+------+-----------+
# | alice-container | RUNNING | 10.0.3.100 (eth0) | C | 0 |
# +------------------+---------+----------------------+------+-----------+
Option B: Remote Mode (from your laptop)
# No SSH required - direct gRPC call with mTLS
containarium create alice --ssh-key ~/.ssh/alice.pub \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs \
--cpu 4 --memory 8GB -v
# List containers remotely
containarium list \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
# Export SSH config remotely (run on server)
ssh admin@<jump-server-ip>
sudo containarium export alice --jump-ip 35.229.246.67 >> ~/.ssh/config
4. Setup SSH Keys for Users
Each user needs their own SSH key pair for container access.
User generates SSH key (on their local machine):
# Generate new SSH key pair
ssh-keygen -t ed25519 -C "alice@company.com" -f ~/.ssh/containarium_alice
# Output:
# ~/.ssh/containarium_alice (private key - keep secret!)
# ~/.ssh/containarium_alice.pub (public key - share with admin)
Admin creates container with user's public key:
# User sends their public key to admin
# Admin receives: alice_id_ed25519.pub
# SSH to jump server
ssh admin@<jump-server-ip>
# Create container with user's public key
sudo containarium create alice --ssh-key /path/to/alice_id_ed25519.pub
# Or if key is on admin's local machine, copy it first:
scp alice_id_ed25519.pub admin@<jump-server-ip>:/tmp/
ssh admin@<jump-server-ip>
sudo containarium create alice --ssh-key /tmp/alice_id_ed25519.pub
5. User Access (SSH ProxyJump) - Secure Multi-Tenant Architecture
Containarium implements a secure proxy-only jump server architecture:
Security Model
- Each user has a separate jump server account with /usr/sbin/nologin shell
- Jump server accounts are proxy-only (no direct shell access)
- SSH ProxyJump works transparently through the jump server
- Users cannot access jump server data or see other users
- Automatic jump server account creation when container is created
- Jump server accounts deleted when container is deleted
Architecture Flow
User's Laptop              Jump Server                Container
     |                          |                         |
     | SSH to alice-jump        |                         |
     |------------------------->| (alice account:         |
     | (ProxyJump)              |  /usr/sbin/nologin)     |
     |                          |  +-> Blocks shell       |
     |                          |  +-> Allows proxy       |
     |                          |        |                |
     |                          |        | SSH forward    |
     |                          |        +--------------->|
     |                                                    |
     | Direct SSH to container (10.0.3.100)               |
     |<---------------------------------------------------|
Users configure SSH on their local machine:
Add to ~/.ssh/config:
# Jump server (proxy-only account - NO shell access)
Host containarium-jump
HostName <jump-server-ip>
User alice # Each user has their own jump account
IdentityFile ~/.ssh/containarium_alice
# Your dev container
Host alice-dev
HostName 10.0.3.100
User alice
IdentityFile ~/.ssh/containarium_alice
ProxyJump containarium-jump
StrictHostKeyChecking accept-new
Test the setup:
# This will FAIL (proxy-only account - no shell)
ssh containarium-jump
# Output: "This account is currently not available."
# This WORKS (ProxyJump to container)
ssh alice-dev
# Output: alice@alice-container:~$
Connect:
ssh alice-dev
# Alice is now in her Ubuntu container with Docker!
# First connection will ask to verify host key:
# The authenticity of host '10.0.3.100 (<no hostip for proxy command>)' can't be established.
# ED25519 key fingerprint is SHA256:...
# Are you sure you want to continue connecting (yes/no)? yes
6. Managing SSH Keys in Containers
Add Additional SSH Keys (After Container Creation)
# Method 1: Using incus exec
sudo incus exec alice-container -- bash -c "echo 'ssh-ed25519 AAAA...' >> /home/alice/.ssh/authorized_keys"
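# Method 2: Using incus file pull/push (a bare push would overwrite the existing keys)
sudo incus file pull alice-container/home/alice/.ssh/authorized_keys /tmp/authorized_keys
echo 'ssh-ed25519 AAAA...' | sudo tee -a /tmp/authorized_keys > /dev/null
sudo incus file push /tmp/authorized_keys alice-container/home/alice/.ssh/authorized_keys --mode 0600 --uid 1000 --gid 1000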
# Method 3: SSH into container and add manually
ssh alice@10.0.3.100 # (from jump server)
echo 'ssh-ed25519 AAAA...' >> ~/.ssh/authorized_keys
Replace SSH Key
# Overwrite authorized_keys with new key
echo 'ssh-ed25519 NEW_KEY_AAAA...' | sudo incus exec alice-container -- \
tee /home/alice/.ssh/authorized_keys > /dev/null
# Set correct permissions
sudo incus exec alice-container -- chown alice:alice /home/alice/.ssh/authorized_keys
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys
Remove SSH Key
# Edit authorized_keys file
sudo incus exec alice-container -- bash -c \
"sed -i '/alice@old-laptop/d' /home/alice/.ssh/authorized_keys"
View Current SSH Keys
# List all authorized keys for a user
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys
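The append and sed commands above are not idempotent: repeating an append duplicates the key, and a sed pattern such as alice@old-laptop also deletes a key commented alice@old-laptop-2. A minimal sketch of safer helpers follows. The function names are hypothetical and not part of Containarium; they work on any authorized_keys-style file path.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Containarium) for editing an
# authorized_keys-style file.

# add_key_once FILE KEY: append KEY unless an identical line already exists.
add_key_once() {
  local file="$1" key="$2"
  touch "$file" && chmod 600 "$file"
  # -q quiet, -x whole-line match, -F fixed string (no regex)
  if grep -qxF -- "$key" "$file"; then
    echo "key already present"
  else
    printf '%s\n' "$key" >> "$file"
    echo "key added"
  fi
}

# remove_key_by_comment FILE COMMENT: drop keys whose last field equals
# COMMENT exactly, so "alice@old-laptop" does not also match
# "alice@old-laptop-2".
remove_key_by_comment() {
  local file="$1" comment="$2" tmp
  tmp="$(mktemp)"
  awk -v c="$comment" '$NF != c' "$file" > "$tmp"
  cat "$tmp" > "$file"   # rewrite in place to keep owner and mode
  rm -f "$tmp"
}
```

Because the functions take a local path, they would be run inside the container (for example from a shell opened with incus exec) against /home/alice/.ssh/authorized_keys.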
🏗️ Project Structure
Containarium/
├── proto/                          # Protobuf contracts (type-safe)
│   └── containarium/v1/
│       ├── container.proto         # Container operations
│       └── config.proto            # System configuration
│
├── cmd/containarium/               # CLI entry point
├── internal/
│   ├── cmd/                        # CLI commands (create, list, delete, info)
│   ├── container/                  # Container management logic
│   ├── incus/                      # Incus API wrapper
│   └── ssh/                        # SSH key management
│
├── terraform/
│   ├── gce/                        # GCP deployment
│   │   ├── main.tf                 # Main infrastructure
│   │   ├── horizontal-scaling.tf   # Multi-server setup
│   │   ├── spot-instance.tf        # Spot VM + persistent disk
│   │   ├── examples/               # Ready-to-use configurations
│   │   └── scripts/                # Startup scripts
│   └── embed/                      # Terraform file embedding for tests
│       ├── terraform.go            # go:embed declarations
│       └── README.md               # Embedding documentation
│
├── test/integration/               # E2E tests
│   ├── e2e_terraform_test.go       # Terraform-based E2E tests
│   ├── e2e_reboot_test.go          # gcloud-based E2E tests
│   ├── TERRAFORM-E2E.md            # Terraform testing guide
│   └── E2E-README.md               # gcloud testing guide
│
├── docs/                           # Documentation
│   ├── HORIZONTAL-SCALING-QUICKSTART.md
│   ├── SSH-JUMP-SERVER-SETUP.md
│   └── SPOT-INSTANCES-AND-SCALING.md
│
├── Makefile                        # Build automation
└── IMPLEMENTATION-PLAN.md          # Detailed roadmap
🛠️ Development
Build Commands
# Show all commands
make help
# Build for current platform
make build
# Build for Linux (deployment)
make build-linux
# Generate protobuf code
make proto
# Run tests
make test
# Run E2E tests (requires GCP credentials)
export GCP_PROJECT=your-project-id
make test-e2e
# Lint and format
make lint fmt
Local Testing
# Build and run locally
make run-local
# Test commands
./bin/containarium create alice
./bin/containarium list
./bin/containarium info alice
🧪 Testing Architecture
Containarium uses a comprehensive testing strategy with real infrastructure validation:
E2E Testing with Terraform
The E2E test suite leverages the same Terraform configuration used for production deployments:
test/integration/
├── e2e_terraform_test.go    # Terraform-based E2E tests
├── e2e_reboot_test.go       # Alternative gcloud-based tests
├── TERRAFORM-E2E.md         # Terraform E2E documentation
└── E2E-README.md            # gcloud E2E documentation
terraform/embed/
├── terraform.go             # Embeds Terraform files (go:embed)
└── README.md                # Embedding documentation
Key Features:
- ✓ go:embed Integration: Terraform files embedded in test binary for portability
- ✓ ZFS Persistence: Verifies data survives spot instance reboots
- ✓ No Hardcoded Values: All configuration from Terraform outputs
- ✓ Reproducible: Same Terraform config as production
- ✓ Automatic Cleanup: Infrastructure destroyed after tests
Running E2E Tests:
# Set GCP project
export GCP_PROJECT=your-gcp-project-id
# Run full E2E test (25-30 min)
make test-e2e
# Test workflow:
# 1. Deploy infrastructure with Terraform
# 2. Wait for instance ready
# 3. Verify ZFS setup
# 4. Create container with test data
# 5. Reboot instance (stop/start)
# 6. Verify data persisted
# 7. Cleanup infrastructure
Test Reports:
- Creates temporary Terraform workspace
- Verifies ZFS pool status
- Validates container quota enforcement
- Confirms data persistence across reboots
See test/integration/TERRAFORM-E2E.md for detailed documentation.
Security: Audit Logging & Intrusion Prevention
Audit Logging
With separate user accounts, every SSH connection is logged with the actual username:
SSH Audit Logs (/var/log/auth.log):
# Alice connects to her container
Jan 10 14:23:15 jump-server sshd[12345]: Accepted publickey for alice from 203.0.113.10
Jan 10 14:23:15 jump-server sshd[12345]: pam_unix(sshd:session): session opened for user alice
# Bob connects to his container
Jan 10 14:25:32 jump-server sshd[12346]: Accepted publickey for bob from 203.0.113.11
Jan 10 14:25:32 jump-server sshd[12346]: pam_unix(sshd:session): session opened for user bob
# Failed login attempt
Jan 10 14:30:01 jump-server sshd[12347]: Failed publickey for charlie from 203.0.113.12
Jan 10 14:30:05 jump-server sshd[12348]: Failed publickey for charlie from 203.0.113.12
Jan 10 14:30:09 jump-server sshd[12349]: Failed publickey for charlie from 203.0.113.12
View Audit Logs:
# SSH to jump server as admin
ssh admin@<jump-server-ip>
# View all SSH connections
sudo journalctl -u sshd -f
# View connections for specific user
sudo journalctl -u sshd | grep "for alice"
# View failed login attempts
sudo journalctl -u sshd | grep "Failed"
# View connections from specific IP
sudo journalctl -u sshd | grep "from 203.0.113.10"
# Export logs for security audit
sudo journalctl -u sshd --since "2025-01-01" --until "2025-02-01" > ssh-audit-jan-2025.log
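The grep filters above list raw lines. To see at a glance which sources are generating failures, the same log stream can be aggregated per IP. The sketch below assumes the sshd line format shown in the sample log ("Failed publickey for <user> from <ip>"); the function name failed_by_ip is made up for this example.

```shell
#!/usr/bin/env bash
# failed_by_ip: read sshd log lines on stdin and print "<count> <ip>" for
# failed public-key attempts, most active source first.
failed_by_ip() {
  awk '/Failed publickey for/ {
         # The source address is the token right after the word "from".
         for (i = 1; i < NF; i++) if ($i == "from") { n[$(i + 1)]++; break }
       }
       END { for (ip in n) print n[ip], ip }' | sort -k1,1nr -k2,2
}
```

It would typically be fed from the journal, for example: sudo journalctl -u sshd --since today | failed_by_ip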
Brute-Force Protection
With Sentinel HA: sshpiper failtoban (recommended)
When using the sentinel architecture, SSH brute-force protection is handled by sshpiper's built-in failtoban plugin on the sentinel VM:
- sshpiper sits on port 22 and sees real client IPs (not the sentinel's IP)
- After 3 failed SSH auth attempts, the client IP is banned for 1 hour
- No configuration needed: the startup script sets it up automatically
# Check sshpiper status (on sentinel, port 2222)
gcloud compute ssh <sentinel-vm> --tunnel-through-iap --ssh-flag="-p 2222"
systemctl status sshpiper
journalctl -u sshpiper | grep "banned"
Why not iptables DNAT + fail2ban? The previous approach forwarded port 22 via iptables DNAT with MASQUERADE. The spot VM only saw connections from the sentinel's IP, so fail2ban would ban the sentinel itself and block all SSH users. sshpiper operates at the SSH protocol level (L7), so it identifies individual attackers correctly.
Without Sentinel: fail2ban (single VM)
fail2ban Configuration
Automatically block brute force attacks and unauthorized access attempts:
Install fail2ban (added to startup script):
# Automatically installed by Terraform startup script
sudo apt install -y fail2ban
Configure fail2ban for SSH (/etc/fail2ban/jail.d/sshd.conf):
[sshd]
enabled = true
port = 22
filter = sshd
logpath = /var/log/auth.log
# Block after 3 failed attempts
maxretry = 3
# Within 10 minutes
findtime = 600
# Ban for 1 hour
bantime = 3600
banaction = iptables-multiport
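To get a feel for how maxretry and findtime interact, the following sketch replays a list of failure events and reports which IPs would cross the threshold. It is a simplified model for tuning, not fail2ban's actual implementation; the function name would_ban and the "<epoch-seconds> <ip>" input format are assumptions made for the example.

```shell
#!/usr/bin/env bash
# would_ban MAXRETRY FINDTIME: read "<epoch-seconds> <ip>" failure events
# (oldest first) on stdin and print each IP once it accumulates MAXRETRY
# failures within a FINDTIME-second window.
would_ban() {
  awk -v maxretry="$1" -v findtime="$2" '
    {
      t = $1; ip = $2
      if (ip in banned) next
      cnt[ip]++
      times[ip, cnt[ip]] = t
      # Compare with the failure that happened MAXRETRY-1 events earlier.
      first = cnt[ip] - maxretry + 1
      if (first >= 1 && t - times[ip, first] <= findtime) {
        print ip
        banned[ip] = 1
      }
    }'
}
```

With the values from the jail above (would_ban 3 600), three failures spread over 20 minutes do not trigger a ban, while three failures within a few minutes do.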
Monitor fail2ban:
# Check fail2ban status
sudo fail2ban-client status
# Check SSH jail status
sudo fail2ban-client status sshd
# Output:
# Status for the jail: sshd
# |- Filter
# | |- Currently failed: 2
# | |- Total failed: 15
# | `- File list: /var/log/auth.log
# `- Actions
# |- Currently banned: 1
# |- Total banned: 3
# `- Banned IP list: 203.0.113.12
# View banned IPs
sudo fail2ban-client get sshd banip
# Unban IP manually (if needed)
sudo fail2ban-client set sshd unbanip 203.0.113.12
fail2ban Logs:
# View fail2ban activity
sudo tail -f /var/log/fail2ban.log
# Example output:
# 2025-01-10 14:30:15,123 fail2ban.filter [12345]: INFO [sshd] Found 203.0.113.12 - 2025-01-10 14:30:09
# 2025-01-10 14:30:20,456 fail2ban.actions [12346]: NOTICE [sshd] Ban 203.0.113.12
# 2025-01-10 15:30:20,789 fail2ban.actions [12347]: NOTICE [sshd] Unban 203.0.113.12
Security Monitoring Dashboard
Create monitoring script (/usr/local/bin/security-monitor.sh):
#!/bin/bash
echo "=== Containarium Security Status ==="
echo ""
echo "Active SSH Sessions:"
who
echo ""
echo "🚫 Banned IPs (fail2ban):"
sudo fail2ban-client status sshd | grep "Banned IP"
echo ""
echo "⚠️ Recent Failed Login Attempts:"
sudo journalctl -u sshd --since "1 hour ago" | grep "Failed" | tail -10
echo ""
echo "✅ Successful Logins (last hour):"
sudo journalctl -u sshd --since "1 hour ago" | grep "Accepted publickey" | tail -10
echo ""
echo "👥 Unique Users Connected Today:"
sudo journalctl -u sshd --since "today" | grep "Accepted publickey" | \
awk '{print $9}' | sort -u
Run monitoring:
# Make executable
sudo chmod +x /usr/local/bin/security-monitor.sh
# Run manually
sudo /usr/local/bin/security-monitor.sh
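# Add to root's crontab for daily reports (keeps existing entries instead of replacing the whole crontab)
(sudo crontab -l 2>/dev/null; echo "0 9 * * * /usr/local/bin/security-monitor.sh | mail -s 'Daily Security Report' admin@company.com") | sudo crontab -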
Per-User Connection Tracking
Because each user has their own jump server account, connections can be attributed per user:
# Count connections per user
sudo journalctl -u sshd --since "today" | grep "Accepted publickey" | \
awk '{print $9}' | sort | uniq -c | sort -rn
# Output:
# 45 alice
# 32 bob
# 18 charlie
# 5 david
# View all of Alice's connections
sudo journalctl -u sshd | grep "for alice" | grep "Accepted publickey"
# Find when Bob last connected
sudo journalctl -u sshd | grep "for bob" | grep "Accepted publickey" | tail -1
DDoS Protection Benefits
With separate accounts, abuse from one user's machine is contained and traceable:
Scenario: Alice's laptop is compromised and spams connections
# fail2ban detects excessive failed attempts from Alice's IP
2025-01-10 15:00:00 fail2ban.filter [12345]: INFO [sshd] Found 203.0.113.10 - 2025-01-10 15:00:00
2025-01-10 15:00:05 fail2ban.filter [12346]: INFO [sshd] Found 203.0.113.10 - 2025-01-10 15:00:05
2025-01-10 15:00:10 fail2ban.filter [12347]: INFO [sshd] Found 203.0.113.10 - 2025-01-10 15:00:10
2025-01-10 15:00:15 fail2ban.actions [12348]: NOTICE [sshd] Ban 203.0.113.10
# Result:
# ✅ Alice's IP is banned (her laptop is blocked)
# ✅ Bob, Charlie, and other users are NOT affected
# ✅ Service continues for everyone else
# ✅ Admin can investigate Alice's account specifically
Without separate accounts (everyone uses 'admin'):
# ❌ Can't tell which user is causing the issue
# ❌ Banning the IP might affect legitimate users behind NAT
# ❌ No per-user accountability
Compliance & Security Audits
Export security logs for compliance:
# Export all SSH activity for user 'alice' in January
sudo journalctl -u sshd --since "2025-01-01" --until "2025-02-01" | \
grep "for alice" > alice-ssh-audit-jan-2025.log
# Export all failed login attempts
sudo journalctl -u sshd --since "2025-01-01" --until "2025-02-01" | \
grep "Failed" > failed-logins-jan-2025.log
# Export fail2ban bans for January (read from the fail2ban log, including rotated files)
sudo zgrep -h "\[sshd\] Ban " /var/log/fail2ban.log* | grep "^2025-01" > ban-history-jan-2025.log
Best Practices
- Regular Log Reviews: Check logs weekly for suspicious activity
- fail2ban Tuning: Adjust maxretry and bantime based on your security needs
- Alert on Anomalies: Set up alerts for unusual patterns (100+ connections from one user)
- Log Retention: Keep logs for at least 90 days for compliance
- Separate Admin Access: Never use user accounts for admin tasks
- Monitor fail2ban: Ensure fail2ban service is always running
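The "Alert on Anomalies" practice can be approximated with a small filter over the sshd log. This is only a sketch: the function name users_over_threshold is hypothetical, and it assumes the default journal line layout used elsewhere in this section, where the username is the ninth field.

```shell
#!/usr/bin/env bash
# users_over_threshold N: read sshd log lines on stdin and print
# "<user> <count>" for every user with more than N accepted logins.
users_over_threshold() {
  awk -v limit="$1" '
    /Accepted publickey for/ { seen[$9]++ }
    END { for (u in seen) if (seen[u] > limit) print u, seen[u] }' | sort
}
```

For example, sudo journalctl -u sshd --since today | users_over_threshold 100 prints nothing on a normal day and one line per noisy account otherwise, which makes it easy to wire into a cron mail or chat webhook.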
👥 User Onboarding Workflow
Complete end-to-end workflow for adding a new user:
Step 1: User Generates SSH Key Pair
User (on their local machine):
# Generate SSH key pair
ssh-keygen -t ed25519 -C "alice@company.com" -f ~/.ssh/containarium_alice
# Output:
# Generating public/private ed25519 key pair.
# Enter passphrase (empty for no passphrase): [optional]
# Your identification has been saved in ~/.ssh/containarium_alice
# Your public key has been saved in ~/.ssh/containarium_alice.pub
# View and copy public key to send to admin
cat ~/.ssh/containarium_alice.pub
# ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL+XYZ... alice@company.com
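Users occasionally send the private key file by mistake. Before a key is forwarded or accepted, a quick shape check can catch that. The sketch below is an assumption-labeled convenience (the function name is_public_key is hypothetical); it only checks that the first line looks like "<type> <base64> [comment]" and does not verify the key itself. Running ssh-keygen -l -f on the file is a stricter check.

```shell
#!/usr/bin/env bash
# is_public_key FILE: succeed only if the first line of FILE looks like an
# OpenSSH public key ("<type> <base64> [comment]").
is_public_key() {
  head -n 1 "$1" | grep -Eq \
    '^(ssh-ed25519|ssh-rsa|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/]+=*( .*)?$'
}
```

Typical use: is_public_key ~/.ssh/containarium_alice.pub && echo "safe to send"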
Step 2: Admin Creates Container
Admin receives public key and creates container:
# SSH to jump server
ssh admin@<jump-server-ip>
# Save user's public key to a file on the jump server
echo 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... alice@company.com' > /tmp/alice.pub
# Create container with user's public key
# This automatically:
# 1. Creates jump server account for alice (proxy-only, no shell)
# 2. Creates alice-container with SSH access
# 3. Sets up SSH keys for both
sudo containarium create alice --ssh-key /tmp/alice.pub
# Output:
# ✓ Creating jump server account: alice (proxy-only)
# ✓ Creating container for user: alice
# ✓ Container started: alice-container
# ✓ IP Address: 10.0.3.100
# ✓ Installing Docker and dev tools
# ✓ Container alice-container created successfully!
#
# ✓ Jump server account: alice@35.229.246.67 (proxy-only, no shell)
# ✓ Container access: alice@10.0.3.100
#
# Send this to user:
# Jump Server: 35.229.246.67 (user: alice)
# Container IP: 10.0.3.100
# Username: alice
# Enable auto-start for spot instance recovery
sudo incus config set alice-container boot.autostart true
Step 3: Admin Sends Connection Info to User
Method 1: Export SSH Config (Recommended)
# Admin exports SSH configuration
sudo containarium export alice --jump-ip 35.229.246.67 --key ~/.ssh/containarium_alice > alice-ssh-config.txt
# Send alice-ssh-config.txt to user via email/Slack
Method 2: Manual SSH Config
Admin sends to user via email/Slack:
Your development container is ready!
Jump Server IP: 35.229.246.67
Your Username: alice (for both jump server and container)
Container IP: 10.0.3.100
Add this to your ~/.ssh/config:
Host containarium-jump
HostName 35.229.246.67
# Your own username
User alice
IdentityFile ~/.ssh/containarium_alice
Host alice-dev
HostName 10.0.3.100
# Same username and same key as for the jump server
User alice
IdentityFile ~/.ssh/containarium_alice
ProxyJump containarium-jump
Then connect with: ssh alice-dev
Note: Your jump server account is proxy-only (no shell access).
You can only access your container, not the jump server itself.
Step 4: User Configures SSH and Connects
User (on their local machine):
Method 1: Using Exported Config (Recommended)
# Add exported config to your SSH config
cat alice-ssh-config.txt >> ~/.ssh/config
# Connect to container
ssh alice-dev
# You're now in your container!
alice@alice-container:~$ docker run hello-world
alice@alice-container:~$ sudo apt install vim git tmux
Method 2: Manual Configuration
# Add to ~/.ssh/config
vim ~/.ssh/config
# Paste the configuration provided by admin
# Connect to container
ssh alice-dev
# First time: verify host key
# The authenticity of host '10.0.3.100' can't be established.
# ED25519 key fingerprint is SHA256:...
# Are you sure you want to continue connecting (yes/no)? yes
# You're now in your container!
alice@alice-container:~$ docker run hello-world
alice@alice-container:~$ sudo apt install vim git tmux
Step 5: User Adds Additional Devices (Optional)
User wants to access from second laptop:
# On second laptop, generate new key
ssh-keygen -t ed25519 -C "alice@home-laptop" -f ~/.ssh/containarium_alice_home
# Send new public key to admin
cat ~/.ssh/containarium_alice_home.pub
Admin adds second key:
# Add second key to container (keeps existing keys)
NEW_KEY='ssh-ed25519 AAAAC3... alice@home-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$NEW_KEY' >> /home/alice/.ssh/authorized_keys"
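Alice can now connect from both laptops, provided the new key is also authorized for her jump server account (the ProxyJump hop authenticates with the same key).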
CLI Command Reference
Containarium provides a simple, intuitive CLI for container management.
Unified Binary Architecture
Containarium uses a single binary that works as a CLI in two modes (local and remote) and also runs as the server daemon:
🖥️ Local Mode (Direct Incus Access)
# Execute directly on the jump server (requires sudo)
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
sudo containarium list
sudo containarium delete bob
- ✓ Direct Incus API access via Unix socket
- ✓ No daemon required
- ✓ Fastest execution
- ✗ Must be run on the server
- ✗ Requires sudo/root privileges
Remote Mode (gRPC + mTLS)
# Execute from anywhere (laptop, CI/CD, etc.)
containarium create alice --ssh-key ~/.ssh/alice.pub \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
containarium list --server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
- ✓ Remote execution from any machine
- ✓ Secure mTLS authentication
- ✓ No SSH required
- ✓ Perfect for automation/CI/CD
- ✗ Requires daemon running on server
- ✗ Requires certificate setup
Daemon Mode (Server Component)
# Install systemd service (writes service file, generates JWT secret, starts daemon)
sudo containarium service install
# Or manage manually
sudo systemctl start containarium
sudo systemctl status containarium
sudo journalctl -u containarium -f
- Self-bootstraps: auto-detects PostgreSQL and Caddy from Incus containers
- Persists config (base-domain, ports) in PostgreSQL, so it survives VM recreation
- Only needs --rest --jwt-secret-file in the service file; everything else is auto-detected or loaded from DB
- Listens on port 50051 (gRPC) + port 8080 (REST/HTTP)
- Automatically started via systemd
Certificate Setup for Remote Mode
Generate mTLS certificates:
# On server: Generate server and client certificates
containarium cert generate \
--server-ip 35.229.246.67 \
--output-dir /etc/containarium/certs
# Copy client certificates to local machine
scp admin@35.229.246.67:/etc/containarium/certs/{ca.crt,client.crt,client.key} \
~/.config/containarium/certs/
Verify connection:
# Test remote connection
containarium list \
--server 35.229.246.67:50051 \
--certs-dir ~/.config/containarium/certs
Basic Commands
Create Container
# Basic usage
sudo containarium create <username> --ssh-key <path-to-public-key>
# Example
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
# With custom disk quota
sudo containarium create bob --ssh-key ~/.ssh/bob.pub --disk-quota 50GB
# With a pre-configured software stack
sudo containarium create alice --ssh-key ~/.ssh/alice.pub --stack nodejs
sudo containarium create bob --ssh-key ~/.ssh/bob.pub --stack docker
# Enable auto-start on boot
sudo containarium create charlie --ssh-key ~/.ssh/charlie.pub --autostart
Output:
✓ Creating container for user: alice
✓ Launching Ubuntu 24.04 container
✓ Container started: alice-container
✓ IP Address: 10.0.3.100
✓ Installing Docker and dev tools
✓ Configuring SSH access
✓ Container alice-container created successfully!
Container Details:
Name: alice-container
User: alice
IP: 10.0.3.100
Disk Quota: 20GB (ZFS)
SSH: ssh alice@10.0.3.100
Available Software Stacks (--stack):
| Stack | Description |
|---|---|
| `nodejs` | Node.js LTS, npm, yarn, pnpm, TypeScript |
| `python` | Python 3, pip, virtualenv, poetry |
| `golang` | Go, gopls, golangci-lint |
| `rust` | Rust toolchain via rustup |
| `docker` | Docker CE, docker-compose-plugin |
| `datascience` | Python, Jupyter, pandas, numpy, scikit-learn |
| `devops` | kubectl, Terraform |
| `database` | PostgreSQL, MySQL, Redis CLI clients |
| `fullstack` | Node.js + Python + database clients |
List Containers
# List all containers
sudo containarium list
# Example output
NAME STATUS IP QUOTA AUTOSTART
alice-container Running 10.0.3.100 20GB Yes
bob-container Running 10.0.3.101 50GB Yes
charlie-container Stopped - 20GB No
Get Container Info
# Get detailed information
sudo containarium info alice
# Example output
Container: alice-container
Status: Running
User: alice
IP Address: 10.0.3.100
Disk Quota: 20GB
Disk Used: 4.2GB (21%)
Memory: 512MB / 2GB
CPU Usage: 5%
Uptime: 3 days
Auto-start: Enabled
Delete Container
# Delete container (with confirmation)
sudo containarium delete alice
# Force delete (no confirmation)
sudo containarium delete bob --force
# Delete with data backup
sudo containarium delete charlie --backup
Resize Container
Dynamically adjust container resources (CPU, memory, disk) without any downtime. All changes take effect immediately without restarting the container.
# Resize CPU only
sudo containarium resize alice --cpu 4
# Resize memory only
sudo containarium resize alice --memory 8GB
# Resize disk only
sudo containarium resize alice --disk 100GB
# Resize all three at once
sudo containarium resize alice --cpu 4 --memory 8GB --disk 100GB
# With verbose output
sudo containarium resize alice --cpu 8 --memory 16GB -v
Advanced CPU Options:
# Set specific number of cores
sudo containarium resize alice --cpu 4
# Set a CPU range (note: Incus limits.cpu interprets any range as core pinning, so this pins to cores 2 through 4)
sudo containarium resize alice --cpu 2-4
# Pin to specific CPU cores (performance)
sudo containarium resize alice --cpu 0-3
Memory Formats:
# Gigabytes
sudo containarium resize alice --memory 8GB
# Megabytes
sudo containarium resize alice --memory 4096MB
# Gibibytes (binary)
sudo containarium resize alice --memory 8GiB
Important Notes:
- CPU: Always safe to increase or decrease. Supports over-provisioning (4-8x).
- Memory: Safe to increase. Check current usage before decreasing to avoid OOM kills.
- Disk: Can only increase (cannot shrink below current usage).
- All changes are instant with no container restart required.
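The memory formats above differ in more than spelling: Incus documents GB/MB as decimal units (powers of 1000) and GiB/MiB as binary units (powers of 1024). Assuming Containarium passes the value through unchanged, 8GiB is roughly 7% larger than 8GB. A small converter makes the difference concrete; to_bytes is a hypothetical helper for illustration, not a Containarium command.

```shell
#!/usr/bin/env bash
# to_bytes SIZE: convert sizes such as 8GB, 4096MB or 8GiB to bytes.
# Decimal suffixes (MB, GB) use powers of 1000; binary ones (MiB, GiB)
# use powers of 1024.
to_bytes() {
  local num unit
  num="${1%%[A-Za-z]*}"   # leading digits
  unit="${1#"$num"}"      # remaining suffix
  case "$unit" in
    MB)  echo $(( num * 1000 * 1000 )) ;;
    GB)  echo $(( num * 1000 * 1000 * 1000 )) ;;
    MiB) echo $(( num * 1024 * 1024 )) ;;
    GiB) echo $(( num * 1024 * 1024 * 1024 )) ;;
    *)   echo "unsupported size: $1" >&2; return 1 ;;
  esac
}

to_bytes 8GB    # prints 8000000000
to_bytes 8GiB   # prints 8589934592
```

Comparing the converted target with the container's current usage (from sudo containarium info or sudo incus info) before shrinking memory is one way to follow the OOM advice above.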
Verbose Output Example:
$ sudo containarium resize alice --cpu 4 --memory 8GB -v
Resizing container: alice-container
Setting CPU limit: 4
Setting memory limit: 8GB
✓ Resources updated successfully (no restart required)
✓ Container alice-container resized successfully!
Updated configuration:
CPU: 4
Memory: 8GB
Export SSH Configuration
# Export to stdout (copy/paste to ~/.ssh/config)
sudo containarium export alice --jump-ip 35.229.246.67
# Export to file
sudo containarium export alice --jump-ip 35.229.246.67 --output ~/.ssh/config.d/containarium-alice
# With custom SSH key path
sudo containarium export alice --jump-ip 35.229.246.67 --key ~/.ssh/containarium_alice
# Append directly to SSH config
sudo containarium export alice --jump-ip 35.229.246.67 >> ~/.ssh/config
Output:
# Containarium SSH Configuration
# User: alice
# Generated: 2026-01-10 08:43:18
# Jump server (GCE instance with proxy-only account)
Host containarium-jump
HostName 35.229.246.67
User alice
IdentityFile ~/.ssh/containarium_alice
# No shell access - proxy-only account
# User's development container
Host alice-dev
HostName 10.0.3.100
User alice
IdentityFile ~/.ssh/containarium_alice
ProxyJump containarium-jump
Usage:
# After exporting, connect with:
ssh alice-dev
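When the CLI is not at hand (for example when preparing instructions for a user by hand), the same two-host layout can be produced with a few lines of shell. This is a rough stand-in that mirrors the structure of the exported file shown above, not the actual implementation of the export command; render_ssh_config is a hypothetical name.

```shell
#!/usr/bin/env bash
# render_ssh_config USER JUMP_IP CONTAINER_IP KEY_PATH
# Prints a jump-host entry and a container entry that uses ProxyJump.
render_ssh_config() {
  local user="$1" jump_ip="$2" container_ip="$3" key="$4"
  cat <<EOF
Host containarium-jump
    HostName $jump_ip
    User $user
    IdentityFile $key

Host ${user}-dev
    HostName $container_ip
    User $user
    IdentityFile $key
    ProxyJump containarium-jump
EOF
}
```

Pass the key path in single quotes (for example '~/.ssh/containarium_alice') so the tilde is written literally and expanded by ssh on the user's machine rather than by the admin's shell.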
SSH Key Management
Generate SSH Keys for Users
# User generates their own key pair
ssh-keygen -t ed25519 -C "user@company.com" -f ~/.ssh/containarium
# Output files:
# ~/.ssh/containarium (private - never share!)
# ~/.ssh/containarium.pub (public - give to admin)
# View public key (to send to admin)
cat ~/.ssh/containarium.pub
Create Container with Custom SSH Key
# Method 1: Admin has key file locally
sudo containarium create alice --ssh-key /path/to/alice.pub
# Method 2: Admin receives key via secure channel
# User sends their public key:
cat ~/.ssh/containarium.pub
# ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... alice@company.com
# Admin creates container with key inline
echo 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJqL... alice@company.com' > /tmp/alice.pub
sudo containarium create alice --ssh-key /tmp/alice.pub
# Method 3: Multiple users with different keys
sudo containarium create alice --ssh-key /tmp/alice.pub
sudo containarium create bob --ssh-key /tmp/bob.pub
sudo containarium create charlie --ssh-key /tmp/charlie.pub
Add SSH Key to Existing Container
# Add additional key (keep existing keys)
NEW_KEY='ssh-ed25519 AAAAC3... user@laptop'
sudo incus exec alice-container -- bash -c \
"echo '$NEW_KEY' >> /home/alice/.ssh/authorized_keys"
# Verify key was added
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys
Replace SSH Key
# Replace all keys with new key
NEW_KEY='ssh-ed25519 AAAAC3... alice@new-laptop'
echo "$NEW_KEY" | sudo incus exec alice-container -- \
tee /home/alice/.ssh/authorized_keys > /dev/null
# Fix permissions
sudo incus exec alice-container -- chown alice:alice /home/alice/.ssh/authorized_keys
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys
Manage Multiple SSH Keys per User
# Add work laptop key
WORK_KEY='ssh-ed25519 AAAAC3... alice@work-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$WORK_KEY' >> /home/alice/.ssh/authorized_keys"
# Add home laptop key
HOME_KEY='ssh-ed25519 AAAAC3... alice@home-laptop'
sudo incus exec alice-container -- bash -c \
"echo '$HOME_KEY' >> /home/alice/.ssh/authorized_keys"
# View all keys
sudo incus exec alice-container -- cat /home/alice/.ssh/authorized_keys
# ssh-ed25519 AAAAC3... alice@work-laptop
# ssh-ed25519 AAAAC3... alice@home-laptop
Remove Specific SSH Key
# Remove key by comment (last part of key)
sudo incus exec alice-container -- bash -c \
"sed -i '/alice@old-laptop/d' /home/alice/.ssh/authorized_keys"
# Remove key by a unique substring of its base64 key material (this is the key body, not a fingerprint)
sudo incus exec alice-container -- bash -c \
"sed -i '/AAAAC3NzaC1lZDI1NTE5AAAAIAbc123/d' /home/alice/.ssh/authorized_keys"
Troubleshoot SSH Key Issues
# Check authorized_keys permissions
sudo incus exec alice-container -- ls -la /home/alice/.ssh/
# Should show:
# drwx------ 2 alice alice 4096 ... .ssh
# -rw------- 1 alice alice 123 ... authorized_keys
# Fix permissions if wrong
sudo incus exec alice-container -- chown -R alice:alice /home/alice/.ssh
sudo incus exec alice-container -- chmod 700 /home/alice/.ssh
sudo incus exec alice-container -- chmod 600 /home/alice/.ssh/authorized_keys
# Test SSH from jump server
ssh -v alice@10.0.3.100
# -v shows verbose output for debugging
# Check SSH logs in container
sudo incus exec alice-container -- tail -f /var/log/auth.log
Collaborator Management
Add collaborators to containers so multiple users can share a development environment.
Add a Collaborator
# Basic: collaborator gets restricted access (sudo su - owner only)
sudo containarium collaborator add alice bob --ssh-key ~/.ssh/bob.pub
# Grant full sudo access
sudo containarium collaborator add alice bob --ssh-key ~/.ssh/bob.pub --sudo
# Grant container runtime access (docker/podman groups)
sudo containarium collaborator add alice bob --ssh-key ~/.ssh/bob.pub --container-runtime
# Both permissions
sudo containarium collaborator add alice carol --ssh-key ~/.ssh/carol.pub --sudo --container-runtime
Permission levels:
| Flag | Effect |
|---|---|
| (default) | Can only sudo su - <owner> with session logging |
| `--sudo` | Full NOPASSWD: ALL sudo access with session logging |
| `--container-runtime` | Added to docker and podman groups |
List Collaborators
sudo containarium collaborator list alice
Remove a Collaborator
sudo containarium collaborator remove alice bob
Advanced Operations
Using Incus Directly
# Execute command in container
sudo incus exec alice-container -- df -h
# Shell into container
sudo incus exec alice-container -- su - alice
# View container logs
sudo incus console alice-container --show-log
# Snapshot container
sudo incus snapshot create alice-container snap1
# Restore snapshot
sudo incus snapshot restore alice-container snap1
# Copy container
sudo incus copy alice-container alice-backup
Resource Management
# Set memory limit
sudo incus config set alice-container limits.memory 4GB
# Set CPU limit
sudo incus config set alice-container limits.cpu 2
# View container metrics
sudo incus info alice-container
# Resize disk quota
sudo containarium resize alice --disk 100GB
Terraform Commands
Deploy Infrastructure
cd terraform/gce
# Initialize Terraform
terraform init
# Preview changes
terraform plan
# Deploy infrastructure
terraform apply
# Deploy with custom variables
terraform apply -var-file=examples/horizontal-scaling-3-servers.tfvars
# Show outputs
terraform output
# Get specific output
terraform output jump_server_ip
Manage Infrastructure
# Update infrastructure
terraform apply
# Destroy specific resource (quoted so the shell does not treat [0] as a glob)
terraform destroy -target='google_compute_instance.jump_server_spot[0]'
# Destroy everything
terraform destroy
# Import existing resource
terraform import google_compute_instance.jump_server projects/my-project/zones/us-central1-a/instances/my-instance
# Refresh state (terraform refresh is deprecated in favor of a refresh-only apply)
terraform apply -refresh-only
Maintenance Commands
Backup and Recovery
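# Backup ZFS pool (-r also snapshots the child datasets that hold the containers)
sudo zfs snapshot -r incus-pool@backup-$(date +%Y%m%d)
# List snapshots
sudo zfs list -t snapshot
# Rollback to snapshot (applies to the named dataset only; roll back child datasets individually)
sudo zfs rollback incus-pool@backup-20240115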
# Export container
sudo incus export alice-container alice-backup.tar.gz
# Import container
sudo incus import alice-backup.tar.gz
Monitoring
# Check ZFS pool status
sudo zpool status
# Check disk usage
sudo zfs list
# Check container resource usage
sudo incus list --columns ns4mDcup
# View system load
htop
# Check Incus daemon status
sudo systemctl status incus
Troubleshooting
Common Issues
1. "cannot lock /etc/passwd" Error
This occurs when google_guest_agent is managing users while Containarium tries to create jump server accounts.
Solution: Containarium includes automatic retry logic with exponential backoff:
- ✓ Pre-checks for lock files before attempting
- ✓ 6 retry attempts with exponential backoff (500ms → 30s)
- ✓ Jitter to prevent thundering herd
- ✓ Smart error detection (only retries lock errors)
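The retry behavior described above can be pictured with a small shell sketch. It is not Containarium's code: the doubling schedule, the 25% jitter range, and the function names backoff_ms and retry_with_backoff are assumptions chosen to match the "6 attempts, 500ms to 30s, with jitter" description. Unlike the real logic, it retries on any failure rather than only on lock errors.

```shell
#!/usr/bin/env bash
# backoff_ms ATTEMPT: base delay for a 0-based attempt, doubling from
# 500 ms and capped at 30000 ms.
backoff_ms() {
  local delay=500 i=0
  while [ "$i" -lt "$1" ]; do
    delay=$(( delay * 2 ))
    [ "$delay" -ge 30000 ] && { delay=30000; break; }
    i=$(( i + 1 ))
  done
  echo "$delay"
}

# retry_with_backoff CMD [ARGS...]: run CMD up to 6 times, sleeping the
# base delay plus up to 25% random jitter between attempts.
retry_with_backoff() {
  local attempt=0 max=6 base jitter
  while :; do
    "$@" && return 0
    attempt=$(( attempt + 1 ))
    [ "$attempt" -ge "$max" ] && return 1
    base="$(backoff_ms $(( attempt - 1 )))"
    jitter=$(( RANDOM % (base / 4 + 1) ))
    sleep "$(awk -v ms="$(( base + jitter ))" 'BEGIN { printf "%.3f", ms / 1000 }')"
  done
}
```

The jitter matters when several provisioning jobs hit the same lock at once: without it they would all wake up and collide again at the same instants.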
If retries are exhausted, check agent activity:
# Check what google_guest_agent is doing
sudo journalctl -u google-guest-agent --since "5 minutes ago" | grep -E "account|user|Updating"
# Temporarily disable account management (if needed)
sudo systemctl stop google-guest-agent
sudo containarium create alice --ssh-key ~/.ssh/alice.pub
sudo systemctl start google-guest-agent
# Or wait and retry - agent usually releases lock within 30-60 seconds
2. Container Network Issues
# View Incus logs
sudo journalctl -u incus -f
# Check container network
sudo incus network list
sudo incus network show incusbr0
# Restart Incus daemon
sudo systemctl restart incus
3. Infrastructure Issues
# Check startup script logs (GCE)
gcloud compute instances get-serial-port-output <instance-name> --zone=<zone>
# Verify ZFS health
sudo zpool scrub incus-pool
sudo zpool status -v
Daemon Management
Install, check, and manage the daemon:
# Install systemd service (first time or after VM recreation)
sudo containarium service install
# View daemon status
sudo containarium service status
# or: sudo systemctl status containarium
# View daemon logs
sudo journalctl -u containarium -f
# Restart daemon
sudo systemctl restart containarium
# Uninstall service
sudo containarium service uninstall
Direct gRPC testing with grpcurl:
# Install grpcurl if needed
go install github.com/fullstorydev/grpcurl/cmd/grpcurl@latest
# List services (with mTLS)
grpcurl -cacert /etc/containarium/certs/ca.crt \
-cert /etc/containarium/certs/client.crt \
-key /etc/containarium/certs/client.key \
35.229.246.67:50051 list
# Create container via gRPC (with mTLS)
grpcurl -cacert /etc/containarium/certs/ca.crt \
-cert /etc/containarium/certs/client.crt \
-key /etc/containarium/certs/client.key \
-d '{"username": "alice", "ssh_keys": ["ssh-ed25519 AAA..."]}' \
35.229.246.67:50051 containarium.v1.ContainerService/CreateContainer
Batch Operations
# Create multiple containers
for user in alice bob charlie; do
sudo containarium create $user --ssh-key ~/.ssh/${user}.pub
done
# Enable autostart for all containers
sudo incus list --format csv -c n | while read -r name; do
sudo incus config set "$name" boot.autostart true
done
# Snapshot all containers
for container in $(sudo incus list --format csv -c n); do
sudo incus snapshot create "$container" "backup-$(date +%Y%m%d)"
done
🌩️ Infrastructure Deployment
Single Server (Development)
# terraform.tfvars
use_spot_instance = true # 76% cheaper
use_persistent_disk = true # Survives restarts
machine_type = "n2-standard-8" # 32GB RAM, 50 users
Horizontal Scaling (Production)
# terraform.tfvars
enable_horizontal_scaling = true
jump_server_count = 3 # 3 independent servers
enable_load_balancer = true # SSH load balancing
use_spot_instance = true
Deploy:
cd terraform/gce
terraform init
terraform plan
terraform apply
Documentation
Essential Guides
- Deployment Guide - START HERE! Complete workflow from zero to running containers
- Production Deployment - PRODUCTION READY! Remote state, secrets management, CI/CD
- Horizontal Scaling Quick Start - Deploy 3-5 jump servers
- SSH Jump Server Setup - SSH configuration guide
Advanced Topics
- Spot Instances & Scaling - Cost optimization
- Horizontal Scaling Architecture - Scaling strategies
- Terraform GCE README - Deployment details
- Implementation Plan - Architecture & roadmap
- Testing Architecture - E2E testing with Terraform
💡 Why Containarium?
Traditional Approach (Wasteful)
❌ Create 1 GCE VM per user
❌ Each VM: 2-4GB RAM (most unused)
❌ Cost: $25-50/month per user
❌ Slow: 30-60 seconds to provision
❌ Unmanageable: 50+ VMs to maintain
Containarium Approach (Efficient)
✅ 1 GCE VM hosts 50 containers
✅ Each container: 100-500MB RAM (efficient)
✅ Cost: $1.96-2.08/month per user
✅ Fast: <60 seconds to provision
✅ Scalable: Add servers as you grow
✅ Resilient: Spot instances + persistent storage
Resource Efficiency
| Metric | VM-per-User | Containarium | Improvement |
|---|---|---|---|
| Memory/User | 2-4 GB | 100-500 MB | 10x |
| Startup Time | 30-60s | 2-5s | 12x |
| Density | 2-3/host | 150/host | 50x |
| Cost (50 users) | $1,250/mo | $98/mo | 92% savings |
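
The cost row can be checked by hand from the two monthly totals. The sketch below is illustrative only: `cost_summary` is a hypothetical helper, not part of the Containarium CLI, and it takes the figures quoted in the table as its inputs.

```shell
# Hypothetical helper (not part of Containarium): derive per-user cost and
# percentage savings from the monthly totals quoted in the table.
cost_summary() {
  # $1 = number of users
  # $2 = VM-per-user total in dollars per month
  # $3 = shared-host total in dollars per month
  awk -v users="$1" -v baseline="$2" -v shared="$3" 'BEGIN {
    printf "per-user: $%.2f/mo, savings: %.0f%%\n", shared / users, (1 - shared / baseline) * 100
  }'
}

cost_summary 50 1250 98   # prints: per-user: $1.96/mo, savings: 92%
```

The per-user result matches the lower bound of the $1.96-2.08/month range quoted earlier.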
Security Features
Access Control
- Separate User Accounts: Each user has a proxy-only account on the jump server
- No Shell Access: User accounts use /usr/sbin/nologin (cannot execute commands on the jump server)
- SSH Key Auth: Password authentication disabled globally
- Per-User Isolation: Users can only access their own containers
- Admin Separation: Only admin account has jump server shell access
Container Security
- Unprivileged Containers: Container root ≠ host root (UID mapping)
- Resource Limits: CPU, memory, disk quotas per container
- Network Isolation: Separate network namespace per container
- AppArmor Profiles: Additional security layer per container
Network Security
- Firewall Rules: Restrict SSH access to known IPs
- fail2ban Integration: Auto-block brute force attacks per user account
- DDoS Protection: Per-account rate limiting
- Private Container IPs: Only jump server has public IP
Audit & Monitoring
- SSH Audit Logging: Track all connections by user account
- Per-User Logs: Separate logs for each user (alice@jump → container)
- Container Access Logs: Track who accessed which container and when
- Security Event Alerts: Monitor for suspicious activity
Data Protection
- Persistent Disk Encryption: Data encrypted at rest
- Automated Backups: Daily snapshots with 30-day retention
- ZFS Checksums: Detect data corruption automatically
Deployment Options
| Configuration | Users | Servers | Cost/Month | Use Case |
|---|---|---|---|---|
| Dev/Test | 20-50 | 1 spot | $98 | Development, testing |
| Small Team | 50-100 | 1 regular | $242 | Production, small team |
| Medium Team | 100-150 | 3 spot | $312 | Production, medium team |
| Large Team | 200-250 | 5 spot | $508 | Production, large team |
| Enterprise | 500+ | 10+ or cluster | Custom | Enterprise scale |
Roadmap
Completed
- Phase 1: Protobuf contracts - Type-safe gRPC contracts for container operations
- Phase 2: Go CLI framework - Full-featured CLI with create, delete, list, info, resize commands
- Phase 3: Terraform GCE deployment - Single-server and horizontal scaling configurations
- Phase 4: Spot instances + persistent disk - ZFS-backed storage surviving spot termination
- Phase 5: Horizontal scaling with load balancer - Multi-server SSH load balancing
- Phase 6: Container management (Incus integration) - Complete LXC container lifecycle management
- Phase 7: End-to-end testing - Terraform-based E2E tests with ZFS persistence validation
- Phase 8: gRPC daemon with mTLS - Remote container management with mutual TLS authentication
- Phase 9: Secure multi-tenant architecture - Proxy-only jump server accounts, per-user isolation
- Phase 10: Production deployment guides - Complete documentation for production deployments
In Progress
- Phase 11: Monitoring & observability - Metrics, alerts, and dashboards for production systems
- Phase 12: Automated backup & disaster recovery - Scheduled snapshots and recovery procedures
Planned
- Phase 13: AWS support - Terraform modules for AWS EC2 deployment
- Phase 14: Azure support - Terraform modules for Azure VM deployment
- Phase 15: Web UI dashboard - Browser-based container management interface
- Phase 16: Container templates - Pre-configured environments (ML, web dev, data science)
- Phase 17: Resource usage analytics - Per-user and per-container cost tracking
- Phase 18: Auto-scaling - Dynamic server provisioning based on demand
Production Features
Spot Instance Auto-Recovery
Containers automatically restart when spot instances recover:
# Spot VM terminated → Containers stop
# VM restarts β Containers auto-start (boot.autostart=true)
# Downtime: 2-5 minutes
# Data: Preserved on persistent disk
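Auto-recovery depends on every container having boot.autostart set to true. As a rough sketch of how that could be audited, the hypothetical `missing_autostart` filter below reads `name,value` lines and prints the containers that would not come back after a restart. The commented pipeline shows how the input might be produced on the host, assuming `incus` is installed and the caller has sudo access.

```shell
# Hypothetical filter (not part of Containarium): reads "name,autostart" lines
# on stdin and prints the names whose autostart value is anything but "true".
missing_autostart() {
  awk -F, '$2 != "true" { print $1 }'
}

# On the host, the input could be built like this (an unset key prints empty):
#   sudo incus list --format csv -c n | while read -r name; do
#     printf '%s,%s\n' "$name" "$(sudo incus config get "$name" boot.autostart)"
#   done | missing_autostart

# Demo with sample data: prints bob-container and charlie-container.
printf '%s\n' 'alice-container,true' 'bob-container,' 'charlie-container,false' | missing_autostart
```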
Daily Backups
# Automatic snapshots enabled by default
enable_disk_snapshots = true
# 30-day retention
# Point-in-time recovery available
Load Balancing
# SSH traffic distributed across healthy servers
# Session affinity keeps users on same server
# Health checks on port 22
FAQ
SSH Key Management
Q: Can multiple users share the same SSH key?
A: No, never! Each user must have their own SSH key pair. Sharing keys:
- Violates security best practices
- Makes it impossible to revoke access for one user
- Prevents audit logging of who accessed what
- Creates compliance issues
Q: What's the difference between admin and user accounts?
A: Containarium uses separate user accounts for security:
- Admin account: Full access to jump server shell, can manage containers
- User accounts: Proxy-only (no shell), can only connect to their container
Example:
/home/admin/.ssh/authorized_keys (admin - full shell access)
/home/alice/.ssh/authorized_keys (alice - proxy only, /usr/sbin/nologin)
/home/bob/.ssh/authorized_keys (bob - proxy only, /usr/sbin/nologin)
Q: Can I use the same key for both jump server and my container?
A: Yes, and that's the recommended approach! Each user has ONE key that works for both:
- Simpler for users (one key to manage)
- Same key authenticates to jump server account (proxy-only)
- Same key authenticates to container (full access)
- Users cannot access jump server shell (secured by /usr/sbin/nologin)
- Admin can still track per-user activity in logs
Q: How do I rotate SSH keys?
A:
# User generates new key
ssh-keygen -t ed25519 -C "alice@company.com" -f ~/.ssh/containarium_alice_new
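# Admin replaces old key with new key inside the container
NEW_KEY='ssh-ed25519 AAAAC3... alice@new-laptop'
echo "$NEW_KEY" | sudo incus exec alice-container -- \
  tee /home/alice/.ssh/authorized_keys > /dev/null
# The jump server account uses the same key (see the authorized_keys paths
# listed above), so replace it there as well
echo "$NEW_KEY" | sudo tee /home/alice/.ssh/authorized_keys > /dev/null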
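Because the rotation overwrites authorized_keys, a malformed line can lock the user out. The sketch below is one possible pre-check, not something Containarium provides: `looks_like_pubkey` is a hypothetical helper that only verifies the general shape of a single public key line (a known key type followed by a base64 field). It does not prove the key is valid, and the key string in the usage example is a made-up placeholder.

```shell
# Hypothetical helper (not part of Containarium): succeed only if the argument
# looks like one authorized_keys line: "<key-type> <base64> [comment]".
looks_like_pubkey() {
  printf '%s\n' "$1" | awk '
    NF >= 2 &&
    $1 ~ "^(ssh-ed25519|ssh-rsa|ecdsa-sha2-nistp(256|384|521))$" &&
    $2 ~ "^[A-Za-z0-9+/]+=*$" { ok = 1 }
    END { exit (ok ? 0 : 1) }'
}

# Usage: refuse to install anything that is not shaped like a public key.
NEW_KEY='ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIexampleonly alice@new-laptop'
if looks_like_pubkey "$NEW_KEY"; then
  echo "ok to install"
else
  echo "refusing: not a public key line" >&2
fi
```

A check like this catches the common mistakes of pasting a private key or a truncated placeholder such as `AAAAC3...`.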
Q: Can users access the jump server itself?
A: No! User accounts are configured with /usr/sbin/nologin:
- Users can proxy through jump server to their container
- Users CANNOT get a shell on the jump server
- Users CANNOT see other containers or system processes
- Users CANNOT inspect other users' data
- Only admin has shell access to jump server
Q: Can one user access another user's container?
A: No! Each user only has SSH keys for their own container. Users cannot:
- Access other users' containers (no SSH keys for them)
- Become admin on the jump server (no shell access)
- See other users' data or processes (isolated)
- Execute commands on jump server (nologin shell)
Q: How does fail2ban protect against attacks?
A: With separate user accounts, fail2ban provides granular protection:
- Per-user banning: If alice's IP attacks, only alice is blocked
- Other users unaffected: bob, charlie continue working normally
- Audit trail: Logs show which user account was targeted
- DDoS isolation: Attacks on one user don't impact others
- Automatic recovery: Banned IPs are unbanned after timeout
Q: Why are separate user accounts more secure than a shared admin account?
A: Shared admin account (everyone uses admin) has serious flaws:
❌ Without separate accounts:
- All users can execute commands on jump server
- All users can see all containers (incus list)
- All users can inspect system processes (ps aux)
- All users can spy on other users
- Logs show only "admin" - can't tell who did what
- Banning one attacker affects all users
✅ With separate accounts (our design):
- Users cannot execute commands (nologin shell)
- Users cannot see other containers
- Users cannot inspect system
- Each user's activity logged separately
- Per-user banning without affecting others
- Follows principle of least privilege
Q: What happens if I lose my SSH private key?
A: You'll need to:
- Generate a new SSH key pair
- Send new public key to admin
- Admin updates your container with new key
- The old key stops working as soon as it is replaced
Q: Can I have different keys for work laptop and home laptop?
A: Yes! You can have multiple public keys in your container:
# Admin adds second key for same user
sudo incus exec alice-container -- bash -c \
"echo 'ssh-ed25519 AAAAC3... alice@home-laptop' >> /home/alice/.ssh/authorized_keys"
How is Containarium different from...
Q: Why not just use Docker or Podman?
A: Docker and Podman are application containers: they package and run a single process or service. Containarium uses LXC system containers, which behave like lightweight VMs:
| | Docker / Podman | Containarium (LXC) |
|---|---|---|
| Designed for | Running apps / microservices | Running full Linux environments |
| Init system | No (single process) | Yes (systemd) |
| SSH access | Possible but hacky | Native, first-class |
| Run Docker inside | Docker-in-Docker (fragile) | Works natively |
| Persistent state | Volumes, ephemeral by default | Full persistent filesystem |
| User accounts | Not really | Real Linux users with sudo |
| Multi-tenant SSH | Not a use case | Built-in with jump server isolation |
If your developers need "a Linux box with SSH, Docker, and their own home directory," that's exactly what Containarium provides, and Docker/Podman don't.
Q: Why not Dev Containers / VS Code Remote Containers?
A: Dev Containers are great for single-developer, project-scoped environments tied to VS Code. Containarium solves a different problem:
- Dev Containers: One container per project, developer runs it locally or in Codespaces, tightly coupled to VS Code
- Containarium: One persistent environment per developer on shared infrastructure, editor-agnostic (SSH into it with anything)
Use Dev Containers when each developer has their own machine and wants reproducible project setups. Use Containarium when you need to host many developers on shared infrastructure at low cost.
Q: Why not GitHub Codespaces or Gitpod?
A: Codespaces and Gitpod are cloud-hosted, browser-based IDEs. They're excellent but:
- Cost: $0.18-0.36/hour per environment. A developer working 8h/day costs $30-60/month. Containarium costs ~$2/user/month.
- Vendor lock-in: Tied to GitHub (Codespaces) or Gitpod's platform
- IDE choice: Primarily browser-based or VS Code. Containarium is SSH-based, so you can use any editor (Vim, Emacs, Neovim, JetBrains via remote, VS Code via SSH, etc.)
- Persistence: Codespaces auto-delete after inactivity. Containarium environments persist indefinitely.
- Docker/systemd: Limited in Codespaces. Full support in Containarium.
Containarium is for teams that want self-hosted, persistent, SSH-based environments without per-hour billing.
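The monthly figure in the cost bullet follows from simple multiplication. As a hedged illustration, the hypothetical `monthly_cost` helper below reproduces it, assuming 21 working days per month (an assumption chosen here, not a figure stated above).

```shell
# Hypothetical helper: monthly cost of an environment billed by the hour.
monthly_cost() {
  # $1 = hourly rate in dollars, $2 = hours per day, $3 = working days per month
  awk -v rate="$1" -v hours="$2" -v days="$3" 'BEGIN { printf "%.2f\n", rate * hours * days }'
}

monthly_cost 0.18 8 21   # prints: 30.24
monthly_cost 0.36 8 21   # prints: 60.48
```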
Q: Why not Jetify Devbox?
A: Devbox creates isolated, reproducible dev environments using Nix packages. It's a local tool that runs on the developer's own machine.
- Devbox: Local package isolation (like a better virtualenv for everything). No VMs, no containers, no SSH. Developer needs their own machine.
- Containarium: Remote, multi-tenant Linux environments on shared infrastructure. Developer only needs an SSH client.
They solve different problems. Devbox is great for "I want reproducible local toolchains." Containarium is for "I need to give 50 developers each their own Linux box without buying 50 machines."
Q: Why not Vagrant?
A: Vagrant provisions full VMs (via VirtualBox, VMware, etc.). Compared to Containarium:
- Density: Vagrant runs 2-4 VMs per host. Containarium runs 50+ containers per host.
- Startup: Vagrant VMs take 30-60 seconds. LXC containers start in 2-5 seconds.
- Resources: Each Vagrant VM needs 1-4GB RAM. Each LXC container uses 100-500MB.
- Use case: Vagrant is for local development. Containarium is for centralized, multi-tenant hosting.
Q: Why not Proxmox?
A: Proxmox is a full virtualization platform (KVM VMs + LXC containers). It's powerful but general-purpose. Containarium is opinionated:
- Proxmox: General-purpose hypervisor with a web UI. You manage everything yourself: networking, storage, user access, SSH.
- Containarium: Purpose-built for developer environments. Handles SSH jump server setup, per-user isolation, Docker-in-container, ZFS quotas, and spot instance recovery out of the box.
If you need general virtualization, use Proxmox. If you specifically need cheap, fast, SSH-based dev environments with multi-tenant isolation, Containarium does it with less setup.
General Questions
Q: What happens when a spot instance is terminated?
A: Containers automatically restart:
- Spot instance terminated (by GCP)
- Persistent disk preserved (data safe)
- Instance restarts within ~5 minutes
- Containers auto-start (boot.autostart=true)
- Users can reconnect
- Downtime: 2-5 minutes
- Data loss: None
Q: How many containers can fit on one server?
A: Depends on machine type:
- e2-standard-2 (8GB RAM): 10-15 containers
- n2-standard-4 (16GB RAM): 20-30 containers
- n2-standard-8 (32GB RAM): 40-60 containers
Each container uses ~100-500MB RAM depending on workload.
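For a rough sizing estimate, the ranges above can be approximated with integer arithmetic. The sketch below is illustrative only: `estimate_capacity` is a hypothetical helper, and the 2 GB host reserve and 512 MB per-container budget are assumptions picked to sit near the upper end of the quoted memory range, not values taken from Containarium.

```shell
# Hypothetical helper: containers that fit after reserving RAM for the host.
estimate_capacity() {
  # $1 = host RAM in GB, $2 = GB reserved for the host OS and Incus,
  # $3 = memory budget per container in MB
  echo $(( ($1 - $2) * 1024 / $3 ))
}

estimate_capacity 8 2 512    # prints: 12
estimate_capacity 16 2 512   # prints: 28
estimate_capacity 32 2 512   # prints: 60
```

Under these assumptions the results fall inside the ranges listed for each machine type; lighter workloads would allow higher density.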
Q: Can containers run Docker?
A: Yes! Each container has Docker pre-installed and working.
# Inside your container
docker run hello-world
docker-compose up -d
docker build -t myapp . # Docker builds work with Incus 6.19+
Important: Requires Incus 6.19 or later on the host. Earlier versions (including Ubuntu 24.04's default Incus 6.0.0) have an AppArmor bug (CVE-2025-52881) that breaks Docker builds in unprivileged containers. Use the Zabbly Incus repository for latest stable builds.
Q: Is my data backed up?
A: If you enabled snapshots:
# In terraform.tfvars
enable_disk_snapshots = true
Automatic daily snapshots with 30-day retention.
Q: Can I resize my container's disk quota?
A: Yes:
# Increase quota to 50GB
sudo containarium resize alice --disk-quota 50GB
Q: How do I install software in my container?
A:
# SSH to your container
ssh my-dev
# Install packages as usual
sudo apt update
sudo apt install vim git tmux htop
# Or use Docker
docker run -it ubuntu bash
Contributing
Contributions are welcome! Please:
- Read the Implementation Plan
- Check existing issues and PRs
- Follow the existing code style
- Add tests for new features
- Update documentation
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Acknowledgments
- Incus - Modern LXC container manager
- sshpiper - SSH reverse proxy with failtoban plugin
- Protocol Buffers - Type-safe data contracts
- Cobra - Powerful CLI framework
- Terraform - Infrastructure as Code
Support & Contact
- Documentation: See docs/ directory
- Issues: GitHub Issues
- Organization: FootprintAI
Quick Links
- Horizontal Scaling Guide
- SSH Setup Guide
- Cost & Scaling Strategies
- Implementation Plan
- Terraform Examples
Getting Started in 10 Minutes
Step 1: Deploy Infrastructure (3-5 min)
# Clone repo
git clone https://github.com/footprintai/Containarium.git
cd Containarium/terraform/gce
# Choose your size and configure
cp examples/single-server-spot.tfvars terraform.tfvars
vim terraform.tfvars # Add: project_id, admin_ssh_keys, allowed_ssh_sources
# Deploy to GCP
terraform init
terraform apply # Creates VM with Incus pre-installed
# Save the jump server IP from output!
Step 2: Install Containarium CLI (2 min, one-time)
# Build for Linux
cd ../..
make build-linux
# Copy to jump server
scp bin/containarium-linux-amd64 admin@<jump-server-ip>:/tmp/
# SSH and install
ssh admin@<jump-server-ip>
sudo mv /tmp/containarium-linux-amd64 /usr/local/bin/containarium
sudo chmod +x /usr/local/bin/containarium
exit
Step 3: Create Containers (1 min per user)
Each user must generate their own SSH key pair first:
# User generates their key (on their local machine)
ssh-keygen -t ed25519 -C "alice@company.com" -f ~/.ssh/containarium_alice
# User sends public key file to admin: ~/.ssh/containarium_alice.pub
Admin creates containers with users' public keys:
# SSH to jump server
ssh admin@<jump-server-ip>
# Save users' public keys (received from users)
echo 'ssh-ed25519 AAAAC3... alice@company.com' > /tmp/alice.pub
echo 'ssh-ed25519 AAAAC3... bob@company.com' > /tmp/bob.pub
echo 'ssh-ed25519 AAAAC3... charlie@company.com' > /tmp/charlie.pub
# Create containers with users' keys
sudo containarium create alice --ssh-key /tmp/alice.pub --image images:ubuntu/24.04
sudo containarium create bob --ssh-key /tmp/bob.pub --image images:ubuntu/24.04
sudo containarium create charlie --ssh-key /tmp/charlie.pub --image images:ubuntu/24.04
# Output:
# ✓ Container alice-container created successfully!
# ✓ Jump server account: alice (proxy-only, no shell access)
# IP Address: 10.0.3.166
# Enable auto-start (survive spot instance restarts)
sudo incus config set alice-container boot.autostart true
sudo incus config set bob-container boot.autostart true
sudo incus config set charlie-container boot.autostart true
# Export SSH configs for users
sudo containarium export alice --jump-ip <jump-server-ip> > alice-ssh-config.txt
sudo containarium export bob --jump-ip <jump-server-ip> > bob-ssh-config.txt
sudo containarium export charlie --jump-ip <jump-server-ip> > charlie-ssh-config.txt
# Send config files to users
# List all containers
sudo containarium list
Step 4: Users Connect
Admin sends exported SSH config to each user:
- Send alice-ssh-config.txt to Alice
- Send bob-ssh-config.txt to Bob
- Send charlie-ssh-config.txt to Charlie
Users add to their ~/.ssh/config:
# Alice on her laptop
cat alice-ssh-config.txt >> ~/.ssh/config
# Connect!
ssh alice-dev
# Alice is now in her Ubuntu container with Docker!
docker run hello-world
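For readers curious what such a config might contain, the sketch below prints a hand-written equivalent. It is not the actual output of `containarium export`: `render_ssh_config` and the `containarium-jump-<user>` alias are hypothetical, 203.0.113.10 is a documentation-range placeholder for the jump server IP, and the key path follows the ssh-keygen example from Step 3. The only OpenSSH feature it relies on is `ProxyJump`, which forwards the connection through the jump host without needing a shell there.

```shell
# Hypothetical helper (not part of Containarium): print an SSH config stanza
# that reaches a user's container through their proxy-only jump account.
render_ssh_config() {
  user="$1"; jump_ip="$2"; container_ip="$3"
  cat <<EOF
Host containarium-jump-${user}
    HostName ${jump_ip}
    User ${user}
    IdentityFile ~/.ssh/containarium_${user}

Host ${user}-dev
    HostName ${container_ip}
    User ${user}
    IdentityFile ~/.ssh/containarium_${user}
    ProxyJump containarium-jump-${user}
EOF
}

# Example with placeholder addresses; the container IP is the one shown in Step 3.
render_ssh_config alice 203.0.113.10 10.0.3.166
```

The jump host gets its own Host block because the IdentityFile set for alice-dev does not apply to the hop made by ProxyJump.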
Done!
See Deployment Guide for complete details.
Made with ❤️ by the FootprintAI team
Save 92% on cloud costs. Deploy in 5 minutes. Scale to 250+ users.
