Paygate

Pay-per-tool-call gating proxy for MCP servers. Wrap any MCP server with API key auth, per-tool pricing, rate limiting, and usage metering.

paygate-mcp

Monetize any MCP server with one command. Add API key auth, per-tool pricing, rate limiting, and usage metering to any Model Context Protocol server. Zero dependencies. Zero config. Zero code changes.

Quick Start
What It Does
Usage — Local stdio, remote HTTP, multi-server, client SDK
API Reference — All 199+ endpoints
CLI Options
Deployment — Docker, docker-compose, systemd, PM2
Load Testing — k6 benchmarking for production
Error Codes — Complete error code reference
Feature Reference — Detailed docs for every feature
- Storage & Billing · Stripe · ACL · Rate Limits
- Key Management · Webhooks · OAuth 2.1 · SSE
- Analytics (64 endpoints) · Teams · Redis Scaling
- Plugins · Groups · Namespaces
Programmatic API
Security
Tested With — Verified against popular MCP servers
Current Limitations
Roadmap
Requirements
License

Quick Start

# Interactive setup wizard (generates paygate.json)
npx paygate-mcp init

# Or wrap directly with CLI flags
npx paygate-mcp wrap --server "npx @modelcontextprotocol/server-filesystem /tmp"

# Gate a remote MCP server (Streamable HTTP transport)
npx paygate-mcp wrap --remote-url "https://my-server.example.com/mcp" --price 5

That's it. Your MCP server is now gated behind API keys with credit-based billing.
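
Once the server is wrapped, clients authenticate with an X-API-Key header and send MCP tool calls to the /mcp endpoint. The sketch below shows roughly what such a request could look like. The helper name `buildToolCallRequest`, the base URL, the key value, and the tool name are all placeholders chosen for illustration; only the X-API-Key header, the /mcp path, and the JSON-RPC `tools/call` method come from PayGate's feature list and the MCP protocol.

```typescript
// Hypothetical helper (not part of paygate-mcp): builds the HTTP request a
// client might send to a PayGate-wrapped server.
interface ToolCallRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildToolCallRequest(
  baseUrl: string, // assumption: wherever your gateway is listening
  apiKey: string,
  tool: string,
  args: Record<string, unknown>,
  id: number = 1,
): ToolCallRequest {
  return {
    // Strip trailing slashes so the path is always exactly "/mcp".
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP clients accept either a JSON body or an SSE stream.
        Accept: "application/json, text/event-stream",
        "X-API-Key": apiKey,
      },
      // MCP tool invocations are JSON-RPC 2.0 requests with method "tools/call".
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: tool, arguments: args },
      }),
    },
  };
}
```

With a gateway running, the result can be passed straight to `fetch(url, init)`. A real client would typically also perform the MCP `initialize` handshake first, which is omitted here for brevity.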

What It Does

PayGate sits between AI agents and your MCP server:

Agent → PayGate (auth + billing) → Your MCP Server (stdio or HTTP)
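
To make the "auth + billing" step in the flow above concrete, here is a minimal in-memory sketch of a gate decision: look up the key, check its state, price the tool, and deduct credits before the call is forwarded. Everything in it (the `SimpleGate` class, the `KeyRecord` shape, the order of checks, and the `invalid_api_key` reason) is an assumption made for illustration, not PayGate's actual internals. The `key_suspended` and `insufficient_credits` reason strings are borrowed from the denial reasons listed for /admin/denials.

```typescript
// Illustrative only: types, names, and check order are assumptions.
type GateResult =
  | { allowed: true; creditsCharged: number; creditsRemaining: number }
  | { allowed: false; reason: string };

interface KeyRecord {
  credits: number;
  suspended: boolean;
}

class SimpleGate {
  private readonly keys: Map<string, KeyRecord>;
  private readonly toolPrices: Record<string, number>;
  private readonly defaultPrice: number;

  constructor(
    keys: Map<string, KeyRecord>,
    toolPrices: Record<string, number>,
    defaultPrice: number,
  ) {
    this.keys = keys;
    this.toolPrices = toolPrices;
    this.defaultPrice = defaultPrice;
  }

  // Decide whether a call may proceed; deduct credits only when it may.
  evaluate(apiKey: string | undefined, tool: string): GateResult {
    const record = apiKey ? this.keys.get(apiKey) : undefined;
    if (!record) return { allowed: false, reason: "invalid_api_key" };
    if (record.suspended) return { allowed: false, reason: "key_suspended" };

    // Per-tool price if configured, otherwise the default price.
    const price = this.toolPrices[tool] ?? this.defaultPrice;
    if (record.credits < price) {
      return { allowed: false, reason: "insufficient_credits" };
    }

    record.credits -= price;
    return { allowed: true, creditsCharged: price, creditsRemaining: record.credits };
  }
}
```

Only when `evaluate` returns `allowed: true` would the request be forwarded to the wrapped MCP server. A production gate would also need rate limits, quotas, ACLs, and atomic deduction across processes, which this sketch leaves out.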

API Key Auth — Clients need a valid X-API-Key to call tools
Credit Billing — Each tool call costs credits (configurable per-tool)
Rate Limiting — Sliding window per-key rate limits + per-tool rate limits
Usage Metering — Track who called what, when, and how much they spent
Multi-Server Mode — Wrap N MCP servers behind one PayGate with tool prefix routing
Client SDK — PayGateClient with auto 402 retry, balance tracking, and typed errors
Two Transports — Wrap local servers via stdio or remote servers via Streamable HTTP
Per-Tool ACL — Whitelist/blacklist tools per API key (enterprise access control)
Per-Tool Rate Limits — Independent rate limits per tool, not just global
Key Expiry (TTL) — Auto-expire API keys after a set time
Spending Limits — Cap total spend per API key to prevent runaway costs
Usage Quotas — Daily/monthly call and credit limits per key (with UTC auto-reset)
Dynamic Pricing — Charge extra credits based on input size (creditsPerKbInput)
OAuth 2.1 — Full authorization server with PKCE, client registration, Bearer tokens
SSE Streaming — Full MCP Streamable HTTP transport (POST SSE, GET notifications, DELETE sessions)
Audit Log — Structured audit trail with retention policies, query API, CSV/JSON export
Registry/Discovery — Agent-discoverable pricing via /.well-known/mcp-payment, /pricing, and /.well-known/mcp.json identity card
OpenAPI 3.1 + Interactive Docs — Auto-generated spec at /openapi.json, Swagger UI at /docs — all 199+ endpoints documented
Public Endpoint Rate Limiting — Configurable per-IP rate limit (default 300/min) on /health, /info, /pricing, /docs, /openapi.json, /.well-known/*, /robots.txt, / — 429 with Retry-After header
Robots.txt + HEAD Support — Standard /robots.txt (allow public, disallow admin/keys), HEAD method on all public endpoints for uptime monitoring
Prometheus Metrics — /metrics endpoint with counters, gauges, and uptime in standard text format
Key Rotation — Rotate API keys without losing credits, ACLs, or quotas
Rate Limit Headers — X-RateLimit-* and X-Credits-Remaining on every /mcp response
Webhook Signatures — HMAC-SHA256 signed webhook payloads (X-PayGate-Signature) for tamper-proof delivery
Admin Lifecycle Events — Webhook notifications for key.created, key.revoked, key.rotated, key.topup
IP Allowlisting — Restrict API keys to specific IPs or CIDR ranges (IPv4)
Key Tags/Metadata — Attach arbitrary key-value tags to API keys for external system integration
Usage Analytics — Time-series analytics API with tool breakdown, top consumers, and trend comparison
Alert Webhooks — Configurable alerts for spending thresholds, low credits, quota warnings, key expiry, rate limit spikes
Team Management — Group API keys into teams with shared budgets, quotas, and usage tracking
Horizontal Scaling (Redis) — Redis-backed state for multi-process deployments with atomic credit deduction, distributed rate limiting, persistent usage audit trail, real-time pub/sub notifications, and admin API sync
Webhook Retry Queue — Exponential backoff retry (1s, 2s, 4s...) with dead letter queue for permanently failed deliveries, admin API for monitoring, clearing, and replaying
Admin Dashboard v2 — Tabbed web dashboard at /dashboard with overview, keys management (create/suspend/resume/revoke/top-up), analytics (credit flow, deny reasons, top consumers, webhook health), and system status — all data via safe DOM methods, 30s auto-refresh
Self-Service Portal — API key holder portal at /portal — check credits, usage, rate limits, available tools, and recent activity without admin access; includes Buy Credits UI, credit history with spending velocity, usage alerts, and self-service key rotation
Stripe Checkout — Self-service credit purchases via Stripe Checkout Sessions — POST /stripe/checkout creates a session, GET /stripe/packages lists available packages; zero-dependency implementation using Node.js https, auto-tops-up credits via webhook
State Backup & Restore — GET /admin/backup exports full server state (keys, teams, groups, webhooks) as versioned JSON with SHA-256 checksum; POST /admin/restore imports with merge/overwrite/full modes and integrity verification
API Version Header — X-PayGate-Version header on every HTTP response for client version tracking, exposed via CORS
Readiness Probe — GET /ready returns 200/503 based on operational state (not draining, not maintenance, backend connected) — separate from /health liveness probe, ideal for Kubernetes
Health Check + Graceful Shutdown — GET /health public endpoint with status, uptime, version, in-flight requests, Redis & webhook stats; gracefulStop() drains in-flight requests before teardown
Config Validation + Dry Run — paygate-mcp validate --config paygate.json catches misconfigurations before starting; --dry-run discovers tools, prints pricing table, then exits
Batch Tool Calls — tools/call_batch method for calling multiple tools in one request with all-or-nothing billing, aggregate credit checks, and parallel execution
Multi-Tenant Namespaces — Isolate API keys and usage data by tenant with namespace-filtered admin endpoints, analytics, and usage export
Scoped Tokens — Issue short-lived pgt_ tokens scoped to specific tools with auto-expiry (max 24h), HMAC-SHA256 signed, zero server-side state
Token Revocation List — Revoke scoped tokens before expiry with O(1) lookup, auto-cleanup, Redis cross-instance sync, and admin API
Usage-Based Auto-Topup — Automatically add credits when balance drops below a threshold with configurable daily limits, audit trail, webhook events, and Redis sync
Admin API Key Management — Multiple admin keys with role-based permissions (super_admin, admin, viewer), file persistence, audit trail, and safety guards
Plugin System — Extensible middleware hooks for custom billing logic, request/response transformation, custom endpoints, and lifecycle management
Key Groups — Policy templates that apply shared ACL, rate limits, pricing overrides, IP allowlists, and quotas to groups of API keys with automatic inheritance and key-level override support
Refund on Failure — Automatically refund credits when downstream tool calls fail
Credit Transfers — Atomically transfer credits between API keys with validation, audit trail, and webhook events
Bulk Key Operations — Execute multiple key operations (create, topup, revoke, suspend, resume) in a single request with per-operation error handling and index tracking
Key Import/Export — Export all API keys for backup/migration (JSON or CSV) and import with conflict resolution (skip, overwrite, error modes)
Webhook Filters — Route webhook events to different destinations based on event type and API key prefix with per-filter secrets, independent retry queues, and admin CRUD API
Key Cloning — POST /keys/clone creates a new API key with the same config (ACL, quotas, tags, IP, namespace, group, spending limit, expiry, auto-topup) but fresh counters — ideal for provisioning similar keys
Key Suspension — Temporarily disable API keys without revoking them — suspended keys are denied at the gate but can be resumed, and admin operations (topup, ACL, etc.) still work on suspended keys
Per-Key Usage — GET /keys/usage?key=... returns detailed usage breakdown for a specific key: per-tool stats, hourly time-series, deny reasons, recent events, and key metadata
Webhook Test — POST /webhooks/test sends a test event to your configured webhook URL with synchronous response including status code, response time, and delivery success/failure — verifies webhook connectivity without generating real events
Webhook Delivery Log — GET /webhooks/log returns a queryable log of all webhook delivery attempts with timestamps, HTTP status codes, response times, success/failure, retry attempts, event counts, and event types — filter by success status, time range, and limit
Webhook Pause/Resume — POST /webhooks/pause and POST /webhooks/resume temporarily halt webhook delivery during maintenance — events are buffered (not lost) and flushed on resume, with pause state visible in /webhooks/stats
Key Aliases — POST /keys/alias assigns human-readable aliases (e.g. my-service, prod-backend) to API keys — use aliases in any admin endpoint (topup, revoke, suspend, resume, clone, transfer, usage) instead of opaque key IDs, with uniqueness enforcement, format validation, state file persistence, and audit trail
Key Expiry Scanner — Proactive background scanner that detects expiring API keys before they expire — configurable scan interval and notification thresholds (default: 7d, 24h, 1h), de-duplicated key.expiry_warning webhook events, audit trail, GET /keys/expiring?within=86400 query endpoint, and graceful shutdown
Key Templates — Named templates for API key creation — define reusable presets (credits, ACL, quotas, IP, tags, namespace, expiry TTL, spending limit, auto-topup) and create keys with template: "free-tier" — explicit params override template defaults, CRUD admin API, Prometheus gauge, file persistence, max 100 templates
Environment Variables Config — Configure everything via PAYGATE_* env vars for Docker/K8s deployments — 18 env vars covering all CLI flags, with priority: CLI flags > env vars > config file > defaults, PAYGATE_CONFIG loads config file path, help text with Docker examples
Request ID Tracking — Every HTTP response includes X-Request-Id header (auto-generated req_ prefix + 16 hex chars) for distributed tracing — propagates incoming X-Request-Id from load balancers/proxies, included in gate audit log metadata, CORS-exposed, available via getRequestId(req) helper
Server Info Endpoint — GET /info returns server capabilities, enabled features, auth methods, pricing summary, rate limits, and available endpoints — public, no admin key required, ideal for agent auto-discovery and debugging
Configurable CORS — Control which origins can access your server: single origin, multiple origins, or wildcard (* default), with credentials support, configurable preflight max-age, and Vary: Origin for proper caching — set via config file cors object, --cors-origin CLI flag, or PAYGATE_CORS_ORIGIN env var
Custom Response Headers — Add security headers (X-Frame-Options, X-Content-Type-Options, etc.), cache control, or any custom headers to all HTTP responses — set via config file customHeaders object, --header CLI flag, or PAYGATE_CUSTOM_HEADERS env var
Config Export — GET /config returns the running server configuration with sensitive values masked (webhook secrets → ***, server commands → ***, webhook URLs → scheme+host only) — admin auth required, includes audit trail
Trusted Proxies — Configure trusted proxy IPs/CIDRs for accurate X-Forwarded-For extraction — walks the header right-to-left, skipping trusted proxies to find the real client IP, supports exact IPs and CIDR ranges (IPv4), backward compatible (first IP) when not configured
Key Listing Pagination — GET /keys supports pagination (limit/offset), sorting (sortBy/order), and filtering by namespace, group, active/suspended/expired status, name prefix, and credit range — backward compatible (returns a flat array when no pagination params are used)
Key Statistics — GET /keys/stats returns aggregate statistics across all keys — total/active/suspended/expired/revoked counts, credit aggregates (allocated/spent/remaining), total calls, namespace and group breakdowns, optional ?namespace= filter
Rate Limit Status — GET /keys/rate-limit-status?key=... returns the current rate limit window state for any key — global calls used/remaining/reset time, per-tool rate limits with individual usage, read-only (doesn't consume a call)
Quota Status — GET /keys/quota-status?key=... returns daily/monthly quota usage for any key — calls and credits used/remaining/limits, reset periods, quota source (per-key vs global vs none)
Credit History — GET /keys/credit-history?key=... returns per-key credit mutation log — tracks initial allocation, topups, transfers (in/out), auto-topups, with type/limit/since filters, balance-before/after on every entry, newest-first ordering, capped at 100 entries per key
Spending Velocity — GET /keys/spending-velocity?key=... returns credit burn rate and depletion forecast — credits/calls per hour/day, estimated depletion date, top tools by spend, configurable analysis window (1h–30d)
Key Comparison — GET /keys/compare?keys=pg_a,pg_b returns side-by-side comparison of 2–10 keys — credits, usage, velocity, rate limits, status, metadata (namespace/group/tags) — with not-found key reporting
Key Health Score — GET /keys/health?key=... returns composite health score (0–100) with weighted component breakdown: balance health (30%), quota utilization (25%), rate limit pressure (20%), error rate (25%) — status levels (healthy/good/caution/warning/critical), key issue detection (revoked/suspended/expired/expiring/zero credits), alias support
Maintenance Mode — POST /maintenance enables/disables maintenance mode with custom message — /mcp returns 503 to clients while admin endpoints stay operational, GET /maintenance checks status, GET /health reflects maintenance state, full audit trail
Admin Event Stream — GET /admin/events SSE endpoint streams real-time audit events to admin clients — tool calls, denials, key operations, maintenance changes, all with optional ?types= filter for event type filtering, keepalive pings, multi-client support
Key Notes — POST /keys/notes adds timestamped notes to API keys, GET /keys/notes?key=... lists notes, DELETE /keys/notes?key=...&index=N removes notes — max 50 per key, 1000 char limit, works on suspended/revoked keys, alias support, audit trail
Scheduled Actions — POST /keys/schedule creates future-dated actions (revoke/suspend/topup) on API keys, GET /keys/schedule lists pending schedules with optional ?key= filter, DELETE /keys/schedule?id=... cancels a schedule — max 20 per key, alias support, background execution timer, audit trail
Key Activity Timeline — GET /keys/activity?key=... returns a unified chronological feed of audit events and usage events for a specific key — newest first, optional ?since= and ?limit= filters, alias support
Credit Reservations — POST /keys/reserve holds credits, POST /keys/reserve/commit deducts held credits, POST /keys/reserve/release frees the hold, GET /keys/reserve lists active reservations — prevents overcommit, configurable TTL (10s–1h), max 50 per key, auto-expiry, audit trail
Request Log — GET /requests queryable log of every tool call with timing, credits charged, status (allowed/denied), deny reason, key, and request ID — filter by key/tool/status/since, pagination, summary statistics (totals + avg duration), 5000-entry ring buffer
Tool Stats — GET /tools/stats per-tool analytics: call counts, success rate, avg/p95 latency, credits consumed, deny reason breakdown, top 10 consumers — optional ?tool= for detailed single-tool view, ?since= filter
Request Log Export — GET /requests/export exports the full request log as JSON or CSV with Content-Disposition headers — filter by key/tool/status/since/until, combined time-window queries, no pagination limit
Tool Call Dry Run — POST /requests/dry-run simulates a tool call without executing — checks key validity, ACL, rate limits, credits, and spending limits, returns predicted outcome with credits-after calculation and rate limit status
Batch Dry Run — POST /requests/dry-run/batch simulates multiple tool calls at once — aggregate credit check, per-tool ACL validation, spending limit, returns per-tool results with total credits required and credits-after
Tool Availability — GET /tools/available?key=... returns per-key tool availability with pricing, affordability (canAfford), ACL enforcement (accessible/denyReason), and per-tool + global rate limit status
Key Dashboard — GET /keys/dashboard?key=... consolidated single-endpoint view with metadata, balance, health score, spending velocity, rate limits, quotas, usage summary, and recent activity timeline
Admin Notifications — GET /admin/notifications scans all keys for actionable issues: expired/expiring keys, zero credits, credit depletion velocity, suspended keys, high error rates, and rate limit pressure — with severity filtering and priority sorting
System Dashboard — GET /admin/dashboard system-wide overview with key counts (active/suspended/revoked/expired), credit summary (allocated/spent/remaining), usage breakdown with deny reasons, top consumers, top tools, notification counts, and uptime
Key Lifecycle Report — GET /admin/lifecycle aggregated lifecycle trends with daily creation/revocation/suspension buckets, average key lifetime, and at-risk keys (expiring, expired, zero credits)
Cost Analysis — GET /admin/costs cost-centric view with per-tool and per-namespace cost breakdowns, hourly spending trends, top spenders, average cost per call, and namespace filtering
Rate Limit Analysis — GET /admin/rate-limits rate limit utilization analysis with per-key and per-tool breakdown, denial trends, most throttled keys, and current window utilization
Quota Analysis — GET /admin/quotas quota utilization analysis with per-key daily/monthly usage vs limits, per-tool denial breakdown, most constrained keys, and global/per-key quota source tracking
Denial Analysis — GET /admin/denials comprehensive denial breakdown by reason type (insufficient_credits, rate_limited, quota_exceeded, key_suspended, etc.) with per-key and per-tool stats, hourly trends, and most denied keys
Traffic Analysis — GET /admin/traffic request volume analysis with tool popularity, hourly volume, top consumers by call count, namespace breakdown, peak hour identification, and success rates
Response Caching — SHA-256 keyed response cache for identical tool calls — skips backend invocation and credit deduction on cache hit, LRU eviction, per-tool or global TTL, X-Cache: HIT/MISS header, admin management (GET/DELETE /admin/cache), Prometheus gauge
Circuit Breaker — Three-state circuit breaker (closed → open → half_open) for backend failure detection — opens after N consecutive failures, auto-recovers after cooldown, error code -32003, admin management (GET/POST /admin/circuit)
Configurable Timeouts — Per-tool and global timeout for tool calls — returns error code -32004 on timeout, per-tool override via toolPricing[tool].timeoutMs, triggers circuit breaker failure recording
Outcome-Based Pricing — Charge extra credits based on response output size — creditsPerKbOutput per-tool config, post-response billing, X-Output-Surcharge header, complements creditsPerKbInput for complete size-based pricing
Compliance Audit Export — Framework-specific compliance reports for SOC 2, GDPR, HIPAA — GET /admin/compliance/export, event classification into access control/data processing/config changes/security, JSON or CSV export, configurable time periods
Per-Key Webhook URLs — Key-level webhook routing — events for a specific key sent to key's webhook URL alongside global webhook, SSRF-protected, HMAC-SHA256 signed, lazy emitter management via POST/GET/DELETE /keys/webhook
Security Audit — GET /admin/security security posture analysis identifying keys without IP allowlists, quotas, ACL restrictions, spending limits, or expiry dates, flagging high-credit keys, and computing a composite security score
Revenue Analysis — GET /admin/revenue revenue metrics with per-tool revenue breakdown, per-key spending, hourly revenue trends, credit flow summary (allocated/spent/remaining), and average revenue per call
Key Portfolio Health — GET /admin/key-portfolio portfolio-wide key health with active/inactive/suspended counts, stale keys, expiring-soon keys, age distribution, credit utilization, and namespace breakdown
Content Guardrails — Regex-based PII detection and redaction for tool call inputs/outputs — 8 built-in rules (credit card, SSN, email, phone, AWS key, API secret, IBAN, passport), 4 actions (log/warn/block/redact), scope filtering (input/output/both), per-tool targeting, violation tracking with query API, admin CRUD endpoints (/admin/guardrails, /admin/guardrails/violations)
IP Country Restrictions — Per-key geographic access control with allow/deny country lists (ISO 3166-1 alpha-2) — country code from reverse-proxy headers (X-Country, CF-IPCountry, configurable), CRUD via /keys/geo, enforced at gate evaluation, zero-dependency geo-fencing
Bulk Suspend/Resume — POST /keys/bulk supports suspend and resume actions — temporarily disable or re-activate multiple keys in one request with per-operation error handling
Concurrency Limiter — Per-key and per-tool inflight request caps — distinct from rate limiting, limits simultaneous active requests to protect backends from burst parallelism, error code -32005 with Retry-After header, runtime-adjustable via GET/POST /admin/concurrency
Traffic Mirroring — Fire-and-forget request duplication to a shadow backend for A/B testing MCP server versions — percentage-based sampling, configurable timeout, zero impact on primary response path, stats/management via GET/POST/DELETE /admin/mirror
Tool Aliasing + Deprecation — Tool renaming with RFC 8594 compliance — map old tool names to new ones with Deprecation, Sunset, and Link headers, chain prevention, per-alias call counts, CRUD via GET/POST/DELETE /admin/tool-aliases
Usage Plans — Tiered key policies (free/pro/enterprise) — bundle rate limits, quotas, credit multipliers, and tool ACL into reusable templates, assign keys to plans via POST /admin/keys/plan, denied tools rejected with error code -32403, CRUD via GET/POST/DELETE /admin/plans
Tool Input Schema Validation — Per-tool JSON Schema validation at the gateway — register schemas to reject invalid payloads before they reach downstream, zero-dependency JSON Schema subset (type, required, enum, minLength, pattern, items), error code -32602 with detailed errors, manage via GET/POST/DELETE /admin/tools/schema
Canary Routing — Weighted traffic splitting between primary and canary MCP servers — enable zero-downtime upgrades with percentage-based routing (0-100%), unbiased crypto.randomInt decisions, per-backend call/error tracking, weight updates without restart, manage via GET/POST/DELETE /admin/canary
Request/Response Transforms — Declarative rewriting of tool call arguments and responses — inject defaults, strip fields, rename keys, and template {{variables}} from context, wildcard tool matching, priority ordering, deep clone on apply, import/export for backup, manage via GET/POST/PUT/DELETE /admin/transforms
Backend Retry Policy — Automatic retry with exponential backoff for transient failures — configurable max retries, base/max backoff, full jitter, retry budget (max % of traffic as retries with cold-start grace), per-tool stats, retryable error pattern matching, manage via GET/POST /admin/retry-policy
Adaptive Rate Limiting — Dynamic rate adjustment based on key behavior — auto-tighten for high error rates, auto-boost for good actors, cooldown periods, configurable thresholds, per-key behavior tracking, LRU eviction, batch evaluation, manage via GET/POST /admin/adaptive-rates
Request Deduplication — Idempotency layer preventing duplicate billing from agent retries — X-Idempotency-Key header with auto-generation fallback (SHA-256), in-flight request coalescing, configurable TTL window, LRU eviction, credits-saved tracking, manage via GET/POST/DELETE /admin/dedup
Priority Queue — Tiered request prioritization (critical/high/normal/low/background) with fair scheduling — per-key priority assignment, configurable max wait times per tier, starvation prevention via automatic promotion, max queue depth limiting, manage via GET/POST /admin/priority-queue
Cost Allocation Tags — Per-request cost attribution via X-Cost-Tags header (JSON) for enterprise chargeback — aggregated reports by any tag dimension, cross-tabulation, CSV export, required tag enforcement per key, cardinality limits, manage via GET/POST/DELETE /admin/cost-tags
IP Access Control — Fine-grained IP-based access control with CIDR notation support — global allow/deny lists, per-key IP binding, automatic blocking after configurable violation thresholds, X-Forwarded-For/X-Real-IP trusted proxy depth, IPv6-mapped IPv4 normalization, manage via GET/POST/DELETE /admin/ip-access
Request Signing (HMAC-SHA256) — Cryptographic request authentication with replay protection — X-Signature: t=<ts>,n=<nonce>,s=<sig> header, timestamp tolerance with nonce dedup, per-key signing secrets with rotation, timing-safe comparison, manage via GET/POST/DELETE /admin/signing
Multi-Tenant Isolation — Full tenant isolation for platform operators — per-tenant rate limits, credit pools, usage tracking, API key binding, tenant suspension/activation, cross-tenant reporting, configurable limits (10K tenants, 1K keys/tenant), manage via GET/POST/DELETE /admin/tenants
Request Tracing — End-to-end structured tracing with span recording at gate, backend, and transform stages — trace/request ID lookup, timing breakdown (gateMs/backendMs/transformMs), configurable sample rate, retention limits, P95 latency tracking, JSON export, manage via GET/POST/DELETE /admin/tracing
Budget Policy Engine — Burn rate monitoring with progressive throttling — daily/monthly budget enforcement, credits/minute burn rate tracking over configurable windows, three actions (alert/throttle/deny), per-namespace and per-key targeting, budget remaining forecast, automatic daily/monthly reset, manage via GET/POST/DELETE /admin/budget-policies
Tool Dependency Graph — DAG-based workflow validation — register tool dependencies, enforce execution order, failure propagation (upstream failure blocks downstream), topological sort, cycle detection, per-workflow execution tracking, hard vs soft dependencies, group scoping, manage via GET/POST/DELETE /admin/tool-deps
Quota Management — Granular daily/weekly/monthly hard caps per API key — per-tool or global quotas, calls or credits metric, burst allowance (temporary over-limit percentage), three overage actions (deny/warn/throttle), UTC-based period boundaries (daily midnight, weekly Monday, monthly 1st), automatic period rollover, manage via GET/POST/DELETE /admin/quota-rules
Webhook Replay (DLQ) — Dead letter queue management for failed webhook deliveries — record failures with full request context (URL, headers, body, HMAC signature), replay individual or bulk failed deliveries, status tracking (pending → retrying → succeeded/exhausted), configurable max retries with timeout, purge by ID or status, age-based expiry, manage via GET/POST/DELETE /admin/webhook-replay
Config Profiles — Named configuration presets with save/activate/rollback — profile inheritance chains (base → child merging), SHA-256 checksums, flat-key diffing for comparison (onlyInA/onlyInB/changed/unchanged), import/export as JSON with merge or replace mode, activation history, circular inheritance detection, manage via GET/POST/DELETE /admin/config-profiles
Scheduled Reports — Automated periodic usage, billing, compliance, and security reports delivered via webhook — daily/weekly/monthly frequency with UTC period bounds, HMAC-SHA256 signed payloads, namespace/group/tool/key filters, report generation with delivery tracking, configurable timeouts, manage via POST /admin/scheduled-reports
Approval Workflows — Pre-execution approval gates for high-cost or sensitive tool calls — three conditions (cost_threshold, tool_match with glob, key_match with prefix), pending requests with configurable TTL (default 1h), approve/deny/expire lifecycle, trigger counting, manage via POST /admin/approval-workflows
Gateway Hooks — Pre/post request lifecycle hooks for custom logic — three stages (pre_gate, pre_backend, post_backend), four types (log, header_inject, metadata_tag, reject), priority-based execution pipeline, tool/key glob filtering, reject short-circuits processing, execution counting, manage via POST /admin/gateway-hooks
Anomaly Detection — GET /admin/anomalies identifies unusual patterns: keys with high denial rates, rapid credit depletion, low remaining credits, with severity ratings and detailed descriptions
Usage Forecasting — GET /admin/forecast predicts future credit consumption with per-key depletion estimates, calls remaining, at-risk key identification, system-wide consumption aggregates, and per-tool cost breakdown
Compliance Report — GET /admin/compliance generates compliance-ready report with key governance (expiry coverage), access control (ACL/IP/spending limit coverage), audit trail completeness, weighted overall score, and actionable recommendations
SLA Monitoring — GET /admin/sla tracks service level metrics: success rates, denial breakdowns by reason, per-tool availability and error rates, uptime tracking, sorted by call volume
Capacity Planning — GET /admin/capacity system capacity analysis with credit burn rates, utilization percentages, top consumers, per-namespace breakdown, and scaling recommendations
Key Dependency Map — GET /admin/dependencies tool-to-key relationship map with tool usage popularity, unique key counts per tool, per-key tool lists, and used/unused tool identification
Tool Latency Analysis — GET /admin/latency per-tool response time metrics with avg/p95/min/max durations, slowest tools ranking, and per-key latency breakdown
Error Rate Trends — GET /admin/error-trends denial rate trends with per-tool error rates, denial reason breakdown, worst-performing tools, and trend direction
Credit Flow Analysis — GET /admin/credit-flow credit inflow/outflow analysis with utilization percentage, top spenders, and per-tool spend breakdown
Key Age Analysis — GET /admin/key-age key age distribution with oldest/newest keys, age buckets (24h/7d/30d/older), and recently created list
Namespace Usage Summary — GET /admin/namespace-usage per-namespace usage metrics with credit allocation, spending, call counts, and cross-namespace comparison
Audit Summary — GET /admin/audit-summary audit event analytics with type breakdown, top actors, recent events, and activity summary
Group Performance — GET /admin/group-performance per-group analytics with key counts, credit allocation/spending, call volume, utilization, and policy summary
Request Volume Trends — GET /admin/request-trends hourly time-series of request volume, success/failure counts, credit spend, avg duration, and peak hour identification
Key Status Overview — GET /admin/key-status key status dashboard with active/suspended/revoked/expired counts and keys needing attention (low credits, near expiry)
Webhook Health — GET /admin/webhook-health webhook delivery health overview with success rate, pending retries, dead letter count, pause status, and buffered events
Consumer Insights — GET /admin/consumer-insights per-key behavioral analytics with top spenders, most active callers, tool diversity, and spending patterns
System Health Score — GET /admin/system-health composite 0-100 health score with weighted component breakdowns for key health, error rates, and credit utilization
Tool Adoption — GET /admin/tool-adoption per-tool adoption metrics with unique consumers, adoption rate, first/last seen timestamps, and usage ranking
Credit Efficiency — GET /admin/credit-efficiency credit allocation efficiency with burn efficiency, waste ratio, over-provisioned and under-provisioned key detection
Access Heatmap — GET /admin/access-heatmap hourly access patterns with tool breakdown, unique consumers, and peak hour identification
Key Churn Analysis — GET /admin/key-churn key churn metrics with creation/revocation rates, churn and retention percentages, and never-used key detection
Tool Correlation — GET /admin/tool-correlation tool co-occurrence analysis showing which tools are commonly used together by the same consumers
Consumer Segmentation — GET /admin/consumer-segmentation classifies API key consumers into power/regular/casual/dormant segments with per-segment metrics
Credit Distribution — GET /admin/credit-distribution histogram of credit balances across active keys with bucket ranges and median calculation
Response Time Distribution — GET /admin/response-time-distribution histogram of response times with latency buckets and p50/p95/p99 percentiles
Consumer Lifetime Value — GET /admin/consumer-lifetime-value per-consumer spend analysis with value tiers, tool diversity, and top spender rankings
Tool Revenue Ranking — GET /admin/tool-revenue ranks tools by total credits consumed with call counts, unique consumers, and percentage breakdown
Consumer Retention Cohorts — GET /admin/consumer-retention groups consumers by creation date with retention rates and avg spend per cohort
Error Breakdown — GET /admin/error-breakdown categorizes denied requests by reason with counts, percentages, affected consumers, and error rate
Credit Utilization Rate — GET /admin/credit-utilization shows utilization percentage across active keys with utilization bands and over-provisioning detection
Namespace Revenue — GET /admin/namespace-revenue revenue breakdown by namespace with spend, call counts, key counts, and percentage breakdown
Group Revenue — GET /admin/group-revenue revenue breakdown by key group with spend, call counts, key counts, and percentage breakdown
Peak Usage Times — GET /admin/peak-usage traffic patterns by hour-of-day with request counts, credits, unique consumers, and peak hour identification
Consumer Activity — GET /admin/consumer-activity per-consumer activity metrics with calls, spend, credits remaining, last active time, and active/inactive status
Tool Popularity — GET /admin/tool-popularity tool usage popularity with call counts, credits, unique consumers, percentage, and most popular tool identification
Credit Allocation Summary — GET /admin/credit-allocation credit allocation across active keys with tier breakdown (1-100, 101-500, 501+), totals, and average allocation
Daily Summary — GET /admin/daily-summary daily rollup of requests, credits spent, new keys, errors, unique consumers and tools for trend analysis
Key Ranking — GET /admin/key-ranking leaderboard of active keys ranked by spend, calls, or credits remaining with configurable sorting
Hourly Traffic — GET /admin/hourly-traffic granular per-hour request counts with allowed/denied breakdown, credits, consumers, tools, and busiest hour
Tool Error Rate — GET /admin/tool-error-rate per-tool error rates with denied/allowed counts, error percentage, and overall reliability metrics
Consumer Spend Velocity — GET /admin/consumer-spend-velocity per-consumer spend rate with credits/hour, depletion forecast, and velocity ranking
Namespace Activity — GET /admin/namespace-activity per-namespace activity metrics with key counts, spend, calls, credits remaining for multi-tenant visibility
Credit Burn Rate — GET /admin/credit-burn-rate system-wide credit burn rate with credits/hour, utilization percentage, depletion forecast
Consumer Risk Score — GET /admin/consumer-risk-score per-consumer risk scoring based on utilization with risk levels (low/medium/high/critical)
Revenue Forecast — GET /admin/revenue-forecast projected revenue with hourly/daily/weekly/monthly forecasts capped by remaining credits
System Overview — GET /admin/system-overview executive summary with key counts, credit totals, utilization, activity metrics
Key Health Overview — GET /admin/key-health-overview holistic per-key health check with utilization, status levels, health distribution
Namespace Comparison — GET /admin/namespace-comparison side-by-side namespace comparison with allocation, spend, utilization, leader
Consumer Growth — GET /admin/consumer-growth consumer growth metrics with age, spend rate, credits allocated, new consumer count
Tool Profitability — GET /admin/tool-profitability per-tool profitability analysis with revenue, calls, avg revenue per call, unique callers
Credit Waste Analysis — GET /admin/credit-waste per-key credit waste analysis with utilization metrics and waste percentage
Group Activity — GET /admin/group-activity per-group activity metrics with key counts, spend, calls, credits remaining for policy-template analytics
Config Hot Reload — POST /config/reload reloads pricing, rate limits, webhooks, quotas, and behavior flags from config file without server restart
Webhook Events — POST batched usage events to any URL for external billing/alerting
Config File Mode — Load all settings from a JSON file (--config)
Shadow Mode — Log everything without enforcing payment (for testing)
Persistent Storage — Keys, credits, admin keys, and groups survive restarts with --state-file
Zero Dependencies — No external npm packages. Uses only Node.js built-ins.

Usage

Wrap a Local MCP Server (stdio)

# Default: 1 credit per call, 60 calls/min, port 3402
npx paygate-mcp wrap --server "npx @modelcontextprotocol/server-filesystem /tmp"

# Custom pricing and limits
npx paygate-mcp wrap \
  --server "python my-server.py" \
  --price 2 \
  --rate-limit 30 \
  --port 8080

# Per-tool pricing
npx paygate-mcp wrap \
  --server "node server.js" \
  --tool-price "search:1,generate:5,premium_analyze:20"

# Shadow mode (observe without enforcing)
npx paygate-mcp wrap --server "node server.js" --shadow

Gate a Remote MCP Server (Streamable HTTP)

Gate any remote MCP server that supports the Streamable HTTP transport (MCP spec 2025-03-26):

npx paygate-mcp wrap --remote-url "https://my-mcp-server.example.com/mcp"

# With custom pricing
npx paygate-mcp wrap \
  --remote-url "https://api.example.com/mcp" \
  --price 5 \
  --tool-price "gpt4:20,search:2"

The proxy handles:

JSON-RPC forwarding via HTTP POST
SSE (text/event-stream) response parsing
Mcp-Session-Id session management
Graceful session cleanup (HTTP DELETE on shutdown)
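
As a rough illustration of the SSE parsing step, the sketch below pulls JSON-RPC messages out of a text/event-stream body. It is a minimal parser written for this example, not PayGate's internal code; parseSseData is a hypothetical helper name.

```typescript
// Hypothetical helper: collect the `data:` payload of each SSE event.
// Events are separated by a blank line; multiple data lines join with "\n".
function parseSseData(body: string): string[] {
  const payloads: string[] = [];
  for (const block of body.split(/\r?\n\r?\n/)) {
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("data:")) {
        // The SSE format strips one optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
  }
  return payloads;
}

const stream =
  'event: message\ndata: {"jsonrpc":"2.0","id":1,"result":{}}\n\n' +
  ": keep-alive comment\n\n" +
  'data: {"jsonrpc":"2.0","id":2,"result":{}}\n\n';

const messages = parseSseData(stream).map((p) => JSON.parse(p) as { id: number });
console.log(messages.map((m) => m.id)); // [ 1, 2 ]
```

Comment lines (starting with a colon) and event blocks without data are skipped, so only real messages reach the JSON parser.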

When PayGate starts, it prints your admin key to the console. Save it: admin endpoints require it in the X-Admin-Key header.

Multi-Server Mode

Wrap multiple MCP servers behind a single PayGate instance. Tools are prefixed with the server name:

npx paygate-mcp wrap --config multi-server.json

Example multi-server.json:

{
  "port": 3402,
  "defaultCreditsPerCall": 1,
  "servers": [
    {
      "prefix": "fs",
      "serverCommand": "npx",
      "serverArgs": ["@modelcontextprotocol/server-filesystem", "/tmp"]
    },
    {
      "prefix": "github",
      "remoteUrl": "https://github-mcp.example.com/mcp"
    }
  ]
}

Tools are exposed with prefixes: fs:read_file, fs:write_file, github:search_repos, etc. Pricing and ACLs work on the prefixed names:

{
  "toolPricing": {
    "github:search_repos": { "creditsPerCall": 5 },
    "fs:read_file": { "creditsPerCall": 1 }
  }
}

Credits are shared across all backends — one API key works for all servers.
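
To make the prefix convention concrete, here is a small sketch of how a prefixed tool name could be split and priced. The helper names (splitPrefixedTool, priceFor) are hypothetical and only mirror the config fields shown above; PayGate's own routing code may differ.

```typescript
interface ToolPrice {
  creditsPerCall: number;
}

// Split "fs:read_file" into the backend prefix and the tool name on that backend.
function splitPrefixedTool(name: string): { prefix: string; tool: string } | null {
  const i = name.indexOf(":");
  if (i <= 0 || i === name.length - 1) return null; // no prefix or no tool part
  return { prefix: name.slice(0, i), tool: name.slice(i + 1) };
}

// Look up the price by the full prefixed name, falling back to the default.
function priceFor(
  name: string,
  toolPricing: Record<string, ToolPrice>,
  defaultCreditsPerCall: number,
): number {
  return toolPricing[name]?.creditsPerCall ?? defaultCreditsPerCall;
}

const toolPricing: Record<string, ToolPrice> = {
  "github:search_repos": { creditsPerCall: 5 },
  "fs:read_file": { creditsPerCall: 1 },
};

console.log(splitPrefixedTool("fs:read_file")); // { prefix: 'fs', tool: 'read_file' }
console.log(priceFor("github:search_repos", toolPricing, 1)); // 5
console.log(priceFor("fs:write_file", toolPricing, 1)); // 1 (default)
```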

Client SDK

Use PayGateClient to call tools from TypeScript/Node.js with auto 402 retry:

import { PayGateClient, PayGateError } from 'paygate-mcp/client';

const client = new PayGateClient({
  url: 'http://localhost:3402',
  apiKey: 'pg_abc123...',
  autoRetry: true,
  onCreditsNeeded: async (info) => {
    // Called when credits run out — add credits and return true to retry
    await topUpCredits(info.creditsRequired);
    return true;
  },
});

const tools = await client.listTools();
const result = await client.callTool('search', { query: 'hello' });
const balance = await client.getBalance();

Features:

Auto 402 retry: When a tool call returns payment-required, calls onCreditsNeeded and retries
Balance tracking: client.lastKnownBalance tracks credits from getBalance() calls
Typed errors: PayGateError with .isPaymentRequired, .isRateLimited, .isExpired helpers
Zero dependencies: Uses Node.js built-in http/https

Create API Keys

curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "my-client", "credits": 100}'

Call Tools

curl -X POST http://localhost:3402/mcp \
  -H "Content-Type: application/json" \
  -H "X-API-Key: CLIENT_API_KEY" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "read_file",
      "arguments": {"path": "/tmp/test.txt"}
    }
  }'

Top Up Credits

curl -X POST http://localhost:3402/topup \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "credits": 500}'

Check Balance (Client Self-Service)

curl http://localhost:3402/balance \
  -H "X-API-Key: CLIENT_API_KEY"

Returns credits, total spent, call count, and last used timestamp. Clients can check their own balance without needing admin access.

Export Usage Data (Admin)

# JSON export
curl http://localhost:3402/usage \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# CSV export (for spreadsheet/billing import)
curl "http://localhost:3402/usage?format=csv" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by date
curl "http://localhost:3402/usage?since=2025-01-01T00:00:00Z" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns per-call usage events with tool name, credits charged, and timestamps. API keys are masked in output.

Check Status

curl http://localhost:3402/status \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns active keys, usage stats, per-tool breakdown, and deny reasons.

Admin Dashboard

Open the web dashboard in your browser:

http://localhost:3402/dashboard

A real-time admin UI for managing keys, viewing usage, and monitoring tool calls. Enter your admin key to authenticate. Features auto-refresh every 30s, top tools chart, activity feed, and key creation/management.

API Reference

Endpoint	Method	Auth	Description
`/mcp`	POST	`X-API-Key` or `Bearer`	JSON-RPC 2.0 proxy (returns JSON or SSE)
`/mcp`	GET	`X-API-Key` or `Bearer`	SSE notification stream (Streamable HTTP)
`/mcp`	DELETE	`Mcp-Session-Id`	Terminate an MCP session
`/balance`	GET	`X-API-Key`	Client self-service — check credits, quota, ACL, expiry
`/keys`	POST	`X-Admin-Key`	Create API key (with ACL, expiry, quota, credits)
`/keys`	GET	`X-Admin-Key`	List all keys (masked, with expiry status)
`/topup`	POST	`X-Admin-Key`	Add credits to an existing key
`/keys/transfer`	POST	`X-Admin-Key`	Transfer credits between API keys
`/keys/bulk`	POST	`X-Admin-Key`	Execute multiple key operations (create, topup, revoke) in one request
`/keys/export`	GET	`X-Admin-Key`	Export all API keys for backup/migration (JSON or CSV)
`/keys/import`	POST	`X-Admin-Key`	Import API keys from backup with conflict resolution
`/keys/revoke`	POST	`X-Admin-Key`	Permanently revoke an API key
`/keys/suspend`	POST	`X-Admin-Key`	Temporarily suspend a key (reversible)
`/keys/resume`	POST	`X-Admin-Key`	Resume a suspended key
`/keys/clone`	POST	`X-Admin-Key`	Clone a key (new key, same config, fresh counters)
`/keys/usage`	GET	`X-Admin-Key`	Per-key usage breakdown (per-tool, time-series, deny reasons)
`/keys/rotate`	POST	`X-Admin-Key`	Rotate key (new key, same credits/ACL/quotas)
`/keys/acl`	POST	`X-Admin-Key`	Set tool ACL (whitelist/blacklist) on a key
`/keys/expiry`	POST	`X-Admin-Key`	Set or remove key expiry (TTL)
`/keys/quota`	POST	`X-Admin-Key`	Set usage quota (daily/monthly limits)
`/keys/tags`	POST	`X-Admin-Key`	Set key tags/metadata (merge semantics)
`/keys/ip`	POST	`X-Admin-Key`	Set IP allowlist (CIDR + exact match)
`/keys/search`	POST	`X-Admin-Key`	Search keys by tag values
`/keys/auto-topup`	POST	`X-Admin-Key`	Configure or disable auto-topup for a key
`/admin/keys`	GET	`X-Admin-Key` (super_admin)	List all admin keys (masked)
`/admin/keys`	POST	`X-Admin-Key` (super_admin)	Create a new admin key with role
`/admin/keys/revoke`	POST	`X-Admin-Key` (super_admin)	Revoke an admin key
`/limits`	POST	`X-Admin-Key`	Set spending limit on a key
`/usage`	GET	`X-Admin-Key`	Export usage data (JSON or CSV)
`/status`	GET	`X-Admin-Key`	Full dashboard with usage stats
`/dashboard`	GET	None (admin key in-browser)	Real-time admin web dashboard
`/stripe/checkout`	POST	`X-API-Key`	Create Stripe Checkout Session for credit purchase
`/stripe/packages`	GET	None	List available credit packages (public, rate-limited)
`/stripe/webhook`	POST	Stripe Signature	Auto-top-up credits on payment
`/admin/backup`	GET	`X-Admin-Key`	Export full server state as versioned JSON snapshot
`/admin/restore`	POST	`X-Admin-Key`	Import state from backup (merge/overwrite/full modes)
`/admin/cache`	GET	`X-Admin-Key`	Response cache stats (entries, hits, misses, hit rate)
`/admin/cache`	DELETE	`X-Admin-Key`	Clear cache (all or `?tool=` filter)
`/admin/circuit`	GET	`X-Admin-Key`	Circuit breaker status (state, failures, rejections)
`/admin/circuit`	POST	`X-Admin-Key`	Reset circuit breaker to closed state
`/admin/compliance/export`	GET	`X-Admin-Key`	Compliance audit export (SOC 2/GDPR/HIPAA, JSON/CSV)
`/keys/webhook`	POST	`X-Admin-Key`	Set per-key webhook URL
`/keys/webhook`	GET	`X-Admin-Key`	Get per-key webhook status
`/keys/webhook`	DELETE	`X-Admin-Key`	Remove per-key webhook URL
`/.well-known/oauth-authorization-server`	GET	None	OAuth 2.1 server metadata
`/oauth/register`	POST	None	Dynamic Client Registration (RFC 7591)
`/oauth/authorize`	GET	None	Authorization endpoint (PKCE required)
`/oauth/token`	POST	None	Token endpoint (code exchange + refresh)
`/oauth/revoke`	POST	None	Token revocation (RFC 7009)
`/oauth/clients`	GET	`X-Admin-Key`	List registered OAuth clients
`/.well-known/mcp-payment`	GET	None	Server payment metadata (SEP-2007)
`/.well-known/mcp.json`	GET	None	MCP Server Identity card (discovery)
`/pricing`	GET	None	Full per-tool pricing breakdown
`/openapi.json`	GET	None	OpenAPI 3.1 spec (all 199+ endpoints)
`/docs`	GET	None	Interactive API docs (Swagger UI)
`/robots.txt`	GET	None	Crawler directives (allow public, disallow admin/keys)
`/portal`	GET	None	Self-service API key portal (browser UI, auth via X-API-Key prompt)
`/ready`	GET	None	Readiness probe (200 when ready, 503 when draining/maintenance)
`/metrics`	GET	None	Prometheus metrics (counters, gauges, uptime)
`/analytics`	GET	`X-Admin-Key`	Usage analytics (time-series, tool breakdown, trends)
`/alerts`	GET	`X-Admin-Key`	Consume pending alerts
`/alerts`	POST	`X-Admin-Key`	Configure alert rules
`/teams`	GET	`X-Admin-Key`	List all teams
`/teams`	POST	`X-Admin-Key`	Create a team (name, budget, quota, tags)
`/teams/update`	POST	`X-Admin-Key`	Update team settings
`/teams/delete`	POST	`X-Admin-Key`	Delete (deactivate) a team
`/teams/assign`	POST	`X-Admin-Key`	Assign an API key to a team
`/teams/remove`	POST	`X-Admin-Key`	Remove an API key from a team
`/teams/usage`	GET	`X-Admin-Key`	Team usage summary with member breakdown
`/tokens`	POST	`X-Admin-Key`	Create a scoped token (short-lived, tool-restricted)
`/tokens/revoke`	POST	`X-Admin-Key`	Revoke a scoped token (by full token string)
`/tokens/revoked`	GET	`X-Admin-Key`	List all revoked token entries
`/namespaces`	GET	`X-Admin-Key`	List all namespaces with key/credit/spending stats
`/audit`	GET	`X-Admin-Key`	Query audit log (filter by type, actor, time)
`/audit/export`	GET	`X-Admin-Key`	Export full audit log (JSON or CSV)
`/audit/stats`	GET	`X-Admin-Key`	Audit log statistics
`/plugins`	GET	`X-Admin-Key`	List registered plugins with hook info
`/groups`	GET	`X-Admin-Key`	List all key groups (policy templates)
`/groups`	POST	`X-Admin-Key`	Create a key group with shared policies
`/groups/update`	POST	`X-Admin-Key`	Update group policies
`/groups/delete`	POST	`X-Admin-Key`	Delete (deactivate) a group
`/groups/assign`	POST	`X-Admin-Key`	Assign an API key to a group
`/groups/remove`	POST	`X-Admin-Key`	Remove an API key from a group
`/webhooks/filters`	GET	`X-Admin-Key`	List all webhook filter rules
`/webhooks/filters`	POST	`X-Admin-Key`	Create a webhook filter rule
`/webhooks/filters/update`	POST	`X-Admin-Key`	Update a webhook filter rule
`/webhooks/filters/delete`	POST	`X-Admin-Key`	Delete a webhook filter rule
`/webhooks/replay`	POST	`X-Admin-Key`	Replay dead letter webhook events (all or by index)
`/webhooks/test`	POST	`X-Admin-Key`	Send test event to configured webhook URL (synchronous)
`/webhooks/log`	GET	`X-Admin-Key`	Webhook delivery log with status, timing, and filters
`/webhooks/pause`	POST	`X-Admin-Key`	Pause webhook delivery (events buffered until resumed)
`/webhooks/resume`	POST	`X-Admin-Key`	Resume webhook delivery and flush buffered events
`/keys/alias`	POST	`X-Admin-Key`	Set or clear a human-readable alias for an API key
`/keys/expiring`	GET	`X-Admin-Key`	List keys expiring within a time window (`?within=86400` seconds)
`/keys/templates`	GET	`X-Admin-Key`	List all key templates
`/keys/templates`	POST	`X-Admin-Key`	Create or update a key template
`/keys/templates/delete`	POST	`X-Admin-Key`	Delete a key template
`/config/reload`	POST	`X-Admin-Key`	Hot-reload config file (pricing, rate limits, webhooks, quotas)
`/health`	GET	None	Health check (status, uptime, version, in-flight, Redis/webhook status)
`/`	GET	None	Root endpoint (endpoint list)

Free Methods

These MCP methods pass through without auth or billing: initialize, initialized, ping, tools/list, resources/list, prompts/list

Gated methods: tools/call (single), tools/call_batch (batch — all-or-nothing billing, parallel execution). See Batch Tool Calls.
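
The free/gated split can be pictured as a simple lookup on the JSON-RPC method name. This is an illustrative sketch only; classifyMethod is a hypothetical name, and how PayGate treats methods that appear in neither list is not stated here, so the sketch reports them as "other".

```typescript
// Method names copied from the lists above.
const FREE_METHODS = new Set([
  "initialize",
  "initialized",
  "ping",
  "tools/list",
  "resources/list",
  "prompts/list",
]);
const GATED_METHODS = new Set(["tools/call", "tools/call_batch"]);

type Gate = "free" | "gated" | "other";

// Decide whether a JSON-RPC method needs auth and billing before forwarding.
function classifyMethod(method: string): Gate {
  if (FREE_METHODS.has(method)) return "free";
  if (GATED_METHODS.has(method)) return "gated";
  return "other"; // assumption: behavior for unlisted methods is not documented here
}

console.log(classifyMethod("tools/list")); // free
console.log(classifyMethod("tools/call")); // gated
```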

CLI Commands

paygate-mcp wrap [options]             # Start a payment-gated MCP proxy
paygate-mcp init [--output] [--force]  # Interactive setup wizard
paygate-mcp validate --config <path>   # Validate config without starting
paygate-mcp completions <bash|zsh|fish> # Generate shell completions
paygate-mcp version [--json]           # Print version

Shell Completions

# Bash
paygate-mcp completions bash > ~/.local/share/bash-completion/completions/paygate-mcp

# Zsh
paygate-mcp completions zsh > ~/.zfunc/_paygate-mcp
# Add to .zshrc: fpath=(~/.zfunc $fpath); autoload -Uz compinit && compinit

# Fish
paygate-mcp completions fish > ~/.config/fish/completions/paygate-mcp.fish

Machine-Readable Output

# Version as JSON (for CI/CD)
paygate-mcp version --json
# → {"version":"10.3.0"}

# Validate config with structured output
paygate-mcp validate --config paygate.json --json
# → {"valid":true,"diagnostics":[...],"errors":0,"warnings":0}

CLI Options

--server <cmd>       MCP server command to wrap via stdio
--remote-url <url>   Remote MCP server URL (Streamable HTTP transport)
--port <n>           HTTP port (default: 3402)
--price <n>          Default credits per tool call (default: 1)
--rate-limit <n>     Max calls/min per key (default: 60, 0=unlimited)
--name <s>           Server display name
--shadow             Shadow mode — log without enforcing payment
--admin-key <s>      Set admin key (default: auto-generated)
--tool-price <t:n>   Per-tool price (e.g. "search:5,generate:10")
--import-key <k:c>   Import existing key with credits (e.g. "pg_abc:100")
--state-file <path>  Persist keys/credits to a JSON file (survives restarts)
--stripe-secret <s>  Stripe webhook signing secret (enables /stripe/webhook)
--webhook-url <url>  POST batched usage events to this URL
--webhook-secret <s> HMAC-SHA256 secret for signing webhook payloads
--refund-on-failure  Refund credits when downstream tool call fails
--redis-url <url>    Redis URL for distributed state (e.g. "redis://localhost:6379")
--config <path>      Load settings from a JSON config file
--discovery <mode>   Tool discovery mode: static (default) or dynamic
--json               Machine-readable JSON output

Note: Use either --server or --remote-url for single-server mode. For multi-server mode, define a servers array in a config file.

Dynamic Tool Discovery

For servers with many tools, dynamic discovery mode reduces agent context window bloat by exposing 3 meta-tools instead of the full tool list:

npx paygate-mcp wrap --server "your-server" --discovery dynamic

Agents see 3 tools: paygate_list_tools (paginated listing), paygate_search_tools (keyword search), and paygate_call_tool (proxy any tool). This reduces N tools to 3 in the context window while preserving full functionality.

Persistent Storage

Add --state-file to save API keys and credits to disk. Data survives server restarts.

npx paygate-mcp wrap --server "your-mcp-server" --state-file ~/.paygate/state.json

Stripe Integration

Connect Stripe to automatically top up credits when customers pay:

npx paygate-mcp wrap \
  --server "your-mcp-server" \
  --state-file ~/.paygate/state.json \
  --stripe-secret "whsec_your_stripe_webhook_secret"

Setup:

1. Create a Stripe Checkout Session with metadata:
   - paygate_api_key: the customer's API key (e.g. pg_abc123...)
   - paygate_credits: credits to add on payment (e.g. 500)
2. Point your Stripe webhook to https://your-server/stripe/webhook
3. Subscribe to checkout.session.completed and invoice.payment_succeeded events

When a customer completes payment, credits are automatically added to their API key. Subscriptions auto-renew credits on each billing cycle.

Security:

HMAC-SHA256 signature verification (Stripe's v1 scheme)
Timing-safe comparison to prevent timing attacks
5-minute timestamp tolerance to prevent replay attacks
Payment status verification (only paid triggers credits)
Zero dependencies — uses Node.js built-in crypto
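
For readers who want to see what these checks amount to, the sketch below verifies a Stripe-style signature header (t=<unix seconds>,v1=<hex HMAC>) with Node's built-in crypto. It is an independent illustration of Stripe's v1 scheme, not PayGate's source; verifyStripeSignature is a hypothetical name, and for brevity it keeps only one v1 entry where a real Stripe header may carry several.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify "t=<unix seconds>,v1=<hex hmac>" over the string "<t>.<payload>".
function verifyStripeSignature(
  payload: string,
  header: string,
  secret: string,
  nowSeconds: number,
  toleranceSeconds = 300, // 5-minute replay window
): boolean {
  const parts = new Map<string, string>();
  for (const item of header.split(",")) {
    const eq = item.indexOf("=");
    if (eq > 0) parts.set(item.slice(0, eq).trim(), item.slice(eq + 1).trim());
  }
  const t = parts.get("t");
  const v1 = parts.get("v1");
  if (t === undefined || v1 === undefined) return false;

  const timestamp = Number(t);
  if (!Number.isFinite(timestamp)) return false;
  if (Math.abs(nowSeconds - timestamp) > toleranceSeconds) return false;

  const expected = createHmac("sha256", secret).update(`${t}.${payload}`).digest();
  const received = Buffer.from(v1, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Usage: sign a sample body, then verify inside and outside the tolerance window.
const secret = "whsec_example";
const body = '{"type":"checkout.session.completed"}';
const t = 1709123456;
const sig = createHmac("sha256", secret).update(`${t}.${body}`).digest("hex");
const header = `t=${t},v1=${sig}`;

console.log(verifyStripeSignature(body, header, secret, t + 10)); // true
console.log(verifyStripeSignature(body, header, secret, t + 600)); // false
```

Always verify against the raw request body; re-serializing parsed JSON can change the bytes and invalidate the signature.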

Per-Tool ACL (Access Control)

Control which tools each API key can access:

# Create a key that can only access search and read tools
curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "limited-client", "credits": 100, "allowedTools": ["search", "read_file"]}'

# Create a key with specific tools blocked
curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "safe-client", "credits": 100, "deniedTools": ["delete_file", "admin_reset"]}'

# Update ACL on an existing key
curl -X POST http://localhost:3402/keys/acl \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "allowedTools": ["search"], "deniedTools": ["admin"]}'

allowedTools (whitelist): Only these tools are accessible. Empty = all tools.
deniedTools (blacklist): These tools are always denied. Applied after allowedTools.
ACL also filters tools/list — clients only see their permitted tools.
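
The evaluation order described above can be sketched as follows. This is a simplified model, assuming exact tool-name matching; KeyAcl, isToolAllowed, and visibleTools are hypothetical names used only for illustration.

```typescript
interface KeyAcl {
  allowedTools: string[]; // empty means "all tools"
  deniedTools: string[]; // always denied, checked after allowedTools
}

// A tool is callable if it passes the allow list and is not on the deny list.
function isToolAllowed(tool: string, acl: KeyAcl): boolean {
  const passesAllowList = acl.allowedTools.length === 0 || acl.allowedTools.includes(tool);
  return passesAllowList && !acl.deniedTools.includes(tool);
}

// The same check filters what a client sees in tools/list.
function visibleTools(allTools: string[], acl: KeyAcl): string[] {
  return allTools.filter((tool) => isToolAllowed(tool, acl));
}

const allTools = ["search", "read_file", "delete_file"];

console.log(visibleTools(allTools, { allowedTools: [], deniedTools: ["delete_file"] }));
// [ 'search', 'read_file' ]
console.log(visibleTools(allTools, { allowedTools: ["search", "read_file"], deniedTools: ["read_file"] }));
// [ 'search' ]
```

Because the deny list is applied last, a tool that appears in both lists ends up denied.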

Per-Tool Rate Limits

Set independent rate limits per tool (on top of the global limit):

{
  "toolPricing": {
    "expensive_analyze": { "creditsPerCall": 10, "rateLimitPerMin": 5 },
    "search": { "creditsPerCall": 1, "rateLimitPerMin": 30 },
    "cheap_read": { "creditsPerCall": 1 }
  }
}

Per-tool limits are enforced independently per API key. A key can be rate-limited on one tool while still accessing others. The global --rate-limit applies across all tools.
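
A sliding-window limiter that combines the global and per-tool limits might look like the sketch below. It is a minimal in-memory model with hypothetical names (SlidingWindowLimiter, allowCall), not PayGate's implementation; the clock is passed in explicitly so the behavior is easy to follow.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;

  constructor(windowMs = 60_000) {
    this.windowMs = windowMs;
  }

  // Drop timestamps that have left the window and return the survivors.
  private recent(bucket: string, nowMs: number): number[] {
    const cutoff = nowMs - this.windowMs;
    const kept = (this.hits.get(bucket) ?? []).filter((t) => t > cutoff);
    this.hits.set(bucket, kept);
    return kept;
  }

  // A limit of 0 means unlimited, matching --rate-limit 0.
  hasRoom(bucket: string, limit: number, nowMs: number): boolean {
    return limit === 0 || this.recent(bucket, nowMs).length < limit;
  }

  record(bucket: string, nowMs: number): void {
    this.recent(bucket, nowMs).push(nowMs);
  }
}

// Check both limits first, then record, so a denied call consumes nothing.
function allowCall(
  limiter: SlidingWindowLimiter,
  apiKey: string,
  tool: string,
  globalLimit: number,
  toolLimit: number | undefined,
  nowMs: number,
): boolean {
  const toolBucket = `${apiKey}|${tool}`;
  if (!limiter.hasRoom(apiKey, globalLimit, nowMs)) return false;
  if (toolLimit !== undefined && !limiter.hasRoom(toolBucket, toolLimit, nowMs)) return false;
  limiter.record(apiKey, nowMs);
  limiter.record(toolBucket, nowMs);
  return true;
}

const limiter = new SlidingWindowLimiter();
const results: boolean[] = [];
for (let i = 0; i < 6; i++) {
  results.push(allowCall(limiter, "pg_demo", "expensive_analyze", 60, 5, i * 1000));
}
console.log(results); // five true values, then false
console.log(allowCall(limiter, "pg_demo", "search", 60, 30, 6000)); // true
```

The sixth expensive_analyze call is denied by the per-tool limit of 5, while search on the same key is still allowed.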

Key Expiry (TTL)

Create API keys that auto-expire:

# Create a key that expires in 1 hour (3600 seconds)
curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "trial-user", "credits": 50, "expiresIn": 3600}'

# Create a key with a specific expiry date
curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "quarterly", "credits": 1000, "expiresAt": "2026-06-01T00:00:00Z"}'

# Set or extend expiry on an existing key
curl -X POST http://localhost:3402/keys/expiry \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "expiresIn": 86400}'

# Remove expiry (key never expires)
curl -X POST http://localhost:3402/keys/expiry \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "expiresAt": null}'

Expired keys return a clear api_key_expired error. Admins can extend or remove expiry at any time.

Credit Transfers

Atomically transfer credits between API keys:

curl -X POST http://localhost:3402/keys/transfer \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "from": "pg_source_key", "to": "pg_dest_key", "credits": 500, "memo": "Monthly allocation" }'

Response:

{
  "transferred": 500,
  "from": { "keyMasked": "pg_sour...key1", "balance": 500 },
  "to": { "keyMasked": "pg_dest...key2", "balance": 700 },
  "memo": "Monthly allocation",
  "message": "Transferred 500 credits"
}

Validation: Both keys must exist, be active (not revoked/expired), and the source must have sufficient credits. Fractional credits are floored to integers. Self-transfers are rejected.

Audit trail: Every transfer logs a key.credits_transferred audit event with masked keys, amount, balances, and memo.
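
The validation rules above can be sketched as a single function over an in-memory key store. This is a simplified model with hypothetical names (KeyRecord, transferCredits); the requirement that the amount be positive is an assumption added for the example, and expiry is folded into the active flag.

```typescript
interface KeyRecord {
  credits: number;
  active: boolean; // false when revoked or expired
}

type TransferResult =
  | { ok: true; transferred: number; fromBalance: number; toBalance: number }
  | { ok: false; error: string };

function transferCredits(
  keys: Map<string, KeyRecord>,
  from: string,
  to: string,
  credits: number,
): TransferResult {
  if (from === to) return { ok: false, error: "self-transfer rejected" };
  const src = keys.get(from);
  const dst = keys.get(to);
  if (!src || !dst) return { ok: false, error: "key not found" };
  if (!src.active || !dst.active) return { ok: false, error: "key not active" };

  const amount = Math.floor(credits); // fractional credits are floored
  if (!(amount > 0)) return { ok: false, error: "credits must be positive" }; // assumption
  if (src.credits < amount) return { ok: false, error: "insufficient credits" };

  // Both updates happen together, so no credits are created or lost.
  src.credits -= amount;
  dst.credits += amount;
  return { ok: true, transferred: amount, fromBalance: src.credits, toBalance: dst.credits };
}

const keys = new Map<string, KeyRecord>([
  ["pg_source_key", { credits: 1000, active: true }],
  ["pg_dest_key", { credits: 200, active: true }],
]);

console.log(transferCredits(keys, "pg_source_key", "pg_dest_key", 500.9));
// { ok: true, transferred: 500, fromBalance: 500, toBalance: 700 }
```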

Bulk Key Operations

Execute multiple key operations (create, topup, revoke) in a single request. Failed operations don't stop subsequent ones — each result includes success status and index for easy correlation.

curl -X POST http://localhost:3402/keys/bulk \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "operations": [
      { "action": "create", "name": "api-key-1", "credits": 500, "tags": { "env": "prod" } },
      { "action": "create", "name": "api-key-2", "credits": 200 },
      { "action": "topup", "key": "pg_existing_key", "credits": 1000 },
      { "action": "revoke", "key": "pg_old_key" }
    ]
  }'

Response:

{
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    { "index": 0, "action": "create", "success": true, "result": { "key": "pg_abc...", "name": "api-key-1", "credits": 500 } },
    { "index": 1, "action": "create", "success": true, "result": { "key": "pg_def...", "name": "api-key-2", "credits": 200 } },
    { "index": 2, "action": "topup", "success": true, "result": { "creditsAdded": 1000, "newBalance": 1500 } },
    { "index": 3, "action": "revoke", "success": true, "result": { "message": "Key revoked" } }
  ]
}

Actions: create (with optional name, credits, tags, namespace, allowedTools, deniedTools), topup (key + credits), revoke (key). Unknown actions return an error result without stopping the batch.

Limits: Maximum 100 operations per request. Empty operations array returns 400.

Audit trail: Each successful operation logs an individual audit event with "(bulk)" suffix.

Key Import/Export

Export all API keys for backup or migration between PayGate instances:

# Export as JSON (includes full key secrets)
curl http://localhost:3402/keys/export \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -o paygate-keys-backup.json

# Export as CSV
curl "http://localhost:3402/keys/export?format=csv" \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -o paygate-keys-backup.csv

# Export only active keys in a specific namespace
curl "http://localhost:3402/keys/export?activeOnly=true&namespace=production" \
  -H "X-Admin-Key: $ADMIN_KEY"

Import keys into a PayGate instance:

curl -X POST http://localhost:3402/keys/import \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "keys": [{ "key": "pg_abc123...", "name": "my-key", "credits": 500, "active": true, "tags": {} }],
    "mode": "skip"
  }'

Response:

{
  "total": 1,
  "imported": 1,
  "overwritten": 0,
  "skipped": 0,
  "errors": 0,
  "mode": "skip",
  "results": [{ "key": "pg_abc123...", "name": "my-key", "status": "imported" }]
}

Conflict modes: skip (default) — skip keys that already exist, overwrite — replace existing keys, error — fail on duplicate keys.

Limits: Maximum 1000 keys per import request. Keys must start with pg_ prefix.

Export formats: JSON (full records with all fields) or CSV (key subset for spreadsheet use).
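
The three conflict modes can be modeled roughly as below. This sketch uses hypothetical names (ImportedKey, importKeys) and makes one assumption worth flagging: in error mode it counts each duplicate as an error and continues, which may not match how PayGate handles the rest of the batch. The 1000-key limit is omitted for brevity.

```typescript
type ImportMode = "skip" | "overwrite" | "error";

interface ImportedKey {
  key: string;
  name: string;
  credits: number;
}

interface ImportSummary {
  total: number;
  imported: number;
  overwritten: number;
  skipped: number;
  errors: number;
}

function importKeys(
  store: Map<string, ImportedKey>,
  incoming: ImportedKey[],
  mode: ImportMode = "skip",
): ImportSummary {
  const summary: ImportSummary = {
    total: incoming.length,
    imported: 0,
    overwritten: 0,
    skipped: 0,
    errors: 0,
  };
  for (const record of incoming) {
    if (!record.key.startsWith("pg_")) {
      summary.errors++; // keys must carry the pg_ prefix
    } else if (!store.has(record.key)) {
      store.set(record.key, record);
      summary.imported++;
    } else if (mode === "overwrite") {
      store.set(record.key, record);
      summary.overwritten++;
    } else if (mode === "skip") {
      summary.skipped++;
    } else {
      summary.errors++; // assumption: duplicates in "error" mode are counted per key
    }
  }
  return summary;
}

const store = new Map<string, ImportedKey>([
  ["pg_a", { key: "pg_a", name: "existing", credits: 10 }],
]);
const batch: ImportedKey[] = [
  { key: "pg_a", name: "dup", credits: 99 },
  { key: "pg_b", name: "new", credits: 500 },
  { key: "bad_key", name: "invalid", credits: 1 },
];

console.log(importKeys(store, batch, "skip"));
// { total: 3, imported: 1, overwritten: 0, skipped: 1, errors: 1 }
```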

Spending Limits

Cap the total credits any API key can spend:

# Set a spending limit on a key (admin only)
curl -X POST http://localhost:3402/limits \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "spendingLimit": 500}'

# Check remaining budget
curl http://localhost:3402/balance -H "X-API-Key: CLIENT_API_KEY"
# → { "spendingLimit": 500, "remainingBudget": 350, ... }

Set spendingLimit to 0 for unlimited. When a key hits its limit, tool calls are denied with a clear error.
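
The relationship between spendingLimit, total spend, and remainingBudget can be sketched as follows. The field and function names are illustrative (only spendingLimit and remainingBudget appear in the /balance response above), and returning null for "unlimited" is a choice made for this example.

```typescript
interface Budget {
  spendingLimit: number; // 0 means unlimited
  totalSpent: number;
}

// Remaining budget, or null when the key has no spending limit.
function remainingBudget(b: Budget): number | null {
  if (b.spendingLimit === 0) return null;
  return Math.max(0, b.spendingLimit - b.totalSpent);
}

// A call is within budget if it is unlimited or costs no more than what is left.
function withinLimit(b: Budget, cost: number): boolean {
  const remaining = remainingBudget(b);
  return remaining === null || cost <= remaining;
}

const budget: Budget = { spendingLimit: 500, totalSpent: 150 };
console.log(remainingBudget(budget)); // 350
console.log(withinLimit(budget, 400)); // false
console.log(withinLimit({ spendingLimit: 0, totalSpent: 9999 }, 400)); // true
```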

Refund on Failure

Automatically return credits when a downstream tool call fails:

npx paygate-mcp wrap --server "node server.js" --refund-on-failure

Credits are deducted before the tool call. If the wrapped server returns an error, credits are refunded and totalSpent / totalCalls are rolled back. Prevents charging users for failed operations.
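
The deduct-then-refund flow can be sketched like this. It is a simplified synchronous model with hypothetical names (Account, chargedCall), where a downstream failure is represented by the tool function throwing.

```typescript
interface Account {
  credits: number;
  totalSpent: number;
  totalCalls: number;
}

// Charge up front, run the tool, and roll the charge back if the tool fails.
function chargedCall<T>(
  account: Account,
  price: number,
  refundOnFailure: boolean,
  tool: () => T,
): T {
  if (account.credits < price) throw new Error("insufficient credits");
  account.credits -= price;
  account.totalSpent += price;
  account.totalCalls += 1;
  try {
    return tool();
  } catch (err) {
    if (refundOnFailure) {
      account.credits += price;
      account.totalSpent -= price;
      account.totalCalls -= 1;
    }
    throw err; // the caller still sees the downstream error
  }
}

const account: Account = { credits: 10, totalSpent: 0, totalCalls: 0 };
chargedCall(account, 2, true, () => "ok");
try {
  chargedCall(account, 2, true, () => {
    throw new Error("downstream failure");
  });
} catch {
  // the failed call was refunded
}
console.log(account); // { credits: 8, totalSpent: 2, totalCalls: 1 }
```

Without --refund-on-failure (refundOnFailure set to false in the sketch), the failed call would leave the account at 6 credits.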

Webhook Events

POST usage events to any external URL for billing, alerting, or analytics:

npx paygate-mcp wrap --server "node server.js" --webhook-url "https://billing.example.com/events"

Events are batched (up to 10 per POST) and flushed every 5 seconds. Each event includes tool name, credits charged, API key, and timestamp.

Retry Queue & Dead Letters

Failed webhook deliveries are retried with exponential backoff (1s, 2s, 4s, 8s, 16s). Set the maximum number of attempts with --webhook-retries (default: 5). After all retries are exhausted, events move to a dead letter queue for admin inspection.

# Custom max retries (default: 5)
npx paygate-mcp wrap --server "node server.js" \
  --webhook-url "https://billing.example.com/events" \
  --webhook-retries 10
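
The backoff schedule is a doubling sequence starting at one second. A minimal sketch, assuming the delay keeps doubling for every configured retry (whether PayGate caps the delay at higher retry counts is not stated here); backoffDelaysMs is a hypothetical helper:

```typescript
// Delay before each retry attempt, in milliseconds: base * 2^attempt.
function backoffDelaysMs(maxRetries = 5, baseMs = 1000): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseMs * 2 ** attempt);
}

console.log(backoffDelaysMs()); // [ 1000, 2000, 4000, 8000, 16000 ]
```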

Admin endpoints:

Endpoint	Method	Description
`/webhooks/stats`	GET	Delivery statistics (delivered, failed, pending retries, dead letters)
`/webhooks/dead-letter`	GET	List permanently failed deliveries with error details
`/webhooks/dead-letter`	DELETE	Clear dead letter queue
`/webhooks/replay`	POST	Replay dead letter events (all or by index)

Retry attempts include an X-PayGate-Retry header with the attempt number for observability.

Webhook Event Replay

Replay permanently failed webhook events from the dead letter queue:

# Replay all dead letter entries
curl -X POST http://localhost:3402/webhooks/replay \
  -H "X-Admin-Key: $ADMIN_KEY"

# Replay specific entries by index
curl -X POST http://localhost:3402/webhooks/replay \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "indices": [0, 2, 5] }'

Replayed entries are removed from the dead letter queue and re-queued for fresh delivery (attempt counter resets to 0). If delivery fails again, they follow the normal retry/dead-letter flow.

Webhook Signatures (HMAC-SHA256)

Sign webhook payloads for tamper-proof delivery:

npx paygate-mcp wrap --server "node server.js" \
  --webhook-url "https://billing.example.com/events" \
  --webhook-secret "whsec_your_secret_here"

When --webhook-secret is set, every webhook POST includes an X-PayGate-Signature header:

X-PayGate-Signature: t=1709123456,v1=a1b2c3d4...

Verifying signatures (Node.js example):

import { WebhookEmitter } from 'paygate-mcp';

// `req` is the incoming HTTP request and `rawBody` is its unparsed body string.
// Verify against the raw body, not re-serialized JSON.
const signature = req.headers['x-paygate-signature'];

// Header format: t=<timestamp>,v1=<signature>
const [tPart, v1Part] = signature.split(',');
const timestamp = tPart.split('=')[1];
const sig = v1Part.split('=')[1];

// Reconstruct signed payload: timestamp.body
const signedPayload = `${timestamp}.${rawBody}`;
const isValid = WebhookEmitter.verify(signedPayload, sig, 'whsec_your_secret_here');

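The signature covers timestamp.body, so neither the body nor the timestamp can be changed without invalidating it. To guard against replay, also reject requests whose timestamp is too old. Compare signatures in constant time (WebhookEmitter.verify does this for you).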
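
If you cannot import the package on the receiving side, the check can be approximated with Node's built-in crypto module. This is a sketch under assumptions the text above does not confirm: that `v1` is a hex-encoded HMAC-SHA256 of `timestamp.body`, and that `t` is a Unix time in seconds. `verifyPaygateSignature` and the 300-second tolerance are illustrative choices.

```javascript
const crypto = require('node:crypto');

// Assumptions: v1 is hex-encoded HMAC-SHA256 over `${t}.${rawBody}`,
// and t is a Unix timestamp in seconds.
function verifyPaygateSignature(header, rawBody, secret, toleranceSec = 300,
                                nowSec = Math.floor(Date.now() / 1000)) {
  // Parse "t=...,v1=..." into { t, v1 }.
  const parts = Object.fromEntries(
    header.split(',').map((kv) => {
      const i = kv.indexOf('=');
      return [kv.slice(0, i).trim(), kv.slice(i + 1).trim()];
    })
  );
  const timestamp = Number(parts.t);
  if (!Number.isFinite(timestamp) || !parts.v1) return false;

  // Reject stale deliveries so a captured request cannot be replayed later.
  if (Math.abs(nowSec - timestamp) > toleranceSec) return false;

  const expected = crypto.createHmac('sha256', secret)
    .update(`${parts.t}.${rawBody}`)
    .digest();
  const received = Buffer.from(parts.v1, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

Pass the raw request body exactly as received; parsing and re-serializing JSON can change whitespace or key order and break the signature.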

Admin Lifecycle Events

When webhooks are enabled, admin operations also fire webhook events:

Event Type	Trigger	Metadata
`key.created`	POST /keys	keyMasked, name, credits
`key.topup`	POST /topup	keyMasked, creditsAdded, newBalance
`key.revoked`	POST /keys/revoke	keyMasked
`key.rotated`	POST /keys/rotate	oldKeyMasked, newKeyMasked
`key.expired`	Gate evaluation	keyMasked
`alert.fired`	Gate evaluation	alertType, keyPrefix, message, value, threshold
`team.created`	POST /teams	teamId, name, budget
`team.updated`	POST /teams/update	teamId, changes
`team.deleted`	POST /teams/delete	teamId
`team.key_assigned`	POST /teams/assign	teamId, keyMasked
`team.key_removed`	POST /teams/remove	teamId, keyMasked

Admin events appear in the adminEvents array of the webhook payload (separate from usage events). Both arrays can be present in the same batch.

Webhook Filters (Event Routing)

Route webhook events to different destinations based on event type and API key prefix. Each filter rule routes matching events to its own URL with independent retry queues, dead letter queues, and optional signing secrets.

Create a filter rule:

curl -X POST http://localhost:3402/webhooks/filters \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-alerts",
    "events": ["key.created", "key.revoked", "alert.fired"],
    "url": "https://alerts.example.com/webhook",
    "secret": "whsec_alerts_secret",
    "keyPrefixes": ["pk_prod_"],
    "active": true
  }'

List filters:

curl http://localhost:3402/webhooks/filters -H "X-Admin-Key: $ADMIN_KEY"

Update a filter:

curl -X POST http://localhost:3402/webhooks/filters/update \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "id": "wf_abc123", "active": false }'

Delete a filter:

curl -X POST http://localhost:3402/webhooks/filters/delete \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "id": "wf_abc123" }'

Filter rules:

events — Array of event types to match (exact match or "*" wildcard for all events)
keyPrefixes — Optional array of API key prefixes (e.g., ["pk_prod_"]). Events only match if the associated key starts with one of these prefixes. Omit for all keys.
url — Destination URL for matched events (each unique URL gets its own retry queue)
secret — Optional HMAC-SHA256 signing secret for this destination
active — Enable/disable the filter without deleting it

Routing behavior:

Events matching filter rules are sent to the filter's destination URL
The default webhook URL (if configured) always receives all events (backward compatible)
Multiple filters can match the same event — it's sent to all matching destinations
Inactive filters are skipped during routing
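
The routing rules above can be sketched as a pure function. The event shape (`{ type, apiKey }`) and the name `destinationsFor` are assumptions made for illustration; treating a missing `active` field as enabled is also an assumption, based on the config file example omitting it.

```javascript
// Sketch of the routing rules: returns every URL that should receive the event.
function destinationsFor(event, filters, defaultUrl) {
  const urls = new Set();
  if (defaultUrl) urls.add(defaultUrl); // default endpoint always receives everything

  for (const f of filters) {
    if (f.active === false) continue; // inactive filters are skipped
    const typeMatches = f.events.includes('*') || f.events.includes(event.type);
    const keyMatches =
      !f.keyPrefixes || f.keyPrefixes.length === 0 ||
      f.keyPrefixes.some((prefix) => (event.apiKey || '').startsWith(prefix));
    if (typeMatches && keyMatches) urls.add(f.url); // multiple filters may match
  }
  return [...urls];
}
```

A `Set` is used so that two filters pointing at the same URL produce a single delivery target, mirroring the note that each unique URL gets its own retry queue.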

Config file:

{
  "webhookUrl": "https://billing.example.com/events",
  "webhookFilters": [
    {
      "name": "production-alerts",
      "events": ["key.created", "key.revoked", "alert.fired"],
      "url": "https://alerts.example.com/webhook",
      "keyPrefixes": ["pk_prod_"]
    }
  ]
}

Stats: GET /webhooks/stats includes per-URL delivery statistics for all filter destinations plus the default endpoint.

Usage Quotas

Set daily or monthly usage limits per API key:

# Create a key with 10 calls/day, 200 calls/month
curl -X POST http://localhost:3402/keys \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "metered-user", "credits": 1000, "quota": {"dailyCallLimit": 10, "monthlyCallLimit": 200}}'

# Set credit-based quotas (max 50 credits/day)
curl -X POST http://localhost:3402/keys/quota \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "dailyCreditLimit": 50}'

# Remove per-key quota (fall back to global defaults)
curl -X POST http://localhost:3402/keys/quota \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "CLIENT_API_KEY", "remove": true}'

Quota types: dailyCallLimit, monthlyCallLimit, dailyCreditLimit, monthlyCreditLimit. Set to 0 for unlimited. Counters reset at UTC midnight (daily) and UTC month boundary (monthly). Set global defaults in the config file with globalQuota.

Dynamic Pricing

Charge extra credits based on input argument size:

{
  "toolPricing": {
    "analyze_text": { "creditsPerCall": 2, "creditsPerKbInput": 5 },
    "search": { "creditsPerCall": 1 }
  }
}

For analyze_text, a 3 KB input would cost 2 + ceil(3 × 5) = 17 credits. Small inputs round up to at least 1 KB. Tools without creditsPerKbInput use the flat base price.
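
The arithmetic above can be written out as a small function. This is a sketch, not PayGate's implementation: it assumes 1 KB means 1024 bytes and that the caller already knows the input size in bytes. `creditsForInput` is a hypothetical name.

```javascript
// Sketch: base price plus a size-based surcharge, per the formula above.
// Assumptions: 1 KB = 1024 bytes; inputs under 1 KB are billed as 1 KB.
function creditsForInput(pricing, inputBytes) {
  const base = pricing.creditsPerCall;
  if (!pricing.creditsPerKbInput) return base; // flat-priced tool
  const kb = Math.max(1, inputBytes / 1024);
  return base + Math.ceil(kb * pricing.creditsPerKbInput);
}

const analyzeText = { creditsPerCall: 2, creditsPerKbInput: 5 };
console.log(creditsForInput(analyzeText, 3 * 1024)); // 17
console.log(creditsForInput(analyzeText, 100));      // 7 (rounded up to 1 KB)
```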

OAuth 2.1

Full OAuth 2.1 authorization server for MCP clients. Implements PKCE, dynamic client registration, token refresh, and revocation.

Enable OAuth in config:

{
  "oauth": {
    "accessTokenTtl": 3600,
    "refreshTokenTtl": 2592000,
    "scopes": ["tools:*", "tools:read", "tools:write"]
  }
}

Full flow:

# 1. Register an OAuth client
curl -X POST http://localhost:3402/oauth/register \
  -H "Content-Type: application/json" \
  -d '{"client_name": "My Agent", "redirect_uris": ["http://localhost:8080/callback"], "api_key": "pg_..."}'

# 2. Generate PKCE challenge (code_verifier → SHA256 → base64url)
# 3. Authorize: GET /oauth/authorize?response_type=code&client_id=...&redirect_uri=...&code_challenge=...&code_challenge_method=S256
# 4. Exchange code for tokens
curl -X POST http://localhost:3402/oauth/token \
  -H "Content-Type: application/json" \
  -d '{"grant_type": "authorization_code", "code": "...", "client_id": "...", "redirect_uri": "...", "code_verifier": "..."}'

# 5. Use Bearer token on /mcp
curl -X POST http://localhost:3402/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer pg_at_..." \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "search", "arguments": {"query": "hello"}}}'

# 6. Refresh token
curl -X POST http://localhost:3402/oauth/token \
  -H "Content-Type: application/json" \
  -d '{"grant_type": "refresh_token", "refresh_token": "pg_rt_...", "client_id": "..."}'

OAuth tokens are backed by API keys — each token maps to an API key for billing. The /mcp endpoint accepts both X-API-Key and Authorization: Bearer headers.
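
Step 2 of the flow (the PKCE challenge) can be produced with Node's built-in crypto module. The sketch below follows the S256 method from RFC 7636; the function names are illustrative and not part of PayGate.

```javascript
const crypto = require('node:crypto');

// S256: code_challenge = base64url(SHA-256(code_verifier)), without padding.
function s256Challenge(codeVerifier) {
  return crypto.createHash('sha256').update(codeVerifier).digest('base64url');
}

function createPkcePair() {
  // 32 random bytes encode to a 43-character base64url verifier
  // (RFC 7636 allows 43 to 128 characters).
  const codeVerifier = crypto.randomBytes(32).toString('base64url');
  return {
    codeVerifier,
    codeChallenge: s256Challenge(codeVerifier),
    codeChallengeMethod: 'S256',
  };
}
```

Send `codeChallenge` and `code_challenge_method=S256` in step 3, keep `codeVerifier` private, and send it as `code_verifier` in step 4.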

SSE Streaming (MCP Streamable HTTP)

PayGate implements the full MCP Streamable HTTP transport with SSE support:

# POST /mcp with SSE response (add Accept header)
curl -N -X POST http://localhost:3402/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"analyze","arguments":{}}}'
# Response: SSE stream with event: message + data: {jsonrpc response}

# GET /mcp — Open SSE notification stream
curl -N http://localhost:3402/mcp \
  -H "Accept: text/event-stream" \
  -H "Mcp-Session-Id: mcp_sess_..."
# Receives server-initiated notifications as SSE events

# DELETE /mcp — Terminate session
curl -X DELETE http://localhost:3402/mcp \
  -H "Mcp-Session-Id: mcp_sess_..."

Session Management:

Every POST /mcp response includes an Mcp-Session-Id header
Clients reuse sessions by sending Mcp-Session-Id on subsequent requests
GET /mcp opens a long-lived SSE connection for server-to-client notifications
DELETE /mcp terminates a session and closes all SSE connections
Sessions auto-expire after 30 minutes of inactivity

Transport modes:

POST /mcp without Accept: text/event-stream → standard JSON response (backward compatible)
POST /mcp with Accept: text/event-stream → SSE-wrapped JSON-RPC response
GET /mcp with Accept: text/event-stream → long-lived notification stream
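
When a POST /mcp response arrives in SSE form, the client has to unwrap the JSON-RPC message from the event frames. The parser below is a minimal sketch for a fully buffered response body; it handles only the `event:` and `data:` fields and assumes each `data` payload is JSON, as in the example above. `parseSseMessages` is a hypothetical helper.

```javascript
// Minimal SSE frame parser for a buffered response body.
// Frames are separated by a blank line; only event: and data: are handled.
function parseSseMessages(text) {
  const messages = [];
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let event = 'message'; // SSE default event type
    const dataLines = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) dataLines.push(line.slice(5).replace(/^ /, ''));
    }
    if (dataLines.length > 0) {
      messages.push({ event, data: JSON.parse(dataLines.join('\n')) });
    }
  }
  return messages;
}
```

For the long-lived GET /mcp stream, a streaming parser (or the browser `EventSource` API) is a better fit than buffering the whole body.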

Audit Log

Every significant operation is recorded in a structured audit trail:

# Query audit events (with filtering)
curl "http://localhost:3402/audit?types=key.created,gate.deny&limit=50" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Export full audit log as CSV
curl "http://localhost:3402/audit/export?format=csv" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" > audit.csv

# Get audit statistics
curl http://localhost:3402/audit/stats \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Tracked events: key.created, key.revoked, key.topup, key.acl_updated, key.expiry_updated, key.quota_updated, key.limit_updated, key.tags_updated, key.ip_updated, gate.allow, gate.deny, session.created, session.destroyed, oauth.client_registered, oauth.token_issued, oauth.token_revoked, admin.auth_failed, admin.alerts_configured, billing.refund, team.created, team.updated, team.deleted, team.key_assigned, team.key_removed.

Retention: Ring buffer (default 10,000 events), age-based cleanup (default 30 days), automatic periodic enforcement.

Registry/Discovery (Agent-Discoverable Pricing)

AI agents can programmatically discover your server's pricing and payment requirements before calling tools. Aligns with SEP-2007 (MCP Payment Spec Draft).

# Discover server payment metadata (public, no auth)
curl http://localhost:3402/.well-known/mcp-payment
# → { "specVersion": "2007-draft", "billingModel": "credits", "defaultCreditsPerCall": 1, ... }

# Get full pricing breakdown (public, no auth)
curl http://localhost:3402/pricing
# → { "server": {...}, "tools": [{ "name": "search", "creditsPerCall": 5, "pricingModel": "dynamic" }, ...] }

How it works:

/.well-known/mcp-payment — Server-level payment metadata (billing model, auth methods, error codes)
/pricing — Full per-tool pricing breakdown with overrides
tools/list responses include _pricing metadata on each tool (creditsPerCall, pricingModel, rateLimitPerMin)
-32402 error responses include pricing details so agents can see what the tool costs

Both discovery endpoints are public (no auth required) so agents can check pricing before obtaining an API key.

Prometheus Metrics

Monitor your PayGate server with any Prometheus-compatible monitoring system:

curl http://localhost:3402/metrics

Returns metrics in standard Prometheus text exposition format:

# HELP paygate_tool_calls_total Total tool calls processed
# TYPE paygate_tool_calls_total counter
paygate_tool_calls_total{status="allowed",tool="search"} 42
paygate_tool_calls_total{status="denied",tool="premium"} 3

# HELP paygate_credits_charged_total Total credits charged
# TYPE paygate_credits_charged_total counter
paygate_credits_charged_total{tool="search"} 210

# HELP paygate_active_keys_total Number of active (non-revoked) API keys
# TYPE paygate_active_keys_total gauge
paygate_active_keys_total 5

# HELP paygate_uptime_seconds Server uptime in seconds
# TYPE paygate_uptime_seconds gauge
paygate_uptime_seconds 3600

Available metrics:

paygate_tool_calls_total{tool,status} — Tool calls (allowed/denied)
paygate_credits_charged_total{tool} — Credits charged per tool
paygate_denials_total{reason} — Denials by reason (insufficient_credits, rate_limited, etc.)
paygate_rate_limit_hits_total{tool} — Rate limit hits per tool
paygate_refunds_total{tool} — Credit refunds per tool
paygate_http_requests_total{method,path,status} — HTTP requests
paygate_active_keys_total — Active API keys (gauge)
paygate_active_sessions_total — Active MCP sessions (gauge)
paygate_total_credits_available — Total credits across all keys (gauge)
paygate_uptime_seconds — Server uptime (gauge)

The /metrics endpoint is public (no auth required) for easy Prometheus scraping.

Key Cloning

Create a new API key with the same configuration as an existing key but fresh counters. Ideal for provisioning similar keys for team members, staging environments, or batch key creation:

# Clone with same config and credits
curl -X POST http://localhost:3402/keys/clone \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_source..."}'
# → { "message": "Key cloned", "key": "pg_newkey...", "name": "source-clone", "credits": 200, ... }

# Clone with overrides
curl -X POST http://localhost:3402/keys/clone \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_source...", "name": "staging-key", "credits": 50, "namespace": "staging"}'

What gets cloned: allowedTools, deniedTools, expiresAt, quota, tags, ipAllowlist, namespace, group, spendingLimit, autoTopup config. What gets reset: totalSpent, totalCalls, lastUsedAt, quotaDailyCalls, suspended state. You can override name, credits, tags, and namespace in the clone request. Suspended and expired keys can be cloned (but not revoked keys).

Key Rotation

Rotate an API key without losing credits, ACLs, quotas, or spending limits:

curl -X POST http://localhost:3402/keys/rotate \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_oldkey..."}'
# → { "message": "Key rotated", "newKey": "pg_newkey...", "name": "my-key", "credits": 500 }

The old key is immediately invalidated. All state (credits, totalSpent, totalCalls, ACL, quota, expiry, spending limit) transfers to the new key. Use this for periodic key rotation policies, compromised key response, or key migration.

Key Suspension & Resumption

Temporarily disable an API key without permanently revoking it. Suspended keys are denied at the gate (key_suspended reason), but admin operations (topup, ACL, quota, tags, etc.) still work — making this ideal for investigating abuse, pausing billing, or temporary lockouts:

# Suspend a key (with optional reason for audit trail)
curl -X POST http://localhost:3402/keys/suspend \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_abc123...", "reason": "investigating abuse"}'
# → { "message": "Key suspended", "suspended": true }

# Resume a suspended key
curl -X POST http://localhost:3402/keys/resume \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_abc123..."}'
# → { "message": "Key resumed", "suspended": false }

Suspension vs Revocation:

Suspend — Reversible. Key remains active but is denied at the gate. Admin operations still work. Use for temporary lockouts.
Revoke — Permanent. Key is deactivated and cannot be restored. Use for compromised or decommissioned keys.

Suspending and resuming a key fire key.suspended and key.resumed audit events and webhook notifications, respectively. In shadow mode, suspended keys are allowed through (with a shadow:key_suspended reason) for testing.

Per-Key Usage

Get detailed usage breakdown for a specific API key — per-tool stats, hourly time-series, deny reasons, and recent events:

# Get full usage for a key
curl "http://localhost:3402/keys/usage?key=pg_abc123..." \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by time (ISO 8601)
curl "http://localhost:3402/keys/usage?key=pg_abc123...&since=2025-01-01T00:00:00Z" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response includes:

Field	Description
`key`	Masked API key (first 10 chars + `...`)
`name`	Key name
`credits`	Current credit balance
`active` / `suspended`	Key status
`totalCalls`	Total tool calls made
`totalAllowed` / `totalDenied`	Allowed vs denied breakdown
`totalCreditsSpent`	Total credits consumed
`perTool`	Per-tool breakdown: `{ calls, credits, denied }`
`denyReasons`	Aggregated deny reasons with counts
`timeSeries`	Hourly buckets: `{ hour, calls, credits, denied }`
`recentEvents`	Last 50 events (newest first) with tool, credits, and deny reason

Works for active, suspended, and expired keys. Useful for debugging, billing audits, and per-customer analytics.

Webhook Test

Send a test event to your configured webhook URL to verify connectivity without generating real events:

# Send test event
curl -X POST http://localhost:3402/webhooks/test \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# With custom message
curl -X POST http://localhost:3402/webhooks/test \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"message": "Testing from staging deploy"}'

Response:

Field	Description
`url`	Webhook URL (credentials masked)
`success`	`true` if webhook returned 2xx
`statusCode`	HTTP status code from webhook endpoint
`responseTime`	Round-trip delivery time in milliseconds
`error`	Error message (only on failure)

The test event includes an X-PayGate-Test: 1 header, plus X-PayGate-Signature when a webhook secret is configured. Returns 400 if no webhook URL is configured. Creates an audit trail entry (webhook.test).

Webhook Delivery Log

Query the log of all webhook delivery attempts — successes, failures, and retries:

# Get recent deliveries (default: last 50, newest first)
curl http://localhost:3402/webhooks/log \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by success/failure
curl "http://localhost:3402/webhooks/log?success=false" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by time and limit
curl "http://localhost:3402/webhooks/log?since=2025-01-01T00:00:00Z&limit=10" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Each entry includes:

Field	Description
`id`	Auto-incrementing delivery ID
`timestamp`	When the delivery attempt was made
`url`	Webhook URL (credentials masked)
`statusCode`	HTTP status code (0 for connection errors)
`success`	`true` if webhook returned 2xx
`responseTime`	Round-trip time in milliseconds
`attempt`	Retry attempt number (0 = first attempt)
`error`	Error message (only on failure)
`eventCount`	Number of events in the batch
`eventTypes`	Distinct event types (e.g. `["usage"]`, `["key.created"]`)

Query parameters: limit (default 50, max 200), since (ISO 8601), success (true or false). Entries are capped at 500 in memory. Use alongside /webhooks/stats for aggregate counters.

Webhook Pause/Resume

Temporarily halt webhook delivery during maintenance windows. Events are buffered (not lost) and flushed on resume:

# Pause delivery
curl -X POST http://localhost:3402/webhooks/pause \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Check pause status (visible in /webhooks/stats)
curl http://localhost:3402/webhooks/stats \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "paused": true, "pausedAt": "2025-...", "bufferedEvents": 12, ... }

# Resume delivery (flushes buffered events)
curl -X POST http://localhost:3402/webhooks/resume \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "paused": false, "flushedEvents": 12 }

While paused, events continue to accumulate in the buffer. On resume, all buffered events are flushed immediately. The pause state and buffered event count are visible in /webhooks/stats. Creates audit trail entries (webhook.pause, webhook.resume).

Key Aliases

Assign human-readable aliases to API keys so you can reference them by name instead of opaque key IDs in admin endpoints:

# Set an alias
curl -X POST http://localhost:3402/keys/alias \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_abc123...", "alias": "prod-backend"}'
# → { "key": "pg_abc12...", "alias": "prod-backend", "message": "Alias set to \"prod-backend\"" }

# Use the alias in any admin endpoint
curl -X POST http://localhost:3402/topup \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "prod-backend", "credits": 500}'

curl -X POST http://localhost:3402/keys/suspend \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "prod-backend", "reason": "maintenance"}'

curl -X POST http://localhost:3402/keys/transfer \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"from": "prod-backend", "to": "staging-api", "credits": 100}'

# Clear an alias
curl -X POST http://localhost:3402/keys/alias \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "prod-backend", "alias": null}'

Field	Description
`alias`	1-100 chars, alphanumeric + hyphens + underscores only
Uniqueness	Aliases must be unique across all keys and cannot collide with existing key IDs
Scope	Aliases work in all admin endpoints (topup, revoke, suspend, resume, clone, transfer, usage) — they do not work for API key authentication on `/mcp`
Persistence	Aliases are saved to the state file and survive server restarts
Clone	Cloned keys do not inherit the source key's alias
Audit	`key.alias_set` event logged for every set/clear operation

Key Expiry Scanner

Proactive background scanner that detects API keys approaching expiration and sends webhook notifications before they expire — even if the keys are not actively being used:

# Query keys expiring within 24 hours (default)
curl http://localhost:3402/keys/expiring \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "within": 86400, "count": 2, "scanner": { ... }, "keys": [ ... ] }

# Query keys expiring within 7 days
curl "http://localhost:3402/keys/expiring?within=604800" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Configure the scanner in your config file:

{
  "expiryScanner": {
    "enabled": true,
    "intervalSeconds": 3600,
    "thresholds": [604800, 86400, 3600]
  }
}

Field	Description
`enabled`	Enable/disable the background scanner. Default: `true`
`intervalSeconds`	How often to scan (seconds). Default: `3600` (1 hour). Min: 60
`thresholds`	Seconds before expiry to notify. Default: `[604800, 86400, 3600]` (7d, 24h, 1h)
Webhook	Fires `key.expiry_warning` events with key name, alias, namespace, expiry time, and remaining seconds
De-duplication	Each key+threshold pair is only notified once (no duplicate alerts)
Progressive	Largest threshold fires first, then progressively smaller thresholds on subsequent scans
Audit	`key.expiry_warning` event logged for every notification
Endpoint	`GET /keys/expiring?within=N` lists keys expiring within N seconds (default: 86400)

Key Templates

Named templates for API key creation. Define reusable presets and create keys with template: "free-tier":

# Create a template
curl -X POST http://localhost:3402/keys/templates \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{
    "name": "free-tier",
    "description": "Free plan with basic access",
    "credits": 50,
    "allowedTools": ["search", "read"],
    "deniedTools": ["admin"],
    "tags": {"plan": "free"},
    "namespace": "public",
    "expiryTtlSeconds": 2592000,
    "spendingLimit": 200
  }'

# List all templates
curl http://localhost:3402/keys/templates \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Create a key from template (inherits all defaults)
curl -X POST http://localhost:3402/keys \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "new-user", "template": "free-tier"}'

# Create a key from template with overrides
curl -X POST http://localhost:3402/keys \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "vip-user", "template": "free-tier", "credits": 500, "tags": {"plan": "vip"}}'

# Delete a template
curl -X POST http://localhost:3402/keys/templates/delete \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "free-tier"}'

Feature	Details
Fields	credits, allowedTools, deniedTools, quota, ipAllowlist, spendingLimit, tags, namespace, expiryTtlSeconds, autoTopup
Override	Explicit params in `POST /keys` always override template defaults
TTL	`expiryTtlSeconds` sets expiry relative to key creation time (0 = never)
Limit	Max 100 templates per server
Persistence	`-templates.json` alongside state file, survives restarts
Audit	`template.created`, `template.updated`, `template.deleted` events
Prometheus	`paygate_templates_total` gauge tracks template count
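
The override and TTL rules in the table can be sketched as a merge step. This is an illustration under assumptions: the override is a shallow merge (so an explicit `tags` object replaces the template's), and `expiresAt` is stored as an ISO 8601 string. `resolveKeyParams` is a hypothetical name.

```javascript
// Sketch: explicit POST /keys params win over template defaults, and
// expiryTtlSeconds becomes an absolute expiry relative to creation time.
function resolveKeyParams(template, params, now = Date.now()) {
  // Template metadata (name, description) is not copied onto the key.
  const { name: _templateName, description: _description, expiryTtlSeconds, ...defaults } = template;
  const merged = { ...defaults, ...params };
  delete merged.template; // the template reference itself is not a key field
  if (!('expiresAt' in params) && expiryTtlSeconds > 0) {
    merged.expiresAt = new Date(now + expiryTtlSeconds * 1000).toISOString();
  }
  return merged;
}
```

With the free-tier template above, a request of `{"name": "vip-user", "template": "free-tier", "credits": 500}` would keep the template's allowedTools and namespace while using 500 credits.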

Environment Variables Config

Configure everything via PAYGATE_* environment variables — ideal for Docker, Kubernetes, and CI/CD deployments:

# Docker example
docker run -e PAYGATE_SERVER="node /app/server.js" \
  -e PAYGATE_PORT=8080 \
  -e PAYGATE_PRICE=5 \
  -e PAYGATE_ADMIN_KEY=sk-admin-secret \
  -e PAYGATE_REDIS_URL=redis://redis:6379 \
  -e PAYGATE_WEBHOOK_URL=https://hooks.example.com/billing \
  -p 8080:8080 node:20 npx paygate-mcp wrap

# Or use a config file via env var
docker run -e PAYGATE_CONFIG=/etc/paygate/config.json \
  -v "$(pwd)/config.json:/etc/paygate/config.json" \
  -p 3402:3402 node:20 npx paygate-mcp wrap

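The 18 core environment variables (feature-specific variables such as PAYGATE_CORS_ORIGIN, PAYGATE_CUSTOM_HEADERS, and PAYGATE_TRUSTED_PROXIES are covered in their own sections):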

Env Var	CLI Flag	Description
`PAYGATE_SERVER`	`--server`	MCP server command to wrap (stdio)
`PAYGATE_REMOTE_URL`	`--remote-url`	Remote MCP server URL (HTTP)
`PAYGATE_CONFIG`	`--config`	Path to JSON config file
`PAYGATE_PORT`	`--port`	Server port (default: 3402)
`PAYGATE_PRICE`	`--price`	Credits per tool call (default: 1)
`PAYGATE_RATE_LIMIT`	`--rate-limit`	Max calls per minute per key (default: 60)
`PAYGATE_NAME`	`--name`	Server name for display
`PAYGATE_SHADOW`	`--shadow`	Enable shadow mode (true/false)
`PAYGATE_ADMIN_KEY`	`--admin-key`	Admin API key
`PAYGATE_STATE_FILE`	`--state-file`	Persistent state file path
`PAYGATE_WEBHOOK_URL`	`--webhook-url`	Webhook delivery URL
`PAYGATE_WEBHOOK_SECRET`	`--webhook-secret`	HMAC-SHA256 webhook secret
`PAYGATE_WEBHOOK_RETRIES`	`--webhook-retries`	Max webhook retry attempts
`PAYGATE_REFUND_ON_FAILURE`	`--refund-on-failure`	Refund credits on tool failure (true/false)
`PAYGATE_REDIS_URL`	`--redis-url`	Redis URL for horizontal scaling
`PAYGATE_DRY_RUN`	`--dry-run`	Discover tools and exit (true/false)
`PAYGATE_TOOL_PRICE`	`--tool-price`	Per-tool pricing (tool=price,...)
`PAYGATE_STRIPE_SECRET`	`--stripe-secret`	Stripe secret key for payments

Priority: CLI flags > env vars > config file > defaults. This means you can set defaults via env vars in Docker and override specific values on the command line.
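
The precedence order can be pictured as a layered merge in which later sources override earlier ones. The sketch below is illustrative only; `resolveConfig` and its argument names are hypothetical and say nothing about how PayGate parses each source.

```javascript
// Later layers win: defaults < config file < env vars < CLI flags.
// undefined values are dropped so an unset flag does not mask a lower layer.
function resolveConfig({ defaults = {}, file = {}, env = {}, cli = {} }) {
  const defined = (obj) =>
    Object.fromEntries(Object.entries(obj).filter(([, value]) => value !== undefined));
  return { ...defaults, ...defined(file), ...defined(env), ...defined(cli) };
}

console.log(resolveConfig({
  defaults: { port: 3402, price: 1 },
  file: { price: 2 },
  env: { port: 8080, price: 5 },
  cli: { price: 10 },
})); // { port: 8080, price: 10 }
```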

Request ID Tracking

Every HTTP response includes an X-Request-Id header for distributed tracing. If the incoming request has an X-Request-Id header (e.g., from a load balancer or API gateway), it is propagated through. Otherwise, a new ID is auto-generated with the format req_<16 hex chars>.

# Auto-generated request ID
curl -v http://localhost:3402/health
# < X-Request-Id: req_a1b2c3d4e5f67890

# Propagate your own trace ID
curl -v -H "X-Request-Id: my-trace-123" http://localhost:3402/health
# < X-Request-Id: my-trace-123

Feature	Details
Format	`req_` + 16 hex chars (8 bytes of randomness)
Propagation	Incoming `X-Request-Id` header is preserved and returned
CORS	Included in `Access-Control-Allow-Headers` and `Access-Control-Expose-Headers`
Audit	Request ID appears in `gate.allow`, `gate.deny`, and `session.created` audit metadata
Exports	`generateRequestId()` and `getRequestId(req)` available in SDK
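
For services that sit in front of PayGate and want to mint compatible IDs, the documented format is easy to reproduce. The helpers below are a standalone sketch; the SDK's own `generateRequestId()` and `getRequestId(req)` may differ in details, and the names here are deliberately different.

```javascript
const crypto = require('node:crypto');

// 8 random bytes rendered as 16 hex characters, prefixed with req_.
function makeRequestId() {
  return 'req_' + crypto.randomBytes(8).toString('hex');
}

// Prefer an incoming X-Request-Id; otherwise generate one.
// Node lowercases incoming header names, hence the lowercase lookup.
function requestIdFor(headers) {
  const incoming = headers['x-request-id'];
  return typeof incoming === 'string' && incoming.length > 0 ? incoming : makeRequestId();
}
```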

Server Info Endpoint

GET /info returns a comprehensive JSON object describing the server's capabilities. Public endpoint — no admin key required.

curl http://localhost:3402/info

{
  "name": "My API Server",
  "version": "5.5.0",
  "transport": "stdio",
  "port": 3402,
  "auth": ["api_key", "scoped_token"],
  "features": {
    "shadowMode": false,
    "webhooks": true,
    "webhookSignatures": true,
    "refundOnFailure": true,
    "redis": false,
    "oauth": false,
    "plugins": false,
    "multiServer": false
  },
  "pricing": {
    "defaultCreditsPerCall": 1,
    "toolPricing": {
      "expensive-tool": { "creditsPerCall": 10 }
    }
  },
  "rateLimit": { "globalPerMin": 60 },
  "endpoints": {
    "mcp": "/mcp",
    "health": "/health",
    "info": "/info",
    "status": "/status (admin)",
    "keys": "/keys (admin)",
    "metrics": "/metrics",
    "pricing": "/pricing",
    "audit": "/audit (admin)",
    "analytics": "/analytics (admin)"
  }
}

Configurable CORS

Control which browser origins can access your PayGate server. Default is * (allow all).

# CLI flag: single origin
npx paygate-mcp wrap --server "..." --cors-origin "https://myapp.com"

# CLI flag: multiple origins (comma-separated)
npx paygate-mcp wrap --server "..." --cors-origin "https://app1.com,https://app2.com"

# Env var
PAYGATE_CORS_ORIGIN=https://myapp.com npx paygate-mcp wrap --server "..."

Config file:

{
  "cors": {
    "origin": ["https://app1.com", "https://app2.com"],
    "credentials": true,
    "maxAge": 3600
  }
}

Feature	Details
Default	`*` (allow all origins)
Single origin	Exact match against request `Origin` header
Multiple origins	Array of allowed origins, matched against request
Credentials	`Access-Control-Allow-Credentials: true` when enabled
Max-Age	Preflight cache duration (default: 86400 = 24 hours)
Vary	`Vary: Origin` header added when origin is not `*`
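
The matching behavior in the table can be sketched as follows. This is an approximation: omitting Access-Control-Allow-Origin for a non-matching origin is an assumption about how rejection is signalled, and `corsHeadersFor` is a hypothetical name.

```javascript
// Sketch of origin matching. cors.origin may be '*', one origin, or an array.
function corsHeadersFor(requestOrigin, cors = { origin: '*' }) {
  if (cors.origin === '*') return { 'Access-Control-Allow-Origin': '*' };

  const allowed = Array.isArray(cors.origin) ? cors.origin : [cors.origin];
  const headers = { Vary: 'Origin' }; // response depends on the request origin
  if (requestOrigin && allowed.includes(requestOrigin)) {
    headers['Access-Control-Allow-Origin'] = requestOrigin; // echo the exact match
    if (cors.credentials) headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}
```

Echoing the matched origin rather than the whole list matters because browsers accept only a single origin (or `*`) in Access-Control-Allow-Origin.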

Custom Response Headers

Add custom HTTP headers to all responses — perfect for security headers, cache control, or custom tracking.

# CLI flag: single header
npx paygate-mcp wrap --server "..." --header "X-Frame-Options:DENY"

# CLI flag: multiple headers (comma-separated)
npx paygate-mcp wrap --server "..." --header "X-Frame-Options:DENY,X-Content-Type-Options:nosniff"

# Env var
PAYGATE_CUSTOM_HEADERS="X-Frame-Options:DENY,X-Content-Type-Options:nosniff" npx paygate-mcp wrap --server "..."

Config file:

{
  "customHeaders": {
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Custom-Tag": "my-service"
  }
}

Custom headers are applied to every HTTP response (health, info, admin, MCP, preflight) and coexist with CORS headers and request IDs. They do not override built-in headers.

Config Export

Inspect the running server configuration for debugging and verification:

curl http://localhost:3402/config -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns the full config with sensitive values masked:

Field	Masking
`webhookSecret`	`***`
`webhookUrl`	Scheme + host only (e.g. `https://hooks.example.com/***`)
`serverCommand`	`***`
`serverArgs`	`['***']`
Webhook filter secrets	`***`
Webhook filter URLs	Scheme + host only

Non-sensitive values (pricing, rate limits, CORS, custom headers, quotas, etc.) are returned as-is. Each export is recorded in the audit trail as config.export.

Trusted Proxies

When running behind load balancers or reverse proxies, configure trusted proxy IPs/CIDRs so PayGate extracts the real client IP from the X-Forwarded-For header correctly:

# CLI flag (comma-separated IPs and/or CIDRs)
npx paygate-mcp wrap --server "node server.js" --trusted-proxies "10.0.0.0/8,172.16.0.0/12"

# Environment variable
PAYGATE_TRUSTED_PROXIES="10.0.0.0/8,172.16.0.0/12" npx paygate-mcp wrap --server "node server.js"

Config file:

{
  "serverCommand": "node",
  "serverArgs": ["server.js"],
  "trustedProxies": ["10.0.0.0/8", "172.16.0.0/12", "192.168.1.1"]
}

How it works: Without trusted proxies, the first X-Forwarded-For value is used (backward compatible). With trusted proxies configured, the header is walked right-to-left, skipping IPs that match the trusted list, and the first non-trusted IP is returned as the real client IP. This is critical for accurate IP allowlisting when behind proxies.

Supports exact IPv4 addresses and CIDR notation (/8, /16, /24, /32, etc.). The resolveClientIp function is also exported from the SDK for custom use.
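
The right-to-left walk can be illustrated with a small IPv4-only sketch. It is not the SDK's `resolveClientIp` (whose signature is not shown here); the function names are hypothetical, and falling back to the leftmost hop when every hop is trusted is an assumption.

```javascript
// Convert dotted-quad IPv4 to an unsigned 32-bit integer, or null if invalid.
function ipv4ToInt(ip) {
  const text = ip.trim();
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(text)) return null;
  const parts = text.split('.').map(Number);
  if (parts.some((p) => p > 255)) return null;
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True if ip equals an exact entry or falls inside a CIDR entry.
function isTrusted(ip, trusted) {
  const addr = ipv4ToInt(ip);
  if (addr === null) return false;
  return trusted.some((entry) => {
    const [base, bitsText] = entry.split('/');
    const bits = bitsText === undefined ? 32 : Number(bitsText);
    const baseInt = ipv4ToInt(base);
    if (baseInt === null || !Number.isInteger(bits) || bits < 0 || bits > 32) return false;
    const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((baseInt & mask) >>> 0);
  });
}

function clientIpFrom(xForwardedFor, trusted = []) {
  const hops = xForwardedFor.split(',').map((h) => h.trim()).filter(Boolean);
  if (hops.length === 0) return null;
  if (trusted.length === 0) return hops[0]; // no trusted list: first value
  for (let i = hops.length - 1; i >= 0; i--) {
    if (!isTrusted(hops[i], trusted)) return hops[i]; // first non-trusted from the right
  }
  return hops[0]; // every hop trusted: leftmost value (assumption)
}
```

Walking from the right matters because a client can prepend arbitrary values to X-Forwarded-For; only the entries appended by your own proxies are reliable.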

Key Listing Pagination

The GET /keys endpoint supports pagination, filtering, and sorting when any query parameter is present:

# Paginate: 10 keys per page, second page
curl "http://localhost:3402/keys?limit=10&offset=10" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by namespace and active status
curl "http://localhost:3402/keys?limit=50&namespace=prod&active=true" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Sort by credits descending
curl "http://localhost:3402/keys?limit=20&sortBy=credits&order=desc" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by credit range
curl "http://localhost:3402/keys?limit=50&minCredits=100&maxCredits=1000" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Find keys by name prefix
curl "http://localhost:3402/keys?limit=50&namePrefix=prod-" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Query Parameters:

Parameter	Type	Description
`limit`	number	Results per page (1–500, default: 50)
`offset`	number	Skip N results (default: 0)
`sortBy`	string	Sort field: `createdAt`, `name`, `credits`, `lastUsedAt`, `totalSpent`, `totalCalls`
`order`	string	Sort direction: `asc` or `desc` (default: `desc`)
`namespace`	string	Filter by namespace
`group`	string	Filter by group ID
`active`	string	`true` or `false`
`suspended`	string	`true` or `false`
`expired`	string	`true` or `false`
`namePrefix`	string	Case-insensitive name prefix match
`minCredits`	number	Minimum credits (inclusive)
`maxCredits`	number	Maximum credits (inclusive)

Response format (when any pagination/filter param is present):

{
  "keys": [...],
  "total": 150,
  "offset": 20,
  "limit": 10,
  "hasMore": true
}

Backward compatible: Without any pagination/filter/sort params, GET /keys returns the same flat array as before.
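
A client that needs every key can follow `hasMore` until the listing is exhausted. The sketch below keeps the HTTP call abstract: `fetchPage` is a hypothetical callback that performs GET /keys with the given `limit` and `offset` and returns the parsed JSON in the paginated format shown above.

```javascript
// Collects all keys by advancing offset until hasMore is false.
async function listAllKeys(fetchPage, limit = 50) {
  const all = [];
  let offset = 0;
  for (;;) {
    const page = await fetchPage({ limit, offset });
    all.push(...page.keys);
    // Stop on the last page; the empty-page check guards against looping forever.
    if (!page.hasMore || page.keys.length === 0) break;
    offset += page.keys.length;
  }
  return all;
}
```

Always pass `limit` (or another query parameter) in `fetchPage`; without any parameters the endpoint returns the legacy flat array, which has no `keys` or `hasMore` fields.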

Key Statistics

GET /keys/stats returns aggregate statistics across all keys:

# Get all key statistics
curl http://localhost:3402/keys/stats -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by namespace
curl "http://localhost:3402/keys/stats?namespace=prod" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "total": 150,
  "active": 120,
  "suspended": 10,
  "expired": 15,
  "revoked": 5,
  "totalCreditsAllocated": 500000,
  "totalCreditsSpent": 125000,
  "totalCreditsRemaining": 375000,
  "totalCalls": 84200,
  "byNamespace": { "prod": 80, "staging": 50, "default": 20 },
  "byGroup": { "enterprise": 30, "starter": 45 }
}

When ?namespace= is provided, all counts/aggregates are scoped to that namespace, and a filteredByNamespace field is included in the response.

Rate Limit Status

GET /keys/rate-limit-status?key=... returns the current rate limit window state for a key without consuming a call:

curl "http://localhost:3402/keys/rate-limit-status?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc123...",
  "name": "my-key",
  "global": {
    "limit": 100,
    "used": 23,
    "remaining": 77,
    "resetInMs": 45000,
    "windowMs": 60000
  },
  "perTool": {
    "search": { "limit": 10, "used": 5, "remaining": 5, "resetInMs": 30000 },
    "translate": { "limit": 20, "used": 0, "remaining": 20, "resetInMs": 60000 }
  }
}

perTool is only present when tools have per-tool rate limits configured via toolPricing. Tools without custom rate limits are not included.
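
The fields in this response can be read as a non-mutating view of a sliding window. The sketch below is an assumption about how such a view might be computed, not PayGate's implementation. peekWindow is a hypothetical name, and resetInMs is taken to mean the time until the oldest in-window call expires (or the full window when there are no calls).

```typescript
// Sketch of a read-only sliding-window status; names and semantics are assumptions.
interface WindowStatus {
  limit: number;
  used: number;
  remaining: number;
  resetInMs: number;
  windowMs: number;
}

function peekWindow(callTimestamps: number[], limit: number, windowMs: number, now: number): WindowStatus {
  // Only calls inside the last windowMs count; nothing is recorded by this function.
  const live = callTimestamps.filter((t) => t > now - windowMs && t <= now);
  const oldest = live.length > 0 ? Math.min(...live) : now;
  return {
    limit,
    used: live.length,
    remaining: Math.max(0, limit - live.length),
    resetInMs: oldest + windowMs - now,
    windowMs,
  };
}
```

Because the function only filters the recorded timestamps, calling it any number of times leaves the window unchanged, which is the property the endpoint promises.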

Quota Status

GET /keys/quota-status?key=... returns daily/monthly quota usage for a key:

curl "http://localhost:3402/keys/quota-status?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc123...",
  "name": "my-key",
  "quotaSource": "global",
  "daily": {
    "callsUsed": 42,
    "callsLimit": 100,
    "callsRemaining": 58,
    "creditsUsed": 150,
    "creditsLimit": 500,
    "creditsRemaining": 350,
    "resetDay": "2026-02-26"
  },
  "monthly": {
    "callsUsed": 850,
    "callsLimit": 2000,
    "callsRemaining": 1150,
    "creditsUsed": 3200,
    "creditsLimit": 10000,
    "creditsRemaining": 6800,
    "resetMonth": "2026-02"
  }
}

quotaSource indicates where the quota is configured: "per-key" (key-level override), "global" (server-wide config), or "none" (no quota). When a limit is 0 (unlimited), remaining is null.
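
The null-for-unlimited rule is easy to get wrong on the client side, so here is a small sketch of it. remainingQuota and summarizePeriod are hypothetical helpers. Clamping the remaining value at zero is an assumption.

```typescript
// Hypothetical helpers showing the "limit 0 means unlimited" convention.
function remainingQuota(limit: number, used: number): number | null {
  if (limit === 0) return null; // 0 = unlimited, so there is no meaningful remainder
  return Math.max(0, limit - used); // assumption: never reported as negative
}

interface QuotaPeriodSketch {
  callsUsed: number;
  callsLimit: number;
  callsRemaining: number | null;
  creditsUsed: number;
  creditsLimit: number;
  creditsRemaining: number | null;
}

function summarizePeriod(callsUsed: number, callsLimit: number, creditsUsed: number, creditsLimit: number): QuotaPeriodSketch {
  return {
    callsUsed,
    callsLimit,
    callsRemaining: remainingQuota(callsLimit, callsUsed),
    creditsUsed,
    creditsLimit,
    creditsRemaining: remainingQuota(creditsLimit, creditsUsed),
  };
}

// Matches the "daily" block in the sample response above.
console.log(summarizePeriod(42, 100, 150, 500).callsRemaining); // 58
```

Clients should treat null as "no cap" rather than as zero when deciding whether a key can keep calling.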

Credit History

GET /keys/credit-history?key=... returns the credit mutation log for a key:

curl "http://localhost:3402/keys/credit-history?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by type, limit, or since timestamp
curl "http://localhost:3402/keys/credit-history?key=pg_...&type=topup&limit=10" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc123...",
  "name": "my-key",
  "currentBalance": 700,
  "totalEntries": 2,
  "entries": [
    {
      "timestamp": "2026-02-26T12:30:00.000Z",
      "type": "topup",
      "amount": 200,
      "balanceBefore": 500,
      "balanceAfter": 700
    },
    {
      "timestamp": "2026-02-26T12:00:00.000Z",
      "type": "initial",
      "amount": 500,
      "balanceBefore": 0,
      "balanceAfter": 500
    }
  ]
}

Entry types: initial, topup, transfer_in, transfer_out, auto_topup, deduction, refund, bulk_topup. Entries are newest-first, capped at 100 per key. Transfers include a memo field when provided.

Spending Velocity

GET /keys/spending-velocity?key=... returns credit burn rate and depletion forecast:

curl "http://localhost:3402/keys/spending-velocity?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Custom analysis window (default 24h, max 720h/30d)
curl "http://localhost:3402/keys/spending-velocity?key=pg_...&window=48" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc123...",
  "name": "my-key",
  "currentBalance": 750,
  "velocity": {
    "creditsPerHour": 12.5,
    "creditsPerDay": 300,
    "callsPerHour": 2.5,
    "callsPerDay": 60,
    "estimatedDepletionDate": "2026-03-01T18:00:00.000Z",
    "estimatedHoursRemaining": 60,
    "windowHours": 24,
    "dataPoints": 45
  },
  "topTools": [
    { "tool": "search", "calls": 30, "credits": 150 },
    { "tool": "generate", "calls": 15, "credits": 120 }
  ]
}

estimatedDepletionDate and estimatedHoursRemaining are null when there's no spending activity. topTools shows the 5 highest-spend tools from usage data.
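
The forecast fields follow from simple arithmetic: credits spent in the window divided by the window length gives the burn rate, and the balance divided by the burn rate gives the hours remaining. The sketch below works through that under stated assumptions. spendingVelocity and its types are hypothetical, dataPoints is assumed to be the number of usage events in the window, and the 1-hour minimum window is an assumption.

```typescript
// Sketch of the burn-rate arithmetic; not PayGate's implementation.
interface UsageEvent {
  timestampMs: number; // when the tool call happened
  credits: number;     // credits charged for that call
}

interface VelocitySketch {
  creditsPerHour: number;
  creditsPerDay: number;
  callsPerHour: number;
  callsPerDay: number;
  estimatedDepletionDate: string | null;
  estimatedHoursRemaining: number | null;
  windowHours: number;
  dataPoints: number;
}

const MS_PER_HOUR = 3600000;

function spendingVelocity(events: UsageEvent[], balance: number, nowMs: number, windowHours: number = 24): VelocitySketch {
  const hours = Math.min(720, Math.max(1, windowHours)); // documented max is 720h
  const cutoff = nowMs - hours * MS_PER_HOUR;
  const recent = events.filter((e) => e.timestampMs >= cutoff && e.timestampMs <= nowMs);
  const credits = recent.reduce((sum, e) => sum + e.credits, 0);
  const creditsPerHour = credits / hours;
  const callsPerHour = recent.length / hours;
  // No spending in the window means no forecast, matching the null behavior described above.
  const hoursLeft = creditsPerHour > 0 ? balance / creditsPerHour : null;
  return {
    creditsPerHour,
    creditsPerDay: creditsPerHour * 24,
    callsPerHour,
    callsPerDay: callsPerHour * 24,
    estimatedDepletionDate: hoursLeft === null ? null : new Date(nowMs + hoursLeft * MS_PER_HOUR).toISOString(),
    estimatedHoursRemaining: hoursLeft,
    windowHours: hours,
    dataPoints: recent.length,
  };
}
```

For example, 300 credits spent over a 24-hour window is 12.5 credits per hour, so a balance of 750 lasts 750 / 12.5 = 60 hours, which is the relationship between the values in the sample response.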

Key Comparison

GET /keys/compare?keys=pg_a,pg_b,pg_c returns side-by-side comparison of 2–10 keys:

curl "http://localhost:3402/keys/compare?keys=pg_abc,pg_xyz" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "compared": 2,
  "keys": [
    {
      "key": "pg_abc123...",
      "name": "prod-agent",
      "status": "active",
      "credits": { "current": 750, "totalSpent": 250 },
      "usage": { "totalCalls": 50, "totalAllowed": 48, "totalDenied": 2 },
      "velocity": { "creditsPerHour": 12.5, "creditsPerDay": 300, "estimatedHoursRemaining": 60 },
      "rateLimit": { "used": 3, "limit": 60, "remaining": 57 },
      "metadata": { "namespace": "prod", "group": "team-a", "createdAt": "2026-02-01T00:00:00Z", "tags": { "env": "prod" } }
    },
    {
      "key": "pg_xyz789...",
      "name": "staging-agent",
      "status": "active",
      "credits": { "current": 200, "totalSpent": 800 },
      "usage": { "totalCalls": 120, "totalAllowed": 120, "totalDenied": 0 },
      "velocity": { "creditsPerHour": 8.3, "creditsPerDay": 200, "estimatedHoursRemaining": 24 },
      "rateLimit": { "used": 0, "limit": 60, "remaining": 60 },
      "metadata": { "namespace": "staging", "group": null, "createdAt": "2026-02-15T00:00:00Z", "tags": {} }
    }
  ]
}

Keys not found are reported in a notFound array. Supports aliases. Maximum 10 keys per comparison.

Key Health Score

GET /keys/health?key=... returns a composite health score (0–100) with weighted component breakdown:

curl "http://localhost:3402/keys/health?key=pg_abc" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc123...",
  "name": "prod-agent",
  "score": 72,
  "status": "caution",
  "issues": ["Key expires within 7 days", "Credits depleting rapidly"],
  "components": {
    "balance": { "score": 40, "risk": "warning", "weight": 0.30 },
    "quota": { "score": 85, "risk": "good", "weight": 0.25 },
    "rateLimit": { "score": 100, "risk": "healthy", "weight": 0.20 },
    "errorRate": { "score": 75, "risk": "good", "weight": 0.25 }
  }
}

Components: balance (30%, hours until credit depletion), quota (25%, max utilization across daily/monthly limits), rateLimit (20%, current window usage), errorRate (25%, denied/total ratio). Status: healthy (≥90), good (≥75), caution (≥50), warning (≥25), critical (<25). Issues detect: revoked, suspended, expired, expiring soon, zero credits, rapid depletion. Supports aliases.
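
The composite score is a weighted sum of the four component scores, mapped to a status by the thresholds listed above. The sketch below reproduces that arithmetic. compositeHealth and statusForScore are hypothetical names, and rounding to the nearest integer is an assumption.

```typescript
// Sketch of the weighted health score; weights and thresholds are taken from the text above.
type HealthStatus = "healthy" | "good" | "caution" | "warning" | "critical";

interface ComponentScores {
  balance: number;
  quota: number;
  rateLimit: number;
  errorRate: number;
}

const HEALTH_WEIGHTS: ComponentScores = { balance: 0.30, quota: 0.25, rateLimit: 0.20, errorRate: 0.25 };

function statusForScore(score: number): HealthStatus {
  if (score >= 90) return "healthy";
  if (score >= 75) return "good";
  if (score >= 50) return "caution";
  if (score >= 25) return "warning";
  return "critical";
}

function compositeHealth(c: ComponentScores): { score: number; status: HealthStatus } {
  const weighted =
    c.balance * HEALTH_WEIGHTS.balance +
    c.quota * HEALTH_WEIGHTS.quota +
    c.rateLimit * HEALTH_WEIGHTS.rateLimit +
    c.errorRate * HEALTH_WEIGHTS.errorRate;
  const score = Math.round(weighted); // assumption: rounded to the nearest integer
  return { score, status: statusForScore(score) };
}

// Component scores from the sample response: 12 + 21.25 + 20 + 18.75 = 72.
console.log(compositeHealth({ balance: 40, quota: 85, rateLimit: 100, errorRate: 75 })); // score 72, status "caution"
```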

Maintenance Mode

Put your server into maintenance mode to gracefully reject client traffic while keeping admin endpoints operational:

# Enable maintenance mode
curl -X POST http://localhost:3402/maintenance \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"enabled": true, "message": "Upgrading to v7 — back in 10 minutes"}'

# Check maintenance status
curl http://localhost:3402/maintenance -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Disable maintenance mode
curl -X POST http://localhost:3402/maintenance \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"enabled": false}'

Response (enabled):

{
  "enabled": true,
  "message": "Upgrading to v7 — back in 10 minutes",
  "since": "2025-03-15T14:30:00.000Z"
}

When enabled, all /mcp requests return 503 with the custom message. Admin endpoints (/keys, /maintenance, /audit, etc.) remain fully operational. GET /health returns {"status": "maintenance"}. Both enable and disable actions are recorded in the audit trail (maintenance.enabled / maintenance.disabled).

Admin Event Stream

Stream real-time server events to admin clients via Server-Sent Events (SSE):

# Stream all events
curl -N http://localhost:3402/admin/events \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Accept: text/event-stream"

# Stream only key operations
curl -N "http://localhost:3402/admin/events?types=key.created,key.revoked,key.topup" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Accept: text/event-stream"

Events:

event: connected
data: {"message":"Admin event stream connected","filters":"all"}

event: audit
data: {"id":42,"timestamp":"2025-03-15T14:30:00.000Z","type":"key.created","actor":"admin","message":"Key created: prod-agent","metadata":{...}}

event: audit
data: {"id":43,"timestamp":"2025-03-15T14:30:01.000Z","type":"gate.allow","actor":"pg_abc12...","message":"Allowed: get_weather","metadata":{...}}

Every audit event (tool calls, denials, key operations, maintenance, alerts) is broadcast in real-time. Use ?types= to filter by comma-separated event types. Supports multiple concurrent admin clients. Keepalive pings every 15s prevent connection timeouts. Connections are cleaned up automatically on disconnect.
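
If you consume the stream from code rather than curl, the frames shown above need to be split into events. The sketch below parses complete SSE frames from a text chunk. parseSseChunk is a hypothetical helper, and it assumes that frames without a data: line (such as comment-style keepalives) can be skipped. A real client would also need to buffer partial frames between network reads.

```typescript
// Minimal SSE frame parser; a sketch, not a full implementation of the SSE spec.
interface SseMessage {
  event: string;
  data: string;
}

function parseSseChunk(text: string): SseMessage[] {
  const messages: SseMessage[] = [];
  // Frames are separated by a blank line.
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let eventName = "message"; // SSE default when no event: line is present
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment line, often used for keepalives
      if (line.startsWith("event:")) eventName = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
    }
    if (dataLines.length > 0) messages.push({ event: eventName, data: dataLines.join("\n") });
  }
  return messages;
}

const sampleStream =
  'event: connected\ndata: {"message":"Admin event stream connected","filters":"all"}\n\n' +
  ": keepalive\n\n" +
  'event: audit\ndata: {"id":42,"type":"key.created"}\n\n';
console.log(parseSseChunk(sampleStream).map((m) => m.event)); // [ 'connected', 'audit' ]
```

Each data payload is a JSON string, so JSON.parse on message.data yields the audit event object.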

Key Notes

Attach timestamped notes to API keys for operational tracking:

# Add a note
curl -X POST http://localhost:3402/keys/notes \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "text": "Increased credits per customer request #1234"}'

# List notes
curl "http://localhost:3402/keys/notes?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Delete a note by index
curl -X DELETE "http://localhost:3402/keys/notes?key=pg_...&index=0" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response (list):

{
  "key": "pg_abc1...2345",
  "notes": [
    { "timestamp": "2025-03-15T14:30:00.000Z", "author": "admin", "text": "Increased credits per customer request #1234" },
    { "timestamp": "2025-03-16T09:00:00.000Z", "author": "admin", "text": "Upgraded to premium tier" }
  ],
  "count": 2
}

Max 50 notes per key, 1000 characters per note. Works on suspended and revoked keys. Supports aliases. All add/delete operations are recorded in the audit trail (key.note_added / key.note_deleted).

Scheduled Actions

Schedule future-dated actions on API keys — automatically revoke, suspend, or top up credits at a specified time:

# Schedule a key revocation at a specific time (ISO 8601 timestamp)
curl -X POST http://localhost:3402/keys/schedule \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "action": "revoke", "executeAt": "2025-04-01T00:00:00Z"}'

# Schedule a credit top-up
curl -X POST http://localhost:3402/keys/schedule \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "action": "topup", "executeAt": "2025-04-01T00:00:00Z", "params": {"credits": 500}}'

# List all pending schedules (optional ?key= filter)
curl "http://localhost:3402/keys/schedule" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Cancel a schedule
curl -X DELETE "http://localhost:3402/keys/schedule?id=sched_1" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response (create):

{
  "id": "sched_1",
  "key": "pg_abc1...2345",
  "action": "revoke",
  "executeAt": "2025-04-01T00:00:00.000Z",
  "createdAt": "2025-03-15T10:30:00.000Z"
}

Supported actions: revoke, suspend, topup (requires params.credits). Max 20 schedules per key. Supports aliases. A background timer checks for due schedules every 10 seconds. All create/execute/cancel operations are recorded in the audit trail (schedule.created / schedule.executed / schedule.cancelled).
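
A periodic check like the one described above can be sketched as a function that splits pending schedules into those that are due and those that are not. takeDue and ScheduleEntry are hypothetical names, and executing due actions in executeAt order is an assumption.

```typescript
// Sketch of the "which schedules are due?" step of a polling timer.
type ScheduleAction = "revoke" | "suspend" | "topup";

interface ScheduleEntry {
  id: string;
  key: string;
  action: ScheduleAction;
  executeAtMs: number;
  credits: number | null; // only meaningful for topup
}

function takeDue(pending: ScheduleEntry[], nowMs: number): { due: ScheduleEntry[]; remaining: ScheduleEntry[] } {
  const due: ScheduleEntry[] = [];
  const remaining: ScheduleEntry[] = [];
  for (const entry of pending) {
    (entry.executeAtMs <= nowMs ? due : remaining).push(entry);
  }
  due.sort((a, b) => a.executeAtMs - b.executeAtMs); // assumption: oldest first
  return { due, remaining };
}

// A timer would call this on each tick, for example:
//   setInterval(() => { const { due } = takeDue(pending, Date.now()); /* run each action */ }, 10000);
```

With a polling design like this, an action runs on the first tick at or after its executeAt, so it can fire up to one polling interval late.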

Key Activity Timeline

Get a unified chronological feed of all events for a specific key — audit events (creation, suspension, notes, etc.) and usage events (tool calls, denials) merged into one timeline:

# Get activity for a key (newest first, default limit 50)
curl "http://localhost:3402/keys/activity?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"

# With filters
curl "http://localhost:3402/keys/activity?key=pg_...&limit=20&since=2025-03-15T00:00:00Z" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_abc1...2345",
  "name": "my-agent",
  "total": 42,
  "limit": 50,
  "events": [
    { "timestamp": "2025-03-16T10:30:00Z", "source": "usage", "type": "tool.call", "message": "Called search (5 credits)", "metadata": { "tool": "search", "creditsCharged": 5, "allowed": true } },
    { "timestamp": "2025-03-16T09:00:00Z", "source": "audit", "type": "key.note_added", "message": "Note added to key", "metadata": { "key": "pg_abc1...2345" } },
    { "timestamp": "2025-03-15T14:00:00Z", "source": "audit", "type": "key.created", "message": "Key created: my-agent", "metadata": { "keyMasked": "pg_abc1...2345" } }
  ]
}

Max 200 events per request. Supports aliases. Works on suspended and revoked keys.

Credit Reservations

Pre-reserve credits before executing expensive operations. Prevents overcommit in concurrent scenarios:

# Reserve 500 credits (hold for 5 min default, or set ttlSeconds)
curl -X POST http://localhost:3402/keys/reserve \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "credits": 500, "ttlSeconds": 300, "memo": "Batch job #42"}'

# Commit — deducts the held credits
curl -X POST http://localhost:3402/keys/reserve/commit \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"reservationId": "rsv_1"}'

# Release — frees the hold without deducting
curl -X POST http://localhost:3402/keys/reserve/release \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"reservationId": "rsv_1"}'

# List active reservations (optional ?key= filter)
curl "http://localhost:3402/keys/reserve" -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response (reserve):

{
  "id": "rsv_1",
  "key": "pg_abc1...2345",
  "credits": 500,
  "createdAt": "2025-03-16T10:30:00Z",
  "expiresAt": "2025-03-16T10:35:00Z",
  "memo": "Batch job #42",
  "available": 500
}

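TTL range is 10 seconds to 1 hour (default 5 minutes), with a maximum of 50 reservations per key. Expired reservations are cleaned up automatically. Supports aliases. Revoked and suspended keys are rejected. All reserve/commit/release operations are recorded in the audit trail (credits.reserved / credits.committed / credits.released).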
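
The reserve/commit/release flow behaves like a hold on a bank balance: held credits stay in the balance but are not available to other reservations. The sketch below models that for a single key. ReservationLedger is a hypothetical class, not PayGate's code. Clamping the TTL to the documented range (rather than rejecting it) and returning null on failure are assumptions.

```typescript
// Sketch of a per-key hold ledger; illustrates why holds prevent overcommit.
interface CreditHold {
  id: string;
  credits: number;
  expiresAtMs: number;
}

class ReservationLedger {
  private balance: number;
  private holds: CreditHold[] = [];
  private nextId = 1;

  constructor(balance: number) {
    this.balance = balance;
  }

  /** Balance minus all unexpired holds. Expired holds are dropped first. */
  available(nowMs: number): number {
    this.holds = this.holds.filter((h) => h.expiresAtMs > nowMs);
    return this.balance - this.holds.reduce((sum, h) => sum + h.credits, 0);
  }

  reserve(credits: number, nowMs: number, ttlSeconds: number = 300): CreditHold | null {
    const ttl = Math.min(3600, Math.max(10, ttlSeconds)); // assumption: clamp to 10s-1h
    if (credits <= 0 || credits > this.available(nowMs) || this.holds.length >= 50) return null;
    const hold: CreditHold = { id: `rsv_${this.nextId++}`, credits, expiresAtMs: nowMs + ttl * 1000 };
    this.holds.push(hold);
    return hold;
  }

  /** Deduct the held credits. Fails if the hold is unknown or already expired. */
  commit(id: string, nowMs: number): boolean {
    this.available(nowMs); // purge expired holds before looking the id up
    const hold = this.take(id);
    if (hold === null) return false;
    this.balance -= hold.credits;
    return true;
  }

  /** Free the hold without deducting anything. */
  release(id: string): boolean {
    return this.take(id) !== null;
  }

  private take(id: string): CreditHold | null {
    const index = this.holds.findIndex((h) => h.id === id);
    if (index < 0) return null;
    const [hold] = this.holds.splice(index, 1);
    return hold ?? null;
  }
}
```

With a starting balance of 1000, reserving 500 leaves 500 available, so a second concurrent request for 600 is refused even though the balance itself has not yet changed.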

Request Log

Queryable log of every MCP tool call with timing, credits, status, and deny reason:

# Get all requests (newest first)
curl "http://localhost:3402/requests" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by tool name
curl "http://localhost:3402/requests?tool=my_tool" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by status (allowed or denied)
curl "http://localhost:3402/requests?status=denied" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by key (partial match on masked key)
curl "http://localhost:3402/requests?key=pg_abc1" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by time + pagination
curl "http://localhost:3402/requests?since=2025-03-01T00:00:00Z&limit=50&offset=0" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Combine filters
curl "http://localhost:3402/requests?tool=my_tool&status=allowed&limit=10" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "total": 42,
  "offset": 0,
  "limit": 100,
  "summary": {
    "totalAllowed": 38,
    "totalDenied": 4,
    "totalCredits": 190,
    "avgDurationMs": 45
  },
  "requests": [
    {
      "id": 42,
      "timestamp": "2025-03-16T10:30:00Z",
      "tool": "my_tool",
      "key": "pg_abc1...2345",
      "status": "allowed",
      "credits": 5,
      "durationMs": 32,
      "requestId": "req_a1b2c3d4e5f6g7h8"
    },
    {
      "id": 41,
      "timestamp": "2025-03-16T10:29:55Z",
      "tool": "my_tool",
      "key": "pg_xyz9...8765",
      "status": "denied",
      "credits": 0,
      "durationMs": 1,
      "denyReason": "insufficient_credits",
      "requestId": "req_i9j0k1l2m3n4o5p6"
    }
  ]
}

The log is a 5000-entry ring buffer, so the oldest entries are overwritten once it is full. Summary statistics are computed on filtered results. Deny reasons: insufficient_credits, rate_limited, invalid_api_key, key_suspended, api_key_expired, tool_not_allowed, quota_exceeded.

Tool Stats

Per-tool analytics derived from the request log:

# Overview of all tools
curl "http://localhost:3402/tools/stats" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Detailed stats for a specific tool
curl "http://localhost:3402/tools/stats?tool=my_tool" -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by time range
curl "http://localhost:3402/tools/stats?since=2025-03-01T00:00:00Z" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response (overview):

{
  "totalTools": 3,
  "totalCalls": 150,
  "tools": [
    {
      "tool": "my_tool",
      "totalCalls": 100,
      "allowed": 95,
      "denied": 5,
      "successRate": 95,
      "totalCredits": 475,
      "avgDurationMs": 42
    }
  ]
}

Response (detailed ?tool=my_tool):

{
  "tool": "my_tool",
  "totalCalls": 100,
  "allowed": 95,
  "denied": 5,
  "successRate": 95,
  "totalCredits": 475,
  "avgDurationMs": 42,
  "p95DurationMs": 120,
  "denyReasons": {
    "insufficient_credits": 3,
    "rate_limited": 2
  },
  "topConsumers": [
    { "key": "pg_abc1...2345", "calls": 50, "credits": 250 },
    { "key": "pg_xyz9...8765", "calls": 30, "credits": 150 }
  ]
}

Top consumers limited to 10. Tools sorted by call count in overview. Data sourced from request log (5000-entry ring buffer).
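
The detailed fields can all be derived from request log entries. The sketch below shows one plausible derivation. toolStatsSketch and nearestRankPercentile are hypothetical names, p95 is computed with the nearest-rank method (the endpoint's exact method is not stated here), and successRate is expressed as a whole-number percentage to match the sample.

```typescript
// Sketch: aggregate per-tool stats from request log entries.
interface RequestLogEntry {
  tool: string;
  status: "allowed" | "denied";
  credits: number;
  durationMs: number;
  denyReason: string | null;
}

/** Nearest-rank percentile: the smallest value with at least p% of samples at or below it. */
function nearestRankPercentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] ?? 0;
}

function toolStatsSketch(entries: RequestLogEntry[], tool: string) {
  const rows = entries.filter((e) => e.tool === tool);
  const allowed = rows.filter((e) => e.status === "allowed").length;
  const denyReasons: Record<string, number> = {};
  for (const e of rows) {
    if (e.status === "denied" && e.denyReason !== null) {
      denyReasons[e.denyReason] = (denyReasons[e.denyReason] ?? 0) + 1;
    }
  }
  const durations = rows.map((e) => e.durationMs);
  const totalDuration = durations.reduce((sum, d) => sum + d, 0);
  return {
    tool,
    totalCalls: rows.length,
    allowed,
    denied: rows.length - allowed,
    successRate: rows.length === 0 ? 0 : Math.round((allowed / rows.length) * 100),
    totalCredits: rows.reduce((sum, e) => sum + e.credits, 0),
    avgDurationMs: rows.length === 0 ? 0 : Math.round(totalDuration / rows.length),
    p95DurationMs: nearestRankPercentile(durations, 95),
    denyReasons,
  };
}
```

Because the source is a bounded ring buffer, these figures describe recent traffic only, not the lifetime of the server.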

Request Log Export

Export the request log as JSON or CSV for offline analysis:

# Export as JSON (default)
curl "http://localhost:3402/requests/export" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" -o paygate-requests.json

# Export as CSV
curl "http://localhost:3402/requests/export?format=csv" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" -o paygate-requests.csv

# Export with filters
curl "http://localhost:3402/requests/export?tool=my_tool&status=denied&since=2025-03-01T00:00:00Z&until=2025-03-31T23:59:59Z" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Parameter	Description
`format`	`json` (default) or `csv`
`key`	Filter by API key (partial match)
`tool`	Filter by tool name (exact match)
`status`	`allowed` or `denied`
`since`	ISO 8601 start timestamp
`until`	ISO 8601 end timestamp

Both formats include Content-Disposition headers for automatic file download. Unlike /requests, the export endpoint returns all matching entries (no pagination limit). CSV includes proper quoting for values with commas.
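
If you post-process or re-generate the CSV yourself, the quoting rule is the usual one: wrap a field in double quotes when it contains a comma, quote, or line break, and double any embedded quotes. The sketch below shows that rule. csvField and toCsv are hypothetical helpers, and the column names in the usage line are illustrative, not the export's actual header.

```typescript
// Sketch of conventional CSV field quoting (RFC 4180 style).
function csvField(value: string | number): string {
  const text = String(value);
  if (!/[",\r\n]/.test(text)) return text; // plain values pass through unchanged
  return '"' + text.replace(/"/g, '""') + '"';
}

function toCsv(header: string[], rows: Array<Array<string | number>>): string {
  const lines: Array<Array<string | number>> = [header, ...rows];
  return lines.map((row) => row.map((cell) => csvField(cell)).join(",")).join("\n");
}

// Hypothetical columns; a deny reason containing a comma gets quoted.
console.log(toCsv(["tool", "status", "reason"], [["my_tool", "denied", "need 5, have 2"]]));
// tool,status,reason
// my_tool,denied,"need 5, have 2"
```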

Tool Call Dry Run

Simulate a tool call to check if it would be allowed — without deducting credits or incrementing rate limits:

curl -X POST http://localhost:3402/requests/dry-run \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "tool": "my_tool"}'

Response (allowed):

{
  "allowed": true,
  "tool": "my_tool",
  "creditsRequired": 5,
  "creditsAvailable": 100,
  "creditsAfter": 95,
  "rateLimit": { "used": 3, "limit": 60, "remaining": 57, "resetInMs": 45000 }
}

Response (denied):

{
  "allowed": false,
  "reason": "insufficient_credits: need 5, have 2",
  "tool": "my_tool",
  "creditsRequired": 5,
  "creditsAvailable": 2
}

Checks key validity, suspension, tool ACL, rate limits, credit balance, and spending limits. Supports alias keys. Useful for agents that want to pre-flight check a call before committing.

Batch Dry Run

Simulate multiple tool calls at once to check if an entire batch would succeed:

curl -X POST http://localhost:3402/requests/dry-run/batch \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"key": "pg_...", "tools": [{"name": "tool_a"}, {"name": "tool_b"}]}'

Response:

{
  "allAllowed": true,
  "totalCreditsRequired": 10,
  "creditsAvailable": 100,
  "creditsAfter": 90,
  "results": [
    { "tool": "tool_a", "allowed": true, "creditsRequired": 5 },
    { "tool": "tool_b", "allowed": true, "creditsRequired": 5 }
  ]
}

Performs aggregate credit check (sum of all tool prices vs balance), per-tool ACL validation, spending limit, and rate limit checks. Returns per-tool results so you can see which specific tools would fail. Max 100 tools per batch. Supports alias keys.
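
The aggregate check matters because each tool might be affordable on its own while the batch as a whole is not. The sketch below covers only the ACL and credit parts of the check; rate limits and spending limits are left out. dryRunBatchSketch, its parameters, and the default price of 1 are assumptions for illustration.

```typescript
// Sketch of an aggregate batch pre-check (ACL + credits only).
interface DryRunToolResult {
  tool: string;
  allowed: boolean;
  creditsRequired: number;
  reason: string | null;
}

interface DryRunBatchResult {
  allAllowed: boolean;
  totalCreditsRequired: number;
  creditsAvailable: number;
  creditsAfter: number | null;
  results: DryRunToolResult[];
}

function dryRunBatchSketch(
  balance: number,
  tools: string[],
  prices: Record<string, number>,
  allowedTools: string[] | null, // null = no ACL restriction
  defaultPrice: number = 1,      // assumption: fallback price for unlisted tools
): DryRunBatchResult {
  const results = tools.map((tool): DryRunToolResult => {
    const permitted = allowedTools === null || allowedTools.indexOf(tool) !== -1;
    return {
      tool,
      allowed: permitted,
      creditsRequired: prices[tool] ?? defaultPrice,
      reason: permitted ? null : "tool_not_allowed",
    };
  });
  // Sum every tool's price and compare the total against the balance once.
  const total = results.reduce((sum, r) => sum + r.creditsRequired, 0);
  const allAllowed = total <= balance && results.every((r) => r.allowed);
  return {
    allAllowed,
    totalCreditsRequired: total,
    creditsAvailable: balance,
    creditsAfter: allAllowed ? balance - total : null,
    results,
  };
}
```

With a balance of 8 and two tools priced at 5 each, both tools pass an individual check but the batch total of 10 exceeds the balance, so allAllowed is false.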

Tool Availability

Check per-key tool availability including pricing, affordability, and rate limit status:

curl "http://localhost:3402/tools/available?key=pg_..." \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_c815...09a6",
  "creditsAvailable": 100,
  "totalTools": 3,
  "accessibleTools": 2,
  "globalRateLimit": { "limit": 60, "used": 5, "remaining": 55, "resetInMs": 45000 },
  "tools": [
    { "tool": "tool_a", "accessible": true, "creditsPerCall": 10, "canAfford": true },
    { "tool": "tool_b", "accessible": false, "denyReason": "denied_by_acl", "creditsPerCall": 5, "canAfford": true },
    { "tool": "tool_c", "accessible": true, "creditsPerCall": 1, "canAfford": true, "rateLimit": { "limit": 10, "used": 3, "remaining": 7 } }
  ]
}

Returns every discovered tool with: accessible (ACL check), denyReason (if blocked), creditsPerCall, canAfford (credits vs price), and per-tool rateLimit when configured. Includes global rate limit info. Supports alias keys. Works on suspended keys (informational). Read-only — does not deduct credits or increment rate counters.

Key Dashboard

Get a consolidated overview of any API key in a single request:

curl "http://localhost:3402/keys/dashboard?key=pg_..." \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "key": "pg_c815...09a6",
  "name": "production-agent",
  "status": "active",
  "namespace": "prod",
  "balance": { "credits": 850, "totalSpent": 150, "totalAllocated": 1000, "spendingLimit": 500 },
  "health": { "score": 92, "status": "healthy" },
  "velocity": { "creditsPerHour": 6.2, "creditsPerDay": 149, "estimatedDepletionDate": "2025-02-03T..." },
  "rateLimits": { "global": { "limit": 60, "used": 12, "remaining": 48, "resetInMs": 35000 } },
  "quotas": { "source": "global", "daily": { "callsUsed": 24, "callsLimit": 100 }, "monthly": { "callsUsed": 340, "callsLimit": 5000 } },
  "usage": { "totalCalls": 340, "totalAllowed": 330, "totalDenied": 10, "totalCredits": 150 },
  "recentActivity": [{ "timestamp": "...", "event": "gate.allowed", "tool": "search", "credits": 5 }]
}

Combines metadata (status/namespace/group/tags), balance (credits/spent/allocated/spendingLimit), health score (0-100 composite), spending velocity with depletion forecast, rate limit and quota status, usage summary, and last 10 audit events. Supports alias keys. Works on suspended/revoked/expired keys. Read-only.

Admin Notifications

Get actionable notifications about keys that need attention:

# Get all notifications
curl http://localhost:3402/admin/notifications \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by severity
curl "http://localhost:3402/admin/notifications?severity=critical" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "total": 4,
  "critical": 2,
  "warning": 1,
  "info": 1,
  "notifications": [
    {
      "severity": "critical",
      "category": "zero_credits",
      "message": "Key has zero credits remaining",
      "key": "pg_c815...09a6",
      "keyName": "production-agent"
    },
    {
      "severity": "critical",
      "category": "key_expiring_soon",
      "message": "Key expires within 8 hours",
      "key": "pg_a3f1...b2e4",
      "keyName": "staging-agent",
      "details": { "expiresAt": "2026-02-27T08:00:00.000Z", "hoursRemaining": 7.5 }
    },
    {
      "severity": "warning",
      "category": "credits_depleting",
      "message": "Credits will deplete in ~18 hours at current rate",
      "key": "pg_d7e2...f1a3",
      "keyName": "batch-worker",
      "details": { "credits": 90, "creditsPerHour": 5.1, "estimatedHoursRemaining": 17.6 }
    },
    {
      "severity": "info",
      "category": "key_suspended",
      "message": "Key is suspended",
      "key": "pg_b4c9...e8d5",
      "keyName": "deprecated-agent"
    }
  ]
}

Notification categories:

key_expired (critical) — Key has passed its expiry date
key_expiring_soon (critical <24h, warning <7d) — Key approaching expiry
zero_credits (critical) — Key has no credits remaining
credits_depleting (critical <6h, warning <24h) — Spending velocity predicts depletion
key_suspended (info) — Key is suspended
high_error_rate (critical ≥50%, warning ≥25%) — High denial rate (min 10 calls)
rate_limit_pressure (warning ≥90%) — Rate limit nearly exhausted

Notifications are sorted by severity (critical first). Revoked keys are excluded. A single key can appear in multiple notifications (e.g., zero credits AND expiring soon). Filter with ?severity=critical|warning|info. Read-only.
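
The category thresholds above can be expressed as a small classifier. The sketch below is one reading of those rules, not PayGate's code. classifyKey and KeySnapshot are hypothetical, the input fields are assumed to be precomputed, and skipping credits_depleting for keys that already have zero credits is an assumption. Revoked keys are not modeled.

```typescript
// Sketch: map a key's state to notification categories and severities.
type AlertSeverity = "critical" | "warning" | "info";

interface KeyAlert {
  severity: AlertSeverity;
  category: string;
}

interface KeySnapshot {
  credits: number;
  suspended: boolean;
  hoursUntilExpiry: number | null;    // null = no expiry set
  hoursUntilDepletion: number | null; // null = no spending activity
  totalCalls: number;
  deniedCalls: number;
  rateLimitUsage: number;             // fraction of the current window used, 0-1
}

function classifyKey(k: KeySnapshot): KeyAlert[] {
  const alerts: KeyAlert[] = [];
  if (k.hoursUntilExpiry !== null) {
    if (k.hoursUntilExpiry <= 0) alerts.push({ severity: "critical", category: "key_expired" });
    else if (k.hoursUntilExpiry < 24) alerts.push({ severity: "critical", category: "key_expiring_soon" });
    else if (k.hoursUntilExpiry < 168) alerts.push({ severity: "warning", category: "key_expiring_soon" });
  }
  if (k.credits <= 0) {
    alerts.push({ severity: "critical", category: "zero_credits" });
  } else if (k.hoursUntilDepletion !== null) {
    if (k.hoursUntilDepletion < 6) alerts.push({ severity: "critical", category: "credits_depleting" });
    else if (k.hoursUntilDepletion < 24) alerts.push({ severity: "warning", category: "credits_depleting" });
  }
  if (k.suspended) alerts.push({ severity: "info", category: "key_suspended" });
  if (k.totalCalls >= 10) { // error rate is only judged with at least 10 calls
    const denialRate = k.deniedCalls / k.totalCalls;
    if (denialRate >= 0.5) alerts.push({ severity: "critical", category: "high_error_rate" });
    else if (denialRate >= 0.25) alerts.push({ severity: "warning", category: "high_error_rate" });
  }
  if (k.rateLimitUsage >= 0.9) alerts.push({ severity: "warning", category: "rate_limit_pressure" });
  const rank: Record<AlertSeverity, number> = { critical: 0, warning: 1, info: 2 };
  return alerts.sort((a, b) => rank[a.severity] - rank[b.severity]); // critical first
}
```

A key that expires in 7.5 hours and will run out of credits in about 18 hours produces a critical key_expiring_soon and a warning credits_depleting, which matches the severities in the sample response.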

System Dashboard

Get a system-wide overview in a single request:

curl http://localhost:3402/admin/dashboard \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "keys": { "total": 15, "active": 10, "suspended": 2, "revoked": 2, "expired": 1 },
  "credits": { "totalAllocated": 15000, "totalSpent": 4200, "totalRemaining": 10800 },
  "usage": {
    "totalCalls": 840,
    "totalAllowed": 790,
    "totalDenied": 50,
    "totalCreditsSpent": 4200,
    "denyReasons": [{ "reason": "insufficient_credits", "count": 30 }, { "reason": "rate_limited", "count": 20 }]
  },
  "topConsumers": [
    { "name": "production-agent", "calls": 320, "credits": 1600, "denied": 5 },
    { "name": "batch-worker", "calls": 210, "credits": 1050, "denied": 0 }
  ],
  "topTools": [
    { "tool": "search", "calls": 450, "credits": 2250, "denied": 20 },
    { "tool": "generate", "calls": 300, "credits": 1500, "denied": 10 }
  ],
  "notifications": { "critical": 2, "warning": 3, "info": 2 },
  "uptime": { "startedAt": "2026-02-27T00:00:00.000Z", "uptimeSeconds": 86400, "uptimeHours": 24 }
}

Combines key counts by state, credit allocation and spending totals, usage breakdown with deny reasons, top 10 consumers ranked by credits spent, top 10 tools ranked by call count, notification severity counts, and server uptime. Read-only.

Key Lifecycle Report

Track key creation, revocation, suspension trends and identify at-risk keys:

# Full lifecycle report
curl http://localhost:3402/admin/lifecycle \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by date range
curl "http://localhost:3402/admin/lifecycle?since=2026-02-01&until=2026-02-28" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Response:

{
  "events": { "created": 25, "revoked": 3, "suspended": 5, "resumed": 4, "rotated": 2, "cloned": 1 },
  "trends": [
    { "date": "2026-02-25", "created": 5, "revoked": 1, "suspended": 2, "resumed": 1 },
    { "date": "2026-02-26", "created": 8, "revoked": 0, "suspended": 1, "resumed": 2 },
    { "date": "2026-02-27", "created": 12, "revoked": 2, "suspended": 2, "resumed": 1 }
  ],
  "averageLifetimeHours": 168.5,
  "atRisk": [
    { "key": "pg_c815...09a6", "name": "staging-agent", "risk": "expiring_soon", "details": { "expiresAt": "2026-03-01T...", "daysRemaining": 2.5 } },
    { "key": "pg_a3f1...b2e4", "name": "batch-worker", "risk": "zero_credits", "details": { "credits": 0 } }
  ]
}

Shows aggregated lifecycle event counts, daily trend buckets (sorted chronologically), average key lifetime in hours (for revoked keys), and at-risk keys with their risk category (expired, expiring_soon, zero_credits). Supports ?since= and ?until= date filters. Excludes suspended and revoked keys from at-risk list. Read-only.

Cost Analysis

Get a cost-centric breakdown of credit usage across tools, namespaces, and time:

# Full cost analysis
curl http://localhost:3402/admin/costs \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by namespace
curl "http://localhost:3402/admin/costs?namespace=prod" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Filter by time range
curl "http://localhost:3402/admin/costs?since=2026-02-01" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalCredits": 4250,
    "totalCalls": 312,
    "totalAllowed": 298,
    "totalDenied": 14,
    "avgCostPerCall": 13.62
  },
  "perTool": [
    { "tool": "generate_report", "calls": 85, "credits": 1700, "avgCost": 20 },
    { "tool": "query_data", "calls": 142, "credits": 1420, "avgCost": 10 }
  ],
  "perNamespace": [
    { "namespace": "prod", "calls": 210, "credits": 3150 },
    { "namespace": "staging", "calls": 102, "credits": 1100 }
  ],
  "hourlyTrends": [
    { "hour": "2026-02-26T14:00:00.000Z", "calls": 23, "credits": 345, "denied": 1 },
    { "hour": "2026-02-26T15:00:00.000Z", "calls": 31, "credits": 465, "denied": 0 }
  ],
  "topSpenders": [
    { "key": "pg_a1b2...c3d4", "name": "ml-pipeline", "credits": 1800, "calls": 90 },
    { "key": "pg_e5f6...g7h8", "name": "batch-worker", "credits": 1200, "calls": 120 }
  ]
}

Returns per-tool cost breakdown (with average cost per call), per-namespace spending, hourly trend buckets (last 24 hours), and top 10 spenders ranked by credits consumed. Supports ?since= and ?namespace= query filters. Keys without an explicit namespace appear under default. Read-only.

Rate Limit Analysis

Analyze rate limit utilization across keys and tools:

curl http://localhost:3402/admin/rate-limits \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "config": {
    "globalLimitPerMin": 60,
    "windowMs": 60000
  },
  "summary": {
    "totalCalls": 450,
    "totalRateLimited": 12,
    "rateLimitRate": 0.0267
  },
  "perKey": [
    { "name": "ml-pipeline", "calls": 200, "rateLimited": 8, "currentWindowUsed": 45, "currentWindowRemaining": 15 },
    { "name": "batch-worker", "calls": 150, "rateLimited": 4, "currentWindowUsed": 12, "currentWindowRemaining": 48 }
  ],
  "perTool": [
    { "tool": "generate_report", "calls": 180, "rateLimited": 10 },
    { "tool": "query_data", "calls": 270, "rateLimited": 2 }
  ],
  "hourlyTrends": [
    { "hour": "2026-02-26T14", "calls": 52, "rateLimited": 3 },
    { "hour": "2026-02-26T15", "calls": 48, "rateLimited": 1 }
  ],
  "mostThrottled": [
    { "name": "ml-pipeline", "rateLimited": 8, "calls": 200, "throttleRate": 0.04 }
  ]
}

Returns rate limit configuration, denial summary with throttle rate, per-key breakdown with current sliding window utilization, per-tool denial counts, hourly denial trends (last 24 hours), and top 10 most throttled keys ranked by denial count. Handles unlimited rate limits (globalLimitPerMin: 0). Read-only.

Quota Analysis

curl http://localhost:3402/admin/quotas -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "config": { "globalQuota": { "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 500, "monthlyCreditLimit": 5000 } },
  "summary": { "totalKeys": 5, "keysWithQuotas": 4, "totalQuotaDenials": 3, "quotaDenialRate": 0.02 },
  "perKey": [
    { "name": "heavy-user", "dailyCalls": 95, "monthlyCalls": 450, "dailyCredits": 475, "monthlyCredits": 2250, "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 500, "monthlyCreditLimit": 5000, "dailyCallUtilization": 0.95, "monthlyCallUtilization": 0.45, "source": "global" }
  ],
  "perTool": [{ "tool": "summarize", "calls": 120, "quotaDenied": 2 }],
  "hourlyTrends": [{ "hour": "2025-01-15T14", "calls": 15, "quotaDenied": 1 }],
  "mostConstrained": [{ "name": "heavy-user", "dailyCalls": 95, "dailyCallLimit": 100, "dailyCallUtilization": 0.95, "monthlyCalls": 450, "monthlyCallLimit": 1000, "monthlyCallUtilization": 0.45 }]
}

Returns quota configuration (global or null), key counts with/without quotas, denial summary with denial rate, per-key daily/monthly call and credit usage vs limits with utilization ratios (0–1, so 0.95 means 95% of the limit), quota source (per-key/global/none), per-tool quota denial counts, hourly denial trends (last 24 hours), and top 10 most constrained keys ranked by daily call utilization. Read-only.

Denial Analysis

curl http://localhost:3402/admin/denials -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalCalls": 150, "totalDenials": 12, "denialRate": 0.08 },
  "byReason": { "insufficient_credits": 5, "rate_limited": 4, "quota_exceeded": 2, "key_suspended": 1 },
  "perKey": [
    { "name": "heavy-user", "calls": 50, "denials": 8, "denialRate": 0.16, "topReason": "rate_limited" }
  ],
  "perTool": [{ "tool": "summarize", "calls": 80, "denials": 6, "denialRate": 0.075, "topReason": "insufficient_credits" }],
  "hourlyTrends": [{ "hour": "2025-01-15T14", "calls": 20, "denials": 3 }],
  "mostDenied": [{ "name": "heavy-user", "denials": 8, "calls": 50, "denialRate": 0.16, "topReason": "rate_limited" }]
}

Returns denial summary with denial rate, breakdown by canonical reason type (insufficient_credits, rate_limited, tool_rate_limited, quota_exceeded, key_suspended, api_key_expired, invalid_api_key, missing_api_key, tool_not_allowed, ip_not_allowed, spending_limit_exceeded, etc.), per-key denial counts with top reason, per-tool denial counts, hourly denial trends (last 24 hours), and top 10 most denied keys. Read-only.

Traffic Analysis

curl http://localhost:3402/admin/traffic -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalCalls": 500, "totalAllowed": 470, "totalDenied": 30, "successRate": 0.94, "uniqueKeys": 8, "uniqueTools": 3, "peakHour": "2025-01-15T14", "peakHourCalls": 85 },
  "toolPopularity": [{ "tool": "summarize", "calls": 250, "successRate": 0.96, "credits": 2500 }],
  "hourlyVolume": [{ "hour": "2025-01-15T14", "calls": 85, "allowed": 80, "denied": 5, "credits": 400 }],
  "topConsumers": [{ "name": "heavy-user", "calls": 150, "successRate": 0.92, "credits": 1380 }],
  "byNamespace": [{ "namespace": "production", "calls": 400, "allowed": 380, "credits": 3800 }]
}

Returns traffic summary with success rate and peak hour, tool popularity ranked by call count with success rates and credit totals, hourly volume (last 24 hours) with allowed/denied/credit breakdowns, top 10 consumers by call count, and namespace breakdown with per-namespace stats. Read-only.

Security Audit

curl http://localhost:3000/admin/security -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "score": 72,
  "summary": { "totalKeys": 5, "totalFindings": 12 },
  "findings": [
    { "type": "no_ip_allowlist", "severity": "warning", "keys": ["prod-key", "dev-key"], "description": "Keys without IP allowlists can be used from any IP address" },
    { "type": "no_acl_restriction", "severity": "info", "keys": ["dev-key"], "description": "Keys without ACL restrictions can access all tools" },
    { "type": "high_credit_balance", "severity": "warning", "keys": ["whale-key"], "description": "Keys with 10000+ credits are high-value targets if compromised" }
  ]
}

Returns a composite security score (0-100) with per-finding breakdown. Scans all active keys for: missing IP allowlists (warning), missing quotas (info), unrestricted ACLs (info), no spending limits (info), no expiry dates (info), and high credit balances (warning). Well-configured keys with IP restrictions, tool ACLs, quotas, spending limits, and expiry dates will not appear in any findings. Read-only; does not modify system state.

Revenue Analysis

curl http://localhost:3000/admin/revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalRevenue": 5000, "totalCalls": 250, "averageRevenuePerCall": 20 },
  "byTool": [{ "tool": "summarize", "revenue": 3000, "calls": 150, "averagePerCall": 20 }],
  "byKey": [{ "name": "heavy-user", "revenue": 2000, "calls": 80 }],
  "hourlyRevenue": [{ "hour": "2025-01-15T14", "revenue": 500, "calls": 25 }],
  "creditFlow": { "totalAllocated": 50000, "totalSpent": 5000, "totalRemaining": 45000 }
}

Returns revenue summary with total credits earned, per-tool revenue ranked by earnings with average per-call, top 10 per-key spending, hourly revenue trends (last 24 hours), and credit flow showing total allocated vs spent vs remaining across all active keys. Only counts successful (allowed) calls. Read-only.

Key Portfolio Health

curl http://localhost:3000/admin/key-portfolio -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalKeys": 10, "activeKeys": 7, "inactiveKeys": 2, "suspendedKeys": 1, "averageCreditUtilization": 0.35 },
  "staleKeys": [{ "name": "unused-key", "createdAt": "2025-01-01T00:00:00Z", "credits": 500, "ageDays": 30 }],
  "expiringSoon": [{ "name": "temp-key", "expiresAt": "2025-01-20T00:00:00Z", "hoursRemaining": 48, "credits": 100 }],
  "ageDistribution": { "averageAgeDays": 15, "oldestAgeDays": 60, "newestAgeDays": 0 },
  "byNamespace": [{ "namespace": "production", "total": 5, "active": 4, "suspended": 1 }]
}

Returns portfolio-wide key health: active/inactive/suspended counts, average credit utilization, stale keys (created but never used), keys expiring within 7 days sorted by urgency, age distribution statistics, and namespace breakdown. Read-only.

Anomaly Detection

curl http://localhost:3000/admin/anomalies -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalAnomalies": 3, "byType": { "high_denial_rate": 1, "rapid_credit_depletion": 1, "low_credits": 1 } },
  "anomalies": [
    { "type": "high_denial_rate", "severity": "warning", "keyName": "test-key", "description": "Key \"test-key\" has 80% denial rate (8/10 calls denied)" },
    { "type": "rapid_credit_depletion", "severity": "warning", "keyName": "fast-spender", "description": "Key \"fast-spender\" has used 95% of allocated credits (950/1000)" },
    { "type": "low_credits", "severity": "info", "keyName": "nearly-empty", "description": "Key \"nearly-empty\" has only 5 credits remaining (5% of allocated)" }
  ],
  "analyzedAt": "2025-01-15T14:30:00Z"
}

Scans all active keys for anomalous patterns: keys with >50% denial rates (3+ calls minimum), rapid credit depletion (>=75% spent), and low remaining credits (<=10 credits or <=10% remaining). Each anomaly includes type, severity, affected key name, and human-readable description. Read-only.

Usage Forecasting

curl http://localhost:3000/admin/forecast -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalActiveKeys": 3, "keysAtRisk": 1 },
  "keyForecasts": [
    { "keyName": "heavy-user", "creditsRemaining": 50, "totalSpent": 950, "callCount": 95, "avgCreditsPerCall": 10, "estimatedCallsRemaining": 5, "atRisk": true },
    { "keyName": "light-user", "creditsRemaining": 900, "totalSpent": 100, "callCount": 20, "avgCreditsPerCall": 5, "estimatedCallsRemaining": 180, "atRisk": false }
  ],
  "systemForecast": {
    "totalCreditsRemaining": 950,
    "totalCreditsSpent": 1050,
    "totalCalls": 115,
    "byTool": [
      { "tool": "expensive_tool", "calls": 50, "totalCredits": 500, "avgCreditsPerCall": 10 },
      { "tool": "cheap_tool", "calls": 65, "totalCredits": 325, "avgCreditsPerCall": 5 }
    ]
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Forecasts credit consumption for all active keys: per-key depletion estimates with calls remaining, at-risk identification (<=5 estimated calls), system-wide credit aggregates, and per-tool cost breakdown sorted by revenue. Keys with no usage history show estimatedCallsRemaining: null. Read-only.
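
The per-key estimate can be reproduced from the fields in the response. The sketch below assumes the estimate is the remaining balance divided by the historical average cost per call, rounded down; the rounding mode, the function name forecast_key, and the atRisk value for keys without history are assumptions, not confirmed behavior.

```python
def forecast_key(credits_remaining: int, total_spent: int, call_count: int) -> dict:
    """Estimate remaining calls for one key from its historical average cost."""
    if call_count == 0 or total_spent == 0:
        # No usage history: no estimate (mirrors estimatedCallsRemaining: null).
        # Reporting atRisk as False here is an assumption.
        return {"avgCreditsPerCall": None, "estimatedCallsRemaining": None, "atRisk": False}
    avg_credits_per_call = total_spent / call_count
    # Assumption: partial calls are rounded down.
    estimated_calls = int(credits_remaining // avg_credits_per_call)
    return {
        "avgCreditsPerCall": avg_credits_per_call,
        "estimatedCallsRemaining": estimated_calls,
        "atRisk": estimated_calls <= 5,  # at-risk threshold from the description
    }
```

Applied to the sample keys, forecast_key(50, 950, 95) gives 5 estimated calls and atRisk true, and forecast_key(900, 100, 20) gives 180 estimated calls and atRisk false.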

Compliance Report

curl http://localhost:3000/admin/compliance -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "keyGovernance": { "totalKeys": 5, "keysWithExpiry": 3, "keysWithoutExpiry": 2 },
  "accessControl": {
    "keysWithAcl": 3, "keysWithoutAcl": 2,
    "keysWithIpRestriction": 2, "keysWithoutIpRestriction": 3,
    "keysWithSpendingLimit": 4, "keysWithoutSpendingLimit": 1
  },
  "auditTrail": { "totalEvents": 150, "uniqueTools": 5, "uniqueKeys": 4 },
  "overallScore": 72,
  "recommendations": [
    "Set expiry dates on 2 key(s) without time-limited access",
    "Add tool ACL restrictions to 2 key(s) with unrestricted tool access"
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Compliance-ready report scoring key governance (expiry 25%), access control (ACL 25%, IP 20%, spending limits 15%), and audit trail (15%). Actionable recommendations for each gap. Read-only.

SLA Monitoring

curl http://localhost:3000/admin/sla -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalCalls": 150, "allowedCalls": 140, "deniedCalls": 10,
    "successRate": 93.33,
    "denialReasons": { "insufficient_credits": 6, "rate_limited": 3, "acl_denied": 1 }
  },
  "byTool": [
    { "tool": "tool_a", "totalCalls": 100, "allowedCalls": 95, "deniedCalls": 5, "successRate": 95 },
    { "tool": "tool_b", "totalCalls": 50, "allowedCalls": 45, "deniedCalls": 5, "successRate": 90 }
  ],
  "uptime": { "startedAt": "2025-01-15T10:00:00Z", "uptimeSeconds": 16200 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Service level metrics: overall success rate, denial breakdown by canonical reason (insufficient_credits, rate_limited, quota_exceeded, acl_denied, spending_limit, key_suspended, key_expired), per-tool availability sorted by call volume, and server uptime tracking. Read-only.

Capacity Planning

curl http://localhost:3000/admin/capacity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalCreditsAllocated": 10000, "totalCreditsSpent": 3500, "totalCreditsRemaining": 6500,
    "utilizationPct": 35,
    "burnRate": { "creditsPerCall": 10, "totalCalls": 350 }
  },
  "topConsumers": [
    { "keyName": "heavy-user", "creditsSpent": 2000, "creditsRemaining": 500, "callCount": 200 }
  ],
  "byNamespace": {
    "prod": { "allocated": 8000, "spent": 3000, "remaining": 5000, "keys": 3, "utilizationPct": 37 }
  },
  "recommendations": ["1 key(s) have less than 10% credits remaining"],
  "generatedAt": "2025-01-15T14:30:00Z"
}

System capacity analysis: overall credit utilization, burn rate (credits/call), top 10 consumers ranked by spend, per-namespace breakdown, and scaling recommendations for high utilization (>=75%) or depleted keys. Read-only.

Key Dependency Map

curl http://localhost:3000/admin/dependencies -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalTools": 5, "usedTools": 3, "unusedTools": 2 },
  "toolUsage": [
    { "tool": "search", "totalCalls": 150, "uniqueKeys": 8 },
    { "tool": "translate", "totalCalls": 45, "uniqueKeys": 3 }
  ],
  "keyToolMap": [
    { "keyName": "power-user", "tools": ["search", "translate", "summarize"], "toolCount": 3 },
    { "keyName": "basic-user", "tools": ["search"], "toolCount": 1 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Tool-to-key relationship map: shows which tools each key uses, tool popularity ranked by total calls, unique key counts per tool, and identifies orphaned tools (available but unused). Useful for understanding tool adoption and pruning unused capabilities. Read-only.

Tool Latency Analysis

curl http://localhost:3000/admin/latency -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalCalls": 200, "avgDurationMs": 45, "minDurationMs": 8, "maxDurationMs": 312, "p95DurationMs": 120 },
  "byTool": [
    { "tool": "translate", "totalCalls": 80, "avgDurationMs": 65, "minDurationMs": 20, "maxDurationMs": 312, "p95DurationMs": 150 },
    { "tool": "search", "totalCalls": 120, "avgDurationMs": 32, "minDurationMs": 8, "maxDurationMs": 95, "p95DurationMs": 78 }
  ],
  "slowestTools": [
    { "tool": "translate", "avgDurationMs": 65, "totalCalls": 80 }
  ],
  "byKey": [
    { "keyName": "heavy-user", "totalCalls": 150, "avgDurationMs": 48, "minDurationMs": 8, "maxDurationMs": 312 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-tool response time metrics: average, p95, min, and max durations for each tool sorted by slowest average first, top 10 slowest tools ranking, per-key latency breakdown, and global summary. Only counts successful (allowed) calls. Read-only.

Error Rate Trends

curl http://localhost:3000/admin/error-trends -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalCalls": 500, "totalDenials": 45, "overallErrorRate": 9, "trend": "improving" },
  "byTool": [
    { "tool": "translate", "totalCalls": 200, "denials": 30, "errorRate": 15 },
    { "tool": "search", "totalCalls": 300, "denials": 15, "errorRate": 5 }
  ],
  "denialReasons": [
    { "reason": "insufficient_credits", "count": 30 },
    { "reason": "rate_limited", "count": 15 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Denial rate trends: overall error rate, per-tool error rates sorted by worst-performing, denial reason breakdown, and trend direction (improving/degrading/stable based on first-half vs second-half comparison). Read-only.
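
One way to read the first-half vs second-half comparison is sketched below. It assumes the calls are ordered chronologically and split by count rather than by time window, and it introduces a hypothetical tolerance parameter for the "stable" band; the endpoint's actual split and thresholds are not documented here.

```python
from typing import Sequence


def trend_direction(denied_flags: Sequence[bool], tolerance: float = 0.0) -> str:
    """Compare the denial rate of the older half of calls with the newer half.

    denied_flags holds one entry per call, oldest first; True means denied.
    """
    mid = len(denied_flags) // 2
    first, second = denied_flags[:mid], denied_flags[mid:]
    if not first or not second:
        return "stable"  # assumption: too little data counts as stable
    rate_first = sum(first) / len(first)
    rate_second = sum(second) / len(second)
    if rate_second < rate_first - tolerance:
        return "improving"  # fewer denials recently
    if rate_second > rate_first + tolerance:
        return "degrading"  # more denials recently
    return "stable"
```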

Credit Flow Analysis

curl http://localhost:3000/admin/credit-flow -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalAllocated": 10000, "totalSpent": 3500, "totalRemaining": 6500, "utilizationPct": 35 },
  "topSpenders": [
    { "keyName": "heavy-user", "creditsSpent": 2000, "creditsRemaining": 500, "callCount": 200 }
  ],
  "byTool": [
    { "tool": "search", "creditsSpent": 2000, "callCount": 400 },
    { "tool": "translate", "creditsSpent": 1500, "callCount": 150 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Credit inflow/outflow analysis: total credits allocated (remaining + spent) vs spent vs remaining, utilization percentage, top 10 spenders ranked by credits consumed, and per-tool spend breakdown sorted by revenue. Read-only.

Key Age Analysis

curl http://localhost:3000/admin/key-age -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalKeys": 15, "avgAgeHours": 168.5,
    "oldestKey": { "keyName": "legacy", "ageHours": 720, "createdAt": "2025-01-01T00:00:00Z" },
    "newestKey": { "keyName": "fresh", "ageHours": 0.5, "createdAt": "2025-01-31T12:00:00Z" }
  },
  "distribution": { "last24h": 3, "last7d": 5, "last30d": 4, "older": 3 },
  "recentlyCreated": [
    { "keyName": "fresh", "ageHours": 0.5, "createdAt": "2025-01-31T12:00:00Z" }
  ],
  "generatedAt": "2025-01-31T12:30:00Z"
}

Key age distribution: average age across all active keys, oldest/newest key identification, age buckets (last 24h / 7d / 30d / older), and recently created list (newest first, top 10). Read-only.

Namespace Usage Summary

curl http://localhost:3000/admin/namespace-usage -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalNamespaces": 3 },
  "namespaces": [
    { "namespace": "prod", "keyCount": 5, "totalAllocated": 5000, "totalSpent": 2000, "totalRemaining": 3000, "totalCalls": 400, "utilizationPct": 40 },
    { "namespace": "staging", "keyCount": 2, "totalAllocated": 1000, "totalSpent": 200, "totalRemaining": 800, "totalCalls": 40, "utilizationPct": 20 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-namespace usage metrics: key counts, credit allocation/spending/remaining, call counts, and utilization percentages. Sorted by spending (highest first). Keys without a namespace appear under "default". Read-only.

Audit Summary

curl http://localhost:3000/admin/audit-summary -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalEvents": 142, "eventsLastHour": 18, "eventsLast24h": 95, "oldestEvent": "2025-01-14T08:00:00Z", "newestEvent": "2025-01-15T14:30:00Z" },
  "eventsByType": [
    { "type": "gate.allow", "count": 80 },
    { "type": "gate.deny", "count": 25 },
    { "type": "key.created", "count": 12 }
  ],
  "topActors": [
    { "actor": "pg_abc1...", "count": 60 },
    { "actor": "admin", "count": 30 }
  ],
  "recentEvents": [
    { "id": 142, "timestamp": "2025-01-15T14:30:00Z", "type": "gate.allow", "actor": "pg_abc1...", "message": "Allowed: tool_a" }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Audit event analytics: total events with hourly/daily counts, event type breakdown sorted by frequency, top 10 most active actors, and the 20 most recent events (newest first). Read-only.

Group Performance

curl http://localhost:3000/admin/group-performance -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": { "totalGroups": 2, "ungroupedKeys": 3 },
  "groups": [
    {
      "groupId": "grp_abc123", "groupName": "prod-team", "description": "Production",
      "keyCount": 5, "totalAllocated": 5000, "totalSpent": 2000, "totalRemaining": 3000,
      "totalCalls": 400, "utilizationPct": 40,
      "policy": { "allowedTools": ["tool_a"], "deniedTools": [], "rateLimitPerMin": 60 }
    }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-group analytics: key counts, credit allocation/spending/remaining, call volume, and utilization percentages. Includes group policy summary (allowed/denied tools, rate limits). Sorted by spending (highest first). Also reports ungrouped key count. Read-only.

Request Volume Trends

curl http://localhost:3000/admin/request-trends -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalRequests": 150, "totalAllowed": 130, "totalDenied": 20,
    "totalCredits": 650, "avgDurationMs": 45,
    "peakHour": { "hour": "2025-01-15T14:00:00Z", "total": 42 }
  },
  "hourly": [
    { "hour": "2025-01-15T12:00:00Z", "total": 35, "allowed": 30, "denied": 5, "credits": 150, "avgDurationMs": 40 },
    { "hour": "2025-01-15T13:00:00Z", "total": 42, "allowed": 38, "denied": 4, "credits": 190, "avgDurationMs": 50 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Hourly request volume time-series: total/allowed/denied counts, credit spend, and average duration per hour. Includes summary with peak hour identification. Built from request log data. Sorted chronologically. Read-only.

Key Status Overview

curl http://localhost:3000/admin/key-status -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "counts": { "total": 20, "active": 15, "suspended": 2, "revoked": 2, "expired": 1 },
  "needsAttention": [
    { "keyName": "low-balance", "issue": "low_credits", "detail": "5 credits remaining" },
    { "keyName": "trial-key", "issue": "expiring_soon", "detail": "Expires in 48 hours" }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Key status dashboard: active/suspended/revoked/expired counts with keys needing attention. Flags active keys with low credits (<=10) and near expiry (within 7 days). Read-only.

Webhook Health

curl http://localhost:3000/admin/webhook-health -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "configured": true,
  "status": "degraded",
  "delivery": {
    "totalDelivered": 142,
    "totalFailed": 3,
    "totalRetries": 5,
    "pendingRetries": 0,
    "deadLetterCount": 1,
    "bufferedEvents": 0,
    "paused": false,
    "pausedAt": null,
    "successRate": 97.93
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Webhook delivery health overview. Status is healthy, retrying, degraded (dead letters exist), paused, or not_configured. Includes success rate, pending retries, dead letter count, and buffered events. Read-only.

Consumer Insights

curl http://localhost:3000/admin/consumer-insights -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalConsumers": 15,
    "activeConsumers": 12,
    "totalCreditsSpent": 4850,
    "totalCalls": 970
  },
  "topSpenders": [
    { "name": "heavy-user", "totalSpent": 1200, "totalCalls": 240, "uniqueTools": 5 }
  ],
  "mostActive": [
    { "name": "heavy-user", "totalCalls": 240, "totalSpent": 1200, "uniqueTools": 5 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-key behavioral analytics. Top 10 spenders ranked by credits consumed, top 10 most active by call count. Each entry includes tool diversity (unique tools used). Summary shows total/active consumers and aggregate spend. Read-only.

System Health Score

curl http://localhost:3000/admin/system-health -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "score": 85,
  "level": "healthy",
  "components": {
    "keyHealth": { "score": 90, "weight": 0.4, "detail": "2 suspended" },
    "errorRate": { "score": 80, "weight": 0.35, "detail": "10% denial rate (5/50)" },
    "creditUtilization": { "score": 85, "weight": 0.25, "detail": "45% utilized (4500/10000 credits)" }
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Composite system health score 0-100 with weighted component breakdowns. Key health (40%): penalizes suspended/revoked/expired/low-credit keys. Error rate (35%): penalizes high denial rates. Credit utilization (25%): healthy at 10-80%, degrades at >80%. Levels: healthy (>=80), good (>=60), warning (>=40), critical (<40). Read-only.
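
The composite score is a weighted sum of the three component scores. The sketch below recomputes it from the sample response; rounding to the nearest integer and the function names are assumptions.

```python
# Weights as reported in the "components" object of the response.
WEIGHTS = {"keyHealth": 0.4, "errorRate": 0.35, "creditUtilization": 0.25}


def composite_score(component_scores: dict) -> int:
    """Weighted sum of component scores (each 0-100), rounded to an integer."""
    total = sum(component_scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total)  # assumption: nearest-integer rounding


def health_level(score: int) -> str:
    """Map a composite score to the documented level names."""
    if score >= 80:
        return "healthy"
    if score >= 60:
        return "good"
    if score >= 40:
        return "warning"
    return "critical"
```

For the sample components (90, 80, 85) the weighted sum is 36 + 28 + 21.25 = 85.25, which rounds to the reported score of 85 and maps to "healthy".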

Tool Adoption

curl http://localhost:3000/admin/tool-adoption -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tools": [
    {
      "tool": "search",
      "uniqueConsumers": 8,
      "adoptionRate": 80,
      "totalCalls": 245,
      "firstSeen": "2025-01-10T08:00:00Z",
      "lastSeen": "2025-01-15T14:30:00Z"
    },
    {
      "tool": "translate",
      "uniqueConsumers": 3,
      "adoptionRate": 30,
      "totalCalls": 42,
      "firstSeen": "2025-01-12T10:00:00Z",
      "lastSeen": "2025-01-15T12:00:00Z"
    }
  ],
  "summary": {
    "totalTools": 2,
    "usedTools": 2,
    "unusedTools": 0
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-tool adoption metrics showing which tools are being used and by how many consumers. uniqueConsumers counts distinct API keys that called the tool. adoptionRate is the percentage of active keys that have used the tool. Sorted by adoption rate descending, then by total calls. Read-only.
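
The adoption metrics can be derived from a flat call log. The sketch below assumes a list of (key name, tool) pairs and a known count of active keys; the input shape and the function name tool_adoption are hypothetical, and firstSeen/lastSeen are left out for brevity.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def tool_adoption(calls: Iterable[Tuple[str, str]], active_key_count: int) -> List[dict]:
    """Compute per-tool unique consumers, adoption rate, and call totals."""
    consumers = defaultdict(set)  # tool -> set of key names that called it
    totals = defaultdict(int)     # tool -> number of calls
    for key_name, tool in calls:
        consumers[tool].add(key_name)
        totals[tool] += 1
    rows = []
    for tool, keys in consumers.items():
        rate = round(100 * len(keys) / active_key_count) if active_key_count else 0
        rows.append({
            "tool": tool,
            "uniqueConsumers": len(keys),
            "adoptionRate": rate,  # percentage of active keys that used the tool
            "totalCalls": totals[tool],
        })
    # Adoption rate descending, then total calls descending.
    rows.sort(key=lambda row: (-row["adoptionRate"], -row["totalCalls"]))
    return rows
```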

Credit Efficiency

curl http://localhost:3000/admin/credit-efficiency -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalAllocated": 5000,
    "totalSpent": 1200,
    "totalRemaining": 3800,
    "burnEfficiency": 24,
    "wasteRatio": 76,
    "activeKeys": 10
  },
  "overProvisioned": [
    { "name": "idle-whale", "credits": 950, "totalAllocated": 1000, "totalSpent": 50, "remainingPercent": 95 }
  ],
  "underProvisioned": [
    { "name": "heavy-user", "credits": 3, "totalAllocated": 500, "totalSpent": 497, "remainingPercent": 1 }
  ],
  "generatedAt": "2025-01-15T14:30:00Z"
}

Credit allocation efficiency analysis. burnEfficiency is the percentage of allocated credits actually spent. wasteRatio is the percentage remaining unused. Over-provisioned keys have >90% remaining credits. Under-provisioned keys have <=10 credits or <=10% remaining with active usage. Top 10 in each category, sorted by urgency. Read-only.
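
The thresholds above can be expressed as a small classifier. This sketch assumes a key's allocation equals remaining plus spent credits and treats "active usage" as having spent at least one credit; both are assumptions, as are the function names.

```python
def burn_efficiency(total_allocated: int, total_spent: int) -> int:
    """Percentage of allocated credits actually spent."""
    if total_allocated == 0:
        return 0
    return round(100 * total_spent / total_allocated)


def classify_key(credits: int, total_spent: int) -> str:
    """Label a key as overProvisioned, underProvisioned, or ok."""
    allocated = credits + total_spent  # assumption: allocation = remaining + spent
    if allocated == 0:
        return "ok"
    remaining_pct = 100 * credits / allocated
    if remaining_pct > 90:
        return "overProvisioned"
    # Assumption: "active usage" means the key has spent at least one credit.
    if total_spent > 0 and (credits <= 10 or remaining_pct <= 10):
        return "underProvisioned"
    return "ok"
```

For the sample summary, burn_efficiency(5000, 1200) is 24, so wasteRatio is 100 - 24 = 76. The sample keys idle-whale (950 remaining, 50 spent) and heavy-user (3 remaining, 497 spent) classify as over- and under-provisioned respectively.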

Access Heatmap

curl http://localhost:3000/admin/access-heatmap -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "hourly": [
    {
      "hour": "2025-01-15T14:00:00.000Z",
      "total": 45,
      "uniqueConsumers": 8,
      "tools": { "search": 30, "translate": 15 }
    }
  ],
  "summary": {
    "totalRequests": 45,
    "totalHours": 1,
    "peakHour": { "hour": "2025-01-15T14:00:00.000Z", "total": 45 }
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Hourly access patterns for capacity planning. Each bucket shows total requests, unique consumers, and per-tool breakdown. Peak hour identification helps spot usage spikes. Only counts allowed requests. Sorted chronologically. Read-only.

Key Churn Analysis

curl http://localhost:3000/admin/key-churn -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "summary": {
    "totalKeys": 50,
    "activeKeys": 40,
    "revokedKeys": 5,
    "suspendedKeys": 3,
    "neverUsedKeys": 8,
    "churnRate": 10,
    "retentionRate": 90,
    "avgCreditsPerKey": 250
  },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Key churn analysis showing the health of your API key base. churnRate is the percentage of all keys that have been revoked. retentionRate is its complement (100 minus churnRate). neverUsedKeys counts active keys with zero total calls. avgCreditsPerKey shows average remaining credits across active keys. Read-only.

Tool Correlation

curl http://localhost:3000/admin/tool-correlation -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "pairs": [
    { "toolA": "search", "toolB": "translate", "sharedConsumers": 5, "strength": 50 },
    { "toolA": "search", "toolB": "summarize", "sharedConsumers": 3, "strength": 30 }
  ],
  "summary": { "totalPairs": 2, "totalConsumers": 10 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Tool co-occurrence analysis showing which tools are commonly used together. sharedConsumers counts API keys that used both tools. strength is the percentage of all consumers who use the pair. Sorted by shared consumers descending. Helps identify tool bundles and usage patterns. Read-only.
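
Pair counts of this kind can be built from a per-key set of tools. The sketch below assumes a mapping from key name to the tools that key has called, and counts a consumer as any key with at least one tool; the input shape, tie-breaking order, and function name are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set


def tool_pairs(tools_by_key: Dict[str, Set[str]]) -> List[dict]:
    """Count how many keys used each pair of tools together."""
    shared = Counter()
    for tools in tools_by_key.values():
        # Sorting makes (search, translate) and (translate, search) the same pair.
        for pair in combinations(sorted(tools), 2):
            shared[pair] += 1
    total_consumers = sum(1 for tools in tools_by_key.values() if tools)
    rows = [
        {
            "toolA": tool_a,
            "toolB": tool_b,
            "sharedConsumers": count,
            "strength": round(100 * count / total_consumers),  # % of all consumers
        }
        for (tool_a, tool_b), count in shared.items()
    ]
    # Shared consumers descending; alphabetical tie-break is an assumption.
    rows.sort(key=lambda row: (-row["sharedConsumers"], row["toolA"], row["toolB"]))
    return rows
```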

Consumer Segmentation

curl http://localhost:3000/admin/consumer-segmentation -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "segments": [
    { "segment": "power", "count": 3, "totalCredits": 500, "totalSpent": 2400, "avgCallsPerKey": 35 },
    { "segment": "regular", "count": 8, "totalCredits": 1200, "totalSpent": 800, "avgCallsPerKey": 12 },
    { "segment": "casual", "count": 15, "totalCredits": 3000, "totalSpent": 150, "avgCallsPerKey": 2 },
    { "segment": "dormant", "count": 5, "totalCredits": 1000, "totalSpent": 0, "avgCallsPerKey": 0 }
  ],
  "summary": { "totalConsumers": 31 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Classifies active API key consumers into segments based on usage: power (20+ calls), regular (5–19 calls), casual (1–4 calls), dormant (0 calls). Each segment includes aggregate metrics: count, total credits remaining, total spent, and average calls per key. Excludes revoked and suspended keys. Read-only.
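
The segment boundaries translate directly into code. In this sketch the per-key input fields (totalCalls, credits, totalSpent) and the choice to omit empty segments are assumptions; only the call-count thresholds come from the description.

```python
from typing import List

SEGMENT_ORDER = ["power", "regular", "casual", "dormant"]


def segment_for(call_count: int) -> str:
    """Map a key's total call count to its usage segment."""
    if call_count >= 20:
        return "power"
    if call_count >= 5:
        return "regular"
    if call_count >= 1:
        return "casual"
    return "dormant"


def summarize_segments(keys: List[dict]) -> List[dict]:
    """Aggregate active keys into segments with count, credits, spend, and average calls."""
    rows = []
    for segment in SEGMENT_ORDER:
        members = [k for k in keys if segment_for(k["totalCalls"]) == segment]
        if not members:
            continue  # assumption: empty segments are omitted
        rows.append({
            "segment": segment,
            "count": len(members),
            "totalCredits": sum(k["credits"] for k in members),
            "totalSpent": sum(k["totalSpent"] for k in members),
            "avgCallsPerKey": round(sum(k["totalCalls"] for k in members) / len(members)),
        })
    return rows
```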

Credit Distribution

curl http://localhost:3000/admin/credit-distribution -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "buckets": [
    { "range": "0-10", "count": 5, "totalCredits": 30 },
    { "range": "11-50", "count": 12, "totalCredits": 420 },
    { "range": "51-100", "count": 8, "totalCredits": 640 },
    { "range": "101-500", "count": 4, "totalCredits": 1200 },
    { "range": "1001+", "count": 2, "totalCredits": 5000 }
  ],
  "summary": { "totalKeys": 31, "medianCredits": 50 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Histogram of credit balances across active, non-suspended keys. Buckets: 0–10, 11–50, 51–100, 101–500, 501–1000, 1001+. Only non-empty buckets are returned. medianCredits is the median remaining balance. Useful for pricing analysis and capacity planning. Read-only.
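
A sketch of the bucketing, assuming non-negative integer balances and the standard definition of the median (mean of the two middle values for an even count). The function name credit_histogram is hypothetical.

```python
import statistics
from typing import List

# (label, lower bound, upper bound); None means no upper bound.
BUCKETS = [
    ("0-10", 0, 10),
    ("11-50", 11, 50),
    ("51-100", 51, 100),
    ("101-500", 101, 500),
    ("501-1000", 501, 1000),
    ("1001+", 1001, None),
]


def credit_histogram(balances: List[int]) -> dict:
    """Group remaining balances into the documented buckets, skipping empty ones."""
    rows = []
    for label, low, high in BUCKETS:
        members = [b for b in balances if b >= low and (high is None or b <= high)]
        if members:  # only non-empty buckets are returned
            rows.append({"range": label, "count": len(members), "totalCredits": sum(members)})
    median = statistics.median(balances) if balances else 0
    return {"buckets": rows, "summary": {"totalKeys": len(balances), "medianCredits": median}}
```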

Response Time Distribution

curl http://localhost:3000/admin/response-time-distribution -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "buckets": [
    { "range": "0-50ms", "count": 45, "percentage": 60 },
    { "range": "51-100ms", "count": 20, "percentage": 27 },
    { "range": "101-250ms", "count": 8, "percentage": 11 },
    { "range": "251-500ms", "count": 2, "percentage": 3 }
  ],
  "summary": { "totalRequests": 75, "avgResponseTime": 62, "p50": 42, "p95": 180, "p99": 350 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Histogram of response times across allowed tool calls. Buckets: 0–50ms, 51–100ms, 101–250ms, 251–500ms, 501–1000ms, 1001ms+. Includes percentile metrics (p50, p95, p99) and average response time. Only non-empty buckets are returned. Useful for SLA monitoring and performance optimization. Read-only.
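
The percentile fields can be approximated offline from raw durations. This sketch uses the nearest-rank method, which is an assumption; the endpoint may interpolate or round differently. The function names are hypothetical.

```python
import math
from typing import Iterable, List


def percentile(sorted_ms: List[float], pct: int) -> float:
    """Nearest-rank percentile of an ascending list of durations."""
    if not sorted_ms:
        return 0
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def latency_summary(durations_ms: Iterable[float]) -> dict:
    """Summarize durations with average and p50/p95/p99."""
    ordered = sorted(durations_ms)
    average = round(sum(ordered) / len(ordered)) if ordered else 0
    return {
        "totalRequests": len(ordered),
        "avgResponseTime": average,
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```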

Consumer Lifetime Value

curl http://localhost:3000/admin/consumer-lifetime-value -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "consumers": [
    { "name": "enterprise-bot", "lifetimeValue": 2500, "totalCalls": 500, "avgSpendPerCall": 5, "toolsUsed": 8, "tier": "high" },
    { "name": "dev-team", "lifetimeValue": 450, "totalCalls": 90, "avgSpendPerCall": 5, "toolsUsed": 4, "tier": "high" },
    { "name": "trial-user", "lifetimeValue": 5, "totalCalls": 1, "avgSpendPerCall": 5, "toolsUsed": 1, "tier": "low" }
  ],
  "summary": { "totalConsumers": 15, "totalLifetimeValue": 3200, "avgLifetimeValue": 213 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-consumer value analysis for active keys with usage. Value tiers: high (100+ credits spent), medium (10–99), low (<10). toolsUsed shows tool diversity. Top 20 consumers by lifetime value. Zero-spend consumers excluded from list. avgLifetimeValue uses all active keys as denominator. Read-only.

Tool Revenue Ranking

curl http://localhost:3000/admin/tool-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tools": [
    { "tool": "code_review", "totalCredits": 1500, "callCount": 300, "avgCreditsPerCall": 5, "uniqueConsumers": 25, "percentage": 60 },
    { "tool": "generate_tests", "totalCredits": 750, "callCount": 150, "avgCreditsPerCall": 5, "uniqueConsumers": 18, "percentage": 30 },
    { "tool": "lint_check", "totalCredits": 250, "callCount": 50, "avgCreditsPerCall": 5, "uniqueConsumers": 12, "percentage": 10 }
  ],
  "summary": { "totalTools": 3, "totalRevenue": 2500, "topTool": "code_review" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Ranks tools by total credits consumed from allowed requests. Each tool entry includes call count, average credits per call, unique consumer count, and revenue percentage. topTool is the highest revenue generator. Only allowed requests are counted; denied requests are excluded. Sorted by total credits descending. Read-only.

Consumer Retention Cohorts

curl http://localhost:3000/admin/consumer-retention -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "cohorts": [
    { "period": "2025-01-15", "created": 10, "retained": 8, "retentionRate": 80, "avgSpend": 150 },
    { "period": "2025-01-14", "created": 5, "retained": 3, "retentionRate": 60, "avgSpend": 80 }
  ],
  "summary": { "totalKeys": 15, "retainedKeys": 11, "overallRetentionRate": 73 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Groups active consumers by creation date (YYYY-MM-DD cohorts). A consumer is "retained" if they have at least 1 tool call. Per-cohort: created count, retained count, retention rate percentage, and average spend. Excludes revoked/suspended keys. Cohorts sorted newest first. Read-only.
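
Cohort grouping of this kind can be sketched as follows. The input fields (createdAt as an ISO 8601 UTC string, totalCalls, totalSpent) are assumed, and avgSpend is averaged over every key in the cohort, which is also an assumption.

```python
from collections import defaultdict
from typing import Iterable, List


def retention_cohorts(keys: Iterable[dict]) -> List[dict]:
    """Group keys by creation date and compute per-cohort retention."""
    cohorts = defaultdict(list)
    for key in keys:
        period = key["createdAt"][:10]  # YYYY-MM-DD prefix of the ISO timestamp
        cohorts[period].append(key)
    rows = []
    for period in sorted(cohorts, reverse=True):  # newest cohort first
        members = cohorts[period]
        retained = sum(1 for k in members if k["totalCalls"] >= 1)
        rows.append({
            "period": period,
            "created": len(members),
            "retained": retained,
            "retentionRate": round(100 * retained / len(members)),
            # Assumption: average over all keys in the cohort, retained or not.
            "avgSpend": round(sum(k["totalSpent"] for k in members) / len(members)),
        })
    return rows
```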

Error Breakdown

curl http://localhost:3000/admin/error-breakdown -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "errors": [
    { "reason": "insufficient_credits", "count": 45, "percentage": 75, "affectedConsumers": 12 },
    { "reason": "rate_limited", "count": 10, "percentage": 17, "affectedConsumers": 3 },
    { "reason": "acl_denied", "count": 5, "percentage": 8, "affectedConsumers": 2 }
  ],
  "summary": { "totalDenied": 60, "totalAllowed": 940, "errorRate": 6 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Categorizes denied requests by deny reason for root-cause analysis. Per-reason: count, percentage of total denials, and affected consumer count. errorRate is the percentage of total requests that were denied. Sorted by count descending. Read-only.

Credit Utilization Rate

curl http://localhost:3000/admin/credit-utilization -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "bands": [
    { "range": "0%", "count": 5, "percentage": 25 },
    { "range": "1-25%", "count": 8, "percentage": 40 },
    { "range": "26-50%", "count": 4, "percentage": 20 },
    { "range": "51-75%", "count": 2, "percentage": 10 },
    { "range": "76-99%", "count": 1, "percentage": 5 }
  ],
  "summary": { "totalAllocated": 10000, "totalSpent": 3500, "overallUtilization": 35, "totalKeys": 20 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Shows what percentage of allocated credits are being used across active keys. Utilization bands: 0% (unused), 1-25%, 26-50%, 51-75%, 76-99%, 100% (fully consumed). totalAllocated = remaining credits + spent credits (original allocation). Excludes revoked/suspended keys. Read-only.
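
A per-key band assignment might look like the sketch below. It derives allocation as remaining plus spent, as described above. How fractional percentages near the band edges are handled is not documented, so the rounding and clamping here are assumptions, as is the function name.

```python
def utilization_band(credits_remaining: int, total_spent: int) -> str:
    """Return the utilization band label for one key."""
    allocated = credits_remaining + total_spent  # original allocation
    if allocated == 0 or total_spent == 0:
        return "0%"
    if credits_remaining == 0:
        return "100%"
    # Assumption: round to a whole percent, keeping partial usage inside 1-99.
    pct = min(99, max(1, round(100 * total_spent / allocated)))
    if pct <= 25:
        return "1-25%"
    if pct <= 50:
        return "26-50%"
    if pct <= 75:
        return "51-75%"
    return "76-99%"
```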

Namespace Revenue

curl http://localhost:3000/admin/namespace-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "namespaces": [
    { "namespace": "team-alpha", "totalSpent": 1500, "totalCalls": 300, "keyCount": 5, "percentage": 60 },
    { "namespace": "team-beta", "totalSpent": 750, "totalCalls": 150, "keyCount": 3, "percentage": 30 },
    { "namespace": "default", "totalSpent": 250, "totalCalls": 50, "keyCount": 2, "percentage": 10 }
  ],
  "summary": { "totalNamespaces": 3, "totalRevenue": 2500, "topNamespace": "team-alpha" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Revenue breakdown by namespace. Keys without a namespace are grouped as "default". Per-namespace: total credits spent, call count, key count, and revenue percentage. topNamespace is the highest revenue generator. Excludes revoked/suspended keys. Sorted by spend descending. Read-only.

Group Revenue

curl http://localhost:3000/admin/group-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "groups": [
    { "group": "premium", "totalSpent": 2000, "totalCalls": 400, "keyCount": 8, "percentage": 65 },
    { "group": "free-tier", "totalSpent": 800, "totalCalls": 160, "keyCount": 12, "percentage": 26 },
    { "group": "ungrouped", "totalSpent": 280, "totalCalls": 56, "keyCount": 3, "percentage": 9 }
  ],
  "summary": { "totalGroups": 3, "totalRevenue": 3080, "topGroup": "premium" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Revenue breakdown by key group. Keys not assigned to any group are shown as "ungrouped". Group IDs are resolved to human-readable names. Per-group: total credits spent, call count, key count, and revenue percentage. topGroup is the highest revenue generator. Excludes revoked/suspended keys. Sorted by spend descending. Read-only.

Peak Usage Times

curl http://localhost:3000/admin/peak-usage -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "hours": [
    { "hour": 9, "requests": 450, "allowed": 420, "denied": 30, "credits": 2100, "uniqueConsumers": 15, "percentage": 30 },
    { "hour": 14, "requests": 380, "allowed": 370, "denied": 10, "credits": 1850, "uniqueConsumers": 12, "percentage": 25 },
    { "hour": 22, "requests": 120, "allowed": 118, "denied": 2, "credits": 590, "uniqueConsumers": 5, "percentage": 8 }
  ],
  "summary": { "totalRequests": 1500, "peakHour": 9, "peakRequests": 450 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Traffic patterns by hour-of-day (UTC). Per-hour: total requests, allowed/denied split, credits spent, unique consumers, and traffic percentage. peakHour identifies the busiest hour for capacity planning. Hours are 0-23 (UTC), sorted ascending. Only hours with traffic are included. Read-only.
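
Bucketing by hour-of-day can be sketched from request timestamps. The example assumes ISO 8601 timestamps that carry a UTC offset or a trailing Z; the function names are hypothetical and only the request count and percentage fields are reproduced.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable


def hour_of_day_utc(timestamp: str) -> int:
    """Return the UTC hour (0-23) of an ISO 8601 timestamp."""
    # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalize it.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc).hour


def peak_usage(timestamps: Iterable[str]) -> dict:
    """Count requests per UTC hour-of-day and identify the busiest hour."""
    counts = Counter(hour_of_day_utc(ts) for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        return {"hours": [], "summary": {"totalRequests": 0, "peakHour": None, "peakRequests": 0}}
    hours = [
        {"hour": hour, "requests": counts[hour], "percentage": round(100 * counts[hour] / total)}
        for hour in sorted(counts)  # ascending; hours without traffic are absent
    ]
    peak_hour, peak_requests = max(counts.items(), key=lambda item: item[1])
    return {
        "hours": hours,
        "summary": {"totalRequests": total, "peakHour": peak_hour, "peakRequests": peak_requests},
    }
```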

Consumer Activity

curl http://localhost:3000/admin/consumer-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "consumers": [
    { "name": "alice", "totalCalls": 150, "totalSpent": 750, "creditsRemaining": 250, "lastActive": "2025-01-15T14:30:00Z", "status": "active" },
    { "name": "bob", "totalCalls": 0, "totalSpent": 0, "creditsRemaining": 500, "lastActive": null, "status": "inactive" }
  ],
  "summary": { "totalConsumers": 2, "activeConsumers": 1, "inactiveConsumers": 1 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-consumer activity metrics. Shows each key's call count, total spend, credits remaining, last active timestamp, and active/inactive status. Consumers with zero calls are reported as "inactive" with lastActive: null. Excludes revoked/suspended keys. Sorted by spend descending. Read-only.

Tool Popularity

curl http://localhost:3000/admin/tool-popularity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tools": [
    { "tool": "search", "totalCalls": 500, "totalCredits": 2500, "uniqueConsumers": 20, "percentage": 50 },
    { "tool": "generate", "totalCalls": 300, "totalCredits": 3000, "uniqueConsumers": 15, "percentage": 30 },
    { "tool": "translate", "totalCalls": 200, "totalCredits": 1000, "uniqueConsumers": 10, "percentage": 20 }
  ],
  "summary": { "totalTools": 3, "totalCalls": 1000, "mostPopular": "search" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Tool usage popularity ranking. Per-tool: total calls, credits spent, unique consumers, and call percentage. Only counts allowed (successful) requests. mostPopular identifies the most-called tool. Sorted by call count descending. Read-only.

Credit Allocation Summary

curl http://localhost:3000/admin/credit-allocation -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tiers": [
    { "tier": "1-100", "count": 15, "totalCredits": 750, "percentage": 5.0 },
    { "tier": "101-500", "count": 30, "totalCredits": 9000, "percentage": 60.0 },
    { "tier": "501+", "count": 5, "totalCredits": 5250, "percentage": 35.0 }
  ],
  "summary": { "totalKeys": 50, "totalAllocated": 15000, "totalRemaining": 12000, "totalSpent": 3000, "averageAllocation": 300 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Credit allocation distribution across active keys. Groups keys into allocation tiers (1-100, 101-500, 501+) with count, total credits, and percentage per tier. Summary includes total keys, total allocated/remaining/spent credits, and average allocation per key. Excludes revoked/suspended keys. Read-only.

Daily Summary

curl http://localhost:3000/admin/daily-summary -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "days": [
    { "date": "2025-01-15", "requests": 150, "allowed": 140, "denied": 10, "creditsSpent": 700, "uniqueConsumers": 25, "uniqueTools": 8, "newKeys": 3 },
    { "date": "2025-01-14", "requests": 120, "allowed": 115, "denied": 5, "creditsSpent": 575, "uniqueConsumers": 20, "uniqueTools": 7, "newKeys": 1 }
  ],
  "summary": { "totalDays": 2, "totalRequests": 270, "totalCreditsSpent": 1275, "averageRequestsPerDay": 135 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Daily activity rollup for trend analysis. Per-day: total requests, allowed/denied breakdown, credits spent, unique consumers, unique tools, and new keys created. Summary includes total days, total requests, total credits, and average requests per day. Sorted by date descending (most recent first). Read-only.

Key Ranking

curl http://localhost:3000/admin/key-ranking -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Sort by calls: ?sortBy=totalCalls
# Sort by credits remaining: ?sortBy=creditsRemaining

{
  "rankings": [
    { "rank": 1, "name": "power-user", "totalSpent": 500, "totalCalls": 100, "creditsRemaining": 500 },
    { "rank": 2, "name": "moderate-user", "totalSpent": 200, "totalCalls": 40, "creditsRemaining": 800 },
    { "rank": 3, "name": "light-user", "totalSpent": 50, "totalCalls": 10, "creditsRemaining": 950 }
  ],
  "summary": { "totalKeys": 3, "sortedBy": "totalSpent" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Key leaderboard ranked by a configurable metric. Sorts by totalSpent descending by default. Use ?sortBy=totalCalls or ?sortBy=creditsRemaining for alternative rankings. Each entry includes rank number, name, spend, calls, and credits remaining. Excludes revoked/suspended keys. Read-only.

Hourly Traffic

curl http://localhost:3000/admin/hourly-traffic -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "hours": [
    { "timestamp": "2025-01-15T14:00:00Z", "requests": 45, "allowed": 42, "denied": 3, "credits": 210, "uniqueConsumers": 12, "uniqueTools": 5 },
    { "timestamp": "2025-01-15T13:00:00Z", "requests": 30, "allowed": 28, "denied": 2, "credits": 140, "uniqueConsumers": 8, "uniqueTools": 4 }
  ],
  "summary": { "totalRequests": 75, "totalCredits": 350, "busiestHour": "2025-01-15T14:00:00Z", "busiestHourRequests": 45 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Granular per-hour request metrics. Per-hour: total requests, allowed/denied breakdown, credits spent, unique consumers, and unique tools. Summary includes totals and identifies the busiest hour. Sorted by timestamp descending (most recent first). Read-only.

Tool Error Rate

curl http://localhost:3000/admin/tool-error-rate -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tools": [
    { "tool": "translate", "totalRequests": 100, "allowed": 85, "denied": 15, "errorRate": 15 },
    { "tool": "search", "totalRequests": 200, "allowed": 190, "denied": 10, "errorRate": 5 },
    { "tool": "generate", "totalRequests": 150, "allowed": 150, "denied": 0, "errorRate": 0 }
  ],
  "summary": { "totalTools": 3, "overallErrorRate": 5.56, "highestErrorTool": "translate" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-tool error/denial rate analysis. Per-tool: total requests, allowed/denied counts, and error rate percentage. Summary includes total tools, overall error rate, and identifies the tool with highest error rate. Sorted by error rate descending. Read-only.
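The arithmetic behind the example values can be sketched as follows. This is an illustration only, not PayGate's implementation: the function names are hypothetical, and rounding to two decimals is an assumption inferred from the 5.56 shown in the sample summary.

```typescript
// Hypothetical shape mirroring the fields in the example response.
interface ToolCounts {
  tool: string;
  allowed: number;
  denied: number;
}

// errorRate = denied / (allowed + denied) * 100, rounded to two decimals.
// Returns 0 when a tool has no traffic at all.
function errorRate(allowed: number, denied: number): number {
  const total = allowed + denied;
  if (total === 0) return 0;
  return Math.round((denied / total) * 10000) / 100;
}

// The overall rate is computed from summed counts,
// not by averaging the per-tool percentages.
function overallErrorRate(tools: ToolCounts[]): number {
  const allowed = tools.reduce((sum, t) => sum + t.allowed, 0);
  const denied = tools.reduce((sum, t) => sum + t.denied, 0);
  return errorRate(allowed, denied);
}
```

With the sample data (15, 10, and 0 denials out of 100, 200, and 150 requests) the overall rate is 25 / 450, which rounds to 5.56.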

Consumer Spend Velocity

curl http://localhost:3000/admin/consumer-spend-velocity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "consumers": [
    { "name": "power-user", "totalSpent": 500, "creditsRemaining": 500, "creditsPerHour": 25.5, "hoursUntilDepleted": 19.61 },
    { "name": "casual-user", "totalSpent": 50, "creditsRemaining": 950, "creditsPerHour": 2.1, "hoursUntilDepleted": 452.38 },
    { "name": "idle-user", "totalSpent": 0, "creditsRemaining": 100, "creditsPerHour": 0, "hoursUntilDepleted": null }
  ],
  "summary": { "totalConsumers": 3, "fastestSpender": "power-user" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-consumer spend velocity analysis. Per-consumer: total spent, credits remaining, credits per hour rate, and estimated hours until depletion. Zero-spend consumers have creditsPerHour: 0 and hoursUntilDepleted: null. Summary identifies the fastest spender. Excludes revoked/suspended keys. Sorted by spend rate descending. Read-only.
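The depletion estimate in the example is consistent with dividing remaining credits by the hourly rate. A minimal sketch, assuming two-decimal rounding (the function name is hypothetical):

```typescript
// Estimated hours until a key runs out of credits.
// Returns null for zero-spend consumers, matching the documented behavior.
function hoursUntilDepleted(creditsRemaining: number, creditsPerHour: number): number | null {
  if (creditsPerHour <= 0) return null;
  return Math.round((creditsRemaining / creditsPerHour) * 100) / 100;
}
```

For power-user in the sample response, 500 / 25.5 rounds to 19.61 hours.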

Namespace Activity

curl http://localhost:3000/admin/namespace-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "namespaces": [
    { "namespace": "production", "keyCount": 5, "totalSpent": 1200, "totalCalls": 240, "creditsRemaining": 3800 },
    { "namespace": "staging", "keyCount": 2, "totalSpent": 80, "totalCalls": 16, "creditsRemaining": 920 },
    { "namespace": "default", "keyCount": 1, "totalSpent": 0, "totalCalls": 0, "creditsRemaining": 100 }
  ],
  "summary": { "totalNamespaces": 3, "topNamespace": "production" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-namespace activity breakdown for multi-tenant visibility. Per-namespace: key count, total spend, total calls, credits remaining. Keys without a namespace are grouped as "default". Summary identifies the top namespace by spend. Excludes revoked/suspended keys. Sorted by totalSpent descending. Read-only.

Credit Burn Rate

curl http://localhost:3000/admin/credit-burn-rate -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "burnRate": { "creditsPerHour": 45.5, "hoursUntilDepleted": 82.42, "utilizationPercent": 25 },
  "summary": { "totalAllocated": 5000, "totalSpent": 1250, "totalRemaining": 3750, "activeKeys": 10 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

System-wide credit burn rate analysis. Shows aggregate credits/hour burn rate, utilization percentage (spent/allocated), and estimated hours until all credits are depleted. Summary includes total allocated, spent, remaining, and active key count. Zero-spend systems show creditsPerHour: 0 and hoursUntilDepleted: null. Excludes revoked/suspended keys. Read-only.

Consumer Risk Score

curl http://localhost:3000/admin/consumer-risk-score -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "consumers": [
    { "name": "heavy-user", "riskScore": 80, "riskLevel": "critical", "creditsRemaining": 20, "totalSpent": 80, "utilizationPercent": 80 },
    { "name": "normal-user", "riskScore": 25, "riskLevel": "medium", "creditsRemaining": 150, "totalSpent": 50, "utilizationPercent": 25 },
    { "name": "idle-user", "riskScore": 0, "riskLevel": "low", "creditsRemaining": 100, "totalSpent": 0, "utilizationPercent": 0 }
  ],
  "summary": { "totalConsumers": 3, "riskDistribution": { "low": 1, "medium": 1, "high": 0, "critical": 1 } },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-consumer risk scoring based on credit utilization. Risk score (0–100) maps to levels: low (0–24), medium (25–49), high (50–74), critical (75–100). Per-consumer: risk score, risk level, credits remaining, total spent, utilization percentage. Summary includes risk distribution counts. Excludes revoked/suspended keys. Sorted by riskScore descending. Read-only.
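The level thresholds can be expressed as a small lookup. In the sample response the risk score equals the utilization percentage, so this sketch assumes score = spent / (spent + remaining) × 100; the server may weigh other signals, and both function names are hypothetical.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";

// Assumption: the score is the rounded credit utilization percentage.
function riskScore(totalSpent: number, creditsRemaining: number): number {
  const allocated = totalSpent + creditsRemaining;
  if (allocated === 0) return 0;
  return Math.round((totalSpent / allocated) * 100);
}

// Maps a 0-100 score to the documented levels:
// low (0-24), medium (25-49), high (50-74), critical (75-100).
function riskLevel(score: number): RiskLevel {
  if (score >= 75) return "critical";
  if (score >= 50) return "high";
  if (score >= 25) return "medium";
  return "low";
}
```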

Revenue Forecast

curl http://localhost:3000/admin/revenue-forecast -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "forecast": { "nextHour": 45.5, "nextDay": 1092, "nextWeek": 7644, "nextMonth": 32760 },
  "current": { "totalSpent": 1250, "totalRemaining": 48750, "creditsPerHour": 45.5, "activeKeys": 10 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Projected revenue based on current spend trends. Forecasts for next hour, day, week, and month are extrapolated from aggregate credits/hour rate and capped by total remaining credits. Includes current totals and active key count. Zero-spend systems show zero forecasts. Excludes revoked/suspended keys. Read-only.
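A sketch of the extrapolation, under two assumptions drawn from the sample values: a month is treated as 30 days (45.5 × 720 = 32760) and each projection is capped at the total remaining credits. The function and interface names are hypothetical.

```typescript
interface Forecast {
  nextHour: number;
  nextDay: number;
  nextWeek: number;
  nextMonth: number;
}

// Linear projection of the current credits/hour rate,
// capped because consumers cannot spend more than they hold.
function forecastRevenue(creditsPerHour: number, totalRemaining: number): Forecast {
  const project = (hours: number): number =>
    Math.min(Math.round(creditsPerHour * hours * 100) / 100, totalRemaining);
  return {
    nextHour: project(1),
    nextDay: project(24),
    nextWeek: project(24 * 7),
    nextMonth: project(24 * 30), // assumes a 30-day month
  };
}
```

A zero rate yields zero for every horizon, which matches the documented behavior for zero-spend systems.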

System Overview

curl http://localhost:3000/admin/system-overview -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "keys": { "total": 15, "active": 12, "revoked": 2, "suspended": 1 },
  "credits": { "totalAllocated": 150000, "totalSpent": 45000, "totalRemaining": 105000, "utilizationPercent": 30 },
  "activity": { "totalCalls": 3500, "uniqueTools": 8 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Executive summary of the entire system. Key counts by status (active, revoked, suspended). Credit totals with utilization percentage. Activity metrics including total calls and unique tools used. Single endpoint for dashboards and monitoring integrations. Read-only.

Key Health Overview

curl http://localhost:3000/admin/key-health-overview -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "keys": [
    { "name": "depleted-key", "credits": 0, "totalSpent": 1000, "totalCalls": 85, "utilizationPercent": 100, "status": "critical" },
    { "name": "active-key", "credits": 200, "totalSpent": 800, "totalCalls": 60, "utilizationPercent": 80, "status": "warning" },
    { "name": "healthy-key", "credits": 9000, "totalSpent": 1000, "totalCalls": 50, "utilizationPercent": 10, "status": "healthy" }
  ],
  "summary": { "totalKeys": 3, "healthDistribution": { "healthy": 1, "warning": 1, "critical": 1 } },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Holistic per-key health check. Status levels: critical (0 credits remaining), warning (≥75% utilization), healthy (below thresholds). Summary includes health distribution counts. Sorted by credits ascending (most depleted first). Excludes revoked/suspended keys. Read-only.
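The status rules can be sketched as below. Utilization is assumed to be spent / (credits + spent), which reproduces the sample values; the function name is hypothetical.

```typescript
type HealthStatus = "healthy" | "warning" | "critical";

interface KeyHealth {
  utilizationPercent: number;
  status: HealthStatus;
}

// critical: no credits left; warning: at least 75% utilized; otherwise healthy.
function keyHealth(credits: number, totalSpent: number): KeyHealth {
  const allocated = credits + totalSpent;
  const utilizationPercent =
    allocated === 0 ? 0 : Math.round((totalSpent / allocated) * 100);
  let status: HealthStatus = "healthy";
  if (credits <= 0) {
    status = "critical";
  } else if (utilizationPercent >= 75) {
    status = "warning";
  }
  return { utilizationPercent, status };
}
```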

Namespace Comparison

curl http://localhost:3000/admin/namespace-comparison -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "namespaces": [
    { "namespace": "production", "keyCount": 5, "totalAllocated": 50000, "totalSpent": 12000, "totalCalls": 800, "creditsRemaining": 38000, "utilizationPercent": 24 },
    { "namespace": "staging", "keyCount": 3, "totalAllocated": 3000, "totalSpent": 500, "totalCalls": 50, "creditsRemaining": 2500, "utilizationPercent": 17 }
  ],
  "summary": { "totalNamespaces": 2, "leader": "production" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Side-by-side namespace comparison. Per namespace: key count, total allocated credits, total spent, total calls, credits remaining, utilization percentage. Keys without a namespace appear under "default". Summary includes namespace count and leading namespace (highest allocation). Sorted by totalAllocated descending. Excludes revoked/suspended keys. Read-only.

Consumer Growth

curl http://localhost:3000/admin/consumer-growth -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "consumers": [
    { "name": "enterprise-key", "ageHours": 720, "totalSpent": 4500, "creditsAllocated": 10000, "spendRate": 6.25 },
    { "name": "trial-key", "ageHours": 24, "totalSpent": 10, "creditsAllocated": 100, "spendRate": 0.42 }
  ],
  "summary": { "totalConsumers": 2, "newConsumers24h": 1 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Consumer growth metrics per key. Per consumer: age in hours since creation, total credits spent, original credits allocated (credits + totalSpent), spend rate (credits/hour). Summary includes total active consumer count and new consumers created in the last 24 hours. Sorted by creditsAllocated descending. Excludes revoked/suspended keys. Read-only.

Tool Profitability

curl http://localhost:3000/admin/tool-profitability -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "tools": [
    { "tool": "search", "totalCalls": 150, "totalRevenue": 450, "avgRevenuePerCall": 3, "callerCount": 8 },
    { "tool": "translate", "totalCalls": 40, "totalRevenue": 120, "avgRevenuePerCall": 3, "callerCount": 3 }
  ],
  "summary": { "totalRevenue": 570, "mostProfitable": "search", "leastProfitable": "translate" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-tool profitability analysis based on actual tool call revenue. Per tool: total calls, total revenue (credits spent), average revenue per call, unique caller count. Sorted by totalRevenue descending. Summary includes most/least profitable tools and total revenue across all tools. Read-only.

Credit Waste Analysis

curl http://localhost:3000/admin/credit-waste -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "keys": [
    { "name": "unused-key", "creditsAllocated": 1000, "creditsUsed": 0, "creditsRemaining": 1000, "wastePercent": 100 },
    { "name": "active-key", "creditsAllocated": 500, "creditsUsed": 350, "creditsRemaining": 150, "wastePercent": 30 }
  ],
  "summary": { "totalAllocated": 1500, "totalWasted": 1150, "averageWastePercent": 65 },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Per-key credit waste analysis showing allocated vs used credits. Waste percent = remaining / allocated × 100 (100% = fully unused, 0% = fully utilized). Summary includes total allocated credits, total wasted (remaining), and average waste percentage. Sorted by wastePercent descending (biggest wasters first). Excludes revoked/suspended keys. Read-only.
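The sample summary implies that averageWastePercent is the mean of the per-key percentages ((100 + 30) / 2 = 65), not total remaining over total allocated (1150 / 1500 ≈ 77). A sketch under that reading, with hypothetical names:

```typescript
interface KeyCredits {
  name: string;
  creditsAllocated: number;
  creditsRemaining: number;
}

// Waste percent = remaining / allocated * 100.
function wastePercent(key: KeyCredits): number {
  if (key.creditsAllocated === 0) return 0;
  return Math.round((key.creditsRemaining / key.creditsAllocated) * 100);
}

// Summary totals plus the mean of per-key waste percentages.
function wasteSummary(keys: KeyCredits[]) {
  const totalAllocated = keys.reduce((sum, k) => sum + k.creditsAllocated, 0);
  const totalWasted = keys.reduce((sum, k) => sum + k.creditsRemaining, 0);
  const averageWastePercent =
    keys.length === 0
      ? 0
      : Math.round(keys.reduce((sum, k) => sum + wastePercent(k), 0) / keys.length);
  return { totalAllocated, totalWasted, averageWastePercent };
}
```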

Group Activity

curl http://localhost:3000/admin/group-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"

{
  "groups": [
    { "group": "production", "keyCount": 5, "totalSpent": 2500, "totalCalls": 180, "creditsRemaining": 7500 },
    { "group": "staging", "keyCount": 3, "totalSpent": 400, "totalCalls": 45, "creditsRemaining": 2600 }
  ],
  "summary": { "totalGroups": 2, "topGroup": "production" },
  "generatedAt": "2025-01-15T14:30:00Z"
}

Activity breakdown by key group. Per-group: key count, total spent, total calls, credits remaining. Ungrouped keys appear under "ungrouped". Group IDs are resolved to human-readable group names. Sorted by totalSpent descending. Summary includes group count and top-spending group. Excludes revoked/suspended keys. Read-only.

IP Allowlisting

Restrict API keys to specific IP addresses or CIDR ranges:

# Set IP allowlist on a key (replaces existing list)
curl -X POST http://localhost:3402/keys/ip \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_...", "ips": ["192.168.1.0/24", "10.0.0.5"]}'

# Clear allowlist (allow all IPs)
curl -X POST http://localhost:3402/keys/ip \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_...", "ips": []}'

You can also set the allowlist at key creation time:

curl -X POST http://localhost:3402/keys \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "prod-agent", "credits": 1000, "ipAllowlist": ["10.0.0.0/8"]}'

Supports exact IPv4 matching and CIDR notation (/8, /16, /24, /32, etc.). When the allowlist is empty, all IPs are allowed. Client IP is extracted from X-Forwarded-For header or socket remote address. Configure trustedProxies for accurate IP extraction behind load balancers (see Trusted Proxies).
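For readers unfamiliar with CIDR matching, the check described above can be sketched as follows. This is a generic IPv4 illustration with hypothetical function names, not PayGate's code, and it treats malformed entries as non-matching (an assumption).

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when clientIp equals an exact entry or falls inside a CIDR range.
function matchesEntry(clientIp: string, entry: string): boolean {
  const parts = entry.split("/");
  if (parts.length > 2) return false;
  const ipInt = ipv4ToInt(clientIp);
  const baseInt = ipv4ToInt(parts[0] ?? "");
  if (ipInt === null || baseInt === null) return false;
  const prefix = parts[1];
  if (prefix === undefined) return ipInt === baseInt;
  if (!/^\d{1,2}$/.test(prefix)) return false;
  const bits = Number(prefix);
  if (bits > 32) return false;
  // Addresses share a range when they agree on the top `bits` bits.
  // Plain arithmetic avoids the sign pitfalls of 32-bit shifts in JavaScript.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// An empty allowlist allows every IP, as documented above.
function isIpAllowed(clientIp: string, allowlist: string[]): boolean {
  if (allowlist.length === 0) return true;
  return allowlist.some((entry) => matchesEntry(clientIp, entry));
}
```

With the allowlist from the first example, 192.168.1.17 and 10.0.0.5 pass while 10.0.0.6 is rejected.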

Key Tags / Metadata

Attach arbitrary key-value tags to API keys for external system integration:

# Set tags (merge semantics — existing tags preserved, new ones added/updated)
curl -X POST http://localhost:3402/keys/tags \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_...", "tags": {"team": "backend", "env": "production", "customer_id": "cus_123"}}'

# Remove a tag (set value to null)
curl -X POST http://localhost:3402/keys/tags \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_...", "tags": {"env": null}}'

# Search keys by tags
curl -X POST http://localhost:3402/keys/search \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"tags": {"team": "backend"}}'
# → { "keys": [...], "count": 3 }

Tags can also be set at key creation:

curl -X POST http://localhost:3402/keys \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "backend-prod", "credits": 5000, "tags": {"team": "backend", "env": "production"}}'

Limits: max 50 tags per API key, max 100 characters per tag name or tag value. Tags appear in /balance responses and key listings.
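The merge semantics (existing tags preserved, new ones added or updated, null removes) can be sketched as a pure function. How the server reacts to a limit violation is not stated here, so throwing an error is an assumption, and the names are hypothetical.

```typescript
type Tags = Record<string, string>;
type TagPatch = Record<string, string | null>;

const MAX_TAGS = 50;
const MAX_LENGTH = 100;

// Applies a patch to the existing tags without mutating the input.
function mergeTags(existing: Tags, patch: TagPatch): Tags {
  const merged: Tags = { ...existing };
  for (const [name, value] of Object.entries(patch)) {
    if (value === null) {
      delete merged[name]; // null removes the tag
      continue;
    }
    if (name.length > MAX_LENGTH || value.length > MAX_LENGTH) {
      // Assumption: oversize tags are rejected rather than truncated.
      throw new Error(`Tag "${name}" exceeds ${MAX_LENGTH} characters`);
    }
    merged[name] = value; // add or update
  }
  if (Object.keys(merged).length > MAX_TAGS) {
    throw new Error(`An API key can hold at most ${MAX_TAGS} tags`);
  }
  return merged;
}
```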

Usage Analytics

Query aggregated usage data for dashboards, reports, and trend analysis:

# Get analytics for a 24-hour window (hourly buckets)
curl "http://localhost:3402/analytics?from=2026-02-25T00:00:00Z&to=2026-02-26T00:00:00Z&granularity=hourly" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# Daily granularity with top 5 consumers
curl "http://localhost:3402/analytics?granularity=daily&topN=5" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns:

timeSeries — Bucketed call counts, credits charged, and denials per time window
toolBreakdown — Per-tool stats (calls, credits, average cost) sorted by usage
topConsumers — Top N API keys by credits spent, with each key's most-used tool
trend — Current vs previous period comparison with percentage changes (calls, credits, denials)
summary — Total calls, credits, unique keys, and unique tools

Query parameters: from and to (ISO 8601 date or timestamp), granularity (hourly or daily, default: hourly), topN (number, default: 10).

Alert Webhooks

Configure rules to fire alerts when usage thresholds are crossed. Alerts are delivered via webhooks as alert.fired admin events:

# Configure alert rules
curl -X POST http://localhost:3402/alerts \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"rules": [
    {"type": "spending_threshold", "threshold": 80},
    {"type": "credits_low", "threshold": 50},
    {"type": "quota_warning", "threshold": 90},
    {"type": "key_expiry_soon", "threshold": 86400},
    {"type": "rate_limit_spike", "threshold": 10}
  ]}'

# Consume pending alerts (returns and clears queue)
curl http://localhost:3402/alerts \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Alert types:

Type	Threshold Meaning	Fires When
`spending_threshold`	Percentage (0–100)	Key has spent ≥ threshold% of its initial credits
`credits_low`	Absolute credits	Key's remaining credits drop below threshold
`quota_warning`	Percentage (0–100)	Key's daily call usage exceeds threshold% of quota
`key_expiry_soon`	Seconds	Key expires within threshold seconds
`rate_limit_spike`	Count	Key has ≥ threshold rate-limit denials in 5 minutes

Each rule has an optional cooldownSeconds (default: 300) to prevent alert storms. Alerts are automatically checked on every gate evaluation (tool call).

When webhooks are enabled (--webhook-url), alerts fire as alert.fired events in the adminEvents webhook payload with full context (key, rule type, current value, threshold).
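To illustrate how a threshold check and a cooldown interact, here is a minimal sketch covering two of the rule types. The class, the state shape, and the per-key-per-rule cooldown scope are assumptions for illustration, not PayGate's internals.

```typescript
type AlertType = "spending_threshold" | "credits_low";

interface AlertRule {
  type: AlertType;
  threshold: number;
  cooldownSeconds?: number; // defaults to 300, as documented
}

interface KeyState {
  initialCredits: number;
  creditsRemaining: number;
}

// True when the rule's condition currently holds for the key.
function conditionMet(rule: AlertRule, key: KeyState): boolean {
  if (rule.type === "credits_low") {
    return key.creditsRemaining < rule.threshold;
  }
  if (key.initialCredits <= 0) return false;
  const spentPercent =
    ((key.initialCredits - key.creditsRemaining) / key.initialCredits) * 100;
  return spentPercent >= rule.threshold;
}

class AlertEvaluator {
  // Last fire time in milliseconds, tracked per "<keyId>:<ruleType>".
  private lastFired = new Map<string, number>();

  // Returns true at most once per cooldown window for each key and rule.
  shouldFire(keyId: string, rule: AlertRule, key: KeyState, nowMs: number): boolean {
    if (!conditionMet(rule, key)) return false;
    const id = `${keyId}:${rule.type}`;
    const cooldownMs = (rule.cooldownSeconds ?? 300) * 1000;
    const last = this.lastFired.get(id);
    if (last !== undefined && nowMs - last < cooldownMs) return false;
    this.lastFired.set(id, nowMs);
    return true;
  }
}
```

Because the check runs on every tool call, the cooldown is what keeps a key sitting below its threshold from emitting an alert per request.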

Team Management

Group API keys into teams with shared budgets, quotas, and usage tracking. Teams enforce budget and quota limits at the gate level — if a key belongs to a team that has exceeded its budget or quota, tool calls are denied even if the individual key has credits remaining.

# Create a team
curl -X POST http://localhost:3402/teams \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"name": "Engineering", "budget": 10000, "tags": {"dept": "eng"}}'

# Assign an API key to a team
curl -X POST http://localhost:3402/teams/assign \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"teamId": "team_abc123...", "apiKey": "pg_abc123..."}'

# Set team quotas (daily/monthly limits)
curl -X POST http://localhost:3402/teams/update \
  -H "Content-Type: application/json" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -d '{"teamId": "team_abc123...", "quota": {"dailyCalls": 1000, "monthlyCalls": 25000, "dailyCredits": 5000, "monthlyCredits": 100000}}'

# View team usage with member breakdown
curl "http://localhost:3402/teams/usage?teamId=team_abc123..." \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Team features:

Feature	Description
Shared budget	Pool credits across all team members (0 = unlimited)
Team quotas	Daily/monthly call and credit limits (UTC auto-reset)
Member breakdown	Per-key usage within the team (keys masked)
Tags/metadata	Attach key-value pairs for department, project, etc.
Max 100 keys	Per-team limit to prevent abuse
Gate integration	Budget + quota checked on every tool call
Audit trail	team.created, team.updated, team.deleted, team.key_assigned, team.key_removed events

Each API key can belong to at most one team. Team budget and quota checks happen after individual key checks — both must pass for a tool call to succeed.
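The ordering (individual key first, then team) can be sketched as below. The denial reason strings, the field names, and the exact comparison used for the budget are hypothetical; only the check order and the "0 = unlimited" budget convention come from the text above. Treating a quota of 0 as unlimited is an additional assumption.

```typescript
interface KeyInfo {
  credits: number;
}

interface TeamInfo {
  budget: number; // 0 = unlimited
  spent: number;
  dailyCallLimit: number; // assumption: 0 = unlimited
  callsToday: number;
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Key-level check runs first; team budget and quota are consulted only if it passes.
function evaluateWithTeam(key: KeyInfo, team: TeamInfo | undefined, price: number): Decision {
  if (key.credits < price) {
    return { allowed: false, reason: "insufficient_credits" };
  }
  if (team === undefined) {
    return { allowed: true };
  }
  if (team.budget > 0 && team.spent + price > team.budget) {
    return { allowed: false, reason: "team_budget_exceeded" };
  }
  if (team.dailyCallLimit > 0 && team.callsToday >= team.dailyCallLimit) {
    return { allowed: false, reason: "team_quota_exceeded" };
  }
  return { allowed: true };
}
```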

Rate Limit Response Headers

Every /mcp response includes rate limit and credits headers when an API key is provided:

X-RateLimit-Limit: 100        # Max calls per window
X-RateLimit-Remaining: 87     # Calls remaining in current window
X-RateLimit-Reset: 45         # Seconds until window resets
X-Credits-Remaining: 4500     # Credits remaining on the key

When a tool has a per-tool rate limit, the headers reflect that tool's limit (not the global limit). These headers are CORS-exposed so browser-based agents can read them.
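A client can use these headers to pace itself. The sketch below parses them from any object with a get(name) method (the Fetch API's Headers class qualifies) and derives a wait time. The helper names and the back-off policy are illustrative assumptions, not part of the PayGate client SDK.

```typescript
// Minimal view of a headers object; the Fetch API's Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetSeconds: number;
  creditsRemaining: number;
}

function readNumber(headers: HeaderReader, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Returns null when any of the four headers is missing or not numeric.
function parseRateLimit(headers: HeaderReader): RateLimitInfo | null {
  const limit = readNumber(headers, "X-RateLimit-Limit");
  const remaining = readNumber(headers, "X-RateLimit-Remaining");
  const resetSeconds = readNumber(headers, "X-RateLimit-Reset");
  const creditsRemaining = readNumber(headers, "X-Credits-Remaining");
  if (limit === null || remaining === null || resetSeconds === null || creditsRemaining === null) {
    return null;
  }
  return { limit, remaining, resetSeconds, creditsRemaining };
}

// Milliseconds to wait before the next call: 0 while calls remain,
// otherwise the time until the window resets.
function delayBeforeNextCallMs(info: RateLimitInfo): number {
  return info.remaining > 0 ? 0 : info.resetSeconds * 1000;
}
```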

Health Check + Graceful Shutdown

The GET /health endpoint provides a public (no auth required) health check for load balancers and orchestrators:

curl http://localhost:3402/health

{
  "status": "healthy",
  "uptime": 3600,
  "version": "2.6.0",
  "inflight": 3,
  "redis": { "connected": true, "pubsub": true },
  "webhooks": { "pendingRetries": 0, "deadLetterCount": 2 }
}

Field	Description
`status`	`"healthy"` or `"draining"` (during graceful shutdown)
`uptime`	Seconds since server started
`version`	Package version
`inflight`	Number of in-flight `/mcp` requests
`redis`	Redis connectivity (only present when `--redis-url` is set)
`webhooks`	Webhook retry stats (only present when `--webhook-url` is set)

During graceful shutdown, /health returns HTTP 503 with "status": "draining", and new /mcp requests are rejected with 503. Existing in-flight requests are allowed to complete before the server tears down. The CLI uses gracefulStop() on SIGTERM/SIGINT with a 30-second drain timeout.

Programmatic API:

// Graceful shutdown with custom timeout (default 30s)
await server.gracefulStop(15_000);

Config Validation + Dry Run

Validate a config file before starting the server:

# Validate a config file — exits 0 if valid, 1 if errors found
paygate-mcp validate --config paygate.json

Output on error:

✗ 2 error(s):
  ERROR  [port] Invalid port 99999. Must be 0–65535.
  ERROR  [redisUrl] Invalid redisUrl protocol "http:". Expected "redis://" or "rediss://".
⚠ 1 warning(s):
  WARN   [shadowMode] Shadow mode is enabled. Payment will not be enforced.

Dry run mode starts the server, discovers tools from the backend, prints a pricing table, then exits:

paygate-mcp wrap --server "node my-server.js" --dry-run

  ── DRY RUN ──────────────────────────────────────
  Discovered 3 tool(s):

  ────────────────────────────────────────────────────────────
  Tool                          Credits/Call   Rate Limit
  ────────────────────────────────────────────────────────────
  search                        5              30/min
  generate                      10             10/min
  list_items                    1              60/min
  ────────────────────────────────────────────────────────────

  Dry run complete — shutting down.

Programmatic API:

import { validateConfig, formatDiagnostics } from 'paygate-mcp';

const diags = validateConfig(myConfig);
if (diags.some(d => d.level === 'error')) {
  console.error(formatDiagnostics(diags));
  process.exit(1);
}

Batch Tool Calls

Call multiple tools in a single request with all-or-nothing billing:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call_batch",
  "params": {
    "calls": [
      { "name": "search", "arguments": { "q": "MCP servers" } },
      { "name": "translate", "arguments": { "text": "hello", "to": "es" } },
      { "name": "summarize", "arguments": { "url": "https://example.com" } }
    ]
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "results": [
      { "tool": "search", "result": { "content": [...] }, "creditsCharged": 5 },
      { "tool": "translate", "result": { "content": [...] }, "creditsCharged": 3 },
      { "tool": "summarize", "result": { "content": [...] }, "creditsCharged": 2 }
    ],
    "totalCreditsCharged": 10,
    "remainingCredits": 90
  }
}

Key features:

All-or-nothing — All calls are pre-validated (auth, ACL, rate limits, credits, quotas) before any execute. If any call would be denied, the entire batch is rejected and zero credits are charged.
Aggregate pricing — Total credits are checked and deducted atomically. A batch of 3 calls needing 5+3+2=10 credits requires 10 credits available.
Parallel execution — After gate approval, all tool calls execute concurrently for minimum latency.
Refund on failure — With refundOnFailure enabled, individual tools that error downstream get their credits refunded.
Multi-server support — Works with prefixed tools in multi-server mode (e.g., fs:read, github:search).
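The all-or-nothing and aggregate-pricing rules can be illustrated with a simplified pre-validation pass that only looks at prices and balance. The real gate also checks auth, ACL, rate limits, and quotas; the function name, the price table, and the reason strings here are hypothetical.

```typescript
interface BatchCall {
  name: string;
}

interface BatchDecision {
  allAllowed: boolean;
  totalCredits: number;
  failedIndex?: number;
  reason?: string;
}

// Validates every call before anything is charged.
// On the first failure the whole batch is rejected with zero credits charged.
function evaluateBatchSketch(
  calls: BatchCall[],
  prices: Record<string, number>,
  balance: number
): BatchDecision {
  let total = 0;
  for (let i = 0; i < calls.length; i++) {
    const call = calls[i];
    const price = call === undefined ? undefined : prices[call.name];
    if (price === undefined) {
      return { allAllowed: false, totalCredits: 0, failedIndex: i, reason: "unknown_tool" };
    }
    total += price;
    if (total > balance) {
      return { allAllowed: false, totalCredits: 0, failedIndex: i, reason: "insufficient_credits" };
    }
  }
  return { allAllowed: true, totalCredits: total };
}
```

With prices of 5, 3, and 2 and a balance of 100, the batch is approved for 10 credits; with a balance of 7, it is rejected at index 1 and nothing is charged.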

Programmatic API:

import { Gate, BatchToolCall } from 'paygate-mcp';

const calls: BatchToolCall[] = [
  { name: 'search', arguments: { q: 'test' } },
  { name: 'translate', arguments: { text: 'hi' } },
];

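// `gate` is an already-configured Gate instance; `apiKey` and `clientIp`
// come from the incoming request.
const result = gate.evaluateBatch(apiKey, calls, clientIp);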
if (!result.allAllowed) {
  console.log(`Denied at index ${result.failedIndex}: ${result.reason}`);
} else {
  console.log(`Charged ${result.totalCredits} credits for ${calls.length} calls`);
}

Multi-Tenant Namespaces

Isolate API keys and usage data by tenant. Each key belongs to a namespace (default: "default"). All admin endpoints support namespace filtering for tenant-scoped views.

Create a key in a namespace:

curl -X POST http://localhost:3402/keys \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "acme-agent", "credits": 1000, "namespace": "acme-corp"}'

List keys filtered by namespace:

curl "http://localhost:3402/keys?namespace=acme-corp" \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

List all namespaces with stats:

curl http://localhost:3402/namespaces \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns:

{
  "namespaces": [
    { "namespace": "acme-corp", "keyCount": 3, "activeKeys": 2, "totalCredits": 2500, "totalSpent": 480 },
    { "namespace": "beta-inc", "keyCount": 1, "activeKeys": 1, "totalCredits": 500, "totalSpent": 120 }
  ],
  "count": 2
}

Namespace-filtered status, usage, and analytics:

# Status filtered to one namespace
curl "http://localhost:3402/status?namespace=acme-corp" -H "X-Admin-Key: ..."

# Usage events filtered by namespace
curl "http://localhost:3402/usage?namespace=acme-corp" -H "X-Admin-Key: ..."

# Analytics filtered by namespace
curl "http://localhost:3402/analytics?namespace=acme-corp&from=2025-01-01" -H "X-Admin-Key: ..."

# Search keys by tag within a namespace
curl -X POST http://localhost:3402/keys/search \
  -H "X-Admin-Key: ..." -H "Content-Type: application/json" \
  -d '{"tags": {"env": "prod"}, "namespace": "acme-corp"}'

Namespace rules:

Alphanumeric + hyphens only, max 50 characters, case-insensitive (stored lowercase)
Defaults to "default" if omitted or invalid
Old keys automatically backfilled to "default" on state file load
Usage events carry the key's namespace for cross-cutting analytics
Namespaces are implicit — created automatically when a key is assigned to one
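The naming rules above can be sketched as a normalizer. The function name and the exact regular expression are assumptions; the behavior (lowercasing, falling back to "default") follows the rules as listed.

```typescript
// Alphanumeric plus hyphens, 1 to 50 characters, checked after lowercasing.
const NAMESPACE_PATTERN = /^[a-z0-9-]{1,50}$/;

// Returns the stored form of a namespace, or "default" when it is omitted or invalid.
function normalizeNamespace(input?: string | null): string {
  if (typeof input !== "string") return "default";
  const lowered = input.toLowerCase();
  return NAMESPACE_PATTERN.test(lowered) ? lowered : "default";
}
```

Under this sketch, "Acme-Corp" is stored as "acme-corp", while "bad name!" and any value longer than 50 characters fall back to "default".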

Scoped Tokens

Issue short-lived, tool-restricted tokens from any API key. Scoped tokens let you delegate narrow access to agents or sub-processes without exposing the parent API key.

Create a scoped token (admin):

curl -X POST http://localhost:3402/tokens \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "pg_parent_key_here",
    "ttl": 300,
    "allowedTools": ["search", "summarize"],
    "label": "agent-session-42"
  }'

Returns:

{
  "token": "pgt_eyJhcGl...signature",
  "expiresAt": "2025-06-15T12:05:00.000Z",
  "ttl": 300,
  "parentKey": "my-agent",
  "allowedTools": ["search", "summarize"],
  "label": "agent-session-42",
  "message": "Use as X-API-Key or Bearer token on /mcp"
}

Use the token on /mcp:

# As X-API-Key header
curl -X POST http://localhost:3402/mcp \
  -H "X-API-Key: pgt_eyJhcGl...signature" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"q":"hello"}}}'

# As Bearer token
curl -X POST http://localhost:3402/mcp \
  -H "Authorization: Bearer pgt_eyJhcGl...signature" \
  -H "Content-Type: application/json" \
  -d '...'

Token behavior:

Self-contained — HMAC-SHA256 signed, zero server-side state. Validated cryptographically on every request.
Auto-expiry — TTL defaults to 1 hour, max 24 hours. Expired tokens are rejected instantly.
Tool ACL narrowing — If allowedTools is set, the token can only call those tools (intersection with parent key's ACL).
Credits from parent — Tool calls charge against the parent key's credit balance.
tools/list filtering — When a scoped token calls tools/list, only the allowed tools are returned.
Batch-aware — tools/call_batch checks scoped token ACL for every call in the batch.
Resolution priority — X-API-Key header → pgt_ scoped token → OAuth Bearer token.

Token format: pgt_<base64url(JSON payload)>.<base64url(HMAC-SHA256 signature)>
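To make the token format concrete, the sketch below builds and verifies a token of that shape with Node's crypto module. It is not PayGate's implementation: the payload field names (apiKey, exp, allowedTools) and the choice to sign the base64url-encoded payload are assumptions. Use ScopedTokenManager for real tokens.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape; the real field names may differ.
interface TokenPayload {
  apiKey: string;
  exp: number; // expiry as Unix seconds
  allowedTools?: string[];
}

function sign(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body).digest("base64url");
}

// pgt_<base64url(JSON payload)>.<base64url(HMAC-SHA256 signature)>
function createToken(payload: TokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `pgt_${body}.${sign(body, secret)}`;
}

// Returns the payload if the signature matches and the token has not expired, else null.
function verifyToken(token: string, secret: string, nowSeconds: number): TokenPayload | null {
  if (!token.startsWith("pgt_")) return null;
  const parts = token.slice(4).split(".");
  if (parts.length !== 2) return null;
  const [body, signature] = parts;
  if (!body || !signature) return null;
  const expected = Buffer.from(sign(body, secret));
  const actual = Buffer.from(signature);
  // Constant-time comparison avoids leaking how many characters matched.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;
  try {
    const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as TokenPayload;
    return payload.exp > nowSeconds ? payload : null;
  } catch {
    return null;
  }
}
```

Because the signature covers the payload, no server-side lookup is needed: any instance holding the signing secret can validate the token.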

Programmatic usage:

import { ScopedTokenManager } from 'paygate-mcp';

const tokens = new ScopedTokenManager('your-signing-secret');

// Create
const token = tokens.create({
  apiKey: 'pg_parent_key',
  ttlSeconds: 300,
  allowedTools: ['search'],
  label: 'agent-42',
});

// Validate
const result = tokens.validate(token);
if (result.valid) {
  console.log(result.payload.apiKey); // 'pg_parent_key'
  console.log(result.payload.allowedTools); // ['search']
}

// Check if a string is a scoped token
ScopedTokenManager.isToken('pgt_...'); // true
ScopedTokenManager.isToken('pg_...');  // false

Token Revocation List

Revoke scoped tokens before they expire. Once revoked, the token is immediately rejected by all PayGate instances (synced via Redis pub/sub in multi-instance deployments).

Revoke a token (admin):

curl -X POST http://localhost:3402/tokens/revoke \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"token": "pgt_eyJhcGl...signature", "reason": "session ended"}'

Returns:

{
  "message": "Token revoked",
  "fingerprint": "a1b2c3d4e5f6...",
  "expiresAt": "2025-06-15T12:05:00.000Z",
  "revokedAt": "2025-06-15T11:30:00.000Z"
}

List revoked tokens (admin):

curl http://localhost:3402/tokens/revoked \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

Returns { count, entries: [{ fingerprint, expiresAt, revokedAt, reason }] }.

Revocation behavior:

O(1) lookup — SHA-256 fingerprints stored in a Map for constant-time rejection checks.
Auto-cleanup — Revocation entries are purged once the original token would have naturally expired (max 24h), so the list never grows unbounded.
Redis sync — In multi-instance deployments, revocations are propagated via token_revoked pub/sub events. Other instances add the entry to their local revocation list immediately.
Audit trail — Every revocation is logged as a token.revoked audit event with fingerprint and reason.
Signature validation — Only tokens signed by this server can be revoked (prevents revoking arbitrary strings).
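
The lookup and cleanup behavior described above can be sketched as a small in-memory structure. This is a simplified illustration under stated assumptions: the `RevocationList` class and its method names are hypothetical, and Redis sync, audit logging, and signature checks are left out.

```typescript
import { createHash } from "node:crypto";

interface RevocationEntry {
  fingerprint: string;
  expiresAt: number; // ms since epoch at which the token would have expired anyway
  revokedAt: number;
  reason: string;
}

// Hypothetical class for illustration; not the library's internal type.
class RevocationList {
  private entries = new Map<string, RevocationEntry>();

  static fingerprint(token: string): string {
    return createHash("sha256").update(token).digest("hex");
  }

  revoke(token: string, expiresAt: number, now: number, reason = ""): RevocationEntry {
    const entry: RevocationEntry = {
      fingerprint: RevocationList.fingerprint(token),
      expiresAt,
      revokedAt: now,
      reason,
    };
    this.entries.set(entry.fingerprint, entry);
    return entry;
  }

  // Constant-time-on-average Map lookup keyed by fingerprint.
  isRevoked(token: string): boolean {
    return this.entries.has(RevocationList.fingerprint(token));
  }

  // Purge entries whose token has expired on its own; returns how many were removed.
  cleanup(now: number): number {
    let purged = 0;
    this.entries.forEach((entry, fp) => {
      if (entry.expiresAt <= now) {
        this.entries.delete(fp);
        purged += 1;
      }
    });
    return purged;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Storing only the fingerprint means the list never holds a usable token, and tying each entry's lifetime to the token's own expiry keeps the list bounded.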

Programmatic usage:

import { ScopedTokenManager } from 'paygate-mcp';

const tokens = new ScopedTokenManager('your-signing-secret');
const token = tokens.create({ apiKey: 'pg_key', ttlSeconds: 3600 });

// Revoke
const entry = tokens.revokeToken(token, 'session ended');
console.log(entry.fingerprint); // SHA-256 hex

// Validation now fails for the revoked token
tokens.validate(token); // { valid: false, reason: 'token_revoked' }

// Check revocation list size
tokens.revocationList.size; // 1

// Clean up on shutdown
tokens.destroy();

Usage-Based Auto-Topup

Automatically refill credits when a key's balance drops below a threshold. Prevents service interruptions for high-value API consumers.

Configure auto-topup (admin):

curl -X POST http://localhost:3402/keys/auto-topup \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_abc123...", "threshold": 100, "amount": 500, "maxDaily": 10}'

Returns:

{
  "autoTopup": { "threshold": 100, "amount": 500, "maxDaily": 10 },
  "message": "Auto-topup enabled: add 500 credits when balance drops below 100 (max 10/day)"
}

Disable auto-topup:

curl -X POST http://localhost:3402/keys/auto-topup \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "pg_abc123...", "disable": true}'

Auto-topup behavior:

Post-deduction trigger — After each tool call (or batch) deducts credits, the gate checks if credits fell below the threshold and automatically adds credits.
Daily limits — maxDaily caps how many auto-topups can occur per UTC day. Set to 0 for unlimited.
Audit trail — Every auto-topup is logged as a key.auto_topped_up audit event. Configuration changes are logged as key.auto_topup_configured.
Webhook events — Both key.auto_topup_configured and key.auto_topped_up events are sent via webhooks.
Redis sync — In multi-instance deployments, auto-topup credits are synced atomically via Redis.
State persistence — Auto-topup config and daily counters are persisted in the state file and Redis.
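
The post-deduction trigger and daily cap can be sketched as follows. This is a simplified model, not the gate's real code: the `KeyState` shape, the `topupDay`/`topupsToday` counters, and the function name `deductWithAutoTopup` are assumptions for illustration.

```typescript
interface AutoTopupConfig {
  threshold: number;
  amount: number;
  maxDaily: number; // 0 means unlimited
}

// Hypothetical per-key state used only in this sketch.
interface KeyState {
  credits: number;
  autoTopup: AutoTopupConfig | null;
  topupDay: string; // UTC day the counter applies to, "YYYY-MM-DD"
  topupsToday: number;
}

function utcDay(now: Date): string {
  return now.toISOString().slice(0, 10);
}

// Returns false if the call cannot be paid for; otherwise deducts and maybe tops up.
function deductWithAutoTopup(key: KeyState, cost: number, now: Date): boolean {
  if (key.credits < cost) return false;
  key.credits -= cost;

  const cfg = key.autoTopup;
  if (cfg === null || key.credits >= cfg.threshold) return true;

  // Reset the daily counter when the UTC day changes.
  const today = utcDay(now);
  if (key.topupDay !== today) {
    key.topupDay = today;
    key.topupsToday = 0;
  }
  if (cfg.maxDaily === 0 || key.topupsToday < cfg.maxDaily) {
    key.credits += cfg.amount;
    key.topupsToday += 1;
  }
  return true;
}
```

Note that the top-up runs only after a successful deduction, so a key that is already below the cost of a call is denied rather than refilled.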

Programmatic usage:

import { Gate } from 'paygate-mcp';

const gate = new Gate(config, 'state.json');
const record = gate.store.createKey('premium-client', 1000);

// Configure auto-topup
record.autoTopup = { threshold: 100, amount: 500, maxDaily: 5 };
gate.store.save();

// Hook for notifications
gate.onAutoTopup = (apiKey, amount, newBalance) => {
  console.log(`Auto-topped up ${amount} credits → balance: ${newBalance}`);
};

// Gate.evaluate() automatically triggers auto-topup after credit deduction
const result = gate.evaluate(record.key, { name: 'expensive-tool' });

Admin API Key Management

Manage multiple admin keys with role-based permissions. The bootstrap admin key (from constructor or CLI) is always a super_admin.

Roles:

Role	Description
`super_admin`	Full access, including admin key management
`admin`	All API key and system operations, but cannot manage admin keys
`viewer`	Read-only access to status, usage, analytics, audit, etc.

Create an admin key (super_admin only):

curl -X POST http://localhost:3402/admin/keys \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "CI Bot", "role": "admin"}'
# Returns: { "key": "ak_...", "name": "CI Bot", "role": "admin", "createdAt": "..." }

List admin keys (super_admin only):

curl http://localhost:3402/admin/keys \
  -H "X-Admin-Key: $ADMIN_KEY"
# Returns masked keys with roles, status, and last used timestamps

Revoke an admin key (super_admin only):

curl -X POST http://localhost:3402/admin/keys/revoke \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"key": "ak_..."}'

Behavior:

The default role for POST /admin/keys is admin if not specified.
Cannot revoke your own admin key (safety guard).
Cannot revoke the last super_admin key (safety guard).
viewer keys can access all read-only endpoints (GET) but are denied write operations (POST).
admin keys can create/revoke/rotate API keys, manage teams, tokens, etc. but cannot manage admin keys.
Admin keys are persisted to a separate file (*-admin.json) alongside the state file.
All operations are logged in the audit trail (admin_key.created, admin_key.revoked).
Webhook events are fired for admin key lifecycle changes.
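
The role rules above amount to a small permission check. The sketch below is one way to express them; the function name `canAccess` and the way admin-key routes are detected by path are assumptions for this example, not the server's actual routing code.

```typescript
type AdminRole = "super_admin" | "admin" | "viewer";

// Hypothetical helper: decides whether a role may perform a request.
function canAccess(role: AdminRole, method: string, path: string): boolean {
  if (role === "super_admin") return true;

  // Admin key management (including listing) is reserved for super_admin.
  const isAdminKeyRoute = path === "/admin/keys" || path.startsWith("/admin/keys/");
  if (isAdminKeyRoute) return false;

  if (role === "admin") return true;

  // viewer: read-only endpoints only
  return method.toUpperCase() === "GET";
}
```

For example, `canAccess("viewer", "GET", "/status")` is allowed, while `canAccess("admin", "POST", "/admin/keys")` is denied.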

Plugin System

Add custom logic to PayGate with the plugin API. Plugins can intercept gate decisions, transform pricing, modify tool requests/responses, add custom HTTP endpoints, and hook into server lifecycle events.

import { PayGateServer, PayGatePlugin } from 'paygate-mcp';

// Define a plugin
const loggingPlugin: PayGatePlugin = {
  name: 'request-logger',
  version: '1.0.0',

  // Gate hooks (sync — hot path)
  beforeGate: (ctx) => {
    // Return { allowed: false, reason: '...' } to short-circuit
    // Return null to continue normal evaluation
    if (ctx.toolName === 'dangerous_tool') {
      return { allowed: false, reason: 'tool_disabled' };
    }
    return null;
  },

  afterGate: (ctx, decision) => {
    // Modify the gate decision after evaluation
    console.log(`${ctx.toolName}: ${decision.allowed ? 'allowed' : 'denied'}`);
    return decision;
  },

  transformPrice: (toolName, basePrice, args) => {
    // Return a number to override price, or null to keep base price
    if (toolName === 'premium_search') return basePrice * 2;
    return null;
  },

  onDeny: (ctx, reason) => {
    // Called whenever a tool call is denied
    console.log(`Denied: ${ctx.toolName} — ${reason}`);
  },

  // Tool hooks (async)
  beforeToolCall: async (ctx) => {
    // Modify the JSON-RPC request before forwarding
    return { ...ctx.request, params: { ...ctx.request.params, audit: true } };
  },

  afterToolCall: async (ctx, response) => {
    // Modify the JSON-RPC response before returning to client
    return response;
  },

  // HTTP hook (async)
  onRequest: async (req, res) => {
    // Add custom endpoints — return true if handled
    if (req.url === '/custom/status') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ custom: true }));
      return true;
    }
    return false;
  },

  // Lifecycle hooks (async)
  onStart: async () => { console.log('Plugin started'); },
  onStop: async () => { console.log('Plugin stopped'); },
};

// Register plugins with .use() (chainable)
const server = new PayGateServer({ ... });
server
  .use(loggingPlugin)
  .use(anotherPlugin);

await server.start();

Hook types:

Hook	Sync/Async	Description
`beforeGate`	Sync	Short-circuit gate evaluation. First non-null result wins.
`afterGate`	Sync	Modify gate decision. Cascading (each plugin sees previous result).
`transformPrice`	Sync	Override tool pricing. First non-null number wins.
`onDeny`	Sync	Notification on denial. All plugins called.
`beforeToolCall`	Async	Modify JSON-RPC request before forwarding. Cascading.
`afterToolCall`	Async	Modify JSON-RPC response before returning. Cascading.
`onRequest`	Async	Add custom HTTP endpoints. First `true` return handles the request.
`onStart`	Async	Called after server starts. Registration order.
`onStop`	Async	Called before server stops. Reverse registration order.

Error isolation: Plugin errors are caught and logged — a crashing plugin never takes down the server.
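
To illustrate the difference between "first non-null result wins" and "cascading" hooks, along with error isolation, here is a minimal dispatcher sketch. The types and the `runBeforeGate`/`runAfterGate` functions are hypothetical stand-ins written for this example, not the plugin manager's real internals.

```typescript
interface GateContext {
  toolName: string;
  apiKey: string;
}

interface GateDecision {
  allowed: boolean;
  reason?: string;
}

// Reduced plugin shape covering only the two hooks shown here.
interface Plugin {
  name: string;
  beforeGate?: (ctx: GateContext) => GateDecision | null;
  afterGate?: (ctx: GateContext, decision: GateDecision) => GateDecision;
}

// First non-null result wins; a throwing plugin is logged and skipped.
function runBeforeGate(plugins: Plugin[], ctx: GateContext): GateDecision | null {
  for (const plugin of plugins) {
    if (!plugin.beforeGate) continue;
    try {
      const result = plugin.beforeGate(ctx);
      if (result !== null) return result;
    } catch (err) {
      console.error(`[plugin:${plugin.name}] beforeGate failed`, err);
    }
  }
  return null;
}

// Cascading: each plugin receives the decision produced by the previous one.
function runAfterGate(plugins: Plugin[], ctx: GateContext, initial: GateDecision): GateDecision {
  let decision = initial;
  for (const plugin of plugins) {
    if (!plugin.afterGate) continue;
    try {
      decision = plugin.afterGate(ctx, decision);
    } catch (err) {
      console.error(`[plugin:${plugin.name}] afterGate failed`, err);
    }
  }
  return decision;
}
```

In this model, registration order matters for both styles: an earlier `beforeGate` plugin can pre-empt later ones, and a later `afterGate` plugin has the final say.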

List registered plugins (admin only):

curl http://localhost:3402/plugins -H "X-Admin-Key: $ADMIN_KEY"
# { "count": 2, "plugins": [{ "name": "...", "version": "...", "hooks": ["beforeGate", ...] }] }

Key Groups (Policy Templates)

Key groups let you define reusable policy templates and apply them to multiple API keys at once. Unlike teams (which share budgets), groups share policies: ACL, rate limits, pricing overrides, IP allowlists, and quotas.

Create a group:

curl -X POST http://localhost:3402/groups \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -d '{
    "name": "free-tier",
    "allowedTools": ["search", "read_file"],
    "rateLimitPerMin": 30,
    "ipAllowlist": ["10.0.0.0/8"],
    "quota": { "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 50, "monthlyCreditLimit": 200 },
    "toolPricing": { "search": { "creditsPerCall": 2 } },
    "tags": { "tier": "free" }
  }'
# { "id": "grp_a1b2c3...", "name": "free-tier", ... }

Assign keys to a group:

curl -X POST http://localhost:3402/groups/assign \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -d '{ "groupId": "grp_a1b2c3...", "key": "pg_..." }'

Policy resolution rules:

Policy	Resolution
`allowedTools`	Key wins if non-empty, otherwise group
`deniedTools`	Union of both (most restrictive)
`ipAllowlist`	Union of both (additive)
`rateLimitPerMin`	Key wins if set, otherwise group
`quota`	Key wins if set, otherwise group
`toolPricing`	Group overrides global config
`maxSpendingLimit`	Group default (key can override via `/limits`)
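
The resolution rules in the table can be read as a merge function. The sketch below covers five of the rows (`toolPricing` and `maxSpendingLimit` are omitted for brevity). The `Policy` shape and the use of `null` for "not set" are assumptions for this example and may not match the library's types.

```typescript
interface Quota {
  dailyCallLimit: number;
}

// Hypothetical policy shape; null means "not set" in this sketch.
interface Policy {
  allowedTools: string[];
  deniedTools: string[];
  ipAllowlist: string[];
  rateLimitPerMin: number | null;
  quota: Quota | null;
}

function union(a: string[], b: string[]): string[] {
  return a.concat(b).filter((value, index, all) => all.indexOf(value) === index);
}

function resolvePolicy(key: Policy, group: Policy): Policy {
  return {
    // Key wins if non-empty, otherwise group
    allowedTools: key.allowedTools.length > 0 ? key.allowedTools : group.allowedTools,
    // Union of both: a tool denied by either side stays denied
    deniedTools: union(key.deniedTools, group.deniedTools),
    // Union of both: an IP allowed by either side is allowed
    ipAllowlist: union(key.ipAllowlist, group.ipAllowlist),
    // Key wins if set, otherwise group
    rateLimitPerMin: key.rateLimitPerMin ?? group.rateLimitPerMin,
    quota: key.quota ?? group.quota,
  };
}
```

The practical effect is that a key-level setting acts as an override, while a key with no settings of its own simply inherits the group template.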

List groups:

curl http://localhost:3402/groups -H "X-Admin-Key: $ADMIN_KEY"
# [{ "id": "grp_...", "name": "free-tier", "memberCount": 5, ... }]

Update / delete / remove:

# Update group policies
curl -X POST http://localhost:3402/groups/update \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -d '{ "id": "grp_...", "rateLimitPerMin": 60 }'

# Remove a key from its group
curl -X POST http://localhost:3402/groups/remove \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -d '{ "key": "pg_..." }'

# Delete a group (removes all assignments)
curl -X POST http://localhost:3402/groups/delete \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -d '{ "id": "grp_..." }'

Programmatic usage:

import { PayGateServer, KeyGroupManager } from 'paygate-mcp';

const server = new PayGateServer({ ... });
const { port, adminKey } = await server.start();

// Access groups directly
const group = server.groups.createGroup({ name: 'enterprise', rateLimitPerMin: 1000 });
server.groups.assignKey(apiKey, group.id);

// Resolve effective policy for a key
const policy = server.groups.resolvePolicy(apiKey, keyRecord);
// { allowedTools, deniedTools, rateLimitPerMin, quota, ipAllowlist, toolPricing, maxSpendingLimit }

File persistence: When using --state-file, group definitions and key assignments are automatically saved to a *-groups.json file alongside the main state file. Groups survive restarts without needing Redis.

Redis sync: When running with --redis-url, group definitions and key assignments are additionally persisted to Redis and synced across instances via pub/sub. All group CRUD operations and assignment changes propagate in real-time to other PayGate processes.

Horizontal Scaling (Redis)

Enable Redis-backed state for multi-process deployments. Multiple PayGate instances share API keys, credits, and usage data through Redis:

# Single instance with Redis persistence
npx paygate-mcp wrap --server "your-mcp-server" --redis-url "redis://localhost:6379"

# With password and database
npx paygate-mcp wrap --server "your-mcp-server" \
  --redis-url "redis://:mypassword@redis.internal:6379/2"

Or in a config file:

{
  "serverCommand": "your-mcp-server",
  "redisUrl": "redis://localhost:6379"
}

Architecture: Write-Through Cache

PayGate uses a write-through cache pattern so that reads stay in local memory while writes are shared across processes:

Reads — Served from the in-memory KeyStore, with no Redis round-trip
Writes — Propagated to Redis for cross-process shared state
Credit deduction — Uses Redis Lua scripts for atomic check-and-deduct (prevents double-spend across processes)
Periodic sync — Local caches refresh from Redis every 5 seconds as a safety net
Pub/sub notifications — Key mutations and credit changes propagate to all instances in real-time via Redis PUBLISH/SUBSCRIBE (sub-millisecond latency)

This means Gate.evaluate() stays synchronous and fast, while credit operations remain atomic across your entire fleet. The server automatically wires Redis hooks into the gate — every usage event and credit deduction flows to Redis without any code changes. Pub/sub ensures other instances see changes near-instantly (no 5-second wait).

What Gets Synced

State	Redis Key Pattern	Sync Method
API keys	`pg:key:<keyId>` (Hash)	Write-through + pub/sub + periodic refresh
Key registry	`pg:keys` (Set)	Write-through
Credit deduction	`pg:key:<keyId>`	Atomic Lua script + pub/sub broadcast
Credit top-up	`pg:key:<keyId>`	Atomic Lua script + pub/sub broadcast
Admin mutations	`pg:key:<keyId>` (Hash)	Write-through (all admin endpoints)
Rate limiting	`pg:rate:<key>` (Sorted Set)	Atomic Lua (sliding window)
Usage events	`pg:usage` (List)	Fire-and-forget RPUSH
Cross-instance events	`pg:events` (Pub/Sub)	PUBLISH/SUBSCRIBE with inline data

Deployment Pattern

                    ┌──────────────┐
                    │   Redis 7+   │
                    │  ┌────────┐  │
                    │  │pub/sub │  │
                    └──┴───┬────┴──┘
                           │
              ┌────────────┼────────────┐
              │            │            │
        ┌─────┴─────┐ ┌───┴───┐ ┌─────┴─────┐
        │ PayGate 1 │ │  PG 2 │ │ PayGate 3 │
        │ (sub+pub) │ │ (sub) │ │ (sub+pub) │
        └─────┬─────┘ └───┬───┘ └─────┬─────┘
              │            │            │
        ┌─────┴────────────┴────────────┴─────┐
        │          Load Balancer               │
        └──────────────────────────────────────┘

Real-Time Pub/Sub — When one instance creates/revokes a key or changes credits, it publishes an event to the pg:events channel. All other instances receive it instantly and update their local KeyStore without waiting for the 5-second sync. Credit changes include inline data (credits, totalSpent, totalCalls) so receivers skip the Redis roundtrip entirely. Each instance has a unique ID for self-message filtering — no echo loops. If pub/sub fails, the periodic sync continues as a fallback.
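
The self-message filtering and inline-data behavior described above can be sketched without a Redis connection. Everything named here is hypothetical: the event names, the `SyncEvent` shape, and the `LocalKeyCache` class are assumptions chosen to illustrate the idea, not the real message format.

```typescript
// Hypothetical event shape; the real pg:events payload may differ.
interface SyncEvent {
  type: "credits_changed" | "key_revoked";
  instanceId: string; // ID of the publishing instance
  keyId: string;
  credits: number | null; // inline data, so receivers need no extra read
}

class LocalKeyCache {
  readonly instanceId: string;
  readonly credits = new Map<string, number>();

  constructor(instanceId: string) {
    this.instanceId = instanceId;
  }

  // Called for every message received on the shared channel.
  // Returns true if the message changed local state.
  handle(raw: string): boolean {
    const evt = JSON.parse(raw) as SyncEvent;
    if (evt.instanceId === this.instanceId) return false; // ignore our own echo
    if (evt.type === "key_revoked") {
      this.credits.delete(evt.keyId);
    } else if (evt.credits !== null) {
      this.credits.set(evt.keyId, evt.credits);
    }
    return true;
  }
}
```

Since every subscriber receives every message, including the publisher's own, the instance ID check is what prevents an instance from re-applying a change it has already made locally.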

Admin API Sync — All admin HTTP endpoints (create key, revoke, rotate, topup, set ACL, expiry, quota, tags, IP allowlist, spending limit) write through to Redis. Topup and revoke use atomic Lua scripts; other mutations use fire-and-forget HSET to propagate changes across instances immediately.

Distributed Rate Limiting — Rate limits are enforced atomically across all instances using Redis sorted sets with Lua scripts. Each rate check does ZREMRANGEBYSCORE + ZCARD + ZADD in a single round-trip, preventing burst bypass across processes. Fails open (allows) if Redis is temporarily unavailable.
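
The three sorted-set steps map onto a simple sliding-window check. The sketch below is an in-memory analogue for a single process, written to show the logic only; the class name `SlidingWindowLimiter` is hypothetical, and the Redis version would run the same steps inside one Lua script so they are atomic across instances. The fail-open path is not modelled.

```typescript
// In-memory analogue of the sorted-set sliding window (single process only).
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and records it; false if rate limited.
  check(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Step 1 (like ZREMRANGEBYSCORE): drop timestamps that left the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    // Step 2 (like ZCARD): count what remains.
    const allowed = recent.length < this.limit;
    // Step 3 (like ZADD): record this request only if it was allowed.
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

Running the three steps as one unit is what prevents two concurrent requests from both seeing a count just under the limit.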

Persistent Usage Audit Trail — Usage events are appended to a Redis list (RPUSH), creating a shared audit trail visible from any instance. Events survive process restarts and are queryable from the dashboard. Max 100k events with automatic trimming.

Graceful Fallback — If Redis is temporarily unavailable, PayGate falls back to local in-memory operations. On reconnect, state syncs automatically.

Zero Dependencies — The Redis client uses Node.js net.Socket with raw RESP protocol encoding. No ioredis, no redis package — pure built-in networking.

Config File Mode

Load all settings from a JSON file instead of CLI flags:

npx paygate-mcp wrap --config paygate.json

Example paygate.json:

{
  "serverCommand": "npx",
  "serverArgs": ["@modelcontextprotocol/server-filesystem", "/tmp"],
  "port": 3402,
  "defaultCreditsPerCall": 2,
  "globalRateLimitPerMin": 30,
  "webhookUrl": "https://billing.example.com/events",
  "webhookFilters": [
    {
      "name": "production-alerts",
      "events": ["key.created", "key.revoked", "alert.fired"],
      "url": "https://alerts.example.com/webhook",
      "keyPrefixes": ["pk_prod_"]
    }
  ],
  "refundOnFailure": true,
  "stateFile": "~/.paygate/state.json",
  "toolPricing": {
    "premium_analyze": { "creditsPerCall": 10, "creditsPerKbInput": 5 }
  },
  "globalQuota": {
    "dailyCallLimit": 1000,
    "monthlyCreditLimit": 50000
  },
  "oauth": {
    "accessTokenTtl": 3600,
    "scopes": ["tools:*"]
  },
  "redisUrl": "redis://localhost:6379",
  "importKeys": {
    "pg_abc123def456": 500
  }
}

CLI flags override config file values when both are specified.
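
As a small illustration of that precedence rule, the sketch below merges two settings objects so that only flags actually provided on the command line replace file values. The `Settings` type and `mergeSettings` function are hypothetical names for this example; the real loader may work differently.

```typescript
type Settings = Record<string, string | number | boolean | undefined>;

// CLI values win, but a flag that was not passed (undefined) must not erase a file value.
function mergeSettings(fileConfig: Settings, cliFlags: Settings): Settings {
  const merged: Settings = { ...fileConfig };
  for (const key of Object.keys(cliFlags)) {
    const value = cliFlags[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}

const effective = mergeSettings(
  { port: 3402, defaultCreditsPerCall: 2 },
  { port: 3000, defaultCreditsPerCall: undefined },
);
// effective.port is 3000 (from the CLI); effective.defaultCreditsPerCall stays 2 (from the file)
```

A plain object spread would let an unset flag overwrite the file value with `undefined`, which is why the explicit check is used here.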

Config Hot Reload

Reload pricing, rate limits, webhooks, quotas, and behavior flags from your config file without restarting the server:

# Reload from the config file used at startup
curl -X POST http://localhost:3402/config/reload \
  -H "X-Admin-Key: YOUR_ADMIN_KEY"

# One-time reload from a different config file
curl -X POST http://localhost:3402/config/reload \
  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"configPath": "/path/to/updated-config.json"}'

Hot-reloadable fields (take effect immediately):

defaultCreditsPerCall, toolPricing — pricing changes
globalRateLimitPerMin — rate limit adjustment
shadowMode, refundOnFailure — behavior flags
freeMethods — free method list
globalQuota — daily/monthly call and credit limits
webhookUrl, webhookSecret, webhookMaxRetries — webhook infrastructure (rebuilt)
alertRules — alert thresholds and rules

Non-reloadable fields (reported as skipped, require restart):

serverCommand, serverArgs — backend MCP server process
port — listening port
oauth — OAuth 2.1 configuration

Response includes changed fields, skipped fields, and any validation warnings:

{
  "ok": true,
  "changed": ["defaultCreditsPerCall", "globalRateLimitPerMin"],
  "skipped": [],
  "warnings": [],
  "message": "Config reloaded: 2 fields updated"
}

The config file is validated before applying changes — invalid configs are rejected with detailed error messages and zero changes applied.
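
The changed/skipped split in the reload response can be pictured as a diff between the running config and the new file. The sketch below is one possible way to compute it; the `planReload` function is hypothetical, and treating every field outside the hot-reloadable list as skipped is an assumption of this example.

```typescript
// Field names taken from the hot-reloadable list above.
const HOT_RELOADABLE = [
  "defaultCreditsPerCall", "toolPricing", "globalRateLimitPerMin",
  "shadowMode", "refundOnFailure", "freeMethods", "globalQuota",
  "webhookUrl", "webhookSecret", "webhookMaxRetries", "alertRules",
];

interface ReloadPlan {
  changed: string[];
  skipped: string[];
}

type Config = Record<string, unknown>;

// Compares two configs field by field. JSON.stringify is a simple deep-equality
// stand-in here; it is sensitive to object key order.
function planReload(current: Config, next: Config): ReloadPlan {
  const plan: ReloadPlan = { changed: [], skipped: [] };
  const keys = Object.keys(current)
    .concat(Object.keys(next))
    .filter((key, index, all) => all.indexOf(key) === index);
  for (const key of keys) {
    if (JSON.stringify(current[key]) === JSON.stringify(next[key])) continue;
    if (HOT_RELOADABLE.indexOf(key) >= 0) plan.changed.push(key);
    else plan.skipped.push(key); // e.g. port or oauth: requires a restart
  }
  return plan;
}
```

Computing the full plan before touching any live setting is also what makes an all-or-nothing apply straightforward: if validation fails, nothing has been mutated yet.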

Deployment

One-Click Deploy

Deploy PayGate to your preferred cloud platform:

Render:

https://render.com/deploy?repo=https://github.com/walker77/paygate-mcp

Fly.io:

fly launch --image ghcr.io/walker77/paygate-mcp:latest --name my-paygate
fly secrets set PAYGATE_ADMIN_KEY=your-admin-key PAYGATE_REMOTE_URL=https://your-mcp-server.com/mcp

Docker

# Build the image
docker build -t paygate-mcp .

# Run with a remote MCP server
docker run -d \
  -p 3000:3000 \
  -v paygate-data:/data \
  -e PAYGATE_REMOTE_URL="https://my-mcp-server.com/mcp" \
  -e PAYGATE_ADMIN_KEY="your-admin-key" \
  paygate-mcp

# Run with environment variables
docker run -d \
  -p 3000:3000 \
  -e PAYGATE_PORT=3000 \
  -e PAYGATE_REMOTE_URL="https://api.example.com/mcp" \
  -e PAYGATE_DEFAULT_CREDITS=5 \
  -e PAYGATE_RATE_LIMIT=120 \
  -e PAYGATE_WEBHOOK_URL="https://hooks.example.com/paygate" \
  paygate-mcp

Docker Compose (with Redis)

# Set your MCP server URL and start
MCP_REMOTE_URL="https://my-mcp-server.com/mcp" docker-compose up -d

# View logs
docker-compose logs -f paygate

# Check health
curl http://localhost:3000/health

The included docker-compose.yml starts PayGate with Redis for horizontal scaling, state persistence, and distributed rate limiting.

systemd

# /etc/systemd/system/paygate-mcp.service
[Unit]
Description=PayGate MCP Proxy
After=network.target

[Service]
Type=simple
User=paygate
WorkingDirectory=/opt/paygate-mcp
ExecStart=/usr/bin/node dist/cli.js wrap \
  --remote-url "https://my-mcp-server.com/mcp" \
  --port 3000 \
  --state-file /var/lib/paygate/state.json \
  --audit-file /var/log/paygate/audit.jsonl
Restart=always
RestartSec=5
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

sudo systemctl enable paygate-mcp
sudo systemctl start paygate-mcp
sudo journalctl -u paygate-mcp -f

PM2

# Install globally
npm install -g paygate-mcp

# Start with PM2
pm2 start paygate-mcp -- wrap \
  --remote-url "https://my-mcp-server.com/mcp" \
  --port 3000 \
  --state-file ./state.json

# Or use ecosystem file
pm2 start ecosystem.config.js

Production Checklist

Load Testing

A k6 load test script is included for production benchmarking:

# Install k6
brew install k6            # macOS
# or: https://k6.io/docs/getting-started/installation

# Start server (example: echo backend)
npx paygate-mcp wrap -- echo '{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"ok"}]}}' \
  --port 3000 --credits-per-call 1

# Run with admin key (from server startup output)
K6_ADMIN_KEY=admin_xxxx k6 run load-test.js

# Custom VUs and duration
K6_ADMIN_KEY=admin_xxxx k6 run --vus 100 --duration 60s load-test.js

# Against remote deployment
K6_PAYGATE_URL=https://paygate.example.com K6_ADMIN_KEY=admin_xxxx k6 run load-test.js

Scenarios:

mcp_traffic — Simulates agent tool calls (ramp 0→50 VUs over 10s, sustain 30s)
admin_reads — Dashboard/analytics reads (5 constant VUs)
health_checks — Load balancer probes (10 req/s constant rate)

Thresholds:

p(95) response time < 200ms
p(99) response time < 500ms
Error rate < 5%
Request rate > 100 req/s
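
For readers writing their own script, the scenarios and thresholds listed above could be expressed as a k6 options object along the following lines. This is a sketch, not the contents of the bundled load-test.js: the 40s durations for the constant scenarios and the `preAllocatedVUs` value are assumptions, and in a real k6 script the object would be exported as `options`.

```typescript
// Sketch of a k6 options object; in a k6 script this would be `export const options`.
const options = {
  scenarios: {
    // Agent tool calls: ramp 0 to 50 VUs over 10s, then hold for 30s
    mcp_traffic: {
      executor: "ramping-vus",
      startVUs: 0,
      stages: [
        { duration: "10s", target: 50 },
        { duration: "30s", target: 50 },
      ],
    },
    // Dashboard/analytics reads: 5 constant VUs (duration is an assumption)
    admin_reads: { executor: "constant-vus", vus: 5, duration: "40s" },
    // Load balancer probes: 10 requests per second (duration and VU pool are assumptions)
    health_checks: {
      executor: "constant-arrival-rate",
      rate: 10,
      timeUnit: "1s",
      duration: "40s",
      preAllocatedVUs: 5,
    },
  },
  thresholds: {
    http_req_duration: ["p(95)<200", "p(99)<500"], // milliseconds
    http_req_failed: ["rate<0.05"], // error rate below 5%
    http_reqs: ["rate>100"], // more than 100 req/s overall
  },
};
```

k6 marks the run as failed if any threshold expression is not met, which makes these values usable as a pass/fail gate in CI.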

Error Codes

HTTP Status Codes

Code	Meaning	When
`200`	OK	Successful read/update operations
`201`	Created	Key, team, group, or template created
`401`	Unauthorized	Missing or invalid admin key
`402`	Payment Required	Insufficient credits for tool call
`403`	Forbidden	IP not in allowlist, ACL denied
`404`	Not Found	Key, template, group, or resource not found
`405`	Method Not Allowed	Wrong HTTP method for endpoint
`409`	Conflict	Duplicate alias, template name collision
`429`	Too Many Requests	Rate limit exceeded
`503`	Service Unavailable	Maintenance mode or server shutting down

JSON-RPC Error Codes (MCP /mcp endpoint)

Code	Name	Description
`-32402`	`insufficient_credits`	API key has zero credits remaining
`-32402`	`rate_limited`	Request rate exceeds per-key or per-tool limit
`-32402`	`quota_exceeded`	Daily/monthly call or credit quota exceeded
`-32402`	`spending_limit_reached`	Cumulative spend exceeds key spending limit
`-32402`	`key_suspended`	API key is temporarily suspended
`-32402`	`key_expired`	API key TTL has elapsed
`-32402`	`acl_denied`	Tool not in key's ACL whitelist
`-32402`	`ip_not_allowed`	Client IP not in key's allowlist
`-32402`	`invalid_api_key`	X-API-Key header not recognized
`-32402`	`maintenance_mode`	Server in maintenance mode
`-32003`	`circuit_breaker_open`	Backend unavailable, circuit breaker is open
`-32004`	`tool_timeout`	Tool call exceeded configured timeout
`-32600`	`invalid_request`	Malformed JSON-RPC request body
`-32601`	`method_not_found`	Unknown MCP method

Webhook Event Types

Event	Trigger
`key.created`	New API key provisioned
`key.revoked`	API key permanently revoked
`key.suspended`	API key temporarily suspended
`key.resumed`	Suspended key reactivated
`key.rotated`	API key rotated to new value
`key.topup`	Credits added to key
`key.expired`	Key TTL elapsed
`key.expiry_warning`	Key approaching expiry
`credit.transfer`	Credits moved between keys
`credit.auto_topup`	Auto-topup triggered
`usage`	Batched tool call events

Programmatic API

import { PayGateServer } from 'paygate-mcp';

// Wrap a local server (stdio)
const server = new PayGateServer({
  serverCommand: 'npx',
  serverArgs: ['@modelcontextprotocol/server-filesystem', '/tmp'],
  port: 3402,
  defaultCreditsPerCall: 1,
  toolPricing: {
    'premium_analyze': { creditsPerCall: 10 }
  },
});

const { port, adminKey } = await server.start();

// Multi-server mode
const multiServer = new PayGateServer(
  { serverCommand: '', port: 3402, defaultCreditsPerCall: 1 },
  undefined, undefined, undefined, undefined,
  [
    { prefix: 'fs', serverCommand: 'npx', serverArgs: ['@modelcontextprotocol/server-filesystem', '/tmp'] },
    { prefix: 'api', remoteUrl: 'https://my-mcp-server.example.com/mcp' },
  ]
);

// With Redis for horizontal scaling
const redisServer = new PayGateServer(
  { serverCommand: 'npx', serverArgs: ['my-mcp-server'], port: 3402, defaultCreditsPerCall: 1 },
  undefined, undefined, undefined, undefined, undefined,
  'redis://localhost:6379'
);

// Client SDK
import { PayGateClient } from 'paygate-mcp/client';

const client = new PayGateClient({
  url: `http://localhost:${port}`,
  apiKey: 'pg_...',
});

const tools = await client.listTools();
const result = await client.callTool('search', { query: 'hello' });

Security

Cryptographic API key generation (pg_ prefix, 48 hex chars)
Keys masked in list endpoints
Integer-only credits (no float precision attacks)
1MB request body limit
Input sanitization on all endpoints
Admin key never exposed in responses
API keys never forwarded to remote servers (HTTP transport)
Rate limiting is per-key, concurrent-safe
Stripe webhook signature verification (HMAC-SHA256, timing-safe)
Dashboard uses safe DOM methods (textContent/createElement) — no innerHTML
Webhook HMAC-SHA256 signatures with timing-safe verification
Webhook URLs masked in status output
Spending limits enforced with integer arithmetic (no float bypass)
Per-tool ACL enforcement (whitelist + blacklist, sanitized inputs)
Key expiry with fail-closed behavior (expired = denied)
OAuth 2.1 with PKCE (S256) — no implicit grant, no plain challenge
OAuth tokens are opaque hex strings (no JWT data leakage)
Quota counters reset atomically at UTC boundaries
SSE sessions auto-expire (30 min), max 1000 concurrent, max 3 SSE per session
Audit log with retention policies (ring buffer, age-based cleanup)
API keys masked in audit events (only first 7 + last 4 chars visible)
Discovery endpoints (/.well-known/mcp-payment, /pricing) are public but read-only
Team budgets enforce integer arithmetic (no float bypass)
Keys masked in team usage summaries (first 7 + last 4 chars only)
Team quotas reset atomically at UTC day/month boundaries
Redis credit deduction uses Lua scripts for atomic check-and-deduct (no double-spend)
Redis rate limiting uses Lua scripts for atomic check-and-record (no burst bypass)
Redis auth supported via password in URL (redis://:password@host:port)
Graceful Redis fallback — local operations continue if Redis disconnects
Rate limiter fails open on Redis error (allows request, never blocks on network issues)
Pub/sub self-message filtering via unique instance IDs (no echo loops)
Pub/sub subscriber uses a dedicated Redis connection (required by Redis protocol)
Red-teamed with 101 adversarial security tests across 14 passes
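
Two of the properties listed above, key masking (first 7 + last 4 characters visible) and integer-only credits, are easy to picture in code. The helpers below are hypothetical illustrations, not the library's functions; in particular, fully masking very short keys is an assumption of this sketch.

```typescript
// Show only the first 7 and last 4 characters of a key.
function maskKey(key: string): string {
  // Assumption: keys too short to mask meaningfully are hidden entirely.
  if (key.length <= 11) return "*".repeat(key.length);
  return `${key.slice(0, 7)}...${key.slice(-4)}`;
}

// Accept only positive safe integers as credit amounts (rejects floats, NaN, strings).
function isValidCreditAmount(value: unknown): value is number {
  return typeof value === "number" && Number.isSafeInteger(value) && value > 0;
}
```

For example, `maskKey("pg_abc123def456789")` yields `pg_abc1...6789`, and `isValidCreditAmount(0.5)` is rejected before any arithmetic takes place.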

Tested With

PayGate is integration-tested against popular MCP servers from the official @modelcontextprotocol npm scope. These tests wrap real MCP servers via npx, execute tool calls through the PayGate proxy, and verify that auth gating, credit billing, and rate limiting work correctly end-to-end.

MCP Server	Type	Tests	What's Verified
`@modelcontextprotocol/server-everything`	stdio	4	Tool discovery, math tool execution, credit deduction, credit blocking
`@modelcontextprotocol/server-filesystem`	stdio	4	File write/read through gate, credit deduction, credit blocking
`@modelcontextprotocol/server-memory`	stdio	4	Entity CRUD, knowledge graph search, credit deduction, credit blocking
`@modelcontextprotocol/server-sequential-thinking`	stdio	4	Sequential thinking flow, credit deduction, credit blocking

Cross-server tests verify admin endpoints (/health, /keys, /balance) work identically regardless of the wrapped backend. All 16 integration tests pass.

# Run integration tests (requires internet — downloads MCP servers via npx)
npx vitest run tests/real-mcp-servers.test.ts

Current Limitations

No response size limits for HTTP transport — Large responses from remote servers are forwarded as-is.
Redis key metadata syncs on write — Admin mutations write through to Redis immediately; pub/sub delivers near-instant cross-instance updates; periodic sync (5s) serves as a safety net. Credits, rate limits, and usage are always atomic.
SSE sessions are per-instance — Each PayGate instance manages its own SSE connections (HTTP streams can't be serialized to Redis).
