Paygate
Pay-per-tool-call gating proxy for MCP servers. Wrap any MCP server with API key auth, per-tool pricing, rate limiting, and usage metering.
paygate-mcp
Monetize any MCP server with one command. Add API key auth, per-tool pricing, rate limiting, and usage metering to any Model Context Protocol server. Zero dependencies. Zero config. Zero code changes.
Table of Contents
- Quick Start
- What It Does
- Usage: Local stdio, remote HTTP, multi-server, client SDK
- API Reference: All 199+ endpoints
- CLI Options
- Deployment: Docker, docker-compose, systemd, PM2
- Load Testing: k6 benchmarking for production
- Error Codes: Complete error code reference
- Feature Reference: Detailed docs for every feature
- Storage & Billing · Stripe · ACL · Rate Limits
- Key Management · Webhooks · OAuth 2.1 · SSE
- Analytics (64 endpoints) · Teams · Redis Scaling
- Plugins · Groups · Namespaces
- Programmatic API
- Security
- Tested With: Verified against popular MCP servers
- Current Limitations
- Roadmap
- Requirements
- License
Quick Start
# Interactive setup wizard (generates paygate.json)
npx paygate-mcp init
# Or wrap directly with CLI flags
npx paygate-mcp wrap --server "npx @modelcontextprotocol/server-filesystem /tmp"
# Gate a remote MCP server (Streamable HTTP transport)
npx paygate-mcp wrap --remote-url "https://my-server.example.com/mcp" --price 5
That's it. Your MCP server is now gated behind API keys with credit-based billing.
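Once the gate is running, a client calls tools by sending a JSON-RPC `tools/call` request to the `/mcp` endpoint with its key in the `X-API-Key` header. The following is a minimal sketch of building such a request, not the bundled client SDK. The base URL, port, key value, and the `buildToolCall` helper are illustrative assumptions; `read_file` is one of the tools exposed by the filesystem server wrapped above.

```typescript
// Hypothetical values: the base URL/port and the API key are placeholders,
// not defaults documented by paygate-mcp.
interface GatedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build a JSON-RPC 2.0 `tools/call` request that carries the API key in the
// X-API-Key header and targets the gateway's /mcp endpoint.
function buildToolCall(
  baseUrl: string,
  apiKey: string,
  tool: string,
  args: Record<string, unknown>,
  id: number = 1,
): GatedRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`, // strip trailing slashes first
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP clients accept both plain JSON and SSE responses.
        "Accept": "application/json, text/event-stream",
        "X-API-Key": apiKey,
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: tool, arguments: args },
      }),
    },
  };
}

const req = buildToolCall(
  "http://localhost:3000/",
  "pg_example_key",
  "read_file",
  { path: "/tmp/notes.txt" },
);
console.log(req.url);
console.log(req.init.body);
```

To send the request on Node 18 or later, pass the two parts to the built-in `fetch`: `await fetch(req.url, req.init)`.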
What It Does
PayGate sits between AI agents and your MCP server:
Agent → PayGate (auth + billing) → Your MCP Server (stdio or HTTP)
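The auth and billing step in this flow can be pictured as a check that runs before a call is forwarded. The sketch below is an illustrative in-memory model under stated assumptions, not PayGate's actual implementation: the `KeyRecord` shape, the `gate` function, and the `invalid_api_key` reason are hypothetical, while `insufficient_credits` is one of the deny reasons the analytics endpoints report.

```typescript
// Hypothetical in-memory model of the gate; names and shapes are illustrative only.
interface KeyRecord {
  credits: number;
  active: boolean;
}

type GateResult =
  | { allowed: true; creditsRemaining: number }
  | { allowed: false; reason: "invalid_api_key" | "insufficient_credits" };

// Decide whether a tool call may be forwarded, charging `price` credits if so.
function gate(
  keys: Map<string, KeyRecord>,
  apiKey: string | undefined,
  price: number,
): GateResult {
  const record = apiKey !== undefined ? keys.get(apiKey) : undefined;
  if (!record || !record.active) {
    return { allowed: false, reason: "invalid_api_key" };
  }
  if (record.credits < price) {
    return { allowed: false, reason: "insufficient_credits" };
  }
  record.credits -= price; // deduct before the call is forwarded
  return { allowed: true, creditsRemaining: record.credits };
}

// A key with 12 credits and a price of 5 credits per call.
const demoKeys = new Map<string, KeyRecord>([
  ["pg_demo", { credits: 12, active: true }],
]);
const firstCall = gate(demoKeys, "pg_demo", 5);
const secondCall = gate(demoKeys, "pg_demo", 5);
const thirdCall = gate(demoKeys, "pg_demo", 5);
const anonymousCall = gate(demoKeys, undefined, 5);
console.log(firstCall, secondCall, thirdCall, anonymousCall);
```

With 12 credits and a price of 5, the first two calls are allowed and the third is denied because only 2 credits remain. A real gate would also apply rate limits, ACLs, and quotas, and would deduct credits atomically when several processes share state.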
- API Key Auth: Clients need a valid `X-API-Key` to call tools
- Credit Billing: Each tool call costs credits (configurable per-tool)
- Rate Limiting: Sliding window per-key rate limits + per-tool rate limits
- Usage Metering: Track who called what, when, and how much they spent
- Multi-Server Mode: Wrap N MCP servers behind one PayGate with tool prefix routing
- Client SDK: `PayGateClient` with auto 402 retry, balance tracking, and typed errors
- Two Transports: Wrap local servers via stdio or remote servers via Streamable HTTP
- Per-Tool ACL: Whitelist/blacklist tools per API key (enterprise access control)
- Per-Tool Rate Limits: Independent rate limits per tool, not just global
- Key Expiry (TTL): Auto-expire API keys after a set time
- Spending Limits: Cap total spend per API key to prevent runaway costs
- Usage Quotas: Daily/monthly call and credit limits per key (with UTC auto-reset)
- Dynamic Pricing: Charge extra credits based on input size (`creditsPerKbInput`)
- OAuth 2.1: Full authorization server with PKCE, client registration, Bearer tokens
- SSE Streaming: Full MCP Streamable HTTP transport (POST SSE, GET notifications, DELETE sessions)
- Audit Log: Structured audit trail with retention policies, query API, CSV/JSON export
- Registry/Discovery: Agent-discoverable pricing via `/.well-known/mcp-payment`, `/pricing`, and `/.well-known/mcp.json` identity card
- OpenAPI 3.1 + Interactive Docs: Auto-generated spec at `/openapi.json`, Swagger UI at `/docs`; all 199+ endpoints documented
- Public Endpoint Rate Limiting: Configurable per-IP rate limit (default 300/min) on `/health`, `/info`, `/pricing`, `/docs`, `/openapi.json`, `/.well-known/*`, `/robots.txt`, `/`; returns 429 with a `Retry-After` header
- Robots.txt + HEAD Support: Standard `/robots.txt` (allow public, disallow admin/keys), HEAD method on all public endpoints for uptime monitoring
- Prometheus Metrics: `/metrics` endpoint with counters, gauges, and uptime in standard text format
- Key Rotation: Rotate API keys without losing credits, ACLs, or quotas
- Rate Limit Headers: `X-RateLimit-*` and `X-Credits-Remaining` on every `/mcp` response
- Webhook Signatures: HMAC-SHA256 signed webhook payloads (`X-PayGate-Signature`) for tamper-proof delivery
- Admin Lifecycle Events: Webhook notifications for key.created, key.revoked, key.rotated, key.topup
- IP Allowlisting: Restrict API keys to specific IPs or CIDR ranges (IPv4)
- Key Tags/Metadata: Attach arbitrary key-value tags to API keys for external system integration
- Usage Analytics: Time-series analytics API with tool breakdown, top consumers, and trend comparison
- Alert Webhooks: Configurable alerts for spending thresholds, low credits, quota warnings, key expiry, rate limit spikes
- Team Management: Group API keys into teams with shared budgets, quotas, and usage tracking
- Horizontal Scaling (Redis): Redis-backed state for multi-process deployments with atomic credit deduction, distributed rate limiting, persistent usage audit trail, real-time pub/sub notifications, and admin API sync
- Webhook Retry Queue: Exponential backoff retry (1s, 2s, 4s...) with dead letter queue for permanently failed deliveries, admin API for monitoring, clearing, and replaying
- Admin Dashboard v2: Tabbed web dashboard at `/dashboard` with overview, keys management (create/suspend/resume/revoke/top-up), analytics (credit flow, deny reasons, top consumers, webhook health), and system status; all data via safe DOM methods, 30s auto-refresh
- Self-Service Portal: API key holder portal at `/portal`; check credits, usage, rate limits, available tools, and recent activity without admin access; includes Buy Credits UI, credit history with spending velocity, usage alerts, and self-service key rotation
- Stripe Checkout: Self-service credit purchases via Stripe Checkout Sessions; `POST /stripe/checkout` creates a session, `GET /stripe/packages` lists available packages; zero-dependency implementation using Node.js `https`, auto-tops-up credits via webhook
- State Backup & Restore: `GET /admin/backup` exports full server state (keys, teams, groups, webhooks) as versioned JSON with SHA-256 checksum; `POST /admin/restore` imports with merge/overwrite/full modes and integrity verification
- API Version Header: `X-PayGate-Version` header on every HTTP response for client version tracking, exposed via CORS
- Readiness Probe: `GET /ready` returns 200/503 based on operational state (not draining, not maintenance, backend connected); separate from the `/health` liveness probe, ideal for Kubernetes
- Health Check + Graceful Shutdown: `GET /health` public endpoint with status, uptime, version, in-flight requests, Redis & webhook stats; `gracefulStop()` drains in-flight requests before teardown
- Config Validation + Dry Run: `paygate-mcp validate --config paygate.json` catches misconfigurations before starting; `--dry-run` discovers tools, prints pricing table, then exits
- Batch Tool Calls: `tools/call_batch` method for calling multiple tools in one request with all-or-nothing billing, aggregate credit checks, and parallel execution
- Multi-Tenant Namespaces: Isolate API keys and usage data by tenant with namespace-filtered admin endpoints, analytics, and usage export
- Scoped Tokens: Issue short-lived `pgt_` tokens scoped to specific tools with auto-expiry (max 24h), HMAC-SHA256 signed, zero server-side state
- Token Revocation List: Revoke scoped tokens before expiry with O(1) lookup, auto-cleanup, Redis cross-instance sync, and admin API
- Usage-Based Auto-Topup: Automatically add credits when balance drops below a threshold with configurable daily limits, audit trail, webhook events, and Redis sync
- Admin API Key Management: Multiple admin keys with role-based permissions (super_admin, admin, viewer), file persistence, audit trail, and safety guards
- Plugin System: Extensible middleware hooks for custom billing logic, request/response transformation, custom endpoints, and lifecycle management
- Key Groups: Policy templates that apply shared ACL, rate limits, pricing overrides, IP allowlists, and quotas to groups of API keys with automatic inheritance and key-level override support
- Refund on Failure: Automatically refund credits when downstream tool calls fail
- Credit Transfers: Atomically transfer credits between API keys with validation, audit trail, and webhook events
- Bulk Key Operations: Execute multiple key operations (create, topup, revoke, suspend, resume) in a single request with per-operation error handling and index tracking
- Key Import/Export: Export all API keys for backup/migration (JSON or CSV) and import with conflict resolution (skip, overwrite, error modes)
- Webhook Filters: Route webhook events to different destinations based on event type and API key prefix with per-filter secrets, independent retry queues, and admin CRUD API
- Key Cloning: `POST /keys/clone` creates a new API key with the same config (ACL, quotas, tags, IP, namespace, group, spending limit, expiry, auto-topup) but fresh counters; ideal for provisioning similar keys
- Key Suspension: Temporarily disable API keys without revoking them; suspended keys are denied at the gate but can be resumed, and admin operations (topup, ACL, etc.) still work on suspended keys
- Per-Key Usage: `GET /keys/usage?key=...` returns detailed usage breakdown for a specific key: per-tool stats, hourly time-series, deny reasons, recent events, and key metadata
- Webhook Test: `POST /webhooks/test` sends a test event to your configured webhook URL with synchronous response including status code, response time, and delivery success/failure; verifies webhook connectivity without generating real events
- Webhook Delivery Log: `GET /webhooks/log` returns a queryable log of all webhook delivery attempts with timestamps, HTTP status codes, response times, success/failure, retry attempts, event counts, and event types; filter by success status, time range, and limit
- Webhook Pause/Resume: `POST /webhooks/pause` and `POST /webhooks/resume` temporarily halt webhook delivery during maintenance; events are buffered (not lost) and flushed on resume, with pause state visible in `/webhooks/stats`
- Key Aliases: `POST /keys/alias` assigns human-readable aliases (e.g. `my-service`, `prod-backend`) to API keys; use aliases in any admin endpoint (topup, revoke, suspend, resume, clone, transfer, usage) instead of opaque key IDs, with uniqueness enforcement, format validation, state file persistence, and audit trail
- Key Expiry Scanner: Proactive background scanner that detects expiring API keys before they expire; configurable scan interval and notification thresholds (default: 7d, 24h, 1h), de-duplicated `key.expiry_warning` webhook events, audit trail, `GET /keys/expiring?within=86400` query endpoint, and graceful shutdown
- Key Templates: Named templates for API key creation; define reusable presets (credits, ACL, quotas, IP, tags, namespace, expiry TTL, spending limit, auto-topup) and create keys with `template: "free-tier"`; explicit params override template defaults, CRUD admin API, Prometheus gauge, file persistence, max 100 templates
- Environment Variables Config: Configure everything via `PAYGATE_*` env vars for Docker/K8s deployments; 18 env vars covering all CLI flags, with priority: CLI flags > env vars > config file > defaults, `PAYGATE_CONFIG` loads config file path, help text with Docker examples
- Request ID Tracking: Every HTTP response includes `X-Request-Id` header (auto-generated `req_` prefix + 16 hex chars) for distributed tracing; propagates incoming `X-Request-Id` from load balancers/proxies, included in gate audit log metadata, CORS-exposed, available via `getRequestId(req)` helper
- Server Info Endpoint: `GET /info` returns server capabilities, enabled features, auth methods, pricing summary, rate limits, and available endpoints; public, no admin key required, ideal for agent auto-discovery and debugging
- Configurable CORS: Control which origins can access your server: single origin, multiple origins, or wildcard (`*` default), with credentials support, configurable preflight max-age, and `Vary: Origin` for proper caching; set via config file `cors` object, `--cors-origin` CLI flag, or `PAYGATE_CORS_ORIGIN` env var
- Custom Response Headers: Add security headers (`X-Frame-Options`, `X-Content-Type-Options`, etc.), cache control, or any custom headers to all HTTP responses; set via config file `customHeaders` object, `--header` CLI flag, or `PAYGATE_CUSTOM_HEADERS` env var
- Config Export: `GET /config` returns the running server configuration with sensitive values masked (webhook secrets → `***`, server commands → `***`, webhook URLs → scheme+host only); admin auth required, includes audit trail
- Trusted Proxies: Configure trusted proxy IPs/CIDRs for accurate `X-Forwarded-For` extraction; walks the header right-to-left, skipping trusted proxies to find the real client IP, supports exact IPs and CIDR ranges (IPv4), backward compatible (first IP) when not configured
- Key Listing Pagination: Enhanced
`GET /keys` with cursor-based pagination (`limit`/`offset`), sorting (`sortBy`/`order`), and filtering by namespace, group, active/suspended/expired status, name prefix, and credit range; backward compatible (returns flat array when no pagination params used)
- Key Statistics: `GET /keys/stats` returns aggregate statistics across all keys: total/active/suspended/expired/revoked counts, credit aggregates (allocated/spent/remaining), total calls, namespace and group breakdowns, optional `?namespace=` filter
- Rate Limit Status: `GET /keys/rate-limit-status?key=...` returns the current rate limit window state for any key: global calls used/remaining/reset time, per-tool rate limits with individual usage, read-only (doesn't consume a call)
- Quota Status: `GET /keys/quota-status?key=...` returns daily/monthly quota usage for any key: calls and credits used/remaining/limits, reset periods, quota source (per-key vs global vs none)
- Credit History: `GET /keys/credit-history?key=...` returns per-key credit mutation log; tracks initial allocation, topups, transfers (in/out), auto-topups, with type/limit/since filters, balance-before/after on every entry, newest-first ordering, capped at 100 entries per key
- Spending Velocity: `GET /keys/spending-velocity?key=...` returns credit burn rate and depletion forecast: credits/calls per hour/day, estimated depletion date, top tools by spend, configurable analysis window (1h–30d)
- Key Comparison: `GET /keys/compare?keys=pg_a,pg_b` returns side-by-side comparison of 2–10 keys: credits, usage, velocity, rate limits, status, metadata (namespace/group/tags); with not-found key reporting
- Key Health Score: `GET /keys/health?key=...` returns composite health score (0–100) with weighted component breakdown: balance health (30%), quota utilization (25%), rate limit pressure (20%), error rate (25%); status levels (healthy/good/caution/warning/critical), key issue detection (revoked/suspended/expired/expiring/zero credits), alias support
- Maintenance Mode: `POST /maintenance` enables/disables maintenance mode with custom message; `/mcp` returns 503 to clients while admin endpoints stay operational, `GET /maintenance` checks status, `GET /health` reflects maintenance state, full audit trail
- Admin Event Stream: `GET /admin/events` SSE endpoint streams real-time audit events to admin clients: tool calls, denials, key operations, maintenance changes, all with optional `?types=` filter for event type filtering, keepalive pings, multi-client support
- Key Notes: `POST /keys/notes` adds timestamped notes to API keys, `GET /keys/notes?key=...` lists notes, `DELETE /keys/notes?key=...&index=N` removes notes; max 50 per key, 1000 char limit, works on suspended/revoked keys, alias support, audit trail
- Scheduled Actions: `POST /keys/schedule` creates future-dated actions (revoke/suspend/topup) on API keys, `GET /keys/schedule` lists pending schedules with optional `?key=` filter, `DELETE /keys/schedule?id=...` cancels a schedule; max 20 per key, alias support, background execution timer, audit trail
- Key Activity Timeline: `GET /keys/activity?key=...` returns a unified chronological feed of audit events and usage events for a specific key; newest first, optional `?since=` and `?limit=` filters, alias support
- Credit Reservations: `POST /keys/reserve` holds credits, `POST /keys/reserve/commit` deducts held credits, `POST /keys/reserve/release` frees the hold, `GET /keys/reserve` lists active reservations; prevents overcommit, configurable TTL (10s–1h), max 50 per key, auto-expiry, audit trail
- Request Log: `GET /requests` queryable log of every tool call with timing, credits charged, status (allowed/denied), deny reason, key, and request ID; filter by key/tool/status/since, pagination, summary statistics (totals + avg duration), 5000-entry ring buffer
- Tool Stats:
`GET /tools/stats` per-tool analytics: call counts, success rate, avg/p95 latency, credits consumed, deny reason breakdown, top 10 consumers; optional `?tool=` for detailed single-tool view, `?since=` filter
- Request Log Export: `GET /requests/export` exports the full request log as JSON or CSV with Content-Disposition headers; filter by key/tool/status/since/until, combined time-window queries, no pagination limit
- Tool Call Dry Run: `POST /requests/dry-run` simulates a tool call without executing; checks key validity, ACL, rate limits, credits, and spending limits, returns predicted outcome with credits-after calculation and rate limit status
- Batch Dry Run: `POST /requests/dry-run/batch` simulates multiple tool calls at once: aggregate credit check, per-tool ACL validation, spending limit, returns per-tool results with total credits required and credits-after
- Tool Availability: `GET /tools/available?key=...` returns per-key tool availability with pricing, affordability (`canAfford`), ACL enforcement (`accessible`/`denyReason`), and per-tool + global rate limit status
- Key Dashboard: `GET /keys/dashboard?key=...` consolidated single-endpoint view with metadata, balance, health score, spending velocity, rate limits, quotas, usage summary, and recent activity timeline
- Admin Notifications: `GET /admin/notifications` scans all keys for actionable issues: expired/expiring keys, zero credits, credit depletion velocity, suspended keys, high error rates, and rate limit pressure; with severity filtering and priority sorting
- System Dashboard: `GET /admin/dashboard` system-wide overview with key counts (active/suspended/revoked/expired), credit summary (allocated/spent/remaining), usage breakdown with deny reasons, top consumers, top tools, notification counts, and uptime
- Key Lifecycle Report: `GET /admin/lifecycle` aggregated lifecycle trends with daily creation/revocation/suspension buckets, average key lifetime, and at-risk keys (expiring, expired, zero credits)
- Cost Analysis: `GET /admin/costs` cost-centric view with per-tool and per-namespace cost breakdowns, hourly spending trends, top spenders, average cost per call, and namespace filtering
- Rate Limit Analysis: `GET /admin/rate-limits` rate limit utilization analysis with per-key and per-tool breakdown, denial trends, most throttled keys, and current window utilization
- Quota Analysis: `GET /admin/quotas` quota utilization analysis with per-key daily/monthly usage vs limits, per-tool denial breakdown, most constrained keys, and global/per-key quota source tracking
- Denial Analysis: `GET /admin/denials` comprehensive denial breakdown by reason type (insufficient_credits, rate_limited, quota_exceeded, key_suspended, etc.) with per-key and per-tool stats, hourly trends, and most denied keys
- Traffic Analysis: `GET /admin/traffic` request volume analysis with tool popularity, hourly volume, top consumers by call count, namespace breakdown, peak hour identification, and success rates
- Response Caching: SHA-256 keyed response cache for identical tool calls; skips backend invocation and credit deduction on cache hit, LRU eviction, per-tool or global TTL,
`X-Cache: HIT/MISS` header, admin management (`GET/DELETE /admin/cache`), Prometheus gauge
- Circuit Breaker: Three-state circuit breaker (closed → open → half_open) for backend failure detection; opens after N consecutive failures, auto-recovers after cooldown, error code `-32003`, admin management (`GET/POST /admin/circuit`)
- Configurable Timeouts: Per-tool and global timeout for tool calls; returns error code `-32004` on timeout, per-tool override via `toolPricing[tool].timeoutMs`, triggers circuit breaker failure recording
- Outcome-Based Pricing: Charge extra credits based on response output size; `creditsPerKbOutput` per-tool config, post-response billing, `X-Output-Surcharge` header, complements `creditsPerKbInput` for complete size-based pricing
- Compliance Audit Export: Framework-specific compliance reports for SOC 2, GDPR, HIPAA; `GET /admin/compliance/export`, event classification into access control/data processing/config changes/security, JSON or CSV export, configurable time periods
- Per-Key Webhook URLs: Key-level webhook routing; events for a specific key sent to key's webhook URL alongside global webhook, SSRF-protected, HMAC-SHA256 signed, lazy emitter management via `POST/GET/DELETE /keys/webhook`
- Security Audit: `GET /admin/security` security posture analysis identifying keys without IP allowlists, quotas, ACL restrictions, spending limits, or expiry dates, flagging high-credit keys, and computing a composite security score
- Revenue Analysis: `GET /admin/revenue` revenue metrics with per-tool revenue breakdown, per-key spending, hourly revenue trends, credit flow summary (allocated/spent/remaining), and average revenue per call
- Key Portfolio Health: `GET /admin/key-portfolio` portfolio-wide key health with active/inactive/suspended counts, stale keys, expiring-soon keys, age distribution, credit utilization, and namespace breakdown
- Content Guardrails: Regex-based PII detection and redaction for tool call inputs/outputs; 8 built-in rules (credit card, SSN, email, phone, AWS key, API secret, IBAN, passport), 4 actions (log/warn/block/redact), scope filtering (input/output/both), per-tool targeting, violation tracking with query API, admin CRUD endpoints (`/admin/guardrails`, `/admin/guardrails/violations`)
- IP Country Restrictions: Per-key geographic access control with allow/deny country lists (ISO 3166-1 alpha-2); country code from reverse-proxy headers (`X-Country`, `CF-IPCountry`, configurable), CRUD via `/keys/geo`, enforced at gate evaluation, zero-dependency geo-fencing
- Bulk Suspend/Resume: Added `suspend` and `resume` actions to `POST /keys/bulk`; temporarily disable or re-activate multiple keys in one request with per-operation error handling
- Concurrency Limiter: Per-key and per-tool inflight request caps; distinct from rate limiting, limits simultaneous active requests to protect backends from burst parallelism, error code `-32005` with `Retry-After` header, runtime-adjustable via `GET/POST /admin/concurrency`
- Traffic Mirroring: Fire-and-forget request duplication to a shadow backend for A/B testing MCP server versions; percentage-based sampling, configurable timeout, zero impact on primary response path, stats/management via `GET/POST/DELETE /admin/mirror`
- Tool Aliasing + Deprecation: Tool renaming with RFC 8594 compliance; map old tool names to new ones with `Deprecation`, `Sunset`, and `Link` headers, chain prevention, per-alias call counts, CRUD via `GET/POST/DELETE /admin/tool-aliases`
- Usage Plans: Tiered key policies (free/pro/enterprise); bundle rate limits, quotas, credit multipliers, and tool ACL into reusable templates, assign keys to plans via
`POST /admin/keys/plan`, denied tools rejected with error code `-32403`, CRUD via `GET/POST/DELETE /admin/plans`
- Tool Input Schema Validation: Per-tool JSON Schema validation at the gateway; register schemas to reject invalid payloads before they reach downstream, zero-dependency JSON Schema subset (type, required, enum, minLength, pattern, items), error code `-32602` with detailed errors, manage via `GET/POST/DELETE /admin/tools/schema`
- Canary Routing: Weighted traffic splitting between primary and canary MCP servers; enable zero-downtime upgrades with percentage-based routing (0-100%), unbiased `crypto.randomInt` decisions, per-backend call/error tracking, weight updates without restart, manage via `GET/POST/DELETE /admin/canary`
- Request/Response Transforms: Declarative rewriting of tool call arguments and responses; inject defaults, strip fields, rename keys, and template `{{variables}}` from context, wildcard tool matching, priority ordering, deep clone on apply, import/export for backup, manage via `GET/POST/PUT/DELETE /admin/transforms`
- Backend Retry Policy: Automatic retry with exponential backoff for transient failures; configurable max retries, base/max backoff, full jitter, retry budget (max % of traffic as retries with cold-start grace), per-tool stats, retryable error pattern matching, manage via `GET/POST /admin/retry-policy`
- Adaptive Rate Limiting: Dynamic rate adjustment based on key behavior; auto-tighten for high error rates, auto-boost for good actors, cooldown periods, configurable thresholds, per-key behavior tracking, LRU eviction, batch evaluation, manage via `GET/POST /admin/adaptive-rates`
- Request Deduplication: Idempotency layer preventing duplicate billing from agent retries; `X-Idempotency-Key` header with auto-generation fallback (SHA-256), in-flight request coalescing, configurable TTL window, LRU eviction, credits-saved tracking, manage via `GET/POST/DELETE /admin/dedup`
- Priority Queue: Tiered request prioritization (critical/high/normal/low/background) with fair scheduling; per-key priority assignment, configurable max wait times per tier, starvation prevention via automatic promotion, max queue depth limiting, manage via `GET/POST /admin/priority-queue`
- Cost Allocation Tags: Per-request cost attribution via `X-Cost-Tags` header (JSON) for enterprise chargeback; aggregated reports by any tag dimension, cross-tabulation, CSV export, required tag enforcement per key, cardinality limits, manage via `GET/POST/DELETE /admin/cost-tags`
- IP Access Control: Fine-grained IP-based access control with CIDR notation support; global allow/deny lists, per-key IP binding, automatic blocking after configurable violation thresholds, X-Forwarded-For/X-Real-IP trusted proxy depth, IPv6-mapped IPv4 normalization, manage via `GET/POST/DELETE /admin/ip-access`
- Request Signing (HMAC-SHA256): Cryptographic request authentication with replay protection; `X-Signature: t=<ts>,n=<nonce>,s=<sig>` header, timestamp tolerance with nonce dedup, per-key signing secrets with rotation, timing-safe comparison, manage via `GET/POST/DELETE /admin/signing`
- Multi-Tenant Isolation: Full tenant isolation for platform operators; per-tenant rate limits, credit pools, usage tracking, API key binding, tenant suspension/activation, cross-tenant reporting, configurable limits (10K tenants, 1K keys/tenant), manage via `GET/POST/DELETE /admin/tenants`
- Request Tracing: End-to-end structured tracing with span recording at gate, backend, and transform stages; trace/request ID lookup, timing breakdown (`gateMs`/`backendMs`/`transformMs`), configurable sample rate, retention limits, P95 latency tracking, JSON export, manage via `GET/POST/DELETE /admin/tracing`
- Budget Policy Engine: Burn rate monitoring with progressive throttling; daily/monthly budget enforcement, credits/minute burn rate tracking over configurable windows, three actions (alert/throttle/deny), per-namespace and per-key targeting, budget remaining forecast, automatic daily/monthly reset, manage via `GET/POST/DELETE /admin/budget-policies`
- Tool Dependency Graph: DAG-based workflow validation; register tool dependencies, enforce execution order, failure propagation (upstream failure blocks downstream), topological sort, cycle detection, per-workflow execution tracking, hard vs soft dependencies, group scoping, manage via
`GET/POST/DELETE /admin/tool-deps`
- Quota Management: Granular daily/weekly/monthly hard caps per API key; per-tool or global quotas, calls or credits metric, burst allowance (temporary over-limit percentage), three overage actions (deny/warn/throttle), UTC-based period boundaries (daily midnight, weekly Monday, monthly 1st), automatic period rollover, manage via `GET/POST/DELETE /admin/quota-rules`
- Webhook Replay (DLQ): Dead letter queue management for failed webhook deliveries; record failures with full request context (URL, headers, body, HMAC signature), replay individual or bulk failed deliveries, status tracking (pending → retrying → succeeded/exhausted), configurable max retries with timeout, purge by ID or status, age-based expiry, manage via `GET/POST/DELETE /admin/webhook-replay`
- Config Profiles: Named configuration presets with save/activate/rollback; profile inheritance chains (base → child merging), SHA-256 checksums, flat-key diffing for comparison (onlyInA/onlyInB/changed/unchanged), import/export as JSON with merge or replace mode, activation history, circular inheritance detection, manage via `GET/POST/DELETE /admin/config-profiles`
- Scheduled Reports: Automated periodic usage, billing, compliance, and security reports delivered via webhook; daily/weekly/monthly frequency with UTC period bounds, HMAC-SHA256 signed payloads, namespace/group/tool/key filters, report generation with delivery tracking, configurable timeouts, manage via `POST /admin/scheduled-reports`
- Approval Workflows: Pre-execution approval gates for high-cost or sensitive tool calls; three conditions (cost_threshold, tool_match with glob, key_match with prefix), pending requests with configurable TTL (default 1h), approve/deny/expire lifecycle, trigger counting, manage via `POST /admin/approval-workflows`
- Gateway Hooks: Pre/post request lifecycle hooks for custom logic; three stages (pre_gate, pre_backend, post_backend), four types (log, header_inject, metadata_tag, reject), priority-based execution pipeline, tool/key glob filtering, reject short-circuits processing, execution counting, manage via `POST /admin/gateway-hooks`
- Anomaly Detection: `GET /admin/anomalies` identifies unusual patterns: keys with high denial rates, rapid credit depletion, low remaining credits, with severity ratings and detailed descriptions
- Usage Forecasting: `GET /admin/forecast` predicts future credit consumption with per-key depletion estimates, calls remaining, at-risk key identification, system-wide consumption aggregates, and per-tool cost breakdown
- Compliance Report: `GET /admin/compliance` generates compliance-ready report with key governance (expiry coverage), access control (ACL/IP/spending limit coverage), audit trail completeness, weighted overall score, and actionable recommendations
- SLA Monitoring: `GET /admin/sla` tracks service level metrics: success rates, denial breakdowns by reason, per-tool availability and error rates, uptime tracking, sorted by call volume
- Capacity Planning: `GET /admin/capacity` system capacity analysis with credit burn rates, utilization percentages, top consumers, per-namespace breakdown, and scaling recommendations
- Key Dependency Map: `GET /admin/dependencies` tool-to-key relationship map with tool usage popularity, unique key counts per tool, per-key tool lists, and used/unused tool identification
- Tool Latency Analysis: `GET /admin/latency` per-tool response time metrics with avg/p95/min/max durations, slowest tools ranking, and per-key latency breakdown
- Error Rate Trends: `GET /admin/error-trends` denial rate trends with per-tool error rates, denial reason breakdown, worst-performing tools, and trend direction
- Credit Flow Analysis:
`GET /admin/credit-flow` credit inflow/outflow analysis with utilization percentage, top spenders, and per-tool spend breakdown
- Key Age Analysis: `GET /admin/key-age` key age distribution with oldest/newest keys, age buckets (24h/7d/30d/older), and recently created list
- Namespace Usage Summary: `GET /admin/namespace-usage` per-namespace usage metrics with credit allocation, spending, call counts, and cross-namespace comparison
- Audit Summary: `GET /admin/audit-summary` audit event analytics with type breakdown, top actors, recent events, and activity summary
- Group Performance: `GET /admin/group-performance` per-group analytics with key counts, credit allocation/spending, call volume, utilization, and policy summary
- Request Volume Trends: `GET /admin/request-trends` hourly time-series of request volume, success/failure counts, credit spend, avg duration, and peak hour identification
- Key Status Overview: `GET /admin/key-status` key status dashboard with active/suspended/revoked/expired counts and keys needing attention (low credits, near expiry)
- Webhook Health: `GET /admin/webhook-health` webhook delivery health overview with success rate, pending retries, dead letter count, pause status, and buffered events
- Consumer Insights: `GET /admin/consumer-insights` per-key behavioral analytics with top spenders, most active callers, tool diversity, and spending patterns
- System Health Score: `GET /admin/system-health` composite 0-100 health score with weighted component breakdowns for key health, error rates, and credit utilization
- Tool Adoption: `GET /admin/tool-adoption` per-tool adoption metrics with unique consumers, adoption rate, first/last seen timestamps, and usage ranking
- Credit Efficiency: `GET /admin/credit-efficiency` credit allocation efficiency with burn efficiency, waste ratio, over-provisioned and under-provisioned key detection
- Access Heatmap: `GET /admin/access-heatmap` hourly access patterns with tool breakdown, unique consumers, and peak hour identification
- Key Churn Analysis: `GET /admin/key-churn` key churn metrics with creation/revocation rates, churn and retention percentages, and never-used key detection
- Tool Correlation: `GET /admin/tool-correlation` tool co-occurrence analysis showing which tools are commonly used together by the same consumers
- Consumer Segmentation: `GET /admin/consumer-segmentation` classifies API key consumers into power/regular/casual/dormant segments with per-segment metrics
- Credit Distribution: `GET /admin/credit-distribution` histogram of credit balances across active keys with bucket ranges and median calculation
- Response Time Distribution: `GET /admin/response-time-distribution` histogram of response times with latency buckets and p50/p95/p99 percentiles
- Consumer Lifetime Value:
GET /admin/consumer-lifetime-valueper-consumer spend analysis with value tiers, tool diversity, and top spender rankings - Tool Revenue Ranking β
GET /admin/tool-revenueranks tools by total credits consumed with call counts, unique consumers, and percentage breakdown - Consumer Retention Cohorts β
GET /admin/consumer-retentiongroups consumers by creation date with retention rates and avg spend per cohort - Error Breakdown β
GET /admin/error-breakdowncategorizes denied requests by reason with counts, percentages, affected consumers, and error rate - Credit Utilization Rate β
GET /admin/credit-utilizationshows utilization percentage across active keys with utilization bands and over-provisioning detection - Namespace Revenue β
GET /admin/namespace-revenuerevenue breakdown by namespace with spend, call counts, key counts, and percentage breakdown - Group Revenue β
GET /admin/group-revenuerevenue breakdown by key group with spend, call counts, key counts, and percentage breakdown - Peak Usage Times β
GET /admin/peak-usagetraffic patterns by hour-of-day with request counts, credits, unique consumers, and peak hour identification - Consumer Activity β
GET /admin/consumer-activityper-consumer activity metrics with calls, spend, credits remaining, last active time, and active/inactive status - Tool Popularity β
GET /admin/tool-popularitytool usage popularity with call counts, credits, unique consumers, percentage, and most popular tool identification - Credit Allocation Summary β
GET /admin/credit-allocationcredit allocation across active keys with tier breakdown (1-100, 101-500, 501+), totals, and average allocation - Daily Summary β
GET /admin/daily-summarydaily rollup of requests, credits spent, new keys, errors, unique consumers and tools for trend analysis - Key Ranking β
GET /admin/key-rankingleaderboard of active keys ranked by spend, calls, or credits remaining with configurable sorting - Hourly Traffic β
GET /admin/hourly-trafficgranular per-hour request counts with allowed/denied breakdown, credits, consumers, tools, and busiest hour - Tool Error Rate β
GET /admin/tool-error-rateper-tool error rates with denied/allowed counts, error percentage, and overall reliability metrics - Consumer Spend Velocity β
GET /admin/consumer-spend-velocityper-consumer spend rate with credits/hour, depletion forecast, and velocity ranking - Namespace Activity β
GET /admin/namespace-activityper-namespace activity metrics with key counts, spend, calls, credits remaining for multi-tenant visibility - Credit Burn Rate β
GET /admin/credit-burn-ratesystem-wide credit burn rate with credits/hour, utilization percentage, depletion forecast - Consumer Risk Score β
GET /admin/consumer-risk-scoreper-consumer risk scoring based on utilization with risk levels (low/medium/high/critical) - Revenue Forecast β
GET /admin/revenue-forecastprojected revenue with hourly/daily/weekly/monthly forecasts capped by remaining credits - System Overview β
GET /admin/system-overviewexecutive summary with key counts, credit totals, utilization, activity metrics - Key Health Overview β
GET /admin/key-health-overviewholistic per-key health check with utilization, status levels, health distribution - Namespace Comparison β
GET /admin/namespace-comparisonside-by-side namespace comparison with allocation, spend, utilization, leader - Consumer Growth β
GET /admin/consumer-growthconsumer growth metrics with age, spend rate, credits allocated, new consumer count - Tool Profitability β
GET /admin/tool-profitabilityper-tool profitability analysis with revenue, calls, avg revenue per call, unique callers - Credit Waste Analysis β
GET /admin/credit-wasteper-key credit waste analysis with utilization metrics and waste percentage - Group Activity β
GET /admin/group-activityper-group activity metrics with key counts, spend, calls, credits remaining for policy-template analytics - Config Hot Reload β
POST /config/reloadreloads pricing, rate limits, webhooks, quotas, and behavior flags from config file without server restart - Webhook Events β POST batched usage events to any URL for external billing/alerting
- Config File Mode β Load all settings from a JSON file (
--config) - Shadow Mode β Log everything without enforcing payment (for testing)
- Persistent Storage β Keys, credits, admin keys, and groups survive restarts with
--state-file - Zero Dependencies β No external npm packages. Uses only Node.js built-ins.
Usage
Wrap a Local MCP Server (stdio)
# Default: 1 credit per call, 60 calls/min, port 3402
npx paygate-mcp wrap --server "npx @modelcontextprotocol/server-filesystem /tmp"
# Custom pricing and limits
npx paygate-mcp wrap \
--server "python my-server.py" \
--price 2 \
--rate-limit 30 \
--port 8080
# Per-tool pricing
npx paygate-mcp wrap \
--server "node server.js" \
--tool-price "search:1,generate:5,premium_analyze:20"
# Shadow mode (observe without enforcing)
npx paygate-mcp wrap --server "node server.js" --shadow
Gate a Remote MCP Server (Streamable HTTP)
Gate any remote MCP server that supports the Streamable HTTP transport (MCP spec 2025-03-26):
npx paygate-mcp wrap --remote-url "https://my-mcp-server.example.com/mcp"
# With custom pricing
npx paygate-mcp wrap \
--remote-url "https://api.example.com/mcp" \
--price 5 \
--tool-price "gpt4:20,search:2"
The proxy handles:
- JSON-RPC forwarding via HTTP POST
- SSE (text/event-stream) response parsing
- Mcp-Session-Id session management
- Graceful session cleanup (HTTP DELETE on shutdown)
When started, you'll see your admin key in the console. Save it.
Multi-Server Mode
Wrap multiple MCP servers behind a single PayGate instance. Tools are prefixed with the server name:
npx paygate-mcp wrap --config multi-server.json
Example multi-server.json:
{
"port": 3402,
"defaultCreditsPerCall": 1,
"servers": [
{
"prefix": "fs",
"serverCommand": "npx",
"serverArgs": ["@modelcontextprotocol/server-filesystem", "/tmp"]
},
{
"prefix": "github",
"remoteUrl": "https://github-mcp.example.com/mcp"
}
]
}
Tools are exposed with prefixes: fs:read_file, fs:write_file, github:search_repos, etc. Pricing and ACLs work on the prefixed names:
{
"toolPricing": {
"github:search_repos": { "creditsPerCall": 5 },
"fs:read_file": { "creditsPerCall": 1 }
}
}
Credits are shared across all backends: one API key works for all servers.
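As a rough illustration of how prefixed names could be resolved (a sketch, not PayGate's actual implementation), the snippet below splits a name such as fs:read_file at the first colon and looks up its price, falling back to defaultCreditsPerCall. The helper names splitToolName and priceFor are hypothetical.

```typescript
// Hypothetical helpers; names and shapes are illustrative assumptions.
type ToolPricing = Record<string, { creditsPerCall: number }>;

// Split "fs:read_file" into the server prefix and the backend tool name.
function splitToolName(name: string): { prefix: string; tool: string } | null {
  const i = name.indexOf(":");
  if (i <= 0 || i === name.length - 1) return null; // no prefix, or empty tool name
  return { prefix: name.slice(0, i), tool: name.slice(i + 1) };
}

// Look up the price by the full prefixed name, else use the default.
function priceFor(name: string, pricing: ToolPricing, defaultCredits: number): number {
  return pricing[name]?.creditsPerCall ?? defaultCredits;
}

const pricing: ToolPricing = {
  "github:search_repos": { creditsPerCall: 5 },
  "fs:read_file": { creditsPerCall: 1 },
};

console.log(splitToolName("github:search_repos"));
console.log(priceFor("fs:write_file", pricing, 1)); // unlisted tool, default price
```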
Client SDK
Use PayGateClient to call tools from TypeScript/Node.js with auto 402 retry:
import { PayGateClient, PayGateError } from 'paygate-mcp/client';
const client = new PayGateClient({
url: 'http://localhost:3402',
apiKey: 'pg_abc123...',
autoRetry: true,
onCreditsNeeded: async (info) => {
// Called when credits run out: add credits and return true to retry
await topUpCredits(info.creditsRequired);
return true;
},
});
const tools = await client.listTools();
const result = await client.callTool('search', { query: 'hello' });
const balance = await client.getBalance();
Features:
- Auto 402 retry: When a tool call returns payment-required, calls onCreditsNeeded and retries
- Balance tracking: client.lastKnownBalance tracks credits from getBalance() calls
- Typed errors: PayGateError with .isPaymentRequired, .isRateLimited, .isExpired helpers
- Zero dependencies: Uses Node.js built-in http/https
Create API Keys
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "my-client", "credits": 100}'
Call Tools
curl -X POST http://localhost:3402/mcp \
-H "Content-Type: application/json" \
-H "X-API-Key: CLIENT_API_KEY" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "read_file",
"arguments": {"path": "/tmp/test.txt"}
}
}'
Top Up Credits
curl -X POST http://localhost:3402/topup \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "credits": 500}'
Check Balance (Client Self-Service)
curl http://localhost:3402/balance \
-H "X-API-Key: CLIENT_API_KEY"
Returns credits, total spent, call count, and last used timestamp. Clients can check their own balance without needing admin access.
Export Usage Data (Admin)
# JSON export
curl http://localhost:3402/usage \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# CSV export (for spreadsheet/billing import)
curl "http://localhost:3402/usage?format=csv" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by date
curl "http://localhost:3402/usage?since=2025-01-01T00:00:00Z" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns per-call usage events with tool name, credits charged, and timestamps. API keys are masked in output.
Check Status
curl http://localhost:3402/status \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns active keys, usage stats, per-tool breakdown, and deny reasons.
Admin Dashboard
Open the web dashboard in your browser:
http://localhost:3402/dashboard
A real-time admin UI for managing keys, viewing usage, and monitoring tool calls. Enter your admin key to authenticate. Features auto-refresh every 30s, top tools chart, activity feed, and key creation/management.
API Reference
| Endpoint | Method | Auth | Description |
|---|---|---|---|
/mcp | POST | X-API-Key or Bearer | JSON-RPC 2.0 proxy (returns JSON or SSE) |
/mcp | GET | X-API-Key or Bearer | SSE notification stream (Streamable HTTP) |
/mcp | DELETE | Mcp-Session-Id | Terminate an MCP session |
/balance | GET | X-API-Key | Client self-service: check credits, quota, ACL, expiry |
/keys | POST | X-Admin-Key | Create API key (with ACL, expiry, quota, credits) |
/keys | GET | X-Admin-Key | List all keys (masked, with expiry status) |
/topup | POST | X-Admin-Key | Add credits to an existing key |
/keys/transfer | POST | X-Admin-Key | Transfer credits between API keys |
/keys/bulk | POST | X-Admin-Key | Execute multiple key operations (create, topup, revoke) in one request |
/keys/export | GET | X-Admin-Key | Export all API keys for backup/migration (JSON or CSV) |
/keys/import | POST | X-Admin-Key | Import API keys from backup with conflict resolution |
/keys/revoke | POST | X-Admin-Key | Permanently revoke an API key |
/keys/suspend | POST | X-Admin-Key | Temporarily suspend a key (reversible) |
/keys/resume | POST | X-Admin-Key | Resume a suspended key |
/keys/clone | POST | X-Admin-Key | Clone a key (new key, same config, fresh counters) |
/keys/usage | GET | X-Admin-Key | Per-key usage breakdown (per-tool, time-series, deny reasons) |
/keys/rotate | POST | X-Admin-Key | Rotate key (new key, same credits/ACL/quotas) |
/keys/acl | POST | X-Admin-Key | Set tool ACL (whitelist/blacklist) on a key |
/keys/expiry | POST | X-Admin-Key | Set or remove key expiry (TTL) |
/keys/quota | POST | X-Admin-Key | Set usage quota (daily/monthly limits) |
/keys/tags | POST | X-Admin-Key | Set key tags/metadata (merge semantics) |
/keys/ip | POST | X-Admin-Key | Set IP allowlist (CIDR + exact match) |
/keys/search | POST | X-Admin-Key | Search keys by tag values |
/keys/auto-topup | POST | X-Admin-Key | Configure or disable auto-topup for a key |
/admin/keys | GET | X-Admin-Key (super_admin) | List all admin keys (masked) |
/admin/keys | POST | X-Admin-Key (super_admin) | Create a new admin key with role |
/admin/keys/revoke | POST | X-Admin-Key (super_admin) | Revoke an admin key |
/limits | POST | X-Admin-Key | Set spending limit on a key |
/usage | GET | X-Admin-Key | Export usage data (JSON or CSV) |
/status | GET | X-Admin-Key | Full dashboard with usage stats |
/dashboard | GET | None (admin key in-browser) | Real-time admin web dashboard |
/stripe/checkout | POST | X-API-Key | Create Stripe Checkout Session for credit purchase |
/stripe/packages | GET | None | List available credit packages (public, rate-limited) |
/stripe/webhook | POST | Stripe Signature | Auto-top-up credits on payment |
/admin/backup | GET | X-Admin-Key | Export full server state as versioned JSON snapshot |
/admin/restore | POST | X-Admin-Key | Import state from backup (merge/overwrite/full modes) |
/admin/cache | GET | X-Admin-Key | Response cache stats (entries, hits, misses, hit rate) |
/admin/cache | DELETE | X-Admin-Key | Clear cache (all or ?tool= filter) |
/admin/circuit | GET | X-Admin-Key | Circuit breaker status (state, failures, rejections) |
/admin/circuit | POST | X-Admin-Key | Reset circuit breaker to closed state |
/admin/compliance/export | GET | X-Admin-Key | Compliance audit export (SOC 2/GDPR/HIPAA, JSON/CSV) |
/keys/webhook | POST | X-Admin-Key | Set per-key webhook URL |
/keys/webhook | GET | X-Admin-Key | Get per-key webhook status |
/keys/webhook | DELETE | X-Admin-Key | Remove per-key webhook URL |
/.well-known/oauth-authorization-server | GET | None | OAuth 2.1 server metadata |
/oauth/register | POST | None | Dynamic Client Registration (RFC 7591) |
/oauth/authorize | GET | None | Authorization endpoint (PKCE required) |
/oauth/token | POST | None | Token endpoint (code exchange + refresh) |
/oauth/revoke | POST | None | Token revocation (RFC 7009) |
/oauth/clients | GET | X-Admin-Key | List registered OAuth clients |
/.well-known/mcp-payment | GET | None | Server payment metadata (SEP-2007) |
/.well-known/mcp.json | GET | None | MCP Server Identity card (discovery) |
/pricing | GET | None | Full per-tool pricing breakdown |
/openapi.json | GET | None | OpenAPI 3.1 spec (all 199+ endpoints) |
/docs | GET | None | Interactive API docs (Swagger UI) |
/robots.txt | GET | None | Crawler directives (allow public, disallow admin/keys) |
/portal | GET | None | Self-service API key portal (browser UI, auth via X-API-Key prompt) |
/ready | GET | None | Readiness probe (200 when ready, 503 when draining/maintenance) |
/metrics | GET | None | Prometheus metrics (counters, gauges, uptime) |
/analytics | GET | X-Admin-Key | Usage analytics (time-series, tool breakdown, trends) |
/alerts | GET | X-Admin-Key | Consume pending alerts |
/alerts | POST | X-Admin-Key | Configure alert rules |
/teams | GET | X-Admin-Key | List all teams |
/teams | POST | X-Admin-Key | Create a team (name, budget, quota, tags) |
/teams/update | POST | X-Admin-Key | Update team settings |
/teams/delete | POST | X-Admin-Key | Delete (deactivate) a team |
/teams/assign | POST | X-Admin-Key | Assign an API key to a team |
/teams/remove | POST | X-Admin-Key | Remove an API key from a team |
/teams/usage | GET | X-Admin-Key | Team usage summary with member breakdown |
/tokens | POST | X-Admin-Key | Create a scoped token (short-lived, tool-restricted) |
/tokens/revoke | POST | X-Admin-Key | Revoke a scoped token (by full token string) |
/tokens/revoked | GET | X-Admin-Key | List all revoked token entries |
/namespaces | GET | X-Admin-Key | List all namespaces with key/credit/spending stats |
/audit | GET | X-Admin-Key | Query audit log (filter by type, actor, time) |
/audit/export | GET | X-Admin-Key | Export full audit log (JSON or CSV) |
/audit/stats | GET | X-Admin-Key | Audit log statistics |
/plugins | GET | X-Admin-Key | List registered plugins with hook info |
/groups | GET | X-Admin-Key | List all key groups (policy templates) |
/groups | POST | X-Admin-Key | Create a key group with shared policies |
/groups/update | POST | X-Admin-Key | Update group policies |
/groups/delete | POST | X-Admin-Key | Delete (deactivate) a group |
/groups/assign | POST | X-Admin-Key | Assign an API key to a group |
/groups/remove | POST | X-Admin-Key | Remove an API key from a group |
/webhooks/filters | GET | X-Admin-Key | List all webhook filter rules |
/webhooks/filters | POST | X-Admin-Key | Create a webhook filter rule |
/webhooks/filters/update | POST | X-Admin-Key | Update a webhook filter rule |
/webhooks/filters/delete | POST | X-Admin-Key | Delete a webhook filter rule |
/webhooks/replay | POST | X-Admin-Key | Replay dead letter webhook events (all or by index) |
/webhooks/test | POST | X-Admin-Key | Send test event to configured webhook URL (synchronous) |
/webhooks/log | GET | X-Admin-Key | Webhook delivery log with status, timing, and filters |
/webhooks/pause | POST | X-Admin-Key | Pause webhook delivery (events buffered until resumed) |
/webhooks/resume | POST | X-Admin-Key | Resume webhook delivery and flush buffered events |
/keys/alias | POST | X-Admin-Key | Set or clear a human-readable alias for an API key |
/keys/expiring | GET | X-Admin-Key | List keys expiring within a time window (?within=86400 seconds) |
/keys/templates | GET | X-Admin-Key | List all key templates |
/keys/templates | POST | X-Admin-Key | Create or update a key template |
/keys/templates/delete | POST | X-Admin-Key | Delete a key template |
/config/reload | POST | X-Admin-Key | Hot-reload config file (pricing, rate limits, webhooks, quotas) |
/health | GET | None | Health check (status, uptime, version, in-flight, Redis/webhook status) |
/ | GET | None | Root endpoint (endpoint list) |
Free Methods
These MCP methods pass through without auth or billing:
initialize, initialized, ping, tools/list, resources/list, prompts/list
Gated methods: tools/call (single), tools/call_batch (batch, with all-or-nothing billing and parallel execution). See Batch Tool Calls.
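The split between free and gated methods can be pictured as a simple lookup. This is only a sketch built from the two lists above; classifyMethod is a hypothetical name, and how PayGate treats methods outside both lists is not stated here.

```typescript
// Method names come from the lists above; everything else is illustrative.
const FREE_METHODS = new Set([
  "initialize", "initialized", "ping",
  "tools/list", "resources/list", "prompts/list",
]);
const GATED_METHODS = new Set(["tools/call", "tools/call_batch"]);

type Gate = "free" | "gated" | "other";

function classifyMethod(method: string): Gate {
  if (FREE_METHODS.has(method)) return "free";   // no auth, no billing
  if (GATED_METHODS.has(method)) return "gated"; // requires a key and credits
  return "other"; // behavior for other methods is not described in this section
}
```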
CLI Commands
paygate-mcp wrap [options] # Start a payment-gated MCP proxy
paygate-mcp init [--output] [--force] # Interactive setup wizard
paygate-mcp validate --config <path> # Validate config without starting
paygate-mcp completions <bash|zsh|fish> # Generate shell completions
paygate-mcp version [--json] # Print version
Shell Completions
# Bash
paygate-mcp completions bash > ~/.local/share/bash-completion/completions/paygate-mcp
# Zsh
paygate-mcp completions zsh > ~/.zfunc/_paygate-mcp
# Add to .zshrc: fpath=(~/.zfunc $fpath) && compinit
# Fish
paygate-mcp completions fish > ~/.config/fish/completions/paygate-mcp.fish
Machine-Readable Output
# Version as JSON (for CI/CD)
paygate-mcp version --json
# → {"version":"10.3.0"}
# Validate config with structured output
paygate-mcp validate --config paygate.json --json
# → {"valid":true,"diagnostics":[...],"errors":0,"warnings":0}
CLI Options
--server <cmd> MCP server command to wrap via stdio
--remote-url <url> Remote MCP server URL (Streamable HTTP transport)
--port <n> HTTP port (default: 3402)
--price <n> Default credits per tool call (default: 1)
--rate-limit <n> Max calls/min per key (default: 60, 0=unlimited)
--name <s> Server display name
--shadow Shadow mode: log without enforcing payment
--admin-key <s> Set admin key (default: auto-generated)
--tool-price <t:n> Per-tool price (e.g. "search:5,generate:10")
--import-key <k:c> Import existing key with credits (e.g. "pg_abc:100")
--state-file <path> Persist keys/credits to a JSON file (survives restarts)
--stripe-secret <s> Stripe webhook signing secret (enables /stripe/webhook)
--webhook-url <url> POST batched usage events to this URL
--webhook-secret <s> HMAC-SHA256 secret for signing webhook payloads
--refund-on-failure Refund credits when downstream tool call fails
--redis-url <url> Redis URL for distributed state (e.g. "redis://localhost:6379")
--config <path> Load settings from a JSON config file
--discovery <mode> Tool discovery mode: static (default) or dynamic
--json Machine-readable JSON output
Note: Use --server OR --remote-url for single-server mode. Use servers in a config file for multi-server mode.
Dynamic Tool Discovery
For servers with many tools, dynamic discovery mode reduces agent context window bloat by exposing 3 meta-tools instead of the full tool list:
npx paygate-mcp wrap --server "your-server" --discovery dynamic
Agents see 3 tools: paygate_list_tools (paginated listing), paygate_search_tools (keyword search), and paygate_call_tool (proxy any tool). This reduces N tools to 3 in the context window while preserving full functionality.
Persistent Storage
Add --state-file to save API keys and credits to disk. Data survives server restarts.
npx paygate-mcp wrap --server "your-mcp-server" --state-file ~/.paygate/state.json
Stripe Integration
Connect Stripe to automatically top up credits when customers pay:
npx paygate-mcp wrap \
--server "your-mcp-server" \
--state-file ~/.paygate/state.json \
--stripe-secret "whsec_your_stripe_webhook_secret"
Setup:
- Create a Stripe Checkout Session with metadata:
  - paygate_api_key: the customer's API key (e.g. pg_abc123...)
  - paygate_credits: credits to add on payment (e.g. 500)
- Point your Stripe webhook to https://your-server/stripe/webhook
- Subscribe to checkout.session.completed and invoice.payment_succeeded events
When a customer completes payment, credits are automatically added to their API key. Subscriptions auto-renew credits on each billing cycle.
Security:
- HMAC-SHA256 signature verification (Stripe's v1 scheme)
- Timing-safe comparison to prevent timing attacks
- 5-minute timestamp tolerance to prevent replay attacks
- Payment status verification (only paid triggers credits)
- Zero dependencies: uses Node.js built-in crypto
Per-Tool ACL (Access Control)
Control which tools each API key can access:
# Create a key that can only access search and read tools
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "limited-client", "credits": 100, "allowedTools": ["search", "read_file"]}'
# Create a key with specific tools blocked
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "safe-client", "credits": 100, "deniedTools": ["delete_file", "admin_reset"]}'
# Update ACL on an existing key
curl -X POST http://localhost:3402/keys/acl \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "allowedTools": ["search"], "deniedTools": ["admin"]}'
- allowedTools (whitelist): Only these tools are accessible. Empty = all tools.
- deniedTools (blacklist): These tools are always denied. Applied after allowedTools.
- ACL also filters tools/list: clients only see their permitted tools.
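The ACL rules above can be sketched as a single predicate. This is an illustration of the described semantics (empty whitelist means all tools, blacklist applied afterwards), not PayGate's source; isToolAllowed and visibleTools are hypothetical names.

```typescript
// Illustrative model of a key's ACL; field names follow the request bodies above.
interface KeyAcl {
  allowedTools: string[]; // whitelist; empty means "all tools"
  deniedTools: string[];  // blacklist; applied after the whitelist
}

function isToolAllowed(tool: string, acl: KeyAcl): boolean {
  const passesWhitelist =
    acl.allowedTools.length === 0 || acl.allowedTools.includes(tool);
  return passesWhitelist && !acl.deniedTools.includes(tool);
}

// The same predicate can filter a tools/list response for a given key.
function visibleTools(all: string[], acl: KeyAcl): string[] {
  return all.filter((t) => isToolAllowed(t, acl));
}
```

Because the blacklist is checked last, a tool listed in both arrays ends up denied.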
Per-Tool Rate Limits
Set independent rate limits per tool (on top of the global limit):
{
"toolPricing": {
"expensive_analyze": { "creditsPerCall": 10, "rateLimitPerMin": 5 },
"search": { "creditsPerCall": 1, "rateLimitPerMin": 30 },
"cheap_read": { "creditsPerCall": 1 }
}
}
Per-tool limits are enforced independently per API key. A key can be rate-limited on one tool while still accessing others. The global --rate-limit applies across all tools.
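To make the "independent per key and per tool" behavior concrete, here is a minimal sliding-window limiter sketch. It is an assumption about how such a limiter could work, not PayGate's internals; the class name and the "key:tool" bucket naming are hypothetical.

```typescript
// Minimal sliding-window limiter; each bucket (e.g. "apiKey:tool") is tracked separately.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private windowMs: number;

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the bucket is under its limit.
  // A limit of 0 is treated as unlimited, matching the --rate-limit convention.
  tryAcquire(bucket: string, limit: number, now: number = Date.now()): boolean {
    if (limit === 0) return true;
    const cutoff = now - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(bucket) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < limit;
    if (allowed) recent.push(now);
    this.hits.set(bucket, recent);
    return allowed;
  }
}
```

Using one bucket per key for the global limit and one bucket per key and tool for rateLimitPerMin lets a key be throttled on one tool while its other tools stay available.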
Key Expiry (TTL)
Create API keys that auto-expire:
# Create a key that expires in 1 hour (3600 seconds)
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "trial-user", "credits": 50, "expiresIn": 3600}'
# Create a key with a specific expiry date
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "quarterly", "credits": 1000, "expiresAt": "2026-06-01T00:00:00Z"}'
# Set or extend expiry on an existing key
curl -X POST http://localhost:3402/keys/expiry \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "expiresIn": 86400}'
# Remove expiry (key never expires)
curl -X POST http://localhost:3402/keys/expiry \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "expiresAt": null}'
Expired keys return a clear api_key_expired error. Admins can extend or remove expiry at any time.
Credit Transfers
Atomically transfer credits between API keys:
curl -X POST http://localhost:3402/keys/transfer \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{ "from": "pg_source_key", "to": "pg_dest_key", "credits": 500, "memo": "Monthly allocation" }'
Response:
{
"transferred": 500,
"from": { "keyMasked": "pg_sour...key1", "balance": 500 },
"to": { "keyMasked": "pg_dest...key2", "balance": 700 },
"memo": "Monthly allocation",
"message": "Transferred 500 credits"
}
Validation: Both keys must exist, be active (not revoked/expired), and the source must have sufficient credits. Fractional credits are floored to integers. Self-transfers are rejected.
Audit trail: Every transfer logs a key.credits_transferred audit event with masked keys, amount, balances, and memo.
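The validation rules can be sketched as follows. This is an illustrative model only: the KeyRecord shape, the error strings, and the rejection of non-positive amounts are assumptions, while the remaining checks mirror the rules listed above.

```typescript
// Hypothetical in-memory model of a key; not PayGate's storage format.
interface KeyRecord { credits: number; active: boolean; expiresAt?: number }

type TransferResult =
  | { ok: true; transferred: number; fromBalance: number; toBalance: number }
  | { ok: false; error: string };

function transferCredits(
  store: Map<string, KeyRecord>, from: string, to: string,
  credits: number, now: number = Date.now(),
): TransferResult {
  if (from === to) return { ok: false, error: "self-transfer rejected" };
  const src = store.get(from);
  const dst = store.get(to);
  if (!src || !dst) return { ok: false, error: "key not found" };
  const usable = (k: KeyRecord) =>
    k.active && (k.expiresAt === undefined || k.expiresAt > now);
  if (!usable(src) || !usable(dst)) return { ok: false, error: "key revoked or expired" };
  const amount = Math.floor(credits); // fractional credits are floored
  if (!(amount > 0)) return { ok: false, error: "invalid amount" }; // assumption
  if (src.credits < amount) return { ok: false, error: "insufficient credits" };
  // All checks passed: apply both sides of the transfer together.
  src.credits -= amount;
  dst.credits += amount;
  return { ok: true, transferred: amount, fromBalance: src.credits, toBalance: dst.credits };
}
```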
Bulk Key Operations
Execute multiple key operations (create, topup, revoke) in a single request. Failed operations don't stop subsequent ones; each result includes a success status and an index for easy correlation.
curl -X POST http://localhost:3402/keys/bulk \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"operations": [
{ "action": "create", "name": "api-key-1", "credits": 500, "tags": { "env": "prod" } },
{ "action": "create", "name": "api-key-2", "credits": 200 },
{ "action": "topup", "key": "pg_existing_key", "credits": 1000 },
{ "action": "revoke", "key": "pg_old_key" }
]
}'
Response:
{
"total": 4,
"succeeded": 4,
"failed": 0,
"results": [
{ "index": 0, "action": "create", "success": true, "result": { "key": "pg_abc...", "name": "api-key-1", "credits": 500 } },
{ "index": 1, "action": "create", "success": true, "result": { "key": "pg_def...", "name": "api-key-2", "credits": 200 } },
{ "index": 2, "action": "topup", "success": true, "result": { "creditsAdded": 1000, "newBalance": 1500 } },
{ "index": 3, "action": "revoke", "success": true, "result": { "message": "Key revoked" } }
]
}
Actions: create (with optional name, credits, tags, namespace, allowedTools, deniedTools), topup (key + credits), revoke (key). Unknown actions return an error result without stopping the batch.
Limits: Maximum 100 operations per request. Empty operations array returns 400.
Audit trail: Each successful operation logs an individual audit event with "(bulk)" suffix.
Key Import/Export
Export all API keys for backup or migration between PayGate instances:
# Export as JSON (includes full key secrets)
curl http://localhost:3402/keys/export \
-H "X-Admin-Key: $ADMIN_KEY" \
-o paygate-keys-backup.json
# Export as CSV
curl "http://localhost:3402/keys/export?format=csv" \
-H "X-Admin-Key: $ADMIN_KEY" \
-o paygate-keys-backup.csv
# Export only active keys in a specific namespace
curl "http://localhost:3402/keys/export?activeOnly=true&namespace=production" \
-H "X-Admin-Key: $ADMIN_KEY"
Import keys into a PayGate instance:
curl -X POST http://localhost:3402/keys/import \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"keys": [{ "key": "pg_abc123...", "name": "my-key", "credits": 500, "active": true, "tags": {} }],
"mode": "skip"
}'
Response:
{
"total": 1,
"imported": 1,
"overwritten": 0,
"skipped": 0,
"errors": 0,
"mode": "skip",
"results": [{ "key": "pg_abc123...", "name": "my-key", "status": "imported" }]
}
Conflict modes: skip (default) skips keys that already exist, overwrite replaces existing keys, and error fails on duplicate keys.
Limits: Maximum 1000 keys per import request. Keys must start with pg_ prefix.
Export formats: JSON (full records with all fields) or CSV (key subset for spreadsheet use).
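The three conflict modes can be sketched as below. This is a simplified model under stated assumptions: in error mode a duplicate is recorded as a per-key error rather than aborting the request (the source does not say which), and the ImportedKey shape is a trimmed-down stand-in for the real record.

```typescript
type ImportMode = "skip" | "overwrite" | "error";
type ImportStatus = "imported" | "overwritten" | "skipped" | "error";
interface ImportedKey { key: string; name: string; credits: number }

function importKeys(
  store: Map<string, ImportedKey>, incoming: ImportedKey[], mode: ImportMode = "skip",
) {
  if (incoming.length > 1000) throw new Error("maximum 1000 keys per import");
  const results: { key: string; name: string; status: ImportStatus }[] = [];
  for (const k of incoming) {
    let status: ImportStatus;
    if (!k.key.startsWith("pg_")) status = "error";          // prefix rule
    else if (!store.has(k.key)) { store.set(k.key, k); status = "imported"; }
    else if (mode === "overwrite") { store.set(k.key, k); status = "overwritten"; }
    else if (mode === "skip") status = "skipped";
    else status = "error";                                    // duplicate in error mode
    results.push({ key: k.key, name: k.name, status });
  }
  const count = (s: ImportStatus) => results.filter((r) => r.status === s).length;
  return {
    total: incoming.length, imported: count("imported"), overwritten: count("overwritten"),
    skipped: count("skipped"), errors: count("error"), mode, results,
  };
}
```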
Spending Limits
Cap the total credits any API key can spend:
# Set a spending limit on a key (admin only)
curl -X POST http://localhost:3402/limits \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "spendingLimit": 500}'
# Check remaining budget
curl http://localhost:3402/balance -H "X-API-Key: CLIENT_API_KEY"
# → { "spendingLimit": 500, "remainingBudget": 350, ... }
Set spendingLimit to 0 for unlimited. When a key hits its limit, tool calls are denied with a clear error.
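A small sketch of the budget arithmetic, assuming remainingBudget is simply the limit minus the credits already spent (the source shows the field but not its formula). Both function names are hypothetical.

```typescript
// A spendingLimit of 0 means unlimited, so the remaining budget is unbounded.
function remainingBudget(spendingLimit: number, totalSpent: number): number {
  if (spendingLimit === 0) return Infinity;
  return Math.max(0, spendingLimit - totalSpent);
}

// Assumption: a call is denied when its cost no longer fits in the remaining budget.
function wouldExceedLimit(spendingLimit: number, totalSpent: number, cost: number): boolean {
  return cost > remainingBudget(spendingLimit, totalSpent);
}
```

With a limit of 500 and 150 credits already spent, this yields the remainingBudget of 350 shown in the example response.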
Refund on Failure
Automatically return credits when a downstream tool call fails:
npx paygate-mcp wrap --server "node server.js" --refund-on-failure
Credits are deducted before the tool call. If the wrapped server returns an error, credits are refunded and totalSpent / totalCalls are rolled back. This prevents charging users for failed operations.
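The deduct-then-refund flow might look like the following sketch. The Account shape and chargedCall helper are hypothetical stand-ins; only the ordering (charge first, roll back on error) comes from the description above.

```typescript
// Hypothetical per-key counters, named after the fields mentioned above.
interface Account { credits: number; totalSpent: number; totalCalls: number }

async function chargedCall<T>(
  acct: Account, cost: number, call: () => Promise<T>, refundOnFailure: boolean,
): Promise<T> {
  if (acct.credits < cost) throw new Error("insufficient credits");
  // Deduct before forwarding the call to the wrapped server.
  acct.credits -= cost;
  acct.totalSpent += cost;
  acct.totalCalls += 1;
  try {
    return await call();
  } catch (err) {
    if (refundOnFailure) {
      // Roll back the charge and both counters.
      acct.credits += cost;
      acct.totalSpent -= cost;
      acct.totalCalls -= 1;
    }
    throw err; // the caller still sees the downstream failure
  }
}
```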
Webhook Events
POST usage events to any external URL for billing, alerting, or analytics:
npx paygate-mcp wrap --server "node server.js" --webhook-url "https://billing.example.com/events"
Events are batched (up to 10 per POST) and flushed every 5 seconds. Each event includes tool name, credits charged, API key, and timestamp.
Retry Queue & Dead Letters
Failed webhook deliveries are retried with exponential backoff (1s, 2s, 4s, 8s, 16s), up to the number of attempts set by --webhook-retries. After all retries are exhausted, events move to a dead letter queue for admin inspection.
# Custom max retries (default: 5)
npx paygate-mcp wrap --server "node server.js" \
--webhook-url "https://billing.example.com/events" \
--webhook-retries 10
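The backoff schedule follows a doubling pattern, sketched below. Whether delays keep doubling or are capped when --webhook-retries exceeds 5 is not stated, so the uncapped version here is an assumption; the function names are hypothetical.

```typescript
// Delay before retry number `attempt` (1-based): 1s, 2s, 4s, 8s, 16s, ...
function retryDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * 2 ** (attempt - 1);
}

// Full schedule for a given --webhook-retries value.
function retrySchedule(maxRetries: number): number[] {
  return Array.from({ length: maxRetries }, (_, i) => retryDelayMs(i + 1));
}

console.log(retrySchedule(5)); // [1000, 2000, 4000, 8000, 16000]
```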
Admin endpoints:
| Endpoint | Method | Description |
|---|---|---|
/webhooks/stats | GET | Delivery statistics (delivered, failed, pending retries, dead letters) |
/webhooks/dead-letter | GET | List permanently failed deliveries with error details |
/webhooks/dead-letter | DELETE | Clear dead letter queue |
/webhooks/replay | POST | Replay dead letter events (all or by index) |
Retry attempts include an X-PayGate-Retry header with the attempt number for observability.
Webhook Event Replay
Replay permanently failed webhook events from the dead letter queue:
# Replay all dead letter entries
curl -X POST http://localhost:3402/webhooks/replay \
-H "X-Admin-Key: $ADMIN_KEY"
# Replay specific entries by index
curl -X POST http://localhost:3402/webhooks/replay \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{ "indices": [0, 2, 5] }'
Replayed entries are removed from the dead letter queue and re-queued for fresh delivery (attempt counter resets to 0). If delivery fails again, they follow the normal retry/dead-letter flow.
Webhook Signatures (HMAC-SHA256)
Sign webhook payloads for tamper-proof delivery:
npx paygate-mcp wrap --server "node server.js" \
--webhook-url "https://billing.example.com/events" \
--webhook-secret "whsec_your_secret_here"
When --webhook-secret is set, every webhook POST includes an X-PayGate-Signature header:
X-PayGate-Signature: t=1709123456,v1=a1b2c3d4...
Verifying signatures (Node.js example):
import { WebhookEmitter } from 'paygate-mcp';
const signature = req.headers['x-paygate-signature'];
const [tPart, v1Part] = signature.split(',');
const timestamp = tPart.split('=')[1];
const sig = v1Part.split('=')[1];
// Reconstruct signed payload: timestamp.body
const signedPayload = `${timestamp}.${rawBody}`;
const isValid = WebhookEmitter.verify(signedPayload, sig, 'whsec_your_secret_here');
The signature covers timestamp.body, so a receiver that also checks the timestamp for freshness can reject replayed deliveries. Use a timing-safe comparison (built into WebhookEmitter.verify).
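For receivers that cannot import paygate-mcp, the same check can be sketched with Node's built-in crypto module. This is an illustration under stated assumptions, not the library's implementation: it assumes v1 is the hex-encoded HMAC-SHA256 of `${timestamp}.${rawBody}`, that t is a Unix timestamp in seconds, and it picks a 5-minute tolerance arbitrarily. The function name `verifyPayGateSignature` is hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical stand-alone verifier. Assumes v1 is the hex HMAC-SHA256 of
// `${timestamp}.${rawBody}` and that t is a Unix timestamp in seconds.
function verifyPayGateSignature(header, rawBody, secret, toleranceSec = 300,
                                nowSec = Math.floor(Date.now() / 1000)) {
  // Parse "t=...,v1=..." into { t, v1 }
  const parts = Object.fromEntries(
    header.split(',').map((kv) => {
      const i = kv.indexOf('=');
      return [kv.slice(0, i).trim(), kv.slice(i + 1).trim()];
    })
  );
  const timestamp = Number(parts.t);
  if (!Number.isFinite(timestamp) || !parts.v1) return false;

  // Reject deliveries whose timestamp is too far from the current time
  if (Math.abs(nowSec - timestamp) > toleranceSec) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${parts.t}.${rawBody}`)
    .digest();
  const received = Buffer.from(parts.v1, 'hex');

  // timingSafeEqual throws on length mismatch, so compare lengths first
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}
```

The raw request body must be used exactly as received; re-serialising parsed JSON can change whitespace or key order and break the comparison.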
Admin Lifecycle Events
When webhooks are enabled, admin operations also fire webhook events:
| Event Type | Trigger | Metadata |
|---|---|---|
| key.created | POST /keys | keyMasked, name, credits |
| key.topup | POST /topup | keyMasked, creditsAdded, newBalance |
| key.revoked | POST /keys/revoke | keyMasked |
| key.rotated | POST /keys/rotate | oldKeyMasked, newKeyMasked |
| key.expired | Gate evaluation | keyMasked |
| alert.fired | Gate evaluation | alertType, keyPrefix, message, value, threshold |
| team.created | POST /teams | teamId, name, budget |
| team.updated | POST /teams/update | teamId, changes |
| team.deleted | POST /teams/delete | teamId |
| team.key_assigned | POST /teams/assign | teamId, keyMasked |
| team.key_removed | POST /teams/remove | teamId, keyMasked |
Admin events appear in the adminEvents array of the webhook payload (separate from usage events). Both arrays can be present in the same batch.
Webhook Filters (Event Routing)
Route webhook events to different destinations based on event type and API key prefix. Each filter rule routes matching events to its own URL with independent retry queues, dead letter queues, and optional signing secrets.
Create a filter rule:
curl -X POST http://localhost:3402/webhooks/filters \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "production-alerts",
"events": ["key.created", "key.revoked", "alert.fired"],
"url": "https://alerts.example.com/webhook",
"secret": "whsec_alerts_secret",
"keyPrefixes": ["pk_prod_"],
"active": true
}'
List filters:
curl http://localhost:3402/webhooks/filters -H "X-Admin-Key: $ADMIN_KEY"
Update a filter:
curl -X POST http://localhost:3402/webhooks/filters/update \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{ "id": "wf_abc123", "active": false }'
Delete a filter:
curl -X POST http://localhost:3402/webhooks/filters/delete \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{ "id": "wf_abc123" }'
Filter rules:
- events: Array of event types to match (exact match or "*" wildcard for all events)
- keyPrefixes: Optional array of API key prefixes (e.g., ["pk_prod_"]). Events only match if the associated key starts with one of these prefixes. Omit for all keys.
- url: Destination URL for matched events (each unique URL gets its own retry queue)
- secret: Optional HMAC-SHA256 signing secret for this destination
- active: Enable/disable the filter without deleting it
Routing behavior:
- Events matching filter rules are sent to the filter's destination URL
- The default webhook URL (if configured) always receives all events (backward compatible)
- Multiple filters can match the same event; it is sent to all matching destinations
- Inactive filters are skipped during routing
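The matching and routing rules above can be sketched as two small functions. The field names follow the filter JSON shown earlier, but the functions themselves (`filterMatches`, `destinationsFor`) are hypothetical, and treating an empty keyPrefixes array the same as an omitted one is an assumption:

```javascript
// Hypothetical matcher mirroring the documented filter rules.
function filterMatches(filter, eventType, apiKey) {
  if (filter.active === false) return false; // inactive filters are skipped
  const typeOk = filter.events.includes('*') || filter.events.includes(eventType);
  const prefixes = filter.keyPrefixes;
  // Assumption: a missing or empty prefix list matches every key
  const keyOk = !prefixes || prefixes.length === 0 ||
    (typeof apiKey === 'string' && prefixes.some((p) => apiKey.startsWith(p)));
  return typeOk && keyOk;
}

// Every matching filter receives the event; the default URL (if any) always does.
function destinationsFor(filters, defaultUrl, eventType, apiKey) {
  const urls = filters
    .filter((f) => filterMatches(f, eventType, apiKey))
    .map((f) => f.url);
  if (defaultUrl) urls.push(defaultUrl);
  return urls;
}
```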
Config file:
{
"webhookUrl": "https://billing.example.com/events",
"webhookFilters": [
{
"name": "production-alerts",
"events": ["key.created", "key.revoked", "alert.fired"],
"url": "https://alerts.example.com/webhook",
"keyPrefixes": ["pk_prod_"]
}
]
}
Stats: GET /webhooks/stats includes per-URL delivery statistics for all filter destinations plus the default endpoint.
Usage Quotas
Set daily or monthly usage limits per API key:
# Create a key with 10 calls/day, 200 calls/month
curl -X POST http://localhost:3402/keys \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "metered-user", "credits": 1000, "quota": {"dailyCallLimit": 10, "monthlyCallLimit": 200}}'
# Set credit-based quotas (max 50 credits/day)
curl -X POST http://localhost:3402/keys/quota \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "dailyCreditLimit": 50}'
# Remove per-key quota (fall back to global defaults)
curl -X POST http://localhost:3402/keys/quota \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "CLIENT_API_KEY", "remove": true}'
Quota types: dailyCallLimit, monthlyCallLimit, dailyCreditLimit, monthlyCreditLimit. Set to 0 for unlimited. Counters reset at UTC midnight (daily) and UTC month boundary (monthly). Set global defaults in the config file with globalQuota.
Dynamic Pricing
Charge extra credits based on input argument size:
{
"toolPricing": {
"analyze_text": { "creditsPerCall": 2, "creditsPerKbInput": 5 },
"search": { "creditsPerCall": 1 }
}
}
For analyze_text, a 3 KB input would cost 2 + ceil(3 × 5) = 17 credits. Small inputs round up to at least 1 KB. Tools without creditsPerKbInput use the flat base price.
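The pricing arithmetic can be reproduced with a short sketch. It assumes input size is the UTF-8 byte length of the JSON-serialised arguments and that 1 KB means 1024 bytes; neither detail is stated above, and `creditsForCall` is a hypothetical name:

```javascript
// Hypothetical cost calculator for the toolPricing entries shown above.
// Assumes size = bytes of JSON-serialised arguments, 1 KB = 1024 bytes,
// and that inputs smaller than 1 KB are billed as 1 KB.
function creditsForCall(pricing, args) {
  const base = pricing.creditsPerCall;
  if (!pricing.creditsPerKbInput) return base; // flat-priced tool
  const bytes = Buffer.byteLength(JSON.stringify(args ?? {}), 'utf8');
  const kb = Math.max(1, bytes / 1024);
  return base + Math.ceil(kb * pricing.creditsPerKbInput);
}

const toolPricing = {
  analyze_text: { creditsPerCall: 2, creditsPerKbInput: 5 },
  search: { creditsPerCall: 1 },
};

// A payload that serialises to exactly 3072 bytes (3 KB)
console.log(creditsForCall(toolPricing.analyze_text, { text: 'a'.repeat(3061) })); // 17
console.log(creditsForCall(toolPricing.search, { query: 'hello' })); // 1
```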
OAuth 2.1
Full OAuth 2.1 authorization server for MCP clients. Implements PKCE, dynamic client registration, token refresh, and revocation.
Enable OAuth in config:
{
"oauth": {
"accessTokenTtl": 3600,
"refreshTokenTtl": 2592000,
"scopes": ["tools:*", "tools:read", "tools:write"]
}
}
Full flow:
# 1. Register an OAuth client
curl -X POST http://localhost:3402/oauth/register \
-H "Content-Type: application/json" \
-d '{"client_name": "My Agent", "redirect_uris": ["http://localhost:8080/callback"], "api_key": "pg_..."}'
# 2. Generate PKCE challenge (code_verifier → SHA256 → base64url)
# 3. Authorize: GET /oauth/authorize?response_type=code&client_id=...&redirect_uri=...&code_challenge=...&code_challenge_method=S256
# 4. Exchange code for tokens
curl -X POST http://localhost:3402/oauth/token \
-H "Content-Type: application/json" \
-d '{"grant_type": "authorization_code", "code": "...", "client_id": "...", "redirect_uri": "...", "code_verifier": "..."}'
# 5. Use Bearer token on /mcp
curl -X POST http://localhost:3402/mcp \
-H "Authorization: Bearer pg_at_..." \
-d '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "search", "arguments": {"query": "hello"}}}'
# 6. Refresh token
curl -X POST http://localhost:3402/oauth/token \
-d '{"grant_type": "refresh_token", "refresh_token": "pg_rt_...", "client_id": "..."}'
OAuth tokens are backed by API keys: each token maps to an API key for billing. The /mcp endpoint accepts both X-API-Key and Authorization: Bearer headers.
SSE Streaming (MCP Streamable HTTP)
PayGate implements the full MCP Streamable HTTP transport with SSE support:
# POST /mcp with SSE response (add Accept header)
curl -N -X POST http://localhost:3402/mcp \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-H "X-API-Key: YOUR_KEY" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"analyze","arguments":{}}}'
# Response: SSE stream with event: message + data: {jsonrpc response}
# GET /mcp: open SSE notification stream
curl -N http://localhost:3402/mcp \
-H "Accept: text/event-stream" \
-H "Mcp-Session-Id: mcp_sess_..."
# Receives server-initiated notifications as SSE events
# DELETE /mcp: terminate session
curl -X DELETE http://localhost:3402/mcp \
-H "Mcp-Session-Id: mcp_sess_..."
Session Management:
- Every POST /mcp response includes an Mcp-Session-Id header
- Clients reuse sessions by sending Mcp-Session-Id on subsequent requests
- GET /mcp opens a long-lived SSE connection for server-to-client notifications
- DELETE /mcp terminates a session and closes all SSE connections
- Sessions auto-expire after 30 minutes of inactivity
Transport modes:
- POST /mcp without Accept: text/event-stream → standard JSON response (backward compatible)
- POST /mcp with Accept: text/event-stream → SSE-wrapped JSON-RPC response
- GET /mcp with Accept: text/event-stream → long-lived notification stream
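On the client side, an SSE-wrapped response has to be unwrapped before the JSON-RPC payload can be read. A minimal sketch of parsing a complete response body, assuming the `event: message` plus `data: {...}` framing described above (this is not a streaming parser, and `parseSseFrames` is a hypothetical helper):

```javascript
// Minimal SSE parser for a fully received body.
// Frames are separated by a blank line; multiple data: lines are joined with \n.
function parseSseFrames(text) {
  return text
    .split(/\r?\n\r?\n/)
    .map((block) => {
      let event = 'message'; // SSE default event type
      const data = [];
      for (const line of block.split(/\r?\n/)) {
        if (line.startsWith('event:')) event = line.slice(6).trim();
        else if (line.startsWith('data:')) data.push(line.slice(5).replace(/^ /, ''));
      }
      return data.length ? { event, data: data.join('\n') } : null;
    })
    .filter(Boolean);
}

const body = 'event: message\ndata: {"jsonrpc":"2.0","id":1,"result":{}}\n\n';
const [frame] = parseSseFrames(body);
console.log(JSON.parse(frame.data).id); // 1
```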
Audit Log
Every significant operation is recorded in a structured audit trail:
# Query audit events (with filtering)
curl "http://localhost:3402/audit?types=key.created,gate.deny&limit=50" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Export full audit log as CSV
curl "http://localhost:3402/audit/export?format=csv" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" > audit.csv
# Get audit statistics
curl http://localhost:3402/audit/stats \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Tracked events: key.created, key.revoked, key.topup, key.acl_updated, key.expiry_updated, key.quota_updated, key.limit_updated, key.tags_updated, key.ip_updated, gate.allow, gate.deny, session.created, session.destroyed, oauth.client_registered, oauth.token_issued, oauth.token_revoked, admin.auth_failed, admin.alerts_configured, billing.refund, team.created, team.updated, team.deleted, team.key_assigned, team.key_removed.
Retention: Ring buffer (default 10,000 events), age-based cleanup (default 30 days), automatic periodic enforcement.
Registry/Discovery (Agent-Discoverable Pricing)
AI agents can programmatically discover your server's pricing and payment requirements before calling tools. Aligns with SEP-2007 (MCP Payment Spec Draft).
# Discover server payment metadata (public, no auth)
curl http://localhost:3402/.well-known/mcp-payment
# → { "specVersion": "2007-draft", "billingModel": "credits", "defaultCreditsPerCall": 1, ... }
# Get full pricing breakdown (public, no auth)
curl http://localhost:3402/pricing
# → { "server": {...}, "tools": [{ "name": "search", "creditsPerCall": 5, "pricingModel": "dynamic" }, ...] }
How it works:
- /.well-known/mcp-payment: Server-level payment metadata (billing model, auth methods, error codes)
- /pricing: Full per-tool pricing breakdown with overrides
- tools/list responses include _pricing metadata on each tool (creditsPerCall, pricingModel, rateLimitPerMin)
- -32402 error responses include pricing details so agents know how to afford the tool
Both discovery endpoints are public (no auth required) so agents can check pricing before obtaining an API key.
Prometheus Metrics
Monitor your PayGate server with any Prometheus-compatible monitoring system:
curl http://localhost:3402/metrics
Returns metrics in standard Prometheus text exposition format:
# HELP paygate_tool_calls_total Total tool calls processed
# TYPE paygate_tool_calls_total counter
paygate_tool_calls_total{status="allowed",tool="search"} 42
paygate_tool_calls_total{status="denied",tool="premium"} 3
# HELP paygate_credits_charged_total Total credits charged
# TYPE paygate_credits_charged_total counter
paygate_credits_charged_total{tool="search"} 210
# HELP paygate_active_keys_total Number of active (non-revoked) API keys
# TYPE paygate_active_keys_total gauge
paygate_active_keys_total 5
# HELP paygate_uptime_seconds Server uptime in seconds
# TYPE paygate_uptime_seconds gauge
paygate_uptime_seconds 3600
Available metrics:
- paygate_tool_calls_total{tool,status}: Tool calls (allowed/denied)
- paygate_credits_charged_total{tool}: Credits charged per tool
- paygate_denials_total{reason}: Denials by reason (insufficient_credits, rate_limited, etc.)
- paygate_rate_limit_hits_total{tool}: Rate limit hits per tool
- paygate_refunds_total{tool}: Credit refunds per tool
- paygate_http_requests_total{method,path,status}: HTTP requests
- paygate_active_keys_total: Active API keys (gauge)
- paygate_active_sessions_total: Active MCP sessions (gauge)
- paygate_total_credits_available: Total credits across all keys (gauge)
- paygate_uptime_seconds: Server uptime (gauge)
The /metrics endpoint is public (no auth required) for easy Prometheus scraping.
Key Cloning
Create a new API key with the same configuration as an existing key but fresh counters. Ideal for provisioning similar keys for team members, staging environments, or batch key creation:
# Clone with same config and credits
curl -X POST http://localhost:3402/keys/clone \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_source..."}'
# → { "message": "Key cloned", "key": "pg_newkey...", "name": "source-clone", "credits": 200, ... }
# Clone with overrides
curl -X POST http://localhost:3402/keys/clone \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_source...", "name": "staging-key", "credits": 50, "namespace": "staging"}'
What gets cloned: allowedTools, deniedTools, expiresAt, quota, tags, ipAllowlist, namespace, group, spendingLimit, autoTopup config. What gets reset: totalSpent, totalCalls, lastUsedAt, quotaDailyCalls, suspended state. You can override name, credits, tags, and namespace in the clone request. Suspended and expired keys can be cloned (but not revoked keys).
Key Rotation
Rotate an API key without losing credits, ACLs, quotas, or spending limits:
curl -X POST http://localhost:3402/keys/rotate \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_oldkey..."}'
# → { "message": "Key rotated", "newKey": "pg_newkey...", "name": "my-key", "credits": 500 }
The old key is immediately invalidated. All state (credits, totalSpent, totalCalls, ACL, quota, expiry, spending limit) transfers to the new key. Use this for periodic key rotation policies, compromised key response, or key migration.
Key Suspension & Resumption
Temporarily disable an API key without permanently revoking it. Suspended keys are denied at the gate (key_suspended reason), but admin operations (topup, ACL, quota, tags, etc.) still work, which makes this ideal for investigating abuse, pausing billing, or temporary lockouts:
# Suspend a key (with optional reason for audit trail)
curl -X POST http://localhost:3402/keys/suspend \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_abc123...", "reason": "investigating abuse"}'
# → { "message": "Key suspended", "suspended": true }
# Resume a suspended key
curl -X POST http://localhost:3402/keys/resume \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_abc123..."}'
# → { "message": "Key resumed", "suspended": false }
Suspension vs Revocation:
- Suspend: Reversible. Key remains active but is denied at the gate. Admin operations still work. Use for temporary lockouts.
- Revoke: Permanent. Key is deactivated and cannot be restored. Use for compromised or decommissioned keys.
Suspension fires key.suspended and key.resumed audit events and webhook notifications. Shadow mode allows suspended keys through (with shadow:key_suspended reason) for testing.
Per-Key Usage
Get a detailed usage breakdown for a specific API key, including per-tool stats, hourly time series, deny reasons, and recent events:
# Get full usage for a key
curl "http://localhost:3402/keys/usage?key=pg_abc123..." \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by time (ISO 8601)
curl "http://localhost:3402/keys/usage?key=pg_abc123...&since=2025-01-01T00:00:00Z" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response includes:
| Field | Description |
|---|---|
| key | Masked API key (first 10 chars + ...) |
| name | Key name |
| credits | Current credit balance |
| active / suspended | Key status |
| totalCalls | Total tool calls made |
| totalAllowed / totalDenied | Allowed vs denied breakdown |
| totalCreditsSpent | Total credits consumed |
| perTool | Per-tool breakdown: { calls, credits, denied } |
| denyReasons | Aggregated deny reasons with counts |
| timeSeries | Hourly buckets: { hour, calls, credits, denied } |
| recentEvents | Last 50 events (newest first) with tool, credits, and deny reason |
Works for active, suspended, and expired keys. Useful for debugging, billing audits, and per-customer analytics.
Webhook Test
Send a test event to your configured webhook URL to verify connectivity without generating real events:
# Send test event
curl -X POST http://localhost:3402/webhooks/test \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# With custom message
curl -X POST http://localhost:3402/webhooks/test \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"message": "Testing from staging deploy"}'
Response:
| Field | Description |
|---|---|
| url | Webhook URL (credentials masked) |
| success | true if webhook returned 2xx |
| statusCode | HTTP status code from webhook endpoint |
| responseTime | Round-trip delivery time in milliseconds |
| error | Error message (only on failure) |
The test event includes X-PayGate-Test: 1 header and X-PayGate-Signature when a webhook secret is configured. Returns 400 if no webhook URL is configured. Creates an audit trail entry (webhook.test).
Webhook Delivery Log
Query the log of all webhook delivery attempts (successes, failures, and retries):
# Get recent deliveries (default: last 50, newest first)
curl http://localhost:3402/webhooks/log \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by success/failure
curl "http://localhost:3402/webhooks/log?success=false" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by time and limit
curl "http://localhost:3402/webhooks/log?since=2025-01-01T00:00:00Z&limit=10" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Each entry includes:
| Field | Description |
|---|---|
| id | Auto-incrementing delivery ID |
| timestamp | When the delivery attempt was made |
| url | Webhook URL (credentials masked) |
| statusCode | HTTP status code (0 for connection errors) |
| success | true if webhook returned 2xx |
| responseTime | Round-trip time in milliseconds |
| attempt | Retry attempt number (0 = first attempt) |
| error | Error message (only on failure) |
| eventCount | Number of events in the batch |
| eventTypes | Distinct event types (e.g. ["usage"], ["key.created"]) |
Query parameters: limit (default 50, max 200), since (ISO 8601), success (true or false). Entries are capped at 500 in memory. Use alongside /webhooks/stats for aggregate counters.
Webhook Pause/Resume
Temporarily halt webhook delivery during maintenance windows. Events are buffered (not lost) and flushed on resume:
# Pause delivery
curl -X POST http://localhost:3402/webhooks/pause \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Check pause status (visible in /webhooks/stats)
curl http://localhost:3402/webhooks/stats \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "paused": true, "pausedAt": "2025-...", "bufferedEvents": 12, ... }
# Resume delivery (flushes buffered events)
curl -X POST http://localhost:3402/webhooks/resume \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "paused": false, "flushedEvents": 12 }
While paused, events continue to accumulate in the buffer. On resume, all buffered events are flushed immediately. The pause state and buffered event count are visible in /webhooks/stats. Creates audit trail entries (webhook.pause, webhook.resume).
Key Aliases
Assign human-readable aliases to API keys so you can reference them by name instead of opaque key IDs in admin endpoints:
# Set an alias
curl -X POST http://localhost:3402/keys/alias \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_abc123...", "alias": "prod-backend"}'
# → { "key": "pg_abc12...", "alias": "prod-backend", "message": "Alias set to \"prod-backend\"" }
# Use the alias in any admin endpoint
curl -X POST http://localhost:3402/topup \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "prod-backend", "credits": 500}'
curl -X POST http://localhost:3402/keys/suspend \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "prod-backend", "reason": "maintenance"}'
curl -X POST http://localhost:3402/keys/transfer \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"from": "prod-backend", "to": "staging-api", "credits": 100}'
# Clear an alias
curl -X POST http://localhost:3402/keys/alias \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "prod-backend", "alias": null}'
| Field | Description |
|---|---|
| alias | 1-100 chars, alphanumeric + hyphens + underscores only |
| Uniqueness | Aliases must be unique across all keys and cannot collide with existing key IDs |
| Scope | Aliases work in all admin endpoints (topup, revoke, suspend, resume, clone, transfer, usage). They do not work for API key authentication on /mcp |
| Persistence | Aliases are saved to the state file and survive server restarts |
| Clone | Cloned keys do not inherit the source key's alias |
| Audit | key.alias_set event logged for every set/clear operation |
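The format and uniqueness rules in the table can be checked client-side before calling /keys/alias. A minimal sketch; the regular expression is derived from the documented character set, and `isValidAlias` is a hypothetical helper rather than an SDK export:

```javascript
// 1-100 characters: letters, digits, hyphens, underscores
const ALIAS_PATTERN = /^[A-Za-z0-9_-]{1,100}$/;

// Hypothetical validator: format check plus the two documented collision rules.
function isValidAlias(alias, existingKeyIds = new Set(), existingAliases = new Set()) {
  return typeof alias === 'string' &&
    ALIAS_PATTERN.test(alias) &&
    !existingKeyIds.has(alias) &&
    !existingAliases.has(alias);
}

console.log(isValidAlias('prod-backend')); // true
console.log(isValidAlias('prod backend')); // false (space not allowed)
```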
Key Expiry Scanner
Proactive background scanner that detects API keys approaching expiration and sends webhook notifications before they expire, even if the keys are not actively being used:
# Query keys expiring within 24 hours (default)
curl http://localhost:3402/keys/expiring \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# → { "within": 86400, "count": 2, "scanner": { ... }, "keys": [ ... ] }
# Query keys expiring within 7 days
curl "http://localhost:3402/keys/expiring?within=604800" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Configure the scanner in your config file:
{
"expiryScanner": {
"enabled": true,
"intervalSeconds": 3600,
"thresholds": [604800, 86400, 3600]
}
}
| Field | Description |
|---|---|
| enabled | Enable/disable the background scanner. Default: true |
| intervalSeconds | How often to scan (seconds). Default: 3600 (1 hour). Min: 60 |
| thresholds | Seconds before expiry to notify. Default: [604800, 86400, 3600] (7d, 24h, 1h) |
| Webhook | Fires key.expiry_warning events with key name, alias, namespace, expiry time, and remaining seconds |
| De-duplication | Each key+threshold pair is only notified once (no duplicate alerts) |
| Progressive | Largest threshold fires first, then progressively smaller thresholds on subsequent scans |
| Audit | key.expiry_warning event logged for every notification |
| Endpoint | GET /keys/expiring?within=N lists keys expiring within N seconds (default: 86400) |
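One way to read the de-duplication and progressive rows is as a per-scan selection step: on each scan, fire the largest threshold the key has crossed that has not yet been notified. The sketch below is an interpretation of the table, not the scanner's actual code; `nextExpiryWarning` and the "keyId:threshold" bookkeeping format are hypothetical.

```javascript
// Hypothetical scan step. `notified` holds "keyId:threshold" pairs already sent.
// Returns the threshold to notify for on this scan, or null if nothing is due.
function nextExpiryWarning(keyId, remainingSec, thresholds, notified) {
  if (remainingSec <= 0) return null; // already expired
  const sorted = [...thresholds].sort((a, b) => b - a); // largest first
  for (const t of sorted) {
    const id = `${keyId}:${t}`;
    if (remainingSec <= t && !notified.has(id)) {
      notified.add(id); // each key+threshold pair fires only once
      return t;
    }
  }
  return null;
}

const notified = new Set();
const thresholds = [604800, 86400, 3600];
console.log(nextExpiryWarning('key-1', 500000, thresholds, notified)); // 604800
console.log(nextExpiryWarning('key-1', 496400, thresholds, notified)); // null
console.log(nextExpiryWarning('key-1', 80000, thresholds, notified));  // 86400
```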
Key Templates
Named templates for API key creation. Define reusable presets and create keys with template: "free-tier":
# Create a template
curl -X POST http://localhost:3402/keys/templates \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{
"name": "free-tier",
"description": "Free plan with basic access",
"credits": 50,
"allowedTools": ["search", "read"],
"deniedTools": ["admin"],
"tags": {"plan": "free"},
"namespace": "public",
"expiryTtlSeconds": 2592000,
"spendingLimit": 200
}'
# List all templates
curl http://localhost:3402/keys/templates \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Create a key from template (inherits all defaults)
curl -X POST http://localhost:3402/keys \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "new-user", "template": "free-tier"}'
# Create a key from template with overrides
curl -X POST http://localhost:3402/keys \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "vip-user", "template": "free-tier", "credits": 500, "tags": {"plan": "vip"}}'
# Delete a template
curl -X POST http://localhost:3402/keys/templates/delete \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "free-tier"}'
| Feature | Details |
|---|---|
| Fields | credits, allowedTools, deniedTools, quota, ipAllowlist, spendingLimit, tags, namespace, expiryTtlSeconds, autoTopup |
| Override | Explicit params in POST /keys always override template defaults |
| TTL | expiryTtlSeconds sets expiry relative to key creation time (0 = never) |
| Limit | Max 100 templates per server |
| Persistence | -templates.json alongside state file, survives restarts |
| Audit | template.created, template.updated, template.deleted events |
| Prometheus | paygate_templates_total gauge tracks template count |
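The Override and TTL rows amount to a merge followed by an expiry calculation. A minimal sketch, assuming a shallow merge (an explicit tags object replaces the template's tags rather than merging with them) and an ISO 8601 expiresAt value; both are assumptions, and `applyTemplate` is a hypothetical name:

```javascript
// Hypothetical merge: explicit request fields win over template defaults,
// and expiryTtlSeconds becomes an absolute expiresAt (0 = never expires).
function applyTemplate(template, request, nowMs = Date.now()) {
  // Drop template-only metadata so it does not leak into the key
  const { expiryTtlSeconds, name: _templateName, description: _description, ...defaults } = template;
  const { template: _templateRef, ...explicit } = request;
  const key = { ...defaults, ...explicit };
  if (key.expiresAt === undefined && expiryTtlSeconds > 0) {
    key.expiresAt = new Date(nowMs + expiryTtlSeconds * 1000).toISOString();
  }
  return key;
}

const freeTier = {
  name: 'free-tier', credits: 50, allowedTools: ['search', 'read'],
  tags: { plan: 'free' }, expiryTtlSeconds: 2592000,
};
const vip = applyTemplate(freeTier, {
  name: 'vip-user', template: 'free-tier', credits: 500, tags: { plan: 'vip' },
});
console.log(vip.credits, vip.tags.plan, vip.allowedTools); // 500 vip [ 'search', 'read' ]
```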
Environment Variables Config
Configure everything via PAYGATE_* environment variables, which is ideal for Docker, Kubernetes, and CI/CD deployments:
# Docker example
docker run -e PAYGATE_SERVER="node /app/server.js" \
-e PAYGATE_PORT=8080 \
-e PAYGATE_PRICE=5 \
-e PAYGATE_ADMIN_KEY=sk-admin-secret \
-e PAYGATE_REDIS_URL=redis://redis:6379 \
-e PAYGATE_WEBHOOK_URL=https://hooks.example.com/billing \
-p 8080:8080 node:20 npx paygate-mcp wrap
# Or use a config file via env var
docker run -e PAYGATE_CONFIG=/etc/paygate/config.json \
-v "$(pwd)/config.json:/etc/paygate/config.json" \
-p 3402:3402 node:20 npx paygate-mcp wrap
All 18 supported environment variables:
| Env Var | CLI Flag | Description |
|---|---|---|
| PAYGATE_SERVER | --server | MCP server command to wrap (stdio) |
| PAYGATE_REMOTE_URL | --remote-url | Remote MCP server URL (HTTP) |
| PAYGATE_CONFIG | --config | Path to JSON config file |
| PAYGATE_PORT | --port | Server port (default: 3402) |
| PAYGATE_PRICE | --price | Credits per tool call (default: 1) |
| PAYGATE_RATE_LIMIT | --rate-limit | Max calls per minute per key (default: 60) |
| PAYGATE_NAME | --name | Server name for display |
| PAYGATE_SHADOW | --shadow | Enable shadow mode (true/false) |
| PAYGATE_ADMIN_KEY | --admin-key | Admin API key |
| PAYGATE_STATE_FILE | --state-file | Persistent state file path |
| PAYGATE_WEBHOOK_URL | --webhook-url | Webhook delivery URL |
| PAYGATE_WEBHOOK_SECRET | --webhook-secret | HMAC-SHA256 webhook secret |
| PAYGATE_WEBHOOK_RETRIES | --webhook-retries | Max webhook retry attempts |
| PAYGATE_REFUND_ON_FAILURE | --refund-on-failure | Refund credits on tool failure (true/false) |
| PAYGATE_REDIS_URL | --redis-url | Redis URL for horizontal scaling |
| PAYGATE_DRY_RUN | --dry-run | Discover tools and exit (true/false) |
| PAYGATE_TOOL_PRICE | --tool-price | Per-tool pricing (tool=price,...) |
| PAYGATE_STRIPE_SECRET | --stripe-secret | Stripe secret key for payments |
Priority: CLI flags > env vars > config file > defaults. This means you can set defaults via env vars in Docker and override specific values on the command line.
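The precedence order can be pictured as a first-defined-wins lookup across four sources. A minimal sketch with a hypothetical `resolveSetting` helper; how PayGate actually stores and normalises these sources is not shown here:

```javascript
// Hypothetical resolver: the first source that defines a value wins.
// Order mirrors the documented priority: CLI > env > config file > defaults.
function resolveSetting(name, { cli = {}, env = {}, file = {}, defaults = {} }) {
  for (const source of [cli, env, file, defaults]) {
    if (source[name] !== undefined) return source[name];
  }
  return undefined;
}

const sources = {
  cli: { price: 10 },
  env: { port: 8080, price: 5 },
  file: { port: 9000, rateLimit: 30 },
  defaults: { port: 3402, price: 1, rateLimit: 60 },
};
console.log(resolveSetting('price', sources));     // 10 (CLI beats env)
console.log(resolveSetting('port', sources));      // 8080 (env beats file)
console.log(resolveSetting('rateLimit', sources)); // 30 (file beats default)
```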
Request ID Tracking
Every HTTP response includes an X-Request-Id header for distributed tracing. If the incoming request has an X-Request-Id header (e.g., from a load balancer or API gateway), it is propagated through. Otherwise, a new ID is auto-generated with the format req_<16 hex chars>.
# Auto-generated request ID
curl -v http://localhost:3402/health
# < X-Request-Id: req_a1b2c3d4e5f67890
# Propagate your own trace ID
curl -v -H "X-Request-Id: my-trace-123" http://localhost:3402/health
# < X-Request-Id: my-trace-123
| Feature | Details |
|---|---|
| Format | req_ + 16 hex chars (8 bytes of randomness) |
| Propagation | Incoming X-Request-Id header is preserved and returned |
| CORS | Included in Access-Control-Allow-Headers and Access-Control-Expose-Headers |
| Audit | Request ID appears in gate.allow, gate.deny, and session.created audit metadata |
| Exports | generateRequestId() and getRequestId(req) available in SDK |
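The format and propagation rows can be sketched in a few lines. These are hypothetical stand-ins written for illustration, not the SDK's generateRequestId() and getRequestId(req) themselves:

```javascript
const crypto = require('node:crypto');

// "req_" followed by 8 random bytes rendered as 16 hex characters
function newRequestId() {
  return `req_${crypto.randomBytes(8).toString('hex')}`;
}

// Reuse the caller's X-Request-Id when present, otherwise generate one.
// Node lower-cases incoming header names on req.headers.
function requestIdFor(req) {
  const incoming = req.headers['x-request-id'];
  return typeof incoming === 'string' && incoming.length > 0 ? incoming : newRequestId();
}

console.log(requestIdFor({ headers: { 'x-request-id': 'my-trace-123' } })); // my-trace-123
```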
Server Info Endpoint
GET /info returns a comprehensive JSON object describing the server's capabilities. It is a public endpoint; no admin key is required.
curl http://localhost:3402/info
{
"name": "My API Server",
"version": "5.5.0",
"transport": "stdio",
"port": 3402,
"auth": ["api_key", "scoped_token"],
"features": {
"shadowMode": false,
"webhooks": true,
"webhookSignatures": true,
"refundOnFailure": true,
"redis": false,
"oauth": false,
"plugins": false,
"multiServer": false
},
"pricing": {
"defaultCreditsPerCall": 1,
"toolPricing": {
"expensive-tool": { "creditsPerCall": 10 }
}
},
"rateLimit": { "globalPerMin": 60 },
"endpoints": {
"mcp": "/mcp",
"health": "/health",
"info": "/info",
"status": "/status (admin)",
"keys": "/keys (admin)",
"metrics": "/metrics",
"pricing": "/pricing",
"audit": "/audit (admin)",
"analytics": "/analytics (admin)"
}
}
Configurable CORS
Control which browser origins can access your PayGate server. Default is * (allow all).
# CLI flag: single origin
npx paygate-mcp wrap --server "..." --cors-origin "https://myapp.com"
# CLI flag: multiple origins (comma-separated)
npx paygate-mcp wrap --server "..." --cors-origin "https://app1.com,https://app2.com"
# Env var
PAYGATE_CORS_ORIGIN=https://myapp.com npx paygate-mcp wrap --server "..."
Config file:
{
"cors": {
"origin": ["https://app1.com", "https://app2.com"],
"credentials": true,
"maxAge": 3600
}
}
| Feature | Details |
|---|---|
| Default | * (allow all origins) |
| Single origin | Exact match against request Origin header |
| Multiple origins | Array of allowed origins, matched against request |
| Credentials | Access-Control-Allow-Credentials: true when enabled |
| Max-Age | Preflight cache duration (default: 86400 = 24 hours) |
| Vary | Vary: Origin header added when origin is not * |
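The table rows can be combined into one header-building function. This is a sketch of the described behaviour under an extra assumption: when the request origin is not in the allowed list, no Access-Control-Allow-Origin header is emitted. `corsHeaders` is a hypothetical name.

```javascript
// Hypothetical header builder. `allowed` is "*", a single origin string,
// or an array of origins.
function corsHeaders(allowed, requestOrigin, { credentials = false, maxAge = 86400 } = {}) {
  const headers = { 'Access-Control-Max-Age': String(maxAge) };
  if (allowed === '*') {
    headers['Access-Control-Allow-Origin'] = '*';
    return headers;
  }
  headers['Vary'] = 'Origin'; // response now depends on the request origin
  const list = Array.isArray(allowed) ? allowed : [allowed];
  if (requestOrigin && list.includes(requestOrigin)) { // exact match only
    headers['Access-Control-Allow-Origin'] = requestOrigin;
    if (credentials) headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}

console.log(corsHeaders(['https://app1.com', 'https://app2.com'], 'https://app2.com',
  { credentials: true, maxAge: 3600 }));
```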
Custom Response Headers
Add custom HTTP headers to all responses, for example security headers, cache control, or custom tracking.
# CLI flag: single header
npx paygate-mcp wrap --server "..." --header "X-Frame-Options:DENY"
# CLI flag: multiple headers (comma-separated)
npx paygate-mcp wrap --server "..." --header "X-Frame-Options:DENY,X-Content-Type-Options:nosniff"
# Env var
PAYGATE_CUSTOM_HEADERS="X-Frame-Options:DENY,X-Content-Type-Options:nosniff" npx paygate-mcp wrap --server "..."
Config file:
{
"customHeaders": {
"X-Frame-Options": "DENY",
"X-Content-Type-Options": "nosniff",
"Strict-Transport-Security": "max-age=31536000; includeSubDomains",
"X-Custom-Tag": "my-service"
}
}
Custom headers are applied to every HTTP response (health, info, admin, MCP, preflight) and coexist with CORS headers and request IDs. They do not override built-in headers.
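The comma-separated Name:Value format and the rule that built-in headers take precedence can be sketched as follows. Both helpers are hypothetical, and splitting on the first colon only is an assumption about how values containing colons are handled:

```javascript
// Hypothetical parser for "Name:Value,Name:Value".
// Splits each pair on the first colon only, so values may contain colons.
function parseHeaderList(spec) {
  const headers = {};
  for (const pair of spec.split(',')) {
    const i = pair.indexOf(':');
    if (i <= 0) continue; // skip malformed entries
    headers[pair.slice(0, i).trim()] = pair.slice(i + 1).trim();
  }
  return headers;
}

// Built-in headers win: custom values are added only for names not already set.
function mergeHeaders(builtIn, custom) {
  const taken = new Set(Object.keys(builtIn).map((k) => k.toLowerCase()));
  const merged = { ...builtIn };
  for (const [name, value] of Object.entries(custom)) {
    if (!taken.has(name.toLowerCase())) merged[name] = value;
  }
  return merged;
}

console.log(parseHeaderList('X-Frame-Options:DENY,X-Content-Type-Options:nosniff'));
// { 'X-Frame-Options': 'DENY', 'X-Content-Type-Options': 'nosniff' }
```

With a parser like this, a header value that itself contains a comma cannot be expressed on the command line; the config file form avoids that limitation.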
Config Export
Inspect the running server configuration for debugging and verification:
curl http://localhost:3402/config -H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns the full config with sensitive values masked:
| Field | Masking |
|---|---|
| webhookSecret | *** |
| webhookUrl | Scheme + host only (e.g. https://hooks.example.com/***) |
| serverCommand | *** |
| serverArgs | ['***'] |
| Webhook filter secrets | *** |
| Webhook filter URLs | Scheme + host only |
Non-sensitive values (pricing, rate limits, CORS, custom headers, quotas, etc.) are returned as-is. Each export is recorded in the audit trail as config.export.
Trusted Proxies
When running behind load balancers or reverse proxies, configure trusted proxy IPs/CIDRs so PayGate extracts the real client IP from the X-Forwarded-For header correctly:
# CLI flag (comma-separated IPs and/or CIDRs)
paygate-mcp wrap --server "node server.js" --trusted-proxies "10.0.0.0/8,172.16.0.0/12"
# Environment variable
PAYGATE_TRUSTED_PROXIES="10.0.0.0/8,172.16.0.0/12" paygate-mcp wrap --server "node server.js"
Config file:
{
"serverCommand": "node",
"serverArgs": ["server.js"],
"trustedProxies": ["10.0.0.0/8", "172.16.0.0/12", "192.168.1.1"]
}
How it works: Without trusted proxies, the first X-Forwarded-For value is used (backward compatible). With trusted proxies configured, the header is walked right-to-left, skipping IPs that match the trusted list, and the first non-trusted IP is returned as the real client IP. This is critical for accurate IP allowlisting when behind proxies.
Supports exact IPv4 addresses and CIDR notation (/8, /16, /24, /32, etc.). The resolveClientIp function is also exported from the SDK for custom use.
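The right-to-left walk can be sketched for IPv4 as below. This is an illustration and not the SDK's resolveClientIp: the function names are hypothetical, and two fallbacks are assumptions (using the socket address when the header is absent, and returning the left-most entry when every hop is trusted).

```javascript
// IPv4 dotted quad -> unsigned 32-bit integer, or null if malformed.
function ipv4ToInt(ip) {
  const parts = ip.trim().split('.');
  if (parts.length !== 4) return null;
  let n = 0;
  for (const p of parts) {
    if (!/^\d{1,3}$/.test(p) || Number(p) > 255) return null;
    n = n * 256 + Number(p);
  }
  return n;
}

// True if ip falls inside entry (an exact IP or a CIDR such as "10.0.0.0/8").
function matchesEntry(ip, entry) {
  const [base, bitsStr] = entry.split('/');
  const bits = bitsStr === undefined ? 32 : Number(bitsStr);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null || !(bits >= 0 && bits <= 32)) return false;
  const size = 2 ** (32 - bits); // number of addresses in the block
  return Math.floor(ipInt / size) === Math.floor(baseInt / size);
}

// Hypothetical resolver implementing the documented walk.
function clientIpFrom(xForwardedFor, socketIp, trusted = []) {
  const chain = (xForwardedFor || '').split(',').map((s) => s.trim()).filter(Boolean);
  if (chain.length === 0) return socketIp;   // assumption: fall back to socket address
  if (trusted.length === 0) return chain[0]; // legacy behaviour: first value
  for (let i = chain.length - 1; i >= 0; i--) {
    if (!trusted.some((t) => matchesEntry(chain[i], t))) return chain[i];
  }
  return chain[0]; // assumption: every hop was trusted
}

const trusted = ['10.0.0.0/8', '172.16.0.0/12'];
// A client-supplied "1.2.3.4" at the front of the header is ignored:
console.log(clientIpFrom('1.2.3.4, 203.0.113.7, 10.1.2.3', '10.1.2.3', trusted)); // 203.0.113.7
```

Walking from the right matters because only the entries appended by your own proxies can be trusted; anything to the left of the first untrusted address may have been supplied by the client.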
Key Listing Pagination
The GET /keys endpoint supports pagination, filtering, and sorting when any query parameter is present:
# Paginate: 10 keys per page, second page
curl "http://localhost:3402/keys?limit=10&offset=10" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by namespace and active status
curl "http://localhost:3402/keys?limit=50&namespace=prod&active=true" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Sort by credits descending
curl "http://localhost:3402/keys?limit=20&sortBy=credits&order=desc" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by credit range
curl "http://localhost:3402/keys?limit=50&minCredits=100&maxCredits=1000" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Find keys by name prefix
curl "http://localhost:3402/keys?limit=50&namePrefix=prod-" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Query Parameters:
| Parameter | Type | Description |
|---|---|---|
| limit | number | Results per page (1–500, default: 50) |
| offset | number | Skip N results (default: 0) |
| sortBy | string | Sort field: createdAt, name, credits, lastUsedAt, totalSpent, totalCalls |
| order | string | Sort direction: asc or desc (default: desc) |
| namespace | string | Filter by namespace |
| group | string | Filter by group ID |
| active | string | true or false |
| suspended | string | true or false |
| expired | string | true or false |
| namePrefix | string | Case-insensitive name prefix match |
| minCredits | number | Minimum credits (inclusive) |
| maxCredits | number | Maximum credits (inclusive) |
Response format (when any pagination/filter param is present):
{
"keys": [...],
"total": 150,
"offset": 20,
"limit": 10,
"hasMore": true
}
Backward compatible: Without any pagination/filter/sort params, GET /keys returns the same flat array as before.
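The relationship between limit, offset, total, and hasMore in the paginated response can be sketched with an in-memory list. This is illustrative only: `paginate` is a hypothetical function, the default sort field (createdAt) is assumed because it is not stated above, and out-of-range limits are clamped here whereas the server might reject them instead.

```javascript
// Hypothetical in-memory version of the listing logic.
function paginate(keys, { limit = 50, offset = 0, sortBy = 'createdAt', order = 'desc' } = {}) {
  const size = Math.min(500, Math.max(1, limit)); // keep within the documented 1-500 range
  const dir = order === 'asc' ? 1 : -1;
  const sorted = [...keys].sort((a, b) =>
    a[sortBy] < b[sortBy] ? -dir : a[sortBy] > b[sortBy] ? dir : 0);
  const page = sorted.slice(offset, offset + size);
  return {
    keys: page,
    total: sorted.length,
    offset,
    limit: size,
    hasMore: offset + page.length < sorted.length, // more results after this page?
  };
}

const allKeys = Array.from({ length: 25 }, (_, i) => ({ name: `k${i}`, credits: i, createdAt: i }));
const page2 = paginate(allKeys, { limit: 10, offset: 10, sortBy: 'credits', order: 'desc' });
console.log(page2.keys.length, page2.keys[0].credits, page2.hasMore); // 10 14 true
```

A client can keep requesting pages with offset increased by limit until hasMore is false.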
Key Statistics
GET /keys/stats returns aggregate statistics across all keys:
# Get all key statistics
curl http://localhost:3402/keys/stats -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by namespace
curl "http://localhost:3402/keys/stats?namespace=prod" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"total": 150,
"active": 120,
"suspended": 10,
"expired": 15,
"revoked": 5,
"totalCreditsAllocated": 500000,
"totalCreditsSpent": 125000,
"totalCreditsRemaining": 375000,
"totalCalls": 84200,
"byNamespace": { "prod": 80, "staging": 50, "default": 20 },
"byGroup": { "enterprise": 30, "starter": 45 }
}
When ?namespace= is provided, all counts/aggregates are scoped to that namespace, and a filteredByNamespace field is included in the response.
Rate Limit Status
GET /keys/rate-limit-status?key=... returns the current rate limit window state for a key without consuming a call:
curl "http://localhost:3402/keys/rate-limit-status?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc123...",
"name": "my-key",
"global": {
"limit": 100,
"used": 23,
"remaining": 77,
"resetInMs": 45000,
"windowMs": 60000
},
"perTool": {
"search": { "limit": 10, "used": 5, "remaining": 5, "resetInMs": 30000 },
"translate": { "limit": 20, "used": 0, "remaining": 20, "resetInMs": 60000 }
}
}
perTool is only present when tools have per-tool rate limits configured via toolPricing. Tools without custom rate limits are not included.
Quota Status
GET /keys/quota-status?key=... returns daily/monthly quota usage for a key:
curl "http://localhost:3402/keys/quota-status?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc123...",
"name": "my-key",
"quotaSource": "global",
"daily": {
"callsUsed": 42,
"callsLimit": 100,
"callsRemaining": 58,
"creditsUsed": 150,
"creditsLimit": 500,
"creditsRemaining": 350,
"resetDay": "2026-02-26"
},
"monthly": {
"callsUsed": 850,
"callsLimit": 2000,
"callsRemaining": 1150,
"creditsUsed": 3200,
"creditsLimit": 10000,
"creditsRemaining": 6800,
"resetMonth": "2026-02"
}
}
quotaSource indicates where the quota is configured: "per-key" (key-level override), "global" (server-wide config), or "none" (no quota). When a limit is 0 (unlimited), remaining is null.
Credit History
GET /keys/credit-history?key=... returns the credit mutation log for a key:
curl "http://localhost:3402/keys/credit-history?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by type, limit, or since timestamp
curl "http://localhost:3402/keys/credit-history?key=pg_...&type=topup&limit=10" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc123...",
"name": "my-key",
"currentBalance": 700,
"totalEntries": 3,
"entries": [
{
"timestamp": "2026-02-26T12:30:00.000Z",
"type": "topup",
"amount": 200,
"balanceBefore": 500,
"balanceAfter": 700
},
{
"timestamp": "2026-02-26T12:00:00.000Z",
"type": "initial",
"amount": 500,
"balanceBefore": 0,
"balanceAfter": 500
}
]
}
Entry types: initial, topup, transfer_in, transfer_out, auto_topup, deduction, refund, bulk_topup. Entries are newest-first, capped at 100 per key. Transfers include a memo field when provided.
Spending Velocity
GET /keys/spending-velocity?key=... returns credit burn rate and depletion forecast:
curl "http://localhost:3402/keys/spending-velocity?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Custom analysis window (default 24h, max 720h/30d)
curl "http://localhost:3402/keys/spending-velocity?key=pg_...&window=48" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc123...",
"name": "my-key",
"currentBalance": 750,
"velocity": {
"creditsPerHour": 12.5,
"creditsPerDay": 300,
"callsPerHour": 2.5,
"callsPerDay": 60,
"estimatedDepletionDate": "2026-03-01T18:00:00.000Z",
"estimatedHoursRemaining": 60,
"windowHours": 24,
"dataPoints": 45
},
"topTools": [
{ "tool": "search", "calls": 30, "credits": 150 },
{ "tool": "generate", "calls": 15, "credits": 120 }
]
}
estimatedDepletionDate and estimatedHoursRemaining are null when there's no spending activity. topTools shows the 5 highest-spend tools from usage data.
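The forecast fields can be related to each other with simple arithmetic. The sketch below is an assumption about how such a forecast could be derived (average burn over the window, then balance divided by burn rate); the function name forecastDepletion and its parameters are hypothetical, and the server's actual calculation may differ.

```typescript
interface VelocityForecast {
  creditsPerHour: number;
  creditsPerDay: number;
  estimatedHoursRemaining: number | null;
  estimatedDepletionDate: string | null;
}

// Hypothetical forecast: average burn rate over the window, projected forward.
function forecastDepletion(
  currentBalance: number,
  creditsSpentInWindow: number,
  windowHours: number,
  now: Date,
): VelocityForecast {
  const creditsPerHour = creditsSpentInWindow / windowHours;
  if (creditsPerHour <= 0) {
    // No spending activity: both forecast fields are null.
    return { creditsPerHour: 0, creditsPerDay: 0, estimatedHoursRemaining: null, estimatedDepletionDate: null };
  }
  const hoursRemaining = currentBalance / creditsPerHour;
  const depletion = new Date(now.getTime() + hoursRemaining * 3600000);
  return {
    creditsPerHour,
    creditsPerDay: creditsPerHour * 24,
    estimatedHoursRemaining: hoursRemaining,
    estimatedDepletionDate: depletion.toISOString(),
  };
}
```

With a balance of 750 and 300 credits spent over a 24-hour window, this gives 12.5 credits per hour and 60 hours remaining, which matches the sample response.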
Key Comparison
GET /keys/compare?keys=pg_a,pg_b,pg_c returns side-by-side comparison of 2–10 keys:
curl "http://localhost:3402/keys/compare?keys=pg_abc,pg_xyz" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"compared": 2,
"keys": [
{
"key": "pg_abc123...",
"name": "prod-agent",
"status": "active",
"credits": { "current": 750, "totalSpent": 250 },
"usage": { "totalCalls": 50, "totalAllowed": 48, "totalDenied": 2 },
"velocity": { "creditsPerHour": 12.5, "creditsPerDay": 300, "estimatedHoursRemaining": 60 },
"rateLimit": { "used": 3, "limit": 60, "remaining": 57 },
"metadata": { "namespace": "prod", "group": "team-a", "createdAt": "2026-02-01T00:00:00Z", "tags": { "env": "prod" } }
},
{
"key": "pg_xyz789...",
"name": "staging-agent",
"status": "active",
"credits": { "current": 200, "totalSpent": 800 },
"usage": { "totalCalls": 120, "totalAllowed": 120, "totalDenied": 0 },
"velocity": { "creditsPerHour": 8.3, "creditsPerDay": 200, "estimatedHoursRemaining": 24 },
"rateLimit": { "used": 0, "limit": 60, "remaining": 60 },
"metadata": { "namespace": "staging", "group": null, "createdAt": "2026-02-15T00:00:00Z", "tags": {} }
}
]
}
Keys not found are reported in a notFound array. Supports aliases. Maximum 10 keys per comparison.
Key Health Score
GET /keys/health?key=... returns a composite health score (0–100) with weighted component breakdown:
curl "http://localhost:3402/keys/health?key=pg_abc" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc123...",
"name": "prod-agent",
"score": 72,
"status": "caution",
"issues": ["Key expires within 7 days", "Credits depleting rapidly"],
"components": {
"balance": { "score": 40, "risk": "warning", "weight": 0.30 },
"quota": { "score": 85, "risk": "good", "weight": 0.25 },
"rateLimit": { "score": 100, "risk": "healthy", "weight": 0.20 },
"errorRate": { "score": 75, "risk": "good", "weight": 0.25 }
}
}
Components: balance (30%, hours until credit depletion), quota (25%, max utilization across daily/monthly limits), rateLimit (20%, current window usage), errorRate (25%, denied/total ratio). Status: healthy (≥90), good (≥75), caution (≥50), warning (≥25), critical (<25). Issues detect: revoked, suspended, expired, expiring soon, zero credits, rapid depletion. Supports aliases.
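The weighted composite and the status thresholds can be reproduced in a few lines. This is a minimal sketch using the weights and thresholds listed above; the function names are hypothetical, and rounding to the nearest integer is an assumption.

```typescript
interface HealthComponent {
  score: number;  // 0-100
  weight: number; // weights are expected to sum to 1
}

// Weighted sum of component scores (rounding is an assumption).
function compositeScore(components: Record<string, HealthComponent>): number {
  let total = 0;
  for (const component of Object.values(components)) {
    total += component.score * component.weight;
  }
  return Math.round(total);
}

// Map a composite score to the documented status bands.
function statusFor(score: number): string {
  if (score >= 90) return "healthy";
  if (score >= 75) return "good";
  if (score >= 50) return "caution";
  if (score >= 25) return "warning";
  return "critical";
}
```

Applied to the sample response, 40 × 0.30 + 85 × 0.25 + 100 × 0.20 + 75 × 0.25 = 72, which falls in the caution band.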
Maintenance Mode
Put your server into maintenance mode to gracefully reject client traffic while keeping admin endpoints operational:
# Enable maintenance mode
curl -X POST http://localhost:3402/maintenance \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"enabled": true, "message": "Upgrading to v7, back in 10 minutes"}'
# Check maintenance status
curl http://localhost:3402/maintenance -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Disable maintenance mode
curl -X POST http://localhost:3402/maintenance \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"enabled": false}'
Response (enabled):
{
"enabled": true,
"message": "Upgrading to v7, back in 10 minutes",
"since": "2025-03-15T14:30:00.000Z"
}
When enabled, all /mcp requests return 503 with the custom message. Admin endpoints (/keys, /maintenance, /audit, etc.) remain fully operational. GET /health returns {"status": "maintenance"}. Both enable and disable actions are recorded in the audit trail (maintenance.enabled / maintenance.disabled).
Admin Event Stream
Stream real-time server events to admin clients via Server-Sent Events (SSE):
# Stream all events
curl -N http://localhost:3402/admin/events \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Accept: text/event-stream"
# Stream only key operations
curl -N "http://localhost:3402/admin/events?types=key.created,key.revoked,key.topup" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Accept: text/event-stream"
Events:
event: connected
data: {"message":"Admin event stream connected","filters":"all"}
event: audit
data: {"id":42,"timestamp":"2025-03-15T14:30:00.000Z","type":"key.created","actor":"admin","message":"Key created: prod-agent","metadata":{...}}
event: audit
data: {"id":43,"timestamp":"2025-03-15T14:30:01.000Z","type":"gate.allow","actor":"pg_abc12...","message":"Allowed: get_weather","metadata":{...}}
Every audit event (tool calls, denials, key operations, maintenance, alerts) is broadcast in real-time. Use ?types= to filter by comma-separated event types. Supports multiple concurrent admin clients. Keepalive pings every 15s prevent connection timeouts. Connections are cleaned up automatically on disconnect.
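On the client side, the stream can be consumed by splitting the text into blank-line-separated blocks and reading the event and data fields. The sketch below follows the generic Server-Sent Events wire format; parseSse is a hypothetical helper, and treating keepalive pings as comment lines (starting with a colon) is an assumption about this server.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parse a chunk of SSE text into events (hypothetical helper).
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default event name
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment line (assumed keepalive format)
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    // Blocks without data (for example keepalives) produce no event.
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}
```

Each audit event's data field is a JSON string, so a consumer would typically call JSON.parse on it and switch on the type property.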
Key Notes
Attach timestamped notes to API keys for operational tracking:
# Add a note
curl -X POST http://localhost:3402/keys/notes \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "text": "Increased credits per customer request #1234"}'
# List notes
curl "http://localhost:3402/keys/notes?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Delete a note by index
curl -X DELETE "http://localhost:3402/keys/notes?key=pg_...&index=0" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response (list):
{
"key": "pg_abc1...2345",
"notes": [
{ "timestamp": "2025-03-15T14:30:00.000Z", "author": "admin", "text": "Increased credits per customer request #1234" },
{ "timestamp": "2025-03-16T09:00:00.000Z", "author": "admin", "text": "Upgraded to premium tier" }
],
"count": 2
}
Max 50 notes per key, 1000 characters per note. Works on suspended and revoked keys. Supports aliases. All add/delete operations are recorded in the audit trail (key.note_added / key.note_deleted).
Scheduled Actions
Schedule future-dated actions on API keys to automatically revoke, suspend, or top up credits at a specified time:
# Schedule a key revocation at a fixed time
curl -X POST http://localhost:3402/keys/schedule \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "action": "revoke", "executeAt": "2025-04-01T00:00:00Z"}'
# Schedule a credit top-up
curl -X POST http://localhost:3402/keys/schedule \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "action": "topup", "executeAt": "2025-04-01T00:00:00Z", "params": {"credits": 500}}'
# List all pending schedules (optional ?key= filter)
curl "http://localhost:3402/keys/schedule" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Cancel a schedule
curl -X DELETE "http://localhost:3402/keys/schedule?id=sched_1" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response (create):
{
"id": "sched_1",
"key": "pg_abc1...2345",
"action": "revoke",
"executeAt": "2025-04-01T00:00:00.000Z",
"createdAt": "2025-03-15T10:30:00.000Z"
}
Supported actions: revoke, suspend, topup (requires params.credits). Max 20 schedules per key. Supports aliases. A background timer checks for due schedules every 10 seconds. All create/execute/cancel operations are recorded in the audit trail (schedule.created / schedule.executed / schedule.cancelled).
Key Activity Timeline
Get a unified chronological feed of all events for a specific key, with audit events (creation, suspension, notes, etc.) and usage events (tool calls, denials) merged into one timeline:
# Get activity for a key (newest first, default limit 50)
curl "http://localhost:3402/keys/activity?key=pg_..." -H "X-Admin-Key: YOUR_ADMIN_KEY"
# With filters
curl "http://localhost:3402/keys/activity?key=pg_...&limit=20&since=2025-03-15T00:00:00Z" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_abc1...2345",
"name": "my-agent",
"total": 42,
"limit": 50,
"events": [
{ "timestamp": "2025-03-16T10:30:00Z", "source": "usage", "type": "tool.call", "message": "Called search (5 credits)", "metadata": { "tool": "search", "creditsCharged": 5, "allowed": true } },
{ "timestamp": "2025-03-16T09:00:00Z", "source": "audit", "type": "key.note_added", "message": "Note added to key", "metadata": { "key": "pg_abc1...2345" } },
{ "timestamp": "2025-03-15T14:00:00Z", "source": "audit", "type": "key.created", "message": "Key created: my-agent", "metadata": { "keyMasked": "pg_abc1...2345" } }
]
}
Max 200 events per request. Supports aliases. Works on suspended and revoked keys.
Credit Reservations
Pre-reserve credits before executing expensive operations. Prevents overcommit in concurrent scenarios:
# Reserve 500 credits (hold for 5 min default, or set ttlSeconds)
curl -X POST http://localhost:3402/keys/reserve \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "credits": 500, "ttlSeconds": 300, "memo": "Batch job #42"}'
# Commit: deducts the held credits
curl -X POST http://localhost:3402/keys/reserve/commit \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"reservationId": "rsv_1"}'
# Release: frees the hold without deducting
curl -X POST http://localhost:3402/keys/reserve/release \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"reservationId": "rsv_1"}'
# List active reservations (optional ?key= filter)
curl "http://localhost:3402/keys/reserve" -H "X-Admin-Key: YOUR_ADMIN_KEY"
Response (reserve):
{
"id": "rsv_1",
"key": "pg_abc1...2345",
"credits": 500,
"createdAt": "2025-03-16T10:30:00Z",
"expiresAt": "2025-03-16T10:35:00Z",
"memo": "Batch job #42",
"available": 500
}
TTL range: 10s to 1h (default 5 min). Max 50 reservations per key. Expired reservations are cleaned up automatically. Supports aliases. Reservations on revoked or suspended keys are rejected. All operations are recorded in the audit trail (credits.reserved / credits.committed / credits.released).
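The reserve, commit, and release flow can be modeled as a ledger where held credits reduce the available balance until the hold is committed, released, or expires. This in-memory sketch is illustrative only; the ReservationLedger class and its method signatures are hypothetical, and rejecting (rather than clamping) an out-of-range TTL is an assumption.

```typescript
interface Hold {
  id: string;
  credits: number;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical in-memory model of credit reservations for one key.
class ReservationLedger {
  private balance: number;
  private holds = new Map<string, Hold>();
  private nextId = 1;

  constructor(balance: number) {
    this.balance = balance;
  }

  // Balance minus all unexpired holds; expired holds are dropped on the way.
  available(nowMs: number): number {
    let held = 0;
    this.holds.forEach((hold, id) => {
      if (hold.expiresAt <= nowMs) this.holds.delete(id);
      else held += hold.credits;
    });
    return this.balance - held;
  }

  // Place a hold, or return null when the TTL or amount is not acceptable.
  reserve(credits: number, nowMs: number, ttlSeconds = 300): Hold | null {
    if (ttlSeconds < 10 || ttlSeconds > 3600) return null;
    if (credits <= 0 || credits > this.available(nowMs)) return null;
    const hold: Hold = { id: `rsv_${this.nextId++}`, credits, expiresAt: nowMs + ttlSeconds * 1000 };
    this.holds.set(hold.id, hold);
    return hold;
  }

  // Deduct the held credits and remove the hold.
  commit(id: string, nowMs: number): boolean {
    const hold = this.holds.get(id);
    if (!hold || hold.expiresAt <= nowMs) return false;
    this.holds.delete(id);
    this.balance -= hold.credits;
    return true;
  }

  // Free the hold without deducting.
  release(id: string): boolean {
    return this.holds.delete(id);
  }
}
```

Because a second reservation is checked against the balance minus existing holds, two concurrent jobs cannot both reserve credits that only one of them could pay for.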
Request Log
Queryable log of every MCP tool call with timing, credits, status, and deny reason:
# Get all requests (newest first)
curl "http://localhost:3402/requests" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by tool name
curl "http://localhost:3402/requests?tool=my_tool" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by status (allowed or denied)
curl "http://localhost:3402/requests?status=denied" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by key (partial match on masked key)
curl "http://localhost:3402/requests?key=pg_abc1" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by time + pagination
curl "http://localhost:3402/requests?since=2025-03-01T00:00:00Z&limit=50&offset=0" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Combine filters
curl "http://localhost:3402/requests?tool=my_tool&status=allowed&limit=10" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"total": 42,
"offset": 0,
"limit": 100,
"summary": {
"totalAllowed": 38,
"totalDenied": 4,
"totalCredits": 190,
"avgDurationMs": 45
},
"requests": [
{
"id": 42,
"timestamp": "2025-03-16T10:30:00Z",
"tool": "my_tool",
"key": "pg_abc1...2345",
"status": "allowed",
"credits": 5,
"durationMs": 32,
"requestId": "req_a1b2c3d4e5f6g7h8"
},
{
"id": 41,
"timestamp": "2025-03-16T10:29:55Z",
"tool": "my_tool",
"key": "pg_xyz9...8765",
"status": "denied",
"credits": 0,
"durationMs": 1,
"denyReason": "insufficient_credits",
"requestId": "req_i9j0k1l2m3n4o5p6"
}
]
}
The log is a 5000-entry ring buffer, so the oldest entries are overwritten once it fills. Summary statistics are computed on filtered results. Deny reasons: insufficient_credits, rate_limited, invalid_api_key, key_suspended, api_key_expired, tool_not_allowed, quota_exceeded.
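A fixed-capacity ring buffer keeps memory bounded by overwriting the oldest entry once it is full. The sketch below shows the general technique with a hypothetical RingLog class; it is not the server's implementation.

```typescript
// Fixed-capacity log that overwrites the oldest entry when full (hypothetical).
class RingLog<T> {
  private items: T[] = [];
  private start = 0; // index of the oldest entry once the buffer is full
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item);
    } else {
      this.items[this.start] = item;
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Entries ordered newest first, matching the order /requests returns.
  newestFirst(): T[] {
    const out: T[] = [];
    for (let i = this.items.length - 1; i >= 0; i--) {
      out.push(this.items[(this.start + i) % this.items.length]);
    }
    return out;
  }
}
```

With a capacity of 5000, statistics derived from the log only cover the most recent 5000 calls, so long-term reporting would need the export endpoint or an external store.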
Tool Stats
Per-tool analytics derived from the request log:
# Overview of all tools
curl "http://localhost:3402/tools/stats" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Detailed stats for a specific tool
curl "http://localhost:3402/tools/stats?tool=my_tool" -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by time range
curl "http://localhost:3402/tools/stats?since=2025-03-01T00:00:00Z" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response (overview):
{
"totalTools": 3,
"totalCalls": 150,
"tools": [
{
"tool": "my_tool",
"totalCalls": 100,
"allowed": 95,
"denied": 5,
"successRate": 95,
"totalCredits": 475,
"avgDurationMs": 42
}
]
}
Response (detailed ?tool=my_tool):
{
"tool": "my_tool",
"totalCalls": 100,
"allowed": 95,
"denied": 5,
"successRate": 95,
"totalCredits": 475,
"avgDurationMs": 42,
"p95DurationMs": 120,
"denyReasons": {
"insufficient_credits": 3,
"rate_limited": 2
},
"topConsumers": [
{ "key": "pg_abc1...2345", "calls": 50, "credits": 250 },
{ "key": "pg_xyz9...8765", "calls": 30, "credits": 150 }
]
}
Top consumers are limited to 10. In the overview, tools are sorted by call count. Data is sourced from the request log (5000-entry ring buffer).
Request Log Export
Export the request log as JSON or CSV for offline analysis:
# Export as JSON (default)
curl "http://localhost:3402/requests/export" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" -o paygate-requests.json
# Export as CSV
curl "http://localhost:3402/requests/export?format=csv" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" -o paygate-requests.csv
# Export with filters
curl "http://localhost:3402/requests/export?tool=my_tool&status=denied&since=2025-03-01T00:00:00Z&until=2025-03-31T23:59:59Z" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
| Parameter | Description |
|---|---|
| format | json (default) or csv |
| key | Filter by API key (partial match) |
| tool | Filter by tool name (exact match) |
| status | allowed or denied |
| since | ISO 8601 start timestamp |
| until | ISO 8601 end timestamp |
Both formats include Content-Disposition headers for automatic file download. Unlike /requests, the export endpoint returns all matching entries (no pagination limit). CSV includes proper quoting for values with commas.
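Quoting values that contain commas usually follows the RFC 4180 convention: wrap the field in double quotes and double any embedded quotes. The sketch below shows that convention; csvField and toCsv are hypothetical helpers, and the column names in the usage are placeholders rather than the export's actual header.

```typescript
// Quote a single CSV field when it contains a comma, quote, or line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join a header and rows into CSV text with CRLF line endings.
function toCsv(header: string[], rows: (string | number)[][]): string {
  const all: (string | number)[][] = [header, ...rows];
  return all.map((row) => row.map(csvField).join(",")).join("\r\n");
}
```

For example, a deny reason such as "insufficient_credits: need 5, have 2" contains a comma, so it is emitted as a single quoted field instead of being split into two columns.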
Tool Call Dry Run
Simulate a tool call to check whether it would be allowed, without deducting credits or incrementing rate limits:
curl -X POST http://localhost:3402/requests/dry-run \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "tool": "my_tool"}'
Response (allowed):
{
"allowed": true,
"tool": "my_tool",
"creditsRequired": 5,
"creditsAvailable": 100,
"creditsAfter": 95,
"rateLimit": { "used": 3, "limit": 60, "remaining": 57, "resetInMs": 45000 }
}
Response (denied):
{
"allowed": false,
"reason": "insufficient_credits: need 5, have 2",
"tool": "my_tool",
"creditsRequired": 5,
"creditsAvailable": 2
}
Checks key validity, suspension, tool ACL, rate limits, credit balance, and spending limits. Supports alias keys. Useful for agents that want to pre-flight check a call before committing.
Batch Dry Run
Simulate multiple tool calls at once to check if an entire batch would succeed:
curl -X POST http://localhost:3402/requests/dry-run/batch \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "tools": [{"name": "tool_a"}, {"name": "tool_b"}]}'
Response:
{
"allAllowed": true,
"totalCreditsRequired": 10,
"creditsAvailable": 100,
"creditsAfter": 90,
"results": [
{ "tool": "tool_a", "allowed": true, "creditsRequired": 5 },
{ "tool": "tool_b", "allowed": true, "creditsRequired": 5 }
]
}
Performs aggregate credit check (sum of all tool prices vs balance), per-tool ACL validation, spending limit, and rate limit checks. Returns per-tool results so you can see which specific tools would fail. Max 100 tools per batch. Supports alias keys.
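The aggregate check can be pictured as summing the price of every tool in the batch and comparing the total with the balance, while still reporting ACL results per tool. The sketch below covers only the credit and ACL parts (not spending or rate limits); dryRunBatch, its parameters, and the use of null for "no ACL restriction" are assumptions for illustration.

```typescript
interface ToolResult {
  tool: string;
  allowed: boolean;
  creditsRequired: number;
  reason?: string;
}

interface BatchDryRun {
  allAllowed: boolean;
  totalCreditsRequired: number;
  creditsAvailable: number;
  creditsAfter: number | null;
  results: ToolResult[];
}

// Hypothetical batch pre-flight: aggregate credit check plus per-tool ACL check.
function dryRunBatch(
  balance: number,
  prices: Record<string, number>,
  defaultPrice: number,
  allowedTools: string[] | null, // null means no ACL restriction (assumption)
  tools: string[],
): BatchDryRun {
  if (tools.length > 100) throw new Error("Max 100 tools per batch");
  const results = tools.map((tool): ToolResult => {
    const creditsRequired = tool in prices ? prices[tool] : defaultPrice;
    const aclOk = allowedTools === null || allowedTools.includes(tool);
    return aclOk
      ? { tool, allowed: true, creditsRequired }
      : { tool, allowed: false, creditsRequired, reason: "tool_not_allowed" };
  });
  const totalCreditsRequired = results.reduce((sum, r) => sum + r.creditsRequired, 0);
  const affordable = totalCreditsRequired <= balance;
  const allAllowed = affordable && results.every((r) => r.allowed);
  return {
    allAllowed,
    totalCreditsRequired,
    creditsAvailable: balance,
    creditsAfter: allAllowed ? balance - totalCreditsRequired : null,
    results,
  };
}
```

Checking the sum rather than each tool in isolation catches the case where every individual call is affordable but the batch as a whole is not.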
Tool Availability
Check per-key tool availability including pricing, affordability, and rate limit status:
curl "http://localhost:3402/tools/available?key=pg_..." \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_c815...09a6",
"creditsAvailable": 100,
"totalTools": 3,
"accessibleTools": 2,
"globalRateLimit": { "limit": 60, "used": 5, "remaining": 55, "resetInMs": 45000 },
"tools": [
{ "tool": "tool_a", "accessible": true, "creditsPerCall": 10, "canAfford": true },
{ "tool": "tool_b", "accessible": false, "denyReason": "denied_by_acl", "creditsPerCall": 5, "canAfford": true },
{ "tool": "tool_c", "accessible": true, "creditsPerCall": 1, "canAfford": true, "rateLimit": { "limit": 10, "used": 3, "remaining": 7 } }
]
}
Returns every discovered tool with: accessible (ACL check), denyReason (if blocked), creditsPerCall, canAfford (credits vs price), and per-tool rateLimit when configured. Includes global rate limit info. Supports alias keys. Works on suspended keys (informational). Read-only: does not deduct credits or increment rate counters.
Key Dashboard
Get a consolidated overview of any API key in a single request:
curl "http://localhost:3402/keys/dashboard?key=pg_..." \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"key": "pg_c815...09a6",
"name": "production-agent",
"status": "active",
"namespace": "prod",
"balance": { "credits": 850, "totalSpent": 150, "totalAllocated": 1000, "spendingLimit": 500 },
"health": { "score": 92, "status": "healthy" },
"velocity": { "creditsPerHour": 6.2, "creditsPerDay": 149, "estimatedDepletionDate": "2025-02-03T..." },
"rateLimits": { "global": { "limit": 60, "used": 12, "remaining": 48, "resetInMs": 35000 } },
"quotas": { "source": "global", "daily": { "callsUsed": 24, "callsLimit": 100 }, "monthly": { "callsUsed": 340, "callsLimit": 5000 } },
"usage": { "totalCalls": 340, "totalAllowed": 330, "totalDenied": 10, "totalCredits": 150 },
"recentActivity": [{ "timestamp": "...", "event": "gate.allowed", "tool": "search", "credits": 5 }]
}
Combines metadata (status/namespace/group/tags), balance (credits/spent/allocated/spendingLimit), health score (0-100 composite), spending velocity with depletion forecast, rate limit and quota status, usage summary, and last 10 audit events. Supports alias keys. Works on suspended/revoked/expired keys. Read-only.
Admin Notifications
Get actionable notifications about keys that need attention:
# Get all notifications
curl http://localhost:3402/admin/notifications \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by severity
curl "http://localhost:3402/admin/notifications?severity=critical" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"total": 4,
"critical": 2,
"warning": 1,
"info": 1,
"notifications": [
{
"severity": "critical",
"category": "zero_credits",
"message": "Key has zero credits remaining",
"key": "pg_c815...09a6",
"keyName": "production-agent"
},
{
"severity": "critical",
"category": "key_expiring_soon",
"message": "Key expires within 8 hours",
"key": "pg_a3f1...b2e4",
"keyName": "staging-agent",
"details": { "expiresAt": "2026-02-27T08:00:00.000Z", "hoursRemaining": 7.5 }
},
{
"severity": "warning",
"category": "credits_depleting",
"message": "Credits will deplete in ~18 hours at current rate",
"key": "pg_d7e2...f1a3",
"keyName": "batch-worker",
"details": { "credits": 90, "creditsPerHour": 5.1, "estimatedHoursRemaining": 17.6 }
},
{
"severity": "info",
"category": "key_suspended",
"message": "Key is suspended",
"key": "pg_b4c9...e8d5",
"keyName": "deprecated-agent"
}
]
}
Notification categories:
- key_expired (critical): Key has passed its expiry date
- key_expiring_soon (critical <24h, warning <7d): Key approaching expiry
- zero_credits (critical): Key has no credits remaining
- credits_depleting (critical <6h, warning <24h): Spending velocity predicts depletion
- key_suspended (info): Key is suspended
- high_error_rate (critical ≥50%, warning ≥25%): High denial rate (min 10 calls)
- rate_limit_pressure (warning ≥90%): Rate limit nearly exhausted
Notifications are sorted by severity (critical first). Revoked keys are excluded. A single key can appear in multiple notifications (e.g., zero credits AND expiring soon). Filter with ?severity=critical|warning|info. Read-only.
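The severity thresholds in the category list translate directly into small classification functions. The sketch below encodes only the thresholds stated above; the function names are hypothetical, and it assumes already-expired keys are handled separately under key_expired.

```typescript
type Severity = "critical" | "warning" | "info";

// key_expiring_soon: critical under 24 hours, warning under 7 days.
function expirySeverity(hoursRemaining: number): Severity | null {
  if (hoursRemaining < 24) return "critical";
  if (hoursRemaining < 24 * 7) return "warning";
  return null;
}

// credits_depleting: critical under 6 hours, warning under 24 hours.
function depletionSeverity(estimatedHoursRemaining: number): Severity | null {
  if (estimatedHoursRemaining < 6) return "critical";
  if (estimatedHoursRemaining < 24) return "warning";
  return null;
}

// high_error_rate: needs at least 10 calls; critical at 50%, warning at 25%.
function errorRateSeverity(denied: number, total: number): Severity | null {
  if (total < 10) return null;
  const rate = denied / total;
  if (rate >= 0.5) return "critical";
  if (rate >= 0.25) return "warning";
  return null;
}

const severityRank: Record<Severity, number> = { critical: 0, warning: 1, info: 2 };

// Order notifications with critical first, without mutating the input.
function sortBySeverity<T extends { severity: Severity }>(items: T[]): T[] {
  return [...items].sort((a, b) => severityRank[a.severity] - severityRank[b.severity]);
}
```

Applied to the sample response, a key with 7.5 hours until expiry is critical, and a key with an estimated 17.6 hours of credits left is a warning.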
System Dashboard
Get a system-wide overview in a single request:
curl http://localhost:3402/admin/dashboard \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"keys": { "total": 15, "active": 10, "suspended": 2, "revoked": 2, "expired": 1 },
"credits": { "totalAllocated": 15000, "totalSpent": 4200, "totalRemaining": 10800 },
"usage": {
"totalCalls": 840,
"totalAllowed": 790,
"totalDenied": 50,
"totalCreditsSpent": 4200,
"denyReasons": [{ "reason": "insufficient_credits", "count": 30 }, { "reason": "rate_limited", "count": 20 }]
},
"topConsumers": [
{ "name": "production-agent", "calls": 320, "credits": 1600, "denied": 5 },
{ "name": "batch-worker", "calls": 210, "credits": 1050, "denied": 0 }
],
"topTools": [
{ "tool": "search", "calls": 450, "credits": 2250, "denied": 20 },
{ "tool": "generate", "calls": 300, "credits": 1500, "denied": 10 }
],
"notifications": { "critical": 2, "warning": 3, "info": 2 },
"uptime": { "startedAt": "2026-02-27T00:00:00.000Z", "uptimeSeconds": 86400, "uptimeHours": 24 }
}
Combines key counts by state, credit allocation and spending totals, usage breakdown with deny reasons, top 10 consumers ranked by credits spent, top 10 tools ranked by call count, notification severity counts, and server uptime. Read-only.
Key Lifecycle Report
Track key creation, revocation, suspension trends and identify at-risk keys:
# Full lifecycle report
curl http://localhost:3402/admin/lifecycle \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by date range
curl "http://localhost:3402/admin/lifecycle?since=2026-02-01&until=2026-02-28" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Response:
{
"events": { "created": 25, "revoked": 3, "suspended": 5, "resumed": 4, "rotated": 2, "cloned": 1 },
"trends": [
{ "date": "2026-02-25", "created": 5, "revoked": 1, "suspended": 2, "resumed": 1 },
{ "date": "2026-02-26", "created": 8, "revoked": 0, "suspended": 1, "resumed": 2 },
{ "date": "2026-02-27", "created": 12, "revoked": 2, "suspended": 2, "resumed": 1 }
],
"averageLifetimeHours": 168.5,
"atRisk": [
{ "key": "pg_c815...09a6", "name": "staging-agent", "risk": "expiring_soon", "details": { "expiresAt": "2026-03-01T...", "daysRemaining": 2.5 } },
{ "key": "pg_a3f1...b2e4", "name": "batch-worker", "risk": "zero_credits", "details": { "credits": 0 } }
]
}
Shows aggregated lifecycle event counts, daily trend buckets (sorted chronologically), average key lifetime in hours (for revoked keys), and at-risk keys with their risk category (expired, expiring_soon, zero_credits). Supports ?since= and ?until= date filters. Excludes suspended and revoked keys from at-risk list. Read-only.
Cost Analysis
Get a cost-centric breakdown of credit usage across tools, namespaces, and time:
# Full cost analysis
curl http://localhost:3402/admin/costs \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by namespace
curl "http://localhost:3402/admin/costs?namespace=prod" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Filter by time range
curl "http://localhost:3402/admin/costs?since=2026-02-01" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalCredits": 4250,
"totalCalls": 312,
"totalAllowed": 298,
"totalDenied": 14,
"avgCostPerCall": 13.62
},
"perTool": [
{ "tool": "generate_report", "calls": 85, "credits": 1700, "avgCost": 20 },
{ "tool": "query_data", "calls": 142, "credits": 1420, "avgCost": 10 }
],
"perNamespace": [
{ "namespace": "prod", "calls": 210, "credits": 3150 },
{ "namespace": "staging", "calls": 102, "credits": 1100 }
],
"hourlyTrends": [
{ "hour": "2026-02-26T14:00:00.000Z", "calls": 23, "credits": 345, "denied": 1 },
{ "hour": "2026-02-26T15:00:00.000Z", "calls": 31, "credits": 465, "denied": 0 }
],
"topSpenders": [
{ "key": "pg_a1b2...c3d4", "name": "ml-pipeline", "credits": 1800, "calls": 90 },
{ "key": "pg_e5f6...g7h8", "name": "batch-worker", "credits": 1200, "calls": 120 }
]
}
Returns per-tool cost breakdown (with average cost per call), per-namespace spending, hourly trend buckets (last 24 hours), and top 10 spenders ranked by credits consumed. Supports ?since= and ?namespace= query filters. Keys without an explicit namespace appear under default. Read-only.
Rate Limit Analysis
Analyze rate limit utilization across keys and tools:
curl http://localhost:3402/admin/rate-limits \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"config": {
"globalLimitPerMin": 60,
"windowMs": 60000
},
"summary": {
"totalCalls": 450,
"totalRateLimited": 12,
"rateLimitRate": 0.0267
},
"perKey": [
{ "name": "ml-pipeline", "calls": 200, "rateLimited": 8, "currentWindowUsed": 45, "currentWindowRemaining": 15 },
{ "name": "batch-worker", "calls": 150, "rateLimited": 4, "currentWindowUsed": 12, "currentWindowRemaining": 48 }
],
"perTool": [
{ "tool": "generate_report", "calls": 180, "rateLimited": 10 },
{ "tool": "query_data", "calls": 270, "rateLimited": 2 }
],
"hourlyTrends": [
{ "hour": "2026-02-26T14", "calls": 52, "rateLimited": 3 },
{ "hour": "2026-02-26T15", "calls": 48, "rateLimited": 1 }
],
"mostThrottled": [
{ "name": "ml-pipeline", "rateLimited": 8, "calls": 200, "throttleRate": 0.04 }
]
}
Returns rate limit configuration, denial summary with throttle rate, per-key breakdown with current sliding window utilization, per-tool denial counts, hourly denial trends (last 24 hours), and top 10 most throttled keys ranked by denial count. Handles unlimited rate limits (globalLimitPerMin: 0). Read-only.
Quota Analysis
curl http://localhost:3402/admin/quotas -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"config": { "globalQuota": { "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 500, "monthlyCreditLimit": 5000 } },
"summary": { "totalKeys": 5, "keysWithQuotas": 4, "totalQuotaDenials": 3, "quotaDenialRate": 0.02 },
"perKey": [
{ "name": "heavy-user", "dailyCalls": 95, "monthlyCalls": 450, "dailyCredits": 475, "monthlyCredits": 2250, "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 500, "monthlyCreditLimit": 5000, "dailyCallUtilization": 0.95, "monthlyCallUtilization": 0.45, "source": "global" }
],
"perTool": [{ "tool": "summarize", "calls": 120, "quotaDenied": 2 }],
"hourlyTrends": [{ "hour": "2025-01-15T14", "calls": 15, "quotaDenied": 1 }],
"mostConstrained": [{ "name": "heavy-user", "dailyCalls": 95, "dailyCallLimit": 100, "dailyCallUtilization": 0.95, "monthlyCalls": 450, "monthlyCallLimit": 1000, "monthlyCallUtilization": 0.45 }]
}
Returns quota configuration (global or null), key counts with/without quotas, denial summary with denial rate, per-key daily/monthly call and credit usage vs limits with utilization ratios (0–1), quota source (per-key/global/none), per-tool quota denial counts, hourly denial trends (last 24 hours), and top 10 most constrained keys ranked by daily call utilization. Read-only.
Denial Analysis
curl http://localhost:3402/admin/denials -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalCalls": 150, "totalDenials": 12, "denialRate": 0.08 },
"byReason": { "insufficient_credits": 5, "rate_limited": 4, "quota_exceeded": 2, "key_suspended": 1 },
"perKey": [
{ "name": "heavy-user", "calls": 50, "denials": 8, "denialRate": 0.16, "topReason": "rate_limited" }
],
"perTool": [{ "tool": "summarize", "calls": 80, "denials": 6, "denialRate": 0.075, "topReason": "insufficient_credits" }],
"hourlyTrends": [{ "hour": "2025-01-15T14", "calls": 20, "denials": 3 }],
"mostDenied": [{ "name": "heavy-user", "denials": 8, "calls": 50, "denialRate": 0.16, "topReason": "rate_limited" }]
}
Returns denial summary with denial rate, breakdown by canonical reason type (insufficient_credits, rate_limited, tool_rate_limited, quota_exceeded, key_suspended, api_key_expired, invalid_api_key, missing_api_key, tool_not_allowed, ip_not_allowed, spending_limit_exceeded, etc.), per-key denial counts with top reason, per-tool denial counts, hourly denial trends (last 24 hours), and top 10 most denied keys. Read-only.
Traffic Analysis
curl http://localhost:3402/admin/traffic -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalCalls": 500, "totalAllowed": 470, "totalDenied": 30, "successRate": 0.94, "uniqueKeys": 8, "uniqueTools": 3, "peakHour": "2025-01-15T14", "peakHourCalls": 85 },
"toolPopularity": [{ "tool": "summarize", "calls": 250, "successRate": 0.96, "credits": 2500 }],
"hourlyVolume": [{ "hour": "2025-01-15T14", "calls": 85, "allowed": 80, "denied": 5, "credits": 400 }],
"topConsumers": [{ "name": "heavy-user", "calls": 150, "successRate": 0.92, "credits": 1380 }],
"byNamespace": [{ "namespace": "production", "calls": 400, "allowed": 380, "credits": 3800 }]
}
Returns traffic summary with success rate and peak hour, tool popularity ranked by call count with success rates and credit totals, hourly volume (last 24 hours) with allowed/denied/credit breakdowns, top 10 consumers by call count, and namespace breakdown with per-namespace stats. Read-only.
Security Audit
curl http://localhost:3402/admin/security -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"score": 72,
"summary": { "totalKeys": 5, "totalFindings": 12 },
"findings": [
{ "type": "no_ip_allowlist", "severity": "warning", "keys": ["prod-key", "dev-key"], "description": "Keys without IP allowlists can be used from any IP address" },
{ "type": "no_acl_restriction", "severity": "info", "keys": ["dev-key"], "description": "Keys without ACL restrictions can access all tools" },
{ "type": "high_credit_balance", "severity": "warning", "keys": ["whale-key"], "description": "Keys with 10000+ credits are high-value targets if compromised" }
]
}
Returns a composite security score (0-100) with per-finding breakdown. Scans all active keys for: missing IP allowlists (warning), missing quotas (info), unrestricted ACLs (info), no spending limits (info), no expiry dates (info), and high credit balances (warning). Well-configured keys with IP restrictions, tool ACLs, quotas, spending limits, and expiry dates will not appear in any findings. Read-only: does not modify system state.
Revenue Analysis
curl http://localhost:3000/admin/revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalRevenue": 5000, "totalCalls": 250, "averageRevenuePerCall": 20 },
"byTool": [{ "tool": "summarize", "revenue": 3000, "calls": 150, "averagePerCall": 20 }],
"byKey": [{ "name": "heavy-user", "revenue": 2000, "calls": 80 }],
"hourlyRevenue": [{ "hour": "2025-01-15T14", "revenue": 500, "calls": 25 }],
"creditFlow": { "totalAllocated": 50000, "totalSpent": 5000, "totalRemaining": 45000 }
}
Returns revenue summary with total credits earned, per-tool revenue ranked by earnings with average per-call, top 10 per-key spending, hourly revenue trends (last 24 hours), and credit flow showing total allocated vs spent vs remaining across all active keys. Only counts successful (allowed) calls. Read-only.
Key Portfolio Health
curl http://localhost:3000/admin/key-portfolio -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalKeys": 10, "activeKeys": 7, "inactiveKeys": 2, "suspendedKeys": 1, "averageCreditUtilization": 0.35 },
"staleKeys": [{ "name": "unused-key", "createdAt": "2025-01-01T00:00:00Z", "credits": 500, "ageDays": 30 }],
"expiringSoon": [{ "name": "temp-key", "expiresAt": "2025-01-20T00:00:00Z", "hoursRemaining": 48, "credits": 100 }],
"ageDistribution": { "averageAgeDays": 15, "oldestAgeDays": 60, "newestAgeDays": 0 },
"byNamespace": [{ "namespace": "production", "total": 5, "active": 4, "suspended": 1 }]
}
Returns portfolio-wide key health: active/inactive/suspended counts, average credit utilization, stale keys (created but never used), keys expiring within 7 days sorted by urgency, age distribution statistics, and namespace breakdown. Read-only.
Anomaly Detection
curl http://localhost:3000/admin/anomalies -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalAnomalies": 3, "byType": { "high_denial_rate": 1, "rapid_credit_depletion": 1, "low_credits": 1 } },
"anomalies": [
{ "type": "high_denial_rate", "severity": "warning", "keyName": "test-key", "description": "Key \"test-key\" has 80% denial rate (8/10 calls denied)" },
{ "type": "rapid_credit_depletion", "severity": "warning", "keyName": "fast-spender", "description": "Key \"fast-spender\" has used 95% of allocated credits (950/1000)" },
{ "type": "low_credits", "severity": "info", "keyName": "nearly-empty", "description": "Key \"nearly-empty\" has only 5 credits remaining (5% of allocated)" }
],
"analyzedAt": "2025-01-15T14:30:00Z"
}
Scans all active keys for anomalous patterns: keys with >50% denial rates (3+ calls minimum), rapid credit depletion (>=75% spent), and low remaining credits (<=10 credits or <=10% remaining). Each anomaly includes type, severity, affected key name, and human-readable description. Read-only.
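The thresholds above can be reproduced client-side, for example to raise the same alerts from your own key data. The sketch below is illustrative only: `KeyStats` and `detect_anomalies` are hypothetical names, not part of Paygate, and "allocated" is assumed to mean remaining plus spent credits. Whether the server reports more than one anomaly per key is not stated here; this sketch returns every match.

```python
from dataclasses import dataclass


@dataclass
class KeyStats:
    # Hypothetical per-key counters; field names are illustrative, not Paygate's.
    name: str
    calls: int
    denied: int
    credits_remaining: int
    credits_spent: int


def detect_anomalies(key: KeyStats) -> list:
    """Apply the documented thresholds to one key and return matching anomaly types."""
    found = []
    allocated = key.credits_remaining + key.credits_spent  # assumption
    # High denial rate: more than 50% of calls denied, with at least 3 calls.
    if key.calls >= 3 and key.denied / key.calls > 0.5:
        found.append("high_denial_rate")
    # Rapid depletion: 75% or more of allocated credits already spent.
    if allocated > 0 and key.credits_spent / allocated >= 0.75:
        found.append("rapid_credit_depletion")
    # Low credits: 10 or fewer credits left, or 10% or less of the allocation.
    if key.credits_remaining <= 10 or (
        allocated > 0 and key.credits_remaining / allocated <= 0.10
    ):
        found.append("low_credits")
    return found
```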
Usage Forecasting
curl http://localhost:3000/admin/forecast -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalActiveKeys": 3, "keysAtRisk": 1 },
"keyForecasts": [
{ "keyName": "heavy-user", "creditsRemaining": 50, "totalSpent": 950, "callCount": 95, "avgCreditsPerCall": 10, "estimatedCallsRemaining": 5, "atRisk": true },
{ "keyName": "light-user", "creditsRemaining": 900, "totalSpent": 100, "callCount": 20, "avgCreditsPerCall": 5, "estimatedCallsRemaining": 180, "atRisk": false }
],
"systemForecast": {
"totalCreditsRemaining": 950,
"totalCreditsSpent": 1050,
"totalCalls": 115,
"byTool": [
{ "tool": "expensive_tool", "calls": 50, "totalCredits": 500, "avgCreditsPerCall": 10 },
{ "tool": "cheap_tool", "calls": 65, "totalCredits": 325, "avgCreditsPerCall": 5 }
]
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Forecasts credit consumption for all active keys: per-key depletion estimates with calls remaining, at-risk identification (<=5 estimated calls), system-wide credit aggregates, and per-tool cost breakdown sorted by revenue. Keys with no usage history show estimatedCallsRemaining: null. Read-only.
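The per-key numbers in the example response are consistent with a simple division: average credits per call is total spent over call count, and estimated calls remaining is the remaining balance over that average. The sketch below reproduces that arithmetic; `forecast_key` is a hypothetical helper, and rounding down and the handling of zero-cost calls are assumptions rather than documented behavior.

```python
from typing import Optional


def forecast_key(credits_remaining: int, total_spent: int, call_count: int) -> dict:
    """Estimate how many more calls a key can afford at its historical average cost."""
    avg: Optional[float] = total_spent / call_count if call_count > 0 else None
    estimated: Optional[int] = None
    if avg:  # skips both "no usage history" (None) and free calls (0.0)
        estimated = int(credits_remaining // avg)  # rounding down is an assumption
    return {
        "avgCreditsPerCall": avg,
        "estimatedCallsRemaining": estimated,
        # At risk when 5 or fewer estimated calls remain.
        "atRisk": estimated is not None and estimated <= 5,
    }


# Values taken from the "heavy-user" entry in the example response.
print(forecast_key(credits_remaining=50, total_spent=950, call_count=95))
```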
Compliance Report
curl http://localhost:3000/admin/compliance -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"keyGovernance": { "totalKeys": 5, "keysWithExpiry": 3, "keysWithoutExpiry": 2 },
"accessControl": {
"keysWithAcl": 3, "keysWithoutAcl": 2,
"keysWithIpRestriction": 2, "keysWithoutIpRestriction": 3,
"keysWithSpendingLimit": 4, "keysWithoutSpendingLimit": 1
},
"auditTrail": { "totalEvents": 150, "uniqueTools": 5, "uniqueKeys": 4 },
"overallScore": 72,
"recommendations": [
"Set expiry dates on 2 key(s) without time-limited access",
"Add tool ACL restrictions to 2 key(s) with unrestricted tool access"
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Compliance-ready report scoring key governance (expiry 25%), access control (ACL 25%, IP 20%, spending limits 15%), and audit trail (15%). Actionable recommendations for each gap. Read-only.
SLA Monitoring
curl http://localhost:3000/admin/sla -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalCalls": 150, "allowedCalls": 140, "deniedCalls": 10,
"successRate": 93.33,
"denialReasons": { "insufficient_credits": 6, "rate_limited": 3, "acl_denied": 1 }
},
"byTool": [
{ "tool": "tool_a", "totalCalls": 100, "allowedCalls": 95, "deniedCalls": 5, "successRate": 95 },
{ "tool": "tool_b", "totalCalls": 50, "allowedCalls": 45, "deniedCalls": 5, "successRate": 90 }
],
"uptime": { "startedAt": "2025-01-15T10:00:00Z", "uptimeSeconds": 16200 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Service level metrics: overall success rate, denial breakdown by canonical reason (insufficient_credits, rate_limited, quota_exceeded, acl_denied, spending_limit, key_suspended, key_expired), per-tool availability sorted by call volume, and server uptime tracking. Read-only.
Capacity Planning
curl http://localhost:3000/admin/capacity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalCreditsAllocated": 10000, "totalCreditsSpent": 3500, "totalCreditsRemaining": 6500,
"utilizationPct": 35,
"burnRate": { "creditsPerCall": 10, "totalCalls": 350 }
},
"topConsumers": [
{ "keyName": "heavy-user", "creditsSpent": 2000, "creditsRemaining": 500, "callCount": 200 }
],
"byNamespace": {
"prod": { "allocated": 8000, "spent": 3000, "remaining": 5000, "keys": 3, "utilizationPct": 37 }
},
"recommendations": ["1 key(s) have less than 10% credits remaining"],
"generatedAt": "2025-01-15T14:30:00Z"
}
System capacity analysis: overall credit utilization, burn rate (credits/call), top 10 consumers ranked by spend, per-namespace breakdown, and scaling recommendations for high utilization (>=75%) or depleted keys. Read-only.
Key Dependency Map
curl http://localhost:3000/admin/dependencies -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalTools": 5, "usedTools": 3, "unusedTools": 2 },
"toolUsage": [
{ "tool": "search", "totalCalls": 150, "uniqueKeys": 8 },
{ "tool": "translate", "totalCalls": 45, "uniqueKeys": 3 }
],
"keyToolMap": [
{ "keyName": "power-user", "tools": ["search", "translate", "summarize"], "toolCount": 3 },
{ "keyName": "basic-user", "tools": ["search"], "toolCount": 1 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Tool-to-key relationship map: shows which tools each key uses, tool popularity ranked by total calls, unique key counts per tool, and identifies orphaned tools (available but unused). Useful for understanding tool adoption and pruning unused capabilities. Read-only.
Tool Latency Analysis
curl http://localhost:3000/admin/latency -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalCalls": 200, "avgDurationMs": 45, "minDurationMs": 8, "maxDurationMs": 312, "p95DurationMs": 120 },
"byTool": [
{ "tool": "translate", "totalCalls": 80, "avgDurationMs": 65, "minDurationMs": 20, "maxDurationMs": 312, "p95DurationMs": 150 },
{ "tool": "search", "totalCalls": 120, "avgDurationMs": 32, "minDurationMs": 8, "maxDurationMs": 95, "p95DurationMs": 78 }
],
"slowestTools": [
{ "tool": "translate", "avgDurationMs": 65, "totalCalls": 80 }
],
"byKey": [
{ "keyName": "heavy-user", "totalCalls": 150, "avgDurationMs": 48, "minDurationMs": 8, "maxDurationMs": 312 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-tool response time metrics: average, p95, min, and max durations for each tool sorted by slowest average first, top 10 slowest tools ranking, per-key latency breakdown, and global summary. Only counts successful (allowed) calls. Read-only.
Error Rate Trends
curl http://localhost:3000/admin/error-trends -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalCalls": 500, "totalDenials": 45, "overallErrorRate": 9, "trend": "improving" },
"byTool": [
{ "tool": "translate", "totalCalls": 200, "denials": 30, "errorRate": 15 },
{ "tool": "search", "totalCalls": 300, "denials": 15, "errorRate": 5 }
],
"denialReasons": [
{ "reason": "insufficient_credits", "count": 30 },
{ "reason": "rate_limited", "count": 15 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Denial rate trends: overall error rate, per-tool error rates sorted by worst-performing, denial reason breakdown, and trend direction (improving/degrading/stable based on first-half vs second-half comparison). Read-only.
Credit Flow Analysis
curl http://localhost:3000/admin/credit-flow -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalAllocated": 10000, "totalSpent": 3500, "totalRemaining": 6500, "utilizationPct": 35 },
"topSpenders": [
{ "keyName": "heavy-user", "creditsSpent": 2000, "creditsRemaining": 500, "callCount": 200 }
],
"byTool": [
{ "tool": "search", "creditsSpent": 2000, "callCount": 400 },
{ "tool": "translate", "creditsSpent": 1500, "callCount": 150 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Credit inflow/outflow analysis: total credits allocated (remaining + spent) vs spent vs remaining, utilization percentage, top 10 spenders ranked by credits consumed, and per-tool spend breakdown sorted by revenue. Read-only.
Key Age Analysis
curl http://localhost:3000/admin/key-age -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalKeys": 15, "avgAgeHours": 168.5,
"oldestKey": { "keyName": "legacy", "ageHours": 720, "createdAt": "2025-01-01T00:00:00Z" },
"newestKey": { "keyName": "fresh", "ageHours": 0.5, "createdAt": "2025-01-31T12:00:00Z" }
},
"distribution": { "last24h": 3, "last7d": 5, "last30d": 4, "older": 3 },
"recentlyCreated": [
{ "keyName": "fresh", "ageHours": 0.5, "createdAt": "2025-01-31T12:00:00Z" }
],
"generatedAt": "2025-01-31T12:30:00Z"
}
Key age distribution: average age across all active keys, oldest/newest key identification, age buckets (last 24h / 7d / 30d / older), and recently created list (newest first, top 10). Read-only.
Namespace Usage Summary
curl http://localhost:3000/admin/namespace-usage -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalNamespaces": 3 },
"namespaces": [
{ "namespace": "prod", "keyCount": 5, "totalAllocated": 5000, "totalSpent": 2000, "totalRemaining": 3000, "totalCalls": 400, "utilizationPct": 40 },
{ "namespace": "staging", "keyCount": 2, "totalAllocated": 1000, "totalSpent": 200, "totalRemaining": 800, "totalCalls": 40, "utilizationPct": 20 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-namespace usage metrics: key counts, credit allocation/spending/remaining, call counts, and utilization percentages. Sorted by spending (highest first). Keys without a namespace appear under "default". Read-only.
Audit Summary
curl http://localhost:3000/admin/audit-summary -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalEvents": 142, "eventsLastHour": 18, "eventsLast24h": 95, "oldestEvent": "2025-01-14T08:00:00Z", "newestEvent": "2025-01-15T14:30:00Z" },
"eventsByType": [
{ "type": "gate.allow", "count": 80 },
{ "type": "gate.deny", "count": 25 },
{ "type": "key.created", "count": 12 }
],
"topActors": [
{ "actor": "pg_abc1...", "count": 60 },
{ "actor": "admin", "count": 30 }
],
"recentEvents": [
{ "id": 142, "timestamp": "2025-01-15T14:30:00Z", "type": "gate.allow", "actor": "pg_abc1...", "message": "Allowed: tool_a" }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Audit event analytics: total events with hourly/daily counts, event type breakdown sorted by frequency, top 10 most active actors, and the 20 most recent events (newest first). Read-only.
Group Performance
curl http://localhost:3000/admin/group-performance -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": { "totalGroups": 2, "ungroupedKeys": 3 },
"groups": [
{
"groupId": "grp_abc123", "groupName": "prod-team", "description": "Production",
"keyCount": 5, "totalAllocated": 5000, "totalSpent": 2000, "totalRemaining": 3000,
"totalCalls": 400, "utilizationPct": 40,
"policy": { "allowedTools": ["tool_a"], "deniedTools": [], "rateLimitPerMin": 60 }
}
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-group analytics: key counts, credit allocation/spending/remaining, call volume, and utilization percentages. Includes group policy summary (allowed/denied tools, rate limits). Sorted by spending (highest first). Also reports ungrouped key count. Read-only.
Request Volume Trends
curl http://localhost:3000/admin/request-trends -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalRequests": 150, "totalAllowed": 130, "totalDenied": 20,
"totalCredits": 650, "avgDurationMs": 45,
"peakHour": { "hour": "2025-01-15T14:00:00Z", "total": 42 }
},
"hourly": [
{ "hour": "2025-01-15T12:00:00Z", "total": 35, "allowed": 30, "denied": 5, "credits": 150, "avgDurationMs": 40 },
{ "hour": "2025-01-15T13:00:00Z", "total": 42, "allowed": 38, "denied": 4, "credits": 190, "avgDurationMs": 50 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Hourly request volume time-series: total/allowed/denied counts, credit spend, and average duration per hour. Includes summary with peak hour identification. Built from request log data. Sorted chronologically. Read-only.
Key Status Overview
curl http://localhost:3000/admin/key-status -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"counts": { "total": 20, "active": 15, "suspended": 2, "revoked": 2, "expired": 1 },
"needsAttention": [
{ "keyName": "low-balance", "issue": "low_credits", "detail": "5 credits remaining" },
{ "keyName": "trial-key", "issue": "expiring_soon", "detail": "Expires in 48 hours" }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Key status dashboard: active/suspended/revoked/expired counts with keys needing attention. Flags active keys with low credits (<=10) and near expiry (within 7 days). Read-only.
Webhook Health
curl http://localhost:3000/admin/webhook-health -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"configured": true,
"status": "healthy",
"delivery": {
"totalDelivered": 142,
"totalFailed": 3,
"totalRetries": 5,
"pendingRetries": 0,
"deadLetterCount": 1,
"bufferedEvents": 0,
"paused": false,
"pausedAt": null,
"successRate": 97.93
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Webhook delivery health overview. Status is healthy, retrying, degraded (dead letters exist), paused, or not_configured. Includes success rate, pending retries, dead letter count, and buffered events. Read-only.
Consumer Insights
curl http://localhost:3000/admin/consumer-insights -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalConsumers": 15,
"activeConsumers": 12,
"totalCreditsSpent": 4850,
"totalCalls": 970
},
"topSpenders": [
{ "name": "heavy-user", "totalSpent": 1200, "totalCalls": 240, "uniqueTools": 5 }
],
"mostActive": [
{ "name": "heavy-user", "totalCalls": 240, "totalSpent": 1200, "uniqueTools": 5 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-key behavioral analytics. Top 10 spenders ranked by credits consumed, top 10 most active by call count. Each entry includes tool diversity (unique tools used). Summary shows total/active consumers and aggregate spend. Read-only.
System Health Score
curl http://localhost:3000/admin/system-health -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"score": 85,
"level": "healthy",
"components": {
"keyHealth": { "score": 90, "weight": 0.4, "detail": "2 suspended" },
"errorRate": { "score": 80, "weight": 0.35, "detail": "10% denial rate (5/50)" },
"creditUtilization": { "score": 85, "weight": 0.25, "detail": "45% utilized (4500/10000 credits)" }
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Composite system health score 0-100 with weighted component breakdowns. Key health (40%): penalizes suspended/revoked/expired/low-credit keys. Error rate (35%): penalizes high denial rates. Credit utilization (25%): healthy at 10-80%, degrades at >80%. Levels: healthy (>=80), good (>=60), warning (>=40), critical (<40). Read-only.
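The composite in the example response can be checked by hand as a weighted sum of the component scores. The sketch below does that and maps the result to the documented levels; `composite_score` and `health_level` are hypothetical helper names, and rounding to the nearest integer is an assumption (truncation gives the same result for this example).

```python
def composite_score(components: dict) -> int:
    """Weighted sum of component scores, rounded to the nearest integer (assumed)."""
    return round(sum(c["score"] * c["weight"] for c in components.values()))


def health_level(score: int) -> str:
    """Map a 0-100 score to the documented level names."""
    if score >= 80:
        return "healthy"
    if score >= 60:
        return "good"
    if score >= 40:
        return "warning"
    return "critical"


# Component scores and weights copied from the example response.
components = {
    "keyHealth": {"score": 90, "weight": 0.4},
    "errorRate": {"score": 80, "weight": 0.35},
    "creditUtilization": {"score": 85, "weight": 0.25},
}
score = composite_score(components)  # 36 + 28 + 21.25 = 85.25, rounds to 85
print(score, health_level(score))
```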
Tool Adoption
curl http://localhost:3000/admin/tool-adoption -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tools": [
{
"tool": "search",
"uniqueConsumers": 8,
"adoptionRate": 80,
"totalCalls": 245,
"firstSeen": "2025-01-10T08:00:00Z",
"lastSeen": "2025-01-15T14:30:00Z"
},
{
"tool": "translate",
"uniqueConsumers": 3,
"adoptionRate": 30,
"totalCalls": 42,
"firstSeen": "2025-01-12T10:00:00Z",
"lastSeen": "2025-01-15T12:00:00Z"
}
],
"summary": {
"totalTools": 2,
"usedTools": 2,
"unusedTools": 0
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-tool adoption metrics showing which tools are being used and by how many consumers. uniqueConsumers counts distinct API keys that called the tool. adoptionRate is the percentage of active keys that have used the tool. Sorted by adoption rate descending, then by total calls. Read-only.
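As a rough illustration of the adoption metric, the sketch below derives `uniqueConsumers`, `adoptionRate`, and the documented sort order from a list of (key, tool) call records. The input shape and the `adoption_rates` helper are assumptions made for the example; Paygate computes these values server-side.

```python
from collections import defaultdict


def adoption_rates(calls, active_key_count: int) -> list:
    """calls: iterable of (key_name, tool) pairs (hypothetical input shape)."""
    consumers = defaultdict(set)
    totals = defaultdict(int)
    for key_name, tool in calls:
        consumers[tool].add(key_name)
        totals[tool] += 1
    rows = [
        {
            "tool": tool,
            "uniqueConsumers": len(keys),
            # Percentage of active keys that have called the tool at least once.
            "adoptionRate": round(100 * len(keys) / active_key_count) if active_key_count else 0,
            "totalCalls": totals[tool],
        }
        for tool, keys in consumers.items()
    ]
    # Adoption rate descending, then total calls descending.
    rows.sort(key=lambda r: (-r["adoptionRate"], -r["totalCalls"]))
    return rows
```

With 10 active keys, a tool called by 8 distinct keys gets an `adoptionRate` of 80, which matches the `search` entry in the example response.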
Credit Efficiency
curl http://localhost:3000/admin/credit-efficiency -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalAllocated": 5000,
"totalSpent": 1200,
"totalRemaining": 3800,
"burnEfficiency": 24,
"wasteRatio": 76,
"activeKeys": 10
},
"overProvisioned": [
{ "name": "idle-whale", "credits": 950, "totalAllocated": 1000, "totalSpent": 50, "remainingPercent": 95 }
],
"underProvisioned": [
{ "name": "heavy-user", "credits": 3, "totalAllocated": 500, "totalSpent": 497, "remainingPercent": 1 }
],
"generatedAt": "2025-01-15T14:30:00Z"
}
Credit allocation efficiency analysis. burnEfficiency is the percentage of allocated credits actually spent. wasteRatio is the percentage remaining unused. Over-provisioned keys have >90% remaining credits. Under-provisioned keys have <=10 credits or <=10% remaining with active usage. Top 10 in each category, sorted by urgency. Read-only.
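The two summary ratios and the provisioning rules can be written out as a short sketch. The helper names are hypothetical, "allocated" is assumed to be remaining plus spent credits, and "active usage" is interpreted here as at least one call; none of this is confirmed as Paygate's internal logic.

```python
def efficiency_summary(total_allocated: int, total_spent: int) -> dict:
    """burnEfficiency: percent of allocation spent; wasteRatio: percent still unused."""
    if total_allocated <= 0:
        return {"burnEfficiency": 0, "wasteRatio": 0}
    remaining = total_allocated - total_spent
    return {
        "burnEfficiency": round(100 * total_spent / total_allocated),
        "wasteRatio": round(100 * remaining / total_allocated),
    }


def provisioning(credits: int, total_spent: int, total_calls: int) -> str:
    """Classify one key as "over", "under", or "ok" using the documented thresholds."""
    allocated = credits + total_spent  # assumption: remaining + spent
    remaining_pct = 100 * credits / allocated if allocated > 0 else 0
    if remaining_pct > 90:
        return "over"
    # "Active usage" is taken to mean at least one call (assumption).
    if total_calls > 0 and (credits <= 10 or remaining_pct <= 10):
        return "under"
    return "ok"


# Totals from the example response: 1200 of 5000 credits spent.
print(efficiency_summary(5000, 1200))
```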
Access Heatmap
curl http://localhost:3000/admin/access-heatmap -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"hourly": [
{
"hour": "2025-01-15T14:00:00.000Z",
"total": 45,
"uniqueConsumers": 8,
"tools": { "search": 30, "translate": 15 }
}
],
"summary": {
"totalRequests": 45,
"totalHours": 1,
"peakHour": { "hour": "2025-01-15T14:00:00.000Z", "total": 45 }
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Hourly access patterns for capacity planning. Each bucket shows total requests, unique consumers, and per-tool breakdown. Peak hour identification helps spot usage spikes. Only counts allowed requests. Sorted chronologically. Read-only.
Key Churn Analysis
curl http://localhost:3000/admin/key-churn -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"summary": {
"totalKeys": 50,
"activeKeys": 40,
"revokedKeys": 5,
"suspendedKeys": 3,
"neverUsedKeys": 8,
"churnRate": 10,
"retentionRate": 90,
"avgCreditsPerKey": 250
},
"generatedAt": "2025-01-15T14:30:00Z"
}
Key churn analysis showing the health of your API key base. churnRate is the percentage of keys that have been revoked. retentionRate is the inverse. neverUsedKeys counts active keys with zero total calls. avgCreditsPerKey shows average remaining credits across active keys. Read-only.
Tool Correlation
curl http://localhost:3000/admin/tool-correlation -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"pairs": [
{ "toolA": "search", "toolB": "translate", "sharedConsumers": 5, "strength": 50 },
{ "toolA": "search", "toolB": "summarize", "sharedConsumers": 3, "strength": 30 }
],
"summary": { "totalPairs": 2, "totalConsumers": 10 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Tool co-occurrence analysis showing which tools are commonly used together. sharedConsumers counts API keys that used both tools. strength is the percentage of all consumers who use the pair. Sorted by shared consumers descending. Helps identify tool bundles and usage patterns. Read-only.
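To make the pair metrics concrete, the sketch below counts, for every pair of tools, how many keys used both, and expresses that as a percentage of all consumers. `tool_pairs` and its input shape are hypothetical, and counting only keys with at least one tool as consumers is an assumption.

```python
from collections import Counter
from itertools import combinations


def tool_pairs(tools_by_key: dict) -> list:
    """tools_by_key maps a key name to the set of tools that key has used."""
    # Assumption: a "consumer" is a key that has used at least one tool.
    total_consumers = sum(1 for tools in tools_by_key.values() if tools)
    shared = Counter()
    for tools in tools_by_key.values():
        # Sorting makes (a, b) canonical, so each pair is counted under one name.
        for a, b in combinations(sorted(tools), 2):
            shared[(a, b)] += 1
    pairs = [
        {
            "toolA": a,
            "toolB": b,
            "sharedConsumers": n,
            "strength": round(100 * n / total_consumers),
        }
        for (a, b), n in shared.items()
    ]
    pairs.sort(key=lambda p: -p["sharedConsumers"])
    return pairs
```

In the example response, 5 shared consumers out of 10 total gives a `strength` of 50.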
Consumer Segmentation
curl http://localhost:3000/admin/consumer-segmentation -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"segments": [
{ "segment": "power", "count": 3, "totalCredits": 500, "totalSpent": 2400, "avgCallsPerKey": 35 },
{ "segment": "regular", "count": 8, "totalCredits": 1200, "totalSpent": 800, "avgCallsPerKey": 12 },
{ "segment": "casual", "count": 15, "totalCredits": 3000, "totalSpent": 150, "avgCallsPerKey": 2 },
{ "segment": "dormant", "count": 5, "totalCredits": 1000, "totalSpent": 0, "avgCallsPerKey": 0 }
],
"summary": { "totalConsumers": 31 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Classifies active API key consumers into segments based on usage: power (20+ calls), regular (5-19 calls), casual (1-4 calls), dormant (0 calls). Each segment includes aggregate metrics: count, total credits remaining, total spent, and average calls per key. Excludes revoked and suspended keys. Read-only.
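The segment boundaries translate directly into a threshold function. The sketch below classifies keys by call count and aggregates the per-segment metrics; the helper names and the input field names (`credits`, `totalSpent`, `totalCalls`) are hypothetical, and omitting empty segments is an assumption.

```python
from collections import defaultdict


def segment_for(total_calls: int) -> str:
    """Map a key's call count to the documented segment name."""
    if total_calls >= 20:
        return "power"
    if total_calls >= 5:
        return "regular"
    if total_calls >= 1:
        return "casual"
    return "dormant"


def summarize_segments(keys: list) -> list:
    """keys: list of dicts with credits, totalSpent, totalCalls (illustrative fields)."""
    groups = defaultdict(list)
    for key in keys:
        groups[segment_for(key["totalCalls"])].append(key)
    rows = []
    for name in ("power", "regular", "casual", "dormant"):
        members = groups.get(name, [])
        if not members:
            continue  # assumption: empty segments are left out
        rows.append({
            "segment": name,
            "count": len(members),
            "totalCredits": sum(k["credits"] for k in members),
            "totalSpent": sum(k["totalSpent"] for k in members),
            "avgCallsPerKey": round(sum(k["totalCalls"] for k in members) / len(members)),
        })
    return rows
```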
Credit Distribution
curl http://localhost:3000/admin/credit-distribution -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"buckets": [
{ "range": "0-10", "count": 5, "totalCredits": 30 },
{ "range": "11-50", "count": 12, "totalCredits": 420 },
{ "range": "51-100", "count": 8, "totalCredits": 640 },
{ "range": "101-500", "count": 4, "totalCredits": 1200 },
{ "range": "1001+", "count": 2, "totalCredits": 5000 }
],
"summary": { "totalKeys": 31, "medianCredits": 50 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Histogram of credit balances across active, non-suspended keys. Buckets: 0-10, 11-50, 51-100, 101-500, 501-1000, 1001+. Only non-empty buckets are returned. medianCredits is the median remaining balance. Useful for pricing analysis and capacity planning. Read-only.
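A minimal sketch of the bucketing, assuming each bucket's upper edge is inclusive as the range labels suggest. `credit_histogram` is a hypothetical helper; the median uses Python's `statistics.median`, which averages the two middle values for an even number of keys, and the server may handle that case differently.

```python
from bisect import bisect_left
from statistics import median

UPPER_BOUNDS = [10, 50, 100, 500, 1000]  # inclusive upper edge of each bounded bucket
LABELS = ["0-10", "11-50", "51-100", "101-500", "501-1000", "1001+"]


def credit_histogram(balances: list) -> dict:
    """Group remaining-credit balances into the documented buckets."""
    counts, totals = {}, {}
    for credits in balances:
        # bisect_left finds the first bucket whose upper edge is >= credits.
        label = LABELS[bisect_left(UPPER_BOUNDS, credits)]
        counts[label] = counts.get(label, 0) + 1
        totals[label] = totals.get(label, 0) + credits
    buckets = [
        {"range": label, "count": counts[label], "totalCredits": totals[label]}
        for label in LABELS
        if label in counts  # only non-empty buckets are returned
    ]
    return {"buckets": buckets, "medianCredits": median(balances) if balances else 0}
```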
Response Time Distribution
curl http://localhost:3000/admin/response-time-distribution -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"buckets": [
{ "range": "0-50ms", "count": 45, "percentage": 60 },
{ "range": "51-100ms", "count": 20, "percentage": 27 },
{ "range": "101-250ms", "count": 8, "percentage": 11 },
{ "range": "251-500ms", "count": 2, "percentage": 3 }
],
"summary": { "totalRequests": 75, "avgResponseTime": 62, "p50": 42, "p95": 180, "p99": 350 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Histogram of response times across allowed tool calls. Buckets: 0-50ms, 51-100ms, 101-250ms, 251-500ms, 501-1000ms, 1001ms+. Includes percentile metrics (p50, p95, p99) and average response time. Only non-empty buckets are returned. Useful for SLA monitoring and performance optimization. Read-only.
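The percentile method is not specified here. As one common choice, the sketch below uses the nearest-rank definition, which always returns an observed duration; treat it as an assumption, since an interpolating method would give slightly different values on small samples. `percentile` is a hypothetical helper.

```python
import math


def percentile(sorted_values: list, pct: int):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not sorted_values:
        return 0
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


# Illustrative durations in milliseconds, not real measurements.
durations_ms = sorted([12, 30, 42, 55, 180, 350])
summary = {
    "p50": percentile(durations_ms, 50),
    "p95": percentile(durations_ms, 95),
    "p99": percentile(durations_ms, 99),
}
print(summary)
```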
Consumer Lifetime Value
curl http://localhost:3000/admin/consumer-lifetime-value -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"consumers": [
{ "name": "enterprise-bot", "lifetimeValue": 2500, "totalCalls": 500, "avgSpendPerCall": 5, "toolsUsed": 8, "tier": "high" },
{ "name": "dev-team", "lifetimeValue": 450, "totalCalls": 90, "avgSpendPerCall": 5, "toolsUsed": 4, "tier": "medium" },
{ "name": "trial-user", "lifetimeValue": 5, "totalCalls": 1, "avgSpendPerCall": 5, "toolsUsed": 1, "tier": "low" }
],
"summary": { "totalConsumers": 15, "totalLifetimeValue": 3200, "avgLifetimeValue": 213 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-consumer value analysis for active keys with usage. Value tiers: high (100+ credits spent), medium (10-99), low (<10). toolsUsed shows tool diversity. Top 20 consumers by lifetime value. Zero-spend consumers excluded from list. avgLifetimeValue uses all active keys as denominator. Read-only.
Tool Revenue Ranking
curl http://localhost:3000/admin/tool-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tools": [
{ "tool": "code_review", "totalCredits": 1500, "callCount": 300, "avgCreditsPerCall": 5, "uniqueConsumers": 25, "percentage": 60 },
{ "tool": "generate_tests", "totalCredits": 750, "callCount": 150, "avgCreditsPerCall": 5, "uniqueConsumers": 18, "percentage": 30 },
{ "tool": "lint_check", "totalCredits": 250, "callCount": 50, "avgCreditsPerCall": 5, "uniqueConsumers": 12, "percentage": 10 }
],
"summary": { "totalTools": 3, "totalRevenue": 2500, "topTool": "code_review" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Ranks tools by total credits consumed from allowed requests. Each tool entry includes call count, average credits per call, unique consumer count, and revenue percentage. topTool is the highest revenue generator. Only allowed requests are counted; denied requests are excluded. Sorted by total credits descending. Read-only.
Consumer Retention Cohorts
curl http://localhost:3000/admin/consumer-retention -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"cohorts": [
{ "period": "2025-01-15", "created": 10, "retained": 8, "retentionRate": 80, "avgSpend": 150 },
{ "period": "2025-01-14", "created": 5, "retained": 3, "retentionRate": 60, "avgSpend": 80 }
],
"summary": { "totalKeys": 15, "retainedKeys": 11, "overallRetentionRate": 73 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Groups active consumers by creation date (YYYY-MM-DD cohorts). A consumer is "retained" if they have at least 1 tool call. Per-cohort: created count, retained count, retention rate percentage, and average spend. Excludes revoked/suspended keys. Cohorts sorted newest first. Read-only.
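The cohort grouping can be sketched as follows: keys are bucketed by the date part of their creation timestamp, and a key counts as retained once it has at least one call. `retention_cohorts` and the input field names are hypothetical, and averaging spend over every key in the cohort (rather than only retained keys) is an assumption.

```python
from collections import defaultdict


def retention_cohorts(keys: list) -> list:
    """keys: list of dicts with createdAt (ISO 8601 string), totalCalls, totalSpent."""
    cohorts = defaultdict(list)
    for key in keys:
        cohorts[key["createdAt"][:10]].append(key)  # YYYY-MM-DD prefix
    rows = []
    for period in sorted(cohorts, reverse=True):  # newest cohort first
        members = cohorts[period]
        retained = sum(1 for k in members if k["totalCalls"] >= 1)
        rows.append({
            "period": period,
            "created": len(members),
            "retained": retained,
            "retentionRate": round(100 * retained / len(members)),
            # Assumption: average over all keys in the cohort, retained or not.
            "avgSpend": round(sum(k["totalSpent"] for k in members) / len(members)),
        })
    return rows
```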
Error Breakdown
curl http://localhost:3000/admin/error-breakdown -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"errors": [
{ "reason": "insufficient_credits", "count": 45, "percentage": 75, "affectedConsumers": 12 },
{ "reason": "rate_limited", "count": 10, "percentage": 17, "affectedConsumers": 3 },
{ "reason": "acl_denied", "count": 5, "percentage": 8, "affectedConsumers": 2 }
],
"summary": { "totalDenied": 60, "totalAllowed": 940, "errorRate": 6 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Categorizes denied requests by deny reason for root-cause analysis. Per-reason: count, percentage of total denials, and affected consumer count. errorRate is the percentage of total requests that were denied. Sorted by count descending. Read-only.
Credit Utilization Rate
curl http://localhost:3000/admin/credit-utilization -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"bands": [
{ "range": "0%", "count": 5, "percentage": 25 },
{ "range": "1-25%", "count": 8, "percentage": 40 },
{ "range": "26-50%", "count": 4, "percentage": 20 },
{ "range": "51-75%", "count": 2, "percentage": 10 },
{ "range": "76-99%", "count": 1, "percentage": 5 }
],
"summary": { "totalAllocated": 10000, "totalSpent": 3500, "overallUtilization": 35, "totalKeys": 20 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Shows what percentage of allocated credits are being used across active keys. Utilization bands: 0% (unused), 1-25%, 26-50%, 51-75%, 76-99%, 100% (fully consumed). totalAllocated = remaining credits + spent credits (original allocation). Excludes revoked/suspended keys. Read-only.
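A small sketch of how a single key could be placed into one of these bands, using the stated definition of allocation as remaining plus spent credits. `utilization_band` is a hypothetical helper, and the treatment of fractional percentages at band edges (for example 25.5%) is an assumption.

```python
def utilization_band(credits_remaining: int, total_spent: int) -> str:
    """Return the utilization band label for one key."""
    allocated = credits_remaining + total_spent  # original allocation, as documented
    if allocated == 0 or total_spent == 0:
        return "0%"
    if credits_remaining == 0:
        return "100%"
    pct = 100 * total_spent / allocated
    # Assumption: a fractional value falls into the first band whose upper edge it does not exceed.
    if pct <= 25:
        return "1-25%"
    if pct <= 50:
        return "26-50%"
    if pct <= 75:
        return "51-75%"
    return "76-99%"


# A key with 650 credits left after spending 350 is 35% utilized.
print(utilization_band(650, 350))
```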
Namespace Revenue
curl http://localhost:3000/admin/namespace-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"namespaces": [
{ "namespace": "team-alpha", "totalSpent": 1500, "totalCalls": 300, "keyCount": 5, "percentage": 60 },
{ "namespace": "team-beta", "totalSpent": 750, "totalCalls": 150, "keyCount": 3, "percentage": 30 },
{ "namespace": "default", "totalSpent": 250, "totalCalls": 50, "keyCount": 2, "percentage": 10 }
],
"summary": { "totalNamespaces": 3, "totalRevenue": 2500, "topNamespace": "team-alpha" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Revenue breakdown by namespace. Keys without a namespace are grouped as "default". Per-namespace: total credits spent, call count, key count, and revenue percentage. topNamespace is the highest revenue generator. Excludes revoked/suspended keys. Sorted by spend descending. Read-only.
Group Revenue
curl http://localhost:3000/admin/group-revenue -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"groups": [
{ "group": "premium", "totalSpent": 2000, "totalCalls": 400, "keyCount": 8, "percentage": 65 },
{ "group": "free-tier", "totalSpent": 800, "totalCalls": 160, "keyCount": 12, "percentage": 26 },
{ "group": "ungrouped", "totalSpent": 280, "totalCalls": 56, "keyCount": 3, "percentage": 9 }
],
"summary": { "totalGroups": 3, "totalRevenue": 3080, "topGroup": "premium" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Revenue breakdown by key group. Keys not assigned to any group are shown as "ungrouped". Group IDs are resolved to human-readable names. Per-group: total credits spent, call count, key count, and revenue percentage. topGroup is the highest revenue generator. Excludes revoked/suspended keys. Sorted by spend descending. Read-only.
Peak Usage Times
curl http://localhost:3000/admin/peak-usage -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"hours": [
{ "hour": 9, "requests": 450, "allowed": 420, "denied": 30, "credits": 2100, "uniqueConsumers": 15, "percentage": 30 },
{ "hour": 14, "requests": 380, "allowed": 370, "denied": 10, "credits": 1850, "uniqueConsumers": 12, "percentage": 25 },
{ "hour": 22, "requests": 120, "allowed": 118, "denied": 2, "credits": 590, "uniqueConsumers": 5, "percentage": 8 }
],
"summary": { "totalRequests": 1500, "peakHour": 9, "peakRequests": 450 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Traffic patterns by hour-of-day (UTC). Per-hour: total requests, allowed/denied split, credits spent, unique consumers, and traffic percentage. peakHour identifies the busiest hour for capacity planning. Hours are 0-23 (UTC), sorted ascending. Only hours with traffic are included. Read-only.
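Hour-of-day aggregation differs from the hourly time series elsewhere in this section: requests from different days fold into the same 0-23 bucket. The sketch below shows that folding for a list of timestamps; `requests_by_utc_hour` is a hypothetical helper and assumes every timestamp carries an explicit offset (or `Z`).

```python
from collections import Counter
from datetime import datetime, timezone


def requests_by_utc_hour(timestamps: list):
    """Count requests per UTC hour-of-day and report the busiest hour."""
    counts = Counter()
    for ts in timestamps:
        # Normalize a trailing "Z" so fromisoformat accepts it on older Python versions.
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        counts[dt.astimezone(timezone.utc).hour] += 1
    total = sum(counts.values())
    hours = [
        {"hour": hour, "requests": n, "percentage": round(100 * n / total)}
        for hour, n in sorted(counts.items())  # ascending; hours without traffic are absent
    ]
    peak_hour = max(counts, key=counts.get) if counts else None
    return hours, peak_hour
```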
Consumer Activity
curl http://localhost:3000/admin/consumer-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"consumers": [
{ "name": "alice", "totalCalls": 150, "totalSpent": 750, "creditsRemaining": 250, "lastActive": "2025-01-15T14:30:00Z", "status": "active" },
{ "name": "bob", "totalCalls": 0, "totalSpent": 0, "creditsRemaining": 500, "lastActive": null, "status": "inactive" }
],
"summary": { "totalConsumers": 2, "activeConsumers": 1, "inactiveConsumers": 1 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-consumer activity metrics. Shows each active key's call count, total spend, credits remaining, last active timestamp, and active/inactive status. Consumers with zero calls are "inactive". Excludes revoked/suspended keys. Sorted by spend descending. Read-only.
Tool Popularity
curl http://localhost:3000/admin/tool-popularity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tools": [
{ "tool": "search", "totalCalls": 500, "totalCredits": 2500, "uniqueConsumers": 20, "percentage": 50 },
{ "tool": "generate", "totalCalls": 300, "totalCredits": 3000, "uniqueConsumers": 15, "percentage": 30 },
{ "tool": "translate", "totalCalls": 200, "totalCredits": 1000, "uniqueConsumers": 10, "percentage": 20 }
],
"summary": { "totalTools": 3, "totalCalls": 1000, "mostPopular": "search" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Tool usage popularity ranking. Per-tool: total calls, credits spent, unique consumers, and call percentage. Only counts allowed (successful) requests. mostPopular identifies the most-called tool. Sorted by call count descending. Read-only.
Credit Allocation Summary
curl http://localhost:3000/admin/credit-allocation -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tiers": [
{ "tier": "1-100", "count": 15, "totalCredits": 750, "percentage": 5.0 },
{ "tier": "101-500", "count": 30, "totalCredits": 9000, "percentage": 60.0 },
{ "tier": "501+", "count": 5, "totalCredits": 5250, "percentage": 35.0 }
],
"summary": { "totalKeys": 50, "totalAllocated": 15000, "totalRemaining": 12000, "totalSpent": 3000, "averageAllocation": 300 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Credit allocation distribution across active keys. Groups keys into allocation tiers (1-100, 101-500, 501+) with count, total credits, and percentage per tier. Summary includes total keys, total allocated/remaining/spent credits, and average allocation per key. Excludes revoked/suspended keys. Read-only.
Daily Summary
curl http://localhost:3000/admin/daily-summary -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"days": [
{ "date": "2025-01-15", "requests": 150, "allowed": 140, "denied": 10, "creditsSpent": 700, "uniqueConsumers": 25, "uniqueTools": 8, "newKeys": 3 },
{ "date": "2025-01-14", "requests": 120, "allowed": 115, "denied": 5, "creditsSpent": 575, "uniqueConsumers": 20, "uniqueTools": 7, "newKeys": 1 }
],
"summary": { "totalDays": 2, "totalRequests": 270, "totalCreditsSpent": 1275, "averageRequestsPerDay": 135 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Daily activity rollup for trend analysis. Per-day: total requests, allowed/denied breakdown, credits spent, unique consumers, unique tools, and new keys created. Summary includes total days, total requests, total credits, and average requests per day. Sorted by date descending (most recent first). Read-only.
Key Ranking
curl http://localhost:3000/admin/key-ranking -H "X-Admin-Key: YOUR_ADMIN_KEY"
# Sort by calls: ?sortBy=totalCalls
# Sort by credits remaining: ?sortBy=creditsRemaining
{
"rankings": [
{ "rank": 1, "name": "power-user", "totalSpent": 500, "totalCalls": 100, "creditsRemaining": 500 },
{ "rank": 2, "name": "moderate-user", "totalSpent": 200, "totalCalls": 40, "creditsRemaining": 800 },
{ "rank": 3, "name": "light-user", "totalSpent": 50, "totalCalls": 10, "creditsRemaining": 950 }
],
"summary": { "totalKeys": 3, "sortedBy": "totalSpent" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Key leaderboard ranked by configurable metric. Default sorts by totalSpent descending. Use ?sortBy=totalCalls or ?sortBy=creditsRemaining for alternative rankings. Each entry includes rank number, name, spend, calls, and credits remaining. Excludes revoked/suspended keys. Read-only.
Hourly Traffic
curl http://localhost:3000/admin/hourly-traffic -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"hours": [
{ "timestamp": "2025-01-15T14:00:00Z", "requests": 45, "allowed": 42, "denied": 3, "credits": 210, "uniqueConsumers": 12, "uniqueTools": 5 },
{ "timestamp": "2025-01-15T13:00:00Z", "requests": 30, "allowed": 28, "denied": 2, "credits": 140, "uniqueConsumers": 8, "uniqueTools": 4 }
],
"summary": { "totalRequests": 75, "totalCredits": 350, "busiestHour": "2025-01-15T14:00:00Z", "busiestHourRequests": 45 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Granular per-hour request metrics. Per-hour: total requests, allowed/denied breakdown, credits spent, unique consumers, and unique tools. Summary includes totals and identifies the busiest hour. Sorted by timestamp descending (most recent first). Read-only.
Tool Error Rate
curl http://localhost:3000/admin/tool-error-rate -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tools": [
{ "tool": "translate", "totalRequests": 100, "allowed": 85, "denied": 15, "errorRate": 15 },
{ "tool": "search", "totalRequests": 200, "allowed": 190, "denied": 10, "errorRate": 5 },
{ "tool": "generate", "totalRequests": 150, "allowed": 150, "denied": 0, "errorRate": 0 }
],
"summary": { "totalTools": 3, "overallErrorRate": 5.56, "highestErrorTool": "translate" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-tool error/denial rate analysis. Per-tool: total requests, allowed/denied counts, and error rate percentage. Summary includes total tools, overall error rate, and identifies the tool with highest error rate. Sorted by error rate descending. Read-only.
Consumer Spend Velocity
curl http://localhost:3000/admin/consumer-spend-velocity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"consumers": [
{ "name": "power-user", "totalSpent": 500, "creditsRemaining": 500, "creditsPerHour": 25.5, "hoursUntilDepleted": 19.61 },
{ "name": "casual-user", "totalSpent": 50, "creditsRemaining": 950, "creditsPerHour": 2.1, "hoursUntilDepleted": 452.38 },
{ "name": "idle-user", "totalSpent": 0, "creditsRemaining": 100, "creditsPerHour": 0, "hoursUntilDepleted": null }
],
"summary": { "totalConsumers": 3, "fastestSpender": "power-user" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-consumer spend velocity analysis. Per-consumer: total spent, credits remaining, credits per hour rate, and estimated hours until depletion. Zero-spend consumers have creditsPerHour: 0 and hoursUntilDepleted: null. Summary identifies the fastest spender. Excludes revoked/suspended keys. Sorted by spend rate descending. Read-only.
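The depletion estimate above can be sketched as a small helper. This is an illustration only: it assumes hoursUntilDepleted is creditsRemaining divided by creditsPerHour, rounded to two decimals (which matches the sample response), and the function name is hypothetical rather than part of paygate-mcp.

```typescript
// Hypothetical helper: estimated hours until a key runs out of credits.
// Returns null when nothing is being spent, mirroring the documented
// behaviour for zero-spend consumers.
function hoursUntilDepleted(
  creditsRemaining: number,
  creditsPerHour: number,
): number | null {
  if (creditsPerHour <= 0) return null;
  // Round to two decimal places (assumption based on the sample output).
  return Math.round((creditsRemaining / creditsPerHour) * 100) / 100;
}
```

How creditsPerHour itself is measured (for example, over what time window) is not stated in this section, so it is taken as an input here.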
Namespace Activity
curl http://localhost:3000/admin/namespace-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"namespaces": [
{ "namespace": "production", "keyCount": 5, "totalSpent": 1200, "totalCalls": 240, "creditsRemaining": 3800 },
{ "namespace": "staging", "keyCount": 2, "totalSpent": 80, "totalCalls": 16, "creditsRemaining": 920 },
{ "namespace": "default", "keyCount": 1, "totalSpent": 0, "totalCalls": 0, "creditsRemaining": 100 }
],
"summary": { "totalNamespaces": 3, "topNamespace": "production" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-namespace activity breakdown for multi-tenant visibility. Per-namespace: key count, total spend, total calls, credits remaining. Keys without a namespace are grouped as "default". Summary identifies the top namespace by spend. Excludes revoked/suspended keys. Sorted by totalSpent descending. Read-only.
Credit Burn Rate
curl http://localhost:3000/admin/credit-burn-rate -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"burnRate": { "creditsPerHour": 45.5, "hoursUntilDepleted": 104.4, "utilizationPercent": 25 },
"summary": { "totalAllocated": 5000, "totalSpent": 1250, "totalRemaining": 3750, "activeKeys": 10 },
"generatedAt": "2025-01-15T14:30:00Z"
}
System-wide credit burn rate analysis. Shows aggregate credits/hour burn rate, utilization percentage (spent/allocated), and estimated hours until all credits are depleted. Summary includes total allocated, spent, remaining, and active key count. Zero-spend systems show creditsPerHour: 0 and hoursUntilDepleted: null. Excludes revoked/suspended keys. Read-only.
Consumer Risk Score
curl http://localhost:3000/admin/consumer-risk-score -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"consumers": [
{ "name": "heavy-user", "riskScore": 80, "riskLevel": "critical", "creditsRemaining": 20, "totalSpent": 80, "utilizationPercent": 80 },
{ "name": "normal-user", "riskScore": 25, "riskLevel": "medium", "creditsRemaining": 150, "totalSpent": 50, "utilizationPercent": 25 },
{ "name": "idle-user", "riskScore": 0, "riskLevel": "low", "creditsRemaining": 100, "totalSpent": 0, "utilizationPercent": 0 }
],
"summary": { "totalConsumers": 3, "riskDistribution": { "low": 1, "medium": 1, "high": 0, "critical": 1 } },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-consumer risk scoring based on credit utilization. Risk score (0–100) maps to levels: low (0–24), medium (25–49), high (50–74), critical (75–100). Per-consumer: risk score, risk level, credits remaining, total spent, utilization percentage. Summary includes risk distribution counts. Excludes revoked/suspended keys. Sorted by riskScore descending. Read-only.
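A minimal sketch of the score-to-level mapping described above. The band boundaries come from the text; treating the score as the rounded utilization percentage (spent divided by spent plus remaining) is an assumption that happens to match the sample response, and both function names are hypothetical.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical";

// Map a 0-100 risk score onto the four documented bands.
function riskLevel(score: number): RiskLevel {
  if (score >= 75) return "critical";
  if (score >= 50) return "high";
  if (score >= 25) return "medium";
  return "low";
}

// Assumption: the score equals the key's utilization percentage.
function riskScore(totalSpent: number, creditsRemaining: number): number {
  const allocated = totalSpent + creditsRemaining;
  return allocated === 0 ? 0 : Math.round((totalSpent / allocated) * 100);
}
```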
Revenue Forecast
curl http://localhost:3000/admin/revenue-forecast -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"forecast": { "nextHour": 45.5, "nextDay": 1092, "nextWeek": 7644, "nextMonth": 32760 },
"current": { "totalSpent": 1250, "totalRemaining": 48750, "creditsPerHour": 45.5, "activeKeys": 10 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Projected revenue based on current spend trends. Forecasts for next hour, day, week, and month are extrapolated from aggregate credits/hour rate and capped by total remaining credits. Includes current totals and active key count. Zero-spend systems show zero forecasts. Excludes revoked/suspended keys. Read-only.
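The extrapolate-and-cap rule can be sketched as follows. The 24 × 30 hour month is an assumption inferred from the sample response (45.5 × 720 = 32760), and the function name is hypothetical.

```typescript
interface Forecast {
  nextHour: number;
  nextDay: number;
  nextWeek: number;
  nextMonth: number;
}

// Linear extrapolation of the aggregate burn rate, capped by the credits
// that are actually left to spend.
function forecastRevenue(creditsPerHour: number, totalRemaining: number): Forecast {
  const project = (hours: number): number =>
    Math.min(creditsPerHour * hours, totalRemaining);
  return {
    nextHour: project(1),
    nextDay: project(24),
    nextWeek: project(24 * 7),
    nextMonth: project(24 * 30), // assumption: a month is treated as 30 days
  };
}
```

With the cap in place, a system burning 100 credits per hour with only 500 credits left would forecast 500 for the day, week, and month alike.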
System Overview
curl http://localhost:3000/admin/system-overview -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"keys": { "total": 15, "active": 12, "revoked": 2, "suspended": 1 },
"credits": { "totalAllocated": 150000, "totalSpent": 45000, "totalRemaining": 105000, "utilizationPercent": 30 },
"activity": { "totalCalls": 3500, "uniqueTools": 8 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Executive summary of the entire system. Key counts by status (active, revoked, suspended). Credit totals with utilization percentage. Activity metrics including total calls and unique tools used. Single endpoint for dashboards and monitoring integrations. Read-only.
Key Health Overview
curl http://localhost:3000/admin/key-health-overview -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"keys": [
{ "name": "depleted-key", "credits": 0, "totalSpent": 1000, "totalCalls": 85, "utilizationPercent": 100, "status": "critical" },
{ "name": "active-key", "credits": 200, "totalSpent": 800, "totalCalls": 60, "utilizationPercent": 80, "status": "warning" },
{ "name": "healthy-key", "credits": 9000, "totalSpent": 1000, "totalCalls": 50, "utilizationPercent": 10, "status": "healthy" }
],
"summary": { "totalKeys": 3, "healthDistribution": { "healthy": 1, "warning": 1, "critical": 1 } },
"generatedAt": "2025-01-15T14:30:00Z"
}
Holistic per-key health check. Status levels: critical (0 credits remaining), warning (≥75% utilization), healthy (below thresholds). Summary includes health distribution counts. Sorted by credits ascending (most depleted first). Excludes revoked/suspended keys. Read-only.
Namespace Comparison
curl http://localhost:3000/admin/namespace-comparison -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"namespaces": [
{ "namespace": "production", "keyCount": 5, "totalAllocated": 50000, "totalSpent": 12000, "totalCalls": 800, "creditsRemaining": 38000, "utilizationPercent": 24 },
{ "namespace": "staging", "keyCount": 3, "totalAllocated": 3000, "totalSpent": 500, "totalCalls": 50, "creditsRemaining": 2500, "utilizationPercent": 17 }
],
"summary": { "totalNamespaces": 2, "leader": "production" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Side-by-side namespace comparison. Per namespace: key count, total allocated credits, total spent, total calls, credits remaining, utilization percentage. Keys without a namespace appear under "default". Summary includes namespace count and leading namespace (highest allocation). Sorted by totalAllocated descending. Excludes revoked/suspended keys. Read-only.
Consumer Growth
curl http://localhost:3000/admin/consumer-growth -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"consumers": [
{ "name": "enterprise-key", "ageHours": 720, "totalSpent": 4500, "creditsAllocated": 10000, "spendRate": 6.25 },
{ "name": "trial-key", "ageHours": 24, "totalSpent": 10, "creditsAllocated": 100, "spendRate": 0.42 }
],
"summary": { "totalConsumers": 2, "newConsumers24h": 1 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Consumer growth metrics per key. Per consumer: age in hours since creation, total credits spent, original credits allocated (credits + totalSpent), spend rate (credits/hour). Summary includes total active consumer count and new consumers created in the last 24 hours. Sorted by creditsAllocated descending. Excludes revoked/suspended keys. Read-only.
Tool Profitability
curl http://localhost:3000/admin/tool-profitability -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"tools": [
{ "tool": "search", "totalCalls": 150, "totalRevenue": 450, "avgRevenuePerCall": 3, "callerCount": 8 },
{ "tool": "translate", "totalCalls": 40, "totalRevenue": 120, "avgRevenuePerCall": 3, "callerCount": 3 }
],
"summary": { "totalRevenue": 570, "mostProfitable": "search", "leastProfitable": "translate" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-tool profitability analysis based on actual tool call revenue. Per tool: total calls, total revenue (credits spent), average revenue per call, unique caller count. Sorted by totalRevenue descending. Summary includes most/least profitable tools and total revenue across all tools. Read-only.
Credit Waste Analysis
curl http://localhost:3000/admin/credit-waste -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"keys": [
{ "name": "unused-key", "creditsAllocated": 1000, "creditsUsed": 0, "creditsRemaining": 1000, "wastePercent": 100 },
{ "name": "active-key", "creditsAllocated": 500, "creditsUsed": 350, "creditsRemaining": 150, "wastePercent": 30 }
],
"summary": { "totalAllocated": 1500, "totalWasted": 1150, "averageWastePercent": 65 },
"generatedAt": "2025-01-15T14:30:00Z"
}
Per-key credit waste analysis showing allocated vs used credits. Waste percent = remaining / allocated × 100 (100% = fully unused, 0% = fully utilized). Summary includes total allocated credits, total wasted (remaining), and average waste percentage. Sorted by wastePercent descending (biggest wasters first). Excludes revoked/suspended keys. Read-only.
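The waste formula and summary can be reproduced with a short sketch. It assumes allocated credits equal used plus remaining and that averageWastePercent is the unweighted mean of the per-key percentages (both consistent with the sample response); the names are hypothetical.

```typescript
interface KeyCredits {
  name: string;
  creditsUsed: number;
  creditsRemaining: number;
}

function wasteReport(keys: KeyCredits[]) {
  const rows = keys
    .map((k) => {
      const creditsAllocated = k.creditsUsed + k.creditsRemaining;
      // wastePercent = remaining / allocated * 100
      const wastePercent =
        creditsAllocated === 0
          ? 0
          : Math.round((k.creditsRemaining / creditsAllocated) * 100);
      return { ...k, creditsAllocated, wastePercent };
    })
    .sort((a, b) => b.wastePercent - a.wastePercent); // biggest wasters first

  const totalAllocated = rows.reduce((sum, r) => sum + r.creditsAllocated, 0);
  const totalWasted = rows.reduce((sum, r) => sum + r.creditsRemaining, 0);
  const averageWastePercent =
    rows.length === 0
      ? 0
      : rows.reduce((sum, r) => sum + r.wastePercent, 0) / rows.length;

  return { keys: rows, summary: { totalAllocated, totalWasted, averageWastePercent } };
}
```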
Group Activity
curl http://localhost:3000/admin/group-activity -H "X-Admin-Key: YOUR_ADMIN_KEY"
{
"groups": [
{ "group": "production", "keyCount": 5, "totalSpent": 2500, "totalCalls": 180, "creditsRemaining": 7500 },
{ "group": "staging", "keyCount": 3, "totalSpent": 400, "totalCalls": 45, "creditsRemaining": 2600 }
],
"summary": { "totalGroups": 2, "topGroup": "production" },
"generatedAt": "2025-01-15T14:30:00Z"
}
Activity breakdown by key group. Per-group: key count, total spent, total calls, credits remaining. Ungrouped keys appear under "ungrouped". Group IDs are resolved to human-readable group names. Sorted by totalSpent descending. Summary includes group count and top-spending group. Excludes revoked/suspended keys. Read-only.
IP Allowlisting
Restrict API keys to specific IP addresses or CIDR ranges:
# Set IP allowlist on a key (replaces existing list)
curl -X POST http://localhost:3402/keys/ip \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "ips": ["192.168.1.0/24", "10.0.0.5"]}'
# Clear allowlist (allow all IPs)
curl -X POST http://localhost:3402/keys/ip \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "ips": []}'
You can also set the allowlist at key creation time:
curl -X POST http://localhost:3402/keys \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "prod-agent", "credits": 1000, "ipAllowlist": ["10.0.0.0/8"]}'
Supports exact IPv4 matching and CIDR notation (/8, /16, /24, /32, etc.). When the allowlist is empty, all IPs are allowed. Client IP is extracted from X-Forwarded-For header or socket remote address. Configure trustedProxies for accurate IP extraction behind load balancers (see Trusted Proxies).
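For intuition, exact and CIDR matching for IPv4 can be sketched like this. It is a minimal illustration of the matching rule, not PayGate's implementation; the function names are hypothetical and it ignores proxy handling entirely.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number.
// Returns null for malformed input.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Match one allowlist entry: "10.0.0.5" (exact) or "192.168.1.0/24" (CIDR).
function matchesEntry(clientIp: string, entry: string): boolean {
  const slash = entry.indexOf("/");
  const base = slash === -1 ? entry : entry.slice(0, slash);
  const prefixText = slash === -1 ? "32" : entry.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  const ip = ipv4ToInt(clientIp);
  const net = ipv4ToInt(base);
  if (ip === null || net === null || prefix > 32) return false;
  // Compare the top `prefix` bits with plain arithmetic, which avoids the
  // sign issues of 32-bit bitwise operators.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ip / blockSize) === Math.floor(net / blockSize);
}

// An empty allowlist allows every client.
function isIpAllowed(clientIp: string, allowlist: string[]): boolean {
  return allowlist.length === 0 || allowlist.some((e) => matchesEntry(clientIp, e));
}
```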
Key Tags / Metadata
Attach arbitrary key-value tags to API keys for external system integration:
# Set tags (merge semantics: existing tags preserved, new ones added/updated)
curl -X POST http://localhost:3402/keys/tags \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "tags": {"team": "backend", "env": "production", "customer_id": "cus_123"}}'
# Remove a tag (set value to null)
curl -X POST http://localhost:3402/keys/tags \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"key": "pg_...", "tags": {"env": null}}'
# Search keys by tags
curl -X POST http://localhost:3402/keys/search \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"tags": {"team": "backend"}}'
# → { "keys": [...], "count": 3 }
Tags can also be set at key creation:
curl -X POST http://localhost:3402/keys \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "backend-prod", "credits": 5000, "tags": {"team": "backend", "env": "production"}}'
Limits: max 50 tags per key, max 100 chars per key/value. Tags appear in /balance responses and key listings.
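The merge semantics and limits can be summarized in a short sketch. The function name is hypothetical, and throwing on a limit violation is an assumption; this section does not say how the server reports such errors.

```typescript
type Tags = Record<string, string>;
type TagPatch = Record<string, string | null>;

// Merge a patch into existing tags: new values are added or overwritten,
// null values remove the tag, untouched tags are preserved.
function mergeTags(existing: Tags, patch: TagPatch): Tags {
  const merged: Tags = { ...existing };
  for (const [tagName, value] of Object.entries(patch)) {
    if (value === null) {
      delete merged[tagName];
      continue;
    }
    if (tagName.length > 100 || value.length > 100) {
      throw new Error("tag keys and values are limited to 100 characters");
    }
    merged[tagName] = value;
  }
  if (Object.keys(merged).length > 50) {
    throw new Error("a key can carry at most 50 tags");
  }
  return merged;
}
```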
Usage Analytics
Query aggregated usage data for dashboards, reports, and trend analysis:
# Get analytics for the last 24 hours (hourly buckets)
curl "http://localhost:3402/analytics?from=2026-02-25T00:00:00Z&to=2026-02-26T00:00:00Z&granularity=hourly" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# Daily granularity with top 5 consumers
curl "http://localhost:3402/analytics?granularity=daily&topN=5" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns:
- timeSeries: Bucketed call counts, credits charged, and denials per time window
- toolBreakdown: Per-tool stats (calls, credits, average cost) sorted by usage
- topConsumers: Top N API keys by credits spent, with each key's most-used tool
- trend: Current vs previous period comparison with percentage changes (calls, credits, denials)
- summary: Total calls, credits, unique keys, and unique tools
Query parameters: from (ISO date), to (ISO date), granularity (hourly or daily, default: hourly), topN (number, default: 10).
Alert Webhooks
Configure rules to fire alerts when usage thresholds are crossed. Alerts are delivered via webhooks as alert.fired admin events:
# Configure alert rules
curl -X POST http://localhost:3402/alerts \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"rules": [
{"type": "spending_threshold", "threshold": 80},
{"type": "credits_low", "threshold": 50},
{"type": "quota_warning", "threshold": 90},
{"type": "key_expiry_soon", "threshold": 86400},
{"type": "rate_limit_spike", "threshold": 10}
]}'
# Consume pending alerts (returns and clears queue)
curl http://localhost:3402/alerts \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Alert types:
| Type | Threshold Meaning | Fires When |
|---|---|---|
| spending_threshold | Percentage (0–100) | Key has spent ≥ threshold% of its initial credits |
| credits_low | Absolute credits | Key's remaining credits drop below threshold |
| quota_warning | Percentage (0–100) | Key's daily call usage exceeds threshold% of quota |
| key_expiry_soon | Seconds | Key expires within threshold seconds |
| rate_limit_spike | Count | Key has ≥ threshold rate-limit denials in 5 minutes |
Each rule has an optional cooldownSeconds (default: 300) to prevent alert storms. Alerts are automatically checked on every gate evaluation (tool call).
When webhooks are enabled (--webhook-url), alerts fire as alert.fired events in the adminEvents webhook payload with full context (key, rule type, current value, threshold).
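To illustrate how a cooldown suppresses alert storms, here is a minimal sketch. It assumes the cooldown is tracked per rule type and key, which this section does not confirm; the class and function names are hypothetical.

```typescript
interface AlertRule {
  type: string;
  threshold: number;
  cooldownSeconds?: number; // documented default: 300
}

class AlertCooldown {
  private lastFired = new Map<string, number>();

  // Returns true if the alert may fire now and records the firing time.
  shouldFire(rule: AlertRule, keyName: string, nowMs: number): boolean {
    const id = `${rule.type}:${keyName}`; // assumption: one timer per rule and key
    const cooldownMs = (rule.cooldownSeconds ?? 300) * 1000;
    const last = this.lastFired.get(id);
    if (last !== undefined && nowMs - last < cooldownMs) return false;
    this.lastFired.set(id, nowMs);
    return true;
  }
}

// spending_threshold condition: spent at least threshold% of initial credits.
function spendingThresholdCrossed(
  totalSpent: number,
  initialCredits: number,
  thresholdPercent: number,
): boolean {
  return initialCredits > 0 && (totalSpent / initialCredits) * 100 >= thresholdPercent;
}
```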
Team Management
Group API keys into teams with shared budgets, quotas, and usage tracking. Teams enforce budget and quota limits at the gate level: if a key belongs to a team that has exceeded its budget or quota, tool calls are denied even if the individual key has credits remaining.
# Create a team
curl -X POST http://localhost:3402/teams \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"name": "Engineering", "budget": 10000, "tags": {"dept": "eng"}}'
# Assign an API key to a team
curl -X POST http://localhost:3402/teams/assign \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"teamId": "team_abc123...", "apiKey": "pg_abc123..."}'
# Set team quotas (daily/monthly limits)
curl -X POST http://localhost:3402/teams/update \
-H "Content-Type: application/json" \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-d '{"teamId": "team_abc123...", "quota": {"dailyCalls": 1000, "monthlyCalls": 25000, "dailyCredits": 5000, "monthlyCredits": 100000}}'
# View team usage with member breakdown
curl "http://localhost:3402/teams/usage?teamId=team_abc123..." \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Team features:
| Feature | Description |
|---|---|
| Shared budget | Pool credits across all team members (0 = unlimited) |
| Team quotas | Daily/monthly call and credit limits (UTC auto-reset) |
| Member breakdown | Per-key usage within the team (keys masked) |
| Tags/metadata | Attach key-value pairs for department, project, etc. |
| Max 100 keys | Per team limit to prevent abuse |
| Gate integration | Budget + quota checked on every tool call |
| Audit trail | team.created, team.updated, team.deleted, team.key_assigned, team.key_removed events |
Each API key can belong to at most one team. Team budget and quota checks happen after individual key checks; both must pass for a tool call to succeed.
Rate Limit Response Headers
Every /mcp response includes rate limit and credits headers when an API key is provided:
X-RateLimit-Limit: 100 # Max calls per window
X-RateLimit-Remaining: 87 # Calls remaining in current window
X-RateLimit-Reset: 45 # Seconds until window resets
X-Credits-Remaining: 4500 # Credits remaining on the key
When a tool has a per-tool rate limit, the headers reflect that tool's limit (not the global limit). These headers are CORS-exposed so browser-based agents can read them.
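A client can use these headers to pace itself. The sketch below parses them from a plain header map and decides how long to wait before the next call; the function names are hypothetical and the pacing policy is only one reasonable choice.

```typescript
interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetSeconds: number;
  creditsRemaining: number;
}

// Parse the four documented headers; returns null if any is missing or
// not numeric. Header names are matched case-insensitively.
function parseRateLimitHeaders(headers: Record<string, string>): RateLimitInfo | null {
  const lower: Record<string, string> = {};
  for (const [headerName, value] of Object.entries(headers)) {
    lower[headerName.toLowerCase()] = value;
  }
  const read = (headerName: string): number => Number(lower[headerName]);
  const info: RateLimitInfo = {
    limit: read("x-ratelimit-limit"),
    remaining: read("x-ratelimit-remaining"),
    resetSeconds: read("x-ratelimit-reset"),
    creditsRemaining: read("x-credits-remaining"),
  };
  return Object.values(info).every((n) => Number.isFinite(n)) ? info : null;
}

// Milliseconds to wait before the next call: none while calls remain,
// otherwise until the window resets.
function delayBeforeNextCall(info: RateLimitInfo): number {
  return info.remaining > 0 ? 0 : info.resetSeconds * 1000;
}
```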
Health Check + Graceful Shutdown
The GET /health endpoint provides a public (no auth required) health check for load balancers and orchestrators:
curl http://localhost:3402/health
{
"status": "healthy",
"uptime": 3600,
"version": "2.6.0",
"inflight": 3,
"redis": { "connected": true, "pubsub": true },
"webhooks": { "pendingRetries": 0, "deadLetterCount": 2 }
}
| Field | Description |
|---|---|
| status | "healthy" or "draining" (during graceful shutdown) |
| uptime | Seconds since server started |
| version | Package version |
| inflight | Number of in-flight /mcp requests |
| redis | Redis connectivity (only present when --redis-url is set) |
| webhooks | Webhook retry stats (only present when --webhook-url is set) |
During graceful shutdown, /health returns HTTP 503 with "status": "draining", and new /mcp requests are rejected with 503. Existing in-flight requests are allowed to complete before the server tears down. The CLI uses gracefulStop() on SIGTERM/SIGINT with a 30-second drain timeout.
Programmatic API:
// Graceful shutdown with custom timeout (default 30s)
await server.gracefulStop(15_000);
Config Validation + Dry Run
Validate a config file before starting the server:
# Validate a config file (exits 0 if valid, 1 if errors found)
paygate-mcp validate --config paygate.json
Output on error:
2 error(s):
ERROR [port] Invalid port 99999. Must be 0–65535.
ERROR [redisUrl] Invalid redisUrl protocol "http:". Expected "redis://" or "rediss://".
1 warning(s):
WARN [shadowMode] Shadow mode is enabled. Payment will not be enforced.
Dry run mode starts the server, discovers tools from the backend, prints a pricing table, then exits:
paygate-mcp wrap --server "node my-server.js" --dry-run
── DRY RUN ──────────────────────────────────────
Discovered 3 tool(s):
────────────────────────────────────────────────────────────
Tool          Credits/Call    Rate Limit
────────────────────────────────────────────────────────────
search        5               30/min
generate      10              10/min
list_items    1               60/min
────────────────────────────────────────────────────────────
Dry run complete - shutting down.
Programmatic API:
import { validateConfig, formatDiagnostics } from 'paygate-mcp';
const diags = validateConfig(myConfig);
if (diags.some(d => d.level === 'error')) {
console.error(formatDiagnostics(diags));
process.exit(1);
}
Batch Tool Calls
Call multiple tools in a single request with all-or-nothing billing:
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call_batch",
"params": {
"calls": [
{ "name": "search", "arguments": { "q": "MCP servers" } },
{ "name": "translate", "arguments": { "text": "hello", "to": "es" } },
{ "name": "summarize", "arguments": { "url": "https://example.com" } }
]
}
}
Response:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"results": [
{ "tool": "search", "result": { "content": [...] }, "creditsCharged": 5 },
{ "tool": "translate", "result": { "content": [...] }, "creditsCharged": 3 },
{ "tool": "summarize", "result": { "content": [...] }, "creditsCharged": 2 }
],
"totalCreditsCharged": 10,
"remainingCredits": 90
}
}
Key features:
- All-or-nothing: All calls are pre-validated (auth, ACL, rate limits, credits, quotas) before any execute. If any call would be denied, the entire batch is rejected and zero credits are charged.
- Aggregate pricing: Total credits are checked and deducted atomically. A batch of 3 calls needing 5+3+2=10 credits requires 10 credits available.
- Parallel execution: After gate approval, all tool calls execute concurrently for minimum latency.
- Refund on failure: With refundOnFailure enabled, individual tools that error downstream get their credits refunded.
- Multi-server support: Works with prefixed tools in multi-server mode (e.g., fs:read, github:search).
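The all-or-nothing credit check can be sketched in a few lines. This covers only the pricing part of pre-validation (auth, ACL, rate limits, and quotas are omitted); the function name and the reason strings are hypothetical, while the result field names follow the programmatic example in this section.

```typescript
interface BatchCall {
  name: string;
}

interface BatchDecision {
  allAllowed: boolean;
  totalCredits: number;
  failedIndex?: number;
  reason?: string;
}

// Pre-validate every call before charging anything. If any call fails,
// the whole batch is rejected and nothing is charged.
function evaluateBatchSketch(
  balance: number,
  prices: Record<string, number>,
  calls: BatchCall[],
): BatchDecision {
  let totalCredits = 0;
  for (let i = 0; i < calls.length; i++) {
    const price = prices[calls[i].name];
    if (price === undefined) {
      return { allAllowed: false, totalCredits: 0, failedIndex: i, reason: "unknown_tool" };
    }
    totalCredits += price;
    if (totalCredits > balance) {
      return { allAllowed: false, totalCredits: 0, failedIndex: i, reason: "insufficient_credits" };
    }
  }
  // Only now would the aggregate amount be deducted in a single step.
  return { allAllowed: true, totalCredits };
}
```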
Programmatic API:
import { Gate, BatchToolCall } from 'paygate-mcp';
const calls: BatchToolCall[] = [
{ name: 'search', arguments: { q: 'test' } },
{ name: 'translate', arguments: { text: 'hi' } },
];
// gate is an existing Gate instance; apiKey and clientIp identify the caller
const result = gate.evaluateBatch(apiKey, calls, clientIp);
if (!result.allAllowed) {
console.log(`Denied at index ${result.failedIndex}: ${result.reason}`);
} else {
console.log(`Charged ${result.totalCredits} credits for ${calls.length} calls`);
}
Multi-Tenant Namespaces
Isolate API keys and usage data by tenant. Each key belongs to a namespace (default: "default"). All admin endpoints support namespace filtering for tenant-scoped views.
Create a key in a namespace:
curl -X POST http://localhost:3402/keys \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"name": "acme-agent", "credits": 1000, "namespace": "acme-corp"}'
List keys filtered by namespace:
curl "http://localhost:3402/keys?namespace=acme-corp" \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
List all namespaces with stats:
curl http://localhost:3402/namespaces \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns:
{
"namespaces": [
{ "namespace": "acme-corp", "keyCount": 3, "activeKeys": 2, "totalCredits": 2500, "totalSpent": 480 },
{ "namespace": "beta-inc", "keyCount": 1, "activeKeys": 1, "totalCredits": 500, "totalSpent": 120 }
],
"count": 2
}
Namespace-filtered status, usage, and analytics:
# Status filtered to one namespace
curl "http://localhost:3402/status?namespace=acme-corp" -H "X-Admin-Key: ..."
# Usage events filtered by namespace
curl "http://localhost:3402/usage?namespace=acme-corp" -H "X-Admin-Key: ..."
# Analytics filtered by namespace
curl "http://localhost:3402/analytics?namespace=acme-corp&from=2025-01-01" -H "X-Admin-Key: ..."
# Search keys by tag within a namespace
curl -X POST http://localhost:3402/keys/search \
-H "X-Admin-Key: ..." -H "Content-Type: application/json" \
-d '{"tags": {"env": "prod"}, "namespace": "acme-corp"}'
Namespace rules:
- Alphanumeric + hyphens only, max 50 characters, case-insensitive (stored lowercase)
- Defaults to "default" if omitted or invalid
- Old keys automatically backfilled to "default" on state file load
- Usage events carry the key's namespace for cross-cutting analytics
- Namespaces are implicit: created automatically when a key is assigned to one
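The naming rules above translate into a small normalization step. A minimal sketch, with a hypothetical function name:

```typescript
// Lowercase the namespace and fall back to "default" when it is missing
// or violates the rules (alphanumeric + hyphens, 1 to 50 characters).
function normalizeNamespace(input?: string | null): string {
  if (typeof input !== "string") return "default";
  const candidate = input.toLowerCase();
  return /^[a-z0-9-]{1,50}$/.test(candidate) ? candidate : "default";
}
```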
Scoped Tokens
Issue short-lived, tool-restricted tokens from any API key. Scoped tokens let you delegate narrow access to agents or sub-processes without exposing the parent API key.
Create a scoped token (admin):
curl -X POST http://localhost:3402/tokens \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"key": "pg_parent_key_here",
"ttl": 300,
"allowedTools": ["search", "summarize"],
"label": "agent-session-42"
}'
Returns:
{
"token": "pgt_eyJhcGl...signature",
"expiresAt": "2025-06-15T12:05:00.000Z",
"ttl": 300,
"parentKey": "my-agent",
"allowedTools": ["search", "summarize"],
"label": "agent-session-42",
"message": "Use as X-API-Key or Bearer token on /mcp"
}
Use the token on /mcp:
# As X-API-Key header
curl -X POST http://localhost:3402/mcp \
-H "X-API-Key: pgt_eyJhcGl...signature" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"q":"hello"}}}'
# As Bearer token
curl -X POST http://localhost:3402/mcp \
-H "Authorization: Bearer pgt_eyJhcGl...signature" \
-H "Content-Type: application/json" \
-d '...'
Token behavior:
- Self-contained: HMAC-SHA256 signed, zero server-side state. Validated cryptographically on every request.
- Auto-expiry: TTL defaults to 1 hour, max 24 hours. Expired tokens are rejected instantly.
- Tool ACL narrowing: If allowedTools is set, the token can only call those tools (intersection with parent key's ACL).
- Credits from parent: Tool calls charge against the parent key's credit balance.
- tools/list filtering: When a scoped token calls tools/list, only the allowed tools are returned.
- Batch-aware: tools/call_batch checks scoped token ACL for every call in the batch.
- Resolution priority: X-API-Key header → pgt_ scoped token → OAuth Bearer token.
Token format: pgt_<base64url(JSON payload)>.<base64url(HMAC-SHA256 signature)>
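To make the token format concrete, here is a sketch of signing and verifying a token of that shape with Node's built-in crypto module. It is not PayGate's implementation: the exp field, the choice to sign the base64url-encoded payload string, and the function names are all assumptions for illustration. Use ScopedTokenManager for real tokens.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape; only apiKey and allowedTools are named in the docs.
interface TokenPayload {
  apiKey: string;
  exp: number; // expiry as epoch milliseconds (assumption)
  allowedTools?: string[];
  label?: string;
}

const sign = (body: string, secret: string): string =>
  createHmac("sha256", secret).update(body).digest("base64url");

// pgt_<base64url(JSON payload)>.<base64url(HMAC-SHA256 signature)>
function createToken(payload: TokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `pgt_${body}.${sign(body, secret)}`;
}

function verifyToken(token: string, secret: string, nowMs: number): TokenPayload | null {
  if (!token.startsWith("pgt_")) return null;
  const [body, signature, ...rest] = token.slice(4).split(".");
  if (!body || !signature || rest.length > 0) return null;
  const expected = Buffer.from(sign(body, secret));
  const actual = Buffer.from(signature);
  // Constant-time comparison; lengths must match before timingSafeEqual.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as TokenPayload;
  return payload.exp > nowMs ? payload : null; // reject expired tokens
}
```

Because the signature covers the payload, the server can validate a token without storing it, which is what makes the tokens self-contained.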
Programmatic usage:
import { ScopedTokenManager } from 'paygate-mcp';
const tokens = new ScopedTokenManager('your-signing-secret');
// Create
const token = tokens.create({
apiKey: 'pg_parent_key',
ttlSeconds: 300,
allowedTools: ['search'],
label: 'agent-42',
});
// Validate
const result = tokens.validate(token);
if (result.valid) {
console.log(result.payload.apiKey); // 'pg_parent_key'
console.log(result.payload.allowedTools); // ['search']
}
// Check if a string is a scoped token
ScopedTokenManager.isToken('pgt_...'); // true
ScopedTokenManager.isToken('pg_...'); // false
Token Revocation List
Revoke scoped tokens before they expire. Once revoked, the token is immediately rejected by all PayGate instances (synced via Redis pub/sub in multi-instance deployments).
Revoke a token (admin):
curl -X POST http://localhost:3402/tokens/revoke \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"token": "pgt_eyJhcGl...signature", "reason": "session ended"}'
Returns:
{
"message": "Token revoked",
"fingerprint": "a1b2c3d4e5f6...",
"expiresAt": "2025-06-15T12:05:00.000Z",
"revokedAt": "2025-06-15T11:30:00.000Z"
}
List revoked tokens (admin):
curl http://localhost:3402/tokens/revoked \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
Returns { count, entries: [{ fingerprint, expiresAt, revokedAt, reason }] }.
Revocation behavior:
- O(1) lookup: SHA-256 fingerprints stored in a Map for constant-time rejection checks.
- Auto-cleanup: Revocation entries are purged once the original token would have naturally expired (max 24h), so the list never grows unbounded.
- Redis sync: In multi-instance deployments, revocations are propagated via token_revoked pub/sub events. Other instances add the entry to their local revocation list immediately.
- Audit trail: Every revocation is logged as a token.revoked audit event with fingerprint and reason.
- Signature validation: Only tokens signed by this server can be revoked (prevents revoking arbitrary strings).
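The lookup and auto-cleanup behavior can be pictured with a small Map-backed list. This is a sketch under assumptions: the fingerprint is passed in as an already-computed string, timestamps are epoch milliseconds, and the class and method names are hypothetical (the real API is revokeToken and revocationList on ScopedTokenManager).

```typescript
interface RevocationEntry {
  fingerprint: string;
  expiresAt: number; // when the original token would have expired
  revokedAt: number;
  reason?: string;
}

class RevocationListSketch {
  private entries = new Map<string, RevocationEntry>();

  revoke(fingerprint: string, expiresAt: number, nowMs: number, reason?: string): RevocationEntry {
    const entry: RevocationEntry = { fingerprint, expiresAt, revokedAt: nowMs, reason };
    this.entries.set(fingerprint, entry);
    return entry;
  }

  // Constant-time membership check on the hot path.
  isRevoked(fingerprint: string): boolean {
    return this.entries.has(fingerprint);
  }

  // Drop entries whose token has expired anyway; returns how many were removed.
  purgeExpired(nowMs: number): number {
    let removed = 0;
    this.entries.forEach((entry, fingerprint) => {
      if (entry.expiresAt <= nowMs) {
        this.entries.delete(fingerprint);
        removed++;
      }
    });
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}
```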
Programmatic usage:
import { ScopedTokenManager } from 'paygate-mcp';
const tokens = new ScopedTokenManager('your-signing-secret');
const token = tokens.create({ apiKey: 'pg_key', ttlSeconds: 3600 });
// Revoke
const entry = tokens.revokeToken(token, 'session ended');
console.log(entry.fingerprint); // SHA-256 hex
// Validate: a revoked token is now rejected
tokens.validate(token); // { valid: false, reason: 'token_revoked' }
// Check revocation list size
tokens.revocationList.size; // 1
// Clean up on shutdown
tokens.destroy();
Usage-Based Auto-Topup
Automatically refill credits when a key's balance drops below a threshold. Prevents service interruptions for high-value API consumers.
Configure auto-topup (admin):
curl -X POST http://localhost:3402/keys/auto-topup \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"key": "pg_abc123...", "threshold": 100, "amount": 500, "maxDaily": 10}'
Returns:
{
"autoTopup": { "threshold": 100, "amount": 500, "maxDaily": 10 },
"message": "Auto-topup enabled: add 500 credits when balance drops below 100 (max 10/day)"
}
Disable auto-topup:
curl -X POST http://localhost:3402/keys/auto-topup \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"key": "pg_abc123...", "disable": true}'
Auto-topup behavior:
- Post-deduction trigger: After each tool call (or batch) deducts credits, the gate checks if credits fell below the threshold and automatically adds credits.
- Daily limits: maxDaily caps how many auto-topups can occur per UTC day. Set to 0 for unlimited.
- Audit trail: Every auto-topup is logged as a key.auto_topped_up audit event. Configuration changes are logged as key.auto_topup_configured.
- Webhook events: Both key.auto_topup_configured and key.auto_topped_up events are sent via webhooks.
- Redis sync: In multi-instance deployments, auto-topup credits are synced atomically via Redis.
- State persistence: Auto-topup config and daily counters are persisted in the state file and Redis.
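The post-deduction trigger and daily cap can be sketched as follows. The state shape (topupsToday, topupDay) and the function name are hypothetical; the threshold, amount, and maxDaily semantics follow the behavior list above.

```typescript
interface AutoTopupConfig {
  threshold: number;
  amount: number;
  maxDaily: number; // 0 = unlimited
}

interface KeyState {
  credits: number;
  autoTopup?: AutoTopupConfig;
  topupsToday: number;
  topupDay: string; // UTC date, e.g. "2025-01-15"
}

// Deduct the cost, then top up if the balance fell below the threshold and
// the daily cap has not been reached. Returns false if the call is denied.
function deductWithAutoTopup(key: KeyState, cost: number, now: Date): boolean {
  if (key.credits < cost) return false;
  key.credits -= cost;

  const cfg = key.autoTopup;
  if (!cfg || key.credits >= cfg.threshold) return true;

  const today = now.toISOString().slice(0, 10); // toISOString is always UTC
  if (key.topupDay !== today) {
    key.topupDay = today; // new UTC day: reset the counter
    key.topupsToday = 0;
  }
  if (cfg.maxDaily === 0 || key.topupsToday < cfg.maxDaily) {
    key.credits += cfg.amount;
    key.topupsToday += 1;
  }
  return true;
}
```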
Programmatic usage:
import { Gate } from 'paygate-mcp';
const gate = new Gate(config, 'state.json');
const record = gate.store.createKey('premium-client', 1000);
// Configure auto-topup
record.autoTopup = { threshold: 100, amount: 500, maxDaily: 5 };
gate.store.save();
// Hook for notifications
gate.onAutoTopup = (apiKey, amount, newBalance) => {
console.log(`Auto-topped up ${amount} credits → balance: ${newBalance}`);
};
// Gate.evaluate() automatically triggers auto-topup after credit deduction
const result = gate.evaluate(record.key, { name: 'expensive-tool' });
Admin API Key Management
Manage multiple admin keys with role-based permissions. The bootstrap admin key (from constructor or CLI) is always a super_admin.
Roles:
| Role | Description |
|---|---|
| super_admin | Full access, including admin key management |
| admin | All API key and system operations, but cannot manage admin keys |
| viewer | Read-only access to status, usage, analytics, audit, etc. |
Create an admin key (super_admin only):
curl -X POST http://localhost:3402/admin/keys \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"name": "CI Bot", "role": "admin"}'
# Returns: { "key": "ak_...", "name": "CI Bot", "role": "admin", "createdAt": "..." }
List admin keys (super_admin only):
curl http://localhost:3402/admin/keys \
-H "X-Admin-Key: $ADMIN_KEY"
# Returns masked keys with roles, status, and last used timestamps
Revoke an admin key (super_admin only):
curl -X POST http://localhost:3402/admin/keys/revoke \
-H "X-Admin-Key: $ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"key": "ak_..."}'
Behavior:
- The default role for POST /admin/keys is admin if not specified.
- Cannot revoke your own admin key (safety guard).
- Cannot revoke the last super_admin key (safety guard).
- viewer keys can access all read-only endpoints (GET) but are denied write operations (POST).
- admin keys can create/revoke/rotate API keys, manage teams, tokens, etc. but cannot manage admin keys.
- Admin keys are persisted to a separate file (*-admin.json) alongside the state file.
- All operations are logged in the audit trail (admin_key.created, admin_key.revoked).
- Webhook events are fired for admin key lifecycle changes.
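The role rules above amount to a small permission check. The following is an illustrative sketch under the assumption that permissions depend only on the role, the HTTP method, and the path; `isAllowed` is a hypothetical helper, not part of the PayGate API.

```typescript
type AdminRole = 'super_admin' | 'admin' | 'viewer';

// Returns true if `role` may perform `method` on `path`.
function isAllowed(role: AdminRole, method: string, path: string): boolean {
  // Admin key management (create, list, revoke) is super_admin only.
  const managesAdminKeys = path === '/admin/keys' || path.startsWith('/admin/keys/');
  if (managesAdminKeys) return role === 'super_admin';

  // Viewers are limited to read-only (GET) endpoints.
  if (role === 'viewer') return method.toUpperCase() === 'GET';

  // admin and super_admin may call everything else.
  return true;
}

const roleChecks = [
  isAllowed('viewer', 'GET', '/groups'),                  // read-only access
  isAllowed('viewer', 'POST', '/groups'),                 // write denied
  isAllowed('admin', 'POST', '/groups'),                  // normal admin operation
  isAllowed('admin', 'GET', '/admin/keys'),               // admin key listing denied
  isAllowed('super_admin', 'POST', '/admin/keys/revoke'), // full access
];
console.log(roleChecks);
```

The self-revocation and last-super_admin guards are separate checks on the target key and are not modeled here.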
Plugin System
Add custom logic to PayGate with the plugin API. Plugins can intercept gate decisions, transform pricing, modify tool requests/responses, add custom HTTP endpoints, and hook into server lifecycle events.
import { PayGateServer, PayGatePlugin } from 'paygate-mcp';
// Define a plugin
const loggingPlugin: PayGatePlugin = {
name: 'request-logger',
version: '1.0.0',
// Gate hooks (sync, hot path)
beforeGate: (ctx) => {
// Return { allowed: false, reason: '...' } to short-circuit
// Return null to continue normal evaluation
if (ctx.toolName === 'dangerous_tool') {
return { allowed: false, reason: 'tool_disabled' };
}
return null;
},
afterGate: (ctx, decision) => {
// Modify the gate decision after evaluation
console.log(`${ctx.toolName}: ${decision.allowed ? 'allowed' : 'denied'}`);
return decision;
},
transformPrice: (toolName, basePrice, args) => {
// Return a number to override price, or null to keep base price
if (toolName === 'premium_search') return basePrice * 2;
return null;
},
onDeny: (ctx, reason) => {
// Called whenever a tool call is denied
console.log(`Denied: ${ctx.toolName} (${reason})`);
},
// Tool hooks (async)
beforeToolCall: async (ctx) => {
// Modify the JSON-RPC request before forwarding
return { ...ctx.request, params: { ...ctx.request.params, audit: true } };
},
afterToolCall: async (ctx, response) => {
// Modify the JSON-RPC response before returning to client
return response;
},
// HTTP hook (async)
onRequest: (req, res) => {
// Add custom endpoints β return true if handled
if (req.url === '/custom/status') {
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ custom: true }));
return true;
}
return false;
},
// Lifecycle hooks (async)
onStart: async () => { console.log('Plugin started'); },
onStop: async () => { console.log('Plugin stopped'); },
};
// Register plugins with .use() (chainable)
const server = new PayGateServer({ ... });
server
.use(loggingPlugin)
.use(anotherPlugin);
await server.start();
Hook types:
| Hook | Sync/Async | Description |
|---|---|---|
| beforeGate | Sync | Short-circuit gate evaluation. First non-null result wins. |
| afterGate | Sync | Modify gate decision. Cascading (each plugin sees previous result). |
| transformPrice | Sync | Override tool pricing. First non-null number wins. |
| onDeny | Sync | Notification on denial. All plugins called. |
| beforeToolCall | Async | Modify JSON-RPC request before forwarding. Cascading. |
| afterToolCall | Async | Modify JSON-RPC response before returning. Cascading. |
| onRequest | Async | Add custom HTTP endpoints. First true return handles the request. |
| onStart | Async | Called after server starts. Registration order. |
| onStop | Async | Called before server stops. Reverse registration order. |
Error isolation: Plugin errors are caught and logged, so a crashing plugin never takes down the server.
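The "first non-null result wins", "cascading", and error-isolation semantics can be sketched with a reduced plugin shape. This is an illustration of the dispatch rules only: `MiniPlugin`, `runBeforeGate`, and `runAfterGate` are hypothetical names, and the real plugin manager may differ in detail.

```typescript
interface Decision { allowed: boolean; reason?: string }
interface GateContext { toolName: string }

// Reduced plugin shape: only the two gate hooks used in this sketch.
interface MiniPlugin {
  name: string;
  beforeGate?: (ctx: GateContext) => Decision | null;
  afterGate?: (ctx: GateContext, decision: Decision) => Decision;
}

// "First non-null result wins": stop at the first plugin that returns a decision.
function runBeforeGate(plugins: MiniPlugin[], ctx: GateContext): Decision | null {
  for (const p of plugins) {
    try {
      const result = p.beforeGate ? p.beforeGate(ctx) : null;
      if (result !== null) return result;
    } catch (err) {
      // Error isolation: log and continue with the next plugin.
      console.error(`plugin ${p.name} failed in beforeGate`, err);
    }
  }
  return null;
}

// "Cascading": each plugin receives the decision produced by the previous one.
function runAfterGate(plugins: MiniPlugin[], ctx: GateContext, initial: Decision): Decision {
  let decision = initial;
  for (const p of plugins) {
    try {
      if (p.afterGate) decision = p.afterGate(ctx, decision);
    } catch (err) {
      console.error(`plugin ${p.name} failed in afterGate`, err);
    }
  }
  return decision;
}

const demoPlugins: MiniPlugin[] = [
  { name: 'crashy', beforeGate: () => { throw new Error('boom'); } },
  {
    name: 'blocker',
    beforeGate: (ctx) =>
      ctx.toolName === 'dangerous_tool' ? { allowed: false, reason: 'tool_disabled' } : null,
  },
  { name: 'tagger', afterGate: (_ctx, d) => ({ ...d, reason: d.reason ?? 'ok' }) },
];

const blocked = runBeforeGate(demoPlugins, { toolName: 'dangerous_tool' });
const passedThrough = runBeforeGate(demoPlugins, { toolName: 'search' });
const finalDecision = runAfterGate(demoPlugins, { toolName: 'search' }, { allowed: true });
console.log(blocked, passedThrough, finalDecision);
```

The throwing `crashy` plugin is logged and skipped, so `blocker` still gets to deny `dangerous_tool`.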
List registered plugins (admin only):
curl http://localhost:3402/plugins -H "X-Admin-Key: $ADMIN_KEY"
# { "count": 2, "plugins": [{ "name": "...", "version": "...", "hooks": ["beforeGate", ...] }] }
Key Groups (Policy Templates)
Key groups let you define reusable policy templates and apply them to multiple API keys at once. Unlike teams (which share budgets), groups share policies: ACL, rate limits, pricing overrides, IP allowlists, and quotas.
Create a group:
curl -X POST http://localhost:3402/groups \
-H "X-Admin-Key: $ADMIN_KEY" \
-d '{
"name": "free-tier",
"allowedTools": ["search", "read_file"],
"rateLimitPerMin": 30,
"ipAllowlist": ["10.0.0.0/8"],
"quota": { "dailyCallLimit": 100, "monthlyCallLimit": 1000, "dailyCreditLimit": 50, "monthlyCreditLimit": 200 },
"toolPricing": { "search": { "creditsPerCall": 2 } },
"tags": { "tier": "free" }
}'
# { "id": "grp_a1b2c3...", "name": "free-tier", ... }
Assign keys to a group:
curl -X POST http://localhost:3402/groups/assign \
-H "X-Admin-Key: $ADMIN_KEY" \
-d '{ "groupId": "grp_a1b2c3...", "key": "pgk_..." }'
Policy resolution rules:
| Policy | Resolution |
|---|---|
| allowedTools | Key wins if non-empty, otherwise group |
| deniedTools | Union of both (most restrictive) |
| ipAllowlist | Union of both (additive) |
| rateLimitPerMin | Key wins if set, otherwise group |
| quota | Key wins if set, otherwise group |
| toolPricing | Group overrides global config |
| maxSpendingLimit | Group default (key can override via /limits) |
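As an illustration of the resolution rules in the table above, here is a minimal standalone sketch. The `Policy` shape and this `resolvePolicy` function are simplified stand-ins, not the library's own types; "set" is assumed to mean "not undefined", and toolPricing and maxSpendingLimit are left out for brevity.

```typescript
interface Quota { dailyCallLimit?: number; monthlyCallLimit?: number }

interface Policy {
  allowedTools?: string[];
  deniedTools?: string[];
  ipAllowlist?: string[];
  rateLimitPerMin?: number;
  quota?: Quota;
}

// Merge two string lists without duplicates.
function union(a: string[] = [], b: string[] = []): string[] {
  return [...a, ...b].filter((value, index, all) => all.indexOf(value) === index);
}

// Resolve the effective policy for a key that belongs to a group.
function resolvePolicy(key: Policy, group: Policy): Policy {
  const keyHasAllowList = key.allowedTools !== undefined && key.allowedTools.length > 0;
  return {
    // Key wins if non-empty, otherwise group.
    allowedTools: keyHasAllowList ? key.allowedTools : (group.allowedTools ?? []),
    // Union of both.
    deniedTools: union(key.deniedTools, group.deniedTools),
    ipAllowlist: union(key.ipAllowlist, group.ipAllowlist),
    // Key wins if set, otherwise group.
    rateLimitPerMin: key.rateLimitPerMin ?? group.rateLimitPerMin,
    quota: key.quota ?? group.quota,
  };
}

// Tool names below are hypothetical examples.
const effective = resolvePolicy(
  { allowedTools: [], deniedTools: ['delete_file'], rateLimitPerMin: 10 },
  {
    allowedTools: ['search', 'read_file'],
    deniedTools: ['exec'],
    ipAllowlist: ['10.0.0.0/8'],
    rateLimitPerMin: 30,
    quota: { dailyCallLimit: 100 },
  },
);
console.log(effective);
```

Here the key's empty allowedTools falls back to the group's list, while its explicit rateLimitPerMin of 10 overrides the group's 30.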
List groups:
curl http://localhost:3402/groups -H "X-Admin-Key: $ADMIN_KEY"
# [{ "id": "grp_...", "name": "free-tier", "memberCount": 5, ... }]
Update / delete / remove:
# Update group policies
curl -X POST http://localhost:3402/groups/update \
-H "X-Admin-Key: $ADMIN_KEY" \
-d '{ "id": "grp_...", "rateLimitPerMin": 60 }'
# Remove a key from its group
curl -X POST http://localhost:3402/groups/remove \
-H "X-Admin-Key: $ADMIN_KEY" \
-d '{ "key": "pgk_..." }'
# Delete a group (removes all assignments)
curl -X POST http://localhost:3402/groups/delete \
-H "X-Admin-Key: $ADMIN_KEY" \
-d '{ "id": "grp_..." }'
Programmatic usage:
import { PayGateServer, KeyGroupManager } from 'paygate-mcp';
const server = new PayGateServer({ ... });
const { port, adminKey } = await server.start();
// Access groups directly
const group = server.groups.createGroup({ name: 'enterprise', rateLimitPerMin: 1000 });
server.groups.assignKey(apiKey, group.id);
// Resolve effective policy for a key
const policy = server.groups.resolvePolicy(apiKey, keyRecord);
// { allowedTools, deniedTools, rateLimitPerMin, quota, ipAllowlist, toolPricing, maxSpendingLimit }
File persistence: When using --state-file, group definitions and key assignments are automatically saved to a *-groups.json file alongside the main state file. Groups survive restarts without needing Redis.
Redis sync: When running with --redis-url, group definitions and key assignments are additionally persisted to Redis and synced across instances via pub/sub. All group CRUD operations and assignment changes propagate in real-time to other PayGate processes.
Horizontal Scaling (Redis)
Enable Redis-backed state for multi-process deployments. Multiple PayGate instances share API keys, credits, and usage data through Redis:
# Single instance with Redis persistence
npx paygate-mcp wrap --server "your-mcp-server" --redis-url "redis://localhost:6379"
# With password and database
npx paygate-mcp wrap --server "your-mcp-server" \
--redis-url "redis://:mypassword@redis.internal:6379/2"
Or in a config file:
{
"serverCommand": "your-mcp-server",
"redisUrl": "redis://localhost:6379"
}
Architecture: Write-Through Cache
PayGate uses a write-through cache pattern for maximum performance:
- Reads: Served from in-memory KeyStore (zero latency, no Redis round-trip)
- Writes: Propagated to Redis for cross-process shared state
- Credit deduction: Uses Redis Lua scripts for atomic check-and-deduct (prevents double-spend across processes)
- Periodic sync: Local caches refresh from Redis every 5 seconds as a safety net
- Pub/sub notifications: Key mutations and credit changes propagate to all instances in real-time via Redis PUBLISH/SUBSCRIBE (sub-millisecond latency)
This means Gate.evaluate() stays synchronous and fast, while credit operations remain atomic across your entire fleet. The server automatically wires Redis hooks into the gate β every usage event and credit deduction flows to Redis without any code changes. Pub/sub ensures other instances see changes near-instantly (no 5-second wait).
What Gets Synced
| State | Redis Key Pattern | Sync Method |
|---|---|---|
| API keys | pg:key:<keyId> (Hash) | Write-through + pub/sub + periodic refresh |
| Key registry | pg:keys (Set) | Write-through |
| Credit deduction | pg:key:<keyId> | Atomic Lua script + pub/sub broadcast |
| Credit top-up | pg:key:<keyId> | Atomic Lua script + pub/sub broadcast |
| Admin mutations | pg:key:<keyId> (Hash) | Write-through (all admin endpoints) |
| Rate limiting | pg:rate:<key> (Sorted Set) | Atomic Lua (sliding window) |
| Usage events | pg:usage (List) | Fire-and-forget RPUSH |
| Cross-instance events | pg:events (Pub/Sub) | PUBLISH/SUBSCRIBE with inline data |
Deployment Pattern
            ┌──────────────┐
            │   Redis 7+   │
            │  ┌────────┐  │
            │  │pub/sub │  │
            └──┴───┬────┴──┘
                   │
      ┌────────────┼────────────┐
      │            │            │
┌─────┴─────┐  ┌───┴───┐  ┌─────┴─────┐
│ PayGate 1 │  │ PG 2  │  │ PayGate 3 │
│ (sub+pub) │  │ (sub) │  │ (sub+pub) │
└─────┬─────┘  └───┬───┘  └─────┬─────┘
      │            │            │
┌─────┴────────────┴────────────┴─────┐
│            Load Balancer            │
└─────────────────────────────────────┘
Real-Time Pub/Sub: When one instance creates/revokes a key or changes credits, it publishes an event to the pg:events channel. All other instances receive it near-instantly and update their local KeyStore without waiting for the 5-second sync. Credit changes include inline data (credits, totalSpent, totalCalls) so receivers skip the Redis round-trip entirely. Each instance has a unique ID for self-message filtering, which prevents echo loops. If pub/sub fails, the periodic sync continues as a fallback.
Admin API Sync: All admin HTTP endpoints (create key, revoke, rotate, topup, set ACL, expiry, quota, tags, IP allowlist, spending limit) write through to Redis. Topup and revoke use atomic Lua scripts; other mutations use fire-and-forget HSET to propagate changes across instances immediately.
Distributed Rate Limiting: Rate limits are enforced atomically across all instances using Redis sorted sets with Lua scripts. Each rate check does ZREMRANGEBYSCORE + ZCARD + ZADD in a single round-trip, preventing burst bypass across processes. Fails open (allows the request) if Redis is temporarily unavailable.
Persistent Usage Audit Trail: Usage events are appended to a Redis list (RPUSH), creating a shared audit trail visible from any instance. Events survive process restarts and are queryable from the dashboard. Max 100k events with automatic trimming.
Graceful Fallback: If Redis is temporarily unavailable, PayGate falls back to local in-memory operations. On reconnect, state syncs automatically.
Zero Dependencies: The Redis client uses Node.js net.Socket with raw RESP protocol encoding. No ioredis and no redis package, only built-in networking.
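The sliding-window check described under Distributed Rate Limiting can be illustrated in memory. This sketch mirrors the three sorted-set steps (ZREMRANGEBYSCORE, ZCARD, ZADD) with a plain array per key; it is not the Lua script PayGate ships, `checkRateLimit` is a hypothetical name, and recording only allowed calls is an assumption.

```typescript
// In-memory stand-in for the Redis sorted sets: one timestamp list per API key.
const windows: Record<string, number[] | undefined> = {};

// Returns true if the call is allowed (and records it); false if the limit is hit.
function checkRateLimit(key: string, limitPerMin: number, nowMs: number): boolean {
  const windowStart = nowMs - 60000;

  // Step 1 (ZREMRANGEBYSCORE): drop timestamps that have left the window.
  const recent = (windows[key] ?? []).filter((t) => t > windowStart);

  // Step 2 (ZCARD): count what remains.
  const allowed = recent.length < limitPerMin;

  // Step 3 (ZADD): record this call. Assumption: only allowed calls are recorded.
  if (allowed) recent.push(nowMs);

  windows[key] = recent;
  return allowed;
}

const base = 1000000;
const rateResults = [
  checkRateLimit('pg_demo', 2, base),         // first call in the window
  checkRateLimit('pg_demo', 2, base + 1000),  // second call, limit now reached
  checkRateLimit('pg_demo', 2, base + 2000),  // denied: two calls in the last 60s
  checkRateLimit('pg_demo', 2, base + 61001), // allowed: both earlier calls expired
];
console.log(rateResults); // [ true, true, false, true ]
```

In Redis the three steps must run inside one script so that two instances cannot both pass the count check before either records its call.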
Config File Mode
Load all settings from a JSON file instead of CLI flags:
npx paygate-mcp wrap --config paygate.json
Example paygate.json:
{
"serverCommand": "npx",
"serverArgs": ["@modelcontextprotocol/server-filesystem", "/tmp"],
"port": 3402,
"defaultCreditsPerCall": 2,
"globalRateLimitPerMin": 30,
"webhookUrl": "https://billing.example.com/events",
"webhookFilters": [
{
"name": "production-alerts",
"events": ["key.created", "key.revoked", "alert.fired"],
"url": "https://alerts.example.com/webhook",
"keyPrefixes": ["pk_prod_"]
}
],
"refundOnFailure": true,
"stateFile": "~/.paygate/state.json",
"toolPricing": {
"premium_analyze": { "creditsPerCall": 10, "creditsPerKbInput": 5 }
},
"globalQuota": {
"dailyCallLimit": 1000,
"monthlyCreditLimit": 50000
},
"oauth": {
"accessTokenTtl": 3600,
"scopes": ["tools:*"]
},
"redisUrl": "redis://localhost:6379",
"importKeys": {
"pg_abc123def456": 500
}
}
CLI flags override config file values when both are specified.
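The precedence rule can be pictured as a field-by-field merge. This is a minimal sketch that assumes an omitted CLI flag is represented as undefined; `PayGateSettings` lists only a few of the fields from the example above, and `mergeSettings` is a hypothetical helper.

```typescript
interface PayGateSettings {
  port?: number;
  defaultCreditsPerCall?: number;
  globalRateLimitPerMin?: number;
  stateFile?: string;
}

// CLI values win; flags that were not passed (undefined) fall back to the file.
function mergeSettings(file: PayGateSettings, cli: PayGateSettings): PayGateSettings {
  return {
    port: cli.port ?? file.port,
    defaultCreditsPerCall: cli.defaultCreditsPerCall ?? file.defaultCreditsPerCall,
    globalRateLimitPerMin: cli.globalRateLimitPerMin ?? file.globalRateLimitPerMin,
    stateFile: cli.stateFile ?? file.stateFile,
  };
}

const mergedSettings = mergeSettings(
  { port: 3402, defaultCreditsPerCall: 2, globalRateLimitPerMin: 30 }, // from paygate.json
  { port: 3000 },                                                      // from the command line
);
console.log(mergedSettings.port, mergedSettings.defaultCreditsPerCall); // 3000 2
```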
Config Hot Reload
Reload pricing, rate limits, webhooks, quotas, and behavior flags from your config file without restarting the server:
# Reload from the config file used at startup
curl -X POST http://localhost:3402/config/reload \
-H "X-Admin-Key: YOUR_ADMIN_KEY"
# One-time reload from a different config file
curl -X POST http://localhost:3402/config/reload \
-H "X-Admin-Key: YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{"configPath": "/path/to/updated-config.json"}'
Hot-reloadable fields (take effect immediately):
- defaultCreditsPerCall, toolPricing: pricing changes
- globalRateLimitPerMin: rate limit adjustment
- shadowMode, refundOnFailure: behavior flags
- freeMethods: free method list
- globalQuota: daily/monthly call and credit limits
- webhookUrl, webhookSecret, webhookMaxRetries: webhook infrastructure (rebuilt)
- alertRules: alert thresholds and rules
Non-reloadable fields (reported as skipped, require restart):
- serverCommand, serverArgs: backend MCP server process
- port: listening port
- oauth: OAuth 2.1 configuration
Response includes changed fields, skipped fields, and any validation warnings:
{
"ok": true,
"changed": ["defaultCreditsPerCall", "globalRateLimitPerMin"],
"skipped": [],
"warnings": [],
"message": "Config reloaded: 2 fields updated"
}
The config file is validated before applying changes; invalid configs are rejected with detailed error messages and zero changes applied.
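How a reload might sort fields into changed and skipped can be sketched as a diff over the two field lists above. This is an assumption-laden illustration, not the server's code: `planReload` and `applyReload` are hypothetical names, and it assumes that skipped lists only non-reloadable fields whose values actually differ.

```typescript
type Config = Record<string, unknown>;

// Field lists taken from the documentation above.
const RELOADABLE = [
  'defaultCreditsPerCall', 'toolPricing', 'globalRateLimitPerMin', 'shadowMode',
  'refundOnFailure', 'freeMethods', 'globalQuota', 'webhookUrl', 'webhookSecret',
  'webhookMaxRetries', 'alertRules',
];
const RESTART_ONLY = ['serverCommand', 'serverArgs', 'port', 'oauth'];

// Structural comparison, adequate for plain JSON values with stable key order.
function differs(a: unknown, b: unknown): boolean {
  return JSON.stringify(a) !== JSON.stringify(b);
}

// Classify differing fields into hot-reloadable and restart-only.
function planReload(current: Config, next: Config): { changed: string[]; skipped: string[] } {
  return {
    changed: RELOADABLE.filter((f) => differs(current[f], next[f])),
    skipped: RESTART_ONLY.filter((f) => differs(current[f], next[f])),
  };
}

// Apply only the reloadable changes; restart-only fields keep their old values.
function applyReload(current: Config, next: Config): Config {
  const updated: Config = { ...current };
  for (const f of planReload(current, next).changed) updated[f] = next[f];
  return updated;
}

const running: Config = { port: 3402, defaultCreditsPerCall: 1, globalRateLimitPerMin: 60 };
const onDisk: Config = { port: 4000, defaultCreditsPerCall: 2, globalRateLimitPerMin: 30 };
const reloadPlan = planReload(running, onDisk);
const reloaded = applyReload(running, onDisk);
console.log(reloadPlan, reloaded);
```

In this run `port` is reported as skipped and stays at 3402, while both pricing and rate-limit changes take effect.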
Deployment
One-Click Deploy
Deploy PayGate to your preferred cloud platform:
Render:
https://render.com/deploy?repo=https://github.com/walker77/paygate-mcp
Fly.io:
fly launch --image ghcr.io/walker77/paygate-mcp:latest --name my-paygate
fly secrets set PAYGATE_ADMIN_KEY=your-admin-key PAYGATE_REMOTE_URL=https://your-mcp-server.com/mcp
Docker
# Build the image
docker build -t paygate-mcp .
# Run with a remote MCP server
docker run -d \
-p 3000:3000 \
-v paygate-data:/data \
-e PAYGATE_REMOTE_URL="https://my-mcp-server.com/mcp" \
-e PAYGATE_ADMIN_KEY="your-admin-key" \
paygate-mcp
# Run with environment variables
docker run -d \
-p 3000:3000 \
-e PAYGATE_PORT=3000 \
-e PAYGATE_REMOTE_URL="https://api.example.com/mcp" \
-e PAYGATE_DEFAULT_CREDITS=5 \
-e PAYGATE_RATE_LIMIT=120 \
-e PAYGATE_WEBHOOK_URL="https://hooks.example.com/paygate" \
paygate-mcp
Docker Compose (with Redis)
# Set your MCP server URL and start
MCP_REMOTE_URL="https://my-mcp-server.com/mcp" docker-compose up -d
# View logs
docker-compose logs -f paygate
# Check health
curl http://localhost:3000/health
The included docker-compose.yml starts PayGate with Redis for horizontal scaling, state persistence, and distributed rate limiting.
systemd
# /etc/systemd/system/paygate-mcp.service
[Unit]
Description=PayGate MCP Proxy
After=network.target
[Service]
Type=simple
User=paygate
WorkingDirectory=/opt/paygate-mcp
ExecStart=/usr/bin/node dist/cli.js wrap \
--remote-url "https://my-mcp-server.com/mcp" \
--port 3000 \
--state-file /var/lib/paygate/state.json \
--audit-file /var/log/paygate/audit.jsonl
Restart=always
RestartSec=5
Environment=NODE_ENV=production
[Install]
WantedBy=multi-user.target
sudo systemctl enable paygate-mcp
sudo systemctl start paygate-mcp
sudo journalctl -u paygate-mcp -f
PM2
# Install globally
npm install -g paygate-mcp
# Start with PM2
pm2 start paygate-mcp -- wrap \
--remote-url "https://my-mcp-server.com/mcp" \
--port 3000 \
--state-file ./state.json
# Or use ecosystem file
pm2 start ecosystem.config.js
Production Checklist
- Set --state-file for persistent storage across restarts
- Set --audit-file for audit trail retention
- Configure --webhook-url for external billing/alerting
- Use --admin-key or set PAYGATE_ADMIN_KEY (auto-generated if omitted)
- Enable Redis (--redis-url) for multi-instance deployments
- Set up reverse proxy (nginx/caddy) with TLS termination
- Configure --cors-origin for browser-based clients
- Monitor /health endpoint with your uptime checker
- Scrape /metrics with Prometheus for observability
- Back up state file regularly (or use Redis persistence)
Load Testing
A k6 load test script is included for production benchmarking:
# Install k6
brew install k6 # macOS
# or: https://k6.io/docs/getting-started/installation
# Start server (example: echo backend)
npx paygate-mcp wrap -- echo '{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"ok"}]}}' \
--port 3000 --credits-per-call 1
# Run with admin key (from server startup output)
K6_ADMIN_KEY=admin_xxxx k6 run load-test.js
# Custom VUs and duration
K6_ADMIN_KEY=admin_xxxx k6 run --vus 100 --duration 60s load-test.js
# Against remote deployment
K6_PAYGATE_URL=https://paygate.example.com K6_ADMIN_KEY=admin_xxxx k6 run load-test.js
Scenarios:
- mcp_traffic: Simulates agent tool calls (ramp 0 to 50 VUs over 10s, sustain 30s)
- admin_reads: Dashboard/analytics reads (5 constant VUs)
- health_checks: Load balancer probes (10 req/s constant rate)
Thresholds:
- p(95) response time < 200ms
- p(99) response time < 500ms
- Error rate < 5%
- Request rate > 100 req/s
Error Codes
HTTP Status Codes
| Code | Meaning | When |
|---|---|---|
| 200 | OK | Successful read/update operations |
| 201 | Created | Key, team, group, or template created |
| 401 | Unauthorized | Missing or invalid admin key |
| 402 | Payment Required | Insufficient credits for tool call |
| 403 | Forbidden | IP not in allowlist, ACL denied |
| 404 | Not Found | Key, template, group, or resource not found |
| 405 | Method Not Allowed | Wrong HTTP method for endpoint |
| 409 | Conflict | Duplicate alias, template name collision |
| 429 | Too Many Requests | Rate limit exceeded |
| 503 | Service Unavailable | Maintenance mode or server shutting down |
JSON-RPC Error Codes (MCP /mcp endpoint)
| Code | Name | Description |
|---|---|---|
| -32402 | insufficient_credits | API key has zero credits remaining |
| -32402 | rate_limited | Request rate exceeds per-key or per-tool limit |
| -32402 | quota_exceeded | Daily/monthly call or credit quota exceeded |
| -32402 | spending_limit_reached | Cumulative spend exceeds key spending limit |
| -32402 | key_suspended | API key is temporarily suspended |
| -32402 | key_expired | API key TTL has elapsed |
| -32402 | acl_denied | Tool not in key's ACL whitelist |
| -32402 | ip_not_allowed | Client IP not in key's allowlist |
| -32402 | invalid_api_key | X-API-Key header not recognized |
| -32402 | maintenance_mode | Server in maintenance mode |
| -32003 | circuit_breaker_open | Backend unavailable, circuit breaker is open |
| -32004 | tool_timeout | Tool call exceeded configured timeout |
| -32600 | invalid_request | Malformed JSON-RPC request body |
| -32601 | method_not_found | Unknown MCP method |
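A client can branch on these codes before deciding how to react. The sketch below classifies by code only; the `ErrorClass` labels and `classifyError` are hypothetical. Because ten different conditions share -32402, telling them apart requires the error name, and how that name is carried in the error object is not specified in this section.

```typescript
interface JsonRpcError { code: number; message: string }

type ErrorClass =
  | 'gate_denied'          // -32402: auth, credits, rate limit, quota, ACL, etc.
  | 'backend_unavailable'  // -32003: circuit breaker open
  | 'tool_timeout'         // -32004: tool call exceeded its timeout
  | 'bad_request'          // -32600 / -32601: malformed request or unknown method
  | 'unknown';

// Map a JSON-RPC error code from the /mcp endpoint to a coarse class.
function classifyError(err: JsonRpcError): ErrorClass {
  switch (err.code) {
    case -32402: return 'gate_denied';
    case -32003: return 'backend_unavailable';
    case -32004: return 'tool_timeout';
    case -32600:
    case -32601: return 'bad_request';
    default: return 'unknown';
  }
}

const errorClasses = [-32402, -32003, -32004, -32601, -32000].map((code) =>
  classifyError({ code, message: '' }),
);
console.log(errorClasses);
```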
Webhook Event Types
| Event | Trigger |
|---|---|
| key.created | New API key provisioned |
| key.revoked | API key permanently revoked |
| key.suspended | API key temporarily suspended |
| key.resumed | Suspended key reactivated |
| key.rotated | API key rotated to new value |
| key.topup | Credits added to key |
| key.expired | Key TTL elapsed |
| key.expiry_warning | Key approaching expiry |
| credit.transfer | Credits moved between keys |
| credit.auto_topup | Auto-topup triggered |
| usage | Batched tool call events |
Programmatic API
import { PayGateServer } from 'paygate-mcp';
// Wrap a local server (stdio)
const server = new PayGateServer({
serverCommand: 'npx',
serverArgs: ['@modelcontextprotocol/server-filesystem', '/tmp'],
port: 3402,
defaultCreditsPerCall: 1,
toolPricing: {
'premium_analyze': { creditsPerCall: 10 }
},
});
const { port, adminKey } = await server.start();
// Multi-server mode
const multiServer = new PayGateServer(
{ serverCommand: '', port: 3402, defaultCreditsPerCall: 1 },
undefined, undefined, undefined, undefined,
[
{ prefix: 'fs', serverCommand: 'npx', serverArgs: ['@modelcontextprotocol/server-filesystem', '/tmp'] },
{ prefix: 'api', remoteUrl: 'https://my-mcp-server.example.com/mcp' },
]
);
// With Redis for horizontal scaling
const redisServer = new PayGateServer(
{ serverCommand: 'npx', serverArgs: ['my-mcp-server'], port: 3402, defaultCreditsPerCall: 1 },
undefined, undefined, undefined, undefined, undefined,
'redis://localhost:6379'
);
// Client SDK
import { PayGateClient } from 'paygate-mcp/client';
const client = new PayGateClient({
url: `http://localhost:${port}`,
apiKey: 'pg_...',
});
const tools = await client.listTools();
const result = await client.callTool('search', { query: 'hello' });
Security
- Cryptographic API key generation (pg_ prefix, 48 hex chars)
- Keys masked in list endpoints
- Integer-only credits (no float precision attacks)
- 1MB request body limit
- Input sanitization on all endpoints
- Admin key never exposed in responses
- API keys never forwarded to remote servers (HTTP transport)
- Rate limiting is per-key, concurrent-safe
- Stripe webhook signature verification (HMAC-SHA256, timing-safe)
- Dashboard uses safe DOM methods (textContent/createElement), no innerHTML
- Webhook HMAC-SHA256 signatures with timing-safe verification
- Webhook URLs masked in status output
- Spending limits enforced with integer arithmetic (no float bypass)
- Per-tool ACL enforcement (whitelist + blacklist, sanitized inputs)
- Key expiry with fail-closed behavior (expired = denied)
- OAuth 2.1 with PKCE (S256): no implicit grant, no plain challenge
- OAuth tokens are opaque hex strings (no JWT data leakage)
- Quota counters reset atomically at UTC boundaries
- SSE sessions auto-expire (30 min), max 1000 concurrent, max 3 SSE per session
- Audit log with retention policies (ring buffer, age-based cleanup)
- API keys masked in audit events (only first 7 + last 4 chars visible)
- Discovery endpoints (/.well-known/mcp-payment, /pricing) are public but read-only
- Team budgets enforce integer arithmetic (no float bypass)
- Keys masked in team usage summaries (first 7 + last 4 chars only)
- Team quota resets atomic at UTC day/month boundaries
- Redis credit deduction uses Lua scripts for atomic check-and-deduct (no double-spend)
- Redis rate limiting uses Lua scripts for atomic check-and-record (no burst bypass)
- Redis auth supported via password in URL (redis://:password@host:port)
- Graceful Redis fallback: local operations continue if Redis disconnects
- Rate limiter fails open on Redis error (allows request, never blocks on network issues)
- Pub/sub self-message filtering via unique instance IDs (no echo loops)
- Pub/sub subscriber uses a dedicated Redis connection (required by Redis protocol)
- Red-teamed with 101 adversarial security tests across 14 passes
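The "first 7 + last 4 chars" masking rule mentioned in the list above can be sketched as follows. `maskKey` is a hypothetical helper, and the "..." filler and the handling of very short keys are assumptions; the list only states which characters stay visible.

```typescript
// Keep the first 7 and last 4 characters of an API key, hide the rest.
function maskKey(key: string): string {
  const head = 7;
  const tail = 4;
  // Keys too short to mask safely are hidden entirely (assumption).
  if (key.length <= head + tail) return '***';
  return `${key.slice(0, head)}...${key.slice(-tail)}`;
}

// Example key in the documented shape: pg_ prefix followed by 48 hex characters.
const maskedKey = maskKey('pg_0123456789abcdef0123456789abcdef0123456789abcdef');
console.log(maskedKey); // pg_0123...cdef
```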
Tested With
PayGate is integration-tested against popular MCP servers from the official @modelcontextprotocol npm scope. These tests wrap real MCP servers via npx, execute tool calls through the PayGate proxy, and verify that auth gating, credit billing, and rate limiting work correctly end-to-end.
| MCP Server | Type | Tests | What's Verified |
|---|---|---|---|
| @modelcontextprotocol/server-everything | stdio | 4 | Tool discovery, math tool execution, credit deduction, credit blocking |
| @modelcontextprotocol/server-filesystem | stdio | 4 | File write/read through gate, credit deduction, credit blocking |
| @modelcontextprotocol/server-memory | stdio | 4 | Entity CRUD, knowledge graph search, credit deduction, credit blocking |
| @modelcontextprotocol/server-sequential-thinking | stdio | 4 | Sequential thinking flow, credit deduction, credit blocking |
Cross-server tests verify admin endpoints (/health, /keys, /balance) work identically regardless of the wrapped backend. All 16 integration tests pass.
# Run integration tests (requires internet; downloads MCP servers via npx)
npx vitest run tests/real-mcp-servers.test.ts
Current Limitations
- No response size limits for HTTP transport: Large responses from remote servers are forwarded as-is.
- Redis key metadata syncs on write: Admin mutations write through to Redis immediately; pub/sub delivers near-instant cross-instance updates; periodic sync (5s) serves as a safety net. Credits, rate limits, and usage are always atomic.
- SSE sessions are per-instance: Each PayGate instance manages its own SSE connections (HTTP streams can't be serialized to Redis).
Roadmap
- Persistent storage (--state-file)
- Streamable HTTP transport (--remote-url)
- Stripe webhook integration (--stripe-secret)
- Client self-service balance check (/balance)
- Usage data export: JSON and CSV (/usage)
- Admin web dashboard (/dashboard)
- Per-key spending limits (/limits)
- Webhook events (--webhook-url)
- Refund on failure (--refund-on-failure)
- Config file mode (--config)
- Per-tool ACL: whitelist/blacklist tools per key
- Per-tool rate limits: independent limits per tool
- Key expiry (TTL): auto-expire API keys
- Multi-server mode: wrap N MCP servers behind one PayGate
- Client SDK: PayGateClient with auto 402 retry
- Usage quotas: daily/monthly call and credit limits per key
- Dynamic pricing: charge by input size (creditsPerKbInput)
- OAuth 2.1: PKCE, client registration, Bearer tokens, token refresh/revocation
- SSE streaming: Full MCP Streamable HTTP transport with session management
- Audit log: Structured audit trail with retention, query API, CSV/JSON export
- Registry/discovery: Agent-discoverable pricing (/.well-known/mcp-payment, /pricing, tools/list _pricing)
- Prometheus metrics: /metrics endpoint with counters, gauges, and uptime
- Key rotation: Rotate API keys preserving credits, ACLs, quotas, and spending limits
- Rate limit headers: X-RateLimit-* and X-Credits-Remaining on /mcp responses
- Webhook signatures: HMAC-SHA256 signed payloads with timing-safe verification
- Admin lifecycle events: Webhook notifications for key management operations
- IP allowlisting: Restrict API keys to specific IPs or CIDR ranges
- Key tags/metadata: Attach key-value tags for external system integration
- Usage analytics: Time-series analytics API with tool breakdown, trends, and top consumers
- Alert webhooks: Configurable threshold alerts (spending, credits, quota, expiry, rate limits)
- Team management: Group API keys with shared budgets, quotas, and usage tracking
- Horizontal scaling: Redis-backed state for multi-process deployments
- Batch tool calls: tools/call_batch with all-or-nothing billing and parallel execution
- Multi-tenant namespaces: Isolate API keys and usage data by tenant with namespace-filtered endpoints
- Scoped tokens: Short-lived pgt_ tokens with tool ACL narrowing, HMAC-SHA256 signed, zero server-side state
- Token revocation list: Revoke scoped tokens before expiry with O(1) lookup, auto-cleanup, Redis sync
- Usage-based auto-topup: Automatically refill credits when balance drops below threshold with daily limits
- Admin API key management: Multiple admin keys with role-based permissions (super_admin, admin, viewer)
- Webhook filters: Route events to multiple destinations by event type and key prefix with independent retry queues
- Credit transfers: Atomically transfer credits between API keys with validation and audit trail
- Bulk key operations: Execute multiple create/topup/revoke operations in one request with per-operation error handling
- Key import/export: Export keys (JSON/CSV) for backup/migration, import with conflict resolution (skip, overwrite, error modes)
- Webhook event replay: Replay dead letter entries (all or by index) with fresh delivery attempt and audit trail
Requirements
- Node.js >= 18.0.0
- Zero external dependencies
License
MIT
