Lynkr
Lynkr is a CLI tool that runs a local HTTP proxy between Claude Code CLI and the LLM provider of your choice.
Installation
npx lynkr
Lynkr
Run Claude Code, Cursor, and Codex on any model. One proxy, every provider.
| 12+ LLM Providers | 60-80% Cost Reduction | 699 Tests Passing | 0 Code Changes Required |
The Problem
AI coding tools lock you into one provider. Claude Code requires Anthropic. Codex requires OpenAI. You can't use your company's Databricks endpoint, your local Ollama models, or your AWS Bedrock account, at least not without Lynkr.
The real costs:
- Anthropic API at $15/MTok output adds up fast for daily coding
- No way to use free local models (Ollama, llama.cpp) with Claude Code
- Enterprise teams can't route through their own cloud infrastructure
- Provider outages take your entire workflow down
The Solution
Lynkr is a self-hosted proxy that sits between your AI coding tools and any LLM provider. One environment variable change, and your tools work with any model.
Claude Code / Cursor / Codex / Cline / Continue / Vercel AI SDK
|
Lynkr
|
Ollama | Bedrock | Databricks | OpenRouter | Azure | OpenAI | llama.cpp
# That's it. Three lines.
npm install -g lynkr
export ANTHROPIC_BASE_URL=http://localhost:8081
lynkr start
Quick Start
Install
One-line install (recommended):
curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
Or via npm:
npm install -g pino-pretty && npm install -g lynkr
Pick a Provider
Free & Local (Ollama)
export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:latest
lynkr start
AWS Bedrock (100+ models)
export MODEL_PROVIDER=bedrock
export AWS_BEDROCK_API_KEY=your-key
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
lynkr start
OpenRouter (cheapest cloud)
export MODEL_PROVIDER=openrouter
export OPENROUTER_API_KEY=sk-or-v1-your-key
lynkr start
Connect Your Tool
Claude Code
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=dummy
claude "Your prompt here"
Codex CLI (edit ~/.codex/config.toml):
model_provider = "lynkr"
model = "gpt-4o"
[model_providers.lynkr]
name = "Lynkr Proxy"
base_url = "http://localhost:8081/v1"
wire_api = "responses"
Cursor IDE
- Settings > Features > Models
- Base URL: http://localhost:8081/v1
- API Key: sk-lynkr
Vercel AI SDK
import { generateText } from "ai";
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
const lynkr = createOpenAICompatible({
  baseURL: "http://localhost:8081/v1",
  name: "lynkr",
  apiKey: "sk-lynkr",
});
const { text } = await generateText({
  model: lynkr.chatModel("auto"),
  prompt: "Hello!",
});
OpenClaw
// openclaw.json
{
  "models": {
    "providers": [{
      "name": "lynkr",
      "type": "openai-compatible",
      "base_url": "http://localhost:8081/v1",
      "api_key": "any-value",
      "models": ["auto"]
    }]
  }
}
Set OPENCLAW_MODE=true in Lynkr's .env to show actual provider/model in responses.
Works with any OpenAI-compatible client: Cline, Continue.dev, OpenClaw, KiloCode, and more.
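For clients without a dedicated integration, a request can be assembled by hand. The following is a minimal sketch, assuming Lynkr serves the standard OpenAI-style `/chat/completions` route under the `/v1` base URL shown above. `buildChatRequest` is a hypothetical helper (not part of Lynkr), and the `auto` model name and `sk-lynkr` key are borrowed from the Vercel AI SDK example.

```typescript
// Hypothetical helper: builds (but does not send) an OpenAI-style chat request
// aimed at a local Lynkr instance. The /chat/completions path is an assumption
// based on the OpenAI-compatible base URL used elsewhere in this README.
interface LynkrChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  prompt: string,
  baseURL: string = "http://localhost:8081/v1",
  model: string = "auto",
): LynkrChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseURL.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer sk-lynkr", // placeholder key
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const helloRequest = buildChatRequest("Hello!");
console.log(helloRequest.url);
```

To send it, pass both parts to `fetch(helloRequest.url, helloRequest.init)` while Lynkr is running.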
Supported Providers
| Provider | Type | Models | Cost |
|---|---|---|---|
| Ollama | Local | Unlimited (free, offline) | Free |
| llama.cpp | Local | Any GGUF model | Free |
| LM Studio | Local | Local models with GUI | Free |
| MLX Server | Local | Apple Silicon optimized | Free |
| AWS Bedrock | Cloud | 100+ (Claude, Llama, Mistral, Titan) | $$ |
| OpenRouter | Cloud | 100+ (GPT, Claude, Llama, Gemini) | $-$$ |
| Databricks | Cloud | Claude Sonnet 4.5, Opus 4.6 | $$$ |
| Azure OpenAI | Cloud | GPT-4o, o1, o3 | $$$ |
| Azure Anthropic | Cloud | Claude models | $$$ |
| OpenAI | Cloud | GPT-4o, o3, o4-mini | $$$ |
| Google Vertex | Cloud | Gemini 2.5 Pro/Flash | $$$ |
| Moonshot AI | Cloud | Kimi K2 Thinking/Turbo | $$ |
| Z.AI | Cloud | GLM-4.7 | $$ |
| DeepSeek | Cloud | DeepSeek Reasoner, R1 | $ |
4 local providers for 100% offline, free usage. 10+ cloud providers for scale.
Why Lynkr Over Alternatives
| Feature | Lynkr | LiteLLM (42K stars) | OpenRouter | PortKey |
|---|---|---|---|---|
| Setup | npm install -g lynkr | Python + Docker + Postgres | Account signup | Docker + config |
| Claude Code support | Drop-in, native | Requires config | No CLI support | Requires config |
| Cursor support | Drop-in, native | Partial | Via API key | Partial |
| Codex CLI support | Drop-in, native | No | No | No |
| Built for coding tools | Yes (purpose-built) | No (general gateway) | No (general API) | No (general gateway) |
| Local models | Ollama, llama.cpp, LM Studio, MLX | Ollama only | No | No |
| Token optimization | Built-in (60-80% savings) | No | No | Caching only |
| Complexity routing | Auto-routes by task difficulty | Manual | Cost/latency only | Manual |
| Memory system | Titans-inspired long-term memory | No | No | No |
| Self-hosted | Yes (Node.js) | Yes (Python stack) | No (SaaS) | Yes (Docker) |
| Offline capable | Yes | Yes | No | No |
| Transaction fees | None | None (OSS) / Paid enterprise | 5.5% on credits | Free tier / Paid |
| Dependencies | Node.js only | Python, Prisma, PostgreSQL | N/A | Docker, Python |
| Format conversion | Anthropic <-> OpenAI (automatic) | Automatic | N/A | Automatic |
| Code intelligence | Graphify (19-lang AST graph) | No | No | No |
| Routing telemetry | Built-in (SQLite + REST API) | No | Dashboard | Dashboard |
| Admin hot-reload | Yes (no restart) | Requires restart | N/A | Requires restart |
| License | Apache 2.0 | MIT | Proprietary | MIT (gateway) |
Lynkr's edge: Purpose-built for AI coding tools. It is not a general LLM gateway but a proxy that understands Claude Code, Cursor, and Codex natively, with built-in token optimization, complexity-based routing, and a memory system designed for coding workflows. It installs in one command, runs on Node.js, and requires no additional infrastructure.
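To make the "Format conversion" row in the table above concrete, here is a simplified sketch of translating an Anthropic Messages request into an OpenAI chat completions request. It is an illustration under stated assumptions, not Lynkr's actual converter: it handles text content only (no tool calls, images, or streaming), and `anthropicToOpenAI` and `targetModel` are hypothetical names.

```typescript
// Simplified shapes for the two wire formats (text-only in this sketch).
type TextBlock = { type: "text"; text: string };

interface AnthropicRequest {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: "user" | "assistant"; content: string | TextBlock[] }[];
}

interface OpenAIChatRequest {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// Hypothetical converter, not Lynkr's real implementation.
function anthropicToOpenAI(req: AnthropicRequest, targetModel: string): OpenAIChatRequest {
  const messages: OpenAIChatRequest["messages"] = [];
  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message in the list.
  if (req.system) {
    messages.push({ role: "system", content: req.system });
  }
  for (const m of req.messages) {
    // Anthropic content is either a plain string or a list of blocks.
    const content =
      typeof m.content === "string"
        ? m.content
        : m.content.map((block) => block.text).join("");
    messages.push({ role: m.role, content });
  }
  return { model: targetModel, max_tokens: req.max_tokens, messages };
}
```

The reverse direction (OpenAI response back to Anthropic format) follows the same idea with the field mapping inverted.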
Cost Comparison
| Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter | Lynkr + Bedrock |
|---|---|---|---|---|
| Daily Claude Code usage | ~$10-30/day | $0 (free) | ~$2-8/day | ~$5-15/day |
| Token optimization savings | N/A | N/A | 60-80% further | 60-80% further |
| Monthly (heavy use) | $300-900 | $0 | $60-240 | $150-450 |
With token optimization enabled, Lynkr's smart tool selection, prompt caching, and memory deduplication reduce token usage by 60-80% on top of provider savings.
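As a rough worked example of what a 60-80% token reduction means in dollars, the sketch below applies a reduction factor to output-token spend. It assumes the $15/MTok output price quoted earlier and a hypothetical workload of one million output tokens per day. Input tokens and provider-side cache discounts are ignored, so treat the result as an order-of-magnitude estimate.

```typescript
// Estimates daily output-token spend after a fractional token reduction.
// Only output tokens are modeled; input tokens are ignored in this sketch.
function dailyOutputCost(
  outputTokensPerDay: number,
  pricePerMTokUsd: number,
  tokenReduction: number, // 0.6 means 60% fewer tokens are billed
): number {
  const billedTokens = outputTokensPerDay * (1 - tokenReduction);
  const cost = (billedTokens / 1e6) * pricePerMTokUsd;
  return Math.round(cost * 100) / 100; // round to cents
}

// Hypothetical workload: 1M output tokens per day at $15/MTok.
const baselineCost = dailyOutputCost(1e6, 15, 0);
const optimizedLow = dailyOutputCost(1e6, 15, 0.6);
const optimizedHigh = dailyOutputCost(1e6, 15, 0.8);
console.log({ baselineCost, optimizedLow, optimizedHigh });
```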
What's Under the Hood
Lynkr isn't just a passthrough proxy. It's an optimization layer.
Smart Routing (5-Phase)
Routes requests to the right model based on 5-phase complexity analysis. Simple questions go to fast/cheap models. Complex architectural tasks go to powerful models. Includes Graphify structural analysis for code-aware routing.
- Complexity scoring: 15-dimension weighted scoring with agentic workflow detection
- Graphify integration: AST-based knowledge graph detects god nodes, community cohesion, blast radius across 19 languages
- Routing telemetry: every decision recorded with quality scoring (0-100) and latency tracking (P50/P95/P99)
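The idea behind weighted complexity scoring can be sketched in a few lines. This is an illustration only: Lynkr's scorer uses 15 dimensions that this README does not enumerate, so the three signals, their weights, the normalization caps, and the tier thresholds below are all hypothetical.

```typescript
// Hypothetical request signals; the real scorer's dimensions are not listed here.
interface RequestSignals {
  promptTokens: number;
  filesReferenced: number;
  toolCallsExpected: number;
}

// Illustrative weights that sum to 1.
const SIGNAL_WEIGHTS: Record<keyof RequestSignals, number> = {
  promptTokens: 0.5,
  filesReferenced: 0.3,
  toolCallsExpected: 0.2,
};

// Upper bounds used to normalize each signal into the 0..1 range.
const SIGNAL_CAPS: Record<keyof RequestSignals, number> = {
  promptTokens: 8000,
  filesReferenced: 10,
  toolCallsExpected: 5,
};

// Weighted sum of normalized signals: about 0 for trivial, about 1 for very complex.
function complexityScore(signals: RequestSignals): number {
  let score = 0;
  for (const key of Object.keys(SIGNAL_WEIGHTS) as (keyof RequestSignals)[]) {
    const normalized = Math.min(signals[key] / SIGNAL_CAPS[key], 1);
    score += SIGNAL_WEIGHTS[key] * normalized;
  }
  return score;
}

// Maps a score to a model tier; thresholds are arbitrary for illustration.
function pickTier(score: number): "fast" | "balanced" | "powerful" {
  if (score < 0.3) return "fast";
  if (score < 0.7) return "balanced";
  return "powerful";
}

const quickQuestion = { promptTokens: 200, filesReferenced: 0, toolCallsExpected: 0 };
console.log(pickTier(complexityScore(quickQuestion)));
```

A short question with no file references lands in the cheapest tier, while a large multi-file refactor saturates every signal and is routed to the most capable model.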
Token Optimization (7 Phases)
- Smart tool selection: only sends tools relevant to the current task
- Code Mode: replaces 100+ MCP tools with 4 meta-tools (~96% token reduction)
- Distill compression: structural similarity, delta rendering, smart dedup of repetitive tool outputs
- Prompt caching: SHA-256 keyed LRU cache
- Memory deduplication: eliminates repeated information across turns
- History compression: sliding window with Distill-powered structural dedup
- Headroom sidecar: optional 47-92% ML-based compression (Smart Crusher, CCR, LLMLingua)
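The prompt-caching item in the list above names a SHA-256 keyed LRU cache. A minimal sketch of that technique follows, using Node's built-in `crypto` module and a `Map` for recency ordering. The `PromptCache` class and its capacity parameter are hypothetical and are not taken from Lynkr's source.

```typescript
import { createHash } from "node:crypto";

// Minimal LRU cache keyed by the SHA-256 digest of the prompt.
// A Map iterates in insertion order, so its first key is the least recently used.
class PromptCache {
  private entries = new Map<string, string>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  private digest(prompt: string): string {
    return createHash("sha256").update(prompt).digest("hex");
  }

  get(prompt: string): string | undefined {
    const key = this.digest(prompt);
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit;
  }

  set(prompt: string, response: string): void {
    const key = this.digest(prompt);
    this.entries.delete(key);
    this.entries.set(key, response);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

Hashing the prompt keeps keys at a fixed length regardless of prompt size, which matters when prompts include large tool catalogs or file contents.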
Enterprise Resilience
- Circuit breakers: automatic failover with half-open probe recovery
- Admin hot-reload: POST /v1/admin/reload reloads config and resets circuit breakers without a restart
- Load shedding: graceful degradation under high load
- Prometheus metrics: full observability at /metrics
- Health checks: K8s-ready endpoints at /health
- Performance timer: per-request timing breakdown with PERF_TIMER=true
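The circuit-breaker behavior named in the list above (fail over, then recover through a half-open probe) can be sketched as a small state machine. This is a generic illustration of the pattern, not Lynkr's implementation. The `maxFailures` and `cooldownMs` parameters are hypothetical, and the clock is passed in explicitly so the logic stays deterministic.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Opens after `maxFailures` consecutive failures. Once `cooldownMs` has elapsed
// it moves to half-open and allows probe requests: a probe success closes the
// breaker, a probe failure reopens it and restarts the cooldown.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private current: BreakerState = "closed";
  private maxFailures: number;
  private cooldownMs: number;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  state(now: number): BreakerState {
    if (this.current === "open" && now - this.openedAt >= this.cooldownMs) {
      this.current = "half-open";
    }
    return this.current;
  }

  canRequest(now: number): boolean {
    return this.state(now) !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.current = "closed";
  }

  recordFailure(now: number): void {
    const before = this.state(now);
    this.failures += 1;
    if (before === "half-open" || this.failures >= this.maxFailures) {
      this.current = "open";
      this.openedAt = now;
    }
  }
}
```

A caller would check `canRequest(Date.now())` before contacting a provider and fall back to the next provider while the breaker is open.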
Memory System
Titans-inspired long-term memory with surprise-based filtering. The system remembers important context across sessions and forgets noise, reducing token waste from repeated context.
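To give a feel for surprise-based filtering, the sketch below stores a piece of context only when it differs enough from what is already remembered. It is a deliberately simple stand-in: it measures surprise with Jaccard word overlap, which is an assumption made for illustration and not the mechanism Lynkr or the Titans architecture uses. `SurpriseMemory` and its threshold are hypothetical.

```typescript
// Lowercased set of word tokens in a text.
function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 0));
}

// Jaccard similarity: shared words divided by total distinct words.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared += 1;
  });
  return shared / (a.size + b.size - shared);
}

class SurpriseMemory {
  private items: string[] = [];
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Surprise = 1 - similarity to the closest stored memory (1 when empty).
  surprise(text: string): number {
    const candidate = wordSet(text);
    let closest = 0;
    for (const item of this.items) {
      closest = Math.max(closest, jaccard(candidate, wordSet(item)));
    }
    return 1 - closest;
  }

  // Stores the text only if it is surprising enough; returns whether it was kept.
  observe(text: string): boolean {
    if (this.surprise(text) < this.threshold) return false;
    this.items.push(text);
    return true;
  }

  size(): number {
    return this.items.length;
  }
}
```

Repeating a fact that is already stored yields a surprise near zero, so it is dropped instead of being stored and re-sent on later turns.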
Semantic Cache
Cache responses for semantically similar prompts. Hit rate depends on your workflow, but repeat questions (common in coding) get instant responses.
SEMANTIC_CACHE_ENABLED=true
SEMANTIC_CACHE_THRESHOLD=0.95
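A threshold such as 0.95 is typically compared against the cosine similarity between prompt embeddings. The sketch below shows that lookup under stated assumptions: embeddings are supplied by the caller (the embedding model is out of scope here), both vectors have the same length, and `SemanticCache` is a hypothetical class, not Lynkr's implementation.

```typescript
interface CachedAnswer {
  embedding: number[];
  response: string;
}

// Cosine similarity of two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class SemanticCache {
  private entries: CachedAnswer[] = [];
  private threshold: number;

  constructor(threshold: number = 0.95) {
    this.threshold = threshold;
  }

  store(embedding: number[], response: string): void {
    this.entries.push({ embedding, response });
  }

  // Returns the best cached response at or above the threshold, if any.
  lookup(embedding: number[]): string | undefined {
    let best: CachedAnswer | undefined;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best?.response;
  }
}
```

A lower threshold raises the hit rate but also the chance of returning an answer to a different question, which is the failure mode behind the "Same response for all queries" entry in Troubleshooting.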
MCP Integration + Code Mode
Automatic Model Context Protocol server discovery and orchestration. Your MCP tools work through Lynkr without configuration. Enable Code Mode to replace 100+ MCP tool definitions with 4 lightweight meta-tools:
CODE_MODE_ENABLED=true # ~96% reduction in tool-catalog tokens
Deployment Options
One-line install (recommended)
curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
NPM
npm install -g lynkr && lynkr start
Docker
docker-compose up -d
Git Clone
git clone https://github.com/Fast-Editor/Lynkr.git
cd Lynkr && npm install && cp .env.example .env
npm start
Homebrew
brew tap vishalveerareddy123/lynkr
brew install lynkr
Documentation
| Guide | Description |
|---|---|
| Installation | All installation methods |
| Provider Config | Setup for all 12+ providers |
| Claude Code CLI | Detailed Claude Code integration |
| Codex CLI | Codex config.toml setup |
| OpenClaw | OpenClaw integration with tier routing |
| Cursor IDE | Cursor integration + troubleshooting |
| Embeddings | @Codebase semantic search (4 options) |
| Token Optimization | 60-80% cost reduction strategies |
| Memory System | Titans-inspired long-term memory |
| Tools & Execution | Tool calling and execution modes |
| Smart Routing | Complexity-based model routing |
| Docker Deployment | docker-compose with GPU support |
| Production Hardening | Circuit breakers, metrics, load shedding |
| API Reference | All endpoints and formats |
| Troubleshooting | Common issues and solutions |
| FAQ | Frequently asked questions |
Troubleshooting
| Issue | Solution |
|---|---|
| Same response for all queries | Disable semantic cache: SEMANTIC_CACHE_ENABLED=false |
| Tool calls not executing | Increase threshold: POLICY_TOOL_LOOP_THRESHOLD=15 |
| Slow first request | Keep Ollama loaded: OLLAMA_KEEP_ALIVE=24h |
| Connection refused | Ensure Lynkr is running: lynkr start |
Contributing
We welcome contributions. See the Contributing Guide and Testing Guide.
License
Apache 2.0. See LICENSE.
Community
- GitHub Discussions: Questions and tips
- Report Issues: Bug reports and feature requests
- NPM Package: Official package
- DeepWiki: AI-powered docs search
Built by Vishal Veera Reddy for developers who want control over their AI tools.
