🦙 CF Llama Chat
Enterprise AI Chat for Cloud Foundry
A modern, enterprise-ready chat application built with Spring Boot 3.4 and Spring AI 1.1, designed for Tanzu Platform and Cloud Foundry deployments. Inspired by open-webui.
Multi-model chat through the Tanzu GenAI tile, agent-curated LLM Wiki, per-turn thinking-level control, Document RAG, MCP tool servers, multi-tenant organizations, and a full admin portal, all without leaving the OpenAI-compatible API surface.
Architecture
+---------------------------------------------------------------------+
|                   TANZU PLATFORM / CLOUD FOUNDRY                    |
+---------------------------------------------------------------------+
|                                                                     |
|  +-----------+    +---------------------------------------------+   |
|  |           |    |              CF LLAMA CHAT APP              |   |
|  |   Users   |--->|  +---------------------------------------+  |   |
|  |           |    |  |            Spring Boot 3.4            |  |   |
|  |           |    |  |   +----------+ +-------+ +--------+   |  |   |
|  +-----------+    |  |   | Chat +   | | Wiki  | | Admin  |   |  |   |
|                   |  |   |  SSE     | | Tools | | Portal |   |  |   |
|                   |  |   +----+-----+ +---+---+ +---+----+   |  |   |
|                   |  |        +-----------+---------+        |  |   |
|                   |  |                    |                  |  |   |
|                   |  |   +----------------v---------------+  |  |   |
|                   |  |   |         Spring AI 1.1          |  |  |   |
|                   |  |   |      +------------------+      |  |  |   |
|                   |  |   |      |  GenAI Locator   |      |  |  |   |
|                   |  |   |      +---------+--------+      |  |  |   |
|                   |  |   +----------------|---------------+  |  |   |
|                   |  +--------------------|------------------+  |   |
|                   +-----------------------|---------------------+   |
|                                           |                         |
|  +----------------------------------------v----------------------+  |
|  |                    VCAP_SERVICES BINDINGS                     |  |
|  |                                                               |  |
|  |  +---------------------------------------------------------+  |  |
|  |  |                    tanzu-all-models                     |  |  |
|  |  |               (GenAI Multi-Model Binding)               |  |  |
|  |  |   +-------------+   +-------------+   +-------------+   |  |  |
|  |  |   | gpt-oss:20b |   |  qwen3:4b   |   | nomic-embed |   |  |  |
|  |  |   | chat/reason |   |    chat     |   |  embedding  |   |  |  |
|  |  |   +-------------+   +-------------+   +-------------+   |  |  |
|  |  +---------------------------------------------------------+  |  |
|  |                                                               |  |
|  |  +---------------+    +---------------+    +---------------+  |  |
|  |  |  PostgreSQL   |    |  p-identity   |    |  MCP Servers  |  |  |
|  |  |  + pgvector   |    | OAuth2 (opt)  |    |  (optional)   |  |  |
|  |  +---------------+    +---------------+    +---------------+  |  |
|  +---------------------------------------------------------------+  |
|                                                                     |
+---------------------------------------------------------------------+
Users • Chat + Wiki • Embeddings • PostgreSQL + pgvector • SSO • MCP
Key Features
Tanzu GenAI Integration
- Multi-Model Binding: a single `tanzu-all-models` service discovers every model on the tile via `GenaiLocator.getModelNamesByCapability()`
- Backward Compatible: still supports individual `genai` service bindings and a legacy per-model plan
- Smart Routing: chat requests route to chat-capable models; embeddings route separately; mixed models in the same binding are automatically filtered
- External OpenAI-Compatible Bindings: add any OpenAI-compatible API at runtime through the admin portal; hot-reloaded, secure key storage, optional GenAI Locator config URL for auto-discovery
LLM Wiki (agent-curated knowledge base)
The chat model writes to a persistent, per-user wiki during normal conversation via six Spring AI `@Tool` methods. Durable facts, preferences, decisions, and project entities are saved automatically. A Caffeine-cached index block is injected into every system prompt so the assistant stays consistent across sessions.
+---------------------------------------------------------------+
|  USER: "i like tacos"                                         |
|                                                               |
|  ASSISTANT: "Got it! Tacos are now on your list..."           |
|    ├─ calls wiki_write(slug="preference/food",                |
|    │                   kind="PREFERENCE", ...)                |
|    └─ [ ▸ Details (1 wiki op) ]                               |
|         ├─ Thinking: the user stated a stable preference...   |
|         └─ WIKI OPS: Saved PREFERENCE preference/food         |
|                      [view] [undo]                            |
|                                                               |
|  (later, in a FRESH conversation)                             |
|  USER: "what's my favorite food?"                             |
|  ASSISTANT: "You've mentioned liking tacos."                  |
|    └─ reached the answer via wiki index block, not via        |
|       chat history                                            |
+---------------------------------------------------------------+
- Six `@Tool` methods: `wiki_search`, `wiki_read`, `wiki_write`, `wiki_link`, `wiki_invalidate`, `wiki_index`
- Page kinds: `FACT`, `PREFERENCE`, `DECISION`, `CONCEPT`, `ENTITY`, `EVENT`
- Undo: inline undo chip on every write. First-write undo deletes the page; multi-version undo restores prior content
- History + audit log: every write, link, undo, and invalidate is versioned and logged, with full revision history
- Hybrid search: vector search over page content via pgvector
- Unified workspace: browse, search, edit, and review revision history at `/workspace/wiki`
- Two-layer enable/disable: admin kill switch (`wiki.enabled` system setting) + per-user opt-out in settings
- One-shot migration: legacy `user_notes` + `user_memories` tables automatically migrated to `wiki_page` on first boot, then dropped
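The undo rule in the list above (first-write undo deletes the page, later undos restore the prior content) can be pictured as a version stack per slug. What follows is a minimal in-memory sketch of that rule only. It is not the project's `WikiService`: the class and method names are hypothetical, and the real application persists versions in the database.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration of the undo semantics; not the project's WikiService. */
final class WikiUndoSketch {

    // One stack of versions per slug; the newest version is on top.
    private final Map<String, Deque<String>> versions = new HashMap<>();

    /** Saves a new version of the page identified by slug. */
    void write(String slug, String content) {
        versions.computeIfAbsent(slug, k -> new ArrayDeque<>()).push(content);
    }

    /** Returns the current content, or empty if the page does not exist. */
    Optional<String> read(String slug) {
        Deque<String> stack = versions.get(slug);
        return stack == null ? Optional.empty() : Optional.of(stack.peek());
    }

    /** Drops the newest version; removes the page when no earlier version remains. */
    void undo(String slug) {
        Deque<String> stack = versions.get(slug);
        if (stack == null) {
            return; // nothing to undo
        }
        stack.pop();
        if (stack.isEmpty()) {
            versions.remove(slug); // first-write undo deletes the page
        }
    }
}
```

Writing `preference/food` twice and undoing once leaves the first version in place; undoing again removes the page.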
Thinking-Level Control
Per-turn segmented control in the chat input bar: None / Low / Med / High. Persisted in user preferences and sent with every chat request.
| Level | Effect | Best for |
|---|---|---|
| None | Suppresses reasoning entirely (/no_think for Qwen3 family; verbal directive for others) | Fast lookups, simple Q&A |
| Low | Brief reasoning, 1-2 sentences | Most everyday questions |
| Med | Default reasoning depth | General use |
| High | Step-by-step reasoning | Hard problems, planning, code review |
ThinkingOptionsBuilder maps the level per-model family (Qwen3 native directive, verbal system-prompt nudge for everything else). Runs cleanly against the Tanzu GenAI tile's OpenAI-compatible proxy.
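The per-family mapping can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `ThinkingOptionsBuilder`: the class name, the prefix check used to detect the Qwen3 family, and the wording of the verbal nudges are all hypothetical. Only the `/no_think` directive and the four levels come from the description above.

```java
import java.util.Locale;

/** Hypothetical illustration; not the project's ThinkingOptionsBuilder. */
final class ThinkingDirectiveSketch {

    enum Level { NONE, LOW, MEDIUM, HIGH }

    /** Assumption: a model is treated as Qwen3 family if its name starts with "qwen3". */
    static boolean isQwen3(String model) {
        return model != null && model.toLowerCase(Locale.ROOT).startsWith("qwen3");
    }

    /** Returns text to append to the system prompt, or "" when no nudge is needed. */
    static String directiveFor(String model, Level level) {
        if (level == Level.NONE && isQwen3(model)) {
            return "/no_think"; // native Qwen3 directive
        }
        // Verbal nudges for everything else; the exact wording is made up for this sketch.
        return switch (level) {
            case NONE -> "Answer directly without showing any reasoning.";
            case LOW -> "Reason briefly, in one or two sentences, before answering.";
            case MEDIUM -> ""; // default depth: leave the prompt untouched
            case HIGH -> "Reason step by step before giving the final answer.";
        };
    }

    public static void main(String[] args) {
        System.out.println(directiveFor("qwen3:4b", Level.NONE));
        System.out.println(directiveFor("gpt-oss:20b", Level.HIGH));
    }
}
```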
Polished Chat UI
- Collapsible Details panel: collapsed by default, one per assistant message. Contains both the model's internal reasoning (parsed from `<think>...</think>` blocks during streaming) and any wiki operations that fired during the turn
- Thinking indicator: pulse animation + "Thinking…" label on the streaming bubble while the model is inside a reasoning block
- Stream-aware markdown: debounced rendering at 100 ms during streaming, full re-render on completion
- Code + math + artifacts: syntax highlighting, LaTeX math via KaTeX, sandboxed HTML/SVG artifacts
- Streaming RAG URL prefix: `#https://…` in a user message auto-fetches the page (web) or transcript (YouTube) and injects it as context
- Per-message metrics: TTFT, tokens/sec, total time shown under each response
- CSP-clean: all JS in external files, no inline handlers, DOMPurify on every markdown-to-HTML path
Document RAG
+----------+   +--------------+   +--------------+   +--------------+
|   PDF    |   |   Extract    |   |    Chunk     |   |    Embed     |
|   Word   |-->|   Tika /     |-->| (350 tokens, |-->| nomic-embed  |
|   Text   |   |  Docling /   |   | 100 overlap) |   |              |
|   HTML   |   | Azure DocInt |   |              |   |              |
+----------+   +--------------+   +--------------+   +--------------+
                                                            |
                                                            v
+----------+   +--------------+   +--------------+   +--------------+
|          |   |   Semantic   |   | Full-doc or  |   |   PgVector   |
|  Query   |-->|    Search    |-->|   snippet    |-->|    Store     |
|          |   |   (top-k)    |   |  retrieval   |   |              |
+----------+   +--------------+   +--------------+   +--------------+
- Upload: PDF, Word, text, HTML, and more via Apache Tika. Optional Docling and Azure Document Intelligence extractors for better layout handling
- Pluggable storage: local filesystem, S3-compatible, Azure Blob, Google Cloud Storage
- Smart chunking: default 350 tokens / 100 overlap, tuned for nomic-embed's context window
- Two retrieval modes: `snippet` (matched chunks only) or `full` (all chunks from matched parent docs, grouped)
- Per-user isolation: each user's documents are private
- Hybrid search: vector + keyword combined for better recall on technical content
MCP (Model Context Protocol)
- Transport support: SSE and streamable HTTP; routed by `McpClientFactory` based on `McpTransportType`
- Auto-discovery: scans `VCAP_SERVICES` for the `mcpSseURL` credentials key or user-provided services tagged `mcpSseURL`
- Tools + Skills: MCP tools can be bundled into Skills (tools + prompt augmentation) for reusable agent behaviors
- Per-user permissions: access rules + user groups control which tools a user can call
Multi-Tenancy & Organizations
- Slug-based routing: access an org at `/{org-slug}`
- Full theming: colors, fonts, border radius, custom CSS, logo, favicon, welcome message per-org
- User groups: role-based grouping with fine-grained permissions
- Model access rules: restrict specific models to users or groups
- SCIM 2.0: user provisioning endpoint for enterprise identity providers
Workspace Features
All user-facing features grouped at /workspace:
| Feature | Purpose |
|---|---|
| Wiki | Agent-curated knowledge base (NEW: replaces Notes + Memory) |
| Channels | Group chat channels with persistent messages |
| Prompts | Reusable prompt presets with {{variable}} templates |
| Tools | Browse available MCP tools and toggle per-user access |
| Documents | Upload, manage, and search personal document library |
| Help | In-app guides for every feature |
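The `{{variable}}` placeholders in prompt presets suggest plain name-based substitution. A minimal sketch of such a renderer is shown below. The class name is hypothetical, and leaving unknown placeholders untouched is an assumption; the application may handle missing variables differently.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical {{variable}} renderer; not the project's implementation. */
final class PromptTemplateSketch {

    // Matches {{name}} and tolerates spaces inside the braces, e.g. {{ name }}.
    private static final Pattern VARIABLE = Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    static String render(String template, Map<String, String> values) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            // Assumption: unknown variables are left in place rather than removed.
            String replacement = values.getOrDefault(matcher.group(1), matcher.group(0));
            // quoteReplacement keeps "$" and "\" in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Summarize {{topic}} in {{count}} bullets",
                Map.of("topic", "pgvector", "count", "3")));
    }
}
```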
Admin Portal
All admin features grouped at /admin:
| Page | Purpose |
|---|---|
| Settings | Site config, feature flags (`wiki.enabled`, `feature.rag.enabled`, etc.), rate limits, maintenance mode |
| Users | Create, edit, reset passwords, manage roles, invitation codes |
| User Groups | Group-based access control |
| Organizations | Slug, theme, branding, SCIM config |
| Models | View discovered models, set access rules, configure defaults |
| External Bindings | Add/edit/remove OpenAI-compatible API endpoints at runtime |
| Tools | Manage custom and MCP-discovered tools |
| Skills | Bundle tools + prompt augmentation into reusable agent behaviors |
| MCP Servers | Configure SSE and streamable-HTTP MCP endpoints |
| Storage | Configure document storage backend (local / S3 / Azure / GCS) |
| Banners | Site-wide notification banners |
| Webhooks | Outbound event notifications |
| Database | DB stats, connection pool health |
Metrics & Observability
- Usage metrics: per-user and per-model token counts, TTFT, tokens/sec, total response time
- Embedding metrics: documents processed, chunks, characters, processing time
- Active user tracking: real-time session tracking via `ActiveUserTracker`
- OpenTelemetry: Micrometer Observation API, OTLP export, trace context filter
- Actuator: `/actuator/health`, `/actuator/info`, `/actuator/prometheus` (when enabled)
- Admin dashboard: live charts at `/admin/metrics`
Authentication & Security
                +-----------------+
                |  Auth Options   |
                +--------+--------+
                         |
    +---------+----------+----------+---------+
    |         |          |          |         |
    v         v          v          v         v
+------+  +------+  +--------+  +------+  +------+
|Local |  | SSO  |  |  LDAP  |  |Invite|  | SCIM |
|bcrypt|  | UAA  |  | AD/389 |  | Code |  | 2.0  |
+------+  +------+  +--------+  +------+  +------+
- Local auth: bcrypt password hashing, admin reset, user self-service change
- Enterprise SSO: OAuth2 via CF `p-identity` service (bound manually; see CLAUDE.md)
- LDAP: optional LDAP/AD backend, configurable via `auth.ldap.*`
- Invitation codes: gate registration with `app.auth.secret` / `APP_AUTH_SECRET`
- RBAC: Admin and User roles + per-group permissions
- CSP: strict `script-src 'self'`; all JS external; no inline handlers
- Rate limiting: configurable per-user request throttling via `RateLimitService`
- Prompt injection detection: heuristic scanning on every user message
- Redis session store: optional, enabled when `REDIS_HOST` is set
Quick Start: Tanzu Platform
# 1. Build the application
./mvnw clean package -DskipTests
# 2. Create services
cf create-service postgres on-demand-postgres-db enterprise-chat-db
cf create-service genai tanzu-all-models enterprise-chat-genai
# 3. Wait for services to finish provisioning
cf services # wait until both show "create succeeded"
# 4. Deploy
cf push -f manifest.yml
# 5. (Optional) Bind SSO manually after the app name stabilizes
cf create-service p-identity uaa enterprise-chat-sso
cf bind-service enterprise-chat-prod enterprise-chat-sso
cf restage enterprise-chat-prod
Admin Password & Auth Secret Setup
TL;DR: don't put these values in `manifest.yml` (it's in git). Either let the app generate a password on first boot, or set them with `cf set-env` after pushing. The app refuses to start on the `cloud` profile if you use a known-weak value.
What these two variables actually do
They sound similar, but they are completely different things:
| | `APP_ADMIN_DEFAULT_PASSWORD` | `APP_AUTH_SECRET` |
|---|---|---|
| Purpose | Initial password for the bootstrap admin user | Invitation code for self-registration |
| When it's used | Only on first boot, when the `users` table is empty. Ignored afterwards. | Every registration attempt, only if `APP_REQUIRE_INVITATION=true` |
| Who types it | The admin user at first login | Each new user registering at `/register` |
| If unset | A random 16-char password is generated and printed to `cf logs` | Self-registration is open (no gate) |
| Changing it later | Does nothing; change the admin's password through the UI instead | Immediate; next registrant needs the new value |
The weak-value guard
On the cloud profile (auto-activated in CF), SecurityStartupValidator refuses to start the app if either variable matches a well-known default:
| Variable | Blocked values |
|---|---|
| `APP_ADMIN_DEFAULT_PASSWORD` | `Tanzu123`, `tanzu123`, `admin`, `password`, `changeme` |
| `APP_AUTH_SECRET` | `changeme`, `changeme-cdc-wiki` |
The failure is only visible in `cf logs <app> --recent`; `cf push` just reports `All instances crashed` / `FAILED`. Grep for `Startup refused:`:
ERROR c.e.c.config.SecurityStartupValidator :
Startup refused: APP_ADMIN_DEFAULT_PASSWORD is set to a known-weak value.
Rotate via `cf set-env <app> APP_ADMIN_DEFAULT_PASSWORD <strong-value>` ...
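The guard described above boils down to comparing each variable against its list of blocked values and aborting startup on a match. The sketch below illustrates that check. It is not the project's `SecurityStartupValidator`: the class and method names are hypothetical, and exact, case-sensitive matching is an assumption (suggested by the table listing both `Tanzu123` and `tanzu123`).

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical illustration of the weak-value check; not SecurityStartupValidator itself. */
final class WeakSecretGuardSketch {

    // Blocked values copied from the table above; TreeMap gives a stable check order.
    static final Map<String, Set<String>> BLOCKED = new TreeMap<>(Map.of(
            "APP_ADMIN_DEFAULT_PASSWORD",
            Set.of("Tanzu123", "tanzu123", "admin", "password", "changeme"),
            "APP_AUTH_SECRET",
            Set.of("changeme", "changeme-cdc-wiki")));

    /** Returns the first variable whose value is on its blocklist, if any. */
    static Optional<String> firstWeakVariable(Map<String, String> env) {
        for (Map.Entry<String, Set<String>> rule : BLOCKED.entrySet()) {
            String value = env.get(rule.getKey());
            if (value != null && rule.getValue().contains(value)) {
                return Optional.of(rule.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        firstWeakVariable(System.getenv()).ifPresent(name -> {
            throw new IllegalStateException(
                    "Startup refused: " + name + " is set to a known-weak value.");
        });
        System.out.println("No known-weak values found.");
    }
}
```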
Recommended: let the app generate the password
cf push -f manifest.yml
cf logs enterprise-chat-prod --recent | grep -A1 "Generated admin password"
# ================================================================
# Generated admin password: 4f8a2e9c1d7b3a5e
# Change this immediately after first login!
# ================================================================
Log in with `admin` + that password, then change it via the UI (profile → change password). The env var is no longer needed after this.
Alternative: pin a strong password before first start
cf push -f manifest.yml --no-start
cf set-env enterprise-chat-prod APP_ADMIN_DEFAULT_PASSWORD "$(openssl rand -base64 24)"
cf start enterprise-chat-prod
Or, if you need to capture the value somewhere:
ADMIN_PW="$(openssl rand -base64 24)"
echo "$ADMIN_PW" > ~/.ent-chat-admin-pw # save somewhere safe, chmod 600
cf set-env enterprise-chat-prod APP_ADMIN_DEFAULT_PASSWORD "$ADMIN_PW"
cf restart enterprise-chat-prod
If you really want manifest-driven config
Use CF's built-in variable substitution with a gitignored vars file; never commit the real value:
# manifest.yml (committed to git)
env:
  APP_ADMIN_DEFAULT_PASSWORD: ((admin_password))
  APP_AUTH_SECRET: ((auth_secret))
# secrets.yml (add to .gitignore, chmod 600)
admin_password: H3re-is-a-strong-value-2026
auth_secret: and-a-different-strong-value-xyz
echo "secrets.yml" >> .gitignore
cf push -f manifest.yml --vars-file secrets.yml
Common mistakes
| What people try | What happens | Fix |
|---|---|---|
| Hardcoding the password in `manifest.yml` (committed to git) | Works, but leaks the secret into git history | Use `--vars-file` or `cf set-env` |
| Setting `APP_ADMIN_DEFAULT_PASSWORD: Tanzu123` | Validator refuses to start: "Startup refused: known-weak value" | Use a strong value |
| Setting `APP_ADMIN_DEFAULT_PASSWORD` after the admin user exists | No effect; it's inert after first boot | Change the password through the UI |
| Using unquoted password with YAML special chars (e.g. `P@ssw0rd!`) | YAML parser may misinterpret `!`/`@`/`#`/`&`/`*` | Single-quote the value: `APP_ADMIN_DEFAULT_PASSWORD: 'P@ssw0rd!'` |
| Setting `APP_AUTH_SECRET` but forgetting `APP_REQUIRE_INVITATION=true` | Invitation code is ignored; anyone can register | Set both, or unset `APP_AUTH_SECRET` |
| Expecting `cf push` to tell you why the app crashed | It doesn't; it only says `FAILED` | Always run `cf logs <app> --recent` after a crashed push |
Service Bindings
| Service | Plan | Required | Purpose |
|---|---|---|---|
| `postgres` | `on-demand-postgres-db` | Yes | Data + pgvector embeddings |
| `genai` | `tanzu-all-models` | Yes | Chat + embedding models |
| `p-identity` | `uaa` | No | SSO / OAuth2 (bind manually, not via manifest) |
| `enterprise-mcp-gateway` | any | No | Optional MCP tool servers |
Two manifests, two app names
| Manifest | App name | Binds |
|---|---|---|
| `manifest.yml` | `enterprise-chat-prod` | `enterprise-chat-db` + `enterprise-chat-genai` (manual / local) |
| `manifest-ci.yml` | `cf-llama-chat` | Individual model services (CI blue-green workflow) |
⚠️ SSO is intentionally omitted from both manifests. Binding `p-identity` via manifest during a CI blue-green push re-registers the OAuth client and invalidates the existing one. Bind manually once after the app name stabilizes.
Tech Stack
| Layer | Technology |
|---|---|
| Backend | Spring Boot 3.4, Spring AI 1.1, Java 21 |
| Frontend | Thymeleaf + vanilla JS + CSS3 (zero Node deps) |
| Database | PostgreSQL 15+ with pgvector extension |
| AI | Tanzu GenAI (primary), OpenAI, Ollama, any OpenAI-compatible API |
| Embeddings | nomic-embed-text-v2-moe (default), 512-dim vectors |
| Document extraction | Apache Tika, PDFBox, optional Docling / Azure Document Intelligence |
| Auth | Spring Security, OAuth2 client, LDAP, BCrypt |
| Caching | Caffeine (local), Redis (cluster, optional) |
| Storage | Local, S3-compatible, Azure Blob, Google Cloud Storage |
| Observability | Micrometer, OpenTelemetry, Actuator |
Configuration
Runtime environment variables
| Variable | Description | Default |
|---|---|---|
| `SPRING_PROFILES_ACTIVE` | Active profile | default |
| `APP_ADMIN_DEFAULT_PASSWORD` | First-boot admin password (random if unset). See Admin Password & Auth Secret Setup. | (unset) |
| `APP_AUTH_SECRET` | Invitation code for self-registration (only enforced when `APP_REQUIRE_INVITATION=true`). See Admin Password & Auth Secret Setup. | (empty) |
| `APP_REQUIRE_INVITATION` | Require invitation code to register | false |
| `CHAT_PROVIDER` | Default chat provider | openai |
| `EMBEDDING_MODEL` | Embedding model name | text-embedding-3-small |
| `EMBEDDING_DIMENSIONS` | Embedding vector size | 512 |
| `MAX_DOCUMENT_SIZE` | Max upload size in bytes | 104857600 (100 MB) |
| `MAX_DOCUMENTS_PER_USER` | Per-user document quota | 50 |
| `DOCUMENT_CHUNK_SIZE` | Tokens per chunk | 350 |
| `DOCUMENT_CHUNK_OVERLAP` | Chunk overlap tokens | 100 |
| `RAG_TOP_K` | Top-K results for RAG | 5 |
| `MODEL_ACCESS_CONTROL_ENABLED` | Per-model ACLs | false |
| `REDIS_HOST` | Enables Redis session store | (empty) |
| `OPENAI_API_KEY` | OpenAI direct API key (dev) | (empty) |
| `LDAP_ENABLED` | Enable LDAP auth backend | false |
Wiki-specific settings (application.yml)
app:
  wiki:
    index:
      max-entries: 40          # pages per-user in system-prompt index
      cache-ttl-seconds: 300   # Caffeine TTL (also invalidated on write)
    embedding:
      retry:
        interval-ms: 300000    # background retry interval for failed embeds
    search:
      default-k: 6
      max-k: 20
API Reference
Chat APIs
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/chat` | Non-streaming chat completion |
| `POST` | `/api/chat/stream` | Streaming chat with SSE; `event: message` for tokens, `event: wiki_op` for live wiki mutations |
| `GET` | `/api/chat/models` | List available chat models from all bindings |
| `GET` | `/api/chat/available-tools` | List MCP tools available to the current user |
| `GET` | `/api/chat/available-skills` | List skills available to the current user |
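A client of the streaming endpoint has to separate `message` events from `wiki_op` events. The sketch below parses a server-sent-events payload into named events following the standard SSE framing (`event:` and `data:` fields, blank line as terminator). The class name is hypothetical and the payload format inside `data:` is not specified here, so it is kept as an opaque string.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal SSE parser for illustration; handles only event: and data: fields. */
final class SseParserSketch {

    record Event(String name, String data) {}

    static List<Event> parse(String stream) {
        List<Event> events = new ArrayList<>();
        String name = "message"; // SSE default when no event: field is sent
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        for (String line : stream.split("\\r?\\n", -1)) {
            if (line.isEmpty()) {
                // A blank line dispatches the event collected so far.
                if (hasData) {
                    events.add(new Event(name, data.toString()));
                }
                name = "message";
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("event:")) {
                name = fieldValue(line, "event:".length());
            } else if (line.startsWith("data:")) {
                if (hasData) {
                    data.append('\n'); // multiple data: lines are joined with newlines
                }
                data.append(fieldValue(line, "data:".length()));
                hasData = true;
            }
            // Other fields (id:, retry:) and ":" comment lines are ignored in this sketch.
        }
        return events;
    }

    /** Strips the field name and at most one leading space from the value. */
    private static String fieldValue(String line, int prefixLength) {
        String value = line.substring(prefixLength);
        return value.startsWith(" ") ? value.substring(1) : value;
    }
}
```

A caller would typically append `message` data to the chat bubble and route `wiki_op` data to the details panel.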
ChatRequest fields (selected):
{
  "conversationId": "uuid | null",
  "message": "string (required)",
  "provider": "genai | openai | ollama | external",
  "model": "gpt-oss:20b",
  "skillId": "uuid | null",
  "useDocumentContext": false,
  "ragRetrievalMode": "snippet | full | null",
  "useTools": true,
  "temporary": false,
  "thinkingLevel": "none | low | medium | high"
}
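As a rough illustration of how these fields fit together, the sketch below builds (but does not send) a `POST /api/chat` request with the JDK's `java.net.http` API. The class name, the base URL argument, and the hand-rolled JSON escaping are assumptions made for the example. Authentication (session cookie, CSRF token, or similar) is deliberately left out because it is not described in this section.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical request builder for illustration; authentication is intentionally omitted. */
final class ChatRequestSketch {

    static HttpRequest build(String baseUrl, String message, String thinkingLevel) {
        // Minimal escaping for the demo: backslashes and double quotes only.
        String escaped = message.replace("\\", "\\\\").replace("\"", "\\\"");
        String body = """
                {
                  "conversationId": null,
                  "message": "%s",
                  "provider": "genai",
                  "model": "gpt-oss:20b",
                  "useTools": true,
                  "thinkingLevel": "%s"
                }
                """.formatted(escaped, thinkingLevel);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

The resulting request can be passed to `java.net.http.HttpClient#send` once the required credentials have been added.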
Wiki APIs
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/wiki/pages` | List current user's pages; `?kind=…&limit=…` |
| `GET` | `/api/wiki/pages/{id}` | Single page detail |
| `PUT` | `/api/wiki/pages/{id}` | Direct user edit (routes through `WikiService.upsert` so history + log + events all fire) |
| `POST` | `/api/wiki/pages/{id}/undo` | Restore prior version; deletes the page entirely if there's no history |
| `GET` | `/api/wiki/pages/{id}/history` | All revisions of a page |
| `GET` | `/api/wiki/search?q=…&kind=…&k=…` | Vector search over page content |
| `GET` | `/api/wiki/log?limit=…` | Recent wiki activity |
| `GET` | `/api/wiki/feature-status` | `{adminEnabled, userEnabled, effective}` for UI gating |
When admin disables the feature (SystemSetting wiki.enabled = false), all endpoints return 404 except /feature-status.
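The two-layer gate and the 404 rule can be summarized in a few lines. This sketch assumes that `effective` is simply the logical AND of the admin switch and the per-user setting, which matches the kill-switch-plus-opt-out description but is not confirmed here. The class and method names are hypothetical.

```java
/** Hypothetical illustration of the two-layer wiki gate; not the project's WikiFeatureService. */
final class WikiFeatureGateSketch {

    record Status(boolean adminEnabled, boolean userEnabled, boolean effective) {}

    /** Assumption: the feature is effective only when both layers are enabled. */
    static Status status(boolean adminEnabled, boolean userEnabled) {
        return new Status(adminEnabled, userEnabled, adminEnabled && userEnabled);
    }

    /** True when a wiki endpoint should answer 404 because the admin switch is off. */
    static boolean hiddenByAdmin(String path, boolean adminEnabled) {
        return !adminEnabled && !path.equals("/api/wiki/feature-status");
    }

    public static void main(String[] args) {
        System.out.println(status(true, false));
        System.out.println(hiddenByAdmin("/api/wiki/pages", false));
    }
}
```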
Document APIs
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/documents/upload` | Upload document (multipart) |
| `GET` | `/api/documents` | List user's documents |
| `GET` | `/api/documents/{id}` | Document metadata |
| `DELETE` | `/api/documents/{id}` | Delete document and embeddings |
| `GET` | `/api/documents/search?q=…` | Semantic + keyword search |
Conversation APIs
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/conversations` | List conversations |
| `GET` | `/api/conversations/{id}` | Get conversation with messages |
| `POST` | `/api/conversations/{id}/share` | Create shareable link |
| `POST` | `/api/conversations/{id}/export` | Export as Markdown |
| `POST` | `/api/chat-folders` | Create folder |
| `POST` | `/api/tags` | Create conversation tag |
Admin APIs
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/admin/users` | List users |
| `POST` | `/api/admin/users` | Create user |
| `POST` | `/api/admin/settings` | Set system setting by key |
| `POST` | `/api/admin/mcp/servers` | Create MCP server |
| `GET` | `/api/admin/tools` | List registered tools |
| `POST` | `/api/admin/skills` | Create skill (tools + prompt) |
| `GET` | `/api/admin/external-bindings` | List external API bindings |
| `POST` | `/api/admin/external-bindings` | Add external API binding |
| `PUT` | `/api/admin/external-bindings/{id}` | Update binding |
| `PUT` | `/api/admin/external-bindings/{id}/enabled` | Toggle |
| `POST` | `/api/admin/external-bindings/{id}/reload` | Force re-discover models |
| `GET` | `/api/admin/organizations` | List orgs |
| `POST` | `/api/admin/webhooks` | Register outbound webhook |
User Preferences APIs
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/user/preferences` | Get user preferences blob (theme, language, wikiEnabled, thinkingLevel, etc.) |
| `PUT` | `/api/user/preferences` | Merge-update preferences |
| `PUT` | `/api/user/preferences/theme` | Set theme (light / dark / oled) |
| `PUT` | `/api/user/preferences/background` | Set chat background |
| `PUT` | `/api/user/preferences/language` | Set UI language (en / es / fr / de / ja / zh) |
| `PUT` | `/api/user/preferences/rag-retrieval-mode` | `snippet` or `full` |
| `PUT` | `/api/user/preferences/wiki-enabled` | Toggle per-user wiki opt-out |
SCIM 2.0 APIs
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/scim/v2/Users` | List users |
| `POST` | `/scim/v2/Users` | Create user |
| `GET` | `/scim/v2/Users/{id}` | Get user |
| `PUT` | `/scim/v2/Users/{id}` | Replace user |
| `DELETE` | `/scim/v2/Users/{id}` | Delete user |
Local Development
Prerequisites
- Java 21+
- Maven 3.8+ (or use the bundled `./mvnw`)
- PostgreSQL 15+ with pgvector extension
- OpenAI API key (quickest path) or a local Ollama instance
Setup
git clone https://github.com/nkuhn-vmw/cf-llama-chat.git
cd cf-llama-chat
# Set environment for OpenAI
export OPENAI_API_KEY=sk-...
# Or for Ollama
export CHAT_PROVIDER=ollama
export OLLAMA_BASE_URL=http://localhost:11434
./mvnw spring-boot:run
# Open http://localhost:8080
Running tests
./mvnw test # full suite
./mvnw -Dtest=WikiIntegrationTest test # wiki round-trip
./mvnw -Dtest=ChatControllerTest test # chat controller slice
./mvnw -Dtest='*WikiTest,*ChatTest' test # pattern match
Local environment variables
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | - |
| `OPENAI_MODEL` | OpenAI model | gpt-4o-mini |
| `CHAT_PROVIDER` | AI provider | openai |
| `OLLAMA_BASE_URL` | Ollama URL | http://localhost:11434 |
| `OLLAMA_MODEL` | Ollama model | llama3.2 |
| `SPRING_DATASOURCE_URL` | Postgres URL | jdbc:postgresql://localhost:5432/cfchat |
| `SPRING_DATASOURCE_USERNAME` | DB user | cfchat |
| `SPRING_DATASOURCE_PASSWORD` | DB password | cfchat |
Project Structure
src/main/java/com/example/cfchat/
├── CfLlamaChatApplication.java  # @SpringBootApplication + @EnableAsync + @EnableScheduling
│
├── auth/  # Spring Security, UserService, permission aspect
│
├── config/
│   ├── GenAiConfig.java  # Tanzu GenAI multi-model discovery
│   ├── VectorStoreConfig.java  # pgvector store + EmbeddingModel
│   ├── SpringAiConfig.java  # ChatClient bean wiring
│   ├── SecurityConfig.java  # Form login, OAuth2 client, CSRF
│   ├── RateLimitInterceptor.java
│   ├── LdapConfig.java
│   ├── SsoConfig.java
│   ├── RedisSessionConfig.java  # optional cluster sessions
│   ├── ObservabilityConfig.java  # Micrometer, tracing
│   └── OpenTelemetryConfig.java
│
├── controller/  # REST + Thymeleaf controllers
│   ├── ChatController.java  # /api/chat, SSE relay for wiki_op events
│   ├── WikiController.java  # /api/wiki/*
│   ├── DocumentController.java
│   ├── ConversationController.java
│   ├── ChatFolderController.java
│   ├── TagController.java
│   ├── ChannelController.java
│   ├── PromptPresetController.java
│   ├── AdminController.java
│   ├── AdminMcpController.java
│   ├── AdminSkillsController.java
│   ├── AdminExternalBindingController.java
│   ├── AdminToolsController.java
│   ├── AdminStorageController.java
│   ├── OrganizationController.java
│   ├── UserGroupController.java
│   ├── UserPreferencesController.java
│   ├── ScimController.java  # SCIM 2.0
│   ├── BannerController.java
│   ├── MetricsController.java
│   ├── UsageController.java
│   ├── ModelKnowledgeController.java
│   ├── WebController.java  # Thymeleaf page routes
│   └── GlobalExceptionHandler.java
│
├── model/  # JPA entities
│   ├── User.java, Role.java, Permission.java
│   ├── Conversation.java, Message.java, ChatFolder.java, ConversationTag.java
│   ├── Channel.java, ChannelMessage.java
│   ├── Skill.java, Tool.java, PromptPreset.java
│   ├── UserDocument.java, DocumentStorageConfig.java
│   ├── McpServer.java, McpTransportType.java
│   ├── ExternalBinding.java
│   ├── Organization.java, UserGroup.java, UserAccess.java
│   ├── ModelAccessRule.java, ModelKnowledge.java, ModelInfo.java
│   ├── NotificationBanner.java, Webhook.java
│   ├── SharedChat.java, SystemSetting.java
│   ├── UsageMetric.java, EmbeddingMetric.java
│   └── wiki/
│       ├── WikiPage.java  # @OptimisticLock(excluded=true) on embedding fields
│       ├── WikiPageHistory.java
│       ├── WikiLink.java
│       ├── WikiLogEntry.java
│       └── WikiKind.java, WikiOrigin.java, EmbeddingStatus.java
│
├── repository/  # Spring Data JPA repos for each entity
│   └── wiki/  # Wiki repos including WikiPageIndexRow projection
│
├── service/
│   ├── ChatService.java  # buildMessageHistory, thinking-level + tool wiring
│   ├── ThinkingOptionsBuilder.java  # per-model thinking-level translation
│   ├── ConversationService.java
│   ├── DocumentEmbeddingService.java  # pgvector indexing + search
│   ├── DocumentStorageService.java  # abstraction over Local/S3/Azure/GCS
│   ├── LocalStorageService.java, S3StorageService.java
│   ├── AzureBlobStorageService.java, GcsStorageService.java
│   ├── DocumentExtractor.java  # TikaDocumentExtractor / DoclingExtractor / AzureDocIntelExtractor
│   ├── RagPromptBuilder.java, QueryRewriteService.java, HybridSearchService.java
│   ├── YouTubeTranscriptService.java, WebContentService.java, WebSearchService.java
│   ├── McpService.java  # MCP server lifecycle + constraint migration
│   ├── SkillService.java, ToolService.java
│   ├── ExternalBindingService.java  # hot-reloadable OpenAI-compat bindings
│   ├── OrganizationService.java, UserGroupService.java, UserAccessService.java
│   ├── PermissionService.java, ModelAccessService.java
│   ├── SystemSettingService.java  # broadcasts cache.settings cluster event
│   ├── CacheInvalidationService.java  # cluster-wide cache invalidation
│   ├── ClusterEventService.java  # Redis pub/sub when Redis is bound
│   ├── RateLimitService.java, ContentModerationService.java
│   ├── PromptInjectionDetector.java
│   ├── MetricsService.java, ActiveUserTracker.java
│   ├── ChatExportService.java, ChatSharingService.java, ConfigExportService.java
│   ├── MessageEditService.java, RegenerationService.java
│   ├── MarkdownService.java, TranslationService.java
│   ├── WebhookService.java
│   ├── AsyncChatService.java
│   └── wiki/
│       ├── WikiService.java  # upsert / read / link / invalidate / undo
│       ├── WikiContextLoader.java  # Caffeine-cached index block + @EventListener
│       ├── WikiEmbeddingService.java  # pgvector for wiki pages
│       ├── WikiEmbeddingRetryJob.java  # @Scheduled retry of PENDING/FAILED
│       ├── WikiFeatureService.java  # two-layer enable/disable gate
│       ├── WikiMigrationRunner.java  # one-shot notes+memory -> wiki migration
│       ├── WikiScope.java  # ToolContext -> userId/conversationId
│       └── SlugUtil.java
│
├── tools/wiki/
│   └── WikiTools.java  # Six @Tool methods
│
├── mcp/
│   ├── McpConfiguration.java, McpDiscoveryService.java
│   ├── McpStartupService.java  # @EventListener ApplicationReadyEvent
│   ├── McpServerService.java, McpToolCallbackCacheService.java
│   ├── McpClientFactory.java  # SSE vs Streamable HTTP routing
│   ├── SessionRecoveringToolCallbackProvider.java
│   └── ProtocolType.java
│
├── event/
│   └── WikiOpEvent.java  # ApplicationEvent, consumed by
│                         #   ChatController (SSE relay)
│                         #   WikiContextLoader (cache invalidation)
│
└── dto/
    ├── ChatRequest.java  # + thinkingLevel field
    ├── ChatResponse.java
    └── wiki/
        ├── WikiPageView.java, WikiSearchHit.java
        └── WikiIndexEntry.java, WikiOpPayload.java
src/main/resources/
├── templates/
│   ├── index.html  # Main chat UI w/ thinking selector
│   ├── settings.html  # User settings
│   ├── admin.html, admin/*.html  # Admin portal
│   ├── workspace.html, workspace/*.html  # Workspace hub: wiki, channels, prompts, tools, documents, help
│   ├── metrics.html
│   └── error/
├── static/
│   ├── js/
│   │   ├── app.js  # Chat UI, SSE parser, <think> routing, details panel
│   │   ├── workspace-wiki.js  # Wiki workspace page
│   │   ├── settings.js  # User preferences incl. wiki opt-in
│   │   ├── admin-*.js  # One file per admin page (CSP: no inline JS)
│   │   └── ...
│   ├── css/style.css  # Design tokens, thinking selector, details panel
│   └── vendor/marked.min.js, vendor/purify.min.js
└── application.yml  # app.* config incl. app.wiki.*
Roadmap
- LLM Wiki with agent-curated writes
- Per-turn thinking-level control
- Collapsible details panel with reasoning + wiki ops
- Two-layer enable/disable (admin + user) for wiki feature
- Cluster-aware settings cache invalidation
- Legacy notes + memory migration runner
- Wiki page tags + cross-links UI
- Wiki export to Markdown
- Native `reasoning_effort` passthrough (pending Tanzu GenAI tile enhancement)
- Separate `delta.reasoning_content` streaming (pending Tanzu GenAI tile enhancement)
- Multi-modal chat (vision input)
- Artifact version history
- Per-organization wiki scoping
License
MIT License. Copyright (c) 2026 Kuhn-Labs
See LICENSE for details.
Built with ❤️ for Tanzu Platform
_____ _ ___
|_ _|_ _ _ __ _____ _ / \ |_ _|
| |/ _` | '_ \|_ / | | | / _ \ | |
| | (_| | | | |/ /| |_| | / ___ \ | |
|_|\__,_|_| |_/___|\__,_|/_/ \_\___|



