io.github.us-all/openmetadata
OpenMetadata MCP: metadata, lineage, search, data quality, OM 1.12+ Data Contracts
OpenMetadata MCP Server
The OpenMetadata MCP that ships full CRUD across every entity type, including OM 1.12+ Data Contracts, Metrics, Search Index, API Collections, and API Endpoints that the embedded MCP doesn't cover yet.
170 tools, 4 workflow Prompts (lineage impact / DQ investigation / glossary bootstrap / owner reassign), 7 MCP Resources, and aggregations like lineage-impact (downstream blast radius with owner notification list), quality-rollup (DQ status across a scope), and get-domain-summary (domain + 6 child entity types in one call).
What it does that others don't
- OM 1.12+ entity coverage: Data Contracts, Metrics, Search Index, API Collections, API Endpoints (10 read tools). Not in the embedded MCP yet.
- Aggregation tools: lineage-impact answers "what breaks if I change/drop X?" by walking lineage, counting consumers, breaking them down by entity type, and resolving the owner union for change-management notifications, all in one call. get-domain-summary returns a domain plus 6 child entity types via /search/query with track_total_hits in one call instead of 7 sequential calls. get-table-summary folds table + lineage + sample data + DQ in the same way.
- Semantic search: semantic-search over the OM 1.12+ vector index (POST /search/vector/query). Useful when keyword search misses synonyms.
- MCP Prompts (4): lineage-impact-analysis, data-quality-investigation, glossary-term-bootstrap, owner-change-propagation. Workflow templates the model invokes directly.
- MCP Resources (7): om://table/{fqn}, om://glossary-term/{fqn}, om://lineage/{type}/{fqn}, om://search/{query}, om://dashboard/{fqn}, om://pipeline/{fqn}, om://schema/{fqn}.
- Token-efficient by design: extractFields projection on 28 read tools (drops changeDescription / version / updatedBy / href noise, for a ~80% size reduction), OM_TOOLS / OM_DISABLE over 9 categories, and the search-tools meta-tool.
- Apps SDK card: lineage-impact renders as a blast-radius card on ChatGPT clients (downstream/upstream counts + type breakdown + top consumers + owners-to-notify) via _meta["openai/outputTemplate"]. Claude clients receive the same JSON content.
- stdio + Streamable HTTP: defaults to stdio. Set MCP_TRANSPORT=http for ChatGPT Apps SDK or remote clients (Bearer auth via MCP_HTTP_TOKEN).
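The blast-radius idea behind lineage-impact (walk lineage, count consumers, break down by type, collect the owner union) can be illustrated with a small in-memory sketch. This is not the server's implementation: the Entity and Edge shapes, the lineageImpact function name, and the default depth of 3 are assumptions made for illustration only.

```typescript
// Hypothetical shapes for illustration; the real server works on OpenMetadata API responses.
interface Entity {
  fqn: string;
  type: string; // e.g. "table", "dashboard", "pipeline"
  owners: string[];
}
interface Edge {
  from: string; // upstream entity FQN
  to: string; // downstream entity FQN
}
interface Impact {
  downstreamCount: number;
  byType: Record<string, number>;
  ownersToNotify: string[];
}

function lineageImpact(root: string, entities: Entity[], edges: Edge[], maxDepth: number = 3): Impact {
  const byFqn = new Map<string, Entity>();
  for (const entity of entities) byFqn.set(entity.fqn, entity);

  // Adjacency list: upstream FQN -> downstream FQNs.
  const children = new Map<string, string[]>();
  for (const edge of edges) {
    const list = children.get(edge.from) ?? [];
    list.push(edge.to);
    children.set(edge.from, list);
  }

  // Breadth-first walk downstream, bounded by maxDepth.
  const seen = new Set<string>([root]);
  let frontier: string[] = [root];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const child of children.get(node) ?? []) {
        if (!seen.has(child)) {
          seen.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  seen.delete(root); // the root itself is not a consumer

  // Count consumers per entity type and take the union of their owners.
  const byType: Record<string, number> = {};
  const owners = new Set<string>();
  for (const fqn of seen) {
    const entity = byFqn.get(fqn);
    if (!entity) continue;
    byType[entity.type] = (byType[entity.type] ?? 0) + 1;
    for (const owner of entity.owners) owners.add(owner);
  }
  return { downstreamCount: seen.size, byType, ownersToNotify: [...owners].sort() };
}
```

The sketch only walks downstream; an upstream walk would use the same loop over a reversed adjacency list.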
Try this: 5 prompts
Connect the server to Claude Desktop or Claude Code, then paste any of these:
- Lineage impact: "The payments.transactions table is being deprecated. List every dashboard, pipeline, and ML model that depends on it (upstream + downstream, depth 3)."
- Data quality investigation: "Show all failing test cases from the last 7 days. Group by table, then by test type, with pass/fail counts."
- Glossary bootstrap: "Create a payments glossary with these 8 terms: chargeback, refund, settlement, KYC, AML, transaction, customer-id, payment-method. Link related terms."
- Owner reassign: "User taehee is leaving. List every entity (table/dashboard/pipeline/ML model) where they are owner. Then reassign all of them to team data-platform."
- Domain summary: "Summarize the analytics domain: total tables/dashboards/pipelines/ML models, top 5 by recent updates, and the data products it owns."
When to use this vs OpenMetadata's embedded MCP
OpenMetadata 1.12+ ships an embedded MCP. It and this server are complementary:
| | OM 1.12 embedded MCP | @us-all/openmetadata-mcp (this) |
|---|---|---|
| Tool count | ~10 (search, glossary basics, lineage, DQ, RCA, semantic search) | 170 (full CRUD across all entity types) |
| OM 1.12+ entity types (Data Contracts/Metrics/Search Index/API) | partial | ✅ 10 read tools |
| Aggregation tools | ❌ | ✅ lineage-impact, get-domain-summary, get-table-summary |
| MCP Prompts | ❌ | ✅ 4 |
| MCP Resources | ❌ | ✅ 7 |
| Auth | OAuth2 / PAT, OM Authorization Engine (RBAC) | JWT bot token + write gate |
| Deployment | Embedded in OM server (marketplace install) | Standalone npm / Docker / npx |
| OM version | 1.12+ only | 1.x compatible |
| Best for | RBAC-aware AI agents, SSO orgs | Bulk CRUD, automation, sample-data, older OM clusters |
Use the embedded MCP for RBAC-aware governance with SSO. Use this server for bulk metadata operations, full entity CRUD parity, automation, and OM clusters older than 1.12.
Install
Claude Desktop
{
  "mcpServers": {
    "openmetadata": {
      "command": "npx",
      "args": ["-y", "@us-all/openmetadata-mcp"],
      "env": {
        "OPENMETADATA_HOST": "http://your-host:8585",
        "OPENMETADATA_TOKEN": "<jwt-bot-token>"
      }
    }
  }
}
Claude Code
claude mcp add openmetadata -s user \
-e OPENMETADATA_HOST=http://your-host:8585 \
-e OPENMETADATA_TOKEN=<jwt-bot-token> \
-- npx -y @us-all/openmetadata-mcp
Docker
docker run --rm -i \
-e OPENMETADATA_HOST=http://your-host:8585 \
-e OPENMETADATA_TOKEN=<jwt-bot-token> \
ghcr.io/us-all/openmetadata-mcp-server
Build from source
git clone https://github.com/us-all/openmetadata-mcp-server.git
cd openmetadata-mcp-server && pnpm install && pnpm build
node dist/index.js
Get a token
- Open OpenMetadata UI → Settings → Bots
- Create a new bot or use an existing one (ingestion-bot works)
- Copy the JWT token
Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| OPENMETADATA_HOST | ✅ | - | OpenMetadata server URL (e.g. http://localhost:8585) |
| OPENMETADATA_TOKEN | ✅ | - | JWT or Bot token |
| OPENMETADATA_ALLOW_WRITE | ❌ | false | Set true to enable mutations (create/update/delete) |
| OM_TOOLS | ❌ | - | Comma-separated allowlist of categories. Biggest token saver. |
| OM_DISABLE | ❌ | - | Comma-separated denylist. Ignored when OM_TOOLS is set. |
| MCP_TRANSPORT | ❌ | stdio | http to enable Streamable HTTP transport |
| MCP_HTTP_TOKEN | conditional | - | Bearer token. Required when MCP_TRANSPORT=http |
| MCP_HTTP_PORT | ❌ | 3000 | HTTP listen port |
| MCP_HTTP_HOST | ❌ | 127.0.0.1 | HTTP bind host (DNS rebinding protection auto-enabled for localhost) |
| MCP_HTTP_SKIP_AUTH | ❌ | false | Skip Bearer auth, e.g. behind a reverse proxy that handles it |
Categories (9): search, core, discovery, governance, quality, services, admin, events, meta (always-on).
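The interaction between OM_TOOLS, OM_DISABLE, and the always-on meta category can be sketched as follows. This is a minimal illustration of the rules stated in the table, not the server's code: resolveCategories and parseList are hypothetical names, and silently ignoring unknown category names is an assumption.

```typescript
const ALL_CATEGORIES = [
  "search", "core", "discovery", "governance", "quality",
  "services", "admin", "events", "meta",
] as const;
type Category = (typeof ALL_CATEGORIES)[number];

// Split a comma-separated env value into trimmed, lowercased names.
function parseList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
}

function resolveCategories(env: { OM_TOOLS?: string; OM_DISABLE?: string }): Category[] {
  const allow = parseList(env.OM_TOOLS);
  const deny = parseList(env.OM_DISABLE);
  return ALL_CATEGORIES.filter((category) => {
    if (category === "meta") return true; // meta is always on
    if (allow.length > 0) return allow.includes(category); // allowlist set: denylist is ignored
    return !deny.includes(category);
  });
}

console.log(resolveCategories({ OM_TOOLS: "search,core" }));
// prints [ 'search', 'core', 'meta' ]
```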
When MCP_TRANSPORT=http: POST /mcp (Bearer-auth JSON-RPC) + GET /health (public liveness).
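A request to the POST /mcp endpoint could be assembled roughly like this. The sketch only builds the call; buildMcpCall is a hypothetical helper, the port comes from the MCP_HTTP_PORT default above, and the Accept header follows the MCP Streamable HTTP convention of accepting both JSON and event-stream responses. A real session would normally begin with an initialize request before any other method.

```typescript
interface McpHttpCall {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build a Bearer-authenticated JSON-RPC 2.0 call for the /mcp endpoint.
function buildMcpCall(baseUrl: string, token: string, rpcMethod: string, id: number = 1): McpHttpCall {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/mcp`, // tolerate a trailing slash on the base URL
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({ jsonrpc: "2.0", id, method: rpcMethod }),
    },
  };
}

// Usage against a running server (not executed here):
//   const call = buildMcpCall("http://127.0.0.1:3000", process.env.MCP_HTTP_TOKEN ?? "", "tools/list");
//   const response = await fetch(call.url, call.init);
```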
Token efficiency
| Scenario | Tools | Schema tokens | vs default |
|---|---|---|---|
| default (all categories) | 156 | 24,000 | - |
| typical (OM_TOOLS=search,core,governance,quality,discovery) | 120 | 19,500 | -19% |
| narrow (OM_TOOLS=search,core) | 26 | 4,600 | -81% |
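The "vs default" column follows directly from the schema-token counts in the table. A short worked calculation, with reductionPercent as a hypothetical helper name:

```typescript
// Percentage reduction relative to a baseline, rounded to the nearest integer.
function reductionPercent(baseline: number, reduced: number): number {
  return Math.round(((baseline - reduced) / baseline) * 100);
}

console.log(reductionPercent(24000, 19500)); // prints 19 (4,500 of 24,000 tokens saved)
console.log(reductionPercent(24000, 4600)); // prints 81 (19,400 of 24,000 tokens saved)
```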
extractFields adds another ~80-90% reduction on individual responses (e.g. get-table 8KB → 200B with extractFields: "name,columns.*.name,columns.*.dataType"). Auto-applied across 28 read tools.
// without
get-table { "id": "..." }
// with
get-table { "id": "...", "extractFields": "name,description,columns.*.name,columns.*.dataType" }
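To make the projection syntax concrete, here is a minimal sketch of how a comma-separated field list with * wildcards could be applied to a response. It is written from the examples above only; the actual extractFields in @us-all/mcp-toolkit may differ in edge cases, and projectFields, projectPath, and mergeJson are hypothetical names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keep only the value at one dotted path; "*" maps over every array element.
function projectPath(source: Json, segments: string[]): Json | undefined {
  const head = segments[0];
  if (head === undefined) return source;
  const rest = segments.slice(1);
  if (head === "*") {
    if (!Array.isArray(source)) return undefined;
    return source.map((item) => projectPath(item, rest) ?? {});
  }
  if (!isObject(source)) return undefined;
  const child = source[head];
  if (child === undefined) return undefined;
  const inner = projectPath(child, rest);
  return inner === undefined ? undefined : { [head]: inner };
}

// Deep-merge two projections so that several paths can share a parent.
function mergeJson(a: Json | undefined, b: Json): Json {
  if (a === undefined) return b;
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.map((item, index) => {
      const other = b[index];
      return other === undefined ? item : mergeJson(item, other);
    });
  }
  if (isObject(a) && isObject(b)) {
    const out: { [key: string]: Json } = { ...a };
    for (const [key, value] of Object.entries(b)) out[key] = mergeJson(out[key], value);
    return out;
  }
  return b;
}

function projectFields(source: Json, fields: string): Json {
  let result: Json | undefined;
  for (const path of fields.split(",").map((p) => p.trim()).filter((p) => p.length > 0)) {
    const projected = projectPath(source, path.split("."));
    if (projected !== undefined) result = mergeJson(result, projected);
  }
  return result ?? {};
}
```

Applied to a table response with "name,columns.*.name,columns.*.dataType", this keeps the table name plus each column's name and data type and drops everything else.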
MCP Prompts (4)
Workflow templates available via MCP prompts/list:
- lineage-impact-analysis: given an entity, walk upstream + downstream lineage and rank by impact.
- data-quality-investigation: diff DQ test results across two windows; cluster failure modes.
- glossary-term-bootstrap: bulk-create a glossary with N related terms and link them automatically.
- owner-change-propagation: find all entities owned by user X and propose a batch reassignment.
MCP Resources
URI-based read-only access:
om://table/{fqn} (table + columns + owners + tags + joins), om://glossary-term/{fqn}, om://lineage/{type}/{fqn} (depth 3), om://search/{query} (top 10 keyword hits), om://dashboard/{fqn}, om://pipeline/{fqn} (with tasks), om://schema/{fqn}.
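The URI templates above can be matched with a small parser. The sketch below is illustrative only: parseOmUri and the OmResource type are hypothetical, and percent-decoding the FQN and query is an assumption about how clients encode them.

```typescript
type OmResource =
  | { kind: "table" | "glossary-term" | "dashboard" | "pipeline" | "schema"; fqn: string }
  | { kind: "lineage"; entityType: string; fqn: string }
  | { kind: "search"; query: string };

function parseOmUri(uri: string): OmResource | undefined {
  const match = /^om:\/\/([a-z-]+)\/(.+)$/.exec(uri);
  if (!match) return undefined;
  const kind = match[1];
  const rest = match[2];
  if (kind === undefined || rest === undefined) return undefined;

  switch (kind) {
    case "table":
    case "glossary-term":
    case "dashboard":
    case "pipeline":
    case "schema":
      return { kind, fqn: decodeURIComponent(rest) };
    case "lineage": {
      // om://lineage/{type}/{fqn}: the first slash separates the entity type from the FQN.
      const slash = rest.indexOf("/");
      if (slash <= 0 || slash === rest.length - 1) return undefined;
      return {
        kind,
        entityType: rest.slice(0, slash),
        fqn: decodeURIComponent(rest.slice(slash + 1)),
      };
    }
    case "search":
      return { kind, query: decodeURIComponent(rest) };
    default:
      return undefined; // not one of the 7 documented resource kinds
  }
}
```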
Tools (170)
9 categories. Use search-tools to discover tools at runtime; the full list follows the summary table.
| Category | Tools |
|---|---|
| Tables / Databases / Schemas / Lineage | 22 |
| Services (database/dashboard/messaging/pipeline/ml/storage) | 16 |
| Glossaries / Terms | 12 |
| Domains / Data Products | 12 |
| Classifications / Tags | 10 |
| Discovery (dashboards / pipelines / charts / topics / containers / ml-models) | 36 |
| Governance (roles / policies / users / teams / bots) | 13 |
| Quality (test suites / cases / sample data) | 13 |
| Stored Procedures / Queries | 11 |
| OM 1.12+ entities (Data Contract / Metric / Search Index / API Collection / API Endpoint) | 10 |
| Search (search-metadata, suggest-metadata, semantic-search) | 3 |
| Aggregations (lineage-impact, quality-rollup, get-domain-summary, get-table-summary) | 4 |
| Quality (run-test-suite, write-gated) | 1 |
| Meta (search-tools) | 1 |
Full tool list
Search (3)
search-metadata, suggest-metadata, semantic-search
Tables (6)
list-tables, get-table, get-table-by-name, create-table, update-table, delete-table
Databases (6)
list-databases, get-database, get-database-by-name, create-database, update-database, delete-database
Database Schemas (6)
list-schemas, get-schema, get-schema-by-name, create-schema, update-schema, delete-schema
Lineage (4)
get-lineage, get-lineage-by-name, add-lineage, delete-lineage
Services (16)
6 database-service tools + 2 each for dashboard/messaging/pipeline/ml-model/storage services.
Glossaries (12)
6 glossary CRUD + 6 glossary-term CRUD.
Dashboards / Pipelines / Topics / Charts / Containers / ML Models (36)
6 CRUD each, follows list / get / get-by-name / create / update / delete.
Classifications & Tags (10)
4 classification + 6 tag CRUD.
Domains & Data Products (12)
6 domain + 6 data-product CRUD.
Users & Teams (9)
3 user reads + 6 team CRUD.
Access Control (4)
list-roles, get-role, list-policies, get-policy
Data Quality (7)
list-test-suites, get-test-suite, get-test-suite-by-name, list-test-cases, get-test-case, get-test-case-by-name, list-test-case-results
Stored Procedures (6)
6 CRUD.
Queries (5)
list-queries, get-query, create-query, update-query, delete-query
Events (3)
list-events, get-event-subscription, get-event-subscription-by-name
Bots (3)
list-bots, get-bot, get-bot-by-name
Sample Data (6, read-only)
get-table-sample-data, get-table-sample-data-by-name, get-topic-sample-data, get-topic-sample-data-by-name, get-container-sample-data, get-container-sample-data-by-name
OM 1.12+ entities (10)
list-data-contracts, get-data-contract-by-name, list-metrics, get-metric-by-name, list-search-indexes, get-search-index-by-name, list-api-collections, get-api-collection-by-name, list-api-endpoints, get-api-endpoint-by-name
Aggregations
lineage-impact, quality-rollup, get-domain-summary, get-table-summary
Quality (write-gated)
run-test-suite: triggers the test suite's associated ingestion pipeline. Async; results land via the normal pipeline flow.
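A write gate of the kind described for OPENMETADATA_ALLOW_WRITE can be as small as one guard called at the top of every mutating handler. This is a sketch under assumptions, not the server's code: assertWriteAllowed is a hypothetical name, and treating only the exact string true as enabling writes is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Throw unless mutations have been explicitly enabled through the environment.
function assertWriteAllowed(toolName: string, env: Env): void {
  if (env["OPENMETADATA_ALLOW_WRITE"] !== "true") {
    throw new Error(
      `${toolName} is a write operation; set OPENMETADATA_ALLOW_WRITE=true to enable it`,
    );
  }
}

// A mutating handler would call the guard before touching the API, for example:
//   assertWriteAllowed("run-test-suite", process.env);
```

Failing closed (writes off unless the variable is set) matches the documented default of false.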
Meta
search-tools: query other tools by keyword; always enabled.
Architecture
Claude → MCP stdio → src/index.ts → src/tools/*.ts → OpenMetadataClient (fetch) → OpenMetadata REST
Built on @us-all/mcp-toolkit:
- extractFields: token-efficient response projections
- aggregate(fetchers, caveats): fan-out helper used by lineage-impact / get-domain-summary / get-table-summary
- createWrapToolHandler: OPENMETADATA_TOKEN redaction + OpenMetadataError extraction
- search-tools meta-tool
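As an illustration of the token-redaction role attributed to createWrapToolHandler, the sketch below scrubs the token from an error message before returning it as an MCP tool error. The toolkit's real API is not shown in this document, so redactToken and toToolError are hypothetical names; only the result shape (content items plus isError) follows the MCP tool-result convention.

```typescript
interface ToolErrorResult {
  isError: true;
  content: { type: "text"; text: string }[];
}

// Replace every occurrence of the secret with a placeholder.
function redactToken(message: string, token: string | undefined): string {
  if (!token) return message;
  return message.split(token).join("[REDACTED]");
}

// Convert any thrown value into a tool error with the token removed.
function toToolError(error: unknown, token: string | undefined): ToolErrorResult {
  const raw = error instanceof Error ? error.message : String(error);
  return { isError: true, content: [{ type: "text", text: redactToken(raw, token) }] };
}
```

Using split/join rather than a regular expression avoids having to escape special characters that a token may contain.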
Targets OM 1.x. Validated against a real OM backend, including the OM 1.12+ entities.
Tech stack
Node.js 18+ • TypeScript strict ESM • pnpm • @modelcontextprotocol/sdk • zod • dotenv • vitest.
JSON-Patch updates handled automatically (PATCH application/json-patch+json content-type).
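For readers unfamiliar with JSON Patch (RFC 6902), the sketch below shows what a patch document for a top-level field change might look like and how path segments are escaped (RFC 6901). It is an illustration, not the server's diffing logic: diffTopLevel and toPointer are hypothetical helpers, and comparing values through JSON.stringify is a simplification.

```typescript
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

// Build a JSON Pointer; "~" and "/" inside a segment must be escaped, in that order.
function toPointer(segments: (string | number)[]): string {
  return segments
    .map((segment) => "/" + String(segment).replace(/~/g, "~0").replace(/\//g, "~1"))
    .join("");
}

// Shallow diff of two objects into add / replace / remove operations.
function diffTopLevel(before: Record<string, unknown>, after: Record<string, unknown>): PatchOp[] {
  const ops: PatchOp[] = [];
  for (const key of Object.keys(after)) {
    if (!(key in before)) {
      ops.push({ op: "add", path: toPointer([key]), value: after[key] });
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      ops.push({ op: "replace", path: toPointer([key]), value: after[key] });
    }
  }
  for (const key of Object.keys(before)) {
    if (!(key in after)) ops.push({ op: "remove", path: toPointer([key]) });
  }
  return ops;
}

// The patch array is sent as the PATCH body with this content type.
const JSON_PATCH_HEADERS = { "Content-Type": "application/json-patch+json" };
```

Changing only a description would yield a single replace operation on /description.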
