MCP server for the Coalesce Catalog Public GraphQL API; covers tables, columns, dashboards, lineage, governance, and asset metadata across the warehouse and BI layer.
0 installs
Trust: 37 β Low
Healthcare
Ask AI about Coalesce Catalog
Powered by Claude Β· Grounded in docs
I know everything about Coalesce Catalog. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Loading tools...
Reviews
Documentation
coalesce-catalog-mcp
MCP server for the Coalesce Catalog Public GraphQL API. Exposes tables, columns, dashboards, lineage, tags, glossary terms, data products, governance (owners/teams/quality/pinned assets), semantic SQL search, and the Catalog AI Assistant across your warehouse + BI stack. Ships composite workflow tools for resolving warehouse paths, summarising assets in one call, and diagnosing lineage coverage gaps.
Built as the catalog-side companion to coalesce-transform-mcp β install both, keep the surfaces namespaced, let agents route between them.
[!TIP]
βοΈ Snowflake Cortex Code + coalesce-catalog-mcp. CoCo is Snowflake's AI coding CLI β it already authenticates to your warehouse and runs under your Snowflake role. Drop this MCP in and an agent can answer "who owns PROD.SALES.ORDERS, what feeds it, which dashboards read it?" inline with your SQL workflow β lineage, ownership, tags, and descriptions without leaving the terminal.
βοΈ Install in Snowflake Cortex Code (CoCo)
Why this pairing? Cortex Code is Snowflake's AI coding CLI β it already authenticates to your warehouse, runs under your Snowflake role, and has native tools for querying live data. Add coalesce-catalog-mcp and a single agent session can resolve warehouse paths to Catalog UUIDs, walk lineage (upstream tables + downstream dashboards), check documentation/PII coverage, and push fixes (descriptions, ownership, tags, lineage patches) back to the Catalog β all alongside the SQL you're already writing.
Restart Claude Code. Tools appear as mcp__coalesce-catalog__*.
Install in Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the Windows equivalent and add the same mcpServers block as above. Full-quit and re-open Claude Desktop.
Install in Cursor
Paste into .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):
Restart Windsurf. Tools appear under the Cascade chat's MCP panel.
Governance rollout
New to Catalog, or rolling it out across an org for the first time? The server ships an opinionated best-in-class governance playbook that takes a team from zero to a trusted, governed catalog in 8β12 weeks.
Pre-flight checklist β 6 organizational prerequisites (exec sponsor, named Steward with allocated time, success metric, stakeholder map, budget, incentive model). Hard-stop if any are missing.
9 sequenced phases β ingestion audit β ownership β metadata β glossary β tagging β lineage β data products β quality checks β review cadence + incident response β adoption. Each phase has a goal, success criterion, effort estimate, and the specific catalog_* tool calls that execute it.
Tiering model (T1/T2/T3) with capacity-grounded guidance (T1 β€ 5 % of your table count) and how to pick the T1 list from popularity + lineage signals instead of gut instinct.
Roles β 3 max (Catalog Steward, Domain Owner, Data Steward). Explicit refusal of the RACI-matrix antipattern.
Coalesce Transform integration β description flow, ownership flow, lineage consistency between the two products.
RBAC β Catalog role model, read-only mode, sensible token-scope defaults.
Incident response runbook β 5-step flow using Catalog as impact-assessment source of truth.
Change management β 7 tactics for driving adoption, ordered by effectiveness. Anti-patterns included.
KPIs β 10 measurable targets with per-metric ownership, not vanity metrics.
Quick-start recipe β 5 tool calls you can run on Monday morning to show progress by end of week.
Use the catalog-governance-rollout MCP prompt to kick off a guided walkthrough that grounds advice in your account's live state (top-25 tables, current coverage gaps) rather than generic principles.
Context Resources
The server ships 5 static markdown resources under the catalog://context/* URI scheme. Your client fetches them via resources/list + resources/read; agents cite them to orient quickly or when routing between tools.
catalog://context/overview β Entity graph (source β database β schema β table/column), cross-cutting annotations (tags, terms, data products, quality, ownership), lineage provenance, auth, and region setup.
catalog://context/tool-routing β Decision tree mapping common user questions to the right tool. Use this if you're unsure whether to reach for catalog_search_*, catalog_get_*, or a composite workflow.
catalog://context/ecosystem-boundaries β When to pick this server vs coalesce-transform-mcp. Explains the workflow seam where Transform-authored nodes materialise Catalog-indexed tables.
catalog://context/governance-rollout β Best-in-class rollout playbook. See the Governance rollout section above for the summary.
Content lives in src/resources/context/ β edit the markdown, rebuild, and the server picks it up on restart. There's no per-user override mechanism yet; fork the repo or overlay your own server alongside for org-specific context.
Tools
[!NOTE]
Legend
βοΈ Write β mutates Catalog state. Reversible (upsert/attach/create/update). Requires a READ_WRITE API token.
β οΈ Destructive β deletes state. Not recoverable via this API.
Read-only by default is OFF. Set COALESCE_CATALOG_READ_ONLY=true to drop every βοΈ and β οΈ tool at server startup.
catalog_find_asset_by_path - Resolve a dotted warehouse path (DATABASE.SCHEMA.TABLE or ...TABLE.COLUMN) to a Catalog UUID. Walks the hierarchy; returns structured diagnostics on ambiguity or not-found.
Source β database β schema hierarchy
catalog_search_sources - List connected sources (warehouses, BI tools, transform tools, quality tools). Filter by technology, type, origin.
catalog_search_databases - List warehouse databases. Scope by sourceIds or name.
catalog_search_schemas - List warehouse schemas. Scope by databaseIds, sourceIds, or name.
catalog_search_columns - Find columns by name/table/schema/database/source. Boolean filters for isPii, isPrimaryKey, isDocumented, hasColumnJoins.
catalog_get_column - Full detail by UUID, including description provenance and tags. Use tableId from the response with catalog_get_table for parent context (the public API forbids the nested table relation on column queries).
catalog_get_column_joins - Warehouse-observed JOIN relationships between columns, ranked by count. Useful for discovering de-facto foreign keys.
catalog_get_field_lineages - Column / dashboard-field edges. Scope is required β unscoped calls time out. Pair with catalog_search_columns to fan out.
Composite workflow
catalog_get_column_lineage - Complete column-level lineage graph for a starting column. Accepts an FQN (DATABASE.SCHEMA.TABLE.COLUMN) or UUID, resolves it, walks the BFS exhaustively with no depth cap, and batch-resolves every reached column id to { name, fqn, tableName, schemaName, databaseName }. Returns nodes + edges as a DAG (handles cycles + shared children). Configurable maxNodes safety ceiling (default 10000) guards against pathological graphs.
catalog_trace_missing_lineage - Heuristic diagnostic: probes a table's upstream/downstream counts, lineage provenance, and column-level coverage %; returns findings[] with severity + remediation suggestions.
Write
βοΈ catalog_upsert_lineages - Create/refresh tableβtable, tableβdashboard, dashboardβdashboard edges. Each edge must declare exactly one parent + one child (Zod-enforced). Batch up to 500.
β οΈ catalog_delete_lineages - Delete edges by endpoints. Batch up to 500.
Tags, terms, data products β the cross-cutting annotation layer
Read
catalog_search_tags - Tag labels, colors, linked terms.
catalog_search_terms - Glossary terms (hierarchy via parentTermId + depthLevel, linked tag). Set projection: "detailed" to inline ownerEntities, teamOwnerEntities, and tagEntities for term-health audits.
catalog_search_data_products - Assets promoted to curated data products. Filter by entityType (TABLE/DASHBOARD/TERM), withTagId.
Write
βοΈ catalog_attach_tags - Bind tags to entities by label. Auto-creates the tag label if new. Batch up to 500.
β οΈ catalog_detach_tags - Remove tag bindings (does not delete the tag itself). Batch up to 500.
βοΈ catalog_create_term - Create a glossary term, optionally under a parent and/or linked to a tag.
βοΈ catalog_update_term - Update fields in place. Pass parentTermId: null to move to root, linkedTagId: null to unlink.
β οΈ catalog_delete_term - Hard delete. Children become orphans.
catalog_search_users - List Catalog users (id, email, role, ownedAssetCount). Set projection: "detailed" to inline ownedAssetIds and resolve email β owned assets in one scan.
catalog_get_user_owned_assets - Paginated list of asset UUIDs owned by a user (by userId). Scans up to 10k users internally. Prefer catalog_search_users({ projection: "detailed" }) when starting from an email.
catalog_search_teams - List teams (members, Slack routing, memberCount, ownedAssetCount). Set projection: "detailed" to inline memberIds + ownedAssetIds.
catalog_get_team_members - Paginated list of user UUIDs belonging to a team (by teamId). Scans up to 10k teams internally. Prefer catalog_search_teams({ projection: "detailed" }) when starting from a team name.
catalog_get_team_owned_assets - Paginated list of asset UUIDs owned by a team (by teamId). Same scan strategy as catalog_get_team_members.
catalog_search_quality_checks - Data-quality test results (dbt, Monte Carlo, Soda, Great Expectations, etc.). Scope by tableId.
catalog_search_pinned_assets - Curated "see also" links between catalog entities.
Ownership writes
βοΈ catalog_upsert_user_owners - Mark a user as owner of N assets.
β οΈ catalog_remove_user_owners - Strip ownership (specific targets or all).
βοΈ catalog_upsert_team_owners - Team equivalent of upsert.
β οΈ catalog_remove_team_owners - Team equivalent of remove.
Team management
βοΈ catalog_upsert_team - Create-or-update by unique team name. Enforces Slack #channel / @group prefixes.
βοΈ catalog_add_team_users - Add members by email (must be existing Catalog users).
β οΈ catalog_remove_team_users - Remove members by email.
Quality checks
βοΈ catalog_upsert_data_qualities - Push check results for one table. Nested qualityChecks[] under a single tableId.
β οΈ catalog_remove_pinned_assets - Remove by endpoints.
AI β semantic SQL search and the Catalog Assistant
catalog_search_queries - Semantic (natural-language) search over ingested SQL queries. Returns up to 10 matches with query text + author + referenced tableIds. Optional tableIds (max 10) + filterMode (ALL/ANY) for scoping.
catalog_ask_assistant - Kick off an async AI Assistant job against the Catalog's RAG index. Returns a jobId.
catalog_get_assistant_result - Poll a job; returns status (ADDED/ACTIVE/COMPLETED/FAILED/RETRIES_EXHAUSTED) plus answer + referenced assets[] when completed.
catalog_describe_type - Introspect a GraphQL type on the Catalog Public API. Returns kind, description, fields (OBJECT / INTERFACE), inputFields (INPUT_OBJECT), or enumValues (ENUM). Each field is rendered as native GraphQL SDL ([String!]!) with an isRequired flag. On miss, returns near-match suggestions via Levenshtein + substring matching. Use for API-shape questions ("does getLineages accept a column scope?", "what's in GetFieldLineagesScope?").
catalog_run_graphql - Execute an arbitrary GraphQL query or mutation against the Catalog Public API. Returns the raw response envelope (data, errors, extensions) unchanged β validation errors come through verbatim so you can debug them. Mutations are blocked by default; pass allowMutations: true to opt in. Escape hatch, not the default path β reach for the structured tools (catalog_summarize_asset, catalog_get_column_lineage, etc.) whenever they fit.
catalog_find_asset_by_path - Resolve a dotted warehouse path to a UUID. (See Discovery above.)
catalog_summarize_asset - Full cross-domain overview of a TABLE or DASHBOARD in a single call: identity + ownership + tags + lineage counts + (for tables) columns + quality checks. Sub-queries run in parallel via Promise.allSettled; caller-controllable limits per section.
catalog_trace_missing_lineage - Lineage coverage diagnostic. See Lineage above.
catalog_assess_impact - Deprecation blast-radius report for a TABLE or DASHBOARD. Walks downstream lineage (depth 1-3, paginated exhaustively per node), batch-enriches every reached asset with ownership + popularity, and returns a 0-100 severity score with per-component rationale. Completeness contract: refuses with an explicit error when the depth-2 graph exceeds 2000 distinct nodes (or 500 at depth 3) β never returns a silently truncated report. Surfaces distinctOwnerTeamCount (teams to coordinate with) and unownedCount (orphaned downstream).
catalog_governance_scorecard - Coverage matrix per database, schema, or explicit tableIds list. Per-table flags for ownership / description / column-doc % / tag count, plus an optional 5th axis (includeQualityCoverage: true adds quality-check coverage). Aggregate governanceScore is popularity-weighted by default (matching Health-dashboard semantics); pass weighting: 'equal' for one-table-one-vote audits. Refuses scopes >500 tables.
catalog_audit_governance_freshness - Extends catalog_governance_scorecard from "is metadata present?" to "is metadata still current?". Composes scoped table-detail fetch (verifiedAt, updatedAt, tagEntities, popularity, ownership) β per-table cadence resolution against a sensitivity-tag policy (tightest cadence wins via case-insensitive substring match) β bucketed staleness (neverReviewed / overdue / dueSoon / ok) sorted by stalenessDays * popularity. Default cadence 365 days; pass cadencePolicy.byTag to tighten for pii, critical, etc. Scope is required (databaseId, schemaId, or tableIds β€ 500).
catalog_owner_scorecard - Per-owner cleanup scorecard. Given an email, enumerates every owned table/dashboard/term and groups them by hygiene issue: thin description, PII/domain-tag coverage, new-asset window, certification, lineage gaps (isolated / upstream-only / downstream-only), and term-specific health (missing owner, orphaned, uncertified). Owned UUIDs that don't resolve as any of table/dashboard/term (columns, queries, deleted refs) are reconciled in unclassified_owned_ids. Complete picture or loud refusal β no silent truncation. Pair with the catalog-daily-guide prompt for a rendered walkthrough.
catalog_audit_data_product_readiness - Per-asset promotion-readiness report (TABLE or DASHBOARD). Grades eight axes (description, ownership, tags, column-doc coverage, upstream + downstream lineage, quality checks, verification) with hardcoded thresholds and returns per-axis status: "pass"|"warn"|"fail"|"na" + raw signals + actionable gaps. Overall readyToPromote: true iff no axes fail (warns allowed). Dashboards report na for columnDocs/qualityChecks; wide tables flag sampled: true when column inspection caps.
catalog_resolve_ownership_gaps - Given a scope (database, schema, or tableIds), finds unowned tables and gathers per-table evidence bundles: top N query authors from the last ~200 queries (grouped by email, sorted by query count) + 1-hop upstream/downstream lineage-neighbor owners. Raw signals only β no confidence scores. Refuses loudly above 200 unowned tables (scope-splitting guidance in the error). Pair with catalog_governance_scorecard to size the gap, then this tool to pick owners from the evidence.
catalog_reconcile_ownership_handoff - For a departing owner (by email), builds a blast-radius-ranked handoff plan across every owned table/dashboard/term. Composes per-asset downstream consumer count + popularity + query volume (tables only) into a blastRadiusScore, gathers candidate-owner evidence per asset (top query authors, 1-hop neighbor owners), and fetches the team directory once to tag each candidate with their team memberships. Aggregates across the portfolio into candidateSummary[] sorted by assetCount DESC then totalBlastRadius DESC. Capacity-gated at 200 owned assets (beyond that, the workflow is bulk reassignment, not per-asset handoff). Differs from catalog_owner_scorecard (grades hygiene of what's owned) and catalog_resolve_ownership_gaps (finds unowned tables) β this tool addresses the transition moment when an owned portfolio needs redistribution.
βοΈ catalog_propagate_metadata - Propagate description / tags / owners from a source table downstream along lineage. Computes a typed diff plan with per-table per-axis decisions; default dryRun: true means nothing mutates. Non-dry-run requires MCP elicitation confirmation (or COALESCE_CATALOG_SKIP_CONFIRMATIONS=true). Default axes: ['description'] only; tags/owners are opt-in (owner propagation is high-trust). overwritePolicy: 'ifEmpty' (default) or 'overwrite'. Width caps + pagination ceilings match catalog_assess_impact. Per-axis partial-failure tracking in the execution response.
βοΈ catalog_propagate_tags_upstream - Upstream-direction companion to catalog_propagate_metadata. Propagates tag labels from a presentation-layer source (dashboard or gold-layer table) to the warehouse tables that feed it. Composes source-asset detail fetch β upstream lineage BFS (depth 1-3, refuses on width caps: 2000 @depth2 / 500 @depth3) β upstream-table detail batch β per-target diff plan with provenance trail. Default dryRun: true; execute requires acknowledgeProvenanceSemantics: true AND elicitation confirmation β guardrail enforces explicit recognition that "Critical on a dashboard" does NOT necessarily mean "Critical on its upstream warehouse table." Capacity-gated at 200 reached upstream tables. overwritePolicy: 'ifEmpty' (default) or 'overwrite' β never removes target-native tags.
catalog_triage_quality_failures - Triage all failing quality checks (ALERT + WARNING) into a prioritised action queue. Composes paginated quality-check fetch + client-side status filter (GraphQL scope has no status field) + table detail batch enrichment (ownership, popularity) + optional 1-hop upstream lineage fan-out for root-cause pointers. Ranked by popularity Γ failureCount DESC with per-owner grouping (byOwner keyed by email, __unowned__ for orphans). Capacity-gated at 500 failing checks; refuses with an actionable message above that threshold.
catalog_assess_quality_failure_dashboard_impact - Forward-extension of catalog_triage_quality_failures through downstream lineage. Enumerates BI dashboards that consume failing-quality tables, ranks them by blast-radius (dashboard popularity Γ failure count Γ criticality-tag boost), and shows which failing tables reach each. Use to answer "which downstream reports are affected by this morning's failing quality checks?" β pair with catalog_triage_quality_failures to triage the failures themselves.
catalog_audit_tag_hygiene - Audit structural health of the tag layer. Composes GET_TAGS (paginated) + GET_TABLES_DETAIL_BATCH + GET_DASHBOARDS_DETAIL_BATCH to build a reverse tag-usage index, then detects four finding classes: orphaned (zero usage), unlinked (in use but no glossary term), skewed (>80% single entity type with β₯5 uses), and near-duplicates (Levenshtein distance, configurable threshold). Capacity-gated at 1000 tags / 500 assets. Table scans can be scoped via databaseId/schemaId; dashboard scans are always workspace-wide (GraphQL API limitation).
Prompts
Seven reusable prompt templates kick off common workflows without re-reciting the tool chain. Invoke via prompts/get in your MCP client (in Claude Code, type / and look for the catalog- prefixed entries).
catalog-start-here - Orientation: reads overview + tool-routing context and gives the model the routing defaults it should follow.
catalog-governance-rollout - Kicks off the governance playbook walkthrough grounded in your account's live state (runs source / database / top-25 table sweeps first, then phase-by-phase recommendations).
catalog-asset-summary - Given a path or UUID, run find_asset_by_path + summarize_asset and present the result.
catalog-find-consumers - Enumerate everything downstream of a table: child tables, dashboards, and SQL queries that read it.
catalog-investigate-lineage-gaps - Run trace_missing_lineage and walk each finding with a proposed remediation (upsert_lineages call) β asks for approval before executing.
catalog-daily-guide - Given a user email, call catalog_owner_scorecard and render a prioritised "Today's Agenda" markdown report across the owner's tables, dashboards, and terms. Walks remediation actions with explicit approval before any mutation.
catalog-audit-documentation - Undocumented-column / undocumented-table report across a scope (database, schema, or table).
Full Installation
Requirements:
Node.js 22+ (works on 20 with an engine-mismatch warning)
A Coalesce Catalog account and a Public-API token (Catalog UI β Settings β API tokens)
1. Install the package. The Quick Start snippets all use npx -y coalesce-catalog-mcp@preview, which npm fetches on first invocation β no explicit install needed. If you prefer a pinned global install:
npm install -g coalesce-catalog-mcp@preview
2. Register with your MCP client via one of the Quick Start paths.
3. Restart the client and try the /catalog-start-here prompt (or whatever the slash-command UX is in your client). The agent should list the 5 context resources and 67 tools. If you get an auth error, double-check COALESCE_CATALOG_API_KEY has the right scope β READ tokens work on every query tool but mutations require READ_WRITE.
Credentials
Variable
Description
Default
COALESCE_CATALOG_API_KEY
Public-API token from the Catalog UI (Settings β API tokens). Required. READ tokens work on every query tool; mutations require a READ_WRITE token.
β
COALESCE_CATALOG_REGION
Catalog region: eu or us. Selects the default base URL.
eu
COALESCE_CATALOG_API_URL
Full base URL override. The path /public/graphql is appended automatically.
region-derived
COALESCE_CATALOG_READ_ONLY
When true, every mutation tool is filtered out at server registration time (67 tools β 42).
false
Region base URLs:
EU (default): https://api.castordoc.com
US: https://api.us.castordoc.com
Profile-file support (e.g. ~/.coalesce/catalog-profiles.yml, matching the ~/.coa/config pattern in coalesce-transform-mcp) is on the roadmap but not shipped. Today the server reads env vars only.
Safety model
Two layers keep destructive operations from happening by accident.
Tool annotations. Every tool carries MCP readOnlyHint / destructiveHint / idempotentHint. The βοΈ and β οΈ markers in Tools track readOnlyHint: false and destructiveHint: true respectively.
COALESCE_CATALOG_READ_ONLY=true hides all 25 mutation tools at server registration time. Use it for audits, agent sandboxes, or pairing with a prod token. When set, the server registers 42 tools instead of 67.
Mutation tools additionally require a READ_WRITE API token on the server side β a READ token returns AuthorizationError at call time regardless of client config.