What does the “DataHub” MCP Server do?
The DataHub MCP (Model Context Protocol) Server acts as a bridge between AI assistants and your DataHub data ecosystem. By exposing DataHub’s powerful metadata and context APIs via the MCP standard, this server enables AI agents to search across all entity types, fetch detailed metadata, traverse data lineage, and list associated SQL queries. This dramatically improves development workflows by allowing AI models to access up-to-date data context, perform complex queries, and automate metadata exploration directly from your preferred AI interface. DataHub MCP Server supports both DataHub Core and DataHub Cloud, making it a versatile solution for organizations seeking to integrate their metadata platform with AI-driven tools and assistants.
List of Prompts
No prompt templates are detailed or mentioned in the repository or README.
List of Resources
No explicit MCP resource primitives are described in the repository or README.
List of Tools
- Search across all entity types using arbitrary filters
  Enables clients to query DataHub entities (datasets, dashboards, pipelines, etc.) using custom filters; a hypothetical call sketch follows this list.
- Fetch metadata for any entity
  Retrieves comprehensive metadata about a specific DataHub entity.
- Traverse the lineage graph (upstream and downstream)
  Allows exploration of data lineage, both upstream (sources) and downstream (consumers), for a given entity.
- List SQL queries associated with a dataset
  Surfaces SQL queries linked to a particular dataset for auditing and understanding data usage.
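Under the MCP standard, a client discovers these tools with a tools/list request and invokes them with tools/call over JSON-RPC. The snippet below is a minimal sketch of what the search invocation referenced above might look like on the wire; the tool name "search" and the argument keys are illustrative assumptions rather than the server's documented schema, so rely on the tool listing your client reports.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search", // hypothetical tool name; confirm via tools/list
    "arguments": {
      "query": "orders", // free-text search term
      "filters": { "platform": "snowflake", "entity_type": "dataset" } // illustrative filter keys
    }
  }
}

In practice, MCP clients such as Claude Desktop, Cursor, or FlowHunt construct these calls for you; the wire format is shown only to clarify what the agent does behind the scenes.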
Use Cases of this MCP Server
- Comprehensive Data Discovery
  Developers and data scientists can search and filter across all DataHub entities, accelerating data discovery and reducing manual effort.
- Automated Metadata Fetching
  AI agents can programmatically retrieve detailed entity metadata, supporting automated documentation, quality checks, or onboarding workflows.
- Lineage Analysis for Impact Assessment
  By traversing upstream and downstream lineage, teams can instantly assess the impact of changes and improve data governance (a hypothetical lineage call is sketched after this list).
- SQL Query Auditing
  Easily list and analyze SQL queries associated with datasets, aiding in compliance monitoring, performance tuning, and data access optimization.
- Integration With AI-Powered Agents
  Seamlessly connect DataHub with modern AI assistants to automate repetitive data management and exploration tasks directly from chat or code environments.
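For the lineage use case above, the client passes the entity's URN and a traversal direction to the lineage tool. The sketch below assumes a hypothetical tool name and argument names (get_lineage, direction, max_hops); only the URN format is standard DataHub syntax, so treat the rest as a placeholder for whatever schema the server actually advertises.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "get_lineage", // hypothetical tool name; confirm via tools/list
    "arguments": {
      "urn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)",
      "direction": "DOWNSTREAM", // or "UPSTREAM" to trace back to sources
      "max_hops": 2 // illustrative depth limit
    }
  }
}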
How to set it up
Windsurf
No Windsurf-specific instructions found in the repository.
Claude
- Install uv.
- Locate the full path to the uvx command by running which uvx.
- Obtain your DataHub URL and personal access token.
- Edit your claude_desktop_config.json file:

{
  "mcpServers": {
    "datahub": {
      "command": "<full-path-to-uvx>", // e.g. /Users/hsheth/.local/bin/uvx
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}

- Save and (re)start Claude Desktop. Verify the connection in the agent interface.
Cursor
- Install uv.
- Obtain your DataHub URL and personal access token.
- Edit .cursor/mcp.json:

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}

- Save the file and restart Cursor. Check the MCP status panel.
Cline
No Cline-specific instructions found in the repository.
Generic/Other MCP Clients
- Install uv.
- Prepare your DataHub URL and personal access token.
- Use this configuration:

command: uvx
args:
  - mcp-server-datahub
env:
  DATAHUB_GMS_URL: <your-datahub-url>
  DATAHUB_GMS_TOKEN: <your-datahub-token>

- Integrate this command in your MCP client configuration.
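If your client expects JSON rather than the YAML-style snippet above, the same settings translate directly. This is a sketch that assumes your client follows the common mcpServers layout used by Claude Desktop and Cursor; adjust the top-level structure to whatever your client documents.

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}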
Securing API Keys
Always store sensitive credentials such as DATAHUB_GMS_TOKEN in environment variables, not in plaintext files. In your configuration, use the env field as shown above to inject secrets securely.
How to use this MCP inside flows
Using MCP in FlowHunt
To integrate MCP servers into your FlowHunt workflow, start by adding the MCP component to your flow and connecting it to your AI agent:

Click on the MCP component to open the configuration panel. In the system MCP configuration section, insert your MCP server details using this JSON format:
{
"datahub": {
"transport": "streamable_http",
"url": "https://yourmcpserver.example/pathtothemcp/url"
}
}
Once configured, the AI agent can use this MCP as a tool with access to all its functions and capabilities. Remember to change “datahub” to the actual name of your MCP server and replace the URL with your own MCP server URL.
Overview
| Section | Availability | Details/Notes |
| --- | --- | --- |
| Overview | ✅ | Present in README and repo description |
| List of Prompts | ⛔ | No prompt templates found |
| List of Resources | ⛔ | No explicit MCP resource primitives described |
| List of Tools | ✅ | Tools described in README features section |
| Securing API Keys | ✅ | Environment variables in setup instructions |
| Sampling Support (less important in evaluation) | ⛔ | No mention of sampling in README or code |
I would rate this MCP server at about 6/10. It has a clear open-source license, multiple real tools, and basic secure setup instructions, but lacks documented prompt templates, explicit resource primitives, and advanced MCP features like sampling or roots.
MCP Score
| Has a LICENSE | ✅ (Apache-2.0) |
| --- | --- |
| Has at least one tool | ✅ |
| Number of Forks | 13 |
| Number of Stars | 37 |
Frequently asked questions
- What does the DataHub MCP Server do?
It exposes DataHub's metadata and context APIs via the MCP standard, enabling AI agents to search, retrieve metadata, traverse lineage, and list SQL queries on your organizational data, directly from FlowHunt or other AI tools.
- Which DataHub platforms are supported?
Both DataHub Core and DataHub Cloud are supported, so you can connect regardless of your deployment.
- What are the main use cases?
Common use cases include comprehensive data discovery, automated metadata fetching, lineage analysis for impact assessment, SQL query auditing, and integration with AI-powered agents for workflow automation.
- How do I securely provide credentials?
Always use environment variables for sensitive credentials like DATAHUB_GMS_TOKEN. Inject them using the 'env' field in your configuration files to keep secrets safe.
- Are prompt templates or resource primitives included?
No explicit prompt templates or MCP resource primitives are included with this server.
- What tools does this MCP server offer?
It provides searching across all entity types, fetching metadata, lineage traversal, and listing SQL queries associated with datasets.
- How do I connect DataHub MCP to FlowHunt?
Add an MCP component in your FlowHunt flow, configure it with your DataHub MCP server JSON as shown in the documentation, and connect it to your AI agent for immediate access to DataHub capabilities.
Connect FlowHunt with DataHub via MCP
Empower your AI workflows with real-time access to organizational metadata, lineage, and data discovery tools using the DataHub MCP Server. Automate data management and governance directly from FlowHunt.