The MCP ecosystem has 5,800+ community servers, but for data engineering the relevant set is much smaller. These are the servers that give AI assistants real access to your data infrastructure: warehouse query execution, schema inspection, streaming topic management, and federated access across multiple databases. Rather than building custom integrations for each AI client, these servers provide a standardized interface that works across Claude Desktop, Claude Code, Cursor, and any other MCP-compatible client.
Warehouse-Specific Servers
Snowflake
Repository: snowflake-labs/mcp
Maintained by: Snowflake Labs (official)
The official Snowflake MCP server includes:
- SQL execution against your warehouse
- Cortex AI integration (Snowflake’s native ML and LLM capabilities)
- Semantic views for governed data access
- RBAC support — the server respects your existing access controls
RBAC works through a service account: that account’s permissions govern what the AI can query. There is no separate MCP-level permission system — the server uses the warehouse’s existing access controls. If a table is visible to the service account, the AI can query it; otherwise it cannot.
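As a sketch of what scoping that service account looks like (role, database, schema, and user names here are illustrative, not from the Snowflake docs), ordinary Snowflake grants define everything the AI can touch:

```sql
-- Illustrative names: a read-only role for the MCP service account
CREATE ROLE IF NOT EXISTS mcp_readonly;
GRANT USAGE ON DATABASE analytics TO ROLE mcp_readonly;
GRANT USAGE ON SCHEMA analytics.marts TO ROLE mcp_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marts TO ROLE mcp_readonly;
GRANT ROLE mcp_readonly TO USER mcp_service_user;
```

Because the server inherits exactly these permissions, tightening or widening AI access is a grant change, not an MCP configuration change.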
Cortex AI integration means you can combine traditional SQL queries with Snowflake’s ML functions in a single MCP tool call, which is useful for teams doing in-warehouse ML or using Snowpark.
BigQuery
Repository: googleapis/genai-toolbox
Maintained by: Google (official)
Google’s MCP Toolbox provides access to BigQuery, Cloud SQL, Spanner, and AlloyDB through a single server. The BigQuery integration supports query execution and schema inspection. Toolbox handles authentication via Application Default Credentials — the same mechanism used by the BigQuery client library — so there’s no separate auth configuration to manage.
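Concretely, ADC is established once per machine with the standard gcloud flow (the key-file path below is a placeholder), and the Toolbox server picks it up the same way the BigQuery client library does:

```shell
# Interactive development: cache your user credentials as ADC
gcloud auth application-default login

# Service accounts (CI, shared servers): point ADC at a key file
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
```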
For detailed setup, see BigQuery MCP Toolbox Setup and Choosing Between BigQuery MCP Options. If you need remote access (HTTP transport for shared team access), BigQuery Remote MCP Server Setup covers the Cloud Run deployment pattern.
ClickHouse
Repository: ClickHouse/mcp-clickhouse
Maintained by: ClickHouse (official)
The official ClickHouse server supports schema inspection and query execution. Maintained by the ClickHouse team, so it tracks API changes and new features alongside ClickHouse releases.
Multi-Database Servers
centralmind/gateway
Repository: centralmind/gateway
Supported databases: PostgreSQL, MySQL, ClickHouse, Snowflake, BigQuery, MSSQL, Oracle, SQLite, ElasticSearch, DuckDB
This server is worth knowing about if you work with multiple databases. A single gateway server provides MCP access to all of them from one configuration file and one server process. The configuration maps connection strings to named data sources; the AI tools reference those names rather than connection details.
The practical scenario: your team uses PostgreSQL for the application database, BigQuery for the warehouse, and ElasticSearch for search. Instead of configuring three separate MCP servers in every developer’s client, you configure one gateway that knows about all three. One entry in claude_desktop_config.json, one server process to manage.
Configuration example:
```yaml
databases:
  - name: app_db
    type: postgresql
    dsn: "postgresql://user:pass@localhost/app"
  - name: warehouse
    type: bigquery
    project: my-gcp-project
    dataset: analytics
  - name: search
    type: elasticsearch
    url: "http://localhost:9200"
```

The gateway translates MCP tool calls into the appropriate database protocol for each backend. You can apply column-level access controls and read-only modes per database.
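The client side is then a single server definition. A hedged example for claude_desktop_config.json — the `mcpServers` structure is Claude Desktop's standard format, but the gateway binary path and launch arguments here are assumptions; check the repository's README for the actual command:

```json
{
  "mcpServers": {
    "gateway": {
      "command": "/usr/local/bin/gateway",
      "args": ["start", "--config", "/etc/gateway/databases.yaml"]
    }
  }
}
```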
MindsDB MCP
Repository: mindsdb/mcp
Supported databases: PostgreSQL, MySQL, MongoDB, Snowflake, BigQuery
MindsDB’s MCP server provides federated query access — you can query across databases using a SQL-like interface. The primary advantage over a simple multi-database gateway is MindsDB’s built-in ML capabilities: you can join traditional database tables with ML model predictions in a single query.
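A sketch of what such a federated query can look like — database, table, model, and column names here are illustrative; the general MindsDB convention is to join a source table against a trained model as if the model were a table:

```sql
-- Join application-database rows with predictions from a MindsDB model
SELECT c.customer_id, c.plan, m.churn_probability
FROM postgres_db.customers AS c
JOIN mindsdb.churn_predictor AS m;
```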
For teams already using MindsDB for ML features, the MCP server is a natural extension. For teams that just want multi-database access, centralmind/gateway is simpler.
Orchestration and Streaming
Databricks MCP
The Databricks MCP server supports SQL queries via the Databricks Statement Execution API. For data engineering teams using Databricks SQL Warehouses and Databricks Workflows, this gives AI assistants direct query access to your lakehouse.
Community-maintained as of early 2026. The Databricks CLI provides an alternative for some access patterns — see CLI vs MCP for AI Agents for the tradeoff analysis.
confluentinc/mcp-confluent
Repository: confluentinc/mcp-confluent
Maintained by: Confluent (official)
The official Confluent MCP server exposes Kafka and Confluent Cloud through the MCP protocol. Capabilities include:
- List topics and inspect schemas (Schema Registry integration)
- Produce and consume messages for testing or debugging
- Access Confluent Cloud REST APIs for cluster management
For streaming data engineering, this server opens up conversational debugging of Kafka pipelines. Instead of writing ad-hoc consumer scripts to inspect a topic, you ask the AI to sample messages and describe what it finds. Schema Registry integration means the AI can decode messages using the registered schema rather than seeing raw bytes.
Useful for: debugging pipelines where data isn’t arriving as expected, exploring topic contents during development, reviewing schema evolution before deploying changes.
Choosing a Starting Point
The prioritization depends on what you query most:
One primary warehouse (Snowflake or BigQuery): Use the official vendor server. Snowflake Labs for Snowflake, Google’s GenAI Toolbox for BigQuery. These have the best integration and active vendor maintenance.
Multiple databases: Start with centralmind/gateway. One server handles PostgreSQL, MySQL, BigQuery, and more. The configuration overhead is low; the operational overhead of running multiple servers is higher.
Streaming infrastructure: mcp-confluent if you’re on Confluent Cloud. For self-managed Kafka, a custom server built with the Python SDK is likely necessary — the MCP ecosystem for Kafka outside the Confluent ecosystem is thin.
dbt projects: The dbt MCP server adds lineage and documentation context. See dbt MCP Server Setup Hub.
Before building custom servers for any of these, check Custom MCP Server Decision Criteria — the ecosystem has grown fast, and an existing server may cover your use case.