MCP Ecosystem Overview

The MCP ecosystem grew from a single Anthropic project to a Linux Foundation-governed open standard with 97 million monthly SDK downloads and 5,800+ community servers in under two years. This hub organizes the notes that cover the landscape: how the protocol works, what’s available, and how to build for it.

Start Here

MCP Protocol Architecture — What MCP is, how clients and servers communicate, and the three primitives (tools, resources, prompts). Read this first. Everything else in this hub assumes it.
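
As a quick orientation before reading that note, here is a minimal sketch of what client-to-server traffic for the three primitives looks like on the wire. MCP messages are JSON-RPC 2.0, and the method names below (`tools/list`, `tools/call`, `resources/read`, `prompts/get`) come from the MCP specification. The tool name, prompt name, table name, and URI are hypothetical placeholders, and `make_request` is an illustrative helper, not part of any SDK.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict of the kind an MCP client sends."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


# Tools: discover what the server offers, then invoke one by name.
list_tools = make_request("tools/list")
call_tool = make_request(
    "tools/call",
    {"name": "row_count", "arguments": {"table": "orders"}},  # hypothetical tool
)

# Resources: read-only data addressed by URI (hypothetical URI).
read_resource = make_request("resources/read", {"uri": "file:///tmp/schema.json"})

# Prompts: reusable templates fetched by name (hypothetical prompt).
get_prompt = make_request(
    "prompts/get",
    {"name": "summarize_table", "arguments": {"table": "orders"}},
)

print(json.dumps(call_tool, indent=2))
```

In practice an SDK builds and parses these messages for you; the sketch only shows that all three primitives share one request shape and differ in method name and parameters.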

The Ecosystem

MCP Official Reference Servers — The servers maintained by the MCP Steering Group. Seven actively maintained servers, including Filesystem (essential for data engineering) and Git. Plus the archived servers that transferred to vendor maintenance.

MCP Client Landscape — Claude Desktop, Claude Code, Cursor, VS Code Copilot, Windsurf, and others. How to choose based on where your work happens, not which client has the most features.

MCP Ecosystem Governance — How MCP became vendor-neutral: the Linux Foundation donation, corporate adoption by OpenAI, Microsoft, AWS, and Google, and what that means for infrastructure decisions.

MCP Discovery Resources — Where to find servers: the official registry, awesome-mcp-servers, mcpservers.org, and how to evaluate what you find before installing.

Data Engineering Servers

MCP Data Engineering Servers — The servers that matter for data work: Snowflake (Snowflake Labs official), BigQuery (Google’s GenAI Toolbox), ClickHouse (official), centralmind/gateway (multi-database), MindsDB (federated with ML), Databricks, and Confluent.

Database-specific deep dives:

BigQuery MCP Server Setup — setup hub for Google’s BigQuery options
BigQuery MCP Toolbox Setup — step-by-step for GenAI Toolbox
Choosing Between BigQuery MCP Options — remote Toolbox vs. BigQuery CLI
dbt MCP Server Setup Hub — dbt integration for lineage and documentation context

Building Custom Servers

When existing servers don’t cover your use case, follow the custom server path:

Custom MCP Server Decision Criteria — Build vs. browse: when to write custom, when to fork an existing server, when the CLI is a simpler answer.

MCP SDK Selection for Data Engineering — Python (FastMCP) or TypeScript (McpServer). For most data engineering teams: Python.

FastMCP Server Skeleton — Minimal working servers in Python and TypeScript. The 30-line starting point.

MCP Tool Design Patterns — Docstrings as descriptions, Pydantic models for structured output, input validation with schemas. The difference between tools the AI uses correctly and tools it fumbles.

MCP Resources and Prompts — Read-only data exposure (resources), reusable templates (prompts), progress reporting (Context object). The other two primitives beyond tools.

MCP Transport Configuration — stdio for local development, streamable HTTP for production. When to switch.

MCP Server Testing and Debugging — The Inspector, the stdout-corrupts-protocol gotcha, and the three-stage testing workflow.

MCP Server Project Setup — Full project initialization: directory structure, dependencies, client installation, project-scoped configuration.
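
The stdout gotcha named in the testing and debugging entry above can be sketched without any SDK. With the stdio transport, stdout carries newline-delimited JSON-RPC messages and nothing else, so diagnostics belong on stderr. The example below is an assumption-laden illustration: the logger name and the `configure_logging` and `send_message` helpers are hypothetical, and in-memory streams stand in for the real stdout and stderr so the separation is easy to inspect.

```python
import io
import json
import logging
import sys


def configure_logging(stream=None):
    """Route log output to stderr (or a given stream), never to stdout."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger = logging.getLogger("my_mcp_server")  # hypothetical logger name
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from any root handler
    return logger


def send_message(message, out=None):
    """Write one JSON-RPC message as a single line on the protocol stream."""
    out = out if out is not None else sys.stdout
    out.write(json.dumps(message) + "\n")
    out.flush()


# In-memory stand-ins for stdout (protocol) and stderr (diagnostics).
protocol_stream = io.StringIO()
log_stream = io.StringIO()
log = configure_logging(log_stream)

log.info("handling tools/list")  # diagnostic text goes to the log stream
send_message({"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}, protocol_stream)

# Every protocol line must parse as JSON; a stray print() would break this.
parsed = [json.loads(line) for line in protocol_stream.getvalue().splitlines()]
```

A single `print("debug")` in a tool handler would put a non-JSON line on the protocol stream, which is why the client typically reports a parse or connection error rather than anything pointing at the offending line.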

Server patterns for common data engineering needs:

CLI vs MCP for AI Agents — When the BigQuery CLI beats a BigQuery MCP server. Token efficiency, training data, Unix composability vs. typed schemas and client universality.

Security Posture for AI Agents — Security principles for AI tools accessing data infrastructure. Relevant for any production MCP deployment.

Custom Parameterized MCP Queries — A specific pattern for constrained BigQuery access — parameterized queries that prevent schema sprawl.

MCP Context Window Overhead — The cost of connecting many MCP servers. Relevant when choosing how many servers to run simultaneously.
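
The parameterized-query pattern listed above can be sketched roughly as follows. This is one plausible reading of the pattern, not the linked note's implementation: the agent chooses from a fixed allow-list of named queries and supplies typed parameters, and it never writes SQL itself. The standard library's `sqlite3` stands in for BigQuery so the example runs anywhere; the query name, table, and `run_named_query` helper are hypothetical.

```python
import sqlite3

# Allow-list of named queries: SQL template plus expected parameter types.
# Names and tables are hypothetical.
QUERIES = {
    "orders_by_status": (
        "SELECT id, amount FROM orders "
        "WHERE status = :status ORDER BY id LIMIT :limit",
        {"status": str, "limit": int},
    ),
}


def run_named_query(conn, name, params):
    """Validate parameters against the template, then run with bound values."""
    if name not in QUERIES:
        raise ValueError(f"unknown query: {name}")
    sql, expected = QUERIES[name]
    if set(params) != set(expected):
        raise ValueError(f"expected parameters {sorted(expected)}")
    for key, expected_type in expected.items():
        if not isinstance(params[key], expected_type):
            raise TypeError(f"{key} must be {expected_type.__name__}")
    # Values are bound by the driver, never interpolated into the SQL string.
    return conn.execute(sql, params).fetchall()


# Small in-memory dataset to exercise the pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (status, amount) VALUES (?, ?)",
    [("shipped", 10.0), ("pending", 5.5), ("shipped", 7.25)],
)

rows = run_named_query(conn, "orders_by_status", {"status": "shipped", "limit": 10})
print(rows)
```

Exposed as an MCP tool, each named query needs only a small input schema, so the agent sees a handful of parameters rather than the full warehouse schema.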
