CLAUDE.md for Analytics Engineering

This hub connects the notes covering each layer of CLAUDE.md configuration for analytics engineers working on dbt and BigQuery.

The Core Concept

CLAUDE.md as Project Memory covers how the file works: hierarchical loading from home directory through project root, the instruction budget constraint (~150-200 instructions before quality degrades), and the reactive approach of starting without a CLAUDE.md, using Claude Code, and adding lines only when Claude makes real mistakes.

dbt-Specific Configuration

CLAUDE.md for dbt Projects covers what actually belongs in a dbt project’s CLAUDE.md — the four categories that earn their place:

Commands with context — not just dbt run, but when to use dbt ls --select +model_name+ to check dependencies first
Naming conventions — the double-underscore separator, layer prefixes (base__, int__, mrt__), primary key naming
Warehouse-specific gotchas — the rules that prevent bugs, not style preferences
Workflow sequence — check impact → change → update schema.yml → compile → test
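
As an illustration of the naming-convention category, the layer prefixes and double-underscore separator can be expressed as a small check. This is a minimal sketch, not part of any dbt tooling: the regular expression, the lowercase snake_case assumption, and the function name is_valid_model_name are all hypothetical and would need adjusting to a project's real rules.

```python
import re

# Layer prefixes taken from the convention listed above.
LAYER_PREFIXES = ("base", "int", "mrt")

# Hypothetical pattern: <layer>__<part>[__<part>...], where each part is
# lowercase snake_case. Single underscores are allowed inside a part only.
MODEL_NAME = re.compile(
    r"^(?:%s)(?:__[a-z0-9]+(?:_[a-z0-9]+)*)+$" % "|".join(LAYER_PREFIXES)
)


def is_valid_model_name(name: str) -> bool:
    """Return True when the name has a known layer prefix and double-underscore separators."""
    return MODEL_NAME.fullmatch(name) is not None
```

A name such as base__stripe__payments passes this check, while stg_orders (unknown prefix) and base_stripe__payments (single underscore after the prefix) do not. A one-line statement of the rule in CLAUDE.md is usually enough; a check like this is more at home in CI or a hook.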

Also covers what not to include: comprehensive SQL style guides (SQLFluff handles those), full database schemas, generic dbt documentation, and auto-generated content from /init.

BigQuery-Specific Additions

Claude MD BigQuery Specifics covers the BigQuery section that prevents the most expensive mistakes:

GoogleSQL vs legacy SQL dialect enforcement
Partition filter requirements (the one that costs money when forgotten)
The incremental model config block with insert_overwrite and require_partition_filter=true
String quoting and comparison operator conventions

Beyond CLAUDE.md: Hooks and Slash Commands

CLAUDE.md provides guidance Claude should follow. For harder enforcement, two complementary tools are available:

Claude Code Hooks — deterministic shell commands that execute at lifecycle points. Use hooks to automatically run SQLFluff after every edit (PostToolUse), block edits to production mart files (PreToolUse), or auto-compile models immediately after changes.
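
The PreToolUse case can be sketched as a small script that decides whether to block an edit. This is a rough illustration under stated assumptions: that the hook receives a JSON payload on stdin with the target path under tool_input.file_path, and that exit status 2 signals a block with stderr passed back to Claude. Both should be checked against the current Claude Code hooks documentation. The models/marts directory is a hypothetical location for production marts.

```python
import json
import sys
from pathlib import PurePosixPath

# Hypothetical location of production marts; adjust to the project's layout.
PROTECTED_DIR = PurePosixPath("models/marts")


def should_block(payload: dict) -> bool:
    """Return True when the tool call targets a file under the protected directory."""
    file_path = payload.get("tool_input", {}).get("file_path", "")
    if not file_path:
        return False
    parts = PurePosixPath(file_path).parts
    target = PROTECTED_DIR.parts
    # Look for the protected directory as a contiguous run of path components.
    return any(
        parts[i:i + len(target)] == target
        for i in range(len(parts) - len(target) + 1)
    )


def main(stdin=sys.stdin) -> int:
    """Read the hook payload and return the exit status for the hook process."""
    payload = json.load(stdin)
    if should_block(payload):
        print("Edits to production mart files are blocked.", file=sys.stderr)
        return 2  # assumed: exit status 2 tells Claude Code to block the call
    return 0
```

A script registered as a PreToolUse hook would end with sys.exit(main()). Matching on path components rather than a substring avoids blocking unrelated paths such as models/marts_archive.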

Claude Code Slash Commands for dbt covers reusable workflows stored in .claude/commands/ and committed to git. Commands such as /generate-tests, /document-model, and /debug-test-failure are team-shared workflows that encode institutional knowledge in files rather than individual prompts.

The three layers work together: CLAUDE.md for context and conventions, hooks for hard guardrails, slash commands for repeatable workflows.

CLI vs. MCP

The article this hub draws from notes that the dbt CLI is sufficient for most workflows. Claude Code reads the commands in your CLAUDE.md, runs dbt ls, dbt compile, dbt test, and bq ls naturally — no extra setup required.
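
To make the CLI calls concrete, the check, compile, and test steps for a single model can be written out as argument lists. This is only a sketch: the function name dbt_workflow_commands and the example model name are hypothetical, and the script prints the commands rather than running them.

```python
import shlex


def dbt_workflow_commands(model: str) -> list[list[str]]:
    """Build the dbt CLI calls for one model: impact check, compile, then test."""
    return [
        ["dbt", "ls", "--select", f"+{model}+"],  # upstream and downstream dependencies
        ["dbt", "compile", "--select", model],    # render the model's SQL without building it
        ["dbt", "test", "--select", model],       # run the tests attached to the model
    ]


# Print the sequence for a hypothetical model name.
for argv in dbt_workflow_commands("mrt__finance__revenue"):
    print(shlex.join(argv))
```

Each list could be passed to subprocess.run in the same order. In practice Claude Code issues these commands itself once they are listed in CLAUDE.md.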

CLI vs MCP for AI Agents covers the general tradeoffs. For dbt specifically, the CLI works well because Claude has extensive training data on dbt commands and BigQuery CLI syntax. An MCP server adds overhead that only pays off on very large projects where manifest parsing would save significant time.

The Iteration Loop

Run Claude Code on a project with minimal or no CLAUDE.md
Notice when Claude makes a mistake (wrong naming convention, missing schema.yml update, wrong SQL dialect)
Fix it in the session
Press # to open the memory command and add the instruction to CLAUDE.md
Repeat

After a few weeks, the result is a 30-50 line file where every instruction traces back to a real problem.