Adrienne Vermorel

Advanced Claude Code Workflows: Testing, Documentation, and Debugging for Analytics Engineers

You’ve been using Claude Code for the basics. Writing SQL, scaffolding models, maybe generating some boilerplate. It works, but your workflow still feels manual. You’re prompting one thing at a time, reviewing output, prompting again. The tool helps, but it hasn’t fundamentally changed how you work.

Most tutorials stop at “ask Claude to write code” without covering systematic, repeatable workflows. This article covers three: testing, documentation, and debugging. Each one turns Claude Code from something you talk to into something integrated with your process.

I’m assuming you’re already comfortable with Claude Code basics and that you work with dbt on BigQuery. If you’re just getting started with Claude Code, read my first-hour guide instead.

Foundation: Configuring Claude Code for dbt Projects

Before building workflows, you need the right foundation. That starts with CLAUDE.md.

CLAUDE.md as Project Memory

CLAUDE.md is a file Claude automatically pulls into context at the start of every conversation. It eliminates the need to repeat architectural decisions, testing requirements, and coding conventions each session. Create it at your project root and Claude picks it up on its own.

The key is keeping it concise. Frontier LLMs can reliably follow approximately 150-200 instructions. Claude Code’s system prompt already contains around 50, leaving limited capacity for your custom instructions. Don’t stuff every possible command into CLAUDE.md. Include what’s universally applicable, reference separate files for detailed documentation, and use pointers instead of copying code snippets (they become outdated quickly).

Here’s a template that works for dbt/BigQuery projects:

# Project Overview
## What
- Analytics engineering project using dbt Core with BigQuery
- Marketing attribution data transformations
- GA4 BigQuery exports processing
## Key Commands
- `dbt run --select model_name` - Run single model
- `dbt run --select +model_name` - Run with upstream dependencies
- `dbt test --select model_name` - Test single model
- `dbt build --select model_name` - Run + test in DAG order
## Project Structure
models/
├── base/ # Source-conformed, atomic building blocks
├── intermediate/ # Purpose-built transformations
└── marts/ # Business-conformed, end-user ready
## Naming Conventions
- Base: `base__{source}__{table}`
- Intermediate: `int__{entity}_{verb}`
## Code Standards
- Use CTEs for readability
- Always use ref() or source()
- Add unique + not_null tests to primary keys
- Document all models in schema.yml

Claude reads CLAUDE.md files from multiple locations: a global one at ~/.claude/CLAUDE.md, the project root, a personal CLAUDE.local.md (gitignored), and child directories on-demand for monorepos. Run /init to auto-generate a starting file, then refine it based on what Claude gets wrong. For a deeper dive on CLAUDE.md configuration, see my full guide on setting up CLAUDE.md for dbt projects.
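
To make those lookup locations concrete, here’s an illustrative layout (the package directory is hypothetical):

~/.claude/CLAUDE.md        # Global: personal preferences for every project
your-project/
├── CLAUDE.md              # Project memory, checked into git
├── CLAUDE.local.md        # Personal overrides, gitignored
└── packages/
    └── marketing/
        └── CLAUDE.md      # Pulled in on-demand when working here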

Hooks for Deterministic Guardrails

Hooks are shell commands that execute at specific points in Claude Code’s lifecycle. They provide deterministic control over Claude’s behavior, ensuring certain actions always happen regardless of what you prompt.

For dbt workflows, the most useful hooks are PostToolUse (runs after a tool call completes) and PreToolUse (runs before a tool call executes). You can auto-format SQL after edits, block modifications to production schemas, or auto-run dbt compile after model changes.

Here’s a hook that runs sqlfluff after any SQL edit:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | { read f; if echo \"$f\" | grep -q '\\.sql$'; then sqlfluff fix \"$f\" --dialect bigquery; fi; }"
          }
        ]
      }
    ]
  }
}

And one that blocks modifications to production mart files:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "python3 -c \"import json, sys; d=json.load(sys.stdin); p=d.get('tool_input',{}).get('file_path',''); sys.exit(2 if 'models/marts/prod/' in p else 0)\""
          }
        ]
      }
    ]
  }
}
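
And, covering the third use case from above, here’s a sketch of a hook that recompiles a model after it changes. It assumes model file names match model names and that Claude runs from the dbt project root:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | { read f; case \"$f\" in *models/*.sql) dbt compile --select \"$(basename \"$f\" .sql)\" ;; esac; }"
          }
        ]
      }
    ]
  }
}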

Exit code 0 means the hook passes. Exit code 2 blocks the action and surfaces the error to Claude. Other non-zero codes log the error but continue. I cover hooks and custom commands in more detail in a separate article if you want to explore further.

Testing Workflows

Test-Driven Development with Claude Code

TDD works particularly well with AI agents. You give Claude a clear target (passing tests), it iterates until it gets there, and you have confidence the result actually works. The workflow has five steps.

First, write the tests before the model exists:

Write dbt tests for a new mrt__marketing__customer_lifetime_value model. Include:
- unique + not_null on customer_id
- accepted_values for customer_segment
- relationships test to base__stripe__customers
Don't create the model yet.
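
Claude’s output should be a schema.yml entry along these lines (the accepted values are illustrative; use your real segments):

models:
  - name: mrt__marketing__customer_lifetime_value
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
          - relationships:
              to: ref('base__stripe__customers')
              field: customer_id
      - name: customer_segment
        tests:
          - accepted_values:
              values: ['new', 'active', 'churned']  # hypothetical segments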

Then confirm the tests fail:

Run dbt test --select mrt__marketing__customer_lifetime_value and confirm all tests fail.

Now implement to pass:

Implement mrt__marketing__customer_lifetime_value to make all tests pass.
Don't modify the tests.

If something fails, iterate:

The relationship test is failing. Fix the implementation without changing the test.

Once tests pass, refactor with confidence:

Now that tests pass, refactor the CTEs for better readability while keeping all tests green.

This approach catches issues early. Claude knows exactly what success looks like, and you know the output actually works.

Generating dbt Tests Automatically

You can create a slash command that analyzes a model and generates appropriate tests. Save this as .claude/commands/generate-tests.md:

---
description: Generate comprehensive dbt tests for a model
allowed-tools: Read, Write, Bash(dbt:*)
argument-hint: [model_name]
---
Analyze $ARGUMENTS and generate appropriate tests:
1. Read the model SQL and existing schema.yml
2. Identify primary key columns → add unique + not_null
3. Find foreign key relationships → add relationships tests
4. Detect enum-like columns → add accepted_values
5. Find numeric ranges → add appropriate range tests
Update the schema.yml with new tests.

Type /generate-tests mrt__sales__orders and Claude analyzes the model, identifies what should be tested, and updates your schema.yml. It’s not perfect, but it catches the obvious stuff and gives you a starting point.

Enforcing Test Discipline

Claude sometimes skips tests or writes implementation first. You can use hooks to enforce TDD discipline by intercepting file modifications and validating against common violations: implementing functionality without a failing test, writing multiple tests before any implementation, or modifying tests to make them pass.

The pattern is to create a PreToolUse hook that checks whether the change follows your testing expectations. If not, it blocks and tells Claude why.
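
A minimal sketch of that idea: block writes to mart models that don’t yet have a schema.yml next to them. It only approximates “tests first” (a stricter gate would parse the schema.yml for tests on the specific model), but it shows the shape:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "python3 -c \"import json, sys, os; d=json.load(sys.stdin); p=d.get('tool_input',{}).get('file_path',''); sys.exit(2 if p.endswith('.sql') and '/marts/' in p and not os.path.exists(os.path.join(os.path.dirname(p), 'schema.yml')) else 0)\""
          }
        ]
      }
    ]
  }
}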

For validating prompts before using them on production code, create a /test-prompt command that runs your prompt on a sandbox model first, checks the output for correct BigQuery syntax and proper use of ref() and source(), and documents any edge cases. Test on non-production models before running something across your whole project.
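
A sketch of that command, saved as .claude/commands/test-prompt.md (sandbox__scratch is a placeholder for whatever sandbox model you keep around):

---
description: Dry-run a prompt on a sandbox model before production use
allowed-tools: Read, Write, Bash(dbt:*)
argument-hint: [prompt]
---
Apply the following prompt to the sandbox model sandbox__scratch only: $ARGUMENTS
Then:
1. Run dbt compile --select sandbox__scratch and confirm it compiles
2. Verify the output uses ref() or source(), never hard-coded table names
3. Check the SQL for valid BigQuery syntax
4. Document any edge cases to watch for before applying this to real models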

Documentation Workflows

The Documentation Problem

Writing descriptions for 100+ models and columns is tedious. Schema.yml files end up inconsistent across team members, and documentation drifts from reality as models change. You know you should document everything, but the manual work never gets prioritized.

The Two-Step Pattern

The solution combines dbt codegen with Claude Code. First, generate the schema scaffolding with codegen:

dbt run-operation generate_model_yaml --args '{"model_names": ["base__ga4__events"], "upstream_descriptions": true}'

This gives you the YAML structure with all column names. Then use Claude Code to fill in the descriptions:

Read the base__ga4__events.sql model and the generated schema.yml scaffold.
For each column:
1. Analyze the SQL transformation logic
2. Infer the business meaning
3. Write a clear, concise description
4. Add appropriate tests
Follow these standards:
- Use present tense
- Include data type in description
- Note any known limitations

Claude reads the SQL, understands what each column does, and writes descriptions that actually reflect the transformations. It’s not perfect (review before committing), but it’s faster than writing everything manually.
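
The filled-in scaffold ends up looking something like this (descriptions are illustrative):

models:
  - name: base__ga4__events
    description: One row per GA4 event, source-conformed from the raw BigQuery export.
    columns:
      - name: event_timestamp
        description: TIMESTAMP. When the event was recorded, in UTC. Converted from the export's microsecond epoch value.
        tests:
          - not_null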

A Slash Command for Model Documentation

Save this as .claude/commands/document-model.md:

---
description: Generate comprehensive dbt documentation for a model
allowed-tools: Bash(dbt:*), Read, Write
argument-hint: [model_name]
---
# Document Model: $ARGUMENTS
1. Read the model SQL file at models/**/$ARGUMENTS.sql
2. Identify all columns and their transformations
3. Check if schema.yml exists for this model
4. Generate or update the schema.yml with:
   - Model description explaining business purpose
   - Column descriptions explaining meaning and source
   - Appropriate tests for each column
## Documentation Standards
- Model descriptions: Start with "This model..." and explain the grain
- Column descriptions: Include data type, business meaning, and source
- Use consistent terminology from our data dictionary
Create/update the schema.yml file and show a summary of changes.

Type /document-model int__ga4_sessions_sessionized and you get documentation in one command.

Docs Blocks for Consistency

The same columns appear across multiple models. customer_id shows up everywhere, and you end up with slightly different descriptions each time. The solution is dbt docs blocks.

Create a models/docs.md file with reusable documentation:

{% docs customer_id %}
Unique identifier for a customer account.
Format: UUID generated at account creation.
Source: Stripe customer API.
{% enddocs %}

{% docs event_timestamp %}
Timestamp when the event was recorded in UTC.
Microsecond precision from GA4 BigQuery export.
{% enddocs %}

Then reference these in your schema.yml:

columns:
  - name: customer_id
    description: "{{ doc('customer_id') }}"

You can prompt Claude to analyze all your models, identify commonly repeated columns, create a docs.md file with reusable blocks, and update schema.yml files to reference them. Same column, same description, everywhere.

Lineage Documentation

Claude can generate Mermaid diagrams showing data lineage:

Create a Mermaid flowchart showing the lineage from raw GA4 events
to the final marketing attribution model. Include all intermediate
transformations and key business logic at each step.

The output looks like:

flowchart TD
    A[raw.ga4_events] --> B[base__ga4__events]
    B --> C[int__ga4_sessions_sessionized]
    C --> D[int__ga4_sessions_attributed]
    D --> E[mrt__marketing__attributed_conversions]

Useful for onboarding new team members or documenting complex transformation chains.

Debugging Workflows

Let Claude Code Face the Errors

The best debugging approach is simple: let Claude Code run the model and hit the errors directly. Don’t pre-gather evidence or manually inspect logs. Just tell Claude what you’re trying to accomplish and let it iterate.

Run dbt build --select mrt__sales__orders and fix any errors.

Claude will execute the command, see the error output, check the compiled SQL if needed, and propose fixes. It iterates until the model runs successfully. This works because Claude Code can execute commands, read files, and make edits in a single conversation loop.

For data issues where the model runs but produces wrong results:

The mrt__sales__orders model shows incorrect total_revenue for customer_id='ABC123'.
Trace through the upstream models to find where the calculation goes wrong.

Claude will read the model, identify dependencies, query intermediate tables, and track down where values diverge. You don’t need to tell it how to debug. Just describe the problem and let it investigate.
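
The queries Claude issues along the way look much like what you’d write by hand. A hypothetical example of the kind of comparison it might run (the dataset, intermediate model, and revenue columns are assumptions for illustration):

-- Compare mart revenue against its upstream aggregation for one customer
-- (the analytics dataset and int__orders_aggregated model are placeholders)
select
    m.customer_id,
    m.total_revenue as mart_revenue,
    sum(i.order_revenue) as upstream_revenue
from analytics.mrt__sales__orders as m
left join analytics.int__orders_aggregated as i
    on m.customer_id = i.customer_id
where m.customer_id = 'ABC123'
group by 1, 2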

Subagents for Complex Debugging

For particularly thorny issues, you can create a dedicated debugging subagent. Save this as .claude/agents/sql-debugger.md:

---
name: sql-debugger
description: Specialized agent for debugging SQL and dbt issues
---
You are an expert SQL debugger specializing in BigQuery and dbt.
## Your Approach
1. **Gather Evidence First**
   - Read error logs completely
   - Check compiled SQL
   - Review recent git changes to the model
2. **Form Hypotheses**
   - List possible causes ranked by likelihood
   - Consider: data issues, schema changes, logic errors
3. **Test Systematically**
   - Isolate the problem with a minimal reproducing case
   - Use WHERE clauses to test on small data
   - Add CTEs to inspect intermediate results
4. **Fix and Verify**
   - Implement the minimum change to fix the issue
   - Add tests to prevent regression
   - Document the root cause

Invoke it with:

Use the sql-debugger agent to investigate why int__ga4_sessions_attributed
is returning NULL for channel_grouping on 5% of sessions.

The subagent gets fresh context focused entirely on debugging. Useful when your main conversation is cluttered with other context.

Putting It Together

These workflows connect naturally. You receive a request for a new model, use /generate-tests to write tests first, implement the model with Claude iterating until tests pass, tell Claude to fix any errors that come up, run /document-model before opening a PR, and commit with full test coverage and documentation.

The value isn’t in any single command. It’s in the systematic, repeatable patterns. Claude Code becomes part of your process rather than something you occasionally consult.

Share these with your team by checking CLAUDE.md and your commands folder into git. Everyone gets the same workflows, the same guardrails, the same documentation standards.
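
Concretely, the shareable pieces live in a handful of paths (hooks go in settings.json):

your-project/
├── CLAUDE.md              # Shared project memory
└── .claude/
    ├── settings.json      # Hooks, checked into git
    ├── commands/          # /generate-tests, /document-model, /test-prompt
    └── agents/            # sql-debugger and friends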

Start with one workflow. /document-model is probably the easiest entry point since documentation is always behind and the command provides immediate value. Once that’s working, add testing workflows. Build incrementally.

The goal isn’t to automate everything. It’s to automate the tedious parts so you can focus on the decisions that actually matter.