Adrienne Vermorel
How I Use Claude Code for dbt Development
I work with dbt Core and dbt Cloud, primarily on BigQuery with occasional Snowflake projects. My daily responsibilities involve creating base models, authoring tests, documenting columns, and troubleshooting incremental models.
Claude Code has transformed my workflow—not through perfect code generation, but by handling the conversion work: translating requirements into actual files while I focus on the strategic thinking.
The Workflows That Changed Everything
Base Model Generation
Previously, creating a base model required 15-20 minutes: examining source schema, reviewing existing models, adapting structures, adding metadata, writing YAML, and including tests.
Now I request:
“Create a dbt base model for source_shopify.orders following the patterns in models/base/”
Claude Code examines existing models, recognizes naming conventions, identifies standard metadata columns, and generates both SQL and YAML files. I review the diff and approve.
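To make this concrete, here is a sketch of the kind of file such a prompt tends to produce. The source name, column names, and metadata columns are assumptions for illustration (the metadata columns follow the CLAUDE.md defaults described below):

```sql
-- models/base/base_shopify__orders.sql (illustrative sketch; columns assumed)
with source as (

    select * from {{ source('shopify', 'orders') }}

),

renamed as (

    select
        cast(id as string) as order_id,
        cast(customer_id as string) as customer_id,
        order_status,
        datetime(created_at, 'Europe/Paris') as created_at,
        _loaded_at,
        _source_table
    from source

)

select * from renamed
```

The value is not that any one file is hard to write, but that Claude Code reproduces the same shape consistently across dozens of source tables.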
Effective prompts specify constraints:
“Create a base model for source_stripe.charges from that specific GCP project. Follow patterns in models/base/base_shopify__orders.sql…”
Vague prompts produce generic results requiring extensive corrections.
I maintain a CLAUDE.md file specifying defaults:
“All base models include _loaded_at and _source_table columns. Timestamps cast to Europe/Paris timezone. Source freshness: warn_after 12 hours, error_after 24 hours…”
This prevents inconsistencies. The process now takes 2-3 minutes including review—a substantial improvement for onboarding new data sources with numerous tables.
Test Writing
Writing tests often gets postponed despite being necessary. Claude Code makes test generation sufficiently quick that I now complete them systematically.
For schema tests, I reference a model:
“Write schema tests for mrt__daily_revenue. Include uniqueness on the primary key, not_null on required fields, and accepted_values where appropriate…”
Claude reads the model, traces columns, and generates appropriate tests while catching relationships I might overlook.
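The output of a prompt like this is a standard dbt schema YAML file. A minimal sketch, assuming hypothetical column names for mrt__daily_revenue:

```yaml
# models/marts/mrt__daily_revenue.yml (sketch; column names are assumed)
version: 2

models:
  - name: mrt__daily_revenue
    columns:
      - name: revenue_date
        tests:
          - unique
          - not_null
      - name: total_revenue
        tests:
          - not_null
      - name: currency
        tests:
          - accepted_values:
              values: ['EUR', 'USD']
```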
For singular tests requiring complex logic, I describe validation requirements:
“Write a singular test that verifies mrt__customer_lifetime_value never has negative values for total_revenue, and that first_order_date is always before or equal to last_order_date.”
Being explicit about data quality rules ensures consistency across test generation.
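For reference, a singular test in dbt is just a SQL file in tests/ that passes when it returns zero rows. A sketch of what the prompt above might produce (file name and column list assumed):

```sql
-- tests/assert_customer_lifetime_value_valid.sql (illustrative sketch)
-- A singular test passes when the query returns zero rows.
select
    customer_id,
    total_revenue,
    first_order_date,
    last_order_date
from {{ ref('mrt__customer_lifetime_value') }}
where total_revenue < 0
   or first_order_date > last_order_date
```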
Documentation
Column descriptions provide genuine value but are tedious to write. My approach involves requesting:
“Add documentation to models/marts/mrt__product_performance.yml. Read the model to understand what each column represents…”
Claude reads model SQL, traces columns to sources when necessary, and writes descriptions explaining functionality.
Quality documentation differs from technical definitions:
- Poor: “order_count: The count of orders”
- Better: “order_count: Total number of completed orders for this product in the time period. Excludes cancelled and refunded orders.”
For model descriptions, I request contextual information:
“Write a model description for mrt__site_performance that explains: What business questions it answers, key grain, main upstream dependencies, important caveats…”
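A sketch of the resulting YAML, with the description content invented for illustration (the column example reuses the "Better" pattern above):

```yaml
# Sketch only; the description text and columns are assumptions
models:
  - name: mrt__site_performance
    description: >
      Daily site performance metrics, one row per site per day.
      Answers questions about traffic and conversion trends.
      Main upstream dependencies: base__shopify__orders and web
      analytics sources. Caveat: days with no recorded sessions
      are excluded rather than reported as zero.
    columns:
      - name: order_count
        description: >
          Total number of completed orders for this site in the time
          period. Excludes cancelled and refunded orders.
```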
Debugging
Previously: read error → open model → trace upstream → run queries manually → identify issue, involving multiple tab switches.
Now:
“[Paste error] Why is it failing and how do I fix it?”
Claude examines the model, often running dbt compile to view rendered SQL, then identifies issues ranging from null join keys to macro complications.
Teaching Claude Code to use dbt compile effectively proved valuable. I include in my CLAUDE.md:
“Use `dbt compile -s model_name` to see rendered SQL. Use `dbt parse` to understand lineage. Check source freshness with `dbt source freshness -s source_name`.”
Refactoring
Renaming columns used across numerous downstream models previously discouraged thorough refactoring. With Claude Code:
“Rename customer_id to user_id across all models in models/marts/ that reference dim_customers. Update the YAML files too.”
Claude locates all references, makes changes, and displays diffs for review. This functionality excels for project-wide migrations:
“We’re moving from ref('stg_orders') to ref('base__shopify__orders') across the codebase. Find all references to stg_* models and update them to use the new base__* naming convention.”
Claude’s codebase awareness enables understanding dependencies and making consistent modifications.
Prompts That Work (and Ones That Don’t)
Effective Prompts Demonstrate:
- Specificity about desired outcomes
- Explicit constraints and requirements
- References to existing codebase patterns
- Clear definitions of “good” results
Strong example:
“Create an intermediate model int_orders__pivoted that takes base__shopify__orders and pivots the order_status column into separate boolean columns (is_pending, is_completed, is_cancelled). Follow the pattern in int_customers__aggregated. Add tests for the primary key and ensure the grain stays at order_id.”
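Because every decision is stated up front (source model, pivot logic, naming, grain), the result is predictable. A sketch of what this prompt describes, with the status values assumed:

```sql
-- models/intermediate/int_orders__pivoted.sql (sketch; statuses assumed)
with orders as (

    select * from {{ ref('base__shopify__orders') }}

),

pivoted as (

    select
        order_id,
        order_status = 'pending'   as is_pending,
        order_status = 'completed' as is_completed,
        order_status = 'cancelled' as is_cancelled
    from orders

)

select * from pivoted
```

No aggregation occurs, so the grain remains one row per order_id, as the prompt required.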
Ineffective Prompts Are:
- Vague about intent
- Reliant on session memory (which doesn’t exist)
- Missing project context
- Silent on critical decisions
Poor example:
“Make a model that aggregates orders”
Claude produces generic output. Another weak example:
“Like we discussed yesterday, update the revenue model”
Claude lacks memory between sessions. Specify the model, state required changes, and provide context.
Sample CLAUDE.md Structure
# Analytics Project
## Project Structure
- models/base/ — One model per source table, light transformations only
- models/intermediate/ — Business logic, joins, aggregations
- models/marts/ — Final outputs for BI tools and analysts

## Common Commands
- `dbt run -s model_name` — Run specific model
- `dbt run -s +model_name` — Run model with upstream dependencies
- `dbt test -s model_name` — Test specific model
- `dbt compile -s model_name` — See rendered SQL (useful for debugging)
- `dbt build -s model_name+` — Run and test model with downstream dependencies

## SQL Style
- Trailing commas in SELECT
- CTEs ordered: imports, then logical transformations
- CTE names: prefixed with source (e.g., orders__filtered)
- snake_case for all identifiers
- Explicit column lists (no SELECT *)

## Naming Conventions
- Base models: base__{source}__{table}
- Intermediate: int__{entity}__{table}
- Marts: mrt__{entity}__{area}

## Data Types
- Timestamps: TIMESTAMP, cast to Europe/Paris timezone
- Currency: INTEGER (cents, not decimals)
- IDs: STRING (even if numeric at source)

## Testing Standards
- Every model has unique + not_null on primary key
- Foreign keys tested with relationships
- Source freshness on all sources

## Warehouse
BigQuery (project: dbt-prod, dataset: dbt)

This document, approximately 40 lines long, requires an initial investment and periodic updates but becomes increasingly valuable over time.
When NOT to Use Claude Code
Claude Code handles implementation, not conceptualization. I avoid using it for:
Complex business logic requiring understanding: Designing attribution models or calculating customer lifetime value requires personal engagement. Delegating this work produces code I don’t fully understand—problematic for mission-critical logic.
Security-sensitive queries: Anything involving personally identifiable information, credentials, or production write access demands careful handling. Text generators lack understanding of consequences.
Learning new concepts: When acquiring new skills, struggling through difficulties facilitates genuine comprehension. Bypassing this struggle means lacking true understanding.
Time Investment Versus Return
Setup required genuine effort:
- Installation and authentication: 10 minutes
- Initial CLAUDE.md creation: 1 hour
- Learning effective prompts: several weeks of experimentation
- Developing custom workflows: 1-2 days
Total: a few weeks before consistent value emerged.
Return: Estimated daily time savings on routine tasks (base models, tests, documentation, debugging, refactoring) accumulates to days monthly and significant time yearly.
Beyond time savings, previously-skipped tasks now get completed: documentation, tests, refactoring that once felt too tedious to prioritize.
Claude Code amplifies existing capabilities—it accelerates implementation but doesn’t replace foundational knowledge. Users requiring understanding of data structure, model design quality, and problem recognition derive maximum benefit.
The tool doesn’t improve data engineering fundamentals; it increases speed and consistency for those possessing them.