Claude Code generates dbt base models via pattern replication: it reads existing base models in the project, extracts their conventions, and applies them to a new source. The workflow covers prompting, CLAUDE.md defaults, and common review issues.
The Core Pattern
The prompt that works:
Create a dbt base model for source_shopify.orders following the patternsin models/base/Claude Code will:
- Read 2-3 existing base models to extract your conventions
- Check the source YAML for the available columns
- Generate the SQL applying your CTE structure, naming pattern, and materialization config
- Generate the YAML with primary key tests
The output should match existing base model conventions, not generic boilerplate — the 50th base model in a project should be as consistent with project conventions as the first.
Why Vague Prompts Fail
“Create a base model for orders” produces generic output. Without more context, Claude does not know:
- Which source system (Shopify, Stripe, a custom API?)
- Which GCP project if you have multiple environments
- Whether you deduplicate with
ROW_NUMBER()andQUALIFYor a different pattern - What your column naming convention is (
order_id,order__id, orord_id?) - Which metadata columns you always include (
_loaded_at,_source_table, etc.)
Vague prompts require extensive corrections.
Effective prompts specify constraints:
Create a base model for source_stripe.charges from the production-gcp-projectBigQuery project. Follow patterns in models/base/base_shopify__orders.sql.Include _loaded_at and _source_table metadata columns. Cast amount to FLOAT64and divide by 100 (Stripe stores cents). Deduplicate on charge_id usingROW_NUMBER() QUALIFY.That prompt leaves nothing to assumption. Claude generates a model that fits your project on the first attempt.
Encoding Defaults in CLAUDE.md
The better long-term solution is encoding your project’s base model conventions in CLAUDE.md as Project Memory. This means you don’t have to repeat constraints in every prompt — Claude reads them once and applies them throughout the session.
A CLAUDE.md section for base models:
## Base Models
- All base models include _loaded_at and _source_table columns- Timestamps cast to TIMESTAMP, localized to Europe/Paris timezone- Source freshness: warn_after 12 hours, error_after 24 hours- Deduplicate using ROW_NUMBER() QUALIFY on the primary key, ordered by _loaded_at DESC- Naming: base__[source]__[entity] (e.g., base__shopify__orders)- Materialize as table, tagged with ['base', 'source_name']
## Column naming- Double-underscore separator: order__id, customer__id, order__created_at- Singular entity names: customer not customersWith this in place, the prompt can be minimal:
Create a base model for source_stripe.charges.Primary key: charge_id. Amount stored as cents (divide by 100).Claude reads CLAUDE.md, applies the conventions, and generates a complete model with the right metadata columns, deduplication pattern, materialization config, and naming.
A Typical Generated Base Model
Here’s what the output should look like for a Stripe charges source, with standard conventions applied:
{{ config( materialized='table', tags=['base', 'stripe']) }}
WITH source AS ( SELECT id, customer_id, amount, currency, status, created, _loaded_at FROM {{ source('stripe', 'charges') }}),
deduplicated AS ( SELECT * FROM source QUALIFY ROW_NUMBER() OVER ( PARTITION BY id ORDER BY _loaded_at DESC ) = 1),
renamed AS ( SELECT id AS charge__id, customer_id AS customer__id, CAST(amount AS FLOAT64) / 100 AS charge__amount_usd, LOWER(currency) AS charge__currency, status AS charge__status, TIMESTAMP_SECONDS(created) AS charge__created_at, _loaded_at, '{{ source("stripe", "charges") }}' AS _source_table FROM deduplicated)
SELECT * FROM renamedNotice the structure: source → deduplicated → renamed → final SELECT. This follows the standard base CTE pattern. No business logic, no joins, just clean typed data with consistent naming.
The YAML Claude generates alongside this:
models: - name: base__stripe__charges description: "This model contains one row per Stripe charge, deduplicated and renamed to project conventions." columns: - name: charge__id description: "Primary key. Stripe charge identifier." tests: - unique - not_null - name: customer__id description: "Stripe customer identifier. Foreign key to base__stripe__customers." tests: - not_null - name: charge__amount_usd description: "Charge amount in USD. Stripe stores as integer cents; converted to decimal." tests: - not_nullHandling New Source Onboarding
Where this pattern pays off most is onboarding an entirely new source system. A Stripe connector might expose 15-20 tables: charges, customers, invoices, subscriptions, payment_methods, products, prices, events, and so on.
With Claude Code:
We're onboarding the Stripe source. Create base models for all tables insource_stripe: charges, customers, invoices, subscriptions, payment_methods,products, prices. Follow patterns in models/base/. One model per table,one SQL file and one YAML file per model.Claude reads one or two existing base models, then generates all 7 in sequence. Review each diff. The process takes 30-45 minutes, and the output is more consistent than models written by different people at different times.
Reviewing Claude’s Output
Review every generated model before committing. The most common issues:
Missing type casts. Claude sometimes leaves columns as-is when they need explicit casting. Check every timestamp, boolean, and monetary column.
Wrong deduplication key. Compound primary keys (partitioned on multiple columns) require explicit instruction. If your source has a composite key, specify it in the prompt.
Invented columns. If Claude doesn’t have access to the actual source schema, it may generate column names based on what “seems right” for that type of table. Verify every column exists in the actual source.
Inconsistent metadata columns. If your CLAUDE.md specifies _loaded_at, check that it appears correctly. Claude sometimes drops metadata columns when it doesn’t see them in the source schema definition.
The review takes 2-3 minutes per model. That’s still a substantial improvement over writing the model from scratch — but skipping review is how subtle inconsistencies compound across a project.
When to Use This Pattern vs. Manual Writing
Claude Code base model generation makes sense when:
- Onboarding a new source with multiple tables
- Adding individual tables from an existing connector
- Standardizing legacy base models to a new naming convention (generate + compare)
Manual writing makes more sense when:
- The source has unusual patterns requiring deep thought (complex nested JSON, GA4 event unnesting)
- You’re learning the source system and want to understand it thoroughly
- The model has business logic that shouldn’t be in base but you’re sorting out the right layer
The dbt Base Layer Patterns note covers what belongs in base models. If you find yourself deciding whether something is “base layer logic,” that’s a signal to write manually — the thinking is the point.