Every dbt unit test lives in a YAML file with four required elements: a name, a target model, mocked inputs (given), and expected output (expect). Beyond those, there are three input formats to choose from and a handful of optional configuration settings.
Required Elements
Every unit test needs exactly four things: a name, a target model, input data (given), and expected output (expect).
```yaml
unit_tests:
  - name: test_customer_status_logic       # Unique identifier
    model: mrt__core__customers            # Model being tested
    given:                                 # Mock inputs
      - input: ref('base__crm__customers')
        rows:
          - {customer_id: 1, status: "active"}
    expect:                                # Expected output
      rows:
        - {customer_id: 1, is_active: true}
```

The given section accepts three kinds of inputs:
- ref() for models — the most common case
- source() for source tables
- this for self-references in incremental models
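For the less common input types, here is a rough sketch of how source() and this appear in a test. The model, source, and column names are invented for illustration, and the is_incremental override is included because a this input only matters when the incremental code path runs:

```yaml
unit_tests:
  - name: test_incremental_dedup           # hypothetical test and model names
    model: int__events__deduped
    overrides:
      macros:
        is_incremental: true               # force the incremental code path
    given:
      - input: source('crm', 'raw_events') # mock a source table
        rows:
          - {event_id: 1, loaded_at: "2024-01-02"}
      - input: this                        # mock the model's own existing rows
        rows:
          - {event_id: 1, loaded_at: "2024-01-01"}
    expect:
      rows:
        - {event_id: 1, loaded_at: "2024-01-02"}
```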
A critical convenience: you only need to specify columns that your logic actually uses. If base__crm__customers has 30 columns but your test only cares about customer_id and status, those two are sufficient. dbt handles the rest by filling in defaults.
Input Formats
dbt supports three formats for defining test data. Each has a sweet spot.
Dict Format
The default and most readable for small datasets. Each row is a YAML dictionary:
```yaml
given:
  - input: ref('base__shopify__orders')
    format: dict
    rows:
      - {order_id: 1, amount: 100.00, status: "completed"}
      - {order_id: 2, amount: 50.00, status: "pending"}
```

Dict format is what you’ll use 90% of the time. It’s concise, easy to scan, and plays well with code review diffs. The only downside is that readability degrades when rows have many columns — at that point the YAML lines get long.
CSV Format
Better for larger datasets or when you want to share fixture files across tests:
```yaml
given:
  - input: ref('base__shopify__orders')
    format: csv
    rows: |
      order_id,amount,status
      1,100.00,completed
      2,50.00,pending
```

CSV format also supports external fixture files, which is useful when the same dataset is needed by multiple tests:
```yaml
given:
  - input: ref('base__shopify__orders')
    format: csv
    fixture: order_test_data
```

This looks for tests/fixtures/order_test_data.csv in your project. External fixtures keep your YAML files clean when test data is large, but they add indirection — someone reading the test needs to open a separate file to see the inputs.
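For completeness, the fixture is just a plain CSV file with a header row. Assuming the same illustrative orders data as above, tests/fixtures/order_test_data.csv would contain:

```csv
order_id,amount,status
1,100.00,completed
2,50.00,pending
```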
SQL Format
Required in two specific situations: ephemeral models and empty table scenarios.
```yaml
given:
  - input: ref('base__shopify__orders')
    format: sql
    rows: |
      select 1 as order_id, 100.00 as amount, 'completed' as status
      union all
      select 2 as order_id, 50.00 as amount, 'pending' as status
```

For testing zero-row scenarios (what happens when a table is empty?), SQL format is the only option:
```yaml
given:
  - input: ref('base__shopify__orders')
    format: sql
    rows: |
      select
        cast(null as int64) as order_id,
        cast(null as float64) as amount
      where false
```

The where false trick creates a result set with the correct schema but zero rows. This is essential for testing models that need to handle empty upstream tables gracefully.
One important caveat with SQL format for ephemeral models: you must include ALL columns the model references, not just the ones relevant to your test. Ephemeral models can’t be queried for their schema, so dbt has no way to fill in defaults.
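A sketch of what that looks like in practice: assume a hypothetical ephemeral input eph__shopify__orders from which the tested model reads order_id, amount, status, and created_at. Even if the test only checks order totals, the SQL mock has to produce all four columns:

```yaml
given:
  - input: ref('eph__shopify__orders')   # hypothetical ephemeral model
    format: sql
    rows: |
      select
        1 as order_id,
        100.00 as amount,
        'completed' as status,
        date '2024-01-01' as created_at  -- required even though the test never checks it
```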
Optional Configuration
Beyond the basics, unit tests support description, tags, meta, and conditional enablement:
```yaml
unit_tests:
  - name: test_revenue_calculation
    model: mrt__finance__orders
    description: "Validates gross revenue calculation including tax"
    config:
      tags: ["critical", "finance"]
      meta:
        owner: "data-team"
        ticket: "DATA-1234"
      enabled: "{{ target.name != 'prod' }}"  # v1.9+ only
    given:
      - input: ref('base__shopify__orders')
        rows:
          - {order_id: 1, subtotal: 100.00, tax_rate: 0.08}
    expect:
      rows:
        - {order_id: 1, gross_revenue: 108.00}
```

Tags are particularly useful. They enable selective test runs (dbt test --select tag:critical) and make it easy to run just the unit tests that matter for a specific domain. If you tag tests by business area (finance, marketing, core), teams can run only their tests during development.
The enabled config lets you skip tests in certain environments. This is a dbt 1.9+ feature — it won’t work on 1.8.
The meta block is freeform. Use it for ownership (owner), traceability (ticket), or any project-specific metadata your team needs.
Version Compatibility
Unit testing syntax has evolved across dbt versions, and the differences matter:
- dbt 1.8 (May 2024): Unit testing introduced. The tests: key was renamed to data_tests: to avoid ambiguity with unit tests. If you’re upgrading from pre-1.8, rename any tests: blocks in your YAML to data_tests:.
- dbt 1.9: Added the enabled config option for conditional test execution. New --resource-type and --exclude-resource-type flags for filtering.
- dbt 1.11: Unit tests for disabled models are now automatically disabled — no more orphaned tests failing on models that have been turned off.
The data_tests: rename catches teams off guard during upgrades. If you see unexpected behavior after moving to 1.8+, check whether you still have tests: blocks that need renaming.
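As a concrete before-and-after (the model and column names are placeholders), the rename only touches the key, not the tests themselves:

```yaml
# Pre-1.8
models:
  - name: mrt__core__customers
    columns:
      - name: customer_id
        tests:            # old key
          - not_null
          - unique

# 1.8+
models:
  - name: mrt__core__customers
    columns:
      - name: customer_id
        data_tests:       # renamed key, same tests
          - not_null
          - unique
```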
Choosing a Format
For most teams, the decision is simple:
- Dict format for everything under ~10 rows per input (which is almost every unit test)
- CSV fixtures when the same large dataset is reused across multiple tests
- SQL format only when dict/csv can’t work — ephemeral models and empty tables
Avoid mixing formats within a single test unless there’s a strong reason. Consistency makes tests easier to read and review.
The expect block supports the same formats as given, though dict format is almost always the right choice there — expected outputs are typically small, and you want them visible inline for quick comparison.
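If you ever do need another format there, expect accepts the same format, rows, and fixture keys as given. A minimal sketch with CSV, reusing the revenue example's columns:

```yaml
expect:
  format: csv
  rows: |
    order_id,gross_revenue
    1,108.00
```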