You’ve written the same date spine macro in three different projects. That set of revenue recognition models? Your team keeps pinging you for the latest version. At some point, copying SQL between repos creates more problems than it solves.
dbt packages are the answer, and building one is simpler than most people expect. A package is just a dbt project structured for reuse. The patterns that make it work (var(), dispatch, namespacing) are straightforward once you see them in action.
What makes a package different from a project
A dbt package is a dbt project. Same files, same structure, same execution model. The only mandatory file is dbt_project.yml. What makes a package different is intent: it’s designed for someone else to install into their project via dbt deps.
Three principles separate a well-built package from a regular project:
- Configurable. No hardcoded database names, schema references, or table identifiers. Everything users might need to customize goes through `var()`.
- Namespaced. Model names include the package prefix to avoid collisions: `my_package__customers`, not `customers`.
- Adapter-aware. SQL that differs across warehouses uses `adapter.dispatch()` so the package works on Snowflake, BigQuery, Redshift, and others.
Package directory structure
Here’s the layout used by dbt Labs and Fivetran for their own packages:
```
dbt-my_package/
├── dbt_project.yml          # Required: package configuration
├── packages.yml             # Upstream dependencies
├── macros/
│   ├── my_macro.sql
│   └── _macros.yml          # Macro documentation
├── models/
│   ├── base/
│   └── marts/
├── tests/generic/           # Custom generic tests
├── integration_tests/       # Sub-project for testing
│   ├── dbt_project.yml
│   ├── packages.yml         # References parent via local: ../
│   ├── seeds/               # Mock data
│   ├── models/
│   └── tests/
├── .github/workflows/       # CI configuration
├── README.md
├── CHANGELOG.md
└── LICENSE
```

The `dbt_project.yml` carries a few settings that matter for packages specifically:
```yaml
name: 'my_package'
version: '0.1.0'
require-dbt-version: [">=1.3.0", "<3.0.0"]
config-version: 2

models:
  my_package:
    +materialized: view

vars:
  my_package_schema: 'my_data'
  my_package_database: null
  my_package__some_model_enabled: true
```

The `require-dbt-version` range should include both dbt Core 1.x and Fusion 2.x; setting the upper bound to `<3.0.0` covers both. The default materialization should be `view`, not `table`, so users don't create unnecessary physical tables when they install your package. Every configurable option should have a sensible default declared under `vars`.
Writing reusable macros with dispatch
If your macro generates SQL that differs by warehouse, adapter.dispatch() makes it portable. You write a dispatcher macro that calls adapter-specific implementations.
```sql
-- macros/my_safe_divide.sql
{% macro my_safe_divide(numerator, denominator) %}
    {{ return(adapter.dispatch('my_safe_divide', 'my_package')(numerator, denominator)) }}
{% endmacro %}

{% macro default__my_safe_divide(numerator, denominator) %}
    CASE
        WHEN {{ denominator }} = 0 THEN NULL
        ELSE {{ numerator }} / {{ denominator }}
    END
{% endmacro %}

{% macro bigquery__my_safe_divide(numerator, denominator) %}
    SAFE_DIVIDE({{ numerator }}, {{ denominator }})
{% endmacro %}
```

The dispatch call looks for a `bigquery__my_safe_divide` implementation when running on BigQuery, then falls back to `default__my_safe_divide` for any other adapter.
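Downstream, the dispatcher is called like any other macro, and dbt resolves the adapter-specific implementation at compile time. A sketch of a consuming model (the model and column names here are hypothetical):

```sql
-- A model in a project that installed my_package
select
    order_id,
    {{ my_package.my_safe_divide('revenue', 'item_count') }} as revenue_per_item
from {{ ref('orders') }}
```

Note the `my_package.` prefix: macros from an installed package are called through the package namespace, while models inside the package itself can call `my_safe_divide(...)` directly.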
Before writing adapter-specific macros from scratch, check whether dbt Core already provides a built-in. Since dbt-utils v1.0, cross-database macros like datediff, dateadd, safe_cast, and hash live in the dbt namespace. Call {{ dbt.datediff(...) }} instead of reimplementing date arithmetic yourself.
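For example, instead of maintaining per-adapter date arithmetic, call the built-in directly (column and model names here are illustrative):

```sql
select
    user_id,
    {{ dbt.datediff('first_order_at', 'last_order_at', 'day') }} as days_active
from {{ ref('customers') }}
```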
Users of your package can also override dispatch behavior in their own dbt_project.yml by specifying a dispatch config with a custom search_order. This is how packages like spark_utils shim compatibility for non-core adapters. The same principles that make project macros reusable apply directly to package macros.
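A user-side override might look like this (a sketch: `my_package` is the namespace from the example above, and the project and shim package names in `search_order` are hypothetical):

```yaml
# The user's dbt_project.yml
dispatch:
  - macro_namespace: my_package
    search_order: ['my_project', 'spark_utils', 'my_package']
```

With this config, dbt checks the user's own project for an override first, then the shim package, before falling back to your package's implementations.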
Making models packageable
The Fivetran team maintains over 100 dbt packages, and their pattern for configurable models has become the standard. Three techniques make models installable by anyone.
Schema and database via var(). Never hardcode where source data lives:
```sql
-- models/base/base__my_package__events.sql
WITH source AS (
    SELECT event_id, event_name, event_timestamp, user_id
    FROM {{ source('my_package', 'events') }}
)
SELECT event_id, event_name, event_timestamp, user_id
FROM source
```

```yaml
sources:
  - name: my_package
    schema: "{{ var('my_package_schema', 'my_data') }}"
    database: "{{ var('my_package_database', target.database) }}"
    tables:
      - name: events
        identifier: "{{ var('my_package_events_identifier', 'events') }}"
```

Users point the package at their own schema by setting `my_package_schema` in their dbt_project.yml. The `identifier` var handles cases where a table has a different name in someone's warehouse.
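On the consumer side, redirecting the package is just a matter of setting those vars (a sketch using the var names above; the schema, database, and identifier values are hypothetical):

```yaml
# The consuming project's dbt_project.yml
vars:
  my_package_schema: raw_events
  my_package_database: analytics_raw
  my_package_events_identifier: events_v2
```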
Enable/disable models. Let users turn off parts of the package they don’t need:
```sql
-- models/marts/my_package__daily_summary.sql
{{ config(enabled=var('my_package__daily_summary_enabled', true)) }}
```

Prefix model names. Every model in your package should start with the package name. If your package is called `revenue_tools`, name your models `revenue_tools__monthly_mrr` and `revenue_tools__churn_events`, not `monthly_mrr` and `churn_events`. This prevents naming collisions when users have multiple packages installed.
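On the consumer side, turning that model off is one line (a sketch, using the var name from the example above):

```yaml
# The consuming project's dbt_project.yml
vars:
  my_package__daily_summary_enabled: false
```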
Testing with the integration tests pattern
You can’t test a dbt package in isolation because it’s designed to be installed inside another project. The solution is an integration_tests/ sub-project inside your package repo that installs the parent package as a local dependency.
```yaml
packages:
  - local: ../
```

The testing workflow uses seeds as mock data. Create CSV files representing the source data your package expects, configure them to land in the right schema, then compare model outputs against expected results.
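A seed standing in for the events source might look like this (the columns match the example source from earlier; the rows are made up):

```csv
event_id,event_name,event_timestamp,user_id
1,signup,2024-01-02 09:15:00,u_001
2,page_view,2024-01-02 09:16:30,u_001
3,signup,2024-01-03 14:02:00,u_002
```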
```yaml
name: 'my_package_integration_tests'

seeds:
  my_package_integration_tests:
    +schema: my_data  # Match the package's default source schema
```

Write a schema test that compares your model's output against an expected seed:

```yaml
models:
  - name: my_package__daily_summary
    data_tests:
      - dbt_utils.equality:
          compare_model: ref('expected_daily_summary')
```

Run the full suite from the sub-project:
```shell
cd integration_tests/
dbt deps && dbt seed && dbt run && dbt test
```

This pattern, used by dbt-utils and every Fivetran package, gives you confidence that your package produces correct results before anyone else installs it. If you want to go further, a broader testing strategy with generic tests on your package's models adds another layer of protection.
Publishing to the dbt Hub
The dbt Hub is the standard distribution channel for open-source dbt packages. Publishing requires three things: the package is hosted on GitHub, it has a dbt_project.yml with a name field, and releases use semantic versioning tags.
The process:
- Create a GitHub release with a semver tag (e.g., `v0.1.0`). Tags like `first-release` or `beta` are ignored by the Hub's indexer.
- Open a pull request on the `hub.getdbt.com` repository to add your package to the index. PRs are typically approved within one business day.
- After initial registration, a script called `hubcap` runs hourly to detect new GitHub releases automatically. No further PRs are needed for version updates.
Once published, users install your package with:
```yaml
packages:
  - package: your-namespace/my_package
    version: [">=0.1.0", "<1.0.0"]
```

Hub packages have one significant advantage over Git packages: automatic transitive dependency resolution. If your package depends on dbt-utils and the user's project also depends on dbt-utils, the Hub reconciles version conflicts automatically. Git packages can't do this.
Hub best practices from the hubcap documentation:
- Set `require-dbt-version` so users know compatibility upfront
- Declare dependencies from the Hub (not Git) whenever possible
- Use the widest version ranges that work (don't pin to patch versions)
- Don't override dbt Core behavior that affects resources outside your package
- Use dbt Core's built-in cross-database macros instead of writing your own
Private packages via Git
Not every package belongs on the Hub. Internal packages shared within a company can be distributed as Git dependencies:
```yaml
packages:
  - git: "https://github.com/my-org/dbt-internal-utils.git"
    revision: v0.3.0
```

Pin to a tag or commit SHA, not a branch name. Pointing at `main` means your package resolution changes with every commit, breaking reproducibility.
For private repositories, dbt supports native authentication through Git providers:
```yaml
packages:
  - private: my-org/dbt-internal-utils
    provider: github
    revision: v0.3.0
```

This uses the Git integration already configured in your environment (GitHub, GitLab, Azure DevOps) without embedding tokens in your packages.yml. For cases where you need explicit tokens, the traditional approach still works via `env_var()`:
```yaml
packages:
  - git: "https://{{ env_var('GIT_TOKEN') }}@github.com/my-org/dbt-internal-utils.git"
    revision: v0.3.0
```

CI/CD for packages
A package that works on your laptop might fail on a different warehouse or dbt version. GitHub Actions with a matrix strategy covers both dimensions:
```yaml
name: CI
on: [pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        warehouse: [snowflake, bigquery, postgres]
        dbt-version: ['1.9.0', '1.11.0']
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install dbt-${{ matrix.warehouse }}==${{ matrix.dbt-version }}
      - run: |
          cd integration_tests/
          dbt deps
          dbt seed --target ${{ matrix.warehouse }}
          dbt run --target ${{ matrix.warehouse }}
          dbt test --target ${{ matrix.warehouse }}
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          BIGQUERY_KEYFILE: ${{ secrets.BIGQUERY_KEYFILE }}
```

Store warehouse credentials as GitHub Secrets. Each matrix combination runs the full integration test suite, catching adapter-specific issues before they reach users.
Mistakes I see in dbt packages
After reviewing dozens of community packages, the same issues come up repeatedly.
Hardcoded schema references. FROM my_database.raw_stripe.payments works in your project, breaks in everyone else’s. Always use source() with var() for schema and database configuration.
Missing dispatch implementations. If your package uses SQL that varies by warehouse and you only write a default__ implementation, users on other adapters get unexpected behavior or errors. Test on every adapter you claim to support.
Tight version constraints. Pinning to version: "0.20.1" forces every user to use exactly that version and creates dependency conflicts with other packages. Use ranges: [">=0.20.0", "<1.0.0"].
Generic model names. A model called customers will collide with the user’s own customers model. Prefix everything with your package name.
Table materialization by default. When someone runs dbt deps && dbt run, your package shouldn’t create 30 physical tables in their warehouse. Default to views and let users override for performance.
No require-dbt-version. Without this, users on incompatible dbt versions get cryptic compilation errors instead of a clear message. With the Fusion engine (dbt 2.0) now available, setting [">=1.3.0", "<3.0.0"] prevents confusion across both runtimes.
Packages vs Mesh: when to use which
dbt packages and dbt Mesh solve different problems, though they overlap at the edges.
Packages are code sharing. When someone installs your package, they get the full source code (macros, models, tests). The code runs inside their project and compiles against their warehouse. Open-source packages on the Hub, internal utility libraries, and shared generic tests are all good fits for this model.
Mesh (via dependencies.yml and cross-project refs) is data product sharing. Teams reference each other’s published models without installing source code. A marketing team can ref('finance', 'mrt__finance__monthly_revenue') without knowing how that model is built. Mesh requires dbt Cloud Enterprise and model access controls (public, protected, private).
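For contrast with packages, a cross-project ref involves no installed source code. A minimal sketch, assuming a producer project named `finance` that has published `mrt__finance__monthly_revenue` with public access:

```yaml
# dependencies.yml in the consuming (marketing) project
projects:
  - name: finance
```

```sql
-- A marketing model referencing finance's public model
select *
from {{ ref('finance', 'mrt__finance__monthly_revenue') }}
```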
If you’re sharing reusable logic (macros, generic tests, model templates), build a package. If you’re sharing data products (curated models with defined contracts), use Mesh. For teams exploring Mesh, the dbt-meshify CLI tool helps split a monolithic project into interconnected projects. How you structure your dbt projects in the first place is a decision that often comes before choosing between packages and Mesh.
Most packages start as code you’ve already written and battle-tested across your own projects. The packaging step is mostly about making that code configurable, collision-proof, and verifiable. You don’t need to publish to the Hub on day one. A Git package shared across your team’s projects is a perfectly good starting point, and when it’s stable enough for the broader community, the Hub is a single PR away.
The dbt package ecosystem has over 400 packages, but there are always gaps. If you’ve solved a problem that other analytics engineers face, packaging that solution is one of the most valuable contributions you can make to the community.