Elementary is a dbt-native data observability tool. It extends dbt with anomaly detection, historical test tracking, freshness monitoring, and alerting — capabilities that sit in a gap between dbt’s built-in generic tests and full-blown commercial observability platforms like Monte Carlo or Bigeye. Everything Elementary produces lives in your warehouse as queryable tables, which means you own the data and can build on top of it with any BI tool.
Architecture: Two Components
Elementary has two parts that serve distinct purposes.
The dbt package installs via packages.yml like any other dbt dependency. It creates metadata tables in a dedicated schema and uses on-run-end hooks to capture artifacts after every dbt run or dbt test. Model execution times, test results, schema snapshots, and run metadata all flow into tables you control.
The CLI (edr) is a standalone Python tool that reads from those warehouse tables. It generates HTML observability reports, sends alerts to Slack or Teams, and executes anomaly detection logic. The CLI connects to your warehouse through its own profile, separate from your dbt profile.
The data flow is linear:
```
dbt run/test --> on-run-end hooks --> INSERT into Elementary tables --> edr reads tables --> reports/alerts
```

This separation matters. The dbt package has zero runtime cost beyond the hook inserts. The CLI runs independently, on whatever schedule you choose, and can be pointed at any environment where Elementary tables exist.
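Because the CLI is decoupled from dbt runs, it can be scheduled independently. A minimal sketch, assuming edr is installed on the cron host and the Slack token is available in the environment:

```
# crontab entry (illustrative): poll for new test failures every hour
0 * * * * edr monitor --slack-token "$SLACK_TOKEN" --slack-channel-name data-alerts
```

Running the monitor on its own cadence means alerting latency is independent of how often your dbt jobs run.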
Installation
Add the package and configure its schema:
```yaml
# packages.yml
packages:
  - package: elementary-data/elementary
    version: 0.21.0
```
```yaml
# dbt_project.yml
models:
  elementary:
    +schema: "elementary"
```

For dbt 1.8+, two flags are required because of changes to how package materializations work:
```yaml
flags:
  require_explicit_package_overrides_for_builtin_materializations: False
  source_freshness_run_project_hooks: True
```

You also need a materialization override macro. Without it, tests run but Elementary’s result tables stay empty — the most common silent failure during setup:
```sql
-- macros/elementary_materialization.sql
{% materialization test, default %}
  {{ return(elementary.materialization_test_default()) }}
{% endmaterialization %}
```

Run dbt deps, then dbt run --select elementary to create the metadata tables, then dbt test to populate them with initial results.
Anomaly Detection Tests
Where dbt’s built-in tests validate static rules you define, Elementary’s tests learn patterns from historical data and flag deviations. They use Z-score statistics: a sensitivity of 3 means alerting when a metric falls more than 3 standard deviations from its historical mean.
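The Z-score logic can be illustrated with a short sketch. This is plain Python, not Elementary’s actual implementation; the function and variable names are invented for illustration:

```python
from statistics import mean, stdev

def is_anomaly(history, latest, sensitivity=3):
    """Flag `latest` if it falls more than `sensitivity` standard
    deviations from the mean of the historical values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return False  # no variance in history: nothing to compare against
    z_score = (latest - mu) / sigma
    return abs(z_score) > sensitivity

# Fourteen days of roughly stable daily row counts, then a sudden drop.
daily_rows = [10_100, 9_900, 10_050, 9_950, 10_000, 10_020, 9_980,
              10_010, 9_990, 10_030, 9_970, 10_040, 9_960, 10_000]
print(is_anomaly(daily_rows, 2_000))   # → True: a drop to 2,000 is flagged
print(is_anomaly(daily_rows, 10_025))  # → False: within normal variation
```

Raising the sensitivity widens the tolerated band and produces fewer alerts; lowering it does the opposite.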
Volume Anomalies
Detects unusual row counts. If a source table normally receives 10,000 rows per day and suddenly gets 2,000, a not_null test passes but volume_anomalies catches it.
```yaml
tests:
  - elementary.volume_anomalies:
      where: "event_date = current_date()"
      time_bucket:
        period: day
        count: 1
```

Freshness Anomalies
Monitors time between table updates. Unlike dbt’s source freshness checks, which use a fixed threshold you must define, freshness_anomalies learns the normal update cadence and flags deviations. event_freshness_anomalies tracks the lag between when an event occurred and when it was loaded.
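A hedged configuration sketch for both tests (column names are illustrative, and the exact parameter set may vary by Elementary version — check the package docs for yours):

```yaml
tests:
  - elementary.freshness_anomalies:
      timestamp_column: loaded_at
      anomaly_sensitivity: 3
  - elementary.event_freshness_anomalies:
      event_timestamp_column: occurred_at
      update_timestamp_column: loaded_at
```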
Column Anomalies
Tracks column-level metrics — null percentage, average, min, max, distinct count, zero count — and alerts when any metric deviates from its historical baseline. Useful for catching distribution shifts that pass every row-level constraint.
```yaml
tests:
  - elementary.column_anomalies:
      column_name: order__amount
      column_anomalies:
        - average
        - max
      anomaly_sensitivity: 3
      training_period:
        period: day
        count: 14
```

The training_period controls how much history Elementary uses to establish baselines. Fourteen days is a reasonable default; increase it if your data has weekly cycles, decrease it if patterns shift frequently.
Schema Changes
Detects added or deleted columns, type changes, and deleted tables. The schema_changes_from_baseline variant validates against an explicitly defined schema, which is useful for models with downstream consumers who depend on a stable contract.
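A sketch of the baseline variant, assuming the model and column names are illustrative — the baseline is the columns block with explicit data_types, and the shown test arguments may differ across Elementary versions:

```yaml
models:
  - name: mrt__finance__revenue
    columns:
      - name: revenue_amount
        data_type: numeric
    tests:
      - elementary.schema_changes_from_baseline:
          fail_on_added: true
          enforce_types: true
```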
Alerting
The edr monitor command sends notifications for test failures. Slack is the most common destination:
```shell
edr monitor --slack-token $SLACK_TOKEN --slack-channel-name data-alerts
```

Alerts become more useful with metadata in your model YAML:
```yaml
models:
  - name: mrt__finance__revenue
    meta:
      owner: "@jessica.jones"
      channel: finance-data-alerts
      alert_suppression_interval: 24
```

The channel field routes alerts to team-specific Slack channels. alert_suppression_interval (in hours) prevents repeated alerts for the same persistent failure — critical for avoiding alert fatigue.
For path-based routing across an entire directory of models:
```yaml
models:
  your_project:
    marts:
      finance:
        +meta:
          channel: finance-data-alerts
```

Alert grouping consolidates cascading failures into a single message instead of flooding a channel:
```shell
edr monitor --group-by table --group-alerts-threshold 5
```

Reports and Dashboards
edr report generates a self-contained HTML file showing test pass/fail history, model runtime trends, anomaly detection charts, and data lineage. For team access, host it on S3, GCS, or Azure Blob Storage using edr send-report.
Since Elementary stores everything in warehouse tables (elementary_test_results, dbt_run_results, dbt_models), you can also build custom dashboards in your existing BI tool. A useful starting query for tracking data quality over time:
```sql
SELECT
  DATE(detected_at) AS date,
  COUNT(CASE WHEN status = 'pass' THEN 1 END) AS passed,
  COUNT(CASE WHEN status = 'fail' THEN 1 END) AS failed,
  ROUND(
    COUNT(CASE WHEN status = 'pass' THEN 1 END) * 100.0 / COUNT(*),
    2
  ) AS pass_rate
FROM elementary_test_results
WHERE detected_at >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY 1
ORDER BY 1;
```

BigQuery-Specific Notes
The CLI profile requires an explicit location parameter (US, EU, or your specific region). dbt infers this, but Elementary does not; a missing location is the most common source of connection errors on BigQuery.
```yaml
elementary:
  outputs:
    default:
      type: bigquery
      method: oauth
      project: your-project-id
      dataset: your_schema_elementary
      location: US
      threads: 4
```

The service account running edr needs BigQuery Data Viewer on the Elementary dataset, Metadata Viewer and Resource Viewer on your dbt datasets, and Job User on the project. These are read-oriented roles; the CLI does not write to your production models.
Where Elementary Fits in a Testing Strategy
Elementary is best understood as a complement to explicit rule-based tests, not a replacement. The dbt Testing Taxonomy lays out five testing mechanisms in dbt; Elementary occupies the “unknown unknowns” space — anomalies you would not think to write explicit tests for.
A practical layering:
| Layer | Tool | What it catches |
|---|---|---|
| Primary keys, nulls, referential integrity | dbt generic tests | Known structural violations |
| Value ranges, patterns, business rules | dbt-expectations | Known domain violations |
| Transformation logic correctness | dbt unit tests | Logic bugs in SQL |
| Volume drops, freshness drift, distribution shifts | Elementary | Unknown anomalies |
| Schema stability for consumers | dbt model contracts | Breaking schema changes |
For incremental models, Elementary’s volume and freshness anomaly tests are particularly valuable. Incremental runs process only new data, so a silent upstream failure that stops sending rows will not cause an error — the incremental model simply processes zero new rows and succeeds. Volume anomaly detection catches this.
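For example, attaching the test at the source keeps watch on the upstream feed an incremental model reads, catching the zero-new-rows case before it propagates (source, table, and column names here are illustrative):

```yaml
sources:
  - name: raw_events
    tables:
      - name: orders
        tests:
          - elementary.volume_anomalies:
              timestamp_column: loaded_at
              time_bucket:
                period: day
                count: 1
```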
OSS vs. Cloud
Elementary OSS provides anomaly detection, HTML reports, and Slack/Teams alerting at no licensing cost. The trade-off is maintenance: expect 2-5 days for initial setup, ongoing configuration tuning, and self-managed report hosting.
Elementary Cloud adds automated ML monitors (more sophisticated than the OSS Z-score approach), column-level lineage extending to BI tools, a data catalog, and incident management integrations with PagerDuty, Jira, and ServiceNow. For teams under 4 engineers, the OSS version is typically sufficient. Beyond that, the operational overhead of self-hosting starts to compete with the cost of a managed solution.