A dashboard showing yesterday’s failures is helpful. An alert that tells you about a failure before your stakeholders notice is better. Proactive data teams catch problems early because their alerting does the watching for them.
Elementary can send alerts to Slack, Microsoft Teams, PagerDuty, and other incident management tools. This article covers how to configure each channel, route alerts to the right people, and keep the signal-to-noise ratio manageable.
Prerequisites
This guide assumes you have Elementary installed and running. Your elementary_test_results table should contain test execution data from your data quality tests. If you’re starting fresh, check the Elementary setup guide first.
Quick verification:
```shell
# Confirm you have test results to alert on
edr report --select last_invocation
```

If you see test results in the generated report, you’re ready to configure alerting.
Basic Alerting with edr monitor
The edr monitor command runs Elementary tests and sends alerts for failures. The basic invocation:
```shell
edr monitor --slack-token $SLACK_TOKEN --slack-channel-name data-alerts
```

This sends alerts to a Slack channel for any test that failed or warned in the most recent run.
edr report generates an HTML file for human review, while edr monitor is designed for automation. Run monitor in your CI/CD pipeline or on a schedule after each dbt build.
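For example, a minimal scheduled pipeline might look like the following. This is a sketch, assuming GitHub Actions, a `SLACK_TOKEN` secret, and a runner where dbt and edr are already installed; adapt the schedule and step names to your setup:

```yaml
# .github/workflows/nightly-data-quality.yml (hypothetical)
name: nightly-data-quality
on:
  schedule:
    - cron: "0 6 * * *"   # daily, after upstream loads finish
jobs:
  build-and-alert:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build models and run tests
        run: dbt build
      - name: Send alerts for failures
        if: always()   # alert even when the build step fails
        run: edr monitor --slack-token ${{ secrets.SLACK_TOKEN }} --slack-channel-name data-alerts
```

The `if: always()` on the alerting step matters: a failing `dbt build` is exactly when you want `edr monitor` to run.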
Configuring Alert Metadata
Control what appears in alerts through test metadata in your YAML files:
```yaml
models:
  - name: mrt__finance__revenue
    meta:
      owner: "@jessica.jones"
      subscribers: ["@jessica.jones", "@joe.joseph"]
      description: "Daily revenue aggregation for finance reporting"
      tags: ["critical", "finance"]
      channel: finance-data-alerts
      alert_suppression_interval: 24
```

The owner and subscribers fields control who gets mentioned. The channel field routes alerts to specific Slack channels. The alert_suppression_interval prevents repeated alerts for the same failing test within the specified number of hours.
Slack Integration
Slack is the most common alerting destination. Elementary supports two integration methods.
Token-Based (Recommended)
Token-based integration gives you full control: custom channels per model, user tagging, and file uploads for detailed failure information.
Create a Slack app and add these bot token scopes:
- channels:join
- channels:read
- chat:write
- files:write
- users:read
- users:read.email
- groups:read
Install the app to your workspace and copy the bot token.
Configure in your Elementary profile or pass directly to the CLI:
```yaml
# In your Elementary config
slack:
  token: xoxb-your-slack-token
  channel_name: data-alerts
  group_alerts_by: "table"
```

Or via command line:
```shell
edr monitor \
  --slack-token $SLACK_TOKEN \
  --slack-channel-name data-alerts \
  --group-by table
```

Webhook-Based
Webhooks are simpler to set up but limited to a single channel with no user tagging. Create an incoming webhook in Slack, then:
```shell
edr monitor --slack-webhook $SLACK_WEBHOOK_URL
```

Use this for quick setups or when you don’t need per-model routing.
Channel Routing
Route different alerts to different channels based on model location or metadata.
Per-model routing in the model’s YAML:
```yaml
models:
  - name: mrt__marketing__campaigns
    config:
      meta:
        channel: marketing-data-alerts
```

Path-based routing in dbt_project.yml:
```yaml
models:
  your_project:
    marts:
      marketing:
        +meta:
          channel: marketing-data-alerts
      finance:
        +meta:
          channel: finance-data-alerts
```

Every model under marts/marketing/ now routes to the marketing channel, and marts/finance/ to the finance channel.
Microsoft Teams Integration
Teams integration uses webhooks with Adaptive Cards for formatting.
Note: Microsoft is retiring Office 365 connectors, including Teams Incoming Webhooks, with existing webhook URLs scheduled to stop working at the end of 2025. Migrate to Power Automate Workflows if you haven’t already.
Current webhook setup:
```yaml
teams:
  notification_webhook: https://your-org.webhook.office.com/webhookb2/...
  group_alerts_by: "table"
```

Or via CLI:
```shell
edr monitor --teams-webhook $TEAMS_WEBHOOK_URL
```

Limitations to know:
- User mentions are not fully supported
- Rich formatting options are more limited than Slack
- Webhook deprecation requires migration planning
For organizations standardizing on Teams, consider Power Automate Workflows that trigger on webhook events and provide more control over message formatting.
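If you do build your own Power Automate flow, the message it receives is typically an Adaptive Card wrapped in a message envelope. A minimal sketch of that payload shape follows; the field values are illustrative, not Elementary’s actual output:

```json
{
  "type": "message",
  "attachments": [
    {
      "contentType": "application/vnd.microsoft.card.adaptive",
      "content": {
        "type": "AdaptiveCard",
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "version": "1.4",
        "body": [
          { "type": "TextBlock", "text": "Test failed: unique_customer_id", "weight": "Bolder" },
          { "type": "TextBlock", "text": "Model: mrt__finance__revenue", "wrap": true }
        ]
      }
    }
  ]
}
```

A flow triggered on "When a Teams webhook request is received" can forward this card to a channel, giving you full control over formatting.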
PagerDuty and Incident Management
Elementary Cloud extends alerting beyond Slack and Teams to incident management platforms: PagerDuty, Opsgenie, Jira, Linear, ServiceNow, and email.
Setting Up PagerDuty (Elementary Cloud)
- Navigate to the Environments page in Elementary Cloud
- Click “Connect incident management tool”
- Select PagerDuty
- Authorize Elementary (requires “User” role in PagerDuty)
- Configure alert rules
Alert rules map Elementary test failures to PagerDuty incidents based on:
- Status: fail vs warn
- Tags: critical, high, medium
- Resource types: model, source, test
Example rule: “If status = fail AND tag = critical, create P1 incident in PagerDuty.”
Other Integrations
Elementary Cloud also supports:
- Opsgenie: Similar to PagerDuty setup, good for teams already using Atlassian
- Jira: Create tickets for failures that need tracking
- Linear: Integrates with engineering workflows
- ServiceNow: Enterprise ITSM integration
- Email: Simple notifications without chat platform dependencies
- Webhooks (beta): Custom integrations with any system
For OSS users who need PagerDuty, you can bridge the gap by posting Elementary alerts to a Slack channel that triggers PagerDuty via Slack’s PagerDuty integration.
Alert Routing Strategies
Beyond basic channel routing, you can run multiple edr monitor commands with different filters to create sophisticated routing.
Filtering by Tag
```shell
# Critical alerts to the urgent channel
edr monitor --filters tags:critical --slack-channel-name critical-alerts

# Finance team alerts
edr monitor --filters tags:finance --slack-channel-name finance-data
```

Filtering by Owner
```shell
edr monitor --filters owners:@finance-team --slack-channel-name finance-data
```

Filtering by Status
```shell
# Only failures, no warnings
edr monitor --filters statuses:fail --slack-channel-name failures-only

# Only warnings for review
edr monitor --filters statuses:warn --slack-channel-name warnings-review
```

Combining Filters
Multiple filters work as AND conditions:
```shell
edr monitor \
  --filters resource_types:model \
  --filters tags:finance,marketing \
  --slack-channel-name business-critical
```

This alerts on models tagged with either finance OR marketing: comma-separated values within a single --filters flag are ORed, while separate --filters flags are ANDed.
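As a toy illustration of those semantics (this is not Elementary’s code, just a predicate that behaves the same way for the filters above):

```shell
# Toy predicate mirroring the combined-filter semantics:
# separate --filters are ANDed; comma-separated values in one filter are ORed.
matches() {
  local resource_type="$1"   # e.g. model, source, test
  local tags="$2"            # space-separated tag list
  [ "$resource_type" = "model" ] || return 1   # --filters resource_types:model
  case " $tags " in                             # --filters tags:finance,marketing
    *" finance "* | *" marketing "*) return 0 ;;
  esac
  return 1
}

matches model "finance critical" && echo "alert"     # model AND a matching tag
matches model "ops"              || echo "no alert"  # model, but neither tag
matches source "finance"         || echo "no alert"  # right tag, wrong type
```

Only a resource that passes every flag (with at least one value per flag matching) triggers an alert.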
Automation Pattern
Run multiple monitor commands in your CI/CD pipeline:
```yaml
# In GitHub Actions
- name: Alert on critical failures
  run: edr monitor --filters tags:critical --slack-channel-name critical-alerts

- name: Alert finance team
  run: edr monitor --filters tags:finance --slack-channel-name finance-data

- name: Alert marketing team
  run: edr monitor --filters tags:marketing --slack-channel-name marketing-data
```

Reducing Alert Fatigue
Nothing kills an alerting system faster than noise. When every alert feels like a false positive, teams stop paying attention.
Suppression Intervals
Prevent repeated alerts for the same failing test:
```yaml
meta:
  alert_suppression_interval: 24  # Hours
```

If a test fails at 9am and stays failing, you won’t get another alert until 9am the next day. This is critical for tests that can’t be immediately fixed.
Configure at the test, model, or project level:
```yaml
# Project-wide default in dbt_project.yml
models:
  your_project:
    +meta:
      alert_suppression_interval: 12
```

Alert Grouping
Consolidate multiple failures into single messages:
```shell
edr monitor --group-by table
```

Instead of 10 separate alerts for 10 failed tests on the same table, you get one alert listing all failures. This dramatically reduces noise during cascading failures.
Set a threshold for when grouping kicks in:
```shell
edr monitor --group-alerts-threshold 5
```

Below 5 failures, send individual alerts. Above 5, consolidate.
Customizing Alert Content
Control what fields appear in alerts:
```yaml
meta:
  alert_fields: ["description", "owners", "tags", "subscribers"]
```

Remove fields that add noise without value.
Handling Sensitive Data
Disable sample data in alerts when tables contain PII:
```shell
edr monitor --disable-samples
```

Or configure per-model:
```yaml
models:
  - name: mrt__customers__personal_info
    meta:
      disable_samples: true
```

Elementary Cloud’s Incident Management
Elementary Cloud adds automatic incident grouping: when new failures relate to open incidents, they’re grouped together rather than creating separate tickets. Successful runs automatically resolve incidents, reducing manual cleanup.
On-Call Strategies for Data Teams
Data team on-call differs from traditional software engineering on-call. Data teams often handle support, triage, and development simultaneously. The same person investigating a data quality issue might also be building new pipelines.
Triage Process
Establish a clear categorization:
| Severity | Criteria | Response | Channel |
|---|---|---|---|
| Critical | SLA breach, production outage, revenue impact | Immediate page | PagerDuty |
| Warning | Quality degradation, potential issues | Next business day | Slack |
| Info | Logged for review, no action needed | Weekly review | None |
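One way to wire this triage table into Elementary is to encode severity as dbt tags and run a separate monitor command per tier. A sketch, assuming folder names from a typical dbt project layout:

```yaml
# dbt_project.yml — hypothetical severity tags per folder
models:
  your_project:
    marts:
      +tags: ["critical"]   # SLA-bound models: route to the paging channel
    staging:
      +tags: ["warning"]    # quality checks: next-business-day channel
```

Paired with `edr monitor --filters tags:critical --slack-channel-name critical-alerts` (and a Slack channel wired to PagerDuty), this maps the Critical row of the table to an actual page, while warning-tagged failures land in a channel reviewed during business hours.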
Runbooks in Test Metadata
Link troubleshooting documentation directly in your test definitions:
```yaml
data_tests:
  - unique:
      column_name: customer_id
      config:
        meta:
          description: |
            Duplicate customer IDs detected.
            Runbook: https://docs.company.com/data/customer-dedup
            Contact: @data-platform-team
```

When this test fails, the alert includes the runbook link. New team members can resolve issues without asking where to find documentation.
Metrics to Track
Measure your alerting system’s health:
| Metric | What it tells you |
|---|---|
| Alert volume | Is the system too noisy? |
| False positive rate | Are alerts actionable? |
| Time to acknowledge (MTTA) | How quickly do people respond? |
| Time to resolution (MTTR) | How long do issues stay open? |
A high false positive rate erodes trust. A high MTTR might indicate missing runbooks or unclear ownership.
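Alert volume is the easiest of these to measure from data you already have. A rough sketch against Elementary’s results table — the table and column names here (elementary_test_results, detected_at, status) match recent Elementary versions but may differ in yours:

```sql
-- Failed or warned tests per day: a proxy for alert volume
select
    date_trunc('day', detected_at) as day,
    count(*)                       as alert_candidates
from elementary_test_results
where status in ('fail', 'warn')
group by 1
order by 1;
```

MTTA and MTTR usually come from your incident tool (PagerDuty reports both) rather than from the warehouse.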
Rotation Considerations
Some patterns that work for data teams:
- Pair on-call with development sprints: The on-call person handles incidents AND works on improvements that reduce future incidents
- Weekly rotation with handoff document: Document open issues, recurring problems, and context for the next person
- Tiered response: Junior engineers handle initial triage, escalate complex issues to senior engineers
What’s Next
You now have alerting configured to notify the right people about the right problems at the right time. The final piece is deciding how much to build yourself versus buying a dedicated observability platform. The next article in this series covers the build vs buy decision for data quality tooling.