Monitoring alerts that include business context — “duplicates mean double-counted revenue across all sales dashboards” rather than “unique test failed with 3 rows” — require the agent to have access to dbt project documentation. OpenClaw’s persistent memory stores that context as Markdown files that survive across sessions. Where a skill file gives an agent a standing operating procedure, persistent memory gives it a knowledge base.
What to Store in Memory
Three categories of information belong in persistent memory for dbt monitoring:
Model and column documentation — the descriptions from your schema.yml files, translated into the agent’s context. Not the raw YAML, but a readable format the agent can search and reference.
Downstream dependency information — a compact summary of the model dependency graph derived from manifest.json. Which models depend on which, specifically which mart models are downstream of each source or base model.
Investigation history — a running log of failure events you’ve triaged, including your conclusions. This is what lets the agent tell you “you investigated this on Tuesday and decided it was a vendor delay” instead of treating every recurring failure as a novel event.
Storing Model Documentation
Extract your schema.yml model and column descriptions and store them as a Markdown file in OpenClaw’s memory. The format matters — you want the agent to be able to retrieve relevant sections quickly, not parse raw YAML:
```markdown
# dbt project documentation

## mrt__sales__customers

Deduplicated customer dimension for all sales reporting.
Used by: revenue dashboards, cohort analysis, customer segmentation.
Owner: data team
Criticality: high

### Columns

- customer__id: Primary key. Deduplicated customer identifier from Shopify. Uniqueness is critical — duplicates mean double-counted revenue.
- customer__email: Contact email. Used for CRM matching.
- customer__first_ordered_at: Timestamp of first purchase. Drives cohort assignment.

## mrt__finance__orders

Order-level mart for all revenue reporting.
Used by: CFO dashboard, monthly revenue reports, Salesforce sync.
Owner: finance team
Criticality: high

### Columns

- order__id: Primary key. Source: Shopify order ID.
- order__customer_id: Foreign key to mrt__sales__customers. Not nullable.
- order__revenue_usd: Gross revenue in USD. Includes tax, excludes refunds.
```

With this loaded, the agent transforms raw failure output into business-context alerts. When unique_mrt__sales__customers_customer__id fails, the agent looks up mrt__sales__customers, reads that “uniqueness is critical — duplicates mean double-counted revenue,” and includes that in its report. You get the business implication without doing the lookup yourself.
This doesn’t require custom code. Load the file using OpenClaw’s memory command and instruct the skill to consult it when reporting failures:
```markdown
## Cross-referencing documentation

When a test fails on a model:

1. Look up the model in your memory document "dbt-project-docs.md"
2. Include the model description in the failure report
3. Include the relevant column description
4. Note which downstream reports or systems are affected
5. Include the model owner for escalation routing
```

The limitation is maintenance. There’s no automated sync between your schema.yml and the memory file. When models change, someone needs to update the memory document. This is the most significant operational overhead of this approach — plan for it as part of your model change process, not as an afterthought.
Downstream Impact from manifest.json
dbt’s manifest.json contains the complete project dependency graph. Every model knows its parents and children. The agent can parse this to answer “which models depend on the model that just failed?” — information that immediately shapes severity and response urgency.
The challenge with large projects is size. A manifest for a project with 300+ models can be several megabytes of JSON, which is impractical to feed into a language model’s context window. The solution is pre-processing: extract the dependency graph into a compact format and store that in persistent memory instead.
A compact dependency summary looks like this:
```markdown
# dbt dependency summary

## Mart-level models and their key sources

mrt__sales__customers
  Direct parents: int__customers_deduplicated
  Key ancestors: base__shopify__customers, base__shopify__orders
  Downstream of this mart: (none — this is a leaf node)

mrt__finance__orders
  Direct parents: int__orders_enriched
  Key ancestors: base__shopify__orders, base__shopify__refunds, stg__exchange_rates
  Downstream of this mart: (none)

int__orders_enriched
  Direct parents: base__shopify__orders, stg__exchange_rates, int__customers_deduplicated
  Downstream marts: mrt__finance__orders, mrt__marketing__attribution
  Downstream mart count: 2

base__shopify__customers
  Downstream intermediate count: 3
  Downstream mart count: 4
  High-criticality dependents: mrt__sales__customers, mrt__finance__orders
```

Generating this summary can be scripted — iterate through manifest.json nodes, extract the parent/child relationships, and write the formatted output. Once it’s in memory, the agent can look up any failing model and immediately know its downstream footprint without re-parsing the manifest at monitoring time.
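One way to sketch that extraction, assuming the manifest has already been loaded with json.load: the child_map key is part of dbt’s documented manifest schema, while the mrt__ prefix and the downstream_marts name are this example’s assumptions, not dbt conventions.

```python
from collections import deque

# Sketch: list the mart-layer descendants of every model using the
# manifest's child_map (node IDs look like "model.<project>.<name>").
# The mrt__ prefix is this project's naming convention, not dbt's.
def downstream_marts(manifest: dict, mart_prefix: str = "mrt__") -> dict[str, list[str]]:
    child_map = manifest["child_map"]
    summary = {}
    for node_id in child_map:
        if not node_id.startswith("model."):
            continue
        # Breadth-first walk over all transitive children of this node.
        seen, queue = set(), deque(child_map.get(node_id, []))
        while queue:
            child = queue.popleft()
            if child not in seen:
                seen.add(child)
                queue.extend(child_map.get(child, []))
        name = node_id.split(".")[-1]
        summary[name] = sorted(
            c.split(".")[-1]
            for c in seen
            if c.startswith("model.") and c.split(".")[-1].startswith(mart_prefix)
        )
    return summary
```

Turning the resulting dict into the Markdown summary above is then a few string joins, and the whole script can run whenever the DAG changes.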
Instruct the skill to use it:
```markdown
## Downstream impact lookup

When a test fails on a model:

1. Look up the model in "dbt-dependency-summary.md" in memory
2. Find how many downstream mart-layer models depend on it
3. Include in the report: "X downstream marts affected, including [names of critical ones]"
4. If 3 or more mart models are downstream, flag as high downstream impact
```

Historical Failure Tracking
The most powerful use of persistent memory for monitoring is tracking failure history. A plain cron job that runs dbt test has no memory. It sees each failure as a new event and reports it identically whether it’s happening for the first time or the fourteenth.
Persistent memory changes this. After each monitoring run, the agent updates a structured history file:
```markdown
# dbt failure history

## unique_mrt__sales__customers_customer__id

Last failed: 2026-03-25
First failed: 2026-03-25
Total occurrences last 30 days: 1
Status: active
Notes: First occurrence. Investigating deduplication logic in base__shopify__customers.

## not_null_int__orders_enriched_order__customer_id

Last failed: 2026-03-27
First failed: 2026-03-14
Total occurrences last 30 days: 6
Status: recurring — intermittent
Notes: 2026-03-18 investigated. Likely related to Shopify source freshness delays. Failures correlate with late source loads. Watch but don't escalate immediately.

## source_freshness_raw_ga4_events

Last failed: 2026-03-22
First failed: 2026-02-15
Total occurrences last 30 days: 4
Status: recurring — known
Notes: Expected on weekends (vendor batch timing). No action needed for Sat/Sun failures. Escalate only if fails on a weekday.
```

The skill instructs the agent to consult this file when reporting and update it after each run:
```markdown
## Historical context

Before reporting a failure:

1. Check "dbt-failure-history.md" in memory for prior occurrences of this test
2. If found: include occurrence count and any notes from previous investigations
3. If not found: mark as "First occurrence" — highest urgency

After reporting all failures:

1. Update "dbt-failure-history.md" for each failure encountered today
2. Increment occurrence count
3. Update "Last failed" date
4. Preserve existing notes; add new notes if investigation reveals new information
```

Each run adds to the institutional knowledge the agent carries forward. Over time, the morning summary distinguishes new failures from recurring patterns that have already been categorized.
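The update steps are mechanical enough to express as code. Below is a sketch of that bookkeeping, assuming the history file format shown earlier; record_failure is a hypothetical name, and in practice the agent would perform these edits through its memory tools rather than a standalone script.

```python
import re

# Sketch: apply one failure to the "dbt-failure-history.md" text.
# `today` is an ISO date string, e.g. date.today().isoformat().
def record_failure(history: str, test_name: str, today: str) -> str:
    header = f"## {test_name}"
    if header not in history:
        # First occurrence: append a fresh section with starting values.
        return history.rstrip() + (
            f"\n\n{header}\nLast failed: {today}\nFirst failed: {today}\n"
            "Total occurrences last 30 days: 1\nStatus: active\n"
            "Notes: First occurrence.\n"
        )
    # Recurring: bump the date and counter inside that section only,
    # preserving First failed, Status, and any existing notes.
    start = history.index(header)
    end = history.find("\n## ", start + 1)
    end = len(history) if end == -1 else end
    section = history[start:end]
    section = re.sub(r"Last failed: .*", f"Last failed: {today}", section)
    section = re.sub(
        r"Total occurrences last 30 days: (\d+)",
        lambda m: f"Total occurrences last 30 days: {int(m.group(1)) + 1}",
        section,
    )
    return history[:start] + section + history[end:]
```

A rolling 30-day window would additionally need per-occurrence dates rather than a single counter; the sketch keeps the simpler running count from the example file.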
The Maintenance Reality
Be realistic about what this setup requires to stay useful. Memory documents that go stale are worse than no memory at all — an agent that confidently reports wrong business context causes more confusion than one that reports none.
The minimum maintenance commitment:
- Model documentation: update when models are added, renamed, or significantly changed
- Dependency summary: regenerate when the project DAG changes significantly (monthly is usually sufficient for stable projects)
- Failure history: the agent maintains this itself after each run, but review it monthly to archive closed issues and keep the file scannable
The documentation and dependency summary are the higher-maintenance items. If your dbt project changes frequently, automate the regeneration rather than relying on manual updates. A simple script that reads schema.yml files and manifest.json and regenerates the memory documents can run as part of your CI/CD pipeline on every PR merge.
For the full picture of how persistent memory feeds into the morning summary pattern, see dbt Quality Morning Summary Pattern.