Articles
-
Data Observability: Build vs. Buy in 2026
A practical framework for choosing data observability tools. When native dbt tests are enough, when to use OSS like Elementary, and when paid tools make sense.
dbt bigquery snowflake databricks -
Alerting on Data Quality Issues with Elementary
Set up data quality alerts with Elementary. Configure Slack, Teams, and PagerDuty notifications, route alerts by team, and reduce noise with smart suppression.
dbt elementary data quality testing -
Building Data Quality Dashboards with Elementary
Turn Elementary test results into actionable dashboards. Learn to generate and host HTML reports, and build custom data quality KPIs in your BI tools.
dbt elementary data quality testing -
Elementary for dbt: Complete Setup Guide
Install and configure Elementary for dbt-native data observability. Step-by-step BigQuery setup, CLI installation, platform configs, and troubleshooting.
dbt bigquery snowflake databricks -
Data Quality in 2026: Beyond Basic dbt Testing
Move past unique and not_null tests. Learn advanced data quality patterns with dbt-expectations, Elementary, data contracts, and ML-based anomaly detection.
dbt elementary data quality testing -
OpenClaw vs Claude Code for dbt Development: the Right Tool at the Right Time
OpenClaw and Claude Code serve different layers of a dbt workflow. Here's when to reach for each, where they overlap, and how to combine them effectively.
claude code dbt ai automation -
OpenClaw Security Risks: What Data Teams Need to Know Before Getting Started
OpenClaw has drawn security warnings from CrowdStrike, the Dutch DPA, and Gartner. Data teams handling client data need to understand the risks before starting.
claude code ai automation data engineering -
OpenClaw for Data People
OpenClaw is a 24/7 autonomous AI agent that monitors pipelines, runs dbt tests, and messages you results. Here's what it means for analytics engineers.
claude code ai automation data engineering -
GCP authentication when you have 10 clients and an AI agent
Isolate GCP credentials across client projects when using AI coding agents. A practical guide to CLOUDSDK_CONFIG, direnv, and service accounts.
gcp claude code mcp dbt -
Google Workspace finally gets a CLI, and it was built for AI agents
The gws CLI gives programmatic access to every Google Workspace API through a single binary, with built-in MCP server and agent-first design principles.
gcp mcp claude code automation -
GCP IAM for Data Teams: Least Privilege Done Right
Audit and tighten GCP IAM for data platforms. Per-workload service accounts, policy tags, row-level security, and patterns to eliminate security shortcuts.
gcp bigquery data engineering cost optimization -
BigQuery and Cloud Storage: Data Lake Patterns for 2026
A decision framework for native BigQuery tables, BigLake external tables, and BigLake Iceberg tables, covering cost, performance, and catalog infrastructure trade-offs.
bigquery gcp data engineering data modeling -
Deploying dbt Core on Cloud Run Jobs: Complete Setup Guide
Deploy dbt Core on Cloud Run Jobs with Workload Identity, Cloud Scheduler, and Eventarc triggers. Complete guide from Dockerfile to production-ready setup.
dbt gcp bigquery data engineering -
Cloud Run Jobs vs. Cloud Composer for dbt: A Cost-Conscious Decision Framework
Compare Cloud Run Jobs and Cloud Composer for dbt orchestration on GCP. Learn when each makes sense based on cost, complexity, and operational requirements.
dbt gcp bigquery data engineering -
GCP Data Platform Architecture: Strategic Patterns for 2026
Decision frameworks for GCP data platforms in 2026: open lakehouse with BigLake Iceberg, Cloud Run Jobs for dbt orchestration, and security patterns.
gcp bigquery dbt data engineering -
Comparing Attribution Models: A dbt + Looker Studio Dashboard
Build a multi-model attribution dashboard with dbt and Looker Studio. Run first-touch, last-touch, and other models in parallel to interpret disagreements.
dbt bigquery ga4 data modeling -
Data-Driven Attribution: Building Markov Chains and Shapley Values
Move beyond heuristic attribution with data-driven methods. Learn how Markov chains and Shapley values measure channel contribution through removal effects.
bigquery dbt data modeling analytics -
Position-Based and Time-Decay Attribution in BigQuery
Implement weighted attribution models in SQL. Build U-shaped position-based and exponential time-decay attribution with BigQuery and dbt code examples.
bigquery dbt data modeling analytics -
First-Touch, Last-Touch, and Linear Attribution in SQL
Build marketing attribution models in BigQuery with practical SQL patterns. Covers first-touch, last-touch, and linear attribution with dbt implementation.
bigquery dbt data modeling analytics -
Marketing Attribution in the Warehouse: A 2026 Guide
Build marketing attribution in your data warehouse with SQL. Covers heuristic models, Markov chains, Shapley values, identity resolution, and dbt patterns.
bigquery dbt ga4 analytics -
Building a Custom Materialization in dbt
Tutorial on building custom dbt materializations in BigQuery: zero-downtime table swaps with row-count validation and automated row-level security policies.
dbt bigquery data engineering data modeling -
Cross-Database Macros: Writing Once, Running Anywhere
Write dbt macros that work across BigQuery, Snowflake, and Databricks using the dispatch pattern. Covers key syntax differences and built-in functions.
dbt bigquery snowflake databricks -
Writing Reusable Macros: DRY Principles for dbt
Learn when to create dbt macros and when to leave code inline. Practical patterns for naming, folder organization, testing, and avoiding premature abstraction.
dbt data modeling data engineering testing -
10 Macros That Should Be in Every dbt Project
The essential dbt macros from packages and custom patterns that handle 80% of repeated SQL work. Practical examples for surrogate keys, date spines, and more.
dbt data modeling data engineering -
dbt Macros: From Jinja Basics to Production Patterns
Master dbt macros from Jinja fundamentals to production patterns. Covers dbt-utils, dbt-expectations, and design principles for maintainable macro code.
dbt data modeling data engineering -
Loading Google Ads Data to BigQuery: Four Approaches Compared
Compare BigQuery Data Transfer Service, Fivetran, Google Ads Scripts, and dlt for loading Google Ads data. Includes costs, limitations, and decision framework.
bigquery google ads data engineering etl -
When to Build vs. Buy Your Data Pipelines in 2026
Fivetran's pricing changes, AI-assisted development, and mature open-source tools like dlt have shifted the build-vs-buy calculation for data pipeline teams.
dlt bigquery data engineering etl -
Fivetran vs. Airbyte vs. dlt: The 2026 Comparison
Comparing Fivetran, Airbyte, and dlt for data ingestion in 2026. Pricing models, connector quality, self-hosting trade-offs, and a practical decision framework.
dlt bigquery data engineering etl -
Building Custom API Pipelines with dlt: From REST to BigQuery
Build production-ready API pipelines using dlt's RESTClient and REST API Source, with pagination, authentication, and incremental loading to BigQuery.
dlt bigquery data engineering etl -
dlt: The Python-Native Data Loader That Changes the Build vs Buy Equation
dlt fills the gap between expensive managed ELT and building from scratch. Learn when this Python-first data loading library is the right choice for your team.
dlt bigquery data engineering etl -
Defining Metrics in dbt: Best Practices and Patterns
Learn how to define metrics in dbt with MetricFlow. Covers all five metric types, naming conventions, organizational patterns, and common pitfalls to avoid.
dbt data modeling analytics -
Getting Started with the dbt Semantic Layer and MetricFlow
Set up the dbt Semantic Layer with MetricFlow: install the package, define semantic models and metrics in YAML, and query from the CLI in Core or Cloud.
dbt data modeling analytics -
The Semantic Layer Revolution: Why 2026 Is the Year
Semantic layers are becoming essential for AI-ready analytics. What's driving adoption, the three competing architectures, and whether you should invest.
dbt snowflake databricks data modeling -
Microbatch Incremental Strategy in dbt 1.9: A Practical Guide
Learn how dbt's microbatch strategy simplifies time-partitioned incremental models with built-in backfills, automatic filtering, and batch-level retries.
dbt incremental processing data engineering -
Late-Arriving Data in dbt: Patterns That Actually Work
Practical patterns for handling late-arriving data in dbt incremental models, from lookback windows and partition strategies to deduplication techniques.
dbt bigquery snowflake databricks -
Merge vs. Delete+Insert vs. Insert_Overwrite: Choosing the Right dbt Strategy
Compare dbt's incremental strategies across BigQuery, Snowflake, and Databricks. Learn when merge becomes a bottleneck and which alternatives cut costs.
dbt bigquery snowflake databricks -
Incremental Models in dbt: The Complete Guide
Complete reference for dbt incremental models covering all strategies, warehouse-specific behaviors, late-arriving data patterns, and decision frameworks.
dbt bigquery snowflake databricks -
MCP Ecosystem Overview: Servers, Clients, and SDKs
A practical guide to MCP servers, clients, and SDKs for data engineers. Which database servers to use and which clients work best.
mcp bigquery snowflake data engineering -
MCP Apps: Interactive Visualizations in Claude
Build interactive dashboards and charts that render directly in Claude conversations. A guide to MCP Apps for data teams.
mcp claude code data engineering ai -
Building Custom MCP Servers for Data Engineering
Build custom MCP servers with Python or TypeScript. Practical examples for data catalogs, pipeline monitoring, and data quality tools with full code.
mcp data engineering data quality -
BigQuery MCP Server: Complete Setup Guide
Connect BigQuery to AI assistants with Google's official MCP options. Compare Remote Server vs self-hosted Toolbox with setup steps.
mcp bigquery gcp claude code -
dbt MCP Server: Complete Setup Guide
Connect dbt to Claude Desktop or Claude Code via MCP. Query models, metrics, lineage, and run dbt commands through conversation.
mcp dbt claude code ai -
MCP Protocol Fundamentals: What Data Engineers Need to Know
Learn MCP architecture, core primitives, and the security model. Essential foundation for connecting AI assistants to your data infrastructure.
mcp data engineering ai -
Understanding the terminal: A complete guide for Claude Code beginners
Learn essential terminal commands and how Claude Code uses them. A practical guide for beginners who want to understand what's happening behind the scenes.
claude code ai -
GA4 + dbt: A Production-Ready Project Template
A complete dbt project template for GA4 BigQuery exports with incremental processing, sessionized event tables, testing patterns, and documentation.
dbt bigquery ga4 data modeling -
GA4 User Stitching: Handling Anonymous to Known Users
Build identity resolution pipelines in BigQuery to stitch anonymous GA4 users to authenticated identities across devices and sessions, with production dbt patterns.
bigquery ga4 dbt analytics -
Building Session Tables from GA4 Event Data
Learn to sessionize GA4 BigQuery exports by building enriched event tables with session identity, attribution, and sequence position using window functions.
bigquery dbt ga4 data modeling -
Unnesting GA4 Events: Patterns for Every Use Case
Production-ready SQL patterns for extracting GA4 data from nested arrays in BigQuery, covering e-commerce, engagement events, and dbt model templates.
bigquery dbt ga4 data modeling -
GA4 BigQuery Export: The Complete Schema Reference
A practical field guide to GA4's BigQuery export schema covering nested structures, traffic source fields, critical gotchas, and efficient query patterns.
bigquery ga4 data engineering analytics -
10 BigQuery SQL Patterns Every Analytics Engineer Should Know
Production-ready BigQuery patterns for partitioning, materialized views, HLL sketches, nested data, window functions, dbt incrementals, and attribution.
bigquery dbt data modeling data engineering -
On-Demand vs. Editions Pricing: When to Switch
A practical guide to choosing between BigQuery On-Demand and Editions pricing models, with SQL queries to analyze your workload and calculate breakeven points.
bigquery cost optimization -
BigQuery Slots and Reservations Explained
Learn how BigQuery slots work, understand the reservation hierarchy, compare Editions pricing tiers, and optimize slot usage for your dbt workflows.
bigquery dbt cost optimization -
Partitioning vs. Clustering: The Decision Framework
A practical decision framework for choosing between BigQuery partitioning and clustering based on query patterns, table sizes, and dbt incremental strategies.
bigquery dbt data engineering data modeling -
BigQuery Architecture for Analytics Engineers: The Complete Guide
Learn BigQuery's resource hierarchy, regional constraints, multi-environment patterns, and IAM configuration to design scalable analytics systems.
bigquery dbt data engineering data modeling -
dbt-expectations: The Package Every Project Needs
Learn how dbt-expectations adds 50+ data quality tests to your dbt project: pattern matching, freshness checks, statistical validation, and more.
dbt data quality testing -
Unit Testing vs. Data Testing: When to Use Each
A decision framework for choosing between dbt unit tests, data tests, dbt-expectations, Elementary, and dbt-audit-helper based on what you're testing.
dbt bigquery testing -
Unit Testing dbt Models: Real-World Examples and Patterns
Copy-paste patterns for unit testing incremental models, snapshots, window functions, GA4 sessionization, and attribution models in dbt with BigQuery.
dbt bigquery testing -
Unit Testing in dbt 1.8+: Complete Implementation Guide
Learn to implement dbt unit tests from scratch. Covers YAML syntax, BigQuery-specific workarounds for STRUCTs and ARRAYs, mocking dependencies, and CI/CD integration.
dbt bigquery testing -
dbt Testing Strategy: A Framework for Every Project
A practical framework for dbt testing that scales from first projects to enterprise pipelines, covering data tests, unit tests, contracts, and packages.
dbt data engineering data quality -
BigQuery Cost Optimization: The 80/20 Guide
Cut BigQuery costs by focusing on partitioning, clustering, and column selection. Includes dbt configs, INFORMATION_SCHEMA queries, and governance guardrails.
bigquery dbt data engineering cost optimization -
Your First GA4 dbt Models: From Raw Events to Sessions
Build event-level GA4 dbt models that preserve granularity. Learn the session key trap, nested event_params extraction, and a three-layer pattern for flexible analytics.
dbt bigquery ga4 data engineering -
Base, Intermediate, Marts: When to Use Each Layer
Learn when to use base, intermediate, and mart layers in dbt. Clear rules for where joins, business logic, and aggregations belong in your transformation project.
dbt data engineering data modeling -
dbt Project Structure: The Definitive Guide
A complete guide to dbt project structure: three-layer architecture, entity naming, table materialization, and marketing analytics examples.
dbt data modeling -
Connecting Claude Code to Your Data Warehouse (And Why You Might Not Need MCP)
Cloudflare and Anthropic discovered LLMs write better code than tool calls. For BigQuery users, that means CLI might beat MCP. Here's the evidence for it.
bigquery claude code gcp ai -
Advanced Claude Code Workflows: Testing, Documentation, and Debugging for Analytics Engineers
Three production-ready workflows for testing, documentation, and debugging that turn Claude Code into an integrated part of your analytics engineering process.
dbt bigquery claude code data engineering -
Automating the Boring Parts: Hooks and Custom Commands for Analytics Engineers
Learn how to use Claude Code hooks and custom slash commands to automate dbt workflows — from auto-formatting SQL to blocking dangerous production commands.
claude code dbt automation -
Claude Code - Skills vs. Commands
Skills activate automatically only 20% of the time. For repeatable data workflows like dbt audits and lineage docs, commands give you the consistency you need.
claude code dbt ai -
How to set up CLAUDE.md for your dbt project (and actually make it useful)
Learn how to configure CLAUDE.md for dbt projects. Practical tips for naming conventions, BigQuery gotchas, and keeping it minimal.
claude code dbt ai -
How I Use Claude Code for dbt Development
Practical guide to using Claude Code with dbt. From base model generation to refactoring: what actually works in day-to-day analytics engineering.
claude code dbt data engineering ai -
Claude Code for Data People: What It Is and Why You Should Care
70% of analytics engineers use AI for coding. Meet Claude Code, the agentic tool that reads your dbt codebase and writes code based on your patterns.
claude code data engineering ai -
Your First Hour with Claude Code as an Analytics Engineer
Install and master Claude Code in one hour. Practical guide for analytics engineers: setup, authentication, first dbt models, and essential tips to get started.
claude code data engineering ai -
n8n RSS to Notion
Transform your RSS feeds into an automated knowledge base with n8n and ChatGPT. No more manual cleanup: fetch and organize articles in Notion effortlessly.
automation ai -
Deploying dbt Core on Google Cloud Function
Let's look at how to deploy dbt Core on a Google Cloud Function.
dbt bigquery data engineering automation -
How to Pass the dbt Certification
Here is my experience passing the dbt developer certification.
dbt data engineering -
dbt Core vs dbt Cloud
Let's look at the differences between dbt Core and dbt Cloud.
dbt data engineering -
Loading Data Made Simple: A Hands-on Guide to dlt
Learn how to build data pipelines with dlt (data load tool). From basic API extraction to incremental loading: a practical tutorial using the GitHub API.
dlt data engineering etl