
Meta Ads Pipeline Maintenance

Operational practices for keeping a Meta Ads pipeline running — token expiry monitoring, spend reconciliation, API version lifecycle management, and circuit breaker patterns.


Meta’s API changes frequently enough that pipeline maintenance is a recurring cost. Four practices keep pipelines running reliably: token expiry tracking, spend reconciliation, API version lifecycle management, and a circuit breaker that stops extraction after consecutive failures rather than exhausting rate limits on a broken endpoint.

Token Expiry Monitoring

The most common silent failure in Meta pipelines is token expiration. When a long-lived user access token expires, the API starts returning 401 errors. Unless you’re monitoring for auth errors specifically, the pipeline stops and nothing alerts you. By the time someone notices (usually when a dashboard goes stale), you’ve lost days of data.

Use System User Access Tokens. Unlike user tokens that expire after ~60 days, System User tokens don’t expire. Set them up once in Business Settings under System Users, and they remain valid until you revoke them. This eliminates the category of silent expiration failures entirely for most pipelines.

If you’re stuck with user tokens (common when using BigQuery Data Transfer Service for Meta, which requires a user token), build expiry monitoring:

import datetime
from typing import Optional

from facebook_business.api import FacebookAdsApi


def check_token_expiry(access_token: str) -> Optional[datetime.datetime]:
    """Return the token's expiry, or None for a non-expiring token."""
    api = FacebookAdsApi.init(access_token=access_token)
    # The token debug endpoint reports expiry metadata
    debug = api.call(
        'GET',
        '/debug_token',
        {'input_token': access_token, 'access_token': access_token},
    ).json()
    expires_at = debug['data'].get('expires_at', 0)
    return datetime.datetime.fromtimestamp(expires_at) if expires_at else None

Alert one week before the 60-day mark. One week gives you enough time to refresh without an emergency. A day’s notice does not.
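The alerting logic itself is small enough to sketch. A minimal version, assuming a seven-day window and treating a `None` expiry as a non-expiring token (e.g. a System User token); how the alert actually fires (Slack, PagerDuty, email) is left out:

```python
import datetime
from typing import Optional

# Alert when the token expires within this many days
ALERT_WINDOW_DAYS = 7


def should_alert(expires_at: Optional[datetime.datetime],
                 now: Optional[datetime.datetime] = None) -> bool:
    """Decide whether to fire a token-expiry alert.

    expires_at=None means a non-expiring token, so there is
    nothing to alert on.
    """
    if expires_at is None:
        return False
    now = now or datetime.datetime.now()
    days_left = (expires_at - now).days
    return days_left <= ALERT_WINDOW_DAYS
```

Run this daily against the result of the expiry check; a scheduled job that only fires within the final week is quiet the rest of the time.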

Spend Reconciliation

The second operational practice that pays for itself quickly: weekly comparison of warehouse spend totals against Ads Manager.

Pull your warehouse total for the trailing 7 days:

SELECT
  SUM(spend) AS warehouse_spend_7d,
  CURRENT_DATE() AS report_date
FROM `your_project.meta_ads.ad_insights_daily`
WHERE date_start >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)

Compare this against the Ads Manager interface (or the API) for the same period. A variance under 5% is normal — timezone differences, attribution window rounding, and modeling uncertainty all contribute. A variance over 10% on a consistent basis means something is wrong with your pipeline.
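The thresholds translate directly into a check you can run from a script. A minimal sketch, using the 5% and 10% bands described above (the function names are illustrative, not part of any library):

```python
def spend_variance_pct(warehouse_spend: float, reference_spend: float) -> float:
    """Absolute variance of warehouse spend vs. the Ads Manager reference,
    as a fraction of the reference (0.05 == 5%)."""
    if reference_spend == 0:
        return 0.0 if warehouse_spend == 0 else float('inf')
    return abs(warehouse_spend - reference_spend) / reference_spend


def classify_variance(variance: float) -> str:
    """Bucket the variance: <5% is normal, 5-10% is worth a look,
    >10% means investigate the pipeline."""
    if variance < 0.05:
        return 'normal'
    if variance <= 0.10:
        return 'review'
    return 'investigate'
```

A weekly job computes the two totals, classifies the variance, and alerts on anything other than `'normal'`.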

Common causes of large variances:

  • Lookback window too short: you’re not re-pulling data that Meta has retroactively updated
  • Attribution window mismatch: your API call and Ads Manager use different windows
  • Missing ad accounts: your pipeline doesn’t have access to all accounts contributing to Ads Manager totals
  • Currency normalization bug: spend in non-primary currencies isn’t being converted

Automate this check. Run it as a dbt test, a scheduled query, or a simple script with an alert. The value is in the frequency and consistency — catching variance early means fixing a small problem before it becomes a large data quality incident.

-- dbt singular test: assert spend reconciliation within 10%
-- tests/assert_meta_spend_reconciles.sql
WITH warehouse_spend AS (
  SELECT SUM(spend) AS total_spend
  FROM {{ ref('mrt__marketing__meta__campaign_performance') }}
  WHERE date_day >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
),

-- Compare against a manually maintained reference or API check.
-- This pattern flags for manual review when the warehouse looks off.
-- Adjust the threshold based on acceptable variance for your setup.
meta_reference AS (
  SELECT SUM(spend) AS total_spend
  FROM {{ source('meta_ads', 'spend_reference') }}
  WHERE date_day >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
)

SELECT
  warehouse_spend.total_spend AS warehouse_total,
  meta_reference.total_spend AS reference_total,
  ABS(warehouse_spend.total_spend - meta_reference.total_spend) /
    NULLIF(meta_reference.total_spend, 0) AS variance_pct
FROM warehouse_spend
CROSS JOIN meta_reference
WHERE ABS(warehouse_spend.total_spend - meta_reference.total_spend) /
  NULLIF(meta_reference.total_spend, 0) > 0.10

API Version Lifecycle Management

Meta releases roughly two major API versions per year, each supported for approximately two years. That means you need to upgrade your pipeline’s API version at least once a year to stay on a supported version.

The process isn’t just changing a version number in a URL. New API versions sometimes rename fields, change data types, or restructure responses. The canonical place to find these changes is Meta’s API changelog.

Build a version tracking discipline:

  1. Note the current version when you deploy your pipeline
  2. Subscribe to Meta’s developer changelog announcements
  3. Plan upgrades at least six months before the version’s end-of-life date
  4. Test on the new version in a staging environment before cutting over

If you’re using a managed extraction tool (Fivetran, Airbyte), they handle API version upgrades for you — this is a large part of what you’re paying for. If you’re running custom extraction, version management is your responsibility.

The failure mode to avoid: discovering a version is deprecated because your API calls start returning errors in production. At that point you’re doing an emergency migration under pressure. The same work done proactively, six months earlier, is a scheduled maintenance task.
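The "six months before EOL" rule is easy to automate once you record each version's end-of-life date. A sketch, assuming a small hand-maintained mapping (the version key and date below are placeholders, not real EOL dates — take the actual dates from Meta's changelog):

```python
import datetime

# Hypothetical mapping of API versions to their published EOL dates.
# Populate from Meta's changelog; the date here is a placeholder.
VERSION_EOL = {
    'v19.0': datetime.date(2026, 2, 4),
}


def months_to_eol(version: str, today: datetime.date) -> float:
    """Approximate months remaining before the given version's EOL."""
    eol = VERSION_EOL[version]
    return (eol - today).days / 30.44  # average days per month


def upgrade_due(version: str, today: datetime.date,
                threshold_months: float = 6.0) -> bool:
    """True when the upgrade should already be on the schedule."""
    return months_to_eol(version, today) <= threshold_months
```

A monthly scheduled check against this mapping turns the emergency migration into a calendar entry.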

Circuit Breaker Pattern

A pipeline that hits errors should stop, not retry forever. Every failed API call burns rate limit budget and delays the actual problem diagnosis.

The circuit breaker pattern: track consecutive failures and pause extraction after a threshold is hit.

import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Stopped after failures
    HALF_OPEN = "half_open"  # Testing recovery


class MetaApiCircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=300):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED

    def call(self, fn, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if time.time() - self.last_failure_time > self.recovery_timeout:
                self.state = CircuitState.HALF_OPEN
            else:
                raise Exception("Circuit breaker open — pausing extraction")
        try:
            result = fn(*args, **kwargs)
            self._on_success()
            return result
        except Exception:
            self._on_failure()
            raise

    def _on_success(self):
        self.failure_count = 0
        self.state = CircuitState.CLOSED

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

After 5 consecutive failures, the circuit opens and extraction pauses for 5 minutes before trying again. This prevents a broken endpoint from exhausting your rate limit budget while also giving the service time to recover.

The circuit breaker is most valuable in combination with proper alerting. When the circuit opens, fire an alert so a human knows something needs attention. Don’t silently pause — the whole point is to surface problems, not bury them.
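One way to pair the two is a callback that fires exactly once on the closed-to-open transition. A stripped-down standalone sketch (the `on_open` callback is a stand-in for whatever notification channel you use):

```python
class AlertingCircuitBreaker:
    """Minimal breaker that invokes a callback once when it trips open."""

    def __init__(self, on_open, failure_threshold=5):
        self.on_open = on_open            # e.g. post to Slack / PagerDuty
        self.failure_threshold = failure_threshold
        self.failure_count = 0
        self.is_open = False

    def record_failure(self):
        self.failure_count += 1
        if self.failure_count >= self.failure_threshold and not self.is_open:
            self.is_open = True
            self.on_open(self.failure_count)  # fires only on the transition

    def record_success(self):
        self.failure_count = 0
        self.is_open = False
```

Firing only on the transition matters: an alert per failed call during an outage is noise, one alert per circuit-open event is a signal.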

Raw Data Storage

Store raw API responses before transformation. This sounds expensive but isn’t — raw JSON compressed in BigQuery storage is cheap. The value is operational: when something goes wrong in your transformation logic, or when Meta changes a field you hadn’t anticipated, you can debug without re-pulling from the API.

-- Raw storage table structure
CREATE TABLE raw_meta_ads.api_responses (
  ingestion_timestamp TIMESTAMP,
  ad_account_id STRING,
  report_date DATE,
  api_version STRING,
  response_json JSON  -- or STRING if you prefer
)
PARTITION BY report_date
OPTIONS (
  require_partition_filter = false
);

Partition by report date so queries against specific date ranges don’t scan the entire table. Keep raw responses for 90 days — enough to investigate any issue that a normal lookback window might miss.

When you’re investigating a discrepancy, the raw table lets you inspect exactly what the API returned before your transformation touched it. This eliminates an entire category of debugging questions (“is the bug in extraction or transformation?”).

What to Monitor

A minimal monitoring setup for a Meta pipeline:

Check | Frequency | Alert threshold
Token validity (if using user tokens) | Daily | Expiry within 7 days
Spend reconciliation vs. Ads Manager | Weekly | >10% variance
Row count for yesterday | Daily | >20% drop from trailing 7-day average
API error rate | Per run | >5 consecutive failures
API version end-of-life | Monthly | <6 months to EOL

The row count check is a simple proxy for “is data still flowing.” A 20% drop from the trailing average is often the first signal of a silent failure — the pipeline is running but returning less data than expected, which means the API is misbehaving, an account lost access, or a rate limit is being hit silently.
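The check itself reduces to one comparison. A sketch with illustrative function names, using the 20% threshold from the table:

```python
from typing import List


def row_count_drop_pct(yesterday: int, trailing_counts: List[int]) -> float:
    """Fractional drop of yesterday's row count vs. the trailing average.
    Negative values mean yesterday was above average."""
    avg = sum(trailing_counts) / len(trailing_counts)
    if avg == 0:
        return 0.0
    return (avg - yesterday) / avg


def row_count_alert(yesterday: int, trailing_counts: List[int],
                    threshold: float = 0.20) -> bool:
    """True when yesterday's count dropped more than the threshold."""
    return row_count_drop_pct(yesterday, trailing_counts) > threshold
```

The same comparison works as a scheduled SQL query; the Python form is convenient when the check runs inside the extraction job itself.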

These five checks cover the vast majority of real production failures. More sophisticated monitoring is possible — schema drift detection, coverage checks across accounts, completeness validation by campaign — but start here and add complexity only when the basics are stable.