
Hybrid ELT Strategy

When to buy managed ELT, when to build with dlt + AI, and the practical migration path — a decision framework for splitting your pipeline portfolio strategically

Tags: dlt, bigquery, data engineering, etl, cost optimization

Most data teams should operate a hybrid portfolio rather than choosing all-managed or all-custom. Each pipeline uses the tool that makes economic and operational sense for that specific source.

When Buying Still Wins

Managed ELT tools like Fivetran and Airbyte Cloud remain the right choice for certain scenarios, even after the pricing changes.

Compliance-heavy environments. SOC 2 Type II, HIPAA, and GDPR compliance are built into Fivetran and Airbyte Enterprise. Building equivalent audit trails, access controls, and security infrastructure yourself takes significant effort. If your organization requires these certifications and lacks security engineering capacity, the premium is justified. The cost of a compliance failure far exceeds the cost of a managed tool.

Non-technical data teams. If your team lacks Python proficiency and the organization won’t invest in developing it, code-first tools aren’t practical. Fivetran’s UI-driven configuration serves teams that need data flowing without engineering capacity. No amount of AI assistance helps if nobody on the team can review the generated Python code.

Extreme connector breadth. Fivetran offers 700+ connectors. If you need reliable integrations with dozens of SaaS tools you’d never build yourself, the coverage matters. Some community Airbyte connectors have reliability issues, and dlt offers 60+ verified sources in addition to its REST API builder. When you need Zendesk, Intercom, Salesforce, NetSuite, Jira, and fifteen other standard SaaS tools, managed extraction avoids building and maintaining connectors that aren’t core to your business.

Time-to-value urgency. Managed solutions deploy in days. Built solutions, even with AI assistance, require development cycles measured in weeks for production readiness. If you need data flowing next week for a critical business decision, managed wins on timeline alone.

When Building Wins

The economics have shifted to favor building for several common scenarios:

High-MAR sources. Marketing platforms, ad networks, and anything else with granular row-level data that updates frequently generate the most monthly active rows (MAR), the unit Fivetran bills on. These are the sources where per-connector MAR pricing scales most painfully. Google Ads, Meta Ads, and TikTok Ads combine high update frequency and granular data with well-documented APIs that dlt + AI can handle.

Custom integrations. Sources your managed provider doesn’t support well or at all. dlt’s REST API framework makes these straightforward to build. Internal APIs, niche SaaS tools, partner data feeds — these are often the most valuable data sources precisely because they’re unique to your business.
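A minimal sketch of what such a custom integration can look like with dlt’s declarative REST API source. The base URL, the `customers` and `orders` endpoints, the `updated_since` parameter, and the pipeline and dataset names are all hypothetical placeholders. The dlt import is deferred into `run_pipeline` so the configuration can be built and inspected on its own.

```python
from typing import Any


def build_partner_api_config(base_url: str, token: str) -> dict[str, Any]:
    """Build a declarative config for dlt's REST API source.

    The endpoints ("customers", "orders") and the `updated_since` query
    parameter are hypothetical; replace them with your API's real ones.
    """
    return {
        "client": {
            "base_url": base_url,
            "auth": {"type": "bearer", "token": token},
        },
        # Defaults applied to every resource listed below.
        "resource_defaults": {
            "primary_key": "id",
            "write_disposition": "merge",
        },
        "resources": [
            "customers",  # plain string: GET {base_url}customers
            {
                "name": "orders",
                "endpoint": {
                    "path": "orders",
                    "params": {
                        # Only request rows changed since the last run.
                        "updated_since": {
                            "type": "incremental",
                            "cursor_path": "updated_at",
                            "initial_value": "2024-01-01T00:00:00Z",
                        },
                    },
                },
            },
        ],
    }


def run_pipeline(config: dict[str, Any]) -> None:
    """Load the configured endpoints into BigQuery (requires dlt and credentials)."""
    import dlt
    from dlt.sources.rest_api import rest_api_source

    pipeline = dlt.pipeline(
        pipeline_name="partner_api",        # hypothetical name
        destination="bigquery",
        dataset_name="partner_api_raw",     # hypothetical dataset
    )
    print(pipeline.run(rest_api_source(config)))


config = build_partner_api_config("https://api.example.com/v1/", "REPLACE_ME")
```

Calling `run_pipeline(config)` would perform the actual load. Most of the per-source work ends up in the config dictionary, which is also the part that is easiest to review when it was drafted with AI assistance.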

Sources where you need control. When you need to control exactly what data gets extracted, at what frequency, with what transformations applied at the extraction layer. Managed tools give you their schema and their sync schedule. Custom pipelines give you yours.

Cost-sensitive environments. When a team’s total data infrastructure budget is measured in hundreds, not thousands, per month, the $12,000 annual Fivetran minimum alone may represent an outsized share of the budget. A dlt pipeline running on existing infrastructure costs effectively nothing beyond engineering time.

The Portfolio Split

The practical split for most teams looks like this:

| Source Type | Recommended Approach | Rationale |
| --- | --- | --- |
| ERP / CRM (Salesforce, NetSuite) | Managed | Stable APIs, complex schemas, low MAR relative to value |
| Marketing platforms (Google Ads, Meta) | Build with dlt | High MAR, frequent updates, well-documented APIs |
| Standard SaaS (Zendesk, Jira, Intercom) | Managed | Connector maintenance not worth engineering time |
| Custom / internal APIs | Build with dlt | No managed option, or connector quality is poor |
| High-volume event data | Build with dlt | Streaming volume makes per-row pricing prohibitive |
| Compliance-regulated sources | Managed | Audit trail and certification requirements |

Use managed tools where operational convenience exceeds the cost; build where the cost of convenience is disproportionate to the value delivered.

The Migration Path

If you’re paying significant Fivetran bills for marketing data, here’s a practical path forward. Don’t try to migrate everything at once — start with the highest-impact connector and build confidence.

Step 1: Start With Your Highest-MAR Connector

Identify the source that’s most expensive relative to its business value. Marketing platforms are usually the answer. Check your Fivetran dashboard for per-connector MAR costs and start with the connector whose bill is highest relative to the value of its data.
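As a rough illustration of this ranking step, the sketch below sorts connectors by monthly cost and shows each one’s share of the total bill. The connector names and dollar figures are made up for the example and are not real Fivetran pricing; in practice the numbers would come from your own usage dashboard or export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorUsage:
    name: str
    monthly_active_rows: int
    monthly_cost_usd: float


def rank_by_cost(usage: list[ConnectorUsage]) -> list[tuple[str, float, float]]:
    """Return (name, monthly cost, share of total bill), most expensive first."""
    total = sum(c.monthly_cost_usd for c in usage)
    if total == 0:
        return []
    ranked = sorted(usage, key=lambda c: c.monthly_cost_usd, reverse=True)
    return [(c.name, c.monthly_cost_usd, c.monthly_cost_usd / total) for c in ranked]


# Illustrative numbers only.
usage = [
    ConnectorUsage("salesforce", 400_000, 350.0),
    ConnectorUsage("google_ads", 9_000_000, 1_900.0),
    ConnectorUsage("zendesk", 250_000, 250.0),
]

ranking = rank_by_cost(usage)
for name, cost, share in ranking:
    print(f"{name:<12} ${cost:>8,.0f}  {share:.0%} of bill")
```

The ranking only covers cost. Weighing each connector against the value of its data remains a judgment the numbers cannot make for you.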

Google Ads, Meta Ads, and TikTok Ads are the most common starting points. They have high update frequency (ad metrics update retroactively), granular data (ad-level or keyword-level), and well-documented APIs that dlt + AI handle well.

Step 2: Use BigQuery-Specific Optimizations

If your warehouse is BigQuery, take advantage of dlt’s destination-specific features:

  • GCS staging for large loads. Staging files in Cloud Storage and loading them with batch load jobs, which BigQuery does not charge for on its shared slot pool, avoids streaming insert costs.
  • Partition by date for marketing data. This is standard practice and dlt configures it declaratively.
  • Cluster on campaign or ad group IDs for query performance. Clustering reduces scan costs for the filtered queries your analysts run most.

These optimizations are configuration, not custom engineering. They’re the kind of thing AI usually generates correctly on the first try, though the generated hints are still worth a quick review.
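A minimal sketch of how the three settings above might be expressed, using dlt’s `bigquery_adapter` for partitioning and clustering and `staging="filesystem"` for the Cloud Storage step. The table shape, column names, pipeline name, and dataset name are assumptions for illustration, and the staging bucket has to be configured separately (for example a `bucket_url` of the form `gs://<your-bucket>` in dlt’s filesystem destination config).

```python
from datetime import date
from typing import Any

PARTITION_COLUMN = "date"
CLUSTER_COLUMNS = ["campaign_id", "ad_group_id"]


def sample_ad_metrics() -> list[dict[str, Any]]:
    """Stand-in for rows fetched from an ads API (hypothetical shape)."""
    return [
        {"date": date(2024, 1, 1), "campaign_id": "c1", "ad_group_id": "g1", "clicks": 42},
        {"date": date(2024, 1, 1), "campaign_id": "c1", "ad_group_id": "g2", "clicks": 7},
    ]


def load_to_bigquery() -> None:
    """Load the rows with GCS staging, date partitioning, and clustering.

    Requires dlt with BigQuery and GCS support installed, credentials,
    and a staging bucket configured for dlt's filesystem destination.
    """
    import dlt
    from dlt.destinations.adapters import bigquery_adapter

    resource = dlt.resource(
        sample_ad_metrics(),
        name="ad_metrics",
        write_disposition="merge",
        primary_key=["date", "campaign_id", "ad_group_id"],
    )
    # Attach BigQuery-specific table hints to the resource.
    resource = bigquery_adapter(
        resource, partition=PARTITION_COLUMN, cluster=CLUSTER_COLUMNS
    )

    pipeline = dlt.pipeline(
        pipeline_name="ads_to_bigquery",   # hypothetical name
        destination="bigquery",
        staging="filesystem",              # stage files in the bucket, then batch-load
        dataset_name="marketing_raw",      # hypothetical dataset
    )
    print(pipeline.run(resource))
```

The merge write disposition is chosen here because ad metrics are restated after the fact, so reloaded days should overwrite earlier rows rather than duplicate them.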

Step 3: Measure Honestly

Track actual development time, not estimates. Include the time to:

  • Understand the source API (read documentation, test endpoints)
  • Generate the initial pipeline with AI assistance
  • Handle edge cases (rate limits, undocumented quirks, data gaps)
  • Test against production-like data volumes
  • Set up monitoring and alerting
  • Get through code review and deploy to production

Compare the total against what you’re currently paying in MAR fees. The math usually works, but verify it for your specific situation. Be honest about the ongoing maintenance cost too — dlt handles schema evolution automatically, but you’ll still need to respond to API changes and monitor pipeline health.
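The comparison can be written down as a simple payback calculation. The sketch below is one way to frame it, with hypothetical inputs: tracked build hours, an hourly engineering rate, expected maintenance hours per month, and the MAR fee the pipeline would replace.

```python
from typing import Optional


def payback_months(
    build_hours: float,
    hourly_rate_usd: float,
    monthly_maintenance_hours: float,
    monthly_mar_fee_usd: float,
) -> Optional[float]:
    """Months until the build cost is recovered, or None if it never is.

    Net monthly saving = MAR fee avoided minus ongoing maintenance cost.
    """
    build_cost = build_hours * hourly_rate_usd
    monthly_saving = monthly_mar_fee_usd - monthly_maintenance_hours * hourly_rate_usd
    if monthly_saving <= 0:
        return None  # maintenance alone costs as much as the managed tool
    return build_cost / monthly_saving


# Illustrative inputs: 60 tracked hours at $100/h, 4 h/month of upkeep.
print(payback_months(60, 100, 4, 1_900))  # 6000 / (1900 - 400) = 4.0
print(payback_months(60, 100, 4, 300))    # None: upkeep exceeds the fee
```

Feeding in the hours you actually tracked, rather than the estimate made before starting, is what keeps the result honest.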

Step 4: Build the Next Connector Faster

Each pipeline you build develops patterns and reusable components:

  • Authentication modules. OAuth flows, token refresh logic, and secrets management become templates.
  • Error handling. Retry strategies, rate limit backoff, and alerting patterns transfer across connectors.
  • Deployment scripts. CI/CD, scheduling, and infrastructure configuration become boilerplate.
  • Testing patterns. Data validation, schema verification, and incremental loading tests are reusable.

The second connector takes less time than the first. The fifth takes a fraction. This compounding benefit is something managed tools don’t provide — your tenth Fivetran connector costs the same as your first.
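As one example of a component that transfers across connectors, here is a small retry helper with exponential backoff and jitter. It is a generic sketch rather than anything dlt-specific; `RetryableError` and the `flaky_fetch` demo are hypothetical names introduced for the illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Raised by connector code for errors worth retrying (e.g. HTTP 429 or 503)."""


def with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RetryableError with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent workers.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")


# Demo: fail twice, then succeed. Delays are recorded instead of slept.
calls = {"n": 0}
delays: list[float] = []


def flaky_fetch() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryableError("rate limited")
    return "ok"


result = with_backoff(flaky_fetch, sleep=delays.append)
print(result, calls["n"], len(delays))
```

The `sleep` parameter is injectable so the same helper can be exercised in tests without real waiting. dlt also ships a retry-enabled `requests` wrapper under `dlt.sources.helpers`, which may cover the plain HTTP case without custom code.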

Step 5: Re-Evaluate the Full Portfolio

After you’ve migrated 2-3 high-MAR connectors and have a working pattern library, re-evaluate your remaining managed connectors. Some will clearly justify staying managed (stable, low-MAR, complex schemas). Others will be obvious migration candidates. A few will be judgment calls where the decision depends on your team’s capacity and priorities.

Cost Threshold Heuristic

If the monthly MAR cost for a source exceeds what a senior engineer costs for a day of work, building is likely more economical. For most markets, that threshold is $1,200–$2,000 per month per connector. Below that threshold, the engineering time to build and maintain a custom connector typically exceeds the managed tool cost.

The threshold is not static. As AI-assisted development speeds up and open-source tools mature, it moves downward. Run the cost comparison periodically rather than once.
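The heuristic can be sketched as a small function. Treating the $1,200 to $2,000 range as a grey zone, where the answer depends on your own day rate, is an interpretation added for this example rather than part of the rule itself, and the function name is hypothetical.

```python
def threshold_recommendation(
    monthly_mar_cost_usd: float,
    day_rate_low_usd: float = 1_200.0,
    day_rate_high_usd: float = 2_000.0,
) -> str:
    """Compare a connector's monthly MAR cost with a senior engineer's day rate."""
    if monthly_mar_cost_usd > day_rate_high_usd:
        return "build"
    if monthly_mar_cost_usd < day_rate_low_usd:
        return "stay managed"
    return "judgment call"  # inside the range: depends on your actual day rate


for cost in (300, 1_500, 4_000):
    print(cost, threshold_recommendation(cost))
```

Because the bounds are parameters, lowering them over time is a one-line change when you rerun the comparison.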