Engineering

Reverse ETL Explained: Push Data Back to Your Tools

9 min read·Tags: reverse-etl, data-pipeline, crm, data-warehouse, dbt, operational-analytics, data-activation

Your data warehouse has everything — customer LTV, churn scores, product usage signals. But your sales team is staring at a blank Salesforce record. Reverse ETL is the pattern that closes this gap by syncing processed data from your warehouse back into the operational tools your teams actually live in.

What Is Reverse ETL?

Traditional ETL (or ELT) moves data into your warehouse for analytics. Reverse ETL flips the direction: it takes the curated, transformed data already in your warehouse and pushes it into downstream operational systems — CRMs, marketing automation, customer support tools, ad platforms.

The pattern emerged around 2020, when teams realized their data stacks were producing great insights that never reached the people who needed them. Analysts built churn models; sales reps never saw the scores. Marketing built lookalike audiences, but had to export and upload CSVs by hand.

Warehouse (Snowflake / BigQuery / Redshift / DuckDB)
        │
        ▼  [Reverse ETL]
  ┌─────────────────────────────────────┐
  │  CRM        │  Marketing │  Support  │
  │  Salesforce │  Braze     │  Zendesk  │
  └─────────────────────────────────────┘

Why Teams Adopt Reverse ETL

| Pain without Reverse ETL | What Reverse ETL solves |
|---|---|
| Manual CSV exports to upload segments | Automated, scheduled syncs |
| Sales reps lack enriched context | LTV, risk scores live in CRM fields |
| Marketing builds audiences from raw tools | Warehouse-quality segments in Braze/Iterable |
| Multiple teams maintain their own ETL | Single source of truth flows everywhere |
| Stale data in operational systems | Near-real-time updates from warehouse |

How Reverse ETL Works — Step by Step

Step 1 — Define a model: You write a SQL query or point to a dbt model in your warehouse. This is the "what to sync" definition.

-- PostgreSQL / Snowflake SQL
-- churn_risk_scores model passed to Reverse ETL
SELECT
    u.user_id,
    u.email,
    cs.churn_probability,
    cs.predicted_churn_date,
    cs.segment_label
FROM users u
JOIN churn_scores cs ON u.user_id = cs.user_id
WHERE cs.scored_at >= CURRENT_DATE - INTERVAL '1 day';

Step 2 — Map fields: You map warehouse columns to destination fields (churn_probability → Salesforce custom field Churn_Score__c).
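At its core, a field mapping is just a lookup from warehouse columns to destination API field names. A minimal sketch in Python — the Salesforce field names besides Churn_Score__c are illustrative, not a real org's schema:

```python
# Hypothetical column-to-field mapping for a Salesforce sync.
# Warehouse column names come from the churn_risk_scores model above;
# destination field names other than Churn_Score__c are made up.
FIELD_MAPPING = {
    "email": "Email",                                   # match key
    "churn_probability": "Churn_Score__c",              # custom field
    "predicted_churn_date": "Predicted_Churn_Date__c",  # illustrative
    "segment_label": "Churn_Segment__c",                # illustrative
}

def map_record(row: dict) -> dict:
    """Translate one warehouse row into a destination payload.
    Columns without a mapping (e.g. user_id) are simply dropped."""
    return {dest: row[src] for src, dest in FIELD_MAPPING.items() if src in row}

payload = map_record({"user_id": 42, "email": "a@example.com",
                      "churn_probability": 0.87})
print(payload)  # {'Email': 'a@example.com', 'Churn_Score__c': 0.87}
```

Real tools do this through a UI, but the underlying translation is exactly this shape.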

Step 3 — Define sync behavior: Upsert (match on primary key and update), insert-only, or mirror (delete records that disappear from the model). Most teams use upsert.
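The three behaviors differ only in how they treat records already in the destination. A sketch against an in-memory "destination" keyed by primary key — real tools do this through the destination's bulk API, not a dict:

```python
# Sketch of the three sync modes. `destination` stands in for the
# operational tool, as a dict keyed by primary key.

def sync(destination: dict, model_rows: list, key: str, mode: str) -> dict:
    model = {row[key]: row for row in model_rows}
    if mode == "insert":    # insert-only: never touch existing records
        for k, row in model.items():
            destination.setdefault(k, row)
    elif mode == "upsert":  # match on key; update existing, create missing
        destination.update(model)
    elif mode == "mirror":  # upsert, plus delete rows gone from the model
        destination.update(model)
        for k in list(destination):
            if k not in model:
                del destination[k]
    return destination

dest = {1: {"user_id": 1, "score": 0.2}, 2: {"user_id": 2, "score": 0.5}}
rows = [{"user_id": 1, "score": 0.9}]
print(sync(dict(dest), rows, "user_id", "mirror"))
# only user 1 survives, with the updated score
```

Mirror mode is why it deserves caution: a bug in the model's WHERE clause can delete destination records.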

Step 4 — Schedule: Run hourly, daily, or trigger via webhook after a dbt run finishes.

Step 5 — Monitor: Track sync success/failure, record counts, and API error rates.
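Per-run monitoring only needs a few counters; anything fancier builds on the same shape. A sketch where send_batch is a hypothetical wrapper around the destination's bulk API that raises on HTTP errors:

```python
# Sketch: track per-run sync health. `send_batch` is hypothetical —
# in practice it wraps the destination's bulk API and raises on error.

def run_sync(batches, send_batch):
    stats = {"records_sent": 0, "batches_ok": 0, "batches_failed": 0}
    for batch in batches:
        try:
            send_batch(batch)
            stats["batches_ok"] += 1
            stats["records_sent"] += len(batch)
        except Exception:
            stats["batches_failed"] += 1  # page someone if this climbs
    total = stats["batches_ok"] + stats["batches_failed"]
    stats["error_rate"] = stats["batches_failed"] / total if total else 0.0
    return stats

print(run_sync([[1, 2], [3]], lambda batch: None))
```

Alert on error_rate and on records_sent dropping to zero — a sync that "succeeds" with nothing to send usually means the upstream model broke.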

Common Destinations

  • CRM: Salesforce, HubSpot — enrich contacts with product usage, scores
  • Marketing: Braze, Iterable, Klaviyo — sync behavioral segments for campaigns
  • Advertising: Facebook Custom Audiences, Google Customer Match — match lists
  • Support: Zendesk, Intercom — add subscription tier, LTV to tickets
  • Product: Amplitude, Mixpanel — push warehouse segments for cohort analysis

Top Reverse ETL Tools Compared

| Tool | Pricing | Destinations | Warehouse support | Notes |
|---|---|---|---|---|
| Census | From ~$800/mo | 200+ | All major | Native dbt model support |
| Hightouch | Free tier + paid | 200+ | All major | Strong no-code audience builder |
| Polytomic | Custom pricing | 100+ | All major | Focuses on sales/ops use cases |
| RudderStack | OSS + Cloud | 150+ | All major | Full CDP with reverse ETL |
| dbt Cloud + partner | Varies | Via integrations | dbt native | Emerging ecosystem play |

Last verified: April 2026 [PRICING-CHECK]

When to Use Reverse ETL (and When Not To)

Use it when:

  • Your warehouse is the single source of truth and operational tools need derived data
  • You're replacing manual CSV exports with automated syncs
  • You want to sync dbt models directly to destinations without custom code
  • Your team lacks engineering capacity to build bespoke integrations

Skip it when:

  • You need sub-second latency — streaming pipelines (Kafka + Flink) are a better fit
  • Your destination already connects to your source systems directly and the warehouse adds no value
  • You're syncing raw operational data back to operational systems — that's just circular replication

The Operational Reverse ETL Pattern with dbt

The cleanest implementation pairs dbt with a Reverse ETL tool:

  1. Raw data lands in warehouse
  2. dbt transforms it into clean models (e.g., dim_customers, fct_churn_scores)
  3. A dbt job-completion webhook triggers Census or Hightouch
  4. The tool syncs only changed records (using a _synced_at watermark or primary key diff)

-- dbt incremental model (SQL + Jinja)
-- mart_customer_crm_sync.sql
{{ config(materialized='incremental', unique_key='user_id') }}

SELECT
    user_id,
    email,
    plan_tier,
    lifetime_value_usd,
    churn_risk_score,
    last_active_date,
    updated_at  -- change timestamp carried from the upstream model
FROM {{ ref('fct_customer_metrics') }}

{% if is_incremental() %}
  -- only re-sync rows that changed since the last successful run
  WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
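The job-completion trigger from step 3 can be sketched as a small handler: dbt Cloud POSTs a JSON payload when a job finishes, and only successful runs should kick off a sync. The payload field names here ("runStatus", "jobName") are illustrative — check the dbt Cloud webhook docs for the real schema — and trigger_sync is a hypothetical callable:

```python
import json

# Sketch of a dbt job-completion webhook handler. Payload field names
# are assumptions, not dbt Cloud's exact schema; `trigger_sync` is a
# hypothetical hook into your Reverse ETL tool's API.

def handle_dbt_webhook(body: str, trigger_sync) -> bool:
    payload = json.loads(body)
    if (payload.get("runStatus") == "Success"
            and payload.get("jobName") == "nightly_marts"):
        trigger_sync(model="mart_customer_crm_sync")
        return True
    return False  # failed or unrelated runs never trigger a sync
```

Gating on run status matters: triggering a sync after a failed dbt run would push yesterday's (or broken) data downstream.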

Common Pitfalls

API rate limits: Salesforce's REST API enforces a daily request quota that varies by edition and license count. Syncing 500k records daily at 200 records per batch is 2,500 API calls a day from one sync alone — and those calls compete with every other integration in the org. Always check destination API quotas before planning sync volume.
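The arithmetic is worth doing up front. A quick sketch — DAILY_LIMIT is a placeholder, not a real Salesforce figure; look up your org's actual quota:

```python
import math

def api_calls_needed(records: int, batch_size: int, syncs_per_day: int = 1) -> int:
    """Daily API calls one sync will consume at a given batch size."""
    return math.ceil(records / batch_size) * syncs_per_day

calls = api_calls_needed(records=500_000, batch_size=200)
print(calls)  # 2500 calls/day from this one sync

# Placeholder quota — check your destination's actual daily limit.
DAILY_LIMIT = 100_000
assert calls < DAILY_LIMIT * 0.2, "one sync should not eat >20% of the quota"
```

Run the same math for hourly schedules: the same model synced 24 times a day is 24x the calls.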

Overwriting human edits: If a sales rep manually sets a field in Salesforce and your sync overwrites it an hour later, expect complaints. Use conditional write logic or lock fields that humans should own.
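Conditional write logic can be as simple as "only overwrite a field if the sync owns its last change." A sketch, assuming the destination exposes per-field last-modified-by metadata — many CRMs only track this at the record level, so treat the _last_modified_by shape as illustrative:

```python
SYNC_USER = "reverse-etl-service"  # integration user the sync writes as

def safe_update(current: dict, incoming: dict, protected_fields: set) -> dict:
    """Overwrite a field only if it is unprotected, or was last
    touched by the sync itself (never by a human)."""
    meta = current.get("_last_modified_by", {})  # hypothetical metadata
    updated = dict(current)
    for field, value in incoming.items():
        human_owned = (field in protected_fields
                       and meta.get(field) not in (None, SYNC_USER))
        if not human_owned:
            updated[field] = value
    return updated

record = {"Churn_Score__c": 0.4, "Owner_Notes__c": "called 3/2",
          "_last_modified_by": {"Owner_Notes__c": "jane@example.com"}}
out = safe_update(record, {"Churn_Score__c": 0.9, "Owner_Notes__c": ""},
                  protected_fields={"Owner_Notes__c"})
print(out["Owner_Notes__c"])  # rep's note survives; score still updates
```

Where the destination can't tell you who edited a field, the blunt alternative works: exclude human-owned fields from the mapping entirely.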

Schema drift: If your warehouse model changes (column renamed, type changed), the destination mapping breaks silently. Build alerting on sync failure rates.

Treating it as a CDC replacement: Reverse ETL syncs processed warehouse data on a schedule. If you need real-time event replication between operational databases, use Change Data Capture instead.

Reverse ETL vs. Other Patterns

| Pattern | Direction | Latency | Use case |
|---|---|---|---|
| ETL/ELT | Ops → Warehouse | Minutes–hours | Analytics |
| CDC | DB → DB / Warehouse | Seconds | Real-time replication |
| Reverse ETL | Warehouse → Ops | Minutes–hours | Operational activation |
| Streaming | Event bus → anywhere | Milliseconds | Real-time pipelines |

See ETL vs ELT for a deeper look at the inbound direction.

Harbinger Explorer for Reverse ETL Prep

Before you set up a sync, validate that your warehouse model produces the right data. Harbinger Explorer lets you query your data directly in the browser using DuckDB WASM — connect to CSVs or API endpoints, run SQL to inspect your model's output, and confirm field distributions before wiring up a sync. Running SELECT COUNT(*), segment_label FROM churn_scores GROUP BY 2 to spot-check segment sizes takes seconds, no full BI tool required.

Wrapping Up

Reverse ETL is the operational layer that makes your data warehouse useful beyond analytics reports. The pattern is mature, the tooling is solid, and the ROI is high when your team is drowning in manual CSV workflows. Start with one destination, one model, and a weekly sync — then expand from there.

Next step: Identify one report your sales or marketing team is running manually today, build a dbt model for it, and set up a Reverse ETL sync to eliminate the manual step.


[VERIFY]: API rate limit numbers for Salesforce REST API. [PRICING-CHECK]: Tool pricing as of April 2026 — verify with vendor pages.

