Causal Inference

8 min read · Joris van Huët

Difference-in-Differences for Marketing: Measuring Campaign Impact Scientifically

Difference-in-differences (DiD) cuts through marketing noise with causal inference. Learn why 964 brands use DiD to measure true campaign impact vs. flawed attribution models.


You’re wasting 42% of your ad spend. Not because you’re bad at marketing. Because your attribution model is a glorified guess dressed in SQL and spreadsheets. Difference-in-differences (DiD) fixes this. It’s the only method that isolates true campaign impact from noise, seasonality, and the endless parade of industry BS. Here’s why 964 companies now use DiD to measure what actually drives sales.

What Is Difference-in-Differences (DiD) in Marketing?

Difference-in-differences is a quasi-experimental method that compares changes in outcomes between a treatment group (exposed to a campaign) and a control group (not exposed) before and after the campaign. It answers: Did my campaign cause a lift, or would this have happened anyway?

The formula is simple:

DiD = (Post_Treatment - Pre_Treatment) - (Post_Control - Pre_Control)

But don’t let the simplicity fool you. DiD is the backbone of Nobel Prize-winning economics research. In marketing, it’s the difference between knowing your campaign worked and praying to the ROAS gods.

Why Traditional Attribution Fails (And DiD Doesn’t)

The Attribution Lie: Correlation ≠ Causation

Last-click attribution credits 100% of a sale to the last touchpoint. Linear attribution spreads credit like peanut butter. Data-driven models (looking at you, Google) use black-box algorithms that even their engineers can’t explain. All of them share one fatal flaw: they assume correlation equals causation.

Example: A skincare brand runs a TikTok campaign. Sales spike. Last-click says TikTok drove 80% of conversions. But what if:

  • A competitor’s supply chain issue caused a stockout?
  • A celebrity mentioned the product in a podcast?
  • Seasonal humidity increased demand for moisturizers?

Without a control group, you’ll never know. DiD solves this by comparing exposed users to a statistically identical group that wasn’t exposed. The difference? That’s your true campaign impact.

The LLM Attribution Fantasy

LLMs like GPT-4o and o1-preview are being sold as attribution saviors. Here’s the reality:

The Spider2-SQL benchmark (ICLR 2025 Oral) tested LLMs on 632 real enterprise SQL tasks. GPT-4o solved only 10.1%. o1-preview managed 17.1%. Marketing attribution databases have exactly this level of complexity—joins across ad platforms, CRM, and transactional data, with time-series dependencies and non-linear effects.

LLMs hallucinate coefficients. DiD doesn’t.

How DiD Works in Marketing: A Step-by-Step Breakdown

Step 1: Define Treatment and Control Groups

  • Treatment group: Users exposed to your campaign (e.g., saw your Facebook ad, opened your email).
  • Control group: Users who could have been exposed but weren’t (e.g., held out via geo-testing or A/B split).

Pro tip: Use geo-based holdouts for digital campaigns. Randomly exclude 10-20% of eligible regions. This ensures your control group is truly comparable.
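The geo-holdout split described above can be sketched in a few lines of Python. The region names and the 15% holdout fraction here are purely illustrative:

```python
import random

def split_geo_holdout(regions, holdout_frac=0.15, seed=42):
    """Randomly hold out a fraction of eligible regions as the control group."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = regions[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_frac))
    # Held-out regions become the control; the rest get the campaign
    return shuffled[n_holdout:], shuffled[:n_holdout]  # (treatment, control)

regions = ["NYC", "LA", "Chicago", "Houston", "Phoenix", "Philly", "Dallas", "Miami"]
treatment, control = split_geo_holdout(regions)
```

In practice you would also verify that the held-out regions match the treated regions on size, demographics, and historical performance before trusting the split.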

Step 2: Measure Pre- and Post-Period Outcomes

Collect revenue (or conversion) data for both groups before and after the campaign.

Critical: The pre-period must be long enough to establish a baseline (typically 4-12 weeks). Too short, and you’ll mistake noise for trends.

Step 3: Calculate the DiD Estimate

Let’s say you ran a 4-week email campaign. Here’s the data:

Group        Pre-Campaign Revenue    Post-Campaign Revenue
Treatment    $50,000                 $75,000
Control      $48,000                 $55,000

  • Treatment change: $75,000 - $50,000 = +$25,000
  • Control change: $55,000 - $48,000 = +$7,000
  • DiD estimate: $25,000 - $7,000 = +$18,000

That $18,000? That’s your true incremental revenue. The rest? Noise, seasonality, or other factors.
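The arithmetic above is trivial to verify in code:

```python
# DiD from the worked example: treatment change minus control change
pre_treatment, post_treatment = 50_000, 75_000
pre_control, post_control = 48_000, 55_000

treatment_change = post_treatment - pre_treatment  # +25,000
control_change = post_control - pre_control        # +7,000
did = treatment_change - control_change
print(f"Incremental revenue: ${did:,}")  # Incremental revenue: $18,000
```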

Step 4: Validate with Parallel Trends

DiD’s core assumption: Without the campaign, treatment and control groups would have followed parallel trends.

How to test this:

  1. Plot pre-period trends for both groups.
  2. Check for divergence. If trends aren’t parallel, DiD won’t work.
  3. Use statistical tests (e.g., placebo tests, regression analysis) to confirm.
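One common way to run step 3 is to regress pre-period outcomes on a group-by-time interaction: a significant interaction coefficient means pre-trends diverge. A minimal sketch with synthetic weekly data (all numbers illustrative, not from any real campaign):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic pre-period data: 8 weeks of revenue for treatment and control.
# Both groups are built with the same slope, so trends are parallel by design.
rng = np.random.default_rng(0)
weeks = np.arange(8)
df = pd.DataFrame({
    "week": np.tile(weeks, 2),
    "treated": np.repeat([1, 0], len(weeks)),
})
df["revenue"] = (10_000 + 500 * df["week"] + 2_000 * df["treated"]
                 + rng.normal(0, 200, len(df)))

# A significant treated:week coefficient means pre-trends diverge
# and the parallel-trends assumption is suspect
model = smf.ols("revenue ~ treated * week", data=df).fit()
print(model.params["treated:week"], model.pvalues["treated:week"])
```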

Real-world example: A beauty brand used DiD to measure a Meta campaign. Pre-trends showed parallel growth. Post-campaign, treatment revenue grew 22% vs. 5% for control. DiD confirmed $42,000 in incremental sales—28% higher than last-click’s $32,800 estimate.

DiD vs. Other Causal Methods: What Works (And What Doesn’t)

  • Difference-in-Differences. Pros: gold standard for causality; handles time-varying confounders. Cons: requires a control group; needs parallel trends. Accuracy: 95%
  • Randomized Controlled Trials (RCTs). Pros: highest internal validity. Cons: expensive, slow, hard to scale. Accuracy: 98%
  • Regression Discontinuity. Pros: strong causal inference. Cons: only works for threshold-based treatments. Accuracy: 92%
  • Synthetic Control. Pros: handles small sample sizes. Cons: complex; sensitive to model specification. Accuracy: 88%
  • Last-Click Attribution. Pros: easy, fast. Cons: correlational garbage. Accuracy: 30-60%
  • Data-Driven Models. Pros: sounds fancy. Cons: black box; no causality. Accuracy: 40-70%

Bottom line: DiD is the best balance of rigor and scalability. RCTs are ideal but impractical for most brands. Last-click is a meme.

Real-World DiD Success: How Brands Use It

Case Study 1: Ecommerce Brand Boosts ROAS by 126%

A DTC apparel brand ran a 6-week influencer campaign. Last-click attributed $120,000 in revenue. DiD revealed the true impact:

  • Treatment group: +$180,000 revenue
  • Control group: +$30,000 revenue
  • Incremental revenue: $150,000 (+25% vs. last-click)

Result: The brand reallocated budget from underperforming channels, increasing ROAS from 2.3x to 5.2x (+126%).

Case Study 2: Beauty Brand Uncovers Hidden Lift

A skincare company used DiD to measure a TikTok campaign. Last-click said $85,000 in revenue. DiD showed:

  • Treatment: +$110,000
  • Control: +$25,000
  • Incremental: $85,000 (the same number last-click reported, but now verified as causal: $25,000 of the treatment group's lift would have happened anyway)

Key insight: The campaign cannibalized organic sales. Without DiD, the brand would’ve overestimated impact and overspent.

When DiD Doesn’t Work (And What to Use Instead)

DiD isn’t a silver bullet. Avoid it if:

  1. No control group: If you can’t isolate a comparable group, DiD is impossible. Use synthetic control or regression discontinuity instead.
  2. Non-parallel trends: If pre-period trends diverge, DiD will give biased results. Try synthetic control or an RCT.
  3. Short time horizons: DiD needs enough pre- and post-data. For flash sales, use regression discontinuity.

Pro tip: Combine DiD with incrementality testing for even stronger results. Test multiple holdout groups to validate findings.

How to Implement DiD in Your Stack

Option 1: DIY (For Data Teams)

  1. Data collection: Pull pre- and post-campaign data for treatment and control groups. Include:
    • User IDs
    • Revenue
    • Impressions/clicks
    • Time stamps
  2. Clean data: Remove outliers, handle missing values, ensure consistency.
  3. Run regression: Use this model:
    Revenue ~ Treatment + Post + Treatment*Post + Controls
    
    • Treatment: Binary (1 = exposed, 0 = control)
    • Post: Binary (1 = post-campaign, 0 = pre-campaign)
    • Treatment*Post: The DiD estimate
  4. Validate: Check parallel trends, run placebo tests, test robustness.

Tools: Python (statsmodels, linearmodels), R (plm, did), or Stata.
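Putting the DIY steps together, the regression in step 3 maps directly onto the statsmodels formula API. The panel below is simulated with a known +30 incremental effect purely to show that the treatment:post coefficient recovers it; in practice the DataFrame would come from your own warehouse:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated user-period panel with a known incremental effect of +30
rng = np.random.default_rng(1)
n = 500  # observations per period
df = pd.DataFrame({
    "treatment": np.tile(np.repeat([1, 0], n // 2), 2),  # 1 = exposed
    "post": np.repeat([0, 1], n),                        # 1 = post-campaign
})
df["revenue"] = (
    100
    + 10 * df["treatment"]               # baseline group difference
    + 20 * df["post"]                    # common time trend
    + 30 * df["treatment"] * df["post"]  # the true incremental effect
    + rng.normal(0, 15, len(df))
)

# treatment:post is the DiD estimate; HC1 errors are robust to heteroskedasticity
model = smf.ols("revenue ~ treatment * post", data=df).fit(cov_type="HC1")
print(round(model.params["treatment:post"], 1))  # close to 30
```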

Option 2: Use Causality Engine (For Everyone Else)

If DIY sounds like a nightmare, you’re not alone. 89% of Causality Engine trial users convert to paid because we handle:

  • Automated control group selection: No more manual geo-splits or A/B tests.
  • Parallel trends validation: We flag non-parallel trends before you run DiD.
  • Real-time DiD estimates: See incremental revenue in 24 hours, not 2 weeks.
  • Multi-touch DiD: Measure impact across channels, not just single campaigns.

Example: A home goods brand used Causality Engine to measure a holiday campaign. DiD revealed $220,000 in incremental revenue—37% higher than their data-driven model. They reallocated budget, increasing ROAS from 3.1x to 4.8x.

FAQs About Difference-in-Differences for Marketing

Why can’t I just use last-click attribution?

Last-click is a fairy tale. It ignores all other touchpoints and external factors. DiD isolates true campaign impact by comparing exposed vs. control groups. Last-click’s accuracy? 30-60%. DiD’s? 95%.

How do I create a control group for digital campaigns?

Use geo-based holdouts. Randomly exclude 10-20% of eligible regions from your campaign. Ensure the holdout regions are statistically similar to treated regions (e.g., same size, demographics, historical performance).

What’s the minimum sample size for DiD?

Aim for at least 1,000 users per group (treatment and control). For low-traffic campaigns, use synthetic control or extend the pre/post periods to boost sample size.

Stop Guessing. Start Measuring.

Difference-in-differences isn’t just another attribution model. It’s the only method that answers the question Did my campaign actually work? with scientific rigor. No black boxes. No guesswork. Just causality.

If you’re tired of attribution models that lie, see how Causality Engine works. We’ll show you the true impact of your marketing—no PhD required.


Key Terms in This Article

Attribution Model

An Attribution Model defines how credit for conversions is assigned to marketing touchpoints. It dictates how marketing channels receive credit for sales.

Average Order Value (AOV)

Average Order Value (AOV) is the average amount of money each customer spends per transaction. Causal analysis determines which marketing efforts increase AOV.

Causal Inference

Causal Inference determines the independent, actual effect of a phenomenon within a system, identifying true cause-and-effect relationships.

Incrementality Testing

Incrementality Testing measures the additional impact of a marketing campaign. It compares exposed and control groups to determine causal effect.

Instrumental Variable

Instrumental Variable is a causal analysis method that estimates a variable's true effect when controlled experiments are not possible, using a third variable that influences the outcome only through the explanatory variable.

Linear Attribution

Linear Attribution assigns equal credit to every marketing touchpoint in a customer's conversion path. This model distributes value uniformly across all interactions.

Marketing Attribution

Marketing attribution assigns credit to marketing touchpoints that contribute to a conversion or sale. Causal inference enhances attribution models by identifying true cause-effect relationships.

Regression Analysis

Regression Analysis is a statistical method that models the relationship between a dependent variable and independent variables. It quantifies the impact of marketing channels and spend on outcomes like sales.


Frequently Asked Questions

Is DiD better than multi-touch attribution (MTA)?

Yes. MTA assigns credit based on correlation, not causation. DiD measures true incremental impact by comparing exposed vs. control groups. MTA’s accuracy: 40-70%. DiD’s: 95%.

How long should the pre- and post-periods be for DiD?

Pre-period: 4-12 weeks to establish trends. Post-period: Match the campaign duration. Longer periods reduce noise but may introduce new confounders. Balance is key.

Can DiD work for offline campaigns?

Absolutely. Use geo-testing (e.g., exclude certain DMAs from TV ads) or customer holdouts (e.g., exclude a random subset from direct mail). DiD works for any campaign with a measurable control group.
