11 min read · Joris van Huët

How to Run a Holdout Test on Meta Ads Without Killing Your Revenue

Learn how to run a holdout test on Meta Ads to measure the true incremental value of your campaigns without sacrificing revenue. A guide for Dutch Shopify brands.

Your Meta Ads dashboard shows a 4.5x ROAS. You are celebrating, ready to scale your budget. Stop. That number is a fiction, a carefully crafted illusion designed to keep you spending. The hard truth is that platform-reported metrics are fundamentally broken, and your actual return on ad spend is a mystery. For Dutch Shopify brands scaling past €100k per month, this mystery is a multi-thousand-euro problem. The only way to find the truth is to stop relying on what Meta tells you and start measuring for yourself.

A holdout test is the only way to measure the true, incremental impact of your Meta Ads. It works by showing your ads to a test group while hiding them from a similar control group, allowing you to isolate the sales your ads actually caused. This guide shows you how to run one without risking your revenue.

The ROAS Lie: Why Your Dashboard Is Wrong

Platform-reported ROAS is a misleading metric because it relies on self-serving attribution models that take credit for sales that would have happened anyway. Unlike true incremental lift, which measures causation, ROAS is based on correlation and fails to account for organic customer behavior, leading to inflated performance data and inefficient ad spend.

Return on Ad Spend (ROAS) has been the benchmark for performance marketing for years. The formula is simple: Revenue / Ad Spend. But the data feeding this formula is anything but simple. It is a black box of self-serving marketing attribution models that inflate performance and obscure the truth about your sales.

Meta, like every other ad platform, wants to take credit for as many conversions as possible. It uses generous attribution windows and counts view-through conversions, which are conversions that happen after someone saw your ad but did not click. A significant portion of these sales would have happened anyway. These are not incremental sales. They are organic purchases that your ads are simply taking credit for. This creates a dangerous gap between your reported ROAS and your actual revenue, a problem we detail in our post at /blog/attribution-platform-roas-revenue-gap.

Imagine a shopper in Amsterdam. They follow your Dutch beauty brand on Instagram, see your products in a local boutique, and then finally purchase after seeing a retargeting ad on Facebook. Meta claims 100% of the credit. The reality is a complex causality chain involving multiple touchpoints. Your Meta ad was just one link in that chain, and likely not the most important one. Relying on Meta's numbers leads you to over-invest in channels that are not driving new growth, but are instead cannibalizing your organic demand.

This problem is magnified for high-growth Dutch e-commerce brands. As you scale your ad spend, the overlap between channels increases. A customer might see your ad on TikTok, search for your brand on Google, and then convert through a Meta retargeting campaign. Each platform will claim full credit for the sale. This isn't just a measurement error. It's a fundamental misunderstanding of customer behavior. You end up in a bidding war against yourself, paying multiple times to acquire the same customer. The result is a plateau in growth and a decline in profitability, a scenario we explore in our analysis of /blog/channel-cannibalization-meta-tiktok. Breaking through this plateau requires a shift from attribution to causality.
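To make the double counting concrete, here is a minimal Python sketch. It assumes you can export, for each platform, the list of order IDs that platform claims credit for; the platform names, order IDs, and the `overlap_summary` helper are all hypothetical illustration data, not a real export format.

```python
from collections import Counter

# Hypothetical data: order IDs each platform claims credit for.
claimed_orders = {
    "meta": ["A1", "A2", "A3", "A4"],
    "google": ["A2", "A3", "A5"],
    "tiktok": ["A3", "A6"],
}

def overlap_summary(claims):
    """Compare the sum of platform-claimed orders with the distinct orders behind them."""
    # Count how many platforms claim each order (deduplicated within a platform).
    counts = Counter(order for orders in claims.values() for order in set(orders))
    return {
        "total_claimed": sum(counts.values()),   # what the dashboards add up to
        "unique_orders": len(counts),            # orders that actually exist
        "multi_claimed": sum(1 for n in counts.values() if n > 1),
    }

summary = overlap_summary(claimed_orders)
print(summary)
```

In this made-up data the platforms claim nine orders between them, but only six distinct orders exist. Note that deduplication only exposes overlap. It does not tell you which of those six orders were actually caused by ads, which is what a holdout test is for.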

The Gold Standard: Measuring True Incrementality with a Holdout Test

A holdout test is a scientific method for measuring the true causal impact of your advertising. Unlike ROAS, which relies on flawed correlations, a holdout test isolates the incremental sales generated by your ads by comparing a group that sees your ads to a control group that does not. This reveals your actual ROI.

To escape the ROAS trap, you need to measure incrementality. Incrementality tells you the percentage of sales that would not have happened without your ads. The most reliable way to measure this is with a holdout test. A holdout test on Meta Ads is the scientific method applied to marketing. You create two statistically comparable groups from your target audience by random assignment: a test group that sees your ads, and a holdout group that does not. By comparing the purchasing behavior of the two groups, you can isolate the true causal impact of your campaigns. You are no longer looking at correlation. You are measuring causation, a foundational concept in econometrics and marketing science [1].

The primary fear marketers have about holdout tests is lost revenue. The idea of intentionally not showing ads to a portion of your audience feels like leaving money on the table. This is a misconception. A properly structured holdout test withholds ads from only a small portion of your audience (typically 10-20%), just large enough to produce a statistically reliable result, and only for a limited time. The insights gained from a short-term test can be worth far more in the long run than the small amount of revenue you risk by creating a holdout group. It is the difference between flying blind and having a GPS. For a deeper dive into the concept, see our foundational post at /blog/incrementality-testing-ads-work.

How to Run a Holdout Test in Meta Ads Manager

Running a holdout test in Meta Ads Manager involves using the built-in "Lift Test" tool. This feature automates the process by creating a randomized control group that is excluded from seeing your ads, allowing you to directly compare their behavior against the test group and measure the true, incremental impact of your campaigns.

Meta has a built-in tool for running these tests, which they call a "Lift Test." Setting one up is a straightforward process that gives you a far more defensible measure of your ads’ real value than the dashboard does. This is the framework for empowerment, giving you the tools to challenge the platform’s data. For technical details, you can consult Meta's developer portal.

Step 1: Define Your Test and Audience

Before you start, know what you want to measure. Are you testing a specific campaign, a channel, or your entire Meta ads strategy? For your first test, keep it simple. Let’s say you want to measure the incrementality of your top-of-funnel prospecting campaign in the Netherlands. Your audience is defined, your campaign is running. Now it is time to create the test.

Step 2: Create the Lift Test in Meta's Experiments Tool

Navigate to the "Experiments" tool in your Meta Ads Manager. Select "Lift Test." Meta will guide you through the setup. You will choose the campaigns you want to include in the test. Meta then automatically creates the holdout group. It randomly splits your defined audience, typically into a 90% test group (who can see your ads) and a 10% holdout group (who cannot). This process is critical. The randomization ensures that the only significant difference between the two groups is their exposure to your ads. It removes selection bias, a common flaw in less rigorous testing methods [2]. Without a randomized control group, you can never be certain if the outcomes you see are due to your ads or to pre-existing differences in the audience segments.
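Meta performs the randomization internally, so nothing below is needed to run a Lift Test. As a rough illustration of what a random 90/10 split looks like, here is a minimal Python sketch of deterministic hash-based bucketing, the kind of split you might apply to your own customer list. The `assign_group` function, the test name, and the user IDs are hypothetical, and this is not Meta's implementation.

```python
import hashlib

def assign_group(user_id: str, test_name: str, holdout_share: float = 0.10) -> str:
    """Deterministically assign a user to 'test' or 'holdout'.

    Hashing the test name together with the user ID gives each user a
    pseudo-random but stable bucket, independent of their behavior.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # roughly uniform in [0, 1)
    return "holdout" if bucket < holdout_share else "test"

# Hypothetical audience of 10,000 user IDs.
users = [f"user-{i}" for i in range(10_000)]
groups = [assign_group(u, "nl-prospecting-lift") for u in users]
holdout_rate = groups.count("holdout") / len(groups)
print(f"Holdout share: {holdout_rate:.1%}")
```

Because assignment depends only on the hash, a user always lands in the same group, and the split cannot be influenced by who looked like a likely buyer. That is the property that removes selection bias.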

Step 3: Launch and Run the Test

Determine the test duration. It needs to be long enough to collect enough conversions in both groups for a statistically reliable result. For most e-commerce brands, a 2 to 4 week period is sufficient. During this time, do not make major changes to your campaigns or budget. Consistency is key to isolating the variable you are testing: the ads themselves. For the holdout group, this is a limited window in which their natural behavior is observed without ad influence.
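How long is "long enough" depends on your baseline conversion rate and the lift you hope to detect. The sketch below uses the standard normal-approximation sample size formula for comparing two proportions with unequal group sizes. It is a planning heuristic under stated assumptions, not Meta's methodology: the function name, the 2% baseline rate, the 20% relative lift, the 80% power, and the weekly reach figure are all hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist

def required_users(base_rate, relative_lift, holdout_share=0.10,
                   confidence=0.90, power=0.80):
    """Approximate users needed per group to detect a given relative lift.

    Uses a one-sided two-proportion normal approximation with unpooled variance.
    """
    p_h = base_rate                        # expected holdout conversion rate
    p_t = base_rate * (1 + relative_lift)  # expected test conversion rate
    k = (1 - holdout_share) / holdout_share  # test users per holdout user
    z_alpha = NormalDist().inv_cdf(confidence)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_t * (1 - p_t) / k + p_h * (1 - p_h)
    n_holdout = ceil((z_alpha + z_beta) ** 2 * variance / (p_t - p_h) ** 2)
    n_test = ceil(n_holdout * k)
    return n_test, n_holdout

n_test, n_holdout = required_users(base_rate=0.02, relative_lift=0.20)
weekly_reach = 20_000  # hypothetical: users your campaign reaches per week
weeks_needed = ceil((n_test + n_holdout) / weekly_reach)
print(n_test, n_holdout, weeks_needed)
```

The smaller the lift you want to detect, the more users (and therefore weeks) you need, which is why low-volume accounts often have to run longer tests.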

Step 4: Analyze the Results and Calculate Incremental Lift

Once the test concludes, Meta provides a detailed report. The key metric is "Incremental Lift Percentage." This shows the percentage increase in conversion rate in the test group compared to the holdout group. Rates matter here, not raw counts, because the two groups are different sizes. The report will also show a confidence level. You are looking for a confidence level of 90% or higher. In the report, this is read as the likelihood that the observed lift was caused by your ads rather than by random chance.

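The formula for incremental sales is simple:

Incremental Conversions = [(Conversions in Test Group / Users in Test Group) - (Conversions in Holdout Group / Users in Holdout Group)] × Users in Test Group

The bracketed term is the difference in conversion rate between the two groups. Multiplying it by the users in the test group, the only people who could see your ads, gives an estimate of the number of sales your ads generated that would not have happened otherwise. This is the number you should be basing your budget decisions on, not the inflated ROAS from your dashboard.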
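As a worked example, here is a minimal Python sketch that scales the difference in conversion rates by the size of the test group and then derives an incremental ROAS. All inputs (group sizes, conversion counts, the €60 average order value, the €9,000 spend) and both function names are made up for illustration.

```python
def incremental_conversions(test_conv, test_users, holdout_conv, holdout_users):
    """Return (incremental conversions, relative lift) for a holdout test."""
    test_rate = test_conv / test_users
    holdout_rate = holdout_conv / holdout_users
    # Conversions the test group would have produced with no ads at all.
    expected_without_ads = holdout_rate * test_users
    incremental = test_conv - expected_without_ads
    relative_lift = (test_rate - holdout_rate) / holdout_rate
    return incremental, relative_lift

def incremental_roas(incremental, avg_order_value, ad_spend):
    """Revenue caused by the ads divided by what the ads cost."""
    return incremental * avg_order_value / ad_spend

# Hypothetical 90/10 test: 3.0% conversion rate with ads, 2.5% without.
incremental, lift = incremental_conversions(
    test_conv=2_700, test_users=90_000, holdout_conv=250, holdout_users=10_000
)
iroas = incremental_roas(incremental, avg_order_value=60.0, ad_spend=9_000.0)
print(f"{incremental:.0f} incremental conversions, {lift:.1%} lift, iROAS {iroas:.2f}")
```

With these inputs the ads account for about 450 of the 2,700 test-group conversions, a 20% lift, and an incremental ROAS of 3.0. A dashboard that credits all 2,700 conversions to the ads would report a figure six times higher.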

The confidence level is your guard against making decisions based on random noise. A 90% confidence level doesn't just mean the result is 'good'. Roughly speaking, it means a lift this large would be unlikely to show up through random variation alone if your ads had no effect. It is a strong safeguard against fluke results, not a guarantee, and it says nothing about how large the true lift is. Anything less than 80% should be treated with extreme caution. Making budget decisions on a low-confidence result is no better than guessing. It's a direct path to wasting money on campaigns that feel effective but have no real impact on your bottom line.
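If you want to sanity-check a confidence figure yourself, one classical approximation is a one-sided two-proportion z-test. The sketch below is an assumption-laden stand-in: Meta's own methodology may differ and its reported number will not necessarily match. The `lift_confidence` name and the input counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def lift_confidence(test_conv, test_users, holdout_conv, holdout_users):
    """One minus the one-sided p-value of a pooled two-proportion z-test.

    Values near 1.0 mean a positive lift this large is unlikely under
    'ads have no effect'; 0.5 means the two groups converted identically.
    """
    p_t = test_conv / test_users
    p_h = holdout_conv / holdout_users
    pooled = (test_conv + holdout_conv) / (test_users + holdout_users)
    se = sqrt(pooled * (1 - pooled) * (1 / test_users + 1 / holdout_users))
    z = (p_t - p_h) / se
    return NormalDist().cdf(z)

# Hypothetical results from a 90/10 test.
confidence = lift_confidence(2_700, 90_000, 250, 10_000)
print(f"Confidence that lift is positive: {confidence:.1%}")
```

The same observed lift produces lower confidence on a smaller test, which is why a short or low-volume test can show an attractive lift percentage that you still should not act on.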

From Manual Tests to Automated Intelligence

Automated behavioral intelligence platforms like Causality Engine provide a continuous, real-time view of incrementality without the need for manual holdout tests. By using causal inference, these systems analyze causality chains to identify the true drivers of growth and eliminate wasted ad spend on cannibalistic channels, offering a more precise and scalable solution.

Holdout tests are a powerful tool, but they are a snapshot in time. Consumer behavior changes, market dynamics shift, and a test you ran last quarter may not be relevant today. Running them continuously is impractical for most teams. This is where behavioral intelligence becomes essential. Causality Engine is a behavioral intelligence platform that uses causal inference to replace broken marketing attribution for ecommerce brands.

Causality Engine automates this process of discovery. Instead of running manual, high-stakes holdout tests, our platform uses causal inference to continuously measure the incremental impact of every channel in your marketing mix. We analyze the complex causality chains that lead to a purchase, identifying the true drivers of growth and exposing the cannibalistic channels that are stealing credit. We provide a real-time, always-on view of your incremental sales, allowing you to sharpen your ad spend with a level of precision that manual testing can never achieve. You can use our /tools/waste-calculator to see how much you could be saving.

Frequently Asked Questions

What is a holdout test in marketing?

A holdout test is an experiment where a portion of your target audience (the holdout group) is intentionally not shown advertising for a specific period. By comparing the conversion rate of the holdout group to the group that did see the ads (the test group), marketers can measure the true incremental lift, or the sales that were directly caused by the advertising.

How long should I run a holdout test?

The duration of a holdout test depends on your sales cycle and conversion volume. For most Dutch Shopify beauty and fashion brands, a test duration of 2 to 4 weeks is typically sufficient to achieve a statistically significant result with a high confidence level (over 90%).

What is a good incremental lift percentage?

A "good" incremental lift percentage varies by industry, margin, and business goals. However, a lift percentage below 10-15% should be a cause for concern, as it suggests your ads are primarily capturing customers who would have converted anyway. The goal is to find the point where your ad spend is generating the highest possible number of new customers, not just retargeting existing ones.

Can I run a holdout test for other channels like Google or TikTok?

Yes, the principles of holdout testing and incrementality measurement apply to all marketing channels. The setup process differs by platform (Google Ads, for example, offers its own Conversion Lift studies, and geo-based experiments are a common alternative where user-level holdouts are not available), but the methodology of comparing a test group to a control group is the universal standard for measuring causal impact. This is crucial for understanding your true media mix effectiveness.

How does a holdout test differ from A/B testing?

A/B testing compares two variations of an ad or landing page to see which performs better. A holdout test measures the impact of the advertising itself by comparing a group that sees ads to a group that sees no ads. A/B testing optimizes creative, while holdout testing validates budget.

References

[1] Varian, H. R. (2016). Causal inference in economics and marketing. Proceedings of the National Academy of Sciences, 113(27), 7310-7315. https://www.pnas.org/doi/abs/10.1073/pnas.1510479113

[2] Lewis, R. A., & Rao, J. M. (2015). The unfavorable economics of measuring the returns to advertising. The Quarterly Journal of Economics, 130(4), 1941-1973. https://academic.oup.com/qje/article/130/4/1941/1869538

[3] Gordon, B. R., Zettelmeyer, F., Bhargava, N., & Chapsky, D. (2019). A comparison of approaches to advertising measurement: Evidence from big field experiments at Facebook. Marketing Science, 38(2), 193-225. https://pubsonline.informs.org/doi/abs/10.1287/mksc.2018.1141
