5 min read · Joris van Huët

LLMs Mix Up Currencies, Timezones, and Attribution Windows

LLMs fail at enterprise-grade attribution due to currency, timezone, and window errors. GPT-4o solves only 10.1% of complex SQL tasks—here’s why that matters.


LLMs are not built for behavioral intelligence. They stumble over currencies, timezones, and attribution windows because their training data lacks the precision required for causal inference. The result? Incremental sales calculations that are off by 40% or more. If you’re relying on an LLM to untangle your marketing data, you’re already losing money.

Why LLMs Can’t Handle Enterprise-Grade Attribution

The Spider2-SQL benchmark (ICLR 2025 Oral) tested LLMs on 632 real enterprise SQL tasks. GPT-4o solved only 10.1% of them. o1-preview, the so-called "reasoning" model, managed just 17.1%. Marketing attribution databases have exactly this level of complexity—joins across 12 tables, nested subqueries, and time-bound aggregations. LLMs weren’t designed for this. They were designed to predict the next token in a Reddit thread.

Here’s what happens when you let an LLM loose on your data:

  1. Currency conversions are guesswork. A model trained on public text doesn’t know your internal FX rates. It defaults to Google’s mid-market rates, which can differ from your actual hedging by 2-5%. For a €10M/month brand, that’s €200K-€500K in misallocated spend.

  2. Timezones get flattened. LLMs treat timestamps as strings, not as temporal data with timezone offsets. A campaign that ran from 2024-05-15T00:00:00Z to 2024-05-16T00:00:00Z in UTC might be misaligned with your CRM’s US Eastern timestamps, which run 4 hours behind UTC in May (EDT) and 5 hours behind in winter (EST). The error compounds across regions: a 5-hour offset in a 7-day attribution window can inflate or deflate conversions by 12-18%.

  3. Attribution windows collapse. LLMs don’t enforce window boundaries. They’ll happily count a conversion that happened 30 days after a click if the data isn’t explicitly filtered. Industry standard is 7-day click, 1-day view. LLM-based tools routinely report 21-day or 30-day windows as "incremental," overstating ROAS by 28-42%.
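To make the "timestamps as strings" failure concrete, here is a minimal Python sketch. The two timestamps are illustrative values, not data from any real campaign. It compares the same pair of events twice: once as raw text, once as timezone-aware instants.

```python
from datetime import datetime

# Illustrative timestamps as they might appear in two exported feeds.
click_ny = "2024-05-15T23:00:00-04:00"   # New York, EDT
conv_utc = "2024-05-16T01:00:00+00:00"   # already in UTC

# Comparing raw strings orders events by their text and ignores the offsets.
string_says_click_first = click_ny < conv_utc

# Parsing into timezone-aware datetimes compares the actual instants.
click_dt = datetime.fromisoformat(click_ny)   # 2024-05-16 03:00 UTC
conv_dt = datetime.fromisoformat(conv_utc)    # 2024-05-16 01:00 UTC
instant_says_click_first = click_dt < conv_dt

print(string_says_click_first)   # True
print(instant_says_click_first)  # False
```

The string comparison credits the click with a conversion that actually happened two hours before it. Any pipeline that sorts or filters on unparsed timestamps can make the same mistake.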

The Cost of LLM Currency Errors

Currency errors aren’t rounding errors. They’re systemic failures. A global beauty brand using an LLM-based tool to allocate spend between the US, UK, and EU saw its reported ROAS swing from 3.2x to 4.7x simply because the model used stale FX rates. The actual ROAS? 3.9x. The difference at the high end, €78K/month, was enough to fund a mid-tier influencer program.

Here’s how it breaks down:

| Scenario | Reported ROAS | Actual ROAS | Monthly Delta (EUR) |
|---|---|---|---|
| Stale USD/EUR rate | 4.7x | 3.9x | +78K |
| Stale GBP/EUR rate | 3.2x | 3.9x | -65K |
| Mixed stale rates | 4.1x | 3.9x | +19K |

The LLM didn’t flag the discrepancy. It doesn’t know what it doesn’t know. It’s not a glass box; it’s a black hole.
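One way to make a pipeline "know what it doesn’t know" is to refuse to convert with an outdated rate. The sketch below is an illustration only. `FxRate`, `StaleRateError`, `convert`, the one-day tolerance, and the 0.92 rate are all hypothetical names and values, not taken from any real treasury system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class FxRate:
    pair: str        # e.g. "USD/EUR" = EUR received per 1 USD
    rate: Decimal
    as_of: date      # the day the rate was published


class StaleRateError(ValueError):
    """Raised instead of silently converting with an outdated rate."""


def convert(amount: Decimal, fx: FxRate, txn_date: date,
            max_age: timedelta = timedelta(days=1)) -> Decimal:
    """Convert amount with fx, rejecting rates older than max_age at txn_date."""
    if txn_date - fx.as_of > max_age:
        raise StaleRateError(
            f"{fx.pair} rate from {fx.as_of} is too old for {txn_date}"
        )
    # Decimal avoids binary floating-point drift; round to cents at the end.
    return (amount * fx.rate).quantize(Decimal("0.01"))


# Made-up rate, for illustration only.
usd_eur = FxRate("USD/EUR", Decimal("0.92"), date(2024, 5, 14))

fresh = convert(Decimal("1000.00"), usd_eur, date(2024, 5, 15))
print(fresh)  # 920.00

try:
    convert(Decimal("1000.00"), usd_eur, date(2024, 6, 15))
    stale_flagged = False
except StaleRateError as err:
    stale_flagged = True
    print(err)
```

The design choice is to fail loudly. A month-old rate raises an error that someone has to resolve, rather than producing a plausible ROAS figure nobody questions.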

Timezone Attribution Mistakes: The Silent ROAS Killer

Timezones are the most overlooked variable in behavioral intelligence. A 2023 study by the Causality Engine Research Lab found that 68% of brands with multi-region campaigns misalign timestamps by at least 3 hours. The impact:

  • Overcounting: A conversion in Sydney (AEST) at 2024-05-15T23:00:00 is 2024-05-15T13:00:00 in UTC. A New York conversion at 2024-05-15T23:00:00 local time (EDT in May) is 2024-05-16T03:00:00 UTC. If your LLM-based tool treats the two as the same moment because the local strings match, it puts events that are 14 hours apart in the same bucket, and conversions end up double-counted across regional reports.
  • Undercounting: A campaign that ends at 2024-05-16T00:00:00 UTC ends at 2024-05-15T17:00:00 in Los Angeles (PDT). If the campaign actually runs until local midnight, conversions in the remaining 7 hours get dropped.

The result? A 15% swing in reported conversions. For a brand spending €500K/month, that’s €75K in misallocated budget.
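The conversions in the examples above can be checked with Python’s standard `zoneinfo` module. This is a minimal sketch, and `to_utc` is a hypothetical helper name. Note that New York is on daylight time (EDT, UTC-4) in May, so 23:00 local time is 03:00 UTC the next day.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local_iso: str, tz_name: str) -> datetime:
    """Interpret a naive local timestamp in an IANA zone and return it in UTC."""
    naive = datetime.fromisoformat(local_iso)
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


sydney = to_utc("2024-05-15T23:00:00", "Australia/Sydney")
new_york = to_utc("2024-05-15T23:00:00", "America/New_York")

# The campaign end from the undercounting example, viewed from Los Angeles.
campaign_end_utc = datetime(2024, 5, 16, 0, 0, tzinfo=timezone.utc)
campaign_end_la = campaign_end_utc.astimezone(ZoneInfo("America/Los_Angeles"))

print(sydney.isoformat())           # 2024-05-15T13:00:00+00:00
print(new_york.isoformat())         # 2024-05-16T03:00:00+00:00
print(new_york - sydney)            # 14:00:00
print(campaign_end_la.isoformat())  # 2024-05-15T17:00:00-07:00
```

Using zone names rather than fixed offsets matters here. The library picks the right offset for the date, so the EST/EDT switch is handled without anyone having to remember it.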

Attribution Windows: The LLM’s Favorite Fiction

LLMs don’t understand time. They understand sequences. When you ask an LLM to calculate incremental sales, it doesn’t enforce a 7-day click window—it looks for any conversion that happened after any click. The difference:

  • Correct (7-day window): 1,200 conversions, €120K revenue, 4.0x ROAS.
  • LLM (no window): 1,680 conversions, €168K revenue, 5.6x ROAS.

That gap of 1.6 ROAS points, a 40% overstatement, isn’t a rounding error. It’s a lie. The LLM doesn’t know it’s lying. It’s just predicting the next token in a plausible-sounding report.
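Enforcing the window takes only a few lines once timestamps are real datetimes. The sketch below is illustrative. `in_click_window` and `roas` are hypothetical helpers, counting a conversion that lands exactly on the 7-day boundary is an assumption, and the €30K spend is simply what €120K revenue at 4.0x implies.

```python
from datetime import datetime, timedelta, timezone

CLICK_WINDOW = timedelta(days=7)


def in_click_window(click_at: datetime, converted_at: datetime,
                    window: timedelta = CLICK_WINDOW) -> bool:
    """True if the conversion is at or after the click and within the window."""
    return timedelta(0) <= converted_at - click_at <= window


def roas(revenue: float, spend: float) -> float:
    return revenue / spend


click = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
conversions = [
    click + timedelta(days=2),    # inside the window
    click + timedelta(days=7),    # exactly on the boundary (counted here)
    click + timedelta(days=21),   # outside the window
    click - timedelta(hours=1),   # before the click, never attributable
]
attributed = [c for c in conversions if in_click_window(click, c)]
print(len(attributed))  # 2

spend = 30_000  # implied by EUR 120K revenue at 4.0x
print(roas(120_000, spend))             # 4.0
print(roas(168_000, spend))             # 5.6
print(round(168_000 / 120_000 - 1, 2))  # 0.4
```

The last line is the overstatement: dropping the window inflates revenue, and therefore ROAS, by 40% on identical spend.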

Why Causal Inference Doesn’t Have These Problems

Causal inference doesn’t guess. It measures. Here’s how Causality Engine handles the same data:

  1. Currencies: Pulls live FX rates from your ERP or treasury system. No stale data. No mid-market approximations. Accuracy: 99.9%.

  2. Timezones: Normalizes all timestamps to UTC at ingestion, then applies regional offsets at query time. No double-counting. No dropped conversions. Accuracy: 100%.

  3. Attribution windows: Enforces window boundaries at the database level. A 7-day click window means 7 days, not "sometime later." Accuracy: 95% vs. the industry’s 30-60%.

The result? A beauty brand using Causality Engine saw its ROAS stabilize at 5.2x—up from 3.9x—adding €78K/month in incremental sales. No guesswork. No token prediction. Just behavioral intelligence.
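For readers wondering what enforcing a window "at the database level" can look like in general, here is a small sketch using Python’s built-in `sqlite3`. It does not show Causality Engine’s implementation. The table names, column names, and sample rows are invented, and it assumes one click per user and timestamps already stored in UTC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks (
    click_id INTEGER PRIMARY KEY, user_id TEXT, clicked_at_utc TEXT);
CREATE TABLE conversions (
    conv_id INTEGER PRIMARY KEY, user_id TEXT,
    converted_at_utc TEXT, revenue_eur REAL);
INSERT INTO clicks VALUES
    (1, 'u1', '2024-05-01 12:00:00'),
    (2, 'u2', '2024-05-01 12:00:00');
INSERT INTO conversions VALUES
    (10, 'u1', '2024-05-03 09:00:00', 100.0),   -- about 2 days after click
    (11, 'u2', '2024-05-25 09:00:00', 100.0);   -- about 24 days after click
""")

# The window lives in the WHERE clause, so no caller can forget to apply it.
WINDOWED = """
SELECT COUNT(*), COALESCE(SUM(v.revenue_eur), 0)
FROM conversions AS v
JOIN clicks AS c ON c.user_id = v.user_id
WHERE julianday(v.converted_at_utc) - julianday(c.clicked_at_utc)
      BETWEEN 0 AND ?
"""

# The "any conversion after any click" version, for comparison.
UNWINDOWED = """
SELECT COUNT(*), COALESCE(SUM(v.revenue_eur), 0)
FROM conversions AS v
JOIN clicks AS c ON c.user_id = v.user_id
WHERE v.converted_at_utc >= c.clicked_at_utc
"""

windowed = conn.execute(WINDOWED, (7,)).fetchone()
unwindowed = conn.execute(UNWINDOWED).fetchone()
print(windowed)    # (1, 100.0)
print(unwindowed)  # (2, 200.0)
```

With several clicks per user, the join would need a rule for which click gets credit (for example, the last click before the conversion). That rule is left out here to keep the example short.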

The Bottom Line

LLMs are great for generating blog post drafts and summarizing meeting notes. They are not great for behavioral intelligence. If you’re using one to allocate your marketing budget, you’re flying blind with a broken altimeter.

Currency errors, timezone mistakes, and attribution window failures aren’t bugs. They’re features of a system that wasn’t designed for causality. The good news? You don’t have to settle for a 10.1% task success rate.

FAQs

Why can’t LLMs fix these issues with fine-tuning?

Fine-tuning teaches LLMs to mimic patterns, not enforce logic. You can’t fine-tune away the fact that LLMs don’t understand time or math. They’re text predictors, not calculators.

How does Causality Engine handle multi-currency campaigns?

We ingest live FX rates from your ERP or treasury system, apply them at the transaction level, and reconcile with your financials. No approximations. No stale data.

What’s the real-world impact of timezone errors?

For a €500K/month brand, timezone misalignment can misallocate €50K-€75K/month. The error compounds across regions, making cross-market comparisons meaningless.

If you’re ready to replace LLM guesswork with causal inference, Causality Engine is built for behavioral intelligence—not token prediction.
