
The Validated Mock Data Principle

Why your MVP compliance scores must come from real sources — not invented numbers

16 March 2026 · Risto Anton · Lifetime Oy

Every SaaS product starts with mock data. Demo dashboards, sample scores, placeholder charts. There is nothing wrong with that — you need to ship before you have real users.

But there is a massive difference between invented mock data and validated mock data. Validated data builds trust. Invented data destroys it the moment a compliance officer asks: "Where does this number come from?"

The Problem with Fake Scores

In EU regulatory compliance, numbers carry legal weight. When your dashboard shows a company's CSRD readiness at 72%, that number implies a methodology. When an auditor asks for the source, Math.random() * 100 is not an acceptable answer.

What most MVPs do:

// "Looks about right"
const csrdScore = 72;
const etsScore = 85;
const cbamScore = 45;
const overallScore = Math.round((csrdScore + etsScore + cbamScore) / 3);

No methodology. No source. No way to defend these numbers in a meeting.

This approach has three failure modes:

  1. Trust collapse — The first enterprise buyer who asks "where does 72% come from?" gets no answer. Deal lost.
  2. Migration debt — When you finally connect real data, every number changes. Users think the product is broken.
  3. Legal risk — In regulated industries, showing compliance scores without methodology can be classified as misleading representation.

The Validated Mock Data Principle

MVP features may use static data, but every value must cite a real, verifiable source.

Mock data is production-ready data that simply has not been personalized yet.

Instead of inventing numbers, you source every mock value from the same official data that the real system will eventually use. The result: your demo data is real data — it is just sector-average instead of company-specific.

What validated mock data looks like:

// EU ETS benchmark — Commission Implementing Regulation 2021/447
{
  sector: 'steel',
  product: 'Hot metal',
  factor: 1.328,
  unit: 'tCO2e/t product',
  source: 'EU ETS Benchmark 2021-2025',
  methodology: 'ets-benchmark',
  year: 2024
}

Exact source. Auditable. Upgrades to company-specific data with zero migration.
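
In code, the record above might carry a type like this (a minimal sketch in TypeScript; the methodology union is illustrative, not our full list):

// Sketch: a typed record for validated mock values.
// The methodology union is illustrative, not exhaustive.
type Methodology = 'ets-benchmark' | 'cbam-default' | 'eea-sector-average';

interface ValidatedValue {
  sector: string;
  product: string;
  factor: number;
  unit: string;             // e.g. 'tCO2e/t product'
  source: string;           // required: every value carries its citation
  methodology: Methodology;
  year: number;
}

Because source is required rather than optional, a value without a citation fails type-checking before it ever reaches a dashboard.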

How We Apply It at DWS IQ

Our compliance scoring engine uses this principle at every layer. Here are the six data categories and their validated sources:

Data Type             | Source                                                            | Example
----------------------|-------------------------------------------------------------------|--------------------------------
Emission factors      | EU ETS benchmarks (Reg. 2021/447), CBAM defaults (Reg. 2023/1773) | 1.328 tCO2e/t hot metal
Sector averages       | EEA Greenhouse Gas Inventory, Eurostat SBS                        | Steel median: 120,000 tCO2e/yr
Carbon prices         | ECB Statistical Data Warehouse, ICE/ECX auction                   | €65/tCO2e (2024 avg)
Regulatory thresholds | EUR-Lex official regulation text                                  | CBAM: €150,000 import threshold
ESRS disclosures      | EFRAG ESRS 2023 final set (82 disclosure requirements)            | E1-6: Gross Scope 1, 2, 3 GHG
Company data          | Published ESG reports, annual reports                             | Extracted via regex + LLM
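
Each row of that table can live in the codebase the same way (a sketch; the record names and reference years are illustrative):

// Sketch: table rows as sourced records. Values mirror the table above;
// the reference years are illustrative.
interface SourcedValue {
  value: number;
  unit: string;
  source: string;  // required citation
  year: number;
}

const carbonPrice2024: SourcedValue = {
  value: 65,
  unit: 'EUR/tCO2e',
  source: 'ECB Statistical Data Warehouse / ICE-ECX auction (2024 average)',
  year: 2024,
};

const steelSectorMedian: SourcedValue = {
  value: 120_000,
  unit: 'tCO2e/yr',
  source: 'EEA Greenhouse Gas Inventory, steel sector median',
  year: 2024,
};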

The Upgrade Path Is Built In

The beauty of validated mock data is that upgrading to real data is not a migration — it is a configuration change. The scoring engine does not care whether the input came from a sector benchmark or a company's actual ESG report:

Layer 1 — Sector Benchmark: no company data needed. Score based on EEA sector averages. Available immediately.
Layer 2 — Report Extraction: paste ESG report text; regex extracts emissions, measures, and ESRS disclosures. Score updates in seconds.
Layer 3 — Live Company Data: connected to the compliance profile database. Real-time scoring from verified company data, with a full audit trail.

Same scoring engine. Same regulatory thresholds. Same deterministic logic. The only thing that changes is the input source — and each layer is strictly better than the last.
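
A sketch of what that configuration change looks like, assuming an illustrative input shape and a placeholder scoring formula (neither is our actual engine):

// Sketch: one input shape for all three layers. The engine never
// inspects where the data came from; shape and formula are illustrative.
type InputLayer = 'sector-benchmark' | 'report-extraction' | 'live-data';

interface ScoreInput {
  layer: InputLayer;
  emissionsTCO2e: number;
  source: string;  // the citation travels with the value
}

// Pure function: same input, same output, for every layer.
function scoreAgainstThreshold(input: ScoreInput, thresholdTCO2e: number): number {
  const ratio = input.emissionsTCO2e / thresholdTCO2e;  // placeholder logic
  return Math.max(0, Math.min(100, Math.round((1 - ratio) * 100)));
}

// Moving a company from Layer 1 to Layer 3 swaps the record, not the engine:
const benchmarkInput: ScoreInput = {
  layer: 'sector-benchmark',
  emissionsTCO2e: 120_000,  // EEA steel sector median, from the table above
  source: 'EEA Greenhouse Gas Inventory',
};

const liveInput: ScoreInput = {
  layer: 'live-data',
  emissionsTCO2e: 98_400,   // hypothetical company figure
  source: 'Verified compliance profile',
};

// scoreAgainstThreshold(benchmarkInput, 150_000) and
// scoreAgainstThreshold(liveInput, 150_000) run identical logic.

Nothing downstream of the input changes, which is what makes the upgrade a configuration change rather than a migration.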

Five Rules for Validated Mock Data

  1. Every mock value has a source field — If you cannot cite the source, the number does not belong in your codebase.
  2. Use the same data structures as production — Mock data flows through the same scoring engine, same types, same validation. No separate "demo mode."
  3. Sector averages are the right default — EEA and Eurostat publish free, public sector data. Use median values; medians are less skewed by outliers than means.
  4. Show the data source in the UI — Users must always know whether they are seeing benchmark data or their own data: a green dot for live, amber for benchmark.
  5. Deterministic scoring, no randomness — The same input must always produce the same output. No Math.random(), no timestamps in scoring logic, no LLM in the critical path. Rules 1 and 5 are enforced mechanically, as sketched after this list.
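
Rules 1 and 5 lend themselves to mechanical enforcement. A minimal sketch, with hypothetical names and a placeholder aggregation formula:

// Sketch: enforce rule 1 (source required) and rule 5 (determinism).
// 'MockValue', 'requireSource', and the averaging are illustrative,
// not our production API.
interface MockValue {
  value: number;
  source?: string;
}

// Rule 1: refuse any value that arrives without a citation.
function requireSource(v: MockValue): asserts v is Required<MockValue> {
  if (!v.source || v.source.trim() === '') {
    throw new Error(`Mock value ${v.value} has no source; it does not belong in the codebase`);
  }
}

// Rule 5: scoring is a pure function of its inputs.
// No Math.random(), no Date.now(), no network or LLM calls.
function scoreValues(values: MockValue[]): number {
  for (const v of values) requireSource(v);
  const total = values.reduce((sum, v) => sum + v.value, 0);
  return Math.round(total / values.length);  // placeholder aggregation
}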

Bottom Line

Fake compliance scores are worse than no scores. They create false confidence, fail under scrutiny, and require painful migration when real data arrives.

Validated mock data costs the same effort to build — you are just sourcing from EU regulatory databases instead of your imagination. And when a compliance officer asks "where does this 42% CSRD readiness come from?", you can answer: "EFRAG ESRS 2023 final set, 82 mandatory disclosures, cross-referenced with EEA sector averages for steel (NACE C24). Here is the breakdown."

That answer closes deals. The other one loses them.
