CRD Scoring: Measuring Confidence-Reality Divergence

Every large language model has the same blind spot: it doesn't know what it doesn't know. When a model generates a response with confident, authoritative language, there is no intrinsic signal telling you whether that confidence is warranted. The model treats "Paris is the capital of France" and "studies show 73% of users prefer dark mode" with identical certainty — even though one is a verifiable fact and the other is a fabricated statistic.

Confidence-Reality Divergence (CRD) was designed to close this gap. It is not a filter, not a classifier, and not a second opinion from another model. CRD is a mathematical measurement of the distance between an AI's expressed confidence and the evidence available to support it.

What CRD Measures

At its core, CRD answers one question: Is the AI's confidence in this claim proportional to the evidence supporting it? A claim backed by multiple verified sources should allow high confidence. A claim with no supporting evidence should trigger skepticism — regardless of how fluently the model expresses it.

The CRD score ranges from 0.0 to 1.0. A score of 0.0 means the AI's confidence is perfectly calibrated to available evidence. A score of 1.0 means maximum divergence — the AI is expressing certainty about something for which no evidence exists.

The Formula

CRD Formula:

CRD = min(1.0, |confidence - evidence| / max(floor, evidence) * domain_multiplier)

Where confidence is the model's expressed certainty (0-1), evidence is the Truth Store's verification score (0-1), floor is a domain-specific minimum denominator, and domain_multiplier scales sensitivity by context.

The formula has three critical design decisions built into it:

Absolute divergence. CRD measures the gap in both directions. Under-confidence (the model hedges on a verified fact) is also a divergence, though governance typically only acts on over-confidence.
Floor protection. The max(floor, evidence) term prevents division by near-zero evidence scores from producing wildly inflated CRD values. The floor varies by domain because the cost of a false positive varies by context.
Domain scaling. A CRD of 0.4 in a casual conversation is unremarkable. A CRD of 0.4 in a medical diagnosis is a governance event. The domain multiplier ensures the formula reflects real-world stakes.
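
The formula and the three decisions above can be sketched in a few lines of Python. This is a minimal illustration, not EVE AI Core's actual implementation: the function name crd_score, the input validation, and the default multiplier of 1.0 are assumptions, and the expression is read left to right (divide first, then multiply).

```python
def crd_score(confidence: float, evidence: float, floor: float,
              domain_multiplier: float = 1.0) -> float:
    """CRD = min(1.0, |confidence - evidence| / max(floor, evidence) * domain_multiplier).

    The name, the validation, and the default multiplier are illustrative assumptions.
    """
    for name, value in (("confidence", confidence), ("evidence", evidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if floor <= 0.0:
        raise ValueError("floor must be positive")

    divergence = abs(confidence - evidence)   # absolute divergence: both directions count
    denominator = max(floor, evidence)        # floor protection against near-zero evidence
    return min(1.0, divergence / denominator * domain_multiplier)  # domain scaling, capped at 1.0


print(crd_score(0.9, 0.9, floor=0.10))             # calibrated claim
print(crd_score(0.95, 0.0, floor=0.10))            # certainty with no evidence
print(round(crd_score(0.6, 0.9, floor=0.10), 3))   # under-confidence on a verified fact
```

A calibrated claim scores 0.0, certainty with no evidence hits the cap of 1.0, and the under-confident case still produces a nonzero score of about 0.333, which is the "absolute divergence" behavior described above.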

Domain-Specific Floors

The floor parameter is the most important tuning knob in the CRD formula. It determines how aggressively the system responds to low-evidence claims. Lower floors mean higher sensitivity; higher floors mean more tolerance for uncertainty.

Domain	Floor
Medical / Legal	0.01
Financial / Technical	0.10
Casual / Creative	0.40

The floor only takes effect when the evidence score falls below it. A medical floor of 0.01 therefore leaves the denominator almost entirely unclamped, so when evidence is scarce even a small gap between confidence and evidence produces a large CRD score. This is intentional: in healthcare, an AI expressing unwarranted certainty about a diagnosis can directly harm a patient. The cost of a false negative (failing to flag an unjustified claim) vastly exceeds the cost of a false positive (flagging a legitimate claim for review).

Conversely, a casual floor of 0.40 gives the model substantial room to express opinions, make creative suggestions, and engage in speculative conversation without triggering governance. Not every interaction requires forensic-grade verification.
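
To see the floor at work, the sketch below scores the same low-evidence claim under each of the three floors. The dictionary keys and the helper name crd_for_domain are hypothetical, and because the text does not give per-domain multiplier values, the multiplier is assumed to be 1.0 everywhere so that only the floor varies.

```python
# Floors are taken from the text above. The domain keys are illustrative names,
# and the multiplier is NOT specified in the text, so it is assumed to be 1.0.
DOMAIN_FLOORS = {
    "medical_legal": 0.01,
    "financial_technical": 0.10,
    "casual_creative": 0.40,
}


def crd_for_domain(confidence: float, evidence: float, domain: str,
                   domain_multiplier: float = 1.0) -> float:
    """Apply the CRD formula using the floor registered for the given domain."""
    floor = DOMAIN_FLOORS[domain]
    return min(1.0, abs(confidence - evidence) / max(floor, evidence) * domain_multiplier)


# Same claim everywhere: confidence 0.10, evidence 0.05 (a gap of only 0.05).
for domain in DOMAIN_FLOORS:
    print(domain, round(crd_for_domain(0.10, 0.05, domain), 3))
```

With these inputs the medical/legal score is 1.0, the financial/technical score is 0.5, and the casual/creative score is 0.125. The gap is identical in all three cases; only the denominator changes, because the evidence score of 0.05 sits below the two higher floors.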

The Truth Store

CRD depends on a source of ground truth to compute the evidence score. In EVE AI Core, this is the Truth Store — a structured repository of verified facts, source trust levels, and claim provenance chains.

When the CRD engine evaluates a claim, it queries the Truth Store for relevant evidence. The Truth Store returns an evidence score based on:

Source trust. Verified sources (official documentation, peer-reviewed data) contribute more than unverified sources. Trust levels range from VERIFIED (1.0) to UNTRUSTED (0.2).
Corroboration. Multiple independent sources confirming the same fact increase the evidence score. A single source, regardless of trust level, produces a lower score than three independent sources agreeing.
Recency. Facts have a time decay. A verified fact from 2024 is more reliable than one from 2018 when the claim involves current events or rapidly evolving domains.
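
The text does not say how the Truth Store combines these three factors, so the following is only one plausible sketch. Everything beyond the two trust endpoints (1.0 and 0.2) is an assumption made for illustration: the Source type, the exponential half-life decay, the per-source cap of 0.6, and the noisy-OR rule used for corroboration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    trust: float      # 1.0 = VERIFIED down to 0.2 = UNTRUSTED (endpoints from the text)
    age_years: float  # age of the supporting fact


def evidence_score(sources, half_life_years: float = 5.0,
                   per_source_cap: float = 0.6) -> float:
    """Hypothetical evidence score in [0, 1].

    Assumed rules, none of which come from the text:
      - recency: each source's weight halves every `half_life_years`
      - cap: no single source can contribute more than `per_source_cap`
      - corroboration: independent sources combine via noisy-OR
    """
    miss = 1.0  # probability-like "no source supports the claim" term
    for s in sources:
        recency = 0.5 ** (s.age_years / half_life_years)
        weight = per_source_cap * s.trust * recency
        miss *= 1.0 - min(1.0, max(0.0, weight))
    return 1.0 - miss


single = evidence_score([Source(trust=1.0, age_years=0.0)])
triple = evidence_score([Source(trust=1.0, age_years=0.0)] * 3)
stale = evidence_score([Source(trust=1.0, age_years=6.0)])
print(round(single, 3), round(triple, 3), round(stale, 3))
```

Under these assumptions a single fresh VERIFIED source scores 0.6, three agreeing sources of the same trust level score higher, and an older source scores lower. The cap is what makes a lone source, however trusted, fall short of three corroborating ones.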

CRD Thresholds in Practice

CRD Range	Interpretation	Governance Action
0.0 – 0.3	Well-calibrated confidence	No action — claim is supported by evidence
0.3 – 0.6	Moderate divergence	Review — add qualifiers, reduce assertiveness
0.6 – 0.8	Significant divergence	Flag — require human review or source citation
0.8 – 1.0	Critical divergence	Veto — block claim or force retraction

These thresholds are not static. The Control Plane adjusts them based on the stakes profile of the current interaction. In a safety-critical context (medical, legal, financial), thresholds shift downward: a CRD of 0.4 may trigger a veto. In a creative context, thresholds shift upward to allow expressive freedom.
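
A rough sketch of this lookup follows. The baseline cutoffs come from the table above; the action labels, the scale parameter, and the choice to move thresholds by multiplying them are assumptions, since the text does not describe how the Control Plane performs the adjustment. The table's ranges share their boundary values, so this sketch arbitrarily assigns a boundary score to the stricter tier.

```python
from bisect import bisect_right

BASE_CUTOFFS = (0.3, 0.6, 0.8)                       # from the thresholds table
ACTIONS = ("no_action", "review", "flag", "veto")    # illustrative labels


def governance_action(crd: float, scale: float = 1.0) -> str:
    """Map a CRD score to an action.

    `scale` is a hypothetical stand-in for the stakes profile:
    below 1.0 tightens the cutoffs, above 1.0 loosens them.
    """
    if not 0.0 <= crd <= 1.0:
        raise ValueError(f"crd must be in [0, 1], got {crd}")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    cutoffs = [min(1.0, c * scale) for c in BASE_CUTOFFS]
    # bisect_right places a score equal to a cutoff in the stricter tier.
    return ACTIONS[bisect_right(cutoffs, crd)]


print(governance_action(0.45))               # baseline
print(governance_action(0.45, scale=0.5))    # safety-critical: cutoffs halved
print(governance_action(0.70, scale=1.25))   # creative: cutoffs loosened
```

At baseline a CRD of 0.45 is a review; with the cutoffs halved it becomes a veto, and with the cutoffs loosened a CRD of 0.70 drops from a flag to a review.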

What CRD Is Not

CRD is explicitly not a content filter. It does not evaluate whether a claim is harmful, offensive, or policy-violating. It evaluates whether the AI's confidence in the claim is epistemically justified. A perfectly harmless claim can have a high CRD score (if the model expresses certainty without evidence), and a controversial claim can have a low CRD score (if it is well-supported by verified sources).

CRD doesn't ask if an AI is confident. It asks if that confidence is justified.

This distinction matters because it separates epistemic governance from content policy. Content policy is a matter of organizational values and regulatory requirements. Epistemic governance is a matter of structural integrity: an AI system that expresses unjustified certainty is structurally unreliable, regardless of whether the specific claim happens to be correct.

CRD is one component of the Three-Plane Architecture's Control Plane. It works alongside charter rules, cognitive locks, and domain-specific constraints to produce governance decisions that are deterministic, auditable, and provably correct — in under 2 milliseconds.
