Empirical notes
Legal Data Audit Checklist
A compact checklist for missingness, extraction drift, panel structure, priors, and interpretive restraint.
2026
The audit begins before the model.
A legal-data result is not serious until the data-generating path is visible enough to challenge. Missingness, extraction error, panel dependence, priors, and case composition are not footnotes. They decide whether the conclusion is stable.
record_span -> extracted_fact
extracted_fact = { value, status, source, uncertainty }
publishable_claim only if:
  source can be checked
  uncertainty is named
  denominators are stable
  reviewer action is recorded
  alternative explanations were tested
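The schema above can be sketched in Python. This is a minimal illustration, not a reference implementation: the class name `ExtractedFact`, the function `unmet_conditions`, the status vocabulary, and the source-locator format are all assumptions made for the example. The three reviewer-level conditions are passed in as plain booleans because the notes do not say how they are measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExtractedFact:
    value: Optional[str]        # None when nothing was extracted
    status: str                 # assumed vocabulary, e.g. "found" or "not_mentioned"
    source: Optional[str]       # assumed locator format, e.g. "order.pdf#p3:120-188"
    uncertainty: Optional[str]  # a named caveat, e.g. "OCR noise"; None if unnamed


def unmet_conditions(facts, denominators_stable, reviewer_action_recorded,
                     alternatives_tested):
    """Return the publication conditions that fail; an empty list means publishable."""
    checks = {
        # A claim resting on no facts at all cannot be checked either.
        "source can be checked": bool(facts) and all(f.source for f in facts),
        "uncertainty is named": bool(facts) and all(f.uncertainty for f in facts),
        "denominators are stable": denominators_stable,
        "reviewer action is recorded": reviewer_action_recorded,
        "alternative explanations were tested": alternatives_tested,
    }
    return [name for name, ok in checks.items() if not ok]


# Invented example records.
good = ExtractedFact("granted", "found", "order.pdf#p3:120-188", "OCR noise")
bare = ExtractedFact(None, "not_mentioned", None, None)

print(unmet_conditions([good], True, True, True))
print(unmet_conditions([good, bare], True, False, True))
```

Returning the list of unmet conditions, rather than a single boolean, keeps the reason for a blocked claim visible to the reviewer.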
Checklist
| Check | What to inspect | Why it matters |
|---|---|---|
| Missingness | Manual sample of "not mentioned" records. | If "not mentioned" actually means "model missed it," the gaps are patterned rather than random, and the result drifts with them. |
| Posterior predictive checks | Observed vs simulated data. | A model should reproduce the broad shape of the data it claims to explain. |
| Prior sensitivity | Rank stability. | If small prior changes reorder the result, the support is thinner than the headline. |
| Panel dependence | Shared case effects. | Three judge-vote rows from one case are not three independent worlds. |
| Track effects | Published vs routine decisions. | Case type can explain a visible difference that looks like actor-level disparity. |
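The last two rows of the table can be illustrated with a small stratified comparison. The sketch below uses invented vote data and hypothetical names (`Vote`, `panel`, `yes_rate`, judge groups "A" and "B"). It shows an aggregate gap between two groups that disappears once votes are compared within the same track, and it counts distinct cases alongside vote rows.

```python
from collections import defaultdict
from typing import NamedTuple


class Vote(NamedTuple):
    case_id: str
    track: str   # "published" or "routine"
    group: str   # hypothetical judge grouping, "A" or "B"
    yes: int     # 1 if the vote favors the claimant, else 0


def panel(case_id, track, votes):
    """Expand one case into its judge-vote rows."""
    return [Vote(case_id, track, group, yes) for group, yes in votes]


# Invented data: both groups vote identically within a track, but group A
# sits mostly on published cases and group B mostly on routine ones.
ROWS = (
    panel("p1", "published", [("A", 1), ("A", 1), ("B", 1)])
    + panel("p2", "published", [("A", 1), ("A", 1), ("B", 1)])
    + panel("p3", "published", [("A", 1), ("A", 1), ("B", 1)])
    + panel("p4", "published", [("A", 0), ("A", 0), ("B", 0)])
    + panel("r1", "routine", [("A", 1), ("B", 1), ("B", 1)])
    + panel("r2", "routine", [("A", 0), ("B", 0), ("B", 0)])
    + panel("r3", "routine", [("A", 0), ("B", 0), ("B", 0)])
    + panel("r4", "routine", [("A", 0), ("B", 0), ("B", 0)])
)


def yes_rate(rows, key):
    """Share of yes votes for each value of key(row)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [yes votes, all votes]
    for row in rows:
        bucket = totals[key(row)]
        bucket[0] += row.yes
        bucket[1] += 1
    return {k: yes / n for k, (yes, n) in totals.items()}


overall = yes_rate(ROWS, key=lambda r: r.group)
within_track = yes_rate(ROWS, key=lambda r: (r.track, r.group))
n_rows = len(ROWS)
n_cases = len({r.case_id for r in ROWS})

print(overall)        # group A looks more favorable in aggregate
print(within_track)   # the two groups match within each track
print(n_rows, n_cases)
```

In this toy data the aggregate rates are 7/12 for group A and 5/12 for group B, while the within-track rates are 0.75 (published) and 0.25 (routine) for both groups. The 24 vote rows come from only 8 cases, so any uncertainty estimate that treats the rows as independent would overstate the evidence.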
Product implication
The same rule applies to legal AI tools. A citation chip that appears only after the answer has already been generated is weak, because it decorates the answer without constraining it. The source path has to shape the answer while it is being made. The interface should expose files, pages, spans, dates, people, claims, and the path from each claim back to the material that produced it.
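One way to read "the source path has to shape the answer" is that a claim with no span never reaches the answer at all. The following is a minimal sketch under that assumption; `SourceSpan`, `Claim`, `compose_answer`, and the character-offset convention are hypothetical names chosen for illustration, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SourceSpan:
    file: str
    page: int
    start: int  # character offsets on the page; the unit is an assumption
    end: int


@dataclass(frozen=True)
class Claim:
    text: str
    spans: Tuple[SourceSpan, ...]


def compose_answer(claims):
    """Build the answer only from claims that carry at least one source span."""
    lines, withheld = [], []
    for claim in claims:
        if not claim.spans:
            # Unsupported claims are held back, not cited after the fact.
            withheld.append(claim.text)
            continue
        refs = "; ".join(
            f"{s.file} p.{s.page} [{s.start}:{s.end}]" for s in claim.spans
        )
        lines.append(f"{claim.text} ({refs})")
    return lines, withheld


# Invented example claims.
claims = [
    Claim("The motion was filed on 3 March.", (SourceSpan("docket.pdf", 2, 40, 96),)),
    Claim("The judge was skeptical of the motion.", ()),
]
answer, withheld = compose_answer(claims)
print(answer)
print(withheld)
```

Here the second claim is returned in `withheld` instead of appearing in the answer, which gives a reviewer a list of what the tool could not ground.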
Public rule
Good empirical work should make disagreement easier. A reader should be able to ask: what was counted, what was excluded, what was inferred, what denominator changed, and what would have to be true for the claim to fail?