Data documentation guides

Provenance, limitations, and journalistic applications for every dataset in the archive.

Every dataset we archive ships with a documentation guide so the next person can trust it without re-deriving where it came from. A good guide answers four questions: what the data measures, where and when it came from, how it was collected and transformed, and what it cannot tell you.

We model our guides on established metadata standards rather than inventing our own.

Each guide pairs that structured metadata with a plain-language codebook (every field, its units, and its coded values) and a frank limitations section: known gaps, suppressed cells, definition changes across years, and the questions the data is not fit to answer. Where a series was altered or removed, we record what changed and when; that record can be the difference between a defensible story and a correction.
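As a rough illustration of how those pieces might fit together, here is a minimal Python sketch of a guide as a data structure, with a check for sections that are still empty. Everything in it is an assumption made for illustration: the class names, the field names, the `missing_sections` helper, and the example dataset are hypothetical and do not describe the archive's actual schema or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class CodebookField:
    """One codebook entry: a field, its units, and its coded values."""
    name: str
    description: str
    units: str = ""  # left empty for unitless fields
    coded_values: dict[str, str] = field(default_factory=dict)


@dataclass
class ChangeRecord:
    """What changed in a series, and when."""
    date: str  # ISO 8601 date, e.g. "2024-03-01"
    what_changed: str


@dataclass
class DocumentationGuide:
    measures: str  # what the data measures
    source: str    # where and when it came from
    method: str    # how it was collected and transformed
    limitations: list[str] = field(default_factory=list)  # what it cannot tell you
    codebook: list[CodebookField] = field(default_factory=list)
    changes: list[ChangeRecord] = field(default_factory=list)


def missing_sections(guide: DocumentationGuide) -> list[str]:
    """Return the names of required sections that are still empty."""
    required = ["measures", "source", "method", "limitations", "codebook"]
    return [name for name in required if not getattr(guide, name)]


# Hypothetical draft guide: the source and limitations are not yet written.
draft = DocumentationGuide(
    measures="Hypothetical example: monthly counts of reported incidents",
    source="",
    method="Downloaded as CSV; column names normalized",
    codebook=[CodebookField("incident_count", "Reported incidents", units="count")],
)

print(missing_sections(draft))  # ['source', 'limitations']
```

The change log is deliberately left out of the required list in this sketch, since a series that was never altered has nothing to record there.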

The goal is FAIR data (Findable, Accessible, Interoperable, Reusable), documented for a non-specialist audience.

Browse the public data archive, see how documentation fits our reproducible pipelines, or ask the help desk to document a dataset with you.
