Data liberation toolkit

Most of the lab’s work starts the same way: public data is stranded — removed, scanned, scattered, or undocumented — and someone needs it whole. The data liberation toolkit turns that recurring work into reusable software, so a new rescue starts from a working pipeline instead of a blank file.

The toolkit is two open-source repositories: an agent skill that orchestrates a liberation project end to end, and a project template that the skill scaffolds each new project from. Together they standardize how we acquire, clean, validate, and document a dataset. The same method powers the At-Risk Federal Data Archive and our Center for Environmental Journalism collaboration.

It’s free to use and adapt. Read the practitioner guide, or bring a dataset to the help desk.