Data liberation is the work of getting public-interest data out of wherever it’s stuck — a removed web page, a stack of PDFs, an undocumented portal — and into a clean, documented, durable form. The lab packages that workflow as two open-source, MIT-licensed repositories you can use today:
- data-liberation-skill — an agent skill that orchestrates a data-liberation project end to end: scoping the source, acquiring and extracting the data, validating it, and documenting its provenance.
- data-liberation-template — the working Python project the skill scaffolds from. Its scripts/scaffold.py copies the template to spin up a new, ready-to-run liberation project.
How it fits together
Point the skill at a source; it scaffolds a fresh project from the template, then walks the acquisition → cleaning → validation → documentation pipeline so the result is reproducible and citable. Every run is meant to leave behind both the data and a record of how it was obtained.
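To make the shape of that pipeline concrete, here is a minimal sketch of the four stages in Python. It is not the code in either repository: the function names, the sample data, and the `example://` source label are all hypothetical, and the real skill and template may organize these steps quite differently. The point it illustrates is that the provenance record is produced by the same run as the data.

```python
import csv
import hashlib
import io
import json
from datetime import datetime, timezone

# Hypothetical raw input standing in for a file fetched from a source.
RAW = "name, count\nAlpha, 3\nBeta, \nGamma, 7\n"


def acquire(raw_text: str) -> dict:
    """Acquisition: keep the raw text plus a hash and a retrieval timestamp."""
    return {
        "raw": raw_text,
        "sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }


def clean(raw_text: str) -> list[dict]:
    """Cleaning: parse the CSV and trim stray whitespace from keys and values."""
    reader = csv.DictReader(io.StringIO(raw_text), skipinitialspace=True)
    return [
        {key.strip(): (value or "").strip() for key, value in row.items()}
        for row in reader
    ]


def validate(rows: list[dict]) -> list[str]:
    """Validation: report rows whose count is missing or not a whole number."""
    problems = []
    for index, row in enumerate(rows, start=1):
        if not row["count"].isdigit():
            problems.append(f"row {index}: count is missing or not a whole number")
    return problems


def document(acquired: dict, rows: list[dict], problems: list[str], source: str) -> dict:
    """Documentation: assemble a provenance record for this run."""
    return {
        "source": source,
        "retrieved_at": acquired["retrieved_at"],
        "raw_sha256": acquired["sha256"],
        "row_count": len(rows),
        "validation_problems": problems,
    }


def run_pipeline(raw_text: str, source: str) -> tuple[list[dict], dict]:
    """Run the four stages in order and return the data with its provenance."""
    acquired = acquire(raw_text)
    rows = clean(acquired["raw"])
    problems = validate(rows)
    return rows, document(acquired, rows, problems, source)


# The source label below is a placeholder, not a real location.
rows, provenance = run_pipeline(RAW, source="example://hypothetical-source")
print(json.dumps(provenance, indent=2))
```

Hashing the raw input before any cleaning means a later reader can check that they are starting from the same bytes, which is one simple way to back up the "reproducible and citable" goal.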
When to reach for it
This is the engine behind the help desk’s everyday requests — “a website’s data disappeared”, “I have a bunch of PDFs”, and “combine data from different sources”.
Installation and exact usage live in each repo’s README. Browse our other tooling & code and tutorials & how-tos, or bring a dataset to the help desk.