Manifesto

What is public interest data science?

By Brian C. Keegan · Jun 2025 · 4 min read

Nearly two decades after the arrival of the “fourth paradigm” of data-intensive scientific discovery, it’s fair to ask: who has really benefited from this data revolution? Many of the promises of more data-driven decisions remain unfulfilled. Instead, we find ourselves in a world where employment is less secure, wealth is more concentrated, politics are more polarized, and our behavior is more surveilled than ever before. Data has not brought us the prosperity, security, or liberation we were promised.

Instead, much of the value generated by this data explosion has been captured by private interests. Corporations routinely harness data from customers, users, and employees to drive dashboards, predictions, recommendations, and artificial intelligence systems. Transparency, privacy, and justice have often been pushed aside in favor of efficiency, scalability, and profitability.

But data science doesn’t have to be this way.

Public interest data science is an approach to data that puts transparency, accountability, and justice at the forefront. Building on the legacies of journalism, law, accounting, and other professions with strong public service values, public interest data science prioritizes the general welfare and common good, not just commercial gain or academic prestige.

Journalism, for instance, has a constitutionally protected public mission. Investigative journalists use public records and freedom of information requests to expose injustices and hold the powerful to account. Data journalism takes this tradition further, combining computational tools with data analysis and visualization to tell compelling stories.

Public interest law advances civil rights, consumer protections, and environmental health through legal advocacy. Legal clinics and pro bono lawyers represent those marginalized by exploitative systems, forcing changes through litigation, legislation, and settlements.

Public interest accounting works to ensure that auditing and reporting foster transparency, trust, and ethical standards in financial practices. Public interest technology seeks to build software and tools that serve social needs. Many other disciplines, such as public health, social work, and urban planning, center their work on advancing the common good.

Public interest data science aspires to these same values: rigorous inquiry, democratic accountability, ethical conduct, and a commitment to the common good. This perspective challenges the belief that data is neutral or that “data speaks for itself.” In reality, data is shaped by human choices, power dynamics, and social context. Data can empower, but it can also exclude and cause harm. To serve the public interest, data must be interpreted, explained, debated, and held accountable in the public sphere.

A key ingredient in this work is access to data. While economic and political pressures increasingly limit access, many valuable open data resources remain available through federal, state, and local governments as well as from academics, journalists, and nonprofits.

Boulder and Colorado are fortunate in this respect. City, county, and state agencies provide well-documented, up-to-date open data portals, supported by strong freedom of information policies. Combined with a highly educated and civically engaged population and the area’s unique geography and demographics, Boulder is an ideal place to nurture a public interest data science effort.

My hope is that this publication will serve as a hub for a future public interest data science clinic at the University of Colorado Boulder. While data analysis alone cannot address the existential challenges of rising authoritarianism, accelerating climate change, and deepening inequality, coupling responsible data analysis with engaging stories focused on local issues can help build more resilient, connected, and responsive communities.

Everyone has a part to play in this effort: undergraduate and graduate students, faculty and staff, journalists, technologists, public servants, policymakers, and residents. At the same time, I recognize how much there is to learn from those who have come before. While I am on sabbatical until Fall 2026, I plan to use this space to reflect on how to do public interest data science.

I’ll share what I’m reading and learning about doing public interest data science and how to bring those lessons back to Boulder in 2026. Then, I plan to launch the University of Colorado Public Interest Data Science Clinic (CUPIDS Clinic) and recruit students from across the university to participate in a hybrid laboratory-newsroom. This newsletter will be central to sharing our stories, tutorials, resources, and reflections on doing public interest data science.

I invite you to join me in exploring what public interest data science should be, how it might work, and which relationships and values should be prioritized. I hope you’ll subscribe, share your questions and ideas, and help shape the conversation as we imagine new possibilities for data science in the public interest.
