Last month, Holly Case, a professor of history and the deputy director of the Data Science Institute, held a lunch meeting convening researchers, faculty members and library staff. A shared concern brought them there: losing federal data under the Trump administration.
Over the past two months, numerous federal data sets related to public health and the environment have been removed from government websites. But some professors have been trying to save the data before it is scrubbed.
“One of the reasons why I organized the lunch,” Case said, “was so that we could try to figure out what are the various constituencies that are affected on campus and what can be done.”
Frank Donnelly, the head of geographic information systems and data services at the University Library, has been working to archive this data.
Donnelly said that since Trump’s inauguration, there have been “librarians forming different groups across the country to make sure that we had backups of a lot of federal data sets.”
“That was something I was involved with kind of early on,” he said, adding that he created computer programs to capture and copy federal data sets he had worked with in the past.
“I was concerned about making sure that they were preserved, and so I created a workflow for doing that,” he said.
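The article does not describe Donnelly’s workflow in any detail. As an illustrative sketch only, the Python below shows one common pattern for this kind of capture, assuming a hypothetical public data file URL: download the file, save a timestamped copy and record a checksum so the archived copy can later be verified. It is a guess at the general technique, not Donnelly’s actual code.

```python
"""Illustrative sketch only: not Donnelly's actual workflow, and the URL below
is a placeholder. It shows one common pattern for capturing a public federal
data file: download it, save a timestamped copy, and record a SHA-256 checksum
so the archived copy can later be verified."""

import hashlib
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical placeholder standing in for any public federal data file.
DATASET_URL = "https://example.gov/data/air_quality.csv"
ARCHIVE_DIR = Path("archive")


def archive_dataset(url: str, archive_dir: Path) -> Path:
    """Download url, save a timestamped copy, and write a checksum file."""
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Fetch the raw bytes of the data set.
    with urllib.request.urlopen(url) as response:
        data = response.read()

    # Timestamp the copy (UTC) so repeated captures never overwrite each other.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    filename = f"{stamp}_{Path(url).name}"
    destination = archive_dir / filename
    destination.write_bytes(data)

    # Record a SHA-256 digest alongside the copy for later integrity checks.
    digest = hashlib.sha256(data).hexdigest()
    (archive_dir / (filename + ".sha256")).write_text(f"{digest}  {filename}\n")

    return destination


if __name__ == "__main__":
    print(archive_dataset(DATASET_URL, ARCHIVE_DIR))
```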
Donnelly is also involved with the Data Rescue Project, a group made up of various organizations focused on “data rescue-related efforts and data access points for public U.S. governmental data that are currently at risk,” according to its website.
“This data captures everything. It’s used by everybody,” Donnelly said.
Allan Just, an associate professor of public health, conducts research using public environmental and demographic data to study the impacts of poor air quality and heat waves.
This semester, he said he’s had to change lesson plans and find alternative data sources for students using public data in his classes.
“Some of the best data available on air pollution concentrations in foreign countries is collected at U.S. embassies and consulates, but this program is being shut down,” he wrote in an email to The Herald.
Just said that his research group has archived information from the National Institutes of Health and data sets from the National Oceanic and Atmospheric Administration.
The NOAA data sets have not yet been taken down, Just said, but it would be “a huge loss to scientific communities if they were to disappear.”
“The practical implications of losing government data are enormous,” Reuben Fischer-Baum, an adjunct professor and the senior data editor on the data services team at Yahoo News, wrote in an email to The Herald.
He explained that such data can guide public health resource allocation and economic decisions.
“They’re one of the most important ways in which we understand who we are as a country and how we’re changing,” Fischer-Baum wrote. “Losing that vantage point would be a tragedy.”
Federal data includes various data sets from sources such as the Centers for Disease Control and Prevention and the Bureau of Labor Statistics.
Students and professors often rely on these data sets for research projects, specifically because government data is “very comprehensive, and it’s free in the public domain,” he said.
“Researchers depend on and the public benefits from government data that are accessible, transparent and trustworthy,” Paul Testa, an assistant professor of political science, wrote in an email to The Herald.
Testa suspects public health and education are most likely to be affected by the data loss in the long term. But any scholar “has to at least ask what would happen if tomorrow they could no longer access data sets they took for granted,” he wrote.
While data sets have been removed or modified, Donnelly said that most of what has been published is still available.
“Most of the government websites are still functioning, but there have been enough changes, and everybody in every discipline has noticed at least something that has been changed or removed,” he said. “That’s why people are doing this.”
The bigger concern is the future of government data, Fischer-Baum and Just told The Herald.
“When a government survey gets canceled or reduced, you lose the data that would have been collected in the future, unless another institution steps in to fill the void,” Fischer-Baum wrote.
“This is not an easy lift, and it feels outside the scope of what the public alone can take on,” he added, “so I hope that universities like Brown are able to contribute here.”
Just said that while research teams can attempt to archive public data sets, terminating federal data collection programs can lead to a loss of continuity in long-running data sets, leading to the loss of “actionable information on which to base public health actions and policy decisions.”
“I’m grateful for the work of staff and faculty at Brown who have been helping with the systematic archiving of essential public data,” he wrote. “But it is hard to know what to grab because what is most at risk often isn’t known until it disappears, and the quantity of federal data is vast. It feels like we are making photocopies in the Library of Alexandria.”