UCSF Archives & Special Collections awarded grant to archive data, documents, and social media of The COVID Tracking Project at The Atlantic

UCSF Archives & Special Collections (A&SC) has been awarded a grant from the Alfred P. Sloan Foundation to compile and archive the data products, public websites, social media, and select internal documents of The COVID Tracking Project (CTP). The project was a citizen-science initiative housed at The Atlantic magazine that tracked COVID data from March 7, 2020, to March 7, 2021. It had a tremendous impact on public, media, scientific, and governmental understanding of and response to the pandemic. This $249,866 grant will help preserve the products and culture of a unique organization created in difficult times.

The CTP’s data products include testing, outcomes, and hospitalization data that were used by thousands of news organizations and millions of individuals to understand the early phases of the pandemic. The project’s Racial Data Tracker and Long-term Care Tracker highlighted the different ways the pandemic was impacting people of color and residents of nursing homes and similar facilities. Funding from the grant will help ensure these critical datasets are preserved in Dryad and immediately available to researchers in public health.

Because the project existed only online, archiving it will require new approaches to storing data from tools like Slack, GitHub issues, and Google Drive. Unlike standalone digital files such as a Microsoft Word document, data in these tools have multiple levels of interface and context that are not easily preserved. The grant will support developing tools for archiving these rapidly adopted forms of communication and releasing them as open source for other archiving projects.

Every datapoint collected by the project was the result of multiple discussions, revisions, and public inquiry. Capturing the entire history of, say, the total number of tests in California on November 22, 2020, requires reviewing Slack threads, GitHub issues, emails, spreadsheet revisions, and unique tools built by tracking project members. The grant will help build a “Data Explorer” that pulls all these disparate metadata into a single web interface so that researchers can understand the many contexts around every datapoint collected by the project.

“We’re extremely proud to support a digital preservation project capturing a remarkable record of online collaboration that also provides a unique blueprint for future archiving initiatives,” says Joshua Greenberg, director of the Sloan Foundation’s technology program. “The team is doing more than just creating a rich and valuable repository of a historic moment—it is generating novel and much-needed methods of storing information from modern technology platforms, an approach that will become invaluable as online collaborations increasingly become the norm.”

This 12-month project is being launched in January 2022 and will be overseen by an advisory board composed of former project staff and advisors with backgrounds in data science, medicine, history, and epidemiology. A&SC would like to thank Amanda L. French, Ph.D., former Community Lead at the COVID Tracking Project, and other supporters for their help with this proposal. Kevin Miller will serve as an archive lead for this grant project.

About the Sloan Foundation

The Alfred P. Sloan Foundation is a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge. Established in 1934 by Alfred Pritchard Sloan Jr., then-President and Chief Executive Officer of the General Motors Corporation, the Foundation makes grants in four broad areas: direct support of research in science, technology, engineering, mathematics, and economics; initiatives to increase the quality, equity, diversity, and inclusiveness of scientific institutions and the science workforce; projects to develop or leverage technology to empower research; and efforts to enhance and deepen public engagement with science and scientists.

About UCSF Archives & Special Collections

The mission of the UCSF Archives and Special Collections is to identify, collect, organize, interpret, and maintain rare and unique material to support research and teaching of the health sciences and medical humanities and to preserve institutional memory. Please contact Polina Ilieva, Associate University Librarian for Collections, with questions about this award.

A (Very) Brief Report Back from the Society of American Archivists

It’s been a whirlwind couple of weeks for me as I bounced from conference to conference, but as I settle back in it’s been exciting to collect my thoughts on what I’ve learned. While it’s still fresh in my memory, this is a brief report back from the largest conference I attended: the annual meeting of the Society of American Archivists (SAA), which was held last week in impossibly quaint Portland, OR.

As the digital archivist, I mostly spent my time in sessions focused on processing, preserving, and providing access to digital materials, in all the different forms that can take. One of the most fruitful of these was hosted by colleagues from UCLA, UCB, Stanford’s Hoover Institution, Cornell, and Emory, and was titled “What we talk about when we talk about processing born-digital.” This session reported on an effort to establish shared definitions for what it means to process born-digital archival collections. Because this field is so new, what is considered “processing” a collection at one institution might be a totally different set of tasks from those performed at another. To address this, the group is attempting to identify which steps are essential or recommended, and to assign different processing levels based on these frameworks.

Breaking all these steps out in a clear way is an immense amount of work, so I’m incredibly excited that my colleagues have begun to take on this huge task. It will help us all out in a massive way.

UCSF was well represented, as our own Polina Ilieva moderated several events, among them a meeting of the Science, Technology, and Health Care Section and a panel discussion on collecting and preserving contemporary science in institutional archives.

A very poor photo of Polina Ilieva taking over as Senior Co-Chair of the Science, Technology, and Health Care Section of the Society of American Archivists

Finally, some of my most interesting food for thought came from a panel on archival responses to climate change. The panel covered everything from Native Hawaiian community preservation of historic material endangered by sea-level rise to projects acquiring better data to map which archival repositories are likely to be most affected by a changing climate. Especially pertinent for my work was a presentation urging us as digital archivists to think more explicitly about what kinds of energy use we are engaging in through our different preservation practices. Simply put: current digital preservation practices rely on cheap data storage, and cheap data storage relies upon energy from fossil fuels. So where can we start to change that?

More updates soon as we start to engage with all these thoughts more directly at UCSF.

A Report Back from Personal Digital Archiving 2017

Post by Charlie Macquarie, UCSF Archives Digital Archivist

I spent most of last week down the peninsula for the convening of the Personal Digital Archiving (PDA) conference, now in its 7th year, and left with some fascinating thoughts and conversations in my mind. PDA “seeks to host a discussion across domains focusing on how to best manage personal digital material, be it at a large institution or in a home office.” As a result of this focus, it also ends up playing host to all kinds of fascinating new practices and approaches to collecting, preserving, providing access to, and even thinking about personal digital information.

A moment from the Born-Digital Archiving pre-PDA meetup, where archivists hover around a computer built to read 8-inch floppy disks, an almost impossible task these days

The conference covered a huge range of work, and included presentations on different ways to conceptualize digital space (screenshots, video game emulations, the list goes on), projects seeking to allow communities to directly transfer their digital materials to a library collection through apps or interfaces, and even a fascinating assessment of the way that teens store and access information about their personal finances (including the clincher that almost all ages show a tendency to simply discard financial information after a stated financial goal has been reached). Also included were some updates on the sustainability (or lack of it) of some of the field’s pioneering digital archives projects, like the Salman Rushdie papers at Emory University (hint: it’s still people, not machines, that are making it run).

Some presentations of particular interest to a health sciences institution like our own were those on the self-collection and assessment of health and other biometric data espoused by the Quantified Self movement. Quantified Self is a loosely organized group of people who collect and store data about themselves, and then use various computational and creative methods to analyze that data for self-insights framed as citizen science.

Gary Wolf gives the keynote on the Quantified Self movement.

Quantified Self (the formal organization) has just embarked on its first experiment to facilitate participants testing and analyzing their own blood, which has brought up a host of questions on the ethics of collecting and making public one’s own health data. Additionally, the project raises questions about the freedoms and constraints that tend to coalesce around these projects of “do it yourself” self-quantification (not to mention the often neglected questions around power and privilege that tinge the conversation around collection of, access to, and work with self-referential data). The approach taken by Quantified Self practitioners is surely different from ours here in the archives, but we still face similar issues as archivists in a health-sciences university, where historical information mixes with personal narrative and health data that is private in both the legal sense and the intimate emotional sense.

This forum was a fascinating opportunity to dig a bit deeper into the ideologies and practices behind the collection and preservation of personal digital material, and it seemed fitting that these questions were being explored in dialogue with all the people in the room. One of the biggest takeaways from the conference, after all, was that the tools and technologies to facilitate this work are often the focus of the intrigue and excitement, but that it’s the people who dedicate their time and resources to the endeavor that keep the whole thing running. Just as the Salman Rushdie Digital Collection requires the work of a cadre of dedicated digital archivists at Emory, the future of our digital past will require serious work by a broad and diverse community of archivists, technologists, historians, fanatics, and citizens.

One of the final audience comments was prescient in this regard: “it seems like what might be missing is a discussion of privilege in these projects.” Indeed, any community of practice is unlikely to persist for long if it doesn’t contain a diversity of interests.