UCSF Archives & Special Collections includes numerous digitized collections documenting health sciences topics ranging from institutional, community, and individual responses to illness and disease to industry impacts on public health. We make many of these collections available as data that can be computationally analyzed for health sciences and humanities research.
If you are curious about working with data from the UCSF Archives and Special Collections, the Digital Health Humanities (DHH) pilot program will showcase our “archives as data” throughout the month. In two upcoming sessions, we’ll provide an orientation to available data as well as methods for finding, accessing, and exploring these data resources:
- Finding Archives as Data for Digital Health Humanities: Industry Documents Library Collections, Monday, March 13, 10 a.m. – 12 p.m. PT
- Exploring Archives as Data for Digital Health Humanities: AIDS History Project Collections, University Archives, and More, Monday, March 20, 10 a.m. – 12 p.m. PT
Python for Data Analysis series workshops
DHH programming also continues to partner with the Data Science Institute (DSI) to offer workshops on tools and methods well suited to conducting research with “archives as data.” March workshops in the DSI Python for Data Analysis series will dig into text analysis using natural language processing and building machine learning models:
- Python Background Text Analysis and Natural Language Processing, Friday, March 10, 10 a.m. – 1 p.m. PT
- Machine Learning with Python and Scikit-Learn, Friday, March 17, 10 a.m. – 1 p.m. PT
- Machine Learning for Document Classification and Sentiment Analysis, Friday, March 24, 10 a.m. – 1 p.m. PT
Through these workshops and selected companion follow-up sessions with troubleshooting and guided process walkthroughs, researchers can learn and practice data analysis techniques and get familiar with data from our collections. Check out the library’s events calendar to find and register for the latest offerings!
OpenRefine workshops
If you have data you’d like to work with but it needs tidying and preparation, attend a DSI OpenRefine workshop. These workshops cover techniques for cleaning structured data, with no programming required. There will be two OpenRefine sessions this month:
- Cleaning Spreadsheet Data with OpenRefine, Monday, March 6, 1 – 2:30 p.m. PT
- OpenRefine for Archives as Data, Wednesday, March 8, 12 – 1:30 p.m. PT (This is a DHH companion session to the Cleaning Spreadsheet Data with OpenRefine DSI workshop and all are welcome.)
Slides, linked resources, and recordings from previously held DHH sessions are available on the CLE. There you will find materials from a Digital Health Humanities Overview session and recorded walkthroughs of Unix, Python, and Jupyter Notebook basics. Related resources will be added to the CLE following each DHH session.
Questions?
Please contact the DHH Program Coordinator, Kathryn Stine, at kathryn.stine@ucsf.edu. The UCSF Digital Health Humanities Pilot is funded by the Academic Senate Chancellor’s Fund via the Committee on Library and Scholarly Communication.