Welcome to Industry Documents Library Data Science Fellows!

The Industry Documents Library (IDL) is excited to welcome three Data Science Fellows to our team this summer. The Data Science Fellows will be working with the IDL and with the UCSF Library Data Science Initiative (DSI) to assess the impact of transcription accuracy on text analysis of digital archives, using the IDL collections.

Through tagging, human transcription, and computer-generated transcription, the team will assess how accuracy may differ between media or document types, and how and whether this difference is more or less pronounced in certain categories of media (for example, video recordings of focus groups, community meetings, court proceedings, or TV commercials, all of which are present in the IDL’s video collections). After identifying transcript accuracy in different media types, we aim to provide guidelines to researchers and technical staff for proper analysis, measurement, and reporting of transcript accuracy when working with digital media.
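One common way to put a number on transcript accuracy is word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the automated transcript into the human one, divided by the length of the human transcript. The sketch below is only an illustration of that idea and is not the Fellows' actual workflow; the function names, the simple tokenizer, and the sample sentences are all assumptions made for this example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two transcripts, divided by
    the number of words in the reference (human) transcript."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up sample sentences, not taken from any IDL recording.
human = "Thank you all for coming to the community meeting."
machine = "thank you for coming to the community meetings"
print(f"WER: {word_error_rate(human, machine):.3f}")
```

Computing a score like this separately for each media category (focus groups, court proceedings, commercials, and so on) is one way to compare accuracy across document types. How punctuation, numbers, and speaker labels are normalized before scoring changes the result, so those choices are worth reporting alongside the number.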

Our Junior Data Science Fellows are Rogelio Murillo and Lianne de Leon. Rogelio and Lianne are both participating in the San Francisco Unified School District (SFUSD) Career Pathway Summer Fellowship Program. This six-week program provides opportunities for high school students to gain work experience in a variety of industries and to expand their learning and skills outside of the classroom. Lianne and Rogelio will be learning about programming and creating transcriptions for selected audiovisual materials. The IDL thanks SFUSD and its partners for running this program and providing sponsorship support for our fellows.

Lubov McKone is our Senior Data Science Fellow and will be using automated transcription tools to extract text from audiovisual files, run sentiment and topic analyses, and compare automated results to human transcription. Lubov will also provide guidance and mentoring to the Junior Fellows.

Our Fellows have introduced themselves below. Please join us in welcoming Rogelio, Lianne, and Lubov to the UCSF Library this summer!

Hi my name is Lianne R. de Leon and I go to Phillip and Sala Burton High School as a rising senior. I love playing volleyball in my free time and you may see me at numerous open gyms around the city. In the future I hope to major in computer science or computer engineering. I’m looking forward to meeting many wonderful people here at UCSF and learning more about the data science industry from the inside.

Image of Lianne de Leon, one of IDL's Summer 2022 Junior Data Science Fellows.
IDL Junior Data Science Fellow Lianne de Leon

Hi, my name is Rogelio Murillo and I’m a rising junior at Ruth Asawa School of the Arts. I enjoy playing a variety of music and percussion. I’ve played Japanese Taiko, Afro Brazilian drumming, and Latin Jazz. I’m also learning guitar over the summer. I’m a responsible and respectful person.

Image of Rogelio Murillo, one of IDL's Summer 2022 Junior Data Science Fellows.
IDL Junior Data Science Fellow Rogelio Murillo

My name is Lubov McKone and I’m currently pursuing my Master’s in Library and Information Science at Pratt Institute in Brooklyn, NY. I also hold a Bachelor’s degree in Statistics, and prior to entering graduate school I worked as a data analyst in local government. My professional interests include supporting researchers in the accurate and responsible use of data, and I aspire to work as a data librarian in an academic library after graduation. Outside of work, I spend my time cooking, doing yoga, and writing music. I’m very excited to be joining the UCSF Industry Documents Library this summer, and I’m looking forward to learning more about how researchers use digital collections!

Image of Lubov McKone, IDL's Summer 2022 Senior Data Science Fellow.
IDL Senior Data Science Fellow Lubov McKone

Welcome to Summer Interns May Yuan and Lianne de Leon!

Please join us in giving a warm welcome to our two newest summer interns, May Yuan and Lianne de Leon!

May and Lianne are both participating in the San Francisco Unified School District (SFUSD) Career Pathway Summer Fellowship Program. This six-week program provides opportunities for high school students to gain work experience in a variety of industries and to expand their learning and skills outside of the classroom. Lianne and May will be working (remotely) with the UCSF Industry Documents Library (IDL), and we are grateful to SFUSD and its partners for sponsoring these internships.

May and Lianne will be working on several collection description projects with IDL this summer, including correcting and enhancing document metadata, and creating descriptions for audio-visual materials. They have provided their introductions below.

My name is May Yuan and I’m a junior at Raoul Wallenberg Traditional High School. During my free time, I enjoy reading, learning and trying new things, and helping others academically. I’m super excited to work here at the UCSF IDL to help provide valuable information to the public as well as learn more about the various documents, lawsuits, etc. myself; I also hope to enhance my productivity and organization skills during my time working here as these skills are crucial to college and everyday life in general. The career paths I’m interested in are bioengineering (bioinformatics/biostatistics), law, and finance.

IDL Summer Intern May Yuan

Hi, my name is Lianne R. de Leon. I am a part of the Class of 2023 at Phillip and Sala Burton High School. In the past, I have worked on the VEX EDR Robotics competition in 2018-2019. In my spare time I enjoy trying new foods and yoga. I aspire to become a computer hardware engineer and to travel across the entirety of Asia. I look forward to meeting and working with you all.

IDL Summer Intern Lianne de Leon

Welcome to IDL Summer Intern, Khushi Bhat

Please join us in giving a warm welcome to Khushi Bhat, who will be conducting a remote internship with the UCSF Industry Documents Library (IDL) this summer.

Khushi is currently a rising senior at Rutgers University where she is majoring in Biotechnology and minoring in Computer Science. This summer, she is working in the Industry Documents Library researching tools and methods to extract geographic locations from a collection of documents related to the tobacco industry’s influence on public policy.

Khushi will be conducting an independent course project to help the IDL team enhance descriptive metadata for our industry documents collections. We have long been aware of a research need to be able to filter documents by geographic location. Tobacco control researchers and other public health experts at UCSF and around the world use the documents in the Industry Documents Library to understand how corporations impact public health. This research is often used to inform policymakers who write laws and policies regulating the sale and use of products such as tobacco. Researchers and policymakers need information which relates to their local area such as their city, county, state, or country.

Geographic location is not currently included in IDL’s document-level metadata, and since IDL contains more than 15 million documents it is not feasible to manually catalog this information.

Khushi’s work will focus on researching Natural Language Processing (NLP) and Named Entity Recognition (NER) text analysis methods. She will investigate available tools which have the potential to automatically identify and label geographic information in text. Khushi’s research, recommendations, and pilot testing will help the IDL team outline workflows and strategies for enhancing our document metadata to include geographic information.
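To give a rough sense of what automatically labeling geographic information in text can look like, here is a minimal sketch that uses a simple gazetteer (place-name list) lookup. This is a much simpler technique than the trained NER models Khushi is researching, and it is not her method; the place list, the function name, and the sample text are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, tiny gazetteer mapping place names to a location type.
# A real project would need a far larger list or a trained NER model.
GAZETTEER = {
    "san francisco": "city",
    "california": "state",
    "new jersey": "state",
    "thailand": "country",
}


def find_locations(text: str, gazetteer: dict[str, str] = GAZETTEER) -> Counter:
    """Count gazetteer place names that appear as whole words or phrases."""
    counts: Counter = Counter()
    lowered = text.lower()
    for name, kind in gazetteer.items():
        # Word boundaries keep a short name from matching inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[(name, kind)] = hits
    return counts


# Made-up sample text, not taken from an IDL document.
sample = (
    "The company tested the campaign in San Francisco before expanding "
    "across California. A second San Francisco survey followed."
)
for (name, kind), hits in find_locations(sample).items():
    print(f"{name} ({kind}): {hits}")
```

A lookup like this cannot tell a place from a person or brand that shares its name and misses any location not on the list, which is part of why statistical NER tools are worth evaluating for a collection of this size.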

Khushi aspires to a career in bioinformatics and intends to pursue higher education in this field upon graduation. In her spare time, Khushi enjoys dancing, baking, and hiking. Prior to joining Rutgers, she was an avid Taekwondo practitioner (and has a 2nd degree black belt to show for it!).

Image of IDL intern Khushi Bhat
IDL Summer Intern Khushi Bhat

Learning at the Medical Heritage Library Conference

The Medical Heritage Library 10th Anniversary Conference took place on November 13, 2020. UCSF Archives & Special Collections staff attended the day of virtual presentations, and our Industry Documents Library archivists delivered a talk titled “Smoke on Screens: Audiovisual Evidence of the Tobacco Industry’s Harms to Public Health.”

The conference was convened to celebrate a decade of digitizing and making available medical history resources. Keynote speaker Dr. Jaipreet Virdi, Assistant Professor in the Department of History at the University of Delaware, presented her work on Digitized Disability Histories. She discussed disability identity as represented through material objects of disability, and examined how disability history is separate from medical history.

The program also included fascinating talks from nine other speakers, ranging from the rhetoric used in early 20th century motherhood manuals to medicalize infant care and degrade traditional knowledge, to using convolutional neural networks (CNN) to identify and label objects in historical images in order to visualize thematic collections at scale, to studying the historical lessons from popular culture and medical discourse of face masks during the 1918-1919 Flu epidemic.

All talks were recorded and are being made available with captioning on the Medical Heritage Library YouTube channel (see Session 2 for the “Smoke on Screens” talk).

The Medical Heritage Library (MHL) is “a collaborative digitization and discovery organization committed to providing open access to the history of medicine and health resources.” It was established in 2009 with a grant from the Alfred P. Sloan Foundation to begin digitizing 50,000 medical history texts, and now includes more than 323,000 items made available by multiple contributors through an access portal on the Internet Archive.

UCSF Archives & Special Collections is a contributing partner to the Medical Heritage Library. From 2015 to 2017, A&SC collaborated with four other medical libraries to digitize state medical journals and make them publicly accessible, funded by a $275,000 National Endowment for the Humanities (NEH) grant. In total, 97 journal titles were digitized (nearly every state medical journal in the U.S.), resulting in over 2.7 million full-text searchable pages.

The Industry Documents Library has contributed over 5,000 video recordings to the MHL, beginning in 2012. These videos are part of our Truth Tobacco Industry Documents collection and include recordings of cigarette commercials, marketing focus groups, internal corporate meetings and trainings, depositions of tobacco company employees, and congressional hearings. The recordings document the industry’s marketing and public relations strategies to cast doubt on the harms of smoking and to prevent or delay public health regulations.

Screenshot from 1960 Flintstones commercial for Winston cigarettes.
Screenshot of 1960 Flintstones TV commercial for Winston cigarettes, available in the Industry Documents Library collection of the Medical Heritage Library on the Internet Archive: https://archive.org/details/tobacco_djq03d00

Announcing the UCSF Food Industry Documents Archive

Image of a grocery store aisle with packaged foods.

The UCSF Archives and Special Collections and Industry Documents Library (IDL) are pleased to announce the launch of the Food Industry Documents Archive, a brand new collection of over 30,000 documents related to the food industry and its impact on public health. These documents, available online for the first time, highlight marketing, research, and policy strategies used by food companies and trade groups, and reveal the communications and connections between industry, academic, and regulatory organizations.

The Food Industry Documents Archive was created in collaboration with the UCSF Philip R. Lee Institute for Health Policy Studies and officially unveiled during the inaugural symposium on November 15, 2018. A full recording of the symposium can be viewed here.

The Food Industry Documents were digitized and made available online through partnerships with other libraries, archives, and related organizations, bringing together historical and contemporary materials to support inquiry into long-standing industry practices.

Topics include the Sugar Research Foundation, the International Sugar Research Foundation, the Sugar Institute, cane sugar and beet sugar production, sugar-sweetened beverages, sugared snack foods advertised to children, the U.S. Public Health Service, and the National Research Council and Food and Nutrition Board.

These documents have been used as the source for a number of publications including:

The Food Industry Documents Archive collection joins the existing Tobacco, Drug, and Chemical Industry Documents collections, allowing users to search across industries and identify common tactics used to sway scientific research, shape public opinion, and influence policies and regulations meant to protect public health.

Read the full announcement and be sure to visit the UCSF Industry Documents Library to view the new Food Industry Documents collections.

Celebrating 20 Years of the UCSF Tobacco Center and Industry Documents Library

Celebrate with us >

Image of tobacco company executives testifying before congress with the following text over the top: Celebrating 20 years of the UCSF Tobacco Center and Industry Documents Library

Tuesday, November 27, 12 – 1 pm

Parnassus Library, 5th Floor, Lange Room: Join us to celebrate 20 years since the signing of the Master Settlement Agreement and the creation of the UCSF Center for Tobacco Control Research and Education and the UCSF Industry Documents Library!

In November 1998 the 5 largest cigarette manufacturers signed the Master Settlement Agreement (MSA) with 46 U.S. states and 6 U.S. jurisdictions. This was the largest civil litigation settlement in U.S. history.

The MSA imposed restrictions on the sale and marketing of cigarettes, especially to youth, and required hundreds of billions of dollars in payments to the states in perpetuity to partially compensate them for the Medicaid costs smoking causes. It also created the American Legacy Foundation (now known as the Truth Initiative) which funded the creation of the UCSF Center for Tobacco Control Research and Education and the Industry Documents Library.

This event is open to the public. Cake and beverages will be provided while supplies last. RSVP using the link at the top of the post.

This event is co-organized by the UCSF Industry Documents Library and the Center for Tobacco Control Research and Education.

E-Cigarette Marketing Web Archive: Capturing Trends in Advertising

The UCSF Industry Documents Library (IDL) is a division of the UCSF Archives.

The ‘UCSF E-Cigarette Web Archive’ is a resource created to assist researchers and the public in understanding the history of e-cigarette marketing on the Internet.

Home page of the E-cigarette marketing web archive

The UCSF Industry Documents Library primarily collects internal documents from industries and corporations that attempt to influence policy and regulations meant to protect public health. In addition to our IDL holdings in tobacco, drug, and chemical industries, we capture and preserve websites and online multimedia resources that further demonstrate the actions of these large industries. The rise of vaping and e-cigarette use over the last 10 years has undone some of the gains in tobacco control and youth smoking prevention won by the public health community. The marketing tactics and messaging of the e-cigarette companies mirror those of the tobacco industry 20 years ago, and the IDL is attempting to capture and preserve these campaigns for future research and analysis.

The UCSF E-Cigarette Marketing Web Archive utilizes the Archive-It service to crawl and preserve designated websites at selected points in time. The preserved sites range from major e-cigarette company websites, e-cigarette trade associations and advocacy groups, to forums and social media such as YouTube, Twitter and Facebook.

Preserving sites over a number of years allows researchers to see trends in advertising campaigns and marketing. For instance, in 2014, Blu was urging users to “Take Back your Freedom” but in 2017, authenticity was the key with e-cigarettes helping you “Be Who you Truly Are”.

These web archives provide a historical perspective on the evolution of e-cigarette marketing, and we hope that continued captures will preserve the industry’s shifts in language, imagery and tactics before and after any possible regulatory actions by jurisdictions.

Tobacco Industry Sponsorship of the Olympics: A Dive into the UCSF Truth Tobacco Industry Documents

This post highlights just a few of the over 14 million tobacco industry documents contained in the UCSF Truth Tobacco Industry Documents, a division of the UCSF Archives.

Could a sporting event like the Olympics ever be equated with smoking? The games summon images of stamina, health, fortitude and strength, and for decades, the tobacco industry worked diligently to affiliate itself with this major sporting event. Olympic games draw millions of eyes and the promotion and marketing opportunities were gold for the tobacco companies.

In 1936, RJ Reynolds’ Camel brand used Olympic speed skater Kit Klein to advertise the purported health effects of smoking on digestion:

Since 1988, each Olympic Games has adopted a tobacco-free policy but the tobacco industry has continued to create indirect associations in an effort to be connected with not only Olympic ideals but the worldwide platform the Games provide.

The Olympics as a powerful promotional tool: A 1980s memo in our British American Tobacco (BAT) records indicates executives considered the Olympics second only to Formula One motor racing as an effective sports-based “marketing platform.”  Into the 1990s, BAT affiliate UZBAT proudly proclaimed BAT sponsorship of Lina Cheryazova, 1994 Olympic Gold Medalist in freestyle skiing; and a 1992 memo between BAT and Singapore Tobacco Company, Korea, notes proposed Olympic team sponsorship in Thailand is illegal but ‘primary’ sponsors have been used as cover in the past.

The tobacco companies were so heavily invested in advertising and marketing around sporting events they could not risk censure from athletes. In a 1988 statement by Greg Louganis regarding tobacco sponsorship, the Olympian confessed, “I had become a slave to a tobacco company…Philip Morris representatives made it very clear that if I continued to speak out nationally [about tobacco and health], my career at, and association, with Mission Viejo [Realty Group, a PM subsidiary] would be over.”

The 1996 Centennial Olympic Games, Atlanta, Georgia:
Documents in our Philip Morris and RJ Reynolds collections demonstrate that despite the almost decade-long tobacco-free policy of the Olympic Games, the companies were still planning promotions and marketing events. A 1996 Philip Morris memo shows the tobacco giant crafted a contract to place a Benson & Hedges ad on the back cover of the Ultimate Games Guide, a souvenir program of the 1996 Olympic Basketball games in Atlanta. Similarly, a 1995 RJ Reynolds email discusses a new tobacco company whose products could be introduced at the Games with catchy brands like Torch and Gold Medal, even going so far as to posit an “official cigarette of the 1996 Olympics.”

The “accommodation” of Olympic visitors who smoke was a hot topic in 1996 and one that allowed the companies to roll in promotional and marketing activities.  “Accommodation programs” were the tobacco industry’s way of holding off smoking bans by partnering with hospitality agencies (hotels and restaurants) to promote “choice”, “preference” and often the “solution” of improved ventilation in order to accommodate both smokers and non-smokers in public areas.

You can view these documents and millions more at the UCSF Industry Documents Library, where we collect and make available internal corporate documents produced by industries that distort science in an effort to influence policies meant to protect public health.

Job Shadowing at the Library

This is an excerpt of a blog post written by Rebecca Tang, Developer with the Industry Documents Library, a division of the UCSF Archives. Read the full article here.

On Thursday October 26, 2017, the Library and Center for Knowledge Management hosted job shadowing for high school students for the first time!

Two students, Kelly and Jane, both juniors from Balboa High School, visited us and spent the day learning about what it is like to work as a programmer. Kelly and Jane are part of the Game Design Academy at Balboa High School. The Game Design Academy is the pathway for students who are interested in engineering and programming. Kelly and Jane have not had any programming experience yet. They will start programming classes next semester.

They started the day off with a tour of the library with Jim. Then they attended the weekly meeting with the Industry Documents Library team. During the meeting, they learned about the IDL project, databases, and the search index.

Then they attended the Ilios code jam with the Ilios team, where they got a front-end programming primer from Jason and listened in as the Ilios team discussed ways to improve their UI… Continue reading the full job shadowing article on the CKM blog.

Searching Tobacco Archives: Sports and Chewing Tobacco

This is a guest post by Allen Smoot, UCSF Archives Intern.

As an intern for the UCSF Archives, I’ve been working on digitized state medical society journals and tobacco control collections. At UCSF, the Archives and the Industry Documents Library both house immense collections of tobacco-related material. In the Industry Documents Library there are millions of documents from tobacco companies about their manufacturing, marketing, and scientific research.  I narrowed in on chewing tobacco and how it became popular in the sporting world.

Image from "The case against smokeless tobacco: five facts for the health professional to consider," September 1980, page 4. https://www.industrydocumentslibrary.ucsf.edu/tobacco/docs/#id=fnyg0028

Smokeless tobacco gained popularity in the United States in part because many jobs prohibited workers from smoking on site. Advertising also played a role; for example, a 1980 article outlined the various ways that tobacco companies targeted college campuses and youths through their advertising for chewing tobacco. A 1984 report stated that in Atlanta, 11% of sampled elementary and high school students regularly used snuff.

In the sports world, the numbers could be higher. The same 1984 report, for instance, noted that in a Texas sample, one in every three varsity college athletes on baseball or football teams took two to eight dips per day.  Sports idols like Sparky Lyle, former ace pitcher for the New York Yankees and Texas Rangers, contributed to chewing tobacco usage by serving as spokespersons for tobacco companies. Lyle promoted Levi Garrett pouch chewing tobacco on TV by claiming, “Most ball players dream about making it to the Hall of Fame, but I’d be satisfied for people just to remember me as the guy with the great chewing tobacco.”

Image from "The case against smokeless tobacco: five facts for the health professional to consider," September 1980, page 3. https://www.industrydocumentslibrary.ucsf.edu/tobacco/docs/#id=fnyg0028

Smokeless tobacco was advertised as “macho” by sports figures, which led to an increase in use by younger people (“Smokeless tobacco is ‘burning’ young athletes,” 1981).

You can read more documents related to smokeless tobacco online in the Industry Documents Library and in the State Medical Society Journals Collection. You can also visit the UCSF Archives and view the Tobacco Control Archives.