Sarah Lin

Sr Information/Content Architect

MongoDB

Biography

Sarah Lin is currently the Senior Information & Content Architect at the NoSQL database software company MongoDB. Prior to joining MongoDB, she managed the Enterprise Information Management team at Posit (formerly RStudio), a data science software company. She has previously been a bindery clerk, a serials librarian, an indexer, a technical services librarian and a content manager in academic, medical, legal and corporate libraries. Sarah believes that data literacy is the key to the future and that librarians would do well to learn programming and data science skills in order to better serve their patrons, colleagues, and careers.

Interests

  • Information Architecture
  • Findability
  • Metadata

Education

  • MS in Library & Information Science, 2006

    University of Illinois at Urbana-Champaign

  • BA in African/African-American Studies & Anthropology, 2003

    University of Chicago

Recent & Upcoming Talks

Legal Research in the Era of Black Lives Matter

This session highlights the lack of access to justice and informs law librarians how to discuss race-related topics using Black Lives …

Data Science Fundamentals - SLA

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

Intro to Data Science for [Australian] Law Librarians

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

The Geek in Review Ep. 90 – Using Data Analytics to Tell Your Story with RStudio’s Sarah Lin

Sarah discusses what the R Programming language does, and how she got interested in the profession of statistical computing. While some …

Intro to Data Science for Law Librarians

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

Textual analysis in Legal Scholarship Using Python and R

I was asked to present the R portion of the Textual Analysis Deep Dive at AALL’s Annual Conference in New Orleans on July 12, 2020. This event was cancelled due to COVID-19.

Transform Your Mindset: TS Librarians as Data Scientists

The standard library school cliche is that we went so we didn’t have to do any math! Yet with machine learning, deep learning, …

Publications

Hands-on data science for librarians available for pre-order!

Librarians understand the need to store, use and analyze data related to their collection, patrons and institution, and there has been consistent interest over the last 10 years in improving data management, analysis, and visualization skills within the profession. However, librarians find it difficult to move from out-of-the-box proprietary software applications to the skills necessary to perform the range of data science actions in code. This book will focus on teaching R through relevant examples and skills that librarians need in their day-to-day lives, which include visualizations but go much further to cover web scraping, working with maps, creating interactive reports, machine learning, and others. While there’s a place for theory, ethics, and statistical methods, librarians need a tool to help them acquire enough facility with R to use data science skills in their daily work, no matter what type of library they work at (academic, public or special). By explaining each skill and its application to library work before walking the reader through each line of code, this book will support librarians who want to apply data science in their daily work. Hands-On Data Science for Librarians is intended for librarians (and other information professionals) in any library type (public, academic or special) as well as graduate students in library and information science (LIS).

Hands-on data science for librarians

Hands-on data science for librarians (forthcoming 2023) is a guide to doing data science in R geared directly toward library & information professionals. Librarians understand the need to store, use and analyze data related to their collection, patrons and institution, and there has been consistent interest over the last 10 years in improving data management, analysis, and visualization skills within the profession. However, librarians find it difficult to move from out-of-the-box proprietary software applications to the skills necessary to perform the range of data science actions in code. This book will focus on teaching R through relevant examples and skills that librarians need in their day-to-day lives, which include visualizations but go much further to cover web scraping, working with maps, creating interactive reports, machine learning, and others. While there’s a place for theory, ethics, and statistical methods, librarians need a tool to help them acquire enough facility with R to use data science skills in their daily work, no matter what type of library they work at (academic, public or special). By explaining each skill and its application to library work before walking the reader through each line of code, this book will support librarians who want to apply data science in their daily work.

10 Ways Data Science Can Help Law Librarians

Data science brings opportunities to work more quickly and easily with data. It provides better reporting formats by incorporating outside data from various sources, and can even turn text into data that can be displayed visually. Even though legal information isn’t always associated with data, science, or data science, data science skills enable law librarians to do their jobs with greater efficiency. With data science skills, we are able to show new value for our teams and organizations, so it is definitely worth the time invested. The following 10 data science skills and techniques, along with descriptions of the amazing deliverables that are associated with them, are listed in a progressive skill-building sequence.

Ten quick tips for making things findable

The distribution of scholarly content today happens in the context of an immense deluge of information found on the internet. As a result, researchers face serious challenges when archiving and finding information that relates to their work. Library science principles provide a framework for navigating information ecosystems in order to help researchers improve findability of their professional output. Here, we describe the information ecosystem which consists of users, context, and content, all 3 of which must be addressed to make information findable and usable. We provide a set of tips that can help researchers evaluate who their users are, how to archive their research outputs to encourage findability, and how to leverage structural elements of software to make it easier to find information within and beyond their publications. As scholars evaluate their research communication strategies, they can use these steps to improve how their research is discovered and reused.

Machine Learning: My Journey into AI

Soon after starting a new position as a librarian at a data science software company, I saw that my employer was offering a workshop to learn how to do machine learning in the R programming language and I jumped at the chance to learn more about the subject. With support from my boss, I struggled through October and November refreshing my linear regression knowledge (knowledge I’d happily left behind in high school) and bringing my coding skills from near zero to “won’t be embarrassed in front of my colleagues.”