Sarah Lin

Information Architect & Digital Librarian



Sarah Lin is currently the Information Architect & Digital Librarian at RStudio, a data science software company. She has previously been a bindery clerk, a serials librarian, an indexer, a technical services librarian and a content manager in academic, medical, legal and corporate libraries. Sarah believes that data literacy is the key to the future and that librarians would do well to learn data science in order to better serve their patrons, colleagues, and careers.


  • Information Architecture
  • Findability
  • Metadata


  • MS in Library & Information Science, 2006

    University of Illinois at Urbana-Champaign

  • BA in African/African-American Studies & Anthropology, 2003

    University of Chicago

Recent & Upcoming Talks

Data Science Fundamentals - SLA

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

Intro to Data Science for [Australian] Law Librarians

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

The Geek in Review Ep. 90 – Using Data Analytics to Tell Your Story with RStudio’s Sarah Lin

Sarah discusses what the R Programming language does, and how she got interested in the profession of statistical computing. While some …

Intro to Data Science for Law Librarians

Data science is everywhere these days, but what exactly is it? How do you “do data science”? Is it something law librarians …

Textual analysis in Legal Scholarship Using Python and R

I was asked to present the R portion of the Textual Analysis Deep Dive at AALL's Annual Conference in New Orleans on July 12, 2020. This event was cancelled due to COVID-19.


Ten quick tips for making things findable

The distribution of scholarly content today happens in the context of an immense deluge of information found on the internet. As a result, researchers face serious challenges when archiving and finding information that relates to their work. Library science principles provide a framework for navigating information ecosystems in order to help researchers improve findability of their professional output. Here, we describe the information ecosystem, which consists of users, context, and content, all three of which must be addressed to make information findable and usable. We provide a set of tips that can help researchers evaluate who their users are, how to archive their research outputs to encourage findability, and how to leverage structural elements of software to make it easier to find information within and beyond their publications. As scholars evaluate their research communication strategies, they can use these steps to improve how their research is discovered and reused.

Machine Learning: My Journey into AI

Soon after starting a new position as a librarian at a data science software company, I saw that my employer was offering a workshop to learn how to do machine learning in the R programming language and I jumped at the chance to learn more about the subject. With support from my boss, I struggled through October and November refreshing my linear regression knowledge (knowledge I’d happily left behind in high school) and bringing my coding skills from near zero to “won’t be embarrassed in front of my colleagues.”

Managing technical services long distance: tips for successful collection management

Managing a library collection requires consistency and meticulous attention to detail, but managing remotely requires even more of these skills, as well as a strong team that can work together to get the job done. Unlike reference and research services, technical services duties are routine and standard across locations, so they run very smoothly with a bit of centralization and procedural discipline. While there are idiosyncrasies at individual libraries, this article covers seven areas that affect the success of collection management in a remote work situation.

Collection management 2016: from anecdotal to analytic

If you attended SLA's recent 2015 Lawmaggedon webinar, you may have heard Jean O'Grady reference the term ‘anecdata’ — usage data based on the anecdotal, in the absence of any hard (or soft) usage data. I have seen the anecdotal trump data over the years: most notably a strong protest against a cancellation, only to find a thick layer of dust in front of the item in question.