Exploring Biomedical Web Resources using Shell Scripting

Springer Book

Tutorial at the 30th edition of The Web Conference (WWW '21)

April 12, 2021, Ljubljana
by Francisco M. Couto

Exploring the vast and rapidly growing amount of biomedical content available on the web is of utmost importance, but it is also particularly challenging because of the highly specialized domain knowledge involved. This hands-on tutorial will explain how to retrieve and process biomedical data and text using shell scripting with minimal software dependencies. The tutorial will also describe how to explore the semantics encoded in biomedical ontologies and how these ontologies address the ambiguity of natural language and the contextualization of biomedical entities.

Tutorial Slides

Outline

The tutorial will follow an example of the manual steps that Health and Life specialists may have to perform to find and retrieve biomedical text about caffeine using publicly available web resources. The participants will then learn how to use command line tools to automate those steps, including the automatic download of data and the extraction of useful information from the retrieved text. The tutorial will be a full-day session (6 hours), according to the following outline:

  1. Biomedical Data and Text Retrieval (2h)
    • Biomedical Resources
    • Caffeine Example
    • Data Extraction
    • Text Retrieval
  2. Biomedical Text Processing (1h)
    • Pattern Matching
    • Entity Recognition
  3. Semantic Processing (3h)
    • Biomedical Ontologies
    • OWL Processing
    • Synonyms and Ancestors
    • Entity Linking
    • Large Lexicons
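
To give a flavor of the pattern matching and entity recognition steps in part 2, here is a minimal sketch that counts lexicon mentions in a piece of text using only grep and the shell. The sample sentences, the two-term lexicon, and the function name count_mentions are invented for illustration and are not taken from the tutorial material; in the tutorial the text would come from the web resources retrieved in part 1.

```shell
#!/bin/sh
# Hypothetical sample text (assumption: stands in for text retrieved from the web).
text='Caffeine is a stimulant found in coffee.
Malignant hyperthermia is a reaction to some anesthetics.
Caffeine has been used in tests for malignant hyperthermia.'

# Hypothetical lexicon, one term per line (assumption: illustrative only).
lexicon='caffeine
malignant hyperthermia'

# count_mentions TERM: print how many times TERM occurs in $text.
#   -o  print each match on its own line, so wc -l counts matches
#   -i  ignore case, so "Caffeine" matches "caffeine"
#   -F  treat the term as a fixed string, not a regular expression
count_mentions() {
  printf '%s\n' "$text" | grep -o -i -F -e "$1" | wc -l | tr -d ' '
}

# Print a tab-separated line "term<TAB>count" for every lexicon entry.
printf '%s\n' "$lexicon" | while IFS= read -r term; do
  printf '%s\t%s\n' "$term" "$(count_mentions "$term")"
done
```

Plain string matching like this cannot tell apart ambiguous terms or recognize synonyms, which is the gap that the ontology-based steps in part 3 are meant to address.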

Prerequisite Skills

This is an introductory tutorial, so no prior knowledge or experience in bioinformatics, text mining, or ontologies is required. Participants should, however, have basic experience in shell scripting and pattern matching.

Hardware Requirements

Participants need a computer (any operating system) with internet access and a terminal with a UNIX shell. Before the tutorial, participants should check that everything is available on their computer using the Test Script. This YouTube Playlist includes videos explaining how to use the Test Script on different operating systems.
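
The Test Script itself is not reproduced here. As a rough idea of what such a check can look like, the sketch below reports which command line tools from a given list are not found on the current system. The tool list (curl, grep, sed) and the function name missing_tools are assumptions made for this illustration; the actual Test Script may check different things.

```shell
#!/bin/sh
# Assumed tool list, for illustration only; not taken from the Test Script.
tools='curl grep sed'

# missing_tools NAME...: print each NAME that is not available as a command.
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# $tools is left unquoted on purpose so it splits into separate arguments.
missing=$(missing_tools $tools)

if [ -z "$missing" ]; then
  echo "All tools found"
else
  echo "Missing tools:"
  echo "$missing"
fi
```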