Biomedical Text Processing using Semantics

Tutorial at the 43rd European Conference on Information Retrieval (ECIR 2021)

March 28, 2021

by Francisco M. Couto

Exploring the vast and rapidly growing body of biomedical text is of utmost importance, but it is also particularly challenging because of the highly specialized domain knowledge involved and the inconsistency of the nomenclature. This introductory tutorial will be a hands-on session that explores the semantics encoded in biomedical ontologies to process text using shell scripting, with minimal software dependencies. Participants will learn how to process OWL, retrieve synonyms and ancestors, perform entity linking, and construct large lexicons.

Tutorial Slides

Outline

This tutorial will show how to select an ontology that models a given domain and how to identify the official names and synonyms of biomedical entities. It will use two ontologies, one about human diseases and the other about chemical entities of biological interest. The semantics encoded in those ontologies will be explored to find the ancestors and related classes of a given entity. Participants will learn how to apply semantic similarity to address ambiguity in the entity linking process. After constructing large lexicons that include all the entities of a given domain, participants will learn how to recognize them in biomedical text.
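To give a flavor of what retrieving names and ancestors with shell tools can look like, here is a minimal sketch. It is not taken from the tutorial material: the OWL fragment, the `http://example.org/onto/` URIs, and the function names `get_names`, `get_parents`, and `get_ancestors` are all hypothetical. It also assumes an RDF/XML serialization with one element per line, which real ontology files may not follow, so a proper XML tool would be more robust in practice.

```shell
# Hypothetical OWL fragment with three classes (not a complete OWL file).
owl=$(mktemp)
cat > "$owl" <<'EOF'
<owl:Class rdf:about="http://example.org/onto/X_0001">
    <rdfs:label>disease</rdfs:label>
</owl:Class>
<owl:Class rdf:about="http://example.org/onto/X_0002">
    <rdfs:label>metabolic disease</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/onto/X_0001"/>
</owl:Class>
<owl:Class rdf:about="http://example.org/onto/X_0003">
    <rdfs:label>example syndrome</rdfs:label>
    <oboInOwl:hasExactSynonym>sample syndrome</oboInOwl:hasExactSynonym>
    <rdfs:subClassOf rdf:resource="http://example.org/onto/X_0002"/>
</owl:Class>
EOF

# Print the label and exact synonyms of a class.
# $1: ontology file, $2: class URI
get_names() {
    awk -v uri="$2" '
        index($0, "<owl:Class rdf:about=\"" uri "\"") { inside = 1 }
        inside && /<\/owl:Class>/ { inside = 0 }
        inside && /<(rdfs:label|oboInOwl:hasExactSynonym)[ >]/ {
            gsub(/<[^>]*>/, "")             # strip the XML tags
            gsub(/^[ \t]+|[ \t]+$/, "")     # trim surrounding whitespace
            print
        }' "$1"
}

# Print the URIs of the direct superclasses of a class.
get_parents() {
    awk -v uri="$2" '
        index($0, "<owl:Class rdf:about=\"" uri "\"") { inside = 1 }
        inside && /<\/owl:Class>/ { inside = 0 }
        inside && /<rdfs:subClassOf rdf:resource=/ {
            split($0, parts, "\"")          # the URI sits between the quotes
            print parts[2]
        }' "$1"
}

# Print all ancestors by following subClassOf links recursively.
# Assumes the hierarchy has no cycles.
get_ancestors() {
    for parent in $(get_parents "$1" "$2"); do
        echo "$parent"
        get_ancestors "$1" "$parent"
    done
}

get_names "$owl" "http://example.org/onto/X_0003"
get_ancestors "$owl" "http://example.org/onto/X_0003"
```

The functions rely only on `awk`, in keeping with the minimal-dependency approach, at the cost of depending on the line layout of the file.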

The tutorial will be a half-day session (3 hours), organized according to the following outline:

Biomedical Ontologies

OWL Processing

Synonyms and Ancestors

Entity Linking

Large Lexicons
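As a rough illustration of the last outline item, the sketch below matches a small lexicon against a sentence using `grep`. The lexicon entries, the sentence, and the `recognize` function name are hypothetical and chosen only for the example. The `-o` and `-w` options are assumed to be available, as they are in GNU and BSD `grep`, although POSIX does not require them.

```shell
# Hypothetical lexicon: one entity name per line.
lexicon=$(mktemp)
cat > "$lexicon" <<'EOF'
caffeine
malignant hyperthermia
EOF

# Print the distinct lexicon entries found in the text read from stdin.
# $1: lexicon file
recognize() {
    # -F: fixed strings, -w: whole words, -i: ignore case,
    # -o: print only the matched text, -f: read patterns from a file
    grep -o -i -w -F -f "$1" | tr '[:upper:]' '[:lower:]' | sort -u
}

# Made-up sentence used only to exercise the function.
text='Notes mention Caffeine and malignant hyperthermia.'
printf '%s\n' "$text" | recognize "$lexicon"
```

Fixed-string matching keeps the lookup simple and avoids regular-expression surprises from names containing special characters, but it only finds exact surface forms.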

Prerequisite Skills

This is an introductory tutorial, so no prior knowledge of or experience in bioinformatics, text mining, or ontologies is required. Participants should, however, have basic experience in shell scripting and pattern matching.

Hardware Requirements

Participants need a computer (any operating system) with internet access and a terminal running a UNIX shell. Before the tutorial, participants should check that everything is available on their computer using the Test Script. This YouTube Playlist includes videos explaining how to use the Test Script on different operating systems.
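For readers who want a quick preliminary check, the following sketch tests whether a few common command-line tools are on the `PATH`. It is not the Test Script and does not replace it. The tool list (`curl`, `grep`, `sed`, `awk`) is an assumption about what a shell-based session typically needs, and `check_tools` is a hypothetical helper name.

```shell
# Report which of the given tools are available.
# Returns 0 if all are found and 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list; adjust it to whatever the tutorial actually requires.
check_tools curl grep sed awk || echo "Some tools are missing; install them before the session."
```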