Exploring the vast and rapidly growing body of biomedical text is important but particularly challenging, because it requires highly specialized domain knowledge and the nomenclature is inconsistent. This introductory, hands-on tutorial explores the semantics encoded in biomedical ontologies and uses them to process text with shell scripting and minimal software dependencies. Participants will learn how to process OWL files, retrieve synonyms and ancestors, perform entity linking, and construct large lexicons.
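As a rough illustration of what processing OWL with shell tools can look like, the sketch below retrieves the label, synonyms, and ancestors of a class using only `awk`, `grep`, and `sed`. It is not the tutorial's own material: the ontology fragment, its IRIs, and its hierarchy are invented for the example, the helper names (`class_block`, `element_text`, `parents`, `ancestors`) are hypothetical, and it assumes an RDF/XML serialization in which each property sits on its own line inside an `owl:Class` block.

```shell
#!/bin/sh
# Toy OWL fragment (invented IRIs and hierarchy, for illustration only).
owl=$(mktemp)
trap 'rm -f "$owl"' EXIT
cat > "$owl" <<'EOF'
<owl:Class rdf:about="http://example.org/onto#D_1">
  <rdfs:label>disease</rdfs:label>
</owl:Class>
<owl:Class rdf:about="http://example.org/onto#D_2">
  <rdfs:subClassOf rdf:resource="http://example.org/onto#D_1"/>
  <rdfs:label>muscle disease</rdfs:label>
</owl:Class>
<owl:Class rdf:about="http://example.org/onto#D_3">
  <rdfs:subClassOf rdf:resource="http://example.org/onto#D_2"/>
  <oboInOwl:hasExactSynonym>anesthesia related hyperthermia</oboInOwl:hasExactSynonym>
  <rdfs:label>malignant hyperthermia</rdfs:label>
</owl:Class>
EOF

# Print the lines of the owl:Class block whose rdf:about equals IRI $1 in file $2.
class_block() {
  awk -v iri="$1" '
    index($0, "<owl:Class rdf:about=\"" iri "\"") { inside = 1 }
    inside { print }
    inside && /<\/owl:Class>/ { inside = 0 }
  ' "$2"
}

# Read a class block on stdin and print the text content of each element named $1.
element_text() {
  grep "<$1[ >]" | sed -e 's/^[^>]*>//' -e 's/<.*$//'
}

# Print the IRIs of the direct superclasses of class $1 in file $2.
parents() {
  class_block "$1" "$2" |
    grep '<rdfs:subClassOf rdf:resource=' |
    sed -e 's/^[^"]*"//' -e 's/".*$//'
}

# Print all ancestors of class $1 in file $2, nearest first (assumes no cycles).
ancestors() {
  for p in $(parents "$1" "$2"); do
    echo "$p"
    ancestors "$p" "$2"
  done
}

iri='http://example.org/onto#D_3'
class_block "$iri" "$owl" | element_text rdfs:label                # malignant hyperthermia
class_block "$iri" "$owl" | element_text oboInOwl:hasExactSynonym  # anesthesia related hyperthermia
ancestors "$iri" "$owl"                                            # D_2, then D_1 (full IRIs)
```

Line-oriented matching of this kind is fragile against differently formatted XML; a real ontology file may call for an XML-aware tool instead.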
The tutorial shows how to select an ontology that models a given domain and how to identify the official names and synonyms of biomedical entities. It uses two ontologies, one about human diseases and the other about chemical entities of biological interest. The semantics encoded in those ontologies will be explored to find the ancestors and related classes of a given entity. Participants will learn how to apply semantic similarity to resolve ambiguity in the entity linking process. After constructing large lexicons that include all the entities of a given domain, participants will learn how to recognize those entities in biomedical text.
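The final step, recognizing lexicon entries in text, can be sketched with `grep` alone. The example below is a minimal illustration under stated assumptions, not the tutorial's actual script: the three-entry lexicon and the sentence are made up, and `recognize` is a hypothetical helper name. It treats each lexicon line as a fixed string (`-F`), matches whole words only (`-w`), ignores case (`-i`), and prints only the matched mentions (`-o`).

```shell
#!/bin/sh
# Toy lexicon (one entity name per line) and toy text, both invented for illustration.
lexicon=$(mktemp)
text=$(mktemp)
trap 'rm -f "$lexicon" "$text"' EXIT

printf '%s\n' 'malignant hyperthermia' 'caffeine' 'asthma' > "$lexicon"
printf '%s\n' 'Caffeine contracture testing is used to diagnose malignant hyperthermia.' > "$text"

# Print every mention in file $2 of an entry in lexicon file $1, one per line.
recognize() {
  grep -o -w -i -F -f "$1" "$2"
}

recognize "$lexicon" "$text"
# prints:
#   Caffeine
#   malignant hyperthermia
```

Exact string matching finds only names that appear in the lexicon, which is why a lexicon covering official names and synonyms matters for recall.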
The tutorial will be a half-day session (3 hours), according to the following outline:
This is an introductory tutorial, so no prior knowledge of or experience in bioinformatics, text mining, or ontologies is required. Participants should, however, have basic experience in shell scripting and pattern matching.
Participants need a computer (any operating system) with internet access and a terminal with a UNIX shell. Before the tutorial, participants should check that everything they need is available on their computer by running the Test Script. This YouTube Playlist includes videos explaining how to use the Test Script on different operating systems.