Introduction to information retrieval and text mining, NLP, Named Entity Recognition (NER), Named Entity Normalisation (NEN), and Relation Extraction (RE)
Practical part created by
Sakhaa Alsaedi¹
¹King Abdullah University of Science and Technology (KAUST)
This repository contains code for building Natural Language Processing (NLP) models and running NLP tasks for biomedical text mining. This can generally be useful for high-quality samples, or for model diversity.
Part | Description | Open in Colab |
---|---|---|
Introduction to NLP | Notebook 1 | |
Named Entity Recognition (NER) | Notebook 2 | |
Named Entity Normalisation (NEN) | Notebook 3 | |
Relation Extraction (RE) | Notebook 4 | |
- Introduction to NLP: Lay the cornerstone by deciphering the art and science of NLP. Traverse the verdant landscapes of tokenization and linguistic preprocessing, and voyage through the lexicon of challenges unique to biomedical linguistics.
- Named Entity Recognition (NER) in Biomedical Context: Unleash the linguistic sleuth within you! Delve deep into the fascinating world of NER, where genes, diseases, chemicals, and a cornucopia of entities spring to life from the textual canvas. Embark on a quest to unravel their identities with the precision of a virtuoso. Leverage the might of spaCy, NLTK, and specialized biomedical NER tools to extract entities. Illuminate the dim alleys of biomedical text with your code's brilliance.
- Named Entity Normalisation (NEN) in Biomedical Text: Normalise the entities identified in text by linking each one to its reference record in a curated resource, such as an ontology or database.
- Weaving Relations in Biomedical Text: Enter the intricate dance of entities as they pirouette through the grammatical maze. With Relation Extraction, you're not just reading text; you're deciphering the intricate web of connections between entities. Marvel at the tales spun by genes, proteins, and diseases as they waltz through sentences. Empower your code with PyTorch and more. Cast your models into the textual sea, reeling in relationships with performance metrics as your guiding stars.
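To make the four parts above concrete, here is a minimal, standard-library-only sketch of how they chain together: tokenization, entity recognition, normalisation, and relation extraction. It is not the notebooks' code. It swaps in a toy dictionary lookup for the spaCy/NLTK NER models and plain sentence-level co-occurrence for a trained PyTorch relation classifier. The lexicon, the `EX:` identifiers, and all function names are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass
from itertools import combinations

# Hypothetical toy lexicon: surface form -> (entity type, placeholder identifier).
# The "EX:" identifiers are invented for this sketch; they are not real database IDs.
LEXICON = {
    "brca1": ("GENE", "EX:G001"),
    "breast cancer": ("DISEASE", "EX:D001"),
    "breast carcinoma": ("DISEASE", "EX:D001"),  # synonym mapped to the same record
    "tamoxifen": ("CHEMICAL", "EX:C001"),
}


@dataclass(frozen=True)
class Entity:
    text: str        # surface form as it appears in the sentence
    label: str       # entity type from the lexicon
    start: int       # character offset (inclusive)
    end: int         # character offset (exclusive)
    concept_id: str  # normalised identifier


def tokenize(text):
    """Split text into word tokens (keeping internal hyphens) and punctuation marks."""
    return re.findall(r"\w+(?:-\w+)*|[^\w\s]", text)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_entities(sentence):
    """Dictionary NER plus normalisation: case-insensitive, longest match first."""
    entities, taken = [], []
    for term in sorted(LEXICON, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, sentence, re.IGNORECASE):
            # Skip matches that overlap a span already claimed by a longer term.
            if any(m.start() < e and s < m.end() for s, e in taken):
                continue
            taken.append((m.start(), m.end()))
            label, concept_id = LEXICON[term]
            entities.append(Entity(m.group(0), label, m.start(), m.end(), concept_id))
    return sorted(entities, key=lambda e: e.start)


def extract_relations(text):
    """Co-occurrence RE: pair entities of different types within one sentence."""
    relations = []
    for sentence in split_sentences(text):
        for a, b in combinations(find_entities(sentence), 2):
            if a.label != b.label:
                relations.append((a.concept_id, b.concept_id))
    return relations


if __name__ == "__main__":
    example = "BRCA1 mutations are linked to breast carcinoma. Tamoxifen is a drug."
    print(tokenize(example))
    print(find_entities(split_sentences(example)[0]))
    print(extract_relations(example))
```

Co-occurrence only proposes candidate pairs. It says nothing about the type or direction of a relation, which is the part a trained model is meant to supply.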
Google Colab provides all the necessary dependencies for running the code in this repository. You do not need to install any additional packages.
- Necessary libraries and frameworks are detailed in each chapter's README.
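If you run the notebooks outside Colab, a quick check like the following can show which libraries are missing from your environment. This is a minimal sketch: the import names `spacy`, `nltk`, and `torch` are assumed from the libraries mentioned in this README, and `missing_packages` is a hypothetical helper, not part of the repository.

```python
from importlib.util import find_spec

# Assumed top-level import names for spaCy, NLTK, and PyTorch.
LIBRARIES = ("spacy", "nltk", "torch")


def missing_packages(names=LIBRARIES):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable here:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

Each chapter's own list of requirements remains the authoritative source for what to install.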
▬▬▬▬▬▬ Resources and Materials ▬▬▬▬▬▬
- Python for Data Analysis by Wes McKinney
- Niels Rogge, Kashif Rasul, Hugging Face notebook
▬▬▬▬▬▬▬ Papers ▬▬▬▬▬▬▬