Entity Linking tutorial

If your NLP project involves disambiguating textual mentions to different meanings (linked to unique IDs), this new video tutorial is for you! I use spaCy, an open-source library for advanced Natural Language Processing in Python, to implement and train a custom Entity Linking (EL) model. I showcase the functionality on an example use case: disambiguating mentions of the person “Emerson” to unique identifiers in WikiData. I accomplish this by first annotating some data with our tool Prodigy, and then training a machine learning model from scratch. Near the end of the video, I show how to use the trained model on unseen text and how to evaluate its performance.

In summary, these are the steps to successfully implement Entity Linking:

- Run Named Entity Recognition (NER) to recognize the textual entities
- Create a custom Knowledge Base (KB) that holds information about unique identifiers and likely aliases
- Annotate some training text where you manually perform the disambiguation of mentions to their correct KB identifiers
- Train a new Entity Linking component on your training data
- Test its performance on a held-out test dataset
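
To make the Knowledge Base and evaluation steps above more concrete, here is a minimal sketch in plain Python. It is not spaCy's API and not the code from the video: `ToyKnowledgeBase`, `link_by_prior` and `accuracy` are hypothetical names, and the entity IDs, descriptions and prior probabilities are made-up placeholders rather than real WikiData values. It only illustrates the idea of a KB that maps an alias to candidate identifiers, plus a context-free baseline scored on held-out data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKnowledgeBase:
    """Plain-Python stand-in for a KB: entity IDs plus aliases with priors."""

    entities: dict = field(default_factory=dict)  # entity ID -> short description
    aliases: dict = field(default_factory=dict)   # alias -> {entity ID: prior}

    def add_entity(self, entity_id, description):
        self.entities[entity_id] = description

    def add_alias(self, alias, entity_ids, priors):
        if len(entity_ids) != len(priors):
            raise ValueError("need exactly one prior per entity ID")
        unknown = [e for e in entity_ids if e not in self.entities]
        if unknown:
            raise ValueError(f"unknown entity IDs: {unknown}")
        if sum(priors) > 1.0 + 1e-9:
            raise ValueError("priors for one alias must not sum to more than 1")
        self.aliases[alias] = dict(zip(entity_ids, priors))

    def get_candidates(self, mention):
        """Return (entity ID, prior) pairs for a mention, most likely first."""
        candidates = self.aliases.get(mention, {})
        return sorted(candidates.items(), key=lambda pair: pair[1], reverse=True)


def link_by_prior(kb, mention):
    """Baseline: pick the candidate with the highest prior, ignoring context."""
    candidates = kb.get_candidates(mention)
    return candidates[0][0] if candidates else None


def accuracy(kb, gold):
    """Fraction of (mention, gold entity ID) pairs the baseline gets right."""
    if not gold:
        return 0.0
    correct = sum(1 for mention, gold_id in gold
                  if link_by_prior(kb, mention) == gold_id)
    return correct / len(gold)


# Hypothetical placeholder IDs and made-up priors, not real WikiData values.
kb = ToyKnowledgeBase()
kb.add_entity("ID_TENNIS", "a tennis player named Emerson")
kb.add_entity("ID_WRITER", "a writer named Emerson")
kb.add_entity("ID_FOOTBALL", "a footballer named Emerson")
kb.add_alias("Emerson", ["ID_TENNIS", "ID_WRITER", "ID_FOOTBALL"], [0.5, 0.3, 0.2])

# Made-up held-out annotations: (mention text, correct entity ID).
held_out = [
    ("Emerson", "ID_TENNIS"),
    ("Emerson", "ID_WRITER"),
    ("Emerson", "ID_FOOTBALL"),
    ("Emerson", "ID_TENNIS"),
]

print(link_by_prior(kb, "Emerson"))
print(accuracy(kb, held_out))
```

Because this baseline ignores the surrounding sentence, it returns the same identifier for every “Emerson”. That limitation is the reason for training an Entity Linking component on annotated examples, so that context can help choose between the candidates.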

Hope you have fun implementing Entity Linking with spaCy!

→ Video: YouTube

→ Code (spaCy v2): GitHub

→ Code (spaCy v3): GitHub

→ Blog post: LinkedIn

Training a custom Entity Linking model with spaCy