
Named Entity Recognition with Gilda

Named entity recognition (NER) is the process of identifying spans of text that correspond to concepts. Often, named entity recognition is paired with named entity normalization (NEN) to identify ontology terms for the spans of text. In this tutorial, we show how to use Gilda to do named entity recognition.

Interactive NER

To introduce grounding, we use the web-based deployment of Gilda at http://grounding.indra.bio/ner. Type the sentence or paragraph you want to ground into the "Text" field. The following example uses the sentence "Calcium is released from the ER".

Using the Gilda web form for annotation

The results show that two entities were identified: calcium and ER. Gilda implements a simple dictionary-based named entity recognition algorithm. It is fast, it performs named entity normalization as part of the same step, and it can disambiguate matches using the whole input text as context.

Results when using the Gilda web form for annotation
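To make the idea of dictionary-based NER concrete, the following is a minimal standard-library sketch of the general technique, not Gilda's actual implementation. The lexicon, the function name annotate_with_lexicon, and the longest-match-first strategy are assumptions made for illustration; the two identifiers are the ones reported later in this tutorial. Unlike Gilda, the sketch does no scoring and no context-based disambiguation.

```python
import re

# Hypothetical mini-lexicon: lowercased surface form -> (CURIE, name).
# Illustrative only; Gilda's real index is far larger and also scores matches.
LEXICON = {
    "calcium": ("CHEBI:29108", "calcium(2+)"),
    "er": ("GO:0005783", "endoplasmic reticulum"),
    "endoplasmic reticulum": ("GO:0005783", "endoplasmic reticulum"),
}
# Longest lexicon entry, measured in words, bounds the spans we try.
MAX_WORDS = max(len(key.split()) for key in LEXICON)


def annotate_with_lexicon(text):
    """Return (start, end, matched_text, curie, name) for each lexicon hit."""
    # Tokenize into words while remembering character offsets.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    annotations = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so multi-word terms win.
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            start, end = tokens[i][1], tokens[i + n - 1][2]
            key = " ".join(tok for tok, _, _ in tokens[i:i + n]).lower()
            if key in LEXICON:
                curie, name = LEXICON[key]
                annotations.append((start, end, text[start:end], curie, name))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no entry starts at this token
    return annotations


hits = annotate_with_lexicon("Calcium is released from the ER")
for hit in hits:
    print(hit)
```

Because the lookup key is the normalized span text, recognition and normalization happen in a single dictionary lookup, which is why this style of NER is fast.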

Programmatic NER

Gilda can be installed with pip install gilda and exposes a high-level programmatic interface similar to the web interface. The sentence "Calcium is released from the ER" can be annotated in the same way as before:

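from gilda.ner import annotate

text = "Calcium is released from the ER"
# Assumes annotate() yields (text, scored_match, start, end) tuples,
# as in this tutorial.
results = annotate(text)

# Flatten each annotation into (start, end, matched text, CURIE, name).
rows = [
    (
        start,
        end,
        span_text,
        scored_match.term.db + ":" + scored_match.term.id,
        scored_match.term.entry_name,
    )
    # span_text avoids shadowing the input variable `text`
    for span_text, scored_match, start, end in results
]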
start  end  text     curie        name
0      7    Calcium  CHEBI:29108  calcium(2+)
29     31   ER       GO:0005783   endoplasmic reticulum
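The tutorial does not show how rows is turned into the table above. One possible way, sketched here with only the standard library, is to pad each column to its widest cell. The rows are copied from the table above so that the snippet runs without Gilda; the header tuple and the two-space column gap are arbitrary choices.

```python
# Rows copied from the output table above; in practice they would come
# from the list comprehension over `results`.
rows = [
    (0, 7, "Calcium", "CHEBI:29108", "calcium(2+)"),
    (29, 31, "ER", "GO:0005783", "endoplasmic reticulum"),
]
header = ("start", "end", "text", "curie", "name")

# Convert every cell to a string and compute the width of each column.
table = [header] + [tuple(str(cell) for cell in row) for row in rows]
widths = [max(len(line[i]) for line in table) for i in range(len(header))]

# Left-justify each cell to its column width and join with two spaces.
lines = [
    "  ".join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
    for line in table
]
print("\n".join(lines))
```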

Custom Index

A custom index (a "grounder" object), which exposes all of the functionality demonstrated above, can be created using PyOBO:

from gilda.ner import annotate
from pyobo.gilda_utils import get_grounder

grounder = get_grounder(["mesh", "cvx"])
text = "Calcium is released from the ER"
results = annotate(text, grounder=grounder)

A custom index can also be created directly by instantiating gilda.Term objects and passing them to a gilda.Grounder object.