Named Entity Recognition with Gilda

Named entity recognition (NER) is the process of identifying spans of text that correspond to concepts. NER is often paired with named entity normalization (NEN), which maps each identified span to an ontology term. In this tutorial, we show how to use Gilda to do named entity recognition.

Interactive NER

To introduce NER interactively, we use the web-based deployment of Gilda at http://grounding.indra.bio/ner. First, type the sentence or paragraph you want to annotate into the "Text" field. In the following example, we use the sentence "Calcium is released from the ER".

Using the Gilda web form for annotation

The results show that two entities were identified: calcium and ER. Gilda implements a simple dictionary-based named entity recognition algorithm. It is fast, it performs named entity normalization in the same pass, and it can disambiguate entities using the whole input text as context.

Results when using the Gilda web form for annotation
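To make the idea of dictionary-based NER with normalization concrete, the following is a minimal toy sketch. It is not Gilda's implementation: the lexicon, the `toy_annotate` function, and the longest-match strategy are all assumptions made for illustration, and the two identifiers are simply the ones shown in this tutorial's example. It does no scoring or disambiguation.

```python
import re

# Hypothetical toy lexicon: normalized surface form -> (CURIE, standard name).
# The identifiers mirror the ones shown in this tutorial's example.
LEXICON = {
    "calcium": ("CHEBI:29108", "calcium(2+)"),
    "er": ("GO:0005783", "endoplasmic reticulum"),
    "endoplasmic reticulum": ("GO:0005783", "endoplasmic reticulum"),
}
MAX_WORDS = max(len(key.split()) for key in LEXICON)


def toy_annotate(text):
    """Return (start, end, span, curie, name) tuples via longest-match lookup."""
    # Character offsets of every word-like token in the input.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    results = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            span = text[start:end]
            key = " ".join(span.lower().split())
            if key in LEXICON:
                curie, name = LEXICON[key]
                results.append((start, end, span, curie, name))
                i += n  # skip past the matched tokens
                break
        else:
            i += 1  # no match starts at this token
    return results


annotations = toy_annotate("Calcium is released from the ER")
for row in annotations:
    print(*row, sep="\t")
```

Because the lookup key is a normalized form of the span, recognition and normalization happen in a single step, which is the property the paragraph above attributes to dictionary-based approaches.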

Programmatic NER

Gilda can be installed with pip install gilda and exposes a high-level programmatic interface that mirrors the web interface. The sentence "Calcium is released from the ER" can be annotated in the same way as before:

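from gilda.ner import annotate

text = "Calcium is released from the ER"
results = annotate(text)

# Each result holds the matched span, its scored match, and character offsets.
rows = [
    (
        start,
        end,
        span,  # named "span" so it does not shadow the input "text"
        f"{scored_match.term.db}:{scored_match.term.id}",  # CURIE
        scored_match.term.entry_name,
    )
    for span, scored_match, start, end in results
]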

start	end	text	curie	name
0	7	Calcium	CHEBI:29108	calcium(2+)
29	31	ER	GO:0005783	endoplasmic reticulum
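In the table above, the start and end columns read as character offsets into the input string, with the end offset exclusive. The short sketch below checks this for the two rows shown and writes them out as tab-separated values. The rows are copied by hand from the table rather than produced by Gilda, and the use of the standard library csv module is just one convenient choice.

```python
import csv
import sys

text = "Calcium is released from the ER"

# Rows copied from the result table: (start, end, text, curie, name).
rows = [
    (0, 7, "Calcium", "CHEBI:29108", "calcium(2+)"),
    (29, 31, "ER", "GO:0005783", "endoplasmic reticulum"),
]

# Each (start, end) pair slices the matched span back out of the input.
spans = [text[start:end] for start, end, *_ in rows]

# Write the rows as tab-separated values with a header line.
writer = csv.writer(sys.stdout, delimiter="\t", lineterminator="\n")
writer.writerow(("start", "end", "text", "curie", "name"))
writer.writerows(rows)
```

Keeping the offsets alongside the matched text makes it straightforward to highlight annotations in the original sentence later on.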

Custom Index

A custom "grounder" object backed by a different index, which exposes all the previously demonstrated functionality, can be created using PyOBO:

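from gilda.ner import annotate
from pyobo.gilda_utils import get_grounder

# Build a grounder whose index covers only the MeSH and CVX vocabularies.
grounder = get_grounder(["mesh", "cvx"])
text = "Calcium is released from the ER"
# Annotate with the custom grounder instead of Gilda's default index.
results = annotate(text, grounder=grounder)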

Alternatively, a custom index can be built by hand: instantiate gilda.Term objects for the entries you need, then pass them to a gilda.Grounder object.