Synonym Validation

Basic validation

The same synonym cannot be an exact synonym of two distinct concepts

  • Implemented as the duplicate_exact_synonym check in ROBOT report.
  • The most fundamental of all synonym checks: if the synonym is exact, the assumption is that no two terms can have the same synonym.

Warning

Some exact synonyms are not globally unique. For example, the acronym "ASD" is an exact synonym of both "Atrial septal defect" and "Autism Spectrum Disorder".
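
To make the rule concrete, here is a minimal Python sketch of the idea behind a duplicate exact synonym check. It is not the ROBOT implementation: the term IDs (`EX:0001`, `EX:0002`), the input structure and the function name are assumptions made for illustration, and strings are compared verbatim.

```python
from collections import defaultdict

# Hypothetical (term ID, exact synonym) pairs; not taken from a real ontology.
exact_synonyms = [
    ("EX:0001", "atrial septal defect"),
    ("EX:0001", "ASD"),
    ("EX:0002", "autism spectrum disorder"),
    ("EX:0002", "ASD"),
]


def duplicate_exact_synonyms(pairs):
    """Return {synonym: sorted term IDs} for synonyms attached to 2+ terms."""
    terms_by_synonym = defaultdict(set)
    for term_id, synonym in pairs:
        terms_by_synonym[synonym].add(term_id)
    # Keep only synonyms that are shared by more than one term.
    return {
        synonym: sorted(term_ids)
        for synonym, term_ids in terms_by_synonym.items()
        if len(term_ids) > 1
    }


duplicates = duplicate_exact_synonyms(exact_synonyms)
print(duplicates)
```

With this sample data the acronym "ASD" is reported for both terms, which is exactly the kind of acceptable clash the warning above describes.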

The same synonym cannot be duplicated with a different scope

  • An entity has duplicate synonyms (the same exact string value, like "Depression") with different properties (e.g. broad and related). This causes ambiguity and considerable confusion among downstream users.
  • This test is implemented as the Duplicate Scoped Synonyms check in ROBOT report.
  • Unfortunately, quasi scope-duplicated synonyms cannot always be recognised easily. In the example below, you can see 3 synonyms that look almost the same but have different scopes. Better tools are needed to recognise such cases.

image
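
The scope check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ROBOT query: the term IDs, the scope names used as plain strings and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical (term ID, scope, synonym) triples for illustration only.
synonyms = [
    ("EX:0001", "broad", "Depression"),
    ("EX:0001", "related", "Depression"),
    ("EX:0001", "exact", "depressive disorder"),
    ("EX:0002", "narrow", "Depression"),
]


def duplicate_scoped_synonyms(triples):
    """Return {(term, synonym): sorted scopes} where a term uses 2+ scopes for one string."""
    scopes_by_key = defaultdict(set)
    for term_id, scope, synonym in triples:
        scopes_by_key[(term_id, synonym)].add(scope)
    return {
        key: sorted(scopes)
        for key, scopes in scopes_by_key.items()
        if len(scopes) > 1
    }


conflicts = duplicate_scoped_synonyms(synonyms)
print(conflicts)
```

Because the comparison is on the exact string, near-identical synonyms that differ in spelling or punctuation are not caught, which is the limitation noted in the last bullet above.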

The same synonym cannot be an exact synonym and a label at the same time

  • Implemented as the duplicate_label_synonym check in ROBOT report.
  • This check has a long history of controversy and confusion. In general, we cannot expect this assumption to hold in all cases for various reasons:
    • For historical reasons, many ontologies avoid attaching synonym metadata and provenance to the primary label of a class. For example, the Mondo ontology captures the preferred labels of various major nomenclature organisations. Instead of recording on the primary label which organisations prefer it, these preferences are captured as "exact synonyms", even though the two often coincide.
    • It is often considered more convenient to be able to expect all exact synonyms to be available via oboInOwl:hasExactSynonym, without requiring downstream users to know that exact synonyms are scattered across multiple properties (such as rdfs:label).
    • No matter whether you agree or disagree with the above, as an ontology user you should not assume that a label is never repeated as an exact synonym.
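
For downstream code, one defensive option is to gather names from both rdfs:label and oboInOwl:hasExactSynonym and drop repeats, so the result is the same whether or not an ontology repeats the label as an exact synonym. The sketch below assumes a hypothetical record layout (the `label` and `exact_synonyms` keys) and hypothetical term IDs.

```python
# Hypothetical term records; the keys "label" and "exact_synonyms" are assumptions.
terms = {
    "EX:0001": {
        "label": "atrial septal defect",
        "exact_synonyms": ["atrial septal defect", "ASD"],
    },
    "EX:0002": {
        "label": "autism spectrum disorder",
        "exact_synonyms": ["ASD"],
    },
}


def exact_names(term):
    """Return the label followed by the exact synonyms, without repeated strings."""
    names = [term["label"], *term["exact_synonyms"]]
    # dict.fromkeys keeps the first occurrence of each string and preserves order.
    return list(dict.fromkeys(names))


for term_id, term in terms.items():
    print(term_id, exact_names(term))
```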

Synonym types must be a child of Synonym Type Property

  • A synonym type is used in an annotation, but is not properly declared as a child of oboInOwl:SynonymTypeProperty. This can cause problems with conversions to OBO format.
  • For example, if you add your own synonym type, like hp:abbreviation, it has to be a child of oboInOwl:SynonymTypeProperty to be correctly interpreted by ROBOT and, more generally, by OWL API-based tooling.
  • Implemented as the Missing Synonym Type Declaration check in ROBOT report.
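
The logic of this check amounts to a set difference between the synonym types that are used and those that are declared. The Python sketch below illustrates that idea only; it is not the ROBOT query, and the input sets, the `ex:declared_type` identifier and the function name are hypothetical.

```python
SYNONYM_TYPE_PROPERTY = "oboInOwl:SynonymTypeProperty"

# Hypothetical (child property, parent property) declarations.
subproperty_declarations = {
    ("ex:declared_type", SYNONYM_TYPE_PROPERTY),
    ("ex:unrelated_property", "ex:some_other_parent"),
}

# Hypothetical synonym types found on synonym annotations.
used_synonym_types = {"ex:declared_type", "hp:abbreviation"}


def undeclared_synonym_types(used, declarations):
    """Return the used synonym types that are not children of SynonymTypeProperty."""
    declared = {
        child
        for child, parent in declarations
        if parent == SYNONYM_TYPE_PROPERTY
    }
    return sorted(used - declared)


missing = undeclared_synonym_types(used_synonym_types, subproperty_declarations)
print(missing)
```

In this sample, hp:abbreviation is used but never declared, so it is the one type reported.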

Advanced validation

Duplicate exact synonym check that excludes abbreviations

  • In Mondo, this SPARQL query checks for duplicate exact synonyms between terms but excludes any abbreviations.
  • For example, "SMS" is an abbreviation for both MONDO:0008491 stiff-person syndrome and MONDO:0008434 Smith-Magenis syndrome, and this is acceptable.
  • Implemented as qc-duplicate-exact-synonym-no-abbrev.sparql in Mondo (see below).

??? Query qc-duplicate-exact-synonym-no-abbrev.sparql

```
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?entity ?property ?value WHERE {
  VALUES ?property1 {
    obo:IAO_0000118
    oboInOwl:hasExactSynonym
    rdfs:label
  }
  VALUES ?property2 {
    obo:IAO_0000118
    oboInOwl:hasExactSynonym
    rdfs:label
  }
  ?entity1 ?property1 ?value.
  ?entity2 ?property2 ?value .

  FILTER NOT EXISTS {
    ?axiom owl:annotatedSource ?entity1 ;
         owl:annotatedProperty ?property1 ;
         owl:annotatedTarget ?value ;
         oboInOwl:hasSynonymType <http://purl.obolibrary.org/obo/mondo#ABBREVIATION> .
  }

  FILTER NOT EXISTS {
    ?axiom owl:annotatedSource ?entity2 ;
         owl:annotatedProperty ?property2 ;
         owl:annotatedTarget ?value ;
         oboInOwl:hasSynonymType <http://purl.obolibrary.org/obo/mondo#ABBREVIATION> .
  }

  FILTER NOT EXISTS { ?entity1 owl:deprecated true }
  FILTER NOT EXISTS { ?entity2 owl:deprecated true }
  FILTER (?entity1 != ?entity2)
  FILTER (!isBlank(?entity1))
  FILTER (!isBlank(?entity2))
  BIND(CONCAT(CONCAT(REPLACE(str(?entity1),"http://purl.obolibrary.org/obo/MONDO_","MONDO:"),"-"), REPLACE(str(?entity2),"http://purl.obolibrary.org/obo/MONDO_","MONDO:")) as ?entity)
  BIND(CONCAT(CONCAT(REPLACE(REPLACE(str(?property1),"http://www.w3.org/2000/01/rdf-schema#","rdfs:"),"http://www.geneontology.org/formats/oboInOwl#","oboInOwl:"),"-"), REPLACE(REPLACE(str(?property2),"http://www.w3.org/2000/01/rdf-schema#","rdfs:"),"http://www.geneontology.org/formats/oboInOwl#","oboInOwl:")) as ?property)
}
ORDER BY DESC(UCASE(str(?value)))
```

Exact Synonyms/Non-exact Mappings

  • In Mondo, this SPARQL query checks for an exact synonym that is attributed to a database cross-reference (dbxref) whose mapping to the Mondo term is not exact. If the dbxref is equivalent to the Mondo term, the synonyms from that term should be added as exact synonyms.
  • This use case is very specific to Mondo, as dbxrefs in Mondo carry equivalence mappings (in the form of MONDO:equivalentTo). The issue that prompted it arose in a merger: DOID:5603 was added as an equivalent dbxref, but the synonyms 'T-cell acute lymphoblastic leukemia' and 'precursor T lymphoblastic leukemia' were related synonyms. They were changed to exact and the QC check passed.
  • Implemented as qc-exact-synonyms-non-exact-mappings.sparql in Mondo.
  • See the Pull Request here where the QC check failed.

    image

??? Query qc-exact-synonyms-non-exact-mappings.sparql

```
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX MONDO: <http://purl.obolibrary.org/obo/MONDO_>

SELECT DISTINCT ?entity ?label ?xref ?synonym ?code
WHERE {

  VALUES ?code {
    "MONDO:relatedTo"^^xsd:string
    "MONDO:mondoIsNarrowerThanSource"^^xsd:string
    "MONDO:directSiblingOf"^^xsd:string
    "MONDO:mondoIsBroaderThanSource"^^xsd:string
  }

  ?entity rdfs:subClassOf* MONDO:0000001 .
  ?entity rdfs:label ?label .

  ?entity oboInOwl:hasDbXref ?xref .
  [
    owl:annotatedSource ?entity ;
    owl:annotatedProperty oboInOwl:hasDbXref ;
    owl:annotatedTarget ?xref ;
    oboInOwl:source ?code
  ] .

  ?entity oboInOwl:hasExactSynonym ?synonym .
  [
    owl:annotatedSource ?entity ;
    owl:annotatedProperty oboInOwl:hasExactSynonym ;
    owl:annotatedTarget ?synonym ;
    oboInOwl:hasDbXref ?xref
  ] .

}
```