Tutorial: How to add custom quality checks with ODK
This tutorial explains how to add quality checks that are not included in the ROBOT Report.
Prerequisites
You have completed the tutorials:
Custom Quality Checks
- Identify a quality issue in your ontology. For the sake of this tutorial, we've added the annotation dcterms:date to the root_node in the CAT Ontology.
- Write the SPARQL query to detect the error you want to check. For example, check the value type of the annotation dcterms:date. The query below returns every class whose dcterms:date annotation is not of type xsd:dateTime.
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Return every class whose dcterms:date value is not typed as xsd:dateTime
SELECT ?cls WHERE
{
  ?cls dcterms:date ?date .
  FILTER(DATATYPE(?date) != xsd:dateTime)
}
- Save the SPARQL query in the src/sparql folder and name it [violation name]-violation.sparql. In the case of this tutorial, the file is date-as-string-violation.sparql.
- Add the check to the ODK config file. In the previous tutorial, this is located at ~/cato/src/ontology/cato-odk.yaml. Inside robot_report, add custom_sparql_checks:
robot_report:
  use_labels: TRUE
  fail_on: ERROR
  report_on:
    - edit
  custom_sparql_checks:
    - date-as-string
- Update the repository. After adding the custom SPARQL check, you need to update your pipeline so that it includes this check when testing the ontology. In ODK, this is done with sh run.sh make update_repo.
- Test the check. You can run the checks and verify the expected result.
sh run.sh make sparql_test
FAIL Rule ../sparql/date-as-string-violation.sparql: 1 violation(s)
cls
http://purl.obolibrary.org/obo/CATO_0000000
- Fix the issue. In this case, change the value of the dcterms:date annotation to type xsd:dateTime, and run the test again to confirm that the check passes this time.
- Push the changes to your repository, and the custom checks will run whenever a new Pull Request is created, as detailed here.
Custom checks available in ODK
There are several checks already available in the ODK. To enable one, add its validation name under custom_sparql_checks in your ODK config file.
- owldef-self-reference: verify if a term refers to itself in its own equivalence definition
- redundant-subClassOf: verify if there are redundant subClassOf relations among three classes
- taxon-range: verify if the annotations present_in_taxon or never_in_taxon always use classes from NCBITaxon
- iri-range: verify if the value for the annotations never_in_taxon, present_in_taxon, foaf:depicted_by, oboInOwl:inSubset and dcterms:contributor is not an IRI
- iri-range-advanced: same as iri-range, plus a check for the rdfs:seeAlso annotation
- label-with-iri: verify if there is an IRI in the label
- multiple-replaced_by: verify if an obsolete term has multiple replaced_by terms
- term-tracker-uri: verify if the value for the annotation term_tracker_item is not a URI
- illegal-date: verify if the values for the annotations dcterms:date, dcterms:issued and dcterms:created are of type xsd:date and use the pattern YYYY-MM-DD
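To give a feel for how a check like these can be expressed, here is a minimal sketch of a query in the spirit of multiple-replaced_by. It is not the query shipped with the ODK. It assumes that obsolete terms are marked with owl:deprecated true and that replacements are recorded with the annotation IAO_0100001 (term replaced by).

```sparql
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

# Illustrative sketch only, not the ODK's own query.
# Assumption: obsolete terms carry owl:deprecated true and use
# IAO_0100001 (term replaced by) to point at their replacements.
SELECT ?cls (COUNT(DISTINCT ?replacement) AS ?replacements) WHERE
{
  ?cls owl:deprecated true ;
       obo:IAO_0100001 ?replacement .
}
GROUP BY ?cls
# Keep only the obsolete terms that have more than one replacement
HAVING (COUNT(DISTINCT ?replacement) > 1)
```

Each row returned is an obsolete term together with the number of distinct replacements it points to.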
Custom ROBOT Report in ODK
The ROBOT report can also include custom quality checks.
- First, add custom_profile: TRUE to the ODK config file.
robot_report:
  use_labels: TRUE
  fail_on: ERROR
  custom_profile: TRUE
  report_on:
    - edit
  custom_sparql_checks:
    - date-as-string
- Save your SPARQL query in src/sparql. There is no restriction on the file name. However, the query should return the variables ?entity ?property ?value.
- Add the path to the SPARQL query in the src/ontology/profile.txt file.
- Test your check. You'll find the failed cases in the same file as the ROBOT report, at src/ontology/reports/cato-edit.owl-obo-report.tsv. The Rule Name will be the SPARQL file name.
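As a sketch of what such a query could look like, the date check from the first section can be rewritten to return the three variables the report expects. The use of VALUES, DISTINCT and ORDER BY here is a stylistic assumption, not a requirement stated in this tutorial.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Same check as date-as-string-violation.sparql, reshaped so that it
# returns ?entity ?property ?value for the ROBOT report.
SELECT DISTINCT ?entity ?property ?value WHERE
{
  # Bind the property explicitly so that it appears in the report row
  VALUES ?property { dcterms:date }
  ?entity ?property ?value .
  FILTER(DATATYPE(?value) != xsd:dateTime)
}
ORDER BY ?entity
```

With this shape, each violation appears in the report with the offending entity, the annotation property and the value that has the wrong datatype.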
How to choose between Custom SPARQL or Custom ROBOT report
- If your test can return exactly the three variables entity, property and value -> ROBOT report
- If you need to return more detailed information -> Custom SPARQL
- If you want the results of your custom tests in the ROBOT report file -> ROBOT report
Keep in mind that after changing profile.txt, the file will no longer receive upcoming updates automatically, and you will need to update it manually.
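To illustrate the "more detailed information" case, here is a minimal sketch of a custom SPARQL check that returns more than three columns. It extends the date check from this tutorial; the extra ?label and ?datatype columns are illustrative additions and assume that classes are labelled with rdfs:label.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Custom SPARQL check with extra detail: the class, its label (if any),
# the offending value and the datatype that value actually has.
SELECT ?cls ?label ?date ?datatype WHERE
{
  ?cls dcterms:date ?date .
  OPTIONAL { ?cls rdfs:label ?label }
  BIND(DATATYPE(?date) AS ?datatype)
  FILTER(?datatype != xsd:dateTime)
}
```

A query like this does not fit the entity, property and value shape of the ROBOT report, so it belongs in custom_sparql_checks instead.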