
sssom-java's Introduction

Damien Goutte-Gattat, PhD (he/him)

In no particular order:

  • 🧑‍🔬 Biocurator and ontology editor at FlyBase
  • 🧑‍💻 Free software developer
  • 🧑‍💻 Slackware user
  • 🥷 Digital privacy advocate
  • 🇫🇷 French

sssom-java's Issues

Filtering on (non-)empty values

The SSSOM/T filters should allow selecting records that do not have a value for a given field.

For example, mapping_tool=="" should select the records that do not have a mapping_tool value. Conversely, !mapping_tool=="" should select the records that do have a mapping_tool value (whatever that value is).
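As a rough illustration of the intended semantics (not the SSSOM-Java implementation), the following Python sketch treats each record as a dictionary. The helper names `is_empty` and `select` are hypothetical, the sample records are made up, and treating a missing key, `None`, and an empty string alike is an assumption.

```python
def is_empty(record, field):
    """Return True if the record has no usable value for the field.

    Assumption: a missing key, None, and "" all count as "no value".
    """
    value = record.get(field)
    return value is None or value == ""


def select(records, field, want_empty):
    """Keep records whose field is empty (want_empty=True) or non-empty (False)."""
    return [r for r in records if is_empty(r, field) == want_empty]


# Made-up records for illustration only.
records = [
    {"subject_id": "EX:1", "mapping_tool": "some-tool"},
    {"subject_id": "EX:2", "mapping_tool": ""},
    {"subject_id": "EX:3"},
]

without_tool = select(records, "mapping_tool", want_empty=True)   # like mapping_tool==""
with_tool = select(records, "mapping_tool", want_empty=False)     # like the negated form
```

Here `without_tool` holds the records EX:2 and EX:3, and `with_tool` holds only EX:1.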

Improve test coverage

Several aspects of SSSOM-Java are poorly covered by the existing test suite, notably:

  • most of SSSOM/Transform (core concepts, parsing, SSSOM/T-OWL application);
  • most of SSSOM-CLI.

I’m planning some refactoring of the SSSOM/Transform code, so it needs to be better covered by tests first.

Extend expansion of SSSOM fields in SSSOM/T values

Currently, SSSOM/T-OWL allows the use of a few placeholders in function arguments:

  • %subject_id,
  • %subject_label,
  • %subject_curie,
  • and likewise for %object_*.

This should be generalised in two directions:

  • this should be done at the level of SSSOM/T directly, independently of any application;
  • it should be possible to reference most, if not all, SSSOM fields (e.g. %mapping_justification, %comment, etc.).

The syntax should also be improved:

  • use a form like %{field_name} rather than %field_name (%subject_id and the other existing placeholders listed above can still be supported for backwards compatibility);
  • for shortening of an identifier field, use a syntax like %{subject_id|short} (again, %subject_curie can still be supported for backwards compatibility);
  • allow references to non-standard slots (probably by referring to their property IRI, in the form %{http://example.org/fooProperty}).
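The proposed %{field_name} and %{field_name|short} forms could be expanded along the following lines. This is a minimal Python sketch, not the actual SSSOM/T code: the names `expand`, `shorten` and `PREFIXES` are hypothetical, a mapping is modelled as a plain dictionary, and expanding an unknown field to an empty string is an assumption.

```python
import re

# Hypothetical prefix map used to shorten IRIs into CURIEs.
PREFIXES = {"UBERON": "http://purl.obolibrary.org/obo/UBERON_"}

# Matches %{name} or %{name|modifier}; the name may itself be an IRI.
PLACEHOLDER = re.compile(r"%\{([^}|]+)(?:\|([^}]+))?\}")


def shorten(iri):
    """Return a CURIE if a known prefix matches, else the IRI unchanged."""
    for prefix, base in PREFIXES.items():
        if iri.startswith(base):
            return prefix + ":" + iri[len(base):]
    return iri


def expand(template, mapping):
    """Replace %{field} and %{field|short} with values taken from the mapping."""
    def substitute(match):
        field, modifier = match.group(1), match.group(2)
        value = str(mapping.get(field, ""))  # assumption: unknown field -> ""
        if modifier == "short":
            value = shorten(value)
        return value
    return PLACEHOLDER.sub(substitute, template)


# Illustrative mapping; the non-standard slot is keyed by its property IRI.
sample = {
    "subject_id": "http://purl.obolibrary.org/obo/UBERON_0000992",
    "comment": "checked",
    "http://example.org/fooProperty": "bar",
}
```

With this sample, `expand("%{subject_id|short}", sample)` yields `UBERON:0000992`, while `expand("%{subject_id}", sample)` leaves the full IRI in place.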

Generic transformations of SSSOM mappings

The following is a rough idea for some kind of SSSOM-to-OWL transformation language, which would allow to specify how to generate arbitrary OWL axioms from SSSOM mappings.

By “arbitrary”, I mean in particular that the generated axioms could completely differ from the “default” OWL axioms described in the SSSOM spec, to fit any purpose beyond merely serialising a mapping set into OWL.

The primary use case would be the production of cross-species bridges for UBERON, but the point of the transformation language is that it would be completely generic, so that it could be used for the generation of any other kind of bridges.

A transformation would be specified as a pair selector action, where selector selects the mappings to which the action will be applied.

Selectors

The simplest selector is *, which selects all mappings.

Other selectors are of the form field=value, where field is an SSSOM field. For example, predicate_id=skos:exactMatch will select all mappings with a predicate skos:exactMatch. For convenience, for identifier fields the _id suffix may be omitted, so one could use predicate=skos:exactMatch for the same effect.

A * wildcard may be used in the value. For example, subject=UBERON:* would select all mappings where the subject is a Uberon class; subject=UBERON:6* would further select all mappings where the subject is a Uberon class of the 6xxxxxx series. When a wildcard is present, it MUST be the last character of the value; that is, it is not possible to use semapv:crossSpecies*Match.

A selector can select on more than one field. For example, object=FBbt:* mapping_justification=semapv:ManualCuration would select all manually curated mappings where the object is a FBbt class.
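The selector rules described so far (the * selector, field=value pairs, the optional _id suffix, and the trailing wildcard) can be sketched as follows. This is only an illustration in Python under stated assumptions: `parse_selector`, `matches` and `ID_FIELDS` are hypothetical names, values are assumed to contain no whitespace, and the sample mapping is made up.

```python
# Assumed set of identifier fields for which the _id suffix may be omitted.
ID_FIELDS = {"subject", "predicate", "object"}


def parse_selector(text):
    """Parse 'field=value field=value ...' (or '*') into (field, value) pairs."""
    if text.strip() == "*":
        return []  # no constraint: every mapping matches
    pairs = []
    for token in text.split():
        field, value = token.split("=", 1)
        if field in ID_FIELDS:
            field += "_id"  # predicate=... is shorthand for predicate_id=...
        if "*" in value[:-1]:
            raise ValueError("wildcard must be the last character: " + value)
        pairs.append((field, value))
    return pairs


def matches(pairs, mapping):
    """Return True if the mapping satisfies every (field, value) constraint."""
    for field, value in pairs:
        actual = mapping.get(field, "")
        if value.endswith("*"):
            if not actual.startswith(value[:-1]):
                return False
        elif actual != value:
            return False
    return True


# Made-up mapping for illustration only.
sample = {
    "subject_id": "UBERON:6000004",
    "predicate_id": "skos:exactMatch",
    "object_id": "FBbt:00000004",
    "mapping_justification": "semapv:ManualCuration",
}
```

For instance, `matches(parse_selector("subject=UBERON:6*"), sample)` is true, whereas parsing `predicate=semapv:crossSpecies*Match` raises a `ValueError` because the wildcard is not the last character.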

Syntactic sugar: Selectors could be nested as follows:

predicate=skos:exactMatch {
    object=FBbt:* ACTION_FOR_FLYBASE_MAPPING;
    object=WBbt:* ACTION_FOR_WORMBASE_MAPPING;
}

The above would be equivalent to:

predicate=skos:exactMatch object=FBbt:* ACTION_FOR_FLYBASE_MAPPING;
predicate=skos:exactMatch object=WBbt:* ACTION_FOR_WORMBASE_MAPPING;
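The equivalence between the nested and the flat forms amounts to prefixing each inner selector with the outer one. A small Python sketch of that expansion, assuming the nested rules have already been parsed into (selector, action) pairs where the action is either a string or a list of nested pairs (the `flatten` helper and this data layout are hypothetical):

```python
def flatten(prefix, rules):
    """Expand nested rules into flat 'selector action;' strings.

    Each rule is (selector, action), where action is either a string
    or a list of nested rules that inherit the enclosing selector.
    """
    flat = []
    for selector, action in rules:
        combined = (prefix + " " + selector).strip()
        if isinstance(action, list):
            flat.extend(flatten(combined, action))
        else:
            flat.append(combined + " " + action + ";")
    return flat


nested = [
    ("predicate=skos:exactMatch", [
        ("object=FBbt:*", "ACTION_FOR_FLYBASE_MAPPING"),
        ("object=WBbt:*", "ACTION_FOR_WORMBASE_MAPPING"),
    ]),
]

flat_rules = flatten("", nested)
```

`flat_rules` then contains the two flat rules shown above, in the same order.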

Actions

An “action” would be either a single function call (action()) or a block containing several function calls ({ action1(); action2(); }).

For now and at least in the context of the ROBOT plugin, the main function available would be a function to generate OWL axioms (hereafter named generate). For example, the following action would generate a cross-reference annotation on the subject pointing to the object:

generate(owl:AnnotationAssertion($subject, oboInOwl:hasDbXref, $object))

$subject and $object would be built-in variables always accessible in the context of an action, and would refer to, well, the subject and the object of the mapping being transformed, respectively.

Example

The following full example would generate the kind of cross-species bridges we use in Uberon:

subject=UBERON:* predicate=semapv:crossSpeciesExactMatch object=FBbt:* {
    # generate the equivalence axiom ($object EquivalentTo $subject and (part_of some NCBITaxon:7227))
    generate(owl:EquivalentClasses($object, owl:ObjectIntersectionOf($subject, owl:ObjectSomeValuesOf(BFO:0000050, NCBITaxon:7227))));

    # generate the "OBO Foundry unique label" (IAO:0000589) on the taxon-specific term
    # by appending " (Drosophila)" to its original label
    # (the `X->Y` syntax gets the annotation Y on object X; `concat` is a built-in string concatenation function)
    generate(owl:AnnotationAssertion($object, IAO:0000589, concat($object->rdfs:label, " (Drosophila)")));
}

Re-think mapping-dependent variables

Mapping-dependent variables are currently only supported for SSSOM/T-OWL and are declared in header functions:

set_var("NAME", VALUE1);
set_var("NAME", VALUE2, "CONDITION");

where CONDITION identifies the mapping(s) for which the variable NAME will have the value VALUE2.

Such variables should be handled at the level of SSSOM/T directly, independent of any application. Using them should rely on the same mechanism as the expansion of SSSOM field placeholders (#3) (e.g. %{NAME|short} should expand to a shortened-if-possible form of the value of the variable NAME).

The syntax should also be more natural, with something like:

predicate=* -> set_var("NAME", VALUE1);

to set the default value of the variable NAME, and

CONDITION -> set_var("NAME", VALUE2);

to set the value of that variable for mappings that match the CONDITION (where CONDITION is a standard SSSOM/T filter).
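One way to picture how such rules might be resolved for a given mapping is sketched below in Python. Nothing here reflects the actual SSSOM/T implementation: `VariableTable` is a hypothetical class, conditions are modelled as plain Python callables rather than parsed filters, and the rule that the last matching declaration wins is an assumption.

```python
class VariableTable:
    """Hold per-variable rules; each rule is a (condition, value) pair."""

    def __init__(self):
        self.rules = {}

    def set_var(self, name, value, condition=None):
        """Register a value; condition=None means it applies to all mappings."""
        self.rules.setdefault(name, []).append((condition, value))

    def resolve(self, name, mapping):
        """Return the value of the last rule whose condition accepts the mapping.

        Assumption: later declarations override earlier ones.
        Returns None if no rule applies.
        """
        result = None
        for condition, value in self.rules.get(name, []):
            if condition is None or condition(mapping):
                result = value
        return result


table = VariableTable()
table.set_var("NAME", "VALUE1")  # default value
table.set_var("NAME", "VALUE2",
              lambda m: m.get("predicate_id") == "skos:exactMatch")
```

With this table, a mapping whose predicate is skos:exactMatch resolves NAME to VALUE2, and any other mapping falls back to VALUE1.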

One complication is that the primary (and only, in fact) use case of set_var so far is to set a variable for mappings where the subject is a descendant of a given term:

set_var("TAXREL", BFO:0000050);
set_var("TAXREL", BFO:0000066, "%subject_id is_a UBERON:0000104");
set_var("TAXREL", BFO:0000066, "%subject_id is_a UBERON:0000105");

The TAXREL variable should have a default value of BFO:0000050, and the value BFO:0000066 if the subject is UBERON:0000104, UBERON:0000105, or any descendant of either term.

There is currently no filter expression to allow filtering on whether the subject (or object) of a mapping belongs to a given hierarchy, so we would need one. Something like:

subject>=UBERON:0000104 || subject>=UBERON:0000105 -> set_var("TAXREL", BFO:0000066);

where the >= operator would mean “the subject ID is either equal to the operand, or equal to the ID of any descendant of the operand”.
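The semantics proposed for >= can be sketched as a walk up the class hierarchy. The Python below is illustrative only: `is_self_or_descendant` and `taxrel` are hypothetical helpers, the hierarchy is given as a dictionary from a term to its direct parents (considering is_a links only is an assumption), and the EX: terms are made up.

```python
def is_self_or_descendant(term, ancestor, parents):
    """Return True if term equals ancestor or reaches it via parent links.

    `parents` maps a term ID to the list of its direct parents.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue  # guard against cycles and repeated visits
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def taxrel(subject_id, parents):
    """Pick the TAXREL value for a mapping, following the rules described above."""
    for root in ("UBERON:0000104", "UBERON:0000105"):
        if is_self_or_descendant(subject_id, root, parents):
            return "BFO:0000066"
    return "BFO:0000050"


# Made-up hierarchy: EX:2 is_a EX:1 is_a UBERON:0000105; EX:3 is unrelated.
hierarchy = {
    "EX:1": ["UBERON:0000105"],
    "EX:2": ["EX:1"],
    "EX:3": ["EX:other"],
}
```

Under this toy hierarchy, `taxrel("EX:2", hierarchy)` gives BFO:0000066 and `taxrel("EX:3", hierarchy)` gives the default BFO:0000050.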

Filling a mapping set with missing labels from an ontology

We need a way to add missing subject and object labels in a mapping set by extracting them from a reference ontology.

That is, if we have the following mapping:

subject_id      subject_label   predicate_id      object_id        object_label
FBbt:00004865   ovary           skos:exactMatch   UBERON:0000992   

where the object_label is empty, we should be able to extract the label for UBERON:0000992 from an ontology file.

This could be done either with another ROBOT command in the sssom plugin, or within the sssom-cli tool.
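Whichever tool ends up hosting it, the core operation is a lookup from identifier to label. A minimal Python sketch, assuming the labels have already been extracted from the ontology into a dictionary (the `fill_missing_labels` helper is hypothetical, and the label in the lookup table is assumed for illustration rather than read from Uberon):

```python
def fill_missing_labels(mappings, labels):
    """Fill empty subject_label/object_label values from an ID-to-label lookup.

    Existing labels are left untouched; IDs absent from the lookup stay empty.
    The mappings are modified in place and also returned.
    """
    for mapping in mappings:
        for side in ("subject", "object"):
            if not mapping.get(side + "_label"):
                label = labels.get(mapping.get(side + "_id"))
                if label is not None:
                    mapping[side + "_label"] = label
    return mappings


# Stand-in for labels extracted from an ontology file (value assumed).
labels = {"UBERON:0000992": "ovary"}

mappings = [{
    "subject_id": "FBbt:00004865",
    "subject_label": "ovary",
    "predicate_id": "skos:exactMatch",
    "object_id": "UBERON:0000992",
    "object_label": "",
}]

fill_missing_labels(mappings, labels)
```

After the call, the object_label of the example mapping is filled from the lookup table, while the existing subject_label is left as it was.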
