
Comments (7)

mcourtot commented on August 26, 2024

I think there are 2 distinct cases here:

(1) Problem with the file size: if this is the main driver (the immediate use case), then arbitrarily splitting an OWL file into multiple files shouldn't be an issue (assuming we don't intend for each OWL file to be a valid ontology). I would call such an option "split". I'm not even sure we'd need to consider an axiom-based split, because there is little chance that such a split has any biological meaning, and I can't think of an advantage to putting specific types of axioms into specific files. Also, there is no guarantee that an axiom-based split would not itself exceed the allowable size. Re different types of users: the resources I am familiar with have a set of editors who do it all. I can imagine cases where we'd want an extract per user type (such as TermGenie and template/freeforms), but this seems a rarer case, maybe not motivating at this stage?

(2) Having meaningful subsets of the ontology: this seems very similar to what Ontodog does at release time. For editing purposes, it is akin to the old way of managing branches in OBI, which came with quite a lot of pain: there were issues with selecting the right file in Protege and with making sure axioms were stored in the correct file, and it was very hard to do an update across files.

If the current use case is OBO, is it possible to split the OBO and then do the OWL conversion? Or is it better to convert the whole OBO and then do the split?

from robot.

cmungall commented on August 26, 2024

"If the current use case is OBO, is it possible to split the OBO and then do the OWL conversion? Or is it better to convert the whole OBO and then do the split?"

I think the end result is the same? When I say the use case is OBO, I mean projects that manage source in OBO. These groups are happy with a relatively compact and VCS-friendly source, and only hit large-file issues at release time for the RDF/XML OWL version. If a project manages its source in a W3C OWL format and the ontology is large, it may be forced to modularize at source time, prior to release time. For example, if GO migrates to GitHub and switches its source to OWL, then we will be forced to modularize at source time.


cmungall commented on August 26, 2024

I'm happy with split as the name, it has a good precedent.

An axiom-based split has many advantages over a random split, e.g. less VCS churn and (partly) more interpretable diffs.


cmungall commented on August 26, 2024

'decomposition' is also a term used, e.g. in the OWL/ZIP paper

That's a bit overloaded for us (we often use it in the sense of class composition/decomposition), so I suggest avoiding this term.

It would be nice if this option could output an OWL/ZIP file or similar; see owlcs/owlapi#375 (comment)


cmungall commented on August 26, 2024

Command: split

Core Options:

  • by-subject-source SS
  • by-axiom-type AT
  • by-target-source TS
  • by-object-property OP
  • by-annotation-property AP

These are not mutually exclusive

The resulting ontology IRI and filename are determined by a concatenation of these:

BASE{-SS}{-AT}{-OP}{-AP}{-TS}.owl

Where braces indicate optionality.
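
As a rough illustration of the naming pattern above, the following Python sketch joins the base name with whichever optional keys are set, in the order SS, AT, OP, AP, TS. The function name `split_filename` and the example values are hypothetical and not part of robot.

```python
from typing import Optional


def split_filename(base: str,
                   ss: Optional[str] = None,
                   at: Optional[str] = None,
                   op: Optional[str] = None,
                   ap: Optional[str] = None,
                   ts: Optional[str] = None) -> str:
    """Build BASE{-SS}{-AT}{-OP}{-AP}{-TS}.owl, skipping unset parts."""
    # Keep the fixed order from the pattern; drop any key that was not given.
    parts = [base] + [p for p in (ss, at, op, ap, ts) if p]
    return "-".join(parts) + ".owl"


# Hypothetical example values, for illustration only.
print(split_filename("go", at="SubClassOf"))
print(split_filename("go", ss="cl", op="part_of", ts="uberon"))
```

Because the keys are optional, two different combinations could in principle produce the same name (for example, if an SS value and an AT value happen to be identical strings), so a real implementation would likely need distinct vocabularies or prefixes per key.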

The set of all split ontologies must exhaustively and mutually exclusively cover the input ontology.

Only certain axioms should go into a P partition: those that use P, rather than those that are about P.
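
The cover requirement above can be checked mechanically. The sketch below is a toy model, not robot or OWLAPI code: axioms are plain tuples, the input is assumed to contain no duplicate axioms, and the names `partition` and `is_exact_cover` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Axiom = Tuple[str, str, str]  # (axiom_type, subject, object): a toy stand-in


def partition(axioms: Iterable[Axiom],
              key: Callable[[Axiom], Hashable]) -> Dict[Hashable, List[Axiom]]:
    """Assign each axiom to exactly one bucket chosen by key(axiom)."""
    buckets: Dict[Hashable, List[Axiom]] = defaultdict(list)
    for ax in axioms:
        buckets[key(ax)].append(ax)
    return dict(buckets)


def is_exact_cover(axioms: Iterable[Axiom],
                   buckets: Dict[Hashable, List[Axiom]]) -> bool:
    """True if no axiom appears twice and the union equals the input."""
    original = set(axioms)
    seen = set()
    for members in buckets.values():
        for ax in members:
            if ax in seen:       # same axiom placed twice: not mutually exclusive
                return False
            seen.add(ax)
    return seen == original      # nothing lost, nothing invented


# Hypothetical example axioms.
axioms = [
    ("SubClassOf", "GO:1", "GO:2"),
    ("SubClassOf", "GO:3", "GO:2"),
    ("AnnotationAssertion", "GO:1", "label one"),
]
by_type = partition(axioms, key=lambda ax: ax[0])
print(sorted(by_type))
print(is_exact_cover(axioms, by_type))
```

Any split driven by a single key function is an exact cover by construction; the check becomes useful once several options are combined or when rules such as "uses P" versus "is about P" decide placement.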

Does this feel appropriate for robot? An argument can be made that this is both too niche and overloads too much customization into options, and that this is best done using project-specific code (perhaps as a robot plugin?)


sesuncedu commented on August 26, 2024

Can you clarify "subject source" and "target source"?

Also, when splitting by axiom type, beware of Declaration. RDF formats require declarations for parsing (without them, HasKey, for example, becomes ambiguous), so unless explicitly disabled, OWLAPI will add any missing declarations back in.

Declaration axioms are basically useless otherwise, although they do provide a place to mistakenly put annotations (the annotations apply to the axiom, not the entity), so they may require fixes to achieve true uselessness.
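
The Declaration caveat above can be illustrated with a toy model. This is only a sketch, not OWLAPI code: axioms are plain tuples, and the hypothetical `with_declarations` function imitates a serializer that adds a Declaration for every entity mentioned in a file. Under that assumption, two files that started out disjoint end up sharing Declaration axioms once saved.

```python
from typing import Set, Tuple

Axiom = Tuple[str, ...]  # toy stand-in, e.g. ("SubClassOf", "A", "B")


def with_declarations(axioms: Set[Axiom]) -> Set[Axiom]:
    """Imitate a serializer that declares every entity used in the file."""
    entities = {term
                for ax in axioms if ax[0] != "Declaration"
                for term in ax[1:]}
    return axioms | {("Declaration", e) for e in entities}


# Two hypothetical partitions produced by an axiom-type split.
subclass_file = {("SubClassOf", "A", "B")}
disjoint_file = {("DisjointClasses", "A", "C")}

saved_subclass = with_declarations(subclass_file)
saved_disjoint = with_declarations(disjoint_file)

# The in-memory partitions share nothing, but both saved files declare A.
print(subclass_file & disjoint_file)
print(saved_subclass & saved_disjoint)
```

A split that has to round-trip would therefore need either to leave Declaration axioms out of the exclusivity check or to disable their re-insertion on save.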

e.g. "The OWL Full/DL Gap in the Field", Nicolas Matentzoglu, Bijan Parsia, pp. 49-60: http://ceur-ws.org/Vol-1265/owled2014_submission_9.pdf


cmungall commented on August 26, 2024

I figured we could piggyback off the axiom-annotation-to-RDF mapping in the spec to define source/target, but that may not be so simple.

I mostly had subClassOf-SomeValuesFrom and equivalentTo-intersectionOf patterns in mind, but even within these subsets we need to cover GCIs, and for intersections there would be multiple targets.

A better idea would be to abandon TS and for SS specify this as being the ontology in which the axiom is declared (or rdfs:isDefinedBy)

We still need some suitably fragile rules for contracting the ontology IRIs such that they can be embedded in other IRIs. This feels like a project-specific kind of rule. Perhaps split should be simplified to work only by axiom type.

