Comments (16)
Yes, but container images are not the only remote environment supported by WMSs. For example, StreamFlow can offload each step of a workflow to a different environment (a cloud VM, a bare-metal node, an HPC queue manager, a local Docker, etc.). Plus, with the next release it will be possible to stack them (e.g., a SLURM queue manager over an SSH-connected node, a Singularity container over a queue manager over SSH, etc.).
I fear that the pure schema.org ontology cannot represent these things effectively enough. I think that if we want to capture these scenarios we have to move to external ontologies. One example is the very recent GAIA-X Ontology. However, it is more provider-oriented than consumer-oriented. Indeed, the link between the resource and a software consumer seems to be missing.
from workflow-run-crate.
This could be a resolved Conda environment (conda export), a resolved Docker image #9 or others. Need sub-types per environment.
Renske has started modelling this for CWLprov.
One complication here is how many stack levels to represent. E.g. a Conda environment running in a Docker container running on Kubernetes running on OpenStack running on an HPC cluster.
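To make the stacking question concrete, the example above can be pictured as an ordered list of layers, innermost first. This is only a rough Python sketch for discussion: the `Layer` class, its fields, and the `kind` labels are hypothetical names, not part of schema.org, RO-Crate, or any workflow-run-crate draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layer:
    """One level of the execution stack (hypothetical model, for discussion only)."""
    kind: str                         # e.g. "conda", "docker", "kubernetes"
    identifier: Optional[str] = None  # image name, env file, cluster name, ...


def describe(stack: List[Layer], max_depth: Optional[int] = None) -> str:
    """Render the stack innermost-first, optionally truncated to max_depth levels."""
    shown = stack if max_depth is None else stack[:max_depth]
    return " on ".join(layer.kind for layer in shown)


# The example from the comment above, innermost layer first.
stack = [
    Layer("conda"),
    Layer("docker"),
    Layer("kubernetes"),
    Layer("openstack"),
    Layer("hpc-cluster"),
]

full = describe(stack)                        # every level
software_only = describe(stack, max_depth=2)  # only the two innermost levels
```

With a model like this, "how many levels to represent" becomes a choice of `max_depth`, which a profile could fix or leave open.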
In https://github.com/osoc-es/c2t#ontology-diagram we addressed the basic representation of packages in a container (Docker). But this may be out of the scope of what we intend to do here.
I suggest allowing a pointer either to the file that creates the container/environment, or to the id and registry where the container is stored.
Also, the command used to invoke the container or create the environment is quite useful.
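The two suggestions above (point to the recipe file, or to the id plus registry, and keep the creation command) could be captured in a record like the one below. This is a sketch under assumptions: `EnvironmentRef` and all of its field names are hypothetical, and the file name, registry, and image id are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvironmentRef:
    """Hypothetical pointer to an execution environment."""
    recipe_file: Optional[str] = None     # file that creates the environment
    registry: Optional[str] = None        # where a prebuilt image is stored
    image_id: Optional[str] = None        # id of the image in that registry
    create_command: Optional[str] = None  # command used to create or invoke it

    def is_resolvable(self) -> bool:
        """True if there is either a recipe file or a registry plus image id."""
        has_recipe = self.recipe_file is not None
        has_image = self.registry is not None and self.image_id is not None
        return has_recipe or has_image


# Placeholder values, for illustration only.
from_recipe = EnvironmentRef(
    recipe_file="environment.yml",
    create_command="conda env create -f environment.yml",
)
from_registry = EnvironmentRef(
    registry="registry.example.org",
    image_id="tools/aligner:1.0",
)
incomplete = EnvironmentRef(image_id="tools/aligner:1.0")  # registry missing
```

Whether the two forms should be alternatives or allowed together is left open here.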
It's a bit confusing because we have #9 specifically for Docker images. I guess this is more general, maybe it should be split into multiple ones depending on the level of abstraction (minus Docker, since we already have #9).
Maybe #9 and this issue should be merged. It is necessary to reference container images; the discussion in #9 has so far been dedicated to this. CQ1/#9 is concerned with compiling a list of all images used by the run. That list can be compiled by collecting the images from all workflow execution steps as described by the spec part which will come out of this issue.
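As a rough illustration of that last point, the run-level list can be derived from per-step records. The sketch assumes each step is a plain dictionary with an optional `image` key; the step names and image references are invented for the example and do not come from any spec.

```python
from typing import Dict, List

# Hypothetical per-step records; names and image references are made up.
steps: List[Dict[str, str]] = [
    {"name": "align", "image": "example.org/tools/aligner:1.0"},
    {"name": "sort", "image": "example.org/tools/sorter:2.3"},
    {"name": "index", "image": "example.org/tools/sorter:2.3"},
    {"name": "report"},  # step executed without a container
]


def images_used(step_records: List[Dict[str, str]]) -> List[str]:
    """Return the distinct images referenced by the steps, in first-seen order."""
    seen: Dict[str, None] = {}  # dict keys preserve insertion order
    for step in step_records:
        image = step.get("image")
        if image is not None:
            seen.setdefault(image, None)
    return list(seen)


run_images = images_used(steps)
```

Under this view the CQ1 answer is a derived value, so only the per-step reference needs to be specified.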
Ok. I think we're mixing two orthogonal things though. On the one hand we have the environment in which the process is executed, while on the other we have the method through which the compute infrastructure is accessed to get resources and instantiate the environment.
The first is the essential part. For that, the container image, Conda environment, or indeed VM image seems sufficient. For the latter, it seems debatable to me whether it is even interesting enough to be captured in most cases.
Taking containers as an example, my point is that generally I would not care much whether you executed a container over SSH, over a batch queue or over k8s; I still only need the container image and the command to reproduce what you did.
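A small sketch of that argument: if a step record keeps the image, the command, and (optionally) how the infrastructure was reached, the command needed to re-run it locally depends only on the first two. `ContainerStep`, its `access` field, and the example values are hypothetical; `docker run --rm` is used simply as one familiar way to replay a container.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContainerStep:
    """Hypothetical record of one containerised step."""
    image: str          # container image reference
    command: List[str]  # command executed inside the container
    access: str         # how the infrastructure was reached (ssh, queue, k8s, ...)


def replay_command(step: ContainerStep) -> str:
    """Build a local `docker run` line; the access method is deliberately ignored."""
    return shlex.join(["docker", "run", "--rm", step.image, *step.command])


# Same image and command, different access methods (placeholder values).
via_ssh = ContainerStep(
    "example.org/tools/aligner:1.0", ["align", "--in", "reads.fq"], access="ssh"
)
via_k8s = ContainerStep(
    "example.org/tools/aligner:1.0", ["align", "--in", "reads.fq"], access="kubernetes"
)

same_replay = replay_command(via_ssh) == replay_command(via_k8s)
```

The `access` field could still be recorded for provenance, but nothing in the replay path reads it.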