wfexs-backend's People

Contributors

acivico, astrojuanlu, dcl10, github-actions[bot], jmfernandez, lrodrin, paulaidt, stain

wfexs-backend's Issues

Add support to `ga4ghdos` CURIE

The Data Object Service standard allows using a common identifier to locate resources that are replicated across several cloud services, as described at https://registry.identifiers.org/registry/ga4ghdos . For instance, ga4ghdos:dg.4503/01b048d0-e128-4cb0-94e9-b2d2cab7563d can be queried as

https://dataguids.org/ga4gh/dos/v1/dataobjects/dg.4503/01b048d0-e128-4cb0-94e9-b2d2cab7563d

In the obtained JSON, the urls section contains the links to the different replicas of the dataset, which can be FTP, HTTP(S), S3 or Google Cloud URIs.
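The CURIE-to-endpoint mapping can be sketched as follows (the dataguids.org resolver host is taken from the example above; the helper name is hypothetical, not an existing WfExS function):

```python
def dos_url_for(curie: str) -> str:
    # Map a ga4ghdos CURIE to the DOS dataobjects endpoint shown above.
    prefix, object_id = curie.split(":", 1)
    if prefix != "ga4ghdos":
        raise ValueError("not a ga4ghdos CURIE: " + curie)
    return "https://dataguids.org/ga4gh/dos/v1/dataobjects/" + object_id
```

Fetching that URL and reading the 'urls' section of the JSON answer would then yield the replica links.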

Cannot download content from ftp

Dear WfExS team,
I was testing WfExS on my local WSL2/Ubuntu.
Setting up the core and further dependencies in a conda environment worked without any trouble.
However, while running the test workflow
python3 WfExS-backend.py execute -W tests/wetlab2variations_execution_nxf_secure.wfex.stage
I got the following error:


[ERROR] Cannot download content from ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140407_D00360_0017_BH947YADXX/Project_RM8398/Sample_U5c/U5c_CCGTCC_L001_R1_001.fastq.gz to 42be63ef9b0fc7d80d09513bfd3fa42b2288fd9b (while processing LicensedURI(uri='ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140407_D00360_0017_BH947YADXX/Project_RM8398/Sample_U5c/U5c_CCGTCC_L001_R1_001.fastq.gz', licences=('https://choosealicense.com/no-permission/',), attributions=[], secContext=None)) (temp file /tmp/wfexsivum2b3rtmpcache/wf-inputs/caching-5f6ef9b7-b9b8-4f40-b38e-9ac854ef5ec3): can only concatenate str (not "NoneType") to str
Traceback (most recent call last):
  File "WfExS-backend.py", line 445, in <module>
    main()
  File "WfExS-backend.py", line 429, in main
    wfInstance.stageWorkDir()
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/workflow.py", line 1027, in stageWorkDir
    self.materializeInputs()
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/workflow.py", line 809, in materializeInputs
    theParams, numInputs = self.fetchInputs(
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/workflow.py", line 1008, in fetchInputs
    newInputsAndParams, lastInput = self.fetchInputs(inputs,
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/workflow.py", line 932, in fetchInputs
    matContent = self.wfexs.downloadContent(
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/wfexs_backend.py", line 980, in downloadContent
    inputKind, cachedFilename, metadata_array, cachedLicences = self.cacheHandler.fetch(remote_file, workflowInputs_destdir, offline, ignoreCache, registerInCache, secContext)
  File "/home/valentin/wfexs/WfExS-backend/wfexs_backend/cache_handler.py", line 549, in fetch
    raise CacheHandlerException(errmsg) from nested_exception
wfexs_backend.cache_handler.CacheHandlerException: Cannot download content from ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140407_D00360_0017_BH947YADXX/Project_RM8398/Sample_U5c/U5c_CCGTCC_L001_R1_001.fastq.gz to 42be63ef9b0fc7d80d09513bfd3fa42b2288fd9b (while processing LicensedURI(uri='ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140407_D00360_0017_BH947YADXX/Project_RM8398/Sample_U5c/U5c_CCGTCC_L001_R1_001.fastq.gz', licences=('https://choosealicense.com/no-permission/',), attributions=[], secContext=None)) (temp file /tmp/wfexsivum2b3rtmpcache/wf-inputs/caching-5f6ef9b7-b9b8-4f40-b38e-9ac854ef5ec3): can only concatenate str (not "NoneType") to str

No VPN was active, nor anything else that could have prevented the fastq from downloading.

wget ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140407_D00360_0017_BH947YADXX/Project_RM8398/Sample_U5c/U5c_CCGTCC_L001_R1_001.fastq.gz
worked, though.
Do you have any idea how to solve this?
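As a side note, the trailing `can only concatenate str (not "NoneType") to str` looks like a secondary bug in how the error message itself is built, which masks the real FTP failure. A hypothetical helper (this is not the actual WfExS code) shows the difference:

```python
def build_errmsg(uri: str, reason) -> str:
    # str.format tolerates a None reason, whereas naive '+' concatenation
    # ("... : " + reason) raises the TypeError seen at the end of the log
    return "Cannot download content from {}: {}".format(uri, reason)
```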

Warn about `scrypt` crypt4gh keys

The crypt4gh library can generate and use keys based on different algorithms. One of them is scrypt, which depends on very specific OpenSSL features available when the Python interpreter was compiled.

https://github.com/EGA-archive/crypt4gh/blob/2ba98a7cea96e8fb337b17310cc1a226ad3b3e65/crypt4gh/keys/kdf.py#L29-L43

As the availability of this algorithm depends heavily on the OpenSSL version, WfExS-backend should:

  1. Emit a warning whenever the failure conditions are met: OpenSSL < 1.1.0 and a key generated with scrypt.
  2. Always generate new keys using a different algorithm, like bcrypt, which is not so sensitive to the OpenSSL version the Python interpreter was compiled against.
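Point 1 could start from a runtime check like this sketch (hashlib.scrypt is only present when the interpreter was built against OpenSSL >= 1.1.0; the function name is hypothetical):

```python
import hashlib
import ssl

def scrypt_keys_may_fail() -> bool:
    # Warn when the interpreter cannot handle scrypt-based crypt4gh keys:
    # either hashlib lacks scrypt entirely, or OpenSSL predates 1.1.0
    return not hasattr(hashlib, "scrypt") or ssl.OPENSSL_VERSION_INFO[:2] < (1, 1)
```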

Can't execute workflows using podman

Description

Using stage, I can stage a workflow with podman. However, running the workflow with staged-workdir offline-exec I get the following error:

ERROR Workflow error:
Docker is not available for this tool, try --no-container to disable Docker, or install a user space Docker replacement like uDocker with --user-space-docker-cmd.: Docker image hutchstack/rquest-omop-worker:next not found

Fiddling with the code on a fork, I found that adding --no-container or --user-space-docker-cmd isn't compatible with --podman.

In cwl_engine.py I found that commenting out the --disable-pull line seemed to fix the problem, and the workflow runs as expected. However, I guess --disable-pull is there for a good reason. Could something be preventing WfExS from looking where the podman image is saved for the staged image?

Bug in path resolution in local config file

Description

When running the following command: WfExS-backend/WfExS-backend.py -L local-config.yml execute -W test-stage.yml I got the following error message:

schema_salad.exceptions.ValidationException: Not found: '/root//root/wfexs-backend-test_WorkDir/efb98299-cb1f-48f8-862e-7a8746bba1a4/workflow/workflows/sec-hutchx86.cwl'

The path resolution appears to have prepended an additional /root/ to the path from local-config.yml (see below). When I changed workDir to ./wfexs-backend-test_WorkDir, the execution proceeded as expected and I saw this in the logging output:

materialized workflow repository (checkout 6d500ca1396283faae2ce5eebf778500dd8be2da): /root/wfexs-backend-test_WorkDir/f51c9984-8e43-49fa-a03b-8e683e884980/workflow

The path resolves as expected if I run WfExS from /root.
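One plausible mechanism for the doubled prefix, sketched with plain Python (variable names are illustrative, not WfExS internals): naive string concatenation duplicates an absolute prefix, while os.path.join discards the first component when the second is already absolute.

```python
import os

# Illustrative values matching the error message above
workdir = "/root/wfexs-backend-test_WorkDir"

# Naive concatenation produces the doubled prefix seen in the error
bad = "/root/" + workdir

# os.path.join drops '/root' entirely because the second path is absolute
good = os.path.join("/root", workdir)
```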

Local config file

cacheDir: $HOME/wfexs-backend-test
crypt4gh:
  key: local_config.yaml.key
  passphrase: strive backyard dividing gumball
  pub: local_config.yaml.pub
tools:
  containerType: podman
  dockerCommand: docker
  podmanCommand: podman
  encrypted_fs:
    command: encfs
    type: encfs
  engineMode: local
  gitCommand: git
  javaCommand: java
  singularityCommand: singularity
  staticBashCommand: bash-linux-x86_64
workDir: $HOME/wfexs-backend-test_WorkDir

Stage file

workflow_id: https://raw.githubusercontent.com/HDRUK/hutch/main/workflows/sec-hutchx86.cwl
workflow_config:
  container: 'podman'
  secure: false
nickname: 'vas-workflow'
cacheDir: /tmp/wfexszn6siq2jtmpcache
crypt4gh:
  key: cosifer_test1_cwl.wfex.stage.key
  passphrase: mpel nite ified g
  pub: cosifer_test1_cwl.wfex.stage.pub
outputs:
  output_file:
    c-l-a-s-s: File
    glob: "output.json"
params:
  body:
    c-l-a-s-s: File
    url:
      - https://raw.githubusercontent.com/HDRUK/hutch/main/workflows/inputs/rquest-query.json
  is_availability: true
  db_host: "localhost"
  db_name: "hutch"
  db_user: "postgres"
  db_password: "example"

Record the licence of the workflow in RO-Crate

When a workflow is fetched from a git repository, or from an RO-Crate pointing to a repository, the licence file of the workflow repository should be included in the generated RO-Crates, if it exists.

WfExS-backend init issues

WfExS-backend init should create valid YAML configuration files when the --cache-dir parameter is provided. It should also validate already existing configuration files against the corresponding JSON Schema.

An example of the bad behaviour:

(.pyWEenv) jmfernandez@pavonis[14]:~/projects/WfExS-backend> python WfExS-backend.py --cache-dir /tmp/gorrito -L prueba2.yaml init
[WARNING] Configuration file prueba2.yaml does not exist
[WARNING] Cache directory not defined. Created a temporary one at /tmp/wfexsrkoltayctmpcache
2024-01-31 10:54:02,182 - [WARNING] [WARNING] Installation key file /home/jmfernandez/projects/WfExS-backend/prueba2.yaml.key does not exist
2024-01-31 10:54:02,182 - [WARNING] [WARNING] Installation pub file /home/jmfernandez/projects/WfExS-backend/prueba2.yaml.pub does not exist
* Storing updated configuration at prueba2.yaml
(.pyWEenv) jmfernandez@pavonis[15]:~/projects/WfExS-backend> cat prueba2.yaml
cache-directory: /tmp/gorrito
cacheDir: /tmp/wfexsrkoltayctmpcache
crypt4gh:
  key: prueba2.yaml.key
  passphrase: ndcart ndredth ndline elling
  pub: prueba2.yaml.pub
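A minimal sketch of the requested validation, assuming a hypothetical set of allowed top-level keys (the real JSON Schema in the repository is authoritative): flag unknown entries such as the cache-directory key that init silently wrote above.

```python
# Hypothetical allowed keys; the actual JSON Schema defines the real set
KNOWN_TOP_LEVEL_KEYS = {"cacheDir", "crypt4gh", "tools", "workDir"}

def unknown_keys(config: dict) -> set:
    # Return top-level keys that the schema would reject
    return set(config) - KNOWN_TOP_LEVEL_KEYS
```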

PermissionError: [Errno 13] Permission denied: '/home/ansible/wfexs-backend-test/wf-cache'

Description

When staging a workflow for the first time as a non-sudo/root user I get the error in the title. Oddly, deleting the cache dir and re-running the stage command seems to fix the issue, as does adding write permissions with chown. I'm not sure whether it has anything to do with the calls to os.makedirs or a umask issue.

Here's something I was reading about the problem; not sure if it will be helpful: https://stackoverflow.com/questions/5231901/permission-problems-when-creating-a-dir-with-os-makedirs-in-python/67723702#67723702
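If it is indeed a umask interaction, one mitigation is pinning the directory mode explicitly while the cache tree is created; a sketch (not the actual WfExS code, function name hypothetical):

```python
import os

def make_cache_dir(path: str) -> None:
    # os.makedirs lets the process umask mask the requested mode; clearing
    # the umask around the call guarantees readable/traversable parents,
    # avoiding the Errno 13 on a later re-run
    old_umask = os.umask(0)
    try:
        os.makedirs(path, mode=0o755, exist_ok=True)
    finally:
        os.umask(old_umask)
```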

Parsing of Nextflow DSL2 workflows

Right now, the Nextflow workflow source is parsed in order to learn the needed containers. The approach is not foolproof, as a container declaration can depend on variables, and in the case of DSL2 workflows the declarations can be spread over several files.

So, at the very least, all the (sub)workflow files involved need to be parsed.
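A naive multi-file scan can be sketched with a regex (hypothetical helper; it still misses declarations built from variables, as noted above):

```python
import re
from pathlib import Path

# Matches simple literal declarations like: container 'ubuntu:20.04'
CONTAINER_RE = re.compile(r"container\s+['\"]([^'\"]+)['\"]")

def find_containers(workflow_dir: str) -> set:
    # Scan every .nf file under the workflow directory, including
    # DSL2 (sub)workflow modules, for literal container declarations
    found = set()
    for nf_file in Path(workflow_dir).rglob("*.nf"):
        found.update(CONTAINER_RE.findall(nf_file.read_text()))
    return found
```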

Publish a major release with DOI

Now that we have a CITATION.cff (as of #13), we have to publish a major release with a DOI generated by Zenodo, in order to add that DOI to CITATION.cff.

That major release should be triggered by a major event.

Add metadata related to fetched URIs

Right now WfExS does not keep a correspondence between URLs and downloaded files, as the filenames are hashes generated from the URL. But there are several scenarios where additional upstream metadata is available, and future cases where a single URL corresponds to a collection of files. As an example of the latter, an ENCODE experiment id or an EGA dataset id corresponds to more than one file, possibly each with its own download URL.

So, there should be an intermediate metadata layer where these correspondences and the upstream metadata are kept. After this change, cached files should be named after the sha256 of their content, and URIs should translate to JSON files named after the hash of the URI, containing the correspondences to cached files and their origins.

Last, but not least, upstream metadata should be gathered and preserved in the execution provenance.
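The proposed naming scheme can be sketched as follows (a sketch of the issue's design, not existing code; helper names are hypothetical):

```python
import hashlib

def uri_metadata_name(uri: str) -> str:
    # Metadata JSON files are named after the hash of the URI they describe
    return hashlib.sha256(uri.encode("utf-8")).hexdigest() + ".json"

def content_cache_name(content: bytes) -> str:
    # Cached payloads are named after the sha256 of their own content,
    # so many URIs can point at one deduplicated file
    return "sha256-" + hashlib.sha256(content).hexdigest()
```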

Add several checks in the code to detect containers unavailable for the current hardware architecture

Thanks to the tests from @dcl10, some issues have been uncovered related to workflows which depend on container images that are not available for the current processor architecture.

A way to reproduce the chain of issues is trying to execute the cosifer workflow, which depends on a single container prepared for the x86_64 / amd64 architecture, on a different architecture like Linux arm64.

The cosifer "toy" workflow uses a single custom container which is only available for x86_64. WfExS-backend tries to materialize the container by itself, and most probably does so wrongly despite the architecture mismatch, when it should have complained before even trying to run cwltool. So, when cwltool tries to run the container, it surely fails because either the previously materialized container is for the wrong architecture, or cwltool is unable to fetch any container suitable for the task. cwltool then returns an empty description of its outputs, which is deserialized to None instead of a dictionary, and the code fails trying to access the key "class" because None is not a dictionary.

Also, the caching directory should have a container images directory per supported architecture, so it can hold cached versions for both x86_64 and arm64, in case the caching directory is used in a heterogeneous HPC environment.
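The early architecture check itself is cheap; a sketch (the alias table is illustrative and incomplete, and the function is hypothetical):

```python
import platform

# Normalize the two common naming families (uname-style vs Docker-style)
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64",
                "aarch64": "arm64", "arm64": "arm64"}

def container_matches_host(image_arch: str) -> bool:
    # Compare a container image's declared architecture against the host's,
    # so WfExS could complain before cwltool is even launched
    host = ARCH_ALIASES.get(platform.machine(), platform.machine())
    return ARCH_ALIASES.get(image_arch, image_arch) == host
```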

Secrets/secret inputs

Background

My team would like to use WfExS in a Trusted Research Environment (TRE) which has data sources that can't be exposed to the outside world. We anticipate that the environment will contain variables which must be kept secret (i.e. not in the output RO-Crate). In some cases, some inputs may also be sensitive, and we would like them not to be included in the output RO-Crate either.

Proposed Feature

For secret environment variables, would it be possible to add a section in the local config YAML file where we could put the variables as key-value pairs, and then have WfExS load these into the local environment at runtime? Then, during the creation of the RO-Crate, check for the secrets and exclude them from the crate and its metadata?

For secret inputs, would it be possible to add a boolean flag to the definition of an input to tell WfExS whether that input is secret or not? Then, similarly to the above, have it excluded from the crate and its metadata.
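The first half of the proposal could look like this sketch (the 'secrets' config section and the helper are hypothetical, not existing WfExS behaviour):

```python
import os

def load_secrets(local_config: dict) -> set:
    # Copy a hypothetical 'secrets' section of the local config into the
    # process environment at runtime, returning the loaded names so the
    # RO-Crate builder can later exclude exactly those variables
    secret_names = set()
    for key, value in local_config.get("secrets", {}).items():
        os.environ[key] = str(value)
        secret_names.add(key)
    return secret_names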

Use pyinvoke and Fabric

The following example shows how to issue commands which can be run either remotely or locally, https://stackoverflow.com/a/55704170 , based on both the pyinvoke and Fabric libraries.

Past the 1.0 milestone, WfExS-backend is going to gain different non-raw execution scenarios, like in-container runs, runs as different users, remote runs through ssh, remote runs through a queue system (first monolithic, later spread) and remote runs through GA4GH TES and WES.

A way to seamlessly integrate this is to first transition to using both pyinvoke and Fabric, so local and ssh executions are seamlessly integrated, and then try to extend it to the other execution environments.
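Until that transition happens, the idea can be illustrated with a stdlib stand-in: a local runner exposing the same run() surface that invoke's Context and Fabric's Connection share, so call sites stay identical whether the command runs locally or over ssh (a sketch, not the actual planned API):

```python
import subprocess

class LocalRunner:
    # Stand-in for invoke's Context; a Fabric Connection("host") would
    # expose the same run() method for the remote case
    def run(self, cmd: str) -> str:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        return result.stdout
```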

Error while running WfExS using a local workflow file/directory

Description

I am running WfExS with the config files shown below. When I run WfExS-backend.py -L local-config.yml stage -W test-stage.yml I get the following error: NotADirectoryError: [Errno 20] Not a directory: '/root/wfexs-backend-test_WorkDir/47761fdd-f06f-4260-a1f3-7351265805b3/workflow'.

Looking at the path in the error message, it seems workflow is the file named in workflow_id in the stage file, whereas WfExS expects a directory there. I also tried putting a path to a directory in the workflow_id field, but that failed saying it couldn't work out which runner to use.

Traceback

Traceback (most recent call last):
  File "/root/WfExS-backend/WfExS-backend.py", line 21, in <module>
    main()
  File "/root/WfExS-backend/wfexs_backend/__main__.py", line 1122, in main
    stagedSetup = wfInstance.stageWorkDir()
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1985, in stageWorkDir
    self.materializeWorkflowAndContainers(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1233, in materializeWorkflowAndContainers
    self.setupEngine(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1191, in setupEngine
    self.fetchWorkflow(
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1152, in fetchWorkflow
    engineVer, candidateLocalWorkflow = engine.identifyWorkflow(
  File "/root/WfExS-backend/wfexs_backend/cwl_engine.py", line 316, in identifyWorkflow
    newLocalWf = self._enrichWorkflowDeps(newLocalWf, engineVer)
  File "/root/WfExS-backend/wfexs_backend/cwl_engine.py", line 542, in _enrichWorkflowDeps
    with subprocess.Popen(
  File "/usr/lib/python3.10/subprocess.py", line 969, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.10/subprocess.py", line 1845, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
NotADirectoryError: [Errno 20] Not a directory: '/root/wfexs-backend-test_WorkDir/47761fdd-f06f-4260-a1f3-7351265805b3/workflow'

Settings

Stage file

# test-stage.yml
workflow_id: file:///root/hutch/workflows/sec-hutchx86.cwl
workflow_config:
  container: 'docker'
  secure: false
nickname: 'vas-workflow'
cacheDir: /tmp/wfexszn6siq2jtmpcache
crypt4gh:
  key: cosifer_test1_cwl.wfex.stage.key
  passphrase: mpel nite ified g
  pub: cosifer_test1_cwl.wfex.stage.pub
outputs:
  output_file:
    c-l-a-s-s: File
    glob: "output.json"
params:
  body:
    c-l-a-s-s: File
    url:
      - https://raw.githubusercontent.com/HDRUK/hutch/main/workflows/inputs/rquest-query.json
  is_availability: true
  db_host: "localhost"
  db_name: "hutch"
  db_user: "postgres"
  db_password: "example"

Local config

# local-config.yml
cacheDir: ./wfexs-backend-test
crypt4gh:
  key: local_config.yaml.key
  passphrase: strive backyard dividing gumball
  pub: local_config.yaml.pub
tools:
  containerType: docker
  dockerCommand: docker
  encrypted_fs:
    command: encfs
    type: encfs
  engineMode: local
  gitCommand: git
  javaCommand: java
  singularityCommand: singularity
  staticBashCommand: bash-linux-x86_64
workDir: ./wfexs-backend-test_WorkDir

Add validation capabilities over fetched contents

Today I found a scenario where some content fetched over FTP was corrupted during the download process. There are several validation mechanisms which could be integrated into WfExS-backend:

  • When a file is a known compressed archive (tar, gz, bzip2, xz, zip), its integrity should be checked.
  • When a file is signed and a public signing key is available, check that the file was not tampered with.
  • Declaring a file to be fetched which contains MD5 or SHA1 sums, or signatures, of the fetched contents.
  • Declaring inline fields containing the MD5 or SHA1 sums of the fetched contents.
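The first bullet can be sketched for gzip with the standard library alone (helper name is illustrative; tar, bzip2, xz and zip have analogous stdlib checks):

```python
import gzip

def gzip_is_intact(path: str) -> bool:
    # Fully decompress the archive, discarding the data; a truncated or
    # corrupted download raises BadGzipFile/EOFError along the way
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(1 << 20):
                pass
        return True
    except (OSError, EOFError):
        return False
```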

Add support for swh permanent identifiers

Software Heritage swh permanent identifiers, described at https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#interoperability , should be supported by WfExS-backend, as they can be used in two different ways.

First, there are repos there which could contain workflows, so a method to fetch those workflows should be implemented.

Second, they provide a standardized way to compute a stable identifier for directories. Although there is an available implementation at https://pypi.org/project/swh.model/ , due to a licence collision (it is GPLv3) a reimplementation of the algorithm is needed.
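For the simplest object kind, contents, the identifier is the git blob hash of the raw bytes, so a clean-room reimplementation can start small (directory identifiers need the full tree algorithm; this sketch only covers swh:1:cnt):

```python
import hashlib

def swhid_for_content(data: bytes) -> str:
    # SWHID for a content object: sha1 over the git blob header plus payload
    digest = hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()
    return "swh:1:cnt:" + digest
```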

`dot` dependency should be optional

Right now, when a prospective RO-Crate is generated, dot is used to translate the workflow representation generated by the workflow engine into a PNG. When the command is not available or not properly installed, the generation of the RO-Crate fails.
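Making the dependency optional could start with a simple lookup, skipping the PNG rendering when graphviz's dot is absent (a sketch, not the actual WfExS code):

```python
import shutil

def dot_available() -> bool:
    # True only when a 'dot' executable is reachable on PATH
    return shutil.which("dot") is not None
```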

Add support for compact `drs` identifiers

As of https://ga4gh.github.io/data-repository-service-schemas/preview/release/drs-1.1.0/docs/#_appendix_compact_identifier_based_uris , drs URIs can come in compact form, which adds an additional level of indirection: where the DRS server lives must be resolved against either n2t.net or identifiers.org . The implementation added in 11d6873 does not consider this level of indirection, and is not able to tell whether it is dealing with a compact DRS URI or a hostname-based one.
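A first-pass discriminator could be a heuristic like the following (a sketch, deliberately not spec-complete: hostname-based DRS URIs carry a resolvable host, while compact ones use a prefix:accession pair that needs resolution through identifiers.org or n2t.net):

```python
from urllib.parse import urlparse

def looks_compact(drs_uri: str) -> bool:
    # Compact identifier-based URIs put 'prefix:accession' where a
    # hostname would go; a colon in the authority, or the absence of any
    # dot, suggests the compact form
    netloc = urlparse(drs_uri).netloc
    return ":" in netloc or "." not in netloc
```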

TypeError: Multiple inheritance with NamedTuple is not supported

Hi!

On Ubuntu LTS, with miniconda and Python 3.9.13, I cannot run WfExS. It fails with the following traceback:

(venv) kinow@ranma:~/Development/python/workspace/WfExS-backend$ python WfExS-backend.py --full-help
Traceback (most recent call last):
  File "/home/kinow/Development/python/workspace/WfExS-backend/WfExS-backend.py", line 39, in <module>
    from wfexs_backend.wfexs_backend import WfExSBackend
  File "/home/kinow/Development/python/workspace/WfExS-backend/wfexs_backend/wfexs_backend.py", line 57, in <module>
    from .common import AbstractWfExSException
  File "/home/kinow/Development/python/workspace/WfExS-backend/wfexs_backend/common.py", line 288, in <module>
    class GeneratedContent(AbstractGeneratedContent, NamedTuple):
  File "/home/kinow/Development/python/miniconda3/lib/python3.9/typing.py", line 1929, in _namedtuple_mro_entries
    raise TypeError("Multiple inheritance with NamedTuple is not supported")
TypeError: Multiple inheritance with NamedTuple is not supported

It looks like this could be related to the following issue:

I think it was first released with 3.9.0-alpha6. Given this is a change in Python, I guess WfExS will have to update the code eventually to support Py 3.9+. This patch fixes the initial command, but I am not sure whether it breaks something else 👍

diff --git a/wfexs_backend/common.py b/wfexs_backend/common.py
index 56878fd..a51dd7f 100644
--- a/wfexs_backend/common.py
+++ b/wfexs_backend/common.py
@@ -285,7 +285,7 @@ class ExpectedOutput(NamedTuple):
 class AbstractGeneratedContent(object):
     pass
 
-class GeneratedContent(AbstractGeneratedContent, NamedTuple):
+class GeneratedContent(AbstractGeneratedContent):
     """
     local: Local absolute path of the content which was generated. It
       is an absolute path in the outputs directory of the execution.
@@ -302,7 +302,7 @@ class GeneratedContent(AbstractGeneratedContent, NamedTuple):
     secondaryFiles: Optional[Sequence[AbstractGeneratedContent]] = None
 
 
-class GeneratedDirectoryContent(AbstractGeneratedContent, NamedTuple):
+class GeneratedDirectoryContent(AbstractGeneratedContent):
     """
     local: Local absolute path of the content which was generated. It
       is an absolute path in the outputs directory of the execution.
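An alternative to dropping NamedTuple entirely (as the patch above does, which loses the tuple behaviour) might be plain subclassing of an already-generated NamedTuple class, which Python does allow; a sketch with hypothetical fields:

```python
from typing import NamedTuple

class _GeneratedContentBase(NamedTuple):
    # Illustrative fields, not the real WfExS definition
    local: str
    signature: str = ""

class GeneratedContent(_GeneratedContentBase):
    # Extra behaviour goes in a single-inheritance subclass of the
    # generated tuple class, avoiding the 3.9 multiple-inheritance error
    pass
```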

Add support for `insdc.sra` CURIE

Many public projects, like 1000genomes, publish their genomes in the SRA repository, which is mirrored at NCBI, EBI and DDBJ. The idea is to add support for the insdc.sra compact URI scheme, providing all the download links based on the different mirrors.

Allow using as a staging source a Workflow Run RO-Crate

The target here is that WfExS-backend should be able to consume its own RO-Crates, demonstrating true reproducibility.

This feature is divided into two milestones:

  • Being able to reuse as much metadata as possible, so inputs, commits and containers are reused.
  • Being able to reuse the RO-Crate's bundled copies of the workflow, inputs and containers in the instantiation.

The last one can bring issues related to Docker containers, as it might imply reassigning local container tags.

cwl_engine management of arrays of inputs

There is an issue with workflow https://raw.githubusercontent.com/kids-first/kf-alignment-workflow/v2.7.3/workflows/kfdrc_alignment_wf.cwl , leading to the error message

inputdeclarations.yaml:2:1:  * the `input_bam_list` field is not valid because value is a CommentedMap, expected null or array of <File>

due to input_bam_list not being properly represented.

cwl_engine.CWLWorkflowEngine generates the file inputdeclarations.yaml before calling cwltool, in order to tell it the input parameters and where to find the files.

That YAML file is created by createYAMLFile:

def createYAMLFile(self, matInputs, cwlInputs, filename):
    """
    Method to create a YAML file that describes the execution inputs of the workflow
    needed for their execution. Return parsed inputs.
    """
    try:
        execInputs = self.executionInputs(matInputs, cwlInputs)
        if len(execInputs) != 0:
            with open(filename, mode="w+", encoding="utf-8") as yaml_file:
                yaml.dump(execInputs, yaml_file, allow_unicode=True, default_flow_style=False, sort_keys=False)
            return execInputs
        else:
            raise WorkflowEngineException(
                "Dict of execution inputs is empty")
    except IOError as error:
        raise WorkflowEngineException(
            "ERROR: cannot create YAML file {}, {}".format(filename, error))

which depends on the output from executionInputs

def executionInputs(self, matInputs: List[MaterializedInput], cwlInputs):
    """
    Setting execution inputs needed to execute the workflow
    """
    if len(matInputs) == 0:  # Is list of materialized inputs empty?
        raise WorkflowEngineException("FATAL ERROR: Execution with no inputs")
    if len(cwlInputs) == 0:  # Is list of declared inputs empty?
        raise WorkflowEngineException("FATAL ERROR: Workflow with no declared inputs")
    execInputs = dict()
    for matInput in matInputs:
        if isinstance(matInput, MaterializedInput):  # input is a MaterializedInput
            # numberOfInputs = len(matInput.values)  # number of inputs inside a MaterializedInput
            for input_value in matInput.values:
                name = matInput.name
                value_type = cwlInputs.get(name, {}).get('type')
                if value_type is None:
                    raise WorkflowEngineException("ERROR: input {} not available in workflow".format(name))
                value = input_value
                if isinstance(value, MaterializedContent):  # value of an input contains MaterializedContent
                    if value.kind in (ContentKind.Directory, ContentKind.File):
                        if not os.path.exists(value.local):
                            self.logger.warning("Input {} is not materialized".format(name))
                        value_local = value.local
                        if isinstance(value_type, dict):  # MaterializedContent is a List of File
                            classType = value_type['items']
                            execInputs.setdefault(name, []).append({"class": classType, "location": value_local})
                        else:  # MaterializedContent is a File
                            classType = value_type
                            execInputs[name] = {"class": classType, "location": value_local}
                    else:
                        raise WorkflowEngineException(
                            "ERROR: Input {} has values of type {} this code does not know how to handle".format(name, value.kind))
                else:
                    execInputs[name] = value
    return execInputs
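For reference, the shape cwltool expects for an array-of-File input in inputdeclarations.yaml would presumably be the following (paths are illustrative):

```yaml
# Each element of the array carries its own class/location pair,
# instead of a single mapping (the CommentedMap the error complains about)
input_bam_list:
  - class: File
    location: /path/to/first.bam
  - class: File
    location: /path/to/second.bam
```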
