
Comments (10)

sclan commented on August 26, 2024

The directory output feature from WDL 2.0 is supported:
https://github.com/dnanexus/dxCompiler/blob/develop/doc/ExpertOptions.md#directory-outputs

from dxcompiler.

RachelDuffin commented on August 26, 2024

Hi Stanley, I did try to use this but I wasn't able to get it to work. I will give it another try and let you know if I am still having issues.


sclan commented on August 26, 2024

Hi Rachel, you may find the test cases helpful as a template:
https://github.com/dnanexus/dxCompiler/tree/develop/test/wdl_2_0
We include the tests from the test folder in our integration tests, so all of them have to pass before a dxCompiler release is possible. Please make sure you compile with the latest release, since directory output support is relatively new.


RachelDuffin commented on August 26, 2024


Gvaihir commented on August 26, 2024

Could you elaborate on what specifically does not work?
The expected and tested behavior is that the code in the task organizes the outputs on a worker into directories, according to the logic of the given task. For example:

# Create outputs/folder_1 .. outputs/folder_4 (including the parent directory)
mkdir -p outputs/folder_{1..4}
for i in {1..4}; do
  echo "Hello" > "outputs/folder_${i}/my_file_${i}"
done

Then declare your task's outputs like this:

output {
  Directory outdir = "outputs/"
}

Your next task will then take the Directory "outputs/" from the previous task (above) as its input.
Where does this behavior break for you?
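
A minimal sketch of the pattern described above, assuming `version development` (the WDL version used for Directory support here). The task and workflow names (`make_outputs`, `list_outputs`, `dir_passing`) are hypothetical and only illustrate how a Directory output is wired into a downstream Directory input:

```wdl
version development

# Upstream task: organizes its results into subdirectories on the worker.
task make_outputs {
  command <<<
    mkdir -p outputs/folder_{1..4}
    for i in {1..4}; do
      echo "Hello" > "outputs/folder_${i}/my_file_${i}"
    done
  >>>

  output {
    Directory outdir = "outputs/"
  }
}

# Downstream task: receives the whole directory as a single input.
task list_outputs {
  input {
    Directory indir
  }

  command <<<
    find "~{indir}" -type f | sort
  >>>

  output {
    Array[String] files = read_lines(stdout())
  }
}

workflow dir_passing {
  call make_outputs
  call list_outputs { input: indir = make_outputs.outdir }

  output {
    Array[String] files = list_outputs.files
  }
}
```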


sclan commented on August 26, 2024

I think Rachel's goal is to be able to specify (or find a way to specify) one individual file from a task directory output and use it as the input for the next task.

I don't believe such a feature is supported by either WDL or dxCompiler/executor. If a task generates a directory output, that output can only be passed to the downstream task(s) as a directory input via WDL. I can't find any case in the WDL spec that allows picking and choosing among the contents of a directory output:

https://github.com/openwdl/wdl/blob/main/versions/development/SPEC.md


Gvaihir commented on August 26, 2024

Then simply use a Directory as the input and point to a specific file in the command section of your task.
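
As a sketch of this suggestion, the downstream task below takes the directory and refers to one file inside it from the command section. The task name, the `sample` input, and the `<sample>.bam` naming are hypothetical. The explicit check only makes the failure message clearer; the task still starts and the whole directory is still staged on the worker.

```wdl
version development

task use_one_file {
  input {
    Directory indir
    String sample
  }

  command <<<
    set -euo pipefail
    # Build the path to the one file this task needs (hypothetical layout).
    bam="~{indir}/~{sample}.bam"
    if [[ ! -f "${bam}" ]]; then
      echo "Expected file not found: ${bam}" >&2
      exit 1
    fi
    ls -l "${bam}"
  >>>

  output {
    String listing = read_string(stdout())
  }
}
```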


RachelDuffin commented on August 26, 2024

My goal is to be able to specify relative output locations for individual files.

Using a directory as an output would not work because my workflow uses both WDL tasks and native applets imported as WDL tasks. The native applets require file inputs, whereas the outputs passed from the previous task to the next task would be directories. It isn't feasible to go through and change all our existing apps to take a different input, given the lengthy validation / release / quality management process we have to go through to make changes in a clinical setting. Also, depending on the task, you may not always want the outputs in the same directory.

We would also only want the next task/app to run if all the required input files exist. If you were to run a workflow that took a directory input, a previous task/app could finish without outputting all the files required by the next task (not all required files exist within that directory). With a directory input the next task would still run and fail with an internal error, as opposed to just failing to start. This incurs a cost to the customer.

Not only that, but if you want to group all files within one directory, e.g. a 'bams' directory, and there are multiple bams within that directory (e.g. across many samples), specifying that directory as input would mean that all files within that directory would be uploaded to the worker. This again incurs greater cost and a greater runtime for the app, despite potentially only needing a single file from that directory. In my case I would be running many concurrent workflows, so the app would have no way of knowing which bam file was the correct one to use for the command, i.e. the bam file specific to just that workflow.

I know it is possible to output files from a task to a specific subdirectory using the '--stage-relative-output-folder' argument. However, this sets the output directory for all files output by that task, which is not always desirable, and it doesn't allow you to specify output locations for individual files within that task (which is possible, and which we already do, for native applets).

This behaviour makes WDL workflows inflexible in comparison to native workflows. I think users would benefit if the relative output locations specified within the task output section were reflected within the project hierarchy, relative to the specified destination folder.

I also know that a reorg app is an option; however, this is also not ideal as it only moves the files once the workflow has completed.

When compiling/running a workflow using Cromwell, if a relative output location is specified within the task for an output file, e.g. "output/file.txt", the file is placed within an output directory relative to the execution directory. Is it not possible to replicate similar behaviour for dxCompiler, but relative to the project?


sclan commented on August 26, 2024

If it is an individual file within the output directory that your downstream / native app needs, can you declare a File output along with the Directory output in the (upstream) task?

That way you get both the output directory and the individual output File from the task, and the latter can be used as input for the downstream task.
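
A sketch of what this could look like, assuming the individual file's path inside the directory is known when the task is written. All names here are hypothetical, and `consume_file` stands in for a native applet imported as a WDL task with a File input:

```wdl
version development

# Upstream task: exposes the directory and one file from it as separate outputs.
task make_outputs {
  command <<<
    mkdir -p outputs
    echo "Hello" > outputs/report.txt
    echo "World" > outputs/extra.txt
  >>>

  output {
    Directory outdir = "outputs/"
    File report = "outputs/report.txt"
  }
}

# Placeholder for a downstream task or imported applet that needs a File input.
task consume_file {
  input {
    File infile
  }

  command <<<
    cat "~{infile}"
  >>>

  output {
    String contents = read_string(stdout())
  }
}

workflow file_and_dir {
  call make_outputs
  call consume_file { input: infile = make_outputs.report }

  output {
    Directory outdir = make_outputs.outdir
    String contents = consume_file.contents
  }
}
```

Because `report` is declared as a non-optional File output, the upstream task should fail if that file was not produced, so the downstream call would not start with a missing input.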


RachelDuffin commented on August 26, 2024

Hi, I suppose this would be possible; however, it could be misleading, as we have multiple apps that output files into the same folder. If a folder is declared as an output from one app, I don't think it would be clear which files within that folder came from that app and which came from a different app. I think that allowing you to specify relative output locations for individual files is the desired behaviour, as it is explicit.

