
dxcompiler's Introduction

DNAnexus

DNAnexus Apps and Scripts

applets

  • binning_step0: BioBin Pipeline
    • biobin_pipeline
  • binning_step1: BioBin Pipeline
    • biobin_pipeline
  • binning_step2: BioBin Pipeline
    • biobin_pipeline
  • binning_step3: BioBin Pipeline
    • biobin_pipeline
  • impute2_group_join: Impute2_group_join
    • This app can be used to merge multiple imputed impute2 files
  • plato_biobin: PLATO BioBin Regression Analysis
    • PLATO_BioBin
  • vcf_batch: VCF Batch effect tester
    • vcf_batch

apps

  • association_result_annotation: Annotate GWAS, PheWAS Associations
    • association_result_annotation
  • biobin:
    • This app runs the latest development build of the rare variant binning tool BioBin.
  • generate_phenotype_matrix: Generate Phenotype Matrix
    • generate_phenotype_matrix
  • genotype_case_control: Generate Case/Control by Genotype
    • App provides case and control numbers for each genotype
  • impute2: imputation
    • This will perform imputation using Impute2
  • impute2_to_plink: Impute2 To PLINK
    • Convert Impute2 files to PLINK files
  • plato_single_variant: PLATO - Single Variant Analysis
    • App allows you to run single-variant association tests against a single phenotype (GWAS) or multiple phenotypes (PheWAS)
  • rl_sleeper_app: sleeper
    • This App provides some useful tools when working with data in DNAnexus. This App is designed to be run on the command line with "dx run --ssh RL_Sleeper_App" in the project containing the data you want to explore (use "dx select" to switch projects as needed).
  • shapeit2: SHAPEIT2
    • This app performs phasing using SHAPEIT2
  • strand_align: Strand Align
    • Strand Align prior to phasing
  • vcf_annotation_formatter:
    • Extracts and reformats VCF annotations (CLINVAR, dbNSFP, SIFT, SNPEff)
  • QC_apps subfolder:
    • drop_marker_sample: Drop Markers and/or Samples (PLINK)
      • drop_marker_sample
    • drop_relateds: Relatedness Filter (IBD)
      • drop_relateds
    • extract_marker_sample: Extract Markers and/or Samples (PLINK)
      • extract_marker_sample
    • maf_filter: Marker MAF Rate Filter (PLINK)
      • maf_filter
    • marker_call_filter: Marker Call Rate Filter (PLINK)
      • marker_call_filter
    • missing_summary: Missingness Summary (PLINK)
      • Returns missingness rate by sample
    • pca: Principal Component Analysis using SMARTPCA
      • pca
    • sample_call_filter: Sample Call Rate Filter (PLINK)
      • sample_call_filter

scripts

  • cat_vcf.py *
  • download_intervals.py *
  • download_part.py *
  • estimate_size.py *
  • interval_pad.py
    • This reads a BED file from standard input, pads the intervals, sorts them, and then outputs intervals guaranteed to be non-overlapping
  • update_applet.sh *

sequencing

  • bcftools_view:
    • Calls "bcftools view". Still in experimental stages.
  • calc_ibd:
    • Calculates a pairwise IBD estimate from either VCF or PLINK files using PLINK 1.9.
  • call_bqsr: Base Quality Score Recalibration
  • call_genotypes:
    • Obsolete, do not use; use geno_p instead. Calls GATK GenotypeGVCFs.
  • call_hc:
  • call_vqsr:
  • cat_variants: combine_variants
    • Combines non-overlapping VCF files with the same subjects. A reimplementation of GATK CatVariants (GATK CatVariants available upon request)
  • combine_variants: combine_variants
  • gen_ancestry:
    • Determine Ancestry from PCA. Uses an eigenvector file and training dataset listing known ancestries. Runs QDA to determine posterior ancestries for all samples, even those in the training set.
  • gen_related_todrop:
    • Uses a PLINK IBD file to determine the minimal set of samples to drop in order to generate an unrelated sample set. Uses a minimum vertex cut algorithm on the graph of related samples to find that set.
  • geno_p:
  • merge_gvcfs:
  • plink_merge:
    • Merge PLINK bed/bim/fam files using PLINK 1.9
  • select_variants: VCF QC
  • variant_annotator: VCF QC
  • vcf_annotate: Annotate VCF File
    • Use a variety of tools to annotate a sites-only VCF.
  • vcf_concordance: VCF Concordance
  • vcf_gen_lof:
    • Subset a VCF from vcf_annotate based on the given annotations to get a sites-only VCF of loss-of-function variants.
  • vcf_pca:
    • Uses PLINK 1.9 and eigenstrat 6.0 to calculate principal components from VCF or PLINK bed/bim/fam files.
  • vcf_qc:
  • vcf_query:
    • Calls "bcftools query" to extract annotations from the VCF file. Used in the stripping of files for MEGAbase
  • vcf_sitesonly: VCF QC
    • Generates a sites-only file from full VCF files.
  • vcf_slice: Slice VCF File(s)
    • Return a small section of a VCF file (similar to tabix). For large output, many small regions, or subsetting samples, use subset_vcf instead.
  • vcf_summary: VCF Summary Statistics
    • Generate summary statistics for a VCF file (by sample and by variant)
  • vcf_to_plink:
    • Uses PLINK 1.9 to convert VCF files to PLINK bed/bim/fam files

dxcompiler's People

Contributors

commandlinegirl · emiloslavsky · gvaihir · janprovaznik · jdidion · jselbaz · jtratner · kpjensen · mckinsel · mfojtak · mhrvol · mlin · mr-c · odoublewen · orodeh · r-i-v-a · sclan · sstadick · vsoch · xquek · yuxinshi0423


dxcompiler's Issues

Feature request: runtime evaluation of docker tarball on DNAnexus

Background:

Currently, tarball docker images get evaluated at compile time, and compilation errors out if the docker image is not found. This is true whether the tarball is declared using dx path syntax ("dx://GenomeSequenceProject:/A/B/myOrgTools") or dx file id syntax (e.g. "dx://file-xxxx"). I will be focusing on dx path syntax for this feature request.
Once the wdl is compiled with tarball docker images, the workflow becomes statically associated with those images. If the images are swapped with tarballs of the same name, the workflow still tries to load the old tarball images. To use new images, the wdl files need to be re-compiled.

One way to trick DNAnexus is to add a variable to the docker image path, like below. This way, DNAnexus will try to locate the image at runtime instead of at compile time.

task  ... {
  String tag='1.2.3'

  runtime {
    docker: "dx://GenomeSequenceProject:/A/B/myOrgTools-~{tag}.tar"
  }
  ...
}

Request:

If the user specifies docker images using dx path syntax, dxCompiler should skip checking the availability of the images at compile time and instead check them at runtime.
We could do the same for docker images specified using dx file syntax, given the similarity between docker load ${file_id} and docker pull docker.io/${layer_id}. But I digress.

Reason:

  1. CI/CD pipeline
    Our wdl files and docker files are kept in two separate git repositories and have/will have separate CI/CD pipelines.
    The wdl file repo's CI/CD compiles wdl to DX.
    The docker file repo's CI/CD builds and pushes images to dockerhub as well as save to DX.
    The compile-time check would require us to run the CI/CD pipelines in a specific order, or, in some god-forbid cases where docker images are rebuilt, the affected wdl needs to be re-compiled.
    Having tarball docker images evaluated at runtime mitigates this problem: wdl and docker files can be managed by separate CI/CD pipelines, just as when we stick to dockerhub-only deployment.

  2. Similarity with docker pull
    If an image is specified using a registry url, the image is pulled at runtime and not checked at compile time. Similarly, I think it makes more sense to docker load tarball images at runtime and not check them at compile time.
    It gives us the freedom to swap the images without needing to recompile the workflow. This is a bad practice, I know, but it is convenient during development.

TL; DR

Having tarball docker images evaluated at runtime helps CI/CD.

Other

My request aside, I'd love to hear your suggestions for a proper CI/CD solution that works with the current dxCompiler implementation as well.

Best
Joe

wdl 1.1 `sep()` vs command block `~{sep}` return different values

Hi dxCompiler team, I recently came across a behavior difference when trying to convert an array of File into a CLI arg.

I have an input Array[File] snp_vcfs

Originally, in a command block, I had given this to a CLI tool like the following:
--sample-snp-vcfs ~{sep=' ' snp_vcfs}

This resulted in the following resolution when run on DNAnexus: --sample-snp-vcfs /home/dnanexus/inputs/input4994899743764944893/s_1_AACATACTGAGTGATCCGGA.final.vcf /home/dnanexus/inputs/input4994899743764944893/s_1_AATCACGGTTCGGATCGGTT.final.vcf

But then I needed to make this optional, so I changed the input in my task block to Array[File] snp_vcfs = []

And added this arg
String snp_vcfs_args = if length(snp_vcfs) > 0 then "--sample-snp-vcfs " + sep(" ", snp_vcfs) else ""

I reference this in the command block as ~{snp_vcfs_args}, but it resolves as the following:
--sample-snp-vcfs dx://file-GQZPx0Q0xKpKqJp3kxZ4XBP8::/s_1_AACATACTGAGTGATCCGGA.final.vcf dx://file-GQZKf9Q0GxY4K65ZPqGYkp4Y::/s_1_AATCACGGTTCGGATCGGTT.final.vcf

Is it possible for the 1.1 builtin sep() to resolve File objects to their paths (like ~{sep=...} does), i.e. /home/dnanexus/inputs/..., rather than just converting them to strings (and rendering these paths potentially unusable)?

1.1 says ~{sep=...} is deprecated, but I feel like the two should have the same behavior unless there is something I'm missing? Thanks
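
For reference, a consolidated sketch of the setup above (the task name and echoed tool invocation are hypothetical, for illustration only):

version 1.1

task call_tool {
  input {
    # Optional list of VCFs; empty by default
    Array[File] snp_vcfs = []
  }

  # Build the flag only when the array is non-empty. sep() stringifies
  # the Files, which is where the dx:// URIs leak through on DNAnexus.
  String snp_vcfs_args = if length(snp_vcfs) > 0 then "--sample-snp-vcfs " + sep(" ", snp_vcfs) else ""

  command <<<
    echo some_tool ~{snp_vcfs_args}
  >>>

  output {
    String resolved = read_string(stdout())
  }
}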

Array[String?] compiles incorrectly to "(hash" and "(array:file)"

When using dxCompiler-2.10.4.jar, I've found a workflow-level input type that does not appear to compile correctly.
Array[String] testReq compiles correctly to [stage-common.testReq (array:string)].
Array[String?] testOp compiles incorrectly to stage-common.testOp (hash) and [stage-common.testOp___dxfiles (array:file)].

Here is a test wdl if you want to try it yourself.

version 1.0

workflow Test {
    input {
        Array[String] testReq
        Array[String?] testOp
    }
}

This appears to be a bug report unless I'm missing something.

dxCompiler doesn't seem to work with recent versions of java

This has been mentioned in a previous issue.

I personally got the issue below running dxCompiler with the command java -jar dxCompiler-2.11.6.jar compile my_workflow.wdl -project $PROJID -folder /folder/ -streamFiles all -archive under openjdk 22.0.1-internal 2024-04-16, but it went away when downgrading to openjdk 11.0.1 2018-10-16 LTS.

If I have diagnosed this issue correctly and it's an issue with dxCompiler's compatibility with modern versions of java, can dxCompiler be fixed to work with them?

[290/1824][error] org.fusesource.scalate.TemplateException:
  bad constant pool index: 0 at pos: 49176
     while compiling: <no file>
        during phase: globalPhase=<no phase>, enteringPhase=<some phase>
     library version: version 2.13.7
    compiler version: version 2.13.7
  reconstructed args: -dependencyfile none -deprecation -Wconf:cat=deprecation:w -Wconf:cat=deprecation:ws -Wconf:cat=feature:ws -Wconf:cat=optimizer:ws -classpath dxCompiler-2.11.6.jar -d /tmp/scalate-9011258783282726438-workdir/classes

  last tree to typer: EmptyTree
       tree position: <unknown>
            tree tpe: <notype>
              symbol: null
           call site: <none> in <none>

== Source file context for tree position ==


        at org.fusesource.scalate.TemplateEngine.compileAndLoad(TemplateEngine.scala:864)
        at org.fusesource.scalate.TemplateEngine.compileAndLoadEntry(TemplateEngine.scala:725)
        at org.fusesource.scalate.TemplateEngine.liftedTree1$1(TemplateEngine.scala:436)
        at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:430)
        at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:485)
        at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:582)
        at wdlTools.generators.Renderer.render(Renderer.scala:15)
        at dx.compiler.ApplicationCompiler.generateJobScript(ApplicationCompiler.scala:139)
        at dx.compiler.ApplicationCompiler.createRunSpec(ApplicationCompiler.scala:189)
        at dx.compiler.ApplicationCompiler.apply(ApplicationCompiler.scala:552)
        at dx.compiler.Compiler$BundleCompiler.maybeBuildApplet(Compiler.scala:357)
        at dx.compiler.Compiler$BundleCompiler.$anonfun$apply$1(Compiler.scala:471)
        at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646)
        at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642)
        at scala.collection.AbstractIterable.foldLeft(Iterable.scala:926)
        at dx.compiler.Compiler$BundleCompiler.apply(Compiler.scala:441)
        at dx.compiler.Compiler.apply(Compiler.scala:494)
        at dxCompiler.Main$.compile(Main.scala:538)
        at dxCompiler.Main$.dispatchCommand(Main.scala:791)
        at dxCompiler.Main$.main(Main.scala:922)
        at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:927)
        at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:926)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1(App.scala:76)
        at scala.App.$anonfun$main$1$adapted(App.scala:76)
        at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
        at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
        at scala.App.main(App.scala:76)
        at scala.App.main$(App.scala:74)
        at dxCompiler.MainApp$.main(Main.scala:926)
        at dxCompiler.MainApp.main(Main.scala)
Caused by: scala.reflect.internal.FatalError:
  bad constant pool index: 0 at pos: 49176
     while compiling: <no file>
        during phase: globalPhase=<no phase>, enteringPhase=<some phase>
     library version: version 2.13.7
    compiler version: version 2.13.7
  reconstructed args: -dependencyfile none -deprecation -Wconf:cat=deprecation:w -Wconf:cat=deprecation:ws -Wconf:cat=feature:ws -Wconf:cat=optimizer:ws -classpath dxCompiler-2.11.6.jar -d /tmp/scalate-9011258783282726438-workdir/classes

  last tree to typer: EmptyTree
       tree position: <unknown>
            tree tpe: <notype>
              symbol: null
           call site: <none> in <none>

== Source file context for tree position ==


        at scala.reflect.internal.Reporting.abort(Reporting.scala:69)
        at scala.reflect.internal.Reporting.abort$(Reporting.scala:65)
        at scala.reflect.internal.SymbolTable.abort(SymbolTable.scala:28)
        at scala.tools.nsc.symtab.classfile.ClassfileParser$ConstantPool.errorBadIndex(ClassfileParser.scala:407)
        at scala.tools.nsc.symtab.classfile.ClassfileParser$ConstantPool.getExternalName(ClassfileParser.scala:262)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.readParamNames$1(ClassfileParser.scala:853)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.parseAttribute$1(ClassfileParser.scala:859)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parseAttributes$6(ClassfileParser.scala:936)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.parseAttributes(ClassfileParser.scala:936)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.parseMethod(ClassfileParser.scala:635)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:548)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:174)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:159)
        at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:142)
        at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:342)
        at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.$anonfun$complete$2(SymbolLoaders.scala:249)
        at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:247)
        at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1561)
        at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1533)
        at scala.reflect.internal.Definitions.scala$reflect$internal$Definitions$$enterNewMethod(Definitions.scala:47)
        at scala.reflect.internal.Definitions$DefinitionsClass.String_$plus$lzycompute(Definitions.scala:1256)
        at scala.reflect.internal.Definitions$DefinitionsClass.String_$plus(Definitions.scala:1256)
        at scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreMethods$lzycompute(Definitions.scala:1577)
        at scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreMethods(Definitions.scala:1559)
        at scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode$lzycompute(Definitions.scala:1590)
        at scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode(Definitions.scala:1590)
        at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1646)
        at scala.tools.nsc.Global$Run.<init>(Global.scala:1226)
        at org.fusesource.scalate.support.ScalaCompiler.compile(ScalaCompiler.scala:101)
        at org.fusesource.scalate.TemplateEngine.compileAndLoad(TemplateEngine.scala:787)
        ... 33 more

CWL workflow compiled to include a subworkflow with an `outputs` applet fails to run successfully

If you have a workflow as follows:

cwlVersion: v1.2
class: Workflow
inputs: []
requirements:
  SubworkflowFeatureRequirement: {}
steps:
    - id: one
      in: []
      out: [output]
      run: one.cwl
    - id: two
      in: []
      out: [output]
      run: two_wrapper.cwl
outputs:
    output1:
      type: File
      outputSource: one/output
    output2:
      type: File
      outputSource: two/output

one.cwl:

cwlVersion: v1.2
class: CommandLineTool
inputs: []
baseCommand:
    - bash
    - -c
    - "echo something > output"
outputs:
    - id: output
      type: File
      outputBinding:
        glob: output

two_wrapper.cwl:

cwlVersion: v1.2
class: Workflow
inputs: []
steps:
    - id: two
      in: []
      out: [output]
      run: two.cwl
outputs:
    output:
      type: File
      outputSource: two/output

two.cwl:

cwlVersion: v1.2
class: CommandLineTool
inputs: []
baseCommand:
    - bash
    - -c
    - "echo something > output"
outputs:
    - id: output
      type: File?
      outputBinding:
        glob: output
The `cwlpack`ed workflow:

{
    "class": "Workflow",
    "cwlVersion": "v1.2",
    "id": "workflow.cwl",
    "inputs": [],
    "outputs": [
        {
            "id": "output1",
            "outputSource": "one/output",
            "type": "File"
        },
        {
            "id": "output2",
            "outputSource": "two/output",
            "type": "File"
        }
    ],
    "requirements": [
        {
            "class": "SubworkflowFeatureRequirement"
        }
    ],
    "steps": [
        {
            "id": "one",
            "in": [],
            "out": [
                "output"
            ],
            "run": {
                "baseCommand": [
                    "bash",
                    "-c",
                    "echo something > output"
                ],
                "class": "CommandLineTool",
                "cwlVersion": "v1.2",
                "id": "workflow.cwl@[email protected]",
                "inputs": [],
                "outputs": [
                    {
                        "id": "output",
                        "outputBinding": {
                            "glob": "output"
                        },
                        "type": "File"
                    }
                ],
                "requirements": []
            }
        },
        {
            "id": "two",
            "in": [],
            "out": [
                "output"
            ],
            "run": {
                "class": "Workflow",
                "cwlVersion": "v1.2",
                "id": "workflow.cwl@step_two@two_wrapper.cwl",
                "inputs": [],
                "outputs": [
                    {
                        "id": "output",
                        "outputSource": "two/output",
                        "type": "File"
                    }
                ],
                "requirements": [],
                "steps": [
                    {
                        "id": "two",
                        "in": [],
                        "out": [
                            "output"
                        ],
                        "run": {
                            "baseCommand": [
                                "bash",
                                "-c",
                                "echo something > output"
                            ],
                            "class": "CommandLineTool",
                            "cwlVersion": "v1.2",
                            "id": "two_wrapper.cwl@[email protected]",
                            "inputs": [],
                            "outputs": [
                                {
                                    "id": "output",
                                    "outputBinding": {
                                        "glob": "output"
                                    },
                                    "type": [
                                        "null",
                                        "File"
                                    ]
                                }
                            ],
                            "requirements": []
                        }
                    }
                ]
            }
        }
    ]
}

This generates the following applets:
[screenshot of the generated applets omitted]

Note: there are two outputs applets; the inner workflow_cwl_step_two_two_wrapper_cwl_outputs applet, created for two_wrapper.cwl, narrows the output type from File? (in two.cwl) to File (in two_wrapper.cwl).

When running this, two_wrapper's outputs step fails as seen below:
[screenshot of the failed outputs step omitted]
with the error message:

Environment: Map(two/output -> (TOptional(TFile),VFile(dx://file-GYqV19QJKB2b8pXF24xJ1xBb::/output,Some(output),None,Some(sha1$50a4e988380c09d290acdab4bd53d24ee7b497df),Some(10),Vector(),None,None)))
Evaluating workflow outputs
Evaluating output parameters:
  (output1,WorkflowOutputParameter(Some(Identifier(Some(file:/null),workflow.cwl/output1)),None,None,CwlFile,Vector(),None,false,Vector(Identifier(Some(file:/null),one/output)),None,None),CwlFile)
  (output2,WorkflowOutputParameter(Some(Identifier(Some(file:/null),workflow.cwl/output2)),None,None,CwlFile,Vector(),None,false,Vector(Identifier(Some(file:/null),two/output)),None,None),CwlFile)
[error] failure executing Workflow action 'Outputs'
java.lang.Exception: cannot coerce VNull to TFile
	at dx.core.ir.Value$.coerceTo(Value.scala:311)
	at dx.executor.cwl.CwlWorkflowExecutor.$anonfun$evaluateOutputs$3(CwlWorkflowExecutor.scala:329)
	at scala.collection.immutable.Vector1.map(Vector.scala:1872)
	at scala.collection.immutable.Vector1.map(Vector.scala:375)
	at dx.executor.cwl.CwlWorkflowExecutor.evaluateOutputs(CwlWorkflowExecutor.scala:283)
	at dx.executor.WorkflowExecutor.evaluateOutputs(WorkflowExecutor.scala:156)
	at dx.executor.WorkflowExecutor.apply(WorkflowExecutor.scala:897)
	at dx.executor.BaseCli.dispatchCommand(BaseCli.scala:103)
	at dx.executor.BaseCli.main(BaseCli.scala:137)
	at dxExecutorCwl.MainApp$.delayedEndpoint$dxExecutorCwl$MainApp$1(Main.scala:27)
	at dxExecutorCwl.MainApp$delayedInit$body.apply(Main.scala:26)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at dxExecutorCwl.MainApp$.main(Main.scala:26)
	at dxExecutorCwl.MainApp.main(Main.scala)

implying that it is trying to collect outputs for the entire workflow, not just the two_wrapper subworkflow.

I've had a bit of a delve into the dxCompiler source code to see what might be going on. My theory is that:

  1. Prior to CwlWorkflowExecutor.scala:329 (mentioned in the exception), we find that the output parameters are determined from the workflow property in the CwlWorkflowExecutor class here:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L271
  2. The workflow that it fetches here seems to come from the CwlWorkflowExecutor.create function. In this function, the following bit of code runs to find the appropriate workflow:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L78-L82
  3. In this bit of code, the OriginalName of the outputs applet is "${wfName}_outputs", as can be seen here:
    https://github.com/dnanexus/dxCompiler/blob/develop/compiler/src/main/scala/dx/translator/cwl/ProcessTranslator.scala#L748
  4. CwlWorkflowExecutor.create therefore follows:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L98-L103
    and considers the wrong workflow.

Feature request: Setting per workflow custom reorg applet

Hello

I am developing a workflow that has several sub-workflows.
I wish to set a custom reorg applet for one of the sub-workflows (which is imported from a separate WDL file).
I tried accomplishing this by setting perWorkflowDxAttributes in the extras file, namely:

 "perWorkflowDxAttributes":{
	"mapping_variation_annotation_v16_mt_IDT":{
	"customReorgAttributes" : {
    	"appUri" : "applet-G4vx2q80GjFY4KFy7vp4ykkK",
    	"configFile" : "dx://project-G436vv80GjFz96PfFqxF92Kf:file-G4pypqQ0GjFvYXB2KxzKVbxx"
  	}	
	}
  }

However, when I compiled the workflow with the extras file, the custom reorg applet was ignored, and the subworkflow mapping_variation_annotation_v16_mt_IDT used the default reorg applet instead.
Is there a way to accomplish what I'm trying to do?
When I try to set the custom reorg applet as an attribute of the entire workflow (i.e. not placing the customReorgAttributes block within perWorkflowDxAttributes), it is assigned to the top-level workflow, and the sub-workflows are compiled without a reorg stage. This solution is not ideal for me because the sub-workflow mapping_variation_annotation_v16_mt_IDT is intended to be called from a scatter in the top-level workflow, and the outputs from each call to mapping_variation_annotation_v16_mt_IDT are intended to be placed in a different folder in the project.

Thanks in advance!!

Coercing Array[String] to Array[Int]

Hi -

I'm trying to compile the Broad Joint Genotyping workflow for DNAnexus, and am getting the following error:

[error] Error creating translator for pipelines/broad/dna_seq/germline/joint_genotyping/JointGenotyping.wdl
wdlTools.types.TypeException: fingerprinting_indices value GetFingerprintingIntervalIndices.indices_to_fingerprint of type Array[String] is not coercible to Array[Int] at 368:4-368:95 in /Users/dexzb9/src/warp/pipelines/broad/dna_seq/germline/joint_genotyping/JointGenotyping.wdl

Indeed, the referenced line assigns a variable that comes from a call to read_lines() to an Array[Int]. According to the WDL 1.0 spec, the output of read_lines() "can be auto converted to other Array types", and the linked section even includes an example where the result of read_lines() is assigned to an Array[Int].

However, I also understand that there's an argument to be made that auto coercions should be considered harmful, and that dxCompiler/wdlTools therefore do not allow them. Unfortunately, this paints me into a bit of a corner: the proposed explicit coercions don't exist (yet) in WDL, but dxCompiler won't allow implicit coercions. I have no problem editing the Broad's wdl file to work with dxCompiler, but I can't figure out how to read a file of stringified integers into an Array[Int]. Any suggestions?
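
One workaround that may sidestep the missing coercion (a sketch only, untested on DNAnexus; the task and file names are hypothetical) is to re-serialize the line-delimited integers as JSON inside a task and read them back with read_json(), binding the result to a typed declaration:

version 1.0

task lines_to_ints {
  input {
    File int_lines  # one integer per line
  }

  command <<<
    # Convert the line-delimited integers to a JSON array so the output
    # can be read back with a declared Array[Int] type.
    python3 -c 'import json, sys; print(json.dumps([int(l) for l in sys.stdin if l.strip()]))' \
      < ~{int_lines} > ints.json
  >>>

  output {
    Array[Int] ints = read_json("ints.json")
  }
}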

Runtime error coercing optional array in zip/scatter

Hi DNAnexus,

When running a workflow compiled with dxCompiler 2.10.2, I got the following runtime error at this scatter step (no scatter jobs kicked off)

value V_Null cannot be coerced to type T_Float at 131:36-133:7 in <string>
where align_initial_bam.outbam and clip_dd_ds_bam.clipped_bam are Array[File], and extract_umi.downsample_fraction is Array[Float?]:

scatter (initial_final_bam_downsample in zip(zip(align_initial_bam.outbam, clip_dd_ds_bam.clipped_bam), extract_umi.downsample_fraction)) {
    call metrics_tasks.sample_qc_metrics {
        input:
            sample=basename(initial_final_bam_downsample.left.left, ".initial.bam"),
            assay_subtype=assay_subtype,
            initial_bam=initial_final_bam_downsample.left.left,
            final_bam=initial_final_bam_downsample.left.right,
            downsample_fraction=initial_final_bam_downsample.right
    }
}

Explicitly declaring the result of the zip as a variable seems to fix the issue, so there appears to be a problem with coercing optional array types inside zip?

    Array[Pair[Pair[File, File], Float?]] initial_final_bam_downsampled = zip(zip(align_initial_bam.outbam, clip_dd_ds_bam.clipped_bam), extract_umi.downsample_fraction)
    scatter (initial_final_bam_downsample in initial_final_bam_downsampled) {
        call metrics_tasks.sample_qc_metrics {
            input:
                sample=basename(initial_final_bam_downsample.left.left, ".initial.bam"),
                assay_subtype=assay_subtype,
                initial_bam=initial_final_bam_downsample.left.left,
                final_bam=initial_final_bam_downsample.left.right,
                downsample_fraction=initial_final_bam_downsample.right
        }
    }

Please let me know what you think of this, thank you

UK Biobank RAP path with spaces in it

How do I get dxCompiler to accept a path with spaces in it? e.g.

Array[File]+ bgens = ["dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:/Bulk/Imputation/UKB imputation from genotype/ukb22828_c10_b0_v3.bgen","dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:/Bulk/Imputation/UKB imputation from genotype/ukb22828_c6_b0_v3.bgen",

The below doesn't work either, nor does escaping with one \ or four \\:

Array[File]+ bgens = ["dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:/Bulk/Imputation/UKB\ imputation\ from\ genotype/ukb22828_c10_b0_v3.bgen","dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:/Bulk/Imputation/UKB\ imputation\ from\ genotype/ukb22828_c6_b0_v3.bgen"]
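
A workaround that may be worth trying (an assumption, not a verified fix): reference the files by ID rather than by path, since dx file IDs contain no spaces. A sketch with hypothetical file IDs:

Array[File]+ bgens = [
  "dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:file-XXXXXXXXXXXXXXXXXXXXXXXX",  # hypothetical ID of ukb22828_c10_b0_v3.bgen
  "dx://project-GJbvyPjJy3Gy01jz4x8bXzgv:file-YYYYYYYYYYYYYYYYYYYYYYYY"   # hypothetical ID of ukb22828_c6_b0_v3.bgen
]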

What does this mean?

I ran the command below just to test converting a CWL pipeline:

java -jar dxCompiler-2.4.7.jar compile /Users/****/****/UKBB/analysis-workflows/definitions/pipelines/CH_exome_Final2.cwl -language CWL -project UKBB_Exome_2021 -folder /CH_Exome/

And I get this error. What do I have to do/create/add to clear it? Any advice is welcome.

[error] Error creating translator for /Users/****/****/UKBB/analysis-workflows/definitions/pipelines/CH_exome_Final2.cwl
java.lang.RuntimeException: missing definition for schema file:///Users/****/****/UKBB/analysis-workflows/definitions/types/labelled_file.yml#labelled_file
	at dx.cwl.CwlType$.inner$1(CwlType.scala:117)
	at dx.cwl.CwlType$.translate(CwlType.scala:137)
	at dx.cwl.CwlArray$.translate(CwlType.scala:451)
	at dx.cwl.CwlSchema$.translateSchema(CwlType.scala:326)
	at dx.cwl.CwlType$.inner$1(CwlType.scala:96)
	at dx.cwl.CwlType$.translate(CwlType.scala:137)
	at dx.cwl.CwlType$.apply(CwlType.scala:144)
	at dx.cwl.CommandInputParameter$.apply(CommandLineTool.scala:65)
	at dx.cwl.CommandLineTool$.$anonfun$apply$19(CommandLineTool.scala:202)
	at scala.collection.Iterator$$anon$9.next(Iterator.scala:575)
	at scala.collection.mutable.Growable.addAll(Growable.scala:62)
	at scala.collection.mutable.Growable.addAll$(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:147)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:40)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:265)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:253)
	at scala.collection.IterableOps.map(Iterable.scala:671)
	at scala.collection.IterableOps.map$(Iterable.scala:671)
	at scala.collection.AbstractIterable.map(Iterable.scala:920)
	at dx.cwl.CommandLineTool$.apply(CommandLineTool.scala:201)
	at dx.cwl.Parser.parse(Parser.scala:64)
	at dx.cwl.Parser.parseFile(Parser.scala:92)
	at dx.translator.cwl.CwlTranslatorFactory.create(CwlTranslator.scala:160)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at scala.collection.IterableOnceOps.collectFirst(IterableOnce.scala:1086)
	at scala.collection.IterableOnceOps.collectFirst$(IterableOnce.scala:1078)
	at scala.collection.AbstractIterable.collectFirst(Iterable.scala:920)
	at dx.translator.TranslatorFactory$.createTranslator(Translator.scala:97)
	at dxCompiler.Main$.compile(Main.scala:394)
	at dxCompiler.Main$.dispatchCommand(Main.scala:765)
	at dxCompiler.Main$.main(Main.scala:870)
	at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:875)
	at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:874)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:73)
	at scala.App.$anonfun$main$1$adapted(App.scala:73)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
	at scala.App.main(App.scala:73)
	at scala.App.main$(App.scala:71)
	at dxCompiler.MainApp$.main(Main.scala:874)
	at dxCompiler.MainApp.main(Main.scala)

variable declaration prevents workflow from running in parallel in dxCompiler 2.10.2

Hi dxCompiler team,

One of the scientists I work with is reporting an issue with one of her workflows: an execution is not starting on DNAnexus even though all of its inputs are satisfied.

Here is the basic workflow definition. When run on DNAnexus, the tasks mutect2 and strelka and the scatter block all execute concurrently; however, muse does not. It waits for the scatter block to complete before executing, even though none of its inputs requires anything from the block.

Moving muse up in the WDL does not resolve the issue.

version 1.0

import "bam_tools.wdl" as bam_tools
import "vcf_tools.wdl" as vcf_tools
import "mutect2.wdl" as mutect2
import "muse.wdl" as muse

workflow call_snps {
    input {
        File tumor_bam
        File normal_bam
        File genome_tarball
        File intervals
        File known_sites
        File known_sites_index
    }

    call mutect2.mutect2 {
        input:
            tumor_bam = tumor_bam,
            normal_bam = normal_bam,
            genome_tarball = genome_tarball,
            intervals = intervals
    }

    call strelka {
        input:
            tumor_bam = tumor_bam,
            normal_bam = normal_bam,
            genome_tarball = genome_tarball
    }

    Array[String] chr_num = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17",  "18", "19", "20", "21", "22", "X", "Y", "M"]
    Array[String] chrs = prefix("chr", chr_num)

    scatter (chr in chrs) {
        call bam_tools.split_bam as split_tumor_bam {
            input:
                chr = chr,
                bam = tumor_bam,
        }

        call bam_tools.split_bam as split_normal_bam {
            input:
                chr = chr,
                bam = normal_bam,
        }

        call lofreq {
            input:
                tumor_bam = split_tumor_bam.chr_bam,
                normal_bam = split_normal_bam.chr_bam,
                genome_tarball = genome_tarball,
                chr = chr,
                known_sites = known_sites,
                known_sites_index = known_sites_index,
        }
    }

    call muse.muse {
        input:
            tumor_bam = tumor_bam,
            normal_bam = normal_bam,
            genome_tarball = genome_tarball,
            chrs = chrs,
            known_sites = known_sites,
    }

    call vcf_tools.bcftools_concat_vcfs as merge_lofreq {
        input:
            vcfs = select_all(lofreq.somatic_vcf),
            caller_name = "lofreq"
    }

    call vcf_tools.intersect_calls as intersect_calls {
        input:
            lofreq_calls = merge_lofreq.merged_vcf,
            muse_calls = muse.muse_vcf,
            mutect2_calls = mutect2.filtered_vcf,
            strelka_calls = strelka.somatic_snv_vcf,
    }

    output {
        File mutect2_vcf = mutect2.filtered_vcf
        File mutect2_filtering_stats = mutect2.filtering_stats
        File strelka_snv_vcf = strelka.somatic_snv_vcf
        File strelka_indel_vcf = strelka.somatic_indel_vcf
        File muse_vcf = muse.muse_vcf
        File lofreq_vcf = merge_lofreq.merged_vcf
        File intersected_calls = intersect_calls.intersected_calls
    }
}

In order to get them to run concurrently, the user reported:

  1. I moved
    Array[String] chr_num = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "X", "Y", "M"]
    Array[String] chrs = prefix("chr", chr_num)
    into defaults.json as "chrs": ["chr1", "chr2", etc.]
  2. I moved LoFreq into a separate workflow and passed it the array chrs
  3. I compiled with dxWDL 1.50. I know this one is deprecated, but I think some of the issues I've been having recently with DNAnexus are coming from dxCompiler.

There appears to be something about the way this sequence of declarations is compiled by dxCompiler that prevents the workflow from running in parallel (i.e. muse must run after the scatter). Let me know what you think and what the solution(s) may be.

[error] You must be logged in to compile using mode All

Hello,

I am trying to compile a SAIGE workflow following these instructions: https://saigegit.github.io/SAIGE-doc/docs/UK_Biobank_WES_analysis.html#prerequisites.

I keep on getting this error when trying to compile the workflow: [error] You must be logged in to compile using mode All.

I am fully logged in to my project, I am able to download/upload files and send swiss-army-knife jobs.

This is the code I am running from my university HPC:
java -jar dxCompiler-2.11.0.jar compile saige_step1_spGRMforNULLModel.wdl -project project-G54K68jJxz3BKpkb0fF7fvvf -folder /workflows/ -f -verbose -logFile run.out

And this is how I defined the docker image path in the saige_null_sGRM_vr_withinfo.wdl file:
docker: "dx://project-G54K68jJxz3BKpkb0fF7fvvf:/GWAS_CAD/associations/ukb/SAIGE/docker_images/saige_1.1.6.3.tar.gz"

Do you have any idea what might be going on?

Thanks much in advance and have a great day!
Ben
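
(A possible cause, stated as an assumption rather than a confirmed diagnosis: dxCompiler reads the DNAnexus security context from the dx-toolkit environment of the shell running java, so being authenticated in the browser is not sufficient on its own; running dx login in the same HPC session before compiling may be what's missing.)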

WDL tasks in docker containers are missing DX environment variables

Hello,

I was creating a wdl that accesses files directly from DNAnexus, something similar to:

task test {
    command <<<
    python3 -c "import dxpy; list(dxpy.find_jobs())"
    >>>
    runtime {
        docker: "dockerimage"
        dx_access: object {
            Project: "CONTRIBUTE"
        }
    }
}

But the command fails with a permission error:
"Requires VIEW access to the project".

On further investigation, adding source /home/dnanexus/environment to the command section does seem to work.

I suspect adding this to the commandScript file or to containerRunScript would solve the problem.
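
That is, the working form of the task currently looks something like the following (a minimal sketch of the workaround above, with the same placeholder docker image):

task test {
    command <<<
    # Restore the DX environment/credentials inside the container before
    # using dxpy (the workaround described above).
    source /home/dnanexus/environment
    python3 -c "import dxpy; list(dxpy.find_jobs())"
    >>>
    runtime {
        docker: "dockerimage"
        dx_access: object {
            Project: "CONTRIBUTE"
        }
    }
}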

Task with stubbed applet as runtime does not obey instance type

Hi dxCompiler team; I am looking to call an existing applet from WDL as part of a larger workflow, and to specify the instance type to use based on input size.

I see some examples in the unit tests (28e3f7f), but following the example got an unexpected result.

task myriad_bcl2fastq220 {
      input {
        String? rerun_nonce
        Array[File] run_archive
        String? advanced_opts
        Boolean? no_lane_splitting
        Boolean? create_empty_fastqs
        Int? upload_parallelism
        File? sample_sheet
      }
      command {}
      output {
        Array[File]+ stats = ["dummy.txt"]
        Array[File]+ demux_summaries = ["dummy.txt"]
        Array[File]+ fastq_summaries = ["dummy.txt"]
        File interop_run_tar = "dummy.txt"
        Array[File] undetermined_reads = []
        Array[File]+ reads = ["dummy.txt"]
        File run_log = "dummy.txt"
        Array[File] reads2 = []
      }
      runtime {
        # choose large bandwidth instance if s4 flowcell
        dx_instance_type: if size(run_archive, "GB") > 200 then "mem3_ssd3_x96" else "mem1_ssd1_v2_x72"
        dx_app: object {
            type: "applet",
            id: "applet-GK3BQz80jf5258Z175B26xgV",  # app: myriad-bcl2fastq220 with retry
            name: "myriad-bcl2fastq220"
        }
      }
}

At runtime this job continues to use the default app instance (https://platform.dnanexus.com/projects/FyJQfy80jf50JG4xGXxF2bgq/monitor/job/GKFfFx80jf5FzKx239VB5gkg), and the snippet is also not proper WDL syntax:

(demux.wdl Ln 66 Col 17) Expected String instead of object(id : String, name : String, type : String)
            dx_app: object {
                    ^^^^^^^^

How may I specify an existing app/applet but override the instance / compute parameters?

Supporting WDL 1.0 "struct"

dxCompiler 2.4.1 does not support "struct" properly and gives the error message below during compilation. dxWDL 1.50, on the other hand, compiles the same file successfully.

==============START OF ERROR MESSAGE================
[error] Error translating test-struct.wdl to IR
java.lang.Exception: input <sample_name, TString> to call is missing from the environment. We don't have sampleStruct.sample_name in the environment.
...
==============END OF ERROR MESSAGE================

===============START OF WDL FILE =====================
version 1.0

task test {
  input {
    String sample_name
    Array[File] read_fastq_list
    Array[String] flowcell_id_list
  }

  command <<<
    set -ex

    mkdir fastq_dir
    while read flowcell_id fastq_file
      do mkdir ${flowcell_id}
      mv ${fastq_file} ${flowcell_id}
    done < <(paste ~{write_lines(flowcell_id_list)} ~{write_lines(read_fastq_list)})
    echo $(ls -R)
  >>>

  output {
    String out = read_string(stdout())
  }

  runtime {
    docker: "docker.io/scicomppublic/centos:7"
  }
}

struct SampleStruct {
  String sample_name
  Array[File] read_fastq_list
  Array[String] flowcell_id_list
}

workflow test_workflow {
  input {
    SampleStruct sampleStruct
  }

  call test {
    input:
      sample_name = sampleStruct.sample_name,
      read_fastq_list = sampleStruct.read_fastq_list,
      flowcell_id_list = sampleStruct.flowcell_id_list,
  }

  output {
    String out = test.out
  }
}
===============END OF FILE =====================
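
By analogy with the optional-array zip issue elsewhere on this page, one workaround worth trying (an untested guess, not a confirmed fix) is to hoist the struct member accesses into plain declarations before the call:

String sn = sampleStruct.sample_name
Array[File] fq = sampleStruct.read_fastq_list
Array[String] fc = sampleStruct.flowcell_id_list

call test {
  input:
    sample_name = sn,
    read_fastq_list = fq,
    flowcell_id_list = fc,
}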

Downloading lots of small files

I have a large number of small files stored in an input Array[Array[File]]. They seem to take a long time to download on DNAnexus even though the total size is relatively small. Are you aware of any additional parameters (for example, ones enabling threaded downloads) that may help with download speeds?

Many Thanks,
Barney

java.lang.AssertionError: assertion failed: only the last part may end with '_', and only if the suffix does not start with '_'

Could you update this to provide more information when this condition is triggered?

"only the last part may end with '_', and only if the suffix does not start with '_'"

For example, when compiling a workflow I get:

[error] Error translating seaseq-control.wdl to IR
java.lang.AssertionError: assertion failed: only the last part may end with '_', and only if the suffix does not start with '_'
        at scala.Predef$.assert(Predef.scala:279)
        at dx.core.ir.DxName.$anonfun$new$9(DxName.scala:77)
        at dx.core.ir.DxName.$anonfun$new$9$adapted(DxName.scala:71)
        at scala.collection.immutable.Vector.foreach(Vector.scala:1856)
        at dx.core.ir.DxName.$anonfun$new$7(DxName.scala:71)
        at dx.core.ir.DxName.$anonfun$new$7$adapted(DxName.scala:69)
        at scala.Option.foreach(Option.scala:437)
        at dx.core.ir.DxName.<init>(DxName.scala:69)
        at dx.core.languages.wdl.WdlDxName.<init>(WdlDxName.scala:42)
        at dx.core.languages.wdl.WdlDxName$.fromSourceName(WdlDxName.scala:30)
        at dx.core.languages.wdl.WdlUtils$.$anonfun$getClosureInputsAndOutputs$2(WdlUtils.scala:975)
        at scala.collection.StrictOptimizedIterableOps.map(StrictOptimizedIterableOps.scala:100)
        at scala.collection.StrictOptimizedIterableOps.map$(StrictOptimizedIterableOps.scala:87)
        at scala.collection.immutable.TreeSeqMap.map(TreeSeqMap.scala:45)
        at dx.core.languages.wdl.WdlUtils$.$anonfun$getClosureInputsAndOutputs$1(WdlUtils.scala:972)
        at scala.collection.StrictOptimizedIterableOps.flatMap(StrictOptimizedIterableOps.scala:118)
        at scala.collection.StrictOptimizedIterableOps.flatMap$(StrictOptimizedIterableOps.scala:105)
        at scala.collection.immutable.Vector.flatMap(Vector.scala:113)
        at dx.core.languages.wdl.WdlUtils$.getOutputs$1(WdlUtils.scala:964)
        at dx.core.languages.wdl.WdlUtils$.$anonfun$getClosureInputsAndOutputs$1(WdlUtils.scala:980)
        at scala.collection.StrictOptimizedIterableOps.flatMap(StrictOptimizedIterableOps.scala:118)
        at scala.collection.StrictOptimizedIterableOps.flatMap$(StrictOptimizedIterableOps.scala:105)
        at scala.collection.immutable.Vector.flatMap(Vector.scala:113)
        at dx.core.languages.wdl.WdlUtils$.getOutputs$1(WdlUtils.scala:964)
        at dx.core.languages.wdl.WdlUtils$.getClosureInputsAndOutputs(WdlUtils.scala:1050)
        at dx.core.languages.wdl.WdlBlock$.$anonfun$createBlocks$3(WdlBlock.scala:444)
        at scala.collection.immutable.Vector1.map(Vector.scala:1886)
        at scala.collection.immutable.Vector1.map(Vector.scala:375)
        at dx.core.languages.wdl.WdlBlock$.createBlocks(WdlBlock.scala:442)
        at dx.translator.wdl.CallableTranslator$WdlWorkflowTranslator.translate(CallableTranslator.scala:1104)
        at dx.translator.WorkflowTranslator.apply(WorkflowTranslator.scala:239)
        at dx.translator.wdl.CallableTranslator.translateCallable(CallableTranslator.scala:1131)
        at dx.translator.wdl.WdlTranslator.$anonfun$apply$2(WdlTranslator.scala:172)
        at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646)
        at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642)
        at scala.collection.AbstractIterable.foldLeft(Iterable.scala:926)
        at dx.translator.wdl.WdlTranslator.apply$lzycompute(WdlTranslator.scala:170)
        at dx.translator.wdl.WdlTranslator.apply(WdlTranslator.scala:144)
        at dxCompiler.Main$.compile(Main.scala:438)
        at dxCompiler.Main$.dispatchCommand(Main.scala:799)
        at dxCompiler.Main$.main(Main.scala:922)
        at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:927)
        at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:926)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1(App.scala:76)
        at scala.App.$anonfun$main$1$adapted(App.scala:76)
        at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
        at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
        at scala.App.main(App.scala:76)
        at scala.App.main$(App.scala:74)
        at dxCompiler.MainApp$.main(Main.scala:926)
        at dxCompiler.MainApp.main(Main.scala)

Even using the --verbose option doesn't provide enough information to localize where the error is arising.

ParsingException when compiling CWL

Hi all,

I am looking into how to use dxCompiler with some trivial CWL files, but I am not getting very far. Perhaps I am doing something wrong?

It appears to try to parse the CWL file as JSON, even though my file is in YAML, which is certainly allowed by the CWL standard. But perhaps not by dxCompiler?

$ java -jar /dxCompiler.jar compile workflows/hello-world-plain.cwl -language CWL -verbose
Warning: no DNAnexus security context found.
[error] Error creating translator for workflows/hello-world-plain.cwl
spray.json.JsonParser$ParsingException: Unexpected character 'c' at input index 0 (line 1, position 1), expected JSON Value:
cwlVersion: v1.2
^

        at spray.json.JsonParser.fail(JsonParser.scala:237)
        at spray.json.JsonParser.value(JsonParser.scala:79)
        at spray.json.JsonParser.parseJsValue(JsonParser.scala:51)
        at spray.json.JsonParser.parseJsValue(JsonParser.scala:47)
        at spray.json.JsonParser$.apply(JsonParser.scala:30)
        at spray.json.RichString.parseJson(package.scala:50)
        at dx.util.JsUtils$.jsFromFile(JsUtils.scala:10)
        at dx.cwl.Parser.parseFile(Parser.scala:213)
        at dx.translator.cwl.CwlTranslatorFactory.create(CwlTranslator.scala:259)
        at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:103)
        at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:104)
        at scala.collection.IterableOnceOps.collectFirst(IterableOnce.scala:1086)
        at scala.collection.IterableOnceOps.collectFirst$(IterableOnce.scala:1078)
        at scala.collection.AbstractIterable.collectFirst(Iterable.scala:920)
        at dx.translator.TranslatorFactory$.createTranslator(Translator.scala:104)
        at dxCompiler.Main$.compile(Main.scala:418)
        at dxCompiler.Main$.dispatchCommand(Main.scala:799)
        at dxCompiler.Main$.main(Main.scala:919)
        at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:924)
        at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:923)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1(App.scala:73)
        at scala.App.$anonfun$main$1$adapted(App.scala:73)
        at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
        at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
        at scala.App.main(App.scala:73)
        at scala.App.main$(App.scala:71)
        at dxCompiler.MainApp$.main(Main.scala:923)
        at dxCompiler.MainApp.main(Main.scala)

with hello-world-plain.cwl:

cwlVersion: v1.2
class: CommandLineTool

baseCommand:
  - echo
  - Hello World!
stdout: output.txt
inputs: {}
outputs:
  hello-world-plain/example_out:
    type: stdout
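
(Judging from the parser error, dxCompiler seems to expect the CWL in packed JSON form. Assuming cwltool is available, a likely workaround is to pack the YAML first, e.g. cwltool --pack hello-world-plain.cwl > hello-world-plain.cwl.json, and compile the resulting JSON instead.)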

dxcompiler can not process Pair object

I use Array[Pair[X,Y]] = zip(Array[X], Array[Y]) in my wdl workflow.
Using dxCompiler gives an illegal expression failure error. Is there a workaround for this?

[error] failure executing Workflow action 'run'
wdlTools.eval.EvalException: member access (left) in expression is illegal at 1:0-1:22 in <string>

Inputs produced by dxCompiler

Hi,

I've imported the following pipeline with dxCompiler: https://github.com/broadinstitute/warp/blob/develop/pipelines/broad/dna_seq/germline/single_sample/wgs/WholeGenomeGermlineSingleSample.wdl

In the original pipeline, the inputs are as follows:

{
  "WholeGenomeGermlineSingleSample.sample_and_unmapped_bams": {
    "sample_name": "NA12878 PLUMBING",
    "base_file_name": "NA12878_PLUMBING",
    "flowcell_unmapped_bams": [
      "gs://broad-public-datasets/NA12878_downsampled_for_testing/unmapped/H06HDADXX130110.1.ATCACGAT.20k_reads.bam",
      "gs://broad-public-datasets/NA12878_downsampled_for_testing/unmapped/H06HDADXX130110.2.ATCACGAT.20k_reads.bam",
      "gs://broad-public-datasets/NA12878_downsampled_for_testing/unmapped/H06JUADXX130110.1.ATCACGAT.20k_reads.bam"
    ],
    "final_gvcf_base_name": "NA12878_PLUMBING",
    "unmapped_bam_suffix": ".bam"
  },

  "WholeGenomeGermlineSingleSample.references": {
    "contamination_sites_ud": "gs://gcp-public-data--broad-references/hg38/v0/contamination-resources/1000g/1000g.phase3.100k.b38.vcf.gz.dat.UD",
    "contamination_sites_bed": "gs://gcp-public-data--broad-references/hg38/v0/contamination-resources/1000g/1000g.phase3.100k.b38.vcf.gz.dat.bed",
    "contamination_sites_mu": "gs://gcp-public-data--broad-references/hg38/v0/contamination-resources/1000g/1000g.phase3.100k.b38.vcf.gz.dat.mu",
    "calling_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list",
    "reference_fasta" : {
        "ref_dict": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dict",
        "ref_fasta": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta",
        "ref_fasta_index": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai",
        "ref_alt": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.alt",
        "ref_sa": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.sa",
        "ref_amb": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.amb",
        "ref_bwt": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.bwt",
        "ref_ann": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.ann",
        "ref_pac": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.pac"
    },
    "known_indels_sites_vcfs": [
      "gs://gcp-public-data--broad-references/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz",
      "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz"
    ],
    "known_indels_sites_indices": [
      "gs://gcp-public-data--broad-references/hg38/v0/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi",
      "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi"
    ],
    "dbsnp_vcf": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf",
    "dbsnp_vcf_index": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.idx",
    "evaluation_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_evaluation_regions.hg38.interval_list",
    "haplotype_database_file": "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.haplotype_database.txt"
  },

  "WholeGenomeGermlineSingleSample.scatter_settings": {
    "haplotype_scatter_count": 10,
    "break_bands_at_multiples_of": 100000
  },

  "WholeGenomeGermlineSingleSample.fingerprint_genotypes_file": "gs://broad-gotc-test-storage/single_sample/plumbing/bams/G96830.NA12878/G96830.NA12878.hg38.reference.fingerprint.vcf.gz",
  "WholeGenomeGermlineSingleSample.fingerprint_genotypes_index": "gs://broad-gotc-test-storage/single_sample/plumbing/bams/G96830.NA12878/G96830.NA12878.hg38.reference.fingerprint.vcf.gz.tbi",
  "WholeGenomeGermlineSingleSample.wgs_coverage_interval_list": "gs://gcp-public-data--broad-references/hg38/v0/wgs_coverage_regions.hg38.interval_list",

  "WholeGenomeGermlineSingleSample.papi_settings": {
    "preemptible_tries": 3,
    "agg_preemptible_tries": 3
  }
}

The common inputs produced are:

Inputs:
 stage-common
  stage-common.fingerprint_genotypes_file: [-istage-common.fingerprint_genotypes_file=(file)]

  stage-common.fingerprint_genotypes_index: [-istage-common.fingerprint_genotypes_index=(file)]

  stage-common.papi_settings: -istage-common.papi_settings=(hash)

  stage-common.papi_settings___dxfiles: [-istage-common.papi_settings___dxfiles=(file) [-istage-common.papi_settings___dxfiles=... [...]]]

  stage-common.provide_bam_output: [-istage-common.provide_bam_output=(boolean, default=false)]

  stage-common.references: -istage-common.references=(hash)

  stage-common.references___dxfiles: [-istage-common.references___dxfiles=(file) [-istage-common.references___dxfiles=... [...]]]

  stage-common.sample_and_unmapped_bams: -istage-common.sample_and_unmapped_bams=(hash)

  stage-common.sample_and_unmapped_bams___dxfiles: [-istage-common.sample_and_unmapped_bams___dxfiles=(file) [-istage-common.sample_and_unmapped_bams___dxfiles=... [...]]]

  stage-common.scatter_settings: -istage-common.scatter_settings=(hash)

  stage-common.scatter_settings___dxfiles: [-istage-common.scatter_settings___dxfiles=(file) [-istage-common.scatter_settings___dxfiles=... [...]]]

  stage-common.use_gatk3_haplotype_caller: [-istage-common.use_gatk3_haplotype_caller=(boolean, default=true)]

  stage-common.wgs_coverage_interval_list: -istage-common.wgs_coverage_interval_list=(file)

I've looked in the documentation but it's unclear how the parameters that now take a hash and a file (e.g. stage-common.references, stage-common.references___dxfiles) should be entered; can you provide some guidance?
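One way to avoid hand-crafting the hash and ___dxfiles values, sketched from the dxCompiler documentation (project, folder, and file names here are illustrative), is to let dxCompiler translate a Cromwell-style inputs JSON into a platform-ready one at compile time:

java -jar dxCompiler-2.11.4.jar compile WholeGenomeGermlineSingleSample.wdl \
    -project project-xxxx -folder /workflows/ \
    -inputs inputs.json

# compile writes a translated inputs file (inputs.dx.json) next to
# inputs.json; pass it to dx run instead of typing -istage-common.* by hand
dx run /workflows/WholeGenomeGermlineSingleSample -f inputs.dx.json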

Runtime error with default string = ""

Hi DXCompiler team,

I recently ran into a runtime IOError, "No value found for I O spec field caller_to_emulate", when calling a workflow with a String input that defaults to "", using dxCompiler 2.10.2.

For example

workflow workflow1 {
  input {
    Int req_input
    Int random_seed = -1
    String caller_to_emulate = ""
  }
  ...
}

# workflow2 calls workflow1, but does not provide caller_to_emulate
workflow workflow2 {
  ...
  call workflow1 {
    input:
      req_input = 1
  }
}

I haven't found anything in the WDL spec as to why an empty string would not be accepted as a default (see https://github.com/openwdl/wdl/blob/main/versions/1.1/SPEC.md#concatenation-of-optional-values). Could you investigate this one?
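A minimal workaround sketch, assuming the failure is specific to the literal "" default (untested; caller_to_emulate_opt is a name introduced here): declare the input as optional and derive the concrete value in the workflow body.

workflow workflow1 {
  input {
    Int req_input
    Int random_seed = -1
    String? caller_to_emulate_opt    # optional, no literal "" default
  }
  # falls back to "" when the caller does not provide a value
  String caller_to_emulate = select_first([caller_to_emulate_opt, ""])
}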

Preserve folder hierarchies on the platform

My team would find the previously requested behaviour in dnanexus/dxWDL#168 very useful.

We are investigating switching over to WDL workflows and are almost there for one of our pipelines. However, one problem is that while the '--stage-relative-output-folder' argument lets us specify a relative output folder per task for all files produced by that task, it does not allow individual files to be placed in different directories.

It would be very helpful if the platform allowed this behaviour to be specified in the outputs section of each WDL task, rather than requiring a separate reorg app (which adds to the code base that must be maintained, and means files are only moved to the correct location at the end of the workflow rather than at the point of delocalisation).

I noticed it has been several years since the last comment on the closed issue dnanexus/dxWDL#168. Could you provide an update on whether the exploratory work went anywhere, and whether the professional services department decided this functionality would be useful to incorporate into dxCompiler?

Many thanks!
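For context, the current mechanism referred to above is a separate reorg applet wired in through extras.json; a minimal sketch of that configuration, with a placeholder applet ID (field names as described in the dxCompiler extras documentation; the reorg applet itself must be written and built separately):

{
    "custom_reorg": {
        "app_id": "applet-xxxx",
        "conf": null
    }
}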

type T_Optional(T_Float) is not coercible to string

Hi,

I am trying to compile the following pipeline to use on the DNAnexus platform: https://github.com/broadinstitute/warp/blob/develop/pipelines/broad/dna_seq/germline/single_sample/wgs/WholeGenomeGermlineSingleSample.wdl

I've successfully processed it using wdlTools fix first, but am now encountering the following type coercion issue: "type T_Optional(T_Float) is not coercible to string".

Is this something that should be supported by dxCompiler or do I need to fix these manually?

Thanks,
Laurent

Stack trace:

[error] WDL code is syntactically valid BUT it fails type-checking -----
version 1.0

import "../../../../../../tasks/broad/UnmappedBamToAlignedBam.wdl" as ToBam
import "../../../../../../tasks/broad/AggregatedBamQC.wdl" as AggregatedQC
import "../../../../../../tasks/broad/Qc.wdl" as QC
import "../../../../../../tasks/broad/BamToCram.wdl" as ToCram
import "../../../../../../pipelines/broad/dna_seq/germline/variant_calling/VariantCalling.wdl" as
  ToGvcf
import "../../../../../../structs/dna_seq/DNASeqStructs.wdl" as DNASeqStructs

workflow WholeGenomeGermlineSingleSample {
  input {
    SampleAndUnmappedBams sample_and_unmapped_bams
    DNASeqSingleSampleReferences references
    VariantCallingScatterSettings scatter_settings
    PapiSettings papi_settings
    File? fingerprint_genotypes_file
    File? fingerprint_genotypes_index
    File wgs_coverage_interval_list
    Boolean provide_bam_output = false
    Boolean use_gatk3_haplotype_caller = true
  }

  String pipeline_version = "2.3.4"
  Int read_length = 250
  Float lod_threshold = -20.0
  String cross_check_fingerprints_by = "READGROUP"
  String recalibrated_bam_basename = sample_and_unmapped_bams.base_file_name +
    ".aligned.duplicates_marked.recalibrated"
  String final_gvcf_base_name = select_first(
    [sample_and_unmapped_bams.final_gvcf_base_name, sample_and_unmapped_bams.base_file_name]
    )

  call ToBam.UnmappedBamToAlignedBam {
    input:
      sample_and_unmapped_bams = sample_and_unmapped_bams,
      references = references,
      papi_settings = papi_settings,
      contamination_sites_ud = references.contamination_sites_ud,
      contamination_sites_bed = references.contamination_sites_bed,
      contamination_sites_mu = references.contamination_sites_mu,
      cross_check_fingerprints_by = cross_check_fingerprints_by,
      haplotype_database_file = references.haplotype_database_file,
      lod_threshold = lod_threshold,
      recalibrated_bam_basename = recalibrated_bam_basename
  }

  call AggregatedQC.AggregatedBamQC {
    input:
      base_recalibrated_bam = UnmappedBamToAlignedBam.output_bam,
      base_recalibrated_bam_index = UnmappedBamToAlignedBam.output_bam_index,
      base_name = sample_and_unmapped_bams.base_file_name,
      sample_name = sample_and_unmapped_bams.sample_name,
      recalibrated_bam_base_name = recalibrated_bam_basename,
      haplotype_database_file = references.haplotype_database_file,
      references = references,
      fingerprint_genotypes_file = fingerprint_genotypes_file,
      fingerprint_genotypes_index = fingerprint_genotypes_index,
      papi_settings = papi_settings
  }

  call ToCram.BamToCram as BamToCram {
    input:
      input_bam = UnmappedBamToAlignedBam.output_bam,
      ref_fasta = references.reference_fasta.ref_fasta,
      ref_fasta_index = references.reference_fasta.ref_fasta_index,
      ref_dict = references.reference_fasta.ref_dict,
      duplication_metrics = UnmappedBamToAlignedBam.duplicate_metrics,
      chimerism_metrics = AggregatedBamQC.agg_alignment_summary_metrics,
      base_file_name = sample_and_unmapped_bams.base_file_name,
      agg_preemptible_tries = papi_settings.agg_preemptible_tries
  }

  call QC.CollectWgsMetrics as CollectWgsMetrics {
    input:
      input_bam = UnmappedBamToAlignedBam.output_bam,
      input_bam_index = UnmappedBamToAlignedBam.output_bam_index,
      metrics_filename = sample_and_unmapped_bams.base_file_name + ".wgs_metrics",
      ref_fasta = references.reference_fasta.ref_fasta,
      ref_fasta_index = references.reference_fasta.ref_fasta_index,
      wgs_coverage_interval_list = wgs_coverage_interval_list,
      read_length = read_length,
      preemptible_tries = papi_settings.agg_preemptible_tries
  }

  call QC.CollectRawWgsMetrics as CollectRawWgsMetrics {
    input:
      input_bam = UnmappedBamToAlignedBam.output_bam,
      input_bam_index = UnmappedBamToAlignedBam.output_bam_index,
      metrics_filename = sample_and_unmapped_bams.base_file_name + ".raw_wgs_metrics",
      ref_fasta = references.reference_fasta.ref_fasta,
      ref_fasta_index = references.reference_fasta.ref_fasta_index,
      wgs_coverage_interval_list = wgs_coverage_interval_list,
      read_length = read_length,
      preemptible_tries = papi_settings.agg_preemptible_tries
  }

  call ToGvcf.VariantCalling as BamToGvcf {
    input:
      calling_interval_list = references.calling_interval_list,
      evaluation_interval_list = references.evaluation_interval_list,
      haplotype_scatter_count = scatter_settings.haplotype_scatter_count,
      break_bands_at_multiples_of = scatter_settings.break_bands_at_multiples_of,
      contamination = UnmappedBamToAlignedBam.contamination,
      input_bam = UnmappedBamToAlignedBam.output_bam,
      input_bam_index = UnmappedBamToAlignedBam.output_bam_index,
      ref_fasta = references.reference_fasta.ref_fasta,
      ref_fasta_index = references.reference_fasta.ref_fasta_index,
      ref_dict = references.reference_fasta.ref_dict,
      dbsnp_vcf = references.dbsnp_vcf,
      dbsnp_vcf_index = references.dbsnp_vcf_index,
      base_file_name = sample_and_unmapped_bams.base_file_name,
      final_vcf_base_name = final_gvcf_base_name,
      agg_preemptible_tries = papi_settings.agg_preemptible_tries,
      use_gatk3_haplotype_caller = use_gatk3_haplotype_caller
  }

  if (provide_bam_output) {
    File provided_output_bam = UnmappedBamToAlignedBam.output_bam
    File provided_output_bam_index = UnmappedBamToAlignedBam.output_bam_index
  }

  output {
    Array[File] quality_yield_metrics = UnmappedBamToAlignedBam.quality_yield_metrics
    Array[File] unsorted_read_group_base_distribution_by_cycle_pdf =
      UnmappedBamToAlignedBam.unsorted_read_group_base_distribution_by_cycle_pdf
    Array[File] unsorted_read_group_base_distribution_by_cycle_metrics =
      UnmappedBamToAlignedBam.unsorted_read_group_base_distribution_by_cycle_metrics
    Array[File] unsorted_read_group_insert_size_histogram_pdf =
      UnmappedBamToAlignedBam.unsorted_read_group_insert_size_histogram_pdf
    Array[File] unsorted_read_group_insert_size_metrics =
      UnmappedBamToAlignedBam.unsorted_read_group_insert_size_metrics
    Array[File] unsorted_read_group_quality_by_cycle_pdf =
      UnmappedBamToAlignedBam.unsorted_read_group_quality_by_cycle_pdf
    Array[File] unsorted_read_group_quality_by_cycle_metrics =
      UnmappedBamToAlignedBam.unsorted_read_group_quality_by_cycle_metrics
    Array[File] unsorted_read_group_quality_distribution_pdf =
      UnmappedBamToAlignedBam.unsorted_read_group_quality_distribution_pdf
    Array[File] unsorted_read_group_quality_distribution_metrics =
      UnmappedBamToAlignedBam.unsorted_read_group_quality_distribution_metrics
    File read_group_alignment_summary_metrics = AggregatedBamQC.read_group_alignment_summary_metrics
    File read_group_gc_bias_detail_metrics = AggregatedBamQC.read_group_gc_bias_detail_metrics
    File read_group_gc_bias_pdf = AggregatedBamQC.read_group_gc_bias_pdf
    File read_group_gc_bias_summary_metrics = AggregatedBamQC.read_group_gc_bias_summary_metrics
    File? cross_check_fingerprints_metrics =
      UnmappedBamToAlignedBam.cross_check_fingerprints_metrics
    File selfSM = UnmappedBamToAlignedBam.selfSM
    Float contamination = UnmappedBamToAlignedBam.contamination
    File calculate_read_group_checksum_md5 = AggregatedBamQC.calculate_read_group_checksum_md5
    File agg_alignment_summary_metrics = AggregatedBamQC.agg_alignment_summary_metrics
    File agg_bait_bias_detail_metrics = AggregatedBamQC.agg_bait_bias_detail_metrics
    File agg_bait_bias_summary_metrics = AggregatedBamQC.agg_bait_bias_summary_metrics
    File agg_gc_bias_detail_metrics = AggregatedBamQC.agg_gc_bias_detail_metrics
    File agg_gc_bias_pdf = AggregatedBamQC.agg_gc_bias_pdf
    File agg_gc_bias_summary_metrics = AggregatedBamQC.agg_gc_bias_summary_metrics
    File agg_insert_size_histogram_pdf = AggregatedBamQC.agg_insert_size_histogram_pdf
    File agg_insert_size_metrics = AggregatedBamQC.agg_insert_size_metrics
    File agg_pre_adapter_detail_metrics = AggregatedBamQC.agg_pre_adapter_detail_metrics
    File agg_pre_adapter_summary_metrics = AggregatedBamQC.agg_pre_adapter_summary_metrics
    File agg_quality_distribution_pdf = AggregatedBamQC.agg_quality_distribution_pdf
    File agg_quality_distribution_metrics = AggregatedBamQC.agg_quality_distribution_metrics
    File agg_error_summary_metrics = AggregatedBamQC.agg_error_summary_metrics
    File? fingerprint_summary_metrics = AggregatedBamQC.fingerprint_summary_metrics
    File? fingerprint_detail_metrics = AggregatedBamQC.fingerprint_detail_metrics
    File wgs_metrics = CollectWgsMetrics.metrics
    File raw_wgs_metrics = CollectRawWgsMetrics.metrics
    File duplicate_metrics = UnmappedBamToAlignedBam.duplicate_metrics
    File output_bqsr_reports = UnmappedBamToAlignedBam.output_bqsr_reports
    File gvcf_summary_metrics = BamToGvcf.vcf_summary_metrics
    File gvcf_detail_metrics = BamToGvcf.vcf_detail_metrics
    File? output_bam = provided_output_bam
    File? output_bam_index = provided_output_bam_index
    File output_cram = BamToCram.output_cram
    File output_cram_index = BamToCram.output_cram_index
    File output_cram_md5 = BamToCram.output_cram_md5
    File validate_cram_file_report = BamToCram.validate_cram_file_report
    File output_vcf = BamToGvcf.output_vcf
    File output_vcf_index = BamToGvcf.output_vcf_index
  }

  meta {
    allowNestedInputs: true
  }
}
[error] Error creating translator for WholeGenomeGermlineSingleSample.wdl
wdlTools.types.TypeException: expression sorting_collection_size_ratio of type T_Optional(T_Float) is not coercible to string at 106:41-106:70 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/../../tasks/broad/../../tasks/broad/BamProcessing.wdl
expression max_output of type T_Optional(T_Int) is not coercible to string at 393:22-393:32 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/../../tasks/broad/Qc.wdl
expression sorting_collection_size_ratio of type T_Optional(T_Float) is not coercible to string at 106:41-106:70 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/../../tasks/broad/../../tasks/broad/BamProcessing.wdl
expression max_output of type T_Optional(T_Int) is not coercible to string at 393:22-393:32 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/../../tasks/broad/Qc.wdl
expression max_output of type T_Optional(T_Int) is not coercible to string at 393:22-393:32 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/Qc.wdl
expression max_output of type T_Optional(T_Int) is not coercible to string at 393:22-393:32 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../tasks/broad/../../tasks/broad/Qc.wdl
expression max_output of type T_Optional(T_Int) is not coercible to string at 393:22-393:32 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../pipelines/broad/dna_seq/germline/variant_calling/../../../../../tasks/broad/Qc.wdl
expression sorting_collection_size_ratio of type T_Optional(T_Float) is not coercible to string at 106:41-106:70 in /Users/franciol/OneDrive - Vertex Pharmaceuticals/tools/warp/pipelines/broad/dna_seq/germline/single_sample/wgs/gatk-fixed/pipelines/broad/dna_seq/germline/single_sample/wgs/../../../../../../pipelines/broad/dna_seq/germline/variant_calling/../../../../../tasks/broad/BamProcessing.wdl
	at wdlTools.types.TypeInfer.apply(TypeInfer.scala:1307)
	at dx.core.languages.wdl.WdlUtils$.parseAndCheckSource(WdlUtils.scala:101)
	at dx.core.languages.wdl.VersionSupport$.fromSource(VersionSupport.scala:128)
	at dx.core.languages.wdl.VersionSupport$.fromSourceFile(VersionSupport.scala:141)
	at dx.translator.wdl.WdlTranslatorFactory.liftedTree1$1(WdlTranslator.scala:223)
	at dx.translator.wdl.WdlTranslatorFactory.create(WdlTranslator.scala:222)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at scala.collection.IterableOnceOps.collectFirst(IterableOnce.scala:1086)
	at scala.collection.IterableOnceOps.collectFirst$(IterableOnce.scala:1078)
	at scala.collection.AbstractIterable.collectFirst(Iterable.scala:920)
	at dx.translator.TranslatorFactory$.createTranslator(Translator.scala:97)
	at dxCompiler.Main$.compile(Main.scala:394)
	at dxCompiler.Main$.dispatchCommand(Main.scala:765)
	at dxCompiler.Main$.main(Main.scala:870)
	at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:875)
	at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:874)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:73)
	at scala.App.$anonfun$main$1$adapted(App.scala:73)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
	at scala.App.main(App.scala:73)
	at scala.App.main$(App.scala:71)
	at dxCompiler.MainApp$.main(Main.scala:874)
	at dxCompiler.MainApp.main(Main.scala)

[error] org.fusesource.scalate.TemplateException:

Hi team, I'm trying to compile a WDL workflow and I have this error:

$ java -jar /path/to/apps/dxCompiler-2.11.4.jar compile wgs_qc_wf.wdl -project project-F292qGQ02k8Y48QfFzJqX2j0 -folder /apps/wdl_wf/wgs_qc_wf/ -f
[error] org.fusesource.scalate.TemplateException:
  bad constant pool index: 0 at pos: 48445
     while compiling: <no file>
        during phase: globalPhase=<no phase>, enteringPhase=<some phase>
     library version: version 2.13.7
    compiler version: version 2.13.7
  reconstructed args: -dependencyfile none -deprecation -Wconf:cat=deprecation:w -Wconf:cat=deprecation:ws -Wconf:cat=feature:ws -Wconf:cat=optimizer:ws -classpath /Users/gcarvalhoneto/Documents/apps/dxCompiler-2.11.4.jar -d /var/folders/ml/m0gn1_b5737fr2617kl9mzymkfh1ym/T/scalate-15682643873275838727-workdir/classes

  last tree to typer: EmptyTree
       tree position: <unknown>
            tree tpe: <notype>
              symbol: null
           call site: <none> in <none>

== Source file context for tree position ==


	at org.fusesource.scalate.TemplateEngine.compileAndLoad(TemplateEngine.scala:864)
	at org.fusesource.scalate.TemplateEngine.compileAndLoadEntry(TemplateEngine.scala:725)
	at org.fusesource.scalate.TemplateEngine.liftedTree1$1(TemplateEngine.scala:436)
	at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:430)
	at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:485)
	at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:582)
	at wdlTools.generators.Renderer.render(Renderer.scala:15)
	at dx.compiler.ApplicationCompiler.generateJobScript(ApplicationCompiler.scala:139)
	at dx.compiler.ApplicationCompiler.createRunSpec(ApplicationCompiler.scala:189)
	at dx.compiler.ApplicationCompiler.apply(ApplicationCompiler.scala:552)
	at dx.compiler.Compiler$BundleCompiler.maybeBuildApplet(Compiler.scala:357)
	at dx.compiler.Compiler$BundleCompiler.$anonfun$apply$1(Compiler.scala:471)
	at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646)
	at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642)
	at scala.collection.AbstractIterable.foldLeft(Iterable.scala:926)
	at dx.compiler.Compiler$BundleCompiler.apply(Compiler.scala:441)
	at dx.compiler.Compiler.apply(Compiler.scala:494)
	at dxCompiler.Main$.compile(Main.scala:538)
	at dxCompiler.Main$.dispatchCommand(Main.scala:791)
	at dxCompiler.Main$.main(Main.scala:922)
	at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:927)
	at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:926)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at dxCompiler.MainApp$.main(Main.scala:926)
	at dxCompiler.MainApp.main(Main.scala)
Caused by: scala.reflect.internal.FatalError:
  bad constant pool index: 0 at pos: 48445
     while compiling: <no file>
        during phase: globalPhase=<no phase>, enteringPhase=<some phase>
     library version: version 2.13.7
    compiler version: version 2.13.7
  reconstructed args: -dependencyfile none -deprecation -Wconf:cat=deprecation:w -Wconf:cat=deprecation:ws -Wconf:cat=feature:ws -Wconf:cat=optimizer:ws -classpath /Users/gcarvalhoneto/Documents/apps/dxCompiler-2.11.4.jar -d /var/folders/ml/m0gn1_b5737fr2617kl9mzymkfh1ym/T/scalate-15682643873275838727-workdir/classes

  last tree to typer: EmptyTree
       tree position: <unknown>
            tree tpe: <notype>
              symbol: null
           call site: <none> in <none>

== Source file context for tree position ==


	at scala.reflect.internal.Reporting.abort(Reporting.scala:69)
	at scala.reflect.internal.Reporting.abort$(Reporting.scala:65)
	at scala.reflect.internal.SymbolTable.abort(SymbolTable.scala:28)
	at scala.tools.nsc.symtab.classfile.ClassfileParser$ConstantPool.errorBadIndex(ClassfileParser.scala:407)
	at scala.tools.nsc.symtab.classfile.ClassfileParser$ConstantPool.getExternalName(ClassfileParser.scala:262)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.readParamNames$1(ClassfileParser.scala:853)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parseAttribute$1(ClassfileParser.scala:859)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parseAttributes$6(ClassfileParser.scala:936)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parseAttributes(ClassfileParser.scala:936)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parseMethod(ClassfileParser.scala:635)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:548)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:174)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:159)
	at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:142)
	at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:342)
	at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.$anonfun$complete$2(SymbolLoaders.scala:249)
	at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:247)
	at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1561)
	at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1533)
	at scala.reflect.internal.Definitions.scala$reflect$internal$Definitions$$enterNewMethod(Definitions.scala:47)
	at scala.reflect.internal.Definitions$DefinitionsClass.String_$plus$lzycompute(Definitions.scala:1256)
	at scala.reflect.internal.Definitions$DefinitionsClass.String_$plus(Definitions.scala:1256)
	at scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreMethods$lzycompute(Definitions.scala:1577)
	at scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreMethods(Definitions.scala:1559)
	at scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode$lzycompute(Definitions.scala:1590)
	at scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode(Definitions.scala:1590)
	at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1646)
	at scala.tools.nsc.Global$Run.<init>(Global.scala:1226)
	at org.fusesource.scalate.support.ScalaCompiler.compile(ScalaCompiler.scala:101)
	at org.fusesource.scalate.TemplateEngine.compileAndLoad(TemplateEngine.scala:787)
	... 33 more

I don't understand the error; it also happens when I try to compile the applet individually.
Do you have any recommendations?
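One thing worth ruling out first (an assumption, not a confirmed diagnosis): the failure happens inside Scalate's embedded Scala 2.13 compiler while it parses class files, a step that can be sensitive to the JVM launching the jar, so it is worth confirming which JDK actually runs dxCompiler:

java -version    # check the JDK that launches dxCompiler-2.11.4.jar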

sub-workflow defaults not overridden by provided value

Hello, I noticed in testing dxCompiler 2.4.10 that sub-workflow default input values are not overridden by a different provided value.

For example, with the following two WDL files:

workflow1.wdl

import "workflow2.wdl"

workflow workflow1 {
    input {
        ...
    }
    call workflow2 {
        input:
            my_input = ["hello"]
    }
    ...
}

workflow2.wdl

workflow workflow2 {
    input {
        Array[String] my_input = []
    }
...
}

When workflow2 is run, it uses [] rather than ["hello"].
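A self-contained sketch of the expectation (the output expression is illustrative; per the WDL spec, a value supplied at the call site should take precedence over the declared default):

version 1.0

# workflow2.wdl -- minimal reproduction sketch
workflow workflow2 {
    input {
        Array[String] my_input = []
    }
    output {
        # expected 1 when called with ["hello"]; the bug yields 0
        Int n = length(my_input)
    }
}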

Apollo - Spark and cohort files

Would it be possible to add Spark clusters to a WDL task?
This would enable extraction of information from cohort files, as well as from databases such as the UK Biobank.

Can the input parameters be passed by ID, to allow other dx objects to work?

difficulty setting a default docker image

Hi there,

I am having trouble getting a global docker image working via my extras.json:

{
    "defaultRuntimeAttributes": {
        "docker": "dx://ukbb_wgs_test1:/docker_images/wgs_phaser.tar.gz"
    }
}

I then compile with:

java -jar ~/software/dxCompiler-2.10.4.jar compile phase_wgs_noruntime.wdl  -project ukbb_wgs_test1 -folder /workflows/test/ -f -extras extras.json

but get various runtime errors since it is not using the docker image.

If I add:

    runtime {
        docker: "dx://ukbb_wgs_test1:/docker_images/wgs_phaser.tar.gz"
    }

to each task, then it works fine, so the docker path itself is correct.

InvalidInput: i/o value reads needs to be given using DNAnexus links

I have come across some unexpected behaviour when setting off a WDL workflow from within an app. This 'setoff' app runs the following command:

dx run project-G76q9bQ0PXfP7q972fVf2X19:/TSO500_workflow/TSO500_workflow -y \
    -f inputs.json \
    --dest=$project_name:/ \
    --detach \
    --brief \
    --auth-token $API_KEY

Where the inputs.json file created within the app contains, amongst other inputs, two fastq files:

"stage-common.fastq_file_1": {
    "$dnanexus_link": { "project": $fastq_file1_proj, "id": $fastq_file1_id }
},
"stage-common.fastq_file_2": {
    "$dnanexus_link": { "project": $fastq_file2_proj, "id": $fastq_file2_id }
},

This command successfully launches 'TSO500_workflow', and its WDL tasks are set off successfully, with fastq_file_1 and fastq_file_2 loaded onto the worker. Some of these tasks wrap DNAnexus applets; one of them ('fastqc_v1_4_0') has an array of files as one of its task inputs ('reads').

version 1.0

task fastqc_v1_4_0 {
    input {
        Array[File]+ reads
        File? contaminants_txt
        File? adapters_txt
        File? limits_txt
        String? format
        Int? kmer_size
        Boolean? nogroup
        String? extra_options
    }
    meta {
        title: "fastqc_v1_4_0"
        summary: "ADD HEADLINE SUMMARY HERE"
        description: "ADD LONGER DESCRIPTION HERE"
        tags: ["TSO500", "WDL"]
        properties: {
            runtime_applet: "applet-FBPFfkj0jy1Q114YGQ0yQX8Y",
            version: "v1.4.0",
            release_status: "released"
        }
    }
    command <<< >>>
    output {
        Array[File]+ report_html = ["placeholder.txt"]
        Array[File]+ stats_txt = ["placeholder.txt"]
    }
    runtime {
        dx_app: object {
            type: "applet",
            id: "applet-FBPFfkj0jy1Q114YGQ0yQX8Y",
            name: "fastqc_v1.4.0"
        }
    }
}

This array of files is input from the workflow as follows:

call fastqc.fastqc_v1_4_0 as fastqc_v1_4_0 { input: reads = [fastq_file_1, fastq_file_2] }

The following error occurs during 'fastqc_v1_4_0' execution, and I cannot work out why, as the files were supplied to the TSO500 workflow as DNAnexus links, as shown above:
InvalidInput: i/o value reads needs to be given using DNAnexus links

I have had a look through the documentation on your site and within this repository, including old issues, but can't find anything that might help with this. If someone is able to look into this and help me, I would be hugely grateful. Many thanks.
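For reference, an array-of-files input supplied directly in a dx run input JSON is normally written as a list of DNAnexus links; a sketch with placeholder project and file IDs (the field name follows the task above):

"stage-common.reads": [
    { "$dnanexus_link": { "project": "project-xxxx", "id": "file-xxxx" } },
    { "$dnanexus_link": { "project": "project-xxxx", "id": "file-yyyy" } }
]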

Compilation unrecognized exception with default String? = None

Hi DNAnexus, I receive a compilation error when using a workflow with an optional string default on dxCompiler 2.10.2:

workflow workflow2 {
  input {
    Int arg1 = 1
    String? caller_to_emulate = None
  }
}

and invoked without providing caller_to_emulate

workflow workflow1 {
  ...
  call workflow2 {
    input:
      arg1 = 1
  }
}

There are a bunch of lines like "[warning] unable to parse field caller_to_emulate default value null" in the verbose trace. The exact exception raised seems to be:

[error] java.lang.Exception: Unrecognized expression ValueNull(T_Any)
	at wdlTools.generators.code.WdlGenerator.wdlTools$generators$code$WdlGenerator$$buildExpression(WdlGenerator.scala:825)
	at wdlTools.generators.code.WdlGenerator$DeclarationStatement.$anonfun$rhs$1(WdlGenerator.scala:944)
	at scala.Option.map(Option.scala:242)
	at wdlTools.generators.code.WdlGenerator$DeclarationStatement.<init>(WdlGenerator.scala:942)
	at wdlTools.generators.code.WdlGenerator$InputsBlock.$anonfun$body$5(WdlGenerator.scala:992)
	at scala.collection.immutable.Vector1.map(Vector.scala:1886)
	at scala.collection.immutable.Vector1.map(Vector.scala:375)
	at wdlTools.generators.code.WdlGenerator$InputsBlock.body(WdlGenerator.scala:989)
	at wdlTools.generators.code.WdlGenerator$BlockStatement.<init>(WdlGenerator.scala:964)
	at wdlTools.generators.code.WdlGenerator$InputsBlock.<init>(WdlGenerator.scala:987)
	at wdlTools.generators.code.WdlGenerator$TaskBlock.body(WdlGenerator.scala:1336)
	at wdlTools.generators.code.WdlGenerator$BlockStatement.<init>(WdlGenerator.scala:964)
	at wdlTools.generators.code.WdlGenerator$TaskBlock.<init>(WdlGenerator.scala:1329)
	at wdlTools.generators.code.WdlGenerator$DocumentSections$$anonfun$format$4.applyOrElse(WdlGenerator.scala:1396)
	at wdlTools.generators.code.WdlGenerator$DocumentSections$$anonfun$format$4.applyOrElse(WdlGenerator.scala:1395)
	at scala.collection.StrictOptimizedIterableOps.collect(StrictOptimizedIterableOps.scala:151)
	at scala.collection.StrictOptimizedIterableOps.collect$(StrictOptimizedIterableOps.scala:136)
	at scala.collection.immutable.Vector.collect(Vector.scala:113)
	at wdlTools.generators.code.WdlGenerator$DocumentSections.format(WdlGenerator.scala:1395)
	at wdlTools.generators.code.WdlGenerator.generateElement(WdlGenerator.scala:1420)
	at wdlTools.generators.code.WdlGenerator.generateDocument(WdlGenerator.scala:1428)
	at dx.core.languages.wdl.VersionSupport.generateDocument(VersionSupport.scala:93)
	at dx.core.languages.wdl.WdlDocumentSource.toString(WdlCodeGenerator.scala:16)
	at dx.compiler.ApplicationCompiler.apply(ApplicationCompiler.scala:618)
	at dx.compiler.Compiler$BundleCompiler.maybeBuildApplet(Compiler.scala:358)
	at dx.compiler.Compiler$BundleCompiler.$anonfun$apply$1(Compiler.scala:472)
	at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646)
	at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642)
	at scala.collection.AbstractIterable.foldLeft(Iterable.scala:926)
	at dx.compiler.Compiler$BundleCompiler.apply(Compiler.scala:442)
	at dx.compiler.Compiler.apply(Compiler.scala:495)
	at dxCompiler.Main$.compile(Main.scala:541)
	at dxCompiler.Main$.dispatchCommand(Main.scala:794)
	at dxCompiler.Main$.main(Main.scala:924)
	at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:929)
	at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:928)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at dxCompiler.MainApp$.main(Main.scala:928)
	at dxCompiler.MainApp.main(Main.scala)

Changing the declaration from String? caller_to_emulate = None to just String? caller_to_emulate resolves the issue, but I'm not sure why that would be the case.
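A minimal sketch of the declaration that, per the observation above, compiles cleanly (in WDL an optional input without a default is implicitly None when the caller omits it, so the explicit = None is redundant):

workflow workflow2 {
  input {
    Int arg1 = 1
    String? caller_to_emulate    # optional; implicitly None when omitted
  }
}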

dxCompiler possibly incorrectly localizes input file expressions

I'm compiling the following WDL

task load_shared_covars {
  input {
    String script_dir
    File script = "~{script_dir}/traits/load_shared_covars.py"
    File python_array_utils = "~{script_dir}/traits/python_array_utils.py"
  }

  output {
    ...
  }

  command <<< 
    ~{script}
  >>> 

  runtime {
    ...
  }
}

Inside the load_shared_covars.py script I have the line import python_array_utils. However, this fails with the error ModuleNotFoundError: No module named 'python_array_utils'. I'm guessing this is because the python_array_utils input is being mislocalized.

WDL guarantees that files originating in the same input directory are localized into the same runtime directory, so I should be able to rely on importing a script that resides alongside another input. But I'm guessing that dxCompiler isn't respecting that for inputs assigned to expressions, since its documentation suggests it does not treat those as inputs.

Even if dxCompiler will not consider inputs with default expression values as inputs (which is fine for my use case), can dxCompiler still guarantee that these files get localized to the location that WDL requires?
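In the meantime, a possible workaround sketch (untested; it assumes that referencing both inputs in the command forces their localization, and that the script is run with python3): put the helper module's localized directory on PYTHONPATH explicitly.

command <<<
    # make the helper importable regardless of which directory
    # each input file was localized into
    export PYTHONPATH="$(dirname ~{python_array_utils}):$PYTHONPATH"
    python3 ~{script}
>>>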

Happy to provide more details or clarification (or an example run). Also happy to hear if I've misdiagnosed what's going on.

Thanks!

Failed to generate Array[File] with optional input

In the applet, I define an input of class array:file with optional set to true. When I use the dxni command to generate a WDL file to import into the workflow, it generates the array:file input with the "?" missing.
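A sketch of the situation (the inputSpec fields follow the DNAnexus applet metadata format; the expectation is my reading of the report): given an applet input like the one below, dxni would be expected to emit Array[File]? reads, but the generated WDL omits the "?".

"inputSpec": [
    {
        "name": "reads",
        "class": "array:file",
        "optional": true
    }
]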

dx_instance_type runtime keyword not recognized when compiling WDL v1.1 and the development version

Hello

I am attempting to compile the WDL workflow attached below (which is a modification of https://github.com/dnanexus/dxCompiler/blob/90d136ee8736dd5af5f20edd2d745896a0decb4e/test/cromwell/sub_workflow_interactions_scatter/sub_workflow_interactions_import.wdl)

When I declare the workflow to be WDL version 1.0 in the WDL file header, it compiles successfully.
However, when I attempt to compile it as WDL version 1.1 or the development version, compilation fails if I specify dx_instance_type in any of the task runtime sections.
Using version 1.1 I get the following error:

[error] Error translating sub_workflow_interactions_import.wdl to IR
java.lang.StackOverflowError
	at scala.runtime.Statics.anyHash(Statics.java:127)
	at scala.collection.immutable.HashMap.contains(HashMap.scala:124)
	at wdlTools.eval.Runtime.contains(Runtime.scala:32)
	at wdlTools.eval.Runtime.contains(Runtime.scala:35)

Using development I get:

[error] Error creating translator for sub_workflow_interactions_import.wdl
wdlTools.syntax.SyntaxException: error parsing document /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/sub_workflow_interactions_import.wdl
	at wdlTools.syntax.v2.ParseAll.parseDocument(ParseAll.scala:402)
	at dx.core.languages.wdl.WdlUtils$.parseSource(WdlUtils.scala:82)
	at dx.core.languages.wdl.WdlUtils$.parseAndCheckSource(WdlUtils.scala:97)
	at dx.core.languages.wdl.VersionSupport$.fromSource(VersionSupport.scala:128)
	at dx.core.languages.wdl.VersionSupport$.fromSourceFile(VersionSupport.scala:141)
	at dx.translator.wdl.WdlTranslatorFactory.liftedTree1$1(WdlTranslator.scala:223)
	at dx.translator.wdl.WdlTranslatorFactory.create(WdlTranslator.scala:222)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at dx.translator.TranslatorFactory$$anonfun$createTranslator$6.applyOrElse(Translator.scala:97)
	at scala.collection.IterableOnceOps.collectFirst(IterableOnce.scala:1086)
	at scala.collection.IterableOnceOps.collectFirst$(IterableOnce.scala:1078)
	at scala.collection.AbstractIterable.collectFirst(Iterable.scala:920)
	at dx.translator.TranslatorFactory$.createTranslator(Translator.scala:97)
	at dxCompiler.Main$.compile(Main.scala:402)
	at dxCompiler.Main$.dispatchCommand(Main.scala:778)
	at dxCompiler.Main$.main(Main.scala:884)
	at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:889)
	at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:888)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:73)
	at scala.App.$anonfun$main$1$adapted(App.scala:73)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
	at scala.App.main(App.scala:73)
	at scala.App.main$(App.scala:71)
	at dxCompiler.MainApp$.main(Main.scala:888)
	at dxCompiler.MainApp.main(Main.scala)
Caused by: wdlTools.syntax.SyntaxException: invalid runtime keyword dx_instance_type at 11:8-11:43 in /home/hadassah/Documents/Dolev/pipeline_automation/sandbox/sub_workflow_interactions_import.wdl
	at wdlTools.syntax.v2.ParseTop.visitTask_runtime_kv(ParseTop.scala:859)
	at wdlTools.syntax.v2.ParseTop.$anonfun$visitTask_runtime$1(ParseTop.scala:872)
	at scala.collection.Iterator$$anon$9.next(Iterator.scala:575)
	at scala.collection.mutable.Growable.addAll(Growable.scala:62)
	at scala.collection.mutable.Growable.addAll$(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:147)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:40)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:265)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:253)
	at scala.collection.IterableOps.map(Iterable.scala:671)
	at scala.collection.IterableOps.map$(Iterable.scala:671)
	at scala.collection.AbstractIterable.map(Iterable.scala:920)
	at wdlTools.syntax.v2.ParseTop.visitTask_runtime(ParseTop.scala:872)
	at wdlTools.syntax.v2.ParseTop.visitTask_runtime(ParseTop.scala:14)
	at org.openwdl.wdl.parser.v2.WdlV2Parser$Task_runtimeContext.accept(WdlV2Parser.java:4276)
	at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:46)
	at org.openwdl.wdl.parser.v2.WdlV2ParserBaseVisitor.visitTask_element(WdlV2ParserBaseVisitor.java:566)
	at wdlTools.syntax.v2.ParseTop.$anonfun$visitTask$1(ParseTop.scala:989)
	at scala.collection.Iterator$$anon$9.next(Iterator.scala:575)
	at scala.collection.mutable.Growable.addAll(Growable.scala:62)
	at scala.collection.mutable.Growable.addAll$(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:147)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:40)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:265)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:253)
	at scala.collection.IterableOps.map(Iterable.scala:671)
	at scala.collection.IterableOps.map$(Iterable.scala:671)
	at scala.collection.AbstractIterable.map(Iterable.scala:920)
	at wdlTools.syntax.v2.ParseTop.visitTask(ParseTop.scala:989)
	at wdlTools.syntax.v2.ParseTop.visitTask(ParseTop.scala:14)
	at org.openwdl.wdl.parser.v2.WdlV2Parser$TaskContext.accept(WdlV2Parser.java:4949)
	at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:46)
	at wdlTools.syntax.v2.ParseTop.visitDocument_element(ParseTop.scala:1272)
	at wdlTools.syntax.v2.ParseTop.$anonfun$visitDocument$1(ParseTop.scala:1298)
	at scala.collection.Iterator$$anon$9.next(Iterator.scala:575)
	at scala.collection.mutable.Growable.addAll(Growable.scala:62)
	at scala.collection.mutable.Growable.addAll$(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:147)
	at scala.collection.mutable.ArrayBuffer.addAll(ArrayBuffer.scala:40)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:265)
	at scala.collection.mutable.ArrayBuffer$.from(ArrayBuffer.scala:253)
	at scala.collection.IterableOps.map(Iterable.scala:671)
	at scala.collection.IterableOps.map$(Iterable.scala:671)
	at scala.collection.AbstractIterable.map(Iterable.scala:920)
	at wdlTools.syntax.v2.ParseTop.visitDocument(ParseTop.scala:1298)
	at wdlTools.syntax.v2.ParseTop.$anonfun$parseDocument$1(ParseTop.scala:1322)
	at wdlTools.syntax.Antlr4Util$Grammar.visitDocument(Antlr4Util.scala:255)
	at wdlTools.syntax.v2.ParseTop.parseDocument(ParseTop.scala:1322)
	at wdlTools.syntax.v2.ParseAll.parseDocument(ParseAll.scala:399)
	... 29 more

In both cases, compilation is successful if I omit the dx_instance_type setting.
I have attempted this both with dxCompiler version 2.5.0 and version 2.4.8, both leading to the same result (stack traces in error message above are from v2.5.0).

Any advice on this will be highly appreciated.

Workflow

version 1.1

task countTo {
    input {
        Int value
    }
    command {
        seq 0 1 ${value}
    }
    runtime {
        dx_instance_type: "mem1_ssd1_v2_x2"
    }
    output {
        File range = stdout()
    }
}

task filterEvens {
    input {
        File numbers
    }
    command {
        grep '[02468]$' ${numbers} > evens
        grep '[13579]$' ${numbers} > odds
    }
    runtime {
        dx_instance_type: "mem1_ssd1_v2_x2"
    }
    output {
        Map[String, File] counter_output = {"odds": "odds", "evens": "evens"}
    }
}

workflow countEvens {
    input {
        Int max
        String sample
    }
    call countTo { input: value = max }
    call filterEvens { input: numbers = countTo.range }
    output {
        Pair[String, Map[String, File]] counter_wf_output = (sample, filterEvens.counter_output)
    }
}

slow compilation "unable to parse field default value null"

Hi DXCompiler team

We are experiencing slow builds when using the projectWideReuse flag; there is one step where "unable to parse field" warnings come up a lot:

Querying for executables in project DxProject(project-G0vbVFj0046589VQPz0kpgFk)
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
[warning] unable to parse field caller_to_emulate default value null
Found 9996 executables in project-G0vbVFj0046589VQPz0kpgFk:None

This used to be a String? = None argument to a WDL task, but it has since been changed to a plain string default. However, all subsequent builds now compile slowly while searching this project. Is there anything we can do to avoid this? I'm happy to share build logs or WDL; @emiloslavsky asked me a while back to open an issue if we were still seeing this, which we are. I estimate it makes building small changes take ~10 minutes longer than they should.

issue with GATK wdls

Hi, I'm trying to compile the WDL from the GATK Mutect2 workflow:
https://dockstore.org/workflows/github.com/broadinstitute/gatk/mutect2:4.1.8.1?tab=info

However, running a few versions, including the latest, yields this error:

java -jar dxCompiler-2.4.7.jar compile /dnanexus/workflows/gatk/mutect2.wdl -project WES -folder /workflows/mutec2
[error] WDL code is syntactically valid BUT it fails type-checking -----
version 1.0

Is this something on my end, or is the WDL just not formatted correctly? Interestingly, the linked page has an 'import to DNAnexus' button.

A
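Worth noting: the WholeGenomeGermlineSingleSample report earlier in this document got past the same "syntactically valid BUT it fails type-checking" stage by first running the sources through wdlTools fix; a sketch of that step (jar name and path are illustrative):

java -jar wdlTools.jar fix mutect2.wdl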
