
broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments.

Home Page: http://cromwell.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Scala 72.94% Java 15.71% Shell 1.53% HTML 0.10% WDL 8.73% Dockerfile 0.02% Python 0.96%
workflow-execution workflow cloud hpc bioinformatics executor scala docker ga4gh containers

cromwell's Introduction

Welcome to Cromwell

Cromwell is an open-source Workflow Management System for bioinformatics. Licensing is BSD 3-Clause.

The Cromwell documentation has a dedicated site.

First time to Cromwell? Get started with Tutorials.

Community

Thinking about contributing to Cromwell? Get started by reading our Contributor Guide.

Cromwell has a growing ecosystem of community-backed projects to make your experience even better! Check out our Ecosystem page to learn more.

Capabilities and roadmap

Many users today run their WDL workflows in Terra, a managed cloud bioinformatics platform with built-in WDL support provided by Cromwell. See here for a quick-start guide.

Users with specialized needs who wish to install and maintain their own Cromwell instances can download a JAR or Docker image. The development team accepts reproducible bug reports from self-managed instances, but cannot feasibly provide direct support.

Cromwell's backends receive development resources proportional to user demand. The team is actively developing for Google Cloud and Microsoft Azure (see Cromwell on Azure). Maintenance of other backends is primarily community-based.

Cromwell supports the WDL workflow language. Cromwell versions 80 and above no longer support CWL.

CWL will be re-introduced at a later date in the Terra platform, using a solution other than Cromwell. See the blog post "Terra’s roadmap to supporting more workflow languages" for details.

Security reports

If you believe you have found a security issue please contact [email protected].

Issue tracking

Need to file an issue? Head over to GitHub Issues.

If you previously filed an issue in JIRA, the link is here. New signups are no longer accepted.

cromwell's Issues

Replace backend-specific execution tables

The JES_JOB, LOCAL_JOB, and SGE_JOB tables are specific to the backends currently hardcoded into Cromwell and are not suitable for dynamically pluggable backends. The data currently captured by these tables should be replaced by a schema that is not backend specific, perhaps something like the following:

EXECUTION_INFO

EXECUTION_ID  INT           FK to EXECUTION.EXECUTION_ID, NOT NULL
INFO_KEY      VARCHAR(xyz)  NOT NULL
INFO_VALUE    VARCHAR(xyz)  

I'd also propose adding a nullable BACKEND_TYPE VARCHAR(xyz) field to EXECUTION to record what backend was actually used to run a job.
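
A minimal sketch of the proposed key/value layout, using Python's built-in sqlite3 module rather than Cromwell's actual database layer. The column sizes, the reduced EXECUTION table, and the example keys (JES_RUN_ID, JES_STATUS, PID) are assumptions made for illustration; the proposal above leaves them open.

```python
import sqlite3

# Hypothetical, simplified schema: the real EXECUTION table has more columns,
# and the VARCHAR sizes are left open ("xyz") in the proposal.
SCHEMA = """
CREATE TABLE EXECUTION (
    EXECUTION_ID INTEGER PRIMARY KEY,
    BACKEND_TYPE VARCHAR(255)
);
CREATE TABLE EXECUTION_INFO (
    EXECUTION_ID INTEGER NOT NULL REFERENCES EXECUTION(EXECUTION_ID),
    INFO_KEY     VARCHAR(255) NOT NULL,
    INFO_VALUE   VARCHAR(255)
);
"""


def record_execution(conn, execution_id, backend_type, info):
    """Store one execution plus its backend-specific key/value pairs."""
    conn.execute(
        "INSERT INTO EXECUTION (EXECUTION_ID, BACKEND_TYPE) VALUES (?, ?)",
        (execution_id, backend_type),
    )
    conn.executemany(
        "INSERT INTO EXECUTION_INFO (EXECUTION_ID, INFO_KEY, INFO_VALUE) "
        "VALUES (?, ?, ?)",
        [(execution_id, key, value) for key, value in info.items()],
    )


def load_execution_info(conn, execution_id):
    """Return the key/value pairs recorded for one execution as a dict."""
    rows = conn.execute(
        "SELECT INFO_KEY, INFO_VALUE FROM EXECUTION_INFO WHERE EXECUTION_ID = ?",
        (execution_id,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# The keys below are illustrative only; each backend would choose its own.
record_execution(conn, 1, "JES", {"JES_RUN_ID": "operations/abc", "JES_STATUS": "Running"})
record_execution(conn, 2, "LOCAL", {"PID": "4242"})
```

With this shape, a new backend only adds rows with its own keys; no table has to be created or altered when a backend is plugged in.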

LOG_LEVEL support in run command

Hi. Thanks for a great project. Attempting to bring up docker container and hit REST endpoints.
Looks like I needed to add -DLOG_LEVEL to run command

# pwd
/etc/service/cromwell
# cat run
#!/bin/bash
exec java $JAVA_OPTS -DLOG_LEVEL=$LOG_LEVEL -Djava.library.path=./native -jar /cromwell/cromwell-0.14.jar server

Restore GCS IO Interface

Hacked out of WorkflowDescriptor, but this might also be rethought to be more generic for other cloud providers

SingleWorkflowRunnerActor: Ask timed out

Hi,

I'm trying to run through the Hello World example. I'm running Cromwell on OS X with Java 8. Any ideas on how to troubleshoot the below error?

Thanks!

› cromwell run hello.wdl hello.json
[2015-12-18 08:43:13,222] [info] Slf4jLogger started
[2015-12-18 08:43:13,335] [info] Default backend: LOCAL
[2015-12-18 08:43:13,335] [info] RUN sub-command
[2015-12-18 08:43:13,336] [info]   WDL file: hello.wdl
[2015-12-18 08:43:13,439] [info]   Inputs: hello.json
[2015-12-18 08:43:13,560] [info] input: test.hello.name => "world"
[2015-12-18 08:43:13,776] [info] SingleWorkflowRunnerActor: launching workflow
[2015-12-18 08:43:15,936] [info] Running with database db.url = jdbc:hsqldb:mem:86473284-494c-43d2-94fd-d00107a2a787;shutdown=false;hsqldb.tx=mvcc
[2015-12-18 08:43:17,516] [info] WorkflowManagerActor submitWorkflow input id = None, effective id = e67af113-c3a7-41f4-9178-6640c1c652e9
[2015-12-18 08:43:17,592] [info] WorkflowManagerActor Found no workflows to restart.
[2015-12-18 08:43:18,816] [error] SingleWorkflowRunnerActor: Ask timed out on [Actor[akka://cromwell-system/user/WorkflowManagerActor#-1616857312]] after [5000 ms]
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://cromwell-system/user/WorkflowManagerActor#-1616857312]] after [5000 ms]
    at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
    at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
    at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:599)
    at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
    at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:597)
    at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)
    at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
    at java.lang.Thread.run(Thread.java:745)
[2015-12-18 08:43:19,174] [info] Message [cromwell.engine.workflow.WorkflowManagerActor$RestartWorkflows] from Actor[akka://cromwell-system/user/WorkflowManagerActor#-1616857312] to Actor[akka://cromwell-system/user/WorkflowManagerActor#-1616857312] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2015-12-18 08:43:19,180] [info] Message [akka.actor.Status$Failure] from Actor[akka://cromwell-system/user/WorkflowManagerActor#-1616857312] to Actor[akka://cromwell-system/deadLetters] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2015-12-18 08:43:19,182] [error] WorkflowManagerActor: Workflow failed submission: cannot create children while terminating or terminated
java.lang.IllegalStateException: cannot create children while terminating or terminated
    at akka.actor.dungeon.Children$class.makeChild(Children.scala:199)
    at akka.actor.dungeon.Children$class.actorOf(Children.scala:37)
    at akka.actor.ActorCell.actorOf(ActorCell.scala:369)
    at cromwell.engine.workflow.WorkflowManagerActor$$anonfun$11.apply(WorkflowManagerActor.scala:246)
    at cromwell.engine.workflow.WorkflowManagerActor$$anonfun$11.apply(WorkflowManagerActor.scala:245)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
    at scala.util.Try$.apply(Try.scala:192)
    at scala.util.Success.map(Try.scala:237)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
cromwell run hello.wdl hello.json  9.53s user 0.80s system 108% cpu 9.542 total

continueOnReturnCode doesn't appear to work?

https://github.com/broadinstitute/cromwell#continueonreturncode
doesn't seem to work as documented for me. Here's the test I ran, and the behavior was identical to the behavior without continueOnReturnCode as far as I can tell.

This isn't a high priority for me - I believe we can get the information we need without this option.

cat error_continue.wdl

task hello {
  String addressee
  command {
    echo "Hello ${addressee}!" && exit 1
  }
  output {
    String salutation = read_string(stdout())
  }
  runtime {
    docker: "ubuntu:latest"
    continueOnReturnCode: true
  }
}

workflow w {
  call hello
}

curl -v "localhost:8000/api/workflows/v1" -F wdlSource=@error_continue.wdl -F [email protected]

-> {
"id": "dbd26ad6-5a29-4c80-8f49-8b5f53830782",
"status": "Submitted"
}
curl -v "localhost:8000/api/workflows/v1/dbd26ad6-5a29-4c80-8f49-8b5f53830782/status"
-> {
"id": "dbd26ad6-5a29-4c80-8f49-8b5f53830782",
"status": "Failed"
}
curl -v "localhost:8000/api/workflows/v1/dbd26ad6-5a29-4c80-8f49-8b5f53830782/outputs"
{
"id": "dbd26ad6-5a29-4c80-8f49-8b5f53830782",
"outputs": {

}
}
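
For reference, the behavior this report expects can be written down as a small predicate. This is a sketch of the documented semantics as I understand them (a boolean, a single integer, or a list of integers), not Cromwell's implementation; the function name return_code_is_acceptable is hypothetical.

```python
def return_code_is_acceptable(return_code, continue_on_return_code=False):
    """Decide whether a task's return code should count as success.

    continue_on_return_code mirrors the runtime attribute:
      True        -> any return code is a success
      False       -> only 0 is a success (the default)
      int         -> only that return code is a success
      list of int -> only the listed return codes are successes
    """
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(continue_on_return_code, bool):
        return True if continue_on_return_code else return_code == 0
    if isinstance(continue_on_return_code, int):
        return return_code == continue_on_return_code
    return return_code in continue_on_return_code
```

Under these rules the hello task above, which exits with 1 and sets continueOnReturnCode: true, would be treated as successful, so a "Failed" status is what makes the report look like a bug.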

CallActor instantiate Backend

  1. CallActor will instantiate a Backend (currently this is in WorkflowManagerSystem)
      - Backend will have a TaskDescriptor
  2. Make CallExecutionActor work with a Backend object
      - in props(), instead of passing a BackendCall, we will pass a Backend
  3. The pluggable_backends_develop branch on Cromwell has example CallActor modifications

Highlight removes runtime section

The highlight subcommand of cromwell strips out the runtime, parameter_meta and meta sections.

test.wdl:

task runtime_meta {
  String memory_mb
  String sample_id
  String param
  String sample_id

  command {
    java -Xmx${memory_mb}M -jar task.jar -id ${sample_id} -param ${param} -out ${sample_id}.out
  }
  output {
    File results = "${sample_id}.out"
  }
  runtime {
    docker: "broadinstitute/baseimg"
  }
  parameter_meta {
    memory_mb: "Amount of memory to allocate to the JVM"
    param: "Some arbitrary parameter"
    sample_id: "The ID of the sample in format foo_bar_baz"
  }
  meta {
    author: "Joe Somebody"
    email: "[email protected]"
  }
}

workflow test {
  call runtime_meta
}

The command $ java -jar cromwell-0.15.jar highlight test.wdl console outputs:

task runtime_meta {
  String memory_mb
  String sample_id
  String param
  String sample_id

  command {
    java -Xmx${memory_mb}M -jar task.jar -id ${sample_id} -param ${param} -out ${sample_id}.out
  }
  output {
    File results = "${sample_id}.out"
  }

workflow test {
  call runtime_meta
}
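
One way to turn this report into a regression check is to compare which section keywords open a block in the input and in the highlighted output. The sketch below is a plain text comparison written for illustration; it does not call Cromwell or a WDL parser, and the helper name missing_sections is hypothetical.

```python
import re

# Sections the report says are dropped by the highlight subcommand.
SECTIONS = ("runtime", "parameter_meta", "meta")


def missing_sections(source, highlighted, sections=SECTIONS):
    """Return section keywords that open a block in `source` but not in `highlighted`."""

    def present(text):
        found = set()
        for name in sections:
            # Match the keyword at the start of a line, followed by "{".
            pattern = r"^\s*" + re.escape(name) + r"\s*\{"
            if re.search(pattern, text, re.MULTILINE):
                found.add(name)
        return found

    return sorted(present(source) - present(highlighted))
```

Anchoring the keyword at the start of a line keeps the task name runtime_meta and the parameter_meta block from being mistaken for runtime or meta.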

Add Cromwell-backend as a sub-project

Add Cromwell-backend as a sub-project of Cromwell.
Cromwell-backend should have minimal interfaces needed to add basic required functionality in the future.
Connect #495

Move useCachedCall

Move between CallActor and Backend, store call root as special symbol not unlike stdout/stderr

Support additional Docker configuration options.

Hi,

TL;DR

Cromwell should allow for the configuration of Docker resource / environment flags at run-time.


I have a use-case where I'd like to run Cromwell jobs in a cluster environment via Docker Swarm. Since Swarm doesn't require any additional configuration outside of standard docker run commands, it's trivial to distribute Cromwell jobs across Swarm nodes.

However, Swarm provides a series of filters and constraints that control how the scheduler distributes containers to nodes. For example, I might be interested in limiting the execution of a Cromwell job to a specific region / datacenter. This requires you to specify filters in the docker run command with the environment flag, -e. For example, to run a container on Swarm nodes that run in the us-east region:

› docker run -d --name my_image -e constraint:region==us-east* my_container

Obviously, this configuration should not be managed in the WDL document. Instead, it would be great for the Cromwell command-line tool and REST API to support additional runtime options for specifying Docker environment variables. For example:

› cromwell run --docker-env "constraint:region==us-east*" my_workflow.wdl -

Hint: Docker supports daemon labels. In the above case, the workflow would
execute on a Swarm node whose Docker daemon was started with:

    docker daemon --label region=us-east

As for the API, the POST action to /api/workflows/:version would allow for multiple Docker env strings.

The other feature I would like to request is translating memory and cpu configuration options (at the task level) to Docker via --memory and --cpuset-cpus docker run flags, respectively. These options are currently only used for the JES backend, but it seems as though they can also be used for the Local backend if Docker is specified.

So, to summarize:

  1. Allow Docker -e flags to be specified for all tasks in a given workflow.
  2. Allow task memory and cpu options in a WDL document to be translated to --memory and --cpuset-cpus in the docker run command.
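
The two requests could be sketched as a function that assembles the docker run argument list. Everything here is hypothetical: the function name, its parameters, and in particular the mapping from a CPU count to --cpuset-cpus, which takes CPU IDs rather than a count. Memory is assumed to arrive already in a form Docker accepts (for example "4g"); converting WDL memory strings is left out.

```python
def docker_run_args(image, command, env=(), memory=None, cpus=None):
    """Build a `docker run` argument list from hypothetical workflow options.

    env    -- iterable of strings passed through as repeated -e flags
    memory -- string in Docker's own format, e.g. "4g" (assumed pre-converted)
    cpus   -- integer CPU count, mapped to CPU IDs 0..cpus-1 (an assumption)
    """
    args = ["docker", "run"]
    for entry in env:
        args += ["-e", entry]
    if memory is not None:
        args += ["--memory", memory]
    if cpus is not None:
        # --cpuset-cpus expects CPU IDs, so a count of N becomes "0-(N-1)".
        cpu_ids = "0" if cpus == 1 else f"0-{cpus - 1}"
        args += ["--cpuset-cpus", cpu_ids]
    args.append(image)
    args += list(command)
    return args


# Illustrative call; the constraint string is only an example value.
example = docker_run_args(
    "ubuntu:latest",
    ["echo", "hello"],
    env=["constraint:storage==ssd"],
    memory="4g",
    cpus=2,
)
```

Returning a list instead of a shell string avoids quoting problems with values such as constraint expressions that contain "*" or "!".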

Please let me know if there's anything I can do to help this move forward.

Cheers! 🍻

WorkflowValidationActor - Generic Validation

To create a workflow, send a message to WorkflowValidationActor with WorkflowSourceFiles. The actor tries to build a namespace and coerce the inputs. If this passes, it iterates over the backends, sending validation messages to the BackendWorkflowValidationActors. The workflow is considered valid if at least one backend accepts it.
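
The flow described above could be sketched as follows, with plain functions standing in for actors and messages. The names ValidationResult, validate_workflow, build_namespace and backend_validators are hypothetical, and signalling a generic failure with ValueError is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationResult:
    valid: bool
    reason: Optional[str] = None


def validate_workflow(source_files, build_namespace, backend_validators):
    """Run generic validation first, then ask each backend in turn.

    build_namespace    -- callable that builds the namespace and coerces inputs,
                          raising ValueError on failure (an assumption)
    backend_validators -- callables returning True if that backend accepts
                          the workflow
    """
    # Step 1: generic validation, independent of any backend.
    try:
        namespace = build_namespace(source_files)
    except ValueError as error:
        return ValidationResult(False, f"generic validation failed: {error}")

    # Step 2: one accepting backend is enough for the workflow to be valid.
    for validate in backend_validators:
        if validate(namespace):
            return ValidationResult(True)
    return ValidationResult(False, "no backend accepted the workflow")
```

Stopping at the first accepting backend matches the "passes one of these" rule; collecting every backend's answer instead would be a reasonable variant if the caller needs to know which backends can run the workflow.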

Revisit the FinalCall flow in the pluggable backends

There is only one call actor in the pluggable backends branch, whereas it was previously divided into two, namely BackendCallActor and FinalCallActor (likewise for CallExecutionActor, except that it no longer exists). See how this can be reimplemented in the Pluggable Backends world.

Cromwell expects docker images to be run as root user

When running a docker image through Cromwell, it assumes that you are the root user for the docker container.

I was trying to run a Docker image that has to be run as a non-root user, which therefore has no access to the root user's home folder (/root). The problem is that Cromwell places any files you pass to the container in the /root directory, so you need to be the root user or you will get a permission error.

Move stdoutStderr

Look at code in existing but closed PRs in cromwell-backend/cromwell. This will need to be greatly adapted for develop, but should be doable.
