
Comments (7)

kimin0402 commented on August 26, 2024

Hey @mbhall88, sorry for the late reply. Posts #7 and #13 are indeed good ideas; post #7 in particular helped me figure out how to assign different -q options to different rules. Thanks a lot.

By the way, I think I figured out this dependency issue, mainly by adapting scripts from the pbs-torque profile. This is what I changed:

1) Open ~/.config/snakemake/lsf/config.yaml.
1-1) Change the cluster specification from

cluster: "lsf-submit.py"

to

cluster: "lsf-submit.py --depend \"{dependencies}\""

(an example of what this expands to is shown after step 1-2)

1-2) Add:

immediate-submit: true
notemp: true
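
With immediate-submit turned on, Snakemake fills {dependencies} with the cluster job IDs that the submit script printed for the upstream jobs, so the command it actually runs ends up looking roughly like this (the job IDs and jobscript path are purely illustrative):

lsf-submit.py --depend "1234 5678" /path/to/.snakemake/tmp.abc123/snakejob.rule_B.1.sh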

2) Edit ~/.config/snakemake/lsf/lsf-submit.py.
2-1) Add the following lines:

import argparse  # if not already imported at the top of lsf-submit.py

# Parse the --depend option; the remaining positional argument is the job script.
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument("--depend", help="Space separated list of ids for jobs this job should depend on.")
parser.add_argument("positional", action="append", nargs="?")
args = parser.parse_args()

depend = ""

if args.depend:
    # Turn "1234 5678" into "1234 && 5678" for LSF's dependency expression.
    depend = args.depend.replace(" ", " && ")
if depend:
    depend = f" -w '{depend}' "

2-2) Change the submit_cmd variable so that it includes the depend string, e.g.:

submit_cmd = "bsub {resources} {job_info} {queue} {dep} {jobscript}".format(
    resources=resources_cmd,
    job_info=jobinfo_cmd,
    queue=queue_cmd,
    dep=depend,
    jobscript=jobscript,
)
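
With the same hypothetical dependency as above, the rendered command would then look roughly like this (the resource, job info, and queue parts come from your rule and profile settings):

bsub <resources> <job_info> <queue> -w '1234 && 5678' /path/to/jobscript.sh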

I removed the cluster_cmd variable because it kept picking up the arguments passed via --depend as if they were system arguments.

This seems to submit all job scripts at once with the dependencies specified. There may be problems I have not hit yet, but it has worked fine for me so far.
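
For completeness, running the workflow through the profile stays the same as before, something like this (the --jobs value is arbitrary):

snakemake --profile lsf --jobs 50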


mbhall88 commented on August 26, 2024

Hi @kimin0402,

If I understand your enquiry correctly, you are wondering whether we could support job dependencies i.e. only submit a job if a given dependency expression is true (described in the documentation for -w here)?
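
For context, a dependency expression on the bsub command line looks something like this (the job IDs are hypothetical):

bsub -w "done(12345) && done(67890)" ./my_job.sh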

Snakemake is effectively handling job dependencies already for you, isn't it? I am just wondering what an example use-case for this is?


kimin0402 commented on August 26, 2024

Hi @mbhall88,

Yes, I was wondering whether the LSF profile could support job dependencies by using LSF's -w option. Snakemake does handle the job dependencies itself, but only while the main process keeps running; handing them off to the scheduler requires the "--immediate-submit" option to be set to true in the config file (see this link for the --immediate-submit option: https://snakemake.readthedocs.io/en/stable/executing/cli.html).

With this option, when you execute a snakemake workflow, all of the bash scripts it generates are submitted to the cluster at once, and the ones that require dependencies are bsub'd with '-w {dependencies}' by the submit.py wrapper. Right now, when I execute a snakemake workflow with this LSF profile, the shell I launched it from keeps running snakemake, waiting until each dependent job has finished. If immediate-submit is set to true, all scripts are submitted to the cluster and I can get on with other work once snakemake has finished submitting them. The scripts that require dependencies are submitted with the -w argument and show up as "PEND" (or "H" on a PBS cluster) in the queue list.


mbhall88 commented on August 26, 2024

I see. And how would you describe the dependencies?

Regarding your use case where your shell is constantly running snakemake, I would strongly recommend submitting the "master" snakemake process as a job itself rather than letting it run on the login node. See an example script I use for exactly this with all of my snakemake pipelines.
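
A minimal sketch of that idea (not the actual script linked above; the queue name, memory limit, and log file names are placeholders) would be something like:

bsub -q normal -M 2000 -o snakemake_master.o -e snakemake_master.e "snakemake --profile lsf --jobs 100"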


kimin0402 commented on August 26, 2024

Thank you for your example script. In the case where immediate-submit is not activated, this seems to be the best way to run snakemake with dependencies. However, if you run this master script and your workflow has rules (A, B, C) whose dependencies make them execute A -> B -> C, my understanding is that snakemake does not submit the job script for B until the job script for A has finished.

In this case, if other people submitted job scripts while job A was running, there will be lots of scripts queueing in between job A and job B. What I want to do is submit job scripts A and B together, and specify the dependency for job B with the -w argument of LSF. So my jobs would appear like this:
RUN job.A.sh
PEND job.B.sh

and then
DONE job.A.sh
RUN job.B.sh
RUN other_script_1.sh
RUN other_script_2.sh
RUN other_script_3.sh

whereas in the case where the master script is submitted, the job queue list would first appear like this:
RUN master_script.sh
RUN job.A.sh

and then
RUN master_script.sh
DONE job.A.sh
RUN other_script_1.sh
RUN other_script_2.sh
RUN other_script_3.sh
RUN job.B.sh

I hope my explanation clarified the question a little bit. This kind of behaviour is possible with the pbs-torque profile, so I was wondering whether the lsf profile could do it too.


mbhall88 commented on August 26, 2024

Yes, I understand your use case. Would something like what I am proposing in #7 and #13 meet your needs? I think it should, but I am just not sure how the dependency specification would work in that case.


mbhall88 commented on August 26, 2024

@kimin0402 this is super cool!! Really neat solution.

Would you be interested in creating a PR on the development branch to add this support? If you start a PR, I can help tweak it a little so it works in combination with the lsf.yaml, and add some testing.

