Comments (11)

lczech commented on July 30, 2024

Hey Yannick,

so, you installed the grenepipe env via micromamba, then activated it, and then ran snakemake from within that environment? What are the exact commands that you are running there?

Lucas

from grenepipe.

Leovar101 commented on July 30, 2024

Hei Lucas,

I have to admit to my shame that in this instance, I forgot to activate the grenepipe environment after starting up the new tmux session.
I'll see if I run into an issue this time, but I think it was just me being negligent.
Sorry for any wasted time!
Yannick

lczech commented on July 30, 2024

Haha no worries, it happens :-)

Leovar101 commented on July 30, 2024

So far so good, this time the analysis seems to be working fine!
However, it now crashed with another permission error, which I'm unsure how to interpret.

Here are the Snakemake log and the bwa_mem log, since an error seems to appear there as well (line 2227).
2024-07-17T151022.207727.snakemake.log
D-1.log
Up to here, everything seems to go well.

Cheerio
Yannick

lczech commented on July 30, 2024

Okay, interesting, a lot is going on here...

First, the error was in sample B-1, not D-1. Could you maybe also send me the file logs/mapping/bwa-mem/B-1.log?

Furthermore, the error message

PermissionError: [Errno 13] Permission denied: '/scratch/fr_yf1009_o05i14'

somehow indicates some issue with that directory. I assume that is some user scratch space or temporary directory? It could be that Snakemake automatically sets this, following some configuration of your cluster - not sure where that exact directory is coming from. But it could for instance be that there is no more space there, or some similar issue. Could you check that?
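
A quick way to check that directory is sketched below. This is only an illustration and not part of grenepipe or Snakemake: the function name `check_tmpdir` is made up here, and the check only makes sense when run in the same context as the failing job (same node, same `$TMPDIR`). Note that `shutil.disk_usage` reports the free space on the file system that holds the directory, which is not the same as the size of the directory's contents.

```python
import os
import shutil
import tempfile


def check_tmpdir(path):
    """Report whether `path` exists, is writable, and how much space is free."""
    report = {"exists": os.path.isdir(path), "writable": False, "free_bytes": None}
    if not report["exists"]:
        return report
    # Free space on the file system holding `path` (not the size of its contents).
    report["free_bytes"] = shutil.disk_usage(path).free
    try:
        # Actually creating a file is a more reliable test than os.access().
        with tempfile.NamedTemporaryFile(dir=path):
            report["writable"] = True
    except OSError:
        pass
    return report


if __name__ == "__main__":
    # Falls back to the system default temp dir if $TMPDIR is not set.
    print(check_tmpdir(os.environ.get("TMPDIR", tempfile.gettempdir())))
```

If `exists` or `writable` comes back as `False` inside a job, that would point to the cluster setup rather than to a lack of space.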

Furthermore, I am noticing the following warning of grenepipe in the beginning of your log file:

In the reference genome, there are chromosome/contig names that contain problematic characters. As we use these names to create file names, this can lead to crashes later in the pipeline. We generally advise to only use alpha-numeric characters, dots, dashes, and underscores for the reference sequence names for this reason.
Problematic reference genome names:
 - lcl|ptg000001l
 - lcl|ptg000002l
 - lcl|ptg000003l
 - ...

The issue here is that your reference genome contains pipe (or bar) characters |. This will very likely lead to issues down the line, as that is a special character in unix systems, and using it in file names can lead to all kinds of trouble. Depending on your settings, grenepipe will however create files for the contigs, named after the contigs, so that might fail.

Hence, I highly recommend renaming the reference genome sequences, for instance by replacing the | with a simple underscore _. I have thought about doing that automatically in grenepipe, but concluded that it would probably be more trouble for the user to find all their contigs renamed after the fact. But let me know what you think about that.
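
The renaming can be done with a few lines of Python. The following is a minimal sketch, not something grenepipe provides: the function names and the file names in the usage comment are hypothetical. It replaces every character that is not alpha-numeric, a dot, a dash, or an underscore with an underscore, and it only touches the sequence name, i.e. the part of the header line before the first space.

```python
import re

# Anything other than letters, digits, dots, dashes, and underscores is replaced.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_name(name):
    """Return `name` with all problematic characters replaced by underscores."""
    return _UNSAFE.sub("_", name)


def sanitize_fasta(lines):
    """Yield FASTA lines with the sequence name in each header made file-name safe."""
    for line in lines:
        if line.startswith(">"):
            # Only the name (text before the first space) is changed;
            # any description after it is kept as-is.
            name, sep, rest = line[1:].rstrip("\n").partition(" ")
            yield ">" + sanitize_name(name) + sep + rest + "\n"
        else:
            yield line


# Usage with hypothetical file names:
# with open("reference.fasta") as src, open("reference_clean.fasta", "w") as dst:
#     dst.writelines(sanitize_fasta(src))
```

Two different original names could end up identical after the replacement, so it is worth checking that the renamed headers are still unique before using the new file.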

Lastly, the log file you sent also made me realize that Snakemake changed its logging behaviour, so that the log file that you sent does not contain all information of grenepipe any more... I've told them about it. In the meantime, for future debugging, when you need to send more log files, could you maybe use the latest master branch of the grenepipe repository, e.g., via the green "Code" button here? That will produce an additional log file almost identical to the one you sent there, but with all the information. It will be located alongside the other logs of the pipeline in logs/snakemake.

Cheers and so long
Lucas

Leovar101 commented on July 30, 2024

Hei Lucas,

unfortunately no log seems to have been created for B-1, I added the only one in the bwa-mem log folder.
Also, I'm not sure how to check this scratch directory. As far as I understand it's stored on our cluster's temporary file directory $TMPDIR, which appears to be emptied as soon as the corresponding jobs are either done or terminated.
edit: I managed to access it, and du -sh for the directory returns a size of 8.0K
I've changed the contig names in the reference genome and replaced the grenepipe folder with your master version and am about to start a new run now.
I'll let you know what happens!

All the best
Yannick

lczech commented on July 30, 2024

Hej Yannick,

> unfortunately no log seems to have been created for B-1, I added the only one in the bwa-mem log folder.

Ah okay, that likely means the job failed for some snakemake reason, so that the program being run there did not even get to produce any output.

> Also, I'm not sure how to check this scratch directory. As far as I understand it's stored on our cluster's temporary file directory $TMPDIR, which appears to be emptied as soon as the corresponding jobs are either done or terminated.
> edit: I managed to access it, and du -sh for the directory returns a size of 8.0K

Haha yes, 8K is not nearly enough space... not sure why your cluster would give you such limited temp space... I'd get in touch with your admins to see what is happening there.

In the meantime, you can set the temp dir explicitly to something that has more space available, see here. This should work for example:

--default-resources tmpdir="/path/to/more/space"

> I've changed the contig names in the reference genome and replaced the grenepipe folder with your master version and am about to start a new run now.
> I'll let you know what happens!

Great!

So long
Lucas

Leovar101 commented on July 30, 2024

Hei Lucas,

> Haha yes, 8K is not nearly enough space... not sure why your cluster would give you such limited temp space... I'd get in touch with your admins to see what is happening there.

I double-checked and as far as I can tell, there should be no quota and therefore unlimited space in the temp directories, so it is a bit of a mystery. I'll try your approach if/when the current analysis fails.

Cheers
Yannick

lczech commented on July 30, 2024

UNLIMITED SPACE? You could get rich! :-D

Yes, see what happens. If things get deleted from that temp dir once the job finishes, it could be that it is too eagerly deleting files before Snakemake is done moving them to the actual results dir, or something like that.

Cheers
Lucas

Leovar101 commented on July 30, 2024

Hi Lucas,

the last attempt unfortunately also finished with the same "permission denied" notification in regards to that scratch directory.
Here is the Snakemake log, I hope this one is the one you need:
2024-07-18T111225.342551.snakemake.log

Interestingly, the analysis got much further than the last time.
Is there a way to have a new run start where this one left off?

Cheers
Yannick

lczech commented on July 30, 2024

Hi Yannick,

> the last attempt unfortunately also finished with the same "permission denied" notification in regards to that scratch directory.

That does not look like an issue with Snakemake or grenepipe, but rather with your cluster configuration, so there is not much I can do to fix that particular error. If you want to keep using that scratch space, I'd get in touch with the cluster admin, to see what is causing this.

In the meantime, you can instead instruct Snakemake to use a different directory for temp files, as explained above. Have you tried that?

> Interestingly, the analysis got much further than the last time.

Hm, I'm not sure I understand. Your previous run went from 15:10 to 18:07, and your new one from 11:12 to 17:16, which seems longer to me :-)

> Is there a way to have a new run start where this one left off?

Yes, Snakemake usually does not run things that are already there and up to date. However, if an input file changes, any output files that depend on it will be re-created. That is determined via time stamps, so if the input is newer than the output, the job creating the output is executed again.
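
The time stamp rule described above can be illustrated with a small sketch. This is a simplified model and not Snakemake's actual code (its real logic takes more into account); `needs_rerun` is a made-up helper name.

```python
import os


def needs_rerun(inputs, output):
    """Return True if `output` is missing or older than any of `inputs`."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    # A single input that is newer than the output is enough to trigger a rerun.
    return any(os.path.getmtime(path) > out_mtime for path in inputs)
```

Applied to a reference file and its `.fai` index, this returns `True` as soon as the reference is modified (or copied or moved in a way that updates its time stamp) after the index was written.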

In your case it seems that the reference file was updated:

[Thu Jul 18 11:12:28 2024]
localcheckpoint samtools_faidx:
    input: /gpfs/bwfor/home/fr/fr_fr/fr_yf1009/GenoDat/Mbel_Q1_clean.fasta
    output: /gpfs/bwfor/home/fr/fr_fr/fr_yf1009/GenoDat/Mbel_Q1_clean.fasta.fai
    log: /gpfs/bwfor/home/fr/fr_fr/fr_yf1009/GenoDat/logs/Mbel_Q1_clean.fasta.samtools_faidx.log
    jobid: 7
    reason: Updated input files: /gpfs/bwfor/home/fr/fr_fr/fr_yf1009/GenoDat/Mbel_Q1_clean.fasta
    resources: tmpdir=/scratch/fr_yf1009_o05i14
DAG of jobs will be updated after completion.

Did you somehow change or move that file in the meantime? If so, that would explain the re-running.

So long
Lucas
