
Comments (4)

lczech commented on July 30, 2024

Hi @ospfsg,

can you please attach some of the log files of that step? You'll find them in logs/samtools/sort/. Thanks!

Does that also mean that #43 is solved? Shall we close that one then?

Cheers and so long
Lucas

from grenepipe.

ospfsg commented on July 30, 2024

Hi @lczech,

I sorted out most of the previous problems (#43 and #44), but I am now running into new ones:

This time the log file is empty and the error messages are odd:

/usr/bin/bash: line 1: AP028914: command not found !!!!

See below where the problem started. I attached the empty log file:

ENA|AP028914|AP028914.1.log

and the general log file:

2024-04-06T154706.002800.snakemake.log

Could you advise me on this?

 output: called/ENA|AP028914|AP028914.1.vcf (pipe), called/ENA|AP028914|AP028914.1.vcf.done
        log: logs/freebayes/ENA|AP028914|AP028914.1.log
        jobid: 271
        benchmark: benchmarks/freebayes/ENA|AP028914|AP028914.1.bench.log
        wildcards: contig=ENA|AP028914|AP028914.1
        threads: 118

    [Sat Apr  6 17:51:31 2024]
    rule compress_vcf:
        input: called/ENA|AP028914|AP028914.1.vcf
        output: called/ENA|AP028914|AP028914.1.vcf.gz, called/ENA|AP028914|AP028914.1.vcf.gz.done
        log: logs/compress_vcf/ENA|AP028914|AP028914.1.log
        jobid: 270
        wildcards: contig=ENA|AP028914|AP028914.1
        threads: 2

Activating conda environment: /home/dau1/software/conda-envs/274fa057cbe0d35f70f6e72a7bbf331c
Activating conda environment: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.vcf: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.log: command not found
Activating conda environment: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.log: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.vcf: command not found
Writing to /tmp/bcftools.9ofika
Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
  DOI https://doi.org/10.5281/zenodo.1146014

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

More about funding GNU Parallel and the citation notice:
https://www.gnu.org/software/parallel/parallel_design.html#Citation-notice

To silence this citation notice: run 'parallel --citation' once.

[Sat Apr  6 17:51:32 2024]
Error in group job b179584a-f6f9-4cdf-9f53-d7c4fabb4a39:
    [Sat Apr  6 17:51:32 2024]
    Error in rule compress_vcf:
        jobid: 270
        output: called/ENA|AP028914|AP028914.1.vcf.gz, called/ENA|AP028914|AP028914.1.vcf.gz.done
        log: logs/compress_vcf/ENA|AP028914|AP028914.1.log (check log file(s) for error message)
        conda-env: /home/dau1/software/conda-envs/274fa057cbe0d35f70f6e72a7bbf331c
        shell:
        bgzip --force --threads 2 called/ENA|AP028914|AP028914.1.vcf > called/ENA|AP028914|AP028914.1.vcf.gz 2> logs/compress_vcf/ENA|AP028914|AP028914.1.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

    [Sat Apr  6 17:51:32 2024]
    Error in rule call_variants:
        jobid: 271
        output: called/ENA|AP028914|AP028914.1.vcf (pipe), called/ENA|AP028914|AP028914.1.vcf.done
        log: logs/freebayes/ENA|AP028914|AP028914.1.log (check log file(s) for error message)
        conda-env: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c

[W::bcf_hrec_check] Invalid tag name: "technology.-"
[Sat Apr  6 17:51:51 2024]
Finished job 219.
241 of 288 steps (84%) done
Merging 1 temporary files
[W::bcf_hrec_check] Invalid tag name: "technology.-"
[W::bcf_hrec_check] Invalid tag name: "technology.-"
Cleaning
Done
Traceback (most recent call last):
  File "/mnt/data1/Project_Miridios/Operational/4_data_analysis/5_grenepipe/run2/.snakemake/scripts/tmp1m3j68uq.freebayes.py", line 162, in <module>
    "({freebayes} {extra_params} -f {snakemake.input.ref}"
  File "/home/dau1/miniconda3/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  (freebayes-parallel <(bedtools intersect -a <(sed 's/:\([0-9]*\)-\([0-9]*\)$/\t\1\t\2/' <(echo "ENA|AP028914|AP028914.1:0-16004536")) -b <(sed 's/:\([0-9]*\)-\([0-9]*\)$/\t\1\t\2/' <(fasta_generate_regions.py /mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta.fai 100000)) | sed 's/\t\([0-9]*\)\t\([0-9]*\)$/:\1-\2/') 118 --min-alternate-count 2 -f /mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta dedup/MIR_Dicy37_EKDN230030510-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy43_EKDN230030511-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy47_EKDN230030512-1A_HJK33DSX7_L2.bam dedup/MIR_Dicy78b_EKDN230030513-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy88_EKDN230030514-1A_HKM3VDSX7_L3.bam dedup/MIR_Macr17_EKDN230030515-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr18_EKDN230030516-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr19_EKDN230030517-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr20_EKDN230030518-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr25_EKDN230030519-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr30_EKDN230030520-1A_HKM5HDSX7_L4.bam dedup/MIR_Macr31.bam dedup/MIR_Macr32_EKDN230030522-1A_HJK33DSX7_L1.bam dedup/MIR_Macr33b_EKDN230030523-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr37_EKDN230030524-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr38_EKDN230030525-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr42_EKDN230030526-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr46b_EKDN230030527-1A_HKM3VDSX7_L3.bam dedup/MIR_Macr47_EKDN230030528-1A_HKM3VDSX7_L2.bam dedup/MIR_Macr54b_EKDN230030529-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr56_EKDN230030530-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr58_EKDN230030531-1A_HJGYCDSX7_L1.bam | bcftools sort -Ou - | bcftools view -Ov - > called/ENA|AP028914|AP028914.1.vcf)  > logs/freebayes/ENA|AP028914|AP028914.1.log 2>&1' returned non-zero exit status 127.
[Sat Apr  6 17:52:16 2024]
Finished job 232.
242 of 288 steps (84%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/data1/Project_Miridios/Operational/4_data_analysis/5_grenepipe/run2/.snakemake/log/2024-04-06T154706.002800.snakemake.log
(grenepipe) dau1@frey:~/software/grenepipe-0.12.2$ 


lczech commented on July 30, 2024

Oh haha, that is an interesting new error that I have not seen before. The issue is that your reference genome contains chromosomes or contigs with names such as ENA|AP028914|AP028914.1. That name contains pipe characters (|), which the shell treats as the pipeline operator that connects the output of one command to the input of the next. grenepipe runs the variant calling per chromosome/contig and uses these names for the resulting files, so the pipe characters end up in the file names, and thereby in the commands being run. The shell then splits each command at every |, and tries to run fragments such as AP028914 as commands of their own. That leads to the "command not found" errors.
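To illustrate the splitting, here is a small Python sketch that takes the compress_vcf command from the log above and cuts it at every pipe character. It is only a rough approximation of what the shell does (it ignores quoting and `||`), and the helper name `pipeline_commands` is made up for this example.

```python
import shlex

# The compress_vcf command as printed in the log above.
command = (
    "bgzip --force --threads 2 called/ENA|AP028914|AP028914.1.vcf "
    "> called/ENA|AP028914|AP028914.1.vcf.gz "
    "2> logs/compress_vcf/ENA|AP028914|AP028914.1.log"
)


def pipeline_commands(cmd):
    """Return the first word of every stage obtained by splitting on '|'.

    Rough approximation of shell parsing: quoting and '||' are ignored.
    """
    return [stage.split()[0] for stage in cmd.split("|") if stage.strip()]


# Each printed word is something the shell would try to execute.
for name in pipeline_commands(command):
    print(name)

# For comparison: a quoted file name is passed through as a single word.
print(shlex.quote("called/ENA|AP028914|AP028914.1.vcf"))
```

Apart from `bgzip`, every word printed by the loop is a fragment of the contig name, such as `AP028914` or `AP028914.1.log`, which is the kind of name reported as "command not found" in the log. The last line shows that quoting would protect the name, but renaming the sequences avoids the problem in every tool at once.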

Generally, I recommend using only safe characters in file names. See the last paragraph of the section here:

we recommend to ensure file names that only consist of alpha-numerical characters, dots, dashes, and underscores. Almost all other characters are special in some contexts, and might hence cause trouble when running the pipeline.

In your case, your samples are all named fine. The error instead came from the chromosome/contig names in the reference genome, which I had not thought to check before, and which are hence not checked prior to running the pipeline.
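A check of this kind could look roughly like the following Python sketch. It is not grenepipe's actual code; the function name `unsafe_sequence_names` and the example records are assumptions made for illustration. It treats the first word after `>` as the sequence name and reports every name that contains anything other than alphanumerical characters, dots, dashes, and underscores.

```python
import re

# Safe set from the recommendation quoted above:
# alphanumerical characters, dots, dashes, and underscores.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def unsafe_sequence_names(fasta_lines):
    """Return the sequence names that contain characters outside the safe set."""
    bad = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if not SAFE_NAME.fullmatch(name):
                bad.append(name)
    return bad


# Made-up records; only the first name is taken from this thread.
records = [
    ">ENA|AP028914|AP028914.1\n",
    "ACGT\n",
    ">chr1 example description\n",
    "ACGT\n",
]
print(unsafe_sequence_names(records))
```

To run it on a real reference, the lines can come from an open file handle instead of a list, for example `unsafe_sequence_names(open(path))` with `path` pointing to the FASTA file.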

I will add a check for this to the code, so that a nice error message is printed. However, I won't have time for that in the next couple of weeks. So, for now, the quick solution for you is to rename the sequences in the reference genome (/mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta) by removing any characters that are not alphanumerical, dashes, underscores, or dots.
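One possible way to do the renaming is sketched below in Python. The function name `sanitize_fasta_headers` and the example records are hypothetical. By default the sketch replaces each unsafe character with an underscore so that the parts of the name stay readable, which is a choice made here; passing `replacement=""` removes the characters instead, as suggested above. Only the sequence name (the first word after `>`) is changed, and the function stops with an error if two sequences would end up with the same name.

```python
import re

# Anything that is not alphanumerical, a dot, a dash, or an underscore.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_fasta_headers(lines, replacement="_"):
    """Yield FASTA lines with unsafe characters in sequence names replaced.

    Descriptions after the name and all sequence lines pass through unchanged.
    Raises ValueError if two sequences end up with the same cleaned name.
    """
    seen = set()
    for line in lines:
        if not line.startswith(">"):
            yield line
            continue
        header = line[1:].rstrip("\n")
        # Split into name, first whitespace run, and the remaining description.
        parts = re.split(r"(\s+)", header, maxsplit=1)
        clean = UNSAFE.sub(replacement, parts[0])
        if clean in seen:
            raise ValueError("duplicate sequence name after renaming: " + clean)
        seen.add(clean)
        yield ">" + clean + "".join(parts[1:]) + "\n"


# Made-up record using the contig name from this thread.
example = [">ENA|AP028914|AP028914.1 example description\n", "ACGT\n"]
print("".join(sanitize_fasta_headers(example)), end="")
```

For a real file, the generator can be fed from an open input file and its output written line by line to a new FASTA file. Any index of the old reference, and any files already mapped against the old names, still carry the old names and would likely need to be regenerated.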

Hope that helps, and let me know if this works :-)

Cheers and so long
Lucas


lczech commented on July 30, 2024

Hey @ospfsg,

Did this resolve the issue? According to #45, it seems to be solved. If not, feel free to re-open :-)

Cheers and so long
Lucas
