Comments (4)
Hi @ospfsg,
can you please attach some of the log files of that step? You'll find them in logs/samtools/sort/. Thanks!
Does that also mean that #43 is solved? Shall we close that one then?
Cheers and so long
Lucas
from grenepipe.
Hi @lcech,
I sorted out most of the previous problems (#43 and #44), but I am now running into new ones.
This time the log file is empty and the error messages are odd:
/usr/bin/bash: line 1: AP028914: command not found
Below is the output from the point where the problem started. I attached the empty log file and the general log file:
2024-04-06T154706.002800.snakemake.log
Could you advise me on this?
output: called/ENA|AP028914|AP028914.1.vcf (pipe), called/ENA|AP028914|AP028914.1.vcf.done
log: logs/freebayes/ENA|AP028914|AP028914.1.log
jobid: 271
benchmark: benchmarks/freebayes/ENA|AP028914|AP028914.1.bench.log
wildcards: contig=ENA|AP028914|AP028914.1
threads: 118
[Sat Apr 6 17:51:31 2024]
rule compress_vcf:
input: called/ENA|AP028914|AP028914.1.vcf
output: called/ENA|AP028914|AP028914.1.vcf.gz, called/ENA|AP028914|AP028914.1.vcf.gz.done
log: logs/compress_vcf/ENA|AP028914|AP028914.1.log
jobid: 270
wildcards: contig=ENA|AP028914|AP028914.1
threads: 2
Activating conda environment: /home/dau1/software/conda-envs/274fa057cbe0d35f70f6e72a7bbf331c
Activating conda environment: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.vcf: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.log: command not found
Activating conda environment: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.log: command not found
/usr/bin/bash: line 1: AP028914: command not found
/usr/bin/bash: line 1: AP028914.1.vcf: command not found
Writing to /tmp/bcftools.9ofika
Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:
O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
DOI https://doi.org/10.5281/zenodo.1146014
This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
More about funding GNU Parallel and the citation notice:
https://www.gnu.org/software/parallel/parallel_design.html#Citation-notice
To silence this citation notice: run 'parallel --citation' once.
[Sat Apr 6 17:51:32 2024]
Error in group job b179584a-f6f9-4cdf-9f53-d7c4fabb4a39:
[Sat Apr 6 17:51:32 2024]
Error in rule compress_vcf:
jobid: 270
output: called/ENA|AP028914|AP028914.1.vcf.gz, called/ENA|AP028914|AP028914.1.vcf.gz.done
log: logs/compress_vcf/ENA|AP028914|AP028914.1.log (check log file(s) for error message)
conda-env: /home/dau1/software/conda-envs/274fa057cbe0d35f70f6e72a7bbf331c
shell:
bgzip --force --threads 2 called/ENA|AP028914|AP028914.1.vcf > called/ENA|AP028914|AP028914.1.vcf.gz 2> logs/compress_vcf/ENA|AP028914|AP028914.1.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Sat Apr 6 17:51:32 2024]
Error in rule call_variants:
jobid: 271
output: called/ENA|AP028914|AP028914.1.vcf (pipe), called/ENA|AP028914|AP028914.1.vcf.done
log: logs/freebayes/ENA|AP028914|AP028914.1.log (check log file(s) for error message)
conda-env: /home/dau1/software/conda-envs/c33c6fe4a4427c0a2e5bff68c2c7ae7c
[W::bcf_hrec_check] Invalid tag name: "technology.-"
[Sat Apr 6 17:51:51 2024]
Finished job 219.
241 of 288 steps (84%) done
Merging 1 temporary files
[W::bcf_hrec_check] Invalid tag name: "technology.-"
[W::bcf_hrec_check] Invalid tag name: "technology.-"
Cleaning
Done
Traceback (most recent call last):
File "/mnt/data1/Project_Miridios/Operational/4_data_analysis/5_grenepipe/run2/.snakemake/scripts/tmp1m3j68uq.freebayes.py", line 162, in <module>
"({freebayes} {extra_params} -f {snakemake.input.ref}"
File "/home/dau1/miniconda3/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; (freebayes-parallel <(bedtools intersect -a <(sed 's/:\([0-9]*\)-\([0-9]*\)$/\t\1\t\2/' <(echo "ENA|AP028914|AP028914.1:0-16004536")) -b <(sed 's/:\([0-9]*\)-\([0-9]*\)$/\t\1\t\2/' <(fasta_generate_regions.py /mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta.fai 100000)) | sed 's/\t\([0-9]*\)\t\([0-9]*\)$/:\1-\2/') 118 --min-alternate-count 2 -f /mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta dedup/MIR_Dicy37_EKDN230030510-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy43_EKDN230030511-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy47_EKDN230030512-1A_HJK33DSX7_L2.bam dedup/MIR_Dicy78b_EKDN230030513-1A_HJGYCDSX7_L1.bam dedup/MIR_Dicy88_EKDN230030514-1A_HKM3VDSX7_L3.bam dedup/MIR_Macr17_EKDN230030515-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr18_EKDN230030516-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr19_EKDN230030517-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr20_EKDN230030518-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr25_EKDN230030519-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr30_EKDN230030520-1A_HKM5HDSX7_L4.bam dedup/MIR_Macr31.bam dedup/MIR_Macr32_EKDN230030522-1A_HJK33DSX7_L1.bam dedup/MIR_Macr33b_EKDN230030523-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr37_EKDN230030524-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr38_EKDN230030525-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr42_EKDN230030526-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr46b_EKDN230030527-1A_HKM3VDSX7_L3.bam dedup/MIR_Macr47_EKDN230030528-1A_HKM3VDSX7_L2.bam dedup/MIR_Macr54b_EKDN230030529-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr56_EKDN230030530-1A_HJGYCDSX7_L1.bam dedup/MIR_Macr58_EKDN230030531-1A_HJGYCDSX7_L1.bam | bcftools sort -Ou - | bcftools view -Ov - > called/ENA|AP028914|AP028914.1.vcf) > logs/freebayes/ENA|AP028914|AP028914.1.log 2>&1' returned non-zero exit status 127.
[Sat Apr 6 17:52:16 2024]
Finished job 232.
242 of 288 steps (84%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/data1/Project_Miridios/Operational/4_data_analysis/5_grenepipe/run2/.snakemake/log/2024-04-06T154706.002800.snakemake.log
(grenepipe) dau1@frey:~/software/grenepipe-0.12.2$
Oh haha, that is an interesting new error that I have not seen before. The issue is that your reference genome contains chromosomes or contigs with names such as ENA|AP028914|AP028914.1. That name contains pipe characters (|), which have a special meaning in Unix/Linux shells. grenepipe runs the variant calling per chromosome/contig and uses these names for the resulting files, so the pipe characters end up in the file names and therefore in the commands being run, where the shell interprets them as pipes between separate commands. That leads to the error.
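To illustrate why the messages in the log say "AP028914: command not found", here is a minimal Python sketch. It is not grenepipe code: splitting on "|" is only a crude model of how bash breaks an unquoted command line into a pipeline, and the shortened bgzip command is taken from the log above for illustration. The second half shows that quoting the path (here with the standard library's shlex.quote) would keep it as a single word.

```python
import shlex

contig = "ENA|AP028914|AP028914.1"
vcf = f"called/{contig}.vcf"

# Crude model (ignores quoting rules): bash splits an unquoted command
# line at each "|" and runs every piece as its own command in a pipeline.
unquoted = f"bgzip --force {vcf}"
stages = [part.strip() for part in unquoted.split("|")]
first_words = [stage.split()[0] for stage in stages]
print(first_words)  # the words bash would try to run as commands

# shlex.quote wraps the path in single quotes, so the shell would treat
# the whole path, pipes included, as one argument.
quoted = f"bgzip --force {shlex.quote(vcf)}"
print(quoted)
```

The second and third entries of first_words are exactly the "commands" that bash complains about in the log.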
Generally, I recommend using only safe characters in file names. See the last paragraph of the section here:
we recommend to ensure file names that only consist of alpha-numerical characters, dots, dashes, and underscores. Almost all other characters are special in some contexts, and might hence cause trouble when running the pipeline.
In your case, your samples are all named fine. The error instead comes from the chromosome/contig names in the reference genome, which I had not thought to check before, and which are hence not checked prior to running the pipeline.
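Such a check can already be done by hand before starting a run. The following is a rough sketch, not the check that grenepipe implements: the function name unsafe_contig_names is hypothetical, and it assumes the contig names are read from the first tab-separated column of the reference's .fai index. The example lines are made up apart from the contig name and length that appear in the log above.

```python
import re

# Names made up only of alpha-numerical characters, dots, dashes, and underscores.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")

def unsafe_contig_names(fai_lines):
    """Return the contig names from .fai index lines that contain unsafe characters."""
    bad = []
    for line in fai_lines:
        if not line.strip():
            continue  # skip empty lines
        name = line.rstrip("\n").split("\t", 1)[0]
        if not SAFE_NAME.fullmatch(name):
            bad.append(name)
    return bad

# Illustrative index lines; offsets and line widths are invented.
example_fai = [
    "ENA|AP028914|AP028914.1\t16004536\t25\t60\t61\n",
    "chr_2.1\t1000\t16271400\t60\t61\n",
]
bad = unsafe_contig_names(example_fai)
print(bad)
```

For a real reference, the lines could be read with open() from the .fai file that sits next to the FASTA file.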
I will add a check for this to the code, so that a nice error message is printed. However, I won't have time for that in the next couple of weeks. So, for now, the quick solution for you is to rename the sequences in the reference genome (/mnt/data1/Project_Miridios/Operational/6_reference_genomes/Genome_nesidiocoris_tenuis/GCA_036186465.1.fasta) by removing any characters that are not alpha-numerical characters, dashes, underscores, or dots.
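One possible way to do the renaming is sketched below, under some assumptions: the function sanitize_fasta_lines is hypothetical, unsafe characters are replaced with an underscore rather than removed (pass replacement="" to remove them instead), and only the sequence name, i.e. the first whitespace-delimited word of each header line, is changed. The description text in the example header is invented.

```python
import re

# Anything that is not alpha-numerical, a dot, a dash, or an underscore.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")

def sanitize_fasta_lines(lines, replacement="_"):
    """Yield FASTA lines with unsafe characters in the sequence names replaced."""
    seen = set()
    for line in lines:
        if not line.startswith(">"):
            yield line  # sequence lines pass through unchanged
            continue
        parts = line[1:].rstrip("\n").split(None, 1)
        name = parts[0] if parts else ""
        clean = UNSAFE.sub(replacement, name)
        if clean in seen:
            # Two different names could collapse into the same cleaned name.
            raise ValueError(f"duplicate sequence name after cleaning: {clean}")
        seen.add(clean)
        description = f" {parts[1]}" if len(parts) > 1 else ""
        yield f">{clean}{description}\n"

example = [">ENA|AP028914|AP028914.1 Nesidiocoris tenuis DNA\n", "ACGT\n"]
cleaned = list(sanitize_fasta_lines(example))
print("".join(cleaned), end="")
```

The cleaned lines would be written to a new FASTA file, which is then set as the reference in the config. Any index files and any BAM files produced with the old names still carry the old contig names, so those steps most likely need to be re-run as well.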
Hope that helps, and let me know if this works :-)
Cheers and so long
Lucas
Hey @ospfsg,
did this resolve the issue? It seems that according to #45, this issue is solved? If not, feel free to re-open :-)
Cheers and so long
Lucas