genomematt / xenomapper2
A utility for splitting mixed origin NGS reads with secondary or alt mappings
License: Other
There are multiple issues with docopt and no clear way to resolve them.
Will revert to using argparse.
Master has been renamed main
self.bgzf_file = BgzfWriter(filename=Path(file),
File "/opt/python/3.6.7/lib/python3.6/pathlib.py", line 1001, in __new__
self = cls._from_parts(args, init=False)
File "/opt/python/3.6.7/lib/python3.6/pathlib.py", line 656, in _from_parts
drv, root, parts = self._parse_args(args)
File "/opt/python/3.6.7/lib/python3.6/pathlib.py", line 648, in _parse_args
% type(a))
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>
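The TypeError above is what `pathlib.Path` raises when it is handed a `bytes` filename. One possible fix, sketched here with a hypothetical helper name (`as_path` is not part of xenomapper2 or pylazybam), is to decode the name with `os.fsdecode` before building the path:

```python
import os
from pathlib import Path


def as_path(name):
    """Return a Path for a str, bytes, or os.PathLike filename.

    Hypothetical helper for illustration only. os.fsdecode turns bytes
    into str using the filesystem encoding and passes str through.
    """
    return Path(os.fsdecode(name))
```

With this, `as_path(b"out.bam")` and `as_path("out.bam")` both give `Path("out.bam")`, so the writer could accept either form.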
Do BAM reading and decompression in at least one thread per BAM. Consider threads for output files.
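A rough sketch of what one reader thread per input could look like, using only the standard library. It assumes the input is BGZF, which is a series of gzip members and so can be decompressed by the `gzip` module, and it relies on CPython's zlib releasing the GIL while inflating. The names `threaded_chunks` and `_reader` are made up for this example and are not xenomapper2 or pylazybam functions.

```python
import gzip
import queue
import threading

_DONE = object()  # sentinel marking the end of a stream


def _reader(path, out_q, chunk_size):
    """Decompress `path` in a worker thread and push raw chunks to a queue."""
    try:
        with gzip.open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                out_q.put(chunk)
        out_q.put(_DONE)
    except Exception as exc:  # hand the error to the consuming thread
        out_q.put(exc)


def threaded_chunks(path, chunk_size=1 << 16, max_chunks=8):
    """Yield decompressed chunks of `path`, decompressing in a background thread.

    The bounded queue keeps the reader at most `max_chunks` ahead of the
    consumer. The thread is a daemon so an abandoned generator cannot
    block interpreter exit.
    """
    out_q = queue.Queue(maxsize=max_chunks)
    worker = threading.Thread(
        target=_reader, args=(path, out_q, chunk_size), daemon=True
    )
    worker.start()
    while True:
        item = out_q.get()
        if item is _DONE:
            break
        if isinstance(item, Exception):
            raise item
        yield item
    worker.join()
```

Calling `threaded_chunks` once for the primary BAM and once for the secondary BAM gives each file its own decompression thread. Splitting the chunks into alignment records, and any threading of the output files, is left out of this sketch.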
When running xenomapper2 on BAMs mapped by bwa mem, I get the following errors from pylazybam:
xenomapper2 v2.0rc1 --primary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenomapper/1_mapping/LP0051_08_L001_primary.bam --secondary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenoma$
Traceback (most recent call last):
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/bin/xenomapper2", line 8, in <module>
sys.exit(main())
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/cli.py", line 193, in main
pair_counts, counts, writer = xenomap(primary_bam,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 884, in xenomap
forward_state, reverse_state = xenomap_states(primary_aligns,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 685, in xenomap_states
prim_f_AS, prim_f_XS = score_function(prim_f_aligns,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 314, in get_bamprimary_AS_XS
AS = AS_function(bamprimary[0])
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/pylazybam/tags.py", line 55, in get_AS
raise ValueError(
ValueError: More than one match to b'ASC.' was found in b'I\x01\x00\x00\x04\x00\x00\x00ASC\x01'<V\x17\x01\x00S\x00\x97\x00\x00\x00\x04\x00\x00\x00!SC\x01I\xff\xff\xffA01685:16:HGHWVDSX3:1:1258:25373:19100\x00p$
or
xenomapper2 v2.0rc1 --primary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenomapper/1_mapping/LP0051_07_L001_primary.bam --secondary=/scratch-shared/fmlab/nmoldovan/tmp/FM_seq_020/trimmed/7_xenoma$
Traceback (most recent call last):
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/bin/xenomapper2", line 8, in <module>
sys.exit(main())
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/cli.py", line 193, in main
pair_counts, counts, writer = xenomap(primary_bam,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 884, in xenomap
forward_state, reverse_state = xenomap_states(primary_aligns,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 685, in xenomap_states
prim_f_AS, prim_f_XS = score_function(prim_f_aligns,
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/xenomapper2/xenomapper2.py", line 315, in get_bamprimary_AS_XS
XS = XS_function(bamprimary[0])
File "/gpfs/home1/norbertm/projects/code/core_pipeline_310322/workflow/rules/7_xenomapper/.snakemake/conda/39b62994/lib/python3.10/site-packages/pylazybam/tags.py", line 108, in get_XS
raise ValueError(
ValueError: More than one match to b'XSC.' was found in b'N\x01\x00\x00\x07\x00\x00\x00XSC\x02'<V\x1b\x01\x00S\x00\x97\x00\x00\x00\x07\x00\x00\x00QSC\x02b\xff\xff\xffA01685:16:HGHWVDSX3:1:1161:21965:27946\x00p$
I am mapping and sorting reads in the previous step by:
bwa mem {params.ref} {input}
-M
-t {threads}
2> {log} |
samtools sort
-n
-@ {threads}
-o {output} 2>> {log}
Intriguingly, this pipeline has already worked with another batch of files, but now it throws the errors above.
Edit: I forgot to add that this is 150 bp paired-end (PE) sequencing of genomic data.
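Reading the bytes in the two error messages against the BAM record layout in the SAM specification, the extra match seems to sit in the fixed-width header rather than in the tags. After the 4-byte block size and the 4-byte refID comes the 4-byte little-endian position, and in the first record those bytes are b'ASC\x01' (position 21189441), in the second b'XSC\x02'. A byte-pattern search over the whole record would hit those by coincidence, which may also explain why another batch ran cleanly. The sketch below is an assumption about how this could be avoided, not pylazybam's actual code: it skips to the aux region and walks the tags by type. The names `aux_offset`, `iter_tags` and `get_tag` are hypothetical.

```python
import struct

# struct codes for the fixed-width aux value types defined in the SAM/BAM spec
_FMT = {b"c": "b", b"C": "B", b"s": "h", b"S": "H", b"i": "i", b"I": "I", b"f": "f"}


def aux_offset(record: bytes) -> int:
    """Offset of the first aux tag in a raw record that starts with block_size."""
    l_read_name = record[12]
    (n_cigar_op,) = struct.unpack_from("<H", record, 16)
    (l_seq,) = struct.unpack_from("<I", record, 20)
    # 36 = block_size (4 bytes) + 32 bytes of fixed-width fields
    return 36 + l_read_name + 4 * n_cigar_op + (l_seq + 1) // 2 + l_seq


def iter_tags(record: bytes):
    """Yield (tag, value) pairs by walking the aux region one field at a time."""
    i = aux_offset(record)
    while i < len(record):
        tag, typ = record[i:i + 2], record[i + 2:i + 3]
        i += 3
        if typ == b"A":  # single printable character
            value = record[i:i + 1].decode("ascii")
            i += 1
        elif typ in _FMT:  # fixed-width integer or float
            (value,) = struct.unpack_from("<" + _FMT[typ], record, i)
            i += struct.calcsize("<" + _FMT[typ])
        elif typ in (b"Z", b"H"):  # NUL-terminated string
            end = record.index(b"\x00", i)
            value = record[i:end].decode("ascii")
            i = end + 1
        elif typ == b"B":  # typed array: subtype, count, elements
            sub = _FMT[record[i:i + 1]]
            (count,) = struct.unpack_from("<I", record, i + 1)
            value = list(struct.unpack_from("<%d%s" % (count, sub), record, i + 5))
            i += 5 + count * struct.calcsize("<" + sub)
        else:
            raise ValueError("unknown aux type %r at offset %d" % (typ, i - 1))
        yield tag, value


def get_tag(record: bytes, wanted: bytes):
    """Return the value of the first tag named `wanted`, or None if absent."""
    for tag, value in iter_tags(record):
        if tag == wanted:
            return value
    return None
```

For a record held in `raw`, `get_tag(raw, b"AS")` would then only ever look at real tag fields, whatever bytes the coordinates happen to contain. This assumes `raw` includes the leading block size, as the bytes in the error messages appear to.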