schneebergerlab / syri
Synteny and Rearrangement Identifier
Home Page: https://schneebergerlab.github.io/syri/
License: MIT License
Hi @mnshgl0110 ,
When I run the chroder module, I get this error:
Traceback (most recent call last):
File "/home1/lyj/software/syri/syri/bin/chroder", line 729, in <module>
scaf(args)
File "/home1/lyj/software/syri/syri/bin/chroder", line 337, in scaf
refdata = getdata(reflength, refid, refdir)
File "/home1/lyj/software/syri/syri/bin/chroder", line 104, in getdata
a2 = a2[a2[:, 0].argsort()]
IndexError: too many indices for array
How can I fix this problem? Thanks in advance.
Best Regards,
Yung-Chien
Hi,
I executed the following commands:
nucmer --maxmatch -c 100 -b 500 -l 500 -L 5000 refgenome qrygenome
delta-filter -m -i 90 -l 100 out.delta > out.filtered.delta
show-coords -THrd out.filtered.delta > out.filtered.coords
python3 ${PATH_TO_SYRI}syri --log DEBUG --no-chrmatch -c out.filtered.coords -d out.filtered.delta -r refgenome -q qrygenome
But after running for a while, it stops with this message:
"""
Traceback (most recent call last):
File "/cm/shared/apps/python/3.5.0/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/cm/shared/apps/python/3.5.0/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 655, in syri.pyxFiles.synsearchFunctions.syri (syri/pyxFiles/synsearchFunctions.cpp:22087)
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/lustre/scratch/WUR/PRI/bakke227/tools/syri/syri/bin/syri", line 216, in <module>
startSyri(args, coords[["aStart", "aEnd", "bStart", "bEnd", "aLen", "bLen", "iden", "aDir", "bDir", "aChr", "bChr"]])
File "syri/pyxFiles/synsearchFunctions.pyx", line 359, in syri.pyxFiles.synsearchFunctions.startSyri (syri/pyxFiles/synsearchFunctions.cpp:13110)
File "syri/pyxFiles/synsearchFunctions.pyx", line 360, in syri.pyxFiles.synsearchFunctions.startSyri (syri/pyxFiles/synsearchFunctions.cpp:13060)
File "/cm/shared/apps/python/3.5.0/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/cm/shared/apps/python/3.5.0/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
IndexError: list index out of range
What can I do to solve this?
With kind regards,
Linda
Hello,
Thank you for your good work on assembly-based SV detection.
I am using SyRI to identify SVs between a new assembly and a rice reference, but an error appears:
Reading BAM/SAM file - ERROR - Error in reading BAM/SAM file. reference_id -1 out of range 0<=tid<14
I checked the header information in the SAM file:
@SQ SN:Chr1 LN:43270923
@SQ SN:Chr2 LN:35937250
@SQ SN:Chr3 LN:36413819
@SQ SN:Chr4 LN:35502694
@SQ SN:Chr5 LN:29958434
@SQ SN:Chr6 LN:31248787
@SQ SN:Chr7 LN:29697621
@SQ SN:Chr8 LN:28443022
@SQ SN:Chr9 LN:23012720
@SQ SN:Chr10 LN:23207287
@SQ SN:Chr11 LN:29021106
@SQ SN:Chr12 LN:27531856
@SQ SN:ChrUn LN:633586
@PG ID:minimap2 PN:minimap2 VN:2.17-r974-dirty CL:minimap2 -t 1 -ax asm5 --eqx /ref/reference/all.fa /assembly/new.assembly.fasta
The first few records:
# samtools view QUAN/out.sam| awk '{print $3}' | head
Chr1
Chr1
Chr1
Chr1
Chr1
Chr1
Chr1
Chr1
Chr1
Chr1
Does SyRI have rules for reference IDs or anything similar? Could you please give me some suggestions?
Thanks!
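A possible explanation, not confirmed in this report: pysam reports a reference_id of -1 for records that have no reference sequence, which is what unmapped query contigs look like in minimap2 SAM output. The first ten records being on Chr1 does not rule out unmapped records further down the file. The sketch below uses plain Python with no dependencies, and the function name `drop_unmapped` is made up for illustration. It keeps header lines and drops records whose FLAG has the 0x4 bit set or whose RNAME is `*`.

```python
def drop_unmapped(sam_lines):
    """Yield SAM header lines and mapped alignment records; skip unmapped ones."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            raise ValueError("SAM record has fewer than 11 mandatory fields")
        flag, rname = int(fields[1]), fields[2]
        # 0x4 is the "segment unmapped" FLAG bit; RNAME "*" means no reference.
        if flag & 0x4 or rname == "*":
            continue
        yield line


# Tiny invented example: one header line, one mapped and one unmapped record.
example = [
    "@SQ\tSN:Chr1\tLN:43270923\n",
    "ctg1\t0\tChr1\t100\t60\t50=\t*\t0\t0\t*\t*\n",
    "ctg2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*\n",
]
kept = list(drop_unmapped(example))
print(len(kept), "lines kept")
```

On a real file the same filtering can be done with `samtools view -h -F 4 out.sam > out.mapped.sam`.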
out.filtered.coords.txt
Hello,
I have a small question about the coords file produced by MUMmer 3.23. I used the example data for testing, so I have attached the example coords. But I get an error:
File "../syri/bin/syri", line 153, in <module>
coords, chrlink = readCoords(args.infile.name, args.chrmatch, args.dir, args.prefix, args, args.cigar)
File "syri/pyxFiles/synsearchFunctions.pyx", line 31, in syri.pyxFiles.synsearchFunctions.readCoords
TypeError: readCoords() takes at most 5 positional arguments (6 given).
I do need your help, please.
Hope to receive your answers. Thank you very much.
Firstly, I presume this is the SyRI mentioned in this paper. Secondly, can you please tag a release? That would make it easier to package the software for projects like bioconda.
Thank you.
Hi,
I used minimap2 to align two large genomes (>3 Gbp) and produced a .bam file.
The command I used to start syri:
python3 syri -c input.bam -r refgenome -q qrygenome -F B --log DEBUG --no-chrmatch --dir syri
After a while, I get this message:
Traceback (most recent call last):
File "/...../syri", line 218, in <module>
startSyri(args, coords[["aStart", "aEnd", "bStart", "bEnd", "aLen", "bLen", "iden", "aDir", "bDir", "aChr", "bChr"]])
File "syri/pyxFiles/synsearchFunctions.pyx", line 370, in syri.pyxFiles.synsearchFunctions.startSyri (syri/pyxFiles/synsearchFunctions.cpp:13314)
File "syri/pyxFiles/synsearchFunctions.pyx", line 761, in syri.pyxFiles.synsearchFunctions.outSyn (syri/pyxFiles/synsearchFunctions.cpp:25625)
File "/cm/shared/apps/python/3.5.0/lib/python3.5/site-packages/pandas/core/generic.py", line 2682, in __setattr__
return object.__setattr__(self, name, value)
File "pandas/src/properties.pyx", line 65, in pandas.lib.AxisProperty.__set__ (pandas/lib.c:45018)
File "/cm/shared/apps/python/3.5.0/lib/python3.5/site-packages/pandas/core/generic.py", line 425, in _set_axis
self._data.set_axis(axis, labels)
File "/cm/shared/apps/python/3.5.0/lib/python3.5/site-packages/pandas/core/internals.py", line 2578, in set_axis
(old_len, new_len))
ValueError: Length mismatch: Expected axis has 0 elements, new values have 7 elements
Can someone help me?
Some additional information (I don't know if it helps):
I tried to use MUMmer4 (nucmer) first, but I kept running out of memory, so I switched to minimap2.
minimap2 produced a .sam file, but syri gave the truncated file issue (#34), so I converted it to a .bam file.
When starting syri, I also get a warning about the missing BAM index file. I don't know if that is somehow related to this issue.
Thank you!
Hi, I have run syri successfully on a few genomes. I was looking into the sizes of the syri events in the .out file and noticed that I have some NOTAL entries that are 0 length.
E.g.:
Chr03 21182259 21182259 - - - - - NOTAL425 - NOTAL -
thanks
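Whether such a row is really empty depends on the coordinate convention. Assuming 1-based inclusive coordinates (an assumption, the report does not state it), a row with start equal to end describes a single base, and only `end - start` is 0. A minimal sketch with a made-up helper name `span_lengths`:

```python
def span_lengths(row):
    """Return both length conventions for the reference span of a syri.out row."""
    fields = row.split()
    start, end = int(fields[1]), int(fields[2])
    return {
        "end_minus_start": end - start,        # what a plain subtraction gives
        "inclusive_length": end - start + 1,   # length if both ends are included
    }


row = "Chr03 21182259 21182259 - - - - - NOTAL425 - NOTAL -"
lengths = span_lengths(row)
print(lengths)
```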
Hello,
I am trying to run syri:
./syri/syri/bin/syri -c out_m_i90_l100.coords -r ref.fasta -q contigs_k61.fa -d out_m_i90_l100.delta
but I get:
Traceback (most recent call last):
File "./syri/syri/bin/syri", line 148, in <module>
from syri.pyxFiles.synsearchFunctions import readCoords
ImportError: No module named 'syri.pyxFiles'
thank you
So I can pip3 install syri?
This will make it easier to get it into bioconda and homebrew.
Hi,
I'm currently attempting to test out syri but I'm having issues using a SAM file as input.
I've used minimap2 to align my two fasta files:
minimap2 -ax asm5 MINF_9D.fasta MSB1_6J.fasta > MINF_9D_vs_MSB1_6J_minimap.sam
I then pass this sam file to syri:
./syri/syri/bin/syri -c MINF_9D_vs_MSB1_6J_minimap.sam -r MINF_9D.fasta -q MSB1_6J.fasta -F S
However I get this error:
syri - WARNING - starting
Reading Coords - ERROR - Error in reading the SAM file
I can't see anything in the log file that might help me work out what's going on - I set the log level to debug, and this is what's inside the log file:
2019-12-20 11:43:15,324 - syri - WARNING - <module>:115 - starting
2019-12-20 11:43:15,324 - syri - DEBUG - <module>:115 - memory usage: 0.07046127319335938
2019-12-20 11:43:15,325 - Reading Coords - DEBUG - <module>:115 - S
2019-12-20 11:43:15,325 - Reading Coords - INFO - <module>:115 - Reading input from .tsv file
2019-12-20 11:43:15,344 - Reading Coords - ERROR - <module>:115 - Error in reading the SAM file
I'm attaching my sam file here, in case there's something in it that's preventing syri from reading it?
MINF_9D_vs_MSB1_6J_minimap.sam.txt
Any help you could give would be greatly appreciated!
In general, I know that TLs are not annotated as deletions in the other genome (at the origin of the TL); however, there might be arguments for doing this. It would require classifying the space between syntenic blocks. This perhaps needs more thought.
I got the following error when running syri:
/path/syri/bin/syri -c out_m_i90_l100.coords -r ref.fa -q query.fa -k -d out_m_i90_l100.delta
Any idea? Thank you!
syri - WARNING - starting
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/anaconda3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 362, in syri.pyxFiles.synsearchFunctions.syri
File "syri/pyxFiles/synsearchFunctions.pyx", line 700, in syri.pyxFiles.synsearchFunctions.getSynPath
ValueError: max() arg is an empty sequence
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/syri/syri/bin/syri", line 115, in <module>
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 311, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 312, in syri.pyxFiles.synsearchFunctions.startSyri
File "/anaconda3/lib/python3.6/multiprocessing/pool.py", line 288, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/anaconda3/lib/python3.6/multiprocessing/pool.py", line 670, in get
raise self._value
ValueError: max() arg is an empty sequence
Here is the log file (it looks like the error happens when it is calling translocations and duplications in Chr9):
2020-03-10 13:47:56,650 - syri - WARNING - <module>:115 - starting
2020-03-10 13:47:56,653 - Reading Coords - INFO - <module>:115 - Reading input from .tsv file
2020-03-10 13:48:05,808 - syri - INFO - <module>:115 - Analysing chromosomes: ['Chr0', 'Chr1', 'Chr10', 'Chr2', 'Chr3', 'Chr4', 'Chr5', 'Chr6', 'Chr7', 'Chr8', 'Chr9']
2020-03-10 13:48:06,566 - syri.Chr0 - INFO - mapstar:44 - Chr0 (2065, 11)
2020-03-10 13:48:06,566 - syri.Chr0 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr0
2020-03-10 13:48:07,862 - syri.Chr0 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr0
2020-03-10 13:48:16,331 - syri.Chr0 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr0
2020-03-10 13:50:00,789 - syri.Chr1 - INFO - mapstar:44 - Chr1 (0, 11)
2020-03-10 13:50:00,789 - syri.Chr1 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr1
2020-03-10 13:50:01,213 - syri.Chr2 - INFO - mapstar:44 - Chr2 (9604, 11)
2020-03-10 13:50:01,214 - syri.Chr2 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr2
2020-03-10 13:50:19,335 - syri.Chr2 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr2
2020-03-10 15:27:50,313 - syri.Chr2 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr2
2020-03-10 15:36:56,381 - syri.Chr3 - INFO - mapstar:44 - Chr3 (9780, 11)
2020-03-10 15:36:56,383 - syri.Chr3 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr3
2020-03-10 15:37:15,246 - syri.Chr3 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr3
2020-03-10 15:42:05,687 - syri.Chr3 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr3
2020-03-10 15:49:06,173 - syri.Chr4 - INFO - mapstar:44 - Chr4 (9463, 11)
2020-03-10 15:49:06,177 - syri.Chr4 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr4
2020-03-10 15:49:21,402 - syri.Chr4 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr4
2020-03-10 15:52:07,506 - syri.Chr4 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr4
2020-03-10 15:58:20,288 - syri.Chr5 - INFO - mapstar:44 - Chr5 (8619, 11)
2020-03-10 15:58:20,288 - syri.Chr5 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr5
2020-03-10 15:58:33,046 - syri.Chr5 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr5
2020-03-10 16:00:35,611 - syri.Chr5 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr5
2020-03-10 16:06:40,948 - syri.Chr6 - INFO - mapstar:44 - Chr6 (7187, 11)
2020-03-10 16:06:40,949 - syri.Chr6 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr6
2020-03-10 16:06:49,995 - syri.Chr6 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr6
2020-03-10 16:07:57,365 - syri.Chr6 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr6
2020-03-10 16:15:09,059 - syri.Chr7 - INFO - mapstar:44 - Chr7 (7592, 11)
2020-03-10 16:15:09,059 - syri.Chr7 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr7
2020-03-10 16:15:18,045 - syri.Chr7 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr7
2020-03-10 16:16:42,809 - syri.Chr7 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr7
2020-03-10 16:22:03,063 - syri.Chr8 - INFO - mapstar:44 - Chr8 (6974, 11)
2020-03-10 16:22:03,063 - syri.Chr8 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr8
2020-03-10 16:22:10,781 - syri.Chr8 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr8
2020-03-10 16:23:28,347 - syri.Chr8 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr8
2020-03-10 16:30:54,816 - syri.Chr9 - INFO - mapstar:44 - Chr9 (6378, 11)
2020-03-10 16:30:54,816 - syri.Chr9 - INFO - mapstar:44 - Identifying Synteny for chromosome Chr9
2020-03-10 16:31:01,710 - syri.Chr9 - INFO - mapstar:44 - Identifying Inversions for chromosome Chr9
2020-03-10 16:31:57,941 - syri.Chr9 - INFO - mapstar:44 - Identifying translocation and duplication for chromosome Chr9
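One detail in this log that may matter: Chr1 is reported with shape (0, 11), i.e. no alignments, and it never reaches the inversion step. Because the workers run in a pool, a failure on Chr1 could surface only after the last chromosome finishes, so Chr9 is not necessarily the culprit. This is a guess, not a confirmed diagnosis. The sketch below counts same-chromosome alignments per chromosome in a coords file, assuming the 11-column layout whose last two columns are aChr and bChr (as named in the startSyri tracebacks on this page). The function names are hypothetical.

```python
def same_chr_alignment_counts(coords_lines, chromosomes):
    """Count alignments whose reference and query chromosome IDs are equal."""
    counts = {chrom: 0 for chrom in chromosomes}
    for line in coords_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete 11-column coords row
        a_chr, b_chr = fields[9], fields[10]
        if a_chr == b_chr and a_chr in counts:
            counts[a_chr] += 1
    return counts


def chromosomes_without_alignments(coords_lines, chromosomes):
    """List chromosomes that have no same-chromosome alignment at all."""
    counts = same_chr_alignment_counts(coords_lines, chromosomes)
    return [chrom for chrom in chromosomes if counts[chrom] == 0]


# Invented rows: Chr0 aligns to Chr0, while Chr1 only aligns to Chr10.
rows = [
    "1\t1000\t1\t1000\t1000\t1000\t99.5\t1\t1\tChr0\tChr0\n",
    "1\t800\t5\t804\t800\t800\t98.0\t1\t1\tChr1\tChr10\n",
]
empty = chromosomes_without_alignments(rows, ["Chr0", "Chr1"])
print(empty)
```

If a chromosome shows up in that list, checking the chromosome IDs in both FASTA files (for example Chr1 versus Chr10 naming) would be a reasonable next step.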
The mummer run for the pseudo-chromosome generation already includes all hits that will be found in the second mummer run for syri. Rewriting the mummer files with adjusted positions and chromosome IDs after pseudo-chromosome generation would make running mummer again obsolete.
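The coordinate adjustment described in this note could look roughly like the sketch below. The layout table, the function name `lift_to_pseudochr`, and the 1-based inclusive convention are all assumptions for illustration; this is not chroder's code.

```python
def lift_to_pseudochr(contig, pos, layout):
    """Map a 1-based contig position to its pseudo-chromosome position.

    layout maps contig name -> (chrom, offset, length, strand), where offset
    is the number of pseudo-chromosome bases that precede the contig.
    """
    chrom, offset, length, strand = layout[contig]
    if not 1 <= pos <= length:
        raise ValueError("position %d is outside contig %s" % (pos, contig))
    if strand == "+":
        return chrom, offset + pos
    # Reverse-complemented contig: the last base of the contig comes first.
    return chrom, offset + (length - pos + 1)


# Hypothetical layout: ctg1 forward at the start of Chr1, then a 500 bp gap,
# then ctg2 placed in reverse orientation.
layout = {
    "ctg1": ("Chr1", 0, 1000, "+"),
    "ctg2": ("Chr1", 1500, 400, "-"),
}
print(lift_to_pseudochr("ctg1", 10, layout))
print(lift_to_pseudochr("ctg2", 1, layout))
```

For contigs placed in reverse orientation, the start and end of each alignment swap after lifting and the alignment direction flips, so a real rewrite of the coords or delta files would have to handle that as well.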
Hello,
Is there any way to make this work with inter-chromosomal rearrangements?
For assemblies with an equal number of chromosomes, I get this response because of a reciprocal translocation between two chromosomes:
Reading Coords - ERROR - chrVII_chrIV in genome B is best match for two chromosomes in genome A. Cannot assign chromosomes automatically.
Thanks!
Chromosomes should be matched automatically, so that the chromosome IDs do not need to be identical between the genomes.
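To illustrate how the "best match for two chromosomes" error reported on this page can arise, here is a minimal sketch of best-match assignment by total aligned length. It is not SyRI's implementation; the function names and the example numbers are invented, and `chrIV_chrVII` is a hypothetical counterpart to the `chrVII_chrIV` ID from the report.

```python
from collections import defaultdict


def best_matches(rows):
    """rows: iterable of (ref_chr, qry_chr, aligned_length) tuples."""
    totals = defaultdict(lambda: defaultdict(int))
    for ref_chr, qry_chr, length in rows:
        totals[ref_chr][qry_chr] += length
    # For every reference chromosome pick the query with most aligned bases.
    return {ref: max(per_qry, key=per_qry.get) for ref, per_qry in totals.items()}


def conflicts(matches):
    """Return query chromosomes that are the best match for several references."""
    by_query = defaultdict(list)
    for ref_chr, qry_chr in matches.items():
        by_query[qry_chr].append(ref_chr)
    return {q: sorted(refs) for q, refs in by_query.items() if len(refs) > 1}


# A reciprocal translocation: both reference chromosomes align best to the
# same translocated query chromosome.
rows = [
    ("chrIV", "chrVII_chrIV", 900),
    ("chrIV", "chrIV_chrVII", 300),
    ("chrVII", "chrVII_chrIV", 800),
    ("chrVII", "chrIV_chrVII", 700),
]
matches = best_matches(rows)
print(conflicts(matches))
```

When a conflict like this appears, a purely greedy best-match rule cannot produce a one-to-one mapping, which is consistent with the error message asking for manual assignment.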
The final VCF that syri produces contains both the mummer alignments and the larger regions that syri puts together. As I understand it, a single region can be made of multiple alignments. In the VCF, each alignment has a parent ID (e.g. Parent=SYN3080), but the parent regions aren't labelled. To find out what region SYN3080 corresponds to, I have to take the minimum start and maximum stop of all the alignments with that parent. Could you include an ID field in the region lines?
Also, could the VCF file be sorted numerically by position? Right now it is sorted alphabetically, so the ordering is a bit odd.
Thanks for making this program!
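Until something like this is available in the output itself, both requests can be approximated in post-processing. The sketch below is a workaround under stated assumptions: only the `Parent=` key comes from the report above, while the `END=` key, the record IDs, and the ALT values in the example are invented for illustration. Function names are hypothetical.

```python
import re


def natural_key(name):
    """Split 'Chr10' into ['Chr', 10, ''] so that Chr2 sorts before Chr10."""
    return [int(t) if t.isdigit() else t for t in re.split(r"(\d+)", name)]


def sort_vcf_records(lines):
    """Keep header lines in place and sort records by chromosome and position."""
    header = [l for l in lines if l.startswith("#")]
    records = [l for l in lines if not l.startswith("#")]

    def key(line):
        chrom, pos = line.split("\t", 2)[:2]
        return natural_key(chrom), int(pos)

    return header + sorted(records, key=key)


def parent_spans(lines):
    """Return {parent_id: (chrom, min_start, max_end)} over child records."""
    spans = {}
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        tags = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        parent = tags.get("Parent")
        if parent is None:
            continue
        end = int(tags.get("END", pos))  # assumed key; falls back to POS
        if parent in spans:
            _, start0, end0 = spans[parent]
            spans[parent] = (chrom, min(start0, pos), max(end0, end))
        else:
            spans[parent] = (chrom, pos, end)
    return spans


vcf = [
    "##fileformat=VCFv4.3\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "Chr10\t500\tSYNAL2\tN\t<SYNAL>\t.\tPASS\tEND=900;Parent=SYN2\n",
    "Chr2\t2000\tSYNAL1b\tN\t<SYNAL>\t.\tPASS\tEND=2600;Parent=SYN1\n",
    "Chr2\t100\tSYNAL1a\tN\t<SYNAL>\t.\tPASS\tEND=1500;Parent=SYN1\n",
]
sorted_vcf = sort_vcf_records(vcf)
spans = parent_spans(vcf)
print(spans)
```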
Hi,
It's mentioned that the runtime and memory use of syri increase with the number of duplications and translocations. My issue is that there are major discrepancies between two of my runs. Both runs comprise only one chromosome, each of which is around 30 Mb with a similar number of inversions. However, one run (run A) took approximately 3 hours to complete, whereas the other (run B) has taken 2 days and is still running. Is there something that I might be missing?
Based on mummer plots, the run A comparison has a 10 Mb inversion in the middle of the chromosome, while run B has a 20 Mb inversion in the middle of the chromosome.
Thanks in advance!
syri - WARNING - starting
Reading Coords - ERROR - Error in reading the SAM file
When I tried the following command:
syri --cigar -c HG002PM_GRCh38.sam -r grch38.fa -q ragoo_output/ragoo.fasta -F S
Hello,
I still cannot understand how to make a figure based on the results. What is the format of the results?
Could you help me figure it out, or will you update this function?
Thanks,
Fuyou
"snps.txt" was produced, but the log has no report of "Combining outputs". Everything else looks fine. I am using the conda installation on CentOS Linux 7. I would appreciate your guidance in solving the problem. Thanks.
Below is the error:
/.conda/envs/syri/lib/python3.5/site-packages/pandas/core/ops.py:1167: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
result = method(y)
Traceback (most recent call last):
File "/software/syri/syri/bin/syri", line 255, in <module>
getshv(args, coords, chrlink)
File "syri/pyxFiles/findshv.pyx", line 196, in syri.findshv.getshv
File "/.conda/envs/syri/lib/python3.5/site-packages/pandas/core/ops.py", line 1283, in wrapper
res = na_op(values, other)
File "/.conda/envs/syri/lib/python3.5/site-packages/pandas/core/ops.py", line 1169, in na_op
raise TypeError("invalid type comparison")
TypeError: invalid type comparison
Hi,
I was running SyRI to look for variations and rearrangements, but SyRI didn't recognize my BAM file. Following the error in #15, I deleted all small contigs and kept only the chromosome-length ones in both the reference and query sequences.
Here is my command and the error:
syri -c test.bam -r ../../../2.Alignments/1.ref/hs.chr.fa -q ../../mr-2k.big.fasta -k -F B
syri - WARNING - starting
Reading Coords - ERROR - Error in reading the BAM file
Could you please help me figure out the issue? I can provide more details if you need them.
Thank you so much for your time!
Regards
Hello syri developer,
I had an error with syri that seems to indicate there is an unequal number of chromosomes.
The following is the error message I received:
syri - WARNING - starting
Reading Coords - WARNING - Chromosomes IDs do not match.
Reading Coords - ERROR - Unequal number of chromosomes in the genomes. Exiting
I used a high-quality reference genome that has been scaffolded into pseudomolecules and aligned my de novo assembled genome to it. Note that my assembled genome has not been scaffolded, so it is a draft genome.
I then used nucmer to do the alignment, following the steps outlined on GitHub to generate the coords and delta files.
However, I get the error message when I start syri with this command:
python3 syri -c test.coords -ref.fa -q query.fa -d test.delta
Any suggestions?
Thank you.
Jae
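The error message says the two genomes have different numbers of sequences, which is what one would expect when a scaffolded reference is compared with an unscaffolded draft assembly. A quick way to confirm this before running syri is to compare the sequence IDs in the two FASTA files. The sketch below is only a check, with hypothetical function names and invented example sequences; it does not fix the mismatch, for which other reports on this page mention scaffolding with chroder.

```python
import io


def fasta_ids(handle):
    """Return the sequence IDs (first word after '>') in a FASTA file handle."""
    ids = []
    for line in handle:
        if line.startswith(">"):
            words = line[1:].split()
            ids.append(words[0] if words else "")
    return ids


def compare_genomes(ref_handle, qry_handle):
    """Summarise how many sequences each genome has and which IDs differ."""
    ref_ids, qry_ids = fasta_ids(ref_handle), fasta_ids(qry_handle)
    return {
        "n_ref": len(ref_ids),
        "n_qry": len(qry_ids),
        "only_in_ref": sorted(set(ref_ids) - set(qry_ids)),
        "only_in_qry": sorted(set(qry_ids) - set(ref_ids)),
    }


# In-memory stand-ins for ref.fa and query.fa.
ref = io.StringIO(">Chr1 pseudomolecule\nACGT\n>Chr2\nACGT\n")
qry = io.StringIO(">contig_1\nACGT\n>contig_2\nAC\n>contig_3\nGT\n")
report = compare_genomes(ref, qry)
print(report)
```

With real files, pass `open("ref.fa")` and `open("query.fa")` instead of the `io.StringIO` objects.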
Hi,
I have carefully read the manual (https://schneebergerlab.github.io/syri/fileformat.html) as well as your preprint paper (https://www.biorxiv.org/content/biorxiv/early/2019/08/20/546622.full.pdf). I still can not tell the difference between SYN and SYNAL. Could you give me more tips?
Thank you very much.
Best regards,
Hui Liu
Hi
I'm getting an error I'm not sure how to deal with. I got the tsv output with the command
show-coords -THrd OsatJ-OsatB_bas.c500.b500.l50.deltafilter_m_i90_l100.delta > OsatJ-OsatB_bas.c500.b500.l50.deltafilter_m_i90_l100.coords
and used that output for syri but get the below error.
syri - WARNING - starting
/home/jyc387/PROGRAMS_AND_SCRIPTS/PROGRAMS/syri/syri/bin/syri:106: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
chrlink = startSyri(args)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/jyc387/miniconda2/envs/py3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/jyc387/miniconda2/envs/py3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 208, in syri.pyxFiles.synsearchFunctions.syri
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U21') dtype('<U21') dtype('<U21')
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/jyc387/PROGRAMS_AND_SCRIPTS/PROGRAMS/syri/syri/bin/syri", line 106, in <module>
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 193, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 194, in syri.pyxFiles.synsearchFunctions.startSyri
File "/home/jyc387/miniconda2/envs/py3/lib/python3.6/multiprocessing/pool.py", line 288, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/jyc387/miniconda2/envs/py3/lib/python3.6/multiprocessing/pool.py", line 670, in get
raise self._value
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U21') dtype('<U21') dtype('<U21')
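The dtype('<U21') in this message means that two string arrays were being added. One thing worth ruling out (a guess, not a confirmed cause) is that a column that should be numeric was read as text, for example because the coords file still contains header lines or has a different column layout. The sketch below checks each row against the 11-column layout named in the startSyri tracebacks on this page (aStart, aEnd, bStart, bEnd, aLen, bLen, iden, aDir, bDir, aChr, bChr). The function name is hypothetical.

```python
def check_coords_rows(lines):
    """Return (line_number, message) for rows that do not look like coords rows."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 11:
            problems.append((number, "expected 11 columns, found %d" % len(fields)))
            continue
        try:
            for value in fields[:6]:   # start/end/length columns
                int(value)
            float(fields[6])           # percent identity
            for value in fields[7:9]:  # direction columns
                int(value)
        except ValueError:
            problems.append((number, "non-numeric value in columns 1-9"))
    return problems


# Invented example: a leftover header row followed by one valid row.
rows = [
    "[S1]\t[E1]\t[S2]\t[E2]\t[LEN 1]\t[LEN 2]\t[% IDY]\t[FRM]\t[FRM]\t[TAG]\t[TAG]\n",
    "1\t1000\t1\t1000\t1000\t1000\t99.5\t1\t1\tChr1\tChr1\n",
]
problems = check_coords_rows(rows)
print(problems)
```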
TL and invTL are essentially the same; they should be reported in one output file.
When I try syri with the *.sam file (from minimap2) as input, something goes wrong.
This is my command:
python3 ../syri/bin/syri -c out.bam -r refgenome -q qrygenome -k -F B
This is the error I get:
Traceback (most recent call last):
File "syri/pyxFiles/synsearchFunctions.pyx", line 155, in syri.pyxFiles.synsearchFunctions.readCoords
File "syri/pyxFiles/synsearchFunctions.pyx", line 34, in syri.pyxFiles.synsearchFunctions.readSAMBAM
File "/home/jgxiang/miniconda2/envs/py35/lib/python3.5/site-packages/pysam/__init__.py", line 5, in <module>
from pysam.libchtslib import *
ImportError: libbz2.so.1.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "../syri/bin/syri", line 153, in <module>
coords, chrlink = readCoords(args.infile.name, args.chrmatch, args.dir, args.prefix, args, args.cigar)
File "syri/pyxFiles/synsearchFunctions.pyx", line 157, in syri.pyxFiles.synsearchFunctions.readCoords
TypeError: Can't convert 'ImportError' object to str implicitly
Hello,
When I run syri, an error message is reported:
syri - WARNING - starting
Reading Coords - WARNING - Chromosomes IDs do not match.
Reading Coords - WARNING - Matching them automatically. For each reference genome, most
similar query genome will be selected. Check mapids.txt for mapping used.
Reading Coords - ERROR - Chr4 in genome B is best match for two chromosomes in genome A.
Cannot assign chromosomes automatically.
A and B are not the same species. How can I solve this problem?
Thanks!
What if my species' chromosomes have undergone splits and fission? How should I use SyRI, or should I only use the one-to-one syntenic chromosomes?
Originally posted by @baozg in #34 (comment)
Hi,
The reference and query genomes have the same chromosome IDs for homologous chromosomes and the same number of chromosomes. I aligned them using minimap2 and called rearrangements using the following command:
$ python3 $PATH_TO_SYRI -c JGXP_WT18.bam -r JGXP.fasta -q WT18.fasta -k -F B --nosnp --nc 15 --prefix JGXP_WT18_
Then, I got the following issues:
Namespace(TransUniCount=1000, TransUniPercent=0.5, all=False, bruteRunTime=60, chrmatch=False, cigar=False, delta=None, dir=None, fout='syri', ftype='B', increaseBy=1000, infile=<_io.TextIOWrapper name='JGXP_WT18.bam' mode='r' encoding='UTF-8'>, keep=True, log='INFO', log_fin=<_io.TextIOWrapper name='syri.log' mode='w' encoding='UTF-8'>, nCores=15, nosnp=True, nosr=False, nosv=False, novcf=False, offset=5, prefix='JGXP_WT18_', qry=<_io.TextIOWrapper name='WT18.fasta' mode='r' encoding='UTF-8'>, ref=<_io.TextIOWrapper name='JGXP.fasta' mode='r' encoding='UTF-8'>, seed=1, sspath='show-snps')
syri - WARNING - starting
('Cluster is too big for Brute Force\nTime taken for last iteration ', 8.678436279296875e-05, ' iterations remaining ', 49)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 9.131431579589844e-05, ' iterations remaining ', 36)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 2.4318695068359375e-05, ' iterations remaining ', 45)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.00018978118896484375, ' iterations remaining ', 34)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.00010824203491210938, ' iterations remaining ', 33)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.001171112060546875, ' iterations remaining ', 46)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 6.556510925292969e-05, ' iterations remaining ', 34)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.0008320808410644531, ' iterations remaining ', 42)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.00020074844360351562, ' iterations remaining ', 41)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.00015544891357421875, ' iterations remaining ', 32)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.00010156631469726562, ' iterations remaining ', 36)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 4.76837158203125e-06, ' iterations remaining ', 46)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.000354766845703125, ' iterations remaining ', 39)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 1.2564423084259033, ' iterations remaining ', 10)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 6.699562072753906e-05, ' iterations remaining ', 34)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.0003211498260498047, ' iterations remaining ', 37)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 1.1444091796875e-05, ' iterations remaining ', 41)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/media/100TB/huli0009/bin/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/media/100TB/huli0009/bin/miniconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 623, in syri.pyxFiles.synsearchFunctions.syri
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/media/100TB/huli0009/bin/syri/syri/bin/syri", line 115, in <module>
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 309, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 310, in syri.pyxFiles.synsearchFunctions.startSyri
File "/media/100TB/huli0009/bin/miniconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/media/100TB/huli0009/bin/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
IndexError: list index out of range
Here is what I got:
total 528K
-rw-rw-r-- 1 huli0009 huli0009 23K Nov 27 04:12 JGXP_ZS4_Chr01_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:12 JGXP_ZS4_Chr01_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 20K Nov 27 04:07 JGXP_ZS4_Chr02_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:07 JGXP_ZS4_Chr02_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 12K Nov 27 03:55 JGXP_ZS4_Chr03_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 5.3K Nov 27 03:55 JGXP_ZS4_Chr03_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 1.2K Nov 27 03:55 JGXP_ZS4_Chr03_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 5.8K Nov 27 03:55 JGXP_ZS4_Chr03_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 24K Nov 27 03:55 JGXP_ZS4_Chr03_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.0K Nov 27 03:55 JGXP_ZS4_Chr03_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.8K Nov 27 03:55 JGXP_ZS4_Chr04_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.1K Nov 27 03:55 JGXP_ZS4_Chr04_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.8K Nov 27 03:55 JGXP_ZS4_Chr04_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.8K Nov 27 03:55 JGXP_ZS4_Chr04_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 21K Nov 27 03:55 JGXP_ZS4_Chr04_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.1K Nov 27 03:55 JGXP_ZS4_Chr04_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 8.9K Nov 27 03:55 JGXP_ZS4_Chr05_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.3K Nov 27 03:55 JGXP_ZS4_Chr05_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 2.0K Nov 27 03:55 JGXP_ZS4_Chr05_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.5K Nov 27 03:55 JGXP_ZS4_Chr05_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 27K Nov 27 03:55 JGXP_ZS4_Chr05_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 5.9K Nov 27 03:55 JGXP_ZS4_Chr05_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 12K Nov 27 03:55 JGXP_ZS4_Chr06_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 6.2K Nov 27 03:55 JGXP_ZS4_Chr06_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.2K Nov 27 03:55 JGXP_ZS4_Chr06_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.7K Nov 27 03:55 JGXP_ZS4_Chr06_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 21K Nov 27 03:55 JGXP_ZS4_Chr06_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 5.7K Nov 27 03:55 JGXP_ZS4_Chr06_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 23K Nov 27 04:11 JGXP_ZS4_Chr07_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:11 JGXP_ZS4_Chr07_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 9.2K Nov 27 03:55 JGXP_ZS4_Chr08_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 6.0K Nov 27 03:55 JGXP_ZS4_Chr08_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 2.1K Nov 27 03:55 JGXP_ZS4_Chr08_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 5.0K Nov 27 03:55 JGXP_ZS4_Chr08_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 22K Nov 27 03:55 JGXP_ZS4_Chr08_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.6K Nov 27 03:55 JGXP_ZS4_Chr08_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 19K Nov 27 04:08 JGXP_ZS4_Chr09_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:08 JGXP_ZS4_Chr09_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 12K Nov 27 04:10 JGXP_ZS4_Chr10_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:10 JGXP_ZS4_Chr10_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 6.2K Nov 27 03:55 JGXP_ZS4_Chr11_dupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.6K Nov 27 03:55 JGXP_ZS4_Chr11_invDupOut.txt
-rw-rw-r-- 1 huli0009 huli0009 778 Nov 27 03:55 JGXP_ZS4_Chr11_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 4.7K Nov 27 03:55 JGXP_ZS4_Chr11_invTLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 24K Nov 27 03:55 JGXP_ZS4_Chr11_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 3.1K Nov 27 03:55 JGXP_ZS4_Chr11_TLOut.txt
-rw-rw-r-- 1 huli0009 huli0009 18K Nov 27 04:01 JGXP_ZS4_Chr12_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:01 JGXP_ZS4_Chr12_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 15K Nov 27 03:59 JGXP_ZS4_Chr13_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 03:59 JGXP_ZS4_Chr13_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 11K Nov 27 03:57 JGXP_ZS4_Chr14_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 03:57 JGXP_ZS4_Chr14_synOut.txt
-rw-rw-r-- 1 huli0009 huli0009 18K Nov 27 04:09 JGXP_ZS4_Chr15_invOut.txt
-rw-rw-r-- 1 huli0009 huli0009 0 Nov 27 04:09 JGXP_ZS4_Chr15_synOut.txt
Hello,
I ran syri on a big genome (~4 Gb, 7 chromosomes) and then got an error message:
syri - WARNING - starting
/share/nas2/genome/biosoft/syri/1.1/syri/bin/syri:115: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
chrlink = startSyri(args)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/share/nas2/genome/biosoft/Python/3.5.2/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/share/nas2/genome/biosoft/Python/3.5.2/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 622, in syri.pyxFiles.synsearchFunctions.syri
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/share/nas2/genome/biosoft/syri/1.1/syri/bin/syri", line 115, in <module>
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 308, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 309, in syri.pyxFiles.synsearchFunctions.startSyri
File "/share/nas2/genome/biosoft/Python/3.5.2/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/share/nas2/genome/biosoft/Python/3.5.2/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
IndexError: list index out of range
How can I solve this problem?
Thanks!
Thanks to the syri team. This is a very nice tool. I have several questions about interpreting the output.
Are CPG and CPL local duplications? What is the major difference between CPG/CPL and duplication gain/loss?
Can we consider Insertion/Deletion as presence/absence variation (PAV)?
I guess I am confused about the difference between Insertion/Deletion and CPG/CPL too.
thank you,
-Sanzhen
Hi, I am looking into using SyRI on a number of genomes I am assembling, but before I do, I need to scaffold with chroder. I have run:
nucmer --maxmatch -c 100 -b 500 -l 50 -t 6 -p alignment reference.fasta query.fasta
delta-filter -m -i 90 -l 100 alignment.delta > alignment.filtered.delta
show-coords -THrd alignment.filtered.delta > alignment.filtered.coords
chroder -n 100 -o test_scaffold alignment.filtered.coords reference.fasta query.fasta
chroder has been running for ~2 days and is currently using ~130 GB of RAM. Is it normal for chroder to take this long and use this much RAM?
Reference and query genomes ~650 Mbp. Reference has 12 scaffolds and query 1133.
Thanks.
Hi! I got a MemoryError when I was running chroder. How can I solve it? Thank you!
The following is the error log:
Traceback (most recent call last):
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 732, in
scaf(args)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 340, in scaf
refdata = getdata(reflength, refid, refdir)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 144, in getdata
path = getpath(nodes)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 34, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 62, in getpath
tmppath = getpath(nodes, added, j, last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 50, in getpath
path = path + getpath(nodes, added, nodes[i].children[0], last)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 64, in getpath
tmppath = getpath(nodes, added + [nodes[i].name], j, nodes[i].name)
File "/public/home/zhangtz/user/jsk/tools/syri-master/syri/bin/chroder", line 54, in getpath
path = path + getpath(nodes, added + [nodes[i].name], nodes[i].children[0], nodes[i].name)
MemoryError
I am interested in using the chroder function to generate a pseudo-genome-scale assembly. I used minimap2 to generate the reference-assembly alignment. Do you plan to modify chroder to support SAM input files, or do you have a script to convert SAM files to the TSV file format that chroder expects? Alternatively, do you recommend using mummer3 to generate the appropriately formatted input file for chroder?
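For illustration only, a SAM-to-table conversion could be sketched as below. This is not a script shipped with SyRI, and whether chroder accepts its output is an assumption. The column order follows the 11-column `show-coords -THrd` table used elsewhere in this thread (ref start, ref end, query start, query end, ref aligned length, query aligned length, identity, ref direction, query direction, ref ID, query ID). The function name `sam_to_coords_row` is hypothetical, the CIGAR is assumed to use `=`/`X` operations (as produced by `minimap2 --eqx`), and reverse-strand query coordinates are written MUMmer-style (start greater than end, on the forward strand of the query).

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def sam_to_coords_row(sam_line):
    """Turn one SAM alignment line into an 11-column coords-like row.

    Hypothetical helper: returns None for unmapped records and raises
    ValueError when the CIGAR uses 'M', because identity cannot be
    derived from 'M' operations alone.
    """
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname = fields[0], int(fields[1]), fields[2]
    pos, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None

    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if any(op == "M" for _, op in ops):
        raise ValueError("CIGAR uses 'M'; =/X operations are required")

    # Clipped bases (soft or hard) before and after the aligned block.
    lead_clip = 0
    for n, op in ops:
        if op not in "SH":
            break
        lead_clip += n
    trail_clip = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        trail_clip += n

    match = sum(n for n, op in ops if op == "=")
    mismatch = sum(n for n, op in ops if op == "X")
    ins = sum(n for n, op in ops if op == "I")
    dele = sum(n for n, op in ops if op in "DN")

    ref_span = match + mismatch + dele   # bases covered on the reference
    qry_span = match + mismatch + ins    # bases covered on the query
    a_start, a_end = pos, pos + ref_span - 1

    if flag & 0x10:
        # Reverse strand: the CIGAR describes the reverse-complemented
        # query, so coordinates are mapped back to the forward strand.
        b_start, b_end, b_dir = trail_clip + qry_span, trail_clip + 1, -1
    else:
        b_start, b_end, b_dir = lead_clip + 1, lead_clip + qry_span, 1

    identity = round(100.0 * match / (match + mismatch + ins + dele), 2)
    return [a_start, a_end, b_start, b_end, ref_span, qry_span,
            identity, 1, b_dir, rname, qname]
```

Each returned row can be joined with tabs and written one per line. Header lines starting with `@` should be skipped before calling the function.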
Any idea what could go wrong? Thanks.
python3 /path/path/syri/syri/bin/syri -c d3-filter-1511.coords -r /path/path/projects/sv/canola/ref.fa -q /path/path/projects/sv/canola/query.fa -k --log DEBUG -d d3-filter-1511.delta
Namespace(TransUniCount=1000, TransUniPercent=0.5, all=False, bruteRunTime=60, chrmatch=False, cigar=False, delta=<_io.TextIOWrapper name='d3-filter-1511.delta' mode='r' encoding='UTF-8'>, dir=None, fout='syri', ftype='T', increaseBy=1000, infile=<_io.TextIOWrapper name='d3-filter-1511.coords' mode='r' encoding='UTF-8'>, keep=True, log='DEBUG', log_fin=<_io.TextIOWrapper name='syri.log' mode='w' encoding='UTF-8'>, nCores=1, nosnp=False, nosr=False, nosv=False, novcf=False, offset=5, prefix='', qry=<_io.TextIOWrapper name='/path/path/projects/sv/canola/query.fa' mode='r' encoding='UTF-8'>, ref=<_io.TextIOWrapper name='/path/path/projects/sv/canola/ref.fa' mode='r' encoding='UTF-8'>, seed=1, sspath='show-snps')
syri - WARNING - starting
('Cluster is too big for Brute Force\nTime taken for last iteration ', 5.7220458984375e-05, ' iterations remaining ', 36)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 9.870529174804688e-05, ' iterations remaining ', 38)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/path/path/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/path/path/anaconda3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 373, in syri.pyxFiles.synsearchFunctions.syri
File "syri/pyxFiles/inversions.pyx", line 729, in syri.inversions.getInversions
File "syri/pyxFiles/inversions.pyx", line 290, in syri.inversions.getProfitable
IndexError: Out of bounds on buffer access (axis 0)
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/path/path/syri/syri/bin/syri", line 115, in
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 311, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 312, in syri.pyxFiles.synsearchFunctions.startSyri
File "/path/path/anaconda3/lib/python3.6/multiprocessing/pool.py", line 288, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/path/path/anaconda3/lib/python3.6/multiprocessing/pool.py", line 670, in get
raise self._value
IndexError: Out of bounds on buffer access (axis 0)
Hi,
I aligned reference and query genomes and successfully executed the following commands as given in your example.
nucmer --maxmatch -c 100 -b 500 -l 50 refgenome qrygenome
delta-filter -m -i 90 -l 100 out.delta > out.filtered.delta
show-coords -THrd out.filtered.delta > out.filtered.coords
But I get the following error running syri
python3 $PATH_TO_SYRI -c out.filtered.coords -d out.filtered.delta -r refgenome -q qrygenome
syri - WARNING - :115 - starting
Reading Coords - INFO - :115 - Reading input from .tsv file
Reading Coords - WARNING - :115 - Chromosomes IDs do not match.
Reading Coords - ERROR - :115 - Unequal number of chromosomes in the genomes. Exiting
Why do I get this error even though I have the same chromosome IDs and an equal number of chromosomes in both genomes? I appreciate your help. Thank you.
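One way to double-check the premise of this question is to list the sequence IDs in both FASTA files and compare them directly. The sketch below is a generic check, not part of SyRI. The function names are hypothetical, and it assumes the ID is the first whitespace-delimited word after `>`, which may differ from how SyRI itself reads IDs.

```python
def fasta_ids(lines):
    """Return the sequence IDs (first word after '>') in file order."""
    return [ln[1:].split()[0]
            for ln in lines
            if ln.startswith(">") and len(ln.strip()) > 1]


def compare_ids(ref_lines, qry_lines):
    """Summarise how the ID sets of two FASTA files differ."""
    ref, qry = fasta_ids(ref_lines), fasta_ids(qry_lines)
    return {
        "ref_count": len(ref),
        "qry_count": len(qry),
        "only_in_ref": sorted(set(ref) - set(qry)),
        "only_in_qry": sorted(set(qry) - set(ref)),
    }

# Usage (file names taken from the commands above):
#   with open("refgenome") as r, open("qrygenome") as q:
#       print(compare_ids(r, q))
```

Differences in case or extra unplaced scaffolds in one genome would show up in the `only_in_ref` and `only_in_qry` lists.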
Hi @mnshgl0110 ,
I was trying to run the syri module after installing SyRI. When I typed the command "python syri", I got this:
File "syri", line 72
SyntaxError: Non-ASCII character '\xe2' in file syri on line 72, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
But the line "# -- coding: utf-8 -" is actually present in the syri file, so how can I fix this problem? Thanks in advance.
Best Regards,
Yung-Chien
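As a hedged aside on the error above: this particular SyntaxError text is the one Python 2 prints, and PEP 263 only honours an encoding declaration on the first or second line of a file. The sketch below checks both points with the pattern given in PEP 263. The helper names are hypothetical, and it is not confirmed that either point is the cause in this report.

```python
import re
import sys

# Pattern from PEP 263 for recognising an encoding declaration.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(lines):
    """Return the encoding declared on line 1 or 2, or None if absent."""
    for line in lines[:2]:
        m = CODING_RE.match(line)
        if m:
            return m.group(1)
    return None


def running_python3():
    """True when the interpreter executing this code is Python 3 or later."""
    return sys.version_info[0] >= 3

# Usage:
#   with open("syri") as fh:
#       print(declared_encoding(fh.readlines()), running_python3())
```

If `declared_encoding` returns None for the script, the declaration is not where Python 2 looks for it. If `running_python3()` is False for the `python` command, invoking the script with `python3`, as other posts in this thread do, may be worth trying.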
I was able to run syri successfully but plotsr returned an error. It would be nice to see how you visualize the result. Please help if you know the potential problem. Thanks.
Here is the code
plotsr -s 1000 -o pdf syri.out $ref $qry
Here is the error:
syri/lib/python3.5/site-packages/matplotlib/font_manager.py:229: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
'Matplotlib is building the font cache using fc-list. '
Traceback (most recent call last):
File "syri/bin/plotsr", line 154, in
plt.switch_backend('Qt5Agg')
File "syri/lib/python3.5/site-packages/matplotlib/pyplot.py", line 221, in switch_backend
newbackend, required_framework, current_framework))
ImportError: Cannot load backend 'Qt5Agg' which requires the 'qt5' interactive framework, as 'headless' is currently running
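The ImportError above is raised when an interactive backend is requested on a machine without a display. A common workaround in Matplotlib scripts in general is to fall back to the non-interactive `Agg` backend, which can still write PDF or PNG files. The sketch below only decides which backend name to use. The function name `pick_backend` is hypothetical, the display check assumes a Linux X11 or Wayland session, and whether plotsr offers an option for this is not stated in the post.

```python
import os


def pick_backend(environ=None):
    """Choose 'Qt5Agg' when a display is available, otherwise 'Agg'.

    Assumption: a usable display is signalled by the DISPLAY or
    WAYLAND_DISPLAY environment variable, as on typical Linux systems.
    """
    env = os.environ if environ is None else environ
    has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return "Qt5Agg" if has_display else "Agg"

# Usage in a plotting script, before any figure is created:
#   import matplotlib.pyplot as plt
#   plt.switch_backend(pick_backend())
```

On a headless cluster node this returns `Agg`, so the `switch_backend` call from the traceback would no longer request Qt.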
I am running syri to get rearrangements between two genomes, but I do not get any insertions or deletions as output. Is it supposed to be like this, or does it have to be enabled in some way?
python3 /apps/syri-1.2/syri/bin/syri -c test.sam -r ref.fa -q query.fa -k -F S --prefix syri.all --nc 4
Thanks
It is not clear how to interpret the duplications. What is the "A" and "B" at the end of the line? Could this refer to which genome has/has not the duplication?
Do we need to rephrase the duplications as "polymorphic duplications" so that it is clear that syri only finds duplications if they are unique to one sample?
Hi guys,
Thanks for SyRI. However, I am having trouble using it. Here is the full error:
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ python $syriExec -c refGenome.vs.qryGenome.bam -r $refGenome -q $qryGenome -k -F B
Traceback (most recent call last):
File "/usr/local/ngseq/src/syri/syri/bin/syri", line 163, in <module>
coords, chrlink = readCoords(args.infile.name, args.chrmatch, args.dir, args.prefix, args, args.cigar)
File "syri/pyxFiles/synsearchFunctions.pyx", line 253, in syri.pyxFiles.synsearchFunctions.readCoords
File "/usr/local/ngseq/miniconda3/envs/py35/lib/python3.5/site-packages/pandas/core/ops.py", line 1283, in wrapper
res = na_op(values, other)
File "/usr/local/ngseq/miniconda3/envs/py35/lib/python3.5/site-packages/pandas/core/ops.py", line 1143, in na_op
result = _comp_method_OBJECT_ARRAY(op, x, y)
File "/usr/local/ngseq/miniconda3/envs/py35/lib/python3.5/site-packages/pandas/core/ops.py", line 1122, in _comp_method_OBJECT_ARRAY
result = libops.scalar_compare(x, y, op)
File "pandas/_libs/ops.pyx", line 98, in pandas._libs.ops.scalar_compare
TypeError: unorderable types: str() > int()
The executable itself seems to work OK:
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ echo $syriExec
/usr/local/ngseq/src/syri/syri/bin/syri
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ python $syriExec -h
usage: syri [-h] -c INFILE [-r REF] [-q QRY] [-d DELTA] [-F {T,S,B}] [-k]
[--log {DEBUG,INFO,WARN}] [--lf LOG_FIN] [--dir DIR]
[--prefix PREFIX] [--seed SEED] [--nc NCORES] [--novcf] [-f]
[--nosr] [--tdgaplen TDGL] [-b BRUTERUNTIME]
[--unic TRANSUNICOUNT] [--unip TRANSUNIPERCENT] [--inc INCREASEBY]
[--no-chrmatch] [--nosv] [--nosnp] [--all] [--allow-offset OFFSET]
[--cigar] [-s SSPATH]
Input Files:
-c INFILE File containing alignment coordinates (default: None)
-r REF Genome A (which is considered as reference for the
alignments). Required for local variation (large
indels, CNVs) identification. (default: None)
-q QRY Genome B (which is considered as query for the
alignments). Required for local variation (large
indels, CNVs) identification. (default: None)
-d DELTA .delta file from mummer. Required for short variation
(SNPs/indels) identification when CIGAR string is not
available (default: None)
optional arguments:
-h, --help show this help message and exit
-F {T,S,B} Input file type. T: Table, S: SAM, B: BAM (default: T)
-k Keep intermediate output files (default: False)
--log {DEBUG,INFO,WARN}
log level (default: INFO)
--lf LOG_FIN Name of log file (default: syri.log)
--dir DIR path to working directory (if not current directory).
All files must be in this directory. (default: None)
--prefix PREFIX Prefix to add before the output file Names (default: )
--seed SEED seed for generating random numbers (default: 1)
--nc NCORES number of cores to use in parallel (max is number of
chromosomes) (default: 1)
--novcf Do not combine all files into one output file
(default: False)
-f Filter out low quality alignments (default: True)
SR identification:
--nosr Set to skip structural rearrangement identification
(default: False)
--tdgaplen TDGL Maximum allowed gap-length between two alignments of a
multi-alignment translocation or duplication (TD).
Larger values increases TD identification sensitivity
but also runtime. (default: 500000)
-b BRUTERUNTIME Cutoff to restrict brute force methods to take too
much time (in seconds). Smaller values would make
algorithm faster, but could have marginal effects on
accuracy. In general case, would not be required.
(default: 60)
--unic TRANSUNICOUNT Number of uniques bps for selecting translocation.
Smaller values would select smaller TLs better, but
may increase time and decrease accuracy. (default:
1000)
--unip TRANSUNIPERCENT
Percent of unique region requried to select
translocation. Value should be in range (0,1]. Smaller
values would selection of translocation which are more
overlapped with other regions. (default: 0.5)
--inc INCREASEBY Minimum score increase required to add another
alignment to translocation cluster solution (default:
1000)
--no-chrmatch Do not allow SyRI to automatically match chromosome
ids between the two genomes if they are not equal
(default: False)
ShV identification:
--nosv Set to skip structural variation identification
(default: False)
--nosnp Set to skip SNP/Indel (within alignment)
identification (default: False)
--all Use duplications too for variant identification
(default: False)
--allow-offset OFFSET
BPs allowed to overlap (default: 5)
--cigar Find SNPs/indels using CIGAR string. Necessary for
alignment generated using aligners other than nucmers
(default: False)
-s SSPATH path to show-snps from mummer (default: show-snps)
Here are the dependencies:
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show Cython | grep Version
Version: 0.28.2
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show numpy | grep Version
Version: 1.14.3
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show scipy | grep Version
Version: 1.1.0
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show pandas | grep Version
Version: 0.23.4
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show python-igraph | grep Version
Version: 0.8.2
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show biopython | grep Version
Version: 1.76
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show psutil| grep Version
Version: 5.4.5
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show pysam| grep Version
Version: 0.15.1
(py35) grusso@fgcz-h-105:/srv/GT/analysis/grusso/darioCopetti/syriResults$ pip show matplotlib| grep Version
Version: 2.2.2
It's a python 3.5 conda environment.
What am I missing?
Many thanks,
Giancarlo
When I use syri, the error is below:
all stack:
File "/share/nas1/liufuyan/HELP/Soft/syri-master/syri/bin/syri", line 179, in
getTSV(args.dir, args.prefix, args.ref.name)
Message: '/share/nas1/liufuyan/HELP/JieCai/01.MuMer/snps.txt cannot be opened. Cannot output SNPs and short indels.'
but it produced one file, snps_GARB.txt. I computed statistics for the file, and the number of SNPs is only 2433, which is very small! Could you help me find the problem? Thank you!
Hi,
I am trying to run the working example on my machine (CentOS) following the instructions. However, I get an error when I try to feed syri input in SAM format. The following commands were used:
minimap2 -ax asm5 --eqx refgenome qrygenome > out.sam
syri -c out.sam -r refgenome -q qrygenome -k -F S
syri - WARNING - starting
Reading Coords - ERROR - Error in reading the SAM file
However, when I use nucmer, I get results from syri without any issue. Please let me know why syri raises an error when the input is in SAM format.
Thank you,
Best regards,
Suresh
Hi @mnshgl0110,
Thanks for the SAM-reading function. I use minimap2 to do the WGA, then use syri to call SVs. It runs smoothly on a plant genome.
But when I used it on a mammal (2.3 Gb genome), it stopped after throwing an error. The two genomes are from the same species but different individuals.
Here is the command:
minimap2 -ax asm5 --eqx -t 24 ./D.fa X.fa > D_X.sam
/data/software/SyRI/1.3/syri/bin/syri -c D_X.sam -F S -q ./X.fa -r ./D.fa --nc 24 -k
Reading Coords - WARNING - Chromosomes IDs do not match.
Reading Coords - WARNING - Matching them automatically. For each reference genome, most similar query genome will be selected. Check mapids.txt for mapping used.
More than one chromosome found for a SR
Here is the syri.log when use --log DEBUG
2020-09-08 21:21:54,691 - syri.13 - DEBUG - mapstar:44 - Translocations : processing translocations 132020-09-08 21:21:54.691237
2020-09-08 21:21:55,023 - getmeblocks - DEBUG - mapstar:44 - Number of mutually exclusive blocks identified 0
2020-09-08 21:21:55,025 - syri.X - DEBUG - mapstar:44 - Translocations : finding solutions X2020-09-08 21:21:55.025257
2020-09-08 21:21:55,069 - syri.X - DEBUG - mapstar:44 - Translocations : processing translocations X2020-09-08 21:21:55.069200
2020-09-08 21:21:55,137 - syri.1 - DEBUG - mapstar:44 - Translocations : finding solutions 12020-09-08 21:21:55.137388
2020-09-08 21:21:56,015 - syri.2 - DEBUG - mapstar:44 - Translocations : processing translocations 22020-09-08 21:21:56.014827
2020-09-08 21:22:16,948 - syri.12 - DEBUG - mapstar:44 - Translocations : processing translocations 122020-09-08 21:22:16.947921
2020-09-08 21:22:22,973 - syri.1 - DEBUG - mapstar:44 - Translocations : processing translocations 12020-09-08 21:22:22.973848
2020-09-08 21:22:24,353 - getCTX - INFO - <module>:218 - Identifying cross-chromosomal translocation and duplication for chromosome2020-09-08 21:22:24.353405
2020-09-08 21:22:24,353 - getCTX - DEBUG - <module>:218 - Reading Coords2020-09-08 21:22:24.353653
2020-09-08 21:22:24,525 - getCTX - DEBUG - <module>:218 - CTX identification: ctxdata size(858, 13)
2020-09-08 21:22:24,526 - getCTX - DEBUG - <module>:218 - Making Tree
2020-09-08 21:22:24,790 - getCTX - DEBUG - <module>:218 - finding Blocks
2020-09-08 21:22:24,791 - getCTX - DEBUG - <module>:218 - Preparing for cluster analysis
2020-09-08 21:22:24,957 - getCTX - DEBUG - <module>:218 - Getting clusters
2020-09-08 21:22:25,061 - getCTX - DEBUG - <module>:218 - Finding ME Blocks
2020-09-08 21:22:26,279 - getmeblocks - DEBUG - <module>:218 - Number of mutually exclusive blocks identified 0
2020-09-08 21:22:26,286 - getCTX - DEBUG - <module>:218 - Finding best subset of clusters
2020-09-08 21:22:26,563 - Brute-force TD identification - INFO - mapstar:44 - Cluster is too big for Brute Force, using randomized-greedy approach
Time taken for last iteration 3.600120544433594e-05. iterations remaining 40
2020-09-08 21:22:27,082 - Brute-force TD identification - INFO - mapstar:44 - Cluster is too big for Brute Force, using randomized-greedy approach
Time taken for last iteration 0.001781463623046875. iterations remaining 26
2020-09-08 21:22:36,940 - local_variation - INFO - <module>:229 - Finding SVs in synOut.txt, invOut.txt, TLOut.txt, invTLOut.txt, dupOut.txt, invDupOut.txt,ctxOut.txt
Hi, I have looked over your paper and supplements and didn't see any definition of what a highly diverged region is. I was hoping you could help me understand how an HDR is defined.
thanks.
Hi, trying to get SyRI to work on some genomes I am building. I ran nucmer and SyRI with the following commands and got an error
~/mummer-4.0.0beta2/nucmer --maxmatch -c 500 -b 500 -l 100 -p nucmer_align $REF $QUERY
~/mummer-4.0.0beta2/delta-filter -m -i 90 -l 100 nucmer_align.delta > nucmer_align_m_i90_l100.delta
~/mummer-4.0.0beta2/show-coords -THrd nucmer_align_m_i90_l100.delta > nucmer_align_m_i90_l100.coords
syri -c nucmer_align_m_i90_l100.coords \
-r $REF \
-q $QUERY \
-d nucmer_align_m_i90_l100.delta \
--nc 7 \
--log DEBUG
error
syri - WARNING - starting
Namespace(TransUniCount=1000, TransUniPercent=0.5, all=False, bruteRunTime=60, chrmatch=False, cigar=False, delta=<_io.TextIOWrapper name='/media/Pangenome/syri_test/grandis_mel_m_i90_l100.delta' mode='r' encoding='UTF-8'>, dir=None, fout='syri', ftype='T', increaseBy=1000, infile=<_io.TextIOWrapper name='/media/Pangenome/syri_test/grandis_mel_m_i90_l100.coords' mode='r' encoding='UTF-8'>, keep=False, log='DEBUG', log_fin=<_io.TextIOWrapper name='syri.log' mode='w' encoding='UTF-8'>, nCores=7, nosnp=False, nosr=False, nosv=False, novcf=False, offset=5, prefix='', qry=<_io.TextIOWrapper name='/media/Pangenome/honours/E_melliodora.fasta' mode='r' encoding='UTF-8'>, ref=<_io.TextIOWrapper name='/media/Pangenome/E_grandis/chromosomes/E_grandis_chromosomes_simple_names.fasta' mode='r' encoding='UTF-8'>, seed=1, sspath='show-snps')
('Cluster is too big for Brute Force\nTime taken for last iteration ', 1.1920928955078125e-06, ' iterations remaining ', 46)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 3.62396240234375e-05, ' iterations remaining ', 40)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.0001552104949951172, ' iterations remaining ', 38)
('Cluster is too big for Brute Force\nTime taken for last iteration ', 0.0008757114410400391, ' iterations remaining ', 28)
Traceback (most recent call last):
File "/home/syri/syri/bin/syri", line 143, in <module>
getNotAligned(args.dir, args.prefix, args.ref.name, args.qry.name, chrlink)
File "syri/pyxFiles/findsv.pyx", line 489, in syri.findsv.getNotAligned
File "syri/pyxFiles/findsv.pyx", line 528, in syri.findsv.getNotAligned
KeyError: 'NW_010092438.1'
[syri.log](https://github.com/schneebergerlab/syri/files/3567819/syri.log)
Log is also attached
After failing with nucmer, I ran minimap2 with the same data and SyRI ran successfully.
From what I understand of minimap2 and nucmer, the nucmer results should be more accurate and I am hoping someone can help me to get SyRI working with nucmer.
Thanks
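One quick check for a `KeyError` on a sequence name is whether every ID in the `.coords` file also appears as a header in the corresponding FASTA file. The sketch below assumes the tab-separated layout produced by `show-coords -THrd`, with the reference and query IDs in the last two columns. The helper names are hypothetical, and this is only a diagnostic, not a confirmed cause of the error above.

```python
def fasta_ids(lines):
    """Return the set of sequence IDs (first word after '>') in FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def coords_ids(lines):
    """Return (ref_ids, qry_ids) from tab-separated coords lines.

    Assumption: reference and query IDs are the last two columns.
    """
    ref_ids, qry_ids = set(), set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue  # skip blank or malformed lines
        ref_ids.add(cols[-2])
        qry_ids.add(cols[-1])
    return ref_ids, qry_ids


def missing_ids(coords_lines, ref_fasta_lines, qry_fasta_lines):
    """Return IDs used in the coords file but absent from each FASTA."""
    ref_ids, qry_ids = coords_ids(coords_lines)
    return (sorted(ref_ids - fasta_ids(ref_fasta_lines)),
            sorted(qry_ids - fasta_ids(qry_fasta_lines)))
```

Each argument can be an open file handle, since the functions only iterate over lines. If either returned list is non-empty, the alignment and the FASTA files disagree on sequence names, which would be worth ruling out before looking elsewhere.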
Hello,
When using chroder, I came across this issue:
Traceback (most recent call last):
File "../../../syri-master/syri/bin/chroder", line 716, in <module>
scaf(args)
File "../../../syri-master/syri/bin/chroder", line 337, in scaf
refdata = getdata(reflength, refid, refdir)
File "../../../syri-master/syri/bin/chroder", line 78, in getdata
fnd = defaultdict(dict)
NameError: name 'defaultdict' is not defined
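A `NameError` like this usually means the name was never imported in the script that uses it. `defaultdict` lives in the standard-library `collections` module, so, assuming the import is simply missing from this copy of `chroder` (an inference from the traceback, not verified against the source), the failing line runs once the import is present. The dictionary keys below are made up for illustration.

```python
from collections import defaultdict

# Same construction as the failing line in the traceback; it works once
# the name has been imported.
fnd = defaultdict(dict)

# Hypothetical keys, only to show that missing outer keys are created on demand.
fnd["scaffold_1"]["Chr1"] = 1
```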
Hi, the following error was reported. Could you help me? Thank you!
Error log:
syri — WARNING — starting
/data/home/songht/tools/SyRI/syri/syri/bin/syri:106: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
chrlink = startSyri(args)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/data/software/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/data/software/anaconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "syri/pyxFiles/synsearchFunctions.pyx", line 260, in syri.pyxFiles.synsearchFunctions.syri
File "syri/pyxFiles/synsearchFunctions.pyx", line 598, in syri.pyxFiles.synsearchFunctions.getSynPath
ValueError: max() arg is an empty sequence
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/home/songht/tools/SyRI/syri/syri/bin/syri", line 106, in
chrlink = startSyri(args)
File "syri/pyxFiles/synsearchFunctions.pyx", line 209, in syri.pyxFiles.synsearchFunctions.startSyri
File "syri/pyxFiles/synsearchFunctions.pyx", line 210, in syri.pyxFiles.synsearchFunctions.startSyri
File "/data/software/anaconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/data/software/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
ValueError: max() arg is an empty sequence
Hi,
I am using SyRI to identify SVs between two species (same genus, diverged 10 Mya). Here is the full command I use, but I encountered this error: [W::sam_read1] Parse error at line 20 Reading BAM/SAM file - ERROR - Error in reading BAM/SAM file. truncated file
minimap2 -t 24 -ax asm10 --eqx ref.fa que.fa > out.sam
python3 /data/software/SyRI/syri/syri/bin/syri -c out.sam -r ref.fa -q que.fa -k -F S --nc 12
CentOS 7.4
python3.5
latest SyRI (installed yesterday via git clone)
Chr1 0 Chr1 74430467 60 78538456S106=1X244=2I202=1X104=4I123=1X157=1X49=1X102=1X123=1X260=2X75=1X270=1X40=1X222=1D19=1X205=1X376=1X229=1X15=1X920=1X138=1X679=1X464=1D228=1X1=1X32=1D113=1X47=1X393=1X195=1X334=1X3=1X160=1X141=1X61=1X313=1X2=1X90=1X118=1X9=1X200=1X56=1X33=1X58=1X147=1D281=1X94=2X202=1X36=1X38=1X46=1X691=1X480=1X196=1X277=1X6=1X512=4I313=1X717=1X230=1X287=1X389=1D147=1X23=1X300=1X66=11D74=1X46=1X525=1D177=16I86=5I11=1X128=1X1=1X1=13D181=1X241=1X97=1D88=1X102=1X50=1X56=1X84=1X226=1X26=1X238=1X59=1D23=1X71=1D129=1X399=1X190=1D169=1X4=1X179=1X11=1X45=1D96=1X546=1X176=1X1=1X12=1X30=1X28=1X195=1X178=1X50=12I1=1X317=1X55=1D258=1X226=1D190=1X188=1X35=1X124=1X13=1D620=1D78=1X572=1X467=1X154=1I47=1X15=1X13=1X946=1X80=1X342=1X40=1I358=1X85=1X340=1X43=1X230=1X191=1X53=1X384=1X137=1X47=1X30=1X276=1X114=1X339=1X78=1X24=1X45=1X55=1X9=1X246=1X205=1X13=1X6=1X44=1X44=1X145=1X40=7I441=1X69=1X19=1X10=1X134=1X276=1X257=1X113=1X74=11D1173=1X223=1X112=1X241=1X176=1X389=1X119=4D282=1X455=1X29=1X11=6D1=1X360=1X439=1X499=1X91=1X207=1X62=1X38=1X248=1X80=1I173=143I206=1X100=1X63=1X397=1X58=1X310=1X535=1X444=1D154=1X39=1X201=1X63=1X68=1X86=1X431=1X564=4D17=1X92=1X10=1X37=1X21=1X127=1X232=1X149=1X66=1X114=1X8=1X446=1X51=1X98=1X210=1D189=1X150=1D1X61=1X72=1X109=1X122=1X126=1X252=1X504=1X84=1X150=1X56=3D232=1I435=1X133=1X373=1X151=1X206=1X128=1X53=1X15=1X52=1X88=1D52=1X9=1X99=1X241=1X144=1X43=2I90=16I128=16I150=1X25=1X344=2X346=1X132=1X420=1X754=1X101=1X559=1X58=1X724=16I14=1X160=1X433=143D1X182=1X483=1I179=1X124=1D72=1X202=1D93=1X67=1X237=3D44=1X48=1I225=1X291=1X479=1X75=1X376=1D723=1X26=1X17=1X88=1X4=1X413=1X26=1X4=1X10=2037I332=1D751=1X532=1X218=1X9=4I231=1X288=1X214=1D93=1X21=1I147=1X23=1D312=1X48=1X203=1I51=1X47=1X17=1X13=1X498=1D112=1X44=1D9=1X40=10I292=1X30=1I10=1X214=1X52=1I102=1D74=1X79=1D140=1X30=1X62=1X126=1X314=1I154=1D15=1X93=1X558=1X5=1X5=1X70=1X56=1X33=1X15=1X307=1X492=1X206=1X376=1X188=1X95=1X1=1X166=1D841=1X71=1X39=1X527=1X93=1X152=1X1278=1X633=1X81=2I39=1X413=1D118=
1X111=1X294=1I6=1X224=1X93=1X71=1X251=23I10=1X421=1X11=1D338=1I211=1X477=4I267=1D1=1X450=1X201=1X73=1X350=1X64=2I383=1X13=4D329=1X250=1X267=1X496=5I95=1X129=1I12=1X82=1X833=20I46=1X107=1X117=1X10=1X121=1X35=1X28=4I212=1X41=1X83=1X707=1X340=1X221=1X45=1D10=1X136=1X207=1X36=7I12=1X188=1X295=1X53=1X367=1X257=1X19=1X192=1X417=1D82=1X403=1X56=1X178=1X288=1X270=1X30=1I356=1I141=1X194=1X97=1X178=1X230=4D102=1I5=1X202=1X19=1X512=12D240=1X465=1X26=1X200=1X356=1X364=1X6=4I185=1X163=1I529=1X1054=3I802=1D572=2D168=1X309=1X347=1X1233=1D69=1X89=2D71=1D34=1X1083=1X22=1D227=2I812=2X50=1X69=1X144=1X33=1X7=1X72=1X113=1X44=1X37=1X16=1D218=1X469=1X19=1D172=1X143=1D1X305=1X328=1I16=1X73=6D126=1X61=1X83=1X176=1X318=1X152=1X269=1X1=9D15=1X18=1X43=1X665=1X388=1X70=1X6=1D224=20I9=4D78=1X9=1X30=8D25=20I3=1X3=1X137=1X370=1I196=1X149=1D105=1D226=1X31=1X270=1I251=27D59=1X79=1X56=2D143=16I63=1X27=1X27=1X11=3579I52=1X22=1X71=1X351=1X123=1X499=3I157=10I8=1I61=1X14=1X43=1X113=1X313=6I15=1X84=1X83=1X84=1X27=1X18=1X1
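To narrow down which record trips the parser, one option is to check each alignment line independently of SyRI. The sketch below relies only on what the SAM specification states: at least 11 tab-separated mandatory fields, a CIGAR made of `<length><op>` pairs, and (when SEQ is present) a SEQ length equal to the summed lengths of the `M`, `I`, `S`, `=` and `X` operations. The function names are hypothetical.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")  # operations that consume bases of SEQ


def cigar_query_length(cigar):
    """Return the number of SEQ bases the CIGAR accounts for, or None if malformed."""
    pos, total = 0, 0
    for m in CIGAR_RE.finditer(cigar):
        if m.start() != pos:
            return None  # unparseable characters between operations
        pos = m.end()
        if m.group(2) in QUERY_OPS:
            total += int(m.group(1))
    # The whole string must be consumed, e.g. a trailing "1X1" is malformed.
    return total if pos == len(cigar) and pos > 0 else None


def check_sam_lines(lines):
    """Return a list of (line_number, problem) for suspicious alignment lines."""
    problems = []
    for n, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            problems.append((n, "fewer than 11 fields"))
            continue
        cigar, seq = fields[5], fields[9]
        if cigar == "*":
            continue
        qlen = cigar_query_length(cigar)
        if qlen is None:
            problems.append((n, "malformed CIGAR"))
        elif seq != "*" and qlen != len(seq):
            problems.append((n, "CIGAR/SEQ length mismatch"))
    return problems
```

Passing an open handle on `out.sam` and looking at what is reported around line 20 may show whether that record was cut off mid-line when the file was written.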
Hi @mnshgl0110
Can we use one-to-one MAF alignments format to identify structural rearrangements using 'syri'?
Best,
zheng zhuqing