isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads: http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html

Note: This was the original repository, which will no longer be officially maintained. Please use the new official repository, listed as the home page below.

Home Page: https://github.com/lbcb-sci/graphmap2

License: MIT License

Makefile 1.00% C++ 97.44% Python 1.55%

graphmap's People

Contributors

isovic, krpa, nnnagara, robegan21

graphmap's Issues

Error in SAM CIGAR

Hello, it's me again! Sorry to bug you once more, but I think I've found an issue in the current version of GraphMap.

Here are the files to replicate the problem:
contig.fasta.txt
read.fastq.txt

Here's my command for the current version of GraphMap:
graphmap align -r contig.fasta -d read.fastq -o out.sam -a anchorgotoh -M 3 -X 6 -G 5 -E 2

Which produces this alignment:

S1_19_19    0   NODE_24-_length_125137_cov_1.00206  83535   40  8M1I4M1I2M1I42M1I5M1D6M1D15M1I8M1I12M1D4M1D5M1I15M1I6M1D1M1I4M2I7M1D2M1I2M1I6M2I1M2D10M1I3M1I8M1I7M2I5M1D8M1I18M1D11M1I10M1D6M1D11M1D3M1I19M2I10M2D10M1I11M1I8M1D3M2I13M1D35M1D3M1D3M2I13M2I14M1I21M1D9M1D6M1I28M1I2M1I2M1I17M1I1M1I9M1D5M1I3M1I4M1I12M1I15M1I4M1D10M1D10M2I9M1I14M1D11M2I6M1D8M1I2M1D2M1D6M1I3M1I2M2I10M1I3M1D7M1D3M1D7M2I7M1I9M1I1M2I4M1D17M1I4M2I2M2D20M1D3M1I12M1I7M1D9M1D29M1I9M1I8M1I11M3I3M1I1M1I20M1I8M1I19M1D5M1D3M1I6M1D3M1I3M4I7M1D1M1I26M1D32M1D3M1I7M1D12M1I7M1I2M1I22M1D13M1I2M1I6M1D12M1I6M2I7M1I4M1I1M1D3M1I18M2I11M1I9M1I6M3I9M1D3M1D15M2D27M1I11M1D6M1I7M1I6M1I3M1I1M1D5M1D4M1I11M3I3M2I4M1D2M1I12M1D9M1D4M1I7M2I4M1D12M1I5M1I18M1D1M1I10M1I6M1I7M1I7M1I6M1D9M1D2M1I7M1I8M2I2M1D3M1I1M1D2M1D2M1D7M2I14M1I3M1I4M1D15M1D8M1D26M1D18M1D2M1I7M1D17M1I10M1I6M1I5M1D19M1I5M2I14M1I3M2I21M1I5M1I2M1I1M1I11M1D13M1D8M3D2M1I2M1D5M1I10M1I18M1I4M1D5M1D16M1I11M2I8M1D15M2I8M1D5M1I27M1I3M1I2M2I10M1D3M1D6M1I6M1I8M1I13M2I11M1I6M1D15M4I6M1D10M1I25M2I25M1I8M1I9M1I9M1D11M1D10M3I2M1D2M1I26M1I2M2I8M1D13M1I5M1D1M1D6M1D7M1I1M1I18M1D3M1D12M1D8M1I18M1I10M1I6M1I14M2I2M2D2M1I4M1I17M1I7M1I15M1I2M1I1M1I7M1I12M1I5M1I1M1D17M2I2M1I8M1I21M1D3M1D5M1I7M1I6M1I10M1I7M1I1M3D14M1D2M1I6M3I11M1I3M1I2M1D7M1D2M1D3M2I17M1I8M3I13M1D1M1I5M2I14M1I6M1I10M1I2M1I17M1D3M1D3M3I2M2I1M2D3M1I3M1I8M1D17M1D6M1I5M1I13M2I19M1D1M1D7M2I3M1D2M1I8M1I10M1I18M1I9M1D8M1I9M1I6M1I3M1D2M2D8M1D19M1D2M2I15M1I7M1I20M1D25M1I5M3I7M3D3M1I11M1D1M3I24M3I2M1D7M1D5M2D2M1I6M1D8M1D10M1I28M2I17M2I7M1I1M1I2M1D9M1D2M1D9M1I10M1D9M1D15M1D18M1I12M1D6M1I2M1D15M1D7M1D7M1I2M1I9M2I10M1D16M1I3M1I5M1I8M2I6M1D2M1D20M1I2M1I10M1D4M1D3M1D5M1I7M1I6M2D6M1D20M1D10M3I4M1I5M1I3M2I12M1D4M1D9M2I2M1I4M1D1M1I4M1I7M3I5M1I8M1I11M1D11M2I1M3D4M1I10M2I2M1D4M1I7M1I6M1D6M1I4M1I27M1I2M1D2M1D2M1I2M1I3M1I5M1D2M2I27M1I7M1I16M1D1M1I3M1I8M1D6M1I15M1D1M1I7M1D2M1I6M1D3M1I13M1D2M1I3M1I7M1D14M1D5M2D26M1D6M1I2M1I5M1D4M1I7M1D4M1D1M1I8M2I11M1D10M1I26M2I3M1D2M1I12M1D4M1I7M1D5M1D2M1D5M1D34M1I2M1I24M1I2M1I13M1D20M1D4M1D2M1I10M1I22M2I10
M1D5M2I12M1I10M1I4M1I2M2I13M1D5M1I6M1I5M1I13M1I1M2I8M1I24M1I4M1I7M1I7M2I2M1D2M1D3M1I1M1D3M1I8M1I3M1I18M1I3M2I2M1I27M1I13M1I6M1I6M2I9M1D2M1I4M1D6M1D4M1D3M1D8M1I15M1I4M1D2M3I5M1I3M1D6M1D2M1I3M1D7M1I6M1I5M1D3M1D6M1D6M2D1M1I8M1I11M1I13M1I6M1I3M1I3M1I16M1I5M1I24M1I9M3I3M1I5M1I8M1D2M1D8M  *   0   0   CACTCGGTTACAACTGTCTGAGAATCTGATATGTTATGTCTTGGCAAACGGTGGCGGAACCGACTATATACATGGAGCAAGAGTGGCCAACTATGAATAGTTCTTTCCGCCCAAGTCTTGTTCCACAGGCCGGTTGGATAATTTCTCCCCCTAACAATTGTTTAAAAGGTGGGTAACAAAGGGCGCTCGGATCTTCCATGAGGGGGCGGGACTAGCAACGCGTGCGGGGGTAAGGTGTGGTCCTGGCCGACCTGGACGACACTCGCTCCCCTGACTACAAATCTAGTCAAAGTTAGTTAAGATGATGTCCGACAGGAACAAGCTAAGACGCTTTTGGCTCGTTTCAATATGAAACAAGGGGGATGATTCGCGGTAATCTACACAGGTCACCAATACACCCAGGGCGCTCGTGTGGAGTTGAGCAGCGAATCCTCAATGATAACAATCTACTGGCTCGTGCGAATCGTCCCTTTTCTGGCGAAATTGAGGATTAGCACTCAATAAAGAGGTTATCTGCCAGTCGGGGACTTTAGGCTACCCGTGTAGAAAACTTTGGAATAACTGTCTTATCAAGTCCACACAACGTTTCCAGCACAAAGTAAAAGGCGCTCGGCAGGCGGTAGACCTCTGTAACGCCTTTTCTTGGGCTTGCGCCGCGGATCAAGGGTGGTACAAGGGATCAATTAGTTATCACTCCGCCATCCCAAAAATTGTTTCACAAATATTCCAAGATTTAAATTAGGGTTCTACAAACCATTGATCGTCACCTTACTCAAAACGTTTTCGCGGTCCGACCGGTGACTCTCGTAGACCAGGCTGGATAAAGGTTCCTCATAGACGTCTCGTCCCAATGGAGACTGACCAACTGGGCCCCCTCGCATATTCTGTCGAGAGGGGAGTCGGGACTATTCTGTTAGCTCTGGGGACGATTAGACGAATGCCGTCCTTTCAAATATACATCCAATGGCGGACGAGCGTTAAAGCGGCGATACAAAATTTGACCCATATTTGAGGCACCGGAGACAAACCTGAGTTTCCGGCCGCTTGAACACAATGGTGAACTTTAAGGAAGTCCTGATCGGCCTCCCGTACCTATACCCGGGATTTAATCGAGGTAACTCCCCAAATTGATAGAGTATGGCTTTAACGGGCGAACAGGCTAGTCTCTAACGCGGGGATCTAAGTCGCTGGTGGCTACAGAGCGAGATTCGGCCGAAGACTTTGCGATTCCTGACTATAGAAAGCACTTAGTTTATCGGAGCTCCCTTTGGCCTGAGGAAATCAGTCAATGGTAAGCTCGCCAAAACCCCAGGGGTAACTTCCCTAGTTGGCCCTGAAGAGGATTGATGTCGAGTTCTTACGAACGATAAATTTAGATCTTCTAAGTTGTCGGCCATAACCTTCTGGTCAGATGTCGGGACTGCCGTATAGTTGCCCAATCAAGCTCGCTTTGTCGCTGTGTACGCAAAGTGTCAACGGTGGTTAGGAAGTCGCAACCGGCATTTGAACCGACGGAAGTCACATTGAGGTCTCCAACATGGGCGCTACACAGGCTCTTTATAGCATTTGTTGGTTCCCGTGGGTTTGATCTCTTGAGTGGTCCAACCATCCACGTCGGACCCATCCTATTTGATGGTACCCTTCCTCTAAATTAGCGCGCGCGTGGTTGTTGCCAAATGAATGACTCTGAACGACGGTAACAGC
CATGAAAAAAATGCGTCTATACCTCTCGACCCGTTGGCACGGATGGGCCTACCAATGCAAAGATAGCCATTGGGGATAATACAGGATTGGTCCCAGTGACGCTCCCAACAATGGCTTTTAATTTCCGGAATATCTGGGCCCGGTTAAGTTAAGTTGTAGAGCCTGGCTTTAATCAGGATTAGACCTCGGTTCCCCTGTTGCGCGGAAAAATGACCTGATGCTAAATAGGTGGCGGATAGGGTTGCCACGGCGAAGGAGCTCGTAATAGAACCTAATACAAAGGGTACGTTACCCATACCCCAGATGCCCATCCTTTGCCAACATCTTCGGGCCCGCACACACTCTGACGTGAGACTAACCCGGCGTTAGGACCAAGCGAAAGCGTCAAAGACAGTTTGGAATCAGCGCGAAAAGACGGTGCGAGGGTAATAATGGACCCTCTTCCTATATCTGAATGAGAAAAGCTGGGTAGTACCAGTGTACTCGTAAACGGTTGCATCTCCTAGGTAGCCCGGTAAAGGAGTCTTCTCTTCTGACTACGTAAAGTTGGGGCTTCGGAGTGATATATTAGGAGCCTTGCTGGGCCGGGGAACACCACTCTCATCCCGGACGTCACTTGAGGTAACACCGCTACCCATACGAAATTTAGTGTTTGAGATGACTTTGACTGTTTTTTAAGTTGGTTCGCCAGATTGGCCGATGAGTGCGTGTAGCGTTGCCTTAAAAAGACTTCTGATCCCGGATGCTCGGTCAGCCTCTCAGAGGGAATGTTCGTCGTGATAAAATCATTGGACAGCCTGCTATACCGGCATATCAAACCGCATTGTGTGGGGAGGTCGACGCCCCCTCAAACTTGGGCGGGTGTAGAGTGCGCCCCTCCGTACCGATAAATGCGGCCGAACTGGACTTCGTCACTAATACCAGAGAAACTTGCCCGTGATAATTTTGAGTTCATTTTCGCCAAGTTGCATACTTTCAGCGGGACATGAAACCCAAGCAAACTACGTTGCATTGAAGGGGATTAACGGAGACTCAATTGAGGCGAACTCTTAGTTGTATACCGCGAGTGGGGGAGAAGTTAATAGTAAGTACTAACTTTAAACTTCGCCTGGTAGATATTCCTTGCGACAAGGCCAAGGTTCCAGCGGCGTAGCCAGGCGTACATAGTCAGCAGGCAAATCGAATAGCTGGGGGTACAGCTTAGATATCTGCACCCAAATATCTTCATATTACACGGCGCCACAAAAAAAAACCGCCGATGGCTTCCAAAGCTCAGGAAATGATCGAGGCATCAAGAGGCTGCGTCCAATAGAAGCCTTTATCGGTCTTTTCGTATTAATTTCTTCTCCGTCCATGGCGTGGACCGCGCAGTGTATTCCAGTGACCTTGCCAGTGGGGAGAGCCCATGACTGGTGAATTGCGCACTATTGACTGGTAGAACACGGTCGAAAAGAAAGTAATCGAAATATGGTAGTACCGGTAACGGGAGACGTCGCAGCCTAATCTTCCCTGGGGCATGAGATAAACGGGCCTTACCCACGGAGGTAATAAGCGAGTGCCCCCCCAAATAGTCTTTTGCTTTCAACGGAACAAGTCCCGAGTAAACTCATAAGAGGTGCGGTGAGCAAATCTCTCCCCCTGATGTATGCGTTGGATCAACAGGTTTTGGTTCAAAGGCTTTGAAATACCCCCGGTGCTAGAATCAGGGGGCCTCCGCTTAGTAACCCATCTGACCATCCTAACTCCCGAGAGCGTGGGCGCGCCTCACACTACTGTAGGCAAAAGTCTCCGGCAACATGATCACCCACGTTGTCCGTACGGTCAGGAGTCTCCCGTTAAATTCGTTCCGGCCGGCACCACACGGGCTTTGCACTCCCCTCCGGCACCACTTTGACAAAGCTGTAAATAGGGAAAAGGAGAAGTGTTTTTCTGCGGGCTCCATTACTTACTCACTGTAGACGCTTCAAGCCTCGACTTCATGAGACATTAGCGGGAAGGCA
TAAGACCCATGACTGCCCCTCATATAGGGTCGTCAATATGGAGCATAGGATGCTGGCTGAGGGTCAACGCCCTTGCAGGGCTAGAATTAAGACGAGGTATTTTATTCTACGTGGCTCTGCCGAAAATGTTCAAAGCCCCGAACGGGCAAACCCTGATTGGGGCGGTTGTAATCGGGATCAATGACTAGTAGCTATCGGGATCGCCAGTTGCCATTTGTGATCACTCCTCCGGGGTTAACTCACATGTAAATACAAGAAGGGGTTTGCCAAATACTGAGTGACGGGAAGTTGAGGCCACGGGAATAAGAAGGAAGAAACGCCGTCACCATCGGCCCCCTTATGTCGTAGACTTCAATGTTCGATCGTTAAAGCTCTTGTCGGTTATCCCTAGAGTATGGTAAATTTCCGTAAAAACGTCACTTCGCCGTTTAAGGGGCGTCTGGTGGGGGTAGCGACCCGGGCATATAATATCCTATAGCGACCGACGTTTCGATAAGCTATAACTAGAAATGTTCAGCCACTCGCGGCTCAGCGACATCACACTCTGCTACAATAGAATTCGCTAGCATCCCCCGGAATTTTGCCTCCAAGATCGTGTCTAGGACCTGTCAAATCCTGAGTCTATCATTCTTTGATCCTGTCTGTGCGCGTGAACGCCGGGCAGTAGAGATATTTCGATCTGTTTTATTCTTGAACAGATCCTAAGGGCCTAAATGAAAGACTTTTCTAAACATACTACTAAATCACACCGTTACTGTCTAGGTCGCGACACGCATGGTTGGGCGTAAGTCCTCATGTTTGGGTTACCACGGTGCCGTACTATGCCGGGTTACGGAGCCTTATAAGTGTCCCCCCGCC  /..*%,.&.).)%(,*&.-*,*)+."',-.,.,-".$+((+-+,.$.$..(.(*-().+-%$**.+/*&%..-'#..,,,..%-'%&.,$).%+..-(')-%-&(--'..,..-/'-(),*'*&&(**),.+-'&,)(.+*)-,-/*,$)+%%.&..$(*&'-(.((%-+,')#,*(#......",+*%'.-%).,.*.'/.++!.+-$+(*.-/.$,.%-./...".,(---&--'.-$+..+'.-*#-,#--.---,-++.*',,).',./.&-*&,..%,*-,-/%-++&*$-",'--,&*.*-*.-.(-#,(.-+,-),----,....#%#)''&((#&.%-+,.-+.,-,,.**$-'+.-.-*+,.*-$-"--$,-,#)-"-#$&'..&.)(-*+!#.-)**,-.$,--$$),'(*."+-&--%+%*..-%,-,%(.,....*-.,$-.'%%(,.)-.%-")"*--+).'.+..(-/.$-)*",--&.'&.&../&-%$-#'*+"(+'.,.%,(+%,+.-$&,-,$)-).-.-)$,-.+-(%..)/+,.+&-&.$....(,+,#,&,.&.%+-*.'/%)%.-'%/-).,.*,+%#$,$,+/--..-&*'-.-,.-''++,-,-+-,-,%&&--,.$&*$.+.%-/,).'..%(+...(,-&#.".(,-',.-,+*-*"-,').&--%'&$.&.%.+,/$(*+-.--#*#$+.-.!',%''(!+.,(..-%,'',,,**,%(%*&,*.--#'$.,$)$...-("-)(,)%$)*'-'%,..+''..+-(-".,(,.../+-+-/".,,-*%#++*",')/.%+#$,.-/)*&--*..*+..&%(-#".&'-,,++)-,.#-.$-..)#)-.--$*&-.--.,..-'(,--)-#(.+$.%*%.+.)#(.+#*.*,(..$*$.$%(&.)-..(.-)+&+-%,-..)/'(##"!/+-(.$(.+,.(%+,#,,--.....,.)+&%(-.-*+.,*,*&-+,)-*+&-#.+*./$(+-).#).,-&+&+%-(+**.%,+*".(,)...)(.$..-''(,.+'%-,---.-../*.$/.('-,,)...*-*-'*))/#,.%,-$&("(-*$''$'*%-(.+---)",*,)"*$+*),&.*..-
-),%..,-*.-(.&.."#)#-#.&$-*.)-*#-)(-.-+&+(%.+.-*(*$".%,-++.-)-*+--%-+,,,.).!*--.)+*+,&'),'-&.&$&.$%(*.--,%-)-.,..*-,+('+.).#..).+..".)-..',.$.&-,.!%..%.+&-$--,..-.%.-$#.,$+%$,(-.,&(..".(+%+.-)/.,%%&.*&,$.-''.(%--'$"#+*++.+*.-++('.'-'$--,*+'&,*!%.,,..*%(-$#-.%-)-*.-..*$/),.+&'$--+#--#((.**.+&-&,**.-.,....#,.%&..,.'-%.$*.%).&("%#,*).,.$.--.-"--*,*!-,'.,.*&-(%-.+),,(""&..,&()'+/-),'+,-%-..&-..,.+.)-,+"..&*+#-.)+--.++.,-.(--.-..+-,$,**$+-+-."%%*,...*.'%)-.(-(+.%.&'"''..-+*%%&#$&)&.+,-(.%.)$.#$-&-)&%,-..)+%--$--*.#'-*(.*..$(*-,#".)-(*#+,*-*/$.-,).-,-.*.,,'""/--%-.-$%)&-*-..+.,.,.".)*++'(..+-.*,",()-!*%(-%.,&+).%..*-&%-'-)#),,-&-(.,+*&(%(*"(&.+%-*(.").%$%(.-.&,).)*-)(.*-!//.-&,.,..#.)',)#.$,.---+$,*-&$..+%-.&+,*.%&'',./-,#-.(..$--**%-+,-.,'",(,,*&#.$...-".*.--.,+(..(.+./%-..%..*-((&##,-.(+-.+.)$%+,%.*(.+-..-+#+%&'*++.%,*/.-.%#-.)*&.+,$&(--."%--+...-,!.)-&(,/%)&-..+,,.,.)++#.').).$-"#*(.%)*..','..)/#&,.)),.,,++-(-+.#'.-.,(--*"'+,*"+-(+.#&#+*-+(,+-..-+"%.&.!,'('''%%(,..,((*--%','-.-(+.+-),,.$.$",+-(.%-',&..(**-,-..+-",--/,-))./#.,..%&+!&%&,*%.%,&,+--(.)##-,**,+*&/.$&'$,,$-$.)...'.'.-+')-+/-.+..-*%(,+.&-+%".-"---)**#,-$,*+.",**--+#,(-/+*.$*.*-.+,-'#.#,,'&$.(.".)..*,&#+(.%+.&.-+-,)'!-+*&),$'$-#-.+"+,.),.+.--.)./-,-&-*#.+,#'%','-%&--..%(,'..++.-$*!-,..*..',.*..+/&.+#*+-(-*..%,*%,'.#)&$,-('')$!.)%+--,-.!-(./)-(+-(,)'.%,.*-%)(+.+%-+.')'"$/*)&-.*+./(.&..-#,-),-,...,$'.(%#.)%,,.#+-*).,)#(.-.-(-(.--*-!*..-..,**'*)'--*).*.&.-+'+-#)/,..,).#...-.--*#+.".+.).(*+!+--&.&*'-)'.-+&-(.,%%$#-,!*/+)%.&+%++"&$..#+%+%*-.%*'-,...,-.'%)-),'.,!,--.%+),().&*.-%)$'/#,-,#&---++*.+.).$..($--*-+#-'"*,++-+.,*&----.&+.!.,,'&#.$(..-,.)'-.-,.-,(,")(,(*.%.-/',.)--.#-,-,-./"$,.-&*.''(-,..&.-..(-+--**.,,--.+-.,.+-..$#%(.+%+-$(-&.++,)"..,.)&-!)',-*$**)$,).+,.,+.-*/.,(--..+--.,%$-.*,,&'-+'-&'%,,'$.)%'.,-..-.,-".,)-*.).-,'-+'"/""))'-),.$--$--(.).),$#*+,*,.%-..'.,+)),./.+#+.)**++.+*#."+-,+.,$*,/)&-)%+&.'(*.'%,-#(%.(.$.#%#-&/.+'..($-.,--*.!#..)-."+*,,,.*..*$(,*-*%./&-.#-',-%.#--&-+-.($.#-++.)''--++..--)$,+%.(-)-)#
...**!..'*)&,,-%(.,,-(+*-+*.-.#%'-*&.--*,--/+$,+.,#&+%-,"!$.$&#-,)--.)/,.+,,%.,-),.--)+&-(..!,,"*&&+."-&#'-"*()+-+&.+..#)-+..$''.-'((-.---)./-."',../%",%((!#+..*!/.'$%.,-**..+.%$".-)..-#.+..$-..%%-/.-&(-)#,*(',-,.--.,,.*'.,.).%''.-$#%$'%-,.)-#,-.%(--..--,,.&'.,.(%&*%(-.-%,.#$+%%.-#.++$.,.,/+---(-&&"+#&(&,#(*,.),%,-",'+!$).$,"-.-,.-&%."#+%#&-*-/.+,.-,",)!&)+*-$!%-,($*.,*,.--..(,%-.-"-($$-.+*..).()+,)*(.+--)--*%+,+%.,-+&)-),&+*,(..'.-)%$"$/-*--&.%++.&.')..,&...+,,'%&(&.---&&(&)$"--.).%.',+#..+*#.)-$.$)+,.(..*..*$..-+%.-,'(.,)(.+-$/,'-)*,.'.$-*%-$.'*%(#&.$*-%.,*!(.*+*.,&(.+.+-,..%.-(.*..++%(-$+-,*-..,'(.)-).**.-)-&....*.'/().-'*')##)+..%-+*,#*.)..!+.---%*#-$&$&-$,.,.,/.'"(+.-.'.-,$,'.,.#,*--..'%%-.&"+(*,$.%*".%,-+%*'$$,-+---'.(+-+....,,..-.(+"--.-,"..*-')-,--$.*$-.*----%-+%,..--%&*.)#-*%'!-.(#--..)--..#,..+-.*+#%).$(,.#&*..),-$''..'.*,.,.#,)...*$&,-,..*,--(+,*-%+*%'.-+.-.).&.'*,,')-#-%+,($.)#.$-%+%*-%+&$!$&+-*)-*&&-&&'(.-))-*)-).)#/&,"+",*.-.-&(+.-*.(-*.+#*&-..)--+.#$".#.$.-*-'-('(&,'&%-"'$'-*.#-")'&.-".'-(-.+#.-/'-).'...!!.....,&+)(.$!,-)",.+.-%(-.-%.&.(&)/**&,.+-'+.-,+..,(/.+.).,-*'..-#"*'#-$'--.../)-./,,.%-,%--*.(.-,+-.(.$-*-",.(#%-)(.#&+'),,)-$".-./.-..(*$*#(.)+%.+*-&...,,*,*,*))/.'+,.$,+-..,(,.(!$'+..#"&,-.--(&---(-+&,+*+#,.$.+(.$+&,.*%&+.&.+&.-.%).)-..'.,.-*%.,+..&.-+#".*.-+,-.(-.$*.&&(.,*)+*.+"-/..%.),(!'(+*&&-%.*$.#.)+!.).,)#+)-&!'+**(."#+,*+,$-+-.-*"$/+#.-%(+./&$%(.'+.-,/----$,&.--)&.'.-#$"-.+  
MD:Z:61^A6^T13T21^T4^A15T10^T9G2^G11^T0C14A18^T0A15C9^T12T2T5^A6^C11^C32^C0G21A7^T16^T32A2^G3^T51^C9^A65^G43^G10^G33^A17^A10^A2^T12T11^A7^A3^G19C8^A13A9^C0A12G7^C22^C9^A4A11T35T23A27T1A1^C5^A9^C13^A27^G0G16C1C9C2^G6A3^T1C23C17^T21^A1A4C23^C56^T3^G3T11^G0C38^G23^G5^T1C20^T10G3^A9^A15^A35^C8A10T1T15^G9^G10G8^T4^T2^T2^C1C26^T15^T8^T26^T1T2C3C2T6^A9^T4C33^C7A1A2A60C7^C13^C8^G0A0T4^C37^A5^G0G20G10C2^A20C2^C0C46^A3^A15G34^A5G4T10^C25G9G25G19C4^C11^G5A6^C13C24^C18^A1^G6^C11A1T12^G3^T12^T39A18^T0T73^G36C11^T3^G36^T0G0A14^C16C7^G7^C2^T6G11A17A4^T10T15G3G24^C3^G6^C0G14^A1G15^G11T31^T1^G10^A14C23T8^T11T14^A2^C0T8^G19^C44^C33C3^C0A0C14^T1G6C9C8^C7^C5^C0T8^T8^C1C63^A9^C2^G5G4C8^A1A4C2^A1T13^G30^T8^G15^G7^G28^G2T2C23A8^T2^G11A20^A4^A3^T18^A0T6^A1G18^G10C23^C4^A15^A36^G4A7^A0T0T16^T8A8^C6C32^T2^A12^G25A15G10^C12^A18G2^A8^T8^G11G4^T12^T14^A5^G0T26^G10G2^G11^C4^G20^C0A15C11C3A4G1^G3A2C7^G1C9^T5^G2^A5^A38C2T16C16^C0C10A0C7^G4^A6C37^A40T5^C0C39C0G0T0A30A7^T2^A4^T11A40C3A41^A6^G6^A4^T3^T21C5^A10^T6^T5^T18^G3^C6^T6^A0C4C23T19C2T6A21G34^A2^C8    NM:i:1011   AS:i:12499  H0:i:1  ZE:f:0  ZF:f:0.696692   ZQ:i:4562   ZR:i:125137

I had some code that was stepping through the alignment using the CIGAR and things seemed very wrong towards the end.

For comparison, here's the equivalent command in an earlier version of GraphMap (commit 2e314e6):
graphmap -r contig.fasta -d read.fastq -o out.sam -a anchorgotoh -M 3 -X 6 -G 5 -E 2

And that produces this alignment:

S1_19_19    0   NODE_24-_length_125137_cov_1.00206  83535   40  8M1I4M1I2M1I42M1I5M1D6M1D15M1I8M1I12M1D4M1D5M1I15M1I6M1D1M1I4M2I7M1D2M1I2M1I6M2I1M2D10M1I3M1I8M1I7M2I5M1D8M1I18M1D11M1I10M1D6M1D11M1D3M1I19M2I10M2D10M1I11M1I8M1D3M2I13M1D35M1D3M1D3M2I13M2I14M1I21M1D9M1D6M1I28M1I2M1I2M1I17M1I1M1I9M1D5M1I3M1I4M1I12M1I15M1I4M1D10M1D10M2I9M1I14M1D11M2I6M1D8M1I2M1D2M1D6M1I3M1I2M2I10M1I3M1D7M1D3M1D7M2I7M1I9M1I1M2I4M1D17M1I4M2I2M2D20M1D3M1I12M1I7M1D9M1D29M1I9M1I8M1I11M3I3M1I1M1I20M1I8M1I19M1D5M1D3M1I6M1D3M1I3M4I7M1D1M1I26M1D32M1D3M1I7M1D12M1I7M1I2M1I22M1D13M1I2M1I6M1D12M1I6M2I7M1I4M1I1M1D3M1I18M2I11M1I9M1I6M3I9M1D3M1D15M2D27M1I11M1D6M1I7M1I6M1I3M1I1M1D5M1D4M1I11M3I3M2I4M1D2M1I12M1D9M1D4M1I7M2I4M1D12M1I5M1I18M1D1M1I10M1I6M1I7M1I7M1I6M1D9M1D2M1I7M1I8M2I2M1D3M1I1M1D2M1D2M1D7M2I14M1I3M1I4M1D15M1D8M1D26M1D18M1D2M1I7M1D17M1I10M1I6M1I5M1D19M1I5M2I13M1I4M2I21M1I5M1I2M1I1M1I11M1D13M1D8M3D2M1I2M1D5M1I10M1I18M1I4M1D5M1D16M1I11M2I8M1D15M2I8M1D5M1I27M1I3M1I2M2I10M1D3M1D6M1I6M1I8M1I13M2I11M1I6M1D15M4I6M1D10M1I25M2I25M1I8M1I9M1I9M1D11M1D10M3I2M1D2M1I26M1I2M2I8M1D13M1I5M1D1M1D6M1D7M1I1M1I18M1D3M1D12M1D8M1I18M1I10M1I6M1I14M2I2M2D2M1I4M1I17M1I7M1I14M1I3M1I1M1I7M1I12M1I5M1I1M1D17M2I2M1I8M1I21M1D3M1D5M1I7M1I6M1I10M1I7M1I1M3D14M1D2M1I6M3I11M1I3M1I2M1D7M1D2M1D3M2I17M1I8M3I13M1D1M1I5M2I14M1I6M1I10M1I2M1I17M1D3M1D3M3I2M2I1M2D3M1I3M1I8M1D17M1D6M1I5M1I13M2I19M1D1M1D7M2I3M1D2M1I8M1I10M1I18M1I9M1D8M1I9M1I6M1I3M1D2M2D8M1D19M1D2M2I15M1I7M1I20M1D25M1I5M3I7M3D3M1I11M1D1M3I24M3I2M1D7M1D5M2D2M1I6M1D8M1D10M1I28M2I17M2I7M1I1M1I2M1D9M1D2M1D9M1I10M1D9M1D15M1D18M1I12M1D6M1I2M1D15M1D7M1D7M1I2M1I9M2I10M1D16M1I3M1I5M1I8M2I6M1D2M1D20M1I2M1I10M1D4M1D3M1D5M1I7M1I6M2D6M1D20M1D10M3I4M1I5M1I3M2I12M1D4M1D9M2I2M1I4M1D1M1I4M1I7M3I5M1I8M1I11M1D11M2I1M3D4M1I10M2I2M1D4M1I7M1I6M1D6M1I4M1I27M1I2M1D2M1D2M1I2M1I3M1I5M1D2M2I27M1I7M1I16M1D1M1I3M1I8M1D6M1I15M1D1M1I7M1D2M1I6M1D3M1I13M1D2M1I3M1I7M1D14M1D5M2D26M1D6M1I2M1I5M1D4M1I7M1D4M1D1M1I8M2I11M1D10M1I26M2I3M1D2M1I12M1D4M1I7M1D5M1D2M1D5M1D34M1I2M1I24M1I2M1I13M1D20M1D4M1D2M1I10M1I22M2I10
M1D5M2I12M1I10M1I4M1I2M2I13M1D5M1I6M1I5M1I11M1I3M2I8M1I2M3I19M1I4M1I7M1I7M2I2M1D2M1D3M1I1M1D3M1I8M1I3M1I18M1I3M2I2M1I27M1I13M1I6M1I6M2I9M1D2M1I4M1D6M1D4M1D3M1D8M1I15M1I4M1D2M3I5M1I3M1D6M1D2M1I3M1D7M1I6M1I5M1D3M1D6M1D6M2D1M1I8M1I11M1I13M1I6M1I3M1I3M1I16M1I5M1I24M1I9M3I3M1I5M1I8M1D2M1D8M  *   0   0   CACTCGGTTACAACTGTCTGAGAATCTGATATGTTATGTCTTGGCAAACGGTGGCGGAACCGACTATATACATGGAGCAAGAGTGGCCAACTATGAATAGTTCTTTCCGCCCAAGTCTTGTTCCACAGGCCGGTTGGATAATTTCTCCCCCTAACAATTGTTTAAAAGGTGGGTAACAAAGGGCGCTCGGATCTTCCATGAGGGGGCGGGACTAGCAACGCGTGCGGGGGTAAGGTGTGGTCCTGGCCGACCTGGACGACACTCGCTCCCCTGACTACAAATCTAGTCAAAGTTAGTTAAGATGATGTCCGACAGGAACAAGCTAAGACGCTTTTGGCTCGTTTCAATATGAAACAAGGGGGATGATTCGCGGTAATCTACACAGGTCACCAATACACCCAGGGCGCTCGTGTGGAGTTGAGCAGCGAATCCTCAATGATAACAATCTACTGGCTCGTGCGAATCGTCCCTTTTCTGGCGAAATTGAGGATTAGCACTCAATAAAGAGGTTATCTGCCAGTCGGGGACTTTAGGCTACCCGTGTAGAAAACTTTGGAATAACTGTCTTATCAAGTCCACACAACGTTTCCAGCACAAAGTAAAAGGCGCTCGGCAGGCGGTAGACCTCTGTAACGCCTTTTCTTGGGCTTGCGCCGCGGATCAAGGGTGGTACAAGGGATCAATTAGTTATCACTCCGCCATCCCAAAAATTGTTTCACAAATATTCCAAGATTTAAATTAGGGTTCTACAAACCATTGATCGTCACCTTACTCAAAACGTTTTCGCGGTCCGACCGGTGACTCTCGTAGACCAGGCTGGATAAAGGTTCCTCATAGACGTCTCGTCCCAATGGAGACTGACCAACTGGGCCCCCTCGCATATTCTGTCGAGAGGGGAGTCGGGACTATTCTGTTAGCTCTGGGGACGATTAGACGAATGCCGTCCTTTCAAATATACATCCAATGGCGGACGAGCGTTAAAGCGGCGATACAAAATTTGACCCATATTTGAGGCACCGGAGACAAACCTGAGTTTCCGGCCGCTTGAACACAATGGTGAACTTTAAGGAAGTCCTGATCGGCCTCCCGTACCTATACCCGGGATTTAATCGAGGTAACTCCCCAAATTGATAGAGTATGGCTTTAACGGGCGAACAGGCTAGTCTCTAACGCGGGGATCTAAGTCGCTGGTGGCTACAGAGCGAGATTCGGCCGAAGACTTTGCGATTCCTGACTATAGAAAGCACTTAGTTTATCGGAGCTCCCTTTGGCCTGAGGAAATCAGTCAATGGTAAGCTCGCCAAAACCCCAGGGGTAACTTCCCTAGTTGGCCCTGAAGAGGATTGATGTCGAGTTCTTACGAACGATAAATTTAGATCTTCTAAGTTGTCGGCCATAACCTTCTGGTCAGATGTCGGGACTGCCGTATAGTTGCCCAATCAAGCTCGCTTTGTCGCTGTGTACGCAAAGTGTCAACGGTGGTTAGGAAGTCGCAACCGGCATTTGAACCGACGGAAGTCACATTGAGGTCTCCAACATGGGCGCTACACAGGCTCTTTATAGCATTTGTTGGTTCCCGTGGGTTTGATCTCTTGAGTGGTCCAACCATCCACGTCGGACCCATCCTATTTGATGGTACCCTTCCTCTAAATTAGCGCGCGCGTGGTTGTTGCCAAATGAATGACTCTGAACGACGGTAA
CAGCCATGAAAAAAATGCGTCTATACCTCTCGACCCGTTGGCACGGATGGGCCTACCAATGCAAAGATAGCCATTGGGGATAATACAGGATTGGTCCCAGTGACGCTCCCAACAATGGCTTTTAATTTCCGGAATATCTGGGCCCGGTTAAGTTAAGTTGTAGAGCCTGGCTTTAATCAGGATTAGACCTCGGTTCCCCTGTTGCGCGGAAAAATGACCTGATGCTAAATAGGTGGCGGATAGGGTTGCCACGGCGAAGGAGCTCGTAATAGAACCTAATACAAAGGGTACGTTACCCATACCCCAGATGCCCATCCTTTGCCAACATCTTCGGGCCCGCACACACTCTGACGTGAGACTAACCCGGCGTTAGGACCAAGCGAAAGCGTCAAAGACAGTTTGGAATCAGCGCGAAAAGACGGTGCGAGGGTAATAATGGACCCTCTTCCTATATCTGAATGAGAAAAGCTGGGTAGTACCAGTGTACTCGTAAACGGTTGCATCTCCTAGGTAGCCCGGTAAAGGAGTCTTCTCTTCTGACTACGTAAAGTTGGGGCTTCGGAGTGATATATTAGGAGCCTTGCTGGGCCGGGGAACACCACTCTCATCCCGGACGTCACTTGAGGTAACACCGCTACCCATACGAAATTTAGTGTTTGAGATGACTTTGACTGTTTTTTAAGTTGGTTCGCCAGATTGGCCGATGAGTGCGTGTAGCGTTGCCTTAAAAAGACTTCTGATCCCGGATGCTCGGTCAGCCTCTCAGAGGGAATGTTCGTCGTGATAAAATCATTGGACAGCCTGCTATACCGGCATATCAAACCGCATTGTGTGGGGAGGTCGACGCCCCCTCAAACTTGGGCGGGTGTAGAGTGCGCCCCTCCGTACCGATAAATGCGGCCGAACTGGACTTCGTCACTAATACCAGAGAAACTTGCCCGTGATAATTTTGAGTTCATTTTCGCCAAGTTGCATACTTTCAGCGGGACATGAAACCCAAGCAAACTACGTTGCATTGAAGGGGATTAACGGAGACTCAATTGAGGCGAACTCTTAGTTGTATACCGCGAGTGGGGGAGAAGTTAATAGTAAGTACTAACTTTAAACTTCGCCTGGTAGATATTCCTTGCGACAAGGCCAAGGTTCCAGCGGCGTAGCCAGGCGTACATAGTCAGCAGGCAAATCGAATAGCTGGGGGTACAGCTTAGATATCTGCACCCAAATATCTTCATATTACACGGCGCCACAAAAAAAAACCGCCGATGGCTTCCAAAGCTCAGGAAATGATCGAGGCATCAAGAGGCTGCGTCCAATAGAAGCCTTTATCGGTCTTTTCGTATTAATTTCTTCTCCGTCCATGGCGTGGACCGCGCAGTGTATTCCAGTGACCTTGCCAGTGGGGAGAGCCCATGACTGGTGAATTGCGCACTATTGACTGGTAGAACACGGTCGAAAAGAAAGTAATCGAAATATGGTAGTACCGGTAACGGGAGACGTCGCAGCCTAATCTTCCCTGGGGCATGAGATAAACGGGCCTTACCCACGGAGGTAATAAGCGAGTGCCCCCCCAAATAGTCTTTTGCTTTCAACGGAACAAGTCCCGAGTAAACTCATAAGAGGTGCGGTGAGCAAATCTCTCCCCCTGATGTATGCGTTGGATCAACAGGTTTTGGTTCAAAGGCTTTGAAATACCCCCGGTGCTAGAATCAGGGGGCCTCCGCTTAGTAACCCATCTGACCATCCTAACTCCCGAGAGCGTGGGCGCGCCTCACACTACTGTAGGCAAAAGTCTCCGGCAACATGATCACCCACGTTGTCCGTACGGTCAGGAGTCTCCCGTTAAATTCGTTCCGGCCGGCACCACACGGGCTTTGCACTCCCCTCCGGCACCACTTTGACAAAGCTGTAAATAGGGAAAAGGAGAAGTGTTTTTCTGCGGGCTCCATTACTTACTCACTGTAGACGCTTCAAGCCTCGACTTCATGAGACATTAGCGGGAA
GGCATAAGACCCATGACTGCCCCTCATATAGGGTCGTCAATATGGAGCATAGGATGCTGGCTGAGGGTCAACGCCCTTGCAGGGCTAGAATTAAGACGAGGTATTTTATTCTACGTGGCTCTGCCGAAAATGTTCAAAGCCCCGAACGGGCAAACCCTGATTGGGGCGGTTGTAATCGGGATCAATGACTAGTAGCTATCGGGATCGCCAGTTGCCATTTGTGATCACTCCTCCGGGGTTAACTCACATGTAAATACAAGAAGGGGTTTGCCAAATACTGAGTGACGGGAAGTTGAGGCCACGGGAATAAGAAGGAAGAAACGCCGTCACCATCGGCCCCCTTATGTCGTAGACTTCAATGTTCGATCGTTAAAGCTCTTGTCGGTTATCCCTAGAGTATGGTAAATTTCCGTAAAAACGTCACTTCGCCGTTTAAGGGGCGTCTGGTGGGGGTAGCGACCCGGGCATATAATATCCTATAGCGACCGACGTTTCGATAAGCTATAACTAGAAATGTTCAGCCACTCGCGGCTCAGCGACATCACACTCTGCTACAATAGAATTCGCTAGCATCCCCCGGAATTTTGCCTCCAAGATCGTGTCTAGGACCTGTCAAATCCTGAGTCTATCATTCTTTGATCCTGTCTGTGCGCGTGAACGCCGGGCAGTAGAGATATTTCGATCTGTTTTATTCTTGAACAGATCCTAAGGGCCTAAATGAAAGACTTTTCTAAACATACTACTAAATCACACCGTTACTGTCTAGGTCGCGACACGCATGGTTGGGCGTAAGTCCTCATGTTTGGGTTACCACGGTGCCGTACTATGCCGGGTTACGGAGCCTTATAAGTGTCCCCCCGCC  /..*%,.&.).)%(,*&.-*,*)+."',-.,.,-".$+((+-+,.$.$..(.(*-().+-%$**.+/*&%..-'#..,,,..%-'%&.,$).%+..-(')-%-&(--'..,..-/'-(),*'*&&(**),.+-'&,)(.+*)-,-/*,$)+%%.&..$(*&'-(.((%-+,')#,*(#......",+*%'.-%).,.*.'/.++!.+-$+(*.-/.$,.%-./...".,(---&--'.-$+..+'.-*#-,#--.---,-++.*',,).',./.&-*&,..%,*-,-/%-++&*$-",'--,&*.*-*.-.(-#,(.-+,-),----,....#%#)''&((#&.%-+,.-+.,-,,.**$-'+.-.-*+,.*-$-"--$,-,#)-"-#$&'..&.)(-*+!#.-)**,-.$,--$$),'(*."+-&--%+%*..-%,-,%(.,....*-.,$-.'%%(,.)-.%-")"*--+).'.+..(-/.$-)*",--&.'&.&../&-%$-#'*+"(+'.,.%,(+%,+.-$&,-,$)-).-.-)$,-.+-(%..)/+,.+&-&.$....(,+,#,&,.&.%+-*.'/%)%.-'%/-).,.*,+%#$,$,+/--..-&*'-.-,.-''++,-,-+-,-,%&&--,.$&*$.+.%-/,).'..%(+...(,-&#.".(,-',.-,+*-*"-,').&--%'&$.&.%.+,/$(*+-.--#*#$+.-.!',%''(!+.,(..-%,'',,,**,%(%*&,*.--#'$.,$)$...-("-)(,)%$)*'-'%,..+''..+-(-".,(,.../+-+-/".,,-*%#++*",')/.%+#$,.-/)*&--*..*+..&%(-#".&'-,,++)-,.#-.$-..)#)-.--$*&-.--.,..-'(,--)-#(.+$.%*%.+.)#(.+#*.*,(..$*$.$%(&.)-..(.-)+&+-%,-..)/'(##"!/+-(.$(.+,.(%+,#,,--.....,.)+&%(-.-*+.,*,*&-+,)-*+&-#.+*./$(+-).#).,-&+&+%-(+**.%,+*".(,)...)(.$..-''(,.+'%-,---.-../*.$/.('-,,)...*-*-'*))/#,.%,-$&("(-*$''$'*%-(.+---)",*,)"*$+*),&.
*..--),%..,-*.-(.&.."#)#-#.&$-*.)-*#-)(-.-+&+(%.+.-*(*$".%,-++.-)-*+--%-+,,,.).!*--.)+*+,&'),'-&.&$&.$%(*.--,%-)-.,..*-,+('+.).#..).+..".)-..',.$.&-,.!%..%.+&-$--,..-.%.-$#.,$+%$,(-.,&(..".(+%+.-)/.,%%&.*&,$.-''.(%--'$"#+*++.+*.-++('.'-'$--,*+'&,*!%.,,..*%(-$#-.%-)-*.-..*$/),.+&'$--+#--#((.**.+&-&,**.-.,....#,.%&..,.'-%.$*.%).&("%#,*).,.$.--.-"--*,*!-,'.,.*&-(%-.+),,(""&..,&()'+/-),'+,-%-..&-..,.+.)-,+"..&*+#-.)+--.++.,-.(--.-..+-,$,**$+-+-."%%*,...*.'%)-.(-(+.%.&'"''..-+*%%&#$&)&.+,-(.%.)$.#$-&-)&%,-..)+%--$--*.#'-*(.*..$(*-,#".)-(*#+,*-*/$.-,).-,-.*.,,'""/--%-.-$%)&-*-..+.,.,.".)*++'(..+-.*,",()-!*%(-%.,&+).%..*-&%-'-)#),,-&-(.,+*&(%(*"(&.+%-*(.").%$%(.-.&,).)*-)(.*-!//.-&,.,..#.)',)#.$,.---+$,*-&$..+%-.&+,*.%&'',./-,#-.(..$--**%-+,-.,'",(,,*&#.$...-".*.--.,+(..(.+./%-..%..*-((&##,-.(+-.+.)$%+,%.*(.+-..-+#+%&'*++.%,*/.-.%#-.)*&.+,$&(--."%--+...-,!.)-&(,/%)&-..+,,.,.)++#.').).$-"#*(.%)*..','..)/#&,.)),.,,++-(-+.#'.-.,(--*"'+,*"+-(+.#&#+*-+(,+-..-+"%.&.!,'('''%%(,..,((*--%','-.-(+.+-),,.$.$",+-(.%-',&..(**-,-..+-",--/,-))./#.,..%&+!&%&,*%.%,&,+--(.)##-,**,+*&/.$&'$,,$-$.)...'.'.-+')-+/-.+..-*%(,+.&-+%".-"---)**#,-$,*+.",**--+#,(-/+*.$*.*-.+,-'#.#,,'&$.(.".)..*,&#+(.%+.&.-+-,)'!-+*&),$'$-#-.+"+,.),.+.--.)./-,-&-*#.+,#'%','-%&--..%(,'..++.-$*!-,..*..',.*..+/&.+#*+-(-*..%,*%,'.#)&$,-('')$!.)%+--,-.!-(./)-(+-(,)'.%,.*-%)(+.+%-+.')'"$/*)&-.*+./(.&..-#,-),-,...,$'.(%#.)%,,.#+-*).,)#(.-.-(-(.--*-!*..-..,**'*)'--*).*.&.-+'+-#)/,..,).#...-.--*#+.".+.).(*+!+--&.&*'-)'.-+&-(.,%%$#-,!*/+)%.&+%++"&$..#+%+%*-.%*'-,...,-.'%)-),'.,!,--.%+),().&*.-%)$'/#,-,#&---++*.+.).$..($--*-+#-'"*,++-+.,*&----.&+.!.,,'&#.$(..-,.)'-.-,.-,(,")(,(*.%.-/',.)--.#-,-,-./"$,.-&*.''(-,..&.-..(-+--**.,,--.+-.,.+-..$#%(.+%+-$(-&.++,)"..,.)&-!)',-*$**)$,).+,.,+.-*/.,(--..+--.,%$-.*,,&'-+'-&'%,,'$.)%'.,-..-.,-".,)-*.).-,'-+'"/""))'-),.$--$--(.).),$#*+,*,.%-..'.,+)),./.+#+.)**++.+*#."+-,+.,$*,/)&-)%+&.'(*.'%,-#(%.(.$.#%#-&/.+'..($-.,--*.!#..)-."+*,,,.*..*$(,*-*%./&-.#-',-%.#--&-+-.($.#-++.)''--++..--)$,+%.(-
)-)#...**!..'*)&,,-%(.,,-(+*-+*.-.#%'-*&.--*,--/+$,+.,#&+%-,"!$.$&#-,)--.)/,.+,,%.,-),.--)+&-(..!,,"*&&+."-&#'-"*()+-+&.+..#)-+..$''.-'((-.---)./-."',../%",%((!#+..*!/.'$%.,-**..+.%$".-)..-#.+..$-..%%-/.-&(-)#,*(',-,.--.,,.*'.,.).%''.-$#%$'%-,.)-#,-.%(--..--,,.&'.,.(%&*%(-.-%,.#$+%%.-#.++$.,.,/+---(-&&"+#&(&,#(*,.),%,-",'+!$).$,"-.-,.-&%."#+%#&-*-/.+,.-,",)!&)+*-$!%-,($*.,*,.--..(,%-.-"-($$-.+*..).()+,)*(.+--)--*%+,+%.,-+&)-),&+*,(..'.-)%$"$/-*--&.%++.&.')..,&...+,,'%&(&.---&&(&)$"--.).%.',+#..+*#.)-$.$)+,.(..*..*$..-+%.-,'(.,)(.+-$/,'-)*,.'.$-*%-$.'*%(#&.$*-%.,*!(.*+*.,&(.+.+-,..%.-(.*..++%(-$+-,*-..,'(.)-).**.-)-&....*.'/().-'*')##)+..%-+*,#*.)..!+.---%*#-$&$&-$,.,.,/.'"(+.-.'.-,$,'.,.#,*--..'%%-.&"+(*,$.%*".%,-+%*'$$,-+---'.(+-+....,,..-.(+"--.-,"..*-')-,--$.*$-.*----%-+%,..--%&*.)#-*%'!-.(#--..)--..#,..+-.*+#%).$(,.#&*..),-$''..'.*,.,.#,)...*$&,-,..*,--(+,*-%+*%'.-+.-.).&.'*,,')-#-%+,($.)#.$-%+%*-%+&$!$&+-*)-*&&-&&'(.-))-*)-).)#/&,"+",*.-.-&(+.-*.(-*.+#*&-..)--+.#$".#.$.-*-'-('(&,'&%-"'$'-*.#-")'&.-".'-(-.+#.-/'-).'...!!.....,&+)(.$!,-)",.+.-%(-.-%.&.(&)/**&,.+-'+.-,+..,(/.+.).,-*'..-#"*'#-$'--.../)-./,,.%-,%--*.(.-,+-.(.$-*-",.(#%-)(.#&+'),,)-$".-./.-..(*$*#(.)+%.+*-&...,,*,*,*))/.'+,.$,+-..,(,.(!$'+..#"&,-.--(&---(-+&,+*+#,.$.+(.$+&,.*%&+.&.+&.-.%).)-..'.,.-*%.,+..&.-+#".*.-+,-.(-.$*.&&(.,*)+*.+"-/..%.),(!'(+*&&-%.*$.#.)+!.).,)#+)-&!'+**(."#+,*+,$-+-.-*"$/+#.-%(+./&$%(.'+.-,/----$,&.--)&.'.-#$"-.+  NM:i:748    AS:i:14813  H0:i:1  ZE:f:0  ZF:f:0.697229   ZQ:i:4562   ZR:i:125137

And that CIGAR seems correct.

There are a few little differences in the alignment which aren't relevant, but I think I've found the real problem at this point, 1991 characters into the CIGAR:

  • Current version (wrong): ...24M...
  • Older version (correct): ...2M3I19M...

There are 3 insertions that the current version seems to be incorrectly calling matches. This means that everything after this part is shifted by 3 bases and is therefore full of mismatches.
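To make the shift concrete, here is a minimal sketch of how one might step through a CIGAR string and count the read and reference bases it consumes. The function name `cigar_lengths` is hypothetical and is not part of GraphMap; the rules for which operations consume which sequence follow the SAM specification.

```python
import re

# One CIGAR operation: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_lengths(cigar):
    """Return (query_len, ref_len) consumed by a CIGAR string.

    Hypothetical helper for illustration; not GraphMap code.
    """
    query_len = ref_len = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MIS=X":   # operations that consume read bases
            query_len += n
        if op in "MDN=X":   # operations that consume reference bases
            ref_len += n
    return query_len, ref_len

# The two fragments quoted above.
print(cigar_lengths("24M"))      # current version
print(cigar_lengths("2M3I19M"))  # older version
```

Both fragments consume 24 read bases, but the current version consumes 24 reference bases where the older one consumes 21, which matches the 3-base shift described above.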

printing duplicate (I think) lines

Hi,
I'm using graphmap (0.3.1) align with the -Z option and occasionally get the same alignment printed twice. I've attached the reference, example reads and the output in SAM format. The command used was:
graphmap align -Z -t 1 -r Ce_july12_2d.fasta -d dup.fq

Any ideas?
Thanks
James

Ce_july12_2d.fasta.gz

Turn terminal deletions into soft clips

I have a bunch of sequences that get aligned with leading deletions (trailing ones not observed), e.g. 50D740M1D1M1D2M. Terminal indels don't make sense in an alignment (after all, how do you distinguish between a terminal insertion and a terminal deletion?). They should be soft clipped.
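As a rough sketch of what such a clean-up could look like as a post-processing step, the snippet below rewrites a POS/CIGAR pair. It is not GraphMap's behaviour, and the function name `clip_terminal_indels` is hypothetical. It assumes one particular convention: a terminal deletion carries no read bases, so it is dropped (a leading one also advances POS by its length), while a terminal insertion is turned into a soft clip. It also assumes the CIGAR has no existing clipping operations at its ends.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def clip_terminal_indels(pos, cigar):
    """Return (new_pos, new_cigar) without terminal I/D operations.

    Hypothetical helper; the convention used here is an assumption.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    lead_clip = trail_clip = 0

    # Leading indels: deletions move the start, insertions get clipped.
    while ops and ops[0][1] in "ID":
        n, op = ops.pop(0)
        if op == "D":
            pos += n
        else:
            lead_clip += n

    # Trailing indels: deletions are dropped, insertions get clipped.
    while ops and ops[-1][1] in "ID":
        n, op = ops.pop()
        if op == "I":
            trail_clip += n

    if lead_clip:
        ops.insert(0, (lead_clip, "S"))
    if trail_clip:
        ops.append((trail_clip, "S"))
    return pos, "".join(f"{n}{op}" for n, op in ops)

# The CIGAR from the report, with a made-up POS of 100.
print(clip_terminal_indels(100, "50D740M1D1M1D2M"))
```

With the example CIGAR, the leading 50D disappears and the start position moves 50 bases to the right, so the record begins at its first aligned base.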

Doesn't compile with clang

The default compiler on Mac (clang/LLVM) is not supported. (I know: make mac sort of circumvents this but in theory there should be no need to install GCC in addition.)

$ make clean

$ make
Makefile:66: "*** WARNING g++ minor version <7 ***"
mkdir -p obj_linux/src/alignment/
g++ -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"libs/libdivsufsort-2.0.1/build/include" -I"libs/seqan-library-1.4.2/include" -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread -o obj_linux/src/alignment/myers.o src/alignment/myers.cpp
clang: warning: argument unused during compilation: '-static-libgcc'
clang: warning: argument unused during compilation: '-static-libstdc++'
clang: warning: argument unused during compilation: '-fopenmp'
mkdir -p obj_linux/src/alignment/
g++ -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"libs/libdivsufsort-2.0.1/build/include" -I"libs/seqan-library-1.4.2/include" -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread -o obj_linux/src/alignment/cigargen.o src/alignment/cigargen.cc
clang: warning: argument unused during compilation: '-static-libgcc'
clang: warning: argument unused during compilation: '-static-libstdc++'
clang: warning: argument unused during compilation: '-fopenmp'
In file included from src/alignment/cigargen.cc:9:
./src/utility/utility_general.h:108:10: error: no member named 'iota' in namespace 'std'
    std::iota(begin(indices), end(indices), static_cast<size_t>(0));
    ~~~~~^
./src/utility/utility_general.h:120:10: error: no member named 'iota' in namespace 'std'
    std::iota(begin(indices), end(indices), static_cast<size_t>(0));
    ~~~~~^
2 errors generated.
make: *** [obj_linux/src/alignment/cigargen.o] Error 1

$ g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

Crashing with -C circular reference parameter

Hi there,

I am using graphmap commit 84f058f to map a set of PacBio RSII reads to circular mitochondrial genomes from NCBI (for partitioning and subassembly of mitochondria from a somewhat dirty sample of a wild-caught specimen).

With the following invocation, graphmap crashes almost immediately:
/lab/solexa_weng/testtube/graphmap_main/bin/Linux-x64/graphmap align -r head3_mito_reference.fasta -C -d head3_reads.fasta -P -o test.sam

However, without the -C parameter the mapping works fine. I've attached the minimal set of data needed to make graphmap crash (3 mitochondrial genomes and 3 reads), as well as the log from the crashed run.

I've also tried the most recent commit of the "dev" branch, but it still crashes.

Thank you for your work on graphmap!

graphmap_circular_crash.zip

installation problem on debian 8 (jessie)

Hello,
I'm new to Linux and I think I'm having problems compiling graphmap. My Linux distro uses gcc 4.9, so after downloading the source files I ran $make modules and then $make, without specifying which version of gcc to use (unlike in the INSTALL.md file). There was a lot of activity on the screen. Afterwards, the $graphmap command was not recognised and no green folders were created anywhere I'd expect. Perhaps I am missing some dependencies. Any advice would be greatly appreciated.

compilation problem

I'm getting the following error (gcc-4.8) on a Mac:
ld: warning: directory not found for option '-Lcodebase/seqlib/src/libs/libdivsufsort-2.0.1/build/lib'
Should it be codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/ instead?

Segmentation fault when processing reads

Hey, I am getting a segmentation fault when using the latest version of GraphMap (e8d6100).

I am invoking GraphMap using the command:

./bin/Linux-x64/graphmap align -C -r EhR2_v2.fasta -d all_subreads.fastq -o graphmap_out.sam

I have run the command many times and find that the program aborts on a different read each time while processing reads (i.e. sometimes it segfaults at read 0/141336, and otherwise at seemingly random reads between read 0 and around read 40/141336).

Here is my output.

[15:26:29 Index] Running in normal (parsimonious) mode. Only one index will be used.
[15:26:29 Index] Index already exists. Loading from file.
[15:26:30 Index] Index loaded in 0.17 sec.
[15:26:30 Index] Memory consumption: [currentRSS = 259 MB, peakRSS = 259 MB]

[15:26:30 Run] Automatically setting the maximum allowed number of regions: max. 500, attempt to reduce after 0
[15:26:30 Run] No limit to the maximum number of seed hits will be set in region selection.
[15:26:30 Run] Reference genome is assumed to be circular.
[15:26:30 Run] Only one alignment will be reported per mapped read.
[15:26:30 ProcessReads] Reads will be loaded in batches of up to 1024 MB in size.
[15:26:35 ProcessReads] Batch of 141336 reads (1024 MiB) loaded in 5.22 sec. (12151080 bases)
[15:26:35 ProcessReads] Memory consumption: [currentRSS = 1334 MB, peakRSS = 1334 MB]
[15:26:35 ProcessReads] Using 24 threads.
[15:26:35 ProcessReads] [CPU time: 5.24 sec, RSS: 1334 MB] Read: 8/141336 (0.01%) [m: 0, u: 0], length = 8250, qname: m150916_104438_42215_c10085833255000000182...Segmentation fault

The last line of output differs each time (i.e. a different read is quoted) the program is invoked and subsequently segfaults.

Could you possibly help with a solution?

NB: our reads are pacbio reads.

installation on i386?

My compilation gets terminated with : /usr/include/zlib.h:34:19: fatal error: zconf.h: No such file or directory
#include "zconf.h"

My guess is it is because I am compiling on i386 system.

Do you have any advice to fix this?

Thanks in advance.

CIGAR and MD field mismatch

Hi Ivan,

I am running graphmap on nanopore E. coli reads and encountered a problem: the CIGAR string and the MD field don't match.

The total number of deletion (D), mismatch (X) and match (=) positions in the CIGAR should equal the sum of the integers plus the number of bases [A, C, T, G] in the MD field (ignoring the ‘^’ and 0).
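The check described above can be sketched in a few lines of Python. This is an illustrative sketch, not the attached CIGAR_MD.py script; the function names are made up for this example, and it assumes that skipped regions (N) are not represented in the MD field.

```python
import re

def ref_len_from_cigar(cigar):
    """Reference bases covered by the M, =, X and D operations of a CIGAR string."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in "M=XD")

def ref_len_from_md(md):
    """Sum of the match counts plus one for every mismatched or deleted base."""
    matches = sum(int(n) for n in re.findall(r"\d+", md))
    bases = len(re.findall(r"[A-Za-z]", md))  # the '^' marker itself is not counted
    return matches + bases

def cigar_md_consistent(cigar, md):
    """True when CIGAR and MD describe the same number of reference bases."""
    return ref_len_from_cigar(cigar) == ref_len_from_md(md)
```

For example, the CIGAR 5=1X3=2D4=1I2= and the MD value 5A3^GT6 both cover 17 reference bases, so the pair is consistent.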

However, I get a mismatch between them for a few files (most work fine). The zip file below contains two sample files:
nanopore_reads.zip

I used poretools (0.5.1) to get the fastq files from them.

Reference: ecoli_k12.fa (https://github.com/isovic/graphmap/files/448448/ecoli_k12.fa.txt)
Query: ./bin/Linux-x64/graphmap align --extcigar -r ecoli_k12.fa -d fastq_file -o output.sam

Here is the code I used to check this:
CIGAR_MD.py.zip

Thanks
Parithi

Cannot build reference index

I am attempting to test graphmap using some human reads we just generated with a Minion.

I cloned and compiled the software as per the instructions and should be running the latest version in the master branch:
Version: v0.3.0
Build date: Aug 11 2016 at 17:06:03

I also downloaded the latest human genome build ( GCA_000001405.23_GRCh38.p8_genomic.fna from the GRC -- ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000001405.23_GRCh38.p8).

I tried to map and just build the index with no success:

  1. Build and map in 1 command:
    $ graphmap align -K fasta -r /grc/genomes_db/GCA_000001405.23_GRCh38.p8_genomic.fna -d gt50k.fasta -o test.sam
    [17:27:37 Index] Running in normal (parsimonious) mode. Only one index will be used.
    [17:27:37 Index] Index is not prebuilt. Generating index.
    [17:27:37 LoadOrGenerate] Started generating new index from file '/grc/genomes_db/GCA_000001405.23_GRCh38.p8_genomic.fna'...
    [Fri, 12 Aug 16 00:27:37 +0000 FATAL] #5: Unexpected value found! Input sequence file format unknown!
    In function: 'LoadSeqs_'.

[Fri, 12 Aug 16 00:27:37 +0000 FATAL] #5: Unexpected value found! Input sequence file format unknown!
In function: 'LoadSeqs_'.
Exiting.

  2. Try to build index only as per the documentation:
    $ graphmap align -I -r GCA_000001405.23_GRCh38.p8_genomic.fna
    Reads file does not exist: ''

For detailed help, please run with -h option.

Example usage:
./graphmap align -r escherichia_coli.fa -d reads.fastq -o alignments.sam

GraphMap (c) by Ivan Sovic, Mile Sikic and Niranjan Nagarajan
GraphMap is licensed under The MIT License.

Version: v0.3.0
Build date: Aug 11 2016 at 17:06:04

multiple matches

Dear,
I have a question about graphmap output. We are trying to align two identical files to each other and identify secondary alignments. By this I mean that we use a single fasta file as both database and query. We expect to get one best hit of each read to itself, plus secondary alignments.

How can we get the software to report all the alignments irrespective of the identity? Right now, we only get the self alignment of the read to itself and no secondary alignments.

thanks
Luigi

extra newline

Hi,
I've been able to align using:

Linux-x64/graphmap align -C -r genome.fa -d 1Dreads.fastq -o chr.sam

However, when I try to create a sorted BAM file, I receive "parse error at line X" messages. Looking into the SAM file, there appear to be empty lines between SAM records.

Possibly an error in the code that inserts this during output?
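Until the output is fixed, the empty lines can be filtered out before the file is passed to downstream tools. The following is a minimal workaround sketch; the function names are made up for this example and are not part of GraphMap.

```python
def drop_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line

def clean_sam(src_path, dst_path):
    """Copy a SAM file from src_path to dst_path, skipping empty lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(drop_blank_lines(src))
```

Calling clean_sam("chr.sam", "chr.clean.sam") would write a copy without the blank lines; the output file name here is only an example.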

case sensitivity for reads

Hi Ivan,

I tried to map a (PacBio) FastQ file with lower case reads (produced by dextract) but none of these reads were mapped by Graphmap (as opposed to Blasr). If I uppercase them, all map. I think I saw an uppercase function for indexing of the reference. But what about reads?

Andreas
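As a stopgap, the reads can be upper-cased before mapping. The sketch below assumes a plain four-line-per-record FASTQ file (no wrapped sequence lines) and leaves the header, separator and quality lines untouched; the function name is made up for this example.

```python
def uppercase_fastq(lines):
    """Upper-case the sequence line (2nd of every 4) and pass the rest through."""
    for i, line in enumerate(lines):
        yield line.upper() if i % 4 == 1 else line
```

Only the sequence line is changed, so lower-case characters in the quality string keep their meaning.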

Recompiling libraries fails

Running 'make deps' fails with an error on both OS X and Linux, perhaps due to a nonstandard install of cmake. Other people have resolved this in other cases by replacing cmake with CMAKE_COMMAND.

[dmulder@samwise graphmap]$ make deps
cd libs; cd libdivsufsort-2.0.1; make clean; rm -rf build; ./configure; mkdir build ;cd build; cmake -DBUILD_DIVSUFSORT64:BOOL=ON -DCMAKE_BUILD_TYPE="Release" -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX="/usr/local" .. ; make
make[1]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1'
Making clean in examples
make[2]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/examples'
rm -rf .libs _libs
rm -f suftest mksary sasearch bwt unbwt
rm -f *.o
rm -f *.lo
make[2]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/examples'
Making clean in lib
make[2]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/lib'
test -z "libdivsufsort.la " || rm -f libdivsufsort.la
rm -f "./so_locations"
rm -rf .libs _libs
rm -f *.o
rm -f *.lo
make[2]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/lib'
Making clean in include
make[2]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/include'
rm -rf .libs _libs
rm -f *.lo
make[2]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/include'
Making clean in .
make[2]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1'
rm -rf .libs _libs
rm -f *.lo
make[2]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1'
make[1]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1'
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking whether make sets $(MAKE)... (cached) yes
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1966080
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for ar... ar
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... no
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for ANSI C header files... (cached) yes
checking for inttypes.h... (cached) yes
checking for memory.h... (cached) yes
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdint.h... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for strings.h... (cached) yes
checking for sys/types.h... (cached) yes
checking io.h usability... no
checking io.h presence... no
checking for io.h... no
checking fcntl.h usability... yes
checking fcntl.h presence... yes
checking for fcntl.h... yes
checking for uint8_t... yes
checking for int32_t... yes
checking for an ANSI C-conforming const... yes
checking for inline... inline
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for fopen_s... no
checking for _setmode... no
checking for setmode... no
checking for _fileno... no
configure: creating ./config.status
config.status: creating Makefile
config.status: creating include/Makefile
config.status: creating include/divsufsort.h
config.status: creating include/lfs.h
config.status: creating lib/Makefile
config.status: creating examples/Makefile
config.status: creating include/config.h
config.status: include/config.h is unchanged
config.status: executing depfiles commands
config.status: executing libtool commands
/bin/sh: cmake: command not found
make[1]: Entering directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/build'
make[1]: *** No targets specified and no makefile found. Stop.
make[1]: Leaving directory `/mnt/work1/users/home2/dmulder/bin/graphmap/libs/libdivsufsort-2.0.1/build'
make: *** [deps] Error 2

pacbio parameter set

In the README you state having parameter sets for pacbio, but when I try -x pacbio, I am told they only exist for illumina and oxford. Is there a recommended pacbio set of parameters?

Release tarball 0.3.0 doesn't compile

Downloaded the release tarball (from GitHub, i.e. not cloned from the repo) and tried to compile, but certain headers are missing (see below). make modules doesn't work either because the release is not a git repository. Cloning the repo, then running git checkout v0.3.0 and make modules; make works fine.

It might be worthwhile mentioning that the GitHub-created release tarball is broken and shouldn't be used.

Andreas

GCC=/opt/gcc-4.7.2/bin/gcc make -j 4
mkdir -p obj_linux/src/alignment/
/opt/gcc-4.7.2/bin/gcc -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/"  -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/src/alignment/alignment.o src/alignment/alignment.cc
mkdir -p obj_linux/src/alignment/
/opt/gcc-4.7.2/bin/gcc -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/"  -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/src/alignment/alignment_wrappers.o src/alignment/alignment_wrappers.cc
mkdir -p obj_linux/src/alignment/
/opt/gcc-4.7.2/bin/gcc -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/"  -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/src/alignment/anchored.o src/alignment/anchored.cc
mkdir -p obj_linux/src/alignment/
/opt/gcc-4.7.2/bin/gcc -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/"  -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/src/alignment/cigargen.o src/alignment/cigargen.cc
In file included from src/alignment/anchored.cc:8:0:
./src/alignment/alignment_wrappers.h:19:37: fatal error: utility/utility_general.h: No such file or directory
compilation terminated.
make: *** [obj_linux/src/alignment/anchored.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from src/alignment/alignment.cc:8:0:
./src/alignment/alignment.h:19:39: fatal error: sequences/single_sequence.h: No such file or directory
compilation terminated.
In file included from src/alignment/alignment_wrappers.cc:8:0:
src/alignment/alignment_wrappers.h:19:37: fatal error: utility/utility_general.h: No such file or directory
compilation terminated.
make: *** [obj_linux/src/alignment/alignment.o] Error 1
make: *** [obj_linux/src/alignment/alignment_wrappers.o] Error 1
In file included from src/alignment/cigargen.cc:8:0:
./src/alignment/cigargen.h:19:24: fatal error: libs/edlib.h: No such file or directory
compilation terminated.
make: *** [obj_linux/src/alignment/cigargen.o] Error 1

Effect of -F option

I noticed odd behaviour with the -F option, and I'm not sure if this is a bug or if I'm misinterpreting the algorithm. My understanding was that a higher -F value would potentially lead to more alignments as it lowered the score threshold, and a value of 1.0 would potentially give the most alignments, effectively not using a score threshold. But I've encountered a case where increasing the -F value reduces my alignments.

Here are some example files to replicate what I'm seeing.
contig.fasta.txt - 3 kb reference
reads.fastq.txt - two reads

This command produces two alignments (one for each read):
graphmap -r contig.fasta.txt -d reads.fastq.txt -o out.sam -Z -F 0.9

Increasing the -F value to 0.99 produces only one alignment:
graphmap -r contig.fasta.txt -d reads.fastq.txt -o out.sam -Z -F 0.99

And an even higher -F value results in no alignments:
graphmap -r contig.fasta.txt -d reads.fastq.txt -o out.sam -Z -F 0.999

I only see this behaviour with the default anchor algorithm. The anchorgotoh algorithm behaves more as expected, where increasing -F value towards 1.0 results in a third alignment (two for one read). I'm using the 2e314e6 commit (most current of the extanchorend branch).

Support BAM as input

Would be nice if Graphmap would support read input from BAM files, with the option to retain existing tags like insertion and deletion qualities for example. Also, the new SMRT Portal / Pacbio versions will produce BAM as native sequencing output apparently.

Segmentation fault

Hi,
Graphmap sometimes terminates before it has finished.
I don't know what is going on there.
Could you help to check? Thank you!

error message
....
....
1 (35.33%) [m: 38765, u: 4570], length = 4567, qname: d2055667-23e2-4a2f-9f31-15[22:09:37 ProcessReads] [CPU time: 16923.34 sec, RSS: 3080 MB] Read: 43355/122701 (35.33%) [m: 38774, u: 4570], length = 15346, qname: d8bdb47f-cc9a-4769-b9bb-1.../var/spool/torque/mom_priv/jobs/487788.statgenpro.SC: line 10: 11686 Segmentation fault (core dumped) graphmap align -t 12 -r C.elegans_ref.fasta -d allreads_ac.correctedReads.fasta.gz -o all_ac2w.sam

compile error: 'pow' is not a member of 'std'

The build fails on Ubuntu 14.04 on the src\index\ source files with the error: 'pow' is not a member of 'std'.

This can be fixed by adding #include <cmath> to src/index/index.h.

Compile error - Control reaches end of non-void function

Ran into the following compile error below:

src/graphmap/graphmap.cc: In member function ‘std::shared_ptr<is::MinimizerIndex> GraphMap::SetupIndex_(std::shared_ptr, const string&, const string&, const ProgramParameters&, int64_t) const’:
src/graphmap/graphmap.cc:215:10: error: cannot bind ‘std::unique_ptr<is::MinimizerIndex>’ lvalue to ‘std::unique_ptr<is::MinimizerIndex>&&’
return index;
^
In file included from /usr/include/c++/4.8.3/memory:82:0,
from ./src/sparsehash/dense_hash_map:102,
from codebase/gindex/src/minimizer_index/minimizer_index.h:11,
from ./src/graphmap/graphmap.h:23,
from src/graphmap/graphmap.cc:11:
/usr/include/c++/4.8.3/bits/shared_ptr.h:257:2: error: initializing argument 1 of ‘std::shared_ptr<_Tp>::shared_ptr(std::unique_ptr<_Up, _Ep>&&) [with _Tp1 = is::MinimizerIndex; _Del = std::default_delete<is::MinimizerIndex>; _Tp = is::MinimizerIndex]’
shared_ptr(std::unique_ptr<_Tp1, _Del>&& __r)
^
src/graphmap/graphmap.cc:216:1: error: control reaches end of non-void function [-Werror=return-type]
}
^
cc1plus: some warnings being treated as errors
make: *** [obj_linux/src/graphmap/graphmap.o] Error 1

Presets stopped working

Presets (e.g. -x illumina) translate into option -w which was removed:

graphmap align -x illumina ...
ERROR: Unknown parameter '-w'.

Andreas

PS: version 0.3.0 commit 1d16f07

Extended CIGAR (sam v.1.4)

Hi Ivan,

I wonder if there is a way to obtain an extended CIGAR string in the SAM output of the current version of Graphmap, i.e. one that reports matches and mismatches separately as = and X instead of M, which denotes an alignment match that can be either.

Thanks in advance,
Amina
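For SAM files that only contain M operations, the extended form can also be derived afterwards from the read and the reference. The sketch below is a post-processing illustration, not GraphMap's own implementation; the function name is made up, `read` is assumed to be the SEQ field, and `ref` is assumed to be the reference sequence starting at the first aligned base.

```python
import re

def to_extended_cigar(cigar, read, ref):
    """Split every M run of a CIGAR string into = (match) and X (mismatch) runs."""
    runs = []  # [op, length] pairs, merged as they are added

    def push(op, n):
        if runs and runs[-1][0] == op:
            runs[-1][1] += n
        else:
            runs.append([op, n])

    q = r = 0  # current offsets into read and ref
    for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(n)
        if op == "M":
            for k in range(n):
                same = read[q + k].upper() == ref[r + k].upper()
                push("=" if same else "X", 1)
        else:
            push(op, n)
        if op in "M=XIS":  # operations that consume read bases
            q += n
        if op in "M=XDN":  # operations that consume reference bases
            r += n
    return "".join(f"{n}{op}" for op, n in runs)
```

The comparison is case-insensitive, and operations other than M are copied through unchanged.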

Additional steps required for building on OS X

Hi there,
In spite of the Makefile seemingly accommodating OS X, make failed for me using Clang, which was no big surprise. Your Makefile also seemingly ignores the CC and CXX environment variables and uses its own variables, which need to be set.

Using GCC 5.3 and OpenMPI installed with Homebrew, these were the steps I used to build graphmap on OS X 10.10.5.

Perhaps this may help others:

export GCC=/usr/local/bin/g++-5
export GCC_MAC=/usr/local/bin/g++-5
make deps
make

Binaries then appear in a bin/Linux-x64 directory... But they are executable.

Best wishes,
Bede

Sam header

There seems to be an issue with SAM headers in commit a697f01.

While converting SAM to BAM, I receive many lines of:
[sam_read1] reference 'U00096.2' is recognized as '*'.

This did not pop up when using the same fastq data for alignment on commit 817ba03.

segfault while indexing largish 16S database

The reference file of interest is gg_13_5_otus/rep_set/99_otus.fasta, which comes with ftp://greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz
It might be considered unusual in so far as it only contains short sequences (16S rRNA; shortest 1254 bp, longest 2368 bp) and all sequence ids are numeric (but unique)
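The properties mentioned above (sequence length range, numeric and unique ids) can be checked with a short script. This is an illustrative sketch with a made-up function name; it takes the first whitespace-delimited token of each header as the id and raises ValueError on input without any records.

```python
def fasta_summary(lines):
    """Return (shortest, longest, ids_unique, ids_numeric) for FASTA text lines."""
    lengths, ids = [], []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            ids.append(parts[0] if parts else "")
            lengths.append(0)
        elif line and lengths:
            lengths[-1] += len(line)  # sequence may be wrapped over several lines
    return (min(lengths), max(lengths),
            len(set(ids)) == len(ids),
            all(i.isdigit() for i in ids))
```

Passing an open file handle works as well as a list of lines, since the function only iterates over its argument.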

Here's how to reproduce the segfault:

$ graphmap  -I -r 99_otus.fasta
[Index 22:42:34] Running in fast and sensitive mode. Two indexes will be used (double memory consumption).
[Index 22:42:34] Generating index.
[Index 22:44:51] Generating secondary index.
Segmentation fault (core dumped)

$ ll
total 6213492
lrwxrwxrwx 1 wilma csb5         96 Aug 19 22:42 99_otus.fasta -> /mnt/genomeDB/misc/greengenes.secondgenome.com/downloads/13_5/gg_13_5_otus/rep_set/99_otus.fasta
-rw-r--r-- 1 wilma csb5 5338573735 Aug 19 22:44 99_otus.fasta.gmidx

Here is a backtrace:

$ gdb /mnt/software/bin/graphmap
(gdb) set args  -I -r 99_otus.fasta
(gdb) r
Starting program: /mnt/software/bin/graphmap -I -r 99_otus.fasta
[Thread debugging using libthread_db enabled]
[Index 22:58:01] Running in fast and sensitive mode. Two indexes will be used (double memory consumption).
[Index 22:58:01] Generating index.
[Index 23:00:13] Generating secondary index.

Program received signal SIGSEGV, Segmentation fault.
0x000000000047a449 in IndexSpacedHash::CreateIndex_(signed char*, unsigned long) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.x86_64 libgomp-4.4.7-4.el6.x86_64
(gdb) bt
#0  0x000000000047a449 in IndexSpacedHash::CreateIndex_(signed char*, unsigned long) ()
#1  0x000000000047465b in Index::GenerateFromSequenceFile(SequenceFile const&) ()
#2  0x00000000004735c1 in Index::GenerateFromFile(std::basic_string<char, std::char_traits<char>, std::allocator<char> >) ()
#3  0x000000000045937c in GraphMap::BuildIndex(ProgramParameters&) ()
#4  0x000000000045d7db in GraphMap::Run(ProgramParameters&) ()
#5  0x0000000000496a85 in main ()

This happens with release v0.21 and also commit 95b9dca

GOMP_4.0 not found

Hi,
I encountered a problem when running graphmap for alignment:
graphmap: /usr/lib/x86_64-linux-gnu/libgomp.so.1: version `GOMP_4.0' not found (required by graphmap)

I compiled graphmap using gcc-6.2.0, which is installed under my home/user directory. The default version of gcc on my server is 4.6.
Is there any way to make graphmap find libgomp.so.1 in another directory? Or how else can I solve this problem?

Thank you!

Can't get tagged version 0.3.0 and 1d16f07 to compile with GCC 4.7.2 and GCC 4.9.3 on Linux

Tried GCC 4.7.2 and GCC 4.9.3 on Linux and tagged version v0.3.0 as well as current master 1d16f07. Errors look similar:

git clone https://github.com/isovic/graphmap.git graphmap.git
cd graphmap.git
# alternatively: git checkout v0.3.0

# modules compile fine
GCC=/opt/gcc-4.7.2/bin/gcc make modules

# but:
GCC=/opt/gcc-4.7.2/bin/gcc make
...
mkdir -p ./bin/Linux-x64/
/opt/gcc-4.7.2/bin/gcc -static-libgcc -static-libstdc++ -m64 -ffreestanding -L"/usr/local/lib" -L"codebase/seqlib/src/libs/libdivsufsort-2.0.1/build/lib" -o ./bin/Linux-x64/graphmap ./obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.o ./obj_linux/codebase/seqlib/src/libs/edlib.o ./obj_linux/codebase/seqlib/src/libs/opal.o ./obj_linux/codebase/argumentparser/src/argparser.o ./obj_linux/codebase/argumentparser/src/example.o ./obj_linux/codebase/seqlib/src/test.o ./obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort.o ./obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/sssort.o ./obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/trsort.o ./obj_linux/codebase/seqlib/src/log_system/log_system.o ./obj_linux/codebase/seqlib/src/sequences/sequence_alignment.o ./obj_linux/codebase/seqlib/src/sequences/sequence_alignment_test.o ./obj_linux/codebase/seqlib/src/sequences/sequence_file.o ./obj_linux/codebase/seqlib/src/sequences/sequence_gfa.o ./obj_linux/codebase/seqlib/src/sequences/sequence_gfa_test.o ./obj_linux/codebase/seqlib/src/sequences/single_sequence.o ./obj_linux/codebase/seqlib/src/utility/evalue.o ./obj_linux/codebase/seqlib/src/utility/evalue_constants.o ./obj_linux/codebase/seqlib/src/utility/utility_general.o ./obj_linux/src/alignment/alignment.o ./obj_linux/src/alignment/alignment_wrappers.o ./obj_linux/src/alignment/anchored.o ./obj_linux/src/alignment/cigargen.o ./obj_linux/src/alignment/semiglobal.o ./obj_linux/src/containers/mapping_data.o ./obj_linux/src/containers/path_graph_entry.o ./obj_linux/src/containers/region.o ./obj_linux/src/containers/score_registry.o ./obj_linux/src/containers/vertices.o ./obj_linux/src/graphmap/core_graphmap.o ./obj_linux/src/graphmap/experimental.o ./obj_linux/src/graphmap/filter_anchors.o ./obj_linux/src/graphmap/graphmap.o ./obj_linux/src/graphmap/lcs_anchored.o ./obj_linux/src/graphmap/lcs_semiglobal.o ./obj_linux/src/graphmap/process_read.o 
./obj_linux/src/graphmap/region_selection.o ./obj_linux/src/index/index.o ./obj_linux/src/index/index_hash.o ./obj_linux/src/index/index_owler.o ./obj_linux/src/index/index_sa.o ./obj_linux/src/index/index_spaced_hash.o ./obj_linux/src/index/index_spaced_hash_fast.o ./obj_linux/src/owler/dpfilter.o ./obj_linux/src/owler/owler.o ./obj_linux/src/owler/owler_data.o ./obj_linux/src/owler/owler_experimental.o ./obj_linux/src/owler/process_read.o ./obj_linux/src/main.o ./obj_linux/src/program_parameters.o -lpthread -lgomp -lm -lz
./obj_linux/codebase/seqlib/src/libs/edlib.o: In function `std::_Vector_base<int, std::allocator<int> >::~_Vector_base()':
edlib.cpp:(.text._ZNSt12_Vector_baseIiSaIiEED2Ev[_ZNSt12_Vector_baseIiSaIiEED5Ev]+0x9): undefined reference to `operator delete(void*)'
./obj_linux/codebase/seqlib/src/libs/edlib.o: In function `_ZL23myersCalcEditDistanceNWPmiiPKhiS1_iiiPiS2_bPP13AlignmentDatai.isra.44':
edlib.cpp:(.text._ZL23myersCalcEditDistanceNWPmiiPKhiS1_iiiPiS2_bPP13AlignmentDatai.isra.44+0xe5): undefined reference to `operator new[](unsigned long)'
...

(full error message is 3MB!)

compiling issue (gcc/g++-4.8.4)

Trying to compile on CentOS... Seems like there is a missing range.h file? Any ideas?

g++ -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/" -Icodebase/argumentparser/src -Icodebase/seqlib/src -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/src/alignment/alignment.o src/alignment/alignment.cc
In file included from ./src/alignment/alignment.h:23:0,
                 from src/alignment/alignment.cc:8:
./src/containers/path_graph_entry.h:14:30: fatal error: containers/range.h: No such file or directory
 #include "containers/range.h"
                              ^
compilation terminated.

Free end gap alignments possible?

I'd like to use GraphMap to align PacBio reads to contigs in an end gap free manner, as described here. However, I'm getting cases where my alignment is being soft-clipped before the end of the contig.

Here's my attempt to illustrate with a little example. Imagine the top sequence is my PacBio read and the bottom sequence is my contig:

             CTTTGGGCAAAC
             ||||||
ACGCGCATACATTCTTTGGCC
CIGAR: 6M6S

What I'd instead like is this, where the alignment always extends to the end of the reference contig:

             CTTTGGGCAAAC
             ||||||||
ACGCGCATACATTCTTTGGCC
CIGAR: 8M4S

I've tried lots of GraphMap parameters but can't seem to make it behave this way. Is it possible? Thanks!
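Whether a given SAM record reaches the end of its contig can be checked from the POS and CIGAR fields alone. The sketch below uses made-up function names and only tests the right-hand end of the reference; in the example above the contig is 21 bases long and the alignment starts at 1-based position 14.

```python
import re

def reference_end(pos, cigar):
    """1-based position of the last reference base covered by the alignment."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    consumed = sum(int(n) for n, op in ops if op in "M=XDN")
    return pos + consumed - 1

def reaches_reference_end(pos, cigar, ref_len):
    """True when the alignment extends to the last base of the reference."""
    return reference_end(pos, cigar) == ref_len
```

With these inputs, 6M6S ends at position 19 (two bases short of the contig end), while 8M4S ends at position 21.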

Version 0.4.1 ignores and rebuilds pre-existing indexes from 0.3.1

I have large genomes pre-indexed using version 0.3.1 of graphmap. When I run graphmap align against these indexes using version 0.4.1, graphmap seems not to recognize the pre-built indexes and overwrites them. This behavior does not occur when running 0.4.1 against 0.4.1-built indexes. Is there any way to build-in backwards compatibility with indexes built by previous versions of graphmap? Or at the very least, can we be assured that moving forward, all future graphmap versions will be compatible with indexes built by 0.4.1?

segfault when building index only

Hello Ivan,

I was trying to build the index for my simulated GRCH38 genome using the command line:

graphmap align -I -r grch38.simu.fasta

The indexing ran for about 7 minutes before a segmentation fault occurred (see the file attached below).

graphmap.index.err.txt

I also tried to use .fa as the reference extension and with/without reads file, and the error persisted.

I am using graphmap v0.3.2.

Any insights on how I could fix this?

Thanks,
Simo

MD flag in SAM file

To use graphmap as part of a workflow, it'd be great to have the MD flag in the SAM file without having to run samtools calmd as an extra step.

Merging contigs

I am trying to merge my contigs by closing the gaps with long reads. I ran graphmap in hopes of finding reads that overlap two different contigs (which would allow me to merge these), and got a mhat output that looks promising. However, I am having a hard time finding software to actually take in this file to merge the contigs. Is this something that anyone encountered, and what tools would you recommend?

Thank you!

mapping problems when mapping near a gap (dev version)

Dear,

I'm currently using the code and have found mapping issues in several of the mapping modes (anchor, myers, gotoh and anchorgotoh).

anchor

  • the SAM file reports one base too many, which lies inside the gap and matches an N

myers

  • the read looks compressed, resulting in a large number of insertions

gotoh

  • the read looks compressed, resulting in a large number of insertions (same as myers)

anchorgotoh

  • the read is matched completely, even across the gaps containing N

I will send an email with some additional files.
Attached is an IGV snapshot of the results:

igv_snapshot

Thank you,
Jordy

Compilation error mac

Getting this error when running make mac
ld: warning: directory not found for option '-Lcodebase/seqlib/src/libs/libdivsufsort-2.0.1/build/lib

It looks like this isn't actually a folder (I guess the error is saying that already). Is it supposed to be libdivsufsort-2.0.1-64bit/ perhaps?

Sam header when mapping to transcriptome

I tried mapping to the transcriptome, and it seems to have worked. However, while the reads are mapped to the individual chromosomes, the sam header still consists of the transcript IDs.
Example:

@HD     VN:1.0  SO:unknown
@SQ     SN:ENST00000456328_1    LN:1657
@SQ     SN:ENST00000515242_1    LN:1653
@SQ     SN:ENST00000518655_1    LN:1483
@SQ     SN:ENST00000450305_1    LN:632
@SQ     SN:ENST00000438504_1    LN:1783

Since the reads have the expected "chromosome" identifiers in the sam fields, this results in many error messages such as printed below when converting the sam file to bam or calling samtools flagstat:

[W::sam_parse1] urecognized reference name; treated as unmapped
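The mismatch between header and records can be confirmed without samtools by comparing the RNAME column against the @SQ lines. This is a diagnostic sketch with a made-up function name; it assumes the header lines come before the alignment records, as in a regular SAM file.

```python
def undeclared_references(sam_lines):
    """Return RNAME values used by alignment records but missing from the @SQ header."""
    declared, missing = set(), set()
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                declared.update(f[3:] for f in fields[1:] if f.startswith("SN:"))
        elif len(fields) > 2 and fields[2] != "*" and fields[2] not in declared:
            missing.add(fields[2])
    return missing
```

An empty result means every mapped record refers to a sequence declared in the header; unmapped records (RNAME *) are ignored.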

Installation error

Ran into the following error during the installation procedure:

mkdir -p obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/
g++ -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/" -Icodebase/argumentparser/src -Icodebase/seqlib/src -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread  -o obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.o codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp
In file included from codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort_private.h:69:0,
                 from codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp:29:
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp: In function 'saint_t sufcheck64(const sauchar_t*, const saidx64_t*, saidx64_t, saint_t)':
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort64.h:61:23: error: expected ')' before 'PRId64'
 #define PRIdSAIDX64_T PRId64
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort_private.h:75:23: note: in expansion of macro 'PRIdSAIDX64_T'
 #  define PRIdSAIDX_T PRIdSAIDX64_T
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp:185:49: note: in expansion of macro 'PRIdSAIDX_T'
         fprintf(stderr, "Out of the range [0,%" PRIdSAIDX_T "].\n"
                                                 ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort64.h:61:23: error: expected ')' before 'PRId64'
 #define PRIdSAIDX64_T PRId64
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort_private.h:75:23: note: in expansion of macro 'PRIdSAIDX64_T'
 #  define PRIdSAIDX_T PRIdSAIDX64_T
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp:198:36: note: in expansion of macro 'PRIdSAIDX_T'
                         "  T[SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T "]=%d"
                                    ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort64.h:61:23: error: expected ')' before 'PRId64'
 #define PRIdSAIDX64_T PRId64
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/divsufsort_private.h:75:23: note: in expansion of macro 'PRIdSAIDX64_T'
 #  define PRIdSAIDX_T PRIdSAIDX64_T
                       ^
codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.cpp:229:34: note: in expansion of macro 'PRIdSAIDX_T'
                         "  SA[%" PRIdSAIDX_T "]=%" PRIdSAIDX_T " or\n"
                                  ^
make: *** [obj_linux/codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/utils.o] Error 1

Any idea how to address it? Thanks!

ERROR: Unknown parameter '-w'.

Hi,

when trying to run the program like this
bin/Linux-x64/graphmap align -x illumina -r /data1/Xenoturbella/redundans/run7-j3-i3/PacBio_scaffolder_results_l7/PacBio_l7.run7-j3-i3.scaffolds.fna.fa -d /data1/Xenoturbella/xenoturbella.TSLR.fastq -o /data1/Xenoturbella/redundans/run7-j3-i3/PacBio_scaffolder_results_l7/xenoturbella.TSLR.graphmap.alignments.sam

I get
ERROR: Unknown parameter '-w'..

But there is no "w".

Best

Philipp

graphmap terminate called after throwing an instance of 'std::bad_alloc'

Hi Ivan,
In a few instances, graphmap throws the error "terminate called after throwing an instance of 'std::bad_alloc'". Error message:

[03:40:52 ProcessReads] [CPU time: 207530.06 sec, RSS: 108099 MB] Read: 40255/93606 (43.00%) [m: 33162, u: 7082], length = 26, qname: 6ab83d51-3dbe-4d6c-ba2a-67...
[03:40:52 ProcessReads] [CPU time: 207530.06 sec, RSS: 108104 MB] Read: 40256/93606 (43.01%) [m: 33162, u: 7083], length = 49, qname: 88a881ee-0e1e-4939-adca-ed...
[03:40:52 ProcessReads] [CPU time: 207530.08 sec, RSS: 108115 MB] Read: 40257/93606 (43.01%) [m: 33162, u: 7084], length = 140, qname: 672cd3c8-990d-4aea-b32d-8...
[03:40:54 ProcessReads] [CPU time: 207543.75 sec, RSS: 110695 MB] Read: 40293/93606 (43.05%) [m: 33162, u: 7120], length = 19, qname: db403299-d52f-4bf9-a8c8-91...
[03:40:54 ProcessReads] [CPU time: 207543.77 sec, RSS: 110699 MB] Read: 40294/93606 (43.05%) [m: 33162, u: 7121], length = 106, qname: 3cef43ed-3d26-4e54-9a04-0...terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Could you have a look to see how we can avoid these?
Thanks

Mac fails build under OS X El Capitan g++-5.3.0

There seems to be a build error when building on Mac. A number of files #include "sequences/single_sequence.h", but searching the repository I can't find any file named single_sequence.h. I tried removing those references, which leads to more missing headers down the line (utility/* and seqan/*), which in turn lead to build errors from the missing debug commands.

The error given is:

make mac
mkdir -p obj_mac/src/alignment/
/usr/local/bin/g++-5 -static-libgcc -static-libstdc++ -D__cplusplus=201103L -I"./src/" -I"/usr/include/" -I"codebase/seqlib/src/libs/seqan-library-2.0.1/include" -I"codebase/seqlib/src/libs/libdivsufsort-2.0.1-64bit/" -DRELEASE_VERSION -O3 -fdata-sections -ffunction-sections -c -fmessage-length=0 -ffreestanding -fopenmp -m64 -std=c++11 -Werror=return-type -pthread -o obj_mac/src/alignment/alignment.o src/alignment/alignment.cc
In file included from src/alignment/alignment.cc:8:0:
./src/alignment/alignment.h:19:39: fatal error: sequences/single_sequence.h: No such file or directory
compilation terminated.
make: *** [obj_mac/src/alignment/alignment.o] Error 1

g++ output:

/usr/local/bin/g++-5 -v
Using built-in specs.
COLLECT_GCC=/usr/local/bin/g++-5
COLLECT_LTO_WRAPPER=/usr/local/Cellar/gcc/5.3.0/libexec/gcc/x86_64-apple-darwin15.0.0/5.3.0/lto-wrapper
Target: x86_64-apple-darwin15.0.0
Configured with: ../configure --build=x86_64-apple-darwin15.0.0 --prefix=/usr/local/Cellar/gcc/5.3.0 --libdir=/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-5 --with-gmp=/usr/local/opt/gmp --with-mpfr=/usr/local/opt/mpfr --with-mpc=/usr/local/opt/libmpc --with-isl=/usr/local/opt/isl --with-system-zlib --enable-libstdcxx-time=yes --enable-stage1-checking --enable-checking=release --enable-lto --with-build-config=bootstrap-debug --disable-werror --with-pkgversion='Homebrew gcc 5.3.0' --with-bugurl=https://github.com/Homebrew/homebrew/issues --enable-plugin --disable-nls --enable-multilib
Thread model: posix
gcc version 5.3.0 (Homebrew gcc 5.3.0)

segfault on some reads

Hello, I'm getting a segfault after building the latest version: 817ba03

Using this graphmap command line:
graphmap align --threads 1 --ref Chlamy_and_lambda.fa --index Chlamy_and_lambda.fa.graphmap.gmidx --reads y.fastq -o y.sam

using the following reference:
http://portal.nersc.gov/dna/RD/Adv-Seq/ONT/Chlamy_and_lambda.fa

and one of the reads that triggers it is:
http://portal.nersc.gov/dna/RD/Adv-Seq/ONT/y.fastq

gdb of the core dump narrows it down to
codebase/seqlib/src/libs/edlib.cpp:1150

I suspect there is an integer overflow somewhere, since blockIdx is suspiciously large, but I don't understand the code very well:

#0  0x000000000040ddef in obtainAlignmentHirschberg (
    query=0x1d962f5 "CGCCTTCACACACACACACACCACCGCGAACCCACCCACCACCCGCACCACCCACCGCACCACCCACCCGCCGCCCACCCACCCACCCACCACCACCCACACCGCACCCACCACACCCACCCACCACACCCACCACCACCACCCACCCACCCACCACCCACCCACCCACCACACACACGCACACCACCCACCACCACCCA"..., 
    rQuery=0x1db0560 "GAGAGAACACGTTTGCGCGAGCACTTTTACGTCCGCACCCGCCACGCGAACCGTCGAGTTCCCCTTCCTCCCGACCACCGCCGCCGCCGTTCCCGCTTGACTGCCGCGTCCTCAGCGCCTTCGACGCCGTCCAAACACACCCGCCTCCTTACCTCCTCTCCCGCCGGCCACACAACACGCCTTTCGCTCCCGGCACTCCC"..., queryLength=2240, target=0x7fd4ddab20ae 'N' <repeats 200 times>..., 
    rTarget=0x1dba150 "TTGTGCGTGCGCGTCACCCTACTCGCGGACCTGCGCCGCGACCTACGGAGCGACCGCGGCGAAACGACCCCCGCGCTGGACCGAGTGCACATGCCGTCGGGTGCGCCGAAGTACCGCCGTGTGGGGCTCGACAAGCGCAAGGCTGTCCACTCCCCCCCCCCGACCCCACACACACCCACACCCACTCCACACACCCCCAC"..., targetLength=14666, alphabetLength=128, bestScore=12426, alignment=0x7fff02f76180, alignmentLength=0x7fff02f7617c) at codebase/seqlib/src/libs/edlib.cpp:1150
1150                        alignDataLeftHalf->scores[blockIdx]);
(gdb) list
1145        // and ending with scoresLeftEndIdx row (0-indexed).
1146        int scoresLeftLength = (lastBlockIdxLeft - firstBlockIdxLeft + 1) * WORD_SIZE;
1147        int* scoresLeft = new int[scoresLeftLength];
1148        for (int blockIdx = firstBlockIdxLeft; blockIdx <= lastBlockIdxLeft; blockIdx++) {
1149            Block block(alignDataLeftHalf->Ps[blockIdx], alignDataLeftHalf->Ms[blockIdx],
1150                        alignDataLeftHalf->scores[blockIdx]);
1151            readBlock(block, scoresLeft + (blockIdx - firstBlockIdxLeft) * WORD_SIZE);
1152        }
1153        int scoresLeftStartIdx = firstBlockIdxLeft * WORD_SIZE;
1154        // If last block contains padding, shorten the length of scores for the length of padding.
(gdb) p blockIdx
$1 = 1483722392

Here is the output:

[16:49:26 Index] Running in normal (parsimonious) mode. Only one index will be used.
[16:49:26 Index] Index already exists. Loading from file.
[16:49:27 Index] Index loaded in 1.25 sec.
[16:49:27 Index] Memory consumption: [currentRSS = 2103 MB, peakRSS = 2104 MB]

[16:49:27 Run] Automatically setting the maximum allowed number of regions: max. 2346, attempt to reduce after 0
[16:49:27 Run] No limit to the maximum number of seed hits will be set in region selection.
[16:49:27 Run] Reference genome is assumed to be linear.
[16:49:27 Run] Only one alignment will be reported per mapped read.
[16:49:27 ProcessReads] Reads will be loaded in batches of up to 1024 MB in size.
[16:49:27 ProcessReads] Batch of 1 reads (0 MiB) loaded in 0.00 sec. (38832296 bases)
[16:49:27 ProcessReads] Memory consumption: [currentRSS = 2103 MB, peakRSS = 2104 MB]
[16:49:27 ProcessReads] Using 1 threads.
[16:49:27 ProcessReads] [CPU time: 0.00 sec, RSS: 2103 MB] Read: 0/1 (0.00%) [m: 0, u: 0], length = 25116, qname: 72a2fdd0-ab17-47cb-982b-c5d663753a66_Basecall_...Segmentation fault

alignment out of reference bounds

Hi Ivan,

I have a weird case where a read gets aligned to reference end +1. This is a long, stitched-together Illumina read mapped against a database of many similar sequences. I used the following command line (see PS for links to files):

graphmap -x illumina -t 8 -r 99_otus.fasta -d offending_seq.fa

The alignment in question is:
2|EU668175.1.1895 0 159627 317 0 122M1I539M1I3M111I2M2I1M4I1M1I1M2I2M1I1M2I1M1I3M1I1M1I4M4I10M2I3M2I5M1I10M1I13M2I3M1I6M2I3M2I7M4I3M1I3M4I1M1I2M4I4M4I13M1I3M2I7M2I4M1I6M2I8M1I2M1I16M3I1M2I10M1I7M1I5M3I2M1I4M1I1M1I1M3I3M1I7M1I2M1I2M1I1M2I6M1I1M6I1M1I1M1I6M5I1M1I4M1I2M2I1M2I1M1I2M2I11M6I8M1I2M1I8M2I8M2I5M2I1M2I10M5I1M3I8M8I2M2I5M7I2M1I4M1I10M2I1M1I1M2I6M2I3M2I3M2I7M1I1M1I6M2I2M1I5M1I15M2I14M2I8M2I4M2I2M2I1M1I7M

The CIGAR string translates into a length of 1056. Start position 317 + 1056 gives 1373, but the reference is of length 1372. You get the same result with e.g. pysam's aligned_pairs. I get this for 0.2.2 604a386 (dev) and 0.22 db1362c (master). Weirdly enough, this doesn't happen if I simply extract the reference of interest from the bigger database and align only against that.
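
The arithmetic above can be reproduced with a short script. This is a minimal sketch under the SAM conventions that the M, D, N, = and X operations consume reference bases and that POS is 1-based, so the last reference base covered by an alignment is POS + span - 1. The helper names cigar_spans and last_ref_position are hypothetical, and the short CIGAR strings in the example are illustrative, not the alignment reported here.

```python
import re

_CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_CONSUMING = set("MDN=X")    # operations that advance along the reference
_QUERY_CONSUMING = set("MIS=X")  # operations that advance along the read


def cigar_spans(cigar):
    """Return (reference_span, query_span) for a CIGAR string."""
    if not _CIGAR_FULL.fullmatch(cigar):
        raise ValueError("malformed CIGAR: %r" % cigar)
    ref_span = query_span = 0
    for count, op in _CIGAR_OP.findall(cigar):
        if op in _REF_CONSUMING:
            ref_span += int(count)
        if op in _QUERY_CONSUMING:
            query_span += int(count)
    return ref_span, query_span


def last_ref_position(pos_1based, cigar):
    """Return the 1-based position of the last reference base covered by the alignment."""
    ref_span, _ = cigar_spans(cigar)
    return pos_1based + ref_span - 1


# Illustrative values only: 10M + 5M consume 15 reference bases, 2I consumes none.
print(cigar_spans("10M2I5M"))
print(last_ref_position(317, "10M2I5M"))
```

An alignment stays inside the reference when last_ref_position(POS, CIGAR) is at most the LN value of the corresponding @SQ line. Note that pysam reports 0-based coordinates, so its positions are shifted by one relative to the POS column.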

Andreas

PS:
https://dl.dropboxusercontent.com/u/4119940/graphmap-out-of-bounds-aln/offending_seq.fa
https://dl.dropboxusercontent.com/u/4119940/graphmap-out-of-bounds-aln/99_otus.fasta.gz

segmentation fault when making reference index

Hi Ivan,

Whenever I run graphmap (v.022), I get a segmentation fault. My computer runs Ubuntu 15.10 with 30GB of RAM, and I am using the parsimonious memory mode.

[Index 14:21:45] Running in parsimonious mode. Only one index will be used.
[Index 14:21:45] Generating index.
Segmentation fault (core dumped)

Do you know what might be causing this error?

Thanks!

Jacob

Empty lines in Sam file

The GraphMap alignment produced a SAM file containing empty lines. Downstream applications such as samtools gave me errors because of this.

Case: aligning a 1D Nanopore E. coli dataset to the reference genome

graphmap align \
        -t 32 \
        -C \
        -r U00096.2.fasta \
        -d $inputfile \
        -o /mnt/output/output.sam

Workaround I used, but not ideal:
sed '/^$/d' output.sam > output.new.sam
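
Where sed is not available, the same filtering can be sketched in Python. Like the sed expression above, this drops only lines that are completely empty and leaves all other lines untouched; the function name drop_empty_lines is hypothetical.

```python
def drop_empty_lines(lines):
    """Yield every line except those that contain nothing but a newline."""
    for line in lines:
        if line.rstrip("\n") != "":
            yield line


# In-memory example; with real files, pass an open input handle and write
# the yielded lines to the output handle with writelines().
sample = ["@HD\tVN:1.0\tSO:unknown\n", "\n", "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n", "\n"]
cleaned = list(drop_empty_lines(sample))
print(len(cleaned))
```

Because the function is a generator, the file is streamed line by line and is never held in memory as a whole.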
