Comments (10)
should be fixed. please reopen in case of further problems!
from gapseq.
Hi,
This happened again with the version I cloned from GitHub on Mar 13, 2021.
(antismash5) [jinnjy@origo gapseq]$ ./gapseq doall /home/jinnjy/DATA/genomes/Prokaryotes/batch3/GCA_004936435.1_ASM493643v1_genomic.fna
index file GCA_004936435.1_ASM493643v1_genomic.fna.tmp.fai not found, generating...
Predicted taxonomy: Bacteria
Checking updates for Bacteria /home/jinnjy/Tools/gapseq/gapseq/src/../dat/seq/Bacteria
2021-05-18 10:06:05 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rev/sequences.tar.gz [193] -> ".listing" [1]
2021-05-18 10:06:09 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/unrev/sequences.tar.gz [194] -> ".listing" [1]
2021-05-18 10:06:12 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rxn/sequences.tar.gz [193] -> ".listing" [1]
Bacteria reviewed sequences already up-to-date
Bacteria unreviewed sequences already up-to-date
Bacteria additional reaction sequences already up-to-date
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_1 UniRef50_O30642 Monomethylamine methyltransferase MtmB1 n=104 Tax=cellular organisms TaxID=131567 RepID=MTMB1_METBA: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 201
Warning: [tblastn] lcl|Query_1 UniRef90_Q18TV3 Trimethylamine methyltransferase MttB n=7 RepID=MTTB_DESHD: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 330
ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-.fasta*': No such file or directory
from gapseq.
More specifically, this only happened when I used "doall".
Everything was fine when I later separated the execution of different steps by following the top figure of https://gapseq.readthedocs.io/en/latest/usage/basics.html:
./gapseq find -p all
./gapseq find-transport
./gapseq draft
./gapseq fill
The account I used on the server of my institution has no administrator access.
from gapseq.
Thanks for sharing the details and logs. We will look into it this week.
Best
Silvio
from gapseq.
Sorry, I was wrong: the error still occurs in some cases when running gapseq find.
This is the command I used.
./gapseq find -p all -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
And the following are the messages:
2021-05-25 16:47:46 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rev/sequences.tar.gz [193] -> ".listing" [1]
2021-05-25 16:47:49 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/unrev/sequences.tar.gz [194] -> ".listing" [1]
2021-05-25 16:47:52 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rxn/sequences.tar.gz [193] -> ".listing" [1]
Bacteria reviewed sequences already up-to-date
Bacteria unreviewed sequences already up-to-date
Bacteria additional reaction sequences already up-to-date
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_1 UniRef50_O30642 Monomethylamine methyltransferase MtmB1 n=104 Tax=cellular organisms TaxID=131567 RepID=MTMB1_METBA: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 201
Warning: [tblastn] lcl|Query_1 UniRef90_Q18TV3 Trimethylamine methyltransferase MttB n=7 RepID=MTTB_DESHD: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 330
ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-.fasta*': No such file or directory
from gapseq.
Hi @jinnjy
I figured out what causes the issue:
First, the [tblastn] warnings are nothing to worry about here. A very small number of reference protein sequences contain non-standard letters in their amino acid sequences, so a few alignments cannot be scored. Since this affects only a few (3-5) reference sequences, it should not have a big effect on gapseq's output.
The second and bigger problem is the part
ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-.fasta*': No such file or directory
This is caused by an erroneous FASTA file in gapseq's reference sequence database. These errors are unlikely to affect your gapseq output files, but we will correct the sequence file in the upcoming update of the reference sequences. Until then, you can work around the error by running the following in your gapseq installation directory:
rm dat/seq/Bacteria/unrev/fff1a57554ea00ed065fb8ee193e2959.fasta
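If you want to check whether other reference files look suspicious before deleting anything, here is a minimal sketch. It assumes that "erroneous" means a file that is empty or whose first non-blank line is not a ">" header; the actual defect in the file above was not described, so treat this as a rough heuristic. The function name check_fasta_dir is hypothetical and not part of gapseq.

```shell
#!/usr/bin/env bash
# Sketch only: flag reference FASTA files that look malformed.
# Assumption: "malformed" = empty file, or first non-blank line is not a ">" header.
# check_fasta_dir is a hypothetical helper, not a gapseq command.

check_fasta_dir() {
    local dir="$1" f first
    find "$dir" -type f -name '*.fasta' | sort | while IFS= read -r f; do
        # First non-blank line of the file (empty string if there is none)
        first=$(grep -m 1 -v '^[[:space:]]*$' "$f" || true)
        case "$first" in
            '>'*) ;;                   # looks like a FASTA header, skip
            *) printf '%s\n' "$f" ;;   # empty file or missing header, report it
        esac
    done
}

# Example, run from the gapseq installation directory:
#   check_fasta_dir dat/seq/Bacteria/unrev
```

The function only prints paths and never deletes anything, so the output can be reviewed before running rm by hand.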
In addition, since the genome you have here is bacterial, you can also tell gapseq to limit the pathway search to pathways described for bacteria:
./gapseq find -p all -m Bacteria -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
Thanks for your feedback. I will post here again as soon as we fixed the reference sequences in gapseq.
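To make messages other than the harmless [tblastn] warnings easier to spot in a saved log, a small filter along these lines might help. This is a sketch, assuming the warnings always start with "Warning: [tblastn]" as in the logs above; the function name and the log file name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: drop the [tblastn] warning lines from a log read on stdin,
# so that remaining messages (such as the ls/rm errors) stand out.
# Assumption: benign warnings start with "Warning: [tblastn]" as in this thread.

non_tblastn_messages() {
    # grep exits with status 1 when nothing is left; treat that as success
    grep -v '^Warning: \[tblastn\]' || true
}

# Example with a hypothetical log file:
#   non_tblastn_messages < gapseq_find.log
```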
from gapseq.
Hi @Waschina,
Thanks for pointing out the reason these messages pop up.
So far, gapseq is the best fit for the analysis I want to perform, so it is a relief that these warnings and errors look like relatively minor issues.
from gapseq.
Hi @jinnjy
we updated all sequences to avoid the issue @Waschina pointed out!
Hope it works better now :)
from gapseq.
Thank you for the great help.
I updated the data before my new tests using this command:
git pull https://github.com/jotech/gapseq.git
I installed all the required packages in a separate environment:
conda create -n gapseq
conda activate gapseq
I completed the following new tests with only the warnings that you have already explained:
./gapseq find -p all -m Bacteria -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
./gapseq find-transport -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
So we should be able to close this issue for now.
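For anyone repeating these two tests on several genomes, here is a minimal sketch of a wrapper. It uses only the two commands and flags shown above; the function name run_find_steps, the GAPSEQ variable, and the genomes directory in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run the two gapseq steps tested above for one genome.
# GAPSEQ and run_find_steps are hypothetical names; the flags are the ones
# used in this thread.

GAPSEQ="${GAPSEQ:-./gapseq}"

run_find_steps() {
    local genome="$1"
    # Run find-transport only if the pathway search finished without error
    "$GAPSEQ" find -p all -m Bacteria -v 0 "$genome" &&
        "$GAPSEQ" find-transport -v 0 "$genome"
}

# Example, run from the gapseq directory (the genomes path is an assumption):
#   for g in /path/to/genomes/*.fna; do run_find_steps "$g"; done
```

Setting GAPSEQ=echo prints the commands instead of running them, which is a cheap way to check the paths first.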
from gapseq.
Perfect, thanks for the feedback and please reopen if new issues occur.
from gapseq.