
Comments (10)

jotech commented on May 30, 2024

should be fixed. please reopen in case of further problems!

from gapseq.

jinnjy commented on May 30, 2024

Hi,

This happened again in the version I cloned from GitHub on Mar 13, 2021.

(antismash5) [jinnjy@origo gapseq]$ ./gapseq doall /home/jinnjy/DATA/genomes/Prokaryotes/batch3/GCA_004936435.1_ASM493643v1_genomic.fna
index file GCA_004936435.1_ASM493643v1_genomic.fna.tmp.fai not found, generating...
Predicted taxonomy: Bacteria
Checking updates for Bacteria /home/jinnjy/Tools/gapseq/gapseq/src/../dat/seq/Bacteria
2021-05-18 10:06:05 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rev/sequences.tar.gz [193] -> ".listing" [1]
2021-05-18 10:06:09 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/unrev/sequences.tar.gz [194] -> ".listing" [1]
2021-05-18 10:06:12 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rxn/sequences.tar.gz [193] -> ".listing" [1]
Bacteria reviewed sequences already up-to-date
Bacteria unreviewed sequences already up-to-date
Bacteria additional reaction sequences already up-to-date
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_1 UniRef50_O30642 Monomethylamine methyltransferase MtmB1 n=104 Tax=cellular organisms TaxID=131567 RepID=MTMB1_METBA: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 201
Warning: [tblastn] lcl|Query_1 UniRef90_Q18TV3 Trimethylamine methyltransferase MttB n=7 RepID=MTTB_DESHD: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 330
ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-
.fasta*': No such file or directory


jinnjy commented on May 30, 2024

More specifically, this only happened when I used "doall".
Everything was fine when I later ran the steps separately, following the top figure of https://gapseq.readthedocs.io/en/latest/usage/basics.html:
./gapseq find -p all
./gapseq find-transport
./gapseq draft
./gapseq fill

The account I used on the server of my institution has no administrator access.


Waschina commented on May 30, 2024

Thanks for sharing the details and logs. We will look into it this week.
Best
Silvio


jinnjy commented on May 30, 2024

Sorry, I was wrong: the problem also occurs in some cases when running gapseq find on its own.
This is the command I used:
./gapseq find -p all -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna

And the following are the messages:

2021-05-25 16:47:46 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rev/sequences.tar.gz [193] -> ".listing" [1]
2021-05-25 16:47:49 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/unrev/sequences.tar.gz [194] -> ".listing" [1]
2021-05-25 16:47:52 URL: ftp://ftp.rz.uni-kiel.de/pub/medsystbio/Bacteria/rxn/sequences.tar.gz [193] -> ".listing" [1]
Bacteria reviewed sequences already up-to-date
Bacteria unreviewed sequences already up-to-date
Bacteria additional reaction sequences already up-to-date
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_6 UniRef50_A0A6J4S0Y1 Aspartyl-tRNA(Asn) amidotransferase subunit C @ Glutamyl-tRNA(Gln) amidotransferase subunit C (Fragment) n=1 Tax=uncultured Solirubrobacteraceae bacterium TaxID=1162706 RepID=A0A6J4S0Y1_9ACTN Subunit 3: Warning: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options
Warning: [tblastn] lcl|Query_1 UniRef50_O30642 Monomethylamine methyltransferase MtmB1 n=104 Tax=cellular organisms TaxID=131567 RepID=MTMB1_METBA: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 201
Warning: [tblastn] lcl|Query_1 UniRef90_Q18TV3 Trimethylamine methyltransferase MttB n=7 RepID=MTTB_DESHD: Warning: One or more U or O characters replaced by X for alignment score calculations at positions 330
ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-
.fasta*': No such file or directory


Waschina commented on May 30, 2024

Hi @jinnjy

I figured out what causes the issue:
First, the [tblastn] warnings are nothing to worry about here. Some (very few) reference protein sequences contain non-standard letters in their amino acid sequences, so a few alignments cannot be scored. Since this affects only a few (3-5) reference sequences, it should not have a big effect on gapseq's output.
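
To see which reference records would trigger the "U or O characters" warning, here is a minimal shell sketch. It assumes plain-text FASTA input; the function name list_nonstandard_records is hypothetical and not part of gapseq. U and O are the one-letter codes for selenocysteine and pyrrolysine.

```shell
# List FASTA records whose sequence contains U (selenocysteine) or O (pyrrolysine).
# Hypothetical helper, not part of gapseq.
# Usage: list_nonstandard_records FILE.fasta
list_nonstandard_records() {
  awk '
    /^>/ { header = $0; next }     # remember the header of the current record
    /[UOuo]/ && header != "" {     # sequence line containing U or O
      print header
      header = ""                  # report each record only once
    }
  ' "$1"
}
```

It can be called on any single reference file, for example one of the .fasta files under dat/seq/Bacteria/unrev. Records it prints are expected to produce the harmless warning, not a failure.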
The second and bigger problem is the part

ls: cannot access 'query_subunit.part-.fasta': No such file or directory
rm: cannot remove 'query_subunit.part-.fasta*': No such file or directory

This is caused by an erroneous fasta file in gapseq's reference sequence database. However, these errors are unlikely to affect your gapseq output files. We'll make sure to correct the error in the sequence file in the upcoming update of the reference sequences. Until then, you can correct this error by running the following in your gapseq installation directory:

rm dat/seq/Bacteria/unrev/fff1a57554ea00ed065fb8ee193e2959.fasta
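
If you want to check for other problematic reference files, a rough sketch could look like the following. It is only an assumption that an erroneous file would be empty or lack a ">" header line; the thread does not say what exactly was wrong with the file above. The function name find_suspect_fasta is hypothetical.

```shell
# Print .fasta files under DIR that are empty or contain no ">" header line.
# Assumption: "erroneous" means empty or header-less; other defects are not detected.
# Usage: find_suspect_fasta DIR
find_suspect_fasta() {
  find "$1" -type f -name '*.fasta' | while IFS= read -r f; do
    if [ ! -s "$f" ]; then
      echo "empty: $f"             # zero-byte file
    elif ! grep -q '^>' "$f"; then
      echo "no header: $f"         # no FASTA header line found
    fi
  done
}
```

For example, find_suspect_fasta dat/seq/Bacteria/unrev would scan the directory that holds the file named above. The function only reports files and removes nothing.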

In addition, since the genome you have here is bacterial, you can also tell gapseq to limit the pathway search to pathways described for bacteria:

./gapseq find -p all -m Bacteria -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
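
For a whole directory of bacterial genomes, the same call can be wrapped in a loop. This is a minimal sketch that uses only the options shown above; the function name find_all_bacteria and the GAPSEQ override variable are hypothetical conveniences, and it assumes the genomes end in .fna.

```shell
# Run "gapseq find" on every .fna file in DIR, restricted to bacterial pathways.
# GAPSEQ may be overridden (e.g. GAPSEQ=echo for a dry run); the default is ./gapseq.
# Usage: find_all_bacteria DIR
find_all_bacteria() {
  gapseq_bin="${GAPSEQ:-./gapseq}"
  for genome in "$1"/*.fna; do
    [ -e "$genome" ] || continue   # skip if the glob matched nothing
    "$gapseq_bin" find -p all -m Bacteria -v 0 "$genome" \
      || echo "gapseq find failed for $genome" >&2
  done
}
```

Running GAPSEQ=echo find_all_bacteria DIR first prints the commands without executing them. A failure on one genome is reported on stderr and the loop continues with the next file.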

Thanks for your feedback. I will post here again as soon as we have fixed the reference sequences in gapseq.


jinnjy commented on May 30, 2024

Hi @Waschina,

Thanks for pointing out why these messages pop up.
So far, gapseq is the best fit for the analysis I want to perform, so it is a relief that these warnings and errors look like relatively minor issues.


jotech commented on May 30, 2024

Hi @jinnjy
we updated all sequences to avoid the issue @Waschina pointed out!
Hope it works better now :)


jinnjy commented on May 30, 2024

Thank you for the great help.
I updated the data before my new tests using this command:
git pull https://github.com/jotech/gapseq.git
I installed all the required packages in a separate environment:
conda create -n gapseq
conda activate gapseq
The following new tests finished with only the warnings you have already explained:
./gapseq find -p all -m Bacteria -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna
./gapseq find-transport -v 0 /home/jinnjy/DATA/genomes/Prokaryotes/Acidobacteria_20210518_fna/fna/GCA_000014005.1_ASM1400v1_genomic.fna

So we should be able to close this issue for now.


Waschina commented on May 30, 2024

Perfect, thanks for the feedback and please reopen if new issues occur.
