linzhi2013 / mitoz

MitoZ: A toolkit for assembly, annotation, and visualization of animal mitochondrial genomes

Home Page: https://doi.org/10.1093/nar/gkz173

License: GNU General Public License v3.0

Shell 100.00%

mitoz mitochondrial genome-assembly mitochondrion

mitoz's Introduction

MitoZ 3

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

WHEN YOU ADAPT (PART OF) THE SOFTWARE FOR YOUR USE CASES, THE AUTHOR AND THE SOFTWARE MUST BE EXPLICITLY CREDITED IN YOUR PUBLICATIONS AND SOFTWARE, AND YOU SHOULD ASK THE USERS OF YOUR SOFTWARE TO CITE THE SOFTWARE IN THEIR PUBLICATIONS. IN A WORD, PLEASE PLAY FAIR.

About:

MitoZ provides a "one-click" solution to get annotated mitogenomes from raw data fastq files.

News

For Windows Subsystem for Linux (WSL) users: you might need to recompile the cmsearch program for tRNA annotation.
(April-20-2023) Docker and Singularity versions of MitoZ 3.6 are now available. Both were tested on Ubuntu 20.04.4 LTS.
(April-19-2023) The installation problem (#188) is hopefully fixed now. Please let me know if it is not.
(April-14-2023) MitoZ 3.6 has just been released (https://github.com/linzhi2013/MitoZ/releases/tag/3.6); it fixes some bugs in MitoZ 3.5. Upgrading to this version is recommended! You can install it via conda-pack (the first choice if the conda way does not work for you), conda, or from source code.

See

Installation: https://github.com/linzhi2013/MitoZ/wiki/Installation.
MAKE SURE that you do a test run using the provided test dataset before running your own samples!
Tutorial: https://github.com/linzhi2013/MitoZ/wiki/Tutorial (Recommended if you are NEW to MitoZ!)
Documentation: https://github.com/linzhi2013/MitoZ/wiki and the HTML version. The HTML version may not be up to date.
Latest release: https://github.com/linzhi2013/MitoZ/releases/

Bugs and Questions

Have a look at https://github.com/linzhi2013/MitoZ/issues and https://github.com/linzhi2013/MitoZ/wiki/Known-issues for known bugs or issues.
Please try the latest version first if you find bugs in an older version.
- To do that, specify the MitoZ version when you use the mamba/conda command (please refer to the installation instructions), as I found that many people still download older versions.
Known bugs for MitoZ 3.5 (April-13-2023): (1) If your default shell is not bash, tRNA gene annotations can be missing (see #187). Please change the default shell to bash before using MitoZ 3.5! (2) In MitoZ 3.5, I mistakenly shipped a macOS cmsearch binary for the Linux platform, which causes tRNA annotation to fail completely. Please check #187 for the current solution.
To check which shell you are using:
```
$ echo "$SHELL"
```
I update the documentation (wiki) from time to time, so it is worth checking it again every now and then.
In case there are still bugs in the latest version, please first search https://github.com/linzhi2013/MitoZ/issues and the Wiki to check whether similar questions have been raised by other users. If no related issues and answers are found, then please raise a new issue. Thank you!
Any feedback is welcome!

Citations

Meng G, Li Y, Yang C, Liu S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic acids research. 2019 Jun 20;47(11):e63-. https://doi.org/10.1093/nar/gkz173
Additionally, please cite the related software invoked by MitoZ: https://github.com/linzhi2013/MitoZ/wiki/Citations

mitoz's Issues

Assembly error (all module)

Hello, I was exploring MitoZ to see if it can be implemented in our research. I tried following the all module, but when it tries to assemble the clean data I'm presented with the following error:

Error occured when running command:
/bin/assemble/mitoAssemble all -K 71 -o work71 -s work71.soaptrans.lib -p 12

I checked that all files have the correct format, but can't find any anomalies there.
Any ideas what the problem could be and how to solve this issue?
Thanks in advance

Gene coordinates problem in the `summary.txt` file

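--- The values that follow are the positions after my manual correction.

This is a Python coding problem. For a gene, the Python coordinates are [0, 2). When I extracted the coordinates, I remembered to add 1 to the start position but forgot to subtract 1 from the end position.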

Can't produce a correct summary.txt when using annotation module.

I annotated about 50 species with MitoZ, but for about 10 of them a correct summary.txt could not be produced. An error report follows. I think the main reason is "Name and length collide in the LOCUS line". Why does this happen?

2019-10-01 10:29:06
/home/zf/install/miniconda3/envs/mitozEnv/bin/python3 /home/zf/install/version_2.3/release_MitoZ_v2.3/bin/common/genbank_gene_stat_v2.py 1_Poeciloneta_variegata_mitoscaf.fa.gbf 1_Poeciloneta_variegata.most_related_species.txt annotate > summary.txt

Traceback (most recent call last):
File "/home/zf/install/version_2.3/release_MitoZ_v2.3/bin/common/genbank_gene_stat_v2.py", line 238, in <module>
main()
File "/home/zf/install/version_2.3/release_MitoZ_v2.3/bin/common/genbank_gene_stat_v2.py", line 202, in main
seqid_len_topology_relatedSP = get_seq_topology_and_related_sp(gbfile=gbfile, closely_related_sp_file=closely_related_sp_file)
File "/home/zf/install/version_2.3/release_MitoZ_v2.3/bin/common/genbank_gene_stat_v2.py", line 173, in get_seq_topology_and_related_sp
for rec in SeqIO.parse(gbfile, 'gb'):
File "/home/zf/install/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 609, in parse
for r in i:
File "/home/zf/install/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/Bio/GenBank/Scanner.py", line 480, in parse_records
record = self.parse(handle, do_features)
File "/home/zf/install/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/Bio/GenBank/Scanner.py", line 464, in parse
if self.feed(handle, consumer, do_features):
File "/home/zf/install/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/Bio/GenBank/Scanner.py", line 431, in feed
self._feed_first_line(consumer, self.line)
File "/home/zf/install/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/Bio/GenBank/Scanner.py", line 1285, in _feed_first_line
'Name and length collide in the LOCUS line:\n' + line
AssertionError: Name and length collide in the LOCUS line:
LOCUS 1_Poeciloneta_variegata14332 bp DNA linear 01-OCT-2019

Error occured when running command:
/home/zf/install/miniconda3/envs/mitozEnv/bin/python3 /home/zf/install/version_2.3/release_MitoZ_v2.3/bin/common/genbank_gene_stat_v2.py 1_Poeciloneta_variegata_mitoscaf.fa.gbf 1_Poeciloneta_variegata.most_related_species.txt annotate > summary.txt
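
The assertion above is raised by Biopython's GenBank parser when the sequence name runs into the length field on the LOCUS line, as happens here with the 23-character ID `1_Poeciloneta_variegata`. One possible workaround is to shorten long FASTA IDs before annotation. The sketch below is only an illustration: the helper name `shorten_fasta_ids`, the `seq` prefix, and the 15-character limit are assumptions, not MitoZ settings.

```python
def shorten_fasta_ids(lines, max_len=15, prefix="seq"):
    """Return (new_lines, mapping) with over-long FASTA IDs replaced.

    max_len and prefix are illustrative assumptions, not MitoZ settings.
    """
    new_lines = []
    mapping = {}  # new ID -> original ID, so results can be traced back
    counter = 0
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            parts = header.split(None, 1)
            old_id = parts[0] if parts else ""
            rest = parts[1] if len(parts) > 1 else ""
            if len(old_id) > max_len:
                counter += 1
                new_id = f"{prefix}{counter}"
                mapping[new_id] = old_id
            else:
                new_id = old_id
            # Keep any description text (e.g. topology=...) after the ID.
            new_lines.append(">" + (new_id + " " + rest).strip())
        else:
            new_lines.append(line.rstrip("\n"))
    return new_lines, mapping


fasta = [">1_Poeciloneta_variegata topology=linear", "ACGTACGT"]
renamed, id_map = shorten_fasta_ids(fasta)
print(renamed[0])
print(id_map)
```

Keeping the mapping makes it possible to restore the original species names in downstream tables after annotation.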

The clean reads to map mitochondrial genome

hi,
Thanks to the author for providing MitoZ, which is very convenient. For the rigor of my article, I need to verify the accuracy of the mitochondrial genome assembled with MitoZ and check for the presence of pseudogenes. Can you tell me how to get a file that contains only the clean reads, or which of the generated files it is? I would like to inspect the mapping results in visualization software.

Error report with sample data

Hi,

I am using the Docker version and it works well with the test data. But it reports an error with my sample files (TS-COR_R*.fq.gz). This is the report from the m.err file:

Killed
Error occured when running command:
/project/bin/assemble/mitoAssemble all -K 71 -o work71 -s work71.soaptrans.lib -p 4

I am not able to fix this problem, pls help me.

Sridhar

About the depth distribution question

Dear professor Meng,
Regarding the sentence "if the depth larger than upper quartile, it turns dark green as same with the outline": I find no dark green matching the outline in my output images. Is there a mistake in my script, or is the input fastq data wrong? My script:
python3 MitoZ.py visualize \
--circos circos \
--gb zzz.gb \
--gc yes \
--win 50 \
--gc_fill 128,177,211 \
--run_map yes \
--bwa bwa \
--thread 2 \
--depth_fill 190,186,218 \
--fq1 clean.1.fq.gz

Thanks for your attention to this question!
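
The quoted rule (depth above the upper quartile is drawn in dark green) can be illustrated with a small sketch. How MitoZ itself computes the quartile is not stated in this thread, so the function name `above_upper_quartile`, the made-up depths, and the use of `statistics.quantiles` with its default method are all assumptions.

```python
import statistics


def above_upper_quartile(depths):
    """Flag each window whose depth exceeds the upper quartile (Q3)."""
    # statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3.
    q3 = statistics.quantiles(depths, n=4)[2]
    return [d > q3 for d in depths]


window_depths = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up per-window depths
flags = above_upper_quartile(window_depths)
print(flags)
```

If the depth is nearly uniform along the mitogenome, few or no windows exceed the upper quartile by a visible margin, which may be one reason the highlighted color is hard to spot.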

/bin/sh: cmsearch: command not found

Hi,
I didn't get all the results. Below is the end of the error file; the picture shows the files in the result folder.

error file:
/bin/sh: cmsearch: command not found
/home/depengli/release_MitoZ_v2.4-alpha/MitoZ.py:2004: DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
program_begin_clock = time.clock()
Error occured when running command:
cmsearch -g --tblout SRR8695260_.trim_mitoscaf.fa.s-rRNA.tbl --cpu 40 /home/depengli/release_MitoZ_v2.4-alpha/bin/profiles/rRNA_CM/v1.1_12snew.cm SRR8695260_.trim_mitoscaf.fa >SRR8695260_.trim_mitoscaf.fa.s-rRNA.out

Below is job file I use:
python3 /home/depengli/release_MitoZ_v2.4-alpha/MitoZ.py all --genetic_code auto --clade Arthropoda --outprefix SRR9308458_.trim \
--thread_number 40 \
--fastq1 SRR9308458_1.trim.fastq \
--fastq2 SRR9308458_2.trim.fastq \
--fastq_read_length 150 \
--insert_size 250 \
--run_mode 2 \
--filter_taxa_method 1 \
--requiring_taxa 'Arthropoda'

So how can I solve this problem? If I rerun the job, will the script check the existing files and skip or overwrite them?

Thank you!
Looking forward to your reply!

Depeng

How update NCBI database in Docker

Hi, my question is simple: how can I update the NCBI database in the MitoZ Docker container?
Thanks.

How to extract partial fastq data from large fastq files for MitoZ?

Hi,
it's not really an issue related to the program.

I would like to test your program MitoZ.
It seems to work well but i can't find raw.1.fq.gz and raw.2.fq.gz files.
I would like to first test your program with the same data you used.
Can you provide me these data ?

Issue: visualization/circos.png not found

Hi,
I noticed that some results are poor when using MitoZ to assemble and annotate mtDNA sequences; I do not know whether the cause is the sample or something else. The details are as follows: it reports the error cannot stat 'visualization/circos.png', there is no circos.png in ZZZ.result, and the assembly is very fragmented.

A question about gw2gff.pl in the MT annotation toolkit

Dear Meng:
I'm trying to port the annotation toolkit to Python for use in my own pipeline, but some numbers in the *.genewise.gff output seem unreasonably low:

So I dug through the code of MT_annotation_BGI_V1.32 and found that your wrapped version has an unexpected pair of round brackets in line 44 of gw2gff.pl. They multiply the gene length in the database by 100 before the overlap length is divided by it, which makes the reported gene coverage drop.

my $cover=sprintf("%.2f",$len/($Len{$pid}*100));

I also tracked down the original toolkit repository and found that their code is correct:

my $cover=sprintf("%.2f",$len/$Len{$pid}*100);

Was this coverage calculation intended, or is it just a mistake? My lab has already produced some results with the uncorrected toolkit, so it could be a serious problem if they had to be rerun.
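
The effect of the extra brackets can be checked with a little arithmetic. The sketch below mirrors the two Perl expressions in Python; the function names and the overlap length of 200 are made up for illustration, and 222 is simply the length of one reference protein named elsewhere on this page.

```python
def coverage_wrapped(overlap_len, ref_len):
    # Mirrors $len/($Len{$pid}*100): the reference length is multiplied by 100 first.
    return overlap_len / (ref_len * 100)


def coverage_original(overlap_len, ref_len):
    # Mirrors $len/$Len{$pid}*100: divide first, then scale to a percentage.
    return overlap_len / ref_len * 100


overlap_len, ref_len = 200, 222  # made-up overlap; 222 aa reference length
print(f"{coverage_wrapped(overlap_len, ref_len):.2f}")   # 0.01
print(f"{coverage_original(overlap_len, ref_len):.2f}")  # 90.09
```

The two expressions differ by a constant factor of 10,000, so after rounding to two decimals the bracketed form reports values near zero for any realistic overlap.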

report in file "*.errorsummary.val"

Dear Guanliang,

Thank you for providing us with such a powerful tool! I will recommend it to everyone who needs it.

BTW, given that the procedure generated a circular mitochondrial genome for my species, it is not clear to me why these errors appear in the *.errorsummary.val file. Could you explain each of the messages below?

 5 ERROR:   SEQ_FEAT.NoStop
 1 WARNING: SEQ_INST.CompleteCircleProblem
 1 INFO:    SEQ_DESCR.OrganismIsUndefinedSpecies

Thank you very much!

Fan

Please update the install.md of v2.4-alpha

The constructor of the NCBITaxa class in ete3 version 3.0.0b35 doesn't take the argument 'taxdump_file', which caused a real problem when I installed MitoZ from source code: downloading the dump file from NCBI is slow and unstable, and there is no way for NCBITaxa to use an already downloaded dump file in version 3.0.0b35. Updating the ete3 toolkit to version 3.1.1, using conda update or by specifying the version when creating the environment, solves this issue.

understanding visualization of the mitochondrial genome

Hi Linzhi,

MitoZ worked well and I got a visualization as output. I briefly checked your article but didn't find an explanation of Figure 4, "Demonstration of mitogenome visualization using MitoZ".

https://www.biorxiv.org/content/10.1101/489955v1.full

In particular, the stats inside the circle and the color of each gene (red, blue, orange...).

Can you explain the contents of this figure to me?

MitoZ pulling from Shub failed

I am using a Windows-based platform (32-bit), installed with several programs from Singularity website and running on Git Bash attached with vagrant.

When I was trying to pull MitoZ file from Singularity hub, it failed. Hence, attached herewith the return command I received from Git Bash for your reference. Please kindly assist me with this matter.

Can I use this program to assemble the fungal mitochondrial genome?

Hi,
Can I use this program to assemble the fungal mitochondrial genome?

findmitoscaf error

Hello

I have a problem with findmitoscaf. I have mitogenome sequences from a Trinity assembly, so I ran MitoZ with the flag --from_soaptrans, but MitoZ returned this error:

2020-01-09 21:53:49
/app/anaconda/bin/python3 /app/release_MitoZ_v2.4-alpha/bin/assemble/filter_by_abundance.py 10 alud_findmitoscaf.hmmtblout.besthit.sim.filtered /media/osl/trabajo/Moscas_BGI/all_alud.fasta

Traceback (most recent call last):
File "/app/release_MitoZ_v2.4-alpha/bin/assemble/filter_by_abundance.py", line 61, in <module>
main()
File "/app/release_MitoZ_v2.4-alpha/bin/assemble/filter_by_abundance.py", line 56, in main
min_abundance=float(min_abundance)
File "/app/release_MitoZ_v2.4-alpha/bin/assemble/filter_by_abundance.py", line 24, in filter_by_abundance
abun = m.group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Error occured when running command:
/app/anaconda/bin/python3 /app/release_MitoZ_v2.4-alpha/bin/assemble/filter_by_abundance.py 10 alud_findmitoscaf.hmmtblout.besthit.sim.filtered /media/osl/trabajo/Moscas_BGI/all_alud.fasta
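
The traceback shows that a regular-expression match returned `None` before `m.group(1)` was called, which is what happens when a sequence header does not contain the field the script expects. The sketch below shows the general guard only. The pattern `abun=...` and the function name `parse_abundance` are hypothetical; the real pattern used by filter_by_abundance.py is not shown in this thread.

```python
import re

# Hypothetical header pattern for illustration only; the real pattern used by
# filter_by_abundance.py is not shown in the traceback.
ABUNDANCE_RE = re.compile(r"abun=(\d+(?:\.\d+)?)")


def parse_abundance(header):
    """Return the abundance as a float, or None if the header has no such field."""
    m = ABUNDANCE_RE.search(header)
    if m is None:
        # re.search returns None on no match; calling m.group(1) here would
        # raise the AttributeError seen in the traceback.
        return None
    return float(m.group(1))


print(parse_abundance(">scaffold1 abun=35.2"))
print(parse_abundance(">TRINITY_DN100_c0_g1_i1 len=1500"))
```

Under this reading, headers written by another assembler would need an abundance field in the expected format before the filtering step can use them.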

changing MitoZ's annotating database to annotate other type of mitogenome

Hi,
I've found that MitoZ annotates mitogenomes with a database of arthropods and mammals. If I want to annotate another type of mitogenome, such as that of molluscs, I need to change the database. How can I do that?
thanks

Error while running circos

Dear @linzhi2013,

MitoZ is running fine until visualization. Here is the error message I'm getting:

perl: symbol lookup error: /usr/users/bheimbu/perl5/lib/perl5/x86_64-linux-thread-multi/auto/List/Util/Util.so: undefined symbol: Perl_xs_apiversion_bootcheck

Do I have to use a specific version of perl to run circos?

Cheers Bastian

not creating 'mitoscaf.fa.gb' file

Hi,
I'm encountering an error when running the MitoZ.py all command. The assembly and annotation run fine until about 2 hours in, when I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/jonas/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lyco_mitoscaf.fa.gbf'
Error occured when running command:
/home/jonas/anaconda3/envs/mitozEnv/bin/python3 /home/jonas/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py /home/jonas/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lyco_mitoscaf.fa.gbf /home/jonas/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lyco.most_related_species.txt all > summary.txt

I can't find the .gbf file anywhere on my system, and I can't find the command that creates it in any of the Python scripts. As it makes all the files up to this point, I'm thinking I can create the .gbf file manually with UGENE and place it in the appropriate directory. Does anyone know why that file isn't being created, or have any suggestions for me? Thanks! I'm running on Ubuntu v18. The following is the input command for MitoZ:

python3 MitoZ.py all --genetic_code 5 --clade Arthropoda --outprefix ~/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lyco --thread_number 30 --fastq1 ~/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lycopersicella_1.1.fastq --fastq2 ~/anaconda3/envs/mitozEnv/release_MitoZ_v2.4-alpha/K_lyco_mito/K_lycopersicella_1.2.fastq --fastq_read_length 150 --insert_size 250 --run_mode 2 --filter_taxa_method 1 --requiring_taxa 'Arthropoda'

Database is locked

NCBI database format is outdated. Upgrading
Downloading taxdump.tar.gz from NCBI FTP site...
Done. Parsing...
Traceback (most recent call last):
File "/home/rubens/.bin/MitoZ.py", line 2113, in <module>
check_requiring_taxa(args.requiring_taxa)
File "/home/rubens/.bin/MitoZ.py", line 1975, in check_requiring_taxa
ncbi = NCBITaxa()
File "/home/rubens/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 96, in __init__
self.update_taxonomy_database()
File "/home/rubens/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 113, in update_taxonomy_database
update_db(self.dbfile)
File "/home/rubens/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 740, in update_db
upload_data(dbfile)
File "/home/rubens/miniconda3/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 768, in upload_data
db.execute(cmd)
sqlite3.OperationalError: database is locked

NCBI installation updates

Dear MitoZ team,
Thanks for updating and maintaining the repository.
I have a question regarding installing NCBI database inside ete3 package command:

from ete3 import NCBITaxa
ncbi = NCBITaxa()
ncbi.update_taxonomy_database()

Many HPC systems do not grant users FTP access. However, the NCBI databases are already installed locally.
I went through the source code, and it seems MitoZ needs an FTP connection to create and populate the database when first executed. I see that we can change the database later on, but I don't see how to feed a local NCBI database to "ncbi = NCBITaxa()".
I am wondering if you have any suggestions regarding solving the issue.
Thanks in advance
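
Another issue on this page notes that ete3 3.1.1 accepts a `taxdump_file` argument in the `NCBITaxa` constructor, which suggests one way to avoid the FTP download. The sketch below assumes ete3 3.1.1 or later and a `taxdump.tar.gz` that has already been copied to the cluster; the helper names and the directory are placeholders.

```python
import os


def find_local_taxdump(directory, filename="taxdump.tar.gz"):
    """Return the path to a pre-downloaded taxdump archive, or None if absent."""
    path = os.path.join(directory, filename)
    return path if os.path.isfile(path) else None


def build_taxonomy_db(taxdump_path):
    """Build the ete3 taxonomy database from a local archive (no FTP needed).

    Assumes ete3 >= 3.1.1, where NCBITaxa accepts 'taxdump_file'.
    """
    from ete3 import NCBITaxa  # imported lazily so the helper above works without ete3
    return NCBITaxa(taxdump_file=taxdump_path)


# Usage (the directory is a placeholder):
# taxdump = find_local_taxdump("/path/to/ncbi")
# if taxdump is not None:
#     ncbi = build_taxonomy_db(taxdump)
```

Whether MitoZ then picks up the resulting database without further changes is not confirmed in this thread.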

Not able to use the Docker container of MitoZ

I have used this command with docker container:
fq1=test.1.fq.gz fq2=test.2.fq.gz outprefix=test docker run -v /home/sridhar/mito/ --rm guanliangmeng/mitoz:2.3 python /home/sridhar/mito/MitoZ.py all2 --genetic_code 5 --clade Arthropoda --insert_size 150 --thread_number 4 --fastq1 $fq1 --fastq2 $fq2 --outprefix $outprefix --fastq_read_length 124 1>m.log 2>m.err

Message in m.err file:

python: can't open file '/home/sridhar/mito/MitoZ.py': [Errno 2] No such file or directory

using MitoZ for annotating mitogenome generated by other assemblers

Hi,
I've got a mitochondrial assembly file through Novoplasty, and if you continue to use the Mitoz software to annotate and visualize, what can I replace with "" in the input file? I know the '' in the fasta files indicates that the nucleotide before is a possible deletion/insertion.
thanks

Problems with built-in database

I am trying to use MitoZ to annotate the mitochondrial genome of an ant assembled from target captured sequences by another application. I get the following error, which I think is internal and has to do with NCBI's taxonomy:

name gi_NC_009093_ATP6_Galendromus_occidentalis_222_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 1163.
name gi_NC_009093_ATP8_Galendromus_occidentalis_53_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 4737.
name gi_NC_009093_COX1_Galendromus_occidentalis_517_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 7701.
name gi_NC_009093_COX2_Galendromus_occidentalis_221_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 10455.
name gi_NC_009093_COX3_Galendromus_occidentalis_275_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 14025.
name gi_NC_009093_CYTB_Galendromus_occidentalis_357_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 17589.
name gi_NC_009093_ND1_Galendromus_occidentalis_270_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 21171.
name gi_NC_009093_ND4_Galendromus_occidentalis_437_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 31847.
name gi_NC_009093_ND4L_Galendromus_occidentalis_89_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 35423.
name gi_NC_009093_ND5_Galendromus_occidentalis_560_aa is not uniq at /users/PAS1032/osu9668/MitoZ/version_2.4-alpha/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 38985.

my config file is:

genetic_code = auto
clade = Arthropoda
fastq1 = ./ponera_latreille_1.fastq
fastq2 = ./ponera_latreille_2.fastq
topology = circular
outprefix = plat
thread_number = 4
fastafile = ./plat.fa

Any help will be very welcome!

Cheers!

--run_mode 3 command syntax

Hello,

I would like to use --run_mode 3, but I'm not clear about which files to use. You provide this as your example:

python3 MitoZ.py all2 --genetic_code 5 --clade Arthropoda --outprefix test \
--thread_number 12 --fastq1 clean.1.fq.gz --fastq2 clean.2.fq.gz \
--fastq_read_length 150 --insert_size 250 \
--run_mode 3 \
--filter_taxa_method 1 \
--requiring_taxa 'Arthropoda' \
--quick_mode_seq_file quickMode.fa \
--quick_mode_fa_genes_file quick_mode_fa_genes.txt \
--missing_PCGs ND4L ND6 ND2 \
--quick_mode_score_file work71.hmmtblout.besthit.sim.filtered.high_abundance_10.0X.reformat.sorted \
--quick_mode_prior_seq_file work71.hmmtblout.besthit.sim.filtered.fa

But I don't know how to write the command with these output files:

IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus.cds
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus.fasta
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus.misc_feature
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus.rrna
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus.trna
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus_mitoscaf.fa.gbf
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus_mitoscaf.fa.sqn
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus_mitoscaf.fa.tbl
IC11_AAH1795_Hymenoptera_Ichneumonidae_Campodorus_ultimus_mitoscaf.fa.val
README.txt
circos.png
circos.svg
errorsummary.val
summary.txt
work71.hmmtblout.besthit.sim.filtered.high_abundance_10.0X.reformat.sorted.Not-picked
work71.hmmtblout.besthit.sim.filtered.high_abundance_10.0X.reformat.sorted.Not-picked.fa
work71.hmmtblout.besthit.sim.filtered.low_abundance
work71.hmmtblout.besthit.sim.filtered.low_abundance.fasta
work71.mitogenome.fa
work71.most_related_species.txt

which mitogenome fasta file to use?
which work71.hmmtblout.besthit.sim.filtered files to use?
there are no clean.{1,2}.fq.gz files. Should I use python3 MitoZ.py all instead?

This is my attempt:

python3 MitoZ.py all2 --genetic_code 5 --clade Arthropoda --outprefix test \
--thread_number 12 --fastq1 clean.1.fq.gz --fastq2 clean.2.fq.gz \
--fastq_read_length 150 --insert_size 250 \
--run_mode 3 \
--filter_taxa_method 1 \
--requiring_taxa 'Arthropoda' \
--quick_mode_seq_file work71.mitogenome.fa \
--quick_mode_fa_genes_file work71.mitogenome_genes.txt \
--missing_PCGs ATP6 ATP8 COX3 ND2 ND3 ND4 ND4L ND5 \
--quick_mode_score_file work71.hmmtblout.besthit.sim.filtered.high_abundance_10.0X.reformat.sorted.Not-picked \
--quick_mode_prior_seq_file work71.hmmtblout.besthit.sim.filtered.high_abundance_10.0X.reformat.sorted.Not-picked.fa

Where work71.mitogenome_genes.txt contains the following text:

scaffold41 ND1 CYTB ND6
C5251 COX2 COX1

This is from the summary.txt

#Seq_id Length(bp) Circularity Closely_related_species
scaffold41 2499 no Diadegma semiclausum
C5251 2608 no Diadegma semiclausum

#Seq_id Start End Length(bp) Direction Type Gene_name Gene_prodcut Total_freq_occurred

scaffold41 21 930 910 + CDS ND1 NADH dehydrogenase subunit 1 1
scaffold41 933 1000 68 - tRNA trnS(uga) tRNA-Ser 1
scaffold41 1032 2191 1160 - CDS CYTB cytochrome b 1
scaffold41 2183 2486 304 - CDS ND6 NADH dehydrogenase subunit 6 1
C5251 92 156 65 - tRNA trnD(guc) tRNA-Asp 1
C5251 155 227 73 - tRNA trnK(cuu) tRNA-Lys 1
C5251 227 903 677 - CDS COX2 cytochrome c oxidase subunit II 1
C5251 962 2501 1540 - CDS COX1 cytochrome c oxidase subunit I 1

Protein coding genes totally found: 5
tRNA genes totally found: 3
rRNA genes totally found: 0

Genes totally found: 8

Potential missing genes:
#Gene total_missing_number

ATP6 1
ATP8 1
COX3 1
ND2 1
ND3 1
ND4 1
ND4L 1
ND5 1
l-rRNA 1
s-rRNA 1
tRNA-Ala 1
tRNA-Arg 1
tRNA-Asn 1
tRNA-Cys 1
tRNA-Gln 1
tRNA-Glu 1
tRNA-Gly 1
tRNA-His 1
tRNA-Ile 1
tRNA-Leu 2
tRNA-Met 1
tRNA-Phe 1
tRNA-Pro 1
tRNA-Ser 1
tRNA-Thr 1
tRNA-Trp 1
tRNA-Tyr 1
tRNA-Val 1

The missing genes might be foud from the
'.high_abundance' and '.low_abundance' files!
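
The gene list file and the `--missing_PCGs` values in the attempt above can be derived mechanically from the gene rows of summary.txt. The sketch below is one way to do that; it follows the file layout used in the attempt (sequence ID followed by its protein-coding genes) and is not confirmed as the format MitoZ requires. The function name `pcgs_by_sequence` and the `EXPECTED_PCGS` list are assumptions.

```python
from collections import defaultdict

# The 13 protein-coding genes typically found in animal mitogenomes,
# using the gene names that appear in the summary above.
EXPECTED_PCGS = ["ATP6", "ATP8", "COX1", "COX2", "COX3", "CYTB", "ND1",
                 "ND2", "ND3", "ND4", "ND4L", "ND5", "ND6"]


def pcgs_by_sequence(summary_lines):
    """Group CDS gene names by sequence ID from summary.txt gene rows."""
    genes = defaultdict(list)
    for line in summary_lines:
        fields = line.split()
        # Gene rows: Seq_id Start End Length Direction Type Gene_name ...
        if len(fields) >= 7 and not line.startswith("#") and fields[5] == "CDS":
            genes[fields[0]].append(fields[6])
    return dict(genes)


rows = [
    "scaffold41 21 930 910 + CDS ND1 NADH dehydrogenase subunit 1 1",
    "scaffold41 933 1000 68 - tRNA trnS(uga) tRNA-Ser 1",
    "scaffold41 1032 2191 1160 - CDS CYTB cytochrome b 1",
    "scaffold41 2183 2486 304 - CDS ND6 NADH dehydrogenase subunit 6 1",
    "C5251 227 903 677 - CDS COX2 cytochrome c oxidase subunit II 1",
    "C5251 962 2501 1540 - CDS COX1 cytochrome c oxidase subunit I 1",
]
found = pcgs_by_sequence(rows)
for seq_id, names in found.items():
    print(seq_id, " ".join(names))

present = {name for names in found.values() for name in names}
missing = [g for g in EXPECTED_PCGS if g not in present]
print(" ".join(missing))
```

For the rows shown, this reproduces the two lines of work71.mitogenome_genes.txt and the eight genes passed to `--missing_PCGs` in the attempt.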

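How can I assemble mitochondrial genomes for the phylum Nematoda?

I tried modifying the following in MitoZ.py:
search_and_annot_mito_parser.add_argument("--clade", default="Arthropoda",
choices=["Nematoda", "Arthropoda"]
But the assembly still cannot be completed. What else needs to be modified?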

How to optimize the assembly result and ensure accuracy

Hello Guanliang

I tried to assemble a mitogenome from WGS data, but the result is two linear sequences. I have several questions:
1. Would extracting more data improve the assembly (such as 8 Gb with the parameter --fq_size 8)?
2. The missing part is the control region, not PCGs, so are all parameters required when rerunning with --run_mode 3 (such as "missing_PCGs =")?
3. Can I use a closely related species' mitogenome sequence in MitoZ?
4. How can I run all2 mode without the extraction step when rerunning with --run_mode 3?

Thank you!
Depeng

start codon position problem

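This is probably related to problems in the annotation of the protein-coding gene itself in the reference database. Most likely, the annotation uploaded for this gene in the database starts downstream of the position MitoZ currently reports (if the gene is on the plus strand), so when MitoZ aligns the reference protein to your sequence, the alignment can only reach this position.

MitoZ automatically searches upstream for a start codon and stops at the first one it finds.

Is your idea that, as long as the gene can be extended upstream, the coordinate should keep being pushed upstream, even if the finally predicted gene ends up very long?

Because the lengths in the reference database are not necessarily correct either.

What if the reference database itself has problems?

It is just that I think bacteria, mitochondria and the like would not have extra non-coding regions for no reason.

Please give it some thought too. Really, this kind of thing is best verified with transcriptome data.
But my extension rule is also only my own opinion.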
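
The behavior discussed in this thread (searching upstream from the aligned position and stopping at the first start codon) can be sketched as follows. This is not MitoZ's implementation: the function name `find_upstream_start`, the plus-strand-only handling, the halt at an in-frame stop codon, and the codon sets are all assumptions for illustration.

```python
def find_upstream_start(seq, aligned_start, start_codons, stop_codons):
    """Scan upstream in codon steps from a 0-based, in-frame position.

    Returns the index of the first start codon found, or None if an in-frame
    stop codon or the sequence start is reached first.
    """
    pos = aligned_start
    while pos >= 0:
        codon = seq[pos:pos + 3].upper()
        if codon in start_codons:
            return pos  # stop at the first start codon encountered
        if codon in stop_codons:
            return None  # cannot extend through a stop codon
        pos -= 3
    return None


# Illustrative codon sets; real sets depend on the genetic code in use.
starts = {"ATG"}
stops = {"TAA", "TAG"}
print(find_upstream_start("TAAATGAAACCCGGG", 9, starts, stops))
print(find_upstream_start("ATGTAAAAACCCGGG", 9, starts, stops))
```

The alternative raised in the thread, extending as far upstream as possible, would keep scanning after the first hit and return the last start codon seen before a stop codon, at the cost of possibly over-long gene predictions.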

biopython not found

$ python3 MitoZ.py
package biopython not found! Please install it!

I have installed using miniconda and exported to path:
export PATH=~/miniconda2/pkgs/biopython-1.70-np112py27_1:$PATH
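
Two details in the report above may be worth checking. The exported directory name contains `py27`, which suggests a Python 2.7 build of Biopython, while the command is run with `python3`. In addition, `PATH` controls where the shell looks for executables, not where Python looks for packages. A minimal sketch for checking what the running interpreter can actually import (the helper name `module_available` is made up):

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can import the module."""
    return importlib.util.find_spec(name) is not None


print("interpreter:", sys.executable)
print("python version:", sys.version_info.major, sys.version_info.minor)
print("Bio importable:", module_available("Bio"))
```

Running this with the same `python3` used for MitoZ.py shows whether Biopython is installed for that interpreter.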

A suggestion for findmitoscaf program design

The program "findmitoscaf" is a very convenient tool to find mitochondrial scaffolds. It calculates the mapping depth of each scaffold and employs HMM search against a protein database. Obviously, HMM searching is faster than BWA mapping, so I think you could do HMM first to avoid much computational consumption.

Discussion group about how to use MitoZ

Dear professor Meng
Thanks for your software, MitoZ; it is a very convenient tool for me. However, errors sometimes occur because of my own mistakes when running MitoZ, so an online discussion group for users might help: users who have questions could then help each other promptly. I created a discussion group on QQ; the number is 818630967. I sincerely hope you can join us, and other users are welcome too. Thanks again for your help.

installation error: packages incompatibilities

Hi,

I have tried to install MitoZ on a Linux server and a macOS laptop, and in both cases the installation failed due to incompatibilities between Python packages.

Here is the error message I obtained on my Mac:

conda create -n mitozEnv libgd=2.2.4 python=3.6.0 biopython=1.69 ete3=3.0.0b35 perl-list-moreutils perl-params-validate perl-clone circos=0.69 perl-bioperl blast=2.2.31 hmmer=3.1b2 bwa=0.7.12 samtools=1.3.1 infernal=1.1.1 tbl2asn openjdk

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Package libiconv conflicts for:
libgd=2.2.4 -> libiconv
Package libidn11 conflicts for:
tbl2asn -> libidn11
Package perl-exporter-tiny conflicts for:
perl-list-moreutils -> perl-exporter-tiny
Package perl-time-hires conflicts for:
circos=0.69 -> perl-time-hires
Package perl-readonly conflicts for:
circos=0.69 -> perl-readonly
Package xz conflicts for:
python=3.6.0 -> xz=5.2
Package perl-bio-tools-phylo-paml conflicts for:
perl-bioperl -> perl-bio-tools-phylo-paml
...
...

I'm pretty sure I'm doing something wrong since I am apparently the only one getting such error.
Any help would be appreciated.

thanks
Romain

Also working with single end?

Hello, I just discovered the preprint on bioRxiv and came over here to check out your program. I was wondering whether it would work with single-end data. If not, is there a way to trick it into doing so anyway? I know it sounds like a strange question, but our sequencing run failed halfway through (long story) and we could recover only one side of the reads.

A question about how to find the missing genes

Dear professor Meng,
I have a problem I can't solve. Sometimes the annotated mitogenome is missing tRNA genes and PCGs. How can I recover mitochondrial genomes that are as complete as possible? I tried the multi-kmer mode, but the results were unsatisfactory. What should I do next?
sincerely,
Nan Zhou

Annotation Issue: Genewise Error

I have successfully installed the docker version of MitoZ 2.4alpha and ran the test data set with no problem. I am interested in the annotation and visualization models. I have tried only using the annotation module, with a consensus sequence from a full length amplicon, and it will not proceed past the 'run the genewise shell file'. I copied a part of the terminal output, it proceeded to go through a number of other species besides O. porcinus with the same error, that it cannot read the fasta file. The input file is formatted appropriately and has a sequence ID less than 10 characters with "topology=circular" included. Any help would be appreciated!

$/project# python3 /app/release_MitoZ_v2.4-alpha/MitoZ.py annotate --genetic_code 5 --clade Arthropoda --thread_number 4 --fastafile travis.consensus.fasta --outprefix nano
2019-12-15 06:20:09
export WISECONFIGDIR=/app/release_MitoZ_v2.4-alpha/bin/annotate/wisecfg
perl /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl -i nano_mitoscaf.fa -d /app/release_MitoZ_v2.4-alpha/bin/profiles/MT_database/Arthropoda_CDS_protein.fa -o ./ -g 5 -cpu 4

makeblastdb -in nano_mitoscaf.fa -dbtype nucl
run the tblastn shell file
Run solar to conjoin HSPs and filter bad HSPs and redundance.
preparing genewise input directories and files
name gi_NC_009093_ATP6_Galendromus_occidentalis_222_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 1163.
name gi_NC_009093_ATP8_Galendromus_occidentalis_53_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 4737.
name gi_NC_009093_COX1_Galendromus_occidentalis_517_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 7701.
name gi_NC_009093_COX2_Galendromus_occidentalis_221_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 10455.
name gi_NC_009093_COX3_Galendromus_occidentalis_275_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 14025.
name gi_NC_009093_CYTB_Galendromus_occidentalis_357_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 17589.
name gi_NC_009093_ND1_Galendromus_occidentalis_270_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 21171.
name gi_NC_009093_ND4_Galendromus_occidentalis_437_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 31847.
name gi_NC_009093_ND4L_Galendromus_occidentalis_89_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 35423.
name gi_NC_009093_ND5_Galendromus_occidentalis_560_aa is not uniq at /app/release_MitoZ_v2.4-alpha/bin/annotate/MT_annotation_BGI_V1.32/MT_annotation_BGI.pl line 240, line 38985.
run the genewise shell file
running genewise
Warning Error
Cannot open

caf.fa.genewise/001/gi_NC_005820_ND3_Ornithodoros_porcinus_111_aa_Nan
for read_fasta_file
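
The "is not uniq" lines in the log above point at repeated sequence names in the protein database. Whether they are related to the genewise failure is not clear from the output. As a rough diagnostic, here is a minimal Python sketch that lists repeated IDs in a FASTA file; `duplicate_fasta_ids` is a hypothetical helper written for illustration and is not part of MitoZ.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_fasta_ids(lines: Iterable[str]) -> List[str]:
    """Return sequence IDs that occur more than once, in first-seen order."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith(">"):
            # The ID is taken as the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            if fields:
                counts[fields[0]] += 1
    return [seq_id for seq_id, n in counts.items() if n > 1]


# Small in-memory example; real use would pass an open file handle instead.
example = [">seqA first copy", "MKT", ">seqB", "MLL", ">seqA second copy", "MKT"]
print(duplicate_fasta_ids(example))
```

To run it on a real database, pass an open file handle, for example `duplicate_fasta_ids(open("Arthropoda_CDS_protein.fa"))`.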

Reads that map to the mitogenome

From the visualization results, I found that some of the mapped reads are not mitochondrial reads; I think they may come from pseudogenes. Can you suggest a method to remove these reads, or do you think this will not have a harmful effect?

The topology of sequence ID

In the section of your manual on rearranging the mitochondrial genome sequence, I would like to know whether the sequence ID is that of the reference mitogenome from NCBI.

Mitogenome_reorder error

Dear Guanliang,
Thank you for updating the repository.
I have an issue with Mitogenome_reorder in v2.4.
I keep receiving the error "can not find topology=[circular|linear] in seqID".
However, my headers (sequence IDs) look like this:

scaffold26058;len=15661;topology=linear <<< for mito.fa >>>
>scaffold26058;len=18661;topology=circular <<< for reference.fasta >>>

Thanks in advance
Arsalan
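
How Mitogenome_reorder.py parses the header is not shown in this thread, so the following is only a diagnostic sketch. It relies on the common FASTA convention that the sequence ID is the first whitespace-delimited token and the rest of the line is the description. It shows where a `topology=` token ends up for a given header; `inspect_header` is a hypothetical helper, not MitoZ code.

```python
import re
from typing import Optional, Tuple

TOPOLOGY_RE = re.compile(r"topology=(circular|linear)")


def inspect_header(header: str) -> Tuple[str, str, Optional[str]]:
    """Split a FASTA header into (seq_id, description, topology)."""
    text = header.lstrip(">").strip()
    parts = text.split(None, 1)  # split once on any run of whitespace
    seq_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    match = TOPOLOGY_RE.search(text)
    return seq_id, description, match.group(1) if match else None


for header in (
    ">scaffold26058;len=15661;topology=linear",
    ">scaffold26058 len=15661 topology=linear",  # space-separated variant, for comparison
):
    print(inspect_header(header))
```

With semicolons and no spaces, the whole header becomes the sequence ID and the description is empty. Whether that matters to the script is an assumption that would need checking against its source.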

Corrupt release_MitoZ_v2.4-alpha.tar.bz2

Hi,

I could not unzip the package. It gave this error: bzip2: (stdin) is not a bzip2 file.
tar: Child returned status 2
tar: Error is not recoverable: exiting now

This also happened when I tried to unzip the example files. I think they are all corrupt.
Please assist. Thank you very much!
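
A bzip2 stream starts with the three bytes `BZh`, so a quick look at the first bytes of the download can tell whether the file is bzip2 at all. One common cause of this message is a download that saved a web page or a different compression format under the `.tar.bz2` name, but that is only a guess for this case. A minimal Python sketch, where `sniff` is a hypothetical helper:

```python
import bz2
import gzip


def sniff(head: bytes) -> str:
    """Guess the file type from its first bytes."""
    if head[:3] == b"BZh":
        return "bzip2"
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    if head.lstrip()[:1] == b"<":
        # Typical of an HTML or XML page saved instead of the archive.
        return "html-or-xml"
    return "unknown"


# Self-contained demonstration on in-memory data.
print(sniff(bz2.compress(b"example")))
print(sniff(gzip.compress(b"example")))
print(sniff(b"<!DOCTYPE html><html>"))
```

For a real file, read the first few hundred bytes in binary mode, for example `sniff(open("release_MitoZ_v2.4-alpha.tar.bz2", "rb").read(512))`.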

Error in genbank_gene_stat_v2.py

Hi Guanliang!

I installed MitoZ via Docker (v2.3 and v2.4-alpha) and tried to analyze a mitogenome, but I always see the same error.

Command-line variants I tried:

python3 /app/release_MitoZ_v2.4-alpha/MitoZ.py all --genetic_code 1 --clade Chordata --thread_number 13 --outprefix duck-01 --fastq1 duck-01_S10_L001_R1_001.fastq.gz --fastq2 duck-01_S10_L001_R2_001.fastq.gz --fastq_read_length 300 --insert_size 300 --run_mode 2 --filter_taxa_method 1 --requiring_taxa 'Chordata'
python3 /app/release_MitoZ_v2.4-alpha/MitoZ.py annotate --fastafile duck.fa --genetic_code 1 --clade Chordata --thread_number 13 --outprefix duck.fa
python3 /app/release_MitoZ_v2.3/MitoZ.py annotate --fastafile duck.fa --genetic_code 1 --clade Chordata --thread_number 13 --outprefix duck.fa

In all cases the error was of this type:
/app/anaconda/bin/python3 /app/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py duck-01_mitoscaf.fa.gbf duck-01.most_related_species.txt all > summary.txt

Traceback (most recent call last):
  File "/app/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py", line 238, in <module>
    main()
  File "/app/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py", line 215, in main
    gene_infor, gene_freq = gene_stat(gbfile)
  File "/app/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py", line 97, in gene_stat
    print(ass_num, "Warning: NO gene or product tag! this gene is not output!\n")
NameError: name 'ass_num' is not defined
Error occured when running command:
/app/anaconda/bin/python3 /app/release_MitoZ_v2.4-alpha/bin/common/genbank_gene_stat_v2.py duck-01_mitoscaf.fa.gbf duck-01.most_related_species.txt all > summary.txt

Could you help with this bug?

Best wishes,
Marsel

Unknown error in raw_reads_filter_v0.5.pl

I am getting this error on my first run of MitoZ. Could you kindly help me run MitoZ successfully?

python3 MitoZ.py all --genetic_code 1 --clade Chordata --thread_number 13 --outprefix duck-01 --fastq1 TS-COR_R1.fq.gz --fastq2 TS-COR_R2.fq.gz --fastq_read_length 300 --insert_size 300 --run_mode 2 --filter_taxa_method 1 --requiring_taxa 'Chordata'
MitoZ.py:1942: DeprecationWarning: time.clock has been deprecated in Python 3.3 and will be removed from Python 3.8: use time.perf_counter or time.process_time instead
program_begin_clock = time.clock()
2020-01-21 16:59:57
perl /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl -1 /home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R1.fq.gz -2 /home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R2.fq.gz -3 /home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.1.fq.gz -4 /home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.2.fq.gz -m 3 -l 15 -n 10 -q 55,20 -z

input files:
/home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R1.fq.gz
/home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R2.fq.gz
output files:
/home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.1.fq.gz
/home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.2.fq.gz
filter parameters:
k, keep only bases between BEG and END for each read: full length read
n, the maximun N allowed in single end reads: 10
q, the maixmum percentage of low quality (ASCII code's integer) bases allowed in single end reads: 55,20
d, Filtering duplications: False

Use of uninitialized value $r01 in chomp at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 181, line 25348.
Use of uninitialized value $r02 in chomp at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 182, line 25348.
Use of uninitialized value $r03 in chomp at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 183, line 25348.
Use of uninitialized value $r04 in chomp at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 184, line 25348.
Use of uninitialized value $r01 in exists at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 195, line 25348.
Use of uninitialized value $r02 in transliteration (tr///) at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 199, line 25348.
Use of uninitialized value $seq_len in subtraction (-) at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 261, line 25348.
Use of uninitialized value $seq_len in division (/) at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 265, line 25348.
Illegal division by zero at /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl line 265, line 25348.
Error occured when running command:
perl /home/sridhar/mito/release_MitoZ_v2.3/bin/filter/raw_reads_filter_v0.5.pl -1 /home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R1.fq.gz -2 /home/sridhar/mito/release_MitoZ_v2.3/TS-COR_R2.fq.gz -3 /home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.1.fq.gz -4 /home/sridhar/mito/release_MitoZ_v2.3/tmp/duck-01.cleandata/clean.2.fq.gz -m 3 -l 15 -n 10 -q 55,20 -z
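
The warnings above all appear at the same input line and end in a division by a zero sequence length. That would be consistent with the filter script running out of lines in one of the input files, for instance because a FASTQ file is truncated or the two files hold different numbers of reads. This is an assumption, not something the log confirms. A minimal Python sketch for checking that a FASTQ file consists of complete four-line records; `fastq_stats` and `open_fastq` are hypothetical helpers:

```python
import gzip
from typing import Iterable, TextIO, Tuple


def fastq_stats(lines: Iterable[str]) -> Tuple[int, int, int]:
    """Return (complete_records, malformed_records, leftover_lines)."""
    records = malformed = 0
    buffer = []
    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == 4:
            header, seq, plus, qual = buffer
            # A well-formed record has '@' and '+' markers and equal-length
            # sequence and quality strings.
            if not header.startswith("@") or not plus.startswith("+") or len(seq) != len(qual):
                malformed += 1
            records += 1
            buffer = []
    return records, malformed, len(buffer)


def open_fastq(path: str) -> TextIO:
    """Open a gzipped FASTQ file as text; a truncated gzip stream raises EOFError on read."""
    return gzip.open(path, "rt")
```

Running `fastq_stats(open_fastq("TS-COR_R1.fq.gz"))` and the same for the R2 file should give equal record counts with zero malformed records and zero leftover lines if the pair is intact.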

MitoZ low abundance (<10X) error (providing too much data)

Dear Guanliang,
I am currently trying to assemble the mitochondrial genomes of two fish species using paired-end NGS data. Some analyses run without any error, but others fail with the same MitoZ Singularity parameters. The samples are as follows:

Sample Name    Read Number      Sequence Length (bp)
Oe1            56117045 (x2)    150
Oe2            88703882 (x2)    150
Yt2            60001863 (x2)    150
Yt3            61976186 (x2)    150

Code:
MitoZ.simg all --genetic_code 2 --clade Chordata --outprefix Oe1 --thread_number 28 --fastq1 Oe1_FDSW190704924-1a_HN2HFDSXX_L3_1.fq.gz --fastq2 Oe1_FDSW190704924-1a_HN2HFDSXX_L3_2.fq.gz --fastq_read_length 150 --insert_size 250 --run_mode 2 --filter_taxa_method 1 --requiring_taxa 'Actinopterygii'

The Oe1 and Yt3 samples ran perfectly and produced results, whereas the Oe2 and Yt2 samples gave errors (logs attached). Everything seems to be OK, but we could not obtain the ".fasta" file for these samples. We would greatly appreciate your help with this issue.
slurm-Oe2.txt
slurm-Yt2.txt
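
For reference, the data volume implied by the table can be worked out as read pairs × 2 × read length. The short Python sketch below does that arithmetic, assuming "(x2)" in the table means the number given is the count of read pairs; `total_bases` is a hypothetical helper.

```python
def total_bases(read_pairs: int, read_length: int) -> int:
    """Total sequenced bases for paired-end data: pairs x 2 mates x read length."""
    return read_pairs * 2 * read_length


# Read-pair counts copied from the table in this issue.
samples = {"Oe1": 56117045, "Oe2": 88703882, "Yt2": 60001863, "Yt3": 61976186}

for name, pairs in samples.items():
    print(f"{name}: {total_bases(pairs, 150) / 1e9:.1f} Gbp")
```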

Java error at MiTFi step

Hi!
I'd like to use MitoZ for its assemble and annotate steps.
Currently, "annotate" gives me a Java error with the test data you provide. I run the "assemble" step followed by the "annotate" step as below:

#assemble module
cd /work/BourguignonU/BUCEK/Kalos/MitoZ_assemble_out
fastqDir=/work/BourguignonU/BUCEK/Kalos/sandbox
for sample in test
do MitoZ.py assemble --genetic_code 5 --clade Arthropoda --outprefix $sample \
--thread_number 10 \
--fastq1 ${fastqDir}/test.1.fq.gz \
--fastq2 ${fastqDir}/test.2.fq.gz \
--fastq_read_length 150 \
--insert_size 250 \
--run_mode 2 \
--filter_taxa_method 1 \
--requiring_taxa 'Arthropoda'
done

#annotate module
cd /work/BourguignonU/BUCEK/Kalos/MitoZ_assemble_out/test.result
for sample in test
do MitoZ.py annotate --genetic_code 5 --clade Arthropoda \
--outprefix ${sample} --thread_number 10 \
--fastafile work71.mitogenome.fa 
done

The error is:

MiTFi - mitochondrial tRNA finder v0.1
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: begin 2, end 0, length 14
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3319)
at java.base/java.lang.String.substring(String.java:1874)
at mitfi.Main.main(Main.java:221)
Error occured when running command:
cd /home/a/ales-bucek/bin/release_MitoZ_v2.3/bin/annotate/mitfi
java -Xmx2048m -jar mitfi.jar -cores 1 -code 5 -evalue 0.001 -onlycutoff /work/BourguignonU/BUCEK/Kalos/MitoZ_assemble_out/test.result/tmp/test.annotation/test_mitoscaf.fa.C5 >>/work/BourguignonU/BUCEK/Kalos/MitoZ_assemble_out/test.result/tmp/test.annotation/test_mitoscaf.fa.trna

Since this happens with the test data, I believe it is something specific to my system rather than an actual MitoZ bug, but if you have any suggestions on how to fix it I'd be thankful for them!

error -K 71 -o work71 -s work71.soaptrans.lib -p 8

Hi,

I ran into a problem when running the 'all' module.

This is the information:
Error occured when running command:
/home/XXXXX/release_MitoZ_v2.4-alpha/bin/assemble/mitoAssemble all -K 71 -o work71 -s work71.soaptrans.lib -p 8

I used clean data (4 Gbp).
I got this error after the 'all' module had been running for 10 hours, but I am not sure what the error '-K 71 -o work71 -s work71.soaptrans.lib -p 8' means.

Could you please give me some advice?

Cheers,

David

Question about parameter insert size

Hi,
I am trying to get mitochondrial genomes from some SRA data.
How can I find out the insert size of the data if the authors didn't list this information?
Can I use --insert_size 250 for all the data, regardless of the actual insert size?

For example:
Run=SRR9308458: Is this file too large (over 200 GB) for MitoZ? How can I find out the insert size?
Run=SRR8695259: The insert size is 3000 bp (according to the information in Design). Is it too long for MitoZ?

Looking forward to your reply!

Sincerely,
Depeng

disk I/O error

from ete3 import NCBITaxa
ncbi = NCBITaxa()
NCBI database format is outdated. Upgrading
Downloading taxdump.tar.gz from NCBI FTP site (via HTTP)...
Done. Parsing...
Loading node names...
2194198 names loaded.
213209 synonyms loaded.
Loading nodes...
2194198 nodes loaded.
Linking nodes...
Tree is loaded.
Updating database: /share/home/zhaojun01/.etetoolkit/taxa.sqlite ...
2194000 generating entries...
Uploading to /share/home/zhaojun01/.etetoolkit/taxa.sqlite
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/share/home/zhaojun01/.conda/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 120, in __init__
    self.update_taxonomy_database(taxdump_file)
  File "/share/home/zhaojun01/.conda/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 129, in update_taxonomy_database
    update_db(self.dbfile)
  File "/share/home/zhaojun01/.conda/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 760, in update_db
    upload_data(dbfile)
  File "/share/home/zhaojun01/.conda/envs/mitozEnv/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 791, in upload_data
    db.execute(cmd)
sqlite3.OperationalError: disk I/O error
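
The error is raised by SQLite while ete3 writes `taxa.sqlite` under the home directory. Possible causes include a full quota or a file system that does not support SQLite's file locking well, such as some network mounts. None of that is confirmed by the log. As a rough check, this Python sketch tests whether SQLite can create and write a small database in a given directory; `can_write_sqlite` is a hypothetical helper, not part of ete3 or MitoZ.

```python
import os
import sqlite3


def can_write_sqlite(directory: str) -> bool:
    """Try to create, fill and commit a throwaway SQLite database in `directory`."""
    path = os.path.join(directory, f"sqlite_write_test_{os.getpid()}.db")
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
            conn.commit()
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        # Includes sqlite3.OperationalError such as "disk I/O error".
        return False
    finally:
        # Remove the test file if it was created.
        if os.path.exists(path):
            os.remove(path)
```

Comparing `can_write_sqlite(os.path.expanduser("~/.etetoolkit"))` with the result for a local scratch directory may help narrow down whether the location is the problem.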

Mitogenome_reorder Error

Dear Guanliang,

I hope that you are doing well, and thank you again for providing MitoZ; it is really useful for us. I have a question about reordering newly assembled mitogenomes using the following command:

python3 /okyanus/users/veldem/05.Miscellaneous/MitoZ/MitoZ/version_2.3/useful_scripts/Mitogenome_reorder.py -f Oe2.fasta -r Schistura_longa.fasta

When I run the above command, it gives this error:

can not find topology=[circular|linear] in seqID

But the FASTA header already contains this information:

C720972;len=16571;topology=circular

Am I missing something needed for this job? Many thanks in advance for your help!

Could not open blosum62.bla as a filename for read Blast matrix

I'm sorry to disturb you so many times, but I still have some questions to ask you.

Warning Error
Could not open blosum62.bla as a filename for read Blast matrix
Warning Error
Could not read Comparison matrix file in blosum62.bla
Fatal Error
Could not build objects!

mitoz.log

python3 $DIR_mitoz/MitoZ.py all --genetic_code 5 --clade Arthropoda --outprefix $abb \
--thread_number 8 \
--fastq1 $fq1 \
--fastq2 $fq2 \
--fastq_read_length 150 \
--insert_size 250 \
--run_mode 2 \
--filter_taxa_method 1 \
--requiring_taxa 'Arthropoda' >> mitoz.log 2>&1

findmitoscaf.log
python3 $DIR_mitoz/MitoZ.py findmitoscaf --genetic_code 5 --clade Arthropoda --outprefix $abb \
--thread_number 8 \
--from_soaptrans \
--fastafile $scafSeq >> findmitoscaf.log 2>&1

I hit this problem running it both ways, and I could not find the file blosum62.bla.
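
The log in the "Annotation Issue: Genewise Error" report earlier on this page shows MitoZ exporting `WISECONFIGDIR` (pointing at `bin/annotate/wisecfg`) before the annotation step, and GeneWise looks for its configuration files, including the comparison matrix, in that directory. A minimal Python sketch that checks whether the variable is set and whether `blosum62.bla` is present there; `find_matrix` is a hypothetical helper, and whether this is the actual cause here is an assumption.

```python
import os
from typing import Mapping, Optional


def find_matrix(env: Mapping[str, str], name: str = "blosum62.bla") -> Optional[str]:
    """Return the path to the matrix file inside WISECONFIGDIR, or None if not found."""
    config_dir = env.get("WISECONFIGDIR")
    if not config_dir:
        return None  # variable unset or empty
    candidate = os.path.join(config_dir, name)
    return candidate if os.path.isfile(candidate) else None


# Check the current environment.
print(find_matrix(os.environ))
```

If this prints `None`, either the variable is not set in the shell that runs the command or the file is missing from the configured directory.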

Error in annotate module

Dear Guanliang Meng,
I obtained a mitochondrial assembly with MitoZ and rearranged the FASTA file based on a reference mitogenome. When I use the MitoZ annotate module to annotate and visualize the new FASTA file, it reports an error.

I have tried it in Docker environments (Ubuntu 18.04.1, with both MitoZ 2.3 and MitoZ 2.4-alpha) and with different input files (the rearranged mitochondrial FASTA file generated by Mitogenome_reorder.py, the original assembly generated by MitoZ 2.3 all2, and another mitogenome downloaded from NCBI). The errors are almost the same in every case.

Here is the log file.
anno.log
The input file was generated by MitoZ 2.3 with the following command.

python3 /app/release_MitoZ_v2.3/MitoZ.py all2 --genetic_code 2 --clade Chordata --outprefix ${g}_${b} \
--thread_number 20 \
--fastq1 ./${g}_${b}_R1.fq.gz \
--fastq2 ./${g}_${b}_R2.fq.gz \
--fastq_read_length 150 \
--insert_size 250 \
--run_mode 2 \
--filter_taxa_method 1 \
--requiring_taxa 'Rodentia' \
1>${g}_${b}.log 2>${g}_${b}.err

thanks


#Seq_id Start End Length(bp) Direction Type Gene_name Gene_prodcut Total_freq_occurred


Protein coding genes totally found: 5
tRNA genes totally found: 3
rRNA genes totally found: 0

Potential missing genes:
#Gene total_missing_number