alexamk / decrippter
Genome mining tool for novel RiPP BGCs
License: GNU Affero General Public License v3.0
Hi, I tried to install the tool into a conda environment, since Python 2 is still installable from conda, but had no luck. numpy
does not seem to be available for Python 2.
VERSION=0.0.0
URL=https://github.com/Alexamk/decRiPPter
ENVNAME="decrippter"
conda create -y -n $ENVNAME-$VERSION python=2.7
conda activate $ENVNAME-$VERSION
conda install -c conda-forge scikit-learn=0.11 biopython=1.76 scipy=1.2.3 matplotlib=2.2.5 networkx=2.2 numpy=1.16.6
conda install -c bioconda blast=2.6 diamond=0.9.31.132 hmmer=3.1b2 mcl muscle prodigal antismash
Unfortunately, the first conda install command already fails. Any hints on installation or alternative tools would be appreciated.
Thanks
Hello,
The following error occurred when I ran python genecluster_formation -o path/to/output PROJECT_NAME:
Traceback (most recent call last):
File "gene_cluster_formation.py", line 1121, in <module>
precursor_group_names,precursor_operon_collections,ops,original_paris_prec,sp,sg = group_precursors.main(operons,mibig_domaindict,settings)
File "/home/lab/software/decRiPPter/lib/group_operons_precursors.py",
muscle_operator(operon_collections,path,settings['cores'])
File "/home/lab/software/decRiPPter/lib/group_operons_precursors.py", line 193, in muscle_operator
out_text = ' '.join(results)
TypeError: sequence item 0: expected string, exceptions.ValueError found
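The TypeError suggests that results holds exception objects rather than strings, so the join hides the underlying ValueError. Below is a minimal sketch of how the real error could be surfaced, assuming results is a list in which worker calls returned either output strings or caught exceptions. join_results is a hypothetical helper, not part of decRiPPter.

```python
def join_results(results):
    """Join string results, raising the first captured exception instead.

    Hypothetical helper: assumes `results` mixes output strings with
    exception objects that worker calls returned rather than raised.
    """
    errors = [r for r in results if isinstance(r, Exception)]
    if errors:
        # Re-raise the original error so its message becomes visible
        raise errors[0]
    return ' '.join(results)


if __name__ == "__main__":
    print(join_results(["seq1", "seq2"]))
    try:
        join_results([ValueError("bad alignment input"), "seq2"])
    except ValueError as exc:
        print("underlying error: %s" % exc)
```

If this is what is happening, the message of the re-raised ValueError should point to the step that actually failed, which may well be the muscle call itself.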
Python 2 reached end of life long ago. When will this tool be updated to Python 3?
Please also state that numpy needs to be installed before scipy; otherwise the install from requirements.txt will fail.
I am getting this error while trying to mine some genomes from the NCBI with gene_cluster_formation.py:
Traceback (most recent call last):
File "./gene_cluster_formation.py", line 1137, in <module>
genome_collections = make_collections_per_genomes(operons,genome_dict)
File "./gene_cluster_formation.py", line 523, in make_collections_per_genomes
collection_type='genome',realname=genome.descr,active=True)
File "/home/asf/decRiPPter/lib/Genes.py", line 455, in __init__
self.prep()
File "/home/asf/decRiPPter/lib/Genes.py", line 471, in prep
self.set_flanks()
File "/home/asf/decRiPPter/lib/Genes.py", line 492, in set_flanks
operon.right_flank = None
UnboundLocalError: local variable 'operon' referenced before assignment
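An UnboundLocalError on a loop variable usually means the loop body never ran, for example because a collection contained no operons. Below is a minimal sketch of that failure mode and a guard, assuming set_flanks iterates over a list of operons and then touches the last one. The Operon class and the function are hypothetical stand-ins, not the code in lib/Genes.py.

```python
class Operon(object):
    """Hypothetical stand-in for a decRiPPter operon object."""

    def __init__(self, name):
        self.name = name
        self.left_flank = None
        self.right_flank = None


def set_flanks(operons):
    """Link each operon to its neighbours; tolerate an empty list."""
    if not operons:
        # Without this guard, using the loop variable after the loop
        # raises UnboundLocalError when `operons` is empty.
        return
    previous = None
    for operon in operons:
        operon.left_flank = previous
        if previous is not None:
            previous.right_flank = operon
        previous = operon
    operon.right_flank = None  # the last operon has no right neighbour


if __name__ == "__main__":
    set_flanks([])  # no error for a genome without operons
    ops = [Operon("a"), Operon("b")]
    set_flanks(ops)
    print(ops[0].right_flank.name, ops[1].left_flank.name)
```

If this is the cause, checking which input genomes yield no operons would be a reasonable first step.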
Hi Alexamk,
I installed the program, but the run failed because lib.split_genomes_mash is missing from the lib folder. I could not find this Python module even after installing mash into the environment.
Would you mind providing the module or showing me how I can obtain it? Thanks.
Hi,
I am trying to run decRiPPter, but the following error arises:
INFO - 2022-03-29 14:20:18,334 - genome_prep - Copying genomes from --in command
INFO - 2022-03-29 14:20:18,594 - genome_prep - Selecting genomes with relevant files...
INFO - 2022-03-29 14:20:18,640 - genome_prep - Files previously annotated with prodigal found for 0 out of 16 genomes
INFO - 2022-03-29 14:20:18,640 - genome_prep - DNA fasta files found for 16 out of 16 remaining genomes
INFO - 2022-03-29 14:20:18,640 - genome_prep - 16 genomes are going to be parsed/processed
INFO - 2022-03-29 14:20:18,640 - genome_prep - Annotating DNA fasta files with prodigal
miniconda3/envs/decrippter/lib/python2.7/site-packages/Bio/SeqIO/InsdcIO.py:687: BiopythonWarning: Increasing length of locus line to allow long name. This will result in fields tha$
are not in usual positions.
BiopythonWarning,
INFO - 2022-03-29 14:20:37,885 - genome_prep - Starting the parsing of 16 genomes
INFO - 2022-03-29 14:20:45,277 - genome_prep - Analysing smORFs and assigning SVM score
miniconda3/envs/decrippter/lib/python2.7/site-packages/sklearn/__init__.py:21: DeprecationWarning: Importing from numpy.testing.nosetester is deprecated since 1.15.0, import from nu$
py.testing instead.
from numpy.testing import nosetester
INFO - 2022-03-29 14:22:47,505 - lib.smorfs - Finished smORF SVM scoring for 227039 candidates
INFO - 2022-03-29 14:22:48,010 - genome_prep - Prepped genomes in 0.50 seconds
INFO - 2022-03-29 14:22:48,056 - genome_prep - Number of proteins below max_proteins. Genomes will not be split up
INFO - 2022-03-29 14:22:48,056 - genome_prep - Starting further analysis per group. Number of groups: 1
INFO - 2022-03-29 14:22:48,056 - genome_prep - Group number: 1, size: 16
INFO - 2022-03-29 14:22:48,056 - genome_prep - Starting allVall BLASTs
INFO - 2022-03-29 14:22:48,868 - genome_prep - Making DIAMOND database
INFO - 2022-03-29 14:22:49,072 - genome_prep - Running allvall DIAMOND
INFO - 2022-03-29 14:22:50,851 - genome_prep - Running COG.main...
INFO - 2022-03-29 14:22:50,852 - lib.COG - Parsing allvall BLAST results
INFO - 2022-03-29 14:22:50,854 - lib.COG - Getting bidirectional best hits
INFO - 2022-03-29 14:22:50,855 - lib.COG - Getting pairwise truecogs
Traceback (most recent call last):
File "../../decRiPPter/genome_prep.py", line 824, in <module>
skipped_genomes = COG.main(COG_path, allvall_name, name, settings, genome_dict, group)
File "decRiPPter/lib/COG.py", line 864, in main
base_truecogs = make_cog_group(genomes_allowed,truecog_pairs_per_genome,true_pair_dict,genome_dict)
File "decRiPPter/lib/COG.py", line 294, in make_cog_group
base_truecogs = truecog_pairs[group[0],group[1]]
KeyError: ('GCA_009837215.1_ASM983721v1_genomic', 'GCA_009843125.1_ASM984312v1_genomic')
Best,
Pavlo
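The KeyError indicates that the dictionary of pairwise results has no entry for that genome pair, which could happen if the two genomes share no bidirectional best hits. Below is a minimal sketch for listing such pairs before grouping, assuming a dictionary keyed by (genome_a, genome_b) tuples. missing_pairs and the sample data are hypothetical, not part of lib/COG.py.

```python
from itertools import combinations


def missing_pairs(genomes, pair_dict):
    """Return genome pairs with no entry in pair_dict in either order."""
    missing = []
    for a, b in combinations(sorted(genomes), 2):
        if (a, b) not in pair_dict and (b, a) not in pair_dict:
            missing.append((a, b))
    return missing


if __name__ == "__main__":
    genomes = ["g1", "g2", "g3"]
    pairs = {("g1", "g2"): ["cog_x"], ("g3", "g1"): ["cog_y"]}
    for pair in missing_pairs(genomes, pairs):
        print("no pairwise result for %s / %s" % pair)
```

Running a check like this on the real data would show whether only the reported pair is missing or whether one genome lacks results against all others.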