ProBioPred

ProBioPred predicts potential probiotic candidates from a genome sequence using trained Support Vector Machine (SVM) models. A complete genome is the preferred input and gives better results, but a draft genome assembly is also accepted. ProBioPred currently supports only nine genera: Bacillus, Clostridium, Lactobacillus, Leuconostoc, Streptococcus, Bifidobacterium, Enterococcus, Lactococcus and Pediococcus. The input genome must be in standard FASTA format.
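
The two input requirements above (FASTA format, one of nine genera) can be checked before a run. The following is a minimal sketch, not part of ProBioPred: `check_genome_fasta` and `check_genus` are hypothetical helper names, the genus list is copied from the help text, and the lower-casing of the genus is an assumption, since only lowercase names appear in the help.

```python
# Hypothetical pre-flight checks; ProBioPred itself does not ship these helpers.

# Genera copied from the ProBioPred help text.
SUPPORTED_GENERA = {
    "bacillus", "clostridium", "lactobacillus", "leuconostoc",
    "streptococcus", "bifidobacterium", "enterococcus",
    "lactococcus", "pediococcus",
}

# IUPAC nucleotide codes plus the gap character.
NUCLEOTIDES = set("ACGTURYSWKMBDHVN-")


def check_genome_fasta(path):
    """Return the number of FASTA records; raise ValueError if malformed."""
    records = 0
    with open(path) as handle:
        for line_no, raw in enumerate(handle, start=1):
            line = raw.strip()
            if not line:
                continue  # blank lines are tolerated
            if line.startswith(">"):
                records += 1
            elif records == 0:
                raise ValueError(f"line {line_no}: sequence before first '>' header")
            elif not set(line.upper()) <= NUCLEOTIDES:
                raise ValueError(f"line {line_no}: non-nucleotide characters")
    if records == 0:
        raise ValueError("no FASTA records found")
    return records


def check_genus(genus):
    """Return the genus in lowercase; raise ValueError if it is not supported."""
    name = genus.strip().lower()
    if name not in SUPPORTED_GENERA:
        raise ValueError(f"unsupported genus: {genus!r}")
    return name
```

A draft assembly with many contigs passes this check just like a complete genome; the record count is only reported, not restricted.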

Theory

ProBioPred uses available genetic information and Support Vector Machine (SVM) models to predict potential probiotic candidates. In brief, based on an extensive literature survey and available databases, ProBioPred uses information on genes imparting probiotic properties, virulence factors and antibiotic resistance genes to generate and train models, which then predict whether a genome is a potential probiotic candidate. ProBioPred can also serve as a tool to identify probiotic genes, virulence factors and antibiotic resistance genes, which can be browsed on the website or downloaded. The models can be used to analyse genome sequences with ProBioPred either online or as a stand-alone tool.

Installing ProBioPred

# create conda environment
conda create -n probiopred python=3.10
conda activate probiopred

# install dependencies
# blast
conda install -c bioconda blast

# libsvm
conda install -c conda-forge libsvm

# install rgi
git clone https://github.com/arpcard/rgi.git
cd rgi
pip install .

# install rgi database
rgi auto_load

# install ProBioPred
# leave the rgi checkout first so ProBioPred is not cloned inside it
cd ..
git clone https://github.com/microDM/ProBioPred.git
cd ProBioPred
pip install .

Running ProBioPred

usage: proBioPred.py [-h] -i PATH -g GENUS [-o PATH] [-t THREADS]

Wrapper for running ProBioPred. Searches for probiotic, virulent and
antibiotic resistence genes in query genome. Then predicts the probability
score of genome being probiotic or non-probiotic based on SVM model.

optional arguments:
  -h, --help            show this help message and exit
  -i PATH, --input_genome PATH
                        Query genome sequence in FASTA format
  -g GENUS, --genus GENUS
                        Genus of query genome. Currently support only
                        following 9 genera.[bacillus, clostridium,
                        lactobacillus, leuconostoc, streptococcus,
                        bifidobacterium, enterococcus, lactococcus,
                        pediococcus]
  -o PATH, --output_dir PATH
                        Path of output directory [Default: ProBioPred_out].
  -t THREADS, --threads THREADS
                        Number of threads to run for BLAST and RGI.

Running ProBioPred on a batch of genomes

# create a tab-separated file with 3 columns:
        1. genomeID: unique genome ID
        2. genomeFile: path to the corresponding genome file (FASTA)
        3. genus: one of the genera listed in the ProBioPred help
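
The batch command itself is not shown here. As a stand-in, the sketch below reads the three-column table and calls the single-genome wrapper once per row, using only the flags documented in the help text. Several things are assumptions rather than documented behaviour: that `proBioPred.py` is on the `PATH` after installation, that an optional header row starts with `genomeID`, and that one output directory per genome under a hypothetical `ProBioPred_batch` folder is acceptable.

```python
import csv
import os
import subprocess


def read_batch_table(path):
    """Read (genomeID, genomeFile, genus) rows from a tab-separated file."""
    rows = []
    with open(path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comments
            if fields[0] == "genomeID":
                continue  # skip a header row, if one is present (assumption)
            if len(fields) != 3:
                raise ValueError(f"expected 3 columns, got {len(fields)}: {fields}")
            rows.append(tuple(field.strip() for field in fields))
    return rows


def build_command(genome_id, genome_file, genus,
                  out_root="ProBioPred_batch", threads=4):
    """Build one proBioPred.py call using the documented -i/-g/-o/-t flags."""
    return [
        "proBioPred.py",  # assumed to be on PATH after installation
        "-i", genome_file,
        "-g", genus,
        "-o", os.path.join(out_root, genome_id),
        "-t", str(threads),
    ]


def run_batch(table_path, dry_run=True):
    """Print the per-genome commands, or execute them when dry_run is False."""
    for genome_id, genome_file, genus in read_batch_table(table_path):
        cmd = build_command(genome_id, genome_file, genus)
        if dry_run:
            print(" ".join(cmd))
        else:
            os.makedirs("ProBioPred_batch", exist_ok=True)
            subprocess.run(cmd, check=True)
```

Calling `run_batch("genomes.tsv")` with the default `dry_run=True` only prints the commands, which makes it easy to inspect them before starting a long run.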

Output

ProBioPred writes several files to the output directory and prints the SVM score for the probiotic/non-probiotic prediction to standard output.

File                     Description
out.libsvm               svm-predict output (1/-1 refers to the probiotic/non-probiotic class)
pro_hits.pfasta          Probiotic genes (multi-FASTA file)
pro_outFiltered.blast    BLAST outfmt6 for probiotic genes
resulTab.csv             Scores for each feature (.csv format)
rgi_out.json             RGI output (JSON format)
rgi_out.txt              RGI output (tab-delimited format)
vfdb_hits.pfasta         Virulence genes (multi-FASTA file)
vfdb_outFiltered.blast   BLAST outfmt6 for virulence genes
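
To get a quick overview of a finished run, the gene and hit files listed above can be counted with a few lines of Python. This is a rough sketch, not part of ProBioPred: the helper names are hypothetical, and it assumes the filtered BLAST files keep the default twelve tab-separated outfmt 6 columns, which is not confirmed here.

```python
from pathlib import Path

# Default BLAST+ outfmt 6 column order (assumed to be unchanged by filtering).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def count_fasta_records(path):
    """Count FASTA records by counting '>' header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def read_outfmt6(path):
    """Return one dict per BLAST hit, keyed by the default outfmt 6 names."""
    hits = []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            hits.append(dict(zip(OUTFMT6_COLUMNS, line.split("\t"))))
    return hits


def summarise_output(out_dir):
    """Count genes and BLAST hits per category; None marks a missing file."""
    out_dir = Path(out_dir)
    summary = {}
    for label, fasta_name, blast_name in [
        ("probiotic", "pro_hits.pfasta", "pro_outFiltered.blast"),
        ("virulence", "vfdb_hits.pfasta", "vfdb_outFiltered.blast"),
    ]:
        fasta = out_dir / fasta_name
        blast = out_dir / blast_name
        summary[label] = {
            "genes": count_fasta_records(fasta) if fasta.exists() else None,
            "blast_hits": len(read_outfmt6(blast)) if blast.exists() else None,
        }
    return summary
```

For example, `summarise_output("ProBioPred_out")` inspects the default output directory named in the help text.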
