
ibpred's Introduction

Data

Ion binding proteins (IBPs) and non-IBPs are stored together in ./datas/data.fasta without labels. The first 114 sequences are IBPs, and the remaining 207 sequences are non-IBPs.
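Because the labels are implied only by record order, they have to be rebuilt when the file is loaded. The following is a minimal sketch of one way to do that, not the code used in process.ipynb. The function names (read_fasta, label_by_position, load_labeled) are hypothetical, and the sketch assumes a standard FASTA layout with ">" header lines.

```python
from typing import List, Tuple

N_IBP = 114  # from the Data section: the first 114 records are IBPs


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (identifier, sequence) pairs, keeping file order."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def label_by_position(records, n_positive: int = N_IBP) -> List[int]:
    """Return 1 (IBP) for the first n_positive records and 0 (non-IBP) for the rest."""
    return [1 if i < n_positive else 0 for i in range(len(records))]


def load_labeled(path: str = "./datas/data.fasta"):
    """Read the data file and attach the position-based labels."""
    with open(path) as handle:
        records = read_fasta(handle.read())
    return records, label_by_position(records)
```

Calling load_labeled() from the repository root should give 321 records if the file matches the counts above. Checking that total is a cheap guard against a reordered or truncated file.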

Environment build

New solution

To create the ibpred environment, run the following commands in the Anaconda base environment:

conda env create -n ibpred -f requirements.yaml
conda activate ibpred

Previous solution

An Anaconda (Anaconda3-2021.11-Linux-x86_64) virtual environment named ibpred was used on a Linux system.

To build ibpred this way, run the following commands in the Anaconda base environment:

conda create -n ibpred -y
conda activate ibpred
conda install python==3.9.7 numpy==1.21.2 pandas==1.3.5 scipy==1.7.1 scikit-learn==1.0.1 tenacity==8.0.1 matplotlib==3.4.3 seaborn==0.11.1 ipykernel==6.4.1 traitlets==4.3.3 black -y
conda install -c conda-forge scikit-optimize==0.9.0 -y
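After the install finishes, it can help to confirm that the environment really holds the pinned versions. The sketch below compares the pins from the commands above with what importlib.metadata reports. The helper name find_mismatches is hypothetical. Python itself (3.9.7) and the unpinned black package are left out, and the sketch assumes conda installed standard package metadata for each distribution.

```python
from importlib import metadata

# Pins copied from the conda install commands above.
PINS = {
    "numpy": "1.21.2",
    "pandas": "1.3.5",
    "scipy": "1.7.1",
    "scikit-learn": "1.0.1",
    "tenacity": "8.0.1",
    "matplotlib": "3.4.3",
    "seaborn": "0.11.1",
    "ipykernel": "6.4.1",
    "traitlets": "4.3.3",
    "scikit-optimize": "0.9.0",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin that is not met."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != wanted:
            problems[name] = found
    return problems
```

Running find_mismatches(PINS) inside the activated ibpred environment returns an empty dict when every pin is satisfied.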

To run the Jupyter notebook workflow, which is contained entirely in process.ipynb, ipykernel must be installed.

ipykernel is required by the Jupyter notebook support in the VS Code editor. traitlets is optional and was pinned to work around a bug that prevents running two or more notebooks in the same environment.

However, this failed :(

Otherwise, you can reproduce the paper's results by running the following command in the code directory (the notebook is preferable, because our results were produced by running all of the code in process.ipynb):

python process.py

The matplotlib, seaborn, and black packages are optional for model training. matplotlib and seaborn are needed only for plotting figures, and black can be used to format the Python code.

How to use

auto_pred.py is the tool for identifying IBPs. Users should confirm the model and the feature subset number (set_num) used for prediction, because the default model is base_bayes_clf_PD and the default set_num is 193.

Run it from the command line in the ibpred environment like this:

python auto_pred.py --file 'input_file.fasta' --model 'base_bayes_clf_PD' --set_num 193 --out 'result.csv'
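When several input files need predictions, the same command can be driven from Python. This is a small sketch that uses only the four flags shown above. The helper names build_command and run_prediction are hypothetical, and the sketch assumes it is run from the directory that contains auto_pred.py with the ibpred environment active.

```python
import subprocess
import sys


def build_command(fasta_path, out_path, model="base_bayes_clf_PD", set_num=193):
    """Build the auto_pred.py argument list with the documented defaults."""
    return [
        sys.executable,  # the interpreter of the currently active environment
        "auto_pred.py",
        "--file", fasta_path,
        "--model", model,
        "--set_num", str(set_num),
        "--out", out_path,
    ]


def run_prediction(fasta_path, out_path, **options):
    """Run one prediction and raise CalledProcessError if auto_pred.py fails."""
    subprocess.run(build_command(fasta_path, out_path, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.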

Notice:

  • Don't use the same identifiers for different samples.
  • Don't include duplicate sequences in the input file.
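Both conditions in the notice can be checked before running the tool. The sketch below reports repeated identifiers and repeated sequences in a list of (identifier, sequence) pairs. The function name find_duplicates is hypothetical. Comparing sequences case-insensitively and treating the whole header as the identifier are assumptions, because auto_pred.py may compare them differently.

```python
from collections import Counter


def find_duplicates(records):
    """Return (duplicate_ids, duplicate_sequences) for (identifier, sequence) pairs."""
    id_counts = Counter(identifier for identifier, _ in records)
    # Assumption: sequences that differ only in letter case count as the same.
    seq_counts = Counter(sequence.upper() for _, sequence in records)
    duplicate_ids = sorted(key for key, n in id_counts.items() if n > 1)
    duplicate_seqs = sorted(key for key, n in seq_counts.items() if n > 1)
    return duplicate_ids, duplicate_seqs


# Example with made-up records: "p1" is reused and "MKV" appears twice.
example = [("p1", "MKV"), ("p1", "LLA"), ("p2", "mkv")]
duplicate_ids, duplicate_seqs = find_duplicates(example)
```

An input file is safe to submit when both returned lists are empty.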

ibpred's People

Contributors

shishiyuan
