RF-Score-VS 1.0

RF-Score-VS is a novel Random Forest-based scoring function for virtual screening that predicts binding affinity. Its descriptors are based on RF-Score, developed by Pedro Ballester et al. The provided binary implements RF-Score-VS v2: it counts atoms of certain types within a 12 Å radius, divided into 2 Å bins.
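To make the descriptor idea concrete, here is a minimal sketch of distance-binned contact counting. It is not the code shipped in the binary: the atom representation (element symbol plus coordinates), the function name `binned_contact_counts`, and the choice to exclude pairs exactly at the cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter

CUTOFF = 12.0     # Å, radius stated in the description above
BIN_WIDTH = 2.0   # Å, bin size stated in the description above


def binned_contact_counts(ligand_atoms, protein_atoms,
                          cutoff=CUTOFF, bin_width=BIN_WIDTH):
    """Count ligand-protein atom pairs per (ligand type, protein type, bin).

    Hypothetical input format: each atom is (element_symbol, (x, y, z)).
    """
    counts = Counter()
    for lig_elem, lig_xyz in ligand_atoms:
        for prot_elem, prot_xyz in protein_atoms:
            dist = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(lig_xyz, prot_xyz)))
            if dist < cutoff:  # assumption: the cutoff itself is excluded
                bin_index = int(dist // bin_width)  # 0 -> [0, 2), 1 -> [2, 4), ...
                counts[(lig_elem, prot_elem, bin_index)] += 1
    return counts


# Toy coordinates, invented purely for illustration.
ligand = [("C", (0.0, 0.0, 0.0)), ("N", (1.5, 0.0, 0.0))]
protein = [("O", (3.0, 0.0, 0.0)), ("C", (0.0, 0.0, 11.0)), ("N", (0.0, 0.0, 13.0))]
features = binned_contact_counts(ligand, protein)
```

With a 12 Å cutoff and 2 Å bins there are six distance bins per atom-type pair, so the length of the feature vector is six times the number of ligand-type/protein-type combinations considered.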

This repository contains the scripts required to reproduce the results reported in the publication introducing RF-Score-VS.

Standalone scoring function

RF-Score-VS is available as a standalone scoring function with no external dependencies. Usage instructions and detailed information about the binary are available in the README.md file distributed alongside the binaries and in a separate repository.

Download RF-Score-VS for your platform:

Retraining scoring function / reusing features

The features used to train RF-Score-VS are available in the head1_full directory. They are stored as compressed CSV files (*.csv.gz), organized into one subdirectory per DUD-E target.

If you want to use all the data, we provide convenient flat CSV files.
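The per-target layout described above can be read with the Python standard library alone. The sketch below assumes Python 3 and exactly one level of target subdirectories containing `*.csv.gz` files; the helper name `iter_feature_rows` is hypothetical, and no column names are assumed because the column layout is not described here.

```python
import csv
import glob
import gzip
import os


def iter_feature_rows(root_dir):
    """Yield (target_name, row_dict) for every *.csv.gz one level below root_dir.

    Assumed layout: <root_dir>/<target>/<anything>.csv.gz
    """
    pattern = os.path.join(root_dir, "*", "*.csv.gz")
    for path in sorted(glob.glob(pattern)):
        # The subdirectory name is taken as the target identifier.
        target = os.path.basename(os.path.dirname(path))
        # "rt" opens the gzip stream in text mode so csv can parse it.
        with gzip.open(path, "rt") as handle:
            for row in csv.DictReader(handle):
                yield target, row
```

For example, `for target, row in iter_feature_rows("head1_full"): ...` would stream the rows without loading everything into memory. Since Pandas is already listed in the requirements, `pandas.read_csv` is an alternative: it decompresses `.gz` files transparently.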

Requirements for running IPython notebooks

Required software:

Python 2.7
ODDT 0.2+
OpenBabel 2.4.1+
Scikit-Learn 0.17+
Seaborn
Pandas

Additional software:

sklearn-compiledtrees 1.3+ (compiling RFs for final scoring function)
dask / ipyparallel / ipython-cluster-helper (parallel computations on cluster)

References:

Wójcikowski M, Ballester PJ, Siedlecki P. Performance of machine-learning scoring functions in structure-based virtual screening. Sci Rep. 2017;7: 46710. doi:10.1038/srep46710
Wójcikowski M, Zielenkiewicz P, Siedlecki P. Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field. J Cheminform. 2015;7: 26. doi:10.1186/s13321-015-0078-2
Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169–1175. doi:10.1093/bioinformatics/btq112
Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944–955. doi:10.1021/ci500091r
Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. 2015;34: 115–126. doi:10.1002/minf.201400132
