deepaccnet's Introduction

DeepAccNet.py

Python-PyTorch implementation of DeepAccNet described in https://www.biorxiv.org/content/10.1101/2020.07.17.209643v2

This method estimates the quality of your protein models using a metric called l-DDT (local distance difference test).

usage: DeepAccNet.py [-h] [--modelpath MODELPATH] [--pdb] [--csv] [--leaveTempFile] [--process PROCESS] [--featurize]
                     [--reprocess] [--verbose] [--bert] [--ensemble]
                     input ...

Error predictor network

positional arguments:
  input                 path to input folder or input pdb file
  output                path to output (folder path, npz, or csv)

optional arguments:
  -h, --help            show this help message and exit
  --pdb, -pdb           Running on a single pdb file instead of a folder (Default: False)
  --csv, -csv           Writing results to a csv file (Default: False)
  --per_res_only, -pr   Writing per-residue accuracy only (Default: False)
  --leaveTempFile, -lt  Leaving temporary files (Default: False)
  --process PROCESS, -p PROCESS
                        Specifying # of cpus to use for featurization (Default: 1)
  --featurize, -f       Running only the featurization part (Default: False)
  --reprocess, -r       Reprocessing all feature files (Default: False)
  --verbose, -v         Activating verbose flag (Default: False)
  --bert, -bert         Run with bert features. Use extractBert.py to generate them. (Default: False)
  --ensemble, -e        Running with ensembling of 4 models. This adds 4x computational time with some overheads
                        (Default: False)

v0.0.1
  • For the previous TensorFlow implementation, please see here.
  • For the MSA version of DeepAccNet, please see here.
  • For the refinement script, please see the modeling folder.

Software

  • Python > 3.5
  • PyTorch 1.3
  • PyRosetta for DeepAccNet and DeepAccNet-Bert.
  • ProtTrans and the ProtBert model (second one in the model availability table) for DeepAccNet-Bert.
  • Tested on Ubuntu 20.04 LTS

(For IPD users, please use the tensorflow conda environment)

Example usages

Running on a folder of pdbs (foldername: samples)

python DeepAccNet.py -r -v samples outputs

Running on a silentfile (filename: sample.silent)

python DeepAccNet-SILENT.py sample.silent output.csv
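
If you launch DeepAccNet from another Python script, the command lines above can be assembled programmatically. The following is only a sketch: build_deepaccnet_command is a hypothetical helper that is not part of this repository, it uses only flags listed in the help text above, and it assumes DeepAccNet.py sits in the current working directory.

```python
import subprocess
import sys


def build_deepaccnet_command(input_path, output_path, single_pdb=False,
                             csv=False, reprocess=False, verbose=False,
                             processes=1, script="DeepAccNet.py"):
    """Assemble an argument list for DeepAccNet.py (hypothetical helper)."""
    cmd = [sys.executable, script]
    if single_pdb:
        cmd.append("--pdb")        # input is a single pdb file, not a folder
    if csv:
        cmd.append("--csv")        # write results to a csv file
    if reprocess:
        cmd.append("--reprocess")  # regenerate all feature files
    if verbose:
        cmd.append("--verbose")
    if processes != 1:
        cmd += ["--process", str(processes)]  # cpus used for featurization
    cmd += [str(input_path), str(output_path)]
    return cmd


# Same options as: python DeepAccNet.py -r -v samples outputs
cmd = build_deepaccnet_command("samples", "outputs", reprocess=True, verbose=True)
print(" ".join(cmd[1:]))

# To actually launch the run, uncomment the next line:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when folder names contain spaces.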

How to look at outputs

The output of the network is written to [input_file_name].npz, unless the --csv flag is set. You can extract the predictions as follows.

import numpy as np

x = np.load("testoutput.npz")

lddt = x["lddt"]           # per residue lddt
estogram = x["estogram"]   # per pairwise distance e-stogram
mask = x["mask"]           # predicted mask: native distance < 15 Å

lddt is probably the easiest place to start, as it is a per-residue quality score. To get a global score for a protein structure, simply take its average.
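
As a minimal sketch of that averaging, the snippet below computes one global score per .npz file and ranks the models in an output folder. The helper names global_score and rank_models are hypothetical and not part of this repository; only the "lddt" key comes from the example above. The demo files are synthetic, with made-up numbers, so the snippet runs without any real predictions.

```python
import tempfile
from pathlib import Path

import numpy as np


def global_score(npz_path):
    """Mean of the per-residue lddt array stored in one output file."""
    with np.load(npz_path) as data:
        return float(np.mean(data["lddt"]))


def rank_models(output_dir):
    """Return (file name, global score) pairs, best score first."""
    scores = [(p.name, global_score(p)) for p in Path(output_dir).glob("*.npz")]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Demo on synthetic files (made-up values, not real predictions).
with tempfile.TemporaryDirectory() as tmp:
    np.savez(Path(tmp) / "model_a.npz", lddt=np.array([0.8, 0.6, 0.7]))
    np.savez(Path(tmp) / "model_b.npz", lddt=np.array([0.5, 0.4, 0.3]))
    ranking = rank_models(tmp)

for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

With real data, point rank_models at the folder you passed as the output argument.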

If you want to do something more involved, check.ipynb is a good place to start.

Troubleshooting

  • If DeepAccNet.py returns an OOM (out of memory) error, your protein is probably too big. Try a GPU with more memory (for example, a Titan instead of an RTX 2080), or run on CPUs if running time is not a concern; CPU runs are slow.
  • If you get an import error for pyErrorPred, you probably moved the script out of the DeepAccNet folder. In that case, you have to add pyErrorPred to your Python path, either in your environment or within the script.
  • For anything else, send an e-mail to hiranumn at cs dot washington dot edu.
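
For the pyErrorPred import error mentioned above, one possible way to extend the Python path from within a script is sketched here. The helper add_to_python_path and the folder "path/to/DeepAccNet" are placeholders, not names from this repository, and the sketch assumes that pyErrorPred sits directly inside the DeepAccNet folder.

```python
import sys
from pathlib import Path


def add_to_python_path(folder):
    """Prepend folder to sys.path unless it is already listed."""
    resolved = str(Path(folder).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Placeholder location: point this at your own DeepAccNet checkout.
deepaccnet_dir = add_to_python_path("path/to/DeepAccNet")

# After this call, "import pyErrorPred" should resolve, provided the
# pyErrorPred package really is inside that folder (an assumption).
```

Place the call before the first import of pyErrorPred in your script.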

Resources

  • The dataset used to train this model can be accessed through here. Training splits can be accessed through data
  • DeepAccNet prediction on the test set can be downloaded here.
  • DeepAccNet prediction on the CAMEO set can be downloaded here.
  • DeepAccNet prediction on the CASP13 set can be downloaded here.

Updates

  • Repo initialized 2020.7.20
  • Transitioned to PyTorch 2020.11.3
  • Added versions that do not depend on pyRosetta, "distance with 3D" and "distance with 3D and Bert" from the paper. 2020.11.6

deepaccnet's People

Contributors

hiranumn

deepaccnet's Issues

is it possible to use DeepAccNet with protein complex?

Hi Hiranumn,

Nice project! I am trying to use DeepAccNet to score a protein complex model, however, it only scored the first chain. I wonder if it is OK to combine multiple chains into a single chain before feeding it to DeepAccNet and how reliable the results will be.

Thank you! :)

Error in cst file generation

Hi Hiranumn,
Nice project!
when i tried working with estogram2cst.py by using the samples/tag0137.relaxed.al.pdb together with outputs/tag0137.relaxed.al.npz (both were copied to a new dir and renamed), theres an error related to Pref and Pavrg . the command was automatically generated by MainDiversification.py.

cst files were empty. Would u please help me with that? THX!
the error msg is attached.
errormsg.txt

i simply printed out the lengthes of Pref and Pavrg - they are not equal to each other.

update#1
the readme of modeling script notes that the version of PyRosetta is 3 but the codes use some expressions like 'import pyrosetta' or 'from pyrosetta import *', both of which are only available with PyRosetta-4.

so i just wonder if there may be any mistakes inside ur doc or i just confused by some magical things. : -)
