haddocking / prodigy

Predict the binding affinity of protein-protein complexes from structural data

Home Page: https://wenmr.science.uu.nl/prodigy/

License: Apache License 2.0

Python 99.11% Dockerfile 0.89%

binding-affinity bioinformatics python3 utrecht-university

prodigy's Introduction

PRODIGY / Binding Affinity Prediction

PRODIGY is also available as a web service @ wenmr.science.uu.nl/prodigy

Installation

pip install prodigy-prot

If you want to develop PRODIGY, check DEVELOPMENT for more details.

Usage

prodigy <pdb file> [--selection <chain1><chain2>]

To get a list of all available options, run:

$ prodigy --help
usage: prodigy [-h] [--distance-cutoff DISTANCE_CUTOFF] [--acc-threshold ACC_THRESHOLD] [--temperature TEMPERATURE]
               [--contact_list] [--pymol_selection] [-q] [-V] [--selection A B [A,B C ...]]
               structf

Binding affinity predictor based on Intermolecular Contacts (ICs).

Anna Vangone and Alexandre M.J.J. Bonvin,
Contacts-based prediction of binding affinity in protein-protein complexes.
eLife (2015)

positional arguments:
  structf               Structure to analyse in PDB or mmCIF format

options:
  -h, --help            show this help message and exit
  --distance-cutoff DISTANCE_CUTOFF
                        Distance cutoff to calculate ICs
  --acc-threshold ACC_THRESHOLD
                        Accessibility threshold for BSA analysis
  --temperature TEMPERATURE
                        Temperature (C) for Kd prediction
  --contact_list        Output a list of contacts
  --pymol_selection     Output a script to highlight the interface (pymol)
  -q, --quiet           Outputs only the predicted affinity value
  -V, --version         Print the version and exit.

Selection Options:

      By default, all intermolecular contacts are taken into consideration,
      a molecule being defined as an isolated group of amino acids sharing
      a common chain identifier. In specific cases, for example
      antibody-antigen complexes, some chains should be considered as a
      single molecule.

      Use the --selection option to provide collections of chains that should
      be considered for the calculation. Separate by a space the chains that
      are to be considered _different_ molecules. Use commas to include multiple
      chains as part of a single group:

      --selection A B => Contacts calculated (only) between chains A and B.
      --selection A,B C => Contacts calculated (only) between chains A and C; and B and C.
      --selection A B C => Contacts calculated (only) between chains A and B; B and C; and A and C.


  --selection A B [A,B C ...]
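
The grouping rules described under Selection Options can be illustrated with a short Python sketch. This is not PRODIGY's own code: the function name `selection_to_chain_pairs` is hypothetical, and the logic simply follows the three examples in the help text above (spaces separate molecules, commas join chains into one molecule).

```python
from itertools import combinations, product


def selection_to_chain_pairs(selection):
    """Expand --selection style arguments into the chain pairs to compare.

    `selection` is a list such as ["A,B", "C"]: each item is one molecule,
    and commas join several chains into the same molecule.
    (Hypothetical helper, written only to illustrate the help text.)
    """
    groups = [item.split(",") for item in selection]
    pairs = []
    # Contacts are only counted between chains of *different* groups.
    for group_1, group_2 in combinations(groups, 2):
        pairs.extend(product(group_1, group_2))
    return pairs


print(selection_to_chain_pairs(["A", "B"]))       # [('A', 'B')]
print(selection_to_chain_pairs(["A,B", "C"]))     # [('A', 'C'), ('B', 'C')]
print(selection_to_chain_pairs(["A", "B", "C"]))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

Chains in the same comma-separated group (A and B in the second call) are never paired with each other, which is the behaviour wanted for antibody heavy and light chains.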

Example

Download the PDB 3BZD and run PRODIGY on it.

$ curl -o 3bzd.pdb https://files.rcsb.org/download/3BZD.pdb
$ prodigy 3bzd.pdb
[+] Reading structure file: /Users/rvhonorato/dbg/3bzd.pdb
[+] Parsed structure file 3bzd (2 chains, 343 residues)
[+] No. of intermolecular contacts: 51
[+] No. of charged-charged contacts: 4
[+] No. of charged-polar contacts: 7
[+] No. of charged-apolar contacts: 6
[+] No. of polar-polar contacts: 7
[+] No. of apolar-polar contacts: 15
[+] No. of apolar-apolar contacts: 12
[+] Percentage of apolar NIS residues: 29.48
[+] Percentage of charged NIS residues: 29.48
[++] Predicted binding affinity (kcal.mol-1):     -9.4
[++] Predicted dissociation constant (M) at 25.0˚C:  1.3e-07

Details of the binding affinity predictor implemented in PRODIGY can be found at 10.7554/elife.07454
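
The two final lines of the output are related by standard thermodynamics: the dissociation constant follows from the predicted free energy as Kd = exp(ΔG / RT). The sketch below assumes that PRODIGY uses this relation with the temperature converted to kelvin; the function name `dg_to_kd` and the constant name are illustrative, and the value of R is the gas constant in kcal/(mol·K).

```python
import math

GAS_CONSTANT = 0.0019858775  # kcal / (mol * K)


def dg_to_kd(dg_kcal_mol, temperature_celsius=25.0):
    """Convert a binding free energy (kcal/mol) to a dissociation constant (M).

    Assumes the standard relation Kd = exp(dG / RT); this is a sketch,
    not necessarily the exact code path used by the tool.
    """
    temperature_kelvin = temperature_celsius + 273.15
    return math.exp(dg_kcal_mol / (GAS_CONSTANT * temperature_kelvin))


# The 3BZD example above: -9.4 kcal/mol at 25.0 C
print(f"{dg_to_kd(-9.4):.1e}")  # 1.3e-07
```

Because the free energy is printed with one decimal, recomputing Kd from the rounded value can differ in the last digit from what the tool reports.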

Citing us

If our tool is useful to you, please cite PRODIGY in your publications:

Xue L, Rodrigues J, Kastritis P, Bonvin A.M.J.J, Vangone A.: PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics (2016) (10.1093/bioinformatics/btw514)
Anna Vangone and Alexandre M.J.J. Bonvin: Contacts-based prediction of binding affinity in protein-protein complexes. eLife, e07454 (2015) (10.7554/eLife.07454)
Panagiotis L. Kastritis, João P.G.L.M. Rodrigues, Gert E. Folkers, Rolf Boelens, Alexandre M.J.J. Bonvin: Proteins Feel More Than They See: Fine-Tuning of Binding Affinity by Properties of the Non-Interacting Surface. Journal of Molecular Biology, 14, 2632–2652 (2014). (10.1016/j.jmb.2014.04.017)

Contact

For questions about PRODIGY usage, please reach out to the team at ask.bioexcel.eu

Information about dependencies

The scripts rely on Biopython to validate the PDB structures and calculate interatomic distances. freesasa, with the parameter set used in NACCESS (Chothia, 1976), is also required for calculating the buried surface area.

DISCLAIMER: because different software is used to calculate solvent accessibility, predicted values might differ (very slightly) from those published in the reference implementations. The correlation of the actual atomic accessibilities is over 0.99, so we expect these differences to be very minor.

prodigy's Issues

Prodigy v2.1.4 and v2.1.5 appear as v2.1.3

Just a minor issue: versions 2.1.4 and 2.1.5 of prodigy are both reported as v2.1.3 when running prodigy -V on the command line, and are also listed as v2.1.3 by pip after installation. For pip, can this be fixed by bumping the version number in pyproject.toml on line 4? For the command line, it looks like the version is controlled by line 19 of prodigy/predict_IC.py.

--acc-threshold range?

I believe that the --acc-threshold parameter specifies the accessibility threshold for the buried surface area analysis. Is this parameter a percentage? I tried changing the default from 0.05 to 1.25 and it worked, but anything larger failed.

My thought is that the IC contact list might help me with the TAP criteria. Would the IC residues be similar to the "CDR vicinity" in the TAP calculations?

Contact_list flag

Hello,

I'm currently running prodigy on a protein complex.
I don't see any changes in the terminal or in the directory when activating the contact_list flag

Do you have any idea why?

(prodigy) loris@big-gpu:~/dev/prodigy$ prodigy 7f62.pdb --selection H,L A
[+] Reading structure file: /home/loris/loris/ground_thruth/7f62.pdb
[!] Structure contains gaps:
        A PRO330 < Fragment 0 > A LYS444
        A GLY447 < Fragment 1 > A SER530
        H GLU1 < Fragment 2 > H ALA119
        L GLN1 < Fragment 3 > L ARG109
[+] Parsed structure file 7f62 (3 chains, 427 residues)
.....
[++] Predicted binding affinity (kcal.mol-1):    -11.5
[++] Predicted dissociation constant (M) at 25.0˚C:  3.6e-09

(prodigy) loris@big-gpu:~/dev/prodigy$ prodigy ground_thruth/7f62.pdb --selection H,L A --contact_list
[+] Reading structure file: /home/loris/loris/ground_thruth/7f62.pdb
[!] Structure contains gaps:
        A PRO330 < Fragment 0 > A LYS444
        A GLY447 < Fragment 1 > A SER530
        H GLU1 < Fragment 2 > H ALA119
        L GLN1 < Fragment 3 > L ARG109
[+] Parsed structure file 7f62 (3 chains, 427 residues)
....
[++] Predicted binding affinity (kcal.mol-1):    -11.5
[++] Predicted dissociation constant (M) at 25.0˚C:  3.6e-09
(prodigy) loris@big-gpu:~/dev/prodigy$

Dataset at http://bmm.crick.ac.uk/~bmmadmin/Affinity not available anymore.

Dear Authors,

Thanks so much for the amazing work.

I am a PhD student at Eotvos Lorand University looking for the dataset provided in the paper: Kastritis P. L., Moal I. H., Hwang H., Weng Z., Bates P. A., Bonvin A. M. and Janin J. (2011). A structure-based benchmark for protein-protein binding affinity.

I wanted to know if it's still available and, if not, which open-source datasets you would suggest that provide affinity data as well as 3D bound structures?

All the best,
Oz

Silent failure

When using the attached PDB file, prodigy silently fails on the server with the following message:

[!] Structure contains gaps:
A ASP1 < Fragment 0 > A THR186
A LEU187 < Fragment 1 > A LEU286
A LEU287 < Fragment 2 > A GLN425
A MET426 < Fragment 3 > A SER551
B CYS552 < Fragment 4 > B CYS567

Running Prodigy for structure test
ERROR:

test.pdb.zip

Code does not accept any spaces or "-" in the file name

(...) it seems that the code does not accept any spaces or "-" in the file name.

Originally posted by @esraaelmligy in #16 (comment)
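
Until file names with spaces or "-" are handled, one possible workaround is to copy the structure to a sanitized name before running the tool. The helper below is a hypothetical sketch (the name `safe_structure_name` is not part of PRODIGY) that assumes letters, digits and underscores are accepted in the file stem.

```python
import re
from pathlib import Path


def safe_structure_name(path):
    """Return a path in the same directory with a sanitized file stem.

    Every character of the stem that is not a letter, digit or underscore
    is replaced by an underscore; the extension is kept unchanged.
    (Hypothetical helper; the set of accepted characters is an assumption.)
    """
    path = Path(path)
    safe_stem = re.sub(r"[^A-Za-z0-9_]", "_", path.stem)
    return path.with_name(safe_stem + path.suffix)


print(safe_structure_name("my model-1.pdb").name)  # my_model_1.pdb
```

The original file can then be copied to the returned path, for example with `shutil.copy`, and that copy passed to prodigy.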

prodigy: error: unrecognized arguments: model.pdb

It keeps giving me this error for the PDB file argument and I don't really know why.

ValueError: Unsupported non-standard amino acid found: DC in parsers.py

Hi, I encountered this error after running prodigy on some PDB IDs (e.g. 4JGJ, 4H10, 4XRS).
It seems that some residue names are non-standard for some reason.

Demo code:
I am using python3_package branch.

> prodigy 4XRS.pdb  --selection A G --contact_list --pymol_selection
[+] Reading structure file: 4XRS.pdb
Traceback (most recent call last):
  File "prodigy", line 11, in <module>
    load_entry_point('prodigy', 'console_scripts', 'prodigy')()
  File "prodigy/prodigy/predict_IC.py", line 308, in main
    structure, n_chains, n_res = parse_structure(struct_path)
  File "prodigy/prodigy/lib/parsers.py", line 141, in parse_structure
    return (validate_structure(s),
  File "prodigy/prodigy/lib/parsers.py", line 87, in validate_structure
    raise ValueError('Unsupported non-standard amino acid found: {0}'.format(res.resname))
ValueError: Unsupported non-standard amino acid found:  DC

I wonder if there is any easy way to get around this error.
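
DC is the PDB residue name for deoxycytidine, so the error suggests that the selected chains include nucleic acid rather than protein. A quick way to see which residue names would be rejected is to scan the ATOM records before running the tool. The sketch below is not part of PRODIGY: the function name is hypothetical, it reads only the fixed-width residue-name columns (18-20) of PDB-format lines, and it assumes that "standard" means the 20 canonical amino acids.

```python
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_nonstandard_residues(pdb_lines):
    """Return the sorted residue names in ATOM records that are not one of
    the 20 canonical amino acids (hypothetical pre-check, not PRODIGY code)."""
    found = set()
    for line in pdb_lines:
        if line.startswith("ATOM"):
            resname = line[17:20].strip()  # PDB columns 18-20
            if resname not in STANDARD_AMINO_ACIDS:
                found.add(resname)
    return sorted(found)


# Two made-up records: one protein atom and one DNA atom.
example = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  P    DC G   1      12.000   7.000  -5.000  1.00  0.00           P",
]
print(find_nonstandard_residues(example))  # ['DC']
```

To use it on a file, pass the open file object, for example `find_nonstandard_residues(open("4XRS.pdb"))`, and check whether the chains given to --selection are the ones carrying the reported names.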

Result Interpretation

Hello,
I was wondering how to interpret the results of a complex.

[+] Parsed structure file 7F62_input_fasta_relaxed_rank_005_alphafold2_multimer_v3_model_1_seed_000 (3 chains, 427 residues)
[+] No. of intermolecular contacts: 55
[+] No. of charged-charged contacts: 3
[+] No. of charged-polar contacts: 8
[+] No. of charged-apolar contacts: 12
[+] No. of polar-polar contacts: 5
[+] No. of apolar-polar contacts: 16
[+] No. of apolar-apolar contacts: 11
[+] Percentage of apolar NIS residues: 35.60
[+] Percentage of charged NIS residues: 19.50
[++] Predicted binding affinity (kcal.mol-1): -10.7
[++] Predicted dissociation constant (M) at 25.0˚C: 1.3e-08

Is the binding affinity score made up of all the "[+]" scores ?
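
For readers with the same question, the predictor described in the eLife (2015) paper is a linear combination of a subset of the "[+]" values. The sketch below uses the coefficients as recalled from that paper, so treat them as an assumption rather than a copy of the PRODIGY source; they do reproduce both outputs shown on this page (-9.4 for 3BZD and -10.7 for the complex above). The function and argument names are illustrative.

```python
def predict_dg(ic_cc, ic_ca, ic_pp, ic_pa, nis_apolar, nis_charged):
    """Sketch of the IC-NIS linear model (coefficients assumed from the
    eLife 2015 paper, not copied from the PRODIGY code base).

    ic_cc: charged-charged contacts    ic_ca: charged-apolar contacts
    ic_pp: polar-polar contacts        ic_pa: apolar-polar contacts
    nis_apolar / nis_charged: percentage of apolar / charged NIS residues
    """
    return (
        -0.09459 * ic_cc
        - 0.10007 * ic_ca
        + 0.19577 * ic_pp
        - 0.22671 * ic_pa
        + 0.18681 * nis_apolar
        + 0.13810 * nis_charged
        - 15.9433
    )


# Values taken from the output quoted in this issue.
print(f"{predict_dg(3, 12, 5, 16, 35.60, 19.50):.1f}")  # -10.7
```

Under this form of the model, the total number of contacts and the charged-polar and apolar-apolar counts are reported but do not enter the equation.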

The option with --pymol_selection produce no output

Hi,

Thank you for making this wonderful tool.
I was trying this command:

prodigy my_input.pdb --selection A B --pymol_selection

It prints the output.
But I couldn't find the Pymol script. What's the way to resolve it?

G.V.

ValueError: invalid literal for int() with base 10 at int(struct.residueNumber(idx))

Hi, I encountered this error after running prodigy on some PDB IDs (e.g. 1TVD, 4E44, 2PUX).
It seems that some residue numbers are not integers for some reason.

Demo code:
I am using python3_package branch.

> prodigy 1TVD.pdb  --selection A B --contact_list --pymol_selection 
[+] Reading structure file: 1TVD.pdb
[!] Structure contains gaps:
        A ASP1 < Fragment 0 > A GLY57
        A GLY58 < Fragment 1 > A PRO116
        B ASP1 < Fragment 2 > B GLY57
        B GLY58 < Fragment 3 > B PRO116

[+] Parsed structure file 1TVD (2 chains, 226 residues)
Traceback (most recent call last):
  File "prodigy", line 11, in <module>
    load_entry_point('prodigy', 'console_scripts', 'prodigy')()
  File "prodigy/prodigy/predict_IC.py", line 311, in main
    prodigy.predict(distance_cutoff=cmd.distance_cutoff, acc_threshold=cmd.acc_threshold)
  File "prodigy/prodigy/predict_IC.py", line 146, in predict
    _, cmplx_sasa = execute_freesasa_api(self.structure)
  File "prodigy/prodigy/lib/freesasa_tools.py", line 185, in execute_freesasa_api
    resid = int(struct.residueNumber(idx))
ValueError: invalid literal for int() with base 10: '57B'

I wonder if there is any easy way to get around this error.
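
The value '57B' is a residue number followed by a PDB insertion code, which a plain `int()` call cannot parse. A tolerant parser could split the two parts, as in the sketch below. The helper name `split_residue_id` is hypothetical and this is not the fix adopted by the project, only an illustration of the idea.

```python
import re

# Optional sign, digits, then at most one insertion-code letter.
_RESID_PATTERN = re.compile(r"^\s*(-?\d+)\s*([A-Za-z]?)\s*$")


def split_residue_id(raw):
    """Split a residue identifier such as '57B' into (57, 'B').

    Returns an empty string as insertion code when none is present.
    (Hypothetical helper, not part of PRODIGY or freesasa.)
    """
    match = _RESID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"Cannot parse residue identifier: {raw!r}")
    number, insertion_code = match.groups()
    return int(number), insertion_code


print(split_residue_id("57B"))  # (57, 'B')
print(split_residue_id("57"))   # (57, '')
```

Keeping the insertion code next to the number matters because residues 57 and 57B are distinct positions in the chain.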

Improve `Radius array is <= 0 for the residue: ILE ,atom: CD` error message

Several users have asked for help relating to this error message: Radius array is <= 0 for the residue: ILE ,atom: CD

This message seems to be related to an input structure containing non-canonical atom names such as CD, which might be the case for structures generated with some modelling software.

We could improve the message so that it already contains advice on how to proceed:

[!] Error: Radius array is <= 0 for the residue: ILE ,atom: CD
[!] Error: Make sure the atom names in your PDB file match the canonical naming and belong to default residues
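
As an illustration of what "matching the canonical naming" can mean in practice, the sketch below renames the isoleucine delta carbon from CD to CD1, the name used in the PDB format. It assumes that this naming difference is what triggers the error for a given file, which should be checked case by case; the function name is hypothetical and the code only edits the fixed-width atom-name columns (13-16) of PDB-format lines.

```python
def rename_ile_cd(pdb_lines):
    """Rename atom CD of ILE residues to CD1 in PDB-format lines.

    Hypothetical pre-processing helper, not part of PRODIGY. Lines that
    are not ILE CD atom records are returned unchanged.
    """
    fixed = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and line[17:20] == "ILE" and line[12:16].strip() == "CD":
            # Atom name occupies columns 13-16 (string indices 12 to 15).
            line = line[:12] + " CD1" + line[16:]
        fixed.append(line)
    return fixed


before = ["ATOM      8  CD  ILE A   1      10.000  10.000  10.000  1.00  0.00           C"]
print(rename_ile_cd(before)[0][:26])  # ATOM      8  CD1 ILE A   1
```

Other residues and other atom names are left untouched, so the same function can be applied to a whole file line by line.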

Unable to install by pip

When I try to install PRODIGY by pip, it looks like something went wrong

minfei@minfei:~$ pip install prodigy
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement prodigy (from versions: none)
ERROR: No matching distribution found for prodigy

Domain-wise binding affinities

Hi again,
A quick question: would it be possible to calculate the binding affinity for a defined section of the interface, e.g. a protein domain, with the method used by prodigy?
It seems that prodigy (in its current version) is designed to estimate binding affinities for the entire binding interface. Out of curiosity, I wonder if there is any possibility that this method could be used to obtain domain-wise binding affinities.
Thanks.