
prodigy's Introduction

PRODIGY / Binding Affinity Prediction


PRODIGY is also available as a web service @ wenmr.science.uu.nl/prodigy

Installation

pip install prodigy-prot

If you want to develop PRODIGY, check DEVELOPMENT for more details.

Usage

prodigy <pdb file> [--selection <chain1><chain2>]

To list all available options:

$ prodigy --help
usage: prodigy [-h] [--distance-cutoff DISTANCE_CUTOFF] [--acc-threshold ACC_THRESHOLD] [--temperature TEMPERATURE]
               [--contact_list] [--pymol_selection] [-q] [-V] [--selection A B [A,B C ...]]
               structf

Binding affinity predictor based on Intermolecular Contacts (ICs).

Anna Vangone and Alexandre M.J.J. Bonvin,
Contacts-based prediction of binding affinity in protein-protein complexes.
eLife (2015)

positional arguments:
  structf               Structure to analyse in PDB or mmCIF format

options:
  -h, --help            show this help message and exit
  --distance-cutoff DISTANCE_CUTOFF
                        Distance cutoff to calculate ICs
  --acc-threshold ACC_THRESHOLD
                        Accessibility threshold for BSA analysis
  --temperature TEMPERATURE
                        Temperature (C) for Kd prediction
  --contact_list        Output a list of contacts
  --pymol_selection     Output a script to highlight the interface (pymol)
  -q, --quiet           Outputs only the predicted affinity value
  -V, --version         Print the version and exit.

Selection Options:

      By default, all intermolecular contacts are taken into consideration,
      a molecule being defined as an isolated group of amino acids sharing
      a common chain identifier. In specific cases, for example
      antibody-antigen complexes, some chains should be considered as a
      single molecule.

      Use the --selection option to provide collections of chains that should
      be considered for the calculation. Separate by a space the chains that
      are to be considered _different_ molecules. Use commas to include multiple
      chains as part of a single group:

      --selection A B => Contacts calculated (only) between chains A and B.
      --selection A,B C => Contacts calculated (only) between chains A and C; and B and C.
      --selection A B C => Contacts calculated (only) between chains A and B; B and C; and A and C.


  --selection A B [A,B C ...]
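The grouping rules described in the help text above can be illustrated with a short sketch. This is not PRODIGY's own code; the function name `selection_to_chain_pairs` is hypothetical and only mirrors the three examples given for `--selection`.

```python
from itertools import combinations, product


def selection_to_chain_pairs(selection):
    """Expand --selection style groups into the chain pairs that are compared.

    Hypothetical helper, written only to illustrate the documented rules.
    """
    # Commas join chains into one group (treated as a single molecule).
    groups = [group.split(",") for group in selection]
    pairs = []
    # Contacts are only counted between chains that sit in different groups.
    for group_1, group_2 in combinations(groups, 2):
        pairs.extend(product(group_1, group_2))
    return pairs


print(selection_to_chain_pairs(["A", "B"]))       # [('A', 'B')]
print(selection_to_chain_pairs(["A,B", "C"]))     # [('A', 'C'), ('B', 'C')]
print(selection_to_chain_pairs(["A", "B", "C"]))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

Chains inside the same comma-separated group are never paired with each other, which is why `A,B C` yields no A-B contacts.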

Example

Download the PDB 3BZD and run PRODIGY on it.

$ curl -o 3bzd.pdb https://files.rcsb.org/download/3BZD.pdb
$ prodigy 3bzd.pdb
[+] Reading structure file: /Users/rvhonorato/dbg/3bzd.pdb
[+] Parsed structure file 3bzd (2 chains, 343 residues)
[+] No. of intermolecular contacts: 51
[+] No. of charged-charged contacts: 4
[+] No. of charged-polar contacts: 7
[+] No. of charged-apolar contacts: 6
[+] No. of polar-polar contacts: 7
[+] No. of apolar-polar contacts: 15
[+] No. of apolar-apolar contacts: 12
[+] Percentage of apolar NIS residues: 29.48
[+] Percentage of charged NIS residues: 29.48
[++] Predicted binding affinity (kcal.mol-1):     -9.4
[++] Predicted dissociation constant (M) at 25.0˚C:  1.3e-07

Details of the binding affinity predictor implemented in PRODIGY can be found at 10.7554/elife.07454
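The two final lines of the output are related through the standard thermodynamic relation Kd = exp(ΔG / RT). The sketch below assumes that relation with R expressed in kcal/(mol·K); the function name `dg_to_kd` is illustrative and not necessarily how PRODIGY implements the conversion.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 0.0019858775


def dg_to_kd(dg_kcal_mol, temperature_celsius=25.0):
    """Convert a binding free energy (kcal/mol) into a dissociation constant (M)."""
    temperature_kelvin = temperature_celsius + 273.15
    return math.exp(dg_kcal_mol / (GAS_CONSTANT_KCAL * temperature_kelvin))


# The 3BZD example above reports -9.4 kcal/mol at 25 C.
print(f"{dg_to_kd(-9.4):.1e}")  # 1.3e-07
```

Because the printed affinity is rounded to one decimal, recomputing Kd from it may differ slightly in the last digit from the value PRODIGY reports.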

Citing us

If our tool is useful to you, please cite PRODIGY in your publications:

  • Xue L, Rodrigues J, Kastritis P, Bonvin A.M.J.J, Vangone A.: PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics (2016) (10.1093/bioinformatics/btw514)

  • Anna Vangone and Alexandre M.J.J. Bonvin: Contacts-based prediction of binding affinity in protein-protein complexes. eLife, e07454 (2015) (10.7554/eLife.07454)

  • Panagiotis L. Kastritis, João P.G.L.M. Rodrigues, Gert E. Folkers, Rolf Boelens, Alexandre M.J.J. Bonvin: Proteins Feel More Than They See: Fine-Tuning of Binding Affinity by Properties of the Non-Interacting Surface. Journal of Molecular Biology, 426(14), 2632–2652 (2014) (10.1016/j.jmb.2014.04.017)

Contact

For questions about PRODIGY usage, please reach out to the team at ask.bioexcel.eu

Information about dependencies

The scripts rely on Biopython to validate the PDB structures and calculate interatomic distances. freesasa, with the parameter set used in NACCESS (Chothia, 1976), is also required for calculating the buried surface area.

DISCLAIMER: because different software is used to calculate solvent accessibility, predicted values might differ (very slightly) from those published in the reference implementations. The correlation of the actual atomic accessibilities is over 0.99, so we expect these differences to be very minor.


prodigy's People

Contributors

amjjbonvin, avangone, brianjimenez, joaorodrigues, rraadd88, rvhonorato, schaarj


prodigy's Issues

Prodigy v2.1.4 and v2.1.5 appear as v2.1.3

Just a minor issue: versions 2.1.4 and 2.1.5 of prodigy both report v2.1.3 when running prodigy -V on the command line, and pip also lists them as v2.1.3 after installation. For pip, can this be fixed by bumping the version number in pyproject.toml on line 4? For the command line, it looks like the version is controlled by line 19 of prodigy/predict_IC.py.
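One way to keep the two numbers from drifting apart is to read the version from the installed package metadata instead of hard-coding it a second time. The sketch below is only a suggestion, not what the project does; `get_installed_version` and the fallback string are hypothetical, while `prodigy-prot` is the distribution name given in the installation instructions.

```python
from importlib.metadata import PackageNotFoundError, version


def get_installed_version(distribution="prodigy-prot", fallback="unknown"):
    """Return the version recorded by pip for the given distribution.

    Hypothetical helper: the CLI could print this value for -V so that it
    always agrees with the version declared in pyproject.toml.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Happens when running from a source checkout that was never installed.
        return fallback


print(get_installed_version())
```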

--acc-threshold range?

I believe that the --acc-threshold parameter specifies the accessibility threshold for the buried surface area analysis. Is this parameter a percentage? I tried changing the default from 0.05 to 1.25 and it worked, but anything larger failed.

My thought is that the IC contact list might help me with the TAP criteria. Would the IC residues be similar to the "CDR vicinity" in the TAP calculations?

Contact_list flag

Hello,

I'm currently running prodigy on a protein complex.
I don't see any changes in the terminal or in the directory when I add the --contact_list flag.

Do you have any idea why?


(prodigy) loris@big-gpu:~/dev/prodigy$ prodigy 7f62.pdb --selection H,L A
[+] Reading structure file: /home/loris/loris/ground_thruth/7f62.pdb
[!] Structure contains gaps:
        A PRO330 < Fragment 0 > A LYS444
        A GLY447 < Fragment 1 > A SER530
        H GLU1 < Fragment 2 > H ALA119
        L GLN1 < Fragment 3 > L ARG109
[+] Parsed structure file 7f62 (3 chains, 427 residues)
.....
[++] Predicted binding affinity (kcal.mol-1):    -11.5
[++] Predicted dissociation constant (M) at 25.0˚C:  3.6e-09

(prodigy) loris@big-gpu:~/dev/prodigy$ prodigy ground_thruth/7f62.pdb --selection H,L A --contact_list
[+] Reading structure file: /home/loris/loris/ground_thruth/7f62.pdb
[!] Structure contains gaps:
        A PRO330 < Fragment 0 > A LYS444
        A GLY447 < Fragment 1 > A SER530
        H GLU1 < Fragment 2 > H ALA119
        L GLN1 < Fragment 3 > L ARG109
[+] Parsed structure file 7f62 (3 chains, 427 residues)
....
[++] Predicted binding affinity (kcal.mol-1):    -11.5
[++] Predicted dissociation constant (M) at 25.0˚C:  3.6e-09
(prodigy) loris@big-gpu:~/dev/prodigy$ 

Dataset at http://bmm.crick.ac.uk/~bmmadmin/Affinity not available anymore.

Dear Authors,

Thanks so much for the amazing work.

I am a PhD student at Eotvos Lorand University looking for the dataset provided in the paper: Kastritis P. L., Moal I. H., Hwang H., Weng Z., Bates P. A., Bonvin A. M. and Janin J. (2011). A structure-based benchmark for protein-protein binding affinity.

I wanted to know if it's still available and, if not, which open-source datasets you would suggest that provide affinity data as well as 3D bound structures?

All the best,
Oz

Silent failure

When using the attached pdb file, prodigy silently fails in the server with the following message:

[!] Structure contains gaps:
A ASP1 < Fragment 0 > A THR186
A LEU187 < Fragment 1 > A LEU286
A LEU287 < Fragment 2 > A GLN425
A MET426 < Fragment 3 > A SER551
B CYS552 < Fragment 4 > B CYS567

Running Prodigy for structure test
ERROR:

test.pdb.zip

ValueError: Unsupported non-standard amino acid found: DC in parsers.py

Hi, I encountered this error after running prodigy on some PDB IDs (e.g. 4JGJ, 4H10, 4XRS).
It seems that some residue names are non-standard for some reason.

Demo code:
I am using python3_package branch.

> prodigy 4XRS.pdb  --selection A G --contact_list --pymol_selection
[+] Reading structure file: 4XRS.pdb
Traceback (most recent call last):
  File "prodigy", line 11, in <module>
    load_entry_point('prodigy', 'console_scripts', 'prodigy')()
  File "prodigy/prodigy/predict_IC.py", line 308, in main
    structure, n_chains, n_res = parse_structure(struct_path)
  File "prodigy/prodigy/lib/parsers.py", line 141, in parse_structure
    return (validate_structure(s),
  File "prodigy/prodigy/lib/parsers.py", line 87, in validate_structure
    raise ValueError('Unsupported non-standard amino acid found: {0}'.format(res.resname))
ValueError: Unsupported non-standard amino acid found:  DC

I wonder if there is any easy way to get around this error.
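A possible workaround, sketched under assumptions: DC is the PDB residue name for deoxycytidine, which suggests these entries contain DNA next to the protein chains. Stripping every coordinate record that is not one of the 20 standard amino acids before calling prodigy should then avoid the exception. The helper name `keep_standard_residues` is hypothetical, and the sketch relies only on the fixed-column PDB layout (residue name in columns 18-20).

```python
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def keep_standard_residues(pdb_lines):
    """Drop ATOM/HETATM records whose residue name is not a standard amino acid."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 18-20 (0-based slice 17:20) hold the residue name.
            if line[17:20].strip() not in STANDARD_AMINO_ACIDS:
                continue
        # Non-coordinate records (TER, END, REMARK, ...) are passed through.
        kept.append(line)
    return kept
```

To use it, read the lines of `4XRS.pdb`, pass them through the function, and write the result to a new file that is then given to prodigy. Note that this also removes waters, ligands, and modified residues such as MSE, so the cleaned structure may contain additional gaps.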

Result Interpretation

Hello,
I was wondering how to interpret the results of a complex.

[+] Parsed structure file 7F62_input_fasta_relaxed_rank_005_alphafold2_multimer_v3_model_1_seed_000 (3 chains, 427 residues)
[+] No. of intermolecular contacts: 55
[+] No. of charged-charged contacts: 3
[+] No. of charged-polar contacts: 8
[+] No. of charged-apolar contacts: 12
[+] No. of polar-polar contacts: 5
[+] No. of apolar-polar contacts: 16
[+] No. of apolar-apolar contacts: 11
[+] Percentage of apolar NIS residues: 35.60
[+] Percentage of charged NIS residues: 19.50
[++] Predicted binding affinity (kcal.mol-1): -10.7
[++] Predicted dissociation constant (M) at 25.0˚C: 1.3e-08

Is the binding affinity score made up of all the "[+]" values?
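Partly: assuming the linear model described in the eLife 2015 paper cited in the README, only four of the contact counts and the two NIS percentages enter the prediction, while the total, charged-polar, and apolar-apolar counts are reported but not used. The sketch below uses coefficients quoted from memory of that paper, so treat them as an assumption and check them against the publication; the function name `predict_dg` is hypothetical.

```python
def predict_dg(ic_charged_charged, ic_charged_apolar, ic_polar_polar,
               ic_apolar_polar, nis_apolar_pct, nis_charged_pct):
    """Linear IC/NIS model for the binding free energy in kcal/mol.

    Coefficients are assumed from Vangone & Bonvin (2015); verify before reuse.
    """
    return (
        -0.09459 * ic_charged_charged
        - 0.10007 * ic_charged_apolar
        + 0.19577 * ic_polar_polar
        - 0.22671 * ic_apolar_polar
        + 0.18681 * nis_apolar_pct
        + 0.13810 * nis_charged_pct
        - 15.9433
    )


# Values taken from the output shown in this question.
dg = predict_dg(3, 12, 5, 16, 35.60, 19.50)
print(f"{dg:.1f}")  # -10.7
```

With these inputs the sketch reproduces the -10.7 kcal/mol reported above, and the values of the 3BZD example in the README give -9.4 in the same way.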

The option with --pymol_selection produce no output

Hi,

Thank you for making this wonderful tool.
I was trying this command:

prodigy my_input.pdb --selection A B --pymol_selection

It prints the output.
But I couldn't find the Pymol script. What's the way to resolve it?

G.V.

ValueError: invalid literal for int() with base 10 at int(struct.residueNumber(idx))

Hi, I encountered this error after running prodigy on some PDB IDs (e.g. 1TVD, 4E44, 2PUX).
It seems that some residue numbers are not integers for some reason.

Demo code:
I am using python3_package branch.

> prodigy 1TVD.pdb  --selection A B --contact_list --pymol_selection 
[+] Reading structure file: 1TVD.pdb
[!] Structure contains gaps:
        A ASP1 < Fragment 0 > A GLY57
        A GLY58 < Fragment 1 > A PRO116
        B ASP1 < Fragment 2 > B GLY57
        B GLY58 < Fragment 3 > B PRO116

[+] Parsed structure file 1TVD (2 chains, 226 residues)
Traceback (most recent call last):
  File "prodigy", line 11, in <module>
    load_entry_point('prodigy', 'console_scripts', 'prodigy')()
  File "prodigy/prodigy/predict_IC.py", line 311, in main
    prodigy.predict(distance_cutoff=cmd.distance_cutoff, acc_threshold=cmd.acc_threshold)
  File "prodigy/prodigy/predict_IC.py", line 146, in predict
    _, cmplx_sasa = execute_freesasa_api(self.structure)
  File "prodigy/prodigy/lib/freesasa_tools.py", line 185, in execute_freesasa_api
    resid = int(struct.residueNumber(idx))
ValueError: invalid literal for int() with base 10: '57B'

I wonder if there is any easy way to get around this error.
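The value '57B' is a residue number followed by a PDB insertion code, which a plain `int()` call cannot parse. A minimal sketch of splitting the two parts is shown below; `split_residue_id` is a hypothetical helper, not part of prodigy or freesasa, and how the insertion code should then be used in the downstream bookkeeping is left open.

```python
import re

# Optional sign, digits, then an optional single-letter insertion code.
_RESID_PATTERN = re.compile(r"^\s*(-?\d+)([A-Za-z]?)\s*$")


def split_residue_id(raw):
    """Split a residue identifier such as '57B' into (57, 'B')."""
    match = _RESID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"Unrecognised residue identifier: {raw!r}")
    number, insertion_code = match.groups()
    return int(number), insertion_code


print(split_residue_id("57B"))  # (57, 'B')
print(split_residue_id("57"))   # (57, '')
```

Until insertion codes are handled by the tool itself, renumbering the affected residues in the input file is another way around the error.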

Improve `Radius array is <= 0 for the residue: ILE ,atom: CD` error message

Several users have asked for help with the error message `Radius array is <= 0 for the residue: ILE ,atom: CD`:

  1. https://ask.bioexcel.eu/t/prodigy-error-radius-array-is-0-for-the-residue-ile-atom-cd/4375
  2. https://ask.bioexcel.eu/t/error-radius-array-is-0-for-the-residue-ile-atom-cd/2835
  3. https://ask.bioexcel.eu/t/gap-in-protein-chain-and-atom-naming-error/3825
  4. https://ask.bioexcel.eu/t/prodigy-fails-to-calculate-binding-energy-of-protein-protein-complex/3420
  5. https://ask.bioexcel.eu/t/some-problems-in-doing-prodigy/3187

This message seems to be triggered by input structures whose atom names differ from the canonical PDB names (here, an isoleucine atom named CD rather than the canonical CD1), which can be the case for structures generated with some modelling software.

We could improve the message so that it already includes advice on how to proceed:

[!] Error: Radius array is <= 0 for the residue: ILE ,atom: CD
[!] Error: Make sure the atom names in your PDB file match the canonical naming and belong to default residues
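For users who hit this error today, a workaround can be sketched under the assumption that the only problem is an isoleucine delta carbon named CD instead of the canonical PDB name CD1. The helper name `rename_ile_cd` is hypothetical; it relies on the fixed-column PDB layout (atom name in columns 13-16, residue name in columns 18-20) and leaves every other record untouched.

```python
def rename_ile_cd(pdb_lines):
    """Rename the ILE atom 'CD' to the canonical 'CD1' in PDB coordinate records."""
    fixed = []
    for line in pdb_lines:
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        # 0-based slices: 12:16 is the atom name, 17:20 is the residue name.
        if is_coordinate and line[17:20] == "ILE" and line[12:16].strip() == "CD":
            line = line[:12] + " CD1" + line[16:]
        fixed.append(line)
    return fixed
```

Other non-canonical names, for example terminal oxygens or hydrogens written by simulation packages, would need their own mapping and are not covered by this sketch.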

Unable to install by pip

When I try to install PRODIGY with pip, something goes wrong:

minfei@minfei:~$ pip install prodigy
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement prodigy (from versions: none)
ERROR: No matching distribution found for prodigy

Domain-wise binding affinities

Hi again,
A quick question: would it be possible to calculate the binding affinity for a defined section of the interface, e.g. a protein domain, with the method used by prodigy?
It seems that prodigy (in its current version) is designed to estimate the binding affinity for the entire binding interface. Out of curiosity, I wonder if there is any possibility that this method could be used to obtain domain-wise binding affinities.
Thanks.
