bjornwallner / dockq
DockQ is a single continuous quality measure for protein docked models based on the CAPRI evaluation protocol
License: MIT License
I’m attempting to get the DockQ score of a model of CAPRI target #50, from the score_set dataset. The model is named Target50_0000.pdb and the correct crystal structure is named Target50_3r2x.pdb. Both are attached here (with the extension .txt added, as GitHub does not allow PDB files to be uploaded):
Target50_0000.pdb.txt
Target50_3r2x.pdb.txt
Running
scripts/fix_numbering.pl /path/to/Target50_0000.pdb /path/to/Target50_3r2x.pdb
works fine, but running
python3 DockQ.py /path/to/Target50_0000.pdb.fixed /path/to/Target50_3r2x.pdb -native_chain1 A B -native_chain2 C -model_chain1 A B -model_chain2 C
results in the following error:
Traceback (most recent call last):
File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 732, in <module>
main()
File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 510, in main
model_chains=get_pdb_chains(model)
File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 387, in get_pdb_chains
pdb_struct = pdb_parser.get_structure("reference", pdb)[0]
File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 100, in get_structure
self._parse(lines)
File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 123, in _parse
self.trailer = self._parse_coordinates(coords_trailer)
File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 198, in _parse_coordinates
resseq = int(line[22:26].split()[0]) # sequence identifier
IndexError: list index out of range
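The failing line in the traceback reads the residue number from columns 23-26 of a coordinate record (line[22:26]) and takes the first whitespace-separated token, so a record with that field left blank would raise exactly this IndexError. The following is a minimal sketch for locating such records; the function name find_blank_resseq and the two example records are made up for illustration and are not part of DockQ or Biopython.

```python
def find_blank_resseq(lines):
    """Return (line_number, text) for ATOM/HETATM records whose
    residue-number field (line[22:26]) is empty or all spaces."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith(("ATOM", "HETATM")) and not line[22:26].strip():
            bad.append((number, line.rstrip("\n")))
    return bad


# Two hypothetical records: the second one has a blank residue number.
example = [
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00\n",
    "ATOM      2  CA  MET A           12.560  13.300   2.400  1.00 20.00\n",
]

for number, text in find_blank_resseq(example):
    print(f"line {number}: blank residue number -> {text}")
```

To check a real file, pass an open file handle instead of the list, for example find_blank_resseq(open("/path/to/Target50_0000.pdb.fixed")).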
When I ran DockQ, it showed the error "AssertionError: Zero number of equivalent atoms in native and model ligand (chain B) 0 0. Check that the residue numbers in model and native is consistent." I used the fix_numbering.pl tool to produce the model.pdb.fixed file, but how do I use that file to fix the error?
I tried to run the README example for the multichain functionality and got this:
Traceback (most recent call last):
File "./DockQ.py", line 731, in
main()
File "./DockQ.py", line 568, in main
native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
File "./DockQ.py", line 452, in make_two_chain_pdb_perm
exec_path=os.path.dirname(Path.abspath(sys.argv[0]))
AttributeError: type object 'Path' has no attribute 'abspath'
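The AttributeError says that whatever the name Path refers to has no abspath attribute. Assuming Path here is pathlib.Path (the traceback does not show the import, so this is a guess), the function the line was probably reaching for is os.path.abspath. A minimal sketch of two equivalent ways to get the script directory, with script_dir and script_dir_pathlib as illustrative names only:

```python
import os
import sys
from pathlib import Path


def script_dir(argv0):
    """Directory of the running script, using os.path (abspath lives here)."""
    return os.path.dirname(os.path.abspath(argv0))


def script_dir_pathlib(argv0):
    """Same idea with pathlib; Path has absolute() and resolve(), not abspath()."""
    return str(Path(argv0).absolute().parent)


print(script_dir(sys.argv[0]))
print(hasattr(Path, "abspath"))  # False, which matches the error above
```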
I have already installed Biopython 1.79 with pip.
But when I launch DockQ.py it says:
Biopython version (1.59) is too old need at least >=1.61
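One possible explanation, stated here as an assumption rather than a diagnosis, is that the interpreter that runs DockQ.py imports a different Biopython installation than the one pip updated. The sketch below reports which interpreter is running and where a module is loaded from; describe_module is a hypothetical helper, not part of DockQ.

```python
import importlib
import importlib.util
import sys


def describe_module(name):
    """Report whether a module is importable, and its version and location."""
    if importlib.util.find_spec(name) is None:
        return {"found": False, "version": None, "path": None}
    module = importlib.import_module(name)
    return {
        "found": True,
        "version": getattr(module, "__version__", None),
        "path": getattr(module, "__file__", None),
    }


print("interpreter:", sys.executable)
print("Bio:", describe_module("Bio"))
```

Running this with the same interpreter that launches DockQ.py shows which Biopython version and path that interpreter actually sees.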
Hi, I found a "chain mismatch" problem in almost all the complexes I tested.
I've been stuck for days. I would very much appreciate it if anyone could help me out.
After I ran ./DockQ.py model/7P79.pdb native/7P79.pdb, it showed:
7P79.pdb
chain mismatch A B H C
chain mismatch A B H C
Traceback (most recent call last):
File "./DockQ.py", line 732, in
main()
File "./DockQ.py", line 660, in main
info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
File "./DockQ.py", line 234, in calc_DockQ
if key in chain_res[chain]: # if key is present in sample
KeyError: 'A'
Any tips for fixing this?
Thank you so much.
Dear @bjornwallner
My system is Ubuntu 20.04.
From https://emboss.sourceforge.net/download/, I installed the stable version EMBOSS-6.5.7.tar.gz, which was released in 2012/2013 according to the change log at https://emboss.sourceforge.net/developers/changelog.html.
After installing EMBOSS (I guess the installation failed), I cannot find "needle" with "which needle".
thanks very much.
When using the cython branch, after running
DockQ.py <model> <native> -native_chain1 A B -perm1 -perm2
I get an error (see below). Note that it works fine running
DockQ.py <model> <native>
or
DockQ.py <model> <native> -native_chain1 A -model_chain1 A
Best
Samuel
The error:
Traceback (most recent call last):
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 965, in <module>
main()
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 851, in main
test_info = run_on_groups(
^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 635, in run_on_groups
info = calc_DockQ(
^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 139, in calc_DockQ
ref_res_distances = get_residue_distances(
^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 533, in get_residue_distances
model_res_distances = residue_distances(
^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 29, in residue_distances
atom_distances = get_distances_across_chains(atom_coordinates1, atom_coordinates2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 6, in get_distances_across_chains
distances = ((model_A_atoms[:, None] - model_B_atoms[None, :]) ** 2).sum(-1)
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
ValueError: operands could not be broadcast together with shapes (4712,1,3) (1,0)
I’m attempting to get the DockQ score of a model of CAPRI target #47, from the score_set dataset. The model is named Target47_0064.pdb and the correct crystal structure is named Target47_3u4e.pdb. However, the DockQ script seems to be getting the interface RMSD wrong. Both are attached here (with the extension .txt added, as GitHub does not allow PDB files to be uploaded):
Target47_0064.pdb.txt
Target47_3u4e.pdb.txt
In the DockQ folder, I’ve run
scripts/fix_numbering.pl /path/to/Target47_0064.pdb /path/to/Target47_3u4e.pdb
then
python3 DockQ.py /path/to/Target47_0064.pdb.fixed /path/to/Target47_3u4e.pdb
which produces the following output (truncated to just show Fnat through DockQ):
Fnat 0.833 45 correct of 54 native contacts
Fnonnat 0.297 19 non-native of 64 model contacts
iRMS 4.576
LRMS 1.827
DockQ 0.629
However when I calculate the three components of DockQ myself (using protein structure tools from the C++ library Mosaist), I find:
Fnat: 0.846
iRMS: 1.019
lRMS: 1.819
DockQ: 0.829
The slight differences in Fnat and lRMS aren’t very concerning to me (I assume they come down to some slight difference in atom-matching between the structures), but the iRMS is significantly off. Examining the structure visually, it seems like the iRMS should be around 1 Å, so I think this probably comes down to a bug in the DOCKQ script where it sometimes gets iRMS wrong. This happens reproducibly with a number of other models for Target 47 as well, all having larger iRMS values than they should, lowering their DockQ scores considerably (usually a difference of ~0.2). Here's a visual of the model and crystal structure aligned in pymol, with the crystal in yellows, model in greens, and different binding partners in either lighter or darker shades. My apologies if I'm wrong, but it appears the iRMS should be much lower than 4.5 Å, and an iRMS of around 1.0 Å would be reasonable. I figured I should bring the discrepancy here, in case it represents some buggy edge case.
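To make the comparison above easier to follow, here is a minimal sketch of how the three components combine into DockQ: the mean of Fnat and two inverse-square-scaled RMSD terms, with scaling constants 1.5 Å for iRMS and 8.5 Å for LRMS. The function name dockq_score is illustrative and is not taken from DockQ.py; the constants do reproduce both DockQ values quoted in this report.

```python
def dockq_score(fnat, irms, lrms, d1=1.5, d2=8.5):
    """Mean of Fnat and the scaled iRMS and LRMS terms.

    d1 and d2 are the scaling distances (in angstroms) for iRMS and LRMS.
    """
    def scaled(rms, d):
        return 1.0 / (1.0 + (rms / d) ** 2)

    return (fnat + scaled(irms, d1) + scaled(lrms, d2)) / 3.0


# Values reported by DockQ.py in this issue
print(round(dockq_score(0.833, 4.576, 1.827), 3))  # 0.629
# Values computed independently with Mosaist
print(round(dockq_score(0.846, 1.019, 1.819), 3))  # 0.829
```

With Fnat and LRMS nearly identical in both runs, almost all of the ~0.2 gap comes from the iRMS term, which drops from about 0.68 at 1.019 Å to about 0.10 at 4.576 Å.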
I'm following the instructions in the README file: I cloned the repo, moved to its directory, and ran the make command. When I try to run the provided example, an error comes up.
The example command:
bash ./DockQ.py examples/model.pdb examples/native.pdb
The error that shows up
import-im6.q16: attempt to perform an operation not allowed by the security policy 'PS' @ error/constitute.c/IsCoderAuthorized/408.
from: can't read /var/mail/Bio
./DockQ.py: line 6: syntax error near unexpected token 'ignore','
./DockQ.py: line 6: 'warnings.simplefilter('ignore', BiopythonWarning)'
When I try to follow the example steps to check that the installation is OK, the text file 'renumber_pdb' just pops up on my screen.
I'm not sure why this is popping up, and the program isn't running. Any help would be appreciated.
(I'm trying to run the software on Windows, in the Anaconda terminal.)
I've noticed that the receptor is selected as the molecule with the larger number of residues. Is there a way to bypass this behavior?
Thanks in advance for your help, congratulations on this useful software!
When I try to run the make command, I get this:
cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm
process_begin: CreateProcess(NULL, cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:14: molecule.o] Error 2
I am writing a Python script to rename the chain IDs in a model file and have run into a lot of problems.
How do people align chain IDs before running DockQ? Is there a tool for this?
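As a rough illustration of the renaming step, assuming standard fixed-column PDB records where the chain ID is the single character in column 22 (line[21]), a chain rename can be done without any library. The helper rename_chains and the example records are hypothetical; this is a sketch, not a tool shipped with DockQ.

```python
def rename_chains(lines, mapping):
    """Return PDB lines with chain IDs replaced according to mapping.

    Each record is rewritten once, so swaps such as {"A": "B", "B": "A"}
    work without an intermediate placeholder chain.
    """
    renamed = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM", "TER")) and len(line) > 21:
            chain = line[21]
            line = line[:21] + mapping.get(chain, chain) + line[22:]
        renamed.append(line)
    return renamed


# Hypothetical records from chains A and H; rename H to B, leave A alone.
records = [
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00\n",
    "ATOM      2  N   GLU H   1      21.104  23.207  12.100  1.00 20.00\n",
]
for line in rename_chains(records, {"H": "B"}):
    print(line, end="")
```

Deciding which model chain corresponds to which native chain still has to be done separately, for example by comparing sequences; the sketch only applies a mapping once it is known.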
./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb
Multi-chain model need sets of chains to group
use -native_chain1 and/or -model_chain1 if you want a different mapping than 1-1
Model chains : ['A', 'H']
Native chains : ['N', 'H', 'L', 'A']
Then I tried this:
./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb -native_chain1 A H -model_chain1 A H
Traceback (most recent call last):
File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 732, in <module>
main()
^^^^^^
File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 569, in main
native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 447, in make_two_chain_pdb_perm
f.write(change_chain(pdb_chains[c],"A"))
~~~~~~~~~~^^^
KeyError: 'A'
Please help.
I am having trouble running DockQ with moderately large homo-dimers. Is it a known issue that the tools here fail when there are many residues in a chain?
I ran DockQ successfully for most of the models and references in a given benchmark set but the largest files failed.
The smallest file where I could observe a failure was when comparing the attached (4u59_2_files.zip) 4u59_2_model.pdb with 4u59_2.pdb (i.e. simple call ./DockQ.py 4u59_2_model.pdb 4u59_2.pdb).
Here the model covers more than the reference and so ./DockQ.py 4u59_2.pdb 4u59_2.pdb works (3076 residues in 4u59_2) while ./DockQ.py 4u59_2_model.pdb 4u59_2_model.pdb fails (3294 residues in 4u59_2_model).
The traceback of the error looks as follows when run with Python 3:
Traceback (most recent call last):
File ".../DockQ.py", line 730, in <module>
main()
File ".../DockQ.py", line 658, in main
info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
File ".../DockQ.py", line 112, in calc_DockQ
fnat_out = os.popen(cmd_fnat).read()
File ".../python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 852: invalid continuation byte
and as follows with Python 2:
Traceback (most recent call last):
File "../DockQ.py", line 730, in <module>
main()
File "../DockQ.py", line 658, in main
info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
File "../DockQ.py", line 118, in calc_DockQ
assert fnat!=-1, "Error running cmd: %s\n" % (cmd_fnat)
AssertionError: Error running cmd: .../fnat 4u59_2_model.pdb 4u59_2_model.pdb 5 -all
The latter error indicates an issue in the fnat binary, which indeed produces wrong-looking characters before segfaulting. Here are the last few lines of the output of fnat 4u59_2_model.pdb 4u59_2_model.pdb 5:
NATIVE: 25259?b 1629C 0.107644
Fnat 85805 13756 6.237642
Fnonnat -72049 13756 -5.237642
Segmentation fault
As an additional note, I observed plenty of compile-time warnings when compiling with GCC 10.3.0, and it may be worth checking them as they could indicate overflows or similar problems.
The specific files do not matter: I could reproduce the same failures when downloading moderately large homo-dimers from the PDB (e.g. https://files.rcsb.org/download/6EQO.pdb).
Given that large complexes and multi-domain proteins are interesting and challenging prediction problems, it would be good to fix the issue described here so that DockQ can be applied to benchmarks for such problems.
Hi,
Thank you for developing this useful software. When I attempt to run the script to align the protein lengths of the separate chains of the predicted model, I am met with the error that needle is not installed. However, when I run the script without any input, it says needle is in fact installed. How do I circumvent this?
Thanks in advance.
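One way to narrow this down is to check whether the needle executable is visible on the PATH of the environment that actually runs the script, since a tool can be installed yet missing from that particular PATH. This is a minimal sketch using only the standard library; find_tool is an illustrative name and the check does not reproduce how the DockQ scripts themselves look for needle.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


path = find_tool("needle")
if path is None:
    print("needle was not found on PATH in this environment")
else:
    print("needle found at", path)
```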
I want to use DockQ to score the results of AlphaFold-Multimer. However, I don't know which file is the "model" and which one is the "native".