bjornwallner / dockq

DockQ is a single continuous quality measure for protein docked models based on the CAPRI evaluation protocol

License: MIT License

Python 43.89% Makefile 0.55% C 43.69% Perl 11.80% Shell 0.07%

dockq's Issues

IndexError: list index out of range

I’m attempting to get the DockQ score of a model of CAPRI target #50, from the score_set dataset. The model is named Target50_0000.pdb and the correct crystal structure is named Target50_3r2x.pdb. Both are attached here (with a .txt extension added, because GitHub does not allow .pdb files to be uploaded):

Target50_0000.pdb.txt
Target50_3r2x.pdb.txt

Running

scripts/fix_numbering.pl /path/to/Target50_0000.pdb /path/to/Target50_3r2x.pdb

works fine, but running

python3 DockQ.py /path/to/Target50_0000.pdb.fixed /path/to/Target50_3r2x.pdb -native_chain1 A B -native_chain2 C -model_chain1 A B -model_chain2 C

results in the following error:

Traceback (most recent call last):
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 732, in <module>
    main()    
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 510, in main
    model_chains=get_pdb_chains(model)
  File "/dartfs/rc/lab/G/Grigoryanlab/home/coy/DockQ/DockQ.py", line 387, in get_pdb_chains
    pdb_struct = pdb_parser.get_structure("reference", pdb)[0]
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 100, in get_structure
    self._parse(lines)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 123, in _parse
    self.trailer = self._parse_coordinates(coords_trailer)
  File "/dartfs-hpc/rc/home/4/f002v94/.conda/envs/myenv/lib/python3.9/site-packages/Bio/PDB/PDBParser.py", line 198, in _parse_coordinates
    resseq = int(line[22:26].split()[0])  # sequence identifier
IndexError: list index out of range
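
The last frame of the traceback shows Biopython reading the residue number from `line[22:26]` (PDB columns 23-26) and failing when that field is empty. A minimal sketch for locating such records in a file, assuming the blank field is the cause; `find_blank_resseq` is a hypothetical helper name, not part of DockQ or Biopython:

```python
def find_blank_resseq(lines):
    """Return 1-based line numbers of ATOM/HETATM records whose
    residue-number field (PDB columns 23-26, i.e. line[22:26]) is empty."""
    bad = []
    for number, line in enumerate(lines, start=1):
        is_coordinate_record = line.startswith(("ATOM", "HETATM"))
        # An empty or whitespace-only field is what makes
        # int(line[22:26].split()[0]) raise IndexError.
        if is_coordinate_record and not line[22:26].strip():
            bad.append(number)
    return bad
```

Passing an open file handle for the .fixed model (for example `find_blank_resseq(open(path))`, where `path` is your own file) would list the lines to inspect by hand.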

AssertionError: Zero number of equivalent atoms in native and model ligand (chain B) 0 0.

When I run DockQ, it shows the error "AssertionError: Zero number of equivalent atoms in native and model ligand (chain B) 0 0. Check that the residue numbers in model and native is consistent." I used the fix_numbering.pl tool to produce the model.pdb.fixed file. How do I use that file to fix the error?

Multichain functionality in readme

I tried to run the README example for multichain functionality and got this:

Traceback (most recent call last):
  File "./DockQ.py", line 731, in <module>
    main()
  File "./DockQ.py", line 568, in main
    native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
  File "./DockQ.py", line 452, in make_two_chain_pdb_perm
    exec_path=os.path.dirname(Path.abspath(sys.argv[0]))
AttributeError: type object 'Path' has no attribute 'abspath'
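
The message says that the name `Path` refers to a class without an `abspath` attribute, which matches `pathlib.Path`; `abspath` is a function in `os.path`. A minimal sketch of two ways to get the script directory, assuming that name clash is the cause (the function names are hypothetical and not taken from DockQ.py):

```python
import os
from pathlib import Path


def script_dir_os(argv0):
    """Directory containing the script, using os.path only."""
    return os.path.dirname(os.path.abspath(argv0))


def script_dir_pathlib(argv0):
    """Same directory using pathlib; Path offers absolute(), not abspath()."""
    return str(Path(argv0).absolute().parent)


print(script_dir_os("./DockQ.py"))
print(script_dir_pathlib("./DockQ.py"))
```

Either form avoids calling `abspath` on the `Path` class.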

biopython version

I have already installed Biopython 1.79 with pip.
But when I launch DockQ.py, it says:
Biopython version (1.59) is too old need at least >=1.61
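
One possible explanation, offered as an assumption rather than a diagnosis, is that pip installed 1.79 into a different Python environment than the one that runs DockQ.py. A small sketch that reports which interpreter is running and where it would import a package from; `locate_module` is a hypothetical helper:

```python
import importlib.util
import sys


def locate_module(name):
    """Report the running interpreter and the file `name` would be imported from."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        # origin is None when the package is not importable from this interpreter
        "origin": spec.origin if spec is not None else None,
    }


print(locate_module("Bio"))
```

Running this with the same interpreter that launches DockQ.py shows whether the Bio package it finds lives where pip installed version 1.79.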

Chain mismatch , KeyError: 'A'

Hi, I hit a "chain mismatch" problem in almost all the complexes I tested.
I've been stuck for days and would very much appreciate it if anyone could help me out.

After I ran ./DockQ.py model/7P79.pdb native/7P79.pdb, it showed:

7P79.pdb
chain mismatch A B H Cchain mismatch A B H C
Traceback (most recent call last):
  File "./DockQ.py", line 732, in <module>
    main()
  File "./DockQ.py", line 660, in main
    info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
  File "./DockQ.py", line 234, in calc_DockQ
    if key in chain_res[chain]: # if key is present in sample
KeyError: 'A'

Any tips for fixing this?
Thank you so much.
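
A first check for this kind of error is to compare the chain IDs that actually occur in the two files. A minimal sketch, assuming standard fixed-column PDB records with the chain ID in column 22; `chain_ids` and `missing_chains` are hypothetical helpers, not DockQ functions:

```python
def chain_ids(lines):
    """Return chain IDs (PDB column 22, i.e. line[21]) in order of first appearance."""
    seen = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def missing_chains(model_lines, native_lines):
    """Chain IDs that occur in the native but not in the model."""
    model = set(chain_ids(model_lines))
    return [chain for chain in chain_ids(native_lines) if chain not in model]
```

Both functions accept any iterable of lines, so open file handles for the model and the native can be passed directly.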

cannot install emboss and cannot find "which needle"

Dear @bjornwallner

My system is Ubuntu 20.04.
From https://emboss.sourceforge.net/download/ I installed the stable version, EMBOSS-6.5.7.tar.gz,
which was released in 2012/2013 according to the change log at
https://emboss.sourceforge.net/developers/changelog.html.
After installing EMBOSS (I guess the installation failed), I cannot find "needle" with "which needle".

thanks very much.
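
The same lookup that `which needle` performs can be done from Python, which helps confirm whether the environment that runs the DockQ scripts sees the executable. A minimal sketch using only the standard library; `find_tool` is a hypothetical helper:

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


path = find_tool("needle")
print(path if path else "needle is not on PATH")
```

If this prints a path in one shell but not in another, the two shells are most likely using different PATH settings.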

[cython branch] Value error when using -perm1 -perm2 flags

When using the cython branch, after running

DockQ.py <model> <native> -native_chain1 A B -perm1 -perm2

I get an error (see below). Note that it works fine running
DockQ.py <model> <native>
or
DockQ.py <model> <native> -native_chain1 A -model_chain1 A

Best
Samuel

The error:

Traceback (most recent call last):
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 965, in <module>
main()
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 851, in main
test_info = run_on_groups(
^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 635, in run_on_groups
info = calc_DockQ(
^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 139, in calc_DockQ
ref_res_distances = get_residue_distances(
^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/DockQ.py", line 533, in get_residue_distances
model_res_distances = residue_distances(
^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 29, in residue_distances
atom_distances = get_distances_across_chains(atom_coordinates1, atom_coordinates2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/proj/berzelius-2021-29/users/x_safro/programs/iTM-align/operations_nocy.py", line 6, in get_distances_across_chains
distances = ((model_A_atoms[:, None] - model_B_atoms[None, :]) ** 2).sum(-1)
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
ValueError: operands could not be broadcast together with shapes (4712,1,3) (1,0)

DockQ.py incorrectly calculating iRMS

I’m attempting to get the DockQ score of a model of CAPRI target #47, from the score_set dataset. The model is named Target47_0064.pdb and the correct crystal structure is named Target47_3u4e.pdb. However, the DockQ script seems to be getting the interface RMSD wrong. Both are attached here (with a .txt extension added, because GitHub does not allow .pdb files to be uploaded):

Target47_0064.pdb.txt
Target47_3u4e.pdb.txt

In the DockQ folder, I’ve run

scripts/fix_numbering.pl /path/to/Target47_0064.pdb /path/to/Target47_3u4e.pdb

then

python3 DockQ.py /path/to/Target47_0064.pdb.fixed /path/to/Target47_3u4e.pdb

which produces the following output (truncated to just show Fnat through DockQ):

Fnat 0.833 45 correct of 54 native contacts
Fnonnat 0.297 19 non-native of 64 model contacts
iRMS 4.576
LRMS 1.827
DockQ 0.629

However when I calculate the three components of DockQ myself (using protein structure tools from the C++ library Mosaist), I find:

Fnat: 0.846
iRMS: 1.019
lRMS: 1.819
DockQ: 0.829

The slight differences in Fnat and lRMS aren’t very concerning to me (I assume they come down to some slight difference in atom-matching between the structures), but the iRMS is significantly off. Examining the structure visually, it seems like the iRMS should be around 1 Å, so I think this probably comes down to a bug in the DockQ script where it sometimes gets iRMS wrong. This happens reproducibly with a number of other models for Target 47 as well, all having larger iRMS values than they should, lowering their DockQ scores considerably (usually a difference of ~0.2). Here's a visual of the model and crystal structure aligned in PyMOL, with the crystal in yellows, model in greens, and different binding partners in either lighter or darker shades. My apologies if I'm wrong, but it appears the iRMS should be much lower than 4.5 Å, and an iRMS of around 1.0 Å would be reasonable. I figured I should bring the discrepancy here, in case it represents some buggy edge case.
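
To see how much of the score difference is due to iRMS, the three components can be recombined by hand. The sketch below assumes the standard DockQ definition, the mean of Fnat and the two RMSD terms scaled as 1 / (1 + (rms / d)^2) with d1 = 1.5 Å and d2 = 8.5 Å; the function name is hypothetical and the code is not taken from DockQ.py:

```python
def dockq_score(fnat, irms, lrms, d1=1.5, d2=8.5):
    """Combine Fnat, iRMS and LRMS into a DockQ score.

    Assumes the standard definition: each RMSD is scaled to (0, 1] as
    1 / (1 + (rms / d)^2), and the score is the mean of the three terms.
    """
    irms_scaled = 1.0 / (1.0 + (irms / d1) ** 2)
    lrms_scaled = 1.0 / (1.0 + (lrms / d2) ** 2)
    return (fnat + irms_scaled + lrms_scaled) / 3.0


# Values reported by DockQ.py in this issue
print(round(dockq_score(0.833, 4.576, 1.827), 3))
# Values computed independently with Mosaist
print(round(dockq_score(0.846, 1.019, 1.819), 3))
```

With the numbers from this issue, the two calls reproduce the reported scores of 0.629 and 0.829, so the gap of about 0.2 comes almost entirely from the iRMS term.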

Running Readme Example

I'm following the instructions in the README: I cloned the repo, moved into its directory, and ran the make command. When I try to run the provided example, an error comes up.
The example command:
bash ./DockQ.py examples/model.pdb examples/native.pdb
The error that shows up:
import-im6.q16: attempt to perform an operation not allowed by the security policy 'PS' @ error/constitute.c/IsCoderAuthorized/408.
from: can't read /var/mail/Bio
./DockQ.py: line 6: syntax error near unexpected token 'ignore','
./DockQ.py: line 6: 'warnings.simplefilter('ignore', BiopythonWarning)'

PermissionError: [WinError 32]

When I try to follow the example steps to check that the installation is OK, the text file 'renumber_pdb' just pops up on my screen.

I'm not sure why this is popping up, and the program isn't running. Any help would be appreciated.
(I'm trying to run the software on Windows, in the Anaconda terminal.)

Receptor as the smaller molecule

I've noticed that the receptor is chosen as the molecule with the larger number of residues. Is there a way to bypass this behavior?

Thanks in advance for your help, congratulations on this useful software!

Bug in aligning model to native (cython branch)

In the cython branch, an instance of Align.PairwiseAligner is created and configured, but not used:

DockQ/src/DockQ/DockQ.py

Lines 305 to 310 in 01f9703

 aligner = Align.PairwiseAligner()
 aligner.match = 5
 aligner.mismatch = 0
 aligner.open_gap_score = -10
 aligner.extend_gap_score = -0.5
 aln = Align.PairwiseAligner().align(model_sequence, native_sequence)[0]

Also, extracting the alignment is not accurate here:

DockQ/src/DockQ/DockQ.py

Lines 311 to 312 in 01f9703

 alignment["seqA"] = aln.format().split("\n")[0] #aln.seqA
 alignment["seqB"] = aln.format().split("\n")[2] #aln.seqB

and should be:

alignment['seqA'] = aln[0, :]
alignment['seqB'] = aln[1, :]

Trying to run Readme example, issues with make command

When I try to run the make command, I get this:
cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm
process_begin: CreateProcess(NULL, cc -O3 -funroll-loops -Isrc/ -c src/molecule.c -lm, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:14: molecule.o] Error 2

How to align chain ID of two complexes before running DockQ?

I am writing a Python script to rename the chain IDs in a model file and have run into a lot of problems.
How do people align chain IDs before running DockQ? Is there a tool for this?
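
For plain PDB files, renaming comes down to rewriting one fixed column. A minimal sketch, assuming single-character chain IDs in column 22 of ATOM, HETATM, TER and ANISOU records; `rename_chains` is a hypothetical helper, not a DockQ tool, and deciding which model chain corresponds to which native chain is left to the caller:

```python
def rename_chains(lines, mapping):
    """Yield PDB lines with the chain ID (column 22, i.e. line[21]) replaced.

    `mapping` maps old IDs to new ones, e.g. {"H": "A", "L": "B"}.
    Each line is rewritten once, so swaps such as {"A": "B", "B": "A"} work.
    Lines of other record types pass through unchanged.
    """
    records = ("ATOM", "HETATM", "TER", "ANISOU")
    for line in lines:
        if line.startswith(records) and len(line) > 21 and line[21] in mapping:
            line = line[:21] + mapping[line[21]] + line[22:]
        yield line
```

The generator can be fed an open input file and its output written straight to a new file, leaving the original model untouched.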

I got KeyError: 'A'

First I tried:

./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb
Multi-chain model need sets of chains to group
use -native_chain1 and/or -model_chain1 if you want a different mapping than 1-1
Model chains  : ['A', 'H']
Native chains : ['N', 'H', 'L', 'A']

Then I tried this:

./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb -native_chain1 A H -model_chain1 A H

Traceback (most recent call last):
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 732, in <module>
    main()    
    ^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 569, in main
    native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 447, in make_two_chain_pdb_perm
    f.write(change_chain(pdb_chains[c],"A"))
                         ~~~~~~~~~~^^^
KeyError: 'A'

Please help.

Chain size limitations?

I am having trouble running DockQ with moderately large homo-dimers. Is it a known issue that the tools here fail when a chain has many residues?

I ran DockQ successfully for most of the models and references in a given benchmark set but the largest files failed.
The smallest file where I could observe a failure was when comparing the attached (4u59_2_files.zip) 4u59_2_model.pdb with 4u59_2.pdb (i.e. simple call ./DockQ.py 4u59_2_model.pdb 4u59_2.pdb).

Here the model covers more than the reference and so ./DockQ.py 4u59_2.pdb 4u59_2.pdb works (3076 residues in 4u59_2) while ./DockQ.py 4u59_2_model.pdb 4u59_2_model.pdb fails (3294 residues in 4u59_2_model).

The traceback of the error looks as follows when run with Python 3:

Traceback (most recent call last):
  File ".../DockQ.py", line 730, in <module>
    main()    
  File ".../DockQ.py", line 658, in main
    info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
  File ".../DockQ.py", line 112, in calc_DockQ
    fnat_out = os.popen(cmd_fnat).read()
  File ".../python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 852: invalid continuation byte

and as follows with Python 2:

Traceback (most recent call last):
  File "../DockQ.py", line 730, in <module>
    main()    
  File "../DockQ.py", line 658, in main
    info=calc_DockQ(model,native,use_CA_only=use_CA_only,capri_peptide=capri_peptide) #False):
  File "../DockQ.py", line 118, in calc_DockQ
    assert fnat!=-1, "Error running cmd: %s\n" % (cmd_fnat)
AssertionError: Error running cmd: .../fnat 4u59_2_model.pdb 4u59_2_model.pdb 5 -all

The latter error indicates an issue in the fnat binary, which indeed produces wrong-looking characters before segfaulting. Here are the last few lines of the output of fnat 4u59_2_model.pdb 4u59_2_model.pdb 5:

NATIVE: 25259?b 1629C 0.107644
Fnat 85805 13756 6.237642
Fnonnat -72049 13756 -5.237642
Segmentation fault

As an additional note I observed plenty of compile-time warnings when compiling using GCC 10.3.0 and it may be worth checking them as they could be indicative of some overflows or so...

The specific files do not matter and I could reproduce the same failures when downloading moderately large homo-dimers from the PDB (e.g. https://files.rcsb.org/download/6EQO.pdb).

Given that large complexes and multi-domain proteins are interesting and challenging prediction problems, it would be good to fix the issue described here so that DockQ can be applied to benchmarks for such problems.
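
The Python 3 traceback above fails while decoding the output of `os.popen(cmd_fnat).read()`, which hides the real problem, the crash of the external binary. A minimal sketch of a more defensive call, assuming the caller wants both the exit status and whatever text was produced; `run_tool` is a hypothetical helper and not how DockQ.py is written:

```python
import subprocess


def run_tool(cmd):
    """Run an external command and return (exit_code, text).

    Output is captured as bytes and decoded with errors="replace", so stray
    non-UTF-8 bytes become U+FFFD instead of raising UnicodeDecodeError.
    On POSIX a process killed by a signal reports a negative exit code.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return result.returncode, result.stdout.decode("utf-8", errors="replace")


# Hypothetical usage with the command from the traceback:
# code, text = run_tool(["./fnat", "4u59_2_model.pdb", "4u59_2_model.pdb", "5", "-all"])
# if code != 0:
#     raise RuntimeError("fnat failed with exit code %d" % code)
```

Checking the exit code first lets the caller report the failing command directly instead of tripping over garbled output.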

Script fix_numbering.pl does not output model.pdb.fixed

Hi,

Thank you for developing this useful software. When I attempt to run the script to align the protein lengths of the separate chains of the predicted model, I am met with the error that needle is not installed. However, when I run the script without any input, it says needle is in fact installed. How do I circumvent this?

Thanks in advance.

Readme:./DockQ.py <model> <native>

I want to use DockQ to score the output of AlphaFold-Multimer. However, I have no idea which file is the "model" and which one is the "native".

