Hi Lane, On the in-house HPC (A100), the MD part of the simulation is over and too

converge step failure about seekr2 HOT 12 CLOSED

seekrcentral commented on June 14, 2024

converge step failure

from seekr2.

Comments (12)

vaibhavadixit commented on June 14, 2024

Hi,
I got the final results on the in-house HPC from the analyze step as follows.
The kon is at least 2 orders of magnitude smaller than those reported in your SEEKR2 paper and thus gives a lower DG.
Please do suggest what I should try to get values close to the exp and your paper.
Is it mostly related to convergence failure or other factors might also contribute?
I want to start simulating smaller and larger systems and thus want to be able to reproduce tutorial results before tackling complicated systems where the # of influencing factors could be even more complicated.
Please suggest.
Thank you and best regards.

SEEKR2 (2022) | kon = (2.4 ± 0.2) × 107 | koff = 990 ± 130 |DG = –5.98 ± 0.09

All post-simulation checks passed.
Printing results from MMVT SEEKR calculation
k_off (1/s): (1.731 ± 0.060) * 10^+04
k_ons :
k_on (1/s * 1/M) to state 0 : (8.53 ± 0.22) * 10^+05
Dissociation constant (M) to state 0 : (2.029 ± 0.087) * 10^-02
ΔG (kcal/mol) to state 0 : (-2.308 ± 0.025) * 10^+00
Mean first passage times (s):
MFPT from anchor_0 to bulk : (5.79 ± 0.19) * 10^-05
All plots being saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/

from seekr2.

lvotapka commented on June 14, 2024

First question: You don't need to have every anchor converged to proceed to the analyze step, but converge.py will warn you if some anchors do not appear to be completely converged, so if you were to dedicate more time to simulating anchors, converge.py shows you which ones would represent the best investment of your computation.

There could be any number of reasons why the results of the tutorial are different from the published paper. Are you using the systems whose charges were found using semiempirical QM? We used a more sophisticated method to generate the partial charges. Did you run as much MD as we did in the paper? Are you using the same milestone placements?

from seekr2.

vaibhavadixit commented on June 14, 2024

Ok,
Following is my model.xml file which shows that I've done 125000000 * 0.002 = 250 ns per anchor.
Your paper mentions 500 ns/anchor. Is it likely to be the main source of the discrepancy?
Also, will the code do the remaining MD steps if I increase the num_production_steps to =250000000?

Will this also automatically lead to convergence in the cases where it led to failure with 250 ns?
Please do suggest the best possible options from here to reproduce the results that you reported in SEEKR2 paper.
Thank you and best regards.
Vaibhav

298.0 mmvt **125000000** 1250000 1250000 1250000 . 12 11 PME 0.9 HBonds True 1.0 298.0 0 **0.002** 1e-06 langevin

from seekr2.

lvotapka commented on June 14, 2024

250 ns vs. 500 ns could be a possible source of some of the discrepancy.

Increasing to 500 ns (250000000) in the unconverged anchors may lead to convergence as defined by converge.py, but keep in mind that converge.py saying whether an anchor is converged or not is an arbirary metric, and can be adjusted using the arguments to converge.py. By default, an anchor is converged if it has more than 100 transitions and a "convergence metric" of less than 0.1. (The "convergence metric" is a function of transition time averages and variances between anchors). Whether converge.py returns False or True is more intended for an automated process to choose whether to keep running an anchor for more time or not. Users can look at those quantities to help them make a judgment of whether to run more, but their own intuition should override the output of converge.py.

The trypsin/benzamidine system from the seekrtools tutorial is not likely to recreate the results of the latest publication. If you really want to recreate results that are close to those reported in the paper, you should use the input files included in https://github.com/seekrcentral/seekr2_systems/tree/master/systems/trypsin_benzamidine_files. Specifically, you would run prepare.py on input_tryp_ben_mmvt.xml.

from seekr2.

vaibhavadixit commented on June 14, 2024

OK,
I'll try that asap, but I guess this would take 15-20 days on my A100 single GPU card, isn't it?
I usually use this approach to reproduce tutorials or published results before going into a detailed investigation with new/complicated and untested systems. I hope that makes sense here especially when theory and methods have multiple layers that could make a big difference in results.

Also, it would help if you can summarize the ligand parameterization procedure chosen in the paper.
Is it opt+esp calculation with hf/6-31G(d) or a dft with descent basis set like b3lyp/6-311+G(d,p) or higher along with the resp procedure that is available in AmberTools (antechamber)? I'm using this dft method for proteins with metal and organic co-factors.
I'm guessing that these would be compatible with the protein (ff19SB) and water TIP3P or OPC models.
Please suggest.
thank you.

from seekr2.

lvotapka commented on June 14, 2024

Feel free to attempt to recreate the results of the publication if you want to. As long as everything is done identically, I would expect you should get within an order of magnitude of the published results for both k-off and k-on.

The trypsin-benzamidine system was parametrized by my co-author Ben Jagger, and the procedure is described in this publication: https://pubs.acs.org/doi/full/10.1021/acs.jpcb.6b09388. Unfortunately, we were not very detailed in our exact parametrization procedure in that paper, but I'm fairly certain that Ben used RESP for the partial charges and GAFF for the other parameters. We used TIP4Pew waters. I'm not sure what level of DFT he used for the RESP, but if you send me a private email message, I can give you his email for you to ask him. He might also respond to a Github reference. @brjagger?

from seekr2.

vaibhavadixit commented on June 14, 2024

Hi,
I'll reassess and drop you an email in a day or two.

Related to this, I have a very general question.
Does the milestoning or MD theory, in general, allow an estimate for the traj equilibration, or property convergence time based on the protein/system properties (e.g., size, mol wt, sequence length, fold-type, volume, surface charge or any other combination of physchem properties)?
The choice of 500 ns for tryp-benzamidine system was based on such understanding or a simple arbitrarily chosen large number.
thanks.

from seekr2.

lvotapka commented on June 14, 2024

That is a question that we are very actively searching for an answer to - we, at least I, don't know yet how to do that. The 500 ns was an arbitrarily large number that we hoped would be enough to capture the important motions of the tryp-benz system.

from seekr2.

vaibhavadixit commented on June 14, 2024

hmm, that is very concerning for reasons obvious to experts like you.
In short, If we can't predict traj equil and property convergence time scales for different systems, then how can one be sure about the accuracy of kon, koff and kd value predictions for systems (ligands and proteins) where no experimental data exists and/or is very difficult to obtain experimentally. This has imp consequences that may precipitate as limitations for applying this theory/method to drug discovery, isn't it?

Please do share your current understanding and thoughts if possible supplemented with some literature links on the same.
For e.g. I'm trying to search for expected traj equil times but didn't find anything useful yet. I'm guessing property convergence would be a much harder problem.
thanks

from seekr2.

lvotapka commented on June 14, 2024

Yes, that limitation is disappointing, but I'd like to remind you that this is a problem, not just for SEEKR2, but for MD simulations and enhanced sampling methods in general. It is a large source of doubt for the use of molecular simulations for drug discovery. However, the situation is not hopeless. Convergence analysis like what exists in SEEKR2 offers some evidence whether a system is or is not converged. Convergence of SEEKR2 results to experimental values after a certain span of simulation time gives reassurance that the same span of time is likely to converge similar molecular systems. Optimal choice of CV is likely to give more confidence that SEEKR2 simulations will give decent results within a reasonable span of time. (see: https://pubmed.ncbi.nlm.nih.gov/35138832/).

from seekr2.

vaibhavadixit commented on June 14, 2024

Hi,
I just reran anchor 1 with 500 ns to check if it improves the convergence and prediction of kinetic parameters.
To my surprise, anchor 1 did not converge but others did, strange!
Nonetheless, the predictions for kinetics are more or less similar (two orders of magnitude off for koff and 1 order of magnitude for kon and 1/3rd for DG.
Please let me know your thoughts on this outcome.
I'll try to rerun the whole calculation with resp charges for ligand and 500 ns/anchor.
I'm also trying to prepare the system for Heme protein and will let you know if I face trouble setting those calculations.
thank you.

(base) [vaibhav@param_embryo seekr2-tutorial]$ tail job.204.out
All post-simulation checks passed.
Printing results from MMVT SEEKR calculation
k_off (1/s): (1.918 ± 0.069) * 10^+04
k_ons :
k_on (1/s * 1/M) to state 0 : (8.53 ± 0.24) * 10^+05
Dissociation constant (M) to state 0 : (2.25 ± 0.10) * 10^-02
ΔG (kcal/mol) to state 0 : (-2.247 ± 0.027) * 10^+00
Mean first passage times (s):
MFPT from anchor_0 to bulk : (5.23 ± 0.19) * 10^-05
All plots being saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/
(base) [vaibhav@param_embryo seekr2-tutorial]$

(base) [vaibhav@param_embryo seekr2-tutorial]$ cat job.203.out
All available options for --k_on_state include: [0]
No BD milestone has been provided for k_on_state. The BD milestone 0 has been chosen by default.
Processing interval 0 of 100
Processing interval 10 of 100
Processing interval 20 of 100
Processing interval 30 of 100
Processing interval 40 of 100
Processing interval 50 of 100
Processing interval 60 of 100
Processing interval 70 of 100
Processing interval 80 of 100
Processing interval 90 of 100
All plots have been saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/
Molecular dynamics results:

Anchor 0:
Milestone transitions: 0->0 : 580072,
Milestone avg. transition time (ps): 0 : 0.431,
Convergence value: 1.9139e-16.
Converged? True
Anchor 1:
Milestone transitions: 0->1 : 136, 1->0 : 137,
Milestone avg. transition time (ps): 0 : 3620.612, 1 : 18.060,
Convergence value: 1.3015e-01.
Converged? False
Anchor 2:
Milestone transitions: 1->2 : 3487, 2->1 : 3487,
Milestone avg. transition time (ps): 1 : 59.639, 2 : 12.046,
Convergence value: 1.2437e-16.
Converged? True
Anchor 3:
Milestone transitions: 2->3 : 5877, 3->2 : 5877,
Milestone avg. transition time (ps): 2 : 33.158, 3 : 9.358,
Convergence value: 0.0000e+00.
Converged? True
Anchor 4:
Milestone transitions: 3->4 : 127, 4->3 : 127,
Milestone avg. transition time (ps): 3 : 1279.104, 4 : 563.341,
Convergence value: 2.1342e-16.
Converged? True
Anchor 5:
Milestone transitions: 4->5 : 936, 5->4 : 937,
Milestone avg. transition time (ps): 4 : 167.988, 5 : 98.718,
Convergence value: 0.0000e+00.
Converged? True
Anchor 6:
Milestone transitions: 5->6 : 622, 6->5 : 623,
Milestone avg. transition time (ps): 5 : 87.694, 6 : 313.408,
Convergence value: 1.7429e-16.
Converged? True
Anchor 7:
Milestone transitions: 6->7 : 1188, 7->6 : 1187,
Milestone avg. transition time (ps): 6 : 102.850, 7 : 107.059,
Convergence value: 0.0000e+00.
Converged? True
Anchor 8:
Milestone transitions: 7->8 : 395, 8->7 : 395,
Milestone avg. transition time (ps): 7 : 60.049, 8 : 569.117,
Convergence value: 1.3724e-16.
Converged? True
Anchor 9:
Milestone transitions: 8->9 : 1789, 9->8 : 1789,
Milestone avg. transition time (ps): 8 : 89.347, 9 : 49.790,
Convergence value: 1.2121e-16.
Converged? True
Anchor 10:
Milestone transitions: 9->10 : 2609, 10->9 : 2609,
Milestone avg. transition time (ps): 9 : 44.651, 10 : 50.262,
Convergence value: 0.0000e+00.
Converged? True
Brownian dynamics results:
b-surface reaction counts:
to milestone 10: 26946
to milestone 9: 9463
escaped: 990537
stuck: 0
total: 1000000

from seekr2.

lvotapka commented on June 14, 2024

Closing this issue for now...

from seekr2.

converge step failure about seekr2 HOT 12 CLOSED

Comments (12)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent