Comments (12)
Hi,
I got the final results on the in-house HPC from the analyze step as follows.
The kon is at least 2 orders of magnitude smaller than those reported in your SEEKR2 paper and thus gives a lower DG.
Please do suggest what I should try to get values close to the exp and your paper.
Is it mostly related to convergence failure or other factors might also contribute?
I want to start simulating smaller and larger systems and thus want to be able to reproduce tutorial results before tackling complicated systems where the # of influencing factors could be even more complicated.
Please suggest.
Thank you and best regards.
SEEKR2 (2022) | kon = (2.4 ± 0.2) × 107 | koff = 990 ± 130 |DG = –5.98 ± 0.09
All post-simulation checks passed.
Printing results from MMVT SEEKR calculation
k_off (1/s): (1.731 ± 0.060) * 10^+04
k_ons :
k_on (1/s * 1/M) to state 0 : (8.53 ± 0.22) * 10^+05
Dissociation constant (M) to state 0 : (2.029 ± 0.087) * 10^-02
ΔG (kcal/mol) to state 0 : (-2.308 ± 0.025) * 10^+00
Mean first passage times (s):
MFPT from anchor_0 to bulk : (5.79 ± 0.19) * 10^-05
All plots being saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/
from seekr2.
First question: You don't need to have every anchor converged to proceed to the analyze step, but converge.py will warn you if some anchors do not appear to be completely converged, so if you were to dedicate more time to simulating anchors, converge.py shows you which ones would represent the best investment of your computation.
There could be any number of reasons why the results of the tutorial are different from the published paper. Are you using the systems whose charges were found using semiempirical QM? We used a more sophisticated method to generate the partial charges. Did you run as much MD as we did in the paper? Are you using the same milestone placements?
from seekr2.
Ok,
Following is my model.xml file which shows that I've done 125000000 * 0.002 = 250 ns per anchor.
Your paper mentions 500 ns/anchor. Is it likely to be the main source of the discrepancy?
Also, will the code do the remaining MD steps if I increase the num_production_steps to =250000000?
Will this also automatically lead to convergence in the cases where it led to failure with 250 ns?
Please do suggest the best possible options from here to reproduce the results that you reported in SEEKR2 paper.
Thank you and best regards.
Vaibhav
from seekr2.
250 ns vs. 500 ns could be a possible source of some of the discrepancy.
Increasing to 500 ns (250000000) in the unconverged anchors may lead to convergence as defined by converge.py, but keep in mind that converge.py saying whether an anchor is converged or not is an arbirary metric, and can be adjusted using the arguments to converge.py. By default, an anchor is converged if it has more than 100 transitions and a "convergence metric" of less than 0.1. (The "convergence metric" is a function of transition time averages and variances between anchors). Whether converge.py returns False or True is more intended for an automated process to choose whether to keep running an anchor for more time or not. Users can look at those quantities to help them make a judgment of whether to run more, but their own intuition should override the output of converge.py.
The trypsin/benzamidine system from the seekrtools tutorial is not likely to recreate the results of the latest publication. If you really want to recreate results that are close to those reported in the paper, you should use the input files included in https://github.com/seekrcentral/seekr2_systems/tree/master/systems/trypsin_benzamidine_files. Specifically, you would run prepare.py on input_tryp_ben_mmvt.xml.
from seekr2.
OK,
I'll try that asap, but I guess this would take 15-20 days on my A100 single GPU card, isn't it?
I usually use this approach to reproduce tutorials or published results before going into a detailed investigation with new/complicated and untested systems. I hope that makes sense here especially when theory and methods have multiple layers that could make a big difference in results.
Also, it would help if you can summarize the ligand parameterization procedure chosen in the paper.
Is it opt+esp calculation with hf/6-31G(d) or a dft with descent basis set like b3lyp/6-311+G(d,p) or higher along with the resp procedure that is available in AmberTools (antechamber)? I'm using this dft method for proteins with metal and organic co-factors.
I'm guessing that these would be compatible with the protein (ff19SB) and water TIP3P or OPC models.
Please suggest.
thank you.
from seekr2.
Feel free to attempt to recreate the results of the publication if you want to. As long as everything is done identically, I would expect you should get within an order of magnitude of the published results for both k-off and k-on.
The trypsin-benzamidine system was parametrized by my co-author Ben Jagger, and the procedure is described in this publication: https://pubs.acs.org/doi/full/10.1021/acs.jpcb.6b09388. Unfortunately, we were not very detailed in our exact parametrization procedure in that paper, but I'm fairly certain that Ben used RESP for the partial charges and GAFF for the other parameters. We used TIP4Pew waters. I'm not sure what level of DFT he used for the RESP, but if you send me a private email message, I can give you his email for you to ask him. He might also respond to a Github reference. @brjagger?
from seekr2.
Hi,
I'll reassess and drop you an email in a day or two.
Related to this, I have a very general question.
Does the milestoning or MD theory, in general, allow an estimate for the traj equilibration, or property convergence time based on the protein/system properties (e.g., size, mol wt, sequence length, fold-type, volume, surface charge or any other combination of physchem properties)?
The choice of 500 ns for tryp-benzamidine system was based on such understanding or a simple arbitrarily chosen large number.
thanks.
from seekr2.
That is a question that we are very actively searching for an answer to - we, at least I, don't know yet how to do that. The 500 ns was an arbitrarily large number that we hoped would be enough to capture the important motions of the tryp-benz system.
from seekr2.
hmm, that is very concerning for reasons obvious to experts like you.
In short, If we can't predict traj equil and property convergence time scales for different systems, then how can one be sure about the accuracy of kon, koff and kd value predictions for systems (ligands and proteins) where no experimental data exists and/or is very difficult to obtain experimentally. This has imp consequences that may precipitate as limitations for applying this theory/method to drug discovery, isn't it?
Please do share your current understanding and thoughts if possible supplemented with some literature links on the same.
For e.g. I'm trying to search for expected traj equil times but didn't find anything useful yet. I'm guessing property convergence would be a much harder problem.
thanks
from seekr2.
Yes, that limitation is disappointing, but I'd like to remind you that this is a problem, not just for SEEKR2, but for MD simulations and enhanced sampling methods in general. It is a large source of doubt for the use of molecular simulations for drug discovery. However, the situation is not hopeless. Convergence analysis like what exists in SEEKR2 offers some evidence whether a system is or is not converged. Convergence of SEEKR2 results to experimental values after a certain span of simulation time gives reassurance that the same span of time is likely to converge similar molecular systems. Optimal choice of CV is likely to give more confidence that SEEKR2 simulations will give decent results within a reasonable span of time. (see: https://pubmed.ncbi.nlm.nih.gov/35138832/).
from seekr2.
Hi,
I just reran anchor 1 with 500 ns to check if it improves the convergence and prediction of kinetic parameters.
To my surprise, anchor 1 did not converge but others did, strange!
Nonetheless, the predictions for kinetics are more or less similar (two orders of magnitude off for koff and 1 order of magnitude for kon and 1/3rd for DG.
Please let me know your thoughts on this outcome.
I'll try to rerun the whole calculation with resp charges for ligand and 500 ns/anchor.
I'm also trying to prepare the system for Heme protein and will let you know if I face trouble setting those calculations.
thank you.
(base) [vaibhav@param_embryo seekr2-tutorial]$ tail job.204.out
All post-simulation checks passed.
Printing results from MMVT SEEKR calculation
k_off (1/s): (1.918 ± 0.069) * 10^+04
k_ons :
k_on (1/s * 1/M) to state 0 : (8.53 ± 0.24) * 10^+05
Dissociation constant (M) to state 0 : (2.25 ± 0.10) * 10^-02
ΔG (kcal/mol) to state 0 : (-2.247 ± 0.027) * 10^+00
Mean first passage times (s):
MFPT from anchor_0 to bulk : (5.23 ± 0.19) * 10^-05
All plots being saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/
(base) [vaibhav@param_embryo seekr2-tutorial]$
(base) [vaibhav@param_embryo seekr2-tutorial]$ cat job.203.out
All available options for --k_on_state include: [0]
No BD milestone has been provided for k_on_state. The BD milestone 0 has been chosen by default.
Processing interval 0 of 100
Processing interval 10 of 100
Processing interval 20 of 100
Processing interval 30 of 100
Processing interval 40 of 100
Processing interval 50 of 100
Processing interval 60 of 100
Processing interval 70 of 100
Processing interval 80 of 100
Processing interval 90 of 100
All plots have been saved to: /home/vaibhav/seekr2-tutorial/images_and_plots/
Molecular dynamics results:
- Anchor 0:
Milestone transitions: 0->0 : 580072,
Milestone avg. transition time (ps): 0 : 0.431,
Convergence value: 1.9139e-16.
Converged? True - Anchor 1:
Milestone transitions: 0->1 : 136, 1->0 : 137,
Milestone avg. transition time (ps): 0 : 3620.612, 1 : 18.060,
Convergence value: 1.3015e-01.
Converged? False - Anchor 2:
Milestone transitions: 1->2 : 3487, 2->1 : 3487,
Milestone avg. transition time (ps): 1 : 59.639, 2 : 12.046,
Convergence value: 1.2437e-16.
Converged? True - Anchor 3:
Milestone transitions: 2->3 : 5877, 3->2 : 5877,
Milestone avg. transition time (ps): 2 : 33.158, 3 : 9.358,
Convergence value: 0.0000e+00.
Converged? True - Anchor 4:
Milestone transitions: 3->4 : 127, 4->3 : 127,
Milestone avg. transition time (ps): 3 : 1279.104, 4 : 563.341,
Convergence value: 2.1342e-16.
Converged? True - Anchor 5:
Milestone transitions: 4->5 : 936, 5->4 : 937,
Milestone avg. transition time (ps): 4 : 167.988, 5 : 98.718,
Convergence value: 0.0000e+00.
Converged? True - Anchor 6:
Milestone transitions: 5->6 : 622, 6->5 : 623,
Milestone avg. transition time (ps): 5 : 87.694, 6 : 313.408,
Convergence value: 1.7429e-16.
Converged? True - Anchor 7:
Milestone transitions: 6->7 : 1188, 7->6 : 1187,
Milestone avg. transition time (ps): 6 : 102.850, 7 : 107.059,
Convergence value: 0.0000e+00.
Converged? True - Anchor 8:
Milestone transitions: 7->8 : 395, 8->7 : 395,
Milestone avg. transition time (ps): 7 : 60.049, 8 : 569.117,
Convergence value: 1.3724e-16.
Converged? True - Anchor 9:
Milestone transitions: 8->9 : 1789, 9->8 : 1789,
Milestone avg. transition time (ps): 8 : 89.347, 9 : 49.790,
Convergence value: 1.2121e-16.
Converged? True - Anchor 10:
Milestone transitions: 9->10 : 2609, 10->9 : 2609,
Milestone avg. transition time (ps): 9 : 44.651, 10 : 50.262,
Convergence value: 0.0000e+00.
Converged? True
Brownian dynamics results: - b-surface reaction counts:
to milestone 10: 26946
to milestone 9: 9463
escaped: 990537
stuck: 0
total: 1000000
from seekr2.
Closing this issue for now...
from seekr2.
Related Issues (20)
- presimulation check failures and Model Input File HOT 12
- attribute error in analyze.py HOT 3
- query about tutorial simulation time trypsin-benzamidine HOT 9
- errors in seekr2 test HOT 5
- Importance of Target vulnerability for binding kinetics HOT 3
- Non-Uniform Discretization HOT 10
- unable to locate trypsin-benzamidine tutorial page HOT 3
- Convergence Check HOT 36
- error while generating trypsin-benzamidine smd input for hidr HOT 6
- assumptions related to milestoning size/shape interactions in MD region HOT 8
- seekr2 testing error on HPC HOT 16
- BD or MD first with run.py any HOT 25
- error at the convergence step HOT 7
- error in prepare step due to ambpdb output HOT 1
- installing seekr2 on HPC via user account HOT 24
- Build Seekr2 Plugin from source fails HOT 3
- error setting up seekr2 calculation with Heme protein HOT 13
- warnings in prepare.py step HOT 1
- Import Error During Installation HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from seekr2.