Can I get an update on the latest protocol recommendations (at least in the saltswap c

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Question on relaxation vs alchemical steps,about choderalab/saltswap

Comments (32)

jchodera commented on July 3, 2024

Those issues aren't really relevant if you are (1) using alchemical systems generated with one of our alchemical factories, and (2) disable the long-range dispersion calculation for the alchemically-modified atoms in the alchemical factory with the disable_alchemical_dispersion_correction=True argument when creating your alchemically-modified system.

The issues @gregoryross noted were mainly due to the fact that he was instead (1) calling updateParametersInContext and (2) causing the long-range dispersion correction to be numerically recomputed on the CPU each time.

from saltswap.

jchodera commented on July 3, 2024

I'd love to hear about how you guys are doing with BLUES. I think a few weeks back we had asked if you had looked at the stddev of the protocol work as a function of the number of protocol steps, since this is the main way to examine what is going on with protocol efficiency, but I've never seen any plots.

from saltswap.

jchodera commented on July 3, 2024

Specifically, quantifying this for hopping a ligand around in a box of solvent would be sufficient to diagnose any issues.

@gregoryross also found that continuing to use a barostat during the NCMC protocol was important in getting good acceptance rates as well. Not sure if this applies to your case too.

from saltswap.

davidlmobley commented on July 3, 2024

Ah, OK, so I misunderstood slightly. I thought that the thread was saying there was still a major speed difference even aside from the alchemical dispersion correction.

I'd love to hear about how you guys are doing with BLUES.

We've been working through figuring out how to get reasonable acceptance in the "new world" where we no longer use the buggy alchemy which resulted in better acceptance but also crashes due to what I think was basically noise in the work distributions. Looks like a lot more relaxation (20K steps or more) which we are still sorting through.

I think a few weeks back we had asked if you had looked at the stddev of the protocol work as a function of the number of protocol steps, since this is the main way to examine what is going on with protocol efficiency, but I've never seen any plots

I think I'd missed this suggestion, but I'm checking with Sam.

@gregoryross also found that continuing to use a barostat during the NCMC protocol was important in getting good acceptance rates as well. Not sure if this applies to your case too.

Why? This is very interesting -- we have not pursued that at all. Maybe we should look at it.

Specifically, quantifying this for hopping a ligand around in a box of solvent would be sufficient to diagnose any issues.

We've done a bit of that specific test now and can at least get acceptance; we also see the work distribution shift towards "less unfavorable" as we do more relaxation, but have not looked specifically at the standard deviation.

from saltswap.

davidlmobley commented on July 3, 2024

One other thing I've been interested in is whether it would make sense to do more propagation near the "physical" states rather than at the alchemical intermediate states. At least for some simple problems (like rotating a ligand in-place in a cavity in a protein) I can convince myself this would make sense -- for example if you relax a great deal when it is non-interacting, then you may have the issue where the protein cavity collapses around the ligand, and then you have to push it back out of the way again as you turn the ligand back on. In that case, I can wave my hands and convince myself that relaxing more closer to the "physical" states and less towards the intermediate state might result in better acceptance. However, I'm not convinced it's worth exploring this much nor that it would really improve things much.

from saltswap.

jchodera commented on July 3, 2024

We've been working through figuring out how to get reasonable acceptance in the "new world" where we no longer use the buggy alchemy which resulted in better acceptance but also crashes due to what I think was basically noise in the work distributions. Looks like a lot more relaxation (20K steps or more) which we are still sorting through.

This sounds much more in line with my experience and expectations.

One other thing I've been interested in is whether it would make sense to do more propagation near the "physical" states rather than at the alchemical intermediate states. At least for some simple problems (like rotating a ligand in-place in a cavity in a protein) I can convince myself this would make sense -- for example if you relax a great deal when it is non-interacting, then you may have the issue where the protein cavity collapses around the ligand, and then you have to push it back out of the way again as you turn the ligand back on. In that case, I can wave my hands and convince myself that relaxing more closer to the "physical" states and less towards the intermediate state might result in better acceptance. However, I'm not convinced it's worth exploring this much nor that it would really improve things much.

There are likely ways to optimize the protocols to increase acceptance rates---potentially by orders of magnitude---but the first step is to get to the point where the work stddevs are less than a few kT. The relevant theory is explained here:

http://threeplusone.com/Crooks2007c.pdf
http://threeplusone.com/Sivak2012b.pdf

Why? This is very interesting -- we have not pursued that at all. Maybe we should look at it.

We don't know for sure, but the use of a barostat did have a measurable effect.

from saltswap.

sgill2 commented on July 3, 2024

I'd love to hear about how you guys are doing with BLUES. I think a few weeks back we had asked if you had looked at the stddev of the protocol work as a function of the number of protocol steps, since this is the main way to examine what is going on with protocol efficiency, but I've never seen any plots.

@jchodera Here's some histogram plots of the work distributions from NCMC using various relaxations steps for toluene in water, and toluene bound to T4 lysozyme.
The titles of each histogram give the following information, in order: "ncmc steps", "acceptance", "total samples", "stddev", "mean". The x axis is the work in terms of -kBT (so positive is more favorable for acceptance).
toluene in water without random rotations

toluene in water with random rotations

toluene in lysozyme with random rotations

from saltswap.

jchodera commented on July 3, 2024

@sgill2: These plots look exactly how I would expect! Thanks!

You should be able to take this data and plot the <P_accept> and its 95% confidence intervals as a function of the number of steps (perhaps on a log-log plot) so that you can see how the acceptance rates are changing with the number of steps. Your acceptance rates are still pretty low in lysozyme, but there will be ways we can increase those acceptance rates.

There are a lot of ways we can increase the acceptance probabilities. Here are just a few ideas:

Optimize the nonequilibrium path. If you've stored the work vs step number, we can likely estimate the thermoynamic length as a function of lambda and from this develop a better protocol. The easiest way to compute the thermodynamic metric tensor is to estimate var(du/dlambda) at different values of lambda from equilibrium simulations at fixed lambda, however, so you might consider this too. We can also optimize the alchemical softcore parameters, for example.
Use some method to increase timestep without increasing error. For example, it could be that hydrogen mass partitioning (HMR) lets us take larger timesteps. We're exploring this right now, quantifying the error for a T4 lysozyme system in explicit solvent.
Don't propagate all degrees of freedom during the switching. The more degrees of freedom there are, the larger the work variance. If we only let residues near the insert-delete locations move (by setting the masses of everything else to zero), for example, we may be able to greatly reduce the work variance because the system behaves like a smaller system.
Come up with more clever proposals. If you're hopping between endpoints where we don't already have to form a cavity, the dissipative work will be smaller and the acceptance rates will go up.
Try multiple proposals in parallel. When MC acceptance rates are low, multiple attempts can be made in parallel and one move can be accepted using a modified Metropolis criteria, such as in waste-recycling Monte Carlo. This can linearly increase small acceptance rates.

from saltswap.

sgill2 commented on July 3, 2024

@jchodera Thanks! Those all sound like interesting suggestions. I'm excited to try some of them out and see how they affect the acceptance.

from saltswap.

sgill2 commented on July 3, 2024

I've ran the same toluene in water NCMC simulations (with random rotations) as above, but this time using the MonteCarloBarostat. I observed higher acceptance rates using the barostat, but the work distributions were distributed quite differently than the NCMC simulations without a barostat.

Toluene in water (with random rotations)

Toluene in water with a barostat (and with random rotations

Toluene in lysozyme (with random rotations)

Toluene in lysozyme with barostat (and with random rotations)

The main difference is that the std deviation doesn't seem to decrease with increased relaxation as before. I'm in the process of running the toluene/lysozyme systems with a barostat, and the shorter relaxation runs show similar behavior to the toluene/water system with a barostat.

We could rationalize the acceptance increase of the barostat in the toluene/water case (since volume changes could help the water respond to the insertion/deletion of toluene), but we can't reason why the acceptance for the lysozyme case would be affected. From what John said earlier about the work distributions, we're leaning towards thinking that this isn't intended behavior and might be due to a bug in the barostat or in our code.

Do you have any thoughts on this @jchodera?

from saltswap.

jchodera commented on July 3, 2024

The main difference is that the std deviation doesn't seem to decrease with increased relaxation as before.

This is actually really concerning, and I'm wondering if this indicates some sort of bug in our work-tracking scheme. Can you point to the code you're using for these tests?

@gregoryross : Can you take a look at the above? I think it is essential to compute the acceptance rate vs number of steps (on a log-log plot) with and without the barostat for saltswap given the results above.

from saltswap.

sgill2 commented on July 3, 2024

I'll try to get up a minimum working example up–hopefully later today–using only openmmtools, so you folks don't have to decipher our BLUES code to look into this (and to help determine whether or not this is a bug in BLUES).

from saltswap.

sgill2 commented on July 3, 2024

Here’s a minimal example script for toluene in water. In the zip, the example.py script runs for a specified number of relaxation steps and number of iterations and outputs the protocol work after each iteration to a file. You can graph the output using the graph.py script.

bar_example.zip

from saltswap.

davidlmobley commented on July 3, 2024

@gregoryross -- just wanted to confirm you saw this? If not we can follow up with you by e-mail or Slack...?

from saltswap.

jchodera commented on July 3, 2024

I spoke to @gregoryross today. He is going to check his code (which uses ExternalPerturbationLangevinIntegrator) to see if it has the same behavior.

I'm working on running the bar_example.zip myself to verify we see the same thing, and looking into whether something might be going on with AlchemicalNonequilibriumLangevinIntegrator that would cause this issue.

Tagging @pgrinaway too since this issue will directly impact his current perses work, though he is swamped with immediate tasks just to get perses up and running.

from saltswap.

gregoryross commented on July 3, 2024

Hi @davidlmobley and @sgill2. Sorry for my late reply to all this, I've been preoccupied with sorting out my visa up to now. I'm now more able to return to this.

The results you've found are pretty odd @sgill2, especially the fact that the variance of the protocol work doesn't appear to be decreasing that much. I've taken a look at some of my saltswap benchmarking simulations to see if my results make sense.

As John mentioned, my code uses the ExternalPerturbationLangevinIntegrator together with updateParametersInContext to perform NCMC salt insertions and deletions. I already have some results where the barostat is switched on. Here's how the variance of the protocol work changes with increasing the number of perturbation steps:

These results are taken from simulations where there was 1 propagation step for every perturbation step. Each propagation step was 2fs long so that a protocol length of 20ps has 10,000 perturbation steps. The plot shows the variance for insertions and deletions. The variance decreases with more perturbation steps as we expect.

Here's a plot of the how the acceptance rate changes with increasing the number of perturbations:

As you can see, the acceptance rate increases with the number of perturbation steps.

Just to be clear, the results above have been run with the barostat switched on. I've currently set-up some new simulations that repeat the above simulations without the barostat.

Tomorrow I'll add some histograms of the work distributions so that we can have a clearer picture of what's going with my code. @sgill2, I'll also have a close look at your example script to see if I can spot anything.

from saltswap.

jchodera commented on July 3, 2024

@gregoryross : Presumably by "variance" you mean "standard deviation"?

from saltswap.

jchodera commented on July 3, 2024

@sgill2 : I'm looking at your example right now.

When you reject, you neglect to reset the box volume:

def acceptReject(context, original_pos, protocol_work):
	log_randnum = math.log(np.random.random())
	if -1*protocol_work > log_randnum:
		#move accepted, don't need to do anything                                                                                                                                                                                   
		pass
	else:
             	#rejected so use previous positions                                                                                                                                                                                         
		context.setPositions(original_pos)

You will need to store the box vectors and restore these before context.setPositions(original_pos).

from saltswap.

sgill2 commented on July 3, 2024

@jchodera Opps, I left setting the box vectors out in the example script, but did set them in my runs in BLUES. For BLUES though, I set the volume with ( context.setPeriodicBoxVectors) after setting the positions. Would that cause different behavior then setting the box vectors before the positions?

from saltswap.

jchodera commented on July 3, 2024

For BLUES though, I set the volume with ( context.setPeriodicBoxVectors) after setting the positions. Would that cause different behavior then setting the box vectors before the positions?

Yes, that could cause the coordinates to be incorrectly wrapped and your initial conditions for NCMC to be distorted.

from saltswap.

gregoryross commented on July 3, 2024

@jchodera, the plot does show the variance as a function of number of perturbations. However, the units on the y-axis are indeed wrong: it should read kT^2 instead of kT. Thanks for clearing that up.

from saltswap.

sgill2 commented on July 3, 2024

Yes, that could cause the coordinates to be incorrectly wrapped and your initial conditions for NCMC to be distorted.

Thanks. In that case I'll rerun my barostat examples again with that fix and let you folks know how the work distributions look after they finish.

from saltswap.

jchodera commented on July 3, 2024

@jchodera, the plot does show the variance as a function of number of perturbations. However, the units on the y-axis are indeed wrong: it should read kT^2 instead of kT. Thanks for clearing that up.

Let's show the stddev instead when possible.

from saltswap.

jchodera commented on July 3, 2024

Thanks. In that case I'll rerun my barostat examples again with that fix and let you folks know how the work distributions look after they finish.

Please do! We're happy to keep debugging if you're still seeing odd behavior at that point.

from saltswap.

sgill2 commented on July 3, 2024

@jchodera After making the changes to setting periodic boxes I still see the previously mentioned behavior with the barostat. Below are the stddev and acceptance plots for the toluene in water and lysozyme.
Again, the stddevs using the barostat are much higher than without using the barostat. This is particularly pronounced for the lysozyme case, where the acceptance rate increases dramatically for the barostat case, while the stddev is basically constant.
Toluene in Water (with random rotations)

Toluene in Lysozyme (with random rotations)

from saltswap.

jchodera commented on July 3, 2024

OK, this is progress! I was hoping you could update the simple test demonstrations script so we can focus on working on that together---do you have the updated version here?

from saltswap.

gregoryross commented on July 3, 2024

Here are histograms for the protocol work distributions for removing salt from a box of water for different numbers of NCMC perturbations. These simulations were run with a barostat. (I'm still waiting for my constant volume simulations to run on the cluster.)

These distributions exhibit the behaviour I would expect, with both the mean and variance decreasing for more NCMC perturbations.

Clearly this doesn't match what you're seeing @sgill2. I had a look at your minimal script, and in addition to using a different NCMC driver to the one used in saltswap, I noticed that you reset the velocities before every NCMC attempt. While I'm not sure if that would cause any problems, it is another difference between your NCMC protocol and mine. In saltswap, the velocities are reversed if a move is accepted; if a move is rejected, the velocities are retained.

from saltswap.

jchodera commented on July 3, 2024

Some other observations:

Please use integrator.get_protocol_work(dimensionless=True) to retrieve the protocol work
Be sure to use integrator.reset() at the beginning of each iteration to reset the work statistics instead of changing context variables directly
It should be OK to redraw the velocities at the beginning of each iteration---that just preserves the equilibrium distribution and make it irrelevant whether the velocities are negated or not on rejection.

from saltswap.

jchodera commented on July 3, 2024

Also, are you sure your protocol is symmetric? I haven't worked through this carefully:

functions = { 'lambda_sterics' : 'min(1, (1/0.3)*abs(lambda-0.5))',
                  'lambda_electrostatics' : 'step(0.2-lambda) - 1/0.2*lambda*step(0.2-lambda) + 1/0.2*(lambda-0.8)*step(lambda-0.8)' }

I'm still checking the bookkeeping to make sure we aren't doing something silly regarding barostats.

from saltswap.

jchodera commented on July 3, 2024

@sgill2 : Which version of OpenMM are you using?

python -c "from simtk import openmm; print(openmm.version.version)"

from saltswap.

jchodera commented on July 3, 2024

@sgill2 : One thing to note is that, if you don't call integrator.reset(), you're not resetting all of the relevant step counters, such as lambda_step which is used to compute the current lambda value:
https://github.com/choderalab/openmmtools/blob/master/openmmtools/integrators.py#L1700-L1719

from saltswap.

sgill2 commented on July 3, 2024

I've made the recommended changes to the example script here: bar_ex.zip.

Which version of OpenMM are you using?

I'm using 7.1.0.dev-9567ddb

Also, are you sure your protocol is symmetric?

I think so. Here are the plots of the equations.

One detail I left out in my last post was that those graphs were from using BLUES, so I'll probably want to rerun the same cases using the minimal example script. If everything above looks fine and there's no other suggested changes I'll go ahead and rerun them.

from saltswap.

Question on relaxation vs alchemical steps about saltswap HOT 32 OPEN

Comments (32)

Related Issues (13)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent