Comments (11)
Fri 30 Jan 2015 05:38:21 PM CET, comment #1:
We just checked how the deviations depend on the particle friction. For a friction of 10 (instead of 2), the fluid temperature is unaffected, but the particle temperature deviation is 5 times higher (and closer to the higher fluid temperature). This suggests that the error happens in the fluid and the particles are just perturbed from their proper temperature by the coupling with the improperly thermalized fluid.
from espresso.
Fri 30 Jan 2015 05:50:55 PM CET, comment #2:
Regarding (4): Could you elaborate in what way global seed plus thread index is improper seeding? Provided we use a reasonably uncorrelated RNG, that would be correct and is actually a common approach to many-core seeding. Using, for example, the RNG to initialize the seeds for the nodes in turn would be plain wrong, because that introduces correlations.
If you see nearly identical progression of the kinetic energy, that rather indicates that the RNG seed isn't used.
Regarding (5): That's likely the one you want to hunt down. The program might initially always start the same way because the initial memory contents are the same, but over time, due to MPI/PCI timing jitter, you will get different memory contents at places the code isn't supposed to read. And be advised that the last time we had such a read of uninitialized memory, valgrind was unable to detect it. printf debugging works, though, but you need to be able to handle gigabytes of logs.
from espresso.
Fri 30 Jan 2015 05:53:21 PM CET, comment #3:
Just a caveat: if you increase the friction, the integrator becomes less accurate, which may also affect the particle temperature. It would be better to keep gamma*dt constant.
The definitive way to check would be to measure the distribution of the modes, which will allow you to identify which one is off.
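A minimal sketch of the gamma*dt suggestion: when the friction is changed, rescale the time step so the product stays fixed. The frictions 2 and 10 are the values mentioned in comment #1; the reference time step of 0.01 and the function name are placeholders, not values from the test case.

```python
def rescaled_time_step(gamma_ref, dt_ref, gamma_new):
    """Return the time step that keeps gamma * dt equal to gamma_ref * dt_ref."""
    if gamma_new <= 0:
        raise ValueError("friction must be positive")
    return gamma_ref * dt_ref / gamma_new


# Frictions 2 and 10 come from the thread; dt_ref = 0.01 is an assumed placeholder.
dt_new = rescaled_time_step(gamma_ref=2.0, dt_ref=0.01, gamma_new=10.0)
print(dt_new)
```

With these numbers the time step shrinks by the same factor of 5 by which the friction grows.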
from espresso.
Tue 03 Feb 2015 01:46:45 PM CET, comment #4:
I played around a bit with the test case and discovered that the problem gets worse as one lowers the time step. For time steps of 10^-4 and lower, the particle temperature seems to be 0.5 while the fluid temperature is more or less correct. Not sure if this is helpful or not.
from espresso.
Tue 03 Feb 2015 02:51:47 PM CET, comment #5:
I discovered the cause of the discrepancy. I had ROTATION compiled in and the rotational degrees of freedom were only being equilibrated during an initial warm up. With a small time step these degrees of freedom did not have enough time to warm up and the resulting temperature was then half of what it should be. Apologies for the red herring.
Owen
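A small sketch of why cold rotational degrees of freedom would show up as half the temperature. It assumes (this is not stated in the thread) that the measured temperature is the equipartition estimate averaged over all 6 degrees of freedom per particle when ROTATION is compiled in, and that the rotational modes carry no energy yet. The function name and the values of N and T are illustrative.

```python
def kinetic_temperature(e_kin_total, n_particles, dof_per_particle, k_b=1.0):
    """Equipartition estimate: T = 2 * E_kin / (N * dof * k_B)."""
    return 2.0 * e_kin_total / (n_particles * dof_per_particle * k_b)


n = 1000          # illustrative particle count
t_target = 1.0    # illustrative thermostat temperature

# Translational modes fully thermalized: each of the 3 dof carries k_B * T / 2.
e_trans = 0.5 * 3 * n * t_target
# Rotational modes assumed still cold because the warm-up was too short.
e_rot = 0.0

t_trans_only = kinetic_temperature(e_trans, n, dof_per_particle=3)
t_with_rotation = kinetic_temperature(e_trans + e_rot, n, dof_per_particle=6)
print(t_trans_only, t_with_rotation)
```

Under these assumptions the translational estimate is 1.0 and the 6-dof estimate is 0.5, consistent with the value reported in comment #4.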
from espresso.
Tue 03 Feb 2015 04:01:24 PM CET, comment #6:
(7) If there are no particles in the system, the global seed is not propagated to the GPU, which then uses uninitialized memory as seed (according to Dominic).
@axel: By using thread_seed=global_seed+thread_index, subsequent runs whose global_seed doesn't differ by number_of_threads or more produce correlated temperatures. The reason is that the simulation uses the same random numbers to create the noise, just at shifted positions in the grid. The shift doesn't matter so much, since the temperature uses the summed kinetic energies. The only difference between the two systems (considering temperature) comes from the first few threads, which actually use new random numbers, as well as the different spatial relations of the "noises". If this does indeed explain what we see, then the drifting apart of two runs with two very similar seeds (+6) takes a very long time, though.
Of course this is not a problem in principle, one just needs to use sufficiently different seeds, but seeding with the plain pid is common (as in the test case), that's why I thought it should be changed. Dominic uses PID^k with a large k from the TCL level. Something like that would already work.
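The overlap argument can be made concrete by counting how many per-thread seeds two runs share. The sketch below is only an illustration: `additive_seeds` mirrors thread_seed=global_seed+thread_index from this comment, while `hashed_seeds` is a hypothetical alternative that mixes the pair (global_seed, thread_index) through SHA-256. It is neither what ESPResSo does nor the PID^k approach mentioned above, and the thread count and seed values are made up. It counts shared seeds only; it does not show whether the resulting temperatures are correlated.

```python
import hashlib


def additive_seeds(global_seed, n_threads):
    """Per-thread seeds as described in the thread: global seed plus thread index."""
    return [global_seed + i for i in range(n_threads)]


def hashed_seeds(global_seed, n_threads):
    """Hypothetical alternative: hash (global_seed, thread_index) to a 64-bit seed."""
    seeds = []
    for i in range(n_threads):
        digest = hashlib.sha256(f"{global_seed}:{i}".encode()).digest()
        seeds.append(int.from_bytes(digest[:8], "big"))
    return seeds


def shared_seed_count(seeds_a, seeds_b):
    """Number of seed values that appear in both runs."""
    return len(set(seeds_a) & set(seeds_b))


n_threads = 1024            # illustrative thread count
run_a, run_b = 4242, 4248   # two PID-like global seeds differing by 6

overlap_additive = shared_seed_count(
    additive_seeds(run_a, n_threads), additive_seeds(run_b, n_threads)
)
overlap_hashed = shared_seed_count(
    hashed_seeds(run_a, n_threads), hashed_seeds(run_b, n_threads)
)
print(overlap_additive, overlap_hashed)
```

With additive seeding, the two runs share all but 6 of their 1024 seeds, which is the "same random numbers at shifted positions" situation described above.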
from espresso.
Tue 03 Feb 2015 04:25:11 PM CET, comment #7:
I don't think "the shift doesn't matter". If the shift is not a multiple of the box length, then the neighbor topology of the nodes is different, and very quickly different parts of the random sequence interact, which should lead to additional decorrelation.
Otherwise, we would have a fundamental problem, since we use a 48-bit RNG. That takes just a couple of time steps to repeat, even for a classical MD with thousands of particles, and for LB it is much less.
from espresso.
Can this be closed?
from espresso.
In pull request #338 the lb test case failed again.
from espresso.
No. Several things need to be done:
- Take into account the momentum from the initial kick and calculate the kinetic energy in the center-of-mass reference frame. That might fix the temperature offset.
- The original script compared the observables only after the last integration step instead of comparing their averages. Also, the "expected" variances were way too high, so the test case rarely failed anyway. I fixed the former and fudged some values for the latter. This should be done more rigorously, though.
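As a rough illustration of the first point, the sketch below compares the kinetic energy in the lab frame with the kinetic energy after subtracting the center-of-mass velocity. It covers particles only and ignores the momentum carried by the fluid; the function names, masses, and velocities are made up for the example and are not taken from the test script.

```python
def kinetic_energy_lab_frame(masses, velocities):
    """Sum of 0.5 * m * |v|^2 using the velocities as given."""
    return sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))


def kinetic_energy_com_frame(masses, velocities):
    """Kinetic energy after subtracting the center-of-mass velocity."""
    total_mass = sum(masses)
    v_com = [
        sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
        for k in range(3)
    ]
    return sum(
        0.5 * m * sum((v[k] - v_com[k]) ** 2 for k in range(3))
        for m, v in zip(masses, velocities)
    )


# Made-up data: two unit-mass particles with opposite thermal velocities,
# plus a uniform kick along x applied to both.
masses = [1.0, 1.0]
kick = (1.0, 0.0, 0.0)
thermal = [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
velocities = [tuple(t[k] + kick[k] for k in range(3)) for t in thermal]

e_lab = kinetic_energy_lab_frame(masses, velocities)
e_com = kinetic_energy_com_frame(masses, velocities)
print(e_lab, e_com)
```

The lab-frame value exceeds the center-of-mass value by 0.5 * M * v_com^2, which is the contribution of the kick that would otherwise be counted as temperature.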
from espresso.
- The unexplained sudden divergence of different runs with the same seed remains.
- I don't think anyone implemented the GPU seeding interface.
These should be moved into a separate issue.
from espresso.