Comments (22)

shinaoka commented on September 2, 2024

Hi, I did not implement an estimate of the error bars simply because I did not see a need for it at that point.
There is no error information available in the current h5 output file.

How do you propagate the error of the average sign in a jackknife resampling?

dombrno commented on September 2, 2024

The idea would be to mimic a series of shorter QMC runs by splitting the full set of measurement points from a single long simulation into bunches. We would then invert the BSE and obtain the dynamic susceptibility at imaginary bosonic frequencies based on each of these bunches. For this, what we would need is G2(l, l', omega_n) at each measurement point of the QMC simulation, if that is dumped anywhere?
Said differently, we are trying to avoid running a number of independent simulations by analyzing the data of the long run we already have. But these data are probably not dumped; I would imagine they would be several GB in size.
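
For concreteness, here is a minimal sketch of the jackknife this would amount to, with the error of the average sign propagated through the ratio at every leave-one-out step. The per-bunch arrays are hypothetical, since (as noted above) they are not currently dumped:

```python
import numpy as np

def jackknife_sign_weighted(obs_bins, sign_bins):
    """Leave-one-out jackknife of <O> = <O * sign> / <sign> from per-bunch data.

    obs_bins  : array of shape (n_bins, ...), bunch averages of O * sign
    sign_bins : array of shape (n_bins,), bunch averages of the sign
    Returns (mean, error); the sign error is propagated automatically because
    every leave-one-out estimate re-evaluates the full ratio.
    """
    obs_bins = np.asarray(obs_bins)
    sign_bins = np.asarray(sign_bins, dtype=float)
    n = len(sign_bins)
    tot_obs = obs_bins.sum(axis=0)
    tot_sign = sign_bins.sum()
    jk = np.array([(tot_obs - obs_bins[i]) / (tot_sign - sign_bins[i])
                   for i in range(n)])
    mean = jk.mean(axis=0)
    err = np.sqrt((n - 1) / n * ((jk - mean) ** 2).sum(axis=0))
    return mean, err
```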

dombrno commented on September 2, 2024

In the old days, in the very first version of ALPS, there was a function "evaluate" which had to be called once the QMC was finished and which did all the averaging etc. Based on that architecture, it would have been possible to carry out the procedure mentioned above. But I understand that the new architecture probably does not give access to this fine-grained detail, which is normally not necessary, since the averaging is done at the end of the QMC run before the results are dumped.

shinaoka commented on September 2, 2024

It may be possible with the accumulator or alea libraries in ALPSCore.
Markus Wallerberger may be the right person to answer this technical question.
Alex, Emanuel, could you assign him to this thread?

shinaoka commented on September 2, 2024

Alex, Emanuel, any idea?

egull commented on September 2, 2024

Yes. We have the FullBinning observable in the old ALPS. It keeps a number of bins from the simulation (typically 128, I think) and fills them with the measured values. These are then written into the HDF5 file, and a BS equation calculation can take the data from the bins and post-process it.
A word of warning though: this will require 128 times the storage for the HDF5 file (and in memory), which is substantial for vertex functions.
Hiroshi, to enable this you would have to find the accumulator and change it from mean or nobinning to fullbinning. @dombrno would then have to rerun the calculation.
The new ALPS has the same capabilities; the rewrite of the ALEA library just has a somewhat optimized binning procedure. A paper is in preparation.
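
As a rough illustration of the post-processing this enables: once the bins are in the HDF5 file, something along the lines below would turn them into a mean and a naive error bar. The dataset path used here is only a guess at where such bins might end up and has to be checked against the actual file layout:

```python
import h5py
import numpy as np

# NOTE: the dataset path is an assumption, not the documented CT-HYB layout;
# inspect the file (e.g. `h5ls -r sim.out.h5`) to locate where the bins live.
with h5py.File("sim.out.h5", "r") as f:
    bins = f["/simulation/results/G2_Re/timeseries/data"][...]  # (n_bins, ...)

n_bins = bins.shape[0]
mean = bins.mean(axis=0)
err = bins.std(axis=0, ddof=1) / np.sqrt(n_bins)  # error from bin-to-bin scatter
```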

dombrno commented on September 2, 2024

I understand, that makes sense. I remember that in the course of adapting the old ALPS (segment) code to my needs (extending it to two orbitals and using a non-diagonal hybridization function), I replaced the proposed accumulators with others that did not do any binning, and enjoyed a much more compact output file (I did not check memory consumption, and I was only calculating the single-particle GF).

egull commented on September 2, 2024

So... how can we help? Should we get you a version that can do binning so you can try?

dombrno commented on September 2, 2024

I think that before anything is done, I should assess the memory and storage currently needed, and consider how much more my hardware is able to handle.

shinaoka commented on September 2, 2024

Thank you, Emanuel.

@dombrno
I want to ask how real the need is for this in your BS equation work.
What if you just run the CT-HYB solver several times with different random seeds?
This may also give an estimate of the error bars.

dombrno commented on September 2, 2024

Yes, that is an option.

Maybe we can keep this feature as a "nice to have" option, but I would give it the lowest priority.

dombrno commented on September 2, 2024

For what it's worth, with the current code the h5 file is 100 MB and the calculation consumes 5 GB of RAM. The available RAM on my hardware is 128 GB, so I could probably use up to 20 bins, if this option were ever implemented.

If I do develop a really strong need for it, I will ask for guidance as to which type of accumulator is best to use in your opinion and implement it on my side, but for the time being, thank you for your answer and your help, which perfectly answers my initial question.

dombrno commented on September 2, 2024

@shinaoka I finally went for the option you suggested: running the CT-HYB solver several times with different random seeds. I do this by using the job array feature of the PBS scheduler: a number of jobs are launched with exactly the same inputs. Each job uses one full node with 24 CPUs. The only difference between the jobs is the value of SEED in the input.ini file.

Does this look reasonable to you? In particular, since the seed in my setup increases by one from node to node, I would like to make sure that the same seed is used by all CPUs controlled by a given job, so that there can be no seed overlap between CPUs from different jobs. In other words, I would like confirmation that all the MPI processes of a single run share the same seed. This is probably a question for @galexv? Thanks a lot.

shinaoka commented on September 2, 2024

Hmm, the value of the seed increases one by one for different nodes?
This could cause a problem.
For the MPI process of rank n, the pseudorandom number generator is initialized with SEED + n.
Here, SEED is the seed given in the input file (this is the specification of the ALPSCore libraries).
So the value of SEED should increase by at least 24 from one node to the next.
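
In other words, the per-job seed has to be spaced by at least the number of MPI processes per job. A small sketch of that bookkeeping for a PBS job array (the base seed is arbitrary, and the environment variable name is an assumption; some PBS variants use PBS_ARRAYID instead of PBS_ARRAY_INDEX):

```python
import os

BASE_SEED = 100        # hypothetical base seed
PROCS_PER_JOB = 24     # one full node with 24 MPI processes per job

# index of this job within the job array (variable name depends on the scheduler)
job_index = int(os.environ.get("PBS_ARRAY_INDEX",
                               os.environ.get("PBS_ARRAYID", "0")))

# ALPSCore seeds MPI rank n with SEED + n, so spacing the per-job SEED by the
# number of processes per job guarantees that no two ranks share a seed.
seed = BASE_SEED + job_index * PROCS_PER_JOB
print(f"SEED = {seed}")  # e.g. substitute this value into the job's input.ini
```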

shinaoka commented on September 2, 2024

To prevent this from happening, I may be able to apply some non-linear transformation $f$ to SEED and initialize the random-number generators with f(SEED) + n.
What do you think?
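
As an illustration of what such an $f$ could look like, an integer mixing function in the style of the SplitMix64 finalizer would spread consecutive input seeds far apart, so that overlaps between the per-rank seeds f(SEED) + n become extremely unlikely instead of guaranteed (this is only a sketch of the idea, not what the code currently does):

```python
def mix_seed(seed: int) -> int:
    """SplitMix64-style finalizer: a cheap, strongly scrambling f(SEED)."""
    mask = 0xFFFFFFFFFFFFFFFF
    z = (seed + 0x9E3779B97F4A7C15) & mask
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & mask
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & mask
    return z ^ (z >> 31)

# MPI rank n would then be initialized with mix_seed(SEED) + n.
```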

dombrno commented on September 2, 2024

I suspected this could be the case, thanks for clarifying. I will simply increase the seed by 24 from one node to the next, and then everything should be fine. Thank you!

dombrno commented on September 2, 2024

Maybe just one detail to make sure everything is working as expected: I am controlling the seed via the key "SEED" at the top level of the .ini file, based on what I saw implemented in alps/mc/mcbase.cpp. Is this the recommended way to control this parameter?

shinaoka commented on September 2, 2024

You're right!

dombrno commented on September 2, 2024

I have now obtained the data from 64 runs, each on a single node, using different seed values. I would like to do some resampling of the quantities G1_LEGENDRE and G2_LEGENDRE. For this purpose I need the number of measurements performed in each run for these quantities (the average sign is 1.0). It looks like

  • /simulation/results/G1_Re/count
  • /simulation/results/G2_Re/count

might be the suitable fields - can you please confirm if this is correct?
Thank you!
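
Assuming those are indeed the right fields, collecting them from the 64 output files could look like this (the file naming is hypothetical):

```python
import h5py

# hypothetical naming scheme for the 64 independent runs
files = [f"run_{i:02d}/sim.out.h5" for i in range(64)]

counts_g1, counts_g2 = [], []
for fname in files:
    with h5py.File(fname, "r") as f:
        counts_g1.append(int(f["/simulation/results/G1_Re/count"][()]))
        counts_g2.append(int(f["/simulation/results/G2_Re/count"][()]))
```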

shinaoka commented on September 2, 2024

BTW, why do you need to know the number of measurements?

dombrno commented on September 2, 2024

Well, I have 64 samples and need to calculate the values of G1 and G2 over subsets of these samples, so I was thinking that the number of measurements is the natural weight to apply to each sample's contribution to the partial average, given that the average sign is 1.0.
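
In that spirit, a count-weighted combination with a jackknife error over the 64 runs might look like the sketch below. The names are hypothetical: `values` holds one run-averaged G1 or G2 per file and `counts` the corresponding measurement counts read as above:

```python
import numpy as np

def weighted_jackknife(values, counts):
    """Leave-one-out jackknife of a count-weighted mean over independent runs.

    values : array of shape (n_runs, ...), the per-run averages (e.g. G2 per run)
    counts : array of shape (n_runs,), the number of measurements per run
    Returns (weighted_mean, jackknife_error).
    """
    values = np.asarray(values)
    counts = np.asarray(counts, dtype=float)
    n = len(counts)

    full_mean = np.tensordot(counts, values, axes=1) / counts.sum()

    jk = []
    for i in range(n):                       # leave run i out
        c = np.delete(counts, i)
        v = np.delete(values, i, axis=0)
        jk.append(np.tensordot(c, v, axes=1) / c.sum())
    jk = np.array(jk)

    jk_mean = jk.mean(axis=0)
    err = np.sqrt((n - 1) / n * ((jk - jk_mean) ** 2).sum(axis=0))
    return full_mean, err
```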
