Comments (21)

davidrpugh avatar davidrpugh commented on June 15, 2024

@adrian-carro It looks like the random number generator instance rand is declared as a public variable, which means it is shared across all of the code.

Probably makes the most sense to create a single instance of the random number generator and then pass that instance into the constructor of any object that needs a source of randomness.
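
A minimal sketch of that idea, assuming java.util.Random as the generator (the model's own generator may be a different class). The Household class, its drawIncomeShock method, and InjectedRandomDemo are hypothetical names used only for illustration, not taken from housing-model:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical agent that receives its source of randomness from the caller
// instead of reading a shared public field.
class Household {
    private final Random prng;

    Household(Random prng) {
        this.prng = prng;
    }

    // Draws one value from the injected generator.
    double drawIncomeShock() {
        return prng.nextGaussian();
    }
}

class InjectedRandomDemo {
    // Creates a single seeded generator and passes it to every agent.
    static double[] run(long seed, int nHouseholds) {
        Random prng = new Random(seed); // the only generator in the program
        double[] shocks = new double[nHouseholds];
        for (int i = 0; i < nHouseholds; i++) {
            shocks[i] = new Household(prng).drawIncomeShock();
        }
        return shocks;
    }

    public static void main(String[] args) {
        // Two runs with the same seed print the same sequence of draws.
        System.out.println(Arrays.toString(run(42L, 3)));
        System.out.println(Arrays.toString(run(42L, 3)));
    }
}
```

With this layout the seed is set in exactly one place, and every object that needs randomness states that dependency in its constructor.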

from housing-model.

davidrpugh avatar davidrpugh commented on June 15, 2024

Also watch out for any operations that iterate over HashSet or HashMap data structures, as the iteration order is not guaranteed to be consistent between model runs.
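
One possible way to avoid this, sketched under the assumption that insertion order is an acceptable iteration order: swap HashSet for LinkedHashSet, which iterates in insertion order. The House class and IterationOrderDemo are hypothetical stand-ins, not classes from housing-model:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model object. It does not override hashCode(), so a plain
// HashSet would bucket it by identity hash, which can differ between JVM runs.
class House {
    final int id;

    House(int id) {
        this.id = id;
    }
}

class IterationOrderDemo {
    // LinkedHashSet iterates in insertion order, regardless of hash codes.
    static Set<House> buildOrdered(int n) {
        Set<House> houses = new LinkedHashSet<>();
        for (int i = 0; i < n; i++) {
            houses.add(new House(i));
        }
        return houses;
    }

    // Collects the ids in the order the set iterates over them.
    static List<Integer> idsInIterationOrder(Set<House> houses) {
        List<Integer> ids = new ArrayList<>();
        for (House h : houses) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(idsInIterationOrder(buildOrdered(5)));
    }
}
```

If random draws happen inside such a loop, a stable iteration order keeps each draw attached to the same object in every run.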

davidrpugh avatar davidrpugh commented on June 15, 2024

Is there a test case? What is the best way to generate an output file that we can quickly check is the same after two model runs?
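
One possible check, as a rough sketch rather than an existing test in the repository: compare two output files line by line and report the first line that differs. OutputComparer and firstDifferingLine are hypothetical names, and the files are assumed to be UTF-8 text:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OutputComparer {
    // Returns the 1-based number of the first differing line,
    // or -1 if both files have identical lines.
    static long firstDifferingLine(Path a, Path b) throws IOException {
        try (BufferedReader ra = Files.newBufferedReader(a, StandardCharsets.UTF_8);
             BufferedReader rb = Files.newBufferedReader(b, StandardCharsets.UTF_8)) {
            long lineNo = 1;
            while (true) {
                String la = ra.readLine();
                String lb = rb.readLine();
                if (la == null && lb == null) {
                    return -1; // both files ended without a difference
                }
                if (la == null || !la.equals(lb)) {
                    return lineNo; // one file ended early or the lines differ
                }
                lineNo++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java OutputComparer <file from run 1> <file from run 2>
        long line = firstDifferingLine(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(line < 0 ? "identical" : "first difference at line " + line);
    }
}
```

Knowing the first differing line narrows the search to the time step at which the two runs diverge.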

adrian-carro avatar adrian-carro commented on June 15, 2024

With PR #52, any problem with random number generation would only affect branch ML-calibration. I'm thus reducing priority to low, for now, though I'll check this issue towards the end of this week.

adrian-carro avatar adrian-carro commented on June 15, 2024

@davidrpugh I am reinstating high priority for this issue, as I've just had time to check the random numbers in the main branch and I again get inconsistencies between different runs. For example, if one prints prng.nextInt() after 5 time steps, different runs will lead to a different number being printed.

davidrpugh avatar davidrpugh commented on June 15, 2024

Just so that I can reproduce this issue myself: are you saying that if you insert a System.out.print(prng.nextInt()) in the main method of the Model.java class, then after t=5 time steps, different runs will produce different outputs?

Which output file should I look at to see the differences?

adrian-carro avatar adrian-carro commented on June 15, 2024

So, if you place a System.out.print(prng.nextInt()) within the main method in the Model class and you take note, for example, of the 5th number printed, when you compare it between different runs, you'll see different numbers. That proves something is going wrong with the random number generation. Apart from this, of course, you can compare the output files, which are also different after the first several time steps and in several columns.

davidrpugh avatar davidrpugh commented on June 15, 2024

Ok. The only way that can happen is if the generator is getting called a different number of times depending on the model run. I have a few ideas about why this is happening, but they will be difficult to diagnose given the current state of the code.
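
One way to test that hypothesis, sketched under the assumption that the generator is (or can temporarily be swapped for) a java.util.Random: wrap it in a subclass that counts every draw, then compare the counts at the same time step across runs. CountingRandom is a hypothetical helper, not part of housing-model:

```java
import java.util.Random;

// Hypothetical diagnostic wrapper: counts how often the generator is advanced.
class CountingRandom extends Random {
    private static final long serialVersionUID = 1L;
    private long calls = 0;

    CountingRandom(long seed) {
        super(seed);
    }

    // java.util.Random routes nextInt(), nextDouble(), etc. through next(int),
    // so counting here captures every draw. Some methods, such as
    // nextDouble(), call next(int) more than once per invocation.
    @Override
    protected int next(int bits) {
        calls++;
        return super.next(bits);
    }

    long getCalls() {
        return calls;
    }

    public static void main(String[] args) {
        CountingRandom prng = new CountingRandom(1L);
        prng.nextInt();
        prng.nextInt();
        System.out.println("draws so far: " + prng.getCalls());
    }
}
```

Printing getCalls() at the end of each time step shows the first step at which two runs consume a different number of random numbers.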

Can you please point me to the single most important example of a model output file that is different across model runs using the same seed?

adrian-carro avatar adrian-carro commented on June 15, 2024

Ok, let's take the Output-run1.csv file, row 6, corresponding to Model time = 4 (first column). For two different runs I find different values in the second column, nNonBTLHomeless. This is just an easy-to-find example, representative of what is happening. I am guessing any example is fine, as we just need to detect where the problem is stemming from.

davidrpugh avatar davidrpugh commented on June 15, 2024

Is the insertion of the print statement necessary to generate the bug? I spot checked the Output-1.csv file when I made my previous changes and the output appeared identical.

Is there any multi-threading in the model?

adrian-carro avatar adrian-carro commented on June 15, 2024

There is no multi-threading, and no, the print statement is not needed; it was just to get information faster, without any need to compare output files. Could you confirm if the Output-1.csv files are identical for you? That would be fairly surprising... why would there be system dependencies?

davidrpugh avatar davidrpugh commented on June 15, 2024

I will try to replicate the issue when I get home. I am running on Windows using some version of oraclejdk 8.

adrian-carro avatar adrian-carro commented on June 15, 2024

Great, thanks for doing this! I'm on Ubuntu 16.04, and running openjdk version "1.8.0_162".

davidrpugh avatar davidrpugh commented on June 15, 2024

@adrian-carro I just ran the main method of the Model.java class for roughly 500 time steps. This generates a directory of output in the Results folder. I then re-ran the main method (without changing anything) to generate a second directory of output in the Results folder. I then compared the resulting Output-run1.csv files. They appear to be identical. For example, here is the record for t=103 from both files:

First simulation I ran generated...

103, 283, 3, 286, 331, 617, 646, 31, 677, 255, 289, 2, 0, 1549, 1270, 12, 0, 7, 0.26614173228346455, 1.0244697019962505, -0.002299730036721126, 320082.397005936, 250030.7937810174, 252088.47594443636, 0.0, 21.326370650946313, 95, 46, 19, 19, 0, 19, 0.15789473684210525, 0.631578947368421, 1.01157359475851, -0.024098208995406822, 294.161880977776, 876.8550416063504, 624.6606537036129, 5.217391304347826, 270, 30, 23, 0.04411977882667759, 933, 185688.6040452648

Second simulation I ran generated...

103, 283, 3, 286, 331, 617, 646, 31, 677, 255, 289, 2, 0, 1549, 1270, 12, 0, 7, 0.26614173228346455, 1.0244697019962505, -0.002299730036721126, 320082.397005936, 250030.7937810174, 252088.47594443636, 0.0, 21.326370650946313, 95, 46, 19, 19, 0, 19, 0.15789473684210525, 0.631578947368421, 1.01157359475851, -0.024098208995406822, 294.161880977776, 876.8550416063504, 624.6606537036129, 5.217391304347826, 270, 30, 23, 0.04411977882667759, 933, 185688.6040452648

Is this the right experimental setup to replicate your results? Perhaps you are running the simulations from the command line rather than from within an IDE?

adrian-carro avatar adrian-carro commented on June 15, 2024

@davidrpugh I am using the IDE (IntelliJ IDEA), but I've also tried the command line with the same results: random numbers are not consistent between runs. I'm fairly surprised by the fact that you do not obtain that inconsistency! I know Donovan (one of our PhD students) is having the same problem, as he flagged it yesterday when he started working with the model, and he uses a Mac. Do you have any clue as to what might be happening here?

davidrpugh avatar davidrpugh commented on June 15, 2024

I am also using IntelliJ. I am running the model on my work computer, which runs on Windows.

I want to be exactly clear about what I am doing just to be certain that we are doing the same thing.

  1. Right click on the Model.java file and run the main method.
  2. After roughly 500 time steps I stop the model. I then click the green play button to re-run the simulation and, after roughly 500 time steps, stop the model again.
  3. I then compare the Output-run1.csv files from the relevant directories in the Results folder.

Is this what you are doing? Please try to be very specific in your description.

adrian-carro avatar adrian-carro commented on June 15, 2024

Ouch, ok, I can confirm the problem does not appear when I follow your instructions. I have only seen the problem when:

  1. Running the code from the command line with mvn exec:java -Dexec.mainClass="housing.Model".
  2. Running the code with a Maven configuration within IntelliJ, with the command line argument clean validate compile exec:java.

For the rest (your points 2 and 3), I proceed exactly as you do.

davidrpugh avatar davidrpugh commented on June 15, 2024

@adrian-carro This is useful information! It suggests that the issue arises at compilation time, rather than runtime. Running multiple simulations using the steps that I provided compiles the model once. Doing it your way forces compilation of the code twice.

Do you know if the configuration files are being picked up when the model is compiled or after the model is compiled?

adrian-carro avatar adrian-carro commented on June 15, 2024

@davidrpugh The config file should be picked up only at execution, if I am not wrong. However, as I understand it, my option number 1, running the code from the command line with mvn exec:java -Dexec.mainClass="housing.Model", is not compiling the code, but just executing it. At least I had assumed so; please correct me if I am wrong! ;-)

adrian-carro avatar adrian-carro commented on June 15, 2024

@davidrpugh I finally managed to find the origin of this problem: there was a small bug in the PriorityQueue2D which was leading to some transactions happening or not happening, as a consequence of which the number of random numbers used would change, thus leading to the observed inconsistencies. I now obtain consistent random numbers between runs. Also, I have checked and I now obtain the same stream of random numbers whether directly running the class or using the Maven execution goal. I still don't understand why these two ways of running the code were leading to different results before, but I guess they might simply have different ways of handling errors!

davidrpugh avatar davidrpugh commented on June 15, 2024

@adrian-carro Great! Glad that you found the issue. Sounds super tedious. I don't actually think that the 2D priority queue is necessary at all to implement the matching algorithm. We might want to think about removing it and replacing it with something simpler.
