Comments (21)
Maybe run a single instance to check that it's not some global variable getting messed up.
from euro-neurips-2022.
If I run just one instance, I do consistently get the same result. E.g.
poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
gives me an average cost of 253753. @leonlan can you check that you get the same result?
from euro-neurips-2022.
So there might be something wrong with how the benchmark script operates. But at least it's not in the solver itself :-).
from euro-neurips-2022.
Using the rollout branch:
cmake -Brelease -DCMAKE_BUILD_TYPE=Release -Shgs_vrptw && cd release && make && cd ..
poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
Instance                              OK Objective Iters. (#) Time (s)
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     261113         10    0.158
Avg. objective: 261113 (w/o infeas: 261113)
Avg. iterations: 10
Avg. run-time (s): 0.16
Total not OK: 0
from euro-neurips-2022.
I usually get the same result when running again. However, after rebuilding, I once got this:
Instance                              OK Objective Iters. (#) Time (s)
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     260802         10    0.127
Avg. objective: 260802 (w/o infeas: 260802)
Avg. iterations: 10
Avg. run-time (s): 0.13
Total not OK: 0
from euro-neurips-2022.
Not sure what's happening here. (But seems to be another issue.)
$ git.exe checkout codalab-submission
Switched to branch 'codalab-submission'
Your branch is up to date with 'origin/codalab-submission'.
$ cmake -Brelease -DCMAKE_BUILD_TYPE=Release -Shgs_vrptw
-- pybind11 v2.9.2
-- Configuring done
-- Generating done
-- Build files have been written to: /Euro-NeurIPS-2022/release
$ cd release
$ make
Scanning dependencies of target hgs
[ 4%] Building CXX object src/CMakeFiles/hgs.dir/GeneticAlgorithm.cpp.o
In file included from /Euro-NeurIPS-2022/hgs_vrptw/include/Params.h:6,
from /Euro-NeurIPS-2022/hgs_vrptw/include/Individual.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/include/GeneticAlgorithm.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp:1:
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:15: error: ‘std::integral’ has not been declared
60 | template <std::integral T> result_type randint(T high)
| ^~~
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:52: error: ‘T’ has not been declared
60 | template <std::integral T> result_type randint(T high)
| ^
/Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp: In member function ‘Individual GeneticAlgorithm::crossover() const’:
/Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp:83:28: error: no matching function for call to ‘XorShift128::randint(int)’
83 | if (rng.randint(100) < params.config.selectProbability)
| ^
In file included from /Euro-NeurIPS-2022/hgs_vrptw/include/Params.h:6,
from /hgs_vrptw/include/Individual.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/include/GeneticAlgorithm.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp:1:
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:44: note: candidate: ‘template<<declaration error> > XorShift128::result_type XorShift128::randint(int)’
60 | template <std::integral T> result_type randint(T high)
| ^~~~~~~
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:44: note: template argument deduction/substitution failed:
/Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp: In member function ‘void GeneticAlgorithm::educate(Individual&)’:
/Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp:101:27: error: no matching function for call to ‘XorShift128::randint(int)’
101 | && rng.randint(100) < params.config.repairProbability)
| ^
In file included from /Euro-NeurIPS-2022/hgs_vrptw/include/Params.h:6,
from /Euro-NeurIPS-2022/hgs_vrptw/include/Individual.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/include/GeneticAlgorithm.h:4,
from /Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp:1:
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:44: note: candidate: ‘template<<declaration error> > XorShift128::result_type XorShift128::randint(int)’
60 | template <std::integral T> result_type randint(T high)
| ^~~~~~~
/Euro-NeurIPS-2022/hgs_vrptw/include/XorShift128.h:60:44: note: template argument deduction/substitution failed:
make[2]: *** [src/CMakeFiles/hgs.dir/build.make:63: src/CMakeFiles/hgs.dir/GeneticAlgorithm.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:191: src/CMakeFiles/hgs.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
$ cd ..
$ git.exe checkout rollout
Switched to branch 'rollout'
Your branch is up to date with 'origin/rollout'.
$ cmake -Brelease -DCMAKE_BUILD_TYPE=Release -Shgs_vrptw
-- pybind11 v2.9.2
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/c/Users/jdn317/Documents/Jasper van Doorn/Euro-NeurIPS-2022/release
$ cd release
$ make
Scanning dependencies of target hgs
[ 4%] Building CXX object src/CMakeFiles/hgs.dir/GeneticAlgorithm.cpp.o
[ 8%] Building CXX object src/CMakeFiles/hgs.dir/crossover/crossover.cpp.o
[ 13%] Building CXX object src/CMakeFiles/hgs.dir/crossover/brokenPairsExchange.cpp.o
[ 17%] Building CXX object src/CMakeFiles/hgs.dir/crossover/selectiveRouteExchange.cpp.o
[ 21%] Building CXX object src/CMakeFiles/hgs.dir/Individual.cpp.o
[ 26%] Building CXX object src/CMakeFiles/hgs.dir/LocalSearch.cpp.o
[ 30%] Building CXX object src/CMakeFiles/hgs.dir/operators/Exchange.cpp.o
[ 34%] Building CXX object src/CMakeFiles/hgs.dir/operators/MoveTwoClientsReversed.cpp.o
[ 39%] Building CXX object src/CMakeFiles/hgs.dir/operators/RelocateStar.cpp.o
[ 43%] Building CXX object src/CMakeFiles/hgs.dir/operators/SwapStar.cpp.o
[ 47%] Building CXX object src/CMakeFiles/hgs.dir/operators/TwoOpt.cpp.o
[ 52%] Building CXX object src/CMakeFiles/hgs.dir/Node.cpp.o
[ 56%] Building CXX object src/CMakeFiles/hgs.dir/Params.cpp.o
[ 60%] Building CXX object src/CMakeFiles/hgs.dir/Population.cpp.o
[ 65%] Building CXX object src/CMakeFiles/hgs.dir/Result.cpp.o
[ 69%] Building CXX object src/CMakeFiles/hgs.dir/Route.cpp.o
[ 73%] Building CXX object src/CMakeFiles/hgs.dir/Statistics.cpp.o
[ 78%] Building CXX object src/CMakeFiles/hgs.dir/TimeWindowSegment.cpp.o
[ 82%] Linking CXX static library ../lib/libhgs.a
[ 82%] Built target hgs
Scanning dependencies of target hgspy
[ 86%] Building CXX object src/CMakeFiles/hgspy.dir/bindings.cpp.o
[ 91%] Linking CXX shared module ../lib/hgspy.cpython-38-x86_64-linux-gnu.so
[ 91%] Built target hgspy
Scanning dependencies of target genvrp
[ 95%] Building CXX object src/CMakeFiles/genvrp.dir/main.cpp.o
[100%] Linking CXX executable ../bin/genvrp
[100%] Built target genvrp
$ cd ..
$ poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
100%|██| 1/1 [00:00<00:00, 4.38instance/s]
Instance                              OK Objective Iters. (#) Time (s)
------------------------------------- -- --------- ---------- --------
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     261409         10    0.154
Avg. objective: 261409 (w/o infeas: 261409)
Avg. iterations: 10
Avg. run-time (s): 0.15
Total not OK: 0
$ poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
100%|██| 1/1 [00:00<00:00, 4.18instance/s]
Instance                              OK Objective Iters. (#) Time (s)
------------------------------------- -- --------- ---------- --------
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     261113         10    0.162
Avg. objective: 261113 (w/o infeas: 261113)
Avg. iterations: 10
Avg. run-time (s): 0.16
Total not OK: 0
$ poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
100%|██| 1/1 [00:00<00:00, 4.26instance/s]
Instance                              OK Objective Iters. (#) Time (s)
------------------------------------- -- --------- ---------- --------
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     261113         10    0.158
Avg. objective: 261113 (w/o infeas: 261113)
Avg. iterations: 10
Avg. run-time (s): 0.16
Total not OK: 0
$ poetry run python benchmark.py --max_iterations 10 --num_procs 1 --instance_pattern instances/ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40.txt
100%|██| 1/1 [00:00<00:00, 4.25instance/s]
Instance                              OK Objective Iters. (#) Time (s)
------------------------------------- -- --------- ---------- --------
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     261113         10    0.157
Avg. objective: 261113 (w/o infeas: 261113)
Avg. iterations: 10
Avg. run-time (s): 0.16
Total not OK: 0
$
from euro-neurips-2022.
So there might be something wrong with how the benchmark script operates. But at least it's not in the solver itself :-).
Apart from the fact that I am not sure whether the outcomes are deterministic across different computers.
For me, running
poetry run python benchmark.py --max_iterations 10 --num_procs 1
is deterministic, but running
poetry run python benchmark.py --max_iterations 10 --num_procs 2
is not.
from euro-neurips-2022.
@jaspervd96 sorry, I should have specified the branch. Using the latest commit on rollout (SHA: e8c81e1), I consistently get
Instance                              OK Objective Iters. (#) Time (s)
------------------------------------- -- --------- ---------- --------
ORTEC-VRPTW-ASYM-7f99f05a-d1-n528-k40 Y     255387         10    0.215
Avg. objective: 255387 (w/o infeas: 255387)
Avg. iterations: 10
Avg. run-time (s): 0.21
Total not OK: 0
This does not appear to be the same thing you get. I am unsure why: as far as I can tell, there's no machine-specific stuff that goes into how the rng works. But at least it's consistent.
But running
poetry run python benchmark.py --max_iterations 10 --num_procs 2
is not.
There appears to be something going on with the parallel implementation. But I don't know enough about tqdm and pybind to immediately understand what's going on.
from euro-neurips-2022.
I don't have the full answer, but I discovered some interesting things.
We now use concurrent.futures.ProcessPoolExecutor.map with a tqdm wrapper to show a progress bar. This leads to inconsistent results when more than one worker is used.
This function works similarly to multiprocessing.Pool.imap: it issues one* task at a time to the process pool.
(*by default; this number can be changed by setting a "chunksize")
This is in contrast to multiprocessing.Pool.map, which issues all tasks to the process pool at once.
That behaviour can be mimicked by specifying chunksize=len(func_args) in both functions mentioned above.
Conclusion: I don't know why, but with the latter function / argument the results are consistent again, whereas with a lower chunksize they are not.
Note that this leads to a progress bar that is only updated after every chunksize tasks, which in our case means only after everything has finished.
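As a rough illustration of the chunksize point above, the sketch below mimics how a pool groups tasks into chunks before handing them to workers. make_chunks is a hypothetical helper written for this example, not a function from concurrent.futures or multiprocessing, and the instance names are placeholders. It assumes each chunk is processed as one unit by a single worker, which would also fit a progress bar that only advances once per chunk.

```python
from itertools import islice


def make_chunks(tasks, chunksize=1):
    """Group tasks into tuples of at most `chunksize` items (illustrative helper)."""
    if chunksize < 1:
        raise ValueError("chunksize must be >= 1")
    it = iter(tasks)
    chunks = []
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


# Placeholder task list standing in for the benchmark's func_args.
func_args = ["inst-a", "inst-b", "inst-c", "inst-d", "inst-e"]

one_by_one = make_chunks(func_args, chunksize=1)
all_at_once = make_chunks(func_args, chunksize=len(func_args))

print(len(one_by_one))   # one chunk per task
print(len(all_at_once))  # a single chunk holding every task
```

Under that assumption, chunksize=len(func_args) leaves only one unit of work, so one worker would handle all instances in submission order, much like a --num_procs 1 run. Whether that is the actual reason the results become consistent is not confirmed in this thread.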
from euro-neurips-2022.
Somehow, the issue seems to occur when tasks are submitted to the pool while another task is already running.
I still haven't found the exact details of what's happening here.
But since we will not use n_procs > 1 for the submissions, I think this issue might not have a big impact.
@N-Wouda could you confirm?
from euro-neurips-2022.
But since we will not use n_procs > 1 for the submissions, I think this issue might not have a big impact.
Submissions do not use our benchmark script at all - it's just for ourselves. Submissions run everything separately, using the controller and solver scripts directly.
So we're good on the submission side. I'd just rather also have had deterministic benchmark runs, and do not really understand why parallel runs in different processes influence each other since I do not think there's any shared state.
from euro-neurips-2022.
@N-Wouda Okay, not sure why/how, but it seems that either the entire solver, or at least the XorShift, is shared between different solves.
Running this function from benchmark.py twice gives different results:
solve("ORTEC-VRPTW-ASYM-9016f313-d1-n200-k20",
seed=1,
num_procs=1,
instance_pattern="instances/ORTEC-VRPTW-ASYM-*.txt",
max_runtime=None,
max_iterations=10,
phase=None)
Out[1]: ('ORTEC-VRPTW-ASYM-9016f313-d1-n200-k20', 'Y', 191208, 10, 0.066)
Out[2]: ('ORTEC-VRPTW-ASYM-9016f313-d1-n200-k20', 'N', 1393065, 10, 0.062)
Out[3]: ('ORTEC-VRPTW-ASYM-9016f313-d1-n200-k20', 'Y', 191208, 10, 0.066)
Out[4]: ('ORTEC-VRPTW-ASYM-9016f313-d1-n200-k20', 'Y', 190870, 10, 0.067)
from euro-neurips-2022.
@N-Wouda do you have any clues about what might be happening here, or know how we should look into this further? Because I don't.
Or do we ignore it, since it won't affect the real submission?
from euro-neurips-2022.
I'm inclined to ignore it for now, since it does not seem to affect the submission. It's probably some sort of pybind interface stuff that I'm too noob to understand right now, but I do not think our time is best served figuring it out in detail (beyond that we already checked it's not actually in the solver!).
from euro-neurips-2022.
With that I think we can close this issue, if you agree @jaspervd96.
from euro-neurips-2022.
(beyond that we already checked it's not actually in the solver!)
Well, it is in the hgspy module (=solver), which does not seem to start from scratch correctly and somehow shares things between different solves (see previous comments).
So the only reason it is deterministic is that the end state of the solver for instance 1, which is used to start solving instance 2, is consistent across runs. That holds until we use parallelization: then the same solver is used by two instances at the same time, leading to inconsistent results.
This also means that running the solver twice on the exact same instance within the same script gives inconsistent results. The same goes for running the solver on the same set of instances in a different order.
In the dynamic setting and rollout algorithm, where we do use the solver multiple times within a single solve (so also in the submission), this means we are basically using solvers that are not seeded at 1 (but are consistent).
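The carry-over effect described above can be reproduced with a toy example. This is only a sketch: solve_shared and solve_fresh are hypothetical stand-ins built on Python's random.Random, not hgspy or XorShift128, and they assume the only shared state is a random number generator that outlives a single solve.

```python
import random

# Module-level generator: a stand-in for solver state that survives between solves.
_shared_rng = random.Random(1)


def solve_shared(n_draws=3):
    """Draw from the shared generator, so state carries over between calls."""
    return [_shared_rng.random() for _ in range(n_draws)]


def solve_fresh(seed=1, n_draws=3):
    """Build a new generator per call, so every call starts from the same state."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_draws)]


first = solve_shared()
second = solve_shared()

print(first == second)                           # False: the second call continues where the first stopped
print(first + second == solve_fresh(n_draws=6))  # True: together they form one seed-1 stream
print(solve_fresh() == solve_fresh())            # True: a fresh generator repeats exactly
```

In this toy setting the second solve is reproducible across script runs, yet it does not start from the seed-1 state, which matches the "consistent but not seeded at 1" behaviour described above.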
I'm inclined to ignore it for now, since it does not seem to affect the submission.
So as long as we are sure that the only part that is shared is the XorShift, I am fine with ignoring it.
However, I am not yet entirely convinced we can assume that (only) the XorShift is shared and not other objects of the solver, because I did not find any reason why this class should behave differently than e.g. Population.
Plus, I tested using multiple instances of XorShift in Python and generating random numbers, which all seemed to work as expected and independently. So I am not even sure the XorShift has anything to do with this issue.
from euro-neurips-2022.
I'll read this later!
from euro-neurips-2022.
Well, it is in the hgspy module (=solver), that does not seem to correctly start from scratch and somehow shares things between different solves.
The hgspy module is solver + pybind. Just the solver (genvrp) is deterministic when given a fixed number of iterations (I checked this with four instances and 100 solves each: all the same). So on the C++ side we're good. There's something in pybind and/or its interaction with multiple processes that seems to be the issue, but I am not bothered enough to investigate this further right now.
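A repeat-run check like the one described above could be scripted roughly as follows. distinct_outcomes and stub_solve are hypothetical names; stub_solve is a deterministic placeholder and would need to be replaced by an actual call to the solver, whose invocation is not shown here.

```python
import random
from collections import Counter


def distinct_outcomes(solve, n_runs=100):
    """Call `solve` n_runs times and count each distinct (hashable) result."""
    return Counter(solve() for _ in range(n_runs))


def stub_solve(seed=1):
    """Placeholder for a real solve: deterministic because it reseeds on every call."""
    rng = random.Random(seed)
    return round(sum(rng.random() for _ in range(10)), 6)


outcomes = distinct_outcomes(stub_solve)

# A deterministic solver yields exactly one distinct outcome across all runs.
print(len(outcomes) == 1)
```

More than one key in the returned counter would point to non-determinism, and the counts show how often each outcome occurred.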
from euro-neurips-2022.
I came across the following observation:
Running the solver without the broken_pairs_exchange crossover operator makes it deterministic.
Not sure what to do with this observation, but maybe it spontaneously rings a bell for @N-Wouda or @leonlan.
If not, we keep ignoring this :)
from euro-neurips-2022.
Can't think of a reason right now why BPX would behave non-deterministically. It's fairly likely we'll drop BPX altogether (since it doesn't perform all that well), so good to know that might solve the problem!
from euro-neurips-2022.
I think we can close this for now. BPX is - AFAICT - not used anywhere anymore.
from euro-neurips-2022.