n-wouda / euro-neurips-2022 Goto Github PK
View Code? Open in Web Editor NEWOptiML's contribution to the EURO meets NeurIPS 2022 vehicle routing competition.
License: Other
OptiML's contribution to the EURO meets NeurIPS 2022 vehicle routing competition.
License: Other
Originally posted by N-Wouda July 20, 2022
We currently use a time-based stopping criterion. That makes a lot of sense, since the competition is really aimed at solving something in a fixed time budget. But for testing, time-based stopping criteria are horrible: sometimes you get a few more iterations in, sometimes a few less, and that makes it hard to compare different implementations.
I propose we generalise the stopping criteria to arbitrary objects (like we did in ALNS, @leonlan). Then we can use the MaxRuntime
one for the final implementation, and MaxIterations
for reproducible testing.
Since we now are building the static solver in 3-4 different places, I would propose making a single function that handles this.
Something like:
def solve_static(
instance,
config,
*,
seed=1,
max_runtime=None,
max_iterations=None,
initial_solutions=()
):
config = hgspy.Config(seed=seed, **config["params"])
params = hgspy.Params(config, **tools.inst_to_vars(instance))
rng = hgspy.XorShift128(seed=seed)
pop = hgspy.Population(params, rng)
for sol in initial_solutions:
pop.add_individual(hgspy.Individual(params, sol))
ls = hgspy.LocalSearch(params, rng)
for op in config["node_ops"]:
ls.add_node_operator(node_ops[op](params))
for op in config["route_ops"]:
ls.add_route_operator(route_ops[op](params))
algo = hgspy.GeneticAlgorithm(params, rng, pop, ls)
for op in config["crossover_ops"]:
algo.add_crossover_operator(crossover_ops[op])
if max_runtime is not None:
stop = hgspy.stop.MaxRuntime(max_runtime)
else:
stop = hgspy.stop.MaxIterations(max_iterations)
return algo.run(stop)
Solving the dynamic problem requires some sort of classification system to decide which customers to dispatch, and which to have wait until the next epoch.
The following Params constructor does not use the config to determine the number of vehicles.
Euro-NeurIPS-2022/hgs_vrptw/src/Params.cpp
Lines 370 to 386 in 6d0005e
As a result, passing in nbVehicles
doesn't do anything and the bin packing heuristic always used to determine the number of vehicles. The fix is to replace the lines above with the ones in the other Params constructor that reads from an instance file:
Euro-NeurIPS-2022/hgs_vrptw/src/Params.cpp
Lines 323 to 334 in 6d0005e
Furthermore, benchmark.py
and solver.py
etc. should also not take nbVeh=-1
as configuration option, but use the default nbVeh=INT_MAX
instead.
We need a way to select which operators to apply, in a manner that is hopefully better than what we do now. I have collected some (20GB uncompressed; <400MB compressed) data to facilitate this. See below for data explanation.
Data gathered on commit 6e8e3de. The file is available here, as raw operator selection data.zip
.
We can now submit our solver to the Codalab environment thing. See also Slack and the quickstart.
This requires a bit of setup on our end.
SMDs are basically pre-computed delta costs for particular moves. Since a large part of the solution does not frequently change, such precomputed deltas remain valid for quite a while. They can be used to shortcut repeat move evaluations, speeding up the local search procedure.
See e.g. Beek et al. 2018 for a description of an "efficient implementation". The idea was introduced by Zachariadis and Kiranoudis 2010.
The initial population is currently initialized with random solutions. This doesn't matter too much for long-run performance, but having no good/feasible initial solution may cause issues later, especially when we are solving for small time limits (e.g., solving multiple instances within a single epoch). Some observations:
Some ideas:
There are a few TODOs here and there in the current implementation. Most are low-value things that we should at some point look at, others are odd implementation details we've inherited from the baseline. We should get a list of these and discuss their resolution at some point (maybe soon?).
Similar to #32 for the crossover operators: we should decouple the operators from the LocalSearch
(LS) class to allow testing their added value. This is difficult, because the LS class is a bit of a beast. I attempted to clean this up before in #24, but that eventually did not work out. Another attempt might be more successful, however.
First, we need to determine a good signature for the local search operators. What must they minimally know about the current neighbourhood? What do they use currently? Can we pass that data in as arguments, and somehow get a (possibly improved) solution out?
Voigt et al. (2021) propose a new crossover operator. In short, a single individual from the population is used together with the best solution found so far. For every client
Originally posted by @leonlan in https://github.com/N-Wouda/Euro-NeurIPS-2022/discussions/54#discussioncomment-3401551
Can we implement this ourselves?
Right now, we always select the best (min cost) offspring generated by the crossover operators. That's not terrible, but does mean we lose out on diversity. We should probably sometimes skip the best, and instead move further down the list.
This issue mostly exists as a wishlist, it is not necessarily a do this now and then close type of thing.
I will kick off:
We need more data on LS performance. E.g. the number of iterations/moves in the LS procedure. Perhaps split by operator? And also track operator effectiveness compared to its runtime cost?
There is a nbGranular
parameter that controls the size of the explored node neighbourhoods. This value is set as follows (for now):
// Granular search parameter, limits the number of moves in the RI local
// search
size_t nbGranular = 40;
I am slightly bothered by the fact that this value restricts the search to a somewhat arbitrary number of clients: on instances where
nbGranular
to be a percentage of the number of clients, rather than an absolute value.Or maybe both?
Let's discuss here how we want to structure our code base. I will propose something, you shoot :)
As main layout:
Euro-NeurIPS-2022
|--- .github/workflows
|--- data
|--- dynamic
|--- hgs_vrptw
|--- instances
|--- notebooks
|--- static
-- LICENSE.md
-- README.md
-- analysis.py
-- benchmark.py
-- benchmark_dynamic.py
-- controller.py
-- create_submission.sh
-- environment.py
-- plotting.py
-- poetry.lock
-- pyproject.toml
-- setup.cfg
-- solver.py
-- tools.py
It would actually be my preference to move any file that cannot be directly called as a script out of the main folder (e.g. plotting.py, tools.py, environment.py)
However, not sure if/how we could do that, without messing with code we are not allowed to mess with.
We then add a directory that handles the static solves:
static
-- init.py
-- solver.py
Where solver.py is the only place in all of the code that imports hgspy and contains a method "solve_static" that builds a (customizable) solver.
The init exposes this method and exposes all operator (names)s or other things that might be needed as input for this method.
And one directory for the dynamic solves
dynamic
|--- strategies
-- init.py
-- solver.py
-- utils.py
Solver.py contains three functions:
The init exposes the "solve_dynamic" method as well as all strategies to choose from
With a subdirectory for all strategies. (Which I think is a lot cleaner than just putting them in dynamic directly)
strategies
|--- rollout
-- init.py
-- baselines.py
Where baselines.py contains methods for the greedy, random and lazy baselines (+ one generic private method that is used by all three of them).
init exposes all strategies to the "dynamic" level.
rollout
-- init.py
-- algorithms.py
-- config.??
-- simulate.py
One file with a method for each different rollout algorithm.
One file with a method for simulating an instance and one that wraps around the static solver.
I would then propose having a config file in static, dynamic and dynamic/strategies/rollout where we put all the solver configurations we use, that are overwritten in a hierarchical fasion. But let's discuss how we want to handle configuration afterwards / seperately as soon as we have the codebase structure clear.
We have quite a few parameters in the static solver and the rollout algorithm. Right now these are managed fairly ad-hoc:
We should streamline this a bit more. I'm thinking of a general Config
class on the Python side, that stores the relevant static operators, and the static and dynamic parameters. It should support being constructed from code (for tuning), or by reading from a file (e.g., with a classmethod from_file
).
The idea is to implement Slack Induction by String Removals by Christiaens and Vanden Berghe (2020) for crossover/local search.
The old fitness and diversity ranks were divided by popSize - 1
:
Euro-NeurIPS-2022/hgs_vrptw/src/Population.cpp
Lines 84 to 85 in 29a903c
The reason for this division is to normalize the biased fitness onto [0, 2]
instead of [0, 2 * (popSize-1)]
. This is important for binary tournaments, because we can pick individuals from both subpopulations. If one subpopulation has different size than the other, the normalization helps to compare fairly between the individuals.
We no longer do this in the current version:
Euro-NeurIPS-2022/hgs_vrptw/src/Population.cpp
Lines 74 to 81 in 9571b29
The result is that binary tournament is biased towards individuals from subpopulations that are smaller in size.
The fix is to change fitness to type double
and divide the last line by popSize
.
Relevant papers (added by Leon):
I think the simplest implementation idea is: 1) creating a pool of feasible solutions that is stored throughout and 2) write a MILP that takes this pool, solves the set partitioning problem, and returns the corresponding individual.
The advanced set partitioning heuristic is interesting but too complicated I think. bBut maybe there are some interesting bits in there that could be useful!
Trivial improvements can be made to the instances before any routes are made. Example: direct edges are not always the shortest path between two nodes. Removal or penalization of such edges may help local search operators.
First evaluate whether reducing available edges in such a way even results in significantly better results. If so, determine if efficient implementation of this does not take up too much computational time. Should be highly parallel, consider CUDA programming due to availability of GPU.
Suggestions:
Not sure if it adds a lot, but it should be easy to implement. Saw 1 team in DIMACS use this diversity measure as well.
Prins (p. 36) also outlines three other measures for the giant tour chromosome: 1) Hamming, 2) Broken Pairs, and 3) Levenshtein distance. I think the Hamming distance might be interesting to try out. It's easy to compute, cheap (same time complexity as broken pairs, and for VRPTW does not have the drawback of "circular shifts" as stated in the slides.
Originally posted by @leonlan in #22 (comment)
When you put in a non-boolean mask into these functions, you will get unexpected results instead of an error.
E.g. mask=np.array([1,1,1,0,0])
will return an instance with 3 times request one and 2 times the depot instead of only the depot + requests 2&3 as intended.
Do we trust ourselves or should we explicitly check for this?
Euro-NeurIPS-2022/dynamic/utils.py
Lines 13 to 38 in fe82d92
Euro-NeurIPS-2022/hgs_vrptw/src/GeneticAlgorithm.cpp
Lines 104 to 127 in f85d15d
It seems that there is a possibility for an individual to be added twice. The passed-in individual is added to the population after the first local search. If it was infeasible and we decide to repair, then another local search is applied again (with higher tw/load penalties). If the individual becomes feasible, then it is added again to the population.
Is my understanding here correct? If so, is this desired behavior that add both the infeasible and feasible individual?
A more sensible conditional is as follows:
local_search(indiv)
if not feasible(indiv) & rand.int <= repair_probability:
increase penalties
local_search(indiv)
add indiv to population
else:
add indiv to population
Here, if we repair an individual to feasibility, then we only add the feasible individual.
In the TSP example for ALNS, I added a simple postprocessing step: a single pass over all nodes, optimally choosing (by enumeration) a subpath of length k
starting at each of these nodes. This remedies a lot of fairly small stuff that's a little inefficient.
We can probably do something like this after finishing the local search procedure. Maybe set k = 6
or something and go through all the routes once to fix these subpaths.
Suggestion: In that case, introduce a threshold parameter on this fraction, that chooses between greedy+ or rollout. Instead of just doing it for epoch=0. (because it can occur in other epochs as well)
Originally posted by @jaspervd96 in #90 (comment)
See original discussion here.
Github Discussions doesn't interact nicely with Github Projects. E.g., discussion threads cannot be linked with todos, but issues can.
We have the jumpDistance
in the Route
, and will have something like a "cutoff on the probability that this is an improving move" threshold parameter for #65.
Those values don't really change, but they should be in a place where we can find and/or modify them easily. Maybe we should make them constexpr and static on the Config
object.
Nagata et al. (2010) presented the time warping concept together with the EAX crossover for VRPTW, improving on many best known solutions at that time. We could try to implement the EAX crossover.
My gut-feeling is that this has already been tried before by the ORTEC team (since they got time-warping from this paper), but it might have performed worse than SREX (which is also by Nagata, but for pick-up and delivery VRPTW). No idea though, unless we try it out ourselves :-).
Ortec has already tried this, but it was buggy/not working well.
Newly improved OX crossover operator by Hvattum (2022). OX currently does not really perform well, but maybe this adjusted version might improve it.
Hvattum2022_Adjusting_the_order_crossover_operator_for_capacitated_vehicle_routing_problems.pdf
In the dynamic problem, the minimum number of required vehicles can vary a lot depending on the instance that we are trying to solve. This became apparent in #79 when benchmarking the dynamic problem, where adding +30 to the current bin packing heuristic is insufficient. We need to determine a good value to make sure that the dynamic problem can be solved. This may also be relevant for the static problem.
https://hal.archives-ouvertes.fr/hal-00992081/document
This paper discusses lower bounds on the number of vehicles for VRPTW
Originally posted by @leonlan in #79 (comment)
Improving the restarting mechanism could help to squeeze out more of the algorithm. Some ideas:
Links to benchmark results can be found here.
One important observation is that the algorithm can get dead-stuck in sub-optimal solutions. For example, the figure below shows the objective plots for 9571b29. The first restart was useful, but the ones after are just completely wasted.
[Based on what we dicussed today.]
Currently we do two binary tournaments to select the two parents that will be used for crossover. If the two parents are identical, we attempt to find another one that is not.
@leonlan had an idea to select only the first parent by binary tournament. Then, the second parent can be selected based on proximity (broken pairs distance). This ensures we find parents that are somewhat similar, so the offspring can benefit from their 'good genes' (and, hopefully, not need so much local search after crossover). The downside is a potential lack of diversity, so perhaps we should also make sure that's not impacted too much.
Added by Leon:
rollout
to simulate
.rollout_tlim_factor
to simulate_tlim_factor
.Running the benchmark script twice with a fixed number of iterations, I do not get the same average cost. So something's ever so slightly wrong. Might be in the benchmark script, might be in the static solver.
When running experiments with the same setting, I notice that there is high variance in solution qualities among different runs. See e.g. plots #75, where restarting can sometimes be very good, and sometimes be very bad.
I think it's worthwhile to let the algorithm run many times (without restarting) and analyze the solutions.
TODO:
Following #111, I tried to analyze for which instances we have a large gap w.r.t. the best known solution (BKS). Gap is defined as (cost - best_cost) / best_cost * 100
. What I noticed is that for the solutions with large gap, this is often due to having 1 more route than in the best known.
Here is some data using the main branch from 30 September. The route difference is how many more routes the found solution uses than the BKS. A consistent pattern among solutions with large gap is 1 more route is used than what is used in the best known solution.
[('ORTEC-VRPTW-ASYM-844ea26c-d1-n231-k15', 'Gap: 2.23', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-5d8fd227-d1-n211-k15', 'Gap: 2.15', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-62980c46-d1-n254-k20', 'Gap: 1.40', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-b3059fbe-d1-n325-k20', 'Gap: 1.37', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-8bc1fa17-d1-n302-k23', 'Gap: 1.35', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-37572525-d1-n240-k16', 'Gap: 1.30', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-a1081c7b-d1-n325-k20', 'Gap: 1.27', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-d0d70aec-d1-n290-k18', 'Gap: 1.23', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-8bc13a3f-d1-n421-k40', 'Gap: 1.22', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-dd43a785-d1-n880-k50', 'Gap: 1.17', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-e814fc2f-d1-n247-k25', 'Gap: 1.10', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-d137e14b-d1-n257-k17', 'Gap: 1.09', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-7ef1213a-d1-n292-k23', 'Gap: 1.07', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-ce78488d-d1-n332-k30', 'Gap: 0.98', 'Route diff: 0'),
('ORTEC-VRPTW-ASYM-4fe9435f-d1-n418-k28', 'Gap: 0.95', 'Route diff: 1'),
('ORTEC-VRPTW-ASYM-a59d64ac-d1-n328-k20', 'Gap: 0.89', 'Route diff: 0'),
('ORTEC-VRPTW-ASYM-6aedfe61-d1-n308-k35', 'Gap: 0.87', 'Route diff: 1'),
It may be helpful to look into route minimization procedures. One possible direction is to use the one proposed in Christiaens and Vanden Berghe (2020).
Benchmark results show that keeping the best solution is not good for final phase performance. By setting nbKeepBest=0
, we gain 100 more points on the final phase when compared to the quali phase. There is no gain at all if we set nbKeepBest=1
.
I have an idea to mutate best
before putting it into the new population. This will be done after finishing #96. I don't know if it will work, but we'll see!
It's common for genetic algorithms to add a mutation operator after crossover. I think our static solver can benefit from this as well. Especially when our population gets stuck on a local optima, for some reason our current diversity factors such as SREX and the population management don't give enough punch to get out of it.
A mutation operator such as SISR could be a possible solution. Specifically, we remove a number of adjacent substrings and then repairing the solutions again. This was already done as crossover in #42, but it didn't work out at the time. So it's good to revisit this again but then as mutation operator.
From the meeting notes:
We need a way to track many more statistics during iteration, both for evaluating performance, and to train machine learning models on. Statistics collection hurts performance, so it should only be done in debug builds. Things to collect (at least):
A large number of LS operator moves are really bad. Like, stupid bad, and a human being never would've even computed them. But the LS procedure doesn't know that until after evaluating, so it wastes a ton of time on stuff that wasn't even close to a good move. This hurts particularly in operators that explore large neighbourhoods, like the route operators RELOCATE*
and SWAP*
, but those partially build on the node
I think we can quickly fit a very simple cost model to some of the moves explored by our LS operators. That cost model doesn't have to figure out what the actual cost will be (that's what the operator will do), but rather serves as a really fast screening check for really bad moves that will never result in an improvement.
Somewhat related: cost function approximation for VRPs, e.g. this.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.