mradaideh / neorl Goto Github PK

NeuroEvolution Optimization with Reinforcement Learning

License: MIT License

Python 100.00%

reinforcement-learning optimization-algorithms evolutionary-algorithms large-scale neuroevolution

neorl's Introduction

NEORL

NEORL (NeuroEvolution Optimization with Reinforcement Learning) is a set of implementations of hybrid algorithms combining neural networks and evolutionary computation based on a wide range of machine learning and evolutionary intelligence architectures. NEORL aims to solve large-scale optimization problems relevant to operation & optimization research, engineering, business, and other disciplines.

NEORL can be used for multidisciplinary applications for research, industrial, academic, and/or teaching purposes. NEORL can be used as a standalone platform or an additional benchmarking tool to supplement or validate other optimization packages. Our objective when we built NEORL is to give the user a simple and easy-to-use framework with an access to a wide range of algorithms, covering both standalone and hybrid algorithms in evolutionary, swarm, supervised learning, deep learning, and reinforcement learning. We hope NEORL will allow beginners to enjoy more advanced optimization and algorithms, without being involved in too many theoretical/implementation details, and give experts an opportunity to solve large-scale optimization problems.

Documentation

Documentation is available online: https://neorl.readthedocs.io/en/latest/index.html

The framework paper is available online: https://arxiv.org/abs/2112.07057

Copyright

You can read the first successful and the baseline application of NEORL for nuclear fuel optimization in this News Article.

Basic Features

Features	NEORL
Reinforcement Learning (standalone)	✔️
Evolutionary Computation (standalone)	✔️
Hybrid Neuroevolution	✔️
Supervised Learning	✔️
Parallel processing	✔️
Combinatorial/Discrete Optimization	✔️
Continuous Optimization	✔️
Mixed Discrete/Continuous Optimization	✔️
Hyperparameter Tuning	✔️
Ipython / Notebook friendly	✔️
Detailed Documentation	✔️
Advanced logging	✔️
Optimization Benchmarks	✔️

Knowledge Prerequisites

Note: despite the simplicity of NEORL usage, most algorithms, especially the neuro-based, need some basic knowledge about the optimization research and neural networks in supervised and reinforcement learning. Using NEORL without sufficient knowledge may lead to undesirable results due to the poor selection of algorithm hyperparameters. You should not utilize this package without basic knowledge in machine learning and optimization.

Safe Installation (Strongly Recommended)

Safe installation will setup NEORL in a separate virtual environment with its own dependencies. This eliminates any conflict with your existing package versions (e.g. numpy, Tensorflow).

To install on Linux, here are the steps:

https://neorl.readthedocs.io/en/latest/guide/detinstall.html

For Windows, the steps can be found here:

https://neorl.readthedocs.io/en/latest/guide/detinstall.html#windows-10

Quick Installation

For both Ubuntu and Windows, you can install NEORL via pip

pip install neorl

However, we strongly recommend following safe installation steps to avoid any conflict between NEORL dependencies (e.g. TensorFlow) and your current Python packages.

Testing NEORL Installation

Upon successful installation, NEORL offers a robust unit test package to test all algorithms, you can run the tests via terminal using

neorl --test

All unit tests in NEORL can be executed using pytest runner. If pytest is not installed, please use

pip install pytest pytest-cov

before running the tests.

Example

Here is a quick example of how to use NEORL to minimize a 5-D sphere function:

#---------------------------------
# Import packages
#---------------------------------
import numpy as np
import matplotlib.pyplot as plt
from neorl import DE, XNES

#---------------------------------
# Fitness
#---------------------------------
#Define the fitness function
def FIT(individual):
    """Sphere test objective function.
            F(x) = sum_{i=1}^d xi^2
            d=1,2,3,...
            Range: [-100,100]
            Minima: 0
    """

    return sum(x**2 for x in individual)

#---------------------------------
# Parameter Space
#---------------------------------
#Setup the parameter space (d=5)
nx=5
BOUNDS={}
for i in range(1,nx+1):
    BOUNDS['x'+str(i)]=['float', -100, 100]

#---------------------------------
# DE
#---------------------------------
de=DE(mode='min', bounds=BOUNDS, fit=FIT, npop=50, CR=0.5, F=0.7, ncores=1, seed=1)
x_best, y_best, de_hist=de.evolute(ngen=120, verbose=0)
print('---DE Results---', )
print('x:', x_best)
print('y:', y_best)

#---------------------------------
# NES
#---------------------------------
x0=[-50]*len(BOUNDS)
amat = np.eye(nx)
xnes=XNES(mode='min', bounds=BOUNDS, fit=FIT, npop=50, eta_mu=0.9,
          eta_sigma=0.5, adapt_sampling=True, seed=1)
x_best, y_best, nes_hist=xnes.evolute(120, x0=x0, verbose=0)
print('---XNES Results---', )
print('x:', x_best)
print('y:', y_best)


#---------------------------------
# Plot
#---------------------------------
#Plot fitness for both methods
plt.figure()
plt.plot(np.array(de_hist), label='DE')
plt.plot(np.array(nes_hist['fitness']), label='NES')
plt.xlabel('Generation')
plt.ylabel('Fitness')
plt.legend()
plt.show()

Implemented Algorithms

NEORL offers a wide range of algorithms, where some algorithms could be used with a specific parameter space.

Algorithm	Discrete Space	Continuous Space	Mixed Space	Multiprocessing
ACER	✔️	❌	❌	✔️
ACKTR	✔️	✔️	✔️	✔️
A2C	✔️	✔️	✔️	✔️
PPO	✔️	✔️	✔️	✔️
DQN	✔️	❌	❌	❌
ES	✔️	✔️	✔️	✔️
PSO	✔️	✔️	✔️	✔️
DE	✔️	✔️	✔️	✔️
XNES	❌	✔️	❌	✔️
GWO	✔️	✔️	✔️	✔️
PESA	✔️	✔️	✔️	✔️
PESA2	✔️	✔️	✔️	✔️
RNEAT	❌	✔️	❌	✔️
FNEAT	❌	✔️	❌	✔️
SA	✔️	✔️	✔️	✔️
SSA	✔️	✔️	✔️	✔️
WOA	✔️	✔️	✔️	✔️
JAYA	✔️	✔️	✔️	✔️
MFO	✔️	✔️	✔️	✔️
HHO	✔️	✔️	✔️	✔️
BAT	✔️	✔️	✔️	✔️
PPO-ES	✔️	✔️	✔️	✔️
ACKTR-DE	✔️	✔️	✔️	✔️
ACO	❌	✔️	❌	✔️
NGA	❌	✔️	❌	❌
NHHO	✔️	✔️	✔️	✔️
CS	✔️	✔️	✔️	✔️
TS	✔️	✔️	❌	❌

Major Founding Papers of NEORL

1- Radaideh, M. I., Wolverton, I., Joseph, J., Tusar, J. J., Otgonbaatar, U., Roy, N., Forget, B., Shirvan, K. (2021). Physics-informed reinforcement learning optimization of nuclear assembly design. Nuclear Engineering and Design, 372, p. 110966.

2- Radaideh, M. I., Shirvan, K. (2021). Rule-based reinforcement learning methodology to inform evolutionary algorithms for constrained optimization of engineering applications. Knowledge-Based Systems, 217, p. 106836.

3- Radaideh, M. I., Forget, B., & Shirvan, K. (2021). Large-scale design optimisation of boiling water reactor bundles with neuroevolution. Annals of Nuclear Energy, 160, p. 108355.

Citing the Project

To cite this repository in publications:

@article{radaideh2021neorl,
  title={NEORL: NeuroEvolution Optimization with Reinforcement Learning},
  author={Radaideh, Majdi I and Du, Katelin and Seurin, Paul and Seyler, Devin and Gu, Xubo and Wang, Haijia and Shirvan, Koroush},
  journal={arXiv preprint arXiv:2112.07057},
  year={2021}
}

Maintainers

See our team here Contributors. We are welcoming new contributors to the project.

Important Note: We do not do technical support and we do not answer personal questions via email.

Acknowledgments

NEORL was established in MIT back to 2020 with feedback, validation, and usage of different colleagues: Issac Wolverton (MIT Quest for Intelligence), Joshua Joseph (MIT Quest for Intelligence), Benoit Forget (MIT Nuclear Science and Engineering), Ugi Otgonbaatar (Exelon Corporation), and James Tusar (Exelon Corporation). We also thank our fellows at Stable Baselines, DEAP, and EvoloPy for sharing their implementation, which inspired us to leverage our optimization classes.

neorl's People

Contributors

Stargazers

Watchers

Forkers

xubogu wangcj05 dkatelin jimmy-inl oneliev deanrp2 jbae11 evdcush wangjie1450 kang-96 pseur thudj me-daniel haozhougt 212321301 khurrumsaleem purvidas

neorl's Issues

`import neorl.benchmark` breaks matplotlib display for linux

The code snippet below does not produce a plot when using ssh on a linux computer:

import neorl.benchmarks
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 1, 100)
y = x**2

plt.plot(x, y)
plt.show()

These two code snippets do produce the plot:

import neorl
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 1, 100)
y = x**2

plt.plot(x, y)
plt.show()

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 1, 100)
y = x**2

plt.plot(x, y)
plt.show()
import neorl.benchmarks

Use of pathos for multiprocessing within python -- unknown error

joblib does not conserve the global variables during multi-processing. In my environment, I utilized some. As a workaround, I used pathos, which seemed to not have any issue. See for instance within ES:
core_list=[]
for key in pop:
core_list.append(pop[key][0])

        #with joblib.Parallel(n_jobs=self.ncores) as parallel:
        #    fitness=parallel(joblib.delayed(self.fit_worker)(item) for item in core_list)
        try:
            with joblib.Parallel(n_jobs=self.ncores) as parallel:#, prefer="threads" , require='sharedmem'
                fitness=parallel(joblib.delayed(self.fit_worker)(item) for item in core_list)
        except:
            p=pathos.multiprocessing.Pool(processes = self.ncores)
            fitness = p.map(self.fit_worker, core_list)
            p.close()
            p.join()

However, after some number of samples generated (it could be 10,000 as it could be 30,000), the optimization stops running without throwing errors. It happened with ES, SA, TS and multi-objective variants I implemented.

My workaround right now is to re-initialize all global variables before each candidate evaluation, but it eats some computing time.

Divide by 0 error in PSO

I encountered this error:

When looking at the code for neorl/evolu/pso.py

It looks like on lines 366 and 367, the population solutions are sorted and then the best is assigned to self.swm_pos. Then, in line 253 of self.UpdateParticle, this self.swm_pos is used for some kind of division.

I think the problem is that if the best member perfectly hits that 0.00000 a divide by zero error is encountered. I am honestly not sure if this is correct.

Minor typo in examples on website

In example 5, in the first code snippet:
import neorl.benchmarks.cec17 as functions

needs to be
import neorl.benchmarks.cec17 as all_functions

I think

Aranha, C., Camacho Villalón, C.L., Campelo, F., Dorigo, M., Ruiz, R., Sevaux, M., Sörensen, K. and Stützle, T., 2022. [Metaphor-based metaheuristics, a call for action: The elephant in the room](https://link.springer.com/article/10.1007/s11721-021-00202-9). Swarm Intelligence, 16(1), pp.1-6.

Aranha, C., Camacho Villalón, C.L., Campelo, F., Dorigo, M., Ruiz, R., Sevaux, M., Sörensen, K. and Stützle, T., 2022. Metaphor-based metaheuristics, a call for action: The elephant in the room. Swarm Intelligence, 16(1), pp.1-6.

The logo ‘NEORL’ show three times when using parallelization computing (ncores > 1)

When ncores is set to more than one, then the logo 'NEORL' will show three times rather than only once.

Objective function evaluation counting for ES

For Evolutionary Strategies, in the online documentation notes section, one bullet says "Total number of cost evaluations for ES is npop * ngen.".

This is incorrect, it should read "Total number of cost evaluations for ES is lambda*(ngen+1).".

Parallelization results in slow-down

I have a large optimization problem I'd like to run and it should be parallelizable. I started testing with ex2_ackerly and increased ncores to any number > 1. My CPU has plenty of threads, so this shouldn't be an issue.

However, it seems like there is a huge overhead penalty when running cases in parallel. Increasing to four threads resulted in nearly 4x runtime.

Are you aware of this problem? It seems to defeat the purpose of parallelization. Could you provide an example when parallelization helps?

@deanrp2 I won't revert the change since it is correct, but I realized that the output for the problems with `grid` type is not converted from integer back to how its original value in the `grid` space. This is needed for reporting. You still need to apply the function ` decode_discrete_to_grid` before reporting the final results. Please see in `gwo.py` lines 315-345 how this conversion occurs. Given AEO is not using a new fitness function, then there is no need to apply similar lines as 123-125 in GWO. When testing with the example, make sure that the output found by AEO for the input parameters with `grid` type belongs to how the grid was defined. Thanks!

Problem with discrete testing: Originally posted by @mradaideh in #31 (comment)

Parameter descriptions missing for hyperparameter tuning classes

For Random Search, Bayesian Search and Evolutionary search, parameter descriptions are missing in the online documentation.

Stray output when verbosity = False

On line 106 and 107 of neorl/evolu/mfo.py, there are print statements that cannot be suppressed with verbosity = False. I feel that these should be suppressible.

Stray print from neorl/benchmarks/tools/transforms.py

I am not sure what the purpose of this script but it is throwing in stray print on some imports.

Typo in HHO docstring

In neorl/neorl/evolu/hho.py, line 44: " :param nhawks: (int): number of the grey wolves in the group". Should be number of hawks in the group.

Potential scikit-learn version issue

Hello, I followed the instructions in the readthedocs,

> conda create --name neorl python=3.7
> conda activate neorl
> pip install neorl

and got the following error during neorl --test:

Traceback (most recent call last):
  File "/Users/4ib/anaconda3/envs/neorl/bin/neorl", line 5, in <module>
    from neorl.scripts import main
  File "/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/neorl/scripts.py", line 42, in <module>
    from neorl.tune.runners.gridtune import GRIDTUNE
  File "/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/neorl/tune/__init__.py", line 2, in <module>
    from neorl.tune.bayestune import BAYESTUNE
  File "/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/neorl/tune/bayestune.py", line 30, in <module>
    from skopt import gp_minimize
  File "/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/skopt/__init__.py", line 55, in <module>
    from .searchcv import BayesSearchCV
  File "/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/skopt/searchcv.py", line 16, in <module>
    from sklearn.utils.fixes import MaskedArray
ImportError: cannot import name 'MaskedArray' from 'sklearn.utils.fixes' (/Users/4ib/anaconda3/envs/neorl/lib/python3.7/site-packages/sklearn/utils/fixes.py)

Seemed like it was a scikit-learn version problem:

Output of conda list:


scikit-learn              1.0.1                    pypi_0    pypi
scikit-optimize           0.8.1                    pypi_0    pypi

so I downgraded scikit-learn:

conda install scikit-learn=0.24.2

and now all the tests pass.

I'm not sure if this is an issue so feel free to close it, but I thought this might help someone :)

Notes Error for SSA

In third bullet point in "notes" section of SSA, the equation is missing a squared in the exponent:
Currently:
2e^{-4g/ngen}

It should be
2e^{-(4g/ngen)^2}

I want to use PPOES to solve the FJSP-T problem, but I don't know how to define the boundaries.

Hello, thank you for your code. I was inspired by it and attempted to use the PPOES algorithm to solve the Flexible Job Shop Scheduling Problem with Transportation (FJSP-T). However, I encountered an issue with defining the boundaries.

In the FJSP-T problem, the schedule sequence is represented by a string like [1, 2, 1, 2], where the numbers 1 and 2 represent job IDs. The first 1 represents the first operation of the first job, and the second 1 represents the second operation of the first job. During the iterations, the number of 1s and 2s in the schedule sequence must remain unchanged.

How can such boundaries be defined in NEORL? Which part of the code should I refer to?🌹

Control the sklearn version when pip install neorl==1.6

Hi Majdi, after installing neorl1.6, if I input `neorl` for a test, an error will occur show that sklearn has not MaskedArray.

Traceback (most recent call last):
File "/usr/local/bin/neorl", line 5, in
from neorl.scripts import main
File "/usr/local/lib/python3.7/dist-packages/neorl/scripts.py", line 42, in
from neorl.tune.runners.gridtune import GRIDTUNE
File "/usr/local/lib/python3.7/dist-packages/neorl/tune/init.py", line 2, in
from neorl.tune.bayestune import BAYESTUNE
File "/usr/local/lib/python3.7/dist-packages/neorl/tune/bayestune.py", line 30, in
from skopt import gp_minimize
File "/root/.local/lib/python3.7/site-packages/skopt/init.py", line 55, in
from .searchcv import BayesSearchCV
File "/root/.local/lib/python3.7/site-packages/skopt/searchcv.py", line 16, in
from sklearn.utils.fixes import MaskedArray
ImportError: cannot import name 'MaskedArray' from 'sklearn.utils.fixes' (/root/.local/lib/python3.7/site-packages/sklearn/utils/fixes.py)

The reason is that the sklearn was installed a version of 1.0 in default. A solution to this issue is to downgrade the sklearn version to 0.24.2:
pip install scikit-learn==0.24.2
It will be good to set sklearn to 0.24.2 in requirement.txt.

Bug with setting seed to 0

In at least WOA, the seed is being set with:

if seed:  
    random.seed(seed) 
    np.random.seed(seed)

if the user chooses the seed to be 0, no seed will be set. I feel that this is unexpected behavior.

`hist['last_pop']` in ES incorrect size

when using ES, hist['last_pop'] ends with the number of individuals equal to mu. I believe the number of individuals should be equal to lambda. See minimal working example here:

import numpy as np
import matplotlib.pyplot as plt

from neorl import ES
from math import exp, sqrt, cos, pi
np.random.seed(50)

#---------------------------------
# Fitness function
#---------------------------------
def ACKLEY(individual):
    #Ackley objective function.
    d = len(individual)
    f=20 - 20 * exp(-0.2*sqrt(1.0/d * sum(x**2 for x in individual))) \
            + exp(1) - exp(1.0/d * sum(cos(2*pi*x) for x in individual))
    return f

 24 #---------------------------------
# Parameter Space
#---------------------------------
#Setup the parameter space (d=8)
d=3
lb=-32
ub=32
BOUNDS={}
for i in range(1,d+1):
    BOUNDS['x'+str(i)]=['float', lb, ub]

#---------------------------------
# ES
#---------------------------------
es = ES(mode='min', fit=ACKLEY, bounds=BOUNDS, lambda_ = 12, mu = 6)
x_best, y_best, hist=es.evolute(120)
print(hist["last_pop"])

Stray debug print in GWO for grid bounds

There is a debug statement which prints even when verbose = 0 in GWO which reads:

'--debug: grid parameter type is found in the space'

I believe this should either be removed or only included if verbose is true.

Default verbosity for GWO and PESA2

Currently, GWO, WOA and PESA2 print a lot of information to stdout by default. Consider changing this to be consistent with the other algorithms.

Objective function memoization

Many algorithms may require multiple identical evaluations of the objective functions due to members of a population remaining the same between generations. Consider looking into automatic cacheing of objective function calls. See here for an example.

Silence PSO printing when x0 is used to initialize evolute

When a PSO optimization search is initialized with .evolute and a x0 option is supplied to initialize the population, a message is printed to the screen reading:

The first particle provided by the user: [0.11819554538464994, -0.3116788303975989]
The last particle provided by the user: [0.1007965766526148, -3.3361796084992505]

I feel that this print statement should be suppressed when the used selects the verbose=False option in the initialization of the class.

-bash: fork: retry: Resource temporarily unavailable...NEORL starting many sleeping processes

This post is for users who may encounter:
-bash: fork: retry: Resource temporarily unavailable
or
Segmentation fault (core dumped) nohup python de_expl.py
errors.

On linux computers, users often have a maximum number of processes they are allowed to have running on a computer. This can be checked with the ulimit -a command under the max user processes row. For example:

>> ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1028858
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

A user can check how many processes (in total) they have running with the command: ps --no-headers auxwwwm | awk '$2 == "-" { print $1 }' | sort | uniq -c | sort -n. If a user wants to see the specific listing: ps --no-headers auxwwwm.

A typical Python program may start <50 processes. I am not exactly sure why this is the case but I checked a few different random scripts I had lying around and this is the conclusion I came to.

For some reason, when running NEORL in serial, around 300 processes are started. I think this has something to do with parallelization. Most of the processes are sleeping, for whatever reason.

This becomes a problem if a user wants to run multiple independent python programs which use NEORL. Regardless of computer size, the process limit is quickly reached. A fix to this is to simply raise the max number of user processes: ulimit -u ####. But this cannot be done without sudo access.

I do not think this is an urgent problem for NEORL as it only comes up in a specific use case but I wanted to post this to provide information to users who encounter the same problem.

Possibly relevant links:
https://stackoverflow.com/questions/20614309/find-reason-for-sleeping-python-process
https://stackoverflow.com/questions/31193449/python-multiprocessing-big-data-turn-process-into-sleep
https://stackoverflow.com/questions/1032813/dump-stacktraces-of-all-active-threads/7317379#7317379