
optuna-examples's Introduction

Optuna Examples

This page contains a list of example code written with Optuna.

The simplest code block looks like this:

import optuna


def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    return x ** 2


if __name__ == "__main__":
    study = optuna.create_study()
    # The optimization finishes after evaluating 1000 times or 3 seconds.
    study.optimize(objective, n_trials=1000, timeout=3)
    print(f"Best params is {study.best_params} with value {study.best_value}")

The examples below provide code blocks similar to the one above for a variety of scenarios.

Simple Black-box Optimization

Examples with ML Libraries

An example of Optuna Dashboard

The following example demonstrates how to use Optuna Dashboard.

An example where an objective function uses additional arguments

The following example demonstrates how to implement an objective function that uses additional arguments other than trial.
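As a rough sketch of the pattern (the extra min_x/max_x parameters here are made up for illustration), the additional arguments can be bound with a lambda or functools.partial:

import optuna


def objective(trial, min_x, max_x):
    x = trial.suggest_float("x", min_x, max_x)
    return x ** 2


if __name__ == "__main__":
    study = optuna.create_study()
    # Bind the extra arguments; functools.partial(objective, min_x=-100, max_x=100) works too.
    study.optimize(lambda trial: objective(trial, -100, 100), n_trials=100)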

Examples of Pruning

The following example demonstrates how to implement pruning logic with Optuna.
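For orientation, a minimal sketch of the manual pruning pattern these examples build on (the toy objective is invented here); intermediate values are reported to the pruner, which decides whether the trial should stop early:

import optuna


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    value = 0.0
    for step in range(100):
        value = (x - 2) ** 2 + 1.0 / (step + 1)  # stand-in for an intermediate score
        trial.report(value, step)  # report the intermediate value to the pruner
        if trial.should_prune():  # stop unpromising trials early
            raise optuna.TrialPruned()
    return value


if __name__ == "__main__":
    study = optuna.create_study(pruner=optuna.pruners.MedianPruner())
    study.optimize(objective, n_trials=50)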

In addition, integration modules are available for the following libraries, providing simpler interfaces to utilize pruning.

Examples of Samplers

Examples of User-Defined Sampler

Examples of Terminator

Examples of Multi-Objective Optimization

Examples of Visualization

An example to enqueue trials with given parameter values
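A minimal sketch of the idea (the parameter value is arbitrary): study.enqueue_trial fixes the parameters of the next trial before optimization continues as usual.

import optuna


def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    return x ** 2


if __name__ == "__main__":
    study = optuna.create_study()
    # Evaluate a known configuration first; later trials are sampled normally.
    study.enqueue_trial({"x": 5.0})
    study.optimize(objective, n_trials=10)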

Examples of aim

Examples of MLflow

Examples of Weights & Biases

Examples of Hydra

Examples of Distributed Optimization

Examples of Reinforcement Learning

External projects using Optuna

PRs to add additional projects welcome!

Running with Optuna's Docker images?

You can use our Docker images with tags ending in -dev to run most of the examples. For example, you can run the PyTorch Simple example via docker run --rm -v $(pwd):/prj -w /prj optuna/optuna:py3.7-dev python pytorch/pytorch_simple.py. You can also try our visualization example in a Jupyter Notebook by opening localhost:8888 in your browser after executing this:

docker run -p 8888:8888 --rm optuna/optuna:py3.7-dev jupyter notebook --allow-root --no-browser --port 8888 --ip 0.0.0.0 --NotebookApp.token='' --NotebookApp.password=''

optuna-examples's People

Contributors

0x41head, alnusjaponica, anesbenmerzoug, araffin, c-bata, crcrpar, crissman, eukaryo, g-votte, harupy, hideakiimamura, himkt, hvy, iwiwi, jeromepatel, jrbourbeau, li-li-github, nabenabe0928, not522, nuka137, nzw0301, rishav1, ritvik1512, scouvreur, sile, tohmae, toshihikoyanase, vladskripniuk, xadrianzetx, yutayamazaki


optuna-examples's Issues

Optimizer Learning Scheduler

Motivation

Support lr_scheduler.ExponentialLR, lr_scheduler.ReduceLROnPlateau, etc., with respect to each optimizer (Adam, RMSprop, etc.).

Description

Just as we have a search space for choosing between different optimizers (Adam, RMSprop), can we similarly have a search space within Optuna for the learning-rate scheduler, i.e. lr_scheduler.ExponentialLR, lr_scheduler.ReduceLROnPlateau, etc., with respect to each optimizer?
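Not part of the original report, but a rough sketch of what such a search space could look like with plain suggest_categorical calls (the value ranges and the dummy model are purely illustrative):

import optuna
import torch


def objective(trial):
    model = torch.nn.Linear(10, 1)  # dummy model

    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop"])
    optimizer = getattr(torch.optim, optimizer_name)(model.parameters(), lr=1e-3)

    scheduler_name = trial.suggest_categorical(
        "lr_scheduler", ["ExponentialLR", "ReduceLROnPlateau"]
    )
    if scheduler_name == "ExponentialLR":
        gamma = trial.suggest_float("gamma", 0.8, 0.99)
        scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)
    else:
        factor = trial.suggest_float("factor", 0.1, 0.5)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=factor)

    # A real objective would run the training loop here, calling scheduler.step()
    # each epoch, and return the validation metric.
    return 0.0


if __name__ == "__main__":
    study = optuna.create_study()
    study.optimize(objective, n_trials=5)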

Decide policy of file names

Motivation

During the review of #91, we found that we had no explicit file name policy.
In my understanding, we followed the following convention implicitly.

  • Create {library_name}_simple.py that demonstrates the Optuna usage without the integration module (e.g., keras_simple.py).
  • Create {library_name}_integration.py that demonstrates the Optuna usage with the integration module (e.g., keras_integration.py).

But some examples integrate the pruning logic into the simple example and switch the behavior with the --pruning option (see #77).

So, we may need to discuss the naming policy based on the current status.

Description

#77 will reveal the use of integration modules and the existence of the --pruning option in each example, so this issue can be revisited after it.

Callback Error for example pytorch_lightning_simple.py

When trying to run the example optuna-examples/pytorch_lightning_simple.py I get the runtime error: RuntimeError: The on_init_start callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Environment

  • Optuna version: 3.1.0
  • Python version: 3.8.10
  • OS: Ubuntu 22.04
  • (Optional) Other libraries and their versions:

Error messages, stack traces, or logs

RuntimeError: The on_init_start callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Additional context (optional)

If I comment out the callback part, the code runs without problems, but this eliminates the pruning functionality, which is quite important for the example. Sadly, I could not find a working example, so I cannot really suggest a fix for Lightning.

    trainer = pl.Trainer(
        logger=True,
        limit_val_batches=PERCENT_VALID_EXAMPLES,
        enable_checkpointing=False,
        max_epochs=EPOCHS,
        gpus=1 if torch.cuda.is_available() else None,
        #callbacks=[PyTorchLightningPruningCallback(trial, monitor="val_acc")],
    )

PyTorch Lightning example not working

Downloading and running the PyTorch Lightning example fails for me. The error being thrown indicates that trainer.callback_metrics is an empty dictionary.

Environment

Hardware

2x RTX 8000 GPUs

Software

Python 3.8.10
PyTorch 1.7.1
PyTorch Lightning 1.3.1
Optuna 2.7.0

Add aim callback example

Motivation

aim is an open-source machine learning experiment tracking library. Thanks to the aim community, aim provides an Optuna callback, similar to Optuna's WeightsAndBiasesCallback, to record Optuna optimization in aim. It would be great to provide an example using the aim callback in this repo.

Description

Add an example of aim. #55 might be a helpful example.

Alternatives (optional)

Additional context (optional)

PyTorch Lightning Validation_Step not called.

I use validation_step, validation_step_end, validation_epoch_end and PyTorchLightningPruningCallback.

Then validation_step is not run during trainer.fit().

As a result, trainer.callback_metrics contains only training_step metric values.

Version

Python 3.8.5
PyTorch 1.7.1
PyTorch Lightning 1.3.8
Optuna 2.9.1

Pytorch Lightning (PL) >=v1.8 support

The PL example uses optuna.integration.PyTorchLightningPruningCallback as a callback for a PL Trainer (version 1.8 or higher), resulting in the following error message:

RuntimeError: The on_init_start callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Expected behavior

The code runs without any errors.

Environment

  • Optuna version: 3.0.3
  • Python version: 3.10.6
  • OS: NixOS 22.0.5
  • Pytorch-Lightning: 1.8.1

Error messages, stack traces, or logs

RuntimeError: The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Steps to reproduce

  1. pip install optuna
  2. pip install pytorch-lightning
  3. Run the PL example
  4. Error

Optuna example (fastai v2): is the example maximizing validation loss ?

Hello,

I just ran the code provided by optuna-examples and it turns out that the validation loss is being maximized.

Considering

pruner = optuna.pruners.MedianPruner(n_startup_trials=2)
study = optuna.create_study(direction="maximize", pruner=pruner)
study.optimize(objective, n_trials=100, timeout=1000)

Trials 0 and 1 are not pruned, as n_startup_trials is set to 2. I get validation losses of 0.213294 and 0.158786, respectively, at epoch/step 0. At trial 2, the trial gets pruned at epoch 0 with a validation loss of 0.113877. Yet trial 3 runs to the end with a validation loss of 0.227257 at epoch 0! I understand the goal of maximizing the accuracy from the validation set (it is the value returned by the objective function). However, the direction set for the study seems to make the pruner maximize the validation loss at pruning time, which feels very counter-intuitive.

Replication should be easy as the code from the optuna example is self-contained.

I hope I am not misunderstanding things along the way.

Best regards,
Maxime

FastAI v2 example fails

Expected behavior

Runs in CI.

Environment

Daily examples CI job.

Error messages, stack traces, or logs

[W 2020-12-15 17:26:46,193] Trial 0 failed because of the following error: TypeError("no implementation found for 'torch.tensor.eq' on types that implement __torch_function__: [<class 'fastai.torch_core.TensorImage'>, <class 'fastai.torch_core.TensorCategory'>]")
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/optuna/_optimize.py", line 191, in _run_trial
    value_or_values = func(trial)
  File "examples/fastaiv2_simple.py", line 78, in objective
    learn.fit(EPOCHS)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 206, in fit
    self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 197, in _do_fit
    self._with_events(self._do_epoch, 'epoch', CancelEpochException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 192, in _do_epoch
    self._do_epoch_validate()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 188, in _do_epoch_validate
    with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 161, in all_batches
    for o in enumerate(self.dl): self.one_batch(*o)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 179, in one_batch
    self._with_events(self._do_one_batch, 'batch', CancelBatchException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 157, in _with_events
    finally:   self(f'after_{event_type}')        ;final()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 133, in __call__
    def __call__(self, event_name): L(event_name).map(self._call_one)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/foundation.py", line 154, in map
    def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/basics.py", line 641, in map_ex
    return list(res)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/basics.py", line 631, in __call__
    return self.func(*fargs, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 137, in _call_one
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 137, in <listcomp>
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/callback/core.py", line 44, in __call__
    if self.run and _run: res = getattr(self, event_name, noop)()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 458, in after_batch
    for met in mets: met.accumulate(self.learn)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 380, in accumulate
    self.total += learn.to_detach(self.func(learn.pred, *learn.yb))*bs
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/metrics.py", line 102, in accuracy
    return (pred == targ).float().mean()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/torch/tensor.py", line 25, in wrapped
    return handle_torch_function(wrapped, args, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/torch/overrides.py", line 1069, in handle_torch_function
    raise TypeError("no implementation found for '{}' on types that implement "
TypeError: no implementation found for 'torch.tensor.eq' on types that implement __torch_function__: [<class 'fastai.torch_core.TensorImage'>, <class 'fastai.torch_core.TensorCategory'>]
Traceback (most recent call last):
  File "examples/fastaiv2_simple.py", line 96, in <module>
    study.optimize(objective, n_trials=100, timeout=600)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/optuna/study.py", line 363, in optimize
    _optimize(
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/optuna/_optimize.py", line 57, in _optimize
    _optimize_sequential(
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/optuna/_optimize.py", line 158, in _optimize_sequential
    trial = _run_trial(study, func, catch)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/optuna/_optimize.py", line 191, in _run_trial
    value_or_values = func(trial)
  File "examples/fastaiv2_simple.py", line 78, in objective
    learn.fit(EPOCHS)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 206, in fit
    self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 197, in _do_fit
    self._with_events(self._do_epoch, 'epoch', CancelEpochException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 192, in _do_epoch
    self._do_epoch_validate()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 188, in _do_epoch_validate
    with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 161, in all_batches
    for o in enumerate(self.dl): self.one_batch(*o)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 179, in one_batch
    self._with_events(self._do_one_batch, 'batch', CancelBatchException)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 157, in _with_events
    finally:   self(f'after_{event_type}')        ;final()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 133, in __call__
    def __call__(self, event_name): L(event_name).map(self._call_one)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/foundation.py", line 154, in map
    def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/basics.py", line 641, in map_ex
    return list(res)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastcore/basics.py", line 631, in __call__
    return self.func(*fargs, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 137, in _call_one
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 137, in <listcomp>
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/callback/core.py", line 44, in __call__
    if self.run and _run: res = getattr(self, event_name, noop)()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 458, in after_batch
    for met in mets: met.accumulate(self.learn)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/learner.py", line 380, in accumulate
    self.total += learn.to_detach(self.func(learn.pred, *learn.yb))*bs
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/fastai/metrics.py", line 102, in accuracy
    return (pred == targ).float().mean()
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/torch/tensor.py", line 25, in wrapped
    return handle_torch_function(wrapped, args, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.6/x64/lib/python3.8/site-packages/torch/overrides.py", line 1069, in handle_torch_function
    raise TypeError("no implementation found for '{}' on types that implement "
TypeError: no implementation found for 'torch.tensor.eq' on types that implement __torch_function__: [<class 'fastai.torch_core.TensorImage'>, <class 'fastai.torch_core.TensorCategory'>]
Error: Process completed with exit code 1.

Steps to reproduce

See: https://github.com/optuna/optuna/actions/runs/415675275

Additional context (optional)

Need to wait for FastAI to fix this.

The tentative solution is to downgrade the fastai version. See: https://forums.fast.ai/t/ganlearner-error-no-implementation-found-on-types-that-implement-invisibletensor/83451/7

Replace outdated docstring in examples using FashionMNIST dataset

Motivation

Due to the instability of MNIST download, this repo's example codes have used FashionMNIST instead of MNIST.
However, some docstrings still mention MNIST (hand-written digit recognition) even though the examples use FashionMNIST, as follows:

In this example, we optimize the validation accuracy of hand-written digit recognition using
Catalyst, and FashionMNIST. We optimize the neural network architecture.

The lines come from https://github.com/optuna/optuna-examples/blob/6a6b20ad634627eebb3e7e104f73b70b45c6e624/pytorch/catalyst_simple.py.

Description

Revise the docstrings to refer to FashionMNIST, not MNIST.

Alternatives (optional)

Additional context (optional)

Pytorch Simple Example using Pytorch docs training loop

I modified the PyTorch simple code example given in this repo and replaced the training/validation loop with code similar to the sample training loop given in a classifier training example in the PyTorch docs. However, the study ends after a single trial. I'm assuming this has something to do with the return accuracy statement at the end of the objective function. Can someone explain the purpose of this statement, or better, does anyone have an idea what I could be doing wrong?
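For context, a stripped-down sketch of what the return statement does (this is not the repository's exact example): the value returned by objective is the single number the study compares across trials, so it is computed once per trial after training finishes.

import optuna


def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    # ... build the model and run the full training loop here ...
    accuracy = 1.0 - abs(lr - 0.01)  # dummy stand-in for the measured validation accuracy
    # The returned number is the only value the study compares across trials,
    # so each call to objective() is one complete trial.
    return accuracy


if __name__ == "__main__":
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)
    print(study.best_params)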

SB3_contrib Maskable PPO, Learning Rate schedule error

Expected behavior

Optimizing the initial learning rate for PPO to linearly decrease the learning rate for each trial.

Environment

  • Optuna version: optuna 3.0.5
  • Python version: Python 3.8.10
  • OS: WSL2- Ubuntu 20.04
  • (Optional) Other libraries and their versions:
    sb3-contrib 1.6.2
    stable-baselines3 1.6.2

Error messages, stack traces, or logs

2023-01-04 12:51:53 [W 2023-01-04 17:51:46,602] Trial 0 failed because of the following error: ValueError('value should be one of int, float, str, bool, or torch.Tensor')
2023-01-04 12:51:53 Traceback (most recent call last):
2023-01-04 12:51:53   File "/home/ftuser/.local/lib/python3.10/site-packages/optuna/study/_optimize.py", line 196, in _run_trial
2023-01-04 12:51:53     value_or_values = func(trial)
2023-01-04 12:51:53   File "/user_data/models/ReforceXOptuna.py", line 353, in <lambda>
2023-01-04 12:51:53     lambda trial: self._hyperopt_objective(
2023-01-04 12:51:53   File "/user_data/models/ReforceXOptuna.py", line 289, in _hyperopt_objective
2023-01-04 12:51:53     model.learn(
2023-01-04 12:51:53   File "/home/ftuser/.local/lib/python3.10/site-packages/sb3_contrib/ppo_mask/ppo_mask.py", line 613, in learn
2023-01-04 12:51:53     self.logger.dump(step=self.num_timesteps)
2023-01-04 12:51:53   File "/home/ftuser/.local/lib/python3.10/site-packages/stable_baselines3/common/logger.py", line 528, in dump
2023-01-04 12:51:53     _format.write(self.name_to_value, self.name_to_excluded, step)
2023-01-04 12:51:53   File "/home/ftuser/.local/lib/python3.10/site-packages/stable_baselines3/common/logger.py", line 429, in write
2023-01-04 12:51:53     experiment, session_start_info, session_end_info = hparams(value.hparam_dict, metric_dict=value.metric_dict)
2023-01-04 12:51:53   File "/home/ftuser/.local/lib/python3.10/site-packages/torch/utils/tensorboard/summary.py", line 231, in hparams
2023-01-04 12:51:53     raise ValueError(
2023-01-04 12:51:53 ValueError: value should be one of int, float, str, bool, or torch.Tensor

Steps to reproduce

  1. I am using RL_zoo3 example https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/rl_zoo3/hyperparams_opt.py

Reproducible examples (optional)

This is just a snippet of a large algo.

def linear_schedule(initial_value: Union[float, str]) -> Callable[[float], float]:
    """
    Linear learning rate schedule.
    :param initial_value: (float or str)
    :return: (function)
    """
    # Force conversion to float
    initial_value_ = float(initial_value)

    def func(progress_remaining: float) -> float:
        """
        Progress will decrease from 1 (beginning) to 0
        :param progress_remaining: (float)
        :return: (float)
        """
        return progress_remaining * initial_value_

    return func


def sample_ppo_params(trial: optuna.Trial) -> Dict[str, Any]:
    """
    Sampler for PPO hyperparams.
    :param trial:
    :return:
    """
      
    batch_size = trial.suggest_categorical("batch_size", [ 512, 1024, 2048,4096])
    n_steps = trial.suggest_categorical("n_steps", [8, 16, 32, 64, 128, 256, 512, 1024, 2048])
    gamma = trial.suggest_categorical("gamma", [0.6,0.7,0.8, 0.85, 0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999])
  
    lr_schedule = trial.suggest_float("learning_rate", 0.00000003, 0.009)
    learning_rate = linear_schedule(lr_schedule)# 0.003 to 5e-6
   
    ent_coef = trial.suggest_float("ent_coef", 0.00000001, 0.01)
    clip_range = trial.suggest_categorical("clip_range", [0.1, 0.2, 0.3, 0.4, 0.5])
    n_epochs = trial.suggest_categorical("n_epochs", [ 5, 10, 15, 20, 25, 30,50,100,150])
    gae_lambda = trial.suggest_categorical("gae_lambda", [0.8, 0.85, 0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999])
    max_grad_norm = trial.suggest_categorical("max_grad_norm", [0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 5])
    vf_coef = trial.suggest_float("vf_coef", 0, 1)
       
    if batch_size > n_steps:
        batch_size = n_steps

    return {
        ######## PPO PARAMS ######
        "n_steps": n_steps,
        "batch_size": batch_size,
        "gamma": gamma,
        "learning_rate": learning_rate,
        "ent_coef": ent_coef,
        "clip_range": clip_range,
        "n_epochs": n_epochs,
        "gae_lambda": gae_lambda,
        "max_grad_norm": max_grad_norm,
        "vf_coef": vf_coef,
        # "sde_sample_freq": sde_sample_freq,
        # "policy_kwargs": net_arch
        #     # log_std_init=log_std_init,
        #     net_arch=net_arch,
        #     activation_fn=activation_fn,
        #     ortho_init=ortho_init,
        # ),
    }

class ReforceXOptuna(ReforceX):

    def _hyperopt_objective(self, trial: optuna.Trial, train_df, total_timesteps: int) -> float:
        
        params = sample_ppo_params(trial)
        policy_kwargs = dict(activation_fn=th.nn.SiLU,net_arch=self.net_arch)
        #print(params)
        # Create the RL model
        model = self.MODELCLASS(
            self.policy_type,
            self.train_env,
            policy_kwargs=policy_kwargs,
            tensorboard_log=Path(self.custom_tensorboard_path),
            **params
        )
        
        nan_encountered = False
        optuna_callback = TrialEvalCallback(
            self.eval_env, trial, eval_freq=len(train_df), deterministic=True,
        )
        try:
            model.learn(
                total_timesteps=int(total_timesteps),
                callback=[optuna_callback, self.tensorboard_callback]
            )
            """
            results=optuna.study.trials_dataframe()
            sortedresults=results.sort_values(by='value',ascending=True) 
            print(sortedresults.tail(30))      
            """    
        except AssertionError as e:
            # Sometimes, random hyperparams can generate NaN
            logger.warning(f"Optuna encountered NaN n\{e}")
            nan_encountered = True
        finally:
            # Free memory
            model.env.close()
            self.eval_env.close()

        # Tell the optimizer that the trial failed
        if nan_encountered:
            return float("nan")

        if optuna_callback.is_pruned:
            raise optuna.exceptions.TrialPruned()

        return optuna_callback.last_mean_reward
    
    def fit(self, data_dictionary: Dict[str, Any], dk: FreqaiDataKitchen, **kwargs):
        """
        User customizable fit method
        :params:
        data_dictionary: dict = common data dictionary containing all train/test
            features/labels/weights.
        dk: FreqaiDatakitchen = data kitchen for current pair.
        :returns:
        model: Any = trained model to be used for inference in dry/live/backtesting
        """
        train_df = data_dictionary["train_features"]
        total_timesteps = self.freqai_info["rl_config"]["train_cycles"] * len(train_df)

Additional context (optional)

Deprecation Warning on KerasPruningCallback - How to implement it in a modern way?

I was just reading up on how to implement pruning with Optuna. I have based my implementation very closely on the methodology in the example keras/keras_integration.py.
When running this code I get the following warning:

FutureWarning: KerasPruningCallback has been deprecated in v2.1.0. This feature will be removed in v4.0.0. See https://github.com/optuna/optuna/releases/tag/v2.1.0. Recent Keras release (2.4.0) simply redirects all APIs in the standalone keras package to point to tf.keras. There is now only one Keras: tf.keras. There may be some breaking changes for some workflows by upgrading to keras 2.4.0. Test before upgrading. REF:https://github.com/keras-team/keras/releases/tag/2.4.0
  callbacks=[KerasPruningCallback(trial, "loss")]

I was wondering what the future-proof way of implementing this would be.
Thank you!
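For reference, the tf.keras-based callback, optuna.integration.TFKerasPruningCallback, is the non-deprecated counterpart; a minimal sketch (the model and the random data below are made up for illustration):

import numpy as np
import optuna
from optuna.integration import TFKerasPruningCallback
import tensorflow as tf


def objective(trial):
    units = trial.suggest_int("units", 32, 128)
    model = tf.keras.Sequential(
        [
            tf.keras.layers.Dense(units, activation="relu", input_shape=(20,)),
            tf.keras.layers.Dense(1),
        ]
    )
    model.compile(optimizer="adam", loss="mse")

    # Placeholder data; replace with your dataset.
    x, y = np.random.rand(200, 20), np.random.rand(200, 1)
    model.fit(
        x,
        y,
        epochs=10,
        validation_split=0.2,
        callbacks=[TFKerasPruningCallback(trial, "val_loss")],
        verbose=0,
    )
    return model.evaluate(x, y, verbose=0)


if __name__ == "__main__":
    study = optuna.create_study(direction="minimize", pruner=optuna.pruners.MedianPruner())
    study.optimize(objective, n_trials=10)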

Replace `thop` with `fvcore`

Motivation

thop has slightly complicated dependencies, which sometimes require hotfixes.

Description

Apply a similar modification to the one in optuna/optuna#3906.

Alternatives (optional)

Additional context (optional)

Does integration.TorchDistributedTrial support multinode optimization?

Does integration.TorchDistributedTrial support multinode optimization?

I'm using Optuna on a SLURM cluster. Suppose I would like to do a distributed hyperparameter optimization using two nodes with two gpus each. Would submitting a script like pytorch_distributed_simple.py to multiple nodes yield expected results?

I assume every node would be responsible for executing its own trials (i.e., no nodes share trials) and every GPU on a node is responsible for its own portion of the data, determined by torch.utils.data.DataLoader's sampler. Is this assumption correct, or are edits needed apart from TorchDistributedTrial's requirement to pass None to objective calls on ranks other than 0?

I already tried the above, but I'm not sure how to check every node is responsible for distinct trials.
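For context, a heavily simplified sketch of the single-process-group pattern in pytorch_distributed_simple.py: TorchDistributedTrial broadcasts the suggested values to every rank in the process group, rank 0 owns the study, and the other ranks call the objective with None (the DDP training itself is omitted here).

import optuna
from optuna.integration import TorchDistributedTrial
import torch.distributed as dist

N_TRIALS = 20


def objective(trial):
    # Wrap the trial so every rank in the process group gets the same suggestions.
    trial = TorchDistributedTrial(trial)
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    # ... DDP training on this rank's shard of the data goes here ...
    accuracy = lr  # placeholder for the evaluated metric
    return accuracy


if __name__ == "__main__":
    dist.init_process_group("gloo")
    if dist.get_rank() == 0:
        study = optuna.create_study(direction="maximize")
        study.optimize(objective, n_trials=N_TRIALS)
    else:
        for _ in range(N_TRIALS):
            try:
                objective(None)
            except optuna.TrialPruned:
                pass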


StackOverflow crosspost

Optuna logs all hyperparameter tuning runs to the same experiment in Weights and Biases logger--pytorch-lightning

Expected behavior

I am using Optuna for hyperparameter tuning, along with Weights and Biases for logging, and PyTorch Lightning. The tuning is working, but the problem is logging the results of the tuning trials. I would expect that for each trial (i.e., each combination of hyperparameters) Optuna would generate a new experiment for the logger to log. Then I could look at the metrics by experiment/combination of hyperparameters and figure out which one works best.

Instead, Optuna does not seem to initiate a new experiment in the logger on each training run. So what I have is that all of the hyperparameter tuning runs are getting combined into a single experiment in Weights and Biases, and it looks like a mess. So I am not sure if it is just a configuration issue on my part or something else, but I was hoping that someone might know the right thing to do.

Environment

  • Optuna version: 2.8.0
  • Python version: 3.8
  • OS:Ubuntu linux 20.04.2
  • (Optional) Other libraries and their versions:
    • pytorch 1.9
    • pytorch-lightning: 1.3.5

Error messages, stack traces, or logs

No error messages are generated.

Steps to reproduce

  1. load module
  2. run module.

Reproducible examples (optional)

def objective(trial: optuna.trial.Trial) -> float:

    input_window_size = trial.suggest_categorical("input_window_size", [20, 30, 40])
    output_window_size = trial.suggest_categorical("output_window_size", [ 20, 30, 40])
    test_percentage = 0.20
    val_percentage = 0.20
    lags = [1, 2, 3, 365, 366, 367]
    lag_combos = list(powerset(lags))
    laglist = trial.suggest_categorical("lags", lag_combos)
    drop_columns = ['pr', 'tmmx']
    drop_column_combos = list(powerset(drop_columns))
    remove_columns = trial.suggest_categorical("remove_columns", drop_column_combos)
    batch_size = trial.suggest_categorical("batch_size",[16, 32, 64, 128])
        
    datamodule = get_dataset('Sensor1') 
    datamodule = datamodule(input_window_size=input_window_size,
                            output_window_size=output_window_size,
                            test_percentage=test_percentage,
                            val_percentage=val_percentage,
                            laglist=laglist,
                            remove_columns=remove_columns,
                            batch_size=batch_size)
    
    datamodule.prepare_data()
    datamodule.setup()


    hidden_dim = trial.suggest_categorical("hidden_dim", [16, 32, 64, 128])
    num_layers = trial.suggest_int("num_layers", 1, 3)
    dropout = trial.suggest_float("dropout", 0.2, 0.5, step=0.1)

    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"])
    lr = trial.suggest_uniform("lr", 1e-5, 1e-1)

    model = LitDivSensor(num_features = datamodule.num_features,
                            hidden_dim = hidden_dim,
                            num_layers = num_layers,
                            dropout = dropout,
                            debug = False,
                            learning_rate = lr,
                            batch_size = batch_size,
                            optimizer_name=optimizer_name)

    tb_logger = pl_loggers.TensorBoardLogger('logs/', name='division-bell-rnn')
    wandb_logger = pl_loggers.WandbLogger(name='division-bell-rnn'+ str(date_time),  ## NOTE WANDB LOGGER 
                                          save_dir='logs/', 
                                          project='division-bell', 
                                          entity='kwater',
                                          offline=False)
    trainer = pl.Trainer(
        logger=[tb_logger, wandb_logger],
        checkpoint_callback=False,
        max_epochs=EPOCHS,
        gpus=1 if torch.cuda.is_available() else None,
        callbacks=[PyTorchLightningPruningCallback(trial, monitor="val_loss")])

    hyperparameters = dict(input_window_size=input_window_size, 
                           output_window_size=output_window_size,
                           laglist=laglist,
                           remove_columns=remove_columns,
                           batch_size = batch_size,
                           hidden_dim=hidden_dim,
                           num_layers=num_layers,
                           dropout=dropout,
                           optimizer_name=optimizer_name,
                           learning_rate=lr)

    trainer.logger.log_hyperparams(hyperparameters)
    trainer.fit(model, datamodule=datamodule)

    return trainer.callback_metrics["val_loss"].item()

if __name__ == "__main__":
    pruner: optuna.pruners.BasePruner = (
            optuna.pruners.MedianPruner())

    study = optuna.create_study(direction="minimize", pruner=pruner)
    study.optimize(objective, n_trials=3, timeout=600)

Additional context (optional)

Here is a picture of the loggers where you can see that all of the hyperparameter losses are in the same experiment, and hence under the same color line. The green line is from a previous experiment.

[Screenshot: all tuning runs logged under a single Weights & Biases experiment]

Using multiple RL environments with Optuna

How do I use Optuna with multiple OpenAI Gym environments in Stable Baselines 3?

Stable Baselines 3 suggests using a SubprocVecEnv for running an agent through multiple environments.

When I try to wrap the environment in a SubprocVecEnv while using your example, it results in a BrokenPipeError. Do you have any further hints or examples?

Fix a CI failure of optuna-hydra-sweeper v1.2.0

Expected behavior

Fix an error of python hydra/simple.py --multirun.
See https://github.com/optuna/optuna-examples/runs/6509897643?check_suite_focus=true for details.

Environment

  • Optuna version: 2.10.0
  • Python version: 3.6-3.10
  • OS:
  • (Optional) Other libraries and their versions:

Error messages, stack traces, or logs

Run python hydra/simple.py --multirun > /dev/null
hydra/simple.py:19: UserWarning: 
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path="conf", config_name="config")
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/hydra/_internal/utils.py", line 464, in <lambda>
    overrides=overrides,
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 162, in multirun
    ret = sweeper.sweep(arguments=task_overrides)
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/hydra_plugins/hydra_optuna_sweeper/optuna_sweeper.py", line 52, in sweep
    return self.sweeper.sweep(arguments)
  File "/opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/hydra_plugins/hydra_optuna_sweeper/_impl.py", line 2[8](https://github.com/optuna/optuna-examples/runs/6509897643?check_suite_focus=true#step:5:9)[9](https://github.com/optuna/optuna-examples/runs/6509897643?check_suite_focus=true#step:5:10), in sweep
    assert self.search_space is None
AssertionError

Steps to reproduce

  1. pip install optuna-hydra-sweeper==1.2.0
  2. python hydra/simple.py --multirun

Additional context (optional)

I made a hotfix at #114.

Rewrite `pytorch_checkpoint` example using `RetryFailedTrialCallback`

Motivation

Optuna v2.8.0 introduced RetryFailedTrialCallback, which will simplify the pytorch_checkpoint example.

Description

The callback restart_from_checkpoint in the pytorch_checkpoint example can be replaced with RetryFailedTrialCallback. We need to change the path to the model to use the number of the first failed trial instead of the current trial's number.
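A rough sketch of that change (the storage URL, directory layout, and retry count are illustrative, not the example's actual values): RetryFailedTrialCallback is attached to an RDBStorage, and RetryFailedTrialCallback.retried_trial_number tells a retried trial which earlier trial's checkpoint to reuse.

import optuna
from optuna.storages import RDBStorage, RetryFailedTrialCallback

storage = RDBStorage(
    "sqlite:///example.db",
    heartbeat_interval=60,
    grace_period=120,
    failed_trial_callback=RetryFailedTrialCallback(max_retry=3),
)
study = optuna.create_study(
    storage=storage, study_name="pytorch_checkpoint", load_if_exists=True
)


def objective(trial):
    # If this trial is a retry, point the checkpoint path at the failed trial's number.
    retried = RetryFailedTrialCallback.retried_trial_number(trial)
    trial_number = retried if retried is not None else trial.number
    checkpoint_path = f"./checkpoints/{trial_number}/model.pt"
    # ... load checkpoint_path if it exists, train, save checkpoints, return the metric ...
    return 0.0


study.optimize(objective, n_trials=10)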

Example integration with TorchElastic

Motivation

Optuna is an invaluable tool for model hyperparameter optimization, and many members of the community will likely be interested in using Optuna in conjunction with frameworks for fault-tolerant distributed training such as TorchElastic.

Currently Optuna supports and provides examples of distributed optimisation where n independent jobs optimize n independent trials. However, Optuna does not have a template (example optimization script) for usage with a Framework like TorchElastic, where Optuna is used in conjunction with distributed training, that is, a group of n dependent elastic workers optimise a single model that corresponds to a single Optuna trial.

Distributed training is common in deep learning use cases, and I think it would be very helpful to have an example of Optuna in the context of distributed training. If such an example already exists and I have missed it, my apologies, but I would be glad to be directed towards it!

Description

The following is a rough schema for a possible approach to using Optuna together with TorchElastic.

Assumptions

The main design assumption for this template is that we are proposing a schema for an Optuna optimization script that can be executed by the TorchElastic launch script, and that will execute all Optuna optimization and training within this script.

  • Assume a world size of n elastic workers
  • All workers should have access to the same trial
  • The trial should propose identical hyperparameters per worker
  • Only one worker should return optimization results for the trial

Logic of Example Optimization script

  • Initialize a GPU process group across all workers
  • One worker (random local rank) executes a study.optimize call to obtain hyperparameters before executing training logic
  • All workers initialize a temporary process group where the Optuna worker broadcasts its hyperparameters to all other workers
  • All workers execute identical training logic
  • The Optuna worker returns the optimization metric value on completion
  • If any of the workers are interrupted, the process group is restarted and the trial is marked as failed

See this example for a reference regarding the usage of temporary process groups to broadcast data within a worker group.
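As a concrete hint for the broadcast step (a sketch only, assuming a process group is already initialized and, for simplicity, that rank 0 is the worker that owns the Optuna trial), torch.distributed.broadcast_object_list can be used to share the suggested hyperparameters:

import torch.distributed as dist


def share_hyperparameters(params, rank):
    """Broadcast the dict suggested on rank 0 to every other worker."""
    obj_list = [params if rank == 0 else None]
    dist.broadcast_object_list(obj_list, src=0)
    return obj_list[0]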

Complications

We may need to store additional state across surviving workers about the number of successfully completed trials to date. If the worker group restarts because of e.g. node failure, simply using an optimise call with a fixed n_trials can result in more trials than anticipated being run.

Alternatives

This is a topic that I haven't yet found clear design patterns for. There may be better options than my proposed schema, which it would be great to discuss!

In general the goal is to come up with a clear and functional example of integration between Optuna and TorchElastic, which can possibly also give inspiration for integrating Optuna with other distributed training frameworks.

BUG - dask integration example not working

Expected behavior

That the code sample executes on the pre-provisioned dask cluster and that the optuna to dask distributed integration is working seamlessly.

Environment

  • Optuna version:
  • Python version: python3.9
  • OS: Debian GNU/Linux 11 (airflow packaged container)
  • (Optional) Other libraries and their versions:
    optuna == 3.1.0b0
    joblib == 1.2.0
    dask == 2022.12.1
    distributed == 2022.12.1

Error messages, stack traces, or logs

[2023-01-03, 00:00:14 UTC] {taskinstance.py:1851} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/decorators/base.py", line 188, in execute
return_value = super().execute(context)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 193, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/root_dag.py", line 93, in run_optimization
storage = optuna.integration.DaskStorage(InMemoryStorage())
File "/home/airflow/.local/lib/python3.9/site-packages/optuna/_experimental.py", line 115, in wrapped_init
_original_init(self, *args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/optuna/integration/dask.py", line 446, in init
self.client.run_on_scheduler(_register_with_scheduler, storage=storage, name=self.name)
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/client.py", line 2740, in run_on_scheduler
return self.sync(self._run_on_scheduler, function, *args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/utils.py", line 339, in sync
return sync(
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/utils.py", line 406, in sync
raise exc.with_traceback(tb)
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/utils.py", line 379, in f
result = yield future
File "/home/airflow/.local/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
value = future.result()
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/client.py", line 2691, in _run_on_scheduler
response = await self.scheduler.run_function(
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/core.py", line 1155, in send_recv_from_rpc
return await send_recv(comm=comm, op=key, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/distributed/core.py", line 945, in send_recv
raise exc.with_traceback(tb)
File "/opt/conda/lib/python3.9/site-packages/distributed/core.py", line 820, in _handle_comm
File "/opt/conda/lib/python3.9/site-packages/distributed/worker.py", line 3217, in run
File "/opt/conda/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 73, in loads
ModuleNotFoundError: No module named 'optuna'


Steps to reproduce

  1. take code used and rerun

Reproducible examples (optional)

From https://github.com/cherusk/godon/blob/master/breeder/linux_network_stack/root_dag.py#L83

# Essentially the example code

import optuna
from optuna.storages import InMemoryStorage
from optuna.integration import DaskStorage
from dask.distributed import Client
from dask.distributed import wait


def run_optimization():
    # boilerplate from https://jrbourbeau.github.io/dask-optuna/

    def objective(trial):
        x = trial.suggest_uniform("x", -10, 10)
        return (x - 2) ** 2

    with Client(address="godon_dask_scheduler_1:8786") as client:
        # Create a study using Dask-compatible storage
        storage = optuna.integration.DaskStorage(InMemoryStorage())
        study = optuna.create_study(storage=storage)
        # Optimize in parallel on your Dask cluster
        futures = [
            client.submit(study.optimize, objective, n_trials=10, pure=False)
            for i in range(10)
        ]
        wait(futures)
        print(f"Best params: {study.best_params}")


optimization_step = run_optimization()

Additional context (optional)

The end of the trace is quite strange, because nothing was installed via conda. Everything comes in via pip:
https://github.com/cherusk/godon/blob/master/Dockerfile-airflow#L3

Optuna is meant to run as the backend of the godon project: https://github.com/cherusk/godon

LightGBM integration example

Hi,

From this example : https://github.com/optuna/optuna-examples/blob/main/lightgbm/lightgbm_integration.py
There is a pruning integration with LightGBM to prune unpromising trials:

pruning_callback = optuna.integration.LightGBMPruningCallback(trial, "auc")
gbm = lgb.train(param, dtrain, valid_sets=[dvalid], callbacks=[pruning_callback])

And another pruner here:
study = optuna.create_study( pruner=optuna.pruners.MedianPruner(n_warmup_steps=10), direction="maximize" )

Why do we need both?

Thanks !
Boris

Fix CI error of PyTorch

Expected behavior

CIs should pass without errors.

Environment

This repo's Github CI: https://github.com/optuna/optuna-examples/actions/workflows/pytorch.yml

Error messages, stack traces, or logs

Error message is as follows:

Run python pytorch/pytorch_lightning_simple.py
Traceback (most recent call last):
  File "pytorch/pytorch_lightning_simple.py", line 19, in <module>
    from optuna.integration import PyTorchLightningPruningCallback
ImportError: cannot import name 'PyTorchLightningPruningCallback' from 'optuna.integration' (unknown location)
Error: Process completed with exit code 1.

Steps to reproduce

See https://github.com/optuna/optuna-examples/actions/workflows/pytorch.yml

Additional context (optional)

  1. Optuna's main repo doesn't see this CI error.
  2. A new ubuntu-latest runner was released after this CI broke, but the Ubuntu version was the same:

From the first failed CI's log

Current runner version: '2.286.0'
Operating System
Virtual Environment
  Environment: ubuntu-20.04
  Version: 20211219.1
  Included Software: https://github.com/actions/virtual-environments/blob/ubuntu20/20211219.1/images/linux/Ubuntu2004-README.md
  Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20211219.1

From the last complete CI's log

Current runner version: '2.285.1'
Operating System
Virtual Environment
  Environment: ubuntu-20.04
  Version: 20211219.1
  Included Software: https://github.com/actions/virtual-environments/blob/ubuntu20/20211219.1/images/linux/Ubuntu2004-README.md
  Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20211219.1
  3. The current runner versions were not the same between the two messages above, but I'm not sure that this causes the CI error...

Optuna with DDP training using multiple GPU

The DDP example shows the case of using CPU devices and dist.init_process_group("gloo"). When I switch to a multi-GPU environment with dist.init_process_group('nccl', xxx), Optuna does not seem to work. It reports an error when trying to suggest values, e.g.
n_layers = trial.suggest_int("n_layers", 1, 3)
RuntimeError: Tensors must be CUDA and dense

Add `hiplot` example

Motivation

hiplot v0.1.30 can take an Optuna study and create a plot. It would be great to provide an example showcasing this feature to grow our community.

Description

Following the convention of this example code repo, please create a hiplot directory and add an example code to it.
Otherwise, create a visualization dir.

Alternatives (optional)

Additional context (optional)

Add `WeightsAndBiasesCallback` example

Motivation

Optuna introduces W&B integration, WeightsAndBiasesCallback to track optimisation since the next version update, v2.9.0. By following our convention, it would be great to provide an example code to use the callback functionality.

Description

The expected pull request contains

  1. minimal example code like mlflow/keras_mlflow.py
  2. README.md like mlflow/README.md;
  3. Add weights and biases to README.md

Of course, weightsandbiases/requirements.txt and .github/workflows/weightsandbiases.yml are also necessary.
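For orientation, a minimal sketch of what such an example might boil down to (the project name is made up):

import optuna
from optuna.integration import WeightsAndBiasesCallback


def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2


if __name__ == "__main__":
    wandbc = WeightsAndBiasesCallback(wandb_kwargs={"project": "optuna-wandb-example"})
    study = optuna.create_study()
    study.optimize(objective, n_trials=20, callbacks=[wandbc])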

Additional context (optional)

Add W&B's link from the main repository:

Decide the policy whether or not to run with pruning option in CIs

Motivation

Several Optuna example scripts have the pruning option --pruning; more concretely,

python pytorch_lightning_simple.py --pruning

as in this example.

But currently the CIs do not run with the pruning option.

Description

I'm wondering if we could decide the policy on whether or not to use the pruning option in CIs:

  1. Without pruning only (as in the current example's CIs)
  2. With pruning only
  3. Both

Alternatives (optional)

Additional context (optional)

Getting error fit() got an unexpected keyword argument 'callbacks' while running catboost_pruning.py

Hello

I've been trying to run catboost_pruning.py, but I got this error immediately.

Any help would be appreciated

Environment

  • Optuna version: 2.10.0
  • Catboost version: 0.25.1
  • Python version: 3.6.9
[W 2022-02-01 20:55:41,660] Trial 0 failed because of the following error: TypeError("fit() got an unexpected keyword argument 'callbacks'",)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-24-57f614d12d83>", line 30, in objective
    callbacks=[pruning_callback],
TypeError: fit() got an unexpected keyword argument 'callbacks'
TypeError                                 Traceback (most recent call last)
<ipython-input-25-da54f0e9fc16> in <module>
      3         pruner=optuna.pruners.MedianPruner(n_warmup_steps=5), direction="maximize"
      4     )
----> 5     study.optimize(objective, n_trials=100, timeout=600)
      6 
      7     print("Number of finished trials: {}".format(len(study.trials)))

/usr/local/lib/python3.6/site-packages/optuna/study/study.py in optimize(self, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
    407             callbacks=callbacks,
    408             gc_after_trial=gc_after_trial,
--> 409             show_progress_bar=show_progress_bar,
    410         )
    411 

/usr/local/lib/python3.6/site-packages/optuna/study/_optimize.py in _optimize(study, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
     74                 reseed_sampler_rng=False,
     75                 time_start=None,
---> 76                 progress_bar=progress_bar,
     77             )
     78         else:

/usr/local/lib/python3.6/site-packages/optuna/study/_optimize.py in _optimize_sequential(study, func, n_trials, timeout, catch, callbacks, gc_after_trial, reseed_sampler_rng, time_start, progress_bar)
    161 
    162         try:
--> 163             trial = _run_trial(study, func, catch)
    164         except Exception:
    165             raise

/usr/local/lib/python3.6/site-packages/optuna/study/_optimize.py in _run_trial(study, func, catch)
    262 
    263     if state == TrialState.FAIL and func_err is not None and not isinstance(func_err, catch):
--> 264         raise func_err
    265     return trial
    266 

/usr/local/lib/python3.6/site-packages/optuna/study/_optimize.py in _run_trial(study, func, catch)
    211 
    212     try:
--> 213         value_or_values = func(trial)
    214     except exceptions.TrialPruned as e:
    215         # TODO(mamu): Handle multi-objective cases.

<ipython-input-24-57f614d12d83> in objective(trial)
     28         verbose=0,
     29         early_stopping_rounds=100,
---> 30         callbacks=[pruning_callback],
     31     )
     32 

TypeError: fit() got an unexpected keyword argument 'callbacks'

Support Python 3.11

Motivation

Sub-task of optuna/optuna#3964 for tracking this repo.

Description

Add python 3.11 to github action's version matrix. An example PR is #160.

Alternatives (optional)

Additional context (optional)

I summarise the status of targets as follows:

ModuleNotFoundError: No module named 'botorch.sampling.samplers'

Expected behavior

Environment

  • Optuna version: 3.0.4
  • Python version: 3.10
  • OS: windows 10
  • botorch==0.8.0

Error messages, stack traces, or logs

F:\Github\...\bo.py:35: ExperimentalWarning: BoTorchSampler is experimental (supported from v2.4.0). The interface can change in the future.
  sampler = optuna.integration.BoTorchSampler(
Traceback (most recent call last):
  File "F:\Github\..\venv\lib\site-packages\optuna\integration\botorch.py", line 39, in <module>
    from botorch.sampling.samplers import SobolQMCNormalSampler
ModuleNotFoundError: No module named 'botorch.sampling.samplers'

Steps to reproduce

  1. Run https://github.com/optuna/optuna-examples/blob/main/multi_objective/botorch_simple.py

Reproducible examples (optional)

# python code

Additional context (optional)

SobolQMCNormalSampler is now located at botorch.sampling.normal

Drop Python 3.6 support

Motivation

Related to optuna/optuna#3021.
The current master of optuna stopped Python 3.6 support for integration modules and optuna-examples.
For further details, please refer to optuna/optuna#3021 (comment).

Description

Please remove the Python 3.6 execution in following CI:

  • base.yml
  • catboost.yml
  • chainer.yml
  • dask_ml.yml
  • hiplot.yml
  • hydra.yml
  • lightgbm.yml
  • multi_objective.yml
  • mxnet.yml
  • pytorch.yml
  • ray.yml
  • rl.yml
  • samplers.yml
  • skimage.yml
  • sklearn.yml
  • visualization.yml
  • wandb.yml
  • xgboost.yml

Concretely, we can simply remove '3.6' from the python-version list:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.6', '3.7', '3.8', '3.9']

In addition, we can also remove the conditional branch regarding Python 3.6 if one exists.

Add isort to CI

Motivation

Description

Add isort to CONTRIBUTING.md and add it to the CI tests.

Alternatives (optional)

Leave as is. This may be useful if some of our examples are replicated in other repositories, such as Chainer, to keep both versions identical.

Additional context (optional)

Get best_iteration from best_trial using LightGBM

Hi,

I am using Optuna to optimize LightGBM parameters with "valid_sets" like this:
pruning_callback = optuna.integration.LightGBMPruningCallback(trial = trial, metric = conf_obj.losses_function_optimize.lower())

lgbm_model = lgbm.train(params = param_grid, train_set = train_data_Dataset, valid_sets =[valid_data_Dataset], callbacks=[pruning_callback] , early_stopping_rounds = 10, verbose_eval=False)

  1. Is there an option to retrieve best_iteration from the best model (based on early_stopping_rounds)?
  2. Does it make sense to use the early_stopping_rounds functionality at all?

Thanks,
Boris
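Not an authoritative answer to the question above, but one common workaround is to store the early-stopped iteration as a per-trial user attribute; a self-contained sketch on random data (all parameter choices and data are illustrative):

import lightgbm as lgb
import numpy as np
import optuna

rng = np.random.RandomState(0)
X, y = rng.rand(500, 10), rng.randint(0, 2, 500)
dtrain = lgb.Dataset(X[:400], label=y[:400])
dvalid = lgb.Dataset(X[400:], label=y[400:])


def objective(trial):
    params = {
        "objective": "binary",
        "metric": "auc",
        "verbosity": -1,
        "num_leaves": trial.suggest_int("num_leaves", 8, 64),
    }
    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, "auc")
    gbm = lgb.train(
        params,
        dtrain,
        valid_sets=[dvalid],
        callbacks=[pruning_callback, lgb.early_stopping(stopping_rounds=10)],
    )
    # Keep the early-stopped iteration so it can be read back from the best trial.
    trial.set_user_attr("best_iteration", gbm.best_iteration)
    return gbm.best_score["valid_0"]["auc"]


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)
print(study.best_trial.user_attrs["best_iteration"])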

Add Python 3.10 check to CI

Optuna now supports Python 3.10 and we should ensure that these examples also work with Python 3.10.

We can add version 3.10 to the following files and check that they work:

.github/workflows/

  • pytorch.yml
  • tensorboard.yml
  • visualization.yml
  • wandb.yml
  • skimage.yml
  • mxnet.yml
  • lightgbm.yml
  • rl.yml
  • hydra.yml
  • chainer.yml
  • base.yml
  • sklearn.yml
  • ray.yml
  • multi_objective.yml
  • keras.yml
  • samplers.yml
  • dask_ml.yml
  • tfkeras.yml
  • xgboost.yml
  • tensorflow.yml
  • allennlp.yml: #104
  • hiplot.yml
  • mlflow.yml
  • fastai.yml
  • catboost.yml
  • haiku.yml

Please make changes to each file in different pull requests.

Do we specify the `parser='auto'` argument in `sklearn.datasets.fetch_openml` to remove warning messages?

Motivation

Since sklearn v1.2, which was released in Dec. 2022, sklearn.datasets.fetch_openml has a new argument, parser. Due to its temporary default value, parser='warn', a number of examples show warning messages like

/Users/nzw/.pyenv/versions/3.11.1/lib/python3.11/site-packages/sklearn/datasets/_openml.py:932: FutureWarning: The default value of `parser` will change from `'liac-arff'` to `'auto'` in 1.4. You can set `parser='auto'` to silence this warning. Therefore, an `ImportError` will be raised from 1.4 if the dataset is dense and pandas is not installed. Note that the pandas parser may return different data types. See the Notes Section in fetch_openml's API doc for details.

for every call to fetch_openml.

Description

Considering backwards compatibility, I personally prefer not to work on this issue, but I made this issue as a record.

scikit-learn v1.4 will change the default value to 'auto', so we can ignore the warnings and wait for the v1.4 release. In that case, we can close this issue without sending any PR.

Alternatives (optional)

If we would like to disable these warnings, we need to specify parser='auto' (as sketched after this list) in

  • ./visualization/plot_study.ipynb
  • ./visualization/plot_study.py
  • ./hiplot/plot_study.ipynb
  • ./pytorch/skorch_simple.py
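
For illustration, a minimal sketch of the change (the dataset name below is a placeholder, not necessarily what the listed files load):

    from sklearn.datasets import fetch_openml

    # Passing parser="auto" explicitly silences the FutureWarning on scikit-learn>=1.2.
    X, y = fetch_openml(name="mnist_784", version=1, return_X_y=True, parser="auto")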

I'm not sure whether this example repo cares about backward compatibility; we might need to add scikit-learn>=1.2 to the relevant requirements.txt files, or work on this issue in a few months once scikit-learn>=1.2 is widely installed.

Additional context (optional)

Consider introducing type hints and mypy

Consider adding type hints to the code and running mypy in the CI (optuna itself has type hints). In addition to enabling static checking, type hints act as documentation. The downside is that this will increase the maintenance cost.
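
For illustration, a hedged sketch of what an annotated example might look like (not taken from an existing file in this repository):

    import optuna


    def objective(trial: optuna.trial.Trial) -> float:
        x: float = trial.suggest_float("x", -10.0, 10.0)
        return (x - 2.0) ** 2


    def main() -> None:
        study = optuna.create_study()
        study.optimize(objective, n_trials=20)
        print(study.best_params)


    if __name__ == "__main__":
        main()

mypy could then run over the examples in CI, for example via mypy pytorch/pytorch_simple.py, though the exact configuration is open for discussion.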

NSGAIISampler behavior from its population size

I am trying to gain some insight into the NSGA-II sampler, but I still can't understand the impact of the population size or where the number of generations is specified. I've read the paper on which this multi-objective algorithm is based, following Optuna's documentation.
In particular:

  1. By default population_size=50, but if I then set n_trials=10, is this algorithm effectively a random search, because the objective function must be evaluated at least 50 times to complete one generation?
  2. NSGA-II is supposed to produce a number of generations, but where is this number specified?

Any help is appreciated! Thanks in advance.
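
For context, the number of generations is not specified directly; it follows from how many trials are run relative to population_size. A rough sketch (the parameter values are arbitrary):

    import optuna

    POPULATION_SIZE = 10
    N_GENERATIONS = 5  # roughly n_trials // population_size


    def objective(trial):
        x = trial.suggest_float("x", 0.0, 5.0)
        y = trial.suggest_float("y", 0.0, 3.0)
        return x ** 2 + y, (x - 2) ** 2 + (y - 1) ** 2


    sampler = optuna.samplers.NSGAIISampler(population_size=POPULATION_SIZE)
    study = optuna.create_study(directions=["minimize", "minimize"], sampler=sampler)
    # With fewer trials than POPULATION_SIZE, not even one generation completes,
    # so the sampling is essentially random, as the question suggests.
    study.optimize(objective, n_trials=POPULATION_SIZE * N_GENERATIONS)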

Remove warning message for deprecated arguments from lightgbm examples

Motivation

As reported in optuna/optuna#3013, lightgbm has deprecated a few arguments since v3.3.0. The Optuna examples use the latest lightgbm in CI, so we can follow the alternative convention instead of using deprecated arguments, which simplifies the logging messages.

Description

Replace deprecated arguments such as early_stopping_rounds and verbose_eval with callbacks, as suggested by lightgbm's warning messages below (a sketch of the replacement follows the messages).

/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/lightgbm/engine.py:181: UserWarning: 'early_stopping_rounds' argument is deprecated and will be removed in a future release of LightGBM. Pass 'early_stopping()' callback via 'callbacks' argument instead.
  _log_warning("'early_stopping_rounds' argument is deprecated and will be removed in a future release of LightGBM. "
/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/lightgbm/engine.py:240: UserWarning: 'verbose_eval' argument is deprecated and will be removed in a future release of LightGBM. Pass 'log_evaluation()' callback via 'callbacks' argument instead.
  _log_warning("'verbose_eval' argument is deprecated and will be removed in a future release of LightGBM. "

Alternatives (optional)

Additional context (optional)

RuntimeError: The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Expected behavior

The example below, taken from this GitHub repository, runs correctly:

"""
Optuna example that optimizes multi-layer perceptrons using PyTorch Lightning.
In this example, we optimize the validation accuracy of fashion product recognition using
PyTorch Lightning, and FashionMNIST. We optimize the neural network architecture. As it is too time
consuming to use the whole FashionMNIST dataset, we here use a small subset of it.
You can run this example as follows, pruning can be turned on and off with the `--pruning`
argument.
    $ python pytorch_lightning_simple.py [--pruning]
"""
import argparse
import os
from typing import List
from typing import Optional

import optuna
from optuna.integration import PyTorchLightningPruningCallback
from packaging import version
import pytorch_lightning as pl
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import random_split
from torchvision import datasets
from torchvision import transforms


if version.parse(pl.__version__) < version.parse("1.0.2"):
    raise RuntimeError("PyTorch Lightning>=1.0.2 is required for this example.")

PERCENT_VALID_EXAMPLES = 0.1
BATCHSIZE = 128
CLASSES = 10
EPOCHS = 10
DIR = os.getcwd()


class Net(nn.Module):
    def __init__(self, dropout: float, output_dims: List[int]):
        super().__init__()
        layers: List[nn.Module] = []

        input_dim: int = 28 * 28
        for output_dim in output_dims:
            layers.append(nn.Linear(input_dim, output_dim))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))
            input_dim = output_dim

        layers.append(nn.Linear(input_dim, CLASSES))

        self.layers: nn.Module = nn.Sequential(*layers)

    def forward(self, data: torch.Tensor) -> torch.Tensor:
        logits = self.layers(data)
        return F.log_softmax(logits, dim=1)


class LightningNet(pl.LightningModule):
    def __init__(self, dropout: float, output_dims: List[int]):
        super().__init__()
        self.model = Net(dropout, output_dims)

    def forward(self, data: torch.Tensor) -> torch.Tensor:
        return self.model(data.view(-1, 28 * 28))

    def training_step(self, batch, batch_idx: int) -> torch.Tensor:
        data, target = batch
        output = self(data)
        return F.nll_loss(output, target)

    def validation_step(self, batch, batch_idx: int) -> None:
        data, target = batch
        output = self(data)
        pred = output.argmax(dim=1, keepdim=True)
        accuracy = pred.eq(target.view_as(pred)).float().mean()
        self.log("val_acc", accuracy)
        self.log("hp_metric", accuracy, on_step=False, on_epoch=True)

    def configure_optimizers(self) -> optim.Optimizer:
        return optim.Adam(self.model.parameters())


class FashionMNISTDataModule(pl.LightningDataModule):
    def __init__(self, data_dir: str, batch_size: int):
        super().__init__()
        self.data_dir = data_dir
        self.batch_size = batch_size

    def setup(self, stage: Optional[str] = None) -> None:
        self.mnist_test = datasets.FashionMNIST(
            self.data_dir, train=False, download=True, transform=transforms.ToTensor()
        )
        mnist_full = datasets.FashionMNIST(
            self.data_dir, train=True, download=True, transform=transforms.ToTensor()
        )
        self.mnist_train, self.mnist_val = random_split(mnist_full, [55000, 5000])

    def train_dataloader(self) -> DataLoader:
        return DataLoader(
            self.mnist_train, batch_size=self.batch_size, shuffle=True, pin_memory=True
        )

    def val_dataloader(self) -> DataLoader:
        return DataLoader(
            self.mnist_val, batch_size=self.batch_size, shuffle=False, pin_memory=True
        )

    def test_dataloader(self) -> DataLoader:
        return DataLoader(
            self.mnist_test, batch_size=self.batch_size, shuffle=False, pin_memory=True
        )


def objective(trial: optuna.trial.Trial) -> float:

    # We optimize the number of layers, hidden units in each layer and dropouts.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    dropout = trial.suggest_float("dropout", 0.2, 0.5)
    output_dims = [
        trial.suggest_int("n_units_l{}".format(i), 4, 128, log=True) for i in range(n_layers)
    ]

    model = LightningNet(dropout, output_dims)
    datamodule = FashionMNISTDataModule(data_dir=DIR, batch_size=BATCHSIZE)

    trainer = pl.Trainer(
        logger=True,
        limit_val_batches=PERCENT_VALID_EXAMPLES,
        enable_checkpointing=False,
        max_epochs=EPOCHS,
        gpus=1 if torch.cuda.is_available() else None,
        callbacks=[PyTorchLightningPruningCallback(trial, monitor="val_acc")],
    )
    hyperparameters = dict(n_layers=n_layers, dropout=dropout, output_dims=output_dims)
    trainer.logger.log_hyperparams(hyperparameters)
    trainer.fit(model, datamodule=datamodule)

    return trainer.callback_metrics["val_acc"].item()

pruning = True
pruner: optuna.pruners.BasePruner = (
        optuna.pruners.MedianPruner() if pruning else optuna.pruners.NopPruner()
)

study = optuna.create_study(direction="maximize", pruner=pruner)
study.optimize(objective, n_trials=100, timeout=600)

print("Number of finished trials: {}".format(len(study.trials)))

print("Best trial:")
trial = study.best_trial

print("  Value: {}".format(trial.value))

print("  Params: ")
for key, value in trial.params.items():
    print("    {}: {}".format(key, value))

Environment

requirements.txt

absl-py==1.3.0
aeppl==0.0.33
aesara==2.7.9
aiohttp==3.8.3
aiosignal==1.3.1
alabaster==0.7.12
albumentations==1.2.1
alembic==1.9.1
altair==4.2.0
appdirs==1.4.4
arviz==0.12.1
astor==0.8.1
astropy==4.3.1
astunparse==1.6.3
async-timeout==4.0.2
atari-py==0.2.9
atomicwrites==1.4.1
attrs==22.2.0
audioread==3.0.0
autograd==1.5
autopage==0.5.1
Babel==2.11.0
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==5.0.1
blis==0.7.9
bokeh==2.3.3
branca==0.6.0
bs4==0.0.1
CacheControl==0.12.11
cachetools==5.2.0
catalogue==2.0.8
certifi==2022.12.7
cffi==1.15.1
cftime==1.6.2
chardet==4.0.0
charset-normalizer==2.1.1
click==7.1.2
cliff==4.1.0
clikit==0.6.2
cloudpickle==1.5.0
cmaes==0.9.0
cmake==3.22.6
cmd2==2.4.2
cmdstanpy==1.0.8
colorcet==3.0.1
colorlog==6.7.0
colorlover==0.3.0
community==1.0.0b1
confection==0.0.3
cons==0.4.5
contextlib2==0.5.5
convertdate==2.4.0
crashtest==0.3.1
crcmod==1.7
cufflinks==0.17.3
cvxopt==1.3.0
cvxpy==1.2.2
cycler==0.11.0
cymem==2.0.7
Cython==0.29.32
daft==0.0.4
dask==2022.2.1
datascience==0.17.5
db-dtypes==1.0.5
debugpy==1.0.0
decorator==4.4.2
defusedxml==0.7.1
descartes==1.1.0
dill==0.3.6
distributed==2022.2.1
dlib==19.24.0
dm-tree==0.1.8
dnspython==2.2.1
docutils==0.17.1
dopamine-rl==1.0.5
earthengine-api==0.1.335
easydict==1.10
ecos==2.0.11
editdistance==0.5.3
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.1/en_core_web_sm-3.4.1-py3-none-any.whl
entrypoints==0.4
ephem==4.1.4
et-xmlfile==1.1.0
etils==0.9.0
etuples==0.3.8
fa2==0.3.5
fastai==2.7.10
fastcore==1.5.27
fastdownload==0.0.7
fastdtw==0.3.4
fastjsonschema==2.16.2
fastprogress==1.0.3
fastrlock==0.8.1
feather-format==0.4.1
filelock==3.8.2
firebase-admin==5.3.0
fix-yahoo-finance==0.0.22
Flask==1.1.4
flatbuffers==1.12
folium==0.12.1.post1
frozenlist==1.3.3
fsspec==2022.11.0
future==0.16.0
gast==0.4.0
GDAL==2.2.3
gdown==4.4.0
gensim==3.6.0
geographiclib==1.52
geopy==1.17.0
gin-config==0.5.0
glob2==0.7
google==2.0.3
google-api-core==2.11.0
google-api-python-client==2.70.0
google-auth==2.15.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-bigquery==3.4.1
google-cloud-bigquery-storage==2.17.0
google-cloud-core==2.3.2
google-cloud-datastore==2.11.0
google-cloud-firestore==2.7.3
google-cloud-language==2.6.1
google-cloud-storage==2.7.0
google-cloud-translate==3.8.4
google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.4.0
googleapis-common-protos==1.57.0
googledrivedownloader==0.4
graphviz==0.10.1
greenlet==2.0.1
grpcio==1.51.1
grpcio-status==1.48.2
gspread==3.4.2
gspread-dataframe==3.0.8
gym==0.25.2
gym-notices==0.0.8
h5py==3.1.0
HeapDict==1.0.1
hijri-converter==2.2.4
holidays==0.17.2
holoviews==1.14.9
html5lib==1.0.1
httpimport==0.5.18
httplib2==0.17.4
httpstan==4.6.1
humanize==0.5.1
hyperopt==0.1.2
idna==2.10
imageio==2.9.0
imagesize==1.4.1
imbalanced-learn==0.8.1
imblearn==0.0
imgaug==0.4.0
importlib-metadata==4.13.0
importlib-resources==5.10.1
imutils==0.5.4
inflect==2.1.0
intel-openmp==2023.0.0
intervaltree==2.1.0
ipykernel==5.3.4
ipython==7.9.0
ipython-genutils==0.2.0
ipython-sql==0.3.9
ipywidgets==7.7.1
itsdangerous==1.1.0
jax==0.3.25
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.3.25+cuda11.cudnn805-cp38-cp38-manylinux2014_x86_64.whl
jieba==0.42.1
Jinja2==2.11.3
joblib==1.2.0
jpeg4py==0.1.4
jsonschema==4.3.3
jupyter-client==6.1.12
jupyter-console==6.1.0
jupyter_core==5.1.1
jupyterlab-widgets==3.0.5
kaggle==1.5.12
kapre==0.3.7
keras==2.9.0
Keras-Preprocessing==1.1.2
keras-vis==0.4.1
kiwisolver==1.4.4
korean-lunar-calendar==0.3.1
langcodes==3.3.0
libclang==14.0.6
librosa==0.8.1
lightgbm==2.2.3
lightning-utilities==0.5.0
llvmlite==0.39.1
lmdb==0.99
locket==1.0.0
logical-unification==0.4.5
LunarCalendar==0.0.9
lxml==4.9.2
Mako==1.2.4
Markdown==3.4.1
MarkupSafe==2.0.1
marshmallow==3.19.0
matplotlib==3.2.2
matplotlib-venn==0.11.7
miniKanren==1.0.3
missingno==0.5.1
mistune==0.8.4
mizani==0.7.3
mkl==2019.0
mlxtend==0.14.0
more-itertools==9.0.0
moviepy==0.2.3.5
mpmath==1.2.1
msgpack==1.0.4
multidict==6.0.3
multipledispatch==0.6.0
multitasking==0.0.11
murmurhash==1.0.9
music21==5.5.0
natsort==5.5.0
nbconvert==5.6.1
nbformat==5.7.1
netCDF4==1.6.2
networkx==2.8.8
nibabel==3.0.2
nltk==3.7
notebook==5.7.16
numba==0.56.4
numexpr==2.8.4
numpy==1.21.6
oauth2client==4.1.3
oauthlib==3.2.2
okgrade==0.4.3
opencv-contrib-python==4.6.0.66
opencv-python==4.6.0.66
opencv-python-headless==4.6.0.66
openpyxl==3.0.10
opt-einsum==3.3.0
optuna==3.0.5
osqp==0.6.2.post0
packaging==21.3
palettable==3.3.0
pandas==1.3.5
pandas-datareader==0.9.0
pandas-gbq==0.17.9
pandas-profiling==1.4.1
pandocfilters==1.5.0
panel==0.12.1
param==1.12.3
parso==0.8.3
partd==1.3.0
pastel==0.2.1
pathlib==1.0.1
pathy==0.10.1
patsy==0.5.3
pbr==5.11.0
pep517==0.13.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==7.1.2
pip-tools==6.6.2
platformdirs==2.6.0
plotly==5.5.0
plotnine==0.8.0
pluggy==0.7.1
pooch==1.6.0
portpicker==1.3.9
prefetch-generator==1.0.3
preshed==3.0.8
prettytable==3.5.0
progressbar2==3.38.0
prometheus-client==0.15.0
promise==2.3
prompt-toolkit==2.0.10
prophet==1.1.1
proto-plus==1.22.1
protobuf==3.19.6
psutil==5.4.8
psycopg2==2.9.5
ptyprocess==0.7.0
py==1.11.0
pyarrow==9.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.6
pycparser==2.21
pyct==0.4.8
pydantic==1.10.2
pydata-google-auth==1.4.0
pydot==1.3.0
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
pyemd==0.5.1
pyerfa==2.0.0.1
Pygments==2.6.1
PyGObject==3.26.1
pylev==1.4.0
pymc==4.1.4
PyMeeus==0.5.12
pymongo==4.3.3
pymystem3==0.2.0
PyOpenGL==3.1.6
pyparsing==3.0.9
pyperclip==1.8.2
pyrsistent==0.19.2
pysimdjson==3.2.0
pysndfile==1.3.8
PySocks==1.7.1
pystan==3.3.0
pytest==3.6.4
python-apt==0.0.0
python-dateutil==2.8.2
python-louvain==0.16
python-slugify==7.0.0
python-utils==3.4.5
pytorch-lightning==1.8.6
pytz==2022.7
pyviz-comms==2.2.1
PyWavelets==1.4.1
PyYAML==6.0
pyzmq==23.2.1
qdldl==0.1.5.post2
qudida==0.0.4
regex==2022.6.2
requests==2.25.1
requests-oauthlib==1.3.1
resampy==0.4.2
rpy2==3.5.5
rsa==4.9
scikit-image==0.18.3
scikit-learn==1.0.2
scipy==1.7.3
screen-resolution-extra==0.0.0
scs==3.2.2
seaborn==0.11.2
Send2Trash==1.8.0
setuptools-git==1.2
shapely==2.0.0
six==1.15.0
sklearn-pandas==1.8.0
smart-open==6.3.0
snowballstemmer==2.2.0
sortedcontainers==2.4.0
soundfile==0.11.0
spacy==3.4.4
spacy-legacy==3.0.10
spacy-loggers==1.0.4
Sphinx==1.8.6
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-websupport==1.2.4
SQLAlchemy==1.4.45
sqlparse==0.4.3
srsly==2.4.5
statsmodels==0.12.2
stevedore==4.1.1
sympy==1.7.1
tables==3.7.0
tabulate==0.8.10
tblib==1.7.0
tenacity==8.1.0
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorboardX==2.5.1
tensorflow==2.9.2
tensorflow-datasets==4.6.0
tensorflow-estimator==2.9.0
tensorflow-gcs-config==2.9.1
tensorflow-hub==0.12.0
tensorflow-io-gcs-filesystem==0.29.0
tensorflow-metadata==1.12.0
tensorflow-probability==0.17.0
termcolor==2.1.1
terminado==0.13.3
testpath==0.6.0
text-unidecode==1.3
textblob==0.15.3
thinc==8.1.6
threadpoolctl==3.1.0
tifffile==2022.10.10
toml==0.10.2
tomli==2.0.1
toolz==0.12.0
torch @ https://download.pytorch.org/whl/cu116/torch-1.13.0%2Bcu116-cp38-cp38-linux_x86_64.whl
torchaudio @ https://download.pytorch.org/whl/cu116/torchaudio-0.13.0%2Bcu116-cp38-cp38-linux_x86_64.whl
torchmetrics==0.11.0
torchsummary==1.5.1
torchtext==0.14.0
torchvision @ https://download.pytorch.org/whl/cu116/torchvision-0.14.0%2Bcu116-cp38-cp38-linux_x86_64.whl
tornado==6.0.4
tqdm==4.64.1
traitlets==5.7.1
tweepy==3.10.0
typeguard==2.7.1
typer==0.7.0
typing_extensions==4.4.0
tzlocal==1.5.1
uritemplate==4.1.1
urllib3==1.24.3
vega-datasets==0.9.0
wasabi==0.10.1
wcwidth==0.2.5
webargs==8.2.0
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.6.1
wordcloud==1.8.2.2
wrapt==1.14.1
xarray==2022.12.0
xarray-einstats==0.4.0
xgboost==0.90
xkit==0.0.0
xlrd==1.2.0
xlwt==1.3.0
yarl==1.8.2
yellowbrick==1.5
zict==2.2.0
zipp==3.11.0

Error messages, stack traces, or logs

[W 2023-01-03 18:30:22,015] Trial 0 failed because of the following error: RuntimeError('The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8.')
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/optuna/study/_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-3-b120a10c1659>", line 138, in objective
    trainer.fit(model, datamodule=datamodule)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
    call._call_and_handle_interrupt(
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1024, in _run
    verify_loop_configurations(self)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/configuration_validator.py", line 53, in verify_loop_configurations
    _check_deprecated_callback_hooks(trainer)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/configuration_validator.py", line 221, in _check_deprecated_callback_hooks
    raise RuntimeError(
RuntimeError: The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-3-b120a10c1659>](https://localhost:8080/#) in <module>
    146 
    147 study = optuna.create_study(direction="maximize", pruner=pruner)
--> 148 study.optimize(objective, n_trials=100, timeout=600)
    149 
    150 print("Number of finished trials: {}".format(len(study.trials)))

11 frames
[/usr/local/lib/python3.8/dist-packages/optuna/study/study.py](https://localhost:8080/#) in optimize(self, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
    417         """
    418 
--> 419         _optimize(
    420             study=self,
    421             func=func,

[/usr/local/lib/python3.8/dist-packages/optuna/study/_optimize.py](https://localhost:8080/#) in _optimize(study, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
     64     try:
     65         if n_jobs == 1:
---> 66             _optimize_sequential(
     67                 study,
     68                 func,

[/usr/local/lib/python3.8/dist-packages/optuna/study/_optimize.py](https://localhost:8080/#) in _optimize_sequential(study, func, n_trials, timeout, catch, callbacks, gc_after_trial, reseed_sampler_rng, time_start, progress_bar)
    158 
    159         try:
--> 160             frozen_trial = _run_trial(study, func, catch)
    161         finally:
    162             # The following line mitigates memory problems that can be occurred in some

[/usr/local/lib/python3.8/dist-packages/optuna/study/_optimize.py](https://localhost:8080/#) in _run_trial(study, func, catch)
    232         and not isinstance(func_err, catch)
    233     ):
--> 234         raise func_err
    235     return frozen_trial
    236 

[/usr/local/lib/python3.8/dist-packages/optuna/study/_optimize.py](https://localhost:8080/#) in _run_trial(study, func, catch)
    194     with get_heartbeat_thread(trial._trial_id, study._storage):
    195         try:
--> 196             value_or_values = func(trial)
    197         except exceptions.TrialPruned as e:
    198             # TODO(mamu): Handle multi-objective cases.

[<ipython-input-3-b120a10c1659>](https://localhost:8080/#) in objective(trial)
    136     hyperparameters = dict(n_layers=n_layers, dropout=dropout, output_dims=output_dims)
    137     trainer.logger.log_hyperparams(hyperparameters)
--> 138     trainer.fit(model, datamodule=datamodule)
    139 
    140     return trainer.callback_metrics["val_acc"].item()

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    601             raise TypeError(f"`Trainer.fit()` requires a `LightningModule`, got: {model.__class__.__qualname__}")
    602         self.strategy._lightning_module = model
--> 603         call._call_and_handle_interrupt(
    604             self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    605         )

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/call.py](https://localhost:8080/#) in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
     36             return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
     37         else:
---> 38             return trainer_fn(*args, **kwargs)
     39 
     40     except _TunerExitException:

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    643             model_connected=self.lightning_module is not None,
    644         )
--> 645         self._run(model, ckpt_path=self.ckpt_path)
    646 
    647         assert self.state.stopped

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py](https://localhost:8080/#) in _run(self, model, ckpt_path)
   1022         self._callback_connector._attach_model_logging_functions()
   1023 
-> 1024         verify_loop_configurations(self)
   1025 
   1026         # hook

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/configuration_validator.py](https://localhost:8080/#) in verify_loop_configurations(trainer)
     51     __verify_batch_transfer_support(trainer)
     52     # TODO: Delete this check in v2.0
---> 53     _check_deprecated_callback_hooks(trainer)
     54     # TODO: Delete this check in v2.0
     55     _check_on_epoch_start_end(model)

[/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/configuration_validator.py](https://localhost:8080/#) in _check_deprecated_callback_hooks(trainer)
    219     for callback in trainer.callbacks:
    220         if callable(getattr(callback, "on_init_start", None)):
--> 221             raise RuntimeError(
    222                 "The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8."
    223             )

RuntimeError: The `on_init_start` callback hook was deprecated in v1.6 and is no longer supported as of v1.8.

Steps to reproduce

  1. Run the example code above.

Reproducible examples (optional)

See above.

Additional context (optional)

Update tensorflow examples to support `tensorflow>=2.11.0`

Motivation

tensorflow==2.11.0 seems to have breaking changes, and the current tensorflow examples do not work due to the following errors:

% python tensorflow_eager_simple.py
...
ValueError: decay is deprecated in the new Keras optimizer, please check the docstring for valid arguments, or use the legacy optimizer, e.g., tf.keras.optimizers.legacy.RMSprop.

% python tensorflow_estimator_simple.py
...
ValueError: Please set your optimizer as an instance of `tf.keras.optimizers.legacy.Optimizer`, e.g., `tf.keras.optimizers.legacy.SGD`.Received optimizer type: <class 'keras.optimizers.optimizer_experimental.sgd.SGD'>. 

#146 pinned the tensorflow version as a workaround.

Description

Please update the tensorflow examples to support tensorflow>=2.11.0.
As shown in the following links, we have several choices of optimizers. I'm not familiar with tensorflow, so could you explain in your PR why you selected them, please?
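
For illustration, one possible direction (a sketch only; maintainers may prefer another approach) is to switch to the legacy optimizer namespace that the error messages point to:

    import tensorflow as tf

    # For tensorflow_eager_simple.py: the new Keras optimizer rejects the old `decay`
    # argument, while the legacy class still accepts it.
    optimizer = tf.keras.optimizers.legacy.RMSprop(learning_rate=1e-3, decay=1e-6)

    # For tensorflow_estimator_simple.py: Estimators require a legacy optimizer instance.
    estimator_optimizer = tf.keras.optimizers.legacy.SGD(learning_rate=0.01)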

Links:
