
Comments (5)

toshihikoyanase commented on May 12, 2024

I have trained deep-learning tasks that take 6 hours to finish, so I want to run trials more efficiently.

I have good news for you. We plan to add a new pruner which significantly accelerates deep-learning tasks in a parallel computing environment.
Please try it when it is merged into master. The corresponding PR is #236.

Thank you.


toshihikoyanase commented on May 12, 2024

Thank you for your interest in Optuna. I'll try to answer your questions, though I'm not sure I've fully understood them. If you have further questions, please feel free to contact us.

  1. The difference between trial.report intervals (e.g., every 100 iterations vs. every 1000 iterations).

trial.report(value, step) saves an intermediate value to storage, and trial.should_prune(step) reads the value at step and checks the pruning condition. So, we need to call trial.report(value, step) before we invoke trial.should_prune(step). This is the only constraint relating the pruning interval to the trial.report interval.

As long as you satisfy this constraint, you can use different intervals for trial.report and trial.should_prune. For example, we can report values every 5 iterations and check the pruning condition every 10 iterations as follows:

import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection


def objective(trial):
    iris = sklearn.datasets.load_iris()
    classes = list(set(iris.target))
    train_x, test_x, train_y, test_y = \
        sklearn.model_selection.train_test_split(iris.data, iris.target, test_size=0.25)

    alpha = trial.suggest_loguniform('alpha', 1e-5, 1e-1)
    clf = sklearn.linear_model.SGDClassifier(alpha=alpha)

    for step in range(100):
        clf.partial_fit(train_x, train_y, classes=classes)

        # Report an intermediate objective value every 5 iterations.
        if step % 5 == 0:
            intermediate_value = 1.0 - clf.score(test_x, test_y)
            trial.report(intermediate_value, step)

        # Check the pruning condition every 10 iterations.
        if step % 10 == 0 and trial.should_prune(step):
            raise optuna.structs.TrialPruned()

    return 1.0 - clf.score(test_x, test_y)

The above example is based on examples/pruning/simple.py.

If you use the integration modules for XGBoost and LightGBM, they invoke trial.report and trial.should_prune at every iteration. Currently, Optuna does not provide an option to change the pruning interval for these modules, so that is possible future work.
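For reference, here is a minimal sketch of how the LightGBM integration is typically wired up. This snippet is not from the thread; the dataset, parameters, and tuned hyperparameter are illustrative assumptions:

import lightgbm as lgb
import optuna
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection


def objective(trial):
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, valid_x, train_y, valid_y = \
        sklearn.model_selection.train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    dvalid = lgb.Dataset(valid_x, label=valid_y, reference=dtrain)

    params = {
        'objective': 'binary',
        'metric': 'binary_error',
        'verbosity': -1,
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
    }

    # The callback reports 'binary_error' on the validation set and checks
    # the pruning condition after every boosting iteration.
    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, 'binary_error')
    gbm = lgb.train(params, dtrain, valid_sets=[dvalid], callbacks=[pruning_callback])

    preds = gbm.predict(valid_x)
    return 1.0 - sklearn.metrics.accuracy_score(valid_y, preds.round())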

If you use the Chainer integration, you can change the interval of trial.report and trial.should_prune using the pruner_trigger argument, as sketched below. Please refer to the reference for further details.
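As an illustration, here is a minimal sketch with pruner_trigger set to every 10 epochs. The toy model and random data are assumptions added to keep it self-contained, not from the thread:

import chainer
import chainer.links as L
import numpy as np
import optuna


def objective(trial):
    # Toy model and data, just to make the sketch runnable.
    model = L.Classifier(L.Linear(None, 10))
    optimizer = chainer.optimizers.Adam()
    optimizer.setup(model)

    rng = np.random.RandomState(0)
    x = rng.rand(100, 4).astype(np.float32)
    y = rng.randint(0, 10, size=100).astype(np.int32)
    train_iter = chainer.iterators.SerialIterator(
        chainer.datasets.TupleDataset(x, y), batch_size=10)
    valid_iter = chainer.iterators.SerialIterator(
        chainer.datasets.TupleDataset(x, y), batch_size=10, repeat=False, shuffle=False)

    updater = chainer.training.StandardUpdater(train_iter, optimizer)
    trainer = chainer.training.Trainer(updater, (100, 'epoch'))
    trainer.extend(chainer.training.extensions.Evaluator(valid_iter, model))

    # pruner_trigger controls how often trial.report and trial.should_prune
    # are invoked -- here, once every 10 epochs instead of every epoch.
    trainer.extend(optuna.integration.ChainerPruningExtension(
        trial, 'validation/main/loss', pruner_trigger=(10, 'epoch')))

    trainer.run()

    # Evaluate once more to obtain the final objective value; a direct call
    # to Evaluator returns keys without the 'validation/' prefix.
    evaluator = chainer.training.extensions.Evaluator(valid_iter, model)
    return float(evaluator()['main/loss'])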

  2. If multiple training runs are executing simultaneously.

I think your question is about parallel execution of trials. Optuna's pruner is invoked within each trial individually, and the other trials running in other processes do not affect the timing of pruning condition checks.
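For context, a common way to run trials in separate processes is to share a study through an RDB storage; a minimal sketch, where the storage URL and study name are illustrative and `objective` is assumed to be defined as in the earlier example:

import optuna

# Each worker process runs this same script; the processes coordinate
# through the shared storage. Pruning decisions inside one trial do not
# depend on the other trials that are still running.
study = optuna.create_study(
    study_name='example-study',      # illustrative name
    storage='sqlite:///example.db',  # illustrative storage URL
    load_if_exists=True,
    pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=25)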

  3. Over-fitting.

Let me confirm my understanding of your question. I think it is about the relationship between Optuna's pruning mechanism and over-fitting of the models trained in your objective functions.

If so, the pruning mechanism does not detect over-fitting, because the pruners do not compare the current value with previous values within the same trial. So, I think you need to use an early-stopping mechanism provided by an ML library, such as chainer.training.triggers.EarlyStoppingTrigger, for that purpose.
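As a hedged sketch of that idea, assuming `updater` is built as in the previous Chainer sketch, you would swap the fixed stop trigger for an early-stopping one:

import chainer

# Stop training when the validation loss stops improving, independently of
# Optuna's pruning. Note: `patients` is Chainer's actual spelling of
# "patience" in this API.
stop_trigger = chainer.training.triggers.EarlyStoppingTrigger(
    monitor='validation/main/loss',
    patients=3,
    max_trigger=(100, 'epoch'))
trainer = chainer.training.Trainer(updater, stop_trigger)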


Hiroshiba commented on May 12, 2024

Thank you for the details.
I had misunderstood and thought that Optuna might prune using time-series information within a trial.
So my 1st and 3rd questions are answered.

I'm sorry I didn't make the 2nd question clear enough.
I wanted to know whether there is a difference between sequential execution and simultaneous execution.

I executed trials in parallel at the same time.
Some trials' objective values were obviously worse than those of other trials, but all of them ran to the end without being pruned.

Is there a way to run the trials more efficiently?


toshihikoyanase commented on May 12, 2024

I wanted to know whether there is a difference between sequential execution and simultaneous execution.

The difference may come from the implementation of MedianPruner. Currently, MedianPruner only considers completed trials and ignores running trials when it calculates the pruning threshold (more specifically, the median of past trials' intermediate values). So, some trials may never become targets of pruning when we use parallel optimization.

Let me explain it using a simple example:
(a) Execute 100 trials sequentially.
(b) Execute 100 trials in parallel, all at the same time.
We assume that both (a) and (b) use MedianPruner(n_startup_trials=10) and that each trial takes the same computation time.

In case (a), the pruner becomes active from the 11th trial onward, so 90 trials are candidates for pruning.
In case (b), the pruner is never activated because the study has no completed trials while the trials are being evaluated. So, pruning does not happen at all, and all 100 trials run to completion.
This is a toy example, but similar phenomena can be seen in real applications.
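In code, the two setups differ only in the degree of parallelism. A minimal sketch, assuming an `objective` like the one earlier in this thread:

import optuna

study = optuna.create_study(
    pruner=optuna.pruners.MedianPruner(n_startup_trials=10))

# (a) Sequential: trials finish one by one, so from the 11th trial onward
#     the pruner has at least 10 completed trials to compute a median from.
study.optimize(objective, n_trials=100)

# (b) Fully parallel: with 100 worker threads, every trial starts before
#     any trial completes, so the median over completed trials is never
#     available and no trial is pruned.
study.optimize(objective, n_trials=100, n_jobs=100)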

If this does not match your case, something may be wrong with the pruning mechanism. If you give us further information, such as sample code to reproduce the phenomenon and error logs, it would be a great help to us.


Hiroshiba commented on May 12, 2024

Thanks a lot.
I now understand the pruning timing.

I have trained deep-learning tasks that take 6 hours to finish, so I want to run trials more efficiently.

I've learned everything I was interested in. Thank you.

