
Comments (5)

toshihikoyanase commented on May 12, 2024

I have trained deep-learning tasks that take 6 hours to finish, so I want to run trials more efficiently.

I have good news for you. We plan to add a new pruner which significantly accelerates deep-learning tasks in a parallel computing environment.
Please try it when it is merged into master. The corresponding PR is #236.

Thank you.


toshihikoyanase commented on May 12, 2024

Thank you for your interest in Optuna. I'll try to answer your questions, though I'm not sure I've fully understood them. If you have further questions, please feel free to contact us.

  1. The difference between trial.report intervals (e.g., every 100 iterations vs. every 1000 iterations).

trial.report(value, step) saves an intermediate value to storage, and trial.should_prune(step) reads the value at step and checks the pruning condition. So, we need to call trial.report(value, step) before we invoke trial.should_prune(step). This is the only constraint relating the pruning interval to the trial.report interval.

As long as you satisfy this constraint, you can use different intervals for trial.report and trial.should_prune. For example, we can report values every 5 iterations and check the pruning condition every 10 iterations as follows:

import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection


def objective(trial):
    iris = sklearn.datasets.load_iris()
    classes = list(set(iris.target))
    train_x, test_x, train_y, test_y = \
        sklearn.model_selection.train_test_split(iris.data, iris.target, test_size=0.25)

    alpha = trial.suggest_loguniform('alpha', 1e-5, 1e-1)
    clf = sklearn.linear_model.SGDClassifier(alpha=alpha)

    for step in range(100):
        clf.partial_fit(train_x, train_y, classes=classes)

        # Report an intermediate objective value every 5 iterations.
        if step % 5 == 0:
            intermediate_value = 1.0 - clf.score(test_x, test_y)
            trial.report(intermediate_value, step)

        # Check the pruning condition every 10 iterations.
        if step % 10 == 0 and trial.should_prune(step):
            raise optuna.structs.TrialPruned()

    return 1.0 - clf.score(test_x, test_y)

The above example is based on examples/pruning/simple.py.

If you use the integration modules for XGBoost and LightGBM, they invoke trial.report and trial.should_prune at every iteration. Currently, Optuna does not provide an option to change the pruning interval for these modules, so that is possible future work.
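For reference, here is a minimal sketch of how the LightGBM integration is typically wired up. This snippet is not from the thread; the dataset, parameters, and tuned hyperparameter are illustrative assumptions:

import lightgbm as lgb
import optuna
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection


def objective(trial):
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, valid_x, train_y, valid_y = \
        sklearn.model_selection.train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    dvalid = lgb.Dataset(valid_x, label=valid_y, reference=dtrain)

    params = {
        'objective': 'binary',
        'metric': 'binary_error',
        'verbosity': -1,
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
    }

    # The callback reports 'binary_error' on the validation set and checks
    # the pruning condition after every boosting iteration.
    pruning_callback = optuna.integration.LightGBMPruningCallback(trial, 'binary_error')
    gbm = lgb.train(params, dtrain, valid_sets=[dvalid], callbacks=[pruning_callback])

    preds = gbm.predict(valid_x)
    return 1.0 - sklearn.metrics.accuracy_score(valid_y, preds.round())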

If you use the Chainer integration, you can change the interval of trial.report and trial.should_prune using the pruner_trigger argument, as sketched below. Please refer to the reference for further details.
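As an illustration, here is a minimal sketch with pruner_trigger set to every 10 epochs. The toy model and random data are assumptions added to keep it self-contained, not from the thread:

import chainer
import chainer.links as L
import numpy as np
import optuna


def objective(trial):
    # Toy model and data, just to make the sketch runnable.
    model = L.Classifier(L.Linear(None, 10))
    optimizer = chainer.optimizers.Adam()
    optimizer.setup(model)

    rng = np.random.RandomState(0)
    x = rng.rand(100, 4).astype(np.float32)
    y = rng.randint(0, 10, size=100).astype(np.int32)
    train_iter = chainer.iterators.SerialIterator(
        chainer.datasets.TupleDataset(x, y), batch_size=10)
    valid_iter = chainer.iterators.SerialIterator(
        chainer.datasets.TupleDataset(x, y), batch_size=10, repeat=False, shuffle=False)

    updater = chainer.training.StandardUpdater(train_iter, optimizer)
    trainer = chainer.training.Trainer(updater, (100, 'epoch'))
    trainer.extend(chainer.training.extensions.Evaluator(valid_iter, model))

    # pruner_trigger controls how often trial.report and trial.should_prune
    # are invoked -- here, once every 10 epochs instead of every epoch.
    trainer.extend(optuna.integration.ChainerPruningExtension(
        trial, 'validation/main/loss', pruner_trigger=(10, 'epoch')))

    trainer.run()

    # Evaluate once more to obtain the final objective value; a direct call
    # to Evaluator returns keys without the 'validation/' prefix.
    evaluator = chainer.training.extensions.Evaluator(valid_iter, model)
    return float(evaluator()['main/loss'])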

  2. If multiple training runs are executing simultaneously.

I think your question is about parallel execution of trials. Optuna's pruner is invoked within each trial individually, and the other trials running in other processes do not affect the timing of pruning condition checks.
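For context, a common way to run trials in separate processes is to share a study through an RDB storage; a minimal sketch, where the storage URL and study name are illustrative and `objective` is assumed to be defined as in the earlier example:

import optuna

# Each worker process runs this same script; the processes coordinate
# through the shared storage. Pruning decisions inside one trial do not
# depend on the other trials that are still running.
study = optuna.create_study(
    study_name='example-study',      # illustrative name
    storage='sqlite:///example.db',  # illustrative storage URL
    load_if_exists=True,
    pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=25)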

  3. Over-fitting.

Let me confirm my understanding of your question. I think it is about the relationship between Optuna's pruning mechanism and over-fitting of the models trained in your objective functions.

If so, the pruning mechanism does not detect over-fitting, because the pruners do not compare the current value with previous values within the same trial. So, I think you need to use an early-stopping mechanism provided by an ML library, such as chainer.training.triggers.EarlyStoppingTrigger, for that purpose.
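As a hedged sketch of that idea, assuming `updater` is built as in the previous Chainer sketch, you would swap the fixed stop trigger for an early-stopping one:

import chainer

# Stop training when the validation loss stops improving, independently of
# Optuna's pruning. Note: `patients` is Chainer's actual spelling of
# "patience" in this API.
stop_trigger = chainer.training.triggers.EarlyStoppingTrigger(
    monitor='validation/main/loss',
    patients=3,
    max_trigger=(100, 'epoch'))
trainer = chainer.training.Trainer(updater, stop_trigger)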


Hiroshiba commented on May 12, 2024

Thank you for the details.
I had misunderstood and thought that Optuna might prune using time-series information within a trial.
So my 1st and 3rd questions are answered.

I'm sorry I didn't make the 2nd question clear enough.
I wanted to know whether there is a difference between sequential execution and simultaneous execution.

I executed trials in parallel at the same time.
Some trials' objective values were obviously worse than those of other trials, but all of them ran to the end without being pruned.

Is there a way to run the trials more efficiently?


toshihikoyanase commented on May 12, 2024

I wanted to know whether there is a difference between sequential execution and simultaneous execution.

The difference may come from the implementation of MedianPruner. Currently, MedianPruner only considers completed trials and ignores running trials when it calculates the pruning threshold (more specifically, the median of past trials' intermediate values). So, some trials may never become targets of pruning when we use parallel optimization.

Let me explain it using a simple example:
(a) Execute 100 trials sequentially.
(b) Execute 100 trials in parallel, all at the same time.
We assume that both (a) and (b) use MedianPruner(n_startup_trials=10) and that each trial takes the same computation time.

In case (a), the pruner becomes active from the 11th trial onward, so 90 trials are candidates for pruning.
In case (b), the pruner is never activated because the study has no completed trials while the trials are being evaluated. So, pruning does not happen at all, and all 100 trials run to completion.
This is a toy example, but similar phenomena can be seen in real applications.
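In code, the two setups differ only in the degree of parallelism. A minimal sketch, assuming an `objective` like the one earlier in this thread:

import optuna

study = optuna.create_study(
    pruner=optuna.pruners.MedianPruner(n_startup_trials=10))

# (a) Sequential: trials finish one by one, so from the 11th trial onward
#     the pruner has at least 10 completed trials to compute a median from.
study.optimize(objective, n_trials=100)

# (b) Fully parallel: with 100 worker threads, every trial starts before
#     any trial completes, so the median over completed trials is never
#     available and no trial is pruned.
study.optimize(objective, n_trials=100, n_jobs=100)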

If this does not match your case, something may be wrong with the pruning mechanism. If you give us further information, such as sample code to reproduce the phenomenon and error logs, it would be a great help to us.


Hiroshiba commented on May 12, 2024

Thanks a lot.
I now understand the pruning timing.

I have trained deep-learning tasks that take 6 hours to finish, so I want to run trials more efficiently.

I've learned everything I was interested in. Thank you.

