
muellerdo commented on July 26, 2024

Hello @adhusch,

thanks for the kind words, always happy to hear that AUCMEDI is useful for other researchers! :)

I understand the confusion. Sadly, this cannot be changed (to my knowledge) with the TensorFlow backend.

The reason is that AUCMEDI AutoML runs two training processes for transfer learning:

  1. ‘shallow-tuning’ phase
  2. ‘fine-tuning’ phase

If I may cite out of my dissertation:

For the shallow-tuning phase, the neural network model starts an initial training process based on weights from a pre-fitted model. For this initial training process, all layers except for the classifier are frozen, a high learning rate is selected (for example 1e-04), and the model is fitted for a small number of epochs (commonly 5-15 epochs) [11, 207, 209]. The concept of shallow-tuning is that the model classifier can adapt the fixed architecture weights to the task. After this initial adaptation phase, the architecture weights are unfrozen and the second training process is started with regular hyperparameters but a smaller learning rate than for the shallow-tuning phase (for example 1e-05) [11, 207, 209]. In this phase, the complete neural network model fine-tunes all weights for the task to obtain optimal performance.

To achieve this in TensorFlow, it is sadly necessary to run two separate training processes, which results in the 0-10 epoch and 10-500 epoch displays.
Don't worry, the training will probably not last 500 epochs: it stops automatically once no further improvement in validation loss is observed (in order to avoid overfitting).
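To illustrate the two phases, here is a minimal Keras sketch of the schedule described above. It is not AUCMEDI's actual code: the tiny backbone stands in for a real pre-trained architecture (e.g. ResNet50 with ImageNet weights), and the data is random dummy input.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Toy backbone standing in for a pre-trained architecture
backbone = models.Sequential([
    layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.GlobalAveragePooling2D(),
], name="backbone")
model = models.Sequential([backbone, layers.Dense(3, activation="softmax")])

# Dummy data for illustration only
x = np.random.rand(16, 32, 32, 3).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 16), 3)

# Phase 1: shallow-tuning -- backbone frozen, higher learning rate,
# only the classifier head is trained
backbone.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy")
model.fit(x, y, epochs=2, verbose=0)

# Phase 2: fine-tuning -- everything unfrozen, lower learning rate;
# early stopping on validation loss replaces a fixed epoch count
backbone.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy")
stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                        restore_best_weights=True)
model.fit(x, y, validation_split=0.25, epochs=20,
          callbacks=[stop], verbose=0)
```

Because `trainable` changes only take effect after `compile()`, the two phases necessarily appear as two separate `fit()` runs, which is why the epoch counter restarts in the log output.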

For simple & quick training runs, you can, however, decrease the number of epochs as an argument.
For example:
aucmedi training --epochs 25

Hope that I was able to provide some insights.

Best Regards,
Dominik


adhusch commented on July 26, 2024

Hi @muellerdo ,

thank you very much for the detailed explanations, that's very much appreciated. :)

I know the concept of freezing / head-only training for transfer learning, but I had not understood from the docs that the AutoML CLI applies it. It's a pity that this is not otherwise possible in TF; fast.ai handles it very nicely with freeze/unfreeze. Anyway.

It might be reasonable to have two epoch parameters then: one for the classifier head tuning and one for the full training of the whole network, including the initial layers. That way, the user could even choose to skip the head-only part by setting "epochs_classifier_head=0" (or, the other way around, skip the training of the full network in the sense of "epochs_full_net=0"; for example, on my real-life toy problem the "semi-random" encoding from the frozen pre-trained network is already good enough to reach AUC ≥ 0.99 by training only the classifier head for a few epochs).
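A rough sketch of what this suggested interface could look like, where an epoch budget of 0 skips the corresponding phase. The parameter names (`epochs_classifier_head`, `epochs_full_net`) are the ones proposed above, not part of AUCMEDI's actual CLI:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def two_phase_fit(model, backbone, x, y,
                  epochs_classifier_head=10, epochs_full_net=500):
    """Two-phase transfer learning with independent epoch budgets.

    Setting either budget to 0 skips that phase entirely.
    """
    if epochs_classifier_head > 0:
        # shallow-tuning: frozen backbone, higher learning rate
        backbone.trainable = False
        model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                      loss="categorical_crossentropy")
        model.fit(x, y, epochs=epochs_classifier_head, verbose=0)
    if epochs_full_net > 0:
        # fine-tuning: everything trainable, lower learning rate
        backbone.trainable = True
        model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                      loss="categorical_crossentropy")
        model.fit(x, y, epochs=epochs_full_net, verbose=0)
    return model
```

With `epochs_full_net=0`, only the classifier head is ever trained, which matches the use case where the frozen pre-trained features are already good enough.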

Issue could be closed then. Thanks again!

P.S.: I might have a few more small questions, what's your preferred workflow for that? Opening issues or just dropping a mail for example?

