Comments (13)

Cheng-Hsiung commented on July 22, 2024

Hi, did you check whether your total loss (model_loss + regularization_loss) decreases over time?
Also, if you train CIFAR-10 from scratch, the initial softmax classification loss should be about ln(10) ≈ 2.3, I suppose, while your regularization loss starts around 0.017, which is relatively low.
Hope this helps.
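
A quick way to run this check is to log the two loss terms separately, so you can see which one stalls. A minimal TF1-style sketch; the constant losses are dummy stand-ins for the tensors in your own graph:

    import math
    import tensorflow as tf  # TF 1.x, as used by morph-net

    # Expected initial softmax loss for 10 balanced classes:
    print(math.log(10))  # ~2.302

    # Stand-ins; in a real graph these come from your model and regularizer.
    model_loss = tf.constant(2.3)
    regularizer_loss = tf.constant(0.017)

    # Logging each term separately shows which one is (not) decreasing.
    tf.summary.scalar('model_loss', model_loss)
    tf.summary.scalar('regularizer_loss', regularizer_loss)
    tf.summary.scalar('total_loss', model_loss + regularizer_loss)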

AlanHuang1998 commented on July 22, 2024

Everyone, thanks for the help :).
I changed my loss and finally got it working.
The problem was that I hadn't added the regularizer loss to the original loss, and I had put the regularizer in the wrong place in the code.
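
For readers hitting the same problem, the intended wiring looks roughly like this. A sketch modeled on the morph-net README; the toy network and hyperparameters are illustrative stand-ins for the real model:

    import tensorflow as tf  # TF 1.x
    from morph_net.network_regularizers import flop_regularizer

    images = tf.placeholder(tf.float32, [None, 32, 32, 3])
    labels = tf.placeholder(tf.int64, [None])

    # Toy conv net: conv -> batch norm (this is where the gammas live) -> relu.
    net = tf.layers.conv2d(images, 16, 3, use_bias=False)
    net = tf.nn.relu(tf.layers.batch_normalization(net, training=True))
    logits = tf.layers.dense(tf.layers.flatten(net), 10)

    model_loss = tf.losses.sparse_softmax_cross_entropy(labels, logits)

    network_regularizer = flop_regularizer.GammaFlopsRegularizer(
        output_boundary=[logits.op],
        input_boundary=[images.op, labels.op],  # .op, as noted further down
        gamma_threshold=1e-2)

    regularization_strength = 1e-8
    regularizer_loss = (regularization_strength *
                        network_regularizer.get_regularization_term())

    # The fix described above: the regularizer term is *added* to the task
    # loss, and the optimizer minimizes the sum.
    total_loss = model_loss + regularizer_loss
    train_op = tf.train.MomentumOptimizer(0.01, 0.9).minimize(total_loss)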

ayp-google commented on July 22, 2024

One thing that stands out to me is that gamma_threshold is set to 0.88, which is rather large. The value quoted in the paper is 1e-2. As a sanity check, you can analyze the checkpoint to generate a histogram of the gamma values as in Figure 2. That can help you choose the right value for gamma_threshold.

AlanHuang1998 commented on July 22, 2024

Thanks for replying @ayp-google.
As I understand it, gamma_threshold is an argument to the network regularizer. How can I check my gamma values with a histogram?

ayp-google commented on July 22, 2024

You can use something like inspect_checkpoint.py to examine the tensor values. Specifically, you can get all the gamma values and plot a histogram as in Figure 2 of the paper. There is usually a bimodal distribution, and you can choose gamma_threshold to be in the gap.
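
For example, one way to do this is to read the checkpoint directly and collect every variable named like a batch-norm gamma. The checkpoint path and the 'gamma' naming convention below are assumptions about your setup:

    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf  # TF 1.x

    ckpt_path = '/tmp/train_dir/model.ckpt-10000'  # hypothetical path
    reader = tf.train.NewCheckpointReader(ckpt_path)

    # Collect every variable whose name marks it as a batch-norm gamma.
    gammas = [reader.get_tensor(name).ravel()
              for name in reader.get_variable_to_shape_map()
              if 'gamma' in name]

    plt.hist(np.abs(np.concatenate(gammas)), bins=50)
    plt.xlabel('|gamma|')
    plt.ylabel('count')
    # Look for a bimodal shape and put gamma_threshold in the gap.
    plt.savefig('gamma_hist.png')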

eladeban commented on July 22, 2024

Note:
It's probably not the problem, but
input_boundary=[images, labels]
should be:
input_boundary=[images.op, labels.op]

AlanHuang1998 commented on July 22, 2024

Thank you @eladeban.
I tried this, but it doesn't seem to make much difference in the visualization.
I'll edit it.

AlanHuang1998 commented on July 22, 2024

Thanks for your help @ayp-google.
I tried to plot a histogram of the gamma values like Figure 2 in the paper, and I tried to use inspect_checkpoint.py.
Then I got a list like this.
[screenshot: list of tensor names printed by inspect_checkpoint.py]

I used this to trace the graph on TensorBoard, and then I found this.
[screenshot: TensorBoard graph view]

I want to ask how I can find the gamma values to plot a histogram.

ayp-google commented on July 22, 2024

You can use any histogram tool to plot it. Basically, you can visualize how the gamma values are distributed and adjust regularization_strength and gamma_threshold accordingly.

For example, in your tensor all the gamma values are >1. If so, you may need a higher regularization_strength. Large gamma values indicate that the output of the convolution is strong and those channels should not be removed. A larger regularization strength increases the cost of large gammas, so the optimizer will decrease them to reduce the overall cost. Once a gamma value is small, the output signal is weak and does not contribute much to the prediction, so the channel can be removed.
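
Another way to watch the distribution evolve during training, instead of dumping checkpoints, is to add histogram summaries for the gammas. A sketch, assuming the batch-norm scale variables carry 'gamma' in their names; the dummy variable only keeps the snippet self-contained:

    import tensorflow as tf  # TF 1.x

    # Dummy gamma so the sketch stands alone; in a real graph these are
    # created by the batch-norm layers.
    _ = tf.get_variable('BatchNorm/gamma', shape=[16],
                        initializer=tf.ones_initializer())

    for var in tf.trainable_variables():
        if 'gamma' in var.name:
            # The Histograms tab in TensorBoard then shows the gammas
            # shifting as regularization_strength pushes them down.
            tf.summary.histogram(var.op.name, tf.abs(var))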

AlanHuang1998 commented on July 22, 2024

@ayp-google, the original gamma values are in Figure 1, and the gammas with MorphNet are in Figure 2.
Figure 1
[screenshot: gamma values without MorphNet]
Figure 2
[screenshot: gamma values with MorphNet]

I tried to plot the histogram with Figure 2's values, but it didn't work, so I plotted Figure 1's.
Figure 3
[screenshot: histogram of the gamma values]

I fine-tuned gamma_threshold and got this: the green curve's threshold is 0.01, the gray one's is 0.86, and the red one's is 0.95, with my regularizer strength fixed at 1e-10.
Figure 4
[screenshot: FLOPs cost curves for the three thresholds]

FLOPs still rise.
The way I read it, as Figure 3 shows, the gammas take on larger values, so my threshold stopped working after 3k steps, and that is why the cost rises, right?

Thanks for helping!!
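
That reading matches how the cost is counted: a channel only stops contributing FLOPs once its |gamma| drops below gamma_threshold, so if the gammas drift above every threshold tried, nothing is ever counted as dead. A quick check, with random values standing in for gammas pulled from a checkpoint:

    import numpy as np

    # Stand-in for the gammas read from a checkpoint.
    values = np.abs(np.random.lognormal(0.5, 0.3, size=1000))

    for threshold in (0.01, 0.86, 0.95):
        alive = (values > threshold).mean()
        # If the alive fraction stays near 100% at every threshold, the
        # FLOPs cost cannot decrease, matching the rising curves above.
        print('threshold=%.2f  alive=%.1f%%' % (threshold, 100 * alive))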

eladeban commented on July 22, 2024

Thanks for sharing this case.
It is very unusual; I don't think I have seen gammas go above 1.0 in any other model.
I am not sure exactly what model you are using and how you are configuring the batch norm parameters, but I suspect it could be related to that.

Things that come to mind:

  • What is the distribution of the gammas when training without the FLOPs regularizer?
  • Could you try applying a small weight decay on the gammas (as is done in ResNet models, say 1e-4 or so)? This could prevent the swell in gammas; see the sketch after this list.
  • What are the loss and the accuracy of the models you have tried? Are they roughly OK, or are they junk?
  • The fact that the regularization term is going up is not that surprising, since the gammas are >1 and the upper bound becomes looser and looser. The experiments with different thresholds actually suggest to me that you should explore the main knob, regularization_strength. As ayp@ suggests, try values larger than 1e-8 rather than smaller ones: say 5e-8, 1e-7, or 5e-7, with any threshold you see fit.
  • You are not saying anything about your model. How non-standard is it? Are you using a custom LR schedule (i.e., is it increasing after 3K steps)? Are you using custom initializers?
  • I don't recall having problems with Momentum in the past, but could you try RMSProp with a constant learning rate of 0.1, just to rule this out?
  • When you say you are using PocketFlow, what do you mean by that? Do you have other optimizations going on? If so, could you try turning them off at first?
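
A minimal sketch of the weight-decay idea from the second bullet, assuming as before that the batch-norm scales have 'gamma' in their variable names; the dummy losses and variable only keep it self-contained:

    import tensorflow as tf  # TF 1.x

    # Dummy stand-ins for the real graph.
    model_loss = tf.constant(2.3)
    regularizer_loss = tf.constant(0.017)
    _ = tf.get_variable('BatchNorm/gamma', shape=[16],
                        initializer=tf.ones_initializer())

    gamma_vars = [v for v in tf.trainable_variables() if 'gamma' in v.name]

    # Small L2 penalty (1e-4, as in ResNet training) that discourages the
    # gammas from swelling above 1.0.
    gamma_decay = 1e-4 * tf.add_n([tf.nn.l2_loss(v) for v in gamma_vars])

    total_loss = model_loss + regularizer_loss + gamma_decay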

Hope it helps,

Elad

AlanHuang1998 commented on July 22, 2024

@Cheng-Hsiung thanks for sharing :).

I tried larger strengths with gamma_threshold fixed at 0.88: the gray curve is 1e-10, the orange one is 1e-2, and the red one is 1e+2.
[screenshot: FLOPs cost curves for the three strengths]

Here are the gamma values; they seem to have decreased in some of the batch normalization layers.
[screenshot: gamma values after increasing the strength]

I may have missed something; I'll follow Elad's points and try to improve.

AlanHuang1998 commented on July 22, 2024

@eladeban, thank you :)
Fig 1 is the gamma value histogram without the FLOPs regularizer. I'll try adding weight decay to the gammas, or try using MorphNet without PocketFlow.

Fig 1
[screenshot: gamma histogram without the FLOPs regularizer]

I have 2 questions.

  • As Fig 2 and Fig 3 show, why do the channel counts in my JSON file seem to decrease while the FLOPs still go up? (See the sketch at the end of this comment for checking both quantities side by side.)

  • I think it is related to the weights and biases, as in Fig 4. As I understand it, MorphNet changes the weights; does MorphNet also transfer the biases?

Fig 2
[screenshot: exported JSON channel counts]

Fig 3
[screenshot: FLOPs cost curve]

Fig 4
[screenshot: weights and biases of the network]

Thanks for replying :)
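
On the first question, it can help to log the regularizer's FLOPs estimate and the exported channel counts at the same step, so the two curves can be compared directly. A sketch based on the StructureExporter from the morph-net README; network_regularizer, train_op, sess, max_steps, and train_dir are assumed to exist in your training loop:

    from morph_net.tools import structure_exporter

    exporter = structure_exporter.StructureExporter(
        network_regularizer.op_regularizer_manager)
    cost_op = network_regularizer.get_cost()  # FLOPs of the current structure

    for step in range(1, max_steps + 1):
        _, cost, values = sess.run([train_op, cost_op, exporter.tensors])
        if step % 1000 == 0:
            # Writes learned_structure/alive_*.json with per-layer channel
            # counts; compare those against the FLOPs estimate printed here.
            exporter.populate_tensor_values(values)
            exporter.create_file_and_save_alive_counts(train_dir, step)
            print('step %d: estimated FLOPs = %s' % (step, cost))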
