Comments (13)
Hi, did you check whether your total loss (model_loss + regularization_loss) decreases over time?
Also, if you train CIFAR-10 from scratch, the initial softmax classification loss should be about ln(10) ≈ 2.3, I suppose, while your regularization loss starts around 0.017, which is relatively low.
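The ln(10) figure comes from the fact that an untrained 10-way softmax assigns roughly uniform probability 1/10 to every class. A minimal sanity check in plain NumPy (independent of the model discussed in this thread):

```python
import math
import numpy as np

def softmax_cross_entropy(logits, label):
    # Numerically stable log-softmax followed by negative log-likelihood.
    z = logits - np.max(logits)
    log_probs = z - math.log(np.sum(np.exp(z)))
    return float(-log_probs[label])

# Untrained logits are roughly uniform; with 10 CIFAR-10 classes the
# expected initial classification loss is ln(10) ~= 2.303.
loss = softmax_cross_entropy(np.zeros(10), label=3)
print(round(loss, 3))  # -> 2.303
```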
Hope this helps.
from morph-net.
Everyone, thanks for the help :).
I changed my loss and it finally worked.
The problem was that I hadn't added the regularizer loss to the original loss, and I had put the regularizer in the wrong place in the code.
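To illustrate why the regularizer term must be added to the task loss, here is a toy one-parameter sketch (hypothetical numbers, not morph-net code): the task loss pulls a batch-norm gamma toward a large value, and only the added L1-style penalty can push it down toward gamma_threshold.

```python
import numpy as np

def train_gamma(strength, steps=500, lr=0.05):
    # Toy gradient descent on total = (gamma - 1.5)^2 + strength * |gamma|.
    gamma = 1.0
    for _ in range(steps):
        task_grad = 2.0 * (gamma - 1.5)       # pulls gamma toward 1.5
        reg_grad = strength * np.sign(gamma)  # pulls gamma toward 0
        gamma -= lr * (task_grad + reg_grad)
    return gamma

# Forgetting to add the regularizer (strength = 0) leaves gamma large,
# so no channel ever drops below gamma_threshold.
print(round(train_gamma(0.0), 2))  # -> 1.5
print(round(train_gamma(2.0), 2))  # -> 0.5
```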
One thing that stands out to me is that gamma_threshold is set to 0.88 which is rather large. The value quoted in the paper is 1e-2. As a sanity check, you can analyze the checkpoint to generate a histogram of the gamma values as in Figure 2. That can help choose the right value for gamma_threshold.
Thanks for replying @ayp-google.
As far as I know, gamma_threshold is an argument of the network regularizer. How can I check my gamma values with a histogram?
You can use something like inspect_checkpoint.py to examine the tensor values. Specifically, you can get all the gamma values and plot a histogram as in Figure 2 in the paper. There is usually a bimodal distribution and you can choose gamma_threshold to be in the gap.
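A sketch of that workflow in NumPy (the checkpoint-reading step is assumed: in TF1 you would gather every tensor whose name contains 'gamma', e.g. via inspect_checkpoint.py or tf.train.load_checkpoint, and concatenate the values before passing them in):

```python
import numpy as np

def threshold_in_gap(gammas, bins=50):
    # Histogram the gamma values; with a bimodal distribution the two
    # tallest bins approximate the two modes, and the emptiest bin
    # between them is a reasonable place to put gamma_threshold.
    counts, edges = np.histogram(gammas, bins=bins)
    lo, hi = sorted(np.argsort(counts)[-2:])
    if hi <= lo + 1:  # modes in adjacent bins: no real gap
        return 0.5 * (edges[lo + 1] + edges[hi])
    gap = lo + 1 + int(np.argmin(counts[lo + 1:hi]))
    return 0.5 * (edges[gap] + edges[gap + 1])

# Synthetic bimodal gammas: a near-zero cluster and a cluster around 1.0.
gammas = np.concatenate([np.tile([0.008, 0.01, 0.012], 100),
                         np.tile([0.9, 1.0, 1.1], 100)])
print(threshold_in_gap(gammas))  # a value well inside the gap
```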
Note:
It's probably not the problem, but
input_boundary=[images, labels]
should be:
input_boundary=[images.op, labels.op]
Thank you @eladeban.
I tried this, but the visualization doesn't look much different.
I'll edit it.
Thanks for your help @ayp-google.
I tried to plot a histogram of the gamma values like Figure 2 in the paper, using inspect_checkpoint.py.
Then I got the list below.
I used it to trace the values in TensorBoard, and found this.
I want to ask how I can find the gamma values to plot a histogram.
You can use any histogram tool. Basically, you visualize how the gamma values are distributed and adjust regularization_strength and gamma_threshold accordingly.
For example, in your tensor all the gamma values are >1. If so, you may need a higher regularization_strength. Large gamma values indicate that the output of the convolution is strong and those channels should not be removed. A larger regularization strength means each gamma carries a higher cost, so the optimizer will decrease the gammas to optimize the overall cost. Once a gamma value is small, the output signal is weak and does not contribute much to the prediction, so the channel can be removed.
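That trade-off can be written down for a single gamma. Minimizing a toy total cost (g - target)^2 + strength * |g| (hypothetical numbers, not the actual FLOPs regularizer) gives a closed-form optimum, so the surviving gamma shrinks linearly as the strength grows:

```python
def best_gamma(strength, target=1.5):
    # Closed-form minimizer of (g - target)^2 + strength * |g| for g >= 0:
    # the L1 penalty shifts the optimum down by strength / 2, floored at 0.
    return max(0.0, target - strength / 2.0)

for s in [0.0, 1.0, 2.0, 4.0]:
    print(s, best_gamma(s))
# 0.0 -> 1.5, 1.0 -> 1.0, 2.0 -> 0.5, 4.0 -> 0.0 (channel prunable)
```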
@ayp-google, the original gamma values are in Figure 1 and the gamma values with MorphNet are in Figure 2.
Figure 1
Figure 2
I tried to plot the histogram with Figure 2's values, but it didn't work, so I plotted Figure 1's.
Figure 3
I fine-tuned gamma_threshold and got the result in Figure 4: the green curve's threshold is 0.01, the gray one's is 0.86, the red one's is 0.95, with my regularization strength fixed at 1e-10.
Figure 4
FLOPs still rise.
My thought: as Figure 3 shows, the gammas grow larger, so my threshold stops having any effect after 3k steps, and that's why the FLOPs rise, right?
Thanks for helping!!
Thanks for sharing this case.
It is very unusual. I don't think I have seen gammas go above 1.0 in any other model.
I am not sure exactly what model you are using or how you are configuring the batch-norm parameters, but I suspect it could be related to that.
Things that come to mind:
- What is the distribution of gammas when training without the FLOPs regularizer?
- Could you try applying a small weight decay on the gammas (as is done in ResNet models, say 1e-4 or so)? This could prevent the swell in the gammas.
- What are the loss and accuracy of the models you have tried? Are they roughly OK, or are they junk?
- The fact that the regularization_term is going up is not that surprising, as the gammas are >1 and the upper bound becomes looser and looser. The experiments with different thresholds actually suggest to me that you should explore the main knob, regularization_strength: as ayp@ suggests, try values larger than 1e-8 rather than smaller ones, say 5e-8, 1e-7, or 5e-7, with any threshold you see fit.
- You are not saying anything about your model. How non-standard is it? Are you using a custom LR schedule (i.e., is it increasing after 3k steps)? Are you using custom initializers?
- I don't recall having problems with Momentum in the past, but could you try RMSProp with a constant learning rate of 0.1, just to rule this out?
- When you say you are using PocketFlow, what do you mean by that? Do you have other optimizations going on? If so, could you try turning them off at first?
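The weight-decay suggestion above could be sketched like this (a NumPy stand-in; in a real TF1 graph the dict would come from tf.trainable_variables(), and the term would be added to the total loss alongside the MorphNet regularizer):

```python
import numpy as np

def gamma_decay_term(named_weights, decay=1e-4):
    # Select the batch-norm gamma variables by name and apply an L2 penalty,
    # using the tf.nn.l2_loss convention 0.5 * sum(w ** 2).
    return decay * sum(0.5 * float(np.sum(w ** 2))
                       for name, w in named_weights.items()
                       if 'gamma' in name)

# Hypothetical variables: only the batch-norm gammas are penalized.
weights = {'conv1/BatchNorm/gamma': np.ones(4),
           'conv1/weights': np.ones((3, 3))}
print(gamma_decay_term(weights))  # -> 0.0002
```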
Hope it helps,
Elad
@Cheng-Hsiung thanks for sharing :).
I tried larger strengths with gamma_threshold fixed at 0.88: the gray curve is 1e-10, the orange one is 1e-2, and the red one is 1e+2.
Here are the gamma values; they seem to have decreased in some batch_normalization layers.
I may be missing something; I'll follow Elad's points to improve.
@eladeban , thank you :)
Fig 1 is the gamma histogram without the FLOPs regularizer. I'll try adding weight decay to the gammas, or try using MorphNet without PocketFlow.
I have 2 questions:
- As Fig 2 and Fig 3 show, why do the convolutions in my JSON file seem to decrease while the FLOPs still go up?
- I think it is related to the weights and biases, as in Fig 4. As far as I know MorphNet changes the weights; does MorphNet also transfer the biases?
Thanks for replying :)