Comments (11)
It's easy to write a train_val.prototxt from deploy.prototxt. You can do it yourself.
Here are my solver settings (batch size is 256); they are quite simple too.
base_lr: 0.1
lr_policy: "poly"
power: 1.0
max_iter: 500000
momentum: 0.9
weight_decay: 0.0001
from senet-caffe.
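For reference, Caffe's "poly" policy decays the learning rate polynomially so that it reaches zero at max_iter. With the settings above (power 1.0) this reduces to a linear ramp. The symbol t for the current iteration is introduced here only for illustration:

```latex
\mathrm{lr}(t) = \mathrm{base\_lr}\,\Bigl(1 - \frac{t}{\mathrm{max\_iter}}\Bigr)^{\mathrm{power}}
             = 0.1\,\Bigl(1 - \frac{t}{500000}\Bigr),
\qquad \mathrm{lr}(250000) = 0.05 .
```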
same question
Hi @shicai, have you ever tried another learning rate policy? Thanks.
I only trained it once and did not use any other lr policies.
Thanks for your wonderful work. I am not sure about the hyperparameters in your train.proto. The BN layer is as follows:
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param {
    bias_term: true
  }
}
Is it right?
That is fine for the test stage when using pretrained models, but for training you should add param blocks to control the weight decay and learning rate multipliers.
Thanks for your reply. Are the hyperparameters of the BatchNorm and Scale layers left at their defaults (lr_mult=1.0 and decay_mult=1.0)? @shicai
For BatchNorm layers, lr and wd should be set to 0, since you don't need to learn the mean/var params. For Scale layers, lr and wd should be set the same way as for conv layers.
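As a rough sketch of this advice, reusing the conv1 layer names from the question, the pair might look like the fragment below. The three param blocks on BatchNorm correspond to its three internal blobs (mean, variance, moving-average factor). Newer BVLC Caffe builds are reported to force those learning rates to zero themselves, so the explicit blocks mostly matter for older builds. The Scale multipliers assume the common Caffe convention of lr_mult 1 / decay_mult 1 for weights and lr_mult 2 / decay_mult 0 for biases. That is an assumption, not something the reply confirms.

```prototxt
# Sketch only: multipliers below are assumptions, not the author's released settings.
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # One param block per internal blob: mean, variance, moving-average factor.
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  # Learned scale (weight-like) and bias, treated like a conv layer's weight and bias.
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  scale_param {
    bias_term: true
  }
}
```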
Thanks. I would like to train from scratch on the ImageNet dataset, so I think the params in the BatchNorm layer need to be learned (mean/var/factor etc.). Should lr_mult and decay_mult be set as follows?
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  scale_param {
    bias_term: true
  }
}
The params in the BatchNorm layer don't need to be learned; they are calculated. The layer just computes the mean/var values, which are not really learnable params, so please don't set lr or wd for them.
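To spell out the distinction, here is a sketch of what the two layers compute per channel, using symbols (x, \mu, \sigma^2, \gamma, \beta) that do not appear in the thread. BatchNorm only normalizes with statistics measured from the data, while the Scale layer holds the values that gradient descent actually updates:

```latex
\hat{x} = \frac{x - \mu}{\sqrt{\sigma^{2} + \epsilon}}, \qquad y = \gamma\,\hat{x} + \beta
```

Here \mu and \sigma^2 are the batch statistics during training (running averages at test time), \epsilon is eps (1e-4 in the layer above), and \gamma, \beta are the Scale layer's weight and bias.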
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  scale_param {
    bias_term: true
  }
}
Is it right?