Is it possible to provide the train & solver prototxt?

train & solver prototxt about senet-caffe (CLOSED)

shicai commented on May 25, 2024

train & solver prototxt

from senet-caffe.

Comments (11)

shicai commented on May 25, 2024

It's easy to write a train_val.prototxt from deploy.prototxt; you can do it yourself.
Here are my solver settings (batch size is 256). They are quite simple too.

base_lr: 0.1
lr_policy: "poly"
power: 1.0
max_iter: 500000
momentum: 0.9
weight_decay: 0.0001
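
For reference, here is a minimal sketch of how the settings above might sit in a complete solver file. Only the six values quoted above come from this thread; the `net` path, test schedule, logging, and snapshot fields are hypothetical placeholders that you would adapt to your own setup.

```prototxt
# solver.prototxt (sketch; fields marked "hypothetical" are not from the thread)
net: "train_val.prototxt"             # hypothetical path to your train/val net
test_iter: 1000                       # hypothetical: e.g. 50000 val images / test batch size 50
test_interval: 5000                   # hypothetical

# Values quoted in the comment above
base_lr: 0.1
lr_policy: "poly"                     # lr = base_lr * (1 - iter / max_iter) ^ power
power: 1.0                            # power 1.0 gives a linear decay to 0 at max_iter
max_iter: 500000
momentum: 0.9
weight_decay: 0.0001

display: 100                          # hypothetical
snapshot: 10000                       # hypothetical
snapshot_prefix: "snapshots/senet"    # hypothetical
solver_mode: GPU
```

Note that the batch size of 256 is not a solver field. In Caffe it is set by `batch_size` in the data layer of the net, and `iter_size` in the solver can be used to accumulate gradients over several smaller batches if 256 images do not fit in memory.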

kli-casia commented on May 25, 2024

same question

zimenglan-sysu-512 commented on May 25, 2024

Hi @shicai,
have you ever tried another learning rate policy?
Thanks

shicai commented on May 25, 2024

I only trained it once, and did not use any other lr policy.

wlw208dzy commented on May 25, 2024

Thanks for your wonderful work. I am not sure about the hyperparameters in your train.proto. The BN layer is as follows:
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param {
    bias_term: true
  }
}
Is it right?

shicai commented on May 25, 2024

It is OK for the test stage when using pretrained models. But for training, you should add params to control the weight decay and learning rate multipliers.
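
As an illustration of what "adding params" looks like, here is a minimal sketch of a convolution layer with per-blob multipliers. The `param` blocks are matched to the layer's blobs in order (weights first, then bias). The layer shape, fillers, and multiplier values below are hypothetical placeholders, not taken from the senet-caffe deploy.prototxt.

```prototxt
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # First param block applies to the weights.
  param {
    lr_mult: 1.0      # effective lr = base_lr * lr_mult
    decay_mult: 1.0   # effective wd = weight_decay * decay_mult
  }
  # Second param block applies to the bias (only present if bias_term is true).
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 64    # hypothetical shape
    kernel_size: 7
    stride: 2
    pad: 3
    weight_filler { type: "msra" }
    bias_filler { type: "constant" value: 0 }
  }
}
```

If a `param` block is omitted, both multipliers default to 1.0, so the solver's `base_lr` and `weight_decay` apply unchanged.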

wlw208dzy commented on May 25, 2024

Thanks for your reply. I wonder if the hyperparameters of the BatchNorm and Scale layers are left at their defaults (lr_mult=1.0 and decay_mult=1.0)? @shicai

shicai commented on May 25, 2024

For BatchNorm layers, lr and wd should be set to 0, since you don't need to learn the mean/var params.
For Scale layers, lr and wd should be set the same way as for conv layers.
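
A minimal sketch of that advice, assuming the BVLC Caffe `BatchNorm` layer with its three internal blobs (mean, variance, moving-average factor). The Scale multipliers are placeholders; use whatever your conv layers use.

```prototxt
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # Three blobs: mean, variance, moving-average factor. None is learned by SGD.
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  # Learned scale, then learned bias. Placeholder values, set as for conv layers.
  param { lr_mult: 1.0 decay_mult: 1.0 }
  param { lr_mult: 1.0 decay_mult: 1.0 }
  scale_param {
    bias_term: true
  }
}
```

As far as I know, recent BVLC Caffe versions zero the BatchNorm learning rates themselves and refuse a nonzero `lr_mult` there, while older versions needed the explicit zeros, so writing them out should be safe either way. Please check this against the Caffe version you build.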

wlw208dzy commented on May 25, 2024

Thanks. I would like to train from scratch on the ImageNet dataset, so I think the params in the BatchNorm layer (mean/var/factor etc.) need to be learned. Should lr_mult and decay_mult be set as follows?
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  scale_param {
    bias_term: true
  }
}

shicai commented on May 25, 2024

You should know that the params in a BatchNorm layer don't need to be learned; they are calculated. The layer just computes the mean/var values, which are not really learnable params, so please don't set lr or wd for them.
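
To make the distinction concrete, here is the usual batch normalization formula split the way Caffe splits it across the two layers. The symbols are my own notation, not from this thread: $\mu$ and $\sigma^2$ are the statistics held by the BatchNorm layer, $\epsilon$ is `eps`, and $\gamma$, $\beta$ are the scale and bias held by the Scale layer.

```latex
% BatchNorm layer: normalize with computed statistics (nothing learned by SGD)
\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}}

% Scale layer with bias_term: true (gamma and beta are learned by SGD)
y = \gamma \hat{x} + \beta
```

During training, $\mu$ and $\sigma^2$ are computed from the current mini-batch and accumulated into running averages; at test time the running averages are used instead. Only $\gamma$ and $\beta$ receive gradient updates, which is why lr and wd multipliers matter for the Scale layer but not for BatchNorm.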

wlw208dzy commented on May 25, 2024

layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    eps: 1e-4
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  scale_param {
    bias_term: true
  }
}
Is it right?
