
Comments (16)

shicai commented on May 24, 2024

The default values of lr_mult and decay_mult are 1.0. If you don't set anything they are 1, i.e. this layer's parameters are learned at the solver's base_lr.
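
To make that concrete, a minimal sketch (layer names and shapes are illustrative): the two declarations below behave identically, because omitting the param block is the same as writing the defaults explicitly.

# no param block: lr_mult and decay_mult default to 1.0,
# so the weights follow base_lr and the global weight_decay
layer {
  name: "conv_a"
  type: "Convolution"
  bottom: "data"
  top: "conv_a"
  convolution_param { num_output: 64 kernel_size: 3 }
}

# explicit form of the same defaults
layer {
  name: "conv_b"
  type: "Convolution"
  bottom: "data"
  top: "conv_b"
  param { lr_mult: 1 decay_mult: 1 }    # weights
  param { lr_mult: 1 decay_mult: 1 }    # bias (present only if bias_term is true)
  convolution_param { num_output: 64 kernel_size: 3 }
}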

shicai commented on May 24, 2024

The three parameters inside BatchNorm (mean, variance, and the moving-average factor) are computed statistics, not weights obtained by gradient updates, so weight decay and lr_mult must not be applied to them; that is why the param blocks need to be set to 0.

shicai commented on May 24, 2024

The answer above is correct: the default lr_mult and decay_mult are 1.0. See: https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L316
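
For reference, the relevant part of the ParamSpec message at that link reads roughly as follows (quoted from memory, so check the linked line for the exact wording and field numbers):

message ParamSpec {
  // The multiplier on the global learning rate for this parameter.
  optional float lr_mult = 3 [default = 1.0];
  // The multiplier on the global weight decay for this parameter.
  optional float decay_mult = 4 [default = 1.0];
}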

foralliance commented on May 24, 2024

My main question: when this param block is absent, does the layer
1. learn according to the solver's base_lr, or
2. behave as if param { lr_mult: 0 } were set, i.e. the layer is frozen?

DanLiu0623 commented on May 24, 2024

layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  batch_norm_param {
    use_global_stats: true
    eps: 1e-5
  }
}

Take this BN parameter setup, for example: what does it mean that all three param blocks are 0? @shicai

DanLiu0623 commented on May 24, 2024

layer {
  name: "caffe.BN_5"
  type: "BN"
  bottom: "caffe.SpatialConvolution_4"
  top: "caffe.BN_5"
  param { lr_mult: 1 decay_mult: 0 }
  param { lr_mult: 1 decay_mult: 0 }
  bn_param {
    frozen: true
    slope_filler { value: 1 }
    bias_filler { value: 0 }
  }
}

With a BN layer defined like this, are the params learned by weight updates after all? frozen: true seems to correspond to use_global_stats: true in stock Caffe, meaning the global mean and variance are used at TEST time, yet lr_mult: 1 suggests these parameters are learned by weight updates, doesn't it? @shicai

DanLiu0623 commented on May 24, 2024

I see now: this project folds batchnorm and scale into a single layer, so the two parameters that do get learned by weight updates are the scale and bias. What I still don't understand is why the BatchNorm parameters (mean and variance), which are computed statistics, must be written as param { lr_mult: 0 decay_mult: 0 } three times. Can those blocks be left out?

foralliance commented on May 24, 2024

@DanLiu0623
Because those parameters must stay fixed, the multipliers have to be 0.
If you leave the param blocks out, lr_mult and decay_mult default to 1.
This is just my own understanding.
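
To make the pattern concrete, here is a minimal sketch of the usual stock-Caffe form (layer names are illustrative): the BatchNorm layer carries the computed statistics with all multipliers zeroed, and the Scale layer that follows carries the scale and bias that are actually learned.

layer {
  name: "conv1/bn"                      # illustrative names
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # three blobs: mean, variance, moving-average factor; computed, never learned
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param { bias_term: true }       # learnable gamma (scale) and beta (bias)
}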

foralliance commented on May 24, 2024

@shicai

A few small questions:
1.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 64
    bias_term: false
    pad: 3
    kernel_size: 7
    stride: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
  }
}
When fine-tuning, is the weight_filler inside convolution_param effectively useless, since the weights get overwritten by the pretrained model and no initialization is needed at all? Is that understanding correct?

2.
For train.prototxt and test.prototxt: apart from the head and tail being different, shouldn't the contents in the middle be identical? I seem to remember seeing cases where the middle differed. How should that be understood?

3.
Consider the following two forms:
1)
layer {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    kernel_size: 7
    pad: 3
    stride: 2
  }
}
2)
layer {
  name: "fc6_readonly"
  type: "InnerProduct"
  bottom: "pool5_readonly"
  top: "fc6_readonly"
  propagate_down: false
  param {
    name: "fc6_w"
  }
  param {
    name: "fc6_b"
  }
  inner_product_param {
    num_output: 4096
  }
}
For 2), lr_mult and decay_mult take the default value of 1.
For 1), there is no param entry at all: do lr_mult and decay_mult still fall back to the default value of 1, or is something else going on? In the proto, param itself has no default entry.

shicai commented on May 24, 2024

When fine-tuning, any parameter whose layer name and shape match the original network gets overwritten (copied from the pretrained model), so there is no need to initialize it.
Conv and FC layers have two params by default; you can leave them unset and the defaults apply. A param usually does not need a name; when it is named, that is usually for sharing weights with another layer: params with the same name all point to the same set of parameters.
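
For context, fine-tuning is typically launched with Caffe's -weights flag; it copies parameters from the .caffemodel into every layer whose name and blob shapes match, and everything else falls back to its weight_filler (file names below are illustrative):

# only layers with matching names/shapes are copied from the pretrained model
caffe train --solver=solver.prototxt --weights=pretrained.caffemodel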

foralliance commented on May 24, 2024

Thank you for your answer, @shicai.

Regarding your statement that "a param usually does not need a name; when it is named, that is usually for sharing weights with another layer, and params with the same name all point to the same set of parameters": can it be understood like this?
layer {
  name: "fc6_readonly"
  type: "InnerProduct"
  bottom: "pool5_readonly"
  top: "fc6_readonly"
  propagate_down: false
  param {
    name: "fc6_w"
  }
  param {
    name: "fc6_b"
  }
  inner_product_param {
    num_output: 4096
  }
}
........................
........................
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    name: "fc6_w"
    lr_mult: 1
    decay_mult: 1
  }
  param {
    name: "fc6_b"
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}

As in the same prototxt above: does this mean fc6_readonly's param settings are consistent with (shared with) fc6's?

layer {
  name: "upP2"
  type: "Deconvolution"
  bottom: "p3_lateral"
  top: "upP2"
  convolution_param {
    kernel_h: 4
    kernel_w: 4
    stride_h: 2
    stride_w: 2
    pad_h: 1
    pad_w: 1
    num_output: 256
    group: 256
    bias_term: false
    weight_filler {
      type: "bilinear"
    }
  }
  param { lr_mult: 0 decay_mult: 0 }
}

In a lot of code the Deconvolution layer's param multipliers are all 0. Does this layer really not need to be learned, and just uses the initialized parameters directly?

shicai commented on May 24, 2024
  1. They point to the same set of W/b parameters; fc6 gets updated, fc6_readonly does not.
  2. The initialization is bilinear and the parameters are fixed, i.e. the DeConv layer implements a fixed bilinear upsampling.
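
As a rule of thumb (this is the recipe used in the FCN reference code, mentioned here as background rather than something from this thread): for an integer upsampling factor f, use kernel_size = 2f - (f mod 2), stride = f, pad = ceil((f - 1) / 2), and set group = num_output = the number of channels so every channel is upsampled independently. The upP2 layer above is the f = 2 case (kernel 4, stride 2, pad 1); a hypothetical f = 4 version would look like this:

layer {
  name: "up4x"                          # illustrative name and channel count
  type: "Deconvolution"
  bottom: "feat"
  top: "up4x"
  param { lr_mult: 0 decay_mult: 0 }    # fixed bilinear kernel, never updated
  convolution_param {
    num_output: 256
    group: 256                          # one bilinear kernel per channel
    kernel_size: 8                      # 2*4 - (4 mod 2)
    stride: 4
    pad: 2                              # ceil((4 - 1) / 2)
    bias_term: false
    weight_filler { type: "bilinear" }
  }
}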

foralliance commented on May 24, 2024

@shicai
"They point to the same set of W/b parameters": does that mean fc6_readonly's parameters are fc6's?

shicai commented on May 24, 2024

Yes.

foralliance commented on May 24, 2024

@shicai
Thank you for your patience.

YilanWang commented on May 24, 2024

In the updated Caffe, bn_param has been replaced by batch_norm_param, and the frozen parameter seems to be gone. How should that setting from old code be handled in the new version?
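
For what it is worth, based on the earlier comments in this thread (frozen: true in the old BN layer corresponds to using the global statistics, and the statistics themselves are never learned), the old setting maps roughly onto the following in current Caffe; treat this as a sketch rather than a confirmed answer:

layer {
  name: "example/bn"                    # illustrative name
  type: "BatchNorm"
  bottom: "example"
  top: "example"
  param { lr_mult: 0 decay_mult: 0 }    # mean
  param { lr_mult: 0 decay_mult: 0 }    # variance
  param { lr_mult: 0 decay_mult: 0 }    # moving-average factor
  batch_norm_param { use_global_stats: true }
}
# the learnable slope/bias of the old BN layer would live in a separate
# Scale layer with bias_term: true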
