
Comments (16)

shicai commented on May 24, 2024

The default values of lr_mult and decay_mult are 1.0. If you don't set anything they are 1, i.e. this layer's parameters are learned at the solver's base_lr.
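
To make that concrete, a minimal sketch (layer names and shapes are illustrative): the two declarations below behave identically, because omitting the param block is the same as writing the defaults explicitly.

# no param block: lr_mult and decay_mult default to 1.0,
# so the weights follow base_lr and the global weight_decay
layer {
  name: "conv_a"
  type: "Convolution"
  bottom: "data"
  top: "conv_a"
  convolution_param { num_output: 64 kernel_size: 3 }
}

# explicit form of the same defaults
layer {
  name: "conv_b"
  type: "Convolution"
  bottom: "data"
  top: "conv_b"
  param { lr_mult: 1 decay_mult: 1 }    # weights
  param { lr_mult: 1 decay_mult: 1 }    # bias (present only if bias_term is true)
  convolution_param { num_output: 64 kernel_size: 3 }
}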

shicai commented on May 24, 2024

The three parameters inside BatchNorm (mean, variance, and the moving-average factor) are computed statistics, not weights obtained by gradient updates, so weight decay and lr_mult must not be applied to them; that is why the param blocks need to be set to 0.

shicai commented on May 24, 2024

The answer above is correct: the default lr_mult and decay_mult are 1.0. See: https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L316
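
For reference, the relevant part of the ParamSpec message at that link reads roughly as follows (quoted from memory, so check the linked line for the exact wording and field numbers):

message ParamSpec {
  // The multiplier on the global learning rate for this parameter.
  optional float lr_mult = 3 [default = 1.0];
  // The multiplier on the global weight decay for this parameter.
  optional float decay_mult = 4 [default = 1.0];
}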

foralliance commented on May 24, 2024

My main question: when this param block is absent, does the layer
1. learn according to the solver's base_lr, or
2. behave as if param { lr_mult: 0 } were set, i.e. the layer is frozen?

DanLiu0623 commented on May 24, 2024

layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  batch_norm_param {
    use_global_stats: true
    eps: 1e-5
  }
}

Take this BN parameter setup, for example: what does it mean that all three param blocks are 0? @shicai

DanLiu0623 commented on May 24, 2024

layer {
  name: "caffe.BN_5"
  type: "BN"
  bottom: "caffe.SpatialConvolution_4"
  top: "caffe.BN_5"
  param { lr_mult: 1 decay_mult: 0 }
  param { lr_mult: 1 decay_mult: 0 }
  bn_param {
    frozen: true
    slope_filler { value: 1 }
    bias_filler { value: 0 }
  }
}

With a BN layer defined like this, are the params learned by weight updates after all? frozen: true seems to correspond to use_global_stats: true in stock Caffe, meaning the global mean and variance are used at TEST time, yet lr_mult: 1 suggests these parameters are learned by weight updates, doesn't it? @shicai

DanLiu0623 commented on May 24, 2024

I see now: this project folds batchnorm and scale into a single layer, so the two parameters that do get learned by weight updates are the scale and bias. What I still don't understand is why the BatchNorm parameters (mean and variance), which are computed statistics, must be written as param { lr_mult: 0 decay_mult: 0 } three times. Can those blocks be left out?

foralliance commented on May 24, 2024

@DanLiu0623
Because those parameters must stay fixed, the multipliers have to be 0.
If you leave the param blocks out, lr_mult and decay_mult default to 1.
This is just my own understanding.
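
To make the pattern concrete, here is a minimal sketch of the usual stock-Caffe form (layer names are illustrative): the BatchNorm layer carries the computed statistics with all multipliers zeroed, and the Scale layer that follows carries the scale and bias that are actually learned.

layer {
  name: "conv1/bn"                      # illustrative names
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # three blobs: mean, variance, moving-average factor; computed, never learned
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
  param { lr_mult: 0 decay_mult: 0 }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1"
  top: "conv1"
  scale_param { bias_term: true }       # learnable gamma (scale) and beta (bias)
}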

foralliance commented on May 24, 2024

@shicai

A few small questions:
1.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 64
    bias_term: false
    pad: 3
    kernel_size: 7
    stride: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
  }
}
When fine-tuning, is the weight_filler inside convolution_param effectively useless, since the weights get overwritten by the pretrained model and no initialization is needed at all? Is that understanding correct?

2.
For train.prototxt and test.prototxt: apart from the head and tail being different, shouldn't the contents in the middle be identical? I seem to remember seeing cases where the middle differed. How should that be understood?

3.
Consider the following two forms:
1)
layer {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    kernel_size: 7
    pad: 3
    stride: 2
  }
}
2)
layer {
  name: "fc6_readonly"
  type: "InnerProduct"
  bottom: "pool5_readonly"
  top: "fc6_readonly"
  propagate_down: false
  param {
    name: "fc6_w"
  }
  param {
    name: "fc6_b"
  }
  inner_product_param {
    num_output: 4096
  }
}
For 2), lr_mult and decay_mult take the default value of 1.
For 1), there is no param entry at all: do lr_mult and decay_mult still fall back to the default value of 1, or is something else going on? In the proto, param itself has no default entry.

shicai commented on May 24, 2024

When fine-tuning, any parameter whose layer name and shape match the original network gets overwritten (copied from the pretrained model), so there is no need to initialize it.
Conv and FC layers have two params by default; you can leave them unset and the defaults apply. A param usually does not need a name; when it is named, that is usually for sharing weights with another layer: params with the same name all point to the same set of parameters.
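
For context, fine-tuning is typically launched with Caffe's -weights flag; it copies parameters from the .caffemodel into every layer whose name and blob shapes match, and everything else falls back to its weight_filler (file names below are illustrative):

# only layers with matching names/shapes are copied from the pretrained model
caffe train --solver=solver.prototxt --weights=pretrained.caffemodel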

foralliance commented on May 24, 2024

Thank you for your answer, @shicai.

Regarding your statement that "a param usually does not need a name; when it is named, that is usually for sharing weights with another layer, and params with the same name all point to the same set of parameters": can it be understood like this?
layer {
  name: "fc6_readonly"
  type: "InnerProduct"
  bottom: "pool5_readonly"
  top: "fc6_readonly"
  propagate_down: false
  param {
    name: "fc6_w"
  }
  param {
    name: "fc6_b"
  }
  inner_product_param {
    num_output: 4096
  }
}
........................
........................
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    name: "fc6_w"
    lr_mult: 1
    decay_mult: 1
  }
  param {
    name: "fc6_b"
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}

As in the same prototxt above: does this mean fc6_readonly's param settings are consistent with (shared with) fc6's?

layer {
  name: "upP2"
  type: "Deconvolution"
  bottom: "p3_lateral"
  top: "upP2"
  convolution_param {
    kernel_h: 4
    kernel_w: 4
    stride_h: 2
    stride_w: 2
    pad_h: 1
    pad_w: 1
    num_output: 256
    group: 256
    bias_term: false
    weight_filler {
      type: "bilinear"
    }
  }
  param { lr_mult: 0 decay_mult: 0 }
}

In a lot of code the Deconvolution layer's param multipliers are all 0. Does this layer really not need to be learned, and just uses the initialized parameters directly?

shicai commented on May 24, 2024
  1. They point to the same set of W/b parameters; fc6 gets updated, fc6_readonly does not.
  2. The initialization is bilinear and the parameters are fixed, i.e. the DeConv layer implements a fixed bilinear upsampling.
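
As a rule of thumb (this is the recipe used in the FCN reference code, mentioned here as background rather than something from this thread): for an integer upsampling factor f, use kernel_size = 2f - (f mod 2), stride = f, pad = ceil((f - 1) / 2), and set group = num_output = the number of channels so every channel is upsampled independently. The upP2 layer above is the f = 2 case (kernel 4, stride 2, pad 1); a hypothetical f = 4 version would look like this:

layer {
  name: "up4x"                          # illustrative name and channel count
  type: "Deconvolution"
  bottom: "feat"
  top: "up4x"
  param { lr_mult: 0 decay_mult: 0 }    # fixed bilinear kernel, never updated
  convolution_param {
    num_output: 256
    group: 256                          # one bilinear kernel per channel
    kernel_size: 8                      # 2*4 - (4 mod 2)
    stride: 4
    pad: 2                              # ceil((4 - 1) / 2)
    bias_term: false
    weight_filler { type: "bilinear" }
  }
}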

foralliance commented on May 24, 2024

@shicai
"They point to the same set of W/b parameters": does that mean fc6_readonly's parameters are fc6's?

shicai commented on May 24, 2024

Yes.

foralliance commented on May 24, 2024

@shicai
Thank you for your patience.

YilanWang commented on May 24, 2024

In the updated Caffe, bn_param has been replaced by batch_norm_param, and the frozen parameter seems to be gone. How should that setting from old code be handled in the new version?
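
For what it is worth, based on the earlier comments in this thread (frozen: true in the old BN layer corresponds to using the global statistics, and the statistics themselves are never learned), the old setting maps roughly onto the following in current Caffe; treat this as a sketch rather than a confirmed answer:

layer {
  name: "example/bn"                    # illustrative name
  type: "BatchNorm"
  bottom: "example"
  top: "example"
  param { lr_mult: 0 decay_mult: 0 }    # mean
  param { lr_mult: 0 decay_mult: 0 }    # variance
  param { lr_mult: 0 decay_mult: 0 }    # moving-average factor
  batch_norm_param { use_global_stats: true }
}
# the learnable slope/bias of the old BN layer would live in a separate
# Scale layer with bias_term: true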
