
mobile-yolov5-pruning-distillation's Issues

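Statement!

After much thought, I have decided to make the following statement.
This git is for learning only!!! No freeloaders.
The purpose of this git is to give you a rough picture of a basic engineering workflow. Swapping the backbone, pruning, distillation, and quantization are all routine operations, as is using existing frameworks to port to C++ and Android.
If you want to use it commercially, or for a course project or graduation project, that is perfectly fine; just follow the open-source license of the original git. Time was short, so slips in the code details are inevitable, but as long as you follow the readme, reproducing the whole pipeline will work. Many possible problems and details are recorded in the readme and in the code comments, and I have already answered some implementation questions in issues.
Good ideas are welcome, and if you run into conceptual problems you are welcome to discuss them, but no freeloaders.
Everyone's code environment and way of working is different, so I cannot debug for you from a simple ctrl c+v of an error message.
If you are interested in commercial cooperation, feel free to contact me privately.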

torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'

Running test.py with the pruned model fails with the following error:
Traceback (most recent call last):
  File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 357, in <module>
    test(opt.data,
  File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 89, in test
    names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
  File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'

prune and distillation

Hello, excuse me again. I ran your code for pruning and for distilling mobilenet. One reduces the parameter count and the other improves accuracy. Why not use the pruned model as the student model for distillation?

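About the pruning and distillation strategy

Could you provide some reference material on the pruning and distillation strategies used here? Many thanks. Also, would changing the dataset affect the pruning and distillation strategy and results?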

Low training recall rate

I set the test params (conf_thresh=0.3, iou_thresh=0.3, batch=24) when training on the VOC2012 and 2017 datasets, but
the recall after 120 epochs is only 0.3xx. Is this normal?

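Distillation

When distilling with the v2 weights of v5, I get the error AttributeError: 'Detect' object has no attribute 'm'. Also, after pruning the mAP is only around 0.4. What is going on? I followed the quick start.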

Thanks for your fantastic work! But I encountered a problem: onnx export error...

env :

ubuntu1804
torch: 1.6.0
torchvision: 0.7.0
Training: python3 train.py --type vocs, then python3 train.py --type dmvocs
export: export PYTHONPATH="$PWD" && python models/onnx_export.py --weights outputs/dmvocs/weights/best_dmvocs.pt --img 640 --batch 1

Error:

Namespace(batch_size=1, img_size=[640], weights='outputs/dmvocs/weights/best_dmvocs.pt')
Fusing layers...
Model Summary: 148 layers, 3.59629e+06 parameters, 3.31818e+06 gradients
Traceback (most recent call last):
  File "models/onnx_export.py", line 35, in <module>
    _ = model(img)  # dry run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
    x = m(x)  # run
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/byronnar/mobile-yolov5-pruning-distillation/models/common.py", line 24, in fuseforward
    return self.act(self.conv(x))
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [1, 3, 640] instead

How can I solve this problem? Looking forward to your reply! Thank you!
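
The Namespace line shows `img_size=[640]`, and the reported input size `[1, 3, 640]` is what results when a dummy input is built from a one-element size list. A minimal sketch of normalizing the size before building the dummy input follows; `dummy_input_shape` is a hypothetical helper, not a function from the repository, and how onnx_export.py actually builds its input is an assumption here.

```python
def dummy_input_shape(batch_size, img_size, channels=3):
    """Build an NCHW shape from an image size given as one or two integers.

    Hypothetical helper for illustration only.
    """
    sizes = list(img_size)
    if len(sizes) == 1:
        sizes = sizes * 2  # [640] -> [640, 640]
    if len(sizes) != 2:
        raise ValueError("img_size must contain one or two integers")
    height, width = sizes
    return (batch_size, channels, height, width)


print(dummy_input_shape(1, [640]))       # (1, 3, 640, 640)
print(dummy_input_shape(1, [640, 480]))  # (1, 3, 640, 480)
```

A 4-dimensional shape like this is what the first convolution in the traceback expects.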

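Precision drops too much during distillation

Thanks for the project.
During distillation I found that Precision drops quite a lot. In your distillation strategies, Precision is also somewhat lower than before distillation. Is there a way to improve this?
With a single class, though, there is no big drop.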

Pruning code

Hello, I can't quite follow the pruning code. Is there an explanatory blog post for it?

pruning problem

training dataset: coco2017
model weights: yolov5s.pt
classes = 80
Why do P and R drop so low when pruning?

Log as follows:
Epoch gpu_mem GIoU obj cls total targets img_size
3/49 2.66G 0.06193 0.0939 0.04484 0.2007 141 640: 100%|██████████████████| 7393/7393 [2:41:07<00:00, 1.31s/
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 313/313 [02:54<00:00, 1.80it/s]
all 5e+03 3.63e+04 0.000964 0.000638 5.18e-05 1.53e-05

1

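After distilling the pruned model, I found that as the accuracy went up, the parameter count also returned to its pre-pruning value.

After sparsity training, running pruning fails with FileNotFoundError: [Errno 2] No such file or directory: 'render_img/before_pruning.jpg'. Where is this file generated? Thanks.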

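Pruning segmentation fault

I first trained with python train.py --type mvocs, then ran python3 train.py --type smvocs with the previous step's best_mvocs.pt as the input model, and then ran python3 pruning.py -t 0.1 with the previous step's best_smvocs.pt as the input model. Running python3 pruning.py -t 0.1 reports a segmentation fault.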

L1 sparsity training details

import torch


def compute_pruning_loss(p, prunable_modules, model, loss):
    """Add an L1 sparsity penalty on the BN gamma values to the loss."""
    ll1 = torch.zeros(1, device=p[0].device)
    h = model.hyp  # hyperparameters
    if prunable_modules:  # skip when None or empty (avoids division by zero)
        for m in prunable_modules:
            ll1 += m.weight.norm(1)  # L1 norm of the BN layer's gamma values
        ll1 /= len(prunable_modules)
    ll1 *= h['sl']
    bs = p[0].shape[0]  # batch size
    loss += ll1 * bs
    return loss

In this function, why is the mean norm multiplied by batch_size?
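
One possible reading, which the thread does not confirm: if the `loss` passed in is itself scaled by batch size, the penalty needs the same factor so that the ratio between the two terms does not change with batch size. The sketch below reproduces the arithmetic of the function in plain Python under that assumption. `sparsity_penalty`, the gamma values, the `sl` value, and the `0.2 * bs` detection loss are all made up for illustration.

```python
def sparsity_penalty(bn_gammas, sl, batch_size):
    """Mean per-layer L1 norm of BN gammas, scaled by sl and batch size.

    bn_gammas: one list of gamma values per BN layer (illustrative data).
    """
    if not bn_gammas:
        return 0.0
    l1_per_layer = [sum(abs(g) for g in layer) for layer in bn_gammas]
    mean_l1 = sum(l1_per_layer) / len(l1_per_layer)
    return mean_l1 * sl * batch_size


gammas = [[0.5, -1.0, 0.25], [2.0, -0.5]]
for bs in (1, 16):
    # Assumption: the detection loss is already multiplied by batch size.
    det_loss = 0.2 * bs
    penalty = sparsity_penalty(gammas, sl=0.001, batch_size=bs)
    print(bs, penalty / det_loss)  # the ratio is the same for both batch sizes
```

If the detection loss were not scaled by batch size, the same factor would instead make the penalty grow with batch size.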

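About finetuning

As a beginner I would like to ask a fairly basic question: if the number of finetune epochs for the pruned model is set high enough (for example 100 epochs, the same as normal training), it reaches a very good mAP. Is there anything wrong with doing this? Is there any limit on the number of finetune epochs, for example only 10 or 20?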

mAP become 0

Why did mAP all become 0 after the second epoch when I used mobilev2-yolo5s to train my data?

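Which layers does ignore_idx refer to in pruning

Hello, thank you for open-sourcing your work. In the pruning part you set ignore_idx=[230, 260, 290]. Which layers do [230, 260, 290] refer to? Or do you have weights you could share?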

run it

I cannot run it; it gives me this message:

Traceback (most recent call last):
  File "train.py", line 762, in <module>
    opt.cfg = check_file(opt.cfg)  # check file
  File "/home/demon/CLionProjects/mobile-yolov5-pruning-distillation/utils/utils.py", line 101, in check_file
    if os.path.isfile(file):
  File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/genericpath.py", line 30, in isfile
    st = os.stat(path)

pruning is useless

prunable_modules = []
prunable_module_type = (nn.BatchNorm2d, )
for i, m in enumerate(model.modules()):
    if i in ignore_idx:
        continue
    if isinstance(m, prunable_module_type):
        prunable_modules.append(m)

As shown above, the model structure is not changed!
In addition, the BN of the conv modules is not appended to prunable_modules.
Where does the pruning happen?
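
The loop quoted above only collects BatchNorm layers; on its own it does not change the model. In BN-based channel pruning, a later step usually ranks the collected gamma values and decides which channels to keep. The sketch below shows that selection step in plain Python. It is an illustration under stated assumptions, not the repository's pruning.py: `channel_masks` and `prune_ratio` are hypothetical names, and the gamma values are made up.

```python
def channel_masks(bn_gammas, prune_ratio):
    """Per-layer keep-masks from one global threshold on |gamma|.

    bn_gammas: one list of gamma values per BN layer.
    prune_ratio: fraction of all channels to drop (smallest |gamma| first).
    """
    all_abs = sorted(abs(g) for layer in bn_gammas for g in layer)
    if not all_abs:
        return []
    # Values below the threshold are pruned; ties with the threshold are kept.
    cut = min(int(len(all_abs) * prune_ratio), len(all_abs) - 1)
    threshold = all_abs[cut]
    masks = []
    for layer in bn_gammas:
        mask = [abs(g) >= threshold for g in layer]
        if not any(mask):
            # Never remove a whole layer: keep its largest channel.
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks.append(mask)
    return masks


gammas = [[0.9, 0.01, 0.5], [0.02, 0.7]]
print(channel_masks(gammas, prune_ratio=0.4))
# [[True, False, True], [False, True]]
```

Turning such masks into a smaller network still requires rebuilding each convolution with the kept channels, which is where the structure actually changes.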

how to finetune the pruned model?

When I train it with 'train.py', the output model is not pruned; it is as large as the original model.
Do I need a *.yaml for the pruned model? How do I get it?
Thanks.

Summary of common questions

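  1. Q: Why does the training mAP sometimes become 0?
    A: It depends on the case. In most cases I see, loss->0 while the val mAP->0, which is clearly overfitting. You can refer to the csdn article I wrote earlier.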

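  2. Q: torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
    A: This is a pitfall left by the original git. What you think model loading does: read the weights directly, restore the model, and build the gradient relationships. What model loading actually does: build a randomly initialized model from the yaml -> obtain the gradient relationships -> load the saved weights of each layer. The former is used for pure forward inference; the latter is used when gradient relationships are needed, for example for pruning. Also, keep the original project structure, and remember export PATH=$PWD.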

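  3. Q: What if the yolo5s pretrained weights are no longer available?
    A: I no longer have them either. But this git mainly uses the lighter mobilev2-yolo5s, so you can try that here. If you need yolov5, it is better to use their latest version; adapting the code against mine is quite easy. There is a network-drive link in the Readme that you can try.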

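  4. Q: Recall is low, or precision is low
    A: These depend on the threshold, so here I only look at mAP as the intermediate metric. If mAP drops during pruning or distillation, that is also quite normal; it depends on your dataset and your strategy. After all, every paper claims to be sota, so you need to judge for yourself.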

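  5. Q: Can the pruned model be saved as a yaml file? Can it be distilled or converted to ncnn?
    A: Yes, but there is no need for a yaml file. At that point you can just load the model directly; see the ft (finetune) mode. If it really bothers you, write a script that does the reverse. It is not hard and is good practice.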

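  6. Q: Details of model training, references for pruning and distillation, etc.
    A: Almost everything is written in the readme, and for details just read the code; it is not hard, so treat it as learning. To add: the pruning is Han Song's BN-layer-based pruning, and the distillation uses approaches aimed at one-stage detectors, two-stage detectors, classification networks, and so on. The details certainly differ from the original papers, but the differences are small; judge for yourself. Also, do not ask me why I did not use such-and-such method. I cannot run every experiment for you, and you can try it yourself. The methods I chose are representative and classic, but not necessarily the best.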

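  7. Q: yolov5 has some bugs with pruning and distillation here
    A: First, I have not specifically tested yolov5; its authors are still updating it, so there are bound to be bugs. Second, this project is mainly about producing a lighter model, but I can guarantee that if you follow my steps one by one you will be able to reproduce it. I even gave you the random seed >_<.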

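  8. Q: mobilev2-yolo5s converges slowly, but yolo5s is fast?
    A: Theirs was pretrained on coco and has twice the parameters and computation of mine, so its capacity is large. Mine only uses imagenet weights for the backbone, and the head is entirely randomly initialized, so slow convergence is normal, and lower accuracy from the smaller capacity is normal too. The lightweighting approach itself works from 4 angles: overall architecture design, pruning, distillation, and quantization. If you want to train yolov5x first and then make it lightweight, I have nothing against that; it is also an approach.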
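
Item 4 above says that recall and precision depend on the threshold. A small sketch of that dependence follows, using made-up detections; `precision_recall`, the detection list, and the target count are illustrative and not taken from the repository's test.py.

```python
def precision_recall(detections, num_targets, conf_thres):
    """Precision and recall after discarding detections below conf_thres.

    detections: list of (confidence, is_true_positive) pairs.
    """
    kept = [is_tp for conf, is_tp in detections if conf >= conf_thres]
    tp = sum(kept)  # True counts as 1
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_targets if num_targets else 0.0
    return precision, recall


dets = [(0.9, True), (0.8, True), (0.6, False),
        (0.4, True), (0.2, False), (0.1, True)]
for thres in (0.05, 0.3, 0.7):
    p, r = precision_recall(dets, num_targets=5, conf_thres=thres)
    print(f"conf_thres={thres}: P={p:.2f} R={r:.2f}")
# conf_thres=0.05: P=0.67 R=0.80
# conf_thres=0.3: P=0.75 R=0.60
# conf_thres=0.7: P=1.00 R=0.40
```

Raising the threshold trades recall for precision on the same detections, which is why a single P or R value says little without the threshold it was measured at.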

all processes

Is your pipeline as follows?
sparse learning --> prune --> finetune --> distill

RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

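Hello, I tried training with yolov5s as the model, using the command line
python train.py --type vocs
but when the validation set was evaluated after 1 epoch of training, it raised an error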

Traceback (most recent call last):
  File "train.py", line 802, in <module>
    train(hyp)
  File "train.py", line 453, in train
    results, maps, times = test.test(opt.data,
  File "/home/cwy/mobile-yolov5-pruning-distillation/test.py", line 112, in test
    inf_out, train_out = model(img, augment=augment)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 621, in forward
    outputs = self.parallel_apply(self._module_copies[:len(inputs)], inputs, kwargs)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 646, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
    x = m(x)  # run
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 86, in forward
    return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 21, in forward
    return self.act(self.bn(self.conv(x)))
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same

I have tried many approaches but still get the same error. How can this be solved?

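Strategy

If I follow your steps one by one, will the result be lighter than before, and will it use less memory? What pruning ratio is usually used, and which distillation strategy is usually chosen? I have somewhere over 30,000 to 40,000 images in total.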

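Some questions about distillation

Thanks to the author for the repo; I have really learned a lot. While reading the distillation experiments I ran into some questions I would like to ask the author about.

  1. Strategy1 Output Based Distillation: in the repo, the distillation losses for the Teacher part are all computed with MSELoss, whereas the paper still uses the same Loss as the original YOLO. Could this limit the tuning of the Student network?
  2. Strategy2 Feature Based + Output Based Distillation: the repo seems to train both together, controlled by parameters, whereas the paper first freezes the parameters after the Feature Map for Feature Distillation and then unfreezes them for global Distillation. Could this limit the effect of Strategy2 and lead to the lower final mAP? Also, the Convert in the paper should contain only a convolution, without the added BN and Relu operations.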
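
For readers unfamiliar with the MSE-based output distillation that question 1 describes, here is a minimal plain-Python sketch of the idea: the student's loss against the ground truth plus a weighted mean squared error toward the teacher's outputs. `mse`, `distill_total_loss`, `distill_weight`, and the numbers are hypothetical; the repository's actual loss terms and weighting may differ.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distill_total_loss(hard_loss, student_out, teacher_out, distill_weight=1.0):
    """Ground-truth loss plus a weighted MSE term toward the teacher."""
    return hard_loss + distill_weight * mse(student_out, teacher_out)


student = [0.2, 0.7, 0.1]   # illustrative student outputs
teacher = [0.1, 0.9, 0.0]   # illustrative teacher outputs
total = distill_total_loss(hard_loss=0.5, student_out=student,
                           teacher_out=teacher, distill_weight=2.0)
print(round(total, 4))  # 0.54
```

The MSE term pulls every student output toward the teacher equally, which is one way it differs from a detection loss that treats box, objectness, and class terms separately.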

Ran into a problem

RuntimeError: Given groups=1, weight of size [1, 255, 1, 1], expected input[1, 512, 8, 8] to have 255 channels, but got 512 channels instead
I ran mob-yolov5s.yaml and got this error!

About training

What VOC data format do you use for training? Does the data need to be converted to xywh format after downloading VOC?

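About downloading the weight files

When I run the first two training commands I get the following errors, and the remaining training commands fail in a similar way. How can I solve this? qaq
(Two screenshots of the errors were attached: QQ截图20210306102921, QQ截图20210306102941.)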

"render_img/before_pruning.jpg"

Thank you for sharing the code. When I run the pruning operation, there is a missing image. Can you provide me with this picture?
