taoyang1122 / mutualnet
[ECCV'20 Oral] MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Home Page: https://arxiv.org/abs/1909.12978
License: MIT License
https://github.com/taoyang1122/MutualNet/blob/master/train.py#L211
At this line, are you evaluating the model on the validation dataset? I don't think that is reasonable. :(
Thanks for your great work on MutualNet, especially the training strategy, which inspires me a lot.
However, at test time you evaluate all of the width-resolution configurations, from 1.0x-224 down to 0.25x-128, on a validation set and then pick the single best configuration from those results. At deployment time we cannot afford to evaluate every configuration on every sample, so this strategy wastes a lot of time.
Perhaps the best approach at test time is to let each input dynamically choose the best width-resolution configuration directly, just like MSDNet, mentioned in your paper, where each input dynamically finds the most suitable classifier to exit from, judged by a threshold.
So how could dynamic inference be implemented on MutualNet at test time?
Thanks.
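One way to approximate the MSDNet-style dynamic inference asked about above is to sweep the width-resolution configurations from cheapest to most expensive and stop as soon as the top-1 confidence clears a threshold. A minimal sketch, assuming a `run_config` callable that evaluates the network at one configuration; both it and the configuration list are hypothetical stand-ins, not part of the MutualNet codebase:

```python
# Threshold-based early exit over MutualNet width-resolution configurations.
# CONFIGS and run_config are illustrative assumptions, not repo code.
CONFIGS = [(0.25, 128), (0.5, 160), (0.75, 192), (1.0, 224)]  # cheapest first

def dynamic_predict(run_config, threshold=0.9):
    """Evaluate configurations cheapest-first; stop once the top-1
    confidence reaches `threshold`, mimicking MSDNet's exit rule."""
    last = None
    for width, resolution in CONFIGS:
        label, confidence = run_config(width, resolution)
        last = (label, confidence, (width, resolution))
        if confidence >= threshold:
            break  # early exit: skip the more expensive configurations
    return last

# Toy stand-in whose confidence grows with the computational budget.
def fake_run(width, resolution):
    return "cat", min(0.99, 0.5 + 0.4 * width)
```

With `threshold=0.8`, the toy model exits at the 0.75x-192 configuration; a real deployment would need the confidence to be calibrated (e.g. on a held-out set) for the threshold to be meaningful.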
Model profiling with width mult 1.0x:
Item     params        macs             nanosecs
Total    25,503,912    4,089,284,608    10,376,016
Model profiling with width mult 0.9x:
Item     params        macs             nanosecs
Total    25,503,912    3,338,064,192    10,670,845
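As a rough sanity check on these numbers (my own back-of-the-envelope reasoning, not from the repo): for convolution layers slimmed on both input and output channels, MACs scale approximately with the square of the width multiplier, so 0.9x should cost about 81% of the full model. The reported ratio is slightly higher, plausibly because some layers, such as the stem and the final classifier, are not slimmed on both sides; that last point is an assumption about this profiler's model:

```python
# Naive width^2 scaling check against the profiler's reported MACs.
full_macs = 4_089_284_608   # width 1.0x (reported)
slim_macs = 3_338_064_192   # width 0.9x (reported)

predicted = full_macs * 0.9 ** 2        # ~3.31e9 under the width^2 rule
observed_ratio = slim_macs / full_macs  # ~0.816, a bit above 0.81
```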
test_only
07/19 03:54:31 PM | Start testing.
07/19 03:54:34 PM | VAL 3.0s 1.0x-224 Epoch:-1/120 Loss:95113.9297 Acc:0.000
07/19 03:54:38 PM | VAL 6.2s 0.9x-224 Epoch:-1/120 Loss:313094.5938 Acc:0.000
07/19 03:54:41 PM | VAL 9.3s 1.0x-192 Epoch:-1/120 Loss:1569.4336 Acc:0.000
07/19 03:54:44 PM | VAL 12.2s 0.9x-192 Epoch:-1/120 Loss:7313.3169 Acc:0.000
07/19 03:54:47 PM | VAL 15.2s 1.0x-160 Epoch:-1/120 Loss:575.6573 Acc:0.000
07/19 03:54:50 PM | VAL 18.4s 0.9x-160 Epoch:-1/120 Loss:2952.8027 Acc:0.000
07/19 03:54:53 PM | VAL 21.2s 1.0x-128 Epoch:-1/120 Loss:1645.1794 Acc:0.000
07/19 03:54:55 PM | VAL 23.7s 0.9x-128 Epoch:-1/120 Loss:81482.6406 Acc:0.000
I used your pretrained ResNet-50 model to test on a subset of the ImageNet validation set, but the accuracy is 0.
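Not the author, but two frequent causes of all-zero accuracy with a released checkpoint are (a) a state_dict key mismatch, such as a `module.` prefix left by `DataParallel`, which can silently skip every weight if loading is non-strict, and (b) skipping the post-training BN recalibration that `ComputePostBN` performs for each configuration. A hedged sketch of the prefix fix; the checkpoint's key layout is an assumption, not confirmed for this model:

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from each key,
    so a checkpoint saved under DataParallel loads into a bare model."""
    return {k[len(prefix):] if k.startswith(prefix) else k: v
            for k, v in state_dict.items()}

# Assumed usage: model.load_state_dict(strip_prefix(torch.load(ckpt_path)))
```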
Dear Author,
I have been reading your paper and aim to apply the proposed MutualNet in my own research domain. I have several questions about the results. In particular, I am confused about how the performance gap between MutualNet and US-Net varies as the FLOPs decrease.
For example, Figure 4(a) clearly shows that as the FLOPs decrease, US-Net's accuracy drops sharply, while MutualNet declines more slowly and keeps a considerable advantage. I would like to know whether the gap between MutualNet and US-Net widens or narrows as the FLOPs decrease. Is there a clear tendency? Our understanding is that MutualNet behaves better at lower computational budgets. Is that right?
I would greatly appreciate any deeper insight into this. Many thanks for your kind help.
Hi, when downloading your pretrained models I found they are hosted on Google Drive, which I have no way to access from where I am. Could you please provide a mirror link that is downloadable from inside China?
I use Windows to train the model, but I get this problem.
File "D:\桌面\火灾大创\相关模型代码\MutualNet-目标检测等\MutualNet-master\ComputePostBN.py", line 44, in ComputeBN
for batch_idx, (inputs, targets) in enumerate(postloader):
File "E:\Anaconda3\envs\mutual\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
return _DataLoaderIter(self)
File "E:\Anaconda3\envs\mutual\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
w.start()
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_cifar..'
(E:\Anaconda3\envs\mutual) D:\桌面\火灾大创\相关模型代码\MutualNet-目标检测等\MutualNet-master>Traceback (most recent call last):
File "", line 1, in
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\spawn.py", line 99, in spawn_main
new_handle = reduction.steal_handle(parent_pid, pipe_handle)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\reduction.py", line 87, in steal_handle
_winapi.DUPLICATE_SAME_ACCESS | _winapi.DUPLICATE_CLOSE_SOURCE)
PermissionError: [WinError 5] Access is denied.
I have searched the internet, and people say:
With PyTorch's DataLoader, when num_workers is set to a non-zero value it runs fine on Linux, but on Windows you get AttributeError: Can't pickle local object 'Sharpen..create_matrices'. The cause of this exception is that the pickle module cannot serialize lambda functions.
On Unix/Linux, the multiprocessing module wraps the fork() call so that we do not need to care about fork()'s details. Windows has no fork call, so multiprocessing must "simulate" the effect of fork: every Python object in the parent process has to be serialized with pickle and passed to the child process. Therefore, if a multiprocessing call fails on Windows, first consider whether pickling failed.
I think the reason is multiprocessing, so I wanted to shut it down and changed data_loader_workers to 1, but it still does not work.
code below:
#dataset: imagenet1k
#data_transforms: imagenet1k_mobile
##dataset_dir: /home/ubuntu/yang/data/ImageNet/ILSVRC/Data/CLS-LOC
#dataset_dir: /home/ubuntu/yang/data/ImageNet/ILSVRC/Data/CLS-LOC
#data_loader_workers: 1
dataset: cifar100
test_only: True
pretrained: '../models/mobilenetv1.pt'
resume: ''
As you can see, my computer does not have enough memory, so I want to use cifar100, but it still does not work.
I have never used PyTorch before; sorry about that.
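For what it's worth, the usual Windows workarounds for this class of error are: guard the training entry point with `if __name__ == '__main__':`, set the DataLoader worker count to 0 rather than 1 (a single worker still spawns a process, so pickling still happens), and make any function handed to the dataset a top-level one instead of a lambda or closure. A minimal sketch under those assumptions; the helper names are illustrative, not from this repo:

```python
def identity_transform(x):
    # Top-level function: picklable under Windows' 'spawn' start method,
    # unlike a lambda or a function defined inside another function.
    return x

def loader_kwargs(on_windows=True):
    # num_workers=0 loads data in the main process and sidesteps the
    # spawn-based pickling entirely; note that 1 worker does NOT.
    return {"num_workers": 0 if on_windows else 4}

if __name__ == "__main__":  # required guard when child processes re-import this file
    kwargs = loader_kwargs(on_windows=True)
```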
Hi, it's really nice work! Could you please update the detection code?
Have you tried object detection training based on MobileNet?
Hi, I ran inference with the pretrained MobileNetV2 but encountered the following error. Any idea how to solve it?
07/15 07:10:27 PM | Start testing.
07/15 07:13:35 PM | VAL 188.3s 1.0x-224 Epoch:0/250 Loss:1.0739 Acc:0.730
07/15 07:16:41 PM | VAL 373.9s 0.9x-224 Epoch:0/250 Loss:1.2317 Acc:0.699
07/15 07:20:01 PM | VAL 574.2s 0.7x-224 Epoch:0/250 Loss:2.9944 Acc:0.377
Traceback (most recent call last):
File "train.py", line 241, in <module>
main()
File "train.py", line 237, in main
train_val_test()
File "train.py", line 213, in train_val_test
test(last_epoch, val_loader, model_wrapper, criterion, train_loader)
File "train.py", line 144, in test
model = ComputePostBN.ComputeBN(model, postloader, resolution)
File "/home/rll/MutualNet/ComputePostBN.py", line 47, in ComputeBN
_ = net(F.interpolate(img, (resolution, resolution), mode='bilinear', align_corners=True))
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/MutualNet/models/mobilenet_v2.py", line 120, in forward
x = self.features(x)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/pooling.py", line 554, in forward
self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: Given input size: (1280x6x6). Calculated output size: (1280x0x0). Output size is too small
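The trace suggests a fixed-kernel pooling layer, kernel 7 judging by the 1280x6x6 feature map, receiving an input smaller than its kernel: MobileNetV2 downsamples by 32x, so a 224 input yields a 7x7 final feature map, while a 192 input yields 6x6, and the crash plausibly occurs when the BN-recalibration sweep moves on to a 192-resolution configuration. The standard pooling size formula shows why the output collapses to 0x0 (the kernel size here is an inference from the log, not confirmed):

```python
# Pooling output size: floor((in + 2*pad - kernel) / stride) + 1.
# A 6x6 feature map into an (assumed) fixed 7x7 average pool gives 0,
# which matches the "Output size is too small" error above.
def pool_out(in_size, kernel, stride=None, pad=0):
    stride = stride if stride is not None else kernel
    return (in_size + 2 * pad - kernel) // stride + 1
```

A hedged fix is to replace the fixed pool with `nn.AdaptiveAvgPool2d(1)` (or a `x.mean([2, 3])` reduction), which works at any input resolution; whether that matches the intended MobileNetV2 configuration here is an assumption.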