taoyang1122 / mutualnet
[ECCV'20 Oral] MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Home Page: https://arxiv.org/abs/1909.12978
License: MIT License
https://github.com/taoyang1122/MutualNet/blob/master/train.py#L211
At this line, are you evaluating the model on the validation dataset? I don't think that is reasonable. :(
Thanks for your great work on MutualNet, especially the training strategy, which inspires me a lot.
However, at test time you evaluate all of the width-resolution configurations, from 1.0x-224 down to 0.25x-128, on a validation set and then pick the single best configuration from those results. At deployment time we cannot afford to evaluate every configuration on every sample, so this strategy wastes a lot of time.
Perhaps the best approach at test time is to let each input dynamically choose the best width-resolution configuration directly, just like MSDNet, mentioned in your paper, where each input dynamically finds the most suitable classifier to exit from, judged by a threshold.
So how could dynamic inference be implemented on MutualNet at test time?
Thanks.
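One way to approximate the MSDNet-style dynamic inference asked about above is to sweep the width-resolution configurations from cheapest to most expensive and stop as soon as the top-1 confidence clears a threshold. A minimal sketch, assuming a `run_config` callable that evaluates the network at one configuration; both it and the configuration list are hypothetical stand-ins, not part of the MutualNet codebase:

```python
# Threshold-based early exit over MutualNet width-resolution configurations.
# CONFIGS and run_config are illustrative assumptions, not repo code.
CONFIGS = [(0.25, 128), (0.5, 160), (0.75, 192), (1.0, 224)]  # cheapest first

def dynamic_predict(run_config, threshold=0.9):
    """Evaluate configurations cheapest-first; stop once the top-1
    confidence reaches `threshold`, mimicking MSDNet's exit rule."""
    last = None
    for width, resolution in CONFIGS:
        label, confidence = run_config(width, resolution)
        last = (label, confidence, (width, resolution))
        if confidence >= threshold:
            break  # early exit: skip the more expensive configurations
    return last

# Toy stand-in whose confidence grows with the computational budget.
def fake_run(width, resolution):
    return "cat", min(0.99, 0.5 + 0.4 * width)
```

With `threshold=0.8`, the toy model exits at the 0.75x-192 configuration; a real deployment would need the confidence to be calibrated (e.g. on a held-out set) for the threshold to be meaningful.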
Model profiling with width mult 1.0x:
Item     params        macs             nanosecs
Total    25,503,912    4,089,284,608    10,376,016
Model profiling with width mult 0.9x:
Item     params        macs             nanosecs
Total    25,503,912    3,338,064,192    10,670,845
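As a rough sanity check on these numbers (my own back-of-the-envelope reasoning, not from the repo): for convolution layers slimmed on both input and output channels, MACs scale approximately with the square of the width multiplier, so 0.9x should cost about 81% of the full model. The reported ratio is slightly higher, plausibly because some layers, such as the stem and the final classifier, are not slimmed on both sides; that last point is an assumption about this profiler's model:

```python
# Naive width^2 scaling check against the profiler's reported MACs.
full_macs = 4_089_284_608   # width 1.0x (reported)
slim_macs = 3_338_064_192   # width 0.9x (reported)

predicted = full_macs * 0.9 ** 2        # ~3.31e9 under the width^2 rule
observed_ratio = slim_macs / full_macs  # ~0.816, a bit above 0.81
```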
test_only
07/19 03:54:31 PM | Start testing.
07/19 03:54:34 PM | VAL 3.0s 1.0x-224 Epoch:-1/120 Loss:95113.9297 Acc:0.000
07/19 03:54:38 PM | VAL 6.2s 0.9x-224 Epoch:-1/120 Loss:313094.5938 Acc:0.000
07/19 03:54:41 PM | VAL 9.3s 1.0x-192 Epoch:-1/120 Loss:1569.4336 Acc:0.000
07/19 03:54:44 PM | VAL 12.2s 0.9x-192 Epoch:-1/120 Loss:7313.3169 Acc:0.000
07/19 03:54:47 PM | VAL 15.2s 1.0x-160 Epoch:-1/120 Loss:575.6573 Acc:0.000
07/19 03:54:50 PM | VAL 18.4s 0.9x-160 Epoch:-1/120 Loss:2952.8027 Acc:0.000
07/19 03:54:53 PM | VAL 21.2s 1.0x-128 Epoch:-1/120 Loss:1645.1794 Acc:0.000
07/19 03:54:55 PM | VAL 23.7s 0.9x-128 Epoch:-1/120 Loss:81482.6406 Acc:0.000
I used your pretrained ResNet-50 model to test on a subset of the ImageNet validation set, but the accuracy is 0.
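Not the author, but two frequent causes of all-zero accuracy with a released checkpoint are (a) a state_dict key mismatch, such as a `module.` prefix left by `DataParallel`, which can silently skip every weight if loading is non-strict, and (b) skipping the post-training BN recalibration that `ComputePostBN` performs for each configuration. A hedged sketch of the prefix fix; the checkpoint's key layout is an assumption, not confirmed for this model:

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from each key,
    so a checkpoint saved under DataParallel loads into a bare model."""
    return {k[len(prefix):] if k.startswith(prefix) else k: v
            for k, v in state_dict.items()}

# Assumed usage: model.load_state_dict(strip_prefix(torch.load(ckpt_path)))
```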
Dear Author,
I have been reading your paper and aim to apply the proposed MutualNet in my own research domain. I have several questions about the results. In particular, I am confused about how the performance gap between MutualNet and US-Net varies as the FLOPs decrease.
For example, Figure 4(a) clearly shows that as the FLOPs decrease, US-Net's accuracy drops sharply, while MutualNet declines more slowly and keeps a considerable advantage. I would like to know whether the gap between MutualNet and US-Net widens or narrows as the FLOPs decrease. Is there a clear tendency? Our understanding is that MutualNet behaves better at lower computational budgets. Is that right?
I would greatly appreciate any deeper insight into this. Many thanks for your kind help.
Hi, when downloading your pretrained models I found they are hosted on Google Drive, which I have no way to access from where I am. Could you please provide a mirror link that is downloadable from inside China?
I use Windows to train the model, but I get this problem.
File "D:\桌面\火灾大创\相关模型代码\MutualNet-目标检测等\MutualNet-master\ComputePostBN.py", line 44, in ComputeBN
for batch_idx, (inputs, targets) in enumerate(postloader):
File "E:\Anaconda3\envs\mutual\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
return _DataLoaderIter(self)
File "E:\Anaconda3\envs\mutual\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
w.start()
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_cifar..'
(E:\Anaconda3\envs\mutual) D:\桌面\火灾大创\相关模型代码\MutualNet-目标检测等\MutualNet-master>Traceback (most recent call last):
File "", line 1, in
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\spawn.py", line 99, in spawn_main
new_handle = reduction.steal_handle(parent_pid, pipe_handle)
File "E:\Anaconda3\envs\mutual\lib\multiprocessing\reduction.py", line 87, in steal_handle
_winapi.DUPLICATE_SAME_ACCESS | _winapi.DUPLICATE_CLOSE_SOURCE)
PermissionError: [WinError 5] Access is denied.
I have searched the internet, and people say:
With PyTorch's DataLoader, when num_workers is set to a non-zero value it runs fine on Linux, but on Windows you get AttributeError: Can't pickle local object 'Sharpen..create_matrices'. The cause of this exception is that the pickle module cannot serialize lambda functions.
On Unix/Linux, the multiprocessing module wraps the fork() call so that we do not need to care about fork()'s details. Windows has no fork call, so multiprocessing must "simulate" the effect of fork: every Python object in the parent process has to be serialized with pickle and passed to the child process. Therefore, if a multiprocessing call fails on Windows, first consider whether pickling failed.
I think the reason is multiprocessing, so I wanted to shut it down and changed data_loader_workers to 1, but it still does not work.
code below:
#dataset: imagenet1k
#data_transforms: imagenet1k_mobile
##dataset_dir: /home/ubuntu/yang/data/ImageNet/ILSVRC/Data/CLS-LOC
#dataset_dir: /home/ubuntu/yang/data/ImageNet/ILSVRC/Data/CLS-LOC
#data_loader_workers: 1
dataset: cifar100
test_only: True
pretrained: '../models/mobilenetv1.pt'
resume: ''
As you can see, my computer does not have enough memory, so I want to use cifar100, but it still does not work.
I have never used PyTorch before; sorry about that.
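For what it's worth, the usual Windows workarounds for this class of error are: guard the training entry point with `if __name__ == '__main__':`, set the DataLoader worker count to 0 rather than 1 (a single worker still spawns a process, so pickling still happens), and make any function handed to the dataset a top-level one instead of a lambda or closure. A minimal sketch under those assumptions; the helper names are illustrative, not from this repo:

```python
def identity_transform(x):
    # Top-level function: picklable under Windows' 'spawn' start method,
    # unlike a lambda or a function defined inside another function.
    return x

def loader_kwargs(on_windows=True):
    # num_workers=0 loads data in the main process and sidesteps the
    # spawn-based pickling entirely; note that 1 worker does NOT.
    return {"num_workers": 0 if on_windows else 4}

if __name__ == "__main__":  # required guard when child processes re-import this file
    kwargs = loader_kwargs(on_windows=True)
```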
Hi, it's really nice work! Could you please update the detection code?
Have you tried object detection training based on MobileNet?
Hi, I ran inference with the pretrained MobileNetV2 but encountered the following error. Any idea how to solve it?
07/15 07:10:27 PM | Start testing.
07/15 07:13:35 PM | VAL 188.3s 1.0x-224 Epoch:0/250 Loss:1.0739 Acc:0.730
07/15 07:16:41 PM | VAL 373.9s 0.9x-224 Epoch:0/250 Loss:1.2317 Acc:0.699
07/15 07:20:01 PM | VAL 574.2s 0.7x-224 Epoch:0/250 Loss:2.9944 Acc:0.377
Traceback (most recent call last):
File "train.py", line 241, in <module>
main()
File "train.py", line 237, in main
train_val_test()
File "train.py", line 213, in train_val_test
test(last_epoch, val_loader, model_wrapper, criterion, train_loader)
File "train.py", line 144, in test
model = ComputePostBN.ComputeBN(model, postloader, resolution)
File "/home/rll/MutualNet/ComputePostBN.py", line 47, in ComputeBN
_ = net(F.interpolate(img, (resolution, resolution), mode='bilinear', align_corners=True))
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/MutualNet/models/mobilenet_v2.py", line 120, in forward
x = self.features(x)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/rll/anaconda3/lib/python3.7/site-packages/torch/nn/modules/pooling.py", line 554, in forward
self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: Given input size: (1280x6x6). Calculated output size: (1280x0x0). Output size is too small
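The trace suggests a fixed-kernel pooling layer, kernel 7 judging by the 1280x6x6 feature map, receiving an input smaller than its kernel: MobileNetV2 downsamples by 32x, so a 224 input yields a 7x7 final feature map, while a 192 input yields 6x6, and the crash plausibly occurs when the BN-recalibration sweep moves on to a 192-resolution configuration. The standard pooling size formula shows why the output collapses to 0x0 (the kernel size here is an inference from the log, not confirmed):

```python
# Pooling output size: floor((in + 2*pad - kernel) / stride) + 1.
# A 6x6 feature map into an (assumed) fixed 7x7 average pool gives 0,
# which matches the "Output size is too small" error above.
def pool_out(in_size, kernel, stride=None, pad=0):
    stride = stride if stride is not None else kernel
    return (in_size + 2 * pad - kernel) // stride + 1
```

A hedged fix is to replace the fixed pool with `nn.AdaptiveAvgPool2d(1)` (or a `x.mean([2, 3])` reduction), which works at any input resolution; whether that matches the intended MobileNetV2 configuration here is an assumption.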