Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Home Page: https://arxiv.org/abs/1707.06168

License: MIT License

ICCV 2017, by Yihui He, Xiangyu Zhang and Jian Sun

Please have a look at our newer work on compressing deep models:

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV'18), which combines channel pruning and reinforcement learning to further accelerate CNNs. Code and models are available!
AddressNet: Shift-Based Primitives for Efficient Convolutional Neural Networks (WACV'19). We propose a family of efficient networks based on the shift operation.
MoBiNet: A Mobile Binary Network for Image Classification (WACV'20). Binarized MobileNets.

In this repository, we released code for the following models:

model	Speed-up	Accuracy
https://github.com/yihui-he/channel-pruning/releases/tag/channel_pruning_5x	5x	88.1 (Top-5), 67.8 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/VGG-16_3C4x	4x	89.9 (Top-5), 70.6 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/ResNet-50-2X	2x	90.8 (Top-5), 72.3 (Top-1)
https://github.com/yihui-he/channel-pruning/releases/tag/faster-RCNN-2X4X	2x	36.7 (AP@.50:.05:.95)
https://github.com/yihui-he/channel-pruning/releases/tag/faster-RCNN-2X4X	4x	35.1 (AP@.50:.05:.95)

The 3C method combines channel pruning with spatial decomposition (Speeding up Convolutional Neural Networks with Low Rank Expansions) and channel decomposition (Accelerating Very Deep Convolutional Networks for Classification and Detection), as described in Section 4.1.2 of the paper.

Citation

If you find the code useful in your research, please consider citing:

@InProceedings{He_2017_ICCV,
author = {He, Yihui and Zhang, Xiangyu and Sun, Jian},
title = {Channel Pruning for Accelerating Very Deep Neural Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}

Requirements

Python3 packages you might not have: scipy, sklearn, easydict. Use sudo pip3 install to install them.
For finetuning with a batch size of 128: 4 GPUs (~11 GB of memory)

Installation (sufficient for the demo)

Clone the repository

# Make sure to clone with --recursive
 git clone --recursive https://github.com/yihui-he/channel-pruning.git

Build my Caffe fork (which supports bicubic interpolation and resizing the image's shorter side to 256, then cropping to 224x224)

cd caffe

 # If you're experienced with Caffe and have all of the requirements installed, then simply do:
 make all -j8 && make pycaffe
 # Or follow the Caffe installation instructions here:
 # http://caffe.berkeleyvision.org/installation.html

 # You might need to add pycaffe to PYTHONPATH if you already have another Caffe installed

Download ImageNet classification dataset http://www.image-net.org/download-images
Specify imagenet source path in temp/vgg.prototxt (line 12 and 36)

Channel Pruning

For fast testing, you can directly download the pruned model (see the Pruned models section).

1. Download the original VGG-16 model http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel and move it to temp/vgg.caffemodel (or create a softlink instead)

Start Channel Pruning

python3 train.py -action c3 -caffe [GPU0]
 # or log it with ./run.sh python3 train.py -action c3 -caffe [GPU0]
 # replace [GPU0] with actual GPU device like 0,1 or 2

Combine some factorized layers for further compression, and calculate the acceleration ratio. Replace the ImageData layer of temp/cb_3c_3C4x_mem_bn_vgg.prototxt with the one from temp/vgg.prototxt (https://github.com/yihui-he/channel-pruning/blob/master/temp/vgg.prototxt#L1-L49), then run:

./combine.sh | xargs ./calflop.sh

Finetuning

caffe train -solver temp/solver.prototxt -weights temp/cb_3c_vgg.caffemodel -gpu [GPU0,GPU1,GPU2,GPU3]
 # replace [GPU0,GPU1,GPU2,GPU3] with actual GPU device like 0,1,2,3

Testing

Though testing is done while finetuning, you can test anytime with:
```
caffe test -model path/to/prototxt -weights path/to/caffemodel -iterations 5000 -gpu [GPU0]
 # replace [GPU0] with actual GPU device like 0,1 or 2
```
Pruned models (for download)

For fast testing, you can directly download the pruned models from the releases: VGG-16 3C 4X, VGG-16 5X, ResNet-50 2X. Alternatively, use the Baidu Yun download link.

Test with:

caffe test -model channel_pruning_VGG-16_3C4x.prototxt -weights channel_pruning_VGG-16_3C4x.caffemodel -iterations 5000 -gpu [GPU0]
# replace [GPU0] with actual GPU device like 0,1 or 2

Pruning faster RCNN

For fast testing, you can directly download the pruned models from the release. Or you can:
1. Clone my py-faster-rcnn repo: https://github.com/yihui-he/py-faster-rcnn
2. Use the pruned models from this repo to train Faster R-CNN 2X and 4X. The solver prototxts are in https://github.com/yihui-he/py-faster-rcnn/tree/master/models/pascal_voc

FAQ

You can find answers to some commonly asked questions in our GitHub wiki, or just create a new issue.

channel-pruning's Issues

difference compared to paper

I noticed that there are three steps in the R3 function of net. The first is VHDecompose, which the paper mentions as related work. The second is itq_decompose; what does it stand for? The third is lasso pruning, as reported in the paper.
Am I right? Thank you.

I use Python 2.7, so the build needs to be modified. I found that the Makefile contains the keyword 'python3.4'.

All are welcome to create issues, but please google the problem first, and make sure it has not already been reported.

What steps reproduce the bug?

What hardware and operating system/distribution are you running?

Operating system: ubuntu 14.04
CUDA version: 8.0
CUDNN version: 5.1
openCV version: 3.1
BLAS:
Python version: 2.7

If the bug is a crash, provide the backtrace.

Check failed: !lines_.empty() File is empty

$ python3 train.py -action c3 -caffe 0
no lighting pack
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0828 09:45:42.791590 17829 image_data_layer.cpp:53] Check failed: !lines_.empty() File is empty
*** Check failure stack trace: ***

$ python3 train.py -action c3 -caffe GPU0
no lighting pack
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0828 09:46:49.385843 17904 common.cpp:152] Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected
*** Check failure stack trace: ***

What should I do?

What's the meaning of ITQ_decompose?

Can anyone explain what the ITQ_decompose function means? I can't understand what it does or what it returns.

What does WPQ mean in the c3 function of train.py?

@yihui-he In the c3() function of train.py, there is this line of code (https://github.com/yihui-he/channel-pruning/blob/master/train.py#L73): WPQ, new_pt = net.R3(). I can't find a description of WPQ in your paper. Could you explain it?

make test error

Hi, thanks for your code. With the following configuration, I encounter an error in make test:
Operating system: linux
CUDA version: 8.0
CUDNN version: cudnn-8.0-linux-x64-v5.0
openCV version: 3.0
BLAS: yes
Python version: 3.4
The error is:
src/caffe/test/test_filter_layer.cpp:20:3: error: uninitialized member 'caffe::FilterLayerTest<caffe::CPUDevice >::blob_top_labels_' with 'const' type 'caffe::Blob* const' [-fpermissive]
I have no idea how to solve this problem. Thanks for your help.

GPU memory consume

All are welcome to create issues, but please google the problem first, and make sure it has not already been reported.

What steps reproduce the bug?

Hi,
I tested VGG-16 with the following command:
caffe test -model channel_pruning_VGG-16_3C4x.prototxt -weights channel_pruning_VGG-16_3C4x.caffemodel -iterations 5000 -gpu 0
Compared with the original VGG-16, I found that GPU memory usage increased.

GPU memory consumption measured on an NVIDIA GTX 1080:
channel_pruning_VGG-16_3C4x.caffemodel: 1773 MB
original_VGG-16.caffemodel: 1503 MB
PS: batch_size=10

Why does it increase so much?
Is that normal?

What hardware and operating system/distribution are you running?

Operating system: ubuntu16.04
CUDA version: 8.0
CUDNN version: 6.0
openCV version: 2.4.9
BLAS:
Python version: 3.5

If the bug is a crash, provide the backtrace.

This Pruning model cannot be downloaded.

This pruned model cannot be downloaded. What is the reason? Thank you.

How can I set cfgs for ResNet?

Can you share your cfgs file for ResNet?

faster-rcnn speed and accuracy

Hi, I am using your open-source Faster R-CNN VGG 2X and VGG 4X models and code, but I did not observe any acceleration. Is the accuracy in your table the actual measured accuracy? Why is it so much worse than the original VGG? Or did I overlook some details? I look forward to your answer, thank you.

Failed to include caffe_pb2

Hi Yihui,

I tried your code and ran into some problems.

After make -j8 and make pycaffe, I tried to run python3 train.py, but something went wrong with protobuf.
I changed the protobuf version, but the problem was still not solved.

Here is the problem:
When I tried protobuf 3.0.0 (b1, b2, b3, b4) or 3.1.0, the error message was:

Failed to include caffe_pb2, things might go wrong!
Traceback (most recent call last):
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/internal/python_message.py", line 1087, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/internal/python_message.py", line 1109, in InternalParse
    (tag_bytes, new_pos) = local_ReadTag(buffer, pos)
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/internal/decoder.py", line 181, in ReadTag
    while six.indexbytes(buffer, pos) & 0x80:
TypeError: unsupported operand type(s) for &: 'str' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 19, in <module>
    from lib.net import Net, load_layer, caffe_test
  File "/mnt/lustre/dutianyuan/channel-pruning/lib/net.py", line 7, in <module>
    import caffe
  File "/mnt/lustre/dutianyuan/channel-pruning/caffe/python/caffe/__init__.py", line 4, in <module>
    from .proto.caffe_pb2 import TRAIN, TEST
  File "/mnt/lustre/dutianyuan/channel-pruning/caffe/python/caffe/proto/caffe_pb2.py", line 799, in <module>
    options=_descriptor._ParseOptions(descriptor_pb2.FieldOptions(), '\020\001')),
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/descriptor.py", line 869, in _ParseOptions
    message.ParseFromString(string)
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/message.py", line 185, in ParseFromString
    self.MergeFromString(serialized)
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/internal/python_message.py", line 1093, in MergeFromString
    raise message_mod.DecodeError('Truncated message.')
google.protobuf.message.DecodeError: Truncated message.

When I changed protobuf to 3.2.0 / 3.3.0 / 3.4.0, the error message was:

Traceback (most recent call last):
File "train.py", line 19, in <module>
  from lib.net import Net, load_layer, caffe_test 
File "/mnt/lustre/dutianyuan/channel-pruning/lib/net.py", line 7, in <module>
  import caffe
File "/mnt/lustre/dutianyuan/channel-pruning/caffe/python/caffe/__init__.py", line 4, in <module>
  from .proto.caffe_pb2 import TRAIN, TEST
File "/mnt/lustre/dutianyuan/channel-pruning/caffe/python/caffe/proto/caffe_pb2.py", line 17, in <module>
  serialized_pb='\n\x0b\x63\x61\x66\x66\x65.proto\x12\x05\x63\x61\x66\x66\x65\"\x1c\n\tBlobShape\x12\x0f\n\x03\x64im\x18\x01 
.....with a lot of \x......
  File "/home/dutianyuan/anaconda3/lib/python3.5/site-packages/google/protobuf/descriptor.py", line 824, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: expected bytes, str found

When I first ran into this problem, my Python version was 3.4, while your default setting uses Python 3.5. So I installed Python 3.5 and Python 3.6 through Anaconda, and Python 3.6 through apt-get. Whatever the version, the problem was still not solved.

I hope you can tell me the specific versions used in your environment!
Thanks!
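
The first traceback above ends with `six.indexbytes(buffer, pos) & 0x80` failing on a `str`. Under Python 3, indexing `bytes` yields an `int`, while indexing `str` yields a one-character `str`, so both errors suggest that the serialized descriptor in `caffe_pb2.py` reached protobuf as text rather than bytes. The sketch below reproduces only that type difference and does not use protobuf. The variable names are illustrative, and the idea that `caffe_pb2.py` was generated by a protoc older than the installed Python runtime is an assumption, not a confirmed diagnosis.

```python
# The literal below is copied from the traceback (caffe_pb2.py, line 799).
as_text = '\020\001'    # str literal, as it appears in the generated file
as_bytes = b'\020\001'  # bytes literal, which the Python 3 decoder can index

# Indexing bytes gives an int, so the bit test used by the decoder works.
high_bit_set = bool(as_bytes[0] & 0x80)

# Indexing str gives a one-character str, so the same bit test fails.
try:
    as_text[0] & 0x80
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(high_bit_set)
print(error_message)
```

If this assumption holds, regenerating `caffe_pb2.py` with a protoc that matches the installed Python protobuf package would be the first thing to try.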

Request: Explanation on how to set the speed-up for each layer independently

I've been studying the code you kindly posted, but I'm still unsure how to set the speed-up factor for the layers. Is it possible to set a specific speed-up factor so that pruning modifies each layer differently? Could you please provide some indications on how to achieve this?

What should sourceval.txt and sourcetrain.txt contain?

source: "path/to/ilsvrc12/sourceval.txt"
and
source: "path/to/ilsvrc12/sourcetrain.txt"
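
For context, Caffe's ImageData layer reads a plain-text list in which each line holds an image path followed by an integer label, separated by a space. An empty list file is what produces the `Check failed: !lines_.empty() File is empty` error quoted in another issue. A minimal sketch of generating such a file follows. The helper name `write_image_list` and the example entries are hypothetical, and the mapping from ImageNet classes to label indices is assumed to be supplied by you.

```python
import os
import tempfile

def write_image_list(entries, out_path):
    """Write (image_path, label) pairs as 'path label' lines, one per image."""
    with open(out_path, "w") as f:
        for image_path, label in entries:
            f.write("{} {}\n".format(image_path, int(label)))

# Hypothetical entries; real paths and labels come from your ImageNet copy.
entries = [
    ("val/ILSVRC2012_val_00000001.JPEG", 65),
    ("val/ILSVRC2012_val_00000002.JPEG", 970),
]
out_path = os.path.join(tempfile.mkdtemp(), "sourceval.txt")
write_image_list(entries, out_path)

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines[0])
```

The resulting file path is what the `source:` field of the ImageData layer in temp/vgg.prototxt should point to.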

How is the pruned model sped up?

The fully connected layers are not actually pruned, and most of the parameters are not in the conv layers, so how is the speed of the pruned model improved?

The pruned models cannot be downloaded.

The pruned models cannot be downloaded. What is the reason? Thank you.

Can you share your model (GPU 2.5X)?

The pretrained model channel_pruning_VGG-16_3C4x is slow on GPU.
Can you share your 2.5X GPU model?

About the VGG16 baseline accuracy in paper

Thanks for open-sourcing your impressive work.
In your paper, the 1-view baseline for VGG16 is 89.9%. Is this result from the public caffemodel offered by the VGG group?

How can I set the channel ratios for CR?

In Section 4.1.2 of your paper:
"Remaining channels ratios for shallow layers (conv1_x to conv3_x) and deep layers (conv4_x)
is 1 : 1.5. conv5_x are not pruned"
I set a ratio of 3.0 for the shallow layers and 1.5 for conv4_x, but the accuracy dropped by 0.77.
Can you tell me how I should set the channel ratios?

Is the SVD method also used?

From the code, the 3X-4X acceleration of VGG16 seems to come not only from channel pruning but also from SVD. Is that true?

And how well does it work on SSD MobileNet?

make -j8, script, error

find: `examples/FSRCNN/Train/91': No such file or directory
find: `images': No such file or directory

pycaffe compiled, but the following error occurred:

python3 train.py -action c3 -gpu GPU0
no lighting pack
Traceback (most recent call last):
File "train.py", line 19, in <module>
from lib.net import Net, load_layer, caffe_test
File "/home/yq/work/face_class/prune/cpc/channel-pruning/lib/net.py", line 7, in <module>
import caffe
File "/home/yq/work/face_class/prune/cpc/channel-pruning/caffe/python/caffe/__init__.py", line 1, in <module>
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
File "/home/yq/work/face_class/prune/cpc/channel-pruning/caffe/python/caffe/pycaffe.py", line 13, in <module>
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ImportError: dynamic module does not define init function (PyInit__caffe)

make pruning error

Hi, I'm sorry to bother you. When I run python train.py -action c3 -caffe 0, I get the following error:

no lighting pack
[libprotobuf INFO google/protobuf/io/coded_stream.cc:610] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 553432081
Process Process-1:
Traceback (most recent call last):
File "/home/tang/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/tang/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/tang/channel-pruning/lib/worker.py", line 21, in job
ret = target(**kwargs)
File "train.py", line 26, in step0
net = Net(pt, model=model, noTF=1)
File "/home/tang/channel-pruning/lib/net.py", line 67, in __init__
self.net_param = NetBuilder(pt=pt)
File "/home/tang/channel-pruning/lib/builder.py", line 131, in __init__
pb2.text_format.Merge(f.read(), self.net)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 476, in Merge
descriptor_pool=descriptor_pool)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 526, in MergeLines
return parser.MergeLines(lines, message)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 559, in MergeLines
self._ParseOrMerge(lines, message)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 574, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 675, in _MergeField
merger(tokenizer, message, field)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 764, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 675, in _MergeField
merger(tokenizer, message, field)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 764, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 675, in _MergeField
merger(tokenizer, message, field)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 809, in _MergeScalarField
value = tokenizer.ConsumeString()
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 1151, in ConsumeString
the_bytes = self.ConsumeByteString()
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 1166, in ConsumeByteString
the_list = [self._ConsumeSingleByteString()]
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_format.py", line 1191, in _ConsumeSingleByteString
result = text_encoding.CUnescape(text[1:-1])
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_encoding.py", line 103, in CUnescape
result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
File "/home/tang/anaconda3/lib/python3.6/site-packages/protobuf-3.2.0-py3.6.egg/google/protobuf/text_encoding.py", line 103, in <genexpr>
result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
IndexError: list index out of range

Can you tell me what the problem is?
Thanks!

In the code there are several references to "gt" (e.g. gt_model, gt_pt, gt_feats), so it clearly refers to a Caffe model. But what does the name "gt" mean?

how to deal with the trained model of faster rcnn?(vgg)

undefined reference to 'nccl*****'

I have only one GPU, so I turned off NCCL (multi-GPU support). The error is as follows:

Makefile:631: recipe for target '.build_release/examples/mnist/convert_mnist_data.bin' failed
make: *** [.build_release/examples/mnist/convert_mnist_data.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `ncclGetUniqueId'
.build_release/lib/libcaffe.so: undefined reference to `ncclBcast'
.build_release/lib/libcaffe.so: undefined reference to `ncclAllReduce'
.build_release/lib/libcaffe.so: undefined reference to `ncclCommInitRank'
.build_release/lib/libcaffe.so: undefined reference to `ncclCommDestroy'
.build_release/lib/libcaffe.so: undefined reference to `ncclCommInitAll'
.build_release/lib/libcaffe.so: undefined reference to `ncclGetErrorString'
collect2: error: ld returned 1 exit status
Makefile:626: recipe for target '.build_release/tools/extract_features.bin' failed
make: *** [.build_release/tools/extract_features.bin] Error 1

experiment: error: unrecognized arguments: -gpu GPU0

python3 train.py -action c3 -gpu GPU0
usage: experiment [-h] [-tf TF_VIS] [-caffe CAFFE_VIS] [-action ACTION]
[-dic.keep DICDOTKEEP] [-dic.vh DICDOTVH]
[-dic.fitfc DICDOTFITFC] [-dic.rank_tol DICDOTRANK_TOL]
[-dic.afterconv DICDOTAFTERCONV]
[-dic.prepooling DICDOTPREPOOLING] [-dic.debug DICDOTDEBUG]
[-dic.option DICDOTOPTION]
[-dic.layeralpha DICDOTLAYERALPHA] [-dic.alter DICDOTALTER]
[-an.l1 ANDOTL1] [-an.l2 ANDOTL2] [-an.ratio ANDOTRATIO]
[-an.filter ANDOTFILTER] [-res.bn RESDOTBN]
[-res.short RESDOTSHORT] [-nPointsPerLayer NPOINTSPERLAYER]
[-nofc NOFC] [-ls LS] [-autodet AUTODET] [-fc_reg FC_REG]
[-kernelname KERNELNAME] [-data DATA] [-Action ACTION]
[-ntest NTEST] [-nBatches_fc NBATCHES_FC] [-solver SOLVER]
[-prototxt PROTOTXT] [-nBatches NBATCHES] [-model MODEL]
[-fc_ridge FC_RIDGE] [-nonlinear_fc NONLINEAR_FC]
[-frozen FROZEN] [-splitconvrelu SPLITCONVRELU]
[-weights WEIGHTS] [-mp MP] [-log LOG] [-shm SHM]
experiment: error: unrecognized arguments: -gpu GPU0
no lighting pack
How do I make this work?

train error

Hi, when I run python3 train.py -action c3 -caffe GPU0, I encounter the following error:
/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_gtk3agg.py:18: UserWarning: The Gtk3Agg backend is known to not work on Python 3.x with pycairo. Try installing cairocffi.
"The Gtk3Agg backend is known to not work on Python 3.x with pycairo. "
no lighting pack
Failed to include caffe_pb2, things might go wrong!
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1062, in MergeFromString
if self._InternalParse(serialized, 0, length) != length:
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1084, in InternalParse
(tag_bytes, new_pos) = local_ReadTag(buffer, pos)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/decoder.py", line 181, in ReadTag
while six.indexbytes(buffer, pos) & 0x80:
TypeError: unsupported operand type(s) for &: 'str' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 19, in <module>
from lib.net import Net, load_layer, caffe_test
File "/home/dl/file/lxw/channel_pruning/channel-pruning-master/lib/net.py", line 7, in <module>
import caffe
File "/home/dl/file/lxw/channel_pruning/channel-pruning-master/caffe/python/caffe/__init__.py", line 4, in <module>
from .proto.caffe_pb2 import TRAIN, TEST
File "/home/dl/file/lxw/channel_pruning/channel-pruning-master/caffe/python/caffe/proto/caffe_pb2.py", line 799, in <module>
options=_descriptor._ParseOptions(descriptor_pb2.FieldOptions(), '\020\001')),
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/descriptor.py", line 869, in _ParseOptions
message.ParseFromString(string)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/message.py", line 185, in ParseFromString
self.MergeFromString(serialized)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1068, in MergeFromString
raise message_mod.DecodeError('Truncated message.')
google.protobuf.message.DecodeError: Truncated message.

I have no idea how to solve this problem. Thanks for your help!

The original resnet prototxt?

Hi:
What's the difference between resnet50 and resnet 50 2X? Can you offer the original prototxt of resnet 50 2x? Thanks.

Is the channel pruning method suitable for Inception v1?

I want to prune Inception v1. Can channel pruning be applied to it?

What does this mean?

caffe test -model ****.prototxt -weights ****.caffemodel -iterations 5 -gpu GPU0
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >'
what(): bad lexical cast: source type value could not be interpreted as target
*** Aborted at 1503819237 (unix time) try "date -d @1503819237" if you are using GNU date ***
PC: @ 0x7f477dd6e1c7 gsignal
*** SIGABRT (@0x3ea00005f02) received by PID 24322 (TID 0x7f477fe20740) from PID 24322; stack trace: ***
@ 0x7f477dd6e250 (unknown)
@ 0x7f477dd6e1c7 gsignal
@ 0x7f477dd6fe2a abort
@ 0x7f477e3a9b7d __gnu_cxx::__verbose_terminate_handler()
@ 0x7f477e3a79c6 (unknown)
@ 0x7f477e3a7a11 std::terminate()
@ 0x7f477e3a7c29 __cxa_throw
@ 0x410154 boost::throw_exception<>()
@ 0x409612 get_gpus()
@ 0x40a29d test()
@ 0x4084e0 main
@ 0x7f477dd59ac0 __libc_start_main
@ 0x408d09 _start
@ 0x0 (unknown)
Aborted (core dumped)
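
The stack trace shows the abort inside get_gpus() with a bad lexical cast, which is consistent with `-gpu` receiving the literal placeholder GPU0 instead of a numeric device id such as 0. The sketch below mirrors that parsing rule in Python for illustration only. `parse_gpu_flag` is a hypothetical helper and not part of Caffe, and it assumes the flag is a comma-separated list of integer ids.

```python
def parse_gpu_flag(value):
    """Parse a comma-separated list of integer GPU ids, e.g. '0' or '0,1,2,3'."""
    ids = []
    for token in value.split(","):
        token = token.strip()
        if not token.isdigit():
            raise ValueError(
                "GPU id must be a non-negative integer, got {!r}".format(token))
        ids.append(int(token))
    return ids

print(parse_gpu_flag("0"))
print(parse_gpu_flag("0,1,2,3"))

# The README placeholder is not a valid id and has to be replaced by a number.
try:
    parse_gpu_flag("GPU0")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

In other words, the command should be run with `-gpu 0` (or another device number), as in the other issue that uses `-gpu 0` successfully.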

collision

Building with both make pycaffe and cmake .. && make caused a conflict, after which import caffe no longer worked.

Why isn't the ReLU layer considered in CR?

Sorry to trouble you.
When conv = 'conv2_1' and convnext = 'conv2_2',
in the CR step, X_name = 'conv2_1' and convnext = 'conv2_2'.
My question is: why not use X_name = self.bottom_names['conv2_2'][0], which would give X_name = 'relu2_1'?

Channel-pruning: Why deleted W2 weights are simply set to 0 (not removed; W2.shape remains the same) ?

In channel pruning, dictionary_kernels() gives the result of the feature map selection (i.e. indices, weights and bias):
https://github.com/yihui-he/channel-pruning/blob/c18d5ae4f8dd895e22d89a388f13630d1220876a/lib/net.py#L1440-L1445

Using the indices (idxs), the values of W2 that aren't selected are set to zero, while the selected ones are overwritten with W2.copy() (the shape of W2 remains the same).

But in case of W1, un-selected filters are completely vanished from the WPQ dictionary:
https://github.com/yihui-he/channel-pruning/blob/c18d5ae4f8dd895e22d89a388f13630d1220876a/lib/net.py#L1452-L1454

Why? I'm assuming that it is OK because in the next iteration convnext will be labeled as conv, and the weights that were set to zero will be selected by the lasso regression for removal (so it takes 2 iterations to remove a set of filters). Am I correct?
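
As background for this question, zeroing the unselected input channels of W2 gives exactly the same output as physically removing those channels from W2 together with the matching filters of W1. The sketch below shows this for two 1x1 convolutions written as plain matrices with a ReLU in between. All names and values are illustrative, and it says nothing about when the repository code actually shrinks W2.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def forward(W1, W2, x):
    """Two stacked 1x1 convs at a single spatial position."""
    return matvec(W2, relu(matvec(W1, x)))

W1 = [[1.0, -2.0, 0.5], [0.0, 1.0, 1.0], [2.0, 0.0, -1.0], [-1.0, 1.0, 0.0]]
W2 = [[1.0, 2.0, -1.0, 0.5], [0.0, -1.0, 3.0, 1.0]]
x = [1.0, 0.0, 1.0]
keep = [0, 2]  # indices of the selected intermediate channels

# Variant A: keep the shape of W2, zero the unselected input channels.
W2_zeroed = [[w if j in keep else 0.0 for j, w in enumerate(row)] for row in W2]

# Variant B: remove the unselected filters of W1 and input channels of W2.
W1_small = [W1[i] for i in keep]
W2_small = [[row[j] for j in keep] for row in W2]

y_zeroed = forward(W1, W2_zeroed, x)
y_small = forward(W1_small, W2_small, x)
print(y_zeroed, y_small)
```

Only variant B reduces computation, since the zeroed weights in variant A are still multiplied.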

Why did you prune ResNet-50 instead of ResNet-18 or ResNet-34?

As mentioned in the title.
My guess is that the more layers a network has, the greater the speed-up after pruning.

model download error

Hi, could you upload the models somewhere else? My network speed is poor and the model download fails. Can you provide a Baidu Cloud download link for the models? Thanks for your help.

About the ITQ?

Hi, could you please tell me what ITQ stands for? I'm not good at math, and I found nothing when I searched for it on the Internet. I saw only LASSO in your paper, but SVD, "ITQ" and LASSO in your code. Which one contributes the most? I hope to get your reply, thanks!

Inconsistent GPU acceleration ratio?

There is an inconsistent result with respect to the reference paper [53]:

The asym. (3d) model gives a 3.0× GPU speed-up in Table 8 of that paper,
but it shows a 0.96× GPU speed-up in Table 3 of your paper.

Why is there such a big difference?

How does your ResNet-50 achieve a 2X speed-up?

For example, your table lists ResNet-50 | 2X | 90.8 (Top-5), which sounds impossible.

Due to the structure constraints of ResNet-50, non-tensor layers (e.g., batch normalization and pooling layers) take up more than 40% of the inference time on GPU.

how to calculate the speedup in paper?

I'm reimplementing the paper, confused about the speedup setting. The paper compares different methods under different speedup settings like "2x, 4x, 5x". I'm curious about how to calculate that.

My understanding is, "2x" means GFLOPs of all conv layers before prune / GFLOPs of all conv layers after prune = 2, is this correct?
I downloaded the supplied caffemodels and net prototxts, and calculated the GFLOPs of VGG-16 3C4x and channel pruning 5X. Here is the result,
with a 224x224x3 image as input,

the total conv GFLOPs of original unpruned VGG16 net is 15.3466
the total conv GFLOPs of VGG-16 3C4x is 3.7440
the total conv GFLOPs of channel pruning 5X is 3.3917
That is, by my understanding the speedups are 15.3466/3.7440 = 4.1 (consistent with 4x) and 15.3466/3.3917 = 4.5. The latter is close to 5x, so it is roughly consistent. Is my understanding what the authors mean?
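
The ratios in this question can be reproduced with a short script. The sketch below counts the multiply-accumulate operations of the 3x3 conv layers of the standard VGG-16 for a 224x224x3 input (biases ignored, stride 1, padding 1, 2x2 pooling after each block), which matches the 15.3466 figure quoted above. The pruned totals are taken directly from the issue rather than recomputed, and all function and variable names are illustrative.

```python
def conv_macs(c_in, c_out, out_h, out_w, kernel=3):
    """Multiply-accumulates of one conv layer (bias ignored)."""
    return kernel * kernel * c_in * c_out * out_h * out_w

# Standard VGG-16 conv configuration: (output channels, layers) per block.
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]

def vgg16_conv_macs(input_size=224, input_channels=3):
    total, size, c_in = 0, input_size, input_channels
    for c_out, n_layers in VGG16_BLOCKS:
        for _ in range(n_layers):
            total += conv_macs(c_in, c_out, size, size)
            c_in = c_out
        size //= 2  # 2x2 max pooling after each block
    return total

baseline = vgg16_conv_macs() / 1e9
# Totals reported in the issue for the released pruned models.
pruned = {"VGG-16 3C4x": 3.7440, "channel pruning 5X": 3.3917}

print("baseline: {:.4f}".format(baseline))
for name, gmacs in pruned.items():
    print("{}: {:.1f}x".format(name, baseline / gmacs))
```

Note that this measures a theoretical reduction in conv computation; wall-clock speed-up on a given GPU can differ.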

Error when running python3 train.py -action c3 -caffe 0

Hello, I have no idea what is wrong. Thanks in advance.

done:
git clone --recursive https://github.com/yihui-he/channel-pruning.git
make -j8 && make pycaffe

fail:
python3 train.py -action c3 -caffe 0

stdout:

no lighting pack
Failed to include caffe_pb2, things might go wrong!
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1069, in MergeFromString
if self._InternalParse(serialized, 0, length) != length:
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1091, in InternalParse
(tag_bytes, new_pos) = local_ReadTag(buffer, pos)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/decoder.py", line 181, in ReadTag
while six.indexbytes(buffer, pos) & 0x80:
TypeError: unsupported operand type(s) for &: 'str' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 19, in <module>
from lib.net import Net, load_layer, caffe_test
File "/home/t-jinche/channel-pruning/lib/net.py", line 7, in <module>
import caffe
File "/home/t-jinche/channel-pruning/caffe/python/caffe/__init__.py", line 4, in <module>
from .proto.caffe_pb2 import TRAIN, TEST
File "/home/t-jinche/channel-pruning/caffe/python/caffe/proto/caffe_pb2.py", line 799, in <module>
options=_descriptor._ParseOptions(descriptor_pb2.FieldOptions(), '\020\001')),
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/descriptor.py", line 874, in _ParseOptions
message.ParseFromString(string)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/message.py", line 185, in ParseFromString
self.MergeFromString(serialized)
File "/usr/local/lib/python3.4/dist-packages/google/protobuf/internal/python_message.py", line 1075, in MergeFromString
raise message_mod.DecodeError('Truncated message.')
google.protobuf.message.DecodeError: Truncated message.

imageNet

Operating system: CentOS7
CUDA version: 8.0
CUDNN version: 6.0
openCV version: 2.4.5
BLAS: open
Python version: 3.6.1
I want to know whether the data at the following URL is suitable for use:
https://pan.baidu.com/s/1dDizyed#list/path=%2FImageNet

How to handle a dimension mismatch between f(x) and x in a ResNet residual module

E.g., F(x) = f(x) + x, where f(x) is the residual branch of the module.

Can this method be used to prune a ResNet model?
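
One way to keep both operands of F(x) = f(x) + x the same size, which I believe matches the paper's multi-branch discussion, is to leave the shortcut untouched, sample input channels only in front of the first convolution of the residual branch, and keep the output width of the last convolution unchanged. A minimal NumPy sketch under those assumptions (1x1 convolutions only; the sizes and the kept channel indices are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(weight, feat):
    """1x1 convolution as channel mixing: (out, in) x (in, H, W) -> (out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, feat)

C, H, W = 8, 4, 4                      # illustrative sizes, not from the repository
x = rng.standard_normal((C, H, W))     # block input, also used as the shortcut

keep = np.array([0, 2, 5])             # hypothetical channels selected by pruning
w_first = rng.standard_normal((6, keep.size))  # first conv sees only sampled channels
w_last = rng.standard_normal((C, 6))           # last conv still emits C channels

# Residual branch: sample channels, conv, ReLU, conv back to C channels.
f_x = conv1x1(w_last, np.maximum(conv1x1(w_first, x[keep]), 0.0))
out = f_x + x                          # shapes match, so the shortcut add is valid

print(out.shape)  # (8, 4, 4)
```

The shortcut keeps all C channels, so only the branch's internal widths shrink.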

so so

/usr/bin/ld: cannot find -lboost_python3

@yihui-he
All are welcome to create issues, but please google the problem first, and make sure it has not already been reported.

What steps reproduce the bug?

make clean && make -j32 && make pycaffe -j32

What hardware and operating system/distribution are you running?

Operating system: Ubuntu14.04
CUDA version: cuda-8.0
CUDNN version: 5.1
openCV version: 2.4.9
BLAS: mkl
Python version: python3.4

If the bug is a crash, provide the backtrace.

part error log:
AR -o .build_release/lib/libcaffe.a
LD -o .build_release/lib/libcaffe.so.1.0.0
/usr/bin/ld: cannot find -lboost_python3
collect2: error: ld returned 1 exit status
make: *** [.build_release/lib/libcaffe.so.1.0.0] Error 1
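
The linker error means no library named libboost_python3 is on the search path. Some distributions install the Python 3 Boost bindings under a different name, so it can help to list what is actually present. A small sketch with a hypothetical helper; the directories below are common Debian/Ubuntu locations and are an assumption, so adjust them for your system:

```python
import glob
import os

def find_boost_python_libs(lib_dirs):
    """Return sorted base names of libboost_python* shared objects (hypothetical helper)."""
    found = set()
    for lib_dir in lib_dirs:
        pattern = os.path.join(lib_dir, "libboost_python*.so*")
        for path in glob.glob(pattern):
            found.add(os.path.basename(path))
    return sorted(found)

# Assumed search locations; not taken from the issue.
DEFAULT_DIRS = ["/usr/lib", "/usr/lib/x86_64-linux-gnu", "/usr/local/lib"]

if __name__ == "__main__":
    for name in find_boost_python_libs(DEFAULT_DIRS):
        print(name)
```

If the output shows a differently named library (for example one with a -py34 suffix), pointing PYTHON_LIBRARIES in Caffe's Makefile.config at that name, or creating a libboost_python3.so symlink to it, is a commonly used workaround.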

Any pruned models available?

Hi Yihui,

Nice work on channel pruning! I am wondering whether there is any plan to release some pruned models. It would be really helpful for quick testing. :)

Which methods do these functions in the code correspond to?

I found that many function names make it hard to tell which method they implement.
For example, the "c3" action in train.py refers to these three methods:

Speeding up Convolutional Neural Networks with Low Rank Expansions
Accelerating Very Deep Convolutional Networks for Classification and Detection
Channel Pruning for Accelerating Very Deep Neural Networks, the Lasso one

But which function implements each of these methods?

Here is the list of functions whose corresponding method in the papers I can't identify:

YYT (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L61)
VH_decompose (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L85)
ITQ_decompose (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L163)
dictionary (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L386)
fc_kernel (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L636)
nonlinear_fc (https://github.com/yihui-he/channel-pruning/blob/master/lib/decompose.py#L671)

Could you explain these functions?
Thanks!

How to calculate the rank?

https://github.com/yihui-he/channel-pruning/blob/master/lib/net.py#L1301

The rankdic is set beforehand, and it appears to cover VGG-16 only.

The paper does not seem to mention any formula or theorem for calculating rankdic.

Was it determined experimentally?

If so, what are the ranks for ResNet-50?
The rankdic in the released ResNet-50 source code is the same as the one for VGG-16:
https://github.com/yihui-he/channel-pruning/releases/tag/ResNet-50-2X

thanks.

Inquiry: Pruning process without spatial decomposition?

Thanks again for sharing your code and releasing the VGG-16 5X pruned caffemodel. May I ask how much should the code be changed in order to produce this model (pruning only, no 3C included) ?

Can this be done by simply passing a different flag instead of "-action c3"? Thanks!!!! And sorry for asking but I'm sure a lot of people will want to know about this.

How can I conduct channel pruning only?

First, I really appreciate your help!
I have some questions:
How can I run channel pruning on its own?
What acceleration ratio does channel pruning alone achieve?
Why is channel_pruning_VGG-16_3C4x.caffemodel so big?

ImportError undefined symbol: _ZN2cv6imreadERKNS_6StringEi

File "train.py", line 19, in
from lib.net import Net, load_layer, caffe_test
File "/home/yq/work/face_class/prune/cpc/channel-pruning/lib/net.py", line 7, in
import caffe
File "/home/yq/work/face_class/prune/channel-pruning-master/caffe/python/caffe/__init__.py", line 1, in
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
File "/home/yq/work/face_class/prune/channel-pruning-master/caffe/python/caffe/pycaffe.py", line 13, in
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ImportError: /home/yq/work/face_class/prune/channel-pruning-master/caffe/python/caffe/../../build/lib/libcaffe.so.1.0.0-rc5: undefined symbol: _ZN2cv6imreadERKNS_6StringEi
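
The undefined symbol is a mangled C++ name. Decoding it shows which function the loader could not find. The sketch below is a deliberately partial decoder (it handles only the leading nested name of an Itanium-ABI symbol, and the helper name is hypothetical); a full tool such as c++filt also decodes the argument list:

```python
def nested_name(mangled):
    """Extract the qualified name from a symbol starting with _ZN (partial sketch)."""
    if not mangled.startswith("_ZN"):
        raise ValueError("only _ZN... nested names are handled")
    pos, parts = 3, []
    # Each component is encoded as <decimal length><identifier>.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return "::".join(parts)

symbol = "_ZN2cv6imreadERKNS_6StringEi"
print(nested_name(symbol))  # cv::imread
```

The remaining suffix encodes the arguments (const cv::String&, int). As far as I know that signature belongs to the OpenCV 3 API, so a plausible cause, not confirmed in this thread, is that libcaffe was compiled against OpenCV 3 headers but linked without the library that provides imread. Setting OPENCV_VERSION := 3 in Caffe's Makefile.config and rebuilding is the usual thing to try.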

