
DHP: Differentiable Meta Pruning via HyperNetworks

This is the official implementation of "DHP: Differentiable Meta Pruning via HyperNetworks".

Contents

  1. Introduction
  2. Contribution
  3. Methodology
  4. Dependencies
  5. Image Classification
  6. Image Restoration
  7. Results
  8. Reference
  9. Acknowledgements

Introduction

Network pruning has been the driving force behind the acceleration of neural networks and the alleviation of the model storage/transmission burden. With the advent of AutoML and neural architecture search (NAS), pruning has become topical, with automatic mechanisms and search-based architecture optimization. Yet, current automatic designs rely on either reinforcement learning or evolutionary algorithms. Due to the non-differentiability of those algorithms, the pruning algorithm needs a long searching stage before reaching convergence.

To circumvent this problem, this paper introduces a differentiable pruning method via hypernetworks for automatic network pruning. The specifically designed hypernetworks take latent vectors as input and generate the weight parameters of the backbone network. The latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for pruning those layers. By enforcing ℓ1 sparsity regularization on the latent vectors and utilizing a proximal gradient solver, sparse latent vectors can be obtained. Passing the sparsified latent vectors through the hypernetworks, the corresponding slices of the generated weight parameters can be removed, achieving the effect of network pruning. The latent vectors of all the layers are pruned together, resulting in an automatic layer configuration. Extensive experiments are conducted on various networks for image classification, single image super-resolution, and denoising, and the results validate the proposed method.
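In code, the proximal step for an ℓ1-regularized latent vector reduces to soft-thresholding. Below is a minimal PyTorch sketch of one pruning-stage update, assuming the gradients of the task loss have already been populated; prox_l1 and proximal_update are illustrative names, not the repository's actual API:

import torch

def prox_l1(z, threshold):
    # Soft-thresholding: the closed-form proximal operator of threshold * ||z||_1.
    return torch.sign(z) * torch.clamp(z.abs() - threshold, min=0.0)

def proximal_update(latent_vectors, lr, reg):
    # One pruning-stage step: gradient descent on the task loss, then a
    # proximal step that drives individual latent elements exactly to zero.
    with torch.no_grad():
        for z in latent_vectors:
            z -= lr * z.grad               # gradient step
            z.copy_(prox_l1(z, lr * reg))  # sparsifying proximal step
            z.grad.zero_()

Because the proximal operator has a closed form, no reinforcement-learning or evolutionary search loop is needed, which is what shortens the search stage.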

Contribution

1. A new hypernetwork architecture is designed. Different from the classical hypernetwork composed of linear layers, the new design is tailored to automatic network pruning. By operating only on the input of the hypernetwork, the backbone network can be pruned.

2. A differentiable automatic network pruning method is proposed. The differentiability comes from the designed hypernetwork and the utilized proximal gradient, and it accelerates the convergence of the pruning algorithm.

3. Through experiments on various vision tasks and modern convolutional neural networks (CNNs), the potential of automatic network pruning as fine-grained architecture search is revealed.

Methodology

The workflow of the proposed differentiable pruning method: the latent vectors z attached to the convolutional layers act as handles for network pruning. The hypernetwork takes two latent vectors as input and emits its output as the weight of the backbone layer. ℓ1 sparsity regularization is enforced on the latent vectors. The differentiability comes from the hypernetwork tailored to pruning and the proximal gradient exploited to solve the problem. After the pruning stage, sparse latent vectors are obtained, which result in pruned weights after being passed through the hypernetwork.

Illustration of the hypernetwork designed for network pruning. It generates a weight tensor by passing the input latent vector through the latent layer, the embedding layer, and the explicit layer. If an element in the latent vector of the current layer is pruned, the corresponding slice of the output tensor is also pruned.
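As a rough sketch, the three stages can be written in PyTorch as follows. All names, dimensions, and the exact forms of the latent, embedding, and explicit layers below are illustrative assumptions rather than the repository's implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class PruningHyperConv(nn.Module):
    # Sketch: generate the weight of a c_out x c_in x k x k convolution
    # from two latent vectors, z_out (this layer) and z_in (the previous layer).
    def __init__(self, c_in, c_out, k=3, embed_dim=8):
        super().__init__()
        self.k = k
        self.z_out = nn.Parameter(torch.randn(c_out))      # pruning handle of this layer
        self.z_in = nn.Parameter(torch.randn(c_in))        # pruning handle of the previous layer
        self.embed = nn.Parameter(torch.randn(embed_dim))  # embedding-layer parameters
        # bias=False so that a zeroed latent element yields an exactly-zero kernel slice.
        self.explicit = nn.Linear(embed_dim, k * k, bias=False)

    def forward(self, x):
        # Latent layer: couple the two latent vectors, one scalar per (out, in) pair.
        latent = self.z_out[:, None] * self.z_in[None, :]   # (c_out, c_in)
        # Embedding layer: lift each scalar to an embedding vector.
        emb = latent.unsqueeze(-1) * self.embed             # (c_out, c_in, embed_dim)
        # Explicit layer: emit one k x k kernel slice per (out, in) pair.
        w = self.explicit(emb).view(len(self.z_out), len(self.z_in), self.k, self.k)
        return F.conv2d(x, w, padding=self.k // 2)

Since the generated weight is linear in z_out along its first dimension, setting z_out[j] to zero zeroes the entire j-th output slice of w; once the sparse latent vectors converge, those slices, together with the matching entries of the next layer's z_in, can simply be removed.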

Dependencies

  • Python 3.7.4
  • PyTorch >= 1.2.0
  • numpy
  • matplotlib
  • tqdm
  • scikit-image
  • easydict
  • IPython

Image Classification

For the details on image classification, please refer to classification.

Image Restoration

For the details on image restoration, please refer to restoration.

Results

For the detailed quantitative results, please refer to the paper.

Reference

If you find our work useful in your research, please consider citing:

@inproceedings{li2020dhp,
  title={DHP: Differentiable Meta Pruning via HyperNetworks},
  author={Li, Yawei and Gu, Shuhang and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2020}
}

Acknowledgements

This work was partly supported by the ETH Zurich Fund (OK), a Huawei Technologies Oy (Finland) project, an Amazon AWS grant, and an Nvidia grant.

This repository is based on our former papers Filter Basis and Group Sparsity. If you are interested, please refer to:

@inproceedings{li2020group,
  title={Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression},
  author={Li, Yawei and Gu, Shuhang and Mayer, Christoph and Van Gool, Luc and Timofte, Radu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

@inproceedings{li2019learning,
  title={Learning Filter Basis for Convolutional Neural Network Compression},
  author={Li, Yawei and Gu, Shuhang and Van Gool, Luc and Timofte, Radu},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2019}
}


Issues

Cannot reproduce the performance reported in the paper

Thanks for sharing the code!
I followed the instructions in the README:

  1. When running MobileNetV2, the code seems to be missing and an error is reported.
  2. When running MobileNetV1, the best error I get is 52.61, while the paper reports 51.63 for DHP-50 on Tiny-ImageNet.

Please help me with question 1.
Also, I suspect the gap in question 2 is just normal run-to-run fluctuation. Is that right?

How can I set up the datasets for super-resolution?

Thanks for sharing!
There are some important issues I want to ask about, and I look forward to your reply.

  1. DIV2KSUB is a subset that is not mentioned anywhere.

  2. The gray version of DIV2K is not mentioned.

  3. When using DIV2K as the training set, the code appears to load the validation set but not the training set. Am I missing something?

  4. Many .npy files seem to be used by the code but are not available.

Thanks a lot!
