
fasterai1's Introduction

fasterai

FasterAI: A repository for making smaller and faster models with the FastAI library.

Fasterai now exists here for fastai 2.0!

It allows you to use techniques such as:

  • Knowledge Distillation
  • Pruning
  • Batch Normalization Folding
  • Matrix Decomposition

Usage:

Knowledge Distillation

Used as a callback to make the student model train on soft labels generated by a teacher model.

 KnowledgeDistillation(student:Learner, teacher:Learner)

You only need to pass your student learner and your teacher learner to the callback. Behind the scenes, fasterai takes care of training your model with knowledge distillation.
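
A minimal sketch of what this can look like end-to-end, assuming the fastai v1 API this repository targets; the dataset (MNIST_SAMPLE) and backbones (resnet34 as teacher, resnet18 as student) are illustrative choices, not prescribed by fasterai:

from fastai.vision import *

# Illustrative data (fastai v1 API); any DataBunch works here.
data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_SAMPLE))

# Train (or load) the large teacher first.
teacher = cnn_learner(data, models.resnet34, metrics=accuracy)
teacher.fit_one_cycle(3)

# The smaller student then trains on the teacher's soft labels via the callback.
student = cnn_learner(data, models.resnet18, metrics=accuracy)
student.fit_one_cycle(3, callbacks=[KnowledgeDistillation(student, teacher)])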


Sparsify the network

Used as a callback, it iteratively replaces the lowest-norm parameters with zeroes. More information in this blog post. A usage sketch follows the parameter list below.

SparsifyCallback(learn, sparsity, granularity, method, criteria, sched_func)
  • sparsity: the percentage of sparsity that you want in your network
  • granularity: the granularity at which the sparsification operates (currently supported: weight, kernel, filter)
  • method: either local or global; parameters are selected in each layer independently (local) or across the whole network (global)
  • criteria: the criteria used to select which parameters to remove (currently supported: l1, grad)
  • sched_func: the schedule to follow for the sparsification (currently supported: any fastai scheduling function, e.g. annealing_linear, annealing_cos, ..., as well as annealing_gradual, the schedule proposed by Zhu & Gupta)
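
A sketch of a complete sparsifying run, continuing the learner style above; the hyperparameter values are illustrative, and the arguments are passed positionally to match the signature shown above (annealing_cos is fastai v1's cosine schedule):

# Remove 50% of the filters, selected per layer ('local') by L1-norm,
# following a cosine schedule over the 10 epochs.
learn = cnn_learner(data, models.vgg16_bn, metrics=accuracy)
sp_cb = SparsifyCallback(learn, 50, 'filter', 'local', 'l1', annealing_cos)
learn.fit_one_cycle(10, callbacks=[sp_cb])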

Prune the network

Will physically remove the parameters zeroed out in the previous step. More information in this blog post

Warning: this only works after filter-level sparsification has been performed, and only for fully feed-forward architectures such as VGG16.

pruner = Pruner()
pruned_model = pruner.prune_model(learn.model)

You just need to pass the model whose filters have previously been sparsified, and FasterAI will take care of removing them.
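
A sketch continuing from the sparsified VGG16 learner above; the parameter counts are printed only to make the compression visible:

count_params = lambda m: sum(p.numel() for p in m.parameters())
print(f'parameters before pruning: {count_params(learn.model)}')

pruner = Pruner()
pruned_model = pruner.prune_model(learn.model)
print(f'parameters after pruning:  {count_params(pruned_model)}')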


Batch Normalization Folding

Will remove batch normalization layers by injecting their normalization statistics (mean and variance) into the preceding convolutional layer. More information in this blog post

bn_folder = BN_Folder()
bn_folder.fold(learn.model)

Again, you only need to pass your model and FasterAI takes care of the rest. For models built with nn.Sequential, you don't need to change anything. For other models, if you want to see a speedup and compression, you need to subclass your model to remove the batch norm layers from the parameters and from the forward method of your network.
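
A quick sanity check you can run after folding (a sketch, assuming fold returns the folded model as in the snippet above): the folded network should compute the same function as the original, up to floating-point error, with its BatchNorm layers gone:

import torch

model = learn.model.eval()        # eval mode so BN uses its running statistics
folded = BN_Folder().fold(model)

x = torch.randn(1, 3, 224, 224)   # input shape is illustrative
with torch.no_grad():
    assert torch.allclose(model(x), folded(x), atol=1e-5)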


Fully-Connected Layers Decomposition

Will replace fully-connected layers with a factorized version that is more parameter-efficient.

FCD = FCDecomposer()
decomposed_model = FCD.decompose(model, percent_removed)

The percent_removed argument corresponds to the percentage of singular values removed (i.e., the smallest singular values are dropped).
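
To make the factorization concrete, here is a hand-rolled sketch of the underlying idea (truncated SVD), not the FCDecomposer internals; percent_removed is assumed to be a fraction in [0, 1]:

import torch
import torch.nn as nn

def factorize(fc: nn.Linear, percent_removed: float) -> nn.Sequential:
    # W = U @ diag(S) @ V.t(); keep only the k largest singular values.
    U, S, V = torch.svd(fc.weight.data)
    k = int(S.numel() * (1 - percent_removed))
    first = nn.Linear(fc.in_features, k, bias=False)                  # computes diag(S_k) V_k^T x
    second = nn.Linear(k, fc.out_features, bias=fc.bias is not None)  # computes U_k (.) + b
    first.weight.data = (V[:, :k] * S[:k]).t()
    second.weight.data = U[:, :k]
    if fc.bias is not None:
        second.bias.data = fc.bias.data
    return nn.Sequential(first, second)

Note that the factorization only saves parameters when k is small enough: for a square 1024x1024 layer the break-even point is k = 512, so e.g. percent_removed = 0.8 (k ≈ 204) cuts the parameter count by roughly 2.5x.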

fasterai1's People

Contributors

nathanhubens


fasterai1's Issues

IndexError: index out of range in self

nxt_filters_keep = nxt_filters.index_select(1, ixs[0]).data

I don't know what the reason behind this error is; I tried to debug it but couldn't get to the core of it.

@nathanhubens Can you please take a look into this? I tried the earlier version of the pruner with my previous experiments and it seemed to work well. I think this also has something to do with my architecture. For reference, I am trying to build a super-resolution architecture which looks like this:

import math  # needed by _initialize_weights below
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init
   
class depthwise_conv(nn.Module):
    def __init__(self, nin, kernel_size, padding, stride=1, dilation=1):
        super(depthwise_conv, self).__init__()
        
        self.depthwise = nn.Conv2d(nin, nin, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=nin)
        
    def forward(self, x):
        out = self.depthwise(x)
        return out
    

class dw_block(nn.Module):
    def __init__(self, nin, kernel_size, padding=1, stride=1, dilation=1):
        super(dw_block, self).__init__()
        
        self.dw_block = nn.Sequential(
            # Pass by keyword: depthwise_conv's signature is (nin, kernel_size, padding, stride, dilation).
            depthwise_conv(nin, kernel_size, padding=padding, stride=stride, dilation=dilation),
        )
        
    def forward(self, x):
        out = self.dw_block(x)
        return out
    

class pointwise_conv(nn.Module):
    def __init__(self, nin, nout, padding=0, stride=1):
        super(pointwise_conv, self).__init__()
        
        self.pointwise_block = nn.Sequential(
            nn.Conv2d(nin, nout, kernel_size=1, stride=stride, padding=padding),
        )
        
    def forward(self, x):
        out = self.pointwise_block(x)
        return out
    
    

class SuperRes(nn.Module):
    def __init__(self, scale_factor=3, num_channels=1, d=32, s=12, m=4):
        super(SuperRes, self).__init__()
        
        self.first_part = nn.Sequential(
            nn.Conv2d(num_channels, d, kernel_size=5, padding=5//2),
            nn.PReLU(d)
            
        )
        
        self.mid_part = [nn.Conv2d(d, s, kernel_size=1), nn.PReLU(s)]
        
        for _ in range(m):
            self.mid_part.extend([nn.Conv2d(s, s, kernel_size=3, padding=2, dilation=2), nn.PReLU(s)])
            
        self.mid_part.extend([nn.Conv2d(s, d, kernel_size=1), nn.PReLU(d)])
        
        self.mid_part = nn.Sequential(*self.mid_part)
        
        #self.last_part = nn.ConvTranspose2d(d, num_channels, kernel_size=9, stride=scale_factor, padding=9//2,
                                            #output_padding=scale_factor-1)
         
        self.dp1 = nn.Sequential(
            dw_block(32, kernel_size=3, dilation=2),
            nn.PReLU(32),
            pointwise_conv(nin = 32, nout = 24),
            nn.PReLU(24),
            
            dw_block(24, kernel_size=3, dilation=2),
            nn.PReLU(24),
            pointwise_conv(nin = 24, nout = 16),
            nn.PReLU(16),
            
            dw_block(16, kernel_size=3),
            nn.PReLU(16),
            pointwise_conv(nin = 16, nout = 8),
            nn.PReLU(8),
            
            dw_block(8, kernel_size=3),
            nn.PReLU(8),
            pointwise_conv(nin = 8, nout = 16),
            nn.PReLU(16),
            
            dw_block(16, kernel_size=5),
            nn.PReLU(16),
            pointwise_conv(nin = 16, nout = 24, padding=2),
            nn.PReLU(24),
            
            dw_block(24, kernel_size=5),
            nn.PReLU(24),
            pointwise_conv(nin = 24, nout = 32, padding=2),      # PADDING = 2 here
            nn.PReLU(32),
            
        )
        self.conv = nn.Conv2d(32, 9, 3, 1, 1)
        self.last_part = nn.PixelShuffle(scale_factor)

        self._initialize_weights()
       
    def _initialize_weights(self):
        
        for m in self.first_part:
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight.data, mean=0.0, std=math.sqrt(2/(m.out_channels*m.weight.data[0][0].numel())))
                nn.init.zeros_(m.bias.data)
                
        for m in self.mid_part:
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight.data, mean=0.0, std=math.sqrt(2/(m.out_channels*m.weight.data[0][0].numel())))
                nn.init.zeros_(m.bias.data)
        
        for m in self.dp1:
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight.data, mean=0.0, std=math.sqrt(2/(m.out_channels*m.weight.data[0][0].numel())))
                nn.init.zeros_(m.bias.data)

    def forward(self, x):

        global_residual = x
        x1 = self.first_part(x)
        x2 = self.mid_part(x1)
        x3 = self.dp1(x2)
        x4 = x3 + x1                        
        x = self.conv(x4)
        x = x + global_residual           
        x = self.last_part(x)

        return x
   
if __name__ == "__main__":
    model = SuperRes()
    print(model)
