Learnable mOdUle for Pooling fEatures (LOUPE) TensorFlow Toolbox

LOUPE is a TensorFlow toolbox that efficiently implements several learnable pooling methods, such as NetVLAD[1], NetRVLAD[2], NetFV[2] and Soft-DBoW[2], as well as the Context Gating activation from:

Antoine Miech and Ivan Laptev and Josef Sivic, Learnable pooling with Context Gating for video classification, arXiv:1706.06905, 2017.

It was initially used by the winning approach of the YouTube-8M Large-Scale Video Understanding Challenge on Kaggle: https://www.kaggle.com/c/youtube8m. However, we think these are general pooling approaches that can be used in applications other than video representation, which is why we decided to create this small TensorFlow toolbox.

Usage example

Creating a NetVLAD block:

import loupe as lp

'''
Creating a NetVLAD layer with the following inputs:

feature_size: the dimensionality of the input features
max_samples: the maximum number of features per list
cluster_size: the number of clusters
output_dim: the dimensionality of the pooled features after
    dimension reduction
gating: if True, adds a Context Gating layer on top of the
    pooled representation
add_batch_norm: if True, adds batch normalization during training
is_training: if True, the graph is in training mode
'''
NetVLAD = lp.NetVLAD(feature_size=1024, max_samples=100, cluster_size=64, 
                     output_dim=1024, gating=True, add_batch_norm=True,
                     is_training=True)



'''
Forward pass of the pooling architecture with
tensor_input: A tensor of shape:
'batch_size'x'max_samples'x'feature_size'
tensor_output: The pooled representation of shape:
'batch_size'x'output_dim'
'''
tensor_output = NetVLAD.forward(tensor_input)

NetRVLAD, NetFV and Soft-DBoW are used in the same way.
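To make the shapes above concrete, here is a minimal NumPy sketch of NetVLAD-style pooling followed by Context Gating. It is not LOUPE's implementation: it omits batch normalization and training, uses random weights, and every name in it (`netvlad_pool`, `assign_w`, `centers`, `proj_w`, `gate_w`, ...) is hypothetical and chosen for illustration only. It assumes the usual formulation: soft assignment of each feature to clusters, aggregation of residuals to the cluster centers, intra-normalization, L2 normalization, a linear projection to 'output_dim', and a sigmoid gate applied element-wise.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(v, axis, eps=1e-12):
    norm = np.linalg.norm(v, axis=axis, keepdims=True)
    return v / np.maximum(norm, eps)

def netvlad_pool(x, assign_w, assign_b, centers, proj_w, gate_w=None, gate_b=None):
    """Illustrative NetVLAD-style pooling (hypothetical helper, not LOUPE's API).

    x:        [batch_size, max_samples, feature_size]
    assign_w: [feature_size, cluster_size]
    assign_b: [cluster_size]
    centers:  [cluster_size, feature_size]
    proj_w:   [cluster_size * feature_size, output_dim]
    gate_w:   [output_dim, output_dim] (optional Context Gating weights)
    gate_b:   [output_dim]
    returns:  [batch_size, output_dim]
    """
    # Soft-assign every feature to every cluster.
    a = softmax(x @ assign_w + assign_b)                       # [B, N, K]
    # Sum of assignment-weighted residuals (x_i - c_k) per cluster.
    weighted_sum = np.einsum('bnk,bnd->bkd', a, x)             # [B, K, D]
    vlad = weighted_sum - a.sum(axis=1)[..., None] * centers   # [B, K, D]
    # Intra-normalization per cluster, then global L2 normalization.
    vlad = l2_normalize(vlad, axis=2)
    vlad = l2_normalize(vlad.reshape(len(x), -1), axis=1)      # [B, K*D]
    # Dimension reduction to output_dim.
    out = vlad @ proj_w                                        # [B, output_dim]
    if gate_w is not None:
        # Context Gating: element-wise sigmoid gate on the pooled vector.
        gates = 1.0 / (1.0 + np.exp(-(out @ gate_w + gate_b)))
        out = gates * out
    return out

# Small sizes so the example runs instantly.
rng = np.random.default_rng(0)
batch_size, max_samples, feature_size = 2, 10, 8
cluster_size, output_dim = 4, 6

tensor_input = rng.normal(size=(batch_size, max_samples, feature_size))
pooled = netvlad_pool(
    tensor_input,
    assign_w=rng.normal(size=(feature_size, cluster_size)),
    assign_b=np.zeros(cluster_size),
    centers=rng.normal(size=(cluster_size, feature_size)),
    proj_w=rng.normal(size=(cluster_size * feature_size, output_dim)),
    gate_w=rng.normal(size=(output_dim, output_dim)),
    gate_b=np.zeros(output_dim),
)
print(pooled.shape)  # (2, 6)
```

The input of shape 'batch_size'x'max_samples'x'feature_size' is reduced to 'batch_size'x'output_dim', matching the shapes described for the forward pass.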

NOTE: The toolbox can only pool lists of features of the same length; it was specifically optimized to do so efficiently. One way to handle lists of features of variable length is to build a tensor of shape 'batch_size'x'max_samples'x'feature_size', where 'max_samples' is the maximum number of features per list, and to pad each shorter list with 0 values up to 'max_samples'.
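The zero-padding described in the note could be sketched as follows with NumPy. The helper name `pad_feature_lists` is hypothetical, and truncating lists longer than 'max_samples' is an assumption of this sketch, not something the toolbox prescribes.

```python
import numpy as np

def pad_feature_lists(feature_lists, max_samples, feature_size):
    """Stack variable-length feature lists into one zero-padded array.

    Hypothetical helper: each element of feature_lists has shape
    [n_i, feature_size]; the result has shape
    [batch_size, max_samples, feature_size].
    """
    batch = np.zeros((len(feature_lists), max_samples, feature_size),
                     dtype=np.float32)
    for i, features in enumerate(feature_lists):
        features = np.asarray(features, dtype=np.float32).reshape(-1, feature_size)
        # Assumption: lists longer than max_samples are truncated.
        n = min(len(features), max_samples)
        batch[i, :n] = features[:n]
    return batch

# Three lists with 3, 5 and 1 features of dimensionality 4.
lists = [np.ones((3, 4)), np.ones((5, 4)), np.ones((1, 4))]
batch = pad_feature_lists(lists, max_samples=5, feature_size=4)
print(batch.shape)  # (3, 5, 4)
```

The resulting array can then be fed as the input tensor of shape 'batch_size'x'max_samples'x'feature_size'. Note that this sketch does no masking, so the padded rows are passed to the pooling layer like any other feature.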

References

[1] Arandjelovic, Relja and Gronat, Petre and Torii, Akihiko and Pajdla, Tomas and Sivic, Josef, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016

If you use this toolbox, please cite the following paper:

[2] Antoine Miech and Ivan Laptev and Josef Sivic, Learnable pooling with Context Gating for video classification, arXiv:1706.06905:

@article{miech17loupe,
  title={Learnable pooling with Context Gating for video classification},
  author={Miech, Antoine and Laptev, Ivan and Sivic, Josef},
  journal={arXiv:1706.06905},
  year={2017},
}

Antoine Miech
