Nearly-Isotonic-SVM

This package implements the approach described in the paper "Semantic Pooling for Complex Event Analysis in Untrimmed Videos". It depends on Caffe, gensim, vlfeat, and SLEP.

Usage

  • First, pre-train concept detectors on improved dense trajectory (IDT) features. In this paper, we use the following datasets: the TRECVID SIN dataset (346 classes), Google Sports (478 classes), the UCF101 dataset (101 classes), and the YFCC dataset (609 classes). We first extract IDT features with the code in [0-Prerequisite/00-FeatureExtraction/traj_pipeline] and encode them with the Fisher vector representation [70]. Then, on top of the extracted low-level features, we train a cascade SVM for each semantic concept. The code is located at [0-Prerequisite/01-TrainConceptDetection/distributed-svm/cascade-svm-mr]. Similarly, we extract IDT features from all shots of each video in the evaluation datasets and apply the concept detectors to derive their semantic representations.

  • To learn the concept relevance, we train a skip-gram model on the English Wikipedia dump. For short phrases consisting of multiple words, we aggregate the word embeddings using Fisher vectors, which we generate with vlfeat. After normalizing each vector representation to unit length, we compute the cosine distance between the event description and each concept name, which measures the relevance of the concept to the event of interest.

  • Semantic Pooling. We define the semantic saliency score of each video shot as a weighted combination of the concept probability vector and the concept relevance vector. We then conduct semantic pooling with these saliency scores, using the code in [1-SemanticSaliency].

function ori_pooling = ss_pooling(ori, cd, rel)
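
The relevance and saliency steps above can be sketched in a few lines. The repository code is MATLAB; the Python below is only an illustration, and every function name in it is hypothetical. It assumes that relevance is taken as the cosine similarity of unit-length vectors, that the saliency score of a shot is the relevance-weighted sum of its concept probabilities, and that pooling is a saliency-weighted average of shot features. The actual ss_pooling routine may differ on each of these points.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concept_relevance(event_vec, concept_vecs):
    """Cosine similarity between the event embedding and each concept embedding."""
    e = l2_normalize(event_vec)
    return [sum(a * b for a, b in zip(e, l2_normalize(c))) for c in concept_vecs]


def saliency_scores(concept_probs, relevance):
    """One score per shot: relevance-weighted sum of that shot's concept probabilities.

    concept_probs is a list of shots, each a list of per-concept probabilities.
    """
    return [sum(p * r for p, r in zip(shot, relevance)) for shot in concept_probs]


def semantic_pooling(shot_features, scores):
    """Saliency-weighted average of shot features (assumed pooling rule).

    Falls back to a plain average when the scores do not sum to a positive value.
    """
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        weights = [1.0 / len(scores)] * len(scores)
    dim = len(shot_features[0])
    return [
        sum(w * shot[d] for w, shot in zip(weights, shot_features))
        for d in range(dim)
    ]


# Toy data: 2 concepts, 2 shots, 2-dimensional shot features.
relevance = concept_relevance([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0]])
scores = saliency_scores([[0.9, 0.1], [0.3, 0.8]], relevance)
pooled = semantic_pooling([[1.0, 0.0], [0.0, 1.0]], scores)
print(relevance, scores, pooled)
```

In this toy case only the first concept is relevant to the event, so the first shot, which fires strongly on that concept, dominates the pooled feature.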
  • NI-SVM
function ap = nisvm(fea, y, lambda, gamma, ind, fea_te, y_te)
% fea: C by S by N matrix
% C: number of concept detectors
% S: number of shots
% N: number of videos
% y: N by 1 vector

% lambda: total variation norm [0.001 0.01 0.1 1]
% gamma: squared l2 norm [0.001 0.01 0.1 1]
% ind: nonnegative W
  • Multi-class NI-SVM
function mf1 = nisvm_m(fea, Y, S, lambda, gamma, ind, mu, fea_te, Y_te, Ste)
% fea: d by m by n matrix
% d: # of CNN features
% m: # of shots
% n: # of videos
% k: # of events
% S: saliency score, m by n by k matrix
% lambda: total variation norm
% gamma: squared l2 norm
% ind: nonnegative W?
% mu: smoothing parameter
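
To make the lambda and gamma parameters more concrete, here is a minimal Python sketch of the two penalty terms named in the comments above: a total variation norm and a squared l2 norm. It is not the solver used by nisvm or nisvm_m, and the exact objective is not spelled out here. The sketch assumes a one-dimensional weight vector, the standard total variation (sum of absolute differences between neighbouring entries), and that the bracketed values [0.001 0.01 0.1 1] are candidate settings to search over. All function names are hypothetical.

```python
import itertools


def total_variation(w):
    """Sum of absolute differences between consecutive entries of w."""
    return sum(abs(b - a) for a, b in zip(w, w[1:]))


def squared_l2(w):
    """Squared Euclidean norm of w."""
    return sum(x * x for x in w)


def regularizer(w, lam, gamma):
    """Assumed combined penalty: lam * TV(w) + gamma * ||w||^2."""
    return lam * total_variation(w) + gamma * squared_l2(w)


# Candidate values copied from the comments on lambda and gamma.
CANDIDATES = [0.001, 0.01, 0.1, 1]

# Every (lambda, gamma) pair one might try during model selection.
PARAM_GRID = list(itertools.product(CANDIDATES, CANDIDATES))

w = [3.0, 2.0, 2.0, 0.5]  # toy weights, one per shot
for lam, gamma in PARAM_GRID[:3]:
    print(lam, gamma, regularizer(w, lam, gamma))
```

A larger lambda penalizes jumps between neighbouring weights, while a larger gamma shrinks all weights toward zero.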

References

  • Xiaojun Chang, Yi Yang, Eric P. Xing and Yao-Liang Yu, Complex Event Detection using Semantic Saliency and Nearly-Isotonic SVM, International Conference on Machine Learning 2015.
  • Xiaojun Chang, Yao-Liang Yu, Yi Yang and Eric P. Xing, Semantic Pooling for Complex Event Analysis in Untrimmed Videos, IEEE Transactions on Pattern Analysis and Machine Intelligence 2016. [Under Minor Revision]
