SQuiDNet: Selective Query-guided Debiasing Network for Video Corpus Moment Retrieval

This repository provides the code for the ECCV 2022 paper "Selective Query-guided Debiasing for Video Corpus Moment Retrieval".

Authors: Sunjae Yoon, Ji Woo Hong, Eunseop Yoon, Dahyun Kim, Junyeong Kim, Hee Suk Yoon, and Chang D. Yoo

Task: Video Corpus Moment Retrieval

Video moment retrieval (VMR) aims to localize target moments in untrimmed videos pertinent to a given textual query. Existing retrieval systems tend to rely on retrieval bias as a shortcut and thus fail to sufficiently learn multi-modal interactions between query and video.

SQuiDNet is proposed to debias video moment retrieval by conjugating retrieval bias in either a positive or a negative way.

Intro

SQuiDNet Overview

SQuiDNet is composed of three modules: (a) BMR, which reveals biased retrieval; (b) NMR, which performs accurate retrieval; and (c) SQuiD, which removes bad biases from the accurate retrieval of NMR subject to the meaning of the query.

Model

Implementation

Our results and further studies will also be updated soon!

  1. Clone the repository
git clone https://github.com/dbstjswo505/SQuiDNet.git
cd SQuiDNet
  1. Prepare the environment
conda env create -f squid.yml
conda activate squid
  1. Input Features Download

Download tvr_feature_dataset and place it in the main SQuiDNet folder with the following directory structure:

data
├── bmr
│   ├── bmr_prd_test_public_tvr
│   ├── bmr_prd_train_tvr
│   └── bmr_prd_val_tvr
├── sub_query_feature
│   ├── roberta_query
│   └── roberta_sub
├── video_feature
│   └── resnet_slowfast_1.5
├── text_data_ref
└── coocurrence_table

Alternatively, you can download the visual features (ResNet, SlowFast) from the HERO authors and the text features (subtitle and query, from fine-tuned RoBERTa) from the XML authors. To extract the features yourself, see the code for visual feature extraction and text feature extraction. The nouns and predicates for the co-occurrence table are extracted using the noun and predicate extraction code.
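
Before training, it can help to confirm that the data folder matches the layout listed above. The following is a minimal sketch, not part of the repository: the helper name `find_missing` and the constant `EXPECTED_ENTRIES` are hypothetical, and the entry names are copied from the directory tree in this README. It only checks that each entry exists and says nothing about the contents.

```python
from pathlib import Path

# Entries taken from the directory tree in this README, relative to data/.
# The constant and function names are illustrative, not from the repository.
EXPECTED_ENTRIES = [
    "bmr/bmr_prd_test_public_tvr",
    "bmr/bmr_prd_train_tvr",
    "bmr/bmr_prd_val_tvr",
    "sub_query_feature/roberta_query",
    "sub_query_feature/roberta_sub",
    "video_feature/resnet_slowfast_1.5",
    "text_data_ref",
    "coocurrence_table",
]


def find_missing(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the main SQuiDNet folder.
    missing = find_missing("data")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected entries found.")
```

Run it from the main SQuiDNet folder; any entry it reports as missing points to an incomplete download or a misplaced folder.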

  1. SQuiDNet Training
bash scripts/train.sh

train.sh runs with our defined hyperparameters; see the code for details. You can modify the experiment, including hyperparameter tuning, for better performance.

  1. SQuiDNet Inference
bash scripts/inference.sh

inference.sh also runs with our defined hyperparameters; see the code for details. The current settings are fixed for all tasks, including VCMR, SVMR, and VR.

  1. Build Co-occurrence Table
python mk_table.py

To design your own co-occurrence table, run ./data/coocurrence_table/mk_table.py, which updates the cctable.json file.
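
For readers unfamiliar with the idea, a co-occurrence table between nouns and predicates can be sketched as nested counts. This is an illustration only: the input format, the function name `build_cooccurrence`, and the output structure are assumptions. It does not describe what mk_table.py actually does or how cctable.json is laid out; check the script for the real format.

```python
import json
from collections import Counter, defaultdict


def build_cooccurrence(pairs):
    """Count how often each predicate appears together with each noun.

    `pairs` is an iterable of (noun, predicate) tuples. This input format
    is a hypothetical choice for the sketch, not the repository's format.
    """
    table = defaultdict(Counter)
    for noun, predicate in pairs:
        table[noun][predicate] += 1
    # Convert to plain dicts so the result can be serialized as JSON.
    return {noun: dict(counts) for noun, counts in table.items()}


# Toy (noun, predicate) pairs, invented for illustration.
pairs = [
    ("door", "open"),
    ("door", "close"),
    ("door", "open"),
    ("phone", "hold"),
]
table = build_cooccurrence(pairs)
print(json.dumps(table, indent=2, sort_keys=True))
```

The sketch prints the table instead of writing a file, so it cannot overwrite an existing cctable.json.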

Acknowledgement

This code is implemented on top of the following contributions: TVRetrieval, HERO, HuggingFace, Info-ground, NetVLAD, VLANet, CONQUER, MMT, and MME. We thank the authors for open-sourcing these great projects and papers!

This work was partly supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01381, Development of Causal AI through Video Understanding) and partly supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics).

Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{yoon2022selective,
  title={Selective Query-Guided Debiasing for Video Corpus Moment Retrieval},
  author={Yoon, Sunjae and Hong, Ji Woo and Yoon, Eunseop and Kim, Dahyun and Kim, Junyeong and Yoon, Hee Suk and Yoo, Chang D},
  booktitle={European Conference on Computer Vision},
  pages={185--200},
  year={2022},
  organization={Springer}
}
