xiaojake / vnext

This project forked from wjf5203/vnext

Video instance segmentation under the offline and online paradigms. Next-generation video instance recognition framework on top of Detectron2, supporting SeqFormer (ECCV Oral) and IDOL (ECCV Oral).

License: Apache License 2.0

vnext's Introduction

VNext:

  • VNext is a Next-generation Video instance recognition framework on top of Detectron2.
  • Currently it provides advanced online and offline video instance segmentation algorithms.
  • We will continue to update and improve it, with the goal of providing a unified and efficient framework that supports further research on video instance recognition.

To date, VNext contains the official implementation of the following algorithms:

IDOL: In Defense of Online Models for Video Instance Segmentation (ECCV2022 Oral)

SeqFormer: Sequential Transformer for Video Instance Segmentation (ECCV2022 Oral)

Highlight:

  • IDOL is accepted to ECCV 2022 as an oral presentation!
  • SeqFormer is accepted to ECCV 2022 as an oral presentation!
  • IDOL won first place in the video instance segmentation track of the 4th Large-scale Video Object Segmentation Challenge (CVPR2022).

Getting started

  1. For installation and data preparation, please refer to INSTALL.md.

  2. For IDOL training, evaluation, and model zoo, please refer to IDOL.md

  3. For SeqFormer training, evaluation, and model zoo, please refer to SeqFormer.md

IDOL

In Defense of Online Models for Video Instance Segmentation

Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai

Introduction

  • In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models are usually inferior to the contemporaneous offline models by over 10 AP, which is a huge drawback.

  • By dissecting current online models and offline models, we demonstrate that the main cause of the performance gap is the error-prone association and propose IDOL, which outperforms all online and offline methods on three benchmarks.

  • IDOL won first place in the video instance segmentation track of the 4th Large-scale Video Object Segmentation Challenge (CVPR2022).
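To make the association step mentioned above concrete, here is a minimal sketch of how an online model might link detections in the current frame to existing tracks. This is not IDOL's actual association strategy (see IDOL.md and the paper for that); the greedy cosine-similarity matching, the function names `cosine` and `associate`, and the `threshold` parameter are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def associate(track_embeds, det_embeds, threshold=0.5):
    """Hypothetical greedy association, for illustration only.

    Returns one track id per detection. A detection whose best available
    similarity is below `threshold` starts a new track with a fresh id.
    """
    # Score every (track, detection) pair and visit them from best to worst.
    pairs = sorted(
        (
            (cosine(t, d), ti, di)
            for ti, t in enumerate(track_embeds)
            for di, d in enumerate(det_embeds)
        ),
        reverse=True,
    )
    assigned = [None] * len(det_embeds)
    used_tracks = set()
    for score, ti, di in pairs:
        if score < threshold:
            break  # all remaining pairs are even less similar
        if assigned[di] is None and ti not in used_tracks:
            assigned[di] = ti
            used_tracks.add(ti)
    # Unmatched detections become new tracks.
    next_id = len(track_embeds)
    for di in range(len(det_embeds)):
        if assigned[di] is None:
            assigned[di] = next_id
            next_id += 1
    return assigned


tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
ids = associate(tracks, detections)
```

In a sketch like this, a single wrong match propagates to every later frame, which is one way to picture why the association step can dominate the error of an online pipeline.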

Visualization results on OVIS valid set

Quantitative results

YouTube-VIS 2019

OVIS 2021

SeqFormer

SeqFormer: Sequential Transformer for Video Instance Segmentation

Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai

Introduction

  • SeqFormer locates an instance in each frame and aggregates temporal information to learn a powerful representation of a video-level instance, which is used to predict the mask sequences on each frame dynamically.

  • SeqFormer is a robust, accurate, and neat offline model; instance tracking is achieved naturally without tracking branches or post-processing.
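The idea described above (per-frame instance features aggregated into one video-level representation, which then predicts a mask on every frame) can be sketched in a few lines. This is an illustrative toy, not the repository's implementation: the softmax weighting over frames, the dot-product mask head, and the names `aggregate` and `predict_masks` are assumptions made for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(frame_embeds, frame_scores):
    """Weighted sum of per-frame instance embeddings (weights are assumed, not learned here)."""
    weights = softmax(frame_scores)
    dim = len(frame_embeds[0])
    return [
        sum(w * e[k] for w, e in zip(weights, frame_embeds))
        for k in range(dim)
    ]


def predict_masks(video_embed, frame_features):
    """Apply one video-level embedding to every frame.

    frame_features[t] is a list of per-pixel feature vectors for frame t.
    Returns, per frame, a list of per-pixel foreground probabilities.
    """
    masks = []
    for pixels in frame_features:
        logits = [sum(v * f for v, f in zip(video_embed, px)) for px in pixels]
        masks.append([1.0 / (1.0 + math.exp(-z)) for z in logits])
    return masks


# Two frames, each with a 2-D instance embedding and equal frame scores.
video_embed = aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
# One frame with two "pixels": one aligned with the instance, one opposed.
masks = predict_masks(video_embed, [[[2.0, 2.0], [-2.0, -2.0]]])
```

Because the same video-level embedding produces the mask on every frame, the masks belong to one instance by construction, which is the sense in which no separate tracking branch is needed in this sketch.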

Visualization results on YouTube-VIS 2019 valid set

Quantitative results

YouTube-VIS 2019

YouTube-VIS 2021

Citation

@inproceedings{seqformer,
  title={SeqFormer: Sequential Transformer for Video Instance Segmentation},
  author={Wu, Junfeng and Jiang, Yi and Bai, Song and Zhang, Wenqing and Bai, Xiang},
  booktitle={ECCV},
  year={2022},
}

@inproceedings{IDOL,
  title={In Defense of Online Models for Video Instance Segmentation},
  author={Wu, Junfeng and Liu, Qihao and Jiang, Yi and Bai, Song and Yuille, Alan and Bai, Xiang},
  booktitle={ECCV},
  year={2022},
}

Acknowledgement

This repo is based on detectron2, Deformable DETR, VisTR, and IFC. Thanks for their wonderful work.

vnext's People

Contributors

wjf5203, qihao067, ifighting
