xiliuer / paddlevideo

This project forked from paddlepaddle/paddlevideo

A comprehensive, up-to-date, and deployable collection of video deep learning algorithms, covering video recognition, action localization, and temporal action detection tasks. It is a high-performance, lightweight codebase that provides practical models for video understanding research and applications.

License: Apache License 2.0

Python 97.40% Shell 2.60%

PaddleVideo

Introduction

PaddleVideo is a toolset for video recognition, action localization, and spatio-temporal action detection tasks, prepared for industry and academia. This repository provides examples and best-practice guidelines for exploring deep learning algorithms in the video domain. We are devoted to supporting experiments and utilities that can significantly reduce the "time to deploy". It also serves as a proficiency verification and implementation of the newest PaddlePaddle 2.0 in the video field.


If you find this repo helpful, please star us ⭐

Features

  • Various datasets and models: PaddleVideo supports a wide range of datasets and models, including the Kinetics400, UCF101, YouTube-8M, and NTU-RGB+D datasets; video recognition models such as TSN, TSM, SlowFast, TimeSformer, AttentionLSTM, and ST-GCN; and action localization models such as BMN.

  • Higher performance: PaddleVideo has built-in solutions to improve the accuracy of recognition models. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: it has the same number of parameters but improves Top-1 accuracy to 76.16%.

  • Faster training strategies: PaddleVideo supports faster training strategies, such as AMP training, distributed training, the Multigrid method for SlowFast, OP fusion, a faster reader, and so on.

  • Deployable: PaddleVideo is powered by Paddle Inference. There is no need to convert the model to ONNX format when deploying it; everything you need can be found in this repository.

  • Applications: PaddleVideo provides some interesting and practical projects implemented with video recognition and detection techniques, such as FootballAction and VideoTag.

Overview of the performance

| Field | Model | Dataset | Metric | Value (%) |
| --- | --- | --- | --- | --- |
| action recognition | PP-TSM | Kinetics-400 | Top-1 | 76.16 |
| action recognition | PP-TSN | Kinetics-400 | Top-1 | 75.06 |
| action recognition | AGCN | FSD | Top-1 | 62.29 |
| action recognition | ST-GCN | FSD | Top-1 | 59.07 |
| action recognition | TimeSformer | Kinetics-400 | Top-1 | 77.29 |
| action recognition | SlowFast | Kinetics-400 | Top-1 | 75.84 |
| action recognition | TSM | Kinetics-400 | Top-1 | 71.06 |
| action recognition | TSN | Kinetics-400 | Top-1 | 69.81 |
| action recognition | AttentionLSTM | YouTube-8M | Hit@1 | 89.05 |
| action detection | BMN | ActivityNet | AUC | 67.23 |
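
For readers unfamiliar with the Top-1 metric used in most rows above, here is a minimal sketch of how top-k accuracy is commonly computed for single-label classification. It is not taken from the PaddleVideo codebase: the function name `top_k_accuracy` and the toy scores are hypothetical, chosen only for illustration, and the AUC metric reported for BMN is computed differently and is not covered here.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes.

    Hypothetical helper for illustration only; not part of PaddleVideo.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy data: 4 clips, 3 classes (made-up scores, not real model output).
toy_scores = [
    [0.10, 0.70, 0.20],
    [0.80, 0.10, 0.10],
    [0.30, 0.25, 0.45],
    [0.20, 0.50, 0.30],
]
toy_labels = [1, 0, 0, 2]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"Top-1: {top1:.2f}%  Top-2: {top2:.2f}%")  # Top-1: 50.00%  Top-2: 100.00%
```

In this toy example the first two clips are classified correctly and the last two are not, so Top-1 is 50%; both misses have the true class ranked second, so Top-2 reaches 100%.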

Changelog

release/2.1 was released on May 20, 2021. Please refer to the release notes for details.

Community

  • Scan the QR code below with WeChat and reply "video" to join the official technical exchange group. We look forward to your participation.

Applications

  • VideoTag: 3k Large-Scale video classification model


Tutorials and Docs

Competition

License

PaddleVideo is released under the Apache 2.0 license.

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Contributors

chajchaj, d-danielyang, dreamer121121, dreamerlin, huangjun12, hydrogensulfate, lovejing0306, mmglove, mohui37, shippingwang, voipchina, xiaoguanghu01, xiegegege, zephyr-fun, zhanghandi
