
aed-mae's Introduction

Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors (CVPR 2024) - Official Repository

by Nicolae-Catalin Ristea*, Florinel-Alin Croitoru*, Radu Tudor Ionescu, Marius Popescu, Fahad Shahbaz Khan, Mubarak Shah

* Authors have contributed equally.

This is the official repository of "Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors" accepted at CVPR 2024.

Paper links: Open CVF, ArXiv.

License

The source code and models are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

Description

We propose an efficient abnormal event detection model based on a lightweight masked auto-encoder (AE) applied at the video frame level. The novelty of the proposed model is threefold. First, we introduce an approach to weight tokens based on motion gradients, thus shifting the focus from the static background scene to the foreground objects. Second, we integrate a teacher decoder and a student decoder into our architecture, leveraging the discrepancy between the outputs given by the two decoders to improve anomaly detection. Third, we generate synthetic abnormal events to augment the training videos, and task the masked AE model to jointly reconstruct the original frames (without anomalies) and the corresponding pixel-level anomaly maps. Our design leads to an efficient and effective model, as demonstrated by the extensive experiments carried out on four benchmarks: Avenue, ShanghaiTech, UBnormal and UCSD Ped2. The empirical results show that our model achieves an excellent trade-off between speed and accuracy, obtaining competitive AUC scores, while processing 1655 FPS. Our model is between 8 and 70 times faster than competing methods.
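As a rough illustration of the first idea (weighting tokens by motion gradients), the sketch below averages the gradient magnitude inside each non-overlapping patch and normalizes the result, so patches covering moving foreground objects receive larger weights than static background. This is not the repository's implementation: the function name `token_weights`, the patch layout, and the sum-to-one normalization are all assumptions made for this example.

```python
def token_weights(grad_map, patch):
    """Hypothetical helper: one weight per non-overlapping patch (token).

    grad_map is a 2D list of per-pixel motion gradient values and
    patch is the side length of a square patch in pixels.
    """
    height, width = len(grad_map), len(grad_map[0])
    assert height % patch == 0 and width % patch == 0, "frame must tile evenly"

    weights = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Mean absolute gradient inside the current patch.
            total = sum(
                abs(grad_map[y][x])
                for y in range(top, top + patch)
                for x in range(left, left + patch)
            )
            weights.append(total / (patch * patch))

    norm = sum(weights)
    if norm == 0:
        # No motion anywhere: fall back to uniform weights (an assumption).
        return [1.0 / len(weights)] * len(weights)
    return [w / norm for w in weights]


# Motion only in the top-left 2x2 patch of a 4x4 frame.
grad_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(token_weights(grad_map, patch=2))
```

In a setup like this, the weights could scale the per-token reconstruction loss, which is one plausible way to shift the focus toward foreground objects.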

Citation

Please cite our work if you use any material released in this repository.

@InProceedings{Ristea-CVPR-2024,
  author    = {Ristea, Nicolae-Catalin and Croitoru, Florinel-Alin and Ionescu, Radu Tudor and Popescu, Marius and Khan, Fahad Shahbaz and Shah, Mubarak},
  title     = "{Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors}",
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {15984--15995}
  }

Preprocessing steps

  1. Compute the temporal gradients
python extract_gradients.py 

Before running the command above, change the root folders used in the script to point to the location where your dataset is stored.

  2. Include pseudo anomalies from UBnormal
cd util/create_anomalies
python main.py

As before, change the arguments in the script to reflect the location where the data is stored.
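To make the temporal gradients of step 1 concrete, here is a minimal sketch that treats them as the absolute difference between consecutive frames. It is only an assumption about what such a step typically does; extract_gradients.py may use a different operator (for example central differences or smoothing), and the function name `temporal_gradients` is hypothetical.

```python
def temporal_gradients(frames):
    """Hypothetical helper: absolute difference between consecutive frames.

    frames is a list of 2D lists (grayscale frames of equal size).
    Returns len(frames) - 1 gradient maps.
    """
    gradients = []
    for previous, current in zip(frames, frames[1:]):
        gradients.append([
            [abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(previous, current)
        ])
    return gradients


# Three tiny 2x2 frames: one pixel changes at each time step.
frames = [
    [[0, 0], [0, 0]],
    [[0, 5], [0, 0]],
    [[0, 5], [3, 0]],
]
for gradient in temporal_gradients(frames):
    print(gradient)
```

Pixels that do not change between frames get a zero gradient, so static background contributes nothing to the resulting maps.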

Training/Inference

  1. Preliminaries

Set the dataset location in "configs/configs.py".

  2. Train.
python main.py --dataset <avenue or shanghai>

The "dataset" parameter selects one of the two config options.

  3. Inference.

To check the Micro-AUC and Macro-AUC scores, set the run_type variable in configs/configs.py to "inference" and then rerun "main.py".
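For readers unfamiliar with the two scores, the sketch below follows the usual convention in video anomaly detection: Micro-AUC concatenates the frames of all test videos before computing a single AUC, while Macro-AUC averages the AUC computed separately on each video. This is an illustration under that assumption, not the evaluation code of this repository; the function names are hypothetical, and videos containing only normal or only abnormal frames are not handled here.

```python
def auc(labels, scores):
    """Frame-level AUC via pairwise comparison (ties count as half a win)."""
    abnormal = [s for l, s in zip(labels, scores) if l == 1]
    normal = [s for l, s in zip(labels, scores) if l == 0]
    if not abnormal or not normal:
        raise ValueError("AUC needs both normal and abnormal frames")
    wins = sum((a > n) + 0.5 * (a == n) for a in abnormal for n in normal)
    return wins / (len(abnormal) * len(normal))


def micro_auc(videos):
    """Concatenate all frames from all videos, then compute one AUC."""
    labels = [l for video_labels, _ in videos for l in video_labels]
    scores = [s for _, video_scores in videos for s in video_scores]
    return auc(labels, scores)


def macro_auc(videos):
    """Compute the AUC per video, then average over videos."""
    per_video = [auc(labels, scores) for labels, scores in videos]
    return sum(per_video) / len(per_video)


# Each video is a (labels, scores) pair; label 1 marks an abnormal frame.
videos = [
    ([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]),
    ([0, 1], [0.9, 0.95]),
]
print("Micro-AUC:", micro_auc(videos))
print("Macro-AUC:", macro_auc(videos))
```

In this toy example each video is ranked perfectly on its own, so the Macro-AUC is 1.0, whereas the Micro-AUC is lower because the normal frame of the second video scores as high as abnormal frames of the first. The pairwise comparison is quadratic in the number of frames and is used here only for readability.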

Checkpoints:

https://drive.google.com/drive/folders/1Qpx1ZohOPgdeR0uMZkLqFaaNCOpcZ_aF?usp=sharing

