s3r's Introduction

Self-Supervised Sparse Representation for Video Anomaly Detection

By Jhih-Ciang Wu*, He-Yen Hsieh*, Ding-Jie Chen, Chiou-Shann Fuh, and Tyng-Luh Liu (* denotes equal contribution)

tags: video anomaly detection, weakly-supervised, dictionary learning

This repo is the official implementation of "Self-Supervised Sparse Representation for Video Anomaly Detection" (accepted at ECCV'22) for the weakly-supervised VAD (wVAD) setting.

Table of Contents

0. Introduction
1. Quick start
2. Prerequisites
3. Installation
4. Data preparation
5. Dictionary learning
6. Results and Models
7. Evaluation
8. Training
9. Acknowledgement
10. Citation

Introduction

We establish a dictionary learning approach to model the concept of anomaly at the feature level. Dictionary learning presumes an overcomplete basis and prefers a sparse representation that succinctly explains a given sample. With the training set $\mathcal{X}$, whose video samples are anomaly-free, we learn its corresponding dictionary $D$ of $N$ atoms. Since the derivation of $D$ is specific to the training dataset $\mathcal{X}$, we use the notation $D_T$ to emphasize that the underlying dictionary is task-specific. With the learned task-specific dictionary $D_T$, we design two opposite network components: the en-Normal and de-Normal modules. Given a snippet-level feature $F$, the former obtains its reconstructed normal-event feature, while the latter filters out the normal-event feature. The two modules complement each other and are central to our approach to video anomaly detection.
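
As a rough illustration of the sparse-coding idea behind the two modules, the sketch below codes a feature over a toy dictionary with plain matching pursuit, then returns the reconstruction (the en-Normal role) and the residual (the de-Normal role). This is not the repository's implementation: the actual modules are network components, and the names `matching_pursuit`, `en_normal`, and `de_normal`, the greedy solver, and the tiny 3-dimensional dictionary are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(feature, atoms, n_nonzero=2):
    """Greedy sparse coding of `feature` over unit-norm `atoms`.

    Returns (coefficients, residual). Hypothetical helper, not from the repo.
    """
    residual = [float(x) for x in feature]
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[best] += scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


def en_normal(feature, atoms, n_nonzero=2):
    """Toy stand-in: the part of `feature` explained by the dictionary."""
    coeffs, _ = matching_pursuit(feature, atoms, n_nonzero)
    dim = len(feature)
    return [sum(c * atom[j] for c, atom in zip(coeffs, atoms)) for j in range(dim)]


def de_normal(feature, atoms, n_nonzero=2):
    """Toy stand-in: what remains after the normal part is removed."""
    _, residual = matching_pursuit(feature, atoms, n_nonzero)
    return residual


# Two unit-norm atoms spanning the "normal" directions of a 3-D feature space.
D_T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
F = [3.0, 4.0, 1.0]

normal_part = en_normal(F, D_T)
anomalous_part = de_normal(F, D_T)
print(normal_part, anomalous_part)
```

In this toy case the reconstruction is `[3.0, 4.0, 0.0]` and the residual is `[0.0, 0.0, 1.0]`: the component that the dictionary cannot explain is exactly what the de-Normal stand-in keeps.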

Quick start

# please refer to the "Installation" section
$ conda create --name s3r python=3.6 -y
$ conda activate s3r
$ conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
$ cd S3R/
$ pip install -r requirements.txt

# please refer to the "Data preparation" section
$ ln -sT <your-data-path>/SH_Train_ten_crop_i3d data/shanghaitech/i3d/train
$ ln -sT <your-data-path>/SH_Test_ten_crop_i3d data/shanghaitech/i3d/test

# please refer to the "Dictionary learning" section
$ ln -sT <downloaded-dictionary-path>/ dictionary

# please refer to the "Evaluation" section
$ CUDA_VISIBLE_DEVICES=0 python tools/trainval_anomaly_detector.py \
--dataset shanghaitech --inference --resume checkpoint/shanghaitech_s3r_i3d_best.pth

Prerequisites

  • Operating system
    • Ubuntu 18.04.6 LTS
  • Graphics card
    • GPU: NVIDIA RTX 2080 Ti
  • Framework and environment
    • pytorch: 1.6.0
    • cuda: 10.1
    • torchvision: 0.7.0
  • Programming language
    • python: 3.6

Library versions for reference

The following library versions were installed in our experiments.

  • Library versions
    • pyyaml==6.0
    • tqdm==4.64.0
    • munch==2.5.0
    • terminaltables==3.1.0
    • scikit-learn==0.24.2
    • opencv-python==4.6.0
    • pandas==1.1.5
    • typed-argument-parser==1.7.2
    • einops==0.4.1

Project structure

$ tree S3R
S3R/
├─ anomaly/    (directory for core functions, including dataloader, S3R modules, and other useful functions)
├─ checkpoint/ (directory for model weights)
├─ configs/    (directory for model configurations)
├─ data/       (directory for dataset)
├─ dictionary/ (directory for learned dictionaries)
├─ tools/      (directory for main scripts)
├─ logs/       (directory for saving training logs)
├─ output/     (directory for saving inference results)
├─ config.py
├─ README.md 
├─ requirements.txt 
├─ utils.py

Installation

Step 1. Create a conda environment and activate it.

$ conda create --name s3r python=3.6 -y
$ conda activate s3r

Step 2. Install pytorch

$ conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
or
$ pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Step 3. Install required libraries

$ pip install -r requirements.txt

Data preparation

Please download the extracted I3D features for the shanghaitech and ucf-crime datasets from the link.

The file structure of downloaded features should look like:

$ tree data
data/
├─ shanghaitech/
│  ├─ shanghaitech.training.csv
│  ├─ shanghaitech_ground_truth.testing.json
│  ├─ shanghaitech.testing.csv
│  ├─ i3d/
│  │  ├─ test/
│  │  │  ├─ 01_0015_i3d.npy
│  │  │  ├─05_033_i3d.npy
│  │  │  ├─ ...
│  │  ├─ train/
│  │  │  ├─ 01_0014_i3d.npy
│  │  │  ├─ 05_040_i3d.npy
│  │  │  ├─ ...
├─ ucf-crime/
│  ├─ ucf-crime_ground_truth.testing.json
│  ├─ ucf-crime.testing.csv
│  ├─ ucf-crime.training.csv
│  ├─ i3d/
│  │  ├─ test/
│  │  │  ├─ Abuse028_x264_i3d.npy
│  │  │  ├─ Burglary079_x264_i3d.npy
│  │  │  ├─ ...
│  │  ├─ train/
│  │  │  ├─ Abuse001_x264_i3d.npy
│  │  │  ├─ Burglary001_x264_i3d.npy
│  │  │  ├─ ...

Examples:

$ ln -sT <your-data-path>/SH_Train_ten_crop_i3d data/shanghaitech/i3d/train
$ ln -sT <your-data-path>/SH_Test_ten_crop_i3d data/shanghaitech/i3d/test
$ ln -sT <your-data-path>/UCF_Train_ten_crop_i3d data/ucf-crime/i3d/train
$ ln -sT <your-data-path>/UCF_Test_ten_crop_i3d data/ucf-crime/i3d/test
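
After creating the links, it can be useful to confirm that the layout matches the tree shown above. The following sketch checks for the expected files and directories and counts the feature files in a split. The function names `missing_entries` and `count_features` are hypothetical, and the script assumes it is run from the `S3R/` directory so that `data` resolves correctly.

```python
from pathlib import Path

# Relative paths taken from the tree listing above.
EXPECTED = {
    "shanghaitech": [
        "shanghaitech.training.csv",
        "shanghaitech.testing.csv",
        "shanghaitech_ground_truth.testing.json",
        "i3d/train",
        "i3d/test",
    ],
    "ucf-crime": [
        "ucf-crime.training.csv",
        "ucf-crime.testing.csv",
        "ucf-crime_ground_truth.testing.json",
        "i3d/train",
        "i3d/test",
    ],
}


def missing_entries(data_root, dataset):
    """Return the expected paths that do not exist under data_root/dataset.

    Path.exists() follows symlinks, so a dangling link is reported as missing.
    """
    base = Path(data_root) / dataset
    return [rel for rel in EXPECTED[dataset] if not (base / rel).exists()]


def count_features(data_root, dataset, split):
    """Count the *_i3d.npy files in one split ("train" or "test")."""
    split_dir = Path(data_root) / dataset / "i3d" / split
    return len(list(split_dir.glob("*_i3d.npy")))


if __name__ == "__main__":
    for name in EXPECTED:
        print(name, "missing:", missing_entries("data", name))
```

An empty list for a dataset means every expected entry was found; a broken symbolic link shows up as a missing `i3d/train` or `i3d/test` entry.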

Dictionary learning

The dictionaries can be downloaded from the link and the file structure of dictionaries should look like:

$ tree dictionary
dictionary/
├─ kinetics400
│  ├─ kinetics400_dictionaries.universal.omp.100iters.npy
├─ shanghaitech
│  ├─ shanghaitech_dictionaries.taskaware.omp.100iters.90pct.npy
│  ├─ shanghaitech_regular_features-2048dim.training.pickle
├─ ucf-crime
│  ├─ ucf-crime_dictionaries.taskaware.omp.100iters.50pct.npy
│  ├─ ucf-crime_regular_features-2048dim.training.pickle

Example:

$ ln -sT <downloaded-dictionary-path>/ dictionary

(Optional) Generate dictionaries

To generate dictionaries for the shanghaitech and ucf-crime datasets, please run the following commands:

# for the shanghaitech dataset
$ python data/shanghaitech/shanghaitech_dictionary_learning.py
and
# for the ucf-crime dataset
$ python data/ucf-crime/ucf_crime_dictionary_learning.py

Results and Models

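+-----------------+--------------+----------+------+---------+-------+-----+
| config          | dataset      | backbone | gpus | AUC (%) | ckpt  | log |
+-----------------+--------------+----------+------+---------+-------+-----+
| shanghaitech_dl | shanghaitech | I3D      | 1    | 97.40   | model | log |
| ucf_crime_dl    | ucf-crime    | I3D      | 1    | 85.99   | model | log |
+-----------------+--------------+----------+------+---------+-------+-----+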

Evaluation

To evaluate the S3R on shanghaitech, please run the following command:

$ CUDA_VISIBLE_DEVICES=0 python tools/trainval_anomaly_detector.py \
--dataset shanghaitech --inference --resume checkpoint/shanghaitech_s3r_i3d_best.pth

+ Performance on shanghaitech ----+---------+
|   Dataset    | Method | Feature | AUC (%) |
+--------------+--------+---------+---------+
| shanghaitech |  S3R   |   I3D   |  97.395 |
+--------------+--------+---------+---------+

To evaluate the S3R on ucf-crime, please run the following command:

$ CUDA_VISIBLE_DEVICES=0 python tools/trainval_anomaly_detector.py \
--dataset ucf-crime --inference --resume checkpoint/ucf-crime_s3r_i3d_best.pth

+ Performance on ucf-crime ----+---------+
|  Dataset  | Method | Feature | AUC (%) |
+-----------+--------+---------+---------+
| ucf-crime |  S3R   |   I3D   |  85.989 |
+-----------+--------+---------+---------+
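
For readers unfamiliar with the metric, and assuming the AUC reported above is the area under the ROC curve, the sketch below shows what that number measures. It uses the pairwise rank formulation on toy inputs. The name `roc_auc` and the sample values are assumptions for illustration; the repository's own evaluation code is not reproduced here.

```python
def roc_auc(labels, scores):
    """Probability that a random anomalous item outscores a random normal one.

    `labels` holds 0 (normal) or 1 (anomalous); ties count as one half.
    The double loop is quadratic, which is fine for a toy example only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one normal and one anomalous label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(f"AUC (%): {100 * roc_auc(labels, scores):.3f}")
```

Here three of the four anomalous/normal pairs are ranked correctly, so the script prints `AUC (%): 75.000`. The labels and scores must have the same length, one entry per scored item.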

Training

shanghaitech dataset

To train the S3R from scratch on shanghaitech, please run the following command:

$ CUDA_VISIBLE_DEVICES=<gpu-id> python tools/trainval_anomaly_detector.py \
--dataset shanghaitech --version <customized-version> --evaluate_min_step 5000

Example:

$ CUDA_VISIBLE_DEVICES=0 python tools/trainval_anomaly_detector.py \
--dataset shanghaitech --version s3r-vad-0.1 --evaluate_min_step 5000

ucf-crime dataset

To train the S3R from scratch on ucf-crime, please run the following command:

$ CUDA_VISIBLE_DEVICES=<gpu-id> python tools/trainval_anomaly_detector.py \
--dataset ucf-crime --version <customized-version> --evaluate_min_step 10

Example:

$ CUDA_VISIBLE_DEVICES=0 python tools/trainval_anomaly_detector.py \
--dataset ucf-crime --version s3r-vad-0.1 --evaluate_min_step 10

Acknowledgement

Our codebase builds on RTFM. We thank the authors for their nicely organized code!

Citation

We hope the codebase is beneficial to you. If this repo helps your research, please consider citing our paper. Thank you for your time and consideration.

@inproceedings{WuHCFL22,
  author    = {Jhih-Ciang Wu and
               He-Yen Hsieh and
               Ding-Jie Chen and
               Chiou-Shann Fuh and
               Tyng-Luh Liu},
  title     = {Self-Supervised Sparse Representation for Video Anomaly Detection},
  booktitle = {ECCV},
  year      = {2022},
}

s3r's Issues

using custom dataset

Hi, thanks for your code.

I have a question about using a custom dataset.
I acquired some abnormal and normal videos a few days ago.

I want to convert the videos and export them as I3D feature files (e.g., Abuse001_x264_i3d.npy).
If you know a method for this conversion, please let me know.

Thanks!

Failed to replicate results with UCF-Crime from scratch

I've been attempting to replicate the training and validation results on the UCF-Crime dataset from scratch. Because the download links in the RTFM repo are unavailable, I am computing the I3D features manually from the .mp4 files with the help of this repo, which is a simplified version of the methodology used in the pytorch-resnet3d repo.
I am computing the Normal_Features pickle file following the explanation given in this issue, padding the tensors with zeros to match the required shape (32, 10, 2048).
When I start training or evaluation, I get errors in the inference method when calculating the roc_curve, due to inconsistent shapes. I have tried reshaping the tensors of the training set to match the expected number of samples, without luck. Currently, with the padding applied to the Normal_Videos in the training set, I get the following error: ValueError: Found input variables with inconsistent numbers of samples: [114144, 1113424], where 1113424 is, I believe, the size of my testing set. I would appreciate clarification on any additional steps required to use the UCF dataset, or any other dataset, from scratch.

cv2.resize slows down the training time largely

First of all, thanks a lot for sharing your code! I noticed that in the training pipeline (video_dataset.py), you have used the following:

width, height = self.quantize_size, channels
features = cv2.resize(features, (width, height),
    interpolation=cv2.INTER_LINEAR) # CxTxN

This resizing slows the training pipeline considerably. Instead of resizing every group of 32 frames, shouldn't we resize only the trailing frames that number fewer than 32?

How to use this model as oVAD

First, let me thank you for this useful code and important research.
Is this code designed to apply only to wVAD? The paper says the same model can be applied to oVAD. It would be great if you could tell me how to use it for oVAD. Thanks a lot.

MemoryError: Unable to allocate 4.66 GiB for an array with shape (61032, 10, 2048) and data type float32

Thanks for your outstanding contribution.
I am also training the model on an RTX 2080 Ti, but on Ubuntu 16.04, and I have run into a memory problem:
MemoryError: Unable to allocate 4.66 GiB for an array with shape (61032, 10, 2048) and data type float32
I have tried reducing the batch size from 32 to 16 and 8, but that does not fix the problem. These are my results:
batch size 32: ran 4/15000 epochs
batch size 16: ran 12/15000 epochs
batch size 8: ran 25/15000 epochs
All of the above runs crashed. How can I fix it?
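
As a quick arithmetic check on the message above, the reported size follows directly from the array shape and the float32 element size. The snippet below only reproduces that figure; it makes no claim about where in the code the array is allocated. Since the shape in the message contains none of the tried batch sizes, the allocation may well be independent of the batch size, though that is an inference rather than something confirmed here.

```python
from functools import reduce
from operator import mul

shape = (61032, 10, 2048)  # shape reported in the error message
bytes_per_float32 = 4

n_elements = reduce(mul, shape, 1)
n_bytes = n_elements * bytes_per_float32
gib = n_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{n_elements} elements, {n_bytes} bytes, {gib:.2f} GiB")
```

This gives 1249935360 elements and 4999741440 bytes, which rounds to the 4.66 GiB in the error.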
