continual-skeletons's Introduction

Continual Spatio-Temporal Graph Convolutional Networks


Official codebase for "Continual Spatio-Temporal Graph Convolutional Networks" (Pattern Recognition, 2023), including:

  • Models: CoST-GCN, CoAGCN, CoS-TR, and more ... (see the Models section for a full overview).

  • Datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.

Abstract

Graph-based reasoning over skeleton data has emerged as a promising approach for human action recognition. However, the application of prior graph-based methods, which predominantly employ whole temporal sequences as their input, to the setting of online inference entails considerable computational redundancy. In this paper, we tackle this issue by reformulating the Spatio-Temporal Graph Convolutional Neural Network as a Continual Inference Network, which can perform step-by-step predictions in time without repeat frame processing. To evaluate our method, we create a continual version of ST-GCN, CoST-GCN, alongside two derived methods with different self-attention mechanisms, CoAGCN and CoS-TR. We investigate weight transfer strategies and architectural modifications for inference acceleration, and perform experiments on the NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400 datasets. Retaining similar predictive accuracy, we observe up to 109x reduction in time complexity, on-hardware accelerations of 26x, and reductions in maximum allocated memory of 52% during online inference.


Fig. 1: Continual Spatio-temporal Graph Convolution Blocks consist of an in-time Graph Convolution followed by an across-time Continual Convolution (here a kernel size of three is depicted). The residual connection is delayed to ensure temporal alignment with the continual temporal convolution that is weight-compatible with non-continual networks.
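
To make the step-wise operation concrete, here is a minimal, illustrative PyTorch sketch of such a block. It is not the repository's implementation (the official models build on dedicated continual modules); all names, shapes, and the simple list-based FIFO are assumptions for exposition. Each call consumes a single skeleton frame, applies the in-time graph convolution, buffers the result for the across-time convolution, and adds a residual delayed to match the window's centre frame:

import torch
import torch.nn as nn

# Illustrative sketch only; not the repository's implementation.
class CoSTGCNBlockSketch(nn.Module):
    def __init__(self, in_channels, out_channels, A, kernel_size=3):
        super().__init__()
        self.register_buffer("A", A)                     # (V, V) adjacency matrix
        self.gcn = nn.Linear(in_channels, out_channels)  # spatial projection
        self.tcn = nn.Conv1d(out_channels, out_channels, kernel_size)
        self.res = nn.Linear(in_channels, out_channels)  # residual projection
        self.kernel_size = kernel_size
        self.delay = (kernel_size - 1) // 2              # residual delay (Fig. 1)
        self.frames = []                                 # FIFO of past GCN outputs
        self.residuals = []                              # queue of delayed residuals

    def forward_step(self, x):
        # x: (B, C_in, V), a single skeleton frame.
        # In-time graph convolution: neighbourhood aggregation, then projection.
        h = torch.einsum("bcv,vw->bcw", x, self.A)
        h = self.gcn(h.transpose(1, 2)).transpose(1, 2)  # (B, C_out, V)
        self.frames.append(h)
        self.residuals.append(self.res(x.transpose(1, 2)).transpose(1, 2))

        if len(self.frames) < self.kernel_size:
            return None                                  # transient: buffer filling up

        # Across-time continual convolution over the buffered window.
        window = torch.stack(self.frames, dim=-1)        # (B, C_out, V, K)
        B, C, V, K = window.shape
        out = self.tcn(window.permute(0, 2, 1, 3).reshape(B * V, C, K))
        out = out.reshape(B, V, C).transpose(1, 2)       # (B, C_out, V)

        # Residual delayed to align with the window's centre frame.
        res = self.residuals[-(self.delay + 1)]
        self.frames.pop(0)
        self.residuals.pop(0)
        return torch.relu(out + res)

Since the temporal weights are those of an ordinary convolution, weights trained on whole clips transfer directly to this step-wise mode, which is the weight compatibility mentioned in the caption.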

Fig. 2: Accuracy/complexity trade-off on NTU RGB+D 60 X-Sub for ⬥ Continual and ■ prior methods during online inference. Numbers denote streams for each method. *Architecture modification with stride one and no padding.

Setup

Installation

  • Clone this repository and enter it:
    git clone https://github.com/LukasHedegaard/continual-skeletons.git
    cd continual-skeletons
  • Optionally create and activate a conda environment:
    conda create --name continual-skeletons python=3.8
    conda activate continual-skeletons
  • Install as an editable module:
    pip install -e .[dev]

Repository structure

The repository is structured as follows:

root
|- datasets/     # Dataset loaders
|- models/       # Individual models and shared base-code
    |- ...
    |- st_gcn/       # Baseline model
    |- cost_gcn/     # Continual version of model
    |- st_gcn_mod/   # Modified baseline with stride one and no padding
    |- cost_gcn_mod/ # Continual version of modified baseline model
        |- cost_gcn_mod.py  # Python entry-point
        |- scripts/         # Scripts used to achieve results from paper. Please run from root.
            |- evaluate_ntu60.py
            |- evaluate_ntu120.py
            |- evaluate_kinetics.py
            |- ...
|- tests/     # Unit tests for custom modules
|- weights/   # Place pretrained weights here
|- preds/     # Place extracted predictions here to perform multi-stream eval
|- Makefile   # Commands for testing, linting, cleaning.
|- .env       # Modify the path to your dataset here, e.g. DATASETS_PATH=/my/path
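
The preds/ folder is consumed by the multi-stream evaluation indicated by the stream counts in Fig. 2. As a rough sketch of what score-level fusion of saved per-stream predictions typically looks like (the file names and .npy format below are assumptions, not the repository's actual layout):

import numpy as np

# Hypothetical file names and format; the actual contents of preds/ may differ.
streams = ["preds/ntu60_xsub_joint.npy", "preds/ntu60_xsub_bone.npy"]

# Each file is assumed to hold an (num_samples, num_classes) score array.
fused = sum(np.load(path) for path in streams) / len(streams)
labels = np.load("preds/ntu60_xsub_labels.npy")   # assumed ground-truth labels

top1 = (fused.argmax(axis=1) == labels).mean()
print(f"Fused top-1 accuracy: {top1:.2%}")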

Dataset preparation

Download the skeleton data for NTU-RGBD-60 and NTU-RGBD-120 from here and put it in the nturgbd_raw directory, naming the downloaded skeleton folders nturgb+d_skeletons60 and nturgb+d_skeletons120, respectively. The skeleton data for the Kinetics dataset was extracted with the OpenPose toolbox by the ST-GCN authors; the extracted data, called Kinetics-skeleton (7.5 GB), can be downloaded directly from Google Drive and should be placed in the kinetics_raw directory.

Before training and testing the models, the datasets should be preprocessed. The downloaded data should be placed in the following directories:

root
|- datasets/
    |- data_preparation/
        |- nturgbd_raw/        # Raw NTU-RGBD skeleton data
            |- nturgb+d_skeletons60/    # Skeleton data for NTU-RGBD-60
            |- nturgb+d_skeletons120/   # Skeleton data for NTU-RGBD-120
            |- ...
            |- ntu60_samples_with_missing_skeletons.txt   # Sample IDs with missing skeletons in NTU-RGBD-60
            |- ntu120_samples_with_missing_skeletons.txt  # Sample IDs with missing skeletons in NTU-RGBD-120
        |- kinetics_raw/       # Raw Kinetics data
            |- kinetics_train/
            |- ...
            |- kinetics_val/
            |- ...
            |- kinetics_train_label.json
            |- kinetics_val_label.json

To generate the preprocessed data, run the following commands:

# NTU-RGBD-60
python datasets/data_preparation/ntu60_prep.py 
# NTU-RGBD-120
python datasets/data_preparation/ntu120_prep.py 
# Kinetics
python datasets/data_preparation/kinetics400_prep.py 

To generate bone and motion data for each dataset, run the following commands:

# Bone generation
python datasets/data_preparation/bone_data_prep.py 
# Motion generation
python datasets/data_preparation/motion_data_prep.py 

The joint and bone skeleton data can be concatenated by running the following command:

# joint_bone data concatenation
python datasets/data_preparation/merge_joint_bone_data.py 
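
For reference, the bone and motion modalities are simple transforms of the joint data: a bone is the offset of a joint from its parent joint, motion is the frame-to-frame difference, and the joint+bone input stacks the two along the channel axis. A minimal sketch under those standard definitions (the file path is an example, and only a truncated, hypothetical subset of the bone pairs is shown):

import numpy as np

# data: (N, C, T, V, M) joint coordinates, as produced by the steps above.
data = np.load("datasets/ntu60/xview/train_data_joint.npy", mmap_mode="r")

# Bone vectors: each joint minus its parent joint along the skeleton.
# Only a hypothetical subset of the (joint, parent) pairs is shown here.
pairs = [(1, 0), (2, 1), (3, 2)]
bone = np.zeros(data.shape, dtype=data.dtype)
for joint, parent in pairs:
    bone[:, :, :, joint, :] = data[:, :, :, joint, :] - data[:, :, :, parent, :]

# Motion: frame-to-frame temporal difference of the joint coordinates.
motion = np.zeros(data.shape, dtype=data.dtype)
motion[:, :, :-1] = data[:, :, 1:] - data[:, :, :-1]

# Joint+bone: concatenate along the channel axis, doubling C.
joint_bone = np.concatenate([data, bone], axis=1)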

Models

Individual folders with relevant scripts are available under /models for each model (baseline and continual variants; see the Repository structure overview above).

To see an overview of available commands for a model, check the help, e.g.:

python models/cost_gcn/cost_gcn.py --help

The commands used to produce the paper results are found in the associated scripts folder, e.g.:

python models/cost_gcn/scripts/evaluate_ntu60.py

Pretrained weights

Trained model weights are available here.

Experiments and results

To reproduce results:

  • Prepare datasets
    • Follow the download and preprocessing steps in the Dataset preparation section above.
    • Add DATASETS_PATH=/your/dataset/path to .env.
  • Download pretrained weights and place them in the weights/ folder.
  • Run the evaluation script. For instance, to evaluate the CoST-GCN* model on NTU RGB+D 120 and save its predictions, the command would be:
    python models/cost_gcn_mod/scripts/evaluate_ntu120.py

Benchmark

NTU RGB+D 60

NTU RGB+D 120

Kinetics Skeleton 400

Citation

@article{hedegaard2023continual,
  title   = {Continual spatio-temporal graph convolutional networks},
  author  = {Lukas Hedegaard and Negar Heidari and Alexandros Iosifidis},
  journal = {Pattern Recognition},
  volume  = {140},
  pages   = {109528},
  year    = {2023},
  issn    = {0031-3203},
  doi     = {10.1016/j.patcog.2023.109528},
}

continual-skeletons's People

Contributors

lukashedegaard, negarhdr


continual-skeletons's Issues

Evaluation Error on Kinetics dataset

Running "python models/cost_gcn/scripts/evaluate_kinetics.py" throws this error:

with open(path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/kinetics/classes.yaml'

Could you please provide the classes.yaml file?

Thanks

Training from scratch.

Hi There :-)

This is great work.
As part of my thesis, I want to investigate the impact of the adjacency matrices in the AGCN model.
Therefore I want to try to train it from scratch.
I already ran:

  • kinetics400_prep
  • bone_data_prep
  • merge_joint_bone_data

scripts for the Kinetics dataset, but couldn't find an example of running the training script.
I'd be happy to get any help with it.

Thanks in advance,
Asaf

How to train and test model

Hello, when I use your code to train or test the model, I notice that forward mode "frame" takes more time than forward mode "clip", and I don't know when to use each. Could you explain this in detail? Also, how can I train the model exactly as you did?
To run the model, I use: python models/cost_gcn_mod/cost_gcn_mod.py --id train_frame_mod_xview --gpus=1 --forward_mode frame --train --test --max_epochs=30 --optimization_metric loss --test --batch_size=16 --num_workers=8 --learning_rate=0.1 --weight_decay 0.00001 --dataset_normalization=0 --dataset_name ntu60 --dataset_classes /continual-skeletons/datasets/ntu60/classes.yaml --dataset_train_data /continual-skeletons/datasets/ntu60/xview/train_data_joint.npy --dataset_val_data /continual-skeletons/datasets/ntu60/xview/val_data_joint.npy --dataset_test_data /continual-skeletons/datasets/ntu60/xview/val_data_joint.npy --dataset_train_labels /continual-skeletons/datasets/ntu60/xview/train_label.pkl --dataset_val_labels /continual-skeletons/datasets/ntu60/xview/val_label.pkl --dataset_test_labels /continual-skeletons/datasets/ntu60/xview/val_label.pkl
I am looking forward to hearing from you!
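
For context on the question above: clip mode processes the entire sequence in one batched pass, while frame mode feeds one frame at a time and carries state between steps, which matches online deployment but is slower per sequence on a GPU when the whole clip is already available. A self-contained toy illustration of the two call patterns (the model and its forward_step method are stand-ins, not the repository's API):

import torch
import torch.nn as nn

# Toy stand-in; the repository's models and their exact step API may differ.
class TinyTemporalModel(nn.Module):
    def __init__(self, channels=3, kernel_size=9):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size)
        self.kernel_size = kernel_size
        self.state = []                       # FIFO buffer used in frame mode only

    def forward(self, clip):                  # "clip" mode: whole sequence at once
        return self.conv(clip)                # (N, C, T) -> (N, C, T - K + 1)

    def forward_step(self, frame):            # "frame" mode: one step with state
        self.state.append(frame)
        if len(self.state) < self.kernel_size:
            return None                       # transient while the buffer fills
        out = self.conv(torch.stack(self.state, dim=-1))[..., 0]
        self.state.pop(0)
        return out

model = TinyTemporalModel()
clip = torch.randn(1, 3, 300)                 # (N, C, T); joint dims omitted

out_clip = model(clip)                        # one batched pass over the clip
outs = [model.forward_step(clip[:, :, t]) for t in range(clip.shape[2])]
# Once the buffer is full, outs[t] equals out_clip[:, :, t - kernel_size + 1].

With shared weights, the per-step outputs match the columns of the clip-mode output once the buffer is full, which is why the same checkpoint can serve both modes.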

How to get the model output

Hello, thanks for your amazing work!

Since I'm not familiar with the Ride framework this repository builds on,
I want to ask whether I can get the output of the model directly in code,
like the following:

model = CoStGcn(args)
output = model(input)

I have read your Ride repo but didn't find the answer.
I look forward to your response. Thank you!
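
For readers with the same question: Ride models wrap torch.nn.Module, so once constructed they can be called like any PyTorch module. The sketch below only shows the pattern; the import path, constructor arguments, and output shape are assumptions to be checked against the model's --help output:

import torch

# Import path assumed from the repository layout (an assumption):
# from models.cost_gcn.cost_gcn import CoStGcn

# Hypothetical construction; the hyperparameters a model expects are listed by
#   python models/cost_gcn/cost_gcn.py --help
# model = CoStGcn(hparams)
# model.eval()

# Dummy input matching the shape the repository logs:
# (N, C, T, V, S) = (batch, channels, frames, joints, skeletons)
x = torch.randn(1, 3, 300, 25, 2)

# with torch.no_grad():
#     scores = model(x)  # expected: (N, num_classes) class scores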

Out of Memory

Hello, thanks for your great work. I want to train a model on the PKUMMD dataset with, for example, cos_tr.py; I only adjusted GraphDataset.py, as follows:

import pickle

import numpy as np
from torch.utils.data import Dataset

import tools  # augmentation helpers used below; the import path is an assumption


class GraphDataset(Dataset):
    def __init__(
        self,
        data_path,
        label_path,
        random_choose=False,
        random_shift=False,
        random_move=False,
        window_size=-1,
        normalization=False,
        mmap_mode="rb",
        istraining=True,
    ):
        self.data_path = data_path
        self.label_path = label_path
        self.random_choose = random_choose
        self.random_shift = random_shift
        self.random_move = random_move
        self.window_size = window_size
        self.normalization = normalization
        self.mmap_mode = mmap_mode  # despite the name, used as the file-open mode
        self.istraining = istraining
        self.inputs = []
        self.load_data()
        if normalization:
            self.get_mean_map()

    def load_data(self):
        self.gen_data(self.data_path, self.label_path)

    def gen_data(self, data_path, label_path):
        # Load per-session labels and skeletons from pickles.
        with open(label_path, self.mmap_mode) as f:
            target_all = pickle.load(f)
        with open(data_path, self.mmap_mode) as f:
            self.skeleton_all = pickle.load(f)

        self.enc_steps = 300
        self.dec_steps = 8
        # Enumerate all sliding windows (and their targets) per session.
        for session in target_all.keys():
            target = target_all[session]
            seed = np.random.randint(self.enc_steps) if self.istraining else 0
            for start, end in zip(
                range(seed, target.shape[0], 1),
                range(seed + self.enc_steps, target.shape[0] - self.dec_steps, 1),
            ):
                enc_target = target[start:end]
                dec_target = target[end : end + self.dec_steps]
                distance_target, class_h_target = self.get_distance_target(enc_target)
                self.inputs.append(
                    [
                        session,
                        start,
                        end,
                        enc_target,
                        distance_target,
                        class_h_target,
                        dec_target,
                    ]
                )

    def get_distance_target(self, target_vector):
        # 1 where a frame's class matches the final frame's class, else 0.
        target_matrix = np.zeros(self.enc_steps - 1)
        target_argmax = target_vector[self.enc_steps - 1].argmax()
        for i in range(self.enc_steps - 1):
            if target_vector[i].argmax() == target_argmax:
                target_matrix[i] = 1.0
        return target_matrix, target_vector[self.enc_steps - 1]

    def get_mean_map(self):
        # NOTE: this assumes self.data is a dense (N, C, T, V, M) array, which
        # gen_data above never sets; this is a latent bug in the snippet.
        data = self.data
        N, C, T, V, M = data.shape
        self.mean_map = (
            data.mean(axis=2, keepdims=True).mean(axis=4, keepdims=True).mean(axis=0)
        )
        self.std_map = (
            data.transpose((0, 2, 4, 1, 3))
            .reshape((N * T * M, C * V))
            .std(axis=0)
            .reshape((C, 1, V, 1))
        )

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        (
            session,
            start,
            end,
            enc_target,
            distance_target,
            class_h_target,
            dec_target,
        ) = self.inputs[index]
        data_numpy = self.skeleton_all[session][start:end]
        data_numpy = data_numpy.transpose((2, 0, 1, 3))  # -> (C, T, V, S)
        C, T, V, S = data_numpy.shape
        if T < self.enc_steps:
            # Zero-pad short windows to a fixed temporal length.
            data_new = np.zeros((C, self.enc_steps, V, S), dtype=np.float32)
            data_new[:, :T, :, :] = data_numpy
            data_numpy = data_new
        label = class_h_target

        if self.normalization:
            data_numpy = (data_numpy - self.mean_map) / self.std_map
        if self.random_shift:
            data_numpy = tools.random_shift(data_numpy)
        if self.random_choose:
            data_numpy = tools.random_choose(data_numpy, self.window_size)
        elif self.window_size > 0:
            data_numpy = tools.auto_pading(data_numpy, self.window_size)
        if self.random_move:
            data_numpy = tools.random_move(data_numpy)

        return data_numpy, label, index

But when I run the code from the command line as: python models/cos_tr/cos_tr.py --train --max_epochs 30 --id benchmark_costr_pkummdv1 --gpus "0,1,2,3" --profile_model --profile_model_num_runs 10 --forward_mode clip --batch_size 128 --num_workers 8 --dataset_name pkummd --dataset_classes ./datasets/pkummd/classes.yaml --dataset_train_data /data/pkummdv1_float32/train_subject_data_v1.pkl --dataset_val_data /data/pkummdv1_float32/test_subject_data_v1.pkl --dataset_train_labels /data/pkummdv1_float32/train_subject_label_thoum_v1.pkl --dataset_val_labels /data/pkummdv1_float32/test_subject_label_thoum_v1.pkl
the used memory quickly rises to more than 100 GB, although my total data is only about 5 GB. The output log is:
lightning: Global seed set to 123
ride: Running on host gpu-task-nod5
ride: ⭐️ View project repository at git@github.com:LukasHedegaard/continual-skeletons/tree/a8fe2937a33f24cce65c1f8c2fc41081bceda721
ride: Run data is saved locally at logs/run_logs/benchmark_costr_pkummdv1/version_6
ride: Logging using Tensorboard
ride: 💾 Saving logs/run_logs/benchmark_costr_pkummdv1/version_6/hparams.yaml
ride: 🚀 Running training
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
continual: Temporal stride of 2 will result in skipped outputs every 1 / 2 steps
models: Input shape (C, T, V, S) = (3, 300, 25, 2)
models: Receptive field 449
models: Init frames 144
models: Pool size 75
models: Stride 4
models: Padding 152
models: Using Continual CallMode.FORWARD
ride: ✅ Checkpointing on val/loss with optimisation direction min
/home/yaoning.li/Anaconda/yes/envs/mmlab/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:110: LightningDeprecationWarning: Trainer(distributed_backend=ddp) has been deprecated and will be removed in v1.5. Use Trainer(accelerator=ddp) instead.
rank_zero_deprecation(
lightning: GPU available: True, used: True
lightning: TPU available: False, using: 0 TPU cores
lightning: IPU available: False, using: 0 IPUs
lightning: Global seed set to 123
lightning: initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/4

I don't know why. Can you help me? Thank you!
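
For anyone hitting the same problem: a likely contributor in the snippet above is that gen_data materialises enc_target, dec_target, and distance targets for every sliding window, so the stored windows can dwarf the 5 GB of raw data; with num_workers=8 and 4-GPU DDP, the dataset object is additionally replicated across worker processes. A hedged sketch of the usual remedy, keeping only (session, start, end) indices and slicing targets on demand (method bodies abbreviated; enc_steps, dec_steps, istraining, and get_distance_target are assumed to be set up as in the snippet above):

import pickle
import numpy as np

# Sketch: store only window indices; slice data and targets lazily per item.
def gen_data(self, data_path, label_path):
    with open(label_path, "rb") as f:
        self.target_all = pickle.load(f)
    with open(data_path, "rb") as f:
        self.skeleton_all = pickle.load(f)
    for session, target in self.target_all.items():
        seed = np.random.randint(self.enc_steps) if self.istraining else 0
        for start, end in zip(
            range(seed, target.shape[0], 1),
            range(seed + self.enc_steps, target.shape[0] - self.dec_steps, 1),
        ):
            self.inputs.append((session, start, end))   # indices only

def __getitem__(self, index):
    session, start, end = self.inputs[index]
    target = self.target_all[session]
    distance_target, class_h_target = self.get_distance_target(target[start:end])
    data_numpy = self.skeleton_all[session][start:end].transpose((2, 0, 1, 3))
    # ... zero-padding and augmentation as in the original __getitem__ ...
    return data_numpy, class_h_target, index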
