This repository contains the official PyTorch implementation of the paper Toward Open-set Human Object Interaction Detection (AAAI 2024).
We provide weights for DHD models trained on HICO-DET.
Model | Dataset | Default Settings | DHD Weights | GroundingDINO Weights |
---|---|---|---|---|
DHD | HICO-DET | (29.91, 28.42, 30.35) | weights | weights |
- Install the lightweight deep learning library Pocket. The recommended PyTorch version is 1.9.0. Make sure the environment for Pocket is activated (`conda activate pocket`), and install the packaging library with `pip install packaging`.
- Initialize GroundingDINO and CLIP (from VIPLO); see the installation sketch after this list.
- Prepare the HICO-DET dataset.
  - If you have not downloaded the dataset before, run the following script.

    ```bash
    cd /path/to/dhd/hicodet
    bash download.sh
    ```
  - If you have previously downloaded the dataset, simply create a soft link.

    ```bash
    cd /path/to/dhd/hicodet
    ln -s /path/to/hicodet_20160224_det ./hico_20160224_det
    ```
- Prepare the V-COCO dataset (contained in MS COCO).
  - If you have not downloaded the dataset before, run the following script.

    ```bash
    cd /path/to/dhd/vcoco
    bash download.sh
    ```
  - If you have previously downloaded the dataset, simply create a soft link.

    ```bash
    cd /path/to/dhd/vcoco
    ln -s /path/to/coco ./mscoco2014
    ```
- Prepare the VG dataset from VG (Visual Genome).
  - If you have already downloaded the dataset, simply create a soft link.

    ```bash
    cd /path/to/dhd/vg
    ln -s /path/to/vg ./vg
    ```
- Prepare the preprocessed annotations from vg and hicodet, and put them into the corresponding dataset directories (a layout sanity check is sketched after this list).
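How GroundingDINO and CLIP are initialized is not spelled out here; the commands below are a minimal sketch based on the upstream projects' own installation instructions. The DHD code (and the VIPLO-derived CLIP module) may wire these up differently, so treat the checkpoint name and the `weights/` directory as assumptions.

```bash
# Hedged sketch: install the upstream GroundingDINO and CLIP packages.
# The checkpoint filename and the weights/ directory are assumptions,
# not something this repository documents.

# GroundingDINO (upstream install: clone and install in editable mode)
git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO
pip install -e .
# Pre-trained Swin-T checkpoint published by the GroundingDINO authors
mkdir -p weights
wget -P weights https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
cd ..

# CLIP (upstream install from the OpenAI repository)
pip install git+https://github.com/openai/CLIP.git
```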
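Before training, it can help to confirm that the soft links created above resolve to real data. The snippet below is a small sanity check, assuming the repository-relative paths used in the steps above (`hicodet/hico_20160224_det`, `vcoco/mscoco2014`, `vg/vg`); adjust it if your layout differs.

```bash
# Sanity check (assumed layout): verify that each dataset directory or
# soft link created above resolves to existing data.
cd /path/to/dhd
for d in hicodet/hico_20160224_det vcoco/mscoco2014 vg/vg; do
    if [ -d "$d" ]; then
        echo "OK:      $d"
    else
        echo "MISSING: $d"
    fi
done
```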
DHD is released under the BSD-3-Clause License.
Refer to `train.sh` for training and `test.sh` for testing commands with different options.
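A minimal usage sketch, assuming the scripts are run from the repository root and that all options are configured inside the scripts themselves:

```bash
# Train DHD with the options set inside train.sh
bash train.sh

# Evaluate a trained model with the options set inside test.sh
bash test.sh
```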
@inproceedings{wu2024toward,
title={Toward Open-Set Human Object Interaction Detection},
author={Wu, Mingrui and Liu, Yuqi and Ji, Jiayi and Sun, Xiaoshuai and Ji, Rongrong},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={6},
pages={6066--6073},
year={2024}
}
This repo is based on UPT.