yowov2's Introduction

YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection

English | 简体中文


Thank you very much for all the stars! YOWOv2 is a project I worked on in my spare time as a tribute to YOWO, a work I used to like very much. However, I am no longer active in the field of spatio-temporal action detection, so I am not in a position to answer everyone's issues; I am deeply sorry for that and hope you can understand. YOWOv2 is a completely open spatio-temporal action detection project. I have not attached any license to it, so please feel free to make any improvements or optimizations you want without my consent. As long as this work can make even a small contribution to the progress of the world, I will be very happy. If you find our work useful, please consider citing our paper on arXiv (see the bottom of this README).

Overview of YOWOv2

(Overview figure of the YOWOv2 framework.)

Requirements

  • We recommend using Anaconda to create a conda environment:
conda create -n yowo python=3.6
  • Then, activate the environment:
conda activate yowo
  • Finally, install the requirements (a quick environment check follows below):
pip install -r requirements.txt
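Optionally, you can verify that PyTorch was installed with working CUDA support before training. This is only a minimal sanity-check sketch and is not part of the original setup steps:

# optional sanity check for the environment
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))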

Visualization

(Example visualization images of YOWOv2 detection results.)

Dataset

UCF101-24:

You can download UCF24 from the following links:

  • Google drive

Link: https://drive.google.com/file/d/1Dwh90pRi7uGkH5qLRjQIFiEmMJrAog5J/view?usp=sharing

  • BaiduYun Disk

Link: https://pan.baidu.com/s/11GZvbV0oAzBhNDVKXsVGKg

Password: hmu6

AVA

You can follow the instructions here to prepare the AVA dataset.

Experiment

  • UCF101-24

| Model | Clip | GFLOPs | Params | F-mAP | V-mAP | FPS | Weight |
|-------|------|--------|--------|-------|-------|-----|--------|
| YOWOv2-Nano   | 16 | 1.3  | 3.5 M   | 78.8 | 48.0 | 42 | ckpt |
| YOWOv2-Tiny   | 16 | 2.9  | 10.9 M  | 80.5 | 51.3 | 50 | ckpt |
| YOWOv2-Medium | 16 | 12.0 | 52.0 M  | 83.1 | 50.7 | 42 | ckpt |
| YOWOv2-Large  | 16 | 53.6 | 109.7 M | 85.2 | 52.0 | 30 | ckpt |
| YOWOv2-Nano   | 32 | 2.0  | 3.5 M   | 79.4 | 49.0 | 42 | ckpt |
| YOWOv2-Tiny   | 32 | 4.5  | 10.9 M  | 83.0 | 51.2 | 50 | ckpt |
| YOWOv2-Medium | 32 | 12.7 | 52.0 M  | 83.7 | 52.5 | 40 | ckpt |
| YOWOv2-Large  | 32 | 91.9 | 109.7 M | 87.0 | 52.8 | 22 | ckpt |

All FLOPs are measured with a video clip of 16 or 32 frames (224×224). The FPS is measured with batch size 1 on a 3090 GPU, timed from model inference through the NMS operation.
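For reference, the timing protocol above (batch size 1, inference through NMS) can be reproduced with a loop like the hedged sketch below. The model and nms objects are placeholders for the YOWOv2 network and its post-processing; the repo's own benchmark code may differ in detail:

# hedged sketch of the FPS measurement described above
import time
import torch

@torch.no_grad()
def measure_fps(model, nms, num_iters=200, clip_len=16, img_size=224, device='cuda'):
    model = model.eval().to(device)
    clip = torch.randn(1, 3, clip_len, img_size, img_size, device=device)  # [B, C, T, H, W]
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_iters):
        outputs = model(clip)   # model inference
        _ = nms(outputs)        # NMS is included in the timed region
    torch.cuda.synchronize()
    return num_iters / (time.time() - start)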

(Figure: qualitative results on UCF101-24.)

  • AVA v2.2

| Model | Clip | mAP | FPS | Weight |
|-------|------|-----|-----|--------|
| YOWOv2-Nano   | 16 | 12.6 | 40 | ckpt |
| YOWOv2-Tiny   | 16 | 14.9 | 49 | ckpt |
| YOWOv2-Medium | 16 | 18.4 | 41 | ckpt |
| YOWOv2-Large  | 16 | 20.2 | 29 | ckpt |
| YOWOv2-Nano   | 32 | 12.7 | 40 | ckpt |
| YOWOv2-Tiny   | 32 | 15.6 | 49 | ckpt |
| YOWOv2-Medium | 32 | 18.4 | 40 | ckpt |
| YOWOv2-Large  | 32 | 21.7 | 22 | ckpt |

(Figure: qualitative results on AVA.)

Train YOWOv2

  • UCF101-24

For example:

python train.py --cuda -d ucf24 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16

or you can just run the script:

sh train_ucf.sh
  • AVA
python train.py --cuda -d ava_v2.2 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 --eval

or you can just run the script:

sh train_ava.sh

If you have multiple GPUs, you can launch DDP training of YOWOv2, for example:

python train.py --cuda -dist -d ava_v2.2 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 --eval

However, I do not have multiple GPUs, so I am not sure whether there are any bugs, or whether the reported performance can be reproduced with DDP.

Test YOWOv2

  • UCF101-24 For example:
python test.py --cuda -d ucf24 -v yowo_v2_nano --weight path/to/weight -size 224 --show
  • AVA For example:
python test.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight -size 224 --show

Test YOWOv2 on AVA video

For example:

python test_video_ava.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight --video path/to/video --show

Note that path/to/video can point to any video on your local device; it does not have to be an AVA video.

Evaluate YOWOv2

  • UCF101-24 For example:
# Frame mAP
python eval.py \
        --cuda \
        -d ucf24 \
        -v yowo_v2_nano \
        -bs 16 \
        -size 224 \
        --weight path/to/weight \
        --cal_frame_mAP
# Video mAP
python eval.py \
        --cuda \
        -d ucf24 \
        -v yowo_v2_nano \
        -bs 16 \
        -size 224 \
        --weight path/to/weight \
        --cal_video_mAP
  • AVA

Run the following command to calculate frame mAP@0.5 IoU:

python eval.py \
        --cuda \
        -d ava_v2.2 \
        -v yowo_v2_nano \
        -bs 16 \
        --weight path/to/weight

Demo

# run demo (use -d ucf24 or -d ava_v2.2 to match the trained weight)
python demo.py --cuda -d ucf24 -v yowo_v2_nano -size 224 --weight path/to/weight --video path/to/video --show

(Figure: qualitative results in real-world scenarios.)

References

If you are using our code, please consider citing our paper.

@article{yang2023yowov2,
  title={YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection},
  author={Yang, Jianhua and Kun, Dai},
  journal={arXiv preprint arXiv:2302.06848},
  year={2023}
}

yowov2's People

Contributors

ajaichemmanam, yjh0410


yowov2's Issues

custom dataset training

Can you help me train on a custom dataset with animal action classes such as "elephant walking", "elephant sleeping", and so on?

Missing key(s) in state_dict:

When I run demo.py, I get the following error:

Traceback (most recent call last):
  File "demo.py", line 260, in <module>
    model = load_weight(model=model, path_to_ckpt=args.weight)
  File "F:\Projects\action-recognition\New_folder\YOWOv2\utils\misc.py", line 161, in load_weight
    model.load_state_dict(checkpoint_state_dict)
  File "C:\Users\UMAIR COMPUTER\anaconda3\envs\actions_env\lib\site-packages\torch\nn\modules\module.py", line 1224, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for YOWO:
        Missing key(s) in state_dict: "cls_preds.0.weight", "cls_preds.0.bias", "cls_preds.1.weight", "cls_preds.1.bias", "cls_preds.2.weight", "cls_preds.2.bias".
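One way to diagnose this kind of mismatch is to list the parameter names actually stored in the checkpoint and check whether the cls_preds.* heads are present; missing entries usually indicate that the checkpoint was produced with a different model configuration (or code version) than the one demo.py builds. A minimal, hedged sketch; whether the weights are nested under a 'model' key depends on how the checkpoint was saved:

# hedged sketch: inspect which parameter names a checkpoint actually contains
import torch

ckpt = torch.load('path/to/weight', map_location='cpu')
state = ckpt.get('model', ckpt) if isinstance(ckpt, dict) else ckpt  # weights may be nested
for name, value in sorted(state.items()):
    shape = tuple(value.shape) if hasattr(value, 'shape') else value
    print(name, shape)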

Support correct TRT/anything inference

Hi! Thanks for this repo.
It seems that a PyTorch-to-ONNX conversion needs to be added to enable deployment. Right now this repo has no inference/export path, which is surprising for the creators of a network that positions itself as the best real-time action detection NN. Please add ONNX support and a minimal C++ example for TensorRT and OpenVINO. At the moment I don't understand what to do with the 5D input tensor in TensorRT.
Thanks!
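There is indeed no export script in this repository. For what it's worth, a minimal, untested sketch of a PyTorch-to-ONNX export for the 5D clip input could look like the following; whether the model's post-processing is traceable as-is has not been verified here:

# hedged sketch only: `model` is assumed to be an already-constructed YOWOv2
# network in eval mode (built by your own script); post-processing may need
# to be detached before export
import torch

dummy_clip = torch.randn(1, 3, 16, 224, 224)   # [B, C, T, H, W]
torch.onnx.export(
    model,
    dummy_clip,
    "yowo_v2.onnx",
    input_names=["video_clip"],
    output_names=["outputs"],
    opset_version=12,
)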

A new version of YOWOv2 is being developed by us.

Update: the YOWOv3 paper has been made public here.
The official code has also been made public.

First of all, I would like to express my sincere gratitude to the author. This work is truly amazing and helpful. This repo has been tremendously helpful to me during my research in the field of spatio-temporal action detection. In the past few months, I have frequently reviewed the code, and today I noticed that the author has updated the README.

It's unfortunate that the author has stopped providing support for the issues in this repo. However, I am also conducting my own research on a YOWO version and have achieved some promising results:

UCF-101: 84.x mAP with a 54M parameter model
AVAv2.2: 20.21 mAP with a 96M parameter model
Update : UCF-101 : 88.24 mAP with a 59.8M #param model !
Update : AVAv2.2 : 20.31 mAP with a 59.86M #param model !

I have just made my code public at : https://github.com/AakiraOtok/Project_VU

My research focuses on utilizing different techniques to improve the YOWOv2 model, along with support for various experimental configurations. The usage instructions for the repo are still quite sparse at the moment because I am still fine-tuning my model. However, they will be updated soon if people are interested. And of course, I will also provide support and answer any questions regarding my repo.

Once again, thank you to the author @yjh0410.

Training parameters

Hi @yjh0410,

I chose 224 as the training resolution, but I am not getting good results for bounding-box prediction, even though classification works well. For two classes I get more false positives than true positives, and inference is slow.
The mAP during training is also very low. Could you help me with this?

Thanks in advance.

recognition problem

Is there any way to integrate YOWOv2 into my current YOLOX detector module? That is, take the boxes from the YOLOX head, pass them to the YOWOv2 classification part, and classify that particular object.

Increasing the batch size causes an error

As shown in the attached screenshot, training runs normally with a batch size of 8, but increasing it to 16 or more causes an error. I am using a Tesla V100 with 32 GB of memory, which in theory should be enough even for a batch size of around 80.

Dataset Structure Mismatch

Hello,

I've prepared my custom dataset in the AVA v2.2 structure. However, when I run the train script with the ava_v2.2 parameter, an error is raised from ava_helper.py even though my dataset follows the AVA v2.2 structure. Is there any document that explains the expected data structure for training multi-person spatio-temporal action detection?

Best
Alper

Question about demo inference

I'd like to ask about the inference in demo.py. Isn't this model trained on video clips of actions? The demo seems to use only one frame at a time and simply copies it 16 times for inference. Isn't that just single-frame object detection?
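For context, a common alternative to copying one frame K times is to keep a sliding window of the most recent K frames and feed that buffer to the model. The sketch below only illustrates that idea and is not taken from demo.py:

# hedged sketch of a sliding-window clip buffer for streaming inference
from collections import deque

import cv2

K = 16
frames = deque(maxlen=K)              # always holds the K most recent frames
cap = cv2.VideoCapture('path/to/video')
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame)
    if len(frames) == K:
        clip = list(frames)           # this clip would be preprocessed and fed to the model
cap.release()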

Which torch / torchvision / cuda version should I use for this project?

I have CUDA 11.2 on my system, and I am trying to run training with this command:
python train.py --cuda -d ucf24 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16
If I install torch and torchvision using requirements.txt, it runs fine without cuda, but with cuda I get this error:

AssertionError: Torch not compiled with CUDA enabled

I tried to manually install torch 1.8.0+cu111 and torchvision 0.9.0+cu111, but when I run this command I get this error:

Traceback (most recent call last):
  File "train.py", line 330, in <module>
    train()
  File "train.py", line 222, in train
    for iter_i, (frame_ids, video_clips, targets) in enumerate(dataloader):
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\site-packages\torch\utils\data\dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\site-packages\torch\utils\data\dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\site-packages\torch\utils\data\dataloader.py", line 914, in __init__
    w.start()
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\envs\yowo\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

I have no issues with manually installed CPU-only versions of torch and torchvision of the same versions.
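The BrokenPipeError above is the usual symptom of Windows' spawn-based multiprocessing inside the DataLoader rather than a CUDA problem. A common workaround (not an official fix from this repo) is to disable worker processes:

python train.py --cuda -d ucf24 --root path/to/dataset -v yowo_v2_nano --num_workers 0 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16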

dataloader

Why is img_id_temp used instead of img_id?

for i in reversed(range(self.len_clip)):
            # make it as a loop
            img_id_temp = img_id - i * d
            if img_id_temp < 1:
                img_id_temp = 1
            elif img_id_temp > max_num:
                img_id_temp = max_num

            # load a frame
            if self.dataset == 'ucf24':
                path_tmp = os.path.join(self.data_root, 'rgb-images', img_split[1], img_split[2] ,'{:05d}.jpg'.format(img_id_temp))
            elif self.dataset == 'jhmdb21':
                path_tmp = os.path.join(self.data_root, 'rgb-images', img_split[1], img_split[2] ,'{:05d}.png'.format(img_id_temp))
            frame = Image.open(path_tmp).convert('RGB')

how to train it on custom dataset

@yjh0410 Can you provide a guideline on how to train YOWOv2 on my own dataset? Here is the alteration I made to the dataset config file:
'ava_v2.2': {
# dataset
'frames_dir': '/home/ubuntu/Furqan/new_dataset/20231004-activity-all-annotated-tasks/',
'frame_list': '/home/ubuntu/hammad/YOWOv2/config/frame/',
'annotation_dir': '/home/ubuntu/hammad/new_annotations/',
'train_gt_box_list': '/home/ubuntu/hammad/new_annotations/20231005_ke_activity_train_avastyle_17.csv',
'val_gt_box_list': '/home/ubuntu/hammad/new_annotations/20231005_ke_activity_val_avastyle_17.csv',
'train_exclusion_file': '/home/ubuntu/hammad/new_annotations/ava_train_excluded_timestamps_v2.2 (1).csv',
'val_exclusion_file': '/home/ubuntu/hammad/new_annotations/ava_val_excluded_timestamps_v2.2.csv',
'labelmap_file': '/home/ubuntu/hammad/YOWOv2/labels.pbtxt', # 'ava_v2.2/ava_action_list_v2.2.pbtxt',
'class_ratio_file': None,
'backup_dir': 'results/',

However, it gives me this error:

Model Config: YOWO_V2_LARGE
Finished loading image paths from: /home/ubuntu/hammad/YOWOv2/config/frame/train.csv
Traceback (most recent call last):
  File "train.py", line 330, in <module>
    train()
  File "train.py", line 156, in train
    dataset, evaluator, num_classes = build_dataset(d_cfg, args, is_train=True)
  File "/home/ubuntu/hammad/YOWOv2/utils/misc.py", line 67, in build_dataset
    dataset = AVA_Dataset(
  File "/home/ubuntu/hammad/YOWOv2/dataset/ava.py", line 49, in __init__
    self._load_data()
  File "/home/ubuntu/hammad/YOWOv2/dataset/ava.py", line 65, in _load_data
    boxes_and_labels = ava_helper.load_boxes_and_labels(
  File "/home/ubuntu/hammad/YOWOv2/dataset/ava_helper.py", line 134, in load_boxes_and_labels
    if box_key not in all_boxes[video_name][frame_sec]:
KeyError: 0

Distributed training

Hello! I tried to run the following command:

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py --cuda -d ava_v2.2 --root /data/ztq/data/ava/ -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 --eval

but ran into the following error:

train.py: error: unrecognized arguments: --local_rank=0
Killing subprocess 2890
Killing subprocess 2891
Traceback (most recent call last):
  File "/home/dsphaoyang/anaconda3/envs/yowo/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/dsphaoyang/anaconda3/envs/yowo/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/dsphaoyang/anaconda3/envs/yowo/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/home/dsphaoyang/anaconda3/envs/yowo/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/home/dsphaoyang/anaconda3/envs/yowo/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/dsphaoyang/anaconda3/envs/yowo/bin/python', '-u', 'train.py', '--local_rank=1', '--cuda', '-d', 'ava_v2.2', '--root', '/data/ztq/data/ava/', '-v', 'yowo_v2_nano', '--num_workers', '4', '--eval_epoch', '1', '--max_epoch', '10', '--lr_epoch', '3', '4', '5', '6', '-lr', '0.0001', '-ldr', '0.5', '-bs', '8', '-accu', '16', '-K', '16', '--eval']' returned non-zero exit status 2.

This seems to be because train.py does not define a local_rank argument. But if the parser does not accept local_rank, how can I run single-machine multi-GPU training, given that torch.distributed.launch cannot be used this way?
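When torch.distributed.launch is used with older PyTorch versions, it injects a --local_rank argument into each worker process, so the standard workaround is to accept that argument in train.py's parser; whether this interacts correctly with the repo's own -dist flag has not been verified here. Note that the DDP example earlier in this README launches train.py directly with -dist rather than through torch.distributed.launch. A minimal sketch:

# hedged sketch: accept the --local_rank argument injected by torch.distributed.launch
import argparse

parser = argparse.ArgumentParser()
# ... the existing YOWOv2 arguments would be defined here ...
parser.add_argument('--local_rank', type=int, default=0)
args, unknown = parser.parse_known_args()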

Contents of trainlist.txt and the boxes used during training

There are two things I don't understand; let me spell them out so the questions are clear:
1) Each line of trainlist.txt is the relative path of a single frame (image) from some video. From the code, that frame is used to locate the corresponding video, and 16 frames are then taken from it as training data. Why isn't the data selected per video instead?
2) Within one video the action changes and the action box changes in every frame, so why is the ground truth during training represented by only a single set of [x1, y1, x2, y2] boxes per sample? Shouldn't the number of boxes equal the number of frames? I've just started in this direction and would appreciate some guidance.

in dynamic_k_matching _, pos_idx = torch.topk( RuntimeError: selected index k out of range

May I ask whether you have encountered this error when building the network: a few targets get no positive sample points after get_in_boxes_info, i.e. none of the grids in the image can be assigned as positive samples. If you have encountered it, could you describe how you solved it?

yowo/matcher.py", line 191, in dynamic_k_matching
_, pos_idx = torch.topk(
RuntimeError: selected index k out of range


cost tensor([], size=(1, 0))
num_gtu: 1
gt_idx 0
cost[gt_idx] tensor([])
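For reference, this failure mode (an empty cost row for a ground-truth box, so torch.topk is asked for more elements than exist) is typically guarded by clamping k and skipping GTs that have no candidate anchors. The sketch below only illustrates that guard; it is not a patch taken from yowo/matcher.py:

# hedged sketch of a guard against "selected index k out of range"
import torch

def safe_topk(cost_row, k):
    # cost_row: 1-D tensor of matching costs between one GT box and all candidates
    k = min(int(k), cost_row.numel())
    if k == 0:
        return cost_row.new_empty(0, dtype=torch.long)   # no candidates: skip this GT
    _, pos_idx = torch.topk(cost_row, k, largest=False)
    return pos_idx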

blank images

How can I add blank background images (frames with no actions) to reduce false positives?

Pretrained model for a custom dataset

Hi @yjh0410,
I have tried training YOWO on a custom dataset, but I cannot reach a high average precision. I also tried loading pretrained weights; at first the mAP increases, but it decreases as training goes on.
Is this because my dataset is currently very small? Could you help me with this?
(Is it because I used background images? The background images I used do not have output labels in the YOLO style.)
The gray curve is pretraining with focal loss; the orange curve is pretraining without focal loss.
Thanks in advance.


output video

Team,

How do I get output videos?
I only see a set of images when I run inference.

Changing person Detection Backbone

Greetings,

First of all, thank you for your amazing repo, @yjh0410! I want to ask how I can change backbone2d, which performs the person detection step during the training and inference phases.

Best
Alper

About video-mAP

Hi, this repo is really useful. However, I have a question about how the video-mAP metric is calculated. As far as I know, UCF101-24 first appeared in the THUMOS'13 challenge, and the official metric does not mention video-mAP. Could you provide me with any related documentation on this measurement? Thank you very much.

By the way, the original train/test split provided in THUMOS'13 includes 3 splits, but I only see 1 split in the data. May I ask the reason?

Train with UCF24 dataset

Traceback (most recent call last):
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/train.py", line 330, in <module>
    train()
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/train.py", line 222, in train
    for iter_i, (frame_ids, video_clips, targets) in enumerate(dataloader):
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/_utils.py", line 722, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/minhtran/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/dataset/ucf_jhmdb.py", line 55, in __getitem__
    frame_idx, video_clip, target = self.pull_item(index)
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/dataset/ucf_jhmdb.py", line 121, in pull_item
    video_clip, target = self.transform(video_clip, target)
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/dataset/transforms.py", line 127, in __call__
    video_clip = self.random_distort_image(video_clip)
  File "/media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/dataset/transforms.py", line 36, in random_distort_image
    cs[1] = cs[1].point(lambda i: i * dsat)
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 1651, in point
    return self._new(self.im.point(lut, mode))
TypeError: 'float' object cannot be interpreted as an integer

I executed the command python3 train.py --cuda -d ucf24 --root /media/minhtran/Resources/Learning/AI_DL_Projects/YOWOv2/YOWOv2/data -v yowo_v2_tiny --num_workers 4 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 -lr 0.0001 -ldr 0.5 -bs 8 -accu 16 -K 16 after following all the setup instructions in the README.md, but got the error above. Has anyone met this issue? Please help me solve it.
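The TypeError above comes from Image.point() in newer Pillow releases, which requires the mapping function to return integers, while the saturation distortion in dataset/transforms.py returns floats. Below is a self-contained, hedged illustration of the usual fix (casting inside the lambda); it mirrors the transform but is not a literal patch of that file:

# hedged illustration of the int(...) cast that avoids the Pillow TypeError
import numpy as np
from PIL import Image

img = Image.fromarray(np.uint8(np.random.rand(224, 224, 3) * 255)).convert('HSV')
h, s, v = img.split()
dsat = 1.2                                         # example saturation factor
s = s.point(lambda i: min(255, int(i * dsat)))     # int(...) keeps point() happy
img = Image.merge('HSV', (h, s, v)).convert('RGB')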

Creating Custom dataset like AVA

Hi researchers,
I have been looking for a long time for how to create a dataset in the AVA format.
If anybody has any idea about that, please leave feedback. I would appreciate it.
Thank you in advance!

Hello, some information is printed during training

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_bn() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[WARN] Cannot find rule for <class 'torch.nn.modules.container.Sequential'>. Treat it as zero Macs and zero Params.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_backbone.ShuffleV2Block'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_backbone.ShuffleNetV2'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_basic.Conv'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_neck.SPPF'>. Treat it as zero Macs and zero Params.
[INFO] Register count_relu() for <class 'torch.nn.modules.activation.LeakyReLU'>.
[WARN] Cannot find rule for <class 'torch.nn.modules.container.ModuleList'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_fpn.ELANBlock'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_fpn.PaFPNELAN'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free_head.DecoupledHead'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.cnn_2d.yolo_free.yolo_free.FreeYOLO'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_2d.backbone_2d.Backbone2D'>. Treat it as zero Macs and zero Params.
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv3d'>.
[INFO] Register count_bn() for <class 'torch.nn.modules.batchnorm.BatchNorm3d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool3d'>.
[WARN] Cannot find rule for <class 'models.backbone.backbone_3d.cnn_3d.shufflnetv2.InvertedResidual'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_3d.cnn_3d.shufflnetv2.ShuffleNetV2'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.backbone.backbone_3d.backbone_3d.Backbone3D'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.basic.conv.Conv2d'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'torch.nn.modules.activation.Softmax'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.yowo.encoder.CSAM'>. Treat it as zero Macs and zero Params.
[INFO] Register zero_ops() for <class 'torch.nn.modules.dropout.Dropout'>.
[WARN] Cannot find rule for <class 'models.yowo.encoder.ChannelEncoder'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.yowo.head.DecoupledHead'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'models.yowo.yowo.YOWO'>. Treat it as zero Macs and zero Params.
/home/ubuntu/anaconda3/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

FLOPs : 1.25 G
Params : 3.52 M

Optimizer: adamw
--momentum: 0.9
--weight_decay: 0.0005

WarmUpScheduler: linear
--base_lr: 0.0001
--warmup_factor: 0.00066667
--wp_iter: 500
tensor([[0.2557, 0.0974, 0.5837, 0.8539]], device='cuda:0')
tensor([[0.6044, 0.1925, 0.8323, 0.6527]], device='cuda:0')
tensor([[0.3208, 0.2667, 0.5094, 0.9091]], device='cuda:0')
tensor([[0.3756, 0.0000, 0.7563, 0.8317]], device='cuda:0')
tensor([[0.3898, 0.0000, 0.6712, 0.4589]], device='cuda:0')
tensor([[0.4688, 0.2174, 0.6761, 0.7488]], device='cuda:0')
tensor([[0.0010, 0.3571, 0.3080, 0.9990]], device='cuda:0')
tensor([[0.2393, 0.0474, 0.5123, 0.8147]], device='cuda:0')
[Epoch: 1/8][Iter: 0/42229][lr: 0.000000][loss_conf: 15.85][loss_cls: 1.40][loss_box: 0.87][losses: 21.58][time: 1.72]
tensor([[0.3943, 0.2675, 0.6057, 0.9424],
[0.1434, 0.2716, 0.3835, 0.9300]], device='cuda:0')
tensor([[0.4412, 0.0667, 0.6581, 0.6292],
[0.4559, 0.0750, 0.5735, 0.3875]], device='cuda:0')
tensor([[0.5899, 0.0000, 0.7022, 0.2686],
[0.3399, 0.0289, 0.5028, 0.2934]], device='cuda:0')
tensor([[0.6540, 0.3468, 0.8635, 0.9009]], device='cuda:0')
tensor([[0.3647, 0.0000, 0.7098, 0.6789]], device='cuda:0')
tensor([[0.5238, 0.0167, 0.8095, 0.6542]], device='cuda:0')
tensor([[0.5495, 0.0000, 0.8435, 0.7583],
[0.3706, 0.2844, 0.8211, 0.7204]], device='cuda:0')
tensor([[0.1012, 0.0159, 0.3742, 0.7729]], device='cuda:0')
tensor([[0.2273, 0.2603, 0.3977, 0.7111],
[0.0227, 0.2857, 0.3523, 0.7175]], device='cuda:0')

These tensors are printed automatically, which does not look normal. I am using torch 1.11.

about converting to coreml

import coremltools as ct
from coremltools.converters.mil.mil import types

coreml_model = ct.convert(
    traced_model,
    convert_to="mlprogram",
    inputs=[ct.TensorType(shape=(1, 3, 16, 224, 224), dtype=types.float)]
)
coreml_model.save('yowo_tiny.mlmodel')

File "C:\Users\Admin\AppData\Roaming\Python\Python310\site-packages\coremltools\converters\mil\mil\builder.py", line 168, in _add_op
new_op = op_cls(**kwargs)
File "C:\Users\Admin\AppData\Roaming\Python\Python310\site-packages\coremltools\converters\mil\mil\operation.py", line 190, in init
self._validate_and_set_inputs(input_kv)
File "C:\Users\Admin\AppData\Roaming\Python\Python310\site-packages\coremltools\converters\mil\mil\operation.py", line 503, in _validate_and_set_inputs
self.input_spec.validate_inputs(self.name, self.op_type, input_kvs)
File "C:\Users\Admin\AppData\Roaming\Python\Python310\site-packages\coremltools\converters\mil\mil\input_type.py", line 163, in validate_inputs
raise ValueError(msg.format(name, var.name, input_type.type_str,
ValueError: Op "pred_reg.1" (op_type: gather) Input indices="anchor_idxs.1" expects tensor or scalar of dtype from type domain ['int32'] but got tensor[is1,fp32]

FPS Value

Hi all,

In demo.py the batch_size is fixed to 1, so the FPS of the main detection loop is lower than it could be. How can I increase the batch_size or otherwise make the system faster?

Greetings

Can I train multi object action detection?

Hello @yjh0410 ,

I have a dataset with two action classes that I need to train on, but my dataset has multiple objects in the same frame. Will the YOWO model localize two actions in a single frame?

thank you in advance

Abnormal losses with DDP

Hello, why are the losses abnormally large when training with DDP, while single-GPU training does not show this problem? Thanks for your help.

eval.py

Hello, following your method I can see the eval results of each epoch during training. But after training finishes, when I run eval.py on its own and load the freshly trained weights, it reports that model keys are missing. May I ask why that is?
