bbepoch / HoiTransformer
This is the code for HOI Transformer
License: Apache License 2.0
Thanks for the nice work! Reading your paper and code, I find something confusing:
why is the performance on VCOCO similar to the paper, while much better than the paper on HICO-DET? What is the key factor behind these results?
Doesn't HICO have 80 object classes? Why is num_classes set to 91 in the code?
Thanks for sharing this great work.
I have one question:
when training on VCOCO with ResNet-50,
should I train for 150 or 250 epochs to reproduce the 51 AP result?
I tried to use your program to evaluate my own dataset. How can I run the evaluation on a custom dataset?
Thanks for such great work. I have a general question about the action query that is fed into the decoder of the transformer network.
I want to clarify whether the action query is the input feature of the sample itself, a feature extracted from the respective interaction's bounding box, or something else.
In this work, num_queries=100 by default. How are the features selected based on the number of queries?
Thanks in advance
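Not an authoritative answer, but for orientation: in DETR-style architectures the queries are usually a set of learned embeddings trained jointly with the network, not features taken from the image or from boxes. A minimal sketch of that pattern, following DETR's naming conventions (illustrative, not copied from this repo):

```python
import torch
import torch.nn as nn

# Each of the num_queries slots is a learned embedding; it is not an
# image feature or a box crop. The decoder cross-attends from these
# queries to the encoder's image features, so each query learns to
# "look for" one potential (human, object, action) triplet.
num_queries, hidden_dim = 100, 256
query_embed = nn.Embedding(num_queries, hidden_dim)

# At inference the same 100 embeddings are fed to the decoder for every
# image; during training, the bipartite matching loss decides which
# slots are responsible for which ground-truth interactions.
queries = query_embed.weight.unsqueeze(1)  # (num_queries, 1, hidden_dim)
```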
How can I use the code to run inference on my own dataset? My dataset is unlabeled; I just want to see the practical effect of the HOI algorithm.
Thank you very much.
@bbepoch thanks for open-sourcing the code base. I just wanted to know: will support be provided for ONNX conversion of the pre-trained model? Thanks in advance
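For anyone experimenting in the meantime, a generic torch.onnx.export recipe is sketched below. This is not an officially supported path for this repo; the stand-in model keeps the sketch runnable, and the real HoiTransformer forward may need a thin adapter first, since it normally consumes NestedTensor inputs rather than plain tensors.

```python
import torch
import torchvision

# Stand-in model so the sketch runs as-is; replace with the loaded
# HoiTransformer checkpoint (possibly wrapped to accept a plain
# (B, 3, H, W) tensor instead of a NestedTensor).
model = torchvision.models.resnet50().eval()
dummy = torch.randn(1, 3, 800, 1333)

torch.onnx.export(
    model, dummy, "model.onnx",
    opset_version=12,
    input_names=["images"],
    output_names=["outputs"],
    dynamic_axes={"images": {0: "batch", 2: "height", 3: "width"}},
)
```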
Hi,
Thank you so much for the great paper!
I'm confused about the data split on V-COCO after going through the code: did you use train2014 and val2014 for training and testing respectively? That is, only training and testing sets are used, without a validation set; or should I think of val2014 as serving as both the validation and test set? Thank you in advance.
Thanks for your nice work. Could you also provide the source code for training and evaluating on the VCOCO dataset, since training on the HICO-Det dataset takes too long? Thanks a lot.
@bbepoch hi, thanks for open-sourcing the code base. I had one query regarding the architecture: the current HoiTransformer uses DETR as its base; can we change it to YOLOR? If so, what is the process for changing the code base to YOLOR? Could you share your thoughts on this?
Thanks in advance
First of all, thank you so much for such amazing work.
I have two things to clarify.
First of all, thanks for sharing the source code of the paper (End-to-End Human Object Interaction Detection with HOI Transformer), which was accepted at CVPR. I have a question about your V-COCO dataset. The original V-COCO dataset consists of 5,400 images in the trainval set and 4,946 images in the test set, but the retagged dataset I downloaded from your Google Drive contains 4,971 trainval images and 4,539 test images.
May I ask whether I downloaded the wrong data, or has the data been specially processed?
I look forward to receiving your reply.
I found some Swin-B code in the repository and tried to train this model with a Swin-B backbone, but the training loss converges around 70 and won't drop with further training. Can you share how to train this model with Swin-B, or point out any code in this repository that still needs to be completed? I would try to finish it myself. I would appreciate your response; thank you so much.
Thanks for your great work. I am trying to evaluate the model in Known Object mode; is it possible for you to share the code for it?
Thanks for your nice work! I know --epochs=250 is used for the HICO-Det dataset. How many epochs did you use for the VCOCO dataset: 100 or fewer? And what about the other parameters, such as --lr_drop=200 and --batch_size=2; are they the same as for HICO-Det? Thanks a lot.
Thanks for your issue; it helps make this work a strong baseline for HOI detection.
It's a great job; thanks for your contribution.
I have a question about the paper. In the paper, you state that "All one-layer MLP branches for predicting confidence use a softmax function." But as far as I know, the VCOCO dataset's verbs are multi-label, so I wonder how softmax can be used to predict multiple labels.
I hope you can reply.
Thanks
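For context, the usual way these two formulations differ in code is sketched below. This is a generic illustration of single-label (softmax) versus multi-label (sigmoid) heads, not the authors' confirmed implementation; the dimensions are made up for the example.

```python
import torch

logits = torch.randn(100, 30)  # e.g. 100 queries x (29 verbs + background)

# Softmax head: scores sum to 1 per query, so each query commits to a
# single verb (plus a "no interaction" class). Multi-label ground truth
# can still be represented by emitting one query per
# (human, verb, object) triplet instead of one per human-object pair.
single_label_scores = logits.softmax(dim=-1)

# Sigmoid head: each verb is scored independently, the natural choice
# when one human-object pair can carry several verbs at once.
multi_label_scores = logits.sigmoid()
```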
Error message: /opt/conda/conda-bld/pytorch_1591914895884/work/aten/src/ATen/native/cuda/IndexKernel.cu:60: operator(): block: [0,0,0], thread: [62,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1591914895884/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=59 : device-side assert triggered
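A generic debugging note, not specific to this repo: device-side asserts like this usually mean a label or index value is out of range for the tensor being indexed, and because CUDA kernels run asynchronously, the reported Python line is often wrong. Forcing synchronous launches makes the trace point at the real offending op:

```python
# Must be set before any CUDA call (ideally at the very top of main.py,
# or as an environment variable on the command line). With synchronous
# launches, the stack trace lands on the actual indexing op; the culprit
# is typically a class/label id exceeding the classifier or embedding size.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```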
Does num_humans in hoitr.py represent the number of people? The number of people per image in the HICO dataset is not fixed, so how should this parameter be set?
After running the related experiments, I understand now. Sorry for misunderstanding the difference between your work and his.
Excuse me!
It's a great job; thanks for your contribution.
The paper says that HICO-DET has 80 objects, but num_classes is 91 at line 345 of /models/hoitr.py. Why is it set like this?
And could you tell me how to generate 'hico_train_retag_hoitr.odgt' from the HICO-DET dataset files?
Thank you very much!
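Not an official answer, but a likely explanation inherited from DETR: COCO-style category ids are non-contiguous, running up to 90 for 80 actual object classes, so DETR-style heads are sized by max id + 1 rather than the class count, and HICO-DET reuses the COCO object vocabulary. A small illustration (the id list below is the standard COCO mapping, not taken from this repo):

```python
# COCO's 80 object classes use ids 1..90, leaving ten ids unused.
# Sizing the classifier as max_id + 1 = 91 lets raw COCO ids index
# the logits directly, which would explain the 91 here.
unused_ids = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}
valid_ids = [i for i in range(1, 91) if i not in unused_ids]
assert len(valid_ids) == 80
num_classes = 91  # logits indexable by raw COCO ids 0..90
```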
I don't quite understand the self.human_cls_embed in class HoiTR. Why do we need to classify the human box? What do the 2 classes stand for?
Hi, I installed the latest versions of the requirements but got this error message:
ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops'
Installing older versions of the requirements gives the same error. Also, some of the pinned requirement versions are no longer available.
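A likely cause, assuming this repo inherits DETR's util/misc.py: the original version guard there, float(torchvision.__version__[:3]) < 0.7, misparses versions like "0.10.0" as 0.1 and then imports the private helper _new_empty_tensor, which newer torchvision removed. A hedged sketch of a more robust guard:

```python
# Sketch of a fixed guard for util/misc.py; only falls back to the
# legacy private helpers on genuinely old torchvision releases.
import torchvision

major, minor = (int(x) for x in torchvision.__version__.split(".")[:2])
if (major, minor) < (0, 7):
    from torchvision.ops import _new_empty_tensor
    from torchvision.ops.misc import _output_size
```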
I plan to use HoiTransformer on my own dataset, but it seems I should train a DETR model first. I have trained DETR (facebook) on my own dataset, but it didn't work. Can you give me some suggestions on DETR training?
Hello, in your repo vcoco_test.odgt contains 4,953 images, but the original VCOCO test set contains over 5,000 images. Should VCOCO performance be evaluated on 4,953 images or on 5,000+? Looking forward to your reply, thanks!
Hello author, thank you for your great work!
Recently, I tried to use HoiTransformer to train on my own dataset, and I have some questions about the ODGT annotations.
Taking HICO_train2015_00000001.jpg as an example, my questions are as follows:
In the original HICO annotation of this picture, there is only one pair of boxes for the motorcycle and the person. However, in the ODGT annotation, this picture has more than one box pair for the motorcycle and person (even though the image contains only one person and one motorcycle), and the coordinates of the boxes are not identical, though the differences are small. I do not understand this point: why are the coordinates of boxes for the same person or motorcycle different?
The person box in this picture is [207,32,426,299] in the HICO annotations, but [207,32,220,268] in the ODGT annotations. My understanding is that the HICO coordinates are the upper-left and lower-right corners of the box, whereas your ODGT coordinates are the upper-left corner plus the width and height. Although this explanation seems reasonable, under it the width and height of the person box should be [426-207=219, 299-32=267], yet the ODGT annotation says [220,268]. Please tell me why.
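For what it's worth, the one-pixel differences above are consistent with an inclusive-pixel convention, where width = x2 - x1 + 1. A small sketch of the two conversions (my own illustration, not code from the repo):

```python
# HICO-style box: corner format [x1, y1, x2, y2].
hico_box = [207, 32, 426, 299]
x1, y1, x2, y2 = hico_box

# Exclusive convention: w = x2 - x1       -> [207, 32, 219, 267]
# Inclusive convention: w = x2 - x1 + 1   -> [207, 32, 220, 268],
# which matches the ODGT values quoted above.
odgt_box = [x1, y1, x2 - x1 + 1, y2 - y1 + 1]
assert odgt_box == [207, 32, 220, 268]
```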
Thanks again for your excellent work! Your answer will be of great help to me.
Looking forward to your reply!
Thanks for your great work!! :)
I found the visualization of the attention map in the paper; could you provide the visualization code?
Thank you for your help.
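While waiting for official code, one generic way to grab decoder cross-attention maps in a DETR-family model is a forward hook. The sketch below is hypothetical: it assumes DETR-style module names (model.transformer.decoder.layers[-1].multihead_attn) and a loaded model and preprocessed images, so adjust to this repo's actual structure.

```python
import torch

# Collect cross-attention weights from the last decoder layer.
attn_maps = []

def save_attn(module, inputs, output):
    # nn.MultiheadAttention returns (attn_output, attn_weights).
    attn_maps.append(output[1].detach().cpu())

hook = (model.transformer.decoder.layers[-1]
        .multihead_attn.register_forward_hook(save_attn))
with torch.no_grad():
    model(images)
hook.remove()

# attn_maps[0] has shape (batch, num_queries, H*W); reshape the last
# dimension to the encoder feature-map size and upsample to overlay
# each query's attention on the input image.
```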
What is the third-party library nori2? What is the pip command to install it?
Could you provide the training log files? Thanks
Hello author, I trained on the VCOCO dataset with a ResNet-50 backbone on a single 3090, but I did not get the reported 51.9 test result, only around 46. What could cause this? I trained with the model's default parameters, changing only num_worker to 8.
I tried to train on V-COCO.
But when I run
python main.py --epochs=250 --lr_drop=110 --dataset_file=vcoco --batch_size=16 --backbone=resnet50
I get the following error:
File "E:\project\HoiTransformer-master\models\hoi_matcher.py", line 80, in forward
human_cost_class = -human_out_prob[:, human_tgt_ids]
IndexError: tensors used as indices must be long, byte or bool tensors
How can I solve this error?
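In case it helps others hitting this: the generic fix for that message is to cast the index tensor to int64 before indexing. A hedged sketch with made-up shapes, assuming the target ids arrive as int32 (common when numpy arrays default to int32, e.g. on Windows):

```python
import torch

# PyTorch advanced indexing requires long/byte/bool index tensors.
human_out_prob = torch.rand(200, 3)
human_tgt_ids = torch.tensor([1, 2, 0, 1], dtype=torch.int32)

# Indexing with the int32 tensor raises the IndexError above;
# casting to long at the point of use resolves it.
human_cost_class = -human_out_prob[:, human_tgt_ids.long()]
```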
Hello!
I am running the evaluation code and it throws an error while trying to load corre_hico.npy. Is this file provided in the repo?
Thanks.
The ODGT annotations are indeed much easier to understand for HOI detection. I was wondering if the script to convert V-COCO's raw annotations to the ODGT format could be shared. Thank you.
At line 80 in hoi_matcher.py, human_cost_class = -human_out_prob[:, human_tgt_ids], where human_out_prob has shape [200, 3] while human_tgt_ids has shape [4].
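For readers puzzled by those shapes: indexing the class-probability matrix with the target-id vector is how a Hungarian-matching cost matrix is typically built, selecting one probability column per ground-truth instance. A quick self-contained illustration (values invented):

```python
import torch

# 200 flattened query predictions over 3 classes, and 4 ground-truth
# humans whose class ids pick one probability column per target,
# yielding a (num_queries, num_targets) cost matrix for matching.
human_out_prob = torch.rand(200, 3).softmax(-1)
human_tgt_ids = torch.tensor([1, 1, 2, 1])
human_cost_class = -human_out_prob[:, human_tgt_ids]
print(human_cost_class.shape)  # torch.Size([200, 4])
```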
How do I train on my own dataset?
@bbepoch hi, thanks for sharing the code base, great work. But I had one query: when I tested the model on scenes such as a person running alone on a beach with no other object present, there were no detections/activities in the output. Is there any way I can get results like people walking, fighting, or waving without depending on an object being present in the scene?
Thanks in advance
Traceback (most recent call last):
  File "slurm_main.py", line 243, in <module>
    main(args)
  File "slurm_main.py", line 200, in main
    args.clip_max_norm)
  File "/mnt/lustre/penghuan/HoiTransformer/engine.py", line 32, in train_one_epoch
    for samples, targets in metric_logger.log_every(data_loader, print_freq, header):
  File "/mnt/lustre/penghuan/HoiTransformer/util/misc.py", line 223, in log_every
    for obj in iterable:
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 914, in __init__
    w.start()
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/r1024/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 321, in reduce_storage
    fd, size = storage._share_fd_()
RuntimeError: unable to mmap 32 bytes from file </torch_239305_1669300387>: Cannot allocate memory (12)
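A common workaround for this "unable to mmap ... Cannot allocate memory" failure in DataLoader workers, not specific to this repo: switch PyTorch's inter-process sharing strategy from file descriptors to the file system. Reducing --num_workers or raising the open-file limit (ulimit -n) are alternative mitigations.

```python
# Add near the top of the training entry point, before the DataLoader
# is created. 'file_system' sharing avoids exhausting file descriptors
# when worker processes share many tensors.
import torch.multiprocessing
torch.multiprocessing.set_sharing_strategy('file_system')
```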
Hi,
Will you release demo code for this work? I really wish to try out your work, but I'm stuck with limited resources. It would be great if you could release your well-trained models. Thank you so much.