
rt-mdnet's Introduction

RT-MDNet: Real-Time Multi-Domain Convolutional Neural Network Tracker

Created by Ilchae Jung, Jeany Son, Mooyeol Baek, and Bohyung Han

Introduction

RT-MDNet is the real-time extension of MDNet and is a state-of-the-art real-time tracker. A detailed description of the system is provided on our project page and in our paper.

Citation

If you are using this code in a publication, please cite our paper:

@InProceedings{rtmdnet,
author = {Jung, Ilchae and Son, Jeany and Baek, Mooyeol and Han, Bohyung},
title = {Real-Time MDNet},
booktitle = {European Conference on Computer Vision (ECCV)},
month = {Sept},
year = {2018}
}

System Requirements

This code is tested on 64 bit Linux (Ubuntu 16.04 LTS).

Prerequisites

1. PyTorch (>= 0.2.1)
2. For GPU support, a GPU (~2GB of memory for testing) and the CUDA toolkit
3. The training dataset (ImageNet-Vid), if needed

Online Tracking

Pretrained Model and Results

If you only want to run the tracker, you can use the pretrained model: RT-MDNet-ImageNet-pretrained. Results from the pretrained model are also provided here.

Demo

Run 'Run.py'.

Learning RT-MDNet

Preparing Datasets

1. After downloading the ImageNet-Vid dataset, run 'modules/prepro_data_imagenet.py' to parse the metadata; this generates 'imagenet_refine.pkl'.
2. Set the path of 'imagenet_refine.pkl' in 'train_mrcnn.py'.

Demo

Run 'train_mrcnn.py' after tuning the hyper-parameters to suit the capacity of your system.

rt-mdnet's People

Contributors

ilchaejung


rt-mdnet's Issues

samples2maskroi

I cannot understand this code. Why is the receptive field subtracted from x2 and y2? Could you explain the reason in more detail?
rois[:, 0] *= cur_resize_ratio[0]
rois[:, 1] *= cur_resize_ratio[1]
rois[:, 2] = np.maximum(rois[:,0]+1,rois[:, 2]*cur_resize_ratio[0] - receptive_field)
rois[:, 3] = np.maximum(rois[:,1]+1,rois[:, 3]*cur_resize_ratio[1] - receptive_field)
why not like this:
rois[:, 0] *= cur_resize_ratio[0]
rois[:, 1] *= cur_resize_ratio[1]
rois[:, 2] *= cur_resize_ratio[0]
rois[:, 3] *= cur_resize_ratio[1]
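A minimal numeric sketch of what this mapping does (my reading, not an official answer): `cur_resize_ratio` scales image coordinates onto the conv feature map, the receptive-field term pulls the right/bottom corner inward, and the `np.maximum(... + 1, ...)` guard keeps every ROI at least one cell wide and tall. All numbers below are made up for illustration.

```python
import numpy as np

# Illustrative values only; the real ones come from the network configuration.
cur_resize_ratio = np.array([0.125, 0.125])  # image -> conv feature-map scale
receptive_field = 3.0                        # receptive-field extent in feature cells

rois = np.array([[40., 40., 200., 200.]])    # [x1, y1, x2, y2] in image coords

rois[:, 0] *= cur_resize_ratio[0]            # x1 -> 5
rois[:, 1] *= cur_resize_ratio[1]            # y1 -> 5
# x2/y2 shrink by the receptive field (25 - 3 = 22 here); the max(..., x1 + 1)
# guard keeps degenerate boxes at least one cell wide/tall.
rois[:, 2] = np.maximum(rois[:, 0] + 1, rois[:, 2] * cur_resize_ratio[0] - receptive_field)
rois[:, 3] = np.maximum(rois[:, 1] + 1, rois[:, 3] * cur_resize_ratio[1] - receptive_field)
assert rois.tolist() == [[5.0, 5.0, 22.0, 22.0]]
```

Scaling all four coordinates uniformly, as in the second snippet, would instead pool over cells whose receptive fields spill past the object boundary.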

about padding

padded_x1 = (neg_examples[:,0]-neg_examples[:,2]*(opts['padding']-1.)/2.).min()
padded_y1 = (neg_examples[:,1]-neg_examples[:,3]*(opts['padding']-1.)/2.).min()
padded_x2 = (neg_examples[:,0]+neg_examples[:,2]*(opts['padding']+1.)/2.).max()
padded_y2 = (neg_examples[:,1]+neg_examples[:,3]*(opts['padding']+1.)/2.).max()

I don't understand why padding is calculated like this. The neg_examples bbox is [x, y, w, h]. Why x - w*(padding-1)/2?
What does it mean? Can anyone explain this to me?
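For what it's worth, the formula is x - w*(padding-1)/2, not (x - w)*0.1: with a padding ratio of, say, 1.2 (an assumed value; check the options files), each side of the box gains w*(padding-1)/2 = 0.1*w, so the crop width becomes padding*w. A small sketch:

```python
import numpy as np

padding = 1.2  # assumed value of opts['padding']; check the options files
neg_examples = np.array([[100., 50., 40., 20.]])  # [x, y, w, h], top-left corner

padded_x1 = (neg_examples[:, 0] - neg_examples[:, 2] * (padding - 1.) / 2.).min()
padded_x2 = (neg_examples[:, 0] + neg_examples[:, 2] * (padding + 1.) / 2.).max()

# Each side gains w*(padding-1)/2 = 4, so the crop width is padding*w = 48.
print(padded_x1, padded_x2)  # ~96.0 and ~144.0
```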

About receptive_field

rois[:, 0] *= cur_resize_ratio[0]
rois[:, 1] *= cur_resize_ratio[1]
rois[:, 2] = np.maximum(rois[:,0]+1,rois[:, 2]*cur_resize_ratio[0] - receptive_field)
rois[:, 3] = np.maximum(rois[:,1]+1,rois[:, 3]*cur_resize_ratio[1] - receptive_field)

I have a question about receptive_field: why is receptive_field subtracted from rois[:, 2]*cur_resize_ratio[0]? What is the meaning of receptive_field? Can anyone answer me?

Any plan to support PyTorch 1.0?

Great work, thanks very much!

Any plan to upgrade to PyTorch 1.0?

Any suggestions if I upgrade to PyTorch 1.0 myself? I will be happy to share the result once done.

run script not working

/neural-networks/RT-MDNet$ ./Run.py
from: can't read /var/mail/os.path
from: can't read /var/mail/tracker
./Run.py: line 13: syntax error near unexpected token `('
./Run.py: line 13: `def genConfig(seq_path, set_type):'
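The "/var/mail" messages mean the shell, not Python, executed the script: it interpreted `from ... import ...` as the Unix mail utility `from`, presumably because Run.py lacks a shebang line. A sketch of the usual fixes:

```shell
# Run the script through Python explicitly:
python Run.py

# Or give Run.py a shebang as its very first line:
#   #!/usr/bin/env python
# then mark it executable and retry:
chmod +x Run.py
./Run.py
```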

Cannot display?

When I run "python Run.py -visualize", it shows this error:

Traceback (most recent call last):
File "Run.py", line 119, in
iou_result, result_bb, fps, result_nobb = run_mdnet(img_list, gt[0], gt, seq = seq, display=opts['visualize'])
File "/home/zhulishun/tracking/RT-MDNet-master/tracker.py", line 422, in run_mdnet
im = ax.imshow(cur_image, aspect='normal')
File "/home/zhulishun/anaconda2/envs/rtmdnet/lib/python2.7/site-packages/matplotlib/__init__.py", line 1867, in inner
return func(ax, *args, **kwargs)
File "/home/zhulishun/anaconda2/envs/rtmdnet/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 5496, in imshow
self.set_aspect(aspect)
File "/home/zhulishun/anaconda2/envs/rtmdnet/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 1373, in set_aspect
aspect = float(aspect) # raise ValueError if necessary
ValueError: could not convert string to float: normal
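The `aspect='normal'` spelling was removed from matplotlib around version 2.0; editing tracker.py to pass `aspect='auto'` is the usual fix. A minimal repro of the working call:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for this demo
import matplotlib.pyplot as plt
import numpy as np

cur_image = np.zeros((10, 10, 3), dtype=np.uint8)
fig, ax = plt.subplots()
# 'normal' was dropped in matplotlib 2.0; 'auto' is the surviving spelling.
im = ax.imshow(cur_image, aspect='auto')
assert ax.get_aspect() == 'auto'
```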

about GPU memory size

Thanks for your work. My GPU has 2GB, but it still shows out of memory. I also tried to run it on the CPU by changing the GPU setting to False, but it does not work.

about the randomness

I think the randomness only comes from SampleGenerator, and only np.random is used in SampleGenerator. I have fixed the random seeds of numpy, random, and pytorch, but I cannot get the same result on the same video.
I really wonder where the randomness comes from.
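One frequent extra source of nondeterminism on GPU, beyond the Python-side seeds, is cuDNN's autotuned kernel selection. A sketch of pinning everything down, assuming PyTorch (I cannot promise this covers every nondeterministic op in this codebase):

```python
import random

import numpy as np
import torch

def seed_everything(seed=0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    # cuDNN may pick different (nondeterministic) kernels per run by default.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(0)
a = np.random.rand()
seed_everything(0)
assert a == np.random.rand()  # identical draws after re-seeding
```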

The padding operation discussion

Hi, thanks for releasing the excellent work.

However, there are several points that I cannot figure out.

First, regarding modules/pretrain_opts.py and options.py: there are parameters like padding_ratio and jitter. I also found in https://github.com/IlchaeJung/RT-MDNet/blob/master/modules/data_prov.py that an extra padding area is computed to get a larger image, jitter is used to scale this image, and finally the positive and negative regions are cropped. These operations are also applied during online tracking.

Is this just a means of augmentation? Or is it because MDNet's conv layers have no padding, so you add some extra padding to enlarge the original image? I cannot figure it out, and the paper doesn't seem to explain it. Can you please explain why these padding and jitter operations are done?

Thank you very much!

about data pre-processing

I noticed you also provide scripts to pre-process datasets like VOT/OTB, but some files seem to be missing, such as otb-vot15.txt or vot-otb.txt. I sincerely hope for your reply!

visual tracking result

This is my result:
'fps': {'Basketball': 19.858671676184922, 'Baby_ce': 22.22692606990516}}
I want to know how to put RT-MDNet into OTB for evaluation.
Does this program have a visual tracking result?

about ROI align

Thank you for releasing this excellent work.
I can successfully run this code on a 1080 TI GPU, while when I run the code using my K80 GPU, the error "cudaCheckError() failed : invalid device function" occurs.
I think the error may be caused by the ROI align module. I wonder how to recompile the ROI align module so that I can run it on different GPU devices? Thank you.
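"invalid device function" typically means the compiled ROI-Align kernels target a different compute capability than the running GPU (a K80 is sm_37, a 1080 Ti is sm_61). A hedged sketch of a rebuild; the build-script name and layout are assumptions, so check the repo's actual files:

```shell
# Assumed layout: the CUDA ROI-Align sources live under modules/roi_align
# and are built by a script (often make.sh -- check your checkout).
cd modules/roi_align
# Point nvcc at the K80's compute capability, e.g. -arch=sm_37, or add a
# -gencode line per target GPU, inside the build script. Then rebuild:
rm -rf _ext        # discard binaries compiled for the old GPU
bash make.sh       # hypothetical script name; use the repo's actual build step
```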

License

Hi,
Which license does this repository use?

no use of 'frame_interval'

Thank you for your work!
In train_mrcnn.py, what is the meaning of the variable frame_interval? It seems unused.

The results of OTB2015

Thank you for releasing this excellent work.
I tested your model on OTB2015 and the result was 0.632, which is lower than the 0.650 in your paper. I used the default parameter settings of your code. Could you please tell me how to resolve this discrepancy? Thank you very much.

The results of OTB2015 and training time

Thank you for releasing this excellent work.
I have two questions to ask you.
1. I tested your model on OTB2015 and the result was 0.632, which is lower than the 0.650 in your paper. I used the default parameter settings of your code. Could you please tell me how to resolve this discrepancy?
2. I train ImageNet-Vid dataset on a GeForce RTX 2080, how long will it take?
Thank you very much.

Error during code execution

1. FileNotFoundError: [Errno 2] No such file or directory: './models/rt-mdnet.pth'
Hello, I did not find the rt-mdnet.pth file in the code. How should I solve this problem?
2. I want to run this code on the CPU; can it be executed on the CPU? Because of the 'ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory' shown in my attached screenshot, I commented out those import statements. Does that affect running this code on the CPU?
Thank you very much.

How to start?

1. Where should I put the pretrained model? Your instructions are too brief to understand.

Put it in './modules'.

the GPU memory

If the target is small, the cropped image can become very large (e.g. 6000×7000). My GPU's 12GB of memory is not enough. Is there any way to fix this? Maybe there is a memory leak when opts['jitter'] is True.

About feature size

I found that the size of the conv3 output feature map is not strictly one eighth of the input image. Will this affect RoIAlign when extracting sample features?
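That is expected when the conv layers use no padding: the cumulative stride is 8, but each layer also trims its borders, so the output is smaller than input/8. A sketch of the arithmetic, with kernel/stride/dilation values loosely modeled on a VGG-M-style backbone (assumptions; verify against the repo's model definition):

```python
# Assumed layer hyper-parameters, loosely VGG-M-like; the real values are in
# the repo's model definition and may differ.
def conv_out(n, k, s, d=1):
    # Output length of an unpadded conv/pool layer (kernel k, stride s, dilation d).
    return (n - d * (k - 1) - 1) // s + 1

def conv3_size(n):
    n = conv_out(n, 7, 2)        # conv1: 7x7, stride 2
    n = conv_out(n, 3, 2)        # max pool: 3x3, stride 2
    n = conv_out(n, 5, 2)        # conv2: 5x5, stride 2
    n = conv_out(n, 3, 1, d=3)   # conv3: 3x3, stride 1, dilation 3
    return n

for size in (107, 224, 512):
    # Cumulative stride is 8, yet border trimming makes the output < size // 8.
    print(size, conv3_size(size), size // 8)
```

This is why ROI coordinates need the kind of offset correction discussed in the samples2maskroi issue above, rather than a pure divide-by-8.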

ImportError: ./modules/roi_align/_ext/roi_align/_roi_align.so: undefined symbol: state

I get the following error:

Traceback (most recent call last):
  File "Run.py", line 3, in <module>
    from tracker import *
  File "/my_dummy_path/RT-MDNet/tracker.py", line 15, in <module>
    from data_prov import *
  File "./modules/data_prov.py", line 18, in <module>
    from img_cropper import *
  File "./modules/img_cropper.py", line 3, in <module>
    from roi_align.modules.roi_align import RoIAlign
  File "./modules/roi_align/modules/roi_align.py", line 3, in <module>
    from ..functions.roi_align import RoIAlignFunction, RoIAlignAdaFunction, RoIAlignDenseAdaFunction
  File "./modules/roi_align/functions/roi_align.py", line 3, in <module>
    from .._ext import roi_align
  File "./modules/roi_align/_ext/roi_align/__init__.py", line 3, in <module>
    from ._roi_align import lib as _lib, ffi as _ffi
ImportError: ./modules/roi_align/_ext/roi_align/_roi_align.so: undefined symbol: state

How to solve it?
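"undefined symbol: state" usually indicates the cffi extension was compiled against a different PyTorch than the one importing it (the old THC "state" symbols changed between releases). A hedged rebuild sketch; the build-script name is an assumption:

```shell
# Rebuild the extension inside the environment that will import it, so the
# binary matches the installed PyTorch's ABI.
python -c "import torch; print(torch.__version__)"   # confirm the active PyTorch
cd modules/roi_align
rm -rf _ext        # drop the stale _roi_align.so
bash make.sh       # hypothetical build-script name; use the repo's own step
```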

about extra_pos_feats and pos_feats

I don't understand why the training features are extracted twice in the first frame for fine-tuning. extra_pos_feats and pos_feats are obtained in almost the same way.

About the Training loss

ImageNet-Vid is too big, so I decided to use the VOT dataset to train the model.
Would you mind telling me the specific values of 'Mean precision & Inter loss' during your training process (using ImageNet-Vid)?

only give first frame groundtruth

How do I use this on a video recorded by myself? I only want to give the ground truth of the first frame; how can I do that? Even a tip would be appreciated. Thank you very much.
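A sketch of one way to do this, inferred from how Run.py invokes run_mdnet(img_list, gt[0], gt, ...) elsewhere on this page: if I read that call correctly, later gt rows are only used for IoU reporting, so repeating the first-frame box should be enough. All paths and numbers here are hypothetical.

```python
import os
import numpy as np

seq_dir = 'my_video/imgs'  # hypothetical folder of frames extracted from your video
if os.path.isdir(seq_dir):
    img_list = sorted(os.path.join(seq_dir, f) for f in os.listdir(seq_dir)
                      if f.lower().endswith(('.jpg', '.png')))
else:
    img_list = ['frame%04d.jpg' % i for i in range(5)]  # placeholder for this demo

init_bbox = np.array([100., 80., 50., 60.])   # [x, y, w, h] of your first-frame annotation
gt = np.tile(init_bbox, (len(img_list), 1))   # dummy gt: only gt[0] drives initialization

# from tracker import run_mdnet
# iou_result, result_bb, fps, result_nobb = run_mdnet(
#     img_list, gt[0], gt, seq='my_video', display=False)
print(len(img_list), gt.shape)
```

The reported IoU will be meaningless with a dummy gt, but result_bb still holds the tracked boxes.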

the results of OTB2015,fix random seeds

I didn't change any parameters. Because of the randomness of the tracking results, I fixed the random seeds, but the final result is still lower than reported in the paper. What could be the problem?
My run gives PR = 0.847 and SR = 0.632.

Time cost for offline pretraining?

Thanks for your wonderful work!

I notice that in pretrain_opts, n_cycles is 1000.
And in train_mrcnn.py, K is the number of VID videos, which is 3499 (some are filtered out by preprocessing).

That is a really large amount of input for MDNet. I would like to ask for two values:

  • The offline training time for each iteration
  • The total time for the whole offline training

Thanks for your attention.

bb_result and bb_result_nobb

I converted 'result.npy' to 'result.mat' and found that the .mat file has 'bb_result' and 'bb_result_nobb'. What do these results mean? Can you help me learn more about this? I am really interested in it.

random results on OTB

I tried several OTB sequences, but the results of each run are different.

0 Bird1 : 0.201098956064 , total mIoU:0.201098956064, fps:27.1192838431
1 Box : 0.302790951364 , total mIoU:0.251944953714, fps:34.7540858495
2 Couple : 0.662352476224 , total mIoU:0.388747461218, fps:37.7562471058
3 Freeman4 : 0.666170731107 , total mIoU:0.45810327869, fps:39.7685989323
4 BlurBody : 0.722626621854 , total mIoU:0.511007947323, fps:39.1456625614
5 Jumping : 0.678226531945 , total mIoU:0.538877711426, fps:41.4720706029

0 Bird1 : 0.111411426274 , total mIoU:0.111411426274, fps:15.230079837
1 Box : 0.699401980322 , total mIoU:0.405406703298, fps:27.8460658587
2 Couple : 0.682630547749 , total mIoU:0.497814651448, fps:33.7325697033
3 Freeman4 : 0.611166113115 , total mIoU:0.526152516865, fps:36.3247190875
4 BlurBody : 0.718198220241 , total mIoU:0.56456165754, fps:36.3292326732
5 Jumping : 0.676521638398 , total mIoU:0.58322165435, fps:38.9468761315
