
gtr's Introduction

Global Tracking Transformers

Global Tracking Transformers,
Xingyi Zhou, Tianwei Yin, Vladlen Koltun, Philipp Krähenbühl,
CVPR 2022 (arXiv 2203.13250)

Features

  • Object association within a long temporal window (32 frames).

  • Classification after tracking for long-tail recognition.

  • "Detector" of global trajectories.

Installation

See installation instructions.

Demo

Run our demo in Colab (no GPU needed) via the "Open in Colab" badge.

Try the Replicate web demo via the "Replicate" badge.

We use the default detectron2 demo interface. For example, to run the TAO model on an example video (video source: TAO/YFCC100M dataset), download the model and run

python demo.py --config-file configs/GTR_TAO_DR2101.yaml --video-input docs/yfcc_v_acef1cb6d38c2beab6e69e266e234f.mp4 --output output/demo_yfcc.mp4 --opts MODEL.WEIGHTS models/GTR_TAO_DR2101.pth

If set up correctly, the output video output/demo_yfcc.mp4 should show the example video with tracked objects.
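The demo follows the standard detectron2 demo interface, so other input modes should also work. As a hedged example (the --webcam and --confidence-threshold flags are inferred from the argument namespaces printed in the issue reports further below; check demo.py before relying on them):

python demo.py --config-file configs/GTR_TAO_DR2101.yaml --webcam --confidence-threshold 0.5 --opts MODEL.WEIGHTS models/GTR_TAO_DR2101.pth

Note that the backbone uses deformable convolutions, so a CUDA-enabled detectron2 build is needed; running on CPU fails (see the related issues below).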

Benchmark evaluation and training

Please first prepare datasets, then check our MODEL ZOO to reproduce results in our paper. We highlight key results below:

  • MOT17 test set

    MOTA  IDF1  HOTA  DetA  AssA  FPS
    75.3  71.5  59.1  61.6  57.0  19.6

  • TAO test set

    Track mAP  FPS
    20.1       11.2

License

The majority of GTR is licensed under the Apache 2.0 license; however, portions of the project are available under separate license terms: trackeval, in gtr/tracking/trackeval/, is licensed under the MIT license, and FairMOT, in gtr/tracking/local_tracker, is also under the MIT license. Please see NOTICE for license details. The demo video is from the TAO dataset, which originally comes from the YFCC100M dataset. Please be aware of the original dataset license.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2022global,
  title={Global Tracking Transformers},
  author={Zhou, Xingyi and Yin, Tianwei and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={CVPR},
  year={2022}
}

gtr's People

Contributors

chenxwh, noahcao, xingyizhou


gtr's Issues

Evaluation on MOT17 dataset

Hi @xingyizhou

Thanks for sharing the code
How did you evaluate the tracker on the MOT17 dataset and obtain the results? Can you list the steps to follow for recreating them?
Thanks in advance.

model

Hello, can I train my own model to track only pedestrian trajectories? Looking forward to your reply.

Question about inference resolution

During MOT training the input resolution is set to 1280x1280 while the test size is 1560 (longer edge).
This means that the training frames have a different aspect ratio (square) and a lower resolution compared to the test ones (rectangular aspect ratio and higher resolution).
I have tried testing with videos of the same resolution and aspect ratio as training (1280x1280), but the performance was much worse.

My question is: how is it possible to obtain worse performance while matching the aspect ratio and resolution used in training? Shouldn't the network perform better in that situation? If not, what is the reason (maybe I am missing some property of the detector/transformer module)?
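(Editorial illustration of the numbers being discussed; this arithmetic is mine, not from the repository.)

# Illustrative arithmetic only: effect of the test-time resize on a 1920x1080 frame.
h, w = 1080, 1920
test_scale = 1560 / max(h, w)                        # longer edge resized to 1560
print(round(w * test_scale), round(h * test_scale))  # -> 1560 878, aspect ratio preserved
# Training instead uses fixed 1280x1280 inputs, so the scales and aspect ratios seen
# at train and test time differ.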

Typo near bottom of datasets/README.md

using
python tools/move_tao_keyframes.py --gt datasets/tao/annotations/validation.json --img_dir datasets/tao/frames --img_dir datasets/tao/keyframes
gives error:

0 36375
No exists! datasets/tao/keyframes/val/YFCC100M/v_25685519b728afd746dfd1b2fe77c/frame0751.jpg
Traceback (most recent call last):
  File "tools/move_tao_keyframes.py", line 19, in <module>
    target_path = args.out_dir + '/' + image['file_name']
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

I think you meant --out_dir datasets/tao/keyframes:
python tools/move_tao_keyframes.py --gt datasets/tao/annotations/validation.json --img_dir datasets/tao/frames --out_dir datasets/tao/keyframes

add web demo/model to Huggingface

Hi, would you be interested in adding GTR to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

and here are guides for adding spaces/models/datasets to your org

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

gtr_roi_heads.py

@xingyizhou In _get_asso_gt, when len(ind) >= 1 I get an assertion error. This may be because there are multiple targets at that time step that match the current predicted target. Is my data wrong?
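(Editorial note: for readers unfamiliar with the check being discussed, here is a hypothetical sketch of the kind of per-frame uniqueness assertion such a lookup usually encodes. It is not GTR's actual _get_asso_gt; the names and shapes are assumptions.)

import torch

def asso_gt_for_track(gt_ids_per_frame, track_id):
    # Illustrative only: per frame, find the index of the ground-truth box whose
    # identity equals track_id, assuming each identity appears at most once per frame.
    targets = []
    for gt_ids in gt_ids_per_frame:  # gt_ids: 1-D tensor of identities in one frame
        ind = (gt_ids == track_id).nonzero(as_tuple=False)
        assert len(ind) <= 1, "identity appears more than once in a single frame"
        targets.append(int(ind[0]) if len(ind) == 1 else -1)
    return targets

# Example: the assertion fires if an identity is annotated twice in the same frame.
frames = [torch.tensor([1, 2]), torch.tensor([1, 1, 3])]
print(asso_gt_for_track(frames, 2))   # [1, -1]
# asso_gt_for_track(frames, 1) would raise AssertionError on the second frame.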

Joint or separate training

Nice work! Thank you for sharing the code.

Is the training of the detector and tracker joint or separate?
It seems from the paper (Section 5.2) that the detector is trained first, then the detector is frozen and the tracker is fine-tuned after that? Is that the right understanding?

Thanks
Gurkirt

RuntimeError: Detectron2 is not compiled with GPU support!

I am trying to run the demo command and saw this error

[04/05 14:46:45 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/GTR_TAO_DR2101.yaml', cpu=False, input=None, opts=['MODEL.WEIGHTS', 'models/GTR_TAO_DR2101.pth'], output='output/demo_yfcc.mp4', video_input='docs/yfcc_v_acef1cb6d38c2beab6e69e266e234f.mp4', webcam=None)
WARNING [04/05 14:46:45 d2.config.compat]: Config 'configs/GTR_TAO_DR2101.yaml' has no VERSION. Assuming it to be compatible with latest v2.

[04/05 14:47:30 fvcore.common.checkpoint]: [Checkpointer] Loading from models/GTR_TAO_DR2101.pth ...
WARNING [04/05 14:47:37 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.0.freq_weight
roi_heads.box_predictor.1.freq_weight
roi_heads.box_predictor.2.freq_weight
WARNING [04/05 14:47:37 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model:
roi_heads.pos_emb.weight
Could not find encoder for codec id 27: Encoder not found
[ERROR:0] global /io/opencv/modules/videoio/src/cap.cpp (392) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.1.2) /io/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): /tmp/video_format_test3gfay7is/test_file.mkv in function 'icvExtractPattern'

Traceback (most recent call last):
File "demo.py", line 161, in
for vis_frame in demo.run_on_video(video):
File "/home/jupyter/GTR/gtr/predictor.py", line 146, in run_on_video
outputs = self.video_predictor(frames)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/jupyter/GTR/gtr/predictor.py", line 102, in call
predictions = self.model(inputs)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jupyter/GTR/gtr/modeling/meta_arch/gtr_rcnn.py", line 61, in forward
return self.sliding_inference(batched_inputs)
File "/home/jupyter/GTR/gtr/modeling/meta_arch/gtr_rcnn.py", line 81, in sliding_inference
instances_wo_id = self.inference(
File "/home/jupyter/GTR/gtr/modeling/meta_arch/custom_rcnn.py", line 107, in inference
features = self.backbone(images.tensor)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jupyter/detectron2/detectron2/modeling/backbone/fpn.py", line 126, in forward
bottom_up_features = self.bottom_up(x)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jupyter/GTR/third_party/CenterNet2/projects/CenterNet2/centernet/modeling/backbone/res2net.py", line 630, in forward
x = stage(x)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jupyter/GTR/third_party/CenterNet2/projects/CenterNet2/centernet/modeling/backbone/res2net.py", line 457, in forward
sp = self.convs[i](sp, offset, mask)
File "/opt/conda/envs/gtr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jupyter/detectron2/detectron2/layers/deform_conv.py", line 474, in forward
x = modulated_deform_conv(
File "/home/jupyter/detectron2/detectron2/layers/deform_conv.py", line 221, in forward
_C.modulated_deform_conv_forward(
RuntimeError: Detectron2 is not compiled with GPU support!
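Editorial note: this error generally means the detectron2 build in use was compiled without CUDA. A quick way to check (standard detectron2/PyTorch tooling, nothing GTR-specific) is:

python -m detectron2.utils.collect_env

which reports whether detectron2 was built with CUDA and which PyTorch/CUDA versions it sees; if it shows a CPU-only build, detectron2 needs to be rebuilt or reinstalled against a CUDA-enabled PyTorch.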

Error Running Demo

Hello, I'm having trouble running the inference (the "Demo" section in the README). Below is a notebook link showing the setup and error.

Here is the link to the notebook.

Let me know if anything else needs to be provided.

Much appreciated!

results_per_category only contains 296 classes during evaluation on TAO val set

Hi, sorry to bother you, but I have a small question about the evaluation on the TAO val set.

As we all know, there are 482 categories in LVISv0.5 which are included in TAO.
So when using LVISv1.0 as in GTR, is it correct that there are only 296 classes in LVISv1.0 which are included in TAO? I'm not sure whether some categories are missing.

The related code is here:

    precisions = lvis_eval.eval['precision']
    assert len(class_names) == precisions.shape[2]
    results_per_category = []
    id2apiid = sorted(lvis_gt.get_cat_ids())
    inst_aware_ap, inst_count = 0, 0
    for idx, name in enumerate(class_names):
        precision = precisions[:, :, idx, 0]
        precision = precision[precision > -1]
        ap = np.mean(precision) if precision.size else float("nan")
        inst_num = len(lvis_gt.get_ann_ids(cat_ids=[id2apiid[idx]]))
        if inst_num > 0:
            results_per_category.append(("{} {}".format(
                name, 
                inst_num if inst_num < 1000 else '{:.1f}k'.format(inst_num / 1000)), 
                float(ap * 100)))
            inst_aware_ap += inst_num * ap
            inst_count += inst_num
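(Editorial aside: one way to sanity-check how many categories actually appear in the TAO annotations is to count them directly from the COCO/LVIS-style JSON. A hedged sketch; the annotation path is only an example.)

import json
from collections import Counter

# Count categories that have at least one ground-truth annotation in the file.
with open("datasets/tao/annotations/validation.json") as f:   # example path
    gt = json.load(f)

per_cat = Counter(ann["category_id"] for ann in gt["annotations"])
print(len(gt["categories"]), "categories defined;", len(per_cat), "with annotations")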

EfficientDetResizeCrop

Hello, I want to understand some operations in 'custom_augmentation_impl.py'. Specifically, in line 56, the smallest scale factor is chosen, which does not seem to match 'Scale the shorter edge to the given size'. I'm wondering whether it's an issue with my understanding, and what exactly the impact would be.
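(Editorial sketch of the two resize conventions being contrasted; this is illustrative only, not the code in custom_augmentation_impl.py.)

def scale_for_shorter_edge(h, w, target):
    # The shorter edge becomes `target`: this is the larger of the two ratios.
    return max(target / h, target / w)

def scale_for_longer_edge(h, w, target):
    # The longer edge becomes `target` (the smaller ratio), so the whole image
    # fits inside a target x target square.
    return min(target / h, target / w)

# For a 1080x1920 frame and target 1560, picking the smallest scale factor means
# the longer edge, not the shorter edge, is scaled to 1560.
print(scale_for_shorter_edge(1080, 1920, 1560))  # ~1.444
print(scale_for_longer_edge(1080, 1920, 1560))   # 0.8125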

Experiments on MOT20?

Though it is not necessary, I was just curious whether you have tried GTR on the MOT20 set. If so, could you share the performance or any experience as well?

Question on Association Head ( FC Head)

Hello!
I was curious what the purpose/intuition behind the Association Head is.
Also, what was the reason behind choosing an FC head instead of (1x1 conv --> flatten)?
Thank you in advance.
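(Editorial sketch to make the comparison in the question concrete; the shapes and sizes here are assumptions, not GTR's actual modules.)

import torch
import torch.nn as nn

# Example RoI feature: 256 channels over a 7x7 grid.
fc_head = nn.Sequential(nn.Flatten(), nn.Linear(256 * 7 * 7, 512))           # one vector per RoI
conv_head = nn.Sequential(nn.Conv2d(256, 512, kernel_size=1), nn.Flatten())  # per-location 1x1 conv, then flatten

x = torch.randn(4, 256, 7, 7)   # 4 RoIs
print(fc_head(x).shape)         # torch.Size([4, 512])
print(conv_head(x).shape)       # torch.Size([4, 25088]) = 512 * 49 per RoI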

Can this project be trained with an RTX 3090 (24G)? I get out-of-memory errors

Hello,
Thank you for sharing the excellent project!

My lab doesn't have RTX 6000 GPUs (24G, referenced in the paper) but has some RTX 3090 GPUs (24G), so I want to train GTR with an RTX 3090 on the MOT dataset. However, I get an out-of-memory error even with batch size = 1. Could you give me any ideas?

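(Editorial note: a project-agnostic way to see where GPU memory goes is PyTorch's built-in accounting; the calls below are standard torch APIs, nothing GTR-specific. Beyond that, the usual levers are a smaller per-GPU batch, a smaller training resolution, and, as discussed in the training-memory issue further below, a shorter per-clip video length.)

import torch

# Inspect peak GPU memory after a forward/backward pass to see how far over budget the run is.
print(torch.cuda.max_memory_allocated() / 2**30, "GiB peak tensor memory")
print(torch.cuda.memory_summary())  # detailed allocator report per device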

about lvis version

Hi there! Thanks for your work.

Here I have 2 questions about the version of lvis dataset:

  1. Why did you use v1.0 instead of v0.5?
  2. Could you please show me the code which re-map the labels of v1.0 back to v0.5?

Looking forward to your reply!

can't evaluate on MOT17

Hi Xingyi,

I believe the guidelines you wrote in the doc have some issues. To be precise, when directly evaluating on MOT17 with:

python train_net.py --config-file configs/GTR_MOT_FPN.yaml --eval-only MODEL.WEIGHTS  output/GTR_MOT/GTR_MOT_FPN/model_0004999.pth

we get the following error:
gtr.tracking.trackeval.utils.TrackEvalException: GT file not found for sequence: MOT17-02-FRCNN

Besides, to evaluate on the self-split half-val, I assume we need the files "gt_val_half.txt" under the directory of each sequence?

Could you help double-check whether your guidelines work with the current version and satisfy the requirements of the TrackEval library you adopted? I think some guidelines about data splitting and preparation may be missing.

About gtr_roi_heads.py

In line 242, an exception is thrown when len(ind) >= 1, but the relevant situation is handled later.
How should I understand this code, and what is the likely reason if it keeps throwing exceptions?
Thanks!

any tips to real-world det&track scenario?

Great work! It works pretty well on TAO.
As for applying the method to a real-world multi-category scenario, which approach benefits more: training on single images with interpolation, or training on labeled video sequences? Thank you.

produce results file on TAO test set

Hi, could you please show me how to get the results on the TAO test set, which will be uploaded to the challenge server?

I tried the following command, but it didn't work:

python train_net.py --config-file configs/GTR_TAO_DR2101.yaml --eval-only MODEL.WEIGHTS models/GTR_TAO_DR2101.pth DATASETS.TEST ('tao_test',)
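Editorial note: in most shells the tuple in the trailing override has to be quoted, otherwise the parentheses are parsed by the shell itself (independent of whether tao_test is the correct registered dataset name):

python train_net.py --config-file configs/GTR_TAO_DR2101.yaml --eval-only MODEL.WEIGHTS models/GTR_TAO_DR2101.pth DATASETS.TEST "('tao_test',)"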

test_conf0.json

I can't find datasets/mot/MOT17/annotations/test_conf0.json.

how to train on one gpu

Hi, thanks for your wonderful work!
I have only one GPU; can I train the network? If it is possible, how do I train on one GPU?
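Editorial note: detectron2-style training scripts generally accept --num-gpus 1; the usual caveat (a general detectron2 convention, not verified against GTR's configs) is that the per-batch image count and the learning rate should be scaled down together, for example:

python train_net.py --num-gpus 1 --config-file configs/GTR_MOT_FPN.yaml SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025

The exact values depend on the config's defaults and on available GPU memory.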

Question: What is `data` in gtr/tracking/trackeval/metrics `eval_sequence` methods?

I don't have an issue. I'm just curious what the data variable has in https://github.com/xingyizhou/GTR/tree/master/gtr/tracking/trackeval/metrics eval_sequence methods.

From looking at clear.py, I'm guessing:

data = {
    "num_gt_ids": Total number of unique gt ids in the entire sequence,
    "num_tracker_dets": Total number of tracker detections in the entire sequence,
    "num_gt_dets": Total number of gt detections in the entire sequence,
    "gt_ids": [
        [0, 1],  # Zeroth frame.
        [0, 2, 3],
        ...
    ],
     "tracker_ids": [
        [0, 1, 2],  # Zeroth frame.
        [0, 2],
        ... 
    ],
    "similarity_scores": [  # Length is len(video) - 1
         # What is this?
    ]
}

Is similarity scores just a list of these: traj_score = torch.mm(asso_nonk, id_inds) # n_k x M?

It looks like the similarity score matrix is of shape: (number of ground truth instances at frame t, number of tracked instances at frame t). I'm still confused about what this exactly is.

Is there code provided for getting this data format (i.e. from instances returned from the sliding_inference method here)?

Thank you!

A question about the speed

Thanks for releasing this great work. May I ask for more details about the speed evaluation?

For TAO data, since you used the default detectron2, may I know whether the 11.2 FPS includes the detectron2 inference time, or only the GTR inference time? Since the TAO video sampling rate may not be 30 FPS, does this factor need to be considered when converting the inference speed?

Thanks.

Difference between GTR_MOT_FPN and GTR_MOTFull_FPN

Hi, I cannot find details, either here or in the paper, about the differences between these two models. The configurations are identical, but they differ in the training dataset (half vs. full).

In the paper you said: "We follow CenterTrack [68] and split each training sequence in half. We use the first half for training and the second half for validation". But the results in Table 3 seem to be obtained with GTR_MOTFull_FPN.

Which of the two models should be considered the "best" one? May I have more information about this?

Thank you so much in advance.

Training memory issue & missing file

Hello,
Thanks for sharing the source code of nice work!

I have tried the TAO training code (GTR_TAO_DR2101.yaml), but full training failed due to an out-of-memory error.
It seems the memory usage increases gradually during training and reaches the maximum memory limit.
As I am currently using an A6000 GPU with 48G, it should be enough based on your training spec (4x 32G V100 GPUs). Could you give me any ideas? My initial solution is to reduce the video length from 8 to 2.

Moreover, I cannot find the move_tao_keyframes.py file. Could you please provide this file?

Thanks,

OSError: [Errno 113] No route to host

Thank you for sharing your excellent work. But when I trained a model on a machine with 8 GPUs, I got OSError: [Errno 113] No route to host, as follows:
(error screenshots omitted)
I did not enable a firewall.
Now, I don't know how to solve it.

class

Thanks for your code. How can I show only the person tracks? I didn't find your tag index. Thanks.

Reproducing Transformer Fine Tuning - TAO

I'm following the instructions to reproduce the transformer head fine-tuning on TAO here: https://github.com/xingyizhou/GTR/blob/master/docs/MODEL_ZOO.md#tao
and I can't seem to get the results reported in the MODEL_ZOO or paper.

Here are the steps I'm following:

  1. Download and setup the datasets as described here: https://github.com/xingyizhou/GTR/tree/master/datasets
  2. Download the trained detection model C2_LVISCOCO_DR2101_4x.pth from the link in the third bullet point under the note section in TAO and place it in a models/ directory. The link for the config is broken in this bullet point, but I'm using the C2_LVISCOCO_DR2101_4x.yaml in the configs/ folder.
  3. Run python train_net.py --num-gpus 8 --config-file configs/C2_LVISCOCO_DR2101_4x.yaml MODEL.WEIGHTS models/C2_LVISCOCO_DR2101_4x.pth (this took about 6 days on 8 Titan X GPUs).

The reason I believe it didn't train properly is that when I run TAO validation on the output model of the training using:
python train_net.py --config-file configs/GTR_TAO_DR2101.yaml --eval-only MODEL.WEIGHTS output/GTR_TAO_first_train/C2_LVISCOCO_DR2101_4x/model_final.pth the mAP is 10.6
but when I run TAO validation on the pretrained model, GTR_TAO_DR2101.pth, downloaded from MODEL_ZOO:
python train_net.py --config-file configs/GTR_TAO_DR2101.yaml --eval-only MODEL.WEIGHTS models/GTR_TAO_DR2101.pth
the output is the correct 22.5 mAP, as reported.

Any ideas why the model training isn't working correctly? Am I using the wrong configurations or something?

Not able to run on x86 CPU

Hi @xingyizhou @noahcao
Thank you for sharing this work
When I try to run the script on my x86 machine on CPU with $ python demo.py --config-file configs/GTR_TAO_DR2101.yaml --video-input docs/yfcc_v_acef1cb6d38c2beab6e69e266e234f.mp4 --output output/demo_yfcc.mp4 --opts MODEL.WEIGHTS GTR_TAO_DR2101.pth, I get the following error:

Traceback (most recent call last):
File "/home/sravan/SAT/Tracker/GTR/demo.py", line 161, in
for vis_frame in demo.run_on_video(video):
File "/home/sravan/SAT/Tracker/GTR/gtr/predictor.py", line 147, in run_on_video
outputs = self.video_predictor(frames)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/sravan/SAT/Tracker/GTR/gtr/predictor.py", line 103, in call
predictions = self.model(inputs)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/SAT/Tracker/GTR/gtr/modeling/meta_arch/gtr_rcnn.py", line 61, in forward
return self.sliding_inference(batched_inputs)
File "/home/sravan/SAT/Tracker/GTR/gtr/modeling/meta_arch/gtr_rcnn.py", line 81, in sliding_inference
instances_wo_id = self.inference(
File "/home/sravan/SAT/Tracker/GTR/gtr/modeling/meta_arch/custom_rcnn.py", line 107, in inference
features = self.backbone(images.tensor)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/detectron2/modeling/backbone/fpn.py", line 126, in forward
bottom_up_features = self.bottom_up(x)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/SAT/Tracker/GTR/third_party/CenterNet2/centernet/modeling/backbone/res2net.py", line 630, in forward
x = stage(x)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/SAT/Tracker/GTR/third_party/CenterNet2/centernet/modeling/backbone/res2net.py", line 457, in forward
sp = self.convs[i](sp, offset, mask)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/sravan/anaconda3/lib/python3.9/site-packages/detectron2/layers/deform_conv.py", line 474, in forward
x = modulated_deform_conv(
File "/home/sravan/anaconda3/lib/python3.9/site-packages/detectron2/layers/deform_conv.py", line 211, in forward
raise NotImplementedError("Deformable Conv is not supported on CPUs!")
NotImplementedError: Deformable Conv is not supported on CPUs!

How can I solve this?

The choice of backbone on TAO

Hi, it seems that you are using Res2Net101 on TAO. I'm wondering whether it is necessary to use such a heavy backbone instead of ResNet50. Will the performance decrease rapidly when using a smaller backbone like ResNet50?

MOTfull evaluation problem

@philkr @xingyizhou @noahcao @tianweiy @chenxwh
Hi GTR team

I followed the evaluation instructions, but why does it always say gtr.tracking.trackeval.utils.TrackEvalException: Tracker file not found: pred/data/MOT17-02-FRCNN.txt

after the entire inference? I'm really confused and would warmly appreciate your response! :)

Kind regards,

Arguments: Namespace(confidence_threshold=0.5, config_file='configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml', cpu=False, input=None, opts=[], output=None, video_input=None, webcam=None)

When I run the demo, I get the following error:
Traceback (most recent call last):
File "/root/autodl-nas/GTR/demo.py", line 103, in
cfg = setup_cfg(args)
File "/root/autodl-nas/GTR/demo.py", line 33, in setup_cfg
cfg.merge_from_file(args.config_file)
File "/root/detectron2/detectron2/config/config.py", line 45, in merge_from_file
assert PathManager.isfile(cfg_filename), f"Config file '{cfg_filename}' does not exist!"
AssertionError: Config file 'configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml' does not exist!
[02/26 19:18:23 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml', cpu=False, input=None, opts=[], output=None, video_input=None, webcam=None)

Failed on the long videos.

I tried to run the predictor on a long video (about 100k frames) using:

python demo.py --config-file configs/GTR_TAO_DR2101.yaml --video-input docs/Long video.mp4 --output output/Long video.mp4 --opts MODEL.WEIGHTS models/GTR_TAO_DR2101.pth

But the process always gets "Killed". Are there any suggestions on this?
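Editorial note: "Killed" with no Python traceback usually means the Linux out-of-memory killer stopped the process, i.e. host RAM (not GPU memory) ran out. One workaround that does not depend on this codebase is to split the video into shorter chunks and run the demo on each chunk (standard ffmpeg flags; note that track identities will not persist across chunks):

ffmpeg -i "docs/Long video.mp4" -c copy -map 0 -f segment -segment_time 600 -reset_timestamps 1 output/chunk_%03d.mp4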

CenterNet vs CenterNet2

Hi, I see that in the configs some models use CenterNet while others use CenterNet2. Is there any reason for using one over the other in different models? Thank you!

Typos in training guidelines?

Hi Xingyi,

Thanks for the wonderful work. I tried to run the training on MOT17 following the guidelines, but I found some potential typos that prevent it from working out of the box.

  1. Should we rename the MOT17 train folder to trainval? This is not explained in the dataset preparation doc.
  2. Should the datasets for training be ("mot17_halftrain","crowdhuman_train") instead of ("mot17_halftrain","crowdhuman_amodal_train") in the config file? The latter raises an unregistered-dataset error (screenshot omitted).
