
visdrone-dataset's Introduction

VisDrone-Dataset

VisDrone

Drones, or general UAVs, equipped with cameras have been rapidly deployed in a wide range of applications, including agriculture, aerial photography, fast delivery, and surveillance. Consequently, automatic understanding of the visual data collected from these platforms is in high demand, bringing computer vision and drones ever closer together. We are excited to present VisDrone, a large-scale benchmark with carefully annotated ground truth for several important computer vision tasks. The VisDrone2019 dataset was collected by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. The benchmark consists of 288 video clips (261,908 frames) and 10,209 static images, captured by various drone-mounted cameras and covering a wide range of aspects: location (14 different cities separated by thousands of kilometers in China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). Note that the dataset was collected using various drone platforms (i.e., drones of different models), in different scenarios, and under various weather and lighting conditions. The frames are manually annotated with more than 2.6 million bounding boxes of frequently occurring targets such as pedestrians, cars, bicycles, and tricycles. Important attributes, including scene visibility, object class, and occlusion, are also provided for better data utilization.

The challenge mainly focuses on five tasks:

(1) Task 1: object detection in images challenge. The task aims to detect objects of predefined categories (e.g., cars and pedestrians) from individual images taken from drones.

(2) Task 2: object detection in videos challenge. The task is similar to Task 1, except that objects are required to be detected from videos.

(3) Task 3: single-object tracking challenge. The task aims to estimate the state of a target, indicated in the first frame, in the subsequent video frames.

(4) Task 4: multi-object tracking challenge. The task aims to recover the trajectories of objects in each video frame.

(5) Task 5: crowd counting challenge. The task aims to count persons in each video frame.

Download

Note that the bounding box annotations of test-dev are available, so researchers can use test-dev to publish papers. The testset-challenge split is used for the VisDrone2020 Challenge, and its annotations are unavailable.

Task 1: Object Detection in Images

VisDrone-DET dataset

VisDrone-DET toolkit:

Task 2: Object Detection in Videos

VisDrone-VID dataset

VisDrone-VID toolkit:

Task 3: Single-Object Tracking

VisDrone-SOT dataset

VisDrone-SOT toolkit:

Task 4: Multi-Object Tracking

VisDrone-MOT dataset

VisDrone-MOT toolkit:

Task 5: Crowd Counting

ECCV2020 Challenge DroneCrowd (1.03 GB): BaiduYun(code: h0j8)| GoogleDrive

Citation

@article{zhu2021detection,
  title={Detection and tracking meet drones challenge},
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={11},
  pages={7380--7399},
  year={2021},
  publisher={IEEE}
}


visdrone-dataset's Issues

VisDrone Video data annotation format

I was trying to plot the ground-truth labels on the VisDrone video dataset. The annotation format looks like this:

VisDrone video detection test-dev set:
ann = 98,0,808,1,47,22,1,4,0,0

I am aware of the DET format, and in this VID format ann[2] to ann[5] is the bbox and ann[6] is the category.

  1. Is ann[7] the score?
  2. Is ann[8] the truncation?
  3. Is ann[9] the occlusion?

I assumed ann[0] was an index and ann[1] the frame number, but when I plotted the ground truth on the image I found ann[0] to be the frame, and now I am not sure what ann[1] is.

Could you please clarify what annotations are? Thanks.
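For what it's worth, the VisDrone VID/MOT toolkit documents ten comma-separated fields per line, with the frame index first and the target ID second: `<frame_index>,<target_id>,<bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>`. A minimal parsing sketch (the helper and type names are mine):

```python
from typing import NamedTuple

class VidAnnotation(NamedTuple):
    frame_index: int      # frame number within the sequence
    target_id: int        # identity of the annotated object
    bbox_left: int
    bbox_top: int
    bbox_width: int
    bbox_height: int
    score: int            # ground truth: 1 = used in evaluation, 0 = ignored
    object_category: int  # 0 = ignored regions, 1..10 = classes, 11 = others
    truncation: int       # 0 = none, 1 = partial (1%-50%)
    occlusion: int        # 0 = none, 1 = partial, 2 = heavy

def parse_vid_line(line: str) -> VidAnnotation:
    """Parse one line of a VisDrone-VID/MOT ground-truth file."""
    return VidAnnotation(*(int(v) for v in line.strip().split(",")))

ann = parse_vid_line("98,0,808,1,47,22,1,4,0,0")
# ann.frame_index == 98, ann.target_id == 0, ann.object_category == 4 (car)
```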

What are the differences between 2018 dataset and 2019 dataset

Dear VisDrone,
I'm working on VisDrone Task 1, and I would really like to know the differences between the 2018 and 2019 DET datasets.
Are they the same as each other?
If not, in which aspects do they differ? Do they overlap with each other?
Thanks a lot :)

Question about capture information for the dataset

Hello! Does the dataset include the intrinsic parameters and shooting angles of the cameras used for data collection? I would like to use detection results on the dataset for a simple position estimation, so I may need these parameters. Would it be convenient for you to provide them? Thank you!

corresponding category label

Does the VisDrone-Dataset correspond to the classes ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']?
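For reference, the format description quoted in a later issue on this page lists twelve category IDs in the ground-truth files: the ten classes above map to IDs 1-10, with 0 reserved for ignored regions and 11 for "others". A small lookup sketch (the constant names are mine, not part of the official toolkit):

```python
# Category IDs as used in VisDrone annotation files.
# IDs 0 and 11 are typically excluded from training/evaluation.
VISDRONE_CATEGORIES = {
    0: "ignored-regions",
    1: "pedestrian",
    2: "people",
    3: "bicycle",
    4: "car",
    5: "van",
    6: "truck",
    7: "tricycle",
    8: "awning-tricycle",
    9: "bus",
    10: "motor",
    11: "others",
}

# The ten-class list quoted in the question corresponds to IDs 1..10.
TRAINABLE_CLASSES = [VISDRONE_CATEGORIES[i] for i in range(1, 11)]
```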

[Extremely high precision baseline] VisDrone DET baseline PP-YOLOE has been released, together with a converted COCO-format dataset. Welcome to use it!

The PaddleDetection team provides an extremely high-precision and high-speed VisDrone DET baseline, PP-YOLOE, along with a download link for the dataset converted to COCO format. Welcome to use it!


ModelZoo:

| model | COCOAPI mAP (val, 0.5:0.95) | COCOAPI mAP (val, 0.5) | COCOAPI mAP (test_dev, 0.5:0.95) | COCOAPI mAP (test_dev, 0.5) | MatlabAPI mAP (test_dev, 0.5:0.95) | MatlabAPI mAP (test_dev, 0.5) | download | config |
|---|---|---|---|---|---|---|---|---|
| PP-YOLOE-Alpha-largesize-l | 41.9 | 65.0 | 32.3 | 53.0 | 37.13 | 61.15 | download link | config file |
| PP-YOLOE-P2-Alpha-largesize-l | 41.3 | 64.5 | 32.4 | 53.1 | 37.49 | 51.54 | download link | config file |
| PP-YOLOE-plus-largesize-l | 43.3 | 66.7 | 33.5 | 54.7 | 38.24 | 62.76 | download link | config file |

Frame rate

Hi,

I just want to confirm: what is the frame rate of the frames in the MOT challenge? I believe it's not 30, maybe 15 or even less?

Thanks.

VisDrone 18 and 19

Hi, do the VisDrone 2018 and VisDrone 2019 datasets contain the same images and annotations?

How to interpret annotation file values in the Object Detection in Videos task?

I downloaded the Task 2 dataset and unzipped it; the annotation files have the format below:

1,0,593,43,174,190,0,0,0,0
2,0,592,43,174,189,0,0,0,0
3,0,592,43,174,189,0,0,0,0
4,0,592,43,174,189,0,0,0,0
5,0,592,43,174,189,0,0,0,0
...

I found the following description,

 <bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>


    Name                                                  Description
-------------------------------------------------------------------------------------------------------------------------------     
 <bbox_left>	     The x coordinate of the top-left corner of the predicted bounding box

 <bbox_top>	     The y coordinate of the top-left corner of the predicted object bounding box

 <bbox_width>	     The width in pixels of the predicted object bounding box

<bbox_height>	     The height in pixels of the predicted object bounding box

   <score>	     The score in the DETECTION file indicates the confidence of the predicted bounding box enclosing 
                     an object instance.
                     The score in GROUNDTRUTH file is set to 1 or 0. 1 indicates the bounding box is considered in evaluation, 
                     while 0 indicates the bounding box will be ignored.
                      
<object_category>    The object category indicates the type of annotated object, (i.e., ignored regions(0), pedestrian(1), 
                     people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10), 
                     others(11))
                      
<truncation>	     The score in the DETECTION result file should be set to the constant -1.
                     The score in the GROUNDTRUTH file indicates the degree to which object parts appear outside a frame 
                     (i.e., no truncation = 0 (truncation ratio 0%), and partial truncation = 1 (truncation ratio 1% ~ 50%)).
                      
<occlusion>	     The score in the DETECTION file should be set to the constant -1.
                     The score in the GROUNDTRUTH file indicates the fraction of objects being occluded (i.e., no occlusion = 0 
                     (occlusion ratio 0%), partial occlusion = 1 (occlusion ratio 1% ~ 50%), and heavy occlusion = 2 
                     (occlusion ratio 50% ~ 100%)).

But this description seems quite different from the video annotation format.
How should I interpret it? Thank you.
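One way to reconcile the two: the eight-field description quoted above is the DET (image) format, while VID/MOT ground-truth lines carry ten fields, with `<frame_index>` and `<target_id>` prepended. A sketch that dispatches on field count (the helper name and dispatch logic are mine, not part of the official toolkit):

```python
DET_FIELDS = ("bbox_left", "bbox_top", "bbox_width", "bbox_height",
              "score", "object_category", "truncation", "occlusion")
VID_FIELDS = ("frame_index", "target_id") + DET_FIELDS

def parse_annotation(line: str) -> dict:
    """Map a DET (8-field) or VID/MOT (10-field) line to field names."""
    values = [int(v) for v in line.strip().split(",") if v.strip()]
    if len(values) == len(DET_FIELDS):
        return dict(zip(DET_FIELDS, values))
    if len(values) == len(VID_FIELDS):
        return dict(zip(VID_FIELDS, values))
    raise ValueError(f"unexpected field count: {len(values)}")

ann = parse_annotation("1,0,593,43,174,190,0,0,0,0")
# ann["frame_index"] == 1, ann["bbox_left"] == 593
```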

Raw results of the VisDrone SOT test_dev?

I found a results figure in the paper "Vision Meets Drones: Past, Present and Future", but no raw results. Where can I find the raw results of the VisDrone SOT test_dev?

A suggestion for upcoming 2021 challenge

Hi, I'm not sure if this project is still being maintained. When I was studying vehicle detection and tracking with VisDrone, there was little published research I could compare against. I think the main reason is that there are no public annotations for the test-challenge subset, and most teams didn't report their local evaluation results on test-dev.

So, like COCO, maybe it would be preferable for the organizers to recommend that participants report results on both test-dev and test-challenge (of course, only the latter is taken into consideration for the competition).

How to visualise VisDrone target IDs and bounding boxes?

I noticed that the annotation files look like this:

1,0,19,783,60,91,1,1,0,0
2,0,16,782,60,91,1,1,0,0
3,0,13,781,60,91,1,1,0,0
4,0,11,780,60,91,1,1,0,0

How should I understand the annotation?
Is it something like this?
<Target ID>, <frame number>, <bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>

How may I display the bounding boxes? I do not need the class; I just need the ID and the bounding box.
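In practice most people draw these with OpenCV (`cv2.rectangle` / `cv2.putText`). To keep this sketch dependency-light it only marks 1-pixel box outlines on a NumPy array; it assumes the ten-field MOT layout with the frame index first and the target ID second, and the helper names are mine:

```python
import numpy as np

def draw_box_outline(img: np.ndarray, left: int, top: int,
                     w: int, h: int, value: int = 255) -> None:
    """Mark a 1-pixel rectangle outline in-place on a 2-D image array."""
    right, bottom = left + w - 1, top + h - 1
    img[top, left:right + 1] = value
    img[bottom, left:right + 1] = value
    img[top:bottom + 1, left] = value
    img[top:bottom + 1, right] = value

def boxes_for_frame(lines, frame):
    """Yield (target_id, left, top, w, h) for one frame of a MOT file."""
    for line in lines:
        v = [int(x) for x in line.split(",")]
        if v[0] == frame:
            yield v[1], v[2], v[3], v[4], v[5]

ann = ["1,0,19,783,60,91,1,1,0,0", "2,0,16,782,60,91,1,1,0,0"]
img = np.zeros((900, 200), dtype=np.uint8)
for tid, left, top, w, h in boxes_for_frame(ann, frame=1):
    draw_box_outline(img, left, top, w, h)
    # With OpenCV you would label the box with its ID here, e.g.:
    # cv2.putText(img, str(tid), (left, top - 4), ...)
```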


Visdrone 2019 video raw source

Dear authors,
Thank you for sharing a good dataset for object detection as well as tracking.
I am trying to find the VisDrone 2019 videos (NOT sequential frames) for demo purposes.
Could you provide the VisDrone 2019 raw videos?

Thank you

VisDrone2021 dataset and the VisDrone2019 dataset

Dear authors, greetings. I would like to ask whether there are any differences between using the VisDrone2021 dataset and the VisDrone2019 dataset for the object detection task. I look forward to your response.

submission problem

Hello, your evaluation system may have a bug.
When I submitted my result to the evaluation system, my submission remained in "updated" status for over an hour, as in the attached picture. I don't know the reason for this; maybe you can solve my problem.

category

There should be 10 categories, yet values 0-11 all occur, meaning there could be 12 categories. What is the category-to-ID mapping?

VisDrone CC 2020 test set has no labels

Since the test set has no labels, how should I evaluate the MAE of my density counting model? Or how was model MAE evaluated during the VisDrone challenge?

Questions about mot bbox size

I found that the size of the bboxes given in the MOT dataset is much larger than the size of the actual objects. Is this normal? The bbox looks like this:


Thank you very much if you can make some suggestions!

Custom data annotation?

I want to annotate my custom data in a format similar to VisDrone's.
Are there any open-source/free annotation tools?

Thanks

height/altitude information

Hi,

Could you please share the specific value or range of the height/altitude at which the static images and videos were captured and recorded, respectively, for object detection? I would be grateful for your positive response.

Thanks,

Toolkit in Python

Appreciating your work. However, will there be a toolkit in Python (most trackers nowadays are written in Python, I suppose)?
Thanks in advance.
