
visdrone-dataset's Introduction

VisDrone-Dataset

VisDrone

Drones, or general UAVs, equipped with cameras have been rapidly deployed in a wide range of applications, including agriculture, aerial photography, fast delivery, and surveillance. Consequently, automatic understanding of the visual data collected from these platforms is in high demand, bringing computer vision and drones ever closer together. We are excited to present VisDrone, a large-scale benchmark with carefully annotated ground truth for several important computer vision tasks. The VisDrone2019 dataset was collected by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. The benchmark consists of 288 video clips (261,908 frames) and 10,209 static images, captured by various drone-mounted cameras and covering a wide range of aspects: location (14 different cities separated by thousands of kilometers in China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). Note that the dataset was collected using various drone platforms (i.e., drones of different models), in different scenarios, and under various weather and lighting conditions. The frames are manually annotated with more than 2.6 million bounding boxes of frequently occurring targets such as pedestrians, cars, bicycles, and tricycles. Important attributes, including scene visibility, object class, and occlusion, are also provided for better data utilization.

The challenge mainly focuses on five tasks:

(1) Task 1: object detection in images challenge. The task aims to detect objects of predefined categories (e.g., cars and pedestrians) from individual images taken from drones.

(2) Task 2: object detection in videos challenge. The task is similar to Task 1, except that objects are required to be detected from videos.

(3) Task 3: single-object tracking challenge. The task aims to estimate the state of a target, indicated in the first frame, in the subsequent video frames.

(4) Task 4: multi-object tracking challenge. The task aims to recover the trajectories of objects in each video frame.

(5) Task 5: crowd counting challenge. The task aims to count persons in each video frame.

Download

Note that the bounding box annotations of test-dev are available, so researchers can use test-dev to publish papers. The testset-challenge split is used for the VisDrone2020 Challenge, and its annotations are unavailable.

Task 1: Object Detection in Images

VisDrone-DET dataset

VisDrone-DET toolkit:

Task 2: Object Detection in Videos

VisDrone-VID dataset

VisDrone-VID toolkit:

Task 3: Single-Object Tracking

VisDrone-SOT dataset

VisDrone-SOT toolkit:

Task 4: Multi-Object Tracking

VisDrone-MOT dataset

VisDrone-MOT toolkit:

Task 5: Crowd Counting

ECCV2020 Challenge DroneCrowd (1.03 GB): BaiduYun(code: h0j8)| GoogleDrive

Citation

@article{zhu2021detection,
  title={Detection and tracking meet drones challenge},
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={11},
  pages={7380--7399},
  year={2021},
  publisher={IEEE}
}


visdrone-dataset's Issues

VisDrone Video data annotation format

I was trying to plot the ground-truth labels on the VisDrone video dataset. The annotation format looks like this:

VisDrone video detection test-dev set:
ann = 98,0,808,1,47,22,1,4,0,0

I am aware of the DET format, and in this VID format ann[2] to ann[5] is the bbox and ann[6] is the category.

  1. Is ann[7] the score?
  2. Is ann[8] the truncation?
  3. Is ann[9] the occlusion?

I assumed ann[0] was an index and ann[1] the frame number, but when I plotted the ground truth on the image I found ann[0] to be the frame, and now I am not sure what ann[1] is.

Could you please clarify what annotations are? Thanks.
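For what it's worth, the VisDrone VID/MOT toolkit documents ten comma-separated fields per line, with the frame index first and the target ID second: `<frame_index>,<target_id>,<bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>`. A minimal parsing sketch (the helper and type names are mine):

```python
from typing import NamedTuple

class VidAnnotation(NamedTuple):
    frame_index: int      # frame number within the sequence
    target_id: int        # identity of the annotated object
    bbox_left: int
    bbox_top: int
    bbox_width: int
    bbox_height: int
    score: int            # ground truth: 1 = used in evaluation, 0 = ignored
    object_category: int  # 0 = ignored regions, 1..10 = classes, 11 = others
    truncation: int       # 0 = none, 1 = partial (1%-50%)
    occlusion: int        # 0 = none, 1 = partial, 2 = heavy

def parse_vid_line(line: str) -> VidAnnotation:
    """Parse one line of a VisDrone-VID/MOT ground-truth file."""
    return VidAnnotation(*(int(v) for v in line.strip().split(",")))

ann = parse_vid_line("98,0,808,1,47,22,1,4,0,0")
# ann.frame_index == 98, ann.target_id == 0, ann.object_category == 4 (car)
```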

What are the differences between 2018 dataset and 2019 dataset

Dear VisDrone,
I'm working on VisDrone Task 1, and I would really like to know the differences between the 2018 and 2019 DET datasets.
Are they the same as each other?
If not, in which aspects do they differ? Do they overlap with each other?
Thanks a lot :)

Question about capture information for the dataset

Hello! Does the dataset include the intrinsic parameters and shooting angles of the cameras used for data collection? I would like to use detection results on the dataset for a simple position estimation, so I may need these parameters. Would it be convenient for you to provide them? Thank you!

corresponding category label

Does the VisDrone-Dataset correspond to the classes ['pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor']?
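For reference, the format description quoted in a later issue on this page lists twelve category IDs in the ground-truth files: the ten classes above map to IDs 1-10, with 0 reserved for ignored regions and 11 for "others". A small lookup sketch (the constant names are mine, not part of the official toolkit):

```python
# Category IDs as used in VisDrone annotation files.
# IDs 0 and 11 are typically excluded from training/evaluation.
VISDRONE_CATEGORIES = {
    0: "ignored-regions",
    1: "pedestrian",
    2: "people",
    3: "bicycle",
    4: "car",
    5: "van",
    6: "truck",
    7: "tricycle",
    8: "awning-tricycle",
    9: "bus",
    10: "motor",
    11: "others",
}

# The ten-class list quoted in the question corresponds to IDs 1..10.
TRAINABLE_CLASSES = [VISDRONE_CATEGORIES[i] for i in range(1, 11)]
```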

[Extremely high precision baseline] VisDrone DET baseline PP-YOLOE has been released, together with a converted COCO-format dataset. Welcome to use it!

The PaddleDetection team provides an extremely high-precision and high-speed VisDrone DET baseline, PP-YOLOE, along with a download link for the dataset converted to COCO format. Welcome to use it!


ModelZoo:

| model | COCOAPI mAP (val, 0.5:0.95) | COCOAPI mAP (val, 0.5) | COCOAPI mAP (test_dev, 0.5:0.95) | COCOAPI mAP (test_dev, 0.5) | MatlabAPI mAP (test_dev, 0.5:0.95) | MatlabAPI mAP (test_dev, 0.5) | download | config |
|---|---|---|---|---|---|---|---|---|
| PP-YOLOE-Alpha-largesize-l | 41.9 | 65.0 | 32.3 | 53.0 | 37.13 | 61.15 | download link | config file |
| PP-YOLOE-P2-Alpha-largesize-l | 41.3 | 64.5 | 32.4 | 53.1 | 37.49 | 51.54 | download link | config file |
| PP-YOLOE-plus-largesize-l | 43.3 | 66.7 | 33.5 | 54.7 | 38.24 | 62.76 | download link | config file |

Frame rate

Hi,

I just want to confirm: what is the frame rate of the frames in the MOT challenge? I believe it's not 30, maybe 15 or even less?

Thanks.

VisDrone 18 and 19

Hi, do the VisDrone 2018 and VisDrone 2019 datasets contain the same images and annotations?

How to interpret annotation file values in the Object Detection in Videos task?

I downloaded the Task 2 dataset and unzipped it; the annotation files have the format below:

1,0,593,43,174,190,0,0,0,0
2,0,592,43,174,189,0,0,0,0
3,0,592,43,174,189,0,0,0,0
4,0,592,43,174,189,0,0,0,0
5,0,592,43,174,189,0,0,0,0
...

I found the following description,

 <bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>


    Name                                                  Description
-------------------------------------------------------------------------------------------------------------------------------     
 <bbox_left>	     The x coordinate of the top-left corner of the predicted bounding box

 <bbox_top>	     The y coordinate of the top-left corner of the predicted object bounding box

 <bbox_width>	     The width in pixels of the predicted object bounding box

<bbox_height>	     The height in pixels of the predicted object bounding box

   <score>	     The score in the DETECTION file indicates the confidence of the predicted bounding box enclosing 
                     an object instance.
                     The score in GROUNDTRUTH file is set to 1 or 0. 1 indicates the bounding box is considered in evaluation, 
                     while 0 indicates the bounding box will be ignored.
                      
<object_category>    The object category indicates the type of annotated object, (i.e., ignored regions(0), pedestrian(1), 
                     people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10), 
                     others(11))
                      
<truncation>	     The score in the DETECTION result file should be set to the constant -1.
                     The score in the GROUNDTRUTH file indicates the degree to which object parts appear outside a frame 
                     (i.e., no truncation = 0 (truncation ratio 0%), and partial truncation = 1 (truncation ratio 1% ~ 50%)).
                      
<occlusion>	     The score in the DETECTION file should be set to the constant -1.
                     The score in the GROUNDTRUTH file indicates the fraction of objects being occluded (i.e., no occlusion = 0 
                     (occlusion ratio 0%), partial occlusion = 1 (occlusion ratio 1% ~ 50%), and heavy occlusion = 2 
                     (occlusion ratio 50% ~ 100%)).

But this description seems quite different from the video annotation format.
How should I interpret it? Thank you.
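One way to reconcile the two: the eight-field description quoted above is the DET (image) format, while VID/MOT ground-truth lines carry ten fields, with `<frame_index>` and `<target_id>` prepended. A sketch that dispatches on field count (the helper name and dispatch logic are mine, not part of the official toolkit):

```python
DET_FIELDS = ("bbox_left", "bbox_top", "bbox_width", "bbox_height",
              "score", "object_category", "truncation", "occlusion")
VID_FIELDS = ("frame_index", "target_id") + DET_FIELDS

def parse_annotation(line: str) -> dict:
    """Map a DET (8-field) or VID/MOT (10-field) line to field names."""
    values = [int(v) for v in line.strip().split(",") if v.strip()]
    if len(values) == len(DET_FIELDS):
        return dict(zip(DET_FIELDS, values))
    if len(values) == len(VID_FIELDS):
        return dict(zip(VID_FIELDS, values))
    raise ValueError(f"unexpected field count: {len(values)}")

ann = parse_annotation("1,0,593,43,174,190,0,0,0,0")
# ann["frame_index"] == 1, ann["bbox_left"] == 593
```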

Raw results of the VisDrone SOT test_dev?

I found a results figure in the paper "Vision Meets Drones: Past, Present and Future", but no raw results. Where can I find the raw results of the VisDrone SOT test_dev?

A suggestion for upcoming 2021 challenge

Hi, I'm not sure if this project is still being maintained. When I was studying vehicle detection and tracking with VisDrone, there was little published research I could compare against. I think the main reason is that there are no public annotations for the test-challenge subset, and most teams didn't report their local evaluation results on test-dev.

So, like COCO, maybe it would be preferable for the organizers to recommend that participants report results on both test-dev and test-challenge (of course, only the latter is taken into consideration for the competition).

How to visualise VisDrone target IDs and bounding boxes?

I noticed that the annotation files look like this:

1,0,19,783,60,91,1,1,0,0
2,0,16,782,60,91,1,1,0,0
3,0,13,781,60,91,1,1,0,0
4,0,11,780,60,91,1,1,0,0

How should I understand the annotation?
Is it something like this?
<Target ID>, <frame number>, <bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>

How may I display the bounding boxes? I do not need the class; I just need the ID and the bounding box.
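In practice most people draw these with OpenCV (`cv2.rectangle` / `cv2.putText`). To keep this sketch dependency-light it only marks 1-pixel box outlines on a NumPy array; it assumes the ten-field MOT layout with the frame index first and the target ID second, and the helper names are mine:

```python
import numpy as np

def draw_box_outline(img: np.ndarray, left: int, top: int,
                     w: int, h: int, value: int = 255) -> None:
    """Mark a 1-pixel rectangle outline in-place on a 2-D image array."""
    right, bottom = left + w - 1, top + h - 1
    img[top, left:right + 1] = value
    img[bottom, left:right + 1] = value
    img[top:bottom + 1, left] = value
    img[top:bottom + 1, right] = value

def boxes_for_frame(lines, frame):
    """Yield (target_id, left, top, w, h) for one frame of a MOT file."""
    for line in lines:
        v = [int(x) for x in line.split(",")]
        if v[0] == frame:
            yield v[1], v[2], v[3], v[4], v[5]

ann = ["1,0,19,783,60,91,1,1,0,0", "2,0,16,782,60,91,1,1,0,0"]
img = np.zeros((900, 200), dtype=np.uint8)
for tid, left, top, w, h in boxes_for_frame(ann, frame=1):
    draw_box_outline(img, left, top, w, h)
    # With OpenCV you would label the box with its ID here, e.g.:
    # cv2.putText(img, str(tid), (left, top - 4), ...)
```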


Visdrone 2019 video raw source

Dear authors,
Thank you for sharing a good dataset for object detection as well as tracking.
I am trying to find the VisDrone 2019 videos (NOT sequential frames) for demo purposes.
Could you provide the VisDrone 2019 raw videos?

Thank you

VisDrone2021 dataset and the VisDrone2019 dataset

Dear authors, greetings. I would like to ask whether there are any differences between using the VisDrone2021 dataset and the VisDrone2019 dataset for the object detection task. I look forward to your response.

submission problem

Hello, your evaluation system may have a bug.
When I submitted my result to the evaluation system, my submission remained in "updated" status for over an hour, as in the attached picture. I don't know the reason for this; maybe you can solve my problem.

category

There should be 10 categories, yet values 0-11 all occur, meaning there could be 12 categories. What is the category-to-ID mapping?

VisDrone CC 2020 test set has no labels

Since the test set has no labels, how should I evaluate the MAE of my density counting model? Or how was model MAE evaluated during the VisDrone challenge?

Questions about mot bbox size

I found that the size of the bboxes given in the MOT dataset is much larger than the size of the actual objects. Is this normal? The bbox looks like this:


Thank you very much if you can make some suggestions!

Custom data annotation?

I want to annotate my custom data in a format similar to VisDrone's.
Are there any open-source/free annotation tools?

Thanks

height/altitude information

Hi,

Could you please share the specific value or range of the height/altitude at which the static images and videos were captured and recorded, respectively, for object detection? I would be grateful for your positive response.

Thanks,

Toolkit in Python

Appreciating your work. However, will there be a toolkit in Python (most trackers nowadays are written in Python, I suppose)?
Thanks in advance.
