
keras-yolo3's Introduction

YOLO3 (Detection, Training, and Evaluation)

Dataset and Model

  • Kangaroo Detection (1 class, https://github.com/experiencor/kangaroo): mAP 95%, demo https://youtu.be/URO3UDHvoLY, config: see the zoo, model https://bit.ly/39rLNoE
  • License Plate Detection (European, in Romania, 1 class, https://github.com/RobertLucian/license-plate-dataset): mAP 90%, demo https://youtu.be/HrqzIXFVCRo, config: see the zoo, model https://bit.ly/2tIpvPl
  • Raccoon Detection (1 class, https://github.com/experiencor/raccoon_dataset): mAP 98%, demo https://youtu.be/lxLyLIL7OsU, config: see the zoo, model https://bit.ly/39rLNoE
  • Red Blood Cell Detection (3 classes, https://github.com/experiencor/BCCD_Dataset): mAP 84%, demo https://imgur.com/a/uJl2lRI, config: see the zoo, model https://bit.ly/39rLNoE
  • VOC (20 classes, http://host.robots.ox.ac.uk/pascal/VOC/voc2012/): mAP 72%, demo https://youtu.be/0RmOI6hcfBI, config: see the zoo, model https://bit.ly/39rLNoE

Todo list:

  • Yolo3 detection
  • Yolo3 training (warmup and multi-scale)
  • mAP Evaluation
  • Multi-GPU training
  • Evaluation on VOC
  • Evaluation on COCO
  • MobileNet, DenseNet, ResNet, and VGG backends

Installing

To install the dependencies, run

pip install -r requirements.txt

For GPU support, make sure you have the CUDA drivers installed beforehand.

It has been tested to work with Python 2.7.13 and 3.5.3.

Detection

Grab the pretrained YOLOv3 weights from https://pjreddie.com/media/files/yolov3.weights, then run

python yolo3_one_file_to_detect_them_all.py -w yolov3.weights -i dog.jpg

Training

1. Data preparation

Download the Raccoon dataset from https://github.com/experiencor/raccoon_dataset.

Organize the dataset into 4 folders:

  • train_image_folder <= the folder that contains the train images.

  • train_annot_folder <= the folder that contains the train annotations in VOC format.

  • valid_image_folder <= the folder that contains the validation images.

  • valid_annot_folder <= the folder that contains the validation annotations in VOC format.

There is a one-to-one correspondence by file name between images and annotations. If the validation set is empty, the training set will be automatically split into training and validation sets with a ratio of 0.8.
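A minimal sketch of what such an 80/20 split amounts to (an illustration, not the repo's actual code):

import random

# Stand-in for the parsed image/annotation pairs; in the repo these come
# from the VOC-format annotation files.
instances = ['img_%03d.jpg' % i for i in range(100)]

random.shuffle(instances)
split_index = int(0.8 * len(instances))
train_set, valid_set = instances[:split_index], instances[split_index:]
print(len(train_set), len(valid_set))  # 80 20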

Also, if you've got the dataset split into 2 folders, one for images and the other for annotations, and you need to set a custom size for the validation set, use the create_validation_set.sh script to do that. The script expects the following parameters in the following order:

./create_validation_set.sh $param1 $param2 $param3 $param4
# 1st param - folder where the images are found
# 2nd param - folder where the annotations are found
# 3rd param - number of random choices (aka the size of the validation set in absolute value)
# 4th param - folder where the validation images/annots end up (this directory must already contain images/ and annots/ subfolders)
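For example, to carve a 100-image validation set out of a raccoon dataset (paths here are hypothetical):

./create_validation_set.sh /data/raccoon/images /data/raccoon/anns 100 /data/raccoon/valid
# /data/raccoon/valid must already contain empty images/ and annots/ subfolders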

2. Edit the configuration file

The configuration file is a JSON file that looks like this:

{
    "model" : {
        "min_input_size":       352,
        "max_input_size":       448,
        "anchors":              [10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326],
        "labels":               ["raccoon"]
    },

    "train": {
        "train_image_folder":   "/home/andy/data/raccoon_dataset/images/",
        "train_annot_folder":   "/home/andy/data/raccoon_dataset/anns/",      
          
        "train_times":          10,             # the number of time to cycle through the training set, useful for small datasets
        "pretrained_weights":   "",             # specify the path of the pretrained weights, but it's fine to start from scratch
        "batch_size":           16,             # the number of images to read in each batch
        "learning_rate":        1e-4,           # the base learning rate of the default Adam rate scheduler
        "nb_epoch":             50,             # number of epoches
        "warmup_epochs":        3,              # the number of initial epochs during which the sizes of the 5 boxes in each cell is forced to match the sizes of the 5 anchors, this trick seems to improve precision emperically
        "ignore_thresh":        0.5,
        "gpus":                 "0,1",

        "saved_weights_name":   "raccoon.h5",
        "debug":                true            # turn on/off the line that prints current confidence, position, size, class losses and recall
    },

    "valid": {
        "valid_image_folder":   "",
        "valid_annot_folder":   "",

        "valid_times":          1
    }
}

The labels setting lists the labels to be trained on. Only images containing the listed labels are fed to the network; the rest are simply ignored. This way, a dog detector can easily be trained using the VOC or COCO dataset by setting labels to ['dog'].
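For instance, a dog detector's model section would keep everything else the same and change only the labels line (a minimal illustration):

"model" : {
    "min_input_size":       352,
    "max_input_size":       448,
    "anchors":              [10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326],
    "labels":               ["dog"]
}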

Download pretrained weights for backend at:

https://bit.ly/39rLNoE

These weights must be put in the root folder of the repository. They are the pretrained weights for the backend only and will be loaded during model creation. The code does not work without them.

3. Generate anchors for your dataset (optional)

python gen_anchors.py -c config.json

Copy the generated anchors printed on the terminal to the anchors setting in config.json.
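For example, if the script prints anchors like the following (hypothetical values, not real output), the anchors line in config.json becomes:

"anchors": [17,18, 28,24, 36,34, 52,46, 60,97, 104,78, 120,140, 189,204, 365,320]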

4. Start the training process

python train.py -c config.json

By the end of this process, the code will write the weights of the best model to the file best_weights.h5 (or whatever name is specified in the "saved_weights_name" setting in config.json). The training process stops when the loss on the validation set has not improved for 3 consecutive epochs.
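This early-stopping behaviour matches the standard Keras callbacks; a minimal sketch of an equivalent setup (the exact arguments used by train.py may differ):

from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # stop once val_loss has not improved for 3 consecutive epochs
    EarlyStopping(monitor='val_loss', patience=3, verbose=1),
    # keep only the weights of the best model seen so far
    ModelCheckpoint('best_weights.h5', monitor='val_loss', save_best_only=True),
]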

5. Perform detection using trained weights on image, set of images, video, or webcam

python predict.py -c config.json -i /path/to/image/or/video

It carries out detection on the image and writes the image with the detected bounding boxes to the same folder.

If you wish to change the object threshold or IOU threshold, you can do so by altering the obj_thresh and nms_thresh variables. By default, they are set to 0.5 and 0.45 respectively.
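The relevant lines in predict.py presumably look something like this (the shape of the get_yolo_boxes call is an assumption based on the function name mentioned in the issues below, not a quote from the file):

obj_thresh, nms_thresh = 0.5, 0.45  # lower obj_thresh to keep more detections; nms_thresh controls how overlapping boxes are merged
boxes = get_yolo_boxes(infer_model, [image], net_h, net_w, anchors, obj_thresh, nms_thresh)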

Evaluation

python evaluate.py -c config.json

This computes the mAP of the model defined in saved_weights_name on the validation set defined by valid_image_folder and valid_annot_folder.

keras-yolo3's People

Contributors

clement10601, experiencor, robertlucian


keras-yolo3's Issues

something wrong with the coordinates of bbox

I converted yolov3.weights to yolov3.h5 and then tested the image (dog.jpg). The results look like this:

[image: dog_detected]

The darknet test looks like this:

[image: predictions]

The bbox coordinates are somewhat wrong; the 'y' coordinate seems larger than in the darknet test.

I compared the code with the darknet code and it all looks correct. Has anyone met the same problem? How can I solve it?

Can it use the model trained from darknet?

Hi,
Thank you for sharing.
I am wondering if I can use a model trained with darknet?
And is the available pre-trained model trained on the raccoon dataset?

Thank you

how to make my dataset

Can I use labelImg to make my own dataset? Is the xml document made by labelImg the same as the xml you offered?

Confused about some code

There is some code that I can't figure out by myself; please help!

  1. In generator.py, should true_box_index be reset to zero for every train_instance?

    true_box_index = 0

  2. In yolo.py, pred_box_conf - 0 and true_box_wh + tf.zeros_like(true_box_wh) * (1-object_mask) make no difference. I'm really confused about warmup training; what is it trying to do?
    Please give more explanations.

    conf_delta = pred_box_conf - 0

true_box_wh + tf.zeros_like(true_box_wh) * (1-object_mask),

  3. In YoloLayer there is a parameter named max_grid; when creating the model you pass max_input_size to it, which is 480. But inside create_yolov3_model this value is multiplied by 2 or 4 in the different yolo layers. Is that right?
    [2*num for num in max_grid],

    [4*num for num in max_grid],

    Thanks a lot !!!

Questions about the predict file on video

Thank you for posting your code. Recently I trained my own dataset; the results were great when predicting on images. But when I tested a video recorded with my cellphone, I got nothing! The video format I tested was MP4. Is the video resolution too high? Are there other requirements for the video besides the format?

Issues with training.

Hey man,
This is the error I am getting right now.
Traceback (most recent call last):
  File "gen_anchors.py", line 132, in <module>
    _main_(args)
  File "gen_anchors.py", line 111, in _main_
    centroids = run_kmeans(annotation_dims, num_anchors)
  File "gen_anchors.py", line 57, in run_kmeans
    indices = [random.randrange(ann_dims.shape[0]) for i in range(anchor_num)]
  File "gen_anchors.py", line 57, in <listcomp>
    indices = [random.randrange(ann_dims.shape[0]) for i in range(anchor_num)]
  File "/home/abraham/anaconda3/envs/yolo/lib/python3.5/random.py", line 195, in randrange
    raise ValueError("empty range for randrange()")
ValueError: empty range for randrange()
It's due to the pkl file for the dataset. How do I create the pkl file for it?
What is the content of the pkl file?

Btw, it's a nice repo ;) you made yolo v3 on such short notice

Unable to train

Traceback (most recent call last):
  File "train.py", line 284, in <module>
    _main_(args)
  File "train.py", line 231, in _main_
    scales = config['train']['scales'],
  File "train.py", line 144, in create_model
    template_model.load_weights(saved_weights_name)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2647, in load_weights
    with h5py.File(filepath, mode='r') as f:
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 269, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 99, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
IOError: Unable to open file (file signature not found)

I got this error when I tried to train the raccoon model.
I put both the raccoon.h5 and backend.h5 files in the folder.
Thank you!

problem about config path in windows

Can I use your code to train on Windows?
I tried once, but I found the config path doesn't suit the code, like the path below:

[image]

Can you tell me how to change the path? Thanks.

As a package?

Could you make this a package so I can install it with pip? It would be useful on Kaggle (which doesn't allow loose files but does allow pip installs from GitHub).

Keras/tensorflow version requirements?

I'm currently using Tensorflow 1.3 and Keras 2.1.5, and I'm experiencing issues related to LeakyRelu. Could you please confirm which versions are required for this code to run?

Thanks in advance.

OOM when training raccoon dataset

Using train.py with only the dataset path changed, I get OOM after it prints:

Epoch 1/103
('resizing: ', 320, 320)

I am using tf-gpu 1.4.1, keras-gpu 2.1.5, python 2.7.13 on a single GTX 1080; yolo3_one_file_to_detect_them_all works fine.

How did you get the backend.h5?

1. Was your backend.h5 pre-trained on ImageNet?
2. Can you give me the details of the training process for backend.h5?
Thank you so much!

config files in the zoo

Thank you for creating this repo! I have a few questions on training and config files:

config_kangaroo.json and config_raccoon.json have different parameters, and config_raccoon.json seems to match the code better. Which one is the set of parameters to use? Does "scales" mean training on different scales of input images?
Which parameters would reproduce the result in the README?

For example, only "scales" is used in train.py:

train_model, infer_model = create_model(
        nb_class            = len(labels),
        anchors             = config['model']['anchors'], 
        max_box_per_image   = max_box_per_image, 
        max_grid            = [config['model']['max_input_size'], config['model']['max_input_size']], 
        batch_size          = config['train']['batch_size'], 
        warmup_batches      = warmup_batches,
        ignore_thresh       = config['train']['ignore_thresh'],
        multi_gpu           = multi_gpu,
        saved_weights_name  = config['train']['saved_weights_name'],
        lr                  = config['train']['learning_rate'],
        scales              = config['train']['scales'],
    )

config_raccoon.json

"scales": [1,5,10],

config_kangaroo.json

"grid_scales":          [1,1,1],
 "obj_scale":            5,
 "noobj_scale":          1,
"xywh_scale":           1,
 "class_scale":          1,

After running training, I got:

Epoch 00035: loss did not improve from 10.03454
Epoch 00035: reducing learning rate to 1.00000001169e-08.
- 36s - loss: 10.3701 - yolo_layer_1_loss: 1.2003 - yolo_layer_2_loss: 3.5482 - yolo_layer_3_loss: 5.6216
Epoch 00035: early stopping
Premature end of JPEG file
kangaroo: 0.7736
mAP: 0.7736

Getting Unknown Layer : YoloLayer on predict.py

When I try to do prediction I am getting this error:

Traceback (most recent call last):
  File "predict.py", line 151, in <module>
    _main_(args)
  File "predict.py", line 34, in _main_
    infer_model = load_model(config['train']['saved_weights_name'])
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\models.py", line 243, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\models.py", line 317, in model_from_config
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2514, in from_config
    process_layer(layer_data)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2500, in process_layer
    custom_objects=custom_objects)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2514, in from_config
    process_layer(layer_data)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\engine\topology.py", line 2500, in process_layer
    custom_objects=custom_objects)
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\ankit\AppData\Roaming\Python\Python36\site-packages\keras\utils\generic_utils.py", line 138, in deserialize_keras_object
    ': ' + class_name)
ValueError: Unknown layer: YoloLayer

I tried adding YoloLayer to custom_objects as well, but then I get:
TypeError: __init__() missing 5 required positional arguments: 'anchors', 'max_grid', 'batch_size', 'warmup_batches', and 'ignore_thresh'

Any clues?
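One hedged workaround (an editor's assumption, not a documented fix): skip load_model() entirely, rebuild the network with create_model() from train.py using the same arguments the training script uses (its signature is quoted in the "config files in the zoo" issue above), and rely on the weights being loaded from saved_weights_name:

import json
from train import create_model  # assumes the repo root is on the path

with open('config.json') as f:
    config = json.load(f)

train_model, infer_model = create_model(
    nb_class           = len(config['model']['labels']),
    anchors            = config['model']['anchors'],
    max_box_per_image  = 42,  # placeholder: use the value computed during training
    max_grid           = [config['model']['max_input_size'], config['model']['max_input_size']],
    batch_size         = 1,
    warmup_batches     = 0,
    ignore_thresh      = config['train']['ignore_thresh'],
    multi_gpu          = 1,
    saved_weights_name = config['train']['saved_weights_name'],
    lr                 = config['train']['learning_rate'],
    scales             = config['train']['scales'],
)
# Judging from the "Unable to train" traceback above, create_model() already
# calls load_weights(saved_weights_name) when that file exists, so infer_model
# should be ready for prediction here without deserializing YoloLayer.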

training scales

With grid scales == [1, 1, 1], I notice that the losses obviously scale with the grid size. I have a few questions related to this.

  1. Do you want the losses to be roughly equal per output head, and if so, is it fine to adjust grid_scales accordingly (i.e. [1, .5, .25])?

  2. What do you find is a good loss for a model now? It used to be that I knew I had a good yolov2 model when the loss was < 0.1. Now I am getting losses around ~20. Possibly there are other issues at play and I have made a mistake somewhere; I am not using your code as is, but instead incorporating the new loss into my own code.

Got the error when training the kangaroo dataset

ValueError: Dimension 0 in both shapes must be equal, but are 1 and 255. Shapes are [1,1,1024,18] and [255,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,18], [255,1024,1,1].

I wonder how I can solve it. Thank you.

question about evaluate

[screenshot: 2018-04-23 11-05-05]

What is the function of "cache_name"? And what do I need to add: a dir path or a label name?

question about backend

Thank you very much for your contribution!
I downloaded your code from github and tested it on my computer.
When I run "python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg", this problem occurred:
Traceback (most recent call last):
  File "train.py", line 101, in <module>
    _main_(args)
  File "train.py", line 70, in _main_
    anchors = config['model']['anchors'])
    x = LeakyReLU(alpha=0.1)(x)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/base_layer.py", line 454, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/advanced_activations.py", line 46, in call
    return K.relu(inputs, alpha=self.alpha)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 2933, in relu
    x = tf.nn.leaky_relu(x, alpha)
AttributeError: 'module' object has no attribute 'leaky_relu'

I don't know why. I also ran yolo2 on my computer and could train on my own dataset, but after this error occurred, yolo2 couldn't run either, with the same problem. Do you have any idea?

ZeroDivisionError: float division by zero

I'm getting ZeroDivisionError: float division by zero in the bbox_iou function while running python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg. I am not sure if I'm missing anything.

backend

Could you explain how you got the pretrained weights for the backend (backend.h5)?

Error: Unable to open file (unable to open file: name = 'model.h5'

I ran the command

python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg
and got this error:

OSError: Unable to open file (unable to open file: name = 'model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

I don't see model.h5 in the solution. Can you update it?

One more thing: at lines 400 to 403, the code is

# load the weights trained on COCO into the model

#weight_reader = WeightReader(weights_path)
yolov3.load_weights("model.h5")
yolov3.save("backend.h5")

That means we do not use the input weights for predicting; instead we use the file "model.h5".

And at line 403, yolov3.save("backend.h5"): why must we save the model to a file?

Thank you very much.

Little bug in predict.py

Hi there! Thanks a lot; I successfully trained on my own data using your code.
Yet I found a bug in predict.py:
Line 109: image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file == '.png')]
The second 'inp_file' should be 'inp_file[-4:]'.
And because it didn't include '.JPEG' files, it confused me for a while when I tested on my data, which are all .JPEG images (no outputs), so I changed this line to:
image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] in ['.jpg', '.png', 'JPEG'])]

Still, I got a warning saying: 'No training configuration found in save file'.
I don't know why, but it does not matter.

Predict: Division by Zero when no object was found

Python 3.6, Win10, GTX 1060

I've trained a model to detect a single class (similar to the examples).
When using the predict script on an image where it doesn't detect any object of said class, the subsequent bounding-box handling code runs into a division by zero in bbox_iou() because union becomes zero.
I'm not sure how to fix the underlying issue, but I was able to work around it.

I've worked around this by naively replacing the return statement with:
return float(intersect) / union if union != 0 else 0.

This gets me further, but then I also had to add this to draw_boxes():

if abs(box.xmin) > 1000 or abs(box.xmax) > 1000 or abs(box.ymin) > 1000 or abs(box.ymax) > 1000:
    continue

because the [xy]min/max values are basically invalid (insanely small/large).

I guess the boxes should have been discarded somewhere before all this so these code paths are never reached, but I'm not familiar enough with the code to spot it.

Using weights trained here with Darknet?

Hi, I successfully trained this keras yolo3 network on my own custom dataset and it is giving me very good results; however, the network takes up to 2 seconds just to generate predictions for each image.

(This is specifically the time for boxes = get_yolo_boxes(...) to run, not any of the extra stuff.)

Is it possible to speed up predictions in any way, or even to use these weights with the straight C version of Darknet?

Thanks a lot.

Little mistake in predict file

Hello,
I found a little mistake in the predict file, line 109:

image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file == '.png')]

inp_file for png should be inp_file[-4:]. People who have png images will never get predicted output.

The correct line should be:

image_paths = [inp_file for inp_file in image_paths if (inp_file[-4:] == '.jpg' or inp_file[-4:] == '.png')]

Best of luck

Getting error while training

2018-04-15 16:37:36.355877: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
2018-04-15 16:37:36.362129: W tensorflow/core/framework/op_kernel.cc:1202] OP_REQUIRES failed at conv_ops.cc:677 : Resource exhausted: OOM when allocating tensor with shape[8,1024,13,13] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  [[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 251, in <module>
    _main_(args)
  File "train.py", line 210, in _main_
    max_queue_size = 8
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 2224, in fit_generator
    class_weight=class_weight)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1883, in train_on_batch
    outputs = self.train_function(ins)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2478, in __call__
    **self.session_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  [[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'replica_0/model_1/conv_73/convolution', defined at:
  File "train.py", line 251, in <module>
    _main_(args)
  File "train.py", line 190, in _main_
    saved_weights_name = config['train']['saved_weights_name']
  File "train.py", line 113, in create_model
    train_model = multi_gpu_model(template_model, gpus=multi_gpu)
  File "/content/drive/drive/keras-yolo3/utils/multi_gpu_model.py", line 48, in multi_gpu_model
    outputs = model(inputs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 619, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2085, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/topology.py", line 2236, in run_internal_graph
    output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/convolutional.py", line 168, in call
    dilation_rate=self.dilation_rate)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 3335, in conv2d
    data_format=tf_data_format)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 781, in convolution
    return op(input, filter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 869, in __call__
    return self.conv_op(inp, filter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 521, in __call__
    return self.call(inp, filter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 205, in __call__
    name=self.name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 631, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[8,13,13,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: replica_0/model_1/conv_73/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](replica_0/model_1/leaky_72/LeakyRelu/Maximum, conv_73/kernel/read/_2395)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  [[Node: training/Adam/gradients/replica_0/model_1/bnorm_47/FusedBatchNorm_grad/FusedBatchNormGrad/_5027 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_39158...chNormGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

error when running "python yolo3_one_file_to_detect_them_all.py -w yolov3.weights -i dog.jpg"

loading weights of convolution #1
loading weights of convolution #2
loading weights of convolution #3
loading weights of convolution #4
Traceback (most recent call last):
  File "yolo3_one_file_to_detect_them_all.py", line 440, in <module>
    _main_(args)
  File "yolo3_one_file_to_detect_them_all.py", line 411, in _main_
    weight_reader.load_weights(yolov3)
  File "yolo3_one_file_to_detect_them_all.py", line 67, in load_weights
    size = np.prod(norm_layer.get_weights()[0].shape)
AttributeError: 'NoneType' object has no attribute 'get_weights'

using python 3.6.1, keras 2.0.3, tf 1.4.1

os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="2"

I have no clue how to debug this; thanks for any hint.

Training was interrupted

Today I found my computer had shut down and my training was interrupted. If I restart the training, will the project re-read the weight file I trained before, or will it just start another training from scratch?

I hope you can help me. Thank you!
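A hedged note, inferred from the tracebacks elsewhere on this page rather than from the author: create_model() in train.py calls template_model.load_weights(saved_weights_name) when that file exists, so restarting with the same config should resume from the saved weights. Alternatively, the pretrained_weights setting documented above can point at the interrupted run's file:

"train": {
    "pretrained_weights": "raccoon.h5",   # assumption: path to the weights saved before the shutdown
    "saved_weights_name": "raccoon.h5",
    ...
}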

recall and tensorboard issue

  1. With the latest training code, tensorboard doesn't show validation loss; only training loss_[1-3] are left?
  2. When I trained with my own dataset, which contains only one class (person), I got some wrong boxes that contained other things, like a dog or horse, labeled person. I think the model detected some objects but cannot figure out that they are not what I want. This gives a high recall but low precision. What can I do about this besides raising the threshold? Does this relate to the classification loss changing from softmax to cross-entropy?

error in loading pretrained weights

I trained a yolo3 model with backend.h5 for hours and got a new_weights.h5 (739MB). When I tried to continue training by loading new_weights.h5, I got the error below; could someone kindly explain it to me?

Loading pretrained weights.

Traceback (most recent call last):
  File "train.py", line 252, in <module>
    _main_(args)
  File "train.py", line 190, in _main_
    saved_weights_name = config['train']['saved_weights_name']
  File "train.py", line 108, in create_model
    template_model.load_weights(saved_weights_name)
  File "/home/hemp/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 2656, in load_weights
    f, self.layers, reshape=reshape)
  File "/home/hemp/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 3354, in load_weights_from_hdf5_group
    str(len(filtered_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 1 layers into a model with 147 layers.

missing utils.colors script

First of all, thank you for your post; I'm working on a project and it has helped me a lot. But I also met a problem:
while training, bbox.py needs to import get_color from utils.colors, but there is no such script in the utils folder.

About cache file and labels for training

Thanks for the great work.
I wonder about 2 things that I can't understand.

  1. How can I add 3 new labels in config.json? The example you show has only raccoons, but what I want to do is add 3 labels in one picture, so I guess:
    "labels": ["a","b","c"]
    Does it work?

  2. The pkl file is newly added and it wasn't in the readme. Can you tell me how to make the pkl files?

Multi-GPU Support

Hi Experiencor,

I was waiting for YOLOv3; thanks for posting.
As you know, TF Estimators are on the way. If possible, please add multi-GPU training; it takes too much time to train even on a single GPU.

Not able to train

I'm using Google Colab to run the code, but training and saving the weights take too much time.
Is there any way I can run this code on Google Colab?

some question about yolov3

For objectness scores it uses logistic regression; what is the advantage of this? What does yolo2 use for objectness scores? Why does it train multi-label classification, and what was it in v2, multi-class classification? What is the advantage of this?

Question about predict

[screenshot: 2018-04-23 10-43-09]

Firstly, I want to thank you for answering my questions. There is another question:
I used the weights that I trained last night to predict the raccoon, but on the test image there is only a bounding box without a label name. Is this right? And how can I get the label name?

[screenshot: 2018-04-23 10-50-51]

This is my config.json; I did add the label name.

Question about GPU

Hi, I just downloaded the whole code. I want to run the code on GPU, but I don't want to occupy the whole GPU. How do I change the GPU memory usage?
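A hedged sketch for the TF 1.x / Keras stack this repo targets (the 0.5 fraction is an arbitrary example): cap TensorFlow's GPU memory allocation before the model is built.

import tensorflow as tf
import keras.backend as K

tf_config = tf.ConfigProto()
tf_config.gpu_options.per_process_gpu_memory_fraction = 0.5  # use at most ~half the GPU memory
# tf_config.gpu_options.allow_growth = True                  # alternative: allocate on demand
K.set_session(tf.Session(config=tf_config))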

error occurs while detecting the dog pic; what's the version of python?

Hi, when I run the python yolo3_one_file_to_detect_them_all.py -w yolo3.weights -i dog.jpg command, I get the error message below; can you help me?

Traceback (most recent call last):
  File "yolo3_one_file_to_detect_them_all.py", line 431, in <module>
    _main_(args)
  File "yolo3_one_file_to_detect_them_all.py", line 407, in _main_
    new_image = preprocess_input(image, net_h, net_w)
  File "yolo3_one_file_to_detect_them_all.py", line 270, in preprocess_input
    resized = cv2.resize(image[:,:,::-1]/255., (new_w, new_h))
TypeError: integer argument expected, got float

Getting StopIteration: 'NoneType' object is not subscriptable

count 	[44][5]
loss: 	[44][57.3060036]
loss: 	[44][173.999207]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 578, in get
    inputs = self.queue.get(block=True).get()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 401, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/content/drive/keras-yolo3/generator.py", line 73, in __getitem__
    img, all_objs = self._aug_image(train_instance, net_h, net_w)
  File "/content/drive/keras-yolo3/generator.py", line 160, in _aug_image
    image = cv2.imread(image_name)[:,:,::-1] # RGB image
TypeError: 'NoneType' object is not subscriptable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 251, in <module>
    _main_(args)
  File "train.py", line 210, in _main_
    max_queue_size   = 4
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 2192, in fit_generator
    generator_output = next(output_generator)
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 584, in get
    six.raise_from(StopIteration(e), e)
  File "<string>", line 3, in raise_from
StopIteration: 'NoneType' object is not subscriptable
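An editor's hedged pre-flight check (not part of the repo): cv2.imread() returns None for missing or unreadable files, which is exactly what generator.py trips over here, so scanning the image folder first can locate the bad file. The folder path below is a placeholder.

import os
import cv2

image_folder = '/home/andy/data/raccoon_dataset/images/'  # your train_image_folder
for name in sorted(os.listdir(image_folder)):
    path = os.path.join(image_folder, name)
    if cv2.imread(path) is None:
        print('unreadable image:', path)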

small bug

In get_yolo_boxes(...) of utils.py:

image_h and image_w might differ between images, which matters for correct_yolo_boxes.

This causes an error when using batches to speed up evaluation (but it's OK to evaluate pics one by one).

some problem about training my own data

[screenshot: 2018-04-27 21-47-04]

As you can see, there is an error: StopIteration: 'NoneType' object is not subscriptable.

[screenshot: 2018-04-27 22-00-46]

I want to know what the problem is and how to solve it. Thank you.
