Laboro Tomato: Instance segmentation dataset

Overview
Baseline
Subsets

Overview

About Laboro Tomato

Laboro Tomato is an image dataset of growing tomatoes at different stages of ripening, designed for object detection and instance segmentation tasks. We also provide two subsets of tomatoes separated by size. The dataset was gathered at a local farm with two separate cameras that differ in resolution and image quality.

Samples of raw/annotated images: IMG_1066, IMG_1246

Annotation details

Each tomato is assigned to one of 2 categories according to size (normal size and cherry tomato) and one of 3 categories depending on the stage of ripening:

fully_ripened - completely red and ready to be harvested. Red color covers 90%* or more
half_ripened - greenish and needs time to ripen. Red color covers 30-89%*
green - completely green/white, sometimes with rare red parts. Red color covers 0-30%*

*All percentages are approximate and differ from case to case.
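
The thresholds above can be written as a small lookup. This is a minimal sketch, not part of the dataset tooling: the function name `ripening_stage` and its `red_fraction` input (share of the tomato surface that is red, from 0 to 1) are hypothetical, and since the listed ranges are approximate and meet at 30%, treating exactly 30% as half_ripened is an assumption.

```python
def ripening_stage(red_fraction: float) -> str:
    """Map an approximate red-surface fraction (0.0 to 1.0) to a stage label."""
    if not 0.0 <= red_fraction <= 1.0:
        raise ValueError("red_fraction must be between 0 and 1")
    if red_fraction >= 0.90:
        return "fully_ripened"
    if red_fraction >= 0.30:  # assumption: the 30% boundary counts as half_ripened
        return "half_ripened"
    return "green"
```

The actual annotations were assigned by eye, so a rule like this is only a rough guide to how the labels relate to color coverage.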

Dataset details

The dataset includes 804 images with the following details:

name: tomato_mixed
images: 643 train, 161 test
cls_num: 6
cls_names: b_fully_ripened, b_half_ripened, b_green, l_fully_ripened, l_half_ripened, l_green
total_bboxes: train[7781], test[1996]
bboxes_per_class:
    *Train: b_fully_ripened[348], b_half_ripened[520], b_green[1467], 
            l_fully_ripened[982], l_half_ripened[797], l_green[3667]
    *Test:  b_fully_ripened[72], b_half_ripened[116], b_green[387], 
            l_fully_ripened[269], l_half_ripened[223], l_green[929]
image_resolutions: 3024x4032, 3120x4160
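
The per-class counts above can be recomputed from an annotation file. The sketch below assumes the annotations follow the COCO JSON layout (`categories` and `annotations` lists), which matches the `CocoDataset` loader used in the Baseline section; the helper names `bboxes_per_class` and `load_coco` are hypothetical.

```python
import json
from collections import Counter


def bboxes_per_class(coco: dict) -> dict:
    """Count annotations per category name in a COCO-style dictionary."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    # Report every category, including those without any annotation.
    return {name: counts.get(name, 0) for name in id_to_name.values()}


def load_coco(path: str) -> dict:
    """Read a COCO-style JSON annotation file from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Tiny in-memory example standing in for a real annotation file.
example = {
    "categories": [{"id": 1, "name": "b_green"}, {"id": 2, "name": "l_green"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 2},
        {"id": 12, "image_id": 2, "category_id": 2},
    ],
}
print(bboxes_per_class(example))
```

For the real data, pass the result of `load_coco(...)` with the path to one of the JSON files in the `annotations` folder; the exact file names depend on the downloaded archive.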

Scope of application

The Laboro Tomato dataset can be combined with other technologies to address real-world tasks such as:

Harvesting forecast based on tomato maturity
Automatic harvesting of only ripened tomatoes
Identification and automatic thinning of deteriorated and obsolete tomatoes
Spraying pesticides only on tomatoes at a specific ripening stage
Temperature control in greenhouses according to ripening stage
Quality control on production lines of food manufacturers, etc.

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
For commercial use, please contact Laboro.AI Inc.

Download dataset

Laboro Tomato download link

Baseline

Pretrained model

The model was trained with mmdetection V2.0 on 4 Tesla V100 GPUs and is based on Mask R-CNN with the R-50-FPN 1x configuration:

Dataset	bbox AP	mask AP	Download
Laboro Tomato	64.3	65.7	model

We did not tune hyperparameters for baseline model training and used the default values provided by the original mmdetection configs.
Training parameters:

lr = 0.01
step = [32, 44]
total epoch = 48

Output examples

Image gallery with pretrained model output examples, compared against the raw and annotated images.

Test a dataset

To evaluate the pretrained models, please prepare the mmdetection environment by following the official installation guide.

Prepare dataset

It is recommended to symlink the dataset root to $MMDETECTION/data. If your folder structure is different, you may need to change the corresponding paths in config files.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── laboro_tomato
│   │   ├── annotations
│   │   ├── train
│   │   ├── test
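
The symlink can be created from the shell or, as sketched here, from Python. This is a minimal sketch under stated assumptions: the helper name `link_dataset` and both path arguments are hypothetical placeholders for wherever the dataset was extracted and where mmdetection was cloned.

```python
import os
from pathlib import Path


def link_dataset(dataset_root: str, mmdet_root: str, name: str = "laboro_tomato") -> Path:
    """Create <mmdet_root>/data/<name> as a symlink pointing to dataset_root."""
    source = Path(dataset_root).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"dataset root not found: {source}")
    data_dir = Path(mmdet_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    link = data_dir / name
    # Refuse to overwrite an existing file, directory, or (possibly broken) link.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    os.symlink(source, link, target_is_directory=True)
    return link
```

After linking, `data/laboro_tomato/annotations`, `train` and `test` should resolve inside the mmdetection checkout, matching the tree above.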

Add datasets to mmdetection

To load the data, we need to create a new dataset module mmdet/datasets/laboro_tomato.py that registers the dataset class with the corresponding class names:

from .coco import CocoDataset
from .builder import DATASETS


@DATASETS.register_module()
class LaboroTomato(CocoDataset):
    CLASSES = ('b_fully_ripened', 'b_half_ripened', 'b_green', 
               'l_fully_ripened', 'l_half_ripened', 'l_green')

Then add the dataset name to mmdet/datasets/__init__.py:

from .laboro_tomato import LaboroTomato

__all__ = [
    ..., 'LaboroTomato'
]

Configuration files

Configuration file setup, using the Tomato Mixed dataset as an example:

Create laboro_tomato_base.py in configs/_base_/datasets/ with the content of the coco_detection configuration file, and change the dataset type, root and path parameters:

dataset_type = 'LaboroTomato'
data_root = 'data/laboro_tomato/'
...

Create laboro_tomato_instance.py in configs/_base_/datasets/ with the content of coco_instance, and set its _base_ entry to your base detection configuration file:

_base_ = 'laboro_tomato_base.py'
...

Set the number of classes in the model configuration file configs/_base_/models/mask_rcnn_r50_fpn.py:

...
num_classes = 6
...
num_classes = 6
...

Replace the dataset configuration file name in configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py with the laboro_tomato_instance.py file created above:

_base_ = [
    ...
    '../_base_/datasets/laboro_tomato_instance.py',
    ...
]

Evaluation

You can use the following commands to test a dataset:

# single-gpu testing
python tools/test.py configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
                     laboro_tomato_96ep.pth --show

# multi-gpu testing with 4 GPUs
./tools/dist_test.sh configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
                     laboro_tomato_96ep.pth 4 --out results.pkl --eval bbox segm

Train a model

To train your own model, complete all steps from the Test a dataset section, then change the learning rate, steps and total epochs in configs/_base_/schedules/schedule_1x.py. The default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu.

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
...
    step=[64, 88])
total_epochs = 96
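
The Linear Scaling Rule described above reduces to one multiplication. This is a minimal sketch: the helper name `scaled_lr` is hypothetical, and the reference point of lr=0.02 at batch size 16 is the value implied by the two examples in the text (0.01 at batch size 8, 0.08 at batch size 64), so check it against your own schedule file before relying on it.

```python
def scaled_lr(num_gpus: int, imgs_per_gpu: int,
              base_lr: float = 0.02, base_batch_size: int = 16) -> float:
    """Scale the learning rate linearly with the total batch size."""
    if num_gpus < 1 or imgs_per_gpu < 1:
        raise ValueError("num_gpus and imgs_per_gpu must be positive")
    batch_size = num_gpus * imgs_per_gpu
    # Assumption: base_lr is the rate tuned for base_batch_size images per step.
    return base_lr * batch_size / base_batch_size


print(scaled_lr(4, 2))   # 4 GPUs * 2 img/gpu
print(scaled_lr(16, 4))  # 16 GPUs * 4 img/gpu
```

The returned value is what goes into the `lr` field of the optimizer setting shown above.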

You can use the following commands to train a model:

# single-gpu train
python tools/train.py configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
                      --work-dir ./laboro_tomato

# multi-gpu train with 4 GPUs
./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py 4 \
                      --work-dir ./laboro_tomato

Subsets

Details

name: tomato_big
images: 353 train, 89 test
cls_num: 3
cls_names: b_fully_ripened, b_half_ripened, b_green
total_bboxes: train[2360], test[550]
bboxes_per_class:
    *Train: b_fully_ripened[343], b_half_ripened[506], b_green[1511]
    *Test:  b_fully_ripened[77], b_half_ripened[130], b_green[343]
image_resolutions: 3024x4032, 3120x4160

name: tomato_little
images: 289 train, 73 test
cls_num: 3
cls_names: l_fully_ripened, l_half_ripened, l_green
total_bboxes: train[5397], test[1470]
bboxes_per_class:
    *Train: l_fully_ripened[963], l_half_ripened[805], l_green[3629]
    *Test:  l_fully_ripened[288], l_half_ripened[215], l_green[967]
image_resolutions: 3024x4032, 3120x4160

Pretrained models

As with the main dataset, the Laboro tomato big and Laboro tomato little models were trained with mmdetection V2.0 on 4 Tesla V100 GPUs and are based on Mask R-CNN with the R-50-FPN 1x configuration:

Dataset	bbox AP	mask AP	Download
Laboro tomato big	67.9	68.4	model
Laboro tomato little	62.7	63.1	model

Training parameters:

lr = 0.01
step = [32, 44]
total epoch = 48
