The yolov1 from jl749

YOLO architecture

YOLO = a single regression that predicts class probabilities and bounding boxes directly from the image

Pros

Fast and simple with high accuracy compared to any other real-time detection system
Network learns generalized object representations (e.g. trained on natural images --> inference also works on artwork)
Low FP on background (False Positive: predicted positive when the actual is negative), since the network sees the whole image at once

Cons

Lower localization accuracy than state-of-the-art detectors, especially on small objects
Cannot predict small objects clustered together
(YOLO imposes strong spatial constraints on bounding box predictions since each grid cell only predicts two boxes and can only have one class. This spatial constraint limits the number of nearby objects that the model can predict. The model struggles with small objects that appear in groups, such as flocks of birds.)

Architecture

C = 20 (classes), S = 7 (grid size, S*S cells), B = 2 (bounding boxes per grid cell)
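As a quick sanity check of what these values imply, the sketch below works out the per-cell vector length and the total number of predictions per image. It assumes the usual YOLOv1 meaning of C, S and B; the variable names are illustrative and not taken from the repository.

```python
# Hyperparameters as listed above (Pascal VOC setting).
C = 20  # number of classes
S = 7   # the image is divided into an S x S grid
B = 2   # bounding boxes predicted per grid cell

# Each box carries 5 numbers: (confidence, x, y, w, h).
per_cell = B * 5 + C
total_outputs = S * S * per_cell
total_boxes = S * S * B

print(per_cell, total_outputs, total_boxes)  # 30 1470 98
```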

Process

An S*S grid is laid over the input image
Each grid cell predicts B bounding boxes, and each bounding box has a confidence score (0 when the cell contains no object)
confidence = P(obj) * IOU(expected, pred), with IOU in the range 0 ~ 1
1 bounding box = (confidence, x, y, w, h)
x, y = box centre relative to the grid cell (0 ~ 1); w, h = relative to the whole image (0 ~ 1)
Each grid cell also predicts C conditional class probabilities
P(class_i | obj)
Output tensor shape: (BATCH_SIZE, S, S, B*5+C) = (BATCH_SIZE, 7, 7, 30)
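The two coordinate conventions and the IOU term can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: the function names `cell_to_image` and `iou` are hypothetical, and boxes are assumed to be in midpoint format (cx, cy, w, h) relative to the whole image.

```python
def cell_to_image(row, col, x, y, S=7):
    """Convert a cell-relative centre (x, y in 0..1) to image-relative coordinates."""
    return (col + x) / S, (row + y) / S


def iou(box_a, box_b):
    """IOU of two boxes given as (cx, cy, w, h), all relative to the image."""
    # Convert midpoint format to corner format.
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    # Width and height of the overlap, clamped at 0 for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# A box centred in the middle of cell (row 3, col 3) of a 7x7 grid
# sits at the centre of the image.
print(cell_to_image(3, 3, 0.5, 0.5))
```

During training, the confidence target for a box is this IOU value when the cell contains an object and 0 otherwise.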

Testing --> class specific confidence score

Inference

For each bounding box found --> confidence score * conditional class probability
P(obj) * IOU(expected, pred) * P(class_i | obj) = P(class_i) * IOU(expected, pred)
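The product above can be sketched for a single grid cell like this. The helper name `class_specific_scores` and the example numbers are made up for illustration, and C is shortened to 3 classes to keep the output readable.

```python
def class_specific_scores(box_confidences, class_probs):
    """Return one score per (box, class): confidence * P(class_i | obj)."""
    return [[conf * p for p in class_probs] for conf in box_confidences]


# Hypothetical cell with B = 2 boxes and, for brevity, C = 3 classes.
box_confidences = [0.8, 0.1]    # P(obj) * IOU for each box
class_probs = [0.5, 0.25, 0.25]  # P(class_i | obj), shared by both boxes

scores = class_specific_scores(box_confidences, class_probs)

# Pick the (box, class) pair with the highest class-specific score.
flat = [(s, b, c) for b, row in enumerate(scores) for c, s in enumerate(row)]
best_score, best_box, best_class = max(flat)
print(best_box, best_class, best_score)
```

Because the class probabilities are predicted once per cell, both boxes of a cell share them. This is why a cell can only represent one class.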

Apply non-max suppression to the 98 boxes (S*S*B = 7*7*2), using their class-specific confidence scores

--> set object class as well as bounding box location
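A greedy, per-class non-max suppression can be sketched as below. This is an illustrative version under stated assumptions, not the repository's implementation: the detection layout `(class_id, score, cx, cy, w, h)`, the helper names, and both threshold values are hypothetical choices.

```python
def iou_midpoint(a, b):
    """IOU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5, score_threshold=0.2):
    """Keep the highest-scoring box among heavily overlapping boxes of the same class.

    detections: list of (class_id, score, cx, cy, w, h) tuples.
    """
    # Drop low-confidence boxes, then visit the rest from highest to lowest score.
    candidates = sorted(
        (d for d in detections if d[1] > score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # A box survives if no already-kept box of the same class overlaps it too much.
        if all(k[0] != det[0] or iou_midpoint(k[2:], det[2:]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


detections = [
    (14, 0.9, 0.50, 0.5, 0.4, 0.4),
    (14, 0.6, 0.52, 0.5, 0.4, 0.4),  # near-duplicate of the first box
    (11, 0.7, 0.50, 0.5, 0.4, 0.4),  # same place, different class
    (14, 0.1, 0.10, 0.1, 0.1, 0.1),  # below the score threshold
]
for det in non_max_suppression(detections):
    print(det)
```

Here the second box is suppressed as a duplicate of the first, the third survives because it belongs to another class, and the fourth is filtered out by the score threshold.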

dataset

Uses the Pascal VOC dataset

Example label txt file (one object per line)
[class_label, x, y, w, h]

11 0.27232142857142855 0.623 0.5386904761904762 0.306
14 0.8125 0.518 0.375 0.892
14 0.3214285714285714 0.447 0.369047619047619 0.5660000000000001
14 0.22767857142857142 0.6 0.3898809523809524 0.47600000000000003

x, y, w, h = position and size scaled to 0 ~ 1
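To connect a label line to the S*S grid, the sketch below parses one line and finds the grid cell that contains the (x, y) point. It assumes x, y is the box centre, as in the YOLO paper. The function names are hypothetical, and how the repository itself encodes targets may differ.

```python
def parse_label_line(line):
    """Parse 'class_label x y w h' into (int, (x, y, w, h))."""
    parts = line.split()
    return int(parts[0]), tuple(float(v) for v in parts[1:5])


def responsible_cell(x, y, S=7):
    """Return (row, col, x_cell, y_cell) for an image-relative point (x, y)."""
    # Clamp so that x == 1.0 or y == 1.0 still falls inside the last cell.
    col = min(int(x * S), S - 1)
    row = min(int(y * S), S - 1)
    # Offsets of the point inside that cell, in 0 ~ 1.
    return row, col, x * S - col, y * S - row


class_label, (x, y, w, h) = parse_label_line(
    "11 0.27232142857142855 0.623 0.5386904761904762 0.306"
)
row, col, x_cell, y_cell = responsible_cell(x, y)
print(class_label, row, col, x_cell, y_cell)
```

For the first label line above, the point falls in row 4, column 1 of the 7x7 grid.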

train.csv, test.csv (each row pairs an image file with its label file)

000005.jpg,000005.txt
000007.jpg,000007.txt
000009.jpg,000009.txt
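Reading these pairs needs only the standard `csv` module. The sketch below assumes the files have no header row, as in the excerpt above. It reads from an in-memory string so that it runs without the dataset, and `read_pairs` is a hypothetical helper name.

```python
import csv
import io


def read_pairs(file_obj):
    """Return a list of (image_filename, label_filename) tuples."""
    return [(row[0], row[1]) for row in csv.reader(file_obj) if len(row) >= 2]


# Stand-in for open("train.csv", newline=""); the contents mirror the rows above.
sample = "000005.jpg,000005.txt\n000007.jpg,000007.txt\n000009.jpg,000009.txt\n"
pairs = read_pairs(io.StringIO(sample))

for image_name, label_name in pairs:
    print(image_name, label_name)
```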
