Comments (4)
I don't think I fully understand the questions, but here are a few points regarding what I believe is at least part of what you're asking:
- The `AnchorBoxes` layer does not participate at all in the training of the model. Its only purpose is to output the anchor box coordinates and variances so that `decode_y()` or `decode_y2()` can decode the raw model prediction tensor without needing any other information. This is also why the model output tensor's last axis has length `n_classes + 4 + 4 + 4`. The last 8 elements of the last axis are just the four anchor box coordinates and the four variances for each box and, as mentioned before, they are not relevant to the training, but only to decoding the model output at inference time. I also recommend reading the documentation and inline comments of `AnchorBoxes`, which should help you understand why the model output tensor has this particular shape.
- The variances are just scaling factors for the ground truth box coordinates. They allow you to scale the individual coordinate offsets independently. I recommend you take a look at the comments at the bottom of the definition of `encode_y()` to better understand what they do. The coordinate offsets are divided by the variances. For example, the variances chosen in the original SSD300 are `[0.1, 0.1, 0.2, 0.2]`, meaning that the `(cx, cy)` values are upscaled by a factor of 10 and the `(w, h)` values are upscaled by a factor of 5. Among other things, this means that the box center coordinates are weighted more strongly than the width/height values. The idea behind trying different values for these variances is simply to see if the model learns better with a certain set of values.
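To make the two points above concrete, here is a minimal NumPy sketch. It is not the code from `encode_y()` or `decode_y()`; it only assumes the usual SSD centroid encoding (center offsets relative to the anchor size, log of the size ratio, everything divided by the variances). The helper names `encode_box` and `decode_box`, the class count, and the box values are made up for illustration.

```python
import numpy as np

n_classes = 3  # hypothetical class count, for illustration only

# One dummy prediction row laid out as described above:
# [class scores | 4 predicted offsets | 4 anchor coords | 4 variances]
anchor = np.array([50.0, 50.0, 20.0, 40.0])   # (cx, cy, w, h), made-up values
variances = np.array([0.1, 0.1, 0.2, 0.2])    # the SSD300 values quoted above
row = np.concatenate([np.zeros(n_classes), np.zeros(4), anchor, variances])
assert row.shape[-1] == n_classes + 4 + 4 + 4


def encode_box(gt, anchor, variances):
    """Encode a ground truth box (cx, cy, w, h) relative to an anchor box.

    Assumed centroid encoding, not copied from the repository.
    """
    gt = np.asarray(gt, dtype=float)
    offsets = np.empty(4)
    offsets[:2] = (gt[:2] - anchor[:2]) / anchor[2:]  # center shift relative to anchor size
    offsets[2:] = np.log(gt[2:] / anchor[2:])         # log of the size ratio
    return offsets / variances                        # dividing by 0.1 / 0.2 scales by 10 / 5


def decode_box(offsets, anchor, variances):
    """Invert encode_box using only the anchor coordinates and variances."""
    offsets = np.asarray(offsets, dtype=float) * variances
    box = np.empty(4)
    box[:2] = offsets[:2] * anchor[2:] + anchor[:2]
    box[2:] = np.exp(offsets[2:]) * anchor[2:]
    return box


gt = np.array([52.0, 50.0, 40.0, 40.0])  # made-up ground truth box
encoded = encode_box(gt, anchor, variances)

# Decoding needs nothing beyond the last 8 elements of the row.
decoded = decode_box(encoded, row[-8:-4], row[-4:])
print(encoded)
print(decoded)
```

Because the anchor coordinates and variances travel with every row of the output, the decoding step in this sketch can be written without access to the model configuration, which is the role the comment above attributes to `AnchorBoxes`.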
from ssd_keras.
oh my god,
thank you so much, pierluigiferrari!!!
I think I finally understand your comments!!
It was my fault, I totally misunderstood it.
I suddenly see the light.
Thanks for your help!!!
But...
there is one thing I cannot figure out: why do you encode the width and height with np.log?
np.log is asymmetric. Doesn't that mean that, at the same IoU, anchor boxes that are bigger or smaller than the ground truth box would get a different loss?
The idea is this:
We have four scalar anchor box coordinates, cx, cy, w, h. For each of these four coordinates, the desired prediction for a given ground truth box could be either larger or smaller than the respective anchor box coordinate. For each of these four coordinates, we want the model to predict positive offsets in one direction and negative offsets in the other direction. And we want the predicted offsets to be relative to the respective absolute coordinate values of the anchor box. The chosen formula for the width and height fulfills both of these criteria:
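`ln(g/d) = ln(g) - ln(d)` is `> 0` if `g > d` and `< 0` if `g < d`, and `ln(a*g) - ln(a*d) = ln(g) - ln(d)` for any positive number `a`. Here `g` is the ground truth width or height and `d` is the corresponding anchor box width or height.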
Whether or not this is the best transformation of the target coordinates for the model to learn optimally is a different story, but it is at least one possible transformation that works well.
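The two properties of the log transform can be checked numerically. The following is only a small sketch: `size_offset` is a hypothetical helper name, `g` is taken to be the ground truth size and `d` the anchor box size, and the numbers are made up.

```python
import math


def size_offset(g, d):
    """Log-ratio offset between a ground truth size g and an anchor size d."""
    return math.log(g / d)


larger = size_offset(40.0, 20.0)              # ground truth larger than the anchor
smaller = size_offset(10.0, 20.0)             # ground truth smaller than the anchor
scaled = size_offset(3.0 * 40.0, 3.0 * 20.0)  # same boxes, everything scaled by a = 3

print(larger > 0, smaller < 0)       # the sign follows the direction of the offset
print(math.isclose(scaled, larger))  # the offset depends only on the ratio g/d
print(math.isclose(larger, -smaller))
```

The last line shows that a ground truth twice as large and one half as large as the anchor give offsets of equal magnitude and opposite sign, so the transform treats the two directions symmetrically in terms of ratios.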
Oh oh oh,
I think I have learned a lot.
Thank you very much!!!
Related Issues (20)
- How to set the param seems like intensity_mean, min_scale, aspect_ratios and scales?
- Requirements Versioning not working with python3.8
- InvalidArgumentError when compiling model with ssd_loss HOT 1
- WARNING:tensorflow:Gradients do not exist for variables ['conv4_3/bias:0',...] when minimizing the loss. HOT 1
- "Invalid argument: Index out of range using input dim 0; input has only 0 dims" during ssd300 model training
- load weight
- ValueError: Error when checking input: expected input_3 to have 4 dimensions, but got array with shape
- While training I got training terminate error . Epoch 00001: LearningRateScheduler setting learning rate to 0.001. 1/10 [==>...........................] - ETA: 4:08 - loss: nanBatch 0: Invalid loss, terminating training Epoch 00001: saving model to ssd512_URPC2018_epoch-01.h5 Process finished with exit code 0
- ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
- ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>] HOT 23
- Parameters of the model HOT 1
- Bouding boxes predictions are concentrated in left top corner HOT 1
- Ambiguous dimension while trying to load weights.
- Urgent!! Invalid Loss HOT 4
- What are the requirements to run this code?. HOT 1
- Pascal VOC Training Person Detection
- The device being used is CPU while capturing image from webcam. How do I use my GPU for processing instead?
- Label error during Coco Training HOT 1
- TypeError: Expected any non-tensor type, got a tensor instead.
- Changes make the code work in 2023 HOT 2