Comments (6)
As stated in the original paper:
Each bounding box consists of 5 predictions: x, y, w, h, and confidence. The (x, y) coordinates represent the center of the box relative to the bounds of the grid cell. The width and height are predicted relative to the whole image.
Thanks for your distinctive idea, but I can't agree with you. As we know, the `round` operation combines both the `ceil` and `floor` operations (it floors fractions below 0.5 and ceils those at or above 0.5). If you feed the neural network an ambiguous operation, how can you expect the network to learn more easily? In that case, I think it will be rather more difficult to make the network converge.
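The ambiguity referred to here can be seen in a couple of lines of plain Python (note that Python's built-in `round` uses banker's rounding at exact .5 ties, so the example avoids them):

```python
import math

# round behaves like floor for fractions below .5 ...
assert round(3.4) == math.floor(3.4) == 3
# ... and like ceil for fractions above .5
assert round(3.6) == math.ceil(3.6) == 4
```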
from tensorflow-yolov3.
Oh, I see. But I mean the consistency is about which grid cell we assign the object label to. A good rule of thumb is: (i) assign it to the grid cell that has the biggest IoU with the ground-truth box, right? See another illustration here: if we use `floor`, both case (1) and case (2) will be assigned to grid (0,0) for the object area, even though the ground-truth boxes in cases (1) and (2) are almost one full grid cell apart. If we use `round`, cases (1) and (2) will be assigned to grids (1,1) and (0,0), respectively, just like the rule of thumb I mentioned in (i). So we can think of consistency as assigning the label to the cell that minimizes the distance between `box_center` and `grid_center`; in this case, `round` gives a smaller distance than `floor`. We can also think of it this way: by performing a `round` or `floor` operation, we lose some information from the fractional part we discard. Using `round`, the maximum amount we lose is `0.5`, while using `floor` it can be up to `0.99...`.

Anyway, have you ever run this code to train on the MS COCO dataset from scratch (not using a pre-trained network)? I wonder how many days are needed. I'm currently still running this code to train MS COCO from scratch. It is now at epoch 4 (3 days); it looks like it is converging, but it seems to need a lot of epochs, i.e., a lot of days (the original paper states 160 epochs).
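The floor-vs-round assignment difference described above can be sketched in pure Python (a minimal sketch; box centers are assumed to be expressed in grid units, and the two cases use the hypothetical centers (0.9, 0.9) and (0.1, 0.1)):

```python
import math

def assign_grid(cx, cy, op):
    """Map a box center (in grid units) to a grid cell index."""
    return (op(cx), op(cy))

case1 = (0.9, 0.9)  # case (1): center near the far corner of cell (0,0)
case2 = (0.1, 0.1)  # case (2): center near the origin of cell (0,0)

# floor puts both cases in (0,0), even though the centers are ~one cell apart
print(assign_grid(*case1, math.floor))  # (0, 0)
print(assign_grid(*case2, math.floor))  # (0, 0)

# round assigns each center to its nearest grid intersection
print(assign_grid(*case1, round))       # (1, 1)
print(assign_grid(*case2, round))       # (0, 0)
```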
from tensorflow-yolov3.
Oh, I got it. Your idea is very impressive! But since YOLO is a regression problem, the regression target must be deterministic, so we need a particular grid cell location to serve as the regressor.
As for training on the MS COCO dataset from scratch, I have not done that yet. I would appreciate it very much if you shared your results with us.
from tensorflow-yolov3.
Yes, sure, I can share the results later. It is now at epoch 6; the recall is still around `0.0x`, but the precision is already high, almost one.
Btw, do you know why running this training code constantly increases the RAM usage? If it keeps running, it eventually runs out of memory and the process is killed.
from tensorflow-yolov3.
Good news!
I want to share the cause of the memory increase in the `train.py` code (which you have already deleted from this repository). The root cause is this part:

```python
_, _, _, summary = sess.run([tf.assign(rec_tensor, rec),
                             tf.assign(prec_tensor, prec),
                             tf.assign(mAP_tensor, mAP),
                             write_op],
                            feed_dict={is_training: True})
```

Putting `tf.assign` operations inside the training loop creates new graph nodes on every iteration. So I replaced those three `tf.assign` calls with placeholders and fed `rec`, `prec`, and `mAP` to them via `feed_dict`. Training also became faster afterward.
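The fix described above can be sketched roughly as follows (a minimal sketch assuming the TF 1.x graph-mode API, shown for a single metric; `rec_tensor`, `rec_ph`, and the loop body are simplified stand-ins for the variables in the original `train.py`):

```python
import tensorflow.compat.v1 as tf  # TF 1.x-style graph execution

tf.disable_eager_execution()

# Build the assign op ONCE, outside the training loop,
# fed through a placeholder instead of a Python float.
rec_tensor = tf.Variable(0.0, trainable=False, name="recall")
rec_ph = tf.placeholder(tf.float32, shape=[], name="recall_ph")
rec_assign = tf.assign(rec_tensor, rec_ph)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    n_ops_before = len(tf.get_default_graph().get_operations())
    for step in range(100):      # stand-in for the training loop
        rec = 0.01 * step        # stand-in for the computed recall
        sess.run(rec_assign, feed_dict={rec_ph: rec})
    n_ops_after = len(tf.get_default_graph().get_operations())

# calling tf.assign(...) inside the loop would have grown the graph each step;
# with the prebuilt op, the op count stays constant
assert n_ops_after == n_ops_before
```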
from tensorflow-yolov3.
Can you share the code? Thanks a lot!
from tensorflow-yolov3.
Related Issues (20)
- Cannot train each branch separately (large-object or medium-object branch)
- Can this be implemented on TensorFlow 2.4?
- Can the network be trained directly from a ckpt file?
- Question about the restriction on xind, yind
- YOLOv3 implementation in TensorFlow 2
- Why is there only one input_size at test time? What if the image is not square?
- Using MSE loss for class prediction
- Bug report: using one fixed set of anchors for multi-scale training is not rigorous
- Detected boxes at test time are much larger than the actual objects
- Do you have a tutorial on deploying YOLOv3 to a dev board or a server?
- 3D detection
- Stuck
- Converting .pb to .tflite?
- How do you save the trained model to HDF5 (h5) format?
- Can I train on multi-GPU?
- Found a small test issue in image_demo/video_demo
- Need explanation of the LR scheduler
- WHAT IS PROBLEM HERE
- Key conv52/batch_normalization/beta/ExponentialMovingAverage not found in checkpoint
- Training works on CPU, but loss=nan on GPU