Comments (8)
@albertyou2 I've just updated the README and included instructions in this notebook.
Check it out and let me know if this helps. If you have further questions let me know and I'd be more than happy to clarify and improve the documentation.
I'm also interested in training a model on the KITTI datasets myself, so I'll probably look into their data format soon. In case neither of the two parser methods provided by BatchGenerator
is compatible with the KITTI annotations, you could follow the instructions in the README and write an additional parser method that can handle them. I might also do that at some point.
from ssd_keras.
hi @pierluigiferrari
Thank you very much.
I'm sorry for the late reply, I've been very busy these days. I will check it out as soon as I can!
Thank you again.
hi @pierluigiferrari
I have tested your project successfully, both training and testing! Thank you again.
I used your code to detect some logos, and it works fine!
Now I've run into a new problem:
The detection accuracy is very low. The objects can be very small, e.g. 10x10 pixels.
I just want to know whether SSD is able to do this job.
If it is, which parameters should I fine-tune?
Thank you very much
@albertyou2 two things you could try to improve small object detection (this is not meant as an exhaustive list):
- Decrease the scaling factors: You could decrease all of them, not just the smallest one. Calculate what fraction of your input image size the smallest and largest objects in your dataset will be and set the scaling factors accordingly. For example, if your input images are 300x300, your smallest objects are about 10x10 pixels, and your largest objects are about 60x60 pixels, then you could choose your smallest scaling factor to be 0.033 (or a tiny bit larger) and your largest to be 0.2. This might improve the matching. However, this alone might not lead to a huge improvement, so you might also have to try the following in addition:
- You could try using a predictor layer that sits on top of an earlier layer of the network, either by adding an additional predictor layer to the existing ones or by changing the lowest level predictor layer to sit on a lower layer of the network. This would both increase the spatial resolution of that predictor layer, so the overall coverage for small boxes would be better, and it would lead to a lower level of abstraction as input for the predictor layer, which might also be beneficial. Presumably, less abstraction is needed (or even useful) to detect 10x10 objects than is needed to detect 200x200 objects.
These two measures together might help improve the detection performance on small objects, although I can't guarantee it.
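The scaling-factor arithmetic from the first point can be sketched in plain Python. This is a minimal illustration under the assumption of square inputs and linearly spaced scales; the function name and the way ssd_keras actually consumes these values are not taken from the repo:

```python
def scaling_factors(img_size, min_obj_px, max_obj_px, n_predictor_layers):
    """Pick one anchor-box scale per predictor layer by linearly
    interpolating between the smallest and largest relative object size.
    Assumes a square input image and roughly square objects."""
    s_min = min_obj_px / img_size   # e.g. 10 / 300 ~ 0.033
    s_max = max_obj_px / img_size   # e.g. 60 / 300 = 0.2
    step = (s_max - s_min) / (n_predictor_layers - 1)
    return [round(s_min + i * step, 3) for i in range(n_predictor_layers)]

scales = scaling_factors(img_size=300, min_obj_px=10, max_obj_px=60,
                         n_predictor_layers=4)
print(scales)  # smallest ~ 0.033, largest = 0.2
```

With the smallest scale matched to the smallest objects, the lowest-level anchor boxes have a realistic chance of overlapping 10x10 ground-truth boxes well enough to be matched during training.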
Another question: What base network architecture are you using? Are you using the original SSD300? If yes, then I cannot recommend trying to train that from scratch. I'm not sure if anything good could come out of that without pre-trained weights, considering that there is no dropout or batch normalization in the reduced VGG-16 and the overall network is quite deep.
If you are using a shallower network architecture like the SSD7 included in the repo, then the above might work.
Another question would be how many different logos you are trying to detect. If the number of distinct logos is very large, then the capacity of a small network like SSD7 might not be enough and you might need a wider (more filters per conv layer) and/or deeper (more conv layers) network.
hi @pierluigiferrari
Thank you so much!
I will try these suggestions soon!
"Another question: What base network architecture are you using? Are you using the original SSD300?"
Yes, I'm using SSD300 for my job. But I will try SSD512 without a pre-trained model. I think a larger input size will increase the size of the objects to be detected, so the accuracy will be better.
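The reasoning about input size can be made concrete with back-of-the-envelope arithmetic (this helper is purely illustrative, not code from the repo): resizing the same image to a larger network input enlarges every object proportionally.

```python
def resized_object_px(obj_px, orig_input, new_input):
    """Approximate side length of an object after the image is resized
    from orig_input to new_input (square inputs assumed)."""
    return obj_px * new_input / orig_input

# A 10x10-pixel logo in a 300x300 input becomes roughly 17x17 at 512x512,
# giving the lowest predictor layer more pixels to work with.
print(resized_object_px(10, 300, 512))
```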
"Another question would be how many different logos you are trying to detect"
My logo image dataset has 22 classes. I think this is not very large.
Thank you again
@pierluigiferrari
I followed your suggestions and trained again, and the accuracy for small-object detection is better now! It has reached 56%. Thank you!
I'm now wondering whether a smaller network (SSD7) would get better results on this small dataset.
@albertyou2 that also depends on how much data you have and how heavily you use data augmentation. If you only have a couple hundred or a few thousand images, a deep and wide model like SSD300 will be overkill. If you have tens of thousands or hundreds of thousands of images, then SSD300 or SSD512 will be suitable models. And of course, more data augmentation is always better, as long as the generated data is representative of the original data.
Now, when it comes to training SSD300 or SSD512 from scratch, consider the following important points:
When Wei Liu et al. turned the original VGG-16 into their reduced atrous version, they removed the dropout layers and loaded weights that were pre-trained on the large ImageNet localization dataset. They didn't need the regularization because they initialized the base network with pre-trained weights anyway. If, however, you're trying to train the entire SSD300 completely from scratch, then that might be a problem. There are no dropout layers, no batch normalization, and no other techniques in the SSD300 that would improve learning for such a deep and wide network.
If you have enough data, a smaller network like SSD7 will not yield better results, but at the same time training the original SSD300 from scratch (i.e. without loading pre-trained weights for the VGG-16 base network) is not optimal either.
But there is not really a need to stick to the original SSD300/512 architecture with the reduced atrous VGG-16 base network if you want to train from scratch. You could modify the base network or even build something completely different.
For example, I would definitely include a batch normalization layer after every convolution layer, as SSD7 does. That alone might help quite a bit. I would also use ELUs instead of ReLUs, since ReLUs can die. Or, to take it a step further, you could use a ResNet architecture. It wouldn't have to be a super-deep ResNet, but the general design is far superior to the more primitive VGG design.
As always, these suggestions aren't guaranteed to get better results, but I believe they are worth a shot.
And another thing: Since adding a lower level detection layer worked, you could try taking this experiment further in the same direction. You could add another, even lower level detection layer to test whether or not that will yield further improvements.
@pierluigiferrari
Thank you so much!
I will try your suggestions!