
pierluigiferrari commented on May 24, 2024

I have very limited information about your dataset, how much you trained the model, or even which model you're trying to train, so it could be many things. One possible reason it's not making many confident predictions is that it simply hasn't been trained enough. I see that all the time when I train a model from scratch: after the first couple of hundred or thousand training steps, the model predicts almost nothing with high confidence (except background), so after confidence thresholding you're left with no predictions at all. Then it starts getting better and better, first occasionally making a correct detection here and there on easy objects, then slowly detecting harder objects.
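
For illustration, here is a minimal NumPy sketch of the confidence thresholding step described above, assuming decoded predictions arrive as one row per box in the format [class_id, confidence, xmin, ymin, xmax, ymax] (the values below are made up):

import numpy as np

# Made-up decoded predictions for one image, one row per box:
# [class_id, confidence, xmin, ymin, xmax, ymax]
decoded = np.array([[1, 0.12,  10,  20,  80,  90],
                    [3, 0.67,  40,  35, 120, 140],
                    [7, 0.05, 200, 180, 260, 250]])

confidence_thresh = 0.5

# Keep only boxes whose confidence exceeds the threshold. Early in
# training this filter can discard every box, which looks like the
# model predicting nothing at all.
kept = decoded[decoded[:, 1] > confidence_thresh]
print(kept)  # only the 0.67 detection survives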

As for your own conjectures:

  • I don't know how much data you have or how many ground truth boxes are in an average image, but this is likely not the reason. If you have only a little data, it should, if anything, overfit and predict those logos with high confidence.
  • I'm not sure I understand what you mean by "Varying image sizes of the wild". Are we talking about varying image sizes or varying object sizes? The images all have the same size after they come out of the generator anyway, and the varying object sizes that partially result from resizing the images shouldn't be a problem for the model as long as they stay within the range of object sizes that the configuration (scaling factors etc.) was designed for. Of course, SSD is known to generally have trouble with very small objects.
  • I don't know which model you're using, but for a custom dataset it's always worth tuning parameters like the scaling factors and aspect ratios, or even the network architecture if necessary, to the reality of your dataset (see the sketch after this list). If your objects have sizes and shapes similar to the objects in one of the pre-trained models, then that model's configuration should of course work fine as is.
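
To make those tuning knobs concrete, here is an illustrative anchor-box configuration in the style of the SSD300 setup in ssd_keras. The variable names are meant to mirror that repository's conventions, and the values are the common Pascal VOC defaults; treat them as placeholders to adapt to your own objects' sizes and shapes:

# Anchor box scales as fractions of the input image size: one value
# per predictor layer (smallest first), plus a final extra entry used
# to size the larger square box of the last predictor layer. Shrink
# these if your logos tend to be small relative to the image.
scales = [0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05]

# Aspect ratios per predictor layer. If your logos are mostly square
# or only mildly elongated, the 3.0 and 1/3 ratios may be unnecessary.
aspect_ratios_per_layer = [[1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                           [1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5]]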


adamuas commented on May 24, 2024

Sorry for the late response.

I am training the SSD300 model with 13 classes, with roughly at least 150 images per class for training (the rest is my testing set, i.e. at least 50 images per class). I set up early stopping with a patience of 100 and a min_delta of 0.001 to avoid it stopping too early. Because I had limited training data, I used the training data plus noise as my validation data (the noise was introduced by the image augmentations).

# Names of the VGG16 base-network layers to freeze during training:
VGG16BASE_FREEZE = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
                    'conv2_1', 'conv2_2', 'pool2',
                    'conv3_1', 'conv3_2', 'conv3_3', 'pool3',
                    'conv4_1', 'conv4_2', 'conv4_3', 'pool4',
                    'conv5_1', 'conv5_2', 'conv5_3', 'pool5']
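
For reference, a minimal sketch of how that freeze list and the early stopping described above might be wired up, assuming `model` is an already compiled SSD300 Keras model whose layer names match the entries in VGG16BASE_FREEZE:

from keras.callbacks import EarlyStopping

# Freeze the listed base layers by name so only the SSD-specific
# layers are updated. Note that in Keras, changes to `trainable`
# only take effect after the model is (re)compiled.
for layer in model.layers:
    if layer.name in VGG16BASE_FREEZE:
        layer.trainable = False

# Early stopping as described: wait up to 100 epochs without at
# least a 0.001 improvement in the monitored loss before stopping.
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0.001,
                               patience=100)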

How many epochs do you reckon I should train for, in your experience?


pierluigiferrari commented on May 24, 2024

One suggestion would be to load the weights of one of the fully trained SSD300 models rather than starting out with only the trained VGG16 weights. See my first reply to #50 for how to circumvent the problem that the number of classes in your dataset (13) differs from the number of classes of the trained models (20 for Pascal VOC, 80 for MS COCO, or 200 for ImageNet).

I don't know what your logo images look like, but I assume they are very different from any of the object categories in Pascal VOC, MS COCO, or ImageNet. Nonetheless, it's probably fair to assume that trained weights are always a better starting point for fine-tuning on your dataset than randomly initialized weights, even if your objects of interest are very different from the objects the models were trained on. Loading trained model weights would likely improve your results tremendously and save you a lot of training time.

It's hard to say how many training steps (let's use training steps as the metric rather than epochs) you would have to train before getting half-decent results if you start out with only the VGG16 weights, but my best guess would be in the ballpark of a few tens of thousands.

But once again, I would recommend starting out by fine-tuning one of the fully trained models. Sub-sampling the weight tensors of the classification predictor layers sounds more tedious than it is; at the end of the day it's just a bit of NumPy slicing. Alternatively, just changing the names of the classification predictor layers would be the really easy (and slightly worse) way.
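
To show how little NumPy is actually involved, here is a rough sketch of the sub-sampling for a single classification predictor layer, with made-up shapes, assuming the usual SSD channel layout in which the last kernel axis consists of n_boxes consecutive blocks of n_classes channels each:

import numpy as np

# Made-up shapes for one classification predictor layer of an SSD300
# trained on Pascal VOC: 4 boxes per cell and 21 classes
# (20 object classes + 1 background).
n_boxes, n_classes_src, n_classes_dst = 4, 21, 14  # 14 = 13 + background

kernel = np.random.randn(3, 3, 512, n_boxes * n_classes_src)
bias   = np.random.randn(n_boxes * n_classes_src)

# Keep background (index 0) plus the first 13 object classes within
# every per-box block; in practice you'd pick whichever 13 source
# classes are closest to your own.
keep = np.arange(n_classes_dst)
idx  = np.concatenate([keep + b * n_classes_src for b in range(n_boxes)])

kernel_sub = kernel[..., idx]  # shape: (3, 3, 512, 4 * 14)
bias_sub   = bias[idx]         # shape: (4 * 14,)

The weight sub-sampling notebook linked further down in this thread does this properly for every predictor layer.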


adamuas commented on May 24, 2024

Thanks, appreciate this!

I will give it a try with the ImageNet weights as a starting point.


pierluigiferrari commented on May 24, 2024

Yeah, the ImageNet weights will probably be a good starting point. I've created a notebook that does the weight sub-sampling for you:

https://github.com/pierluigiferrari/ssd_keras/blob/master/weight_sampling_tutorial.ipynb


adamuas commented on May 24, 2024

Thanks a lot @pierluigiferrari, appreciate this 👍 👍
