I am training a digit-recognition model on children's handwritten digit images (noticeably messier than the MNIST images).
After 20-30 epochs of training the loss drops to zero, and the finished model collapses to predicting the digit zero for any input. What could be the reason?
My training config:
- learning rate: 0.01
- epochs: 1
- batch size: 32
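For context, here is a minimal, framework-agnostic sketch of the loop structure I am using with those hyperparameters. The model here is just a toy softmax-regression stand-in on random data so the snippet runs; my real model, data loading, and framework are omitted:

```python
# Sketch of my training loop; the toy data and linear "model" below are
# placeholders, not my actual network.
import numpy as np

rng = np.random.default_rng(0)
learning_rate, epochs, batch_size = 0.01, 1, 32

# Toy stand-ins: 10-class softmax regression on random 28x28 "images".
x = rng.normal(size=(320, 784)).astype(np.float32)
y = rng.integers(0, 10, size=320)
w = np.zeros((784, 10), dtype=np.float32)

for epoch in range(epochs):
    for i in range(0, len(x), batch_size):
        xb, yb = x[i:i + batch_size], y[i:i + batch_size]
        logits = xb @ w
        # Softmax with the usual max-subtraction for numerical stability.
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(yb)), yb] -= 1           # dL/dlogits for cross-entropy
        w -= learning_rate * xb.T @ p / len(yb)  # plain SGD step
```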
I might have some bad inputs as well (relatively few), since I am training the model on about 300,000 (3 lakh) digit images.
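Since the model always outputs zero, one thing I can check is whether my labels are heavily skewed toward the "0" class. A quick count over the label list (the `labels` list below is a placeholder for my real ~300,000 labels):

```python
# Check the class distribution of the training labels; a dataset dominated
# by zeros would explain a model that always predicts zero.
from collections import Counter

labels = [0, 0, 0, 1, 2, 0, 3, 0]  # placeholder for my real label list
counts = Counter(labels)
total = sum(counts.values())
for digit in range(10):
    n = counts.get(digit, 0)
    print(f"digit {digit}: {n} ({n / total:.1%})")
```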
I have some questions:
1. How do I decide the number of epochs, the batch size, and the learning rate?
2. How can I build training diagnostics into the same model/pipeline?
3. What is your suggestion on transfer learning?