Comments (10)

LI945 commented on July 28, 2024

When I trained with the L1 loss, it was also big.
How large is your L1 loss?

LI945 commented on July 28, 2024

All three losses are big. What is the problem?

yinnhao commented on July 28, 2024

> All three losses are big. What is the problem?

Same for me. My loss is about 1e+4. How about yours?

LI945 commented on July 28, 2024

> All three losses are big. What is the problem?
>
> Same for me. My loss is about 1e+4. How about yours?

My loss is about 1e+4 too.

zzzzwj commented on July 28, 2024

> All three losses are big. What is the problem?
>
> Same for me. My loss is about 1e+4. How about yours?
>
> My loss is about 1e+4 too.

That's because the loss is reduced by sum; if you divide it by (GT_size * GT_size * batch_size), you'll find the loss is actually a common value like 1e-2. You can replace the reduction function with 'mean'.
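
For reference, here is a minimal sketch of a Charbonnier loss with a selectable reduction, in plain PyTorch. The class name, eps value, and tensor shapes are illustrative assumptions, not EDVR's exact code.

```python
import torch
import torch.nn as nn


class CharbonnierLoss(nn.Module):
    """Charbonnier penalty: a smooth, differentiable variant of L1."""

    def __init__(self, eps=1e-6, reduction='mean'):
        super().__init__()
        self.eps = eps
        self.reduction = reduction

    def forward(self, pred, target):
        diff = pred - target
        loss = torch.sqrt(diff * diff + self.eps)
        if self.reduction == 'sum':
            # Summing over batch_size * channels * GT_size * GT_size elements
            # is what makes the reported value look like ~1e+4.
            return loss.sum()
        return loss.mean()


# Dividing the sum-reduced value by the element count recovers the mean.
pred, target = torch.rand(4, 3, 256, 256), torch.rand(4, 3, 256, 256)
loss_sum = CharbonnierLoss(reduction='sum')(pred, target)
loss_mean = CharbonnierLoss(reduction='mean')(pred, target)
print(torch.allclose(loss_sum / pred.numel(), loss_mean))  # True
```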

LI945 commented on July 28, 2024

I have another problem: the loss doesn't go down. Does anybody else have this problem?

xinntao commented on July 28, 2024

@zzzzwj has pointed it out. CharbonnierLoss is in the sum mode. The L1/L2 losses also have several modes, such as mean and sum (you can see them in the PyTorch docs).
What matters during training is the gradient rather than the loss value, so even with larger losses, training is fine as long as the gradients are proper (see the sketch below).
When switching between the mean and sum modes you may need to adjust the learning rate, but the Adam optimizer can compensate to some extent.
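
A quick, self-contained check of that point (illustrative code, not from the repository): the two reductions differ, in both the loss and the gradient, only by the constant factor N, the number of elements, which is exactly why the learning rate may need rescaling when you switch modes.

```python
import torch

# With 'sum' reduction, each element's gradient is N times larger than
# with 'mean' (N = number of elements), so the effective step size changes.
x = torch.randn(4, 3, 64, 64, requires_grad=True)
y = torch.randn(4, 3, 64, 64)

(x - y).abs().sum().backward()
grad_sum = x.grad.clone()

x.grad = None
(x - y).abs().mean().backward()
grad_mean = x.grad.clone()

print(torch.allclose(grad_sum, grad_mean * x.numel()))  # True
```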

@LI945 During training you may observe that the loss decreases very slowly. But if you evaluate the checkpoints, the performance (PSNR) actually increases as training goes on (a minimal PSNR helper is sketched below).
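
A minimal PSNR helper one might use to evaluate checkpoints, assuming image tensors with values in [0, 1]; the function name is an illustrative choice, not EDVR's evaluation code:

```python
import torch


def psnr(pred, target, max_val=1.0):
    """PSNR in dB for image tensors with values in [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```

Because PSNR is on a log scale, steady PSNR gains can correspond to very small absolute decreases in the loss.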

zzzzwj commented on July 28, 2024

I met the same problem @LI945 mentioned. When I train on my own datasets, the loss decreases very slowly. When I train an SISR model (for example, EDSR), the PSNR increases very fast and reaches almost its best value, around 37.0 dB, within 20~30 epochs. However, when I train EDVR with the raw training code, the PSNR increases quickly in the first 10 epochs, reaching ~33.0 dB, and then seems to plateau: over the next 20 epochs it increases by less than 1.0 dB. Have you met the same problem when training on the REDS or Vimeo90K datasets? And could I have your training log? Hoping for your reply, @xinntao.

xinntao commented on July 28, 2024

@zzzzwj I will upload a training log example tomorrow. Actually, 1) we use a different training scheme with restarts, which improves the performance; 2) we usually measure in iterations rather than epochs.
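
For readers unfamiliar with restart schemes, below is a rough sketch using PyTorch's built-in CosineAnnealingWarmRestarts scheduler; the actual scheduler, restart period, and learning rate EDVR uses may differ, so treat the numbers as placeholders.

```python
import torch

model = torch.nn.Conv2d(3, 3, 3, padding=1)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)

# Cosine annealing with periodic warm restarts: the learning rate decays,
# then is reset at each restart boundary, which can help escape plateaus.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=150_000, eta_min=1e-7)

for it in range(600_000):  # progress measured in iterations, not epochs
    # ... forward pass, loss computation, loss.backward() ...
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```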

zzzzwj commented on July 28, 2024

> @zzzzwj I will upload a training log example tomorrow. Actually, 1) we use a different training scheme with restarts, which improves the performance; 2) we usually measure in iterations rather than epochs.

Well, thanks a lot.
