Comments (13)

krasserm commented on June 9, 2024

I fixed this issue by resizing the upscaled image to be the same shape as the HR image.

A proper solution would be to center-crop the HR image, using the SR image size as the crop window. Resizing recomputes all pixel values via interpolation (bilinear, bicubic, ...), which may explain the low PSNR values you see during validation.
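
For illustration, a minimal sketch of such a center-crop, assuming HWC image tensors (the helper name center_crop_hr is made up here):

import tensorflow as tf

def center_crop_hr(hr, sr):
    # crop the HR image around its center to the SR image size,
    # so pixel values are left untouched instead of being re-interpolated
    sr_h, sr_w = tf.shape(sr)[0], tf.shape(sr)[1]
    hr_h, hr_w = tf.shape(hr)[0], tf.shape(hr)[1]
    off_h = (hr_h - sr_h) // 2
    off_w = (hr_w - sr_w) // 2
    return tf.image.crop_to_bounding_box(hr, off_h, off_w, sr_h, sr_w)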

About reusing weights of lower factors, layers have to be named consistently so that weights from lower resolution models can be loaded

Exactly, this must be done for all model implementations. What you proposed for up-sampling looks good to me (didn't see the "wrong" names in your first proposal).

krasserm commented on June 9, 2024

I'd expect it is able to learn that to some extent.

krasserm commented on June 9, 2024

The code changes you've made look good. I just tried training for only 8000 steps with factor x8 and the results look reasonable (running example-edsr.ipynb). Here's an example:

[example image: x8 SR output]

During these 8000 steps I'm observing a loss decrease from 14.6 to 10.4 and a PSNR increase from 23.5 to 24.3 (I didn't see PSNR values around 15 though). To get this working, I had to add a filter to the validation data source:

def experimental_filter(lr, hr):
    # keep only validation pairs whose HR dimensions are exactly 8x the LR dimensions
    # (tf.logical_and instead of Python's `and`, which cannot be used on tensors
    # inside the traced filter predicate)
    return tf.logical_and(tf.shape(lr)[1] * 8 == tf.shape(hr)[1],
                          tf.shape(lr)[2] * 8 == tf.shape(hr)[2])

valid_ds = div2k_valid.dataset(batch_size=1, random_transform=False, repeat_count=1).filter(experimental_filter)

This is a temporary hack to prevent an error during processing of validation SR images whose dimensions do not exactly match the HR dimensions (for example, some LR images have height 169 so that the SR images have height 169 * 8 = 1352 whereas the corresponding HR image has height 1356). This has to be fixed in the data source implementation itself ...

Also, it is recommended to initialize the x8 model with weights from training with lower factors (i.e. x4) to have a better starting point. To make this work smoothly, some changes are needed to the model definitions. In particular, layers have to be named consistently so that weights from lower resolution models can be loaded with model.load_weights(..., by_name=True). Nevertheless, for a shallow model with only 16 residual blocks, starting from a randomly initialized model should work as well.
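
Once the layer names match, loading could look roughly like this (a sketch only; the weights path is a placeholder and edsr() refers to this repo's model builder with its num_res_blocks argument):

from model.edsr import edsr

# build the x8 model, then initialize all layers whose names match the saved
# x4 model (residual blocks and the shared x2 upsample layers); layers that
# exist only in the x8 model keep their random initialization
model_x8 = edsr(scale=8, num_res_blocks=16)
model_x8.load_weights('weights-edsr-16-x4.h5', by_name=True)  # placeholder path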

I'll work on these things later this week. Alternatively, feel free to create a pull request should you want to do these changes (or a subset) yourself.

Do you mind sharing the training code you've used? I'm curious how you came to PSNR values around 15.

canhnht commented on June 9, 2024

Thanks a lot for the fast response.

I also ran into the issue of SR image dimensions not matching the HR dimensions. It happened when calculating the PSNR value during validation after each step. I fixed it by resizing the upscaled image to be the same shape as the HR image.

def psnr(x1, x2):
    # hack: if the SR and HR shapes differ, resize/pad the SR image to the HR shape
    # before computing PSNR
    if x1.shape.as_list() != x2.shape.as_list():
        print('resize', x1.shape, x2.shape)
        x1 = tf.image.resize_with_pad(x1, x2.shape[1], x2.shape[2])
    return tf.image.psnr(x1, x2, max_val=255)

Maybe this hack caused the loss and PSNR value to change so little during training. (Sorry, I forgot to mention it in the first post.)

My training script is basically the same as the one described in the repo here: https://github.com/krasserm/super-resolution#edsr-1. I tried with a batch_size of 128, depths of 8, 16, and 32, and 100000 steps.

Thanks a lot for showing the hack about filtering the dataset. I didn't know about that.
I will try this hack and train x8 with more epochs.
Then I will update the result and my complete training code soon.

canhnht commented on June 9, 2024

About reusing weights of lower factors: "layers have to be named consistently so that weights from lower resolution models can be loaded" means that the names of the upsample layers should be the same in the x4 and x8 models. I'm thinking of changing the upsample() method to look like this.

def upsample(x, scale, num_filters):
    def upsample_1(x, factor, **kwargs):
        # sub-pixel convolution: Conv2D followed by pixel shuffle
        x = Conv2D(num_filters * (factor ** 2), 3, padding='same', **kwargs)(x)
        return Lambda(pixel_shuffle(scale=factor))(x)

    if scale == 2:
        x = upsample_1(x, 2, name='conv2d_1_scale_2')
    elif scale == 3:
        x = upsample_1(x, 3, name='conv2d_1_scale_3')
    elif scale == 4:
        x = upsample_1(x, 2, name='conv2d_1_scale_2')
        x = upsample_1(x, 2, name='conv2d_2_scale_2')
    elif scale == 8:
        # x8 = three x2 steps; the first two reuse the x4 layer names
        x = upsample_1(x, 2, name='conv2d_1_scale_2')
        x = upsample_1(x, 2, name='conv2d_2_scale_2')
        x = upsample_1(x, 2, name='conv2d_3_scale_2')

    return x

The upsample layers of x8 are named the same as the upsample layers of x4, specifically conv2d_1_scale_2 and conv2d_2_scale_2. Is there any additional change I should make to reuse the weights of the x4 model?

Thanks in advance.

canhnht commented on June 9, 2024

Great! I will try with this naming and see if the result of x8 training is improved.
Have you tried training the EDSR model with other downgrade methods, e.g. unknown, mild, difficult?
I tried training x4 with the difficult dataset of DIV2K. After 150000 steps, the loss decreases from 22 to 18 and the PSNR is always around 20. I'm wondering why the PSNR is not increasing.
The difficult dataset also includes noise. Should I add some layers to the EDSR model to clean up noise and artefacts in images?

krasserm commented on June 9, 2024

I didn't try EDSR training with anything other than bicubic downgrade. For difficult, I'd first investigate whether a shift operation was also applied to the images (i.e. images moved left/right or up/down), which would then need to be corrected.

canhnht commented on June 9, 2024

If we train with images that have some noise and some artifacts like serration, can the EDSR model clean up this noise and these artifacts during inference?

canhnht commented on June 9, 2024

Maybe fine-tuning it with SRGAN will help to produce cleaner images.
About inference on large images, I'm thinking of splitting the large image into chunks, running inference on these chunks, and then concatenating the chunks into the result.
Do you think this is feasible with the EDSR model?

krasserm commented on June 9, 2024

What problem do you see with inference of large images?

canhnht commented on June 9, 2024

Currently, I'm using a machine with 16GB RAM. The maximum input size for x4 upscaling is about 500px; if the input image is larger than that, it throws a memory allocation error. The situation is similar when I use a GPU with about 11GB VRAM.
I'm thinking about running inference on large images with limited memory. With limited memory, a chunking technique could be a good fix for inference on larger image sizes, like 1000px, 2000px or more.

krasserm commented on June 9, 2024

Understood, memory problems. Composing SR chunks could work quite well, I think. I'm not sure about artifacts at image/chunk boundaries, but doing this for a few images should give you a good impression of how it works.
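
A rough sketch of such chunked inference, assuming a trained Keras model that maps an LR tile to an SR tile (the function name, tile size and overlap below are illustrative; overlapping the tiles and keeping only the central part of each SR tile is one way to limit boundary artifacts):

import numpy as np

def sr_in_chunks(model, lr, scale=4, chunk=240, overlap=8):
    # split the LR image into overlapping tiles, super-resolve each tile,
    # and paste only the central part of each SR tile into the result so
    # that the overlapping borders (and their artifacts) are discarded
    h, w, c = lr.shape
    sr = np.zeros((h * scale, w * scale, c), dtype=np.float32)
    step = chunk - 2 * overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            y0, x0 = max(y - overlap, 0), max(x - overlap, 0)
            y1, x1 = min(y + step + overlap, h), min(x + step + overlap, w)
            sr_tile = model.predict(lr[np.newaxis, y0:y1, x0:x1, :])[0]
            cy1, cx1 = min(y + step, h), min(x + step, w)
            sr[y * scale:cy1 * scale, x * scale:cx1 * scale] = \
                sr_tile[(y - y0) * scale:(cy1 - y0) * scale,
                        (x - x0) * scale:(cx1 - x0) * scale]
    return sr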

krasserm commented on June 9, 2024

@canhnht

Great! I will try with this naming ...

Are you planning to make a contribution regarding this? Otherwise, I'd close this ticket and open a more specific one.
