ultimate-sr's Introduction

One-to-Many Approach for Improving Perceptual Super-Resolution

Official implementation of Compatible Training Objective for Improving Perceptual Super-Resolution in TensorFlow 2.0+.

This repository contains the implementation and training code for the methods proposed in the paper Compatible Training Objective for Improving Perceptual Super-Resolution (link).

Diagram of our method

The methods presented in our paper were implemented with the ESRGAN network from ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks by Xintao Wang et al. In our work we propose the following:

  • We feed weighted random noise to the generator so that it can generate diverse outputs.
  • We propose a weaker content loss that is compatible with the multiple outputs of the generator, and does not contradict the adversarial loss.
  • We improve the SR quality by filtering blurry regions in the training data using Laplacian activation.
  • We additionally provide the LR image to the discriminator as a reference image to give better gradient feedback to the generator.
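
The blur filtering in the third item can be illustrated with a minimal sketch. It assumes that "Laplacian activation" means the mean absolute response of a 4-neighbour Laplacian kernel over a grayscale patch, and that patches scoring below a threshold are discarded. The function names, the kernel, and the threshold value are illustrative assumptions and are not taken from the repository.

```python
def laplacian_activation(patch):
    """Mean absolute 4-neighbour Laplacian response over the patch interior.

    `patch` is a 2D list of grayscale intensities (rows of equal length).
    """
    height, width = len(patch), len(patch[0])
    if height < 3 or width < 3:
        raise ValueError("patch must be at least 3x3")
    total = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = (patch[y - 1][x] + patch[y + 1][x]
                        + patch[y][x - 1] + patch[y][x + 1]
                        - 4 * patch[y][x])
            total += abs(response)
    return total / ((height - 2) * (width - 2))


def keep_sharp_patches(patches, threshold=10.0):
    """Keep patches whose activation reaches the threshold (hypothetical value)."""
    return [p for p in patches if laplacian_activation(p) >= threshold]


# A flat patch has no edges; a checkerboard has strong edges everywhere.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
kept = keep_sharp_patches([flat, checker])
print(len(kept))
```

A flat patch scores zero and is dropped, while a textured patch is kept. A real training pipeline would apply the same idea to image tensors rather than nested lists.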

Training and Testing

Requirements

To install requirements:

pip install -r requirements.txt

Configuration File

You can modify the configurations of our models for training and testing in ./configs/*.yaml. A configuration file looks like the example below.

# general setting
batch_size: 16
input_size: 32
gt_size: 128
ch_size: 3
scale: 4
log_dir: '/content/drive/MyDrive/ESRGAN'
pretrain_dir: '/content/drive/MyDrive/ESRGAN-MSE-500K-DIV2K'  # directory to load from at initial training
cycle_mse: True
# generator setting
network_G:
    name: 'RRDB'
    nf: 64
    nb: 23
    apply_noise: True
# discriminator setting
network_D:
    nf: 64

# dataset setting
train_dataset:
    path: '/content/drive/MyDrive/data/div2k_hr/DIV2K_train_HR'
    num_samples: 32208
    using_bin: True
    using_flip: True
    using_rot: True
    detect_blur: True
    buffer_size: 1024           # max size of buffer
    patch_per_image: 128        # number of patches to extract from each image
test_dataset:
    set5: './test data/Set5'
    set14: './test data/Set14'

# training setting
niter: 400000

lr_G: !!float 1e-4
lr_D: !!float 1e-4
lr_steps: [50000, 100000, 200000, 300000]
lr_rate: 0.5

adam_beta1_G: 0.9
adam_beta2_G: 0.99
adam_beta1_D: 0.9
adam_beta2_D: 0.99

w_pixel: !!float 1e-2
pixel_criterion: l1

w_feature: 1.0
feature_criterion: l1

w_gan: !!float 5e-3
gan_type: ragan  # gan | ragan
refgan: True       # provide reference image

save_steps: 5000

# logging settings
logging:
    psnr: True
    lpips: True
    ssim: True
    plot_samples: True

cycle_mse: use cycle-consistent content loss

network_G/apply_noise: provide random noise to the generator network

train_dataset/detect_blur: filter blurry images in the training dataset

refgan: provide reference image to the discriminator network
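
As an illustration of the lr_steps and lr_rate settings, the sketch below assumes they describe a multi-step decay: each time training reaches a boundary in lr_steps, the learning rate is multiplied by lr_rate. This reading, the function name, and the choice to decay exactly at the boundary step are assumptions; the training scripts may implement the schedule differently.

```python
def learning_rate_at(step, base_lr=1e-4,
                     lr_steps=(50000, 100000, 200000, 300000), lr_rate=0.5):
    """Learning rate after one decay for every boundary already reached."""
    decays = sum(1 for boundary in lr_steps if step >= boundary)
    return base_lr * (lr_rate ** decays)


# Print the rate at a few points of the 400000-iteration schedule.
for step in (0, 50000, 150000, 399999):
    print(step, learning_rate_at(step))
```

Under this assumption the rate is halved four times over the run, ending at one sixteenth of lr_G (or lr_D).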

Explanation of config files:

  • esrgan.yaml: baseline ESRGAN (configuration(c))
  • esrrefgan.yaml: +refgan (configuration(d))
  • use_noise.yaml: +use noise (configuration(e))
  • cyclegan.yaml: +cycle loss (configuration(f))
  • cyclegan_only.yaml: -perceptual loss (configuration(g))

Training

The training process is divided into two stages: pretraining the model with a pixel-wise loss, and then training the pretrained PSNR model with the ESRGAN loss.
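
To make the second stage more concrete, the sketch below assumes that the generator objective is a weighted sum of a pixel term, a feature (perceptual) term, and an adversarial term, as in the original ESRGAN, using the w_pixel, w_feature, and w_gan weights from the example configuration. The function names and the input values are placeholders, and the way cycle_mse changes the content term is not modelled here.

```python
def l1_loss(prediction, target):
    """Mean absolute error between two equally sized flat sequences."""
    if len(prediction) != len(target):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - t) for p, t in zip(prediction, target)) / len(prediction)


def generator_loss(pixel_loss, feature_loss, gan_loss,
                   w_pixel=1e-2, w_feature=1.0, w_gan=5e-3):
    """Weighted sum of the three terms (weights from the example config)."""
    return w_pixel * pixel_loss + w_feature * feature_loss + w_gan * gan_loss


# Placeholder values standing in for losses computed on a batch.
pixel = l1_loss([0.2, 0.4], [0.0, 0.5])
total = generator_loss(pixel, feature_loss=0.8, gan_loss=1.5)
print(round(total, 6))
```

With these weights the feature term dominates the total, while the pixel and adversarial terms act as small corrections.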

Pretrain PSNR

Pretrain the PSNR RRDB model.

python train_psnr.py --cfg_path="./configs/psnr.yaml" --gpu=0

ESRGAN

Train the ESRGAN model starting from the pretrained PSNR model.

python train_esrgan.py --cfg_path="./configs/esrgan.yaml" --gpu=0

Configure the dataset directory and log directory in the config file before training. The DIV2K dataset is available here and the DIV8K dataset is available here.

Evaluation

python test.py --model=weights/ESRGAN-cyclemixing --gpu=0 --img_path=photo/baby.png --down_up=True --scale=4

The --scale argument is optional. When the down_up option is True, the image is first downsampled and then processed through the network. For real use cases, set the option to False so that the model upsamples the input image directly.
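
The effect of down_up on image size can be sketched as follows. The README does not say which downsampling filter test.py uses, so the sketch assumes plain average pooling, and placeholder_model is a hypothetical stand-in for the network (nearest-neighbour upsampling). None of these names come from the repository.

```python
def downsample(image, scale):
    """Average-pool a 2D grayscale image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    if height % scale or width % scale:
        raise ValueError("image size must be divisible by scale")
    return [
        [
            sum(image[y * scale + dy][x * scale + dx]
                for dy in range(scale) for dx in range(scale)) / (scale * scale)
            for x in range(width // scale)
        ]
        for y in range(height // scale)
    ]


def placeholder_model(image, scale):
    """Stand-in for the SR network: nearest-neighbour upsampling (hypothetical)."""
    return [[row[x // scale] for x in range(len(row) * scale)]
            for row in image for _ in range(scale)]


def run(image, scale=4, down_up=True):
    """With down_up, shrink first so the output matches the input size."""
    source = downsample(image, scale) if down_up else image
    return placeholder_model(source, scale)


image = [[float(x + y) for x in range(8)] for y in range(8)]
print(len(run(image, down_up=True)), len(run(image, down_up=False)))
```

With down_up enabled the output has the same size as the input, which makes it possible to compare the result against the original image. With it disabled the output is scale times larger in each dimension.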

Pre-trained models and logs

All trained models and TensorBoard logs from our experiments can be downloaded here. Three trained models are included in the repository in ./weights/*.

Results

Our methods were evaluated on LPIPS, PSNR, and SSIM using the Set5, Set14, BSD100, Urban100, and Manga109 datasets. The scores are displayed in the tables below, in the order LPIPS / PSNR / SSIM.
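
For readers unfamiliar with the metrics, lower LPIPS is better, while higher PSNR and SSIM are better. The sketch below shows the standard PSNR definition on flat pixel sequences. It is not the repository's evaluation code: details such as the colour space or border cropping used for the reported numbers are not stated here, and the function name and the max_value default are assumptions.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# An error of one tenth of the peak value on every pixel gives about 20 dB.
reference = [0.0, 0.0, 0.0, 0.0]
estimate = [25.5, 25.5, 25.5, 25.5]
print(psnr(reference, estimate))
```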

Pretrained PSNR Network

| Method | Set5 | Set14 | BSD100 | Urban100 | Manga109 |
| --- | --- | --- | --- | --- | --- |
| Baseline PSNR | 0.1341 / 30.3603 / 0.8679 | 0.2223 / 26.7608 / 0.7525 | 0.2705 / 27.2264 / 0.7461 | 0.1761 / 24.8770 / 0.7764 | 0.0733 / 29.2534 / 0.8945 |
| +Blur detection | 0.1327 / 30.4582 / 0.7525 | 0.2229 / 26.8448 / 0.7547 | 0.2684 / 27.2545 / 0.7473 | 0.1744 / 25.0816 / 0.7821 | 0.0711 / 29.5228 / 0.8973 |

X4 super-resolution

| Method | Set5 | Set14 | BSD100 | Urban100 | Manga109 |
| --- | --- | --- | --- | --- | --- |
| ESRGAN (Official) | 0.0597 / 28.4362 / 0.8145 | 0.1129 / 23.4729 / 0.6276 | 0.1285 / 23.3657 / 0.6108 | 0.1025 / 22.7912 / 0.7058 | - / - / - |
| ESRGAN (Baseline) | 0.0538 / 27.9285 / 0.7968 | 0.1117 / 24.5264 / 0.6602 | 0.1256 / 24.6554 / 0.6447 | 0.1026 / 23.2829 / 0.7137 | 0.0567 / 26.6808 / 0.8186 |
| +refGAN | 0.0536 / 27.9871 / 0.8014 | 0.1157 / 24.4505 / 0.6611 | 0.1275 / 24.5896 / 0.6470 | 0.1027 / 23.0496 / 0.7103 | 0.0623 / 26.4068 / 0.8150 |
| +Add noise | 0.04998 / 28.23 / 0.8081 | 0.1104 / 24.48 / 0.6626 | 0.1209 / 24.8439 / 0.6577 | 0.1007 / 23.2204 / 0.7203 | 0.0572 / 26.6227 / 0.8260 |
| +Cycle loss | 0.0524 / 28.1322 / 0.8033 | 0.1082 / 24.5802 / 0.6634 | 0.1264 / 24.6180 / 0.6468 | 0.1015 / 23.1363 / 0.7103 | 0.0616 / 26.3945 / 0.8151 |
| -Perceptual loss | 0.2690 / 23.4608 / 0.6312 | 0.2727 / 22.2703 / 0.5685 | 0.2985 / 24.1648 / 0.5859 | 0.2411 / 20.8169 / 0.6244 | 0.2780 / 21.7002 / 0.6483 |

Comparison of results

Contributors

sieu-n, calebelee05, peteryux
