espcn's Introduction

ESPCN

A PyTorch implementation of ESPCN, based on the CVPR 2016 paper "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network".
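The "sub-pixel convolution" in the paper title refers to a channel-to-space rearrangement applied after the last convolution. The following is a minimal pure-Python sketch of that rearrangement on nested lists, assuming the same channel ordering as PyTorch's PixelShuffle layer. The function name `pixel_shuffle` and the toy input are illustrative and are not taken from this repository.

```python
def pixel_shuffle(channels, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Illustrative sketch only; assumes the channel ordering used by
    PyTorch's PixelShuffle (input channel = c*r*r + row_offset*r + col_offset).
    """
    c_in = len(channels)
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    h, w = len(channels[0]), len(channels[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # Pick which of the r*r sub-pixel channels supplies this pixel.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = channels[src][y // r][x // r]
    return out


# Four 1x1 feature maps become one 2x2 image for an upscale factor of 2.
lr_features = [[[1]], [[2]], [[3]], [[4]]]
print(pixel_shuffle(lr_features, 2))  # [[[1, 2], [3, 4]]]
```

The network therefore works at low resolution throughout and only produces the high resolution grid in this final, parameter-free step.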

Requirements

conda install pytorch torchvision -c soumith
conda install pytorch torchvision cuda80 -c soumith # use this instead if CUDA is installed
  • PyTorchNet
pip install git+https://github.com/pytorch/tnt.git@master
  • opencv
conda install opencv

Datasets

Train, Val Dataset

The train and val datasets are sampled from VOC2012. The train dataset has 16700 images and the val dataset has 425 images. Download the datasets from here (access code: 5tzp), then extract them into the data directory. Finally, run

python data_utils.py

optional arguments:
--upscale_factor      super resolution upscale factor [default value is 3]

to generate the train and val datasets from VOC2012 with the given upscale factor (options: 2, 3, 4, 8).

Test Image Dataset

The test image datasets are sampled from:

  • Set 5: Bevilacqua et al., BMVC 2012
  • Set 14: Zeyde et al., LNCS 2010
  • BSD 100: Martin et al., ICCV 2001
  • Sun-Hays 80: Sun and Hays, ICCP 2012
  • Urban 100: Huang et al., CVPR 2015

Download the image datasets from here (access code: xwhy), then extract them into the data directory.

Test Video Dataset

The test video dataset is sampled from Jay Chou's music videos. Download the video dataset from here (access code: 6rad), then extract it into the data/test/SRF_xx/video directory, where xx is the upscale factor.

Usage

Train

python -m visdom.server & python train.py

optional arguments:
--upscale_factor      super resolution upscale factor [default value is 3]
--num_epochs          super resolution epochs number [default value is 100]

Visdom can now be accessed at 127.0.0.1:8097 in your browser, or at your own host address if specified.

If the above does not work, try an SSH tunnel to your server by adding the following line to your local ~/.ssh/config: LocalForward 127.0.0.1:8097 127.0.0.1:8097.

If you are in China, you may need to download the static resources from here (access code: vhm7) and put them in ~/anaconda3/lib/python3.6/site-packages/visdom/static/.

Test Image

python test_image.py

optional arguments:
--upscale_factor      super resolution upscale factor [default value is 3]
--model_name          super resolution model name [default value is epoch_3_100.pt]

The output high resolution images are written to the results directory.

Test Video

python test_video.py

optional arguments:
--upscale_factor      super resolution upscale factor [default value is 3]
--is_real_time        super resolution real time to show [default value is False]
--delay_time          super resolution delay time to show [default value is 1]
--model_name          super resolution model name [default value is epoch_3_100.pt]

The output high resolution videos are written to the results directory.

Benchmarks

The Adam optimizer was used, with learning rate scheduling between epoch 30 and epoch 80.

Upscale Factor = 2

Each epoch with a batch size of 64 takes ~1 minute on an NVIDIA GeForce TITAN X GPU.

Loss/PSNR graphs

Image Results

The left is the low resolution image, the middle is the high resolution image, and the right is the super resolution image (output of the ESPCN).

  • Set5
  • Set14
  • BSD100
  • Urban100

Video Results

The right is the low resolution video, and the left is the super resolution video (output of the ESPCN). Click the image to watch the complete video.

Watch the video

Upscale Factor = 3

Each epoch with a batch size of 64 takes ~30 seconds on an NVIDIA GeForce TITAN X GPU.

Loss/PSNR graphs

Image Results

The left is the low resolution image, the middle is the high resolution image, and the right is the super resolution image (output of the ESPCN).

  • Set5
  • Set14
  • BSD100

Video Results

The right is the low resolution video, and the left is the super resolution video (output of the ESPCN). Click the image to watch the complete video.

Watch the video

Upscale Factor = 4

Each epoch with a batch size of 64 takes ~20 seconds on an NVIDIA GeForce GTX 1070 GPU.

Loss/PSNR graphs

Image Results

The left is the low resolution image, the middle is the high resolution image, and the right is the super resolution image (output of the ESPCN).

  • Set5
  • Set14
  • BSD100
  • Urban100

Video Results

The right is the low resolution video, and the left is the super resolution video (output of the ESPCN). Click the image to watch the complete video.

Watch the video

Upscale Factor = 8

Each epoch with a batch size of 64 takes ~15 seconds on an NVIDIA GeForce GTX 1070 GPU.

Loss/PSNR graphs

Image Results

The left is the low resolution image, the middle is the high resolution image, and the right is the super resolution image (output of the ESPCN).

  • SunHays80

Video Results

The left is the low resolution video, and the right is the super resolution video (output of the ESPCN). Click the image to watch the complete video.

Watch the video

The complete test image results can be downloaded from here (access code: nkh9), and the complete test video results can be downloaded from here (access code: 1dus).

espcn's People

Contributors

developer0hye, leftthomas

espcn's Issues

About image normalization

Thanks for your work! I have a small question about image normalization.

out_img_y *= 255.0

The output of the model is in the range [0, 1], but I couldn't find any normalization for the input and target during training, which are loaded from file in the range [0, 255]. That would cause trouble for the loss between the model output and the target. Can you help me find where the input and target are normalized from [0, 255] to [0, 1]?
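For context on the ranges in this question: torchvision's ToTensor transform scales 8-bit PIL images from [0, 255] to [0.0, 1.0], which would explain why the model output is multiplied by 255.0 before saving. Whether the training loader in this repository applies ToTensor to both input and target is an assumption that should be checked in data_utils.py. The sketch below only illustrates the two range conversions in plain Python; the helper names are hypothetical.

```python
def to_unit_range(pixels):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


def to_byte_range(values):
    """Scale values from [0.0, 1.0] back to integers in [0, 255], with clipping."""
    return [min(255, max(0, round(v * 255.0))) for v in values]


row = [0, 128, 255]
unit = to_unit_range(row)      # what the network would see and be compared against
restored = to_byte_range(unit) # what would be written to the output image
print(restored)                # [0, 128, 255]
```

If input and target were left in [0, 255] while the output layer is bounded to [0, 1], the loss could not approach zero, so both should go through the same scaling.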

on-hand model availability

Hello:

This work is wonderful, and I wonder if I could get a ready-to-use trained model for it. This would help me a lot. Thanks.

Real-time performance issue

Hi, I used test_video.py to test the real-time video SR performance of the model. On 240P video it works fine, but on 360P video the frame rate drops to 10 fps.

I notice that the GPU utilization is only about 10%, but the CPU is at maximum capacity.

image = Variable(ToTensor()(y)).view(1, -1, y.size[1], y.size[0])
This line of code uses most of the CPU, and
out = out.cpu()
this line takes most of the time: 20 ms, while the model itself only takes 4 ms. I thought it was slow because it needs to clean up GPU memory, so I tried calling torch.cuda.empty_cache() before it. The time spent in .cpu() went down, but the total running time didn't change. I don't know how to tackle this problem, and I don't know what .cpu() does besides copying data from the GPU. I could really use some help. (Or is this exactly how it works, and there is nothing I can do except use a better CPU?)

My CPU is an i7-9700K and my GPU is an RTX 2080 Ti. I would also like to know which CPU you used for testing and what fps you get for real-time VSR.
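One possible explanation for the timings above is that CUDA kernel launches are asynchronous with respect to the host: the forward pass returns quickly, and the first call that needs the result, such as .cpu(), waits for the pending GPU work and so absorbs its cost in a naive timer. The sketch below is only a CPU-side analogy using a thread pool, not CUDA code, and `slow_kernel` is a hypothetical stand-in for the model. In real PyTorch code, calling torch.cuda.synchronize() before reading the timer gives a fairer split.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_kernel(duration):
    """Hypothetical stand-in for work queued on the GPU."""
    time.sleep(duration)
    return "frame"


def timed_pipeline(duration=0.2):
    """Time the 'launch' and the 'fetch' of asynchronous work separately."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        t0 = time.perf_counter()
        future = pool.submit(slow_kernel, duration)  # returns immediately, like a kernel launch
        t1 = time.perf_counter()
        result = future.result()                     # blocks until done, like out.cpu()
        t2 = time.perf_counter()
    return t1 - t0, t2 - t1, result


launch, fetch, _ = timed_pipeline()
print(f"launch: {launch * 1000:.1f} ms, fetch: {fetch * 1000:.1f} ms")
```

In this analogy nearly all of the time shows up in the fetch step even though the work was started by the launch step, which is the same pattern as a fast-looking model call followed by a slow-looking .cpu().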

Thanks in advance!!

test image

I am a PyTorch beginner. When I ran python test_image.py, there was an error:
  File "test_image.py", line 29, in <module>
    model.load_state_dict(torch.load('epochs/' + MODEL_NAME))
  File "/home//anaconda3/envs/pytorch/lib/python3.5/site-packages/torch/serialization.py", line 229, in load
    return _load(f, map_location, pickle_module)
  File "/home//anaconda3/envs/pytorch/lib/python3.5/site-packages/torch/serialization.py", line 367, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError

What can I do to solve it? Thank you for your help!

UnpicklingError: invalid load key, '\xef'

Hi! When I run test_image.py to load the given model, I get the error: _pickle.UnpicklingError: invalid load key, '\xef'. I tried CSAILVision/places365#25 and set the encoding to utf-8 or unicode, but it doesn't work. Could anyone give me some advice? Thanks in advance!
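A load key of '\xef' means the first byte of the file is 0xEF, which is also the first byte of the UTF-8 byte order mark (EF BB BF). That suggests, though does not prove, that the file on disk is text (for example a web page saved in place of the binary) rather than a real checkpoint. The sketch below inspects the first bytes of a file before passing it to torch.load. The function name and the returned labels are hypothetical, and the checks are heuristics.

```python
import os
import tempfile


def describe_checkpoint_header(path):
    """Guess what kind of file `path` is from its first bytes (heuristic)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(b"\xef\xbb\xbf"):
        return "utf8-bom-text"      # text file with a UTF-8 BOM, not a checkpoint
    if head.startswith(b"version https://git-lfs"):
        return "git-lfs-pointer"    # pointer file; the real binary was not fetched
    if head.lstrip().startswith(b"<"):
        return "html-or-xml"        # likely a saved web page
    if head.startswith(b"PK"):
        return "zip"                # zip container, used by newer torch.save
    if head.startswith(b"\x80"):
        return "pickle"             # pickle stream, used by legacy torch.save
    return "unknown"


# Demonstration on a temporary file that starts with a UTF-8 BOM.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbf<!DOCTYPE html>")
print(describe_checkpoint_header(tmp.name))  # utf8-bom-text
os.remove(tmp.name)
```

If the result is anything other than "zip" or "pickle", changing the encoding passed to torch.load is unlikely to help, and the file probably needs to be downloaded again or retrained.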

cPickle.UnpicklingError: invalid load key, '�'.

This is what I get:

  File "test_image.py", line 29, in <module>
    model.load_state_dict(torch.load('epochs/' + MODEL_NAME))
  File "/home/a80050185/anaconda3/envs/py2.7/lib/python2.7/site-packages/torch/serialization.py", line 267, in load
    return _load(f, map_location, pickle_module)
  File "/home/a80050185/anaconda3/envs/py2.7/lib/python2.7/site-packages/torch/serialization.py", line 410, in _load
    magic_number = pickle_module.load(f)
cPickle.UnpicklingError: invalid load key, '�'.

Please help

question about the test_video.py

I am using your code to reproduce the experimental results of the ESPCN paper, but my results are quite different from those in the paper. I suspect this is because only the Y channel of the input image is fed into the model, without the Cb and Cr channels, so the final image cannot match the quality shown in the paper. Is that the case?
These are my results:
image

image
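As background for the Y, Cb and Cr question: a common approach in single-image super-resolution is to run the network on the luminance (Y) channel only and to upscale the two chroma channels with ordinary interpolation. Whether test_video.py does exactly this is an assumption to verify in the source. The sketch below shows the full-range JPEG (JFIF) RGB to YCbCr conversion and its inverse in plain Python; the function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range JPEG/JFIF conversion from 8-bit RGB to YCbCr (floats)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, returning float RGB values."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


# Only the first component (Y) would be passed through the network;
# Cb and Cr carry the colour and are merged back afterwards.
y, cb, cr = rgb_to_ycbcr(200, 100, 50)
print(round(y, 1), round(cb, 1), round(cr, 1))
```

Under this scheme the colour information is still present in the output, so missing Cb and Cr input to the model would not by itself explain a large quality gap.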

cannot download dataset

Hello,
I can open the train/val dataset download link, but the file no longer exists.
How can I download it?

Could you re-release the training data or pre-trained model?

As the title says, all the Baidu Yun URLs are dead. I tried to use the pre-trained model, but I get "UnpicklingError: invalid load key, '\xef'" when loading it. Could you release the data again? Or could you send the data to my email [email protected]?

P.S. My platform is Windows, with Python 3.6.2 and PyTorch 0.4.0 (PyTorch 0.3.1 is not available for Windows).

Strange effect when test images

I built the Python environment with Anaconda, using Python 3.7.13. When I ran the image test command, an error was reported. I searched for it on Google and found a solution: installing 'pillow 6.1.0'. Then I ran the training command, and another error was reported:
Traceback (most recent call last):
  File "train.py", line 116, in <module>
    engine.train(processor, train_loader, maxepoch=NUM_EPOCHS, optimizer=optimizer)
  File "/home/lxp/anaconda3/envs/hh_espcn/lib/python3.7/site-packages/torchnet/engine/engine.py", line 63, in train
    state['optimizer'].step(closure)
  File "/home/lxp/anaconda3/envs/hh_espcn/lib/python3.7/site-packages/torch/optim/adam.py", line 58, in step
    loss = closure()
  File "/home/lxp/anaconda3/envs/hh_espcn/lib/python3.7/site-packages/torchnet/engine/engine.py", line 56, in closure
    self.hook('on_forward', state)
  File "/home/lxp/anaconda3/envs/hh_espcn/lib/python3.7/site-packages/torchnet/engine/engine.py", line 31, in hook
    self.hooks[name](state)
  File "train.py", line 45, in on_forward
    meter_loss.add(state['loss'].data[0])
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
I searched again and found that the solution is to change this code:
meter_loss.add(state['loss'].data[0])
to:
meter_loss.add(state['loss'].item())
After that, training ran to completion, and I got the model file 'epoch_3_100.pt' in the epochs directory.
However, when I use this model to test images, I get a very strange result like this:
BSD100_099
Did I do anything wrong in these steps?
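The .data[0] to .item() change described above reflects PyTorch 0.4, where losses became 0-dim tensors that can no longer be indexed. The following is a minimal sketch of a helper that tolerates both styles. It uses duck typing so that it runs without PyTorch installed; `loss_to_float` and the two stand-in classes are hypothetical and exist only for illustration.

```python
def loss_to_float(loss):
    """Return a Python float from a scalar loss, new style or old style."""
    item = getattr(loss, "item", None)
    if callable(item):
        return float(item())      # 0-dim tensor style (PyTorch 0.4 and later)
    return float(loss.data[0])    # 1-element tensor style (before 0.4)


class FakeScalarLoss:
    """Hypothetical stand-in for a 0-dim tensor."""
    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


class FakeLegacyLoss:
    """Hypothetical stand-in for an old 1-element Variable."""
    def __init__(self, value):
        self.data = [value]


print(loss_to_float(FakeScalarLoss(0.25)))  # 0.25
print(loss_to_float(FakeLegacyLoss(0.5)))   # 0.5
```

In train.py the equivalent call would be meter_loss.add(loss_to_float(state['loss'])), assuming the rest of the script is otherwise unchanged.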

Alternative links for dataset download

Hi,

Could you provide alternative download links for the dataset, as it's difficult to download from Baidu?
A shared Google Drive link would be helpful.

Thanks,
Pramod

Why is the reproduced SISR result lower than in the original paper?

First of all, thank you very much for your hard work reproducing the method!

The training set used in the original paper has 91 high-definition images, and the result reported for ×3 SR on Set5 is 32.55 dB.

The training set you used has 16700 images, and the reproduced result on Set5 is 31.51 dB. Why is the result worse with a larger training set?

I also found that the highest value from your program on Set5, 34.09 dB, is higher than in the original paper, so I guess the original paper reports the highest value as its result?
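When comparing dB figures like these, it helps to pin down how PSNR is computed, since choices such as evaluating on Y only versus RGB, cropping borders, or using a peak of 1.0 versus 255 can shift the number. The sketch below is the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat pixel lists. It is not the evaluation code of this repository or of the paper, and the function name is illustrative.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# An error of exactly one grey level on every pixel gives about 48.13 dB.
print(round(psnr([10, 20, 30], [11, 21, 31]), 2))  # 48.13
```

Two implementations should only be compared after confirming that they use the same channels, the same peak value, and the same border handling.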

cPickle.UnpicklingError: invalid load key, '?' With existing model epoch_3_100.pt

>>> import torch
>>> torch.load('epochs/epoch_3_100.pt')
<open file 'epochs/epoch_3_100.pt', mode 'rb' at 0x7f7e3eb91e40>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 231, in load
    return _load(f, map_location, pickle_module)
  File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 369, in _load
    magic_number = pickle_module.load(f)
cPickle.UnpicklingError: invalid load key, '?'.
>>>

I tried to use the existing model in the epochs folder, but got the error above.
Thanks

model

Is your model file damaged? So I need to train the model myself in order to use it, right?

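Question about train.py

After running python -m visdom server & python train.py, the Visdom visualization page came up, but no data appeared: there was only the top navigation bar and a blue background, without any of the visualization windows that train.py creates in Visdom for training progress and other data. How can I resolve this? Thanks!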

ESPCN's architecture

The suggested model is a bit different from the one in the paper. You have 4 convolutional layers, whereas there are only 3 in the paper. Have you tried fewer? Why do you have a sigmoid in the output layer?
Thank you very much!
Idan

Cannot download datasets from Baidu

The Baidu download links redirect to a cloud application that requires an account, which in turn requires a Chinese mobile number to set up. Would it be possible to host these datasets in a more accessible place?

cuda error

TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first

Consult the file train.h5

Good evening! I am very sorry to disturb you. I don't know where the train.h5 file is. Can you tell me where to find it? Thank you.

python3

First, thanks for your work.
When I use python3 to load epoch_3_100.pt, I get _pickle.UnpicklingError: invalid load key, '\xef'.
How can I resolve this problem?

RuntimeError: cuda runtime error (8) : invalid device function

My CUDA version is 10.0, Python is 3.6, GPU compute capability is 5.0, and tensorflow version is 1.14.0. Running train.py fails with: RuntimeError: cuda runtime error (8) : invalid device function. Another thing I don't understand: when I run train.py, the progress bar counts up to 261. Where does 261 come from? 0%| | 0/261 [00:00<?, ?it/s]
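On the second question, 261 is consistent with the number of batches per epoch: the README states 16700 training images and a batch size of 64, and 16700 / 64 rounded up is 261. This assumes the progress bar iterates over batches and that the last partial batch is kept, which is the default behaviour of a PyTorch DataLoader. A small sketch of the arithmetic, with a hypothetical helper name:

```python
import math


def batches_per_epoch(num_images, batch_size, drop_last=False):
    """Number of batches a loader yields per epoch."""
    if drop_last:
        return num_images // batch_size      # incomplete final batch discarded
    return math.ceil(num_images / batch_size)  # incomplete final batch kept


print(batches_per_epoch(16700, 64))                  # 261
print(batches_per_epoch(16700, 64, drop_last=True))  # 260
```

The final batch in this case holds the remaining 60 images rather than a full 64.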

about visdom

Hello, I want to ask about "python -m visdom.server & python train.py": is it one statement or two? If I run "python -m visdom.server" on its own, I get "You can navigate to http://localhost:8097". Do I then need to keep that terminal open, open another terminal, and run "python train.py" there?
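In a POSIX shell, a single & after a command runs it in the background, so the line above starts the Visdom server and then immediately starts train.py in the same terminal. Running the two commands in two separate terminals has the same effect. The sketch below expresses the same idea with Python's subprocess module; `run_with_background_server` and the `startup_delay` parameter are hypothetical, and unlike the shell one-liner this version stops the server when the main command finishes.

```python
import subprocess
import sys
import time


def run_with_background_server(server_cmd, main_cmd, startup_delay=1.0):
    """Start server_cmd in the background, run main_cmd, then stop the server."""
    server = subprocess.Popen(server_cmd)      # returns at once, like `cmd &`
    try:
        time.sleep(startup_delay)              # crude wait for the server to come up
        return subprocess.run(main_cmd).returncode
    finally:
        server.terminate()                     # the shell version leaves it running
        server.wait()


# Intended use, assuming visdom is installed and train.py is in the
# current directory (left commented out so the sketch has no side effects):
# run_with_background_server(
#     [sys.executable, "-m", "visdom.server"],
#     [sys.executable, "train.py"],
# )
```

The fixed delay is a simplification; a more careful version would poll the server port until it accepts connections.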

pre_train_model

@Risemxd
Duplicate of #25

I ran train.py, but it only printed Train Loss ... and Val Loss ...; the statement torch.save(model.state_dict(), 'epochs/epoch_%d_%d.pt' % (UPSCALE_FACTOR, state['epoch'])) did not run.

Results in MSU Video Super Resolution Benchmark

Hello,
MSU Video Group has recently launched Video Super Resolution Benchmark and evaluated this algorithm.

It takes 14th place by subjective score, 8th place by PSNR, and 14th by our metric ERQAv1.0. You can see the results here.

If you have any other VSR method you want to see in our benchmark, we kindly invite you to participate.
You can submit it for the benchmark, following the submission steps.
