clovaai / cutblur
Rethinking Data Augmentation for Image Super-resolution (CVPR 2020)
License: MIT License
I successfully reproduced CutBlur on the DIV2K dataset, and it did bring a considerable improvement. However, when I switched to a face dataset, training went badly wrong and the PSNR was very low. I haven't been able to find the cause so far. I hope the authors can offer some pointers.
By the way, this is an awesome project!
Thanks in advance.
Hi,
your work is excellent.
I noticed that your paper mentions GAN models, ESRGAN in particular. However, I didn't find an ESRGAN model in the code. Could you provide the code for the ESRGAN model?
Thank you
Hi, I was trying to reproduce CutBlur and failed because I didn't use X2 scale pretraining. Then I noticed that the README says "To achieve the result in the paper, X2 scale pretraining is necessary".
I'm curious: have you found out why this is necessary?
Thanks in advance.
Thanks for your awesome work!
One question: how can CutBlur be used for video super-resolution, where the input is a sequence of images?
Here's how I write the training loop ...
`
def train_loop_fn(data_loader, model, optimizer, device, scheduler):
    running_loss = 0.0
    model.train()
    for inputs, labels in data_loader:
        inputs = inputs.to(device, dtype=torch.float)
        labels = labels.to(device, dtype=torch.float)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer_step(optimizer)
        running_loss += loss.item() * inputs.size(0)
    train_loss = running_loss / float(len(train_dataset))
    scheduler.step(train_loss)
    print('training Loss: {:.4f}'.format(train_loss))
`
Could you please tell me how to use cutblur in this loop?
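A minimal sketch of where CutBlur could slot into a loop like this one, assuming the model accepts inputs that have already been resized to the label resolution (another issue on this page quotes the repository doing that with nearest upsampling before augmenting). The `cutblur` below is an illustrative NumPy rewrite, not the repository's function; the `rng` argument and the clipping of the ratio are additions made here for reproducibility and safety.

```python
import numpy as np


def cutblur(hr, lr_up, alpha=0.7, prob=1.0, rng=None):
    """Swap one random rectangle between hr and lr_up.

    hr, lr_up: arrays of shape (N, C, H, W) at the same resolution.
    Returns (hr, augmented_input); the inputs are not modified.
    """
    rng = np.random.default_rng() if rng is None else rng
    if hr.shape != lr_up.shape:
        raise ValueError("hr and lr_up must have the same shape")
    if alpha <= 0 or rng.random() >= prob:
        return hr, lr_up

    # Ratio drawn close to alpha; clipped so the rectangle always fits.
    cut_ratio = float(np.clip(rng.normal(alpha, 0.01), 0.0, 1.0))
    h, w = hr.shape[-2:]
    ch, cw = int(h * cut_ratio), int(w * cut_ratio)
    cy = int(rng.integers(0, h - ch + 1))
    cx = int(rng.integers(0, w - cw + 1))

    if rng.random() > 0.5:
        # Paste an HR rectangle into the low-resolution input.
        out = lr_up.copy()
        out[..., cy:cy + ch, cx:cx + cw] = hr[..., cy:cy + ch, cx:cx + cw]
    else:
        # Paste a low-resolution rectangle into the HR image.
        out = hr.copy()
        out[..., cy:cy + ch, cx:cx + cw] = lr_up[..., cy:cy + ch, cx:cx + cw]
    return hr, out
```

In the loop, the call would sit right after the two `.to(device)` lines, for example `labels, inputs = cutblur(labels, inputs)`, once `inputs` has the same spatial size as `labels`. With torch tensors, `.copy()` becomes `.clone()`.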
When I try to run inference.py on an image from Set14, it doesn't produce any output and prints 0it [00:00, ?it/s].
Can you please help me figure out what I am doing wrong? Thank you!
Hi,
It's not clear from the paper, and it's not in the repository: do you apply CutBlur to the images in the G phase, the D phase, or both? Or something else?
Thank you!
Hi, I am confused about this line in the cutblur function:
cut_ratio = np.random.randn() * 0.01 + alpha
Why not use this instead?
cut_ratio = np.random.rand()
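For reference, the two expressions draw from very different distributions. A short sketch comparing them, using `alpha = 0.7` as an example value (the seed and sample count are arbitrary choices made here); the thread does not state why the authors chose the first form.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.7
n = 100_000

# Equivalent of np.random.randn() * 0.01 + alpha:
# a normal distribution with mean alpha and standard deviation 0.01,
# so the cut ratio stays within a few hundredths of alpha.
gaussian = rng.standard_normal(n) * 0.01 + alpha

# Equivalent of np.random.rand():
# uniform on [0, 1), so the cut ratio ignores alpha entirely.
uniform = rng.random(n)

print("gaussian mean/std:", gaussian.mean(), gaussian.std())
print("uniform  mean/std:", uniform.mean(), uniform.std())
```

With the first form, `alpha` controls the size of the pasted region; with the second, the region size varies across the whole range and `alpha` has no effect.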
Blend and channel permutation seem to cause the PSNR to drop.
Hi!
I ran CutBlur, and the results got worse rather than better. The baseline is EDSR, and I used alpha=0.7.
Hi, thanks very much for your solid work. I have a question about the training patch size for single image super-resolution (SISR). I noticed that many works use a training patch size of 96x96 for scale=2x SISR. However, many deeper networks (e.g. RCAN) have a larger receptive field. I wonder whether a training patch size of 96x96 for scale=2x is the best choice?
Hi,
Thanks for your code and idea, quite interesting.
I decided to apply it to some models and test its effectiveness.
May I know what value of alpha you used in your paper for the cutblur function?
Also, did you check how different alpha values affect the effectiveness of CutBlur?
Thanks for your help
What is the size of the image patches you use at the x2, x3, and x4 scales?
Are they all 48x48?
Thank you!
How do I use this if I have to train on a different dataset?
The code does im2 *= fim2; should we do the same for im1, i.e. im1 *= fim1?
I am trying to learn how to train on another dataset. Before that, I tried repeating the DIV2K training with a patch size of 24. After training, I used the checkpoint file for testing and got the following error:
RuntimeError: Error(s) in loading state_dict for Net:
Missing key(s) in state_dict: "tail.0.2.weight", "tail.0.2.bias".
size mismatch for head.1.weight: copying a param with shape torch.Size([256, 12, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 48, 3, 3]).
Is it OK to change the patch_size to 24? If so, why do I get this error? Thank you for your help!
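One possible reading of the error, offered as an inference from the shapes alone and not confirmed in this thread: 12 = 3 x 2^2 and 48 = 3 x 4^2, which would fit a head convolution that sees `3 * scale**2` input channels. Under that assumption the checkpoint was trained at scale 2 while the test model was built for scale 4, and the patch size is not involved. The helper name `infer_scale` is hypothetical.

```python
import math


def infer_scale(in_channels, rgb_channels=3):
    """Recover scale from in_channels, assuming in_channels == rgb * scale**2."""
    ratio, remainder = divmod(in_channels, rgb_channels)
    if remainder:
        raise ValueError("in_channels is not a multiple of rgb_channels")
    scale = math.isqrt(ratio)
    if scale * scale != ratio:
        raise ValueError("in_channels / rgb_channels is not a perfect square")
    return scale


checkpoint_scale = infer_scale(12)  # input channels of head.1.weight in the checkpoint
model_scale = infer_scale(48)       # input channels of head.1.weight in the current model
print(checkpoint_scale, model_scale)
```

If this reading is right, passing the same scale at test time as at training time should make the shapes match; the missing `tail.0.2.*` keys would also be consistent with a model that has more upsampling layers than the checkpoint.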
Hi, in your code you seem to use F.interpolate to upsample the LR image to match the resolution of the HR image in order to apply CutBlur. But have you checked the upsampled image? When I do it your way, I get an image with a severe color shift, which does not match what is shown in your paper.
For some unknown reason I cannot upload the images, but I can share my testing code with you.
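# Imports inferred from the calls below (io is assumed to be skimage.io).
# im2tensor is the repository's helper and is assumed to be importable here.
from skimage import io
import torch.nn.functional as F
import matplotlib.pyplot as plt

HR = io.imread('data/DIV2K/DIV2K_train_HR/0159.png')
LR = io.imread('data/DIV2K/DIV2K_train_LR_bicubic/X4/0159x4.png')
# Crop matching regions: the HR crop is 4x the LR crop.
HR_plot = HR[0:400, 200:600]
LR_plot = LR[0:100, 50:150]
LR_tensor = im2tensor(LR_plot).unsqueeze(0)
# Upsample the LR crop to the HR crop size with nearest interpolation.
LR_plot = F.interpolate(LR_tensor, scale_factor=4, mode="nearest")[0].numpy().transpose(1,2,0)
f, axarr = plt.subplots(1, 2, figsize=(10, 5))
axarr[0].imshow(LR_plot)
axarr[1].imshow(HR_plot)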
I hope you can try this and tell me what I did wrong.
Thank you so much!
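One thing that may be worth ruling out, assuming `im2tensor` returns float values on a 0-255 scale (not confirmed in this thread): matplotlib's `imshow` expects float RGB data in [0, 1] and clips anything outside that range, which by itself produces a washed-out, color-shifted picture. A small sketch of a conversion helper; the name `to_displayable` and the "max above 1 means 0-255" heuristic are assumptions made here.

```python
import numpy as np


def to_displayable(img):
    """Return a uint8 copy of an (H, W, 3) image that imshow renders unclipped."""
    img = np.asarray(img)
    if img.dtype == np.uint8:
        return img
    # Heuristic (assumption): float data with values above 1 is on a 0-255 scale.
    if img.max() > 1.0:
        return np.clip(np.rint(img), 0, 255).astype(np.uint8)
    # Otherwise treat the data as already normalized to [0, 1].
    return np.clip(np.rint(img * 255.0), 0, 255).astype(np.uint8)
```

In the snippet above this would be used as `axarr[0].imshow(to_displayable(LR_plot))`. If the colors then match the HR crop, the shift came from display clipping and not from the interpolation.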
Hi authors,
I've tested the performance of CutBlur only once, using:
python main.py --model CARN --augs cutblur --alpha 0.7 --dataset RealSR --scale 4 --camera all --dataset_root ./input/RealSR/ --ckpt_root ./pt/RealSR/cutblur/ --save_result --save_root ./output/RealSR/cutblur/
The result I obtained is 28.89, which is lower than the 29.00 reported in the paper.
Therefore, I would like to know whether 29.00 is an average over multiple tested models.
Hello!
I have some questions regarding Table 1
You have shown results for DIV2K and RealSR. I wanted to know whether you use the DIV2K validation set for both, i.e. the models are trained on DIV2K and RealSR, but the DIV2K validation data is used for testing both.
Thanks.
Thanks for your great work.
The error below occurs when the input image size is [2000, 3000, 3] (sizes below [1800, 1800, 3] are OK). Is the size of input images limited?
Segmentation fault (core dumped)
More info:
torch 1.5.0
torchfile 0.1.0
torchvision 0.6.0
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
GeForce RTX 2080Ti
Hi,
I tried to reproduce the results reported in the paper.
I followed the steps listed in this repo: first train RCAN x2, then fine-tune for RCAN x4. I used exactly the same code and commands as you provided,
but RCAN x2 on DIV2K with MoA only gives a PSNR of 36.32, which I think is far too low.
Is this normal? Could you provide more training details, such as the PSNR you obtained when training RCAN x2?
I'd really appreciate it.
Hi!
I am training a network I designed myself, but the loss keeps oscillating during training. I tried different learning rates, but that didn't help. Do you have any suggestions?
Thank you!
None of the links to the pretrained models work. Could you please provide new links? Thank you!
I have a question about how the resolutions of (LR, HR) are matched for CutBlur.
When I checked the code that matches the resolution of (LR, HR),
I found that it uses nearest interpolation:
if HR.size() != LR.size():
    scale = HR.size(2) // LR.size(2)
    LR = F.interpolate(LR, scale_factor=scale, mode="nearest")
Why don't you use bicubic?
Most people use bicubic in super-resolution.
Is there a special reason?
I am interested in your CutBlur.
Thank you for your attention.
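One property that separates the two modes, shown here as an observation and not as the authors' stated rationale: with an integer scale factor, nearest upsampling only replicates each LR pixel into a scale x scale block, so no new pixel values are created and the original LR image can be recovered exactly by strided sampling. A NumPy sketch; the helper name `upsample_nearest` is hypothetical.

```python
import numpy as np


def upsample_nearest(lr, scale):
    """Repeat every pixel of an (H, W, C) array into a scale x scale block."""
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)


lr = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)
up = upsample_nearest(lr, 4)

# Taking every 4th pixel returns the LR image unchanged.
recovered = up[::4, ::4]
print(up.shape, np.array_equal(recovered, lr))
```

Bicubic upsampling, by contrast, computes new interpolated values, so the upsampled image no longer contains the LR pixels verbatim.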
Hi, I am trying to reproduce the results on my custom dataset.
I used the "RealSR" settings to run it, but got the error below:
File "main.py", line 27, in main
solver.fit()
raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0, -36, -36)
Does anyone have an idea about this error?
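`random.randrange(start, stop)` raises this error whenever `stop <= start`. A likely trigger, assuming the crop origin is drawn with something like `randrange(0, side - patch + 1)` (the exact line in the repository is not shown in this thread), is an image whose side is smaller than the patch size. A sketch of a guarded crop helper; the function name is hypothetical.

```python
import random


def random_crop_origin(height, width, patch):
    """Pick the top-left corner of a patch x patch crop, failing early if it cannot fit."""
    if height < patch or width < patch:
        raise ValueError(
            f"image of size {height}x{width} is smaller than patch size {patch}"
        )
    y = random.randrange(0, height - patch + 1)
    x = random.randrange(0, width - patch + 1)
    return y, x
```

Checking the smallest image in the custom dataset against the patch size, or lowering the patch size, would be a first thing to try.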
I get a division by zero error when I try to evaluate a pretrained model on Set14. Can you please help me figure out why? I have put Set14 in the dataset directory and passed Set14_SR as the dataset argument. Thank you!
File "/kaggle/working/cutblur/solver.py", line 139, in evaluate
return psnr/len(self.test_loader)
ZeroDivisionError: division by zero
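Since the divisor is `len(self.test_loader)`, the error means the test loader is empty, i.e. no images were found. A sketch of a check that could be run on the dataset directory before evaluating; the helper name and the list of suffixes are assumptions made here, and the directory layout the repository expects is not stated in this thread.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def list_images(root):
    """Return all image files under root, raising a clear error if none exist."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    files = sorted(
        p for p in root.rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not files:
        raise RuntimeError(f"no image files found under {root}")
    return files
```

If this finds the Set14 images but the loader is still empty, the folder names or file naming pattern probably differ from what the dataset class looks for.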
Hi, I am currently doing some research on data augmentation and SISR. I noticed that you crop a small margin from the HR and SR images during evaluation. Is that because there might be artifacts in the produced super-resolved image that would affect the PSNR result?
Kind Regards,
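For context, a sketch of PSNR with an optional border crop, which is a common evaluation convention in super-resolution; whether this matches the authors' reason is not confirmed in this thread. The `border` and `max_val` parameters are names chosen here.

```python
import numpy as np


def psnr(sr, hr, border=0, max_val=255.0):
    """PSNR between two (H, W, C) images, ignoring `border` pixels on each side."""
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    if border > 0:
        sr = sr[border:-border, border:-border]
        hr = hr[border:-border, border:-border]
    mse = np.mean((sr - hr) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


# Toy example: errors confined to a 4-pixel frame around the image.
hr = np.full((32, 32, 3), 128.0)
sr = hr.copy()
sr[:4], sr[-4:], sr[:, :4], sr[:, -4:] = 138.0, 138.0, 138.0, 138.0

print(psnr(sr, hr), psnr(sr, hr, border=4))
```

In the toy example the uncropped score is pulled down by the frame alone, while the cropped score only reflects the interior, which is the effect such a margin is meant to remove.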
Why does the interpolation method differ between the code and the paper? The paper uses bicubic, whereas the code uses nearest.