geekyutao / rn

Region Normalization for Image Inpainting, accepted by AAAI-2020

License: MIT License

Python 100.00%

Region Normalization for Image Inpainting

The paper can be found here. If you have any questions about the paper or the code, you can contact me by email ([email protected]).

Please run the code with Python 3.x and PyTorch >= 0.4.

PS: 1) The results of this version of the code are better than those in the paper. The original base inpainting model that RN used was not very stable (the variance of its results was fairly large), so we reported only conservative results. After publication, we optimized the base model and improved its robustness, so the results are now better. 2) RN aims to convey the insight that spatially region-wise normalization is better for some CV tasks such as inpainting. In theory, RN can be either BN-style or IN-style, and both have pros and cons. IN-style RN gives less blurry results and achieves some degree of style consistency with the background, but suffers from spatial inconsistency if the model's representation ability is limited. BN-style RN gives higher PSNR on aligned validation data, but makes regions blurrier and carries a higher risk of data bias when the test data distribution shifts from the training data distribution. Choose the RN style according to the specific scene. (See issue #12)
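
To make the difference between the two styles concrete, the sketch below computes the statistics of the masked region once per sample (IN-style) and once pooled over the batch (BN-style). It is a toy illustration on flattened lists rather than the repository's implementation; `region_stats` and the example values are assumptions made for this note.

```python
from statistics import mean, pstdev

def region_stats(values, mask, region):
    """Mean and standard deviation of the values whose mask entry equals `region`."""
    picked = [v for v, m in zip(values, mask) if m == region]
    return mean(picked), pstdev(picked)

# Two samples of one flattened feature channel (made-up values).
# Assumption for this sketch: mask value 1 marks the corrupted region.
batch = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
masks = [[1, 1, 0, 0], [1, 1, 0, 0]]

# IN-style: one (mean, std) pair per sample, so each image is normalized
# with its own statistics.
in_style = [region_stats(x, m, 1) for x, m in zip(batch, masks)]

# BN-style: the same region is pooled across every sample in the batch,
# so all images share one (mean, std) pair.
flat_values = [v for x in batch for v in x]
flat_masks = [m for mk in masks for m in mk]
bn_style = region_stats(flat_values, flat_masks, 1)
```

With samples whose value ranges differ this much, the pooled statistics fit neither sample well, which is one way to picture the data-bias risk described above.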

Repo Update:

[04/26/2022] Support torch >= 1.7; fix old-version issues.

Preparation

Before running the code, you should prepare a training/evaluation image file list (flist) and a mask file list (flist). You can use the following command to generate a .flist file:

python flist.py --path your_dataset_folder --output xxx.flist
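
For readers who want to see what such a file list contains, a script of this kind typically walks the dataset folder and writes one image path per line. The sketch below illustrates that idea under this assumption; it is not the repository's `flist.py`, and the name `write_flist` and the extension set are hypothetical.

```python
import os

# Hypothetical set of extensions treated as images; adjust to your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}

def write_flist(dataset_dir, output_path):
    """Walk dataset_dir and write one image path per line to output_path."""
    paths = []
    for root, _dirs, files in os.walk(dataset_dir):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                paths.append(os.path.join(root, name))
    paths.sort()  # deterministic order across runs
    with open(output_path, "w") as f:
        f.write("\n".join(paths))
    return paths
```

The same function can be pointed at the mask folder to produce the mask list.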

Training

There are some hyperparameters that you can adjust in main.py. To train the model, you can run:

python main.py --bs 14 --gpus 2 --prefix rn --img_flist your_training_images.flist --mask_flist your_training_masks.flist

PS: You can set "--bs" and "--gpus" to any values you like. The above is just an example.

Evaluation

To evaluate the model, you can run on either GPU or CPU.

For GPU:

python eval.py --bs your_batch_size --model your_checkpoint_path --img_flist your_eval_images.flist --mask_flist your_eval_masks.flist

For CPU:

python eval.py --cpu --bs your_batch_size --model your_checkpoint_path --img_flist your_eval_images.flist --mask_flist your_eval_masks.flist

PS: The pretrained model in the './pretrained_model/' folder was trained on the Places2 dataset with the Irregular Mask dataset. Please train RN from scratch if your test data is not from Places2 or if you use regular masks.

Cite Us

Please cite us if you find this work helpful.

@inproceedings{yu2020region,
  title={Region Normalization for Image Inpainting.},
  author={Yu, Tao and Guo, Zongyu and Jin, Xin and Wu, Shilin and Chen, Zhibo and Li, Weiping and Zhang, Zhizheng and Liu, Sen},
  booktitle={AAAI},
  pages={12733--12740},
  year={2020}
}

Appreciation

The code is based on EdgeConnect. Thanks to its authors!

rn's Issues

Total instance number: 1

Hi, thanks for your work.
I have produced train_image.flist, train_mask.flist, test_image.flist and test.mask.flist,
but when I try training, it says "Total instance number: 1", as follows:

When I try eval.py with your pretrained model, it says "ValueError: num_samples should be a positive integer value, but got num_samples=0", as follows:

Both problems suggest that the data cannot be loaded. I hope for your reply.
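
One way to narrow this down is to check what the file list actually contains before training. The following is a minimal sketch, assuming the `.flist` file is plain text with one path per line; the helper name `check_flist` is hypothetical and not part of this repository.

```python
import os

def check_flist(flist_path):
    """Return (existing, missing) path lists read from a one-path-per-line file."""
    with open(flist_path) as f:
        entries = [line.strip() for line in f if line.strip()]
    existing = [p for p in entries if os.path.isfile(p)]
    missing = [p for p in entries if not os.path.isfile(p)]
    return existing, missing
```

If `existing` is empty or much shorter than expected, the list was probably generated from the wrong folder or with relative paths that do not resolve from the directory where training is launched.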

Self-trained model

Hello, I have a question. After training my own model, testing raises KeyError: 'discriminator.module.conv1.0.weight_orig'. Does the training need to be done in several stages?

Visualization

Hi, geekyutao. I am very interested in your image inpainting work; the idea of Region Normalization is amazing. I noticed that you visualized certain features of the inpainting network in your paper, and I am very curious about this process. Could you share the code for this part? My email: [email protected]
Thank you very much.

Training on custom dataset with custom mask for Inpainting

Hello @geekyutao
Great work on the paper and even the results look amazing!

Could you please guide me, step by step, through preparing a custom image dataset with custom masks (if possible)?

What should the masks look like? Should each mask be the colour image with white patches drawn on it (the parts the model will inpaint), or a completely black image with white patches drawn on it? In short, how do I prepare the masks and then supply them for model training?

I referred to your steps but couldn't completely follow them.
Thank you in advance!

I have trained this on a single GPU with CUDA 11.2 and PyTorch 1.8 but am still confused about multi-GPU training

How can I use the multi-GPU pretrained model on a single GPU? The code does not seem to work.
By the way, I have emailed Yutao about the license of this great work ([email protected]) and hope to get a reply :)
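
A checkpoint saved from a model wrapped in `torch.nn.DataParallel` carries an extra `module` component in its parameter names, which a non-wrapped model will not recognize. The sketch below shows one way to rename such keys before loading. It assumes this naming difference is the cause of the mismatch here, which the thread does not confirm, and the helper name `strip_module_prefix` is hypothetical. It works on a plain dictionary, so it could be applied to the result of `torch.load` before calling `load_state_dict`.

```python
def strip_module_prefix(state_dict):
    """Return a copy of state_dict with every 'module' path component removed.

    Example: 'discriminator.module.conv1.0.weight_orig'
          -> 'discriminator.conv1.0.weight_orig'
    """
    renamed = {}
    for key, value in state_dict.items():
        parts = [p for p in key.split(".") if p != "module"]
        renamed[".".join(parts)] = value
    return renamed
```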

about demo code

Hi, thank you for your work.
I could not find code to test an image with a mask file using the pretrained model.
It would be helpful if you could provide a sample script for this.
:)

about channels

In the encoder, the input mask has 3 channels but the feature map has 64 channels, yet the code computes x*label. Is this an error?

Model loading issue in eval.py and main.py

Hi, thanks for developing the RN algorithm and making it a public repo. I have faced two similar issues in different situations.

When using main.py with the pretrained model, I get the following error.
The same error appears when using eval.py with my new model trained from scratch.
Are there any changes in the model architecture that I am missing? Did you create the pretrained model "x_admin.cluster.localRN-0.8RN-Net_bs_14_epoch_3" with main.py?

File "main.py", line 251, in <module>
model.load_state_dict(pretained_model)
File "/home/chintu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for InpaintingModel:
Missing key(s) in state_dict: "generator.encoder_conv1.weight", "generator.encoder_conv1.bias", "generator.encoder_in1.foreground_gamma", "generator.encoder_in1.foreground_beta", "generator.encoder_in1.background_gamma", "generator.encoder_in1.background_beta", "generator.encoder_conv2.weight", "generator.encoder_conv2.bias", "generator.encoder_in2.foreground_gamma", "generator.encoder_in2.foreground_beta", "generator.encoder_in2.background_gamma", "generator.encoder_in2.background_beta", "generator.encoder_conv3.weight", "generator.encoder_conv3.bias", "generator.encoder_in3.foreground_gamma", "generator.encoder_in3.foreground_beta", "generator.encoder_in3.background_gamma", "generator.encoder_in3.background_beta", "generator.middle.0.conv_block.1.weight", "generator.middle.0.conv_block.1.bias", "generator.middle.0.conv_block.2.sa.conv1.weight", "generator.middle.0.conv_block.2.sa.gamma_conv.weight", "generator.middle.0.conv_block.2.sa.gamma_conv.bias", "generator.middle.0.conv_block.2.sa.beta_conv.weight"

But, when I use eval.py with your pretrained model it works perfectly fine.

load model error

Traceback (most recent call last):
File "eval.py", line 182, in <module>
model.load_state_dict(pretained_model)
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 763, in load_state_dict
load(self)
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 761, in load
load(child, prefix + name + '.')
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 761, in load
load(child, prefix + name + '.')
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 761, in load
load(child, prefix + name + '.')
[Previous line repeated 1 more time]
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 758, in load
state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/modules/module.py", line 685, in _load_from_state_dict
hook(state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
File "/home/jiangwenqiang/.conda/envs/env3.6/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 165, in __call__
weight_orig = state_dict[prefix + fn.name + '_orig']
KeyError: 'discriminator.module.conv1.0.weight_orig'
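
The failure above happens while restoring the discriminator's spectral-norm weights. If only inference is needed, one option is to load just the generator's weights and skip the discriminator entries. This is a minimal sketch, assuming the checkpoint is a flat dictionary whose generator parameters are prefixed with `generator.` (as the key names reported in these issues suggest); `select_submodule` is a hypothetical helper, not part of the repository.

```python
def select_submodule(state_dict, prefix):
    """Keep only the entries under `prefix` and drop the prefix from their names."""
    head = prefix + "."
    return {
        key[len(head):]: value
        for key, value in state_dict.items()
        if key.startswith(head)
    }
```

The filtered dictionary could then be passed to the generator's own `load_state_dict`, assuming the model object exposes the generator as an attribute.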

I want better results, please help

When I use my single-GPU model (batch size = 8, epoch = 4, th = 0.8), it works like this:

But when I use your pretrained multi-GPU model (with the method from the earlier issue), it doesn't work:

How can I get results as good as those in your paper?
Or how can I make the pretrained model work on my single GPU?

Where can I test the results after training? I don't see a test script.

How to test it for a single photo.

how can I test it on a single photo and its mask?

Example Inputs and masks for evaluation

Hi! Thank you very much for your excellent work. However, when I run the evaluation on Places2 with the provided pretrained model, the quality of the results is quite bad. Could you provide some examples that work with the pretrained model, or more concrete instructions?

advice on evaluation code

Hi, thanks a lot for your work!

I find that testing your pretrained model with my own images and masks (whose size is not 256*256) raises errors, so I suggest adjusting your dataset.py and changing the to_tensor function to:

def to_tensor(self, img):
    img = Image.fromarray(img)
    # img_t = F.to_tensor(img).float()
    mytrans = my_transforms()
    img_t = mytrans(img).float()
    return img_t

Then other people can test your code with images of arbitrary size. Just my advice, thanks a lot.

some difficult

Hello, geekyutao:
I'm interested in your work and want to try it, but I have some problems.
I would appreciate it if you could share the trained model so that I can generate good results without training the network again.
Does what you wrote in Preparation mean that we should prepare the images and masks ourselves?
Are masks required for successful inpainting? Does this mean the inverse operation only works for masks applied to the image? Doesn't this reduce the potential for real-world application?
I apologize for any misunderstanding. Thank you.

Problem encountered during training

load model error

The trained model fails to load at test time. How can I solve this?

Please provide demo images to run your test.

Thanks for your hard work.
Could you please provide some demo images to help us run your code?
We don't know your official image mask format (black-and-white mask or alpha mask).
Thanks in advance.

Cityscape Dataset

Hello, I'm trying to train this network on the Cityscapes dataset. Do you have any advice regarding the number of epochs? After 400 epochs the results aren't very good.

wu

Why does the provided pretrained model produce results like these?

Question about more regions (K>2)

Hi, in your work the number of regions is 2 (mask and others), and it performs very well. I'm wondering what a larger number of regions would look like: does every region still represent an occluded area in the image? And how should the code be modified when K>2?
Could you share any experience or insight on this situation? Any suggestions would be very helpful, thanks.
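
As a rough illustration of how the two-region case might generalize, the sketch below normalizes each of K regions separately, given an integer label per position. It is an assumption about one possible extension, not code from this repository: `region_normalize` and its arguments are hypothetical names, the input is a single flattened feature channel, and the learnable per-region scale and shift are omitted.

```python
from statistics import mean, pstdev

def region_normalize(values, labels, eps=1e-5):
    """Normalize a flattened feature channel separately within each labeled region."""
    out = [0.0] * len(values)
    for region in set(labels):
        # Positions that belong to this region.
        idx = [i for i, lab in enumerate(labels) if lab == region]
        picked = [values[i] for i in idx]
        mu = mean(picked)
        sigma = pstdev(picked)
        for i in idx:
            out[i] = (values[i] - mu) / (sigma + eps)
    return out
```

With K=2 and labels taken from the binary mask, this reduces to normalizing the masked and unmasked regions independently.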

Details for training Places2 dataset and irregular mask

Hi, thank you very much for your work.
I have some questions about the training details for the Places2 dataset:

1) Did you use all 365 scenes for training? How many epochs did you train until convergence?
2) Did you use the 12000 irregular masks for both training and testing? I noticed that there is a separate irregular mask dataset for training.
Thank you very much for your time.

bad results

@geekyutao Hi, when I run the test code from your release, I find it often generates bad results, as below.

The EdgeConnect results are below:

The PConv results are below:

I also tested 10000 samples from the Places2 validation set; the PSNR, SSIM and FID are below:

Method        PSNR    SSIM   FID
RN            23.446  0.799  28.419
edge_connect  25.046  0.847  11.402
Pconv         24.819  0.839  7.872

Surprisingly, RN is much worse than EC and PConv! Did I make a mistake somewhere to get such bad results?
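
When comparing numbers like these, it helps to confirm that every method is scored with the same PSNR definition and the same pixel value range. The following is a minimal sketch of the standard definition, assuming pixel values in [0, max_value]; the function name and the flat-list input are illustrative and not taken from this repository.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Scoring images scaled to [0, 1] with `max_value=255.0`, or the reverse, shifts the result by a large constant, so the range is worth checking first.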

No result

Batch Region Norm or Instance Region Norm?

Hi yutao,

Thanks for sharing your inspiring work. I have a question about the specific implementation of Region Norm. As you mention in your paper, Region Norm should be a generalization of Instance Norm. However, in your implementation, RN is still based on the distribution of masked/unmasked regions across the whole batch. Is there an error in your implementation, or did you find that performance based on instance regions is worse than that based on batch regions?

question

Can you tell me how many epochs you trained for on each of the three datasets?

About output from the pretrained model

Hey Hi,
I am writing to you as a user of your paper's pretrained model, specifically the model described in the code you provided. First of all, I would like to express my appreciation for your work and the effort you have put into developing this model.

Recently, I have been utilizing your pretrained model for a specific task in my research project. While I acknowledge the potential and effectiveness of the model, I must inform you that I am not getting correct output.

I have attached my code and the output. Please look into it and let me know if any changes are needed in the code.

`from __future__ import print_function
import argparse
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from PIL import Image, ImageOps
from torchvision.transforms.functional import to_pil_image
from models import InpaintingModel
import lpips
import os
from skimage.metrics import peak_signal_noise_ratio as compare_psnr
from skimage.metrics import structural_similarity as compare_ssim

loss_fn_alex = lpips.LPIPS(net='alex')

# Training settings
parser = argparse.ArgumentParser(description='PyTorch Video Inpainting with Background Auxiliary')
parser.add_argument('--bs', type=int, default=256, help='training batch size')
parser.add_argument('--lr', type=float, default=0.001, help='Learning Rate. Default=0.001')
parser.add_argument('--cpu', default=False, action='store_true', help='Use CPU to test')
parser.add_argument('--threads', type=int, default=1, help='number of threads for data loader to use')
parser.add_argument('--seed', type=int, default=67454, help='random seed to use. Default=123')
parser.add_argument('--gpus', default=0, type=int, help='number of GPUs')
parser.add_argument('--threshold', type=float, default=0.8)
parser.add_argument('--img_path', type=str, default="D:/FYP/input_image/input.jpg")
parser.add_argument('--mask_path', type=str, default="D:/FYP/input_mask/00015.png")
parser.add_argument('--model', default='C:/FYP/RN-master/pretrained_model/x_admin.cluster.localRN-0.8RN-Net_bs_14_epoch_3.pth', help='pretrained base model')
parser.add_argument('--save', default=True, action='store_true', help='If save test images')
parser.add_argument('--save_path', type=str, default='C:/FYP/RN-master/output')
parser.add_argument('--input_size', type=int, default=512, help='input image size')
parser.add_argument('--l1_weight', type=float, default=1.0)
parser.add_argument('--gan_weight', type=float, default=.1)

opt = parser.parse_args()


def evaluate_single_image(image_path, mask_path, save=False, save_path=None):
    # Load the model
    device = torch.device('cpu' if opt.cpu else 'cuda')
    model = InpaintingModel(g_lr=opt.lr, d_lr=(0.1 * opt.lr), l1_weight=opt.l1_weight, gan_weight=opt.gan_weight, iter=0, threshold=opt.threshold)
    model.load_state_dict(torch.load(opt.model, map_location=device), strict=False)

    pred, avg_lpips, mask, gt = eval_single_image(image_path, mask_path, model)

    if save:
        image = Image.open(opt.img_path)
        mask = Image.open(opt.mask_path)
        inverted_mask = ImageOps.invert(mask)
        resized_mask = inverted_mask.resize(image.size, resample=Image.BILINEAR)
        masked_image = Image.composite(image, Image.new('RGB', image.size), resized_mask)
        masked_image.save(r'C:/FYP/RN-master/output/input.png')
        save_img(save_path, 'mask', mask)
        save_img(save_path, 'output', pred)
        save_img(save_path, 'gt', gt)

    return avg_lpips


def eval_single_image(image_path, mask_path, model):
    model.eval()
    model.generator.eval()
    avg_lpips = 0.

    with torch.no_grad():
        gt = np.array(Image.open(image_path))
        mask = np.array(Image.open(mask_path))

        gt = torch.from_numpy(gt.transpose((2, 0, 1))).float().unsqueeze(0) / 255.0
        mask = torch.from_numpy(mask).unsqueeze(0).unsqueeze(0)  # Add extra dimensions for channel and batch

        # Resize the input tensor and mask tensor to match the expected input size
        gt = F.interpolate(gt, size=(opt.input_size, opt.input_size), mode='bilinear', align_corners=False)
        mask = F.interpolate(mask, size=(opt.input_size, opt.input_size), mode='nearest')

        gt, mask = Variable(gt), Variable(mask)

        prediction = model.generator(gt, mask)
        prediction = prediction * mask + gt * (1 - mask)
        avg_lpips = loss_fn_alex(prediction, gt).mean().item()

    return prediction, avg_lpips, mask, gt


def save_img(path, name, img):
    # img (H,W,C) or (H,W) np.uint8 or torch tensor
    if isinstance(img, torch.Tensor):
        img = to_pil_image(img.squeeze().cpu())
    img.save(os.path.join(path, name + '.png'))


def main():
    torch.manual_seed(opt.seed)

    # Checking for GPU availability
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda" if use_cuda and not opt.cpu else "cpu")

    # Evaluate single image
    avg_lpips = evaluate_single_image(opt.img_path, opt.mask_path, save=opt.save, save_path=opt.save_path)

    print("Average LPIPS: {:.4f}".format(avg_lpips))


if __name__ == '__main__':
    main()
`

PermissionError: [Errno 13] Permission denied: '/data'

Hi, when I try to train on my own dataset, it says:
"File "/home/xzx/anaconda3/lib/python3.6/os.py", line 220, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'"
Thanks for your reply.
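
The traceback indicates that the script tried to create a directory at `/data`, which an unprivileged user normally cannot do. A possible workaround is to point the output location at a directory you own. The sketch below assumes the path can be changed in the training script, which the thread does not confirm; `ensure_output_dir` and the fallback location are hypothetical and not part of this repository.

```python
import os

def ensure_output_dir(preferred, fallback=None):
    """Create and return `preferred`, or `fallback` if `preferred` is not writable."""
    # Hypothetical default fallback under the user's home directory.
    fallback = fallback or os.path.join(os.path.expanduser("~"), "rn_output")
    for path in (preferred, fallback):
        try:
            os.makedirs(path, exist_ok=True)
            return path
        except PermissionError:
            continue  # try the next candidate
    raise PermissionError("no writable output directory found")
```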

About pretrained_model

Hello, what data was the pretrained model trained on?

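Question about inpainting results

@I'd like to ask why the results are so poor when I test after training with this code. It looks as if the masked region is barely processed at all. What could cause this? I am using the CelebA-HQ dataset. Because of hardware limits, I trained for 40 epochs with the batch size set to 4; the source code uses a batch size of 14, but I don't think that makes much difference. Could something else be causing this? #31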

About loss function

In the code, L1 loss and adversarial loss are used to train.
But in the paper, it's "We apply the same discriminators (PatchGAN (Isola et al. 2017;Zhu et al. 2017)) and loss functions (reconstruction loss, adversarial loss, perceptual loss and style loss) of the original backbone model to our model."
Should we add perceptual loss and style loss to train your model? Or just L1 loss and adversarial loss are enough to get the numerical metric data?
Thank you for your time.

Question about losses

Hello, I do not find the style and perceptual losses in your code, which play an important role in the results. Can you give me some suggestions?

RuntimeError: The size of tensor a (64) must match the size of tensor b (3) at non-singleton dimension 1

Nice job! @geekyutao could you please tell me how to fix the following problem? Thanks a lot.

Traceback (most recent call last):
File "main.py", line 286, in <module>
train(epoch)
File "main.py", line 83, in train
prediction = model.generator(gt, mask)
File "/home/asiainfo/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
result = self.forward(*input, **kwargs)
File "/home/wanghz/rn/networks.py", line 76, in forward
x = self.encoder(x, mask)
File "/home/wanghz/rn/networks.py", line 60, in encoder
x = self.encoder_in1(x, mask)
File "/home/asiainfo/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
result = self.forward(*input, **kwargs)
File "/home/wanghz/rn/rn.py", line 64, in forward
rn_x = self.rn(x, mask)
File "/home/asiainfo/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
result = self.forward(*input, **kwargs)
File "/home/wanghz/rn/rn.py", line 18, in forward
rn_foreground_region = self.rn(x * label, label)
RuntimeError: The size of tensor a (64) must match the size of tensor b (3) at non-singleton dimension 1
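
The error comes from the broadcasting rule for element-wise products: two dimensions are compatible only if they are equal or one of them is 1. The sketch below reimplements that rule on plain shape tuples to show why a 3-channel mask cannot multiply a 64-channel feature map while a 1-channel mask can. The helper `broadcast_shape` and the example shapes are illustrative, and reducing the mask to a single channel is offered as an assumption about the fix, not as the maintainer's answer.

```python
def broadcast_shape(a, b):
    """Return the broadcast result shape of two shapes, or raise ValueError."""
    result = []
    # Compare dimensions from the trailing end; missing leading dims count as 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"size {da} does not match size {db} at dimension {-i}")
        result.append(max(da, db))
    return tuple(reversed(result))

# Hypothetical shapes: (batch, channels, height, width).
features = (14, 64, 256, 256)
one_channel_mask = (14, 1, 256, 256)
three_channel_mask = (14, 3, 256, 256)
```

Calling `broadcast_shape(features, one_channel_mask)` succeeds, while `broadcast_shape(features, three_channel_mask)` raises, mirroring the 64-versus-3 mismatch in the traceback.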

load model error

Hi @geekyutao, thanks for sharing. I've followed the description in Readme.md; however, when running eval.py it seems that the model is loaded incorrectly. I used the pretrained model x_admin.cluster.localRN-0.8RN-Net_bs_14_epoch_3.pth, and the error is shown below:

Traceback (most recent call last):
File "/tmp/pycharm_project_13/eval.py", line 200, in <module>
model.load_state_dict(pretained_model)
File "/root/anaconda3/envs/pytorch040/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for InpaintingModel:
Missing key(s) in state_dict: "discriminator.module.conv1.0.weight", "discriminator.module.features.0.weight", "discriminator.module.conv2.0.weight", "discriminator.module.conv3.0.weight", "discriminator.module.conv4.0.weight", "discriminator.module.conv5.0.weight".
Unexpected key(s) in state_dict: "discriminator.module.conv1.0.weight_v", "discriminator.module.features.0.weight_v", "discriminator.module.conv2.0.weight_v", "discriminator.module.conv3.0.weight_v", "discriminator.module.conv4.0.weight_v", "discriminator.module.conv5.0.weight_v".

I have no idea where the problem is. Could you help me? Looking forward to your reply.

I could only find the adversarial loss in the code. Where are the other loss functions?

add one more channel

Hello author, thank you for your work! If edge information were added as one more input channel to the network, would it bring an improvement?

the loss is None

When I train on my data with your code, the loss becomes None at epoch 2. I can't find the problem. Please give me a hand.

Torch version

Hello, I installed torch 1.7.0, but I get an error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 512, 4, 4]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True). I changed inplace=True to inplace=False, but the error persists. Can you give me some advice?

Questions about training

How to construct evaluating image-mask pairs?

Hi Yu, thank you for sharing your code.
I have a simple question: how should the testing cases be constructed? For example, the Places2 validation set has 36500 images, but there are only 12000 masks. In dataset.py, we take the index-th image from the image set (0~36499), while in the testing mode of the Dataset class you just take the index-th mask in self.mask_data. This causes errors when loading masks, after which the first image sample is returned.

try:
    item = self.load_item(index)
except:
    print('loading error: ' + self.data[index])
    item = self.load_item(0)

With this code, you will only test 12000 images from Places2. Am I right?

Hoping for your reply!
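
One possible way to evaluate every image when there are fewer masks than images is to cycle through the masks. The sketch below shows that idea only; it is not the repository's behavior, and `pair_indices` is a hypothetical helper.

```python
def pair_indices(num_images, num_masks):
    """Pair each image index with a mask index, cycling through the masks."""
    if num_masks <= 0:
        raise ValueError("at least one mask is required")
    return [(i, i % num_masks) for i in range(num_images)]

# With the counts from the question, every one of the 36500 images gets a mask.
pairs = pair_indices(36500, 12000)
```

Here image 12000 is paired with mask 0 again instead of raising a loading error and falling back to the first sample.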

About your pretrained model

I can't load your pretrained model. I found that it has the key "discriminator.module.features.0.weight_orig", but your discriminator network doesn't have a "features" module. Can you help me solve this problem?
