Comments (20)
Hi, that looks strange to me. Have you adapted the mask options to the different image size? When do these artefacts start to show?
from deepfillv2-pytorch.
@nipponjo Thanks for the quick response!
Yes. I have adapted the mask options. This is my training configuration file.
# resume training
model_restore: '' # start new training
#model_restore: 'checkpoints/celebahq/model_exp0/states.pth'
# dataloading
dataset_path: '/home/ohayonguy/research/datasets/celeba/img_align_celeba_splits/train/img_align_celeba'
scan_subdirs: True # Are the images organized in subfolders?
random_crop: False # Set to false when dataset is 'celebahq', meaning only resize the images to img_shapes, instead of crop img_shapes from a larger raw image. This is useful when you train on images with different resolutions like places2. In these cases, please set random_crop to true.
random_horizontal_flip: False
batch_size: 16
num_workers: 10
# training
tb_logging: True # Enable Tensorboard logging?
log_dir: 'tb_logs/celebahq/model_exp0' # Tensorboard logging folder
checkpoint_dir: 'checkpoints/celebahq/model_exp0' # Checkpoint folder
use_cuda_if_available: True
random_seed: False # options: False | <int>
g_lr: 0.0001 # lr for Adam optimizer (generator)
g_beta1: 0.5 # beta1 for Adam optimizer (generator)
g_beta2: 0.999 # beta2 for Adam optimizer (generator)
d_lr: 0.0001 # lr for Adam optimizer (discriminator)
d_beta1: 0.5 # beta1 for Adam optimizer (discriminator)
d_beta2: 0.999 # beta2 for Adam optimizer (discriminator)
max_iters: 1000000 # number of batches to train the models
# logging
viz_max_out: 10 # number of images from batch
# if optional: set to False to deactivate
print_iter: 100 # write losses to console and tensorboard
save_checkpoint_iter: 100 # save checkpoint file and overwrite last one
save_imgs_to_tb_iter: 500 # (optional) add image grids to tensorboard
save_imgs_to_dics_iter: 500 # (optional) save image grids in checkpoint folder
save_cp_backup_iter: 5000 # (optional) save checkpoint file named states_{n_iter}.pth
img_shapes: [128, 128, 3]
# mask options
height: 50
width: 50
max_delta_height: 32
max_delta_width: 32
vertical_margin: 0
horizontal_margin: 0
# loss
gan_loss: 'hinge' # options: 'hinge', 'ls'
gan_loss_alpha: 1.
ae_loss: True
l1_loss_alpha: 1.
Does this seem okay to you?
These artifacts start to show right in the beginning:
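For reference, here is a minimal sketch of how mask options like these are typically turned into a random bounding box (parameter names follow the config above; the repo's actual helper may differ, and max_delta_height/width are usually used to randomly shrink the masked area inside this box):

```python
import random

def random_bbox(img_h, img_w, height, width,
                vertical_margin=0, horizontal_margin=0):
    """Sample the top-left corner of a height x width mask region,
    keeping the given margins to the image border."""
    max_top = img_h - vertical_margin - height
    max_left = img_w - horizontal_margin - width
    top = random.randint(vertical_margin, max_top)
    left = random.randint(horizontal_margin, max_left)
    return top, left, height, width

# e.g. for 128x128 images with the mask options above
top, left, h, w = random_bbox(128, 128, 50, 50, 0, 0)
assert 0 <= top <= 78 and 0 <= left <= 78
```

With height/width of 50 on a 128px image, the mask can land anywhere, which is why adapting these values to the image size matters.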
The config looks okay, though I would also reduce max_delta_height/width. Do you use batch or instance norm? I had some complications with those. When I tested with 128px, I also removed some layers in the generator and one in the discriminator.
I am using the same networks that are provided in this repo, without any changes.
I took another look at the discriminator architecture. Why would we need to remove a layer from the discriminator and some layers from the generator, when the image size changes?
So you say that 128x128 images worked for you in some experiment you performed?
With an image size of 128px, the discriminator with 6 layers outputs 2x2 feature maps (instead of 4x4), but I don't think that should be a problem. I think it worked in some experiments, but I am currently unable to find them. I will try with your config tomorrow (14.1.).
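The 2x2 vs 4x4 figure can be checked quickly; assuming kernel size 5, stride 2, padding 2 in each discriminator layer (a common choice in this architecture, not verified against the repo), six layers give:

```python
def conv_out(size, ksize=5, stride=2, pad=2):
    # output spatial size of one stride-2 conv layer
    return (size + 2 * pad - ksize) // stride + 1

for img in (256, 128):
    s = img
    for _ in range(6):  # 6 discriminator conv layers
        s = conv_out(s)
    print(img, '->', f'{s}x{s}')
# 256 -> 4x4
# 128 -> 2x2
```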
Thanks a lot!!
Hi, I made a few tests and here is what I think:
With 128px images, in general, I would recommend making these changes:
mask options:
height: 64
width: 64
max_delta_height: 16
max_delta_width: 16
remove these layers:
conv_bn5 (CoarseGenerator)
conv_conv_bn5 and ca_conv_bn5 (FineGenerator)
conv6 (Discriminator)
However, for this face dataset, I think the 32x32 bottleneck is too narrow to produce good results.
With 256px images, the bottleneck resolution is 64x64, which makes it much easier to preserve detailed information.
One can remove one up-/downsampling stage (that's what I did), but that won't save much compute compared to using the 256px images.
Alternatively, I assume that skip-connections (as in U-Net for example) could help in this case.
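A hedged sketch of what such a skip connection could look like; the module and variable names here are made up for illustration and are not the repo's actual code:

```python
import torch
import torch.nn as nn

class SkipExample(nn.Module):
    """Toy encoder/decoder with one U-Net-style skip at the 64x64 stage."""
    def __init__(self, cnum=32):
        super().__init__()
        self.down1 = nn.Conv2d(3, cnum, 3, stride=2, padding=1)          # 128 -> 64
        self.down2 = nn.Conv2d(cnum, 2 * cnum, 3, stride=2, padding=1)   # 64 -> 32
        self.up1 = nn.ConvTranspose2d(2 * cnum, cnum, 4, stride=2, padding=1)  # 32 -> 64
        # after concatenating the skip, the channel count doubles
        self.up2 = nn.ConvTranspose2d(2 * cnum, 3, 4, stride=2, padding=1)     # 64 -> 128

    def forward(self, x):
        f64 = torch.relu(self.down1(x))     # 64x64 features kept for the skip
        f32 = torch.relu(self.down2(f64))
        u64 = torch.relu(self.up1(f32))
        u64 = torch.cat([u64, f64], dim=1)  # skip connection
        return self.up2(u64)

out = SkipExample()(torch.randn(1, 3, 128, 128))
print(out.shape)  # torch.Size([1, 3, 128, 128])
```

The skip lets the decoder reuse the 64x64 encoder features directly instead of forcing everything through the 32x32 bottleneck.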
Have you tried to run the code with these changes, and it worked for you?
It does not seem to me like an architecture issue. Why would these artifacts appear in the first place? And why would removing a layer have such a huge effect on the results?
Hi, yes I tried that, but the training became unstable quickly (orange loss curve). The architecture was designed for 256px images, so I wouldn't expect it to just work with 128px without problems. It seems to me that the generator can't keep up with the discriminator. Removing (unnecessary) layers can make optimization easier, especially since there are no skip-connections or normalization layers in the net. I trained with these changes (red graph), but found that the generator still can't keep up after a while. I assume that learning an upsampling from 32px to 128px is considerably harder than from 64px to 256px, as a 64px face still shows some important details. I also trained the net with an added skip connection between the down-/upsampled 64x64 feature maps (shown here) and got some more reasonable results (2nd orange graph).
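For reference, the hinge GAN loss named in the config is the standard formulation below (the repo's exact reduction and signs may differ; this is a sketch):

```python
import torch
import torch.nn.functional as F

def d_hinge_loss(d_real, d_fake):
    # discriminator: push real scores above +1 and fake scores below -1
    return F.relu(1. - d_real).mean() + F.relu(1. + d_fake).mean()

def g_hinge_loss(d_fake):
    # generator: raise the discriminator's score on fakes
    return -d_fake.mean()
```

When people say "the gan loss overpowers the ae loss", it means the generator ends up optimizing mostly against `g_hinge_loss` while the L1 reconstruction term stalls.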
I trained with 256x256 images from celeba, with the original config file you provided, and the issue persists.
What version of PyTorch are you using?
I used version 1.10. What does your ae loss curve look like?
Around 16k iterations, images seem okay! It looks like some sort of mode collapse? Or too high a learning rate? What do you think causes this instability?
I am also using torch 1.10 btw. Lucky me. All deepfillv2 repos use either very old torch or very old tensorflow. Not sure why.
That's strange. When I trained, it looked like this. I don't think I changed anything, but I will try another run and see if the beginning is different.
Are you trying to work with the exact same dataset? Regular CelebA?
Here is my config:
# resume training
model_restore: '' # start new training
#model_restore: 'checkpoints/celebahq/model_exp0/states.pth'
# dataloading
dataset_path: '/home/ohayonguy/research/datasets/celeba/img_align_celeba_splits/train/img_align_celeba'
scan_subdirs: True # Are the images organized in subfolders?
random_crop: False # Set to false when dataset is 'celebahq', meaning only resize the images to img_shapes, instead of crop img_shapes from a larger raw image. This is useful when you train on images with different resolutions like places2. In these cases, please set random_crop to true.
random_horizontal_flip: False
batch_size: 16
num_workers: 10
# training
tb_logging: True # Enable Tensorboard logging?
log_dir: 'tb_logs/celebahq/model_exp0' # Tensorboard logging folder
checkpoint_dir: 'checkpoints/celebahq/model_exp0' # Checkpoint folder
use_cuda_if_available: True
random_seed: False # options: False | <int>
g_lr: 0.0001 # lr for Adam optimizer (generator)
g_beta1: 0.5 # beta1 for Adam optimizer (generator)
g_beta2: 0.999 # beta2 for Adam optimizer (generator)
d_lr: 0.0001 # lr for Adam optimizer (discriminator)
d_beta1: 0.5 # beta1 for Adam optimizer (discriminator)
d_beta2: 0.999 # beta2 for Adam optimizer (discriminator)
max_iters: 1000000 # number of batches to train the models
# logging
viz_max_out: 10 # number of images from batch
# if optional: set to False to deactivate
print_iter: 100 # write losses to console and tensorboard
save_checkpoint_iter: 100 # save checkpoint file and overwrite last one
save_imgs_to_tb_iter: 500 # (optional) add image grids to tensorboard
save_imgs_to_dics_iter: 500 # (optional) save image grids in checkpoint folder
save_cp_backup_iter: 5000 # (optional) save checkpoint file named states_{n_iter}.pth
img_shapes: [256, 256, 3]
# mask options
height: 128
width: 128
max_delta_height: 32
max_delta_width: 32
vertical_margin: 0
horizontal_margin: 0
# loss
gan_loss: 'hinge' # options: 'hinge', 'ls'
gan_loss_alpha: 1.
ae_loss: True
l1_loss_alpha: 1.
I used CelebA-HQ.
I see. I am using the regular CelebA. Maybe that causes the difference?
That's hard to tell.
I will give it a shot on CelebA-HQ, although it seems odd to me that an algorithm would work on one dataset of faces but not on another.
It seems like there is some problem with the spectral_norm in the discriminator. When I trained the model on the dataset, I used the Conv2DSpectralNorm layer, which I implemented as in the original implementation. Now that I have tested both variants, the one with spectral_norm from torch.nn.utils.parametrizations caused instability (the gan loss overpowers the ae loss). Maybe the model is so fragile that this small difference causes a problem. You can just switch them out:
in networks.py -> DConv
self.conv_sn = Conv2DSpectralNorm(cnum_in, cnum_out, ksize, stride, padding)
#self.conv_sn = spectral_norm(nn.Conv2d(cnum_in, cnum_out, ksize, stride, padding))
Ok. Will give it a shot. Thanks!