genforce / interfacegan Goto Github PK

View Code? Open in Web Editor NEW

1.5K 1.5K 283.0 13.78 MB

[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

Home Page: https://genforce.github.io/interfacegan/

License: MIT License

Python 100.00%

interfacegan's Introduction

GenForce Lib for Generative Modeling

An efficient PyTorch library for deep generative modeling. May the Generative Force (GenForce) be with You.

Updates

Encoder Training: We support training encoders on top of pre-trained GANs for GAN inversion.
Model Converters: You can easily migrate your already started projects to this repository. Please check here for more details.

Highlights

Distributed training framework.
Fast training speed.
Modular design for prototyping new models.
Model zoo containing a rich set of pretrained GAN models, with Colab live demo to play.

Installation

Create a virtual environment via conda.

conda create -n genforce python=3.7
conda activate genforce

Install cuda and cudnn. (We use CUDA 10.0 in case you would like to use TensorFlow 1.15 for model conversion.)
```
conda install cudatoolkit=10.0 cudnn=7.6.5
```
Install torch and torchvision.
```
pip install torch==1.7 torchvision==0.8
```
Install requirements
```
pip install -r requirements.txt
```

Quick Demo

We provide a quick training demo, scripts/stylegan_training_demo.py, which allows to train StyleGAN on a toy dataset (500 animeface images with 64 x 64 resolution). Try it via

./scripts/stylegan_training_demo.sh

We also provide an inference demo, synthesize.py, which allows to synthesize images with pre-trained models. Generated images can be found at work_dirs/synthesis_results/. Try it via

python synthesize.py stylegan_ffhq1024

You can also play the demo at Colab.

Play with GANs

Test

Pre-trained models can be found at model zoo.

On local machine:

GPUS=8
CONFIG=configs/stylegan_ffhq256_val.py
WORK_DIR=work_dirs/stylegan_ffhq256_val
CHECKPOINT=checkpoints/stylegan_ffhq256.pth
./scripts/dist_test.sh ${GPUS} ${CONFIG} ${WORK_DIR} ${CHECKPOINT}

Using slurm:

CONFIG=configs/stylegan_ffhq256_val.py
WORK_DIR=work_dirs/stylegan_ffhq256_val
CHECKPOINT=checkpoints/stylegan_ffhq256.pth
GPUS=8 ./scripts/slurm_test.sh ${PARTITION} ${JOB_NAME} \
    ${CONFIG} ${WORK_DIR} ${CHECKPOINT}

Train

All log files in the training process, such as log message, checkpoints, synthesis snapshots, etc, will be saved to the work directory.

On local machine:

GPUS=8
CONFIG=configs/stylegan_ffhq256.py
WORK_DIR=work_dirs/stylegan_ffhq256_train
./scripts/dist_train.sh ${GPUS} ${CONFIG} ${WORK_DIR} \
    [--options additional_arguments]

Using slurm:

CONFIG=configs/stylegan_ffhq256.py
WORK_DIR=work_dirs/stylegan_ffhq256_train
GPUS=8 ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} \
    ${CONFIG} ${WORK_DIR} \
    [--options additional_arguments]

Play with Encoders for GAN Inversion

Train

On local machine:

GPUS=8
CONFIG=configs/stylegan_ffhq256_encoder_y.py
WORK_DIR=work_dirs/stylegan_ffhq256_encoder_y
./scripts/dist_train.sh ${GPUS} ${CONFIG} ${WORK_DIR} \
    [--options additional_arguments]

Using slurm:

CONFIG=configs/stylegan_ffhq256_encoder_y.py
WORK_DIR=work_dirs/stylegan_ffhq256_encoder_y
GPUS=8 ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} \
    ${CONFIG} ${WORK_DIR} \
    [--options additional_arguments]

Contributors

Member	Module
Yujun Shen	models and running controllers
Yinghao Xu	runner and loss functions
Ceyuan Yang	data loader
Jiapeng Zhu	evaluation metrics
Bolei Zhou	cheerleader

NOTE: The above form only lists the person in charge for each module. We help each other a lot and develop as a TEAM.

We welcome external contributors to join us for improving this library.

License

The project is under the MIT License.

Acknowledgement

We thank PGGAN, StyleGAN, StyleGAN2, StyleGAN2-ADA for their work on high-quality image synthesis. We thank IDInvert and GHFeat for their contribution to GAN inversion. We also thank MMCV for the inspiration on the design of controllers.

BibTex

We open source this library to the community to facilitate the research of generative modeling. If you do like our work and use the codebase or models for your research, please cite our work as follows.

@misc{genforce2020,
  title =        {GenForce},
  author =       {Shen, Yujun and Xu, Yinghao and Yang, Ceyuan and Zhu, Jiapeng and Zhou, Bolei},
  howpublished = {\url{https://github.com/genforce/genforce}},
  year =         {2020}
}

interfacegan's People

Stargazers

Watchers

Forkers

ak9250 sedatli liuguoyou kp-forks aixander jdc08161063 wenqingchu chaucergit wangrun amir22010 peterzhousz keyky chaoso tylerwilliams trendingtechnology jangocheng hugeinteger kklemon haimin777 mazzzystar jurph peterzs xdmds ignaciosan22 fastcode3d kreativai yes7rose chenzehao0915 aakash2602 akoz csjoax zzxzzx123 zhyj3038 renmengyuan binbomb lotayou gayaniindunil oliviaperryman ahuirecome mann2107 immocat jack12xl berylsheep-up mombin gagandaroach wyhsirius shiyingzhang90 apollyon125 yongxinw jjandnn giovannymoncayo diegosiqueir4 baldrlector xenonlamb ideaplexus jochemstoel chenchaoxu greendream182 jacobwjs ngurkan hologerry dorniwang tony109060581 duyuankai1992 bruinxiong aashishrai3799 anjanau justfitting shuxiangguo neka-nat zeta1999 yukimiyazaki xingx001 duanexiao youngadvance vcip2015 vmorish bruce-chappell bencoster amrmkayid felixshiyong mornydew dmschauer zhaoyk1986 xrosliang cv-ip naomi-ken-korem gouxiayibu shimashahfar april-cai xiusdk 1170500804 sh2120170809 nivratti minha12 miss1997yuan pandadreamer brechetp jdily yangtong1989

interfacegan's Issues

RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 09:17:26,315][INFO] Initializing generator.
[2019-08-12 09:17:26,440][WARNING] No pre-trained model will be loaded!
[2019-08-12 09:17:27,728][INFO] Preparing boundary.
[2019-08-12 09:17:27,731][INFO] Preparing latent codes.
[2019-08-12 09:17:27,731][INFO] Sample latent codes randomly.
[2019-08-12 09:17:27,732][INFO] Editing {total_num} samples.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 112, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 98, in main
outputs = model.easy_synthesize(interpolations_batch)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 230, in easy_synthesize
outputs = self.synthesize(latent_codes, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 117, in synthesize
images = self.model(zs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 127, in forward
return super().forward(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 243, in forward
x = self.conv(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

Process finished with exit code 1

support for stylegan2

do you guys have any plan to support stylegan2?

Training with stylegan2 model

Hi. Your results are good. Currently i am working pose transformation. Your repository helps me alot to go step forward.
Currently stylegan2 produced generated images after mixing are too divergent and good results than previous.

Can you tell the process of how to train stylegan2 model to get boundaries??

I changed following parameter values based on stylegan2-ffhq-config-f.pkl.

_RESOLUTIONS_TO_CHANNELS = {
8: [512, 512, 512],
16: [512, 512, 512, 512],
32: [512, 512, 512, 512, 512],
64: [512, 512, 512, 512, 512, 256],
128: [512, 512, 512, 512, 512, 256, 128],
256: [512, 512, 512, 512, 512, 256, 128, 64],
512: [512, 512, 512, 512, 512, 256, 128, 64, 32],
1024: [512, 512, 512, 512, 512, 256, 128, 64, 32, 16],
}

pylint: disable=line-too-long

Variable mapping from pytorch model to official tensorflow model.

_STYLEGAN_PTH_VARS_TO_TF_VARS = {
# Statistic information of disentangled latent feature, w.
'truncation.w_avg':'dlatent_avg', # [512]

# Noises.
'synthesis.layer0.epilogue.apply_noise.noise': 'noise0',    # [1, 1, 4, 4]
'synthesis.layer1.epilogue.apply_noise.noise': 'noise1',    # [1, 1, 8, 8]
'synthesis.layer2.epilogue.apply_noise.noise': 'noise2',    # [1, 1, 8,8]
'synthesis.layer3.epilogue.apply_noise.noise': 'noise3',    # [1, 1, 16, 16]
'synthesis.layer4.epilogue.apply_noise.noise': 'noise4',    # [1, 1, 16, 16]
'synthesis.layer5.epilogue.apply_noise.noise': 'noise5',    # [1, 1, 32, 32]
'synthesis.layer6.epilogue.apply_noise.noise': 'noise6',    # [1, 1, 32, 32]
'synthesis.layer7.epilogue.apply_noise.noise': 'noise7',    # [1, 1, 64, 64]
'synthesis.layer8.epilogue.apply_noise.noise': 'noise8',    # [1, 1, 64, 64]
'synthesis.layer9.epilogue.apply_noise.noise': 'noise9',    # [1, 1, 128, 128]
'synthesis.layer10.epilogue.apply_noise.noise': 'noise10',  # [1, 1, 128, 128]
'synthesis.layer11.epilogue.apply_noise.noise': 'noise11',  # [1, 1, 256, 256]
'synthesis.layer12.epilogue.apply_noise.noise': 'noise12',  # [1, 1, 256, 256]
'synthesis.layer13.epilogue.apply_noise.noise': 'noise13',  # [1, 1, 512, 512]
'synthesis.layer14.epilogue.apply_noise.noise': 'noise14',  # [1, 1, 512, 512]
'synthesis.layer15.epilogue.apply_noise.noise': 'noise15',  # [1, 1, 1024, 1024]
'synthesis.layer16.epilogue.apply_noise.noise': 'noise16',  # [1, 1, 1024, 1024]

# Mapping blocks.
'mapping.dense0.linear.weight': 'Dense0/weight',  # [512, 512]
'mapping.dense0.wscale.bias': 'Dense0/bias',  # [512]
'mapping.dense1.linear.weight': 'Dense1/weight',  # [512, 512]
'mapping.dense1.wscale.bias': 'Dense1/bias',  # [512]
'mapping.dense2.linear.weight': 'Dense2/weight',  # [512, 512]
'mapping.dense2.wscale.bias': 'Dense2/bias',  # [512]
'mapping.dense3.linear.weight': 'Dense3/weight',  # [512, 512]
'mapping.dense3.wscale.bias': 'Dense3/bias',  # [512]
'mapping.dense4.linear.weight': 'Dense4/weight',  # [512, 512]
'mapping.dense4.wscale.bias': 'Dense4/bias',  # [512]
'mapping.dense5.linear.weight': 'Dense5/weight',  # [512, 512]
'mapping.dense5.wscale.bias': 'Dense5/bias',  # [512]
'mapping.dense6.linear.weight': 'Dense6/weight',  # [512, 512]
'mapping.dense6.wscale.bias': 'Dense6/bias',  # [512]
'mapping.dense7.linear.weight': 'Dense7/weight',  # [512, 512]
'mapping.dense7.wscale.bias': 'Dense7/bias',  # [512]

# Synthesis blocks.

'synthesis.lod': 'lod' , #[]
'synthesis.add_constant': '4x4/Const/const',
'synthesis.layer0.conv.weight': '4x4/Conv/weight',
'synthesis.layer0.epilogue.mod_weight':'4x4/Conv/mod_weight',
'synthesis.layer0.epilogue.mod_bias':'4x4/Conv/mod_bias',
'synthesis.layer0.epilogue.apply_noise':'4x4/Conv/noise_strength',
'synthesis.layer0.epilogue.bias': '4x4/Conv/bias',
'synthesis.output0.conv.weight': '4x4/ToRGB/weight',   
'synthesis.output0.epilogue.mod_weight':'4x4/ToRGB/mod_weight',
'synthesis.output0.epilogue.mod_bias':'4x4/ToRGB/mod_bias',
'synthesis.output0.epilogue.bias':'4x4/ToRGB/bias',

'synthesis.layer1.conv.weight':'8x8/Conv0_up/weight',
'synthesis.layer1.epilogue.mod_weight':'8x8/Conv0_up/mod_weight',
'synthesis.layer1.epilogue.mod_bias':'8x8/Conv0_up/mod_bias',
'synthesis.layer1.epilogue.apply_noise':'8x8/Conv0_up/noise_strength',
'synthesis.layer1.epilogue.bias':'8x8/Conv0_up/bias',
'synthesis.layer2.conv.weight':'8x8/Conv1/weight',
'synthesis.layer2.epilogue.mod_weight':'8x8/Conv1/mod_weight',
'synthesis.layer2.epilogue.mod_bias':'8x8/Conv1/mod_bias',
'synthesis.layer2.epilogue.apply_noise':'8x8/Conv1/noise_strength',
'synthesis.layer2.epilogue.bias':'8x8/Conv1/bias',
'synthesis.output1.conv.weight': '8x8/ToRGB/weight',
'synthesis.output1.epilogue.mod_weight':'8x8/ToRGB/mod_weight',
'synthesis.output1.epilogue.mod_bias':'8x8/ToRGB/mod_bias',
'synthesis.output1.epilogue.bias':'8x8/ToRGB/bias',

'synthesis.layer3.conv.weight':'16x16/Conv0_up/weight',
'synthesis.layer3.epilogue.mod_weight':'16x16/Conv0_up/mod_weight',
'synthesis.layer3.epilogue.mod_bias':'16x16/Conv0_up/mod_bias',
'synthesis.layer3.epilogue.apply_noise':'16x16/Conv0_up/noise_strength',
'synthesis.layer3.epilogue.bias':'16x16/Conv0_up/bias',
'synthesis.layer4.conv.weight':'16x16/Conv1/weight',
'synthesis.layer4.epilogue.mod_weight':'16x16/Conv1/mod_weight',
'synthesis.layer4.epilogue.mod_bias':'16x16/Conv1/mod_bias',
'synthesis.layer4.epilogue.apply_noise':'16x16/Conv1/noise_strength',
'synthesis.layer4.epilogue.bias':'16x16/Conv1/bias',
'synthesis.output2.conv.weight': '16x16/ToRGB/weight',
'synthesis.output2.epilogue.mod_weight':'16x16/ToRGB/mod_weight',
'synthesis.output2.epilogue.mod_bias':'16x16/ToRGB/mod_bias',
'synthesis.output2.epilogue.bias':'16x16/ToRGB/bias',

'synthesis.layer5.conv.weight':'32x32/Conv0_up/weight',
'synthesis.layer5.epilogue.mod_weight':'32x32/Conv0_up/mod_weight',
'synthesis.layer5.epilogue.mod_bias':'32x32/Conv0_up/mod_bias',
'synthesis.layer5.epilogue.apply_noise':'32x32/Conv0_up/noise_strength',
'synthesis.layer5.epilogue.bias':'32x32/Conv0_up/bias',
'synthesis.layer6.conv.weight':'32x32/Conv1/weight',
'synthesis.layer6.epilogue.mod_weight':'32x32/Conv1/mod_weight',
'synthesis.layer6.epilogue.mod_bias':'32x32/Conv1/mod_bias',
'synthesis.layer6.epilogue.apply_noise':'32x32/Conv1/noise_strength',
'synthesis.layer6.epilogue.bias':'32x32/Conv1/bias',
'synthesis.output3.conv.weight': '32x32/ToRGB/weight',
'synthesis.output3.epilogue.mod_weight':'32x32/ToRGB/mod_weight',
'synthesis.output3.epilogue.mod_bias':'32x32/ToRGB/mod_bias',
'synthesis.output3.epilogue.bias':'32x32/ToRGB/bias',

'synthesis.layer7.conv.weight':'64x64/Conv0_up/weight',
'synthesis.layer7.epilogue.mod_weight':'64x64/Conv0_up/mod_weight',
'synthesis.layer7.epilogue.mod_bias':'64x64/Conv0_up/mod_bias',
'synthesis.layer7.epilogue.apply_noise':'64x64/Conv0_up/noise_strength',
'synthesis.layer7.epilogue.bias':'64x64/Conv0_up/bias',
'synthesis.layer8.conv.weight':'64x64/Conv1/weight',
'synthesis.layer8.epilogue.mod_weight':'64x64/Conv1/mod_weight',
'synthesis.layer8.epilogue.mod_bias':'64x64/Conv1/mod_bias',
'synthesis.layer8.epilogue.apply_noise':'64x64/Conv1/noise_strength',
'synthesis.layer8.epilogue.bias':'64x64/Conv1/bias',
'synthesis.output4.conv.weight': '64x64/ToRGB/weight',
'synthesis.output4.epilogue.mod_weight':'64x64/ToRGB/mod_weight',
'synthesis.output4.epilogue.mod_bias':'64x64/ToRGB/mod_bias',
'synthesis.output4.epilogue.bias':'64x64/ToRGB/bias',   

'synthesis.layer9.conv.weight':'128x128/Conv0_up/weight',
'synthesis.layer9.epilogue.mod_weight':'128x128/Conv0_up/mod_weight',
'synthesis.layer9.epilogue.mod_bias':'128x128/Conv0_up/mod_bias',
'synthesis.layer9.epilogue.apply_noise':'128x128/Conv0_up/noise_strength',
'synthesis.layer9.epilogue.bias':'128x128/Conv0_up/bias',
'synthesis.layer10.conv.weight':'128x128/Conv1/weight',
'synthesis.layer10.epilogue.mod_weight':'128x128/Conv1/mod_weight',
'synthesis.layer10.epilogue.mod_bias':'128x128/Conv1/mod_bias',
'synthesis.layer10.epilogue.apply_noise':'128x128/Conv1/noise_strength',
'synthesis.layer10.epilogue.bias':'128x128/Conv1/bias',
'synthesis.output5.conv.weight': '128x128/ToRGB/weight',
'synthesis.output5.epilogue.mod_weight':'128x128/ToRGB/mod_weight',
'synthesis.output5.epilogue.mod_bias':'128x128/ToRGB/mod_bias',
'synthesis.output5.epilogue.bias':'128x128/ToRGB/bias',

'synthesis.layer11.conv.weight':'256x256/Conv0_up/weight',
'synthesis.layer11.epilogue.mod_weight':'256x256/Conv0_up/mod_weight',
'synthesis.layer11.epilogue.mod_bias':'256x256/Conv0_up/mod_bias',
'synthesis.layer11.epilogue.apply_noise':'256x256/Conv0_up/noise_strength',
'synthesis.layer11.epilogue.bias':'256x256/Conv0_up/bias',
'synthesis.layer12.conv.weight':'256x256/Conv1/weight',
'synthesis.layer12.epilogue.mod_weight':'256x256/Conv1/mod_weight',
'synthesis.layer12.epilogue.mod_bias':'256x256/Conv1/mod_bias',
'synthesis.layer12.epilogue.apply_noise':'256x256/Conv1/noise_strength',
'synthesis.layer12.epilogue.bias':'256x256/Conv1/bias',
'synthesis.output6.conv.weight': '256x256/ToRGB/weight',
'synthesis.output6.epilogue.mod_weight':'256x256/ToRGB/mod_weight',
'synthesis.output6.epilogue.mod_bias':'256x256/ToRGB/mod_bias',
'synthesis.output6.epilogue.bias':'256x256/ToRGB/bias',

'synthesis.layer13.conv.weight':'512x512/Conv0_up/weight',
'synthesis.layer13.epilogue.mod_weight':'512x512/Conv0_up/mod_weight',
'synthesis.layer13.epilogue.mod_bias':'512x512/Conv0_up/mod_bias',
'synthesis.layer13.epilogue.apply_noise':'512x512/Conv0_up/noise_strength',
'synthesis.layer13.epilogue.bias':'512x512/Conv0_up/bias',
'synthesis.layer14.conv.weight':'512x512/Conv1/weight',
'synthesis.layer14.epilogue.mod_weight':'512x512/Conv1/mod_weight',
'synthesis.layer14.epilogue.mod_bias':'512x512/Conv1/mod_bias',
'synthesis.layer14.epilogue.apply_noise':'512x512/Conv1/noise_strength',
'synthesis.layer14.epilogue.bias':'512x512/Conv1/bias',
'synthesis.output7.conv.weight': '512x512/ToRGB/weight',
'synthesis.output7.epilogue.mod_weight':'512x512/ToRGB/mod_weight',
'synthesis.output7.epilogue.mod_bias':'512x512/ToRGB/mod_bias',
'synthesis.output7.epilogue.bias':'512x512/ToRGB/bias',

'synthesis.layer15.conv.weight':'1024x1024/Conv0_up/weight',
'synthesis.layer15.epilogue.mod_weight':'1024x1024/Conv0_up/mod_weight',
'synthesis.layer15.epilogue.mod_bias':'1024x1024/Conv0_up/mod_bias',
'synthesis.layer15.epilogue.apply_noise':'1024x1024/Conv0_up/noise_strength',
'synthesis.layer15.epilogue.bias':'1024x1024/Conv0_up/bias',
'synthesis.layer16.conv.weight':'1024x1024/Conv1/weight',
'synthesis.layer16.epilogue.mod_weight':'1024x1024/Conv1/mod_weight',
'synthesis.layer16.epilogue.mod_bias':'1024x1024/Conv1/mod_bias',
'synthesis.layer16.epilogue.apply_noise':'1024x1024/Conv1/noise_strength',
'synthesis.layer16.epilogue.bias':'1024x1024/Conv1/bias',
'synthesis.output8.conv.weight': '1024x1024/ToRGB/weight',
'synthesis.output8.epilogue.mod_weight':'1024x1024/ToRGB/mod_weight',
'synthesis.output8.epilogue.mod_bias':'1024x1024/ToRGB/mod_bias',
'synthesis.output8.epilogue.bias':'1024x1024/ToRGB/bias'

}
in stylegan2_generator_model.py (which is a copy of stylegan_generator_model.py
In stylegan_generator.py)

if 'ToRGB_lod' in tf_var_name:
lod = int(tf_var_name[len('ToRGB_lod')])
lod_shift = 10 - int(np.log2(self.resolution))
tf_var_name = tf_var_name.replace(f'{lod}', f'{lod - lod_shift}')
if tf_var_name not in tf_vars:
self.logger.debug(f'Variable {tf_var_name} does not exist in '
f'tensorflow model.')
is used.

Here in stylegan2 tf variable names are like '512x512/ToRGB/weight':
So, i have removed the above steps because it contains resolution size directly in variable name.

If i execute the codes by doing above modifications,it is saving that successfully saved the model. But at the time of loading giving error:
size mismatch for synthesis.layer1.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 8, 8]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 4]).
size mismatch for synthesis.layer3.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 16, 16]) from checkpoint, the shape in current model is torch.Size([1, 1, 8, 8]).
size mismatch for synthesis.layer5.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 32, 32]) from checkpoint, the shape in current model is torch.Size([1, 1, 16, 16]).
size mismatch for synthesis.layer7.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 32, 32]).
size mismatch for synthesis.layer8.conv.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]).
size mismatch for synthesis.layer8.epilogue.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for synthesis.layer9.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 1, 64, 64]).
size mismatch for synthesis.output4.conv.weight: copying a param with shape torch.Size([3, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for synthesis.layer10.epilogue.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for synthesis.layer11.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 1, 128, 128]).
size mismatch for synthesis.output5.conv.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
size mismatch for synthesis.layer12.epilogue.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for synthesis.layer13.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 256, 256]).
size mismatch for synthesis.output6.conv.weight: copying a param with shape torch.Size([3, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 64, 1, 1]).
size mismatch for synthesis.layer14.epilogue.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for synthesis.layer15.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 1024, 1024]) from checkpoint, the shape in current model is torch.Size([1, 1, 512, 512]).
size mismatch for synthesis.output7.conv.weight: copying a param with shape torch.Size([3, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
size mismatch for synthesis.layer16.epilogue.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for synthesis.output8.conv.weight: copying a param with shape torch.Size([3, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 16, 1, 1]).

Can you help me change the functions in stylegan2_generator_model.py

Thanks in Advance .
Regards,
SandhyaLaxmi Kanna

Cancled

created stylegan2.pth file. Further steps for training??????

I am able to execute stylegan2-pytorch and obtained stylegan2-ffhq-config-f.pt file . Attaching here:

https://drive.google.com/file/d/1fqBNoicjUG1dHlunQAQQQ9KVpN4x2_Yn/view?usp=sharing

Can any suggestions for how to use this pt file for training and getting boundaries???

Thanks in advance.

Originally posted by @sandhyalaxmiK in #31 (comment)

Trying to use a pretrained stylegan with gray-scaled images

Hi,

I am trying to use generate_data.py with a trained stylegan on gray-scaled images, i registered my model in model_settings.py and specified number of channels as 1 in there. However, it turns out that something in the original model is fixed size to 3, the error is:

RuntimeError: Error(s) in loading state_dict for StyleGANGeneratorModel:
size mismatch for synthesis.output6.bias: copying a param with shape torch.Size([1]) from checkpoint, the shape in current model is torch.Size([3]).

Do you have any idea, how can i solve this.

Thanks in advance.

More explanation about the calculation about the boundary on two condition

https://github.com/ShenYujun/InterFaceGAN/blob/b707e942187f464251f855c92f7009b8cf13bf03/utils/manipulator.py#L189

I found that the code also supports two condition boundary editing. But I am confused about the calculation. Can anyone make an explanation for me? Thanks a lot!

Requirments.txt and pre-trained models

Great contribution and git my friends, thanks so much for posting this open to the community to use.

Would like to ask several questions:

Any way you can provide a requirements.txt file in order for those who don't have the libraries to easily use your git?
Can you link to the appropriate pre-trained models and where to download them? I have tried using the PGGAN from the github and yet the edit.py isn't working.

Thanks again!

Omitting intercept

Hey!
First of all great job on the paper and the code!

I was wondering what was the motivation behind assuming that all boundaries (hyperplanes) pass through the origin? I would imagine this assumption might be restrictive (especially for general theory when the mapping network can produce distributions that are not centered around the origin).

Also related to this it seems like when you find the hyperplane in train_boundary via fitting of a linear model you do not enforce the intercept to be 0.

similar to this project?

Hi,

Thank you very much for your great work! But I find it is very similar with this project:
https://github.com/Puzer/stylegan-encoder

What do you think?

Thanks!

Questions about the truncation module.

I have a question on your implementation of the truncation module. Why do the first 9 channels of the W+ code are same? It looks like you separate W+ code into just 2 blocks but 18 blocks? This is strange for that in the official code each channel(I mean 18, not 512) of W+ code is different.

Does Interface GAN keep face id when editing faces?

Hi Yujun,
Thanks for the great work.
When I run edit.py, the output images do not keep the face ID, does it possible to keep the face ID (just like FaceID GAN/FaceFeat GAN) if one laten code is passed to generator?

Tencent wrote a patent ，using the same way in your paper

[**发明] CN202010341415.0 人脸图像编辑方法、装置和存储介质

bitch tencent

Can it make an adult into a baby？

Hi, the age demo in the paper makes an adult into a child, could you please tell me what will if I set the age attribute as an extreme value?

Can it make an adult into a one-year-old baby？Can it keep the generator output a normal human face?

How to get the pose boundary?

Hello! I am confused about how to get the pose boundary. The paper said that the auxiliary attribute prediction model predicts the 5-point facial landmarks. I find the label about the facial landmarks is coordinate in the CelebFace Dataset. The paper said that turning right is the corresponding positive direction. How did you turn the coordinate into binary code that represents the direction?

auxiliary attribute prediction model

can you upload code of your auxiliary attribute prediction model ?or the link your code from?

Ways of learning the attribute vector

Very impressive work! I'm wondering if you have compared the proposed way of learning the attribute vector (by classification) with the way in [1] (by simply using the difference between the mean features)?.

[1] P. Upchurch, J. Gardner, G. Pleiss, R. Pless, N. Snavely, K. Bala, and K. Weinberger. Deep feature interpolation for image content changes. In CVPR, 2017

SyntaxError: invalid syntax

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m stylegan_celebahq -b boundaries/stylegan_celebahq_pose_boundary -n 10 -o results/stylegan_celebahq_smile_editing
[2019-08-12 13:50:32,896][INFO] Initializing generator.
[2019-08-12 13:50:34,279][INFO] Loading tensorflow model from {self.tf_model_path}.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 114, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 68, in main
model = StyleGANGenerator(args.model_name, logger)
File "F:\expression\InterFaceGAN-master\models\stylegan_generator.py", line 42, in init
super().init(model_name, logger)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 96, in init
self.convert_tf_model()
File "F:\expression\InterFaceGAN-master\models\stylegan_generator.py", line 73, in convert_tf_model
_, , tf_model = pickle.load(f)
File "models/stylegan_tf_official\dnnlib_init.py", line 20
submit_config: SubmitConfig = None # Package level variable for SubmitConfig which is only valid when inside the run function.
^
SyntaxError: invalid syntax

Issue learning latent encoding for new faces

I am trying to derive latent encodings for cutom faces, as done in https://github.com/Puzer/stylegan-encoder.

Here are the details after porting the same to pytorch:

from models.stylegan_generator import StyleGANGenerator

#load the pre-trained synthesis network
m_synth = StyleGANGenerator("stylegan_ffhq").model.synthesis.cuda().eval()

#process the output of the synthesis module
class PostProcAfterSynth(nn.Module):
    def __init__(self):
        super(PostProcAfterSynth, self).__init__()
    def forward(self, gen_img):
        #remap to [0,1]
        return (gen_img+1)/2
    
post_proc_layer = PostProcAfterSynth()

#preprocess the generated image before feeding into perceptual model    
class PreProcBeforePerception(nn.Module):
    def __init__(self, img_size):
        super(PreProcBeforePerception, self).__init__()
        self.img_size = img_size
        self.mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(-1, 1, 1)
        self.std = torch.tensor([0.229, 0.224, 0.225], device=device).view(-1, 1, 1)
    def forward(self, gen_img):
        #resize input image
        gen_img = F.adaptive_avg_pool2d(gen_img, self.img_size)
        #normalize
        gen_img = (gen_img - self.mean) / self.std
        return gen_img
    
pre_proc_layer = PreProcBeforePerception(img_size=256)

#use pre-trained vgg model for feature extraction
m_vgg = models.vgg16(pretrained=True).features[:16].to(device).eval()

#set up the model
model = nn.Sequential(m_synth)
model.add_module(str(1), post_proc_layer)
model.add_module(str(2), pre_proc_layer)
model.add_module(str(3), m_vgg)

for param in model.parameters():
    param.requires_grad_(False)

print(m_vgg)

Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU(inplace)
  (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (3): ReLU(inplace)
  (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): ReLU(inplace)
  (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (8): ReLU(inplace)
  (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (11): ReLU(inplace)
  (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (13): ReLU(inplace)
  (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (15): ReLU(inplace)
)

As done by Puzer, I select the [conv->conv->pool->conv->conv->pool->conv->conv->conv] section of the vgg network for feature extraction.

Pre-computing the features for the reference image:

ref_img_path = "."
ref_img = np.array(Image.open(ref_img_path))
ref_img = ref_img.astype(np.float32)/255.
ref_img = np.array([np.transpose(ref_img, (2,0,1))])
ref_img = torch.tensor(ref_img, device=device)
ref_img = pre_proc_layer(ref_img)
ref_img_features = m_vgg(ref_img).detach()

Optimization:

trainable_latent = torch.randn((1,18,512), device=device).requires_grad_(True)
loss_func = torch.nn.MSELoss()

optimizer = optim.SGD([trainable_latent], lr=0.5)

losses = []
for i in tqdm(range(1000)):
    optimizer.zero_grad()
    gen_img_features = model(trainable_latent)
    loss = loss_func(gen_img_features, ref_img_features)
    loss_val = loss.data.cpu()
    losses.append(loss_val)
    loss.backward()
    optimizer.step()

The latent encoding and subsequent generated images are of a poor quality. The results are nowhere near as crisp as that by Puzer.

What I have tried:

Learning Z space latent instead of WP+
Variety of optimizers, learning rate, iterations combos

What could be wrong:

There might be issues with my pipeline above (new to pytorch)
There might be some difference in pre-trained vgg networks for pytorch and keras, that I might have failed to take into account.
The perceptual model used is not complex enough. (but it does work for Puzer)

Any help with the above would be much appreciated.

Attribute Scores

Hello, thanks for the great work.
I would like to know if it is possible to see the code for the attribute predictor you used, or if you could share the scores of these directly to be able to find boundaries on different architectures but with the same dataset.
Thanks very much

quality on stylegan_ffhq

Hi, thanks for the paper and the results are impressive!

I tested the code with "stylegan_ffhq" model and "stylegan_ffhq_pose_boundary.npy or stylegan_ffhq_pose_w_boundary.npy", with the default settings, but the results are not very good.

The person identity, age, even gender changed simultaneously with the pose.
Regarding to the "stylegan_ffhq_pose_w_boundary.npy", the degree of pose changes are more or less ignorable.

python edit.py -m stylegan_ffhq -o results/stylegan_ffhq_pose_w_boundary -b ./boundaries/stylegan_ffhq_pose_w_boundary.npy -n 10

Is there anything that I have to adjust?

W space or W+ space

hello! The GitHub provides us some boundaries about the W space in StyleGAN. But I found that the code contains two configurations which are W space and W+ space. So I wonder if all the boundaries label W in paper and Github are corresponding to W space? There is no provided W+ space boundaries and no W+ space results in the paper? Thanks a lot.

How to project boundary preserved from more than one attribute?

Since project_boundary function only support 2 inputs, how to project boundary preserved from more than one attribute? Such as changing eyeglasses without changing age and gender

How to manipulate real faces?

Dear author, after checking this repository, I have found that there isn't included a encoder-decoder model as the paper tests in Figure 11. WILL this release in the near future?

RuntimeError: Error(s) in loading state_dict for PGGANGeneratorModel:

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 10:30:13,999][INFO] Initializing generator.
[2019-08-12 10:30:23,051][INFO] Loading tensorflow model from {self.tf_model_path}.
[2019-08-12 10:30:28,891][INFO] Successfully loaded!
[2019-08-12 10:30:28,892][INFO] Converting tensorflow model to pytorch version.
[2019-08-12 10:30:29,095][INFO] Successfully converted!
[2019-08-12 10:30:29,095][INFO] Saving pytorch model to {self.model_path}.
[2019-08-12 10:30:29,120][INFO] Successfully saved!
[2019-08-12 10:30:29,120][INFO] Loading pytorch model from {self.model_path}.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 112, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 63, in main
model = PGGANGenerator(args.model_name, logger)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 24, in init
super().init(model_name, logger)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 96, in init
self.convert_tf_model()
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 70, in convert_tf_model
self.load()
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 34, in load
self.model.load_state_dict(torch.load(self.model_path))
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 719, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PGGANGeneratorModel:
Unexpected key(s) in state_dict: "layer5.conv.weight", "layer18.conv.weight", "layer2.conv.weight", "layer15.wscale.bias", "layer18.wscale.bias", "layer4.wscale.bias", "layer6.wscale.bias", "layer1.conv.weight", "layer8.conv.weight", "layer14.wscale.bias", "layer10.wscale.bias", "layer8.wscale.bias", "layer15.conv.weight", "output_1024x1024.conv.weight", "output_1024x1024.wscale.bias", "layer7.conv.weight", "layer9.conv.weight", "layer9.wscale.bias", "layer17.conv.weight", "layer13.wscale.bias", "layer12.wscale.bias", "layer14.conv.weight", "layer16.wscale.bias", "layer11.wscale.bias", "layer16.conv.weight", "layer10.conv.weight", "layer6.conv.weight", "layer17.wscale.bias", "layer4.conv.weight", "layer13.conv.weight", "layer5.wscale.bias", "layer2.wscale.bias", "layer3.wscale.bias", "layer12.conv.weight", "layer1.wscale.bias", "layer11.conv.weight", "layer7.wscale.bias", "layer3.conv.weight".

artifacts boundary

Hi there,

In the paper the last paragraph of section 3.3 mentions how to fix artifacts using certain hyperplane. However this hyperplane is missing in the boundaries folder of this repo. Are you going to release it soon?

Thanks!

Can we use pretrained models of stylegan for getting attribute scores??

Hi,

1.Can we use pretrained attribute classification model files like 'celebahq-classifier-19-eyeglasses.pkl' for attribute scores of the models???

can we use the pth file generated from this repository https://github.com/rosinality/stylegan2-pytorch ??

Multi-GPU support

I have warped the model in models/base_generator. However, CUDA out of memory occurs when I run the synthesize script. Could you help me figure it out?
GPU: P40, 24G. batchsize：32.

different interpolation logic

Hi,

First many thanks for sharing this great work.

Got a question regarding the linear interpolation logic. In function linear_interpolate(), when len(latent_code.shape) == 2, dot product of latent_code and boundary is subtracted from [start_distance, end_distance]. However if if len(latent_code.shape) == 3, the dot product is not considered at all. Just wondering why these 2 cases are treated differently.

Thanks.

Issues converting custom pggan models

It seems that for pggan, state dictionaries for the model trained by paper authors (file karras2018iclr-celebahq-1024x1024.pkl) and custom models that are trained by their released code are different. You can check the differences here: https://www.diffchecker.com/0hFYlK82.

Unfortunately I was unable to tweak your code so that it would convert my trained model correctly.

Did you try converting custom trained model using pggan code? If so, could you please provide a model you trained that worked for you other than the one provided by pggan authors?

Do you have plan to release code of attribute predictor?

I want to try training boundaries, and realized that i need trained attribute score predictor to generate attribute inference results. Do you have plan to release code of attribute predictor?

AssertionError: Torch not compiled with CUDA enabled

(base) PS E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN> python generate_data.py -m stylegan_ffhq -o data/pggan_celebahq -n 10000
[2020-01-20 03:53:48,282][INFO] Initializing generator.
[2020-01-20 03:53:48,521][WARNING] No pre-trained model will be loaded!
Traceback (most recent call last):
File "generate_data.py", line 111, in
main()
File "generate_data.py", line 64, in main
model = StyleGANGenerator(args.model_name, logger)
File "E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN\models\stylegan_generator.py", line 42, in init
super().init(model_name, logger)
File "E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN\models\base_generator.py", line 103, in init
self.model.eval().to(self.run_device)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 426, in to
return self._apply(convert)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 224, in apply
param_applied = fn(param)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 424, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\cuda_init.py", line 192, in _lazy_init
check_driver()
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\cuda_init.py", line 95, in _check_driver
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Attributes predictor

Hi! Thanks for your great research!
Is there any chance you can provide attributes predictor that you used for finding boundaries? I didn't really get about scores that it should give to attributes on the pictures.
Also for example for binary attributes like glasses/no glasses is it possible to manually put labels of say 1 for glasses, 0 for no glasses and feed this data to train a new boundary?

Questions about generator W space data by stylegan_ffhq

Hi, I want to edit image in W space.

In #35, it is suggested that using generate_data.py to get w.npy first.
I used the following code to generator. However, the image is strange that it is not like a normal human face
python generate_data.py -m stylegan_ffhq -o data/stylegan-ffhq -n 3 -s W

Was my code wrong?

Also, I want to ask. If I want to edit the W space image, was the following code right?

python edit.py \
    -m stylegan_ffhq \
    -b boundaries/stylegan_ffhq_age_w_boundary.npy \
    -i ./data/stylegan-ffhq/w.npy \
    -o results/stylegan_celebahq_age_w_boundary \
    -s W

Confusing about the definition of binary attribute

Hello! I am confusing about the definition of binary attribute. For example, if I want to change the hair color attribute, whether the method can only work for dart colors vs. light colors. Can i get the centain color direction like red hair color direction through collecting red hair and no red hair images to train a
binary classifier? Maybe can we get the red color direction through collecting red hair and yellow hair?

Regarding yaw pose estimation using facial landmarks

Hi,

I wish to apply your technique to a StyleGAN model that I have trained on Celeba-HQ-128 images. Can you please release the code to estimate yaw pose using the five facial landmarks present in CelebA dataset (left eye centre, right eye centre, nose tip, left mouth corner and right mouth corner)?

Thanks.

generate_data.py generates empty images

Hi guys,

Thank you for your great work. I have a question.

I found that generate_data.py works very well with my style-gan models, but when I trained PGGAN with the same dataset of images and then ran generate_data.py for generating 10k images, all images look like this:

I tried another checkpoint but it gave similar results:

I guess it is caused by some misconfiguration between my PGGAN and the inference code of InterFaceGAN, but haven't found it yet. Can you give me some advice please?

about style_mod

why the function is x * (style[:, 0] + 1) + style[:, 1]？what "+1" of （style[:, 0] + 1）mean？，the official implementation is also same as yours.i can‘t figure it out

How to find more boundaries?

Congratulations for the great work!

Please correct me if I'm wrong. From what I understand, providing a pretrained model and a boundary (in boundaries/) we can tuning features of generated images. This is amazing.

I wonder if I can explore other boundaries as well? Let's say, hair color or skin colors? If it is possible, how could I do that?

Thank you very much!

Issue when using pretrained model with input size 512x512

Hi,

I am trying to run generate_data.py using my pretrained model which was trained on 512x512 images. It successfully converted the pkl model to pth, but then showed the error below.

Traceback (most recent call last): File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/generate_data.py", line 114, in <module> main() File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/generate_data.py", line 65, in main model = StyleGANGenerator(args.model_name, logger) File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/stylegan_generator.py", line 42, in __init__ super().__init__(model_name, logger) File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/base_generator.py", line 95, in __init__ self.load() File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/stylegan_generator.py", line 63, in load self.model.load_state_dict(state_dict) File "/media/tai/6TB/anaconda3/envs/InterfaceGAN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict self.__class__.__name__, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for StyleGANGeneratorModel: Unexpected key(s) in state_dict: "synthesis.output8.conv.weight", "synthesis.output8.bias". size mismatch for synthesis.output4.conv.weight: copying a param with shape torch.Size([3, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]). size mismatch for synthesis.output5.conv.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]). size mismatch for synthesis.output6.conv.weight: copying a param with shape torch.Size([3, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 64, 1, 1]). size mismatch for synthesis.output7.conv.weight: copying a param with shape torch.Size([3, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
I guess it is because InterFaceGAN is set to work with models being trained with 1024x1024 images by default.

Where should I modify so that I can load my 512x512 model?

Thank you very much!

How to perform StyleGAN inversion?

Hi Yujun,

In the paper you claimed that it must use GAN inversion method to map real images to latent codes, and StyleGAN inversion methods are much better, are there documents introducing how to do the inversion?
Any comments are appreciated! Best Regards.

no obvious age difference between orignial stylegan and this

I used previous stylegan and stylegan in interfacegan to change age. But I cannot tell obvious difference between these two.
the boundary used is "stylegan_f fhq_age_w_boundary.npy". Could you provide any help?

How can I input an image to edit?

Hi,
First many thanks for sharing this great work.
I want to ask a question. The input of InterFaceGAN is "Latent code". If I want to input a face image and edit the face, how can I deal with? Need I transform the image as a "Latent code"? And how to do it?

Thank you.

RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 14:15:15,846][INFO] Initializing generator.
[2019-08-12 14:15:15,972][INFO] Loading pytorch model from {self.model_path}.
[2019-08-12 14:15:16,002][INFO] Successfully loaded!
[2019-08-12 14:15:17,357][INFO] Preparing boundary.
0%| | 0/10 [00:00<?, ?it/s][2019-08-12 14:15:17,394][INFO] Preparing latent codes.
[2019-08-12 14:15:17,394][INFO] Sample latent codes randomly.
[2019-08-12 14:15:17,395][INFO] Editing {total_num} samples.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 114, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 100, in main
outputs = model.easy_synthesize(interpolations_batch)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 230, in easy_synthesize
outputs = self.synthesize(latent_codes, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 132, in synthesize
images = self.model(zs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 127, in forward
return super().forward(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 243, in forward
x = self.conv(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

the change of smile when increase the distance of the "glasses" attribute

Hi,thanks for your great research.
I have a problem when I apply the "pggan_celebahq_eyeglasses_c_age_gender_boundary.npy" on the pre-trained pggan model. With the conditional manipulation on "glasses" attribute , the smile changes gradually. Here are the results.

But the attributes seem to be uncorrelated in your paper.
Could you explained it for me?

Roll and Pitch for Pose data

Hi Shenyujun,
Thanks so much for awesome work! I saw you have pose direction in repo, but there is no roll or pitch rotation for pose in that direction. Do you think it's possible to train with both pitch, roll and yaw? BTW, any attribute predictor or data could share regarding the pose generation? That will be really helpful, not found in original stylegan repo.
Thanks

How to train w+ space boundary?

Using the stylegan-encoder project, I got the latent codes as array (n,18,512). However, the training code is for 1d vector input , do i need to separate the latent code into 1d vector?
Thanks a lot!

Issue with tensorflow version

When I try to install tensorflow 1.12.2 as I believe tensorflow 1.12.2 and cudo 9 are compatible to execute this code.. I'm getting version mismatch error and some other version of gets installed and when try to display tensorflow version I'm getting errors... images are not getting for other versions of tensorflow generated..

Why the latent code from GAN inversion methods can be manipulated by the boundary

Hi, thanks for sharing this great work!

I'm trying to edit a new face. In #30, it is suggested using https://github.com/Puzer/stylegan-encoder firstly to get the new face latent code of W+ space. However, the shape of the latent code is (18, 512) and 18 layers have different values.

What confusing me is :

The shape of "stylegan_ffhq_age_w_boundary.npy" is (1,512), so if using (1, 512) boundary to edit (18,512) latent code, all layers will edit by the same value. But the meaning of different layers of (18,512) latent code is not the same, because the values of 18 layers are different.

Why can we use (1, 512) boundary to edit (18,512) latent code? Why it can also work?

If the (18,512) latent code has different values of its 18 layers, training a (18, 512) boundary (which also has different values of its 18 layers) is more reasonable, isn't it?
In your paper, you also do the experiment of real images. What the latent space did you get from your stylegan encode? Z, W or W+?
If the shape of your latent code is (18,512), do 18 layers have different values?

Thank you!

how to prepare data for a custom attribute

Hi, I think the main idea is to prepare data with/without a certain attribute, and train a binary classifier (namely linear-SVM) with the data.

But most attributes are of continuous values, such as pose rotation.
The logic in your code is to use the average of the largest value and the smallest value as a threshold.

There are also some attributes should be quantised as several finite states, for example, face shape can be in [square, triangle, heart, round, oval]

Do you have more detailed suggestions on how to quantize those attributes?

Thanks!

genforce / interfacegan Goto Github PK

interfacegan's Introduction

GenForce Lib for Generative Modeling

Updates

Highlights

Installation

Quick Demo

Play with GANs

Test

Train

Play with Encoders for GAN Inversion

Train

Contributors

License

Acknowledgement

BibTex

interfacegan's People

Stargazers

Watchers

Forkers

interfacegan's Issues

pylint: disable=line-too-long

Variable mapping from pytorch model to official tensorflow model.

Recommend Projects

Recommend Topics

Recommend Org