
interfacegan's Introduction

GenForce Lib for Generative Modeling

An efficient PyTorch library for deep generative modeling. May the Generative Force (GenForce) be with You.


Updates

  • Encoder Training: We support training encoders on top of pre-trained GANs for GAN inversion.
  • Model Converters: You can easily migrate your already started projects to this repository. Please check here for more details.

Highlights

  • Distributed training framework.
  • Fast training speed.
  • Modular design for prototyping new models.
  • Model zoo containing a rich set of pretrained GAN models, with a Colab live demo to play with.

Installation

  1. Create a virtual environment via conda.

    conda create -n genforce python=3.7
    conda activate genforce
  2. Install cuda and cudnn. (We use CUDA 10.0 in case you would like to use TensorFlow 1.15 for model conversion.)

    conda install cudatoolkit=10.0 cudnn=7.6.5
  3. Install torch and torchvision.

    pip install torch==1.7 torchvision==0.8
  4. Install requirements

    pip install -r requirements.txt

Quick Demo

We provide a quick training demo, scripts/stylegan_training_demo.py, which allows you to train StyleGAN on a toy dataset (500 anime face images at 64 x 64 resolution). Try it via

./scripts/stylegan_training_demo.sh

We also provide an inference demo, synthesize.py, which allows you to synthesize images with pre-trained models. Generated images can be found in work_dirs/synthesis_results/. Try it via

python synthesize.py stylegan_ffhq1024

You can also play with the demo on Colab.

Play with GANs

Test

Pre-trained models can be found at model zoo.

  • On local machine:

    GPUS=8
    CONFIG=configs/stylegan_ffhq256_val.py
    WORK_DIR=work_dirs/stylegan_ffhq256_val
    CHECKPOINT=checkpoints/stylegan_ffhq256.pth
    ./scripts/dist_test.sh ${GPUS} ${CONFIG} ${WORK_DIR} ${CHECKPOINT}
  • Using slurm:

    CONFIG=configs/stylegan_ffhq256_val.py
    WORK_DIR=work_dirs/stylegan_ffhq256_val
    CHECKPOINT=checkpoints/stylegan_ffhq256.pth
    GPUS=8 ./scripts/slurm_test.sh ${PARTITION} ${JOB_NAME} \
        ${CONFIG} ${WORK_DIR} ${CHECKPOINT}

Train

All files produced during training, such as log messages, checkpoints, synthesis snapshots, etc., will be saved to the work directory.

  • On local machine:

    GPUS=8
    CONFIG=configs/stylegan_ffhq256.py
    WORK_DIR=work_dirs/stylegan_ffhq256_train
    ./scripts/dist_train.sh ${GPUS} ${CONFIG} ${WORK_DIR} \
        [--options additional_arguments]
  • Using slurm:

    CONFIG=configs/stylegan_ffhq256.py
    WORK_DIR=work_dirs/stylegan_ffhq256_train
    GPUS=8 ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} \
        ${CONFIG} ${WORK_DIR} \
        [--options additional_arguments]

Play with Encoders for GAN Inversion

Train

  • On local machine:

    GPUS=8
    CONFIG=configs/stylegan_ffhq256_encoder_y.py
    WORK_DIR=work_dirs/stylegan_ffhq256_encoder_y
    ./scripts/dist_train.sh ${GPUS} ${CONFIG} ${WORK_DIR} \
        [--options additional_arguments]
  • Using slurm:

    CONFIG=configs/stylegan_ffhq256_encoder_y.py
    WORK_DIR=work_dirs/stylegan_ffhq256_encoder_y
    GPUS=8 ./scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} \
        ${CONFIG} ${WORK_DIR} \
        [--options additional_arguments]

Contributors

Member | Module
Yujun Shen | models and running controllers
Yinghao Xu | runner and loss functions
Ceyuan Yang | data loader
Jiapeng Zhu | evaluation metrics
Bolei Zhou | cheerleader

NOTE: The above table only lists the person in charge of each module. We help each other a lot and develop as a TEAM.

We welcome external contributors to join us for improving this library.

License

The project is under the MIT License.

Acknowledgement

We thank PGGAN, StyleGAN, StyleGAN2, StyleGAN2-ADA for their work on high-quality image synthesis. We thank IDInvert and GHFeat for their contribution to GAN inversion. We also thank MMCV for the inspiration on the design of controllers.

BibTex

We open source this library to the community to facilitate research on generative modeling. If you like our work and use the codebase or models for your research, please cite our work as follows.

@misc{genforce2020,
  title =        {GenForce},
  author =       {Shen, Yujun and Xu, Yinghao and Yang, Ceyuan and Zhu, Jiapeng and Zhou, Bolei},
  howpublished = {\url{https://github.com/genforce/genforce}},
  year =         {2020}
}


interfacegan's Issues

RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 09:17:26,315][INFO] Initializing generator.
[2019-08-12 09:17:26,440][WARNING] No pre-trained model will be loaded!
[2019-08-12 09:17:27,728][INFO] Preparing boundary.
[2019-08-12 09:17:27,731][INFO] Preparing latent codes.
[2019-08-12 09:17:27,731][INFO] Sample latent codes randomly.
[2019-08-12 09:17:27,732][INFO] Editing {total_num} samples.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 112, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 98, in main
outputs = model.easy_synthesize(interpolations_batch)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 230, in easy_synthesize
outputs = self.synthesize(latent_codes, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 117, in synthesize
images = self.model(zs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 127, in forward
return super().forward(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 243, in forward
x = self.conv(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

Process finished with exit code 1

Training with stylegan2 model

Hi. Your results are good. Currently I am working on pose transformation, and your repository helps me a lot to take a step forward.
The images that StyleGAN2 generates after mixing are more diverse and better than those from the previous version.

Can you tell me the process of how to train a StyleGAN2 model to get boundaries?

I changed the following parameter values based on stylegan2-ffhq-config-f.pkl:

_RESOLUTIONS_TO_CHANNELS = {
8: [512, 512, 512],
16: [512, 512, 512, 512],
32: [512, 512, 512, 512, 512],
64: [512, 512, 512, 512, 512, 256],
128: [512, 512, 512, 512, 512, 256, 128],
256: [512, 512, 512, 512, 512, 256, 128, 64],
512: [512, 512, 512, 512, 512, 256, 128, 64, 32],
1024: [512, 512, 512, 512, 512, 256, 128, 64, 32, 16],
}

# pylint: disable=line-too-long

# Variable mapping from pytorch model to official tensorflow model.

_STYLEGAN_PTH_VARS_TO_TF_VARS = {
# Statistic information of disentangled latent feature, w.
'truncation.w_avg':'dlatent_avg', # [512]

# Noises.
'synthesis.layer0.epilogue.apply_noise.noise': 'noise0',    # [1, 1, 4, 4]
'synthesis.layer1.epilogue.apply_noise.noise': 'noise1',    # [1, 1, 8, 8]
'synthesis.layer2.epilogue.apply_noise.noise': 'noise2',    # [1, 1, 8, 8]
'synthesis.layer3.epilogue.apply_noise.noise': 'noise3',    # [1, 1, 16, 16]
'synthesis.layer4.epilogue.apply_noise.noise': 'noise4',    # [1, 1, 16, 16]
'synthesis.layer5.epilogue.apply_noise.noise': 'noise5',    # [1, 1, 32, 32]
'synthesis.layer6.epilogue.apply_noise.noise': 'noise6',    # [1, 1, 32, 32]
'synthesis.layer7.epilogue.apply_noise.noise': 'noise7',    # [1, 1, 64, 64]
'synthesis.layer8.epilogue.apply_noise.noise': 'noise8',    # [1, 1, 64, 64]
'synthesis.layer9.epilogue.apply_noise.noise': 'noise9',    # [1, 1, 128, 128]
'synthesis.layer10.epilogue.apply_noise.noise': 'noise10',  # [1, 1, 128, 128]
'synthesis.layer11.epilogue.apply_noise.noise': 'noise11',  # [1, 1, 256, 256]
'synthesis.layer12.epilogue.apply_noise.noise': 'noise12',  # [1, 1, 256, 256]
'synthesis.layer13.epilogue.apply_noise.noise': 'noise13',  # [1, 1, 512, 512]
'synthesis.layer14.epilogue.apply_noise.noise': 'noise14',  # [1, 1, 512, 512]
'synthesis.layer15.epilogue.apply_noise.noise': 'noise15',  # [1, 1, 1024, 1024]
'synthesis.layer16.epilogue.apply_noise.noise': 'noise16',  # [1, 1, 1024, 1024]

# Mapping blocks.
'mapping.dense0.linear.weight': 'Dense0/weight',  # [512, 512]
'mapping.dense0.wscale.bias': 'Dense0/bias',  # [512]
'mapping.dense1.linear.weight': 'Dense1/weight',  # [512, 512]
'mapping.dense1.wscale.bias': 'Dense1/bias',  # [512]
'mapping.dense2.linear.weight': 'Dense2/weight',  # [512, 512]
'mapping.dense2.wscale.bias': 'Dense2/bias',  # [512]
'mapping.dense3.linear.weight': 'Dense3/weight',  # [512, 512]
'mapping.dense3.wscale.bias': 'Dense3/bias',  # [512]
'mapping.dense4.linear.weight': 'Dense4/weight',  # [512, 512]
'mapping.dense4.wscale.bias': 'Dense4/bias',  # [512]
'mapping.dense5.linear.weight': 'Dense5/weight',  # [512, 512]
'mapping.dense5.wscale.bias': 'Dense5/bias',  # [512]
'mapping.dense6.linear.weight': 'Dense6/weight',  # [512, 512]
'mapping.dense6.wscale.bias': 'Dense6/bias',  # [512]
'mapping.dense7.linear.weight': 'Dense7/weight',  # [512, 512]
'mapping.dense7.wscale.bias': 'Dense7/bias',  # [512]

# Synthesis blocks.

'synthesis.lod': 'lod' , #[]
'synthesis.add_constant': '4x4/Const/const',
'synthesis.layer0.conv.weight': '4x4/Conv/weight',
'synthesis.layer0.epilogue.mod_weight':'4x4/Conv/mod_weight',
'synthesis.layer0.epilogue.mod_bias':'4x4/Conv/mod_bias',
'synthesis.layer0.epilogue.apply_noise':'4x4/Conv/noise_strength',
'synthesis.layer0.epilogue.bias': '4x4/Conv/bias',
'synthesis.output0.conv.weight': '4x4/ToRGB/weight',   
'synthesis.output0.epilogue.mod_weight':'4x4/ToRGB/mod_weight',
'synthesis.output0.epilogue.mod_bias':'4x4/ToRGB/mod_bias',
'synthesis.output0.epilogue.bias':'4x4/ToRGB/bias',

'synthesis.layer1.conv.weight':'8x8/Conv0_up/weight',
'synthesis.layer1.epilogue.mod_weight':'8x8/Conv0_up/mod_weight',
'synthesis.layer1.epilogue.mod_bias':'8x8/Conv0_up/mod_bias',
'synthesis.layer1.epilogue.apply_noise':'8x8/Conv0_up/noise_strength',
'synthesis.layer1.epilogue.bias':'8x8/Conv0_up/bias',
'synthesis.layer2.conv.weight':'8x8/Conv1/weight',
'synthesis.layer2.epilogue.mod_weight':'8x8/Conv1/mod_weight',
'synthesis.layer2.epilogue.mod_bias':'8x8/Conv1/mod_bias',
'synthesis.layer2.epilogue.apply_noise':'8x8/Conv1/noise_strength',
'synthesis.layer2.epilogue.bias':'8x8/Conv1/bias',
'synthesis.output1.conv.weight': '8x8/ToRGB/weight',
'synthesis.output1.epilogue.mod_weight':'8x8/ToRGB/mod_weight',
'synthesis.output1.epilogue.mod_bias':'8x8/ToRGB/mod_bias',
'synthesis.output1.epilogue.bias':'8x8/ToRGB/bias',

'synthesis.layer3.conv.weight':'16x16/Conv0_up/weight',
'synthesis.layer3.epilogue.mod_weight':'16x16/Conv0_up/mod_weight',
'synthesis.layer3.epilogue.mod_bias':'16x16/Conv0_up/mod_bias',
'synthesis.layer3.epilogue.apply_noise':'16x16/Conv0_up/noise_strength',
'synthesis.layer3.epilogue.bias':'16x16/Conv0_up/bias',
'synthesis.layer4.conv.weight':'16x16/Conv1/weight',
'synthesis.layer4.epilogue.mod_weight':'16x16/Conv1/mod_weight',
'synthesis.layer4.epilogue.mod_bias':'16x16/Conv1/mod_bias',
'synthesis.layer4.epilogue.apply_noise':'16x16/Conv1/noise_strength',
'synthesis.layer4.epilogue.bias':'16x16/Conv1/bias',
'synthesis.output2.conv.weight': '16x16/ToRGB/weight',
'synthesis.output2.epilogue.mod_weight':'16x16/ToRGB/mod_weight',
'synthesis.output2.epilogue.mod_bias':'16x16/ToRGB/mod_bias',
'synthesis.output2.epilogue.bias':'16x16/ToRGB/bias',

'synthesis.layer5.conv.weight':'32x32/Conv0_up/weight',
'synthesis.layer5.epilogue.mod_weight':'32x32/Conv0_up/mod_weight',
'synthesis.layer5.epilogue.mod_bias':'32x32/Conv0_up/mod_bias',
'synthesis.layer5.epilogue.apply_noise':'32x32/Conv0_up/noise_strength',
'synthesis.layer5.epilogue.bias':'32x32/Conv0_up/bias',
'synthesis.layer6.conv.weight':'32x32/Conv1/weight',
'synthesis.layer6.epilogue.mod_weight':'32x32/Conv1/mod_weight',
'synthesis.layer6.epilogue.mod_bias':'32x32/Conv1/mod_bias',
'synthesis.layer6.epilogue.apply_noise':'32x32/Conv1/noise_strength',
'synthesis.layer6.epilogue.bias':'32x32/Conv1/bias',
'synthesis.output3.conv.weight': '32x32/ToRGB/weight',
'synthesis.output3.epilogue.mod_weight':'32x32/ToRGB/mod_weight',
'synthesis.output3.epilogue.mod_bias':'32x32/ToRGB/mod_bias',
'synthesis.output3.epilogue.bias':'32x32/ToRGB/bias',

'synthesis.layer7.conv.weight':'64x64/Conv0_up/weight',
'synthesis.layer7.epilogue.mod_weight':'64x64/Conv0_up/mod_weight',
'synthesis.layer7.epilogue.mod_bias':'64x64/Conv0_up/mod_bias',
'synthesis.layer7.epilogue.apply_noise':'64x64/Conv0_up/noise_strength',
'synthesis.layer7.epilogue.bias':'64x64/Conv0_up/bias',
'synthesis.layer8.conv.weight':'64x64/Conv1/weight',
'synthesis.layer8.epilogue.mod_weight':'64x64/Conv1/mod_weight',
'synthesis.layer8.epilogue.mod_bias':'64x64/Conv1/mod_bias',
'synthesis.layer8.epilogue.apply_noise':'64x64/Conv1/noise_strength',
'synthesis.layer8.epilogue.bias':'64x64/Conv1/bias',
'synthesis.output4.conv.weight': '64x64/ToRGB/weight',
'synthesis.output4.epilogue.mod_weight':'64x64/ToRGB/mod_weight',
'synthesis.output4.epilogue.mod_bias':'64x64/ToRGB/mod_bias',
'synthesis.output4.epilogue.bias':'64x64/ToRGB/bias',   

'synthesis.layer9.conv.weight':'128x128/Conv0_up/weight',
'synthesis.layer9.epilogue.mod_weight':'128x128/Conv0_up/mod_weight',
'synthesis.layer9.epilogue.mod_bias':'128x128/Conv0_up/mod_bias',
'synthesis.layer9.epilogue.apply_noise':'128x128/Conv0_up/noise_strength',
'synthesis.layer9.epilogue.bias':'128x128/Conv0_up/bias',
'synthesis.layer10.conv.weight':'128x128/Conv1/weight',
'synthesis.layer10.epilogue.mod_weight':'128x128/Conv1/mod_weight',
'synthesis.layer10.epilogue.mod_bias':'128x128/Conv1/mod_bias',
'synthesis.layer10.epilogue.apply_noise':'128x128/Conv1/noise_strength',
'synthesis.layer10.epilogue.bias':'128x128/Conv1/bias',
'synthesis.output5.conv.weight': '128x128/ToRGB/weight',
'synthesis.output5.epilogue.mod_weight':'128x128/ToRGB/mod_weight',
'synthesis.output5.epilogue.mod_bias':'128x128/ToRGB/mod_bias',
'synthesis.output5.epilogue.bias':'128x128/ToRGB/bias',

'synthesis.layer11.conv.weight':'256x256/Conv0_up/weight',
'synthesis.layer11.epilogue.mod_weight':'256x256/Conv0_up/mod_weight',
'synthesis.layer11.epilogue.mod_bias':'256x256/Conv0_up/mod_bias',
'synthesis.layer11.epilogue.apply_noise':'256x256/Conv0_up/noise_strength',
'synthesis.layer11.epilogue.bias':'256x256/Conv0_up/bias',
'synthesis.layer12.conv.weight':'256x256/Conv1/weight',
'synthesis.layer12.epilogue.mod_weight':'256x256/Conv1/mod_weight',
'synthesis.layer12.epilogue.mod_bias':'256x256/Conv1/mod_bias',
'synthesis.layer12.epilogue.apply_noise':'256x256/Conv1/noise_strength',
'synthesis.layer12.epilogue.bias':'256x256/Conv1/bias',
'synthesis.output6.conv.weight': '256x256/ToRGB/weight',
'synthesis.output6.epilogue.mod_weight':'256x256/ToRGB/mod_weight',
'synthesis.output6.epilogue.mod_bias':'256x256/ToRGB/mod_bias',
'synthesis.output6.epilogue.bias':'256x256/ToRGB/bias',

'synthesis.layer13.conv.weight':'512x512/Conv0_up/weight',
'synthesis.layer13.epilogue.mod_weight':'512x512/Conv0_up/mod_weight',
'synthesis.layer13.epilogue.mod_bias':'512x512/Conv0_up/mod_bias',
'synthesis.layer13.epilogue.apply_noise':'512x512/Conv0_up/noise_strength',
'synthesis.layer13.epilogue.bias':'512x512/Conv0_up/bias',
'synthesis.layer14.conv.weight':'512x512/Conv1/weight',
'synthesis.layer14.epilogue.mod_weight':'512x512/Conv1/mod_weight',
'synthesis.layer14.epilogue.mod_bias':'512x512/Conv1/mod_bias',
'synthesis.layer14.epilogue.apply_noise':'512x512/Conv1/noise_strength',
'synthesis.layer14.epilogue.bias':'512x512/Conv1/bias',
'synthesis.output7.conv.weight': '512x512/ToRGB/weight',
'synthesis.output7.epilogue.mod_weight':'512x512/ToRGB/mod_weight',
'synthesis.output7.epilogue.mod_bias':'512x512/ToRGB/mod_bias',
'synthesis.output7.epilogue.bias':'512x512/ToRGB/bias',

'synthesis.layer15.conv.weight':'1024x1024/Conv0_up/weight',
'synthesis.layer15.epilogue.mod_weight':'1024x1024/Conv0_up/mod_weight',
'synthesis.layer15.epilogue.mod_bias':'1024x1024/Conv0_up/mod_bias',
'synthesis.layer15.epilogue.apply_noise':'1024x1024/Conv0_up/noise_strength',
'synthesis.layer15.epilogue.bias':'1024x1024/Conv0_up/bias',
'synthesis.layer16.conv.weight':'1024x1024/Conv1/weight',
'synthesis.layer16.epilogue.mod_weight':'1024x1024/Conv1/mod_weight',
'synthesis.layer16.epilogue.mod_bias':'1024x1024/Conv1/mod_bias',
'synthesis.layer16.epilogue.apply_noise':'1024x1024/Conv1/noise_strength',
'synthesis.layer16.epilogue.bias':'1024x1024/Conv1/bias',
'synthesis.output8.conv.weight': '1024x1024/ToRGB/weight',
'synthesis.output8.epilogue.mod_weight':'1024x1024/ToRGB/mod_weight',
'synthesis.output8.epilogue.mod_bias':'1024x1024/ToRGB/mod_bias',
'synthesis.output8.epilogue.bias':'1024x1024/ToRGB/bias'

}
in stylegan2_generator_model.py (which is a copy of stylegan_generator_model.py).

In stylegan_generator.py, the following code is used:

if 'ToRGB_lod' in tf_var_name:
  lod = int(tf_var_name[len('ToRGB_lod')])
  lod_shift = 10 - int(np.log2(self.resolution))
  tf_var_name = tf_var_name.replace(f'{lod}', f'{lod - lod_shift}')
if tf_var_name not in tf_vars:
  self.logger.debug(f'Variable {tf_var_name} does not exist in '
                    f'tensorflow model.')

In stylegan2, the TF variable names are like '512x512/ToRGB/weight', so I removed the steps above because the resolution is contained directly in the variable name.

If I execute the code with the above modifications, it reports that the model was saved successfully. But at loading time it gives this error:
size mismatch for synthesis.layer1.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 8, 8]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 4]).
size mismatch for synthesis.layer3.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 16, 16]) from checkpoint, the shape in current model is torch.Size([1, 1, 8, 8]).
size mismatch for synthesis.layer5.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 32, 32]) from checkpoint, the shape in current model is torch.Size([1, 1, 16, 16]).
size mismatch for synthesis.layer7.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 32, 32]).
size mismatch for synthesis.layer8.conv.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]).
size mismatch for synthesis.layer8.epilogue.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for synthesis.layer9.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 1, 64, 64]).
size mismatch for synthesis.output4.conv.weight: copying a param with shape torch.Size([3, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for synthesis.layer10.epilogue.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for synthesis.layer11.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 1, 128, 128]).
size mismatch for synthesis.output5.conv.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
size mismatch for synthesis.layer12.epilogue.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for synthesis.layer13.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 256, 256]).
size mismatch for synthesis.output6.conv.weight: copying a param with shape torch.Size([3, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 64, 1, 1]).
size mismatch for synthesis.layer14.epilogue.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for synthesis.layer15.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 1024, 1024]) from checkpoint, the shape in current model is torch.Size([1, 1, 512, 512]).
size mismatch for synthesis.output7.conv.weight: copying a param with shape torch.Size([3, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
size mismatch for synthesis.layer16.epilogue.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for synthesis.output8.conv.weight: copying a param with shape torch.Size([3, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 16, 1, 1]).

Can you help me change the functions in stylegan2_generator_model.py?

Thanks in advance.
Regards,
SandhyaLaxmi Kanna

Trying to use a pretrained stylegan with gray-scaled images

Hi,

I am trying to use generate_data.py with a StyleGAN trained on gray-scale images. I registered my model in model_settings.py and specified the number of channels as 1 there. However, it turns out that something in the original model is fixed to 3 channels; the error is:

RuntimeError: Error(s) in loading state_dict for StyleGANGeneratorModel:
size mismatch for synthesis.output6.bias: copying a param with shape torch.Size([1]) from checkpoint, the shape in current model is torch.Size([3]).

Do you have any idea how I can solve this?

Thanks in advance.

Requirements.txt and pre-trained models

Great contribution and repo, my friends; thanks so much for opening this up to the community to use.

Would like to ask several questions:

  1. Is there any way you could provide a requirements.txt file so that those who don't have the libraries can easily use your repo?

  2. Can you link to the appropriate pre-trained models and where to download them? I have tried using the PGGAN model from its GitHub repo, and yet edit.py isn't working.

Thanks again!

Omitting intercept

Hey!
First of all great job on the paper and the code!

I was wondering what the motivation was behind assuming that all boundaries (hyperplanes) pass through the origin? I would imagine this assumption might be restrictive (especially in general, when the mapping network can produce distributions that are not centered around the origin).

Also, related to this, it seems that when you find the hyperplane in train_boundary by fitting a linear model, you do not enforce the intercept to be 0.
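
(For reference, a minimal sketch with placeholder data, not the repository's train_boundary: sklearn's LinearSVC can be forced to put the hyperplane through the origin via fit_intercept=False.)

    # Minimal sketch, not the repo's train_boundary: fit a linear SVM on latent
    # codes and binary attribute labels, optionally forcing a zero intercept.
    import numpy as np
    from sklearn.svm import LinearSVC

    latent_codes = np.random.randn(1000, 512)            # placeholder latent codes
    labels = (np.random.rand(1000) > 0.5).astype(int)    # placeholder attribute labels

    clf = LinearSVC(fit_intercept=False)   # hyperplane constrained through the origin
    clf.fit(latent_codes, labels)

    boundary = clf.coef_.reshape(1, 512)
    boundary /= np.linalg.norm(boundary)   # unit normal used as the editing direction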

Questions about the truncation module.

I have a question about your implementation of the truncation module. Why are the first 9 channels of the W+ code the same? It looks like you separate the W+ code into just 2 blocks rather than 18 blocks. This is strange, because in the official code each channel (I mean the 18, not the 512) of the W+ code is different.

Does Interface GAN keep face id when editing faces?

Hi Yujun,
Thanks for the great work.
When I run edit.py, the output images do not keep the face ID. Is it possible to keep the face ID (just like FaceID-GAN/FaceFeat-GAN) if one latent code is passed to the generator?

Can it make an adult into a baby?

Hi, the age demo in the paper turns an adult into a child. Could you please tell me what will happen if I set the age attribute to an extreme value?

Can it turn an adult into a one-year-old baby? Can it keep the generator producing a normal human face?

How to get the pose boundary?

Hello! I am confused about how to get the pose boundary. The paper says that the auxiliary attribute prediction model predicts the 5-point facial landmarks, but the facial landmark labels in the CelebA dataset are coordinates. The paper says that turning right is the corresponding positive direction. How did you turn the coordinates into a binary label that represents the direction?
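
(Not the authors' predictor; purely to make the question concrete, a hedged heuristic for a binary yaw label from 5-point landmarks, assuming CelebA's landmark ordering.)

    # Hypothetical heuristic, NOT the paper's method: label the yaw direction by
    # the nose tip's horizontal offset from the midpoint of the two eye centres.
    import numpy as np

    def yaw_label(landmarks):
        """landmarks: (5, 2) array ordered as
        [left_eye, right_eye, nose, left_mouth, right_mouth]."""
        eye_mid_x = (landmarks[0, 0] + landmarks[1, 0]) / 2.0
        return int(landmarks[2, 0] > eye_mid_x)   # 1 if the nose sits right of the eye midpoint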

Ways of learning the attribute vector

Very impressive work! I'm wondering whether you have compared the proposed way of learning the attribute vector (by classification) with the way in [1] (simply using the difference between the mean features)?

[1] P. Upchurch, J. Gardner, G. Pleiss, R. Pless, N. Snavely, K. Bala, and K. Weinberger. Deep feature interpolation for image content changes. In CVPR, 2017

SyntaxError: invalid syntax

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m stylegan_celebahq -b boundaries/stylegan_celebahq_pose_boundary -n 10 -o results/stylegan_celebahq_smile_editing
[2019-08-12 13:50:32,896][INFO] Initializing generator.
[2019-08-12 13:50:34,279][INFO] Loading tensorflow model from {self.tf_model_path}.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 114, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 68, in main
model = StyleGANGenerator(args.model_name, logger)
File "F:\expression\InterFaceGAN-master\models\stylegan_generator.py", line 42, in init
super().__init__(model_name, logger)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 96, in init
self.convert_tf_model()
File "F:\expression\InterFaceGAN-master\models\stylegan_generator.py", line 73, in convert_tf_model
_, _, tf_model = pickle.load(f)
File "models/stylegan_tf_official\dnnlib_init
.py", line 20
submit_config: SubmitConfig = None # Package level variable for SubmitConfig which is only valid when inside the run function.
^
SyntaxError: invalid syntax

Issue learning latent encoding for new faces

I am trying to derive latent encodings for custom faces, as done in https://github.com/Puzer/stylegan-encoder.

Here are the details after porting the same to pytorch:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from PIL import Image
from torchvision import models
from tqdm import tqdm

from models.stylegan_generator import StyleGANGenerator

device = torch.device('cuda')

#load the pre-trained synthesis network
m_synth = StyleGANGenerator("stylegan_ffhq").model.synthesis.cuda().eval()

#process the output of the synthesis module
class PostProcAfterSynth(nn.Module):
    def __init__(self):
        super(PostProcAfterSynth, self).__init__()
    def forward(self, gen_img):
        #remap to [0,1]
        return (gen_img+1)/2
    
post_proc_layer = PostProcAfterSynth()

#preprocess the generated image before feeding into perceptual model    
class PreProcBeforePerception(nn.Module):
    def __init__(self, img_size):
        super(PreProcBeforePerception, self).__init__()
        self.img_size = img_size
        self.mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(-1, 1, 1)
        self.std = torch.tensor([0.229, 0.224, 0.225], device=device).view(-1, 1, 1)
    def forward(self, gen_img):
        #resize input image
        gen_img = F.adaptive_avg_pool2d(gen_img, self.img_size)
        #normalize
        gen_img = (gen_img - self.mean) / self.std
        return gen_img
    
pre_proc_layer = PreProcBeforePerception(img_size=256)

#use pre-trained vgg model for feature extraction
m_vgg = models.vgg16(pretrained=True).features[:16].to(device).eval()

#set up the model
model = nn.Sequential(m_synth)
model.add_module(str(1), post_proc_layer)
model.add_module(str(2), pre_proc_layer)
model.add_module(str(3), m_vgg)

for param in model.parameters():
    param.requires_grad_(False)

print(m_vgg)

Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU(inplace)
  (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (3): ReLU(inplace)
  (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): ReLU(inplace)
  (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (8): ReLU(inplace)
  (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (11): ReLU(inplace)
  (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (13): ReLU(inplace)
  (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (15): ReLU(inplace)
)

As done by Puzer, I select the [conv->conv->pool->conv->conv->pool->conv->conv->conv] section of the vgg network for feature extraction.

Pre-computing the features for the reference image:

ref_img_path = "."
ref_img = np.array(Image.open(ref_img_path))
ref_img = ref_img.astype(np.float32)/255.
ref_img = np.array([np.transpose(ref_img, (2,0,1))])
ref_img = torch.tensor(ref_img, device=device)
ref_img = pre_proc_layer(ref_img)
ref_img_features = m_vgg(ref_img).detach()

Optimization:

trainable_latent = torch.randn((1,18,512), device=device).requires_grad_(True)
loss_func = torch.nn.MSELoss()

optimizer = optim.SGD([trainable_latent], lr=0.5)

losses = []
for i in tqdm(range(1000)):
    optimizer.zero_grad()
    gen_img_features = model(trainable_latent)
    loss = loss_func(gen_img_features, ref_img_features)
    loss_val = loss.data.cpu()
    losses.append(loss_val)
    loss.backward()
    optimizer.step()

The learned latent encoding and the subsequently generated images are of poor quality. The results are nowhere near as crisp as Puzer's.

What I have tried:

  1. Learning Z space latent instead of WP+
  2. Variety of optimizers, learning rate, iterations combos

What could be wrong:

  1. There might be issues with my pipeline above (new to pytorch)
  2. There might be some difference in pre-trained vgg networks for pytorch and keras, that I might have failed to take into account.
  3. The perceptual model used is not complex enough. (but it does work for Puzer)

Any help with the above would be much appreciated.

Attribute Scores

Hello, thanks for the great work.
I would like to know whether it is possible to see the code for the attribute predictor you used, or whether you could share its scores directly, so that boundaries can be found for different architectures but with the same dataset.
Thanks very much

quality on stylegan_ffhq

Hi, thanks for the paper and the results are impressive!

I tested the code with the "stylegan_ffhq" model and "stylegan_ffhq_pose_boundary.npy" or "stylegan_ffhq_pose_w_boundary.npy", with the default settings, but the results are not very good.

The person's identity, age, and even gender change simultaneously with the pose.
Regarding "stylegan_ffhq_pose_w_boundary.npy", the degree of pose change is more or less negligible.

python edit.py -m stylegan_ffhq -o results/stylegan_ffhq_pose_w_boundary -b ./boundaries/stylegan_ffhq_pose_w_boundary.npy -n 10

Is there anything that I have to adjust?
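
(For what it's worth, the paper's conditional manipulation idea is the usual remedy for this kind of entanglement; below is a hedged sketch, with boundary file names that should be double-checked against the boundaries/ folder.)

    # Hedged sketch of conditional manipulation: remove from the pose boundary
    # its component along a correlated boundary (e.g. gender), then renormalize,
    # so that moving along the projected direction disturbs the condition less.
    import numpy as np

    def project_boundary(primal, condition):
        """primal, condition: (1, 512) unit normals."""
        new = primal - primal.dot(condition.T) * condition
        return new / np.linalg.norm(new)

    # File names are assumptions; check the boundaries/ folder for the exact names.
    pose = np.load('boundaries/stylegan_ffhq_pose_boundary.npy')
    gender = np.load('boundaries/stylegan_ffhq_gender_boundary.npy')
    pose_given_gender = project_boundary(pose, gender)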

W space or W+ space

Hello! The GitHub repo provides some boundaries for the W space of StyleGAN, but I found that the code contains two configurations, W space and W+ space. So I wonder: do all the boundaries labeled W in the paper and on GitHub correspond to the W space? Are there no W+ space boundaries provided and no W+ space results in the paper? Thanks a lot.

How to manipulate real faces?

Dear authors, after checking this repository I found that it does not include an encoder-decoder model like the one the paper tests in Figure 11. Will this be released in the near future?

RuntimeError: Error(s) in loading state_dict for PGGANGeneratorModel:

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 10:30:13,999][INFO] Initializing generator.
[2019-08-12 10:30:23,051][INFO] Loading tensorflow model from {self.tf_model_path}.
[2019-08-12 10:30:28,891][INFO] Successfully loaded!
[2019-08-12 10:30:28,892][INFO] Converting tensorflow model to pytorch version.
[2019-08-12 10:30:29,095][INFO] Successfully converted!
[2019-08-12 10:30:29,095][INFO] Saving pytorch model to {self.model_path}.
[2019-08-12 10:30:29,120][INFO] Successfully saved!
[2019-08-12 10:30:29,120][INFO] Loading pytorch model from {self.model_path}.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 112, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 63, in main
model = PGGANGenerator(args.model_name, logger)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 24, in init
super().__init__(model_name, logger)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 96, in init
self.convert_tf_model()
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 70, in convert_tf_model
self.load()
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 34, in load
self.model.load_state_dict(torch.load(self.model_path))
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 719, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PGGANGeneratorModel:
Unexpected key(s) in state_dict: "layer5.conv.weight", "layer18.conv.weight", "layer2.conv.weight", "layer15.wscale.bias", "layer18.wscale.bias", "layer4.wscale.bias", "layer6.wscale.bias", "layer1.conv.weight", "layer8.conv.weight", "layer14.wscale.bias", "layer10.wscale.bias", "layer8.wscale.bias", "layer15.conv.weight", "output_1024x1024.conv.weight", "output_1024x1024.wscale.bias", "layer7.conv.weight", "layer9.conv.weight", "layer9.wscale.bias", "layer17.conv.weight", "layer13.wscale.bias", "layer12.wscale.bias", "layer14.conv.weight", "layer16.wscale.bias", "layer11.wscale.bias", "layer16.conv.weight", "layer10.conv.weight", "layer6.conv.weight", "layer17.wscale.bias", "layer4.conv.weight", "layer13.conv.weight", "layer5.wscale.bias", "layer2.wscale.bias", "layer3.wscale.bias", "layer12.conv.weight", "layer1.wscale.bias", "layer11.conv.weight", "layer7.wscale.bias", "layer3.conv.weight".

artifacts boundary

Hi there,

In the paper, the last paragraph of Section 3.3 mentions how to fix artifacts using a certain hyperplane. However, this hyperplane is missing from the boundaries folder of this repo. Are you going to release it soon?

Thanks!

Multi-GPU support

I have wrapped the model in models/base_generator. However, CUDA out of memory occurs when I run the synthesize script. Could you help me figure it out?
GPU: P40, 24 GB. Batch size: 32.

different interpolation logic

Hi,

First many thanks for sharing this great work.

Got a question regarding the linear interpolation logic. In the function linear_interpolate(), when len(latent_code.shape) == 2, the dot product of latent_code and boundary is subtracted from [start_distance, end_distance]. However, when len(latent_code.shape) == 3, the dot product is not considered at all. I am just wondering why these two cases are treated differently.

Thanks.
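
(For reference, a hedged sketch of the 2-D case described above, where the offsets are interpreted as target signed distances to the hyperplane; this is an illustration, not the repository's exact function.)

    # Illustration of the len(latent_code.shape) == 2 behaviour being discussed.
    import numpy as np

    def linear_interpolate_2d(latent_code, boundary, start=-3.0, end=3.0, steps=10):
        """latent_code: (1, 512); boundary: (1, 512) unit normal."""
        targets = np.linspace(start, end, steps).reshape(-1, 1)  # desired signed distances
        current = latent_code.dot(boundary.T)                    # current signed distance
        return latent_code + (targets - current) * boundary      # (steps, 512) edited codes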

Issues converting custom pggan models

It seems that for PGGAN, the state dictionaries of the model trained by the paper's authors (file karras2018iclr-celebahq-1024x1024.pkl) and of custom models trained with their released code are different. You can check the differences here: https://www.diffchecker.com/0hFYlK82.

Unfortunately I was unable to tweak your code so that it would convert my trained model correctly.

Did you try converting a custom model trained with the PGGAN code? If so, could you please provide a model you trained that worked for you, other than the one provided by the PGGAN authors?

AssertionError: Torch not compiled with CUDA enabled

(base) PS E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN> python generate_data.py -m stylegan_ffhq -o data/pggan_celebahq -n 10000
[2020-01-20 03:53:48,282][INFO] Initializing generator.
[2020-01-20 03:53:48,521][WARNING] No pre-trained model will be loaded!
Traceback (most recent call last):
File "generate_data.py", line 111, in
main()
File "generate_data.py", line 64, in main
model = StyleGANGenerator(args.model_name, logger)
File "E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN\models\stylegan_generator.py", line 42, in init
super().__init__(model_name, logger)
File "E:\darshan\pytorch_stylegan_encoder-master\InterFaceGAN\models\base_generator.py", line 103, in init
self.model.eval().to(self.run_device)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 426, in to
return self._apply(convert)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 202, in _apply
module._apply(fn)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 224, in apply
param_applied = fn(param)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 424, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\cuda_init
.py", line 192, in _lazy_init
_check_driver()
File "C:\Users\HpZ8\Anaconda3\lib\site-packages\torch\cuda_init
.py", line 95, in _check_driver
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Attributes predictor

Hi! Thanks for your great research!
Is there any chance you can provide the attribute predictor that you used for finding boundaries? I didn't really understand what scores it should assign to the attributes in the pictures.
Also, for binary attributes like glasses/no glasses, is it possible to manually assign labels of, say, 1 for glasses and 0 for no glasses and feed this data to train a new boundary?

Questions about generator W space data by stylegan_ffhq

Hi, I want to edit image in W space.

In #35, it is suggested to use generate_data.py to get w.npy first.
I used the following command to generate images. However, the images are strange and do not look like normal human faces:
python generate_data.py -m stylegan_ffhq -o data/stylegan-ffhq -n 3 -s W

[generated samples 000000, 000001, 000002 attached]

Was my code wrong?

Also, I want to ask: if I want to edit images in the W space, is the following command right?

python edit.py \
    -m stylegan_ffhq \
    -b boundaries/stylegan_ffhq_age_w_boundary.npy \
    -i ./data/stylegan-ffhq/w.npy \
    -o results/stylegan_celebahq_age_w_boundary \
    -s W

Confusion about the definition of binary attribute

Hello! I am confused about the definition of a binary attribute. For example, if I want to change the hair color attribute, does the method only work for dark colors vs. light colors? Can I get a specific color direction, like a red hair direction, by collecting red-hair and non-red-hair images to train a
binary classifier? Or could we get the red color direction by collecting red-hair and yellow-hair images?

Regarding yaw pose estimation using facial landmarks

Hi,

I wish to apply your technique to a StyleGAN model that I have trained on Celeba-HQ-128 images. Can you please release the code to estimate yaw pose using the five facial landmarks present in CelebA dataset (left eye centre, right eye centre, nose tip, left mouth corner and right mouth corner)?

Thanks.

generate_data.py generates empty images

Hi guys,

Thank you for your great work. I have a question.

I found that generate_data.py works very well with my style-gan models, but when I trained PGGAN with the same dataset of images and then ran generate_data.py for generating 10k images, all images look like this:

[generated sample 000009 attached]

I tried another checkpoint but it gave similar results:

[generated sample 000015 attached]

I guess it is caused by some misconfiguration between my PGGAN and the inference code of InterFaceGAN, but I haven't found it yet. Can you give me some advice, please?

about style_mod

Why is the function x * (style[:, 0] + 1) + style[:, 1]? What does the "+1" in (style[:, 0] + 1) mean? The official implementation is the same as yours; I can't figure it out.
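
(A tiny self-contained illustration of the expression in question: the style branch predicts a residual scale, so the "+1" makes a zero style output correspond to an identity scaling of the normalized feature map. The shapes below are illustrative only.)

    # Per-channel (scale, bias) style applied to a normalized feature map.
    import torch

    x = torch.randn(1, 512, 4, 4)           # instance-normalized feature map
    style = torch.randn(1, 2, 512, 1, 1)    # style[:, 0] = scale residual, style[:, 1] = bias
    out = x * (style[:, 0] + 1) + style[:, 1]
    print(out.shape)                         # torch.Size([1, 512, 4, 4])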

How to find more boundaries?

Congratulations on the great work!

Please correct me if I'm wrong. From what I understand, given a pretrained model and a boundary (in boundaries/), we can tune features of the generated images. This is amazing.

I wonder if I can explore other boundaries as well, say, hair color or skin color? If it is possible, how could I do that?

Thank you very much!

Issue when using pretrained model with input size 512x512

Hi,

I am trying to run generate_data.py using my pretrained model which was trained on 512x512 images. It successfully converted the pkl model to pth, but then showed the error below.

Traceback (most recent call last):
File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/generate_data.py", line 114, in <module>
main()
File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/generate_data.py", line 65, in main
model = StyleGANGenerator(args.model_name, logger)
File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/stylegan_generator.py", line 42, in __init__
super().__init__(model_name, logger)
File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/base_generator.py", line 95, in __init__
self.load()
File "/media/tai/6TB/Projects/InterFaceGAN/InterFaceGAN/models/stylegan_generator.py", line 63, in load
self.model.load_state_dict(state_dict)
File "/media/tai/6TB/anaconda3/envs/InterfaceGAN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for StyleGANGeneratorModel:
Unexpected key(s) in state_dict: "synthesis.output8.conv.weight", "synthesis.output8.bias".
size mismatch for synthesis.output4.conv.weight: copying a param with shape torch.Size([3, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for synthesis.output5.conv.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
size mismatch for synthesis.output6.conv.weight: copying a param with shape torch.Size([3, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 64, 1, 1]).
size mismatch for synthesis.output7.conv.weight: copying a param with shape torch.Size([3, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
I guess it is because InterFaceGAN is set to work with models being trained with 1024x1024 images by default.

Where should I modify so that I can load my 512x512 model?

Thank you very much!

How to perform StyleGAN inversion?

Hi Yujun,

In the paper you claim that a GAN inversion method must be used to map real images to latent codes, and that StyleGAN inversion methods are much better. Are there any documents introducing how to do the inversion?
Any comments are appreciated! Best regards.

How can I input an image to edit?

Hi,
First many thanks for sharing this great work.
I want to ask a question. The input of InterFaceGAN is a latent code. If I want to input a face image and edit the face, how should I deal with that? Do I need to transform the image into a latent code, and how do I do that?

Thank you.

RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

E:\Users\Raytine\Anaconda3\python.exe F:/expression/InterFaceGAN-master/edit.py -m pggan_celebahq -b boundaries/pggan_celebahq_smile_boundary.npy -n 10 -o results/pggan_celebahq_smile_editing
[2019-08-12 14:15:15,846][INFO] Initializing generator.
[2019-08-12 14:15:15,972][INFO] Loading pytorch model from {self.model_path}.
[2019-08-12 14:15:16,002][INFO] Successfully loaded!
[2019-08-12 14:15:17,357][INFO] Preparing boundary.
0%| | 0/10 [00:00<?, ?it/s][2019-08-12 14:15:17,394][INFO] Preparing latent codes.
[2019-08-12 14:15:17,394][INFO] Sample latent codes randomly.
[2019-08-12 14:15:17,395][INFO] Editing {total_num} samples.
Traceback (most recent call last):
File "F:/expression/InterFaceGAN-master/edit.py", line 114, in
main()
File "F:/expression/InterFaceGAN-master/edit.py", line 100, in main
outputs = model.easy_synthesize(interpolations_batch)
File "F:\expression\InterFaceGAN-master\models\base_generator.py", line 230, in easy_synthesize
outputs = self.synthesize(latent_codes, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator.py", line 132, in synthesize
images = self.model(zs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 127, in forward
return super().forward(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
input = module(input)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "F:\expression\InterFaceGAN-master\models\pggan_generator_model.py", line 243, in forward
x = self.conv(x)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "E:\Users\Raytine\Anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [16, 16, 3, 3], expected input[4, 512, 1, 1] to have 16 channels, but got 512 channels instead

the change of smile when increase the distance of the "glasses" attribute

Hi, thanks for your great research.
I have a problem when I apply "pggan_celebahq_eyeglasses_c_age_gender_boundary.npy" to the pre-trained PGGAN model. With the conditional manipulation of the "glasses" attribute, the smile changes gradually. Here are the results:
[result images 000_001, 000_004, 008_001, 008_005 attached]
But these attributes seem to be uncorrelated in your paper.
Could you explain this for me?

Roll and Pitch for Pose data

Hi Shenyujun,
Thanks so much for the awesome work! I saw you have a pose direction in the repo, but there is no roll or pitch rotation in that direction. Do you think it's possible to train with pitch, roll, and yaw together? By the way, is there any attribute predictor or data you could share regarding the pose direction? That would be really helpful; I could not find it in the original StyleGAN repo.
Thanks

How to train w+ space boundary?

Using the stylegan-encoder project, I got the latent codes as an array of shape (n, 18, 512). However, the training code expects 1-D vector inputs; do I need to flatten the latent code into a 1-D vector?
Thanks a lot!
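
(Not from the repo; a hedged sketch under the assumption of (n, 18, 512) codes and hypothetical .npy file names, showing the two obvious options: flatten the W+ code, or fit one boundary per layer.)

    # Sketch only; the file names and the choice of SVM are assumptions.
    import numpy as np
    from sklearn.svm import LinearSVC

    codes = np.load('wp_codes.npy')    # hypothetical, shape (n, 18, 512)
    labels = np.load('labels.npy')     # hypothetical binary labels, shape (n,)

    # Option 1: one boundary over the flattened W+ code.
    flat = codes.reshape(codes.shape[0], -1)                            # (n, 18*512)
    wp_boundary = LinearSVC().fit(flat, labels).coef_.reshape(1, 18, 512)
    wp_boundary /= np.linalg.norm(wp_boundary)

    # Option 2: an independent (1, 512) boundary for each of the 18 layers.
    per_layer = [LinearSVC().fit(codes[:, i, :], labels).coef_ for i in range(18)]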

Issue with tensorflow version

When I try to install TensorFlow 1.12.2 (I believe TensorFlow 1.12.2 and CUDA 9 are compatible for running this code), I get a version mismatch error and some other version gets installed. When I try to display the TensorFlow version I get errors, and no images are generated with other versions of TensorFlow.

Why the latent code from GAN inversion methods can be manipulated by the boundary

Hi, thanks for sharing this great work!

I'm trying to edit a new face. In #30, it is suggested to first use https://github.com/Puzer/stylegan-encoder to get the latent code of the new face in W+ space. However, the shape of that latent code is (18, 512) and the 18 layers have different values.

What confuses me is:

  1. The shape of "stylegan_ffhq_age_w_boundary.npy" is (1, 512), so if we use a (1, 512) boundary to edit an (18, 512) latent code, all layers will be edited by the same value. But the meanings of the different layers of the (18, 512) latent code are not the same, because the values of the 18 layers are different.

     Why can we use a (1, 512) boundary to edit an (18, 512) latent code? Why does it still work? (See the short sketch after this list.)

  2. If the (18, 512) latent code has different values in its 18 layers, wouldn't training an (18, 512) boundary (which also has different values in its 18 layers) be more reasonable?

  3. In your paper, you also run the experiment on real images. Which latent space did you get from your StyleGAN encoder: Z, W, or W+?
     If the shape of your latent code is (18, 512), do the 18 layers have different values?
Thank you!
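
(To make the broadcasting in question 1 concrete, a short sketch: a (1, 512) boundary applied to an (18, 512) W+ code shifts every layer by the same offset.)

    # Illustration only, with random placeholders for the inverted code and boundary.
    import numpy as np

    wp_code = np.random.randn(18, 512)          # inverted W+ code (placeholder)
    boundary = np.random.randn(1, 512)
    boundary /= np.linalg.norm(boundary)

    alpha = 3.0                                  # editing strength
    edited = wp_code + alpha * boundary          # (1, 512) broadcasts across the 18 layers
    print(edited.shape)                          # (18, 512)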

how to prepare data for a custom attribute

Hi, I think the main idea is to prepare data with/without a certain attribute and to train a binary classifier (namely a linear SVM) on that data.

But most attributes take continuous values, such as pose rotation.
The logic in your code is to use the average of the largest value and the smallest value as a threshold.

There are also attributes that should be quantized into several finite states; for example, face shape can be one of [square, triangle, heart, round, oval].

Do you have more detailed suggestions on how to quantize those attributes?

Thanks!
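
(One hedged recipe for the continuous case discussed above, with hypothetical file names: keep only the most confident samples at both ends of the score distribution, label them 0/1, then fit a linear SVM.)

    # Sketch only; the thresholds and file names are assumptions, not the repo's defaults.
    import numpy as np
    from sklearn.svm import LinearSVC

    scores = np.load('attribute_scores.npy')   # hypothetical continuous scores, shape (n,)
    codes = np.load('latent_codes.npy')        # hypothetical latent codes, shape (n, 512)

    order = np.argsort(scores)
    k = max(1, len(scores) // 50)              # keep the extreme ~2% at each end (arbitrary)
    X = np.concatenate([codes[order[:k]], codes[order[-k:]]])
    y = np.concatenate([np.zeros(k), np.ones(k)])

    boundary = LinearSVC().fit(X, y).coef_.reshape(1, -1)
    boundary /= np.linalg.norm(boundary)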
