Hi. Your results are great. I am currently working on pose transformation, and your repository has helped me a lot in taking the next step. The images StyleGAN2 generates after style mixing are more diverse and of better quality than the previous results.
Can you tell me the process for training a StyleGAN2 model to obtain boundaries?
I changed the following parameter values based on stylegan2-ffhq-config-f.pkl:
_RESOLUTIONS_TO_CHANNELS = {
    8: [512, 512, 512],
    16: [512, 512, 512, 512],
    32: [512, 512, 512, 512, 512],
    64: [512, 512, 512, 512, 512, 256],
    128: [512, 512, 512, 512, 512, 256, 128],
    256: [512, 512, 512, 512, 512, 256, 128, 64],
    512: [512, 512, 512, 512, 512, 256, 128, 64, 32],
    1024: [512, 512, 512, 512, 512, 256, 128, 64, 32, 16],
}
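For reference, this is how I line the channel counts up with the resolution blocks (a small sketch; my reading of stylegan_generator_model.py is that consecutive entries are the in/out channels of each block, which is an assumption on my part):

import numpy as np

resolution = 1024
channels = _RESOLUTIONS_TO_CHANNELS[resolution]
levels = [2 ** i for i in range(2, int(np.log2(resolution)) + 1)]  # 4 .. 1024
# Assumed pairing: consecutive entries give each block's in/out channels.
for res, (c_in, c_out) in zip(levels, zip(channels, channels[1:])):
    print(f'{res}x{res}: in={c_in}, out={c_out}')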
# pylint: disable=line-too-long
# Variable mapping from pytorch model to official tensorflow model.
_STYLEGAN_PTH_VARS_TO_TF_VARS = {
    # Statistic information of disentangled latent feature, w.
    'truncation.w_avg': 'dlatent_avg',  # [512]
    # Noises.
    'synthesis.layer0.epilogue.apply_noise.noise': 'noise0',  # [1, 1, 4, 4]
    'synthesis.layer1.epilogue.apply_noise.noise': 'noise1',  # [1, 1, 8, 8]
    'synthesis.layer2.epilogue.apply_noise.noise': 'noise2',  # [1, 1, 8, 8]
    'synthesis.layer3.epilogue.apply_noise.noise': 'noise3',  # [1, 1, 16, 16]
    'synthesis.layer4.epilogue.apply_noise.noise': 'noise4',  # [1, 1, 16, 16]
    'synthesis.layer5.epilogue.apply_noise.noise': 'noise5',  # [1, 1, 32, 32]
    'synthesis.layer6.epilogue.apply_noise.noise': 'noise6',  # [1, 1, 32, 32]
    'synthesis.layer7.epilogue.apply_noise.noise': 'noise7',  # [1, 1, 64, 64]
    'synthesis.layer8.epilogue.apply_noise.noise': 'noise8',  # [1, 1, 64, 64]
    'synthesis.layer9.epilogue.apply_noise.noise': 'noise9',  # [1, 1, 128, 128]
    'synthesis.layer10.epilogue.apply_noise.noise': 'noise10',  # [1, 1, 128, 128]
    'synthesis.layer11.epilogue.apply_noise.noise': 'noise11',  # [1, 1, 256, 256]
    'synthesis.layer12.epilogue.apply_noise.noise': 'noise12',  # [1, 1, 256, 256]
    'synthesis.layer13.epilogue.apply_noise.noise': 'noise13',  # [1, 1, 512, 512]
    'synthesis.layer14.epilogue.apply_noise.noise': 'noise14',  # [1, 1, 512, 512]
    'synthesis.layer15.epilogue.apply_noise.noise': 'noise15',  # [1, 1, 1024, 1024]
    'synthesis.layer16.epilogue.apply_noise.noise': 'noise16',  # [1, 1, 1024, 1024]
    # Mapping blocks.
    'mapping.dense0.linear.weight': 'Dense0/weight',  # [512, 512]
    'mapping.dense0.wscale.bias': 'Dense0/bias',  # [512]
    'mapping.dense1.linear.weight': 'Dense1/weight',  # [512, 512]
    'mapping.dense1.wscale.bias': 'Dense1/bias',  # [512]
    'mapping.dense2.linear.weight': 'Dense2/weight',  # [512, 512]
    'mapping.dense2.wscale.bias': 'Dense2/bias',  # [512]
    'mapping.dense3.linear.weight': 'Dense3/weight',  # [512, 512]
    'mapping.dense3.wscale.bias': 'Dense3/bias',  # [512]
    'mapping.dense4.linear.weight': 'Dense4/weight',  # [512, 512]
    'mapping.dense4.wscale.bias': 'Dense4/bias',  # [512]
    'mapping.dense5.linear.weight': 'Dense5/weight',  # [512, 512]
    'mapping.dense5.wscale.bias': 'Dense5/bias',  # [512]
    'mapping.dense6.linear.weight': 'Dense6/weight',  # [512, 512]
    'mapping.dense6.wscale.bias': 'Dense6/bias',  # [512]
    'mapping.dense7.linear.weight': 'Dense7/weight',  # [512, 512]
    'mapping.dense7.wscale.bias': 'Dense7/bias',  # [512]
    # Synthesis blocks.
    'synthesis.lod': 'lod',  # []
    'synthesis.add_constant': '4x4/Const/const',
    'synthesis.layer0.conv.weight': '4x4/Conv/weight',
    'synthesis.layer0.epilogue.mod_weight': '4x4/Conv/mod_weight',
    'synthesis.layer0.epilogue.mod_bias': '4x4/Conv/mod_bias',
    'synthesis.layer0.epilogue.apply_noise': '4x4/Conv/noise_strength',
    'synthesis.layer0.epilogue.bias': '4x4/Conv/bias',
    'synthesis.output0.conv.weight': '4x4/ToRGB/weight',
    'synthesis.output0.epilogue.mod_weight': '4x4/ToRGB/mod_weight',
    'synthesis.output0.epilogue.mod_bias': '4x4/ToRGB/mod_bias',
    'synthesis.output0.epilogue.bias': '4x4/ToRGB/bias',
    'synthesis.layer1.conv.weight': '8x8/Conv0_up/weight',
    'synthesis.layer1.epilogue.mod_weight': '8x8/Conv0_up/mod_weight',
    'synthesis.layer1.epilogue.mod_bias': '8x8/Conv0_up/mod_bias',
    'synthesis.layer1.epilogue.apply_noise': '8x8/Conv0_up/noise_strength',
    'synthesis.layer1.epilogue.bias': '8x8/Conv0_up/bias',
    'synthesis.layer2.conv.weight': '8x8/Conv1/weight',
    'synthesis.layer2.epilogue.mod_weight': '8x8/Conv1/mod_weight',
    'synthesis.layer2.epilogue.mod_bias': '8x8/Conv1/mod_bias',
    'synthesis.layer2.epilogue.apply_noise': '8x8/Conv1/noise_strength',
    'synthesis.layer2.epilogue.bias': '8x8/Conv1/bias',
    'synthesis.output1.conv.weight': '8x8/ToRGB/weight',
    'synthesis.output1.epilogue.mod_weight': '8x8/ToRGB/mod_weight',
    'synthesis.output1.epilogue.mod_bias': '8x8/ToRGB/mod_bias',
    'synthesis.output1.epilogue.bias': '8x8/ToRGB/bias',
    'synthesis.layer3.conv.weight': '16x16/Conv0_up/weight',
    'synthesis.layer3.epilogue.mod_weight': '16x16/Conv0_up/mod_weight',
    'synthesis.layer3.epilogue.mod_bias': '16x16/Conv0_up/mod_bias',
    'synthesis.layer3.epilogue.apply_noise': '16x16/Conv0_up/noise_strength',
    'synthesis.layer3.epilogue.bias': '16x16/Conv0_up/bias',
    'synthesis.layer4.conv.weight': '16x16/Conv1/weight',
    'synthesis.layer4.epilogue.mod_weight': '16x16/Conv1/mod_weight',
    'synthesis.layer4.epilogue.mod_bias': '16x16/Conv1/mod_bias',
    'synthesis.layer4.epilogue.apply_noise': '16x16/Conv1/noise_strength',
    'synthesis.layer4.epilogue.bias': '16x16/Conv1/bias',
    'synthesis.output2.conv.weight': '16x16/ToRGB/weight',
    'synthesis.output2.epilogue.mod_weight': '16x16/ToRGB/mod_weight',
    'synthesis.output2.epilogue.mod_bias': '16x16/ToRGB/mod_bias',
    'synthesis.output2.epilogue.bias': '16x16/ToRGB/bias',
    'synthesis.layer5.conv.weight': '32x32/Conv0_up/weight',
    'synthesis.layer5.epilogue.mod_weight': '32x32/Conv0_up/mod_weight',
    'synthesis.layer5.epilogue.mod_bias': '32x32/Conv0_up/mod_bias',
    'synthesis.layer5.epilogue.apply_noise': '32x32/Conv0_up/noise_strength',
    'synthesis.layer5.epilogue.bias': '32x32/Conv0_up/bias',
    'synthesis.layer6.conv.weight': '32x32/Conv1/weight',
    'synthesis.layer6.epilogue.mod_weight': '32x32/Conv1/mod_weight',
    'synthesis.layer6.epilogue.mod_bias': '32x32/Conv1/mod_bias',
    'synthesis.layer6.epilogue.apply_noise': '32x32/Conv1/noise_strength',
    'synthesis.layer6.epilogue.bias': '32x32/Conv1/bias',
    'synthesis.output3.conv.weight': '32x32/ToRGB/weight',
    'synthesis.output3.epilogue.mod_weight': '32x32/ToRGB/mod_weight',
    'synthesis.output3.epilogue.mod_bias': '32x32/ToRGB/mod_bias',
    'synthesis.output3.epilogue.bias': '32x32/ToRGB/bias',
    'synthesis.layer7.conv.weight': '64x64/Conv0_up/weight',
    'synthesis.layer7.epilogue.mod_weight': '64x64/Conv0_up/mod_weight',
    'synthesis.layer7.epilogue.mod_bias': '64x64/Conv0_up/mod_bias',
    'synthesis.layer7.epilogue.apply_noise': '64x64/Conv0_up/noise_strength',
    'synthesis.layer7.epilogue.bias': '64x64/Conv0_up/bias',
    'synthesis.layer8.conv.weight': '64x64/Conv1/weight',
    'synthesis.layer8.epilogue.mod_weight': '64x64/Conv1/mod_weight',
    'synthesis.layer8.epilogue.mod_bias': '64x64/Conv1/mod_bias',
    'synthesis.layer8.epilogue.apply_noise': '64x64/Conv1/noise_strength',
    'synthesis.layer8.epilogue.bias': '64x64/Conv1/bias',
    'synthesis.output4.conv.weight': '64x64/ToRGB/weight',
    'synthesis.output4.epilogue.mod_weight': '64x64/ToRGB/mod_weight',
    'synthesis.output4.epilogue.mod_bias': '64x64/ToRGB/mod_bias',
    'synthesis.output4.epilogue.bias': '64x64/ToRGB/bias',
    'synthesis.layer9.conv.weight': '128x128/Conv0_up/weight',
    'synthesis.layer9.epilogue.mod_weight': '128x128/Conv0_up/mod_weight',
    'synthesis.layer9.epilogue.mod_bias': '128x128/Conv0_up/mod_bias',
    'synthesis.layer9.epilogue.apply_noise': '128x128/Conv0_up/noise_strength',
    'synthesis.layer9.epilogue.bias': '128x128/Conv0_up/bias',
    'synthesis.layer10.conv.weight': '128x128/Conv1/weight',
    'synthesis.layer10.epilogue.mod_weight': '128x128/Conv1/mod_weight',
    'synthesis.layer10.epilogue.mod_bias': '128x128/Conv1/mod_bias',
    'synthesis.layer10.epilogue.apply_noise': '128x128/Conv1/noise_strength',
    'synthesis.layer10.epilogue.bias': '128x128/Conv1/bias',
    'synthesis.output5.conv.weight': '128x128/ToRGB/weight',
    'synthesis.output5.epilogue.mod_weight': '128x128/ToRGB/mod_weight',
    'synthesis.output5.epilogue.mod_bias': '128x128/ToRGB/mod_bias',
    'synthesis.output5.epilogue.bias': '128x128/ToRGB/bias',
    'synthesis.layer11.conv.weight': '256x256/Conv0_up/weight',
    'synthesis.layer11.epilogue.mod_weight': '256x256/Conv0_up/mod_weight',
    'synthesis.layer11.epilogue.mod_bias': '256x256/Conv0_up/mod_bias',
    'synthesis.layer11.epilogue.apply_noise': '256x256/Conv0_up/noise_strength',
    'synthesis.layer11.epilogue.bias': '256x256/Conv0_up/bias',
    'synthesis.layer12.conv.weight': '256x256/Conv1/weight',
    'synthesis.layer12.epilogue.mod_weight': '256x256/Conv1/mod_weight',
    'synthesis.layer12.epilogue.mod_bias': '256x256/Conv1/mod_bias',
    'synthesis.layer12.epilogue.apply_noise': '256x256/Conv1/noise_strength',
    'synthesis.layer12.epilogue.bias': '256x256/Conv1/bias',
    'synthesis.output6.conv.weight': '256x256/ToRGB/weight',
    'synthesis.output6.epilogue.mod_weight': '256x256/ToRGB/mod_weight',
    'synthesis.output6.epilogue.mod_bias': '256x256/ToRGB/mod_bias',
    'synthesis.output6.epilogue.bias': '256x256/ToRGB/bias',
    'synthesis.layer13.conv.weight': '512x512/Conv0_up/weight',
    'synthesis.layer13.epilogue.mod_weight': '512x512/Conv0_up/mod_weight',
    'synthesis.layer13.epilogue.mod_bias': '512x512/Conv0_up/mod_bias',
    'synthesis.layer13.epilogue.apply_noise': '512x512/Conv0_up/noise_strength',
    'synthesis.layer13.epilogue.bias': '512x512/Conv0_up/bias',
    'synthesis.layer14.conv.weight': '512x512/Conv1/weight',
    'synthesis.layer14.epilogue.mod_weight': '512x512/Conv1/mod_weight',
    'synthesis.layer14.epilogue.mod_bias': '512x512/Conv1/mod_bias',
    'synthesis.layer14.epilogue.apply_noise': '512x512/Conv1/noise_strength',
    'synthesis.layer14.epilogue.bias': '512x512/Conv1/bias',
    'synthesis.output7.conv.weight': '512x512/ToRGB/weight',
    'synthesis.output7.epilogue.mod_weight': '512x512/ToRGB/mod_weight',
    'synthesis.output7.epilogue.mod_bias': '512x512/ToRGB/mod_bias',
    'synthesis.output7.epilogue.bias': '512x512/ToRGB/bias',
    'synthesis.layer15.conv.weight': '1024x1024/Conv0_up/weight',
    'synthesis.layer15.epilogue.mod_weight': '1024x1024/Conv0_up/mod_weight',
    'synthesis.layer15.epilogue.mod_bias': '1024x1024/Conv0_up/mod_bias',
    'synthesis.layer15.epilogue.apply_noise': '1024x1024/Conv0_up/noise_strength',
    'synthesis.layer15.epilogue.bias': '1024x1024/Conv0_up/bias',
    'synthesis.layer16.conv.weight': '1024x1024/Conv1/weight',
    'synthesis.layer16.epilogue.mod_weight': '1024x1024/Conv1/mod_weight',
    'synthesis.layer16.epilogue.mod_bias': '1024x1024/Conv1/mod_bias',
    'synthesis.layer16.epilogue.apply_noise': '1024x1024/Conv1/noise_strength',
    'synthesis.layer16.epilogue.bias': '1024x1024/Conv1/bias',
    'synthesis.output8.conv.weight': '1024x1024/ToRGB/weight',
    'synthesis.output8.epilogue.mod_weight': '1024x1024/ToRGB/mod_weight',
    'synthesis.output8.epilogue.mod_bias': '1024x1024/ToRGB/mod_bias',
    'synthesis.output8.epilogue.bias': '1024x1024/ToRGB/bias',
}
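For context, this is roughly how I consume the mapping when converting (a minimal sketch, assuming tf_vars is a dict of TF variable names to numpy arrays from the unpickled network; the transposes mirror the ones in stylegan_generator_model.py and may need adjusting for StyleGAN2's modulated convolutions):

import numpy as np
import torch

def convert_tf_vars(model, tf_vars):
    # Minimal sketch, not the repository's exact convert function.
    state_dict = model.state_dict()
    for pth_name, tf_name in _STYLEGAN_PTH_VARS_TO_TF_VARS.items():
        if tf_name not in tf_vars:
            print(f'Variable {tf_name} does not exist in tensorflow model.')
            continue
        var = np.array(tf_vars[tf_name])
        if 'weight' in pth_name:
            if var.ndim == 2:    # dense / modulation weights: [in, out] -> [out, in]
                var = var.T
            elif var.ndim == 4:  # conv weights: HWIO -> OIHW
                var = var.transpose(3, 2, 0, 1)
        # Checking shapes here would surface the size mismatches at
        # conversion time instead of at load time.
        assert state_dict[pth_name].shape == tuple(var.shape), (
            f'{pth_name}: model {tuple(state_dict[pth_name].shape)} vs '
            f'checkpoint {tuple(var.shape)}')
        state_dict[pth_name] = torch.from_numpy(var)
    model.load_state_dict(state_dict)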
I made these changes in stylegan2_generator_model.py, which is a copy of stylegan_generator_model.py. In stylegan_generator.py, the following code is used:

if 'ToRGB_lod' in tf_var_name:
    lod = int(tf_var_name[len('ToRGB_lod')])
    lod_shift = 10 - int(np.log2(self.resolution))
    tf_var_name = tf_var_name.replace(f'{lod}', f'{lod - lod_shift}')
if tf_var_name not in tf_vars:
    self.logger.debug(f'Variable {tf_var_name} does not exist in '
                      f'tensorflow model.')
In StyleGAN2, the TF variable names are like '512x512/ToRGB/weight', i.e. the resolution is already encoded directly in the name, so I removed the ToRGB_lod steps above and kept only the existence check.
After making these modifications, the script reports that the model was saved successfully, but loading the converted weights fails with the following errors:
size mismatch for synthesis.layer1.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 8, 8]) from checkpoint, the shape in current model is torch.Size([1, 1, 4, 4]).
size mismatch for synthesis.layer3.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 16, 16]) from checkpoint, the shape in current model is torch.Size([1, 1, 8, 8]).
size mismatch for synthesis.layer5.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 32, 32]) from checkpoint, the shape in current model is torch.Size([1, 1, 16, 16]).
size mismatch for synthesis.layer7.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 32, 32]).
size mismatch for synthesis.layer8.conv.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]).
size mismatch for synthesis.layer8.epilogue.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for synthesis.layer9.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 1, 64, 64]).
size mismatch for synthesis.output4.conv.weight: copying a param with shape torch.Size([3, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 256, 1, 1]).
size mismatch for synthesis.layer10.epilogue.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for synthesis.layer11.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 1, 128, 128]).
size mismatch for synthesis.output5.conv.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
size mismatch for synthesis.layer12.epilogue.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for synthesis.layer13.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 256, 256]).
size mismatch for synthesis.output6.conv.weight: copying a param with shape torch.Size([3, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 64, 1, 1]).
size mismatch for synthesis.layer14.epilogue.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
size mismatch for synthesis.layer15.epilogue.apply_noise.noise: copying a param with shape torch.Size([1, 1, 1024, 1024]) from checkpoint, the shape in current model is torch.Size([1, 1, 512, 512]).
size mismatch for synthesis.output7.conv.weight: copying a param with shape torch.Size([3, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 32, 1, 1]).
size mismatch for synthesis.layer16.epilogue.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
size mismatch for synthesis.output8.conv.weight: copying a param with shape torch.Size([3, 32, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 16, 1, 1]).
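If it helps, the mismatch pattern looks systematic to me: every odd-numbered layer's noise is one resolution level off, which is what I would expect if the current PyTorch model still follows StyleGAN1's layer numbering (two conv layers at 4x4) while the checkpoint follows StyleGAN2's (a single conv at 4x4), and the conv channel counts from 64x64 upwards are exactly half of the checkpoint's. A small sketch I used to compare the two numbering schemes (the formulas are my reading of the two official synthesis networks):

def noise_res_stylegan1(layer_idx):
    # Two conv layers per block starting at 4x4: layers 0 and 1 are 4x4,
    # layers 2 and 3 are 8x8, and so on.
    return 2 ** (layer_idx // 2 + 2)

def noise_res_stylegan2(layer_idx):
    # A single conv at 4x4, then two per block: layer 0 is 4x4, layers 1
    # and 2 are 8x8, etc. (shape = [1, 1, 2**res, 2**res] with
    # res = (layer_idx + 5) // 2 in the official G_synthesis_stylegan2).
    return 2 ** ((layer_idx + 5) // 2)

for layer_idx in range(17):
    print(layer_idx, noise_res_stylegan1(layer_idx),
          noise_res_stylegan2(layer_idx))

# Channels per resolution read off the error log above: the current model
# ends up with exactly half the checkpoint's channels at high resolutions,
# which suggests my channel lists need doubling at the top end for config-f.
checkpoint_channels = {64: 512, 128: 256, 256: 128, 512: 64, 1024: 32}
current_channels = {64: 256, 128: 128, 256: 64, 512: 32, 1024: 16}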
Could you help me change the functions in stylegan2_generator_model.py accordingly?
Thanks in advance.
Regards,
SandhyaLaxmi Kanna