huggingface / pytorch-pretrained-biggan
📦 A PyTorch implementation of BigGAN with pretrained weights and conversion scripts.
License: MIT License
Hi, very nice work making pretrained TF models available in PyTorch.
How can one use this model to generate representations?
The function one_hot_from_names throws an AssertionError when it is given a class name that is not among the original ImageNet classes and for which no matching synsets exist either. This happens because batch_size no longer matches the class list when one_hot_from_int is called in utils.py after converting words to their respective indices.
The following lines should be able to reproduce this:
import torch
from pytorch_pretrained_biggan import BigGAN, one_hot_from_names
model = BigGAN.from_pretrained('biggan-deep-256')
class_vector = one_hot_from_names(['cake'], batch_size=1)
This would throw the following error:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-4-4dd4cd1296e1> in <module>()
1
----> 2 class_vector = one_hot_from_names(['cake'], batch_size=1)
/home/saqib/Projects/Poem2GIF/repos/pytorch-pretrained-BigGAN/pytorch_pretrained_biggan/utils.py in one_hot_from_names(class_name_or_list, batch_size)
211 classes.append(IMAGENET[possible_synsets[0].offset()])
212
--> 213 return one_hot_from_int(classes, batch_size=batch_size)
214
215
/home/saqib/Projects/Poem2GIF/repos/pytorch-pretrained-BigGAN/pytorch_pretrained_biggan/utils.py in one_hot_from_int(int_or_list, batch_size)
164 int_or_list = [int_or_list[0]] * batch_size
165
--> 166 assert batch_size == len(int_or_list)
167
168 array = np.zeros((batch_size, NUM_CLASSES), dtype=np.float32)
AssertionError:
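For context, here is a minimal sketch of the logic inside one_hot_from_int, simplified from the traceback above (the function body is reconstructed from the quoted lines; NUM_CLASSES = 1000 is the standard ImageNet class count used by BigGAN). When the word lookup in one_hot_from_names finds no synset, it passes an empty class list, so the single-element broadcast branch never runs and the assertion fires:

```python
import numpy as np

NUM_CLASSES = 1000  # ImageNet classes used by BigGAN

def one_hot_from_int(int_or_list, batch_size=1):
    """Sketch of the utility's logic, simplified from utils.py."""
    if isinstance(int_or_list, int):
        int_or_list = [int_or_list]
    if len(int_or_list) == 1:
        # Broadcast a single class index across the batch.
        int_or_list = [int_or_list[0]] * batch_size
    # An empty list skips the branch above, so this assertion fails.
    assert batch_size == len(int_or_list)
    array = np.zeros((batch_size, NUM_CLASSES), dtype=np.float32)
    for i, j in enumerate(int_or_list):
        array[i, j] = 1.0
    return array

# Reproduces the reported failure mode: an unknown class name yields
# an empty index list before this function is called.
try:
    one_hot_from_int([], batch_size=1)
except AssertionError:
    print("AssertionError: empty class list")
```

A friendlier fix would be to raise a ValueError naming the unmatched class before this assertion is ever reached.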
Hi,
I have noticed that in utils.py, line 32, you truncate the normal noise to the range [-2, 2] with this line of code:
values = truncnorm.rvs(-2, 2, size=(batch_size, dim_z), random_state=state).astype(np.float32)
Could you please let me know whether the pre-trained model is also trained using this truncated noise? If not, could you please let me know the characteristics of the input noise vectors during training your model? Thanks!
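For reference, the sampling quoted above can be wrapped as a small utility; this is a sketch of how the repo's noise sampling appears to work based on that line (the function name, the truncation-scaling step, and the seed handling here are assumptions, not necessarily the repo's exact API):

```python
import numpy as np
from scipy.stats import truncnorm

def truncated_noise_sample(batch_size=1, dim_z=128, truncation=1.0, seed=None):
    """Sample z from a standard normal truncated to [-2, 2], then scale
    by the truncation factor (sketch based on the line quoted above)."""
    state = None if seed is None else np.random.RandomState(seed)
    values = truncnorm.rvs(-2, 2, size=(batch_size, dim_z),
                           random_state=state).astype(np.float32)
    return truncation * values

noise = truncated_noise_sample(batch_size=2, dim_z=128, truncation=0.4, seed=0)
print(noise.shape)  # (2, 128)
```

With this scheme every component of the scaled sample lies in [-2 * truncation, 2 * truncation], which is the "truncation trick" from the BigGAN paper: the model is trained on untruncated N(0, 1) noise, and truncation is applied only at sampling time to trade diversity for fidelity.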
Hi, thank you for your fantastic work!
I have a question about this repository.
The LICENSE file says this repository's license is MIT, but PyPI says it is Apache.
ref: https://pypi.org/project/pytorch-pretrained-biggan/
Hi, I want to use the discriminator to measure the quality of generated images. Could you provide the network structure and the pre-trained weights of the discriminator?
Thank you very much!
First, I want to thank you for doing this! There are a lot of little pieces to get right and the results so far are pretty amazing.
That said, the images are quite different from the TensorFlow implementation's. I'm interested in using this with points found with Ganbreeder, which means I want to match the TF version as closely as possible. I think it's all using cuDNN underneath, right? If so, I imagine we could get the two more closely aligned.
My main question is how to modify the TensorFlow graph to output the intermediate calculations, so we can compare activations and figure out which layer(s) cause the biggest error. I would be happy to help with this.
Examples (side-by-side TF vs. PyTorch images; only the Ganbreeder link for each pair is recoverable here):
https://ganbreeder.app/info?k=ff84584479eb1cc90e4af4a4
https://ganbreeder.app/info?k=e1a780eba7bd551dfeb43789
https://ganbreeder.app/info?k=42999d4b4849f0c0852e6ecd
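On the PyTorch side, the intermediate activations needed for such a comparison can be captured with forward hooks; here is a minimal sketch (the tiny Sequential model is a placeholder, not BigGAN; with the real repo you would register the hooks on the submodules of BigGAN.from_pretrained(...) instead):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the BigGAN generator.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        # Detach so the stored tensors don't keep the graph alive.
        activations[name] = output.detach()
    return hook

# Register one hook per named submodule whose output we want to compare.
for name, module in model.named_modules():
    if name:  # skip the unnamed root container
        module.register_forward_hook(save_activation(name))

x = torch.randn(1, 4)
model(x)

for name, act in activations.items():
    print(name, tuple(act.shape))
```

Each captured tensor can then be dumped (e.g. to .npy) and diffed layer by layer against the corresponding TF tensor fetched with sess.run on the matching intermediate ops, which should localize the layer(s) contributing the largest discrepancy.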
Will the repo add code to train a BigGAN from scratch? Thank you.
Would it be possible to finetune this BigGAN implementation to a custom dataset, in order to generate new classes of images?
model = BigGAN.from_pretrained('biggan-deep-256')
Can we just replace 'biggan-deep-256' with the path to the downloaded pre-trained model?
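Huggingface-style from_pretrained methods typically accept either a shortcut name or a local path; a sketch of that resolution pattern (the archive map keys are the repo's known model names, but the resolution function and its behavior here are assumptions about the API pattern, not the repo's exact code):

```python
import os

# Known shortcut names (download URLs elided).
PRETRAINED_MODEL_ARCHIVE_MAP = {
    'biggan-deep-128': '...',
    'biggan-deep-256': '...',
    'biggan-deep-512': '...',
}

def resolve_model_source(name_or_path):
    """Return ('download', url) for a known shortcut name, or
    ('local', path) for an existing local file or directory."""
    if name_or_path in PRETRAINED_MODEL_ARCHIVE_MAP:
        return ('download', PRETRAINED_MODEL_ARCHIVE_MAP[name_or_path])
    if os.path.isdir(name_or_path) or os.path.isfile(name_or_path):
        return ('local', name_or_path)
    raise ValueError("Unknown model name or missing path: %s" % name_or_path)

print(resolve_model_source('biggan-deep-256')[0])  # download
```

If the repo follows this pattern, passing a local directory containing the config and weight files downloaded beforehand should work in place of the shortcut name.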
When I try to load the model I get the following error. Am I missing any dependency?
File "visualize.py", line 97, in
model = BigGAN.from_pretrained(model_name)
File "/home/ricardo/.local/lib/python2.7/site-packages/pytorch_pretrained_biggan/model.py", line 274, in from_pretrained
config = BigGANConfig.from_json_file(resolved_config_file)
File "/home/ricardo/.local/lib/python2.7/site-packages/pytorch_pretrained_biggan/config.py", line 56, in from_json_file
with open(json_file, "r", encoding='utf-8') as reader:
TypeError: 'encoding' is an invalid keyword argument for this function
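The paths in the traceback show this is running under Python 2.7, whose builtin open() has no encoding keyword (that argument exists only in Python 3). A compatibility sketch using io.open, which accepts encoding on both Python 2 and 3 (the helper name here is hypothetical, illustrating the pattern for the call in config.py):

```python
import io
import json

def load_json_config(json_file):
    """Read a UTF-8 JSON config file in a way that works on
    both Python 2.7 and Python 3 (io.open supports `encoding`)."""
    with io.open(json_file, "r", encoding="utf-8") as reader:
        return json.loads(reader.read())
```

Upgrading to Python 3 is the simpler fix, since the package targets Python 3.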
Where is the pre-trained model of the discriminator?
The download speed from amazonaws is unbearable in China, even for a 200 MB file.
pytorch-pretrained-BigGAN/pytorch_pretrained_biggan/model.py
Lines 245 to 246 in 1e18aed
I'm trying to understand the model by reading the code. I noticed that conv_to_rgb
actually has 128 output channels, but only the first three are used for the final RGB image. Why do you do this? What are the other 125 channels for?
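The behavior being asked about can be illustrated in isolation; a sketch with assumed shapes (the tensor here is a stand-in for the conv_to_rgb output, not produced by the actual model):

```python
import torch

# Stand-in for the conv_to_rgb output: batch of 1, 128 channels,
# 256x256 spatial resolution (shapes are assumptions for illustration).
z = torch.randn(1, 128, 256, 256)

# Only the first three channels are kept as the R, G, B planes;
# the remaining 125 channels are discarded.
rgb = z[:, :3]
print(tuple(rgb.shape))  # (1, 3, 256, 256)
```

A plausible reason for the wide final conv is that keeping the channel count a multiple of the architecture's channel width simplifies the layer definitions, with the slice to three channels applied afterwards, but that is speculation the maintainers would need to confirm.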