clvrai / acgan-pytorch

License: MIT License

Python 100.00%

acgan-pytorch's Introduction

Conditional Image Synthesis With Auxiliary Classifier GANs

As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometimes delay) research in the AI community by promoting open-source projects. To this end, we implement state-of-the-art research papers and publicly share them with concise reports. Please visit our group GitHub site for other projects.

This project is implemented by Te-Lin Wu, and the code was reviewed by Shao-Hua Sun before being published.

Descriptions

This project is a PyTorch implementation of Conditional Image Synthesis With Auxiliary Classifier GANs, which was published as a conference proceeding at ICML 2017. The paper proposes a simple extension of GANs that adds label conditioning in order to produce high-resolution, high-quality generated images.

By adding an auxiliary classifier to the discriminator of a GAN, the discriminator produces not only a probability distribution over sources (real or fake) but also a probability distribution over the class labels. This is a small modification to the standard DCGAN model, but it produces better results and can stabilize the adversarial training.
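
For reference, the two terms of the objective described in the ACGAN paper can be written as below, where S is the source (real or fake) and C is the class label. The discriminator is trained to maximize L_S + L_C, and the generator is trained to maximize L_C - L_S. This restates the paper's formulation only; the code in this repository may differ in its details.

```latex
\begin{aligned}
L_S &= \mathbb{E}\left[\log P(S = \text{real} \mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(S = \text{fake} \mid X_{\text{fake}})\right] \\
L_C &= \mathbb{E}\left[\log P(C = c \mid X_{\text{real}})\right]
     + \mathbb{E}\left[\log P(C = c \mid X_{\text{fake}})\right]
\end{aligned}
```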

The figure below compares the architectures of several GANs.

Sample generated images from the ImageNet dataset.

Sample generated images from the CIFAR-10 dataset.

The implemented model can be trained on both CIFAR-10 and ImageNet datasets.

Note that this implementation may differ from the original paper in details such as model architecture, hyperparameters, and the optimizer, while maintaining the main proposed idea.

*This code is still being developed and subject to change.

Prerequisites

Python 2.7
PyTorch
SciPy
NumPy
PIL
imageio

Usage

Run the following command for details of each argument.

$ python main.py -h

Specify the path to the dataset you are using with the --dataroot argument. The code automatically checks whether the CIFAR-10 dataset has already been downloaded and downloads it if not. For ImageNet training, download the whole dataset from the ImageNet website; this repository used the 2012 version. Point --dataroot to the train (or val) directory, which serves as the root directory for ImageNet training.

In line 80 of main.py, you can change the classes_idx argument to select other user-specified ImageNet classes; adjust num_classes accordingly if the number of selected classes is not 10.

if opt.dataset == 'imagenet':
    # folder dataset
    dataset = ImageFolder(root=opt.dataroot,
                          transform=transforms.Compose([
                              transforms.Scale(opt.imageSize),
                              transforms.CenterCrop(opt.imageSize),
                              transforms.ToTensor(),
                              transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                          ]),
                          classes_idx=(10,20))
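
To make the relationship between classes_idx and num_classes concrete, here is a minimal sketch of how such a class filter could work. It assumes that classes_idx is a (start, end) range over the alphabetically sorted class folders; the helper name select_classes is hypothetical and is not the repository's ImageFolder code.

```python
import os
import tempfile


def select_classes(root, classes_idx=None):
    """Map class folder names under `root` to integer labels.

    Hypothetical helper: assumes classes_idx is a (start, end) slice over
    the alphabetically sorted class folder names.
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    if classes_idx is not None:
        start, end = classes_idx
        classes = classes[start:end]
    # Labels are re-numbered from 0 so they fit a classifier with len(classes) outputs.
    return {name: label for label, name in enumerate(classes)}


# Demo on a throwaway directory tree holding 30 fake class folders.
with tempfile.TemporaryDirectory() as root:
    for i in range(30):
        os.mkdir(os.path.join(root, "class_%02d" % i))
    class_to_idx = select_classes(root, classes_idx=(10, 20))

# The number of selected classes is the value to pass as --num_classes.
print(len(class_to_idx))
```

Under this assumption, a range such as (10, 20) selects ten classes, which is why num_classes must change whenever the width of the range does.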

Train the models

An example training command is shown below. During training, the code automatically generates test images and writes them to the --outf directory.

$ python main.py --outf=/your/output/file/name --niter=500 --batchSize=100 --cuda --dataset=cifar10 --imageSize=32 --dataroot=/data/path/to/cifar10 --gpu=0

Author

Te-Lin Wu / @telin0411 @ Joseph Lim's research lab @ USC

acgan-pytorch's Issues

Label dimensionality always fails

From the following line:

aux_label.data.resize_(batch_size).copy_(label)

I am getting the following error:

RuntimeError: expand(torch.FloatTensor{[1, 1]}, size=[1]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

This happens with a custom dataset. The dataset uses TensorDataset from PyTorch and supplies a 1D tensor, but for some reason the label arrives with shape [1, 1] and the copy fails.

The way I create the dataset:

...
tensor_y = torch.stack([torch.Tensor(i) for i in y]) # i is a 1D vector
return data.TensorDataset(tensor_x, tensor_y)
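
One plausible cause, offered as a guess rather than a confirmed diagnosis: if each entry of y is a one-element vector, torch.stack produces labels of shape [N, 1], so a batch of size 1 has shape [1, 1] and cannot be copied into a buffer of shape [1]. The sketch below uses made-up toy data and flattens the labels before building the dataset.

```python
import torch
from torch.utils import data

# Toy stand-ins for the custom data: three images, each with a one-element label vector.
y = [[3.0], [1.0], [7.0]]
tensor_x = torch.randn(len(y), 3, 32, 32)

# Stacking one-element vectors keeps the trailing dimension: shape [N, 1].
stacked = torch.stack([torch.Tensor(i) for i in y])

# Flatten to shape [N] and cast to integer class indices.
tensor_y = stacked.view(-1).long()

dataset = data.TensorDataset(tensor_x, tensor_y)
loader = data.DataLoader(dataset, batch_size=1)

# Each batch of labels is now 1D, with shape [batch_size].
_, label = next(iter(loader))
print(stacked.shape, tensor_y.shape, label.shape)
```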

Softmax or logsoftmax for auxiliary prediction

The current code uses softmax for auxiliary prediction and during training uses NLL Loss. But according to here it seems that the right last layer to go with NLL is LogSoftmax instead?
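
The concern can be illustrated without PyTorch. NLL loss only negates the score at the target index, and PyTorch's NLLLoss expects those scores to be log-probabilities. The sketch below is plain Python with hand-written helpers (not the repository's code): feeding softmax outputs gives -p, which is bounded between -1 and 0, while feeding log-softmax outputs gives the usual cross-entropy -log p.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits):
    m = max(logits)
    log_total = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_total for v in logits]


def nll(scores, target):
    # Negative log-likelihood as a plain lookup: negate the target entry.
    return -scores[target]


logits = [2.0, 0.5, -1.0]
target = 0

# Softmax + NLL yields -p, which lies in [-1, 0] and is not a log-likelihood.
loss_with_softmax = nll(softmax(logits), target)

# LogSoftmax + NLL yields -log p, the standard cross-entropy.
loss_with_log_softmax = nll(log_softmax(logits), target)

print(loss_with_softmax, loss_with_log_softmax)
```

Both variants push the target probability up, but their gradients are scaled differently, so the choice can still affect training.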

accuracy

Regarding accuracy, what is the best score on CIFAR-10?

ImageNet training parameters

Would it be possible to provide a run script for training on ImageNet, similar to the one you provided for CIFAR:

python main.py --outf=/your/output/file/name --niter=500 --batchSize=100 --cuda --dataset=cifar10 --imageSize=32 --dataroot=/data/path/to/cifar10 --gpu=0

I assume this should be okay, but I didn't want to guess any of the params (lr, niter, ngf, ndf etc.) either.

python main.py --outf=/your/output/file/name --niter=500 --batchSize=100 --cuda --dataset=imagenet --imageSize=128 --dataroot=/data/path/to/imagenet --num_classes=1000 --nz=1100 --gpu=0

What worked best in your experiments?

Error while running on CIFAR-10

UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([16])) is deprecated. Please ensure they have the same size.
  "Please ensure they have the same size.".format(target.size(), input.size()))
Traceback (most recent call last):
  File "main.py", line 188, in <module>
    dis_errD_real = dis_criterion(dis_output, dis_label)
  File "/home/.../envs/python36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/.../envs/python36/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 372, in forward
    size_average=self.size_average)
  File "/home/.../envs/python36/lib/python3.6/site-packages/torch/nn/functional.py", line 1171, in binary_cross_entropy
    "!= input nelement ({})".format(target.nelement(), input.nelement()))
ValueError: Target and input must have the same number of elements. target nelement (1) != input nelement (16)
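
The message says the target holds 1 element while the discriminator output holds 16. A likely cause, stated here as a guess, is that the label buffer was not resized to the batch size under the PyTorch version in use (the traceback shows Python 3.6, while the README lists Python 2.7). A minimal sketch of one workaround is to build the target directly from the output's shape; dis_output below is a random stand-in, not the real network output.

```python
import torch
import torch.nn as nn

dis_criterion = nn.BCELoss()
real_label = 1.0

# Stand-in for the discriminator's source output on a batch of 16 images.
dis_output = torch.rand(16)

# Create the target with exactly the same shape as the output,
# instead of resizing a pre-allocated label buffer in place.
dis_label = torch.full_like(dis_output, real_label)

dis_errD_real = dis_criterion(dis_output, dis_label)
print(dis_label.shape, dis_errD_real.item())
```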

If I want to change the noise input to images, what should I do to change the network?

comment typo network.py line 36

should read # Transposed Convolution 6 not # Transposed Convolution 5

Multi label classification

I have a custom dataset with multiple labels per image.
This means my "label" is a vector with 1 or 0 in multiple locations.

How can I augment ACGAN to classify multiple labels?
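
One common way to handle multi-label targets, sketched here under assumptions and not taken from this repository, is to give the auxiliary head one logit per label and train it with a per-label binary cross-entropy instead of softmax with NLL. The feature size, layer name, and toy data below are hypothetical.

```python
import torch
import torch.nn as nn

num_labels = 5
batch_size = 4
feature_dim = 128  # hypothetical size of the discriminator's feature vector

# Hypothetical auxiliary head: one independent logit per label.
aux_head = nn.Linear(feature_dim, num_labels)

# BCEWithLogitsLoss applies a sigmoid per label, so several labels can be active at once.
aux_criterion = nn.BCEWithLogitsLoss()

features = torch.randn(batch_size, feature_dim)  # stand-in for discriminator features
multi_hot = torch.randint(0, 2, (batch_size, num_labels)).float()  # 0/1 label vectors

aux_logits = aux_head(features)
aux_loss = aux_criterion(aux_logits, multi_hot)

# At prediction time, threshold each label's probability independently.
predicted = (torch.sigmoid(aux_logits) > 0.5).float()
print(aux_loss.item(), predicted.shape)
```

The generator would then need to be conditioned on the same multi-hot vector rather than a one-hot class vector.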

generator -label

Excuse me, I don't see any input labels in the generator. Did anyone notice that?

Running on a custom dataset

Can you please provide guidance on how to train this model with a different data set that is not in the format of cifar or mnist?

Thank you,

Unable to get inception score for cifar-10 when compared to the original paper

First of all, I would like to thank you for sharing your implementation on GitHub. After running my experiment on CIFAR-10 and computing the inception score, I found that the computed score differs from the score reported in the original paper.

I get an inception score of 5 for 100 iterations, 5.03 for 200 iterations, and 5.24 for 500 iterations. My inception score is calculated using the following TensorFlow script.
https://github.com/dashayushman/TAC-GAN/blob/master/inception_score.py
The reported number for ACGAN paper is 8.25.

I believe the inception score calculation is correct, because I get the expected score (11.24) on all 50,000 training images (as reported in Improved Techniques for Training GANs, https://arxiv.org/pdf/1606.03498.pdf).

Can you shed some light on why the results differ so much (about 5 vs. about 8)?
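
For readers comparing numbers, the inception score is the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal p(y). The sketch below computes it in plain Python from a list of probability vectors. It is not the linked TensorFlow script, and the toy inputs are made up. Reported scores also depend on how many images are scored and on the classifier used, so these should match before comparing.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities."""
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for p in probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        total_kl += sum(
            pk * math.log(pk / marginal[k]) for k, pk in enumerate(p) if pk > 0
        )
    return math.exp(total_kl / n)


# Every image gets the same uniform prediction: the lowest possible score.
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 6

# Confident predictions spread evenly over 3 classes: the highest score for 3 classes.
confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]] * 2

print(inception_score(uniform), inception_score(confident))
```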

Aux_Label when training GENERATOR

When training GENERATOR
Line 224

aux_errG = aux_criterion(aux_output, aux_label)

aux_label has random values from lines 204 and 197.

Shouldn't aux_label hold ground-truth values, since we are training the generator?
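
One way to read this: fake images have no dataset label, so the class the generator was asked to produce serves as the ground truth for the auxiliary loss. Below is a minimal sketch of that idea, assuming the class is injected by overwriting the first num_classes entries of the noise vector with a one-hot code. The sizes and all variable names other than aux_label are illustrative and not taken from main.py.

```python
import torch

batch_size, nz, num_classes = 8, 110, 10  # illustrative sizes (assumption)

# Sample the class that each fake image is supposed to belong to.
label = torch.randint(0, num_classes, (batch_size,))

# Start from Gaussian noise, then overwrite the first num_classes entries
# of each noise vector with the one-hot code of the sampled class.
noise = torch.randn(batch_size, nz)
one_hot = torch.zeros(batch_size, num_classes)
one_hot[torch.arange(batch_size), label] = 1.0
noise[:, :num_classes] = one_hot

# DCGAN-style generators usually take input of shape [batch, nz, 1, 1] (assumption).
generator_input = noise.view(batch_size, nz, 1, 1)

# The sampled classes are the targets for the auxiliary classifier on the fake images.
aux_label = label.clone()
print(generator_input.shape, aux_label)
```

Under this reading the generator does receive the label, carried inside the noise vector rather than as a separate input.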

error in the loss

Thanks for publishing the code in PyTorch! I have a few questions, however.
[1] For the loss associated with the auxiliary classifier fc, you are using NLL loss, but the last layer is a Softmax layer. Shouldn't it be LogSoftmax instead of Softmax?

[2] I am wondering why the noise in line 201 is generated using the class_one_hot vector representation. Can't we simply use the noise as generated in line 196? Did you find any improvement with that specific noise generation?

Also, instead of randomly generating the label as in line 197, can't we use the label sampled from the data loader, i.e., line 177?

[3] Also, based on the figure on the main page (the last figure to the right), class information, i.e., C_class, is given both to the latent variable z and before the discriminator D (on X_real and X_fake) in the training stage. However, this seems to be missing from the code. Can you please clarify why?

Please refer to this
https://github.com/znxlwm/pytorch-generative-model-collections/blob/master/ACGAN.py

Thank you in advance for the wonderful code.