HyperGAN 0.9

A composable GAN API and CLI. Built for developers, researchers, and artists.

HyperGAN is currently in open beta.

Logos generated with examples/colorizer, AlphaGAN, and the RandomWalk sampler

About
Showcase
Documentation
Changelog
Quick start
The pip package hypergan
Training
Sampling
API
- Examples
- Search
Configuration
- Usage
- Architecture
- GANComponent
- Generator
- Encoders
- Discriminators
- Losses
- WGAN
- LS-GAN
- Standard GAN and Improved GAN
- Categories
- Supervised
- Trainers
Datasets
Contributing
Versioning
Sources
Papers
Citation

About

Generative Adversarial Networks (GANs) consist of two learning systems, a generator and a discriminator, that learn together. HyperGAN implements these systems as deep neural networks in TensorFlow.

For an introduction, see http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/

Showcase

0.9 samples are still training.

Documentation

API Documentation

Changelog

See the full changelog here: Changelog.md

Quick start

Minimum requirements

For 256x256, we recommend a GTX 1080 or better. 32x32 can be run on lower-end GPUs.
CPU training is extremely slow. Use a GPU if you can!
Python 3

Install

Install hypergan:

  pip3 install hypergan --upgrade

Optional `virtualenv`:

If you use virtualenv:

  virtualenv --system-site-packages -p python3 hypergan
  source hypergan/bin/activate

Dependencies:

If installation fails, try installing the dependencies directly:

  pip3 install numpy tensorflow-gpu hyperchamber pillow pygame

Dependency help

If the above step fails, see the dependency documentation:

tensorflow - https://www.tensorflow.org/install/
pygame - http://www.pygame.org/wiki/GettingStarted

Create a new project

  hypergan new mymodel

This will create a mymodel.json based on the default configuration. You can change configuration templates with the -c flag.

List configuration templates

  hypergan new mymodel -l

See all configuration templates with --list-templates or -l.

Train

  # Train a 32x32 gan with batch size 32 on a folder of folders of pngs, resizing images as necessary
  hypergan train folder/ -s 32x32x3 -f png -c mymodel --resize

Increasing performance

On Ubuntu, run sudo apt-get install libgoogle-perftools4, then set this environment variable when you start training:

  LD_PRELOAD="/usr/lib/libtcmalloc.so.4" hypergan train my_dataset

HyperGAN does not cache image data in memory. Images are loaded every time they're needed, so you can increase performance by pre-processing your inputs, especially by resampling large inputs to the output resolution. For example, with ImageMagick:

  convert image1.jpg -resize '128x128^' -gravity Center -crop 128x128+0+0 image1.png
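The arithmetic behind that command can be sketched in Python. This is only an illustration of the resize-to-fill and center-crop steps, not part of HyperGAN; the function name and the rounding rule are assumptions, and ImageMagick may differ by a pixel.

```python
def fill_and_center_crop(width, height, target):
    """Return (resized_size, crop_box) for a square target x target output.

    Scale the image so its shorter side equals `target`, then take the
    centered target x target square.
    """
    scale = target / min(width, height)
    # Rounding is an assumption of this sketch.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    # Box order is (left, upper, right, lower), as Pillow's Image.crop accepts.
    return (new_w, new_h), (left, top, left + target, top + target)


resized, box = fill_and_center_crop(640, 480, 128)
print(resized, box)
```

A 640x480 input is scaled to 171x128 and then cropped to the middle 128 columns.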

Development mode

If you wish to modify hypergan:

  git clone https://github.com/255BITS/hypergan
  cd hypergan
  python3 setup.py develop

Running on CPU

Set the CUDA_VISIBLE_DEVICES environment variable to an empty value and pass the --device argument:

  CUDA_VISIBLE_DEVICES= hypergan --device '/cpu:0'

Don't train on CPU! It's too slow.

The pip package hypergan

 hypergan -h

Training

  # Train a 32x32 gan with batch size 32 on a folder of pngs
  hypergan train [folder] -s 32x32x3 -f png -b 32 --config [name]

Sampling

  # Train a 32x32 gan with batch size 32 on a folder of pngs, writing samples with the static_batch sampler
  hypergan train [folder] -s 32x32x3 -f png -b 32 --config [name] --sampler static_batch --sample_every 5

One way a network learns:

To create videos:

  ffmpeg -i samples/%06d.png -vcodec libx264 -crf 22 -threads 0 gan.mp4

Arguments

To see a detailed list, run

  hypergan -h

API

See the API documentation at https://s3.amazonaws.com/hypergan-apidocs/0.9.0/index.html

  import hypergan as hg

Examples

See the example documentation https://github.com/255BITS/HyperGAN/tree/master/examples

Search

Each example is capable of random search. You can search along any set of parameters, including loss functions, trainers, generators, etc.
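As a rough illustration of what random search over a configuration means, the sketch below draws one value per parameter from a set of options. The search space, its keys, and the function name are hypothetical and are not taken from the HyperGAN examples.

```python
import random

# Hypothetical search space: keys and values are illustrative only.
search_space = {
    "loss": ["wgan", "lsgan", "standard"],
    "g_learn_rate": [1e-3, 1e-4, 1e-5],
    "d_learn_rate": [1e-3, 1e-4, 1e-5],
}


def sample_config(space, rng):
    """Pick one value per parameter, uniformly at random."""
    return {name: rng.choice(options) for name, options in space.items()}


rng = random.Random(0)  # seeded so a run can be repeated
candidates = [sample_config(search_space, rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Each candidate would then be trained and compared; how results are scored is left out of this sketch.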

Datasets

To build a new network you need a dataset. Your data should be structured like:

  [folder]/[directory]/*.png

Creating a Dataset

Datasets in HyperGAN are meant to be simple to create. Just use a folder of images.

Unsupervised learning

The default mode of hypergan.

 [folder]/*.png

For jpg images, pass -f jpg.

Supervised learning

Training with labels allows you to train a classifier.

Each directory in your dataset represents a classification.

Example: Dataset setup for classification of apple and orange images:

 /dataset/apples
 /dataset/oranges

You must pass --classloss to the hypergan CLI to activate this feature.
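The directory-per-class layout can be read with a few lines of Python. This is a minimal sketch of the idea, not HyperGAN's own loader; the function name is hypothetical, and the temporary dataset exists only to make the example self-contained.

```python
import os
import tempfile


def list_labeled_images(root, extension="png"):
    """Return (path, label) pairs, using each subdirectory name as the label."""
    pairs = []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue  # files directly under root carry no label
        for name in sorted(os.listdir(class_dir)):
            if name.lower().endswith("." + extension):
                pairs.append((os.path.join(class_dir, name), label))
    return pairs


# Build a tiny throwaway dataset to show the layout.
with tempfile.TemporaryDirectory() as root:
    for label in ("apples", "oranges"):
        os.makedirs(os.path.join(root, label))
        open(os.path.join(root, label, "0.png"), "wb").close()
    labels = [label for _, label in list_labeled_images(root)]
    print(labels)
```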

Configuration

Configuration in HyperGAN uses JSON files. You can create a new config with the default template by running hypergan new mymodel.

You can see all templates with hypergan new mymodel -l.

Architecture

A hypergan configuration contains all hyperparameters for reproducing the full GAN.

In the original DCGAN you will have one of each of the following components:

Encoder
Generator
Discriminator
Loss
Trainer

Other architectures may differ. See the configuration templates.
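To make the component list concrete, here is a sketch that parses a JSON configuration and reports which of the five components are missing. The top-level key names and the sample values are assumptions made for illustration; the attribute names inside each component are borrowed from the tables in this document, and real templates may be laid out differently.

```python
import json

# Hypothetical configuration text; real templates may use different keys.
config_text = """
{
  "encoder": {"z": 100, "min": -1, "max": 1},
  "generator": {"final_depth": 16, "depth_increase": 16},
  "discriminator": {"initial_depth": 16, "depth_increase": 16, "layers": 4},
  "loss": {"reverse": false},
  "trainer": {"g_learn_rate": 0.0001, "d_learn_rate": 0.0001}
}
"""

REQUIRED = ("encoder", "generator", "discriminator", "loss", "trainer")


def missing_components(config):
    """List the DCGAN-style components that the config does not define."""
    return [name for name in REQUIRED if name not in config]


config = json.loads(config_text)
print(missing_components(config))
```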

GANComponent

A base class for each of the component types listed below.

Generator

A generator is responsible for projecting an encoding (sometimes called z space) to an output (normally an image). A single GAN object from HyperGAN has one generator.

Resize Conv

This generator supports any resolution. It uses a combination of final_depth and depth_increase to scale the output size.

For example, with final_depth=16 and depth_increase=16 on 64x64x3 images, the layer shapes are:

  64x64x3 -> 32x32x16 -> 16x16x32 -> 8x8x48 -> 4x4x64

The same network on 128x128x3:

  128x128x3 -> 64x64x16 -> 32x32x32 -> 16x16x48 -> 8x8x64 -> 4x4x80
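The two shape sequences above follow a simple pattern that can be reproduced in a few lines of Python. This sketch assumes that each step halves the width and height, that the depth grows by depth_increase per step, and that the chain stops at 4x4; the stopping size is inferred from the two examples rather than stated by HyperGAN, and the function names are hypothetical.

```python
def layer_shapes(width, height, channels, first_depth, depth_increase, smallest=4):
    """List layer shapes from the image size down to smallest x smallest."""
    shapes = [(width, height, channels)]
    depth = first_depth
    while min(width, height) > smallest:
        width //= 2
        height //= 2
        shapes.append((width, height, depth))
        depth += depth_increase
    return shapes


def describe(shapes):
    """Format shapes in the same arrow notation used in this document."""
    return " -> ".join("x".join(str(d) for d in shape) for shape in shapes)


print(describe(layer_shapes(64, 64, 3, 16, 16)))
print(describe(layer_shapes(128, 128, 3, 16, 16)))
```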

attribute	description	type
final_depth	The number of features in the last convolution layer (before projecting to the final output).	int > 0
depth_increase	Working backwards, each previous layer will contain this many more features.	int > 0
activation	Activations to use. See activations	f(net):net
final_activation	Final activation to use. This is usually set to tanh to squash the output range. See activations.	f(net):net
layer_filter	On each resize of G, we call this method. Anything returned from this method is added to the graph before the next convolution block. See common layer filters	f(net):net
layer_regularizer	This "regularizes" each layer of the generator with a type. See layer regularizers	f(name)(net):net
block	This is called at each layer of the generator, after the resize. Can also be the string `deconv`	f(...) see source code
resize_image_type	See tf.resize_images for values	enum(int)

Encoders

Sometimes referred to as the z-space representation or latent space. In DCGAN the 'encoder' is random uniform noise.

Can be thought of as input to the generator.

Uniform Encoder

attribute	description	type
z	The dimensions of random uniform noise inputs	int > 0
min	Lower bound of the random uniform noise	int
max	Upper bound of the random uniform noise	int > min
projections	See more about projections below	[f(config, gan, net):net, ...]
modes	If using modes, the number of modes to have per dimension	int > 0

Projections

This encoder takes a random uniform value and can output it in several projected forms. The primary idea is that you are able to query Z as a random uniform distribution, even if the GAN is using a spherical representation.

Some projection types are listed below.

"identity" projection

"sphere" projection

"gaussian" projection

"modal" projection

One of many

"binary" projection

On/Off
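The sketch below illustrates the general idea of sampling uniform noise and projecting it. It is not HyperGAN's implementation: the function names, the unit-length normalization used for "sphere", the zero threshold used for "binary", and the concatenation of the results are all assumptions made for illustration. The "gaussian" and "modal" projections are left out.

```python
import math
import random


def sample_uniform(z, low, high, rng):
    """Draw z values of random uniform noise in [low, high]."""
    return [rng.uniform(low, high) for _ in range(z)]


def identity(values):
    return list(values)


def sphere(values):
    # Assumed meaning: rescale the vector to unit length.
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm > 0 else list(values)


def binary(values, threshold=0.0):
    # Assumed meaning: on (1.0) above the threshold, off (0.0) otherwise.
    return [1.0 if v > threshold else 0.0 for v in values]


rng = random.Random(0)
z = sample_uniform(8, -1.0, 1.0, rng)
# Combining projections by concatenation is an assumption of this sketch.
projected = identity(z) + sphere(z) + binary(z)
print(len(projected))
```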

Category Encoder

Uses a categorical prior to choose 'one-of-many' options.

Discriminators

A discriminator (sometimes called a critic) has one main purpose: to separate generated samples (G) from real data (X), and to give the generator a useful error signal to learn from.

Note that a discriminator can sometimes also act as an encoder (as in the case of AlphaGAN).

Pyramid Discriminator

Architecturally similar to the ResizeConvGenerator.

For example, with initial_depth=16 and depth_increase=16 on 64x64x3 images, the layer shapes are:

  64x64x3 -> 32x32x16 -> 16x16x32 -> 8x8x48 -> 4x4x64

The same network on 128x128x3:

  128x128x3 -> 64x64x16 -> 32x32x32 -> 16x16x48 -> 8x8x64 -> 4x4x80

attribute	description	type
activation	Activations to use. See activations	f(net):net
initial_depth	The initial number of filters to use.	int > 0
depth_increase	Increases the filter sizes on each convolution by this amount	int > 0
final_activation	Final activation to use. None is common here, and is required for several loss functions.	f(net):net
layers	The number of convolution layers	int > 0
layer_filter	Append information to each layer of the discriminator	f(config, net):net
layer_regularizer	batch_norm_1, layer_norm_1, or None	f(batch_size, name)(net):net
fc_layer_size	The size of the linear layers at the end of this network (if any).	int > 0
fc_layers	The number of fully connected layers at the end of the discriminator (standard DCGAN uses 0).	int >= 0
noise	Instance noise. Can be added to the input X	float >= 0
progressive_enhancement	If true, enable progressive enhancement	boolean

Losses

WGAN

Wasserstein Loss is simply:

 d_loss = d_real - d_fake
 g_loss = d_fake

d_loss and g_loss can be reversed as well - just add a '-' sign.
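Written out in plain Python, the two formulas and the sign flip look like this. The function is a sketch that works on lists of discriminator outputs; its name is hypothetical, and the reverse argument simply mirrors the sign change described above.

```python
def wgan_losses(d_real, d_fake, reverse=False):
    """Per-sample Wasserstein losses as written above.

    d_real and d_fake are discriminator outputs for real and generated samples.
    """
    sign = -1.0 if reverse else 1.0
    d_loss = [sign * (r - f) for r, f in zip(d_real, d_fake)]
    g_loss = [sign * f for f in d_fake]
    return d_loss, g_loss


d_loss, g_loss = wgan_losses([0.5, 1.0], [0.25, -1.0])
print(d_loss, g_loss)
```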

Least-Squares GAN

 d_loss = (d_real-b)**2 + (d_fake-a)**2
 g_loss = (d_fake-c)**2

a, b, and c are all hyperparameters.
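A small Python sketch of the least-squares losses follows. It uses the form from the Least Squares GAN paper listed under Papers, where the two discriminator terms are summed. The function name is hypothetical, and the default values a=0, b=1, c=1 are an assumption of this sketch rather than HyperGAN's defaults.

```python
def lsgan_losses(d_real, d_fake, a=0.0, b=1.0, c=1.0):
    """Per-sample least-squares losses.

    a is the target for generated samples, b the target for real samples,
    and c the value the generator wants the discriminator to output.
    """
    d_loss = [(r - b) ** 2 + (f - a) ** 2 for r, f in zip(d_real, d_fake)]
    g_loss = [(f - c) ** 2 for f in d_fake]
    return d_loss, g_loss


d_loss, g_loss = lsgan_losses([1.0, 0.5], [0.0, 0.5])
print(d_loss, g_loss)
```

With these targets, a discriminator that outputs 1 for real and 0 for generated samples has zero loss, while the generator's loss is largest in that case.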

Standard GAN and Improved GAN

Includes support for Improved GAN. See hypergan/losses/standard_gan_loss.py for details.

Supervised loss

Supervised loss is for labeled datasets. This uses a standard softmax loss function on the outputs of the discriminator.

Categorical loss

This is currently untested.

Cramer loss

No good results yet.

Softmax loss

Not working as well as the others.

Boundary Equilibrium Loss

Use with the AutoencoderDiscriminator.

See the began configuration template.

Loss configuration

attribute	description	type
batch_norm	batch_norm_1, layer_norm_1, or None	f(batch_size, name)(net):net
create	Called during graph creation	f(config, gan, net):net
discriminator	Set to restrict this loss to a single discriminator (defaults to all)	int >= 0 or None
label_smooth	improved gan - Label smoothing.	float > 0
labels	lsgan - A triplet of values containing (a,b,c) terms.	[a,b,c] floats
reduce	Reduces the output before applying loss	f(net):net
reverse	Reverses the loss terms, if applicable	boolean

Trainers

The trainer is determined by the GAN implementation. These variables are the same across all trainers.

Configuration

attribute	description	type
g_learn_rate	Learning rate for the generator	float >= 0
g_beta1	(adam)	float >= 0
g_beta2	(adam)	float >= 0
g_epsilon	(adam)	float >= 0
g_decay	(rmsprop)	float >= 0
g_momentum	(rmsprop)	float >= 0
d_learn_rate	Learning rate for the discriminator	float >= 0
d_beta1	(adam)	float >= 0
d_beta2	(adam)	float >= 0
d_epsilon	(adam)	float >= 0
d_decay	(rmsprop)	float >= 0
d_momentum	(rmsprop)	float >= 0
clipped_gradients	If set, gradients will be clipped to this value.	float > 0 or None
d_clipped_weights	If set, the discriminator will be clipped by value.	float > 0 or None
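Clipping by value, as described for d_clipped_weights, can be illustrated in plain Python. This sketch assumes a symmetric range of [-limit, limit] and treats None as "no clipping"; the function name is hypothetical and the code is not taken from HyperGAN's trainers.

```python
def clip_by_value(values, limit):
    """Clamp each value to [-limit, limit]; leave values untouched if limit is None."""
    if limit is None:
        return list(values)
    return [max(-limit, min(limit, v)) for v in values]


weights = [-0.5, -0.005, 0.0, 0.02, 1.5]
clipped = clip_by_value(weights, 0.01)
print(clipped)
```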

Downloadable datasets

CelebA aligned faces http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
MS Coco http://mscoco.org/
ImageNet http://image-net.org/

Contributing

Contributions are welcome and appreciated! We have many open issues in the Issues tab.

See how to contribute.

Versioning

HyperGAN uses semantic versioning. http://semver.org/

TLDR: x.y.z

x is incremented on stable public releases.
y is incremented on API breaking changes. This includes configuration file changes and graph construction changes.
z is incremented on non-API breaking changes. z changes will be able to reload a saved graph.
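Read literally, these rules mean a saved graph should reload when only z differs between two versions. The helper below is a sketch of that reading, with hypothetical function names; it is not a check that HyperGAN itself performs.

```python
def parse_version(text):
    """Split an x.y.z version string into three integers."""
    x, y, z = (int(part) for part in text.split("."))
    return x, y, z


def can_reload_saved_graph(saved, current):
    """True when x and y match, so only z (non-breaking) changes separate the two."""
    return parse_version(saved)[:2] == parse_version(current)[:2]


print(can_reload_saved_graph("0.9.0", "0.9.3"))
print(can_reload_saved_graph("0.8.4", "0.9.0"))
```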

Papers

GAN - https://arxiv.org/abs/1406.2661
DCGAN - https://arxiv.org/abs/1511.06434
InfoGAN - https://arxiv.org/abs/1606.03657
Improved GAN - https://arxiv.org/abs/1606.03498
Adversarial Inference - https://arxiv.org/abs/1606.00704
Energy-based Generative Adversarial Network - https://arxiv.org/abs/1609.03126
Wasserstein GAN - https://arxiv.org/abs/1701.07875
Least Squares GAN - https://arxiv.org/pdf/1611.04076v2.pdf
Boundary Equilibrium GAN - https://arxiv.org/abs/1703.10717
Self-Normalizing Neural Networks - https://arxiv.org/abs/1706.02515
Variational Approaches for Auto-Encoding Generative Adversarial Networks - https://arxiv.org/pdf/1706.04987.pdf
CycleGAN - https://junyanz.github.io/CycleGAN/
DiscoGAN - https://arxiv.org/pdf/1703.05192.pdf
Softmax GAN - https://arxiv.org/abs/1704.06191
The Cramer Distance as a Solution to Biased Wasserstein Gradients - https://arxiv.org/abs/1705.10743
Improved Training of Wasserstein GANs - https://arxiv.org/abs/1704.00028

Sources

DCGAN - https://github.com/carpedm20/DCGAN-tensorflow
InfoGAN - https://github.com/openai/InfoGAN
Improved GAN - https://github.com/openai/improved-gan
Hyperchamber - https://github.com/255bits/hyperchamber

Citation

If you wish to cite this project, do so like this:

  255bits(Martyn, Mikkel et al),
  HyperGAN, (2017), 
  GitHub repository, 
  https://github.com/255BITS/HyperGAN
