
bvae-tf's People

Contributors

alecgraves, stoplime

bvae-tf's Issues

I do not think the capacity argument works

What was past me thinking? I do not know 😄

    if self.reg == 'bvae':
        # kl divergence:
        latent_loss = -0.5 * K.mean(1 + stddev
                                    - K.square(mean)
                                    - K.exp(stddev), axis=-1)
        # use beta to force less usage of vector space;
        # also try to use <capacity> dimensions of the space:
        latent_loss = self.beta * K.abs(latent_loss - self.capacity/self.shape.as_list()[1])
        self.add_loss(latent_loss, x)

I just randomly subtract a constant from my loss?

This is more like it:

    if self.reg == 'bvae':
        # kl divergence, kept per latent dimension (not yet averaged):
        latent_losses = -0.5 * (1 + stddev
                                - K.square(mean)
                                - K.exp(stddev))
        # use beta to force less usage of vector space;
        # also try to use <capacity> dimensions of the space
        # (shape arguments must be int tuples, so cast the float capacity):
        bvae_weight = self.beta * K.ones(shape=(int(self.shape.as_list()[1] - self.capacity),))
        if self.capacity > 0:
            vae_weight = K.ones(shape=(int(self.capacity),))
            bvae_weight = K.concatenate([vae_weight, bvae_weight], axis=-1)
        latent_loss = K.abs(K.mean(bvae_weight * latent_losses, axis=-1))

        self.add_loss(latent_loss, x)
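
A minimal NumPy sketch (my own illustration, not the repo's code) of what the weight vector above looks like, using hypothetical values for the latent size, capacity, and beta: the first `capacity` dimensions keep weight 1 (plain VAE), and the remaining dimensions are scaled by beta.

    import numpy as np

    latent_size = 8   # hypothetical values, for illustration only
    capacity = 3
    beta = 4.0

    bvae_weight = beta * np.ones(latent_size - capacity)
    if capacity > 0:
        vae_weight = np.ones(capacity)
        bvae_weight = np.concatenate([vae_weight, bvae_weight], axis=-1)

    print(bvae_weight)  # [1. 1. 1. 4. 4. 4. 4. 4.]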

better way to handle batch_size

heyo,

Really like your implementation, but I noticed the static batch size was causing me all sorts of grief when I wanted to play around with training. After a bit of mucking around I came up with a solution that I feel is a little more elegant.

Basically, the issue arises because during construction there is a call to the instantiated layer. At that point the tensor being passed in as "x" to the sampling layer's call() function has an undefined batch_size. At build time all we need to do is return a tensor with the appropriate shape; we don't actually need to call K.random_normal(), which is the only part of this function that needs the batch_size explicitly.

Long story short, stick this in your Sampling.call() function:

        # trick to allow setting batch at train/eval time
        if x[0].shape[0].value is None:
            return mean + 0*stddev

In context, that is (I made some slight other changes to the function, but you can ignore them; this is just so you can see how my fix would fit in):

    def call(self, x):
        if len(x) != 2:
            raise Exception('input layers must be a list: mean and stddev')
        if len(x[0].shape) != 2 or len(x[1].shape) != 2:
            raise Exception('input shape is not a vector [batchSize, latentSize]')

        mean = x[0]
        stddev = x[1]

        # trick to allow setting batch at train/eval time
        if x[0].shape[0].value is None:
            return mean + 0*stddev

        if self.reg:
            # kl divergence:
            latent_loss = -0.5 * K.mean(1 + stddev
                                        - K.square(mean)
                                        - K.exp(stddev), axis=-1)

            if self.reg == 'bvae':
                # use beta to force less usage of vector space;
                # also try to use <capacity> dimensions of the space:
                latent_loss = self.beta * K.abs(latent_loss - self.capacity/self.shape.as_list()[1])

            self.add_loss(latent_loss, x)

        epsilon = K.random_normal(shape=self.shape,
                                  mean=0., stddev=1.)
        if self.random:
            # 'reparameterization trick':
            return mean + K.exp(stddev / 2) * epsilon
        else:  # do not perform random sampling; simply pass the mean through
            return mean + 0*stddev  # Keras needs the *0 so the gradient is not None
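
For context, here is a tiny self-contained sketch (my own, not from the repo) of the build-time situation described above: a symbolic Keras tensor carries an undefined batch dimension until actual data arrives at train/eval time, which is exactly why call() must not require it.

    from tensorflow.keras.layers import Input, Dense

    mean = Dense(4)(Input(shape=(8,)))
    # The batch dimension is unknown at graph-construction time; depending on
    # the TF version this prints None or Dimension(None):
    print(mean.shape[0])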

Some questions about implementation

Hi,

I've been using your code in some experiments.
I have the following questions:

  1. Applying your recently committed changes to the loss actually resulted in predicted values with strange (larger) ranges in my experiments, which were harder to convert back to an image. I had to roll back to the previous version... Have you noticed such an impact?

  2. Shouldn't the last layer have a sigmoid activation so that the output has values between 0 and 1? These values should be comparable to the input ones, which I think are rescaled to be between 0 and 1; am I correct? Does this affect the reconstruction loss? (See the sketch after this list.)

  3. Also, in some other implementations the common reconstruction loss is the mean squared error, not the mean absolute error. Do you use 'mae' for some reason?

  4. This is an extra issue that I'm having. Have you been able to use the TensorBoard callback to log the losses and metrics? When I try to add the TensorBoard callback I get an error, which I think is because the ae model is made of two models and thus internally has more than one loss:

         line 1050, in _write_custom_summaries
           summary_value.simple_value = value.item()
         ValueError: can only convert an array of size 1 to a Python scalar

     I could not find a solution yet!

  5. Minor detail: why change the stddev to its absolute value? Can it ever be negative?
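
A hedged sketch (my own, not the repo's actual decoder) of the pattern questions 2 and 3 ask about: a sigmoid on the final layer keeps reconstructions in [0, 1] to match rescaled inputs, and the model is compiled with 'mse' instead of the repo's 'mae'. The shapes here are hypothetical.

    from tensorflow.keras.layers import Input, Conv2D, Activation
    from tensorflow.keras.models import Model

    inp = Input(shape=(64, 64, 16))             # hypothetical last decoder feature map
    out = Conv2D(3, (1, 1), padding='same')(inp)
    out = Activation('sigmoid')(out)            # constrain outputs to [0, 1]
    decoder_head = Model(inp, out)
    decoder_head.compile(optimizer='adam', loss='mse')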

I'm sorry for the long text and for raising all these issues, but I think they may be relevant for other users too!

Thank you in advance!

Applying to 1D data

Hi,
Thanks for providing the code!
I was wondering if you know how to apply the code to 1D data instead of an image. I have made some edits to the code, but I am getting an error (full traceback below).

Here are my edits to the code:
    class Darknet19Encoder(Architecture):
        '''
        This encoder predicts distributions, then randomly samples them.
        Regularization may be applied to the latent space output.

        A simple, fully convolutional architecture inspired by
        pjreddie's darknet architecture:
        https://github.com/pjreddie/darknet/blob/master/cfg/darknet19.cfg
        '''
        def __init__(self, inputShape=(16889,), batchSize=32,
                     latentSize=1024, latentConstraints='bvae', beta=100., capacity=0.,
                     randomSample=True):
            '''
            params
            -------
            latentConstraints : str
                Either 'bvae', 'vae', or 'no'.
                Determines whether regularization is applied
                    to the latent space representation.
            beta : float
                beta > 1, used for the 'bvae' latent regularizer
                (unused if 'bvae' not selected, default 100)
            capacity : float
                used for 'bvae' to try to break the input down to a set number
                    of basis dimensions (e.g. at 25, the network will try to use
                    25 dimensions of the latent space)
                (unused if 'bvae' not selected)
            randomSample : bool
                whether or not to use random sampling when selecting from the distribution.
                If false, the latent vector equals the mean, essentially turning this into a
                    standard autoencoder.
            '''
            self.latentConstraints = latentConstraints
            self.beta = beta
            self.latentCapacity = capacity
            self.randomSample = randomSample
            print('inputShape', inputShape, 'batchSize', batchSize, 'latentSize', latentSize)

            super().__init__(inputShape, batchSize, latentSize)

        def Build(self):
            # create the input layer for feeding the network
            inLayer = Input(shape=(16889,))
            net = Dense(1024, activation='relu', kernel_initializer='glorot_uniform')(inLayer)
            net = BatchNormalization()(net)
            net = Activation('relu')(net)

            mean = Dense(1024, name='mean')(net)
            stddev = Dense(1024, name='std')(net)

            sample = SampleLayer(self.latentConstraints, self.beta,
                                 self.latentCapacity, self.randomSample)([mean, stddev])

            return Model(inputs=inLayer, outputs=sample)

And this is the error that I'm getting:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-124-14cadf28fcb2> in <module>()
----> 1 d19e = Darknet19Encoder()
      2 d19e.model()

<ipython-input-123-3d464c6af1ad> in __init__(self, inputShape, batchSize, latentSize, latentConstraints, beta, capacity, randomSample)
     78         print('inputShape ', inputShape, 'batchSize ', batchSize,'latentSize ',  latentSize)
     79 
---> 80         super().__init__(inputShape, batchSize, latentSize)
     81 
     82     def Build(self):

<ipython-input-123-3d464c6af1ad> in __init__(self, inputShape, batchSize, latentSize)
     37         self.latentSize = latentSize
     38 
---> 39         self.model = self.Build()
     40 
     41 

<ipython-input-123-3d464c6af1ad> in Build(self)
     93 
     94         sample = SampleLayer(self.latentConstraints, self.beta,
---> 95                             self.latentCapacity, self.randomSample)([mean, stddev])
     96 
     97         return Model(inputs=inLayer, outputs=sample)

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    734 
    735       if not in_deferred_mode:
--> 736         outputs = self.call(inputs, *args, **kwargs)
    737         if outputs is None:
    738           raise ValueError('A layer\'s `call` method should return a Tensor '

<ipython-input-121-c50a024c4c69> in call(self, x)
    110 
    111         epsilon = K.random_normal(shape=self.shape,
--> 112                               mean=0., stddev=1.)
    113         if self.random:
    114             # 'reparameterization trick':

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/keras/backend.py in random_normal(shape, mean, stddev, dtype, seed)
   4512     seed = np.random.randint(10e6)
   4513   return random_ops.random_normal(
-> 4514       shape, mean=mean, stddev=stddev, dtype=dtype, seed=seed)
   4515 
   4516 

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/random_ops.py in random_normal(shape, mean, stddev, dtype, seed, name)
     70   """
     71   with ops.name_scope(name, "random_normal", [shape, mean, stddev]) as name:
---> 72     shape_tensor = _ShapeTensor(shape)
     73     mean_tensor = ops.convert_to_tensor(mean, dtype=dtype, name="mean")
     74     stddev_tensor = ops.convert_to_tensor(stddev, dtype=dtype, name="stddev")

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/random_ops.py in _ShapeTensor(shape)
     41   else:
     42     dtype = None
---> 43   return ops.convert_to_tensor(shape, dtype=dtype, name="shape")
     44 
     45 

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, preferred_dtype)
    996       name=name,
    997       preferred_dtype=preferred_dtype,
--> 998       as_ref=False)
    999 
   1000 

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx)
   1092 
   1093     if ret is None:
-> 1094       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1095 
   1096     if ret is NotImplemented:

/projects/sysbio/projects/czi/immune/anaconda2/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in _tensor_shape_tensor_conversion_function(s, dtype, name, as_ref)
    236   if not s.is_fully_defined():
    237     raise ValueError(
--> 238         "Cannot convert a partially known TensorShape to a Tensor: %s" % s)
    239   s_list = s.as_list()
    240   int64_value = 0

ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 1024)

And here are some printed results that may help:

inputShape  (16889,) batchSize  32 latentSize  1024
len(x) 2 len(x[0].shape) 2 len(x[1].shape) 2 x [<tf.Tensor 'mean_10/BiasAdd:0' shape=(?, 1024) dtype=float32>, <tf.Tensor 'std_8/BiasAdd:0' shape=(?, 1024) dtype=float32>]
mean =  Tensor("mean_10/BiasAdd:0", shape=(?, 1024), dtype=float32)
stddev =  Tensor("std_8/BiasAdd:0", shape=(?, 1024), dtype=float32)
latent_loss Tensor("sample_layer_33/mul:0", shape=(), dtype=float32)
latent_loss Tensor("sample_layer_33/mul_1:0", shape=(), dtype=float32)

why is in_train_phase not working

    return K.in_train_phase(reparameterization_trick, mean + 0*logvar, training=training)  # TODO figure out why this is not working in the specified tf version???

in_train_phase should call the reparameterization function when the Keras backend is in its training phase... It does not appear to be running at all.
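
One common workaround, sketched below under the assumption that the problem is an unset global learning phase: pass the `training` flag that Keras hands to Layer.call() straight through, so in_train_phase reduces to an explicit branch. The layer here is a hypothetical minimal stand-in, not the repo's SampleLayer.

    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import Layer

    class SamplingSketch(Layer):
        def call(self, inputs, training=None):
            mean, logvar = inputs
            def reparameterize():
                epsilon = K.random_normal(shape=K.shape(mean))
                return mean + K.exp(logvar / 2) * epsilon
            # with an explicit boolean, in_train_phase no longer depends
            # on the (possibly never set) global learning phase
            return K.in_train_phase(reparameterize, mean + 0*logvar,
                                    training=training)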

wrong implementation?

I checked various PyTorch repos; all of them have a loss term for the mean and log_var values, but yours does not. The resampler formula also looks wrong in your repo.
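
For reference, a textbook sketch (my reading of the standard VAE objective, not a statement about this repo's correctness): if the encoder's second output is interpreted as log-variance, the KL term and the sampler are usually written as below. Under that reading, the repo's `mean + K.exp(stddev / 2) * epsilon` matches the standard reparameterization; the name `stddev` is just misleading.

    from tensorflow.keras import backend as K

    def kl_loss(mean, logvar):
        # KL( N(mu, sigma^2) || N(0, 1) ) with logvar = log(sigma^2)
        return -0.5 * K.sum(1 + logvar - K.square(mean) - K.exp(logvar), axis=-1)

    def reparameterize(mean, logvar):
        # z = mu + sigma * eps, with sigma = exp(logvar / 2) and eps ~ N(0, 1)
        epsilon = K.random_normal(shape=K.shape(mean))
        return mean + K.exp(logvar / 2) * epsilon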

Unable to run ae.py due to NoneType error

I was trying to run the code, and encountered the following error. Please tell me how I can fix it.

Traceback (most recent call last):
  File "ae.py", line 65, in <module>
    test()
  File "ae.py", line 45, in test
    encoder = Darknet19Encoder(inputShape, latentSize=latentSize, latentConstraints='bvae', beta=69)
  File "/home/ies/billa/BVAE-tf/bvae/models.py", line 72, in __init__
    super().__init__(inputShape, batchSize, latentSize)
  File "/home/ies/billa/BVAE-tf/bvae/models.py", line 41, in __init__
    self.model = self.Build()
  File "/home/ies/billa/BVAE-tf/bvae/models.py", line 114, in Build
    sample = SampleLayer(self.latentConstraints, self.beta)([mean, logvar], training=self.training)
  File "/home/ies/billa/miniconda3/envs/pfprint/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/home/ies/billa/miniconda3/envs/pfprint/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 265, in wrapper
    raise e.ag_error_metadata.to_exception(e)
AttributeError: in user code:

    /home/ies/billa/BVAE-tf/bvae/sample_layer.py:70 call  *
        if mean.shape[0].value == None or  logvar.shape[0].value == None:

    AttributeError: 'NoneType' object has no attribute 'value'
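
A likely fix, inferred from the traceback rather than confirmed against the repo: in newer TF versions, TensorShape entries are plain ints or None instead of Dimension objects, so `.value` no longer exists. tf.compat.dimension_value handles both behaviours:

    import tensorflow as tf

    def batch_is_unknown(tensor):
        # True when the batch dimension is not yet defined, on both old
        # (Dimension) and new (int/None) TensorShape styles
        return tf.compat.dimension_value(tensor.shape[0]) is None

    # inside SampleLayer.call():
    #     if batch_is_unknown(mean) or batch_is_unknown(logvar):
    #         return mean + 0*logvar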

Is negative stddev a problem?

I do not know if I should take the absolute value of the stddev component of the latent space or not... I think it breaks the loss function if it is negative?

    # kl divergence:
    latent_loss = -0.5 * K.mean(1 + stddev
                                - K.square(mean)
                                - K.exp(stddev), axis=-1)
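
One reading, which is an assumption on my part but consistent with the K.exp calls above: the tensor named `stddev` actually behaves as a log-variance, in which case negative values are expected and harmless, since K.exp maps them back to a positive variance. If a true standard deviation is wanted instead, a strictly positive parameterization such as softplus sidesteps the sign question entirely:

    from tensorflow.keras import backend as K

    def positive_stddev(raw):
        # softplus(x) = log(1 + exp(x)) > 0 for every real x
        return K.softplus(raw)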
