mattjj / svae
code for Structured Variational Autoencoders
The definition of flat should use from autograd.util import flatten, but it actually refers to a function with the same name "flatten" defined in the same file. This leads to a bug when flat is used: expected_stats should be a 1-D ndarray, but it comes out as a scalar.
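If I read the report right, a minimal sketch of the intended definition (an assumption on my part, using the old autograd.util API in which flatten(value) returns a flat 1-D array together with an unflatten function) would be:

# Sketch of the reported fix, not the repo's actual code; assumes the old
# autograd.util API where flatten(value) returns (flat_1d_array, unflatten_fn).
from autograd.util import flatten

def flat(value):
    # keep only the flattened 1-D ndarray, discarding the unflatten closure,
    # so quantities like expected_stats come back as 1-D ndarrays rather than scalars
    return flatten(value)[0]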
Hey Matthew!
First of all, thanks for the awesome ideas and work.
I have a question about the way you're computing pgm_natgrad. Specifically here. I'm copying the relevant lines:
# this expression for pgm_natgrad drops a term that can be computed using
# the function autograd.misc.fixed_points.fixed_point
pgm_natgrad = -natgrad_scale / num_datapoints * \
(flat(pgm_prior) + num_batches*flat(saved.stats) - flat(pgm_params))
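For reference, this expression looks like the standard stochastic variational inference natural gradient for a conjugate global parameter (Hoffman et al., 2013), written roughly, with \eta^0 the prior natural parameter, t(x) the sufficient statistics, and N the number of datapoints, as

\widetilde{\nabla}_{\eta_\theta} \mathcal{L} \propto \eta^0 + N \, \mathbb{E}_{q(x)}[(t(x), 1)] - \eta_\theta,

which is what flat(pgm_prior) + num_batches*flat(saved.stats) - flat(pgm_params) appears to estimate from a minibatch.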
If I understand correctly, the dropped term is this:
In the paper you mention that it "is computed automatically as part of the backward pass for computing the gradients with respect to the other parameters".
Can you clarify why that term is dropped? Also, I don't understand the minus sign right at the beginning of the assignment, line 33.
Again, congrats on the awesome work!
On a side note, I think I spotted 2 errors in the paper:
In section 4.2 (and then again in the appendix), where you define \eta_x to be a partial local optimizer of the surrogate objective:
I believe this should be argmax, rather than argmin. Can you confirm?
In the second expression of proposition 4.2:
I think the gradient should be w.r.t. \theta, rather than x. Is that correct?
Hey Matty --- I'm seeing an assertion error during the resnet_decode step when running the gmm_svae_synth.py example as is:
/Users/acm/Dropbox/Proj/svae/svae/forward_models.pyc in resnet_decode(z, phi)
     81 def resnet_decode(z, phi):
     82     phi_linear, phi_mlp = phi
---> 83     return add(linear_decode(z, phi_linear), mlp_decode(z, phi_mlp, tanh_scale=2., sigmoid_output=False))
     84     # return linear_decode(z, phi_linear)
     85

/Users/acm/Dropbox/Proj/svae/svae/util.py in wrapped(a, b)
    134     if shape(a) != shape(b):
    135         print shape(a), shape(b)
--> 136     assert shape(a) == shape(b)
    137     return binop(a, b)
    138     return wrapped

AssertionError:
It looks like the param sizes passed to add have shapes
((150, 10, 2), (150, 10, 4)) and ((150, 10, 2), (150, 10, 2)),
so the second set of params coming from linear_decode has a last dimension twice as big as the second set of params coming from mlp_decode.
Is one passing back a dense covariance and the other passing back a diagonal covariance?
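A small helper for checking that guess (hypothetical; not part of the repo) is to print the nested shapes each decoder returns just before they reach add:

import numpy as np

def describe_shapes(tag, params):
    # Recursively print the shapes inside a (possibly nested) tuple/list of arrays.
    if isinstance(params, (tuple, list)):
        for i, p in enumerate(params):
            describe_shapes("{}[{}]".format(tag, i), p)
    else:
        print("{}: {}".format(tag, np.shape(params)))

# Hypothetical usage inside resnet_decode, just before the failing add call:
#   describe_shapes("linear", linear_decode(z, phi_linear))
#   describe_shapes("mlp", mlp_decode(z, phi_mlp, tanh_scale=2., sigmoid_output=False))

If the last dimensions differ by a factor that matches a full d x d covariance versus a length-d diagonal, that would confirm the dense-versus-diagonal mismatch.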
I will do it.
Hi,
It seems that there has been a great deal of rearranging of this code since publication. Which commit should I clone to reproduce the figures shown in the paper?
Brian
Hi, I tried to reproduce the results in the paper using this code, but I got the following confusing errors.
I ran the command python setup.py build_ext --inplace
on Ubuntu 14.04 with Python 2.7.6, with scipy, numpy, and gcc 4.7.3 correctly installed. Could anyone help me? Thanks!
Sorry for pasting such a long error here; the same Cython error context repeats many times, so one instance is shown below, followed by the end of the build output.
for t in range(T):
    themax = max_vector(node_params[t])
    for i in range(N):
        alpha[t,i] = in_potential[i] * exp(node_params[t,i] - themax)
    lognorm += log(normalize_inplace(alpha[t])) + themax
    dgemv('T', &N, &N, &one, &pair_params[0,0], &N, &alpha[t,0], &inc,
          &zero, &in_potential[0], &inc)
^
ext_modules=cythonize('**/*.pyx'),
cythonize_one(*args[1:])
raise CompileError(None, pyx_file)
I was trying to import svae.svae and got the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "svae/svae.py", line 5, in <module>
    from util import split_into_batches, get_num_datapoints
  File "svae/util.py", line 12, in <module>
    from autograd.container_types import TupleNode, ListNode
ImportError: cannot import name TupleNode
I checked autograd.container_types and TupleNode is not available there; did I miss something? Thanks for any help.
Hi! Thanks for sharing the code and congrats for this amazing article!
I have a particular doubt about the natural parameterisation of the NIW distribution, and I saw that in your code there is a function to re-parameterise it (standard_to_natural() in svae/distributions/niw.py). In particular, I don't exactly see where the outer product in the parameter S comes from. Do you know any reference where I can check the natural parameterisation of the NIW distribution (I couldn't find any)?
Many thanks in advance!
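For anyone with the same question, the outer product comes from writing the NIW density in exponential-family form. A minimal sketch under one common convention (my own derivation, with standard parameters (S, m, kappa, nu) and dimension d; it may not match the exact packing used by pack_dense in the repo):

import numpy as np

def niw_standard_to_natural(S, m, kappa, nu):
    # Expanding -1/2 * [ tr(S Sigma^{-1}) + kappa (mu - m)^T Sigma^{-1} (mu - m)
    #                    + (nu + d + 2) log|Sigma| ]
    # against the sufficient statistics
    #   (-1/2 Sigma^{-1}, Sigma^{-1} mu, -1/2 mu^T Sigma^{-1} mu, -1/2 log|Sigma|)
    # gives the parameters below; kappa * m m^T is the outer product in question.
    d = m.shape[0]
    A = S + kappa * np.outer(m, m)   # pairs with -1/2 Sigma^{-1}
    b = kappa * m                    # pairs with Sigma^{-1} mu
    c = kappa                        # pairs with -1/2 mu^T Sigma^{-1} mu
    e = nu + d + 2                   # pairs with -1/2 log|Sigma|
    return A, b, c, e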
Any instructions on how to install svae? A simple install similar to autograd, like
pip install svae or pip install https://github.com/mattjj/svae/archive/master.zip
would be appreciated.
Thanks
Hi, I have the correct setup to reproduce the gmm experiments and am now trying to reproduce the lds dots experiment via the script in this file:
https://github.com/mattjj/svae/blob/reviving-lds-dots/experiments/lds_svae_synth.py
But I'm having trouble getting the code to run (I've attached the last error). Any help is appreciated, thanks!
Hi Matt,
I'm not sure if this is the right place to ask questions about your paper.
Anyway, I implemented a version of a latent Gaussian mixture model (normal-gamma prior for the Gaussian). I found that the gradient from the KL loss term ( E_q[ KL(q(x)||P(x|theta)) ] ) makes the results worse: it makes the latent space look like a Gaussian, and the generator network isn't able to learn to reconstruct at all. I manage to make it work sometimes, but only ever without the KL loss.
What I am wondering is: is this the behaviour you see in your experiments?
Sub-question: I'm using a learning rate of around 0.1-0.2 for updating the global parameters with the natural gradient, and 0.001 for the neural network recogniser and generator. Do you optimise the two parts separately or together? It seems like the theory in the paper suggests the same learning rate, but I'm not sure. Sorry, I should have read through your code to answer these questions, but I'm a little clueless at reading code in general.
My code, if anyone is interested: https://github.com/Nat-D/SVAE-Torch
Thanks a lot in advance,
Nat
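Regarding the sub-question above, one hedged way to read the setup (a sketch of the general pattern, not a claim about this repo's optimizers) is that both blocks of parameters are updated jointly on each iteration but with separate step sizes: a natural-gradient step for the conjugate global parameters and an ordinary gradient (or Adam) step for the recogniser/generator weights. A toy sketch with placeholder names:

def joint_step(pgm_params, nn_params, pgm_natgrad, nn_grad,
               pgm_stepsize=0.1, nn_stepsize=1e-3):
    # natural-gradient ascent on the ELBO for the conjugate (PGM) block
    pgm_params = [p + pgm_stepsize * g for p, g in zip(pgm_params, pgm_natgrad)]
    # ordinary gradient ascent (or hand off to Adam) for the network weights
    nn_params = [w + nn_stepsize * g for w, g in zip(nn_params, nn_grad)]
    return pgm_params, nn_params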
Hi, I've been going through this code for the last week and I'm really excited about its potential. However, I'm currently stuck with the lds_svae_dots.py example. I've tried many different hyper-parameter initializations and I cannot reproduce the corresponding figure from the paper (not even with the values reported in the paper). What you see below is about the closest I can get:
For those who are trying as well: if the code is slow, go to the svae.optimizers.py file and unindent the callback so that the plotting function is called just once per batch. You might also need to comment out the line with plt.close('all') in lds_svae_dots.py.
What follows is just a short report of what I've found out. The VAE part seems to work as the input is correctly reconstructed whereas the inference is usually off (see fig above). I ran the tests and realized that some of them were failing for the newest version of pylds. I therefore installed an older version of pylds (73fceec2215347e0a0e35a5f116e69aa719b2efc) which made the tests pass on a commit from April 2016 (a89e886).
Unfortunately this does not fix the issue with reproducing the LDS result so I am wondering if you are aware of what might be the underlying reason for why the model doesn't converge to a good solution? Could there be a bug in the inference part of the code which causes this behavior?
Thank you for making the code publicly available!
Cheers,
Haffi
Please could you tell me where to download the dataset from?
I saw you fixed the from test_util import (changing it to svae.util), but somehow it got reverted on trunk?
Perhaps flesh out the setup.py a bit. Here is what I did (version requirements are just whatever I had available on my Ubuntu 16.04 box).
I did not see the other experiments/ that wuaalb was mentioning. Perhaps they are on a branch?
diff --git a/setup.py b/setup.py
index c841313..01b893c 100644
--- a/setup.py
+++ b/setup.py
@@ -3,6 +3,16 @@ import numpy as np
from Cython.Build import cythonize
setup(
+ name='svae',
+ version='0.0.0',
+ description='structure variational auto-encoder',
+ install_requires=['autograd>=1.1.7', 'numpy>=1.11.0', 'scipy>=0.17.0', 'Cython>=0.25.1'
+ , 'pyhsmm>=0.1.6', 'toolz>=0.8.1'],
+ keywords=['autoencoder', 'machine learning', 'optimization'
+ , 'neural networks', 'Python', 'Numpy', 'Scipy'],
+ url='https://github.com/mattjj/svae',
+ packages=['svae', 'svae.distributions', 'svae.hmm', 'svae.lds', 'svae.models'],
+
ext_modules=cythonize('**/*.pyx'),
include_dirs=[np.get_include(),],
)
diff --git a/tests/test_gaussian.py b/tests/test_gaussian.py
index 7b7e560..2fa0daa 100644
--- a/tests/test_gaussian.py
+++ b/tests/test_gaussian.py
@@ -5,8 +5,7 @@ from autograd import grad
from svae.distributions.gaussian import logZ, expectedstats, \
pack_dense, unpack_dense
-from test_util import rand_psd
-
+from svae.util import rand_psd
def rand_gaussian(n):
J = rand_psd(n) + n * np.eye(n)