vistan's Introduction

vistan

vistan is a simple library to run variational inference algorithms on Stan models.

vistan uses autograd and PyStan under the hood. The aim is to provide a "petting zoo" to make it easy to play around with the different variational methods discussed in the NeurIPS 2020 paper Advances in BBVI.

Features

  • Initialization: Laplace's method to initialize full-rank Gaussian
  • Gradient Estimators: Total-gradient, STL, DReG, closed-form entropy
  • Variational Families: Full-rank Gaussian, Diagonal Gaussian, RealNVP
  • Objectives: ELBO, IW-ELBO
  • IW-sampling: Posterior samples using importance weighting
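
For reference, the two objectives listed above are usually written as follows. These are the standard textbook definitions rather than anything specific to vistan; the notation (x for data, z for latents, q for the variational distribution, M for the number of importance samples) is assumed here.

```latex
\mathrm{ELBO}(q) = \mathbb{E}_{z \sim q}\left[\log \frac{p(x, z)}{q(z)}\right],
\qquad
\mathrm{IW\text{-}ELBO}_M(q) = \mathbb{E}_{z_1, \dots, z_M \sim q}\left[\log \frac{1}{M} \sum_{m=1}^{M} \frac{p(x, z_m)}{q(z_m)}\right]
```

With M = 1 the IW-ELBO reduces to the ELBO, and both are lower bounds on log p(x).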

Installation

pip install vistan

Usage

The typical usage of the package would have the following steps:

  1. Create an algorithm. This can be done in two ways:
  • The easiest is to use a pre-baked recipe, as in algo = vistan.recipe('meanfield'). The available recipes are:
    • 'advi': Run our implementation of PyStan's ADVI.
    • 'meanfield': Fully factorized Gaussian, a.k.a. mean-field VI
    • 'fullrank': Use a full-rank Gaussian to better capture dependence between latent variables
    • 'flows': Use RealNVP flow-based VI
    • 'method x': Use methods from the paper Advances in BBVI, where x is one of [0, 1, 2, 3a, 3b, 4a, 4b, 4c, 4d]
  • Alternatively, you can create a custom algorithm as algo = vistan.algorithm(). The most frequently used arguments are:
    • vi_family: This can be one of ['gaussian', 'diagonal', 'rnvp'] (Default: 'gaussian')
    • max_iters: The maximum number of optimization iterations. (Default: 100)
    • optimizer: This can be 'adam' or 'advi'. (Default: 'adam')
    • grad_estimator: What gradient estimator to use. Can be 'Total-gradient', 'STL', 'DReG', or 'closed-form-entropy'. (Default: 'DReG')
    • M_iw_train: The number of importance samples. Use 1 for standard variational inference or more for importance-weighted variational inference. (Default: 1)
    • per_iter_sample_budget: The total number of evaluations to use in each iteration. (Default: 100)
  2. Get an approximate posterior as posterior = algo(code, data). This runs the algorithm on the Stan model given by the string code, with observations given by data.
  3. Draw samples from the approximate posterior as samples = posterior.sample(100). You can also draw samples using importance weighting as posterior.sample(100, M_iw_sample=10). Further, you can evaluate the approximate posterior's log-probability as posterior.log_prob(latents).

Citing vistan

If you use vistan, please consider citing:

@inproceedings{aagrawal2020,
  author    = {Abhinav Agrawal and
               Daniel R. Sheldon and
               Justin Domke},
  title     = {Advances in Black-Box {VI:} Normalizing Flows, Importance Weighting,
               and Optimization},
  booktitle = {Advances in Neural Information Processing Systems 33: Annual Conference
               on Neural Information Processing Systems 2020, NeurIPS 2020, December
               6-12, 2020, virtual},
  year      = {2020},
}

Recipes

Recipes are sets of predetermined hyperparameters that let you quickly run some common variational algorithms.

Meanfield Gaussian

'meanfield' runs fully factorized Gaussian VI, optimized using Adam.

import vistan
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats  # plain `import scipy` may not load the stats submodule
code = """
data {
    int<lower=0> N;
    int<lower=0,upper=1> x[N];
}
parameters {
    real<lower=0,upper=1> p;
}
model {
    p ~ beta(1,1);
    x ~ bernoulli(p);
}
"""
data = {"N":5, "x":[0,1,0,0,0]}
algo = vistan.recipe() # runs Meanfield VI by default
posterior = algo(code, data) 
samples = posterior.sample(100000)

points = np.arange(0,1,.01)
plt.hist(samples['p'], 200, density = True, histtype = 'step')
plt.plot(points,scipy.stats.beta(2,5).pdf(points),label='True Posterior')
plt.legend()
plt.show()
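
The 'True Posterior' curve in the plot is Beta(2, 5) because the model is conjugate: a Beta(1, 1) prior combined with Bernoulli observations gives a Beta posterior whose parameters are the prior parameters plus the counts of ones and zeros. A minimal check of that arithmetic, independent of vistan (the variable names are ours):

```python
# Conjugate Beta-Bernoulli update for the example model (standard result).
data = {"N": 5, "x": [0, 1, 0, 0, 0]}

prior_a, prior_b = 1, 1            # p ~ beta(1, 1)
successes = sum(data["x"])         # number of ones in x
failures = data["N"] - successes   # number of zeros in x

post_a = prior_a + successes
post_b = prior_b + failures
posterior_mean = post_a / (post_a + post_b)

print(post_a, post_b)              # 2 5
print(round(posterior_mean, 4))    # 0.2857
```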

Full-rank Gaussian

'fullrank', as the name suggests, optimizes a full-rank Gaussian approximation using Adam.

algo = vistan.recipe("fullrank")  
posterior = algo(code, data)
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()
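
To illustrate what a full-rank family adds over a diagonal one, here is a small standard-library sketch that samples z = mu + L eps with a lower-triangular scale matrix L. It is not vistan's implementation; the Cholesky-style parameterization and all names below are assumptions made for the illustration. A diagonal L yields uncorrelated coordinates, while an off-diagonal entry introduces correlation.

```python
import random

def sample_gaussian(mu, L, rng):
    """Draw z = mu + L @ eps with eps ~ N(0, I); L is lower triangular."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return [mu[i] + sum(L[i][j] * eps[j] for j in range(i + 1))
            for i in range(len(mu))]

def empirical_cov01(samples):
    """Empirical covariance between the first two coordinates."""
    n = len(samples)
    m0 = sum(s[0] for s in samples) / n
    m1 = sum(s[1] for s in samples) / n
    return sum((s[0] - m0) * (s[1] - m1) for s in samples) / n

rng = random.Random(0)
mu = [0.0, 0.0]
L_diag = [[1.0, 0.0], [0.0, 1.0]]  # diagonal scale: coordinates are independent
L_full = [[1.0, 0.0], [0.8, 0.6]]  # off-diagonal entry couples z0 and z1

cov_diag = empirical_cov01([sample_gaussian(mu, L_diag, rng) for _ in range(20000)])
cov_full = empirical_cov01([sample_gaussian(mu, L_full, rng) for _ in range(20000)])
print(cov_diag, cov_full)  # first is near 0, second is near (L L^T)[1][0] = 0.8
```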

Flow-based VI

'flows' optimizes a RealNVP-inspired flow distribution as the variational approximation, using Adam.

algo = vistan.recipe("flows")  
posterior = algo(code, data)
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()
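
For readers unfamiliar with RealNVP, its core building block is the affine coupling layer: part of the input passes through unchanged and parameterizes a scale and shift of the rest, which keeps both the inverse and the log-determinant cheap. The sketch below is a generic two-dimensional illustration, not vistan's code; the functions s and t are toy stand-ins for the neural networks a real flow would use.

```python
import math

def s(x1):
    """Toy log-scale function (stands in for a neural network)."""
    return math.tanh(x1)

def t(x1):
    """Toy shift function (stands in for a neural network)."""
    return 0.5 * x1

def coupling_forward(x1, x2):
    """Affine coupling: x1 passes through, x2 is scaled and shifted."""
    y2 = x2 * math.exp(s(x1)) + t(x1)
    log_det = s(x1)  # log|det Jacobian| of the transform
    return x1, y2, log_det

def coupling_inverse(y1, y2):
    """Exact inverse; s and t never need to be inverted themselves."""
    x2 = (y2 - t(y1)) * math.exp(-s(y1))
    return y1, x2

y1, y2, log_det = coupling_forward(0.3, -1.2)
x1, x2 = coupling_inverse(y1, y2)  # recovers the original input
```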

ADVI

'advi' runs our implementation of PyStan's ADVI and uses its custom step-sequence scheme.

algo = vistan.recipe("advi")  
posterior = algo(code, data)
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()

Methods from Advances in BBVI

'method x' runs our implementation of the variational methods from Advances in BBVI, where x is one of [0, 1, 2, 3a, 3b, 4a, 4b, 4c, 4d].

# Try method 0, 1, 2, 3a, 3b, 4a, 4b, 4c, 4d
algo = vistan.recipe("method 4d")  
posterior = algo(code, data)
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()

Custom algorithms

You can also specify custom VI algorithms to work with your Stan models using vistan.algorithm. Please see the documentation of vistan.algorithm for a complete list of supported arguments.

algo = vistan.algorithm(
                M_iw_train=2,
                grad_estimator="DReG",
                vi_family="gaussian",
                per_iter_sample_budget=10,
                max_iters=100)
posterior = algo(code, data)
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()
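
To give a feel for what raising M_iw_train changes, the following standard-library sketch estimates the ELBO (M = 1) and the IW-ELBO (M = 10) by Monte Carlo on a toy problem. It does not call vistan; the target N(0, 1), the fixed proposal N(0.5, 1.5^2), and every name in the block are assumptions chosen so that the true log evidence is exactly 0 and the gap is easy to read.

```python
import math
import random

def log_normal_pdf(z, mean, std):
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((z - mean) / std) ** 2

def log_p(z):
    """Toy normalized target N(0, 1), so the true log evidence is 0."""
    return log_normal_pdf(z, 0.0, 1.0)

Q_MEAN, Q_STD = 0.5, 1.5  # a deliberately imperfect variational distribution

def iw_elbo_estimate(M, n_rep, rng):
    """Average of log((1/M) * sum_m p(z_m) / q(z_m)) over n_rep replicates."""
    total = 0.0
    for _ in range(n_rep):
        log_w = []
        for _ in range(M):
            z = rng.gauss(Q_MEAN, Q_STD)
            log_w.append(log_p(z) - log_normal_pdf(z, Q_MEAN, Q_STD))
        # log-mean-exp of the M log-weights, computed stably
        top = max(log_w)
        total += top + math.log(sum(math.exp(lw - top) for lw in log_w) / M)
    return total / n_rep

rng = random.Random(0)
elbo = iw_elbo_estimate(M=1, n_rep=20000, rng=rng)
iw_elbo = iw_elbo_estimate(M=10, n_rep=20000, rng=rng)
print(elbo, iw_elbo)  # the M = 10 bound sits closer to the true value 0
```

In this toy setting the M = 1 estimate should land near the negative KL divergence between the two Gaussians (about -0.34), while the M = 10 estimate is a tighter bound.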

IW-sampling

We support IW-sampling at inference time: it importance-weights M_iw_sample candidate samples and picks one of them (see Advances in BBVI for more information). IW-sampling is a post-hoc step and can be used with almost any variational scheme.

samples = posterior.sample(100000, M_iw_sample=10)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()
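
The idea behind IW-sampling can be sketched without vistan. In the block below, the proposal is Uniform(0, 1) and the target is the example's Beta(2, 5) posterior; choosing the returned candidate in proportion to its importance weight is the usual sampling-importance-resampling rule and is an assumption about the details here, as are all function names.

```python
import math
import random

def log_target(p):
    """Unnormalized log density of Beta(2, 5), the example's true posterior."""
    return math.log(p) + 4.0 * math.log(1.0 - p)

def iw_sample(M, rng):
    """Draw M candidates from the proposal, weight them, and return one."""
    # The proposal q is Uniform(0, 1), so log q = 0 and log w = log_target.
    candidates = [rng.uniform(1e-12, 1.0 - 1e-12) for _ in range(M)]
    log_w = [log_target(c) for c in candidates]
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
n = 20000
mean_plain = sum(iw_sample(1, rng) for _ in range(n)) / n  # M = 1: plain proposal draws
mean_iw = sum(iw_sample(10, rng) for _ in range(n)) / n    # M = 10: reweighted draws
true_mean = 2.0 / 7.0
print(mean_plain, mean_iw, true_mean)
```

With M = 1 the draws simply follow the proposal (mean near 0.5); with M = 10 the sample mean moves much closer to the true posterior mean 2/7, even though the proposal itself is unchanged.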

Initialization

We provide support to use Laplace's method to initialize the parameters for Gaussian VI.

algo = vistan.algorithm(vi_family='gaussian', LI=True)
posterior = algo(code, data) 
samples = posterior.sample(100000)

points = np.arange(0, 1, .01)
plt.hist(samples['p'], 200, density=True, histtype='step')
plt.plot(points, scipy.stats.beta(2, 5).pdf(points), label='True Posterior')
plt.legend()
plt.show()
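
As a rough picture of what Laplace's method computes, the sketch below applies it by hand to the example model. It assumes the unconstrained parameterization p = sigmoid(u) (the logit transform Stan uses for interval-bounded parameters) and uses Newton's method to find the mode; whether vistan's LI option follows exactly these steps is an assumption, and the helper names are ours.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

# Unconstrained log posterior for the example, with p = sigmoid(u):
# the Beta(2, 5) log density plus the log-Jacobian log(p * (1 - p))
# equals 2 * log(p) + 5 * log(1 - p), up to an additive constant.
def grad(u):
    """First derivative of the unconstrained log posterior."""
    return 2.0 - 7.0 * sigmoid(u)

def hess(u):
    """Second derivative of the unconstrained log posterior."""
    p = sigmoid(u)
    return -7.0 * p * (1.0 - p)

u = 0.0
for _ in range(50):            # Newton's method to find the mode
    step = grad(u) / hess(u)
    u -= step
    if abs(step) < 1e-12:
        break

laplace_mean = u               # mode of the unconstrained density
laplace_var = -1.0 / hess(u)   # inverse of the negative second derivative
print(laplace_mean, laplace_var)
```

Analytically, the mode is at u = log(2/5) and the variance is 7/10, which gives the Gaussian family a starting point already centered on the posterior mass.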

Building your own inference algorithms

We provide access to the model.log_prob function we use internally for optimization. This allows you to evaluate the log density in the unconstrained space for your Stan model. Also, this function is differentiable in autograd.

log_prob = posterior.model.log_prob

Limitations

  • Inference always runs over all latent parameters in the model; restricting it to a subset is not currently supported.
  • No support for data sub-sampling.

vistan's People

Contributors

abhiagwl, justindomke

vistan's Issues

vistan installation fails when installing PyStan -- need dev environment info

Hello,

I've had some trouble installing vistan on my laptop (macOS 10.15.7). It fails when it tries to install PyStan. To troubleshoot this, I tried installing PyStan before vistan, and this worked well. However, once I install vistan, it uninstalls PyStan and then fails when it tries to reinstall it. Do you have any recommendations on how to get around this issue? Please let me know if there is any additional information you need to resolve this.

pystan doesn't load when trying to execute example on home page

I typed in the example from the image on your home page (included below), but it fails on loading pystan.

Here's the result of the pip install:

~/github/abhiagwl/vistan (master)$ pip install vistan
...
Successfully installed aiohttp-3.8.4 aiosignal-1.3.1 appdirs-1.4.4 async-timeout-4.0.2 autograd-1.5 charset-normalizer-3.1.0 clikit-0.6.2 crashtest-0.3.1 frozenlist-1.3.3 future-0.18.3 httpstan-4.9.1 idna-3.4 joblib-1.2.0 marshmallow-3.19.0 multidict-6.0.4 pastel-0.2.1 pylev-1.4.0 pysimdjson-5.0.2 pystan-3.6.0 vistan-0.0.0.6 webargs-8.2.0 yarl-1.8.2
[notice] A new release of pip is available: 23.0 -> 23.0.1
[notice] To update, run: python3.9 -m pip install --upgrade pip

It says pystan-3.6.0 is installed, but when I try to run it through the script attached below, I get this:

~/github/abhiagwl/vistan/sandbox (master)$ python3
Python 3.9.4 (default, Apr  5 2021, 01:47:16) 
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> exec(open('bernoulli-eg.py').read())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/vistan/__init__.py", line 1, in <module>
    from .inference import algorithm, recipe
  File "/usr/local/lib/python3.9/site-packages/vistan/inference.py", line 7, in <module>
    import vistan.interface as interface
  File "/usr/local/lib/python3.9/site-packages/vistan/interface.py", line 1, in <module>
    import pystan
ModuleNotFoundError: No module named 'pystan'

Independently installing pystan also doesn't work. @justindomke suggested it may be a Python version issue.

Here's the script I transcribed from the image on the home page:

import vistan
import matplotlib.pyplot as plt
import numpy as np
import scipy
plt.style.use("ggplot")
code = """
data {
  int N;
  int x[N];
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  x ~ bernoulli(theta);
}
"""
data = {"N":5, "x":[0,1,0,0,0]}

for r in ['meanfield', 'flows']:
    algo = vistan.recipe(r)
    posterior = algo(code, data)
    sample = posterior.sample(100_000)
    plt.hist(sample['theta'], 200, density=True, histtype='step', label= r, linewidth = 1.5)

points = np.arange(0, 1, 0.01)
plt.plot(points, scipy.stats.beta(2,5).pdf(points), label='True Posterior', linewidth=1.5)
plt.legend()
plt.show()

Convert from PyStan back end to BridgeStan

It would be good for performance and for keeping up with Stan releases to replace PyStan with the BridgeStan interface to Stan. BridgeStan is also much simpler. We just released a stable 1.0 version with full doc and testing (please let us know if something's unclear). BridgeStan communicates through the low-level ctypes interface for Python, which lets us do everything in memory directly without copying. It's also good for compatibility and installation because it only requires memory compatibility, which is guaranteed by the C++ spec.

Cannot import vistan after pip installing

Importing vistan produces an error.

----> 1 import vistan

File ~/Repos/advi/env/lib/python3.9/site-packages/vistan/__init__.py:1, in <module>
----> 1 from .inference import algorithm, recipe

File ~/Repos/advi/env/lib/python3.9/site-packages/vistan/inference.py:7, in <module>
      5 import vistan.vi_families as vi_families
      6 import vistan.objectives as objectives
----> 7 import vistan.interface as interface
      8 import vistan.utilities as utils
      9 import vistan.hyperparams as hyperparams

File ~/Repos/advi/env/lib/python3.9/site-packages/vistan/interface.py:1, in <module>
----> 1 import pystan
      2 import warnings
      3 import pickle

I believe I have read that newer versions of pystan should be imported as stan. I currently have pystan==3.4.0.

Some questions:

  1. What version of python were you using when you built this library? (In particular, was it python 2 or python 3?)
  2. What version of pystan were you using when you built this library? (I don't see a requirements.txt.)
  3. What is your recommendation to python3 users for importing your library? (If it matters, I am using python 3.9.)
