thoglu / jammy_flows

A package to describe amortized (conditional) normalizing-flow PDFs defined jointly on tensor products of manifolds with coverage control. The connection between different manifolds is fixed via an autoregressive structure.

License: MIT License

Python 100.00%
normalizing-flows probability-distribution pdf manifolds

jammy_flows's Introduction

jammy_flows

This package implements (conditional) PDFs with Joint Autoregressive Manifold (MY) normalizing-flows. It grew out of work for the paper Unifying supervised learning and VAEs - coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions [arXiv:2008.05825] and includes the methodology described in that paper for calculating the coverage of PDFs on tensor products of manifolds. For Euclidean manifolds, it includes an updated version of the official implementation of Gaussianization flows [arXiv:2003.01941], in which the inverse is now differentiable (Newton iterations are added to the bisection) and more stable thanks to better approximations of the inverse Gaussian CDF. Several other state-of-the-art flows are implemented, sometimes with slight modifications or extensions.
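To illustrate why a Gaussian base distribution makes coverage simple to check, here is a minimal sketch that does not use jammy_flows at all. It assumes that true values have already been mapped to a 2-d standard-normal base space; in that case the squared radius follows a chi-square distribution with 2 degrees of freedom, whose CDF is 1 - exp(-r²/2). The function name `base_space_coverage` and the simulated points are hypothetical and serve only as an illustration.

```python
import math
import random

def base_space_coverage(base_points, nominal):
    """Fraction of 2-d base-space points inside the `nominal` probability contour."""
    inside = 0
    for z1, z2 in base_points:
        r2 = z1 * z1 + z2 * z2
        # CDF of a chi-square distribution with 2 degrees of freedom
        contained_prob = 1.0 - math.exp(-0.5 * r2)
        if contained_prob <= nominal:
            inside += 1
    return inside / len(base_points)

# Simulated stand-in for "true values mapped to base space" (an assumption):
# for a perfectly calibrated PDF these would be standard-normal distributed.
rng = random.Random(0)
points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(20000)]
observed = base_space_coverage(points, 0.68)
print(f"nominal 0.68, observed {observed:.3f}")
```

For a well-calibrated PDF the observed fraction should be close to the nominal one; a systematic deviation indicates over- or under-coverage.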

The package has a simple syntax that lets the user define a PDF and get going with a single line of code that should just work. To define a 10-d PDF, with 4 Euclidean dimensions, followed by a 2-sphere, followed again by 4 Euclidean dimensions, one could for example write

import jammy_flows

pdf=jammy_flows.pdf("e4+s2+e4", "gggg+n+gggg")

The first argument describes the manifold structure, the second argument the flow layers for a particular manifold. Here "g" and "n" stand for particular normalizing flow layers that are pre-implemented (see Features below). The Euclidean parts in this example use 4 "g" layers each.

[animation]

Have a look at the script that generates the above animation.
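The two definition strings can be read as parallel "+"-separated lists: one token per sub-manifold and one layer string per token. The following sketch shows that reading for the example above. It is not the package's own parser; `parse_pdf_spec` is a hypothetical helper, and the assumption that every manifold token is a single letter followed by a dimension is taken only from the "e4+s2+e4" example.

```python
import re

def parse_pdf_spec(manifolds, layers):
    """Pair each sub-manifold token with its flow-layer string (illustrative only)."""
    manifold_tokens = manifolds.split("+")
    layer_tokens = layers.split("+")
    if len(manifold_tokens) != len(layer_tokens):
        raise ValueError("need exactly one layer string per sub-manifold")
    parsed = []
    for m_tok, l_tok in zip(manifold_tokens, layer_tokens):
        # Assumed token shape: one letter (manifold type) plus an integer dimension
        match = re.fullmatch(r"([a-z])(\d+)", m_tok)
        if match is None:
            raise ValueError(f"cannot parse manifold token {m_tok!r}")
        kind, dim = match.group(1), int(match.group(2))
        # One character per flow layer, e.g. "gggg" -> four "g" layers
        parsed.append((kind, dim, list(l_tok)))
    return parsed

spec = parse_pdf_spec("e4+s2+e4", "gggg+n+gggg")
total_dim = sum(dim for _, dim, _ in spec)
for kind, dim, flow_layers in spec:
    print(kind, dim, flow_layers)
print("total dimension:", total_dim)
```

Read this way, the example describes three sub-manifolds whose dimensions add up to the 10-d PDF mentioned above.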

Documentation

The docs can be found here.

Also check out the example notebook.

Features

General

  • Autoregressive conditional structure is taken care of behind the scenes and connects manifolds
  • Coverage is straightforward. Everything (including spherical, interval and simplex flows) is based on a Gaussian base distribution (arXiv:2008.05825).
  • Bisection & Newton iterations for differentiable inverse (used for certain non-analytic inverse flow functions)
  • amortizable MLPs that can use low-rank approximations
  • amortizable PDFs - the total PDF can be the output of another neural network
  • unit tests that make sure backward and forward flow passes of all implemented flow-layers agree
  • include log-lambda as an additional flow parameter to define parametrized Poisson-Processes
  • easily extendible: define new Euclidean / spherical flow layers by subclassing Euclidean or spherical base classes
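The combination of bisection and Newton iterations listed above can be sketched in plain Python for a scalar, strictly increasing function without a closed-form inverse. This is a generic illustration under stated assumptions, not the package's implementation: `invert_monotonic`, the test function `f` and the iteration counts are all hypothetical. Bisection robustly narrows the bracket, and a few Newton steps then refine the result; in an autograd framework such Newton updates are ordinary differentiable operations, which is presumably what makes the inverse usable in training.

```python
import math

def invert_monotonic(f, df, y, lo, hi, n_bisect=20, n_newton=5):
    """Solve f(x) = y for a strictly increasing f with a root bracketed in [lo, hi]."""
    # Stage 1: bisection halves the bracket on every iteration
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    # Stage 2: Newton steps refine the bisection estimate using the derivative
    for _ in range(n_newton):
        x = x - (f(x) - y) / df(x)
    return x

def f(x):
    # Strictly increasing, but not invertible in closed form
    return x + math.tanh(x)

def df(x):
    # Derivative of f: 1 + sech(x)^2, always positive
    return 1.0 + 1.0 / math.cosh(x) ** 2

x_true = 0.7
x_rec = invert_monotonic(f, df, f(x_true), -10.0, 10.0)
print(x_true, x_rec)
```

Starting Newton from the bisection result keeps the refinement inside the region where it converges quickly, which is the usual reason for combining the two methods.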

Euclidean flows:

  • Generic affine flow (Multivariate normal distribution) ("t")
  • Gaussianization flow arXiv:2003.01941 ("g")
  • Hybrid of nonlinear scalings and rotations ("Polynomial Stretch flow") ("p")

Spherical flows:

S1:

S2:

Interval Flows:

Simplex Flows:

For a description of all flows and abbreviations, have a look in the docs here.

Requirements

  • pytorch (>=1.7)
  • numpy (>=1.18.5)
  • scipy (>=1.5.4)
  • matplotlib (>=3.3.3)
  • torchdiffeq (>=0.2.1)

The package has been built and tested with these versions, but might work just fine with older ones.

Installation

specific version:

pip install git+https://github.com/thoglu/jammy_flows.git@*tag* 

e.g.

pip install git+https://github.com/thoglu/jammy_flows.git@1.0.0

to install release 1.0.0.

master:

pip install git+https://github.com/thoglu/jammy_flows.git

Contributions

If you want to implement your own layer or have bug / feature suggestions, just file an issue.

jammy_flows's Issues

NaN during training

this gives NaN after a few epochs:

pdf = jammy_flows.pdf("e1+s2", "ggg+v", conditional_input_dim=4, hidden_mlp_dims_sub_pdfs="64-128-64")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [20], line 20
     17 w = data[:, 3] *data.shape[0]/ sum(data[:, 3])
     18 labels = labels.to(device)
---> 20 log_pdf, _, _ = pdf(inp, conditional_input=labels) 
     21 neg_log_loss = (-log_pdf * w).mean()
     22 neg_log_loss.backward()

File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.local/lib/python3.10/site-packages/jammy_flows/flows.py:975, in pdf.forward(self, x, conditional_input, amortization_parameters, force_embedding_coordinates, force_intrinsic_coordinates)
    968 tot_log_det = torch.zeros(x.shape[0]).type_as(x)
    970 base_pos, tot_log_det=self.all_layer_inverse(x, tot_log_det, conditional_input, amortization_parameters=amortization_parameters, force_embedding_coordinates=force_embedding_coordinates, force_intrinsic_coordinates=force_intrinsic_coordinates)
    972 log_pdf = torch.distributions.MultivariateNormal(
    973     torch.zeros_like(base_pos).to(x),
    974     covariance_matrix=torch.eye(self.total_base_dim).type_as(x).to(x),
--> 975 ).log_prob(base_pos)
    978 return log_pdf + tot_log_det, log_pdf, base_pos

File ~/.local/lib/python3.10/site-packages/torch/distributions/multivariate_normal.py:210, in MultivariateNormal.log_prob(self, value)
    208 def log_prob(self, value):
    209     if self._validate_args:
--> 210         self._validate_sample(value)
    211     diff = value - self.loc
    212     M = _batch_mahalanobis(self._unbroadcasted_scale_tril, diff)

File ~/.local/lib/python3.10/site-packages/torch/distributions/distribution.py:293, in Distribution._validate_sample(self, value)
    291 valid = support.check(value)
    292 if not valid.all():
--> 293     raise ValueError(
    294         "Expected value argument "
    295         f"({type(value).__name__} of shape {tuple(value.shape)}) "
    296         f"to be within the support ({repr(support)}) "
    297         f"of the distribution {repr(self)}, "
    298         f"but found invalid values:\n{value}"
    299     )

ValueError: Expected value argument (Tensor of shape (200, 3)) to be within the support (IndependentConstraint(Real(), 1)) of the distribution MultivariateNormal(loc: torch.Size([200, 3]), covariance_matrix: torch.Size([200, 3, 3])), but found invalid values:
tensor([[    nan,  0.1067, -2.2454],
        [    nan, -0.4479, -1.3993],
        [    nan,  1.1414, -0.2839],
        [    nan,  0.2720, -0.9769],
        [    nan,  0.4975,  0.5888],
        [    nan,  0.3729,  0.7307],
        [    nan, -0.5783, -0.6921],
        [    nan, -0.0498,  1.1616],
        [    nan,  1.1821, -1.6822],
        [    nan,  1.7657,  1.9744],
        [    nan, -1.0785,  1.1321],
....

transform & jacobian weights

Hi,

I see that the example script learns how to fit jammy_flows to characters (conditional on the character as the label).
But I am a bit confused how to do the simpler unconditional case, with Gaussianization flow. The notebook does not seem to use any samples, the visualisations just come after initialisation of the structure.

I'd also like to know how to use the pdf object. Specifically, if I have a coordinate in the "base" space (where the PDF is a unit Gaussian), how do I transform it to the target space and obtain the Jacobian weight for that location/transformation? If I understand correctly, forward() is the other direction?

Cheers,
Johannes

lambert projection does not work

pdf = jammy_flows.pdf("e1+s2", "gggg+n", conditional_input_dim=4, hidden_mlp_dims_sub_pdfs="128-128")

helper_fns.visualize_pdf(
    pdf.to("cpu"),
    fig,
    nsamples=10000,
    conditional_input=labels,
    bounds=[[-50, 150], [0, np.pi], [0, 2*np.pi]],
    s2_norm="lambert"
    );

RuntimeError                              Traceback (most recent call last)
Cell In [20], line 4
      1 data_df, labels = read_photon_table_hdf_unweighted("../assets/photon_table.hd5", 80)
      3 fig=plt.figure(figsize=(8,6))
----> 4 helper_fns.visualize_pdf(
      5     pdf.to("cpu"),
      6     fig,
      7     nsamples=10000,
      8     conditional_input=labels,
      9     bounds=[[-50, 150], [0, np.pi], [0, 2*np.pi]],
     10     s2_norm="lambert"
     11     )

File ~/.local/lib/python3.10/site-packages/jammy_flows/helper_fns.py:1012, in visualize_pdf(pdf, fig, gridspec, subgridspec, conditional_input, nsamples, total_pdf_eval_pts, bounds, true_values, plot_only_contours, contour_probs, contour_color, autoscale, seed, skip_plotting_density, hide_labels, s2_norm, colormap, s2_rotate_to_true_value, s2_show_gridlines, skip_plotting_samples, var_names)
   1005   samples, samples_base, evals, evals_base = pdf.sample(
   1006       samplesize=nsamples,
   1007       conditional_input=sample_conditional_input,
   1008       seed=seed)
   1010   higher_dim_spheres = False
-> 1012   new_subgridspec, total_pdf_integral = plot_joint_pdf(
   1013       pdf,
   1014       fig,
   1015       gridspec,
   1016       samples,
...
---> 77 input_cumwidths = cumwidths.gather(-1, bin_idx)#[..., 0]
     81 input_bin_widths = widths.gather(-1, bin_idx)#[..., 0]
     83 input_cumheights = cumheights.gather(-1, bin_idx)#[..., 0]

RuntimeError: index -1 is out of bounds for dimension 1 with size 6

"N"-type flows don't work on GPUs

pdf = jammy_flows.pdf("e1+s2", "gggg+n", conditional_input_dim=4, hidden_mlp_dims_sub_pdfs="128-128").to("cuda")
inp = inp.to("cuda")
cond = cond.to("cuda")
pdf(inp, conditional_input=cond)


File ~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.local/lib/python3.10/site-packages/jammy_flows/flows.py:970, in pdf.forward(self, x, conditional_input, amortization_parameters, force_embedding_coordinates, force_intrinsic_coordinates)
    966     assert(x.shape[0]==conditional_input.shape[0]), "Evaluating input x and condititional input shape must be similar!"
    968 tot_log_det = torch.zeros(x.shape[0]).type_as(x)
--> 970 base_pos, tot_log_det=self.all_layer_inverse(x, tot_log_det, conditional_input, amortization_parameters=amortization_parameters, force_embedding_coordinates=force_embedding_coordinates, force_intrinsic_coordinates=force_intrinsic_coordinates)
    972 log_pdf = torch.distributions.MultivariateNormal(
    973     torch.zeros_like(base_pos).to(x),
    974     covariance_matrix=torch.eye(self.total_base_dim).type_as(x).to(x),
...
    104     #extra_input_counter+=self.num_householder_params
    105 else:
    106     mat_pars=mat_pars.repeat(x.shape[0],1,1)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
