photrek / nonlinear-statistical-coupling
License: Apache License 2.0
The Coupled Power Function is a high priority as it is needed for the Weighted Generalized Mean function.
See equation 3.15 of the Reduced Perplexity book chapter for the specification of the coupled power. I will also give some consideration as to whether the coupled power can be defined in terms of the coupled logarithm and coupled exponential functions.
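One candidate composition in terms of the coupled logarithm and exponential is sketched below; the helper forms follow the NSC conventions as assumptions, and this is not necessarily equation 3.15, so it should be verified against the chapter before adoption.
import numpy as np

def coupled_logarithm(x, kappa=0.0, dim=1):
    # Assumed NSC-style form; reduces to np.log(x) as kappa -> 0
    if kappa == 0:
        return np.log(x)
    return (1.0 / kappa) * (x ** (kappa / (1.0 + dim * kappa)) - 1.0)

def coupled_exponential(y, kappa=0.0, dim=1):
    # Assumed inverse of coupled_logarithm; reduces to np.exp(y) as kappa -> 0
    if kappa == 0:
        return np.exp(y)
    return (1.0 + kappa * y) ** ((1.0 + dim * kappa) / kappa)

def coupled_power(x, a, kappa=0.0, dim=1):
    # Candidate composition exp_k(a * log_k(x)); recovers x**a as kappa -> 0
    return coupled_exponential(a * coupled_logarithm(x, kappa, dim), kappa, dim)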
There are special cases of the coupled entropy function which reduce by either:
These special cases should be identified and solved analytically within the coupled entropy function rather than as separate functions. It's acceptable for the special functions which have already been developed to be subfunctions of the coupled entropy.
To enable processing of CIFAR-10 with the Coupled VAE, a variety of enhancements are required. One critical enhancement is to incorporate supervised learning with labels for the 10 classes of the CIFAR images. The labeling will allow each class to be trained with its own Coupled VAE latent layer. This is one step toward reducing the complexity of the dataset. Other steps will be required, but let's complete this step first.
Review carefully the Boenninghoff paper on Student t Mixture Model VAE and related papers particularly the following reference: J. Domke and D. Sheldon, “Importance Weighting and Variational Inference,” in 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), 2018.
Determine and specify what changes are needed in both the architecture of the VAE and the cost functions of the VAE.
Collaborate with John Clements on a plan to implement this change.
When I use tf.math to calculate the KL divergence:
logpz = self.log_normal_pdf(z_sample, 0., 1.)           # log-density of z under the prior
logqz_x = self.log_normal_pdf(z_sample, mean, logvar)   # log-density of z under the approximate posterior q(z|x)
kl_div = logqz_x - logpz                                 # per-sample Monte Carlo estimate of the KL divergence
I get the following numbers, averaging to 0.26684278.
ipdb> kl_div
<tf.Tensor: shape=(128,), dtype=float32, numpy=
array([ 0.86921954, 0.45371056, 0.88609505, -0.01174855, 0.46242428,
0.88558793, 0.9410727 , -0.3700683 , 0.84306717, -0.06638861,
0.37066197, 0.45233607, 0.8508425 , 0.52803755, 0.9505639 ,
-0.3642335 , 0.38080645, 0.9970349 , -0.27189922, 0.97796655,
0.95682573, -0.32304716, -0.813817 , -0.30250955, 0.6030083 ,
0.75281763, -1.2111926 , -0.7194972 , 0.2248416 , -0.5922799 ,
-0.16742158, 0.05214858, -0.53073287, 0.548347 , 0.6337991 ,
-0.40753698, 0.864239 , -1.0780277 , 0.7774732 , 0.6771748 ,
0.80476236, -0.46709728, -1.0554905 , 0.37865567, 0.7497237 ,
0.33856797, 0.81753445, 0.8892932 , 0.3270316 , -1.6759243 ,
0.4765191 , 0.64577174, 0.25702858, 0.26793242, 0.8592057 ,
0.7047727 , 0.9932246 , -1.3861675 , 0.10657287, 0.52103424,
0.56670666, 0.63626647, 0.5903802 , -1.5752082 , 0.23447895,
-2.8028917 , 0.61361504, 0.32030725, 0.77301764, -0.25954676,
-0.19354391, 0.91773224, 0.4544549 , 0.6440444 , 0.9674704 ,
-0.13501692, -0.4141333 , -0.14588952, 0.07112408, 0.96379423,
0.96018887, -0.28566027, 0.45304155, 0.64666224, 0.5147927 ,
0.460351 , -0.42211604, 0.88477373, 0.41102314, 0.663666 ,
0.86534095, 0.9025917 , 0.46783733, 0.47456598, 0.71588826,
0.99136996, 0.08168316, 0.95838964, 0.9762056 , -0.6931119 ,
0.39269042, 0.7297759 , 0.70975566, 0.9976181 , 0.1633246 ,
0.8174796 , 0.9928291 , 0.6770606 , 0.64429426, -0.60490847,
0.63297606, -0.11669183, 0.78825736, 0.90766 , -0.7307825 ,
0.9038205 , 0.05003333, 0.89798975, 0.58409166, -0.593915 ,
0.1855967 , -0.3870778 , 0.96894336, 0.33893538, -0.7414057 ,
0.6335244 , -2.4104114 , 0.20711422], dtype=float32)>
ipdb> tf.math.reduce_mean(kl_div)
<tf.Tensor: shape=(), dtype=float32, numpy=0.26684278>
However, when I use coupled_kl_divergence_norm in the following manner:
x_recons_logits, z_sample, mean, logvar = self.model(x_true)
q_zx = MultivariateCoupledNormal(loc=mean.numpy(), scale=tf.exp(logvar/2).numpy())
p_z = MultivariateCoupledNormal(loc=np.zeros(mean.shape), scale=np.ones(logvar.shape))
kl_div = coupled_kl_divergence_norm(q_zx, p_z, root=False)
kl_div = tf.convert_to_tensor(kl_div, dtype=tf.float32)
I get the following numbers, averaging to just 0.0007430053.
...
...
[[ 1.5007547e-03]],
[[ 1.4787790e-04]],
[[ 4.7172891e-04]],
[[ 1.7600998e-03]],
[[-3.5504821e-05]],
[[ 2.9015096e-04]],
[[ 1.0139990e-03]],
[[ 2.1769719e-03]],
[[ 3.9246224e-04]],
[[ 3.0478922e-04]],
[[ 2.3205217e-03]],
[[ 1.1474675e-03]],
[[ 2.5465735e-04]],
[[ 1.3815672e-03]],
[[ 1.0416418e-03]]], dtype=float32)>
ipdb> tf.math.reduce_mean(kl_div)
<tf.Tensor: shape=(), dtype=float32, numpy=0.0007430053>
Please advise.
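For reference, a closed-form cross-check of the Gaussian case (a minimal sketch using the standard formula for the KL between a diagonal Gaussian and the standard normal prior; not part of the nsc library):
import tensorflow as tf

def gaussian_kl_closed_form(mean, logvar):
    # KL( N(mean, diag(exp(logvar))) || N(0, I) ), summed over the latent dimensions
    return 0.5 * tf.reduce_sum(tf.exp(logvar) + tf.square(mean) - 1.0 - logvar, axis=-1)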
Implement the entropy function for MultivariateCoupledNormal. Use the equivalent CoupledNormal function as a foundation.
See our latest nsc code here.
Instantiating the distribution classes should not require passing in concrete loc and scale 'tensors' right away, allowing for delayed execution. This allows for speedier runtime during execution of the model. For example, @tf.function does not have to be commented out when integrating nsc into a TF-based model.
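A minimal sketch of the idea follows, with a hypothetical class name and constructor (not the actual nsc API); the point is simply that loc and scale stay as tensors with no eager .numpy() conversion, so the object can be constructed inside a @tf.function-compiled step.
import tensorflow as tf

class LazyCoupledNormal:
    # Illustrative only: store loc/scale as tensors so construction works under tf.function
    def __init__(self, loc, scale, kappa=0.0):
        self.loc = loc        # kept as a tensor; no eager .numpy() call
        self.scale = scale
        self.kappa = kappa

@tf.function
def train_step(mean, logvar):
    # Construction succeeds in graph mode because nothing eager-only is called
    q_zx = LazyCoupledNormal(loc=mean, scale=tf.exp(logvar / 2.), kappa=0.1)
    p_z = LazyCoupledNormal(loc=tf.zeros_like(mean), scale=tf.ones_like(logvar), kappa=0.1)
    return q_zx.scale - p_z.scale  # placeholder computation so the sketch runs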
Each function should have a docstring summarizing the function, with a reference to the paper and equation on which the function is modeled.
The variable which in papers and in the documentation is called "Risk Bias" currently has the variable name kMult. Throughout the NSC and Coupled VAE library, this variable needs to be renamed riskBias.
Implement the KL-Divergence function for two MultivariateCoupledNormal distributions. Use the equivalent CoupledNormal function as a foundation.
Create a CoupledNormal distribution class that takes in coupling value kappa and includes the following functions:
def log_prob(x, df, loc, scale):
"""Compute log probability of Student T distribution.
Note that scale can be negative.
Args:
x: Floating-point `Tensor`. Where to compute the log probabilities.
df: Floating-point `Tensor`. The degrees of freedom of the
distribution(s). `df` must contain only positive values.
loc: Floating-point `Tensor`; the location(s) of the distribution(s).
scale: Floating-point `Tensor`; the scale(s) of the distribution(s).
Returns:
A `Tensor` with shape broadcast according to the arguments.
"""
# Writing `y` this way reduces XLA mem copies.
y = (x - loc) * (tf.math.rsqrt(df) / scale)
log_unnormalized_prob = -0.5 * (df + 1.) * log1psquare(y)
log_normalization = (
tf.math.log(tf.abs(scale)) + 0.5 * tf.math.log(df) +
0.5 * np.log(np.pi) + tfp_math.log_gamma_difference(0.5, 0.5 * df))
return log_unnormalized_prob - log_normalization
def entropy(df, scale, batch_shape, dtype):
"""Compute entropy of the StudentT distribution.
Args:
df: Floating-point `Tensor`. The degrees of freedom of the
distribution(s). `df` must contain only positive values.
scale: Floating-point `Tensor`; the scale(s) of the distribution(s). Must
contain only positive values.
batch_shape: Floating-point `Tensor` of the batch shape
dtype: Return dtype.
Returns:
A `Tensor` of the entropy for a Student's T with these parameters.
"""
v = tf.ones(batch_shape, dtype=dtype)
u = v * df
return (tf.math.log(tf.abs(scale)) + 0.5 * tf.math.log(df) +
tfp_math.lbeta(u / 2., v / 2.) + 0.5 * (df + 1.) *
(tf.math.digamma(0.5 * (df + 1.)) - tf.math.digamma(0.5 * df)))
Create a MultivariateCoupledNormal distribution class that takes in coupling value kappa and number of dimensions dim, and includes the following functions:
- log_prob. See our latest nsc code here.
- sample_n. See our latest nsc code here.
- entropy. See our latest nsc code here.
- kl_multcouplednormal_multcouplednormal, i.e. the KL-Divergence function for two MultivariateCoupledNormal distributions.
Use the respective CoupledNormal functions as a foundation to build the MultivariateCoupledNormal ones.
from scipy.stats import multivariate_t
- produces probability density function, random deviates, etc.
- allows non-diagonal sigma
https://colab.research.google.com/drive/1TcIfpMnx95QwUi0ZO3aVns7atYTepTyu?usp=sharing
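A short usage sketch of the scipy reference implementation (parameter values are arbitrary):
from scipy.stats import multivariate_t

rv = multivariate_t(loc=[0.0, 0.0], shape=[[2.0, 0.3], [0.3, 0.5]], df=4)  # non-diagonal sigma allowed
samples = rv.rvs(size=5, random_state=0)   # random deviates, shape (5, 2)
densities = rv.pdf(samples)                # probability density at each sample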
Implement the log PDF for MultivariateCoupledNormal. Use the equivalent CoupledNormal function as a foundation.
See our latest nsc code here.
Complete the development of Coupled Cross-Entropy, Coupled Entropy, and Coupled Divergence in Python
Use the Mathematica functions in Coupled_Functions.nb as guides for the development
Daniel Svoboda has prototyped the code, see his workspace folder and the file functions.py
The important points are:
Implement the log PDF for CoupledNormal. Use the commented-out TFP StudentT log_prob code as a foundation.
See our latest nsc code here.
See original StudentT code here:
def log_prob(x, df, loc, scale):
"""Compute log probability of Student T distribution.
Note that scale can be negative.
Args:
x: Floating-point `Tensor`. Where to compute the log probabilities.
df: Floating-point `Tensor`. The degrees of freedom of the
distribution(s). `df` must contain only positive values.
loc: Floating-point `Tensor`; the location(s) of the distribution(s).
scale: Floating-point `Tensor`; the scale(s) of the distribution(s).
Returns:
A `Tensor` with shape broadcast according to the arguments.
"""
# Writing `y` this way reduces XLA mem copies.
y = (x - loc) * (tf.math.rsqrt(df) / scale)
log_unnormalized_prob = -0.5 * (df + 1.) * log1psquare(y)
log_normalization = (
tf.math.log(tf.abs(scale)) + 0.5 * tf.math.log(df) +
0.5 * np.log(np.pi) + tfp_math.log_gamma_difference(0.5, 0.5 * df))
return log_unnormalized_prob - log_normalization
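As a possible numerical reference while implementing this, a sketch that evaluates the coupled-normal log PDF through the Student-t equivalence df = 1/kappa noted elsewhere in these issues (assumed to hold for kappa > 0 with matching loc and scale):
from scipy import stats

def coupled_normal_log_prob_reference(x, kappa, loc=0.0, scale=1.0):
    # Reference values only: for kappa > 0 use Student-t with df = 1/kappa; kappa = 0 is the normal limit
    if kappa == 0:
        return stats.norm.logpdf(x, loc=loc, scale=scale)
    return stats.t.logpdf(x, df=1.0 / kappa, loc=loc, scale=scale)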
Support team in the implementation of effective numerical methods for the core Coupled Functions, particularly the current issue regarding integration for the entropy functions.
Provide guidance for Daniel Svoboda in completing a well-written review of applications of Coupled VAEs
Develop a plan for applying the Coupled VAE to a variety of processes (signal, image, and natural language).
Reporting some of the results I talked about with John, Bill, and Kevin with regard to negative kappa values.
For -1 < kappa < 0, there are domain restrictions on the input values. When kappa < -1, the shape of the exponential inverts and stops at the x-axis. At kappa = -1, it is a straight line which extends from -inf to 1.
Code to reproduce:
import numpy as np
import matplotlib.pyplot as plt
# assumes the nsc coupled-math module is already imported as `nsc`, as in the original context

n_sample = 100  # not defined in the original snippet; value assumed for illustration
X = np.linspace(-10, 5, n_sample*10)
y = {}
fig, ax = plt.subplots(figsize=(8, 12))
ax.axvline(c='black', lw=1)
ax.axhline(c='black', lw=1)
cm = plt.get_cmap('PiYG')
kappa_values = [round(value, 1) for value in np.arange(-2, -0.4, 0.2)]
n = len(kappa_values)
plt.xlim(-10, 10)
plt.ylim(-10, 10)
for kappa in kappa_values:
    y[kappa] = nsc.exp(X, kappa)
    plt.plot(X, y[kappa], label=kappa)
plt.legend()
plt.show()
I will comment more results in this issue as I continue to find them.
I was looking through the CoupledNormal class and was curious about the sampling method. I did some testing to compare it with the Student's t distribution, and it looks like they are not consistent.
From my understanding, a coupled normal has the same distribution as a Student's t with df = 1/kappa, provided loc and scale are the same. Empirically, when I sample from the CoupledNormal and from scipy's Student's t, the results are inconsistent, and in some cases wildly different.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t
# CoupledNormal is imported from the nsc library; the exact import path is omitted here

sample_size = 1000
kappa = 0.4
loc = 0.
scale = 1.
np.random.seed(0)
cn = CoupledNormal(loc=loc, scale=scale, kappa=kappa, alpha=2)
cn_samples = cn.sample_n(n=sample_size)
t_samples = t.rvs(df=1/kappa, loc=loc, scale=scale, size=sample_size)
fig, ax = plt.subplots(figsize=(8, 5))
plt.hist(cn_samples, label='coupled normal', bins=30, alpha=0.5)
plt.hist(t_samples, label='students-t', bins=30, alpha=0.5)
plt.legend()
It seems to be particularly bad when scale > 1 and kappa is small:
sample_size = 1000
kappa = 0.1
loc = 0.
scale = 10.
np.random.seed(0)
cn = CoupledNormal(loc=loc, scale=scale, kappa=kappa, alpha=2)
cn_samples = cn.sample_n(n=sample_size)
t_samples = t.rvs(df=1/kappa, loc=loc, scale=scale, size=sample_size)
fig, ax = plt.subplots(figsize=(8, 5))
plt.hist(cn_samples, label='coupled normal', bins=30, alpha=0.5)
plt.hist(t_samples, label='students-t', bins=30, alpha=0.5)
plt.legend()
Here is the list of desired deliverables for the next release of v0.0.4:
- prob and sample_n functions for both CoupledNormal and MultivariateCoupledNormal classes.
- MultivariateCoupledNormal class in the reparameterization, as well as coupled kl_divergence and coupled_entropy in the loss function.
See the current production version here and the test version here.
The NSC library needs to include a Coupled Product Function. This is a high priority as it forms the foundation for the Generalized Mean function and will be used in other ways regarding the performance evaluation.
The specification is detailed in the mathematical code. The basic structure is exp_k(Total(log_k(x_i))), where the inputs x_i are typically probabilities (a sketch is given below). The first priority is to implement a version similar to the mathematical code in which all the inputs have the same dimension. A lower priority would be to allow each input to have a different dimension.
See K. P. Nelson, “A definition of the coupled-product for multivariate coupled-exponentials,” Phys. A Stat. Mech. its Appl., vol. 422, pp. 187–192, Mar. 2015.
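A minimal numpy sketch of the same-dimension case described above. The coupled logarithm and exponential forms are assumptions matching the coupled-power sketch earlier in this document and should be checked against the library and the referenced paper.
import numpy as np

# Assumed forms, as in the coupled-power sketch earlier in this document
coupled_log = lambda x, k=0.0, d=1: np.log(x) if k == 0 else (x ** (k / (1 + d * k)) - 1) / k
coupled_exp = lambda y, k=0.0, d=1: np.exp(y) if k == 0 else (1 + k * y) ** ((1 + d * k) / k)

def coupled_product(values, kappa=0.0, dim=1):
    # exp_k(Total(log_k(x_i))), all inputs sharing the same dimension (first-priority case)
    values = np.asarray(values, dtype=float)
    return coupled_exp(coupled_log(values, kappa, dim).sum(), kappa, dim)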
The computations work when there is a batch dimension, but we might want to display the loc and scale without a batch dimension for readability.
Initially, I tried to use the following lambda function in MNC prob:
_normalized_X = lambda x: np.matmul(
    np.matmul(np.expand_dims(x - self._loc, axis=-2), _sigma_inv),
    np.expand_dims(x - self._loc, axis=-1))
However, when doing so, I get the following error:
X_norm = np.apply_along_axis(_normalized_X, 1, X)
*** ValueError: operands could not be broadcast together with shapes (64,) (64,2)
Therefore, as a workaround, I have to use the following for-loop to populate X_norm. Nevertheless, I still believe this can be done through vectorization, for example using an alternative to np.apply_along_axis.
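As a possible vectorized alternative (a sketch assuming X has shape (batch, dim), loc has shape (dim,), and sigma_inv has shape (dim, dim); adjust for the batch dimensions used in MNC):
import numpy as np

def batched_mahalanobis_sq(X, loc, sigma_inv):
    # (x - loc)^T Sigma^{-1} (x - loc) for every row of X, without np.apply_along_axis
    diff = X - loc                                          # (batch, dim)
    return np.einsum('bi,ij,bj->b', diff, sigma_inv, diff)  # (batch,)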
Draft a template copyright statement for each file to include "Copyright Photrek [Date]", with a second line stating Contributing Programmers and a third line stating Reviewers and Approvers.
We need a clear plan for how we will demonstrate the capabilities of the Coupled VAE. This issue can address both short-term demonstrations, which we complete for the current paper, and longer-term interests in applications that potential sponsors would find compelling.
For the short term:
For the longer term:
The nsc lib is currently compatible with scalars and numpy arrays. We would now also like to make it compatible with TensorFlow tensors in order to use it to perform experiments for the VAE.
In nsc's tensor branch, the sample_n function is the highest priority, as we are very likely to use the tensor version of sample_n rather than the numpy version.
The Coupled Sum Function is a lower priority as it is not likely to be needed for the Coupled VAE development. Nevertheless, for completeness, this would be nice to have in the library.
There are two ways the function could be developed.
The Coupled Sum arises from the product of coupled exponential functions, which in turn results in the coupled sum of the exponents of the resultant: exp_k(x) * exp_k(y) = exp_k(x +_k y). The coupled sum is defined as x +_k y = x + y + k*x*y. However, a more complete version that accounts for the parameters alpha and dimension, as the coupled product function does, is preferred. Thus a better implementation would be as follows.
Coupled_Sum(X) = log_k(Product_Total(exp_k(x_i))), where X is an array and x_i are its elements. I'm not showing the alpha and dimension terms, but these would be included to complete the expression.
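A minimal numpy sketch of the basic case (alpha and dimension terms omitted, as noted above). For the basic forms, folding the pairwise sum over an array equals log_k(Product_Total(exp_k(x_i))) = (prod(1 + k*x_i) - 1)/k:
from functools import reduce
import numpy as np

def coupled_sum_pair(x, y, kappa=0.0):
    # Basic two-argument coupled sum: x (+_k) y = x + y + kappa*x*y
    return x + y + kappa * x * y

def coupled_sum(values, kappa=0.0):
    # Fold the pairwise coupled sum over the array; equals (prod(1 + kappa*x_i) - 1)/kappa for kappa != 0
    values = np.asarray(values, dtype=float)
    return reduce(lambda a, b: coupled_sum_pair(a, b, kappa), values)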
In the current vae code, x_true is the input images while x_recons_logits is the output generated images. Both are Tensors. In the tf lib, there is a cross-entropy function that computes this from the two:
raw_cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
labels=x_true, logits=x_recons_logits)
However, in our nsc lib, our coupled_cross_entropy function takes density functions p and q as inputs. Would it also work if we passed in x_true and x_recons_logits? Would it be something like:
coupled_cross_entropy(x, x_gen, sample_n)
And how do we get sample_n?
Implement the entropy function for CoupledNormal. Use the commented-out TFP StudentT entropy code as a foundation.
See our latest nsc code here.
See original StudentT code here:
def entropy(df, scale, batch_shape, dtype):
"""Compute entropy of the StudentT distribution.
Args:
df: Floating-point `Tensor`. The degrees of freedom of the
distribution(s). `df` must contain only positive values.
scale: Floating-point `Tensor`; the scale(s) of the distribution(s). Must
contain only positive values.
batch_shape: Floating-point `Tensor` of the batch shape
dtype: Return dtype.
Returns:
A `Tensor` of the entropy for a Student's T with these parameters.
"""
v = tf.ones(batch_shape, dtype=dtype)
u = v * df
return (tf.math.log(tf.abs(scale)) + 0.5 * tf.math.log(df) +
tfp_math.lbeta(u / 2., v / 2.) + 0.5 * (df + 1.) *
(tf.math.digamma(0.5 * (df + 1.)) - tf.math.digamma(0.5 * df)))
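As a possible numerical cross-check while porting this, scipy's Student-t entropy should agree with the formula above for matching df and scale (a sketch; parameter values are arbitrary):
from scipy.stats import t

# Differential entropy of a Student-t with df = 5, scale = 2.0;
# should match the TFP expression above for the same parameters
h = t(df=5, loc=0.0, scale=2.0).entropy()
print(h)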
Here is the list of desired deliverables for the next release of v0.0.3:
- MultivariateCoupledNormal class, or enable CoupledNormal to compute for n dimensions.
- sample_n function in the MultivariateCoupledNormal distribution class(es). Use inverse transform sampling or RNG?
- MultivariateCoupledNormal with the coupled entropy function.
- MultivariateCoupledNormal class in the reparameterization, as well as coupled kl_divergence and coupled_entropy in the loss function.
See the current production version here and the test version here.
Implement the sampling function for CoupledNormal. Use the commented-out TFP StudentT sample_n code as a foundation.
See our latest nsc code here.
See original StudentT code here:
def sample_n(n, df, loc, scale, batch_shape, dtype, seed):
"""Draw n samples from a Student T distribution.
Note that `scale` can be negative or zero.
The sampling method comes from the fact that if:
X ~ Normal(0, 1)
Z ~ Chi2(df)
Y = X / sqrt(Z / df)
then:
Y ~ StudentT(df)
Args:
n: int, number of samples
df: Floating-point `Tensor`. The degrees of freedom of the
distribution(s). `df` must contain only positive values.
loc: Floating-point `Tensor`; the location(s) of the distribution(s).
scale: Floating-point `Tensor`; the scale(s) of the distribution(s). Must
contain only positive values.
batch_shape: Callable to compute batch shape
dtype: Return dtype.
seed: Optional seed for random draw.
Returns:
samples: a `Tensor` with prepended dimensions `n`.
"""
normal_seed, gamma_seed = samplers.split_seed(seed, salt='student_t')
shape = ps.concat([[n], batch_shape], 0)
normal_sample = samplers.normal(shape, dtype=dtype, seed=normal_seed)
df = df * tf.ones(batch_shape, dtype=dtype)
gamma_sample = gamma_lib.random_gamma(
[n], concentration=0.5 * df, rate=0.5, seed=gamma_seed)
samples = normal_sample * tf.math.rsqrt(gamma_sample / df)
return samples * scale + loc
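A numpy sketch of the same recipe applied to a 1-D coupled normal, assuming the Student-t equivalence df = 1/kappa noted elsewhere in these issues (kappa = 0 falls back to the normal):
import numpy as np

def coupled_normal_sample(n, kappa, loc=0.0, scale=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    if kappa == 0:
        return rng.normal(loc, scale, size=n)
    df = 1.0 / kappa
    x = rng.standard_normal(n)                # X ~ Normal(0, 1)
    z = rng.chisquare(df, size=n)             # Z ~ Chi2(df)
    return loc + scale * x / np.sqrt(z / df)  # Y = X / sqrt(Z / df) ~ StudentT(df)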
Create a KL-Divergence function that accepts two tfp StudentT distributions. See the _kl_normal_normal function from tfp Normal as an example.
@kullback_leibler.RegisterKL(Normal, Normal)
def _kl_normal_normal(a, b, name=None):
"""Calculate the batched KL divergence KL(a || b) with a and b Normal.
Args:
a: instance of a Normal distribution object.
b: instance of a Normal distribution object.
name: Name to use for created operations.
Default value: `None` (i.e., `'kl_normal_normal'`).
Returns:
kl_div: Batchwise KL(a || b)
"""
with tf.name_scope(name or 'kl_normal_normal'):
b_scale = tf.convert_to_tensor(b.scale) # We'll read it thrice.
diff_log_scale = tf.math.log(a.scale) - tf.math.log(b_scale)
return (
0.5 * tf.math.squared_difference(a.loc / b_scale, b.loc / b_scale) +
0.5 * tf.math.expm1(2. * diff_log_scale) -
diff_log_scale)
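Since a simple closed form for the KL between two StudentT distributions is not generally available, one option (a sketch, not the required implementation) is a Monte Carlo estimate of KL(a || b) = E_{x~a}[log a(x) - log b(x)]:
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def kl_student_t_monte_carlo(a, b, n_samples=10000, seed=None):
    # Sample from a and average the log-density ratio
    x = a.sample(n_samples, seed=seed)
    return tf.reduce_mean(a.log_prob(x) - b.log_prob(x), axis=0)

# Usage sketch:
# a = tfd.StudentT(df=5., loc=0., scale=1.)
# b = tfd.StudentT(df=7., loc=1., scale=2.)
# kl_est = kl_student_t_monte_carlo(a, b)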
This specifies an alternative method for coding the generalized mean. It is lower priority since the straightforward mathematics is already coded.
While the formula is well known and was implemented in this manner for the Mathematica code, the preference would be to utilize the coupled product and coupled power functions. This would provide a foundation for the function within the coupled algebra context and incorporate the subtleties which arise regarding the variables kappa, alpha, and dimension. It is probably best for the inputs to include an option to specify either the risk_bias = -alpha * kappa / (1 + dim * kappa) or alpha, kappa, and dim directly.
The functional specification is given in equation 3.15 of the Reduced Perplexity book chapter.
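A direct numpy sketch of the well-known weighted power-mean formula referenced above, with the exponent playing the role of the risk bias (r = -alpha*kappa / (1 + dim*kappa)); the coupled-product and coupled-power formulation described above would replace this internally:
import numpy as np

def weighted_generalized_mean(values, weights=None, r=1.0):
    # (sum_i w_i * x_i**r)**(1/r); the r -> 0 limit is the weighted geometric mean
    values = np.asarray(values, dtype=float)
    if weights is None:
        weights = np.full(values.shape, 1.0 / values.size)
    else:
        weights = np.asarray(weights, dtype=float)
    if np.isclose(r, 0.0):
        return float(np.exp(np.sum(weights * np.log(values))))
    return float(np.sum(weights * values ** r) ** (1.0 / r))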
Given the current issue regarding the limited range of convergence for the Coupled VAE, investigate two potential resolutions:
Does taking the root of the entropy function give sufficient stability to the quantification of generalized entropy such that the training of machine learning algorithms can converge over a broader range of coupling values? If so, plan to prioritize use of the root in defining the Coupled Entropy and draft a paper that introduces the importance of this definition for generalized entropy.
Develop a better understanding of how the incorporation of the generalized cost functions into machine learning affects the gradient descent algorithms used in the training process.
Given the literature on the Wasserstein metric for machine learning, determine exactly what the distinction, if any, is from the generalized mean. If these are closely related, or possibly equal, then clarify how the generalized mean could be used as a cost function rather than the generalized entropy functions.
This function provides numerical efficiency in computing the coupled cross-entropy for the coupled Gaussian.
Change the name to coupled_cross_entropy_coupled_gaussian.
Change the name of coupled_cross_entropy to coupled_cross_entropy_general.
Create a wrapper function which calls the subfunctions coupled_cross_entropy_coupled_gaussian and coupled_cross_entropy_general.
The documentation at the head of the function should refer to the appropriate equations.
The same change needs to be completed for coupled_entropy_norm and coupled_divergence_norm.
Implement the sampling function for MultivariateCoupledNormal. Use the equivalent CoupledNormal function as a foundation.
See our latest nsc code here.