edward2's Issues

Add activation/weight histograms for CIFAR-10

For some reason, naively turning it on didn't work for me in variational_inference.py.

  tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir=FLAGS.output_dir,
                                                  # TODO(trandustin): This
                                                  # doesn't work(?).
                                                  # histogram_freq=5,
                                                  write_graph=False)
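One unverified guess: older tf.keras versions only write histograms when validation data is passed to model.fit, so enabling histogram_freq alone fails. A minimal sketch under that assumption (dataset_train, dataset_validation, and FLAGS.train_epochs are stand-in names):

  tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir=FLAGS.output_dir,
                                                  histogram_freq=5,
                                                  write_graph=False)
  model.fit(dataset_train,
            epochs=FLAGS.train_epochs,
            # Histograms may require an explicit validation set here.
            validation_data=dataset_validation,
            callbacks=[tensorboard_cb])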

Problem Importing Layer Modules

Hello,

I was eager to try out some of the Bayesian layers that you have implemented. I saw that they were in the tensor2tensor package, but they seem to have been removed and placed in the layers part of the edward2 package. However, I can't find any of the modules (e.g., GaussianProcess, SparseGaussianProcess), or perhaps I don't understand how importing them works. I thought the names might be different, but I can't wrap my head around where to find the Bayesian layer modules.

I cannot even find the files within the installed distribution located at /usr/local/lib/python3.6/dist-packages/edward2/__init__.py, so I'm not sure whether I am doing something wrong.
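For reference, recent edward2 builds expose the Bayesian layers under a layers namespace; a minimal check, assuming an install from current GitHub master rather than the 0.0.1 PyPI release (constructor arguments are illustrative):

import edward2 as ed

# These should resolve if the installed build actually ships the Bayesian layers.
gp_layer = ed.layers.GaussianProcess(units=10)
sgp_layer = ed.layers.SparseGaussianProcess(units=10, num_inducing=50)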


Installations

I followed this installation procedure, with and without the tf-nightly extra:

pip install edward2[tf-nightly]

I also tried installing it directly from the GitHub repo:

pip install git+https://github.com/google/edward2

Helpful Info

  • Google Colab Notebook
  • Python 3.6
  • TensorFlow - 2.0.0-rc1
  • Edward - 0.0.1

Thanks,
Emmanuel

Train on more CIFAR-10 data

All baselines currently use splits of 40k train / 10k validation / 10k test. What's standard (aside from not using validation data at all)? 45k train / 5k validation? 49k train / 1k validation?

Define metrics used in baselines

As we continue to add metrics, we should formalize how they're defined by writing a section with English/math descriptions of how each of the columns is computed.
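As a concrete instance, the negative log-likelihood column would presumably be the test-set average of per-example log probabilities; a sketch, where predictions is a model's categorical ed.RandomVariable output and labels are integer class labels:

# Sketch: NLL as the mean negative log probability of the true labels.
nll = -tf.reduce_mean(predictions.distribution.log_prob(labels))

# Sketch: accuracy as the fraction of argmax predictions matching the labels.
predicted_class = tf.argmax(predictions.distribution.logits, axis=-1)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(predicted_class, tf.cast(labels, tf.int64)), tf.float32))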

Rewrite resnet implementations as block + model?

The ResNets in the CIFAR/ImageNet baselines currently define generic conv, bn, and relu layers, and the model function stacks these.

I like PyTorch's implementations, which more closely resemble how we think of ResNets: define a residual block function (which itself can vary), and then define a model which stacks residual blocks. This seems conceptually nicer but may require more boilerplate, as the conv layer takes quite a few arguments.
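A rough sketch of that structure in Keras, with illustrative names and arguments rather than the baselines' actual API:

import tensorflow as tf

def residual_block(inputs, filters, strides=1):
  """A basic residual block: two 3x3 convs plus an optionally projected skip."""
  shortcut = inputs
  x = tf.keras.layers.Conv2D(filters, 3, strides=strides, padding='same',
                             use_bias=False)(inputs)
  x = tf.keras.layers.BatchNormalization()(x)
  x = tf.keras.layers.ReLU()(x)
  x = tf.keras.layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
  x = tf.keras.layers.BatchNormalization()(x)
  if strides > 1 or inputs.shape[-1] != filters:
    # Project the skip connection when shape changes.
    shortcut = tf.keras.layers.Conv2D(filters, 1, strides=strides,
                                      use_bias=False)(inputs)
  return tf.keras.layers.ReLU()(x + shortcut)

def resnet(input_shape, num_classes, blocks_per_group=3):
  """Stacks residual blocks into a small CIFAR-style model."""
  inputs = tf.keras.Input(shape=input_shape)
  x = tf.keras.layers.Conv2D(16, 3, padding='same', use_bias=False)(inputs)
  for i, filters in enumerate([16, 32, 64]):
    for j in range(blocks_per_group):
      x = residual_block(x, filters, strides=2 if i > 0 and j == 0 else 1)
  x = tf.keras.layers.GlobalAveragePooling2D()(x)
  outputs = tf.keras.layers.Dense(num_classes)(x)
  return tf.keras.Model(inputs=inputs, outputs=outputs)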

model.predict does not work with stochastic output layers

model.predict doesn't work for non-Tensor outputs, including Tensor-convertible objects like ed.RandomVariable.

For now, the workaround is to replace the model.predict call shown below with an explicit for loop over the data.

# The failing approach: model.predict on a stochastic output layer.
dataset_test = dataset_test.repeat().batch(batch_size)
test_steps = ds_info.splits['test'].num_examples // batch_size
predictions = model.predict(dataset_test, verbose=1, steps=test_steps)  # raises error
logits = predictions.distribution.logits  # predicted logits of full dataset

# The workaround: call the model directly inside an explicit loop.
dataset_test = dataset_test.batch(batch_size)
logits = []
for features, _ in dataset_test:
  predictions = model(features)
  logits.append(predictions.distribution.logits)

logits = tf.concat(logits, axis=0)  # predicted logits of full dataset

Note that to loop over a tf.data dataset like this, you need TF 2.0's eager behavior; otherwise you need a tf.Session with the deprecated iterator design.

Baselines do not test on all data during training

Baselines currently have this snippet.

validation_steps = 100
dataset_test = dataset_test.take(
    FLAGS.batch_size * validation_steps).repeat().batch(FLAGS.batch_size)

Is it too expensive to evaluate on all test data at each epoch? Ideally for small experiments like CIFAR-10, we shouldn't need to run an additional eval job that's separate from training.
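A minimal alternative that evaluates on the whole test split each time, assuming ds_info from tensorflow_datasets is in scope:

validation_steps = ds_info.splits['test'].num_examples // FLAGS.batch_size
dataset_test = dataset_test.repeat().batch(FLAGS.batch_size)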

tune deterministic resnet-50 to 76.3%?

Facebook's scaling paper suggests they consistently get 76.4% with base learning rate 0.1 and total batch size (kn) of 256, and 76.3% with total batch size (kn) of 8k. Our accuracies hover around 76.0-76.3%, based on the official TPU ResNet-50 Keras codebase. We should double-check whether the official codebase meets this target and, ultimately, how we can meet it.

AttributeError: module 'edward2' has no attribute 'set_seed'

The Edward 1 API has a function called set_seed, which is apparently no longer available in Edward 2: I get the error AttributeError: module 'edward2' has no attribute 'set_seed' when I attempt to call ed.set_seed. So, what is the equivalent function in Edward 2?
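There doesn't appear to be a direct replacement in the edward2 namespace. Since Edward2's random variables draw their randomness through TensorFlow, seeding at the TF level should serve the same purpose; a sketch, assuming TF 2:

import tensorflow as tf
import edward2 as ed

tf.random.set_seed(42)  # seeds TF's global generator, which ed.RandomVariables draw from
x = ed.Normal(loc=0., scale=1.)  # sampled values should now repeat across runs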

Potentially rewrite constraint functions for bayesian layers

We currently adopt Keras' practice of unconstrained parameters followed by projected gradient descent. It's more common in probabilistic modeling code to constrain the parameter space itself, e.g., ed.Normal(0., tf.nn.softplus(tf.Variable(1.)) + tf.keras.backend.epsilon()). This is a subtle but potentially impactful change, so we should be careful with ablation studies if we want to change this behavior.
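A sketch contrasting the two styles (illustrative only, not the layers' actual code):

# Keras-style: optimize an unconstrained variable, project after each step.
scale = tf.Variable(1., constraint=tf.keras.constraints.NonNeg())

# Probabilistic-modeling style: constrain the parameterization itself.
unconstrained = tf.Variable(0.5413)  # approximately softplus-inverse of 1.0
rv = ed.Normal(0., tf.nn.softplus(unconstrained) + tf.keras.backend.epsilon())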

Improve deterministic baseline to 95%+ test accuracy

ResNet-20 may be too weak a baseline; we should maybe move to WRN-28-10. Wide ResNets also involve dropout, so there's likely more benefit to BNN layers given the need to regularize the wider layers. The BatchEnsemble baselines' current code also uses ResNets with more filters than is typical.

todos for switching to wide resnet

  • use v2, aka the pre-activation resnet, with bn-relu-conv ordering instead of conv-bn-relu (see the sketch after this list)
  • add a width factor
  • add a dropout baseline, with dropout between convs and perhaps after the skip connection
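A rough sketch of such a pre-activation block (illustrative; a width factor k would simply multiply filters when stacking groups, with WRN-28-10 using k=10):

import tensorflow as tf

def preact_block(inputs, filters, strides=1, dropout_rate=0.):
  """Pre-activation (v2) residual block: bn-relu-conv with optional dropout."""
  x = tf.keras.layers.BatchNormalization()(inputs)
  x = tf.keras.layers.ReLU()(x)
  shortcut = inputs
  if strides > 1 or inputs.shape[-1] != filters:
    # Project the skip connection from the pre-activated features.
    shortcut = tf.keras.layers.Conv2D(filters, 1, strides=strides,
                                      use_bias=False)(x)
  x = tf.keras.layers.Conv2D(filters, 3, strides=strides, padding='same',
                             use_bias=False)(x)
  x = tf.keras.layers.BatchNormalization()(x)
  x = tf.keras.layers.ReLU()(x)
  if dropout_rate > 0.:
    x = tf.keras.layers.Dropout(dropout_rate)(x)  # dropout between the two convs
  x = tf.keras.layers.Conv2D(filters, 3, padding='same', use_bias=False)(x)
  return x + shortcut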

ImportError: cannot import name 'docstring' from 'tensorflow_probability.python.util'

I have a virtual environment (with Python 3.7.4) where I installed edward2, tensorflow (2) and tensorflow_probability. When I try to import edward2 with import edward2 as ed, I get the error

ImportError: cannot import name 'docstring' from 'tensorflow_probability.python.util'

Here's the full traceback:

Traceback (most recent call last):
  File "/Users/nbro/Desktop/edward_tests/edward_test2.py", line 1, in <module>
    import edward2 as ed
  File "/Users/nbro/Desktop/edward_tests/venv2/lib/python3.7/site-packages/edward2/__init__.py", line 32, in <module>
    from edward2 import generated_random_variables
  File "/Users/nbro/Desktop/edward_tests/venv2/lib/python3.7/site-packages/edward2/generated_random_variables.py", line 26, in <module>
    from tensorflow_probability.python.util import docstring as docstring_util
ImportError: cannot import name 'docstring' from 'tensorflow_probability.python.util' (/Users/nbro/Desktop/edward_tests/venv2/lib/python3.7/site-packages/tensorflow_probability/python/util/__init__.py)

AttributeError: module 'edward2' has no attribute 'KLqp'

I am trying to run this example https://github.com/blei-lab/edward/blob/master/examples/bayesian_nn.py, but with Edward 2 and TensorFlow 2 (which is now stable). After first running the tf_upgrade_v2 script on this file and fixing the other problems mentioned in issue #37, I get another error:

Traceback (most recent call last):
  File "/Users/nbro/Desktop/edward_tests/edward_test2.py", line 102, in <module>
    app.run(main)
  File "/Users/nbro/Desktop/edward_tests/venv/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/Users/nbro/Desktop/edward_tests/venv/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/nbro/Desktop/edward_tests/edward_test2.py", line 95, in main
    inference = ed.KLqp({W_0: qW_0, b_0: qb_0, W_1: qW_1, b_1: qb_1, W_2: qW_2, b_2: qb_2},
AttributeError: module 'edward2' has no attribute 'KLqp'

I noticed that https://github.com/google/edward2/blob/master/Upgrading_From_Edward_To_Edward2.md shows an example of how to perform inference with Edward 2. This example is extremely verbose compared to Edward 1's, which just calls KLqp. Is there an easy (non-verbose) way of performing inference with Edward 2 (and TensorFlow 2)?

This issue may be related to blei-lab/edward#640.

MDN Implementation using Edward2

The current MDN example from the Edward tutorials needs small modifications to run on edward2. Documentation covering these modifications would be appreciated.

tensorflow_probability docstring_util location

The location of docstring_util changed in tensorflow_probability 0.8.0, resulting in edward2 errors such as:

File "/home/wibble/.conda/envs/gem/lib/python3.7/site-packages/edward2/generated_random_variables.py", line 26, in <module>
    from tensorflow_probability.python.util import docstring as docstring_util
ImportError: cannot import name 'docstring' from 'tensorflow_probability.python.util'
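Two unverified workarounds: pin tensorflow-probability to the 0.7 release, which still has the old module path, or reinstall edward2 from GitHub master, which may already track the new location:

pip install tensorflow-probability==0.7.0
# or
pip install git+https://github.com/google/edward2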

Add option for mixed precision training

The deterministic and BatchEnsemble baselines cast data to bfloat16 by default on TPUs. Following the Cloud TPU ImageNet example, we need to set a policy if we want to maintain bfloat16 for the activations, etc.

  if _USE_BFLOAT16:
    policy = tf.keras.mixed_precision.experimental.Policy('mixed_bfloat16')
    tf.keras.mixed_precision.experimental.set_policy(policy)

It should be easy enough to set up a boolean flag to use bfloat16 with that policy or otherwise operate in float32.
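A minimal sketch of that flag, using absl flags as the baselines do (the flag name is hypothetical):

import tensorflow as tf
from absl import flags

flags.DEFINE_bool('use_bfloat16', False,
                  'Whether to train under the mixed_bfloat16 policy.')
FLAGS = flags.FLAGS

def maybe_set_precision_policy():
  """Sets the Keras mixed-precision policy when the flag is on."""
  if FLAGS.use_bfloat16:
    policy = tf.keras.mixed_precision.experimental.Policy('mixed_bfloat16')
    tf.keras.mixed_precision.experimental.set_policy(policy)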

Add scale arg to scale the output of KL divergence regularizers

This makes it slightly easier to scale the KL penalties by the dataset size when you want to do it as part of the model rather than at training time. For example, under model.fit the loss is always loss + sum(model.losses), so you can't scale the model losses externally as we currently recommend.
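A hypothetical sketch of the proposed API; the scale argument does not exist yet, dataset_size is a stand-in, and the regularizer/layer names assume edward2's current interfaces:

# Hypothetical: fold the 1/dataset_size scaling into the KL regularizer itself.
kl_regularizer = ed.regularizers.NormalKLDivergence(scale=1. / dataset_size)
layer = ed.layers.DenseReparameterization(
    10, kernel_regularizer=kl_regularizer)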
