
fehiepsi / rethinking-numpyro

438 stars · 13 watchers · 72 forks · 163.78 MB

Statistical Rethinking (2nd ed.) with NumPyro

Home Page: https://fehiepsi.github.io/rethinking-numpyro/

License: MIT License

Languages: Jupyter Notebook 99.58%, Python 0.41%, CSS 0.01%, Smarty 0.01%

Topics: bayesian-statistics, laplace-approximation, markov-chain-monte-carlo, causal-inference, variational-inference, python, numpy

Contributors

fehiepsi, felipeffm, fmardini, himalayajung, ksachdeva, manuvazquez


rethinking-numpyro's Issues

Code 4.48 returns rank-1 solution for N >= 54

Chapter 4. Geocentric Models
4.4. Linear prediction
Code 4.48 (and Code 4.49 to display it)

Whenever N is greater than or equal to 54, the sampled posterior is a rank-1 matrix (i.e., all rows are identical). This sounds like a numpyro package issue, but I wanted to report it here since I cannot get to the bottom of it. Could you help?

[screenshots of the rank-1 posterior output]
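As an aside, the "rank-1" symptom described above is easy to check programmatically. A small sketch with made-up sample values (rows are posterior draws, columns are parameters):

```python
import numpy as np

# Hypothetical posterior sample matrix illustrating the collapse:
# every draw is identical.
post = np.array([[0.5, 1.2],
                 [0.5, 1.2],
                 [0.5, 1.2]])

# If all rows are the same, the matrix of samples has numerical rank 1.
rank = int(np.linalg.matrix_rank(post))
all_rows_equal = bool((post == post[0]).all())
print(rank, all_rows_equal)  # -> 1 True
```

A healthy posterior sample matrix should have rank equal to the number of parameters (here, 2) and distinct rows.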

Code 4.54 Returns Unexpected Output for me

Hi there,

I was trying to replicate Code 4.54, and I get the following:

[screenshot: observed output]

Where the expected result is:

[screenshot: expected output]

Below is a complete, self-contained minimal example that reproduces the same unexpected output:

[screenshot: minimal reproducible example]

Quadratic approximation results diverge

In Chapter 2, code section 2.6 (2nd Edition), McElreath reports quadratic approximation values of

p  0.67  0.16  0.42  0.92

On my machine (numpyro 0.11.0, jax 0.4.10) I get

p  0.63  0.13  0.43  0.83

Are these numerical differences to be expected? I have no experience with numpyro and not much with quadratic approximation, but I'm surprised by the divergence on this 1-D problem. McElreath's values are also closer to the true proportion.
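For reference, this one-parameter model is simple enough that the quadratic (Laplace) approximation can be computed analytically, without numpyro. A minimal sketch, assuming the book's setup of a flat prior and 6 "water" observations in 9 tosses:

```python
import math

# Globe-tossing data from Chapter 2: W = 6 water in N = 9 tosses.
W, N = 6, 9
L = N - W

# With a flat Uniform(0, 1) prior, the log posterior (up to a constant) is
#   W*log(p) + L*log(1 - p),
# so the mode and the curvature at the mode are available in closed form.
p_map = W / N                                  # posterior mode: 2/3
curvature = W / p_map**2 + L / (1 - p_map)**2  # -d2(log posterior)/dp2 at the mode
sd = math.sqrt(1 / curvature)                  # Laplace-approximation sd

z89 = 1.6  # approximate z-score for an 89% compatibility interval
lo, hi = p_map - z89 * sd, p_map + z89 * sd
print(f"p  {p_map:.2f}  {sd:.2f}  {lo:.2f}  {hi:.2f}")  # p  0.67  0.16  0.42  0.92
```

This reproduces the book's reported values, which suggests the discrepancy above comes from how the approximation is fitted (e.g., a stochastic optimizer that has not converged) rather than from the quadratic approximation itself.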

Update requirements to pin numpyro version

I think it would be useful to pin the numpyro version in requirements.txt. As numpyro is updated, we can check compatibility and bump the pinned version in this file once all the examples run correctly.
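For example, a pinned requirements.txt might look like the following (the version numbers are illustrative, taken from the report above, not a tested recommendation):

```
# requirements.txt -- pin exact versions; bump only after the notebooks
# have been re-run and verified against the new releases.
numpyro==0.11.0
jax==0.4.10
```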

Resolution of exercises

Worked solutions to the exercises would help early learners who are following the book along with this repo.

From the perspective of someone learning Bayesian statistics and numpyro simultaneously, reading the book and following the examples in this repo is a bit challenging, mainly because the documentation for the numpyro methods used in the code examples relies on Bayesian-statistics terms the book has not yet introduced. It seems unlikely that the exercises for the first three chapters could be completed from the code examples alone, without prior knowledge of Bayesian statistics. At least, that has been my personal experience following the book and the repo.

My intention is to compile numpyro solutions to all the exercises in each chapter on GitHub to help other readers.

What do you think? Do you already have solutions in numpyro?

14.31 code

TL;DR: I think a piece of code in the book is hacky in one place and works correctly by sheer coincidence; it is probably worth correcting here, because it took me quite a while to debug (and probably took, or will take, a while for someone else) while following the example notebooks.

Proposed solution:
Turn this:

```python
kl_data = dict(
    N=kl_dyads.shape[0],
    N_households=kl_dyads.hidB.max(),
    did=kl_dyads.did.values - 1,
    hidA=kl_dyads.hidA.values - 1,
    hidB=kl_dyads.hidB.values - 1,
    giftsAB=kl_dyads.giftsAB.values,
    giftsBA=kl_dyads.giftsBA.values,
)
```

Into this:

```python
kl_data = dict(
    N=kl_dyads.shape[0],
    # N_households=kl_dyads.hidB.max(),
    did=kl_dyads.did.values - 1,
    hidA=kl_dyads.hidA.values - 1,
    hidB=kl_dyads.hidB.values - 1,
    giftsAB=kl_dyads.giftsAB.values,
    giftsBA=kl_dyads.giftsBA.values,
)

kl_data["N_households"] = len(set(kl_data["hidA"]) | set(kl_data["hidB"]))
```

Details:
While reproducing the models from the 2022 lectures and the book with a pandas + numpyro + plotly stack, I found a peculiar thing:

In Code 14.31, N_households is computed as kl_dyads.hidB.max(), same as in the book. The hidA and hidB columns in the original data are implied to be 1-indexed, so -1 is applied when kl_data is built, but N_households is computed without the -1.

One must pay attention to this if the count is computed not at init but somewhere downstream instead (as I carelessly did): the model coefficients will look much the same, but the sampling will be poor (very low n_eff), as if the model were not reparameterized. A recipe for a very fun time debugging correlated varying effects :)

The book's code is, I think, a hack, since the proper logic would be to count the unique entries in hidA and hidB (see the proposed solution above).
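A minimal illustration of the off-by-one trap, using made-up 1-indexed household IDs in place of the real dyad data:

```python
# Hypothetical 1-indexed household IDs, mimicking the raw dyad data.
hidA = [1, 1, 2, 3]
hidB = [2, 3, 3, 4]

# Book-style count: correct only by coincidence, because the raw IDs are
# contiguous and 1-indexed, so max(hidB) happens to equal the household count.
n_book = max(hidB)                        # 4

# After re-indexing to 0-based (as kl_data stores the IDs), the same
# expression silently under-counts by one:
hidA0 = [h - 1 for h in hidA]
hidB0 = [h - 1 for h in hidB]
n_wrong = max(hidB0)                      # 3, off by one
n_right = len(set(hidA0) | set(hidB0))    # 4, counts unique households
```

Counting unique IDs is robust to the indexing convention, which is why the proposed fix uses it.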
