Comments (9)
Hi Eric-
I suspect one reason this model has not been added is that the logpdf
for the zero inflated Poisson has not been implemented. Here my attempt at a numerically stable implementation:
using Distributions, Turing, StatsFuns
import Distributions: logpdf
struct ZIPoisson{T1,T2} <: DiscreteUnivariateDistribution
logλ::T1
w::T2
end
function logpdf(d::ZIPoisson, y::Int)
LL = 0.0
if y == 0
LLs = zeros(typeof(d.logλ), 2)
LLs[1] = log(d.w)
LLs[2] = log(1 - d.w) - exp(d.logλ)
LL = logsumexp(LLs)
else
LL = log(1 - d.w) + logpdf(LogPoisson(d.logλ), y)
end
return LL
end
This can be added to your script and it should work. Note that ZIPoisson
uses log of lambda rather than lambda. That should get you started.
Rob, do you think ZIPoisson
should be included in Distributions or Turing?
from sr2turingpluto.jl.
Here is a working example for future reference:
using Distributions, Turing, StatsFuns, Random
import Distributions: logpdf, rand
struct ZIPoisson{T1,T2} <: DiscreteUnivariateDistribution
logλ::T1
w::T2
end
function logpdf(d::ZIPoisson, y::Int)
LL = 0.0
if y == 0
LLs = zeros(typeof(d.logλ), 2)
LLs[1] = log(d.w)
LLs[2] = log(1 - d.w) - exp(d.logλ)
LL = logsumexp(LLs)
else
LL = log(1 - d.w) + logpdf(LogPoisson(d.logλ), y)
end
return LL
end
function rand(d::ZIPoisson)
return rand() <= d.w ? 0 : rand(Poisson(exp(d.logλ)))
end
rand(d::ZIPoisson, N::Int) = map(_->rand(d), 1:N)
inv_logit(x) = 1/(1 + exp(-x))
@model model(data) = begin
a1 ~ Normal(1, .5)
logλ ~ Normal(-1.5, 1)
w = inv_logit(a1)
data .~ ZIPoisson(logλ, w)
end
Random.seed!(74591)
data = rand(ZIPoisson(-1.5, .70), 1000)
n_samples = 3000
n_adapt = 1500
config = NUTS(n_adapt, .65)
n_chains = 4
chain = sample(model(data), config, MCMCThreads(), n_samples, n_chains, progress=true)
from sr2turingpluto.jl.
Nice to see that people want to add more models to the rethinking repository and as you suspected Eric, the reason for not having these models was that I didn't get that far due to other obligations. My ultimate goal when starting the translations was to have all models from McElreath's Rethinking translated, and now the 2nd edition has been released :)
Rob, I think that ZI models are quite common (I've used them in my own research so we have some anecdotal evidence at least ;)
From the top of my head, Richard assumes, in the 2nd edition, the following data generative processes,
- Gaussian
- Binomial
- Bernoulli
- Poisson
- ZI-Poisson
- Neg-Binomial
- Beta-Binomial
- Categorical
- Ordered-logit/ordered categorical/ordered logistic) (see
polr()
in R'sMASS
package) - MVNormal
- Log-Normal
On top of that, it would be interesting to also have adjacent-category and sequential likelihoods (https://journals.sagepub.com/doi/pdf/10.1177/2515245918823199), other flavors of ZI (Beta, Neg-Bin, Bin), 0/1-Beta, and shifted-lognormal, skew-normal, student, weibull.
I have to confess that I haven't checked out the support for the above distributions in Distributions.jl
lately... In short, I would like to pick one of the above distributions and simply start using it, just like when I use rethinking
, brms
, rstanarm
, et al.
I think many people would not see it as "relatively easy" to implement things from scratch, after all, if everyone did that we would have variants with bugs sooner or later, so perhaps better offer robust alternatives? Even in the relatively simple example that Chris provides, given that I'm not fluent in Turing, I would have to rely on @trappmartin to check the code and optimize if needed.
from sr2turingpluto.jl.
Would there be the possibility to identify those distributions that are commonly used in probabilistic modelling settings but not available at the moment? I and others from the Turing team are not really doing much modelling and therefore we don't really know what is missing from a user perspective. But I think it would be great if we could extend the set of distributions supported by default to a more reasonable collection.
from sr2turingpluto.jl.
Hi Chris, thanks for stepping in. Let me play around a bit and compare the results between your model and below Stan language model. I am clearly not the right person to answer your question but definitely like the idea to add this example to the collection of models (both Stan and Turing) as your example shows how "relatively easy" it is to extend the set of available distributions. Does that make sense? ( @torkar @trappmartin )
Hi Eric,
It has never really been my intention to have all models in the ([Stan|Turin|DynamicHMC]Models repositories. But it certainly helps if folks indicate which models are difficult. Thank you.
data{
int y[365];
}
parameters{
real ap;
real al;
}
model{
real p;
real lambda;
al ~ normal( 1 , 0.5 );
ap ~ normal( -1.5 , 1 );
lambda = al;
lambda = exp(lambda);
p = ap;
p = inv_logit(p);
for ( i in 1:365 ) {
if ( y[i]==0 )
target += log_mix( p , 0 , poisson_lpmf(0|lambda) );
if ( y[i] > 0 )
target += log1m( p ) + poisson_lpmf(y[i] | lambda );
}
}
from sr2turingpluto.jl.
As far as I can tell, most of the distributions mentioned so far appear to be implemented in Distributions or Turing. Although Turing has an ordered-logistic distribution, I'm not sure whether it covers all of the adjacent category models @torkar referenced. ZI-Poisson is the missing distribution among the ones mentioned so far. Aside from that, the only other distribution I can think of at the moment is the Wiener process, which is used in many fields.
from sr2turingpluto.jl.
Thanks Chris, I'll add it for now as m12.3bt.jl to StatisticalRethinkingTuring.
Given Richard (T)'s very useful list I think the answer to your original question could be a separate repo in JuliaStats, e.g. AdditionalDistributions.jl, which can hold new distributions and/or flavors until shown robust and of general interest. Maybe some of the distributions in distributions.jl in Turing could also go there to test those in more settings.
I haven't tried it yet , but wonder if the ZIPoisson() definition as you provided also is sufficient for Turing's MAP(), Prior() and predict()?
from sr2turingpluto.jl.
Hello: I was revisiting this recently, and noticed that the model does not seem to replicate the book result. I think that the priors were switched around. However, making a minor change to Chris' code corrects this (RCall with the appropriate seed (seed 365, on page 390) to generate the same data).
# this replicates the book result
@model ppl12_3bt(x) = begin
α_λ ~ Normal(1, 0.5) # Normal(-1.5, 1)
α_p ~ Normal(-1.5, 1)
logλ = α_λ
logitp = α_p
p = inv_logit(logitp)
for i ∈ 1:length(y)
y[i] ~ ZIPoisson(logλ, p)
end
end
Relatedly, I've been writing up the functions to make ZIPoisson more complete -- e.g. adding a cdf, quantile, function. ZIPoisson should also work with Turing's predict() function now. Do you think that this should be submitted to Distributions.jl, with appropriate credit given to Chris?
Eric
from sr2turingpluto.jl.
It would be great to have distributions like ZIPoisson available, even if they might initially not be as robust and complete as the other distributions in Distributions.jl. Maybe my preference would still be to add it to Distributions.jl, but if that turns out to be too complicated/time consuming, create a new repo AdditionalDistributions.jl
in either Turing or StatisticalRethinkingJulia as a holding place until implementations of such new distributions have been tested and are considered mature enough.
from sr2turingpluto.jl.
Related Issues (15)
- question about quap function HOT 3
- For univariate distributions, calling pdf on an array of values is deprecated HOT 2
- `logpdf` to broadcasted `.logpdf` HOT 2
- use of quap() HOT 3
- Could not import Turing.ModeResult into StatisticalRethinking HOT 11
- Installation Error - Hardcoded Path in Manifest HOT 4
- A potential erratum from Chapter 2? HOT 3
- Instantiate Error HOT 3
- instantiate command results in type error, preventing dependency installation HOT 10
- `show` of MCMC.Chain summary statistics in clip-03-02-05t fails HOT 3
- @formula not defined in Clip-00-04-05t.jl HOT 7
- Strange convergency of model m7.1 HOT 15
- Is this a right place to put turing-dependent utility functions? HOT 8
- Models with multivariate normal priors? HOT 9
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sr2turingpluto.jl.