Git Product home page Git Product logo

Comments (9)

itsdfish avatar itsdfish commented on June 13, 2024 3

Hi Eric-

I suspect one reason this model has not been added is that the logpdf for the zero inflated Poisson has not been implemented. Here my attempt at a numerically stable implementation:

using Distributions, Turing, StatsFuns
import Distributions: logpdf

struct ZIPoisson{T1,T2} <: DiscreteUnivariateDistribution
    logλ::T1
    w::T2
end

function logpdf(d::ZIPoisson, y::Int)
    LL = 0.0
    if y == 0
        LLs = zeros(typeof(d.logλ), 2)
        LLs[1] = log(d.w)
        LLs[2] = log(1 - d.w) - exp(d.logλ)
        LL = logsumexp(LLs)
    else
        LL = log(1 - d.w) + logpdf(LogPoisson(d.logλ), y)
    end
    return LL
end

This can be added to your script and it should work. Note that ZIPoisson uses log of lambda rather than lambda. That should get you started.

Rob, do you think ZIPoisson should be included in Distributions or Turing?

from sr2turingpluto.jl.

itsdfish avatar itsdfish commented on June 13, 2024 2

Here is a working example for future reference:

using Distributions, Turing, StatsFuns, Random
import Distributions: logpdf, rand

struct ZIPoisson{T1,T2} <: DiscreteUnivariateDistribution
    logλ::T1
    w::T2
end

function logpdf(d::ZIPoisson, y::Int)
    LL = 0.0
    if y == 0
        LLs = zeros(typeof(d.logλ), 2)
        LLs[1] = log(d.w)
        LLs[2] = log(1 - d.w) - exp(d.logλ)
        LL = logsumexp(LLs)
    else
        LL = log(1 - d.w) + logpdf(LogPoisson(d.logλ), y)
    end
    return LL
end

function rand(d::ZIPoisson)
    return rand() <= d.w ? 0 : rand(Poisson(exp(d.logλ)))
end

rand(d::ZIPoisson, N::Int) = map(_->rand(d), 1:N)


inv_logit(x) = 1/(1 + exp(-x))

@model model(data) = begin
    a1 ~ Normal(1, .5)
    logλ ~ Normal(-1.5, 1)
    w = inv_logit(a1)
    data .~ ZIPoisson(logλ, w)
end

Random.seed!(74591)
data = rand(ZIPoisson(-1.5, .70), 1000)

n_samples = 3000
n_adapt = 1500
config = NUTS(n_adapt, .65)
n_chains = 4
chain = sample(model(data), config, MCMCThreads(), n_samples, n_chains, progress=true)

from sr2turingpluto.jl.

torkar avatar torkar commented on June 13, 2024 1

Nice to see that people want to add more models to the rethinking repository and as you suspected Eric, the reason for not having these models was that I didn't get that far due to other obligations. My ultimate goal when starting the translations was to have all models from McElreath's Rethinking translated, and now the 2nd edition has been released :)

Rob, I think that ZI models are quite common (I've used them in my own research so we have some anecdotal evidence at least ;)

From the top of my head, Richard assumes, in the 2nd edition, the following data generative processes,

  • Gaussian
  • Binomial
  • Bernoulli
  • Poisson
  • ZI-Poisson
  • Neg-Binomial
  • Beta-Binomial
  • Categorical
  • Ordered-logit/ordered categorical/ordered logistic) (see polr() in R's MASSpackage)
  • MVNormal
  • Log-Normal

On top of that, it would be interesting to also have adjacent-category and sequential likelihoods (https://journals.sagepub.com/doi/pdf/10.1177/2515245918823199), other flavors of ZI (Beta, Neg-Bin, Bin), 0/1-Beta, and shifted-lognormal, skew-normal, student, weibull.

I have to confess that I haven't checked out the support for the above distributions in Distributions.jl lately... In short, I would like to pick one of the above distributions and simply start using it, just like when I use rethinking, brms, rstanarm, et al.

I think many people would not see it as "relatively easy" to implement things from scratch, after all, if everyone did that we would have variants with bugs sooner or later, so perhaps better offer robust alternatives? Even in the relatively simple example that Chris provides, given that I'm not fluent in Turing, I would have to rely on @trappmartin to check the code and optimize if needed.

from sr2turingpluto.jl.

trappmartin avatar trappmartin commented on June 13, 2024 1

Would there be the possibility to identify those distributions that are commonly used in probabilistic modelling settings but not available at the moment? I and others from the Turing team are not really doing much modelling and therefore we don't really know what is missing from a user perspective. But I think it would be great if we could extend the set of distributions supported by default to a more reasonable collection.

from sr2turingpluto.jl.

goedman avatar goedman commented on June 13, 2024

Hi Chris, thanks for stepping in. Let me play around a bit and compare the results between your model and below Stan language model. I am clearly not the right person to answer your question but definitely like the idea to add this example to the collection of models (both Stan and Turing) as your example shows how "relatively easy" it is to extend the set of available distributions. Does that make sense? ( @torkar @trappmartin )

Hi Eric,

It has never really been my intention to have all models in the ([Stan|Turin|DynamicHMC]Models repositories. But it certainly helps if folks indicate which models are difficult. Thank you.

data{
    int y[365];
}
parameters{
    real ap;
    real al;
}
model{
    real p;
    real lambda;
    al ~ normal( 1 , 0.5 );
    ap ~ normal( -1.5 , 1 );
    lambda = al;
    lambda = exp(lambda);
    p = ap;
    p = inv_logit(p);
    for ( i in 1:365 ) {
        if ( y[i]==0 )
            target += log_mix( p , 0 , poisson_lpmf(0|lambda) );
        if ( y[i] > 0 )
            target += log1m( p ) + poisson_lpmf(y[i] | lambda );
    }
}

from sr2turingpluto.jl.

itsdfish avatar itsdfish commented on June 13, 2024

As far as I can tell, most of the distributions mentioned so far appear to be implemented in Distributions or Turing. Although Turing has an ordered-logistic distribution, I'm not sure whether it covers all of the adjacent category models @torkar referenced. ZI-Poisson is the missing distribution among the ones mentioned so far. Aside from that, the only other distribution I can think of at the moment is the Wiener process, which is used in many fields.

from sr2turingpluto.jl.

goedman avatar goedman commented on June 13, 2024

Thanks Chris, I'll add it for now as m12.3bt.jl to StatisticalRethinkingTuring.

Given Richard (T)'s very useful list I think the answer to your original question could be a separate repo in JuliaStats, e.g. AdditionalDistributions.jl, which can hold new distributions and/or flavors until shown robust and of general interest. Maybe some of the distributions in distributions.jl in Turing could also go there to test those in more settings.

I haven't tried it yet , but wonder if the ZIPoisson() definition as you provided also is sufficient for Turing's MAP(), Prior() and predict()?

from sr2turingpluto.jl.

emfeltham avatar emfeltham commented on June 13, 2024

Hello: I was revisiting this recently, and noticed that the model does not seem to replicate the book result. I think that the priors were switched around. However, making a minor change to Chris' code corrects this (RCall with the appropriate seed (seed 365, on page 390) to generate the same data).

# this replicates the book result
@model ppl12_3bt(x) = begin
    α_λ ~ Normal(1, 0.5) # Normal(-1.5, 1)
    α_p ~ Normal(-1.5, 1)

    logλ = α_λ
    logitp = α_p
    p = inv_logit(logitp)

    for i ∈ 1:length(y)
        y[i] ~ ZIPoisson(logλ, p)
    end
end

Relatedly, I've been writing up the functions to make ZIPoisson more complete -- e.g. adding a cdf, quantile, function. ZIPoisson should also work with Turing's predict() function now. Do you think that this should be submitted to Distributions.jl, with appropriate credit given to Chris?

Eric

from sr2turingpluto.jl.

goedman avatar goedman commented on June 13, 2024

It would be great to have distributions like ZIPoisson available, even if they might initially not be as robust and complete as the other distributions in Distributions.jl. Maybe my preference would still be to add it to Distributions.jl, but if that turns out to be too complicated/time consuming, create a new repo AdditionalDistributions.jl in either Turing or StatisticalRethinkingJulia as a holding place until implementations of such new distributions have been tested and are considered mature enough.

from sr2turingpluto.jl.

Related Issues (15)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.