
DynamicHMC.jl's Introduction

DynamicHMC

Implementation of robust dynamic Hamiltonian Monte Carlo methods in Julia.

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Overview

This package implements a modern version of the “No-U-turn sampler” in the Julia language, mostly as described in Betancourt (2017), with some tweaks.

In contrast to frameworks which utilize a directed acyclic graph to build a posterior for a Bayesian model from small components, this package requires that you code a log-density function of the posterior in Julia. Derivatives can be provided manually, or using automatic differentiation.

Consequently, this package requires that the user is comfortable with the basics of the theory of Bayesian inference, to the extent of coding a (log) posterior density in Julia. This approach allows the use of standard tools like profiling and benchmarking to optimize the performance of the posterior calculation.

The building blocks of the algorithm are implemented using a functional (non-modifying) approach whenever possible, which allows extensive unit testing of the components. At the same time, the code is intended to serve as a transparent, pedagogical introduction to the low-level mechanics of current Hamiltonian Monte Carlo samplers, and as a platform for research into MCMC methods.

Please start with the documentation.

Examples
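
To give a flavor of the workflow described above, here is a minimal sketch using the 2.x-era API that appears in the issues below (exact function names may differ across versions, and ForwardDiff may need to be installed for the AD backend):

```julia
using DynamicHMC, LogDensityProblems, TransformVariables, Random
import ForwardDiff  # AD backend; may need to be loaded explicitly depending on versions

# Hand-coded log posterior density: a standard normal in one dimension.
struct StdNormalPosterior end
(::StdNormalPosterior)(θ) = -0.5 * abs2(θ.x)

t = as((x = asℝ,))                        # identity transformation to ℝ
P = TransformedLogDensity(t, StdNormalPosterior())
∇P = ADgradient(:ForwardDiff, P)          # gradient via automatic differentiation
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 1000)
posterior = transform.(t, results.chain)  # Vector of NamedTuples like (x = …,)
```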

Support and participation

For general questions, open an issue or ask on the Discourse forum. I am happy to help with models.

Users who rely on this package and want to participate in discussions are encouraged to subscribe to GitHub notifications by “watching” the repository. Also, I will do my best to accommodate feature requests; just open an issue.

Bibliography

Betancourt, M. J., Byrne, S., & Girolami, M. (2014). Optimizing the integrator step size for Hamiltonian Monte Carlo. arXiv preprint arXiv:1411.6669.

Betancourt, M. (2016). Diagnosing suboptimal cotangent disintegrations in Hamiltonian Monte Carlo. arXiv preprint arXiv:1604.00695.

Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434.

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. CRC Press.

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

Hoffman, M. D., & Gelman, A. (2014). The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

McElreath, R. (2018). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

DynamicHMC.jl's People

Contributors

delehef, devmotion, dilumaluthge, github-actions[bot], juliatagbot, mortenpi, pitmonticone, st--, tpapp, y1my1

DynamicHMC.jl's Issues

clarify interface compatibility wrt semver

  • No deprecations: the API just changes with major versions; minor versions make upward-compatible changes.

  • Encourage users to put a [compat] section in their Project.toml.

  • What is part of the API? Anything exported from DynamicHMC, recursively. So at the moment, Diagnostics isn't, strictly speaking, part of the API one can rely on; make this clear.

Issue with Gaussian process model

This is probably an issue with ForwardDiff and how I'm writing my code, but on the off chance...

I am running into an issue where the kernels of the GPs I try to estimate always trigger an error when it comes time to sample with the NUTS function. Specifically, the Cholesky decomposition of the covariance matrix fails during sampling, sometimes with the error that the matrix is not positive definite. I have tested the same kernel-construction functions outside of DHMC/ForwardDiff, and they work fine. I thought this might be an issue others have run into.

I've tried to provide a MWE below:

```julia
using Parameters, DynamicHMC, ContinuousTransformations, DiffWrappers, MCMCDiagnostics, CSV, DataFrames, Distributions

function sq_distance(X)
    N = size(X, 1)
    dist_matrix = Array{eltype(X)}(N, N)
    
    for j in 1:N
        for i in 1:N
            dist_matrix[i, j] = (X[i] - X[j])^2
        end
    end
    
    return dist_matrix
end

struct GP
    X::Array{Float64, 1}
end

function (m::GP)(theta)
    @unpack X = m
    N = size(X, 1)
    llpost = 0.0
    signal, = theta[1]
    length, = theta[2]
    time_hat = theta[3]
    
    for i in 1:3
        all(isfinite.(theta[i])) || return -Inf
    end
    
    dist_all = sq_distance(time_hat)
    K = (signal^2) * exp.(-0.5 * dist_all/(length^2)) + 0.000001*I
    
    llpost += logpdf(MvNormal(zeros(size(K, 1)), K), X)
    
    return llpost
end

dat1 = randn(30)
gp_model = GP(dat1)

tt = TransformationTuple(bridge(ℝ, ℝ⁺), bridge(ℝ, ℝ⁺), ArrayTransformation(bridge(ℝ, ℝ), size(dat1, 1)))
m_trans = TransformLogLikelihood(gp_model, tt)
m_trans_g = ForwardGradientWrapper(m_trans, randn(length(m_trans)))
m1 = m_trans(randn(length(m_trans)))
m2 = m_trans_g(zeros(length(m_trans_g)))

sample, NUTS_tuned = NUTS_init_tune_mcmc(m_trans_g, length(m_trans_g), 5000)

```

It always fails when it tries to apply chol() to the covariance matrix during sampling, although it works fine when I am just handling the transformed loglikelihood function by itself.
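
For reference, a common generic mitigation (not specific to this package, and not guaranteed to fix the underlying problem) is to symmetrize the kernel matrix and add diagonal jitter before factorizing; in modern Julia spelling, with K the kernel matrix from the MWE above:

```julia
using LinearAlgebra

# Tiny floating-point asymmetries (e.g. under ForwardDiff) can make the
# positive-definiteness check fail; wrapping in Symmetric and adding jitter
# often stabilizes the factorization.
K_stable = Symmetric(K + 1e-6 * I)
F = cholesky(K_stable)
```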

Registrator release

The current version has capped TransformVariables at 0.2 because of how the conversion script from METADATA works, so it would be great to have a new release that allows TransformVariables >= 0.3.

warmup and diagnostics API

This is what is left over from #30 after #35 is merged.

  • document warmup api and results and how to use them for diagnostics
  • run whole trees from a point
  • examples with difficult posteriors

Tag?

Do you think this library is ready? We merged the DiffEqBayes.jl code. I will make it an optional dependency for now via Requires, but if you think this is ready then please tag. Thanks!

Fix rand_phasepoint for non-Euclidean KE

rand_phasepoint should pass the point q to rand, i.e.

rand_phasepoint(rng::AbstractRNG, H, q) = phasepoint_in(H, q, rand(rng, H.κ))

should be

rand_phasepoint(rng::AbstractRNG, H, q) = phasepoint_in(H, q, rand(rng, H.κ, q))

This is consistent with the existing interface for rand used in

rand(rng::AbstractRNG, κ::GaussianKE, q = nothing) = κ.W * randn(rng, size(κ.W, 1))

and enables the user to use rand_phasepoint with user-defined non-Euclidean kinetic energies.
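
For illustration, a hypothetical position-dependent kinetic energy that the proposed signature would enable might look like this (RiemannianKE and its field are made-up names, not part of DynamicHMC):

```julia
using LinearAlgebra, Random

struct RiemannianKE{F}
    metric::F   # q -> positive definite matrix M(q)
end

# With q in the signature, the momentum draw p ~ N(0, M(q)) can depend on the position:
Random.rand(rng::Random.AbstractRNG, κ::RiemannianKE, q) =
    cholesky(Symmetric(κ.metric(q))).L * randn(rng, length(q))
```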

What's missing for sampling on product manifolds?

It would be great if this package had the flexibility to run HMC on product manifolds. See for example this paper (by @simonbyrne, who I see filed an issue here), which presents an approach for sampling on embedded manifolds using the geodesic flow on the manifold (see Algorithm 1). The paper also notes that the geodesic flow on a product manifold splits into the flows on the submanifolds, so that each step of the integration can be done in parallel. This would be very useful for applying HMC to sample geometric quantities. For example, I'm interested in sampling orientations of rigid bodies (SE(3), implemented as S^3 x R^3). With n Euclidean variables and m rigid bodies, the variable space is the product manifold R^n x (S^3 x R^3)^m = R^(n+3m) x (S^3)^m. The Euclidean components can all be updated together, while each spherical component must be treated as a separate unit with its own kinetic energy, leapfrog integrator, and integrator step size (equivalent to the scale of the spherical metric; alternatively, a scaling factor for the step size).

The machinery to do this might be too specialized for this package, but I'm wondering whether you think the building blocks here are sufficiently general to enable this kind of manifold sampling. At first glance, it seems that a reimplementation of much of hamiltonian.jl might be all that's necessary.

Question: making LBA compatible with new interface

Hi Tamas-

I was wondering if you might help me update this code for the new interface. While updating my other models was simple, this one seems to be a little trickier because I had to make some changes to the adaptation parameters to make it work. Here is how it used to work.

I tried several potential solutions, including initialization = (q = zeros(n), κ = GaussianKineticEnergy(5, 0.1)), but to no avail. Any guidance would be much appreciated. Thanks!

using Distributions, Parameters, DynamicHMC, LogDensityProblems, TransformVariables
using Random
import Distributions: pdf,logpdf,rand
export LBA,pdf,logpdf,rand

mutable struct LBA{T1,T2,T3,T4} <: ContinuousUnivariateDistribution
    ν::T1
    A::T2
    k::T3
    τ::T4
    σ::Float64
end

Base.broadcastable(x::LBA)=Ref(x)

LBA(;τ,A,k,ν,σ=1.0) = LBA(ν,A,k,τ,σ)

function selectWinner(dt)
    if any(x->x >0,dt)
        mi,mv = 0,Inf
        for (i,t) in enumerate(dt)
            if (t > 0) && (t < mv)
                mi = i
                mv = t
            end
        end
    else
        return 1,-1.0
    end
    return mi,mv
end

function sampleDriftRates(ν,σ)
    noPositive=true
    v = similar(ν)
    while noPositive
        v = [rand(Normal(d,σ)) for d in ν]
        any(x->x>0,v) ? noPositive=false : nothing
    end
    return v
end

function rand(d::LBA)
    @unpack τ,A,k,ν,σ = d
    b=A+k
    N = length(ν)
    v = sampleDriftRates(ν,σ)
    a = rand(Uniform(0,A),N)
    dt = @. (b-a)/v
    choice,mn = selectWinner(dt)
    rt = τ .+ mn
    return choice,rt
end

function rand(d::LBA,N::Int)
    choice = fill(0,N)
    rt = fill(0.0,N)
    for i in 1:N
        choice[i],rt[i]=rand(d)
    end
    return (choice=choice,rt=rt)
end

logpdf(d::LBA,choice,rt) = log(pdf(d,choice,rt))

function logpdf(d::LBA,data::T) where {T<:NamedTuple}
    return sum(logpdf.(d,data...))
end

function logpdf(dist::LBA,data::Array{<:Tuple,1})
    LL = 0.0
    for d in data
        LL += logpdf(dist,d...)
    end
    return LL
end

function pdf(d::LBA,c,rt)
    @unpack τ,A,k,ν,σ = d
    b=A+k; den = 1.0
    rt < τ ? (return 1e-10) : nothing
    for (i,v) in enumerate(ν)
        if c == i
            den *= dens(d,v,rt)
        else
            den *= (1-cumulative(d,v,rt))
        end
    end
    pneg = pnegative(d)
    den = den/(1-pneg)
    den = max(den,1e-10)
    isnan(den) ? (return 0.0) : (return den)
end

logpdf(d::LBA,data::Tuple) = logpdf(d,data...)

function dens(d::LBA,v,rt)
    @unpack τ,A,k,ν,σ = d
    dt = rt-τ; b=A+k
    n1 = (b-A-dt*v)/(dt*σ)
    n2 = (b-dt*v)/(dt*σ)
    dens = (1/A)*(-v*cdf(Normal(0,1),n1) + σ*pdf(Normal(0,1),n1) +
        v*cdf(Normal(0,1),n2) - σ*pdf(Normal(0,1),n2))
    return dens
end

function cumulative(d::LBA,v,rt)
    @unpack τ,A,k,ν,σ = d
    dt = rt-τ; b=A+k
    n1 = (b-A-dt*v)/(dt*σ)
    n2 = (b-dt*v)/(dt*σ)
    cm = 1 + ((b-A-dt*v)/A)*cdf(Normal(0,1),n1) -
        ((b-dt*v)/A)*cdf(Normal(0,1),n2) + ((dt*σ)/A)*pdf(Normal(0,1),n1) -
        ((dt*σ)/A)*pdf(Normal(0,1),n2)
    return cm
end

function pnegative(d::LBA)
    @unpack ν,σ=d
    p=1.0
    for v in ν
        p*= cdf(Normal(0,1),-v/σ)
    end
    return p
end

struct LBAProb{T}
    data::T
    N::Int
    Nc::Int
end

function (problem::LBAProb)(θ)
    @unpack data=problem
    @unpack v,A,k,tau=θ
    d=LBA(ν=v,A=A,k=k,τ=tau)
    minRT = minimum(x->x[2],data)
    logpdf(d,data)+sum(logpdf.(TruncatedNormal(0,3,0,Inf),v)) +
    logpdf(TruncatedNormal(.8,.4,0,Inf),A)+logpdf(TruncatedNormal(.2,.3,0,Inf),k)+
    logpdf(TruncatedNormal(.4,.1,0,minRT),tau)
end

function sampleDHMC(choice,rt,N,Nc,nsamples)
    data = [(c,r) for (c,r) in zip(choice,rt)]
    return sampleDHMC(data,N,Nc,nsamples)
end

# Define problem with data and inits.
function sampleDHMC(data,N,Nc,nsamples)
    p = LBAProb(data,N,Nc)
    p((v=fill(.5,Nc),A=.8,k=.2,tau=.4))
    # Write a function to return properly dimensioned transformation.
    trans = as((v=as(Array,asℝ₊,Nc),A=asℝ₊,k=asℝ₊,tau=asℝ₊))
    # Use ForwardDiff for the gradient.
    P = TransformedLogDensity(trans, p)
    ∇P = ADgradient(:ForwardDiff, P)
    # Sample from the posterior.
    n = dimension(trans)
    results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, nsamples;
        q = zeros(n), p = ones(n),reporter = NoProgressReport())
    # Undo the transformation to obtain the posterior from the chain.
    posterior = transform.(trans, results.chain)
    chns = nptochain(results,posterior)
    return chns
end

function simulateLBA(;Nd,v=[1.0,1.5,2.0],A=.8,k=.2,tau=.4,kwargs...)
    return (rand(LBA(ν=v,A=A,k=k,τ=tau),Nd)...,N=Nd,Nc=length(v))
end

data = simulateLBA(Nd=10)

samples = sampleDHMC(data...,2000)

RFC: API reorganization

Motivation

DynamicHMC.jl was started in June 2017, and initially released in February 2018. Since then, the architecture and the API have only undergone minor changes. However, various use cases are stretching the API a bit, and it is time for a redesign and a rewrite of some internals. I am opening this issue to discuss these — feel free to use it as a wishlist, or to share your recommendations or use cases.

I intend to keep the focus of the package the same, i.e. as a building block for Bayesian inference using variants of the NUTS sampler. The user is still expected to provide a log density (with gradient).

I briefly discuss the changes I am proposing below.

Low-level implementation changes

These would be mostly invisible to the user.

Non-allocating leapfrog calculations

The most important one is probably reducing allocations by reusing the vectors for position and momentum. This has a tiny impact (for nontrivial models, calculating the log density is the most costly part), but it is low-hanging fruit and has some significance for high-dimensional models. I am undecided about this though, as I would have to trust the log density calculations not to change the position vector.

Ideas: perhaps make it optional, and allow SVector transparently?
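
For concreteness, this is the kind of in-place update it would mean (an illustration assuming a unit mass matrix, not the package's actual code):

```julia
# Reuses the q and p buffers instead of allocating new vectors each step.
function leapfrog!(q, p, ∇logπ, ϵ)
    p .+= (ϵ / 2) .* ∇logπ(q)   # half step for the momentum
    q .+= ϵ .* p                # full step for the position
    p .+= (ϵ / 2) .* ∇logπ(q)   # half step for the momentum
    q, p
end
```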

EDIT: I abandoned this, because keeping the functional design (which is now generalized) makes it much easier for me to use multithreading in Julia 1.3.

Mid-level API

Flexible NUTS step implementation

The NUTS sampling step, currently implemented by DynamicHMC.NUTS_transition, reports the reason for termination, the new drawn position, and the average acceptance rate. However, obtaining the whole trajectory with probabilities could be useful, e.g. for debugging issues like SciML/DiffEqBayes.jl#60 and also for pedagogical purposes (e.g. visualizing HMC trajectories).

The interface should allow users to experiment with different step sizes (also jitter), momentum, kinetic energy, and max_depth specifications, and to debug these. E.g. if users learn that most steps terminate because of divergence, they should be able to

  1. plot example trajectories from a given point,
  2. experiment with various step sizes and kinetic energies.

Allow jittered stepsize ϵ

A core stepsize would be adapted as usual, while a random jitter factor is applied on top of it.

Interface for iterative application

Sometimes a more granular interface would be useful for tuning and adaptation. In #28 we arrived at the interface

sample, new_state = mcmc_step(stage, nuts::NUTS, state)

for performing NUTS steps, with the idea that state could be something that is tuned (eg stepsize ϵ) in stage. Currently the API only exposes doing this for a pre-determined number of steps (see below).
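
A hypothetical driver for this interface could look like the following (mcmc_step, stage, and state are the names proposed in #28, not a released API):

```julia
# Draw N samples one step at a time, threading the tunable state through.
function draw_samples(stage, nuts, state, N)
    samples = []
    for _ in 1:N
        sample, state = mcmc_step(stage, nuts, state)
        push!(samples, sample)
    end
    samples, state
end
```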

High-level API

Logging

A new de-facto standard seems to be emerging for progress meters via @logmsg, e.g. see Atom.jl and timholy/ProgressMeter.jl#102. Progress reports should use this. Cf #10.

Interface for initialization and adaptation

I envision each adaptation step as a transformation from the previous parameters of the algorithm (stepsize, kinetic energy) to new ones, using random realizations of MCMC draws, i.e.

sampler parameters, Hamiltonian ====================> new sampler parameters
                                  NUTS realizations

The user could be interested in

  1. the whole history of adaptation (currently possible by invoking steps manually),

  2. the posterior and adapted sampler (what is now returned by NUTS_init_tune_mcmc),

  3. just the posterior.

Targeting (2) as the default interface may have been a mistake, as mostly I am interested in (3) when things go well, and (1) when they don't (cf #24, #9). Also, when samplers have to be parametrized manually, it would be useful to experiment with various initialization and adaptation strategies, e.g.

  1. picking the initial position by a crude or sophisticated maximization algorithm (addressing #8, #25),

  2. less or more aggressive adaptation of stepsize.

The proposed interface is the following: the user provides

  1. a chain of adaptation steps, eg as a Tuple,

  2. a parameter that specifies how much history should be kept.

Each is applied to the previous state (initialized using nothing), with the target log density and the current parameters as given, and returns a new set of parameters and an adaptation history (when required). The high-level interface can then pick what to keep and return.
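
A sketch of the proposed driver, with illustrative names (apply_stage is hypothetical):

```julia
# Fold the adaptation stages over (parameters, state); the state starts from
# `nothing` as described above.
function run_warmup(stages, ∇P, parameters)
    state, history = nothing, []
    for stage in stages
        parameters, state, h = apply_stage(stage, ∇P, parameters, state)
        push!(history, h)  # kept or discarded per the history-keeping parameter
    end
    parameters, history
end
```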

Bound LogDensityProblems

The most recent version of LogDensityProblems is out, but DynamicHMC now fails on using DynamicHMC because it cannot find ValueGradient.

Is it possible to define initial values for the parameter NamedTuple?

Hi Tamas,

The very last example in StatisticalRethinkingJulia/DynamicHMCModels.jl never gave problems in DynamicHMC up to v1.0.6 (for reference, the old version used with DynamicHMC v1.0.6 is in m12.6d.jl ) .

In pre-2.0 though, it seems to have difficulty finding a reasonable stepsize (i.e. typically it sets ϵ < 1e-12, often way smaller, and never recovers). The updated version of m12.6d.jl is included below in the self-contained example section. Sometimes Optim succeeds, but often it can't find an optimum (if no fixed seed, e.g. Random.seed!(...), is used).

I've looked at the new initialization for q and ϵ and tried a few settings for ϵ, without luck. I tried initialization = (q=(...), ), but that is not correct. I did see a remark that both data and initial values can be included in the model function, but I haven't figured out how to do that (I wanted to use the initial values from DynamicHMC v1.0.6). That remark was in a comment and might be a left-over from the v1.0.6 examples.
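
For what it's worth, a sketch of one possible approach (reusing names from the example below; whether this helps with the stepsize problem is a separate question): map a parameter NamedTuple to the unconstrained vector with TransformVariables.inverse, and pass that as q.

```julia
θ0 = (β = [1.0, 0.25], α = zeros(N_societies), σ = 0.2)  # illustrative initial values
q0 = inverse(make_transformation(model), θ0)             # unconstrained vector
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 1000;
                           initialization = (q = q0,))
```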

The log density function in both versions is identical.

The data for this example is in:

Kline.txt

Please make sure you are using the latest tagged versions and the last stable release of Julia before proceeding. Support for other versions is very limited.

This is using DynamicHMC#master as of today (8/27/2019) on Julia v1.4 or v1.3-RC1.

Self-contained example that demonstrates the problem

using DynamicHMC, LogDensityProblems, TransformVariables, Flux, Random
using Distributions, Parameters, CSV, DataFrames, LinearAlgebra

Random.seed!(012451)

# Read attached Kline.txt file
df = DataFrame(CSV.read(joinpath(@__DIR__, "Kline.txt"), delim=';'));
# Add col logpop, set log() for population data
df.logpop = map((x) -> log(x), df.population);
# Add id for societies
df.society = 1:10;

Base.@kwdef mutable struct KlineModel{Ty <: AbstractVector,
  Tx <: AbstractMatrix, Ts <: AbstractVector}
    "Observations (total_tools)."
    y::Ty
    "Covariates (logpop)"
    x::Tx
    "Society"
    s::Ts
    "Number of observations (10)"
    N::Int
    "Number of societies (also 10)"
    N_societies::Int
end

function make_transformation(model::KlineModel)
    as( (β = as(Array, size(model.x, 2)), α = as(Array, model.N_societies), σ = asℝ₊) )
end

# Instantiate the model with data and inits.

N = size(df, 1)
N_societies = length(unique(df.society))
x = hcat(ones(Int64, N), df.logpop);
s = df.society
y = df.total_tools
model = KlineModel(; y=y, x=x, s=s, N=N, N_societies=N_societies)

# Make the type callable with the parameters *as a single argument*.

function (model::KlineModel)(θ)
    @unpack y, x, s, N, N_societies = model   # data
    @unpack β, α, σ = θ  # parameters
    ll = 0.0
    ll += logpdf(Cauchy(0, 1), σ)
    ll += sum(logpdf.(Normal(0, σ), α)) # α[1:10]
    ll += logpdf.(Normal(0, 10), β[1]) # a
    ll += logpdf.(Normal(0, 1), β[2]) # a
    ll += sum(
      [loglikelihood(Poisson(exp(α[s[i]] + dot(x[i, :], β))), [y[i]]) for i in 1:N]
    )
    ll
end

println()
θ = (β = [1.0, 0.25], α = rand(Normal(0, 1), N_societies), σ = 0.2)
model(θ) |> display
println()

# Wrap the problem with a transformation, then use Flux for the gradient.

P = TransformedLogDensity(make_transformation(model), model)
∇P = ADgradient(:Flux, P);

# Can we initialize? E.g.:

results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 1000;
#  initialization = (ϵ = 0.001, ),
#  warmup_stages = fixed_stepsize_warmup_stages()
)
posterior = P.transformation.(results.chain)

println()
DynamicHMC.Diagnostics.EBFMI(results.tree_statistics) |> display

println()
DynamicHMC.Diagnostics.summarize_tree_statistics(results.tree_statistics) |> display
println()

posterior_β = mean(posterior[i].β for i in 1:length(posterior))
posterior_α = mean(posterior[i].α for i in 1:length(posterior))
posterior_σ = mean(posterior[i].σ for i in 1:length(posterior))

println()
posterior_β |> display
println()
posterior_α  |> display
println()
posterior_σ |> display
println()

Output, expected outcome, comparison to other samplers

stan_result = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
                            Mean                SD               Naive SE             MCSE            ESS    
            a          1.076167468  0.7704872560 0.01218247319 0.0210530022 1000.000000
           bp         0.263056273  0.0823415805 0.00130193470 0.0022645077 1000.000000
  a_society.1   -0.191723568  0.2421382537 0.00382854195 0.0060563054 1000.000000
  a_society.2    0.054569029  0.2278506876 0.00360263570 0.0051693148 1000.000000
  a_society.3   -0.035935050  0.1926364647 0.00304584994 0.0039948433 1000.000000
  a_society.4    0.334355037  0.1929971201 0.00305155241 0.0063871707  913.029080
  a_society.5    0.049747513  0.1801287716 0.00284808595 0.0043631095 1000.000000
  a_society.6   -0.311903245  0.2096126337 0.00331426674 0.0053000536 1000.000000
  a_society.7    0.148637507  0.1744680594 0.00275858223 0.0047660246 1000.000000
  a_society.8   -0.164567976  0.1821341074 0.00287979309 0.0034297298 1000.000000
  a_society.9    0.277066965  0.1758237250 0.00278001719 0.0055844175  991.286501
 a_society.10   -0.094149204  0.2846206232 0.00450024719 0.0080735022 1000.000000
sigma_society    0.310352849  0.1374834682 0.00217380450 0.0057325226  575.187461
";
        

Contributing code for tests

Kline.txt is data from StatisticalRethinking and as such not MIT licensed.

documentation

  • set up autogenerated documentation
  • go through docstrings, use DocStringExtensions.jl
  • write skeleton for autogenerated documentation

progress report

Introduce a customizable progress report scheme with reasonable defaults, that reports information

  1. about tuning, including various phases, some results (eg diagonal of the matrix),
  2. time / sample, function calls / sample, at various intervals,
  3. current iteration in the sampling phase

The default could emit to the console; users would override methods to customize it.
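
An illustrative shape for such a scheme (these names are not DynamicHMC API):

```julia
abstract type AbstractReporter end

# Default: print to the console; users define their own subtype and method.
struct ConsoleReporter <: AbstractReporter end

report(::ConsoleReporter, phase, iteration, info) =
    println("[$phase] step $iteration: $info")
```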

redesign rejected positions

For some points in the parameter space, the likelihood is -Inf, i.e. the sample is rejected. Currently this is indicated by a -Inf value in the returned DiffResult.

There are drawbacks to this:

  1. even when the sample is known to be rejected, the function returns a gradient which is nonsensical,

  2. leapfrog proceeds as if nothing happened, carrying the gradient at that point and relying on it never actually being used.

The following could make sense instead:

  1. a structure specific to this package is returned, which always contains the value, and when that is not -Inf, can be queried for the derivative. Optionally contains other fields, eg generated quantities.

  2. -Inf (including for other subtypes of AbstractFloat) acts as a placeholder for rejected values. These have no other fields, and are not queried.

Similarly, rejected phase points could be indicated by a singleton type.
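
An illustrative version of the proposed structure (names are placeholders):

```julia
struct EvaluatedLogDensity{T <: Real, G}
    value::T      # -Inf marks a rejected point
    gradient::G   # only meaningful when the value is finite
end

is_rejected(e::EvaluatedLogDensity) = e.value == -Inf

# The gradient of a rejected point is never exposed.
gradient(e::EvaluatedLogDensity) = (@assert !is_rejected(e); e.gradient)
```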

Disadvantage: yet another type in the API (but it may be needed because of generated quantities anyway). Possible type instability? But small Union types are handled very well by the compiler in v0.7.

link to stable docs

Once release is tagged, check that stable docs is generated and update links in

  • README.md,
  • docstring of module

to stable.

Epsilon adaptation is machine-dependent

I'm having a hard time getting an MWE because right now it's inside a private repo and the result is different on every run, but the setup is like this. It's a Bayesian estimation of some ODE system, with Random.seed!(1) before the call. However, Travis, my laptop, and my workstation all give different results. Somehow, Travis passes tests (yay!), my laptop runs but is slightly off on one parameter, while my workstation fails with the error pasted below. It seems to be tied to different steps being taken, and I'm not entirely sure what would cause it.

Bayesian models: Error During Test at /home/travis/.julia/packages/SafeTestsets/A83XK/src/SafeTestsets.jl:25
  Got exception outside of a @test
  LoadError: ArgumentError: 0 < a must hold. Got
  0 => 0.0
  a => NaN
  Stacktrace:
   [1] macro expansion at /home/travis/.julia/packages/ArgCheck/BUMkA/src/checks.jl:165 [inlined]
   [2] adapt_stepsize(::DynamicHMC.DualAveragingParameters{Float64}, ::DynamicHMC.DualAveragingAdaptation{Float64}, ::Float64) at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/stepsize.jl:162
   [3] tune(::DynamicHMC.NUTS{Array{Float64,1},Float64,Random.MersenneTwister,DynamicHMC.Hamiltonian{PuMaS.BayesLogDensity{PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))},Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}},Array{Float64,1},ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8,Array{ForwardDiff.Dual{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}},Tuple{},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}},DynamicHMC.GaussianKE{Symmetric{Float64,Array{Float64,2}},LowerTriangular{Float64,Array{Float64,2}}}},DynamicHMC.ReportIO{Base.TTY}}, ::DynamicHMC.StepsizeTuner) at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:74
   [4] tune(::DynamicHMC.NUTS{Array{Float64,1},Float64,Random.MersenneTwister,DynamicHMC.Hamiltonian{PuMaS.BayesLogDensity{PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))},Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}},Array{Float64,1},ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8,Array{ForwardDiff.Dual{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}},Tuple{},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}},DynamicHMC.GaussianKE{Diagonal{Float64,Array{Float64,1}},LowerTriangular{Float64,Diagonal{Float64,Array{Float64,1}}}}},DynamicHMC.ReportIO{Base.TTY}}, ::DynamicHMC.TunerSequence{Tuple{DynamicHMC.StepsizeTuner,DynamicHMC.StepsizeCovTuner{Float64},DynamicHMC.StepsizeCovTuner{Float64},DynamicHMC.StepsizeCovTuner{Float64},DynamicHMC.StepsizeCovTuner{Float64},DynamicHMC.StepsizeCovTuner{Float64},DynamicHMC.StepsizeTuner}}) at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:262
   [5] #NUTS_init_tune_mcmc#20(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Random.MersenneTwister, ::PuMaS.BayesLogDensity{PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))},Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}},Array{Float64,1},ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8,Array{ForwardDiff.Dual{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}},Tuple{},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}}, ::Int64) at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:285
   [6] NUTS_init_tune_mcmc at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:284 [inlined]
   [7] #NUTS_init_tune_mcmc#21 at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:294 [inlined]
   [8] NUTS_init_tune_mcmc(::PuMaS.BayesLogDensity{PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))},Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}},Array{Float64,1},ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8,Array{ForwardDiff.Dual{ForwardDiff.Tag{typeof(LogDensityProblems.logdensity),Float64},Float64,8},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}},Tuple{},Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}}, ::Int64) at /home/travis/.julia/packages/DynamicHMC/pkv4F/src/sampler.jl:294
   [9] #fit#207(::Int64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))}, ::Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}}, ::PuMaS.BayesMCMC) at /home/travis/build/UMCTM/PuMaS.jl/src/models/bayes.jl:133
   [10] (::getfield(StatsBase, Symbol("#kw##fit")))(::NamedTuple{(:nsamples,),Tuple{Int64}}, ::typeof(fit), ::PKPDModel{ParamSet{NamedTuple{(:θ, :Ω, :σ),Tuple{Constrained{MvNormal{Float64,PDiagMat{Float64,Array{Float64,1}},Array{Float64,1}},VectorDomain{Array{Float64,1},Array{Float64,1}}},InverseWishart{Float64,PDMat{Float64,Array{Float64,2}}},RealDomain{Float64}}}},getfield(Main.##408, Symbol("##27#40")),getfield(Main.##408, Symbol("##28#41")),getfield(Main.##408, Symbol("##29#42")),ODEProblem{Nothing,Tuple{Nothing,Nothing},false,Nothing,ODEFunction{false,getfield(ModelingToolkit, Symbol("##156#157")),UniformScaling{Bool},Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing},Nothing,DiffEqBase.StandardODEProblem},getfield(Main.##408, Symbol("##30#43")),getfield(Main.##408, Symbol("##35#48"))}, ::Population{Subject{StructArrays.StructArray{NamedTuple{(:dv,),Tuple{Float64}},1,NamedTuple{(:dv,),Tuple{Array{Float64,1}}}},NamedTuple{(:time, :WT, :SEX),Tuple{Array{Float64,1},Float64,Int64}},Array{PuMaS.Event{Float64,Float64,Float64,Float64,Float64,Float64},1},Float64}}, ::PuMaS.BayesMCMC) at ./none:0
   [11] top-level scope at none:0
   [12] include at ./boot.jl:326 [inlined]
   [13] include_relative(::Module, ::String) at ./loading.jl:1038
   [14] include at ./sysimg.jl:29 [inlined]
   [15] include(::String) at /home/travis/.julia/packages/SafeTestsets/A83XK/src/SafeTestsets.jl:23
   [16] top-level scope at /home/travis/build/UMCTM/PuMaS.jl/test/nlme/runtests.jl:16
   [17] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
   [18] top-level scope at /home/travis/build/UMCTM/PuMaS.jl/test/nlme/runtests.jl:16
   [19] eval(::Module, ::Any) at ./boot.jl:328
   [20] top-level scope at /home/travis/.julia/packages/SafeTestsets/A83XK/src/SafeTestsets.jl:23
   [21] top-level scope at util.jl:156
   [22] top-level scope at /home/travis/build/UMCTM/PuMaS.jl/test/nlme/runtests.jl:16
   [23] include at ./boot.jl:326 [inlined]
   [24] include_relative(::Module, ::String) at ./loading.jl:1038
   [25] include(::Module, ::String) at ./sysimg.jl:29
   [26] include(::String) at ./client.jl:403
   [27] top-level scope at /home/travis/build/UMCTM/PuMaS.jl/test/runtests.jl:54
   [28] top-level scope at util.jl:156
   [29] include at ./boot.jl:326 [inlined]
   [30] include_relative(::Module, ::String) at ./loading.jl:1038
   [31] include(::Module, ::String) at ./sysimg.jl:29
   [32] include(::String) at ./client.jl:403
   [33] top-level scope at none:0
   [34] eval(::Module, ::Any) at ./boot.jl:328
   [35] exec_options(::Base.JLOptions) at ./client.jl:243
   [36] _start() at ./client.jl:436
  in expression starting at /home/travis/build/UMCTM/PuMaS.jl/test/nlme/bayes.jl:108

review interface before tagging 2.0

Cf #74, #69

  • should we export something else?

  • should we not export something, to keep it flexible

  • alternative samplers (other than NUTS, HMC variants): do they fit into the API seamlessly?

    • should default_warmup_stages and fixed_stepsize_warmup_stages take the sampler as a parameter?
    • maybe rename TreeOptionsNUTS to simply NUTS and call sampler_options simply sampler
    • make sure sampler is interchangeable

Need help with an error I am getting using DiffEqBayes.jl while preparing for my talk at JuliaCon.

Hi @tpapp,
I was running a previous notebook we had in our benchmarks and got an error, probably due to some change in DynamicHMC.jl. I tried to diagnose it, but it seemed a bit advanced for me. Could you help me out?

AssertionError: chunk size cannot be greater than length(x) (10 > 4)

Stacktrace:
 [1] chunk_mode_gradient!(::DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}, ::ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}}, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}}) at /Users/vaibhav/.julia/v0.6/ForwardDiff/src/gradient.jl:123
 [2] gradient!(::DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}, ::ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}}, ::Array{Float64,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}}, ::Val{true}) at /Users/vaibhav/.julia/v0.6/ForwardDiff/src/gradient.jl:37
 [3] (::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}})(::Array{Float64,1}) at /Users/vaibhav/.julia/v0.6/DiffWrappers/src/DiffWrappers.jl:55
 [4] #NUTS_init#15(::DynamicHMC.GaussianKE{Diagonal{Float64},UpperTriangular{Float64,Array{Float64,2}}}, ::Array{Float64,1}, ::Int64, ::Float64, ::Function, ::MersenneTwister, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}) at /Users/vaibhav/.julia/v0.6/DynamicHMC/src/DynamicHMC.jl:969
 [5] (::DynamicHMC.#kw##NUTS_init)(::Array{Any,1}, ::DynamicHMC.#NUTS_init, ::MersenneTwister, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}) at ./<missing>:0
 [6] #NUTS_init_tune_mcmc#18(::Array{Any,1}, ::Function, ::MersenneTwister, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64) at /Users/vaibhav/.julia/v0.6/DynamicHMC/src/DynamicHMC.jl:1122
 [7] (::DynamicHMC.#kw##NUTS_init_tune_mcmc)(::Array{Any,1}, ::DynamicHMC.#NUTS_init_tune_mcmc, ::MersenneTwister, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64) at ./<missing>:0
 [8] #NUTS_init_tune_mcmc#19(::Array{Any,1}, ::Function, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64) at /Users/vaibhav/.julia/v0.6/DynamicHMC/src/DynamicHMC.jl:1133
 [9] (::DynamicHMC.#kw##NUTS_init_tune_mcmc)(::Array{Any,1}, ::DynamicHMC.#NUTS_init_tune_mcmc, ::DiffWrappers.ForwardGradientWrapper{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{ContinuousTransformations.TransformLogLikelihood{DiffEqBayes.DynamicHMCPosterior,ContinuousTransformations.TransformationTuple{NTuple{4,ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic}}}},Float64},Float64,10},1}},DiffResults.MutableDiffResult{1,Float64,Tuple{Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64) at ./<missing>:0
 [10] #dynamichmc_inference#15(::Float64, ::Array{Float64,1}, ::Int64, ::Array{Any,1}, ::Function, ::DiffEqBase.ODEProblem{Array{Float64,1},Float64,true,Array{Float64,1},FitzhughNagumo,Void,Void,UniformScaling{Int64},DiffEqBase.StandardODEProblem}, ::Array{Float64,2}, ::Array{Distributions.Truncated{Distributions.Normal{Float64},Distributions.Continuous},1}, ::Array{Float64,1}, ::Array{ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic},1}) at /Users/vaibhav/.julia/v0.6/DiffEqBayes/src/dynamichmc_inference.jl:74
 [11] (::DiffEqBayes.#kw##dynamichmc_inference)(::Array{Any,1}, ::DiffEqBayes.#dynamichmc_inference, ::DiffEqBase.ODEProblem{Array{Float64,1},Float64,true,Array{Float64,1},FitzhughNagumo,Void,Void,UniformScaling{Int64},DiffEqBase.StandardODEProblem}, ::Array{Float64,2}, ::Array{Distributions.Truncated{Distributions.Normal{Float64},Distributions.Continuous},1}, ::Array{Float64,1}, ::Array{ContinuousTransformations.ComposedTransformation{ContinuousTransformations.Affine{Float64},ContinuousTransformations.Logistic},1}) at ./<missing>:0

Thanks!

adapt a diagonal mass matrix by default

In practice, for short samples, even a regularized covariance matrix mostly estimates noise in the off-diagonal elements. Either always use a diagonal matrix, or allow customization of this but make diagonal the default.

AFAIK this is what Stan does, too.
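
For illustration, the diagonal variant amounts to keeping only the marginal variances of the warmup draws (a sketch, not the package's code):

```julia
using LinearAlgebra, Statistics

# draws: dimension × N matrix of warmup positions
diagonal_mass_matrix(draws::AbstractMatrix) = Diagonal(vec(var(draws; dims = 2)))
```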

What is a "step"?

The sampler prints out info like "0.19 s/step". I assume that's per MH step, not per leapfrog step? It would be useful to be more precise.

unit tests for posterior

Current tests are very ad-hoc.

  • The threshold for |z| is very large and was adjusted to make tests pass. Is this to be expected, or does it just let incorrect output pass? Compare with simple HMC and other Bayesian software.
  • check EBFMI
  • check Rhat
  • add tests for other known distributions (Multivariate t, possibly Wishart, inverse Wishart)

Support other output data structures

The current DynamicHMC approach of storing NUTS output in an array is common, but is not always the most desirable. Two difficulties with this:

  1. It's opaque until it's done running. This could be addressed with the addition of callback functions, but that requires a somewhat awkward coding style.
  2. When it's done, it's done. There's no way to inform the sampler that we're not really done yet and would like a few more samples.

By representing output in the form of an iterator, both of these difficulties could be improved. An iterator can easily be processed "live" in order to interrupt (if we realize the geometry of the problem needs adjustment) or extended (for example if we'd like it to run until the effective number of samples reaches a certain point).

There may be some small overhead to an iterator approach compared to storing things in an array. But the number of NUTS samples we're interested in is almost always in the 100–10000 range, so this should be minimal. Also, it's easy to pipe an iterator into an array if array output is preferred.
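
A sketch of the iterator variant (one_nuts_transition and the state type are hypothetical):

```julia
struct SamplingIterator{S}
    initial_state::S
end

# Each iteration performs one NUTS transition; draws can then be consumed
# lazily, inspected live, or extended until a stopping criterion holds.
function Base.iterate(it::SamplingIterator, state = it.initial_state)
    sample, state = one_nuts_transition(state)  # hypothetical single-step kernel
    sample, state
end

Base.IteratorSize(::Type{<:SamplingIterator}) = Base.IsInfinite()

# e.g. collect(Iterators.take(SamplingIterator(s0), 1000)) recovers array output.
```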

An alternative to the iterator approach is to allow multiple output formats, maybe with something like an output=:array default keyword argument.

heavy dependencies for initial optimization

Optim.jl, which is used for finding the initial local optimum, is responsible for about 10–15 of the dependencies in the manifest.

Given that we just need a solver that gets close to a local optimum, is there a more streamlined solution?

ERROR: Reached maximum number of iterations while bisecting interval for ϵ.

Tamas, we ran into the above issue with DynamicHMC. The error occurs intermittently, about 1 in 5 runs. Here is the full error message:

ERROR: Reached maximum number of iterations while bisecting interval for ϵ.
Stacktrace:
 [1] bisect_stepsize(::DynamicHMC.InitialStepsizeSearch, ::getfield(DynamicHMC, Symbol("##3#4")){DynamicHMC.Hamiltonian{LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}},GaussianKE{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},LinearAlgebra.LowerTriangular{Float64,LinearAlgebra.Diagonal{Float64,Array{Float64,1}}}}},DynamicHMC.PhasePoint{Array{Float64,1},LogDensityProblems.ValueGradient{Float64,Array{Float64,1}}},Float64}, ::Float64, ::Float64, ::Float64, ::Float64) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/stepsize.jl:122
 [2] find_initial_stepsize at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/stepsize.jl:146 [inlined]
 [3] find_initial_stepsize(::DynamicHMC.InitialStepsizeSearch, ::DynamicHMC.Hamiltonian{LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}},GaussianKE{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},LinearAlgebra.LowerTriangular{Float64,LinearAlgebra.Diagonal{Float64,Array{Float64,1}}}}}, ::DynamicHMC.PhasePoint{Array{Float64,1},LogDensityProblems.ValueGradient{Float64,Array{Float64,1}}}) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/stepsize.jl:151
 [4] #NUTS_init#18(::Array{Float64,1}, ::GaussianKE{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},LinearAlgebra.LowerTriangular{Float64,LinearAlgebra.Diagonal{Float64,Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64, ::DynamicHMC.InitialStepsizeSearch, ::ReportIO{Base.TTY}, ::Function, ::MersenneTwister, ::LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}}) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:141
 [5] NUTS_init(::MersenneTwister, ::LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}}) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:138
 [6] #NUTS_init_tune_mcmc#20(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::MersenneTwister, ::LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}}, ::Int64) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:284
 [7] NUTS_init_tune_mcmc at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:284 [inlined]
 [8] #NUTS_init_tune_mcmc#21 at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:294 [inlined]
 [9] NUTS_init_tune_mcmc(::LogDensityProblems.ForwardDiffLogDensity{TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}},ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(LogDensityProblems, Symbol("##1#2")){TransformedLogDensity{TransformVariables.TransformTuple{NamedTuple{(:v, :A, :k, :tau),Tuple{TransformVariables.ArrayTransform{TransformVariables.ShiftedExp{true,Float64},1},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64},TransformVariables.ShiftedExp{true,Float64}}}},LBAProb{NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}}}},Float64},Float64,6},1}}}, ::Int64) at /home/dfish/.julia/packages/DynamicHMC/YX6JA/src/sampler.jl:294
 [10] sampleDHMC(::NamedTuple{(:choice, :rt),Tuple{Array{Int64,1},Array{Float64,1}}}, ::Int64, ::Int64, ::Int64) at /home/dfish/.julia/dev/MCMCBenchmarks/src/temp.jl:159
 [11] top-level scope at none:0

We were wondering if you have any advice for dealing with this situation. One solution that comes immediately to mind is to increase the number of iterations. However, that may not be the most efficient solution. An MWE can be found here.
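
One hedged workaround sketch for v1.x (the q keyword for the initial position is an assumption, inferred from the keywords visible in the stack trace; check the docstrings of NUTS_init before relying on it): supply a starting point in a better-behaved region, e.g. near the mode, so the initial stepsize search does not run out of iterations.

# Hypothetical: q₀ must match the dimension of the transformed parameter vector.
q₀ = zeros(6)
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 2000; q = q₀)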

unit test turn statistics and detection

These are key parts of the algorithm, yet they are only tested indirectly, especially their combination in adjacent_tree.

With #30, introduce a transition building block that keeps track of the relative (integer) index of steps, and compare this to a simple leapfrog sequence.
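
For what it's worth, here is a self-contained sketch of how the turn criterion itself could be unit-tested directly, without going through the sampler. It does not use the package's internals: the leapfrog step and the classic U-turn criterion of Hoffman & Gelman (2014) are spelled out for a standard Gaussian, whose exact flow is a rotation with period 2π, so the criterion should first fire just after half a period.

using LinearAlgebra, Test

# One leapfrog step for a standard Gaussian target, where the potential is
# ‖q‖²/2 and its gradient is simply q.
function leapfrog(q, p, ϵ)
    p = p - (ϵ / 2) * q
    q = q + ϵ * p
    p = p - (ϵ / 2) * q
    return q, p
end

# Classic U-turn criterion (Hoffman & Gelman, 2014).
uturn(q₋, q₊, p₋, p₊) = dot(p₋, q₊ - q₋) < 0 || dot(p₊, q₊ - q₋) < 0

# Count leapfrog steps until the criterion fires, starting from q = (1, 0),
# p = (0, 1); for the exact flow this happens just after t = π.
function steps_to_uturn(ϵ; maxsteps = 10_000)
    q₀, p₀ = [1.0, 0.0], [0.0, 1.0]
    q, p = q₀, p₀
    for n in 1:maxsteps
        q, p = leapfrog(q, p, ϵ)
        uturn(q₀, q, p₀, p) && return n
    end
    return maxsteps
end

@test isapprox(steps_to_uturn(0.01) * 0.01, π; rtol = 0.05)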

preparing for release 2.0

@zenna, @goedman, @cscherrer

DynamicHMC is nearing a new release. Many things are improved, but this is a breaking change in the API. I am pinging you because you are using this package in your own packages.

  1. At the very least, you should provide version bounds in Project.toml, along the lines of
[compat]
DynamicHMC = "^1"

This should prevent problems when I release the new version, which I plan to do on Sep 1, 2019 if testing does not reveal any bugs.

  2. The new interface is documented at https://tamaspapp.eu/DynamicHMC.jl/latest/worked_example/. Unless you are tweaking the adaptation, you would just use it as
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 1000)

Note that LogDensityProblems also had an API change; 0.9.x is a bit different. If you are using the TransformedLogDensity constructor, not much changed, but if you are creating log densities directly, you need to use the new API (the changes are trivial; see the sketch after this list).

  3. If you have tests which you can run with the new interface before I release, feedback and bug reports are appreciated. This code should be more solid than the previous release (I added tests and reworked a lot of internals), but one never knows.

  4. Finally, please ask questions here, and if you think that something you need is not addressed by the new API, let me know.
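
For those creating log densities directly, here is a minimal sketch of the new-style interface. The hook names below follow the trait-based API of recent LogDensityProblems versions and are an assumption here; double-check against the 0.9.x documentation.

using LogDensityProblems

# A toy log density: a d-dimensional standard normal, up to a constant.
struct StdNormal
    d::Int
end

# Declare that the object supports plain evaluation (order 0); derivatives
# can then be added with AD, e.g. ADgradient(:ForwardDiff, StdNormal(4)).
LogDensityProblems.capabilities(::Type{<:StdNormal}) =
    LogDensityProblems.LogDensityOrder{0}()
LogDensityProblems.dimension(ℓ::StdNormal) = ℓ.d
LogDensityProblems.logdensity(ℓ::StdNormal, x) = -sum(abs2, x) / 2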

TODO before release

  • fix #67 (edit: will fix later, as Documenter.jl catches up)
  • run more sample correctness tests

TODO after release

tag a new release?

I'm getting an error on the release which doesn't appear on master, so it would be good to have a new version.

Further on my previous request (Request for help #12 in DynamicHMCExamples)

Hi Tamas,

I've tried to simplify the above-mentioned request for help a bit further, to get as close as possible to an MWE. The example needs the chimpanzees.TXT data (it is a CSV file, but GitHub doesn't support that extension) and the script below. The expected results are in the variable rethinking. At the end of the script I print the DynamicHMC values. The alpha values seem ok; bp and bpC are consistently off.

chimpanzees.TXT

using CSV, DataFrames, Random, Distributions, LinearAlgebra
using StatsBase, StatsFuns
using DynamicHMC, TransformVariables, LogDensityProblems
using MCMCDiagnostics, LinearAlgebra, Statistics
using Parameters, ForwardDiff

Random.seed!(12345)

# Change to a local directory where chimpanzees.txt is stored

ProjDir = @__DIR__
cd(ProjDir)

# Read in chimpanzees and convert columns to Float64
# Same result is obtained if left as Int64.

d = CSV.read("chimpanzees.txt", delim=',');
df = convert(DataFrame, d);
df[:pulled_left] = convert(Array{Float64}, df[:pulled_left])
df[:prosoc_left] = convert(Array{Float64}, df[:prosoc_left])
df[:condition] = convert(Array{Float64}, df[:condition])
df[:actor] = convert(Array{Int64}, df[:actor])

struct m_10_04d_model{TY <: AbstractVector, TX <: AbstractMatrix,
  TA <: AbstractVector}
    "Observations."
    y::TY
    "Covariates"
    X::TX
    "Actors"
    A::TA
    "Number of observations"
    N::Int
    "Number of unique actors"
    N_actors::Int
end

# Make the type callable with the parameters *as a single argument*.

function (problem::m_10_04d_model)(θ)
    @unpack y, X, A, N, N_actors = problem
    @unpack β, α = θ
    ll = 0.0
    ll += sum(logpdf.(Normal(0, 10), β)) # bp & bpC
    ll += sum(logpdf.(Normal(0, 10), α)) # alpha[1:7]
    ll += sum(
      (loglikelihood(Binomial(1, logistic(α[A[i]] + dot(X[i, :], β))), [y[i]]) for i in 1:N)
    )
    ll
end

# Instantiate the model with data and inits.

N = size(df, 1)
N_actors = length(unique(df[:actor]))
X = hcat(ones(Int64, N), df[:prosoc_left] .* df[:condition]);
A = df[:actor]
y = df[:pulled_left]
p = m_10_04d_model(y, X, A, N, N_actors);
θ = (β = [1.0, 0.0], α = [-1.0, 10.0, -1.0, -1.0, -1.0, 0.0, 2.0])
p(θ)

problem_transformation(p::m_10_04d_model) =
    as( (β = as(Array, size(p.X, 2)), α = as(Array, p.N_actors), ) )

P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
#∇P = ADgradient(:ForwardDiff, P);
#LogDensityProblems.stresstest(P)

chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000);
posterior = TransformVariables.transform.(Ref(problem_transformation(p)), get_position.(chain));
posterior_β = mean(first, posterior)
posterior_α = mean(last, posterior)

# Result CmdStan

rethinking = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
        Mean        SD       Naive SE       MCSE      ESS
a.1 -0.74503184 0.26613979 0.0042080396 0.0060183398 1000
a.2 10.77955494 5.32538998 0.0842018089 0.1269148045 1000
a.3 -1.04982353 0.28535997 0.0045119373 0.0049074219 1000
a.4 -1.04898135 0.28129307 0.0044476339 0.0056325117 1000
a.5 -0.74390933 0.26949936 0.0042611590 0.0052178124 1000
a.6  0.21599365 0.26307574 0.0041595927 0.0045153523 1000
a.7  1.81090866 0.39318577 0.0062168129 0.0071483527 1000

 bp  0.83979926 0.26284676 0.0041559722 0.0059795826 1000
bpC -0.12913322 0.29935741 0.0047332562 0.0049519863 1000
";

# Means of draws

[posterior_β, posterior_α]

Ambiguous method when trying to use DynamicNUTS

That's what happens with Julia 1.1 and freshly updated packages:

julia> using Turing

julia> using DynamicHMC

julia> @model gdemo(x, y) = begin
         s ~ InverseGamma(2,3)
         m ~ Normal(0, sqrt(s))
         x ~ Normal(m, sqrt(s))
         y ~ Normal(m, sqrt(s))
       end
gdemo (generic function with 3 methods)

julia> chn = sample(gdemo(1.5, 2.0), DynamicNUTS(1000))
ERROR: MethodError: LogDensityProblems.logdensity(::Type{LogDensityProblems.ValueGradient}, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}, ::Array{Float64,1}) is ambiguous. Candidates:
  logdensity(::Type{LogDensityProblems.ValueGradient}, ℓ::Turing.Inference.FunctionLogDensity, x) in Turing.Inference at /home/ge68wan/.julia/packages/Turing/izlov/src/Turing.jl:124
  logdensity(::Type{LogDensityProblems.ValueGradient}, ℓ, x::AbstractArray{T,1}) where T in LogDensityProblems at /home/ge68wan/.julia/packages/LogDensityProblems/sIFw5/src/problem.jl:46
Possible fix, define
  logdensity(::Type{LogDensityProblems.ValueGradient}, ::Turing.Inference.FunctionLogDensity, ::AbstractArray{T,1})
Stacktrace:
 [1] phasepoint_in(::DynamicHMC.Hamiltonian{Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}},GaussianKE{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},LinearAlgebra.LowerTriangular{Float64,LinearAlgebra.Diagonal{Float64,Array{Float64,1}}}}}, ::Array{Float64,1}, ::Array{Float64,1}) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/hamiltonian.jl:149
 [2] #NUTS_init#18(::Array{Float64,1}, ::GaussianKE{LinearAlgebra.Diagonal{Float64,Array{Float64,1}},LinearAlgebra.LowerTriangular{Float64,LinearAlgebra.Diagonal{Float64,Array{Float64,1}}}}, ::Array{Float64,1}, ::Int64, ::DynamicHMC.InitialStepsizeSearch, ::ReportIO{Base.TTY}, ::Function, ::Random.MersenneTwister, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:139
 [3] NUTS_init(::Random.MersenneTwister, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:138
 [4] #NUTS_init_tune_mcmc#20(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Random.MersenneTwister, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}, ::Int64) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:284
 [5] NUTS_init_tune_mcmc(::Random.MersenneTwister, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}, ::Int64) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:284
 [6] #NUTS_init_tune_mcmc#21(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}, ::Int64) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:294
 [7] NUTS_init_tune_mcmc(::Turing.Inference.FunctionLogDensity{getfield(Turing.Inference, Symbol("#_lp#49")){Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}},Turing.Sampler{DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}}}}, ::Int64) at /home/ge68wan/.julia/packages/DynamicHMC/gnaxv/src/sampler.jl:294
 [8] sample(::Turing.Model{Tuple{:s,:m},Tuple{:x,:y},getfield(Main, Symbol("###inner_function#375#10")),NamedTuple{(:x, :y),Tuple{Float64,Float64}},NamedTuple{(:x, :y),Tuple{Symbol,Symbol}}}, ::DynamicNUTS{Turing.Core.ForwardDiffAD{40},Union{}}) at /home/ge68wan/.julia/packages/Turing/izlov/src/inference/dynamichmc.jl:66
 [9] top-level scope at none:0

Works fine if HMC(1000, 0.1, 5) is used (as originally in the example).
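
For reference, the call that works, using the sampler from the original example, is:

julia> chn = sample(gdemo(1.5, 2.0), HMC(1000, 0.1, 5))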

discuss starting from the mode

Not doing this can easily lead to problems, eg #7.

  1. the documentation should definitely mention this, eg Optim.jl and BlackBoxOptim.jl (see the sketch after this list),

  2. perhaps offer a way to do this automatically,

  3. even stronger: always do it unless explicitly asked not to, eg as the first item of the default TunerSequence
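
As a sketch of item 1 under the 2.x interface (the initialization keyword of mcmc_with_warmup is an assumption here; check the released documentation), one could find the mode with Optim.jl and start warmup from it:

using Optim, Random, LogDensityProblems, DynamicHMC

# P is a TransformedLogDensity and ∇P its AD-wrapped version, as in the
# examples elsewhere in this document.
negℓ(x) = -LogDensityProblems.logdensity(P, x)
opt = optimize(negℓ, zeros(LogDensityProblems.dimension(P)), BFGS())
q₀ = Optim.minimizer(opt)

# Start warmup from the (approximate) mode instead of a random point.
results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P, 1000;
                           initialization = (q = q₀,))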

Add mean tree depth to NUTS_tuned

Hi Tamas-

I am using DynamicHMC v1.0.6 and would like to report mean tree depth for MCMCBenchmarks.jl. As far as I can tell, NUTS_tuned only reports the maximum tree depth. Would it be possible to add mean tree depth to NUTS_tuned? Thanks.
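
In the meantime, a possible workaround is to average the per-transition depths directly; a sketch under v1.x (this assumes the exported get_depth accessor for NUTS transitions; verify against your installed version):

using Statistics

chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000)
mean_depth = mean(get_depth.(chain))   # average tree depth over all transitions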

New Test Case

Hi Tamas-

I am following up on our discussion on discourse. I believe you wanted to use the LBA code as a test for your package; that is fine with me. Performing parameter estimation on the LBA is difficult due to the intercorrelations among parameters, so it makes a great test case.

The code can be found here and uncommenting line 176 will produce the maximum iteration error.

Different behaviour when running code in the REPL or in Jupyter notebook

Hi, I also wrote a message on the julia discourse board. I'm on the latest Julia version (1.2 stable) and I pulled TransformVariables, LogDensityProblems and DynamicHMC from the master branch. I have reproduced the problem both on Ubuntu 16.04 and macOS 10.14.6.

It's probably a very stupid problem, but I can't figure it out!

Self-contained example that demonstrates the problem

using LinearAlgebra
using StaticArrays
using TransformVariables
using LogDensityProblems
using DynamicHMC
using Parameters
using Random

function LogLikelihoodMin(ω, dyDep)
    dt = 0.1
    cD = [0 -1im; 1im 0]
    H = (ω / 2) * cD
    rho = [0.05 0; 0 0.95]
    M = I - ((cD' * cD) / 2) * dt + (cD * dyDep) - 1im * H * dt
    newRho = M * rho * M'
    lklhood = real(tr(newRho)) - (ω * dt / 2)^2
    return log(lklhood)
end

struct Experiment
      dyDep::Float64
end

function (problem::Experiment)((ω,)::NamedTuple{(:ω,)})
      @unpack dyDep  = problem        # extract the data
      LogLikelihoodMin(ω,dyDep)
end

dyDepObs=0.690691
p1 = Experiment(dyDepObs)

print(p1((ω=.4,)))

trans_single = as((ω=as(Real, 2, 4),))
P1 = TransformedLogDensity(trans_single, p1)
∇P1 = ADgradient(:ForwardDiff, P1)

results = mcmc_with_warmup(Random.GLOBAL_RNG, ∇P1, 1000)

Output, expected outcome, comparison to other samplers

Basically, if I run this code in the REPL or launch it as a script, it runs correctly, while if I copy and paste it into a Jupyter notebook I get the error

ArgumentError: nothing should not be printed; use show, repr, or custom output instead.

See the whole stacktrace here.
