Git Product home page Git Product logo

bssm's Introduction

bssm

Project Status: Active - The project has reached a stable, usable state and is being actively developed Status at rOpenSci Software Peer Review R-CMD-check Codecov test coverage CRAN version downloads

The bssm R package provides efficient methods for Bayesian inference of state space models via particle Markov chain Monte Carlo and importance sampling type weighted MCMC. Currently Gaussian, Poisson, binomial, negative binomial, and Gamma observation densities with linear-Gaussian state dynamics, as well as general non-linear Gaussian models and discretely observed latent diffusion processes are supported.

For details, see

There are also couple posters and a talk related to IS-correction methodology and bssm package:

The bssm package was originally developed with the support of Academy of Finland grants 284513, 312605, 311877, and 331817. Current development is focused on increased usability. For recent changes, see NEWS file.

Citing the package

If you use the bssm package in publications, please cite the corresponding R Journal paper:

Jouni Helske and Matti Vihola (2021). “bssm: Bayesian Inference of Non-linear and Non-Gaussian State Space Models in R.” The R Journal (2021) 13:2, pages 578-589. https://journal.r-project.org/archive/2021/RJ-2021-103/index.html

Installation

You can install the released version of bssm from CRAN with:

install.packages("bssm")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("helske/bssm")

Or from R-universe with

install.packages("bssm", repos = "https://helske.r-universe.dev")

Example

Consider the daily air quality measurements in New Your from May to September 1973, available in the datasets package. Let’s try to predict the missing ozone levels by simple linear-Gaussian local linear trend model with temperature and wind as explanatory variables (missing response variables are handled naturally in the state space modelling framework, however no missing values in covariates are normally allowed);

library("bssm")
#> Warning: package 'bssm' was built under R version 4.3.1
#> 
#> Attaching package: 'bssm'
#> The following object is masked from 'package:base':
#> 
#>     gamma
library("dplyr")
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library("ggplot2")
#> Warning: package 'ggplot2' was built under R version 4.3.1
set.seed(1)

data("airquality", package = "datasets")

# Covariates as matrix. For complex cases, check out as_bssm function
xreg <- airquality |> select(Wind, Temp) |> as.matrix()

model <- bsm_lg(airquality$Ozone,
  xreg = xreg,  
  # Define priors for hyperparameters (i.e. not the states), see ?bssm_prior
  # Initial value followed by parameters of the prior distribution
  beta = normal_prior(rep(0, ncol(xreg)), 0, 1),
  sd_y = gamma_prior(1, 2, 0.01),
  sd_level = gamma_prior(1, 2, 0.01), 
  sd_slope = gamma_prior(1, 2, 0.01))

fit <- run_mcmc(model, iter = 20000, burnin = 5000)
fit
#> 
#> Call:
#> run_mcmc.lineargaussian(model = model, iter = 20000, burnin = 5000)
#> 
#> Iterations = 5001:20000
#> Thinning interval = 1
#> Length of the final jump chain = 3593
#> 
#> Acceptance rate after the burn-in period:  0.239
#> 
#> Summary for theta:
#> 
#>  variable       Mean          SE        SD        2.5%     97.5% ESS
#>      Temp  1.0265846 0.007497538 0.2064343  0.60226671  1.400436 758
#>      Wind -2.5183269 0.020978488 0.5764833 -3.68987992 -1.327578 755
#>  sd_level  6.3731836 0.113153715 2.8013937  1.52958636 12.403961 613
#>  sd_slope  0.3388712 0.010355574 0.2833955  0.04210885  1.070284 749
#>      sd_y 20.8618647 0.068145131 1.9369381 17.08728231 24.722309 808
#> 
#> Summary for alpha_154:
#> 
#>  variable time        Mean         SE        SD       2.5%     97.5%  ESS
#>     level  154 -28.3163054 0.69650977 20.132341 -69.271049 11.797133  835
#>     slope  154  -0.3740463 0.03683278  1.685733  -4.065499  2.830134 2094
#> 
#> Run time:
#>    user  system elapsed 
#>    0.57    0.02    0.63

obs <- data.frame(Time = 1:nrow(airquality),
  Ozone = airquality$Ozone) |> filter(!is.na(Ozone))

pred <- fitted(fit, model)
pred |>
  ggplot(aes(x = Time, y = Mean)) + 
  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), 
    alpha = 0.5, fill = "steelblue") + 
  geom_line() + 
  geom_point(data = obs, 
    aes(x = Time, y = Ozone), colour = "Tomato") +
  theme_bw()

Same model but now assuming observations are from Gamma distribution:

model2 <- bsm_ng(airquality$Ozone,
  xreg = xreg,  
  beta = normal(rep(0, ncol(xreg)), 0, 1),
  distribution = "gamma",
  phi = gamma_prior(1, 2, 0.01),
  sd_level = gamma_prior(1, 2, 0.1), 
  sd_slope = gamma_prior(1, 2, 0.1))

fit2 <- run_mcmc(model2, iter = 20000, burnin = 5000, particles = 10)
fit2
#> 
#> Call:
#> run_mcmc.nongaussian(model = model2, iter = 20000, particles = 10, 
#>     burnin = 5000)
#> 
#> Iterations = 5001:20000
#> Thinning interval = 1
#> Length of the final jump chain = 3858
#> 
#> Acceptance rate after the burn-in period:  0.257
#> 
#> Summary for theta:
#> 
#>  variable         Mean           SE          SD          2.5%       97.5%  ESS
#>      Temp  0.052808820 0.0002404538 0.008701489  0.0353736458  0.06992423 1310
#>      Wind -0.057351094 0.0004338213 0.015411504 -0.0873384757 -0.02700112 1262
#>       phi  4.006977632 0.0159088062 0.536273508  3.0263444882  5.15527365 1136
#>  sd_level  0.057158663 0.0020138200 0.035366227  0.0083794202  0.14651419  308
#>  sd_slope  0.003894013 0.0001752319 0.003654978  0.0004250207  0.01374575  435
#>         SE_IS ESS_IS
#>  7.128104e-05  14485
#>  1.263047e-04  13905
#>  4.411840e-03  14611
#>  2.927386e-04  10591
#>  3.031489e-05   7766
#> 
#> Summary for alpha_154:
#> 
#>  variable time         Mean           SE         SD        2.5%      97.5%  ESS
#>     level  154 -0.200656509 0.0201721601 0.73134471 -1.62501396 1.24522802 1314
#>     slope  154 -0.002689176 0.0005121944 0.02289051 -0.04650504 0.04724173 1997
#>        SE_IS ESS_IS
#>  0.005987284   9458
#>  0.000191620   6448
#> 
#> Run time:
#>    user  system elapsed 
#>    7.50    0.01    7.71

Comparison:

pred2 <- fitted(fit2, model2)

bind_rows(list(Gaussian = pred, Gamma = pred2), .id = "Model") |>
  ggplot(aes(x = Time, y = Mean)) + 
  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`, fill = Model), 
    alpha = 0.25) + 
  geom_line(aes(colour = Model)) + 
  geom_point(data = obs, 
    aes(x = Time, y = Ozone)) +
  theme_bw()

Now let’s assume that we also want to use the solar radiation variable as predictor for ozone. As it contains few missing values, we cannot use it directly. As the number of missing time points is very small, simple imputation would likely be acceptable, but let’s consider more another approach. For simplicity, the slope terms of the previous models are now omitted, and we focus on the Gaussian case. Let $\mu_t$ be the true solar radiation at time $t$. Now for ozone $O_t$ we assume following model:

$O_t = D_t + \alpha_t + \beta_S \mu_t + \sigma_\epsilon \epsilon_t$
$\alpha_{t+1} = \alpha_t + \sigma_\eta\eta_t$
$\alpha_1 \sim N(0, 100^2\textrm{I})$,
wheere $D_t = \beta X_t$ contains regression terms related to wind and temperature, $\alpha_t$ is the time varying intercept term, and $\beta_S$ is the effect of solar radiation $\mu_t$.

Now for the observed solar radiation $S_t$ we assume

$S_t = \mu_t$
$\mu_{t+1} = \mu_t + \sigma_\xi\xi_t,$
$\mu_1 \sim N(0, 100^2)$,
i.e. we assume as simple random walk for the $\mu$ which we observe without error or not at all (there is no error term in the observation equation $S_t=\mu_t$).

We combine these two models as a bivariate Gaussian model with ssm_mlg:

# predictors (not including solar radiation) for ozone
xreg <- airquality |> select(Wind, Temp) |> as.matrix()

# Function which outputs new model components given the parameter vector theta
update_fn <- function(theta) {
  D <- rbind(t(xreg %*% theta[1:2]), 1)
  Z <- matrix(c(1, 0, theta[3], 1), 2, 2)
  R <- diag(exp(theta[4:5]))
  H <- diag(c(exp(theta[6]), 0))
  # add third dimension so we have p x n x 1, p x m x 1, p x p x 1 arrays
  dim(Z)[3] <- dim(R)[3] <- dim(H)[3] <- 1
  list(D = D, Z = Z, R = R, H = H)
}

# Function for log-prior density
prior_fn <- function(theta) {
  sum(dnorm(theta[1:3], 0, 10, log = TRUE)) + 
    sum(dgamma(exp(theta[4:6]), 2, 0.01, log = TRUE)) + 
    sum(theta[4:6]) # log-jacobian
}

init_theta <- c(0, 0, 0, log(50), log(5), log(20))
comps <- update_fn(init_theta)

model <- ssm_mlg(y = cbind(Ozone = airquality$Ozone, Solar = airquality$Solar.R),
  Z = comps$Z, D = comps$D, H = comps$H, T = diag(2), R = comps$R, 
  a1 = rep(0, 2), P1 = diag(100, 2), init_theta = init_theta, 
  state_names = c("alpha", "mu"), update_fn = update_fn,
  prior_fn = prior_fn)

fit <- run_mcmc(model, iter = 60000, burnin = 10000)
fit
#> 
#> Call:
#> run_mcmc.lineargaussian(model = model, iter = 60000, burnin = 10000)
#> 
#> Iterations = 10001:60000
#> Thinning interval = 1
#> Length of the final jump chain = 12234
#> 
#> Acceptance rate after the burn-in period:  0.245
#> 
#> Summary for theta:
#> 
#>  variable        Mean           SE         SD       2.5%      97.5%  ESS
#>   theta_1 -3.89121114 0.0233827004 0.58715113 -5.0085134 -2.6915137  631
#>   theta_2  0.98712126 0.0051506907 0.18819758  0.5917823  1.3475147 1335
#>   theta_3  0.06324657 0.0004672314 0.02417334  0.0141425  0.1100184 2677
#>   theta_4  0.82577262 0.0165661049 0.67134723 -0.6840637  1.9160168 1642
#>   theta_5  4.75567621 0.0010887250 0.05858454  4.6446809  4.8704036 2895
#>   theta_6  3.05462451 0.0014803971 0.07640392  2.9032635  3.2028023 2664
#> 
#> Summary for alpha_154:
#> 
#>  variable time      Mean        SE        SD       2.5%     97.5%  ESS
#>     alpha  154 -16.44435 0.3659912  14.99708 -46.321645  13.01863 1679
#>        mu  154 223.60490 1.3409568 116.49063  -6.206301 453.18554 7546
#> 
#> Run time:
#>    user  system elapsed 
#>    7.41    0.11    7.83

Draw predictions:

pred <- fitted(fit, model)

obs <- data.frame(Time = 1:nrow(airquality),
  Solar = airquality$Solar.R) |> filter(!is.na(Solar))

pred |> filter(Variable == "Solar") |>
  ggplot(aes(x = Time, y = Mean)) + 
  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), 
    alpha = 0.5, fill = "steelblue") + 
  geom_line() + 
  geom_point(data = obs, 
    aes(x = Time, y = Solar), colour = "Tomato") +
  theme_bw()

obs <- data.frame(Time = 1:nrow(airquality),
  Ozone = airquality$Ozone) |> filter(!is.na(Ozone))

pred |> filter(Variable == "Ozone") |>
  ggplot(aes(x = Time, y = Mean)) + 
  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), 
    alpha = 0.5, fill = "steelblue") + 
  geom_line() +  
  geom_point(data = obs, 
    aes(x = Time, y = Ozone), colour = "Tomato") +
  theme_bw()

See more examples in the paper, vignettes, and in the docs.

bssm's People

Contributors

helske avatar khusmann avatar sbgraves237 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

bssm's Issues

Segfault due to R call inside parallel sampling

Apparently using pragma omp critical is not enough when calling R functions inside openMP, so there are occasional crashes with models using update_fn when trying to sample states in parallel (MCMC with ssm_ulg, ssm_ung, and their multivariate counterparts). I figured out a way to circumvent this issue, and it seems to work, but still needs some tidying up.

CRAN complains about multithreading

Getting a note on CRAN's Debian pretest:

Examples with CPU time > 2.5 times elapsed time
          user system elapsed ratio
exchange 1.196   0.04   0.159 7.774

Nothing in that example should use parallelisation as we are not doing MCMC. And we don't use data.table (Rdatatable/data.table#5658). Setting Sys.setenv("OMP_THREAD_LIMIT" = 2) doesn't help either.

Stray Stats Std BS7.7

@helske I'm trying to figure out ropensci-review-tools/srr#24, and note that you've got a standard BS7.7 in this line:

#' @srrstats {G5.4, G5.4a, G5.4b, G5.4c, G5.5, BS7.0, BS7.1, BS7.2, BS7.7}

But Bayesian standards only go up to BS7.4.That results in your code reporting 124 standards from a total of 123 General + Bayesian. I'll implement a catch for any non-existent standards in srr, and could you please correct that one in your code? Thanks!

run_mcmc crashes with a simple linear gaussian model

Hi, I am trying to estimate a multivariate linear gaussian model. However, R crashes whenever I call the run_mcmc function. A toy example with simulated data is below.
The only output I get is the progress bar and after completion Rstudio tells me that the R Session aborted without further explanation.

Kalman filtering works fine.

library(bssm)

set.seed(1)

########### Simulating fake data
set.seed(1)

Tt <- 100
p<-2 #numero de observaveis
m<- 2 #numero de estados 
y <- matrix(0,Tt,p) # observaveis
Z <- matrix(0,p,m) #matriz de observacao
Tm <- matrix(0,m,m)
H <- matrix(0,2,2)
R <- matrix(0,m,m)
R[1,1] <- 0.1
R[2,2] <- 0.01
eps_y1 <- rnorm(Tt,mean = 0, sd = 0.1)
eps_y2 <- rnorm(Tt,mean = 0, sd = 0.01)
W <- matrix(0,Tt,p)
W[,1] <- eps_y1
W[,2] <- eps_y2

for (i in 1:p){
  Z[i,i] <- 1
}
Tm[1,1] <- 1
Tm[2,1] <- 0.1

for (t in 2:Tt){
  y[t,] <- Tm %*% y[t-1,] + W[t,]
  
}

######### Estimation
  
#observation with NAs
y2 <- y
y2[97,1] <- NA

update_fn <- function(theta) {
  
  Z <- matrix(c(1, 0, 0, 1), 2, 2)
  R <- matrix(0,2,2)
  R[1,1] <- 0.1
  R[2,2] <- 0.01
  H <- matrix(0,2,2)
  # add third dimension so we have p x n x 1, p x m x 1, p x p x 1 arrays
  dim(Z)[3] <- dim(R)[3] <- dim(H)[3] <- 1
  Tm <- matrix(0,2,2)
  Tm[1,1] <- theta
  Tm[2,1] <- 0.1
  dim(Z)[3] <- 1
  dim(R)[3] <- 1
  dim(H)[3] <- 1
  dim(Tm)[3] <- 1
  P1 <- matrix(10,m,m)
  a1 <- c(0,0)
  list(Z = Z, R = R,T = Tm, H = H,P1=P1, a1=a1)
}

init_theta <- 1

comps <- update_fn(init_theta)

prior_fn <- function(theta) {
  sum(dnorm(theta, 1, 10, log = TRUE))  # log-jacobian
}

mod <- ssm_mlg(y2,   Z = comps$Z, D = comps$D, H = comps$H, T=comps$T, R=comps$R,
               a1 = comps$a1, P1=comps$P1,state_names = c("y1","y2"),update_fn = update_fn,
               init_theta = init_theta,
               prior_fn = prior_fn)
kf <- kfilter(mod)
fit <- run_mcmc(mod, iter = 1000, thin=2, burnin = 100,seed =5,verbose = TRUE)
Session Info

UBSAN error

OMP: Warning #96: Cannot form a team with 24 threads, using 2 instead.
OMP: Hint: Consider unsetting KMP_ALL_THREADS and OMP_THREAD_LIMIT (if either is set).
In file included from growth_model_functions.cpp:1:
In file included from /data/gannet/ripley/R/test-clang/RcppArmadillo/include/RcppArmadillo.h:31:
In file included from /data/gannet/ripley/R/test-clang/RcppArmadillo/include/RcppArmadilloForward.h:26:
In file included from /data/gannet/ripley/R/test-clang/Rcpp/include/RcppCommon.h:70:
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/utils/tinyformat.h:1046:32: warning: no newline at end of file [-Wnewline-eof]
#endif // TINYFORMAT_H_INCLUDED
                               ^
1 warning generated.
function_pointers.h:99:12: runtime error: call to function a1_fn(arma::Col<double> const&, arma::Col<double> const&) through pointer to incorrect function type 'arma::Col<double> (*)(const arma::Col<double> &, const arma::Col<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:7: note: a1_fn(arma::Col<double> const&, arma::Col<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:99:12 in 
function_pointers.h:114:12: runtime error: call to function P1_fn(arma::Col<double> const&, arma::Col<double> const&) through pointer to incorrect function type 'arma::Mat<double> (*)(const arma::Col<double> &, const arma::Col<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:16: note: P1_fn(arma::Col<double> const&, arma::Col<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:114:12 in 
function_pointers.h:65:12: runtime error: call to function Z_gn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) through pointer to incorrect function type 'arma::Mat<double> (*)(unsigned int, const arma::Col<double> &, const arma::Col<double> &, const arma::Col<double> &, const arma::Mat<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:55: note: Z_gn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:65:12 in 
function_pointers.h:81:12: runtime error: call to function H_fn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) through pointer to incorrect function type 'arma::Mat<double> (*)(unsigned int, const arma::Col<double> &, const arma::Col<double> &, const arma::Mat<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:27: note: H_fn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:81:12 in 
function_pointers.h:49:12: runtime error: call to function Z_fn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) through pointer to incorrect function type 'arma::Col<double> (*)(unsigned int, const arma::Col<double> &, const arma::Col<double> &, const arma::Col<double> &, const arma::Mat<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:47: note: Z_fn(unsigned int, arma::Col<double> const&, arma::Col<double> const&, arma::Col<double> const&, arma::Mat<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:49:12 in 
function_pointers.h:33:12: runtime error: call to function log_prior_pdf(arma::Col<double> const&) through pointer to incorrect function type 'double (*)(const arma::Col<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:102: note: log_prior_pdf(arma::Col<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:33:12 in 
function_pointers.h:33:12: runtime error: call to function log_prior_pdf(arma::Col<double> const&) through pointer to incorrect function type 'double (*)(const arma::Col<double> &)'
/tmp/RtmprpYU6Z/sourceCpp-x86_64-pc-linux-gnu-0.12.11/sourcecpp_a8d276dc88da/growth_model_functions.cpp:102: note: log_prior_pdf(arma::Col<double> const&) defined here
SUMMARY: AddressSanitizer: undefined-behavior function_pointers.h:33:12 in 

mv_gssm missing from package

Hi,

I just installed bssm from CRAN and the function mv_gssm is missing, despite being in the documentation. Should it be there?

I see it is in models.R, should I just download from github?

Thanks

Adam

Cube index out of bound due to implicit casting of bool

From Travis (https://travis-ci.org/helske/bssm/jobs/160477908):

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")

> ### Name: ng_bsm

> ### Title: Non-Gaussian Basic Structural (Time Series) Model

> ### Aliases: ng_bsm

> 

> ### ** Examples

> 

> model <- ng_bsm(Seatbelts[, "VanKilled"], distribution = "poisson",

+   sd_level = halfnormal(0.01, 1), 

+   sd_seasonal = halfnormal(0.01, 1), slope = FALSE,

+   beta = normal(0, 0, 10),

+   xreg = Seatbelts[, "law"])

> 

> set.seed(123)

> mcmc_out <- run_mcmc(model, n_iter = 5000, nsim = 10)

error: Cube::operator(): index out of bounds

Error in eval(substitute(expr), envir, enclos) : 

  Cube::operator(): index out of bounds

Calls: run_mcmc -> run_mcmc.ng_bsm -> ng_bsm_run_mcmc -> .Call

Weird thing is that this error happens only on Travis, not on my windows computers nor in a linux server where I have access. Valgrind results come clean as well. Next step address sanitizers?

add constant back to bootstrap filter likelihood

Just modified bootstrap filter for increased speed by dropping constants in density functions, but I should add the constant to final log-likelihood just for completeness. Low priority issue though...

Add progress bar

It would be nice to have a progress bar or something for long MCMC runs. And I think it would also allow automatic way to interrupt C++ code without terminating R.

Non-gaussian linear state model with continuous time

Hi, I have a model that is a linear state model with non-gaussian response with continuous time.

I found that function 'ssm_nlg' seems to be able to handle linear (also, non-linear) gaussian with continuous time, while 'ssm_mng' can handle linear state model with non-gaussian response but no continuous time.

I'm wondering whether it is possible to build a linear state model with non-gaussian response with continuous time. Is it possible to write C++ snippets in 'ssm_mng' function since the only difference is that I want continuous time.

Thank you!

trouble running filters / smoothers

Hello, thanks again for your work.

I am having trouble running filters and smoothers. Example:

model <- bsm_ng(Seatbelts[, "VanKilled"], distribution = "poisson",
  sd_level = halfnormal(0.01, 1),
  sd_seasonal = halfnormal(0.01, 1),
  beta = normal(0, 0, 10),
  xreg = Seatbelts[, "law"])

filtered <- ekf(model)

gives error:

Error in ekf_nlg(t(model$y), model$Z, model$H, model$T, model$R, model$Z_gn,  : 
  Not compatible with requested type: [type=NULL; target=double].

Also occurs with ekf_smoother, ukf, etc.

Thank you!

Installation from Github fails on Mac

Running os 10.12.6, latest Xcode, R3.4.2. This problem has arisen in other packages:

devtools::install_github("helske/bssm")
Downloading GitHub repo helske/bssm@master
from URL https://api.github.com/repos/helske/bssm/zipball/master
Installing bssm
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ
--no-save --no-restore --quiet CMD INSTALL
'/private/var/folders/xw/mcmsdzzx4mgbttplylgs7ysh0000gp/T/RtmpoGVsEQ/devtoolsf4114663d465/helske-bssm-c86446d'
--library='/Users/rmendels/Library/R/3.4/library' --install-tests

  • installing source package ‘bssm’ ...
    ** libs
    clang++ -std=gnu++11 -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I"/Users/rmendels/Library/R/3.4/library/BH/include" -I"/Users/rmendels/Library/R/3.4/library/Rcpp/include" -I"/Users/rmendels/Library/R/3.4/library/RcppArmadillo/include" -I"/Users/rmendels/Library/R/3.4/library/ramcmc/include" -I"/Users/rmendels/Library/R/3.4/library/sitmo/include" -I/usr/local/include -fopenmp -fPIC -Wall -g -O2 -c R_approx.cpp -o R_approx.o
    clang: error: unsupported option '-fopenmp'
    make: *** [R_approx.o] Error 1
    ERROR: compilation failed for package ‘bssm’
  • removing ‘/Users/rmendels/Library/R/3.4/library/bssm’
  • restoring previous ‘/Users/rmendels/Library/R/3.4/library/bssm’
    Installation failed: Command failed (1)

Thanks

NA values in MCMC for Gaussian model

y <- log(AirPassengers)

model <- bsm(y, sd_level = uniform(0.1, 0, 1000), 
  sd_slope = uniform(0.1, 0, 1000), sd_y = uniform(0.1, 0, 1000),
  sd_seasonal = uniform(0.1, 0, 1000), period = 12)

out <- run_mcmc(model, n_iter = 2e5)
out

There seems to be a bug in the sampling of states as out$alpha is sometimes full of NaN:s...

can't use all fixed sd params in bsm_ng

Hello again,

I was trying to set fixed params for sd_level and sd_seasonal instead of priors, but it doesn't look like fixing both is allowed. (I was hoping to just solve for beta). For example:

model <- bsm_ng(Seatbelts[, "VanKilled"], distribution = "poisson",
  sd_level = halfnormal(0.01, 1),
  sd_seasonal = halfnormal(0.01, 1),
  beta = normal(0, 0, 10),
  xreg = Seatbelts[, "law"])

mcmc_bsm <- run_mcmc(model, iter = 5000, nsim=10)

works but:

model <- bsm_ng(Seatbelts[, "VanKilled"], distribution = "poisson",
  sd_level = 0.01,
  sd_seasonal = 0.01,
  beta = normal(0, 0, 10),
  xreg = Seatbelts[, "law"])

mcmc_bsm <- run_mcmc(model, iter = 5000, nsim=10)

gives error:

Error in nongaussian_da_mcmc(model, output_type, nsim, iter, burnin, thin, : Col::subvec(): indices out of bounds or incorrectly used

Thanks!

Installation from Github fails on Mac: clang: error: unsupported option '-fopenmp'

With R 3.6.0 under macOS 10.14.4, 'devtools::install_github("helske/bssm")' produced the following:

Downloading GitHub repo helske/bssm@master
✔ checking for file ‘/private/var/folders/mh/mrm_14nx19g13lsnj9zmvwjr0000gn/T/RtmpI8M1Qy/remotes57ff163932df/helske-bssm-70458ec/DESCRIPTION’ ...
─ preparing ‘bssm’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘bssm_0.1.7.tar.gz’

  • installing source package ‘bssm’ ...
    ** using staged installation
    ** libs
    clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/Rcpp/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/RcppArmadillo/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/ramcmc/include" -I"/Library/Frameworks/R.framework/Versions/3.6/Resources/library/sitmo/include" -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fopenmp -fPIC -Wall -g -O2 -c R_approx.cpp -o R_approx.o

clang: error: unsupported option '-fopenmp'

make: *** [R_approx.o] Error 1

NOTE: This seemed quit similar to "Installation from Github fails on Mac #16", closed on 2017-11-13 with a solution posted. However, that solution seems out of date, and I'm afraid to try it for fear I might create more problems for myself than I solve. Thanks.

Psi-filter fails with semidefinite matrices

Psi-filter should be fixed to handle case where the variance of the conditional smoothed state is zero. This happens for example with constant drift term of seasonal component.

General non-Gaussian model defined with C++ snippets

Like in case of non-linear models and linear-gaussian models, it would be nice to have an option to define non-Gaussian (poisson, binomial, SV model would work similarly) model via C++ snippets as well. Should be really straightforward to do by looking how those other models are defined, I just don't have time.

Dummy seasonal to trigonometric seasonal

Have to think about changing the dummy seasonal representation to trigonometric version, this makes few things easier as the noise covariance matrix is non-singular (unless one wants to use deterministic trend).

UBSAN on CRAN: nan is outside the range of representable values of type 'unsigned int'


R Under development (unstable) (2017-07-12 r72910) -- "Unsuffered Consequences"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("testthat")
> test_check("bssm")
Loading required package: bssm
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/internal/caster.h:30:25: runtime error: value nan is outside the range of representable values of type 'unsigned int'
SUMMARY: AddressSanitizer: undefined-behavior /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/internal/caster.h:30:25 in 
OMP: Warning #96: Cannot form a team with 24 threads, using 2 instead.
OMP: Hint: Consider unsetting KMP_ALL_THREADS and OMP_THREAD_LIMIT (if either is set).
testthat results ================================================================
OK: 127 SKIPPED: 9 FAILED: 0
> 
> proc.time()
   user  system elapsed 
 16.443   0.740  18.432 

Question about nlg_ar_exp

I checked ‘cpp_example_model’ function and found that ‘nlg_ar_exp’ does not apply to my model, I wonder whether my model is supported by bssm package , my model is as follows:

Q_{t}=exp(y_{t})+exp(y_{t-1})+exp(y_{t-2})
y_{t}=tau_{t}+c_{t}
tau_{t}=tau_{t-1}+d_{t-1}+eta_{t}
d_{t}=d_{t-1}+V_{t}
c_{t}=a_{1}c_{t-1}+a_{2}c_{t-2}+epsilon_{t}

Many thanks!

Error with Solaris

From CRAN:

* installing to library ‘/home/ripley/R/Lib32’
* installing *source* package ‘bssm’ ...
** package ‘bssm’ successfully unpacked and MD5 sums checked
** libs
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_approx.cpp -o R_approx.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_auxiliary_filter.cpp -o R_auxiliary_filter.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_bootstrap_filter.cpp -o R_bootstrap_filter.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_ekf.cpp -o R_ekf.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_ekpf.cpp -o R_ekpf.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_importance_sample.cpp -o R_importance_sample.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_kfilter.cpp -o R_kfilter.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_loglik.cpp -o R_loglik.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_mcmc.cpp -o R_mcmc.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_predict.cpp -o R_predict.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_psi.cpp -o R_psi.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_smoother.cpp -o R_smoother.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c R_ukf.cpp -o R_ukf.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c RcppExports.cpp -o RcppExports.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c conditional_dist.cpp -o conditional_dist.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c distr_consts.cpp -o distr_consts.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c dmvnorm.cpp -o dmvnorm.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c filter_smoother.cpp -o filter_smoother.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c interval.cpp -o interval.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c invlink.cpp -o invlink.o
/opt/csw//bin/g++ -std=gnu++11 -I/home/ripley/R/gcc/include -DNDEBUG  -I"/home/ripley/R/Lib32/BH/include" -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/RcppArmadillo/include" -I"/home/ripley/R/Lib32/ramcmc/include" -I/opt/csw/include -I/usr/local/include  -fopenmp -fPIC  -O2 -c mcmc.cpp -o mcmc.o
mcmc.cpp: In constructor ‘mcmc::mcmc(const uvec&, const mat&, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, double, double, const mat&, bool)’:
mcmc.cpp:23:47: error: call of overloaded ‘floor(unsigned int)’ is ambiguous
   n_samples(floor((n_iter - n_burnin) / n_thin)), 
                                               ^
In file included from /usr/include/math.h:15:0,
                 from /opt/csw/include/c++/5.2.0/cmath:44,
                 from /home/ripley/R/Lib32/Rcpp/include/Rcpp/platform/compiler.h:100,
                 from /home/ripley/R/Lib32/Rcpp/include/Rcpp/r/headers.h:48,
                 from /home/ripley/R/Lib32/Rcpp/include/RcppCommon.h:29,
                 from /home/ripley/R/Lib32/RcppArmadillo/include/RcppArmadilloForward.h:26,
                 from /home/ripley/R/Lib32/RcppArmadillo/include/RcppArmadillo.h:31,
                 from /home/ripley/R/Lib32/ramcmc/include/ramcmc.h:4,
                 from mcmc.cpp:4:
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:193:21: note: candidate: long double std::floor(long double)
  inline long double floor(long double __X) { return __floorl(__X); }
                     ^
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:164:15: note: candidate: float std::floor(float)
  inline float floor(float __X) { return __floorf(__X); }
               ^
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:77:15: note: candidate: double std::floor(double)
 extern double floor __P((double));
               ^
mcmc.cpp: In member function ‘void mcmc::state_posterior(T, unsigned int)’:
mcmc.cpp:53:56: error: call of overloaded ‘floor(unsigned int)’ is ambiguous
       unsigned thread_size = floor(n_stored / n_threads);
                                                        ^
In file included from /usr/include/math.h:15:0,
                 from /opt/csw/include/c++/5.2.0/cmath:44,
                 from /home/ripley/R/Lib32/Rcpp/include/Rcpp/platform/compiler.h:100,
                 from /home/ripley/R/Lib32/Rcpp/include/Rcpp/r/headers.h:48,
                 from /home/ripley/R/Lib32/Rcpp/include/RcppCommon.h:29,
                 from /home/ripley/R/Lib32/RcppArmadillo/include/RcppArmadilloForward.h:26,
                 from /home/ripley/R/Lib32/RcppArmadillo/include/RcppArmadillo.h:31,
                 from /home/ripley/R/Lib32/ramcmc/include/ramcmc.h:4,
                 from mcmc.cpp:4:
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:193:21: note: candidate: long double std::floor(long double)
  inline long double floor(long double __X) { return __floorl(__X); }
                     ^
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:164:15: note: candidate: float std::floor(float)
  inline float floor(float __X) { return __floorf(__X); }
               ^
/opt/csw/lib/gcc/i386-pc-solaris2.10/5.2.0/include-fixed/iso/math_iso.h:77:15: note: candidate: double std::floor(double)
 extern double floor __P((double));
               ^

Reparametrize theta for AR models for more efficient sampling

Sampling for ar1_lg and ar1_ng models is currently done in constrained (natural) space, but it would make sense to sample in transformed space for sigmas and phi as is already done for bsm models. This should improve the sampling efficiency especially if we are close to non-stationarity or zero variances.

To accomplish this, relatively small modifications to update_model and log_prior_pdf functions on the C++ side, and run_mcmc, ar1_lg, ar1_ng functions (and docs) as well as print and summary methods on the R side would be needed. This should be easy by just mimicking the bsm versions.

The same could be of course done to the SV model as well.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.