Semiparametric and parametric estimation and bootstrapping of integer-valued autoregressive (INAR) models.

License: GNU General Public License v3.0

spINAR

Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive (INAR) Models.

The package provides flexible simulation of INAR data using a general pmf to define the innovations' distribution. It allows for semiparametric and parametric estimation of INAR models and includes a small sample refinement for the semiparametric setting. Additionally, it provides different procedures to appropriately bootstrap INAR data.

Citation

Please cite the JOSS paper using the BibTeX entry

@article{faymonville2024spinar,
  doi = {10.21105/joss.05386},
  url = {https://doi.org/10.21105/joss.05386},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {97},
  pages = {5386},
  author = {Maxime Faymonville and Javiera Riffo and Jonas Rieger and Carsten Jentsch},
  title = {{spINAR}: An {R} Package for Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive ({INAR}) Models},
  journal = {Journal of Open Source Software}
}

which is also obtained by the call citation("spINAR").

References (related to the methodology)

Faymonville, M., Jentsch, C., Weiß, C.H. and Aleksandrov, B. (2022). "Semiparametric Estimation of INAR Models using Roughness Penalization". Statistical Methods & Applications.
Jentsch, C. and Weiß, C.H. (2017). "Bootstrapping INAR Models". Bernoulli 25(3), pp. 2359-2408.
Drost, F., Van den Akker, R. and Werker, B. (2009). "Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models". Journal of the Royal Statistical Society. Series B 71(2), pp. 467-485.

Contribution

This R package is licensed under the GPLv3. For bug reports (lack of documentation, misleading or wrong documentation, unexpected behaviour, ...) and feature requests please use the issue tracker. Pull requests are welcome and will be included at the discretion of the author.

Installation

For installation of the development version use devtools:

devtools::install_github("MFaymon/spINAR")

Example

library(spINAR)

We simulate two datasets. The first consists of n = 100 observations from an INAR(1) model with coefficient alpha = 0.5 and Poi(1) distributed innovations. The second consists of n = 100 observations from an INAR(2) model with coefficients alpha_1 = 0.3, alpha_2 = 0.2 and an innovation pmf equal to (0.3, 0.3, 0.2, 0.1, 0.1) on the values 0 to 4.

set.seed(1234)

dat1 <- spinar_sim(100, 1, alpha = 0.5, pmf = dpois(0:20,1))
dat2 <- spinar_sim(100, 2, alpha = c(0.3, 0.2), pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
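
As a rough illustration of what such a simulation involves, the following base R sketch generates an INAR(1) path by binomial thinning with innovations drawn from a general pmf. The helper sim_inar1 and its prerun default are assumptions made for this example; it is not the code behind spinar_sim.

```r
# Hypothetical helper for illustration only; not part of spINAR.
sim_inar1 <- function(n, alpha, pmf, prerun = 500) {
  stopifnot(n >= 1, alpha >= 0, alpha < 1, all(pmf >= 0))
  total <- n + prerun
  # Innovations take the values 0, 1, ..., length(pmf) - 1 with probabilities pmf
  eps <- sample(0:(length(pmf) - 1), size = total, replace = TRUE, prob = pmf)
  x <- numeric(total)
  x[1] <- eps[1]
  for (t in seq_len(total)[-1]) {
    # Binomial thinning: each of the x[t - 1] units survives with probability alpha
    x[t] <- rbinom(1, size = x[t - 1], prob = alpha) + eps[t]
  }
  # Drop the burn-in so the returned series is close to stationarity
  x[(prerun + 1):total]
}

set.seed(1234)
x <- sim_inar1(100, alpha = 0.5, pmf = dpois(0:20, 1))
```

Note that sample() rescales the prob argument, so a truncated pmf such as dpois(0:20, 1) is implicitly renormalized in this sketch.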

We estimate an INAR(1) model on the first dataset.

#semiparametrically
spinar_est(dat1, 1)

#parametrically (moment estimation, true Poisson assumption)
spinar_est_param(dat1, 1, "mom", "poi")
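
For intuition about the parametric moment estimation, recall that a Poisson INAR(1) process has lag-1 autocorrelation alpha and mean lambda / (1 - alpha). A minimal base R sketch of the resulting moment estimators follows; mom_poi_inar1 is a hypothetical helper, and the package's own implementation may differ in its details.

```r
# Hypothetical helper: method-of-moments estimates for a Poisson INAR(1) model.
mom_poi_inar1 <- function(x) {
  # The lag-1 sample autocorrelation estimates alpha
  alpha_hat <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
  # Clamp to [0, 1) so the estimate is a valid thinning probability
  alpha_hat <- min(max(alpha_hat, 0), 1 - 1e-6)
  # The mean of the process is lambda / (1 - alpha)
  lambda_hat <- mean(x) * (1 - alpha_hat)
  c(alpha = alpha_hat, lambda = lambda_hat)
}

# Simulate a Poisson INAR(1) path with alpha = 0.5 and lambda = 1
set.seed(1234)
n <- 2000
x <- numeric(n)
x[1] <- rpois(1, 2)  # stationary marginal is Poi(lambda / (1 - alpha)) = Poi(2)
for (t in 2:n) x[t] <- rbinom(1, size = x[t - 1], prob = 0.5) + rpois(1, 1)

est <- mom_poi_inar1(x)
```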

We estimate an INAR(2) model on the second dataset.

#semiparametrically
spinar_est(dat2, 2)

For small samples, it can be beneficial to apply a penalized version of the semiparametric estimation. For illustration, we restrict ourselves to the first 50 observations of the first dataset and apply semiparametric, parametric and penalized semiparametric estimation. We choose a small L2 penalization, as this proved most beneficial in the simulation study in Faymonville et al. (2022) (see references). Alternatively, one could use the spinar_penal_val function, which validates the two penalization parameters.

dat1_50 <- dat1[1:50]
spinar_est(dat1_50, 1)
spinar_est_param(dat1_50, 1, "mom", "poi")
spinar_penal(dat1_50, 1, penal1 = 0, penal2 = 0.1)

Finally, we bootstrap INAR(1) data based on the first dataset. We perform a semiparametric and a parametric INAR bootstrap (moment estimation, true Poisson assumption).

spinar_boot(dat1, 1, 500, setting = "sp")
spinar_boot(dat1, 1, 500, setting = "p", type = "mom", distr = "poi")
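
The idea behind a parametric INAR bootstrap can be sketched in base R: fit the model, simulate B new series from the fitted model, and re-estimate on each of them. The helpers below are hypothetical and only cover the Poisson INAR(1) case with moment estimation; they are not the spinar_boot implementation, and the percentile interval is just one possible way to summarize the replicates.

```r
# Hypothetical helpers for illustration only; not the spINAR implementation.
sim_poi_inar1 <- function(n, alpha, lambda) {
  x <- numeric(n)
  x[1] <- rpois(1, lambda / (1 - alpha))  # draw from the stationary marginal
  for (t in seq_len(n)[-1]) {
    x[t] <- rbinom(1, size = x[t - 1], prob = alpha) + rpois(1, lambda)
  }
  x
}

mom_poi_inar1 <- function(x) {
  a <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
  if (!is.finite(a)) a <- 0           # constant series: no autocorrelation
  a <- min(max(a, 0), 0.99)           # keep alpha inside [0, 1)
  c(alpha = a, lambda = mean(x) * (1 - a))
}

set.seed(1234)
x <- sim_poi_inar1(100, alpha = 0.5, lambda = 1)
fit <- mom_poi_inar1(x)

# Parametric bootstrap: resimulate from the fitted model and re-estimate alpha
B <- 500
boot_alpha <- replicate(B, {
  x_star <- sim_poi_inar1(length(x), fit[["alpha"]], fit[["lambda"]])
  mom_poi_inar1(x_star)[["alpha"]]
})

# Percentile interval for alpha based on the bootstrap replicates
ci <- quantile(boot_alpha, c(0.025, 0.975))
```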

Application

The file vignette.md provides reproduced results from the literature for each provided functionality of the spINAR package.

Outlook

A possible extension of the spINAR package is to cover not only INAR models but also GINAR (generalized INAR) models, see Latour (1997). This model class is not limited to binomial thinning but also allows for other thinning operations, e.g. thinning based on geometrically distributed random variables.
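
To make the difference between thinning operations concrete, the base R sketch below contrasts binomial thinning with a geometric variant in which each of the x units contributes a geometric count with mean alpha. The function names and the chosen parametrization (rgeom with prob = 1 / (1 + alpha)) are assumptions for this example and not part of spINAR.

```r
# Hypothetical thinning operators for illustration only.
# Binomial thinning: sum of x Bernoulli(alpha) variables
thin_binom <- function(x, alpha) rbinom(1, size = x, prob = alpha)
# Geometric thinning: sum of x geometric counts, each with mean alpha
thin_geom <- function(x, alpha) sum(rgeom(x, prob = 1 / (1 + alpha)))

# First-order recursion with a pluggable thinning operator and Poisson innovations
sim_thinned_ar1 <- function(n, alpha, lambda, thin) {
  x <- numeric(n)
  x[1] <- rpois(1, lambda / (1 - alpha))  # start around the stationary mean
  for (t in seq_len(n)[-1]) {
    x[t] <- thin(x[t - 1], alpha) + rpois(1, lambda)
  }
  x
}

set.seed(1234)
x_binom <- sim_thinned_ar1(200, alpha = 0.5, lambda = 1, thin = thin_binom)
x_geom  <- sim_thinned_ar1(200, alpha = 0.5, lambda = 1, thin = thin_geom)

# Both operators have conditional mean alpha * x, here 0.5 * 10 = 5
draws_geom <- replicate(5000, thin_geom(10, 0.5))
```

Unlike binomial thinning, the geometric variant can return a value larger than x, which changes the dispersion of the resulting process while leaving its mean unchanged.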

spinar's Issues

Review

Dear Authors,

I reviewed the JOSS submission (manuscript and the software). The paper is well-written, and the software structure is organized and has a clear purpose.

Regarding software documentation, I suggest including a reproducible result from the literature for each provided functionality. For example, if not too computationally expensive, the spINAR results for Poi-INAR(1) or NB-INAR(1) from Table 1 or 2 in Jentsch and Weiß (2017), or a data set from a real-world application, such as the iceberg stocks from Jung and Tremayne (2010), the Cascade-Turnover data, or the monthly demand of different car spare parts (time series of car part 2404) from Snyder (2002), as analyzed in Faymonville et al. (2022). This would enable users to validate the results from the R package against the published literature. If the reproducible examples are too extensive for the examples section of each documented function, you could provide them in a short vignette highlighting the functionality, similar to the README.md of the GitHub repository. Related to this, you could also include a references section in the R documentation of each function.
Regarding the software package functionality, I tested the provided functions and found that for some parameter values (the settings may be artificial), checks need to be added to intercept incorrect entries:

library(spINAR)

#spinar_boot
dat2 <- spinar_sim(n = 200, p = 2, alpha = c(0.2, 0.3), pmf = dgeom(0:60, 0.5))
spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = "mom", distr = "geo", level=0)
spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = "test", distr = "geo")
spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = NA, distr = "geo")
spinar_boot(x = dat2, p = NA, B = 50, setting = "p", distr = "nb")
#spinar_est
spinar_est(x = dat2, p = NA)
#spinar_est_param
dat1 <- spinar_sim(n = 200, p = 1, alpha = 0.5, pmf = dpois(0:20, 1))
spinar_est_param(x = dat1, p = NA, type = "mom", distr = "poi")
spinar_est_param(x = dat1, p = NA, type = "mom", distr = "geo")
spinar_est_param(x = dat1, p = NA, type = "mom", distr = "nb")
spinar_est_param(x = dat1, p = NA, type = "ml", distr = "poi")
spinar_est_param(x = dat1, p = NA, type = "ml", distr = "geo")
spinar_est_param(x = dat1, p = NA, type = "ml", distr = "nb")
#spinar_penal
dat1 <- spinar_sim(n = 50, p = 1, alpha = 0.5, pmf = c(0.3, 0.25, 0.2, 0.15, 0.1))
spinar_penal(x = dat1, p = NA, penal1 = 0, penal2 = 0)
#spinar_sim
spinar_sim(n = 100, p = NA, alpha = .3, pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
spinar_sim(n = 100, p = NA, alpha = c(0.2, 0.3), pmf = dpois(0:20,1))
spinar_sim(n = NA, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1))
spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=0)
spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=1)
spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=2)
#spinar_penal_val
dat1 <- spinar_sim(n = 50, p = 1, alpha = 0.5, pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
spinar_penal_val(x = dat1, p = 1, validation = NA)
spinar_penal_val(x = dat1, p = 1, validation = NULL)
spinar_penal_val(x = dat1, p = NA, validation = FALSE, penal1=0, penal2=0)
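
One way such checks could look is sketched below in base R. The helper check_inar_args and its error messages are hypothetical and only illustrate the kind of argument validation meant here; the conditions on alpha (non-negative, summing to less than 1) are the usual stationarity requirement for INAR(p) models.

```r
# Hypothetical validator for illustration only; not part of spINAR.
check_inar_args <- function(n, p, alpha, pmf) {
  # TRUE only for a single, non-missing, positive whole number
  is_count <- function(v) {
    is.numeric(v) && length(v) == 1 && !is.na(v) && v >= 1 && v == round(v)
  }
  if (!is_count(n)) stop("'n' must be a positive integer")
  if (!is_count(p)) stop("'p' must be a positive integer")
  if (!is.numeric(alpha) || length(alpha) != p || anyNA(alpha) ||
      any(alpha < 0) || sum(alpha) >= 1) {
    stop("'alpha' must hold p non-negative coefficients summing to less than 1")
  }
  if (!is.numeric(pmf) || anyNA(pmf) || any(pmf < 0) || sum(pmf) <= 0) {
    stop("'pmf' must be a non-negative numeric vector with positive total mass")
  }
  invisible(TRUE)
}

# A valid call passes silently
ok <- check_inar_args(n = 100, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20, 1))

# An invalid call (p = NA) is intercepted with an informative message
msg <- tryCatch(
  check_inar_args(n = 100, p = NA, alpha = 0.3, pmf = c(0.3, 0.3, 0.2, 0.1, 0.1)),
  error = function(e) conditionMessage(e)
)
```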