spinar's Introduction

spINAR

Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive (INAR) Models.

The package provides flexible simulation of INAR data using a general pmf to define the innovations' distribution. It allows for semiparametric and parametric estimation of INAR models and includes a small sample refinement for the semiparametric setting. Additionally, it provides different procedures to appropriately bootstrap INAR data.

Citation

Please cite the JOSS paper using the BibTeX entry

@article{faymonville2024spinar,
  doi = {10.21105/joss.05386},
  url = {https://doi.org/10.21105/joss.05386},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {97},
  pages = {5386},
  author = {Maxime Faymonville and Javiera Riffo and Jonas Rieger and Carsten Jentsch},
  title = {{spINAR}: An {R} Package for Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive ({INAR}) Models},
  journal = {Journal of Open Source Software}
} 

which can also be obtained by calling citation("spINAR").

References (related to the methodology)

  • Faymonville, M., Jentsch, C., Weiß, C.H. and Aleksandrov, B. (2022), “Semiparametric Estimation of INAR Models using Roughness Penalization”. Statistical Methods & Applications.
  • Jentsch, C. and Weiß, C.H. (2017), “Bootstrapping INAR Models”. Bernoulli 25(3), pp. 2359-2408.
  • Drost, F., Van den Akker, R. and Werker, B. (2009), “Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models”. Journal of the Royal Statistical Society. Series B 71(2), pp. 467-485.

Contribution

This R package is licensed under the GPLv3. For bug reports (lack of documentation, misleading or wrong documentation, unexpected behaviour, ...) and feature requests please use the issue tracker. Pull requests are welcome and will be included at the discretion of the author.

Installation

For installation of the development version use devtools:

devtools::install_github("MFaymon/spINAR")

Example

library(spINAR)

We simulate two datasets. The first consists of n = 100 observations from an INAR(1) model with coefficient alpha = 0.5 and Poi(1) distributed innovations. The second consists of n = 100 observations from an INAR(2) model with coefficients alpha_1 = 0.3, alpha_2 = 0.2 and innovations whose pmf on 0, 1, ..., 4 equals (0.3, 0.3, 0.2, 0.1, 0.1).

set.seed(1234)

dat1 <- spinar_sim(100, 1, alpha = 0.5, pmf = dpois(0:20,1))
dat2 <- spinar_sim(100, 2, alpha = c(0.3, 0.2), pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
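
For readers who want to see what such a simulation involves, here is a minimal base-R sketch of an INAR(1) recursion with binomial thinning and innovations drawn from a user-supplied pmf. It is not the package's implementation: the function name sim_inar1_sketch and its prerun default are assumptions made for illustration only.

```r
# Hypothetical helper (not part of spINAR): simulate an INAR(1) path
# X_t = alpha o X_{t-1} + eps_t, where "o" denotes binomial thinning.
sim_inar1_sketch <- function(n, alpha, pmf, prerun = 500) {
  stopifnot(n >= 1, alpha >= 0, alpha < 1, all(pmf >= 0), sum(pmf) > 0)
  total <- n + prerun
  # Innovations take the values 0, 1, ..., length(pmf) - 1;
  # sample() rescales prob, so a truncated pmf is renormalized.
  eps <- sample(seq_along(pmf) - 1, size = total, replace = TRUE, prob = pmf)
  x <- numeric(total)
  x[1] <- eps[1]
  for (t in seq_len(total)[-1]) {
    # Binomial thinning: each of the x[t - 1] units survives with prob alpha.
    x[t] <- rbinom(1, size = x[t - 1], prob = alpha) + eps[t]
  }
  # Drop the burn-in so the returned path is close to stationarity.
  x[(prerun + 1):total]
}

set.seed(1234)
x_sketch <- sim_inar1_sketch(100, alpha = 0.5, pmf = dpois(0:20, 1))
```

The burn-in is discarded because the recursion is started from a single innovation rather than from the stationary distribution.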

We estimate an INAR(1) model on the first dataset.

#semiparametrically
spinar_est(dat1, 1)

#parametrically (moment estimation, true Poisson assumption)
spinar_est_param(dat1, 1, "mom", "poi")
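
As background for the moment-based option under a Poisson assumption, the sketch below uses two standard facts about a stationary Poisson INAR(1) process: the lag-1 autocorrelation equals alpha, and the mean equals lambda / (1 - alpha). It is written in base R and is not the package's code; the helper name mom_poi_inar1_sketch and the toy series are assumptions for illustration.

```r
# Hypothetical helper (not spINAR code): moment estimation for a
# Poisson INAR(1) model.
mom_poi_inar1_sketch <- function(x) {
  stopifnot(length(x) >= 3, all(x >= 0))
  # For INAR(1), the lag-1 autocorrelation equals alpha.
  alpha_hat <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
  # The stationary mean is lambda / (1 - alpha), so lambda = mean * (1 - alpha).
  lambda_hat <- mean(x) * (1 - alpha_hat)
  c(alpha = alpha_hat, lambda = lambda_hat)
}

# Toy Poisson INAR(1) series with alpha = 0.5 and lambda = 1.
set.seed(1)
n <- 2000
x <- numeric(n)
x[1] <- rpois(1, 2)
for (t in 2:n) x[t] <- rbinom(1, x[t - 1], 0.5) + rpois(1, 1)
est <- mom_poi_inar1_sketch(x)
```

Note that the sample autocorrelation is not constrained to [0, 1), so a production estimator would need to handle estimates outside that range.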

We estimate an INAR(2) model on the second dataset.

#semiparametrically
spinar_est(dat2, 2)

For small samples, it can be beneficial to apply a penalized version of the semiparametric estimation. For illustration, we restrict ourselves to the first 50 observations of the first dataset and apply semiparametric, parametric and penalized semiparametric estimation. We choose a small L2 penalization, as this proved most beneficial in the simulation study in Faymonville et al. (2022) (see references). Alternatively, one could use the spinar_penal_val function, which validates the two penalization parameters.

dat1_50 <- dat1[1:50]
spinar_est(dat1_50, 1)
spinar_est_param(dat1_50, 1, "mom", "poi")
spinar_penal(dat1_50, 1, penal1 = 0, penal2 = 0.1)

Finally, we bootstrap INAR(1) data based on the first dataset. We perform a semiparametric and a parametric INAR bootstrap (moment estimation, true Poisson assumption).

spinar_boot(dat1, 1, 500, setting = "sp")
spinar_boot(dat1, 1, 500, setting = "p", type = "mom", distr = "poi")
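
To illustrate the idea behind a parametric INAR bootstrap, the following base-R sketch fits a Poisson INAR(1) model by moments, simulates B series from the fitted model, and re-estimates alpha on each. It is a sketch under stated assumptions, not the package's procedure: the helpers sim_poi_inar1 and est_mom, the clamping of alpha, and the percentile interval are all choices made here for illustration.

```r
# Hypothetical helpers (not spINAR code) for a parametric bootstrap sketch.
sim_poi_inar1 <- function(n, alpha, lambda, prerun = 200) {
  total <- n + prerun
  x <- numeric(total)
  x[1] <- rpois(1, lambda / (1 - alpha))
  for (t in 2:total) x[t] <- rbinom(1, x[t - 1], alpha) + rpois(1, lambda)
  x[(prerun + 1):total]
}

est_mom <- function(x) {
  a <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
  # A constant series has no defined autocorrelation; fall back to 0.
  if (!is.finite(a)) a <- 0
  # Clamp to [0, 0.99] so the fitted model can be simulated from.
  a <- min(max(a, 0), 0.99)
  c(alpha = a, lambda = mean(x) * (1 - a))
}

set.seed(42)
x <- sim_poi_inar1(100, alpha = 0.5, lambda = 1)
fit <- est_mom(x)

# Generate B bootstrap series from the fitted model and re-estimate alpha.
B <- 200
boot_alpha <- replicate(B, {
  x_star <- sim_poi_inar1(length(x), fit[["alpha"]], fit[["lambda"]])
  est_mom(x_star)[["alpha"]]
})

# Percentile interval for alpha from the bootstrap replicates.
ci_alpha <- quantile(boot_alpha, c(0.025, 0.975), na.rm = TRUE)
```

A semiparametric variant would replace the Poisson innovations by draws from an estimated innovation pmf while keeping the same resampling loop.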

Application

The file vignette.md provides reproduced results from the literature for each provided functionality of the spINAR package.

Outlook

A possible extension of the spINAR package is to cover not only INAR models but also GINAR (generalized INAR) models, see Latour (1997). This model class is not restricted to binomial thinning but also allows for other thinning operations, e.g. thinning using geometrically distributed random variables.
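
The difference between the two thinning operations can be sketched in base R. Both operators below have conditional mean alpha * x, but geometric thinning sums x i.i.d. geometric counts and is therefore more dispersed and not bounded by x. The function names thin_binomial and thin_geometric are hypothetical and not part of spINAR.

```r
# Hypothetical helpers (not spINAR code): two thinning operators,
# each with conditional mean alpha * x.
thin_binomial <- function(x, alpha) rbinom(1, size = x, prob = alpha)

thin_geometric <- function(x, alpha) {
  # Sum of x i.i.d. geometric counts on {0, 1, ...} with mean alpha each;
  # rgeom(prob = p) has mean (1 - p) / p, so p = 1 / (1 + alpha).
  sum(rgeom(x, prob = 1 / (1 + alpha)))
}

set.seed(7)
draws_bin <- replicate(5000, thin_binomial(10, 0.4))
draws_geo <- replicate(5000, thin_geometric(10, 0.4))

# Both means should be near 10 * 0.4 = 4.
c(mean(draws_bin), mean(draws_geo))
# Theoretical variances: 10 * 0.4 * 0.6 = 2.4 versus 10 * 0.4 * 1.4 = 5.6.
c(var(draws_bin), var(draws_geo))
```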

spinar's Issues

Review

Dear Authors,

I reviewed the JOSS submission (manuscript and the software). The paper is well-written, and the software structure is organized and has a clear purpose.

  1. Regarding software documentation, I suggest including a reproducible result from the literature for each provided functionality. For example, if not too computationally extensive, the spINAR results for Poi-INAR(1) or NB-INAR(1) from Table 1 or 2 in Jentsch and Weiß (2017), or a data set from a real-world application, like the iceberg stocks from Jung and Tremayne (2010), the Cascade-Turnoverdata or the monthly demand of different car spare parts (time series of the car part 2404) from Snyder (2002), as analyzed in Faymonville et al. (2022). This would enable the user to validate the results from the R package against the published literature. If the reproducible examples are too extensive for the examples section of each documented function, you could provide them within a short vignette highlighting the functionality, similar to the README.md of the GitHub repository. Related to this, you could also include a references section in the R documentation of each function.

  2. Regarding the software package functionality, I performed tests on the provided functions and found that for some parameters (settings might be artificial), a check needs to be included to intercept these incorrect entries.

library(spINAR)

  • #spinar_boot
    dat2 <- spinar_sim(n = 200, p = 2, alpha = c(0.2, 0.3), pmf = dgeom(0:60, 0.5))
    spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = "mom", distr = "geo", level=0)
    spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = "test", distr = "geo")
    spinar_boot(x = dat2, p = 2, B = 50, setting = "p", type = NA, distr = "geo")
    spinar_boot(x = dat2, p = NA, B = 50, setting = "p", distr = "nb")

  • #spinar_est
    spinar_est(x = dat2, p = NA)

  • #spinar_est_param
    dat1 <- spinar_sim(n = 200, p = 1, alpha = 0.5, pmf = dpois(0:20, 1))
    spinar_est_param(x = dat1, p = NA, type = "mom", distr = "poi")
    spinar_est_param(x = dat1, p = NA, type = "mom", distr = "geo")
    spinar_est_param(x = dat1, p = NA, type = "mom", distr = "nb")
    spinar_est_param(x = dat1, p = NA, type = "ml", distr = "poi")
    spinar_est_param(x = dat1, p = NA, type = "ml", distr = "geo")
    spinar_est_param(x = dat1, p = NA, type = "ml", distr = "nb")

  • #spinar_penal
    dat1 <- spinar_sim(n = 50, p = 1, alpha = 0.5, pmf = c(0.3, 0.25, 0.2, 0.15, 0.1))
    spinar_penal(x = dat1, p = NA, penal1 = 0, penal2 = 0)

  • #spinar_sim
    spinar_sim(n = 100, p = NA, alpha = .3, pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
    spinar_sim(n = 100, p = NA, alpha = c(0.2, 0.3), pmf = dpois(0:20,1))
    spinar_sim(n = NA, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1))
    spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=0)
    spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=1)
    spinar_sim(n = 1, p = 2, alpha = c(0.2, 0.3), pmf = dpois(0:20,1), prerun=2)

  • #spinar_penal_val
    dat1 <- spinar_sim(n = 50, p = 1, alpha = 0.5, pmf = c(0.3, 0.3, 0.2, 0.1, 0.1))
    spinar_penal_val(x = dat1, p = 1, validation = NA)
    spinar_penal_val(x = dat1, p = 1, validation = NULL)
    spinar_penal_val(x = dat1, p = NA, validation = FALSE, penal1=0, penal2=0)

Review

Hi there,

I reviewed your package/paper and genuinely enjoyed learning about the topic and testing the package. There are a few comments and questions I'd like to address regarding my review checklist:

  • Scholarly effort: The package is probably flagged for inspection by the editors due to <1000 LOC (according to codecov), but I see the effort in structuring the package and the rather complex theory. However, I was wondering if there had been previous work you relied on? I.e., an implementation of methods by Carsten Jentsch and Christian Weiß from their 2019 paper on the bootstrap procedure? Or an implementation by Drost et al.?

  • Performance: You state (around line 48 of the paper) that your main contribution is the efficient semiparametric estimation. It is not clear to me whether "efficient" refers to the properties of the estimators or the implementation of estimation.

  • Statement of Need: In my opinion the SoN in your paper and/or documentation would greatly benefit if you mentioned an application example. Did other authors maybe even discuss an application in which the semiparametric estimation is superior to a parametric approach (as that's your focus)?

  • Statement of Need / State of field: You mentioned that INAR processes are "clearly the most popular" among observation-driven count data models (around line 11 of the paper). On what basis do you make the statement? And a related question: Is there actually no implementation in R dealing with INAR models, at least for parametric estimation? That would be in contrast with the above popularity statement, wouldn't it? For example, the tscount package has been applied and cited quite a few times.

  • Community guidelines: JOSS suggests giving the target audience information on how to contribute and how to get support. I think it would be a great idea to include a note in your Repo.

  • Outlook: I am very curious about how the package will develop in the future. If you have any plans, maybe you could list them in the Repo's readme. My suggestion would be to cover INAR(p) processes of arbitrary order by implementing a general likelihood function and leave the choice to the users.

Looking forward to hearing from you.
All the best and keep up the good work!
