lcbc-uio / galamm

An R package for estimating generalized additive mixed models with latent variables

Home Page: https://lcbc-uio.github.io/galamm/

License: GNU General Public License v3.0

Languages: R 45.14%, TeX 9.13%, Shell 0.06%, C++ 45.67%

Topics: structural-equation-models, generalized-additive-models, item-response-theory, hierarchical-models, latent-variable-models

galamm's Introduction

Generalized Additive Latent and Mixed Models


galamm estimates generalized additive latent and mixed models (GALAMMs). This is the first package implementing the model framework and the computational algorithms introduced in Sørensen, Fjell, and Walhovd (2023). It extends the GLLAMM framework for multilevel latent variable modeling detailed in Rabe-Hesketh, Skrondal, and Pickles (2004) and Skrondal and Rabe-Hesketh (2004), in particular by efficiently handling crossed random effects and by supporting semiparametric estimation.

What Can the Package Do?

Many applications, particularly in the social sciences, require modeling capabilities beyond what is easily supported and computationally feasible with popular R packages like mgcv (Wood 2017), lavaan (Rosseel 2012), lme4 (Bates et al. 2015), and OpenMx (Neale et al. 2016), as well as the Stata-based GLLAMM software (Rabe-Hesketh, Skrondal, and Pickles 2004, 2005). In particular, to make maximal use of the large datasets available today, it is typically necessary to combine tools from latent variable modeling, hierarchical modeling, and semiparametric estimation. While this is possible with Bayesian hierarchical models and tools like Stan, it requires considerable expertise and may be beyond the scope of a single data analysis project.

The goal of galamm is to enable estimation of models with an arbitrary number of grouping levels, both crossed and hierarchical, and any combination of features such as factor structures, mixed response types, and smooth terms, each of which is described in a dedicated vignette.

Random effects are defined using lme4 syntax, and the syntax for factor structures is close to that of PLmixed (Rockwood and Jeon 2019). However, for the types of models supported by both PLmixed and galamm, galamm is usually considerably faster. Smooth terms, as in generalized additive mixed models, use the same syntax as mgcv.

For most users, it should not be necessary to think about how the actual computations are performed, although they are detailed in the optimization vignette. In short, the core computations are done using sparse matrix methods supported by RcppEigen (Bates and Eddelbuettel 2013) and automatic differentiation using the C++ library autodiff (Leal 2018). Scaling of the algorithm is investigated further in the vignette on computational scaling.

Where Do I Start?

To get started, take a look at the introductory vignette.

Installation

Install the package from CRAN using

install.packages("galamm")

You can install the development version of galamm from GitHub with:

# install.packages("remotes")
remotes::install_github("LCBC-UiO/galamm")

Examples

library(galamm)

Mixed Response Model

The dataframe mresp contains simulated data with mixed response types.

head(mresp)
#>   id         x          y itemgroup
#> 1  1 0.8638214  0.2866329         a
#> 2  1 0.7676133  2.5647490         a
#> 3  1 0.8812059  1.0000000         b
#> 4  1 0.2239725  1.0000000         b
#> 5  2 0.7215696 -0.4721698         a
#> 6  2 0.6924851  1.1750286         a

Responses in rows with itemgroup = "a" are normally distributed while those in rows with itemgroup = "b" are binomially distributed. For a given subject, identified by the id variable, both responses are associated with the same underlying latent variable. We hence need to model this process jointly, and the model is set up as follows:

mixed_resp <- galamm(
  formula = y ~ x + (0 + loading | id),
  data = mresp,
  family = c(gaussian, binomial),
  family_mapping = ifelse(mresp$itemgroup == "a", 1L, 2L),
  load.var = "itemgroup",
  lambda = matrix(c(1, NA), ncol = 1),
  factor = "loading"
)

The summary function gives some information about the model fit.

summary(mixed_resp)
#> GALAMM fit by maximum marginal likelihood.
#> Formula: y ~ x + (0 + loading | id)
#>    Data: mresp
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   9248.7   9280.2  -4619.3   3633.1     3995 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -3.5360 -0.7078  0.2156  0.6456  2.5978 
#> 
#> Lambda:
#>         loading      SE
#> lambda1   1.000       .
#> lambda2   1.095 0.09982
#> 
#> Random effects:
#>  Groups Name    Variance Std.Dev.
#>  id     loading 1.05     1.025   
#> Number of obs: 4000, groups:  id, 1000
#> 
#> Fixed effects:
#>             Estimate Std. Error z value  Pr(>|z|)
#> (Intercept)    0.041    0.05803  0.7065 4.799e-01
#> x              0.971    0.08594 11.2994 1.321e-29

Generalized Additive Mixed Model with Factor Structures

The dataframe cognition contains simulated data for which latent ability in three cognitive domains is measured across time. We focus on the first cognitive domain and estimate a smooth trajectory for how the latent ability depends on time.

We start by reducing the data.

dat <- subset(cognition, domain == 1)
dat$item <- factor(dat$item)

Next we define the matrix of factor loadings, where NA denotes unknown values to be estimated.

loading_matrix <- matrix(c(1, NA, NA), ncol = 1)

We then fit the model, which contains both a smooth term for the latent ability and random intercepts for subjects and for timepoints within subjects.

mod <- galamm(
  formula = y ~ 0 + item + sl(x, factor = "loading") +
    (0 + loading | id / timepoint),
  data = dat,
  load.var = "item",
  lambda = loading_matrix,
  factor = "loading"
)

We finally plot the estimated smooth term.

plot_smooth(mod)

How to cite this package

citation("galamm")
#> To cite package 'galamm' in publications use:
#> 
#>   Sørensen Ø, Walhovd K, Fjell A (2023). "Longitudinal Modeling of
#>   Age-Dependent Latent Traits with Generalized Additive Latent and
#>   Mixed Models." _Psychometrika_, *88*(2), 456-486.
#>   doi:10.1007/s11336-023-09910-z
#>   <https://doi.org/10.1007/s11336-023-09910-z>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Article{,
#>     title = {Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models},
#>     author = {{\O}ystein S{\o}rensen and Kristine B. Walhovd and Anders M. Fjell},
#>     journal = {Psychometrika},
#>     year = {2023},
#>     volume = {88},
#>     number = {2},
#>     pages = {456-486},
#>     doi = {10.1007/s11336-023-09910-z},
#>   }

Acknowledgement

Some parts of the code base for galamm have been derived from internal functions of the R packages gamm4 (authors: Simon Wood and Fabian Scheipl), lme4 (authors: Douglas Bates, Martin Maechler, Ben Bolker, and Steven Walker), and mgcv (author: Simon Wood), as well as the C++ library autodiff (author: Allan Leal). In accordance with the CRAN Repository Policy, all these authors are listed as contributors in the DESCRIPTION file. If you are among these authors and don't want to be listed as a contributor to this package, please let me know, and I will remove you.

Contributing

Contributions are very welcome, see CONTRIBUTING.md for general guidelines.

References

Bates, Douglas M., and Dirk Eddelbuettel. 2013. “Fast and Elegant Numerical Linear Algebra Using the RcppEigen Package.” Journal of Statistical Software 52 (February): 1–24. https://doi.org/10.18637/jss.v052.i05.

Bates, Douglas M., Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.

Leal, Allan M. M. 2018. “Autodiff, a Modern, Fast and Expressive C++ Library for Automatic Differentiation.”

Neale, Michael C., Michael D. Hunter, Joshua N. Pritikin, Mahsa Zahery, Timothy R. Brick, Robert M. Kirkpatrick, Ryne Estabrook, Timothy C. Bates, Hermine H. Maes, and Steven M. Boker. 2016. “OpenMx 2.0: Extended Structural Equation and Statistical Modeling.” Psychometrika 81 (2): 535–49. https://doi.org/10.1007/s11336-014-9435-8.

Rabe-Hesketh, Sophia, Anders Skrondal, and Andrew Pickles. 2004. “Generalized Multilevel Structural Equation Modeling.” Psychometrika 69 (2): 167–90. https://doi.org/10.1007/BF02295939.

———. 2005. “Maximum Likelihood Estimation of Limited and Discrete Dependent Variable Models with Nested Random Effects.” Journal of Econometrics 128 (2): 301–23. https://doi.org/10.1016/j.jeconom.2004.08.017.

Rockwood, Nicholas J., and Minjeong Jeon. 2019. “Estimating Complex Measurement and Growth Models Using the R Package PLmixed.” Multivariate Behavioral Research 54 (2): 288–306. https://doi.org/10.1080/00273171.2018.1516541.

Rosseel, Yves. 2012. “Lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software 48 (May): 1–36. https://doi.org/10.18637/jss.v048.i02.

Skrondal, Anders, and Sophia Rabe-Hesketh. 2004. Generalized Latent Variable Modeling. Interdisciplinary Statistics Series. Boca Raton, Florida: Chapman and Hall/CRC.

Sørensen, Øystein, Anders M. Fjell, and Kristine B. Walhovd. 2023. “Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models.” Psychometrika 88 (2): 456–86. https://doi.org/10.1007/s11336-023-09910-z.

Wood, Simon N. 2017. Generalized Additive Models: An Introduction with R. 2nd ed. Chapman and Hall/CRC.

galamm's People

Contributors

osorensen

Forkers

bbolker

galamm's Issues

Create examples

Create fast-running examples to have in the documentation for all functions.
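One possible pattern (a sketch; the roxygen block and dataset choice are illustrative assumptions): keep each documented example small enough to run in seconds, and wrap anything slower in \donttest{} so it is skipped by default example checks.

#' @examples
#' # Fast-running example on a small subset of the built-in data
#' dat <- subset(cognition, domain == 1 & item == 1)
#' mod <- galamm(y ~ s(x) + (1 | id), data = dat)
#' summary(mod)
#' \donttest{
#' # Slower example, not run by default checks
#' mod_full <- galamm(y ~ s(x) + (1 | id), data = cognition)
#' }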

Get rid of profile likelihood implementation

Since the profile likelihood (PL) method is slower and more limited than directly maximizing the Laplace approximate marginal likelihood, it is best to just remove it. This will also remove some dependencies.

Initial values for random effects

When fitting mixed response models, it seems challenging to end up at the right random effects. Investigate whether it's possible to solve this issue by first fitting the models separately, and then providing the random effects from these separate fits as initial values.

Mixed response models don't work with multiple trials

library(galamm)
dat <- subset(cognition, domain %in% 1:2)
dat$domain <- factor(dat$domain)
dat$item <- factor(paste0(dat$domain, dat$item))
family <- c(gaussian, binomial)
family_mapping <- ifelse(dat$domain == 1, 1L, 2L)
lambda <- matrix(c(1, NA, NA, 0, 0,
                   0, 0, 0, 1, NA), ncol = 2)

mod <- galamm(
  formula = y ~ item + (0 + loading1 + loading2 | id),
  weights = NULL,
  data = dat,
  family = family,
  family_mapping = family_mapping,
  load.var = "item",
  lambda = list(lambda),
  factor = list(c("loading1", "loading2")),
  start = NULL,
  control = galamm_control(optim_control = list(maxit = 2))
)

mod <- galamm(
  formula = cbind(y, trials - y) ~ item + (0 + loading1 + loading2 | id),
  weights = NULL,
  data = dat,
  family = family,
  family_mapping = family_mapping,
  load.var = "item",
  lambda = list(lambda),
  factor = list(c("loading1", "loading2")),
  start = NULL,
  control = galamm_control()
)
#> Error in response_obj[family_mapping == i, ] <- cbind(response = response, : number of items to replace is not a multiple of replacement length

Created on 2023-09-15 with reprex v2.0.2

Sensible initial values

Find a way of defining sensible initial values. For example, by fixing the factor loadings and using the results of a call to lme4::lmer or lme4::glmer.
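A sketch of one such approach (hypothetical helper code, not part of the package): fix all loadings at 1 so the model reduces to an ordinary mixed model, fit it with lme4, and reuse the estimates as starting values. Whether galamm's start argument accepts exactly this structure is an assumption.

library(lme4)

# With all loadings fixed at 1, the measurement model collapses to an
# ordinary mixed model (illustrative formula)
fit0 <- lmer(y ~ x + (1 | id), data = mresp)

start <- list(
  theta  = getME(fit0, "theta"),  # relative covariance parameters
  beta   = fixef(fit0),           # fixed regression coefficients
  lambda = rep(1, 1)              # free loadings initialized at 1
)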

Summary method does not always return formula

When the formula is provided as an argument to be evaluated on the fly, e.g., formula = y ~ x, it is correctly shown in summary.galamm, but if it is provided via a variable, e.g., formula = form, then the name of the variable shows up in summary.galamm rather than the actual formula.
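A minimal illustration of the problem (hypothetical data; the displayed behavior is as described above):

form <- y ~ s(x) + (1 | id)
mod <- galamm(formula = form, data = dat)
summary(mod)
# The header shows "Formula: form" instead of the
# expanded "Formula: y ~ s(x) + (1 | id)"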

Unit tests for model setup

Write unit tests based on the test data developed in #9, which test that the model construction works as it should.
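A sketch of what such a test could look like with testthat (the expectations below use assumed accessors and class names for illustration):

library(testthat)
library(galamm)

test_that("model setup runs on the test data", {
  dat <- subset(cognition, domain == 1)
  dat$item <- factor(dat$item)
  mod <- galamm(
    formula = y ~ 0 + item + (0 + loading | id),
    data = dat,
    load.var = "item",
    lambda = matrix(c(1, NA, NA), ncol = 1),
    factor = "loading"
  )
  expect_s3_class(mod, "galamm")     # assumed class name
  expect_equal(nobs(mod), nrow(dat)) # assumed nobs method
})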

Automatic setup of models with smooth terms

This also relates to #74. I need to create functionality that allows the user to define models like the one below, and that automatically translates them to the underlying lme4 representation.

mod <- galamm(
  formula = y ~ s(x) * loading + (0 + loading | item),
  data = dat,
  lambda = , load.var = , factor = )

Line search

Implement some form of line search. This is necessary in the case of a non-identity link function.
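For reference, a plain backtracking (Armijo) line search written generically for a maximization problem, not against galamm's internals:

# Shrink the step length until the objective f increases by at least
# a fraction c1 of the increase predicted by the gradient
line_search <- function(f, x, direction, gradient,
                        alpha = 1, c1 = 1e-4, shrink = 0.5, maxit = 20) {
  f0 <- f(x)
  slope <- sum(gradient * direction) # directional derivative, positive for ascent
  for (i in seq_len(maxit)) {
    if (f(x + alpha * direction) >= f0 + c1 * alpha * slope) {
      return(alpha)
    }
    alpha <- shrink * alpha
  }
  warning("No step satisfying the Armijo condition found")
  alpha
}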

Create Vignettes

Create vignettes for the updated API. Use this as a working document to keep track of progress. Create and update necessary S3 methods and other functions along the way.

Set up loglikelihood computations

Overview

In C++, get the necessary data from R, and use this to compute the Laplace approximate log likelihood.

Intended outcome

Be able to evaluate the loglikelihood at given values of the fixed parameters, integrating over the random effects and the fixed regression coefficients.
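For reference, the Laplace approximation computed in this step has the standard form (generic notation; not the package's internal symbols): writing $f(\mathbf{u})$ for the joint log-density of the responses and the random effects, and $\hat{\mathbf{u}}$ for its maximizer (the conditional modes), the approximate marginal log-likelihood is

$$
\log \int e^{f(\mathbf{u})} \, \mathrm{d}\mathbf{u} \approx f(\hat{\mathbf{u}}) + \frac{q}{2} \log(2\pi) - \frac{1}{2} \log \det\left\{ -\nabla^{2} f(\hat{\mathbf{u}}) \right\},
$$

where $q$ is the number of random effects. The fill-reducing permutation mentioned below enters through the sparse Cholesky factorization of the negative Hessian.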

Things to do

  • Find and store fill-reducing permutation
  • Update matrices with given lambda parameters.
  • Get the right link functions with their derivatives.
  • Solve linear system.
  • Create unit tests at given parameter values.

Hessian not positive definite

For complex models the Hessian does not become positive definite. This is probably correct, but it creates problems for the covariance matrix. I think I need to split the covariance matrix into different sets of parameters, rather than insisting on propagating the uncertainty across all parameters. This would also be the same as is done by other mixed model software, e.g., lme4.
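One possible workaround (a sketch, assuming the Hessian of the negative log-likelihood is available as a matrix H; this is not how galamm currently handles it): project H onto the nearest positive definite matrix before inverting.

library(Matrix)

# H: Hessian of the negative log-likelihood at the optimum (hypothetical input)
safe_vcov <- function(H) {
  ch <- tryCatch(chol(H), error = function(e) NULL)
  if (is.null(ch)) {
    # Not positive definite: replace H by the nearest positive definite
    # matrix in the Frobenius norm before inverting
    H <- as.matrix(nearPD(H)$mat)
    ch <- chol(H)
  }
  chol2inv(ch) # approximate covariance matrix of the parameters
}

The alternative described above, inverting only the diagonal blocks corresponding to separate parameter sets, is closer to what lme4 does.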

Make R formula interface

Overview

Start by defining an R interface for calling the galamm function with certain arguments, and from there generate the sparse matrices, mappings between parameters and sparse matrix entries, model family, and other things necessary to send to C++.

Intended outcome

Should be able to define models to be fitted from R.

Things to do

Optimization algorithms

Experimentation with full Newton methods with linear mixed models suggests that this requires quite good initial values for the parameters. Should consider different algorithms, perhaps something in this library.

Add plot function for smooth terms

The name plot is already taken, so one option is to call it plot_smooth or something like that. The other option is to rely on mgcv and do plot(mod$gam), but requiring users to dig into elements of the model object is probably not good practice.

Gradient identically zero with all equal weights

We have the following problem when setting all weights equal to a non-unit value:

library(galamm)
library(Matrix)
library(lme4)
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(tidyr, quietly = TRUE, warn.conflicts = FALSE)
library(memoise)

set.seed(11)
n <- 2000
dat <- tibble(
  id = 1:n,
  b = rnorm(n)
) %>%
  uncount(3, .id = "tp") %>%
  uncount(2, .id = "item") %>%
  mutate(
    x = runif(nrow(.)),
    winv = if_else(item == 1, 1, 2),
    y = x + b + rnorm(nrow(.), sd = sqrt(winv))
  )
theta_inds <- 1
beta_inds <- 2:3
lambda_inds <- integer()
bounds <- c(0, -Inf, -Inf)
lmod <- lFormula(y ~ x + (1 | id), data = dat, REML = FALSE)


mlwrapper <- function(par, weights, hessian = FALSE){
  marginal_likelihood(
    y = dat$y,
    trials = rep(1, length(dat$y)),
    X = lmod$X,
    Zt = lmod$reTrms$Zt,
    Lambdat = lmod$reTrms$Lambdat,
    beta = par[beta_inds],
    theta = par[theta_inds],
    theta_mapping = lmod$reTrms$Lind - 1L,
    lambda = par[lambda_inds],
    lambda_mapping_X = integer(),
    lambda_mapping_Zt = integer(),
    weights = weights,
    family = "gaussian",
    maxit_conditional_modes = 1,
    hessian = hessian
  )
}

mlmem <- memoise(mlwrapper)
fn <- function(par, weights){
  mlmem(par, weights)$logLik
}
gr <- function(par, weights){
  mlmem(par, weights)$gradient
}

par_init <- c(1, 0, 0)
opt <- optim(par_init, fn = fn, gr = gr,
             weights = rep(2, nrow(dat)),
             method = "L-BFGS-B", lower = bounds,
             control = list(fnscale = -1))

opt
#> $par
#> [1]  0.0000000 -0.1936921  0.6731032
#> 
#> $value
#> [1] 0
#> 
#> $counts
#> function gradient 
#>        5        5 
#> 
#> $convergence
#> [1] 0
#> 
#> $message
#> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

Created on 2022-09-28 with reprex v2.0.2

The problem does not exist for small sample sizes, as can be seen in the following example, in which the sample size is reduced to 200 and the lme4::lmer estimates are exactly reproduced.

library(galamm)
library(Matrix)
library(lme4)
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(tidyr, quietly = TRUE, warn.conflicts = FALSE)
library(memoise)

set.seed(11)
n <- 200
dat <- tibble(
  id = 1:n,
  b = rnorm(n)
) %>%
  uncount(3, .id = "tp") %>%
  uncount(2, .id = "item") %>%
  mutate(
    x = runif(nrow(.)),
    winv = if_else(item == 1, 1, 2),
    y = x + b + rnorm(nrow(.), sd = sqrt(winv))
  )
theta_inds <- 1
beta_inds <- 2:3
lambda_inds <- integer()
bounds <- c(0, -Inf, -Inf)
lmod <- lFormula(y ~ x + (1 | id), data = dat, REML = FALSE)


mlwrapper <- function(par, weights, hessian = FALSE){
  marginal_likelihood(
    y = dat$y,
    trials = rep(1, length(dat$y)),
    X = lmod$X,
    Zt = lmod$reTrms$Zt,
    Lambdat = lmod$reTrms$Lambdat,
    beta = par[beta_inds],
    theta = par[theta_inds],
    theta_mapping = lmod$reTrms$Lind - 1L,
    lambda = par[lambda_inds],
    lambda_mapping_X = integer(),
    lambda_mapping_Zt = integer(),
    weights = weights,
    family = "gaussian",
    maxit_conditional_modes = 1,
    hessian = hessian
  )
}

mlmem <- memoise(mlwrapper)
fn <- function(par, weights){
  mlmem(par, weights)$logLik
}
gr <- function(par, weights){
  mlmem(par, weights)$gradient
}

par_init <- c(1, 0, 0)
opt <- optim(par_init, fn = fn, gr = gr,
             weights = rep(2, nrow(dat)),
             method = "L-BFGS-B", lower = bounds,
             control = list(fnscale = -1))

opt
#> $par
#> [1] 0.5870816 0.1171555 0.7457017
#> 
#> $value
#> [1] -2081.781
#> 
#> $counts
#> function gradient 
#>       14       14 
#> 
#> $convergence
#> [1] 0
#> 
#> $message
#> [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

Created on 2022-09-28 with reprex v2.0.2

Since the determinant of the weighting matrix is part of the likelihood function, I wonder if this might be the cause, but I don't know.
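For context, in the Gaussian case with fixed prior weights collected in $\mathbf{W} = \operatorname{diag}(w_1, \dots, w_n)$, the marginal log-likelihood has the standard form (generic lme4-style parameterization, assumed rather than taken from the package internals)

$$
\ell(\boldsymbol{\beta}, \boldsymbol{\theta}, \sigma) = -\frac{1}{2} \log \det(2\pi \boldsymbol{\Sigma}) - \frac{1}{2} (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\top} \boldsymbol{\Sigma}^{-1} (\mathbf{y} - \mathbf{X}\boldsymbol{\beta}), \qquad \boldsymbol{\Sigma} = \sigma^{2} \left( \mathbf{W}^{-1} + \mathbf{Z} \boldsymbol{\Lambda} \boldsymbol{\Lambda}^{\top} \mathbf{Z}^{\top} \right),
$$

so the weights enter both the determinant and the quadratic form. A mishandled determinant term would shift the log-likelihood value and distort its gradient with respect to $\boldsymbol{\theta}$ and $\sigma$.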

Mixed logistic and Gaussian fails to converge

library(galamm)
library(Matrix)
library(lme4)
library(memoise)
library(dplyr, warn.conflicts = FALSE)
library(tidyr, warn.conflicts = FALSE)
library(purrr, warn.conflicts = FALSE)

set.seed(100)
dat <- tibble(id = 1:1000) %>%
  mutate(b = rnorm(nrow(.))) %>%
  uncount(4, .id = "item") %>%
  mutate(
    y = map2_dbl(
      b, item, ~ if_else(.y %in% c(1, 2), rnorm(1, .x), as.numeric(rbinom(1, 1, plogis(.x)))))
  )

lmod <- lFormula(y ~ (1 | id), data = dat)

theta_inds <- 1
beta_inds <- 2

mlwrapper <- function(par, hessian = FALSE){
  marginal_likelihood(
    y = dat$y,
    trials = rep(1, nrow(dat)),
    X = lmod$X,
    Zt = lmod$reTrms$Zt,
    Lambdat = lmod$reTrms$Lambdat,
    beta = par[beta_inds],
    theta = par[theta_inds],
    theta_mapping = lmod$reTrms$Lind - 1L,
    lambda = numeric(),
    lambda_mapping_X = integer(),
    lambda_mapping_Zt = integer(),
    weights = numeric(),
    weights_mapping = integer(),
    family = c("gaussian", "binomial"),
    family_mapping = if_else(dat$item %in% c(1, 2), 0L, 1L),
    maxit_conditional_modes = 2,
    hessian = hessian
  )
}

mlmem <- memoise(mlwrapper)
fn <- function(par){
  mlmem(par)$logLik
}
gr <- function(par){
  mlmem(par)$gradient
}
opt <- optim(c(1, 0), fn, gr, method = "L-BFGS-B",
             lower = c(0, -Inf), control = list(fnscale = -1))
#> Could not find reducing step: i = 1, j = 9
#> Error in marginal_likelihood_cpp(y, trials, X, Zt, Lambdat, beta, theta, : Error

Created on 2022-10-06 with reprex v2.0.2

It does work if I set maxit_conditional_modes = 1, but that is not correct with binomial data. In addition, it seems to work with Poisson regression.

Summary for smooth terms

Summary for smooth terms should show the smooth terms. Currently only their mixed model representation is shown.

library(galamm)
dat <- subset(cognition, domain == 1 & item == 1)
mod <- galamm(y ~ s(x) + (1 | id), data = dat)
summary(mod)
#> Generalized additive latent and mixed model fit by maximum marginal likelihood.
#> Formula: y ~ s(x) + (1 | id)
#>    Data: dat
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   3166.0   3192.9  -1578.0   3156.0     1595 
#> 
#> Scaled residuals: 
#>     Min      1Q  Median      3Q     Max 
#> -2.9092 -0.5972 -0.0113  0.6094  3.3454 
#> 
#> Random effects:
#>  Groups   Name        Variance Std.Dev.
#>  id       (Intercept) 0.8744   0.9351  
#>  Xr       s(x)        2.1440   1.4642  
#>  Residual             0.2757   0.5250  
#> Number of obs: 1600, groups:  id, 200; Xr, 8
#> 
#> Fixed effects:
#>             Estimate Std. Error t value  Pr(>|t|)
#> (Intercept)  1.25074    0.06741 18.5543 7.527e-77
#> s(x)Fx1      0.03339    0.20805  0.1605 8.725e-01

Created on 2023-09-14 with reprex v2.0.2

Mixed response models

Enable fitting models with mixed response types. Since the random effects of the different responses will at least be correlated, and are sometimes even shared, it is not sufficient to handle each response type separately in R. We must keep track of the model family for each row of the dataframe inside C++.

Model setup

Set up the necessary steps in R to convert user input into representations that can be sent to C++.

Define test data

Create an example dataset which can be used for testing that the models work.

Product of factor loadings

The following example from PLmixed, Example 2 in Rockwood and Jeon (2019), contains products of factor loadings. This is not possible with galamm at the moment:

data("JUDGEsim")
JUDGEsim <- JUDGEsim[order(JUDGEsim$item), ] # Order by item
unique(JUDGEsim$item)

# Specify Lambda matrix
judge.lam <- rbind(c( 1,  0,  1,  0,  0,  0),
                   c(NA,  0, NA,  0,  0,  0),
                   c(NA,  0, NA,  0,  0,  0),
                   c( 0,  1,  0,  1,  0,  0),
                   c( 0, NA,  0, NA,  0,  0),
                   c( 0, NA,  0, NA,  0,  0),
                   c( 0,  0,  0,  0,  1,  0),
                   c( 0,  0,  0,  0, NA,  0),
                   c( 0,  0,  0,  0, NA,  0),
                   c( 0,  0,  0,  0,  0,  1),
                   c( 0,  0,  0,  0,  0, NA),
                   c( 0,  0,  0,  0,  0, NA))

# Conduct analysis
judge.example <- PLmixed(response ~ 0 + as.factor(item) + (1 | class)
                         + (0 + trait1.t + trait2.t + trait1.s + trait2.s | stu)
                         + (0 + teacher1 + teacher2 | tch), data = JUDGEsim,
                         lambda = list(judge.lam), load.var = "item",
                         factor = list(c("teacher1", "teacher2", "trait1.t",
                                         "trait2.t", "trait1.s", "trait2.s")))

summary(judge.example)

data("KYPSitemsim")

time.lam <- rbind(c( 1,  0),  # Specify time lambda matrix
                  c(NA,  0),
                  c(NA,  1),
                  c(NA, NA))

item.lam <- c(1, NA, NA, NA, NA, NA) # Specify item lambda matrix

KYPSitemsim$time2 <- (KYPSitemsim$time == 2) * 1
KYPSitemsim$time3 <- (KYPSitemsim$time == 3) * 1
KYPSitemsim$time4 <- (KYPSitemsim$time == 4) * 1

kyps.item.model <- PLmixed(response ~ 0 + as.factor(item) + lat.var:time2
                           + lat.var:time3 + lat.var:time4 + (0 + hs:lat.var | hid)
                           + (0 + ms:lat.var | mid) + (0 + lat.var:as.factor(time) | id),
                           data = KYPSitemsim, lambda = list(time.lam, item.lam),
                           factor = list(c("ms", "hs"), "lat.var"),
                           load.var = c("time", "item"))

summary(kyps.item.model)

It might be possible to reformulate the problem, but that would not be very user friendly. I should try to add support for this instead.

Find maximum likelihood estimates

Overview

Given a model set up according to #6 and methods for computing the loglikelihood according to #11, in this step we should maximize that function. We should compare automatic differentiation and numeric differentiation. The fill-reducing permutation found in #11 should be retained in each step of the algorithm, even though the values of the lambda parameters differ; a sketch of this pattern is given below. This means that the mapping for the lambda parameters must comply with the permutation.
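This is the same mechanism lme4 uses: perform the symbolic sparse Cholesky analysis, which chooses the fill-reducing permutation, once, and then only update the numeric values in each optimization step. A minimal sketch with the Matrix package, using a made-up sparse matrix in place of the real model matrices:

library(Matrix)

# Stand-in for the sparse symmetric matrix factorized in each step
# (hypothetical; the real matrix depends on Z, Lambda, and theta)
A <- crossprod(rsparsematrix(50, 20, density = 0.1)) + Diagonal(20)

# Symbolic analysis: the fill-reducing permutation is chosen here, once
ch <- Cholesky(A, perm = TRUE, LDL = FALSE)

# Later steps change only the numeric values, not the sparsity pattern,
# so the existing factor can be updated without re-analysis
A2 <- A + Diagonal(20, x = 0.5)
ch <- update(ch, A2)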

Intended outcome

Based on model input in R, the maximum likelihood estimates should be computed in C++ and returned to R.

Things to do

  • Compute derivatives using automatic differentiation
  • Compute derivatives using numerical differentiation
  • #18
  • #19
  • #20
  • #21
  • #22

Avoid copying back and forth between C++ and R

In the current implementation, each L-BFGS-B step copies all data over to C++ and repeats all the setup. This includes #51. It would be much more efficient either to do everything in C++ or to have both languages access the same memory, as is done in lme4.

anova method should return likelihood ratio tests

At the moment the anova.galamm method only shows the content of the "likelihood table", but does not compare the models. It should eventually also compute likelihood ratio tests between the models.
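In the meantime, a likelihood ratio test between two nested fits can be computed by hand. A sketch, assuming mod0 and mod1 are nested galamm fits (mod0 restricted) and that logLik methods with a df attribute are available:

ll0 <- logLik(mod0)
ll1 <- logLik(mod1)
lrt <- 2 * as.numeric(ll1 - ll0)          # likelihood ratio statistic
df <- attr(ll1, "df") - attr(ll0, "df")   # difference in number of parameters
pchisq(lrt, df = df, lower.tail = FALSE)  # asymptotic p-value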

Present results of model fit

Overview

Once we have found maximum likelihood estimates for a given model as in #12, we should return the results to R, including confidence intervals, covariance matrices, etc. Also define methods like plot, summary, and print.

Intended outcome

A functioning package.

Things to do

  • Decide between S3 and S4 classes, and then define them.
  • Compute confidence intervals and covariance matrix.
  • Write summary function
  • Write plot function
  • Write print function
  • Write anova function
