Comments (5)

privefl commented on June 14, 2024

I'm not sure I follow exactly what you want to do.
Could you maybe provide some (inefficient) R code that works on a very small example?

from bigstatsr.

chrisraynerr commented on June 14, 2024

I have a dataset with repeated outcome measurements (7 time points). I've transformed the data to long format, and I want to run a GW-LMM with IID as the random effect (currently neither plink nor regenie can deal with repeated IIDs). I'm trying to figure out the fastest way to do this, and was hoping to use bigstatsr/bigsnpr to load the genotype data and perform the regression using big_parallelize (or something similar).

Below is a quickly simulated dataset to show the structure of the data I'm using, plus a summary of the model I'm trying to fit. I'm wondering whether I would gain speed by splitting it into two steps: in step 1, residualising all SNPs for PCs and saving the residuals as an FBM (using snp_save?); then in step 2, running the mixed model on the residuals with big_apply. Anyway, here's an example input and the output I'm looking for, which would then be combined across blocks of SNPs. Many thanks for your help with this!

library(ggplot2)
library(dplyr)
library(tidyr)
library(faux)  
library(GGally)

# simulating PCs from genotype data
pc <- 
  rnorm_multi(
    n    = 10000,
    mu   = 0,
    sd   = 1,
    r    = 0,
    varnames = c(paste0("PC",1:10)),
    empirical = F
  ) %>%
  mutate(IID  = row_number())

# simulating longitudinal outcome data with age covariate
df <- 
  rnorm_multi(
    n    = 10000,
    mu   = 50,
    sd   = 10,
    r    = 0.4,
    varnames = c(paste0("Y_Q0",1:7)),
    empirical = F
  ) %>%
  mutate(
    IID  = row_number(),
    Age_Q01 = rnorm(10000, mean=35, sd=5) 
  ) %>%
  mutate(
    Age_Q02 = Age_Q01 + 0.5,
    Age_Q03 = Age_Q01 + 1,
    Age_Q04 = Age_Q01 + 3,
    Age_Q05 = Age_Q01 + 5,
    Age_Q06 = Age_Q01 + 7,
    Age_Q07 = Age_Q01 + 8
  ) 

# adjusting phenotype for genotype PCs (this 2 step process is done in regenie)
yRes <-
  df %>% 
  full_join(pc, "IID") %>%
  mutate(
    across(matches("Y"), 
    # the PCs come from the joined data; the original df has no PC columns
    ~ rstandard(lm(.x ~ PC1+PC2+PC3+PC4+PC5+PC6+PC7+PC8+PC9+PC10))
    )) %>%
  select(IID, matches("Y_|Age"))%>% 
  pivot_longer(
    cols = !IID, 
    names_to = c(".value", "Wave"), 
    names_sep = "_"
  ) 

# adjusting genotypes for PCs (again... this 2 step process is done in regenie)
# reuse the PCs simulated above so that SNP and phenotype are adjusted for the same covariates
gRes <- 
  pc %>%
  mutate(
    SNP  = rbinom(n(), 2, 0.5)   # one simulated SNP, genotypes coded 0/1/2
  ) %>%
  mutate(
    SNPRes = rstandard(lm(as.formula(paste0("SNP ~ ",paste0("PC",1:10, collapse="+"))), data = .))
  ) %>%
  select(IID,SNPRes)

# one row per IID x Wave, with the (time-invariant) SNP residual repeated across waves
df <- yRes %>% full_join(gRes, by = "IID")

# run mixed model... for each SNP I want the main effect of SNP and its interaction with age 
# SNPRes*Age already expands to SNPRes + Age + SNPRes:Age
output <- 
  summary(
    lme4::lmer(Y ~ SNPRes*Age + (1|IID), data=df)
    )$coefficients

privefl commented on June 14, 2024

If you want to get a residualized version of the genotype matrix, that shouldn't be too hard. You can implement this using Rcpp or big_apply() using the linear algebra trick I'm using in big_univLinReg().
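As a rough illustration of the kind of linear algebra involved (a sketch in base R on a small in-memory matrix, not the actual internals of big_univLinReg()), all SNPs can be residualised at once by projecting out the column space of the covariates. The names `covar`, `G`, `Q` and `G_res` are made up for this example, and the result is the plain residuals, not the standardised residuals that rstandard() returns. With an FBM, the same operation could presumably be applied to one block of columns at a time inside big_apply().

```r
set.seed(1)
n <- 200   # individuals
m <- 5     # SNPs
k <- 3     # covariates (e.g. PCs)

# Hypothetical toy data: covariates and a genotype matrix coded 0/1/2
covar <- matrix(rnorm(n * k), n, k)
G     <- matrix(rbinom(n * m, 2, 0.3), n, m)

# Orthonormal basis of the column space of [intercept, covariates]
Q <- qr.Q(qr(cbind(1, covar)))

# Residualise every SNP in one go: G_res = G - Q (Q' G)
G_res <- G - Q %*% crossprod(Q, G)

# Sanity check against lm() for the first SNP
ref <- unname(residuals(lm(G[, 1] ~ covar)))
stopifnot(isTRUE(all.equal(G_res[, 1], ref)))
```

Because Q is computed once and reused for every SNP, the per-SNP cost is just two matrix products, which is what makes this cheaper than calling lm() in a loop.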

If you want to implement your own mixed model, I have no experience with this, so I won't be of much help.

chrisraynerr commented on June 14, 2024

Ok thanks!

privefl commented on June 14, 2024

If you need help with this, feel free to comment and reopen.
