Git Product home page Git Product logo

Comments (4)

DannyArends avatar DannyArends commented on August 9, 2024

I'll just add my reply email to the comment, (I was expecting Github to automatically add it)

Dear Sandra,

Yes you are correct ctl can be used on independent markers, however since multiple testing correction is done inside the algorithm results will be different if you run it on individual/single markers or on multiple markers, since the null hypothesis changes.

If you want I can send you a package which does not adjust p-values, this will yield similar (but inflated) p-values when scanning a single or multiple markers, and as such is very prone to over-fitting the model since a lot more tests are performed when scanning 1 compared to multiple markers in LD.

The best way of using ctl is to scan all markers and all phenotypes together in a single run, this way multiple testing correction is then performed for the correct number of tests, and as such is the most safe way of performing ctl scans.

from ctlmapping.

DannyArends avatar DannyArends commented on August 9, 2024

Another update,

I added a new parameter to the R function that disables adjustment for multiple testing, so that you're able to run it on a cluster (e.g. separated by chromosome).

You can install the newest version directly from Github using the devtools package, make sure you have the correct Rtools version installed (https://cran.r-project.org/bin/windows/Rtools/):

library(devtools)
install_github("DannyArends/CTLmapping", subdir = "Rctl")

After that run your code, but in the CTLscan function set the parameter adjust to FALSE
this should produce identical p-values when running 1 marker or multiple markers. Please be aware that you do need to adjust p-values for multiple testing yourself (based on the number of markers and phenotypes tested)

Attached the updated code which after installing the package from Github doesn't give me any warnings.

library(qtl)
library(MASS)
library(ctl)

# Load Arabidopsis Thaliana data set
data(ath.metabolites)

# Define genotype and phenotype matrices
matrix_geno = ath.metab$genotypes
matrix_pheno = ath.metab$phenotypes

# Function mat_split from stackoverflow
# http://stackoverflow.com/questions/24299171/function-to-split-a-matrix-into-sub-matrices-in-r
mat_split <- function(M, r, c){
  nr <- ceiling(nrow(M)/r)
  nc <- ceiling(ncol(M)/c)
  newM <- matrix(NA, nr*r, nc*c)
  newM[1:nrow(M), 1:ncol(M)] <- M
  div_k <- kronecker(matrix(seq_len(nr*nc), nr, byrow = TRUE), matrix(1, r, c))
  matlist <- split(newM, div_k)
  N <- length(matlist)
  mats <- unlist(matlist)
  dim(mats)<-c(r, c, N)
  return(mats)
}

# Select the first marker and perform ctlscan
matrix_geno_one_marker_1 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 1)[,,1])
ctlscan_one_marker_1 = CTLscan(genotypes = matrix_geno_one_marker_1 , phenotypes = matrix_pheno, adjust=FALSE)

# Select the second marker and perform ctlscan
matrix_geno_one_marker_2 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 1)[,,2])
ctlscan_one_marker_2 = CTLscan(genotypes = matrix_geno_one_marker_2 , phenotypes = matrix_pheno, adjust=FALSE)

# Select the first and second markers and perform ctlscan
matrix_geno_two_markers_1_and_2 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 2)[,,1])
ctlscan_two_markers_1_and_2 = CTLscan(genotypes = matrix_geno_two_markers_1_and_2, phenotypes = matrix_pheno, adjust=FALSE)

# Print phenotypes where the ctlscan results are different for the same marker but in different sets.
for(phenotype_i in 1:(ncol(matrix_pheno))){
  if (ctlscan_one_marker_1[[phenotype_i]]$ctl[1,] != ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[1,] && any(ctlscan_one_marker_1[[phenotype_i]]$ctl[1,] > 0.0)){
    print("ctlscan of marker 1 is not equivalent for scanning with one vs two markers for:")
    print(colnames(matrix_pheno)[phenotype_i])
    #print(ctlscan_one_marker_1[[phenotype_i]]$ctl[1,])
    #print(ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[1,])
  }
  if (ctlscan_one_marker_2[[phenotype_i]]$ctl[1,] != ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[2,] && any(ctlscan_one_marker_2[[phenotype_i]]$ctl[1,] > 0.0)){
    print("ctlscan of marker 2 is not equivalent for scanning with one vs two markers for:")
    print(colnames(matrix_pheno)[phenotype_i])
    #print(ctlscan_one_marker_2[[phenotype_i]]$ctl[1,])
    #print(ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[2,])
  }
}

from ctlmapping.

thkhavi avatar thkhavi commented on August 9, 2024

Hi Dr. Arends,

This is great, thank you! I really appreciate it.

I'll be sure to adjust the p-values for multiple testing in the post-CTL mapping steps.

Sincerely,
Sandra Truong

from ctlmapping.

DannyArends avatar DannyArends commented on August 9, 2024

Dear Sandra,

No problem, thanks for reporting it. I will update the version on CRAN as well, but that might take a little while. I think adding the extra parameter so that you can turn off the automatic p-value correction is a really good idea, this change will be indeed very useful for people trying to run it on multiple nodes / a cluster. I will close this issue, but if you have any questions in the future don't hesitate to open another issue.

Greetings and have a nice weekend,
Danny

from ctlmapping.

Related Issues (17)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.