dannyarends / ctlmapping Goto Github PK
View Code? Open in Web Editor NEWCorrelated Trait Locus (CTL) mapping
Home Page: http://www.mapctl.org
License: GNU General Public License v3.0
Correlated Trait Locus (CTL) mapping
Home Page: http://www.mapctl.org
License: GNU General Public License v3.0
The D version should generate a CTL network output file
Add support for the QTAB format as used by qtlHD
Hi Dr. Arends,
Thank you for developing and making your CTLmapping code available.
Based on the documentation in your dissertation and the JOSS manuscript, my (limited) understanding suggests that CTL for markers can be calculated independently. However, in trying to split up a matrix of markers to run CTL mapping in parallel across multiple nodes, I found that the results of CTLscan() for the whole set of markers were different from running CTLscan() on subsets of the markers.
Should the results for a single marker be independent of the other markers processed with it? I provided a test case below where using CTLscan() on two markers provides different results from running them individually.
Thanks for your help, and for making the implementation available!
Sandra Truong
library(qtl)
library(MASS)
library(ctl)
# Load Arabidopsis Thaliana data set
data(ath.metabolites)
# Define genotype and phenotype matrices
matrix_geno = ath.metab$genotypes
matrix_pheno = ath.metab$phenotypes
# Function mat_split from stackoverflow
# http://stackoverflow.com/questions/24299171/function-to-split-a-matrix-into-sub-matrices-in-r
mat_split <- function(M, r, c){
nr <- ceiling(nrow(M)/r)
nc <- ceiling(ncol(M)/c)
newM <- matrix(NA, nr*r, nc*c)
newM[1:nrow(M), 1:ncol(M)] <- M
div_k <- kronecker(matrix(seq_len(nr*nc), nr, byrow = TRUE), matrix(1, r, c))
matlist <- split(newM, div_k)
N <- length(matlist)
mats <- unlist(matlist)
dim(mats)<-c(r, c, N)
return(mats)
}
# Select the first marker and perform ctlscan
matrix_geno_one_marker_1 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 1)[,,1])
ctlscan_one_marker_1 = CTLscan(genotypes = matrix_geno_one_marker_1 , phenotypes = matrix_pheno)
# Select the second marker and perform ctlscan
matrix_geno_one_marker_2 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 1)[,,2])
ctlscan_one_marker_2 = CTLscan(genotypes = matrix_geno_one_marker_2 , phenotypes = matrix_pheno)
# Select the first and second markers and perform ctlscan
matrix_geno_two_markers_1_and_2 = as.matrix(mat_split(matrix_geno, nrow(matrix_geno), 2)[,,1])
ctlscan_two_markers_1_and_2 = CTLscan(genotypes = matrix_geno_two_markers_1_and_2, phenotypes = matrix_pheno)
# Print phenotypes where the ctlscan results are different for the same marker but in different sets.
for(phenotype_i in 1:(ncol(matrix_pheno))){
if (ctlscan_one_marker_1[[phenotype_i]]$ctl[1,] != ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[1,] && any(ctlscan_one_marker_1[[phenotype_i]]$ctl[1,] > 0.0)){
print("ctlscan of marker 1 is not equivalent for scanning with one vs two markers for:")
print(colnames(matrix_pheno)[phenotype_i])
#print(ctlscan_one_marker_1[[phenotype_i]]$ctl[1,])
#print(ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[1,])
}
if (ctlscan_one_marker_2[[phenotype_i]]$ctl[1,] != ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[2,] && any(ctlscan_one_marker_2[[phenotype_i]]$ctl[1,] > 0.0)){
print("ctlscan of marker 2 is not equivalent for scanning with one vs two markers for:")
print(colnames(matrix_pheno)[phenotype_i])
#print(ctlscan_one_marker_2[[phenotype_i]]$ctl[1,])
#print(ctlscan_two_markers_1_and_2[[phenotype_i]]$ctl[2,])
}
}
make test_qtab
Reading test/data/multi_genotypes.qtab
object.Exception@/home/danny/Github/Rpackages/qtlHD/src/D/qtl/core/genotype.d(304): Failed to decode genotype "A"
./mapctl(_D3qtl7plugins4qtab9read_qtab18read_genotype_qtabFS3std5stdio4FileC3qtl4core8genotype17ObservedGenotypesZS3std8typecons88__T5TupleTC3qtl4core10individual11IndividualsTAAC3qtl4core8genotype18GenotypeCombinatorZ5Tuple10__lambda19MFAyaZv+0x11c) [0x4feff8]
./mapctl(void qtl.plugins.qtab.read_qtab.each_line_in_section(std.stdio.File, immutable(char)[], void delegate(immutable(char)[]))+0x1d4) [0x4fe0b4]
./mapctl(std.typecons.Tuple!(qtl.core.individual.Individuals, qtl.core.genotype.GenotypeCombinator[][]).Tuple qtl.plugins.qtab.read_qtab.read_genotype_qtab(std.stdio.File, qtl.core.genotype.ObservedGenotypes)+0xba) [0x4fee7a]
./mapctl(int[][] ctl.io.qtab.wrapper.QTABreader.loadgenotypes(immutable(char)[])+0x225) [0x4f9955]
./mapctl(_Dmain+0x331) [0x4aad49]
./mapctl(extern (C) int rt.dmain2.main(int, char**).void runMain()+0x1c) [0x513930]
./mapctl(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void delegate())+0x2a) [0x5132aa]
./mapctl(extern (C) int rt.dmain2.main(int, char**).void runAll()+0x3b) [0x513977]
./mapctl(extern (C) int rt.dmain2.main(int, char**).void tryExec(scope void delegate())+0x2a) [0x5132aa]
./mapctl(main+0xd1) [0x513235]
/lib/libc.so.6(__libc_start_main+0xfd) [0x2b360e998c8d]
We need to get rid of it because it limits the usefulness of the algorithm
Quick Fix:
in C scan the marker to determine how many different alleles (not -999) a marker has.
Environment variables don't scale in build systems. Use a command line parameter, example in qtlHD Rakefile.
cmdline switches are funny, make sure the are conforming to some kind of standard
The article should be in a separate private repository
The multitrait dataset isn't parsed correctly
E:\github\Rpackages\CTLmapping\D>mapctl -f=qtab -p=test/data/multi_phenotypes.qtab -g=test/data/multi
mapCTL: Correlated Trait Locus (CTL) mapping in D
(c) 2012 written by Danny Arends in the D programming language
The user should be able to specify an output directory of the D algorithm.
The code should check to see if it exists (and create it) if needed
"ANCOVA is used in QTL mapping is to adjust":
Remove second "is"
So we can use it in the D-code to replace the (c) unknown code
The D version does not produce a Phenotypes x Phenotypes matrix
Add test files for qtab
The manual: R and D use the same input csv files
However they don't , the csv format used by D is translated and contains additional 2 columns (Chr and cM)
For a working standalone D version we need to have single marker QTL mapping
(In such a way that we can use qtlHD at a later stage)
Perfect correlations aren't handled correctly in the C-code (we might need to update the floating point check code with a precision parameter
The error occurs in the GN379 microarray daya (a phenotype between 100 - 200)
Fehler in CTLmapping(genotypes, phenotypes, phenocol = phe, nperm = nperm, :
Correlation '-1.00000000' not in range [-1, 1]
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.