COVID-19: association between expression and AGE/SEX
Script structure
Pipeline: bash run.sh
download_data.sh: download data from GTEX and Gencode for downstream analysis
Script dependencies include the following R packages:
dplyr
data.table
ggplot2
sva
Approach I: Perform SVA to learn about the SVs, and test asscoations with AGE and SEX controlling for the SVs
extract_protein_coding_lincRNA_genes.sh: Extract genes that are protein coding or lincRNA
generate_tissue_wise_TPM.R: Generate a matrix of genes by samples for each tissue
SVA_compute_SV.R: Infer about surrogate variables (SVs) using SVA
SVA_followedby_LR.R: Perform association tests controlling for SVs
Rscript SVA_followedby_LR.R ENSG00000130234.10 ACE2
Rscript SVA_followedby_LR.R ENSG00000184012.11 TMPRSS2
Approach II: Test asscoations with AGE and SEX controlling for known confounders
Rscript LR_confounders.R ENSG00000130234.10 ACE2
Rscript LR_confounders.R ENSG00000184012.11 TMPRSS2
* DTHHRDY - Death Circumstances
* SMRIN - RIN number
* SMTSISCH - Total Ischemic time
* SMEXNCRT - Exonic Rate