rajlabmssm / echoannot
echoverse module: Annotate fine-mapping results
Tabix-index versions now on Zenodo!
https://doi.org/10.5281/zenodo.7062238
zen4r
rfigshare
Running MOTIFBREAKR_plot causes an error in certain contexts. Documented here:
Simon-Coetzee/motifBreakR#31
Function should run in all contexts.
library(BSgenome) ## <-- IMPORTANT!
library(BSgenome.Hsapiens.UCSC.hg19) ## <-- IMPORTANT!
#### Example fine-mapping results ####
merged_DT <- echodata::get_Nalls2019_merged()
#### Run motif analyses ####
mb_res <- MOTIFBREAKR(rsid_list = c("rs11175620"),
# limit the number of datasets tested
# for demonstration purposes only
pwmList_max = 5,
calculate_pvals = FALSE)
plot_paths <- MOTIFBREAKR_plot(mb_res = mb_res)
When run via CRAN checks:
dat is already a GRanges object.
Plotting 1 unique RSID(s).
Plotting motif disruption results: rs11175620
Error in grid.Call.graphics(C_unsetviewport, as.integer(n)) :
cannot pop the top-level viewport ('grid' and 'graphics' output mixed?)
Calls: MOTIFBREAKR_plot ... drawGD -> .local -> popViewport -> grid.Call.graphics
Execution halted
When run in R console:
genome_build set to hg19 by default.
Loading required namespace: SNPlocs.Hsapiens.dbSNP144.GRCh37
Using genome_build hg19
+ MOTIFBREAKR:: Converting SNP list into motifbreakR input format.
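One possible workaround for the viewport error above, which has not been confirmed against Simon-Coetzee/motifBreakR#31, is to draw on a dedicated graphics device that starts from an empty grid page. The sketch below assumes a hypothetical helper, plot_on_fresh_device, and uses a placeholder plotting function in place of MOTIFBREAKR_plot.

```r
# Hypothetical helper (not part of echoannot or motifBreakR): evaluate a
# plotting function on a dedicated, freshly opened PDF device.
plot_on_fresh_device <- function(plot_fun,
                                 path = tempfile(fileext = ".pdf"),
                                 width = 7,
                                 height = 7) {
  grDevices::pdf(file = path, width = width, height = height)
  dev_id <- grDevices::dev.cur()
  # Close this specific device even if plot_fun() fails.
  on.exit(grDevices::dev.off(which = dev_id), add = TRUE)
  # Start from an empty page so no stale viewports remain on the device.
  grid::grid.newpage()
  plot_fun()
  invisible(path)
}

# Placeholder standing in for: MOTIFBREAKR_plot(mb_res = mb_res)
out <- plot_on_fresh_device(function() grid::grid.text("placeholder"))
```

Whether isolating the device is enough to avoid the error during package checks is an assumption; the upstream issue remains the authoritative reference.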
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.4
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets
[7] methods base
other attached packages:
[1] echoannot_0.99.10
[2] BSgenome.Hsapiens.UCSC.hg19_1.4.3
[3] BSgenome_1.65.2
[4] rtracklayer_1.57.0
[5] Biostrings_2.65.6
[6] XVector_0.37.1
[7] GenomicRanges_1.49.1
[8] GenomeInfoDb_1.33.13
[9] IRanges_2.31.2
[10] S4Vectors_0.35.4
[11] BiocGenerics_0.43.4
loaded via a namespace (and not attached):
[1] utf8_1.2.2
[2] reticulate_1.26
[3] R.utils_2.12.0
[4] tidyselect_1.2.0
[5] poweRlaw_0.70.6
[6] RSQLite_2.2.18
[7] AnnotationDbi_1.59.1
[8] htmlwidgets_1.5.4
[9] grid_4.2.1
[10] BiocParallel_1.31.14
[11] XGR_1.1.8
[12] munsell_0.5.0
[13] codetools_0.2-18
[14] interp_1.1-3
[15] DT_0.26
[16] colorspace_2.0-3
[17] OrganismDbi_1.39.1
[18] Biobase_2.57.1
[19] filelock_1.0.2
[20] knitr_1.40
[21] supraHex_1.35.0
[22] rstudioapi_0.14
[23] DescTools_0.99.46
[24] motifStack_1.41.1
[25] MatrixGenerics_1.9.1
[26] GenomeInfoDbData_1.2.9
[27] bit64_4.0.5
[28] echoconda_0.99.7
[29] basilisk_1.9.11
[30] vctrs_0.4.2
[31] generics_0.1.3
[32] xfun_0.34
[33] biovizBase_1.45.0
[34] BiocFileCache_2.5.2
[35] R6_2.5.1
[36] splitstackshape_1.4.8
[37] grImport2_0.2-0
[38] AnnotationFilter_1.21.0
[39] bitops_1.0-7
[40] cachem_1.0.6
[41] reshape_0.8.9
[42] DelayedArray_0.23.2
[43] motifbreakR_2.11.2
[44] assertthat_0.2.1
[45] BiocIO_1.7.1
[46] scales_1.2.1
[47] nnet_7.3-18
[48] rootSolve_1.8.2.3
[49] gtable_0.3.1
[50] lmom_2.9
[51] ggbio_1.45.0
[52] ensembldb_2.21.5
[53] seqLogo_1.63.0
[54] rlang_1.0.6
[55] echodata_0.99.15
[56] splines_4.2.1
[57] lazyeval_0.2.2
[58] dichromat_2.0-0.1
[59] hexbin_1.28.2
[60] checkmate_2.1.0
[61] BiocManager_1.30.18
[62] yaml_2.3.6
[63] reshape2_1.4.4
[64] GenomicFeatures_1.49.7
[65] ggnetwork_0.5.10
[66] backports_1.4.1
[67] Hmisc_4.7-1
[68] RBGL_1.73.0
[69] tools_4.2.1
[70] ggplot2_3.3.6
[71] ellipsis_0.3.2
[72] RColorBrewer_1.1-3
[73] proxy_0.4-27
[74] Rcpp_1.0.9
[75] plyr_1.8.7
[76] base64enc_0.1-3
[77] progress_1.2.2
[78] zlibbioc_1.43.0
[79] purrr_0.3.5
[80] RCurl_1.98-1.9
[81] basilisk.utils_1.9.4
[82] prettyunits_1.1.1
[83] rpart_4.1.16
[84] deldir_1.0-6
[85] SummarizedExperiment_1.27.3
[86] ggrepel_0.9.1
[87] cluster_2.1.4
[88] fs_1.5.2
[89] crul_1.3
[90] magrittr_2.0.3
[91] data.table_1.14.4
[92] echotabix_0.99.8
[93] dnet_1.1.7
[94] openxlsx_4.2.5
[95] MotifDb_1.39.0
[96] mvtnorm_1.1-3
[97] ProtGenerics_1.29.1
[98] matrixStats_0.62.0
[99] pkgload_1.3.0
[100] xtable_1.8-4
[101] patchwork_1.1.2
[102] hms_1.1.2
[103] XML_3.99-0.11
[104] jpeg_0.1-9
[105] readxl_1.4.1
[106] gridExtra_2.3
[107] compiler_4.2.1
[108] biomaRt_2.53.3
[109] tibble_3.1.8
[110] crayon_1.5.2
[111] R.oo_1.25.0
[112] htmltools_0.5.3
[113] tzdb_0.3.0
[114] TFBSTools_1.35.0
[115] Formula_1.2-4
[116] tidyr_1.2.1
[117] expm_0.999-6
[118] Exact_3.2
[119] DBI_1.1.3
[120] dbplyr_2.2.1
[121] MASS_7.3-58.1
[122] rappdirs_0.3.3
[123] boot_1.3-28
[124] ade4_1.7-19
[125] Matrix_1.5-1
[126] readr_2.1.3
[127] piggyback_0.1.4
[128] cli_3.4.1
[129] R.methodsS3_1.8.2
[130] Gviz_1.41.1
[131] parallel_4.2.1
[132] igraph_1.3.5
[133] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20
[134] pkgconfig_2.0.3
[135] TFMPvalue_0.0.9
[136] GenomicAlignments_1.33.1
[137] dir.expiry_1.5.1
[138] RCircos_1.2.2
[139] foreign_0.8-83
[140] osfr_0.2.9
[141] xml2_1.3.3
[142] annotate_1.75.0
[143] DirichletMultinomial_1.39.0
[144] stringr_1.4.1
[145] VariantAnnotation_1.43.3
[146] digest_0.6.30
[147] pracma_2.4.2
[148] CNEr_1.33.0
[149] graph_1.75.0
[150] httpcode_0.3.0
[151] cellranger_1.1.0
[152] htmlTable_2.4.1
[153] gld_2.6.5
[154] restfulr_0.0.15
[155] curl_4.3.3
[156] gtools_3.9.3
[157] Rsamtools_2.13.4
[158] rjson_0.2.21
[159] lifecycle_1.0.3
[160] nlme_3.1-160
[161] jsonlite_1.8.3
[162] fansi_1.0.3
[163] downloadR_0.99.4
[164] pillar_1.8.1
[165] lattice_0.20-45
[166] GGally_2.1.2
[167] GO.db_3.16.0
[168] KEGGREST_1.37.3
[169] fastmap_1.1.0
[170] httr_1.4.4
[171] survival_3.4-0
[172] glue_1.6.2
[173] zip_2.2.1
[174] png_0.1-7
[175] bit_4.0.4
[176] Rgraphviz_2.41.1
[177] class_7.3-20
[178] stringi_1.7.8
[179] blob_1.2.3
[180] caTools_1.18.2
[181] latticeExtra_0.6-30
[182] memoise_2.0.1
[183] dplyr_1.0.10
[184] e1071_1.7-11
[185] ape_5.6-2
Peaks can currently only be called from bedGraph files. I'm not sure whether MACSr can take bigWig input; if it can't, the bigWig could first be exported to bedGraph with rtracklayer.
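A minimal sketch of the bigWig to bedGraph conversion mentioned above, assuming rtracklayer is installed. The helper names bedgraph_path_for and bigwig_to_bedgraph are hypothetical; only rtracklayer::import() and rtracklayer::export() are real library calls. Note that rtracklayer's bigWig import is not supported on Windows.

```r
# Hypothetical helper: derive an output path from the bigWig file name.
bedgraph_path_for <- function(bigwig_path) {
  out <- sub("\\.(bw|bigwig)$", ".bedGraph", bigwig_path, ignore.case = TRUE)
  # Never return the input path itself (would overwrite the bigWig).
  if (identical(out, bigwig_path)) out <- paste0(bigwig_path, ".bedGraph")
  out
}

# Hypothetical wrapper around rtracklayer::import() / rtracklayer::export().
bigwig_to_bedgraph <- function(bigwig_path,
                               bedgraph_path = bedgraph_path_for(bigwig_path)) {
  if (!requireNamespace("rtracklayer", quietly = TRUE)) {
    stop("Package 'rtracklayer' is required for this conversion.")
  }
  # Read the signal track as a GRanges object with a 'score' column.
  gr <- rtracklayer::import(bigwig_path, format = "BigWig")
  # Write the same ranges and scores as a bedGraph text file.
  rtracklayer::export(gr, con = bedgraph_path, format = "bedGraph")
  bedgraph_path
}
```

The resulting bedGraph path could then be handed to the existing peak-calling step.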
topSNPs <- echodata::topSNPs_Nalls2019
fullSS_path <- echodata::example_fullSS(dataset = "Nalls2019")
res <- echolocatoR::finemap_locus(
fullSS_path = fullSS_path,
topSNPs = topSNPs,
# results_dir = "/Desktop/res",
locus = "BST1",
dataset_name = "Nalls2019",
fullSS_genome_build = "hg19",
zoom = c("1x","4x"),
bp_distance = 25000,
n_causal = 5,
force_new_finemap = TRUE,
plot_types = c("simple","fancy","LD"),
roadmap = TRUE,
roadmap_query = "E053",
# nott_epigenome = TRUE,
# nott_show_placseq = TRUE,
munged = TRUE)
ROADMAP:: 1 annotation(s) identified that match: E053
Querying subset from Roadmap API: E053 - 1/1
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E053
Preexisting file detected. Set force_overwrite=TRUE to override this.
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'con' in selecting a method for function 'import': Cannot detect format (no extension found in file name)
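The message above indicates that rtracklayer's import could not infer a format because the file name has no extension. One way to guard against this is to pass the format explicitly. The sketch below assumes a hypothetical helper, guess_import_format, and assumes that BED is an appropriate fallback for the file in question, which the log does not confirm.

```r
# Hypothetical helper: guess an import format from the file name and fall
# back to an explicit default when no extension is present.
guess_import_format <- function(path, fallback = "bed") {
  # Drop a trailing compression suffix before inspecting the extension.
  base <- sub("\\.(gz|bgz|bz2)$", "", basename(path), ignore.case = TRUE)
  ext <- tolower(tools::file_ext(base))
  if (nzchar(ext)) ext else fallback
}

# Possible usage (assumption: 'path' points at the downloaded file):
# gr <- rtracklayer::import(con = path, format = guess_import_format(path))
```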
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] echolocatoR_2.0.1 snpStats_1.46.0 Matrix_1.4-1 survival_3.4-0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 rtracklayer_1.57.0 GGally_2.1.2
[4] R.methodsS3_1.8.2 ragg_1.2.2 tidyr_1.2.0
[7] echoLD_0.99.6 ggplot2_3.3.6 bit64_4.0.5
[10] knitr_1.40 irlba_2.3.5 DelayedArray_0.22.0
[13] R.utils_2.12.0 data.table_1.14.2 rpart_4.1.16
[16] KEGGREST_1.36.3 RCurl_1.98-1.8 AnnotationFilter_1.20.0
[19] generics_0.1.3 BiocGenerics_0.42.0 GenomicFeatures_1.48.3
[22] RSQLite_2.2.16 proxy_0.4-27 bit_4.0.4
[25] tzdb_0.3.0 xml2_1.3.3 SummarizedExperiment_1.26.1
[28] assertthat_0.2.1 viridis_0.6.2 gargle_1.2.0
[31] xfun_0.32 hms_1.1.2 evaluate_0.16
[34] fansi_1.0.3 restfulr_0.0.15 progress_1.2.2
[37] dbplyr_2.2.1 readxl_1.4.1 Rgraphviz_2.40.0
[40] igraph_1.3.4 DBI_1.1.3 htmlwidgets_1.5.4
[43] reshape_0.8.9 downloadR_0.99.4 stats4_4.2.1
[46] purrr_0.3.4 ellipsis_0.3.2 dplyr_1.0.9
[49] backports_1.4.1 biomaRt_2.52.0 deldir_1.0-6
[52] MatrixGenerics_1.8.1 MungeSumstats_1.5.9 vctrs_0.4.1
[55] Biobase_2.56.0 ensembldb_2.20.2 cachem_1.0.6
[58] BSgenome_1.64.0 checkmate_2.1.0 GenomicAlignments_1.32.1
[61] prettyunits_1.1.1 cluster_2.1.4 ape_5.6-2
[64] dir.expiry_1.4.0 lazyeval_0.2.2 crayon_1.5.1
[67] basilisk.utils_1.8.0 crul_1.2.0 labeling_0.4.2
[70] pkgconfig_2.0.3 GenomeInfoDb_1.32.3 nlme_3.1-159
[73] pkgload_1.3.0 ProtGenerics_1.28.0 XGR_1.1.8
[76] nnet_7.3-17 pals_1.7 rlang_1.0.4
[79] lifecycle_1.0.1 filelock_1.0.2 httpcode_0.3.0
[82] BiocFileCache_2.4.0 echotabix_0.99.7 dichromat_2.0-0.1
[85] cellranger_1.1.0 coloc_5.1.0.1 matrixStats_0.62.0
[88] graph_1.74.0 osfr_0.2.8 boot_1.3-28
[91] base64enc_0.1-3 png_0.1-7 viridisLite_0.4.1
[94] rjson_0.2.21 rootSolve_1.8.2.3 bitops_1.0-7
[97] R.oo_1.25.0 ggnetwork_0.5.10 Biostrings_2.64.1
[100] blob_1.2.3 mixsqp_0.3-43 stringr_1.4.1
[103] echoplot_0.99.4 dnet_1.1.7 readr_2.1.2
[106] jpeg_0.1-9 S4Vectors_0.34.0 echodata_0.99.12
[109] scales_1.2.1 memoise_2.0.1 magrittr_2.0.3
[112] plyr_1.8.7 hexbin_1.28.2 zlibbioc_1.42.0
[115] compiler_4.2.1 echoconda_0.99.7 BiocIO_1.6.0
[118] RColorBrewer_1.1-3 EnsDb.Hsapiens.v75_2.99.0 Rsamtools_2.12.0
[121] cli_3.3.0 XVector_0.36.0 echoannot_0.99.7
[124] patchwork_1.1.2 htmlTable_2.4.1 Formula_1.2-4
[127] MASS_7.3-58.1 tidyselect_1.1.2 stringi_1.7.8
[130] textshaping_0.3.6 yaml_2.3.5 supraHex_1.34.0
[133] latticeExtra_0.6-30 ggrepel_0.9.1 grid_4.2.1
[136] VariantAnnotation_1.42.1 tools_4.2.1 lmom_2.9
[139] parallel_4.2.1 rstudioapi_0.14 foreign_0.8-82
[142] piggyback_0.1.4 gridExtra_2.3 gld_2.6.5
[145] farver_2.1.1 RcppZiggurat_0.1.6 digest_0.6.29
[148] BiocManager_1.30.18 Rcpp_1.0.9 GenomicRanges_1.48.0
[151] OrganismDbi_1.38.1 httr_1.4.4 AnnotationDbi_1.58.0
[154] RCircos_1.2.2 ggbio_1.44.1 biovizBase_1.44.0
[157] colorspace_2.0-3 brio_1.1.3 XML_3.99-0.10
[160] fs_1.5.2 reticulate_1.25 IRanges_2.30.1
[163] splines_4.2.1 RBGL_1.72.0 expm_0.999-6
[166] seqminer_8.4 echofinemap_0.99.3 basilisk_1.8.1
[169] Exact_3.1 mapproj_1.2.8 systemfonts_1.0.4
[172] jsonlite_1.8.0 Rfast_2.0.6 testthat_3.1.4
[175] susieR_0.12.19 R6_2.5.1 Hmisc_4.7-1
[178] pillar_1.8.1 htmltools_0.5.3 glue_1.6.2
[181] fastmap_1.1.0 DT_0.24 BiocParallel_1.30.3
[184] class_7.3-20 codetools_0.2-18 maps_3.4.0
[187] mvtnorm_1.1-3 utf8_1.2.2 lattice_0.20-45
[190] tibble_3.1.8 curl_4.3.2 DescTools_0.99.45
[193] zip_2.2.0 openxlsx_4.2.5 interp_1.1-3
[196] rmarkdown_2.16 googleAuthR_2.0.0 munsell_0.5.0
[199] e1071_1.7-11 GenomeInfoDbData_1.2.8 reshape2_1.4.4
[202] gtable_0.3.0
Break up the files from Dey_DeepLearning.tgz and store them in a GitHub repo so users don't have to download the entire dataset (37GB).
Extend DEEPLEARNING. to connect to these remote resources by default.
Ideally, all files will also be tabix-indexed to improve query speed.
Evaluating the informativeness of deep learning annotations for human complex diseases
Because of issues with importing remote bgz files (either via echotabix or rtracklayer), I've gone back to downloading the entire bed.bgz file and then querying the local copy:
lawremi/rtracklayer#76
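The download-then-query-locally approach described above can be sketched as a small caching helper. The function name fetch_local_copy and its arguments are hypothetical and are not part of echoannot or echotabix; the sketch only covers the download step and leaves the query to whichever tabix reader is in use.

```r
# Hypothetical helper: download a remote file once and reuse the local copy.
fetch_local_copy <- function(url, cache_dir = tempdir(), force_new = FALSE) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  local_path <- file.path(cache_dir, basename(url))
  if (force_new || !file.exists(local_path)) {
    # mode = "wb" keeps compressed files intact on all platforms.
    status <- utils::download.file(url, destfile = local_path,
                                   mode = "wb", quiet = TRUE)
    if (status != 0) stop("Download failed for: ", url)
  }
  local_path
}
```

Subsequent calls with the same URL and cache directory return the existing file without downloading it again.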
Set the seed as an argument for reproducible results.
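As an illustration of the request above, a function that draws random numbers can expose a seed argument and call set.seed() only when one is supplied. The function sample_background_snps and its inputs are hypothetical and serve only to show the pattern.

```r
# Hypothetical function: a random-sampling step that accepts a 'seed'.
sample_background_snps <- function(snps, n, seed = NULL) {
  # Only fix the RNG state when the caller asks for reproducibility.
  if (!is.null(seed)) set.seed(seed)
  sample(snps, size = n)
}

snps <- paste0("rs", 1:100)
a <- sample_background_snps(snps, n = 5, seed = 2022)
b <- sample_background_snps(snps, n = 5, seed = 2022)
# The same seed gives the same draw.
stopifnot(identical(a, b))
```

Leaving the default at NULL keeps the current (non-deterministic) behaviour for callers who do not pass a seed.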
Getting variant annotations using biomaRt can be quite slow. Either figure out a way to improve this speed or switch to another package, like AnnoVar.
Add API to import data from ENCODE directly.
DeepBlueR seems like a good candidate package to do this, but is extremely complicated to use:
https://www.bioconductor.org/packages/devel/bioc/vignettes/DeepBlueR/inst/doc/DeepBlueR.html
XGR is only available on GitHub again, it seems. Hasn't been updated since 2018?
hfang-bristol/XGR#13
This means that if I try to submit echoannot to CRAN, it can't use XGR.
I am trying to annotate fine-mapping results against brain tissue marks from ROADMAP, but there is a bug at the end of the query.
columnsnames = echodata::construct_colmap(munged= FALSE,
CHR = "CHR", POS = "BP",
SNP = "SNP", P = "P",
Effect = "BETA", StdErr = "SE",
A1 = "A1", A2 = "A2",
N = "N", N_cases = "N_CAS",
N_controls = "N_CON", MAF = "MAF")
#Freq = "FREQ", N = "N",
#N_cases = NULL,
#N_controls = NULL,
#proportion_cases = NULL,
#MAF = "calculate")
#tstat = NULL)
finemap_loci(# GENERAL ARGUMENTS
topSNPs = topSNPs,
results_dir = fullRS_path,
loci = topSNPs$Locus,
dataset_name = "LID_COX",
dataset_type = "GWAS",
force_new_subset = TRUE,
force_new_LD = FALSE,
force_new_finemap = FALSE,
remove_tmps = FALSE,
finemap_methods = c("FINEMAP","SUSIE"),
# Munge full sumstats first
munged = FALSE,
colmap = columnsnames,
# SUMMARY STATS ARGUMENTS
fullSS_path = newSS_name_colmap,
fullSS_genome_build = "hg19",
query_by ="tabix",
bp_distance = 500000*2,
min_MAF = 0.001,
trim_gene_limits = FALSE,
case_control = TRUE,
# FINE-MAPPING ARGUMENTS
## General
n_causal = 5,
credset_thresh = .95,
consensus_thresh = 2,
# LD ARGUMENTS
LD_reference = "1KGphase3",#"UKB",
superpopulation = "EUR",
download_method = "axel",
LD_genome_build = "hg19",
leadSNP_LD_block = FALSE,
#### PLotting args ####
plot_types = c("fancy"),
show_plot = TRUE,
zoom = c("1x", "10x", "20x"),
#zoom = "1x",
tx_biotypes = NULL,
nott_epigenome = FALSE,
nott_show_placseq = FALSE,
nott_binwidth = 200,
nott_bigwig_dir = NULL,
#xgr_libnames =c("ENCODE_TFBS_ClusteredV3_CellTypes", "TFBS_Conserved", "Uniform_TFBS"),
roadmap = TRUE,
roadmap_query = c("brain"),
#### General args ####
seed = 2022,
nThread = 20,
verbose = TRUE
)
┌──────────────────────────────────────────┐
│ │
│ )))> 🦇 TRIM22 [locus 1 / 1] 🦇 <((( │
│ │
└──────────────────────────────────────────┘
────────────────────────────────────────────────────────────────────────────────
── Step 1 ▶▶▶ Query 🔎 ─────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'SNP'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'SNP' .../QC_SNPs_COLMAP.txt; grep
-v ^'SNP' .../QC_SNPs_COLMAP.txt | sort
-k2,2n
-k3,3n ) > .../filedd61f50654_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_will/QC_SNPs_COLMAP.txt.bgz
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 11:4724803-6724803
Adding 'query' column to results.
Retrieved data with 7,013 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_will/RESULTS/GWAS/LID_COX/TRIM22/TRIM22_LID_COX_subset.tsv.gz
+ Query: 7,013 SNPs x 12 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Removing SNPs with MAF== 0 | NULL | NA or >1
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ Calculating proportion_cases from N_cases and N_controls.
Loading required namespace: MungeSumstats
Preparing sample size column (N).
Using existing 'N' column.
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 7,013 SNPs x 15 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_will/RESULTS/GWAS/LID_COX/TRIM22/TRIM22_LID_COX_subset.tsv.gz
────────────────────────────────────────────────────────────────────────────────
── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_will/RESULTS/GWAS/LID_COX/TRIM22/LD/TRIM22.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.
────────────────────────────────────────────────────────────────────────────────
── Step 3 ▶▶▶ Filter SNPs 🚰 ───────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Removing SNPs with MAF < 0.001
+ FILTER:: Post-filtered data: 6997 x 15
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 6997 SNPs.
+ dat = 6997 SNPs.
+ 6997 SNPs in common.
Converting obj to sparseMatrix.
────────────────────────────────────────────────────────────────────────────────
── Step 4 ▶▶▶ Fine-map 🔊 ──────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
++ Previously multi-finemapped results identified. Importing: /home/rstudio/echolocatoR/echolocatoR_will/RESULTS/GWAS/LID_COX/TRIM22/Multi-finemap/1KGphase3_LD.Multi-finemap.tsv.gz
+ Fine-mapping with 'FINEMAP, SUSIE' completed:
────────────────────────────────────────────────────────────────────────────────
── Step 5 ▶▶▶ Plot 📈 ──────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
+-------- Locus Plot: TRIM22 --------+
+ support_thresh = 2
+ Calculating mean Posterior Probability (mean.PP)...
+ 2 fine-mapping methods used.
+ 3 Credible Set SNPs identified.
+ 0 Consensus SNPs identified.
+ Filling NAs in CS cols with 0.
+ Filling NAs in PP cols with 0.
LD_matrix detected. Coloring SNPs by LD with lead SNP.
++ echoplot:: GWAS full window track
++ echoplot:: GWAS track
++ echoplot:: Merged fine-mapping track
Melting PP and CS from 3 fine-mapping methods.
++ echoplot:: Adding Gene model track.
Converting dat to GRanges object.
Loading required namespace: EnsDb.Hsapiens.v75
max_transcripts= 1 .
82 transcripts from 82 genes returned.
Loading required namespace: pals
Fetching data...OK
Parsing exons...OK
Defining introns...OK
Defining UTRs...OK
Defining CDS...OK
aggregating...
Done
Constructing graphics...
echoannot:: Plotting ROADMAP annotations.
Converting dat to GRanges object.
+ ROADMAP:: 13 annotation(s) identified that match: brain
Querying subset from Roadmap API: E053 - 1/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Querying subset from Roadmap API: E054 - 2/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E053
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E067 - 3/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E054
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E068 - 4/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E067
Preexisting file detected. Set force_overwrite=TRUE to override this.
Downloading Roadmap Chromatin Marks: E068
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E069 - 5/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Querying subset from Roadmap API: E070 - 6/13
Downloading Roadmap Chromatin Marks: E069
Constructing GRanges query using min/max ranges across one or more chromosomes.
Preexisting file detected. Set force_overwrite=TRUE to override this.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E070
Querying subset from Roadmap API: E071 - 7/13
Preexisting file detected. Set force_overwrite=TRUE to override this.
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Querying subset from Roadmap API: E072 - 8/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E071
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E073 - 9/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E072
Preexisting file detected. Set force_overwrite=TRUE to override this.
Downloading Roadmap Chromatin Marks: E073
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E074 - 10/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E074
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E081 - 11/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Querying subset from Roadmap API: E082 - 12/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E081
Preexisting file detected. Set force_overwrite=TRUE to override this.
Querying subset from Roadmap API: E125 - 13/13
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
Downloading Roadmap Chromatin Marks: E082
Preexisting file detected. Set force_overwrite=TRUE to override this.
Downloading Roadmap Chromatin Marks: E125
Preexisting file detected. Set force_overwrite=TRUE to override this.
Annotating chromatin states.
unable to find an inherited method for function 'mcols' for signature '"try-error"'
Locus TRIM22 complete in: 1.66 min
────────────────────────────────────────────────────────────────────────────────
── Step 6 ▶▶▶ Postprocess data 🎁 ──────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
Returning results as nested list.
All loci done in: 1.66 min
$TRIM22
NULL
$merged_dat
Null data.table (0 rows and 0 cols)
Warning message:
In parallel::mclapply(seq_len(length(eid_list)), function(i) { :
all scheduled cores encountered errors in user code
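The 'mcols' error and the warning above are consistent with failed parallel jobs being returned as "try-error" objects and then passed on to code that expects genomic ranges. A minimal sketch of filtering such failures before downstream use follows; query_one and the ids are simulated stand-ins, not the package's actual query function or real annotation results.

```r
# Hypothetical stand-in for a per-annotation query that sometimes fails.
query_one <- function(id) {
  if (id == "id2") stop("simulated download failure")
  data.frame(id = id, n_ranges = 1L)
}

ids <- c("id1", "id2", "id3")
# Capture errors per item instead of letting one failure abort everything.
results <- lapply(ids, function(id) try(query_one(id), silent = TRUE))
names(results) <- ids

# Flag items whose result is an error object rather than data.
failed <- vapply(results, inherits, logical(1), what = "try-error")
if (any(failed)) {
  message("Skipping failed queries: ", paste(ids[failed], collapse = ", "))
}
ok_results <- results[!failed]
```

Printing the captured error for each failed item (for example with conditionMessage(attr(x, "condition"))) would also surface the underlying cause, which the log above does not show.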
rstudio@3e36fec3eec9:~/echolocatoR/echolocatoR_will/RESULTS/GWAS/LID_COX/TRIM22$ head ../../../../QC_SNPs_COLMAP.txt
SNP CHR BP A1 A2 MAF BETA SE P N N_CAS N_CON
rs3131972 1 752721 A G 0.1806 0.07177 0.1482 0.6281 2696 588 2108
rs11240777 1 798959 A G 0.2068 0.02904 0.1454 0.8417 2572 510 2062
rs28482280 1 834056 C A 0.01188 -1.013 0.6109 0.09743 2610 568 2042
rs7518581 1 834956 A G 0.01187 -1.02 0.6109 0.09504 2612 570 2042
rs149737509 1 837657 C G 0.01343 -0.609 0.578 0.292 2606 560 2046
rs28678693 1 838665 C T 0.01331 -0.7232 0.5261 0.1693 2630 580 2050
rs28477624 1 838732 A G 0.01257 -0.6727 0.5515 0.2225 2626 578 2048
rs28437697 1 838890 G A 0.01257 -0.6727 0.5515 0.2225 2626 578 2048
rs28539852 1 838916 T A 0.0126 -0.6755 0.551 0.2202 2620 576 2044
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] SNPlocs.Hsapiens.dbSNP155.GRCh37_0.99.22 SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20 BSgenome_1.65.2
[4] rtracklayer_1.57.0 Biostrings_2.65.3 XVector_0.37.1
[7] GenomicRanges_1.49.1 GenomeInfoDb_1.33.5 IRanges_2.31.2
[10] S4Vectors_0.35.3 BiocGenerics_0.43.1 forcats_0.5.2
[13] stringr_1.4.1 dplyr_1.0.10 purrr_0.3.4
[16] readr_2.1.2 tidyr_1.2.0 tibble_3.1.8
[19] ggplot2_3.3.6 tidyverse_1.3.2 data.table_1.14.2
[22] echolocatoR_2.0.1
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 GGally_2.1.2 R.methodsS3_1.8.2 ragg_1.2.2
[5] echoLD_0.99.7 bit64_4.0.5 knitr_1.40 irlba_2.3.5
[9] DelayedArray_0.23.1 R.utils_2.12.0 rpart_4.1.16 KEGGREST_1.37.3
[13] RCurl_1.98-1.8 AnnotationFilter_1.21.0 generics_0.1.3 GenomicFeatures_1.49.6
[17] RSQLite_2.2.16 proxy_0.4-27 bit_4.0.4 tzdb_0.3.0
[21] xml2_1.3.3 lubridate_1.8.0 SummarizedExperiment_1.27.2 assertthat_0.2.1
[25] viridis_0.6.2 gargle_1.2.0 xfun_0.32 hms_1.1.2
[29] fansi_1.0.3 restfulr_0.0.15 progress_1.2.2 dbplyr_2.2.1
[33] readxl_1.4.1 Rgraphviz_2.41.1 igraph_1.3.4 DBI_1.1.3
[37] htmlwidgets_1.5.4 reshape_0.8.9 downloadR_0.99.4 googledrive_2.0.0
[41] ellipsis_0.3.2 backports_1.4.1 biomaRt_2.53.2 deldir_1.0-6
[45] MatrixGenerics_1.9.1 MungeSumstats_1.5.13 vctrs_0.4.1 Biobase_2.57.1
[49] ensembldb_2.21.4 cachem_1.0.6 withr_2.5.0 checkmate_2.1.0
[53] GenomicAlignments_1.33.1 prettyunits_1.1.1 cluster_2.1.3 ape_5.6-2
[57] dir.expiry_1.5.0 lazyeval_0.2.2 crayon_1.5.1 basilisk.utils_1.9.2
[61] crul_1.2.0 labeling_0.4.2 pkgconfig_2.0.3 nlme_3.1-159
[65] ProtGenerics_1.29.0 XGR_1.1.8 gitcreds_0.1.1 pals_1.7
[69] nnet_7.3-17 rlang_1.0.5 lifecycle_1.0.1 filelock_1.0.2
[73] httpcode_0.3.0 BiocFileCache_2.5.0 modelr_0.1.9 echotabix_0.99.8
[77] dichromat_2.0-0.1 cellranger_1.1.0 coloc_5.1.0 matrixStats_0.62.0
[81] graph_1.75.0 Matrix_1.4-1 osfr_0.2.8 boot_1.3-28
[85] reprex_2.0.2 base64enc_0.1-3 googlesheets4_1.0.1 png_0.1-7
[89] viridisLite_0.4.1 rjson_0.2.21 rootSolve_1.8.2.3 bitops_1.0-7
[93] R.oo_1.25.0 ggnetwork_0.5.10 blob_1.2.3 mixsqp_0.3-43
[97] echoplot_0.99.5 dnet_1.1.7 jpeg_0.1-9 echodata_0.99.14
[101] scales_1.2.1 memoise_2.0.1 magrittr_2.0.3 plyr_1.8.7
[105] hexbin_1.28.2 zlibbioc_1.43.0 compiler_4.2.0 echoconda_0.99.7
[109] BiocIO_1.7.1 RColorBrewer_1.1-3 catalogueR_1.0.0 EnsDb.Hsapiens.v75_2.99.0
[113] Rsamtools_2.13.4 cli_3.3.0 echoannot_0.99.7 patchwork_1.1.2
[117] htmlTable_2.4.1 Formula_1.2-4 MASS_7.3-58.1 tidyselect_1.1.2
[121] stringi_1.7.8 textshaping_0.3.6 yaml_2.3.5 supraHex_1.35.0
[125] latticeExtra_0.6-30 ggrepel_0.9.1 grid_4.2.0 VariantAnnotation_1.43.3
[129] tools_4.2.0 lmom_2.9 parallel_4.2.0 rstudioapi_0.14
[133] foreign_0.8-82 piggyback_0.1.3 gridExtra_2.3 gld_2.6.5
[137] farver_2.1.1 digest_0.6.29 snpStats_1.47.1 BiocManager_1.30.18
[141] Rcpp_1.0.9 broom_1.0.1 OrganismDbi_1.39.1 httr_1.4.4
[145] AnnotationDbi_1.59.1 RCircos_1.2.2 ggbio_1.45.0 biovizBase_1.45.0
[149] colorspace_2.0-3 rvest_1.0.3 XML_3.99-0.10 fs_1.5.2
[153] reticulate_1.26 splines_4.2.0 RBGL_1.73.0 expm_0.999-6
[157] gh_1.3.0 echofinemap_0.99.3 basilisk_1.9.2 Exact_3.1
[161] mapproj_1.2.8 systemfonts_1.0.4 jsonlite_1.8.0 susieR_0.12.27
[165] R6_2.5.1 Hmisc_4.7-1 pillar_1.8.1 htmltools_0.5.3
[169] glue_1.6.2 fastmap_1.1.0 DT_0.24 BiocParallel_1.31.12
[173] class_7.3-20 codetools_0.2-18 maps_3.4.0 mvtnorm_1.1-3
[177] utf8_1.2.2 lattice_0.20-45 curl_4.3.2 DescTools_0.99.46
[181] zip_2.2.0 openxlsx_4.2.5 interp_1.1-3 survival_3.3-1
[185] googleAuthR_2.0.0 munsell_0.5.0 e1071_1.7-11 GenomeInfoDbData_1.2.8
[189] haven_2.5.1 reshape2_1.4.4 gtable_0.3.1
It would be great to access all genome-wide ENFORMER predictions via an API. This should be possible since the predictions are shared as h5 files here. They're rather massive (14-42GB each), but that should be mitigated by the h5 database format.
Alternatively, the predictions could be extracted on the fly from the pre-trained model. Usage examples here. But @Al-Murphy has mentioned that the pre-trained model they provide is not actually the one they describe in the paper.