Comments (4)
Have you seen this information: http://yulab-smu.top/biomedical-knowledge-mining-book/universal-api.html#msigdb-analysis ?
from clusterprofiler.
"I have read it. What I mean is using the compareCluster function for analysis, Can the 'fun' parameter accept the value 'GSEA'? I would greatly appreciate your assistance." for example,
compareCluster(geneList, fun="GSEA", TERM2GENE=m_df, pvalueCutoff=0.05)
thank you
from clusterprofiler.
Yes, that is indeed the way to do it. Note that you can include all arguments that you could set when analyzing a single dataset.
Using the included, human example dataset:
> library(msigdbr)
> library(clusterProfiler)
> library(enrichplot)
>
>
> ## using the code from the biomedical knowledge mining book.
> ## retrieve all human MSigDb gene sets for category C8 (cell type signature gene sets)
> m_dfC8 <- msigdbr(species = "Homo sapiens", category = "C8") %>%
+ dplyr::select(gs_name, entrez_gene)
> head(m_dfC8, 2)
# A tibble: 2 × 2
gs_name entrez_gene
<chr> <int>
1 AIZARANI_LIVER_C10_MVECS_1 2
2 AIZARANI_LIVER_C10_MVECS_1 2532
>
> ## load sample data, and create 3 random sets of sign. regulated genes.
> ## Note that these are human ids.
> data(geneList, package = "DOSE")
> de1 <- names(geneList)[1:500]
> de2 <- names(geneList)[301:800]
> de3 <- names(geneList)[401:600]
>
> ## combine all lists of significant genes into a list, and check
> cp.input.ora <- list("Exp. 1"=de1, "Exp. 2"=de2, "Exp. 3"=de3)
> str(cp.input.ora)
List of 3
$ Exp. 1: chr [1:500] "4312" "8318" "10874" "55143" ...
$ Exp. 2: chr [1:500] "22948" "768" "79315" "1230" ...
$ Exp. 3: chr [1:200] "3429" "80830" "23590" "10397" ...
>
> ## Perform overrepresentation analysis using generic function enricher.
> ## Note that no significance cutoff is implemented, because random gene sets
> ## are used, and this is just to illustrate the code as such.
> ##
> ## Next, just to be sure calculate similarity matrices of terms, and convert ids
> ## into symbols
> MSigDb.ora <- compareCluster(geneClusters=cp.input.ora, fun = "enricher", TERM2GENE=m_dfC8,
+ pvalueCutoff = 1, pAdjustMethod = "none", minGSSize = 10, maxGSSize = 500)
> MSigDb.ora <- enrichplot::pairwise_termsim(MSigDb.ora)
> MSigDb.ora <- setReadable(MSigDb.ora, 'org.Hs.eg.db', 'ENTREZID')
> MSigDb.ora
#
# Result of Comparing 3 gene clusters
#
#.. @fun enricher
#.. @geneClusters List of 3
$ Exp. 1: chr [1:500] "4312" "8318" "10874" "55143" ...
$ Exp. 2: chr [1:500] "22948" "768" "79315" "1230" ...
$ Exp. 3: chr [1:200] "3429" "80830" "23590" "10397" ...
#...Result 'data.frame': 513 obs. of 10 variables:
$ Cluster : Factor w/ 3 levels "Exp. 1","Exp. 2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : chr "ZHONG_PFC_C1_OPC" "FAN_EMBRYONIC_CTX_NSC_2" "FAN_EMBRYONIC_CTX_MICROGLIA_1" "MANNO_MIDBRAIN_NEUROTYPES_HPROGBP" ...
$ Description: chr "ZHONG_PFC_C1_OPC" "FAN_EMBRYONIC_CTX_NSC_2" "FAN_EMBRYONIC_CTX_MICROGLIA_1" "MANNO_MIDBRAIN_NEUROTYPES_HPROGBP" ...
$ GeneRatio : chr "96/473" "94/473" "75/473" "88/473" ...
$ BgRatio : chr "238/19170" "233/19170" "155/19170" "300/19170" ...
$ pvalue : num 2.87e-92 2.67e-90 3.13e-79 3.37e-70 8.48e-67 ...
$ p.adjust : num 2.87e-92 2.67e-90 3.13e-79 3.37e-70 8.48e-67 ...
$ qvalue : num 1.12e-89 5.18e-88 4.05e-77 3.27e-68 6.59e-65 ...
$ geneID : chr "NMU/CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/SKA1/PBK/NUSA"| __truncated__ "NMU/CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/SKA1/PBK/NUSA"| __truncated__ "CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/PBK/NUSAP1/CDCA3/"| __truncated__ "CDCA8/FOXM1/KIF23/CENPE/MYBL2/MELK/CCNB2/TOP2A/NCAPH/ASPM/RRM2/DLGAP5/UBE2C/HJURP/PBK/NUSAP1/CDCA3/TPX2/TACC3/N"| __truncated__ ...
$ Count : int 96 94 75 88 63 63 85 88 53 65 ...
#.. number of enriched terms found for each gene cluster:
#.. Exp. 1: 184
#.. Exp. 2: 206
#.. Exp. 3: 123
#
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141
>
> dotplot(MSigDb.ora, showCategory = 5, font.size = 6)
>
> ## along the same line, perform GSEA
>
> ## load human example data; note that the input is 3x the same!
> cp.input.gsea <- list("Exp. 1"=geneList, "Exp. 2"=geneList, "Exp. 3"=geneList)
> str(cp.input.gsea)
List of 3
$ Exp. 1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ Exp. 2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ Exp. 3: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
>
> MSigDb.gsea <- compareCluster(geneClusters=cp.input.gsea, fun = "GSEA", eps = 0, TERM2GENE=m_dfC8,
+ pvalueCutoff = 1, pAdjustMethod = "none", minGSSize = 10, maxGSSize = 500)
> MSigDb.gsea <- enrichplot::pairwise_termsim(MSigDb.gsea)
> MSigDb.gsea <- setReadable(MSigDb.gsea, 'org.Hs.eg.db', 'ENTREZID')
> MSigDb.gsea
#
# Result of Comparing 3 gene clusters
#
#.. @fun GSEA
#.. @geneClusters List of 3
$ Exp. 1: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ Exp. 2: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
$ Exp. 3: Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
..- attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...Result 'data.frame': 1923 obs. of 12 variables:
$ Cluster : Factor w/ 3 levels "Exp. 1","Exp. 2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ ID : chr "FAN_EMBRYONIC_CTX_NSC_2" "ZHONG_PFC_C1_OPC" "FAN_EMBRYONIC_CTX_MICROGLIA_1" "GAO_LARGE_INTESTINE_ADULT_CJ_IMMUNE_CELLS" ...
$ Description : chr "FAN_EMBRYONIC_CTX_NSC_2" "ZHONG_PFC_C1_OPC" "FAN_EMBRYONIC_CTX_MICROGLIA_1" "GAO_LARGE_INTESTINE_ADULT_CJ_IMMUNE_CELLS" ...
$ setSize : int 185 186 130 392 114 71 200 236 220 257 ...
$ enrichmentScore: num 0.822 0.81 0.848 -0.643 0.83 ...
$ NES : num 3.64 3.58 3.58 -2.83 3.45 ...
$ pvalue : num 8.72e-64 1.24e-59 2.67e-51 1.71e-46 2.82e-40 ...
$ p.adjust : num 8.72e-64 1.24e-59 2.67e-51 1.71e-46 2.82e-40 ...
$ qvalue : num 2.83e-61 2.02e-57 2.88e-49 1.39e-44 1.83e-38 ...
$ rank : num 937 937 937 1890 1044 ...
$ leading_edge : chr "tags=59%, list=7%, signal=56%" "tags=61%, list=7%, signal=57%" "tags=68%, list=7%, signal=63%" "tags=50%, list=15%, signal=44%" ...
$ core_enrichment: chr "NMU/CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/SKA1/PBK/NUSA"| __truncated__ "NMU/CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/SKA1/PBK/NUSA"| __truncated__ "CDCA8/CDC20/FOXM1/KIF23/CENPE/MELK/CCNB2/NDC80/TOP2A/NCAPH/ASPM/RRM2/CEP55/DLGAP5/UBE2C/HJURP/PBK/NUSAP1/CDCA3/"| __truncated__ "COL4A6/TEK/RAMP3/NID1/DZIP1/LAMA4/EGFL7/GPX3/COL6A1/PTH1R/CRIP2/LIMCH1/SCD5/CD93/WWC3/IL1R1/SALL1/WNT5A/FILIP1L"| __truncated__ ...
#.. number of enriched terms found for each gene cluster:
#.. Exp. 1: 641
#.. Exp. 2: 641
#.. Exp. 3: 641
#
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou,
W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141
>
>
> dotplot(MSigDb.gsea, font.size=8, showCategory=5, title =("GSEA C8 sets"), split=".sign") + facet_grid(.~.sign)
>
>
from clusterprofiler.
Thank you very much for your help
from clusterprofiler.
Related Issues (20)
- Count duplicate K Numbers (KEGG) for enrichKEGG?
- bitr报错 HOT 1
- something wrong of setReadable HOT 6
- Error in testForValidKeytype(x, keytype) : Invalid keytype: GOALL. Please use the keytypes method to see a listing of valid arguments. HOT 1
- gson 格式在最新版 enrichKEGG中出错 HOT 4
- ont = "BP" in enricher () function HOT 1
- OrgDb for maize HOT 1
- How to convert geneID from the enrichKEGG output from entrez to gene symbol? HOT 2
- enrichKEGG "No gene can be mapped" HOT 2
- All p-values are equal to 1 HOT 1
- compareCluster - Error in check_gene_id - but work perfectly with standar procedure (not using compareCluster) HOT 1
- How do I change the width of each coloum in the heatmap in the treeplot?
- Perform gseaplot2 using seurat DEG output.
- gseKEGG using KEGG database error (cannot open URL 'https://rest.kegg.jp/conv/ncbi=geneid/mmu': HTTP status was '400 Bad Request') HOT 1
- Different behaviour of gseGO upon setting nPerm=1000 HOT 2
- Upsetplot is cropped HOT 4
- A strange non-given GO comment appears HOT 8
- Retrieve the set of genes with assigned GO HOT 1
- Error with Enricher function HOT 1
- FeatureRequest: color category and gene labels differently in `cnetplot`
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from clusterprofiler.