Comments (4)
Hi, a couple of things:
First of all, why do you generate a GSON object with all mouse pathways?
Related to this, please check the help pages on how to call the enrichKEGG function, because you made some mistakes. Note that the argument organism should be the KEGG abbreviation of the organism you are analyzing; in your case that is mmu (and it should NOT be the GSON object!).
The argument gene should be a (character) vector of Entrez IDs.
It is also recommended to leave the argument use_internal_data at its default setting FALSE (so that up-to-date information is downloaded from the KEGG website).
Thus the code below, which uses the 7 IDs you listed, will do what you intended to do!
> library(clusterProfiler)
>
> id_transform <- c("240427","12705","241770","102633301","319757","116903","72309")
> class(id_transform)
[1] "character"
>
> KEGG_enrich = enrichKEGG(gene = id_transform,
+ organism="mmu",
+ use_internal_data = FALSE
+ )
>
>
> KEGG_enrich
#
# over-representation test
#
#...@organism mmu
#...@ontology KEGG
#...@keytype kegg
#...@gene chr [1:7] "240427" "12705" "241770" "102633301" "319757" "116903" "72309"
#...pvalues adjusted by 'BH' with cutoff <0.05
#...5 enriched terms found
'data.frame': 5 obs. of 11 variables:
$ category : chr "Environmental Information Processing" "Human Diseases" "Organismal Systems" "Organismal Systems" ...
$ subcategory: chr "Signal transduction" "Cancer: specific types" "Circulatory system" "Development and regeneration" ...
$ ID : chr "mmu04340" "mmu05217" "mmu04270" "mmu04360" ...
$ Description: chr "Hedgehog signaling pathway - Mus musculus (house mouse)" "Basal cell carcinoma - Mus musculus (house mouse)" "Vascular smooth muscle contraction - Mus musculus (house mouse)" "Axon guidance - Mus musculus (house mouse)" ...
$ GeneRatio : chr "1/2" "1/2" "1/2" "1/2" ...
$ BgRatio : chr "58/9710" "63/9710" "144/9710" "181/9710" ...
$ pvalue : num 0.0119 0.0129 0.0294 0.0369 0.0416
$ p.adjust : num 0.0388 0.0388 0.0499 0.0499 0.0499
$ qvalue : num 0.00681 0.00681 0.00875 0.00875 0.00875
$ geneID : chr "319757" "319757" "116903" "319757" ...
$ Count : int 1 1 1 1 1
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141
>
> as.data.frame(KEGG_enrich)[1:3,]
category subcategory ID
mmu04340 Environmental Information Processing Signal transduction mmu04340
mmu05217 Human Diseases Cancer: specific types mmu05217
mmu04270 Organismal Systems Circulatory system mmu04270
Description
mmu04340 Hedgehog signaling pathway - Mus musculus (house mouse)
mmu05217 Basal cell carcinoma - Mus musculus (house mouse)
mmu04270 Vascular smooth muscle contraction - Mus musculus (house mouse)
GeneRatio BgRatio pvalue p.adjust qvalue geneID Count
mmu04340 1/2 58/9710 0.01191138 0.03880464 0.006807832 319757 1
mmu05217 1/2 63/9710 0.01293488 0.03880464 0.006807832 319757 1
mmu04270 1/2 144/9710 0.02944172 0.04989512 0.008753530 116903 1
>
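The input requirement described above (gene must be a character vector of Entrez IDs) can be checked before calling enrichKEGG. The following is only a minimal sketch in base R: check_entrez_ids is a hypothetical helper name, not a clusterProfiler function, and it assumes that Entrez IDs are strings consisting of digits only.

```r
# Hypothetical helper (not part of clusterProfiler): stops with an error
# unless `gene` is a plain character vector of digit-only strings.
check_entrez_ids <- function(gene) {
  if (!is.character(gene) || !is.null(dim(gene))) {
    stop("'gene' must be a character vector, not a ", class(gene)[1])
  }
  # Assumption: a valid Entrez ID consists of digits only.
  bad <- gene[is.na(gene) | !grepl("^[0-9]+$", gene)]
  if (length(bad) > 0) {
    stop("These entries do not look like Entrez IDs: ",
         paste(bad, collapse = ", "))
  }
  gene
}

# The 7 IDs used in the example above.
ids <- c("240427", "12705", "241770", "102633301", "319757", "116903", "72309")
checked <- check_entrez_ids(ids)
```

The checked vector can then be passed on as enrichKEGG(gene = checked, organism = "mmu").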
from clusterprofiler.
Thank you for your detailed reply. Sorry for not using English before.
But do you know why enrichKEGG does not support a gson object? I am confused because I saw the code below. @guidohooiveld
clusterProfiler/R/enrichKEGG.R
Lines 45 to 58 in 2ab30a9
I added gson_file@keytype <- 'ENTREZID' before running enrichKEGG(), and the error disappeared. But I am not sure whether the results are correct after doing this.
In terms of input data, sorry for showing the wrong data. I showed id_transform before, but I actually used id_transform[,1], which is exactly a character vector. Thank you for pointing this out.
> head(id_transform)
id_transform
SP140 434484
SPATA32 328019
SAMD15 238333
FER1L6 631797
RERGL 632971
PHEX 18675
> head(id_transform[,1])
[1] "434484" "328019" "238333" "631797" "632971" "18675"
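To make the difference between the two objects explicit, here is a small sketch that rebuilds the first six rows shown above and extracts the ID column. It assumes id_transform is a one-column data frame with gene symbols as row names, which is only a guess based on the printed output; gene_ids is a name introduced for illustration.

```r
# Assumption: id_transform is a one-column data frame with symbols as row names,
# reconstructed here from the head() output shown above.
id_transform <- data.frame(
  id_transform = c("434484", "328019", "238333", "631797", "632971", "18675"),
  row.names = c("SP140", "SPATA32", "SAMD15", "FER1L6", "RERGL", "PHEX"),
  stringsAsFactors = FALSE
)

# The whole data frame is not a character vector ...
is.character(id_transform)

# ... but its first column is; as.character() also guards against factor columns.
gene_ids <- as.character(id_transform[, 1])
is.character(gene_ids)
```

It is gene_ids, not the data frame itself, that should be passed to the gene argument.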
Sorry for my delayed reply!
Thanks for highlighting the relevant section in the source code of enrichKEGG. I now understand what you tried to achieve, and I agree with you that the GSON object kk is somehow missing the keytype slot.
Indeed, when manually adding it (like you did), enrichKEGG works as expected. See code below.
> ## load library
> library(clusterProfiler)
>
> ## some ids
> id_transform <- c("240427","12705","241770","102633301","319757","116903","72309")
>
> ## generate GSON-object with pathway information
> kk <- gson_KEGG('mmu')
>
> ## use GSON as input: FAILS!
> KEGG_enrich = enrichKEGG(gene = id_transform,
+ organism=kk,
+ use_internal_data = FALSE)
Error in (function (cl, name, valueClass) :
assignment of an object of class “NULL” is not valid for @‘keytype’ in an object of class “enrichResult”; is(value, "character") is not TRUE
>
>
> ## check GSON-object
> kk
>> Gene Set: KEGG
>> 9710 genes annotated by 355 gene sets.
>> Species: mmu
>> Version: Release 110.0+/04-27, Apr 24
>
> ## note that slot keytype is NULL!
> str(kk)
Formal class 'GSON' [package "gson"] with 9 slots
..@ gsid2gene :'data.frame': 38640 obs. of 2 variables:
.. ..$ gsid: chr [1:38640] "mmu00010" "mmu00010" "mmu00010" "mmu00010" ...
.. ..$ gene: chr [1:38640] "103988" "106557" "110695" "11522" ...
..@ gsid2name :'data.frame': 355 obs. of 2 variables:
.. ..$ gsid: chr [1:355] "mmu01100" "mmu01200" "mmu01210" "mmu01212" ...
.. ..$ name: chr [1:355] "Metabolic pathways - Mus musculus (house mouse)" "Carbon metabolism - Mus musculus (house mouse)" "2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)" "Fatty acid metabolism - Mus musculus (house mouse)" ...
..@ gene2name : NULL
..@ species : chr "mmu"
..@ gsname : chr "KEGG"
..@ version : chr "Release 110.0+/04-27, Apr 24"
..@ accessed_date: chr "2024-04-30"
..@ keytype : NULL
..@ info : NULL
>
> ## Fix, and check
> kk@keytype="kegg"
>
> str(kk)
Formal class 'GSON' [package "gson"] with 9 slots
..@ gsid2gene :'data.frame': 38640 obs. of 2 variables:
.. ..$ gsid: chr [1:38640] "mmu00010" "mmu00010" "mmu00010" "mmu00010" ...
.. ..$ gene: chr [1:38640] "103988" "106557" "110695" "11522" ...
..@ gsid2name :'data.frame': 355 obs. of 2 variables:
.. ..$ gsid: chr [1:355] "mmu01100" "mmu01200" "mmu01210" "mmu01212" ...
.. ..$ name: chr [1:355] "Metabolic pathways - Mus musculus (house mouse)" "Carbon metabolism - Mus musculus (house mouse)" "2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)" "Fatty acid metabolism - Mus musculus (house mouse)" ...
..@ gene2name : NULL
..@ species : chr "mmu"
..@ gsname : chr "KEGG"
..@ version : chr "Release 110.0+/04-27, Apr 24"
..@ accessed_date: chr "2024-04-30"
..@ keytype : chr "kegg"
..@ info : NULL
>
>
> ## enrichKEGG now works!
> KEGG_enrich = enrichKEGG(gene = id_transform,
+ organism=kk,
+ use_internal_data = FALSE)
>
> KEGG_enrich
#
# over-representation test
#
#...@organism mmu
#...@ontology KEGG
#...@keytype kegg
#...@gene chr [1:7] "240427" "12705" "241770" "102633301" "319757" "116903" "72309"
#...pvalues adjusted by 'BH' with cutoff <0.05
#...5 enriched terms found
'data.frame': 5 obs. of 11 variables:
$ category : chr "Environmental Information Processing" "Human Diseases" "Organismal Systems" "Organismal Systems" ...
$ subcategory: chr "Signal transduction" "Cancer: specific types" "Circulatory system" "Development and regeneration" ...
$ ID : chr "mmu04340" "mmu05217" "mmu04270" "mmu04360" ...
$ Description: chr "Hedgehog signaling pathway - Mus musculus (house mouse)" "Basal cell carcinoma - Mus musculus (house mouse)" "Vascular smooth muscle contraction - Mus musculus (house mouse)" "Axon guidance - Mus musculus (house mouse)" ...
$ GeneRatio : chr "1/2" "1/2" "1/2" "1/2" ...
$ BgRatio : chr "58/9710" "63/9710" "144/9710" "181/9710" ...
$ pvalue : num 0.0119 0.0129 0.0294 0.0369 0.0416
$ p.adjust : num 0.0388 0.0388 0.0499 0.0499 0.0499
$ qvalue : num 0.00681 0.00681 0.00875 0.00875 0.00875
$ geneID : chr "319757" "319757" "116903" "319757" ...
$ Count : int 1 1 1 1 1
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141
>
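The manual fix above could also be wrapped in a small guard so that an existing keytype is never overwritten. This is only a sketch: ensure_keytype is a hypothetical helper, not a function from gson or clusterProfiler, and the default value "kegg" simply mirrors what was used in this thread rather than a documented requirement.

```r
library(methods)

# Hypothetical helper: fill in the keytype slot only when it is NULL.
# Works on any S4 object that has a `keytype` slot accepting character or NULL.
ensure_keytype <- function(gson_obj, keytype = "kegg") {
  stopifnot(is.character(keytype), length(keytype) == 1L)
  if (is.null(gson_obj@keytype)) {
    gson_obj@keytype <- keytype
  }
  gson_obj
}

# Intended usage (needs clusterProfiler and a connection to the KEGG website):
# kk <- gson_KEGG("mmu")
# kk <- ensure_keytype(kk)
# KEGG_enrich <- enrichKEGG(gene = id_transform, organism = kk)
```

In the runs shown in this thread, the GSON-based call and the plain organism="mmu" call report the same 5 enriched terms with the same p-values, which suggests the workaround gives equivalent results for this input.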
I opened an issue about this on the GitHub repository of the gson package:
YuLab-SMU/gson#9