ganglilab / genekitr
🧬 Gene analysis toolkit based on R
Home Page: https://www.genekitr.fun
License: GNU General Public License v3.0
I'm encountering the following error when using transId() to convert gene aliases to symbols despite the same script working a week ago.
Here's the output using the example in the documentation:
> transId(c("BCC7", "TP53", "PD1", "PDL1", "TET2"), "sym")
Maybe your "trans_to" argument is wrong, please check again...
Error in tbl_vars_dispatch(x) : object 'res' not found
Hi,
Firstly, kudos for the development of genekitr. It's a great tool and your reasons for its creation resonate so much with my experiences so far.
I'm currently working with the organism Yarrowia lipolytica and have noted some challenges:
Yarrowia lipolytica is available in the geneset package (GO and KEGG), but there is no organism value attached to it when running getGO:
> mf <- getGO(org = "Yarrowia lipolytica", ont = "mf")
> head(mf$geneset)
mf gene
1 GO:0000030 YALI0_C04004g
2 GO:0000030 YALI0_D10549g
3 GO:0000030 YALI0_B01672g
4 GO:0000030 YALI0_E02222g
5 GO:0000030 YALI0_A20922g
6 GO:0000030 YALI0_A13585g
> head(mf$geneset_name)
id name
1 GO:0000030 mannosyltransferase activity
2 GO:0000049 tRNA binding
3 GO:0000149 SNARE binding
4 GO:0000166 nucleotide binding
5 GO:0000175 3'-5'-RNA exonuclease activity
6 GO:0000287 magnesium ion binding
> mf$organism
[1] NA
The genORA function fails with an error that suggests there's no short name for the organism. This is perplexing given the initial inclusion of Yarrowia lipolytica in the geneset. I also tried to add the organism value, but the function still does not work.
> gs <- genORA(de.genes$ensembl_gene_id, mf$geneset,padj_method = "BH",
+ p_cutoff = 0.05,)
Error in if (organism == "hg" | organism == "human" | organism == "hsa" | :
argument is of length zero
I also tried running the enrichment with clusterProfiler and importing the result into genekitr. But this too resulted in an error.
> ora_go <- clusterProfiler::enrichGO(gene = de.genes,
+ OrgDb = org.Ylipolytica.eg.db,
+ universe = filtered_data$entrez,
+ keyType = "ENTREZID",
+ ont = "ALL", # all three ontologies (BP, MF, CC)
+ pAdjustMethod = "BH", # adjust method
+ pvalueCutoff = 0.05,
+ minGSSize = 5,
+ maxGSSize = 500,
+ readable = FALSE)
> go_easy <- importCP(ora_go, type = "go")
Error in mapEnsOrg(object@organism) :
Check the latin_short_name in `genekitr::ensOrg_name`
I'd appreciate any insights or suggestions you might have regarding these issues. Is there a workaround or am I possibly missing a step?
Thanks!
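One possible reading of the genORA error above ("argument is of length zero"), offered as an assumption and not checked against the genekitr source: the call passes the data frame `mf$geneset` instead of the whole list `mf`, so any lookup of an `organism` element on that data frame yields NULL, and comparing NULL with a string produces a zero-length condition. The base R sketch below only illustrates that mechanism on a toy list shaped like the getGO output shown above; it does not call genekitr.

```r
# Toy stand-in for the list returned by getGO(); values copied from the
# output above, structure assumed from that output.
mf <- list(
  geneset = data.frame(mf = "GO:0000030", gene = "YALI0_C04004g"),
  geneset_name = data.frame(id = "GO:0000030",
                            name = "mannosyltransferase activity"),
  organism = NA
)

# A data frame has no 'organism' column, so this lookup returns NULL.
org_from_df <- mf$geneset$organism
print(is.null(org_from_df))

# Comparing NULL with a string gives logical(0), which if() rejects.
cond <- org_from_df == "hg"
err_null <- tryCatch(
  if (cond) "human" else "other",
  error = function(e) conditionMessage(e)
)
print(err_null)

# An NA organism fails too, but with a different message.
err_na <- tryCatch(
  if (mf$organism == "hg") "human" else "other",
  error = function(e) conditionMessage(e)
)
print(err_na)
```

If this reading is right, passing the full list (as in `genORA(id, geneset = mf)`) would get past the length-zero check, although the NA organism could still be a problem further on.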
Hi!
I'm trying to plot a "gotangram" for GO BP enriched A. thaliana data. Other plot types such as bar, upset, and network worked fine, but gotangram raises an error: "Error: Bioconductor orgdb for org.At.eg.db not found. You should install first." It looks like a bug, since for A. thaliana the package is org.At.tair.db.
test <- c('AT1G12610', 'AT5G47600', 'AT1G33760') # just to make it easy to reproduce
gs <- getGO(org="Arabidopsis thaliana", ont="bp")
go_bp <- genORA(test,geneset=gs)
plotEnrich(go_bp, plot_type = "gotangram", sim_method = "Rel", org='Arabidopsis thaliana')
P.S. Many thanks for the package. I do love it.
Describe the bug
Hello, I am attempting to visualize GO terms from my ShinyGO ORA results with plotEnrich. There are GO terms that plotEnrich won't recognize. Is there a way to skip them entirely? Thank you.
Hi,
When I attempt to run
"gse <- genGSEA(genelist = ranks, geneset = gs)"
I receive the following error:
"Error in function (type, msg, asError = TRUE) :
Could not resolve host: genekitr-china.oss-accelerate.aliyuncs.com"
What is the reason for this error?
Hi there,
Thanks for developing this fantastic package. I have used it a lot, and I want to raise an issue about the visualization of the GSEA results.
In the 'classic' mode, if the gene labels overlap, only some of the genes are shown. Could you add a "max.overlap" argument to customize how many gene labels are shown in the GSEA plot?
Best,
Logan
Hi!
When I create a figure with plotEnrichAdv on simplified data and the left x-axis limit is less than the right x-axis limit, the labels overlay the bars of the graph.
Let up_go_bp_sim and down_go_bp_sim be the data frames returned by the genORA function run on up- and downregulated DEGs.
Then:
plotEnrichAdv(up_go_bp_sim, down_go_bp_sim,
plot_type = "one",
term_metric = "FoldEnrich",
stats_metric = "p.adjust",
xlim_left = 15, xlim_right = 20) +
theme(legend.position = c(0.8, 0.5))
plotEnrichAdv(up_go_bp_sim, down_go_bp_sim,
plot_type = "one",
term_metric = "FoldEnrich",
stats_metric = "p.adjust",
xlim_left = 20.1, xlim_right = 20) + # now left border is greater than the right one
theme(legend.position = c(0.8, 0.5))
ps:
It also would be great to add more parameters to simGO function like cutoff etc.
pps:
Thanks again for the package!
Hi--
I use transId():
transId(id = IDs, transTo= "symbol", org = "mouse", keepNA = TRUE, unique = TRUE)
which should return all the input symbols; however, it returns fewer records, and there is always one row that is entirely NA.
Please use the same symbols file to verify.
Hello--
In the previous version, I could convert 30k gene symbols in one command on my machine with 32 GB of memory and never had a problem. Now, when I run the same command (transId) on the same symbols, even a machine with 128 GB kills the process because it runs out of memory.
Hello, is there any way to load older versions of the genekitr package? Thank you.
Describe the bug
Hello, I was planning to pass an fgsea (preranked GSEA) result to plotEnrich for plotting. Beforehand I calculated the gene count and GeneRatio, mapped geneID_symbol, and renamed the columns so that the data frame looked identical to the model data frame returned by genGSEA (which I find less flexible than fgsea).
However, when I attempted to plot the results for a single category, which I checked was in the gsea_df, I received an error:
Error in `$<-.data.frame`(`*tmp*`, "gene", value = c("BnaA04g22070D", :
replacement has 64949 rows, data has 65732
With the following traceback:
8. stop(sprintf(ngettext(N, "replacement has %d row, data has %d",
   "replacement has %d rows, data has %d"), N, nrows), domain = NA)
7. `$<-.data.frame`(`*tmp*`, "gene", value = c("BnaA04g22070D",
   NA, NA, NA, NA, "BnaC01g43250D", NA, NA, NA, NA, "BnaC07g39370D",
   "BnaA03g47170D", NA, NA, "BnaCnng19060D", NA, NA, "BnaC01g19310D",
   NA, NA, "BnaC07g50360D", NA, NA, NA, "BnaA05g03390D", NA, NA, ...
6. `$<-`(`*tmp*`, "gene", value = c("BnaA04g22070D", NA, NA, NA,
   NA, "BnaC01g43250D", NA, NA, NA, NA, "BnaC07g39370D", "BnaA03g47170D",
   NA, NA, "BnaCnng19060D", NA, NA, "BnaC01g19310D", NA, NA, "BnaC07g50360D",
   NA, NA, NA, "BnaA05g03390D", NA, NA, NA, NA, NA, NA, NA, NA, ...
5. calcScore(geneset, genelist, x, exponent, fortify = TRUE, org)
4. FUN(X[[i]], ...)
3. lapply(show_pathway, function(x) {
   calcScore(geneset, genelist, x, exponent, fortify = TRUE,
   org)
   })
2. do.call(rbind, lapply(show_pathway, function(x) {
   calcScore(geneset, genelist, x, exponent, fortify = TRUE,
   org)
   }))
1. genekitr::plotGSEA(BP_HDAC_list, plot_type = "classic", show_pathway = "GO:0040029")
To Reproduce
reprex_plotGSEA_filtered.xlsx
This is my Excel file containing the data frames I used after preprocessing (named "gsea_df", "genelist", "geneset", "exponent" and "org"). I am working with Brassica napus external_gene_name ENA identifiers.
I filtered the gsea_result down to only 21 rows to preserve the confidentiality of my results, but it still contains the identifier I want to create a GSEA plot from, GO:0040029. If this is a problem for test generation, please let me know.
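The error above ("replacement has 64949 rows, data has 65732") is the message base R gives when a vector assigned to a data frame column does not have one value per row. The sketch below reproduces that message on toy data and shows one check that may be worth running on a hand-built gene list. That duplicated names are the cause here is only a guess; the toy names and values are invented, and what calcScore does internally is not known from this report.

```r
# A data frame with 3 rows cannot take a 2-element column.
df <- data.frame(id = c("g1", "g2", "g3"))
msg <- tryCatch(
  {
    df$gene <- c("a", "b")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical ranked gene list with one repeated name (invented values).
genelist <- c(g1 = 2.1, g2 = 0.5, g2 = -0.3, g3 = -1.7)

# Count repeated and missing names; either can make a later join or match
# return a different number of rows than the input had.
n_dup <- sum(duplicated(names(genelist)))
n_na <- sum(is.na(names(genelist)))
print(c(duplicated = n_dup, missing = n_na))
```

Comparing `length(genelist)` with `length(unique(names(genelist)))` on the real data would show quickly whether this kind of mismatch is present.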
Hello--
I am updating old gene symbols with keepNA = FALSE, unique = FALSE.
I am getting some strange results (please see below).
Row 138 (Gm553): this is an official symbol, but it is returned as NA.
Rows 149-151: the original symbols are Ankrd44 and 4930444A19Rik.
Rows 156 and 157 (Mob4): it is returned once as Mob4 and once as NA.
Describe the bug
I am trying to create bar plots of my ORA results but keep getting an error in dplyr::mutate()
To Reproduce
Steps to reproduce the behavior:
using attached testfile 'testgenelist.csv', the following code should reproduce the error
library(genekitr)
library(geneset)
gs3 <- getReactome(org = "human")
testgenes <- read.csv(file = "data/testgenelist.csv", header = TRUE, sep = ",")
## ORA Analysis
id <- testgenes$GeneID
test_ego <- genORA(id,
geneset = gs3,
p_cutoff = 0.05,
q_cutoff = 0.10
)
#plot
plotEnrich(test_ego, plot_type = "bar")
Error in `dplyr::mutate()`:
ℹ In argument: `Description = factor(.$Description, levels = .$Description, ordered = T)`.
Caused by error in `levels<-`:
! factor level [20] is duplicated
Run `rlang::last_trace()` to see where the error occurred.
Backtrace:
▆
└─rlang::abort(message, class = error_class, parent = parent, call = error_call)
Expected behavior
I expected the bar plot to be generated as normal. I haven't had this issue with any other datasets I have analyzed, and inspecting the test_ego result doesn't show anything unusual either. A screenshot of the ORA result data frame (test_ego) is included.
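The message "factor level [20] is duplicated" is what base R reports when the vector given as `levels` contains the same value more than once, which suggests that two rows of the result share the same Description. The sketch below reproduces this on a toy data frame and shows one possible workaround (making the labels unique before plotting). The toy IDs and descriptions are invented, and whether plotEnrich accepts the relabelled data frame unchanged is an assumption.

```r
# Toy stand-in for an ORA result; only the Description column name is taken
# from the error message above, the values are invented.
toy <- data.frame(
  ID = c("R-1", "R-2", "R-3"),
  Description = c("Signaling", "Metabolism", "Signaling"),
  stringsAsFactors = FALSE
)

# Reproduce the failure: repeated values are not valid factor levels.
msg <- tryCatch(
  {
    factor(toy$Description, levels = toy$Description, ordered = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# List the descriptions that occur more than once.
dup_terms <- unique(toy$Description[duplicated(toy$Description)])
print(dup_terms)

# Possible workaround: make the labels unique so they can serve as levels.
toy$Description <- make.unique(toy$Description, sep = " #")
ok <- factor(toy$Description, levels = toy$Description, ordered = TRUE)
print(levels(ok))
```

On the real result, `test_ego[duplicated(test_ego$Description), ]` would show which terms are repeated, assuming test_ego is a plain data frame with that column.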
Screenshots
testgenelist.csv
Hi,
Thank you for your fantastic work and the great convenience you've brought to us.
However, recently, when I attempted to convert a column of IDs, despite setting both the 'keepNA' and 'unique' parameters to TRUE, I noticed that the returned data length doesn't match the input. What's even more peculiar is that when I re-enter the initially missed IDs into the function, the data is then output completely, although some may be None. The package version of genekitr is 1.2.5. Details are as mentioned above. I'm looking forward to your response, and once again, thank you for your awesome work.
Best wishes,
Zhaoyu
Hello,
I am again having issues retrieving gene information.
For example, the gene ENSG00000257122 has an HGNC symbol, but the package reports it as NA.
Best,
How do we cite your tool? I can't find a citation. Thanks.