
genekitr's People

Contributors

reedliu


genekitr's Issues

transId from alias to symbol no longer works

I'm encountering the following error when using transId() to convert gene aliases to symbols despite the same script working a week ago.

Here's the output using the example in the documentation:

> transId(c("BCC7", "TP53", "PD1", "PDL1", "TET2"), "sym")
Maybe your "trans_to" argument is wrong, please check again...
Error in tbl_vars_dispatch(x) : object 'res' not found

Issue with ORA and importCP

Hi,
Firstly, kudos for the development of genekitr. It's a great tool and your reasons for its creation resonate so much with my experiences so far.

I'm currently working with the organism Yarrowia lipolytica and have noted some challenges:

  1. The geneset for Yarrowia lipolytica is available in the geneset package (GO and KEGG), but there is no organism value attached to it when running getGO:
> mf <- getGO(org = "Yarrowia lipolytica", ont = "mf")
> head(mf$geneset)
          mf          gene
1 GO:0000030 YALI0_C04004g
2 GO:0000030 YALI0_D10549g
3 GO:0000030 YALI0_B01672g
4 GO:0000030 YALI0_E02222g
5 GO:0000030 YALI0_A20922g
6 GO:0000030 YALI0_A13585g
> head(mf$geneset_name)
          id                           name
1 GO:0000030   mannosyltransferase activity
2 GO:0000049                   tRNA binding
3 GO:0000149                  SNARE binding
4 GO:0000166             nucleotide binding
5 GO:0000175 3'-5'-RNA exonuclease activity
6 GO:0000287          magnesium ion binding
> mf$organism
[1] NA
  2. However, I ran into issues with follow-up functions, specifically the genORA function. It suggests there's no short name for the organism. This is perplexing given that Yarrowia lipolytica is included in the geneset package. I also tried to add the organism value myself, but the function still does not work.
>   gs <- genORA(de.genes$ensembl_gene_id, mf$geneset,padj_method = "BH",
+                p_cutoff = 0.05,)
Error in if (organism == "hg" | organism == "human" | organism == "hsa" |  : 
  argument is of length zero
  3. I also tried a different route, performing the ORA with clusterProfiler and then importing the results into genekitr. But this too resulted in an error.
>   ora_go <- clusterProfiler::enrichGO(gene = de.genes,
+                         OrgDb = org.Ylipolytica.eg.db,
+                         universe = filtered_data$entrez,
+                         keyType = "ENTREZID",  
+                         ont = "ALL",  # all three ontologies: BP, MF, and CC
+                         pAdjustMethod = "BH",  # adjust method
+                         pvalueCutoff = 0.05,
+                         minGSSize = 5,
+                         maxGSSize = 500,
+                         readable = FALSE)
>   go_easy <- importCP(ora_go, type = "go")
Error in mapEnsOrg(object@organism) : 
Check the latin_short_name in `genekitr::ensOrg_name`

I'd appreciate any insights or suggestions you might have regarding these issues. Is there a workaround or am I possibly missing a step?
Thanks!
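
One possible reading of the `argument is of length zero` error, offered as a sketch rather than a diagnosis: the `genORA` call above passes `mf$geneset` (the data frame) instead of the whole `mf` list, so any organism field stored on the list never reaches the function. The base R snippet below uses a mock list (`mock_gs`, hypothetical) shaped like the printed `getGO` output. It shows that a missing field produces exactly this error inside an `if` condition, while an `NA` field produces a different one. Whether `genORA` reads the organism this way internally is an assumption.

```r
# Mock object shaped like the getGO() output printed above (hypothetical data).
mock_gs <- list(
  geneset = data.frame(mf = "GO:0000030", gene = "YALI0_C04004g"),
  geneset_name = data.frame(id = "GO:0000030", name = "mannosyltransferase activity"),
  organism = NA
)

# Helper: run an if() test on `organism` and return the error message, if any.
check_org <- function(organism) {
  tryCatch(
    {
      if (organism == "hg" | organism == "human") "human" else "other"
    },
    error = function(e) conditionMessage(e)
  )
}

# The data frame alone has no `organism` column, so `$organism` is NULL.
msg_null <- check_org(mock_gs$geneset$organism)
# The list carries `organism = NA`, which fails in a different way.
msg_na <- check_org(mock_gs$organism)

print(msg_null)
print(msg_na)
```

If this reading holds, passing the full list (`geneset = mf`) may change the error, although an `NA` organism could still stop the function.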

org.db bug in gotangram

Hi!

I'm trying to plot a "gotangram" for GO BP enrichment results from A. thaliana. Other plot types such as bar, upset, and network work fine, but gotangram raises an error: "Error: Bioconductor orgdb for org.At.eg.db not found. You should install first.". This looks like a bug, since the package for A. thaliana is org.At.tair.db.

test <- c('AT1G12610', 'AT5G47600', 'AT1G33760') # just to make it easy to reproduce
gs <- getGO(org="Arabidopsis thaliana", ont="bp")
go_bp <- genORA(test,geneset=gs) 
plotEnrich(go_bp, plot_type = "gotangram", sim_method = "Rel", org='Arabidopsis thaliana')

ps
Many thanks for the package. I do love it.
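
The mismatch reported here can be illustrated with a small base R sketch. The helper `orgdb_name` is hypothetical and is not genekitr's internal code; it only shows how a naive "org.Xx.eg.db" pattern yields `org.At.eg.db` unless species with differently named Bioconductor packages are special-cased.

```r
# Hypothetical helper, not genekitr's internal code: derive a Bioconductor
# OrgDb package name from a Latin species name.
orgdb_name <- function(latin_name) {
  # Species whose OrgDb package does not follow the "org.Xx.eg.db" pattern.
  exceptions <- c(
    "Arabidopsis thaliana" = "org.At.tair.db",
    "Saccharomyces cerevisiae" = "org.Sc.sgd.db"
  )
  if (latin_name %in% names(exceptions)) {
    return(unname(exceptions[latin_name]))
  }
  # Default pattern: first letter of the genus plus first letter of the species.
  parts <- strsplit(latin_name, " ", fixed = TRUE)[[1]]
  short <- paste0(toupper(substr(parts[1], 1, 1)), tolower(substr(parts[2], 1, 1)))
  paste0("org.", short, ".eg.db")
}

print(orgdb_name("Arabidopsis thaliana"))
print(orgdb_name("Homo sapiens"))
```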

Plotting for ShinyGO ORA results

Describe the bug
Hello, I am attempting to visualize the GO terms from my ShinyGO ORA results with plotEnrich. There are GO terms that plotEnrich won't recognize. Is there a way to skip them entirely? Thank you.
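
Until the package offers such an option, one workaround might be to filter the table before plotting. The sketch below is base R only and rests on assumptions: the column names (`ID`, `Description`, `pvalue`) are hypothetical rather than ShinyGO's actual schema, and "unrecognized" is taken to mean an identifier that is not a well-formed GO ID, which the issue does not confirm.

```r
# Hypothetical ORA table; the column names are assumptions, not ShinyGO's schema.
ora <- data.frame(
  ID = c("GO:0006955", "Path:12345", "GO:0008150", "GO:123"),
  Description = c("immune response", "custom pathway", "biological_process", "truncated id"),
  pvalue = c(0.001, 0.02, 0.03, 0.04),
  stringsAsFactors = FALSE
)

# A GO identifier is "GO:" followed by exactly seven digits.
is_go_id <- grepl("^GO:[0-9]{7}$", ora$ID)

ora_go_only <- ora[is_go_id, , drop = FALSE]   # rows kept for plotting
ora_skipped <- ora[!is_go_id, , drop = FALSE]  # rows set aside for inspection

print(ora_go_only$ID)
print(ora_skipped$ID)
```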

transId loading issue

Describe the bug
transId not working

To Reproduce
Steps to reproduce the behavior:
Just run the example below.

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows
  • Version: 11
  • Browser: Brave

transId keeping unique ids issue

Hello,

Some genes are not changed to the new symbols (the new symbol is BABAM2):
image

Also, the information for the gene is not complete:
image

plotGSEA with max.overlap parameter

Hi there,

Thanks for developing this fantastic package. I have used it a lot, and I want to raise an issue about the visualization of the GSEA results.

In the 'classic' mode, if the gene labels overlap, only some of the genes are shown. Could you add a "max.overlap" parameter to customize the number of genes shown in the GSEA plot?

Best,
Logan

transId() mouse symbols

Hello,

I was comparing transId() with biomaRt and found that biomaRt returns more symbols than transId() for the same Ensembl IDs. They are official MGI symbols. What would be the reason?

image

labels overlay bars in plotEnrichAdv when left x-axis limit is less than the right limit

Hi!
When I create a figure with plotEnrichAdv on simplified data and the left x-axis limit is less than the right x-axis limit, the labels overlay the bars of the graph.

Let up_go_bp_sim and down_go_bp_sim be the data frames returned by the genORA function run with up- and downregulated DEGs.
Then:

  1. Left limit is less than the right limit
    Labels overlay the bars.
plotEnrichAdv(up_go_bp_sim, down_go_bp_sim,
              plot_type = "one",
              term_metric = "FoldEnrich",
              stats_metric = "p.adjust",
              xlim_left = 15, xlim_right = 20) +
    theme(legend.position = c(0.8, 0.5))

image

  2. Left limit is greater than the right limit (as in the example in the documentation)
    Everything is OK.
plotEnrichAdv(up_go_bp_sim, down_go_bp_sim,
              plot_type = "one",
              term_metric = "FoldEnrich",
              stats_metric = "p.adjust",
              xlim_left = 20.1, xlim_right = 20) +  # now left border is greater than the right one
    theme(legend.position = c(0.8, 0.5))

image

ps:
It also would be great to add more parameters to simGO function like cutoff etc.

pps:
Thanks again for the package!

transId() does not return all input symbols

Hi--

I use transId():

transId(id = IDs, transTo= "symbol", org = "mouse", keepNA = TRUE, unique = TRUE)

which should return all the input symbols; however, it returns fewer records, and there is always one row that is all NA.

Please use the same symbols file to verify.
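
A quick way to see which inputs were dropped, and to rebuild a table with one row per input, is sketched below in base R. The data are made up, and the column names `input_id` and `symbol` are assumptions about the shape of the conversion result rather than something this report confirms.

```r
# Hypothetical inputs and a hypothetical conversion result with one ID missing.
ids <- c("Trp53", "Mob4", "NotAGene", "Ankrd44")
res <- data.frame(
  input_id = c("Trp53", "Mob4", "Ankrd44"),
  symbol = c("Trp53", "Mob4", "Ankrd44"),
  stringsAsFactors = FALSE
)

# Inputs that never made it into the result.
missing_ids <- setdiff(ids, res$input_id)

# Rebuild a table with one row per input, in input order; unmatched IDs get NA.
aligned <- data.frame(
  input_id = ids,
  symbol = res$symbol[match(ids, res$input_id)],
  stringsAsFactors = FALSE
)

print(missing_ids)
print(aligned)
```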

new version is memory hungry

Hello--

In the previous version, I used to convert 30k gene symbols in one command on my machine with 32 GB of RAM and never had a problem. Now, when I run the same command (transId) on the same symbols, even a machine with 128 GB kills the process because it runs out of memory.
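
As a stopgap, the conversion could be run in chunks so that each call handles fewer IDs. The sketch below is base R only; `convert_in_chunks` and `toy_convert` are hypothetical names, and `toy_convert` merely stands in for the real conversion call. Whether chunking actually lowers peak memory for transId is an untested assumption.

```r
# Apply a conversion function to IDs in fixed-size chunks and stack the results.
# `convert` is a placeholder for whatever function does the real lookup.
convert_in_chunks <- function(ids, convert, chunk_size = 5000) {
  # Assign each ID to a chunk number: 1, 1, ..., 2, 2, ...
  groups <- ceiling(seq_along(ids) / chunk_size)
  chunks <- split(ids, groups)
  results <- lapply(chunks, convert)
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Stand-in converter used only to make the sketch runnable.
toy_convert <- function(x) {
  data.frame(input_id = x, symbol = toupper(x), stringsAsFactors = FALSE)
}

ids <- paste0("gene", 1:12)
res <- convert_in_chunks(ids, toy_convert, chunk_size = 5)
print(nrow(res))
```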

plotGSEA classic type for non-model species

Describe the bug
Hello, I was planning to pass an fgsea (preranked GSEA) result to a plotEnrich function for plotting. Beforehand I calculated the gene count and GeneRatio, mapped geneID to symbol, and renamed the columns so that the data frame looked identical to the one returned by genGSEA (which I find less flexible than fgsea).

However, when I attempted to plot the results for a single category, which I had checked was present in gsea_df, I received an error:

Error in `$<-.data.frame`(`*tmp*`, "gene", value = c("BnaA04g22070D", : 
replacement has 64949 rows, data has 65732

With the following traceback:

8. stop(sprintf(ngettext(N, "replacement has %d row, data has %d", 
"replacement has %d rows, data has %d"), N, nrows), domain = NA)
7. `$<-.data.frame`(`*tmp*`, "gene", value = c("BnaA04g22070D", 
NA, NA, NA, NA, "BnaC01g43250D", NA, NA, NA, NA, "BnaC07g39370D", 
"BnaA03g47170D", NA, NA, "BnaCnng19060D", NA, NA, "BnaC01g19310D", 
NA, NA, "BnaC07g50360D", NA, NA, NA, "BnaA05g03390D", NA, NA, ...
6. `$<-`(`*tmp*`, "gene", value = c("BnaA04g22070D", NA, NA, NA, 
NA, "BnaC01g43250D", NA, NA, NA, NA, "BnaC07g39370D", "BnaA03g47170D", 
NA, NA, "BnaCnng19060D", NA, NA, "BnaC01g19310D", NA, NA, "BnaC07g50360D", 
NA, NA, NA, "BnaA05g03390D", NA, NA, NA, NA, NA, NA, NA, NA, ...
5. calcScore(geneset, genelist, x, exponent, fortify = TRUE, org)
4. FUN(X[[i]], ...)
3. lapply(show_pathway, function(x) {
calcScore(geneset, genelist, x, exponent, fortify = TRUE, 
org)
})
2. do.call(rbind, lapply(show_pathway, function(x) {
calcScore(geneset, genelist, x, exponent, fortify = TRUE, 
org)
}))
1. genekitr::plotGSEA(BP_HDAC_list, plot_type = "classic", show_pathway = "GO:0040029")

To Reproduce
reprex_plotGSEA_filtered.xlsx

This is my Excel file containing the data frames I used after preprocessing (named "gsea_df", "genelist", "geneset", "exponent" and "org"). I am working with Brassica napus external_gene_name ENA identifiers.

I filtered the GSEA result down to 21 rows to preserve the confidentiality of my results, but it still contains the identifier I want to create a GSEA plot for, GO:0040029. If this is a problem for test generation, please let me know.
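
The traceback shows a vector containing NAs being assigned to a "gene" column whose length differs from the number of rows. The issue does not establish the cause, so the base R sketch below only illustrates one way such a mismatch can arise (names dropped from a ranked list) together with two pre-flight checks. All data and variable names are hypothetical.

```r
# Toy ranked gene list with one missing name (hypothetical data).
genelist <- c(BnaA = 2.1, BnaB = 1.4, BnaC = -0.3, BnaD = -1.8)
names(genelist)[3] <- NA

df <- data.frame(score = unname(genelist))

# Dropping NA names shortens the vector, so it no longer fits the data frame.
gene_names <- names(genelist)[!is.na(names(genelist))]
msg <- tryCatch(
  {
    df$gene <- gene_names
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Pre-flight checks worth running on the real inputs before plotting.
n_na_names <- sum(is.na(names(genelist)))
n_dup_names <- sum(duplicated(names(genelist)))
print(c(na = n_na_names, dup = n_dup_names))
```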


transId() updating symbols weird behaviour

Hello--

I am updating old gene symbols with keepNA = FALSE, unique = FALSE.

I am getting some strange data (please see below).
Row 138 (Gm553): this is an official symbol, but it is returned as NA.
Rows 149-151: the original symbols are Ankrd44 & 4930444A19Rik.
Rows 156 and 157 (Mob4): it is returned once as Mob4 and once as NA.

image

ORA result plotting error because of duplicated terms

Describe the bug
I am trying to create bar plots of my ORA results but keep getting an error in dplyr::mutate()

To Reproduce
Steps to reproduce the behavior:
Using the attached test file 'testgenelist.csv', the following code should reproduce the error:

library(genekitr)
library(geneset)
gs3 <- getReactome(org = "human")
testgenes <- read.csv(file = "data/testgenelist.csv", header = TRUE, sep = ",")
## ORA Analysis
id <- testgenes$GeneID
test_ego <- genORA(id,
                        geneset = gs3,
                        p_cutoff = 0.05,
                        q_cutoff = 0.10
)

#plot
plotEnrich(test_ego, plot_type = "bar")
The following error was raised (screenshot included):

plotEnrich(test_ego, plot_type = "bar")
Error in dplyr::mutate():
ℹ In argument: Description = factor(.$Description, levels = .$Description, ordered = T).
Caused by error in levels<-:
! factor level [20] is duplicated
Run rlang::last_trace() to see where the error occurred.

When rlang last trace is run:
Error in dplyr::mutate():
ℹ In argument: Description = factor(.$Description, levels = .$Description, ordered = T).
Caused by error in levels<-:
! factor level [20] is duplicated

Backtrace:

  1. ├─genekitr::plotEnrich(test_ego, plot_type = "bar")
  2. │ └─... %>% ...
  3. ├─dplyr::mutate(...)
  4. ├─dplyr:::mutate.data.frame(...)
  5. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
  6. │ ├─base::withCallingHandlers(...)
  7. │ └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
  8. │ └─mask$eval_all_mutate(quo)
  9. │ └─dplyr (local) eval()
  10. ├─base::factor(.$Description, levels = .$Description, ordered = T)
  11. └─base::.handleSimpleError(...)
  12. └─dplyr (local) h(simpleError(msg, call))
  13. └─rlang::abort(message, class = error_class, parent = parent, call = error_call)
    

Expected behavior

I expected the bar plot to be generated as normal. I haven't had this issue with any other datasets I have analyzed. Inspecting the test_ego result doesn't reveal anything unusual either. A screenshot of the ORA result data frame (test_ego) is included.
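
The error message points at `factor(.$Description, levels = .$Description, ordered = T)`, which fails whenever two rows share the same Description. The base R sketch below reproduces that failure on made-up data (the IDs and descriptions are hypothetical) and shows two possible workarounds to apply before plotting. It is not a description of how plotEnrich should be fixed.

```r
# Toy ORA result in which two different IDs share one Description (hypothetical).
ora <- data.frame(
  ID = c("R-HSA-1", "R-HSA-2", "R-HSA-3"),
  Description = c("Signaling by X", "Metabolism of Y", "Signaling by X"),
  stringsAsFactors = FALSE
)

# Same construction as in the error message: levels taken from the column itself.
msg <- tryCatch(
  {
    factor(ora$Description, levels = ora$Description, ordered = TRUE)
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Workaround 1: keep only the first row for each Description.
ora_dedup <- ora[!duplicated(ora$Description), , drop = FALSE]

# Workaround 2: keep every row but make repeated labels distinct.
ora_unique <- ora
ora_unique$Description <- make.unique(ora$Description, sep = " #")

f <- factor(ora_unique$Description, levels = ora_unique$Description, ordered = TRUE)
print(levels(f))
```

Checking `sum(duplicated(test_ego$Description))` on the real result would show whether this is what happens in the reported case.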

Screenshots
testgenelist.csv
image
image

Desktop (please complete the following information):

  • OS: macOS
  • Version: 12.6.5
  • Browser: Chrome


Problem about transId

Hi,

Thank you for your fantastic work and the great convenience you've brought to us.

However, when I recently tried to convert a column of IDs with both the 'keepNA' and 'unique' parameters set to TRUE, the returned data did not have the same length as the input. Even more peculiar, when I pass the initially missed IDs to the function again, they are all returned, although some of the values may be None. The genekitr package version is 1.2.5. Details are as described above. I'm looking forward to your response, and once again, thank you for your awesome work.

image
image

Best wishes,

Zhaoyu

genInfo missing data

Hello,

I am again having issues with getting gene information.

For example, the gene ENSG00000257122 has an HGNC symbol, but the package reports it as NA.

Best,

Citation

How do we cite your tool? I can't find a citation. Thanks.
