bioconductor / keggrest
Client-side REST access to KEGG
Home Page: https://bioconductor.org/packages/KEGGREST
Hi,
I was trying to pull the gene information for a given KEGG pathway ID using keggGet(). However, the call failed with the following error:
Error in curl::curl_fetch_memory(url, handle = handle) :
SSL peer certificate or SSH remote key was not OK: [rest.kegg.jp] SSL certificate problem: certificate has expired
Could you please help fix this?
Thanks
Li
I encountered this problem on GitHub Actions with the ubuntu-latest runner. In an R package that has KEGGREST and ChemmineOB in Imports (and has the system deps for ChemmineOB installed), there is a segfault during R CMD check. Here's a reprex.
* checking dependencies in R code ... NOTE
*** caught segfault ***
address 0x7ff29d5e2b00, cause 'invalid permissions'
Traceback:
1: rgb(1, 1, 1)
2: make_style(rgb(1, 1, 1))
3: make_DNA_AND_RNA_COLORED_LETTERS()
4: fun(libname, pkgname)
5: doTryCatch(return(expr), name, parentenv, handler)
6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
7: tryCatchList(expr, classes, parentenv, handlers)
8: tryCatch(fun(libname, pkgname), error = identity)
9: runHook(".onLoad", env, package.lib, package)
10: loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
11: asNamespace(ns)
12: namespaceImportFrom(ns, loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]), i[[2L]], from = package)
13: loadNamespace(p)
14: withCallingHandlers(expr, message = function(c) if (inherits(c, classes)) tryInvokeRestart("muffleMessage"))
15: suppressMessages(loadNamespace(p))
16: withCallingHandlers(expr, warning = function(w) if (inherits(w, classes)) tryInvokeRestart("muffleWarning"))
17: suppressWarnings(suppressMessages(loadNamespace(p)))
18: doTryCatch(return(expr), name, parentenv, handler)
19: tryCatchOne(expr, names, parentenv, handlers[[1L]])
20: tryCatchList(expr, classes, parentenv, handlers)
21: tryCatch(suppressWarnings(suppressMessages(loadNamespace(p))), error = function(e) e)
22: tools:::.check_packages_used(package = "testpkg")
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)
This only happens when both KEGGREST and ChemmineOB are dependencies. It seems to be something related to the rgb() symbol in ChemmineOB.so conflicting with the rgb() from grDevices. I came here first because the traceback seems to point to Biostrings, which is a dependency of KEGGREST and not of ChemmineOB, but let me know if you think this belongs in the ChemmineOB repo. I also tried this with Biostrings and ChemmineOB as Imports and didn't get an error.
Hello,
As of the latest version, when I try to use color.pathway.by.objects(), the foreground and background colors seem to be swapped in the output image for some reason.
I'd appreciate any help on the issue,
Best,
-E
Hello there,
The quick question is: when performing KEGG enrichment analysis, should genes that have no KEGG identifier/annotation be excluded from the Wilcoxon test comparing p-values of genes in a pathway vs. genes not in the pathway?
The long explanation of the question is: about 2500/14700 of our genes do not get converted to KEGG identifiers using keggConv. Presumably this is because they are not in the KEGG database.
Should we exclude those 2500 NA genes from the Wilcoxon test, since those genes would always be considered "not in pathway" when comparing p-values to genes that are "in pathway"? In an extreme case, if those NA genes were all strongly biased toward being significantly DE, that could dilute the impact of DE genes that are actually in pathways, potentially preventing those pathways from being significantly enriched. This makes me think those NA genes should be excluded...
Alternatively, we could consider those NA genes an essential part of the "baseline" transcriptome, of which the KEGG pathways and their genes are also a component, and therefore those NA genes are still needed to test for pathway enrichment. In this case, all genes, including those not in the KEGG database, should be included in the Wilcoxon test...
Wondering if there is a standard or suggested "protocol" for handling this issue (genes that have no KEGG identifier)? Any literature/links/insight would be greatly appreciated!
Thank you!
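To make the two options concrete, here is a minimal sketch on simulated data that runs the same one-sided Wilcoxon rank-sum test twice: once with every gene as background, and once with only the genes that have a KEGG identifier. Everything in it is an assumption for illustration: the column names, the helper `enrichment_p`, the gene counts, and the uniformly drawn p-values. The numbers it produces carry no biological meaning; the point is only that the choice of background is a single subsetting step.

```r
set.seed(1)

# Toy data: 1000 genes, of which 170 have no KEGG identifier (the "NA genes").
n_genes <- 1000
genes <- data.frame(
  id          = paste0("gene", seq_len(n_genes)),
  pvalue      = runif(n_genes),
  has_kegg_id = rep(c(TRUE, FALSE), times = c(830, 170)),
  stringsAsFactors = FALSE
)

# Only annotated genes can be pathway members; here the first 50 are.
genes$in_pathway <- genes$has_kegg_id & seq_len(n_genes) <= 50

# One-sided Wilcoxon rank-sum test: are p-values of pathway genes
# shifted towards smaller values than those of the background genes?
enrichment_p <- function(df) {
  wilcox.test(
    df$pvalue[df$in_pathway],
    df$pvalue[!df$in_pathway],
    alternative = "less"
  )$p.value
}

# Option 1: background = all genes, including those without a KEGG ID.
p_all_genes <- enrichment_p(genes)

# Option 2: background = only genes that have a KEGG ID.
p_annotated_only <- enrichment_p(genes[genes$has_kegg_id, ])

print(c(all_genes = p_all_genes, annotated_only = p_annotated_only))
```

Running both variants on real data and comparing the ranked pathway lists is one way to see how sensitive the conclusions are to the background choice.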
Hello,
I am currently getting this error when running any of the functions from your package. The problem is recent, and I guess that it is caused by the curl package after an update from the KEGG API side on 01.06.2022.
A simple example that reproduces the error: keggGet("hsa:8312")
My system version info is below:
sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)
Matrix products: default
locale:
[1] LC_COLLATE=English_Germany.utf8 LC_CTYPE=English_Germany.utf8 LC_MONETARY=English_Germany.utf8
[4] LC_NUMERIC=C LC_TIME=English_Germany.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] UniProt.ws_2.36.0 httr_1.4.3 curl_4.3.2 BiocGenerics_0.42.0 RSQLite_2.2.14
loaded via a namespace (and not attached):
[1] KEGGREST_1.36.0 tidyselect_1.1.2 purrr_0.3.4 lattice_0.20-45 vctrs_0.4.1
[6] generics_0.1.2 stats4_4.2.0 BiocFileCache_2.4.0 utf8_1.2.2 blob_1.2.3
[11] rlang_1.0.2 pillar_1.7.0 glue_1.6.2 DBI_1.1.2 rappdirs_0.3.3
[16] bit64_4.0.5 dbplyr_2.1.1 GenomeInfoDbData_1.2.8 lifecycle_1.0.1 zlibbioc_1.42.0
[21] Biostrings_2.64.0 memoise_2.0.1 Biobase_2.56.0 IRanges_2.30.0 fastmap_1.1.0
[26] GenomeInfoDb_1.32.2 AnnotationDbi_1.58.0 fansi_1.0.3 Rcpp_1.0.8.3 filelock_1.0.2
[31] cachem_1.0.6 S4Vectors_0.34.0 graph_1.74.0 XVector_0.36.0 bit_4.0.4
[36] png_0.1-7 dplyr_1.0.9 grid_4.2.0 cli_3.3.0 tools_4.2.0
[41] bitops_1.0-7 magrittr_2.0.3 RCurl_1.98-1.6 tibble_3.1.7 crayon_1.5.1
[46] pkgconfig_2.0.3 ellipsis_0.3.2 Matrix_1.4-1 assertthat_0.2.1 rstudioapi_0.13
[51] R6_2.5.1 compiler_4.2.0
Regards,
M. Al Maaz
Hello, thank you for developing the package.
Currently, I have a connection problem using this code:
library(KEGGREST)
library(org.Hs.eg.db)
library(tidyverse)
# get pathways and their entrez gene ids
hsa_path_entrez <- keggLink("pathway", "hsa") %>%
tibble(pathway = ., eg = sub("hsa:", "", names(.)))
Error in curl::curl_fetch_memory(url, handle = handle) :
OpenSSL SSL_read: error:0A000126:SSL routines::unexpected eof while reading, errno 0
Ubuntu 22.04, R 4.2.1
Could you please advise?
I have a list of differentially expressed (DE) genes and I want to run a KEGG pathway analysis. Could anyone please suggest or share a pipeline for doing that?
Many thanks
I'm using this package to query KEGG, and the following call fails:
> ppp_info <- lapply(lipid_pathway, keggGet)
Error in .getUrl(url, .flatFileParser) : Not Found (HTTP 404).
I tried to fix it with:
detach(package:KEGGREST, unload = TRUE)
library(devtools)
devtools::install_github("https://github.com/kozo2/KEGGREST/tree/patch-1")
but it still does not work.
Running keggList gives an error with curl or openssl.
> pathways.list <- keggList("pathway", "mcc")
Error in curl::curl_fetch_memory(url, handle = handle) :
OpenSSL SSL_read: error:0A000126:SSL routines::unexpected eof while reading, errno 0
Is there a way to address this issue without having to update the packages? 1.30.1 is the only version compatible with the other packages in my environment yaml.
Here is the session information:
sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS
Matrix products: default
BLAS/LAPACK: /srv/conda/envs/notebook/lib/libopenblasp-r0.3.21.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] KEGGREST_1.30.1 patchwork_1.1.1 EnhancedVolcano_1.8.0 ggrepel_0.9.1 ggvenn_0.1.9
[6] DESeq2_1.30.1 SummarizedExperiment_1.20.0 Biobase_2.50.0 MatrixGenerics_1.2.1 matrixStats_0.62.0
[11] GenomicRanges_1.42.0 GenomeInfoDb_1.26.4 IRanges_2.24.1 S4Vectors_0.28.1 BiocGenerics_0.36.0
[16] tximeta_1.8.4 kableExtra_1.3.4 jsonlite_1.8.0 png_0.1-7 pathview_1.30.1
[21] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9 purrr_0.3.4 readr_2.1.2
[26] tidyr_1.2.0 tibble_3.1.8 ggplot2_3.3.6 tidyverse_1.3.2 BiocManager_1.30.18
[31] biomaRt_2.46.3
loaded via a namespace (and not attached):
[1] readxl_1.4.0 backports_1.4.1 AnnotationHub_2.22.0 BiocFileCache_1.14.0 systemfonts_1.0.4
[6] lazyeval_0.2.2 splines_4.0.5 BiocParallel_1.24.1 digest_0.6.29 ensembldb_2.14.0
[11] htmltools_0.5.3 fansi_1.0.3 magrittr_2.0.3 memoise_2.0.1 googlesheets4_1.0.1
[16] openxlsx_4.2.5 tzdb_0.3.0 Biostrings_2.58.0 annotate_1.68.0 modelr_0.1.8
[21] extrafont_0.18 extrafontdb_1.0 svglite_2.1.0 askpass_1.1 prettyunits_1.1.1
[26] colorspace_2.0-3 blob_1.2.3 rvest_1.0.2 rappdirs_0.3.3 haven_2.5.0
[31] xfun_0.32 tximport_1.18.0 crayon_1.5.1 RCurl_1.98-1.8 graph_1.68.0
[36] genefilter_1.72.1 survival_3.4-0 glue_1.6.2 gtable_0.3.0 gargle_1.2.0
[41] zlibbioc_1.36.0 XVector_0.30.0 webshot_0.5.3 DelayedArray_0.16.3 proj4_1.0-11
[46] Rgraphviz_2.34.0 Rttf2pt1_1.3.10 maps_3.4.0 scales_1.2.0 DBI_1.1.3
[51] Rcpp_1.0.9 viridisLite_0.4.0 xtable_1.8-4 progress_1.2.2 bit_4.0.4
[56] httr_1.4.3 RColorBrewer_1.1-3 ellipsis_0.3.2 pkgconfig_2.0.3 XML_3.99-0.10
[61] dbplyr_2.2.1 locfit_1.5-9.4 utf8_1.2.2 later_1.2.0 tidyselect_1.1.2
[66] rlang_1.0.4 AnnotationDbi_1.52.0 munsell_0.5.0 BiocVersion_3.12.0 cellranger_1.1.0
[71] tools_4.0.5 cachem_1.0.6 cli_3.3.0 generics_0.1.3 RSQLite_2.2.8
[76] broom_1.0.0 evaluate_0.16 fastmap_1.1.0 yaml_2.3.5 org.Hs.eg.db_3.12.0
[81] knitr_1.39 bit64_4.0.5 fs_1.5.2 zip_2.2.0 AnnotationFilter_1.14.0
[86] mime_0.12 ash_1.0-15 ggrastr_1.0.1 KEGGgraph_1.50.0 xml2_1.3.3
[91] compiler_4.0.5 rstudioapi_0.13 beeswarm_0.4.0 curl_4.3.2 interactiveDisplayBase_1.28.0
[96] reprex_2.0.2 geneplotter_1.68.0 stringi_1.7.8 GenomicFeatures_1.42.2 ggalt_0.4.0
[101] lattice_0.20-45 ProtGenerics_1.22.0 Matrix_1.4-1 vctrs_0.4.1 pillar_1.8.0
[106] lifecycle_1.0.1 bitops_1.0-7 rtracklayer_1.50.0 httpuv_1.6.5 R6_2.5.1
[111] promises_1.2.0.1 KernSmooth_2.23-20 vipor_0.4.5 MASS_7.3-58.1 assertthat_0.2.1
[116] openssl_2.0.2 withr_2.5.0 GenomicAlignments_1.26.0 Rsamtools_2.6.0 GenomeInfoDbData_1.2.4
[121] hms_1.1.1 rmarkdown_2.15 googledrive_2.0.0 shiny_1.7.2 lubridate_1.8.0
Hello,
Can't seem to get color.pathway.by.objects() to work and it is required by pathfindR.
color.pathway.by.objects("path:hsa00010", c("hsa:2821", "hsa:226"), c("#ff0000", "#00ff00"), c("#ffff00", "yellow"))
Error in color.pathway.by.objects("path:hsa00010", c("hsa:2821", "hsa:226"), :
'color.pathway.by.objects()' failed to extract KEGG image path from response.
And yet this works
png <- keggGet("hsa05130", "image")
writePNG(png, "hsa05130_tmp.png")
I am not sure why this is happening. In pathfindR I am having the following error:
> visualize_hsa_KEGG("hsa00010", input_processed = input2)
Downloading pathway diagrams of 1 KEGG pathways
| | 0%Cannot retrieve PNG url: hsa00010
Here's the original error message:
'color.pathway.by.objects()' failed to extract KEGG image path from response.
Cannot download PNG file: NA
Here's the original error message:
'url' must be a length-one character vector
|======================================================================| 100%
Saving colored pathway diagrams of 0 KEGG pathways
Warning message:
In utils::download.file(url = KEGGgraph::getKGMLurl(pathwayid = sub("hsa", :
downloaded length 47563 != reported length 0
hey,
Thanks for the great package! As of the latest version (1.37.2), color.pathway.by.objects() started returning the plain pathway image URL instead of the colored one. mark.pathway.by.objects() does work.
I'd greatly appreciate your help on resolving the issue.
Best,
-E
Hello,
At the moment, the regression tests at https://bioconductor.org/checkResults/3.17/bioc-LATEST/KEGGREST/nebbiolo1-checksrc.html fail with:
KEGGREST RUnit Tests - 11 test functions, 1 error, 0 failures
ERROR in test_mark_and_color_pathways_by_objects: Error in strsplit(urlLine, "\"", fixed = TRUE)[[1]] :
subscript out of bounds
Test files with failing tests
test_KEGGREST.R
test_mark_and_color_pathways_by_objects
Error in BiocGenerics:::testPackage("KEGGREST") :
unit tests failed for package KEGGREST
Execution halted
I could reduce this to:
> KEGGREST:::.get.tmp.url("https://www.kegg.jp/pathway/eco00260+eco:b0002+eco:c00263")
No encoding supplied: defaulting to UTF-8.
Error in strsplit(urlLine, "\"", fixed = TRUE)[[1]] :
subscript out of bounds
The function fails when running strsplit(…)[[1]] on urlLine, which is the output of grep("<img src=\"/tmp", lines…, which is character(0).
Probably the structure of the KEGG webpages changed? But as I am not familiar enough with the package, I can not propose a patch.
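The failure mode can be reproduced without network access: indexing `[[1]]` into the result of `strsplit()` on a zero-length vector raises exactly this error. The sketch below shows that, plus one defensive way to handle a missing match. `extract_tmp_img_src` is a hypothetical helper written for illustration, not KEGGREST's actual code, and the HTML lines are invented stand-ins for the KEGG page.

```r
# strsplit() on character(0) returns an empty list, so [[1]] is out of bounds.
no_match <- character(0)
err <- tryCatch(
  strsplit(no_match, "\"", fixed = TRUE)[[1]],
  error = function(e) e
)
print(conditionMessage(err))

# Hypothetical guard: return NA instead of failing when no line matches.
extract_tmp_img_src <- function(lines, pattern = "<img src=\"/tmp") {
  url_line <- grep(pattern, lines, fixed = TRUE, value = TRUE)
  if (length(url_line) == 0L) {
    return(NA_character_)
  }
  # The text between the first pair of double quotes is the src attribute.
  strsplit(url_line[1], "\"", fixed = TRUE)[[1]][2]
}

# Invented page fragments: one with the expected <img> line, one without.
page_with_match <- c(
  "<html>",
  "<img src=\"/tmp/mark_pathway123/eco00260.png\" name=\"pathwayimage\">"
)
page_without_match <- c(
  "<html>",
  "<img src=\"/kegg/pathway/eco/eco00260.png\" name=\"pathwayimage\">"
)

src_found   <- extract_tmp_img_src(page_with_match)
src_missing <- extract_tmp_img_src(page_without_match)
```

A guard like this only turns the crash into a detectable `NA`; finding the image URL in the changed page layout would still need a new pattern.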
I hope it helps,
Charles
I am using this package, which is extremely useful! I think I found a small issue with the keggConv function - it seems to have a limit of 100 on the output (at least when converting to NCBI IDs). Here is a reproducible example:
library(KEGGREST) ##use this to download KEGG pathways and convert KEGG gene IDs to NCBI gene IDs
library(KEGGgraph) ##use this to convert KGML to data
##Get all genes/proteins in breast cancer pathway (hsa05224) in KEGG-specific KGML format:
path <- "hsa05224"
path_kgml <- keggGet(path, "kgml")
##Convert KGML to data frame:
path_df <- parseKGML2DataFrame(path_kgml)
path_df <- sapply(path_df, as.character)
##Get all the genes/proteins in the pathway using the KEGG IDs:
kegg_ids <- unique(c(path_df[,"from"], path_df[,"to"]))
head(kegg_ids)
length(kegg_ids)
##Now try to convert them to NCBI IDs:
NCBI_gids <- keggConv("ncbi-geneid", kegg_ids)
length(NCBI_gids)
##Note this has only 100 entries!
##However, we can convert the first 100 and the last 39 separately:
length(keggConv("ncbi-geneid", kegg_ids[1:100]))
length(keggConv("ncbi-geneid", kegg_ids[101:139]))
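Until the cause of the truncation is confirmed, the manual split in the last two lines can be generalized: send the IDs in batches of at most 100 and concatenate the results. The sketch below is a minimal version of that idea. `chunk_ids` and `convert_in_chunks` are hypothetical helper names, not part of KEGGREST, and the batch size of 100 comes from the behaviour observed above rather than from a documented limit. The converter is passed in as a function so the batching logic can be tried offline with a stand-in.

```r
# Split a vector into consecutive batches of at most `size` elements.
chunk_ids <- function(ids, size = 100L) {
  if (length(ids) == 0L) {
    return(list())
  }
  split(ids, ceiling(seq_along(ids) / size))
}

# Apply `convert` to each batch and concatenate the results in order.
# With KEGGREST, `convert` could be:
#   function(x) KEGGREST::keggConv("ncbi-geneid", x)
convert_in_chunks <- function(ids, convert, size = 100L) {
  pieces <- lapply(chunk_ids(ids, size), convert)
  # Drop the batch labels so only the names returned by `convert` remain.
  unlist(unname(pieces))
}

# Offline stand-in for keggConv(): returns a named character vector,
# one made-up target ID per input ID.
fake_convert <- function(x) {
  setNames(paste0("ncbi-geneid:", sub("hsa:", "", x, fixed = TRUE)), x)
}

example_ids <- paste0("hsa:", 1:139)
converted   <- convert_in_chunks(example_ids, fake_convert)
print(length(converted))
```

With the real converter, the call would be `convert_in_chunks(kegg_ids, function(x) keggConv("ncbi-geneid", x))`, which needs network access to KEGG.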
Hello,
Thanks for your amazing package!
Lately, when I try to use keggLink() I get the following error:
Error in curl::curl_fetch_memory(url, handle = handle) :
Failure when receiving data from the peer
I'd appreciate any help on the issue,
Best,
-V
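If the failure turns out to be intermittent, wrapping the call in a small retry loop is one way to work around it in scripts. This is a minimal sketch under that assumption; `with_retry` is a hypothetical helper, not a KEGGREST function, and it would not help if the cause is a persistent server-side or TLS configuration problem. The flaky function below stands in for a network call so the logic can be run offline.

```r
# Call f() up to `attempts` times, pausing between failed attempts.
# Returns the first successful result, or re-signals the last error.
with_retry <- function(f, attempts = 3L, wait_seconds = 1) {
  stopifnot(attempts >= 1)
  last_error <- NULL
  for (i in seq_len(attempts)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    if (i < attempts) {
      Sys.sleep(wait_seconds)
    }
  }
  stop(last_error)
}

# Stand-in for a network call: fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) {
    stop("Failure when receiving data from the peer")
  }
  "ok"
}

retry_result <- with_retry(flaky, attempts = 5L, wait_seconds = 0)
```

With KEGGREST, the wrapped call could look like `with_retry(function() keggLink("pathway", "hsa"))`.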
Code:
library("KEGGREST")
KEGGREST::keggList("glycan")
Result:
Error in FUN(X[[i]], ...) : subscript out of bounds
Expected result :
A list of KEGG glycans.
System info:
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS/LAPACK: /home/cbleker/miniconda2/envs/r-env/lib/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] KEGGREST_1.24.1
loaded via a namespace (and not attached):
[1] zlibbioc_1.30.0 httr_1.4.1 compiler_3.6.1
[4] R6_2.4.0 IRanges_2.18.2 XVector_0.24.0
[7] parallel_3.6.1 curl_4.0 Biostrings_2.52.0
[10] S4Vectors_0.22.0 BiocGenerics_0.30.0 stats4_3.6.1
[13] png_0.1-7
Hi,
When I tried to rerun my code, which worked fine a few months ago, I ran into a problem. I want to fetch a simple KEGG pathway list.
pathways.list = keggList('pathway','ath')
However, I always get the error:
Error in curl::curl_fetch_memory(url, handle = handle) :
Failure when receiving data from the peer
Thank you very much for your help.
Moritz
Hi Everybody,
I'm trying to get the KO numbers for my genes, which are annotated with gene symbols. Running keggFind() returns NULL for some genes (no KO corresponds to the query), but when I look for that gene manually in the KEGG database I can find its KO. For example:
> names(keggFind("ko", "Acyp1"))
NULL
But, if you search Acyp1 in KEGG, you can find it:
Symbol | ACYP1, ACYPE
(RefSeq) acylphosphatase 1
K01512 acylphosphatase [EC:3.6.1.7]
However, running the following command retrieves the KO number:
> names(keggFind("ko", "Acyp"))
[1] "ko:K01512"
Any clarification is welcome.
Thank you,