keggrest's Issues

SSL certificate problem

Hi,

I was trying to pull the gene information for a given KEGG pathway ID using keggGet(). However, I got the following error:

Error in curl::curl_fetch_memory(url, handle = handle) :
SSL peer certificate or SSH remote key was not OK: [rest.kegg.jp] SSL certificate problem: certificate has expired

Could you please help fix this?

Thanks
Li

segfault on Ubuntu when KEGGREST and ChemmineOB are imported by a package

I encountered this problem on GitHub actions with the ubuntu-latest runner. In an R package that has KEGGREST and ChemmineOB in Imports (and has the system deps for ChemmineOB installed), there is a segfault during R CMD check. Here's a reprex.

* checking dependencies in R code ... NOTE
 *** caught segfault ***
address 0x7ff29d5e2b00, cause 'invalid permissions'
Traceback:
 1: rgb(1, 1, 1)
 2: make_style(rgb(1, 1, 1))
 3: make_DNA_AND_RNA_COLORED_LETTERS()
 4: fun(libname, pkgname)
 5: doTryCatch(return(expr), name, parentenv, handler)
 6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 8: tryCatch(fun(libname, pkgname), error = identity)
 9: runHook(".onLoad", env, package.lib, package)
10: loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]])
11: asNamespace(ns)
12: namespaceImportFrom(ns, loadNamespace(j <- i[[1L]], c(lib.loc,     .libPaths()), versionCheck = vI[[j]]), i[[2L]], from = package)
13: loadNamespace(p)
14: withCallingHandlers(expr, message = function(c) if (inherits(c,     classes)) tryInvokeRestart("muffleMessage"))
15: suppressMessages(loadNamespace(p))
16: withCallingHandlers(expr, warning = function(w) if (inherits(w,     classes)) tryInvokeRestart("muffleWarning"))
17: suppressWarnings(suppressMessages(loadNamespace(p)))
18: doTryCatch(return(expr), name, parentenv, handler)
19: tryCatchOne(expr, names, parentenv, handlers[[1L]])
20: tryCatchList(expr, classes, parentenv, handlers)
21: tryCatch(suppressWarnings(suppressMessages(loadNamespace(p))),     error = function(e) e)
22: tools:::.check_packages_used(package = "testpkg")
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)

This only happens when both KEGGREST and ChemmineOB are dependencies. It seems to be something related to the rgb() symbol in ChemmineOB.so conflicting with the rgb() from grDevices. I came here first because the traceback seems to point to Biostrings, which is a dependency of KEGGREST and not of ChemmineOB, but let me know if you think this belongs in the ChemmineOB repo. I also tried this with Biostrings and ChemmineOB as Imports and didn't get an error.

Handling genes with no KEGG identifier

Hello there,

The quick question is: when performing KEGG enrichment analysis, should genes that have no KEGG identifier/annotation be excluded from the Wilcoxon test comparing p-values of genes in a pathway vs genes not in the pathway?

The long explanation of the question is: about 2500/14700 of our genes do not get converted to KEGG identifiers using keggConv. Presumably this is because they are not in the KEGG database.

Should we exclude those 2500 NA genes from the Wilcoxon test, since those genes would always be considered "not in pathway" when comparing p-values to genes that are "in pathway"? In an extreme case, if those NA genes were all highly biased as significantly DE, that could dilute the impact of DE genes that are actually in pathways, potentially preventing those pathways from being significantly enriched. This makes me think those NA genes should be excluded...

Alternatively, we could consider those NA genes an essential part of the "baseline" transcriptome, of which the KEGG pathways and corresponding genes are also a component, and therefore those NA genes are still needed to test for pathway enrichment. In this case, all genes, including those not in the KEGG database, should be included in the Wilcoxon test...

Wondering if there is a standard or suggested "protocol" for handling this issue (genes that have no KEGG identifier)? Any literature/links/insight would be greatly appreciated!

Thank you!
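
To make the two options concrete, here is a minimal sketch that runs the same one-sided Wilcoxon test against both backgrounds. Everything in it is an assumption for illustration: the data are simulated, and the column names (pvalue, has_kegg_id, in_pathway) and the helper pathway_wilcox() are hypothetical. It does not say which background is the right one; it only shows how to compare the two results on your own data.

```r
# Simulated data only: one row per gene, with a DE p-value, a flag for
# whether the gene has a KEGG identifier, and pathway membership.
set.seed(1)
n_genes <- 1000
genes <- data.frame(
  pvalue      = runif(n_genes),
  has_kegg_id = rep(c(TRUE, FALSE), times = c(830, 170)),
  in_pathway  = FALSE
)
genes$in_pathway[1:50] <- TRUE  # pathway members are always annotated genes

# Hypothetical helper: tests whether p-values of pathway genes tend to be
# smaller than those of the remaining genes in the supplied background.
pathway_wilcox <- function(d) {
  wilcox.test(d$pvalue[d$in_pathway],
              d$pvalue[!d$in_pathway],
              alternative = "less")$p.value
}

# Option 1: background is every gene, including those without a KEGG ID.
p_all <- pathway_wilcox(genes)

# Option 2: background is restricted to genes with a KEGG ID.
p_annotated <- pathway_wilcox(genes[genes$has_kegg_id, ])

print(c(all_genes = p_all, annotated_only = p_annotated))
```

If the two p-values differ a lot for a real pathway, that suggests the unannotated genes are not behaving like the annotated background, which is the situation described above.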

Error in curl::curl_fetch_memory(url, handle = handle) : Failure when receiving data from the peer

Hello,

I am currently getting this error when running any of the functions from your package. The problem is recent, and I guess that it is caused by the curl package after an update from the KEGG API side on 01.06.2022.

A simple example that shows this error is: keggGet("hsa:8312")
My system version info is below:

sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_Germany.utf8 LC_CTYPE=English_Germany.utf8 LC_MONETARY=English_Germany.utf8
[4] LC_NUMERIC=C LC_TIME=English_Germany.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] UniProt.ws_2.36.0 httr_1.4.3 curl_4.3.2 BiocGenerics_0.42.0 RSQLite_2.2.14

loaded via a namespace (and not attached):
[1] KEGGREST_1.36.0 tidyselect_1.1.2 purrr_0.3.4 lattice_0.20-45 vctrs_0.4.1
[6] generics_0.1.2 stats4_4.2.0 BiocFileCache_2.4.0 utf8_1.2.2 blob_1.2.3
[11] rlang_1.0.2 pillar_1.7.0 glue_1.6.2 DBI_1.1.2 rappdirs_0.3.3
[16] bit64_4.0.5 dbplyr_2.1.1 GenomeInfoDbData_1.2.8 lifecycle_1.0.1 zlibbioc_1.42.0
[21] Biostrings_2.64.0 memoise_2.0.1 Biobase_2.56.0 IRanges_2.30.0 fastmap_1.1.0
[26] GenomeInfoDb_1.32.2 AnnotationDbi_1.58.0 fansi_1.0.3 Rcpp_1.0.8.3 filelock_1.0.2
[31] cachem_1.0.6 S4Vectors_0.34.0 graph_1.74.0 XVector_0.36.0 bit_4.0.4
[36] png_0.1-7 dplyr_1.0.9 grid_4.2.0 cli_3.3.0 tools_4.2.0
[41] bitops_1.0-7 magrittr_2.0.3 RCurl_1.98-1.6 tibble_3.1.7 crayon_1.5.1
[46] pkgconfig_2.0.3 ellipsis_0.3.2 Matrix_1.4-1 assertthat_0.2.1 rstudioapi_0.13
[51] R6_2.5.1 compiler_4.2.0

Regards,
M. Al Maaz

OpenSSL error when using keggLink

Hello, thank you for developing the package.
Currently, I have a connection problem using this code:

library(KEGGREST)
library(org.Hs.eg.db)
library(tidyverse)

# get pathways and their entrez gene ids
hsa_path_entrez  <- keggLink("pathway", "hsa") %>%
    tibble(pathway = ., eg = sub("hsa:", "", names(.)))
Error in curl::curl_fetch_memory(url, handle = handle) : 
  OpenSSL SSL_read: error:0A000126:SSL routines::unexpected eof while reading, errno 0

Ubuntu 22.04, R 4.2.1

Could you please advise?

Hello

I have a list of DE genes and I want to do KEGG pathway analysis. Could anyone please suggest or provide a pipeline for doing that?

Many thanks

Error in .getUrl(url, .flatFileParser) : Not Found (HTTP 404).

I'm using this package to query KEGG, and I get this error:

> ppp_info <- lapply(lipid_pathway, keggGet)
Error in .getUrl(url, .flatFileParser) : Not Found (HTTP 404).

I tried to fix it with

detach(package:KEGGREST, unload = T)
library(devtools)
devtools::install_github("https://github.com/kozo2/KEGGREST/tree/patch-1")

but it still does not work.

Error with keggList

Running keggList gives an error with curl or openssl.

> pathways.list <- keggList("pathway", "mcc")
Error in curl::curl_fetch_memory(url, handle = handle) : 
  OpenSSL SSL_read: error:0A000126:SSL routines::unexpected eof while reading, errno 0

Is there a way to address this issue without having to update the packages? 1.30.1 is the only version compatible with the other packages in my environment yaml.

Here is the session information:

sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS/LAPACK: /srv/conda/envs/notebook/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] KEGGREST_1.30.1             patchwork_1.1.1             EnhancedVolcano_1.8.0       ggrepel_0.9.1               ggvenn_0.1.9               
 [6] DESeq2_1.30.1               SummarizedExperiment_1.20.0 Biobase_2.50.0              MatrixGenerics_1.2.1        matrixStats_0.62.0         
[11] GenomicRanges_1.42.0        GenomeInfoDb_1.26.4         IRanges_2.24.1              S4Vectors_0.28.1            BiocGenerics_0.36.0        
[16] tximeta_1.8.4               kableExtra_1.3.4            jsonlite_1.8.0              png_0.1-7                   pathview_1.30.1            
[21] forcats_0.5.1               stringr_1.4.0               dplyr_1.0.9                 purrr_0.3.4                 readr_2.1.2                
[26] tidyr_1.2.0                 tibble_3.1.8                ggplot2_3.3.6               tidyverse_1.3.2             BiocManager_1.30.18        
[31] biomaRt_2.46.3             

loaded via a namespace (and not attached):
  [1] readxl_1.4.0                  backports_1.4.1               AnnotationHub_2.22.0          BiocFileCache_1.14.0          systemfonts_1.0.4            
  [6] lazyeval_0.2.2                splines_4.0.5                 BiocParallel_1.24.1           digest_0.6.29                 ensembldb_2.14.0             
 [11] htmltools_0.5.3               fansi_1.0.3                   magrittr_2.0.3                memoise_2.0.1                 googlesheets4_1.0.1          
 [16] openxlsx_4.2.5                tzdb_0.3.0                    Biostrings_2.58.0             annotate_1.68.0               modelr_0.1.8                 
 [21] extrafont_0.18                extrafontdb_1.0               svglite_2.1.0                 askpass_1.1                   prettyunits_1.1.1            
 [26] colorspace_2.0-3              blob_1.2.3                    rvest_1.0.2                   rappdirs_0.3.3                haven_2.5.0                  
 [31] xfun_0.32                     tximport_1.18.0               crayon_1.5.1                  RCurl_1.98-1.8                graph_1.68.0                 
 [36] genefilter_1.72.1             survival_3.4-0                glue_1.6.2                    gtable_0.3.0                  gargle_1.2.0                 
 [41] zlibbioc_1.36.0               XVector_0.30.0                webshot_0.5.3                 DelayedArray_0.16.3           proj4_1.0-11                 
 [46] Rgraphviz_2.34.0              Rttf2pt1_1.3.10               maps_3.4.0                    scales_1.2.0                  DBI_1.1.3                    
 [51] Rcpp_1.0.9                    viridisLite_0.4.0             xtable_1.8-4                  progress_1.2.2                bit_4.0.4                    
 [56] httr_1.4.3                    RColorBrewer_1.1-3            ellipsis_0.3.2                pkgconfig_2.0.3               XML_3.99-0.10                
 [61] dbplyr_2.2.1                  locfit_1.5-9.4                utf8_1.2.2                    later_1.2.0                   tidyselect_1.1.2             
 [66] rlang_1.0.4                   AnnotationDbi_1.52.0          munsell_0.5.0                 BiocVersion_3.12.0            cellranger_1.1.0             
 [71] tools_4.0.5                   cachem_1.0.6                  cli_3.3.0                     generics_0.1.3                RSQLite_2.2.8                
 [76] broom_1.0.0                   evaluate_0.16                 fastmap_1.1.0                 yaml_2.3.5                    org.Hs.eg.db_3.12.0          
 [81] knitr_1.39                    bit64_4.0.5                   fs_1.5.2                      zip_2.2.0                     AnnotationFilter_1.14.0      
 [86] mime_0.12                     ash_1.0-15                    ggrastr_1.0.1                 KEGGgraph_1.50.0              xml2_1.3.3                   
 [91] compiler_4.0.5                rstudioapi_0.13               beeswarm_0.4.0                curl_4.3.2                    interactiveDisplayBase_1.28.0
 [96] reprex_2.0.2                  geneplotter_1.68.0            stringi_1.7.8                 GenomicFeatures_1.42.2        ggalt_0.4.0                  
[101] lattice_0.20-45               ProtGenerics_1.22.0           Matrix_1.4-1                  vctrs_0.4.1                   pillar_1.8.0                 
[106] lifecycle_1.0.1               bitops_1.0-7                  rtracklayer_1.50.0            httpuv_1.6.5                  R6_2.5.1                     
[111] promises_1.2.0.1              KernSmooth_2.23-20            vipor_0.4.5                   MASS_7.3-58.1                 assertthat_0.2.1             
[116] openssl_2.0.2                 withr_2.5.0                   GenomicAlignments_1.26.0      Rsamtools_2.6.0               GenomeInfoDbData_1.2.4       
[121] hms_1.1.1                     rmarkdown_2.15                googledrive_2.0.0             shiny_1.7.2                   lubridate_1.8.0     

'color.pathway.by.objects()' failed to extract KEGG image path from response

Hello,

Can't seem to get color.pathway.by.objects() to work and it is required by pathfindR.

color.pathway.by.objects("path:hsa00010", c("hsa:2821", "hsa:226"), c("#ff0000", "#00ff00"), c("#ffff00", "yellow"))
Error in color.pathway.by.objects("path:hsa00010", c("hsa:2821", "hsa:226"),  : 
  'color.pathway.by.objects()' failed to extract KEGG image path from response.

And yet this works

png <- keggGet("hsa05130", "image") 
writePNG(png, "hsa05130_tmp.png")

I am not sure why this is happening. In pathfindR I am having the following error:

> visualize_hsa_KEGG("hsa00010", input_processed = input2)
Downloading pathway diagrams of 1 KEGG pathways

  |                                                                      |   0%Cannot retrieve PNG url: hsa00010
Here's the original error message:
'color.pathway.by.objects()' failed to extract KEGG image path from response.
Cannot download PNG file: NA
Here's the original error message:
'url' must be a length-one character vector
  |======================================================================| 100%
Saving colored pathway diagrams of 0 KEGG pathways

Warning message:
In utils::download.file(url = KEGGgraph::getKGMLurl(pathwayid = sub("hsa",  :
  downloaded length 47563 != reported length 0

`color.pathway.by.objects()` returns pathway's plain url

hey,

Thanks for the great package! As of the latest version (1.37.2), color.pathway.by.objects() started returning the plain pathway image URL instead of the colored one.

mark.pathway.by.objects() does work.

I'd greatly appreciate your help on resolving the issue.

Best,
-E

.get.tmp.url is broken, causing regression tests to fail

Hello,

At the moment, the regression tests at https://bioconductor.org/checkResults/3.17/bioc-LATEST/KEGGREST/nebbiolo1-checksrc.html fail with:

KEGGREST RUnit Tests - 11 test functions, 1 error, 0 failures
ERROR in test_mark_and_color_pathways_by_objects: Error in strsplit(urlLine, "\"", fixed = TRUE)[[1]] : 
  subscript out of bounds

Test files with failing tests

   test_KEGGREST.R 
     test_mark_and_color_pathways_by_objects 


Error in BiocGenerics:::testPackage("KEGGREST") : 
  unit tests failed for package KEGGREST
Execution halted

I could reduce this to:

> KEGGREST:::.get.tmp.url("https://www.kegg.jp/pathway/eco00260+eco:b0002+eco:c00263")
No encoding supplied: defaulting to UTF-8.
Error in strsplit(urlLine, "\"", fixed = TRUE)[[1]] : 
  subscript out of bounds

The function fails when running strsplit(…)[[1]] on urlLine, which is the output of grep("<img src=\"/tmp", lines…, which is character(0)

Probably the structure of the KEGG webpages changed? But as I am not familiar enough with the package, I cannot propose a patch.

I hope it helps,

Charles
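
For reference, the failure mode can be reproduced without network access, and a guard around the grep() result turns it into a readable error. This is only a sketch: extract_tmp_img_path() is a hypothetical helper, not the KEGGREST implementation, and the sample HTML line is made up to match the pattern quoted above.

```r
# Hypothetical helper: pulls the first "/tmp" image path out of page lines
# and fails with a clear message when no matching line exists.
extract_tmp_img_path <- function(lines) {
  url_line <- grep("<img src=\"/tmp", lines, fixed = TRUE, value = TRUE)
  if (length(url_line) == 0L) {
    stop("no '<img src=\"/tmp' line found; the page layout may have changed")
  }
  # The path is the text between the first pair of double quotes.
  strsplit(url_line[1], "\"", fixed = TRUE)[[1]][2]
}

# Made-up page content for illustration only.
old_style_page <- c("<html>",
                    "<img src=\"/tmp/mark_pathway123/eco00260.png\" name=\"pathwayimage\">")
new_style_page <- c("<html>", "<img src=\"/somewhere/else/eco00260.png\">")

path_ok <- extract_tmp_img_path(old_style_page)

# Without the guard, indexing the empty strsplit() result is what raises
# the unhelpful "subscript out of bounds" error.
raw_msg <- tryCatch(strsplit(character(0), "\"", fixed = TRUE)[[1]],
                    error = function(e) conditionMessage(e))

guarded_msg <- tryCatch(extract_tmp_img_path(new_style_page),
                        error = function(e) conditionMessage(e))
```

A guard like this would not fix the regression by itself, since the pattern still has to match whatever the current KEGG page contains, but it would make the cause visible in the test output.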

Possible output limit for ID conversion?

I am using this package, which is extremely useful! I think I found a small issue with the keggConv function - it seems to have a limit of 100 on the output (at least when converting to NCBI IDs). Here is a reproducible example:

library(KEGGREST) ##use this to download KEGG pathways and convert KEGG gene IDs to NCBI gene IDs
library(KEGGgraph) ##use this to convert KGML to data
##Get all genes/proteins in breast cancer pathway (hsa05224) in KEGG-specific KGML format:
path <- "hsa05224"
path_kgml <- keggGet(path, "kgml")
##Convert KGML to data frame:
path_df <- parseKGML2DataFrame(path_kgml)
path_df <- sapply(path_df, as.character)
##Get all the genes/proteins in the pathway using the KEGG IDs:
kegg_ids <- unique(c(path_df[,"from"], path_df[,"to"]))
head(kegg_ids)
length(kegg_ids)
##Now try to convert them to NCBI IDs:
NCBI_gids <- keggConv("ncbi-geneid", kegg_ids)
length(NCBI_gids)
##Note this has only 100 entries!
##However, we can convert the first 100 and the last 39 separately:
length(keggConv("ncbi-geneid", kegg_ids[1:100]))
length(keggConv("ncbi-geneid", kegg_ids[101:139]))
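
Until this is handled inside the package, one possible workaround is to split the IDs into chunks and combine the results. The sketch below assumes the limit of 100 observed above; convert_in_chunks() is a hypothetical helper, and the conversion function is passed in as an argument so that the chunking logic can be checked offline with a stand-in.

```r
# Hypothetical helper: applies `convert` to chunks of at most `chunk_size`
# IDs and concatenates the named results in the original order.
convert_in_chunks <- function(ids, convert, chunk_size = 100L) {
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  unlist(unname(lapply(chunks, convert)))
}

# Offline check with a stand-in converter that records each chunk size.
chunk_sizes <- integer(0)
fake_convert <- function(x) {
  chunk_sizes <<- c(chunk_sizes, length(x))
  setNames(paste0("ncbi-geneid:", sub("hsa:", "", x, fixed = TRUE)), x)
}

fake_ids <- paste0("hsa:", seq_len(139))
converted <- convert_in_chunks(fake_ids, fake_convert)
```

With the real package (requires network access), the call would look like convert_in_chunks(kegg_ids, function(x) KEGGREST::keggConv("ncbi-geneid", x)).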

keggLink() returns error

Hello,
Thanks for your amazing package!
Lately, when I try to use keggLink() I get the following error:

Error in curl::curl_fetch_memory(url, handle = handle) :
Failure when receiving data from the peer

I'd appreciate any help on the issue,
Best,
-V

Error in keggList("glycan")

Code:

library("KEGGREST")
KEGGREST::keggList("glycan")

Result:

Error in FUN(X[[i]], ...) : subscript out of bounds

Expected result :
A list of KEGG glycans.

System info:

R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS/LAPACK: /home/cbleker/miniconda2/envs/r-env/lib/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] KEGGREST_1.24.1

loaded via a namespace (and not attached):
 [1] zlibbioc_1.30.0     httr_1.4.1          compiler_3.6.1     
 [4] R6_2.4.0            IRanges_2.18.2      XVector_0.24.0     
 [7] parallel_3.6.1      curl_4.0            Biostrings_2.52.0  
[10] S4Vectors_0.22.0    BiocGenerics_0.30.0 stats4_3.6.1       
[13] png_0.1-7

Failure when receiving data from the peer

Hi,

When I tried to rerun my code, which worked fine a few months ago, I now run into a problem. I want to fetch a simple kegglist.

pathways.list = keggList('pathway','ath')

However, I always get the error:
Error in curl::curl_fetch_memory(url, handle = handle) :
Failure when receiving data from the peer

Thank you very much for your help.
Moritz

keggFind() doesn't find all the available KO numbers

Hi Everybody,

I'm trying to get the KO numbers for my genes that are in Symbol annotation. Running keggFind() returns NULL for some genes (no KO corresponds to the query), but when I look for that gene manually in the KEGG database I can find its KO. For example:

> names(keggFind("ko", "Acyp1"))
NULL

But, if you search Acyp1 in KEGG, you can find it:

Symbol | ACYP1, ACYPE
(RefSeq) acylphosphatase 1
K01512  acylphosphatase [EC:3.6.1.7]

However, running the following command retrieves the KO number:

> names(keggFind("ko", "Acyp"))
[1] "ko:K01512"

Any clarification is welcome.

Thank you,
