
encodexplorer's People

Contributors

adeschen, andronekomimi, charlesjb, dtenenba, ericfournier2, hpages, louisgendron26, nturaga, sonali-bioc, vobencha

encodexplorer's Issues

Database is not completed?

Hey,

I was looking for RNA-seq data for one cell line, but the query function returns nothing.

> query_results <- queryEncode(df=encode_df, organism = "Homo sapiens", biosample_name="GM12878",assay="RNA-seq", fixed=FALSE)
> head(query_results)
Empty data.table (0 rows) of 73 cols: accession,file_accession,file_type,file_format,file_size,output_category...

This is inconsistent with the search results on the ENCODE portal:
https://www.encodeproject.org/search/?type=Experiment&status=released&biosample_ontology.classification=cell+line&assay_title=polyA+RNA-seq&biosample_ontology.term_name=GM12878&assay_title=polyA+RNA-seq&biosample_ontology.classification=cell+line

Downloading already existing files will cause an error

Attempts at downloading a file that has already been downloaded will cause ENCODExplorer to crash.

In short, the path used to calculate the existing file's md5 hash is wrong, which makes the hash value NA. When compared against the ENCODE hash, the resulting value is NA, which makes the if statement raise an exception.

some metadata are not up to date?

Hi,

Thanks for this useful tool:

library(ENCODExplorer)
data(encode_df, package = "ENCODExplorer")
query_results_melanocyte <- queryEncode(df=encode_df, organism = "Homo sapiens",
                      biosample_name = c("foreskin melanocyte"), file_format = "fastq", fixed = FALSE,
                      assay = "ChIP-seq")

> query_results_melanocyte
Empty data.table (0 rows) of 73 cols: accession,file_accession,file_type,file_format,file_size,output_category...

> devtools::session_info()
Session info ---------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.4.2 (2017-09-28)
 system   x86_64, darwin15.6.0        
 ui       RStudio (1.0.153)           
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/Chicago             
 date     2018-01-11                  

Packages -------------------------------------------------------------------------------------------
 package       * version  date       source         
 assertthat      0.2.0    2017-04-11 cran (@0.2.0)  
 base          * 3.4.2    2017-10-04 local          
 bindr           0.1      2016-11-13 cran (@0.1)    
 bindrcpp      * 0.2      2017-06-17 cran (@0.2)    
 BiocInstaller * 1.28.0   2017-10-31 Bioconductor   
 bitops          1.0-6    2013-08-17 cran (@1.0-6)  
 compiler        3.4.2    2017-10-04 local          
 data.table      1.10.4-3 2017-10-27 cran (@1.10.4-)
 datasets      * 3.4.2    2017-10-04 local          
 devtools        1.13.3   2017-08-02 CRAN (R 3.4.1) 
 digest          0.6.12   2017-01-27 CRAN (R 3.4.0) 
 dplyr           0.7.4    2017-09-28 cran (@0.7.4)  
 DT            * 0.2      2016-08-09 CRAN (R 3.4.0) 
 ENCODExplorer * 2.4.0    2017-10-31 Bioconductor   
 glue            1.2.0    2017-10-29 cran (@1.2.0)  
 graphics      * 3.4.2    2017-10-04 local          
 grDevices     * 3.4.2    2017-10-04 local          
 htmltools       0.3.6    2017-04-28 CRAN (R 3.4.0) 
 htmlwidgets     0.9      2017-07-10 CRAN (R 3.4.1) 
 httpuv          1.3.5    2017-07-04 CRAN (R 3.4.1) 
 jsonlite        1.5      2017-06-01 CRAN (R 3.4.0) 
 magrittr        1.5      2014-11-22 cran (@1.5)    
 memoise         1.1.0    2017-04-21 CRAN (R 3.4.0) 
 methods       * 3.4.2    2017-10-04 local          
 mime            0.5      2016-07-07 CRAN (R 3.4.0) 
 parallel        3.4.2    2017-10-04 local          
 pkgconfig       2.0.1    2017-03-21 cran (@2.0.1)  
 purrr           0.2.4    2017-10-18 CRAN (R 3.4.2) 
 R6              2.2.2    2017-06-17 CRAN (R 3.4.0) 
 Rcpp            0.12.14  2017-11-23 cran (@0.12.14)
 RCurl           1.95-4.8 2016-03-01 cran (@1.95-4.)
 rlang           0.1.4    2017-11-05 cran (@0.1.4)  
 shiny         * 1.0.5    2017-08-23 CRAN (R 3.4.1) 
 shinythemes   * 1.1.1    2016-10-12 CRAN (R 3.4.0) 
 stats         * 3.4.2    2017-10-04 local          
 stringi         1.1.6    2017-11-17 cran (@1.1.6)  
 stringr         1.2.0    2017-02-18 cran (@1.2.0)  
 tibble          1.3.4    2017-08-22 cran (@1.3.4)  
 tidyr           0.7.2    2017-10-16 cran (@0.7.2)  
 tools           3.4.2    2017-10-04 local          
 utils         * 3.4.2    2017-10-04 local          
 withr           2.0.0    2017-07-28 CRAN (R 3.4.1) 
 xtable          1.8-2    2016-02-05 cran (@1.8-2) 

However, on the ENCODE site I can see that the fastq files are there: https://www.encodeproject.org/search/?type=Experiment&assay_title=ChIP-seq&target.investigated_as=histone+modification&files.file_type=fastq&biosample_type=primary+cell&biosample_term_name=foreskin+melanocyte&biosample_term_name=foreskin+melanocyte

Thank you for looking into this.

Best,
Tommy

encode_df is not used by default when called from a package

The default behaviour of using ENCODExplorer::encode_df when the df parameter is NULL in queryEncode and downloadEncode fails if those functions are called from within a package rather than from the global environment where library(ENCODExplorer) has been called.

HiC question

Is it possible yet to use ENCODE Explorer in R to query HiC or promoter capture HiC in humans in Encode? I was hoping to call for some consensus by cell type or tissue type.

Error in clean_column from searchEncode

Since this week, the searchEncode example fails with an error:

> searchEncode("ChIP-Seq+H3K4me1")
Error in `[[<-.data.frame`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/",  : 
  replacement has 20 rows, data has 10

1: searchEncode("ChIP-Seq+H3K4me1")
2: suppressWarnings(clean_table(r))
3: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
4: clean_table(r)
5: lapply(colnames(table), clean_column, table)
6: FUN(X[[i]], ...)
7: `[[<-`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip
8: `[[<-.data.frame`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/g

The error occurs because the table being cleaned has a doubly nested data.frame: it has a data.frame column which itself contains a data.frame column. (This is the table returned by searchEncode, whose results are not split into multiple tables.) clean_column does not handle this case and fails.

The downloaded table object, as it is passed to clean_column, is attached. The error occurs on the analysis_step_version column of the table, which is a data.frame including the analysis_step_version column, which is itself a 20-column data.frame.
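
One way to handle the nesting described above is to expand data.frame columns recursively before cleaning. The following is only a sketch on a toy table: flatten_columns is a hypothetical helper, not part of ENCODExplorer, and it assumes every non-data.frame column is an atomic vector.

```r
# Hypothetical helper: recursively expand data.frame columns into a flat list
# of ordinary columns, joining nested names with ".".
flatten_columns <- function(df, prefix = "") {
  pieces <- list()
  for (name in names(df)) {
    column <- df[[name]]
    full_name <- paste0(prefix, name)
    if (is.data.frame(column)) {
      # A nested data.frame contributes one entry per inner column.
      pieces <- c(pieces, flatten_columns(column, prefix = paste0(full_name, ".")))
    } else {
      pieces[[full_name]] <- column
    }
  }
  pieces
}

# Toy table with the same shape as the failing case: a data.frame column
# that itself holds a data.frame column.
inner <- data.frame(current_version = c("v1", "v2"), stringsAsFactors = FALSE)
middle <- data.frame(id = 1:2)
middle$analysis_step_version <- inner
outer <- data.frame(accession = c("A", "B"), stringsAsFactors = FALSE)
outer$analysis_step_version <- middle

flat <- as.data.frame(flatten_columns(outer), stringsAsFactors = FALSE)
nested_left <- any(vapply(flat, is.data.frame, logical(1)))
```

After flattening, every column is a plain vector, so a per-column cleaner no longer meets a data.frame where it expects a vector.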

queryEncode with non-NULL df variable

Hi,

I am using the prepare_ENCODEdb and export_ENCODEdb_matrix to get the latest ENCODE data. I can pass the result to queryEncode with the argument df = encode_df2. However, it seems that inside the queryEncode function it is always using encode_df and not the passed df value. I can get around the issue by calling my result from prepare_ENCODEdb and export_ENCODEdb_matrix encode_df instead of encode_df2.

Thanks,
Karen

absolute files

Using the queryEncode function from the ENCODExplorer_1.2.4 R package, I
downloaded a table that includes the following:

href accession
/files/ENCFF002AZN/@@download/ENCFF002AZN.fastq.gz ENCSR471VHW
/files/ENCFF002BBE/@@download/ENCFF002BBE.fastq.gz ENCSR859JNA
/files/ENCFF002AYH/@@download/ENCFF002AYH.fastq.gz ENCSR114GLZ
/files/ENCFF001ZZY/@@download/ENCFF001ZZY.fastq.gz ENCSR597UDW
/files/ENCFF002AZP/@@download/ENCFF002AZP.fastq.gz ENCSR018LUP
/files/ENCFF002BCH/@@download/ENCFF002BCH.fastq.gz ENCSR741STU
/files/ENCFF002AZZ/@@download/ENCFF002AZZ.fastq.gz ENCSR367UDO
/files/ENCFF002AZO/@@download/ENCFF002AZO.fastq.gz ENCSR179YLS

However, the files indicated in href above did not exist. Contacting the ENCODE project, I learnt that these files were replaced with new files. They sent me the following information:

Experiment Old File New File href
ENCSR471VHW ENCFF002AZN ENCFF023EFI /files/ENCFF023EFI/@@download/ENCFF023EFI.fastq.gz
ENCSR859JNA ENCFF002BBE ENCFF637NPE /files/ENCFF637NPE/@@download/ENCFF637NPE.fastq.gz
ENCSR114GLZ ENCFF002AYH ENCFF140WER /files/ENCFF140WER/@@download/ENCFF140WER.fastq.gz
ENCSR597UDW ENCFF001ZZY ENCFF606DIN /files/ENCFF606DIN/@@download/ENCFF606DIN.fastq.gz
ENCSR018LUP ENCFF002AZP ENCFF195XOI /files/ENCFF195XOI/@@download/ENCFF195XOI.fastq.gz
ENCSR741STU ENCFF002BCH ENCFF538SFE /files/ENCFF538SFE/@@download/ENCFF538SFE.fastq.gz
ENCSR367UDO ENCFF002AZZ ENCFF699FIQ /files/ENCFF699FIQ/@@download/ENCFF699FIQ.fastq.gz
ENCSR179YLS ENCFF002AZO ENCFF185IUS /files/ENCFF185IUS/@@download/ENCFF185IUS.fastq.gz

Isn't ENCODExplorer using the most recent database from ENCODE?

Thank you,

searchEncode returns NA rows

> searchEncode("MNase+GM12878")[1:4]
[1] "results : 2 entries"
    accession assay_term_name dataset_type                      lab
1 ENCSR000CXP       MNase-seq   experiment Michael Snyder, Stanford
2        <NA>            <NA>         <NA>                     <NA>

If we do the same search in the encode project web portal, we see that the search returns an experiment and a publication data type. The latter is probably what is causing this bug.
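
Until the underlying cause is fixed, rows made up entirely of NA values could be dropped after the fact. This is a minimal sketch on a toy table shaped like the output above; drop_all_na_rows is a hypothetical helper, and it only hides the symptom rather than handling the publication entry properly.

```r
# Hypothetical helper: keep only rows that have at least one non-NA value.
drop_all_na_rows <- function(df) {
  keep <- rowSums(!is.na(df)) > 0
  df[keep, , drop = FALSE]
}

# Toy stand-in for the search result shown above.
hits <- data.frame(
  accession = c("ENCSR000CXP", NA),
  assay_term_name = c("MNase-seq", NA),
  dataset_type = c("experiment", NA),
  stringsAsFactors = FALSE
)

cleaned <- drop_all_na_rows(hits)
```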

downloadEncode with force == FALSE

I am trying to call downloadEncode with force == FALSE. In download_single_file, md5sum is called on dir/fileName, but fileName had already been concatenated with dir a couple of lines above. Therefore, the md5sum_file cannot be compared with md5sum_encode and an error is given.
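
The failure can be reproduced in isolation with base R. The sketch below uses a temporary toy file, not the package's actual code: it prefixes the directory twice, shows that tools::md5sum then returns NA, and shows that a bare comparison inside if raises an error. md5_matches is a hypothetical NA-safe alternative.

```r
dir <- tempdir()
file_name <- file.path(dir, "example.bed.gz")   # already includes dir
writeLines("toy content", file_name)

wrong_path <- file.path(dir, file_name)          # dir is prefixed a second time
md5_wrong <- unname(tools::md5sum(wrong_path))   # NA, because no such file exists
md5_right <- unname(tools::md5sum(file_name))

# A bare comparison against NA makes `if` fail with
# "missing value where TRUE/FALSE needed".
naive <- tryCatch(
  if (md5_wrong == md5_right) "match" else "mismatch",
  error = function(e) "error"
)

# Hypothetical NA-safe check: a missing or unreadable file counts as a mismatch.
md5_matches <- function(path, expected) {
  actual <- unname(tools::md5sum(path))
  !is.na(actual) && actual == expected
}
```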

Some functions cannot be found

I found that some functions, get_encode_types and get_schemas, cannot be found in the current version of ENCODExplorer, although they are listed in the official manual.

I did not check the rest of the functions in the package, but it is necessary to add these features, as the encode_df data is too old and needs to be updated.

Thank you.

Inconsistencies in replicate mapping

Certain types of files possess the "replicate" attribute, which we map to "replicate_list" and use to fill in fields from the replicate table. However, this usage is not consistent (raw data files such as fastq seem to have it, but processed files do not). One of the fields pulled from the replicate table is "replicate_library".

However, a new attribute named "replicate_libraries" has been added, and seems to be present everywhere. My current testing reveals that it is a superset of what we currently deduce with the file->replicate->library chain.

Should we remove replicate_library and just use replicate_libraries?

ENCODExplorer not compatible with R (version 4.1.0) nor Bioconductor (v 3.14)

Console output
'getOption("repos")' replaces Bioconductor standard repositories,
see '?repositories' for details

replacement repositories:
CRAN: https://cran.rstudio.com/

Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.0
(2021-05-18)
Warning message:
In .inet_warning(msg) :
package ‘ENCODExplorer’ is not available for Bioconductor version '3.14'

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages

R version

Is it possible to download an older version of the ENCODE Explorer? I can't download the new one because of the R version.

Thank you!

Not available for new R (4.0.2)

Hi,
I tried to install the package, but with the conventional method (BiocManager::install()) the download got stuck and the operation was aborted. I then tried to install from the binary file, but got the following warning message:
"package ‘~/Downloads/ENCODExplorer_2.14.0.tar’ is not available (for R version 4.0.2)"
Will the package be updated to be compatible with the newest version of R?

Thanks in advance!
Zsolt

Update documentation with correct file formats

There are some problems with the documentation, where some files are not found. This is because the peak file formats were updated in the ENCODE project database. If we look into the current encode_df object:

> table(encode_df$experiment$file_format)

    bam     bed  bigBed  bigWig     CEL csfasta  csqual   fasta   fastq     gtf
   7058    5092    4915    9639       8      37      38       3   12977    1152
   idat     tar     tsv
    126     111     465

We can see that the only format suitable for peaks is bed.

Changes to ENCODE metadata

The following attributes have disappeared from experiment objects:

  • system_slims
  • organ_slims
  • biosample_term_name
  • developmental_slims
  • biosample_type
  • biosample_synonyms
  • biosample_term_id

The following columns are new:

  • experiment_classification
  • biosample_ontology

biosample_type and biosample_term_name were part of the final encode_df.

biosample_type entries are indices into the new biosample_type table (EX: "/biosample-types/cell_line_EFO_0001182/", "/biosample-types/primary_cell_CL_1000458/").

The biosample_type table has the following columns. Columns which directly reference old experiments columns are marked with *. Columns which were part of encode_df are marked with **

  • dbxrefs
  • status
  • developmental_slims *
  • cell_slims
  • uuid
  • organ_slims *
  • type **
  • system_slims *
  • synonyms *
  • schema_version
  • classification
  • name
  • term_id *
  • id
  • term_name **

The lack of those columns causes export_ENCODEdb_matrix to fail when reordering columns. Three actions are possible, in increasing order of ambition:

  1. At the very least, the function must no longer crash, and should silently drop the missing columns.
  2. If possible, the columns from biosample_type which were previously in encode_df should now be joined into the experiment table so that information isn't lost.
  3. To go above and beyond the call of duty, the new columns of biosample_type which have no equivalent in the old data frame should also be added.
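
The first option can be sketched in a few lines of base R. reorder_known_columns is a hypothetical helper, not the package's implementation: it reorders only the preferred columns that are actually present and leaves any others at the end.

```r
# Hypothetical helper: put the preferred columns first, in the given order,
# silently skipping any that are absent from the table.
reorder_known_columns <- function(df, preferred) {
  present <- intersect(preferred, colnames(df))
  remaining <- setdiff(colnames(df), present)
  df[, c(present, remaining), drop = FALSE]
}

# Toy experiment table that no longer has biosample_type.
experiment <- data.frame(
  status = "released",
  accession = "ENCSR000CXP",
  biosample_ontology = "/biosample-types/cell_line_EFO_0001182/",
  stringsAsFactors = FALSE
)

ordered <- reorder_known_columns(
  experiment,
  preferred = c("accession", "biosample_type", "biosample_term_name", "status")
)
```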

Find a better fix for searchEncode new return values

The API call made by searchEncode now returns different values. In particular, the "replicates" column is a data.frame with a variable number of columns. I did a quick fix by setting this column to NULL, but we need a better approach that keeps this information.

Impossible to search for items with value NA

The current queryEncode interface does not support looking for results which bear a "NA" value.

A typical use-case would be trying to retrieve all results which have not been subjected to any treatment, and hence where the treatment column has value NA.
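
A filter that treats NA as a searchable value might look like the sketch below. filter_with_na is a hypothetical helper on a plain data.frame with toy data; it is not the queryEncode interface and ignores the fixed/regex matching that queryEncode offers.

```r
# Hypothetical helper: exact-match filter where value = NA selects rows
# whose column is NA, instead of matching nothing.
filter_with_na <- function(df, column, value) {
  if (is.na(value)) {
    df[is.na(df[[column]]), , drop = FALSE]
  } else {
    df[!is.na(df[[column]]) & df[[column]] == value, , drop = FALSE]
  }
}

# Toy results table (accessions are placeholders).
results <- data.frame(
  accession = c("E1", "E2", "E3"),
  treatment = c(NA, "ethanol", NA),
  stringsAsFactors = FALSE
)

untreated <- filter_with_na(results, "treatment", NA)
treated <- filter_with_na(results, "treatment", "ethanol")
```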

createDesign ctrl

You had made changes to the get_ctrl_design function, but it seems that the new version fails to extract the controls.

downloadEncode() error 'curl' call had nonzero exit status

Hello,

I'm having issues using the downloadEncode() function. I'm trying to follow along with the tutorial and run the following code:
query_results <- queryEncode(assay = "switchgear", target ="elavl1",
file_format = "bed" , fixed = FALSE)
downloadEncode(query_results)

I get the following error:
Error in download.file(url = paste0(encode_root, href), quiet = TRUE, :
'curl' call had nonzero exit status

I really appreciate any help!

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] curl_3.3 ENCODExplorer_2.10.0 shinythemes_1.1.2 DT_0.6 shiny_1.3.2

loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 compiler_3.6.0 pillar_1.4.1
[4] later_0.8.0 BiocManager_1.30.4 dbplyr_1.4.0
[7] AnnotationHub_2.16.0 tools_3.6.0 digest_0.6.19
[10] bit_1.1-14 jsonlite_1.6 RSQLite_2.1.1
[13] memoise_1.1.0 BiocFileCache_1.8.0 tibble_2.1.2
[16] pkgconfig_2.0.2 rlang_0.3.4 DBI_1.0.0
[19] yaml_2.2.0 parallel_3.6.0 stringr_1.4.0
[22] httr_1.4.0 dplyr_0.8.1 IRanges_2.18.1
[25] rappdirs_0.3.1 htmlwidgets_1.3 S4Vectors_0.22.0
[28] stats4_3.6.0 bit64_0.9-7 tidyselect_0.2.5
[31] data.table_1.12.2 Biobase_2.44.0 glue_1.3.1
[34] R6_2.4.0 AnnotationDbi_1.46.0 tidyr_0.8.3
[37] purrr_0.3.2 blob_1.1.1 magrittr_1.5
[40] promises_1.0.1 htmltools_0.3.6 BiocGenerics_0.30.0
[43] assertthat_0.2.1 interactiveDisplayBase_1.22.0 mime_0.6
[46] xtable_1.8-4 httpuv_1.5.1 stringi_1.4.3
[49] crayon_1.3.4

error in metadata table

We'll show a confusing situation with organism and assembly for elements of the
full 2019-10-13 metadata build.

ah = AnnotationHub()
query(ah, "ENCODExplorerData")
## AnnotationHub with 4 records
## # snapshotDate(): 2020-02-28
## # $dataprovider: ENCODE Project
## # $species: NA
## # $rdataclass: data.table
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["AH69290"]]' 
## 
##             title                                         
##   AH69290 | ENCODE File Metadata (Light, 2019-04-12 build)
##   AH69291 | ENCODE File Metadata (Full, 2019-04-12 build) 
##   AH75131 | ENCODE File Metadata (Light, 2019-10-13 build)
##   AH75132 | ENCODE File Metadata (Full, 2019-10-13 build)
fm = ah[["AH75132"]]

This is a data.table. We tabulate the assemblies in use for experiments of organism Homo sapiens.

> table(fm$assembly[fm$organism=="Homo sapiens"], fm$organism[fm$organism=="Homo sapiens"])
                
                 Homo sapiens
  GRCh38               129473
  GRCh38-minimal            4
  hg19                 162657
  mm10                    133
  mm10-minimal            270
  mm9                       2

Probably the organism is simply mislabeled, and the assembly annotation is more reliable. But is this an error upstream in ENCODE metadata or is it a curation problem in this package? Thank you.
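
To list the affected rows rather than just count them, the organism implied by the assembly name can be compared with the recorded organism. This is a sketch on toy data: organism_from_assembly is a hypothetical helper that only knows the human and mouse assembly prefixes appearing in the table above, and the file accessions are placeholders.

```r
# Hypothetical helper: infer the organism from the assembly name prefix.
organism_from_assembly <- function(assembly) {
  ifelse(grepl("^(GRCh|hg)", assembly), "Homo sapiens",
         ifelse(grepl("^mm", assembly), "Mus musculus", NA_character_))
}

# Toy stand-in for the metadata table; the real one is a data.table.
fm <- data.frame(
  file_accession = c("F1", "F2", "F3", "F4"),
  organism = c("Homo sapiens", "Homo sapiens", "Homo sapiens", "Mus musculus"),
  assembly = c("GRCh38", "hg19", "mm10", "mm10-minimal"),
  stringsAsFactors = FALSE
)

inferred <- organism_from_assembly(fm$assembly)
suspect <- fm[!is.na(inferred) & inferred != fm$organism, , drop = FALSE]
```

Rows in suspect are those where the two annotations disagree; which of the two is wrong still has to be checked against the ENCODE portal.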

missing function queryConsensusPeaks in Bioc v.2.10.0

Hi,

I am trying to follow the Bioconductor vignette for ENCODExplorer, and when I try to obtain consensus peaks from ChIP-Seq (section 5.1) I get the following error:

res = queryConsensusPeaks("22Rv1", "GRCh38", "CTCF")
Error in queryConsensusPeaks("22Rv1", "GRCh38", "CTCF"): 
  could not find function "queryConsensusPeaks"

It looks like the function is not part of the package anymore? I also looked into package ENCODExplorerData, but to no avail... Could you point me to the function? thanks!

session info:

  • Session info --------------------------------------------------
    setting value
    version R version 3.6.2 (2019-12-12)
    os Windows 10 x64
    system x86_64, mingw32
    ui RStudio
    language (EN)
    collate English_United States.1252
    ctype English_United States.1252
    tz America/New_York
    date 2020-02-16

  • Packages ------------------------------------------------------
    package * version date lib source
    AnnotationDbi * 1.46.1 2019-08-20 [1] Bioconductor
    AnnotationFilter * 1.8.0 2019-05-02 [1] Bioconductor
    AnnotationHub * 2.16.1 2019-09-04 [1] Bioconductor
    assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
    backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
    Biobase * 2.44.0 2019-05-02 [1] Bioconductor
    BiocFileCache * 1.8.0 2019-05-02 [1] Bioconductor
    BiocGenerics * 0.30.0 2019-05-02 [1] Bioconductor
    BiocManager 1.30.10 2019-11-16 [1] CRAN (R 3.6.1)
    BiocParallel 1.18.1 2019-08-06 [1] Bioconductor
    biomaRt 2.40.5 2019-10-01 [1] Bioconductor
    Biostrings 2.52.0 2019-05-02 [1] Bioconductor
    bit 1.1-15.1 2020-01-14 [1] CRAN (R 3.6.2)
    bit64 0.9-7 2017-05-08 [1] CRAN (R 3.6.0)
    bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
    blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.2)
    boot 1.3-24 2019-12-20 [2] CRAN (R 3.6.2)
    broom 0.5.4 2020-01-27 [1] CRAN (R 3.6.2)
    callr 3.4.2 2020-02-12 [1] CRAN (R 3.6.2)
    caTools 1.18.0 2020-01-17 [1] CRAN (R 3.6.2)
    cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
    checkmate 2.0.0 2020-02-06 [1] CRAN (R 3.6.2)
    cli 2.0.1 2020-01-08 [1] CRAN (R 3.6.2)
    codetools 0.2-16 2018-12-24 [2] CRAN (R 3.6.2)
    colorRamps 2.3 2012-10-29 [1] CRAN (R 3.6.0)
    colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
    crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
    curl 4.3 2019-12-02 [1] CRAN (R 3.6.1)
    data.table 1.12.8 2019-12-09 [1] CRAN (R 3.6.2)
    DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.2)
    dbplyr * 1.4.2 2019-06-17 [1] CRAN (R 3.6.1)
    DelayedArray 0.10.0 2019-05-02 [1] Bioconductor
    desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
    devtools 2.2.1 2019-09-24 [1] CRAN (R 3.6.1)
    digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.1)
    doParallel 1.0.15 2019-08-02 [1] CRAN (R 3.6.1)
    dplyr * 0.8.4 2020-01-31 [1] CRAN (R 3.6.2)
    DT * 0.12 2020-02-05 [1] CRAN (R 3.6.2)
    edgeR * 3.26.8 2019-09-01 [1] Bioconductor
    ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
    ENCODExplorer * 2.10.0 2019-05-02 [1] Bioconductor
    ENCODExplorerData * 0.99.1 2020-02-16 [1] Bioconductor
    ensembldb * 2.8.1 2019-10-11 [1] Bioconductor
    fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.2)
    fastmap 1.0.1 2019-10-08 [1] CRAN (R 3.6.1)
    forcats * 0.4.0 2019-02-17 [1] CRAN (R 3.6.1)
    foreach * 1.4.8 2020-02-09 [1] CRAN (R 3.6.2)
    fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
    gdata 2.18.0 2017-06-06 [1] CRAN (R 3.6.0)
    generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.1)
    GenomeInfoDb * 1.20.0 2019-05-02 [1] Bioconductor
    GenomeInfoDbData 1.2.1 2019-11-30 [1] Bioconductor
    GenomicAlignments 1.20.1 2019-06-18 [1] Bioconductor
    GenomicFeatures * 1.36.4 2019-07-11 [1] Bioconductor
    GenomicRanges * 1.36.1 2019-09-06 [1] Bioconductor
    ggplot2 * 3.2.1 2019-08-10 [1] CRAN (R 3.6.1)
    glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
    gplots 3.0.1.2 2020-01-11 [1] CRAN (R 3.6.2)
    graph 1.62.0 2019-05-02 [1] Bioconductor
    graphite * 1.30.0 2019-05-02 [1] Bioconductor
    gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
    gtools 3.8.1 2018-06-26 [1] CRAN (R 3.6.0)
    haven 2.2.0 2019-11-08 [1] CRAN (R 3.6.1)
    hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.2)
    htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
    htmlwidgets 1.5.1 2019-10-08 [1] CRAN (R 3.6.1)
    httpuv 1.5.2 2019-09-11 [1] CRAN (R 3.6.1)
    httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
    interactiveDisplayBase 1.22.0 2019-05-02 [1] Bioconductor
    IRanges * 2.18.3 2019-09-24 [1] Bioconductor
    iterators 1.0.12 2019-07-26 [1] CRAN (R 3.6.1)
    jsonlite 1.6.1 2020-02-02 [1] CRAN (R 3.6.2)
    KernSmooth 2.23-16 2019-10-15 [2] CRAN (R 3.6.2)
    later 1.0.0 2019-10-04 [1] CRAN (R 3.6.1)
    lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.2)
    lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.1)
    lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.1)
    limma * 3.40.6 2019-07-26 [1] Bioconductor
    lme4 1.1-21 2019-03-05 [1] CRAN (R 3.6.1)
    locfit 1.5-9.1 2013-04-20 [1] CRAN (R 3.6.1)
    lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.6.1)
    magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
    MASS 7.3-51.5 2019-12-20 [2] CRAN (R 3.6.2)
    Matrix 1.2-18 2019-11-27 [1] CRAN (R 3.6.1)
    matrixStats 0.55.0 2019-09-07 [1] CRAN (R 3.6.1)
    memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
    mime 0.9 2020-02-04 [1] CRAN (R 3.6.2)
    minqa 1.2.4 2014-10-09 [1] CRAN (R 3.6.1)
    modelr 0.1.5 2019-08-08 [1] CRAN (R 3.6.1)
    munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
    nlme 3.1-144 2020-02-06 [1] CRAN (R 3.6.2)
    nloptr 1.2.1 2018-10-03 [1] CRAN (R 3.6.1)
    packrat 0.5.0 2018-11-14 [1] CRAN (R 3.6.1)
    pbkrtest 0.4-7 2017-03-15 [1] CRAN (R 3.6.1)
    pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.2)
    pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.6.1)
    pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
    pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
    plyr 1.8.5 2019-12-10 [1] CRAN (R 3.6.2)
    prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.2)
    processx 3.4.2 2020-02-09 [1] CRAN (R 3.6.2)
    progress 1.2.2 2019-05-16 [1] CRAN (R 3.6.1)
    promises 1.1.0 2019-10-04 [1] CRAN (R 3.6.1)
    ProtGenerics 1.16.0 2019-05-02 [1] Bioconductor
    ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.2)
    purrr * 0.3.3 2019-10-18 [1] CRAN (R 3.6.1)
    R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
    rappdirs 0.3.1 2016-03-28 [1] CRAN (R 3.6.1)
    Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.1)
    RCurl 1.98-1.1 2020-01-19 [1] CRAN (R 3.6.2)
    readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
    readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
    remotes 2.1.1 2020-02-15 [1] CRAN (R 3.6.2)
    reprex 0.3.0 2019-05-16 [1] CRAN (R 3.6.1)
    reshape2 1.4.3 2017-12-11 [1] CRAN (R 3.6.1)
    rlang 0.4.4 2020-01-28 [1] CRAN (R 3.6.2)
    rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
    Rsamtools 2.0.3 2019-10-10 [1] Bioconductor
    RSQLite 2.2.0 2020-01-07 [1] CRAN (R 3.6.2)
    rstudioapi 0.11 2020-02-07 [1] CRAN (R 3.6.2)
    rtracklayer 1.44.4 2019-09-06 [1] Bioconductor
    rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.1)
    S4Vectors * 0.22.1 2019-09-09 [1] Bioconductor
    scales * 1.1.0 2019-11-18 [1] CRAN (R 3.6.1)
    sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
    shiny * 1.4.0 2019-10-10 [1] CRAN (R 3.6.1)
    shinythemes * 1.1.2 2018-11-06 [1] CRAN (R 3.6.1)
    stringi 1.4.4 2020-01-09 [1] CRAN (R 3.6.2)
    stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
    SummarizedExperiment 1.14.1 2019-07-31 [1] Bioconductor
    testthat 2.3.1 2019-12-01 [1] CRAN (R 3.6.1)
    tibble * 2.1.3 2019-06-06 [1] CRAN (R 3.6.1)
    tidyr * 1.0.2 2020-01-24 [1] CRAN (R 3.6.2)
    tidyselect 1.0.0 2020-01-27 [1] CRAN (R 3.6.2)
    tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.1)
    usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1)
    variancePartition * 1.14.1 2019-10-01 [1] Bioconductor
    vctrs 0.2.2 2020-01-24 [1] CRAN (R 3.6.2)
    withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
    XML 3.99-0.3 2020-01-20 [1] CRAN (R 3.6.2)
    xml2 1.2.2 2019-08-09 [1] CRAN (R 3.6.1)
    xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.2)
    XVector 0.24.0 2019-05-02 [1] Bioconductor
    yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.2)
    zlibbioc 1.30.0 2019-05-02 [1] Bioconductor

downloadEncode(force=FALSE) has non-intuitive behaviour.

Calling downloadEncode with force=FALSE will not check that an existing file has a matching md5sum, and will still report success.

Example:

    q_results = queryEncodeGeneric(biosample_name="A549", 
                                   file_type="bed narrowPeak", 
                                   target="BHLHE40")
    d_results = downloadEncode(q_results)
    d_result_files = gsub("Success downloading file : ", "", d_results)
    
    checkTrue(all(file.exists(d_result_files)))
    
    # Download again with force=FALSE, should fail.
    file.remove(d_result_files)
    file.create(d_result_files)
    d_results = downloadEncode(q_results, force=FALSE)

will yield:

[1] "Success downloading file : ./ENCFF001VDM.bed.gz"
[1] "Success downloading file : ./ENCFF002COC.bed.gz"
[1] "Files can be found at C:/Dev/Projects/ENCODExplorer"

whereas the files have clearly not been downloaded, and are in fact corrupt.

The behaviour of force might not need to change, but its reporting should at least clearly spell out what happened ("Files were not downloaded because they already exist"). Checking whether the existing file's md5 matches the expected one would be a nice bonus.
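
The reporting suggested above could distinguish three cases. The sketch below is an assumption about what such a check might look like, not the package's code: existing_file_status is a hypothetical helper, and the empty file mirrors the file.create() call in the example.

```r
# Hypothetical helper: describe the state of a local file relative to the
# md5 that ENCODE reports for it.
existing_file_status <- function(path, expected_md5) {
  if (!file.exists(path)) {
    return("not present, download needed")
  }
  actual <- unname(tools::md5sum(path))
  if (!is.na(actual) && actual == expected_md5) {
    "already present, md5 matches, download skipped"
  } else {
    "already present, md5 MISMATCH, file may be corrupt"
  }
}

# An empty placeholder file, as created by file.create() in the example.
empty_file <- tempfile(fileext = ".bed.gz")
invisible(file.create(empty_file))

# md5 of zero bytes of input.
empty_md5 <- "d41d8cd98f00b204e9800998ecf8427e"

status_corrupt <- existing_file_status(empty_file, "0123456789abcdef0123456789abcdef")
status_ok <- existing_file_status(empty_file, empty_md5)
status_missing <- existing_file_status(tempfile(), empty_md5)
```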

package failing in Bioc 3.14

txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
isActiveSeq(txdb) <- c(rep(FALSE,20), TRUE, rep(FALSE, 14))
allpromoter <- getPromoterClass(txdb, Nproc=1, org=Mmusculus)
Error in clusterApplyLB(cl, chrs, pcChr, tssGR = tssGR, ORG = org) :
could not find function "clusterApplyLB"
Calls: getPromoterClass -> getPromoterClass
Execution halted

See the build report: the package seems to be deprecated. Please update the README.md if you are not going to maintain the package. @CharlesJB @lshep

Error in Bioconductor build

Relevant snippet from the build:

Running ‘runTests.R’ [4s/5s]
 ERROR
Running the tests in ‘tests/runTests.R’ failed.
Last 13 lines of output:
  ENCODExplorer RUnit Tests - 18 test functions, 0 errors, 1 failure
  FAILURE in test.md5sum: Error in checkEquals(as.character(tools::md5sum(system.file("extdata/ENCFF001VCK.broadPeak.gz",  : 
    1 string mismatch

  Test files with failing tests

     test_download.R 
       test.md5sum 

  Error in BiocGenerics:::testPackage("ENCODExplorer") : 
    unit tests failed for package ENCODExplorer
  Execution halted

prepare_ENCODEdb breaks on empty tables

Currently, the list of ENCODE table types (as returned by get_encode_types(), which sources its list directly from the ENCODE GitHub repository) includes access_key, but querying it returns an empty set (no columns, no rows). This causes prepare_ENCODEdb to crash.

Checking for empty tables seems to have fixed it, but I need to test it more thoroughly before I push it to production.
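
The kind of check described above can be sketched as a filter over the downloaded tables. The table list here is toy data and is_empty_table is a hypothetical helper; the actual fix in prepare_ENCODEdb may differ.

```r
# Hypothetical helper: a table is empty if it is NULL or has no rows or no columns.
is_empty_table <- function(tbl) {
  is.null(tbl) || NROW(tbl) == 0 || NCOL(tbl) == 0
}

# Toy stand-in for the tables fetched per ENCODE type.
tables <- list(
  experiment = data.frame(accession = "ENCSR000CXP", stringsAsFactors = FALSE),
  access_key = data.frame()   # no columns, no rows
)

non_empty <- Filter(Negate(is_empty_table), tables)
skipped <- setdiff(names(tables), names(non_empty))
```

Keeping the names of the skipped tables makes it possible to log them instead of dropping them silently.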

shinyEncode sources a file that does not exist

In shinyEncode(), the line

source(file = system.file("inst/shiny/ui.R", package = "ENCODExplorer"))

should be

source(file = system.file("shiny/ui.R", package = "ENCODExplorer"))

since everything in the inst directory gets placed in the package root upon installation.
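
The effect is easy to check, because system.file() returns an empty string when the path does not exist in the installed package. The sketch below uses the stats package and its DESCRIPTION file purely as a file known to exist in any R installation; the same lookup logic applies to shiny/ui.R in ENCODExplorer.

```r
# A path that exists in the installed package resolves to a full file name.
found <- system.file("DESCRIPTION", package = "stats")

# Prefixing "inst/" points at a directory that does not exist after
# installation, so the result is "".
not_found <- system.file("inst/DESCRIPTION", package = "stats")

# A defensive caller can fail early instead of passing "" to source().
resolve_or_stop <- function(path, package) {
  resolved <- system.file(path, package = package)
  if (!nzchar(resolved)) {
    stop("File not found in package ", package, ": ", path)
  }
  resolved
}
```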

downloadEncode fails if the download directory is not the current directory.

Presently, downloadEncode uses the following line to fetch the files:

download.file(url=paste0(encode_root, href), quiet=TRUE,
                                   destfile=fileName, method = "curl",
                                   extra = "-O -L" )

However, the documentation for curl's -O flag reads as follows:

-O, --remote-name

Write output to a local file named like the remote file we get. (Only the file part of the remote file is used, the path is cut off.)

The file will be saved in the current working directory. If you want the file saved in a different directory, make sure you change the current working directory before invoking curl with this option.

So whenever a user passes in a different directory as a target, curl ignores it, and the md5sum check fails since the file is not where it is expected to be.
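
One possible fix is to drop -O and let destfile decide where the file goes. The sketch below is an assumption about how that could look, not the package's code: destination_for and fetch_to_dir are hypothetical helpers, encode_root is passed in as a parameter, and fetch_to_dir is defined but not called here because it needs network access and a curl binary.

```r
# Hypothetical helper: build the local path from the target directory and
# the file part of the href.
destination_for <- function(href, dir = ".") {
  file.path(dir, basename(href))
}

# Hypothetical helper: download one file into `dir`. Without -O, curl writes
# to the destfile that download.file() passes to it; -L still follows redirects.
fetch_to_dir <- function(encode_root, href, dir = ".") {
  dest <- destination_for(href, dir)
  download.file(url = paste0(encode_root, href), destfile = dest,
                quiet = TRUE, method = "curl", extra = "-L")
  dest
}

example_dest <- destination_for(
  "/files/ENCFF001VDM/@@download/ENCFF001VDM.bed.gz",
  dir = "downloads"
)
```

With the destination computed in one place, the later md5sum check can reuse the same path instead of rebuilding it.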
