encodexplorer's Issues
Database is not completed?
Hey,
I was looking for RNA-seq data for one cell line; however, the query function returns nothing.
`> query_results <- queryEncode(df=encode_df, organism = "Homo sapiens", biosample_name="GM12878",assay="RNA-seq", fixed=FALSE)
head(query_results)
Empty data.table (0 rows) of 73 cols: accession,file_accession,file_type,file_format,file_size,output_category...`
This is not consistent with the search results in the ENCODE portal:
https://www.encodeproject.org/search/?type=Experiment&status=released&biosample_ontology.classification=cell+line&assay_title=polyA+RNA-seq&biosample_ontology.term_name=GM12878&assay_title=polyA+RNA-seq&biosample_ontology.classification=cell+line
Support Drosophila melanogaster?
Hello! I have enjoyed using ENCODExplorer but I would like to query for Drosophila melanogaster. I tried using the organism, assembly, and biosample_type. Do you support other organisms besides Homo sapiens and Mus musculus?
I have been following:
https://bioconductor.org/packages/release/bioc/vignettes/ENCODExplorer/inst/doc/ENCODExplorer.html and http://bioconductor.org/packages/release/bioc/manuals/ENCODExplorer/man/ENCODExplorer.pdf
Downloading already existing files will cause an error
Attempts at downloading a file that has already been downloaded will cause ENCODExplorer to crash.
In short, the path used to calculate the existing file's md5 hash is wrong, which makes the hash value NA. When compared against the ENCODE hash, the resulting value is NA, which makes the if statement raise an exception.
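The failure mode described above can be reproduced in base R. This is a minimal sketch, not the package's actual code: `safe_md5_equal()` is a hypothetical helper name, and the only library call used is `tools::md5sum()`, which returns NA for a path that does not exist.

```r
# A path that does not exist: tools::md5sum() returns NA instead of failing.
missing_path <- file.path(tempdir(), "no_such_dir", "ENCFF_example.bed.gz")
local_md5 <- unname(tools::md5sum(missing_path))
is.na(local_md5)  # TRUE

# Comparing NA with a hash gives NA, and if (NA) raises an error.
msg <- tryCatch(
  if (local_md5 == "0123456789abcdef") "match" else "mismatch",
  error = function(e) conditionMessage(e)
)
msg  # "missing value where TRUE/FALSE needed"

# Hypothetical guard: isTRUE() maps NA to FALSE, so the condition is always
# a plain TRUE or FALSE.
safe_md5_equal <- function(local_md5, expected_md5) {
  isTRUE(local_md5 == expected_md5)
}
safe_md5_equal(local_md5, "0123456789abcdef")  # FALSE
```

With a guard of this kind, a wrong path shows up as a hash mismatch rather than a crash, although the path itself would still need to be fixed.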
some metadata are not up to date?
Hi,
Thanks for this useful tool:
library(ENCODExplorer)
data(encode_df, package = "ENCODExplorer")
query_results_melanocyte <- queryEncode(df=encode_df, organism = "Homo sapiens",
biosample_name = c("foreskin melanocyte"), file_format = "fastq", fixed = FALSE,
assay = "ChIP-seq")
> query_results_melanocyte
Empty data.table (0 rows) of 73 cols: accession,file_accession,file_type,file_format,file_size,output_category...
> devtools::session_info()
Session info ---------------------------------------------------------------------------------------
setting value
version R version 3.4.2 (2017-09-28)
system x86_64, darwin15.6.0
ui RStudio (1.0.153)
language (EN)
collate en_US.UTF-8
tz America/Chicago
date 2018-01-11
Packages -------------------------------------------------------------------------------------------
package * version date source
assertthat 0.2.0 2017-04-11 cran (@0.2.0)
base * 3.4.2 2017-10-04 local
bindr 0.1 2016-11-13 cran (@0.1)
bindrcpp * 0.2 2017-06-17 cran (@0.2)
BiocInstaller * 1.28.0 2017-10-31 Bioconductor
bitops 1.0-6 2013-08-17 cran (@1.0-6)
compiler 3.4.2 2017-10-04 local
data.table 1.10.4-3 2017-10-27 cran (@1.10.4-)
datasets * 3.4.2 2017-10-04 local
devtools 1.13.3 2017-08-02 CRAN (R 3.4.1)
digest 0.6.12 2017-01-27 CRAN (R 3.4.0)
dplyr 0.7.4 2017-09-28 cran (@0.7.4)
DT * 0.2 2016-08-09 CRAN (R 3.4.0)
ENCODExplorer * 2.4.0 2017-10-31 Bioconductor
glue 1.2.0 2017-10-29 cran (@1.2.0)
graphics * 3.4.2 2017-10-04 local
grDevices * 3.4.2 2017-10-04 local
htmltools 0.3.6 2017-04-28 CRAN (R 3.4.0)
htmlwidgets 0.9 2017-07-10 CRAN (R 3.4.1)
httpuv 1.3.5 2017-07-04 CRAN (R 3.4.1)
jsonlite 1.5 2017-06-01 CRAN (R 3.4.0)
magrittr 1.5 2014-11-22 cran (@1.5)
memoise 1.1.0 2017-04-21 CRAN (R 3.4.0)
methods * 3.4.2 2017-10-04 local
mime 0.5 2016-07-07 CRAN (R 3.4.0)
parallel 3.4.2 2017-10-04 local
pkgconfig 2.0.1 2017-03-21 cran (@2.0.1)
purrr 0.2.4 2017-10-18 CRAN (R 3.4.2)
R6 2.2.2 2017-06-17 CRAN (R 3.4.0)
Rcpp 0.12.14 2017-11-23 cran (@0.12.14)
RCurl 1.95-4.8 2016-03-01 cran (@1.95-4.)
rlang 0.1.4 2017-11-05 cran (@0.1.4)
shiny * 1.0.5 2017-08-23 CRAN (R 3.4.1)
shinythemes * 1.1.1 2016-10-12 CRAN (R 3.4.0)
stats * 3.4.2 2017-10-04 local
stringi 1.1.6 2017-11-17 cran (@1.1.6)
stringr 1.2.0 2017-02-18 cran (@1.2.0)
tibble 1.3.4 2017-08-22 cran (@1.3.4)
tidyr 0.7.2 2017-10-16 cran (@0.7.2)
tools 3.4.2 2017-10-04 local
utils * 3.4.2 2017-10-04 local
withr 2.0.0 2017-07-28 CRAN (R 3.4.1)
xtable 1.8-2 2016-02-05 cran (@1.8-2)
However, on the ENCODE site I can see that the fastq files are there: https://www.encodeproject.org/search/?type=Experiment&assay_title=ChIP-seq&target.investigated_as=histone+modification&files.file_type=fastq&biosample_type=primary+cell&biosample_term_name=foreskin+melanocyte&biosample_term_name=foreskin+melanocyte
Thank you for looking into this.
Best,
Tommy
encode_df is not used by default when called from a package
The behaviour of using ENCODExplorer::encode_df by default when the df parameter is NULL in queryEncode and downloadEncode fails if those functions are called from within a package rather than from the global environment where library(ENCODExplorer) has been called.
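One way to make a default dataset independent of the search path is to load it explicitly from its package into a private environment. The sketch below is an assumption about how this could be done, not the package's implementation; `load_packaged_df()` is a hypothetical helper, and it is illustrated with a dataset that ships with R so that it runs without ENCODExplorer installed.

```r
# Hypothetical helper: return the caller's table if one was given, otherwise
# load the named dataset from the named package into a private environment,
# so the lookup does not depend on the package being attached.
load_packaged_df <- function(df = NULL, name, package) {
  if (!is.null(df)) {
    return(df)
  }
  env <- new.env()
  utils::data(list = name, package = package, envir = env)
  get(name, envir = env)
}

# Illustration with a dataset from the 'datasets' package. For this issue the
# call would presumably use name = "encode_df", package = "ENCODExplorer".
default_df <- load_packaged_df(name = "iris", package = "datasets")
is.data.frame(default_df)  # TRUE
```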
HiC question
Is it possible yet to use ENCODE Explorer in R to query HiC or promoter capture HiC in humans in Encode? I was hoping to call for some consensus by cell type or tissue type.
Error in clean_column from searchEncode
Since this week, the searchEncode example fails with an error:
> searchEncode("ChIP-Seq+H3K4me1")
Error in `[[<-.data.frame`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", :
replacement has 20 rows, data has 10
1: searchEncode("ChIP-Seq+H3K4me1")
2: suppressWarnings(clean_table(r))
3: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
4: clean_table(r)
5: lapply(colnames(table), clean_column, table)
6: FUN(X[[i]], ...)
7: `[[<-`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip
8: `[[<-.data.frame`(`*tmp*`, i, value = list(current_version = c("/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/ggr-tr1-chip-seq-quantification-step-v-1-0/", "/analysis-step-versions/g
The error occurs because the table being cleaned (the one returned by searchEncode: searchEncode results aren't split into multiple tables) has a double-nested data.frame (The table has a data.frame column which itself has a data.frame column). clean_column does not handle such a case, and fails.
The downloaded table object, as it is passed to clean_column, is attached. The error occurs on the analysis_step_version column of the table, which is a data.frame including the analysis_step_version column, which is itself a 20-column data.frame.
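The double-nested structure can be reproduced with a small toy table. The sketch below is one possible way to handle it, assuming a hypothetical recursive helper `flatten_df()`; it is not the package's clean_column, and it only handles atomic columns and nested data.frames (list columns are out of scope).

```r
# Hypothetical helper: recursively expand data.frame columns into ordinary
# columns, prefixing each nested column name with its parent's name.
flatten_df <- function(df, sep = ".") {
  cols <- list()
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.data.frame(col)) {
      sub <- flatten_df(col, sep)
      names(sub) <- paste(nm, names(sub), sep = sep)
      cols <- c(cols, as.list(sub))
    } else {
      cols[[nm]] <- col
    }
  }
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Toy table mimicking the report: a data.frame column that itself contains
# a data.frame column. All values are made up.
inner <- data.frame(current_version = c("v1", "v2"), stringsAsFactors = FALSE)
middle <- data.frame(id = 1:2)
middle$analysis_step_version <- inner
outer <- data.frame(accession = c("A", "B"), stringsAsFactors = FALSE)
outer$analysis_step_version <- middle

flat <- flatten_df(outer)
names(flat)
```

After flattening, every column is atomic, so per-column cleaning no longer has to deal with differing row or column counts inside a cell.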
queryEncode with non-NULL df variable
Hi,
I am using the prepare_ENCODEdb and export_ENCODEdb_matrix to get the latest ENCODE data. I can pass the result to queryEncode with the argument df = encode_df2. However, it seems that inside the queryEncode function it is always using encode_df and not the passed df value. I can get around the issue by calling my result from prepare_ENCODEdb and export_ENCODEdb_matrix encode_df instead of encode_df2.
Thanks,
Karen
absolute files
Using the queryEncode function from the ENCODExplorer_1.2.4 R package, I downloaded a table that includes the following:
href accession
/files/ENCFF002AZN/@@download/ENCFF002AZN.fastq.gz ENCSR471VHW
/files/ENCFF002BBE/@@download/ENCFF002BBE.fastq.gz ENCSR859JNA
/files/ENCFF002AYH/@@download/ENCFF002AYH.fastq.gz ENCSR114GLZ
/files/ENCFF001ZZY/@@download/ENCFF001ZZY.fastq.gz ENCSR597UDW
/files/ENCFF002AZP/@@download/ENCFF002AZP.fastq.gz ENCSR018LUP
/files/ENCFF002BCH/@@download/ENCFF002BCH.fastq.gz ENCSR741STU
/files/ENCFF002AZZ/@@download/ENCFF002AZZ.fastq.gz ENCSR367UDO
/files/ENCFF002AZO/@@download/ENCFF002AZO.fastq.gz ENCSR179YLS
However, the files indicated in href above did not exist. Contacting the ENCODE project, I learnt that these files were replaced with new files. They sent me the following information:
Experiment Old File New File href
ENCSR471VHW ENCFF002AZN ENCFF023EFI /files/ENCFF023EFI/@@download/ENCFF023EFI.fastq.gz
ENCSR859JNA ENCFF002BBE ENCFF637NPE /files/ENCFF637NPE/@@download/ENCFF637NPE.fastq.gz
ENCSR114GLZ ENCFF002AYH ENCFF140WER /files/ENCFF140WER/@@download/ENCFF140WER.fastq.gz
ENCSR597UDW ENCFF001ZZY ENCFF606DIN /files/ENCFF606DIN/@@download/ENCFF606DIN.fastq.gz
ENCSR018LUP ENCFF002AZP ENCFF195XOI /files/ENCFF195XOI/@@download/ENCFF195XOI.fastq.gz
ENCSR741STU ENCFF002BCH ENCFF538SFE /files/ENCFF538SFE/@@download/ENCFF538SFE.fastq.gz
ENCSR367UDO ENCFF002AZZ ENCFF699FIQ /files/ENCFF699FIQ/@@download/ENCFF699FIQ.fastq.gz
ENCSR179YLS ENCFF002AZO ENCFF185IUS /files/ENCFF185IUS/@@download/ENCFF185IUS.fastq.gz
Isn't ENCODExplorer using the most recent database from ENCODE?
Thank you,
searchEncode returns NA rows
> searchEncode("MNase+GM12878")[1:4]
[1] "results : 2 entries"
accession assay_term_name dataset_type lab
1 ENCSR000CXP MNase-seq experiment Michael Snyder, Stanford
2 <NA> <NA> <NA> <NA>
If we do the same search in the ENCODE project web portal, we see that the search returns an experiment and a publication data type. The latter is probably what is causing this bug.
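Until the non-experiment result types are handled properly, rows that are entirely NA could be dropped after the search. A minimal sketch in base R, using the values from the output above; this is a workaround idea, not the package's code.

```r
# Toy copy of the reported result: one experiment row and one all-NA row.
res <- data.frame(
  accession = c("ENCSR000CXP", NA),
  assay_term_name = c("MNase-seq", NA),
  dataset_type = c("experiment", NA),
  stringsAsFactors = FALSE
)

# Keep only rows that have at least one non-NA value.
keep <- rowSums(!is.na(res)) > 0
res_clean <- res[keep, , drop = FALSE]
nrow(res_clean)  # 1
```

Dropping the row also hides the publication hit, so the reported entry count would need to be adjusted to match.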
downloadEncode with force == FALSE
I am trying to call downloadEncode with force == FALSE. In download_single_file, md5sum is called on dir/fileName, but fileName had already been concatenated with dir a couple of lines above. Therefore, the md5sum_file cannot be compared with md5sum_encode and an error is given.
some functions can not be found
I found that some functions, get_encode_types and get_schemas, could not be found in the current version of ENCODExplorer, although they are listed in the official manual.
I did not check the rest of the functions in the package. But it is necessary to add these features, as the encode_df data is too old and needs to be updated.
Thank you.
Inconsistencies in replicate mapping
Certain types of files possess the "replicate" attribute, which we map to "replicate_list" and use to fill in fields from the replicate table. However, this usage is not consistent (it seems that raw data (fastq) have it, but processed files do not). One of the fields pulled from the replicate table is "replicate_library".
However, a new attribute named "replicate_libraries" has been added, and seems to be present everywhere. My current testing reveals that it is a superset of what we currently deduce with the file->replicate->library chain.
Should we remove replicate_library and just use replicate_libraries?
ENCODExplorer not compatible with R (version 4.1.0) nor Bioconductor (v 3.14)
Console output
'getOption("repos")' replaces Bioconductor standard repositories,
see '?repositories' for details
replacement repositories:
CRAN: https://cran.rstudio.com/
Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.0
(2021-05-18)
Warning message:
In .inet_warning(msg) :
package ‘ENCODExplorer’ is not available for Bioconductor version '3.14'
A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
R version
Is it possible to download an older version of the ENCODE Explorer? I can't download the new one because of the R version.
Thank you!
Not available for new R (4.0.2)
Hi,
I tried to install the package, but with the conventional approach (BiocManager::install()) the download got stuck and the operation was aborted. After that I tried to install from a binary file, but I got the following warning message:
"package ‘~/Downloads/ENCODExplorer_2.14.0.tar’ is not available (for R version 4.0.2)"
Will the package be updated to be compatible with the newest version of R?
Thanks in advance!
Zsolt
Update documentation with correct file formats
There are some problems with the documentation where some files are not found. This is caused by the fact that the peak file formats were updated in the ENCODE project database. If we look into the current encode_df object:
> table(encode_df$experiment$file_format)
bam bed bigBed bigWig CEL csfasta csqual fasta fastq gtf
7058 5092 4915 9639 8 37 38 3 12977 1152
idat tar tsv
126 111 465
We can see that the only format that fits peaks is bed.
Changes to ENCODE metadata
The following attributes have disappeared from experiment objects:
- system_slims
- organ_slims
- biosample_term_name
- developmental_slims
- biosample_type
- biosample_synonyms
- biosample_term_id
The following columns are new:
- experiment_classification
- biosample_ontology
biosample_type and biosample_term_name were part of the final encode_df.
biosample_type entries are indices into the new biosample_type table (EX: "/biosample-types/cell_line_EFO_0001182/", "/biosample-types/primary_cell_CL_1000458/").
The biosample_type table has the following columns. Columns which directly reference old experiments columns are marked with *. Columns which were part of encode_df are marked with **
- dbxrefs
- status
- developmental_slims *
- cell_slims
- uuid
- organ_slims *
- type **
- system_slims *
- synonyms *
- schema_version
- classification
- name
- term_id *
- id
- term_name **
The lack of those columns causes export_ENCODEdb_matrix to fail when reordering columns. Three actions should be considered, in increasing order of ambition:
- At the very least, the function mustn't crash anymore and drop the missing columns silently.
- If possible, the columns from biosample_type which were previously in encode_df should now be joined into the experiment table so that information isn't lost.
- To go above and beyond the call of duty, the new columns of biosample_type which have no equivalent in the old data frame should also be added.
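The first action, reordering without crashing when a requested column is gone, can be sketched in base R. `reorder_columns()` is a hypothetical helper rather than the code in export_ENCODEdb_matrix, and the table below is toy data.

```r
# Hypothetical helper: put the wanted columns first, in the requested order,
# and skip any that are absent instead of failing.
reorder_columns <- function(df, wanted, verbose = FALSE) {
  present <- intersect(wanted, names(df))
  absent <- setdiff(wanted, names(df))
  if (verbose && length(absent) > 0) {
    message("Skipping missing columns: ", paste(absent, collapse = ", "))
  }
  rest <- setdiff(names(df), present)
  df[, c(present, rest), drop = FALSE]
}

# Toy experiment table that lacks the two removed biosample columns.
experiment <- data.frame(
  status = "released",
  biosample_ontology = "/biosample-types/cell_line_EFO_0001182/",
  accession = "EXAMPLE1",
  stringsAsFactors = FALSE
)
wanted <- c("accession", "biosample_type", "biosample_term_name", "status")
reordered <- reorder_columns(experiment, wanted)
names(reordered)  # "accession" "status" "biosample_ontology"
```

The `verbose` switch is there because silent dropping, as the issue proposes, can make later debugging harder; it defaults to the silent behaviour.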
queryEncode - Add support for boolean operators.
Also add support for vectors.
Find a better fix for searchEncode new return values
The API call made during searchEncode now returns different values. In particular, the "replicates" column is a data.frame with a variable number of columns. I did a quick fix by setting the value of this column to NULL, but we need to find a better approach to keep this info.
Impossible to search for items with value NA
The current queryEncode interface does not support looking for results which bear a "NA" value.
A typical use-case would be trying to retrieve all results which have not been subjected to any treatment, and hence where the treatment column has value NA.
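The requested behaviour could look roughly like the following. This is a sketch in base R with a hypothetical helper `filter_column()` and made-up data; it does not reflect the actual queryEncode interface.

```r
# Hypothetical filter: an NA query value selects rows where the column is NA;
# any other value selects exact, non-NA matches.
filter_column <- function(df, column, value) {
  if (is.na(value)) {
    hit <- is.na(df[[column]])
  } else {
    hit <- !is.na(df[[column]]) & df[[column]] == value
  }
  df[hit, , drop = FALSE]
}

# Toy metadata: two untreated files and one treated file.
meta <- data.frame(
  file_accession = c("F1", "F2", "F3"),
  treatment = c(NA, "drugA", NA),
  stringsAsFactors = FALSE
)

untreated <- filter_column(meta, "treatment", NA)
untreated$file_accession  # "F1" "F3"
```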
createDesign ctrl
You had made changes to the get_ctrl_design function, but it seems that the new version fails to extract the controls.
downloadEncode() error 'curl' call had nonzero exit status
Hello,
I'm having issues using the downloadEncode() function. I'm trying to follow along with the tutorial and run the following code:
query_results <- queryEncode(assay = "switchgear", target ="elavl1",
file_format = "bed" , fixed = FALSE)
downloadEncode(query_results)
I get the following error:
Error in download.file(url = paste0(encode_root, href), quiet = TRUE, :
'curl' call had nonzero exit status
I really appreciate any help!
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] curl_3.3 ENCODExplorer_2.10.0 shinythemes_1.1.2 DT_0.6 shiny_1.3.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 compiler_3.6.0 pillar_1.4.1
[4] later_0.8.0 BiocManager_1.30.4 dbplyr_1.4.0
[7] AnnotationHub_2.16.0 tools_3.6.0 digest_0.6.19
[10] bit_1.1-14 jsonlite_1.6 RSQLite_2.1.1
[13] memoise_1.1.0 BiocFileCache_1.8.0 tibble_2.1.2
[16] pkgconfig_2.0.2 rlang_0.3.4 DBI_1.0.0
[19] yaml_2.2.0 parallel_3.6.0 stringr_1.4.0
[22] httr_1.4.0 dplyr_0.8.1 IRanges_2.18.1
[25] rappdirs_0.3.1 htmlwidgets_1.3 S4Vectors_0.22.0
[28] stats4_3.6.0 bit64_0.9-7 tidyselect_0.2.5
[31] data.table_1.12.2 Biobase_2.44.0 glue_1.3.1
[34] R6_2.4.0 AnnotationDbi_1.46.0 tidyr_0.8.3
[37] purrr_0.3.2 blob_1.1.1 magrittr_1.5
[40] promises_1.0.1 htmltools_0.3.6 BiocGenerics_0.30.0
[43] assertthat_0.2.1 interactiveDisplayBase_1.22.0 mime_0.6
[46] xtable_1.8-4 httpuv_1.5.1 stringi_1.4.3
[49] crayon_1.3.4
error in metadata table
We'll show a confusing situation with organism and assembly for elements of the
full 2019-10-13 metadata build.
ah = AnnotationHub()
query(ah, "ENCODExplorerData")
## AnnotationHub with 4 records
## # snapshotDate(): 2020-02-28
## # $dataprovider: ENCODE Project
## # $species: NA
## # $rdataclass: data.table
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["AH69290"]]'
##
## title
## AH69290 | ENCODE File Metadata (Light, 2019-04-12 build)
## AH69291 | ENCODE File Metadata (Full, 2019-04-12 build)
## AH75131 | ENCODE File Metadata (Light, 2019-10-13 build)
## AH75132 | ENCODE File Metadata (Full, 2019-10-13 build)
fm = ah[["AH75132"]]
This is a data.table. We tabulate the assemblies in use for experiments of organism Homo sapiens.
> table(fm$assembly[fm$organism=="Homo sapiens"], fm$organism[fm$organism=="Homo sapiens"])
Homo sapiens
GRCh38 129473
GRCh38-minimal 4
hg19 162657
mm10 133
mm10-minimal 270
mm9 2
Probably the organism is simply mislabeled, and the assembly annotation is more reliable. But is this an error upstream in ENCODE metadata or is it a curation problem in this package? Thank you.
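To quantify the mismatch, rows whose assembly implies a different organism can be flagged. The sketch below assumes only the `organism` and `assembly` columns used above; the lookup table and the helper `flag_mismatch()` are illustrative and cover just the assemblies listed in the tabulation.

```r
# Organism implied by each assembly name seen in the tabulation above.
assembly_organism <- c(
  "GRCh38" = "Homo sapiens",
  "GRCh38-minimal" = "Homo sapiens",
  "hg19" = "Homo sapiens",
  "mm10" = "Mus musculus",
  "mm10-minimal" = "Mus musculus",
  "mm9" = "Mus musculus"
)

# Hypothetical helper: TRUE where the assembly implies another organism.
# Rows with an unknown or missing assembly are not flagged.
flag_mismatch <- function(fm) {
  expected <- unname(assembly_organism[as.character(fm$assembly)])
  !is.na(expected) & !is.na(fm$organism) & expected != fm$organism
}

# Toy rows standing in for the metadata table.
toy <- data.frame(
  organism = c("Homo sapiens", "Homo sapiens", "Mus musculus", "Homo sapiens"),
  assembly = c("GRCh38", "mm10", "mm10", NA),
  stringsAsFactors = FALSE
)
flag_mismatch(toy)  # FALSE TRUE FALSE FALSE
```

Inspecting the flagged rows against the portal would help tell an upstream labelling error from a problem introduced while building the table.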
missing function queryConsensusPeaks in Bioc v.2.10.0
Hi,
I am trying to follow the Bioconductor vignette for ENCODExplorer, and when I try to obtain consensus peaks from ChIP-Seq (section 5.1) I get the following error:
res = queryConsensusPeaks("22Rv1", "GRCh38", "CTCF")
Error in queryConsensusPeaks("22Rv1", "GRCh38", "CTCF"):
could not find function "queryConsensusPeaks"
It looks like the function is not part of the package anymore? I also looked into package ENCODExplorerData, but to no avail... Could you point me to the function? thanks!
session info:
Session info --------------------------------------------------
setting value
version R version 3.6.2 (2019-12-12)
os Windows 10 x64
system x86_64, mingw32
ui RStudio
language (EN)
collate English_United States.1252
ctype English_United States.1252
tz America/New_York
date 2020-02-16
Packages ------------------------------------------------------
package * version date lib source
AnnotationDbi * 1.46.1 2019-08-20 [1] Bioconductor
AnnotationFilter * 1.8.0 2019-05-02 [1] Bioconductor
AnnotationHub * 2.16.1 2019-09-04 [1] Bioconductor
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
Biobase * 2.44.0 2019-05-02 [1] Bioconductor
BiocFileCache * 1.8.0 2019-05-02 [1] Bioconductor
BiocGenerics * 0.30.0 2019-05-02 [1] Bioconductor
BiocManager 1.30.10 2019-11-16 [1] CRAN (R 3.6.1)
BiocParallel 1.18.1 2019-08-06 [1] Bioconductor
biomaRt 2.40.5 2019-10-01 [1] Bioconductor
Biostrings 2.52.0 2019-05-02 [1] Bioconductor
bit 1.1-15.1 2020-01-14 [1] CRAN (R 3.6.2)
bit64 0.9-7 2017-05-08 [1] CRAN (R 3.6.0)
bitops 1.0-6 2013-08-17 [1] CRAN (R 3.6.0)
blob 1.2.1 2020-01-20 [1] CRAN (R 3.6.2)
boot 1.3-24 2019-12-20 [2] CRAN (R 3.6.2)
broom 0.5.4 2020-01-27 [1] CRAN (R 3.6.2)
callr 3.4.2 2020-02-12 [1] CRAN (R 3.6.2)
caTools 1.18.0 2020-01-17 [1] CRAN (R 3.6.2)
cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
checkmate 2.0.0 2020-02-06 [1] CRAN (R 3.6.2)
cli 2.0.1 2020-01-08 [1] CRAN (R 3.6.2)
codetools 0.2-16 2018-12-24 [2] CRAN (R 3.6.2)
colorRamps 2.3 2012-10-29 [1] CRAN (R 3.6.0)
colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.1)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
curl 4.3 2019-12-02 [1] CRAN (R 3.6.1)
data.table 1.12.8 2019-12-09 [1] CRAN (R 3.6.2)
DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.2)
dbplyr * 1.4.2 2019-06-17 [1] CRAN (R 3.6.1)
DelayedArray 0.10.0 2019-05-02 [1] Bioconductor
desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
devtools 2.2.1 2019-09-24 [1] CRAN (R 3.6.1)
digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.1)
doParallel 1.0.15 2019-08-02 [1] CRAN (R 3.6.1)
dplyr * 0.8.4 2020-01-31 [1] CRAN (R 3.6.2)
DT * 0.12 2020-02-05 [1] CRAN (R 3.6.2)
edgeR * 3.26.8 2019-09-01 [1] Bioconductor
ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
ENCODExplorer * 2.10.0 2019-05-02 [1] Bioconductor
ENCODExplorerData * 0.99.1 2020-02-16 [1] Bioconductor
ensembldb * 2.8.1 2019-10-11 [1] Bioconductor
fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.2)
fastmap 1.0.1 2019-10-08 [1] CRAN (R 3.6.1)
forcats * 0.4.0 2019-02-17 [1] CRAN (R 3.6.1)
foreach * 1.4.8 2020-02-09 [1] CRAN (R 3.6.2)
fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
gdata 2.18.0 2017-06-06 [1] CRAN (R 3.6.0)
generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.1)
GenomeInfoDb * 1.20.0 2019-05-02 [1] Bioconductor
GenomeInfoDbData 1.2.1 2019-11-30 [1] Bioconductor
GenomicAlignments 1.20.1 2019-06-18 [1] Bioconductor
GenomicFeatures * 1.36.4 2019-07-11 [1] Bioconductor
GenomicRanges * 1.36.1 2019-09-06 [1] Bioconductor
ggplot2 * 3.2.1 2019-08-10 [1] CRAN (R 3.6.1)
glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
gplots 3.0.1.2 2020-01-11 [1] CRAN (R 3.6.2)
graph 1.62.0 2019-05-02 [1] Bioconductor
graphite * 1.30.0 2019-05-02 [1] Bioconductor
gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
gtools 3.8.1 2018-06-26 [1] CRAN (R 3.6.0)
haven 2.2.0 2019-11-08 [1] CRAN (R 3.6.1)
hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.2)
htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.1)
htmlwidgets 1.5.1 2019-10-08 [1] CRAN (R 3.6.1)
httpuv 1.5.2 2019-09-11 [1] CRAN (R 3.6.1)
httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.1)
interactiveDisplayBase 1.22.0 2019-05-02 [1] Bioconductor
IRanges * 2.18.3 2019-09-24 [1] Bioconductor
iterators 1.0.12 2019-07-26 [1] CRAN (R 3.6.1)
jsonlite 1.6.1 2020-02-02 [1] CRAN (R 3.6.2)
KernSmooth 2.23-16 2019-10-15 [2] CRAN (R 3.6.2)
later 1.0.0 2019-10-04 [1] CRAN (R 3.6.1)
lattice 0.20-38 2018-11-04 [2] CRAN (R 3.6.2)
lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.1)
lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.1)
limma * 3.40.6 2019-07-26 [1] Bioconductor
lme4 1.1-21 2019-03-05 [1] CRAN (R 3.6.1)
locfit 1.5-9.1 2013-04-20 [1] CRAN (R 3.6.1)
lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.6.1)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
MASS 7.3-51.5 2019-12-20 [2] CRAN (R 3.6.2)
Matrix 1.2-18 2019-11-27 [1] CRAN (R 3.6.1)
matrixStats 0.55.0 2019-09-07 [1] CRAN (R 3.6.1)
memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
mime 0.9 2020-02-04 [1] CRAN (R 3.6.2)
minqa 1.2.4 2014-10-09 [1] CRAN (R 3.6.1)
modelr 0.1.5 2019-08-08 [1] CRAN (R 3.6.1)
munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
nlme 3.1-144 2020-02-06 [1] CRAN (R 3.6.2)
nloptr 1.2.1 2018-10-03 [1] CRAN (R 3.6.1)
packrat 0.5.0 2018-11-14 [1] CRAN (R 3.6.1)
pbkrtest 0.4-7 2017-03-15 [1] CRAN (R 3.6.1)
pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.2)
pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.6.1)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
plyr 1.8.5 2019-12-10 [1] CRAN (R 3.6.2)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.2)
processx 3.4.2 2020-02-09 [1] CRAN (R 3.6.2)
progress 1.2.2 2019-05-16 [1] CRAN (R 3.6.1)
promises 1.1.0 2019-10-04 [1] CRAN (R 3.6.1)
ProtGenerics 1.16.0 2019-05-02 [1] Bioconductor
ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.2)
purrr * 0.3.3 2019-10-18 [1] CRAN (R 3.6.1)
R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.1)
rappdirs 0.3.1 2016-03-28 [1] CRAN (R 3.6.1)
Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.1)
RCurl 1.98-1.1 2020-01-19 [1] CRAN (R 3.6.2)
readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.1)
readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
remotes 2.1.1 2020-02-15 [1] CRAN (R 3.6.2)
reprex 0.3.0 2019-05-16 [1] CRAN (R 3.6.1)
reshape2 1.4.3 2017-12-11 [1] CRAN (R 3.6.1)
rlang 0.4.4 2020-01-28 [1] CRAN (R 3.6.2)
rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
Rsamtools 2.0.3 2019-10-10 [1] Bioconductor
RSQLite 2.2.0 2020-01-07 [1] CRAN (R 3.6.2)
rstudioapi 0.11 2020-02-07 [1] CRAN (R 3.6.2)
rtracklayer 1.44.4 2019-09-06 [1] Bioconductor
rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.1)
S4Vectors * 0.22.1 2019-09-09 [1] Bioconductor
scales * 1.1.0 2019-11-18 [1] CRAN (R 3.6.1)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
shiny * 1.4.0 2019-10-10 [1] CRAN (R 3.6.1)
shinythemes * 1.1.2 2018-11-06 [1] CRAN (R 3.6.1)
stringi 1.4.4 2020-01-09 [1] CRAN (R 3.6.2)
stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
SummarizedExperiment 1.14.1 2019-07-31 [1] Bioconductor
testthat 2.3.1 2019-12-01 [1] CRAN (R 3.6.1)
tibble * 2.1.3 2019-06-06 [1] CRAN (R 3.6.1)
tidyr * 1.0.2 2020-01-24 [1] CRAN (R 3.6.2)
tidyselect 1.0.0 2020-01-27 [1] CRAN (R 3.6.2)
tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.1)
usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1)
variancePartition * 1.14.1 2019-10-01 [1] Bioconductor
vctrs 0.2.2 2020-01-24 [1] CRAN (R 3.6.2)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
XML 3.99-0.3 2020-01-20 [1] CRAN (R 3.6.2)
xml2 1.2.2 2019-08-09 [1] CRAN (R 3.6.1)
xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.2)
XVector 0.24.0 2019-05-02 [1] Bioconductor
yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.2)
zlibbioc 1.30.0 2019-05-02 [1] Bioconductor
downloadEncode(force=FALSE) has non-intuitive behaviour.
Calling downloadEncode with force=FALSE will not check that an existing file has a matching md5sum, and will still report success.
Example:
q_results = queryEncodeGeneric(biosample_name="A549",
file_type="bed narrowPeak",
target="BHLHE40")
d_results = downloadEncode(q_results)
d_result_files = gsub("Success downloading file : ", "", d_results)
checkTrue(all(file.exists(d_result_files)))
# Download again with force=FALSE, should fail.
file.remove(d_result_files)
file.create(d_result_files)
d_results = downloadEncode(q_results, force=FALSE)
will yield:
[1] "Success downloading file : ./ENCFF001VDM.bed.gz"
[1] "Success downloading file : ./ENCFF002COC.bed.gz"
[1] "Files can be found at C:/Dev/Projects/ENCODExplorer"
whereas the files have clearly not been downloaded, and are in fact corrupt.
The behaviour of force might not need to change, but its reporting should at least clearly spell out what happened ("Files were not downloaded because they already exist"). Checking whether the existing file's md5 matches the expected one would be a nice bonus.
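The reporting suggested here could distinguish three cases for a target path. The sketch below uses a hypothetical helper `existing_file_status()` and temporary files; it is not the logic in downloadEncode.

```r
# Hypothetical helper: describe what is already on disk for a target path.
existing_file_status <- function(path, expected_md5) {
  if (!file.exists(path)) {
    return("missing")
  }
  local_md5 <- unname(tools::md5sum(path))
  if (isTRUE(local_md5 == expected_md5)) {
    "present, md5 matches"
  } else {
    "present, md5 MISMATCH"
  }
}

# A file with known content, and an empty file such as file.create() leaves.
good_file <- tempfile()
writeLines("peak data", good_file)
expected <- unname(tools::md5sum(good_file))
empty_file <- tempfile()
file.create(empty_file)

existing_file_status(good_file, expected)   # "present, md5 matches"
existing_file_status(empty_file, expected)  # "present, md5 MISMATCH"
existing_file_status(tempfile(), expected)  # "missing"
```

A caller could then print a distinct message per status instead of "Success downloading file" in every case.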
ENCODE portal now has datasets from Roadmap
We need to find how we could parse the metadata from the Roadmap datasets. In the metadata.tsv file from the ENCODE project, there is a project column.
Check if new datasets are available on ENCODE portal
By default, we should check if the stored encode_df is up-to-date with the ENCODE portal for the current request. If not, we should output a Warning to the users.
package failing in Bioc 3.14
txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
isActiveSeq(txdb) <- c(rep(FALSE,20), TRUE, rep(FALSE, 14))
allpromoter <- getPromoterClass(txdb, Nproc=1, org=Mmusculus)
Error in clusterApplyLB(cl, chrs, pcChr, tssGR = tssGR, ORG = org) :
could not find function "clusterApplyLB"
Calls: getPromoterClass -> getPromoterClass
Execution halted
See the build report: the package seems to be deprecated, so please change the README.md if you are not going to maintain the package. @CharlesJB @lshep
Error in Bioconductor build
Relevant snippet from the build:
Running ‘runTests.R’ [4s/5s]
ERROR
Running the tests in ‘tests/runTests.R’ failed.
Last 13 lines of output:
ENCODExplorer RUnit Tests - 18 test functions, 0 errors, 1 failure
FAILURE in test.md5sum: Error in checkEquals(as.character(tools::md5sum(system.file("extdata/ENCFF001VCK.broadPeak.gz", :
1 string mismatch
Test files with failing tests
test_download.R
test.md5sum
Error in BiocGenerics:::testPackage("ENCODExplorer") :
unit tests failed for package ENCODExplorer
Execution halted
downloadEncode error
I was testing the ENCODExplorer package, but the function downloadEncode gives an error:
downloadEncode(query_results, df=encode_df, format = "bed")
Error in download.file(url = paste0(encode_root, href), quiet = TRUE, :
'curl' call had nonzero exit status
In addition: Warning message:
running command 'curl -L -s -S "https://www.encodeproject.org/files/ENCFF001VCK/@@download/ENCFF001VCK.bed.gz" -o "./ENCFF001VCK.bed.gz"' had status 127
prepare_ENCODEdb breaks on empty tables
Currently, the list of ENCODE table types (as returned by get_encode_types(), which sources its list directly from the ENCODE GitHub repository) includes access_key, but querying it returns an empty set (no columns, no rows). This causes prepare_ENCODEdb to crash.
Checking for empty tables seems to have fixed it, but I need to test it more thoroughly before I push it to production.
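The empty-table check could be as small as the sketch below. The list of tables is toy data, and `is_empty_table()` is a hypothetical helper, not the code that was tested for the package.

```r
# Hypothetical check: treat NULL, zero-column and zero-row tables as empty.
is_empty_table <- function(x) {
  length(x) == 0 || NROW(x) == 0
}

# Toy stand-in for the downloaded tables; access_key came back with no
# columns and no rows.
tables <- list(
  experiment = data.frame(accession = "EXAMPLE1", stringsAsFactors = FALSE),
  access_key = data.frame()
)

non_empty <- Filter(Negate(is_empty_table), tables)
names(non_empty)  # "experiment"
```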
shinyEncode sources a file that does not exist
In shinyEncode(), the line
source(file = system.file("inst/shiny/ui.R", package = "ENCODExplorer"))
should be
source(file = system.file("shiny/ui.R", package = "ENCODExplorer"))
since everything in the inst directory gets placed in the package root upon installation.
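A wrong relative path does not raise an error in `system.file()`; it returns an empty string, which `source()` then fails on. The sketch below shows this with a base package so that it runs anywhere; `locate_in_package()` is a hypothetical helper that fails early with a clearer message.

```r
# Hypothetical helper: resolve a file inside an installed package and stop
# with an explicit message when it cannot be found.
locate_in_package <- function(rel_path, package) {
  path <- system.file(rel_path, package = package)
  if (!nzchar(path)) {
    stop("'", rel_path, "' not found in installed package '", package, "'")
  }
  path
}

# An installed package has no inst/ directory, so an "inst/..." path
# resolves to the empty string.
system.file("inst/DESCRIPTION", package = "stats")  # ""

# A path relative to the installed package root resolves normally.
desc_path <- locate_in_package("DESCRIPTION", package = "stats")
file.exists(desc_path)  # TRUE
```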
downloadEncode fails if the download directory is not the current directory.
Presently, downloadEncode uses the following line to fetch the files:
download.file(url=paste0(encode_root, href), quiet=TRUE,
destfile=fileName, method = "curl",
extra = "-O -L" )
However, the documentation for curl's -O flag reads as follows:
-O, --remote-name
Write output to a local file named like the remote file we get. (Only the file part of the remote file is used, the path is cut off.)
The file will be saved in the current working directory. If you want the file saved in a different directory, make sure you change the current working directory before invoking curl with this option.
So whenever a user passes in a different directory as a target, curl ignores it, and the md5sum check fails since the file is not where it is expected to be.