leekgroup / derfinder

R package for DER Finder, a method for differential expression analysis of RNA-seq data

Home Page: http://biostatistics.oxfordjournals.org/content/early/2014/01/06/biostatistics.kxt053.short?rss=1

License: MIT License

R 97.67% Python 2.33%

derfinder's Issues

derfinder 1.7.16

Hi all,
I'm writing to let you know that derfinder is not passing check on morelia, the Mac OS X machine.
Please see the report at:

http://bioconductor.org/checkResults/3.4/bioc-LATEST/derfinder/morelia-checksrc.html

Thanks,
Marcel

Make Docker container for reproducing this analysis

I'm working on making a Docker container with the correct dependency/R versions so that this script is reproducible without the user having to install archived packages. This should resolve all issues opened recently by @brianhigh. I will be sure to close this issue when the container is done, though please note that it might be a few weeks. Thanks!

Error in countReads.py

I have tried running it multiple times and this is the error I am receiving.

Traceback (most recent call last):
  File "countReads.py", line 144, in <module>
    countReadlets(options.file, options.output, options.kmer, options.chrom, stranded)
TypeError: countReadlets() takes exactly 4 arguments (5 given)

It seems that 'strand' is missing from the definition of countReadlets(). I also noticed that in the last line of the code every argument is passed as options.argument, while stranded is not:

countReadlets(options.file, options.output, options.kmer, options.chrom, stranded)
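The TypeError above is what Python raises when a call passes more positional arguments than the function definition accepts. Here is a minimal sketch of the mismatch and one possible fix. The functions count_readlets_old and count_readlets_fixed are hypothetical stand-ins, not the real countReadlets(), whose body is not shown in this issue, and the file names are made up for illustration.

```python
def count_readlets_old(infile, outfile, kmer, chrom):
    # Hypothetical stand-in for a definition that predates the strand option.
    return (infile, outfile, kmer, chrom)


def count_readlets_fixed(infile, outfile, kmer, chrom, stranded=False):
    # Adding a defaulted parameter keeps existing 4-argument calls working.
    return (infile, outfile, kmer, chrom, stranded)


def try_call(func, *args):
    """Return the function's result, or the TypeError message if the call fails."""
    try:
        return func(*args)
    except TypeError as err:
        return "TypeError: " + str(err)


# Made-up argument values mirroring the five-argument call in the traceback.
args = ("reads.sam", "counts.txt", 1, "chrY", True)
old_result = try_call(count_readlets_old, *args)    # fails: 5 given, 4 accepted
new_result = try_call(count_readlets_fixed, *args)  # succeeds
```

Giving stranded a default value is one design choice among several; it means callers that never pass a strand flag do not need to change.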

package RSQLite.extfuns was removed from CRAN

As with Issue #9, the RSQLite.extfuns package has also been removed from CRAN. An archived version is available and can be installed with:

library(devtools)
install_url("http://cran.r-project.org/src/contrib/Archive/RSQLite.extfuns/RSQLite.extfuns_0.0.1.tar.gz")

makeDb error when running analysis_code.R

I got this makeDb error when running analysis_code.R:

> makeDb(dbfile = dbfile, tablename = tablename, textfile = textfile, cutoff = 5)
Error in if (.allows_extensions(db)) { : 
  missing value where TRUE/FALSE needed
Error in !dbPreExists : invalid argument type

Perhaps an update to a package since your code was developed has created an incompatibility. Any ideas for a solution? Perhaps ...

library(checkpoint)
checkpoint("2014-10-08")

... as suggested on the sqldf homepage ... though that did not work for me.

exons length error from stopifnot() call in getFlags.R

When trying to reproduce results in the 2013 paper, from this code in analysis_code.R ...

# get the flags:
exons = getAnnotation("hg19","knownGene")
myflags = getFlags(regions = regions.merged.y, exons, "chrY", pctcut = 0.8)

... I am getting this fatal error ...

Error: length(unique(exons$chr)) == 1 is not TRUE

Some tests:

> length(unique(subset(x=exons, seqnames == "chrY", c(chr))))
 [1] 1
> length(unique(unlist(subset(x=exons, seqnames == "chrY", c(chr)))))
 [1] 101
> str(unique(unlist(subset(x=exons, seqnames == "chrY", c(chr)))))
 chr [1:101] "100101116" "100101120" "100132596" "100133941" "100289150" ...
> summary(subset(x=exons, seqnames == "chrY", c(chr)))
     chr           
 Length:1190       
 Class :character  
 Mode  :character  
> head(subset(x=exons, seqnames == "chrY"))
    gene       chr seqnames   start     end width strand
392   85 100101116     chrY 6258442 6258716   275      +
393   85 100101116     chrY 6262141 6262300   160      +
394   85 100101116     chrY 6269164 6269272   109      +
395   85 100101116     chrY 6271629 6271766   138      +
396   85 100101116     chrY 6279348 6279605   258      +
397   85 100101116     chrY 9590765 9591022   258      -
> tail(subset(x=exons, seqnames == "chrY"))
        gene  chr seqnames    start      end width strand
258853 22527 9189     chrY  2354455  2358810  4356      -
258854 22527 9189     chrY  2354455  2358813  4359      -
258855 22527 9189     chrY  2368352  2368580   229      -
258856 22527 9189     chrY  2368858  2369015   158      -
263606 22923 9426     chrY 20137667 20139627  1961      +
263607 22923 9426     chrY 19990140 19992100  1961      -

See: getFlags.R ...

stopifnot(length(unique(exons$chr)) == 1,
          length(unique(regions$chr)) == 1,
          unique(exons$chr) == unique(regions$chr))

This is my sessionInfo():

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C            LC_COLLATE=C        
 [5] LC_MONETARY=C        LC_MESSAGES=C        LC_PAPER=C           LC_NAME=C           
 [9] LC_ADDRESS=C         LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
 [1] splines   grid      parallel  stats4    stats     graphics  grDevices utils     datasets 
[10] methods   base     

other attached packages:
 [1] rtracklayer_1.26.2     GenomicFeatures_1.18.3 AnnotationDbi_1.28.1  
 [4] derfinder_1.0.2        locfdr_1.1-7           devtools_1.7.0        
 [7] HiddenMarkov_1.8-1     Genominator_1.20.0     GenomeGraphs_1.26.0   
[10] DESeq_1.18.0           lattice_0.20-29        locfit_1.5-9.1        
[13] Biobase_2.26.0         edgeR_3.8.5            limma_3.22.4          
[16] biomaRt_2.22.0         Rsamtools_1.18.2       Biostrings_2.34.1     
[19] XVector_0.6.0          GenomicRanges_1.18.4   GenomeInfoDb_1.2.4    
[22] IRanges_2.0.1          S4Vectors_0.4.0        BiocGenerics_0.12.1   
[25] BiocInstaller_1.16.1   sqldf_0.4-10           gsubfn_0.6-6          
[28] proto_0.3-10           RSQLite_1.0.0          DBI_0.3.1             

loaded via a namespace (and not attached):
 [1] BBmisc_1.9              BatchJobs_1.5           BiocParallel_1.0.3     
 [4] GenomicAlignments_1.2.1 RColorBrewer_1.1-2      RCurl_1.95-4.5         
 [7] XML_3.98-1.1            annotate_1.44.0         base64enc_0.1-2        
[10] bitops_1.0-6            brew_1.0-6              checkmate_1.5.1        
[13] chron_2.3-45            codetools_0.2-10        digest_0.6.8           
[16] fail_1.2                foreach_1.4.2           genefilter_1.48.1      
[19] geneplotter_1.44.0      httr_0.6.1              iterators_1.0.7        
[22] sendmailR_1.2-1         stringr_0.6.2           survival_2.37-7        
[25] tcltk_3.1.2             tools_3.1.2             xtable_1.7-4           
[28] zlibbioc_1.12.0

Where is tophatY-updated?

Hi! I am trying to run your R script analysis_code.R to reproduce your research paper results and when the command...

makeDb(dbfile = dbfile, tablename = tablename, textfile = textfile, cutoff = 5)

...is executed I get this error:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'tophatY-updated': No such file or directory

I do not see this file tophatY-updated in your repository. Where can I get this file? Can you add it to your repository? Thanks!

fix dependencies

The packages "BSgenome.Hsapiens.UCSC.hg19" and "BSgenome.Mmusculus.UCSC.mm10" are REALLY SUPER SUPER slow to install and aren't always needed. Is there a way to load only the main packages when installing, and load those bulky ones only if the user needs them (i.e. doesn't already have an annotation)?

Also, fix README to include all dependencies (not just the "Depends") and while you're at it, fix Depends vs. Imports.

*cuff-UPDATED.rda files missing

When trying to run analysis_code.R, the "chunk 3a: load results from Cufflinks" section gives missing-file errors because the *cuff-UPDATED.rda files are missing:

https://github.com/alyssafrazee/derfinder/search?utf8=%E2%9C%93&q=cuff-updated

I do not see where in this script the files are saved, nor do I see another script that saves these. Please help me understand where and how these files should have been saved. Thanks!

package locfdr was removed from CRAN

In your README.md, you mention locfdr in the installation section. Since locfdr has been removed from CRAN, you might want to use an alternative that is still supported. Some suggestions may be found in this stackoverflow thread, where the packages twilight and fdrtool were suggested and some examples were given. Otherwise, an archived version of locfdr must be used, as in...

library(devtools)
install_url("http://cran.r-project.org/src/contrib/Archive/locfdr/locfdr_1.1-7.tar.gz")

request to modify countReads.py to include strand specific information

Hi Alyssa,

Can you please modify countReads.py so that it can use strand-specific RNA-seq data?

Thanks so much,

Samir
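As a rough illustration of what strand-aware counting could look like, here is a minimal sketch. It assumes the input is SAM-formatted alignment lines, which is an assumption because the actual input handling of countReads.py is not shown in this issue. It relies only on the SAM convention that FLAG bit 0x10 marks a read aligned to the reverse strand. The function name count_by_strand and the example records are hypothetical.

```python
from collections import Counter

REVERSE_FLAG = 0x10  # SAM FLAG bit: read aligned to the reverse strand


def count_by_strand(sam_lines, chrom):
    """Count alignment start positions on `chrom`, keyed by (position, strand)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        if rname != chrom:
            continue
        strand = "-" if flag & REVERSE_FLAG else "+"
        counts[(pos, strand)] += 1
    return counts


# Made-up records: one forward and one reverse read on chrY, one read on chr1.
example = [
    "@HD\tVN:1.0",
    "r1\t0\tchrY\t100\t50\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchrY\t100\t50\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII",
]
strand_counts = count_by_strand(example, "chrY")
```

Depending on the library protocol, the aligned strand of a read may be the opposite of the transcript strand, so a real implementation would likely need an option to flip the assignment.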

Regions are not detected as expressed

I encountered a strange result while analysing RNA-seq data with derfinder. My goal is to find expressed regions in a number of conditions, which derfinder broadly does. The issue is that the reported regions seem to be smaller than what would be expected by visual inspection:

As you can see, there is quite a large expressed region of which only a fraction is detected. This doesn't seem to be an issue of low coverage because, to the left, there is an area with similarly high coverage which is not reported as expressed. This is one example, but I found several of these in my data.

The question now is whether this is a setting that needs to be changed, or whether this is to be expected.

The analysis was run with the following settings:

fullCov <- fullCoverage(
    files = files,
    chrs = chrom,
    cutoff = 0,
    L = read_length,
    verbose = TRUE,
    totalMapped = total_mapped,
    filter = "one",
    mc.cores = nproc
)

regionMat <- regionMatrix(
    fullCov,
    cutoff = min_cov,
    L = read_length,
    maxClusterGap = 3000L,
    returnBP = TRUE,
    verbose = TRUE,
    filter = "one",
    targetSize = targetSize,
    mc.cores = nproc
)

Here, files are BAM files, the read length is 75, and the cutoff is 5. Session info can be found here.

makeTranscriptDbFromUCSC() returns "'data' must be of a vector type" error

It looks like changes in UCSC have introduced a problem with running this beta derfinder with R 2.15.3.

In the sample code analysis_code.R:

exons = getAnnotation("hg19","knownGene")
Warning message:
In .local(.Object, ...) : NAs introduced by coercion
Error in genome(ucscCart(x)) : 
  error in evaluating the argument 'x' in selecting a method for function 'genome': Error in matrix(unlist(pairs), nrow = 2) : 
  'data' must be of a vector type

Error in getAnnotation("hg19", "knownGene") : 
  Problem accessing requested UCSC annotation - likely there is a problem with genome or tablename arguments. Use ucscGenomes() to see acceptable genomes; use supportedTables(genome) to see acceptable tablenames for your genome.

A test:

a <- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")
Warning message:
In .local(.Object, ...) : NAs introduced by coercion
Error in genome(ucscCart(x)) : 
  error in evaluating the argument 'x' in selecting a method for function 'genome': Error in matrix(unlist(pairs), nrow = 2) : 
  'data' must be of a vector type
