gagneurlab / fraser
This project forked from c-mertes/fraser
FRASER - Find RAre Splicing Events in RNA-seq
License: MIT License
Hi,
I tried to run fds <- calculatePSIValues(fds, types="jaccard", BPPARAM=bpparam())
but keep getting the error:
Error in FUN(X[[i]], ...) : Could not find read type: jaccard
Calls: calculatePSIValues -> unique -> vapply -> FUN
fds is created from:
library(FRASER)
annotation_dat <- data.table(sampleID="{sample_id}", bamFile="{sample_bam_path}", pairedEnd=TRUE)
settings <- FraserDataSet(colData=annotation_dat, workingDir=".")
fds <- countRNAData(settings)
I wonder if you could help look into this. Thanks!
The parallelization should be turned on by the following code:
if (.Platform$OS.type == "unix") {
    register(MulticoreParam(workers=min(10, multicoreWorkers())))
} else {
    register(SnowParam(workers=min(10, multicoreWorkers())))
}
The correct way to calculate z-scores would be to derive them from the CDF:
cdf = pbetabinom(a, b , q=counts, log.p=TRUE)
zscore = qnorm(cdf, log.p=TRUE)
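To make the idea above concrete, here is a minimal base-R sketch. The helper names log_pbetabinom and zscore_from_counts are hypothetical and are not part of FRASER; the beta-binomial CDF is summed directly from its probability mass function, on the assumption that a and b are the two shape parameters and size is the total count for the junction. It is only meant to illustrate the "CDF on the log scale, then qnorm" step, not how the package computes its z-scores.

```r
# Log-CDF of a beta-binomial in base R. Hypothetical helper for illustration;
# it is not FRASER's internal implementation.
# P(K = k) = choose(n, k) * B(k + a, n - k + b) / B(a, b)
log_pbetabinom <- function(q, size, a, b) {
    n <- length(q)
    size <- rep_len(size, n); a <- rep_len(a, n); b <- rep_len(b, n)
    vapply(seq_len(n), function(i) {
        k <- 0:q[i]
        log_pmf <- lchoose(size[i], k) +
            lbeta(k + a[i], size[i] - k + b[i]) - lbeta(a[i], b[i])
        m <- max(log_pmf)
        # log-sum-exp, capped at 0 so rounding never yields a CDF above 1
        min(0, m + log(sum(exp(log_pmf - m))))
    }, numeric(1))
}

# Map the log-CDF to a standard-normal quantile.
zscore_from_counts <- function(counts, size, a, b) {
    qnorm(log_pbetabinom(counts, size, a, b), log.p = TRUE)
}

zscore_from_counts(c(0, 4), size = 9, a = 1, b = 1)
```

With a = b = 1 the beta-binomial is uniform on 0..9, so the two CDF values are 0.1 and 0.5 and the z-scores are roughly -1.28 and 0. Because the distribution is discrete, a count equal to size has a CDF of 1 and maps to Inf.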
Hello,
First of all, is this code still maintained? I received absolutely no news despite writing directly to the author.
My question is, is it possible to adapt the code for mouse data?
I tried to use
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(org.Mm.eg.db)
but I get errors about chromosome length. I suspect it is hardcoded to use human; is that the case?
This makes most of the plotting functions unusable, as they seem to depend on this annotation.
Hello, I'm trying to install FRASER 2.0, but when I install it with Bioconductor, I get version 1.8.1, which is the first FRASER. When trying to use devtools, I get the following error:
Using GitHub PAT from the git credential store.
Error: Failed to install 'unknown package' from GitHub:
HTTP error 401.
Bad credentials
Rate limit remaining: 53/60
Rate limit reset at: 2024-04-26 14:52:19 UTC
I have already checked my GitHub PAT with Sys.getenv("GITHUB_PAT") and it seems fine (I get [1] "" as the answer). Could I have some help installing FRASER 2.0?
Thank you in advance.
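A note on the check described above: in base R, Sys.getenv() returns an empty string for a variable that is not set, so [1] "" indicates that GITHUB_PAT is absent from the environment rather than that it holds a valid token. The small sketch below makes that distinction explicit. The helper name pat_status is hypothetical, and it does not look at the git credential store, which is where the log says the PAT was taken from.

```r
# Hypothetical helper: report whether a token value is present.
# An unset environment variable comes back from Sys.getenv() as "".
pat_status <- function(value = Sys.getenv("GITHUB_PAT")) {
    if (nzchar(value)) "set in environment" else "not set in environment"
}

pat_status()
```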
I've tried repeatedly to build the vignette for this package with little success. These are the steps I took along with the output. Any idea why this might be happening?
Alternatively, would it be possible to also include an already built PDF copy of the vignette? The bits I am particularly interested in are the syntax nuances of using FRASER to get split and non-split counts.
I'm doing this on macOS Sierra.
Steps I took:
> devtools::build(pkg='.', vignettes=TRUE)
✔ checking for file ‘/Users/himanshujoshi/PycharmProjects/FRASER/DESCRIPTION’ ...
─ preparing ‘FRASER’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ installing the package to build vignettes
E creating vignettes (2m 19.9s)
--- re-building ‘FraseR.Rnw’ using knitr
Fri Jan 17 15:27:01 2020: Calculate the PSI 5 and 3 values ...
Fri Jan 17 15:27:03 2020: Calculate the PSI site values ...
Fri Jan 17 15:27:04 2020: Calculate the delta for psi5 values ...
Fri Jan 17 15:27:04 2020: Calculate the delta for psi3 values ...
Fri Jan 17 15:27:04 2020: Calculate the delta for psiSite values ...
Fri Jan 17 15:27:05 2020: Writing final FraseR object ('/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Example_Dataset/fds-object.RDS').
Fri Jan 17 15:27:05 2020: Fit step for: 'psi5'.
Fri Jan 17 15:27:06 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:06 2020: Computing PCA ...
Fri Jan 17 15:27:06 2020: Fitting rho ...
Fri Jan 17 15:27:07 2020: Compute p values for: 'psi5'.
Fri Jan 17 15:27:08 2020: Adjust p values for: 'psi5'.
Fri Jan 17 15:27:08 2020: Compute Z scores for: 'psi5'.
Fri Jan 17 15:27:08 2020: Fit step for: 'psi3'.
Fri Jan 17 15:27:09 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:09 2020: Computing PCA ...
Fri Jan 17 15:27:09 2020: Fitting rho ...
Fri Jan 17 15:27:10 2020: Compute p values for: 'psi3'.
Fri Jan 17 15:27:11 2020: Adjust p values for: 'psi3'.
Fri Jan 17 15:27:11 2020: Compute Z scores for: 'psi3'.
Fri Jan 17 15:27:12 2020: Fit step for: 'psiSite'.
Fri Jan 17 15:27:12 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:12 2020: Computing PCA ...
Fri Jan 17 15:27:12 2020: Fitting rho ...
Fri Jan 17 15:27:13 2020: Compute p values for: 'psiSite'.
Fri Jan 17 15:27:14 2020: Adjust p values for: 'psiSite'.
Fri Jan 17 15:27:15 2020: Compute Z scores for: 'psiSite'.
Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: org.Hs.eg.db
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
Fri Jan 17 15:27:18 2020: Collecting results for: psi5
Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psi5
Fri Jan 17 15:27:18 2020: Collecting results for: psi3
Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psi3
Fri Jan 17 15:27:18 2020: Collecting results for: psiSite
Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psiSite
Warning in has_utility("convert", "ImageMagick") :
ImageMagick not installed or not in PATH
Fri Jan 17 15:27:22 2020: Start counting the split reads ...
Fri Jan 17 15:27:22 2020: Count split reads for sample: sample1
Warning in dir.create(cachedir, recursive = TRUE) :
'/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/splitCounts' already exists
Fri Jan 17 15:27:22 2020: Count split reads for sample: sample2
Fri Jan 17 15:27:22 2020: Count split reads for sample: sample3
Warning in dir.create(cachedir, recursive = TRUE) :
'/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/splitCounts' already exists
Fri Jan 17 15:27:33 2020 : count ranges need to be merged ...
Fri Jan 17 15:27:35 2020: Create splice site indices ...
Fri Jan 17 15:27:35 2020: Writing counts to file: /var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/splitCounts.tsv.gz
Fri Jan 17 15:27:35 2020: Create splice site indices ...
Fri Jan 17 15:27:35 2020: Start counting the non spliced reads ...
Fri Jan 17 15:27:35 2020: In total 60 splice junctions are found.
Fri Jan 17 15:27:35 2020: In total 77 splice sites (acceptor/donor) will be counted ...
Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample1
Warning in dir.create(cachedir, recursive = TRUE) :
'/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/nonSplicedCounts/Data_Analysis' already exists
[SUBREAD ASCII-art logo]
Rsubread 2.0.0
//========================== featureCounts setting ===========================\\
|| ||
|| Input files : 1 BAM file ||
|| o sample1.bam ||
|| ||
|| Annotation : R data.frame ||
|| Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
|| Threads : 3 ||
|| Level : meta-feature level ||
|| Paired-end : no ||
|| Multimapping reads : counted ||
|| Multi-overlapping reads : counted ||
|| Min overlapping bases : 10 ||
|| Long read mode : yes ||
|| ||
\\============================================================================//
//================================= Running ==================================\\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid92802 ... ||
|| Features : 77 ||
|| Meta-features : 77 ||
|| Chromosomes/contigs : 2 ||
|| ||
|| Process BAM file sample1.bam... ||
|| WARNING: Paired-end reads were found. ||
|| Total alignments : 802 ||
|| Successfully assigned alignments : 188 (23.4%) ||
|| Running time : 0.01 minutes ||
|| ||
|| Write the final count table. ||
|| Write the read assignment summary. ||
|| ||
\\============================================================================//
Saving splice site cache ...
Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample3
[SUBREAD ASCII-art logo]
Rsubread 2.0.0
//========================== featureCounts setting ===========================\\
|| ||
|| Input files : 1 BAM file ||
|| o sample3.bam ||
|| ||
|| Annotation : R data.frame ||
|| Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
|| Threads : 3 ||
|| Level : meta-feature level ||
|| Paired-end : no ||
|| Multimapping reads : counted ||
|| Multi-overlapping reads : counted ||
|| Min overlapping bases : 10 ||
|| Long read mode : yes ||
|| ||
\\============================================================================//
//================================= Running ==================================\\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid92804 ... ||
|| Features : 77 ||
|| Meta-features : 77 ||
|| Chromosomes/contigs : 2 ||
|| ||
|| Process BAM file sample3.bam... ||
|| WARNING: Paired-end reads were found. ||
|| Total alignments : 3498 ||
|| Successfully assigned alignments : 866 (24.8%) ||
|| Running time : 0.04 minutes ||
|| ||
|| Write the final count table. ||
|| Write the read assignment summary. ||
|| ||
\\============================================================================//
Saving splice site cache ...
Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample2
Warning in dir.create(cachedir, recursive = TRUE) :
'/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/nonSplicedCounts/Data_Analysis' already exists
[SUBREAD ASCII-art logo]
Rsubread 2.0.0
//========================== featureCounts setting ===========================\\
|| ||
|| Input files : 1 BAM file ||
|| o sample2.bam ||
|| ||
|| Annotation : R data.frame ||
|| Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
|| Threads : 3 ||
|| Level : meta-feature level ||
|| Paired-end : no ||
|| Multimapping reads : counted ||
|| Multi-overlapping reads : counted ||
|| Min overlapping bases : 10 ||
|| Long read mode : yes ||
|| ||
\\============================================================================//
//================================= Running ==================================\\
|| ||
|| Load annotation file .Rsubread_UserProvidedAnnotation_pid92803 ... ||
|| Features : 77 ||
|| Meta-features : 77 ||
|| Chromosomes/contigs : 2 ||
|| ||
|| Process BAM file sample2.bam... ||
|| WARNING: Paired-end reads were found. ||
|| Total alignments : 4397 ||
|| Successfully assigned alignments : 1011 (23.0%) ||
|| Running time : 0.05 minutes ||
|| ||
|| Write the final count table. ||
|| Write the read assignment summary. ||
|| ||
\\============================================================================//
Saving splice site cache ...
Fri Jan 17 15:27:38 2020 : Fast merging of counts ...
Fri Jan 17 15:27:39 2020: Writing counts to file: /var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/nonSplitCounts.tsv.gz
Fri Jan 17 15:27:41 2020: Calculate the PSI 5 and 3 values ...
Fri Jan 17 15:27:42 2020: Calculate the PSI site values ...
Fri Jan 17 15:27:42 2020: Calculate the delta for psi5 values ...
Fri Jan 17 15:27:43 2020: Calculate the delta for psi3 values ...
Fri Jan 17 15:27:43 2020: Calculate the delta for psiSite values ...
Fri Jan 17 15:27:44 2020: Writing final FraseR object ('/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/fds-object.RDS').
Warning: Transformation introduced infinite values in continuous y-axis
Warning: Removed 165 rows containing missing values (geom_bar).
Fri Jan 17 15:27:46 2020: Fit step for: 'psi5'.
Fri Jan 17 15:27:46 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:46 2020: Computing PCA ...
Fri Jan 17 15:27:46 2020: Fitting rho ...
Fri Jan 17 15:27:47 2020: Compute p values for: 'psi5'.
Fri Jan 17 15:27:48 2020: Adjust p values for: 'psi5'.
Fri Jan 17 15:27:48 2020: Compute Z scores for: 'psi5'.
Fri Jan 17 15:27:48 2020: Fit step for: 'psi3'.
Fri Jan 17 15:27:49 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:49 2020: Computing PCA ...
Fri Jan 17 15:27:49 2020: Fitting rho ...
Fri Jan 17 15:27:50 2020: Compute p values for: 'psi3'.
Fri Jan 17 15:27:51 2020: Adjust p values for: 'psi3'.
Fri Jan 17 15:27:51 2020: Compute Z scores for: 'psi3'.
Fri Jan 17 15:27:52 2020: Fit step for: 'psiSite'.
Fri Jan 17 15:27:52 2020: Running fit with correction method: PCA
Fri Jan 17 15:27:52 2020: Computing PCA ...
Fri Jan 17 15:27:52 2020: Fitting rho ...
Fri Jan 17 15:27:53 2020: Compute p values for: 'psiSite'.
Fri Jan 17 15:27:54 2020: Adjust p values for: 'psiSite'.
Fri Jan 17 15:27:55 2020: Compute Z scores for: 'psiSite'.
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
Fri Jan 17 15:27:58 2020: Collecting results for: psi5
Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psi5
Fri Jan 17 15:27:58 2020: Collecting results for: psiSite
Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psiSite
Fri Jan 17 15:27:58 2020: Collecting results for: psi3
Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psi3
Fri Jan 17 15:27:59 2020: Collecting results for: psi3
Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psi3
Fri Jan 17 15:27:59 2020: Collecting results for: psiSite
Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psiSite
Fri Jan 17 15:27:59 2020: Collecting results for: psi5
Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psi5
Fri Jan 17 15:28:03 2020: Running fit with correction method: PCA-BB-Decoder
125
67
dPsi filter:FALSE: 121 TRUE: 2
Exclusion matrix: TRUE: 123
Fri Jan 17 15:28:12 2020: Injecting outliers: 0 / 0 (primary/secondary)
Warning in optimHyperParams(fds, type = "psi5") :
No outliers could be injected so the hyperparameter optimization could not run. Possible reason: too few junctions in the data.
Warning in plotEncDimSearch(fds, type = "psi5") :
no hyperparameters were estimated for psi5
Please use `optimHyperParams` to compute them.
Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])), :
EXPR is a "factor", treated as integer.
Consider using 'switch(as.character( * ), ...)' instead.
Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])), :
EXPR is a "factor", treated as integer.
Consider using 'switch(as.character( * ), ...)' instead.
Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])), :
EXPR is a "factor", treated as integer.
Consider using 'switch(as.character( * ), ...)' instead.
Error: processing vignette 'FraseR.Rnw' failed with diagnostics:
Running 'texi2dvi' on 'FraseR.tex' failed.
--- failed re-building ‘FraseR.Rnw’
SUMMARY: processing the following file failed:
‘FraseR.Rnw’
Error: Vignette re-building failed.
Execution halted
Error in (function (command = NULL, args = character(), error_on_status = TRUE, :
System command error
Hello,
I installed the package on an Ubuntu 22 server.
When running
fds <- countRNAData(settings)
with my own data (ONT bam files), the first steps work ok, but then I get
Mon Aug 29 17:58:49 2022: In total 239257 splice sites (acceptor/donor) will be counted ...
Error in reducer$value.cache[[as.character(idx)]] <- values :
wrong args for environment subassignment
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
1 parallel job did not deliver a result
What could be the issue?
Thanks
When I use countNonSplicedReads, it shows the Subread logo. After waiting for one day, it is still running, but the CPU usage is very low. Because no error is shown, I cannot tell what happened. Could you please add more output for better diagnosis?
Hello,
I would like to see the raw data that was used to generate the p-values and z-scores in the results() table, for example fds_filtered@assays@data[["psi5"]], and look at the values by sample.
However, I don't seem to be able to get back to it using the concatenation of seqnames:ranges. Are these coordinates changed somehow when running res()?
Or is it again a problem of species?
Thanks in advance!
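For illustration only, the lookup described above can be sketched in base R as follows. Everything here is hypothetical: the junctions table and the psi5 matrix are made-up stand-ins, and make_key and lookup_values are not FRASER functions. The sketch assumes the rows of the value matrix are in the same order as the junction coordinates. One base-R detail worth knowing is that pasting a large numeric coordinate can produce scientific notation (100000 becomes "1e+05"), which would silently break a string match, so the key builder coerces coordinates to integer first.

```r
# Hypothetical junction coordinates standing in for the row ranges of a dataset.
junctions <- data.frame(
    seqnames = c("chr1", "chr1", "chr2"),
    start    = c(100L, 500L, 250L),
    end      = c(200L, 900L, 400L),
    stringsAsFactors = FALSE
)

# Hypothetical per-sample values (rows = junctions, columns = samples).
psi5 <- matrix(c(0.9, 0.1, 0.5, 0.8, 0.2, 0.4), nrow = 3,
               dimnames = list(NULL, c("sample1", "sample2")))

# Build "chr:start-end" keys; as.integer() avoids keys such as "chr1:1e+05-2e+05".
make_key <- function(seqnames, start, end) {
    paste0(seqnames, ":", as.integer(start), "-", as.integer(end))
}
rownames(psi5) <- make_key(junctions$seqnames, junctions$start, junctions$end)

# Return the per-sample values for the requested coordinates, or fail loudly.
lookup_values <- function(mat, seqnames, start, end) {
    key <- make_key(seqnames, start, end)
    idx <- match(key, rownames(mat))
    if (anyNA(idx)) {
        stop("No row found for: ", paste(key[is.na(idx)], collapse = ", "))
    }
    mat[idx, , drop = FALSE]
}

lookup_values(psi5, "chr1", 500, 900)
```

Failing with an explicit message when a key is missing makes it easier to tell a formatting mismatch apart from a junction that is genuinely absent.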