gagneurlab / fraser

This project forked from c-mertes/fraser

FRASER - Find RAre Splicing Events in RNA-seq

License: MIT License

R 93.66% C++ 5.71% TeX 0.63%

rna-seq outlier-detection aberrant-splicing splicing rare-disease diagnostics r

fraser's Issues

Could not find read type: jaccard

Hi,

I tried to run

    fds <- calculatePSIValues(fds, types="jaccard", BPPARAM=bpparam())

but keep getting the error:

    Error in FUN(X[[i]], ...) : Could not find read type: jaccard
    Calls: calculatePSIValues -> unique -> vapply -> FUN

fds is created from:

    library(FRASER)
    
    annotation_dat <- data.table(sampleID="{sample_id}", bamFile="{sample_bam_path}", pairedEnd=TRUE)
    settings <- FraserDataSet(colData=annotation_dat, workingDir=".")
    
    fds <- countRNAData(settings)

I wonder if you could help look into this. Thanks!

crash if run on large number of samples

We need to run around 400 samples (each is RNA-seq data from one patient), but FRASER crashes. It runs fine with a smaller number of samples (<25).
Machine specifications:
CPUs: 8
memory: 32849380
storage: /home total: 11T used: 2,7T free: 8,3T
The program is killed while running the function countRNAData(), with the following error:

    Error in sendMaster(try(eval(expr, env), silent = TRUE), FALSE) :
      ignoring SIGPIPE signal
    Calls: countRNAData ... bploop.lapply -> .send_to -> .send_to -> -> sendMaster
    Error in sendMaster(try(eval(expr, env), silent = TRUE), FALSE) :
      ignoring SIGPIPE signal
    Calls: countRNAData ... bploop.lapply -> .send_to -> .send_to -> -> sendMaster

Parallelization should be turned on by the following code:

    if (.Platform$OS.type == "unix") {
        register(MulticoreParam(workers=min(10, multicoreWorkers())))
    } else {
        register(SnowParam(workers=min(10, multicoreWorkers())))
    }

improve zscores

The correct way to calculate z-scores would be to derive them from the CDF:

    cdf = pbetabinom(a, b, q=counts, log.p=TRUE)
    zscore = qnorm(cdf, log.p=TRUE)
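
A minimal sketch of this idea in base R. The snippet above does not say which package `pbetabinom` comes from, so this sketch writes the beta-binomial log-CDF out by hand. The function names and the parameterization by shape parameters `a` and `b` and a total count `n` are assumptions for illustration, not FRASER's implementation.

```r
# Log-PMF of the beta-binomial distribution, base R only.
# k: observed count, n: total count, a/b: shape parameters (illustrative names).
dbetabinom_log <- function(k, n, a, b) {
  lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b)
}

# Log-CDF log P(X <= q), summed in log space for numerical stability.
pbetabinom_log <- function(q, n, a, b) {
  lp <- dbetabinom_log(0:q, n, a, b)
  m <- max(lp)
  # Clamp at 0 so rounding can never give a probability above 1.
  min(0, m + log(sum(exp(lp - m))))
}

# z-score: the standard-normal quantile of the CDF value, taken on the log scale.
zscore_from_cdf <- function(q, n, a, b) {
  qnorm(pbetabinom_log(q, n, a, b), log.p = TRUE)
}

z_low  <- zscore_from_cdf(q = 2,  n = 100, a = 5, b = 5)
z_high <- zscore_from_cdf(q = 90, n = 100, a = 5, b = 5)
print(c(z_low, z_high))
```

With `a = b = 1` the beta-binomial is uniform on 0..n, so the CDF at `q = 4`, `n = 9` is 0.5 and the z-score is 0, which gives a quick sanity check. A count equal to `n` has a CDF of 1 and therefore maps to `Inf`.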

Adapt the script for mouse?

Hello,

First of all, is this code still maintained? I have received no reply despite writing directly to the author.

My question is, is it possible to adapt the code for mouse data?

I tried to use

    library(TxDb.Mmusculus.UCSC.mm10.knownGene)
    library(org.Mm.eg.db)

but I get errors about chromosome length. I suspect the code is hardcoded to use human annotation; is that the case?

This makes most of the plotting functions unusable, as they seem to depend on this annotation.

Trying to install FRASER 2.0

Hello, I'm trying to install FRASER 2.0, but when I install it with Bioconductor, I get version 1.8.1, which is the first FRASER. When trying to use devtools, I get the following error:

Using GitHub PAT from the git credential store.
Error: Failed to install 'unknown package' from GitHub:
HTTP error 401.
Bad credentials

Rate limit remaining: 53/60
Rate limit reset at: 2024-04-26 14:52:19 UTC

I have already checked my GitHub PAT with Sys.getenv("GITHUB_PAT") and it seems fine (I get [1] "" as the answer). I would appreciate some help installing FRASER 2.0.

Thank you in advance.
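
A minimal sketch of the check mentioned above, using only base R. `Sys.getenv()` returns an empty string for a variable that is not set, so a result of `""` means `GITHUB_PAT` is unset in the session rather than valid. The "Using GitHub PAT from the git credential store" line in the error output suggests the rejected token came from the credential store instead. The helper name `describe_pat` is hypothetical.

```r
# Hypothetical helper: report whether a PAT value is present.
# Sys.getenv() returns "" for unset variables, so nzchar() separates the cases.
describe_pat <- function(pat = Sys.getenv("GITHUB_PAT")) {
  if (nzchar(pat)) {
    "GITHUB_PAT is set in this session"
  } else {
    "GITHUB_PAT is not set in this session"
  }
}

print(describe_pat())
```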

Unable to build vignette

I've tried repeatedly to build the vignette for this package with little success. These are the steps I took along with the output. Any idea why this might be happening?

Alternatively, would it be possible to also include an already built PDF copy of the vignette? The parts I am particularly interested in are the syntax nuances of using FRASER to get split and non-split counts.

I'm doing this on macOS Sierra:

R version 3.6.2
texinfo: stable 6.7 [keg-only]

Steps I took:

1. Cloned the gagneurlab/FRASER repo with git
2. Installed devtools
3. Navigated to the FRASER folder
4. Started an R session and ran devtools::build(pkg='.', vignettes=TRUE)

> devtools::build(pkg='.', vignettes=TRUE)
✔  checking for file ‘/Users/himanshujoshi/PycharmProjects/FRASER/DESCRIPTION’ ...
─  preparing ‘FRASER’:
✔  checking DESCRIPTION meta-information ...
─  cleaning src
─  installing the package to build vignettes
E  creating vignettes (2m 19.9s)
   --- re-building ‘FraseR.Rnw’ using knitr
   Fri Jan 17 15:27:01 2020: Calculate the PSI 5 and 3 values ...
   Fri Jan 17 15:27:03 2020: Calculate the PSI site values ...
   Fri Jan 17 15:27:04 2020: Calculate the delta for psi5 values ...
   Fri Jan 17 15:27:04 2020: Calculate the delta for psi3 values ...
   Fri Jan 17 15:27:04 2020: Calculate the delta for psiSite values ...
   Fri Jan 17 15:27:05 2020: Writing final FraseR object ('/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Example_Dataset/fds-object.RDS').

   Fri Jan 17 15:27:05 2020: Fit step for: 'psi5'.
   Fri Jan 17 15:27:06 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:06 2020: Computing PCA ...
   Fri Jan 17 15:27:06 2020: Fitting rho ...
   Fri Jan 17 15:27:07 2020: Compute p values for: 'psi5'.
   Fri Jan 17 15:27:08 2020: Adjust p values for: 'psi5'.
   Fri Jan 17 15:27:08 2020: Compute Z scores for: 'psi5'.

   Fri Jan 17 15:27:08 2020: Fit step for: 'psi3'.
   Fri Jan 17 15:27:09 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:09 2020: Computing PCA ...
   Fri Jan 17 15:27:09 2020: Fitting rho ...
   Fri Jan 17 15:27:10 2020: Compute p values for: 'psi3'.
   Fri Jan 17 15:27:11 2020: Adjust p values for: 'psi3'.
   Fri Jan 17 15:27:11 2020: Compute Z scores for: 'psi3'.

   Fri Jan 17 15:27:12 2020: Fit step for: 'psiSite'.
   Fri Jan 17 15:27:12 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:12 2020: Computing PCA ...
   Fri Jan 17 15:27:12 2020: Fitting rho ...
   Fri Jan 17 15:27:13 2020: Compute p values for: 'psiSite'.
   Fri Jan 17 15:27:14 2020: Adjust p values for: 'psiSite'.
   Fri Jan 17 15:27:15 2020: Compute Z scores for: 'psiSite'.
   Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
   Loading required package: GenomicFeatures
   Loading required package: AnnotationDbi
   Loading required package: org.Hs.eg.db

   'select()' returned 1:1 mapping between keys and columns
   'select()' returned 1:1 mapping between keys and columns
   Fri Jan 17 15:27:18 2020: Collecting results for: psi5
   Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psi5
   Fri Jan 17 15:27:18 2020: Collecting results for: psi3
   Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psi3
   Fri Jan 17 15:27:18 2020: Collecting results for: psiSite
   Fri Jan 17 15:27:18 2020: Process chunk: 1 for: psiSite
   Warning in has_utility("convert", "ImageMagick") :
     ImageMagick not installed or not in PATH
   Fri Jan 17 15:27:22 2020: Start counting the split reads ...
   Fri Jan 17 15:27:22 2020: Count split reads for sample: sample1
   Warning in dir.create(cachedir, recursive = TRUE) :
     '/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/splitCounts' already exists
   Fri Jan 17 15:27:22 2020: Count split reads for sample: sample2
   Fri Jan 17 15:27:22 2020: Count split reads for sample: sample3
   Warning in dir.create(cachedir, recursive = TRUE) :
     '/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/splitCounts' already exists
   Fri Jan 17 15:27:33 2020 : count ranges need to be merged ...
   Fri Jan 17 15:27:35 2020: Create splice site indices ...
   Fri Jan 17 15:27:35 2020: Writing counts to file: /var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/splitCounts.tsv.gz
   Fri Jan 17 15:27:35 2020: Create splice site indices ...
   Fri Jan 17 15:27:35 2020: Start counting the non spliced reads ...
   Fri Jan 17 15:27:35 2020: In total 60 splice junctions are found.
   Fri Jan 17 15:27:35 2020: In total 77 splice sites (acceptor/donor) will be counted ...
   Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample1
   Warning in dir.create(cachedir, recursive = TRUE) :
     '/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/nonSplicedCounts/Data_Analysis' already exists

           ==========     _____ _    _ ____  _____  ______          _____
           =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
             =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
               ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                 ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
           ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
          Rsubread 2.0.0

   //========================== featureCounts setting ===========================\\
   ||                                                                            ||
   ||             Input files : 1 BAM file                                       ||
   ||                           o sample1.bam                                    ||
   ||                                                                            ||
   ||              Annotation : R data.frame                                     ||
   ||      Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
   ||                 Threads : 3                                                ||
   ||                   Level : meta-feature level                               ||
   ||              Paired-end : no                                               ||
   ||      Multimapping reads : counted                                          ||
   || Multi-overlapping reads : counted                                          ||
   ||   Min overlapping bases : 10                                               ||
   ||          Long read mode : yes                                              ||
   ||                                                                            ||
   \\============================================================================//

   //================================= Running ==================================\\
   ||                                                                            ||
   || Load annotation file .Rsubread_UserProvidedAnnotation_pid92802 ...         ||
   ||    Features : 77                                                           ||
   ||    Meta-features : 77                                                      ||
   ||    Chromosomes/contigs : 2                                                 ||
   ||                                                                            ||
   || Process BAM file sample1.bam...                                            ||
   ||    WARNING: Paired-end reads were found.                                   ||
   ||    Total alignments : 802                                                  ||
   ||    Successfully assigned alignments : 188 (23.4%)                          ||
   ||    Running time : 0.01 minutes                                             ||
   ||                                                                            ||
   || Write the final count table.                                               ||
   || Write the read assignment summary.                                         ||
   ||                                                                            ||
   \\============================================================================//

   Saving splice site cache ...
   Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample3

           ==========     _____ _    _ ____  _____  ______          _____
           =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
             =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
               ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                 ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
           ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
          Rsubread 2.0.0

   //========================== featureCounts setting ===========================\\
   ||                                                                            ||
   ||             Input files : 1 BAM file                                       ||
   ||                           o sample3.bam                                    ||
   ||                                                                            ||
   ||              Annotation : R data.frame                                     ||
   ||      Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
   ||                 Threads : 3                                                ||
   ||                   Level : meta-feature level                               ||
   ||              Paired-end : no                                               ||
   ||      Multimapping reads : counted                                          ||
   || Multi-overlapping reads : counted                                          ||
   ||   Min overlapping bases : 10                                               ||
   ||          Long read mode : yes                                              ||
   ||                                                                            ||
   \\============================================================================//

   //================================= Running ==================================\\
   ||                                                                            ||
   || Load annotation file .Rsubread_UserProvidedAnnotation_pid92804 ...         ||
   ||    Features : 77                                                           ||
   ||    Meta-features : 77                                                      ||
   ||    Chromosomes/contigs : 2                                                 ||
   ||                                                                            ||
   || Process BAM file sample3.bam...                                            ||
   ||    WARNING: Paired-end reads were found.                                   ||
   ||    Total alignments : 3498                                                 ||
   ||    Successfully assigned alignments : 866 (24.8%)                          ||
   ||    Running time : 0.04 minutes                                             ||
   ||                                                                            ||
   || Write the final count table.                                               ||
   || Write the read assignment summary.                                         ||
   ||                                                                            ||
   \\============================================================================//

   Saving splice site cache ...
   Fri Jan 17 15:27:35 2020: Count non spliced reads for sample: sample2
   Warning in dir.create(cachedir, recursive = TRUE) :
     '/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/cache/nonSplicedCounts/Data_Analysis' already exists

           ==========     _____ _    _ ____  _____  ______          _____
           =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
             =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
               ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
                 ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
           ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
          Rsubread 2.0.0

   //========================== featureCounts setting ===========================\\
   ||                                                                            ||
   ||             Input files : 1 BAM file                                       ||
   ||                           o sample2.bam                                    ||
   ||                                                                            ||
   ||              Annotation : R data.frame                                     ||
   ||      Dir for temp files : /var/folders/gw/rccgk_r53w316ntmdmtv1hrw0001 ... ||
   ||                 Threads : 3                                                ||
   ||                   Level : meta-feature level                               ||
   ||              Paired-end : no                                               ||
   ||      Multimapping reads : counted                                          ||
   || Multi-overlapping reads : counted                                          ||
   ||   Min overlapping bases : 10                                               ||
   ||          Long read mode : yes                                              ||
   ||                                                                            ||
   \\============================================================================//

   //================================= Running ==================================\\
   ||                                                                            ||
   || Load annotation file .Rsubread_UserProvidedAnnotation_pid92803 ...         ||
   ||    Features : 77                                                           ||
   ||    Meta-features : 77                                                      ||
   ||    Chromosomes/contigs : 2                                                 ||
   ||                                                                            ||
   || Process BAM file sample2.bam...                                            ||
   ||    WARNING: Paired-end reads were found.                                   ||
   ||    Total alignments : 4397                                                 ||
   ||    Successfully assigned alignments : 1011 (23.0%)                         ||
   ||    Running time : 0.05 minutes                                             ||
   ||                                                                            ||
   || Write the final count table.                                               ||
   || Write the read assignment summary.                                         ||
   ||                                                                            ||
   \\============================================================================//

   Saving splice site cache ...
   Fri Jan 17 15:27:38 2020 : Fast merging of counts ...
   Fri Jan 17 15:27:39 2020: Writing counts to file: /var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/nonSplitCounts.tsv.gz
   Fri Jan 17 15:27:41 2020: Calculate the PSI 5 and 3 values ...
   Fri Jan 17 15:27:42 2020: Calculate the PSI site values ...
   Fri Jan 17 15:27:42 2020: Calculate the delta for psi5 values ...
   Fri Jan 17 15:27:43 2020: Calculate the delta for psi3 values ...
   Fri Jan 17 15:27:43 2020: Calculate the delta for psiSite values ...
   Fri Jan 17 15:27:44 2020: Writing final FraseR object ('/var/folders/gw/rccgk_r53w316ntmdmtv1hrw000157/T//RtmpaEz1GE/savedObjects/Data_Analysis/fds-object.RDS').
   Warning: Transformation introduced infinite values in continuous y-axis
   Warning: Removed 165 rows containing missing values (geom_bar).

   Fri Jan 17 15:27:46 2020: Fit step for: 'psi5'.
   Fri Jan 17 15:27:46 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:46 2020: Computing PCA ...
   Fri Jan 17 15:27:46 2020: Fitting rho ...
   Fri Jan 17 15:27:47 2020: Compute p values for: 'psi5'.
   Fri Jan 17 15:27:48 2020: Adjust p values for: 'psi5'.
   Fri Jan 17 15:27:48 2020: Compute Z scores for: 'psi5'.

   Fri Jan 17 15:27:48 2020: Fit step for: 'psi3'.
   Fri Jan 17 15:27:49 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:49 2020: Computing PCA ...
   Fri Jan 17 15:27:49 2020: Fitting rho ...
   Fri Jan 17 15:27:50 2020: Compute p values for: 'psi3'.
   Fri Jan 17 15:27:51 2020: Adjust p values for: 'psi3'.
   Fri Jan 17 15:27:51 2020: Compute Z scores for: 'psi3'.

   Fri Jan 17 15:27:52 2020: Fit step for: 'psiSite'.
   Fri Jan 17 15:27:52 2020: Running fit with correction method: PCA
   Fri Jan 17 15:27:52 2020: Computing PCA ...
   Fri Jan 17 15:27:52 2020: Fitting rho ...
   Fri Jan 17 15:27:53 2020: Compute p values for: 'psiSite'.
   Fri Jan 17 15:27:54 2020: Adjust p values for: 'psiSite'.
   Fri Jan 17 15:27:55 2020: Compute Z scores for: 'psiSite'.
   'select()' returned 1:1 mapping between keys and columns
   'select()' returned 1:1 mapping between keys and columns
   Fri Jan 17 15:27:58 2020: Collecting results for: psi5
   Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psi5
   Fri Jan 17 15:27:58 2020: Collecting results for: psiSite
   Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psiSite
   Fri Jan 17 15:27:58 2020: Collecting results for: psi3
   Fri Jan 17 15:27:58 2020: Process chunk: 1 for: psi3
   Fri Jan 17 15:27:59 2020: Collecting results for: psi3
   Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psi3
   Fri Jan 17 15:27:59 2020: Collecting results for: psiSite
   Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psiSite
   Fri Jan 17 15:27:59 2020: Collecting results for: psi5
   Fri Jan 17 15:28:00 2020: Process chunk: 1 for: psi5
   Fri Jan 17 15:28:03 2020: Running fit with correction method: PCA-BB-Decoder
   125
   67
   dPsi filter:FALSE: 121	TRUE: 2
   Exclusion matrix: TRUE: 123
   Fri Jan 17 15:28:12 2020: Injecting outliers: 0 / 0 (primary/secondary)
   Warning in optimHyperParams(fds, type = "psi5") :
     No outliers could be injected so the hyperparameter optimization could not run. Possible reason: too few junctions in the data.
   Warning in plotEncDimSearch(fds, type = "psi5") :
     no hyperparameters were estimated for psi5
   Please use `optimHyperParams` to compute them.
   Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])),  :
     EXPR is a "factor", treated as integer.
    Consider using 'switch(as.character( * ), ...)' instead.
   Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])),  :
     EXPR is a "factor", treated as integer.
    Consider using 'switch(as.character( * ), ...)' instead.
   Warning in switch(x, psi5 = c(bquote(psi[5])), psi3 = c(bquote(psi[3])),  :
     EXPR is a "factor", treated as integer.
    Consider using 'switch(as.character( * ), ...)' instead.
   Error: processing vignette 'FraseR.Rnw' failed with diagnostics:
   Running 'texi2dvi' on 'FraseR.tex' failed.
   --- failed re-building ‘FraseR.Rnw’

   SUMMARY: processing the following file failed:
     ‘FraseR.Rnw’

   Error: Vignette re-building failed.
   Execution halted
Error in (function (command = NULL, args = character(), error_on_status = TRUE,  :
  System command error

Getting error when counting splice sites

Hello,

I installed the package on an Ubuntu 22 server.

When running

    fds <- countRNAData(settings)

with my own data (ONT BAM files), the first steps work fine, but then I get

    Mon Aug 29 17:58:49 2022: In total 239257 splice sites (acceptor/donor) will be counted ...
    Error in reducer$value.cache[[as.character(idx)]] <- values :
      wrong args for environment subassignment
    In addition: Warning message:
    In parallel::mccollect(wait = FALSE, timeout = 1) :
      1 parallel job did not deliver a result

What could be the issue?

Thanks

countNonSplicedReads gets stuck with no error report

When I use countNonSplicedReads, it shows the Subread logo. After waiting for one day, it is still running, but the CPU usage is very low. Because no error is shown, I cannot tell what happened. Could you please add more information for better diagnosis?

Go back to raw data after using FRASER::results()

Hello,

I would like to see the raw data that was used to generate the p-values and z-scores in the results() table, for example fds_filtered@assays@data[["psi5"]], and see the values by sample.

However, I don't seem to be able to get back to it using the concatenation of seqnames:ranges. Are these coordinates changed somehow when running results()?

Or is it again a problem of species?

Thanks in advance!
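
A minimal sketch of the lookup described above, in base R on toy data. It assumes the junction coordinates are available as a table with `seqnames`, `start`, and `end` columns whose rows line up with the rows of the per-sample value matrix. All object names (`junctions`, `psi`, `res_tab`) and all values are made up for illustration and are not FRASER output.

```r
# Toy junction coordinates; row i of `junctions` describes row i of `psi`.
junctions <- data.frame(
  seqnames = c("chr1", "chr1", "chr2"),
  start    = c(100L, 500L, 40L),
  end      = c(200L, 900L, 80L),
  stringsAsFactors = FALSE
)

# Toy per-sample values (rows = junctions, columns = samples).
psi <- matrix(
  c(0.9, 0.1, 0.5,   # sample1
    0.8, 0.2, 0.4),  # sample2
  nrow = 3,
  dimnames = list(NULL, c("sample1", "sample2"))
)

# Build a "seqnames:start-end" key for every row of a coordinate table.
make_key <- function(df) paste0(df$seqnames, ":", df$start, "-", df$end)

# Toy results table with one reported junction and sample.
res_tab <- data.frame(
  seqnames = "chr1", start = 500L, end = 900L, sampleID = "sample2",
  stringsAsFactors = FALSE
)

# Locate the matching matrix row; NA would mean the key was not found.
row_idx <- match(make_key(res_tab), make_key(junctions))
raw_value <- psi[row_idx, res_tab$sampleID]
print(raw_value)
```

If `match()` returns `NA` on real data, one possible cause is that the two key formats differ (for example a missing `chr` prefix, or a start or end shifted by one), which may be worth checking before suspecting the species annotation.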

Error: useNames = NA is defunct

Dear sir,
I ran the test dataset, and got the error below
