rcastelo / variantfiltering
Filter genetic variants using different criteria such as inheritance model, conservation, etc.
Hi!
I am exploring variant annotation for the first time and was hoping to use this tool, but I've run into problems.
I have a trio VCF file of parents and offspring with a genetic condition. The VCF tells me the build used was GRCh38. Note that the VCF file was given to me and has already been heavily filtered down to about 500 variants.
I installed VariantFiltering and tried to follow the tutorial. I wasn't that surprised to get an error.
vfpar <- VariantFilteringParam("~/Documents/Imperial.Job.Test/clinical_genetics_RA_test/PHNED_joint_hg38.vcf")
allvars <- unrelatedIndividuals(vfpar)
Chromosomes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y have different lengths between the input VCF and the BSgenome package. These chromosomes will be discarded from further analysis
Assuming the genome build of the input variants is hs37d5.
Switching to the UCSC chromosome-name style from the transcript-centric annotation package.
Chromosome chrM has different lengths between the input VCF and the input TxDb pakage. This chromosome will be discarded from further analysis
Error in .matchSeqinfo(variants, txdb, bsgenome) :
None of the chromosomes in the input VCF file has the same length as the chromosomes in the input TxDb package. The genome reference sequence employed to generate the VCF file was probably different from the one in the input TxDb package.
I tried again, specifying the hg38 TxDb package, but still got an error. This has basically stopped me, and I'll have to switch tools because I'm working towards a deadline. However, I'd really prefer to get this tool off the ground for my future analyses. Any pointers?
vfpar <- VariantFilteringParam("~/Documents/Imperial.Job.Test/clinical_genetics_RA_test/PHNED_joint_hg38.vcf",
txdb = "TxDb.Hsapiens.UCSC.hg38.knownGene")
allvars <- unrelatedIndividuals(vfpar)
Chromosomes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y have different lengths between the input VCF and the BSgenome package. These chromosomes will be discarded from further analysis
Assuming the genome build of the input variants is hs37d5.
Switching to the UCSC chromosome-name style from the transcript-centric annotation package.
Assumming hs37d5 and hg38 represent the same genome build.
Discarding scaffold sequences.
458 variants processed
Error in validObject(.Object) :
invalid class "VariantFilteringResults" object: 1: invalid object for slot "cutoffs" in class "VariantFilteringResults": got class "list", should be or extend class "CutoffsList"
invalid class "VariantFilteringResults" object: 2: invalid object for slot "sortings" in class "VariantFilteringResults": got class "NULL", should be or extend class "CutoffsList"
In addition: Warning message:
In .local(param, ...) : The input VCF file has no variants.
Thanks!
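Both logs above report chromosome lengths that differ between the VCF and the BSgenome package, and both say the build is assumed to be hs37d5. One way to confirm which build the VCF was actually called against is to compare the chromosome 1 length in its header with the published lengths for GRCh37 and GRCh38. The following is a minimal base-R sketch; the helper name guess_build and the example header lines are hypothetical, and it assumes the header carries ##contig lines with a length field.

```r
# Published chromosome 1 lengths for the two common human builds.
chr1_lengths <- c(GRCh37 = 249250621, GRCh38 = 248956422)

# Hypothetical helper: guess the build from the '##contig' header lines.
guess_build <- function(header_lines) {
  # Keep only the contig line for chromosome 1 ("1" or "chr1").
  contig <- grep("^##contig=<.*ID=(chr)?1[,>]", header_lines, value = TRUE)
  if (length(contig) == 0L) return(NA_character_)
  if (!grepl("length=[0-9]+", contig[1])) return(NA_character_)
  len <- as.numeric(sub(".*length=([0-9]+).*", "\\1", contig[1]))
  hit <- names(chr1_lengths)[chr1_lengths == len]
  if (length(hit) == 1L) hit else NA_character_
}

# Made-up header lines, standing in for readLines() on the real file.
example_header <- c("##fileformat=VCFv4.2",
                    "##contig=<ID=chr1,length=248956422>")
guess_build(example_header)
```

For a real file, the header lines could be read with readLines() (wrapped in gzfile() for a compressed VCF). If the result is GRCh38, then the hs37d5-based defaults shown in the log would presumably all need GRCh38 counterparts, not only the TxDb package; VariantFilteringParam() has a bsgenome argument for the reference sequence, but which GRCh38 annotation packages to pair with it is not something this sketch settles.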
Hello, I would like to use VariantFiltering for NGS analysis. I successfully installed VariantFiltering.
I was following the instructions using the test data and then came across this issue, which is blocking my progress.
BPPARAM seems to be causing the issue, but I don't know how to deal with the error. Any help or explanation will be appreciated.
Find below the commands, errors, and session info.
CEUvcf <- file.path(system.file("extdata", package="VariantFiltering"), "CEUtrio.vcf.bgz")
CEUped <- file.path(system.file("extdata", package="VariantFiltering"), "CEUtrio.ped")
param <- VariantFilteringParam(vcfFilename=CEUvcf, pedFilename=CEUped)
reHet <- autosomalRecessiveHeterozygous(param)
or
reHet <- autosomalRecessiveHeterozygous(param, svparam=ScanVcfParam(), BPPARAM=bpparam("SerialParam"))
Error in .local(param, ...) :
Parallel back-end function bpparam given in argument 'BPPARAM' does not exist in the current workspace. Either you did not write correctly the function name or you did not load the package 'BiocParallel'.
Parallel back-end function SerialParam given in argument 'BPPARAM' does not exist in the current workspace. Either you did not write correctly the function name or you did not load the package 'BiocParallel'.
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Africa/Bamako
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] phastCons100way.UCSC.hg19_3.7.2 SIFT.Hsapiens.dbSNP137_1.0.0 PolyPhen.Hsapiens.dbSNP131_1.0.2
[4] RSQLite_2.3.3 MafDb.1Kgenomes.phase1.hs37d5_3.10.0 GenomicScores_2.14.1
[7] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.54.1
[10] org.Hs.eg.db_3.18.0 AnnotationDbi_1.64.1 BSgenome.Hsapiens.1000genomes.hs37d5_0.99.1
[13] BSgenome_1.70.1 rtracklayer_1.62.0 BiocIO_1.12.0
[16] VariantFiltering_1.38.0 VariantAnnotation_1.48.1 Rsamtools_2.18.0
[19] Biostrings_2.70.1 XVector_0.42.0 SummarizedExperiment_1.32.0
[22] GenomicRanges_1.54.1 GenomeInfoDb_1.38.1 IRanges_2.36.0
[25] S4Vectors_0.40.1 MatrixGenerics_1.14.0 matrixStats_1.1.0
[28] BiocParallel_1.36.0 Biobase_2.62.0 BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 jsonlite_1.8.7 rstudioapi_0.15.0 magrittr_2.0.3
[5] rmarkdown_2.25 zlibbioc_1.48.0 vctrs_0.6.4 memoise_2.0.1
[9] RCurl_1.98-1.13 base64enc_0.1-3 htmltools_0.5.7 S4Arrays_1.2.0
[13] progress_1.2.2 AnnotationHub_3.10.0 curl_5.1.0 Rhdf5lib_1.24.0
[17] SparseArray_1.2.2 Formula_1.2-5 rhdf5_2.46.0 htmlwidgets_1.6.3
[21] Gviz_1.46.1 cachem_1.0.8 GenomicAlignments_1.38.0 mime_0.12
[25] lifecycle_1.0.4 pkgconfig_2.0.3 Matrix_1.6-3 R6_2.5.1
[29] fastmap_1.1.1 GenomeInfoDbData_1.2.11 shiny_1.8.0 digest_0.6.33
[33] colorspace_2.1-0 shinyTree_0.3.1 Hmisc_5.1-1 filelock_1.0.2
[37] fansi_1.0.5 httr_1.4.7 abind_1.4-5 compiler_4.3.2
[41] bit64_4.0.5 htmlTable_2.4.2 backports_1.4.1 DBI_1.1.3
[45] HDF5Array_1.30.0 biomaRt_2.58.0 rappdirs_0.3.3 DelayedArray_0.28.0
[49] rjson_0.2.21 tools_4.3.2 foreign_0.8-85 interactiveDisplayBase_1.40.0
[53] httpuv_1.6.12 nnet_7.3-19 glue_1.6.2 restfulr_0.0.15
[57] rhdf5filters_1.14.1 promises_1.2.1 grid_4.3.2 checkmate_2.3.0
[61] cluster_2.1.4 generics_0.1.3 gtable_0.3.4 ensembldb_2.26.0
[65] data.table_1.14.8 hms_1.1.3 xml2_1.3.5 utf8_1.2.4
[69] BiocVersion_3.18.1 pillar_1.9.0 stringr_1.5.1 later_1.3.1
[73] dplyr_1.1.4 BiocFileCache_2.10.1 lattice_0.22-5 deldir_2.0-2
[77] bit_4.0.5 biovizBase_1.50.0 tidyselect_1.2.0 RBGL_1.78.0
[81] knitr_1.45 gridExtra_2.3 ProtGenerics_1.34.0 xfun_0.41
[85] DT_0.30 stringi_1.8.2 lazyeval_0.2.2 yaml_2.3.7
[89] evaluate_0.23 codetools_0.2-19 interp_1.1-4 tibble_3.2.1
[93] BiocManager_1.30.22 graph_1.80.0 cli_3.6.1 rpart_4.1.21
[97] shinythemes_1.2.0 xtable_1.8-4 munsell_0.5.0 dichromat_2.0-0.1
[101] Rcpp_1.0.11 dbplyr_2.4.0 png_0.1-8 XML_3.99-0.15
[105] parallel_4.3.2 ellipsis_0.3.2 ggplot2_3.4.4 blob_1.2.4
[109] prettyunits_1.2.0 jpeg_0.1-10 latticeExtra_0.6-30 AnnotationFilter_1.26.0
[113] bitops_1.0-7 scales_1.2.1 crayon_1.5.2 rlang_1.1.2
[117] KEGGREST_1.42.0 shinyjs_2.1.0
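The error above names both bpparam and SerialParam, one message after the other. That pattern is what base R produces when a check is applied to the unevaluated argument expression instead of its value. The sketch below reproduces only that mechanism; check_backend is a hypothetical stand-in, and whether VariantFiltering performs its check this way is an assumption, not something confirmed from the package source.

```r
# Hypothetical stand-in for a check that inspects the argument expression.
check_backend <- function(BPPARAM) {
  # substitute() returns the unevaluated call bpparam("SerialParam");
  # as.character() on a call yields one string per element of the call.
  name <- as.character(substitute(BPPARAM))
  # sprintf() is vectorised, so two names give two messages.
  sprintf("Parallel back-end function %s does not exist.", name)
}

# BPPARAM is never evaluated, so BiocParallel is not needed to run this.
msgs <- check_backend(bpparam("SerialParam"))
length(msgs)
# stop() would paste the messages together without a separator:
paste0(msgs, collapse = "")
```

If the package does follow this pattern, the call expression is split into its function name and its argument, which would account for the two back-to-back messages. That would point to a problem in how the argument is inspected, not to a missing BiocParallel, which the session info lists as attached.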
In file R/methods-VariantFilteringParam.R (and others), it makes more sense to me to write seq_along(vcfFilenames) instead of seq(along=vcfFilenames).
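To illustrate the suggestion, here is a small base-R comparison. The file names are made up. The two forms return the same indices, so the change would be about readability and not relying on partial matching of the along.with argument, not about behaviour.

```r
files <- c("a.vcf", "b.vcf")  # hypothetical file names
empty <- character(0)

# seq(along = x) partially matches seq(along.with = x); seq_along(x)
# gives the same integer indices without going through seq() dispatch.
same_nonempty <- identical(seq_along(files), seq(along = files))
same_empty <- identical(seq_along(empty), seq(along = empty))

# Both forms are safe on zero-length input, unlike 1:length(x),
# which counts down from 1 to 0 and so would loop twice.
n_unsafe <- length(1:length(empty))
n_safe <- length(seq_along(empty))
```

Here same_nonempty and same_empty are both TRUE, while n_unsafe is 2 and n_safe is 0.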
https://github.com/rcastelo/VariantFiltering/blob/master/R/annotationEngine.R#L801-L802
Error in .loc2SNPid(annObj, variantsGR[masksnp], BPPARAM = BPPARAM) :
all ranges in 'locs' must be of width 1
That seems awfully strict. How can I tell VariantFiltering to simply not annotate those loci?