VPOT

VPOT - Variant Prioritisation Ordering Tool.

VPOT is a Python tool for prioritising variants in ANNOVAR-annotated VCF files. It provides the following functions to speed up variant discovery:

  • priority - priority tool
  • genef - gene filter
  • samplef - samples and inheritance model filtering
  • stats - general statistics on the VPOT priority output file
  • merge - merge multiple VPOT priority output files into a single VPOT priority output file
  • utility - VPOT utilities

Requirements

  • Python 3.6+ and NumPy
  • Linux environment, or access to Linux via SSH

Installation

  • Navigate to desired install directory and clone this repository.

git clone https://github.com/VCCRI/VPOT.git VPOT

  • Ensure that requirements are met.
  • Test that VPOT is working using the test_data and README.MD/tutorial provided.
  • All done!
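
The requirements listed above can be checked from Python before working through the tutorial. The following is a minimal sketch and is not part of VPOT: the function name `check_requirements` is hypothetical, and it only tests the Python version, whether NumPy can be found, and whether the platform reports itself as Linux.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the Requirements section


def check_requirements():
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration only, not shipped with VPOT.
    """
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python %d.%d or newer is required" % MIN_PYTHON)
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec("numpy") is None:
        problems.append("NumPy is not installed")
    if not sys.platform.startswith("linux"):
        problems.append("Platform is not Linux: " + sys.platform)
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    if issues:
        for issue in issues:
            print("Requirement not met: " + issue)
    else:
        print("All requirements met")
```

String formatting with `%` is used instead of f-strings so that the script still parses, and can report the problem, on interpreters older than Python 3.6.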

Usage - VPOT.py < option >

  • help - will return a help screen

  • priority - priority tool. This option performs variant prioritisation on the VCF files of the input samples. It scores each variant found in the supplied samples according to the weightings assigned to annotations in the Prioritisation Parameter File (PPF).

    Inputs :

    • location for output file+prefix
    • file of input VCF files and sample IDs
    • Prioritisation Parameter File (PPF)

    Output :

    • VPOT Priority Output List (VPOL)

  • genef - gene filter. This option filters the variants in a VPOL by the genes supplied as input.

    Inputs :

    • location for output file+prefix
    • VPOT Priority Output List (VPOL)
    • gene list

    Output :

    • gene filtered VPOT Priority Output List (VPOL)

  • samplef - sample and inheritance model filtering. This option filters the variants in a VPOL based on a supplied ped format file. It can be used for simple case-control filtering or for inheritance model filtering in a family trio.

    Inputs :

    • location for output file+prefix
    • VPOT Priority Output List (VPOL)
    • sample selection file (ped format)
    • proband sample ID (for inheritance model filtering)
    • inheritance model (DN/AD/AR/CH for inheritance model filtering)

    Output :

    • sample filtered VPOT Priority Output List (VPOL)

  • stats - general statistics on the VPOT priority output file. This option returns a summary statistics file for the supplied VPOL. It provides a short report listing the number of variants (total, scoring, and non-scoring), the number of genes, and the number of samples. It also breaks down the number of variants in each 10% percentile range of scores and lists the top 20 variants. A per-sample breakdown is provided as well, with a table of the number of variants in genes found above the supplied percentile value.

    Inputs :

    • location for output file+prefix
    • VPOT Priority Output List (VPOL)
    • a percentile value [1-99] for quick summary statistic

    Output :

    • summary statistics file for the VPOT Priority Output List (VPOL)

  • merge - merge multiple VPOT priority output list files into a single VPOT priority output list file. This option merges several VPOLs into a single VPOL. It allows a large cohort to be split into smaller groups to speed up prioritisation, with the outputs then consolidated back into one cohort-wide VPOL for downstream analysis or filtering.

    Inputs :

    • location for output file+prefix
    • file containing a list of VPOT Priority Output List (VPOL) files

    Output :

    • merged VPOT Priority Output List (VPOL)

  • utilities - VPOT utilities.

    Utility 1 : convertVEP - this utility converts a VEP annotated VCF into the standard VCF format.

    Inputs :

    • full pathname of input VEP annotated VCF file
    • full pathname of output VCF file, including directory path

    Output :

    • standard VCF

See README.MD in the test_data directory for more details on each function.
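
To make the relationship between the options more concrete, the sketch below mimics the ideas behind priority, genef, merge and stats on a few in-memory records. It is an illustration under stated assumptions, not VPOT's implementation: it does not call VPOT, it does not read or write the PPF or VPOL file formats, and the record layout, weight table, gene names and function names are all hypothetical. Treating a score above zero as a "scoring" variant is likewise an assumption.

```python
# Illustrative only: NOT VPOT's code and NOT VPOT's file formats.
# The record layout, weights, and function names are assumptions.

# Hypothetical weights: annotation field -> {annotation value -> score}.
WEIGHTS = {
    "ExonicFunc": {"stopgain": 10, "nonsynonymous SNV": 5},
    "SIFT_pred": {"D": 3},
}


def score_variant(annotations, weights=WEIGHTS):
    """Sum the weights of every annotation value that has one (priority idea)."""
    return sum(
        weights.get(field, {}).get(value, 0)
        for field, value in annotations.items()
    )


def prioritise(variants, weights=WEIGHTS):
    """Score each variant and order the list from highest to lowest score."""
    scored = [
        dict(v, score=score_variant(v["annotations"], weights)) for v in variants
    ]
    return sorted(scored, key=lambda v: v["score"], reverse=True)


def gene_filter(scored, genes):
    """Keep only variants whose gene is in the supplied gene list (genef idea)."""
    wanted = set(genes)
    return [v for v in scored if v["gene"] in wanted]


def merge(*scored_lists):
    """Combine separately scored groups and re-sort them by score (merge idea)."""
    combined = [v for group in scored_lists for v in group]
    return sorted(combined, key=lambda v: v["score"], reverse=True)


def summary(scored):
    """Count total, scoring and non-scoring variants, and genes (stats idea).

    Assumption: a variant counts as "scoring" when its score is above zero.
    """
    scoring = sum(1 for v in scored if v["score"] > 0)
    return {
        "total": len(scored),
        "scoring": scoring,
        "non_scoring": len(scored) - scoring,
        "genes": len({v["gene"] for v in scored}),
    }


# Two small made-up groups standing in for a cohort split into batches.
group_a = prioritise([
    {"gene": "GENE1", "annotations": {"ExonicFunc": "stopgain", "SIFT_pred": "D"}},
    {"gene": "GENE2", "annotations": {"ExonicFunc": "synonymous SNV"}},
])
group_b = prioritise([
    {"gene": "GENE3", "annotations": {"ExonicFunc": "nonsynonymous SNV"}},
])

ranked = merge(group_a, group_b)
print([(v["gene"], v["score"]) for v in ranked])
print([v["gene"] for v in gene_filter(ranked, ["GENE1", "GENE3"])])
print(summary(ranked))
```

Scoring each group separately and merging afterwards gives the same ordering as scoring the whole cohort at once, because each variant's score depends only on its own annotations. This is the property that makes the split-then-merge workflow described for the merge option possible in this simplified model.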

Citation

Eddie Ip, Gavin Chapman, David Winlaw, Sally L. Dunwoodie, Eleni Giannoulatou, VPOT: A Customizable Variant Prioritization Ordering Tool for Annotated Variants, Genomics, Proteomics & Bioinformatics, Volume 17, Issue 5, 2019, Pages 540-545, ISSN 1672-0229, https://doi.org/10.1016/j.gpb.2019.11.001. (http://www.sciencedirect.com/science/article/pii/S1672022919301494)
