Fido-SNP

INTRODUCTION

  Emidio Capriotti, 2018.
  University of Bologna
  Scripts are licensed under the Creative Commons BY-NC-SA license.

  Fido-SNP is a program for the annotation of single nucleotide variants in the dog genome.

  Please cite:
  Capriotti E, Montanucci L, Profiti G, Rossi I, Giannuzzi D, Aresu L, Fariselli P. (2019).
  Fido-SNP: The first webserver for scoring the impact of single nucleotide variants in the dog genome.
  Nucleic Acids Research. DOI:10.1093/nar/gkz420.

INSTALLATION

  Minimum requirements:
  wget, curl, zcat, scikit-learn.
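
  Before running the installer, it can help to confirm that the requirements
  listed above are present. The following is a minimal sketch, not part of
  Fido-SNP: the helper name missing_requirements is hypothetical, and the only
  assumption beyond the list above is that scikit-learn is importable under its
  usual module name, sklearn.

```python
import importlib.util
import shutil

# Command-line tools listed under "Minimum requirements".
REQUIRED_TOOLS = ["wget", "curl", "zcat"]


def missing_requirements(tools=REQUIRED_TOOLS, module="sklearn"):
    """Return the names of required tools or modules that cannot be found.

    Hypothetical helper, not shipped with Fido-SNP.
    """
    # shutil.which returns None when a command is not on the PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # scikit-learn is imported under the module name "sklearn".
    if importlib.util.find_spec(module) is None:
        missing.append(module)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

  An empty result means every listed requirement was found on this machine.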

  Run:
    python setup.py install arch_type

  arch_type is the architecture for which UCSC executable files are available.
  For 64-bit Linux, the standard value is:
    - linux.x86_64

  Installation time depends on the network speed.
  About 35 GB of UCSC-like files need to be downloaded.

  Test:
    python setup.py test

MANUAL INSTALLATION

  1) Download the Fido-SNP scripts from GitHub
    - git clone https://github.com/biofold/Fido-SNP

  2) Required Python library: scikit-learn
    - git://github.com/scikit-learn/scikit-learn

  3) Required UCSC tools and data:
    - bigWigToBedGraph and twoBitToFa, downloaded from
      http://hgdownload.cse.ucsc.edu/admin/exe
      and placed in the ucsc/exe directory

    - For canfam2-based predictions:
      canfam2.2bit: http://snps.biofold.org/Fido-SNP/ucsc/canfam2/canfam2.2bit
      canfam2.phyloP4way.bw: http://snps.biofold.org/Fido-SNP/ucsc/canfam2/canfam2.phyloP4way.bw
      canfam2.phyloP10way.bw: http://snps.biofold.org/Fido-SNP/ucsc/canfam2/canfam2.phyloP10way.bw

    - For canfam3-based predictions:
      canfam3.2bit: http://snps.biofold.org/Fido-SNP/ucsc/canfam3/canfam3.2bit
      canfam3.phyloP4way.bw: http://snps.biofold.org/Fido-SNP/ucsc/canfam3/canfam3.phyloP4way.bw
      canfam3.phyloP10way.bw: http://snps.biofold.org/Fido-SNP/ucsc/canfam3/canfam3.phyloP10way.bw
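
  The data URLs above follow one pattern per genome build, so they can be
  generated rather than typed by hand. The sketch below only builds the list of
  URLs. The function name data_urls is hypothetical, and the destination
  directory for the downloaded files is not stated here, so it is left to the
  reader.

```python
# Base location and file suffixes taken from the URLs listed above.
BASE_URL = "http://snps.biofold.org/Fido-SNP/ucsc"
SUFFIXES = ["2bit", "phyloP4way.bw", "phyloP10way.bw"]
GENOMES = ("canfam2", "canfam3")


def data_urls(genome):
    """Return the three data file URLs for one genome build.

    Hypothetical helper, not shipped with Fido-SNP.
    """
    if genome not in GENOMES:
        raise ValueError(f"unknown genome build: {genome!r}")
    return [f"{BASE_URL}/{genome}/{genome}.{suffix}" for suffix in SUFFIXES]


if __name__ == "__main__":
    # Print the URLs so they can be passed to wget or curl.
    for url in data_urls("canfam3"):
        print(url)
```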

HOW TO RUN

  Fido-SNP accepts as input either a single variant or a file containing multiple single nucleotide variants.

  - For single variants use the option -c:
    python fido_variants.py chr1,15189413,C,G -g canfam3 -c

  - For file input, the file can be either:

    a plain tab-separated file with 4 columns: chr, position, ref, alt
    python fido_variants.py test/test_canfam3.tsv -g canfam3

    a VCF file whose first 5 columns are: chr, position, rsid, ref, alt
    python fido_variants.py test/test_canfam3.tsv.gz --vcf -g canfam3
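
  The two plain-text input forms described above can be produced from the same
  variant tuples. This is a sketch under assumptions: both function names are
  hypothetical, no header line is written because none is mentioned above, and
  the chromosome value is passed through exactly as supplied.

```python
def single_variant_argument(chrom, position, ref, alt):
    """Format one variant as the comma-separated argument used with -c."""
    return f"{chrom},{position},{ref},{alt}"


def format_variants_tsv(variants):
    """Format variants as tab-separated lines: chr, position, ref, alt.

    Hypothetical helper; assumes the input file has no header line.
    """
    lines = [
        f"{chrom}\t{position}\t{ref}\t{alt}"
        for chrom, position, ref, alt in variants
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    variants = [("chr1", 15189413, "C", "G")]
    print(single_variant_argument(*variants[0]))
    # Write the 4-column file; my_variants.tsv is an illustrative file name.
    with open("my_variants.tsv", "w") as handle:
        handle.write(format_variants_tsv(variants))
```

  The resulting file would then be passed to fido_variants.py in place of
  test/test_canfam3.tsv.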

OUTPUT

  Fido-SNP outputs a probabilistic score between 0 and 1. If the score is >0.5, the variant
  is predicted as disease-related. The probability is added as an extra column to the input file.
  An example of the output is shown below.

    1       15189413        C       G       Yes     Pathogenic    0.515   0.075  -0.027   0.317
    5       34700967        T       A       Yes     Benign        0.145   0.242   0.766   0.678
    9       54071528        T       C       Yes     Benign        0.327   0.246  -0.402   0.260
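
  The >0.5 rule can be checked against the example rows. The sketch below
  assumes that the seventh whitespace-separated field is the probability; the
  column layout is not documented here, so that index is a guess based on the
  example, and the helper name label_from_score is hypothetical. The labels
  Pathogenic and Benign are taken from the example output.

```python
# Example rows copied from the output shown above.
EXAMPLE_OUTPUT = """\
1       15189413        C       G       Yes     Pathogenic    0.515   0.075  -0.027   0.317
5       34700967        T       A       Yes     Benign        0.145   0.242   0.766   0.678
9       54071528        T       C       Yes     Benign        0.327   0.246  -0.402   0.260
"""

# Assumption: field 7 (index 6) holds the probabilistic score.
SCORE_INDEX = 6
LABEL_INDEX = 5


def label_from_score(score, threshold=0.5):
    """Apply the documented rule: a score above 0.5 is disease-related."""
    return "Pathogenic" if score > threshold else "Benign"


def check_rows(text):
    """Return (reported label, label derived from the score) for each row."""
    results = []
    for line in text.splitlines():
        fields = line.split()
        score = float(fields[SCORE_INDEX])
        results.append((fields[LABEL_INDEX], label_from_score(score)))
    return results


if __name__ == "__main__":
    for reported, derived in check_rows(EXAMPLE_OUTPUT):
        print(reported, derived)
```

  For the three example rows, the reported and derived labels agree, which is
  consistent with the assumed column position but does not confirm it.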
