ithaca-bees's Introduction

Whole-genome re-sequencing to examine genetic changes in a population of Ithaca, NY honeybees using samples collected in 1977 and 2011

This repo contains the part of the analysis that was performed on the cluster. The downstream analysis in R and the plotting in Python have not yet been added.

Genomes were sequenced on an Illumina HiSeq, using genomic libraries prepared without PCR. In addition to the Ithaca samples, bees were included from populations in Arizona and Chiapas (Africanized) and from Hawaii, Korea and Japan (non-Africanized).

Some of the steps are parallelized on an SGE cluster.

Workflow

The first step was to align the reads to the reference using bowtie2, and then to re-calibrate alignments around indels using GATK.

SNP calling

major_split.py

  • create a file of limits for GATK, corresponding to the 16 major chromosomes
    • this was piped to data/scaffolds_long.txt
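
As a rough illustration of this step, the sketch below builds GATK-style interval strings (`name:start-end`, 1-based and inclusive) from the lines of a samtools FASTA index (`.fai`). It is not the code in major_split.py: the function name `major_scaffold_intervals` and the idea of passing in an explicit set of sequence names are assumptions made for the example.

```python
def major_scaffold_intervals(fai_lines, major_names):
    """Return GATK-style intervals for the sequences listed in major_names.

    fai_lines   : iterable of lines from a samtools .fai index
                  (tab-separated; column 1 = name, column 2 = length)
    major_names : collection of sequence names to keep (hypothetical input)
    """
    wanted = set(major_names)
    intervals = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, length = fields[0], int(fields[1])
        if name in wanted:
            # one interval spanning the whole sequence, in index order
            intervals.append(f"{name}:1-{length}")
    return intervals


if __name__ == "__main__":
    # Made-up index lines, for demonstration only
    demo = ["chrA\t100\t6\t60\t61\n", "scaf1\t10\t120\t60\t61\n"]
    for interval in major_scaffold_intervals(demo, {"chrA"}):
        print(interval)
```

Printing one interval per line gives output that could be redirected to a file such as data/scaffolds_long.txt.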

bqsr.sh

  • perform base quality recalibration using known SNP sites from NCBI and validated sites kindly provided by Greg Hunt

call.sh

  • starting with mapped fragments, call genotypes for all samples

vqsr.sh

  • perform variant quality score recalibration to filter low-quality SNPs

SNP frequency measurement using ANGSD

angsd.sh

  • compute minor allele frequencies for old and modern populations, and conduct likelihood ratio tests for significant changes
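
To make the idea of a likelihood ratio test for an allele frequency change concrete, here is a simplified sketch. It assumes hard allele counts and a binomial model, whereas ANGSD works from genotype likelihoods, so this is an illustration of the principle and not what angsd.sh computes. The function names are hypothetical.

```python
import math


def _binom_loglik(k, n, p):
    """Binomial log-likelihood of k minor alleles among n chromosomes.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def allele_freq_lrt(k_old, n_old, k_new, n_new):
    """Test H0 (one shared frequency) against H1 (separate frequencies).

    Returns (statistic, p_value); the statistic is compared to a
    chi-square distribution with 1 degree of freedom.
    """
    p_old = k_old / n_old
    p_new = k_new / n_new
    p_pool = (k_old + k_new) / (n_old + n_new)
    ll_alt = _binom_loglik(k_old, n_old, p_old) + _binom_loglik(k_new, n_new, p_new)
    ll_null = _binom_loglik(k_old, n_old, p_pool) + _binom_loglik(k_new, n_new, p_pool)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up counts: 2 of 40 chromosomes in the old sample, 20 of 40 in the modern one
    print(allele_freq_lrt(2, 40, 20, 40))
```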

intersect_mafs.py

  • intersect minor allele frequency files for old and modern populations
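
A minimal sketch of such an intersection follows. It assumes tab-delimited input with a header containing `chromo` and `position` columns, as in ANGSD `.mafs` output, and a frequency column whose name (`knownEM` here) depends on how ANGSD was run. It is not the code in intersect_mafs.py, and the function names are hypothetical.

```python
import csv


def read_mafs(lines, freq_col="knownEM"):
    """Map (chromosome, position) to allele frequency.

    lines    : iterable of text lines, header first (e.g. an open file,
               or gzip.open(path, "rt") for a compressed file)
    freq_col : name of the frequency column (assumed; varies with ANGSD options)
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return {(row["chromo"], int(row["position"])): float(row[freq_col])
            for row in reader}


def intersect_mafs(old_lines, new_lines, freq_col="knownEM"):
    """Return (chromo, position, freq_old, freq_new) for sites present in both inputs."""
    old = read_mafs(old_lines, freq_col)
    new = read_mafs(new_lines, freq_col)
    shared = sorted(old.keys() & new.keys())
    return [(chrom, pos, old[(chrom, pos)], new[(chrom, pos)])
            for chrom, pos in shared]
```

The two frequencies at a site are only comparable if the same allele was treated as minor in both runs, which this sketch does not check.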

Imputation and association testing using BEAGLE

vcf2bgl.sh

  • convert GATK vcf to BEAGLE format

phase.sh

  • phase genotypes and impute missing values

assoc.sh

  • association testing on imputed haplotypes, looking for evidence of selection between old and modern populations
    • this analysis parallels the likelihood ratio testing with ANGSD

c2h.py and c2h.sh

  • extract haplotypes from BEAGLE results

Differentiation between European and Africanized bees

ahb.sh

  • calculate Fst between populations with European and African ancestry using vcftools
    • note: output files manually moved into the data directory
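
For intuition about what a per-SNP Fst value measures, the sketch below implements Hudson's estimator from two sample allele frequencies. This is a deliberately simpler estimator than the Weir and Cockerham one that vcftools reports, so its values will not match the vcftools output exactly. The function name is hypothetical.

```python
def hudson_fst(p1, n1, p2, n2):
    """Hudson's per-SNP Fst estimate.

    p1, p2 : sample frequencies of the same allele in populations 1 and 2
    n1, n2 : number of sampled chromosomes in each population (each > 1)
    Returns None when the site is fixed for the same allele in both samples.
    """
    numerator = ((p1 - p2) ** 2
                 - p1 * (1.0 - p1) / (n1 - 1)
                 - p2 * (1.0 - p2) / (n2 - 1))
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if denominator == 0.0:
        return None  # no variation, Fst undefined
    return numerator / denominator


if __name__ == "__main__":
    # Made-up frequencies for demonstration
    print(hudson_fst(0.9, 20, 0.2, 20))
```

Estimates for single SNPs can be slightly negative when the two samples have similar frequencies; that is a property of the estimator and not an error.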

angsd_ahb.sh

  • attempt to compute Fst using ngsutils
    • this approach has not worked, because the number of SNP calls differs between samples
    • I have given up on this for now, focusing instead on the vcftools analysis

Plotting differentiation between populations

angsd2bgl.sh

  • generate BEAGLE-formatted data from ngs count data

ngsAdmix.sh

  • use NGSadmix to infer ancestral population clusters

pca.sh

  • compute covariance matrix using posterior probabilities of genotypes computed by angsd.sh
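
One way to picture this step: convert each individual's posterior genotype probabilities into an expected allele dosage, then compute a frequency-normalized covariance between individuals. The sketch below does this in plain Python. The normalization by 2p(1-p) is an assumption chosen for illustration and may differ from the weighting used in pca.sh; the function names are hypothetical.

```python
def expected_dosages(post):
    """post[i][s] = (P(g=0), P(g=1), P(g=2)) for individual i at site s.

    Returns the expected number of minor alleles, E[g] = P(g=1) + 2 * P(g=2).
    """
    return [[p1 + 2.0 * p2 for (_p0, p1, p2) in indiv] for indiv in post]


def genotype_covariance(post):
    """Individual-by-individual covariance matrix averaged over polymorphic sites."""
    dos = expected_dosages(post)
    n_ind = len(dos)
    n_sites = len(dos[0])
    cov = [[0.0] * n_ind for _ in range(n_ind)]
    used = 0
    for s in range(n_sites):
        col = [dos[i][s] for i in range(n_ind)]
        freq = sum(col) / (2.0 * n_ind)  # estimated allele frequency
        if freq <= 0.0 or freq >= 1.0:
            continue  # skip monomorphic sites
        scale = 2.0 * freq * (1.0 - freq)
        centered = [d - 2.0 * freq for d in col]
        for i in range(n_ind):
            for j in range(n_ind):
                cov[i][j] += centered[i] * centered[j] / scale
        used += 1
    if used == 0:
        raise ValueError("no polymorphic sites")
    return [[value / used for value in row] for row in cov]
```

The eigenvectors of the resulting matrix would give the principal components.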

Still left to do

  • intersect BEAGLE and ANGSD results
  • iEHH (using rehh package in R)
  • visualize data
  • look at genes in BEAGLE haplotype blocks

ithaca-bees's People

Contributors

jatinarora-upmc, mikheyev