This is the code repository for my master's thesis: Inferring natural selection from pool-sequencing allele frequency data -- an applied statistical genetics project on drought tolerance in Brassica napus at FiBL.
Structure
~/plots: all figures in the thesis
~/renv: R package environment
~/scripts: all R code written and used for the thesis
- combine_replicates.R: all data preprocessing
- darmor2shengli.R: mapping of SNP positions found in the GWAS literature to the reference genome used for genotyping our data
- allele_frequency_means.R: all descriptive statistics per generation, per haplotype, and per chromosome
- run_equaltest.R: run the pairwise testing approach from section 4.1, with (zero-inflated) negative binomial model fits and a likelihood ratio test, across all haplotypes
- run_betareg.R: run the second extension using beta regression from section 4.2
- evaluate_equaltest.R: evaluate the results produced by run_equaltest.R for section 5.1.2
- evaluate_betareg.R: evaluate the results produced by run_betareg.R for section 5.1.3
- figures.R: code producing all figures used in the thesis
- functions: a directory containing R scripts with all functions called by the run_*.R and evaluate_*.R scripts
  - preprocessing_functions.R: functions used for preprocessing in combine_replicates.R and for preprocessing the meteorological data as described in section 3.2.4
  - equaltest_functions.R: functions used in the *equaltest.R scripts
  - betareg_functions.R: functions used in the *betareg.R scripts
  - plot_functions.R: functions used in figures.R
  - helper_functions.R: convenience functions used throughout
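
Getting started

The structure above suggests a workflow of restoring the package environment and then running the scripts. The following is a minimal sketch, not taken from the thesis. It assumes that R is started in the repository root, that an renv.lock file is present there, and that the scripts are run in the order preprocessing, model fitting, evaluation. That order is inferred from the script descriptions and may need adjusting.

```r
# Sketch under assumptions: working directory is the repository root and
# an renv.lock file exists there (neither is stated in this README).

# Install renv itself only if it is not available yet.
if (!requireNamespace("renv", quietly = TRUE)) {
  install.packages("renv")
}

# Reinstall the package versions recorded in the lockfile.
renv::restore()

# Assumed run order for the pairwise testing approach (section 4.1):
# preprocessing, then model fitting, then evaluation.
source("scripts/combine_replicates.R")
source("scripts/run_equaltest.R")
source("scripts/evaluate_equaltest.R")
```

The beta regression analysis would presumably follow the same pattern, with run_betareg.R and evaluate_betareg.R in place of the equaltest scripts.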