erisemberg / peanut-allergy

This repository contains data and code to reproduce analysis in the manuscript, "A mutation in Themis contributes to peanut-induced oral anaphylaxis in CC027 mice"

Home Page: https://doi.org/10.1016/j.jaci.2024.03.027

Dockerfile 0.73% R 98.03% Shell 1.24%
genetics qtl-mapping quantitative-genetics statistical-models

A mutation in Themis contributes to peanut-induced oral anaphylaxis in CC027 mice

This document describes how to reproduce the heritability and QTL analyses of several peanut allergy-related phenotypes in a backcross between CC027 (reacts orally to peanut) and C3H (does not react orally to peanut).

Environment prep

This project uses a Docker container to produce an environment similar to that used in the original analysis (e.g. R v4.2.1 and R package versions available on August 1, 2023). In order to run this container you will need Docker installed.
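
Before building, it can help to confirm that Docker is installed and that its daemon is reachable. The following is a minimal sketch of such a check; the helper names `have_cmd` and `check_docker` are hypothetical and are not part of this repository.

```shell
# Hypothetical helpers (not part of this repository).

# Return 0 if the named command is on PATH, non-zero otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether Docker looks usable before attempting the build.
check_docker() {
    if ! have_cmd docker; then
        echo "docker not found on PATH; install Docker first" >&2
        return 1
    fi
    # `docker info` fails when the daemon is not running or not reachable.
    if ! docker info >/dev/null 2>&1; then
        echo "docker is installed but the daemon is not reachable" >&2
        return 1
    fi
    echo "docker is ready"
}
```

With these defined, `check_docker && docker build . -t pnut` would only start the build when the check passes.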

Build the docker container:

docker build . -t pnut 

Run the docker container, opening a terminal session within the container:

docker run -e PASSWORD=pw123 --rm -v "$(pwd)":/home/rstudio/work -p 8787:8787 -it pnut /bin/bash

Navigate to the working directory:

cd /home/rstudio/work

Prepare R/qtl file

Run the following code to produce a file in the format required by R/qtl.

Rscript rqtl-file-gen-pnutbc.R

This script will generate the following .csv file, which can be imported into R and analyzed using the R/qtl package: derived_data/Rqtl_CC27xC3H_BCv2.csv

Inbred analysis

Analyze inbred parent data:

Rscript parent-analysis.R

This script will generate:

  • figs/parent-temps.png: plot of temperature trajectories in inbred CC027 and C3H mice (Fig. 1A in manuscript)
  • results/inbred-parent-data-summary.csv: means and ranges for each phenotype by strain (Supp. Table 1 in manuscript)
  • results/h2-from-inbred-parents.csv: heritability estimates from inbred parent data (Supp. Table 1 in manuscript)
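
After a script finishes, one way to confirm that it produced the files listed above is to test that each exists and is non-empty. This is a minimal sketch; the function name `check_outputs` is hypothetical, and a non-empty file is only a rough signal that the step succeeded.

```shell
# Hypothetical helper (not part of this repository).
# Print any listed file that is missing or empty; return 1 if any are.
check_outputs() {
    rc=0
    for f in "$@"; do
        # -s is true only when the file exists and has a size greater than zero.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For the inbred analysis this could be called as `check_outputs figs/parent-temps.png results/inbred-parent-data-summary.csv results/h2-from-inbred-parents.csv`.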

Backcross analysis

Note: the steps that produce the genomic relationship matrix for the backcross mice (used in the heritability calculations) have not yet been added to this document. For now, the plink files are included in the source_data directory (.map and .bed were produced by plink_file_setup.R; everything else was produced by the commands in plink.sh).

Generate a report of analysis performed on backcross mice, including heritability estimation and QTL mapping.

Rscript -e 'library(rmarkdown); rmarkdown::render("qtl-analysis.Rmd", "html_document")'

This script will generate a report, qtl-analysis.html, as well as:

  • several figures in the figs directory (including figures that compose Figures 2-4 and Supplemental Figures 1-5 in manuscript)
  • results/h2-from-bc-mice.csv: heritability estimates from backcross mice (Supp. Table 1 in manuscript)
  • results/QTLsummary.csv: summary of QTL mapping results (Table 1 in manuscript)

Candidate gene analysis

First, unzip the compressed directory of VCF files for each QTL region:

unzip source_data/VCFs.zip -d source_data/

Then unzip the GTF file representing all genes in the mouse genome:

gunzip --keep source_data/Mus_musculus.GRCm38.97.gtf.gz

Create a map between Ensembl IDs and gene symbols for use by later scripts. This will produce a CSV file called GRCm38-gtf-genemap.csv.

Rscript make_ensembl_genemap.R
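
To illustrate what such a map contains, the sketch below pulls the `gene_id` and `gene_name` attributes out of the gene records of a GTF file with awk. It is not the logic of make_ensembl_genemap.R, which may differ; the function name `gtf_genemap` and the two-column output layout are assumptions made for this example.

```shell
# Hypothetical helper (not part of this repository).
# Read a GTF stream on stdin and print "gene_id,gene_name" for every
# record whose feature column (column 3) is "gene".
gtf_genemap() {
    awk -F '\t' '
        $0 !~ /^#/ && $3 == "gene" {
            id = ""; name = ""
            # Column 9 holds semicolon-separated key "value" attributes.
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                a = attrs[i]
                sub(/^ +/, "", a)
                if (a ~ /^gene_id "/) {
                    gsub(/^gene_id "|"$/, "", a); id = a
                } else if (a ~ /^gene_name "/) {
                    gsub(/^gene_name "|"$/, "", a); name = a
                }
            }
            # Records without a gene_name are printed with an empty name.
            if (id != "") print id "," name
        }'
}
```

For example, `gtf_genemap < source_data/Mus_musculus.GRCm38.97.gtf | head` would print the first few ID and symbol pairs from the unzipped GTF file.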

Count total genes in each QTL region:

Rscript count_genes.R
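
As a rough cross-check on the counts, gene records overlapping a genomic interval can also be tallied directly from the GTF file. The sketch below is not the logic of count_genes.R; the function name `count_genes_in_region` is hypothetical, and counting any gene that overlaps the interval (rather than only genes fully contained in it) is an assumption.

```shell
# Hypothetical helper (not part of this repository).
# Usage: count_genes_in_region CHR START END < file.gtf
# Counts "gene" records on CHR that overlap [START, END] (1-based, inclusive).
count_genes_in_region() {
    awk -F '\t' -v chr="$1" -v start="$2" -v end="$3" '
        $0 !~ /^#/ && $3 == "gene" && $1 == chr &&
        ($4 + 0) <= (end + 0) && ($5 + 0) >= (start + 0) { n++ }
        END { print n + 0 }'
}
```

The coordinates passed in would be the bounds of a QTL region taken from the mapping results.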

Count candidate genes in each QTL region. The following script loops over all QTL and runs cand_gene_analysis.R for each one, counting the variants (protein-coding and regulatory) within each gene that segregate between C3H/HeJ and CC027.

bash cand_gene.sh

This analysis can take several hours. To run the above command on a SLURM-based high-performance computing cluster, run:

bash cand_gene.sh -m slurm 

This will generate two directories:

  • results/cand-gene-analysis/summary: summary of candidate gene analysis for the region
  • results/cand-gene-analysis/vardata: data on candidate genes in the region, including the number of total variants, regulatory variants and protein-modifying variants
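
For readers unfamiliar with this pattern, the sketch below shows one way a shell script could use a `-m` option to switch between running a job locally and submitting it to SLURM. It is not the contents of cand_gene.sh: the function names and the region argument passed to cand_gene_analysis.R are hypothetical, and the functions only print the command they would run.

```shell
# Hypothetical sketch; the real cand_gene.sh may be organized differently.

# Parse an optional "-m MODE" argument and print the mode (default: local).
parse_mode() {
    mode=local
    OPTIND=1
    while getopts "m:" opt; do
        case "$opt" in
            m) mode=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    echo "$mode"
}

# Print the command that would process one region.
# $1 = mode, $2 = region label (a hypothetical argument to the R script).
job_command() {
    cmd="Rscript cand_gene_analysis.R $2"
    case "$1" in
        # sbatch --wrap submits a plain command string as a batch job.
        slurm) echo "sbatch --wrap=\"$cmd\"" ;;
        local) echo "$cmd" ;;
        *) echo "unknown mode: $1" >&2; return 2 ;;
    esac
}
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without R or SLURM installed.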

Create a map between mouse gene symbols and human homologs:

Rscript make_mouse2human_genemap.R

Summarize the candidate gene data in a table:

Rscript summarize_cand_genes.R

This script will produce a summary of the candidate gene analysis results, in results/cand-gene-summary.csv (Table 2 in manuscript).
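
Because each candidate gene step depends on the output of the one before it, the steps can be chained so that a failure stops the run. The sketch below lists the commands from this section in order and runs them one at a time; the function names `run_steps` and `candidate_gene_steps` and the `DRY_RUN` variable are hypothetical.

```shell
# Hypothetical helpers (not part of this repository).

# Read one command per line from stdin and run each in order,
# stopping at the first failure. With DRY_RUN=1, only print them.
run_steps() {
    while IFS= read -r step; do
        [ -n "$step" ] || continue
        echo "+ $step"
        if [ "${DRY_RUN:-0}" != 1 ]; then
            # </dev/null keeps the step from consuming the remaining lines.
            sh -c "$step" </dev/null || { echo "step failed: $step" >&2; return 1; }
        fi
    done
}

# The candidate gene commands from this section, in order.
candidate_gene_steps() {
    cat <<'EOF'
unzip source_data/VCFs.zip -d source_data/
gunzip --keep source_data/Mus_musculus.GRCm38.97.gtf.gz
Rscript make_ensembl_genemap.R
Rscript count_genes.R
bash cand_gene.sh
Rscript make_mouse2human_genemap.R
Rscript summarize_cand_genes.R
EOF
}
```

Running `candidate_gene_steps | DRY_RUN=1 run_steps` prints the plan without executing anything; dropping `DRY_RUN=1` runs the steps for real.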

Themis analysis

Generate a report of the analysis (linear regression, Tukey's honest significant difference test, and PCA) of flow phenotypes from Themis-variant CC strains:

Rscript -e 'library(rmarkdown); rmarkdown::render("themis-analysis.Rmd", "html_document")'

This script will generate a report, themis-analysis.html, along with several figures in figs/fig5 (Fig. 5 in manuscript) and figs/supplemental/tukey/ (Supp. Fig. 6 in manuscript).
