Git Product home page Git Product logo

hdcqtl's Introduction

Code for eQTL analyses in heterogeneous differentiating cultures

Single-cell preprocessing

  • Code for filtering is available in code/filtering/
  • The output of this sequence is the raw data file available for download from GEO

Classifier development (fetal cell atlas)

  • Raw fetal cell atlas data can be downloaded from https://atlas.brotmanbaty.org/bbi/human-gene-expression-during-development/ (5,000 sampled cells from each cell type, cell annotations, and gene annotations)
  • The workflow for generating fetal cell type gene signatures is at rules/fca_annotation.py. We explored three versions of the classifier: one with all 77 fetal cell atlas subtypes, one with a subset of 33 cell types curated to be more distinctive and found to offer greater classification accuracy on held-out test data, and one augmented set including the 33 fetal cell atlas cell types along with an IPSC signature from Plurinet
    • The relevant code is found in the code/annotation folder
    • Note that the 33 cell type classifier was used for the annotation of cells that was ultimately used for eQTL calling across cell types

Cell type annotation

  • The workflow for generating the fetal cell atlas classifier and classifying HDC cells is at rules/fca_annotation.py
  • Code for this part of the analysis is in the folder code/annotation

UMAP visualization and hierarchical clustering of cell types

  • rules/sc_preprocessing.py contains the rules used to generate the UMAP embedding, with associated code in code/sc_preprocessing/

Static eQTL calling

Pseudobulk aggregation

  • UMI counts for each donor and cell type are first aggregated in analysis/annotation/assign_cellid.ipynb
  • Initial pseudobulk normalization and generation of a few QC metrics is done in code/static_qtl_calling/pseudobulk_tmm-basic-qc.R, followed by manual QC (based on inspection of PC plots) in analysis/static_qtl_calling/eb_cellid/pseudobulk_tmm/basic/pseudobulk_qc.Rmd. After removing outlier samples based on this manual step, we preprocess the pseudobulk data in code/static_qtl_calling/pseudobulk_tmm-basic-agg.R

eQTL calling in each cell type

  • After data wrangling (spread over several rules in rules/static_qtl_calling.py), we call eQTLs using tensorQTL in code/static_qtl_calling/static_qtl_calling/tensorqtl_permutations.py
  • We control the FDR across all genes in code/static_qtl_calling/static_qtl_calling/tensorqtl_fdr.py

Mash multivariate eQTL analysis

  • Snakemake rules for the multivariate (cross-celltype) mash analysis are in rules/mash_qtl_calling.py, with associated code in code/mash_qtl_calling/

Interaction eQTL calling

Trajectory isolation and pseudotime inference

  • Snakemake rules for trajectory isolation and pseudotime estimation are in rules/trajectory_inference.py, with associated code in code/trajectory_inference/

Dynamic eQTL calling

  • Snakemake rules for pseudobulk aggregation (by pseudotime binning) and dynamic eQTL calling are in rules/dynamic_qtl_calling.py, with associated code in code/dynamic_qtl_calling/

Topic modeling

  • Snakemake rules for pseudocell aggregation, topic modeling, and the topic DE analysis are in rules/fast_topics.py, with associated code in code/fast_topics/

Topic eQTL calling

  • Snakemake rules for topic eQTL calling with CellRegMap, including multiple testing correction and post hoc estimation of effect sizes, are in rules/cellregmap_eqtl_calling.py, with associated code in code/cellregmap_eqtl_calling/

Complex trait analysis

  • Snakemake rules for the comparison of cell type eQTLs to schizophrenia GWAS loci are in rules/scz_analysis.py with code in `

Software notes

  • To run the mash analysis, we used flashier, which was not available on anaconda at the time

hdcqtl's People

Contributors

jmp448 avatar

Stargazers

Andrew Wichmann avatar

Watchers

 avatar Andrew Wichmann avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.