dea_seurat's Introduction

scRNA-seq Differential Expression Analysis & Visualization using Seurat and Snakemake

A Snakemake workflow for performing differential expression analyses (DEA) of sc/snRNA-seq data powered by the R package Seurat's functions FindMarkers and FindAllMarkers.

Workflow Rulegraph

Test data

We provide a subset of cells from the following human T cell dataset as a test case for the workflow:

Cano-Gamez, E., Soskic, B., Roumeliotis, T.I. et al. Single-cell transcriptomics identifies an effectorness gradient shaping the response of CD4+ T cells to cytokines. Nat Commun 11, 1801 (2020). https://doi.org/10.1038/s41467-020-15543-y

This work examines the transcriptional patterns of human naïve and memory CD4+ T cells to show that responses to cytokines differ substantially between these cell types. The analysis of the different data modalities is documented in the following repository: https://github.com/eddiecg/T-cell-effectorness

The folder test_data contains two sets of files, prefixed with "Memory_Tcells" and "Naive_Tcells". Each "counts.rds" file contains a raw count matrix, and each "metadata.csv" file contains the corresponding metadata annotation for the cells in that count matrix. Each count matrix and its metadata table can be combined into a Seurat object as the starting point for the workflow.

Steps

The workflow performs the following steps:

  • Differential Expression Analysis (DEA)
    • using Seurat's FindMarkers or FindAllMarkers depending on the configuration (CSV)
    • feature lists per comparison group and direction (up/down) for downstream analyses (e.g., enrichment analysis) (TXT)
    • (optional) feature score tables (with two columns: "feature" and "score") per comparison group using {score_formula} for downstream analyses (e.g., preranked enrichment analysis) (CSV)
  • DEA result statistics: number of statistically significant results split by positive (up) and negative (down) change (CSV)
  • DEA result filtering by
    • statistical significance (adjusted p-value)
    • effect size (log2 fold change)
    • expression (minimum percentage of expression) in one of the comparison groups
  • Log Fold Change (LFC) matrix of filtered features by comparison groups (CSV)
  • Visualizations
    • all and filtered DEA result statistics: number of features and direction (stacked bar plots)
    • Volcano plot per comparison with configured cutoffs for statistical significance and effect size
    • Clustered Heatmaps of the LFC matrix
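
The filtering, statistics, feature-list and LFC-matrix steps above can be sketched in a few lines. This is a minimal, illustrative Python sketch and not the workflow's own code (the workflow runs Seurat in R). The input rows, the "group" and "feature" column names, the cutoff values, and the choice to fill missing matrix entries with 0 are all assumptions made for the example; the remaining column names are meant to resemble typical FindMarkers-style output.

```python
# Illustrative sketch only: mimics the post-processing logic on toy data.
# All rows, column names "group"/"feature", and cutoffs are hypothetical.
results = [
    {"group": "A", "feature": "G1", "avg_log2FC": 2.0, "p_val_adj": 0.001, "pct.1": 0.60, "pct.2": 0.10},
    {"group": "A", "feature": "G2", "avg_log2FC": -1.5, "p_val_adj": 0.01, "pct.1": 0.05, "pct.2": 0.40},
    {"group": "A", "feature": "G3", "avg_log2FC": 0.3, "p_val_adj": 0.001, "pct.1": 0.50, "pct.2": 0.50},
    {"group": "B", "feature": "G1", "avg_log2FC": 1.2, "p_val_adj": 0.2, "pct.1": 0.30, "pct.2": 0.20},
    {"group": "B", "feature": "G2", "avg_log2FC": -2.5, "p_val_adj": 0.0001, "pct.1": 0.02, "pct.2": 0.50},
    {"group": "B", "feature": "G4", "avg_log2FC": 3.0, "p_val_adj": 0.001, "pct.1": 0.01, "pct.2": 0.03},
]

# Assumed cutoffs; in the workflow these would come from the configuration.
ADJ_P_MAX = 0.05  # statistical significance (adjusted p-value)
LFC_MIN = 1.0     # effect size (absolute log2 fold change)
MIN_PCT = 0.1     # minimum fraction of expressing cells in either group


def passes_filters(row):
    """Keep a row only if it passes all three filters."""
    significant = row["p_val_adj"] <= ADJ_P_MAX
    large_effect = abs(row["avg_log2FC"]) >= LFC_MIN
    expressed = max(row["pct.1"], row["pct.2"]) >= MIN_PCT
    return significant and large_effect and expressed


filtered = [row for row in results if passes_filters(row)]

# Result statistics and feature lists, split by direction of change.
groups = sorted({row["group"] for row in results})
stats = {group: {"up": 0, "down": 0} for group in groups}
feature_lists = {(group, d): [] for group in groups for d in ("up", "down")}
for row in filtered:
    direction = "up" if row["avg_log2FC"] > 0 else "down"
    stats[row["group"]][direction] += 1
    feature_lists[(row["group"], direction)].append(row["feature"])

# LFC matrix: one row per feature that passed the filters in any group,
# one column per group. Values are taken from the unfiltered results;
# entries without a result stay at 0.0 (an assumption of this sketch).
kept_features = sorted({row["feature"] for row in filtered})
lfc_matrix = {f: {g: 0.0 for g in groups} for f in kept_features}
for row in results:
    if row["feature"] in lfc_matrix:
        lfc_matrix[row["feature"]][row["group"]] = row["avg_log2FC"]

for group in groups:
    print(group, stats[group])
```

Keeping the three filters in a single predicate makes it easy to see that a feature must pass all of them at once, while the direction split is decided only by the sign of the log2 fold change.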

Configuration

Detailed specifications can be found in ./config/README.md.

Software

This project wouldn't be possible without the following software and their dependencies:

Software Reference (DOI)
EnhancedVolcano https://doi.org/10.18129/B9.bioc.EnhancedVolcano
ggplot2 https://ggplot2.tidyverse.org/
patchwork https://CRAN.R-project.org/package=patchwork
pheatmap https://cran.r-project.org/package=pheatmap
Seurat https://doi.org/10.1016/j.cell.2021.04.048
Snakemake https://doi.org/10.12688/f1000research.29032.2

Contributors

sreichl, roblehmann
