CyTOF Signalling Analysis (CyGNAL)

Repository of the Cell Communication Lab at UCL's Cancer Institute. The Cell Communication Lab studies how oncogenic mutations communicate with stromal and immune cells in the colorectal cancer (CRC) tumour microenvironment (TME). By understanding how mutations regulate all cell types within a tumour, we aim to uncover novel approaches to treat cancer.

In this repo we present CyGNAL, a pipeline for analysing mass cytometry data similar to that used in our Nature Methods paper: Cell-type-specific signaling networks in heterocellular organoids. With code in both Python and R, CyGNAL assumes some preliminary and inter-step processing through the platform Cytobank (although the user could in theory use any other solution for this and the gating steps).

Overview of the current workflow: [workflow diagram]

How to use

The main pipeline steps are in the 'code' folder. Various utilities can be found in code/utils.

The 'Raw_Data' folder contains sample dataset files. The pipeline accepts both FCS and .txt files (the latter as tab-separated data frames).

NOTE: The dataset used in this tutorial is a down-sampled version (5,000 cells per time point, EpCAM/Pan-CK gated) of the small intestinal organoid time-course experiment described in Figure 4 of our paper. The full dataset is available through Cytobank Community. Users will need to register a free Cytobank Community account to access the project, and are encouraged to clone the experiments and explore the data in further detail.

A Brief Step-by-Step Tutorial

  1. (SETUP): Clone the repository and ensure you have all necessary software and dependencies.

  2. Pre-process: Copy all the data files to the 'Raw_Data' folder and run 1-data_preprocess.py. The output files with their antibody panel processed (i.e. measured channels decluttered, empty channels deleted, cell-index assigned) will be saved in the 'Preprocessed_Data' folder, together with a 'panel_markers.csv' file listing all the markers measured in the given experiment.

    Optional (if exporting .txt datasets from Cytobank): Go to the working illustration page (Illustrations - My working illustration), highlight the population(s) of interest, and export events as untransformed text files (Actions - Export - Export events, with 'Include header with FCS filename' unchecked).

    Note: This step is essential for getting the dataset compatible with downstream analysis and has to be performed as the first step in our workflow.
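As a rough illustration of what the pre-processing step produces, the sketch below uses only the Python standard library and is not the code in 1-data_preprocess.py. It assumes that an "empty" channel is one whose values are all zero. The `Cell_Index` column name, the marker names, and the two-column panel layout (marker, flag) are hypothetical choices made for this example.

```python
import csv
import io


def preprocess(tsv_text):
    """Drop all-zero channels, add a cell index, and list the kept markers.

    Assumptions (not taken from 1-data_preprocess.py): an 'empty' channel is
    one whose values are all zero, and the index column is named 'Cell_Index'.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        raise ValueError("no events found in input")
    channels = list(rows[0].keys())
    # Keep a channel only if at least one cell has a non-zero value in it.
    kept = [c for c in channels if any(float(r[c]) != 0.0 for r in rows)]
    cells = []
    for index, row in enumerate(rows, start=1):
        cell = {"Cell_Index": index}
        cell.update({c: float(row[c]) for c in kept})
        cells.append(cell)
    # Every marker starts as 'N'; the user later flips the ones to analyse to 'Y'.
    panel = [(marker, "N") for marker in kept]
    return cells, panel


# Tiny made-up export: three channels, one of which recorded nothing.
example = "marker_A\tunused_channel\tmarker_B\n1.0\t0\t2.5\n3.0\t0\t0.0\n"
cells, panel = preprocess(example)
print(cells)
print(panel)
```

The returned `panel` list plays the role of 'panel_markers.csv' here: one entry per measured marker, all initially unselected.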

  3. UMAP: Move the processed data file(s) and 'panel_markers.csv' to 'Analysis/UMAP_input'. Edit 'panel_markers.csv' and change the flag of every marker to be used in the UMAP analysis from 'N' to 'Y'. Run 2-umap.py, and the output files will be saved within the 'Analysis/UMAP_output' folder. The markers and the indices of the cells used in the analysis will also be saved in the new folder.

    Note: When more than one data file is used as input, each data file can be downsampled to the cell count of the smallest input file (i.e. 'equal' sampling) and the files concatenated prior to UMAP calculation. After the calculation is complete, the concatenated dataset as well as each individual condition are saved with their UMAP coordinates attached.
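The marker selection and 'equal' sampling described in this step can be sketched as follows. This is a minimal, dependency-free illustration rather than the logic of 2-umap.py. The panel layout (marker, flag), the condition names, and the fixed random seed are assumptions made for the example.

```python
import random


def selected_markers(panel):
    """Return the markers flagged 'Y' in a (marker, flag) panel list."""
    return [marker for marker, flag in panel if flag == "Y"]


def equal_downsample(datasets, seed=0):
    """Sample every dataset down to the size of the smallest one and concatenate.

    datasets maps a file/condition name to a list of cells. Each returned item
    is (name, original_index, cell), so the sampled cell indices are retained.
    """
    rng = random.Random(seed)
    n_cells = min(len(cells) for cells in datasets.values())
    combined = []
    for name, cells in datasets.items():
        picked = sorted(rng.sample(range(len(cells)), n_cells))
        combined.extend((name, i, cells[i]) for i in picked)
    return n_cells, combined


# Hypothetical panel and two conditions of unequal size.
panel = [("marker_A", "Y"), ("marker_B", "N"), ("marker_C", "Y")]
datasets = {
    "day_0": [{"marker_A": float(i), "marker_C": 1.0} for i in range(10)],
    "day_1": [{"marker_A": float(i), "marker_C": 2.0} for i in range(4)],
}
markers = selected_markers(panel)
n_cells, combined = equal_downsample(datasets)
# Rows are cells, columns are the selected markers.
matrix = [[cell[m] for m in markers] for _, _, cell in combined]
print(n_cells, len(matrix), markers)
```

A matrix of this shape is what would be handed to a UMAP implementation (umap-learn is listed among the dependencies). The retained `(name, original_index)` pairs correspond to the saved cell indices mentioned above.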

  4. EMD: To perform the EMD calculation (using the tools available in the scprep library), copy the input data files to 'Analysis/EMD_input'. Run 3-emd.py and follow the instructions. By default, the denominator of the EMD calculation is the concatenation of all the input data files, but the user can instead provide a specific denominator data file. EMD scores can be calculated for all channels, but by default the user should place 'panel_markers.csv' in the input folder to specify which markers are to be used. The calculated EMD scores will be saved in 'Analysis/EMD_output', in the 'EMD_arc_no_norm' column of the saved file.
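To make the role of the denominator concrete, here is a minimal sketch of a one-dimensional earth mover's distance (the Wasserstein-1 distance between two empirical distributions), scored per marker against a reference. It is written with the standard library only and is not the scprep routine the pipeline relies on. Any transformation or sign handling that 3-emd.py may apply is not reproduced, and the condition and marker names are hypothetical.

```python
from bisect import bisect_right


def emd_1d(sample, reference):
    """1-D earth mover's distance: area between the two empirical CDFs."""
    xs, ys = sorted(sample), sorted(reference)
    grid = sorted(set(xs) | set(ys))
    distance = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both CDFs are constant on [left, right), so integrate piecewise.
        cdf_x = bisect_right(xs, left) / len(xs)
        cdf_y = bisect_right(ys, left) / len(ys)
        distance += abs(cdf_x - cdf_y) * (right - left)
    return distance


def emd_scores(conditions, markers, denominator=None):
    """Score each (condition, marker) pair against a reference distribution.

    With denominator=None the reference is the concatenation of all
    conditions; otherwise the supplied marker -> values mapping is used.
    """
    scores = {}
    for marker in markers:
        if denominator is None:
            reference = [v for values in conditions.values() for v in values[marker]]
        else:
            reference = denominator[marker]
        for name, values in conditions.items():
            scores[(name, marker)] = emd_1d(values[marker], reference)
    return scores


# Hypothetical two-condition example with a single marker.
conditions = {
    "day_0": {"marker_A": [0.0, 1.0]},
    "day_1": {"marker_A": [2.0, 3.0]},
}
scores = emd_scores(conditions, ["marker_A"])
print(scores)
```

Passing one of the conditions as `denominator` mimics choosing a specific denominator file instead of the pooled default.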

  5. DREMI: To perform DREMI calculation (using the tools available in the scprep library) copy the input data files to 'Analysis/DREMI_input'. Run 4-dremi.py and follow the instructions. As with EMD, DREMI scores of all permutations of marker combinations can be calculated, but we suggest specifying the markers of interest by modifying the 'panel_markers.csv' file. The calculated DREMI scores will be saved in 'Analysis/DREMI_output'.

    Optional: The user is given the option to save the density-resampled plots for data inspection and to perform a standard deviation-based outlier removal step prior to DREMI calculation.
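The optional outlier-removal step can be pictured with the sketch below, which drops cells lying too many standard deviations from the mean of either marker in a pair. It illustrates the general idea only: the threshold of 3 standard deviations, the use of the population standard deviation, and the function name are assumptions, and the DREMI calculation itself is not reproduced.

```python
import statistics


def remove_outliers(pairs, n_sd=3.0):
    """Keep (x, y) cells within n_sd standard deviations of the mean on both axes.

    n_sd=3.0 is an illustrative default, not a value taken from 4-dremi.py.
    """
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, sd_x = statistics.mean(xs), statistics.pstdev(xs)
    mean_y, sd_y = statistics.mean(ys), statistics.pstdev(ys)
    return [
        (x, y)
        for x, y in pairs
        if abs(x - mean_x) <= n_sd * sd_x and abs(y - mean_y) <= n_sd * sd_y
    ]


# Fifty well-behaved cells plus one extreme event on the x marker.
pairs = [(float(i % 5), float(i % 5)) for i in range(50)] + [(100.0, 2.0)]
filtered = remove_outliers(pairs)
print(len(pairs), len(filtered))
```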

  6. Heatmap: To visualise EMD/DREMI scores in heatmaps, copy the EMD/DREMI calculation outputs to the 'Analysis/Vis_Heatmap' folder. Run 5v1-emd_dremi_htmp.py and follow the instructions in the GUI. The script accepts only one EMD data file and one DREMI data file (with 'EMD' and 'DREMI' in their file names respectively) to be visualised.
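Before launching the heatmap script, it can help to check that the folder contents satisfy the one-file-per-score-type rule. The helper below is a hypothetical pre-flight check, not part of the pipeline. It assumes the keyword match is a case-sensitive substring test, and the file names are invented for the example.

```python
def pick_input(file_names, keyword):
    """Return the single file whose name contains keyword, or None if absent.

    Raises ValueError when more than one file matches, since only one file
    per score type is accepted.
    """
    matches = [name for name in file_names if keyword in name]
    if len(matches) > 1:
        raise ValueError(
            f"expected at most one file containing {keyword!r}, found {len(matches)}"
        )
    return matches[0] if matches else None


# Hypothetical contents of 'Analysis/Vis_Heatmap'.
folder = ["EMD_timecourse.txt", "DREMI_timecourse.txt", "notes.md"]
emd_file = pick_input(folder, "EMD")
dremi_file = pick_input(folder, "DREMI")
print(emd_file, dremi_file)
```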

  7. Principal component analysis (PCA): To perform PCA and visualise the results, copy the EMD/DREMI calculation outputs to the 'Analysis/Vis_PCA' folder. Run 5v2-pca.py and follow the instructions in the GUI.

Dependencies

We strongly encourage using conda to set up an environment from 'conda_env.yml'.

  • Python: Tested with Python v3.6 and v3.7. Used in the backbone of the workflow and most computational steps.

    • fcsparser
    • fcswrite
    • numpy
    • pandas
    • plotly
    • rpy2
    • scprep
    • scikit-learn (imported as sklearn)
    • umap-learn
  • R: Tested with R v3.6.1 and RStudio v1.2.5001. Mostly used for visualisation, but also for computing the PCA.

    • DT
    • factoextra
    • FactoMineR
    • flowCore
    • GGally
    • Hmisc
    • MASS
    • matrixStats
    • plotly
    • psych
    • RColorBrewer
    • shiny
    • tidyverse
  • Bourne shell:

    • Rscript

Authors

This work is under active development by Ferran Cardoso (@FerranC96) and Dr. Xiao Qin (@qinxiao1990). It also builds on original work by Pelagia Kyriakidou.
