
repseq-annotation-tutorial's Introduction

CCBYSA

Immune repertoire annotation: a RepSeq data analysis tutorial

Splash

Introduction

This tutorial covers basic aspects of Immune Repertoire Sequencing (RepSeq) data analysis, focusing on T-cell receptor (TCR) repertoires:

  • Repertoire diversity analysis

  • Segment usage analysis

  • Repertoire overlap analysis

  • Annotation of antigen-specific TCR sequences
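To give a feel for the first three items, here is a minimal base R sketch. It is not the code from `tutorial.Rmd`: the clonotype counts, the V segment assignments, and the choice of Shannon entropy and Jaccard index are assumptions made for illustration, and the tutorial may use different measures.

```r
# Hypothetical toy repertoires: names are CDR3 amino acid sequences,
# values are read counts (all made up for illustration)
sample_a <- c(CASSLGQYEQYF = 5, CASSPGTGELFF = 3, CASSQDRGNEQFF = 2)
sample_b <- c(CASSLGQYEQYF = 1, CASSIRSSYEQYF = 4)

# Diversity: Shannon entropy of clonotype frequencies (natural log).
# Assumes all counts are strictly positive.
shannon <- function(counts) {
  p <- counts / sum(counts)
  -sum(p * log(p))
}

# Segment usage: fraction of reads carried by each V segment
segment_usage <- function(segments, counts) {
  usage <- vapply(split(counts, segments), sum, numeric(1))
  usage / sum(usage)
}

# Overlap: Jaccard index on the sets of clonotypes seen in two samples
jaccard <- function(a, b) {
  shared <- intersect(names(a), names(b))
  length(shared) / length(union(names(a), names(b)))
}

shannon(sample_a)
# V segment assignment below is hypothetical
segment_usage(c("TRBV7-9", "TRBV7-9", "TRBV20-1"), sample_a)
jaccard(sample_a, sample_b)
```

Each function takes plain named vectors, so the same calls can be applied to every sample and the results compared across samples.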

The main idea of this tutorial is to demonstrate the immense amount of information encoded in immune repertoires and the ability to decode relevant characteristics from the RepSeq data using relatively simple bioinformatic/data mining methods.

Given a set of unlabeled samples drawn from different donors, T-cell subpopulations and phenotypes (generated as shown below), we can reliably infer the origin of each sample and even some properties of the immunological status of a (relatively) healthy donor.

Splash

This analysis uses 16 samples of 10,000 random reads each from two donors in the Qi et al. (PNAS, 2014) study; sample labels and TCR nucleotide sequences have been removed.

Pre-requisites / setup

This tutorial should run fine on Mac, Linux and Windows. Set it up as follows:

  1. Install RStudio which can be downloaded from https://www.rstudio.com

  2. Execute the following code in the RStudio console to install the required R packages:

install.packages(c("data.table","dplyr","reshape2","ggplot2","NMF","scales","forcats","parallel","stringr"))

You can also install these packages manually from the Tools -> Install Packages... menu.

N.B. The tutorial relies on widely used R packages, so if you run into problems during installation, the best option is to search for the error message (e.g. on StackOverflow) and try to fix it yourself. For example, most problems on Mac are solved by installing the missing Xcode software.
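When troubleshooting, it helps to know which packages are still missing. A minimal sketch, assuming the package list from the install command above; `missing_packages` is a helper name introduced here, not part of the tutorial.

```r
# Packages named in the install command above
required <- c("data.table", "dplyr", "reshape2", "ggplot2", "NMF",
              "scales", "forcats", "parallel", "stringr")

# Return the packages whose namespace cannot be loaded
# (hypothetical helper, introduced for this example)
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# An empty result means every required package is available
missing_packages(required)
```

Any names it returns can be passed to `install.packages()` again, and the error printed for that one package is the message to search for.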

  3. Familiarize yourself with RStudio/R markdown by watching this video.

  4. Download the contents of this repository by clicking the Clone or download button (you can also use git clone).

  5. Navigate to the folder containing the tutorial, then open the tutorial.Rmd R markdown notebook.

After that, you can execute the parts of the analysis in sequence, arriving at a set of basic RepSeq analysis results.

Post-analysis

After obtaining the results summarizing all 16 samples, you should be able to label each sample with the following categories:

  1. Donor ID, first or second donor (the order doesn't matter)
  2. Cell phenotype, memory or naive
  3. Cell subset, CD4 or CD8
  4. Donor CMV status, cmv+ or cmv-
  5. Donor HLA allele, in X*XX format (where applicable)
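One way to record your answers is a table with one row per sample and one column per category. This is only a sketch: the sample identifiers and column names are hypothetical, and the allowed values are taken from the list above.

```r
# Empty label sheet: one row per sample, one column per category.
# Sample identifiers are hypothetical; use the names from your own results.
labels <- data.frame(
  sample    = sprintf("sample_%02d", 1:16),
  donor     = NA_character_,  # first or second donor
  phenotype = NA_character_,  # "memory" or "naive"
  subset    = NA_character_,  # "CD4" or "CD8"
  cmv       = NA_character_,  # "cmv+" or "cmv-"
  hla       = NA_character_,  # X*XX format, where applicable
  stringsAsFactors = FALSE
)

# Allowed values for the categories with a fixed vocabulary
allowed <- list(
  phenotype = c("memory", "naive"),
  subset    = c("CD4", "CD8"),
  cmv       = c("cmv+", "cmv-")
)

# Stop with an error if any filled-in label is outside the allowed values
check_labels <- function(labels) {
  for (col in names(allowed)) {
    values <- labels[[col]]
    stopifnot(all(is.na(values) | values %in% allowed[[col]]))
  }
  invisible(TRUE)
}

check_labels(labels)
```

Cells left as `NA` are treated as "not yet decided", so the check can be rerun as you fill in the sheet.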

Slides

The slides can be found here.

P.S.

If you're interested in learning how to work with our standalone Java tools, check out this repository.

repseq-annotation-tutorial's People

Contributors

mikessh


repseq-annotation-tutorial's Issues

Rmd fails

Sorry to bother you again, but here too I have issues running the Rmd code.
It dies on line 273, complaining about:

Error: 1 components of `...` were not used.
We detected these problematic arguments: `fun`
Did you misspecify an argument?
