BioC2017_TCGA_GTEx_BOF

This repository contains a brief outline for a 'Birds-of-a-Feather' (BOF) session to meet and discuss the analysis of publicly available cancer data from TCGA and GTEx.

TCGA data

  1. 39 projects
  2. 29 primary sites
  3. 14,551 cases
  4. 274,724 files
  5. 22,144 genes
  6. 3,115,606 mutations

GTEx data

  1. 53 tissues
  2. 544 donors
  3. 8,555 samples

Sources for getting TCGA data

  1. NCI's Genomic Data Commons (GDC) website
  2. TCGAbiolinks
  3. Recount2
    • data is made available as a Bioconductor RangedSummarizedExperiment object.
    • data from select publications is also available.
    • gene-level and exon-level data are present.
  4. RTCGAToolbox
  5. UCSC Xena Server
    • contains data from other sources too, such as ICGC, TARGET, GTEx, and TOIL.
    • RNASeq data is provided as log2(RPKM+1); no raw read counts are available to use as input for edgeR or DESeq2.
    • mutation data, copy number data, protein expression (RPPA), DNA methylation, and miRNA isoform expression data are also available.
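  6. ExperimentHub() contains raw RNASeq gene counts from TCGA (GEO accession GSE62944):
library(ExperimentHub)
eh <- ExperimentHub()
# List every ExperimentHub record that mentions TCGA
query(eh, "TCGA")
# GSE62944 raw gene counts: tumor samples (EH164) and normal samples (EH165)
tumor_samples <- eh[["EH164"]]
normal_samples <- eh[["EH165"]]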
  7. TCGA RNASeq data re-normalized using kallisto can be found here
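
The note on UCSC Xena (RNASeq values stored as log2(RPKM+1)) can be made concrete with a small sketch. It inverts the transform to recover RPKM and illustrates why that still does not yield the raw integer counts that edgeR or DESeq2 expect: a count also depends on gene length and library size, which the expression matrix alone does not carry. Everything below (the toy matrix `xena_log2`, the gene lengths, the library sizes) is made up for illustration and is not taken from Xena.

```r
# Toy log2(RPKM + 1) matrix: 3 genes x 2 samples (made-up values, not Xena data)
xena_log2 <- matrix(
  c(0, 1, 3,
    2, 4, 5),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

# Invert the transform: log2(RPKM + 1) -> RPKM
rpkm <- 2^xena_log2 - 1

# RPKM = count / (gene length in kb * library size in millions), so getting
# back to counts needs both quantities. These numbers are hypothetical.
gene_length_kb    <- c(geneA = 1.5, geneB = 2.0, geneC = 0.8)
lib_size_millions <- c(sample1 = 20, sample2 = 35)

# Outer product gives one scaling factor per gene/sample pair
approx_counts <- rpkm * (gene_length_kb %o% lib_size_millions)

print(rpkm)
print(approx_counts)
```

The back-computed values are estimates rather than the original read counts, which is one reason to prefer a source that ships raw counts when count-based methods are planned.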

Sources for getting GTEx data

  1. GTEx website
  2. Recount2
  3. Coming soon: GTEx data will be added to ExperimentHub()

What kind of analysis do you typically do with TCGA/GTEx data? Which packages do you use?

  1. clustering of samples/genes - PCA plots
  2. differential expression analysis between 2 chosen groups using RNASeq data
  3. mutation analysis
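
As a minimal sketch of the first item (clustering samples with a PCA plot), the following uses only base R on a simulated count matrix. The matrix `counts`, the group labels, and the simulated expression shift between groups are all assumptions for illustration. With real data, the matrix would come from one of the sources listed above. The log2(CPM+1) transform used here is a simple stand-in, not the normalization of any particular package.

```r
set.seed(1)

# Simulated raw counts: 200 genes x 6 samples (hypothetical data, not TCGA/GTEx)
n_genes <- 200
group <- factor(rep(c("tumor", "normal"), each = 3))
counts <- matrix(
  rnbinom(n_genes * 6, mu = 100, size = 20),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0(group, 1:3))
)

# Make the first 20 genes 8-fold higher in the tumor samples
counts[1:20, group == "tumor"] <- counts[1:20, group == "tumor"] * 8

# Library-size normalization to counts per million, then log2 transform
lib_size <- colSums(counts)
log_cpm <- log2(t(t(counts) / lib_size) * 1e6 + 1)

# prcomp() expects samples in rows, so transpose the gene x sample matrix
pca <- prcomp(t(log_cpm), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# PCA plot of the first two components, colored by group
plot(pca$x[, 1], pca$x[, 2], col = as.integer(group), pch = 19,
     xlab = sprintf("PC1 (%.0f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.0f%%)", 100 * var_explained[2]))
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3)
```

In this simulation the two groups separate along PC1 because the shifted genes dominate the variance. On real data, batch or tissue effects often show up in the leading components as well.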

Other publicly available databases for Cancer / Cool resources for studying cancer

Acknowledgements

  1. Martin Morgan & the Core Bioconductor Team
  2. GTEx Project - The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from: [insert, where appropriate] the GTEx Portal on MM/DD/YY and/or dbGaP accession number phs000424.vN.pN on MM/DD/YYYY.
  3. The TCGA data discussed here was generated by the TCGA Research Network: http://cancergenome.nih.gov/.

Contributors

  1. stevetsa
  2. sonali-bioc
