snpsea's Introduction

SNPsea: an algorithm to identify cell types, tissues, and pathways affected by risk loci

Home Page: http://www.broadinstitute.org/mpg/snpsea

Documentation: HTML | PDF | Epub

Executable: snpsea-v1.0.3.tar.gz

Data: SNPsea_data_20140520.zip

License: GNU GPLv3

Citation

If you benefit from this method, please cite:

Slowikowski, K. et al. SNPsea: an algorithm to identify cell types, tissues, and pathways affected by risk loci. Bioinformatics (2014). doi:10.1093/bioinformatics/btu326

See the first description of the algorithm and additional examples here:

Hu, X. et al. Integrating autoimmune risk loci with gene-expression data identifies specific pathogenic immune cell subsets. The American Journal of Human Genetics 89, 496–506 (2011). PubMed

Description

SNPsea is an algorithm to identify cell types and pathways likely to be affected by risk loci. It requires a list of SNP identifiers and a matrix of genes and conditions.

Genome-wide association studies (GWAS) have discovered multiple genomic loci associated with risk for different types of disease. SNPsea provides a simple way to determine the types of cells influenced by genes in these risk loci.

Suppose disease-associated alleles influence a small number of pathogenic cell types. We hypothesize that genes with critical functions in those cell types are likely to be within risk loci for that disease. We assume that a gene's specificity to a cell type is a reasonable indicator of its importance to the unique function of that cell type.

First, we identify the genes in linkage disequilibrium (LD) with the given trait-associated SNPs and score the gene set for specificity to each cell type. Next, we define a null distribution of scores for each cell type by sampling random SNP sets matched on the number of linked genes. Finally, we evaluate the significance of the original gene set's specificity by comparison to the null distributions: we calculate an exact permutation p-value.
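The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not SNPsea's implementation. The scoring rule (mean per-gene specificity), the toy data, and the names `score`, `null_distribution`, and `permutation_pvalue` are all assumptions. Random gene sets of equal size stand in for the genes linked to matched null SNP sets. The (r + 1) / (n + 1) estimator is one common convention and may differ from the value SNPsea reports.

```python
import random

def score(gene_set, specificity):
    """Mean specificity of a gene set for one cell type (assumed scoring rule)."""
    return sum(specificity[g] for g in gene_set) / len(gene_set)

def permutation_pvalue(observed, null_scores):
    """Fraction of null scores at least as large as the observed score.

    Adding 1 to numerator and denominator counts the observed set itself,
    so the p-value is never exactly zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)

def null_distribution(all_genes, set_size, specificity, n_sets, rng):
    """Score random gene sets of the same size as the observed set."""
    return [score(rng.sample(all_genes, set_size), specificity)
            for _ in range(n_sets)]

# Toy data: 100 hypothetical genes; the first 5 are highly specific.
rng = random.Random(0)
genes = [f"gene{i}" for i in range(100)]
specificity = {g: (0.9 if i < 5 else 0.1) for i, g in enumerate(genes)}

observed = score(genes[:5], specificity)
nulls = null_distribution(genes, 5, specificity, 999, rng)
pvalue = permutation_pvalue(observed, nulls)
```

In this toy setup the observed set contains only the highly specific genes, so very few random sets can match its score and the p-value comes out small.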

SNPsea is a general algorithm. You may provide your own:

  1. Continuous gene matrix with gene expression profiles (or other values).
  2. Binary gene annotation matrix with presence/absence 1/0 values.

We provide you with three expression matrices and one annotation matrix. See the Data section of the Manual.

The columns of the matrix may be tissues, cell types, GO annotation codes, or other conditions. Continuous matrices must be normalized before running SNPsea: columns must be directly comparable to each other.
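To make the matrix layout concrete, here is a sketch that writes a tiny continuous matrix in GCT v1.2 layout, the format suggested by the provided GeneAtlas2004.gct.gz file. The `write_gct` helper, the gene identifiers, and the condition names are hypothetical. Check the Data section of the Manual for the identifiers and formats SNPsea actually expects. The values are assumed to be normalized already.

```python
import io

def write_gct(handle, genes, conditions, values, descriptions=None):
    """Write a gene-by-condition matrix in GCT v1.2 layout.

    values[i][j] is the value for genes[i] under conditions[j].
    """
    if len(values) != len(genes):
        raise ValueError("one row of values is required per gene")
    if any(len(row) != len(conditions) for row in values):
        raise ValueError("each row needs one value per condition")
    descriptions = descriptions or genes
    handle.write("#1.2\n")                                # format version line
    handle.write(f"{len(genes)}\t{len(conditions)}\n")    # rows, columns
    handle.write("\t".join(["Name", "Description", *conditions]) + "\n")
    for gene, desc, row in zip(genes, descriptions, values):
        fields = [str(gene), str(desc)] + [format(v, "g") for v in row]
        handle.write("\t".join(fields) + "\n")

# Hypothetical gene identifiers and conditions, for illustration only.
buffer = io.StringIO()
write_gct(buffer, ["1", "2"], ["TissueA", "TissueB"], [[0.5, 1.25], [3.0, 0.0]])
gct_text = buffer.getvalue()
```

Writing to a file handle opened with `gzip.open(path, "wt")` instead of the in-memory buffer would produce a compressed file like the ones shipped with the data.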

Example

SNPsea results for RBC count-associated SNPs in the Gene Atlas.

The heatmap shows Pearson correlation coefficients between pairs of tissue expression profiles. The blue bars show p-values. Statistically significant p-values cross the Bonferroni multiple testing threshold (black line).

We identified BM-CD71+Early Erythroid as the cell type with the most significant enrichment (P < 2e-7) for cell type-specific gene expression relative to 78 other tissues in the Gene Atlas (Su et al. 2004).

SNPsea tested the genes in linkage disequilibrium (LD) with 45 input SNPs associated with count of red blood cells (P <= 5e-8 in Europeans) (Harst et al. 2012). For each of the 79 cell types in the Gene Atlas, we tested a maximum of 1e7 null SNP sets where each null SNP was matched to an input SNP on the number of genes in LD.
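The numbers in this example can be checked with a little arithmetic. The sketch below assumes a family-wise significance level of 0.05, which the text does not state, and treats each null SNP set as contributing 1/n to the p-value. The function names are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def pvalue_resolution(n_null_sets):
    """Smallest nonzero p-value step when each null set counts as 1/n."""
    return 1.0 / n_null_sets

alpha = 0.05            # assumed family-wise error rate; not stated in the text
n_cell_types = 79       # conditions tested in the Gene Atlas
max_null_sets = 10**7   # maximum number of null SNP sets in the example

threshold = bonferroni_threshold(alpha, n_cell_types)
resolution = pvalue_resolution(max_null_sets)
print(f"Bonferroni threshold: {threshold:.2e}")
print(f"p-value resolution:   {resolution:.0e}")
```

Under these assumptions the threshold is about 6.3e-4, and 1e7 null sets resolve p-values down to roughly 1e-7, the same order of magnitude as the reported P < 2e-7.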

We ran SNPsea like this:

options=(
    --snps              Red_blood_cell_count-Harst2012-45_SNPs.gwas
    --gene-matrix       GeneAtlas2004.gct.gz
    --gene-intervals    NCBIgenes2013.bed.gz
    --snp-intervals     TGP2011.bed.gz
    --null-snps         Lango2010.txt.gz
    --out               out
    --slop              10e3
    --threads           8
    --null-snpsets      0
    --min-observations  100
    --max-iterations    1e7
)
snpsea ${options[*]}

# Time elapsed: 2 minutes 36 seconds

# Create the figure shown above:
snpsea-barplot out

Contributing

Please submit an issue to report bugs or ask questions.

Please contribute bug fixes or new features with a pull request to this repository.


snpsea's Issues

AttributeError: 'DataFrame' object has no attribute 'sort'

Hi! Package looks great, can't wait to use it!

Unfortunately, running the tutorial I got this error:

$ snpsea-barplot out
/home/avoda/bin/snpsea-barplot:67: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
  pvalues = pd.read_table(f_pvalues, index_col=0)
/home/avoda/bin/snpsea-barplot:74: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
  index_col=0).drop('Description', axis=1)
Traceback (most recent call last):
  File "/home/avoda/bin/snpsea-barplot", line 250, in <module>
    main()
  File "/home/avoda/bin/snpsea-barplot", line 61, in main
    top=int(args['--top']))
  File "/home/avoda/bin/snpsea-barplot", line 84, in barplot
    pvalues = pvalues.sort('pvalue')[:top]
  File "/gfs/devel/avoda/miniconda3/lib/python2.7/site-packages/pandas/core/generic.py", line 5067, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'

It looks like the script calls a deprecated pandas sort function? https://stackoverflow.com/a/44123892/11265666

Any help would be appreciated.
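For context, DataFrame.sort was deprecated in pandas 0.17 and removed in 0.20 in favor of DataFrame.sort_values. A possible fix for line 84 of snpsea-barplot is sketched below on a toy table. The 'pvalue' column name comes from the traceback; the data and the `top_pvalues` helper are made up, and this has not been tested against the actual script.

```python
import pandas as pd

def top_pvalues(pvalues, top):
    """Return the `top` rows with the smallest p-values."""
    # sort_values replaces the removed DataFrame.sort method.
    return pvalues.sort_values("pvalue")[:top]

# Toy stand-in for the table that snpsea-barplot reads.
pvalues = pd.DataFrame(
    {"pvalue": [0.2, 0.001, 0.05]},
    index=["TissueA", "TissueB", "TissueC"],
)
best = top_pvalues(pvalues, 2)
```

In the script itself, the equivalent change would presumably be replacing pvalues.sort('pvalue') with pvalues.sort_values('pvalue').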

TypeError: ("make_label() missing 3 required positional arguments: 'pvalue',

When I run snpsea-heatmap, it always fails with TypeError: ("make_label() missing 3 required positional arguments: 'pvalue', 'nu 'occurred at index 53', even when I use the data from the example.

I changed the definition 'def make_label(condition, pvalue, nulls, reps):' on line 74 to

def make_label(x):
    condition = x[0]
    pvalue = x[1]
    nulls = x[2]
    reps = x[3]

and that solved the problem.
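An alternative that keeps the original four-argument signature is to unpack each row at the call site. This sketch assumes the function is called through DataFrame.apply with axis=1, which the "occurred at index 53" part of the message suggests. The label format and the toy table are made up, and the actual call in snpsea-heatmap may look different.

```python
import pandas as pd

def make_label(condition, pvalue, nulls, reps):
    """Build a display label; the format here is made up for illustration."""
    return f"{condition} (P={pvalue:.1e}, {nulls}/{reps})"

# Toy table; column names mirror the argument names in the error message.
table = pd.DataFrame({
    "condition": ["TissueA", "TissueB"],
    "pvalue": [0.001, 0.5],
    "nulls": [1, 500],
    "reps": [1000, 1000],
})

# apply(axis=1) passes each row as one Series, so unpack it explicitly.
labels = table.apply(lambda row: make_label(*row), axis=1)
```

Unpacking with *row relies on the column order matching the argument order, so selecting the four columns explicitly before calling apply is the safer choice in real code.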

Pre-compiled binary error on OSX, also cannot compile

Hi!

Downloaded a fresh clone of the repo then tried ./snpsea-osx64 --help and got:


dyld[33963]: Library not loaded: /usr/local/lib/libgsl.0.dylib
  Referenced from: /Users/avoda/Desktop/Snpsea/snpsea/snpsea-master/bin/snpsea-osx64
  Reason: tried: '/usr/local/lib/libgsl.0.dylib' (no such file), '/usr/lib/libgsl.0.dylib' (no such file)
Abort trap: 6

I then installed the library with brew install gsl (as indicated here), but it still throws the same error, even after restarting the MacBook.

Do you know by any chance why this would happen?

Also, is there any chance this will compile on Windows?

Mercurial dependency could be listed

Hi,

I noticed that I needed Mercurial to fetch the Eigen library. I run Ubuntu, and the Makefile requires it to work.
It might be worth adding to the apt-get install list. It also looks like git is needed in the Makefile as well, to get intervaltree.
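Until the install list is updated, a quick pre-build check could catch the missing tools. This is only a sketch: the `missing_tools` helper is hypothetical, hg (Mercurial's command name) and git come from the report above, and make is added on the assumption that the build is driven by the Makefile.

```python
import shutil

def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]

# hg fetches Eigen and git fetches intervaltree, per the report above.
required = ["hg", "git", "make"]
missing = missing_tools(required)
if missing:
    print("Install before running make:", ", ".join(missing))
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses shutil.which from the standard library.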
