Git Product home page Git Product logo

migwas's Introduction

MIGWAS

This software evaluates enrichment of genome-wide association study (GWAS) signals on miRNA-target gene networks (MIGWAS) and partition them into various human tissues with the help of tissue specific miRNA expression data.

Overview

Publication/Citation

Our paper is out!

Sakaue S. et al. Integration of genetics and miRNA-target gene network identified disease biology implicated in tissue specificity. Nucleic Acids Research. doi:10.1093/nar/gky1066.

Please cite this paper if you use the software or any material in this repository.

Requirements

  • python 3.X
  • scipy
  • numpy
  • pandas
  • six
  • argparse
  • math
  • multiprocessing
  • futures

Installation

In order to get started with MIGWAS, you can just clone this repo as follows;

git clone https://github.com/saorisakaue/MIGWAS
cd ./MIGWAS

Usage

Step 0: Prepare your input

All you need is a text file with GWAS summary statistics.

Column Descriptions
1 rsID (optional with --no-rsid flag below.)
2 chromosome
3 BP position
4 z-score (optional)
5 p-value

Please have a look at an example input at ./example/RA_trans.chr12.pos.P.txt (RA GWAS result at chr12).

Step 1: GWAS summary to gene- and miRNA- level P values

This part is based on the excelent work by Masahiro Kanai, which was implemented to calculate the corrected "gene association score" from a GWAS result, according to MAGENTA's method. For detailed explanations, please visit the original repository. Do take care of the input format of the summary statistics. The example command is as follows;

$ python3 ./minimgnt.py score_filename --out output_prefix [--cpus 4] [--not-remove-HLA] [--remove-NA] --no-rsid

Note! Our example data file ./example/RA_trans.chr12.pos.P.txt only contains chr12 summary statistics (due to the size limitation on GitHub), while the subsequent analysis assumes genome-wide one.

Arguments and options

  • score_filename : GWAS summary statistics.
Option name Descriptions Required Default
--out, -o An output prefix. This should preferably be a phenotype name. Yes "your_phenotype"
--cpus, -j a number of cpus used for computation. No 1
--not-remove-HLA do not remove genes in HLA region from a result. No False
--remove-NA remove genes with NA score from the output. No False
--no-rsid use this flag when a score file doesn't contain a rsID column. No False 

Output files will be generated at ./miRNA_P/ and ./gene_P/.

Step 2: MIGWAS analysis for all tissues and specific tissues

The example command is as follows;

$ python3 ./migwas.py --phenotype RA_trans --out miRA_RA [--cpus 4] --iterations 20000 [--output-candidate]

Options

Option name Descriptions Required Default
--phenotype, -p Name of the phenotype of interest (file name prefix from minimgnt output). Yes None
--out, -o Output file prefix. No "your_migwas"
--cpus, -j Number of CPUs to be used. No 1
--iterations, -i Number of permutations to simulate null distributions. No 20000
--output-candidate, -c If you want to output a list of candidate miRNAs and genes associated with the trait, set this flag. No False
--tsi, -t Tissue specificity index threshold for partitioning miRNA's enrichment signal. Details will be in our article. No 0.7

Output

  1. The example enritchment resut output is as follows;
$ head miRA_RA_migwas_result.txt
#tissue	P_value	Fold_change
endothelial_cell_of_hepatic_sinusoid	0.0734928970649366	1.3943485234788573
epithelial_cell_of_proximal_tubule	0.22659192222025776	0.9410164034752201
keratinocyte	0.3550971896625686	0.7721248194464695

Each cell's partitioned enrichment P value and fold change, as well as miRNA-gene enrichment for all tissues will be described.

  1. The example candidate output is as follows;
$ head DIAGRAM_DM_candidates.txt
99	ZNF148	hsa-mir-548aq	MIMAT0022263
99	ZNF148	hsa-mir-548aq	MIMAT0022264
99	C5orf24	hsa-mir-548aq	MIMAT0022263
99	C5orf24	hsa-mir-548aq	MIMAT0022264

Each row shows the candidate trait-associated miRNA-gene pairs.

Column Descriptions
1 The percentile of scores indicating gene-miRNA target prediction certainty.
2 The target gene symbol.
3 The miRNA ID.
4 The mature miRNA ID.

Please note that the candidate output list holds redundancy, because the pairs supported by higher score are included those with lower scores.

Acknowledgements

  • Tissue specific miRNA-gene enrichment analysis was made possible by the awesome work from FANTOM5, a comprehensive expression catalog of miRNA expression in varitous human cells. The original data can be found here.
  • The original MAGENTA was written by Ayellet Segre, Mark Daly, and David Altshuler of The Broad Institute.
    • Ayellet V. Segrè, DIAGRAM Consortium, MAGIC investigators, Leif Groop, Vamsi K. Mootha, Mark J. Daly, and David Altshuler (2010). Common Inherited Variation in Mitochondrial Genes is not Enriched for Associations with Type 2 Diabetes or Related Glycemic Traits. PLoS Genetics Aug 12;6(8). pii: e1001058.
  • Minimgnt (miniMAGENTA) part was written by Masahiro Kanai, reimplementing the calculation of "gene associatino score" feature in Python.

Licence

This software is freely available for academic users. Usage for commercial purposes is not allowed. Please refer to the LICENCE page.

Creative Commons License

migwas's People

Contributors

saorisakaue avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

migwas's Issues

Interpreting results from STEP 1/2 and paper request

Hi,
I heard this work presented at ASHG 2018 and would like to apply this to my work.

Would you be able to provide me a pre-print of your paper "Integration of genetics and miRNA-target gene network identified disease biology implicated in tissue specificity" so that I can follow the concept/idea/approach applied in this tool?

TWO additional questions:

  1. How do I interpret the values in the result files from STEP 1 in ./miRNA_P and ./gene_P? I understand that these are p-values per miRNA and per gene. If testing for only one binary phenotype, would the p value cutoff be 0.05 or do I need to adjust it for the number of genes or miRNA that was tested?
  2. And as for the results from STEP2, what is the appropriate p-value cutoff when testing only for one binary phenotype?

thanks for your time and help - looking forward to your reply.

Best wishes,
Ruth
[email protected]

New cell type specific miRNA data

Hello:

I am interesting to use this software, but is there any way to include a new cell type specific RNA-seq data along with published data together? We have our own cell-type specific miRNA-Seq data, do you have instruction how to incorporated our data in this program? Thanks.

MIGWAS for hg38 build?

Hi Saori,
I've used your tool for hg37 but wondering if you have updates suitable for hg38? I believe your current input miRNA files are based on hg37.

thanks!
Ruth

migwas.py showing error.

Hello,

I tried running migwas.py with the given example data. Unfortunately it shows the following error message as shown in the screenshot.
Kindly help me understand how to solve it. Thank you in advance.
migwas_error

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.