hmvar's People

Contributors

surh

hmvar's Issues

Correlation between different tests

The script should take results from both vMWAs and MKtest and compare them.

It should be able to:

  1. Compare the results either by estimate or p-value
  2. Compare all or a subset of genes
  3. Compare within a genome or across many genomes
  4. Produce plots
  5. Use different significance thresholds
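As a rough illustration of the comparison described above, the sketch below correlates p-values from two tests with a Spearman rank correlation and counts genes significant in both. The input format (a dict from gene id to p-value) and the names `compare_tests`, `res_a`, and `res_b` are assumptions made for this example and are not part of hmvar.

```python
from math import sqrt


def ranks(values):
    """Return 1-based average ranks, sharing the rank among tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def compare_tests(res_a, res_b, genes=None, alpha=0.05):
    """Compare two hypothetical {gene: p_value} result sets on shared genes."""
    shared = sorted(set(res_a) & set(res_b))
    if genes is not None:
        wanted = set(genes)
        shared = [g for g in shared if g in wanted]
    if not shared:
        return {"n_genes": 0, "spearman": float("nan"), "both_significant": 0}
    pa = [res_a[g] for g in shared]
    pb = [res_b[g] for g in shared]
    # Spearman correlation is the Pearson correlation of the ranks.
    rho = pearson(ranks(pa), ranks(pb))
    both = sum(1 for a, b in zip(pa, pb) if a < alpha and b < alpha)
    return {"n_genes": len(shared), "spearman": rho, "both_significant": both}
```

The same function could be called once per genome or on pooled results, and the `genes` argument restricts the comparison to a subset. Comparing by estimate instead of p-value only requires passing estimates in the dicts and ignoring the significance count.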

Number of variable/fixed sites

The script should take output from MIDAS and plot and analyze the number of variable sites per genome per sample.

The script should be able to:

  1. Take different depth and minor allele frequency (MAF) thresholds.
  2. Analyze any number of genomes.
  3. Analyze any number of groups for the samples.
  4. Produce absolute count and percentage plots.
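The counting step behind these requirements might look like the sketch below. It assumes a simplified input of `(site, sample, depth, maf)` tuples and a dict mapping samples to groups. This is not MIDAS's actual file layout, and the function name and default thresholds are hypothetical. A site counts as covered when its depth meets `min_depth`, and as variable when its minor allele frequency also meets `maf_thres`.

```python
from collections import defaultdict


def count_variable_sites(records, sample_groups, min_depth=5, maf_thres=0.05):
    """Count covered and variable sites per sample.

    records: iterable of (site, sample, depth, maf) tuples (assumed format).
    sample_groups: dict mapping sample id to group label (assumed format).
    """
    counts = defaultdict(lambda: {"covered": 0, "variable": 0})
    for _site, sample, depth, maf in records:
        if depth < min_depth:
            continue  # site not covered well enough in this sample
        entry = counts[sample]
        entry["covered"] += 1
        if maf >= maf_thres:
            entry["variable"] += 1

    summary = {}
    for sample, entry in counts.items():
        summary[sample] = {
            "group": sample_groups.get(sample),
            "variable": entry["variable"],
            "covered": entry["covered"],
            # Every sample listed here has at least one covered site.
            "percent": 100.0 * entry["variable"] / entry["covered"],
        }
    return summary
```

Running this once per genome gives the absolute counts and percentages that the plots would display, with the group label available for faceting or coloring.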

test data

Need to include some test data.

Some basic output from MIDAS must be here. The data must include:

  1. 3 genomes
  2. multiple genes per genome
  3. Some non-CDS sites
  4. Different types of sites (i.e. Dn, Ds, Pn, Ps, invariant).
  5. A mapping file with at least three groups.

gsea binding of results changes p-values

There is a major bug, of unknown cause, in the gsea function. Calling term_gsea on each term produces a list with the correct p-values and sizes, but combining that list with either bind_rows or rbind (or map_dfr, which uses bind_rows) changes all the numeric values.

I cannot reproduce the error with simple tibble creation.

Incorrect assignment of allele when MAF=0.5

When the minor allele frequency is equal to 0.5 and freq_thres = 0.5, the locus is always assigned to the major allele. It should be modified to discard such ties. Most will come from cases where there are only two reads.
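A minimal sketch of the proposed behavior follows. It assumes that the call is made from read counts, and that the minor allele is defined across samples, so its frequency within one sample can fall on either side of the threshold. The function name `assign_allele` and its arguments are hypothetical; only `freq_thres` comes from the issue.

```python
from math import isclose


def assign_allele(minor_reads, depth, freq_thres=0.5):
    """Return "minor", "major", or None when the call should be discarded."""
    if depth <= 0:
        return None  # no reads, nothing to call
    freq = minor_reads / depth
    if isclose(freq, freq_thres):
        return None  # tie: frequency sits exactly on the threshold
    return "minor" if freq > freq_thres else "major"
```

With two reads split one to one, `assign_allele(1, 2)` returns `None` instead of defaulting to the major allele, which is the case the issue singles out.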

plot_genes executable

The script should be able to plot SNPs from genes in MIDAS output. It might require some new functions.

Script must be able to:

  1. Take any number of genes and genomes.
  2. Produce a variety of plots and allow the user to specify which plots to use.

Incorporate GO ontology structure

For the functional enrichment functions (#14): if Gene Ontology is being tested, use the structure of the ontology to get correct p-values, probably via topGO.

sign test

Add a function for a sign test on either DoS or mkratio.
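For reference, an exact two-sided sign test can be sketched as below. It assumes the input is a list of per-gene values (for example DoS, where the null value would be 0) and that values equal to the null are dropped, which is the usual convention. The function name and return format are illustrative only.

```python
from math import comb


def sign_test(values, mu=0.0):
    """Exact two-sided sign test of whether the median of values equals mu."""
    positive = sum(1 for v in values if v > mu)
    negative = sum(1 for v in values if v < mu)
    n = positive + negative  # values equal to mu carry no sign and are dropped
    if n == 0:
        return {"n": 0, "positive": 0, "negative": 0, "p_value": 1.0}
    k = min(positive, negative)
    # Under the null, the number of positive signs is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return {"n": n, "positive": positive, "negative": negative, "p_value": p_value}
```

For a ratio such as mkratio the natural null value would presumably be 1 instead of 0, which is why `mu` is left as a parameter.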

Create functional_enrichment.r executable

Given a file of genes with statistics, a file of annotations in eggNOG-mapper (emapper.py) format, and some significance thresholds, calculate enrichment for the desired annotations.

Ideally it should have an option to take either files or directories with the stats. If directories are passed, it should analyze all files within each directory.

Functional test enrichment functions

We need a test_annotations function that tests all annotations above some count threshold from a set (test_annotations).

We also need a more general process_annotations function that takes some form of table of genes with annotations, and a subset of significant genes to look for enrichments.

See functions in inst/scripts/mktest_enrichments
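One common way to test annotations against a subset of significant genes is a one-sided hypergeometric test, sketched below. This is an assumption about the kind of test intended, not a description of the functions in inst/scripts/mktest_enrichments. The input format (a dict from gene to a set of annotation terms) and the name `enrich_annotations` are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich_annotations(annotations, significant, min_count=3):
    """Test every term carried by at least min_count genes.

    annotations: dict gene -> set of terms (assumed format).
    significant: iterable of gene ids considered significant.
    """
    genes = set(annotations)
    sig = genes & set(significant)
    members = {}
    for gene, terms in annotations.items():
        for term in terms:
            members.setdefault(term, set()).add(gene)

    results = []
    for term in sorted(members):
        size = len(members[term])
        if size < min_count:
            continue  # too few genes to test
        n_sig = len(members[term] & sig)
        p_value = hypergeom_upper_tail(n_sig, size, len(sig), len(genes))
        results.append(
            {"term": term, "size": size, "n_sig": n_sig, "p_value": p_value}
        )
    return results
```

The p-values returned here are not corrected for multiple testing; a correction step would be applied over the whole result list.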

Ka/Ks functions

Calculate Ka/Ks per gene.

Either use the MK table or use another package that incorporates more information.

Gene set enrichment analysis

Perform KS tests on functional groups. The set of functions should be similar to the generic functional enrichment functions (#14).

Probably there should be one wrapper function to call either method.

Alternatively, we should have consistent naming and output.

Some analysis can be done via topGO.
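To make the KS-based approach concrete, the sketch below compares the statistic values of genes inside a functional group against all other genes with a two-sample Kolmogorov-Smirnov test. The p-value uses the standard asymptotic approximation with a small-sample correction, so it is only approximate for small groups. The names `ks_two_sample` and `term_ks` and the input formats are assumptions for illustration and do not describe the existing term_gsea function.

```python
from math import exp, sqrt


def ks_two_sample(x, y):
    """Return (D, approximate p-value) for a two-sample KS test."""
    x, y = sorted(x), sorted(y)
    n, m = len(x), len(y)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(x[i], y[j])
        while i < n and x[i] == v:
            i += 1
        while j < m and y[j] == v:
            j += 1
        # Largest gap between the two empirical distribution functions.
        d = max(d, abs(i / n - j / m))
    en = n * m / (n + m)
    lam = (sqrt(en) + 0.12 + 0.11 / sqrt(en)) * d
    if lam < 0.2:
        return d, 1.0  # the series below converges poorly here; p is ~1
    p = 0.0
    for k in range(1, 101):
        term = 2 * (-1) ** (k - 1) * exp(-2 * k * k * lam * lam)
        p += term
        if abs(term) < 1e-12:
            break
    return d, min(1.0, max(0.0, p))


def term_ks(stats, term_genes):
    """Compare genes in one term against the rest.

    stats: dict gene -> numeric statistic (assumed format).
    term_genes: iterable of gene ids belonging to the term.
    """
    inside = [v for g, v in stats.items() if g in set(term_genes)]
    outside = [v for g, v in stats.items() if g not in set(term_genes)]
    if not inside or not outside:
        return None  # cannot compare an empty group
    d, p = ks_two_sample(inside, outside)
    return {"size": len(inside), "D": d, "p_value": p}
```

Returning the same fields (`size`, `p_value`) as a count-based enrichment function would be one way to get the consistent naming and output mentioned above.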

Basic usage vignette

The vignette should include:

  1. Basic loading of MIDAS data
  2. Metawas, MKtest, DoS analysis.
  3. Visualization of results.
  4. Extraction of significant genes.
