surh / hmvar
Human Microbiome Variant Analysis in R
License: GNU General Public License v3.0
Script should take results from both vMWAs and MKtest and compare the results.
It should be able to:
Should take output from MIDAS and plot/analyze number of variable sites per genome per sample.
The script should be able to:
Need to include some test data.
Some basic output from MIDAS must be here. The data must include:
There is a major bug, of unknown cause, in the gsea function. Calling term_gsea on each term produces a list with the correct p-values and sizes, but combining that list with bind_rows or rbind (or map_dfr, which uses bind_rows) changes all the numeric values.
I cannot reproduce the error with simple tibble creation.
Currently, terms below min_size enter the testing function and return NULL. It would be better if they never called the testing function, and if the testing function didn't have a min_size argument.
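A minimal sketch of that restructuring, written in Python only as a language-neutral illustration (the package itself is R). The names run_term_tests and test_fn are hypothetical and not part of hmvar; the point is just that the size filter runs before the testing function, so the testing function needs no min_size argument.

```python
from typing import Callable, Dict, List


def run_term_tests(
    term_genes: Dict[str, List[str]],
    test_fn: Callable[[List[str]], object],
    min_size: int = 3,
) -> Dict[str, object]:
    """Apply test_fn only to terms that have at least min_size genes.

    Hypothetical helper: small terms are dropped here, so test_fn is
    never called on them and never has to return a placeholder.
    """
    # Filter first, so the testing function only sees eligible terms.
    eligible = {
        term: genes for term, genes in term_genes.items() if len(genes) >= min_size
    }
    return {term: test_fn(genes) for term, genes in eligible.items()}
```

With this layout the result contains no NULL-like entries to clean up afterwards, because undersized terms never reach the test.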
Track internal testing coverage via codecov
Build testing via Travis CI
When the minor allele frequency is equal to 0.5 and freq_thres = 0.5, the locus is always assigned to the major allele. It should be modified to discard such ties. Most will come from cases where there are only two reads.
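One possible shape for the fix, sketched in Python as a language-neutral illustration rather than the package's R code. The function name assign_allele, its arguments, and the rule "above the threshold is minor, below is major" are assumptions made for the example; only the tie handling comes from the note above.

```python
from typing import Optional


def assign_allele(
    minor_reads: int, total_reads: int, freq_thres: float = 0.5
) -> Optional[str]:
    """Return 'minor', 'major', or None when the call is a tie.

    Hypothetical helper: a locus whose minor allele frequency equals
    freq_thres exactly is discarded (None) instead of defaulting to
    the major allele.
    """
    if total_reads <= 0 or not 0 <= minor_reads <= total_reads:
        raise ValueError("read counts must satisfy 0 <= minor_reads <= total_reads > 0")
    freq = minor_reads / total_reads
    if freq == freq_thres:
        # Tie, e.g. one minor read out of two: discard the locus.
        return None
    return "minor" if freq > freq_thres else "major"
```

The exact equality check is safe for a threshold of 0.5; for other thresholds a small tolerance might be preferable.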
Create a function to read the eggnog mapper (emapper.py) annotation format.
I am not sure if it is possible, but at least n_trials should be the same as 'size' in the output of the other functions.
Use DoS function (#11) in dos.r
Set command line arguments via argparser.
The benchmark of imputation via mice still needs to be completed.
Should take the output from MIDAS and select all the data from a subset of genes.
Incorporate unit testing via testthat.
Script should be able to plot SNPs from genes from MIDAS. Might require some functions.
Script must be able to:
For the functional enrichment functions (#14): if Gene Ontology is being tested, use the structure of the ontology to get correct p-values, probably via topGO.
It doesn't work with allele frequencies
List AMOR under Suggests in DESCRIPTION.
Make sure it doesn't break anything.
Add function for sign test for either DoS or mkratio.
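A rough sketch of what such a sign test could compute, in Python as a language-neutral illustration (not the package's R implementation). The name sign_test_sketch and the null_value argument are hypothetical; the assumption is a two-sided exact binomial test on the signs of per-gene statistics, with values equal to the null dropped. For DoS the null would be 0; for a ratio statistic a different null (for example 1) would presumably be passed.

```python
from math import comb
from typing import Dict, Iterable


def sign_test_sketch(values: Iterable[float], null_value: float = 0.0) -> Dict[str, float]:
    """Two-sided exact sign test of values against null_value.

    Hypothetical helper: counts values above and below the null,
    ignores exact ties, and uses a Binomial(n, 0.5) reference.
    """
    vals = list(values)
    n_pos = sum(v > null_value for v in vals)
    n_neg = sum(v < null_value for v in vals)
    n = n_pos + n_neg
    if n == 0:
        # Nothing informative to test.
        return {"n_pos": 0, "n_neg": 0, "p_value": 1.0}
    k = min(n_pos, n_neg)
    # Probability of a split at least as uneven as observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return {"n_pos": n_pos, "n_neg": n_neg, "p_value": min(1.0, 2 * tail)}
```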
Basically, from a file of genes with statistics, a file of annotations in eggnog mapper (emapper.py) format, and some significance thresholds, calculate enrichment for the desired annotations.
Ideally it should have an option to take files or directories with the stats. If directories are passed, it should analyze all files within each directory.
We need a test_annotations function that tests all annotations from a set that are above some count threshold.
We also need a more general process_annotations function that takes some form of table of genes with annotations and a subset of significant genes in which to look for enrichments.
See functions in inst/scripts/mktest_enrichments
To be consistent with other determine_* functions
Function to plot SNP abundances
Calculate Ka/Ks per gene.
Either use the MK table or use another package that incorporates more information.
Perform KS tests on functional groups. The set of functions should be similar to the generic functional enrichment functions (#14).
Probably there should be one wrapper function to call either method.
Alternatively, we should have consistent naming and output.
Some analysis can be done via topGO.
The function test_go already does this automatically via topGO, but the other enrichment functions, gsea and sign_test, rely exclusively on the annotations given by the user.
It was originally written for mktest but probably has general utility. An option to keep only CDS must be added, with its default set to TRUE for backwards compatibility.
Add an option that allows me to subset the genes I want to analyze.
Vignette should include
Obtain a doi via Zenodo.
Create function for DoS from mktest output.
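For reference, the direction of selection statistic is usually defined as DoS = Dn / (Dn + Ds) - Pn / (Pn + Ps). A minimal sketch follows, in Python as a language-neutral illustration rather than the package's R code. The function name and the argument names Dn, Ds, Pn and Ps are assumptions about how the mktest counts are labelled.

```python
from typing import Optional


def direction_of_selection(Dn: int, Ds: int, Pn: int, Ps: int) -> Optional[float]:
    """DoS = Dn / (Dn + Ds) - Pn / (Pn + Ps).

    Hypothetical helper. Dn and Ds are nonsynonymous and synonymous
    fixed differences; Pn and Ps are nonsynonymous and synonymous
    polymorphisms. Returns None when either ratio is undefined.
    """
    if min(Dn, Ds, Pn, Ps) < 0:
        raise ValueError("counts must be non-negative")
    if Dn + Ds == 0 or Pn + Ps == 0:
        # No fixed differences or no polymorphisms: DoS is undefined.
        return None
    return Dn / (Dn + Ds) - Pn / (Pn + Ps)
```

Positive values point toward an excess of nonsynonymous fixed differences, negative values toward an excess of nonsynonymous polymorphism, and genes with an undefined ratio would be skipped.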
Use Weir & Cockerham 1984 estimates for multiple loci