nymph

Randomization tests for noNparaMetric INFerence

Installation

devtools::install_git('https://github.com/holub008/nymph')

Purpose

Randomization methods provide a powerful toolset for performing inference on a wide range of statistics with few distributional constraints on the data. I believe these methods are underutilized, partially as a result of poor library support; existing R packages (e.g. coin & perm) provide a narrow range of functionality or obfuscate the underlying simplicity of the methods from the user. nymph aims to provide the practitioner a simple & generic interface to this class of methods.

Examples

Inference

Using the iris dataset, we'd like to investigate whether the length-to-width ratio of Virginica flower petals differs from that of Virginica sepals. We use the mcrd_test function to test the differences in ratio statistics between petals and sepals.

library(dplyr)

petal_measurements <- iris %>% 
    filter(Species == 'virginica') %>% 
    mutate(len = Petal.Length, 
           width = Petal.Width) %>%
    select(len, width)
sepal_measurements <- iris %>% 
    filter(Species == 'virginica') %>% 
    mutate(len = Sepal.Length,
           width = Sepal.Width) %>%
    select(len, width)

set.seed(55414)
mcrt <- mcrd_test(petal_measurements, sepal_measurements, 
                  lw_ratio = mean(len / width),
                  length_proportion = mean(len / (len + width)))
summary(mcrt)

With results:

Permutation test of 1000 permutations against alternative of two.sided at significance 0.95 
         statistic    ci_lower   ci_upper     actual p_value
          lw_ratio -0.16575289 0.17387946 0.55020960       0
 length_proportion -0.01267495 0.01315527 0.04388741       0

Quite convincing that a difference exists! For a visualization of the observed statistic differences (blue) against their null distributions:

plot(mcrt)

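To show the idea behind the test, here is a minimal base R sketch of a two-sample Monte Carlo randomization test on the same data. It is not nymph's implementation: `randomization_test` and its arguments are hypothetical names, and the p-value is the plain proportion of shuffled differences at least as extreme as the observed one, which may differ from what nymph computes.

```r
# Hypothetical helper, not part of nymph: a two-sample Monte Carlo
# randomization test for the difference in a statistic between two groups.
randomization_test <- function(s1, s2, statistic, trials = 1000) {
  pooled <- rbind(s1, s2)
  n1 <- nrow(s1)
  n <- nrow(pooled)
  observed <- statistic(s1) - statistic(s2)

  # Null distribution: reassign rows to groups at random and recompute
  # the difference in the statistic each time.
  null_diffs <- replicate(trials, {
    idx <- sample.int(n, n1)
    statistic(pooled[idx, , drop = FALSE]) -
      statistic(pooled[-idx, , drop = FALSE])
  })

  list(
    observed = observed,
    # Central 95% interval of the null distribution.
    null_interval = quantile(null_diffs, c(0.025, 0.975), names = FALSE),
    # Two-sided p-value: share of shuffled differences at least as extreme.
    p_value = mean(abs(null_diffs) >= abs(observed))
  )
}

virginica <- iris[iris$Species == "virginica", ]
petals <- data.frame(len = virginica$Petal.Length,
                     width = virginica$Petal.Width)
sepals <- data.frame(len = virginica$Sepal.Length,
                     width = virginica$Sepal.Width)

set.seed(55414)
result <- randomization_test(petals, sepals,
                             function(d) mean(d$len / d$width))
str(result)
```

The observed difference is compared against differences obtained under random relabeling, so no distributional form is assumed for the ratio statistic.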

Power Analysis

Here we perform a power analysis of a contrived experiment - a two-treatment experiment in which the smallest effect size of interest is 1, against otherwise standard normal populations.

gen_data_s1 <- function(){ data.frame(x = rnorm(50)) }
gen_data_s2 <- function(){ data.frame(x = rnorm(50, 1)) }
mcrp <- mcrd_power(gen_data_s1, gen_data_s2, mean = mean(x), median = median(x), 
                    test_trials = 1e2)
summary(mcrp, alpha = .05, alternative = 'two.sided')

With result:

Power analysis of 100 experiments with alternative of two.sided at significance 0.05 
Group sizes:
 sample size
      1   50
      2   50
 statistic power average_effect
      mean     1     -0.9829345
    median  0.99     -0.9878372

We can also visualize the distribution of p-values (from repeated simulation of the experiment):

plot(mcrp, statistic = 'median', alternative = 'two.sided', 
     alpha = NULL)


Or we can visualize how power varies across a range of desired false positive rates (alpha):

plot(mcrp, statistic = 'median', alternative = 'two.sided', 
     alpha = seq(.01, .2, by = .01))

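The power figures above come from repeating the experiment many times and counting how often the test rejects. A rough base R sketch of that idea follows. It assumes a simple two-sided permutation p-value and uses the hypothetical helper `perm_p_value`; nymph's own procedure may differ in detail.

```r
# Hypothetical helper, not nymph's implementation: two-sided permutation
# p-value for the difference in a statistic between two numeric samples.
perm_p_value <- function(x1, x2, statistic, trials = 200) {
  pooled <- c(x1, x2)
  n1 <- length(x1)
  observed <- statistic(x1) - statistic(x2)
  null_diffs <- replicate(trials, {
    idx <- sample.int(length(pooled), n1)
    statistic(pooled[idx]) - statistic(pooled[-idx])
  })
  mean(abs(null_diffs) >= abs(observed))
}

set.seed(55414)
# Simulate the experiment 100 times, keeping one p-value per experiment.
p_values <- replicate(100, perm_p_value(rnorm(50), rnorm(50, 1), median))

# Estimated power at each alpha: share of simulated experiments that reject.
alphas <- seq(.01, .2, by = .01)
power_by_alpha <- sapply(alphas, function(a) mean(p_values <= a))
data.frame(alpha = alphas, power = power_by_alpha)
```

Because the same set of simulated p-values is reused for every alpha, the estimated power curve is non-decreasing in alpha by construction.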

Implementation

  • No dependencies
    • Portability & ease of install
    • Functionality should be transparent to the user
    • Suggests the parallel package, which comes standard with R >= 2.14, for accelerated computations
  • S4 object system
    • Ensure consistency & validity across a range of tests
    • Fewer redundant objects for the user to comprehend
  • Non-standard evaluation wrappers for common use cases
    • Reduce boilerplate for interactive use
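As an illustration of the non-standard evaluation point, the sketch below shows one way a wrapper could accept named statistic expressions such as `mean = mean(x)` and evaluate them against a data frame. `compute_statistics` is a hypothetical name used only for this example; it is not taken from nymph's source.

```r
# Hypothetical wrapper, not nymph's API: capture named statistic expressions
# unevaluated, then evaluate each with the data frame's columns in scope.
compute_statistics <- function(data, ...) {
  # Capture the expressions passed through `...` without evaluating them.
  exprs <- as.list(substitute(list(...)))[-1]
  # Names not found among the columns are looked up in the caller's frame.
  caller <- parent.frame()
  vapply(exprs, function(e) eval(e, data, caller), numeric(1))
}

stats <- compute_statistics(data.frame(x = c(1, 2, 3, 10)),
                            mean = mean(x), median = median(x))
stats
```

The caller writes expressions in terms of column names directly, which is the kind of boilerplate reduction the list above describes for interactive use.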
