ANOVA from scratch

This repo contains code that does statistical variance analysis from good ol' first principles.

Usage outline

python3 anova.py [simple|blocked|twoway] [csv_file] [reps]

where

  • The first argument is one of the three options (simple, blocked or twoway) and specifies what kind of ANOVA to perform. This determines the number of rows in the final ANOVA table. See "Usage detail" below.
  • [csv_file] is the raw data CSV file, with no headers for rows or columns. All rows should have the same length, and so should all columns. The layout for each design is described below.
  • [reps] is the number of repetitions the experiment underwent. Place multiple reps directly underneath each other in the same block, so if 1 rep has 3 rows and 3 cols, then 2 reps should have 6 rows and 3 cols. The program will then interpret the CSV data correctly.
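The row arithmetic for reps can be sketched as below. This is only an illustration, not the code in anova.py: the helper name `read_reps` and the sample numbers are made up, and it assumes each rep is a complete set of rows stacked under the previous one, which may not be exactly how the program groups them.

```python
import csv
import io

def read_reps(csv_text, reps):
    """Parse headerless CSV text and split its rows into `reps` equal groups.

    Assumption: rep 1 occupies the first rows, rep 2 the next rows, and so on.
    """
    rows = [[float(x) for x in row] for row in csv.reader(io.StringIO(csv_text)) if row]
    if len(rows) % reps != 0:
        raise ValueError("row count must be a multiple of reps")
    per_rep = len(rows) // reps
    return [rows[i * per_rep:(i + 1) * per_rep] for i in range(reps)]

# Hypothetical data: 2 reps of a 3x3 layout, stored as 6 rows by 3 cols.
example = """1,2,3
4,5,6
7,8,9
1.5,2.5,3.5
4.5,5.5,6.5
7.5,8.5,9.5
"""
groups = read_reps(example, 2)
print(len(groups), len(groups[0]), len(groups[0][0]))  # reps, rows per rep, cols
```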

Usage detail

Simple

Compare any number of samples. The CSV file should have a column for each sample.
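For reference, the textbook one-way calculation looks roughly like this. It is a sketch with invented data and an invented function name, not a copy of what anova.py does internally.

```python
def one_way_anova(samples):
    """Return (df_between, ss_between, df_within, ss_within, f_stat)."""
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    # Between-sample variation: how far each sample mean sits from the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-sample variation: scatter of the observations around their own mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between = len(samples) - 1
    df_within = n_total - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return df_between, ss_between, df_within, ss_within, f_stat

# Hypothetical data: three samples, i.e. three CSV columns.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
df_b, ss_b, df_w, ss_w, f_stat = one_way_anova(samples)
print(df_b, ss_b, df_w, ss_w, f_stat)
```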

Blocked

Compare any number of samples, taking blocks into account. The CSV should have a col for each sample and a row for each blocking level. With 2 reps, for instance, the number of rows should be 2*blocks (2 reps for each blocking level).
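The blocked calculation can be sketched for the single-rep case as below. The function name and numbers are hypothetical, and handling of multiple reps is left out, so treat it as an outline of the sums of squares rather than the program's actual logic.

```python
def blocked_anova(data):
    """data[i][j] is the observation for blocking level i (row) and sample j (col).

    Single rep only. Returns (ss, df, f) dicts keyed by source of variation.
    """
    n_blocks, n_samples = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n_blocks * n_samples)
    row_means = [sum(row) / n_samples for row in data]
    col_means = [sum(row[j] for row in data) / n_blocks for j in range(n_samples)]

    ss = {
        "samples": n_blocks * sum((m - grand) ** 2 for m in col_means),
        "blocks": n_samples * sum((m - grand) ** 2 for m in row_means),
    }
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Whatever samples and blocks do not explain is treated as error.
    ss["err"] = ss_total - ss["samples"] - ss["blocks"]
    df = {
        "samples": n_samples - 1,
        "blocks": n_blocks - 1,
        "err": (n_samples - 1) * (n_blocks - 1),
    }
    ms_err = ss["err"] / df["err"]
    f = {k: (ss[k] / df[k]) / ms_err for k in ("samples", "blocks")}
    return ss, df, f

# Hypothetical data: 3 blocking levels (rows) by 3 samples (cols).
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
ss, df, f = blocked_anova(data)
print(ss, df, f)
```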

Two-way

Compare any number of samples, taking a second factor and their interaction into account. Just like with blocked, the CSV should have a col for each sample, and a row for each secondary factor level. Scales up with reps just like the blocked design.
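A rough sketch of the two-way sums of squares with replication follows. The row labels mirror the output table (faccol, facrow, inter, err), but the function, the nested-list input shape and the data are assumptions made for illustration, not taken from anova.py.

```python
def two_way_anova(cells):
    """cells[i][j] is the list of reps for row level i and column level j."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_row = b * r * sum((m - grand) ** 2 for m in row_means)
    ss_col = a * r * sum((m - grand) ** 2 for m in col_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss = {
        "faccol": ss_col,
        "facrow": ss_row,
        # Interaction: cell-to-cell variation not explained by either factor alone.
        "inter": ss_cells - ss_row - ss_col,
        # Error: scatter of the reps around their own cell mean.
        "err": sum((x - cell_means[i][j]) ** 2
                   for i in range(a) for j in range(b) for x in cells[i][j]),
    }
    df = {"faccol": b - 1, "facrow": a - 1,
          "inter": (a - 1) * (b - 1), "err": a * b * (r - 1)}
    ms_err = ss["err"] / df["err"]
    f = {k: (ss[k] / df[k]) / ms_err for k in ("faccol", "facrow", "inter")}
    return ss, df, f

# Hypothetical data: 2 row levels, 3 column levels, 2 reps per cell.
cells = [
    [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]],
    [[2.0, 4.0], [3.0, 5.0], [7.0, 9.0]],
]
ss, df, f = two_way_anova(cells)
print(ss, df, f)
```

With 2 row levels, 3 column levels and 2 reps, the degrees of freedom come out as 2, 1, 2 and 6, the same shape as the table in the output example.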

Output Example

python3 anova.py twoway emulsion.csv 2
src       DF          SS          MS          F           p
------  ----  ----------  ----------  ---------  ----------
faccol     2  0.631667    0.315833    15.7917    0.00406881
facrow     1  0.00333333  0.00333333   0.166667  0.697261
inter      2  0.0116667   0.00583333   0.291667  0.757035
err        6  0.12        0.02
total     11  0.766667
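The columns of the table are tied together by simple arithmetic: MS is SS divided by DF, F is a row's MS divided by the error MS, and the total row is the sum of the rows above it. A quick check using the DF and SS values copied from the table (variable names are just for this illustration):

```python
# (source, DF, SS) copied from the example table above.
rows = [
    ("faccol", 2, 0.631667),
    ("facrow", 1, 0.00333333),
    ("inter", 2, 0.0116667),
    ("err", 6, 0.12),
]

# Mean square for each source of variation.
ms = {name: ss / df for name, df, ss in rows}
# F statistic: each effect's mean square over the error mean square.
f_stats = {name: ms[name] / ms["err"] for name in ("faccol", "facrow", "inter")}

total_df = sum(df for _, df, _ in rows)
total_ss = sum(ss for _, _, ss in rows)

for name in ("faccol", "facrow", "inter"):
    print(name, round(ms[name], 6), round(f_stats[name], 4))
print("total", total_df, round(total_ss, 6))
```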

Dependencies

  • tabulate for tabulation
  • scipy just for getting the p-val from the F-stat
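Getting the p-val from the F-stat is a one-liner with scipy. The sketch below assumes the survival function `scipy.stats.f.sf` is used; the repo may call something equivalent instead, such as 1 minus the CDF. The inputs are the faccol row from the output example.

```python
from scipy.stats import f

# F statistic and degrees of freedom from the faccol row of the example table.
f_stat, df_effect, df_err = 15.7917, 2, 6

# Survival function: probability of seeing an F at least this large under the null.
p_value = f.sf(f_stat, df_effect, df_err)
print(p_value)
```

The result should land close to the 0.00406881 shown in the table; any small difference comes from the rounded F value used here.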

Why did I bother?

I did a sophomore-level statistics course in university where we covered ANOVA. I noticed that it's quite a lot of repetitive work, and it seemed easily programmable. And so I programmed it from first principles (no fancy R or pandas) to prove to myself I could. It also helped a great deal in tests and exams (which were online in 2020 thanks to COVID). Since I wrote it myself from first principles, I don't consider it cheating.
