matrixset

R-CMD-check Lifecycle: experimental CRAN status

A matrixset is a container of matrices, each having the same number of rows and columns and the same dimnames. Moreover, each dimname must uniquely identify elements.
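To make these constraints concrete, here is a minimal base R sketch that checks whether a list of matrices would satisfy them. The helper `shares_layout()` is hypothetical and written only for illustration; it is not part of matrixset, whose own validation may differ.

```r
# Hypothetical helper (not part of matrixset): TRUE when every matrix in
# `mats` has the same dimensions and identical dimnames, and when the row
# and column names are unique.
shares_layout <- function(mats) {
  first <- mats[[1]]
  same_shape <- all(vapply(mats, function(m) {
    identical(dim(m), dim(first)) && identical(dimnames(m), dimnames(first))
  }, logical(1)))
  unique_names <- !anyDuplicated(rownames(first)) &&
    !anyDuplicated(colnames(first))
  same_shape && unique_names
}

a <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
b <- a * 10                   # same shape and dimnames as `a`
d <- a
rownames(d) <- c("r1", "r1")  # duplicated row name

shares_layout(list(a, b))  # TRUE
shares_layout(list(a, d))  # FALSE
```

Matrices `a` and `b` could sit together in one container, while `d` breaks both the shared-dimnames and the uniqueness requirement.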

While there is minimal support for NULL dimnames (and that is bound to change at some point in the future), it is strongly recommended to provide meaningful dimnames. One of the main reasons for this is that annotation is impossible with NULL dimnames.

In addition, as alluded to above, a matrixset can store independent row and column annotations. This meta information is stored, and made available, as data frames: one for row information and one for column information. The annotation names are referred to as traits.

This latter feature makes matrixset attractive even when it stores only a single matrix, because several methods have been developed to manipulate matrixsets while accounting for annotations.

Why a matrixset?

Many problems that matrixset can tackle could be solved via a data.frame and more specifically using the tidyverse suite.

Two reasons why you may want to use a matrixset instead are:

  • Object size. The data.frame needed to store the same information as a matrixset can be significantly bigger.
  • Matrix format. You actually need a matrix, for example to run a PCA.
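
Both points can be illustrated in base R, without matrixset itself. The sketch below builds an arbitrary 200 x 50 matrix (simulated data, chosen only for illustration), reshapes it into the long data frame that a tidyverse workflow would typically use, compares object sizes, and runs a PCA directly on the matrix with `prcomp()`. Exact sizes depend on the platform, so none are quoted here.

```r
set.seed(1)
m <- matrix(rnorm(200 * 50), nrow = 200,
            dimnames = list(paste0("row", 1:200), paste0("col", 1:50)))

# Long format: one record per matrix cell, with row and column identifiers.
# as.vector() walks the matrix column by column, hence times/each below.
long <- data.frame(row = rep(rownames(m), times = ncol(m)),
                   col = rep(colnames(m), each = nrow(m)),
                   value = as.vector(m),
                   stringsAsFactors = FALSE)

object.size(m)     # the matrix
object.size(long)  # the long data frame holding the same values

# PCA works on a numeric matrix directly.
pca <- prcomp(m)
dim(pca$x)  # 200 rows, 50 principal components
```

The long data frame repeats a row and a column identifier for every cell, which is where the extra memory goes.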

Installation

The easiest way to install matrixset is from CRAN:

install.packages("matrixset")

Or you can install the development version of matrixset from GitHub with:

# install.packages("devtools")
devtools::install_github("pascalcroteau/matrixset")

Example

In addition to storing multiple matrices that share the same attributes, a matrixset object’s strength is its annotation feature. You create an object from existing matrices and annotation data frames.

library(MASS)
library(tidyverse)
animals <- as.matrix(Animals)
head(animals)
#>                     body brain
#> Mountain beaver     1.35   8.1
#> Cow               465.00 423.0
#> Grey wolf          36.33 119.5
#> Goat               27.66 115.0
#> Guinea pig          1.04   5.5
#> Dipliodocus     11700.00  50.0


animal_info <- MASS::Animals %>% 
  rownames_to_column("Animal") %>% 
  mutate(is_extinct = Animal %in% c("Dipliodocus", "Triceratops", "Brachiosaurus"),
         class = case_when(Animal %in% c("Mountain beaver", "Guinea pig", "Golden hamster", "Mouse", "Rabbit", "Rat") ~ "Rodent",
                           Animal %in% c("Potar monkey", "Gorilla", "Human", "Rhesus monkey", "Chimpanzee") ~ "Primate",
                           Animal %in% c("Cow", "Goat", "Giraffe", "Sheep") ~ "Ruminant",
                           Animal %in% c("Asian elephant", "African elephant") ~ "Elephantidae",
                           Animal %in% c("Grey wolf") ~ "Canine",
                           Animal %in% c("Cat", "Jaguar") ~ "Feline",
                           Animal %in% c("Donkey", "Horse") ~ "Equidae",
                           Animal == "Pig" ~ "Sus",
                           Animal == "Mole" ~ "Talpidae",
                           Animal == "Kangaroo" ~ "Macropodidae",
                           TRUE ~ "Dinosaurs")) %>% 
  select(-body, -brain)
animal_info
#>              Animal is_extinct        class
#> 1   Mountain beaver      FALSE       Rodent
#> 2               Cow      FALSE     Ruminant
#> 3         Grey wolf      FALSE       Canine
#> 4              Goat      FALSE     Ruminant
#> 5        Guinea pig      FALSE       Rodent
#> 6       Dipliodocus       TRUE    Dinosaurs
#> 7    Asian elephant      FALSE Elephantidae
#> 8            Donkey      FALSE      Equidae
#> 9             Horse      FALSE      Equidae
#> 10     Potar monkey      FALSE      Primate
#> 11              Cat      FALSE       Feline
#> 12          Giraffe      FALSE     Ruminant
#> 13          Gorilla      FALSE      Primate
#> 14            Human      FALSE      Primate
#> 15 African elephant      FALSE Elephantidae
#> 16      Triceratops       TRUE    Dinosaurs
#> 17    Rhesus monkey      FALSE      Primate
#> 18         Kangaroo      FALSE Macropodidae
#> 19   Golden hamster      FALSE       Rodent
#> 20            Mouse      FALSE       Rodent
#> 21           Rabbit      FALSE       Rodent
#> 22            Sheep      FALSE     Ruminant
#> 23           Jaguar      FALSE       Feline
#> 24       Chimpanzee      FALSE      Primate
#> 25              Rat      FALSE       Rodent
#> 26    Brachiosaurus       TRUE    Dinosaurs
#> 27             Mole      FALSE     Talpidae
#> 28              Pig      FALSE          Sus

You can then create the object and perform some operations on it.

library(matrixset)
animals_ms <- matrixset(msr = animals, row_info = animal_info, row_key = "Animal")
animals_ms %>% 
    apply_row_dfl(rg = range(.i),
                  qt = quantile(.i, probs = c(.25, .75)))   
#> $msr
#> # A tibble: 56 × 5
#>    .rowname        rg.name     rg qt.name     qt
#>    <chr>           <chr>    <dbl> <chr>    <dbl>
#>  1 Mountain beaver ..1       1.35 25%       3.04
#>  2 Mountain beaver ..2       8.1  75%       6.41
#>  3 Cow             ..1     423    25%     434.  
#>  4 Cow             ..2     465    75%     454.  
#>  5 Grey wolf       ..1      36.3  25%      57.1 
#>  6 Grey wolf       ..2     120.   75%      98.7 
#>  7 Goat            ..1      27.7  25%      49.5 
#>  8 Goat            ..2     115    75%      93.2 
#>  9 Guinea pig      ..1       1.04 25%       2.16
#> 10 Guinea pig      ..2       5.5  75%       4.38
#> # ℹ 46 more rows


animals_ms %>% 
    row_group_by(class) %>% 
    apply_column_dfl(avr = mean)
#> $msr
#> # A tibble: 22 × 3
#>    class        .colname     avr
#>    <chr>        <chr>      <dbl>
#>  1 Canine       body        36.3
#>  2 Canine       brain      120. 
#>  3 Dinosaurs    body     36033. 
#>  4 Dinosaurs    brain       91.5
#>  5 Elephantidae body      4600. 
#>  6 Elephantidae brain     5158. 
#>  7 Equidae      body       354. 
#>  8 Equidae      brain      537  
#>  9 Feline       body        51.6
#> 10 Feline       brain       91.3
#> # ℹ 12 more rows
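
As a sanity check on the first result above, the same per-row `range()` and `quantile()` values can be reproduced in base R. This sketch re-enters the six rows printed by `head(animals)` rather than loading MASS, so it covers only those rows; it illustrates what the row-wise computation does, not how matrixset implements it.

```r
# The six rows shown by head(animals), typed in by hand.
animals_head <- matrix(
  c(1.35, 465, 36.33, 27.66, 1.04, 11700,   # body
    8.1, 423, 119.5, 115, 5.5, 50),         # brain
  ncol = 2,
  dimnames = list(c("Mountain beaver", "Cow", "Grey wolf", "Goat",
                    "Guinea pig", "Dipliodocus"),
                  c("body", "brain"))
)

# apply() over rows returns one column per row, so transpose back.
rg <- t(apply(animals_head, 1, range))
qt <- t(apply(animals_head, 1, quantile, probs = c(.25, .75)))

rg["Cow", ]  # 423 465
qt["Cow", ]  # 25% = 433.5, 75% = 454.5
```

The values agree with the `rg` and `qt` columns of the tibble, which prints them rounded to three significant digits.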
