Git Product home page Git Product logo

maffin's Introduction

MAFFIN

Overview

MAFFIN is an integrated R package for metabolomics sample normalization. MAFFIN main algorithm consists of three modules, high-quality feature selection, MS signal intensity correction, and maximal density fold change normalization. This package also implements commonly used normalization methods and normalization evaluation functions.

Installation

Check if R package "devtools" is installed.

# If not, install R package "devtools" first
install.packages("devtools")

Install MAFFIN

devtools::install_github("Waddlessss/MAFFIN")

Usage

Input

MAFFIN takes feature intensity table (dataframe) as input, with features in row and samples in column by default.

Specifically, the column names are sample names. The first row includes sample group names. The input feature intensity table can be prepared in .csv format. Your input data table should be similar to this:

An example input data table can also be viewed in R:

# View the example input data table
View(TestingData)

In the first row, you need to label samples by their group names. In particular,

  • use blank for blank samples
  • use RT for retention time
  • use QC for normal QC samples
  • use SQC_### for serial QC samples, where "###" means their relative loading amounts. For example, "SQC_1", "SQC_2" and "SQC_3" mean serial QC samples with loading amounts of 1, 2 and 3 uL, respectively.

Note, the group names of real biological samples cannot be any above.

After preparing the input feature intensity table, you are ready to read it to R.

# Read the input feature intensity table in .csv format.
inputTable = read.csv(filename)
# Using testing data as input
# inputTesting = TestingData

MAFFIN main function

# Sample normalization using MAFFIN main function
MAFFINTable = MAFFINNorm(inputTable)

Output

The function MAFFINNorm returns a list contains 4 items.

  • NormedTable is the normalized feature intensity table (data frame) with extra columns of the data processing results.
  • NormFactor is a named numeric vector of calculated normalization factors. Normalization factor reflects the relative concentrations of biological samples.
  • OriPRMAD is a numeric vector of PRMADs from original data before normalization (but after MS intensity correction).
  • NormedPRMAD is a numeric vector of PRMADs from normalized data.

In particular, three columns are added in NormedTable compared to the input.

  • Quality labels the feature quality. For low-quality features, MAFFIN elaborates the reasons including Blank, RT, LowRSD, and LowCor corresponding to four feature selection criteria.
  • Model indicates the regression model for MS intensity correction.
  • SQC_points indicates the number of detected serial QC data that were used in MS intensity correction.

To output the normalized feature intensity table NormedTable to a csv file

# This will output a csv file in your current working directory
MAFFINTable = MAFFINNorm(inputTable, output=TRUE)

# Output to a user-defined directory
outputDir = ""  # define a directory for output
setwd(outputDir)
write.csv(MAFFINNorm(inputTable)[[1]], "normedTable.csv")

More descriptions are provided in R.

# Find more descriptions of MAFFIN output
?MAFFINNorm()

Other normalization functions

# Sum normalization
SumNormedTable = SumNorm(inputTable)

# Median normalization
MedianNormedTable = MedianNorm(inputTable)

# Quantile normalization
QuantileNormedTable = QuantileNorm(inputTable)

# Probabilistic quotient normalization
PQNNormedTable = PQNNorm(inputTable)

Normalization evaluation

MAFFIN package uses intragroup variation to evaluate the normalization results. The lower intragroup variation means better sample normalization.

# Calculate pooled RMAD for normalization evaluation.
prmad = EvaPRMAD(TestingData, GroupNames=c("HY", "SX", "SW", "YC"))

# Calculate pooled RSD for normalization evaluation.
prsd = EvaPRMAD(TestingData, GroupNames=c("HY", "SX", "SW", "YC"))

Citation

Yu, H., & Huan, T. (2021). MAFFIN: Metabolomics Sample Normalization Using Maximal Density Fold Change with High-Quality Metabolic Features and Corrected Signal Intensities. bioRxiv.

maffin's People

Contributors

huaxuyu avatar

Stargazers

Nathaniel G. Mahieu avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.