R/biotmle: Targeted Learning with Moderated Statistics for Biomarker Discovery

Home Page: https://code.nimahejazi.org/biotmle/

R/biotmle

Targeted Learning with Moderated Statistics for Biomarker Discovery

Authors: Nima Hejazi, Mark van der Laan, and Alan Hubbard


What’s biotmle?

The biotmle R package facilitates biomarker discovery through a generalization of the moderated t-statistic (Smyth 2004) that extends the procedure to locally efficient estimators of asymptotically linear target parameters (Tsiatis 2007). The implemented methods modify targeted maximum likelihood (TML) estimators of statistical (or causal) target parameters (e.g., the average treatment effect) by applying variance moderation to the standard variance estimator based on the efficient influence function (EIF) of the target parameter (van der Laan and Rose 2011, 2018). Performing a moderated hypothesis test that pools the individual probe-specific EIF-based variance estimates yields a robust variance estimator, which stabilizes the standard error estimates and improves the performance of such estimators both in smaller samples and in settings where the EIF is poorly estimated. The resulting procedure allows for the construction of conservative hypothesis tests that reduce the false discovery rate and/or the family-wise error rate (Hejazi, van der Laan, and Hubbard 2021). Improvements upon prior TML-based approaches to biomarker discovery (e.g., Bembom et al. 2009) include both the moderated variance estimator and the use of conservative reference distributions for the corresponding moderated test statistics (e.g., the logistic distribution), inspired by tail bounds based on concentration inequalities (Rosenblum and van der Laan 2009); the latter prove critical for obtaining robust inference when the finite-sample distribution of the estimator deviates from normality.
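To make the idea of variance moderation more concrete, the following base R sketch shrinks probe-specific EIF-based variance estimates toward a pooled value and then computes moderated test statistics against a heavier-tailed logistic reference distribution. It is an illustration under simplifying assumptions, not the package's implementation: the simulated inputs (`psi_hat`, `eif_hat`), the fixed prior values (`df_prior`, `s2_prior`), and the unit-variance scaling of the logistic distribution are all hypothetical choices made for this example.

```r
# Illustrative sketch only; NOT the biotmle implementation.
set.seed(7491)
n_obs <- 50       # number of samples
n_probes <- 200   # number of biomarkers (probes)

# Hypothetical inputs: a point estimate and estimated EIF values per probe.
psi_hat <- rnorm(n_probes, mean = 0, sd = 0.2)
eif_hat <- matrix(rnorm(n_probes * n_obs), nrow = n_probes)

# Probe-specific EIF-based variance and its degrees of freedom.
s2 <- apply(eif_hat, 1, var)
df_resid <- n_obs - 1

# Assumed prior values (fixed here for simplicity; an empirical Bayes
# procedure would estimate them from all probes jointly).
df_prior <- 4
s2_prior <- mean(s2)

# Moderated variance: weighted average of the prior and probe-specific values.
s2_mod <- (df_prior * s2_prior + df_resid * s2) / (df_prior + df_resid)

# Moderated test statistic: estimate divided by its moderated standard error.
t_mod <- psi_hat / sqrt(s2_mod / n_obs)

# Two-sided p-values under a normal and a unit-variance logistic reference.
# The scale sqrt(3) / pi is an assumption of this sketch.
p_normal <- 2 * pnorm(-abs(t_mod))
p_logistic <- 2 * plogis(-abs(t_mod), location = 0, scale = sqrt(3) / pi)

# Benjamini-Hochberg adjustment to control the false discovery rate.
p_adj <- p.adjust(p_logistic, method = "BH")
```

In this sketch the moderated variances are less dispersed than the raw ones, which is the stabilizing effect described above, and the logistic reference assigns larger p-values than the normal to statistics far out in the tails.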


Installation

For standard use, install from Bioconductor using BiocManager:

if (!requireNamespace("BiocManager", quietly=TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("biotmle")

To contribute, install the bleeding-edge development version from GitHub via remotes:

remotes::install_github("nhejazi/biotmle")

Current and prior Bioconductor releases are available under branches with numbers prefixed by “RELEASE_”. For example, to install the version of this package available via Bioconductor 3.6, use

remotes::install_github("nhejazi/biotmle", ref = "RELEASE_3_6")

Example

For details on how to best use the biotmle R package, please consult the most recent package vignette available through the Bioconductor project.
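Once the package is installed, its vignettes can also be found from within R. The snippet below is a small sketch using only the standard `utils` functions `vignette()` and `browseVignettes()`; it assumes nothing about the vignette names and simply lists whatever the installed version ships.

```r
# Check whether biotmle is installed before looking for its vignettes.
has_biotmle <- requireNamespace("biotmle", quietly = TRUE)

if (has_biotmle) {
  # List the vignettes shipped with the installed package.
  vigs <- vignette(package = "biotmle")
  print(vigs$results[, c("Item", "Title"), drop = FALSE])

  # Open a browsable index of them when running interactively.
  if (interactive()) {
    browseVignettes("biotmle")
  }
}
```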


Issues

If you encounter any bugs or have any specific feature requests, please file an issue.


Contributions

Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.


Citation

After using the biotmle R package, please cite the following:

    @article{hejazi2017biotmle,
      author = {Hejazi, Nima S and Cai, Weixin and Hubbard, Alan E},
      title = {biotmle: Targeted Learning for Biomarker Discovery},
      journal = {The Journal of Open Source Software},
      volume = {2},
      number = {15},
      month = {July},
      year  = {2017},
      publisher = {The Open Journal},
      doi = {10.21105/joss.00295},
      url = {https://doi.org/10.21105/joss.00295}
    }

    @article{hejazi2021generalization,
      author = {Hejazi, Nima S and Boileau, Philippe and {van der Laan},
        Mark J and Hubbard, Alan E},
      title = {A generalization of moderated statistics to data adaptive
        semiparametric estimation in high-dimensional biology},
      journal={under review},
      volume={},
      number={},
      pages={},
      year = {2021+},
      publisher={},
      doi = {},
      url = {https://arxiv.org/abs/1710.05451}
    }

    @manual{hejazi2019biotmlebioc,
      author = {Hejazi, Nima S and {van der Laan}, Mark J and Hubbard, Alan
        E},
      title = {{biotmle}: {Targeted Learning} with moderated statistics for
        biomarker discovery},
      doi = {10.18129/B9.bioc.biotmle},
      url = {https://bioconductor.org/packages/biotmle},
      note = {R package version 1.10.0}
    }

Related

  • R/biotmleData - R package with example experimental data for use with this analysis package.

Funding

The development of this software was supported in part through grants from the National Institutes of Health: P42 ES004705-29 and R01 ES021369-05.


License

© 2016-2021 Nima S. Hejazi

The contents of this repository are distributed under the MIT license. See file LICENSE for details.


References

Bembom, Oliver, Maya L Petersen, Soo-Yon Rhee, W Jeffrey Fessel, Sandra E Sinisi, Robert W Shafer, and Mark J van der Laan. 2009. “Biomarker Discovery Using Targeted Maximum-Likelihood Estimation: Application to the Treatment of Antiretroviral-Resistant HIV Infection.” Statistics in Medicine 28 (1): 152–72.

Hejazi, Nima S, Mark J van der Laan, and Alan E Hubbard. 2021. “A Generalization of Moderated Statistics to Data Adaptive Semiparametric Estimation in High-Dimensional Biology.” Under Review. https://arxiv.org/abs/1710.05451.

Rosenblum, Michael A, and Mark J van der Laan. 2009. “Confidence Intervals for the Population Mean Tailored to Small Sample Sizes, with Applications to Survey Sampling.” The International Journal of Biostatistics 5 (1).

Smyth, Gordon K. 2004. “Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments.” Statistical Applications in Genetics and Molecular Biology 3 (1): 1–25. https://doi.org/10.2202/1544-6115.1027.

Tsiatis, Anastasios. 2007. Semiparametric Theory and Missing Data. Springer Science & Business Media.

van der Laan, Mark J, and Sherri Rose. 2011. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.

van der Laan, Mark J, and Sherri Rose. 2018. Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media.
