SSL Validation

Overview

Phenotyping algorithms based on Electronic Health Records (EHR) aim to identify a patient’s disease status using the information in the health record.

Evaluating an algorithm on EHR data often relies on a small set of gold-standard labels. This raises two main issues: it is difficult to select an appropriate cutoff for the algorithm score, and the estimates of the accuracy parameters have high variance.

This package provides a semi-supervised learning approach that incorporates unlabeled data into the estimation of the receiver operating characteristic (ROC) curve parameters to address these issues.

The data consist of:

  • a small validation set with algorithm score S and binary label Y (0 or 1)

  • a large unlabeled set containing only the algorithm score S.

The main function roc.semi.superv takes S and Y as arguments, where Y is missing (NA) for the many unlabeled observations, and returns semi-supervised estimates of the ROC parameters.
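As a minimal sketch of this input layout, the following builds a toy data set in base R. The sample sizes, the score distribution, and the logistic link are illustrative assumptions, not the package's simulation settings; the only convention taken from the example further down is that unlabeled observations carry `NA` in `Y`.

```r
# Toy data only: sizes and distributions below are assumptions for illustration.
set.seed(1)
n.total   <- 1000   # all patients with an algorithm score (hypothetical size)
n.labeled <- 50     # small gold-standard validation set (hypothetical size)

S <- runif(n.total)                                # algorithm score in [0, 1]
Y.full <- rbinom(n.total, 1, plogis(-2 + 3 * S))   # latent true disease status
Y <- rep(NA_integer_, n.total)                     # unlabeled by default
id.lab <- sample(n.total, n.labeled)               # rows that receive a label
Y[id.lab] <- Y.full[id.lab]

toy <- data.frame(S = S, Y = Y)
sum(!is.na(toy$Y))           # number of labeled rows
mean(toy$Y, na.rm = TRUE)    # prevalence estimated from labeled rows only
```

A data frame shaped like `toy` has one row per patient, so the labeled subset can be recovered at any time with `toy[!is.na(toy$Y), ]`.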

The function roc.superv takes S and Y from the small validation set only and estimates the ROC parameters with a classical supervised method.
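To make the ROC parameters concrete, here is a rough base-R sketch of the empirical (supervised) quantities on labeled data. It is not the implementation used by roc.superv; the helpers `acc.at.cut` and `auc.emp` are hypothetical names, and two points are assumptions read off the column names in the example output: a patient is classified positive when `S >= cut`, and `p.pos` is the proportion classified positive.

```r
# Empirical accuracy measures at one cutoff (hypothetical helper,
# not part of SSL.validation). Assumes positive means S >= cut.
acc.at.cut <- function(S, Y, cut) {
  pred <- as.integer(S >= cut)
  tp <- sum(pred == 1 & Y == 1); fp <- sum(pred == 1 & Y == 0)
  fn <- sum(pred == 0 & Y == 1); tn <- sum(pred == 0 & Y == 0)
  c(cut   = cut,
    p.pos = mean(pred),        # proportion classified positive (assumed meaning)
    fpr   = fp / (fp + tn),
    tpr   = tp / (tp + fn),
    ppv   = tp / (tp + fp),    # NaN when nobody is classified positive
    npv   = tn / (tn + fn))    # NaN when nobody is classified negative
}

# Empirical AUC via the Mann-Whitney statistic (ties count one half).
auc.emp <- function(S, Y) {
  s1 <- S[Y == 1]; s0 <- S[Y == 0]
  mean(outer(s1, s0, ">") + 0.5 * outer(s1, s0, "=="))
}

# Tiny labeled example with made-up values.
S <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)
Y <- c(1,   1,   0,   1,   0,   0,   1,   0)
acc.at.cut(S, Y, cut = 0.5)   # tpr = 0.75, fpr = 0.25
auc.emp(S, Y)                 # 0.75
```

With only a handful of labels, each of these ratios moves in large steps as the cutoff changes, which is the variance problem the semi-supervised estimator is meant to reduce.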

Installation

Install the development version from GitHub.

# install.packages("remotes")
remotes::install_github("celehs/SSL.validation")

Load the package into R.

library(SSL.validation)

Simulated Example

set.seed(1234)
dat <- read.csv("https://raw.githubusercontent.com/celehs/SSL.validation/master/data-raw/data.csv")
p.0 <- mean(dat$Y, na.rm = TRUE)  # prevalence among labeled observations
id.v <- which(!is.na(dat$Y))      # indices of labeled observations
dat.v <- dat[id.v, ]              # labeled data

Semi-Supervised Learning (SSL)

system.time(res.ssl <- roc.semi.superv(dat$S, dat$Y))
## [1] 263

##    user  system elapsed 
##  21.446   0.553  23.255
auc.ssl <- res.ssl$auc
roc.ssl <- res.ssl$roc
auc.ssl
## [1] 0.7467197
tail(roc.ssl)
##          cut  p.pos  fpr       tpr       ppv       npv
## [94,] 0.0471 0.9614 0.94 0.9909063 0.2591625 0.9518820
## [95,] 0.0386 0.9688 0.95 0.9930972 0.2574391 0.9554319
## [96,] 0.0312 0.9785 0.96 0.9949573 0.2559512 0.9597191
## [97,] 0.0215 0.9885 0.97 0.9973425 0.2540214 0.9691953
## [98,] 0.0115 1.0000 0.98 0.9997318 0.2520541 0.9941873
## [99,] 0.0115 1.0000 0.98 0.9997318 0.2520541 0.9941873
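One way to use such a table for cutoff selection is to look up the row whose FPR is closest to a target value. The sketch below is an illustration only: `row.at.fpr` is a hypothetical helper, not a package function, and the matrix holds three rows copied from the output above so that the block runs on its own. A realistic target such as 0.05 would need the full `res.ssl$roc` matrix rather than its tail.

```r
# Three rows copied from the tail(roc.ssl) output; in practice use res.ssl$roc.
roc.tail <- rbind(
  c(0.0471, 0.9614, 0.94, 0.9909063, 0.2591625, 0.9518820),
  c(0.0386, 0.9688, 0.95, 0.9930972, 0.2574391, 0.9554319),
  c(0.0312, 0.9785, 0.96, 0.9949573, 0.2559512, 0.9597191)
)
colnames(roc.tail) <- c("cut", "p.pos", "fpr", "tpr", "ppv", "npv")

# Hypothetical helper: return the row whose FPR is closest to the target.
row.at.fpr <- function(roc, target.fpr) {
  roc[which.min(abs(roc[, "fpr"] - target.fpr)), ]
}

row.at.fpr(roc.tail, target.fpr = 0.95)   # picks the row with cut = 0.0386
```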

Supervised Learning (SL)

system.time(res.sl <- roc.superv(dat.v$S, dat.v$Y))
## Warning in regularize.values(x, y, ties, missing(ties)): collapsing to unique
## 'x' values

##    user  system elapsed 
##   0.012   0.000   0.012
auc.sl <- res.sl$auc
roc.sl <- res.sl$roc
colnames(roc.sl) <- colnames(roc.ssl)
auc.sl
## [1] 0.7016632
tail(roc.sl)
##               cut p.pos       fpr tpr       ppv npv
## [94,] 0.015460938  0.96 0.9459459   1 0.2708333   1
## [95,] 0.014389420  0.96 0.9459459   1 0.2708333   1
## [96,] 0.013239677  0.98 0.9729730   1 0.2653061   1
## [97,] 0.010720983  0.98 0.9729730   1 0.2653061   1
## [98,] 0.008202289  0.98 0.9729730   1 0.2653061   1
## [99,] 0.006568542  0.99 0.9864865   1 0.2626263   1

Contributors

mingstat, celehs, claralea
