`admittime` stored as `difftime`, not `datetime`

ricu

Working with ICU datasets in R, especially publicly available ones provided by PhysioNet, is facilitated by ricu. The package provides data access, a level of abstraction for encoding clinical concepts in a data-source-agnostic way, and classes and utilities for working with the resulting types of time series datasets.

Citation

To cite ricu, please use the following:

@article{bennett2023ricu,
  title={ricu: R’s interface to intensive care data},
  author={Bennett, Nicolas and Ple{\v{c}}ko, Drago and Ukor, Ida-Fong and Meinshausen, Nicolai and B{\"u}hlmann, Peter},
  journal={GigaScience},
  volume={12},
  pages={giad041},
  year={2023},
  publisher={Oxford University Press}
}

Installation

Currently, installation is only possible directly from GitHub, using the remotes package if installed

remotes::install_github("eth-mds/ricu")

or by sourcing the required installation code from GitHub by running

rem <- source(
  paste0("https://raw.githubusercontent.com/r-lib/remotes/main/",
         "install-github.R")
)
rem$value("eth-mds/ricu")

To make sure that some useful utility packages are installed too, consider installing the packages marked as Suggests by running

remotes::install_github("eth-mds/ricu", dependencies = TRUE)

instead, or by installing some of the utility packages (relevant for downloading and preprocessing PhysioNet datasets)

install.packages("xml2")

and demo dataset packages

install.packages(c("mimic.demo", "eicu.demo"),
                 repos = "https://eth-mds.github.io/physionet-demo")

explicitly.
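Before moving on, it can help to check which of the optional packages named above are actually present. The following is a minimal sketch using only base R's `requireNamespace()`; the variable names `optional_pkgs` and `is_installed` are illustrative and not part of ricu.

```r
# Optional packages mentioned in the installation notes above.
optional_pkgs <- c("xml2", "mimic.demo", "eicu.demo")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise,
# without attaching it to the search path.
is_installed <- vapply(
  optional_pkgs,
  function(pkg) requireNamespace(pkg, quietly = TRUE),
  logical(1L)
)

# Named logical vector: one entry per package.
print(is_installed)
```

Any entry reported as `FALSE` can then be installed with the corresponding `install.packages()` call shown above.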

Data access

Out of the box (provided the two data packages mimic.demo and eicu.demo are available), ricu provides access to the demo datasets corresponding to the PhysioNet Clinical Databases eICU and MIMIC-III. Tables are available as

mimic_demo$admissions

#> # <mimic_tbl>: [129 ✖ 19]
#> # ID options:  subject_id (patient) < hadm_id (hadm) < icustay_id (icustay)
#> # Defaults:    `admission_type` (val)
#> # Time vars:   `admittime`, `dischtime`, `deathtime`, `edregtime`, `edouttime`
#>     row_id subject_id hadm_id admittime           dischtime
#>      <int>      <int>   <int> <dttm>              <dttm>
#> 1    12258      10006  142345 2164-10-23 21:09:00 2164-11-01 17:15:00
#> 2    12263      10011  105331 2126-08-14 22:32:00 2126-08-28 18:59:00
#> 3    12265      10013  165520 2125-10-04 23:36:00 2125-10-07 15:13:00
#> 4    12269      10017  199207 2149-05-26 17:19:00 2149-06-03 18:42:00
#> 5    12270      10019  177759 2163-05-14 20:43:00 2163-05-15 12:00:00
#> …
#> 125  41055      44083  198330 2112-05-28 15:45:00 2112-06-07 16:50:00
#> 126  41070      44154  174245 2178-05-14 20:29:00 2178-05-15 09:45:00
#> 127  41087      44212  163189 2123-11-24 14:14:00 2123-12-30 14:31:00
#> 128  41090      44222  192189 2180-07-19 06:55:00 2180-07-20 13:00:00
#> 129  41092      44228  103379 2170-12-15 03:14:00 2170-12-24 18:00:00
#> # ℹ 124 more rows
#> # ℹ 14 more variables: deathtime <dttm>, admission_type <chr>,
#> #   admission_location <chr>, discharge_location <chr>, insurance <chr>,
#> #   language <chr>, religion <chr>, marital_status <chr>, ethnicity <chr>,
#> #   edregtime <dttm>, edouttime <dttm>, diagnosis <chr>,
#> #   hospital_expire_flag <int>, has_chartevents_data <int>

and data can be loaded into an R session for example using

load_ts("labevents", "mimic_demo", itemid == 50862L,
        cols = c("valuenum", "valueuom"))

#> # A `ts_tbl`: 299 ✖ 4
#> # Id var:     `icustay_id`
#> # Index var:  `charttime` (1 hours)
#>     icustay_id charttime valuenum valueuom
#>          <int> <drtn>       <dbl> <chr>
#> 1       201006   0 hours      2.4 g/dL
#> 2       203766 -18 hours      2   g/dL
#> 3       203766   4 hours      1.7 g/dL
#> 4       204132   7 hours      3.6 g/dL
#> 5       204201   9 hours      2.3 g/dL
#> …
#> 295     298685 130 hours      1.9 g/dL
#> 296     298685 154 hours      2   g/dL
#> 297     298685 203 hours      2   g/dL
#> 298     298685 272 hours      2.2 g/dL
#> 299     298685 299 hours      2.5 g/dL
#> # ℹ 294 more rows

which returns time series data as a ts_tbl object.
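The `charttime` column in the output above is printed as `<drtn>`, that is, as a `difftime` offset rather than an absolute date-time. As a rough illustration of how the two representations relate in base R, the sketch below adds hour offsets to a reference time and recovers them again. The `origin` value is a hypothetical reference time chosen for this example, and treating the offsets as relative to it is an assumption, not a statement about how ricu selects its origin.

```r
# Hypothetical reference time (an assumption made for this example only).
origin <- as.POSIXct("2164-10-23 21:09:00", tz = "UTC")

# Offsets in hours, in the same spirit as the `charttime` column above.
offsets <- as.difftime(c(0, -18, 4), units = "hours")

# Adding a difftime to a POSIXct value yields absolute date-times.
absolute <- origin + offsets
print(format(absolute, "%Y-%m-%d %H:%M:%S"))

# Subtracting the reference time recovers the relative offsets.
recovered <- difftime(absolute, origin, units = "hours")
print(recovered)
```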

Acknowledgments

This work was supported by grant #2017-110 of the Strategic Focal Area “Personalized Health and Related Technologies (PHRT)” of the ETH Domain for the SPHN/PHRT Driver Project “Personalized Swiss Sepsis Study”.

	if (is_win_tbl(x) && !end_var %in% colnames(x)) {

	  on.exit(rm_cols(x, end_var, by_ref = TRUE))

	  dura_var <- dur_var(x)

	  x <- x[, c(end_var) := re_time(get(start_var) + get(dura_var), interval)]
	  x <- x[get(end_var) < 0, c(end_var) := as.difftime(0, units = time_unit)]
	}

	trunc_time <- function(x, min, max) {

	  if (not_null(min)) {
	    replace(x, x < min, min)
	  }

	  if (not_null(max)) {
	    replace(x, x > max, max)
	  }

	  x
	}
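In the function above, `replace()` returns a modified copy and its result is never assigned, so `x` is returned unchanged. A minimal sketch of one possible correction follows. It is not necessarily the fix adopted in ricu, and `not_null()` is defined here as a stand-in for the package's internal helper, assumed to be a negated `is.null()`.

```r
# Stand-in for the internal helper (assumption: simply negates is.null()).
not_null <- function(x) !is.null(x)

trunc_time <- function(x, min, max) {

  # Assign the result of replace(); otherwise the clamped copy is discarded.
  if (not_null(min)) {
    x <- replace(x, x < min, min)
  }

  if (not_null(max)) {
    x <- replace(x, x > max, max)
  }

  x
}

# Values below 0 are raised to 0, values above 10 are lowered to 10.
clamped <- trunc_time(c(-5, 0, 3, 12), min = 0, max = 10)
print(clamped)
```

Passing `NULL` for `min` or `max` skips the corresponding bound.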

Source	SpO2	SaO2
AUMC	numericitems: 6709, 8903	numericitems: 12311
eICU	vitalperiodic: sao2	lab: O2 Sat (%) ?
HiRID	observation: 4000, 8280	observation: 20000800
MIMIC III	chartevents: 646, 220277, 226253+, 50817^	chartevents: 834, 220227
MIMIC IV	chartevents: 220277, 226253+	labevents: 220227*, 50817

Glucose 20%/100ml Pflege		1000746
Glucose 20% /100ml		1000544
Glucose 40%		1000545
Glucose 50%		1000567
Glucose 30%		1000060
Glucose 20% /500ml		1000835
Glucose 10%		1000022
Glucose 10%		1000690
Glucose 20%		1000689

Glucose 5%		7257
Glucose 40%		8940
Glucose 5%		9569
Glucose 40%		9571
Glucose 20%		7255
Glucose 10%		7254
Glucose 30%		7256

	round_to <- function(x, to = 1) {
	  if (all_equal(to, 1)) trunc(x) else to * trunc(x / to)
	}
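Because `round_to()` is built on `trunc()`, it rounds toward zero rather than to the nearest multiple. The sketch below makes this visible on plain numeric input. It substitutes `isTRUE(all.equal())` for the helper `all_equal()`, which is an assumption about what that helper does.

```r
round_to <- function(x, to = 1) {
  # Assumption: isTRUE(all.equal()) stands in for the helper all_equal().
  if (isTRUE(all.equal(to, 1))) trunc(x) else to * trunc(x / to)
}

# With the default step of 1, values are truncated toward zero.
unit_step <- round_to(c(7.9, -3.5))
print(unit_step)

# With a step of 2, values drop to the nearest multiple of 2 toward zero.
two_step <- round_to(c(7.9, 13), to = 2)
print(two_step)
```

Note that negative inputs move up toward zero, so this differs from `floor()`-based rounding.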

	x <- create_intervals(x, c(id_vars(x), grp_var), overhang = hours(1L),
	                      max_len = hours(6L), end_var = "endtime")

	ext <- list(patient_ids = patient_ids, id_type = id_type,
	            interval = coalesce(x[["interval"]], interval),
	            progress = progress)

Dextrose 5%		220949
Dextrose 50%		220952
Dextrose 10%		220950
Dextrose 20%		228140

Dextrose 10%		30016
Dextrose 20%		30017
Dextrose 5%

	map <- id_map(src, id_vars(x), target_id, sft, idx)

	res <- map[x, on = meta_vars(x), roll = -Inf, rollends = TRUE]
	res <- res[, c(cols) := lapply(.SD, `-`, get(sft)), .SDcols = cols]

	time_vars <- setdiff(intersect(time_vars, colnames(res)), dur_var)

	res <- change_id(res, id_var, x, cols = time_vars, keep_old_id = FALSE)
