visr's Introduction

visR

Lifecycle: questioning Codecov test coverage R-CMD-check CRAN status

Note: We aim to keep visR maintained, but cannot commit to any future development. For all your KM plot needs, we would recommend checking out ggsurvfit.

The goal of visR is to enable fit-for-purpose, reusable clinical and medical research focused visualizations and tables with sensible defaults and based on sound graphical principles.

Package documentation

Motivation

By using a common package for visualising data analysis results in the clinical development process, we want to have a positive influence on

  • choice of visualisation by making it easy to explore different visualisations and to use impactful visualisations fit-for-purpose
  • effective visual communication by making it easy to implement best practices

We are not judging which visualisation you choose for your research question, but want to facilitate and support good practice.

You can read more about the philosophy and architecture in the repo wiki.

Installation

The easiest way to get visR is to install from CRAN:

install.packages("visR")

Install the development version from GitHub with:

# defaults to main branch
devtools::install_github("openpharma/visR") 

Cite visR

> citation("visR")

Contributing

Please note that the visR project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms. Thank you to all contributors:

@ablack3, @AlexandraP-21, @ardeeshany, @bailliem, @cschaerfe, @ddsjoberg, @diego-s, @epijim, @galachad, @gdario, @ginberg, @jameshunterbr, @jinjooshim, @joanacmbarros, @Jonnie-Bevan, @kawap, @kawap93, @kentm4, @krystian8207, @kzalocusky, @lcomm, @lesniewa, @olivroy, @prabhushanmup, @rebecca-albrecht, @SHAESEN2, @teunbrand, @thanos-siadimas, @therneau, @thomas-neitmann, @timtreis, @yonicd

visr's Issues

Ask about possibility of contribution

@epijim asked me whether I could contribute to this project, so I would like to ask about that possibility. @cschaerfe @diego-s @bailliem, are the functions that have already been initiated, for example vr_plt_kaplan_meier.R and vr_sankey.R, already in progress or not? If not, and the assignments in the comments are not final, I would be happy to work on them. Or do you have any other visualisation functions or other tasks that have not been started yet?

generating kaplan-meier table/plot without strata

Issue description

I'm hoping to generate kaplan-meier plots/tables without stratifying my population. My equation looks like this:
surv_eq <- "survival::Surv(time, status)~1"

Steps to reproduce the issue

Please share code if possible.

surv_eq <- "survival::Surv(time, status)~1"

vr_kaplan_meier_summary(km_dt, surv_eq)

vr_kaplan_meier(km_dt, surv_eq)

What's the expected result?

I get back a summary table if I supply "ms_subtype" rather than "1" as the strata. For possibly unrelated reasons, I'm unable to generate a kaplan-meier plot even with strata supplied (see below).

What's the actual result?

vr_kaplan_meier_summary() generates this error if I do not supply strata:
Error: Column strata must be length 9 (the number of rows) or one, not 0

vr_kaplan_meier_summary() generates the expected tables if I do supply strata.

vr_kaplan_meier() generates this error if I do not supply strata:
Error in data.frame(time = survfit_summary$time, n.risk = survfit_summary$n.risk, : arguments imply differing number of rows: 43, 0

vr_kaplan_meier() generates this error if I do supply strata:
Error in summarize(., max_time = max(time)) : argument "by" is missing, with no default
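
The first two errors are consistent with code that expects a strata element which an unstratified fit does not provide. The following is a minimal sketch, not the visR implementation: strata_labels is a hypothetical helper and the "Overall" label is an assumption. It shows that a survfit object fitted with ~ 1 carries no strata, and one way to fall back to a single label.

```r
# An unstratified fit on the lung data that ships with the survival package.
fit_overall <- survival::survfit(
  survival::Surv(time, status) ~ 1,
  data = survival::lung
)
fit_by_sex <- survival::survfit(
  survival::Surv(time, status) ~ sex,
  data = survival::lung
)

# Hypothetical helper: return one stratum label per row of the fit summary,
# falling back to a single "Overall" label when the fit has no strata.
strata_labels <- function(fit) {
  s <- summary(fit)
  if (is.null(s$strata)) {
    rep("Overall", length(s$time))
  } else {
    as.character(s$strata)
  }
}

labels_overall <- strata_labels(fit_overall)
labels_by_sex  <- strata_labels(fit_by_sex)
```

With such a fallback, downstream code that builds a data frame with a strata column would always receive a vector of matching length.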

Additional details / screenshot

vr_summarize and get_summary

We have multiple functions creating different kinds of summaries:

  • vr_summarize and vr_summarize_tab1 are defined and used in the context of creating a summary table (in vr_utils.R)
  • get_summary is defined to create a summary of a survfit object

I wonder if we should consolidate these methods or alternatively have more distinct names for them to avoid confusion?

Default and Custom Color Palettes (`vr_style`)

Colors may be a second task to be tackled separately from the theming that is integral to styling the visualisation. There may be some requirements making this a non-trivial task to automate, though, so it'd be great to have a discussion on this prior to implementation.

Some requirements that came up before

  • colors should be used sparingly and only where needed to convey message
  • colors should be consistent across visualisations in the report (i.e., the same drug should always have the same color across a series of graphs)
  • defaults needed for continuous and discrete variables
  • ability to define specific colors for key variable values (e.g., drugs of particular interest, custom colors for factor values, ...)
  • depending on use case: use corporate design colors as standard palette (parameterized by company or institution)
  • colors should be discernible even if colorblind

To some extent, one could specify default color palettes at the beginning of the (rmarkdown) script using options for, e.g., ggplot2.continuous.colour and ggplot2.continuous.fill, but that would not account for most of the above customizations required for adhering to good graphics principles and our stakeholder requirements.

Does anyone have an idea how to best implement these requirements around colors and styling in the package? I guess we'd need to disentangle what could be meaningful defaults to be set at the document setup phase and what could be customized in a vr_style_plot function?
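
As a starting point for the discussion, here is a minimal sketch of the "same drug, same color" requirement. It assumes a hypothetical helper, make_fixed_palette, that is not part of visR; it assigns each level a fixed color from the colorblind-friendly Okabe-Ito palette (available through grDevices::palette.colors in R 4.0.0 and later) and lets the user override selected levels.

```r
# Hypothetical helper: build a named colour vector once per report so that
# every plot can reuse the same level-to-colour mapping.
make_fixed_palette <- function(levels, overrides = character()) {
  levels <- unique(as.character(levels))
  # Okabe-Ito colours; the first entry (black) is dropped.
  base_cols <- unname(grDevices::palette.colors(palette = "Okabe-Ito"))[-1]
  if (length(levels) > length(base_cols)) {
    stop("More levels than available colours.", call. = FALSE)
  }
  pal <- stats::setNames(base_cols[seq_along(levels)], levels)
  # Apply user-defined colours for levels that exist in the palette.
  known <- intersect(names(overrides), levels)
  pal[known] <- overrides[known]
  pal
}

arm_colours <- make_fixed_palette(
  c("Drug A", "Drug B", "Placebo"),
  overrides = c(Placebo = "grey50")
)

# A plot that shows only two arms reuses the same mapping by subsetting.
two_arm_colours <- arm_colours[c("Drug A", "Placebo")]
```

The named vector could then be passed to ggplot2::scale_colour_manual(values = arm_colours) in each plot, which keeps the mapping stable even when a plot shows only a subset of the levels.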

Add a license

Can we chuck in a license? Something that says we have no liability or warranty, and make sure it's not copyleft (i.e. you can use the code for anything).

Parameters, parameter names and assumptions about column names

Using terminology defined in a standard like CDISC ADaM has a lot of advantages. However, there are several aspects we need to consider, for example:

  • general flexibility,
  • the tradeoff between ease of use if data follows the standard vs. if it does not
  • how easy it is to learn and understand

Generally, I would suggest not to make strong assumptions about column names. Then again, if we have to make assumptions it is a good idea to follow the standard. Also, I think we should have parameter names that are easy to read and to understand, especially for people familiar with survival analysis but not necessarily with the standard.

The easiest solution would probably be using the column names from the ADaM standard as default values and using simple parameter names that are meaningful in a functional way.

Suggestion: Change the function signature for vr_KM_est from
vr_KM_est(data = adtte, strata = "TRTP")
to either

  1. vr_KM_est(data=NULL, stratifier=NULL, censor = "CNSR", time = "AVAL")
  2. vr_KM_est(data=NULL, stratifier=NULL, event = "CNSR", time = "AVAL")
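
To make the first option concrete, here is a rough sketch of how ADaM column names could serve as defaults while the parameter names stay functional. build_km_formula is a hypothetical helper, not an existing visR function, and it assumes the ADaM convention that CNSR equal to 0 marks an event.

```r
# Hypothetical helper: build the survival formula from column names,
# defaulting to CDISC ADaM names for time and censoring.
build_km_formula <- function(strata = NULL, censor = "CNSR", time = "AVAL") {
  # Assumption: CNSR == 0 marks an event, so the event indicator is 1 - CNSR.
  rhs <- if (is.null(strata)) "1" else paste(strata, collapse = " + ")
  stats::as.formula(
    paste0("survival::Surv(", time, ", 1 - ", censor, ") ~ ", rhs)
  )
}

f_overall <- build_km_formula()
f_by_arm  <- build_km_formula(strata = "TRTP")
f_custom  <- build_km_formula(strata = "arm", censor = "cens", time = "days")
```

Data that follows the standard needs no extra arguments, while other data only needs the column names spelled out.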

NA Handling in vr_table_one

Greetings!

I'm hoping to use vr_table_one, but some of my values are missing/NA. I run into this error:

Error in quantile.default(x, probs = 0.25) :
missing values and NaN's not allowed if 'na.rm' is FALSE

If I change the NAs to "missing", as in the example in the doc, it then treats my continuous variable (age) as a categorical, and rather than get mean/median/etc, I get "number of unique values" in the table.

Is there a good workaround for this?

Thanks!
Kelly
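
One possible workaround, sketched here outside of visR, is to keep the variable numeric, drop missing values inside each statistic with na.rm = TRUE, and report the number of missing values separately. summarize_numeric is a hypothetical helper and the set of statistics is an assumption.

```r
# Hypothetical helper: summarise a numeric vector while tolerating NAs.
summarize_numeric <- function(x) {
  stopifnot(is.numeric(x))
  list(
    n_missing = sum(is.na(x)),
    mean      = mean(x, na.rm = TRUE),
    median    = stats::median(x, na.rm = TRUE),
    q1        = unname(stats::quantile(x, probs = 0.25, na.rm = TRUE)),
    q3        = unname(stats::quantile(x, probs = 0.75, na.rm = TRUE))
  )
}

age <- c(61, 54, NA, 70, 66)
age_summary <- summarize_numeric(age)
```

This keeps age continuous, so the table can still show mean and median, and the missing count appears as its own entry.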

Reporting sample versus population statistics

We are currently not using a harmonized nomenclature for reporting sample size etc. Based on our scientist's feedback, this is something we should change soon, not only for the summary table (table one), but also for meta-data on the figures.

From Letizia: In general, capital letters refer to population attributes (i.e., parameters); and lower-case letters refer to sample attributes (i.e., statistics). For example, P refers to a population proportion; and p, to a sample proportion. @bailliem you mentioned there is also a more extensive guidance available that we could build upon?

add_risk_table: min at risk = 0

Issue description

When the argument min_at_risk = 0, I expect the risk table to extend across the entire x axis.

Solution

I played around a bit and found an easy solution:

if (min_at_risk > 0) {
  max_time <-
    tidy_object %>%
    dplyr::filter(n.risk >= min_at_risk) %>%
    dplyr::group_by(strata) %>%
    dplyr::summarize(max_time = max(time)) %>%
    dplyr::ungroup() %>%
    dplyr::summarize(min_time = min(max_time)) %>%
    dplyr::pull(min_time)
} else {
  max_time <- max(time_ticks)
}

Next Features

We started out implementing a few functions that are important in our daily work for visualising time to event analysis. However, this is not where the package should stop. Some ideas from discussions with colleagues and stakeholders for additional visualisations to be added are listed below. Please comment with other ideas or open an issue and start implementing if you wish to take one on.

Cohort Selection and Baseline Characteristics

  • Table 1
  • cohort attrition flowchart
  • cohort attrition barplot
  • tables of disease characteristics or prior therapies

Change from Baseline

  • Scatterplots incl. diff options for overlapping data points like jitter
  • Waterfall plots
  • Cumulative distribution plots

Survival/Time to Event Analysis

  • Kaplan-Meier plots (different defaults for descriptive vs. comparative questions)
  • KM Summary tables
  • Forest Plot (@bailliem is working on this)

Treatment/Testing Pattern Analysis

  • Frequency Table of selected drugs by line
  • Lineplot of year of start of 1st line vs % of patients by drug
  • Stacked boxplot of types of line of therapy by year of diagnosis
  • Donut plot showing distribution of cohort by max LOT with bars specifying most frequent drugs in each line
  • Swimmer Plot showing patterns of treatment (or other activity) by patient

Treatment Sequences

  • Flowchart specifying patient flow incl. Death, lost to follow up, and progression to next line
  • Sankey plot (also showing numbers in non-interactive mode)
  • Sunburst plot showing progression of drugs in lines

Set up CI/CD workflow

Post-webinar, set up a CI/CD workflow including

  • unit tests
  • dependencies
  • styling and linting
  • documentation build
  • website build

Remove bioconductor dependency

Issue description

We have Bioconductor dependencies for the example data. This makes installation difficult; I'm getting errors now on R 4.0.

See here: #42

[KM Plot] Order of N at risk, censored, events in Risktable does not agree with Morris et al.'s suggestion anymore

The guidance from the KMunicate paper is to arrange n.risk, n.censor and n.events by stratum (see example of previous visR KM implementation):
image

image

With our new API, this ordering does not appear possible anymore:
image

I think it would be great to include the option to order the elements of the risk table following the KMunicate guidance, though, given that this was one of the motivators to actually write our own KM function. Maybe in addition to the more common ordering as implemented in ggsurvplot? What do you think?

Attrition flow chart in Example Analysis vignette not showing correctly

Building the example analysis locally (either running each chunk in the notebook or knitting the full document) in RStudio works fine. Executing the following code results in the flowchart shown in the Viewer of RStudio:

attrition_flow <- vr_attrition(
  N_array = cohort_attrition$`Remaining N`,
  descriptions = cohort_attrition$Criteria,
  complement_descriptions = complement_descriptions,
  display = F,
  output_path = attrition_chart_fn)

And embedding the exported svg in the report also works:
image

When building the vignette using pkgdown::build_articles(), the image included by knitr::include_graphics is broken:
image

Additionally, when knitting the vignette, the svg (that is shown in the Viewer when working interactively) is not incorporated as an image, but as text:
image

And this svg returned by the call to attrition_flow does not appear at all when building the vignette.
image

One problem seems to be that the file path used in knitr::include_graphics is not correct in the context of the fully built website. Not sure how to correct for this, though... Does anyone have any ideas how to fix this problem and show the final flow chart in our vignette on the website?

Change get_tableone function name

Change the get_tableone function name to be more general. It calculates summary statistics for categorical and continuous variable types. Although it is used to derive table one, it could also be used with other wrapper functions / table types.

Fix the broken CICD

Issue description

Current CICD doesn't work... fix it.

Steps to reproduce the issue

Push a commit and GitHub Actions doesn't work.

What's the expected result?

  • build and deploy docs
  • README.rmd is built automatically
  • check windows
  • check ubuntu
  • check osx
  • calculate code coverage

vr_render_table always collapses rows when using kable engine

Issue description

vr_render_table always collapses rows when using kable engine. This is not what we always need and should be an option or so.

Steps to reproduce the issue

Please share code if possible.

library(survival)
data(lung)


lung %>% 
  head(10) %>% 
  vr_render_table(title = "First 10 observations of `lung_cohort`.",
                  caption = "",
                  datasource = DATASET, 
                  engine = "kable")

What's the expected result?

  • I want to see each row of the first ten samples

What's the actual result?

  • Those consecutive row cells with same content are collapsed into one

Additional details / screenshot

How many significant digits after decimal point in tables?

The number of significant digits may change. At the moment, we mainly implement 2 or 3 digits after the decimal point, but this should probably be discussed further.
Do we want the number of digits to be something the user can specify in each function (risk: parameter creep) or enforce best practices? In the latter case, what would be the best practices we can agree upon with our scientists?

Several journals provide guidance. Does anyone have any thoughts on which ones are preferable?

`pkgdown::build_site()` not working due to issue in `vr_alluvial_plot`

Issue description

pkgdown::build_site() not working due to issue in vr_alluvial_plot

Steps to reproduce the issue

Run vignettes/Alluvial.Rmd up to line 79 using R 4.0 and the latest version of tibble.

What's the actual result?

vr_alluvial_plot(dataset,
                 id = "PatientID",
                 linename = "LineName",
                 linenumber = "LineNumber",
                 data_source = "Simulatation")

Error: Must extract column with a single valid subscript.
x Subscript `tibble_vars[[i]]` has value 0 but must be a positive location.
Run `rlang::last_error()` to see where the error occurred. 

Additional details / screenshot

May be an R 4.0 issue?

Add title and data source to attrition flow chart

Following the principle that every output should have a title and a reference to the underlying data source, we should add such a feature also to vr_attrition.

Ideally, this would be two parameters following terminology of other vr functions and be an integral part of the resulting image.

Multiple branches in a cohort attrition diagram

Currently the package supports a single tree and a single branch in a cohort attrition diagram. It would be great if the package can allow the users to add multiple branches (like having subset of samples).

Combined KM Plot and tables for faceting

When stratifying, it is sometimes useful to use ggplot's faceting (facet_grid and facet_wrap). However, this works (if at all) only for the KM plot (e.g., when using survminer, but also visR); it would be really useful to also have the risk table for each facet.
Maybe this is something we could implement in the future?

get_COX_HR called data.frame in addition to survfit object

I would suggest to add the option to call the get_COX_HR method also on a data.frame. We should not make strong assumptions about the flow/hierarchy of different calls, if it is not necessary. Currently, the get_COX_HR method is only called on a survfit object but it would be very easy to add a function that calls coxph function on a data.frame.
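
A minimal sketch of the idea, using S3 dispatch so that the same verb works on a fitted model or directly on a data.frame. get_hr and its methods are hypothetical names chosen for illustration; this sketch dispatches on a coxph fit rather than on a survfit object, and it is not the existing get_COX_HR implementation.

```r
# Hypothetical generic: extract hazard ratios from different inputs.
get_hr <- function(x, ...) UseMethod("get_hr")

# For a fitted Cox model, the hazard ratios are the exponentiated coefficients.
get_hr.coxph <- function(x, ...) {
  exp(stats::coef(x))
}

# For a data.frame, fit the Cox model first and then reuse the coxph method.
get_hr.data.frame <- function(x, formula, ...) {
  get_hr(survival::coxph(formula, data = x, ...))
}

hr_sex <- get_hr(survival::lung, survival::Surv(time, status) ~ sex)
```

The data.frame method adds no assumptions about earlier steps in the pipeline, which is the point of the suggestion above.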

Testing of Wrappers around model (summaries)

This was an interesting idea from the Validation workshop of R in pharma:
They are using model estimates from peer-reviewed literature as reference values to test against with testthat. I really like this idea for our wrapper functions for models and the summary measures.

Validation

Refactoring issue

Renamed the parameter "groupCols" to "group_cols" in the functions that create and render a table_one. However, the html in the docs folder remains unchanged.

Any idea why?

Filtering in attrition returning inconsistent results

Issue description

There seems to be an issue with the attrition code when using logical operators in the filtering strings. The table does not return the correct N's.

Steps to reproduce the issue

Please share code if possible.

data(lung)
lung_cohort <- lung %>% 
    dplyr::rename(ECOG = ph.ecog,
                  Karnofsky = ph.karno,
                  SITEID = inst,
                  CNSR = status,
                  AVAL = time,
                  WEIGHT = wt.loss
                 ) %>% 
    dplyr::mutate(USUBJID = sprintf("Pat%03d", row_number()),
                  SEX = factor(if_else(sex == 1, "male", "female")),
                  CNSR = if_else(CNSR == 2, 1, 0),
                  SUBGR1 = factor(case_when(ECOG == 0 ~ "0 asymptomatic",
                                            ECOG == 1 ~ "1 ambulatory",
                                            ECOG == 2 ~ "2 in bed less than 50% of day",
                                            ECOG == 3 ~ "3 in bed more than 50% of day",
                                            ECOG == 4 ~ "4 bedbound",
                                            ECOG == 5 ~ "5 dead")
                                 ),
                  AGEGR1 = factor(case_when(age < 30 ~ "< 30y",
                                            age >= 30 & age <= 50 ~ "30-50y",
                                            age > 50 & age <= 70 ~ "51-70y",
                                            age > 70 ~ "> 70y"),
                                  levels=c("< 30y", "30-50y", "51-70y", "> 70y")
                                 )
                 ) 
cohort_attrition <- vr_attrition_table(
  data = lung_cohort,
  criteria_descriptions = c("1. ECOG available", 
                            "2. Weight loss data available", 
                            "3. Non-missing censoring status",
                            "4. Positive follow up time",
                            "5. Meal Cal (missing or  > 1000"),
  criteria_conditions   = c("!is.na(SUBGR1)",
                            "!is.na(WEIGHT)",
                            "!is.na(CNSR)",
                            "AVAL >= 0",
                            "is.na(meal.cal) | meal.cal >= 1000"),
  subject_column_name   = 'USUBJID')

returns 132 patients after all filtering.

What's the expected result?

lung_cohort %>%
  dplyr::filter(!is.na(SUBGR1),
                !is.na(WEIGHT),
                !is.na(CNSR),
                 AVAL >= 0) %>% 
  dplyr::filter(is.na(meal.cal) | meal.cal >= 1000)

This returns a dataframe with 124 rows and not 132.
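
One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is operator precedence: if the condition strings are joined with & without parentheses, a trailing | binds more loosely than & and lets extra rows through. The toy data and variable names below are made up for illustration.

```r
# Toy data: four subjects, two with missing weight.
toy <- data.frame(
  id  = 1:4,
  w   = c(1, NA, 3, NA),
  cal = c(NA, 1200, 500, 2000)
)

conditions <- c("!is.na(w)", "is.na(cal) | cal >= 1000")

# Joining the raw strings yields: !is.na(w) & is.na(cal) | cal >= 1000,
# which R reads as (!is.na(w) & is.na(cal)) | (cal >= 1000).
naive   <- paste(conditions, collapse = " & ")
# Wrapping every condition in parentheses preserves the intended grouping.
wrapped <- paste0("(", conditions, ")", collapse = " & ")

# Count the rows for which each combined condition evaluates to TRUE.
n_naive   <- sum(eval(parse(text = naive), toy), na.rm = TRUE)
n_wrapped <- sum(eval(parse(text = wrapped), toy), na.rm = TRUE)
```

Here the naive string keeps three rows while the wrapped string keeps one, which matches filtering the conditions one after another.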

What's the actual result?

Additional details / screenshot

image

Forest Plot

Idea

What is your idea for a new plot/table?
Not only meta-analyses, but also cohort overviews or results of Cox PH models and other regression models are often shown in forest plots. @bailliem had some initial ideas, but we haven't gotten around to fully implementing this visualisation, and help would be appreciated.

Do you have any examples you can share of what it looks like?
Example for Cox PH Model (example from forestmodel)
image

Example for cohort overview from Morris et al.
image

What potential packages/functions could be used to make this plot?
There are the forestplot and forestmodel packages on CRAN as well as potentially some good ideas in the ggforest function of the survminer package. Ideally, this would be implemented using ggplot2 only.

Are you interested to implement this yourself?
@bailliem started, but not sure how it goes bandwidth-wise.

Consolidate vr_utils.R and utilities.R

It looks like we have several places where utility functions are defined: minimally R/vr_utils.R and R/utilities.R. Maybe it's good to agree on one single place for this? Happy to be convinced otherwise.

installation

I have tried a number of times to install visR on Windows, but it keeps failing. The latest failure has to do with the files/packages (?) RTCGA and RTCGA.clinical as below. What am I doing wrong or what is the bug involved?
Thanks.
James Hunter

√  checking for file 'C:\Users\james\AppData\Local\Temp\Rtmp8McCI4\remotes77f838e42777\openpharma-visR-0c1b345/DESCRIPTION' (893ms)
-  preparing 'visR': (557ms)
√  checking DESCRIPTION meta-information ... 
-  excluding invalid files
   Subdirectory 'man' contains invalid file names:
     'visR architecture.png'
-  checking for LF line-endings in source and make files and shell scripts
-  checking for empty or unneeded directories
-  building 'visR_0.1.0.tar.gz'
   
Installing package into ‘C:/Users/james/OneDrive/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
ERROR: dependencies 'RTCGA', 'RTCGA.clinical' are not available for package 'visR'
* removing 'C:/Users/james/OneDrive/Documents/R/win-library/3.6/visR'
Error: Failed to install 'visR' from GitHub:
  (converted from warning) installation of package ‘C:/Users/james/AppData/Local/Temp/Rtmp8McCI4/file77f8328a59f6/visR_0.1.0.tar.gz’ had non-zero exit status

Example analysis vignette not building

The file "attrition_diagram.svg" is located in the "vignettes" folder, but the code searches for it in the non-existing folder "/Example_analysis_files/figure-html".

I would fix this - any preferences how?

Styling/Theming functions (`vr_style`)

Following our current architecture, styling of plots should be separated out from the data wrangling and plotting task (see Wiki article on Architecture). We currently mainly rely on applying themes (or even color palettes) after the plotting task.

It may be nice to have convenience functions, though, that take care of multiple styling operations, similar to style_bbc() (from the BBC visuals cookbook).
To implement this, we would need some reasonable defaults to start with for font type/sizes, preferred grid styles, axis position etc. @bailliem do you happen to have some guidance from the graphical principles work?

Treatment Flow Plots (Sankey) and requirements

Hi @kawap,

here is a little write-up on our requirements for Alluvial/Sankey plots, which are often used for assessing patterns in patient flow, e.g., consecutively received treatments.

There are some requirements for the plotting that one needs to take into account, based on feedback we have received from scientists working with the output:

  • data wrangling and plotting should be 2 different functions
    • wrangling should result in a table with line of treatment, treatment name, patient count and frequency (incl. those "lost" in each step)
    • visualisation should then focus on showing that data as a sankey plot
    • ideally, this is generic enough to not only focus on treatment, but any other kind of sequential data
  • plotting functions should include annotation of plot with title (can be empty string), number of samples, and datasource in footnote
  • has to work with the ggplot2 ecosystem (e.g., styling, adding titles etc.)
  • shown data should include death/censoring, no later treatment (i.e., those patients lost to later observations)
  • data needs to be interpretable in static form (containing information such as name of therapy and frequency)
  • treatment groups should have meaningful order (or should be easily changeable)
  • color scheme should match other analyses (i.e., fct. should have option to define custom color palette)
  • optional: interactive version can hide some of the annotation and make visible through mouse-over etc.

There are a few sankey packages out there that may be useful in this context. Selecting which one is up to the discretion of whoever writes the visr function:

@diego-s and @bailliem please add to this if I forgot anything important

Install Issues - Dependency on V8

When trying to install the package in our analytics environment, users often run into issues with resolving the V8 dependency:

Using PKG_CFLAGS=-I/usr/include/v8 -I/usr/include/v8-3.14
Using PKG_LIBS=-lv8 -lv8_libplatform
-----------------------------[ ANTICONF ]-------------------------------
Configuration failed to find the libv8 engine library. Try installing:
 * deb: libv8-dev or libnode-dev (Debian / Ubuntu)
 * rpm: v8-devel (Fedora, EPEL)
 * brew: v8 (OSX)
 * csw: libv8_dev (Solaris)
To use a custom libv8, set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
---------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: v8.h: No such file or directory
compilation terminated.
------------------------------------------------------------------------
ERROR: configuration failed for package 'V8'
* removing '/home/reyessia/R/x86_64-pc-linux-gnu-library/4.0/V8'
Error: Failed to install 'visR' from GitHub:
  (converted from warning) installation of package 'V8' had non-zero exit status

It can be resolved by following the instructions in the traceback, i.e., installing the corresponding libv8 library in the terminal (not in R, but into the system).
For Ubuntu, apparently the following also worked:
sudo apt install librsvg2-dev

If I remember correctly, we thought we had removed that dependency in develop a while ago, but it seems to be back both on master and develop. @bailliem, @SHAESEN2 do you know if we need this still?

Data Source caption duplicated in DT rendering of Table1

The caption at the bottom of a DT-rendered table seems to replicate every time the user interacts with the table (filtering, sorting etc.).

MRE:

adtte %>%
  visR::estimate_KM("TRTP") %>%
  visR::get_pvalue() %>%
  visR::render(title = "P-values", 
               datasource = paste0("Analysis data - time to event"),
               engine = "DT")

Then interact with the table.

image

Likely caused by the caption being added to the DT object using drawCallback = DT::JS(source_cap):

visR/R/vr_render_table.R

Lines 81 to 88 in 68941e7

source_cap <- c(
"function(settings){",
" var datatable = settings.oInstance.api();",
" var table = datatable.table().node();",
paste(" var caption = 'Data Source:", datasource, "'"),
" $(table).append('<caption style=\"caption-side: bottom\">' + caption + '</caption>');",
"}"
)
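
Because drawCallback runs on every redraw, each interaction appends another caption. One possible fix, sketched here under assumptions, is to remove a previously added caption before appending a new one. make_source_cap is a hypothetical wrapper and the vr-datasource class name is made up so that only this caption is targeted; the sketch also assumes datasource contains no single quotes.

```r
# Hypothetical wrapper: build the JavaScript callback as a character vector,
# as in the original snippet, but make it idempotent across redraws.
make_source_cap <- function(datasource) {
  c(
    "function(settings){",
    "  var datatable = settings.oInstance.api();",
    "  var table = datatable.table().node();",
    paste0("  var caption = 'Data Source: ", datasource, "';"),
    # Remove the caption added by an earlier draw before adding it again.
    "  $(table).find('caption.vr-datasource').remove();",
    paste0(
      "  $(table).append('<caption class=\"vr-datasource\" ",
      "style=\"caption-side: bottom\">' + caption + '</caption>');"
    ),
    "}"
  )
}

source_cap <- make_source_cap("Analysis data - time to event")
```

The result can be passed to DT::JS(source_cap) exactly as before.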

get_summary

Issue description

"LCL", "UCL" and "CI" are allowed to be part of the statlist argument in the get_summary function, and by default all three are.
When CL or CI are part of the statlist (l.55-61) ...
("conf.int" %in% names(survfit_object) &
survfit_object[["conf.type"]] != "none" &
((base::any(grepl(
"CL", statlist, fixed = TRUE
))) | (base::any(grepl(
"CI", statlist, fixed = TRUE
)))))
...they are changed to something like "0.95CI" or "0.95CL". And this happens also if only one of these stats is there (l.62-67):
CI <-
paste0(survfit_object[["conf.int"]], statlist[grepl("CL", statlist, fixed = TRUE)])
statlist[grepl("CL", statlist, fixed = TRUE)] <- CI
CI_Varname <-
paste0(survfit_object[["conf.int"]], statlist[grepl("CI", statlist, fixed = TRUE)])
statlist[grepl("CI", statlist, fixed = TRUE)] <- CI_Varname

In line 99 then CI_Varname is used independently of whether it was set or not (see condition l.55-61).

Steps to reproduce the issue

#works:
get_summary(survfit_object)

#breaks sometimes; cannot reproduce:
get_summary(survfit_object, statlist = c('median'))

#breaks always
get_summary(survfit_object, statlist = c('events', 'median', 'CI'))
--> Error: Problem with mutate() input 0.95CI.
x undefined columns selected
i Input 0.95CI is .CIpaste(.).
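
A possible direction for a fix, sketched under the assumption that the renaming only needs to touch entries that are actually present. label_ci_stats is a hypothetical helper, not the current get_summary code.

```r
# Hypothetical helper: prefix confidence-related entries of statlist with the
# confidence level and leave all other entries untouched.
label_ci_stats <- function(statlist, conf_int = 0.95) {
  is_ci <- grepl("CL", statlist, fixed = TRUE) |
    grepl("CI", statlist, fixed = TRUE)
  statlist[is_ci] <- paste0(conf_int, statlist[is_ci])
  statlist
}

with_ci    <- label_ci_stats(c("events", "median", "CI"))
without_ci <- label_ci_stats(c("median"))
all_three  <- label_ci_stats(c("LCL", "UCL", "CI"))
```

Downstream code can then look up the renamed entries in the returned vector instead of relying on a CI_Varname variable that may never have been set.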

Dependency on Survival - Fix Call

Currently we prefix all function calls, except for eg Survfit. One way to fix this is update the Call in estimate_KM, but this does not seem to be allowed. Another is to parse all calls downstream.

convert render_table into an S3 method

Converting this core package function into an S3 method would:

  • make the package more extendable and accessible to more developers to add on different table engines that they use.
  • make testing more compartmental and simpler to track since each engine would be self contained.
  • create a simpler way to make knitr::knit_print methods to better connect with knitr and its methods.

vr_render_table <- function(
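
A rough sketch of what engine-based S3 dispatch could look like. All names here (render_table, render_engine and its methods) are hypothetical, and the kable method returns a placeholder string where a real method would call the table engine.

```r
# Hypothetical generic: one method per table engine.
render_engine <- function(engine, data, title, datasource) {
  UseMethod("render_engine")
}

# Unknown engines fail with a clear message.
render_engine.default <- function(engine, ...) {
  stop("Unsupported engine: ", class(engine)[1], call. = FALSE)
}

# Placeholder body; a real method would call knitr::kable() here.
render_engine.kable <- function(engine, data, title, datasource) {
  sprintf("[kable] %s (%d rows); Data Source: %s", title, nrow(data), datasource)
}

# User-facing function: turn the engine name into an S3 class and dispatch.
render_table <- function(data, title = "", datasource = "", engine = "kable") {
  render_engine(structure(list(), class = engine), data, title, datasource)
}

out <- render_table(head(mtcars, 3), title = "Cars", datasource = "mtcars")
```

Adding a new engine then only requires a new render_engine method, which can be tested on its own.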

Replace globalvariables use with maintainable approach

Work through the list of variables referenced in visR globalvariable to set those variables to NULL within the function call, where possible. At the moment we are using this to suppress note in R CMD check "no visible binding". Ideally, we resolve this at the function level.

See here for more details on the alternative approach STAT545-UBC/Discussion#451
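
For illustration, here is a minimal sketch of the function-level approach. The function and column names are made up, and base R's with() stands in for the dplyr verbs used in the package; the point is only the local NULL binding.

```r
# Hypothetical example function that refers to data frame columns by bare name.
max_time_by_arm <- function(df) {
  # Bind the column names locally so static code checks find a definition
  # inside the function instead of assuming a global variable.
  aval <- trtp <- NULL
  # with() looks up aval and trtp in df first, so the NULLs are never used.
  with(df, tapply(aval, trtp, max))
}

toy <- data.frame(
  trtp = c("A", "A", "B"),
  aval = c(1, 5, 3)
)
res <- max_time_by_arm(toy)
```

For dplyr pipelines, a commonly used alternative is the .data pronoun (for example .data$aval), which avoids the bare names altogether.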

General API style

I think the API as shown in the kick-off by Steve is already great and I really like it a lot!
The key points are:

  • Different classes model estimation / necessary calculations (implemented in the vr_KM_est.R) and plotting (vr_plot.R)
  • Very generic plotting (vr_plot.R) and generic plotting-helpers (e.g. add_CI), with also consistent and clear naming.
  • Use of the general tidyverse philosophy and especially the pipe operator.

A next step could be refactoring another plotting function to be integrated into the existing API, for example the alluvial plot.
The next steps would then be:

  • Changing the referencing of column names. Right now they are renamed in the data.frame, I would use the parameter names instead (cf. issue 82).
  • Creating an object for the result of the precalculations necessary to plot the alluvial (function vr_alluvial_wrangling).
  • Separate alluvial base plot from addons (like adding the N) to have the same calling structure as in the survival example.
  • Integrating the alluvial plotting into the vr_plot method.

Each point would be a separate GitHub issue, but I don't want to clutter the issue list before we have decided whether to do it like this.

[KM Plot] Parameter to define ratio between KM Plot and risktable

When following good principles for plotting KM curves, it is advised by Morris et al. to also show an extended risktable with N at risk, censored and events. In its current implementation, this may lead to a rather small KM plot and the risk tables taking up a large part (I'd guess 50%) of the figure.
It would be nice to adjust this ratio (similarly to the tables.height parameter in ggsurvplot) or have some other options to reduce the plot real estate taken up by the table-part.

image

add texPreview as an option to preview latex tables

It could be a nice feature to add a print method to preview latex tables using texPreview (apologies for the shameless self plug). Many times it is a pain to tweak latex tables since they need to be rendered in a pdf doc to see them. texPreview quickly renders latex in the preview window of RStudio so users can iterate over tables much like they do in ggplot workflows.
