neelsoumya / dssurvival

Survival functions for DataSHIELD. Package for building survival models, Cox proportional hazards models and Cox regression models in DataSHIELD.

Home Page: https://neelsoumya.github.io/dsSurvivalbookdown/

License: GNU General Public License v3.0

R 97.22% TeX 2.78%
survival-analysis datashield cox-models cox-regression clinical-informatics meta-analysis survival-models datashield-technical-team survival-functions r

dsSurvival

Introduction

dsSurvival is a package of survival functions for DataSHIELD (a platform for federated analysis of private data). It provides the server-side functions for survival models, Cox proportional hazards models and Cox regression models.

A tutorial in bookdown format with executable code is available here:

https://neelsoumya.github.io/dsSurvivalbookdown/

DataSHIELD is a platform in R for federated analysis of private data. DataSHIELD has a client-server architecture, and this package has a client-side component (dsSurvivalClient) and a server-side component (dsSurvival).

If you use the code, please cite the following manuscript:

Banerjee S, Sofack G, Papakonstantinou T, Avraam D, Burton P, et al. (2022), dsSurvival: Privacy preserving survival models for federated individual patient meta-analysis in DataSHIELD, bioRxiv: 2022.01.04.471418.

https://www.biorxiv.org/content/10.1101/2022.01.04.471418v2

https://doi.org/10.1101/2022.01.04.471418

https://bmcresnotes.biomedcentral.com/articles/10.1186/s13104-022-06085-1

A bibliography file is available here:

https://github.com/neelsoumya/dsSurvival/blob/main/CITATION.bib

@article{Banerjee2022,
author = {Banerjee, Soumya and Sofack, Ghislain and Papakonstantinou, Thodoris and Avraam, Demetris and Burton, Paul and Z{\"{o}}ller, Daniela and Bishop, Tom RP},
doi = {10.1101/2022.01.04.471418},
journal = {bioRxiv},
month = {jan},
pages = {2022.01.04.471418},
publisher = {Cold Spring Harbor Laboratory},
title = {{dsSurvival: Privacy preserving survival models for federated individual patient meta-analysis in DataSHIELD}},
year = {2022}
}

Quick start

  • Install R and RStudio:

https://www.rstudio.com/products/rstudio/download/preview/

  • Install the following packages in R:
install.packages('devtools')
library(devtools)
devtools::install_github('neelsoumya/dsSurvivalClient')
devtools::install_github('datashield/[email protected]')
install.packages('rmarkdown')
install.packages('knitr')
install.packages('tinytex')
install.packages('metafor')
install.packages('DSOpal')
install.packages('DSI')
install.packages('opalr')
install.packages('patchwork')
  • Follow the tutorial in bookdown format with executable code and synthetic data:

https://neelsoumya.github.io/dsSurvivalbookdown/

The tutorial uses the Opal demo server, which has all server-side packages preinstalled:

https://opal-sandbox.mrc-epid.cam.ac.uk/

You can also see the script simple_script.R

https://github.com/neelsoumya/dsSurvival/blob/main/vignettes/simple_script.R

Installation

Screenshot of installation of package in VM

In the newer version of Opal your window may look like the screenshot below

Screenshot in new version of Opal

See the link below on how to install a package in Opal

https://opaldoc.obiba.org/en/latest/web-user-guide/administration/datashield.html#add-package

If you have an older version of Opal, please use this version by Stuart Wheater

https://github.com/StuartWheater/dsSurvival

install.packages('devtools')

library(devtools)

devtools::install_github('neelsoumya/dsBaseClient')

devtools::install_github('neelsoumya/dsSurvivalClient')

If you want to use a certain release then you can do the following

library(devtools)

devtools::install_github('neelsoumya/[email protected]')

If you want to try privacy preserving survival curves (available in v2.0), you can use the main branch or you can do the following

library(devtools)

devtools::install_github('neelsoumya/dsSurvivalClient', ref = 'privacy_survival_curves')

or

library(devtools)

devtools::install_github('neelsoumya/[email protected]')

Usage

A tutorial with executable code in bookdown format is available here:

https://neelsoumya.github.io/dsSurvivalbookdown/

A screenshot of the meta-analyzed hazard ratios from a survival model is shown below.

For polished, publication-ready plots, use the script forestplot_FINAL.R

https://github.com/neelsoumya/dsSurvival/blob/main/forestplot_FINAL.R

or the script simple_script.R

https://github.com/neelsoumya/dsSurvival/blob/main/vignettes/simple_script.R

If you want to plot survival curves or Kaplan-Meier curves, see the following link:

https://neelsoumya.github.io/dsSurvivalbookdown/computational-workflow.html#plotting-of-privacy-preserving-survival-curves

If you want to learn the basics of survival models, please see the following repository:

https://github.com/neelsoumya/survival_models

If you want to learn coding models in DataSHIELD, see the following repository:

https://github.com/neelsoumya/dsMiscellaneous

Release notes

v1.0.0: A basic first release of survival models in DataSHIELD. This release has Cox proportional hazards models, summaries of models, diagnostics and the ability to meta-analyze hazard ratios. There is also capability to generate forest plots of meta-analyzed hazard ratios. This release supports study-level meta-analysis (SLMA).

A shiny graphical user interface for building survival models in DataSHIELD has also been created by Xavier Escriba Montagut and Juan Gonzalez. It uses dsSurvival and dsSurvivalClient.

v1.0.1: Minor fixes.

v2.0.0: This has privacy preserving survival curves.

v2.1.0: This has vcov() functionality.

v2.1.1: This has minor fixes.

v2.1.2: This has minor fixes.

v2.1.3: This has minor fixes, fixes for plotting of a stratified survival analysis and use of ggplot in plotting survival curves.

Acknowledgements

We acknowledge the help and support of the DataSHIELD technical team. We are especially grateful to Elaine Smith, Eleanor Hyde, Shareen Tan, Stuart Wheater, Yannick Marcon, Paul Burton, Demetris Avraam, Patricia Ryser-Welch, Kevin Rue-Albrecht, Maria Gomez Vazquez and Wolfgang Viechtbauer for fruitful discussions and feedback.

We thank Yannick Marcon and @StuartWheater for fixes, @joerghenkebuero for suggestions about documentation, @AlanRace and Stefan Buchka for bug fixes and Xavier Escriba Montagut for a fix to the plotting functionality.

Contact

  • Soumya Banerjee, Demetris Avraam, Paul Burton, Xavier Escriba Montagut, Juan Gonzalez, Tom R. P. Bishop and DataSHIELD technical team

  • [email protected]

  • DataSHIELD

Publications

The following publications describe dsSurvival

Banerjee, S., Sofack, G.N., Papakonstantinou, T. et al. dsSurvival: Privacy preserving survival models for federated individual patient meta-analysis in DataSHIELD. BMC Res Notes 15, 197 (2022). https://doi.org/10.1186/s13104-022-06085-1

Banerjee, S., Bishop, T.R.P. dsSurvival 2.0: privacy enhancing survival curves for survival models in the federated DataSHIELD analysis system. BMC Res Notes 16, 98 (2023). https://doi.org/10.1186/s13104-023-06372-5

dssurvival's People

Contributors

escri11, neelsoumya, stuartwheater, ymarcon

dssurvival's Issues

confirm deterministic technique

In the prototyping code, two different approaches are tested for the deterministic anonymisation. The first applies the process to the individuals before Kaplan-Meier; the second applies it afterwards (i.e. to groups of individuals with the same survival/censoring time).

Here it looks like we have the first method. I am not sure if it makes any difference which one is used but it would be good to at least state which was chosen.

Rationalise coxphSLMAassignDS and coxphSLMADS

coxphSLMAassignDS is designed to store a Cox model on the server side (assign) and coxphSLMADS is designed to return a summary of a Cox model (aggregate)

The only line that should be different is 207, where one returns the summary and the other the complete object.

There are some other differences, though, around these lines:

	    control <- gsub("~ bbbb", "", control, fixed = TRUE)
	    control <- gsub("~", "", control, fixed = TRUE)
	    control <- gsub("bbbb", "", control, fixed = TRUE)     

compared with:

control <- gsub("~bbbb", "", control, fixed = TRUE) 

I think this second one will miss ~ bbbb
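The difference can be checked in isolation with base R. The sketch below wraps the two variants quoted above in hypothetical helper functions (strip_three_step and strip_single are illustrative names, not functions in dsSurvival) and applies them to the placeholder written with and without a space:

```r
# Hypothetical wrappers around the two gsub() variants quoted in this issue.
strip_three_step <- function(control) {
  control <- gsub("~ bbbb", "", control, fixed = TRUE)
  control <- gsub("~", "", control, fixed = TRUE)
  gsub("bbbb", "", control, fixed = TRUE)
}

strip_single <- function(control) {
  gsub("~bbbb", "", control, fixed = TRUE)
}

# The placeholder written without and with a space after the tilde.
inputs <- c("~bbbb", "~ bbbb")

three_step_result <- strip_three_step(inputs)  # both placeholders removed
single_result     <- strip_single(inputs)      # "~ bbbb" is left untouched

print(three_step_result)
print(single_result)
```

With fixed = TRUE the pattern is matched literally, so "~bbbb" cannot match a string that contains a space after the tilde.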

Also in terms of maintainability, there should probably be a single function that generates the Cox model and applies the disclosure control (this might be called coxphSLMADS). Then there should be two other functions for the aggregate and assign operations, that call the main function. The aggregate would just call the main function, then do summary() on the coxph object and return to the client. The assign just returns the coxph object which stores it on the server side. That way the bulk of the code is in one place.
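A minimal sketch of that structure, using the survival package and its lung example data, might look like the following. The function names and the row-count check are hypothetical placeholders for illustration; they are not the existing dsSurvival functions or its actual disclosure controls.

```r
library(survival)

# Hypothetical shared helper: fits the Cox model and applies the checks in one place.
# The row-count threshold is a stand-in for the real disclosure control.
fit_cox_checked <- function(formula, data, min_rows = 5) {
  if (nrow(data) < min_rows) {
    stop("too few rows to fit a model safely")
  }
  coxph(formula, data = data)
}

# Aggregate-style wrapper: only the model summary is returned.
cox_aggregate <- function(formula, data) {
  summary(fit_cox_checked(formula, data))
}

# Assign-style wrapper: the complete coxph object is returned.
cox_assign <- function(formula, data) {
  fit_cox_checked(formula, data)
}

model_formula <- Surv(time, status) ~ age + sex
fit         <- cox_assign(model_formula, lung)
fit_summary <- cox_aggregate(model_formula, lung)
```

Any change to the model fitting or to the checks then only needs to be made in the shared helper.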

Check the disclosiveness of n.risk, n.events

Add vcov functionality

In standard R with the survival package one can get the variance-covariance matrix using vcov()

In a future version of dsSurvival this should be added. If the model is stored on the server, then it is a case of running vcov() on that model and returning the results.

If the model itself has passed the disclosure checks, then it should be OK to release these summary statistics.
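For reference, this is what the call looks like in standard (non-federated) R. The sketch uses the lung example data shipped with the survival package and says nothing about how a DataSHIELD version would be implemented:

```r
library(survival)

# Fit a Cox model locally on the lung example data from the survival package.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Variance-covariance matrix of the estimated coefficients.
vc <- vcov(fit)

# The square roots of the diagonal are the standard errors of the coefficients.
se_from_vcov <- sqrt(diag(vc))

print(vc)
print(se_from_vcov)
```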

dsBase dependency missing

According to what I can see in the source code, this package depends on dsBase. You should specify it in the DESCRIPTION file (with I guess a minimum version of dsBase). It is also not necessary to repeat the DataSHIELD Options settings that are already defined by dsBase.

invalid documentation

The installation instructions for DataSHIELD packages do not match the current Opal version

Try to install this R package in a current OPAL (version 4.4.10)

Expected behavior
The documentation of this package does not match the dialogs in Opal; see the screenshots of the current forms

Screenshots
(Two screenshots of the current Opal forms.)

Desktop (please complete the following information):
does not matter

Smartphone (please complete the following information):
does not matter

Additional context
Please update the installation procedures or create multiple versions if you want to keep the old versions, too

Wrong package version

Hi,

The DESCRIPTION file states that the package version is 6.2.0-1 whereas the tag is v1.0.0.

listDisclosureSettingsDS() duplicated from dsBase

Describe the bug
listDisclosureSettingsDS() is duplicated from dsBase. Given that this package imports dsBase, listDisclosureSettingsDS() should be used from there

To Reproduce
Steps to reproduce the behavior:

  1. Load dsBase library
  2. Load dsSurvival library

dsomics_testing-rserver-1 | Attaching package: ‘dsSurvival’
dsomics_testing-rserver-1 |
dsomics_testing-rserver-1 | The following object is masked from ‘package:dsBase’:
dsomics_testing-rserver-1 |
dsomics_testing-rserver-1 | listDisclosureSettingsDS

Expected behavior
No message about duplicated functions

Remove the DataSHIELD options that are duplicated from dsBase

At the moment the DataSHIELD options are duplicated from dsBase. I think the options defined by this package should only be those that are unique to this package (of which there are currently none). Otherwise you might get a conflict between this package and dsBase.

This should work ok if dsBase is a dependency.

ds.cox.zphSLMA(fit = 'coxph_serverside') hangs

Calling ds.cox.zphSLMA(fit = 'coxph_serverside') doesn't appear to work. I get the following message, but it just stays at this point without progressing:

Getting aggregate Study1 (cox.zphSLMADS("coxph_serverside", "km", TRUE, FALSE, TRUE)) []  25% ...

ds.coxphSummary(x = 'coxph_serverside') works fine, so I know the variable is at least on the server. Is there a way to debug what is happening?

minimum version of the survival dependency

We got the following error with the ds.cox.zphSLMA function and found that this particular cohort had version 2.44-1.1 of the survival package. In another cohort with version 3.2-11 of survival the function worked fine. I therefore suggest specifying a minimum required version of the survival package in the DESCRIPTION file.

[1] "Command 'cox.zphSLMADS("coxph_serverside", "km", FALSE, FALSE, TRUE)' failed on 'moba': Error while evaluating 'dsSurvival::cox.zphSLMADS("coxph_serverside", "km", FALSE, FALSE, TRUE)' -> Error in survival::cox.zph(fit = fit_model, transform = transform, terms = terms, : \n unused arguments (terms = terms, singledf = singledf)\n"
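A version comparison of this kind can be sketched in base R. The helper meets_minimum is a hypothetical name, and 3.2-11 is only the version reported to work in this issue, not a confirmed minimum:

```r
# Hypothetical helper: TRUE if an installed version satisfies a minimum version.
meets_minimum <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

# Versions reported in this issue.
failing_cohort <- "2.44-1.1"
working_cohort <- "3.2-11"

# Treat the working version as the assumed minimum for illustration.
failing_ok <- meets_minimum(failing_cohort, working_cohort)
working_ok <- meets_minimum(working_cohort, working_cohort)

print(c(failing = failing_ok, working = working_ok))
```

On a server, the installed version string could be obtained with as.character(utils::packageVersion("survival")) and passed to the same helper.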

prepare for version 2.0 release

prepare for version 2.0 release

update DESCRIPTION file with new version

update README with release notes and description for v2.0

update README and remove privacy_survival_curves branch references anywhere

Incorrect meta analysis in tutorial?

Hi,
Is the code for performing the meta analysis (found here) correct?

Should it not be

input_logHR = c(coxph_model_full$server1$coefficients[1,1], 
        coxph_model_full$server2$coefficients[1,1], 
        coxph_model_full$server3$coefficients[1,1])
        
input_se    = c(coxph_model_full$server1$coefficients[1,3], 
        coxph_model_full$server2$coefficients[1,3], 
        coxph_model_full$server3$coefficients[1,3])
        
meta_model <- metafor::rma(input_logHR, sei = input_se, method = 'REML')

and

metafor::forest.rma(x = meta_model, digits = 4, atransf=exp) 

? Otherwise you're combining exp(coef) with se(coef) rather than coef and se(coef).
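The point about scales can be illustrated without metafor. The sketch below pools hypothetical per-study values with a simple fixed-effect inverse-variance average (a simpler technique than the REML random-effects model in the snippet above) and exponentiates only at the end. The numbers are made up for illustration and are not taken from the tutorial.

```r
# Hypothetical per-study estimates: coef (log hazard ratio) and se(coef).
log_hr <- c(0.10, 0.20, 0.30)
se     <- c(0.10, 0.10, 0.10)

# Fixed-effect inverse-variance pooling, carried out on the log scale.
weights       <- 1 / se^2
pooled_log_hr <- sum(weights * log_hr) / sum(weights)
pooled_se     <- sqrt(1 / sum(weights))

# Back-transform to the hazard ratio scale only for reporting.
pooled_hr <- exp(pooled_log_hr)
ci_95     <- exp(pooled_log_hr + c(-1, 1) * qnorm(0.975) * pooled_se)

print(pooled_hr)
print(ci_95)
```

Because se(coef) is the standard error of the log hazard ratio, it can only be paired with coef; exponentiation is applied to the pooled estimate and its interval afterwards, which is what atransf = exp does in the forest plot call.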
