Clinical Trial Representation

Quantify representativeness in randomized clinical trials and provide insights to improve clinical trial equity and health equity.

Contents

  1. Description
  2. Visualization Tool
  3. Example RCTs
  4. Sample Visualization
  5. Statistical Results
  6. Data Availability
  7. Contact
  8. License
  9. Acknowledgments

Description

We develop randomized clinical trial (RCT) representativeness metrics based on Machine Learning (ML) Fairness Research. Visualizations and statistical tests based on proposed metrics enable researchers and physicians to rapidly visualize and assess subgroup representation in RCTs. The approach enables users to determine underrepresentation, absence, or other misrepresentation of subgroups indicating potential limitations of RCTs. The method could help support generalizability evaluation of existing RCT cohorts, enrollment target decisions for new RCTs (if eligibility criteria are included), and monitoring of RCT enrollment, ultimately contributing to more equitable public health outcomes.

What's the problem?

Within the field of RCT research, there has been ongoing concern that RCTs which lack a diversity of participants may not provide clear evidence of efficacy and safety for new interventions in underrepresented or missing subpopulations. Standardized methods are needed to assess potential representation disparities between RCT cohorts and the broader populations who could benefit from novel interventions.

How can technology help?

Extensive research in ML Fairness has created metrics for quantifying inequities in trained ML classification models and for creating novel ML approaches to address these inequities.

Our novel insight is that the sampling of a subject into an RCT can be regarded as a random classification function. Applying ML Fairness metrics to this classification problem creates novel representativeness metrics for RCTs and other clinical studies. The representativeness metrics capture how well the actual sampling of subjects into an RCT matches a hypothetical random sampling.

The solution

We compare the observed rate of a subgroup in the RCT to the hypothetical ideal rate in an equitable RCT in which participants are assigned truly randomly to the clinical trial. By considering assignment to the clinical trial as a random classification function, we develop standardized metrics based on variations of ML fairness metrics, focusing here on "Log Disparity." The resulting metrics are functions of the disease-specific observed and ideal rates at which protected subgroups are sampled into the RCT.
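
As a rough illustration of the comparison described above, the sketch below computes a log-odds-ratio form of a disparity metric from an observed rate and an ideal rate. The exact definition of Log Disparity is given in the paper; the formula used here, the function name `log_disparity`, and the example counts are assumptions for illustration, and this is not the repository's R code.

```python
import math

def log_disparity(observed_rate: float, ideal_rate: float) -> float:
    """Log of the odds ratio between observed and ideal subgroup rates.

    Assumed form (illustrative only): log[(o / (1 - o)) / (i / (1 - i))].
    0 means the rates match, a negative value means the subgroup appears
    less often than the ideal, a positive value means more often.
    """
    if not 0.0 < ideal_rate < 1.0:
        raise ValueError("ideal_rate must be strictly between 0 and 1")
    if not 0.0 <= observed_rate <= 1.0:
        raise ValueError("observed_rate must be in [0, 1]")
    if observed_rate == 0.0:
        return -math.inf  # subgroup absent from the cohort
    if observed_rate == 1.0:
        return math.inf   # cohort consists only of this subgroup
    observed_odds = observed_rate / (1.0 - observed_rate)
    ideal_odds = ideal_rate / (1.0 - ideal_rate)
    return math.log(observed_odds / ideal_odds)

# Hypothetical numbers: 120 of 1,000 participants belong to a subgroup
# that makes up 20% of the target population.
observed = 120 / 1000
ideal = 0.20
value = log_disparity(observed, ideal)
print(round(value, 3))
```

Under this assumed form, a value below zero would suggest that the subgroup is sampled into the RCT less often than a truly random sampling of the target population would produce.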

Visualization Tool

You can run the example studies on RCT Representativeness Visualization (Paper Supplement).

The tool can

  1. Measure representativeness of any subgroup of interest in randomized clinical trials
  2. Visualize representation for subgroups
  3. Compare representation within/among studies

The R code is available in the folder Visualization Codes.

Prerequisites

To run the tool, you need:

  1. Software or a server that can run R Shiny code
  2. The R packages used in the code (see Source.R)

Example RCTs

We apply the proposed RCT representativeness metrics to three landmark clinical trials released in the last decade: Action to Control Cardiovascular Risk in Diabetes (ACCORD), Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT), and Systolic Blood Pressure Intervention Trial (SPRINT). All participant data were obtained through the Biologic Specimen and Data Repositories Information Coordinating Center (BioLINCC).

RCT Representativeness Visualization (Paper Supplement)

The distributions of subgroups defined over age and race/ethnicity in both the target population and ACCORD

This figure shows the distributions of participants from different age and race/ethnicity groups in the ACCORD RCT and in the target population. It makes clear that young participants are missing from the clinical trial. A taller red bin indicates that the subgroup may be overrepresented in the RCT (e.g., non-Hispanic white subjects aged 45-64), while a taller green bin indicates that the subgroup may be underrepresented in the RCT (e.g., Hispanic participants aged 45-64). A wider green bin means that the subgroup is missing from the clinical trial (e.g., non-Hispanic white participants aged 18-44).

Color representation of representativeness levels

In our visualization, dark gray indicates that no people with the selected protected attributes exist in the target population. Light gray means that the subgroup is missing from both the target population and the study cohort. Dark red marks a subgroup that is absent from the cohort. Light orange and orange mark subgroups that are not sufficiently represented and may be at risk of insufficient enrollment in the clinical trial cohort. Light blue and blue, on the other hand, identify potentially advantaged subgroups, which may make an ineffective treatment seem helpful or vice versa. Teal shows that the subgroup is equitably represented in the clinical trial.

Color representation with corresponding metric values

The categorized metric values for the different representativeness levels, with their corresponding colors in the visualization, are shown above.

The Log Disparity representativeness levels of subgroups defined over race/ethnicity, gender, age, and education, and the corresponding function of observed rate for female subjects from other races with some college/technical school education and age over 64 in ACCORD, with significance level = 0.05, lower metric threshold = 0.2, and upper metric threshold = 0.4.

This figure presents the representativeness levels of ACCORD subgroups defined by race/ethnicity, gender, age, and education, from the inner ring to the outer ring, using the Log Disparity metric. Hovering the pointer over a subgroup area on the sunburst displays the representativeness label, ideal rate, and observed rate of that subgroup. The corresponding metric, plotted as a function of the observed rate, is shown on the side. On that plot, the green line indicates the ideal rate and the brown line indicates the current observed rate in the RCT.
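
The mapping from a metric value to a representativeness level can be sketched roughly as follows. The significance level (0.05) and the lower and upper thresholds (0.2 and 0.4) come from the caption above; the function name, the label strings, the order of the checks, and the handling of values exactly at a threshold are assumptions, and the tool's own rules may differ.

```python
def representativeness_level(metric, p_value, observed_rate,
                             alpha=0.05, lower=0.2, upper=0.4):
    """Assign an illustrative representativeness label to one subgroup.

    metric        -- signed metric value (negative = under, positive = over)
    p_value       -- p-value of the test comparing observed and ideal rates
    observed_rate -- subgroup rate observed in the RCT cohort
    """
    if observed_rate == 0:
        return "absent"
    # Assumption: a difference that is not statistically significant, or
    # whose magnitude is below the lower threshold, counts as representative.
    if p_value >= alpha or abs(metric) < lower:
        return "representative"
    direction = "under" if metric < 0 else "over"
    strength = "highly " if abs(metric) >= upper else ""
    return f"{strength}{direction}represented"

# Hypothetical subgroups: (metric, p-value, observed rate)
examples = [(-0.61, 0.001, 0.12), (0.30, 0.010, 0.30), (-0.50, 0.200, 0.10)]
labels = [representativeness_level(m, p, o) for m, p, o in examples]
print(labels)
```

In this sketch the two-step shading in the figure (light orange versus orange, light blue versus blue) corresponds to the plain and "highly" variants of each label.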

The Normalized Parity representativeness levels of subgroups defined over race/ethnicity, gender, age, and education, and the corresponding function of observed rate for female subjects from other races with some college/technical school education and age over 64 in ACCORD, with significance level = 0.05, lower metric threshold = 0.1, and upper metric threshold = 0.2.

This figure presents the representativeness levels of ACCORD subgroups defined by race/ethnicity, gender, age, and education from the inner ring to the outer ring using the Normalized Parity metric.

Statistical Results

Subgroup data for ACCORD, ALLHAT, and SPRINT are summarised with the subgroups' characteristics, observed rates in the RCT, ideal rates, Log Disparity value, group size, p-value, and BH p-value.

The results are available as csv files in the folder Statistical Results for Example Studies.
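
For readers unfamiliar with the adjusted p-value column, the sketch below shows how such a column could be computed, assuming that "BH p-value" refers to the Benjamini-Hochberg adjustment for multiple comparisons. The function name and the input p-values are hypothetical, and this is not the code that produced the csv files.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone and never exceed 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four subgroups.
raw = [0.01, 0.04, 0.03, 0.20]
bh = benjamini_hochberg(raw)
print([round(p, 4) for p in bh])
```

Adjusting across all subgroups of a study limits the expected share of false findings when many subgroups are tested at once.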

Data Availability

Our dataset "Quantifying representativeness in RCTs using ML fairness metrics - Data and codes" is available in the Dryad Digital Repository at https://doi.org/10.5061/dryad.76hdr7sxf.

The software is being published at Zenodo: https://doi.org/10.5281/zenodo.5348584.

The supplemental information is being published at Zenodo: https://doi.org/10.5281/zenodo.5348586.

Contact

Miao Qi - [email protected]

Project GitHub Link: ClinicalTrialEquity

License

This project is licensed under the Apache 2 License.

Acknowledgments

  • This work was primarily funded by IBM Research AI Horizons Network with the Rensselaer Institute for Data Exploration and Applications (IDEA)
