Anthropometric Cleaning Methods Explorer (ACME)

This application seeks to provide a framework for comparing different anthropometric cleaning methods, including automatic runs for implemented algorithms, comparison visualization, and in-depth walk-throughs of results.

ACME has been implemented to compare infants and adult data cleaning algorithms. For details on the algorithms implemented for each scenario and how to run them, see Included Algorithms and Installation and Quickstart. We also provide guidance on how to adapt ACME for different algorithms.

Installation and Quickstart

To run this application, follow these steps:

  1. Install R and RStudio.
  2. Open RStudio.
  3. In the console window (bottom-left corner by default), enter the following to download necessary packages (note: downloading package requirements may take a while):
install.packages(c("shiny", "ggplot2", "rstudioapi", "colorspace", "plotly", "viridisLite", "ggplotify", "reshape2", "shinyBS", "shinyWidgets", "data.table", "growthcleanr", "lme4", "anthro"))

A package from GitHub is needed as well. To download this package, enter the following in the console:

install.packages("devtools")
devtools::install_github("zeehio/facetscales", ref = "archived")

Versions for packages used in development can be found in the R and Package Versions section.
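
Before running the install commands above, it can help to see which of the listed CRAN packages are already present. The following is a minimal sketch, not part of ACME itself; the helper name missing_packages is hypothetical, and facetscales is left out because it is installed from GitHub rather than CRAN.

```r
# Packages named in the install.packages() call above.
required <- c("shiny", "ggplot2", "rstudioapi", "colorspace", "plotly",
              "viridisLite", "ggplotify", "reshape2", "shinyBS",
              "shinyWidgets", "data.table", "growthcleanr", "lme4", "anthro")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All CRAN packages listed above are installed.")
}
```

Installing only the missing packages avoids re-downloading ones you already have, but it will not update older versions.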

When installing facetscales on Windows, you may be prompted to install Rtools. You can download it by following the instructions listed here, making sure to match the Rtools version to your R version.

If you are on a Mac, R also requires the Cairo package, which in turn requires XQuartz. You may need to download and install XQuartz first, as it does not come with macOS by default. After XQuartz is installed on your system, install Cairo in R:

install.packages("Cairo")

Note that this may ask for some input about updating related packages -- "no" is a fine answer. This may take some time.

  4. Download this repository by clicking "Code" in the top right corner of the repository, and either cloning through git with the given instructions or downloading a ZIP file. If you download the ZIP file, unzip it.
  5. In the downloaded repository folder, open acme_app.R. It should open in the top-left corner pane of RStudio by default.
  6. ACME is set to run the infants comparison algorithms by default. If you would like to use the adult comparison algorithms, comment out line 11 by putting a # at the beginning of the line, and uncomment line 12 by deleting the # at its beginning. To run a custom configuration, please see the section Using the ACME Framework for Other Anthropometric Algorithms.
  7. In the top right corner of the pane with the acme_app.R script, you should see the button "Run App". Click on the small downwards arrow next to it and choose "Run External".
  8. Now click "Run App". This should open the application in your default browser window. If you want to run the application with example data, click the "Run data!" button in the sidebar to get started.
  9. Have fun! More information on running the application and methods involved can be found within the app.

R and Package Versions

ACME was developed using R 4.2.1. Package versions used include:

  • shiny: 1.7.1
  • ggplot2: 3.3.6
  • rstudioapi: 0.13
  • colorspace: 2.0-3
  • plotly: 4.10.0
  • viridisLite: 0.4.0
  • ggplotify: 0.1.0
  • reshape2: 1.4.4
  • shinyBS: 0.61.1
  • shinyWidgets: 0.7.0
  • data.table: 1.14.2
  • growthcleanr: 2.0.1
  • lme4: 1.1-30
  • anthro: 1.0.0

If ACME is not working, please check and update package versions. If your packages are up to date and you are still running into an issue, please create a detailed issue on ACME's GitHub site.
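
One way to check your installed versions against the development versions listed above is sketched below. This is an illustration rather than a tool shipped with ACME: the helper name compare_versions is hypothetical, and only a subset of the packages is included for brevity.

```r
# Development versions listed above (a subset, for brevity).
dev_versions <- c(shiny = "1.7.1", ggplot2 = "3.3.6", data.table = "1.14.2",
                  growthcleanr = "2.0.1", lme4 = "1.1-30", anthro = "1.0.0")

# Hypothetical helper: compare installed versions against expected ones.
compare_versions <- function(expected) {
  inst <- installed.packages()[, "Version"]
  installed <- unname(inst[names(expected)])  # NA when a package is absent
  status <- vapply(seq_along(expected), function(i) {
    if (is.na(installed[i])) return("missing")
    cmp <- compareVersion(installed[i], expected[[i]])  # -1, 0, or 1
    c("older", "same", "newer")[cmp + 2]
  }, character(1))
  data.frame(package = names(expected),
             expected = unname(expected),
             installed = installed,
             status = status,
             stringsAsFactors = FALSE)
}

print(compare_versions(dev_versions))
```

Rows with status "missing" or "older" are the first candidates to install or update. If you do open an issue, this table is useful detail to include.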

Data Format

Data format is modeled after the growthcleanr algorithm. To use data within ACME, your data must contain columns with the following format (names must be exact):

  • id: number for each row, must be unique
  • subjid: character, subject ID
  • param: character, parameter for each measurement; must be either HEIGHTCM (height in centimeters) or WEIGHTKG (weight in kilograms)
  • measurement: numeric, measurement of height or weight, corresponding to the parameter
  • age_years AND/OR age_days: numeric, age in years or age in days
  • sex: numeric, 0 (male) or 1 (female)
  • answers: (not required) character, an answer column, indicating whether the value should be Include or Implausible
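
As an illustration of the format above, the sketch below builds a small data frame with the required columns and checks it against the listed rules. The measurement values are made up, and check_acme_format is a hypothetical helper, not a function provided by ACME or growthcleanr.

```r
# Illustrative records only; the values are made up for this sketch.
example_data <- data.frame(
  id = 1:4,
  subjid = c("A", "A", "B", "B"),
  param = c("HEIGHTCM", "WEIGHTKG", "HEIGHTCM", "WEIGHTKG"),
  measurement = c(75.2, 9.8, 168.0, 61.5),
  age_years = c(1, 1, 30, 30),
  sex = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return a character vector of problems (empty if none).
check_acme_format <- function(df) {
  problems <- character(0)
  required <- c("id", "subjid", "param", "measurement", "sex")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  if (!any(c("age_years", "age_days") %in% names(df))) {
    problems <- c(problems, "need age_years and/or age_days")
  }
  if ("id" %in% names(df) && anyDuplicated(df$id) > 0) {
    problems <- c(problems, "id values are not unique")
  }
  if ("param" %in% names(df) &&
      !all(df$param %in% c("HEIGHTCM", "WEIGHTKG"))) {
    problems <- c(problems, "param must be HEIGHTCM or WEIGHTKG")
  }
  if ("sex" %in% names(df) && !all(df$sex %in% c(0, 1))) {
    problems <- c(problems, "sex must be 0 (male) or 1 (female)")
  }
  if ("measurement" %in% names(df) && !is.numeric(df$measurement)) {
    problems <- c(problems, "measurement must be numeric")
  }
  problems
}

check_acme_format(example_data)
```

An empty result means none of the checked rules were violated. The data is in long format, so a subject with both a height and a weight at one visit has two rows.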

Included Algorithms

Algorithms included for comparison of infants algorithms are:

Algorithms included for comparison of adult algorithms are:

If you would like to include other algorithms in the ACME framework for comparison, see the Using the ACME Framework for Other Anthropometric Algorithms section.

Features

Once you have ACME up and have run your data through, there are two parts of the application: Compare Results and Examine Methods.

Compare Results Tab View

Compare Results lets you compare algorithm results at a high level, with tabs including:

  • Overall: Bar graphs of counts of implausible values, reasons for those implausible values, and a correlation matrix of similarity between methods
  • Individual: Scatterplots of results for a single subject, colored by whether the point was implausible by any method or not. Also includes lines of best fit and standard deviations. Counts and reasons for implausibility appear below the plot.
  • All Individuals: Heat map of results for all subjects across all methods, colored by correct or incorrect answers if included.
  • Check Answers: Bar graphs of count of correct answers for both included and implausible values, if included.
  • View Results: Table of results for each record.
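
To give a rough sense of the summaries on the Overall tab, the sketch below counts implausible values per method and computes a correlation matrix between methods. The data and the method names (method_a, method_b, method_c) are made up, and the calculation is an assumption for illustration; the app's own computation may differ.

```r
# Made-up results from three hypothetical methods for the same six records.
results <- data.frame(
  id = 1:6,
  method_a = c("Include", "Implausible", "Include",
               "Include", "Implausible", "Include"),
  method_b = c("Include", "Implausible", "Include",
               "Implausible", "Implausible", "Include"),
  method_c = c("Include", "Include", "Include",
               "Include", "Implausible", "Include"),
  stringsAsFactors = FALSE
)

methods <- setdiff(names(results), "id")

# Logical matrix: TRUE where a method marked the record as implausible.
flagged <- sapply(results[methods], function(x) x == "Implausible")

# Count of implausible values per method.
implausible_counts <- colSums(flagged)

# Pairwise correlation between the methods' 0/1 implausible indicators.
similarity <- cor(flagged * 1)

print(implausible_counts)
print(round(similarity, 2))
```

Correlation is undefined for a method that flags every record or none, so a real comparison would need to handle that case.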

Examine Methods Tab View

Examine Methods lets you dig deeper into the reasons for each method's choices. Graphs to the right show an individual subject's values for height and weight, starting before the algorithm is run. Using the slider above will let you "step through" each step in the algorithm, with:

  • A short description of each step
  • Updated graphs for that subject, if the step resulted in any values being designated as implausible
  • A table below showing exact values for the relevant step (height, weight, or both), and values that the algorithm used to make the decision. Hovering over any point in the plot will highlight its corresponding column in the table.

Using the ACME Framework for Other Anthropometric Algorithms

Though ACME has default configurations for sets of adult and infant algorithms, ACME can be adjusted to run and compare other sets of implemented algorithms. The process for making these changes is below, but you can also use the infants and adult sets of algorithms as working examples.

  1. Name your set of algorithms with a single, non-spaced term (for example, "adult", "infants"). You will be using this term for organization throughout the ACME framework. For this set of instructions, we will call this term "group".

  2. In "EHR_Cleaning_Implementations", create a folder called "group" for your algorithm implementation scripts.

  3. In that folder, create scripts for your algorithm implementations. For clarity, I implemented each algorithm in its own script. Algorithms should be functions with the following signature and inputs/outputs:

# inputs:
# df: data frame with 7 columns:
#   id: row id, must be unique
#   subjid: subject id
#   sex: sex of subject
#   age_days: age, in days OR age_years: age, in years, depending on your implementation
#   param: HEIGHTCM or WEIGHTKG
#   measurement: height or weight measurement
# inter_vals: boolean, return intermediate values
#
# outputs:
#   df, with additional columns:
#     result, which specifies whether the measurement should be included,
#       or is implausible.
#     reason, which specifies, for implausible values, the reason for exclusion,
#       and the step at which exclusion occurred.
#     intermediate value columns, if specified
algorithm_function <- function(df, inter_vals = FALSE){
  return(df)
}

To add intermediate steps (for the "Compare" section of ACME), add columns with intermediate values corresponding to each step in the format Step_(NUMBER)(h or w)_(STEP TITLE). These intermediate step columns should only be added to output if inter_vals = TRUE. I would highly recommend looking at other algorithm implementations for examples.
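
As a concrete illustration of the signature and the intermediate-column naming, here is a toy algorithm that flags measurements outside fixed bounds. Everything specific in it is hypothetical and chosen only for this sketch: the function name range_check_clean, the bounds, the reason text, and the step title. It is not one of ACME's implemented algorithms, and the column naming is only one reading of the pattern described above.

```r
# Toy algorithm for illustration only: flags values outside fixed bounds.
# The bounds, reason text, and step title are hypothetical, not from ACME.
range_check_clean <- function(df, inter_vals = FALSE) {
  is_height <- df$param == "HEIGHTCM"
  lower <- ifelse(is_height, 20, 0.5)    # cm for height, kg for weight
  upper <- ifelse(is_height, 250, 300)
  out_of_range <- df$measurement < lower | df$measurement > upper

  df$result <- ifelse(out_of_range, "Implausible", "Include")
  df$reason <- ifelse(out_of_range,
                      paste0("Out of range, Step 1",
                             ifelse(is_height, "h", "w")),
                      "")

  if (inter_vals) {
    # One reading of Step_(NUMBER)(h or w)_(STEP TITLE): one column per
    # step and parameter, NA where the step does not apply to the row.
    df$Step_1h_Out_of_Range <- ifelse(is_height, out_of_range, NA)
    df$Step_1w_Out_of_Range <- ifelse(!is_height, out_of_range, NA)
  }
  df
}

# Usage on made-up data: the third record (1700 cm) should be flagged.
toy <- data.frame(
  id = 1:3,
  subjid = "A",
  sex = 0,
  age_years = c(30, 30, 31),
  param = c("HEIGHTCM", "WEIGHTKG", "HEIGHTCM"),
  measurement = c(170, 70, 1700),
  stringsAsFactors = FALSE
)
cleaned <- range_check_clean(toy, inter_vals = TRUE)
print(cleaned[, c("id", "param", "measurement", "result", "reason")])
```

The existing infants and adult implementations remain the authoritative examples of what reason strings and intermediate columns ACME expects.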

  4. In "Data", create a folder called "group" for your example datasets.

  5. Add a small subset of example data in the format required (see the Data Format section) as a CSV. If you have an additional file that has the results included, you can include that within that folder as well.

  6. Create your "config" file in the main repository folder, using "EXAMPLE_config.R" as a base. "EXAMPLE_config.R" has directions on the sections that need to be changed, designated with TODO, including:

  • Comparison Title
  • Data Files
  • Age Range Specifications
  • Algorithms Implemented
  • Algorithms With Intermediate Values Implemented
  • Algorithm Documentation for ACME

  7. Open "acme_app.R" and change line 11 to specify your config file.

  8. Then follow the instructions in "Installation and Quickstart" to get started!

Notice

Copyright 2021 - 2023 The MITRE Corporation.

Approved for Public Release; Distribution Unlimited. Case Number 19-2008
