MIMIC Clinical Decision Making Framework

🔥 New Addition: Llama 3 has been added to the leaderboard! 🔥

This repository contains the code for running the clinical decision making task using the MIMIC CDM dataset.

The code to create the dataset is found at: https://github.com/paulhager/MIMIC-Clinical-Decision-Making-Dataset

The dataset is based on the MIMIC-IV database. Access can be requested here: https://physionet.org/content/mimiciv/2.2/

A pre-processed version of the dataset is found here: https://physionet.org/content/mimic-iv-ext-cdm/

Visit https://huggingface.co/spaces/MIMIC-CDM/leaderboard to check out the current leaderboard. I will update it as new models are released. If you would like a model to be tested and added to the board, please email me at paul (dot) hager (at) tum (dot) de.

MIMIC CDM

This code simulates a realistic clinical environment where an LLM is provided with the history of present illness of a patient and then tasked to gather information to come to a final diagnosis and treatment plan.

To run the clinical decision making task, execute python run.py. The arguments for this file are specified through config files managed by the hydra library and found under configs. The most important arguments are:

  • pathology: Specify one of appendicitis, cholecystitis, diverticulitis, pancreatitis
  • model: Specify which model to use. The model file also contains the different role tags
  • summarize: Automatically summarize the progress if we begin to reach the token limit
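
As a rough illustration of how these arguments might be passed, the sketch below assembles a command line using Hydra's key=value override syntax. The helper build_run_command is hypothetical and not part of this repository. It assumes that pathology and summarize can be overridden as top-level config keys; the files under configs are the authority on exact names and accepted values.

```python
import shlex
import sys

# The four pathologies listed above.
PATHOLOGIES = ("appendicitis", "cholecystitis", "diverticulitis", "pancreatitis")


def build_run_command(pathology, script="run.py", **overrides):
    """Return an argv list for a run with Hydra-style key=value overrides.

    Hypothetical helper: assumes the keys are overridable at the top level.
    """
    if pathology not in PATHOLOGIES:
        raise ValueError(f"unknown pathology: {pathology!r}")
    argv = [sys.executable, script, f"pathology={pathology}"]
    for key, value in overrides.items():
        # Hydra reads lowercase true/false as booleans.
        if isinstance(value, bool):
            value = str(value).lower()
        argv.append(f"{key}={value}")
    return argv


cmd = build_run_command("appendicitis", summarize=True)
# Show the arguments that follow the interpreter path.
print(shlex.join(cmd[1:]))
```

Passing the resulting list to subprocess.run(cmd, check=True) from the repository root would start the run.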

These additional arguments change the way information is presented but did not help performance in my experience and so were not included in the paper:

  • include_ref_range: Include the reference ranges for lab results, as provided in the MIMIC database
  • bin_lab_results: Replace exact lab result values with the word "low", "normal", or "high", using the reference ranges
  • provide_diagnostic_criteria: Adds an extra tool where the model can consult diagnostic criteria if desired
  • diag_crit_writer_openai_api_key: OpenAI key to ask for new diagnostic criteria if they are missing from the datafile
  • include_tool_use_examples: Provides examples of how to use the tools
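
To make the effect of include_ref_range and bin_lab_results concrete, here is a minimal sketch of how a single lab result could be rendered under each option. It is not the repository's implementation: the function names, the output format, and the choice to treat the reference range as inclusive at both ends are all assumptions made for illustration.

```python
def bin_lab_result(value, ref_low, ref_high):
    """Map a numeric lab value to "low", "normal", or "high".

    Assumption: the reference range is inclusive at both ends.
    """
    if value < ref_low:
        return "low"
    if value > ref_high:
        return "high"
    return "normal"


def format_lab(name, value, ref_low, ref_high,
               include_ref_range=False, bin_lab_results=False):
    """Render one lab result as text (illustrative format only)."""
    if bin_lab_results:
        shown = bin_lab_result(value, ref_low, ref_high)
    else:
        shown = str(value)
    line = f"{name}: {shown}"
    if include_ref_range:
        line += f" (RR: {ref_low} - {ref_high})"
    return line


# Made-up values, not taken from the dataset.
print(format_lab("WBC", 12.5, 4.0, 11.0))
print(format_lab("WBC", 12.5, 4.0, 11.0, include_ref_range=True))
print(format_lab("WBC", 12.5, 4.0, 11.0, bin_lab_results=True))
```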

MIMIC CDM Full Information

For the MIMIC-CDM-Full Information task, executed through python run_full_info.py, all relevant information required for a diagnosis is provided upfront to the model and only a diagnosis is asked for. This allows us to also control what information we provide the model and explore many aspects of model performance such as robustness. The relevant arguments for this task are those from above and additionally:

  • prompt_template: Determines the system instruction or prompt used to ask for an answer. Possible values are specified in run_full_info.py
  • order: The order in which information is provided
  • abbreviated: Provide the original, abbreviated text
  • fewshot: Provides hand-crafted fewshot cases and diagnosis examples
  • save_probabilities: Saves the probabilities of the generation for downstream analysis
  • only_abnormal_labs: Provide only those lab results that are abnormal
  • bin_lab_results_abnormal: If only abnormal labs are provided, also bin them
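
The interaction between only_abnormal_labs and bin_lab_results_abnormal can be sketched as follows. This is an assumption-laden illustration rather than the code in run_full_info.py: the Lab type, the select_labs function, and the rule that a value is abnormal when it falls strictly outside its reference range are all hypothetical.

```python
from typing import NamedTuple


class Lab(NamedTuple):
    """Hypothetical container for one lab result and its reference range."""
    name: str
    value: float
    ref_low: float
    ref_high: float


def select_labs(labs, only_abnormal_labs=False, bin_lab_results_abnormal=False):
    """Return text lines for the labs that would be shown to the model."""
    lines = []
    for lab in labs:
        if lab.value < lab.ref_low:
            status = "low"
        elif lab.value > lab.ref_high:
            status = "high"
        else:
            status = "normal"
        # Drop in-range results when only abnormal labs are requested.
        if only_abnormal_labs and status == "normal":
            continue
        # Binning here only applies together with only_abnormal_labs.
        binned = only_abnormal_labs and bin_lab_results_abnormal
        lines.append(f"{lab.name}: {status if binned else lab.value}")
    return lines


# Made-up values, not taken from the dataset.
labs = [
    Lab("WBC", 14.2, 4.0, 11.0),
    Lab("Sodium", 140.0, 135.0, 145.0),
    Lab("Hemoglobin", 10.1, 12.0, 16.0),
]
print(select_labs(labs, only_abnormal_labs=True))
print(select_labs(labs, only_abnormal_labs=True, bin_lab_results_abnormal=True))
```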

Other

Housekeeping arguments are:

  • seed: The seed used for greedy decoding
  • local_logging: If logs should be saved locally
  • run_descr: An extra name to give to the run
  • first_patient: Start executing at a specific patient
  • patient_list_path: Run on only a select group of patients (given as a list of hadm_ids)
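
One plausible reading of first_patient and patient_list_path is sketched below. The semantics are assumed, not confirmed by the code: the sketch keeps the dataset order, skips every admission before first_patient, and then keeps only the hadm_ids on the list. The function select_patients is hypothetical, and how the list file is stored on disk is left out because the format is not stated here.

```python
def select_patients(hadm_ids, first_patient=None, patient_list=None):
    """Return the hadm_ids to run, in their original order.

    Assumed semantics: start at first_patient (inclusive), then keep
    only the ids that appear in patient_list.
    """
    ids = list(hadm_ids)
    if first_patient is not None:
        # list.index raises ValueError if first_patient is not present.
        ids = ids[ids.index(first_patient):]
    if patient_list is not None:
        wanted = set(patient_list)
        ids = [hadm_id for hadm_id in ids if hadm_id in wanted]
    return ids


# Made-up identifiers for illustration.
all_ids = [101, 102, 103, 104]
print(select_patients(all_ids, first_patient=102, patient_list=[104, 101, 103]))
```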

Environment

To set up the environment, create a new virtual environment of your choosing with python=3.10, export your CUDA_HOME path to point at whatever version of CUDA you have (it does not have to be 11.7.1 as in the example), and then install the libraries from requirements.txt:

export CUDA_HOME=.../cuda/cuda_11.7.1
pip install --no-deps -r requirements.txt
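
Before installing, it may help to confirm that the interpreter and CUDA_HOME match the setup described above. The following check is a small optional sketch and not part of the repository; check_environment is a hypothetical helper that only inspects the Python version and whether CUDA_HOME points to an existing directory.

```python
import os
import sys


def check_environment(version_info=sys.version_info, environ=os.environ):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"expected Python 3.10, found {version_info[0]}.{version_info[1]}"
        )
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif not os.path.isdir(cuda_home):
        problems.append(f"CUDA_HOME does not point to a directory: {cuda_home}")
    return problems


for problem in check_environment():
    print("warning:", problem)
```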

Citation

If you found this code and dataset useful, please cite our paper and dataset with:

Hager, P., Jungmann, F., Holland, R. et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat Med (2024). https://doi.org/10.1038/s41591-024-03097-1

@article{hager_evaluation_2024,
	title = {Evaluation and mitigation of the limitations of large language models in clinical decision-making},
	issn = {1546-170X},
	url = {https://doi.org/10.1038/s41591-024-03097-1},
	doi = {10.1038/s41591-024-03097-1},
	journaltitle = {Nature Medicine},
	shortjournal = {Nature Medicine},
	author = {Hager, Paul and Jungmann, Friederike and Holland, Robbie and Bhagat, Kunal and Hubrecht, Inga and Knauer, Manuel and Vielhauer, Jakob and Makowski, Marcus and Braren, Rickmer and Kaissis, Georgios and Rueckert, Daniel},
	date = {2024-07-04},
}

Hager, P., Jungmann, F., & Rueckert, D. (2024). MIMIC-IV-Ext Clinical Decision Making: A MIMIC-IV Derived Dataset for Evaluation of Large Language Models on the Task of Clinical Decision Making for Abdominal Pathologies (version 1.0). PhysioNet. https://doi.org/10.13026/2pfq-5b68.

@misc{hager_mimic-iv-ext_nodate,
	title = {{MIMIC}-{IV}-Ext Clinical Decision Making: A {MIMIC}-{IV} Derived Dataset for Evaluation of Large Language Models on the Task of Clinical Decision Making for Abdominal Pathologies},
	url = {https://physionet.org/content/mimic-iv-ext-cdm/1.0/},
	shorttitle = {{MIMIC}-{IV}-Ext Clinical Decision Making},
	publisher = {{PhysioNet}},
	author = {Hager, Paul and Jungmann, Friederike and Rueckert, Daniel},
	urldate = {2024-07-04},
	doi = {10.13026/2PFQ-5B68},
	note = {Version Number: 1.0
Type: dataset},
}
