batch4_diafoirus_fleming's Introduction

Team 3 - Fleming

Named after Alexander Fleming (1881-1955), the Scottish physician, microbiologist, and pharmacologist who received the 1945 Nobel Prize in Physiology or Medicine. He is best known for discovering the enzyme lysozyme and the world's first antibiotic (penicillin G).

The main goal of the project is to dynamically predict the mortality risk of a given patient over a horizon of a few days.

Meeting Notes

04/04/2018

  • data exploration,
  • discussions on how to build the cohorts (age, sex, mean cholesterol, mean heart rate, mean blood pressure, height, weight, number of allergies, etc.),
  • key statistics to obtain: length of stay per unit (which affects the number of observations), mortality rate and frequency per diagnosis and per year, the same per unit, and statistics on the standard indicators per cohort.

11/04/2018

  • Presentation of the advantages of the OMOP format: OMOP format
  • Possibility of testing the OMOP format

TODO

  • Access the MIMIC PostgreSQL database directly from Python
  • Benchmark all the main indicators (SOFA, IGS-II, i.e. SAPS II) and write scripts to compute them.
  • Carry out a complete EDA (Jupyter notebook) to get a sense of the existing biases (cf. the statistics ideas from 04/04)
  • With these results, create our own indicators and discuss them with the referring physician (possibly computing them on data from Parisian hospitals)
  • Define the precise measure we want to predict (by iterating over the time period, among other things)
  • Benchmark the different models on a few metrics, including: accuracy, number of explanatory variables, and training complexity of the model (for the sake of reproducibility).

Relevant work

2011

  • A Comparison of Intensive Care Unit Mortality Prediction Models through the Use of Data Mining Techniques (dec 2011): paper

2015

  • Feature Representation for ICU Mortality (dec 2015): paper

2016

  • Predicting Clinical Events by Combining Static and Dynamic Information Using Recurrent Neural Networks (feb 2016): paper
  • Predicting ICU Mortality Risk by Grouping Temporal Trends from a Multivariate Panel of Physiologic Measurements (feb 2016): paper
  • Using recurrent neural network models for early detection of heart failure onset (aug 2016): paper
  • Recurrent Neural Networks for Multivariate Time Series with Missing Values (nov 2016): paper
  • Hospital Standardized Mortality Ratio (HSMR) (nov 2016): paper

2017

  • Dynamic Mortality Risk Predictions in Pediatric Critical Care Using Recurrent Neural Networks (jan 2017): paper
  • Interpretable Deep Models for ICU Outcome Prediction (feb 2017): paper
  • Generating Multi-label Discrete Patient Records using Generative Adversarial Networks (march 2017): paper
  • Multitask Learning and Benchmarking with Clinical Time Series Data (march 2017): paper, repo
  • The Dependence of Machine Learning on Electronic Medical Record Quality (march 2017): paper
  • PPMF: A Patient-based Predictive Modeling Framework for Early ICU Mortality Prediction (april 2017): paper
  • Deep Learning to Attend to Risk in ICU (may 2017): paper
  • Real time mortality prediction in the MIMIC-III database (july 2017): repo
  • A review of modeling methods for predicting in-hospital mortality of patients in intensive care unit (august 2017): paper
  • Mapping Patient Trajectories using Longitudinal Extraction and Deep Learning in the MIMIC-III Critical Care Database (august 2017): paper
  • How To Predict ICU Mortality with Digital Health Data, DL4J, Apache Spark and Cloudera (sep 2017): article
  • Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach (oct 2017): paper, review
  • Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets (oct 2017): paper
  • Real-time mortality prediction in the Intensive Care Unit (end of 2017): paper

2018

  • Scalable and accurate deep learning for electronic health records (jan 2018): paper
  • An Empirical Evaluation of Deep Learning for ICD-9 Code Assignment using MIMIC-III Clinical Notes (feb 2018): paper
  • Deep Representation for Patient Visits from Electronic Health Records (march 2018): paper
  • Mémoire de stage - Prédiction mortalité sous 24h (internship thesis on mortality prediction within 24 hours): paper

Lectures

  • MIT 6.S897: Machine Learning for Healthcare link

Predictor candidates

batch4_diafoirus_fleming's People

Contributors

dimitricabaud, frgfm, jeremydesir, juliengaillet, mateolostanlen, mifine, paulroujansky

batch4_diafoirus_fleming's Issues

Roadmap

Global roadmap 🏁

  • [ ]
  • [ ]
  • [ ]

Check units

Issue

Make sure that, for every column, all values use the same unit across all patients.

Potential solution

  • alongside the dataframe containing the measurements, create a similar table that holds the unit of each measurement instead of its value (add unit_source_value to the SQL query, then build a duplicate dataframe from it, etc.)
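
A minimal sketch of such a check, assuming the SQL extract is loaded as a long-format pandas dataframe. Only `unit_source_value` comes from the issue; the other column names, the sample values, and the helper `find_mixed_units` are hypothetical. Instead of building a full duplicate table, it groups the units by measurement and reports the measurements recorded in more than one unit.

```python
import pandas as pd

# Hypothetical long-format extract: one row per measurement.
# Only `unit_source_value` is named in the issue; the rest is assumed.
measurements = pd.DataFrame(
    {
        "person_id": [1, 1, 2, 2, 3],
        "measurement_name": ["weight", "heart_rate", "weight", "heart_rate", "weight"],
        "value": [70.0, 80.0, 154.0, 75.0, 82.0],
        "unit_source_value": ["kg", "bpm", "lbs", "bpm", "kg"],
    }
)


def find_mixed_units(df: pd.DataFrame) -> dict:
    """Map each measurement recorded in several units to the sorted list of those units."""
    units = df.groupby("measurement_name")["unit_source_value"].unique()
    return {name: sorted(values) for name, values in units.items() if len(values) > 1}


mixed_units = find_mixed_units(measurements)
print(mixed_units)
```

Any measurement reported here would need a unit conversion (or exclusion) before its values can be compared across patients.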

Find the causes/type of deaths

In the current OMOP representation, the cause/type of death of each deceased patient is not known (see the death table: https://github.com/MIT-LCP/mimic-omop/tree/master/etl/StandardizedClinicalDataTables/DEATH).

Below are the first four entries of the death table:

person_id death_date death_datetime death_type_concept_id cause_concept_id cause_source_value cause_source_concept_id
0 62063368 2188-11-22 2188-11-22 12:00:00 38003569 None None
1 62063384 2198-02-18 2198-02-18 03:55:00 38003569 None None
2 62063393 2182-07-31 2182-07-31 06:45:00 38003569 None None
3 62063403 2145-03-19 2145-03-19 07:00:00 38003569 None None

While this may simply be because people generally die of multiple causes, it is nonetheless possible to extract the principal causes of death from the condition_occurrence table.

For the moment, this table associates a series of "conditions" with each patient (each one linked to a visit occurrence from the visit_occurrence table). These conditions are obviously not all fatal, though. Extracting only the final fatal conditions (for instance "Cardiac arrest" or "Cardiogenic shock") for each patient could be a first step towards determining the principal cause of death of each patient and deriving better descriptive statistics from it.
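
The extraction described above could be sketched as follows. This is only an illustration under assumptions: `condition_name` stands for a condition label already resolved from its concept id, the shortlist `FATAL_CONDITIONS` contains just the two examples given in the text, and the sample rows and the helper `last_fatal_condition` are hypothetical.

```python
import pandas as pd

# Hypothetical shortlist; both entries are the examples quoted in the issue.
FATAL_CONDITIONS = {"Cardiac arrest", "Cardiogenic shock"}

# Hypothetical sample of condition_occurrence with a resolved label column.
condition_occurrence = pd.DataFrame(
    {
        "person_id": [1, 1, 1, 2, 2],
        "visit_occurrence_id": [10, 10, 11, 20, 20],
        "condition_start_datetime": pd.to_datetime(
            [
                "2188-11-20 08:00",
                "2188-11-21 10:00",
                "2188-11-22 11:30",
                "2198-02-17 09:00",
                "2198-02-17 15:00",
            ]
        ),
        "condition_name": [
            "Hypertension",
            "Cardiogenic shock",
            "Cardiac arrest",
            "Pneumonia",
            "Diabetes",
        ],
    }
)
deceased_ids = [1, 2]  # person_id values found in the death table


def last_fatal_condition(conditions, deceased, fatal_names):
    """Return, per deceased patient, the most recent condition found in fatal_names."""
    mask = conditions["person_id"].isin(deceased) & conditions["condition_name"].isin(fatal_names)
    fatal = conditions[mask].sort_values("condition_start_datetime")
    return fatal.groupby("person_id")["condition_name"].last()


causes = last_fatal_condition(condition_occurrence, deceased_ids, FATAL_CONDITIONS)
print(causes)
```

Patients without any shortlisted condition (patient 2 here) are absent from the result, which gives a quick way to see how many deaths the shortlist leaves unexplained.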

Define a proper strategy to deal with missing/incorrect values

  • In case of missing values, one solution would be to (1) fill each gap with the last available value recorded up to 24h before it (for instance), taking care not to treat a previously copied value as a valid starting point. For this reason, we should look backward in time to the last actual measurement (using the pandas.DataFrame.fillna() method, for instance). This part was implemented in commit 36dbc58.
    After doing so, we would then (2) delete any entry that still contains a NaN value.
    Many other methods for dealing with missing values are conceivable (interpolation where it is realistic, average-value filling, etc.).

  • In case of incorrect/aberrant values, we should define a proper strategy to (1) spot them and (2) deal with them. Feel free to propose any method!
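
A minimal sketch of steps (1) and (2) for missing values, not the implementation from commit 36dbc58. The column names (`person_id`, `time`, `heart_rate`), the sample rows, and the helper `fill_recent` are assumptions. A gap is filled only from the last actually observed value, and only if that observation is at most 24h old, so copied values are never used as a new starting point.

```python
import pandas as pd

MAX_GAP = pd.Timedelta(hours=24)  # the 24h window suggested in the issue


def fill_recent(df, value_cols, max_gap=MAX_GAP):
    """Forward-fill each patient's gaps from real observations at most max_gap old, then drop NaNs."""
    df = df.sort_values(["person_id", "time"]).reset_index(drop=True)
    for col in value_cols:
        # Timestamp of the last original (non-copied) observation, per patient.
        observed_at = df["time"].where(df[col].notna())
        last_seen = observed_at.groupby(df["person_id"]).ffill()
        # Candidate fill value: last observed value, per patient.
        carried = df.groupby("person_id")[col].ffill()
        # Keep the carried value only if its source observation is recent enough.
        recent = (df["time"] - last_seen) <= max_gap
        df[col] = carried.where(recent)
    # Step (2): delete the entries that still contain a NaN.
    return df.dropna(subset=value_cols)


# Hypothetical sample data.
raw = pd.DataFrame(
    {
        "person_id": [1, 1, 1, 2, 2],
        "time": pd.to_datetime(
            [
                "2018-04-01 00:00",
                "2018-04-01 12:00",
                "2018-04-03 00:00",
                "2018-04-01 00:00",
                "2018-04-01 06:00",
            ]
        ),
        "heart_rate": [80.0, None, None, None, 65.0],
    }
)

filled = fill_recent(raw, ["heart_rate"])
print(filled)
```

In this sample, the second row of patient 1 is filled from the value observed 12h earlier, while the third row (48h after the last real observation) and the first row of patient 2 (no earlier observation) are dropped.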

Check death status

Issue:

In order not to penalize the training of our predictive algorithms, we should handle the specific case where a patient's last measurement precedes their death (meaning that no measurement of interest, i.e. one containing the predefined variables of interest, was recorded after their clinical death). In that case, we cannot mark the patient as dead on any of their measurements (i.e. target is always equal to 0), even though the patient did die by the end of their hospital stay.

Potential solution:

We could simply add, alongside the target column, which indicates whether the patient is still alive at time t, a super_target column that indicates whether the patient survives to the end of their hospital stay. Note that, to train some models, we should still use target rather than super_target, but not systematically.
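
The two columns could be derived as in the sketch below. It assumes a measurement table with hypothetical `person_id` and `time` columns and a death table with `person_id` and `death_datetime`; the sample rows and the helper `add_targets` are illustrative only. Here `target` is 1 from the time of death onwards and `super_target` is 1 for every row of a patient who has a death record.

```python
import pandas as pd


def add_targets(measurements, death):
    """Add the per-measurement `target` and the per-patient `super_target` columns."""
    df = measurements.merge(
        death[["person_id", "death_datetime"]], on="person_id", how="left"
    )
    # 1 only if the measurement was taken at or after the recorded death.
    df["target"] = (df["time"] >= df["death_datetime"]).astype(int)
    # 1 for every measurement of a patient who has a death record.
    df["super_target"] = df["death_datetime"].notna().astype(int)
    return df.drop(columns="death_datetime")


# Hypothetical sample: patient 1 dies after their last measurement, patient 2 survives.
measurements = pd.DataFrame(
    {
        "person_id": [1, 1, 2],
        "time": pd.to_datetime(
            ["2188-11-21 09:00", "2188-11-22 08:00", "2198-02-17 10:00"]
        ),
    }
)
death = pd.DataFrame(
    {"person_id": [1], "death_datetime": pd.to_datetime(["2188-11-22 12:00"])}
)

labelled = add_targets(measurements, death)
print(labelled)
```

Patient 1 illustrates the problematic case: `target` stays at 0 on every row because no measurement follows the death, while `super_target` still records the outcome of the stay.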
