

ML4H2020

Repository for projects of the ETH Zürich course "Machine Learning for Health Care (Spring 2020)" (lecture page).

Authors:
Han Bai
Nora Moser
Martin Tschechne ([email protected])

Project 1 - ECG Time Series

Classifying ECG signals of the MIT-BIH Arrhythmia Dataset and the PTB Diagnostic ECG Database with recurrent neural networks, and using transfer learning techniques to improve predictive performance.
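The difference between retraining with frozen and with unfrozen base layers can be illustrated with a minimal NumPy sketch. This is a toy two-layer model on random data, not the repository's Keras code; all names (`fine_tune`, `pretrained_base`, `new_head`) and the data are assumptions made purely for illustration.

```python
import numpy as np


def forward(x, base_w, head_w):
    """Base layer (tanh features) followed by a sigmoid classification head."""
    hidden = np.tanh(x @ base_w)
    prob = 1.0 / (1.0 + np.exp(-(hidden @ head_w)))
    return hidden, prob


def bce_loss(prob, y):
    """Mean binary cross-entropy, clipped for numerical safety."""
    prob = np.clip(prob, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(prob) + (1.0 - y) * np.log(1.0 - prob)))


def fine_tune(x, y, base_w, head_w, freeze_base, lr=0.1, steps=200):
    """Gradient descent on the target data; base_w changes only if not frozen."""
    base_w, head_w = base_w.copy(), head_w.copy()
    for _ in range(steps):
        hidden, prob = forward(x, base_w, head_w)
        err = (prob - y) / len(y)              # gradient w.r.t. the logits
        grad_head = hidden.T @ err
        if not freeze_base:
            # Backpropagate through the head and the tanh into the base layer.
            grad_hidden = np.outer(err, head_w) * (1.0 - hidden ** 2)
            base_w -= lr * (x.T @ grad_hidden)
        head_w -= lr * grad_head
    return base_w, head_w


rng = np.random.default_rng(0)
x_target = rng.normal(size=(64, 5))            # hypothetical stand-in for the target dataset
y_target = (x_target[:, 0] > 0).astype(float)
pretrained_base = rng.normal(size=(5, 8))      # stand-in for weights learned on the source dataset
new_head = np.zeros(8)                         # freshly initialised classification head

frozen_base, frozen_head = fine_tune(x_target, y_target, pretrained_base, new_head, freeze_base=True)
tuned_base, tuned_head = fine_tune(x_target, y_target, pretrained_base, new_head, freeze_base=False)
```

With `freeze_base=True` only the head is updated and the pre-trained weights are returned unchanged. With `freeze_base=False` the gradient also flows into the base layer, so the pre-trained representation itself is adapted to the target data.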

Results

| Models | MIT-BIH* | PTBDB* | PTBDB° | PTBDB |
| --- | --- | --- | --- | --- |
| LSTM + FC | F1: 0.184, Acc: 0.823 | F1: 0.787, Acc: 0.776, AUROC: 0.808, AUPRC: 0.934 | F1: 0.419, Acc: 0.722, AUROC: 0.5, AUPRC: 0.861 | F1: 0.371, Acc: 0.565, AUROC: 0.397, AUPRC: 0.805 |
| CNN + LSTM + FC | F1: 0.868, Acc: 0.971 | F1: 0.940, Acc: 0.951, AUROC: 0.947, AUPRC: 0.982 | F1: 0.988, Acc: 0.990, AUROC: 0.988, AUPRC: 0.996 | F1: 0.992, Acc: 0.994, AUROC: 0.990, AUPRC: 0.996 |
| LSTM + XGB | F1: 0.875, Acc: 0.976 | F1: 0.971, Acc: 0.977, AUROC: 0.968, AUPRC: 0.988 | F1: 0.963, Acc: 0.970, AUROC: 0.955, AUPRC: 0.983 | - |
| CNN + LSTM + XGB | F1: 0.916, Acc: 0.985 | F1: 0.983, Acc: 0.986, AUROC: 0.980, AUPRC: 0.993 | F1: 0.981, Acc: 0.990, AUROC: 0.977, AUPRC: 0.991 | - |
| XGB | F1: 0.896, Acc: 0.979 | F1: 0.970, Acc: 0.976, AUROC: 0.966, AUPRC: 0.987 | - | - |
| Kachuee et al. [1] | Acc: 0.934 | - | F1: 0.951, Acc: 0.959 | - |
| Baseline [2] | F1: 0.915, Acc: 0.985 | F1: 0.988, Acc: 0.983 | F1: 0.969, Acc: 0.956 | F1: 0.994, Acc: 0.992 |

* Only trained on this dataset
° Transfer Learning, pre-trained model trained on MIT-BIH, retrained with frozen base layers
Unmarked PTBDB column: Transfer Learning, pre-trained model trained on MIT-BIH, retrained with unfrozen base layers
Base layers always frozen to train XGBoost
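
One plausible reason a model can pair a fairly high accuracy with a low F1 score, as in the LSTM + FC row on MIT-BIH, is class imbalance. The README does not state how F1 was averaged across classes, so the following is only a binary toy example with made-up labels showing how the two metrics can diverge.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced data: 90 negatives and 10 positives.
# The classifier raises 1 false alarm and finds only 2 of the 10 positives.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 89 + [1] * 1 + [1] * 2 + [0] * 8

metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy is 0.91 while F1 is only about 0.31
```

Because the majority class dominates the accuracy, a model that mostly predicts the majority class still scores well on it, whereas F1 reflects the missed minority-class examples.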

Visualization of learned embeddings

[Figures: t-SNE, UMAP and PCA projections of the learned embeddings, for MIT-BIH and PTBDB]

For more details about the project have a look at the README.md in the project directory /ECG-time-series.

Project 2 - Diabetes Readmission Prediction

Investigating which medical features from patient records (categorical, numerical and text) play an important role in the prediction of patient readmission. Comparing models using only numerical + categorical features, only text and both.

Results

[Figures: results for categorical/numerical features and for text features]

For more details about the project have a look at the README.md in the project directory /Diabetes-readmission.

Project 3 - Medical Image Segmentation

Using the U-Net neural network model [4] to segment MRI prostate images from the NCI-ISBI 2013 Challenge (Automated Segmentation of Prostate Structures) into anatomical regions (peripheral zone and central gland). Part of the project was to tune hyperparameters and try different optimizers and loss functions to reduce the generalization error.
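
The README does not list which loss functions were tried. As an illustration of an overlap-based segmentation loss, here is a minimal NumPy sketch of the Dice coefficient and the corresponding Dice loss for a single binary mask. The function names and the smoothing constant `eps` are assumptions, and this is not necessarily the loss used in the project.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice overlap between two masks with values in [0, 1]; 1.0 means perfect overlap."""
    pred = np.asarray(pred_mask, dtype=np.float64)
    true = np.asarray(true_mask, dtype=np.float64)
    intersection = np.sum(pred * true)
    # eps avoids division by zero when both masks are empty.
    return float((2.0 * intersection + eps) / (pred.sum() + true.sum() + eps))


def dice_loss(pred_mask, true_mask):
    """Loss to minimise: 0.0 for perfect overlap, close to 1.0 for disjoint masks."""
    return 1.0 - dice_coefficient(pred_mask, true_mask)


# Toy 2x2 masks: the prediction covers two pixels, the ground truth one of them.
pred = np.array([[1, 1], [0, 0]])
true = np.array([[1, 0], [0, 0]])
print(dice_coefficient(pred, true))  # about 0.667 (2 * 1 / (2 + 1))
```

For a multi-region task such as peripheral zone versus central gland, one option is to compute the coefficient per region mask and average the results.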

Results

Example Test Image and Prediction

[Figure: test set example]

[Figure: rotated test set example]

For more details about the project have a look at the README.md in the project directory /Image-segmentation.

Project 4 - Splice Site Prediction

Splice site prediction is a common problem in computational gene finding, where the goal is to find the splice sites that mark the boundaries of exons and introns in eukaryotes (organisms whose cells have a nucleus enclosed within membranes). This classification can then be used to predict a gene's structure, function, interactions or role in a disease. The main challenge of this task was the high class imbalance of the splice sites.
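
The README does not say how the sequences were encoded or how the imbalance was handled. As a minimal sketch under those assumptions, the following shows two common building blocks: one-hot encoding of a DNA sequence and inverse-frequency class weights. The function names are hypothetical.

```python
import numpy as np

NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Encode a DNA string as a (length, 4) matrix; unknown symbols such as N stay all-zero."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = np.zeros((len(sequence), len(NUCLEOTIDES)), dtype=np.float32)
    for pos, base in enumerate(sequence.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count), so rare classes weigh more."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


# Hypothetical label distribution: 95 non-splice sites and 5 splice sites.
weights = balanced_class_weights([0] * 95 + [1] * 5)
print(one_hot_encode("ACGT"))
print(weights)  # the rare positive class gets weight 10.0, the negative class about 0.53
```

Such weights can be passed to a classifier's loss so that errors on the rare splice-site class count more. With this level of imbalance, metrics such as AUROC or AUPRC are usually more informative than plain accuracy.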

Results

C. elegans DNA

[Figure: AUC, C. elegans DNA]

Human DNA

[Figure: AUC, human DNA]

For more details about the project have a look at the README.md in the project directory /Splice-site-prediction.

Requirements

pandas, numpy, scikit-learn, keras, matplotlib, xgboost, umap-learn, seaborn, keras-contrib, tensorflow-addons

References

[1] Mohammad Kachuee, Shayan Fazeli, and Majid Sarrafzadeh. "ECG Heartbeat Classification: A Deep Transferable Representation." arXiv preprint arXiv:1805.00794 (2018).

[2] CVxTz's GitHub implementation: ECG_Heartbeat_Classification (link)

[3] Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records. https://doi.org/10.1155/2014/781670

[4] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation." LNCS 9351, 234-241 (2015). https://doi.org/10.1007/978-3-319-24574-4_28

Project Organization

This repository uses the cookiecutter data science project template, slightly adapted to our needs and requirements. Each of the four projects is in a separate folder which is a copy of the src directory.

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org

Project based on the cookiecutter data science project template. #cookiecutterdatascience
