Deep Stethoscope: CNN Heartbeat Classifier

Walkthrough of Data Preprocessing & Model Training


Objective

As traditional analog physician tools are reimagined for the digital world, there is a growing opportunity to bring medical software to care providers and patients. With these new tools, patient data can be collected directly and automatically instead of being recorded manually and after the fact. As these modern reinventions are gradually adopted, the platform for medical software grows with them.

The stethoscope, for example, is slowly giving way to its electronic counterpart; e-stethoscopes offer features such as ambient noise reduction, audio recording storage, and even remote listening. While these improvements are important, they mainly make the tool more convenient to use. We can take this technology a step further by offering insights based on the collected data, at a personal level.

As a proof-of-concept, I've developed a heartbeat audio classifier that is able to differentiate between normal heartbeats and heart murmurs. Though my approach is not perfect, it shows what our modern analytical tools can offer the medical community beyond these incremental technological advances.

Method

Audio Preprocessing

The raw data is provided in WAV format, an audio encoding. For my model to read this information, I convert each heartbeat recording into an image: a PNG of a 2D spectrogram. Spectrograms are a convenient representation because they capture the intensity of each frequency over the course of a recording. Recent work has shown that the original audio can be closely reconstructed from its spectrogram, so a spectrogram can reasonably be assumed to be an effective representation of an audio recording.
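As a rough illustration of what a spectrogram holds, here is a minimal, dependency-free sketch that slices a signal into overlapping windowed frames and takes the magnitude of each frame's Fourier transform. It is not the code in hbutils.py: the function names, the frame length, the hop size, and the synthetic 500 Hz tone are all assumptions chosen for the example, and a real pipeline would use an FFT library and an imaging library to write the PNG.

```python
import cmath
import math


def hann_window(n):
    """Hann window of length n (n > 1); tapers frame edges to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a naive O(n^2) DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_len=64, hop=32):
    """Return one row per time frame; each row holds log-magnitudes (dB) per frequency bin."""
    window = hann_window(frame_len)
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        # The small constant avoids log10(0) for silent bins.
        rows.append([20.0 * math.log10(m + 1e-10) for m in dft_magnitudes(frame)])
    return rows


# Synthetic stand-in for a recording (hypothetical): a 500 Hz tone sampled at 4000 Hz.
sample_rate = 4000
tone = [math.sin(2.0 * math.pi * 500 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)

# Each bin spans sample_rate / frame_len = 62.5 Hz, so the tone should peak in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time slice and each column one frequency band, which is exactly the 2D grid that gets rendered as an image.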

The spectrogram images are then trimmed to reduce their size before being fed to the model. The data is also split into stratified training, validation, and test sets.
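A stratified split keeps the ratio of normal to murmur recordings roughly equal across the three sets, which matters when one class is rare. The sketch below shows the idea on index lists; the 70/15/15 fractions, the function name, and the 80/20 label counts are assumptions for illustration, since the actual proportions are not stated here.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test, preserving per-class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains of this class goes to the test set.
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Hypothetical, imbalanced label list: 80 normal recordings and 20 murmurs.
labels = ["normal"] * 80 + ["murmur"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
```

Because the split is done per class, every set in this example ends up with the same 20% share of murmurs as the full dataset.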

Model Development

The model is a conventional convolutional neural network (CNN): a series of 2D convolutions and max-pooling operations followed by a series of fully connected layers. Because of the class imbalance, dropout and kernel regularizers are applied selectively to prevent overfitting.
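To make the two building blocks concrete, here is a small plain-Python sketch of a single 2D convolution (computed as cross-correlation, as deep learning frameworks do) followed by ReLU and max-pooling on a toy 5x5 input. This is not the Keras architecture in hbmodels.py; the helper names, the edge-detecting kernel, and the input are hypothetical, and a real model learns its kernels and stacks many such layers.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max-pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 "spectrogram" with a vertical edge, and a kernel that responds to such edges.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 0, 1] for _ in range(3)]

features = relu(conv2d_valid(image, kernel))  # 3x3 feature map
pooled = max_pool2d(features)                 # 1x1 after 2x2 pooling
```

The shrinking shapes (5x5 to 3x3 to 1x1) show why a few convolution and pooling stages are enough to reduce a spectrogram to a compact feature vector for the fully connected layers.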

While binary cross-entropy and KL divergence are effective loss functions for optimizing the accuracy of my model, they do not directly optimize the area under the ROC curve (AUROC, also written AUC-ROC). Instead, I used a loss function that is a differentiable approximation of the AUROC score. This change causes a minor decrease in accuracy but a significant improvement in the AUROC score of the trained classifier.
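The exact approximation used is not spelled out here, so the following is only one common way to build such a loss, shown in plain Python rather than Keras. AUROC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score; that count is a step function of the score gap and has no useful gradient, so the sketch swaps the step for a sigmoid. The function names, the `sharpness` parameter, and the example scores are assumptions.

```python
import math


def auroc(pos_scores, neg_scores):
    """Exact AUROC: share of (positive, negative) pairs ranked correctly; ties count half."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos_scores) * len(neg_scores))


def soft_auroc_loss(pos_scores, neg_scores, sharpness=10.0):
    """1 - smoothed AUROC: the step on each score gap is replaced by a sigmoid."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += 1.0 / (1.0 + math.exp(-sharpness * (p - n)))
    return 1.0 - total / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs for murmur (positive) and normal (negative) recordings.
murmur_scores = [0.9, 0.8, 0.3]
normal_scores = [0.7, 0.2, 0.1]

exact = auroc(murmur_scores, normal_scores)           # 8 of 9 pairs are ordered correctly
loss = soft_auroc_loss(murmur_scores, normal_scores)  # smooth, so gradients can flow
```

A larger `sharpness` makes the surrogate track the true AUROC more closely but flattens its gradients, so the value is a tuning choice rather than a fixed constant.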

Code

The file hbutils.py contains utility methods for processing the audio and image data. The file hbmodels.py holds the Keras model architecture. The file Heartbeat Classification Example.ipynb is a walkthrough of the data preprocessing and model development/training process.

Results

The CNN achieves an accuracy of approximately 78% on unseen test data, with an approximate AUROC score of 0.77.

Discussion

While these results are not convincing enough to use the model in a diagnostic or commercial setting, they are encouraging and offer a useful baseline for future competing methods. The human performance baseline is also unknown, though I suspect that the average physician currently outperforms my model.

There is also the problem of identifying extrasystole heartbeats, for which there is little labeled data. My model classifies only normal heartbeats and murmurs; it cannot learn this third class because of the severe class imbalance.

Although the classifier can definitely be improved, I believe I have established a workable approach to classifying heartbeat recordings. With better computing resources and more time and data, a deeper or more specialized network could likely do better.

Contributors

sachanganesh
