azariagmt / pulmonary-disorder-detection-using-x-ray-images

Deep learning approaches to detecting pulmonary disorders (COVID-19, tuberculosis, bacterial pneumonia, and viral pneumonia) and healthy/normal cases, using 17,500 non-augmented X-ray images. Five-class classification is performed with several pre-trained models, including DenseNet201, Xception, and Inception, reaching nearly 99% accuracy.

Home Page: https://respiratory-disorder-detection.azurewebsites.net/

License: MIT License

Python 0.02% CSS 0.01% JavaScript 0.01% HTML 0.13% Dockerfile 0.01% Jupyter Notebook 99.85% Shell 0.01%
pandemic patients spread radiology-imaging positive-cases xray-images covid

pulmonary-disorder-detection-using-x-ray-images's Introduction

About

The use of AI and machine learning to detect pulmonary disorders such as tuberculosis and pneumonia has been well established for some time. When coronavirus disease 2019 (COVID-19) was declared a pandemic in 2020, it became clear that similar techniques could be used to diagnose COVID-19, especially in developing countries where specialized physicians are lacking and test kits are in scarce supply. These shortages lead to the misdiagnosis of pulmonary disorders, COVID-19 in particular. The objective of this project is to give an overview of the use of deep neural networks, both vanilla neural networks and pretrained models, as a quick way to classify X-ray images as COVID-19, bacterial pneumonia, viral pneumonia, tuberculosis, or normal/healthy.

It is critical to detect the positive cases as early as possible so as to prevent the further spread of this pandemic and to quickly treat affected patients. The need for auxiliary diagnostic tools has increased as there are no accurate automated toolkits available.

Recent findings obtained using radiology imaging techniques suggest that such images contain salient information about COVID-19. Applying advanced artificial intelligence (AI) techniques to radiological imaging can help detect this disease accurately and can also help overcome the lack of specialized physicians in remote villages.

Objective

Given X-ray images of patients, build a machine learning model that detects whether the patient has COVID-19 and also distinguishes three other pulmonary disorders: tuberculosis, bacterial pneumonia, and viral pneumonia.

Note: this model may not be clinically viable.

Usage

Option 1: Docker

This is a containerized Flask application whose Docker image is published on Docker Hub. If the Azure URL is not working, pull the Docker image:

docker pull 0941924816/covid-detection-or-analysis:latest

Run the Docker image:

docker run --rm -it -p 8000:8000/tcp 0941924816/covid-detection-or-analysis:latest

Option 2: without Docker

If you do not have Docker installed, make sure you have python3 and pip3 installed.

1. Clone the repo

git clone https://github.com/Azariagmt/pulmonary-disorder-detection-using-x-ray-images.git

2. cd into repo

cd pulmonary-disorder-detection-using-x-ray-images

3. Install required dependencies:

pip3 install -r requirements.txt

4. Run the Flask application

flask run

Datasets

The datasets used in this project come from three publicly available sources. The first is the COVID-19 Chest X-ray Database (Rahman et al.), which in its current version consists of 3,616 COVID-19 positive cases along with 10,192 Normal, 6,012 Lung Opacity (non-COVID lung infection), and 1,345 Viral Pneumonia images. The second is the Chest X-Ray Images (Pneumonia) dataset (Mooney), consisting of 5,863 X-ray images split into two categories (Pneumonia/Normal). The third is the Tuberculosis (TB) Chest X-ray Database (Rahman et al.), which contains CXR images of Normal cases (3,500) and patients with TB (3,500).

Repository overview

Structure
    ├── models (fetched from Google Drive due to large size)
    ├── modules
    │   ├── load_models.py (fetches models from Drive)
    │   └── predict.py
    ├── notebooks
    │   ├── Data Fetching and preprocessing
    │   │   ├── Get datasets.ipynb (gets the datasets from the sources specified in Datasets)
    │   │   └── Data preprocessing.ipynb (gets the data ready for training)
    │   └── Training (training notebooks of different neural network architectures)
    │       ├── DenseNet201.ipynb
    │       ├── InceptionResNetV2.ipynb
    │       ├── InceptionV3.ipynb
    │       ├── NasNetLarge.ipynb
    │       ├── ResNet101V2.ipynb
    │       ├── ResNet50V2.ipynb
    │       ├── VGG19.ipynb
    │       └── Xception.ipynb
    ├── numpy arrays (stored .npy and .npz files used for training and testing)
    ├── static
    ├── templates
    ├── app.py (Flask application initialized here)
    └── Dockerfile

Results

Transfer learning was used to build the models, some of which reach an accuracy of 98.4% on the dataset collected from the sources above.
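
A minimal sketch of what such a transfer-learning setup could look like, assuming TensorFlow/Keras, a frozen DenseNet201 base with ImageNet weights, and 224x224 RGB inputs. The function name `build_classifier`, the class order, the input size, and the classification head are illustrative assumptions, not the code from the training notebooks.

```python
# Hypothetical class order; the notebooks may encode labels differently.
CLASS_NAMES = ["COVID-19", "Tuberculosis", "Bacterial Pneumonia",
               "Viral Pneumonia", "Normal"]


def build_classifier(num_classes=len(CLASS_NAMES), input_shape=(224, 224, 3),
                     weights="imagenet"):
    """Return a frozen DenseNet201 base with a new softmax head."""
    # Imported here so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.DenseNet201(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the pretrained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With this sketch, images would need to pass through `tf.keras.applications.densenet.preprocess_input` before training or prediction, and the labels would be integer indices into `CLASS_NAMES`.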

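Classification report

| Model             | Accuracy | Precision | Recall | F1-Score | Support |
|-------------------|----------|-----------|--------|----------|---------|
| InceptionV3       | 97.37%   | 97%       | 97%    | 97%      | 4404    |
| InceptionResNetV2 | 95.5%    | 97%       | 97%    | 97%      | 300     |
| DenseNet121       | 98.27%   | 98%       | 98%    | 98%      | 4404    |
| DenseNet169       | 98.07%   | 98%       | 98%    | 98%      | 4404    |
| DenseNet201       | 98.27%   | 98%       | 98%    | 98%      | 4404    |
| ResNet50V2        | 95%      | 95%       | 95%    | 95%      | 4404    |
| ResNet101V2       | 97.12%   | 97%       | 97%    | 97%      | 4404    |
| VGG16             | 97.39%   | 96%       | 96%    | 96%      | 4404    |
| VGG19             | 97.48%   | 97%       | 97%    | 97%      | 4404    |
| Xception          | -        | 95%       | 95%    | 95%      | 4404    |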

Confusion matrix for chosen models

These are the confusion matrices of some of the pretrained models used to build the classifiers.

Xception

Confusion matrix for Xception

DenseNet201

Confusion matrix for DenseNet201

ResNet50V2

Confusion matrix for ResNet50V2
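
For readers who want to relate a confusion matrix to the columns of the classification report, here is a short sketch in plain Python. It assumes rows are true classes and columns are predicted classes; the function name `per_class_metrics` and the example matrix are made up for illustration and are not project results.

```python
def per_class_metrics(cm):
    """Compute accuracy and per-class precision, recall, F1 and support.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    report = {}
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum (support)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[i] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": actual}
    accuracy = correct / total if total else 0.0
    return accuracy, report


# Made-up 3-class matrix, for illustration only.
example = [[8, 1, 1],
           [2, 7, 1],
           [0, 0, 10]]
accuracy, report = per_class_metrics(example)
```

Averaging the per-class values, weighted by support or not, gives single-number summaries like those in the table; which averaging the project's report uses is not stated here.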

Conclusion

This project has several limitations that can be overcome in future research. In particular, a more in-depth analysis requires much more patient data, especially from patients currently suffering from COVID-19, and more research on the specific pulmonary disorder being predicted in a given area. A more interesting direction for future research would be distinguishing patients with mild symptoms from those who will need intubation, although these symptoms may not be accurately visualized on X-rays, or may not be visualized at all.

Furthermore, we plan to apply our approach to bigger datasets, to other medical problems such as cancer and tumors, and to other computer vision domains such as energy, agriculture, and transport. Future research directions include exploring image data augmentation techniques to improve accuracy further while avoiding overfitting. We observed that performance could be improved by increasing the dataset size, using data augmentation, and using hand-crafted features. More models will be trained, and the different architectures of the pretrained models will be analyzed further.

pulmonary-disorder-detection-using-x-ray-images's People

Contributors

abrehamgezahegn, azariagmt


pulmonary-disorder-detection-using-x-ray-images's Issues

implement unit test

Unit tests for the predict and load_models functions should be implemented.

Flask app multiclass classification model needs update

The multiclass classification model used in the application is the previous model, which performs a 3-class classification (COVID, pneumonia, TB). It needs to be updated to the new 5-class model (COVID, TB, bacterial pneumonia, viral pneumonia, normal).

Dockerhub build

Docker Hub has disabled autobuild. Use GitHub Actions to build the container before pushing to Azure.

Normal output neuron missing

Only three output neurons exist in the multiclass classification notebook; one additional neuron for the normal image case should be added.

untrack data from Git

All image arrays are currently tracked by Git as NumPy arrays. This is not the right approach; this should be a DVC-initialized repository with all data available from a separate cloud remote.

create flask api

  • receive json of {diagnosis_id, patient_id, image_url}
  • process result
  • upload results back to firebase
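
A possible first step for this endpoint is validating the incoming JSON before any prediction runs. The sketch below is plain Python and assumes all three fields are required and that `image_url` must be an http(s) URL; `DiagnosisRequest` and `parse_diagnosis_request` are hypothetical names, and the Firebase upload is not covered.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

REQUIRED_FIELDS = ("diagnosis_id", "patient_id", "image_url")


@dataclass(frozen=True)
class DiagnosisRequest:
    diagnosis_id: str
    patient_id: str
    image_url: str


def parse_diagnosis_request(payload):
    """Validate a decoded JSON body and return a DiagnosisRequest.

    Raises ValueError with a readable message if the payload is malformed.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    # Treat absent, null, and empty-string values as missing.
    missing = [f for f in REQUIRED_FIELDS if payload.get(f) in (None, "")]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    url = str(payload["image_url"])
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("image_url must be an http(s) URL")
    return DiagnosisRequest(str(payload["diagnosis_id"]),
                            str(payload["patient_id"]), url)
```

Inside a Flask view this could be called as `parse_diagnosis_request(request.get_json())`, returning a 400 response when `ValueError` is raised.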

Deep Learning models act like a black box

The DL models give the user no explanation of why an image was classified into a specific category. Look into the best ways to address this so that the prediction output explains how the prediction was made.

untrack models from Git

Selected models are currently tracked by Git. This should be a DVC-initialized repository, with the different versions of the models stored in a remote and fetched from there.

move model loading from load_models script to the Docker image

Currently, the models are loaded in the load_models.py file. This is inconvenient when someone predicts for the first time, because fetching the models requires a lot of bandwidth; the models should instead be shipped within the Docker image. Loading of the models for someone not using Docker should also be given some consideration.
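
For users running without Docker, one option is to download each model only when it is not already on disk. This is a sketch under that assumption; `ensure_model` is a hypothetical helper, and `download` stands in for whatever fetch logic load_models.py uses.

```python
from pathlib import Path


def ensure_model(path, download):
    """Return `path`, calling `download(path)` only if the file is missing.

    `download` is a hypothetical callable that writes the model file to
    `path`, for example a wrapper around the project's Google Drive fetch.
    """
    path = Path(path)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        download(path)
    return path
```

The first prediction still pays the download cost, but later runs reuse the cached file.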

Error handling for the model predictions

Any image will receive some prediction, since one output neuron will activate no matter what the input is. Some validation is required to ensure that the inputs are proper X-ray images, and only then should a prediction be returned.
To do: add an out-of-distribution (OOD) detection system.
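
One simple baseline for this, sketched below, is to reject a prediction when the maximum softmax probability falls below a threshold. This is an assumption about how the issue could be approached, not the project's implementation; the function names and the 0.9 threshold are arbitrary, and networks can still be overconfident on non-X-ray inputs, so a dedicated OOD detector would likely be needed as well.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def predict_or_reject(logits, class_names, threshold=0.9):
    """Return (label, confidence); label is None if confidence < threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    label = class_names[best] if confidence >= threshold else None
    return label, confidence
```

A caller would treat a `None` label as "input not recognized" and ask the user for a valid chest X-ray instead of showing a diagnosis.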

multiple disorders from one patient

One image might show more than one respiratory disorder, so reporting only the output neuron with the maximum activation is not sufficient. The multiclass classification model should output the activation percentage of each neuron.
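
A small sketch of how per-class percentages could be reported, assuming the model already returns one probability per class. The helper name `class_percentages`, the class order, and the example numbers are illustrative only.

```python
def class_percentages(probabilities, class_names):
    """Map each class name to its probability as a percentage, highest first."""
    if len(probabilities) != len(class_names):
        raise ValueError("one probability is required per class")
    pairs = sorted(zip(class_names, probabilities),
                   key=lambda pair: pair[1], reverse=True)
    return {name: round(100.0 * p, 2) for name, p in pairs}


# Made-up model output, for illustration only.
names = ["COVID-19", "Tuberculosis", "Bacterial Pneumonia",
         "Viral Pneumonia", "Normal"]
result = class_percentages([0.55, 0.05, 0.30, 0.08, 0.02], names)
```

Note that a softmax output forces the classes to compete for a total of 100%; detecting genuinely co-occurring disorders would probably require independent sigmoid outputs per class and multi-label training data.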
