Thyroid Disease Detection

Problem Statement:

Thyroid disease is a common subject of medical diagnosis and prediction, and its onset is difficult to forecast. The thyroid gland is one of the body's most vital organs: the hormones it releases regulate metabolism. Hyperthyroidism and hypothyroidism are two common diseases of the thyroid, and both disturb the hormone release that controls the body's metabolic rate. The main goal is to predict whether a patient is at risk of developing thyroid disease.

Demo

Demo video: prediction.mp4

Technical Aspects

  • Python 3.7 or later
  • Important Libraries: sklearn, pandas, numpy, matplotlib & seaborn
  • Front-end: HTML, CSS
  • Back-end: Flask framework
  • IDE: Jupyter Notebook, PyCharm & VS Code
  • Database: SQLite
  • Deployment: Local

How to run this app

The code is written for Python 3.7 or later. If Python is not installed on your system, download it from https://www.python.org/downloads/.

  • Create a virtual environment
    conda create -p venv python==3.7 -y

  • Activate the environment (the environment was created with -p, so it is activated by path)
    conda activate ./venv

  • Install the packages
    pip install -r requirements.txt

  • Run the app
    python main.py

Workflow

Data Collection

The data is the Thyroid Disease Data Set from the UCI Machine Learning Repository.

Link: https://archive.ics.uci.edu/ml/datasets/thyroid+disease

Data Pre-processing

  • Drop columns that are not useful for training the model. These columns were identified during EDA.
  • Replace invalid values with numpy "nan" so that an imputer can be applied to them.
  • Encode the categorical values.
  • Check the columns for null values. If any are present, impute them using the KNN imputer.
  • After imputing, handle the imbalanced dataset using RandomOverSampler.
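
The pre-processing steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names, the "?" marker for invalid values and the `preprocess` helper are assumptions made for the example. The oversampling step is written out with numpy as a stand-in for imbalanced-learn's RandomOverSampler, so the sketch depends only on pandas, numpy and scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def preprocess(df, target, drop_cols=(), invalid="?", seed=0):
    """Hypothetical helper: drop, clean, encode, impute, then oversample."""
    # 1. Drop columns judged unhelpful during EDA.
    df = df.drop(columns=list(drop_cols))
    # 2. Turn invalid placeholders into NaN so the imputer can handle them.
    df = df.replace(invalid, np.nan)
    y = df[target].reset_index(drop=True)
    X = df.drop(columns=[target]).reset_index(drop=True)

    # 3. Encode: numeric-looking columns become floats, others integer codes.
    for col in X.columns:
        numeric = pd.to_numeric(X[col], errors="coerce")
        if numeric.notna().sum() == X[col].notna().sum():
            X[col] = numeric.astype(float)
        else:
            codes, _ = pd.factorize(X[col])  # missing values get code -1
            X[col] = pd.Series(codes, index=X.index).where(codes >= 0).astype(float)

    # 4. Impute remaining nulls from the nearest rows.
    if X.isna().any().any():
        imputer = KNNImputer(n_neighbors=3)
        X = pd.DataFrame(imputer.fit_transform(X), columns=X.columns)

    # 5. Random oversampling (stand-in for RandomOverSampler): keep every row
    #    and re-draw minority rows with replacement up to the majority size.
    rng = np.random.default_rng(seed)
    counts = y.value_counts()
    keep = []
    for label, n in counts.items():
        idx = np.flatnonzero((y == label).to_numpy())
        extra = rng.choice(idx, size=counts.max() - n, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X.iloc[keep].reset_index(drop=True), y.iloc[keep].reset_index(drop=True)


# Tiny illustrative frame; the values are made up for the example.
demo = pd.DataFrame({
    "age": ["41", "23", "?", "46", "70", "36"],
    "sex": ["F", "M", "F", "?", "F", "M"],
    "TBG": ["?", "?", "?", "?", "?", "?"],
    "Class": ["negative", "negative", "negative", "negative",
              "hypothyroid", "hypothyroid"],
})
X, y = preprocess(demo, target="Class", drop_cols=["TBG"])
print(X.shape, y.value_counts().to_dict())
```

Note that KNN imputation averages neighbouring rows, so an imputed categorical code may come out fractional. Whether to round such values is a design choice this sketch leaves open.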

Model Creation and Evaluation

  • Various classification algorithms such as Random Forest, XGBoost and KNN were tested.
  • Random Forest, XGBoost and KNN all performed well. XGBoost and Random Forest were chosen for the final model training and testing.
  • Hyperparameter tuning was performed using RandomizedSearchCV.
  • Model performance was evaluated using accuracy, the confusion matrix and the classification report.
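
As an illustration of the tuning and evaluation steps, the sketch below runs RandomizedSearchCV over a Random Forest and reports the three metrics listed above. The synthetic data and the parameter grid are assumptions chosen for the example, not the project's actual data or search space. XGBoost could be tuned the same way through its scikit-learn wrapper, but it is left out here to keep the dependencies to scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the preprocessed thyroid features and labels.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Illustrative search space; the project's real grid is not documented.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5, 10],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of random parameter combinations to try
    cv=3,                # 3-fold cross-validation per combination
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out split.
y_pred = search.best_estimator_.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

print("best params:", search.best_params_)
print("accuracy:", accuracy)
print(matrix)
print(report)
```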

Database Connection

A SQLite database is used for this project.
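
The README does not say what the database stores. As a hedged sketch, assuming it holds input records before they are used for training, Python's built-in sqlite3 module could be used as below. The table name `thyroid_records`, its columns and the two helper functions are hypothetical.

```python
import sqlite3


def save_rows(conn, rows):
    """Create the (hypothetical) table if needed and insert records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS thyroid_records "
        "(age REAL, sex TEXT, tsh REAL)"
    )
    conn.executemany(
        "INSERT INTO thyroid_records (age, sex, tsh) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def load_rows(conn):
    """Read the records back in insertion order."""
    cursor = conn.execute(
        "SELECT age, sex, tsh FROM thyroid_records ORDER BY rowid"
    )
    return cursor.fetchall()


# An in-memory database keeps the example self-contained;
# a file path would be passed to connect() for a persistent database.
conn = sqlite3.connect(":memory:")
save_rows(conn, [(41.0, "F", 1.3), (23.0, "M", 4.1)])
rows = load_rows(conn)
print(rows)
```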

Model Deployment

The final model is deployed locally.

Batch File Prediction User Interface

The user uploads a CSV file and clicks the Custom File Predict button to start the prediction.

Screenshot: prediction page

Screenshot: prediction file result

The prediction CSV file contains the index number of each record along with the type of thyroid disease the patient is suffering from.

Screenshot: prediction output file
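
A batch prediction of this shape could look roughly like the sketch below: read the uploaded CSV, predict one label per row, and write the index with the predicted class to an output CSV. The `predict_file` helper, the output column names and the `ToyModel` stand-in (with its made-up threshold and labels) are assumptions for illustration. The real app would load its trained classifier instead.

```python
import os
import tempfile

import pandas as pd


class ToyModel:
    """Hypothetical stand-in for the trained classifier (not a clinical rule)."""

    def predict(self, features):
        return ["hypothyroid" if v > 4.5 else "negative" for v in features["TSH"]]


def predict_file(model, input_csv, output_csv):
    """Predict one label per input row and write index + label to a CSV."""
    features = pd.read_csv(input_csv)
    predictions = pd.DataFrame(
        {"prediction": model.predict(features)}, index=features.index
    )
    predictions.to_csv(output_csv, index_label="index")
    return predictions


# Round trip through temporary files to show the input and output formats.
workdir = tempfile.mkdtemp()
input_csv = os.path.join(workdir, "input.csv")
output_csv = os.path.join(workdir, "predictions.csv")
pd.DataFrame({"TSH": [1.2, 9.8]}).to_csv(input_csv, index=False)

predict_file(ToyModel(), input_csv, output_csv)
written = pd.read_csv(output_csv)
print(written)
```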

Experience Letter

Experience_letter_Thyroid Disease Detection

Project Documents

Initialize the git repo

git init
git add .
git commit -m "first commit"
git branch -M main
git remote add origin <github_url>
git push -u origin main

To push later modifications

git add .
git commit -m "proper message"
git push

Author

Vikram Jha: https://www.linkedin.com/in/vikram888/
