Cardio Catch Diseases

Predicting cardiovascular diseases

1. Business Problem.

Cardio Catch Diseases is a company that specializes in detecting heart disease in its early stages. Its business model is to offer an early diagnosis of cardiovascular disease for a fee.

Currently, the diagnosis of cardiovascular disease is made manually by a team of specialists. The accuracy of the diagnosis varies between 55% and 65%, due to the complexity of the diagnosis and the fatigue of the team, whose members take turns to minimize the risks. Each diagnosis costs around $1,000.00, including the devices and the payroll of the analysts.

The price of the diagnosis, paid by the client, varies according to the precision achieved by the team of specialists.

Exam Accuracy   Price         Rules                                    Example
Above 50%       min $500.00   +$500 for each additional 5% precision   Precision = 55% -> $1,000.00
Up to 50%       $0.00         N/A                                      N/A

Thus, depending on the exam precision achieved by the team of specialists, the company either runs a profitable operation (revenue greater than cost) or operates at a loss (revenue less than cost). This instability in the diagnosis gives the company an unpredictable cash flow.
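The pricing rule and its effect on profit can be sketched in a few lines of Python. This is a minimal illustration, not code from the project: the function names are hypothetical, and it assumes that only complete 5-point steps above 50% add $500 each, which matches the one example in the table (55% -> $1,000.00) but is otherwise an interpretation of the rule.

```python
import math

# From the text: devices plus analyst payroll, per diagnosis.
COST_PER_DIAGNOSIS = 1000.00


def diagnosis_price(accuracy_pct: float) -> float:
    """Price paid by the client for a diagnosis with the given accuracy (in %)."""
    if accuracy_pct <= 50:
        return 0.0
    # Assumption: only complete 5-point steps above 50% add $500 each.
    full_steps = math.floor((accuracy_pct - 50) / 5)
    return 500.0 + 500.0 * full_steps


def diagnosis_profit(accuracy_pct: float) -> float:
    """Revenue minus cost for a single diagnosis."""
    return diagnosis_price(accuracy_pct) - COST_PER_DIAGNOSIS


if __name__ == "__main__":
    for acc in (50, 52, 55, 60, 65):
        price = diagnosis_price(acc)
        profit = diagnosis_profit(acc)
        print(f"{acc}% -> price ${price:,.2f}, profit ${profit:,.2f}")
```

Under these assumptions, 55% is the break-even point: anything below it loses money on each diagnosis, and each further 5 points of precision adds $500 of profit.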

2. Business Assumptions.

The assumptions about the business problem are as follows:

  • CVDs are the number 1 cause of death globally: more people die annually from CVDs than from any other cause.
  • An estimated 17.9 million people died from CVDs in 2016, representing 31% of all global deaths. Of these deaths, 85% are due to heart attack and stroke.
  • Over three quarters of CVD deaths take place in low- and middle-income countries.
  • Out of the 17 million premature deaths (under the age of 70) due to noncommunicable diseases in 2015, 82% are in low- and middle-income countries, and 37% are caused by CVDs.
  • Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.
  • People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management using counselling and medicines, as appropriate.

PS 1: All the references are stated at the end of this README.

PS 2: You can find useful information in section 1 of my notebook.

3. Solution Strategy

My strategy to solve this challenge was:

Step 01. Data Description: Use statistical metrics to identify data outside the scope of the business.

Step 02. Feature Engineering: Derive new attributes based on the original variables to better describe the phenomenon that will be modeled.

Step 03. Data Filtering: Filter out rows and drop columns that do not contain information for modeling or that do not match the scope of the business.

Step 04. Exploratory Data Analysis: Explore the data to find insights and better understand the impact of variables on model learning.

Step 05. Data Preparation: Prepare the data so that the Machine Learning models can learn the specific behavior.

Step 06. Feature Selection: Selection of the most significant attributes for training the model.

Step 07. Machine Learning Modelling: Train the Machine Learning models.

Step 08. Hyperparameter Fine-Tuning: Choose the best values for each of the parameters of the model selected in the previous step.

Step 09. Convert Model Performance to Business Values: Convert the performance of the Machine Learning model into a business result.

Step 10. Deploy Model to Production: Publish the model in a cloud environment so that other people or services can use the results to improve business decisions.

4. Top 3 Data Insights

Hypothesis 01: The number of heart disease cases does not depend significantly on height.

False. As observed, there are significantly more cases of heart disease up to ~165 cm. Above this height, there are fewer cases.

Hypothesis 02: There are more cases of heart disease among people who smoke than among people who do not.

False. As observed, the great majority of cases are among people who do not smoke.

Hypothesis 03: There are more cases of heart disease among people who drink alcohol than among people who do not.

False. As observed, the great majority of cases are among people who do not drink alcohol.
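Hypotheses 02 and 03 compare the number of cases between two groups. A check of this kind can be sketched as below. The records are hypothetical, invented only to make the example runnable; they are not taken from the project's dataset, and the `(smoke, cardio)` layout and the function name are assumptions. The sketch also reports the rate within each group, since the groups may differ in size.

```python
from collections import defaultdict

# Hypothetical records, NOT real data: (smoke, cardio) flags, 1 = yes, 0 = no.
records = [
    (0, 1), (0, 1), (0, 1), (0, 0), (0, 0), (0, 0),  # non-smokers
    (1, 1), (1, 0),                                  # smokers
]


def cases_by_group(rows):
    """Return {group: (cases, total, rate)} for rows of (group, has_disease)."""
    totals = defaultdict(int)
    cases = defaultdict(int)
    for group, has_disease in rows:
        totals[group] += 1
        cases[group] += has_disease
    return {g: (cases[g], totals[g], cases[g] / totals[g]) for g in totals}


if __name__ == "__main__":
    for group, (n_cases, n_total, rate) in sorted(cases_by_group(records).items()):
        label = "smokers" if group == 1 else "non-smokers"
        print(f"{label}: {n_cases} cases out of {n_total} ({rate:.0%})")
```

In this made-up sample, non-smokers account for most of the cases, yet the rate is the same in both groups; the two views answer different questions.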

5. Machine Learning Model Applied

Different algorithms were tested.

Contributors

alessandra-barbosa
