Git Product home page Git Product logo

researchseminar's Introduction

For the research seminar Career-Relevant Writing Exercise I add the following project to my github account.

Breast Cancer Dataset

I have worked on an independent project to explore various Machine Learning algorithms on breast cancer datasets from the UCI Machine Learning Repository to classify whether a set of readings from clinical reports are positive for breast cancer or not.

Background:

Breast cancer is the most common cancer among women in the world. It account for 25% of all cancer cases, and affected over 2.1 Million people in 2015. It starts when cells in the breast begin to grow out of control. These cells usually form a tumor that can often be seen on an X-ray or felt as a lump.

Early diagnosis significantly increases the chances of surviver. The key challenges against it's detection is how to classify tumors into malignant(Cancer) or benign(not cancer). A tumor is considered malignant (Cancer) if the cells can grow into surrounding tissues or spread to distant areas of the body. A benign tumor does not invade nearby tissue or spread to other parts of the body the way cancer can. But benign tumors can be serious if they press on vital structures such as blood vessel or nerves.

Machine Learning technique can dramatically improve the level of diagnosis in breast cancer. Research shows that experience physicians can detect cancer by 79% accuracy, while 91%(up to 97%) accuracy can be achieved using Machine Learning techniques.

Aim

In this study, my task is to classify tumors into malignant (cancer) or benign using features obtained from several cell images. I use radius mean as the dependant variable.

Methodology

I train test split at 0.25 and 0.75, and use cross validation and k-fold cross validation.

I use the following Machine Learning models: Logistic Regression, k Nearest Neighbours, Support Vector Machines, Decision Trees, Neural Networks

Attribute Information:

ID number Diagnosis (M = malignant, B = benign) Ten real-valued features are computed for each cell nucleus:

Radius (mean of distances from center to points on the perimeter) Texture (standard deviation of gray-scale values) Perimeter Area Smoothness (local variation in radius lengths) Compactness (perimeter^2 / area - 1.0) Concavity (severity of concave portions of the contour) Concave points (number of concave portions of the contour) Symmetry Fractal dimension ("coastline approximation" - 1)

researchseminar's People

Contributors

sehrishmastoor2166 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.