sohrabrezaei / credit-risk-analysis





Credit-Risk-Analysis

Overview of the analysis

The task is to resample the credit card data because its classes are not balanced. First, I split the data, oversample it with the RandomOverSampler and SMOTE methods, and undersample it with the ClusterCentroids algorithm. Then, I use the SMOTEENN method, which combines oversampling and undersampling. Finally, I use two ensemble models, EasyEnsembleClassifier and BalancedRandomForestClassifier, to predict credit card fraud risk.

Results

  • The balanced accuracy score for the RandomOverSampler oversampling method is 0.643. The precision and recall are 1 and 0.59, respectively, for non-fraudulent credit cards.

  • The balanced accuracy score for the SMOTE oversampling method is 0.662. The precision and recall are 1 and 0.69, respectively, for non-fraudulent credit cards.

  • The balanced accuracy score for the ClusterCentroids undersampling method is 0.544. The precision and recall are 1 and 0.4, respectively, for non-fraudulent credit cards.

  • The balanced accuracy score for the SMOTEENN combined oversampling and undersampling method is 0.674. The precision and recall are 1 and 0.59, respectively, for non-fraudulent credit cards.

  • The balanced accuracy score for the BalancedRandomForestClassifier ensemble method is 0.788. The precision and recall are 1 and 0.87, respectively, for non-fraudulent credit cards.

  • The balanced accuracy score for the EasyEnsembleClassifier ensemble method is 0.915. The precision and recall are 1 and 0.9, respectively, for non-fraudulent credit cards.
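For reference, the balanced accuracy score reported above is the average of the recall obtained on each class, which keeps the majority class from dominating the metric. A minimal sketch in plain Python, using made-up toy labels (0 for non-fraudulent, 1 for fraudulent) rather than the notebook's data:

```python
def recall_per_class(y_true, y_pred):
    """Return {label: recall}, where recall is the share of a class predicted correctly."""
    recalls = {}
    for label in sorted(set(y_true)):
        # Predictions made for the samples that truly belong to this class.
        predictions = [p for t, p in zip(y_true, y_pred) if t == label]
        recalls[label] = sum(p == label for p in predictions) / len(predictions)
    return recalls


def balanced_accuracy(y_true, y_pred):
    """Average the per-class recalls so that each class counts equally."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy labels: 8 non-fraudulent (0) and 2 fraudulent (1) cards.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(recall_per_class(y_true, y_pred))   # recall 0.75 for class 0, 0.5 for class 1
print(balanced_accuracy(y_true, y_pred))  # 0.625
print(plain_accuracy)                     # 0.7
```

In this toy case the plain accuracy (0.7) is higher than the balanced accuracy (0.625) because the larger class is predicted better than the smaller one.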

Summary

All the models had a precision of 1 for non-fraudulent cards and a precision of 0.01 for fraudulent cards, so none of them is well suited to predicting fraudulent credit cards. Among them, I recommend the EasyEnsembleClassifier model, which has the highest balanced accuracy score (0.915) and a recall of 0.9 for non-fraudulent credit cards.
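The gap between the two precision values is typical of heavily imbalanced data: even a small false-positive rate on the large class produces far more false alarms than there are real fraud cases. The sketch below illustrates this with hypothetical confusion-matrix counts chosen for the example (10,000 non-fraudulent and 50 fraudulent cards, with a recall of 0.9 on both classes). These are not the counts from the notebook.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical counts, treating "fraudulent" as the positive class.
true_fraud_caught = 45        # fraudulent cards flagged as fraudulent
fraud_missed = 5              # fraudulent cards predicted non-fraudulent
false_alarms = 1000           # non-fraudulent cards flagged as fraudulent
non_fraud_correct = 9000      # non-fraudulent cards predicted non-fraudulent

# Metrics for the fraudulent class.
fraud_precision, fraud_recall = precision_recall(
    tp=true_fraud_caught, fp=false_alarms, fn=fraud_missed
)

# Metrics for the non-fraudulent class: the roles of the cells swap.
non_fraud_precision, non_fraud_recall = precision_recall(
    tp=non_fraud_correct, fp=fraud_missed, fn=false_alarms
)

print(f"fraudulent:     precision={fraud_precision:.3f}, recall={fraud_recall:.2f}")
print(f"non-fraudulent: precision={non_fraud_precision:.3f}, recall={non_fraud_recall:.2f}")
```

With these counts, both classes have a recall of 0.9, yet the precision is about 0.043 for fraudulent cards and about 0.999 for non-fraudulent cards, the same pattern as in the results above.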
