abduliante / vehicle-default-loan-prediction

Forecasting the likelihood of a customer defaulting on their auto loan using classification models

Jupyter Notebook 5.87% Python 92.06% PowerShell 0.02% Shell 0.01% C 0.87% Cython 0.82% C++ 0.18% Fortran 0.03% Makefile 0.01% MATLAB 0.01% TeX 0.09% JavaScript 0.02% CSS 0.02%
classification python lasso-regression feature-engineering feature-selection variance-inflation-factor logistic-regression knn-classification xgboost-classifier gridsearchcv voting-classifier

vehicle-default-loan-prediction's Introduction

Vehicle Default Loan Prediction

Financial institutions are suffering significant losses from car loan defaults. This has led to tighter lending criteria and higher car loan denial rates, and institutions have recognized the need for better credit risk scoring models. In this project we therefore seek to predict vehicle loan defaults, that is, to determine whether a customer with a given profile is likely to default on their loan.

Research Hypothesis:

We test whether the likelihood that a borrower will default on a car loan can be predicted accurately.

Data Description

Our data comes from Kaggle and contains information about the borrower (demographics such as age, income, etc.), the loan itself (amount, loan-to-value ratio, Equated Monthly Installments, etc.) and the credit bureau (bureau score, number of active accounts, status of other loans, credit history, etc.). The country of origin is not disclosed. Our inspection of the columns suggests the data is from India, but because that assumption is not fully verified, we treat the origin as undisclosed.

The dataset has more than 230,000 rows and 41 columns. The variables are described below:

Variable Name                        Description
UniqueID                             Identifier for customers
loan_default                         Payment default in the first EMI on due date, 1 for default
disbursed_amount                     Amount of loan disbursed
asset_cost                           Cost of the asset
ltv                                  Loan to value of the asset
branch_id                            Branch where the loan was disbursed
supplier_id                          Vehicle dealer where the loan was disbursed
manufacturer_id                      Vehicle manufacturer (Hero, Honda, TVS etc.)
Current_pincode                      Current pincode of the customer
Date.of.Birth                        Date of birth of the customer
Employment.Type                      Employment type of the customer (Salaried/Self Employed)
DisbursalDate                        Date of disbursement
State_ID                             State of disbursement
Employee_code_ID                     Employee of the organization who logged the disbursement
MobileNo_Avl_Flag                    Flagged as 1 if the mobile number was shared by the customer
Aadhar_flag                          Flagged as 1 if Aadhar was shared by the customer
PAN_flag                             Flagged as 1 if PAN was shared by the customer
VoterID_flag                         Flagged as 1 if voter ID was shared by the customer
Driving_flag                         Flagged as 1 if DL was shared by the customer
Passport_flag                        Flagged as 1 if passport was shared by the customer
PERFORM_CNS.SCORE                    Bureau score
PERFORM_CNS.SCORE.DESCRIPTION        Bureau score description
PRI.NO.OF.ACCTS                      Count of total loans taken by the customer at the time of disbursement
PRI.ACTIVE.ACCTS                     Count of active loans taken by the customer at the time of disbursement
PRI.OVERDUE.ACCTS                    Count of default accounts at the time of disbursement
PRI.CURRENT.BALANCE                  Total principal outstanding amount of the active loans at the time of disbursement
PRI.SANCTIONED.AMOUNT                Total amount that was sanctioned for all the loans at the time of disbursement
PRI.DISBURSED.AMOUNT                 Total amount that was disbursed for all the loans at the time of disbursement
SEC.NO.OF.ACCTS                      Count of total loans taken by the customer at the time of disbursement
SEC.ACTIVE.ACCTS                     Count of active loans taken by the customer at the time of disbursement
SEC.OVERDUE.ACCTS                    Count of default accounts at the time of disbursement
SEC.CURRENT.BALANCE                  Total principal outstanding amount of the active loans at the time of disbursement
SEC.SANCTIONED.AMOUNT                Total amount that was sanctioned for all the loans at the time of disbursement
SEC.DISBURSED.AMOUNT                 Total amount that was disbursed for all the loans at the time of disbursement
PRIMARY.INSTAL.AMT                   EMI amount of the primary loan
SEC.INSTAL.AMT                       EMI amount of the secondary loan
NEW.ACCTS.IN.LAST.SIX.MONTHS         New loans taken by the customer in the last 6 months before the disbursement
DELINQUENT.ACCTS.IN.LAST.SIX.MONTHS  Loans defaulted in the last 6 months
AVERAGE.ACCT.AGE                     Average loan tenure
CREDIT.HISTORY.LENGTH                Time since first loan
NO.OF_INQUIRIES                      Enquiries done by the customer for loans
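
As an illustration of how the columns above might be turned into model features, here is a minimal sketch using pandas. It is not taken from the project notebooks: the function name `add_basic_features`, the derived columns `docs_shared` and `age_at_disbursal`, and the ISO date format of the sample rows are all assumptions, and the real file may store dates differently and need an explicit `format=` argument.

```python
import pandas as pd

# The five identity-document flags from the data description.
FLAG_COLUMNS = ["Aadhar_flag", "PAN_flag", "VoterID_flag", "Driving_flag", "Passport_flag"]


def add_basic_features(df):
    """Return a copy of df with two illustrative (hypothetical) derived columns."""
    out = df.copy()
    # Number of identity documents the customer shared (0 to 5).
    out["docs_shared"] = out[FLAG_COLUMNS].sum(axis=1)
    # Approximate age in years on the disbursal date.
    # Assumption: both date columns parse with pandas' default parser.
    dob = pd.to_datetime(out["Date.of.Birth"])
    disbursed = pd.to_datetime(out["DisbursalDate"])
    out["age_at_disbursal"] = (disbursed - dob).dt.days / 365.25
    return out


# Two made-up rows, for illustration only (not real records).
sample = pd.DataFrame({
    "Date.of.Birth": ["1990-01-01", "1985-06-15"],
    "DisbursalDate": ["2018-01-01", "2018-06-15"],
    "Aadhar_flag": [1, 1],
    "PAN_flag": [0, 1],
    "VoterID_flag": [0, 0],
    "Driving_flag": [0, 1],
    "Passport_flag": [0, 0],
    "loan_default": [0, 1],
})

features = add_basic_features(sample)
print(features[["docs_shared", "age_at_disbursal", "loan_default"]])
```

Dividing by 365.25 gives only an approximate age, which is usually sufficient for a model feature.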

Tools:

The prediction will be delivered in an IPython Notebook within a Cookiecutter project structure. The tools to be used are:

  • Python 3.7
  • Pandas
  • NumPy
  • Scikit-learn
  • Matplotlib
  • Seaborn
  • Tableau
  • Flask - Deploying model through Heroku

vehicle-default-loan-prediction's Issues

Tuning parameters

Implement grid search on chosen models to choose best parameters:

  • KNN
  • Random forest
  • XGBoost
  • Logistic regression (Assigned to Mohammed)
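
The issue does not say which grids or scoring metric to use. The sketch below shows one way the logistic regression item could look with scikit-learn's `GridSearchCV`, run on synthetic data from `make_classification` rather than the loan dataset. The grid values, the F1 scoring choice, the `class_weight="balanced"` setting and the pipeline step names are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the loan data (roughly 22% positives).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.78, 0.22], random_state=0
)

# Scaling lives inside the pipeline so it is re-fitted on each training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Hypothetical grid: only the regularization strength C is searched here.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="f1",  # F1 of the positive (default) class
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The same pattern would apply to KNN, random forest and XGBoost by swapping the `clf` step and the grid keys.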

Investigate logistic regression scores

Logistic regression with our chosen parameters yields these results:

Training classification report:

              precision    recall  f1-score   support

           0       0.65      0.49      0.56    135558
           1       0.63      0.77      0.69    153784

    accuracy                           0.64    289342
   macro avg       0.64      0.63      0.62    289342
weighted avg       0.64      0.64      0.63    289342

Validation classification report:

              precision    recall  f1-score   support

           0       0.81      0.40      0.54     34000
           1       0.24      0.67      0.35      9502

    accuracy                           0.46     43502
   macro avg       0.53      0.54      0.44     43502
weighted avg       0.69      0.46      0.50     43502
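
To help with the investigation, the validation figures can be cross-checked by rebuilding an approximate confusion matrix from the reported recall and support values. The sketch below uses only numbers from the validation report above. The function name `confusion_from_report` is hypothetical, and because the report rounds recall to two decimals, the counts are approximations.

```python
def confusion_from_report(support_neg, recall_neg, support_pos, recall_pos):
    """Approximate confusion-matrix cells from per-class recall and support."""
    tn = recall_neg * support_neg  # class 0 rows predicted as 0
    tp = recall_pos * support_pos  # class 1 rows predicted as 1
    fp = support_neg - tn          # class 0 rows predicted as 1
    fn = support_pos - tp          # class 1 rows predicted as 0
    return tn, fp, fn, tp


# Validation report: class 0 support 34000, recall 0.40;
#                    class 1 support 9502,  recall 0.67.
tn, fp, fn, tp = confusion_from_report(34000, 0.40, 9502, 0.67)

precision_pos = tp / (tp + fp)
precision_neg = tn / (tn + fn)
accuracy = (tp + tn) / (34000 + 9502)

print(f"precision(1) = {precision_pos:.2f}")
print(f"precision(0) = {precision_neg:.2f}")
print(f"accuracy     = {accuracy:.2f}")
```

The recomputed values round to the reported 0.24, 0.81 and 0.46, so the validation report is internally consistent. The supports also show that class 1 makes up about 53% of the training rows (153784 of 289342) but about 22% of the validation rows (9502 of 43502). One possible reading, which the source does not confirm, is that this difference in class balance accounts for much of the drop in class 1 precision, since roughly 20,400 false positives are drawn from the much larger class 0.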

Outliers

Remove outliers in the counts of active loans and of loans previously taken by the customer at the time of disbursement.
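
The issue does not name an outlier rule. As one possible approach, the sketch below applies the common 1.5 × IQR rule to `PRI.NO.OF.ACCTS` and `PRI.ACTIVE.ACCTS` with pandas. The choice of the IQR rule, the function name `drop_iqr_outliers`, the multiplier `k` and the sample values are all assumptions for illustration.

```python
import pandas as pd


def drop_iqr_outliers(df, columns, k=1.5):
    """Keep rows within [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1 = df[col].quantile(0.25)
        q3 = df[col].quantile(0.75)
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


# Made-up counts, for illustration only; the last row is an obvious outlier.
loans = pd.DataFrame({
    "PRI.NO.OF.ACCTS": [0, 1, 2, 2, 3, 4, 5, 120],
    "PRI.ACTIVE.ACCTS": [0, 1, 1, 2, 2, 3, 3, 60],
})

cleaned = drop_iqr_outliers(loans, ["PRI.NO.OF.ACCTS", "PRI.ACTIVE.ACCTS"])
print(len(loans), "->", len(cleaned))
```

Count columns are often heavily skewed toward zero, in which case the IQR can be 0 and this rule would drop every non-zero row. A percentile cap or a log transform may then be a better fit.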
