capstone-project--applicantscorecardpreparation's Introduction

`

This capstone project in the BFSI domain was submitted as the first phase.

Problem Description:
Given demographic data and credit bureau data for CredX applicants, the task is to evaluate CredX customers using a scorecard.

Methodology:
The project follows the CRISP-DM framework. Its components are listed below, together with the approach followed and the results achieved in each.

  1. Business Objective:

    CredX is a leading credit card provider that receives thousands of credit card applications every year. Over the past few years, however, it has experienced an increase in credit loss. The company believes that the best strategy to mitigate credit risk is to ‘acquire the right customers’.
    In this project, we help CredX identify the right customers using predictive models. Using past data on the bank’s applicants, we need to determine the factors affecting credit risk and create strategies to mitigate acquisition risk.

  1. Data understanding:

    Exploratory Data Analysis (EDA) was performed at the univariate, bivariate and multivariate levels to understand the data
    and the factors that drive ‘PerformenceTag’. Below is a high-level snapshot of the given data.

    Demographic/applicant data: This is obtained from the information provided by the applicants at the time of credit card application. It contains customer-level information on age, gender, income, marital status, etc.

    Credit Bureau Data: This is taken from the credit bureau and contains variables such as ‘number of times 30 DPD or worse in last 3/6/12 months’, ‘outstanding balance’, ‘number of trades’, etc.

  2. Data preparation:

    The following data quality issues were found and treated:

    1. Credit bureau data had 3 duplicate application IDs, which were removed.
    2. An applicant age of ‘-3’ was treated as an incorrect age.
    3. Gender is missing for 2 applicants.
    4. Marital Status is missing for 6 applicants.
    5. Number of Dependents is missing for 3 applicants, one of whom has a recorded age of 0.
    6. One applicant’s income is recorded as -0.5.
    7. Education details are missing for 119 applicants.
    8. Profession information is missing for 14 applicants.
    9. Type of Residence is missing for 8 applicants.
    10. Performance Tag is missing for 2% of the applicants.
    11. All missing values except those in Performance Tag were replaced with WoE values.
    12. Applicants with a missing Performance Tag were treated as rejected applicants.
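The first few treatments in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's own code: the field names (`application_id`, `age`, `income`) are hypothetical, the first record of a duplicated ID is kept, and non-positive ages and negative incomes are marked as missing so that they can later be handled like any other missing value.

```python
def clean_applicants(rows):
    """Drop duplicate application IDs and blank out impossible values."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        app_id = row["application_id"]
        if app_id in seen_ids:
            continue  # assumption: keep the first record of a duplicated ID
        seen_ids.add(app_id)

        row = dict(row)  # copy so the caller's data is not mutated
        # Assumption: non-positive ages and negative incomes are data-entry
        # errors, so they are marked missing (None) instead of being kept.
        if row["age"] is not None and row["age"] <= 0:
            row["age"] = None
        if row["income"] is not None and row["income"] < 0:
            row["income"] = None
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows, not taken from the CredX data.
sample = [
    {"application_id": 1, "age": 35, "income": 40.0},
    {"application_id": 1, "age": 35, "income": 40.0},  # duplicate ID
    {"application_id": 2, "age": -3, "income": 25.0},  # invalid age
    {"application_id": 3, "age": 0, "income": -0.5},   # invalid age and income
]
cleaned = clean_applicants(sample)
```

Marking invalid values as missing, rather than dropping the rows, keeps those applicants available for the WoE-based treatment of missing values.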

    WoE Analysis:

    WoE plots were analysed to identify how each attribute in the given data relates to PerformenceTag. Weight of Evidence (WoE) values were created for each attribute on the merged data (Demographic Data + Credit Bureau Data), and the given data was populated with these WoE values for further model building.
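As a minimal sketch of the WoE step, the function below computes one WoE value per bin and then replaces each raw value with the WoE of its bin. Several choices here are assumptions rather than the project's method: the sign convention WoE = ln(share of goods / share of bads) (some texts use the inverse), the coding of the target as 1 for a bad applicant and 0 for a good one, and the use of a separate "missing" bin, which is one way missing values can end up replaced by a WoE value. The example data is made up.

```python
import math
from collections import Counter


def woe_by_bin(bins, targets):
    """Return {bin: WoE} with WoE = ln(share of goods / share of bads).

    targets uses 1 for a bad applicant and 0 for a good one (assumption).
    Every bin must contain at least one good and one bad applicant,
    otherwise the logarithm is undefined.
    """
    good, bad = Counter(), Counter()
    for label, target in zip(bins, targets):
        if target == 1:
            bad[label] += 1
        else:
            good[label] += 1
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    woe = {}
    for label in set(good) | set(bad):
        if good[label] == 0 or bad[label] == 0:
            raise ValueError(f"bin {label!r} has no goods or no bads")
        good_share = good[label] / total_good
        bad_share = bad[label] / total_bad
        woe[label] = math.log(good_share / bad_share)
    return woe


# Hypothetical attribute: 7 graduates (1 bad) and 3 missing values (1 bad).
bins = ["graduate"] * 7 + ["missing"] * 3
targets = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]

woe = woe_by_bin(bins, targets)
encoded = [woe[label] for label in bins]  # raw values replaced by WoE
```

With this convention a positive WoE marks a bin with relatively more good applicants than the population as a whole, and a negative WoE marks a riskier bin.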

    Information Value Analysis:

    An Information Value (IV) was computed for each attribute to measure how significant that attribute is for ‘PerformenceTag’.
    The benchmark convention was followed to judge the significance of the attributes.
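The IV of an attribute is commonly defined as the sum over its bins of (share of goods - share of bads) × WoE. The sketch below computes it from per-bin counts and attaches a label using rule-of-thumb cut-offs that are often quoted in credit scoring material (below 0.02 not predictive, 0.02 to 0.1 weak, 0.1 to 0.3 medium, above 0.3 strong). These cut-offs are an assumption here; the source does not state which benchmark values were used, and the counts in the example are invented.

```python
import math


def information_value(counts):
    """counts maps each bin to (n_good, n_bad); returns the attribute's IV."""
    total_good = sum(n_good for n_good, _ in counts.values())
    total_bad = sum(n_bad for _, n_bad in counts.values())
    iv = 0.0
    for n_good, n_bad in counts.values():
        good_share = n_good / total_good
        bad_share = n_bad / total_bad
        # Each bin contributes (good share - bad share) * WoE of the bin.
        iv += (good_share - bad_share) * math.log(good_share / bad_share)
    return iv


def iv_strength(iv):
    """Label an IV using assumed rule-of-thumb cut-offs."""
    if iv < 0.02:
        return "not predictive"
    if iv < 0.1:
        return "weak"
    if iv < 0.3:
        return "medium"
    return "strong"


# Hypothetical counts of (good, bad) applicants per bin.
counts = {"no 30 DPD": (700, 20), "30 DPD or worse": (300, 30)}
iv = information_value(counts)
label = iv_strength(iv)
```

Because each bin's term is a product of two factors with the same sign, every contribution is non-negative, so IV only measures how strongly an attribute separates goods from bads and not in which direction.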

  3. Modeling:

  4. Evaluation:

    The merged data (Demographic Data + Credit Bureau Data) was split 70:30, as recommended. A Bayesian logistic regression model was built on the training data to better understand the string attributes. It achieved 90% accuracy, 88% sensitivity and 90% specificity. The performance measures of the logistic regression model are provided. The resulting model is further used to compute the scorecard of each customer, and predictions were made on the test data.
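To make the reported measures and the scorecard step concrete, here is a small sketch. The first function derives accuracy, sensitivity and specificity from actual and predicted labels, assuming 1 is the positive class. The second maps a predicted probability of default to points with the usual log-odds scaling, score = offset + factor × ln(odds of good). The scaling parameters (`base_score`, `base_odds`, `pdo`) are hypothetical defaults chosen for illustration; the source does not say which scaling the project used.

```python
import math


def classification_metrics(actual, predicted):
    """Accuracy, sensitivity and specificity, with 1 as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual positives found
        "specificity": tn / (tn + fp),  # share of actual negatives found
    }


def score_from_probability(p_bad, base_score=400, base_odds=10, pdo=20):
    """Map a predicted default probability to scorecard points.

    base_score is the score given at good:bad odds of base_odds, and pdo
    is the number of points that doubles the odds. All three values are
    hypothetical scaling parameters, not taken from the project.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1 - p_bad) / p_bad
    return offset + factor * math.log(odds_good)


# Made-up labels, only to exercise the functions.
metrics = classification_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
score_at_base_odds = score_from_probability(1 / 11)   # odds of good = 10
score_at_double_odds = score_from_probability(1 / 21)  # odds of good = 20
```

With these assumed parameters, an applicant at 10:1 odds of being good receives the base score, and each doubling of the odds adds `pdo` points, which is what makes the scores easy to compare against a cut-off.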

  5. Deployment:

`

capstone-project--applicantscorecardpreparation's People

Contributors

settur1409
