
eda-on-bank-loan-data's Introduction

Investment-Study-

  • This analysis report covers a study of online loans, with several derived views of return on investment, loan defaults, and the number of loans issued by employment, state, loan purpose and other metrics.
  • Data shape - 39717 rows and 111 columns

Code and Resources used

Python: 3.0
Packages: pandas, numpy, matplotlib, seaborn.


Data Cleaning

After reading the data, the following changes were made:

  • Fixing rows and columns: data type changes and corrections to column values
  • Converting string objects to proper types for the columns issue_d and last_pymt_date (to date objects) and int_rate (an interest rate, so presumably to a numeric type rather than a date)
  • Removing extra whitespace from the values of the column term
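
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration on a made-up three-column sample, not the repo's actual code: the date format "%b-%y" and the handling of int_rate as a percentage string converted to float are assumptions.

```python
import pandas as pd

# Tiny made-up sample; the real data has 39717 rows and 111 columns.
loans = pd.DataFrame({
    "issue_d": ["Dec-11", "Nov-10"],
    "term": [" 36 months", " 60 months"],
    "int_rate": ["10.65%", "15.27%"],
})

# Parse month-year strings such as "Dec-11" into datetimes
# (the "%b-%y" format is an assumption about the raw values).
loans["issue_d"] = pd.to_datetime(loans["issue_d"], format="%b-%y")

# Remove the extra whitespace around the term values.
loans["term"] = loans["term"].str.strip()

# Assumption: int_rate arrives as a percentage string, so drop the
# trailing percent sign and store the rate as a float.
loans["int_rate"] = loans["int_rate"].str.rstrip("%").astype(float)

print(loans.dtypes)
```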

Missing values

Dropping unnecessary columns and columns with more than 70% missing values, and filling the remaining missing values.

  • Let's drop the columns with more than 30% missing values (since the data is already huge).
  • Since the data has little spread and the difference between mean and median is very small, let's impute the missing values of the column revol_util with the mean.
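
A minimal sketch of this step, assuming a small made-up frame: the column names loan_amnt and mostly_empty are hypothetical, and the threshold variable is set to 30% only for illustration, since the text mentions both 30% and 70%.

```python
import numpy as np
import pandas as pd

# Made-up sample; "mostly_empty" stands in for a sparse column.
loans = pd.DataFrame({
    "loan_amnt": [5000, 2500, 2400, 10000],
    "revol_util": [83.7, np.nan, 98.5, 21.0],
    "mostly_empty": [np.nan, np.nan, np.nan, 1.0],
})

# Hypothetical cut-off: drop columns whose share of missing values exceeds it.
threshold = 0.30
missing_share = loans.isna().mean()
loans = loans.loc[:, missing_share <= threshold].copy()

# Mean and median are close for revol_util, so fill the gaps with the mean.
loans["revol_util"] = loans["revol_util"].fillna(loans["revol_util"].mean())

print(loans)
```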

Analysis of the number of loans issued

  1. The plots above show that most loans came from grades B, A and C, and the fewest from grade G.
  2. Among sub-grades, A4 and B3 have the most loans.
  3. The 3rd plot shows that grade A, B and C loans carry lower interest rates, while E, F and G carry higher ones. The 1st and 2nd plots show more loans from grades A, B and C (with the sub-grades as a granularity check). A likely reason is that applicants in grades A, B and C have better credit scores and lower risk.
  4. The 4th plot shows high funded amounts in grades A, B, C and D, likely because applicants in these grades have better credit scores and lower risk.
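
The numbers behind plots like these can be summarised without any plotting. The sketch below uses invented rows, and it assumes a grade column alongside int_rate already converted to a numeric type; it is not the repo's own code.

```python
import pandas as pd

# Invented sample rows for illustration only.
loans = pd.DataFrame({
    "grade": ["A", "B", "B", "C", "G", "A", "B"],
    "int_rate": [7.5, 11.0, 12.0, 14.0, 21.0, 8.5, 10.0],
})

# Number of loans issued per grade, ordered A to G.
loans_per_grade = loans["grade"].value_counts().sort_index()

# Average interest rate per grade.
mean_rate_per_grade = loans.groupby("grade")["int_rate"].mean()

print(loans_per_grade)
print(mean_rate_per_grade)
```

Either series could then be passed to a bar plot in matplotlib or seaborn.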

We see that the majority of borrowers have been employed for at least 10 years.


Analysis on the loan defaulters

More defaulters fall under the RENT and MORTGAGE home ownership categories.

There are more defaulters from the 'debt_consolidation', 'other', 'credit_card' and 'small_business' loan purposes.
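
A rough sketch of how such defaulter counts might be derived. The rows are made up, the column names loan_status, home_ownership and purpose are assumptions, and treating 'Charged Off' as the defaulted status is also an assumption.

```python
import pandas as pd

# Made-up sample rows for illustration only.
loans = pd.DataFrame({
    "loan_status": ["Charged Off", "Fully Paid", "Charged Off",
                    "Charged Off", "Fully Paid"],
    "home_ownership": ["RENT", "OWN", "MORTGAGE", "RENT", "MORTGAGE"],
    "purpose": ["debt_consolidation", "car", "credit_card",
                "debt_consolidation", "other"],
})

# Assumption: a loan counts as defaulted when its status is "Charged Off".
defaulters = loans[loans["loan_status"] == "Charged Off"]

# Count defaulters per home ownership category and per loan purpose.
by_home = defaulters["home_ownership"].value_counts()
by_purpose = defaulters["purpose"].value_counts()

print(by_home)
print(by_purpose)
```

Raw counts favour the largest categories, so comparing default rates per category may be a useful complement.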


Conclusion:

  1. The number of loans issued increased steadily every year, with a slight decrease in 2008.
  2. Of settled loans, 83% were Fully Paid and 14% were Charged Off.
  3. Borrowers who own their house, and loans taken for debt consolidation, 'credit_card' and 'small_business' purposes, are not at much risk, but borrowers with rented or mortgaged homes are high-risk applicants.
  4. The majority of loans were from grades A, B and C.
  5. There is an inverse relationship between interest rate and loan grade: the lower grades (E, F, G) have higher interest rates.
  6. Overall, across all grades, there are more defaulters from 'debt_consolidation', 'other', 'credit_card' and 'small_business' purpose loans.
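
Figures like the yearly loan counts and the loan status shares can be computed along these lines. This is only a sketch on invented rows; it assumes issue_d has already been parsed to a datetime and that the status column is named loan_status.

```python
import pandas as pd

# Invented sample rows for illustration only.
loans = pd.DataFrame({
    "issue_d": pd.to_datetime(
        ["2007-06-01", "2008-01-01", "2009-03-01", "2009-09-01"]
    ),
    "loan_status": ["Fully Paid", "Charged Off", "Fully Paid", "Fully Paid"],
})

# Number of loans issued per calendar year, in chronological order.
loans_per_year = loans["issue_d"].dt.year.value_counts().sort_index()

# Share of each loan status, as a percentage of all rows.
status_share = loans["loan_status"].value_counts(normalize=True) * 100

print(loans_per_year)
print(status_share)
```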

Inspiration:

  • This repo gives you everything required to understand data exploration as a beginner.
  • From data cleaning and data preprocessing to data visualisation, you will get an excellent overview with multiple cases.
  • It covers working with categorical and numerical data types, so it can be your go-to guide for EDA.

Hope you like my work here. Do upvote and fork this repo for your own good!

eda-on-bank-loan-data's People

Contributors

lokeshrathi
