Goal: In this project, we want to predict whether or not loans acquired by Fannie Mae will become more than 90 days delinquent.
Fannie Mae acquires loans from other lenders as a way of inducing them to lend more. Fannie Mae releases data on the loans it has acquired and their performance afterwards.
- For this project, I used Fannie Mae's Single-Family Historical Loan Performance Dataset.
- Fannie Mae provides loan performance data on a portion of its single-family mortgage loans to promote better understanding of the credit performance of Fannie Mae mortgage loans.
- Data can be downloaded from Fannie Mae's website.
- Fannie Mae requires the user to register and create a unique username and password in order to access the performance data.
- After creating the account, we can log in to Data Dynamics, and download the data we need for this project.
- We will download the "Single-Family Loan Acquisition and Performance" data. The data are available by quarter from 2000 Q1 through the latest available quarter (2023 Q2 at the time of writing). This project uses the data from 2020 Q1 through 2023 Q2.
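As a rough illustration of the quarter range used here, the labels from 2020 Q1 through 2023 Q2 can be enumerated in R. This is only a sketch: the `data` folder and the `<year>Q<quarter>.csv` file naming are assumptions made for the example, not something the project or Fannie Mae guarantees.

```r
# Enumerate every quarter from 2020 Q1 through 2023 Q2.
# expand.grid varies the first column fastest, so rows come out in
# chronological order: 2020 Q1, 2020 Q2, ..., 2023 Q4.
grid <- expand.grid(quarter = 1:4, year = 2020:2023)

# Keep everything up to and including 2023 Q2.
keep <- !(grid$year == 2023 & grid$quarter > 2)
quarter_labels <- paste0(grid$year[keep], "Q", grid$quarter[keep])

# Hypothetical file locations: folder and naming scheme are assumptions.
quarter_files <- file.path("data", paste0(quarter_labels, ".csv"))

print(quarter_labels)
```

A vector like `quarter_files` could then be looped over when reading the raw files, provided the downloaded files actually follow this naming.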
- col_names.R: sets column names and variable types
- read_data.R: reads the downloaded raw data sets into R data frames
- prep_data.R: prepares the working data by selecting the relevant acquisition and performance variables, renaming variables, constructing the derived variables required in the analysis, cleaning the data, and saving the working data
- predict.R: creates training and test datasets and performs predictive analysis using logistic regression
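The overall flow of these scripts (derive a delinquency flag, split the data, fit a logistic regression) can be sketched on synthetic data. This is a minimal illustration, not the code in prep_data.R or predict.R. The column names (`loan_id`, `credit_score`, `dti`, `months_delinq`), the simulated values, the 80/20 split, the 0.5 cutoff, and the rule that a delinquency status of 3 or more months counts as 90+ days delinquent are all assumptions made for the example.

```r
set.seed(42)  # make the synthetic example reproducible

n_loans <- 500

# Synthetic acquisition data: one row per loan (hypothetical columns).
acq <- data.frame(
  loan_id      = seq_len(n_loans),
  credit_score = round(rnorm(n_loans, mean = 740, sd = 45)),
  dti          = round(runif(n_loans, min = 10, max = 50))
)

# Synthetic performance data: 12 monthly rows per loan. Loans with lower
# credit scores are simulated with a higher delinquency rate.
rate <- 0.4 * exp(-(acq$credit_score - 740) / 100)
perf <- data.frame(
  loan_id       = rep(acq$loan_id, each = 12),
  months_delinq = rpois(n_loans * 12, lambda = rep(rate, each = 12))
)

# Derived target: 1 if the loan ever reached 3+ months delinquent
# (assumed here to correspond to 90+ days), otherwise 0.
worst <- aggregate(months_delinq ~ loan_id, data = perf, FUN = max)
worst$delinquent <- as.integer(worst$months_delinq >= 3)
loans <- merge(acq, worst[, c("loan_id", "delinquent")], by = "loan_id")

# 80/20 train/test split (the proportion is an assumption).
train_idx <- sample(nrow(loans), size = floor(0.8 * nrow(loans)))
train <- loans[train_idx, ]
test  <- loans[-train_idx, ]

# Logistic regression of the delinquency flag on the loan characteristics.
fit <- glm(delinquent ~ credit_score + dti, data = train, family = binomial())

# Predicted probabilities and a simple 0.5 classification cutoff.
test$prob <- predict(fit, newdata = test, type = "response")
test$pred <- as.integer(test$prob > 0.5)

accuracy <- mean(test$pred == test$delinquent)
print(summary(fit)$coefficients)
print(accuracy)
```

Because delinquency is rare, plain accuracy at a 0.5 cutoff can look good even when few delinquent loans are identified, so a real analysis would likely also examine measures such as recall or AUC.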
Please note that this is an ongoing analysis, and will be updated.