Predicting Term Deposit Subscription by a client
Portugal-Banking-Data-UCI (https://archive.ics.uci.edu/ml/datasets/Bank+Marketing)
Abstract:
Marketing campaigns focus on customer needs and overall customer satisfaction.
Nevertheless, several variables determine whether a marketing campaign will succeed, and these need to be taken into account when designing one.
A term deposit is a deposit that a bank or financial institution offers at a fixed rate (often better than that of a regular deposit account), with the money returned at a specific maturity date.
Problem Statement:
Predict whether a customer subscribes to a term deposit when contacted by a marketing agent, by understanding the different features and performing predictive analytics.
Steps Performed:
- Data Cleaning
- Outlier removal using z-score
- Basic EDA
- Splitting into train and test
- Encoding the independent variables
- Model building and prediction
- Over-sampling minority class by SMOTE
- Model building and evaluation
- Conclusion
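Two of the steps above, z-score outlier removal and SMOTE over-sampling, can be sketched without any dependencies. This is a minimal illustration, not the code used in this project: the function names `remove_outliers_zscore` and `smote_like_oversample`, the threshold of 3, and the toy values are all assumptions. The second function only mimics the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); a real pipeline would more likely use `SMOTE` from the imbalanced-learn package.

```python
import random
from statistics import mean, pstdev


def remove_outliers_zscore(rows, column, threshold=3.0):
    """Keep rows whose `column` value is within `threshold` std devs of the mean."""
    values = [row[column] for row in rows]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return list(rows)  # all values identical, so nothing is an outlier
    return [row for row in rows if abs((row[column] - mu) / sigma) <= threshold]


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create `n_new` synthetic points by SMOTE-style interpolation.

    `minority` is a list of numeric feature vectors (already encoded) and
    must contain at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy data (made up for illustration): one implausible age among 21 clients.
clients = [{"age": a} for a in [25, 30, 35, 40, 45] * 4 + [120]]
cleaned = remove_outliers_zscore(clients, "age")

# Toy minority-class feature vectors (made up for illustration).
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [2.5, 2.5]]
extra = smote_like_oversample(minority, n_new=5, k=2)
```

Over-sampling is normally applied to the training split only, so that the test set keeps the original class balance.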
Data-set Info:
- age (numeric)
- job: type of job (categorical: "admin.","blue-collar","entrepreneur","housemaid","management","retired","self-employed","services","student","technician","unemployed","unknown")
- marital: marital status (categorical: "divorced","married","single","unknown"; note: "divorced" means divorced or widowed)
- education: education of individual (categorical: "basic.4y","basic.6y","basic.9y","high.school","illiterate","professional.course","university.degree","unknown")
- default: has credit in default? (categorical: "no","yes","unknown")
- housing: has housing loan? (categorical: "no","yes","unknown")
- loan: has personal loan? (categorical: "no","yes","unknown")
Related with the last contact of the current campaign:
- contact: contact communication type (categorical: "cellular","telephone")
- month: last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec")
- dayofweek: last contact day of the week (categorical: "mon","tue","wed","thu","fri")
- duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y="no"). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.
Other attributes:
- campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
- pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
- previous: number of contacts performed before this campaign and for this client (numeric)
- poutcome: outcome of the previous marketing campaign (categorical: "failure","nonexistent","success")
Social and economic context attributes:
- emp.var.rate: employment variation rate - quarterly indicator (numeric)
- cons.price.idx: consumer price index - monthly indicator (numeric)
- cons.conf.idx: consumer confidence index - monthly indicator (numeric)
- euribor3m: euribor 3 month rate - daily indicator (numeric)
- nr.employed: number of employees - quarterly indicator (numeric)
Output variable (desired target):
- y: has the client subscribed to a term deposit? (binary: "yes","no")
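Three notes in the attribute list affect preprocessing: `duration` leaks the outcome and should be dropped for a realistic model, `pdays` uses 999 as a "never contacted" sentinel, and `y` is a "yes"/"no" string. The sketch below shows one possible way to handle them on a single record. The helper `prepare_record`, the derived field `previously_contacted`, and the sample values are hypothetical and not part of the dataset or of this project's code.

```python
def prepare_record(record, realistic=True):
    """Return a cleaned copy of one client record (a dict keyed by attribute name)."""
    row = dict(record)  # copy so the caller's record is left untouched

    # pdays == 999 is a sentinel for "not previously contacted", not a day count.
    pdays = row["pdays"]
    row["previously_contacted"] = int(pdays != 999)  # hypothetical helper flag
    row["pdays"] = None if pdays == 999 else pdays

    # duration is only known after the call ends, so drop it unless benchmarking.
    if realistic:
        row.pop("duration", None)

    # Encode the binary target as 1 ("yes") or 0 ("no").
    row["y"] = 1 if row["y"] == "yes" else 0
    return row


# Sample record with made-up values, for illustration only.
sample = {"age": 41, "duration": 210, "pdays": 999, "previous": 0, "y": "no"}
prepared = prepare_record(sample)
benchmark = prepare_record(sample, realistic=False)
```

Passing `realistic=False` keeps `duration`, which matches the benchmark-only use described in the attribute note.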