Classification of 1994 Census Income Data
Problem Statement: Build a model that predicts whether the annual income of an individual in the US exceeds USD 50,000, based on the data available about that individual.
Data Set Description: The Census Income dataset was extracted by Barry Becker from the 1994 US Census database and donated to the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/Census+Income). It shows how a person's income varies with factors such as educational background, occupation, marital status, geography, age, and number of working hours per week.
Here’s a list of the independent (predictor) variables used to predict whether an individual earns more than USD 50,000:
- Age
- Work-class
- Final-weight
- Education
- Education-num (Number of years of education)
- Marital-status
- Occupation
- Relationship
- Race
- Sex
- Capital-gain
- Capital-loss
- Hours-per-week
- Native-country
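
To make the variable list concrete, here is a minimal sketch of how one record could be read and turned into a binary target. It is written in Python purely for illustration; the project itself was done in RStudio, and this is not the code used there. The snake_case column names, the `parse_records` helper, and the `high_income` field are assumptions of this sketch, and the sample line only illustrates the comma-separated layout of the UCI data file.

```python
import csv
import io

# Column names follow the predictor list above, plus the income label.
# The snake_case spellings are this sketch's own choice (an assumption).
COLUMNS = [
    "age", "workclass", "final_weight", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]

# Predictors that hold whole numbers; the rest are categorical strings.
NUMERIC = {
    "age", "final_weight", "education_num",
    "capital_gain", "capital_loss", "hours_per_week",
}

# One illustrative record in the comma-separated layout of the data file.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)


def parse_records(text):
    """Yield one dict per record, converting numeric fields to int."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, (value.strip() for value in row)))
        for name in NUMERIC:
            record[name] = int(record[name])
        # Binary target: True when the label says income is above USD 50,000.
        record["high_income"] = record["income"].startswith(">50K")
        yield record


records = list(parse_records(SAMPLE))
print(records[0]["age"], records[0]["education"], records[0]["high_income"])
```

Each parsed record keeps the 14 predictors next to the label, so the same dictionaries could feed whichever classifier is chosen later.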
The original project outline can be found at: https://www.edureka.co/blog/data-science-projects/. This project is used strictly for RStudio practice and served as my introductory project in data science.