This notebook walks through the stages of a data science pipeline:
- Data collection
- Data preparation
- Data exploration
- Modeling
- Deployment
According to a recent Glassdoor survey, nearly 35 percent of hiring decision makers expected more employees to quit in 2018 than in 2017.
The goal is to show how data can help address the employee attrition problem. The cost of employee attrition is enormous. Attrition from resignation or retirement demands additional work hours from the remaining employees. Moreover, long-term employees have typically built a good rapport with customers and clients during their period of service, so losing them creates the risk of losing those customers and clients to competitors. This code pattern walks you through every step in the data science life cycle and arms you with different tools, techniques, and algorithms that can be used to solve the attrition problem. Highlighted tools include PixieDust, AIF360, pandas, and Jupyter Notebook. The model will be deployed on Watson Machine Learning, which provides you with an interactive dashboard to test the model.
- Can we find the factors that contribute significantly to employee attrition?
- Can we predict whether an employee is likely to quit? This would help HR intervene in time and remedy the situation.
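
As a rough illustration of how these two questions map onto code, here is a minimal sketch that uses scikit-learn on synthetic data. Everything in it is an assumption made for the example: the column names (`monthly_income`, `years_at_company`, `overtime`, `attrition`) are hypothetical rather than taken from the Kaggle dataset, the labels are generated artificially, and a random forest is picked only because it exposes feature importances. The notebook itself may use different features and models.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the column names are hypothetical,
# not the columns of the Kaggle attrition dataset.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "monthly_income": rng.normal(5000, 1500, n),
    "years_at_company": rng.integers(0, 20, n),
    "overtime": rng.integers(0, 2, n),
})

# Artificial label: overtime raises and longer tenure lowers the chance of leaving.
logit = 1.5 * df["overtime"] - 0.15 * df["years_at_company"] - 0.5
prob = 1 / (1 + np.exp(-logit))
df["attrition"] = (rng.random(n) < prob).astype(int)

X = df.drop(columns="attrition")
y = df["attrition"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative model choice; any classifier with predict_proba would do.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Question 1: which factors contribute most to the prediction?
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)

# Question 2: how likely is each held-out employee to quit?
leave_probability = model.predict_proba(X_test)[:, 1]
print("Test accuracy:", model.score(X_test, y_test))
```

Feature importances from a tree ensemble only rank how much the model relied on each column. They are a starting point for the first question, not proof that a factor causes attrition.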
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- Seaborn
- Graphviz
- Source: Kaggle
- Dataset Link
- License
  - Database: Open Database
  - Contents: Database Contents