XYZ is a courier company, and human capital plays an important role in its collection, transportation, and delivery operations. The company is facing a serious absenteeism problem. It has shared its dataset and asked for answers to the following questions:
- What changes should the company make to reduce absenteeism?
- What monthly losses can we project for 2011 if the same absenteeism trend continues?

The analysis is organized as follows:
- Data Pre-processing.
- Data Visualization.
- Outlier Analysis.
- Missing value Analysis.
- Feature Selection.
- Correlation analysis.
- Chi-Square test.
- Analysis of Variance (ANOVA) Test.
- Multicollinearity Test.
- Feature Scaling.
- Normalization.
- Splitting into Train and Test Dataset.
- Dimensionality Reduction using PCA technique.
- Hyperparameter Optimization.
- Model Development.
  - I. Linear Regression
  - II. Decision Tree
  - III. Random Forest
  - IV. XGBoost
- Model Performance without PCA.
- Model Performance with PCA.
- Conclusion.
- Python Code.
- R Code.
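
As a starting point for the monthly-loss question stated at the top, the sketch below sums absent hours per month and converts them to a cost. It is only an illustration under stated assumptions: the sample records, the `HOURLY_COST` figure, and the `monthly_loss` helper are all hypothetical and are not taken from the company's dataset, and the loss definition (absent hours times a flat hourly cost) is a simple proxy that may differ from the one used in the analysis. Projecting 2011 from these totals also assumes that the observed monthly pattern repeats unchanged.

```python
from collections import defaultdict

# Hypothetical records: (month of absence 1-12, absent hours).
# These values are made up for illustration, not taken from the real dataset.
records = [(1, 4.0), (1, 8.0), (2, 2.0), (2, 3.0), (3, 16.0)]

# Assumed flat cost of one absent hour; replace with a real figure.
HOURLY_COST = 10.0


def monthly_loss(records, hourly_cost):
    """Sum absent hours per month and convert each total to a cost."""
    hours = defaultdict(float)
    for month, absent_hours in records:
        hours[month] += absent_hours
    # Sort by month so the result reads in calendar order.
    return {month: total * hourly_cost for month, total in sorted(hours.items())}


projected = monthly_loss(records, HOURLY_COST)
for month, loss in projected.items():
    print(f"month {month}: projected loss {loss:.2f}")
```

With real data, the same grouping could be applied to the dataset's month and absence-hours columns, and the flat hourly cost replaced by whatever loss measure the company actually uses.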