Let’s practice and become familiar with classification algorithms.
-
Exercise 1:
- Create at least three different classification models and try to predict flight arrival delay (ArrDelay) from DelayedFlights.csv as accurately as possible.
-
Exercise 2:
- Create a new variable indicating whether or not the flight arrived late (ArrDelay > 0).
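
One possible way to build this variable is sketched below, assuming the data sits in a pandas DataFrame. The column name Late and the sample values are made up for illustration; only ArrDelay comes from the exercise.

```python
import pandas as pd

# Tiny stand-in for DelayedFlights.csv; the values are invented for illustration.
flights = pd.DataFrame({"ArrDelay": [-5.0, 0.0, 12.0, 47.0, None]})

# Rows without an arrival delay cannot be labelled, so drop them first.
flights = flights.dropna(subset=["ArrDelay"]).copy()

# 1 = arrived late (ArrDelay > 0), 0 = on time or early.
# "Late" is a hypothetical column name.
flights["Late"] = (flights["ArrDelay"] > 0).astype(int)

print(flights["Late"].tolist())
```

A flight with ArrDelay equal to 0 counts as on time here, because the condition is strictly greater than zero.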
-
Exercise 3:
- Compare the classification models using accuracy, a confusion matrix, and other more advanced metrics.
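
The comparison can be sketched with scikit-learn's metric functions. The labels and scores below are invented to show the calls; in the exercise they would come from each fitted model. Choosing precision, recall, F1 and ROC AUC as the "more advanced" metrics is an assumption, not a requirement of the exercise.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Made-up ground truth, hard predictions and predicted probabilities of "late".
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.2, 0.7, 0.3]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)  # uses scores, not hard predictions

print(cm)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} "
      f"f1={f1:.3f} roc_auc={auc:.4f}")
```

Accuracy alone can mislead when one class dominates, which is why the confusion matrix and the class-specific metrics are worth reporting side by side.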
-
Exercise 4:
- Train the models using different values for the parameters they support.
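
One common way to try several parameter values is a cross-validated grid search. The sketch below uses synthetic data from make_classification in place of the flight data. The parameter grid and the F1 scoring choice are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the flight features and the late/on-time label.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Candidate values to try; these particular values are only an example.
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 10]}

# 5-fold cross-validation over every combination in the grid.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="f1",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The same pattern should work for the other models by swapping the estimator and the grid, for example n_neighbors for KNN or C for logistic regression.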
-
Exercise 5:
- Compare their performance using the train/test approach versus using all the data (internal validation).
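
The difference between the two approaches can be illustrated on synthetic data. This is a sketch under stated assumptions: make_classification with 20% label noise stands in for the flight data, and an unpruned decision tree is used because it makes the gap easy to see.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with noisy labels, standing in for the flight data.
X, y = make_classification(n_samples=500, n_features=8, flip_y=0.2,
                           random_state=0)

# (a) Internal validation: fit and score on the same, complete data set.
full_model = DecisionTreeClassifier(random_state=0).fit(X, y)
in_sample_acc = accuracy_score(y, full_model.predict(X))

# (b) Train/test: fit on 70% of the rows, score on the held-out 30%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
split_model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
test_acc = accuracy_score(y_test, split_model.predict(X_test))

print(f"in-sample accuracy: {in_sample_acc:.3f}")
print(f"held-out accuracy:  {test_acc:.3f}")
```

An unpruned tree memorises its training rows, so the in-sample score is optimistic. The held-out score is the more realistic estimate of performance on new flights.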
-
Exercise 6:
- Perform some feature engineering to improve prediction.
-
Exercise 7:
- Do not use the DepDelay variable when making predictions.
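
The feature-engineering and DepDelay requirements above can be sketched together with pandas. The columns CRSDepTime, DayOfWeek and Distance, their encodings, and the derived names SchedDepHour and IsWeekend are assumptions made for illustration; check them against the actual file before relying on them.

```python
import pandas as pd

# Invented rows; column names other than ArrDelay and DepDelay are assumed.
flights = pd.DataFrame({
    "CRSDepTime": [630, 1745, 2210],   # scheduled departure as hhmm (assumed)
    "DayOfWeek": [1, 6, 7],            # 1 = Monday ... 7 = Sunday (assumed)
    "Distance": [300, 1200, 2500],
    "DepDelay": [3.0, 40.0, 15.0],
    "ArrDelay": [-2.0, 55.0, 9.0],
})

# Derived features (hypothetical names): scheduled hour and a weekend flag.
flights["SchedDepHour"] = flights["CRSDepTime"] // 100
flights["IsWeekend"] = (flights["DayOfWeek"] >= 6).astype(int)

# Target: 1 if the flight arrived late.
y = (flights["ArrDelay"] > 0).astype(int)

# Drop the target source and DepDelay so neither leaks into the predictors.
X = flights.drop(columns=["ArrDelay", "DepDelay", "CRSDepTime"])

print(X.columns.tolist())
```

Leaving DepDelay out makes the task harder but more realistic, since the departure delay is not known when a prediction would be most useful.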
- Classification trees
- KNN - k-Nearest Neighbors
- Logistic Regression
- Support Vector Machine
- XGBoost
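
Three of the models listed above can be fitted and compared with a few lines of scikit-learn. The sketch uses synthetic data instead of DelayedFlights.csv, and the hyperparameter values are arbitrary starting points, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the flight features and the late/on-time label.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# KNN and logistic regression are scale-sensitive, so they get a scaler.
models = {
    "Classification tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out rows
    print(f"{name}: {scores[name]:.3f}")
```

A support vector machine (sklearn.svm.SVC) or XGBoost (XGBClassifier from the separate xgboost package) could be added to the dictionary in the same way.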