In this work, I try to predict the winner of a football match using machine learning (ML) algorithms. The dataset used here was obtained from a freelancing site, but comparable data can be collected from other sources, such as https://www.kaggle.com/hugomathien/soccer and http://football-data.co.uk/data.php. Several ML algorithms were evaluated; Naive Bayes achieved the highest accuracy, 0.85 on the test data.
Initially, the dataset had 28 columns and 84788 entries. A few columns were removed because their data were unavailable, leaving 24 columns and 58864 valid entries for developing the prediction model. Naive Bayes was the best-performing model, with an initial accuracy score of 0.70. To improve model performance, win, draw, and loss probabilities for both the home and away teams were computed and included as features. These new features raised the accuracy score to 0.83, and hyperparameter tuning improved it further to 0.85.
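
The win, draw, and loss probability features could be derived roughly as in the sketch below. This is an illustration under stated assumptions, not the code used in this project: the tuple layout `(home_team, away_team, result)`, the result codes `"H"`, `"D"`, `"A"`, and the helper names `outcome_rates` and `rate_features` are all hypothetical, and the probabilities are assumed to be simple empirical frequencies of past results.

```python
from collections import defaultdict


def outcome_rates(matches):
    """Estimate per-team win/draw/loss rates, separately for home and away games.

    `matches` is an iterable of (home_team, away_team, result) tuples, where
    result is "H" (home win), "D" (draw) or "A" (away win). This layout is an
    assumption made for the example.
    """
    # counts[(team, venue)] -> [wins, draws, losses]
    counts = defaultdict(lambda: [0, 0, 0])
    for home, away, result in matches:
        if result == "H":
            counts[(home, "home")][0] += 1
            counts[(away, "away")][2] += 1
        elif result == "D":
            counts[(home, "home")][1] += 1
            counts[(away, "away")][1] += 1
        elif result == "A":
            counts[(home, "home")][2] += 1
            counts[(away, "away")][0] += 1
        else:
            raise ValueError(f"unknown result code: {result!r}")

    # Convert raw counts into relative frequencies.
    rates = {}
    for key, (wins, draws, losses) in counts.items():
        total = wins + draws + losses
        rates[key] = (wins / total, draws / total, losses / total)
    return rates


def rate_features(home, away, rates, default=(1 / 3, 1 / 3, 1 / 3)):
    """Return six features: the home team's home rates, then the away team's away rates.

    Teams with no history fall back to a uniform default (an assumption).
    """
    return rates.get((home, "home"), default) + rates.get((away, "away"), default)


# Toy history, invented purely for illustration.
history = [
    ("A", "B", "H"),
    ("A", "C", "D"),
    ("B", "A", "A"),
    ("C", "B", "H"),
]
rates = outcome_rates(history)
features = rate_features("A", "B", rates)
```

The six values returned by `rate_features` would be appended to each match row before fitting the classifier. To keep the reported test accuracy meaningful, the rates should be estimated from the training split only, so that test-set results do not leak into the features.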