leoegidi / footbayes
An R package for many football models
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```

# footBayes

The goal of ```footBayes``` is to propose a complete workflow to:

- fit the most well-known football models: double Poisson, bivariate Poisson, Skellam, student-t, according to both maximum likelihood and Bayesian methods (+ Hamiltonian Monte Carlo engine);
- visualize the teams' abilities, the model checks, the rank-league reconstruction;
- predict out-of-sample matches.

## Installation

As an alternative to CRAN, you can safely install ```footBayes``` from GitHub with:

```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("leoegidi/footBayes")
```

## Example

What follows is a quick example that fits a Bayesian double Poisson model for the Italian Serie A (seasons 2000-2001, 2001-2002, 2002-2003), visualizes the estimated teams' abilities, and predicts the last four match days of the 2002-2003 season:

```{r example, eval = FALSE}
library(footBayes)
library(dplyr)

# dataset for Italian Serie A
data("italy")
italy <- as_tibble(italy)
italy_2000_2002 <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  filter(Season == "2000" | Season == "2001" | Season == "2002")

# double Poisson fit (predict last 4 match-days)
fit1 <- stan_foot(data = italy_2000_2002,
                  model = "double_pois",
                  predict = 36)

foot_abilities(fit1, italy_2000_2002)  # teams' abilities
pp_foot(italy_2000_2002, fit1)         # posterior predictive checks
foot_rank(italy_2000_2002, fit1)       # rank-league reconstruction
foot_prob(fit1, italy_2000_2002)       # out-of-sample posterior predictive probabilities
```

For more technical details and references, see the vignette!
Hi -- nice encapsulation of popular models. Easy to work with.
What's the most straightforward way to keep the simplicity, but add in other variables -- for example game week, or externals like weather?
Thank you for providing such a useful library. When will the new version be released? I don’t know how to add variables.
Thanks a lot!
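As a point of comparison for the covariate question above, here is a minimal sketch of how an extra match-level variable can enter a double Poisson model fitted by maximum likelihood with base R's `glm()`. This is not the footBayes API (the thread does not say whether `stan_foot` accepts covariates); the toy data, the `rain` variable, and all column names other than `hgoal` and `vgoal` are made up for illustration.

```r
# Made-up toy data: a single round-robin among six hypothetical teams.
set.seed(1)
teams <- c("A", "B", "C", "D", "E", "F")
matches <- expand.grid(home = teams, away = teams, stringsAsFactors = FALSE)
matches <- matches[matches$home != matches$away, ]
matches$rain  <- rbinom(nrow(matches), 1, 0.3)  # hypothetical weather covariate
matches$hgoal <- rpois(nrow(matches), 1.5)      # simulated home goals
matches$vgoal <- rpois(nrow(matches), 1.1)      # simulated away goals

# Long format: one row per team per match, so both scores share one model.
long <- rbind(
  data.frame(goals = matches$hgoal, team = matches$home,
             opponent = matches$away, at_home = 1, rain = matches$rain),
  data.frame(goals = matches$vgoal, team = matches$away,
             opponent = matches$home, at_home = 0, rain = matches$rain)
)

# Log-linear Poisson model: attack (team), defence (opponent),
# home advantage, plus the extra covariate.
fit <- glm(goals ~ at_home + team + opponent + rain,
           family = poisson, data = long)
coef(fit)["rain"]
```

The same idea (adding a term to the linear predictor of the log scoring rate) carries over to a Bayesian formulation, but that would require editing the model code itself.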
Hi!
Looking at bipois_lpmf, I cannot get it to match the Wikipedia equation for bivariate Poisson:
An alternative source is page 5 in this slide deck.
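For reference, the bivariate Poisson probability mass function in the usual trivariate-reduction parameterization can be written as follows, with $x$ and $y$ standing for `r[1]` and `r[2]` (the symbols $x$, $y$ are notation introduced here, not taken from the package code):

$$
P(X = x, Y = y) = e^{-(\lambda_1 + \lambda_2 + \lambda_3)}\,
\frac{\lambda_1^{x}}{x!}\,\frac{\lambda_2^{y}}{y!}
\sum_{k=0}^{\min(x,y)} \binom{x}{k}\binom{y}{k}\, k!
\left(\frac{\lambda_3}{\lambda_1 \lambda_2}\right)^{k}
$$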
Comment 1. For instance, assume `miny = 0`. Now look at the term `exp(-lambda1 - lambda2 - lambda3)` in the equation, before the summation. In the code it says

    ss = poisson_lpmf(r[1] | mu1) + poisson_lpmf(r[2] | mu2) - exp(mu3);

Should it not be `-mu3` instead of `- exp(mu3)`, since we're in log-space?
Comment 2. Also, in the sum I don't understand where the terms in

    log_s = log_s + log(r[1] - k + 1) + mus + log(r[2] - k + 1) - log(k);

come from. This seems wrong. When I take logs of the terms in the sum I get (see full code below):

    lchoose(r[1], k) + lchoose(r[2], k) + lgamma(k + 1) + k * (log(mu3) - log(mu1) - log(mu2));
Any comments on the above?
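One possible way to relate the two expressions (a side derivation, not something confirmed in this thread) is to look at the ratio of consecutive summands. Writing $x$ for `r[1]`, $y$ for `r[2]`, and $t_k$ for the $k$-th term of the sum:

$$
t_k = \binom{x}{k}\binom{y}{k}\, k! \left(\frac{\lambda_3}{\lambda_1 \lambda_2}\right)^{k},
\qquad
\frac{t_k}{t_{k-1}} = \frac{(x - k + 1)(y - k + 1)}{k} \cdot \frac{\lambda_3}{\lambda_1 \lambda_2}
$$

$$
\log t_k = \log t_{k-1} + \log(x - k + 1) + \log(y - k + 1) - \log k
+ \log \lambda_3 - \log \lambda_1 - \log \lambda_2
$$

The terms `log(r[1] - k + 1)`, `log(r[2] - k + 1)` and `- log(k)` would be consistent with such a running update, provided `mus` stands for $\log \lambda_3 - \log \lambda_1 - \log \lambda_2$. How `mus` and `mu3` are actually defined in the package is not shown here, so this is an assumption.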
Here is my attempt at an implementation:
real bipois_lpmf(int[] r, real mu1, real mu2, real correlation_coeff) {
  // r = argument to evaluate on (r[1], r[2] are the two counts)
  // correlation_coeff plays the role of lambda3 in the references below
  // https://en.wikipedia.org/wiki/Poisson_distribution#Bivariate_Poisson_distribution
  // http://www2.stat-athens.aueb.gr/~karlis/multivariate%20Poisson%20models.pdf
  real log_base_factor;
  real log_sum_factor;
  real log_factor;
  int miny;

  // Logarithm of the base factor in the bivariate Poisson distribution,
  // i.e. the term before the summation in the equation here:
  // https://en.wikipedia.org/wiki/Poisson_distribution#Bivariate_Poisson_distribution
  log_base_factor = poisson_lpmf(r[1] | mu1) + poisson_lpmf(r[2] | mu2)
                    - correlation_coeff;

  // Number of terms in the summation (excluding k = 0)
  miny = min(r[1], r[2]);

  // This initial condition works because the first term in the sum
  // (k = 0) evaluates to 1, so
  // log_sum_exp(0, log(term)) = log(1 + term)
  log_sum_factor = 0;

  if (miny > 0) {
    for (k in 1:miny) {
      // The term is: choose(r_1, k) * choose(r_2, k) * k! * (corr / (mu1 * mu2))^k
      // Here we compute the log of that term
      log_factor = lchoose(r[1], k) + lchoose(r[2], k)
                   + lgamma(k + 1)
                   + k * (log(correlation_coeff) - log(mu1) - log(mu2));

      // The log-sum-exp function is associative, since
      // log_sum_exp(log_sum_exp(a, b), c) = log_sum_exp(a, b, c)
      log_sum_factor = log_sum_exp(log_sum_factor, log_factor);
    }
  }

  return log_base_factor + log_sum_factor;
}
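A quick way to sanity-check a formula like this outside Stan is to compare it numerically with the trivariate-reduction definition, where $X = A + C$ and $Y = B + C$ with independent Poisson $A$, $B$, $C$. The R sketch below does that; the function names and the parameter values are chosen here for illustration, and it assumes `lambda3 > 0`.

```r
# Log-pmf using the closed-form sum (same structure as the Stan attempt above).
# lambda3 plays the role of `correlation_coeff`; assumes lambda3 > 0.
bipois_lpmf <- function(x, y, lambda1, lambda2, lambda3) {
  log_base <- dpois(x, lambda1, log = TRUE) +
    dpois(y, lambda2, log = TRUE) - lambda3
  k <- 0:min(x, y)
  log_terms <- lchoose(x, k) + lchoose(y, k) + lgamma(k + 1) +
    k * (log(lambda3) - log(lambda1) - log(lambda2))
  m <- max(log_terms)                      # stabilised log-sum-exp
  log_base + m + log(sum(exp(log_terms - m)))
}

# Independent check: sum over the shared component C = k.
bipois_pmf_reduction <- function(x, y, lambda1, lambda2, lambda3) {
  k <- 0:min(x, y)
  sum(dpois(x - k, lambda1) * dpois(y - k, lambda2) * dpois(k, lambda3))
}

# Compare the two at one (arbitrary) point.
c(closed_form = exp(bipois_lpmf(2, 3, 1.4, 1.1, 0.3)),
  reduction   = bipois_pmf_reduction(2, 3, 1.4, 1.1, 0.3))

# The pmf should also sum to (almost) 1 over a large enough grid.
grid <- expand.grid(x = 0:30, y = 0:30)
total <- sum(mapply(function(x, y) exp(bipois_lpmf(x, y, 1.4, 1.1, 0.3)),
                    grid$x, grid$y))
total
```

If the two values agree and the grid total is close to 1, the closed-form expression is at least internally consistent with the definition.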
Any comments on this? Am I onto something, or am I really dense or missing something obvious?
Thanks in advance for your inputs.
Hi,
thanks a lot for this amazing package. Definitely great work, and I'm looking forward to the next version!
I have two questions. First, how can I make predictions for future matches, i.e. matches that have not been played yet? Let's assume we train on the current Serie A season: is it possible to predict the next matchday? When I put in matches with NA as h_goals I get an error. I guess I could work around this by assigning a dummy result like 10-10 and excluding it using the prediction option?
Secondly, do the algorithms work with decimal inputs rather than integer goal counts? I am thinking of using expected-goals (xG) metrics, e.g. 2.32 xG for the home team and 0.37 for the away team.
Kind regards
Malte
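Two small base-R illustrations related to the questions above, offered as a sketch rather than as a description of what footBayes does internally. First, a Poisson likelihood is defined on non-negative integers, so a decimal "count" such as an xG value gets probability zero in base R. Second, once scoring rates for an unplayed match are available from any fitted model, outcome probabilities follow directly under an independent double Poisson assumption. The rates `lambda_home` and `lambda_away` below are hypothetical, not estimates from real data.

```r
# Poisson pmf at a non-integer value: base R returns 0 (and warns).
p_xg <- suppressWarnings(dpois(2.32, lambda = 1.5))
p_xg

# Hypothetical scoring rates for a match that has not been played yet.
lambda_home <- 1.6
lambda_away <- 1.1

# Joint probabilities of scorelines 0-0 up to 10-10 under independence.
# Rows index home goals, columns index away goals.
goals <- 0:10
score_probs <- outer(dpois(goals, lambda_home), dpois(goals, lambda_away))

p_home <- sum(score_probs[lower.tri(score_probs)])  # home goals > away goals
p_draw <- sum(diag(score_probs))                    # equal scores
p_away <- sum(score_probs[upper.tri(score_probs)])  # away goals > home goals
c(home = p_home, draw = p_draw, away = p_away)
```

Truncating at 10 goals leaves out a negligible amount of probability for rates of this size, so the three values sum to just under 1.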