Comments (7)
Agreed - this could be improved and I like the approach. It's something discussed previously in #15. I will take a look at adding it. Very nice example too!
from engsoccerdata.
That's great; nice Tableau visualisation referenced in the other issue too.
Was scratching my head for too long wondering why Leicester have only 30 games played in the above table before realising it's the inconsistency in team names between england and the new data -- back to that issue! How about using partial string matching to add new team names to the teamnames df, then manually reviewing to check accuracy? Might be less work?
agrep() seems to work pretty nicely for partial string matching; much better than pmatch() anyway. e.g.:
> unique(teamnames[agrep("Tottenham", teamnames$name),]$name)
[1] "Tottenham Hotspur"
> unique(teamnames[agrep("Leicester", teamnames$name),]$name)
[1] "Leicester City"
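The review workflow suggested above could look roughly like this. It is only a sketch: the teamnames data frame here is a toy stand-in for the one in the package, and suggest_names() is a hypothetical helper, not a package function.

```r
# Toy stand-in for the package's teamnames data frame (assumption, for illustration).
teamnames <- data.frame(
  name = c("Tottenham Hotspur", "Leicester City",
           "Manchester City", "Manchester United"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect approximate matches for each new name
# so that they can be reviewed by hand before being added.
suggest_names <- function(new_names, known = unique(teamnames$name)) {
  candidates <- lapply(new_names, function(x) agrep(x, known, value = TRUE))
  names(candidates) <- new_names
  candidates
}

suggestions <- suggest_names(c("Tottenham", "Leicester", "Manchester"))

# Names with exactly one candidate only need a quick look;
# anything else (zero or several candidates) needs a manual decision.
ambiguous <- names(suggestions)[lengths(suggestions) != 1]
```

Keeping the ambiguous names separate is the point of the manual step: a short name such as "Manchester" matches more than one club, so it should never be filled in automatically.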
I noticed a few things - will try and update in the next few days.
- Agree regarding 'date'. There was a historical reason why I kept it as a character. It definitely should not be a factor. In all dataframes it should be of class Date. I'll correct that.
- With the england_current() function in the most recent version of the package, I think that all team names are correct. For instance, it returns "Leicester City" correctly. I had thought about partial string matching, but I can see that it might lead to undesired errors/side-effects. It's better to try and get a definitive list of names.
- I'll add a 'since date' option to the maketable and maketable_eng functions. I'll also make season and tier optional. The annoying thing about these data is all the strange rules in different countries regarding points for wins and tiebreak procedures (not to mention point deductions). To get it all 100% straight is painful and has to be hard-coded.
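For the date point, a minimal sketch of what the conversion and a 'since date' filter might look like. The results data frame is toy data and since() is a hypothetical helper; neither comes from the package.

```r
# Toy results data; column names mirror those used in this thread (assumption).
results <- data.frame(
  Date = factor(c("2016-05-15", "2016-08-13", "2016-08-20")),
  home = c("Leicester City", "Hull City", "Leicester City"),
  stringsAsFactors = FALSE
)

# Go through as.character() so the conversion is explicit when the
# column is stored as a factor, then parse with a fixed format.
results$Date <- as.Date(as.character(results$Date), format = "%Y-%m-%d")

# Hypothetical 'since date' filter: keep fixtures on or after `begin`.
since <- function(df, begin) df[df$Date >= as.Date(begin), ]

recent <- since(results, "2016-08-01")
```

Once the column is of class Date, comparisons such as >= behave chronologically, which is not guaranteed for character or factor columns in other date formats.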
Adding issue comment here: rename maketable_eng() to england_maketable() to be syntactically related to england_current() - easier to remember.
Additional enhancement: option to make separate tables for home fixtures only and away fixtures only.
Including this feature, the optional Season and tier arguments discussed above, and the pts_deductions flag discussed here, my Frankensteined maketable() function now looks like this. Seems to work for me!
maketable <- function(df = NULL, Season = NULL, tier = NULL,
                      type = c("both", "home", "away"),
                      pts_deductions = FALSE, pts = 3) {
  # declare column names used inside dplyr verbs (avoids R CMD check notes)
  GA <- GF <- ga <- gf <- gd <- GD <- D <- L <- W <- Pts <- NULL
  home <- team <- visitor <- hgoal <- opp <- vgoal <- NULL
  `%>%` <- dplyr::`%>%`
  type <- match.arg(type)

  # subset by season and tier, if applicable
  dfx <- df
  if (!is.null(Season)) dfx <- dfx[dfx$Season == Season, ]
  if (!is.null(tier)) dfx <- dfx[dfx$tier == tier, ]

  # subset only home or away fixtures, if applicable
  home_rows <- dplyr::select(dfx, team = home, opp = visitor, GF = hgoal, GA = vgoal)
  away_rows <- dplyr::select(dfx, team = visitor, opp = home, GF = vgoal, GA = hgoal)
  temp <- switch(type,
                 home = home_rows,
                 away = away_rows,
                 both = rbind(home_rows, away_rows))

  # make table
  table <- temp %>%
    dplyr::mutate(GD = GF - GA) %>%
    dplyr::group_by(team) %>%
    dplyr::summarise(GP = dplyr::n(),
                     W = sum(GD > 0),
                     D = sum(GD == 0),
                     L = sum(GD < 0),
                     gf = sum(GF),
                     ga = sum(GA),
                     gd = sum(GD)) %>%
    dplyr::mutate(Pts = (W * pts) + D) %>%
    dplyr::arrange(-Pts, -gd, -gf) %>%
    as.data.frame()

  # apply points deductions, if applicable; uses the package's deductions data,
  # matched by team name so the order of the penalty rows does not matter
  if (pts_deductions && !is.null(Season)) {
    penalty <- deductions[deductions$Season %in% Season & deductions$team %in% table$team, ]
    idx <- match(penalty$team, table$team)
    table$Pts[idx] <- table$Pts[idx] - penalty$points_deducted
    table <- dplyr::arrange(table, -Pts, -gd, -gf)
  }

  # number positions last, so they reflect any deductions
  table$Pos <- seq_len(nrow(table))
  return(table)
}
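The deductions step relies on each penalty being subtracted from the right team, so it is worth lining the two data frames up by team name rather than by row position. A small sketch with made-up values (the column names team, Season and points_deducted are those used in the function above; the numbers are not real results):

```r
# Toy league table and deductions data (illustrative values only).
tab <- data.frame(team = c("A", "B", "C"), Pts = c(50, 48, 47),
                  gd = c(10, 5, 7), gf = c(40, 35, 38),
                  stringsAsFactors = FALSE)
deductions <- data.frame(team = c("C", "A"), Season = c(2010, 2010),
                         points_deducted = c(1, 4),
                         stringsAsFactors = FALSE)

# Penalties for this season, in whatever order they are stored.
penalty <- deductions[deductions$Season == 2010 & deductions$team %in% tab$team, ]

# match() gives the table row of each penalised team, so each
# deduction lands on the correct team regardless of row order.
idx <- match(penalty$team, tab$team)
tab$Pts[idx] <- tab$Pts[idx] - penalty$points_deducted

# Re-sort and renumber positions after the deduction.
tab <- tab[order(-tab$Pts, -tab$gd, -tab$gf), ]
tab$Pos <- seq_len(nrow(tab))
```

This sketch assumes at most one deduction per team per season; several deductions for the same team would need to be summed first.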
OK I have implemented this. I hope the solution I have come to makes sense.
It gets hard to have one maketable function that does all things in all circumstances. I have kept maketable_eng as requiring a Season and tier argument. This might need to be amended in the future but is ok for now, as that function can handle points deductions and can also discriminate league tables based on the different tie-breaker procedures in different seasons/tiers.
The maketable_eng function depends on the original maketable function, so I left that alone.
Therefore, to make a maketable function that does not depend on a Season or tier argument and can make tables by defined dates, I created maketable_all. I would suggest using this instead of maketable going forward. I may even make maketable non-exportable.
Whilst I still think it's best to pre-subset your dataframe before using the maketable_all function, you can now:
- add begin date and end date arguments
- choose home, away or both results
Lastly, I haven't renamed maketable_eng: as it appears the package will have a family of maketable functions, I've kept the name.
Examples:
#EPL historical table
maketable_all(df=england[england$tier==1,],begin="1992-08-15", end="2017-07-01")
#EPL historical table away results
maketable_all(df=england[england$tier==1,],begin="1992-08-15", end="2017-07-01", type="away")
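To make the idea behind the begin/end and type arguments concrete, here is a rough base R sketch of an away-only table restricted to a date window. It is not the package's maketable_all() implementation; away_table() is a hypothetical helper and the fixtures are toy data using the same column names as england.

```r
# Toy fixtures; column names follow the england data frame (assumption).
games <- data.frame(
  Date = as.Date(c("1992-08-15", "1992-08-19", "1993-01-09", "2018-01-01")),
  home = c("A", "B", "A", "B"),
  visitor = c("B", "A", "B", "A"),
  hgoal = c(2, 1, 0, 3),
  vgoal = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: away results only, between begin and end (inclusive).
away_table <- function(df, begin, end, pts = 3) {
  df <- df[df$Date >= as.Date(begin) & df$Date <= as.Date(end), ]
  gd <- df$vgoal - df$hgoal                      # goal difference for the away side
  per_game <- data.frame(team = df$visitor, GP = 1,
                         W = gd > 0, D = gd == 0, L = gd < 0)
  out <- aggregate(cbind(GP, W, D, L) ~ team, data = per_game, FUN = sum)
  out$Pts <- out$W * pts + out$D
  out[order(-out$Pts), ]
}

away <- away_table(games, begin = "1992-08-15", end = "2017-07-01")
```

The 2018 fixture falls outside the window and is dropped before any counting, which is the same effect as pre-subsetting the data frame by date.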