
Comments (7)

jalapic commented on May 24, 2024

Agreed - this could be improved and I like the approach. It's something discussed previously in #15. I will take a look at adding it. Very nice example too!

from engsoccerdata.

JoGall commented on May 24, 2024

That's great; nice Tableau visualisation referenced in the other issue too.

Was scratching my head for too long wondering why Leicester have only 30 games played in the above table before realising it's the inconsistency in team names between england and the new data -- back to that issue! How about using partial string matching to add new team names to the teamnames df, then manually reviewing to check accuracy? Might be less work?


JoGall commented on May 24, 2024

agrep() seems to work pretty nicely for partial string matching; much better than pmatch() anyway. e.g.:

> unique(teamnames[agrep("Tottenham", teamnames$name),]$name)
[1] "Tottenham Hotspur"

> unique(teamnames[agrep("Leicester", teamnames$name),]$name)
[1] "Leicester City"
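
One way to combine partial matching with a manual review step is to list every candidate for each new name and flag the ones that are missing or ambiguous. This is only a sketch: `known_names`, `new_names` and `suggest_matches()` are made-up names for illustration and are not part of engsoccerdata, and the name lists are toy data.

```r
# Toy data: names already in the lookup table and names seen in a new source.
known_names <- c("Leicester City", "Tottenham Hotspur", "Bristol City",
                 "Bristol Rovers", "Manchester United")
new_names <- c("Leicester", "Tottenham", "Bristol", "Man Utd")

# Hypothetical helper: collect agrep() candidates for each new name and label
# the result, so that only single hits are accepted without a closer look.
suggest_matches <- function(new_names, known_names) {
  rows <- lapply(new_names, function(nm) {
    hits <- agrep(nm, known_names, value = TRUE)
    status <- if (length(hits) == 1) {
      "unique"
    } else if (length(hits) == 0) {
      "none"
    } else {
      "ambiguous"
    }
    data.frame(new_name = nm,
               candidates = paste(hits, collapse = "; "),
               status = status,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

review <- suggest_matches(new_names, known_names)
print(review)
```

Rows with status "none" or "ambiguous" (here "Man Utd" and "Bristol") are the ones that would need checking by hand.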


jalapic commented on May 24, 2024

I noticed a few things - will try and update in the next few days.

  • agree regarding 'date'. I kept it as a character for historical reasons. It definitely should not be a factor. In all dataframes it should be of class Date. I'll correct that.

  • with the england_current() function in the most recent version of the package, I think that all teamnames are correct. For instance, it correctly returns "Leicester City". I had thought about partial string matching, but I can see that it might lead to undesired errors/side-effects. It's better to try and get a definitive list of names.

  • I'll add a 'since date' option to the maketable and maketable_eng functions. I'll also make season and tier optional. The annoying thing about these data is all the strange rules in different countries regarding points for wins and tiebreak procedures (not to mention points deductions). Getting it all 100% straight is painful and has to be hard-coded.
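
As a small illustration of the 'date' point, the sketch below converts a date column stored as a factor to class Date and then keeps only fixtures on or after a cut-off, which is roughly what a 'since date' option would need to do. The data frame `fixtures` and its rows are toy data, and the YYYY-MM-DD format is an assumption.

```r
# Toy fixtures; stringsAsFactors = TRUE mimics a date column stored as a factor.
fixtures <- data.frame(
  Date    = c("2016-08-13", "2016-12-26", "2017-05-21"),
  home    = c("Team A", "Team B", "Team C"),
  visitor = c("Team B", "Team C", "Team A"),
  stringsAsFactors = TRUE
)

# Go through as.character() so the conversion is explicit for factor and
# character input alike.
fixtures$Date <- as.Date(as.character(fixtures$Date), format = "%Y-%m-%d")
class(fixtures$Date)

# With a proper Date column, a 'since' filter is a plain comparison.
since  <- as.Date("2016-12-01")
recent <- fixtures[fixtures$Date >= since, ]
nrow(recent)
```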


jalapic commented on May 24, 2024

adding issue comment here:

rename maketable_eng() to england_maketable() to be syntactically related to england_current() - easier to remember.


JoGall commented on May 24, 2024

Additional enhancement: option to make separate tables for home fixtures only and away fixtures only.

Including this feature, the optional Season and tier arguments discussed above, and the pts_deductions flag discussed here, my Frankensteined maketable() function now looks like this. Seems to work for me!

maketable <- function(df=NULL, Season=NULL, tier=NULL, type = c("both", "home", "away"), pts_deductions = FALSE, pts=3) {

  #dummy bindings so that R CMD check does not flag the column names used below
  GA<-GF<-ga<-gf<-gd<-GD<-D<-L<-W<-Pts<-.<-Date<-home<-team<-visitor<-hgoal<-opp<-vgoal<-goaldif <-FT<-division<-result<-maxgoal<-mingoal<-absgoaldif<-NULL

  #make the pipe available even if dplyr is not attached
  `%>%` <- dplyr::`%>%`

  #resolve type once; an invalid value raises an error here
  type <- match.arg(type)

  #subset by season and tier, if applicable
  dfx <- df
  if(!is.null(Season)) dfx <- dfx[dfx$Season == Season, ]
  if(!is.null(tier)) dfx <- dfx[dfx$tier == tier, ]

  #one row per team per fixture, seen from the home side and from the away side
  home_rows <- dplyr::select(dfx, team=home, opp=visitor, GF=hgoal, GA=vgoal)
  away_rows <- dplyr::select(dfx, team=visitor, opp=home, GF=vgoal, GA=hgoal)

  #keep only home or away fixtures, if applicable
  temp <- switch(type,
    home = home_rows,
    away = away_rows,
    both = rbind(home_rows, away_rows)
  )

  #make table
  table <- temp %>%
    dplyr::mutate(GD = GF-GA) %>%
    dplyr::group_by(team) %>%
    dplyr::summarise(GP = dplyr::n(),   #games played = rows per team
              W = sum(GD>0),
              D = sum(GD==0),
              L = sum(GD<0),
              gf = sum(GF),
              ga = sum(GA),
              gd = sum(GD)
    ) %>%
    dplyr::mutate(Pts = (W*pts) + D) %>%
    as.data.frame()

  #apply points deductions, if applicable
  #(deductions is the package dataset; assumes at most one row per team per season)
  if(pts_deductions && !is.null(Season)) {
    penalty <- deductions[deductions$team %in% table$team & deductions$Season %in% Season, ]
    if(nrow(penalty) > 0) {
      #align by team name so each deduction hits the right row
      idx <- match(penalty$team, table$team)
      table$Pts[idx] <- table$Pts[idx] - penalty$points_deducted
    }
  }

  #sort and number positions last, so Pos reflects any deductions
  table <- dplyr::arrange(table, -Pts, -gd, -gf)
  table$Pos <- seq_len(nrow(table))

  return(table)

}
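
Two details of the table logic are easy to get wrong: the points total depends on the `pts` value for a win, and deductions have to be matched to table rows by team name rather than by row order. Below is a base-R sketch of both, using made-up teams and a made-up `toy_deductions` data frame. The column names `team` and `points_deducted` follow the function above; `points_for()` and everything else is hypothetical.

```r
# Toy standings before points are computed.
standings <- data.frame(team = c("Team A", "Team B", "Team C"),
                        W = c(10, 9, 3),
                        D = c(2, 4, 5),
                        stringsAsFactors = FALSE)

# Hypothetical helper: points under a given points-per-win rule.
points_for <- function(W, D, pts = 3) W * pts + D

standings$Pts <- points_for(standings$W, standings$D, pts = 3)
pts_2 <- points_for(standings$W, standings$D, pts = 2)  # older 2-points rule

# Toy deductions, deliberately listed in a different order from the table.
toy_deductions <- data.frame(team = c("Team C", "Team A"),
                             points_deducted = c(3, 9),
                             stringsAsFactors = FALSE)

# match() returns the table row for each deducted team, so the subtraction
# does not depend on the two data frames sharing an order.
idx <- match(toy_deductions$team, standings$team)
standings$Pts[idx] <- standings$Pts[idx] - toy_deductions$points_deducted

# Re-sort and number positions only after the deductions are applied.
standings <- standings[order(-standings$Pts), ]
standings$Pos <- seq_len(nrow(standings))
print(standings)
```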


jalapic commented on May 24, 2024

OK I have implemented this. I hope the solution I have come to makes sense.

It gets hard to have one maketable function that does all things in all circumstances. I have kept maketable_eng as requiring a Season and tier argument. This might need to be amended in the future but is ok for now as that function can handle points deductions and can also discriminate league tables based on the different tie-breaker procedures in different seasons/tiers.

The maketable_eng function depends on the original maketable function, so I left that alone.

Therefore, to have a maketable function that does not depend on a Season or tier argument and can make tables for defined date ranges, I created maketable_all. I would suggest using this instead of maketable going forward. I may even make maketable non-exportable.

Whilst I still think it's best to pre-subset your dataframe before using the maketable_all function, you can now:

  • add begin date and end date arguments
  • choose home, away or both results

Lastly, I haven't renamed maketable_eng as it appears the package will have a family of maketable functions - I've kept the name.

Examples:
#EPL historical table
maketable_all(df=england[england$tier==1,],begin="1992-08-15", end="2017-07-01")

#EPL historical table away results
maketable_all(df=england[england$tier==1,],begin="1992-08-15", end="2017-07-01", type="away")
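
The pre-subsetting suggested above can be done in base R before the data frame is passed on. The sketch below uses a toy data frame with the column names that appear in the function earlier in this thread (`Date`, `tier`, `home`, `visitor`, `hgoal`, `vgoal`); the rows are invented, `Date` is assumed to be of class Date already, and the inclusive date bounds are a choice made for this example rather than a statement about how maketable_all treats them.

```r
# Toy results with england-style column names.
toy <- data.frame(
  Date    = as.Date(c("1991-09-07", "1995-03-04", "2010-10-16", "2012-01-02")),
  tier    = c(1, 1, 2, 1),
  home    = c("Team A", "Team B", "Team A", "Team C"),
  visitor = c("Team B", "Team C", "Team C", "Team A"),
  hgoal   = c(1, 0, 2, 3),
  vgoal   = c(1, 2, 0, 1),
  stringsAsFactors = FALSE
)

begin <- as.Date("1992-08-15")
end   <- as.Date("2017-07-01")

# Keep top-tier fixtures inside the date window (both bounds inclusive here).
top_tier_window <- toy[toy$tier == 1 & toy$Date >= begin & toy$Date <= end, ]

# Away-only view of the same rows: the visitor becomes the team of interest.
away <- data.frame(team = top_tier_window$visitor,
                   GF   = top_tier_window$vgoal,
                   GA   = top_tier_window$hgoal,
                   stringsAsFactors = FALSE)
print(away)
```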

