mlbgameday's Issues

Data download Error.

I'm getting an error message from my get_payload() call.

innings_df <- get_payload(start = "2017-04-03", end = "2017-04-04")
Gathering Gameday data, please be patient...
Error: by can't contain join column batter which is missing from LHS

Linescore Dataset error

I'm getting the error below.

innings_df <- get_payload(start = "2017-01-01", end = "2018-01-01", dataset = "linescore", db_con = con)
Gathering Gameday data, please be patient...
Processing data chunk 1 of 7
Processing data chunk 2 of 7
Error: Column name mismatch.

Subscript errors

Getting a "subscript out of bounds" error on the following line of code. This is probably due to problems with the make_gids() function. A fix in the next patch release should be a priority.

innings_df <- get_payload(start = "2017-09-21", end = Sys.Date()-1)

2019 data download fails

I tried to download the 2019 data, but the download failed.
Every other year works well (I tested 2013 to 2018).

This is my code and the error message.

> con2019 <- DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = "mlb_2019",
+                       host = "172.28.1.2", port = 5432,
+                       user = "postgres", password = "postgres")

> get_payload(start = "2019-04-01", end = "2019-05-01", async = TRUE, db_con=con2019)

Gathering Gameday data, please be patient...
Processing data chunk 1 of 2
Error: `by` can't contain join column `tfs_zulu`, `inning`, `inning_side`, `des` which is missing from LHS
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/rlang_error>
`by` can't contain join column `tfs_zulu`, `inning`, `inning_side`, `des` which is missing from LHS
Backtrace:
 1. mlbgameday::get_payload(...)
 2. mlbgameday::payload.gd_inning_all(urlz)
 4. dplyr:::left_join.tbl_df(...)
 6. dplyr:::common_by.character(by, x, y)
 7. dplyr:::common_by.list(by, x, y)
 8. dplyr:::bad_args(...)
 9. dplyr:::glubort(fmt_args(args), ..., .envir = .envir)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/rlang_error>
`by` can't contain join column `tfs_zulu`, `inning`, `inning_side`, `des` which is missing from LHS
Backtrace:
    █
 1. └─mlbgameday::get_payload(...)
 2.   └─mlbgameday::payload.gd_inning_all(urlz)
 3.     ├─dplyr::left_join(...)
 4.     └─dplyr:::left_join.tbl_df(...)
 5.       ├─dplyr::common_by(by, x, y)
 6.       └─dplyr:::common_by.character(by, x, y)
 7.         └─dplyr:::common_by.list(by, x, y)
 8.           └─dplyr:::bad_args(...)
 9.             └─dplyr:::glubort(fmt_args(args), ..., .envir = .envir)

Appending database and obtaining wins

Kris, what commands do I need to use to append to my 2017 database so that it also includes 2016 (and other historical data)?

Would I need to separately scrape the linescore dataset in order to merge wins and saves to the play by play data?

BIS_BOXSCORE Issue

The "gameday_link" returned when using the BIS_BOXSCORE dataset seems to be cut off.

A typical link/id returned using other datasets has the following format:
gid_X_X_X_Xmlb_Xmlb_X

The gameday_link returned is missing the last 5 characters of the normal format. This is a problem for double-headers, where the second game's ID ends with _2.

Double-Header Example:
gid_2018_06_19_lanmlb_chnmlb_1
gid_2018_06_19_lanmlb_chnmlb_2
Link in BIS_BOXSCORE: gid_2018_06_19_lanmlb_chn

There's no way to differentiate the two games on this date.
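A quick way to flag truncated links in a result set is to test them against the full format. This is only a minimal sketch in base R: `is_full_gid()` is a hypothetical helper, not part of mlbgameday, and the pattern assumes the MLB-level format described above (`gid_<year>_<month>_<day>_<away>mlb_<home>mlb_<game number>`). The `_2` link is an assumed ID for the second game of the double-header.

```r
# Hypothetical helper (not part of mlbgameday): TRUE when a gameday_link
# matches the full gid_YYYY_MM_DD_<away>mlb_<home>mlb_<game number> pattern.
is_full_gid <- function(x) {
  pattern <- "^gid_[0-9]{4}_[0-9]{2}_[0-9]{2}_[a-z0-9]+mlb_[a-z0-9]+mlb_[0-9]+$"
  grepl(pattern, x)
}

links <- c(
  "gid_2018_06_19_lanmlb_chnmlb_1",  # full link, game 1
  "gid_2018_06_19_lanmlb_chnmlb_2",  # full link, game 2 (assumed ID)
  "gid_2018_06_19_lanmlb_chn"        # truncated link as seen in BIS_BOXSCORE
)

# Flags the first two links as complete and the third as truncated.
is_full_gid(links)
```

Rows where the check returns FALSE cannot be matched to a single game by gameday_link alone.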

Minor league data

When I pass league = "aaa" to get_payload(), only MLB data is pulled. The syntax I use is below:
events <- get_payload(start = "2018-04-01", end = "2018-04-07", league = "aaa")

Is there a different way to get just AAA data? Is it possible that get_payload() needs to be modified to access the gdx server?

No Team ID in get_payload function

I can't find any information about the teams playing in the output of get_payload(). It would be nice to get some ID showing which player is on which team.

Updated GIDs / Data refresh

Hi, thanks for pulling this together.
Just curious, do you have code that pulls the updated gids? It looks like you load them into a data file. The 2018 data (shells) are out on the MLB Gameday site, but neither pitchRx nor your code has the game listing pulled in. I was wondering if you had that before I build code to pull them in myself. Thanks!

error using get_payload

Using get_payload(start = "2019-07-01", end = "2019-07-10") I'm getting the following error:
Error: by can't contain join column tfs_zulu, inning, inning_side, des which is missing from LHS

Any idea how to solve this issue?
Regards

2019 season get_payload() Error: Column `on_1b` must be length 4055 (the number of rows) or one, not 0

Expected Behavior

Expected to get a data frame of pitch data for 2019-03-28
http://gd2.mlb.com/components/game/mlb/year_2019/month_03/day_28

df = get_payload(start = '2019-03-28', end = '2019-03-28')

Games listed here
https://www.mlb.com/scores/2019-03-28

Current Behavior

The download fails with an error message:

Gathering Gameday data, please be patient...
Error: Column `on_1b` must be length 4055 (the number of rows) or one, not 0
In addition: Warning messages:
1: NAs introduced by coercion 
2: NAs introduced by coercion 
3: NAs introduced by coercion 
4: NAs introduced by coercion 
5: NAs introduced by coercion 
6: NAs introduced by coercion 

However, the download succeeds when the date is on or before 2018-10-28:

df = get_payload(start = '2018-10-28', end = '2018-10-28')
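To narrow down which dates fail, one option is to request each date separately and record whether the call raised an error. The sketch below is only an illustration: `probe_dates()` and `fake_fetch()` are hypothetical names, and `fake_fetch()` is a stand-in that mimics the behaviour reported here without touching the network. With mlbgameday installed, the fetcher could instead be `function(d) mlbgameday::get_payload(start = d, end = d)`.

```r
# Hypothetical helper: call `fetch` once per date and record whether it errored.
# `fetch` is any function that takes a single date string.
probe_dates <- function(dates, fetch) {
  ok <- vapply(dates, function(d) {
    tryCatch({
      fetch(d)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1), USE.NAMES = FALSE)
  data.frame(date = dates, ok = ok, stringsAsFactors = FALSE)
}

# Stand-in fetcher (assumption): fails for any date after 2018-10-28,
# imitating the reported behaviour without downloading anything.
fake_fetch <- function(d) {
  if (as.Date(d) > as.Date("2018-10-28")) stop("simulated download failure")
  invisible(NULL)
}

result <- probe_dates(c("2018-10-28", "2019-03-28"), fake_fetch)
print(result)
```

The resulting data frame has one row per date, which makes it easy to see where the failures start.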

Attempted Solution

Tried reinstalling the latest development version from GitHub.
This did not solve the issue.

devtools::install_github("keberwein/mlbgameday", force = TRUE)

Context

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] doParallel_1.0.14 iterators_1.0.10  foreach_1.4.4     mlbgameday_0.1.4  jsonlite_1.5     
 [6] stringi_1.4.3     RSQLite_2.1.1     DBI_1.0.0         dbplyr_1.2.2      dplyr_0.8.0.1    
[11] config_0.3       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1       rstudioapi_0.8   xml2_1.2.0       magrittr_1.5     tidyselect_0.2.5
 [6] bit_1.1-14       R6_2.4.0         rlang_0.3.1      stringr_1.4.0    blob_1.1.1      
[11] tools_3.5.1      yaml_2.2.0       bit64_0.9-7      assertthat_0.2.0 digest_0.6.17   
[16] tibble_2.1.1     crayon_1.3.4     tidyr_0.8.3      purrr_0.3.2      codetools_0.2-15
[21] curl_3.2         memoise_1.1.0    glue_1.3.1       compiler_3.5.1   pillar_1.3.1    
[26] pkgconfig_2.0.2 

Table names

Need to line up table names with those output by pitchRx, in case users want to append to an existing database.

Umpire IDs

Hi @keberwein, in which table are the umpire IDs stored? In other words, is the home plate umpire who made the call in df$pitch$des not available in the data? I understand you have a script for updating umpire IDs, but I do not see the umpire ID in any of the tables returned by get_payload().

Hope it's clear what I'm asking.

Get_Payload

There appears to be an issue pulling data from 2014 and earlier. Below are two examples of the error messages returned when trying to pull data.

Events_14 <- get_payload(start = "2014-04-04", end = "2014-04-05", dataset = "inning_all")
Gathering Gameday data, please be patient...
Error: tfs_zulu = NULL must be a column name or position, not NULL

Events_10 <- get_payload(start = "2010-04-04", end = "2010-04-05", dataset = "inning_all")
Gathering Gameday data, please be patient...
Error: tfs_zulu = NULL must be a column name or position, not NULL
