
Comments (6)

jaebradley commented on May 31, 2024

@wsecheng

"But I don't think the current web scraper pulls that row if I am correct?"

Yeah, that's correct - there's no API for this yet

"if you included a date field you could keep the existing player box score data granularity, but also easily aggregate up to the team level for that date/game if you wanted, without having to introduce a separate scraper for the team-level box score"

This definitely makes sense. However, if somebody wants stats aggregated at the team level, it probably makes even more sense to get that data straight from the relevant web page rather than doing the aggregation on the fly.

Maybe an API method team_box_scores(day, month, year)?

Finally, I want to reiterate that I don't consider adding a date field to the player_box_scores output a suitable solution. I think it adds superfluous data to the results and that any performance concerns that might arise from the data set you've specified are hopefully mostly alleviated by the addition of a team_box_scores method. Apologies if this is an unsatisfactory answer.

from basketball_reference_web_scraper.

jaebradley commented on May 31, 2024

@wsecheng thanks for opening this issue! (and I'm glad you're using the library)

However, correct me if I'm wrong, but don't you look up player box scores by a date? Seems like you might already have this information on hand for any post-fetch processing? (Maybe I'm missing something?)
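Since the caller already knows the date of each request, the date can be attached on the caller's side after each fetch. A minimal sketch of that idea follows. The `fetch` parameter is a hypothetical stand-in for whatever call returns one dict per player for a given day, month, and year; the `"date"` key is an assumption, not part of the library's output.

```python
from datetime import date, timedelta
from typing import Callable, Iterable

# Hypothetical fetcher type: any callable that accepts day, month, and year
# keyword arguments and returns one dict per player. Pass in the call you
# actually use; nothing here depends on a specific library.
Fetch = Callable[..., Iterable[dict]]


def box_scores_with_dates(fetch: Fetch, start: date, end: date) -> list[dict]:
    """Fetch every day in [start, end] and tag each row with its date."""
    rows = []
    current = start
    while current <= end:
        for row in fetch(day=current.day, month=current.month, year=current.year):
            # Copy the row so the fetcher's own data is left untouched.
            rows.append({**row, "date": current.isoformat()})
        current += timedelta(days=1)
    return rows
```

This keeps the library's output unchanged while still giving every row a date for later grouping.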


wsecheng commented on May 31, 2024

Yeah, you definitely would have this information. For a couple of games it's not too much work, but the particular use case I was thinking of was scraping multiple years (say 500+ days) of box scores. My post-processing right now involves iterating over each date's CSV, adding a date column, and then appending the CSVs together. Perhaps there's a better way of approaching this?
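For reference, the post-processing described above might look roughly like the sketch below. It assumes, hypothetically, that each day's file is named `YYYY-MM-DD.csv` and that all files share the same header row; neither detail comes from the library itself.

```python
import csv
from pathlib import Path


def merge_daily_csvs(directory: str, output_path: str) -> int:
    """Merge files named YYYY-MM-DD.csv into one CSV with a leading date column.

    Assumes every file shares the same header. Returns the number of data rows written.
    """
    written = 0
    header_written = False
    output = Path(output_path).resolve()
    with open(output, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in sorted(Path(directory).glob("*.csv")):
            if path.resolve() == output:
                continue  # never read the file we are writing
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(["date", *header])
                    header_written = True
                for row in reader:
                    # The file stem (name without .csv) is taken as the date.
                    writer.writerow([path.stem, *row])
                    written += 1
    return written
```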


jaebradley commented on May 31, 2024

@wsecheng a couple thoughts

  1. If what you really need is team-level box scores then maybe there's a solution that uses the table format in https://www.basketball-reference.com/boxscores/?month=01&day=01&year=2017 to get the actual team totals for each game on that date (like from this game: https://www.basketball-reference.com/boxscores/201701010ATL.html). This should cut down on the number of rows needed to be processed?
  2. In general, I'm hesitant to add a field for date, as it feels like this information already needs to be known in order to make the request in the first place.
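The daily box score page linked in point 1 takes the date as query parameters, so the URL for any day can be built from a date object. This is only a sketch of the URL pattern visible in that link; the helper name `daily_box_scores_url` is made up for illustration and is not part of the library.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "https://www.basketball-reference.com/boxscores/"


def daily_box_scores_url(game_date: date) -> str:
    """Build the daily box score page URL, zero-padding month and day as in the link above."""
    query = urlencode({
        "month": f"{game_date.month:02d}",
        "day": f"{game_date.day:02d}",
        "year": str(game_date.year),
    })
    return f"{BASE_URL}?{query}"
```

For example, `daily_box_scores_url(date(2017, 1, 1))` reproduces the first URL in point 1.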


wsecheng commented on May 31, 2024

Right, so the 'Team Totals' row is what I'm looking for. But I don't think the current web scraper pulls that row if I am correct? My thinking is that if you included a date field you could keep the existing player box score data granularity, but also easily aggregate up to the team level for that date/game if you wanted, without having to introduce a separate scraper for the team-level box score.
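To illustrate the aggregation being proposed here, a rough sketch is shown below. It assumes player rows are dicts that already carry `"date"` and `"team"` keys plus numeric stat fields; all of these field names are hypothetical and may not match the scraper's actual output.

```python
from collections import defaultdict


def team_totals(rows, stat_fields=("points", "assists")):
    """Sum the given stats per (date, team) from date-tagged player rows."""
    totals = defaultdict(lambda: dict.fromkeys(stat_fields, 0))
    for row in rows:
        key = (row["date"], row["team"])
        for field in stat_fields:
            totals[key][field] += row[field]
    return dict(totals)
```

Note that sums computed this way are only as complete as the player rows; reading the 'Team Totals' row from the page avoids that dependency.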


jaebradley commented on May 31, 2024

@wsecheng I recently published 4.1.0, which should include the aforementioned team_box_scores API method.

Let me know if you run into issues or if it doesn't suit your needs.

If everything looks 👍 feel free to close this issue!

