
jaebradley / basketball_reference_web_scraper


NBA Stats API via Basketball Reference

Home Page: http://jaebradley.github.io/basketball_reference_web_scraper/

License: MIT License

Python 89.14% HTML 10.18% Shell 0.07% PowerShell 0.05% C 0.48% Cython 0.02% JavaScript 0.02% Batchfile 0.01% Nushell 0.01%
basketball-reference python nba web-scraping web-scraper

basketball_reference_web_scraper's Introduction

Basketball Reference is a great site (especially for a basketball stats nut like me), and hopefully they don't get too pissed off at me for creating this.

I initially wrote this library as an exercise for creating my first PyPi package - hope you find it valuable!

Documentation

For documentation about installing the package and API methods see the documentation page.

Contributors

Thanks to @DaiJunyan, @ecallahan5, @Yotamho, @ntsirakis, @allanbelliti, @krlu, and @aaronbannin for their contributions!

basketball_reference_web_scraper's People

Contributors

aaronbannin, allanbelliti, daijunyan, deepyaman, dependabot[bot], ecallahan5, jaebradley, krlu, ntsirakis, yotamho

basketball_reference_web_scraper's Issues

It looks like there are some other positions that are not accounted for:

Traceback (most recent call last):
  File "compare.py", line 40, in <module>
    main()
  File "compare.py", line 37, in main
    team1_players = get_values(team_1)
  File "compare.py", line 11, in get_values
    players = client.players_season_totals(YEAR_END)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/client.py", line 35, in players_season_totals
    values = http_client.players_season_totals(season_end_year)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/http_client.py", line 66, in players_season_totals
    return parse_players_season_totals(response.content)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/parsers/players_season_totals.py", line 43, in parse_players_season_totals
    totals.append(parse_player_season_totals(row))
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/parsers/players_season_totals.py", line 9, in parse_player_season_totals
    "position": POSITION_ABBREVIATIONS_TO_POSITION[row[2].text_content()],
KeyError: 'F'

I presume there are other combinations, such as "F-C" and "G", as well, but I have yet to confirm this.
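
One way to tolerate generic or combined position labels is to split the label on hyphens and look up each part separately, skipping anything unknown instead of raising. The following is a minimal sketch under that assumption; the `Position` enum, the `POSITION_ABBREVIATIONS` mapping shown here, and `parse_positions` are hypothetical names for illustration and are not the library's actual code.

```python
from enum import Enum
from typing import List


class Position(Enum):
    # Hypothetical enum for illustration; the library's own type may differ.
    POINT_GUARD = "POINT GUARD"
    SHOOTING_GUARD = "SHOOTING GUARD"
    SMALL_FORWARD = "SMALL FORWARD"
    POWER_FORWARD = "POWER FORWARD"
    CENTER = "CENTER"
    GUARD = "GUARD"
    FORWARD = "FORWARD"


# Hypothetical mapping that also covers the generic "G" and "F" labels.
POSITION_ABBREVIATIONS = {
    "PG": Position.POINT_GUARD,
    "SG": Position.SHOOTING_GUARD,
    "SF": Position.SMALL_FORWARD,
    "PF": Position.POWER_FORWARD,
    "C": Position.CENTER,
    "G": Position.GUARD,
    "F": Position.FORWARD,
}


def parse_positions(abbreviation: str) -> List[Position]:
    """Return every known position in a label such as 'F' or 'F-C'.

    Unknown parts are skipped rather than raising KeyError.
    """
    positions = []
    for part in abbreviation.strip().upper().split("-"):
        position = POSITION_ABBREVIATIONS.get(part)
        if position is not None:
            positions.append(position)
    return positions
```

Returning a list rather than a single value is a design choice: it keeps both halves of a combined label and lets callers decide how to treat an empty result.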

Originally posted by @AnkitPatanaik in #52 (comment)

Open to including TOT lines in advanced stats?

Would you be open to a pull request that adds a new argument for optionally including the "TOT" (total) rows when retrieving the advanced stats data?

https://github.com/jaebradley/basketball_reference_web_scraper/blob/v4/basketball_reference_web_scraper/parsers/players_advanced_season_totals.py#L40

Not all columns are simply "a total" sum of the individual team rows (VORP for example) and it would be great to be able to get the TOT data here for those stats where we only care about the aggregate and not just team-by-team info per player.
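
The proposed argument could behave roughly like the sketch below. It is only an illustration of the idea: the function name, the `include_combined_totals` parameter, and the `"team"` key are assumptions and are not taken from the library.

```python
from typing import Dict, Iterable, List

# Label that Basketball Reference uses for a player's combined row.
COMBINED_TOTALS_LABEL = "TOT"


def select_rows(
    rows: Iterable[Dict[str, object]],
    include_combined_totals: bool = False,
) -> List[Dict[str, object]]:
    """Keep per-team rows, and keep 'TOT' rows only when requested."""
    return [
        row
        for row in rows
        if include_combined_totals or row.get("team") != COMBINED_TOTALS_LABEL
    ]
```

Defaulting the flag to `False` would preserve the current behavior for existing callers.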

Manually calculate points?

Hey @jaebradley ,

First of all, this is one of the coolest things ever. Really having fun with it. Thank you so much for making it!

I was just wondering if you were calculating points manually by adding free throws, field goals, and three pointers together? I'm trying to do PPG but don't see a Points value on a player when retrieving season totals.

I don't mind doing it manually but just want to make sure I'm not missing something.

Thanks again!

Dylan
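
For anyone computing points by hand, here is a minimal sketch. It assumes that the made field goals count already includes made three-pointers (as in Basketball Reference's FG column), so each field goal is worth two points and each three-pointer adds one more. The dictionary keys are assumptions about the shape of a season totals row and may not match the library's output.

```python
from typing import Mapping


def total_points(totals: Mapping[str, int]) -> int:
    """Derive points from made shots.

    Assumes 'made_field_goals' includes three-pointers, so a made
    three-pointer contributes 2 (as a field goal) + 1 (the extra point).
    """
    return (
        2 * totals["made_field_goals"]
        + totals["made_three_point_field_goals"]
        + totals["made_free_throws"]
    )


def points_per_game(totals: Mapping[str, int]) -> float:
    """Points per game; returns 0.0 for a player with no games played."""
    games = totals["games_played"]
    if games == 0:
        return 0.0
    return total_points(totals) / games
```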

Unable to get stats for current season

When trying to make a call getting data for the current season (season ending in 2019), I receive the following error:

Traceback (most recent call last):
  File "compare.py", line 25, in <module>
    print(get_stats(get_values(["Joe Harris", "Kevin Durant"])))
  File "compare.py", line 8, in get_values
    players = client.players_season_totals(2019)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/client.py", line 35, in players_season_totals
    values = http_client.players_season_totals(season_end_year)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/http_client.py", line 66, in players_season_totals
    return parse_players_season_totals(response.content)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/parsers/players_season_totals.py", line 43, in parse_players_season_totals
    totals.append(parse_player_season_totals(row))
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/basketball_reference_web_scraper/parsers/players_season_totals.py", line 9, in parse_player_season_totals
    "position": POSITION_ABBREVIATIONS_TO_POSITION[row[2].text_content()],
KeyError: 'F-G'

Update Team Abbreviations

Currently the full season stats pull returns errors for historical teams:

"WSB" Washington Bullets 1994
"SDC" San Diego Clippers 1982
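
A lookup that falls back to a table of historical abbreviations might look like the sketch below. The two historical entries come from this issue; the function name, the shape of `current_teams`, and the string values are assumptions for illustration only.

```python
from typing import Dict, Optional

# Historical abbreviations reported in this issue.
HISTORICAL_TEAM_ABBREVIATIONS: Dict[str, str] = {
    "WSB": "WASHINGTON BULLETS",
    "SDC": "SAN DIEGO CLIPPERS",
}


def resolve_team(abbreviation: str, current_teams: Dict[str, str]) -> Optional[str]:
    """Look up a team, trying current teams first and historical ones second.

    Returns None for an abbreviation found in neither table.
    """
    key = abbreviation.strip().upper()
    if key in current_teams:
        return current_teams[key]
    return HISTORICAL_TEAM_ABBREVIATIONS.get(key)
```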

Non-unique player names?

How are non-unique player names handled? stats.nba.com for example uses a player ID system to avoid this.
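
One possible approach, sketched below, is to key players by the identifier embedded in their Basketball Reference page URL (for example `jamesle01` in `/players/j/jamesle01.html`) rather than by name. Whether the library exposes such an identifier is not stated here; `extract_player_id` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches paths such as "/players/j/jamesle01.html" and captures the slug.
PLAYER_PATH_PATTERN = re.compile(r"/players/[a-z]/([^/]+)\.html$")


def extract_player_id(href: str) -> Optional[str]:
    """Return the player slug from a player page link, or None if absent."""
    match = PLAYER_PATH_PATTERN.search(href)
    return match.group(1) if match else None
```

Two players who share a display name would still have distinct slugs, so the slug can serve as a dictionary key when merging data.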

Add Deprecated Teams Going Back to 1950

When pulling advanced stats from the beginning of time on BR (1950), a lot of the results come back with None for team. I did a quick test and added the Minneapolis Lakers (MNL) to data.py and was able to properly populate the team where None had been previously.
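
To find which older teams still need entries, one could tally the raw abbreviations that have no mapping yet. This sketch assumes you already have the raw team abbreviations scraped from the page; `count_unmapped_teams` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def count_unmapped_teams(abbreviations: Iterable[str], known: Set[str]) -> Counter:
    """Count how often each abbreviation without a known mapping appears."""
    return Counter(
        abbreviation for abbreviation in abbreviations if abbreviation not in known
    )
```

Calling `most_common()` on the result lists the most frequent missing teams first, which suggests an order for adding them.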

Missing Points Data from Season Totals

Thanks for creating this package! I noticed that the season totals for all players are missing the points column. I was able to modify the script to include it. Let me know if you'd like me to submit the update.

Reorganize test suite

Currently a hodge-podge of integration tests, unit tests, etc.

Need to figure out a better organizational system for tests.

Urllib3, requirements.txt, README, pip install

Following the installation instructions in the README, I installed the project with pip install rather than the conventional git clone, since the README states that the project was originally created “as an exercise for creating […] first PyPi package.” Shortly afterwards I got this error: “ERROR: requests 2.20.0 has requirement urllib3<1.25,>=1.21.1, but you'll have urllib3 1.25.2 which is incompatible.”

After some research, I found similar issues reported for this project and for other projects on GitHub. Some people suggested installing a compatible urllib3 version (that advice assumed the installed version was too old, whereas mine was too new); others suggested modifying requirements.txt to correct the urllib3 specification.

When I examined requirements.txt, I saw that the project had recently made exactly that update. I therefore installed the project with git clone instead of pip install, since the pip package had apparently not been updated to match the repository. This corrected the issue. Might I suggest noting the discrepancy in the README, or updating the pip package so that its urllib3 requirement matches requirements.txt?
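
The constraint in the error message, urllib3<1.25,>=1.21.1, can be checked by hand as in the sketch below. It handles only plain dotted numeric versions such as "1.25.2" and is an illustration of the comparison, not a replacement for pip's resolver; both function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.25.2' into (1, 25, 2). Only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies_requests_pin(urllib3_version: str) -> bool:
    """Check the range quoted in the error: >=1.21.1 and <1.25."""
    lower_bound = (1, 21, 1)
    upper_bound = (1, 25)
    return lower_bound <= parse_version(urllib3_version) < upper_bound
```

With this check, urllib3 1.25.2 fails because it is at or above the upper bound, which matches the "too new" situation described above.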

Certifi Conflict

Having issues installing this because of a conflict with 'Certifi':

Found existing installation: certifi 2019.6.16
Uninstalling certifi-2019.6.16:
Exception:
Traceback (most recent call last):
File "/Users/evan.agovino/anaconda3/lib/python3.6/shutil.py", line 550, in move
os.rename(src, real_dst)
PermissionError: [Errno 13] Permission denied: '/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/certifi-2019.6.16.dist-info/DESCRIPTION.rst' -> '/var/folders/lk/zxfkkcj155d51dt9l80yrdfhfbznt_/T/pip-qttxhug1-uninstall/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/certifi-2019.6.16.dist-info/DESCRIPTION.rst'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/commands/install.py", line 342, in run
prefix=options.prefix_path,
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/req/req_set.py", line 778, in install
requirement.uninstall(auto_confirm=True)
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/req/req_install.py", line 754, in uninstall
paths_to_remove.remove(auto_confirm)
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/req/req_uninstall.py", line 115, in remove
renames(path, new_path)
File "/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/pip/utils/__init__.py", line 267, in renames
shutil.move(old, new)
File "/Users/evan.agovino/anaconda3/lib/python3.6/shutil.py", line 565, in move
os.unlink(src)
PermissionError: [Errno 13] Permission denied: '/Users/evan.agovino/anaconda3/lib/python3.6/site-packages/certifi-2019.6.16.dist-info/DESCRIPTION.rst'

Request to include date in player box score output

Hi @jaebradley, thanks for the library! It's really neat and useful.

I was thinking it might be useful to also include the date in the player box score output. That way you could easily aggregate by date and team to obtain team level box scores and use this data in conjunction with season schedule results (which is at the team level).
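
Until the date is part of the output, a caller could attach it manually and then aggregate, as in this sketch. The row keys (`"team"`, `"date"`) and both helper names are assumptions for illustration; the library's actual box score fields may differ.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def with_date(box_scores: Iterable[Dict], game_date: date) -> List[Dict]:
    """Return copies of the box score rows with a 'date' key added."""
    return [{**row, "date": game_date} for row in box_scores]


def team_totals_by_date(rows: Iterable[Dict], stat: str) -> Dict[Tuple[date, str], int]:
    """Sum one numeric stat per (date, team) pair."""
    totals: Dict[Tuple[date, str], int] = defaultdict(int)
    for row in rows:
        totals[(row["date"], row["team"])] += row[stat]
    return dict(totals)
```

The (date, team) key is what allows these totals to be joined against team-level schedule results.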

Request for Uniform Number

Hi @jaebradley ,

I was wondering if you had thought about adding a player's uniform number to the HTTP response for season_totals. (For LeBron James it would be {number: 23}, for instance.)

I think it would be very useful!

Dylan
