
mlb-game-prediction-v2.0's Introduction

Version 2.0 of my MLB Game Prediction model

This began as my final project for CS50 and has evolved into a research project that I am currently writing a paper on. V1.0 had a maximum accuracy of 57.2%, compared to student averages of 55-56% with a logistic regression model. The target accuracy for this version is 59-60%, using data from the past decade's worth of seasons.

Executables:

Main.py

Main.py searches through the roster, gamelog, and pbp (play-by-play) data to create combined.csv, test.csv, and train.csv. The test and train files contain the same data as combined.csv, split by season: games prior to 2022 go to train.csv and games after 2022 go to test.csv.
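
The split described above can be sketched as follows. This is a minimal illustration, not the code in Main.py. It assumes combined.csv has a Date column whose values start with a four-digit year (Retrosheet's YYYYMMDD form), and that the 2022 season itself lands in the test set, since the description does not say which side 2022 falls on. The names split_by_season, write_rows, split_combined, and cutoff_year are hypothetical.

```python
import csv


def split_by_season(rows, cutoff_year=2022, date_key="Date"):
    """Split game rows into (train, test) by season year.

    Assumption: each date starts with a four-digit year, e.g. 20210401.
    Rows before cutoff_year go to train; cutoff_year and later go to test.
    """
    train, test = [], []
    for row in rows:
        year = int(str(row[date_key])[:4])
        (train if year < cutoff_year else test).append(row)
    return train, test


def write_rows(path, rows, fieldnames):
    """Write a list of dict rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def split_combined(combined_path="combined.csv",
                   train_path="train.csv",
                   test_path="test.csv"):
    """Read the combined file and write the train and test files."""
    with open(combined_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)
    train, test = split_by_season(rows)
    write_rows(train_path, train, fieldnames)
    write_rows(test_path, test, fieldnames)
    return len(train), len(test)
```

Splitting by season rather than at random keeps every test game later in time than every training game, so the model is never evaluated on games that precede its training data.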

Train.py

Train.py reads the data in combined.csv, or in test.csv and train.csv (depending on the configuration in log_games.py), builds a Linear Regression model from it, and tests the model.
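
Because Train.py depends on a file that Main.py generates, Main.py has to run successfully first. The loading step might look roughly like the sketch below, which fails with a clearer message when the file is missing. This is an illustration under assumptions, not the project's code: train.py uses pandas, while this sketch uses only the standard library, and the names load_features and ID_COLUMNS are hypothetical. The dropped columns (Date, Home, Visitor, GameID) are the ones named in train.py's read_csv call.

```python
import csv
import os

# Identifier columns that train.py drops before fitting the model.
ID_COLUMNS = ("Date", "Home", "Visitor", "GameID")


def load_features(path="combined.csv", drop=ID_COLUMNS):
    """Load game rows from a CSV file, leaving out the identifier columns.

    Raises FileNotFoundError with a hint when the file has not been
    generated yet.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path} not found; run main.py first to generate it"
        )
    with open(path, newline="") as f:
        return [
            {key: value for key, value in row.items() if key not in drop}
            for row in csv.DictReader(f)
        ]
```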

Disclaimer: The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".

mlb-game-prediction-v2.0's People

Contributors

jacobpieczynski

mlb-game-prediction-v2.0's Issues

Where is Test.py?

Sorry,

One last question: I don't see Test.py in your src folder. Could you also clarify what the output is? Is it a game predictor for the current games/season or the last one?

Thanks again!

Issue Running Main and Train

I receive the following errors when running main.py and then train.py:

main.py

--------------------------------------------------
LOADING ROSTERS
FAILED TO OPEN ROSTER ros/2023/ANA2023.ROS
--------------------------------------------------

         10 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 main.py:7(main)
        1    0.000    0.000    0.000    0.000 parse.py:8(parse_roster)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {built-in method io.open}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

train.py

Traceback (most recent call last):
  File "./MLB-Game-Prediction-V2.0/src/train.py", line 22, in <module>
    data = pd.read_csv('combined.csv').drop(columns=['Date', 'Home', 'Visitor', 'GameID'], axis=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./mlb-game-prediction-v2/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./mlb-game-prediction-v2/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./mlb-game-prediction-v2/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "./mlb-game-prediction-v2/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
    self.handles = get_handle(
                   ^^^^^^^^^^^
  File "./mlb-game-prediction-v2/lib/python3.11/site-packages/pandas/io/common.py", line 873, in get_handle
    handle = open(
             ^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'combined.csv'

Running

Is there an order to running your scripts or just run main?
