Kaanan Kharwa ([email protected]) Laura McGann ([email protected])
We chose Python because we are both familiar with it, and because libraries such as pandas, NumPy, and Matplotlib make data analysis fast and straightforward.
Open our code in a notebook environment, such as Google Colab or Jupyter Notebook. The source CSV file must be in the same directory as our source code.
KDD_Lab1.ipynb is the only source file in our zip file. It contains the data filtering, grouping, manipulation, and visualization commands for all three of our research questions.
If reading the source CSV files fails even though they are in the same directory as our source code, comment out the active read_csv line and uncomment the other one. If you load the data via Google Drive instead, running that section of the code will display a URL. Follow it to obtain a verification code for your Google account, paste the code into the waiting input box, and press Enter to continue running our code.
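As an alternative to commenting and uncommenting lines by hand, the local-first, Drive-second fallback described above could be sketched roughly as follows. This is not code from KDD_Lab1.ipynb: the helper name resolve_csv_path, the file name data.csv, and the Drive folder /content/drive/MyDrive are all placeholders chosen for illustration, and the sketch assumes Google Drive has already been mounted in the notebook.

```python
from pathlib import Path


def resolve_csv_path(filename, search_dirs):
    """Return the first existing path to `filename` among `search_dirs`.

    Directories are checked in the order given, so put the preferred
    location (e.g. the notebook's own directory) first.
    Raises FileNotFoundError if the file is in none of them.
    """
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"{filename} not found in any of: {[str(d) for d in search_dirs]}"
    )


# Hypothetical usage inside the notebook (all names are placeholders):
#   import pandas as pd
#   path = resolve_csv_path("data.csv", [".", "/content/drive/MyDrive"])
#   df = pd.read_csv(path)
```

With this approach a single read_csv call works in both setups, and a missing file produces one clear error message listing every directory that was checked.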