##PYTHON FINAL PROJECT
IPL Data Analysis
Content:
- Introduction about data
- Pre-processing steps
- Analysis 1 : Season-wise wins for each team at different venues
  - Result : Stacked Bar Chart
- Analysis 2 : Toss impact on different teams across seasons
  - Result : Grouped Bar Plot
- Analysis 3a : Teams handling their nerves successfully
  - Result : Bar Plot
- Analysis 3b : Teams dominating their opposition
  - Result : Bar Plot
- Analysis 4 : Top 5 batsmen performances across seasons
  - Result : Point Plot Chart
- Analysis 5 : Dynamic player comparison by runs scored and strike rate
  - Result : Grouped Bar Chart
Introduction
- Raw data is collected from http://cricsheet.org/
- Raw data consists of 2 files: matches.csv & deliveries.csv
- matches.csv consists of the following columns & data for all 577 matches held till date:
- deliveries.csv consists of the following columns & ball-by-ball data for each match held till date:
- Sample code to read CSV data into a data frame.
sample code:
import pandas as pd
path = "C:/PYTHON/pythonFinalProject/rawDataPythonIPL"
# A forward slash avoids the invalid "\m" escape sequence in the file name
all_matches_df = pd.read_csv(path + "/matches.csv")
all_matches_df.head(2)
- Aggregate total scores and team extras for each match
- Merge the team scores for each match with the matches data, adding them as new columns to the all-matches data frame
- Add a new column which identifies the match type of each playoff match as Qualifier-1, Eliminator, Qualifier-2 or Final
sample code:
# Assumes rows are in chronological order with the default consecutive integer index
for year in range(2008, 2017):
    # Index label of the fourth-last match of the season; the next three follow it
    fourth_last_match_in_each_season = all_matches_df[all_matches_df["season"] == year][-4:].index.values[0]
    third_last_match_in_each_season = fourth_last_match_in_each_season + 1
    second_last_match_in_each_season = third_last_match_in_each_season + 1
    last_match_in_each_season = second_last_match_in_each_season + 1
    # DataFrame.set_value was removed from pandas; .loc sets the cell and creates the column if needed
    all_matches_df.loc[fourth_last_match_in_each_season, "match-type"] = "Qualifier-1"
    all_matches_df.loc[third_last_match_in_each_season, "match-type"] = "Eliminator"
    all_matches_df.loc[second_last_match_in_each_season, "match-type"] = "Qualifier-2"
    all_matches_df.loc[last_match_in_each_season, "match-type"] = "Final"
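The first two pre-processing steps (aggregating scores and extras per match, then merging them into the matches data) could look roughly like the sketch below. It is only an illustration: the column names `total_runs`, `extra_runs` and `id` are assumptions about the raw files, and the small data frames are made-up stand-ins for deliveries.csv and matches.csv.

```python
import pandas as pd

# Made-up stand-ins for deliveries.csv and matches.csv (illustrative values only).
# ASSUMPTION: the columns "total_runs", "extra_runs" and "id" exist under these names.
deliveries_df = pd.DataFrame({
    "match_id":     [1, 1, 1, 1, 2, 2],
    "inning":       [1, 1, 2, 2, 1, 2],
    "batting_team": ["A", "A", "B", "B", "A", "C"],
    "total_runs":   [4, 1, 6, 2, 1, 3],
    "extra_runs":   [0, 1, 0, 0, 0, 1],
})
matches_df = pd.DataFrame({"id": [1, 2], "season": [2008, 2008]})

# Step 1: total score and extras per match and innings
team_scores = (
    deliveries_df
    .groupby(["match_id", "inning", "batting_team"], as_index=False)[["total_runs", "extra_runs"]]
    .sum()
)

# Step 2: spread innings 1 and 2 into columns so there is one row per match
wide = team_scores.pivot(index="match_id", columns="inning",
                         values=["total_runs", "extra_runs"])
wide.columns = [f"{name}_inning{inning}" for name, inning in wide.columns]
wide = wide.reset_index()

# Step 3: merge the per-match scores onto the matches data as new columns
merged = (
    matches_df
    .merge(wide, how="left", left_on="id", right_on="match_id")
    .drop(columns="match_id")
)
print(merged)
```

A left merge keeps every row of the matches data even if a match has no deliveries recorded.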
ANALYSIS 1: Team Wins in different Cities in various IPL Seasons
####Code to get team wins per city data:
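A minimal sketch of how wins per team, city and season could be counted, assuming the matches data has `city` and `winner` columns (these names are assumptions; only `season` appears in the code above). The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration.
# ASSUMPTION: "city" and "winner" are the column names in matches.csv.
matches_df = pd.DataFrame({
    "season": [2008, 2008, 2008, 2009],
    "city":   ["Mumbai", "Mumbai", "Kolkata", "Mumbai"],
    "winner": ["Team A", "Team A", "Team B", "Team B"],
})

# Count wins for every (season, city, team) combination
wins_per_city = (
    matches_df
    .dropna(subset=["winner"])          # skip matches without a winner
    .groupby(["season", "city", "winner"])
    .size()
    .reset_index(name="wins")
)

# One row per (season, city) and one column per team:
# the layout a stacked bar chart needs
stacked = wins_per_city.pivot_table(
    index=["season", "city"], columns="winner",
    values="wins", aggfunc="sum", fill_value=0,
)
print(stacked)
```

With matplotlib installed, `stacked.plot(kind="bar", stacked=True)` would draw the stacked bar chart from this table.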
ANALYSIS 2: Toss Decision & Impact in IPL across seasons for various teams
####Code to get toss winners data:
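One possible way to measure toss impact is to compare, per season and team, how many tosses were won with how many of those matches were also won. The sketch below assumes columns named `toss_winner` and `winner` (an assumption about matches.csv) and uses made-up rows.

```python
import pandas as pd

# Made-up rows for illustration.
# ASSUMPTION: "toss_winner", "toss_decision" and "winner" are the column names.
matches_df = pd.DataFrame({
    "season":        [2008, 2008, 2008, 2009],
    "toss_winner":   ["Team A", "Team B", "Team A", "Team B"],
    "toss_decision": ["bat", "field", "field", "bat"],
    "winner":        ["Team A", "Team A", "Team A", "Team B"],
})

# True when the side that won the toss also won the match
matches_df["toss_win_match_win"] = matches_df["toss_winner"] == matches_df["winner"]

# Per season and team: tosses won, and how many of those matches were won too
toss_impact = (
    matches_df
    .groupby(["season", "toss_winner"])["toss_win_match_win"]
    .agg(tosses_won="count", matches_won="sum")
    .reset_index()
)
print(toss_impact)
```

The two count columns map directly onto the two bars per team in a grouped bar plot.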
ANALYSIS 3a: Teams which handle their nerves under pressure
####Code to get teams which won close matches:
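A rough sketch of counting close wins per team. What counts as a "close" match is not defined here, so the thresholds `CLOSE_RUNS` and `CLOSE_WICKETS` are hypothetical, and the column names `win_by_runs` and `win_by_wickets` are assumptions about matches.csv. The rows are made up.

```python
import pandas as pd

# HYPOTHETICAL thresholds for a "close" win; adjust to taste
CLOSE_RUNS = 10      # won by at most this many runs
CLOSE_WICKETS = 2    # won with at most this many wickets in hand

# Made-up rows for illustration.
# ASSUMPTION: "win_by_runs" and "win_by_wickets" are the column names.
matches_df = pd.DataFrame({
    "winner":         ["Team A", "Team B", "Team A", "Team C"],
    "win_by_runs":    [5, 0, 40, 0],
    "win_by_wickets": [0, 2, 0, 8],
})

# A margin of 0 means the match was not won that way, so start the range at 1
close_by_runs = matches_df["win_by_runs"].between(1, CLOSE_RUNS)
close_by_wickets = matches_df["win_by_wickets"].between(1, CLOSE_WICKETS)

# Number of close wins per team
close_wins = matches_df[close_by_runs | close_by_wickets]["winner"].value_counts()
print(close_wins)
```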
ANALYSIS 3b: Number of times teams dominated their opposition with big victories
####Code to get teams winning with big margins:
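A sketch of counting big-margin wins per team, split by wins by runs and wins by wickets. The cut-offs `BIG_RUNS` and `BIG_WICKETS` are hypothetical, the column names `win_by_runs` and `win_by_wickets` are assumptions, and the rows are made up.

```python
import pandas as pd

# HYPOTHETICAL cut-offs for a "big" victory
BIG_RUNS = 50
BIG_WICKETS = 8

# Made-up rows for illustration.
# ASSUMPTION: "win_by_runs" and "win_by_wickets" are the column names.
matches_df = pd.DataFrame({
    "winner":         ["Team A", "Team B", "Team A", "Team C"],
    "win_by_runs":    [60, 0, 12, 0],
    "win_by_wickets": [0, 9, 0, 3],
})

# Count big wins of each kind per team; the two Series are aligned on team name
big_wins = pd.DataFrame({
    "by_runs": matches_df[matches_df["win_by_runs"] >= BIG_RUNS]["winner"].value_counts(),
    "by_wickets": matches_df[matches_df["win_by_wickets"] >= BIG_WICKETS]["winner"].value_counts(),
}).fillna(0).astype(int)
print(big_wins)
```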
ANALYSIS 4: Top 5 Batsmen
####Code to get top 5 batsmen:
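A minimal sketch of one way to pick the top run scorers and break their runs down by season. It assumes deliveries are linked to matches through `match_id` and an `id` column in the matches data (`id` is an assumption). The data is made up, and `TOP_N` is set to 2 only to keep the toy output short.

```python
import pandas as pd

# Made-up rows for illustration.
# ASSUMPTION: matches.csv has an "id" column matching "match_id" in deliveries.csv.
deliveries_df = pd.DataFrame({
    "match_id":     [1, 1, 1, 2, 2, 2, 2],
    "batsman":      ["P1", "P2", "P1", "P1", "P3", "P2", "P3"],
    "batsman_runs": [4, 1, 6, 2, 6, 0, 4],
})
matches_df = pd.DataFrame({"id": [1, 2], "season": [2008, 2009]})

TOP_N = 2  # the analysis uses 5; 2 keeps this toy example small

# Attach the season to every delivery
with_season = deliveries_df.merge(
    matches_df[["id", "season"]], left_on="match_id", right_on="id"
)

# Batsmen with the highest total runs over all seasons
top_batsmen = (
    with_season.groupby("batsman")["batsman_runs"].sum().nlargest(TOP_N).index
)

# Runs per season for those batsmen: one row per season, one column per batsman
runs_by_season = (
    with_season[with_season["batsman"].isin(top_batsmen)]
    .groupby(["season", "batsman"])["batsman_runs"].sum()
    .unstack(fill_value=0)
)
print(runs_by_season)
```

Each column of `runs_by_season` gives one line of points across seasons, which suits a point plot.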
ANALYSIS 5: Player Comparison (Player 1 vs Player 2, in terms of runs scored, balls faced, strike rate)
- Calculated all the batsman aggregates per match, such as runs scored, balls faced, 4s scored, 6s scored, strike rate and dismissal kind
match_id | inning | batting_team | batsman | batsman_runs | balls_faced | Strike-Rate | More-Than-30 | More-Than-50 | More-Than-100 | 4s | 6s | dismissal_kind | fielder |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | Kolkata Knight Riders | BB McCullum | 158 | 73.0 | 216.44 | 1 | 1 | 1 | 10 | 13 | 0 | 0 |
1 | 1 | Kolkata Knight Riders | DJ Hussey | 12 | 12.0 | 100.00 | 0 | 0 | 0 | 1 | 0 | caught | CL White |
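Aggregates like those in the table above could be derived along the following lines. The strike rate is runs per 100 balls faced, which matches the first row of the table (158 / 73 × 100 ≈ 216.44). The `wide_runs` column and the rule that wides do not count as balls faced are assumptions, and the deliveries are made up.

```python
import pandas as pd

# Made-up deliveries for illustration.
# ASSUMPTION: a "wide_runs" column exists and wides are not counted as balls faced.
deliveries_df = pd.DataFrame({
    "match_id":     [1] * 6,
    "inning":       [1] * 6,
    "batting_team": ["Team A"] * 6,
    "batsman":      ["P1", "P1", "P1", "P1", "P2", "P2"],
    "batsman_runs": [4, 6, 0, 1, 1, 0],
    "wide_runs":    [0, 0, 1, 0, 0, 0],
})

# Per-delivery flags that can simply be summed per batsman
deliveries_df["ball_faced"] = (deliveries_df["wide_runs"] == 0).astype(int)
deliveries_df["is_four"] = (deliveries_df["batsman_runs"] == 4).astype(int)
deliveries_df["is_six"] = (deliveries_df["batsman_runs"] == 6).astype(int)

# One row per batsman per innings
batsman_agg = (
    deliveries_df
    .groupby(["match_id", "inning", "batting_team", "batsman"], as_index=False)
    .agg(
        batsman_runs=("batsman_runs", "sum"),
        balls_faced=("ball_faced", "sum"),
        fours=("is_four", "sum"),
        sixes=("is_six", "sum"),
    )
)

# Strike rate = runs per 100 balls faced
batsman_agg["Strike-Rate"] = (
    100 * batsman_agg["batsman_runs"] / batsman_agg["balls_faced"]
).round(2)

# Milestone flag, 1 when the innings is worth more than 30 runs
batsman_agg["More-Than-30"] = (batsman_agg["batsman_runs"] > 30).astype(int)
print(batsman_agg)
```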
####Graph 1: Player Comparison By Runs