hackerspace-pesu / best11-fantasycricket

Predicting the Best 11 for a fantasy cricket game

License: GNU Affero General Public License v3.0

Python 63.45% CSS 9.13% HTML 25.88% Dockerfile 1.54%

fantasy-cricket regress python fastapi scrapy sports scrapyrt espncricinfo

best11-fantasycricket's People

parijatdhar97 rahul30032 nimendrak ramgsuri scientes techyringo ghanender-chauhan egurnick abhisinha2001 milanmandal data-science-ai-open-source dan-329 jukiforde mraaghav sajalshlan hazdik bharani25

best11-fantasycricket's Issues

Add SEO to HTML templates

Good Resource: https://developers.google.com/search/docs/beginner/seo-starter-guide?hl=en

[recommendation] Change from Flask to fastapi

FastAPI is a Python framework designed for building APIs:
https://fastapi.tiangolo.com/
It simplifies things, uses pydantic for input validation, and has automatic OpenAPI (Swagger) support, among other conveniences.

[FEATURE REQ] Scoring systems for different Fantasy cricket platforms

Is your feature request related to a problem? Please describe.
As more and more fantasy cricket platforms emerge, we would like to build support for all of them.

Describe the solution you'd like
In fantasy_leagues.py, each fantasy cricket platform should be represented as a class like the one below.
Each list in the dictionaries holds the points for ['T20', 'ODI', 'TEST'], in that order.
Example class

class Dream11(Teams):
    """Dream11 League

    Supported platforms:
        * ODI
        * T20
        * TEST
    """

    name = "Dream11"

    # Every list holds the points for [T20, ODI, TEST], in that order.
    batting_dict = {
        "runs": [1, 1, 1],
        "boundaries": [1, 1, 1],
        "sixes": [2, 2, 2],
        "50": [8, 4, 4],
        "100": [16, 8, 8],
        "duck": [-2, -3, -4],
    }

    bowling_dict = {
        "wicket": [25, 25, 16],
        "4-wicket-haul": [8, 4, 4],
        "5-wicket-haul": [16, 8, 8],
        "Maiden": [4, 8, None],
    }

    wk_dict = {
        "Catch": [8, 8, 8],
        "Stump": [12, 12, 12],
    }
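
To make the list-per-format layout concrete, here is a minimal sketch of how one of these dictionaries could be turned into points for a single innings. The helper `batting_points`, its parameters, and the `FORMATS` tuple are hypothetical and not part of the repository. The sketch assumes that "boundaries" is a per-four bonus, that only the highest milestone bonus is awarded, and that a duck means being dismissed for zero; a real platform's rules may differ.

```python
# Order of the point lists, as stated in the issue text.
FORMATS = ("T20", "ODI", "TEST")

# Values copied from the Dream11 example in the issue.
BATTING_POINTS = {
    "runs": [1, 1, 1],
    "boundaries": [1, 1, 1],
    "sixes": [2, 2, 2],
    "50": [8, 4, 4],
    "100": [16, 8, 8],
    "duck": [-2, -3, -4],
}


def batting_points(runs, fours, sixes, dismissed, match_format, table=BATTING_POINTS):
    """Return batting points for one innings (hypothetical helper)."""
    i = FORMATS.index(match_format)  # column to read from every list
    total = (
        runs * table["runs"][i]
        + fours * table["boundaries"][i]  # assumption: bonus per four
        + sixes * table["sixes"][i]
    )
    if runs >= 100:
        # Assumption: only the highest milestone bonus applies.
        total += table["100"][i]
    elif runs >= 50:
        total += table["50"][i]
    elif runs == 0 and dismissed:
        # Assumption: a duck is a dismissal for zero runs.
        total += table["duck"][i]
    return total


print(batting_points(54, 6, 2, True, "T20"))  # 54 + 6 + 4 + 8 = 72
```

Keeping the format lookup in one place means a new platform only needs new dictionaries, not new scoring code.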

Some platforms are

Comment if you would like to work on it

Categorising Players by Country

Describe the Issue:
The current data has players in no particular order or pattern. To allow for other exploratory data analysis, categorising players by country would be useful.

Solution:
Categorise the players in the data folder by country and update them in another folder named categorised under the data folder.

Comment if you would like to work on it.

[FEATURE] Missing requirements.txt

As the number of modules grows, a requirements.txt would make installation easier.

[BUG] Fix changes after PR #26

Describe the bug
In PR #26, the model fails due to updated folder names. It needs to be fixed before merging into master.

To Reproduce
Steps to reproduce the behavior:
Run app.py or check.py from the issue-25 branch only.

Expected behavior
It must work the same as before. All changes must be done in the issue-25 branch only

Additional context
Not much work: wherever there is zip, zip2, bowl or wk, it just needs to be updated to 'zip/ODI', 'zip2/ODI' and so on.

Women's Cricket Data

Describe the Issue:
Note: This is an optional issue as of now
The data used consists of men's cricket. To make it more inclusive, adding records of women's cricket would be very useful.

Solution:
Mimic the data folder for women's cricket using a web scraper.
There are currently no records for women's cricket on howstat.com. If you can find an open, scraping-friendly website, go ahead, but be careful of legal issues while scraping.
Use the names as given in the folder with the appropriate suffix,
e.g. zip_women.csv

Comment if you would like to work on it.

[FEATURE] Implementation of FastAPI Framework

Describe the Issue
The current model uses the Flask web framework. Now that many more robust frameworks exist, moving to one of them, FastAPI, would be beneficial.

Solution
Create a FastAPI implementation of the existing Flask model that has basic functionalities provided by it. Any extra features that seem appropriate or aesthetically pleasing are well appreciated.

Test
Ensure the model makes use of the existing python scripts and matches to form teams and display them accordingly.
i.e. It should be seamless to integrate with the current scripts and project.

Comment if you would like to work on this.

More tests required

Tests for crawler espn-matches required

Scrapy_autounit fails because the crawler generates links randomly.

[FEATURE] Add bowling crawler to the webcrawler

Is your feature request related to a problem? Please describe.
The crawler currently only crawls for new players and match ids. The web crawler also needs to crawl for bowling stats.

Describe the solution you'd like
Extend function parse_scorecard in crawler/cricketcrawler/cricketcrawler/spiders/howstat.py to bowling and get the following

wickets
Overs
Maidens
Economy

Refer Dataset.md to understand the matchcodes and playercodes

Additional context
Crawler can be found in feature-webcrawler branch of the repo. Crawler is built using [scrapy](https://scrapy.org/)

[DATA] Segregation of ODI records from T20 records

Is your feature request related to a problem? Please describe.
Currently the zip and zip2 folders hold records for both ODI and T20. We would like to segregate these records so that the data is easier to scrape in the future.

Describe the solution you'd like
Create two folders called ODI and T20 in both zip and zip2 and segregate the records. Also create a new folder called ODI in bowl and wk and place all the files in it.

Comment if you would like to work on it

Time series Model

Describe the issue
Currently our model predicts points using linear regression because of a lack of features. If you can produce better results than the current model with another time series model, that would be great.

Solution
Change the current linear regression model to your model and set up a pull request.

Test
Run check.py and see if your score beats our losses.

Comment if you would like to work on it

More player records for ODI

Describe the Issue:
Currently we have players from only 11 countries, whose distribution can be found in issue #16. We would like players from more countries to be added, such as Sri Lanka and Zimbabwe.
Solution:
Employ a web scraper to scrape player records from howstat.com for ODI matches for these countries (non-retired players only) and update them in the data folder.
Ensure the scraped data is in the same format as the files in the data folder.

Comment if you would like to work on it.

Typing hints using pydantic

Is your feature request related to a problem? Please describe.
Add type hints to the files inside fantasy_cricket, using pydantic and typing.
Additional context
It would also be great if mypy were used for the checks and added to the CI.

Add privacy policy page

Needed for GDPR, as far as I know.

Dockerfile

How do you plan on hosting the App?

Describe the solution you'd like
Using Docker/docker-compose would be one of the easier ways of hosting, especially if you plan on using a database. It would also make using a database in development more comparable when issues arise.

[FEATURE] Add wicket keeper crawler to the webcrawler

Is your feature request related to a problem? Please describe.
The crawler currently only crawls for new players and match ids. The web crawler also needs to crawl for wicketkeeping stats.

Describe the solution you'd like
Extend function parse_scorecard in crawler/cricketcrawler/cricketcrawler/spiders/howstat.py to wicketkeeping and get the following

catches
stumpings

Additional context
Crawler can be found in feature-webcrawler branch of the repo. Crawler is built using scrapy

[FEATURE] Add a function for averages

Describe the issue
We would like to add statistics such as batting average and bowling average to our dataset.

Solution
To do this, add two functions, batting_average and bowling_average, to a file called average.py and implement them.

Test
Run both of them and check that the dataset is updated.

Note: When making a PR, don't include the updated dataset; that will be done by the maintainers. Just write the functions in the file for now.
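
As a starting point, the two functions could look like the sketch below. It uses the standard cricket definitions (runs scored per dismissal, and runs conceded per wicket taken). The argument names and the choice to return `None` when the average is undefined are assumptions, and reading from or writing to the dataset is left out.

```python
from typing import Optional


def batting_average(runs_scored: int, dismissals: int) -> Optional[float]:
    """Runs scored divided by the number of times the batter was out."""
    if dismissals == 0:
        # Undefined when the batter has never been dismissed.
        return None
    return runs_scored / dismissals


def bowling_average(runs_conceded: int, wickets: int) -> Optional[float]:
    """Runs conceded divided by the number of wickets taken."""
    if wickets == 0:
        # Undefined when the bowler has not taken a wicket.
        return None
    return runs_conceded / wickets


print(batting_average(450, 10))  # 45.0
print(bowling_average(300, 12))  # 25.0
```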

Comment if you would like to work on it

Unnecessary Comments and print statements

Describe the issue
In team.py, there are a lot of unnecessary comments, comment strings and a few print statements.

Solution
Remove all such comments, comment strings and print statements, and remember to autoformat using black.

Note: This is only a first timers issue, PRs from experienced users will be labelled invalid
Comment if you would like to work on it

Pre-commit hook

Add a pre-commit hook with basic hooks like check-ast , check-yaml, check-merge-conflict, end-of-file-fixer

Also add additional hooks for black

[DOCS] Update Dataset.md after PR 36

Is your feature request related to a problem? Please describe.
After #36 the entire dataset has been restructured, and those changes need to be reflected in Dataset.md.
The file structure can be found in the description of #36.

Describe the solution you'd like
Since the zip, zip2, Bowl and wk folders no longer exist, they need to be removed. The scoring table remains the same.
Two folders have been added, namely ODI and T20,
each having 4 folders containing the joint files of zip, zip2, Bowl and wk.
This needs to be reflected.

Better interface

Issue
We are looking for a better front-end for our GUI, no backend changes expected
Solution
We are not expecting anything complex, any kind of significant improvements to the present model is welcome.

Add algorithm to select players based on credits

Describe the issue
Currently our model predicts only on points, but fantasy cricket also limits teams by credits. We would like to implement an algorithm for that.

Solution
Create a function that takes as input the players list, the credits list, the maximum credits for the match and the points predicted by the model for each player, and selects the best 11 players based on points without exceeding the maximum credits.
Note: be careful not to violate the team rule, i.e. a maximum of 7 players from each team.
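
One way to approach this is an exhaustive search, sketched below. This is not the project's implementation: `Candidate`, `select_best_team` and all parameter names are hypothetical. It checks every possible team, which is exact but slow (a 22-player pool has 705,432 possible elevens), so a dynamic-programming or integer-programming approach may be preferable in practice.

```python
from collections import Counter
from itertools import combinations
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    """Hypothetical record for one player in the match pool."""
    name: str
    team: str
    credits: float
    points: float  # points predicted by the model


def select_best_team(
    pool: Sequence[Candidate],
    max_credits: float,
    team_size: int = 11,
    max_per_team: int = 7,
) -> List[Candidate]:
    """Return the highest-scoring valid team, or [] if none exists."""
    best, best_points = None, float("-inf")
    for combo in combinations(pool, team_size):
        # Reject teams that exceed the credit budget.
        if sum(p.credits for p in combo) > max_credits:
            continue
        # Reject teams with too many players from one side.
        if max(Counter(p.team for p in combo).values()) > max_per_team:
            continue
        total = sum(p.points for p in combo)
        if total > best_points:
            best, best_points = combo, total
    return list(best) if best is not None else []
```

A small usage example with made-up players, picking 3 from 5 with at most 2 per team:

```python
pool = [
    Candidate("A", "IND", 10, 50),
    Candidate("B", "IND", 9, 40),
    Candidate("C", "IND", 8, 30),
    Candidate("D", "AUS", 9, 35),
    Candidate("E", "AUS", 8, 20),
]
team = select_best_team(pool, max_credits=27, team_size=3, max_per_team=2)
print(sorted(p.name for p in team))  # ['A', 'C', 'D']
```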

Test
You can test with your own credits for now, as we are still figuring out an API for it.
You do not need to integrate it with the current Flask model; you can create a function in team.py for now.

Comment if you would like to work on it

Pycricbuzz is down

Describe the bug
The pycricbuzz package, which we used for getting the live matches and their respective squads, has been disabled. An alternative has to be put in place ASAP.

To Reproduce
Steps to reproduce the behavior:
Run the local development of the website

Desktop (please complete the following information):

OS: [ALL]
Version [v0.1.0]

Additional context
Possible solution is to crawl espncricinfo

Add Contact Page

Required for GDPR, as far as I know.

[BUG] Web crawler searches through matches from the 1900s

Describe the bug
The web crawler in feature-crawler takes in match records from the 1900s. This wastes a lot of time and reduces the efficiency of the crawler.
To Reproduce
Steps to reproduce the behavior:

Follow the instructions in the README file to run the crawler
Wait for the Ids crawl to finish and notice

Expected behavior
The crawler should apply a filter that takes match records only from 2017 onwards.
Possible solution
In crawler/cricketcrawler/spiders/howstat.py, in function parse_scorecard:

# Keep only matches from 2017 onwards; the date string starts with the year.
if int(date[0:4]) >= 2017:
    item = MatchidItem(name=url[startint + 10:], folder=folder, matchid=matchid, date=date)
    yield item
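
The year check can also be pulled out into a small helper so it is testable outside the spider. This is a sketch only: `is_recent_match` and `min_year` are hypothetical names, and it carries over the assumption made in the snippet from the issue that the first four characters of the date string are the year.

```python
def is_recent_match(date: str, min_year: int = 2017) -> bool:
    """Return True when the match date falls in min_year or later."""
    # Assumption: the date string starts with a four-digit year.
    return int(date[0:4]) >= min_year


dates = ["1998-04-17", "2017-01-15", "2021-09-03"]
print([d for d in dates if is_recent_match(d)])  # ['2017-01-15', '2021-09-03']
```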

Desktop (please complete the following information):

Version [feature-crawler]

Additional context
The starting point to this might be crawler/cricketcrawler/spiders/howstat.py

T20 League Player Records Required

Describe the Issue:
The current data folder contains player records from ODI matches only. It would make the model more relevant to include matches from other leagues as well.
Set up a PR once you are done with any T20 league (currently only the IPL is available on howstat.com).
Solution:
The webcrawler has been set up in feature-webcrawler. Add your solution to it
Try to keep the data in the same format as in the data folder

T20_Leagues list

IPL
Big Bash
CPL (Caribbean)

Comment if you would like to work on it.

Players performance against different teams

Describe the issue
Players who are out of form often rediscover it against a specific opponent. A great example is Steve Smith against India: no matter how his recent form is, he plays well against India. There are many more examples like that, and one of the problems with our model is that it doesn't take this into account.

Solution
There's no prototype solution for this. You can collect data from websites (make sure it's legal) and devise an algorithm or detect patterns; anything that improves the model's losses is welcome.
One solution we have in mind is to take into account both recent form and the player's performance against that specific team. If a player has had a rough patch in recent games but has an amazing record against that opponent, the model should take that into account so that it predicts a better 11. We think it might help in improving the model.
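
One simple way to combine the two signals is a weighted average of the player's recent fantasy points and their points against the upcoming opponent, as sketched below. The function `blended_form`, the `opponent_weight` parameter and its default of 0.5 are illustrative assumptions, not something the project has settled on. The weight would need tuning against the model's losses.

```python
from typing import Sequence


def blended_form(
    recent_points: Sequence[float],
    vs_opponent_points: Sequence[float],
    opponent_weight: float = 0.5,
) -> float:
    """Blend recent form with the record against one opponent."""
    recent_avg = sum(recent_points) / len(recent_points)
    if not vs_opponent_points:
        # No history against this opponent: fall back to recent form.
        return recent_avg
    opponent_avg = sum(vs_opponent_points) / len(vs_opponent_points)
    return opponent_weight * opponent_avg + (1 - opponent_weight) * recent_avg


# Poor recent form (average 25) but a strong record against the opponent (average 90).
print(blended_form([20, 30], [80, 100]))  # 57.5
```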

Comment if you would like to work on it

Test Player Records Required

Describe the Issue:
The current model uses player data from ODI matches only. Adding Test matches would improve the usability of the model.

Solution:
Employ a web-scraper to scrape player records from howstat.com for test matches and update them in the data folder.
Ensure the scraped data is in the same format as the files in the data folder.

Comment if you would like to work on it.

[DATA] Organizing data into one csv file

Is your data format related to a problem? Please describe.
Currently, we have about 337 files each for batting and bowling. Each player has his own CSV file, which won't scale as the number of players keeps increasing.

Describe the solution you'd like
Inside the zip folder there are two folders called ODI and T20. These two folders must be converted to single CSV files called zip_ODI.csv and zip_T20.csv.
The format of the CSV file is as follows:

player	matches	Date
player1	matchid1	date 1
	matchid2	date 2
	matchid3	date 3
player2	matchid1	date 1
.... and so on

Do the same for zip2, bowl and wk.
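
The target layout can be sketched as follows: the player name appears only on the first row of each player's block, and the remaining rows leave that column empty. The helpers `merge_player_records` and `to_csv_text` are hypothetical. The sketch assumes each per-player file has already been read into a list of (match id, date) pairs, because the columns of the existing files are not described here.

```python
import csv
import io
from typing import Dict, List, Tuple

Row = Tuple[str, str, str]


def merge_player_records(records: Dict[str, List[Tuple[str, str]]]) -> List[Row]:
    """Flatten per-player match lists into rows for a single CSV file."""
    rows: List[Row] = [("player", "matches", "Date")]
    for player, matches in records.items():
        for index, (match_id, date) in enumerate(matches):
            # Only the first row of each block carries the player name.
            rows.append((player if index == 0 else "", match_id, date))
    return rows


def to_csv_text(rows: List[Row]) -> str:
    """Serialise the rows as CSV text (write this to zip_ODI.csv, for example)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


records = {
    "player1": [("matchid1", "date 1"), ("matchid2", "date 2")],
    "player2": [("matchid1", "date 1")],
}
print(to_csv_text(merge_player_records(records)))
```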

Additional context
I'll create a separate branch for this once a PR is opened.

Check the appropriate choice

Organize data better
Adding data to the dataset

[FEATURE] Add circleci badge on README.md

Is your feature request related to a problem? Please describe.
Since the CI has been shifted to circleci, it would be great if the circleci badge was added to the readme file

Describe the solution you'd like
Reference: https://circleci.com/docs/2.0/status-badges/

Fantasy cricket API

We are looking for an API for any of the fantasy cricket platforms, as scraping them would be illegal as of now.
Any suggestions are welcome.

Preferably not very expensive, amazing if it would be free xD

Ignore retired players

Describe the bug
Retired players no longer play, but the web crawler still includes them. They need to be filtered out.

To Reproduce
Steps to reproduce the behavior:

Follow the instructions in README.md and notice once it starts collecting players. It can also be noticed in data_crawler/ids_names.csv.

Possible Solution
In crawler/cricketcrawler/spiders/howstat.py, in function parse_player:

# Skip retired players so that they never reach the dataset.
if not retired:
    yield PlayerItem(name=url[url.find("?PlayerID=") + 10:], gametype=gametype, folder=".", longname=name, retired=retired)

Screenshots

Desktop (please complete the following information):

Version [master]

No match found

I'm getting "There are no matches scheduled for the next 24 hours!".
IPL 2022 has started, but the above result is still shown. Can anyone help with this?

[FEATURE] Add batting crawler to the webcrawler

Is your feature request related to a problem? Please describe.
The crawler currently only crawls for new players and match ids. The web crawler also needs to crawl for batting stats.

Describe the solution you'd like
~~A function similar to parse_player and parse_scorecard~~ Extend function parse_scorecard in crawler/cricketcrawler/cricketcrawler/spiders/howstat.py to batting and get the following

runs
no of 4s and 6s
strike rate

Refer Dataset.md to understand the matchcodes and playercodes

Additional context
Crawler can be found in feature-webcrawler branch of the repo. Crawler is built using [scrapy](https://scrapy.org/)

Limit the countries in Web Scraping

The scraper gets players from some countries whose matches the fantasy cricket platforms don't host on their websites.
The same is true for matches.

Make the crawler take only the players/matches from the following countries:

Country
India
England
Australia
Bangladesh
New Zealand
South Africa
West Indies
Pakistan
Ireland
Afghanistan
Sri Lanka
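
The list above could be applied in the crawler as a simple allow-list check, sketched below. The names `ALLOWED_COUNTRIES` and `is_allowed_country` are hypothetical, and how the country string is extracted from a scraped page is not covered here. Comparison is case-insensitive on the assumption that scraped names may vary in capitalisation and surrounding whitespace.

```python
# Countries taken from the list in the issue.
ALLOWED_COUNTRIES = frozenset(
    {
        "India",
        "England",
        "Australia",
        "Bangladesh",
        "New Zealand",
        "South Africa",
        "West Indies",
        "Pakistan",
        "Ireland",
        "Afghanistan",
        "Sri Lanka",
    }
)

# Lower-cased copy used for case-insensitive matching.
_ALLOWED_LOWER = frozenset(name.lower() for name in ALLOWED_COUNTRIES)


def is_allowed_country(country: str) -> bool:
    """Return True if the player or match belongs to a supported country."""
    return country.strip().lower() in _ALLOWED_LOWER


scraped = ["India", "Kenya", " sri lanka ", "Zimbabwe"]
print([c for c in scraped if is_allowed_country(c)])  # ['India', ' sri lanka ']
```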

Setup Web-Crawler for daily updates

Describe the Issue:
The player records in the data folder are outdated and static. Thus, they may not be enough to accurately predict player performances in current matches. Previous records were created from web-scraped data from howstat.com.

Solution:
Keep the records up to date using web-scraping for daily updates and reflect those changes in the data folder.

Comment if you would like to work on this
