asinggih / skillselect-scraper

Scraping SkillSelect's latest invitation round details

Python 100.00%

scraping-websites

SkillSelect Latest Invitation Round Scraper

My first attempt at using BeautifulSoup4 and Selenium to scrape a website. It scrapes the occupations and the minimum points needed for the latest available SkillSelect invitation round.

Running the program creates current_list.txt, which contains the latest invitation round date and the details of the pro rata occupations. Every time we execute ./ss_scrape.py, it cross-checks the live site's date against the date inside current_list.txt. If they differ, it updates the text file.
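
The cross-check described above can be sketched roughly as follows. This is a minimal illustration, not the code in ss_scrape.py: the function names `read_stored_date` and `update_if_changed` are hypothetical, and it assumes the round date sits on the first non-empty line of the file, as in the example content of `current_list.txt`.

```python
from pathlib import Path


def read_stored_date(path):
    """Return the first non-empty line of the file, or None if the file is missing or empty."""
    p = Path(path)
    if not p.exists():
        return None
    for line in p.read_text(encoding="utf-8").splitlines():
        if line.strip():
            return line.strip()
    return None


def update_if_changed(path, live_date, live_details):
    """Rewrite the file only when the live round date differs from the stored one.

    Returns True if the file was written, False if it was already up to date.
    (Hypothetical helper; the scraped date and details are passed in as strings.)
    """
    live_date = live_date.strip()
    if read_stored_date(path) == live_date:
        return False
    Path(path).write_text(f"{live_date}\n\n{live_details}\n", encoding="utf-8")
    return True
```

Returning a boolean makes it easy for a caller to decide whether anything else, such as a notification, needs to happen after a run.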

Example Content of `current_list.txt`

11 September 2018

                               name                           minimum_point    date_of_effect
id
2211  Accountants                                                  80        25/05/2018 9:59am
2212  Auditors, Company Secretaries and Corporate Treasurers       80        1/05/2018 10:54am
2334  Electronics Engineer                                         70        15/11/2017 10:32am
2335  Industrial, Mechanical and Production Engineers              70        18/01/2018 9:55pm
2339  Other Engineering Professionals                              75        3/07/2018 6:37pm
2611  ICT Business and System Analysts                             75        28/05/2018 6:25pm
2613  Software and Applications Programmers                        75        20/08/2018 3:13pm
2631  Computer Network Professionals                               70        17/01/2018 11:36am


The data in this table is licenced under a Creative Commons attribution 3.0 Australia licence,
attributed to Australian Government Department of Home Affairs
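
If the dates in this file need to be compared or sorted rather than matched as plain strings, they can be parsed with the standard library. The sketch below assumes the two formats shown in the example (`11 September 2018` for the round date and `25/05/2018 9:59am` for `date_of_effect`) and an English locale. The function names are hypothetical.

```python
from datetime import date, datetime


def parse_round_date(text):
    """Parse a round date such as '11 September 2018' into a date object."""
    return datetime.strptime(text.strip(), "%d %B %Y").date()


def parse_date_of_effect(text):
    """Parse a date_of_effect value such as '18/01/2018 9:55pm' into a datetime."""
    return datetime.strptime(text.strip(), "%d/%m/%Y %I:%M%p")
```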

How to Run The Script

Install chromedriver using your OS's package manager (e.g., brew)
Install Python 3
virtualenv env -p python3
source ./env/bin/activate
pip install -r requirements.txt
./ss_scrape.py

TODO

automatically send me an email if there are changes
add this script to crontab
[] find a better way to send email
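
One possible direction for the email item, using only Python's standard library, is sketched below. It is not part of the current script: the function names, the addresses, and the SMTP host and port are all placeholders, and a real mail provider will typically also require authentication.

```python
import smtplib
from email.message import EmailMessage


def build_change_email(round_date, details, sender, recipient):
    """Build a plain-text notification for a new invitation round (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = f"SkillSelect invitation round updated: {round_date}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(details)
    return msg


def send_email(msg, host="localhost", port=25):
    """Send the message through an SMTP server; host and port are assumptions."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)
```

Keeping message construction separate from sending means the message can be inspected or tested without a mail server.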

skillselect-scraper's Issues

would love to see a demo :)
