upwork_scraper

Project Title

A Scrapy spider that scrapes the Upwork website for higher-budget jobs matching given keywords.

Prerequisites

Python 3.7 or later
Scrapy 1.6.0 or later
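
To confirm that an environment meets the two requirements above, a small check like the following can help. This is only a sketch and not part of the project: the helper names `parse_version` and `check_prerequisites` are made up for illustration, and the check relies on the `scrapy.__version__` attribute that Scrapy exposes.

```python
import sys


def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad so that '1.6' compares equal to '1.6.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_prerequisites(min_python=(3, 7), min_scrapy=(1, 6, 0)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or later is required" % min_python)
    try:
        import scrapy
    except ImportError:
        problems.append("Scrapy is not installed")
    else:
        if parse_version(scrapy.__version__) < min_scrapy:
            problems.append("Scrapy %s is older than 1.6.0" % scrapy.__version__)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

If the script prints nothing, both prerequisites appear to be satisfied.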

How to install prerequisites

If you don't have Python:

First, install Anaconda Navigator.

  1. Go to https://www.anaconda.com/distribution/.
  2. Choose your operating system and install Anaconda for Python 3.7 or later.
  3. Tutorials are available here: https://docs.anaconda.com/anaconda/install/, https://www.youtube.com/watch?v=6LXwdjdACWM

Second, install Scrapy.

  1. Open Anaconda Navigator.
  2. Go to Environments.
  3. Press the "play" button and open a terminal.
  4. In the terminal, type: conda install scrapy

If you already have Python:

Just install Scrapy:

  1. Open a terminal.
  2. Type: pip3 install scrapy

Running a spider

To run the spider, download the project and execute the runner.py file:

  1. Download the project.
  2. Open a terminal.
  3. Go to the project root directory.
  4. From the project root directory, type in the terminal: python3 runner.py
  5. The spider starts and prompts you for the following:
  • Enter a keyword and press Enter (just press Enter to scrape all jobs).
  • Enter the minimum budget and press Enter (just press Enter to set the budget to 0).
  • Enter the maximum post age in days and press Enter (just press Enter to set no upper limit on post age).
  • Enter the maximum number of jobs to scrape and press Enter (just press Enter to set no upper limit on the number of jobs).
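
The "just press Enter" defaults described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in runner.py: the function names, prompt strings, and dictionary keys are hypothetical, and only the default behaviour (all jobs, budget 0, no upper limits) is taken from the list above.

```python
def parse_keyword(raw):
    """Blank input means 'no keyword filter', i.e. scrape all jobs."""
    text = raw.strip()
    return text or None


def parse_int(raw, default):
    """Return `default` for blank input, otherwise a non-negative integer."""
    text = raw.strip()
    if not text:
        return default
    value = int(text)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError("value must not be negative")
    return value


def read_settings(ask=input):
    """Collect the four answers; `ask` is injectable so it can be tested."""
    return {
        "keyword": parse_keyword(ask("Keyword: ")),
        "min_budget": parse_int(ask("Min budget: "), 0),
        "max_age_days": parse_int(ask("Max post age in days: "), None),
        "max_jobs": parse_int(ask("Max number of jobs: "), None),
    }
```

Here `None` stands for "no limit", and passing a custom `ask` function makes it possible to exercise the defaults without typing at a prompt.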

Enter the keywords without spelling mistakes!

  6. When the spider finishes scraping, a completion message is shown.
  7. The scraped jobs are saved to a CSV file in the same directory as the project.
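
Once the CSV file exists, it can be inspected with Python's standard `csv` module. The sketch below is an assumption-heavy example: the column names `title` and `budget` are hypothetical (check the header row of the real file), and the in-memory sample stands in for the file the spider writes.

```python
import csv
import io


def jobs_with_min_budget(csv_file, min_budget, budget_column="budget"):
    """Return rows whose budget column parses to a number >= min_budget."""
    selected = []
    for row in csv.DictReader(csv_file):
        try:
            budget = float(row[budget_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric budget
        if budget >= min_budget:
            selected.append(row)
    return selected


# Stand-in data; a real run would open the CSV written by the spider instead.
sample = io.StringIO("title,budget\nLogo design,50\nScraper,500\nBlog post,\n")
for job in jobs_with_min_budget(sample, 100):
    print(job["title"], job["budget"])
```

For a real file, replace the `io.StringIO` sample with `open(path, newline="")`, where `path` points to the CSV produced by the run.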
