twistedhelscan scraper

Install Dependencies

To start scraping twistedhelscans, you must first install all of the packages listed in requirements.txt. It's recommended that you do this using virtualenv.

$ pip install virtualenv
$ virtualenv -p python3 scraper_env

Activate the virtualenv

$ source scraper_env/bin/activate

Install dependencies

$ pip install -r requirements.txt

Now you need to download a webdriver that will allow Python to control your browser. I tested this using Chrome, so I used ChromeDriver: https://sites.google.com/a/chromium.org/chromedriver/downloads

Download the latest webdriver and extract it to ~/chromedriver.
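Before running the scraper, it can help to confirm that the driver ended up where it is expected. The following is a minimal sketch, not part of the repository: find_chromedriver is a hypothetical helper, and it assumes that ~/chromedriver is either the driver binary itself or a directory containing a file named chromedriver.

```python
import os
from typing import Optional


def find_chromedriver(path: str = "~/chromedriver") -> Optional[str]:
    """Return the path to an executable chromedriver binary, or None.

    Hypothetical helper: accepts either the binary itself or a directory
    that contains a file named "chromedriver".
    """
    expanded = os.path.expanduser(path)
    candidates = [expanded, os.path.join(expanded, "chromedriver")]
    for candidate in candidates:
        # The file must exist and carry the executable bit.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    driver = find_chromedriver()
    print(driver or "chromedriver not found or not executable")
```

If this prints a path, the binary is in place and executable. Whether download.py looks in exactly this location is something to confirm in the source.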

Specify the Chapters to Download

You're almost ready to run things. Create a chapters.yaml file listing the chapters you want to download, along with the volumes that they belong to. Each key is a volume, and the list of integers under it is the set of chapters you want to download from that volume.

1:
  - 1
  - 2
  - 3
2:
  - 10

With this YAML file, the scraper downloads chapters 1, 2, and 3 from volume 1 and chapter 10 from volume 2. Check Wikipedia or the wikia to see which chapters belong to each volume.
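To make the mapping concrete, here is a minimal sketch of how such a file could be read and flattened into (volume, chapter) pairs. It is not the code in download.py: the function names load_chapters and flatten are hypothetical, and it assumes PyYAML is available (the README does not say which YAML library requirements.txt installs).

```python
from typing import Dict, List, Tuple


def load_chapters(path: str) -> Dict[int, List[int]]:
    """Read chapters.yaml into a {volume: [chapters]} dict."""
    import yaml  # PyYAML; assumed to be installed, not confirmed by the README

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def flatten(volumes: Dict[int, List[int]]) -> List[Tuple[int, int]]:
    """Turn {volume: [chapters]} into an ordered list of (volume, chapter)."""
    return [
        (volume, chapter)
        for volume in sorted(volumes)
        for chapter in volumes[volume]
    ]


# The example file above parses to this structure.
example = {1: [1, 2, 3], 2: [10]}
print(flatten(example))  # [(1, 1), (1, 2), (1, 3), (2, 10)]
```

Keeping the volume alongside each chapter matters because the file is keyed by volume; a flat list of chapter numbers alone would lose that information.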

Run It

$ python download.py chapters.yaml

Manga pages will be downloaded and saved to directories based on chapter.

Todo:

  • Refactor code
  • Make it generic. Right now it only works on Tokyo Ghoul:re unless you edit the source.
  • I never tested partial chapters like 31.5. These chapters have different URLs, which I can't test because TwistedHelScans doesn't have chapters 30 or higher.

Contributors

  • ksarge
