haodoo's Introduction

haodoo

This is a web crawler for the Haodoo (好讀) website. It finds the download links for the site's ebooks.

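Setting up the environment

  1. Install pip, the Python package manager (get-pip.py must already be in the current directory)
    sudo python3 get-pip.py
  2. Install virtualenv and create an environment in a directory named bin
    pip3 install virtualenv
    virtualenv bin
  3. Activate the virtualenv
    source bin/bin/activate
  4. Install the required packages
    pip3 install -r requirements.txt
  5. Leave the virtualenv when you are done
    deactivate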

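Usage

This tool downloads the ebooks in three steps:

  • Step 1: Download the index pages and extract the links to the book pages they contain
    python3 parser.py -t generate_book_page_links
    
  • Step 2: Download the book pages and extract the ebook download links from each one
    python3 parser.py -t generate_book_download_links
    
  • Step 3: Download, one after another, the ebooks listed under "book_link"
    python3 parser.py -t download_book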
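
The three commands above can also be chained from a small driver script. The following is only a sketch: `build_command` and `run_all` are hypothetical helper names that are not part of this repository, and it assumes the script is started from the directory that contains parser.py.

```python
import subprocess
import sys

# Task names taken from the three steps above.
TASKS = [
    "generate_book_page_links",
    "generate_book_download_links",
    "download_book",
]


def build_command(task, python=sys.executable, script="parser.py"):
    """Return the argument list for one step: <python> parser.py -t <task>."""
    return [python, script, "-t", task]


def run_all(runner=subprocess.run):
    """Run the steps in order and stop at the first one that fails."""
    for task in TASKS:
        # check=True makes subprocess.run raise CalledProcessError
        # when the step exits with a non-zero status.
        runner(build_command(task), check=True)
```

Calling `run_all()` runs the steps in order. Because each later step reads what the earlier one wrote, stopping at the first failure avoids working from incomplete link lists.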
    

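The steps are explained in more detail below:

  1. The index pages currently listed are
  • 世紀百強 (Top 100 of the Century)
  • 隨身智囊 (Pocket Advisor)
  • 歷史煙雲 (History)
  • 武俠小說 (Wuxia novels)
  • 懸疑小說 (Mystery novels)
  • 言情小說 (Romance novels)
  • 奇幻小說 (Fantasy novels)
  • 小說園地 (Fiction corner)
  • 有聲書 (Audiobooks)

If you only want to download some of these categories, edit project_config/config.json (the file must remain valid JSON). For example, to download only the "歷史煙雲" category, change it to: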

{
  "target_link":[
      {"title":"歷史煙雲", "link":"http://www.haodoo.net/?M=hd&P=history"}
  ]
}
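
If you would rather generate the trimmed config than edit it by hand, a sketch like the one below may help. It assumes the file has the shape shown above (a "target_link" list of objects with "title" and "link" keys). `filter_targets` and `dump_config` are hypothetical helpers, and the second link in `example` is a placeholder, not a real Haodoo URL.

```python
import json


def filter_targets(config, wanted_titles):
    """Return a new config keeping only entries whose title is wanted."""
    wanted = set(wanted_titles)
    kept = [entry for entry in config["target_link"] if entry["title"] in wanted]
    return {"target_link": kept}


def dump_config(config):
    """Serialize the config as JSON text."""
    # ensure_ascii=False keeps the Chinese titles readable in the file.
    return json.dumps(config, ensure_ascii=False, indent=2)


# Illustrative input only; the second link is a placeholder.
example = {
    "target_link": [
        {"title": "歷史煙雲", "link": "http://www.haodoo.net/?M=hd&P=history"},
        {"title": "武俠小說", "link": "http://example.invalid/placeholder"},
    ]
}

history_only = filter_targets(example, ["歷史煙雲"])
config_text = dump_config(history_only)
```

Writing `config_text` to project_config/config.json with UTF-8 encoding would then produce a file equivalent to the hand-edited example. Since the text comes from `json.dumps`, it cannot contain syntax slips such as a trailing comma.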
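  2. Five directories are created in total:

    Index pages are stored in the "index_html" directory.

    Links to the book pages are stored in the "index_link" directory.

    Book pages are stored in the "book_html" directory.

    Ebook download links are stored in the "book_link" directory.

    Ebooks are stored in the "ebook" directory. All formats (updb, prc, mobi, epub, vepub) are downloaded.

  3. To avoid overloading the Haodoo site, the tool waits several minutes between downloads, so please be patient.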
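
The directory layout and the pause between downloads can be pictured roughly as follows. This is a sketch of the idea, not the code in parser.py: `ensure_directories` and `throttled` are hypothetical names, and any delay value you pass in is your own choice, since the README only says "several minutes".

```python
import os
import time

# Directory names as listed above.
DIRECTORIES = ["index_html", "index_link", "book_html", "book_link", "ebook"]


def ensure_directories(base="."):
    """Create the five working directories under base if they are missing."""
    for name in DIRECTORIES:
        os.makedirs(os.path.join(base, name), exist_ok=True)


def throttled(items, delay_seconds, sleep=time.sleep):
    """Yield items one at a time, sleeping between consecutive items."""
    for index, item in enumerate(items):
        if index > 0:
            # No wait before the first item, one wait before each later item.
            sleep(delay_seconds)
        yield item
```

A download loop could then be written as `for link in throttled(links, delay_seconds): ...`, which keeps the waiting logic in one place and makes the delay easy to adjust.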

haodoo's People

Contributors

starsdog

haodoo's Issues

python3 parser.py -t generate_book_page_links

1. Downloaded master.zip and unzipped it.
2. Running the first step, 'python3 parser.py -t generate_book_page_links', produced the following error:
Traceback (most recent call last):
  File "parser.py", line 197, in <module>
    project_config=json.load(file)
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 12 column 5 (char 682)
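
The traceback shows that the config file failed to parse as JSON, and the line and column in the last line point to where the parser gave up. A small checker like the sketch below can show the offending line before running the tool. `describe_json_error` is a hypothetical helper, and the `broken` sample uses a trailing comma purely as an illustration of one common mistake. The issue does not say what was actually wrong with the file.

```python
import json


def describe_json_error(text):
    """Return None if text is valid JSON, else a message locating the problem."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # lineno and colno are 1-based positions reported by the decoder.
        bad_line = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        pointer = " " * (err.colno - 1) + "^"
        return f"{err.msg}: line {err.lineno} column {err.colno}\n{bad_line}\n{pointer}"
    return None


# Hypothetical broken config: a trailing comma after the only entry.
broken = '{\n  "target_link": [\n    {"title": "a", "link": "b"},\n  ]\n}'
report = describe_json_error(broken)
```

Printing `report` shows the decoder's message, the line it refers to, and a caret under the reported column. The exact wording and position depend on the Python version.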
