wuxiaworld's Introduction

About:

Python Script To Copy WuxiaWorld Chapters Into EPUB File.

Copies The Novel Chapters Along With The Novel Details. Occasionally (About Once Every 6-10 Runs) The Cover Image Is Not Copied. The Cause Is Unknown; It May Be An Internal BeautifulSoup4 Problem.

How Does The Script Work ? Just Enter The Novel URL Inside The Script And The Rest Follows.

I'll Try To Add Any Necessary Updates.

Initial Implementation By : Aundinn


Note :

Check Out This Other Novel Website: https://wxuiaworld.co. Why This Website? It Has Novels From Webnovel (Qidian) And WuxiaWorld With All The Latest Chapters Unlocked. No Spirit Stones, No Patreon, No Subscription Or Anything Else Is Required To Read The Latest Chapters. Don't Take My Word For It; Check It Out.


Task(s) :

  • Get The List Of Chapters From The Novel Website And Use The Links From That List Rather Than Stepping Through Chapters Sequentially, Because Some Pages Do Not Have Sequential Names.
  • Implement multiprocessing to speed up process.
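Neither task is implemented yet, so the following is only a rough sketch of how both could look, not the script's actual code. It makes several assumptions: chapter URLs share a substring (the hypothetical `marker` value `-chapter-`), the index page HTML and the `example.com` base URL are made up, and `fetch` is any function that takes a URL and returns chapter text. To stay runnable without extra packages it uses the standard library's `html.parser` instead of BeautifulSoup, and it uses a thread pool rather than multiprocessing because downloading is mostly waiting on the network.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin


class ChapterLinkParser(HTMLParser):
    """Collects the href of every <a> tag whose URL contains `marker`."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # assumption: substring shared by chapter URLs
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep page order, drop duplicates
                self.links.append(url)


def chapter_links(index_html, base_url, marker="-chapter-"):
    """Return absolute chapter URLs in the order they appear on the index page."""
    parser = ChapterLinkParser(base_url, marker)
    parser.feed(index_html)
    return parser.links


def download_all(urls, fetch, workers=4):
    """Call fetch(url) concurrently; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))


# Made-up index page with a non-sequential chapter name ("1-5").
sample_index = """
<ul>
  <li><a href="/novel/abc/abc-chapter-1">Chapter 1</a></li>
  <li><a href="/novel/abc/abc-chapter-1-5">Chapter 1.5</a></li>
  <li><a href="/novel/abc/abc-chapter-2">Chapter 2</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

links = chapter_links(sample_index, "https://example.com/")


def fake_fetch(url):
    # Stand-in for a real download such as requests.get(url).text
    return "text of " + url.rsplit("/", 1)[-1]


pages = download_all(links, fake_fetch)
```

Because the links come from the index page itself, a chapter named `abc-chapter-1-5` is picked up in its proper place instead of being skipped by a counter that goes 1, 2, 3.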

Problem(s) :

  • None Yet (Report If Any).

Screenshot :

Image Not Available

Documentation :

  1. For Beginners: After Setting Up A Working Python 3 Environment (Along With The Latest pip), You Need To Install Some Packages. To Install Them, Run These Commands In Your CMD/Terminal :

    • pip3 install bs4
    • pip3 install ebooklib
    • pip3 install requests
    • pip3 install html5lib=="0.9999999"
  2. Download The Python Script And Unzip It.

  3. Open The Script With A Text Editor And Read The Details Inside.

  4. In Case The Script Has Not Been Updated To Match Changes In The Website, You Can Refer To The BeautifulSoup Docs And Adjust It Accordingly.

  5. To Run, Open CMD/Terminal, Navigate To The Unzip Location And Type :

    • Linux - python3 code.py
    • Windows - python code.py or py code.py
  6. EPUB File Will Be Saved At The Location Of Script.

Working :

Parsing :

html5lib Is Used Because, Although It Is A Tiny Bit Slower, It Generates Valid HTML. You Can Compare It With The Other Parsers In The "Differences Between Parsers" Section Of The BS4 Docs. I've Copied The Table From The BS4 Website Below To Give A Brief Overview.

Python’s html.parser
  • Typical usage: BeautifulSoup(markup, "html.parser")
  • Advantages: Batteries included; Decent speed; Lenient (as of Python 2.7.3 and 3.2)
  • Disadvantages: Not very lenient (before Python 2.7.3 or 3.2.2)

lxml’s HTML parser
  • Typical usage: BeautifulSoup(markup, "lxml")
  • Advantages: Very fast; Lenient
  • Disadvantages: External C dependency

lxml’s XML parser
  • Typical usage: BeautifulSoup(markup, "lxml-xml") or BeautifulSoup(markup, "xml")
  • Advantages: Very fast; The only currently supported XML parser
  • Disadvantages: External C dependency

html5lib
  • Typical usage: BeautifulSoup(markup, "html5lib")
  • Advantages: Extremely lenient; Parses pages the same way a web browser does; Creates valid HTML5
  • Disadvantages: Very slow; External Python dependency

If Any Problem Occurs With html5lib :

  • In Case You Update It Accidentally, You Can Reinstall The Specific Version By Running The html5lib Install Command From Step 1 Of The Documentation.
  • Another Choice: Change html5lib To lxml If It Is Installed, Otherwise To Python's Built-In html.parser.
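The fallback order described above could be automated along these lines. This is a minimal sketch rather than anything the script currently does: `choose_parser` is a hypothetical helper name, and it only checks whether a package can be imported, not whether its version is the one the script expects.

```python
from importlib.util import find_spec


def choose_parser(preferred=("html5lib", "lxml")):
    """Return the first installed parser name, else Python's built-in one."""
    for name in preferred:
        # find_spec returns None when a top-level package is not installed
        if find_spec(name) is not None:
            return name
    return "html.parser"


parser_name = choose_parser()
```

The returned string can be passed as the second argument to BeautifulSoup, for example `BeautifulSoup(markup, choose_parser())`.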

License

Copyright © 2018 Kogam22. Released under the terms of the Apache 2.0 license.

Contributors :

1ycx
