dreamcobbler / fiction-dl
A content downloader, capable of retrieving works of (fan)fiction from the web and saving them in a few common file formats.
License: GNU General Public License v3.0
I'm not good at coding, so I just want to know, how do I comment out/disable some of the text formatting stuff? Specifically the line break editor. There's plenty of older fic that has weirder typography that it automatically reads as 'intended to be line breaks' and replaces it with a
When I tried to convert an HTML file containing single-frame GIFs, the images failed to download, giving the message "! Failed to download image...".
I installed this using pip, but when I tried to download a story from HF I got an error related to dreamy-utilities (I don't remember exactly what). It worked perfectly after I installed dreamy-utilities using pip, though.
On a side note I was trying to scrape stories from HF when I found this on pypi.
I'm new to programming and was trying to get an epub from the scraped html strings(chapters). Now that I found this I don't need to go through that anymore. Thanks for creating this tool. Really appreciate it.
Here is the error (I didn't reduce any details, please don't be offended!)
Creating the extractor...
Scanning the story...
┌─────────────────┬───────────────────────┐
│ Title: │ Bitchbreaker Lucifer │
├─────────────────┼───────────────────────┤
│ Author: │ Delaware │
├─────────────────┼───────────────────────┤
│ Date published: │ Oct 16, 2020 │
├─────────────────┼───────────────────────┤
│ Date updated: │ Oct 19, 2020 │
├─────────────────┼───────────────────────┤
│ Chapter count: │ 3 │
├─────────────────┼───────────────────────┤
│ Word count: │ 19,945 │
└─────────────────┴───────────────────────┘
Extracting content...
Downloading images...
Processing content...
Formatting and saving the story...
ERROR:root:Failed to format the stories as HTML.
Traceback (most recent call last):
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\anangaya\AppData\Local\Programs\Python\Python38-32\Scripts\fiction-dl.exe\__main__.py", line 7, in <module>
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\site-packages\fiction_dl\__main__.py", line 93, in Main
Application(
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\site-packages\fiction_dl\Core\Application.py", line 181, in Launch
self._FormatAndSaveStoryOrPackage(newlyDownloadedStory)
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\site-packages\fiction_dl\Core\Application.py", line 486, in _FormatAndSaveStoryOrPackage
if not formatter.FormatAndSave(story, filePaths["ODT"]):
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\site-packages\fiction_dl\Formatters\FormatterODT.py", line 215, in FormatAndSave
with ZipFile(filePath, mode = "a") as outputArchive:
File "c:\users\anangaya\appdata\local\programs\python\python38-32\lib\zipfile.py", line 1251, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'fiction-dl Downloads\Delaware\Bitchbreaker Lucifer \Bitchbreaker Lucifer .odt'
It happened when I was processing multiple links in a text file, so the program getting terminated is really annoying. Especially because the links are being processed randomly.
It would be great if you can make it move to the next link without terminating the entire program while giving message about the failure. Fixing the error is always welcome.
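The behaviour requested above (report the failure, then move on to the next link) can be sketched roughly as follows. This is a minimal illustration, not fiction-dl's actual code: `process_all` and the `process_one` callback are hypothetical names standing in for whatever downloads and saves a single story.

```python
import logging
from typing import Callable, Iterable, List, Tuple


def process_all(
    urls: Iterable[str],
    process_one: Callable[[str], None],  # hypothetical per-story handler
) -> List[Tuple[str, str]]:
    """Run process_one on every URL; log failures instead of aborting."""
    failures: List[Tuple[str, str]] = []
    for url in urls:
        try:
            process_one(url)
        except Exception as error:  # deliberately broad: one bad link must not stop the batch
            logging.error('Skipping "%s": %s', url, error)
            failures.append((url, str(error)))
    return failures
```

The returned list makes it possible to print a summary of the links that failed once the whole batch has finished.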
Most Nifty links do not work for me.
It returns:
Scanning the story...
ERROR:root:Failed to read metadata from the first chapter of the story.
ERROR:root:Failed to scan the story.
OR
Scanning the story...
ERROR:root:List of chapters not found.
ERROR:root:Failed to scan the story.
In my experience, under the same circumstances, some links always work, but most just do not.
f-dl : The term 'f-dl' is not recognized as the name of a cmdlet, function, script file, or operable program. Check
the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ CategoryInfo : ObjectNotFound: (f-dl:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
I tried everything, including upgrading, and it just errors out on any link from other sites. Nifty.org does not work either.
Isn't there supposed to be an f-dl exe file included with this?
It's not working for Literotica anymore. Please fix it. The channel support feature is going to be missed for a while 😔
Something I've noticed is some older fic (pre ~2012 or so) has an issue, where, likely due to the older formatting of the fic, words become mashed together likethis. From what I can see, it's due to the original's site formatting in the html, where basically almost arbitrarily lines break mid-sentence rather than text (but with no
or anything). Due to this, the text is pulled together without a space. An example can be seen here; the html arbitrarily cuts itself in half. The resulting downloads have the words on the ends of the lines smashed together.
Is there any way to remedy this? I've tried tweaking the source, but I only have incredibly rudimentary coding skills so there isn't a ton I can really do.
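One possible remedy, sketched under the assumption that the text is available as a plain string before formatting: treat a single line break inside a paragraph as a space, and keep blank lines as paragraph separators. The function name `join_wrapped_lines` is made up for this example and is not part of fiction-dl.

```python
import re


def join_wrapped_lines(text: str) -> str:
    """Replace hard line wraps inside paragraphs with single spaces."""
    # Blank lines (possibly containing stray whitespace) separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    # str.split() with no arguments splits on any whitespace run, newlines included,
    # so re-joining with " " turns a mid-sentence line break into one space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs if p.strip())
```

Applied to text where a sentence is cut across two lines, the words at the ends of the lines stay separated by a space instead of being concatenated.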
Could you add support for rtenzo.net/rtenzo, so that EPUBs can be made from its stories, including their images?
When converting an HTML file that uses the same image in several places, fiction-dl downloads and saves that image several times, which is not necessary.
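A common way to avoid repeated downloads is to cache images by URL. The sketch below assumes a caller-supplied `fetch` function that returns the image bytes; both `download_unique` and `fetch` are illustrative names, not fiction-dl functions.

```python
from typing import Callable, Dict, Iterable


def download_unique(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],  # hypothetical downloader: URL -> image bytes
) -> Dict[str, bytes]:
    """Fetch each distinct URL once, however many times it appears."""
    cache: Dict[str, bytes] = {}
    for url in urls:
        if url not in cache:  # repeated URLs reuse the earlier result
            cache[url] = fetch(url)
    return cache
```

Every `img` tag that points at the same URL can then reference the single saved copy.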
Fiction-dl downloads all the images, but most of the img tags are missing from the xhtml file(s). So the images are not displayed when reading the ebook.
I tried to download an article from Quotev.com, but the author's name is in Simplified Chinese, so the output directory could not be created.
Is there any way to skip that problem? Thanks!
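The report does not say why the directory could not be created. One plausible cause, stated here purely as an assumption, is a name sanitizer that discards non-ASCII characters and ends up with an empty folder name. A sketch of a sanitizer that keeps Unicode, removes only characters Windows forbids in file names, and falls back to a placeholder could look like this (`safe_component` and the `"Unknown"` fallback are hypothetical):

```python
import re

# Characters that Windows does not allow in file or directory names.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_component(name: str, fallback: str = "Unknown") -> str:
    """Turn a title or author name into a usable path component."""
    cleaned = _FORBIDDEN.sub("", name)
    # Windows also mishandles names that end in a space or a period.
    cleaned = cleaned.strip().rstrip(". ")
    return cleaned or fallback  # never return an empty component
```

Non-Latin names pass through unchanged, and a name made up entirely of forbidden characters yields the fallback instead of an empty string.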
AO3 offers downloads in multiple formats, namely: AZW3, EPUB, MOBI, PDF, HTML
Direct links to these are very easy to get: https://archiveofourown.org/downloads/<story_id>/a.<extension>
(the "a" here can be any text; by default it is the story's name, but it does not matter)
This has the potential to simplify the downloading and reduce load on ao3 and potential 429 - Too Many Requests errors.
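Building such a link from the pattern given above is straightforward. In this sketch the function name `ao3_download_url` and the story ID in the usage note are made up for illustration; the URL pattern and the list of formats are taken from the description above.

```python
from urllib.parse import quote

# Formats listed above as offered by AO3.
AO3_FORMATS = ("azw3", "epub", "mobi", "pdf", "html")


def ao3_download_url(story_id: int, extension: str, name: str = "a") -> str:
    """Build a direct download link following the pattern described above."""
    ext = extension.lower()
    if ext not in AO3_FORMATS:
        raise ValueError(f"Unsupported format: {extension}")
    # The file name part is arbitrary; it is percent-encoded so any text is safe in a URL.
    return f"https://archiveofourown.org/downloads/{story_id}/{quote(name, safe='')}.{ext}"
```

For a hypothetical story ID 12345, `ao3_download_url(12345, "epub")` yields `https://archiveofourown.org/downloads/12345/a.epub`.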
When downloading fanfiction and a Cloudflare challenge occurs during the request, the request fails and the download process is aborted.
Seems F-DL only works on plain-text chapters, but unfortunately some authors use HTML pages for each chapter and the extractor doesn't seem to work with those, generating the error:
ERROR:root:Failed to read metadata from the first chapter of the story.
ERROR:root:Failed to scan the story.
Literotica recently did a small site redesign, and it seems to have broken the extractor. However I try to download (either the story directly or the author page), I get a "Failed to download a story" error.
Please support AO3 Series, to make downloading an entire series easier.
Adding to this, a feature to combine all downloaded stories into one document would be awesome (maybe a command-line flag that also works when providing a list?)
I've been getting this error the past few days with ff.net fics:
ERROR:root:Failed to download page: "[URL]".
ERROR:root:Failed to scan the story.
Is it something on my end, or has ff.net updated something? I've tried updating f-dl, but it still keeps happening.
system: linuxmint 19.3, pip 9.0.1
command: python3 -m pip install --upgrade fiction-dl
error:
Collecting fiction-dl
#Could not find a version that satisfies the requirement fiction-dl (from versions: )
No matching distribution found for fiction-dl
command: python3 -m pip install ficiton-dl
error:
Collecting ficiton-dl
Exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
wb.build(autobuilding=True)
File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 554, in _prepare_file
require_hashes
File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 278, in populate_link
self.link = finder.find_requirement(self, upgrade)
File "/usr/lib/python3/dist-packages/pip/index.py", line 465, in find_requirement
all_candidates = self.find_all_candidates(req.name)
File "/usr/lib/python3/dist-packages/pip/index.py", line 423, in find_all_candidates
for page in self._get_pages(url_locations, project_name):
File "/usr/lib/python3/dist-packages/pip/index.py", line 568, in _get_pages
page = self._get_page(location)
File "/usr/lib/python3/dist-packages/pip/index.py", line 683, in _get_page
return HTMLPage.get_page(link, session=self.session)
File "/usr/lib/python3/dist-packages/pip/index.py", line 795, in get_page
resp.raise_for_status()
File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/models.py", line 935, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://pypi.org/simple/ficiton-dl/
Some threadmarks are hidden, and hidden threadmarks are not downloaded. For example, for https://forums.spacebattles.com/threads/this-wont-end-well-30k-isekai.587209/threadmarks the result is 100 chapters. Command used: fiction-dl https://forums.spacebattles.com/threads/this-wont-end-well-30k-isekai.587209/threadmarks
I just installed the latest release, and now when I try to do anything I get the following error message:
Traceback (most recent call last):
File "__main__.py", line 34, in <module>
from Utilities.Filesystem import AddToPATH, GetPackageDirectory
ImportError: cannot import name 'GetPackageDirectory' from 'Utilities.Filesystem' (C:\Users\Betsybugaboo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fiction_dl\Utilities\Filesystem.py)
What do I need to do to resolve this?
I think it would be better if the HTML content did not have to be placed inside a txt file. Instead, fiction-dl could be given a txt file that contains the metadata and the names (or paths) of the HTML files (the chapters of the story). This would make it easier to create EPUBs of stories that have a lot of chapters.