bitbybyte / fantiadl
Download posts and media from Fantia
License: MIT License
A crucial (in my opinion) function that is hopefully easy to implement, either as default or with a parameter.
Example posts (500 Yen plan):
Resolution differences (downloaded with fantiadl vs highest resolution):
PS: I'd highly appreciate it if you would release a new binary. Thanks a lot for your work!
In models.py, FantiaDownloader.perform_download() does not handle the case where
two files have the same name but different sizes.
# Excerpt from models.py; relies on "import os" and "from urllib.parse import unquote".
def perform_download(self, url, filename, server_filename=False):
    """Perform a download for the specified URL while showing progress."""
    request = self.session.get(url, stream=True)
    if request.status_code == 404:
        self.output("Download URL returned 404. Skipping...\n")
        return
    request.raise_for_status()
    if server_filename:
        filename = os.path.join(os.path.dirname(filename), os.path.basename(unquote(request.url.split("?", 1)[0])))
    file_size = int(request.headers["Content-Length"])
    if os.path.isfile(filename):
        if os.stat(filename).st_size == file_size:
            self.output("File found (skipping): {}\n".format(filename))
            return
        # Suggested change (perhaps): same name but different size, so pick a
        # new name instead of overwriting. Prefix the base name only, so the
        # directory part of the path stays valid.
        while os.path.isfile(filename):
            filename = os.path.join(os.path.dirname(filename), "_" + os.path.basename(filename))
    self.output("File: {}\n".format(filename))
    downloaded = 0
    with open(filename, "wb") as file:
        for chunk in request.iter_content(self.chunk_size):
            downloaded += len(chunk)
            file.write(chunk)
            # Draw a 25-character progress bar with a percentage.
            done = int(25 * downloaded / file_size)
            percent = int(100 * downloaded / file_size)
            self.output("\r|{0}{1}| {2}% ".format("\u2588" * done, " " * (25 - done), percent))
    self.output("\n")
Thanks!
Trailing spaces, reserved words.
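As a rough illustration of what handling these two cases could look like, here is a minimal sketch. The helper name make_windows_safe and the "_" prefix are assumptions made for this example and are not taken from fantiadl. It strips trailing spaces and dots, which Windows does not allow at the end of a file or directory name, and prefixes names whose stem is a reserved device name.

```python
# Device names that Windows reserves, with or without an extension.
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM{}".format(i) for i in range(1, 10)}
    | {"LPT{}".format(i) for i in range(1, 10)}
)


def make_windows_safe(name, fallback="_"):
    """Hypothetical helper: return a name that Windows will accept."""
    # Windows rejects names ending in a space or a dot.
    cleaned = name.rstrip(" .")
    if not cleaned:
        return fallback
    # "con.txt" is as reserved as "CON", so only the stem is compared.
    stem = cleaned.split(".", 1)[0]
    if stem.upper() in _RESERVED:
        cleaned = "_" + cleaned
    return cleaned
```

Names that merely start with a reserved word, such as "console", pass through unchanged because the whole stem is compared.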
It would be better to have a filename option: save the original image filename or a number (0, 1, 2, ...).
.jpg, .png, .gif, .mp4, .webm
Do a dictionary lookup before calling guess_extension().
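A minimal sketch of that lookup, assuming the extension is derived from a MIME type string such as a Content-Type header. The function name extension_for and the table name KNOWN_EXTENSIONS are made up for this example. The table covers the five extensions listed above, and mimetypes.guess_extension() from the standard library is only used as a fallback, since it can return less common variants for some types depending on the Python version.

```python
import mimetypes

# Preferred extensions for the media types listed above.
KNOWN_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "video/mp4": ".mp4",
    "video/webm": ".webm",
}


def extension_for(mimetype):
    """Hypothetical helper: map a MIME type to a file extension."""
    # Drop parameters such as "; charset=..." and normalize case.
    mimetype = (mimetype or "").split(";", 1)[0].strip().lower()
    if not mimetype:
        return ""
    if mimetype in KNOWN_EXTENSIONS:
        return KNOWN_EXTENSIONS[mimetype]
    # Fall back to the standard library for anything not in the table.
    return mimetypes.guess_extension(mimetype) or ""
```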
I already have Requests and Python 3.7 installed, but it still gives me "Error: No valid input provided".
Getting following error message
Error: Invalid session. Please verify your session cookie.
I am receiving the same error when I try to use the cookie text file exported by cookie.txt extension while logged in. I have tried firefox and chrome browsers and tried entering the session id from devtools directly when prompted, all methods lead to the same error.
Did something change, or is it user error?
Currently the folder name is the post number. Can the downloader automatically name the folder after the post title instead of the number? It would look better.
Thanks
For articles with multiple billing plans on the same page, the owner can upload different files with the same filename, but in that case only the last file on the page will be saved.
I can work around this by using use-server-filenames, but please do not overwrite each of them if possible.
Thank you.
Hello, I am having an issue where it doesn't recognize the cookie.
I have deleted the cookies and logged in again to create a new one, but it still does not recognize the valid cookie.
Thanks in advance.
Command line: fantiadl_v1.7.exe -c [session cookie] -o [output folder] -r -x -t https://fantia.jp/fanclubs/5744/posts
Example URL: https://fantia.jp/posts/658107
On the page above there are two posts with the same title for the paid plan. In this case fantiadl v1.7 only downloads the first set of images for the paid plan and not the second one because files with the same name already exist. To work around this issue I have to pause fantiadl, move the images of the first set to another folder and resume to get the second set.
Command prompt output:
─────────────────
Downloading fanclub 5744...
Collecting fanclub posts...
Collected 98 posts.
Downloading post 658107...
File: .\かるたも\658107\thumb.png
|█████████████████████████| 100%
File: .\かるたも\658107\フリープラン\0.png
|█████████████████████████| 100%
File: .\かるたも\658107\フリープラン\1.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\0.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\1.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\2.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\3.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\4.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\5.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\6.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\7.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\8.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\9.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\10.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\11.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\12.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\13.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\14.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\15.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\0.png
|█████████████████████████| 100%
Traceback (most recent call last):
File "fantiadl.py", line 108, in
File "models.py", line 175, in download_fanclub
File "models.py", line 387, in download_post
File "models.py", line 317, in download_post_content
File "models.py", line 298, in download_photo
File "models.py", line 286, in perform_download
FileExistsError: [WinError 183] Cannot create a file when that file already exists: '.\かるたも\658107\感謝プラン\0.incomplete' -> '.\かるたも\658107\感謝プラン\0.png'
[12784] Failed to execute script fantiadl
Note: I shortened the path of the files.
PS: Thanks a lot for coding this tool!
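The traceback above comes from renaming the .incomplete file onto a name that already exists. One way this could be avoided is sketched below, under the assumption that keeping both files is the desired outcome. The helper names unique_path and finalize_download and the " (n)" suffix are choices made for this example, not fantiadl's actual behavior.

```python
import os


def unique_path(path):
    """Return path if it is free, otherwise add ' (n)' before the extension."""
    if not os.path.exists(path):
        return path
    root, ext = os.path.splitext(path)
    counter = 1
    while os.path.exists("{} ({}){}".format(root, counter, ext)):
        counter += 1
    return "{} ({}){}".format(root, counter, ext)


def finalize_download(incomplete_path, final_path):
    """Move a finished download into place without touching existing files."""
    target = unique_path(final_path)
    # os.replace works the same way on Windows and POSIX.
    os.replace(incomplete_path, target)
    return target
```

With this, the second gallery's 0.png would be saved as "0 (1).png" next to the first one instead of raising FileExistsError.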
Sorry for bothering you, but I have a login problem in v1.7.
I have already logged in to Fantia from Chrome to get the session ID,
but I still get "no valid input provided", even after using -c cookies.txt / -c "insert session id here" / --cookie cookies.txt / --cookie "insert session ID here".
I have already checked the readme and followed the guide from a previous issue with the same problem, but it is still like this.
Is this a bug, or is there a step I am missing?
Thanks for reading this.
Hi, I'm using the latest build of fantiadl and encountered this issue:
Encountered an error downloading URL. Skipping...
Traceback (most recent call last):
File "fantiadl.py", line 88, in <module>
downloader.download_fanclub(fanclub, cmdl_opts.limit)
File "F:\fantia\models.py", line 142, in download_fanclub
self.download_fanclub_metadata(fanclub)
File "F:\fantia\models.py", line 122, in download_fanclub_metadata
self.perform_download(header_url, header_filename, server_filename=self.use_server_filenames)
File "F:\fantia\models.py", line 210, in perform_download
self.output("File: {}\n".format(filename))
File "F:\fantia\models.py", line 68, in output
sys.stdout.write(output)
UnicodeEncodeError: 'gbk' codec can't encode character '\u2615' in position 8: illegal multibyte sequence
There seems to be no exception handling for filenames containing characters that the console's charset cannot encode. It would be really appreciated if filenames with such characters could be automatically renamed or normalized.
Could you look into the issue? Thank you very much!
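Judging by the traceback, the failure happens in sys.stdout.write rather than when the file is created, so one possible mitigation is to make the console output tolerant of unencodable characters. The sketch below is only an illustration. The name safe_write is hypothetical, and it assumes that replacing such characters with "?" in the printed message is acceptable. The filename on disk is not changed.

```python
import sys


def safe_write(text, stream=None):
    """Write text, replacing characters the stream's encoding cannot represent."""
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding; unencodable characters become "?".
    printable = text.encode(encoding, errors="replace").decode(encoding)
    stream.write(printable)
    return printable
```

On a console using the gbk codec, a message containing '\u2615' would then be printed with a "?" in its place instead of raising UnicodeEncodeError.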
Trojan:Win32/Glupteba!ml
Trojan:Win32/Wacatac.D2!ml
Maybe I don't know how to use this program, but every time I click on the executable it closes itself, and just before that happens it shows the message "no valid input".
When rerunning the program with the same parameters (download all paid posts for the current month), either because the last run failed and only some posts were downloaded or because a new creator has been added,
the download takes forever, especially on posts with larger image galleries, because every file is requested and then skipped if it is already on disk.
Would it be possible to keep track of already downloaded posts, or at least check whether a post has been modified since the last download?
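A minimal sketch of what tracking downloaded posts could look like, assuming a small JSON file of finished post IDs kept next to the downloads. The class name DownloadLog and the file layout are invented for this example, and it does not detect posts that were modified after being downloaded.

```python
import json
import os


class DownloadLog:
    """Hypothetical record of finished post IDs, stored as a JSON list."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.isfile(path):
            with open(path, "r", encoding="utf-8") as f:
                self.done = set(json.load(f))

    def is_done(self, post_id):
        return str(post_id) in self.done

    def mark_done(self, post_id):
        self.done.add(str(post_id))
        # Write to a temporary file first so an interrupted run
        # cannot leave a half-written log behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(sorted(self.done), f)
        os.replace(tmp, self.path)
```

A downloader could call is_done() before requesting any file of a post and mark_done() only after the whole post finished, so a failed run resumes at the first incomplete post.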
any x86 version?
it keeps saying no valid input provided
python fantiadl.py -c 6c14... https://fantia.jp/fanclubs/6561
Traceback (most recent call last):
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 359, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 496, in _connect_tls_proxy
return ssl_wrap_socket(
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 432, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 474, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
self.do_handshake()
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
FileNotFoundError: [Errno 2] No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 573, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\15516\Desktop\fantia\fantiadl.py", line 77, in <module>
downloader = models.FantiaDownloader(session_arg=session_arg, dump_metadata=cmdl_opts.dump_metadata, parse_for_external_links=cmdl_opts.parse_for_external_links, download_thumb=cmdl_opts.download_thumb, directory=cmdl_opts.output_path, quiet=cmdl_opts.quiet, continue_on_error=cmdl_opts.continue_on_error, use_server_filenames=cmdl_opts.use_server_filenames, mark_incomplete_posts=cmdl_opts.mark_incomplete_posts, month_limit=cmdl_opts.month_limit)
File "C:\Users\15516\Desktop\fantia\models.py", line 75, in __init__
self.login()
File "C:\Users\15516\Desktop\fantia\models.py", line 99, in login
check_user = self.session.get(ME_API,verify = False)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 555, in get
return self.request('GET', url, **kwargs)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 510, in send
raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))
I am sorry to trouble you. Here is the error information. My Python version is 3.9.0,
and I use a global system proxy to access the website.
When calling Login, it shows:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))
Is that caused by the proxy software?
I am looking forward to your reply, thank you!
Downloading post 131033...
Traceback (most recent call last):
File "fantiadl.py", line 66, in <module>
downloader.download_fanclub_posts(fanclub, cmdl_opts.limit)
File "/dev/shm/models.py", line 78, in download_fanclub_posts
self.download_post(post_id)
File "/dev/shm/models.py", line 169, in download_post
self.parse_external_links(post_description, os.path.abspath(post_directory))
File "/dev/shm/models.py", line 174, in parse_external_links
link_matches = self.EXTERNAL_LINKS_RE.findall(post_description)
TypeError: expected string or bytes-like object
Produce an output of external links (e.g. Mega) that can be easily processed with another downloader. Find out if JDownloader or another downloader has a way to import links with an assigned directory.
Error: no valid input provided
Thanks for the great tool! I have a feature request: could you please add an option to set the filename to the post number?
ex: https://fantia.jp/posts/******/post_content_photo/[*******] or https://cc.fantia.jp/uploads/post_content_photo/file/[*******]
Hi,
For some reason, whenever I open 'fantiadl_v1.3.3.exe', it always returns the message 'Error: No valid input provided'. I am able to execute the Python source code directly, e.g. using Visual Studio or Spyder, but I just cannot open 'fantiadl_v1.3.3.exe'. How can I resolve this?
Cheers
Rename folders, or otherwise mark them, if content is not available on your currently backed plan.
I got error 429 (Too Many Requests) for a bit, so I would suggest adding rate limiting, or increasing the delay a tiny bit, to avoid this.
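One common way to react to a 429 is to retry with an increasing delay. The sketch below is a generic illustration and not fantiadl code. The function name fetch_with_backoff, the default delays, and the convention that fetch returns a (status_code, body) pair are all assumptions made for this example.

```python
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    fetch is any zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        # Wait 1x, 2x, 4x, ... the base delay before trying again.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Still rate limited after the last attempt; hand the 429 back to the caller.
    return status, body
```

The sleep parameter exists so the delay can be swapped out in tests. In real use the default time.sleep applies.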
Windows Defender flags fantiadl as a threat (Program:Win32/Uwamson.A!ml). Is this actually safe to use?
Getting "Error: Invalid session. Please verify your session cookie" every time I attempt to run fantiadl.
Things were working the last time I ran this back in December.
command run: .\fantiadl_v1.8.exe -i -s -d 2022-01 -c cookies.txt https://fantia.jp/fanclubs/<FAN_CLUB_ID>
tested: .\fantiadl_v1.8.exe -i -s -d 2022-01 -c <_SESSION_ID VALUE> https://fantia.jp/fanclubs/<FAN_CLUB_ID>
I have also logged out of fantia and logged back in to generate a new cookie/_session_id to test with.
Downloading fanclub 36599...
File found (skipping): .\fanclub\tokorot\6b70d7db-1fe2-459c-8065-54b4b088617b.png
Traceback (most recent call last):
File "fantiadl.py", line 88, in
File "models.py", line 139, in download_fanclub
File "models.py", line 126, in download_fanclub_metadata
File "models.py", line 191, in perform_download
File "site-packages\requests\models.py", line 940, in raise_for_status
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://fantia.jp/images/fallback/fanclub/icon_image/_default5.png
[11932] Failed to execute script fantiadl
I've seen this happen with other fanclubs, but I didn't take note of the IDs, so this is my only example.
Downloading post 132271...
Traceback (most recent call last):
File "fantiadl.py", line 62, in <module>
downloader.download_fanclub_posts(fanclub, cmdl_opts.limit)
File "/tmp/fantiadl/models.py", line 75, in download_fanclub_posts
self.download_post(post_id)
File "/tmp/fantiadl/models.py", line 157, in download_post
self.download_post_content(post, post_directory)
File "/tmp/fantiadl/models.py", line 134, in download_post_content
gallery_directory = os.path.join(post_directory, sanitize_for_path(photo_gallery_title))
File "/tmp/fantiadl/models.py", line 172, in sanitize_for_path
return re.sub(r'[<>\"\?\\\/\*:]', replace, value)
File "/usr/lib/python3.5/re.py", line 182, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
If value is None, the sanitizer will fail.
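A minimal sketch of a guard for that case, assuming that an empty string is an acceptable stand-in for a missing title and that the replacement character is a space (the original replace value is not shown in the traceback). The character class is the same one that appears in the traceback.

```python
import re


def sanitize_for_path(value, replace=" "):
    """Replace characters that are unsafe in paths; treat None as empty."""
    if value is None:
        # A gallery or post without a title would otherwise raise TypeError.
        return ""
    return re.sub(r'[<>"?\\/*:]', replace, str(value))
```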
I have tried multiple versions. Running the exe gives the error "no valid input provided" before immediately closing. Trying to run the Python file from the zip with cmd yields the same error. I'm on Python 3.10.
From time to time, creators will change the title of their post.
Due to the way the paths are constructed in fantiadl, that causes file duplication.
The post ID is the only thing that doesn't change.
Solution:
Hi bitbybyte,
On June 14th, 2020, a download was interrupted while I was using the downloader. The program has kept returning "Error: Invalid session" since then.
I tried to troubleshoot it. I reviewed the source code and checked the page "https://fantia.jp/api/v1/me" with the developer tools in Firefox 77.0.1 and Chrome 83.0.4103.106. The page status code is 304 instead of 200.
Did you have the same problem? I am not sure whether Fantia made a change on their server or there is a cache issue on my side.
Thank you and have a nice day,
Skip posts when -m is none and the user has no access to post content.
For over a week fantiadl_v1.8.exe has been displaying the error message "Error: Invalid session. Please verify your session cookie" on my PC although the session ID is correct. Of course, I copied and pasted it. I also re-logged on fantia.jp to get a new one, to no avail. Before that, it worked perfectly.
As the title says, fantiadl returns "Error: Invalid session" after a password change. It is not a wrong password either: with a wrong password it throws "Error: Failed to login. Please verify your username and password", but with the new one it just returns "Invalid session" instead.
Tried again on a different network and OS, didn't work.
Could you look into the issue? Thanks in advance!
I've been looking for where to put the url but can't find it.
For example, when a free plan user performs a scrape for a club already scraped with a paid plan, metadata will be overwritten. To fix this we can check the number of post_contents available to the scraping user and compare it with the existing metadata. If more are available to the user, overwrite and continue downloading. If more are available in the existing metadata, skip the post entirely.
We can also check which plan status is currently joined, or even simplify this entirely by specifying the type of metadata, e.g. metadata_plan0.json.
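The comparison described above could be sketched roughly as follows. This is only an illustration under several assumptions: the function name should_overwrite_metadata is made up, both the stored metadata and the freshly fetched post are assumed to be dicts whose "post_contents" list holds only the contents accessible to the scraping user, and a tie is treated as "overwrite", which the description above leaves open.

```python
import json
import os


def should_overwrite_metadata(metadata_path, post):
    """Decide whether a fresh scrape should replace existing metadata."""
    if not os.path.isfile(metadata_path):
        # Nothing saved yet, so there is nothing to protect.
        return True
    with open(metadata_path, "r", encoding="utf-8") as f:
        existing = json.load(f)
    old_count = len(existing.get("post_contents") or [])
    new_count = len(post.get("post_contents") or [])
    # Keep the richer metadata; on a tie, refresh it (an assumption).
    return new_count >= old_count
```

A caller would skip the post entirely when this returns False, and otherwise overwrite the metadata and continue downloading.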
This script is very convenient and helps me a lot, thank you.
However, I cannot find how to set the download directory layout. For example, you can see this script.
If I edit "config.ini" and enter "%artist%/%urlFilename%", all files are downloaded into the same directory, as in the following pictures.
If I edit "config.ini" to "%artist%/%member_id%/%urlFilename%/", the script creates a directory for every post, like this.
By default, fantiadl creates a directory for every post, as in example 2. Please tell me how to make fantiadl download files as in example 1.
Don't know if possible, but would be really nice to be able to do this if you have a specific month(s) of content you'd like to download instead of the entire discography or manually entering each post you want downloaded.