fantiadl's People

Contributors

bitbybyte, hibikidesu, hinata, ichan18, kamenreader, marierose147, suika, utilael, xwtf, zekira

fantiadl's Issues

Files with the same filename are overwritten by the latest downloaded one

In models.py, FantiaDownloader.perform_download() does not handle the case where files have the same name but different sizes.

def perform_download(self, url, filename, server_filename=False):
        """Perform a download for the specified URL while showing progress."""
        request = self.session.get(url, stream=True)

        if request.status_code == 404:
            self.output("Download URL returned 404. Skipping...\n")
            return

        request.raise_for_status()

        if server_filename:
            filename = os.path.join(os.path.dirname(filename), os.path.basename(unquote(request.url.split("?", 1)[0])))

        file_size = int(request.headers["Content-Length"])
        if os.path.isfile(filename):
            if os.stat(filename).st_size == file_size:
                self.output("File found (skipping): {}\n".format(filename))
                return
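            else:
                # Suggested change: same name but different size, so pick an
                # unused name instead of overwriting the existing file.
                # Prefix only the basename so the directory part stays valid.
                directory, basename = os.path.split(filename)
                while os.path.isfile(filename):
                    basename = "_" + basename
                    filename = os.path.join(directory, basename)
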

        self.output("File: {}\n".format(filename))

        downloaded = 0
        with open(filename, "wb") as file:
            for chunk in request.iter_content(self.chunk_size):
                downloaded += len(chunk)
                file.write(chunk)
                done = int(25 * downloaded / file_size)
                percent = int(100 * downloaded / file_size)
                self.output("\r|{0}{1}| {2}% ".format("\u2588" * done, " " * (25 - done), percent))
        self.output("\n")

Thanks!

About filename

It would be better to offer a filename option: save the original image filename or a number (0, 1, 2, ...).

Force file extensions

.jpg, .png, .gif, .mp4, .webm

Do a dictionary lookup before calling guess_extension().
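
A minimal sketch of that lookup, assuming the extensions listed above map to their usual MIME types. The names FORCED_EXTENSIONS and extension_for are hypothetical and are not taken from fantiadl's source.

```python
import mimetypes

# Hypothetical table: the five extensions named in this issue, keyed by MIME type.
FORCED_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "video/mp4": ".mp4",
    "video/webm": ".webm",
}


def extension_for(mimetype):
    """Return a forced extension if one is known, else fall back to mimetypes."""
    forced = FORCED_EXTENSIONS.get(mimetype)
    if forced is not None:
        return forced
    # guess_extension() returns None for types it does not know.
    return mimetypes.guess_extension(mimetype) or ""
```

Checking the dictionary first keeps the result stable across platforms, since guess_extension() can return different extensions for the same type depending on the system's MIME tables.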

"No valid input provided"

I already have Requests and Python 3.7 installed, but it still gives me "Error: No valid input provided".

Fee Content

Is it possible to download from fantia.jp/product?
Also, it sometimes shows the text "post content not available on current plan". So can I only download free plan photos and videos?
Screenshot_2021-04-28-19-34-42-875_com android chrome

Invalid session cookie

Getting the following error message:
Error: Invalid session. Please verify your session cookie.

I receive the same error when I use the cookie text file exported by the cookie.txt extension while logged in. I have tried both Firefox and Chrome, and I have tried entering the session ID from devtools directly when prompted. All methods lead to the same error.

Did something change, or is it user error?

Not downloading all images if there's two posts with the same title

Command line: fantiadl_v1.7.exe -c [session cookie] -o [output folder] -r -x -t https://fantia.jp/fanclubs/5744/posts

Example URL: https://fantia.jp/posts/658107

On the page above there are two posts with the same title for the paid plan. In this case fantiadl v1.7 only downloads the first set of images for the paid plan and not the second one because files with the same name already exist. To work around this issue I have to pause fantiadl, move the images of the first set to another folder and resume to get the second set.

Command prompt output:
─────────────────
Downloading fanclub 5744...
Collecting fanclub posts...
Collected 98 posts.
Downloading post 658107...
File: .\かるたも\658107\thumb.png
|█████████████████████████| 100%
File: .\かるたも\658107\フリープラン\0.png
|█████████████████████████| 100%
File: .\かるたも\658107\フリープラン\1.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\0.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\1.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\2.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\3.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\4.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\5.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\6.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\7.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\8.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\9.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\10.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\11.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\12.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\13.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\14.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\15.png
|█████████████████████████| 100%
File: .\かるたも\658107\感謝プラン\0.png
|█████████████████████████| 100%
Traceback (most recent call last):
  File "fantiadl.py", line 108, in <module>
  File "models.py", line 175, in download_fanclub
  File "models.py", line 387, in download_post
  File "models.py", line 317, in download_post_content
  File "models.py", line 298, in download_photo
  File "models.py", line 286, in perform_download
FileExistsError: [WinError 183] Cannot create a file when that file already exists: '.\かるたも\658107\感謝プラン\0.incomplete' -> '.\かるたも\658107\感謝プラン\0.png'
[12784] Failed to execute script fantiadl


Note: I shortened the path of the files.
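
One possible way to avoid the collision is to give repeated titles within a post distinct directory names. This is only a sketch under that assumption: dedupe_titles is a hypothetical helper, not part of fantiadl, and the " (2)" suffix format is an arbitrary choice.

```python
def dedupe_titles(titles):
    """Return directory names, suffixing repeated titles with a counter."""
    seen = {}
    names = []
    for title in titles:
        seen[title] = seen.get(title, 0) + 1
        count = seen[title]
        # The first occurrence keeps its name; later ones get " (2)", " (3)", ...
        names.append(title if count == 1 else "{} ({})".format(title, count))
    return names


print(dedupe_titles(["フリープラン", "感謝プラン", "感謝プラン"]))
```

Because the suffix depends only on the order of titles within the post, rerunning the download would map each set of images to the same directory again, as long as the post's contents keep their order.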

PS: Thanks a lot for coding this tool!

another "No Valid Input Provided" using -c or --cookie


Untitled

Sorry to bother you, but I have a login problem in v1.7.
I logged in to Fantia from Chrome to get the session ID,
but I still get "No valid input provided", even when using -c cookies.txt / -c "insert session id here" / --cookie cookies.txt / --cookie "insert session ID here".
I have checked the README and followed the guide from a previous issue about the same problem, but the result is still the same.
Is this a bug, or is there a step I am missing?
Thanks for reading this.

Handle terminal encoding on output

Hi, I'm using the latest build of fantiadl and encountered this issue:

Encountered an error downloading URL. Skipping...
Traceback (most recent call last):
  File "fantiadl.py", line 88, in <module>
    downloader.download_fanclub(fanclub, cmdl_opts.limit)
  File "F:\fantia\models.py", line 142, in download_fanclub
    self.download_fanclub_metadata(fanclub)
  File "F:\fantia\models.py", line 122, in download_fanclub_metadata
    self.perform_download(header_url, header_filename, server_filename=self.use_server_filenames)
  File "F:\fantia\models.py", line 210, in perform_download
    self.output("File: {}\n".format(filename))
  File "F:\fantia\models.py", line 68, in output
    sys.stdout.write(output)
UnicodeEncodeError: 'gbk' codec can't encode character '\u2615' in position 8: illegal multibyte sequence

It seems there is no exception handling when a filename contains characters that cannot be encoded in the terminal's charset. It would be much appreciated if filenames with such characters could be automatically renamed or normalized.
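
A minimal sketch of one way to keep output from crashing, assuming it is acceptable to print "?" in place of characters the terminal cannot encode. safe_write is a hypothetical helper and not fantiadl's actual output() method.

```python
import sys


def safe_write(text, stream=None):
    """Write text, replacing characters the stream's encoding cannot represent."""
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding so unencodable characters become "?".
    stream.write(text.encode(encoding, errors="replace").decode(encoding))
    stream.flush()


safe_write("File: \u2615.png\n")
```

This only changes what is printed. The filename written to disk is untouched, so renaming or normalizing files would need a separate step.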

Could you look into the issue? Thank you very much!

"no valid input provided"

Maybe I don't know how to use this program, but every time I click on the executable it closes itself. Before that happens, it shows the message "no valid input".

Skip posts with already downloaded content as a whole instead of each file on its own

When rerunning the program with the same parameters (download all pay for current month) after the last run failed and downloaded only partial posts, or after a new creator has been added, the download takes forever. This is especially true for posts with larger image galleries, because every file is requested and then skipped if it is already on disk.

Would it be possible to keep track of already downloaded posts, or at least check whether a post has been modified since the last download?
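
Tracking finished posts could be sketched as a small ledger file, assuming a post counts as done only once all of its files have been written. DownloadLedger and its JSON file format are hypothetical and not part of fantiadl.

```python
import json
import os


class DownloadLedger:
    """Hypothetical record of completed post IDs, stored as a JSON list."""

    def __init__(self, path):
        self.path = path
        self.completed = set()
        if os.path.isfile(path):
            with open(path, "r", encoding="utf-8") as handle:
                self.completed = set(json.load(handle))

    def is_done(self, post_id):
        return str(post_id) in self.completed

    def mark_done(self, post_id):
        # Persist after every post so an interrupted run keeps its progress.
        self.completed.add(str(post_id))
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(sorted(self.completed), handle)
```

A downloader loop would check is_done(post_id) before requesting any file of a post and call mark_done(post_id) after the last file is saved. This does not detect posts modified after they were recorded.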

Max retries exceeded with url: /api/v1/me

python fantiadl.py -c 6c14... https://fantia.jp/fanclubs/6561
Traceback (most recent call last):
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 696, in urlopen
    self._prepare_proxy(conn)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 964, in _prepare_proxy
    conn.connect()
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 359, in connect
    conn = self._connect_tls_proxy(hostname, conn)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 496, in _connect_tls_proxy
    return ssl_wrap_socket(
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 432, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 474, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
    self.do_handshake()
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 439, in send
    resp = conn.urlopen(
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 573, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\15516\Desktop\fantia\fantiadl.py", line 77, in <module>
    downloader = models.FantiaDownloader(session_arg=session_arg, dump_metadata=cmdl_opts.dump_metadata, parse_for_external_links=cmdl_opts.parse_for_external_links, download_thumb=cmdl_opts.download_thumb, directory=cmdl_opts.output_path, quiet=cmdl_opts.quiet, continue_on_error=cmdl_opts.continue_on_error, use_server_filenames=cmdl_opts.use_server_filenames, mark_incomplete_posts=cmdl_opts.mark_incomplete_posts, month_limit=cmdl_opts.month_limit)
  File "C:\Users\15516\Desktop\fantia\models.py", line 75, in __init__
    self.login()
  File "C:\Users\15516\Desktop\fantia\models.py", line 99, in login
    check_user = self.session.get(ME_API,verify =  False)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 555, in get
    return self.request('GET', url, **kwargs)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\15516\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 510, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))

I am sorry to trouble you. The error information is above. My Python version is 3.9.0.

I use a global system proxy to access the website.

When login is called, it shows:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/me (Caused by ProxyError('Cannot connect to proxy.', FileNotFoundError(2, 'No such file or directory')))

Is that caused by proxy software?
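
As a first diagnostic, it may help to see which proxy settings Python picks up from the environment or the operating system. This sketch uses only the standard library; describe_proxies is a hypothetical helper name.

```python
import urllib.request


def describe_proxies():
    """Return the proxy settings Python detects from the environment or OS."""
    return urllib.request.getproxies()


for scheme, address in describe_proxies().items():
    print("{}: {}".format(scheme, address))
```

If a proxy is listed for https, the request is being routed through it. Requests sessions have a trust_env attribute, and setting it to False makes the session ignore environment proxy settings. Whether that helps here is an assumption, since the site may only be reachable through the proxy.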

I am looking forward to your earliest reply, thank you!

External link detection is wonky

Downloading post 131033...
Traceback (most recent call last):
  File "fantiadl.py", line 66, in <module>
    downloader.download_fanclub_posts(fanclub, cmdl_opts.limit)
  File "/dev/shm/models.py", line 78, in download_fanclub_posts
    self.download_post(post_id)
  File "/dev/shm/models.py", line 169, in download_post
    self.parse_external_links(post_description, os.path.abspath(post_directory))
  File "/dev/shm/models.py", line 174, in parse_external_links
    link_matches = self.EXTERNAL_LINKS_RE.findall(post_description)
TypeError: expected string or bytes-like object
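
The traceback suggests that post_description was not a string, most likely None. A minimal sketch of a guard follows, assuming a missing description should simply yield no links. The pattern below is a hypothetical stand-in, since the real EXTERNAL_LINKS_RE is not shown in this issue.

```python
import re

# Hypothetical stand-in for the downloader's EXTERNAL_LINKS_RE.
EXTERNAL_LINKS_RE = re.compile(r"https?://[^\s<>]+")


def find_external_links(post_description):
    """Return links found in a description, treating None as empty text."""
    if not post_description:
        return []
    return EXTERNAL_LINKS_RE.findall(post_description)
```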

Dump external links in metadata

Produce an output of external links (e.g. Mega) that can be easily processed with another downloader. Find out if JDownloader or another downloader has a way to import links with an assigned directory.

unable to open the executable programme

Hi,

For some reason, whenever I open 'fantiadl_v1.3.3.exe', it always returns the message 'Error: No valid input provided'. I am able to execute the Python source code directly, e.g. using Visual Studio or Spyder, but I just cannot open 'fantiadl_v1.3.3.exe'. How can I resolve this?

Cheers

Better rate limiting

I got error 429 (Too Many Requests) for a bit, so I would suggest adding a rate limit, or increasing the existing one slightly, to avoid this.
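
One common approach is to retry with an increasing delay when the server answers 429. The sketch below assumes the fetch callable returns an object with status_code and headers attributes, as a requests.Response does. call_with_backoff and its parameters are hypothetical.

```python
import time


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 429:
            break
        if attempt + 1 < max_attempts:
            # Prefer the server's Retry-After hint (seconds form) when present,
            # otherwise double the delay on each attempt.
            retry_after = response.headers.get("Retry-After", "")
            if retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
    return response
```

With a Requests session, the call could look like call_with_backoff(lambda: session.get(url)). The sleep parameter exists only so the delay can be replaced in tests.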

[BUG] Unable to download due to invalid session

Getting "Error: Invalid session. Please verify your session cookie" every time I attempt to run fantiadl.
Things were working the last time I ran this, back in December.

command run: .\fantiadl_v1.8.exe -i -s -d 2022-01 -c cookies.txt https://fantia.jp/fanclubs/<FAN_CLUB_ID>
tested: .\fantiadl_v1.8.exe -i -s -d 2022-01 -c <_SESSION_ID VALUE> https://fantia.jp/fanclubs/<FAN_CLUB_ID>
I have also logged out of fantia and logged back in to generate a new cookie/_session_id to test with.

Traceback when downloading certain fanclub(s)

Downloading fanclub 36599...
File found (skipping): .\fanclub\tokorot\6b70d7db-1fe2-459c-8065-54b4b088617b.png
Traceback (most recent call last):
  File "fantiadl.py", line 88, in <module>
  File "models.py", line 139, in download_fanclub
  File "models.py", line 126, in download_fanclub_metadata
  File "models.py", line 191, in perform_download
  File "site-packages\requests\models.py", line 940, in raise_for_status
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://fantia.jp/images/fallback/fanclub/icon_image/_default5.png
[11932] Failed to execute script fantiadl

I've seen this happen with other fanclubs, but I didn't take note of the IDs, so this is my only example.

Galleries without titles

Downloading post 132271...
Traceback (most recent call last):
  File "fantiadl.py", line 62, in <module>
    downloader.download_fanclub_posts(fanclub, cmdl_opts.limit)
  File "/tmp/fantiadl/models.py", line 75, in download_fanclub_posts
    self.download_post(post_id)
  File "/tmp/fantiadl/models.py", line 157, in download_post
    self.download_post_content(post, post_directory)
  File "/tmp/fantiadl/models.py", line 134, in download_post_content
    gallery_directory = os.path.join(post_directory, sanitize_for_path(photo_gallery_title))
  File "/tmp/fantiadl/models.py", line 172, in sanitize_for_path
    return re.sub(r'[<>\"\?\\\/\*:]', replace, value)
  File "/usr/lib/python3.5/re.py", line 182, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

If value is None, the sanitizer will fail.
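
A minimal sketch of a None-tolerant sanitizer, using the same set of characters as the pattern in the traceback. The replace default and the "untitled" fallback are assumptions made for illustration and may differ from fantiadl's actual behavior.

```python
import re


def sanitize_for_path(value, replace=" ", fallback="untitled"):
    """Replace characters that are invalid in paths; tolerate a missing title."""
    if value is None:
        # Assumption: galleries without a title get a placeholder name.
        value = fallback
    return re.sub(r'[<>"?\\/*:]', replace, value)
```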

Error: No valid input provided

I have tried multiple versions. Running the exe gives the error "no valid input provided" before it immediately closes. Running the Python file from the zip with cmd yields the same error. I'm on Python 3.10.

can't login with _session_id cookie or cookies.txt

Hello, when I try to connect to my Fantia account with the _session_id cookie or cookies.txt, I get an error telling me that my session cookie is invalid. I have tried waiting, deleting the cookie, and connecting with a new one, but I still get this error. Am I doing something wrong?

dfsf

Preserve data on post title change

From time to time, creators will change the title of their post.
Due to the way the paths are constructed in fantiadl, that causes file duplication.

The post ID is the only thing that doesn't change.

Solution:

  1. if post ID directory already exists, check files and download new files if size differs
  2. rename old metadata .json and download new metadata as usual
  3. rename directory to new title
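
Step 3 could be sketched as follows, assuming post directories are named "<post ID> <title>". That naming scheme is an assumption made for illustration and may not match fantiadl's real layout. Both function names are hypothetical.

```python
import os


def find_post_directory(parent, post_id):
    """Return an existing directory whose name starts with the post ID, if any."""
    prefix = str(post_id)
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        if os.path.isdir(path) and (name == prefix or name.startswith(prefix + " ")):
            return path
    return None


def move_to_new_title(parent, post_id, new_title):
    """Rename the post's directory to match a changed title, or create it."""
    target = os.path.join(parent, "{} {}".format(post_id, new_title))
    existing = find_post_directory(parent, post_id)
    if existing is None:
        os.makedirs(target)
    elif existing != target:
        os.rename(existing, target)
    return target
```

Looking the directory up by post ID first means files already downloaded under the old title are reused rather than duplicated.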

Status code 304 from API "fantia.jp/api/v1/me"

Hi bitbybyte,

On June 14th, 2020, a download was interrupted while I was using the downloader. The program has kept returning "Error: Invalid session" since then.

I tried to troubleshoot it. I reviewed the source code and checked the page "https://fantia.jp/api/v1/me" with the developer tools in Firefox 77.0.1 and Chrome 83.0.4103.106. The page status code is 304 instead of 200.

Did you have the same problem? I am not sure whether Fantia made some change on their server or there is a cache issue on my side.

Thank you and have a nice day,

Error because of invalid session cookie

For over a week fantiadl_v1.8.exe has been displaying the error message "Error: Invalid session. Please verify your session cookie" on my PC although the session ID is correct. Of course, I copied and pasted it. I also re-logged on fantia.jp to get a new one, to no avail. Before that, it worked perfectly.

Screenshot:
FaniaDL_v1_8-issue

Login process now requires reCAPTCHA

As the title says, fantiadl returns "Error: Invalid session" after a password change. It is not a wrong password either: with a wrong password it throws "Error: Failed to login. Please verify your username and password", but with the new one it just returns "Invalid session" instead.

Tried again on a different network and OS, didn't work.

Could you look into the issue? Thanks in advance!

Metadata check on paid and free plan scrape overwrites

For example, when a free plan user performs a scrape for a club already scraped with a paid plan, metadata will be overwritten. To fix this, we can check the number of post_contents available to the scraping user and compare it with the existing metadata. If more are available to the user, overwrite and continue downloading. If more are available in the existing metadata, skip the post entirely.

We can also check which plan status is currently joined, or even simplify this entirely by specifying the type of metadata e.g. metadata_plan0.json.
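
The count comparison could be sketched like this, assuming the metadata JSON stores a post_contents list that contains only the contents available to the scraping user. should_overwrite_metadata is a hypothetical name, and overwriting on an equal count is an arbitrary choice that the description above does not specify.

```python
import json
import os


def should_overwrite_metadata(metadata_path, new_post):
    """Return True when the new scrape exposes at least as many post_contents."""
    if not os.path.isfile(metadata_path):
        return True
    with open(metadata_path, "r", encoding="utf-8") as handle:
        existing = json.load(handle)
    new_count = len(new_post.get("post_contents", []))
    old_count = len(existing.get("post_contents", []))
    # Assumption: an equal count refreshes the metadata; fewer means skip the post.
    return new_count >= old_count
```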

Can this project set the download directory?

This script is very convenient and helps me a lot, thank you.
But I cannot find how to set the download directory. For example, see this script.

If I edit "config.ini" and enter "%artist%/%urlFilename%", all files are downloaded into the same directory, as in the following picture:
image

If I edit "config.ini" to "%artist%/%member_id%/%urlFilename%/", the script creates a directory for every post, like this:
image

By default, fantiadl creates a directory for every post, as in example 2. Please tell me how to make fantiadl download files as in example 1.
