
wqxuetang_downloader's Introduction

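Last updated: 2020-02-07 10:26 Wiki

2020-02-09: The service has broken down. Every image retrieved is a small file under 20 KB with very low resolution, which makes reading difficult.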

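Wenquan Xuetang (文泉学堂) Offline Reading Converter (ReOpen)

A Wenquan Xuetang parsing tool that automatically generates PDFs so that users can read Wenquan Xuetang's free resources offline.

It uses Python 3 to automatically download the free PDF books provided by Wenquan Xuetang (for testing only; please delete within 24 hours).

In essence, this script simply converts online reading into offline reading.

Do not use it commercially or distribute it widely.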

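❤ Disclaimer

All published cracks, patches, key generators, registration information, and software decryption analysis articles are for learning and research purposes only. They must not be used for commercial or illegal purposes; otherwise the user bears all consequences. The information on this site comes from the internet, and copyright disputes have nothing to do with this site. You must completely delete the above content from your computer within 24 hours of downloading. If you like the program, please support genuine software and purchase a license to get better official service. If there is any infringement, please contact us by email and we will handle it.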

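Disclaimer

Scraping statement

⌛️ Installation and usage

Installation

Usage

FAQ

🌙 Updates and patches

Updates and patches

⚡ Copyright

Learning about web scraping

📃 LICENSE

MIT

Learning about web scraping

Learning about web scraping

Other channels

Other channels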

wqxuetang_downloader's People

Contributors

billxuce, dependabot[bot], kajweb


wqxuetang_downloader's Issues

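Invalid images are downloaded

Hello. In testing I found that quite a few invalid images turn up during downloads; they show a message asking you to wait patiently or refresh and retry.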

Using a proxy

How do I configure the tool if I access the internet through a proxy? Thanks.
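The tracebacks in the issues on this page show the tool fetching data through `urllib.request`, so one possible approach is to install a global opener that carries a `ProxyHandler`. This is only a sketch: the proxy address `http://127.0.0.1:8080` and the helper name `build_proxy_opener` are placeholders, and whether the project's own `initUrllib()` would later replace this opener has not been checked.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical local proxy address; replace it with your own.
opener = build_proxy_opener("http://127.0.0.1:8080")
# After this call, plain urllib.request.urlopen() goes through the proxy too.
urllib.request.install_opener(opener)
```

By default `urllib` also picks up proxy settings from the `HTTP_PROXY` and `HTTPS_PROXY` environment variables, so setting those before running the script may be enough without any code change.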

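How do I resolve the login error reported when downloading a book with the updated version?

I'm using the latest updated version. When I download, it reports a login error and the program exits abnormally:
C:\Users\Administrator\Desktop\wqxuetang_downloader-master>python main.py 3199490
2020-02-06 14:18:52,394 [ERROR] 登录错误,程序异常退出。
How can I fix this?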

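Sharing scraped data

I'm the author of the neighboring project wqDownloader. A couple of days ago I hadn't found this repository, so I scraped a few books with my own method. Now the server apparently can't keep up and pages never finish loading; when I searched GitHub again, I found this repository.

Fight the epidemic with knowledge

That was the original intent behind Wenquan Xuetang's recent free books, but the server's condition does not look good right now, so if anyone needs them, I'm willing to share what I downloaded earlier. Like Wenquan Xuetang, though, I will stop sharing once the epidemic ends.

OneDrive
Baidu Cloud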

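Selfless dissemination of technology and selfless sharing of knowledge are the true ethics of technology. Kudos to the author!

Tsinghua's resources cannot be shared by all citizens and are normally available only to a small group of people. That is really a sign of inequality in the country's educational resources, and it is absurd and embarrassing.
Humanity's technology has advanced this far precisely because knowledge is spread and technology is shared, letting more and more people master science and technology so that humanity's overall level keeps rising.
Some people want to get rich from technology while keeping others from prospering, so all kinds of measures to keep technology private have appeared, which shows just how selfish people are. Ordinary people need more money to obtain technology products and to acquire knowledge.
The pioneers of the internet invented it for the sake of sharing. Why share selflessly? To keep selfish people or capital from monopolizing technology, and so to better promote human well-being.
Patents and technical protection measures in fact protect the interests of capital, making the world's distribution of wealth more and more unbalanced; the gap between rich and poor causes all sorts of development problems and breeds all sorts of ugliness. As for the person I just saw writing about "eating buns dipped in human blood" (人血馒头, profiting from others' misfortune), has he lost his mind?
In any case, the more people who can see Tsinghua's resources through this tool, the better it is for the country's development and for the development of science and technology. The person who selflessly shared this tool has no selfish motive. It benefits the country and the people; this is what a true hero does!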

pymupdf installation fails

Win7
Collecting pymupdf
Using cached PyMuPDF-1.16.10.tar.gz (175 kB)
Installing collected packages: pymupdf
Running setup.py install for pymupdf ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\administrator\appdata\local\programs\python\python38-32\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-ifcv86oe\\pymupdf\\setup.py'"'"'; __file__='"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-ifcv86oe\\pymupdf\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-j9ktdftz\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\administrator\appdata\local\programs\python\python38-32\Include\pymupdf'
cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-ifcv86oe\pymupdf\
Complete output (13 lines):
running install
running build
running build_py
creating build
creating build\lib.win32-3.8
creating build\lib.win32-3.8\fitz
copying fitz\__init__.py -> build\lib.win32-3.8\fitz
copying fitz\fitz.py -> build\lib.win32-3.8\fitz
copying fitz\utils.py -> build\lib.win32-3.8\fitz
copying fitz\__main__.py -> build\lib.win32-3.8\fitz
running build_ext
building 'fitz._fitz' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\administrator\appdata\local\programs\python\python38-32\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-ifcv86oe\\pymupdf\\setup.py'"'"'; __file__='"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-ifcv86oe\\pymupdf\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Administrator\AppData\Local\Temp\pip-record-j9ktdftz\install-record.txt' --single-version-externally-managed --compile --install-headers 'c:\users\administrator\appdata\local\programs\python\python38-32\Include\pymupdf' Check the logs for full command output.

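Could someone optimize the PDF handling

  • In wqxtDownloader __init__(), add a method that converts a link into a bid
  • In wqxtDownloader __init__(), add a method that checks whether the bid is valid
  • When wqxtDownloader initread() returns data from the API, check whether the reply is {"data":[],"errcode":8003,"errmsg":"很抱歉,您访问的图书不存在"} (the requested book does not exist)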
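The three checks requested above could be sketched roughly as follows. Everything here is an assumption rather than the project's code: the names `extract_bid`, `unwrap_initread`, and `BookNotFound` are made up for illustration, the link is assumed to carry the bid in a `bid=` query parameter (as the `initread` URL quoted on this page does), and a usable reply is assumed to hold a dict in `data`, since the constructor reads `bookInfo['name']` and `bookInfo['pages']`.

```python
import json
from urllib.parse import parse_qs, urlparse


class BookNotFound(Exception):
    """Raised when the API reply does not describe a readable book."""


def extract_bid(text):
    """Accept either a bare bid or a link carrying a bid= query parameter."""
    text = text.strip()
    if text.isdigit():
        return text
    values = parse_qs(urlparse(text).query).get("bid", [])
    if values and values[0].isdigit():
        return values[0]
    raise ValueError("no numeric bid found in {!r}".format(text))


def unwrap_initread(raw_json):
    """Return the 'data' object, or raise if the reply looks like an error."""
    reply = json.loads(raw_json)
    data = reply.get("data")
    # The error reply quoted above carries an empty list in "data";
    # a usable reply is assumed to be a dict with 'name' and 'pages'.
    if not isinstance(data, dict):
        raise BookNotFound("errcode={} errmsg={}".format(
            reply.get("errcode"), reply.get("errmsg")))
    return data
```

Checking the type of `data` instead of a specific `errcode` value avoids guessing which code the server uses for success, and it would turn the `TypeError: list indices must be integers or slices, not str` reported in several issues into a readable message.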

Please update sooner; I added the cookie by hand and it was exhausting

def initread( self ):
		# Ask the initread endpoint for this book's metadata.
		url = "https://lib-nuanxin.wqxuetang.com/v1/read/initread?bid={}".format( self.bid )
		curl = get_value("urllib")
		# Accept-Encoding is deliberately not sent: urllib does not decompress
		# gzip/br bodies, so decode("UTF-8") below would fail on a compressed reply.
		headstr = {
		'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
		'Accept-Language':'zh-CN,zh;q=0.9,en;q=0.8',
		'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
		'Upgrade-Insecure-Requests':'1',
		'Connection':'keep-alive',
		'Host':'lib-nuanxin.wqxuetang.com',
		'Sec-Fetch-Mode': 'navigate',
		'Sec-Fetch-Site': 'none',
		'Sec-Fetch-User': '?1',
		'Cookie':'_gid=37932323894; _gidv=537efffb24efdsfsfs6d4c4b230a129; PHPSESSID=v5eq3t2i87ldadadadaddst; Hm_lvt_a84b27ffd87daa3273555205ef60df89=15801231246,158013758,158013054,158137507; Hm_lpvt_a84b27ffd87daa3273555205ef60df89=1580831213; acw_tc=b65cfd3915808333851232131239590aa0bc6819f05f1b00',}
		req = curl.request.Request(url, headers=headstr)
		response = curl.request.urlopen(req)
		data = response.read().decode("UTF-8")
		bookInfo = json.loads( data )
		pages = bookInfo['data']
		# A missing book comes back as
		# {"data":[],"errcode":8003,"errmsg":"很抱歉,您访问的图书不存在"}, so "data" is a list.
		if not isinstance( pages, dict ):
			raise ValueError( bookInfo.get('errmsg') )
		return pages

HTTP Error 502

Could you add a feature that restarts automatically when HTTP Error 502 occurs?
————————————————————————————————————————————
The error is as follows:
Traceback (most recent call last):
File "main.py", line 24, in <module>
book = wqxtDownloader( bid );
File "C:\Users\83669\Desktop\wqxuetang_downloader-master (1)\wqxuetang_downloader-master\wqxtDownloader.py", line 36, in __init__
self.catatree = self.getCatatree();
File "C:\Users\83669\Desktop\wqxuetang_downloader-master (1)\wqxuetang_downloader-master\wqxtDownloader.py", line 84, in getCatatree
request = curl.request.urlopen(url);
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 502: Bad Gateway
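One way to get the requested behavior without restarting the whole program is to wrap each network call in a retry loop with a growing delay. The sketch below is not taken from the project; `call_with_retry`, its defaults, and the choice of which status codes count as transient are all assumptions.

```python
import time
import urllib.error


def call_with_retry(func, retries=5, base_delay=2.0, sleep=time.sleep):
    """Call func(); on a transient network error wait and try again."""
    for attempt in range(1, retries + 1):
        try:
            return func()
        except urllib.error.HTTPError as err:
            # Only gateway-type errors are treated as transient here.
            if err.code not in (502, 503, 504) or attempt == retries:
                raise
        except (urllib.error.URLError, ConnectionError):
            # Covers timeouts, unreachable hosts and dropped connections.
            if attempt == retries:
                raise
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

A call such as `call_with_retry(lambda: curl.request.urlopen(url, timeout=10))` would then replace the bare `urlopen` call. `HTTPError` is caught first because it is a subclass of `URLError`.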

After fixing the HTTP issue, how do I fix the name problem? name 'http' is not defined

C:\Users\ynlsg\wqxuetang_downloader10>python main.py
C:\Users\ynlsg\wqxuetang_downloader10\cookies.txt
Traceback (most recent call last):
File "main.py", line 13, in <module>
initUrllib();
File "C:\Users\ynlsg\wqxuetang_downloader10\utils.py", line 17, in initUrllib
cookie = http.cookiejar.LWPCookieJar()
NameError: name 'http' is not defined
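A `NameError` on `http` means that nothing in `utils.py` has bound that name by the time `initUrllib()` runs. A likely cause, though not confirmed against the project's source, is a missing `import http.cookiejar`; importing `urllib.request` alone does not make `http` visible in your own module. A minimal sketch, where `build_cookie_opener` is a hypothetical helper name:

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(cookie_file=None):
    """Return (cookie jar, opener) with cookie handling enabled."""
    cookie = http.cookiejar.LWPCookieJar(cookie_file)
    handler = urllib.request.HTTPCookieProcessor(cookie)
    return cookie, urllib.request.build_opener(handler)


cookie, opener = build_cookie_opener()
```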

main raises an error

Traceback (most recent call last):
File "C:\Users\liten\Desktop\wqxuetang_downloader\main.py", line 33, in <module>
if opts[0][0] in ['-h','--help']:
IndexError: list index out of range
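`opts[0][0]` raises `IndexError` whenever the option list is empty, which is what happens when the script is started without any flags. Assuming `opts` comes from `getopt.getopt` (the traceback does not show this), a guard might look like the sketch below; `wants_help` is a hypothetical name.

```python
import getopt


def wants_help(argv):
    """Return True if -h or --help appears among the parsed options."""
    opts, _args = getopt.getopt(argv, "h", ["help"])
    # any() over an empty list is simply False, so no index is ever touched.
    return any(flag in ("-h", "--help") for flag, _value in opts)
```

In `main.py` this would be called as `wants_help(sys.argv[1:])`.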

Using the previous version with patches 1 and 2 both applied, I get this error. How do I fix it? list indices must be integers or slices, not str

C:\Users\ynlsg\wqxuetang_downloader10>python main.py 3208995 83 388
C:\Users\ynlsg\wqxuetang_downloader10\cookies.txt
Traceback (most recent call last):
File "main.py", line 25, in <module>
book = wqxtDownloader( bid );
File "C:\Users\ynlsg\wqxuetang_downloader10\wqxtDownloader.py", line 32, in __init__
self.name = bookInfo['name'];
TypeError: list indices must be integers or slices, not str

urllib.error.HTTPError: HTTP Error 503: Service Temporarily Unavailable

File "D:\Users\xiao\Desktop\1\wqxtDownloader.py", line 157, in start
downloadPage = self.downloadImage( url, path );
File "D:\Users\xiao\Desktop\1\wqxtDownloader.py", line 215, in downloadImage
request = curl.request.urlopen(requestPer, timeout=10);
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "D:\MySoftWare\Programming\Python\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Temporarily Unavailable

During handling of the above exception, another exception occurred:

"A socket operation was attempted to an unreachable host" error

I ran into this problem; for now, restarting is enough to get past it.
————————————————————————————————————————————
Traceback (most recent call last):
File "main.py", line 26, in <module>
book.start( *(int(x) for x in sys.argv[2:]) );
File "C:\Users\83669\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 158, in start
downloadPage = self.downloadImage( url, path );
File "C:\Users\83669\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 217, in downloadImage
request = curl.request.urlopen(requestPer, timeout=10);
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Users\83669\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10065] 套接字操作尝试一个无法连接的主机。>

Got a failed image (获取到了失败的图片)

2020-02-04 17:33:43,378 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第1次
2020-02-04 17:33:48,795 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第2次
2020-02-04 17:33:57,981 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第3次
2020-02-04 17:34:05,145 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第4次
2020-02-04 17:34:13,447 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第5次
2020-02-04 17:34:28,017 [ERROR] 3209066 获取到了失败的图片 第366页(366/417) 正在重试第6次
2020-02-04 17:34:37,791 [CRITICAL] 重试次数过多,程序终止,请尝试重新执行main.py
Traceback (most recent call last):
File "main.py", line 25, in <module>
book.start( *(int(x) for x in sys.argv[2:]) );
File "G:\文泉下载\wqxuetang_downloader-master\wqxtDownloader.py", line 176, in start
raise TooManyRetry;
wqxtDownloader.TooManyRetry

A small problem reported after applying the patch

A technical battle of attack and defense. Brilliant!

Traceback (most recent call last):
File "main.py", line 12, in <module>
initUrllib();
File "C:\wqxuetang_downloader-master\utils.py", line 78, in initUrllib
cookie = http.cookiejar.LWPCookieJar()
NameError: name 'http' is not defined

[Error] NameError: name 'sleepTime' is not defined

Only 14 of 419 pages were downloaded before the following errors appeared:
urllib.error.HTTPError: HTTP Error 503: Service Temporarily Unavailable [possibly a network issue]

NameError: name 'sleepTime' is not defined
"wqxtDownloader.py", line 186, in start
logging.error("{} 发生了严重错误,暂停s秒 第{}页({}/{}) 正在重试第{}次".format( str(bid), str(sleepTime), page, str(downloadTimes), str(countNum), str(Errortimes)));
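Two things stand out in the quoted `logging.error` line: `sleepTime` is used without ever being assigned, and the format string has five `{}` placeholders for six arguments (the `暂停s秒` part has none), so even with the variable defined the values would land in the wrong slots. A hedged sketch of a corrected call follows; the helper name `log_retry`, the sample values, and the random 5 to 15 second wait are all assumptions.

```python
import logging
import random


def log_retry(bid, sleep_time, page, done, total, attempt):
    """Build and log the retry message with one placeholder per value."""
    message = "{} 发生了严重错误,暂停{}秒 第{}页({}/{}) 正在重试第{}次".format(
        bid, sleep_time, page, done, total, attempt)
    logging.error(message)
    return message


# Hypothetical choice: wait a random 5 to 15 seconds before retrying.
sleep_time = round(random.uniform(5, 15), 1)
log_retry("3209066", sleep_time, 14, 14, 419, 1)
```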

Cannot log in

2020-02-06 14:58:55,814 [ERROR] 登录错误,程序异常退出

Got an error and I'm completely baffled; how do I fix it?

Traceback (most recent call last):
File "main.py", line 22, in <module>
book = wqxtDownloader( bid );
File "c:\1\wqxtDownloader.py", line 34, in __init__
self.name = bookInfo['name'];
TypeError: list indices must be integers or slices, not str

Constructor

def __init__( self, bid ):
	# Store the input bid
	self.bid = bid;
	self.jwt_key = self.getJwtKey();
	bookInfo = self.initread();
	
	self.name = bookInfo['name'];
	
	self.page = int(bookInfo['pages']);
	
	self.kData = self.getK();
	folder = self.getFolder();
	self.createFolder( folder );
	self.folder = folder;
	self.catatree = self.getCatatree();
	self.invalidpic = self.getInvalidPicInfo();

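I didn't change this part, so why does it still raise an error, and how do I fix it? It looks like a data type problem.

Another problem during download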

2020-02-03 16:51:02,041 [INFO] 下载成功 第149页(149/357)
2020-02-03 16:51:04,126 [INFO] 下载成功 第150页(150/357)
2020-02-03 16:51:06,002 [INFO] 下载成功 第151页(151/357)
2020-02-03 16:51:07,436 [INFO] 下载成功 第152页(152/357)
2020-02-03 16:51:08,870 [INFO] 下载成功 第153页(153/357)
2020-02-03 16:51:10,444 [INFO] 下载成功 第154页(154/357)
2020-02-03 16:51:12,315 [INFO] 下载成功 第155页(155/357)
Traceback (most recent call last):
  File "E:\Python37\lib\urllib\request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "E:\Python37\lib\http\client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "E:\Python37\lib\http\client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "E:\Python37\lib\http\client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "E:\Python37\lib\http\client.py", line 1016, in _send_output
    self.send(msg)
  File "E:\Python37\lib\http\client.py", line 956, in send
    self.connect()
  File "E:\Python37\lib\http\client.py", line 1384, in connect
    super().connect()
  File "E:\Python37\lib\http\client.py", line 932, in connect
    self._tunnel()
  File "E:\Python37\lib\http\client.py", line 906, in _tunnel
    (version, code, message) = response._read_status()
  File "E:\Python37\lib\http\client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "E:\Python37\lib\socket.py", line 589, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 20, in <module>
    book.start( *(int(x) for x in sys.argv[2:]) );
  File "D:\Other\wqxuetang_downloader-master\wqxtDownloader.py", line 144, in start
    downloadPage = self.downloadImage( url, path );
  File "D:\Other\wqxuetang_downloader-master\wqxtDownloader.py", line 182, in downloadImage
    request = curl.request.urlopen(requestPer, timeout=10);
  File "E:\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "E:\Python37\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "E:\Python37\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "E:\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "E:\Python37\lib\urllib\request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "E:\Python37\lib\urllib\request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error timed out>

An error I can't resolve. Thanks

The program raises an error when I run it. Is there any way to fix this? Thank you! The downloads are purely for my own reading and will never be used commercially.

Traceback (most recent call last):
  File "main.py", line 19, in <module>
    book = wqxtDownloader( bid );
  File "C:\Users\mayn\Desktop\文泉学堂下载器\wqxtDownloader.py", line 21, in __init__
    self.name = bookInfo['name'];
TypeError: list indices must be integers or slices, not str

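Thanks to the author for sharing

Knowledge is the only way to learn and improve, and only reading books, useful ones, actually helps with learning. Other e-book resources online vary widely in quality; many authors pad their books with filler and no substance, and books on basic science are pitifully scarce. So I just wanted to freeload on Tsinghua University Press's books. The freeloading failed, though: I couldn't get the setup right, and the engine stalled as soon as I got in the car.
Some people think this is profiting from others' misfortune (吃人血馒头)?
CNKI and other ** academic sites charge for every download, and the money is not shared with the institutions and individuals who wrote the papers. Aren't they the ones profiting from others' misfortune? People read and study precisely because they don't yet understand, yet students are asked to pay, and students don't have much money to begin with. Don't tell me campus networks are free; that only shows you are out of touch and know nothing beyond your own circumstances.
Do poor people simply deserve to be denied a fair chance to acquire knowledge?
You think someone will resell these and thereby hurt the legitimate editions. I really don't understand: would someone who wants to make money selling pirated copies have only this one way of getting them? Surely not. However high a school builds its walls, someone will still take the risk. Can you control the criminal intent in their heart?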

I get this error

Traceback (most recent call last):
File "main.py", line 20, in <module>
book.start( *(int(x) for x in sys.argv[2:]) );
File "C:\Users\petswir\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 139, in start
url = self.getPageUrl( page );
File "C:\Users\petswir\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 100, in getPageUrl
getKparmas = self.generateKparmas( page );
File "C:\Users\petswir\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 116, in generateKparmas
jwt_enc = jwt.encode( jwt_data, jwt_key, algorithm='HS256');
AttributeError: module 'jwt' has no attribute 'encode'

I get this error

The error is as follows

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from wqxtDownloader import *;
  File "D:\Other\wqxuetang_downloader-master\wqxtDownloader.py", line 5, in <module>
    from wqxtPDF import *
  File "D:\Other\wqxuetang_downloader-master\wqxtPDF.py", line 8, in <module>
    import fitz # pip3 install pymupdf
ModuleNotFoundError: No module named 'fitz'
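The comment in `wqxtPDF.py` already notes that `fitz` comes from `pip3 install pymupdf`. Before running `main.py`, a quick check like the one below can report which third-party modules are still missing. It is only a sketch: `missing_modules` is a hypothetical helper, and the module list is taken from the imports visible in the tracebacks on this page, not from the project's requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'fitz' and 'jwt' are the third-party imports visible in the tracebacks here.
print(missing_modules(["fitz", "jwt"]))
```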

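I went too far; seeing this in the morning, I did overreact. But for my part, I still cannot agree.

I suppose you may want to show off your skills, or to "benefit" people who don't want to pay. But you know full well that this is stealing other people's intellectual property and damaging a computer system, and you published the program anyway, which truly runs counter to the ethics of technology. During an epidemic, it is no exaggeration to call this profiting from others' misfortune; even outside an epidemic, this is unscrupulous technology. I have no stake in this. I just feel that a technologist without technical ethics is no different from a highly educated criminal, and that is all the more infuriating!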

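The PDF generation code fails on one book

Book 3207901, 291 images. I downloaded it twice and looked at every image without finding a problem, but converting to PDF reports Invalid argument. I can't tell which image is causing it. Could you suggest ways to fix or locate the problem? I have downloaded more than ten books and this is the only one with the issue.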
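One way to narrow this down is to scan the image folder for files that are unusually small or that do not start with a known image signature. The sketch below uses only the standard library; `find_suspect_images`, the 1024-byte threshold, and the assumption that pages are stored as JPEG or PNG are all guesses, and a file can pass both checks and still be rejected by the PDF library.

```python
import os

# Leading bytes of the two formats assumed here; other formats would be flagged.
SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n")


def find_suspect_images(folder, min_bytes=1024):
    """Return (filename, reason) for files that are tiny or not JPEG/PNG."""
    suspects = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        size = os.path.getsize(path)
        with open(path, "rb") as handle:
            head = handle.read(8)
        if size < min_bytes:
            suspects.append((name, "only {} bytes".format(size)))
        elif not head.startswith(SIGNATURES):
            suspects.append((name, "unrecognized header"))
    return suspects
```

If the scan comes back empty, converting the pages in smaller batches (for example, halving the range each time) is another way to locate the file that triggers the error.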

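"Got a failed image" (获取到了失败的图片) message

The first image files downloaded fine, but later ones could not be downloaded. At first the symptom was image files only 5 bytes in size. After I interrupted the program, deleted those invalid files, and ran it again, it reported "获取到了失败的图片" (got a failed image). Have I been blocked? How can I fix this? Thanks!

main.py single download works / main_mult.py multiple download raises an error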

请输入需要下载的bid(以间隔):3208944 3207422
2020-02-06 16:54:01,233 [INFO] 成功创建文件夹 books/IMG/3208944
············
2020-02-06 17:11:52,414 [INFO] 3208944下载成功 第56页(56/211) 随机11.7s
Traceback (most recent call last):
File "c:\Users\Administrator.vscode\extensions\ms-python.python-2020.1.58038\pythonFiles\ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "c:\Users\Administrator.vscode\extensions\ms-python.python-2020.1.58038\pythonFiles\lib\python\old_ptvsd\ptvsd_main_.py", line 432, in main
run()
File "c:\Users\Administrator.vscode\extensions\ms-python.python-2020.1.58038\pythonFiles\lib\python\old_ptvsd\ptvsd_main_.py", line 316, in run_file
runpy.run_path(target, run_name='main')
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 263, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "e:\Personal\Desktop\wqxuetang_downloader-master\main_mult.py", line 29, in <module>
parseMultBid( Abid );
File "e:\Personal\Desktop\wqxuetang_downloader-master\main_mult.py", line 19, in parseMultBid
book.start();
File "e:\Personal\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 163, in start
downloadPage = self.downloadImage( url, path );
File "e:\Personal\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 224, in downloadImage
request = curl.request.urlopen(requestPer, timeout=10);
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 542, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 1362, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\urllib\request.py", line 1323, in do_open
r = h.getresponse()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 1322, in getresponse
response.begin()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 303, in begin
version, status, reason = self._read_status()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38\lib\http\client.py", line 272, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
PS E:\Personal\Desktop\wqxuetang_downloader-master>

AttributeError: 'dict' object has no attribute 'sleep' occurs

Traceback (most recent call last):
File "main.py", line 19, in <module>
book.start( *(int(x) for x in sys.argv[2:]) );
File "C:\Users\8366\Desktop\wqxuetang_downloader-master\wqxtDownloader.py", line 172, in start
time.sleep( self.errorConfig.sleep )
AttributeError: 'dict' object has no attribute 'sleep'
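The message says `self.errorConfig` is a plain `dict`, and dict entries are read with `["sleep"]`, not `.sleep`. Two possible fixes are sketched below; the config contents are hypothetical because the real `errorConfig` is not shown in the issue.

```python
import time
from types import SimpleNamespace

# Hypothetical config; the real errorConfig contents are not shown in the issue.
error_config = {"sleep": 0.01}

# Option 1: keep the dict and index it by key.
time.sleep(error_config["sleep"])

# Option 2: wrap the dict so attribute access works as the original line expects.
error_ns = SimpleNamespace(**error_config)
time.sleep(error_ns.sleep)
```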
