kohn / httpproxymiddleware
A middleware for scrapy. Used to change HTTP proxy from time to time.
License: MIT License
The scraped proxy IPs don't seem to work for crawling HTTPS sites.
In HttpProxyMiddleware, request.meta["proxy_index"] raises an error: KeyError: 'proxy_index'.
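A KeyError like this means the request's meta dict has no "proxy_index" entry at the point where it is read. Why the key is missing is not stated in the report. As a minimal sketch, assuming the goal is only to avoid the crash, the lookup can fall back to a default. The helper name `get_proxy_index` and the default of 0 are hypothetical and not part of the middleware.

```python
def get_proxy_index(meta, default=0):
    """Return meta["proxy_index"] if present, otherwise `default`.

    `meta` stands in for request.meta, which behaves like a plain dict.
    The default of 0 is an assumption made for this sketch.
    """
    return meta.get("proxy_index", default)


# A meta dict that never had "proxy_index" written into it.
meta_without_index = {"download_timeout": 10}
# A meta dict where the key was set earlier.
meta_with_index = {"proxy_index": 3}

print(get_proxy_index(meta_without_index))
print(get_proxy_index(meta_with_index))
```

Falling back silently hides the missing key, so it may still be worth checking which requests reach that line without the key being set.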
def process_request(self, request, spider):
    """
    Set the request to use a proxy.
    """
    if self.proxy_index > 0 and datetime.now() > (self.last_no_proxy_time + timedelta(minutes=self.recover_interval)):
        logger.info("After %d minutes later, recover from using proxy" % self.recover_interval)
        self.last_no_proxy_time = datetime.now()
        self.proxy_index = 0
    request.meta["dont_redirect"] = True
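A question for the author: what does self.proxy_index = 0 mean in this code? If the point is to make the request use a proxy, doesn't setting self.proxy_index = 0 mean using proxy number zero (that is, no proxy)?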
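One reading of the quoted snippet, consistent with its log message ("recover from using proxy"), is that index 0 stands for a direct connection and that the middleware returns to it once `recover_interval` minutes have passed. The sketch below isolates only that condition under this assumption. The class `ProxyState` and the method `maybe_recover` are hypothetical names, and the current time is passed in explicitly so the behavior is easy to check.

```python
from datetime import datetime, timedelta


class ProxyState:
    """Hypothetical stand-in for the middleware's proxy bookkeeping."""

    def __init__(self, recover_interval, now):
        self.recover_interval = recover_interval  # minutes
        self.proxy_index = 0  # assumption: 0 means direct connection (no proxy)
        self.last_no_proxy_time = now

    def maybe_recover(self, now):
        """Return to index 0 if a proxy is in use and the interval has elapsed."""
        deadline = self.last_no_proxy_time + timedelta(minutes=self.recover_interval)
        if self.proxy_index > 0 and now > deadline:
            self.last_no_proxy_time = now
            self.proxy_index = 0
            return True
        return False


start = datetime(2020, 1, 1, 12, 0)
state = ProxyState(recover_interval=20, now=start)
state.proxy_index = 2  # pretend a proxy was selected after a failure

print(state.maybe_recover(start + timedelta(minutes=5)), state.proxy_index)
print(state.maybe_recover(start + timedelta(minutes=21)), state.proxy_index)
```

Under this reading, the reset is a periodic attempt to go back to the direct connection rather than a step that selects a proxy.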
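I used the proxies above to crawl Dianping pages. After running for a few minutes, this error occurred: <twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly>. I tried several times and got the same error each time, so I'd like to ask: is the proxy IP bad? Or ...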
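After reading the source, I keep wondering when the IP is switched while using proxies, especially when the connection is direct and the server returns 302 or 403.
It seems that an exception raised in parse is not caught by process_exception either. I want it to switch IP in that case, and would appreciate a clear explanation.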
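In Scrapy, a downloader middleware's process_exception is called for exceptions raised during download or in process_request, not for exceptions raised in spider callbacks such as parse. A 302 or 403 arrives as an ordinary response, so it can be inspected in process_response instead. The following is a minimal sketch of that idea, not the middleware's actual logic. `ProxySwitchSketch`, `SWITCH_STATUSES`, `StubRequest`, and `StubResponse` are hypothetical, and the stubs replace Scrapy's Request and Response so the example runs without Scrapy installed.

```python
from dataclasses import dataclass, field

SWITCH_STATUSES = {302, 403}  # assumption: statuses treated as "blocked"


@dataclass
class StubRequest:
    """Hypothetical stand-in for scrapy.Request (only the fields used here)."""
    url: str
    meta: dict = field(default_factory=dict)
    dont_filter: bool = False

    def copy(self):
        return StubRequest(self.url, dict(self.meta), self.dont_filter)


@dataclass
class StubResponse:
    """Hypothetical stand-in for a Scrapy response."""
    status: int


class ProxySwitchSketch:
    def __init__(self, proxies):
        self.proxies = proxies  # assumption: entry 0 is None, meaning direct
        self.proxy_index = 0

    def next_proxy(self):
        # Rotate through the list, wrapping around at the end.
        self.proxy_index = (self.proxy_index + 1) % len(self.proxies)
        return self.proxies[self.proxy_index]

    def process_response(self, request, response, spider):
        if response.status in SWITCH_STATUSES:
            retry = request.copy()
            proxy = self.next_proxy()
            if proxy is None:
                retry.meta.pop("proxy", None)  # direct connection
            else:
                retry.meta["proxy"] = proxy
            retry.dont_filter = True  # let the retry past the duplicate filter
            return retry  # returning a request asks for it to be rescheduled
        return response


mw = ProxySwitchSketch([None, "http://127.0.0.1:8080"])
result = mw.process_response(StubRequest("http://example.com"), StubResponse(403), None)
print(result.meta.get("proxy"), result.dont_filter)
```

Exceptions raised inside parse itself are the concern of spider middleware (process_spider_exception), so a check like this only covers cases that can be detected from the response status.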