Comments (2)
To use the selenium package, you first need to have the Chrome browser installed on your operating system.
from image-downloader.
For anyone still having this issue, the problem lies in the regex that parses the image URL. It captures extra junk, which breaks the image link. To fix it, change the body of the google_image_url_from_webpage function in crawler.py to this:
    # (line 121)
    # Uses re, unquote and By; if crawler.py does not already import them, add:
    #   import re
    #   from urllib.parse import unquote
    #   from selenium.webdriver.common.by import By
    image_elements = driver.find_elements(By.CLASS_NAME, "islib")
    image_urls = list()
    # \S  -> matches any non-whitespace character
    # *?  -> repeats \S zero or more times, lazily, so the match stops at the first & and not the last one
    url_pattern = r"imgurl=\S*?&"
    for image_element in image_elements[:max_number]:
        outer_html = image_element.get_attribute("outerHTML")
        re_group = re.search(url_pattern, outer_html)
        if re_group is not None:
            # Strip the leading "imgurl=" and the trailing "&", then URL-decode.
            image_url = unquote(re_group.group()[len("imgurl=") : -len("&")])
            image_urls.append(image_url)
    return image_urls
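To see what the lazy pattern does without starting a browser, here is a minimal sketch that runs the same regex on a hand-written string. The helper name extract_image_url and the sample HTML (including the example.com URLs) are made up for illustration; real Google result markup may look different.

```python
import re
from urllib.parse import unquote

# Same pattern as in the fix: lazy match of non-whitespace up to the first "&".
URL_PATTERN = r"imgurl=\S*?&"


def extract_image_url(outer_html):
    """Return the decoded image URL found in outer_html, or None if absent.

    Hypothetical helper for illustration; it is not part of crawler.py.
    """
    match = re.search(URL_PATTERN, outer_html)
    if match is None:
        return None
    # Drop the "imgurl=" prefix and the trailing "&", then URL-decode.
    return unquote(match.group()[len("imgurl="):-len("&")])


# Made-up element markup; in outerHTML the "&" separators appear as "&amp;".
sample_html = (
    '<a class="islib" href="/imgres?imgurl=https%3A%2F%2Fexample.com%2Fcat.jpg'
    '&amp;imgrefurl=https%3A%2F%2Fexample.com%2F&amp;h=600">'
)

print(extract_image_url(sample_html))
```

With a greedy `\S*` instead of `\S*?`, the match would run on to the last `&` in the attribute and drag the imgrefurl parameter into the result, which is the kind of extra junk described above.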