
autohome_crawler's Introduction

### Project Overview

This project demonstrates how to build a web crawler with requests and BeautifulSoup. Each collected entry has the following format:

{
    "外观颜色": "晨露白,布里奇沃特青铜,马达加斯加橙,鲜绿,塞勒涅青铜,深蓝色,栗子黑", 
    "name": "Vanquish", 
    "url": "http://car.autohome.com.cn/price/brand-35.html", 
    "brand": "阿斯顿·马丁", 
    "车身结构": "硬顶跑车", 
    "变速箱": "自动", 
    "发动机": "6.0L", 
    "级别": "跑车", 
    "price": "526.88-628.00万"
}
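Because the entries mix ASCII keys with Chinese field names and values, how they are serialized matters. The following is a minimal sketch, not the project's actual output code: `dump_item` and `load_item` are hypothetical helpers, and the item is a shortened copy of the sample entry above.

```python
import json

# Shortened copy of the sample entry; the field values come from the README.
item = {
    "name": "Vanquish",
    "brand": "阿斯顿·马丁",
    "级别": "跑车",
    "price": "526.88-628.00万",
}


def dump_item(entry):
    """Serialize one crawled entry as readable JSON (hypothetical helper)."""
    # ensure_ascii=False keeps Chinese text as-is instead of \uXXXX escapes.
    return json.dumps(entry, ensure_ascii=False, indent=4)


def load_item(text):
    """Parse an entry back from its JSON text (hypothetical helper)."""
    return json.loads(text)


print(dump_item(item))
```

When writing the result to a file, opening it with `encoding="utf-8"` avoids depending on the platform's default encoding.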

### Usage

1. Clone this project.

2. Install the dependencies.

    ```
    # cd autohome_crawler
    # pip install -r requirements
    ```
	
3. Edit the configuration if needed.

    ```
    # vim setting.py
    ```
	
4. Run the crawl job. By default, results are downloaded to the `requests` directory.

    ```
    # python app.py
    ```

### Planned Improvements

1. Download retries: http://www.coglib.com/~icordasc/blog/2014/12/retries-in-requests.html
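One way the retry feature might look is sketched below. It is an illustration under stated assumptions, not the project's code: `fetch_with_retries` and `flaky_fetch` are hypothetical names, and `fetch` stands for any callable that downloads a URL and raises on failure (for example a thin wrapper around `requests.get`).

```python
import time


def fetch_with_retries(fetch, url, retries=3, backoff=0.5, sleep=time.sleep):
    """Call fetch(url), retrying failed attempts with exponential backoff.

    Hypothetical helper: `fetch` is any callable that returns the page or
    raises an exception when the download fails.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # narrow this to network errors in real code
            last_error = exc
            if attempt < retries:
                # Wait 0.5s, 1s, 2s, ... before the next attempt.
                sleep(backoff * (2 ** attempt))
    raise last_error


# Stand-in download function that fails twice, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise IOError("temporary failure")
    return "<html>ok</html>"


result = fetch_with_retries(
    flaky_fetch,
    "http://car.autohome.com.cn/price/brand-35.html",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

Passing `sleep` as a parameter keeps the helper easy to test. As an alternative, requests can retry at the transport level: its `HTTPAdapter` accepts a `max_retries` argument, which is the approach the linked post discusses.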

### Known Issues

1. When scraping the details of a specific car model, the color field occasionally fails to be captured.
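Until the cause is found, incomplete entries could at least be detected and queued for another attempt. The check below is a sketch only: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the list of required keys is an assumption based on the sample entry.

```python
# Assumed set of keys every entry should carry, based on the sample entry.
REQUIRED_FIELDS = ("name", "brand", "price", "外观颜色")


def missing_fields(item, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or empty in a crawled entry."""
    return [key for key in required if not str(item.get(key) or "").strip()]


complete = {
    "name": "Vanquish",
    "brand": "阿斯顿·马丁",
    "price": "526.88-628.00万",
    "外观颜色": "晨露白,深蓝色",
}
# Same entry with the color field lost during scraping.
incomplete = {key: value for key, value in complete.items() if key != "外观颜色"}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

An entry with a non-empty result could be logged with its `url` and fetched again later.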

autohome_crawler's People

Contributors

william-sang
