vino-crawlers's Introduction

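About

  • Simple crawlers written while learning Python, to fetch whatever data was needed.
  • Some of these programs were written a while ago, and the target sites' verification mechanisms have probably changed since, so treat them as reference only.
  • Updated irregularly. PRs are welcome.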

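Crawler Examples

  • Readme_Luowang: a small program that crawls music from Luowang (落网) and downloads it to local disk.
  • Readme_Baidu: a small program, based on Python 2.7, that downloads images from Baidu by keyword.
  • Readme_Zhihu: a program that scrapes some information from Zhihu.
  • Readme_One: crawls the daily picture and the Q&A section from the One website and stores them in the LeanCloud cloud backend.
  • Readme_Sujin: crawls good articles from the Sujin (素锦) website and stores them in the LeanCloud cloud backend.
  • Readme_Douban: crawls the Douban Books Top 250.
  • Readme_Lagou: crawls a fairly large number of job postings from Lagou and stores them in a NoSQL database.
  • Readme_XiciDaili: copied from a Zhihu answer, then changed to store results in MongoDB and extended with a validation step. Availability is still not very high, roughly 30%.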
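The validation step mentioned for Readme_XiciDaili could look roughly like the sketch below. It is not the repository's code: it assumes Python 3 and the standard library only, the probe URL and the names `proxy_works` and `filter_proxies` are hypothetical, and the MongoDB storage step is left out.

```python
import http.client
import urllib.request
from typing import Callable, Iterable, List

# Hypothetical probe target, not taken from the repository.
TEST_URL = "http://example.com/"


def proxy_works(proxy: str, timeout: float = 5.0) -> bool:
    """Return True if a plain HTTP request routed through `proxy` succeeds.

    `proxy` is expected in "host:port" form (an assumption of this sketch).
    """
    handler = urllib.request.ProxyHandler({"http": "http://" + proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(TEST_URL, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, timeouts and socket errors are OSError subclasses;
        # malformed replies from a bad proxy raise HTTPException.
        return False


def filter_proxies(candidates: Iterable[str],
                   check: Callable[[str], bool] = proxy_works) -> List[str]:
    """Keep only the candidates for which `check` returns True."""
    return [proxy for proxy in candidates if check(proxy)]
```

Calling `filter_proxies(scraped_list)` would probe each address over the network. Passing a different `check` function makes it possible to swap in another probe, or to exercise the filter offline.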

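Crawler Basics

Advanced Crawling

Data Analysis

Python Resources

Recommended Books

  • Python for Data Analysis (《用 Python 进行数据分析》)
  • Learning Data Mining with Python (《Python 数据挖掘入门与实战》)
  • Clean Data (《干净的数据-数据清洗与入门实践》)
  • Web Scraping with Python (《Python 网络数据采集》)
  • Programming Collective Intelligence (《集体智慧编程》)
  • Introduction to Data Mining (《数据挖掘导论》)

Acknowledgments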

vino-crawlers's People

Contributors

dlx4, suzumiyang, wuchangfeng, zhisheng17
