meituan's Introduction

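Meituan (food) shop information crawler

  Scrapes Meituan food shop information through the site's API and runs some data analysis on the results.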

Project structure

.
├── README.md
├── common.py
├── configs
│   ├── config.py
│   ├── parse.py
│   ├── requirements.txt
│   ├── token_.py
│   ├── utils
│   │   ├── br.json
│   │   ├── cities.json
│   │   ├── ua.log
│   │   └── uuid.log
│   ├── view
│   │   ├── FZSTK.TTF
│   │   ├── db.jpg
│   │   ├── jing.jpeg
│   │   ├── key.png
│   │   ├── pricom.jpg
│   │   ├── ratio.jpg
│   │   ├── title.txt
│   │   └── top10.jpg
│   └── visual.py
├── meituan.py

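Main implementation steps

  1. Assemble the basic request parameters: token, cookie, ua (User-Agent), and so on
  2. Fetch the data with requests
  3. Parse the JSON data
  4. Save the data to a MySQL database
  5. Run a visual analysis with matplotlib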
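
Steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in parse.py or meituan.py: the response layout (`data` / `poiInfos`) and the field names (`poiId`, `title`, `avgScore`, `avgPrice`, `allCommentNum`) are assumptions made for the example, and the standard-library `sqlite3` module stands in for MySQL so the snippet runs without a database server.

```python
import json
import sqlite3

# Hypothetical sample payload; the real API response may be shaped differently.
SAMPLE_RESPONSE = json.dumps({
    "data": {
        "poiInfos": [
            {"poiId": 1, "title": "Shop A", "avgScore": 4.5,
             "avgPrice": 60, "allCommentNum": 120},
            {"poiId": 2, "title": "Shop B", "avgScore": 3.8,
             "avgPrice": 35, "allCommentNum": 40},
        ]
    }
})


def parse_shops(raw):
    """Turn a JSON response string into a list of row tuples."""
    payload = json.loads(raw)
    shops = payload.get("data", {}).get("poiInfos", [])
    return [
        (s["poiId"], s["title"], s.get("avgScore"),
         s.get("avgPrice"), s.get("allCommentNum"))
        for s in shops
    ]


def save_shops(conn, rows):
    """Create the table if needed and upsert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shops ("
        "poi_id INTEGER PRIMARY KEY, title TEXT, avg_score REAL, "
        "avg_price REAL, comment_num INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO shops VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()


# sqlite3 (in memory) is used here in place of MySQL.
conn = sqlite3.connect(":memory:")
rows = parse_shops(SAMPLE_RESPONSE)
save_shops(conn, rows)
```

With a MySQL driver the structure stays the same; only the connection setup and the placeholder style in the SQL statements would change.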

Dependencies

pip3 install -r configs/requirements.txt

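Notes

  1. Dynamic API parameters: uuid, _token, cookie
  2. The uuid parameter has to be re-fetched from the page source from time to time; otherwise the uuid embedded in _token becomes invalid.
  3. The _token parameter is encoded by binary compression followed by Base64 encoding, and decoded by Base64 decoding followed by binary decompression. The sign parameter used to generate the token is encoded and decoded the same way as _token.
  4. The cookie parameter has to be obtained after logging in from a desktop (PC) browser.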
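
The encode/decode scheme described in note 3 can be sketched as below. This is only an illustration under assumptions: "binary compression" is taken to mean zlib (deflate), the payload is taken to be a JSON object, and the helper names `encode_token` / `decode_token` as well as the sample fields are hypothetical, not taken from token_.py.

```python
import base64
import json
import zlib


def encode_token(params):
    """Serialize to JSON, compress with zlib, then Base64-encode."""
    raw = json.dumps(params, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_token(token):
    """Reverse of encode_token: Base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(token))
    return json.loads(raw.decode("utf-8"))


# Hypothetical payload fields, for illustration only.
sample = {"uuid": "example-uuid", "page": 1, "cityName": "Beijing"}
token = encode_token(sample)
restored = decode_token(token)
```

Decoding a token captured from the browser in this way is a convenient check on which fields the site expects before trying to generate one.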

Running

Switch to the root of the meituan folder and run the following (make sure the required modules and the database are installed first):

# pip3 install -r configs/requirements.txt
python common.py
python meituan.py

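Data analysis results

  • Word cloud of food shop names

    [image: key]

  • Top 10 food shops in Beijing (Meituan data only)

    [image: top10]

  • Correlation between shop price and number of reviews

    [image: pricom]

  • Share of food shops by rating

    [image: ratio]

  • MySQL data

    [image: db]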

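Notice

This code is for learning and exchange only. Do not use it for commercial purposes; you bear the consequences if you do. If anything here infringes Meituan's rights, please contact me and I will deal with it as soon as possible.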

meituan's People

Contributors

chenshuaikang

meituan's Issues

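url 403

Accessing the assembled URL directly currently returns 403. What is going on?

Error when running common.py

[image: error screenshot]

Hi, running common.py locally fails with the error shown above. Which settings in the code need to be changed?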
