googlemap_spider's Introduction

GoogleMap_Spider

[iewoai] Scraping business information from Google Maps (requests version)

Google Maps spider (requests version)

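I. Goal: search for "company", vary the coordinates, and collect the information of every company listed on Google Maps

Note: the requests version only implements the basic crawling logic. Other concerns, such as deduplication and crawl speed, are not handled.

Approach: start from one point and pick the best zoom level Z for it (adjusted in integer steps). Use that point's tile side length, or its 1d value, as the travel distance, convert that distance into equal-length latitude and longitude degree offsets, and move by them. Repeat the same step at every point reached.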

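The following is purely my own understanding:

1. At the same latitude and longitude, different zoom levels return different numbers of companies, so it is hard to pick an optimal value.

2. When there is no optimal value, i.e. there are no companies near the coordinate, a default degree step (the amount by which latitude and longitude are increased or decreased) must be set.

3. Movement goes forward, backward, left, and right: fix either the longitude or the latitude, and move the other one in the positive or negative direction.

4. With latitude fixed and longitude moving, the degree offset depends on both the latitude and the distance. With longitude fixed and latitude moving, the degree offset depends only on the distance.

5. The optimal value is the zoom level with the most companies among the levels from 12 (latitude step 0.82739) to 18 (latitude step 0.00899), inclusive. Use its 1d (see URL parsing) as the distance and move up, down, left, and right. When the company count is 0 at every level from 12 to 18, use the default degree step.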
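The selection and movement rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the repository's code: the function names `pick_best_zoom` and `neighbours` are hypothetical, the company counts per zoom level are assumed to have been fetched already, and ties are broken in favour of the first level encountered, which the original notes do not specify.

```python
from typing import Dict, List, Optional, Tuple

# Zoom range considered by rule 5 (taken from the notes above).
MIN_ZOOM, MAX_ZOOM = 12, 18


def pick_best_zoom(counts: Dict[int, int]) -> Optional[int]:
    """Return the zoom level in [12, 18] with the most companies.

    `counts` maps zoom level -> number of companies found at that level.
    Returns None when every level in range has 0 companies, in which
    case the caller should fall back to the default degree step.
    """
    candidates = {
        z: n for z, n in counts.items()
        if MIN_ZOOM <= z <= MAX_ZOOM and n > 0
    }
    if not candidates:
        return None
    # max() keeps the first maximal key, so ties go to the earliest entry.
    return max(candidates, key=candidates.get)


def neighbours(lat: float, lon: float,
               dlat: float, dlon: float) -> List[Tuple[float, float]]:
    """Four moves from (lat, lon): fix one axis, step the other by +/- delta."""
    return [
        (lat + dlat, lon),  # up: longitude fixed
        (lat - dlat, lon),  # down: longitude fixed
        (lat, lon + dlon),  # right: latitude fixed
        (lat, lon - dlon),  # left: latitude fixed
    ]
```

Each point returned by `neighbours` would then go through the same zoom selection again, which is what makes the crawl spread outward from the starting point.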

II. Reference material on how tile maps work:

III. URL parsing, part 1

1. X: latitude, range [-85.05112877980659, 85.05112877980659]

2. Y: longitude, range [-180, 180]

3. Z: zoom level, range [2, 21]

4. At Z=2, the side length of a square tile is 20037508.3427892
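As a rough sketch of how these ranges might be used, the snippet below validates a coordinate and derives a tile side length for other zoom levels. The halving of the side length with each zoom step is the usual tile-pyramid convention and is an assumption here, since the notes only give the value at Z=2. The names `check_xyz` and `tile_side` are hypothetical.

```python
# Values taken from the list above.
LAT_LIMIT = 85.05112877980659
LON_LIMIT = 180.0
Z_MIN, Z_MAX = 2, 21
Z2_TILE_SIDE = 20037508.3427892  # tile side length at Z=2


def check_xyz(lat: float, lon: float, z: int) -> bool:
    """True when latitude, longitude and zoom are all inside the listed ranges."""
    return (
        -LAT_LIMIT <= lat <= LAT_LIMIT
        and -LON_LIMIT <= lon <= LON_LIMIT
        and Z_MIN <= z <= Z_MAX
    )


def tile_side(z: int) -> float:
    """Tile side length at zoom z, assuming it halves with every zoom step."""
    if not Z_MIN <= z <= Z_MAX:
        raise ValueError(f"zoom must be in [{Z_MIN}, {Z_MAX}], got {z}")
    return Z2_TILE_SIDE / 2 ** (z - Z_MIN)
```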

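URL parsing, part 2

1. hl={1} is the language, usually zh-CN or en

2. g1={2} is the abbreviation of the current country or region

3. !1d{3} is 1d. It depends on the zoom level: between two adjacent integer zoom levels the 1d value changes by a factor of 1/2, and the higher the zoom level, the smaller the value. The maximum is 94618532.08008283. My guess is that it is proportional to a length of the current map tile, such as its side length or perimeter.

4. !2d{4} is 2d, the longitude

5. !3d{5} is 3d, the latitude

6. {6} is the search result page, written as !8i+page, where page is a multiple of 20. When page is 0 (i.e. the first page), the URL has no !8i field.

7. q={7}&oq={8}: both are essentially the search term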
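The fields above can be illustrated with a few small helpers. This is a hedged sketch only: the notes do not give the full URL template, so the code builds the individual fragments and does not assemble a complete request URL. All function names are hypothetical, and the zoom level at which 1d reaches its maximum is not stated in the notes, so the halving helper works in relative steps.

```python
from urllib.parse import urlencode

MAX_ONE_D = 94618532.08008283  # largest 1d value reported in the notes


def one_d_after_steps(steps: int, start: float = MAX_ONE_D) -> float:
    """1d after zooming in `steps` integer levels from a level whose 1d is `start`."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return start / 2 ** steps


def position_tokens(one_d: float, lon: float, lat: float) -> str:
    """Fragment holding 1d, longitude (2d) and latitude (3d)."""
    return f"!1d{one_d}!2d{lon}!3d{lat}"


def page_token(page: int) -> str:
    """Pagination fragment: empty on the first page, otherwise '!8i<page>'."""
    if page < 0 or page % 20 != 0:
        raise ValueError("page must be a non-negative multiple of 20")
    return "" if page == 0 else f"!8i{page}"


def query_params(keyword: str, hl: str = "en") -> str:
    """Query-string part carrying the language and the search term (q and oq)."""
    return urlencode({"hl": hl, "q": keyword, "oq": keyword})
```

For example, `page_token(0)` yields an empty string while `page_token(40)` yields `!8i40`, matching the rule that the first page carries no `!8i` field.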

IV. Converting between latitude/longitude and distance
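One common way to do this conversion is shown below as a sketch. It assumes a spherical Earth with a mean radius of 6,371,000 m, which is an assumption of this example and not a value taken from the repository, and it treats the distance as being in metres. The function names are hypothetical. It reflects the earlier observation that a latitude offset depends only on the distance, while a longitude offset also depends on the latitude.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in metres


def metres_to_dlat(distance_m: float) -> float:
    """Latitude offset in degrees for a north-south move of distance_m metres."""
    return math.degrees(distance_m / EARTH_RADIUS_M)


def metres_to_dlon(distance_m: float, lat: float) -> float:
    """Longitude offset in degrees for an east-west move at latitude `lat`.

    Circles of latitude shrink by cos(lat), so the same distance spans
    more degrees of longitude the further the point is from the equator.
    """
    cos_lat = math.cos(math.radians(lat))
    if cos_lat < 1e-12:
        raise ValueError("longitude offset is undefined at the poles")
    return math.degrees(distance_m / (EARTH_RADIUS_M * cos_lat))


def dlat_to_metres(dlat: float) -> float:
    """Inverse of metres_to_dlat: north-south distance for a latitude offset."""
    return math.radians(dlat) * EARTH_RADIUS_M
```

Under this assumption, 1000 m corresponds to roughly 0.00899 degrees of latitude, and the same 1000 m at latitude 60 spans about twice as many degrees of longitude.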

googlemap_spider's People

Contributors

iewoai
