dml-2021 / amazon-spider

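Amazon crawler: a distributed crawler system for Amazon that has run stably in production for two years and handles more than 50 million crawls per day. It covers all data on the Amazon storefront, including the Amazon product crawler, keyword crawler (can crawl for a specified ZIP code), keyword ad crawler, crawler for the five major ranking lists, review crawler, category crawler, seller crawler, buyer information crawler, crawler for other sellers' offers on the same listing, Q&A crawler, and promotion crawler.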

amazon amazon-product amazon-review crawler java nodejs spider amazon-keyword puppeteer

amazon-spider's Introduction

amazon-spider

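Amazon crawler: a distributed crawler system for Amazon that has run stably in production for two years and handles more than 60 million crawls per day. It covers all data on the Amazon storefront: products, keywords (with an optional ZIP code), ads, the five major ranking lists, reviews, categories, sellers, buyer information, other sellers' offers on the same listing, Q&A, promotions, and more.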

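It addresses the following problems:

1. Large-scale crawling of all Amazon storefront data; it currently crawls more than 60 million records per day.
2. Countering anti-crawling: getting past Amazon's anti-crawling measures so that crawling stays stable.
3. Obtaining high-quality data: once Amazon's risk-control system suspects that you are a crawler, the quality of the returned data drops sharply; for example, the returned keyword pages and product pages are missing ad data.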

Cooperation options

1. API access
2. Private deployment

Contact (WeChat): Dml-2021

