yupengyan's Projects

pyppeteer

Headless Chrome/Chromium automation library (unofficial port of Puppeteer)

python-goose

HTML content / article extractor, web scraping library in Python

python-oauth2

A fully tested, abstract interface to creating OAuth clients and servers.

scannerlite

An OpenCV program implementing the recognition feature of the app "CamScanner". It extracts the main document object from an image and adjusts it to A4 size.

scel2mmseg

Converts a Sogou input dictionary (.scel file) to an mmseg (coreseek) dictionary

scrapely

A pure-Python HTML screen-scraping library

scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.

scrapy-distribute_crawler

A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite. A MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status.

scrapy-mongodb

MongoDB pipeline for Scrapy. This module supports both standalone MongoDB setups and replica sets. scrapy-mongodb inserts items into MongoDB as soon as your spider finds data to extract.
