Git Product home page Git Product logo

crawler-project's Introduction

Crawler-website

It's crawler website using Go language.

This is the mainPage image

This is the homePage image

Features

  • Go language
  • Docker
  • Elastic Search
  • MVC pattern
  • Microservices
  • Singleton -> Concurrent -> Distribute

Installation and go packages

  • go language
  • docker
  • elasticsearch
  • go get golang.org/x/text
  • go get -v github.com/gpmgo/gopm
  • gopm get -g -v golang.org/x/text
  • gopm get -g -v golang.org/x/net/html
  • go get gopkg.in/olivere/elastic.v5

Usage for Concurrent

  • Start Docker.
  • Run Script "docker run -d -p 9200:9200 elasticsearch"
  • Run "src/crawler/main.go", to start the singleton crawler.
  • Run "src/crawler/frontend/starter.go", to view the result in the website.
  • Visit "http://localhost:8888/" in your browser
  • Type in query string with REST format. such as "女 && Age>20"

Usage for Distribute

  • Start Docker.
  • Run Script "docker run -d -p 9200:9200 elasticsearch"
  • Open a Terminal, execute: src\crawler_distributed\persist\server>go run ItemSaver.go --port=1234
  • Open a Terminal, execute: src\crawler_distributed\worker\server>go run worker.go --port=9000
  • Open a Terminal, execute: src\crawler_distributed\worker\server>go run worker.go --port=9001
  • Open a Terminal, execute: src\crawler_distributed>go run main.go --itemsaver_host=":1234" --worker_hosts=":9000,:9001"
  • Run "src/crawler/frontend/starter.go", to view the result in the website.
  • Visit "http://localhost:8888/" in your browser
  • Type in query string with REST format. such as "男 && 已购车"

Architecture

image

Framework

image

Algorithm

image

Reference

crawler-project's People

Contributors

albert-w avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.