dns's Introduction

CREATE TABLE `domains` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `word` varchar(6) NOT NULL DEFAULT '',
  `word_len` int(11) NOT NULL,
  `pinyin` varchar(22) NOT NULL DEFAULT '',
  `pinyin_len` int(11) NOT NULL,
  `available` int(11) NOT NULL,
  `status` int(11) NOT NULL,
  `entry_cnt` int(11) NOT NULL,
  `query_cnt` int(11) NOT NULL,
  `query_status` int(11) NOT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=70480 DEFAULT CHARSET=utf8;
<?php 

namespace App\Console\Commands;

use App\Console\Boot;
use App\Http\ZhiHu;

class CrawlerZhiHu extends Boot
{

    /** @var string Command signature */
    protected $signature = 'crawler:zhihu {mutix?}';

    /** @var string Command description */
    protected $description = 'Crawl Zhihu question pages';

    public function __construct()
    {
        parent::__construct();
    }

    public function handle()
    {
        $this->start();

        // Allow mass assignment on the ZhiHu model for the duration of the crawl.
        ZhiHu::unguard(true);

        $this->grap();

        $this->end();
    }

    public function grap()
    {
        while (true) {
            // Fetch the next batch of URLs that have not been crawled yet.
            $zhihus = ZhiHu::whereStatus(0)->limit(100)->get();

            // Stop once nothing is pending; otherwise the loop never ends
            // and handle() never reaches $this->end().
            if ($zhihus->isEmpty()) {
                break;
            }

            foreach ($zhihus as $zhihu) {
                $craw = new Crawler();
                $url = $zhihu->url;

                $craw->get($url)->startFilter();

                // Store the question title and mark the URL as crawled.
                $zhihu->title = $craw->filter('h2.zm-item-title')->text();
                $zhihu->status = 1;
                $zhihu->save();

                $this->info($url);

                // Queue every linked question that is not in the table yet.
                $craw->filter('a.question_link')->each(function ($node) {
                    $child_url = 'http://www.zhihu.com' . $node->attr('href');
                    if (!ZhiHu::where('url', $child_url)->first()) {
                        ZhiHu::saveData(['url' => $child_url]);
                    }
                });
            }
        }
    }


}

 ?>
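
The crawler above builds child URLs by prepending `http://www.zhihu.com` to every `href`, which assumes each link is root-relative. A minimal sketch of a more defensive variant follows. The helpers `toAbsoluteUrl` and `collectChildUrls` are hypothetical names introduced for illustration and are not part of this repository. Bare relative paths are resolved against the site root, which is a simplification.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not in the original repo): turns an href taken from a
 * crawled page into an absolute URL, or returns null for links to skip.
 */
function toAbsoluteUrl(string $href, string $base = 'http://www.zhihu.com'): ?string
{
    $href = trim($href);

    // Skip empty links, in-page anchors and javascript: pseudo-links.
    if ($href === '' || $href[0] === '#' || stripos($href, 'javascript:') === 0) {
        return null;
    }

    // Already absolute: keep as-is.
    if (preg_match('#^https?://#i', $href) === 1) {
        return $href;
    }

    // Protocol-relative link: borrow the scheme from the base URL.
    if (strpos($href, '//') === 0) {
        return parse_url($base, PHP_URL_SCHEME) . ':' . $href;
    }

    // Root-relative or bare path: join with exactly one slash.
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

/**
 * Hypothetical helper: returns the distinct absolute URLs for a list of
 * hrefs, preserving first-seen order.
 */
function collectChildUrls(array $hrefs, string $base = 'http://www.zhihu.com'): array
{
    $seen = [];
    foreach ($hrefs as $href) {
        $url = toAbsoluteUrl($href, $base);
        if ($url !== null) {
            $seen[$url] = true; // array keys de-duplicate repeated links
        }
    }

    return array_keys($seen);
}
```

Collecting the hrefs of one page first and passing them through `collectChildUrls` would also avoid running one existence query per duplicate link on the same page.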
 

Contributors

polimao
