Git Product home page Git Product logo

gecco-spring's Introduction

gecco-spring

gecco爬虫和spring结合使用。1.2.9版本开始支持spring-boot。spring升级到4.x。

Download

<dependency>
    <groupId>com.geccocrawler</groupId>
    <artifactId>gecco-spring</artifactId>
    <version>x.x.x</version>
</dependency>

maven

初始化Gecco

加载完成bean后启动Gecco,可以通过继承SpringGeccoEngine类,初始化你的GeccoEngine,需要特别注意的是GeccoEngine需要用非阻塞模式start()运行:

@SpringBootApplication
@Configuration
public class App {

    @Bean
    public SpringGeccoEngine initGecco() {
        return new SpringGeccoEngine() {
            @Override
            public void init() {
                GeccoEngine.create()
                .pipelineFactory(springPipelineFactory)
                .classpath("com.geccocrawler.gecco.spring")
                .start("https://github.com/xtuhcy/gecco")
                .interval(3000)
                .loop(true)
                .start();
            }
        };
    }
    
    public static void main(String[] args) throws Exception {
        SpringApplication.run(App.class, args);
    }
    
}

开发Pipeline

pipeline的开发和之前一样,唯一不同的是不需要@PipelineName("consolePipeline")定义pipeline的名称,而是使用spring的@Service定义,spring的bean名称即为pipeline的名称。可以参考:

@Service("consolePipeline")
public class ConsolePipeline implements Pipeline<SpiderBean> {
	@Override
	public void process(SpiderBean bean) {
		System.out.println(JSON.toJSONString(bean));
	}
}

也可以使用@Configuration和@Bean定义pipeline。如:

@Configuration
public class BeanConfigure {
    
    @Bean(name="consolePipeline")
    public ConsolePipeline consolePipeline() {
        return new ConsolePipeline();
    }
}

DEMO

参考源代码中测试用例src/test,有详细的例子

gecco-spring's People

Contributors

xtuhcy avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.