kafka-long-running-jobs

This repo contains examples of dealing with long-running jobs using Spring and Apache Kafka. Its purpose is purely informative: to give developers an idea of the different approaches they can take.

The project has 5 submodules, each independent of the others, so everything you need can be found by looking at a single module. This does mean that code is duplicated across modules. However, since the purpose is to show how to tackle the problem, I found it easier if you only had to look in a single module instead of navigating all over the repo.

All the submodules have (mostly) the same classes:

  • Application: main class to start the module
  • Consumer: the Kafka consumer that consumes the event and starts the long-running job
  • Controller: contains an endpoint that, when called, produces an event (which is consumed by the consumer)
    • A Postman collection (kafka-long-running-jobs.postman_collection.json) has been added with all endpoint calls
  • LongRunningJob: a simulation of a long-running job; it performs a Thread.sleep of 10 minutes

Furthermore, the configurations needed for the different approaches are in resources/application.yaml.

Below you can find a summary of the 5 submodules. For a more detailed explanation of the issue and all of the solutions, I would like to refer to my blog on the subject: TODO

Module: microprocesses

This module shows how splitting a job into multiple shorter-running microprocesses would work. Of course, all processes are fictional (they're just Thread.sleep calls), but flow-wise this is how it would work.

This module relates to the section Split the job into microprocesses in the blog.
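To give a rough idea of the flow, here is a minimal plain-Java sketch. It is not the module's code: the in-memory queue stands in for a Kafka topic, and StepEvent, TOTAL_STEPS and the method names are hypothetical. Each handled event does one short piece of work and then emits the event for the next step, so no single handler call runs for long.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class MicroprocessSketch {
    /** Hypothetical event: which step to run next for a given job. */
    static final class StepEvent {
        final String jobId;
        final int step;

        StepEvent(String jobId, int step) {
            this.jobId = jobId;
            this.step = step;
        }
    }

    // Assumption for this sketch: the long job is split into 3 short steps.
    static final int TOTAL_STEPS = 3;

    /** Handles one event: runs a single short step, then emits the follow-up event, if any. */
    static void handle(StepEvent event, Queue<StepEvent> topic, List<String> log) {
        log.add(event.jobId + ":step-" + event.step); // stands in for the short unit of work
        if (event.step < TOTAL_STEPS) {
            topic.add(new StepEvent(event.jobId, event.step + 1));
        }
    }

    /** Drains the in-memory "topic" the way a consumer poll loop would. */
    static List<String> run(String jobId) {
        Queue<StepEvent> topic = new ArrayDeque<>();
        List<String> log = new ArrayList<>();
        topic.add(new StepEvent(jobId, 1));
        while (!topic.isEmpty()) {
            handle(topic.poll(), topic, log);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run("job-1"));
    }
}
```

Because every step is triggered by its own event, the consumer gets back to polling between steps instead of being blocked for the whole job.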

Module: not-async

This module has nothing async to it. It exists to show what the default behaviour is (spoiler alert: the consumer leaves the group).

This module relates to the section Problem in the blog. Note that it also contains the configurations mentioned in the section Increase the timeout.
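The arithmetic behind the problem can be sketched in a few lines of plain Java. Kafka's max.poll.interval.ms consumer setting defaults to 300000 ms (5 minutes), while the simulated job sleeps for 10 minutes, so a listener that runs the job synchronously cannot poll again in time. The class and method names below are hypothetical, and the check is only an illustration, not something the module contains.

```java
import java.time.Duration;

class PollIntervalCheck {
    // Kafka's documented default for max.poll.interval.ms (5 minutes).
    static final Duration DEFAULT_MAX_POLL_INTERVAL = Duration.ofMillis(300_000);

    /**
     * True when processing one polled batch synchronously would block
     * the poll loop for longer than the allowed interval.
     */
    static boolean exceedsPollInterval(Duration jobDuration, int recordsPerPoll, Duration maxPollInterval) {
        Duration batchDuration = jobDuration.multipliedBy(recordsPerPoll);
        return batchDuration.compareTo(maxPollInterval) > 0;
    }

    public static void main(String[] args) {
        Duration job = Duration.ofMinutes(10); // matches the simulated job in this repo

        // Default interval: a single 10-minute job is already too slow.
        System.out.println(exceedsPollInterval(job, 1, DEFAULT_MAX_POLL_INTERVAL));

        // Assumed increased timeout of 15 minutes: one job fits, two per poll do not.
        System.out.println(exceedsPollInterval(job, 1, Duration.ofMinutes(15)));
        System.out.println(exceedsPollInterval(job, 2, Duration.ofMinutes(15)));
    }
}
```

The last call shows why increasing the timeout alone is fragile: the interval has to cover every record returned by a single poll, not just one job.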

Module: spring-async

The most straightforward approach: add an @Async annotation to the long-running job so that it is executed on a separate thread.

  • Pro: easy to set up. If resources (CPU usage/memory) aren't an issue, then this approach works well, since it handles multiple events concurrently
  • Con: if the long-running job is resource-intensive, the concurrency can become a bottleneck (too much CPU usage, OOM issues etc.). Also, since it's a fire-and-forget approach, retries and Kafka error handlers require more work

This module relates to the section Fire-and-forget threading in the blog.
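The fire-and-forget idea can be illustrated without Spring. The sketch below is an assumption-laden stand-in, not the module's code: a plain ExecutorService plays the role of the executor behind @Async, and consumeAll is a hypothetical name. The "listener" hands each job to the pool and moves on immediately; the latch exists only so the demo can report when the jobs are done.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FireAndForgetSketch {
    /**
     * Hands one simulated job per event to a thread pool without waiting for it.
     * Returns true if all jobs finished within the timeout. Requires events > 0.
     */
    static boolean consumeAll(int events, long jobMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(events);
        CountDownLatch done = new CountDownLatch(events);
        for (int i = 0; i < events; i++) {
            // Fire and forget: execute() returns right away, the job runs on a pool thread.
            executor.execute(() -> {
                try {
                    Thread.sleep(jobMillis); // simulated long-running job
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            // Only the demo waits here; a real listener would already be polling again.
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(consumeAll(4, 200, 5_000));
    }
}
```

Note that the caller never learns whether an individual job failed, which is the reason retries and error handling take extra work with this approach.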

Module: pause-container

Uses the KafkaListenerEndpointRegistry to get the listener container (the Spring container that wraps the consumer) and uses that container to pause and resume the consumer.

  • Pro: more control over success and error callbacks and thus error handling
  • Con: requires more effort than the spring-async approach.

This module relates to the section Pause consumer with threading and callbacks in the blog.
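The pause, run, resume flow can be sketched in plain Java as follows. This is not the module's code: PausableContainer is a hypothetical stand-in for the listener container obtained from the registry, RecordingContainer exists only to make the calls visible, and CompletableFuture is one possible way to get success and error callbacks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class PauseResumeSketch {
    /** Hypothetical stand-in for the listener container; only pause/resume are modelled. */
    interface PausableContainer {
        void pause();
        void resume();
    }

    /** Records the calls it receives so the flow can be inspected. */
    static final class RecordingContainer implements PausableContainer {
        final List<String> calls = Collections.synchronizedList(new ArrayList<>());
        public void pause() { calls.add("pause"); }
        public void resume() { calls.add("resume"); }
    }

    /**
     * Pauses the container, runs the job on another thread, and resumes the
     * container in a callback whether the job succeeded or failed.
     */
    static CompletableFuture<Void> onEvent(PausableContainer container, Runnable job, List<String> outcome) {
        container.pause();
        return CompletableFuture.runAsync(job).<Void>handle((result, error) -> {
            // Success and error callbacks share one place, so error handling is explicit.
            outcome.add(error == null ? "success" : "failure");
            container.resume();
            return null;
        });
    }

    public static void main(String[] args) {
        RecordingContainer container = new RecordingContainer();
        List<String> outcome = Collections.synchronizedList(new ArrayList<>());
        onEvent(container, () -> { throw new IllegalStateException("boom"); }, outcome).join();
        System.out.println(container.calls + " " + outcome);
    }
}
```

Resuming inside the callback, rather than after a blocking wait, keeps the listener thread free while the job runs.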

Module: pause-container-with-acknowledge

Same as pause-container, except that it has auto-commit disabled and uses the Acknowledgment object to acknowledge the event (and commit).

  • Pro: more control, and the ability to not acknowledge events when an error occurs
  • Con: more effort than the two approaches above. Also, don't forget to acknowledge

This module relates to the section Pause consumer with threading and callbacks in combination with the sections Limit number of events consumed per poll and Use manual commits in the blog.
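The "acknowledge only on success" rule can be sketched on its own, leaving out the pause and resume part. The Ack interface below is a hypothetical stand-in for the acknowledgment handle passed to the listener, and process is a hypothetical method name; this is an illustration under those assumptions rather than the module's code.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ManualAckSketch {
    /** Hypothetical stand-in for the acknowledgment handle passed to the listener. */
    interface Ack {
        void acknowledge();
    }

    /** Runs the job and acknowledges only when it succeeds; returns whether it acknowledged. */
    static boolean process(Runnable job, Ack ack) {
        try {
            job.run();
        } catch (RuntimeException e) {
            // Not acknowledged: the offset is not committed, so the event stays eligible for redelivery.
            return false;
        }
        ack.acknowledge();
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger acks = new AtomicInteger();

        boolean first = process(() -> { }, acks::incrementAndGet);
        boolean second = process(() -> { throw new IllegalStateException("boom"); }, acks::incrementAndGet);

        System.out.println(first + " " + second + " acks=" + acks.get());
    }
}
```

Every successful path has to end in an acknowledge call; a forgotten one means the offset is never committed.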
