tasques's Introduction

Tasques

Task queues backed by Elasticsearch (ES): Tasques

Pronounced: /tɑːsks/, like "tasks"

You know, for Background Tasks!

Why use ES as a Tasks data store? It's horizontally scalable, highly-available, and offers a lot of built-in ways to manage your data (lifecycle management, snapshots, etc).


Features:

  • Easily scalable:
    • Servers are stateless; just spin more up as needed
    • The storage engine is Elasticsearch, 'nuff said.
  • Tasks are configurable:
    • Priority
    • Schedule to run later
    • Retries with exponential increase in retry delays.
  • Idempotency
  • Recurring Tasks that are repeatedly enqueued at configurable intervals (cron format with basic macro support à la @every 1m)
    • Also supports skipping if outstanding Tasks exist for a given Recurring Task
  • Timeouts for Tasks that are picked up by Workers but either don't report in or don't finish on time.
  • Archiving of completed Tasks (DONE or DEAD), also configurable
    • If Index Lifecycle Management (ILM) is enabled (default), the archive index is set to roll over automatically for easy management of old data.
  • Unclaiming allows Tasks that get picked up but can't be handled to be requeued without consequence.
  • API is exposed as Swagger; easily generate clients in any language:
    • Use the client to enqueue Tasks from your application
    • Workers are just a loop around the client, then your own business logic to do the actual work.
  • Pre-seeded Kibana Index Patterns and Dashboards for monitoring tasks.
  • Simple configuration: use a config file, optionally override with environment variables (12 factor-ready).
  • Application performance monitoring: metrics are exported to APM and can be viewed in Kibana (more below).
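
As a rough illustration of the "retries with exponential increase in retry delays" feature listed above, the sketch below doubles a base delay on every attempt and caps it at a maximum. The function name `retryDelay`, the doubling factor, and the cap are assumptions made for the example; the formula Tasques actually uses is not stated here.

```go
package main

import (
	"fmt"
	"time"
)

// retryDelay returns the wait before retry number `attempt` (0-based),
// doubling `base` on each attempt and capping the result at `max`.
// Illustrative formula only; it is not taken from the Tasques source.
func retryDelay(base, max time.Duration, attempt int) time.Duration {
	delay := base
	for i := 0; i < attempt; i++ {
		if delay >= max/2 {
			return max // doubling again would reach or exceed the cap
		}
		delay *= 2
	}
	if delay > max {
		return max
	}
	return delay
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, retryDelay(time.Second, 10*time.Second, attempt))
	}
}
```

With a base of one second and a cap of ten seconds, the delays grow as 1s, 2s, 4s, 8s and then stay at 10s.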

Usage

Running

Tasques is available as a small Docker image, with images published automatically by CI upon pushes/merges to master and tag pushes.

To get a quick idea of what is included and how to get up and running with Kubernetes:

  1. Go to docker/k8s and run make install-eck deploy, and wait until the pods are all ready (kubectl get pods)
  2. For Swagger, go to localhost:8080/swagger/index.html
  3. To log into Kibana for dashboards and APM stats, run make show-credentials to get the elastic user password, and go to localhost:5601 to log in.

There is also an example project that demonstrates the application-tasques-worker relationship more thoroughly; please see example/ciphers for more details.
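
To give a feel for the Worker side of that relationship, here is a minimal sketch of a Worker as "a loop around the client". The `Client` interface, its method names, and the `Task` type are hypothetical stand-ins for whatever the generated Swagger client exposes, and `fakeClient` exists only so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Task and Client are hypothetical stand-ins for the generated client's types.
type Task struct {
	ID      string
	Payload string
}

type Client interface {
	Claim(workerID, queue string) (*Task, bool) // false when nothing is claimable
	MarkDone(workerID string, t *Task)
	MarkFailed(workerID string, t *Task, cause error)
}

var errEmptyPayload = errors.New("empty payload")

// handleNonEmpty is placeholder business logic: it rejects empty payloads.
func handleNonEmpty(t *Task) error {
	if t.Payload == "" {
		return errEmptyPayload
	}
	return nil
}

// runWorker claims Tasks until none are left, runs the business logic on each,
// and reports the outcome. It returns how many Tasks were processed.
// A long-running Worker would sleep and poll again instead of returning.
func runWorker(c Client, workerID, queue string, handle func(*Task) error) int {
	processed := 0
	for {
		task, ok := c.Claim(workerID, queue)
		if !ok {
			return processed
		}
		if err := handle(task); err != nil {
			c.MarkFailed(workerID, task, err)
		} else {
			c.MarkDone(workerID, task)
		}
		processed++
	}
}

// fakeClient is an in-memory Client used only to make the sketch runnable.
type fakeClient struct {
	pending      []*Task
	done, failed []string
}

func (f *fakeClient) Claim(workerID, queue string) (*Task, bool) {
	if len(f.pending) == 0 {
		return nil, false
	}
	t := f.pending[0]
	f.pending = f.pending[1:]
	return t, true
}

func (f *fakeClient) MarkDone(workerID string, t *Task) { f.done = append(f.done, t.ID) }

func (f *fakeClient) MarkFailed(workerID string, t *Task, cause error) {
	f.failed = append(f.failed, t.ID)
}

func main() {
	c := &fakeClient{pending: []*Task{{ID: "a", Payload: "hello"}, {ID: "b"}}}
	n := runWorker(c, "worker-1", "ciphers", handleNonEmpty)
	fmt.Println(n, c.done, c.failed)
}
```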

Monitoring

APM

The server supports APM; configure it as described in the official docs.

High availability

The Tasques server is in principle stateless, but there are internal recurring jobs that need to be taken care of, like monitoring claimed Tasks and timing them out as needed, and scheduling Recurring Tasks.

These internal jobs only run on the Leader server, as determined by a leader lock. By spinning up more than one Tasques server, you not only gain the ability to handle more load, but also shield yourself from disruptions in the running of these internal jobs, as a new Leader will be elected and take over if the current Leader loses connectivity or is terminated.
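
The takeover behaviour can be pictured as a lease that the Leader must keep renewing. The sketch below is an in-memory illustration only: the `leaderLock` type, its fields, and the lease duration are hypothetical, it is not safe for concurrent use, and it says nothing about how Tasques actually stores or renews its lock.

```go
package main

import (
	"fmt"
	"time"
)

// leaderLock is a hypothetical lease: whoever holds it is Leader until it expires.
type leaderLock struct {
	holder  string
	expires time.Time
}

// tryAcquire grants (or renews) the lease for serverID if the lock is free,
// already held by serverID, or expired. It reports whether serverID is Leader.
func (l *leaderLock) tryAcquire(serverID string, now time.Time, ttl time.Duration) bool {
	if l.holder == "" || l.holder == serverID || !now.Before(l.expires) {
		l.holder = serverID
		l.expires = now.Add(ttl)
		return true
	}
	return false
}

func main() {
	lock := &leaderLock{}
	start := time.Unix(0, 0)
	ttl := 10 * time.Second

	// server-a takes the free lock and becomes Leader.
	fmt.Println(lock.tryAcquire("server-a", start, ttl))
	// server-b is refused while server-a's lease is still valid.
	fmt.Println(lock.tryAcquire("server-b", start.Add(5*time.Second), ttl))
	// server-a stopped renewing (crashed or partitioned), so server-b takes over.
	fmt.Println(lock.tryAcquire("server-b", start.Add(11*time.Second), ttl))
}
```

Only the server for which `tryAcquire` returns true would run the internal jobs on a given tick.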

Running multiple servers also allows for zero-downtime rollouts of new versions of Tasques server.

Delivery

Assuming there is no data loss at the ES level, Tasques provides at-least-once delivery, and ensures that only a single Worker (identified by Id) holds a claim on a Task at any given time.

If there is data loss/recovery (snapshot recovery, ES node loss), jobs might be handed out twice, so it's a good idea to make job handling idempotent.
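
One way to make job handling idempotent is to record which Task Ids have already been handled and skip repeats. This is a minimal sketch under stated assumptions: `dedupingHandler` is a hypothetical helper, and the in-memory map stands in for durable storage that a real Worker would need so the record survives restarts.

```go
package main

import (
	"fmt"
	"sync"
)

// dedupingHandler remembers the Ids of Tasks it has fully handled, so that a
// Task delivered twice only triggers its side effects once.
type dedupingHandler struct {
	mu      sync.Mutex
	handled map[string]bool
	work    func(taskID string) error
}

func newDedupingHandler(work func(taskID string) error) *dedupingHandler {
	return &dedupingHandler{handled: map[string]bool{}, work: work}
}

// Handle runs the work for taskID unless it already completed successfully.
// It reports whether the work was attempted on this call.
func (h *dedupingHandler) Handle(taskID string) (bool, error) {
	h.mu.Lock()
	defer h.mu.Unlock() // holding the lock during work keeps the sketch simple
	if h.handled[taskID] {
		return false, nil
	}
	if err := h.work(taskID); err != nil {
		return true, err // not recorded, so a retry can run the work again
	}
	h.handled[taskID] = true
	return true, nil
}

func main() {
	sideEffects := 0
	h := newDedupingHandler(func(taskID string) error {
		sideEffects++ // the effect that must not be repeated
		return nil
	})
	for _, id := range []string{"task-1", "task-1", "task-2"} {
		ran, err := h.Handle(id)
		fmt.Println(id, ran, err)
	}
	fmt.Println("side effects performed:", sideEffects)
}
```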

Idempotency

When submitting/creating a Task, you can optionally specify an "id" field, which acts as a Queue-specific idempotency key (a UUID is generated and used if not specified). If there is already a Task in the Queue you specified with that key, the submission will fail.

Note that idempotency is only for un-archived Tasks: it's possible for a Task to be created with the same Id as another already archived Task. The archive_older_than period config can be tweaked if this is an issue.
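
The rules above (keys are scoped to a Queue, a duplicate submission fails, and an archived Task frees its Id) can be modelled in a few lines. Everything in this sketch is hypothetical: the `store` type, its methods, and the error value are illustrative, and `randomID` produces a random hex string rather than a properly formatted UUID.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

var errAlreadyExists = errors.New("a Task with this id already exists in the queue")

// store maps a queue name to the set of ids of its un-archived Tasks.
type store map[string]map[string]bool

// create registers a Task id in a queue, generating an id when none is given.
// It fails if the queue already holds an un-archived Task with that id.
func (s store) create(queue, id string) (string, error) {
	if id == "" {
		id = randomID()
	}
	if s[queue] == nil {
		s[queue] = map[string]bool{}
	}
	if s[queue][id] {
		return "", errAlreadyExists
	}
	s[queue][id] = true
	return id, nil
}

// archive removes the id from the live set, after which it can be reused.
func (s store) archive(queue, id string) {
	delete(s[queue], id)
}

// randomID returns 16 random bytes as hex (a stand-in for a generated UUID).
func randomID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return fmt.Sprintf("%x", b)
}

func main() {
	s := store{}
	_, err := s.create("emails", "order-42")
	fmt.Println("first submission:", err)
	_, err = s.create("emails", "order-42")
	fmt.Println("duplicate in same queue:", err)
	_, err = s.create("invoices", "order-42")
	fmt.Println("same id, other queue:", err)
	s.archive("emails", "order-42")
	_, err = s.create("emails", "order-42")
	fmt.Println("after archiving:", err)
}
```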

Recurring Tasks

Tasques supports Recurring Tasks that are enqueued on a schedule given as a cron expression (parsing is delegated to robfig/cron, so check there for supported expressions).

There is also support for skipping the enqueueing of a Task for a Recurring Task if outstanding Tasks (not DEAD or DONE) that belong to it already exist. For high-frequency Recurring Tasks (more often than once every ~2 seconds) this is subject to limitations in ES itself, since it refreshes indices on a configurable interval, so recently written Tasks may not yet be visible to the search. To address this, you can configure the ES index refresh interval to be more frequent, or you can configure Tasques itself to be more aggressive about refreshing indices (see config/tasques.example.yaml for details).
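
For the first option, Elasticsearch exposes `index.refresh_interval` through its update index settings API. The sketch below only builds the request body; the index name in the printed request line is a placeholder, the `refreshSettings` helper is hypothetical, and the `500ms` value is an arbitrary example, since more frequent refreshes cost indexing throughput.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// refreshSettings builds the JSON body for Elasticsearch's update index
// settings API, setting index.refresh_interval to the given value.
func refreshSettings(interval string) ([]byte, error) {
	body := map[string]interface{}{
		"index": map[string]interface{}{
			"refresh_interval": interval,
		},
	}
	return json.Marshal(body)
}

func main() {
	body, err := refreshSettings("500ms")
	if err != nil {
		panic(err)
	}
	// <tasks-index> is a placeholder; substitute the index Tasques writes to.
	fmt.Printf("PUT /<tasks-index>/_settings\n%s\n", body)
}
```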

Dev

Requires Go 1.13+.

  1. Install Go
  2. Use your favourite editor/IDE
  3. For updating Swagger docs:
    1. Install Swaggo
    2. Run swag init -g app/main.go from the root project dir
      • Check that there are no time.Time fields... there's a race condition in there somewhere
    3. Commit the generated files.
  4. For updating the Go Client:
    1. Install go-swagger
    2. Run swagger generate client -f docs/swagger.yaml
    3. Commit the generated files.

Running tests

Unit tests: go test ./...

Integration tests: go test -tags=integration ./... (finds // +build integration at the top of IT files)

Code guidelines

The code emphasises the following:

  1. Safety: the code needs to do the right thing. Use built-in features (locks, typed ids) and tests to help maximise safety.
  2. Efficiency: the server and its code should be reasonably efficient so that it can handle high loads.
  3. Observability: where reasonable, pass a Context around so we can export performance metrics. This is a must when making any kind of IO call.
  4. High availability: the server needs to be able to run in a highly available way to maximise robustness.
  5. Simplicity: the API needs to be simple and easy to use.
  6. Maintainability: the internals need to be as simple as possible and invite contributions. Use the rule of 3 to know when something should be generalised. If in doubt, repeat yourself.

Credit

This project was inspired by the following, in no particular order

tasques's People

Contributors

dependabot[bot], lloydmeta, swallez


tasques's Issues

Queue name validations

There need to be a few more validations on queue names, and the allowable characters for creates and updates should differ from those for claiming.

Also, the queue should be persisted with each task.

[Feature] Support skip-if-uncompleted-tasks-exist for Recurring Tasks

It would be nice to support not scheduling a Recurring Task if there are any outstanding Tasks that belong to it (not DEAD or DONE).

This could be added as an optional (default false) field on RecurringTask.

One simple way to do this, for Recurring Tasks that have this option set to true, would be to search for any outstanding Tasks before scheduling the Task in the Scheduler:

    if log.Debug().Enabled() {
        log.Debug().
            Str("id", string(task.ID)).
            Str("expression", string(task.ScheduleExpression)).
            Msg("Enqueuing Task")
    }
    tx := i.tracer.BackgroundTx("recurring-task-enqueue")
    ctx := tx.Context()
    _, err := i.tasksService.Create(ctx, i.taskDefToNewTask(task.ID, &task.TaskDefinition))
    if err != nil {
        log.Error().
            Err(err).
            Str("id", string(task.ID)).
            Str("expression", string(task.ScheduleExpression)).
            Msg("Failed to insert new Task at interval")
    }
    tx.End()
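
The existence check proposed in this issue could look roughly like the sketch below. It is self-contained and purely illustrative: the `task` type, the `QUEUED` and `CLAIMED` state names, and both helper functions are assumptions, and a slice stands in for the search that would run against the store.

```go
package main

import "fmt"

type status string

const (
	queued  status = "QUEUED"  // assumed state name
	claimed status = "CLAIMED" // assumed state name
	done    status = "DONE"
	dead    status = "DEAD"
)

// task is a hypothetical, trimmed-down Task record.
type task struct {
	recurringTaskID string
	state           status
}

// hasOutstanding reports whether any Task spawned by the given Recurring Task
// is still outstanding, i.e. neither DONE nor DEAD.
func hasOutstanding(tasks []task, recurringTaskID string) bool {
	for _, t := range tasks {
		if t.recurringTaskID == recurringTaskID && t.state != done && t.state != dead {
			return true
		}
	}
	return false
}

// shouldEnqueue applies the proposed option: when skipIfOutstanding is set,
// a new Task is only enqueued if no outstanding Task exists.
func shouldEnqueue(skipIfOutstanding bool, tasks []task, recurringTaskID string) bool {
	return !skipIfOutstanding || !hasOutstanding(tasks, recurringTaskID)
}

func main() {
	tasks := []task{
		{recurringTaskID: "nightly-report", state: done},
		{recurringTaskID: "nightly-report", state: claimed},
		{recurringTaskID: "cleanup", state: dead},
	}
	fmt.Println(shouldEnqueue(true, tasks, "nightly-report"))  // one Task is still claimed
	fmt.Println(shouldEnqueue(true, tasks, "cleanup"))         // only a DEAD Task exists
	fmt.Println(shouldEnqueue(false, tasks, "nightly-report")) // option disabled
}
```

With the option defaulting to false, existing Recurring Tasks would keep their current behaviour.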

Improve documentation

  • Improve the README
    • Explain motivation
    • Add comprehensive features list
    • Explain some inner workings
    • Mention how to spin up the K8s stuff in docker/k8s

Improve config

  • Support env variable substitutions for config (this will improve the k8s experience)
  • Improve documentation
