dolaterio's People

Contributors

acroca, harshadyeola, jonog, pulkitsinghal

dolaterio's Issues

Provide default worker name and avoid duplicates

Suggestion:

  1. Running a command to create a worker without a worker_name results in an empty worker_name. By default, we should translate "docker_image": "org/image" into one of the following:
    1. "worker_name": "org/image" or
    2. "worker_name": "org_image" or
    3. "worker_name": "org-image" or
    4. "worker_name": "image"
  2. When a worker is created repeatedly with the same image, we should decide whether to avoid duplicates and enforce uniqueness based on docker_image or worker_name. Whatever we decide, we should enforce it.
    1. I vote for worker_name based uniqueness.
      1. I have tested that passing in the same worker name right now (no longer talking about empty or default names) creates a new id in the DB every time. That doesn't seem desirable, and we can improve on it.
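
A minimal sketch of both suggestions, assuming nothing about dolaterio's actual code: `default_worker_name`, its `style` values, and `WorkerRegistry` are hypothetical names used only for illustration. It derives a default name from docker_image in each of the four proposed forms and enforces worker_name based uniqueness by returning the existing worker instead of creating a new id.

```python
import uuid


def default_worker_name(docker_image, style="underscore"):
    """Derive a default worker_name from a docker_image like "org/image".

    The style names are hypothetical; they map to the four options above.
    """
    # Drop a digest ("@sha256:...") and a tag (":1.0") if present.
    name = docker_image.split("@", 1)[0]
    if name.rfind(":") > name.rfind("/"):
        name = name[: name.rfind(":")]

    if style == "as_is":        # "org/image"
        return name
    if style == "underscore":   # "org_image"
        return name.replace("/", "_")
    if style == "dash":         # "org-image"
        return name.replace("/", "-")
    if style == "image_only":   # "image"
        return name.rsplit("/", 1)[-1]
    raise ValueError("unknown style: %r" % style)


class WorkerRegistry:
    """In-memory stand-in for the workers table (an assumption)."""

    def __init__(self):
        self._by_name = {}

    def create(self, docker_image, worker_name=None):
        # Fall back to a derived name when none is given.
        name = worker_name or default_worker_name(docker_image)
        # Uniqueness on worker_name: reuse the existing record.
        if name in self._by_name:
            return self._by_name[name]
        worker = {
            "id": str(uuid.uuid4()),
            "worker_name": name,
            "docker_image": docker_image,
        }
        self._by_name[name] = worker
        return worker
```

Returning the existing record is only one way to enforce uniqueness; rejecting the second request with an error would be an equally valid choice.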

Document approach for stateful workers

A worker's code is meant to be stateless, in the sense that it should only work on top of the payload given to it.

There are times when worker processing can fail or pointlessly waste cycles because the code relies on a third-party API that throttles the total number of requests until a certain amount of time has passed.

At times like these:

  1. the worker code can keep running, waiting and retrying until the throttling ends and it can finish the job
  2. or it can fail while persisting some info about its state to a database or cache. When the worker framework later restarts the worker:
    1. based on some sort of retry policy
    2. or because an admin did it manually
    3. the state information can be the first thing the worker retrieves, so that it continues where it left off
  3. or it can shut down and schedule a payload for itself in the worker framework's queue to run again after a certain amount of time has passed. The same logic about persisting state and retrieving it on the next run applies as before.

Options 2 and 3 aren't feasible unless there is a predictable identifier tied to the worker's payload from the very beginning!

So for jobs that want to leverage state, the very first payload should contain a stateId that can be used by the worker to store/retrieve state in case of trouble.
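
The idea can be sketched as follows. This is an illustration under stated assumptions, not dolaterio's behavior: `StateStore` is an in-memory stand-in for a database or cache, and `Throttled`, `run_worker`, the `items` payload key, and the `next_index` state field are all hypothetical names. The worker reads its state by stateId first, resumes from there, and persists its position when it hits throttling.

```python
import json


class Throttled(Exception):
    """Raised by the (hypothetical) third-party client when rate limited."""


class StateStore:
    """In-memory stand-in for a database or cache keyed by stateId."""

    def __init__(self):
        self._data = {}

    def save(self, state_id, state):
        self._data[state_id] = json.dumps(state)

    def load(self, state_id):
        raw = self._data.get(state_id)
        return json.loads(raw) if raw is not None else None


def run_worker(payload, store, process_item):
    """Process payload["items"], resuming from any saved state.

    Returns True when the job finished, False when it stopped early
    because of throttling and should be run again later.
    """
    state_id = payload["stateId"]
    # The saved state is the first thing the worker retrieves.
    state = store.load(state_id) or {"next_index": 0}
    items = payload["items"]

    for index in range(state["next_index"], len(items)):
        try:
            process_item(items[index])
        except Throttled:
            # Persist where we stopped so the next run continues here.
            store.save(state_id, {"next_index": index})
            return False

    store.save(state_id, {"next_index": len(items)})
    return True
```

Because the stateId arrives in the very first payload, a retried or rescheduled run of the same payload finds the saved position without any extra coordination.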

worker dies unexpectedly

These are the logs:

$ docker-compose logs --follow worker
Attaching to dolaterio_worker_1
worker_1     | panic: runtime error: invalid memory address or nil pointer dereference
worker_1     | [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x45ed75]
worker_1     | 
worker_1     | goroutine 84 [running]:
worker_1     | panic(0x8bb960, 0xc42000a030)
worker_1     |  /usr/local/go/src/runtime/panic.go:500 +0x1a1
worker_1     | github.com/shoppinpal/dolaterio/docker.(*Engine).BuildContainer(0xc4201bca60, 0xc4201d2700, 0x0, 0x0, 0x0)
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/docker/engine.go:70 +0xe65
worker_1     | github.com/shoppinpal/dolaterio/runner.Run(0xc4201d2700, 0xc4201bca60, 0x0, 0x0)
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/run.go:26 +0x82
worker_1     | github.com/shoppinpal/dolaterio/runner.(*JobRunner).run(0xc42014b220)
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/job_runner.go:99 +0x110
worker_1     | created by github.com/shoppinpal/dolaterio/runner.(*JobRunner).Start
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/job_runner.go:52 +0x4f
dolaterio_worker_1 exited with code 2

This is how I attempted to run the worker

echo "setup worker - with dolaterio AND autoset the WORKER_NAME in env vars" && \
export WORKER_NAME="shoppinpal_generate-weekly-stock-orders" && \
export IMAGE_NAME="shoppinpal/generate-weekly-stock-orders" && \
echo "WORKER_NAME: " $WORKER_NAME && \
echo "IMAGE_NAME: " $IMAGE_NAME && \
  curl localhost:7000/v1/workers \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{
      "docker_image": "'"$IMAGE_NAME"'",
      "env": {"NODE_ENV": "local"},
      "worker_name": "'"$WORKER_NAME"'"
    }' && \
echo -e "\n You can check worker status with: \n curl localhost:7000/v1/workers/$WORKER_NAME \n" && \
echo "queue payload - with dolaterio AND autoset the JOB_ID in env vars" && \
export JOB_ID=$(
  curl localhost:7000/v1/jobs \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{
      "worker_id": "'"$WORKER_NAME"'",
      "stdin": {
        "projectId": "xxx",
        "oauthToken": "yyy",
        "workerPayloads": []
      }
    }' | python -c 'import sys, json; print(json.load(sys.stdin)["id"])'
) && echo $JOB_ID && \
echo -e "\n You can retrieve task/job status and results with: \n curl localhost:7000/v1/jobs/$JOB_ID \n" && \
curl localhost:7000/v1/jobs/$JOB_ID

Crashing like this is great for catching problems, but eventually we would want the worker service to always restart.

Add the ability to pause a worker

Add an endpoint that allows us to mark a worker as paused. Any new incoming payloads for a paused worker will continue to be queued but will not be executed until the worker is unpaused.
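
A minimal sketch of the intended semantics, assuming an in-memory queue per worker. `WorkerQueue` and its method names are hypothetical and do not reflect dolaterio's actual queue; the point is only that enqueueing always succeeds while dequeueing is blocked during a pause.

```python
from collections import deque


class WorkerQueue:
    """Per-worker job queue with a pause flag (hypothetical sketch)."""

    def __init__(self):
        self.paused = False
        self._pending = deque()

    def enqueue(self, payload):
        # Payloads are accepted whether or not the worker is paused.
        self._pending.append(payload)

    def pause(self):
        self.paused = True

    def unpause(self):
        self.paused = False

    def next_job(self):
        """Return the next payload to execute, or None if there is
        nothing to run or the worker is paused."""
        if self.paused or not self._pending:
            return None
        return self._pending.popleft()

    def __len__(self):
        return len(self._pending)
```

A pause endpoint would then only need to flip the flag; payloads queued in the meantime run in their original order once the worker is unpaused.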

Add debug capable images

Summary

Let's add Dockerfile-debug and docker-compose-debug.yml that can be used when troubleshooting.

Motivation

The images for dolaterio are lightweight and built from scratch, which makes them hard to troubleshoot. Maybe we should think about building and publishing debug-capable images too?

Given the way the containers are right now, these commands gave no output:

$ docker logs dolaterio_api_1
$ docker logs dolaterio_worker_1

And attempting to attach to them failed as well:

$ docker exec -t -i dolaterio_api_1 /bin/bash
rpc error: code = 2 desc = oci runtime error: exec failed: exec: "/bin/bash": stat /bin/bash: no such file or directory
$ docker exec -t -i dolaterio_api_1 ls -alrt
rpc error: code = 2 desc = oci runtime error: exec failed: exec: "ls": executable file not found in $PATH

add a replay endpoint to the API

Let's say we created a job via a curl request that had a payload (env, stdin, etc.) and the response gave us JOB_ID=aaa

To take the feature set further, how about adding an endpoint like:
POST /api/jobs/<JOB_ID>/replay
which would automatically create another job and return a new JOB_ID=bbb

The merit is that we don't have to post the request data again, so we avoid human error when replaying something from the past.
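
The replay behavior could look roughly like this. It is a sketch under assumptions: `JobStore` is a hypothetical in-memory stand-in for the jobs table, and the stored fields simply mirror the worker_id, stdin, and env keys seen in the curl requests on this page. Replaying copies the stored request data into a brand-new job, so nothing is retyped.

```python
import copy
import uuid


class JobStore:
    """In-memory stand-in for the jobs table (an assumption)."""

    def __init__(self):
        self._jobs = {}

    def create(self, worker_id, stdin=None, env=None):
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {
            "id": job_id,
            "worker_id": worker_id,
            "stdin": stdin,
            "env": env or {},
        }
        return job_id

    def get(self, job_id):
        return self._jobs[job_id]

    def replay(self, job_id):
        """Create a new job from the stored request data of an old one
        and return the new job's id. Raises KeyError if job_id is unknown."""
        original = self._jobs[job_id]
        # Deep copies keep the two jobs independent of each other.
        return self.create(
            original["worker_id"],
            stdin=copy.deepcopy(original["stdin"]),
            env=copy.deepcopy(original["env"]),
        )
```

An HTTP handler for the proposed route would call `replay` with the id from the URL and return the new id in the response body.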

Passing json to stdin fails worker api

curl http://127.0.0.1:7000/v1/jobs -H "Content-Type: application/json" -X POST -d '{"worker_id": "acb4b32d-b141-4ce5-8e7f-d449af6addac", "stdin": "{"projectId":"xxxxxxxx","oauthToken":"xxxxxx","workerPayloads":[]}", "env": {"HI": "BYE"}}'
worker_1     | panic: runtime error: invalid memory address or nil pointer dereference
worker_1     | [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x4602cc]
worker_1     | 
worker_1     | goroutine 75 [running]:
worker_1     | panic(0x8ba5e0, 0xc42000a030)
worker_1     |  /usr/local/go/src/runtime/panic.go:500 +0x1a1
worker_1     | github.com/shoppinpal/dolaterio/runner.Run(0xc42017f4d0, 0xc42018d7e0, 0x0, 0x0)
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/run.go:17 +0x7c
worker_1     | github.com/shoppinpal/dolaterio/runner.(*JobRunner).run(0xc420196690)
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/job_runner.go:99 +0x110
worker_1     | created by github.com/shoppinpal/dolaterio/runner.(*JobRunner).Start
worker_1     |  /root/work/src/github.com/shoppinpal/dolaterio/runner/job_runner.go:52 +0x4f
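
One thing worth checking, independent of the panic itself: the request body in the curl command above is not valid JSON as written, because the inner quotes of the stdin string are not escaped. Whether this is what triggers the nil pointer dereference is not confirmed here. The sketch below assumes the API accepts stdin as a string and shows one way to build a correctly escaped body; `build_job_body` and `is_valid_json` are hypothetical helpers.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def build_job_body(worker_id, stdin_obj, env):
    """Build the request body, serializing stdin_obj into an escaped string.

    Assumes the API expects "stdin" to be a string; json.dumps escapes
    the inner quotes so the outer document stays valid JSON.
    """
    return json.dumps({
        "worker_id": worker_id,
        "stdin": json.dumps(stdin_obj),
        "env": env,
    })
```

The resulting string can be written to a file and sent with curl's `-d @file` form, which avoids shell quoting problems altogether.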
