
Comments (4)

elanv commented on July 22, 2024

It seems the default job mode has changed. Previously it was "detached".
As a result, the submitter does not complete when the job mode is not specified.

    if jobSpec.Mode != nil {
        switch *jobSpec.Mode {
        case v1beta1.JobModeBlocking:
        case v1beta1.JobModeDetached:
            jobArgs = append(jobArgs, "--detached")
        }
    }
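
To make the effect of an unset mode concrete, here is a minimal standalone sketch of the same switch. The `JobMode` type, its string values, and the `submitterArgs` helper are local stand-ins assumed for illustration; they are not the operator's actual `v1beta1` definitions.

```go
package main

import "fmt"

// JobMode is a hypothetical local stand-in for the operator's v1beta1.JobMode.
type JobMode string

const (
	// The string values are assumptions made for this sketch.
	JobModeBlocking JobMode = "Blocking"
	JobModeDetached JobMode = "Detached"
)

// submitterArgs (a hypothetical helper) mirrors the quoted switch:
// only an explicit Detached mode adds the "--detached" flag.
func submitterArgs(mode *JobMode) []string {
	args := []string{}
	if mode != nil {
		switch *mode {
		case JobModeBlocking:
			// No flag is added, so the submitter keeps running with the job.
		case JobModeDetached:
			args = append(args, "--detached")
		}
	}
	return args
}

func main() {
	blocking := JobModeBlocking
	detached := JobModeDetached

	fmt.Println(submitterArgs(nil))       // prints []
	fmt.Println(submitterArgs(&blocking)) // prints []
	fmt.Println(submitterArgs(&detached)) // prints [--detached]
}
```

In this sketch, a nil mode and Blocking mode produce the same arguments, which matches the observation that the submitter only completes after submission when Detached is set explicitly.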

On the other hand, if a new job pod was spun up, that is unexpected behavior.
Even if the submitter job fails, the k8s Job does not spin up a new pod, because backoffLimit is set to 0, as shown below.

// Disable the retry mechanism of k8s Job, all retries should be initiated
// by the operator based on the job restart policy. This is because Flink
// jobs are stateful, if a job fails after running for 10 hours, we probably
// don't want to start over from the beginning, instead we want to resume
// the job from the latest savepoint which means strictly speaking it is no
// longer the same job as the previous one because the `--fromSavepoint`
// parameter has changed.
var backoffLimit int32 = 0

However, when I tested forcibly deleting the submitter pod, it was recreated. So I guess the pod was deleted and recreated for some reason. For example, the submitter pod might have been deleted by an OOM kill and then recreated, although I don't know whether an OOM kill can cause a k8s Job to recreate its pod. In any case, it looks like the pod must be prevented from being recreated for any reason in Blocking mode.

from flink-on-k8s-operator.

elanv commented on July 22, 2024

Once you set spec.job.mode to Detached, I think creating multiple jobs will be prevented, because the submitter job completes right after job submission.


pjthepooh commented on July 22, 2024

Thanks @elanv for pointing out the job mode. I described two issues: 1) the job submitter stays in running status after the job is submitted, and 2) multiple jobs can be submitted in some scenarios (job restart or chart update). They seem related, but I can't be sure.

However, I don't see either issue at v0.2.1, so I think this could be related to the webhooks bug that was fixed? @regadas I guess we can close this, but it may be better to confirm the root cause.


regadas commented on July 22, 2024

Hi @pjthepooh! Yeah, these issues should be addressed now. They were due to a bug introduced in the admission webhooks when migrating to v1 (#136). I recommend using the latest version, v0.2.2.

I'll close this issue for now. Please re-open if there are still issues.
