Comments (4)
It seems the default job mode has changed. Previously it was "Detached".
As a result, the submitter does not complete when the job mode is not specified.
flink-on-k8s-operator/controllers/flinkcluster_converter.go
Lines 629 to 635 in 16a1830
On the other hand, if a new job pod was spun up, that is unexpected behavior.
Even if the submitter job fails, the k8s Job does not spin up a new pod, because backoffLimit is set to 0, as shown below.
flink-on-k8s-operator/controllers/flinkcluster_converter.go
Lines 719 to 726 in 16a1830
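For readers without the referenced source open, the following is a minimal sketch of what a Job with backoffLimit set to 0 looks like as a manifest. It is not the manifest the operator actually generates: the names `example-job-submitter` and `submitter` and the image are hypothetical placeholders, and `restartPolicy: Never` is an assumption about how the submitter pod is configured.

```yaml
# Illustrative only; not copied from flinkcluster_converter.go.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-submitter   # hypothetical name
spec:
  backoffLimit: 0               # do not retry after a pod failure
  template:
    spec:
      restartPolicy: Never      # assumption: containers are not restarted in place
      containers:
        - name: submitter       # hypothetical container name
          image: flink:latest   # hypothetical image
```

Note that backoffLimit limits retries after pod failures; whether it also covers a pod that is deleted from outside is exactly what the test described in this comment calls into question.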
However, when I tested forcibly deleting the submitter pod, it was recreated. I therefore guess the pod was deleted and recreated for some reason. For example, the submitter pod might have been deleted by an OOM kill and then recreated, though I don't know whether an OOM kill can cause a k8s Job's pod to be recreated. In any case, it looks like the pod must be prevented from being recreated for any reason in Blocking mode.
from flink-on-k8s-operator.
Once you set spec.job.mode to Detached, I think that creating multiple jobs will be prevented, because the job is completed after job submission.
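As a rough illustration of that suggestion, the fragment below sets the job mode explicitly on a FlinkCluster resource. Only `spec.job.mode` and the value `Detached` come from this thread; the `apiVersion`, resource name, and `jarFile` path are assumptions and should be checked against the CRD installed in your cluster.

```yaml
# Sketch only; verify field names against your installed FlinkCluster CRD.
apiVersion: flinkoperator.k8s.io/v1beta1   # assumed API group/version
kind: FlinkCluster
metadata:
  name: example-cluster                    # hypothetical name
spec:
  job:
    jarFile: /path/to/job.jar              # hypothetical path
    mode: Detached                         # set explicitly instead of relying on the default
```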
Thanks @elanv for pointing to the job mode. There are two issues I described: 1) the job submitter stays in running status after job submission, and 2) multiple jobs can be submitted in some scenarios (job restart or chart update). They seem related, but I can't be sure.
However, I don't see either issue in v0.2.1, so I think this could be related to the webhooks bug that was fixed? @regadas I guess we can close this, but it may be better to confirm the root cause.
Hi @pjthepooh! Yes, these issues should be addressed now. They were due to a bug introduced in the admission webhooks when migrating to v1 (#136). I recommend using the latest version, v0.2.2.
I'll close this issue for now. Please re-open if there are still issues.
Related Issues (20)
- Cluster stuck in Updating state if PodDisruptionBudget is set
- Wrong job status after job update.
- Rework examples
- Allow Flink to ignore savepoint on restore if the states of the old and new jobs are incompatible
- Create a new cluster before deleting the old one on the job update
- FlinkCluster stuck in Updating state when PDB is used.
- Add HorizontalPodAutoScaller properties to FlinkCluster spec HOT 1
- QUESTION: how to get sample app WordCount.jar to run with version 1.15.3 and 2 taskmanager replicas HOT 3
- Caused by GSSException: No valid credentials provided(Mechanism level: Failed to find any Kerberos tgt)
- Validation Error `nodeaffinity` rule for the flinkcluster HOT 2
- HPA not creating new pods on scale event HOT 3
- Pod Affinity Feature Causing Flink Pipeline Redeployment to Fail HOT 2
- poddisruptionbudget is not allowing any disruptions HOT 1
- While using application mode, the jobmanager pod is not restarted when killed
- Flink Operator Loses Job Manager Contact during EKS upgrade HOT 10
- Question: is the latest CRD backwards compatible with the CRD from 0.30 HOT 3
- Job Manager is not brought back up HOT 1
- Application mode Job Manager restart can create multiple FlinkJobs
- Streaming Application mode Jobs can sometimes reach completed stage
- If the job submitter fails, the job keeps running