Comments (22)
Ah, this is actually a bug. The issue is with the legacy operator: it is looking for a deprecated package. The error is that when loading PodRuntimeInfoEnv, it is not using the proper location for the pod.
Questions:
- What operator are you using?
- What airflow version are you using?
- What image are you using?
from kubernetesjoboperator.
1. What operator are you using? KubernetesJobOperator - no legacy operators
2. What airflow version are you using? I have tried 2.5.0 and 2.4.1 (which I have locally)
3. What image are you using? public airflow image
I have just tried with the following image and did not get the legacy error. (I don't normally use slim, but I opted for it so I could install the same packages I have locally, since those are usually pre-installed in the normal image.)
FROM apache/airflow:slim-2.5.0-python3.10
RUN pip install kubernetes==23.6.0 apache-airflow-providers-cncf-kubernetes==4.4.0 airflow-kubernetes-job-operator==2.0.8 psycopg2-binary==2.9.5
However, the second error still happens:
airflow-6b946445bd-d6gsl
*** Log file does not exist: /opt/airflow/logs/dag_id=full-refresh/run_id=manual__2022-12-15T20:39:54.994293+00:00/task_id=example_kubernetes_job_operator/attempt=1.log
*** Fetching from: http://airflow-6b946445bd-d6gsl:8793/log/dag_id=full-refresh/run_id=manual__2022-12-15T20:39:54.994293+00:00/task_id=example_kubernetes_job_operator/attempt=1.log
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1087} INFO - Dependencies all met for <TaskInstance: full-refresh.example_kubernetes_job_operator manual__2022-12-15T20:39:54.994293+00:00 [queued]>
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1087} INFO - Dependencies all met for <TaskInstance: full-refresh.example_kubernetes_job_operator manual__2022-12-15T20:39:54.994293+00:00 [queued]>
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1283} INFO -
--------------------------------------------------------------------------------
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1284} INFO - Starting attempt 1 of 1
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1285} INFO -
--------------------------------------------------------------------------------
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1304} INFO - Executing <Task(KubernetesJobOperator): example_kubernetes_job_operator> on 2022-12-15 20:39:54.994293+00:00
[2022-12-15, 20:39:56 UTC] {standard_task_runner.py:55} INFO - Started process 453 to run task
[2022-12-15, 20:39:56 UTC] {standard_task_runner.py:82} INFO - Running: ['airflow', 'tasks', 'run', 'full-refresh', 'example_kubernetes_job_operator', 'manual__2022-12-15T20:39:54.994293+00:00', '--job-id', '5', '--raw', '--subdir', 'DAGS_FOLDER/full_refresh_dag.py', '--cfg-path', '/tmp/tmpzdpue9_q']
[2022-12-15, 20:39:56 UTC] {standard_task_runner.py:83} INFO - Job 5: Subtask example_kubernetes_job_operator
[2022-12-15, 20:39:56 UTC] {task_command.py:389} INFO - Running <TaskInstance: full-refresh.example_kubernetes_job_operator manual__2022-12-15T20:39:54.994293+00:00 [running]> on host airflow-6b946445bd-d6gsl
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1511} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=full-refresh
AIRFLOW_CTX_TASK_ID=example_kubernetes_job_operator
AIRFLOW_CTX_EXECUTION_DATE=2022-12-15T20:39:54.994293+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-12-15T20:39:54.994293+00:00
[2022-12-15, 20:39:56 UTC] {taskinstance.py:1772} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/airflow_kubernetes_job_operator/kubernetes_job_operator.py", line 373, in execute
rslt = self.job_runner.execute_job(
File "/home/airflow/.local/lib/python3.10/site-packages/airflow_kubernetes_job_operator/job_runner.py", line 302, in execute_job
watchable_kinds = GetAPIVersions.get_existing_api_kinds(self.client, all_kinds)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow_kubernetes_job_operator/kube_api/queries.py", line 410, in get_existing_api_kinds
return [k for k in all_kinds if k.api_version == "v1" or k.api_version in apis]
File "/home/airflow/.local/lib/python3.10/site-packages/airflow_kubernetes_job_operator/kube_api/queries.py", line 410, in <listcomp>
return [k for k in all_kinds if k.api_version == "v1" or k.api_version in apis]
TypeError: argument of type 'NoneType' is not iterable
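For context on the mechanics of that traceback: the comprehension tests `k.api_version in apis`, so if the API-version query came back as `None`, the `in` operator itself raises. A minimal sketch (a stand-in repro, not the operator's code):

```python
# The failing line in queries.py does roughly:
#   [k for k in all_kinds if k.api_version == "v1" or k.api_version in apis]
# If `apis` is None, the membership test raises the exact error above.
apis = None  # stands in for a failed/empty API-version query result

try:
    found = "batch/v1" in apis
except TypeError as err:
    print(err)  # argument of type 'NoneType' is not iterable
```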
The YAML file looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: transferpod
spec:
  template:
    metadata:
      labels:
        app: transferpod
  nodeSelector:
    iam.gke.io/gke-metadata-server-enabled: "true"
  containers:
    - name: transferpod
      image: "{{ job.IMAGE }}"
      resources: {}
and the DAG like this:
t1 = KubernetesJobOperator(
    task_id="example_kubernetes_job_operator",
    body_filepath=f"{dir_path}/manifests/transferjob.yaml",
    in_cluster=True,
    jinja_job_args={
        "IMAGE": f"eu.gcr.io/{GOOGLE_CLOUD_PROJECT}/transferpod",
    },
)
Let me know if you need further info.
Hi, I think this is still an issue. Legacy support was extended to versions before Airflow 2.0.0, but that legacy period should now be over (otherwise I'll be chasing this all my life), so I have removed support for versions before 2.0.0.
The fix is in PR #75; it may take an hour or two. (The PR contains a linting fix as well, so it may be a little cluttered.)
Fixed in https://github.com/LamaAni/KubernetesJobOperator/releases/tag/2.0.9
Please let me know once you have tested, and close this issue?
I was able to execute with the following Dockerfile:
FROM apache/airflow
RUN pip3 install --user airflow-kubernetes-job-operator
Hello! I can confirm that the legacy issue has stopped happening across different setups with the latest version. Thanks a lot!
However, I am still facing the issue with all_kinds being None.
I have tried the following to troubleshoot this:
1. Using the body option with a YAML string.
2. Using the example in the repo:
# First resource: this resource will be tracked by the operator. Other resources will not be tracked.
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job # not required. Will be a prefix to the task name
  finalizers:
    - foregroundDeletion
spec:
  template:
    metadata:
      labels:
        app: test-task-pod
    spec:
      restartPolicy: Never
      containers:
        - name: job-executor
          image: ubuntu
          command:
            - bash
            - -c
            - |
              #!/usr/bin/env bash
              echo "OK"
  backoffLimit: 0
---
apiVersion: v1
kind: Service
metadata:
  name: test-service # not required, will be a prefix to the task name.
spec:
  selector:
    app: test-task-pod
  ports:
    - port: 8080
      targetPort: 8080
3. I have forgone the YAMLs altogether and tried this:
job_task = KubernetesJobOperator(
    task_id="from-image",
    dag=dag,
    image="ubuntu",
    command=["bash", "-c", 'echo "all ok"'],
)
It still fails instantly with the same message.
4. I have even disabled git-sync and baked the DAGs into the image, but nada.
All of this while using the following, of course:
FROM apache/airflow
RUN pip3 install --user airflow-kubernetes-job-operator
After a lot of squinting, I have nothing to show for it. I cannot think of why two identical images would behave differently. I keep thinking it is my code, but then again, I have tried dummy code and dummy YAMLs, and I always hit the same issue.
Let me know if you need any more input, and thanks again for your prompt resolution of the first issue!
Extra background info:
- Running on Google GKE
- Official Apache Airflow Helm chart with some modifications
- The regular KubernetesPodOperator works
Hmm... all_kinds is a collection that is initialized when the operator is loaded; it defines which kinds are to be expected when deploying to the cluster. I'm going to add a full example with a Docker image to the repo. Can you try and run that?
Please see the example under examples/docker for an example Docker deployment. Can you run the script run_in_docker?
The script also requires you to have a valid kubernetes config under "$HOME/.kube/config".
Will give it a try ASAP and report back. Sorry for the delay; I'm out of office right now.
Hi,
I have the same issue (TypeError: argument of type 'NoneType' is not iterable) with:
apache-airflow==2.4.3
airflow-kubernetes-job-operator==2.0.9
kubernetes==23.6.0
There was no problem before the Airflow upgrade, with:
apache-airflow==2.2.5
airflow-kubernetes-job-operator==2.0.8
kubernetes==11.0.0
@arnor2000 Can you also please run the example and report back? I'll try your example Airflow setup.
Wow, that was a weird one.
You both probably have an error when connecting to the kube service (please see the RBAC section in the help). This error was not reported properly and was returned without being raised, which means the error you both experienced is actually an API connection error: the connection to the API could not be established, and a None value was returned instead of an error being thrown.
Resolved in PR #81.
Please validate and let me know.
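A defensive version of that lookup would turn the silent None into an explicit error. This is a hypothetical sketch of the idea, not the actual code in PR #81:

```python
def get_existing_api_kinds(all_kinds, apis):
    # Hypothetical guard: if the API-version query failed and returned
    # None, raise a clear connection error instead of letting the
    # comprehension die with "argument of type 'NoneType' is not iterable".
    if apis is None:
        raise ConnectionError("Could not retrieve API versions from the cluster")
    return [k for k in all_kinds if k.api_version == "v1" or k.api_version in apis]
```

With a guard like this, a misconfigured connection surfaces as a connection error at the query site rather than a TypeError deep inside a list comprehension.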
Any update on this?
Apologies once more for opening the issue and then appearing to desert it. I will be in the office tomorrow morning, give it a try, and let you know. Sorry for the delay.
None of the previous issues occur anymore, and the error message has indeed been updated.
Now I get:
airflow_kubernetes_job_operator.kube_api.exceptions.KubeApiClientException: airflow_kubernetes_job_operator.kube_api.queries.GetAPIVersions, Unauthorized: Unauthorized
I have created the exact RBAC role from the main page and bound it to the default service account, which was already admin of everything, but the error above sadly persists. The thing is, running a KubernetesPodOperator on the same instance works fine, meaning it is able to connect to the Kubernetes APIs in general.
This means that the client does not have access, or that it cannot find the right config. You need to define the right RBAC configuration for the account running in the pod. You can find the RBAC config in the help here: https://github.com/LamaAni/KubernetesJobOperator#kubernetes-rbac-rules
You can also find how to configure a Kubernetes service account here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
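For reference, a binding along these lines is what is being discussed. This is an illustrative sketch only: the role name, namespace, and resource list here are assumptions, and the authoritative rules are the ones in the repository README linked above.

```yaml
# Illustrative only: names, namespace, and rules are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: airflow-job-operator   # hypothetical name
  namespace: airflow           # namespace where the scheduler/worker runs
rules:
  - apiGroups: ["", "batch"]
    resources: ["pods", "pods/log", "jobs", "services", "events"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: airflow-job-operator
  namespace: airflow
subjects:
  - kind: ServiceAccount
    name: default              # the account the Airflow pod runs as
    namespace: airflow
roleRef:
  kind: Role
  name: airflow-job-operator
  apiGroup: rbac.authorization.k8s.io
```

Note that the GetAPIVersions query in the error above reads cluster-level API metadata, so depending on the cluster a namespaced Role may not be enough and cluster-scope read access could be required.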
As mentioned, I have already done so, but I will play around a bit and see; I will update with findings.
Oh, sorry, I missed that. Then it's the auto-detection of the credentials. I will try to test that locally and report back.
Found the issue: the automatic bearer token string was invalid, because the 'Bearer' prefix was already in there. Maybe due to a code change in Airflow 2.5.0?
Apologies; I have the kube config deployed with my configuration, and therefore my tests did not catch this. The bearer token is now correctly identified and added.
I hope this is the last issue with this.
PR is here: https://github.com/LamaAni/KubernetesJobOperator/pull/83/files
Release is here: https://github.com/LamaAni/KubernetesJobOperator/releases/tag/2.0.12
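The gist of the fix can be sketched as header normalization. This is an illustrative sketch under the description above; as_bearer_header is a hypothetical helper, not the operator's actual function name:

```python
def as_bearer_header(token: str) -> str:
    # Hypothetical sketch of the 2.0.12 behavior described above:
    # don't prepend "Bearer " if the token read from the credentials
    # already carries it, which would otherwise produce an invalid
    # "Bearer Bearer <token>" Authorization header.
    token = token.strip()
    if token.lower().startswith("bearer "):
        return token
    return f"Bearer {token}"


print(as_bearer_header("abc123"))         # Bearer abc123
print(as_bearer_header("Bearer abc123"))  # Bearer abc123
```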
Good day! This was indeed the issue; upon upgrading to 2.0.12, everything worked flawlessly. I will mark this as solved. Thanks a lot for your hard work.
@arnor2000 Does this work for you as well?
Sorry, I didn't have an accessible environment to test easily.
Yes, it's fixed, thank you!