Comments (10)
@zawadimario this is not a fix; it just disables a security setting.
from kubeflow.
I would like to take on this ticket.
from kubeflow.
I followed your steps to reproduce the issue but I did not find any error logs on my end.
On k8s version 1.25 (on GKE) I installed PVC Viewer Controller using the following:
kustomize build apps/pvcviewer-controller/upstream/default | kubectl apply -f -
Here are my logs for PVC Viewer Controller.
tariqhasan@Tariqs-MacBook-Air manifests % kubectl get pods -n kubeflow
NAME                                            READY   STATUS    RESTARTS      AGE
pvcviewer-controller-manager-85bc664bd6-c5hdz   3/3     Running   1 (68s ago)   80s
tariqhasan@Tariqs-MacBook-Air manifests % kubectl logs -f pvcviewer-controller-manager-85bc664bd6-c5hdz -n kubeflow
I0224 23:39:46.526471 1 request.go:601] Waited for 1.033863986s due to client-side throttling, not priority and fairness, request: GET:https://10.104.0.1:443/apis/storage.k8s.io/v1beta1?timeout=32s
2024-02-24T23:39:47Z INFO controller-runtime.metrics Metrics server is starting to listen {"addr": "127.0.0.1:8080"}
2024-02-24T23:39:47Z INFO controller-runtime.builder Registering a mutating webhook {"GVK": "kubeflow.org/v1alpha1, Kind=PVCViewer", "path": "/mutate-kubeflow-org-v1alpha1-pvcviewer"}
2024-02-24T23:39:47Z INFO controller-runtime.webhook Registering webhook {"path": "/mutate-kubeflow-org-v1alpha1-pvcviewer"}
2024-02-24T23:39:47Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "kubeflow.org/v1alpha1, Kind=PVCViewer", "path": "/validate-kubeflow-org-v1alpha1-pvcviewer"}
2024-02-24T23:39:47Z INFO controller-runtime.webhook Registering webhook {"path": "/validate-kubeflow-org-v1alpha1-pvcviewer"}
2024-02-24T23:39:47Z INFO setup starting manager
2024-02-24T23:39:47Z INFO controller-runtime.webhook.webhooks Starting webhook server
2024-02-24T23:39:47Z INFO Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
2024-02-24T23:39:47Z INFO Starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-02-24T23:39:47Z INFO controller-runtime.certwatcher Updated current TLS certificate
I0224 23:39:47.586415 1 leaderelection.go:248] attempting to acquire leader lease kubeflow/57a72bdf.kubeflow.org...
2024-02-24T23:39:47Z INFO controller-runtime.webhook Serving webhook server {"host": "", "port": 9443}
2024-02-24T23:39:47Z INFO controller-runtime.certwatcher Starting certificate watcher
I0224 23:39:47.596987 1 leaderelection.go:258] successfully acquired lease kubeflow/57a72bdf.kubeflow.org
2024-02-24T23:39:47Z DEBUG events pvcviewer-controller-manager-85bc664bd6-c5hdz_2b1df526-7d8b-49d7-996f-820f580f6d99 became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"kubeflow","name":"57a72bdf.kubeflow.org","uid":"64838ab3-2979-4c4f-a264-17a44e8159d3","apiVersion":"coordination.k8s.io/v1","resourceVersion":"55914"}, "reason": "LeaderElection"}
2024-02-24T23:39:47Z INFO Starting EventSource {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer", "source": "kind source: *v1alpha1.PVCViewer"}
2024-02-24T23:39:47Z INFO Starting EventSource {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer", "source": "kind source: *v1.Deployment"}
2024-02-24T23:39:47Z INFO Starting EventSource {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer", "source": "kind source: *v1.Service"}
2024-02-24T23:39:47Z INFO Starting EventSource {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer", "source": "kind source: *unstructured.Unstructured"}
2024-02-24T23:39:47Z INFO Starting Controller {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer"}
2024-02-24T23:39:47Z INFO Starting workers {"controller": "pvcviewer", "controllerGroup": "kubeflow.org", "controllerKind": "PVCViewer", "worker count": 1}
Can you please provide more context around your error?
cc: @juliusvonkohout @kimwnasptd
from kubeflow.
Cc @TobiasGoerke since he is the maintainer of that component.
from kubeflow.
I also cannot reproduce the issue. Whose IP is 10.111.0.1 in your cluster? Have you deployed any netpols / authpolicies that are blocking network access?
from kubeflow.
I have had a similar experience when deploying it in an IPv6 environment. Any fix or workaround @TobiasGoerke?
from kubeflow.
I have had a similar experience when deploying it in an IPv6 environment. Any fix or workaround @TobiasGoerke?
The controller is very similar to other Kubeflow controllers from a technical point of view, so I wouldn't know why it would crash while the others don't.
Could you please share some logs and info about how you deploy Kubeflow?
from kubeflow.
Well @TobiasGoerke, when I deploy it in Minikube it runs well, but not when I deploy it in an IPv6 environment (a VM running Rocky Linux 9). How do I deploy it? I just follow the kustomize instructions in the Kubeflow manifests GitHub repository. Below are my logs. I have omitted the IP addresses.
2024-03-15T07:23:27Z ERROR Failed to get API Group-Resources {"error": "Get \"https://[:8001]:443/api?timeout=32s\": dial tcp [:8001]:443: connect: connection refused"}
sigs.k8s.io/controller-runtime/pkg/cluster.New
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/cluster/cluster.go:160
sigs.k8s.io/controller-runtime/pkg/manager.New
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/manager.go:344
main.main
/workspace/main.go:70
runtime.main
/usr/local/go/src/runtime/proc.go:250
2024-03-15T07:23:27Z ERROR setup unable to start manager {"error": "Get \"https://[:8001]:443/api?timeout=32s\": dial tcp [:8001]:443: connect: connection refused"}
main.main
/workspace/main.go:90
runtime.main
/usr/local/go/src/runtime/proc.go:250
from kubeflow.
@TobiasGoerke and @tariq-hasan I have found a fix for my problem.
I just commented out these two lines and it worked.
# seccompProfile:
#   type: RuntimeDefault
So your manifest at manifests/apps/pvcviewer-controller/upstream/manager/manager.yaml should look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
  namespace: system
  labels:
    control-plane: controller-manager
    app.kubernetes.io/name: deployment
    app.kubernetes.io/instance: controller-manager
    app.kubernetes.io/component: manager
    app.kubernetes.io/created-by: pvc-viewer
    app.kubernetes.io/part-of: pvc-viewer
    app.kubernetes.io/managed-by: kustomize
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  replicas: 1
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
      labels:
        control-plane: controller-manager
    spec:
      securityContext:
        runAsNonRoot: true
      containers:
      - command:
        - /manager
        args:
        - --leader-elect
        image: docker.io/kubeflownotebookswg/pvcviewer-controller
        imagePullPolicy: IfNotPresent
        name: manager
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - "ALL"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
        # More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
        resources:
          limits:
            cpu: 500m
            memory: 128Mi
          requests:
            cpu: 10m
            memory: 64Mi
      serviceAccountName: controller-manager
      terminationGracePeriodSeconds: 10
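For anyone who wants to try the same workaround without editing the upstream file, a kustomize overlay could remove the field instead. This is only a sketch under assumptions: the overlay directory is hypothetical, the Deployment name pvcviewer-controller-manager is inferred from the pod name shown earlier in this thread, and the patch path assumes seccompProfile sits under the pod-level securityContext (adjust the path if yours is under the container). As pointed out in this thread, this turns off a security setting rather than fixing the root cause.

```yaml
# kustomization.yaml in a hypothetical overlay directory, e.g.
# apps/pvcviewer-controller/overlays/no-seccomp/ (not part of the upstream repo)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../upstream/default
patches:
# JSON 6902 patch; the build fails if the path does not exist,
# which is a useful check that the assumed location is right.
- target:
    group: apps
    version: v1
    kind: Deployment
    # Assumed final name, inferred from the pod name in the thread
    name: pvcviewer-controller-manager
  patch: |-
    - op: remove
      path: /spec/template/spec/securityContext/seccompProfile
```

It would be applied the same way as the upstream manifests, by piping kustomize build for the overlay directory into kubectl apply -f -.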
from kubeflow.
I understand. I only hope a permanent fix can be found. I do not know whether this happens only on certain Linux distributions or whether it is a general bug. Looking forward to a permanent solution.
from kubeflow.