openshift / cluster-version-operator
License: Apache License 2.0
To get a list of current overrides, run:
$ oc get -o json clusterversion version | jq .spec.overrides
[
{
"kind": "APIService",
"name": "v1alpha1.packages.apps.redhat.com",
"unmanaged": true
}
]
However, the spec doesn't contain an overrides section:
spec:
channel: fast
clusterID: 37be53b4-bdbc-4b65-b76e-ddf9c2b671c6
upstream: http://localhost:8080/graph
So none of the following steps work.
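If the section is missing, it can be added under spec. A minimal fragment matching the APIService example above might look like this (the group value is my assumption, since the jq output above omits it):

```yaml
spec:
  overrides:
  - kind: APIService
    group: apiregistration.k8s.io   # assumed; the jq output above omits the group
    name: v1alpha1.packages.apps.redhat.com
    namespace: ""
    unmanaged: true
```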
Similarly, this command:
oc get clusterversion -o jsonpath='{.status.current.payload}{"\n"}' version
needs to be updated to:
oc get clusterversion -o jsonpath='{.status.desired.payload}{"\n"}' version
While trying to run the CVO locally as per the doc/dev, the section specifies the following command:
./_output/linux/amd64/cluster-version-operator -v5 start --release-image 4.4.0-rc.4
But since --listen defaults to "0.0.0.0:9099", the command fails with the following error unless --listen="" is appended:
F0926 00:28:13.708624 62174 start.go:24] error: --listen was not set empty, so --serving-cert-file must be set
The documentation should be updated to either specify --serving-cert-file or at least unset the listen option with --listen="".
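The failure comes from a startup validation that refuses a non-empty listen address without a serving certificate. A minimal sketch of that kind of check (hypothetical function and parameter names, not the CVO's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

// validateServing mirrors the kind of check that produces the error above:
// if a listen address is set, a serving cert/key pair must be provided.
func validateServing(listen, servingCertFile, servingKeyFile string) error {
	if listen == "" {
		return nil // serving disabled; no cert needed
	}
	if servingCertFile == "" || servingKeyFile == "" {
		return errors.New("--listen was not set empty, so --serving-cert-file must be set")
	}
	return nil
}

func main() {
	// The default listen address with no cert fails, as in the report.
	fmt.Println(validateServing("0.0.0.0:9099", "", ""))
	// Disabling listening passes.
	fmt.Println(validateServing("", "", ""))
}
```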
The sync loop being inline with the status update loop makes upgrades hard to debug (we have to wait for the whole loop to complete before status updates). Also, if a sync is running and you change the desired state, we have to wait for the sync to complete. Some other non-obvious errors then show up to users.
Instead, we should decouple the CV status from the sync loop, and make the sync loop cancellable.
To do that, I think we should:
End outcomes:
It would be nice to have a proper description of what 'cluster-version-operator' is, its features, current status, etc. on the repo 'front page' instead of build instructions.
CVO cannot deploy the image registry due to a cache sync error.
CVO payload: registry.svc.ci.openshift.org/openshift/origin-release:v4.0
CVO version: 4.0.0-0.alpha-2018-11-30-060640
Error in logs:
E1130 14:43:49.424010 1 sync.go:133] error running apply for serviceaccount "openshift-image-registry/cluster-image-registry-operator" (v1, 125 of 183): serviceaccounts "cluster-image-registry-operator" is forbidden: caches not synchronized
We would like to be able to restrict access by IP to the downloads route in the openshift-console project using the IP whitelist annotation.
This currently does not appear possible, as any update adding the required annotation is reverted by the cluster-version-operator.
Not sure what I've got here but this is a CI failure bringing up a cluster, from the CVO pod logs:
E0213 17:40:08.957050 1 task.go:57] error running apply for service "openshift-cloud-credential-operator/controller-manager-service" (84 of 273): services "controller-manager-service" is forbidden: caches not synchronized
E0213 17:40:28.980158 1 task.go:57] error running apply for service "openshift-cloud-credential-operator/controller-manager-service" (84 of 273): services "controller-manager-service" is forbidden: caches not synchronized
I0213 17:40:38.501814 1 leaderelection.go:209] successfully renewed lease openshift-cluster-version/version
I0213 17:40:40.483182 1 reflector.go:286] github.com/openshift/cluster-version-operator/vendor/github.com/openshift/client-go/config/informers/externalversions/factory.go:101: forcing resync
E0213 17:40:51.996136 1 task.go:57] error running apply for service "openshift-cloud-credential-operator/controller-manager-service" (84 of 273): services "controller-manager-service" is forbidden: caches not synchronized
I0213 17:40:51.996197 1 task_graph.go:438] No more reachable nodes in graph, continue
I0213 17:40:51.996203 1 task_graph.go:474] No more work
I0213 17:40:51.996221 1 task_graph.go:494] No more work for 3
I0213 17:40:51.996227 1 task_graph.go:494] No more work for 6
I0213 17:40:51.996234 1 task_graph.go:494] No more work for 7
I0213 17:40:51.996240 1 task_graph.go:494] No more work for 1
I0213 17:40:51.996246 1 task_graph.go:494] No more work for 4
I0213 17:40:51.996252 1 task_graph.go:494] No more work for 2
I0213 17:40:51.996252 1 task_graph.go:494] No more work for 0
I0213 17:40:51.996257 1 task_graph.go:494] No more work for 5
I0213 17:40:51.996277 1 task_graph.go:510] Workers finished
I0213 17:40:51.996290 1 task_graph.go:518] Result of work: [Could not update service "openshift-cloud-credential-operator/controller-manager-service" (84 of 273): the server has forbidden updates to this resource]
E0213 17:40:51.996341 1 sync_worker.go:263] unable to synchronize image (waiting 3m19.747206386s): Could not update service "openshift-cloud-credential-operator/controller-manager-service" (84 of 273): the server has forbidden updates to this resource
I0213 17:40:51.996400 1 cvo.go:298] Started syncing cluster version "openshift-cluster-version/version" (2019-02-13 17:40:51.996393402 +0000 UTC m=+2487.354191867)
I0213 17:40:51.996446 1 cvo.go:326] Desired version from operator is v1.Update{Version:"0.0.1-2019-02-13-164905", Image:"registry.svc.ci.openshift.org/ci-op-girsxxlp/release@sha256:ded54f5fb7dfe10f53176ac710f6309b05828dc0aa276b448ce5aefc8e5eae78"}
I0213 17:40:51.996541 1 cvo.go:300] Finished syncing cluster version "openshift-cluster-version/version" (144.1µs)
More logs available here: https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_cloud-credential-operator/31/pull-ci-openshift-cloud-credential-operator-master-e2e-aws/158
The cluster-network file is 0000_70_cluster-network-operator_03_daemonset.yaml. The release-image/0000_07_cluster-network-operator_03_daemonset.yaml file, despite the daemonset in its name, is actually a Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: network-operator
namespace: openshift-network-operator
- op: add
path: /spec/overrides
value:
- kind: Deployment
group: apps <--------- This
name: cluster-network-operator
namespace: openshift-network-operator
unmanaged: true
Hi,
We're trying to set up an OpenShift 4.4.3 cluster for a customer, and they need all of the worker nodes to be dedicated only to their business application pods (apart from daemonsets, of course). But the default installation comes with a bunch of cluster operators bundled in the release image, and their manifests cannot be changed. This leads to some cluster operator pods, like the marketplace operator's, being scheduled on worker nodes because they have no nodeSelector, while other operators have only a generic linux nodeSelector. Of course we can create infra nodes and manually change nodeSelectors to move things there, but the manifests will get reset with upgrades and we will have to patch them again. Is there a way to properly define nodeSelectors for Level 1 operators like cluster operators (assuming the CVO is the Level 0 operator)? It would be great to get even a small lead on this; we can contribute if the feature is not there. Thanks
Test PR: #210
Error: unknown command "openshift-kube-apiserver" for "hypershift"
Run 'hypershift --help' for usage.
unknown command "openshift-kube-apiserver" for "hypershift"
/cc @abhinavdahiya @wking
It still refers to status.version in the singular. There may be other things wrong too.
Access to the Cluster Version k8s object requires cluster role access, which makes it difficult to obtain the cluster ID.
Operator-metering is a useful tool for building reports from Prometheus data. For upcoming flows and customer interactions (support, billing, etc) it would be beneficial for the reports to contain the cluster ID.
If the cluster ID was available as label or its own metric in Prometheus that would help to simplify report origination.
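A sketch of what such a metric could look like, using only the standard library and the Prometheus text exposition format (the metric name cluster_id_info is hypothetical; the CVO's real metrics differ):

```go
package main

import "fmt"

// clusterIDMetric renders a constant "info"-style gauge whose only purpose
// is to carry the cluster ID as a label, in Prometheus exposition format.
func clusterIDMetric(clusterID string) string {
	return fmt.Sprintf("# TYPE cluster_id_info gauge\ncluster_id_info{cluster_id=%q} 1\n", clusterID)
}

func main() {
	// clusterID would come from ClusterVersion spec.clusterID.
	fmt.Print(clusterIDMetric("37be53b4-bdbc-4b65-b76e-ddf9c2b671c6"))
}
```

Reports could then join other series against the cluster_id label rather than reading the ClusterVersion object, which needs cluster-scoped RBAC.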
TEST_INTEGRATION=1 go test ./pkg/start/ -tags integration -count=4
E0325 22:45:48.713617 10713 sync_worker.go:276] unable to synchronize image (waiting 625ms): Could not update configmap "e2e-cvo-ff4l/config2" (2 of 2): the object is invalid, possibly due to local cluster configuration
E0325 22:45:48.922205 10713 leaderelection.go:256] error initially creating leader election record: namespaces "e2e-cvo-mlm6zv" not found
E0325 22:45:54.301108 10713 event.go:259] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'claytonc-mbp.local_4bb2f5bb-5f1d-4543-90dd-75de07986a26 stopped leading'
panic: close of closed channel
goroutine 63 [running]:
github.com/openshift/cluster-version-operator/pkg/start.(*Options).run.func3()
/Users/clayton/projects/origin/src/github.com/openshift/cluster-version-operator/pkg/start/start.go:190 +0x74
github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1(0xc0006e4240)
/Users/clayton/projects/origin/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:148 +0x40
github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc0006e4240, 0x216d1e0, 0xc000184600)
/Users/clayton/projects/origin/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:157 +0x112
github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection.RunOrDie(0x216d220, 0xc0000da010, 0x2174320, 0xc0001e5d40, 0x14f46b0400, 0xa7a358200, 0x6fc23ac00, 0xc0003b03f0, 0xc0000d13b0, 0x0)
/Users/clayton/projects/origin/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:166 +0x87
created by github.com/openshift/cluster-version-operator/pkg/start.(*Options).run
/Users/clayton/projects/origin/src/github.com/openshift/cluster-version-operator/pkg/start/start.go:157 +0x1ef
FAIL github.com/openshift/cluster-version-operator/pkg/start 75.329s
The following branches are being fast-forwarded from the current development branch (master) as placeholders for future releases. No merging is allowed into these release branches until they are unfrozen for production release.
release-4.18
release-4.19
For more information, see the branching documentation.
Hi,
I'm using the Cluster Version Operator on openshift 4.12 and OVN-Kubernetes CNI plugin.
Now I want to test my cluster with Calico, but the port 9099 is used by CVO and doesn't allow the Calico routing module (Felix) to start.
I would like to know whether stopping the CVO can cause any issues in my cluster.
How can I change the listening port of the CVO?
Best Regards,
We have the following override in our ClusterVersion:
- group: imageregistry.operator.openshift.io
kind: Config
name: cluster
namespace: ""
unmanaged: true
This is causing cluster provisioning to fail, because when the operator encounters this manifest...
0000_30_config-operator_01_operator.cr.yaml
apiVersion: operator.openshift.io/v1
kind: Config
metadata:
name: cluster
annotations:
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
include.release.openshift.io/single-node-developer: "true"
release.openshift.io/create-only: "true"
spec:
managementState: Managed
... the getOverrideForManifest function improperly matches it to the imageregistry.operator.openshift.io override above, because it disregards the Group in its comparison (imageregistry.operator.openshift.io != operator.openshift.io).
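A group-aware comparison would avoid the false match. A minimal sketch (hypothetical types, not the actual getOverrideForManifest code):

```go
package main

import "fmt"

// Override mirrors the shape of a ClusterVersion spec.overrides entry.
type Override struct {
	Kind, Group, Namespace, Name string
	Unmanaged                    bool
}

// Manifest identifies an object from the release payload.
type Manifest struct {
	Kind, Group, Namespace, Name string
}

// overrideFor matches a manifest against overrides, including Group in the
// comparison so a Config in imageregistry.operator.openshift.io does not
// shadow a Config in operator.openshift.io.
func overrideFor(m Manifest, overrides []Override) *Override {
	for i, o := range overrides {
		if o.Kind == m.Kind && o.Group == m.Group && o.Namespace == m.Namespace && o.Name == m.Name {
			return &overrides[i]
		}
	}
	return nil
}

func main() {
	overrides := []Override{{Kind: "Config", Group: "imageregistry.operator.openshift.io", Name: "cluster", Unmanaged: true}}
	configOperatorCR := Manifest{Kind: "Config", Group: "operator.openshift.io", Name: "cluster"}
	// With Group compared, the config-operator CR no longer matches the
	// image-registry override.
	fmt.Println(overrideFor(configOperatorCR, overrides))
}
```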
As a result, the cluster-config-operator has no custom resource to act on and it blocks the cluster-version-operator from ever completing:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version False True 3h18m Working towards 4.9.7: 725 of 735 done (98% complete), waiting on config-operator
The CVO needs to be able to install operators that take care of creating a viable service network. To do this, it could run with hostNetwork: true and cleverly set KUBERNETES_SERVICE_PORT and KUBERNETES_SERVICE_HOST to point to the local kube-apiserver.
If we did this, I think it could come up after the apiserver, controller, and scheduler, and before anything else. I bumped into this while trying to get it running from a kube control plane.
@mfojtik @knobunc @ironcladlou @abhinavdahiya @smarterclayton @derekwaynecarr
Hey folks,
The contributing documentation makes no mention of requiring a login to build the container image. In this case it's the registry.ci.openshift.org/ocp/ubi container image that appears to require authentication.
Can you please clarify whether there is an authentication method for public contributors, or whether a different URL should be there?
What happened
When the CVO reports a failing cluster on v4.4, it returns a Failing condition, e.g.:
conditions:
- lastTransitionTime: "2020-08-12T07:08:36Z"
message: Done applying 4.4.10
status: "True"
type: Available
- lastTransitionTime: "2020-08-12T06:53:47Z"
status: "False"
type: Failing
- lastTransitionTime: "2020-08-12T07:08:36Z"
message: Cluster version is 4.4.10
status: "False"
type: Progressing
- lastTransitionTime: "2020-08-12T07:08:57Z"
message: The update channel has not been configured.
reason: NoChannel
status: "False"
type: RetrievedUpdates
This is an expected state related to the code: https://github.com/openshift/cluster-version-operator/blob/release-4.4/pkg/cvo/status.go#L240
However, the openshift/api expects one of the states https://github.com/openshift/api/blob/release-4.4/config/v1/types_cluster_operator.go#L141:
OperatorAvailable ClusterStatusConditionType = "Available"
OperatorProgressing ClusterStatusConditionType = "Progressing"
OperatorDegraded ClusterStatusConditionType = "Degraded"
OperatorUpgradeable ClusterStatusConditionType = "Upgradeable"
What you expected to happen
openshift/cvo and openshift/api should have matching conditions defined.
Hi
Is there any way to get the communication to api.openshift.com running through an HTTP proxy on version 4.1.7?
We can't update the cluster over the UI because it can't reach api.openshift.com, as the HTTP proxy seems not to be configured in the operator.
Everything else on the cluster is configured to run through Proxy.
Regards
Basically all our upgrade jobs are dead. The CVO should be able to say what isn't yet completed, but it just says "x% complete". We should try to produce a reasonable message to aid debugging. The source error here:
{
"apiVersion": "config.openshift.io/v1",
"kind": "ClusterOperator",
"metadata": {
"creationTimestamp": "2019-06-29T12:41:31Z",
"generation": 1,
"name": "machine-config",
"resourceVersion": "52233",
"selfLink": "/apis/config.openshift.io/v1/clusteroperators/machine-config",
"uid": "38f06bd4-9a6b-11e9-b262-12ce5335583c"
},
"spec": {},
"status": {
"conditions": [
{
"lastTransitionTime": "2019-06-29T13:48:35Z",
"message": "Cluster not available for 0.0.1-2019-06-29-122423",
"status": "False",
"type": "Available"
},
{
"lastTransitionTime": "2019-06-29T13:35:16Z",
"message": "Working towards 0.0.1-2019-06-29-122423",
"status": "True",
"type": "Progressing"
},
{
"lastTransitionTime": "2019-06-29T13:48:35Z",
"message": "Unable to apply 0.0.1-2019-06-29-122423: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 0)",
"reason": "FailedToSync",
"status": "True",
"type": "Degraded"
}
],
"extension": {
"lastSyncError": "error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 0)",
"master": "pool is degraded because nodes fail with \"1 nodes are reporting degraded status on sync\": \"Node ip-10-0-132-28.ec2.internal is reporting: \\\"failed to run pivot: failed to start machine-config-daemon-host.service: exit status 1\\\"\"",
"worker": "pool is degraded because nodes fail with \"1 nodes are reporting degraded status on sync\": \"Node ip-10-0-139-237.ec2.internal is reporting: \\\"failed to run pivot: failed to start machine-config-daemon-host.service: exit status 1\\\"\""
},
The first two I looked at were the machine-config operator.
In a libvirt cluster I just launched using:
$ openshift-install version
openshift-install v0.1.0-52-gedc4d97104f7fefbe6ce778d18aaf53299f8af59
Terraform v0.11.8
I'm seeing:
[core@wking-bootstrap ~]$ kubectl logs -n openshift-cluster-version bootstrap-cluster-version-operator-wking-bootstrap
I1005 21:39:48.036769 1 start.go:67] ClusterVersionOperator v0.0.0-97-ga5a76d51-dirty
I1005 21:39:48.037010 1 start.go:180] Loading kube client config from path "/etc/kubernetes/kubeconfig"
...
E1005 21:42:10.909970 1 event.go:259] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cluster-version-operator", GenerateName:"", Namespace:"openshift-cluster-version", SelfLink:"/api/v1/namespaces/openshift-cluster-version/configmaps/cluster-version-operator", UID:"d45bde3c-c8e4-11e8-8408-0214269547a8", ResourceVersion:"9259", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63674371377, loc:(*time.Location)(0x1bf25a0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"wking-bootstrap_efc11fa1-5144-45db-a2cb-568952d64f05\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2018-10-05T21:42:10Z\",\"renewTime\":\"2018-10-05T21:42:10Z\",\"leaderTransitions\":8}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'no kind is registered for the type v1.ConfigMap'. Will not report event: 'Normal' 'LeaderElection' 'wking-bootstrap_efc11fa1-5144-45db-a2cb-568952d64f05 became leader'
...
I1005 21:42:12.267560 1 sync.go:24] Running sync for (servicecertsigner.config.openshift.io/v1alpha1, Kind=ServiceCertSignerOperatorConfig) /instance
E1005 21:42:12.534057 1 memcache.go:147] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
I1005 21:42:12.556363 1 sync.go:60] Done syncing for (servicecertsigner.config.openshift.io/v1alpha1, Kind=ServiceCertSignerOperatorConfig) /instance
...
I1005 21:42:14.179926 1 sync.go:60] Done syncing for (/v1, Kind=Service) openshift-operator-lifecycle-manager/package-server
I1005 21:42:14.180040 1 sync.go:24] Running sync for (image.openshift.io/v1, Kind=ImageStream) /
I1005 21:42:14.336201 1 request.go:485] Throttling request took 155.956498ms, request: GET:https://wking-api.installer.testing:6443/apis/image.openshift.io/v1/imagestreams
I1005 21:42:14.349166 1 cvo.go:201] Finished syncing operator "openshift-cluster-version/cluster-version-operator" (3.337348292s)
E1005 21:42:14.349428 1 runtime.go:66] Observed a panic: &runtime.TypeAssertionError{interfaceString:"runtime.Object", concreteString:"*unstructured.UnstructuredList", assertedString:"*unstructured.Unstructured", missingMethod:""} (interface conversion: runtime.Object is *unstructured.UnstructuredList, not *unstructured.Unstructured)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/iface.go:252
/usr/local/go/src/runtime/iface.go:262
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/dynamic/simple.go:197
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/internal/generic.go:31
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/internal/generic.go:88
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/sync.go:51
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/sync.go:33
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:243
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:115
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:173
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:162
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:146
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: interface conversion: runtime.Object is *unstructured.UnstructuredList, not *unstructured.Unstructured [recovered]
panic: interface conversion: runtime.Object is *unstructured.UnstructuredList, not *unstructured.Unstructured
goroutine 76 [running]:
github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0x11562c0, 0xc4202d3e00)
/usr/local/go/src/runtime/panic.go:502 +0x229
github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/dynamic.(*dynamicResourceClient).Get(0xc420e173b0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/client-go/dynamic/simple.go:197 +0x90f
github.com/openshift/cluster-version-operator/pkg/cvo/internal.applyUnstructured(0x13e3760, 0xc420e173b0, 0xc4200a6ca0, 0xc4200a6ca0, 0x0, 0x0, 0x13e3760)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/internal/generic.go:31 +0x99
github.com/openshift/cluster-version-operator/pkg/cvo/internal.(*genericBuilder).Do(0xc420576180, 0xc420405b80, 0x29f)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/internal/generic.go:88 +0x72
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).syncUpdatePayload.func1(0xa, 0x0, 0x0)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/sync.go:51 +0x241
github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x2540be400, 0x3ff4cccccccccccd, 0x0, 0x3, 0xc4207e5b20, 0x2c0, 0xc42041ef00)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203 +0x9c
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).syncUpdatePayload(0xc42043cb00, 0xc420481380, 0xc4205511a0, 0x3b, 0xc4205511a0)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/sync.go:33 +0x749
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).sync(0xc42043cb00, 0xc4203d4440, 0x32, 0x0, 0x0)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:243 +0x49a
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).(github.com/openshift/cluster-version-operator/pkg/cvo.sync)-fm(0xc4203d4440, 0x32, 0xc4203e3b00, 0x10e38c0)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:115 +0x3e
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).processNextWorkItem(0xc42043cb00, 0xc4203d2800)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:173 +0xe0
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).worker(0xc42043cb00)
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:162 +0x2b
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).(github.com/openshift/cluster-version-operator/pkg/cvo.worker)-fm()
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:146 +0x2a
github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc42026fae0)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42026fae0, 0x3b9aca00, 0x0, 0x1, 0xc42008c900)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc42026fae0, 0x3b9aca00, 0xc42008c900)
/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).Run
/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:146 +0x1d0
I just ran the command at https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusterversion.md#finding-your-current-update-image , and the output is empty for my cluster:
[root@ocp42-inf ~]# oc get clusterversion -o jsonpath='{.status.current.image}{"\n"}' version
Then I checked the ClusterVersionStatus API at https://github.com/openshift/api/blob/master/config/v1/types_cluster_version.go#L81 and found that current is no longer a field. Am I missing anything? Should I get this from .status.history[0].image instead?
[root@ocp42-inf ~]# oc get clusterversion -o jsonpath='{.status.history[0].image}{"\n"}' version
quay.io/openshift-release-dev/ocp-release@sha256:c5337afd85b94c93ec513f21c8545e3f9e36a227f55d41bc1dfb8fcc3f2be129
Question: How does the CVO monitor for a new image when auto-update is enabled?
How does it look for a new ocp-release image pushed to the container catalog (which would be used in the future for OCP 4.0, if I am not wrong)?
Let me know your thoughts.
Hi, team! I'm a newbie to OpenShift. While reading the CVO source code, I found that it only checks the coInformer's and cvInformer's cache HasSynced(), but does not check the others. Is that a deliberate design?
cluster-version-operator/pkg/cvo/cvo.go
Lines 215 to 223 in dfe5ef5
The error message imagestreams.image.openshift.io "origin-v4.0" not found is displayed when I execute the following command:
oc adm release new -n openshift --server https://api.ci.openshift.org \
--from-image-stream=origin-v4.0 \
--to-image-base=docker.io/abhinavdahiya/origin-cluster-version-operator:latest \
--to-image docker.io/abhinavdahiya/origin-release:latest
The steps follow the doc here.
How can I execute this command correctly?
The CVO is creating an SCC for itself. This fails when installing on a kube-cluster and since the openshift-apiserver is created via an operator installed by the CVO, this creates a cycle.
Instead, create a clusterrole and clusterrolebinding for the SCC that will eventually exist.
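A sketch of that replacement, with a hypothetical SCC name: RBAC rules may name resources that do not exist yet, so the ClusterRole can reference the SCC before anything creates it.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-version-operator-scc   # hypothetical name
rules:
- apiGroups: ["security.openshift.io"]
  resources: ["securitycontextconstraints"]
  resourceNames: ["cluster-version-operator"]   # the SCC that will eventually exist
  verbs: ["use"]
```

A ClusterRoleBinding would then bind this role to the relevant service account.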
The following branches are being fast-forwarded from the current development branch (master) as placeholders for future releases. No merging is allowed into these release branches until they are unfrozen for production release.
release-4.16
release-4.17
For more information, see the branching documentation.
While starting an OpenShift cluster we saw that we didn't have a running console operator.
Operator logs are here:
https://pastebin.pl/view/9a2143c2
After restarting the operator's pod everything worked fine.
Referring to this code:
cluster-version-operator/pkg/cvo/sync_worker.go
Lines 545 to 546 in 70c0232
Assuming the sync went correctly this could look like:
metricPayload.WithLabelValues(r.version, "pending").Set(float64(r.total-r.done))
metricPayload.WithLabelValues(r.version, "applied").Set(float64(r.done))
which should be the same as:
metricPayload.WithLabelValues(r.version, "pending").Set(float64(0))
metricPayload.WithLabelValues(r.version, "applied").Set(float64(r.total))
I suggest putting it to the test and using the former.
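"Put it to the test" can be checked directly; a tiny sketch (hypothetical helper, not CVO code) confirming the two formulations agree whenever the sync completed, i.e. r.done == r.total:

```go
package main

import "fmt"

// payloadGauges returns the (pending, applied) values the two snippets
// above would set, computed from the general done/total form.
func payloadGauges(done, total int) (pending, applied float64) {
	return float64(total - done), float64(done)
}

func main() {
	// After a fully successful sync done == total, so the general form
	// collapses to the constant form from the issue.
	p, a := payloadGauges(668, 668)
	fmt.Println(p == 0, a == 668) // true true
	// Mid-sync, only the general form is meaningful.
	fmt.Println(payloadGauges(8, 668)) // 660 8
}
```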
Upgrading from 4.7.0 to 4.7.9, an x509 error occurred in two places: one calling api-int.xxxx and one calling prometheus-operator.openshift-monitoring.svc:8080. I was able to fix the first by manually running update-ca-trust with the cert from the API server, but I'm not sure how to handle the second since it's a cluster-internal URI.
cluster-version-operator log shown below:
I0523 01:03:07.033591 1 cvo.go:481] Started syncing cluster version "openshift-cluster-version/version" (2021-05-23 01:03:07.033585342 +0000 UTC m=+82331.507064819)
I0523 01:03:07.041158 1 cvo.go:510] Desired version from spec is v1.Update{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", Force:false}
I0523 01:03:07.041264 1 sync_worker.go:227] Update work is equal to current target; no change required
I0523 01:03:07.041289 1 status.go:161] Synchronizing errs=field.ErrorList{} status=&cvo.SyncWorkerStatus{Generation:2, Step:"ApplyResources", Failure:error(nil), Done:8, Total:668, Completed:0, Reconciling:false, Initial:false, VersionHash:"qi_N6BhDM3k=", LastProgress:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Actual:v1.Release{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", URL:"https://access.redhat.com/errata/RHBA-2021:1365", Channels:[]string(nil)}, Verified:false}
I0523 01:03:07.041331 1 status.go:81] merge into existing history completed=false desired=v1.Release{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", URL:"https://access.redhat.com/errata/RHBA-2021:1365", Channels:[]string{"candidate-4.7", "candidate-4.8", "fast-4.7", "stable-4.7"}} last=&v1.UpdateHistory{State:"Partial", StartedTime:v1.Time{Time:time.Time{wall:0x0, ext:63757219493, loc:(*time.Location)(0x223c360)}}, CompletionTime:(*v1.Time)(nil), Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", Verified:true}
I0523 01:03:07.041474 1 cvo.go:483] Finished syncing cluster version "openshift-cluster-version/version" (7.885146ms)
E0523 01:03:07.136035 1 task.go:112] error running apply for prometheusrule "openshift-cluster-version/cluster-version-operator" (9 of 668): Internal error occurred: failed calling webhook "prometheusrules.openshift.io": Post "https://prometheus-operator.openshift-monitoring.svc:8080/admission-prometheusrules/validate?timeout=5s": x509: certificate signed by unknown authority
I0523 01:03:07.193126 1 cvo.go:554] Finished syncing available updates "openshift-cluster-version/version" (162.643079ms
Priority classes docs:
https://docs.openshift.com/container-platform/3.11/admin_guide/scheduling/priority_preemption.html#admin-guide-priority-preemption-priority-class
Example: https://github.com/openshift/cluster-monitoring-operator/search?q=priority&unscoped_q=priority
Notes: The pre-configured system priority classes (system-node-critical and system-cluster-critical) can only be assigned to pods in kube-system or openshift-* namespaces. Most likely, core operators and their pods should be assigned system-cluster-critical. Please do not assign system-node-critical (the highest priority) unless you are really sure about it.
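As a sketch, an operator Deployment in an openshift-* namespace would opt in via its pod template (names here are illustrative):

```yaml
spec:
  template:
    spec:
      priorityClassName: system-cluster-critical
      containers:
      - name: operator
        image: example.io/operator:latest
```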
Hi, all,
I built a cluster using openshift-installer v0.6.0 today. But I find the manifest of the OLM is old, so how can I use the latest version of the OLM? Details below:
[jzhang@dhcp-140-18 installer]$ oc get clusterversion version -o yaml
...
current:
payload: quay.io/openshift-release-dev/ocp-release@sha256:4f02d5c7183360a519a7c7dbe601f58123c9867cd5721ae503072ae62920575b
version: 0.0.1-2018-12-08-172651
...
[jzhang@dhcp-140-18 installer]$ oc adm release extract --from=quay.io/openshift-release-dev/ocp-release@sha256:4f02d5c7183360a519a7c7dbe601f58123c9867cd5721ae503072ae62920575b --to=release-payload
[jzhang@dhcp-140-18 installer]$ ls release-payload/ | grep 30
0000_30_00-namespace.yaml
0000_30_01-olm-operator.serviceaccount.yaml
0000_30_02-clusterserviceversion.crd.yaml
0000_30_03-installplan.crd.yaml
0000_30_04-subscription.crd.yaml
0000_30_05-catalogsource.crd.yaml
0000_30_06-rh-operators.configmap.yaml
0000_30_07-certified-operators.configmap.yaml
0000_30_08-certified-operators.catalogsource.yaml
0000_30_09-rh-operators.catalogsource.yaml
0000_30_10-olm-operator.deployment.yaml
0000_30_11-catalog-operator.deployment.yaml
0000_30_12-aggregated.clusterrole.yaml
0000_30_13-packageserver.csv.yaml
0000_30_14-operatorgroup.crd.yaml
One more question: if I want to update the CVO to the latest version, how do I do that? The current version is 0.0.1-2018-12-08-172651.