OpenShift Container Storage Operator

This is the primary operator for Red Hat OpenShift Container Storage (OCS). It is a "meta" operator, meaning it serves to facilitate the other operators in OCS by performing administrative tasks outside their scope as well as watching and configuring their CustomResources (CRs).

Deploying pre-built images

Prerequisites

OCS Operator will install its components only on nodes labelled for OCS with cluster.ocs.openshift.io/openshift-storage=''.

To label the nodes from the CLI:

$ oc label nodes <NodeName> cluster.ocs.openshift.io/openshift-storage=''

OCS requires at least 3 nodes labelled this way.

Note: When deploying via Console, the creation wizard takes care of labelling the selected nodes.

Dedicated nodes

If dedicated storage nodes are available, they can also be tainted so that only OCS components are scheduled on them. Taint the nodes with node.ocs.openshift.io/storage=true:NoSchedule from the CLI as follows,

$ oc adm taint nodes <NodeNames> node.ocs.openshift.io/storage=true:NoSchedule

Note: The dedicated/tainted nodes will only run OCS components and will not run any application pods. Therefore, if you taint nodes, you need additional untainted worker nodes; otherwise you will be unable to run any other apps in your OpenShift cluster.
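
For reference, pods that should run on such tainted nodes carry a matching toleration. This is a generic Kubernetes toleration sketch for the taint above, not a manifest taken from this repo:

```yaml
# Pod spec fragment tolerating the OCS storage taint.
tolerations:
  - key: node.ocs.openshift.io/storage
    operator: Equal
    value: "true"
    effect: NoSchedule
```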

Installation

The OCS operator can be installed into an OpenShift cluster using Operator Lifecycle Manager (OLM).

For a quick install using pre-built container images, deploy the deploy-with-olm.yaml manifest.

$ oc create -f ./deploy/deploy-with-olm.yaml

This creates:

  • a custom CatalogSource
  • a new openshift-storage Namespace
  • an OperatorGroup
  • a Subscription for OCS and a Subscription for NooBaa, both pointing at the OCS catalog in the openshift-storage namespace

You can check the status of the CSVs using the following command:

$ oc get csv -n openshift-storage
NAME                      DISPLAY                       VERSION   REPLACES   PHASE
noobaa-operator.v5.14.0   NooBaa Operator               5.14.0               Succeeded
ocs-operator.v4.16.0      OpenShift Container Storage   4.16.0               Succeeded

This can take a few minutes. Once PHASE says Succeeded you can create a StorageCluster.

A StorageCluster can be created from the console using the StorageCluster creation wizard. From the CLI, a StorageCluster resource can be created from the example CR as follows,

$ oc create -f ./config/samples/ocs_v1_storagecluster.yaml
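
A trimmed sketch of what such a CR looks like is shown below. Field names follow the ocs.openshift.io/v1 API; the exact contents (device-set count, storage size, storage class) may differ from the sample shipped in the repo:

```yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  storageDeviceSets:
    - name: ocs-deviceset        # illustrative name
      count: 3
      dataPVCTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Ti       # illustrative size
          storageClassName: gp2  # illustrative backing storage class
          volumeMode: Block
```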

Development


Build

OCS Operator

The operator image can be built via:

$ make ocs-operator

OCS Metric Exporter

The metric exporter image can be built via:

$ make ocs-metrics-exporter

OCS Operator Bundle

To create an operator bundle image, run

$ make operator-bundle

Note: Push the OCS bundle image to an image registry before moving to the next step.

OCS Operator Catalog

An operator catalog image can then be built using,

$ make operator-catalog

Deploying development builds

To install your own development builds of OCS, first set your own REGISTRY_NAMESPACE and IMAGE_TAG.

$ export REGISTRY_NAMESPACE=<quay-username>
$ export IMAGE_TAG=<some-tag>
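
These two variables are combined into the image references used in the push commands below. A quick sanity check of the resulting name (the values here are placeholders, not real accounts or tags):

```shell
# Placeholder values -- substitute your own quay.io namespace and tag.
export REGISTRY_NAMESPACE=myquayuser
export IMAGE_TAG=latest-dev

# The push commands below compose references of the form
# quay.io/<namespace>/<image>:<tag>:
echo "quay.io/${REGISTRY_NAMESPACE}/ocs-operator:${IMAGE_TAG}"
```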

Then build and push the ocs-operator and ocs-metrics-exporter images to your own image repository.

$ make ocs-operator
$ podman push quay.io/$REGISTRY_NAMESPACE/ocs-operator:$IMAGE_TAG

$ make ocs-metrics-exporter
$ podman push quay.io/$REGISTRY_NAMESPACE/ocs-metrics-exporter:$IMAGE_TAG

Then build and push the operator bundle image.

$ make operator-bundle
$ podman push quay.io/$REGISTRY_NAMESPACE/ocs-operator-bundle:$IMAGE_TAG

Next build and push the operator catalog image.

$ make operator-catalog
$ podman push quay.io/$REGISTRY_NAMESPACE/ocs-operator-catalog:$IMAGE_TAG

Now create a namespace and an OperatorGroup for OCS.

$ oc create ns openshift-storage

$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
    - openshift-storage
EOF

Then add a new CatalogSource using the newly built and pushed catalog image. Note that the unquoted EOF delimiter lets the shell expand $REGISTRY_NAMESPACE and $IMAGE_TAG inside the heredoc.

$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ocs-catalogsource
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/$REGISTRY_NAMESPACE/ocs-operator-catalog:$IMAGE_TAG
  displayName: OpenShift Container Storage
  publisher: Red Hat
EOF

Finally, create the OCS and NooBaa Subscriptions.

$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ocs-subscription
  namespace: openshift-storage
spec:
  channel: alpha
  name: ocs-operator
  source: ocs-catalogsource
  sourceNamespace: openshift-marketplace
EOF

$ cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: noobaa-subscription
  namespace: openshift-storage
spec:
  channel: alpha
  name: noobaa-operator
  source: ocs-catalogsource
  sourceNamespace: openshift-marketplace
EOF

Initial Configuration

When the operator starts, it will create a single OCSInitialization resource, which triggers the creation of various pieces of initial configuration, including the default StorageClasses.

The OCSInitialization resource is a singleton. If the operator sees one that it did not create, it will write an error message to its status explaining that it is being ignored.

Modifying Initial Configuration

You may modify or delete any of the operator's initial data. To reset and restore that data to its initial state, delete the OCSInitialization resource. It will be recreated, and all associated resources will be either recreated or restored to their original state.

Functional Tests

Our functional test suite uses the ginkgo testing framework. The ginkgo functests test suite in this repo is for developers. As new functionality is introduced into the ocs-operator, this repo allows developers to prove their functionality works by including tests within their PR. This is the test suite where we exercise ocs-operator deployment/update/uninstall as well as some basic workload functionality like creating PVCs.

Prerequisites for running Functional Tests

  • OCS must already be installed
  • KUBECONFIG env var must be set

Running functional tests

make functest

Below is some sample output of what to expect.

Building functional tests
hack/build-functest.sh
GINKO binary found at /home/dvossel/go/bin/ginkgo
Compiling functests...
    compiled functests.test
Running functional test suite
hack/functest.sh
Running Functional Test Suite
Running Suite: Tests Suite
==========================
Random Seed: 1568299067
Will run 1 of 1 specs

•
Ran 1 of 1 Specs in 7.961 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS

Functional test phases

There are three phases of the functional tests to be aware of.

  1. BeforeSuite: At this step, the StorageCluster object is created and the test blocks waiting for the StorageCluster to come online.

  2. Test Execution: Every written test can assume at this point that a StorageCluster is online and PVC actions should succeed.

  3. AfterSuite: This is where test artifact cleanup occurs. Right now all tests should execute in the ocs-test namespace in order for artifacts to be cleaned up properly.

NOTE: The StorageCluster created in the BeforeSuite phase is not cleaned up. If you run the functional test suite multiple times, BeforeSuite will simply succeed quickly by detecting that the StorageCluster already exists.

Developing Functional Tests

All the functional test code lives in the functests/ directory. For an example of how a functional test is structured, look at the functests/pvc_creation_test.go file.

The tests themselves should invoke simple to understand steps. Put any complex logic into separate helper files in the functests/ directory so test flows are easy to follow.

Running a single test

When developing a test, it's common to just want to run a single functional test rather than the whole suite. This can be done using ginkgo's "focus" feature.

All you have to do is put an F in front of the test's declaration to force only that test to run. So, if you have a test like It("some test") defined, change it to FIt("some test") to force the test suite to execute only that single test.

Make sure to remove the focus from your test before creating the pull request. Otherwise the test suite will fail in CI.

Debugging Functional Test Failures

If an e2e test fails, you have access to two sets of data to help debug why the error occurred.

Functional test stdout log

This will tell you what test failed and it also outputs some debug information pertaining to the test cluster's state after the test suite exits. In prow you can find this log by clicking on the details link to the right of the ci/prow/ocs-operator-bundle-e2e-aws test entry on your PR. From there you can click the Raw build-log.txt link to view the full log.

PROW artifacts

In addition to the raw test stdout, each e2e test result has a set of artifacts associated with it that you can view using prow. These artifacts let you retroactively view information about the test cluster even after the e2e job has completed.

To browse through the e2e test cluster artifacts, click on the details link to the right of the ci/prow/ocs-operator-bundle-e2e-aws test entry on your PR. From there look at the top right hand corner for the artifacts link. That will bring you to a directory tree. Follow the artifacts/ directory to the ocs-operator-bundle-e2e-aws/ directory. There you can find logs and information pertaining to objects in the cluster.

ocs-operator's People

Contributors

agarwal-mudit, aruniiird, bipuladh, crombus, dannyzaken, davidvossel, gowthamshanmugam, iamniting, jarrpa, kshlm, leelavg, madhu-1, malayparida2000, nb-ohad, nbalacha, nikhil-ladha, obnoxxx, openshift-ci[bot], openshift-merge-bot[bot], openshift-merge-robot, priyanka19-98, raghavendra-talur, sp98, subhamkrai, synarete, thotz, umangachapagain, vbnrh, weirdwiz, yati1998


ocs-operator's Issues

Something off - think we need to add OperatorSource yaml and make sure naming conventions are followed

Missing OperatorSource. I'm not sure of the correct YAML to use, but it should probably be something like this. It needs to point to the quay.io appregistry repo that is being used.

apiVersion: operators.coreos.com/v1
kind: OperatorSource
metadata:
  name: ocs-catalogsource
  namespace: openshift-marketplace
spec:
  type: appregistry
  endpoint: https://quay.io
  registryNamespace: ocs-dev
  displayName: "OCS Catalog Source"
  publisher: "Red Hat"

Currently, what we are seeing is that we have a CatalogSource, OperatorGroup and Subscription created, but what I think we are missing is the OperatorSource, which ties the CatalogSourceConfig to the CatalogSource. I successfully went through this process to add the AWS S3 Operator into OCP; the process I followed and how to test this in OCP are in this readme: https://github.com/yard-turkey/awss3operator

Current Issue:
We are not getting the ocs-operator to show up in the Hub - and the CSV is never created - it just sits there and hangs.

# oc logs catalog-operator-7f6cd5c4cb-hqckr -n openshift-operator-lifecycle-manager
time="2019-08-22T17:07:20Z" level=info msg="retrying openshift-storage"
E0822 17:07:20.766270       1 queueinformer_operator.go:186] Sync "openshift-storage" failed: {ocs-operator alpha  {ocs-catalogsource openshift-marketplace}} not found: CatalogSource {ocs-catalogsource openshift-marketplace} not found
time="2019-08-22T17:07:20Z" level=info msg="retrying openshift-storage"
[root@ip-10-0-30-4 ~]# oc get csv -n openshift-storage
No resources found.

[root@ip-10-0-30-4 ~]# oc get catalogsources --all-namespaces
NAMESPACE                              NAME                      NAME                          TYPE       PUBLISHER   AGE
openshift-marketplace                  certified-operators       Certified Operators           grpc       Red Hat     2d1h
openshift-marketplace                  community-operators       Community Operators           grpc       Red Hat     2d1h
openshift-marketplace                  local-storage-manifests   Local Storage Operator        grpc       Red Hat     139m
openshift-marketplace                  ocs-catalogsource         Openshift Container Storage   grpc       Red Hat     139m
openshift-marketplace                  redhat-operators          Red Hat Operators             grpc       Red Hat     2d1h
openshift-operator-lifecycle-manager   olm-operators             OLM Operators                 internal   Red Hat     2d1h

[root@ip-10-0-30-4 openshift]# oc logs ocs-catalogsource-xjswz -n openshift-marketplace
time="2019-08-22T16:57:37Z" level=info msg="serving registry" database=bundles.db port=50051

[root@ip-10-0-30-4 openshift]# oc get operatorgroups --all-namespaces
NAMESPACE                              NAME                              AGE
local-storage                          local-operator-group              165m
openshift-monitoring                   openshift-cluster-monitoring      2d1h
openshift-operator-lifecycle-manager   olm-operators                     2d1h
openshift-operators                    global-operators                  2d1h
openshift-storage                      openshift-storage-operatorgroup   165m

Additionally, no other events are generated, just nothing happens, and in the console when I look at the subscription, there is no CSV - it's not finding it or not pulling it down correctly, I suspect it's the OperatorSource that is causing this.

@jarrpa @kshlm

rfe: support for disconnected installations

We are running a disconnected installation and would like to use the ocs-operator.

I tried changing the image versions in the OLM config, but it still keeps referring to images hosted elsewhere. The images used in this operator are also tag-based instead of manifest-based, so using the new cluster image content resource is not possible at this time.

must-gather: Investigate better Pod filtering for collecting logs

As of #324, the must-gather scripts use an inverted grep statement to filter out which pods should have logs gathered. There's concern that this may have weird corner cases we're not seeing, or at the very least is not the most technically correct way to do it. We should look into a more exact alternative, possibly using -o go-template.

StorageCluster Graceful Deletion

When a StorageCluster is marked for deletion, we need to gracefully tear down the StorageCluster owned objects before the StorageCluster is completely removed from the cluster.

This can be achieved by using a finalizer on the StorageCluster object. A finalizer allows us to delay the deletion of the StorageCluster object until we've completely cleaned up all the cluster objects related to the Storage Cluster. We'd essentially be codifying some of the steps around removing the CephCluster, Noobaa core deployments, and StorageClassInitialization that the hack/cluster-cleanup.sh script performs today.
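
Mechanically, a finalizer is just an entry in the object's metadata: Kubernetes will not delete the object until the operator's cleanup finishes and the entry is removed. A sketch of what this looks like on the StorageCluster (the finalizer name here is hypothetical, not taken from this repo):

```yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
  finalizers:
    - storagecluster.ocs.openshift.io/cleanup   # hypothetical finalizer name
```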

registry bundle container build fails

The bundle docker image building fails for me giving an error about invalid YAML to JSON conversion. msg="could not decode contents of file /registry/bundles.db into package: error converting YAML to JSON: yaml: control characters are not allowed"

We need a CI job to verify building both the operator and the bundle container images passes.

Below is the full output of the hack/build-registry-bundle.sh command.

~/go/src/github.com/openshift/ocs-operator/deploy ~/go/src/github.com/openshift/ocs-operator
Sending build context to Docker daemon  68.61kB
Step 1/5 : FROM quay.io/openshift/origin-operator-registry
 ---> 28477b68e90e
Step 2/5 : COPY olm-catalog /registry
 ---> 758ddc2eca3e
Step 3/5 : RUN initializer --manifests /registry --output bundles.db
 ---> Running in bebd0054e223
time="2019-08-09T17:21:08Z" level=info msg="loading Bundles" dir=/registry
time="2019-08-09T17:21:08Z" level=info msg=directory dir=/registry file=registry load=bundles
time="2019-08-09T17:21:08Z" level=info msg=directory dir=/registry file=ocs-operator load=bundles
time="2019-08-09T17:21:08Z" level=info msg=directory dir=/registry file=0.0.1 load=bundles
time="2019-08-09T17:21:09Z" level=info msg="found csv, loading bundle" dir=/registry file=ocs-operator.v0.0.1.clusterserviceversion.yaml load=bundles
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=noobaabackingstore.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=noobaabucketclass.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=noobaanoobaa.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=ocs-operator.v0.0.1.clusterserviceversion.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=ocsinitialization.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=rookcephblockpools.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=rookcephclusters.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=rookcephobjectstores.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=rookcephobjectstoreusers.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading bundle file" dir=/registry file=storagecluster.crd.yaml load=bundle
time="2019-08-09T17:21:09Z" level=info msg="loading Packages and Entries" dir=/registry
time="2019-08-09T17:21:09Z" level=info msg=directory dir=/registry file=registry load=package
time="2019-08-09T17:21:09Z" level=fatal msg="could not decode contents of file /registry/bundles.db into package: error converting YAML to JSON: yaml: control characters are not allowed"
The command '/bin/sh -c initializer --manifests /registry --output bundles.db' returned a non-zero code: 1

Add a periodic job to run the full ocs-ci test-suite

Currently, only a subset of the tests in the ocs-ci test suite is run as pre-submit jobs for each PR.

We need to setup a periodic job (a few times a week or nightly), to run the full testsuite.

In addition to setting up the periodic job, we also need to setup notifications for failed runs.

Resolve problems reported by shellcheck for gather_ceph_resources script

When I run shellcheck on must-gather/collection-scripts/gather_ceph_resources script, I see the following problems:


In must-gather/collection-scripts/gather_ceph_resources line 26:
    secretName=`oc get secrets -n $namespace $CEPH_MON_SECRET_NAME -o jsonpath="{.metadata.name}"`
               ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                                  ^--------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 27:
    if [ -z ${secretName} ]; then 
            ^-----------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 30:
    adminKey=`oc get secrets -n $namespace $CEPH_MON_SECRET_NAME -o jsonpath="{.data.admin-secret}" | base64 --decode`
             ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                                ^--------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 31:
    configMapName=`oc get configmap -n $namespace $CEPH_MON_CONFIGMAP_NAME -o jsonpath="{.metadata.name}"`
                  ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                                       ^--------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 32:
    if [ -z ${configMapName} ]; then
            ^--------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 35:
    monEndPoints=`oc get configmap -n $namespace $CEPH_MON_CONFIGMAP_NAME -o jsonpath="{.data.data}"`
                 ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                                      ^--------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 36:
    monEndPoints=`echo ${monEndPoints} | sed "s/[a-z]\+=//g" | sed "s/rook-ceph-mon[0-9]\+=//g"`
                 ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                       ^-------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 39:
    cat ${CEPH_KEYRING_TEMPLATE} | safe_replace "REPLACE_WITH_KEYRING" ${adminKey} > ${KEYRING_FILE}
        ^----------------------^ SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
                                                                       ^---------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 40:
    cat ${CEPH_CONFIG_TEMPLATE} | safe_replace "REPLACE_WITH_MON_ENDPOINTS" ${monEndPoints} | safe_replace "REPLACE_WITH_KEYRING_PATH" ${KEYRING_FILE} > ${CEPH_CONFIG_FILE}
        ^---------------------^ SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
                                                                            ^-------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 79:
for resource in ${ceph_resources[@]}; do
                ^------------------^ SC2068: Double quote array expansions to avoid re-splitting elements.


In must-gather/collection-scripts/gather_ceph_resources line 80:
    openshift-must-gather --base-dir=${CEPH_COLLLECTION_PATH} inspect ${resource} --all-namespaces
                                     ^----------------------^ SC2086: Double quote to prevent globbing and word splitting.
                                                                      ^---------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 85:
    openshift-must-gather --base-dir=${CEPH_COLLLECTION_PATH} inspect ns/${ns}
                                     ^----------------------^ SC2086: Double quote to prevent globbing and word splitting.
                                                                         ^---^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 86:
    if [ $(generate_config ${ns}) -eq 1 ]; then
         ^----------------------^ SC2046: Quote this to prevent word splitting.
                           ^---^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 91:
    mkdir -p ${COMMAND_OUTPUT_DIR}
             ^-------------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 92:
    mkdir -p ${COMMAND_JSON_OUTPUT_DIR}
             ^------------------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 97:
        ${ceph_commands[$i]} --connect-timeout=15 >> ${COMMAND_OUTPUT_FILE}
                                                     ^--------------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 99:
        ${ceph_commands[$i]} --connect-timeout=15 --format json-pretty >> ${JSON_COMMAND_OUTPUT_FILE}
                                                                          ^-------------------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 105:
        for osdPod in $(oc get pods -n ${ns} -l app=rook-ceph-osd --no-headers | awk '{print $1}'); do
                                       ^---^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 106:
            pod_status=`oc get po ${osdPod} -n ${ns} -o jsonpath='{.status.phase}'`
                       ^-- SC2006: Use $(...) notation instead of legacy backticked `...`.
                                  ^-------^ SC2086: Double quote to prevent globbing and word splitting.
                                               ^---^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 111:
            oc -n ${ns} exec ${osdPod} -- ${ceph_volume_commands[$i]} >> ${COMMAND_OUTPUT_FILE}
                  ^---^ SC2086: Double quote to prevent globbing and word splitting.
                             ^-------^ SC2086: Double quote to prevent globbing and word splitting.
                                          ^-------------------------^ SC2086: Double quote to prevent globbing and word splitting.
                                                                         ^--------------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 119:
        mkdir -p ${NODE_OUTPUT_DIR}
                 ^----------------^ SC2086: Double quote to prevent globbing and word splitting.


In must-gather/collection-scripts/gather_ceph_resources line 120:
        oc debug nodes/${node} -- bash -c "test -f /host/var/lib/rook/log/${ns}/ceph-volume.log && cat /host/var/lib/rook/log/${ns}/ceph-volume.log" > ${NODE_OUTPUT_DIR}/ceph-volume.log
                       ^-----^ SC2086: Double quote to prevent globbing and word splitting.
                                                                                                                                                       ^----------------^ SC2086: Double quote to prevent globbing and word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2068 -- Double quote array expansions to ...
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
  https://www.shellcheck.net/wiki/SC2086 -- Double quote to prevent globbing ...
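
Most of the findings above are SC2006 (legacy backticks instead of $(...)) and SC2086 (unquoted expansions). A self-contained illustration of why SC2086 matters, independent of the script itself:

```shell
value="rook ceph"                 # a value containing a space

count_words() { echo $#; }        # helper: counts its arguments

unquoted=$(count_words $value)    # unquoted: word-splits into two arguments
quoted=$(count_words "$value")    # quoted: stays one argument

echo "$unquoted $quoted"          # prints: 2 1
```

The same splitting happens when an unquoted variable is passed to oc, which is why shellcheck flags every `-n $namespace` in the script.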

Nodes tainted with 'node.ocs.openshift.io/storage=true:NoSchedule' by default

I'm running UAT for OCS-Operator 4.2 (master) on an OCP 4.2.0-0.ci-2019-09-25-072821 cluster, deploying through OLM (oc create -f ocs-operator/deploy/deploy-with-olm.yaml) and creating the StorageCluster using the UI, but multiple pods are not being scheduled.

Checking the logs, it points to the taint node.ocs.openshift.io/storage=true:NoSchedule applied to the worker nodes (the ones I chose using the UI), leaving the pods with errors such as 0/6 nodes are available: 6 node(s) had taints that the pod didn't tolerate.

The problem seems to be fixable by removing the taint from the nodes with oc adm taint node <workers> node.ocs.openshift.io/storage:NoSchedule-

OCS CatalogSource Pod uses invalid image name

--- Bug description ---
Sometimes while deploying, the CatalogSource pod responsible for serving the OCS Operator to the cluster ends up with an improper image name and refuses to start because of that.

The incorrect image name is Image: quay.io//ocs-registry:latest. The behaviour happens because the ocs-dev part in Image: quay.io/ocs-dev/ocs-registry:latest gets cut, somehow. When ocs-dev is put back into place, the installation process continues and both the Local Storage Operator and the OCS Operator get installed.

--- How to reproduce ---
I didn't manage to find a reproducible way to trigger the bug; however, some versions of the deployment image do not cause it to happen - that part seemed to work fine with the v0.0.1-alpha.1-15-g45140f3 and v0.0.1-alpha.1-21-gf63c4b3 images.

--- Expected behaviour ---
The pod's image will be set to Image: quay.io/ocs-dev/ocs-registry:latest and the operator will install itself successfully.

OSDs fail to come online with multiple worker nodes in the same AZ.

On AWS, some osds fail to come online when multiple worker nodes exist in the same availability zone (AZ).

The only consistent factor I've been able to identify is that the osds fail when two worker nodes exist in the same AZ. I don't understand the root cause yet. Below are the data points I have

Environments that work

Success: 3/3 osds come online

  • Replica 3
  • 3 Worker nodes
  • even distribution across 3 AZs

Success: 2/2 osds come online

  • Replica 2
  • 2 Worker nodes
  • even distribution across 2 AZs

Environments that fail

Failure: 2/3 or 1/3 osds come online

  • Replica 3
  • 3 Worker nodes
  • even distribution across 2 AZs

Failure: 1/2 osds come online

  • Replica 2
  • 3 Worker nodes
  • even distribution across 2 AZs

Failure debug

Events:
  Type     Reason              Age                 From                                   Message
  ----     ------              ----                ----                                   -------
  Normal   Scheduled           46m                 default-scheduler                      Successfully assigned openshift-storage/rook-ceph-osd-0-574bb9858b-kfltj to ip-10-0-129-254.ec2.internal
  Warning  FailedAttachVolume  46m                 attachdetach-controller                Multi-Attach error for volume "pvc-230b857f-d8af-11e9-b766-0e15af88a5cc" Volume is already used by pod(s) rook-ceph-osd-prepare-example-deviceset-2-khhxc-22t6n
  Warning  FailedMount         88s (x20 over 44m)  kubelet, ip-10-0-129-254.ec2.internal  Unable to mount volumes for pod "rook-ceph-osd-0-574bb9858b-kfltj_openshift-storage(43ee9141-d8af-11e9-bed1-0a89cbfd94e0)": timeout expired waiting for volumes to attach or mount for pod "openshift-storage"/"rook-ceph-osd-0-574bb9858b-kfltj". list of unmounted volumes=[example-deviceset-2-khhxc]. list of unattached volumes=[rook-data rook-config-override rook-ceph-log devices example-deviceset-2-khhxc example-deviceset-2-khhxc-bridge rook-binaries run-udev rook-ceph-osd-token-fcbk9]

Investigate a debugging strategy for must-gather

PR #327 redirects the output of the oc commands it invokes to /dev/null, which hides information that is important for debugging must-gather. Redirecting that output to a debug file instead would capture the information needed to diagnose a failed must-gather run.
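A minimal sketch of that idea (the file names and helper function here are assumptions for illustration, not the actual must-gather scripts):

```shell
# Sketch: keep the stderr of each collection command in a debug log
# instead of discarding it to /dev/null.
DEBUG_LOG="${DEBUG_LOG:-/tmp/must-gather-debug.log}"
: > "${DEBUG_LOG}"

run_collect() {
    # Record the command line, then run it with stderr appended to the log.
    echo "+ $*" >> "${DEBUG_LOG}"
    "$@" 2>> "${DEBUG_LOG}"
}

# A failing command's diagnostics are preserved rather than discarded:
run_collect ls /nonexistent-must-gather-path || true
cat "${DEBUG_LOG}"
```

With this pattern, the debug log records both what was run and why it failed, which is exactly the information lost when everything goes to /dev/null.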

Finalizer to ensure that Rook and Noobaa are removed before OCS-Operator

When OCS-Operator is removed, uninstallation should complete with no orphaned resources that might block reinstallation. OCS-Operator should therefore ensure that its immediate dependent operators (Rook and Noobaa) are fully removed before allowing its own removal. To enforce this, we should add a Finalizer to the operator that verifies these deletions are complete before the operator itself can be deleted. This Finalizer should not delete any dependent resources of Rook or Noobaa: those operators remain responsible for all resources they created throughout the uninstallation process.
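For illustration, the finalizer bookkeeping itself is just list manipulation on the CR's metadata; a minimal sketch (the finalizer name and helper names are assumptions, not the actual ocs-operator code):

```go
// Sketch of finalizer bookkeeping: the finalizer string is added to the CR's
// metadata.finalizers on reconcile, and only removed once dependent operators
// (Rook, Noobaa) are confirmed gone, which unblocks the CR's deletion.
package main

import "fmt"

// containsString reports whether the finalizer list already holds s.
func containsString(slice []string, s string) bool {
	for _, item := range slice {
		if item == s {
			return true
		}
	}
	return false
}

// removeString returns the finalizer list with s filtered out.
func removeString(slice []string, s string) []string {
	result := make([]string, 0, len(slice))
	for _, item := range slice {
		if item != s {
			result = append(result, item)
		}
	}
	return result
}

func main() {
	// "ocs.openshift.io/uninstall" is an assumed finalizer name.
	const f = "ocs.openshift.io/uninstall"
	finalizers := []string{}
	if !containsString(finalizers, f) {
		finalizers = append(finalizers, f) // set while the CR is live
	}
	// ...on deletion, after verifying Rook and Noobaa are fully removed:
	finalizers = removeString(finalizers, f)
	fmt.Println("finalizers after cleanup:", len(finalizers)) // prints: finalizers after cleanup: 0
}
```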

Cluster deletion fails

Deleting a fully deployed StorageCluster fails. Deleting a StorageCluster marks the owned CephCluster for deletion, but the CephCluster is never actually deleted, and the Rook-Ceph pods and resources continue to run.

The rook-ceph operator logs the following error:

2019-09-03 09:52:25.883690 I | op-cluster: cluster openshift-storage has a deletion timestamp
2019-09-03 09:52:25.884772 E | op-cluster: failed finalizer for cluster. failed to get volume attachments for operator namespace openshift-storage: the server could not find the requested resource (get volumes.rook.io)

This makes it hard to reuse an OCP cluster during development.

OB/OBC CRDs are not owned by OCS downstream (although it bundles them)

CC @nimrod-becker @jarrpa @davidvossel

Here is where the OCS CSV lists these CRDs under required: instead of owned:
https://github.com/openshift/ocs-operator/blob/34106927774f51584ca6fa3b6ef4e9dc559db39e/deploy/olm-catalog/ocs-operator/0.0.2/ocs-operator.v0.0.2.clusterserviceversion.yaml#L263-L302

This is because the OCS CSV copies the same required: entries that the NooBaa CSV uses upstream.

However, downstream these CRDs should be owned by the OCS operator.

The only reason this currently works is that the operator bundle includes those CRDs, which is a fishy situation: OLM is not aware that the OCS operator is the one owning them.
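Schematically, the downstream fix would move these CRDs from the required list to the owned list in the CSV. A simplified fragment (illustrative only, not the exact manifest; field values are assumptions):

```yaml
spec:
  customresourcedefinitions:
    owned:
      # Downstream, the OCS bundle ships these CRDs, so the CSV should
      # declare them as owned rather than required.
      - name: objectbuckets.objectbucket.io
        kind: ObjectBucket
        version: v1alpha1
        displayName: Object Bucket
      - name: objectbucketclaims.objectbucket.io
        kind: ObjectBucketClaim
        version: v1alpha1
        displayName: Object Bucket Claim
```

Declaring them owned lets OLM track the CRDs' lifecycle with the operator instead of treating them as external dependencies.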

make deps-update fails on clean master

$ go version
go version go1.13.4 darwin/amd64

$ git log --oneline | head -3
4a7b111 Merge pull request #342 from ashishranjan738/mustlogs
b5a9011 enhance(must-gather): updates scripts to collect logs in a debug.log
cddd7e2 Merge pull request #340 from ashishranjan738/osdcollectfix

$ make deps-update
go mod tidy && go mod vendor
go: downloading sigs.k8s.io/controller-runtime v0.2.0
go: downloading k8s.io/api v0.0.0-20190409021203-6e4e0e4f393b
go: downloading k8s.io/client-go v0.0.0-20190409021438-1a26190bd76a
go: downloading k8s.io/apimachinery v0.0.0-20190404173353-6a84e37a896d
go: downloading github.com/operator-framework/operator-sdk v0.10.0
go: downloading github.com/rook/rook v1.1.3
go: extracting k8s.io/apimachinery v0.0.0-20190404173353-6a84e37a896d
go: extracting github.com/operator-framework/operator-sdk v0.10.0
go: downloading github.com/openshift/custom-resource-status v0.0.0-20190812200727-7961da9a2eb7
go: extracting k8s.io/client-go v0.0.0-20190409021438-1a26190bd76a
go: extracting k8s.io/api v0.0.0-20190409021203-6e4e0e4f393b
go: extracting sigs.k8s.io/controller-runtime v0.2.0
go: extracting github.com/openshift/custom-resource-status v0.0.0-20190812200727-7961da9a2eb7
go: downloading github.com/go-logr/logr v0.1.0
go: downloading github.com/noobaa/noobaa-operator/v2 v2.0.8
go: downloading github.com/openshift/api v3.9.1-0.20190904155310-a25bb2adc83e+incompatible
go: downloading github.com/operator-framework/operator-lifecycle-manager v0.0.0-20190605231540-b8a4faf68e36
go: downloading github.com/stretchr/testify v1.3.0
go: downloading github.com/ghodss/yaml v1.0.0
go: extracting github.com/rook/rook v1.1.3
go: downloading k8s.io/kube-openapi v0.0.0-20190816220812-743ec37842bf
go: extracting github.com/go-logr/logr v0.1.0
go: extracting github.com/stretchr/testify v1.3.0
go: extracting github.com/openshift/api v3.9.1-0.20190904155310-a25bb2adc83e+incompatible
go: extracting github.com/ghodss/yaml v1.0.0
go: downloading github.com/onsi/gomega v1.7.0
go: extracting github.com/noobaa/noobaa-operator/v2 v2.0.8
go: extracting k8s.io/kube-openapi v0.0.0-20190816220812-743ec37842bf
go: extracting github.com/operator-framework/operator-lifecycle-manager v0.0.0-20190605231540-b8a4faf68e36
go: downloading k8s.io/apiextensions-apiserver v0.0.0-20190409022649-727a075fdec8
go: downloading github.com/google/gofuzz v1.0.0
go: downloading github.com/openshift/client-go v0.0.0-20190813201236-5a5508328169
go: downloading golang.org/x/net v0.0.0-20190827160401-ba9fcec4b297
go: extracting github.com/onsi/gomega v1.7.0
go: extracting github.com/google/gofuzz v1.0.0
go: extracting k8s.io/apiextensions-apiserver v0.0.0-20190409022649-727a075fdec8
go: downloading github.com/googleapis/gnostic v0.2.0
go: downloading github.com/evanphx/json-patch v4.5.0+incompatible
go: downloading gopkg.in/yaml.v2 v2.2.4
go: downloading k8s.io/klog v1.0.0
go: downloading github.com/prometheus/client_golang v0.9.4
go: downloading k8s.io/apiserver v0.0.0-20190531031430-24fd0f18bc21
go: downloading github.com/golang/protobuf v1.3.1
go: extracting github.com/openshift/client-go v0.0.0-20190813201236-5a5508328169
go: extracting golang.org/x/net v0.0.0-20190827160401-ba9fcec4b297
go: extracting github.com/evanphx/json-patch v4.5.0+incompatible
go: extracting gopkg.in/yaml.v2 v2.2.4
go: downloading github.com/RHsyseng/operator-utils v0.0.0-20190807020041-5344a0f594b8
go: extracting k8s.io/klog v1.0.0
go: extracting github.com/googleapis/gnostic v0.2.0
go: extracting github.com/prometheus/client_golang v0.9.4
go: downloading github.com/blang/semver v3.5.1+incompatible
go: extracting github.com/golang/protobuf v1.3.1
go: downloading k8s.io/kubernetes v1.14.2
go: downloading sigs.k8s.io/yaml v1.1.0
go: downloading github.com/gogo/protobuf v1.2.2-0.20190723190241-65acae22fc9d
go: extracting k8s.io/apiserver v0.0.0-20190531031430-24fd0f18bc21
go: extracting github.com/blang/semver v3.5.1+incompatible
go: downloading k8s.io/kube-aggregator v0.0.0-20190404125450-f5e124c822d6
go: extracting github.com/RHsyseng/operator-utils v0.0.0-20190807020041-5344a0f594b8
go: downloading gopkg.in/fsnotify.v1 v1.4.7
go: downloading github.com/pmezard/go-difflib v1.0.0
go: extracting sigs.k8s.io/yaml v1.1.0
go: downloading github.com/spf13/pflag v1.0.3
go: extracting k8s.io/kube-aggregator v0.0.0-20190404125450-f5e124c822d6
go: extracting github.com/pmezard/go-difflib v1.0.0
go: downloading github.com/kube-object-storage/lib-bucket-provisioner v0.0.0-20190924175516-f3ba69cc601e
go: extracting gopkg.in/fsnotify.v1 v1.4.7
go: extracting github.com/spf13/pflag v1.0.3
go: downloading github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
go: downloading github.com/prometheus/common v0.4.1
go: extracting github.com/gogo/protobuf v1.2.2-0.20190723190241-65acae22fc9d
go: downloading github.com/onsi/ginkgo v1.10.1
go: downloading github.com/davecgh/go-spew v1.1.1
go: downloading golang.org/x/sys v0.0.0-20190904005037-43c01164e931
go: downloading github.com/beorn7/perks v1.0.0
go: extracting github.com/kube-object-storage/lib-bucket-provisioner v0.0.0-20190924175516-f3ba69cc601e
go: extracting github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
go: extracting github.com/prometheus/common v0.4.1
go: extracting github.com/beorn7/perks v1.0.0
go: extracting github.com/davecgh/go-spew v1.1.1
go: downloading github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90
go: downloading go.uber.org/zap v1.10.0
go: downloading github.com/sirupsen/logrus v1.4.2
go: downloading github.com/json-iterator/go v1.1.7
go: downloading github.com/go-logr/zapr v0.1.1
go: extracting github.com/onsi/ginkgo v1.10.1
go: extracting github.com/sirupsen/logrus v1.4.2
go: extracting go.uber.org/zap v1.10.0
go: extracting github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90
go: extracting github.com/go-logr/zapr v0.1.1
go: extracting github.com/json-iterator/go v1.1.7
go: downloading github.com/emicklei/go-restful v2.9.5+incompatible
go: downloading github.com/imdario/mergo v0.3.7
go: extracting golang.org/x/sys v0.0.0-20190904005037-43c01164e931
go: downloading golang.org/x/text v0.3.2
go: downloading golang.org/x/time v0.0.0-20190308202827-9d24e82272b4
go: downloading go.uber.org/multierr v1.2.0
go: extracting github.com/emicklei/go-restful v2.9.5+incompatible
go: extracting github.com/imdario/mergo v0.3.7
go: extracting go.uber.org/multierr v1.2.0
go: downloading github.com/pkg/errors v0.8.1
go: downloading golang.org/x/crypto v0.0.0-20190829043050-9756ffdc2472
go: extracting golang.org/x/time v0.0.0-20190308202827-9d24e82272b4
go: downloading github.com/fsnotify/fsnotify v1.4.7
go: downloading github.com/prometheus/procfs v0.0.2
go: downloading k8s.io/utils v0.0.0-20190920012459-5008bf6f8cd6
go: downloading github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go: downloading github.com/coreos/prometheus-operator v0.29.0
go: extracting github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go: extracting k8s.io/utils v0.0.0-20190920012459-5008bf6f8cd6
go: extracting github.com/pkg/errors v0.8.1
go: extracting github.com/fsnotify/fsnotify v1.4.7
go: downloading github.com/go-openapi/validate v0.18.0
go: downloading github.com/golang/mock v1.2.1-0.20190329180013-73dc87cad333
go: downloading go.uber.org/atomic v1.4.0
go: extracting github.com/prometheus/procfs v0.0.2
go: downloading github.com/golang/groupcache v0.0.0-20190129154638-5b532d6fd5ef
go: downloading github.com/modern-go/reflect2 v1.0.1
go: extracting github.com/modern-go/reflect2 v1.0.1
go: extracting go.uber.org/atomic v1.4.0
go: extracting github.com/golang/groupcache v0.0.0-20190129154638-5b532d6fd5ef
go: downloading gomodules.xyz/jsonpatch/v2 v2.0.1
go: downloading golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
go: extracting github.com/golang/mock v1.2.1-0.20190329180013-73dc87cad333
go: downloading github.com/hpcloud/tail v1.0.0
go: downloading sigs.k8s.io/testing_frameworks v0.1.2
go: extracting github.com/go-openapi/validate v0.18.0
go: extracting github.com/coreos/prometheus-operator v0.29.0
go: downloading github.com/go-openapi/jsonpointer v0.19.2
go: extracting golang.org/x/crypto v0.0.0-20190829043050-9756ffdc2472
go: extracting github.com/hpcloud/tail v1.0.0
go: extracting gomodules.xyz/jsonpatch/v2 v2.0.1
go: extracting golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
go: extracting sigs.k8s.io/testing_frameworks v0.1.2
go: extracting github.com/go-openapi/jsonpointer v0.19.2
go: downloading gopkg.in/inf.v0 v0.9.1
go: downloading gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15
go: downloading github.com/konsorten/go-windows-terminal-sequences v1.0.2
go: downloading github.com/go-openapi/strfmt v0.18.0
go: downloading github.com/go-openapi/spec v0.19.2
go: extracting gopkg.in/inf.v0 v0.9.1
go: extracting github.com/konsorten/go-windows-terminal-sequences v1.0.2
go: extracting github.com/go-openapi/strfmt v0.18.0
go: extracting gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15
go: downloading github.com/hashicorp/golang-lru v0.5.1
go: downloading github.com/mitchellh/mapstructure v1.1.2
go: extracting github.com/go-openapi/spec v0.19.2
go: downloading github.com/go-openapi/analysis v0.17.2
go: extracting golang.org/x/text v0.3.2
go: extracting github.com/hashicorp/golang-lru v0.5.1
go: downloading github.com/go-openapi/errors v0.17.2
go: extracting github.com/mitchellh/mapstructure v1.1.2
go: downloading github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63
go: downloading github.com/go-openapi/runtime v0.17.2
go: extracting github.com/go-openapi/analysis v0.17.2
go: downloading github.com/globalsign/mgo v0.0.0-20181015135952-eeefdecb41b8
go: extracting github.com/go-openapi/errors v0.17.2
go: downloading gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7
go: extracting github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63
go: downloading github.com/asaskevich/govalidator v0.0.0-20180720115003-f9ffefc3facf
go: extracting github.com/go-openapi/runtime v0.17.2
go: extracting gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7
go: downloading github.com/go-openapi/jsonreference v0.19.2
go: downloading github.com/pborman/uuid v1.2.0
go: extracting github.com/globalsign/mgo v0.0.0-20181015135952-eeefdecb41b8
go: downloading github.com/go-openapi/swag v0.19.2
go: extracting github.com/asaskevich/govalidator v0.0.0-20180720115003-f9ffefc3facf
go: extracting github.com/go-openapi/jsonreference v0.19.2
go: downloading google.golang.org/appengine v1.6.1
go: extracting github.com/pborman/uuid v1.2.0
go: downloading github.com/PuerkitoBio/purell v1.1.1
go: downloading github.com/go-openapi/loads v0.17.2
go: extracting github.com/go-openapi/swag v0.19.2
go: downloading github.com/google/uuid v1.1.1
go: downloading github.com/matttproud/golang_protobuf_extensions v1.0.1
go: downloading cloud.google.com/go v0.40.0
go: downloading github.com/kr/pretty v0.1.0
go: extracting github.com/PuerkitoBio/purell v1.1.1
go: extracting google.golang.org/appengine v1.6.1
go: extracting github.com/go-openapi/loads v0.17.2
go: downloading github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578
go: extracting github.com/kr/pretty v0.1.0
go: downloading github.com/kr/text v0.1.0
go: extracting github.com/google/uuid v1.1.1
go: extracting github.com/matttproud/golang_protobuf_extensions v1.0.1
go: extracting github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578
go: extracting k8s.io/kubernetes v1.14.2
go: extracting github.com/kr/text v0.1.0
go: extracting cloud.google.com/go v0.40.0
go: downloading github.com/docker/distribution v2.7.1+incompatible
go: finding k8s.io/cloud-provider latest
go: downloading k8s.io/cloud-provider v0.0.0-20191121022508-6371aabbd7a7
go: extracting github.com/docker/distribution v2.7.1+incompatible
go: downloading github.com/opencontainers/go-digest v1.0.0-rc1
go: extracting github.com/opencontainers/go-digest v1.0.0-rc1
go: extracting k8s.io/cloud-provider v0.0.0-20191121022508-6371aabbd7a7
github.com/openshift/ocs-operator/pkg/deploy-manager imports
	github.com/operator-framework/operator-lifecycle-manager/pkg/controller/install imports
	k8s.io/kubernetes/plugin/pkg/auth/authorizer/rbac tested by
	k8s.io/kubernetes/plugin/pkg/auth/authorizer/rbac.test imports
	k8s.io/kubernetes/plugin/pkg/auth/authorizer/rbac/bootstrappolicy imports
	k8s.io/kubernetes/pkg/features imports
	k8s.io/cloud-provider/features: module k8s.io/cloud-provider@latest found (v0.0.0-20191121022508-6371aabbd7a7), but does not contain package k8s.io/cloud-provider/features
make: *** [deps-update] Error 1

$ git diff
diff --git a/go.sum b/go.sum
index 1bb3989..3a0e879 100644
--- a/go.sum
+++ b/go.sum
@@ -746,6 +746,7 @@ k8s.io/cli-runtime v0.0.0-20191005121332-4d28aef60981 h1:/o1WCRCwddWOUckroUAu1wd
 k8s.io/cli-runtime v0.0.0-20191005121332-4d28aef60981/go.mod h1:CICkH37E5f4cPVqJ/ZcIDn/kdvjJHq8sacFCdCyKHtU=
 k8s.io/client-go v0.0.0-20190409021438-1a26190bd76a h1:FH59qGH0V92P8HYzYVoCTnYTbIasdCuHjcD4fQzm6C0=
 k8s.io/client-go v0.0.0-20190409021438-1a26190bd76a/go.mod h1:7vJpHMYJwNQCWgzmNV+VYUl1zCObLyodBc8nIyt8L5s=
+k8s.io/cloud-provider v0.0.0-20191121022508-6371aabbd7a7 h1:y6z2bZTlzF1HF7tIiTZjf3ehGx5Eglipd93MeG6q3hQ=
 k8s.io/code-generator v0.0.0-20181203235156-f8cba74510f3/go.mod h1:MYiN+ZJZ9HkETbgVZdWw2AsuAi9PZ4V80cwfuf2axe8=
 k8s.io/code-generator v0.0.0-20190311093542-50b561225d70/go.mod h1:MYiN+ZJZ9HkETbgVZdWw2AsuAi9PZ4V80cwfuf2axe8=
 k8s.io/code-generator v0.0.0-20190717022600-77f3a1fe56bb/go.mod h1:cDx5jQmWH25Ff74daM7NVYty9JWw9dvIS9zT9eIubCY=

Validation warnings for ocs registry

We have an automation downstream for registry containers that validates the registry. It is currently showing a number of warnings:

http://external-ci-coldstorage.datahub.redhat.com/cvp/cvp-redhat-operator-metadata-validation-test/ocs-registry-container-4.2-103.142e3ba.master/b97282c0-7ce5-402f-ace1-9712657f2b6f/operator-metadata-linting-output.txt

This uses operator-courier [1] to verify the registry. These warnings might not point to an actual problem with the registry, but it would be nice if you could review them and suggest an action, i.e. fix/ignore/...

[1] https://github.com/operator-framework/operator-courier

CSV ocs-operator.v0.0.1 should be named just ocs-operator

I think the CSV name shouldn't contain the version, since the version can be taken from the version attribute here:
https://github.com/openshift/ocs-operator/blob/525734a3146d55d6125490d9e5acf03bed080f83/deploy/olm-catalog/ocs-operator/0.0.1/ocs-operator.v0.0.1.clusterserviceversion.yaml#L917

I also see that OpenShift's own OLM CSV does not have the version in its name:

$ oc get csv --all-namespaces
NAMESPACE                              NAME            DISPLAY          VERSION   REPLACES   PHASE
openshift-operator-lifecycle-manager   packageserver   Package Server   0.11.0               Succeeded

That would let our automation rely on a single name, ocs-operator, rather than ocs-operator.v0.0.1, and the name wouldn't change after an upgrade, IMO.

ocs-operator updates its status incorrectly

The ocs-operator updates its status to not ready when a StorageCluster object is created.

$ oc describe pods ocs-operator-56cffb8fc4-mlhkt
Name:               ocs-operator-56cffb8fc4-mlhkt
Namespace:          openshift-storage
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-179-129.us-west-2.compute.internal/10.0.179.129
Start Time:         Wed, 04 Sep 2019 12:25:54 +0530
Labels:             name=ocs-operator
                    pod-template-hash=56cffb8fc4
Annotations:        alm-examples:
                      
                      [
                          {
                              "apiVersion": "ocs.openshift.io/v1alpha1",
                              "kind": "StorageCluster",
                              "metadata": {
                                  "name": "example-storagecluster",
                                  "namespace": "openshift-storage"
                              },
                              "spec": {
                                  "manageNodes": false,
                                  "monPVCTemplate": {
                                      "spec": {
                                          "accessModes": [
                                              "ReadWriteOnce"
                                          ],
                                          "resources": {
                                              "requests": {
                                                  "storage": "10Gi"
                                              }
                                          },
                                          "storageClassName": "gp2"
                                      }
                                  },
                                  "storageDeviceSets": [
                                      {
                                          "count": 3,
                                          "dataPVCTemplate": {
                                              "spec": {
                                                  "accessModes": [
                                                      "ReadWriteOnce"
                                                  ],
                                                  "resources": {
                                                      "requests": {
                                                          "storage": "1Ti"
                                                      }
                                                  },
                                                  "storageClassName": "gp2",
                                                  "volumeMode": "Block"
                                              }
                                          },
                                          "name": "example-deviceset",
                                          "placement": {},
                                          "portable": true,
                                          "resources": {}
                                      }
                                  ]
                              }
                          },
                          {
                              "apiVersion": "ocs.openshift.io/v1alpha1",
                              "kind": "OCSInitialization",
                              "metadata": {
                                  "name": "example-ocsinitialization"
                              },
                              "spec": {}
                          },
                          {
                              "apiVersion": "ocs.openshift.io/v1alpha1",
                              "kind": "StorageClusterInitialization",
                              "metadata": {
                                  "name": "example-storageclusterinitialization"
                              },
                              "spec": {}
                          }
                      ]
                    capabilities: Full Lifecycle
                    categories: Storage
                    olm.operatorGroup: openshift-storage-operatorgroup
                    olm.operatorNamespace: openshift-storage
                    olm.targetNamespaces: openshift-storage
                    openshift.io/scc: restricted
Status:             Running
IP:                 10.130.2.9
Controlled By:      ReplicaSet/ocs-operator-56cffb8fc4
Containers:
  ocs-operator:
    Container ID:  cri-o://d6e7d968eb613f1965bd69de864df10321cc09d9d60aa0a8a128ae8c53fa17eb
    Image:         quay.io/kshlm/ocs-operator:201909041217
    Image ID:      quay.io/kshlm/ocs-operator@sha256:2efcf6d5a77640c65e5dd097b2d1515b22e49b838850d6a3dbc120a4c584552d
    Port:          60000/TCP
    Host Port:     0/TCP
    Command:
      ocs-operator
    State:          Running
      Started:      Wed, 04 Sep 2019 12:26:38 +0530
    Ready:          False
    Restart Count:  0
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=4s timeout=1s period=10s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:   (v1:metadata.annotations['olm.targetNamespaces'])
      POD_NAME:         ocs-operator-56cffb8fc4-mlhkt (v1:metadata.name)
      OPERATOR_NAME:    ocs-operator
      CEPH_IMAGE:       ceph/ceph:v14.2.2-20190828
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from ocs-operator-token-bvb6g (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  ocs-operator-token-bvb6g:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ocs-operator-token-bvb6g
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                                                 Message
  ----     ------     ----                   ----                                                 -------
  Normal   Scheduled  63m                    default-scheduler                                    Successfully assigned openshift-storage/ocs-operator-56cffb8fc4-mlhkt to ip-10-0-179-129.us-west-2.compute.internal
  Normal   Pulling    63m                    kubelet, ip-10-0-179-129.us-west-2.compute.internal  Pulling image "quay.io/kshlm/ocs-operator:201909041217"
  Normal   Pulled     63m                    kubelet, ip-10-0-179-129.us-west-2.compute.internal  Successfully pulled image "quay.io/kshlm/ocs-operator:201909041217"
  Normal   Created    63m                    kubelet, ip-10-0-179-129.us-west-2.compute.internal  Created container ocs-operator
  Normal   Started    63m                    kubelet, ip-10-0-179-129.us-west-2.compute.internal  Started container ocs-operator
  Warning  Unhealthy  3m42s (x351 over 62m)  kubelet, ip-10-0-179-129.us-west-2.compute.internal  Readiness probe failed: stat: cannot stat '/tmp/operator-sdk-ready': No such file or directory

The operator is still fully functional and can respond to updates to the CRs it watches.
The operator pod's readiness should not be tied to this; only the status of the StorageCluster should be updated, and that status should depend on the status of the Rook resources it creates.
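For context, the failing probe relies on the operator-sdk file-based readiness convention: stat /tmp/operator-sdk-ready succeeds only while the ready file exists, so the operator process decides readiness by creating or removing that file. A minimal sketch of that contract (helper names are illustrative):

```go
// Sketch of the file-based readiness contract behind the failing probe:
// the probe stats a well-known file, so the operator must create it once
// it considers itself ready, and remove it when it is not.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// readyFile is the path the probe stats (os.TempDir keeps the sketch portable).
var readyFile = filepath.Join(os.TempDir(), "operator-sdk-ready")

// setReady creates the ready file, making the readiness probe succeed.
func setReady() error {
	f, err := os.Create(readyFile)
	if err != nil {
		return err
	}
	return f.Close()
}

// unsetReady removes the ready file, making the readiness probe fail.
func unsetReady() error { return os.Remove(readyFile) }

func main() {
	if err := setReady(); err != nil {
		panic(err)
	}
	if _, err := os.Stat(readyFile); err != nil {
		panic(err) // the kubelet's probe would fail here
	}
	fmt.Println("probe would succeed") // prints: probe would succeed
	if err := unsetReady(); err != nil {
		panic(err)
	}
}
```

The bug report above suggests the operator never creates this file (or removes it on StorageCluster creation), even though it keeps reconciling normally.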

Remove promotion of the source image

The source image is currently being promoted to the global registry (for example, after a PR is merged on ocs-operator). The source image is not useful outside of the PR's test namespace and should not be promoted.

The promotion configuration should be removed from all config files.

csi-rbdplugin ds binds to 8080/tcp and collides with other pods

In KNI environments, port 8080/tcp is already in use by coredns, so the csi-rbdplugin DaemonSet cannot start.

$ for port in 8080 8081; do for node in $(oc get nodes -o jsonpath="{.items[*].metadata.name}"); do echo ${node}; ssh core@${node} "sudo ss -ntlp | grep ${port}"; done; done
kni1-master-0.example.com
LISTEN   0         128                       *:8080                   *:*        users:(("coredns",pid=3478,fd=3))                                                                                                                           
kni1-master-1.example.com
LISTEN   0         128                       *:8080                   *:*        users:(("coredns",pid=3508,fd=3))                                                                                                                           
kni1-master-2.example.com
LISTEN   0         128                       *:8080                   *:*        users:(("coredns",pid=3494,fd=3))                                                                                                                           
kni1-master-0.example.com
LISTEN   0         128            10.19.138.11:8081             0.0.0.0:*        users:(("cephcsi",pid=88352,fd=3))                                                                                                                          
kni1-master-1.example.com
LISTEN   0         128            10.19.138.12:8081             0.0.0.0:*        users:(("cephcsi",pid=91411,fd=3))                                             
kni1-master-2.example.com
LISTEN   0         128            10.19.138.13:8081             0.0.0.0:*        users:(("cephcsi",pid=92409,fd=3))
$ oc get pods -n openshift-storage -l app=csi-rbdplugin
NAME                  READY   STATUS             RESTARTS   AGE
csi-rbdplugin-cf85v   2/3     CrashLoopBackOff   6          8m41s
csi-rbdplugin-ctnmq   2/3     CrashLoopBackOff   6          8m41s
csi-rbdplugin-q4ljs   2/3     CrashLoopBackOff   6          8m41s
$ oc logs csi-rbdplugin-cf85v -c liveness-prometheus
I0823 14:01:01.172467  115756 cephcsi.go:99] Driver version: canary and Git version: 81c28d6cb0a6d64c4c84c90724275192eb64028f
I0823 14:01:01.172558  115756 cephcsi.go:140] Starting driver type: liveness with name: liveness.csi.ceph.com
I0823 14:01:01.172572  115756 liveness.go:89] Liveness Running
F0823 14:01:01.173757  115756 liveness.go:106] listen tcp 10.19.138.13:8080: bind: address already in use

I believe it would be nicer to bind to some other (random) port instead of a typical one like 8080.
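One option is to let the kernel pick a free ephemeral port by binding to port 0. A minimal Go sketch of the idea (illustrative only, not the actual cephcsi flag handling):

```go
// Sketch: listening on ":0" asks the kernel for a free port, avoiding
// hard-coded collisions like the 8080 clash with coredns described above.
package main

import (
	"fmt"
	"net"
)

// freePort binds to an ephemeral port and returns the number chosen.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	p, err := freePort()
	if err != nil {
		panic(err)
	}
	fmt.Println("kernel-chosen port:", p)
}
```

In practice the chosen port would then need to be advertised (e.g. via the pod spec or metrics endpoint discovery) rather than assumed, which is the trade-off of random ports versus a configurable fixed port.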

Need end-to-end functional/integration test (happy path) for OCS

This repository most urgently needs an end-to-end test covering the whole OCS system (OpenShift, ocs-operator, rook-ceph, noobaa, etc.) that exercises at least the basic happy path:

  • set it all up
  • verify basic functionality, e.g. provisioning apps with PVCs from the generated storage classes and using them
  • delete the PVCs
  • destroy the setup again

This is highest priority so that we can catch regressions early.
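As a concrete example of the provisioning step, the happy-path test could create a PVC against one of the generated storage classes and assert it reaches Bound (the storage class name below is an assumption for illustration):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ocs-e2e-smoke-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-ceph-rbd  # assumed generated class name
```

A PVC that never binds (or never deletes cleanly) would fail the test and catch exactly the kind of regressions described elsewhere in this list.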

make source-manifests is failing with 'make: *** [Makefile:79: source-manifests] Error 2'

This has been failing for a couple of days now in our downstream CI.

Relevant log files, using commit 2c0a70a:

+ export ROOK_IMAGE=quay.io/rhceph-dev/rook:4.2-153.37a9e6b0.master
+ ROOK_IMAGE=quay.io/rhceph-dev/rook:4.2-153.37a9e6b0.master
++ ./get-latest.py rhceph-dev/mcg-operator
+ export NOOBAA_IMAGE=quay.io/rhceph-dev/mcg-operator:4.2-87
+ NOOBAA_IMAGE=quay.io/rhceph-dev/mcg-operator:4.2-87
+ make source-manifests
Sourcing CSV and CRD manifests from component-level operators
hack/source-manifests.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   627    0   627    0     0   3215      0 --:--:-- --:--:-- --:--:--  3198

 29 68.5M   29 20.3M    0     0  17.7M      0  0:00:03  0:00:01  0:00:02 17.7M
 42 68.5M   42 29.2M    0     0  13.6M      0  0:00:05  0:00:02  0:00:03 9048k
 48 68.5M   48 33.1M    0     0  10.5M      0  0:00:06  0:00:03  0:00:03 6551k
 53 68.5M   53 36.7M    0     0  9074k      0  0:00:07  0:00:04  0:00:03 5594k
 59 68.5M   59 41.0M    0     0  8156k      0  0:00:08  0:00:05  0:00:03 5284k
 68 68.5M   68 47.2M    0     0  7861k      0  0:00:08  0:00:06  0:00:02 5495k
 74 68.5M   74 51.3M    0     0  7356k      0  0:00:09  0:00:07  0:00:02 4534k
 81 68.5M   81 56.1M    0     0  7048k      0  0:00:09  0:00:08  0:00:01 4695k
 87 68.5M   87 59.6M    0     0  6675k      0  0:00:10  0:00:09  0:00:01 4683k
 90 68.5M   90 61.8M    0     0  6238k      0  0:00:11  0:00:10  0:00:01 4263k
 96 68.5M   96 66.0M    0     0  6067k      0  0:00:11  0:00:11 --:--:-- 3858k
100 68.5M  100 68.5M    0     0  6068k      0  0:00:11  0:00:11 --:--:-- 3983k
make: *** [Makefile:79: source-manifests] Error 2

Request for non AWS-centric YAML examples

Later edit: I misunderstood the installation process, please disregard the bug below. OCS requires filesystem-type PVs on each node for the mons and block-type PVs for the device sets. After using the Local Storage Operator to provide them AAAANNND removing the storageClass from each one manually ( #310 ), they were picked up by the ocs-operator. The next place I got stuck at is "0/6 nodes are available: 1 node(s) had volume node affinity conflict, 5 node(s) didn't match node selector." from the pod named rook-ceph-osd-prepare-ocs-deviceset-2-0-XXXXXX-XXXX.

I will open a new issue regarding this.

--------- Original bug ---------
Hi guys,

We are attempting to evaluate OCS for a potential 2020/Q3 OpenShift deployment. Do you have any example YAML storageClusters for BareMetal deployments? We've created KVM VMs and BareMetal machines with available 1TiB LUNs. The defaults are:

spec:
  managedNodes: false
  storageDeviceSets:
    - count: 3
      dataPVCTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Ti
          storageClassName: ''
          volumeMode: Block
      name: ocs-deviceset
      placement: {}
      portable: true
      resources: {}
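For comparison, a bare-metal variant of the same spec would typically point dataPVCTemplate at a storage class backed by the Local Storage Operator; this is a sketch, and the class name is an assumption:

```yaml
spec:
  managedNodes: false
  storageDeviceSets:
    - count: 3
      dataPVCTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Ti
          storageClassName: localblock  # assumed LSO-provided block class
          volumeMode: Block
      name: ocs-deviceset
      placement: {}
      portable: false  # local PVs cannot move between nodes
      resources: {}
```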

The event stream is found below. The operator installation (ocs-init) works flawlessly, and in the recent builds the local-storage-operator is not required anymore (or is it?!?). But the storageCluster installation seems to fail in multiple spots:

- `/tmp/operator-sdk-ready` error for the ocs-operator pod ( #86 )
- "no persistent volumes available for this claim and no storage class is set" for the rook-ceph-mon-a persistent volume claim ( #310 )
- "pod has unbound immediate PersistentVolumeClaims" for the rook-ceph-mon-a pod ( #310 )
Razvan:~ razvan$ oc get events -n openshift-storage -o custom-columns=FirstOccurence:.firstTimestamp,COUNT:.count,LastOccurence:.lastTimestamp,Kind:.involvedObject.kind,Reason:.reason,ObjectName:.involvedObject.name,Message:.message --sort-by='.firstTimestamp' | grep -v ^$ | awk '{printf "| " $1 " | "$2 " | " $3 " | " $4 " | " $5 " | " $6 " | "; for(i=7;i<=NF;i++){printf "%s ", $i}; printf "|\n"}'
FirstOccurence Count LastOccurence Kind Reason ObjectName Message
2019-11-27T12:18:20Z 2 2019-11-27T12:18:21Z ClusterServiceVersion RequirementsUnknown ocs-operator.v0.0.1 requirements not yet checked
2019-11-27T12:18:28Z 1 2019-11-27T12:18:28Z ClusterServiceVersion RequirementsNotMet ocs-operator.v0.0.1 one or more requirements couldn't be found
2019-11-27T12:18:37Z 25 2019-11-27T13:33:25Z ClusterServiceVersion AllRequirementsMet ocs-operator.v0.0.1 all requirements found, attempting install
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Pod Scheduled ocs-operator-5f5459c784-7h6lb Successfully assigned openshift-storage/ocs-operator-5f5459c784-7h6lb to oshift-worker3-all-prod
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Pod Scheduled rook-ceph-operator-6b89bb7d9b-vsvms Successfully assigned openshift-storage/rook-ceph-operator-6b89bb7d9b-vsvms to oshift-worker3-all-prod
2019-11-27T12:18:57Z 4 2019-11-27T12:21:35Z ClusterServiceVersion InstallSucceeded ocs-operator.v0.0.1 waiting for install components to report healthy
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z ReplicaSet SuccessfulCreate noobaa-operator-77b5db4545 Created pod: noobaa-operator-77b5db4545-fpdhs
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Deployment ScalingReplicaSet rook-ceph-operator Scaled up replica set rook-ceph-operator-6b89bb7d9b to 1
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z ReplicaSet SuccessfulCreate rook-ceph-operator-6b89bb7d9b Created pod: rook-ceph-operator-6b89bb7d9b-vsvms
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Deployment ScalingReplicaSet noobaa-operator Scaled up replica set noobaa-operator-77b5db4545 to 1
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z ReplicaSet SuccessfulCreate ocs-operator-5f5459c784 Created pod: ocs-operator-5f5459c784-7h6lb
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Deployment ScalingReplicaSet ocs-operator Scaled up replica set ocs-operator-5f5459c784 to 1
2019-11-27T12:18:57Z 1 2019-11-27T12:18:57Z Pod Scheduled noobaa-operator-77b5db4545-fpdhs Successfully assigned openshift-storage/noobaa-operator-77b5db4545-fpdhs to oshift-worker3-all-prod
2019-11-27T12:18:59Z 1 2019-11-27T12:18:59Z ClusterServiceVersion InstallWaiting ocs-operator.v0.0.1 installing: Waiting: waiting for deployment rook-ceph-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
2019-11-27T12:19:04Z 1 2019-11-27T12:19:04Z Pod Pulling ocs-operator-5f5459c784-7h6lb Pulling image "quay.io/ocs-dev/ocs-operator:4.2.0"
2019-11-27T12:19:04Z 1 2019-11-27T12:19:04Z Pod Pulling noobaa-operator-77b5db4545-fpdhs Pulling image "noobaa/noobaa-operator:2.0.8"
2019-11-27T12:19:04Z 1 2019-11-27T12:19:04Z Pod Pulling rook-ceph-operator-6b89bb7d9b-vsvms Pulling image "rook/ceph:v1.1.4-27.gf20c056"
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Started ocs-operator-5f5459c784-7h6lb Started container ocs-operator
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Pulled ocs-operator-5f5459c784-7h6lb Successfully pulled image "quay.io/ocs-dev/ocs-operator:4.2.0"
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Pulled noobaa-operator-77b5db4545-fpdhs Successfully pulled image "noobaa/noobaa-operator:2.0.8"
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Created noobaa-operator-77b5db4545-fpdhs Created container noobaa-operator
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Created ocs-operator-5f5459c784-7h6lb Created container ocs-operator
2019-11-27T12:19:18Z 1 2019-11-27T12:19:18Z Pod Started noobaa-operator-77b5db4545-fpdhs Started container noobaa-operator
2019-11-27T12:19:31Z 1 2019-11-27T12:19:31Z Endpoints LeaderElection ceph.rook.io-block rook-ceph-operator-6b89bb7d9b-vsvms_2a1fff48-1110-11ea-ab0c-0a580af6058b became leader
2019-11-27T12:19:31Z 1 2019-11-27T12:19:31Z Pod Started rook-ceph-operator-6b89bb7d9b-vsvms Started container rook-ceph-operator
2019-11-27T12:19:31Z 1 2019-11-27T12:19:31Z Pod Created rook-ceph-operator-6b89bb7d9b-vsvms Created container rook-ceph-operator
2019-11-27T12:19:31Z 1 2019-11-27T12:19:31Z Pod Pulled rook-ceph-operator-6b89bb7d9b-vsvms Successfully pulled image "rook/ceph:v1.1.4-27.gf20c056"
2019-11-27T12:19:31Z 1 2019-11-27T12:19:31Z Endpoints LeaderElection rook.io-block rook-ceph-operator-6b89bb7d9b-vsvms_2a2011f8-1110-11ea-ab0c-0a580af6058b became leader
2019-11-27T12:19:32Z 2 2019-11-27T12:19:33Z ClusterServiceVersion InstallSucceeded ocs-operator.v0.0.1 install strategy completed with no errors
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z ReplicaSet SuccessfulCreate csi-cephfsplugin-provisioner-669767cc87 Created pod: csi-cephfsplugin-provisioner-669767cc87-jmt74
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-cephfsplugin Created pod: csi-cephfsplugin-p56mn
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-cephfsplugin-p56mn Successfully assigned openshift-storage/csi-cephfsplugin-p56mn to oshift-worker1-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-cephfsplugin Created pod: csi-cephfsplugin-47rhx
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-cephfsplugin Created pod: csi-cephfsplugin-5hd4h
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-rbdplugin-gpmg7 Successfully assigned openshift-storage/csi-rbdplugin-gpmg7 to oshift-worker1-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z ReplicaSet SuccessfulCreate csi-cephfsplugin-provisioner-669767cc87 Created pod: csi-cephfsplugin-provisioner-669767cc87-pm4qm
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-rbdplugin Created pod: csi-rbdplugin-x88r5
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-cephfsplugin-provisioner-669767cc87-jmt74 Successfully assigned openshift-storage/csi-cephfsplugin-provisioner-669767cc87-jmt74 to oshift-worker3-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-cephfsplugin-5hd4h Successfully assigned openshift-storage/csi-cephfsplugin-5hd4h to oshift-worker2-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-cephfsplugin-provisioner-669767cc87-pm4qm Successfully assigned openshift-storage/csi-cephfsplugin-provisioner-669767cc87-pm4qm to oshift-worker1-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-rbdplugin-qjfn2 Successfully assigned openshift-storage/csi-rbdplugin-qjfn2 to oshift-worker3-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-rbdplugin Created pod: csi-rbdplugin-qjfn2
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-rbdplugin-x88r5 Successfully assigned openshift-storage/csi-rbdplugin-x88r5 to oshift-worker2-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z DaemonSet SuccessfulCreate csi-rbdplugin Created pod: csi-rbdplugin-gpmg7
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Deployment ScalingReplicaSet csi-rbdplugin-provisioner Scaled up replica set csi-rbdplugin-provisioner-56c7c77dd7 to 2
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z ReplicaSet SuccessfulCreate csi-rbdplugin-provisioner-56c7c77dd7 Created pod: csi-rbdplugin-provisioner-56c7c77dd7-896tw
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Successfully assigned openshift-storage/csi-rbdplugin-provisioner-56c7c77dd7-7k9jz to oshift-worker3-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z ReplicaSet SuccessfulCreate csi-rbdplugin-provisioner-56c7c77dd7 Created pod: csi-rbdplugin-provisioner-56c7c77dd7-7k9jz
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-cephfsplugin-47rhx Successfully assigned openshift-storage/csi-cephfsplugin-47rhx to oshift-worker3-all-prod
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Deployment ScalingReplicaSet csi-cephfsplugin-provisioner Scaled up replica set csi-cephfsplugin-provisioner-669767cc87 to 2
2019-11-27T12:20:37Z 1 2019-11-27T12:20:37Z Pod Scheduled csi-rbdplugin-provisioner-56c7c77dd7-896tw Successfully assigned openshift-storage/csi-rbdplugin-provisioner-56c7c77dd7-896tw to oshift-worker2-all-prod
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-cephfsplugin-p56mn Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-rbdplugin-qjfn2 Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-cephfsplugin-47rhx Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod FailedMount csi-cephfsplugin-provisioner-669767cc87-jmt74 MountVolume.SetUp failed for volume "rook-csi-cephfs-provisioner-sa-token-9plmg" : couldn't propagate object cache: timed out waiting for the condition
2019-11-27T12:20:38Z 19 2019-11-27T13:52:41Z PodDisruptionBudget NoPods rook-ceph-mon-pdb No matching pods found
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-cephfsplugin-5hd4h Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-rbdplugin-gpmg7 Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 1 2019-11-27T12:20:38Z Pod Pulling csi-rbdplugin-x88r5 Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:38Z 183 2019-11-27T13:50:41Z PodDisruptionBudget NoPods rook-ceph-mds-ocs-storagecluster-cephfilesystem No matching pods found
2019-11-27T12:20:39Z 562 2019-11-27T13:54:09Z Pod Unhealthy ocs-operator-5f5459c784-7h6lb Readiness probe failed: stat: cannot stat '/tmp/operator-sdk-ready': No such file or directory
2019-11-27T12:20:42Z 1 2019-11-27T12:20:42Z Pod Pulled csi-rbdplugin-gpmg7 Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:42Z 1 2019-11-27T12:20:42Z Pod Pulled csi-rbdplugin-qjfn2 Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:42Z 1 2019-11-27T12:20:42Z Pod Pulled csi-cephfsplugin-47rhx Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:42Z 1 2019-11-27T12:20:42Z Pod Pulled csi-cephfsplugin-p56mn Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-cephfsplugin-5hd4h Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulled csi-rbdplugin-x88r5 Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-rbdplugin-qjfn2 Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-rbdplugin-x88r5 Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-wrhxs
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-rbdplugin-qjfn2 Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-rbdplugin-x88r5 Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-cephfsplugin-5hd4h Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-cephfsplugin-5hd4h Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulled csi-cephfsplugin-5hd4h Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Scheduled rook-ceph-detect-version-wrhxs Successfully assigned openshift-storage/rook-ceph-detect-version-wrhxs to oshift-worker3-all-prod
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-cephfsplugin-p56mn Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-rbdplugin-gpmg7 Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-rbdplugin-gpmg7 Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-cephfsplugin-p56mn Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-rbdplugin-qjfn2 Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-cephfsplugin-p56mn Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-rbdplugin-x88r5 Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-cephfsplugin-47rhx Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Started csi-cephfsplugin-47rhx Started container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Created csi-cephfsplugin-47rhx Created container driver-registrar
2019-11-27T12:20:43Z 1 2019-11-27T12:20:43Z Pod Pulling csi-rbdplugin-gpmg7 Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:44Z 1 2019-11-27T12:20:44Z Pod Pulling csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Pulling image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:45Z 1 2019-11-27T12:20:45Z Pod Pulling csi-cephfsplugin-provisioner-669767cc87-pm4qm Pulling image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:45Z 1 2019-11-27T12:20:45Z Pod Pulling csi-rbdplugin-provisioner-56c7c77dd7-896tw Pulling image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:47Z 1 2019-11-27T12:20:47Z Pod Pulling csi-cephfsplugin-provisioner-669767cc87-jmt74 Pulling image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:47Z 2 2019-11-27T12:20:56Z ClusterServiceVersion ComponentUnhealthy ocs-operator.v0.0.1 installing: Waiting: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Started rook-ceph-detect-version-wrhxs Started container init-copy-binaries
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Created csi-cephfsplugin-47rhx Created container liveness-prometheus
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Started csi-cephfsplugin-47rhx Started container liveness-prometheus
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Started csi-cephfsplugin-47rhx Started container csi-cephfsplugin
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Created rook-ceph-detect-version-wrhxs Created container init-copy-binaries
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Started csi-rbdplugin-qjfn2 Started container liveness-prometheus
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Created csi-cephfsplugin-47rhx Created container csi-cephfsplugin
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Created csi-rbdplugin-qjfn2 Created container liveness-prometheus
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled csi-rbdplugin-qjfn2 Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled rook-ceph-detect-version-wrhxs Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Started csi-rbdplugin-qjfn2 Started container csi-rbdplugin
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Created csi-rbdplugin-qjfn2 Created container csi-rbdplugin
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled csi-rbdplugin-qjfn2 Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled csi-cephfsplugin-47rhx Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-896tw Successfully pulled image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:51Z 1 2019-11-27T12:20:51Z Pod Pulled csi-cephfsplugin-47rhx Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Created container csi-provisioner
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulling csi-rbdplugin-provisioner-56c7c77dd7-896tw Pulling image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Created csi-cephfsplugin-provisioner-669767cc87-jmt74 Created container csi-attacher
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-896tw Started container csi-provisioner
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Started container csi-provisioner
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-896tw Created container csi-provisioner
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-jmt74 Successfully pulled image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-jmt74 Container image "quay.io/k8scsi/csi-provisioner:v1.3.0" already present on machine
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulling csi-cephfsplugin-provisioner-669767cc87-pm4qm Pulling image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Started csi-cephfsplugin-provisioner-669767cc87-jmt74 Started container csi-attacher
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulling rook-ceph-detect-version-wrhxs Pulling image "ceph/ceph:v14.2.4-20190917"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Started csi-cephfsplugin-provisioner-669767cc87-pm4qm Started container csi-attacher
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-pm4qm Successfully pulled image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Created csi-cephfsplugin-provisioner-669767cc87-pm4qm Created container csi-attacher
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Successfully pulled image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:52Z 1 2019-11-27T12:20:52Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Container image "quay.io/k8scsi/csi-attacher:v1.2.0" already present on machine
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Started csi-cephfsplugin-provisioner-669767cc87-jmt74 Started container csi-cephfsplugin
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Started csi-cephfsplugin-provisioner-669767cc87-jmt74 Started container liveness-prometheus
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Created container csi-rbdplugin-attacher
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Pulling csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Pulling image "quay.io/k8scsi/csi-snapshotter:v1.2.0"
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Created csi-cephfsplugin-provisioner-669767cc87-jmt74 Created container csi-provisioner
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-jmt74 Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Created csi-cephfsplugin-provisioner-669767cc87-jmt74 Created container csi-cephfsplugin
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Created csi-cephfsplugin-provisioner-669767cc87-jmt74 Created container liveness-prometheus
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Started container csi-rbdplugin-attacher
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-jmt74 Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:20:53Z 1 2019-11-27T12:20:53Z Pod Started csi-cephfsplugin-provisioner-669767cc87-jmt74 Started container csi-provisioner
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Started csi-cephfsplugin-provisioner-669767cc87-pm4qm Started container csi-provisioner
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Created csi-cephfsplugin-provisioner-669767cc87-pm4qm Created container csi-provisioner
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-pm4qm Successfully pulled image "quay.io/k8scsi/csi-provisioner:v1.3.0"
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-896tw Started container csi-rbdplugin-attacher
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Pulling csi-rbdplugin-provisioner-56c7c77dd7-896tw Pulling image "quay.io/k8scsi/csi-snapshotter:v1.2.0"
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Pulling csi-cephfsplugin-provisioner-669767cc87-pm4qm Pulling image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-896tw Created container csi-rbdplugin-attacher
2019-11-27T12:20:57Z 1 2019-11-27T12:20:57Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-896tw Successfully pulled image "quay.io/k8scsi/csi-attacher:v1.2.0"
2019-11-27T12:20:58Z 1 2019-11-27T12:20:58Z Pod Pulled rook-ceph-detect-version-wrhxs Successfully pulled image "ceph/ceph:v14.2.4-20190917"
2019-11-27T12:20:58Z 1 2019-11-27T12:20:58Z Pod Started rook-ceph-detect-version-wrhxs Started container cmd-reporter
2019-11-27T12:20:58Z 1 2019-11-27T12:20:58Z Pod Created rook-ceph-detect-version-wrhxs Created container cmd-reporter
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Started container liveness-prometheus
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Created container csi-rbdplugin
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Started container csi-rbdplugin
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Created container liveness-prometheus
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Created container csi-snapshotter
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Started container csi-snapshotter
2019-11-27T12:21:00Z 1 2019-11-27T12:21:00Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-7k9jz Successfully pulled image "quay.io/k8scsi/csi-snapshotter:v1.2.0"
2019-11-27T12:21:01Z 11 2019-11-27T12:35:33Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-kwgz8 pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T12:21:01Z 1 2019-11-27T12:21:01Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-kwgz8
2019-11-27T12:21:01Z 1 2019-11-27T12:21:01Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T12:21:01Z 64 2019-11-27T12:36:05Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T12:21:05Z 4 2019-11-27T12:26:51Z ClusterServiceVersion NeedsReinstall ocs-operator.v0.0.1 installing: Waiting: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Pulled csi-cephfsplugin-5hd4h Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Pulled csi-cephfsplugin-5hd4h Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Started csi-cephfsplugin-5hd4h Started container csi-cephfsplugin
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Created csi-cephfsplugin-5hd4h Created container csi-cephfsplugin
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Started csi-rbdplugin-x88r5 Started container liveness-prometheus
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Created csi-rbdplugin-x88r5 Created container liveness-prometheus
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Pulled csi-rbdplugin-x88r5 Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Started csi-rbdplugin-x88r5 Started container csi-rbdplugin
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Created csi-rbdplugin-x88r5 Created container csi-rbdplugin
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Pulled csi-rbdplugin-x88r5 Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Created csi-cephfsplugin-5hd4h Created container liveness-prometheus
2019-11-27T12:21:10Z 1 2019-11-27T12:21:10Z Pod Started csi-cephfsplugin-5hd4h Started container liveness-prometheus
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Created csi-rbdplugin-gpmg7 Created container liveness-prometheus
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Pulled csi-cephfsplugin-p56mn Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Started csi-cephfsplugin-p56mn Started container csi-cephfsplugin
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Created csi-cephfsplugin-p56mn Created container csi-cephfsplugin
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Pulled csi-cephfsplugin-p56mn Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Pulled csi-rbdplugin-gpmg7 Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Started csi-rbdplugin-gpmg7 Started container csi-rbdplugin
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Created csi-rbdplugin-gpmg7 Created container csi-rbdplugin
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Pulled csi-rbdplugin-gpmg7 Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Created csi-cephfsplugin-p56mn Created container liveness-prometheus
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Started csi-cephfsplugin-p56mn Started container liveness-prometheus
2019-11-27T12:21:11Z 1 2019-11-27T12:21:11Z Pod Started csi-rbdplugin-gpmg7 Started container liveness-prometheus
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-896tw Started container liveness-prometheus
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Created csi-cephfsplugin-provisioner-669767cc87-pm4qm Created container csi-cephfsplugin
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Started csi-cephfsplugin-provisioner-669767cc87-pm4qm Started container csi-cephfsplugin
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-pm4qm Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-896tw Started container csi-snapshotter
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Created csi-cephfsplugin-provisioner-669767cc87-pm4qm Created container liveness-prometheus
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-896tw Created container liveness-prometheus
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Started csi-rbdplugin-provisioner-56c7c77dd7-896tw Started container csi-rbdplugin
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-896tw Created container csi-rbdplugin
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-896tw Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Created csi-rbdplugin-provisioner-56c7c77dd7-896tw Created container csi-snapshotter
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-896tw Successfully pulled image "quay.io/k8scsi/csi-snapshotter:v1.2.0"
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Pulled csi-cephfsplugin-provisioner-669767cc87-pm4qm Successfully pulled image "quay.io/cephcsi/cephcsi:v1.2.1"
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Started csi-cephfsplugin-provisioner-669767cc87-pm4qm Started container liveness-prometheus
2019-11-27T12:21:12Z 1 2019-11-27T12:21:12Z Pod Pulled csi-rbdplugin-provisioner-56c7c77dd7-896tw Container image "quay.io/cephcsi/cephcsi:v1.2.1" already present on machine
2019-11-27T12:21:13Z 1 2019-11-27T12:21:13Z Lease LeaderElection external-attacher-leader-openshift-storage-cephfs-csi-ceph-com csi-cephfsplugin-provisioner-669767cc87-jmt74 became leader
2019-11-27T12:21:13Z 1 2019-11-27T12:21:13Z Lease LeaderElection openshift-storage-cephfs-csi-ceph-com csi-cephfsplugin-provisioner-669767cc87-jmt74 became leader
2019-11-27T12:21:20Z 1 2019-11-27T12:21:20Z Lease LeaderElection openshift-storage-rbd-csi-ceph-com csi-rbdplugin-provisioner-56c7c77dd7-7k9jz became leader
2019-11-27T12:21:20Z 1 2019-11-27T12:21:20Z Lease LeaderElection external-attacher-leader-openshift-storage-rbd-csi-ceph-com csi-rbdplugin-provisioner-56c7c77dd7-7k9jz became leader
2019-11-27T12:21:20Z 1 2019-11-27T12:21:20Z Lease LeaderElection external-snapshotter-leader-election csi-rbdplugin-provisioner-56c7c77dd7-7k9jz became leader
2019-11-27T12:21:36Z 2 2019-11-27T12:21:37Z ClusterServiceVersion InstallWaiting ocs-operator.v0.0.1 installing: Waiting: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
2019-11-27T12:26:33Z 26 2019-11-27T13:49:36Z ClusterServiceVersion InstallCheckFailed ocs-operator.v0.0.1 install timeout
2019-11-27T12:36:03Z 1 2019-11-27T12:36:03Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-kwgz8 skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-kwgz8
2019-11-27T12:36:07Z 1 2019-11-27T12:36:07Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-r5nb2
2019-11-27T12:36:07Z 1 2019-11-27T12:36:07Z Pod Scheduled rook-ceph-detect-version-r5nb2 Successfully assigned openshift-storage/rook-ceph-detect-version-r5nb2 to oshift-worker3-all-prod
2019-11-27T12:36:14Z 1 2019-11-27T12:36:14Z Pod Pulled rook-ceph-detect-version-r5nb2 Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T12:36:15Z 1 2019-11-27T12:36:15Z Pod Created rook-ceph-detect-version-r5nb2 Created container init-copy-binaries
2019-11-27T12:36:15Z 1 2019-11-27T12:36:15Z Pod Started rook-ceph-detect-version-r5nb2 Started container init-copy-binaries
2019-11-27T12:36:15Z 1 2019-11-27T12:36:15Z Pod Pulled rook-ceph-detect-version-r5nb2 Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T12:36:16Z 1 2019-11-27T12:36:16Z Pod Created rook-ceph-detect-version-r5nb2 Created container cmd-reporter
2019-11-27T12:36:16Z 1 2019-11-27T12:36:16Z Pod Started rook-ceph-detect-version-r5nb2 Started container cmd-reporter
2019-11-27T12:36:18Z 63 2019-11-27T12:51:22Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T12:36:18Z 11 2019-11-27T12:51:03Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-bt2gb pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T12:36:18Z 1 2019-11-27T12:36:18Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-bt2gb
2019-11-27T12:36:18Z 1 2019-11-27T12:36:18Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T12:51:20Z 1 2019-11-27T12:51:20Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-bt2gb skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-bt2gb
2019-11-27T12:51:25Z 1 2019-11-27T12:51:25Z Pod Scheduled rook-ceph-detect-version-zz7fx Successfully assigned openshift-storage/rook-ceph-detect-version-zz7fx to oshift-worker3-all-prod
2019-11-27T12:51:25Z 1 2019-11-27T12:51:25Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-zz7fx
2019-11-27T12:51:33Z 1 2019-11-27T12:51:33Z Pod Created rook-ceph-detect-version-zz7fx Created container init-copy-binaries
2019-11-27T12:51:33Z 1 2019-11-27T12:51:33Z Pod Started rook-ceph-detect-version-zz7fx Started container init-copy-binaries
2019-11-27T12:51:33Z 1 2019-11-27T12:51:33Z Pod Pulled rook-ceph-detect-version-zz7fx Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T12:51:34Z 1 2019-11-27T12:51:34Z Pod Created rook-ceph-detect-version-zz7fx Created container cmd-reporter
2019-11-27T12:51:34Z 1 2019-11-27T12:51:34Z Pod Started rook-ceph-detect-version-zz7fx Started container cmd-reporter
2019-11-27T12:51:34Z 1 2019-11-27T12:51:34Z Pod Pulled rook-ceph-detect-version-zz7fx Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T12:51:37Z 1 2019-11-27T12:51:37Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-r664n
2019-11-27T12:51:37Z 13 2019-11-27T13:05:33Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-r664n pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T12:51:37Z 1 2019-11-27T12:51:37Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T12:51:37Z 64 2019-11-27T13:06:40Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T13:06:38Z 1 2019-11-27T13:06:38Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-r664n skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-r664n
2019-11-27T13:06:43Z 1 2019-11-27T13:06:43Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-ff27n
2019-11-27T13:06:43Z 1 2019-11-27T13:06:43Z Pod Scheduled rook-ceph-detect-version-ff27n Successfully assigned openshift-storage/rook-ceph-detect-version-ff27n to oshift-worker3-all-prod
2019-11-27T13:06:51Z 1 2019-11-27T13:06:51Z Pod Pulled rook-ceph-detect-version-ff27n Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T13:06:51Z 1 2019-11-27T13:06:51Z Pod Started rook-ceph-detect-version-ff27n Started container init-copy-binaries
2019-11-27T13:06:51Z 1 2019-11-27T13:06:51Z Pod Created rook-ceph-detect-version-ff27n Created container init-copy-binaries
2019-11-27T13:06:52Z 1 2019-11-27T13:06:52Z Pod Started rook-ceph-detect-version-ff27n Started container cmd-reporter
2019-11-27T13:06:52Z 1 2019-11-27T13:06:52Z Pod Pulled rook-ceph-detect-version-ff27n Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T13:06:52Z 1 2019-11-27T13:06:52Z Pod Created rook-ceph-detect-version-ff27n Created container cmd-reporter
2019-11-27T13:06:55Z 1 2019-11-27T13:06:55Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-56td6
2019-11-27T13:06:55Z 12 2019-11-27T13:21:03Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-56td6 pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T13:06:55Z 1 2019-11-27T13:06:55Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T13:06:55Z 63 2019-11-27T13:21:57Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T13:21:56Z 1 2019-11-27T13:21:56Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-56td6 skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-56td6
2019-11-27T13:21:58Z 1 2019-11-27T13:21:58Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-kvmpf
2019-11-27T13:21:58Z 1 2019-11-27T13:21:58Z Pod Scheduled rook-ceph-detect-version-kvmpf Successfully assigned openshift-storage/rook-ceph-detect-version-kvmpf to oshift-worker3-all-prod
2019-11-27T13:22:06Z 1 2019-11-27T13:22:06Z Pod Created rook-ceph-detect-version-kvmpf Created container init-copy-binaries
2019-11-27T13:22:06Z 1 2019-11-27T13:22:06Z Pod Pulled rook-ceph-detect-version-kvmpf Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T13:22:06Z 1 2019-11-27T13:22:06Z Pod Pulled rook-ceph-detect-version-kvmpf Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T13:22:06Z 1 2019-11-27T13:22:06Z Pod Started rook-ceph-detect-version-kvmpf Started container init-copy-binaries
2019-11-27T13:22:07Z 1 2019-11-27T13:22:07Z Pod Created rook-ceph-detect-version-kvmpf Created container cmd-reporter
2019-11-27T13:22:07Z 1 2019-11-27T13:22:07Z Pod Started rook-ceph-detect-version-kvmpf Started container cmd-reporter
2019-11-27T13:22:09Z 1 2019-11-27T13:22:09Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-l8gsz
2019-11-27T13:22:09Z 11 2019-11-27T13:36:33Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-l8gsz pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T13:22:09Z 1 2019-11-27T13:22:09Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T13:22:09Z 64 2019-11-27T13:37:12Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T13:37:10Z 1 2019-11-27T13:37:10Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-l8gsz skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-l8gsz
2019-11-27T13:37:18Z 1 2019-11-27T13:37:18Z Pod Scheduled rook-ceph-detect-version-qx4r9 Successfully assigned openshift-storage/rook-ceph-detect-version-qx4r9 to oshift-worker3-all-prod
2019-11-27T13:37:18Z 1 2019-11-27T13:37:18Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-qx4r9
2019-11-27T13:37:26Z 1 2019-11-27T13:37:26Z Pod Created rook-ceph-detect-version-qx4r9 Created container init-copy-binaries
2019-11-27T13:37:26Z 1 2019-11-27T13:37:26Z Pod Pulled rook-ceph-detect-version-qx4r9 Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T13:37:26Z 1 2019-11-27T13:37:26Z Pod Started rook-ceph-detect-version-qx4r9 Started container init-copy-binaries
2019-11-27T13:37:27Z 1 2019-11-27T13:37:27Z Pod Created rook-ceph-detect-version-qx4r9 Created container cmd-reporter
2019-11-27T13:37:27Z 1 2019-11-27T13:37:27Z Pod Pulled rook-ceph-detect-version-qx4r9 Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T13:37:27Z 1 2019-11-27T13:37:27Z Pod Started rook-ceph-detect-version-qx4r9 Started container cmd-reporter
2019-11-27T13:37:30Z 1 2019-11-27T13:37:30Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T13:37:30Z 11 2019-11-27T13:52:03Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-hrhcr pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
2019-11-27T13:37:30Z 63 2019-11-27T13:52:33Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T13:37:30Z 1 2019-11-27T13:37:30Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-hrhcr
2019-11-27T13:52:31Z 1 2019-11-27T13:52:31Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-hrhcr skip schedule deleting pod: openshift-storage/rook-ceph-mon-a-canary-64f4448445-hrhcr
2019-11-27T13:52:36Z 1 2019-11-27T13:52:36Z Job SuccessfulCreate rook-ceph-detect-version Created pod: rook-ceph-detect-version-gfpw5
2019-11-27T13:52:36Z 1 2019-11-27T13:52:36Z Pod Scheduled rook-ceph-detect-version-gfpw5 Successfully assigned openshift-storage/rook-ceph-detect-version-gfpw5 to oshift-worker3-all-prod
2019-11-27T13:52:44Z 1 2019-11-27T13:52:44Z Pod Pulled rook-ceph-detect-version-gfpw5 Container image "rook/ceph:v1.1.4-27.gf20c056" already present on machine
2019-11-27T13:52:45Z 1 2019-11-27T13:52:45Z Pod Created rook-ceph-detect-version-gfpw5 Created container init-copy-binaries
2019-11-27T13:52:45Z 1 2019-11-27T13:52:45Z Pod Pulled rook-ceph-detect-version-gfpw5 Container image "ceph/ceph:v14.2.4-20190917" already present on machine
2019-11-27T13:52:45Z 1 2019-11-27T13:52:45Z Pod Started rook-ceph-detect-version-gfpw5 Started container init-copy-binaries
2019-11-27T13:52:46Z 1 2019-11-27T13:52:46Z Pod Started rook-ceph-detect-version-gfpw5 Started container cmd-reporter
2019-11-27T13:52:46Z 1 2019-11-27T13:52:46Z Pod Created rook-ceph-detect-version-gfpw5 Created container cmd-reporter
2019-11-27T13:52:48Z 1 2019-11-27T13:52:48Z Deployment ScalingReplicaSet rook-ceph-mon-a-canary Scaled up replica set rook-ceph-mon-a-canary-64f4448445 to 1
2019-11-27T13:52:48Z 7 2019-11-27T13:54:12Z PersistentVolumeClaim FailedBinding rook-ceph-mon-a no persistent volumes available for this claim and no storage class is set
2019-11-27T13:52:48Z 1 2019-11-27T13:52:48Z ReplicaSet SuccessfulCreate rook-ceph-mon-a-canary-64f4448445 Created pod: rook-ceph-mon-a-canary-64f4448445-g8k7b
2019-11-27T13:52:48Z 4 2019-11-27T13:53:48Z Pod FailedScheduling rook-ceph-mon-a-canary-64f4448445-g8k7b pod has unbound immediate PersistentVolumeClaims (repeated 3 times)

Thanks,
Răzvan

Come up with a better OWNERS file

We had a merging impasse, where we couldn't advance due to the absence of maintainers. So as an interim solution, #15 was created, adding me as a (temporary?) fall-back. This issue is a reminder to discuss and merge a better/more complete OWNERS file.

PersistentVolumeClaims created with empty storageClassName

With OpenShift 4.2 on OpenStack, PersistentVolumeClaims are created by the ocs-operator with an empty storageClassName. The claims remain in the Pending status, and no PersistentVolume is ever created for them.

$ oc get pvc rook-ceph-mon-a -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: rook-ceph-mon
    ceph-version: 14.2.4
    ceph_daemon_id: a
    mon: a
    mon_cluster: openshift-storage
    rook-version: v1.1.4-27.gf20c056
    rook_cluster: openshift-storage
  name: rook-ceph-mon-a
  ownerReferences:
  - apiVersion: ceph.rook.io/v1
    blockOwnerDeletion: true
    kind: CephCluster
    name: ocs-storagecluster-cephcluster
    uid: 9bfccbc9-0b77-11ea-b163-fa163eaf4b87
  selfLink: /api/v1/namespaces/openshift-storage/persistentvolumeclaims/rook-ceph-mon-a
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ""
  volumeMode: Filesystem
status:
  phase: Pending
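For comparison, a claim that can bind needs a non-empty storageClassName. A minimal sketch, assuming the cluster exposes a usable Cinder-backed storage class (the name "standard" below is a placeholder; substitute whatever `oc get storageclass` reports):

```yaml
# Sketch only: the same claim with a storage class set.
# "standard" is a placeholder class name, not taken from this cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-ceph-mon-a
  namespace: openshift-storage
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
  volumeMode: Filesystem
```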

Using storagecluster along with local storage

Currently, a single storage class is specified in a storagecluster definition and is used for both the mon and the OSD PVCs.
Mon PVCs have the Filesystem volume mode, while OSD ones have their volume mode set to Block.

That doesn't fit the use of local volumes, since in that case the corresponding storage classes can only provision PVCs of a single volume mode.

The storage cluster should either have an extra field pointing to a different storage class for the mon PVCs, or offer a way to deploy without those mon PVCs at all.
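One possible shape for such a field, purely as a sketch (monPVCTemplate is a hypothetical field name, not an existing API field, and both storage class names are placeholders):

```yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  # hypothetical field: a dedicated PVC template for the mons,
  # pointing at a Filesystem-mode local storage class
  monPVCTemplate:
    spec:
      storageClassName: localfs
      volumeMode: Filesystem
  storageDeviceSets:
  - name: ocs-deviceset
    dataPVCTemplate:
      spec:
        storageClassName: localblock
        volumeMode: Block
```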

rook-ceph-tools with status 'CreateContainerError'

After deploying OCS from the master branch, the rook-ceph-tools pod reports CreateContainerError status. Checking the event log for the pod, it looks like there is a problem in the image or the deployment definition:

Error: container create failed: container_linux.go:345: starting container process caused "exec: \"/tini\": stat /tini: no such file or directory"

The pod spec is:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-09-25T17:49:34Z"
  generation: 1
  name: rook-ceph-tools
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: StorageClusterInitialization
    name: ocs-storagecluster
    uid: d597ab49-dfbc-11e9-b23a-12e0f21dfc3a
  resourceVersion: "46479"
  selfLink: /apis/extensions/v1beta1/namespaces/openshift-storage/deployments/rook-ceph-tools
  uid: d60bc36e-dfbc-11e9-b23a-12e0f21dfc3a
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: rook-ceph-tools
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rook-ceph-tools
    spec:
      containers:
      - args:
        - -g
        - --
        - /usr/local/bin/toolbox.sh
        command:
        - /tini
        env:
        - name: ROOK_ADMIN_SECRET
          valueFrom:
            secretKeyRef:
              key: admin-secret
              name: rook-ceph-mon
        image: ceph/ceph:v14.2.4-20190917
        imagePullPolicy: IfNotPresent
        name: rook-ceph-tools
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /dev
          name: dev
        - mountPath: /sys/bus
          name: sysbus
        - mountPath: /lib/modules
          name: libmodules
        - mountPath: /etc/rook
          name: mon-endpoint-volume
      dnsPolicy: ClusterFirstWithHostNet
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /dev
          type: ""
        name: dev
      - hostPath:
          path: /sys/bus
          type: ""
        name: sysbus
      - hostPath:
          path: /lib/modules
          type: ""
        name: libmodules
      - configMap:
          defaultMode: 420
          items:
          - key: data
            path: mon-endpoints
          name: rook-ceph-mon-endpoints
        name: mon-endpoint-volume
status:
  conditions:
  - lastTransitionTime: "2019-09-25T17:49:34Z"
    lastUpdateTime: "2019-09-25T17:49:34Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2019-09-25T18:44:03Z"
    lastUpdateTime: "2019-09-25T18:44:03Z"
    message: ReplicaSet "rook-ceph-tools-747987675d" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1
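A possible workaround while the image lacks /tini (a sketch, not a tested fix; it assumes toolbox.sh tolerates running without an init process): drop the /tini entrypoint from the container spec and exec the script directly:

```yaml
# Sketch: replace the missing /tini entrypoint with a shell exec.
# The -g flag in the original args belonged to tini and is dropped.
command:
- /bin/bash
- -c
- exec /usr/local/bin/toolbox.sh
args: []
```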

StorageClusterInitialization needs more RBAC rules

The StorageClusterInitialization controller keeps logging

E0904 07:48:52.663664       1 reflector.go:251] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:196: Failed to watch *v1.StorageClass: unknown (get storageclasses.storage.k8s.io)
E0904 07:48:52.962478       1 reflector.go:134] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:196: Failed to list *v1.CephObjectStore: cephobjectstores.ceph.rook.io is forbidden: User "system:serviceaccount:openshift-storage:ocs-operator" cannot list resource "cephobjectstores" in API group "ceph.rook.io" in the namespace "openshift-storage"

We should also check if we need more RBAC for NooBaa resources.
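Judging from the two errors above, the missing rules would look roughly like this (a sketch; storageclasses are cluster-scoped, so they need a ClusterRole, and the verb lists are assumptions):

```yaml
# Sketch of additional rules for the ocs-operator service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ocs-operator-extra
rules:
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["ceph.rook.io"]
  resources: ["cephobjectstores"]
  verbs: ["get", "list", "watch", "create", "update"]
```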

RGW is not started after a StorageCluster is created

Problem description

  1. I created a StorageCluster using the example CR in the ocs-operator repo - https://github.com/openshift/ocs-operator/blob/master/deploy/crds/ocs_v1_storagecluster_cr.yaml

  2. the operator created a CephObjectStore resource, which looks invalid (the dataPool values are all 0)

resource yaml:

apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  creationTimestamp: 2019-10-07T07:58:28Z
  generation: 1
  name: example-storagecluster-cephobjectstore
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: StorageClusterInitialization
    name: example-storagecluster
    uid: 3eeac13e-e8d8-11e9-b1de-005056beac7f
  resourceVersion: "5527795"
  selfLink: /apis/ceph.rook.io/v1/namespaces/openshift-storage/cephobjectstores/example-storagecluster-cephobjectstore
  uid: 3f27dd1c-e8d8-11e9-b1de-005056beac7f
spec:
  dataPool:
    crushRoot: ""
    deviceClass: ""
    erasureCoded:
      algorithm: ""
      codingChunks: 0
      dataChunks: 0
    failureDomain: ""
    replicated:
      size: 0
  gateway:
    allNodes: false
    instances: 1
    placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: cluster.ocs.openshift.io/openshift-storage
              operator: Exists
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - rook-ceph-rgw
            topologyKey: kubernetes.io/hostname
          weight: 100
      tolerations:
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
    port: 80
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi
    securePort: 0
    sslCertificateRef: ""
  metadataPool:
    crushRoot: ""
    deviceClass: ""
    erasureCoded:
      algorithm: ""
      codingChunks: 0
      dataChunks: 0
    failureDomain: host
    replicated:
      size: 3
  3. RGW is never created

Steps to Reproduce:

  1. create a storage cluster
  2. oc get cephobjectstore -o yaml
  3. see the created resource

Actual results:

The created CephObjectStore looks invalid and RGW is not started

Expected results:

RGW should be started (at least for VMware deployments)

Incorrect pool name for cephfs SC in storageclusterinitialization_controller.go (L#364)

File name: https://github.com/openshift/ocs-operator/blob/master/pkg/controller/storageclusterinitialization/storageclusterinitialization_controller.go

description:

In L#364 of the Go code above, we are incorrectly using the CephBlockPool name as the cephfilesystem pool name. We need to change it to use the CephFilesystem "data0" pool name.

For both the rbd (L#379) and cephfs (L#364) default storage class creations, we are using the same CephBlockPool name.

L#364:	"pool":      generateNameForCephBlockPool(initData),
	ret := []storagev1.StorageClass{
		storagev1.StorageClass{
			ObjectMeta: metav1.ObjectMeta{
				Name: generateNameForCephFilesystemSC(initData),
			},
			Provisioner:   fmt.Sprintf("%s.cephfs.csi.ceph.com", initData.Namespace),
			ReclaimPolicy: &persistentVolumeReclaimDelete,
			Parameters: map[string]string{
				"clusterID": initData.Namespace,
				"fsName":    fmt.Sprintf("%s-cephfilesystem", initData.Name),
				"pool":      generateNameForCephBlockPool(initData),
				"csi.storage.k8s.io/provisioner-secret-name":      "rook-ceph-csi",
				"csi.storage.k8s.io/provisioner-secret-namespace": initData.Namespace,
				"csi.storage.k8s.io/node-stage-secret-name":       "rook-ceph-csi",
				"csi.storage.k8s.io/node-stage-secret-namespace":  initData.Namespace,
			},
		},
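The fix, sketched below under the assumption that Rook uses its default naming, where the first data pool of a CephFilesystem is named "<fsName>-data0", is to derive the pool parameter from the filesystem name rather than reusing the block pool name:

```go
package main

import "fmt"

// Sketch of the proposed fix: build the cephfs storage class "pool"
// parameter from the CephFilesystem name. Assumes Rook's default naming,
// where the first data pool of a filesystem is "<fsName>-data0".
func cephFilesystemDataPoolName(storageClusterName string) string {
	fsName := fmt.Sprintf("%s-cephfilesystem", storageClusterName)
	return fmt.Sprintf("%s-data0", fsName)
}

func main() {
	// e.g. "ocs-storagecluster" -> "ocs-storagecluster-cephfilesystem-data0"
	fmt.Println(cephFilesystemDataPoolName("ocs-storagecluster"))
}
```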

Versions

OCS operator version

$ oc describe pod ocs-operator-689df6c45d-9fbj6|grep Image
Image: quay.io/ocs-dev/ocs-operator:latest
Image ID: quay.io/ocs-dev/ocs-operator@sha256:77cd819d70203d146a54c3c6648437d47845919dfc16938a58cfba08663a8a14

$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
local-storage-operator.v4.2.0 Local Storage 4.2.0 Succeeded
ocs-operator.v0.0.1 Openshift Container Storage Operator 0.0.1 Succeeded

$ oc version

Client Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-alpha.0-34-g0d02ccfc", GitCommit:"0d02ccfccbfbabe8be161db4dbb1f42bf00cf7c2", GitTreeState:"clean", BuildDate:"2019-07-28T20:08:21Z", GoVersion:"go1.12.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.6+1e333e7", GitCommit:"1e333e7", GitTreeState:"clean", BuildDate:"2019-09-15T23:10:54Z", GoVersion:"go1.12.8", Compiler:"gc", Platform:"linux/amd64"}
OpenShift Version: 4.2.0-0.nightly-2019-09-16-074831

Ceph Version

$ oc rsh rook-ceph-tools-754fbcbf58-2bq2r ceph -v
ceph version 14.2.3 (0f776cf838a1ae3130b2b73dc26be9c95c6ccc39) nautilus (stable)

$ oc describe pod rook-ceph-mon-a-55dbcd5b85-jmmmt|grep Image
Image: ceph/ceph:v14.2.3-20190904
Image ID: docker.io/ceph/ceph@sha256:ac55e55a3d50dce5bd80679e8ac859fd0e3b571ef2af22cd80a36af3b1370589

Rook Version

$ oc rsh rook-ceph-tools-754fbcbf58-2bq2r rook version
rook: v1.1.0-beta.0.96.gceec3ac

$ oc describe pod rook-ceph-operator-56b4f66675-h6svw|grep Image
Image: rook/ceph:v1.1.0
Image ID: docker.io/rook/ceph@sha256:41cc3d790058a405c6626e21217cdd4dd16219adbe2685008ee828fbcb2da367

CSI version

$ oc describe pod csi-rbdplugin-provisioner-69568f45cc-5phgw|grep Image
Image: quay.io/k8scsi/csi-provisioner:v1.3.0
Image ID: quay.io/k8scsi/csi-provisioner@sha256:e615e92233248e72f046dd4f5fac40e75dd49f78805801953a7dfccf4eb09148
Image: quay.io/k8scsi/csi-attacher:v1.2.0
Image ID: quay.io/k8scsi/csi-attacher@sha256:26fccd7a99d973845df1193b46ebdcc6ab8dc5f6e6be319750c471fce1742d13
Image: quay.io/k8scsi/csi-snapshotter:v1.2.0
Image ID: quay.io/k8scsi/csi-snapshotter@sha256:6f12a57ef46c340c475489cac8d63c2431033961deaf40414208edebee50b640
Image: quay.io/cephcsi/cephcsi:v1.2.0
Image ID: quay.io/cephcsi/cephcsi@sha256:a9437db817bb8f1063e92eeb93f9c3471c928d16f4affe35d56a14372fe66f84
Image: quay.io/cephcsi/cephcsi:v1.2.0
Image ID: quay.io/cephcsi/cephcsi@sha256:a9437db817bb8f1063e92eeb93f9c3471c928d16f4affe35d56a14372fe66f84

Installing OCS on Openshift 4.2 pre release cluster

How do you create a storagecluster for a bare-metal OpenShift cluster using the ocs operator? I have installed the operators using:

oc create -f ./deploy/deploy-with-olm.yaml

but the sample YAML for creating a storage cluster in the next step has type: gp2, which I guess is for AWS. I tried it anyway and it failed. I need to do this on a bare-metal OpenShift 4.2 cluster; any thoughts?

I have a configuration of 3 masters and 3 worker nodes, and I have labelled the worker nodes as storage nodes. I am expecting a CNS-style installation (I have installed GlusterFS CNS on OpenShift 3.11 before) using local storage from the worker nodes. Please let me know how to accomplish this using the ocs operator.
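One pattern that fits bare metal (a sketch; it assumes the local-storage operator has already provisioned a Block-mode storage class, here called localblock, from the worker nodes' disks): point the device sets at that class instead of gp2:

```yaml
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  storageDeviceSets:
  - name: ocs-deviceset
    count: 3
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        # placeholder: the class created by the local-storage operator
        storageClassName: localblock
        volumeMode: Block
```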

panic: error converting YAML to JSON: yaml: mapping values are not allowed in this context

We are hitting a panic when generating the unified CSV with the latest noobaa images (v2.0.1) and the latest master:

+ make gen-release-csv
Generating unified CSV from sourced component-level operators
hack/generate-unified-csv.sh
reading in csv at build/_output/csv-templates/ocs-operator.csv.yaml.in
reading in csv at build/_output/csv-templates/rook-csv.yaml.in
reading in csv at build/_output/csv-templates/noobaa-csv.yaml
panic: error converting YAML to JSON: yaml: mapping values are not allowed in this context

goroutine 1 [running]:
main.unmarshalCSV(0x7ffec3218d08, 0x2b, 0x0)
	/home/jenkins-build/build/workspace/ocs-operator-containers/github/tools/csv-merger/csv-merger.go:128 +0x38b
main.generateUnifiedCSV()
	/home/jenkins-build/build/workspace/ocs-operator-containers/github/tools/csv-merger/csv-merger.go:371 +0x14c0
main.main()
	/home/jenkins-build/build/workspace/ocs-operator-containers/github/tools/csv-merger/csv-merger.go:683 +0x1d3
make: *** [Makefile:97: gen-release-csv] Error 2

It looks like this is related to noobaa, since it fails at noobaa-csv.yaml. @guymguym, can you also take a look at this?

local-storage operator not deployed

When deploying the OCS operator, the local-storage operator does not get properly deployed.
The deployment for the local-storage operator contains several references to image-registry.openshift-image-registry.svc:5000/, which prevent the deployment from succeeding.
Editing the deployment to remove those references makes the deployment succeed.

Name:               local-storage-operator-759c56f6bf-jx8cs
Namespace:          openshift-storage
Priority:           0
PriorityClassName:  <none>
Node:               openshift-master-0.qe3.kni.lab.eng.bos.redhat.com/10.19.134.154
Start Time:         Sun, 08 Sep 2019 01:30:55 -0400
Labels:             name=local-storage-operator
                    pod-template-hash=759c56f6bf
Annotations:        alm-examples:
                      [
                        {
                          "apiVersion": "local.storage.openshift.io/v1",
                          "kind": "LocalVolume",
                          "metadata": {
                            "name": "example"
                          },
                          "spec": {
                            "storageClassDevices": [
                              {
                                "devicePaths": [
                                    "/dev/vde",
                                    "/dev/vdf"
                                ],
                                "fsType": "ext4",
                                "storageClassName": "foobar",
                                "volumeMode": "Filesystem"
                              }
                            ]
                          }
                        }
                      ]
                    capabilities: Full Lifecycle
                    categories: Storage
                    containerImage: quay.io/openshift/origin-local-storage-operator:4.2.0
                    createdAt: 2019-08-14T00:00:00Z
                    description: Configure and use local storage volumes in kubernetes and Openshift
                    olm.operatorGroup: openshift-storage-operatorgroup
                    olm.operatorNamespace: openshift-storage
                    olm.targetNamespaces: openshift-storage
                    openshift.io/scc: restricted
                    repository: https://github.com/openshift/local-storage-operator
                    support: Red Hat
Status:             Pending
IP:                 10.130.0.149
Controlled By:      ReplicaSet/local-storage-operator-759c56f6bf
Containers:
  local-storage-operator:
    Container ID:  
    Image:         image-registry.openshift-image-registry.svc:5000/openshift/ose-local-storage-operator:v4.2.0-201909061135
    Image ID:      
    Port:          60000/TCP
    Host Port:     0/TCP
    Command:
      local-storage-operator
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:
      WATCH_NAMESPACE:    openshift-storage (v1:metadata.namespace)
      OPERATOR_NAME:      local-storage-operator
      PROVISIONER_IMAGE:  image-registry.openshift-image-registry.svc:5000/openshift/ose-local-storage-static-provisioner:v4.2.0-201909020729
      DISKMAKER_IMAGE:    image-registry.openshift-image-registry.svc:5000/openshift/ose-local-storage-diskmaker:v4.2.0-201909040619
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from local-storage-operator-token-r4vq2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  local-storage-operator-token-r4vq2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  local-storage-operator-token-r4vq2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                     From                                                        Message
  ----     ------   ----                    ----                                                        -------
  Warning  Failed   24m (x8287 over 31h)    kubelet, openshift-master-0.qe3.kni.lab.eng.bos.redhat.com  Error: ImagePullBackOff
  Normal   BackOff  4m40s (x8376 over 31h)  kubelet, openshift-master-0.qe3.kni.lab.eng.bos.redhat.com  Back-off pulling image "image-registry.openshift-image-registry.svc:5000/openshift/ose-local-storage-operator:v4.2.0-201909061135"
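A manual workaround sketch: point the deployment at publicly pullable images instead of the internal registry. Only the operator tag below is confirmed by the pod's containerImage annotation; the provisioner and diskmaker image names and tags are assumptions and may need adjusting:

```yaml
# Sketch: substitute public images for the internal-registry references.
spec:
  template:
    spec:
      containers:
      - name: local-storage-operator
        image: quay.io/openshift/origin-local-storage-operator:4.2.0
        env:
        - name: PROVISIONER_IMAGE
          value: quay.io/openshift/origin-local-storage-static-provisioner:4.2.0  # assumed tag
        - name: DISKMAKER_IMAGE
          value: quay.io/openshift/origin-local-storage-diskmaker:4.2.0  # assumed tag
```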
