operator-controller

The operator-controller is the central component of Operator Lifecycle Manager (OLM) v1. It extends Kubernetes with an API through which users can install extensions.

Mission

OLM’s purpose is to provide APIs, controllers, and tooling that support the packaging, distribution, and lifecycling of Kubernetes extensions. It aims to:

  • align with Kubernetes designs and user assumptions
  • provide secure, high-quality, and predictable user experiences centered around declarative GitOps concepts
  • give cluster admins the minimal necessary controls to build their desired cluster architectures and to have ultimate control

Overview

OLM v1 is the follow-up to OLM v0, located here.

OLM v1 consists of two different components:

  • operator-controller (this repository)
  • catalogd

For a more complete overview of OLM v1 and how it differs from OLM v0, see our overview.

Installation

The following script installs OLM v1 on a Kubernetes cluster. If you don't have a cluster available, you can deploy one with KIND.
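If you need a local cluster, one can be created with kind; this is a sketch assuming kind and kubectl are installed, and the cluster name `olmv1-demo` is arbitrary:

```shell
# Create a local Kubernetes cluster with kind (the name is arbitrary)
kind create cluster --name olmv1-demo

# Confirm the cluster is reachable (kind prefixes contexts with "kind-")
kubectl cluster-info --context kind-olmv1-demo
```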

Caution

Operator-Controller depends on cert-manager. Running the following command may affect an existing installation of cert-manager and cause cluster instability.

The latest version of Operator Controller can be installed with the following command:

curl -L -s https://github.com/operator-framework/operator-controller/releases/latest/download/install.sh | bash -s
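If you prefer not to pipe a remote script directly into bash, you can download it from the same URL, review it, and then run it:

```shell
# Download the installer (same URL as above) for review before running
curl -L -o install.sh https://github.com/operator-framework/operator-controller/releases/latest/download/install.sh

# Inspect the script, then execute it
less install.sh
bash install.sh
```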

Getting Started with OLM v1

This quickstart procedure will guide you through the following processes:

  • Deploying a catalog
  • Installing, upgrading, or downgrading an extension
  • Deleting catalogs and extensions

Create a Catalog

OLM v1 is designed to source content from an on-cluster catalog in the file-based catalog (FBC) format. These catalogs are deployed and configured through the ClusterCatalog resource. More information on adding catalogs can be found here.

The following example uses the official OperatorHub catalog that contains many different extensions to choose from. Note that this catalog contains packages designed to work with OLM v0, and that not all packages will work with OLM v1. More information on catalog exploration and content compatibility can be found here.

To create the catalog, run the following command:

# Create ClusterCatalog
kubectl apply -f - <<EOF
apiVersion: catalogd.operatorframework.io/v1alpha1
kind: ClusterCatalog
metadata:
  name: operatorhubio
spec:
  source:
    type: image
    image:
      ref: quay.io/operatorhubio/catalog:latest
      pollInterval: 10m
EOF

Once the catalog is unpacked successfully, its content will be available for installation.

# Wait for the ClusterCatalog to be unpacked
kubectl wait --for=condition=Unpacked=True clustercatalog/operatorhubio --timeout=60s
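If the wait times out, inspecting the resource's conditions can help diagnose unpack failures; these are standard kubectl output options, nothing OLM-specific:

```shell
# Show the full status, including conditions and any unpack errors
kubectl get clustercatalog operatorhubio -o yaml

# Or print just the condition types and messages
kubectl get clustercatalog operatorhubio \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}: {.message}{"\n"}{end}'
```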

Install a Cluster Extension

For simplicity, the following example manifest includes all resources necessary to install the ArgoCD operator: the installation namespace, the installer service account with the minimal set of RBAC permissions needed for installation, and the ClusterExtension resource, which specifies the name and version of the extension to install. More information on installing extensions can be found here.

# Apply the sample ClusterExtension. Manifest already includes
# namespace and adequately privileged service account
kubectl apply -f https://raw.githubusercontent.com/operator-framework/operator-controller/main/config/samples/olm_v1alpha1_clusterextension.yaml
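After applying the manifest, you can block until the extension reports a successful install. The `Installed` and `Resolved` condition types referenced elsewhere in this document are the ones to watch; the timeout value here is arbitrary:

```shell
# Wait for the extension to be resolved and installed
kubectl wait --for=condition=Installed=True clusterextension/argocd --timeout=120s

# Inspect the full status (conditions, resolved bundle) for details
kubectl get clusterextension argocd -o yaml
```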

Upgrade the Cluster Extension

To upgrade the installed extension, update the version field in the ClusterExtension resource. Note that there must be CRD compatibility between the versions being upgraded, and that the target version must be compatible with OLM v1. More information on CRD upgrade safety can be found here, and on the extension upgrade process here.

# Update to v0.11.0
kubectl patch clusterextension argocd --type='merge' -p '{"spec": {"source": {"catalog": {"version": "0.11.0"}}}}'
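After patching, the controller re-resolves the extension; you can confirm the new version was picked up by reading the `Resolved` condition message (the jsonpath filter below is standard kubectl syntax):

```shell
# Print the message of the Resolved condition, which names the resolved bundle
kubectl get clusterextension argocd \
  -o jsonpath='{.status.conditions[?(@.type=="Resolved")].message}'
```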

For information on the downgrade process, see here.
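A downgrade can be attempted the same way, by patching the version field to a lower version. Depending on the release, you may also need to relax the upgrade constraint policy; the `upgradeConstraintPolicy` field name and its location below are assumptions and may differ in your release, so verify against your installed CRD schema first:

```shell
# Check the actual spec schema before patching (field locations vary by release)
kubectl explain clusterextension.spec

# Assumes an upgradeConstraintPolicy field exists alongside version -- verify above
kubectl patch clusterextension argocd --type='merge' \
  -p '{"spec": {"source": {"catalog": {"version": "0.10.0", "upgradeConstraintPolicy": "Ignore"}}}}'
```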

Uninstall the Cluster Extension

To uninstall an extension, delete the ClusterExtension resource. This will trigger the uninstallation process, which will remove all resources created by the extension. More information on uninstalling extensions can be found here.

# Delete cluster extension and residing namespace
kubectl delete clusterextension/argocd
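Deletion is asynchronous; to block until the ClusterExtension is fully removed, a standard kubectl wait works:

```shell
# Block until the ClusterExtension resource is gone (timeout is arbitrary)
kubectl wait --for=delete clusterextension/argocd --timeout=60s
```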

Cleanup

Extension installation requires the creation of a namespace, an installer service account, and its RBAC. Once the extension is uninstalled, these resources can be cleaned up.

# Delete namespace, and by extension, the installer service account, Role, and RoleBinding
kubectl delete namespace argocd
# Delete installer service account cluster roles
kubectl delete clusterrole argocd-installer-clusterrole && kubectl delete clusterrole argocd-rbac-clusterrole
# Delete installer service account cluster role bindings
kubectl delete clusterrolebinding argocd-installer-binding && kubectl delete clusterrolebinding argocd-rbac-binding

License

Copyright 2022-2024.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

operator-controller's Issues

Automate OLM V1 release

To help facilitate a simple and repeatable release process, tools should be created to automate it as much as possible. This automation should exist in whichever repository is chosen or created from #111.

When invoked it should create a new manifest containing the chosen versions of each sub-component, then make that manifest available for use.

Acceptance criteria:

  • Automation (e.g. a GitHub Action) is created for OLM V1 releases.
  • The automation populates a manifest with the chosen versions of each OLM V1 sub-component.
  • The chosen automation finally publishes the release.

Create install command to install operator-controller and all dependencies

In order to facilitate an easy release process, we should develop a simple method of installing the operator-controller and its dependencies. So far we have proposed doing this through a shell script, so that installation works with a single command, similar to the golangci-lint install command here:

curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.50.1

One alternative idea was to package the yaml files into a single OCI image then install through that.

Acceptance criteria:

  • operator-controller and all dependencies may be installed with a single command
  • Should exist in the repo created in #111. Obsolete: The operator-controller repo will be the source-of-truth for OLM v1 releases, install steps, etc.
  • Must install Cert-Manager, RukPak, and the operator-controller. Each component should be installed successfully and reach a running state before moving to the next component.
  • The script should default to installing the latest released version of each component, but allow users to specify released versions for each component. Obsolete: A release of operator-controller should have the versions of each component enforced in the go.mod file, Makefile, etc.

Explore singleton PO API concept

The current PO API implementation is based on discrete units of input: one package is specified per PO object. This is very similar to the subscription mechanism in OLM (albeit cluster-scoped), where inputs to resolution are treated separately. Historically this behavior has caused a lot of problems and can introduce inconsistencies around resolution. Given what we currently know about the PO requirements, it may make more sense to change the PO API so that a single cluster-scoped singleton object represents all the POs the user intends to install, instead of many separate objects.

This PO object would represent multiple packages, and write back to its status the result of resolution for all the referenced packages. Treating all inputs as a set may make resolution more straightforward and avoid some of the legacy pitfalls in OLM.

A/C:

  • Write a design document outlining the proposed change in the PO API and the pros/cons with the existing model

Change makefile to install and verify tools versions in a less fragile way

See Joe's comment here for explanation and potential ways we can do this: #81 (comment)

Rukpak uses a separate tools module in the repo to pull down the go tools binaries for use with the makefile targets. We could do something similar since it's a better implementation than what comes from the generated project here.

Example Makefile: https://github.com/operator-framework/rukpak/blob/main/Makefile
Example tools folder: https://github.com/operator-framework/rukpak/tree/main/hack/tools

Convert Operator CR into Deppy Constraints

User Story:

As a user, I would like the packageName I specify in the Operator Controller to be converted into a Deppy Constraint.

Background:

  • The goal of this story is to convert an Operator CR into Deppy Constraints.
  • Today, we only plan to introduce the packageName and GVK constraints, but will expand in the future.
  • Within the context of Deppy, Variables are logical statements that can be true or false. Constraints are variables that must be true in a solution set. So for this story:
    -- The user will specify a package that they want to install within an Operator CR's spec.PackageName field
    -- The operator controller will create a "Deppy Variable" that captures this packageName constraint.
    -- The Deppy Library will introduce the GVK Uniqueness requirement.

Acceptance Criteria:

  • An Operator CR can be converted into a set of Deppy Constraints.

Operator Controller can take a deppy solution and lookup each entity ID

User Story: As the operator controller, I can take the set of entity references provided by Deppy as a solution and resolve each reference to a particular bundle image.

Acceptance Criteria:

  • Surface errors when the operator controller fails to resolve a reference
  • The operator controller creates an array of bundle images it needs to process.

Design and document OLM V1 release process

We need to design and document the release process that will be followed for OLM V1. The process should be as simple and repeatable as possible; easily approachable by OLM V1 developers of all experience levels. The process should be fully documented and co-located with our release branch or repository as set in #111.

Acceptance criteria:

  • OLM V1 release process is defined.
  • Release process documentation is located alongside our release manifests from #111.
  • Should utilize the script created in #110

Operator Controller Reconciles Using All Operator CRs

When a reconcile is triggered, the Operator Controller must be aware of the entire state of the cluster in order to make fully-informed decisions. To begin with, this means gathering all Operator CRs present on the cluster.

We should eventually be gathering (from completion of future issues):

  • All Operator CRs
  • All Installed Operators (not necessarily all covered by above)
  • Cluster Version
  • etc.

Acceptance Criteria:

  • Operator Controller Reconciler is updated to reconcile using all Operator resources on the cluster
  • Tests are added to cover this functionality

Implement support for respecting upgrade graph semantics

Goal: Ensure the defined upgrade graph semantics are respected during upgrades. The current implementation always chooses the highest semver bundle in the desired package name without inspecting the upgrade graph semantics.

Needs Clarification on Scope:

  • Ensuring cross-catalog upgrades are handled properly
  • Ensuring cross-channel upgrades are handled properly
  • Accommodating catalogs that define multiple bundle formats (as this doesn't exist today)

Open Questions:

  • Do we need to leverage the OperatorConditions API to determine whether an Operator cannot be upgraded?
  • How do we avoid blasting through the upgrade graph if there's no support at the rukpak layer for inspecting the runtime state of managed workloads?

Add support for generating BundleInstance resources

Goal: update the current implementation which only generates a Bundle resource to also generate a BundleInstance that matches that corresponding Bundle's metadata.Name.

When a new platform-operator Bundle version is available (e.g. it has the highest semver), the controller is responsible for generating a new Bundle resource, and updating the existing BundleInstance resource to point to that newly generated Bundle's metadata.Name.

Operator Controller can apply and manage RukPak bundleDeployments based on the entity id

User Story: Given a solution set, the operator controller should ensure that the correct bundleDeployments exist.

Acceptance Criteria:

  • Only one bundleDeployment is ever present for a particular package.
  • If the solution set contains an entity that is not related to an existing bundleDeployment, create the bundleDeployment
  • If the solution set contains an entity that is an upgrade for an existing bundleDeployment, update the bundleDeployment.
  • If a bundleDeployment exists for an entity that is no longer in the solution set, delete the bundleDeployment.

Out of scope:

  • Dependencies
  • Some outside constraint invalidates an existing bundleDeployment. Examples:
    -- Operator Foo only supports clusters at version 1.24, the cluster has been upgraded to version 1.25. All resolutions are effectively broken because the existing install and subsequent upgrades are invalid.
    -- Supporting resolutions if an installed entity is removed from the catalogSource.

The resolver should treat certain properties/constraints as optional

See https://kubernetes.slack.com/archives/C0181L6JYQ2/p1675349508746979

TL;DR: We should pass a boolean value to loadFromEntity to indicate whether the property is required (true) or optional (false). Currently the function is implemented as if everything it is asked to load is required.

If this new parameter is optional/false and the property name is not in the map, we should return the default value of the type and a nil error.

Lastly, instead of using true/false at call sites, @perdasilva suggested we define constants (perhaps required, optional = true, false), and then use the constant at call sites to make it more obvious what the semantics of the boolean value are.

Create installation yaml for operator-controller

Each component of OLM v1 should be installed through a consistent format. RukPak, for instance, is currently installed using a single downloadable yaml file (here). We should create a similar if not identical installation method for the operator-controller. If we choose to go another route, an additional issue must be created to do the same for RukPak (and probably cert-manager too).

Acceptance criteria:

  • operator-controller can be installed from a single downloadable yaml file.

Note that the output of this issue should not install dependencies such as RukPak itself, just as RukPak does not install its dependencies through the same yaml. That is covered by the next issue: #110. Also note that installing the operator-controller on its own through this yaml, without its dependencies, will not result in a working installation. This information should be easily discoverable so that users don't apply this yaml alone and expect a successful OLM V1 installation. We should also link to the documentation for #110.

Possible idea: Install operator-controller via BundleDeployment

Create an interface to Deppy that encapsulates the necessary solver methods

User Story:

  • As a developer on the Operator Controller project, I would like to place any calls to Deppy solver behind a reasonable interface so I can easily test or swap out the Deppy implementation.

Background:

  • The Deppy project is being developed as a library that third party projects may utilize.
  • There is a possibility that the Operator Controller will need to introduce a Kubernetes-native API (a CRD) that allows users to configure constraints sent to the resolver.
  • To support that possible change, calls to the Deppy Solver should be done through an interface.
  • Placing the Solver behind an interface will simplify unit testing as well.
  • The existing solver interface can be found here.

Acceptance Criteria:

  • The Deppy Solver is implemented using an interface

Update the Operator CRD Spec to allow me to specify an operator with a particular packageName

User Story:
As a cluster admin, I would like an operator CRD to be introduced that allows me to define the operator I would like to install.
Why is this important?

  • The Operator API will act as a top-level resource for users installing operators. This viewpoint will significantly improve the user's ability to understand the state of a given operator.
  • The proposed API introduced in this epic consists of a single field within the spec. This is the minimal API surface that enables developers to introduce a workflow that installs an operator from a catalogSource on cluster. This work will eventually be expanded upon so we may realize the goals of OLM v1.

Acceptance Criteria:

  • A new project is created with Kubebuilder.
  • A new operator CRD is created.
  • The operator API should allow users to specify the packageName.
  • A controller exists that reconciles the operator CR.

Out of Scope:

  • Results do not need to be deterministic. For example, the package may be fulfilled by an entity within any catalogSource.

Add `Progressing` condition type?

As of #217, we have removed the Ready condition, and we added the Installed condition in #218 and Resolved condition in #213.

We may want to consider another condition type called Progressing that tells a user that the controller is working toward (or not) achieving the desired state specified in the operator spec.

This would be useful to understand context when status.installedBundleSource and status.resolvedBundleSource differ.

Operator controller uses deppy to find solutions for existing operator CRs

User Story: As a user, I would like the operator controller to be able to identify if it can install a set of bundles that satisfies the existing operator CRs.

Why is this important?

  • Deppy is being developed as a framework, which will allow other projects to import Deppy as a library that can be used to identify if a set of entities can satisfy a set of constraints. The Operator Controller will then rely on the Deppy library to decide which operator must be installed on cluster based on:
    -- user constraints
    -- cluster constraints
    -- dependency constraints
    -- constraints introduced by operators currently installed
  • The initial delivery tracked in this epic will only focus on supporting the following constraints:
    -- PackageName Constraints: Allowing users to specify that an entity must be from a particular package
    -- GVK Uniqueness Constraints: Allowing users to specify that GVKs introduced to the cluster must be provided by a single entity.
  • Multiple operator CRs can introduce multiple PackageName constraints.

Acceptance Criteria:

  • Deppy should be embedded in the Operator Controller, possibly behind an interface so it can easily be moved to its own standalone service.
  • The operator controller can:
    -- identify when an operator CR cannot be satisfied.
    -- identify solutions when they exist.
  • The output from Deppy will provide a list of references to bundles that were selected. These references will then be used by the operator controller to create RukPak bundleDeployments. The format of the references needs to be determined as a part of this ticket.

Introduce the spec.Package field to the Operator API

  • Introduce the spec.Package field.
  • Create a GA Milestone issue that captures that the spec.Package field will likely need to be changed in order to support other sourcing techniques. For example, installing a bundle image directly
  • Update #62 to include a link to the GA Milestone Issue

The generated Inputs leak over time from the catalog source adapter

The recent set of changes introduced a new component that is solely responsible for aggregating catalog source content, where each olm.bundle is modeled as a deppy Input resource. For the most part this seems like a step in the right direction, but the current implementation leads to some false positives during e2e testing. When the original source for a generated Input is deleted (e.g. a specific catalog source resource), the Input is stranded, which leads to test pollution over time.

Unfortunately, the CatalogSource API is namespace-scoped while the Input API is cluster-scoped, so we cannot place an owner reference on the generated Input resource, and garbage collection cannot occur in that edge case.

Resurface errors in BundleDeployment to Operator status

The operator controller uses rukpak to install operator bundles on cluster. Any error that occurs during bundle installation is reflected in the BundleDeployment object's status. It should also be reflected in the Operator status, so that an Operator API user can know the state of the bundle installation.

The Operator Controller can provide Deppy with a set of constraints and return a solution set consisting of entity IDs.

User Story:

  • As a Deppy user, I would like to provide Deppy a list of constraints and Entity Sources and be returned a solution set consisting of entity IDs.

Acceptance Criteria:

  • If a solution set is found, the list of entity IDs are returned.
  • If no solution set is found, an error explaining why the constraints could not be met is returned.
  • The operator CR's status is updated to reflect that an entity satisfying its constraints has been found.

Provide new methods or fields to allow for other bundle sources to be used

The operator controller code as it is now assumes several things about the bundlePath being used, including the source type (image) and the provisioner class used (plain). We need to provide some method by which we can easily figure out the type of bundle source being installed. This could be done by:

  • adding additional property(s) to help us determine the type, e.g. olm.bundle.type
  • changing the olm.bundle.path property to instead be a set of different properties for each specific type, e.g. olm.bundle.imageReference.
  • etc.

Optimize Controller by de-duping Watch Events

Each reconcile event in the Operator Controller will result in multiple Operator CRs being processed. If many events were to occur for many Operator CRs in a short time span (e.g. many created at once) then we should take steps to avoid redundant Reconciles.

A couple ideas that were floated:

  • Changing the default controller behavior to map each event with the same key, resulting in just one total Reconcile event
  • Manually removing other entries from the event queue

Optimizing like this could create an issue with traceability, since it may be difficult to audit any changes made to the actual Operator CR that triggered an event.

Operator CRs are Processed in Parallel

Each reconcile event will result in the entire set of Operator CRs being processed, and in order to do that in a timely manner they should all be processed in parallel as opposed to serially.

This issue should be done after #89, as each Operator process needs to know the entire cluster state in order for each CR to be processed properly.

Acceptance Criteria:

  • Reconciler is updated to process listed Operator CRs in parallel.
  • Tests are written to cover this behavior.

Depends on #89

Start using envtest for testing reconciler

With the introduction of a separate suite to test the reconciler here, it would be useful to spin up an isolated test server with envtest instead of directly taking in the config of the kind cluster and using locally available binaries. This will help us avoid version incompatibility issues.

Collect content for each entity ID provided in the solution set

User Story:

  • For each entity in a solution set, the operator controller is able to retrieve the associated packageName and bundle URL.

Background:

  • Once Deppy has identified which operators need to be on cluster, the Operator Controller must be able to retrieve the content that needs to be installed onto the cluster.

Acceptance Criteria:

  • For each entity in a solution set, a packageName and bundle URL are retrieved.
  • Failure to retrieve the packageName or bundleContent surfaces as an error in the Operator CR.

Create OLM V1 release branch or repository

We need a single source of truth from which to cut each released version of OLM V1. This project could exist in a branch of the operator-lifecycle-manager repository, the operator-controller repository, or in a separate repository all on its own.

Regardless of where it exists, it should contain, for each release version of OLM V1, a manifest of versions for each sub-component. This can then be referenced when an install script is run to install each sub-component at the version specified for that particular OLM V1 release.

Acceptance criteria:

  • A single source of truth for OLM V1 release versions is created.
  • The branch or repository that is created contains a manifest for each release version, each pointing to the appropriate versions of all sub-components to be installed.
  • A README.md is created explaining the purpose of the repository.

Based on discussion: We're leaning towards creating a new repository for this. Call it github.com/operator-framework/olmv1-installer.

Add support for sourcing content from non-catalog sources

Placeholder: Investigate extending the spec.catalog configuration, and adding a spec.source configuration that can accommodate sourcing from an OCI artifact, OLM catalog source, etc. This field can mirror the union type pattern seen throughout Kubernetes (and rukpak's API set).

Add support for interacting with file-based-catalogs

Goal: Add support for interacting with dedicated file-based-catalogs for an individual platform operator.

The implementation can either serve that file-based-catalog over a grpc connection, or use OLM's existing CatalogSource implementation, and communicate with a dedicated CatalogSource resource using the registry ListBundles API to get a list of olm.bundles that are present in that CatalogSource resource.

The operator controller ensures that the correct bundleDeployments exist for each entity in the solution set

User Story

  • As a user, I would like the Operator Controller to create a bundleDeployment for each entity in the solution set returned by Deppy.

Background:

  • The Operator Controller will have a set of objects that contain both a packageName and a bundle URL. The Operator Controller must then ensure that the correct bundleDeployments exist on cluster.
  • The packageName and bundle URL object will be available once OLM-2836 is delivered.

Acceptance Criteria:

  • Only one bundleDeployment is ever present for a particular package.
  • If the solution set contains an entity that is not related to an existing bundleDeployment, create the bundleDeployment
  • If the solution set contains an entity that is an upgrade for an existing bundleDeployment, update the bundleDeployment (Non-Blocking, nice to have).
  • If the operator-controller previously created a bundleDeployment for an entity that is no longer present in the solution set, delete the bundleDeployment.
  • Unit tests are created that ensure:
    -- a valid bundleDeployment is created for each Operator CR.
    -- existing bundleDeployments are updated if a new version of the operator is available in the deppy source (Non-Blocking, nice to have).
    -- existing bundleDeployments are removed if their associated operator CR is deleted.

Out of Scope

  • Dependencies
  • Outside constraints that invalidate an existing bundle deployment, such as supported cluster versions.
