
lifecycle-toolkit's Introduction

Keptn


This is the primary repository for the Keptn software and documentation. Keptn provides a “cloud-native” approach for managing the application release lifecycle: metrics, observability, and health checks, with pre- and post-deployment evaluations and tasks. It is an incubating project under the umbrella of the Keptn Application Lifecycle working group.

Note: Keptn was developed under the code name "Keptn Lifecycle Toolkit", or "KLT" for short. The source code contains many vestiges of these names.

Goals

Keptn provides Cloud Native teams with the following capabilities:

  • Prerequisite evaluation before deploying workloads and applications
  • Checking application health in a declarative (cloud-native) way
  • A standardized way to run pre- and post-deployment tasks
  • Out-of-the-box observability
  • Deployment lifecycle management

[Image: Operator Maturity Model with the third level circled]

Keptn can be seen as a general-purpose, declarative Level 3 operator for your application. For this reason, Keptn is agnostic to the deployment tools used and works with any GitOps solution.

For more information about the core concepts of Keptn, see our core concepts documentation section.

Status

Status of the different features:

The status follows the Kubernetes API versioning schema.

Community

Find details on regular hosted community events in the keptn/community repo and our Slack channel(s) in the CNCF Slack workspace.

Roadmap

You can find our roadmap here.

Governance

  • Community Membership: Guidelines for community engagement, contribution expectations, and the process for becoming a community member at different levels.

  • Members and Charter: Describes the formation and responsibilities of the Keptn Governance Committee, including its scope, members, and core responsibilities.

Installation

Keptn can be installed on any Kubernetes cluster running Kubernetes >=1.24.

For users running vCluster, please note that you may need to modify your configuration before installing Keptn; see Running Keptn with vCluster for more information.

Use the following command sequence to install the latest release of Keptn:

helm repo add keptn https://charts.lifecycle.keptn.sh
helm repo update
helm upgrade --install keptn keptn/keptn -n keptn-system --create-namespace --wait
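
After installation, a quick sanity check is to confirm that the Keptn pods came up in the namespace used above:

kubectl get pods -n keptn-system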

Monitored namespaces

Keptn must be installed in its own namespace that does not run other major components or deployments.

By default, the Keptn lifecycle orchestration monitors all namespaces in the cluster except for a few namespaces that are reserved for specific Kubernetes and other components. You can modify the Helm chart to specify the namespaces where the Keptn lifecycle orchestration is allowed. For more information, see the "Namespaces and Keptn" page in the Configuration section of the documentation.
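
As a hedged sketch, restricting the monitored namespaces can look like the following Helm invocation; the exact value key below is an assumption, so verify it against the chart's values.yaml and the documentation page mentioned above:

helm upgrade --install keptn keptn/keptn -n keptn-system \
  --set "lifecycleOperator.allowedNamespaces={dev,staging}"  # key name assumed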

More information

For more info about Keptn, please see our documentation.

You can also find a number of video presentations and demos about Keptn on the YouTube Keptn channel. Videos that refer to the "Keptn Lifecycle Controller" are relevant for the Keptn project.

Contributing

For more information about contributing to Keptn, please refer to the Contribution guide section of the documentation.

To set up your local Keptn development environment, please follow these steps for new contributors.

License

Please find more information in the LICENSE file.

Thanks to all the people who have contributed 💜

Made with contrib.rocks.

lifecycle-toolkit's People

Contributors

aepfli, agardnerit, amishakumari544, bacherfl, eddycharly, geoffrey1330, github-actions[bot], hirentimbadiya, keptn-bot, mowies, nitishupkr, nlamirault, odubajdt, oleg-nenashev, philipp-hinteregger, prakrit55, rakshitgondwal, realanna, renovate[bot], shivangshandilya, staceypotter, stackscribe, sudiptob2, thisthat, thschue, utkarshumre, vamshireddy02, vickysomtee, vishalvivekm, yashpimple


lifecycle-toolkit's Issues

Webhook injects the Keptn Scheduler

Goal

The Mutating Webhook should inject the Keptn Scheduler into workloads that have the required Keptn labels.

Acceptance Criteria

  • When the Webhook is invoked, it always allows the request
  • When the workload contains the Keptn annotations, the Webhook mutates the resource, injecting the Keptn Scheduler
  • When the resource is well-formed, the resulting mutated resource is also well-formed

DoD

  • The mutation does not fail the application of the resource
  • If Keptn Annotations are available, the Keptn Scheduler is used

Dependencies

Keptn Scheduler schedules pods after pre-deployment checks

Goal

The Keptn Scheduler waits for Pre-Checks to finish before the Pod is set for deployment.

Technical Details

Service and Application Controllers #9 #10 need to watch for the Event CRDs and update the Service and Application CRDs when the pre-deployment checks are completed. The Keptn Scheduler can then watch Service CRDs to determine when the Pod can be set for deployment.

Acceptance Criteria

  • Service Controller updates the status of Service CRDs when pre-check events are completed
  • Application Controller updates the status of Application CRDs when pre-check events are completed

DoD

  • Pods are set for deployment after pre-deployment checks are completed

Dependencies

Keptn Lifecycle Controller

Goal

The goal of this sandbox project is to introduce a more “cloud-native” approach for pre- and post-deployment, as well as the concept of application health checks.

Reference

Propagate Deployment/StatefulSet/Daemonset Annotations to the Pods

At the moment, annotations for using the Lifecycle Controller have to be set at the Pod level. As this can lead to problems, it should also be possible to set them at a higher level, e.g. on a Deployment or StatefulSet, and have the lifecycle controller use them (a sketch follows below).

Annotations at the Pod level should "win" if both are set.
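
A hedged sketch of the desired behavior, reusing the draft annotation names from #4; everything else in the manifest is illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: microservice-1
  annotations:
    keptn.sh/application: my-super-cool-application # set once at the Deployment level
    keptn.sh/version: "1.0"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: microservice-1
  template:
    metadata:
      labels:
        app: microservice-1
      annotations:
        keptn.sh/version: "1.1" # Pod-level annotation wins over the Deployment-level "1.0"
    spec:
      containers:
      - name: microservice-1
        image: ubuntu:latest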

Bootstrap Service Controller

Goal

Extend the Keptn Operator #2 with a controller for the Deployments annotated with the Keptn annotations #4.

Technical Details

Deployments can be annotated with the special annotation stemming from #4.
The controller should monitor their status.

Example

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: Service
metadata:
  name: service-test
spec:
  application: Application12
  preDeploymentChecks:
    service: service-test
    application: Application12
    job: 
      backoffLimit: 5
      activeDeadlineSeconds: 100
      template:
        spec:
          containers:
          - name: hello-world
            image: ubuntu:latest
            command: ["echo",  "Hello from the KLO and Flo"]
          restartPolicy: Never

Acceptance Criteria

  • The operator monitors Deployments that have Keptn annotations
  • The operator tracks information about the pre-deployment status at the service level: pending, running, success/failure
  • The operator emits K8s events when the pre-deployment of the deployment starts and finishes.

DoD

  • The Operator monitors Deployments annotated w/ Keptn annotations
  • Events are visible in the K8s event stream

Dependencies

Fetch metrics from Prometheus

Goal

The KeptnEvaluation controller fetches metrics from Prometheus.

Technical Details

The controller shall fetch data from Prometheus using the information from KeptnEvaluationDefinition and KeptnEvaluationProvider.
For each entry in spec.objectives of the KeptnEvaluationDefinition, the controller shall resolve the query against Prometheus and evaluate the result against the target.
Afterwards, it should update the status.evaluationStatus of each objective with the value retrieved from Prometheus and the status of the objective: failed or passed.

Finally, when all objectives have been resolved, the controller can set the status.overallStatus field.
The value of this field is the logical AND of all objectives' statuses, i.e., if at least one objective failed, the overallStatus is failed (a sketch of this aggregation follows).
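
A minimal sketch in Go of the aggregation described above; the type is hypothetical and only illustrates the logical AND:

type ObjectiveStatus struct {
	Name   string
	Value  string
	Status string // enum: passed/failed
}

// overallStatus is "failed" if at least one objective failed, otherwise "passed".
func overallStatus(objectives []ObjectiveStatus) string {
	for _, o := range objectives {
		if o.Status == "failed" {
			return "failed"
		}
	}
	return "passed"
}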

Acceptance Criteria

  • The controller resolves each objective against Prometheus
  • The controller writes the result from Prometheus into the status.evaluationStatus.value field of the respective query
  • The controller sets the overallStatus field when all objectives are evaluated

DoD

  • Evaluations are completed by fetching data from Prometheus

Dependencies

Bootstrap Pipeline

Goal

Configure a Github Workflow that builds and tests Go code.

Acceptance Criteria

  • The workflow runs on each commit
  • The workflow blocks a PR from being merged if it fails
  • When the Go code cannot be built, the workflow fails
  • When the Go tests fail, the workflow fails
  • The artifacts are published to the GitHub package registry

DoD

  • An automated pipeline blocks PRs from being merged if the code does not compile or does not pass the tests
  • Artifacts are built into containers that are published to the GitHub package registry

Dependencies

Bootstrap Keptn Annotations

Goal

Define the set of annotations on which the Webhook Admission Controller reacts, their format, their default value, and their name.

Technical Details

The Mutating Webhook should only change manifests that are annotated with Keptn special annotations.
Keptn should track the following properties:

  • instance
  • application
  • (micro)service / component
  • version

Draft proposal of annotations with example values:

keptn.sh/instance: eu-central
keptn.sh/application: my-super-cool-application
keptn.sh/component: microservice-1
keptn.sh/version: "1.0"

Not all of these annotations are mandatory for Keptn to track the lifecycle. Hence, the goal of this research is to define which annotations are mandatory and what the default values for the missing ones are.

Acceptance Criteria

  • Decide the list of mandatory annotations
  • Decide default values for missing annotations
  • Define the format and names

DoD

  • Document the results in the README file

Add tracing capability for Scheduler

Goal

Instrument the Scheduler with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The scheduler shall use keptn/scheduler as the component name.
The scheduler shall extract the parent Span from TraceContext.
The scheduler shall create a Consumer Span and attach an event at each reconciliation.
The span should be created with the following OTel attributes (a sketch follows the list):

  • KeptnApp as keptn.deployment.app_name
  • KeptnWorkload as keptn.deployment.workload
  • KeptnVersion as keptn.deployment.version
  • WorkloadStatus as keptn.deployment.workload.status
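
A minimal sketch in OTel Go of the extract-and-start step described above; all function and variable names are illustrative:

package scheduler

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/trace"
)

// startSchedulingSpan extracts the parent span injected by the webhook from
// the CRD's TraceContext map and starts a Consumer span for the reconciliation.
func startSchedulingSpan(ctx context.Context, traceContext map[string]string, app, workload, version string) (context.Context, trace.Span) {
	// Assumes otel.SetTextMapPropagator(propagation.TraceContext{}) was called at startup.
	ctx = otel.GetTextMapPropagator().Extract(ctx, propagation.MapCarrier(traceContext))

	tracer := otel.Tracer("keptn/scheduler")
	return tracer.Start(ctx, "schedule_pod",
		trace.WithSpanKind(trace.SpanKindConsumer),
		trace.WithAttributes(
			attribute.String("keptn.deployment.app_name", app),
			attribute.String("keptn.deployment.workload", workload),
			attribute.String("keptn.deployment.version", version),
		),
	)
}

Each reconciliation can then attach its current status via span.AddEvent before calling span.End().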

Acceptance Criteria

  • TraceContext contains W3C Trace Headers
  • Scheduler creates a Consumer Span at the beginning of the reconciliation loop
  • Attributes with Keptn information should be added (e.g. AppName, status)
  • Events are added at each reconciliation with the status

DoD

  • The Scheduler generates Spans

Resources

Make timeouts and retries configurable for function executions

As Keptn Tasks should fail after a certain time or number of executions, this behavior should be configurable.

Therefore, we'd need some additional fields in the FunctionSpec (/operator/api/v1alpha1/keptntaskdefinition_types):

  • retries
  • timeout

These thresholds should also be passed on to the created Task resource instance, and the task (and thus the corresponding job) should be terminated if one of these thresholds is reached (a sketch follows).
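
A hedged sketch of what the proposed fields could look like on a KeptnTaskDefinition; the retries and timeout names come from this issue, the rest of the manifest is illustrative:

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: KeptnTaskDefinition
metadata:
  name: pre-deployment-check
spec:
  function:
    retries: 3   # proposed: give up after 3 failed executions
    timeout: 5m  # proposed: terminate the task and its job after 5 minutes
    inline:
      code: |
        console.log("pre-deployment check");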

Checks: Reconcile loop modifies resource without storing the changes in a single reconcile loop

During a reconcile loop that handles the pre- or post-deployment checks, there are situations where the logic modifies the workloadinstance object but does not store the changes via the K8s API. In that case, the modified object stays in memory until another reconcile loop handles it.

Steps to reproduce:

  • create a deployment without post-deployment checks
  • after the binding of the pods is finished, you will receive an error in the logs

Operation cannot be fulfilled on keptnworkloadinstances.lifecycle.keptn.sh \"podtato-head-podtato-head-left-arm-0.1.0\": the object has been modified; please apply your changes to the latest version and try again

The operator should not receive such an error at all, even though the implementation currently works correctly despite it (a sketch of the fix follows).
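
A minimal sketch of the fix under controller-runtime assumptions (type names are illustrative): persist status mutations through the API server in the same reconcile loop instead of leaving them in memory.

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
)

func (r *KeptnWorkloadInstanceReconciler) finishChecks(ctx context.Context, wi *KeptnWorkloadInstance) (ctrl.Result, error) {
	wi.Status.PostDeploymentStatus = "Succeeded"

	// Store the change immediately via the K8s API; on a conflict our copy
	// is stale, so requeue and work on the latest version next time.
	if err := r.Status().Update(ctx, wi); err != nil {
		if apierrors.IsConflict(err) {
			return ctrl.Result{Requeue: true}, nil
		}
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}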

Add tracing capability for App

Goal

Instrument the App controller with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The App controller shall use keptn/operator/app as the component name.
The App controller shall extract the parent Span from TraceContext.
Each Span should be created with the following OTel Attribute:

  • AppName as keptn.app.name
  • AppVersion as keptn.app.version
  • AppNamespace as keptn.app.namespace

Acceptance Criteria

  • App creates a Consumer Span when it starts to handle an App CRD
  • App creates a Producer Span when it creates an AppVersion
  • Attributes with Keptn information should always be added whenever possible (e.g. AppName, ...)

DoD

  • The App Controller generates Spans

Resources

Bootstrap KeptnEvaluation Controller

Goal

Extend the Keptn Operator with a controller for KeptnEvaluation CRDs.

Technical Details

The Controller should monitor KeptnEvaluation CRDs:

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: KeptnEvaluation
metadata:
  name: my-prometheus-evaluation-asdolkfj
spec:
  evaluationDefinition: my-prometheus-evaluation #name of the KeptnEvaluationDefinition to use
status:
  evaluationStatus: #array
    - name: query-1 #string
      value: 10 #string with the result from query-1
      status: passed #enum: passed/failed
    - name: query-2
      value: 50
      status: failed
  overallStatus: failed

This controller should monitor KeptnEvaluation CRDs and fetch KeptnEvaluationDefinition and KeptnEvaluationProvider CRDs.
The KeptnEvaluationDefinition is fetched based on the name in spec.evaluationDefinition.
The KeptnEvaluationProvider is fetched based on the value of spec.source from the KeptnEvaluationDefinition.

Acceptance Criteria

  • KeptnEvaluation CRD is defined
  • The controller monitors the KeptnEvaluation CRD
  • The controller fetches KeptnEvaluationDefinitions and KeptnEvaluationProviders

DoD

  • The operator monitors KeptnEvaluation CRDs

Bootstrap Event Controller

Goal

Create the skeleton for the Keptn Event Controller inside the Keptn Operator #2 and define the Keptn Event CRD.

Technical Details

The operator implemented in #2 shall be extended with support to control Keptn Event CRDs.
Draft of the CRD to monitor:

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: Event
metadata:
  name: event-sample
spec:
  service: Service1
  application: Application1
  job: 
    backoffLimit: 5
    activeDeadlineSeconds: 100
    template:
      spec:
        containers:
        - name: hello-world
          image: ubuntu:latest
          command: ["echo",  "Hello from the KLO"]
        restartPolicy: Never

Acceptance Criteria

  • The operator monitors Keptn Event CRDs
  • The CRD definition is documented in the README

DoD

  • The Event CRD format is documented in the README
  • The Operator controls Event CRDs
  • Adapt Application and Service controllers to emit Keptn Events together with K8s events

Dependencies

Research: Testing

Goal

Research how we can test the controller and which type of tests we should use.

Acceptance Criteria

  • Define how we can test CRDs in isolation: maybe using K3s/K3d/Kind
  • Define what type of tests we should execute:
    • Unit tests
    • Contract tests
    • E2E & run them on multiple K8s versions
    • Load tests
  • Define a testing strategy and how to automate it

DoD

  • A PoC with tests is created
  • Best practices on how to structure the code are documented in a Developer.md

References

Bootstrap the Keptn Lifecycle Operator

Goal

Create the skeleton for the Keptn Lifecycle Operator.

Technical Details

Use Kubebuilder to generate the skeleton of an operator.

Acceptance Criteria

  • The operator offers health and readiness probes
  • The operator logs that it was able to start

DoD

  • The operator can be run in a K8s cluster

Bootstrap KeptnEvaluationDefinition

Goal

Create the definition for the Keptn Analysis CRDs.

Technical Details

A new set of CRDs should be created for supporting SLI/SLO validation.

KeptnEvaluationDefinition CRD should look like the following:

apiVersion: keptn.sh/v1
kind: KeptnEvaluationDefinition
metadata:
  name: my-prometheus-evaluation
spec:
  source: prometheus #string
  objectives: #array
    - name: query-1 #string
      query: "xxxx" #string: promQL query
      evaluationTarget: "<20" #string: must start with < or >
    - name: query-2
      query: "yyyy"
      evaluationTarget: ">4"

KeptnEvaluationProvider CRD should look like the following:

apiVersion: keptn.sh/v1
kind: KeptnEvaluationProvider
metadata:
  name: prometheus
spec:
  targetServer: "http://prometheus-k8s.monitoring.svc.cluster.local:9090" #string
  secretName: prometheusLoginCredentials #secret name, optional

Acceptance Criteria

  • The KeptnEvaluationDefinition is defined
  • The KeptnEvaluationProvider is defined

DoD

  • The CRDs format is documented in the README

Dependencies

Bootstrap SLO controller

Goal

Extend the Keptn Operator with a controller for SLO CRDs.

Technical Details

The Controller should monitor SLOs in a format compatible with the current Keptn SLO format.
The SLO format should be part of the spec field.

Example

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: SLO
metadata:
  name: service-test
spec:
  comparison:
    aggregate_function: "avg"
    compare_with: "single_result"
    include_result_with_score: "pass"
    number_of_comparison_results: 1
  filter:
  objectives:
    - sli: "response_time_p95"
      displayName: "Response time P95"
      key_sli: false
      pass:             # pass if (relative change <= 10% AND absolute value is < 600ms)
        - criteria:
            - "<=+10%"  # relative values require a prefixed sign (plus or minus)
            - "<600"    # absolute values only require a logical operator
      warning:          # if the response time is below 800ms, the result should be a warning
        - criteria:
            - "<=800"
      weight: 1
  total_score:
    pass: "90%"
    warning: "75%"

Acceptance Criteria

  • The operator monitors Keptn SLOs

rename lfc-scheduler to scheduler

To keep things consistent throughout the lifecycle-controller project, the lfc-scheduler should be renamed to scheduler.

Acceptance Criteria

  • lfc-scheduler folder is renamed
  • all occurrences in the documentation are changed

[Tracking] Lifecycle Controller can be injected in manifests

Goal

When a Kubernetes Manifest is applied, the Keptn Scheduler is injected for pods that are correctly annotated and Kubernetes Events for Pre-Deployment are available in the K8s Event stream.

Technical Details

A Kubernetes Manifest, which is annotated with Service and Application Name, gets applied to the Kubernetes Cluster.
Afterward, the Keptn Scheduler gets injected (via Mutating Webhook), and Kubernetes Events for Pre-Deployment are sent to the event stream.
In this step, the Scheduler only creates events, and no further actions are taken.

User flow

  1. Apply Deployment manifest annotated with keptn-service and version
  2. Watch the event stream of the created service resource via kubectl describe keptnservice ...

DoD

The start and end of pre-deployment are shown in the K8s event stream.

List

Add tracing capability for AppVersion

Goal

Instrument the AppVersion controller with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The AppVersion controller shall use keptn/operator/appversion as the component name.

Each Span should be created with the following OTel Attribute:

  • AppName as keptn.app.name
  • AppVersion as keptn.app.version
  • AppNamespace as keptn.app.namespace

Acceptance Criteria

  • TraceContext contains W3C Trace Headers
  • AppVersion creates a Consumer Span when it handles the AppVersion CRD
  • AppVersion creates a Client Span when it creates a Task
  • Attributes with Keptn information should always be added whenever possible (e.g. AppName, status of the task)

DoD

  • The AppVersion controller generates Spans

Resources

Add tracing capability for Task

Goal

Instrument the Task Controller with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The Task controller shall use keptn/operator/task as the component name.

Each Span should be created with the following OTel Attribute:

  • KeptnApp as keptn.deployment.app_name
  • KeptnWorkload as keptn.deployment.workload
  • KeptnVersion as keptn.deployment.version

Acceptance Criteria

  • TraceContext contains W3C Trace Headers
  • Task creates a Server Span that ends when the task is completed
  • Attributes with Keptn information should always be added whenever possible (e.g. AppName, status of the task)
  • Events should be added when something relevant happens (e.g. the Job failed and gets triggered again)

DoD

  • The Task Controller generates Spans.

Resources

Pass over "context information" to Tasks

To add context-related information (e.g. workload name, version, app name, ...) to the functions executed by Keptn, this context should be passed along when a function is executed.

As a proposal, this could be done as a JSON object (following the CloudEvents specification) in an environment variable (a sketch follows).
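
A hypothetical sketch of that proposal: the task's Job passes the context as a JSON object in an environment variable. The variable name and JSON shape are illustrative, not a defined interface:

apiVersion: batch/v1
kind: Job
metadata:
  name: pre-deployment-task
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: function-runner
          image: ubuntu:latest
          command: ["sh", "-c", "echo $KEPTN_CONTEXT"]
          env:
            - name: KEPTN_CONTEXT # hypothetical variable name
              value: '{"appName":"my-super-cool-application","workloadName":"microservice-1","workloadVersion":"1.0"}'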

Example Configuration for enabling OpenTelemetry Collector for Traces and Metrics

Goal

Extend the examples folder with configuration for OTel Collector.

Technical Details

The examples folder should contain an example configuration for enabling the operator to send OpenTelemetry data (traces and metrics) to the OpenTelemetry Collector. https://github.com/open-telemetry/opentelemetry-go/tree/main/example/otel-collector can be used as a reference.
Ultimately, the traces generated by the Operator and the Scheduler should be sent to the OTel Collector and be viewable in Jaeger (setting up Jaeger is out of scope for this issue), and metrics should be viewable in Prometheus (installing Prometheus is out of scope for this issue).

Acceptance Criteria

  • Traces and Metrics export is demo-able

Pass over results from Tasks to the WorkloadInstance

To get more information for post-deployment analysis in the future, it should be possible to hand results over to the Keptn objects.

This should happen in such a way that the Jobs may produce results, which are stored in the Task status and finally in the status of the WorkloadInstance object (e.g. PreDeploymentTaskStatus[1].TaskResult).

[Tracking] Add tracing capabilities

Goal

Instrument the Keptn Lifecycle Operator with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

Each component should have its own tracer:

  • Scheduler: keptn/scheduler
  • Webhook: keptn/webhook
  • Workload: keptn/operator/workload
  • WorkloadInstance: keptn/operator/workloadinstance
  • Task: keptn/operator/task
  • TaskInstance: keptn/operator/taskinstance

Each Span should be created with the following OTel Attribute:

  • KeptnApp as keptn.deployment.app_name
  • KeptnWorkload as keptn.deployment.workload
  • KeptnVersion as keptn.deployment.version

The webhook is the one that always starts a trace.
The scheduler should create a Client Span for each reconciliation.

DoD

  • The Lifecycle Controller generates the aforementioned spans as part of a single trace

Resources

List

[Tracking] Pre-Deployment Checks

Goal

When a Kubernetes Manifest is applied, the Keptn Lifecycle Controller allows Pods to be scheduled after pre-deployment checks.

Technical Details

A Kubernetes Manifest, which is annotated with Service and Application Name, gets applied to the Kubernetes Cluster.
In the manifest, there is also a Keptn Integration CRD that is set to react on pre-deployment checks. For now, the integration should simply look for a well-defined ConfigMap to exist.
The Keptn scheduler takes care of setting the K8s manifest to be deployed and triggers the integration for the pre-deployment check. When the pre-deployment checks are finished with a successful state, the pods are scheduled to be deployed.

User flow

  1. Apply Deployment manifest annotated with keptn-service and version, a Keptn Integration for pre-deployment checks
  2. Watch event stream of the created service resource via kubectl describe keptnservice ...
  3. The job is started when the deployment reaches the pre-deployment checks.

DoD

  1. The start and end of pre-deployment are shown in the K8s event stream.
  2. The defined job is executed at the start of the pre-deployment checks and finishes. Its start and stop events are shown in the K8s event stream.
  3. The pre-deployment information (pending, running, success/failure) is shown in the status section of the keptn-service resource.

List

Bootstrap Release pipeline

Goal

Create a GitHub Action that performs a release for the Keptn Lifecycle Controller.

Technical Details

The Action will be triggered manually or when a tag matching v[0-9]+\.[0-9]+\.[0-9]+ is found.
The release should push all artifacts to GH Packages and attach the installation manifest to the GH release.

Acceptance Criteria

  • A single installation manifest is created and attached to the release
  • Images are built and published in the GH registry

DoD

  • We can manually release the Lifecycle Controller

Event Controller triggers K8s Jobs

Goal

When the Event Controller receives a Keptn Event CRD, it triggers a K8s Job. When the K8s Job is completed, it updates the status of the Event CRD.

Technical Details

The Event Controller #11 should be extended to also trigger a well-defined K8s Job when the pre-deployment check event is received. The K8s Job, for the moment, is hardcoded; it exits when it finds that the ConfigMap keptn-pre-deployment-checks exists in the current namespace.
An example of how to trigger K8s Jobs can be found in the Keptn job-executor-service integration: https://github.com/keptn-contrib/job-executor-service

Acceptance Criteria

  • A hardcoded job terminates when it finds the ConfigMap keptn-pre-deployment-checks in the current namespace
  • The Event Controller creates such a job when there is a new Keptn Event for pre-deployment checks
  • The Event Controller monitors the K8s Job and updates the status of the event to Done when the job finishes

DoD

  • A K8s Job is triggered for pre-deployment checks

Dependencies

Resources

Add tracing capability for Workload

Goal

Instrument the WorkLoad controller with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The Workload controller shall use keptn/operator/workload as the component name.
The Workload controller shall extract the parent Span from TraceContext.
Each Span should be created with the following OTel Attribute:

  • KeptnApp as keptn.deployment.app_name
  • KeptnWorkload as keptn.deployment.workload
  • KeptnVersion as keptn.deployment.version

Acceptance Criteria

  • Workload creates a Consumer Span when it starts to handle a Workload CRD
  • Workload creates a Producer Span when it creates a WorkloadInstance
  • Attributes with Keptn information should always be added whenever possible (e.g. AppName, status of the task)

DoD

  • The Workload Controller generates Spans

Resources

Report traces to Jaeger

Goal

The traces reported by Keptn Lifecycle Controller should be visible in Jaeger.

Technical Details

The Keptn Lifecycle Controller should report traces and metrics to the OTel Collector.
The collector should be configured to redirect the traces to Jaeger.

Acceptance Criteria

  • There is a manifest to install Jaeger
  • There is a manifest to install OTel Collector
  • The code is adapted to send traces to the OTel Collector

DoD

  • We see traces in Jaeger

Resources

Add tracing capability for WorkloadInstance

Goal

Instrument the WorkloadInstance controller with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using this immutable map.

The WorkloadInstance controller shall use keptn/operator/workloadinstance as the component name.

Each Span should be created with the following OTel Attribute:

  • KeptnApp as keptn.deployment.app_name
  • KeptnWorkload as keptn.deployment.workload
  • KeptnVersion as keptn.deployment.version

Acceptance Criteria

  • TraceContext contains W3C Trace Headers
  • WorkloadInstance creates a Consumer Span when it handles the WorkloadInstance CRD
  • WorkloadInstance creates a Client Span when it creates a Task
  • Attributes with Keptn information should always be added whenever possible (e.g. AppName, status of the task)

DoD

  • The WorkloadInstance controller generates Spans

Resources

Research: Integrate Lifecycle Controller with ArgoCD Resource Hooks

Goal

Research how we can integrate the LifeCycle Controller with ArgoCD and document the changes we would need.

Technical Details

ArgoCD offers hooks that are triggered before and after a deployment.

Acceptance Criteria

  • Document which components of the LifeCycle controller are still required (e.g., Event Controller) and which are not needed anymore (e.g., Keptn Scheduler)
  • Document what the pre- and post-Sync hooks should look like
  • Document any limitations found

DoD

  • Document how much of the LifeCycle Controller can be built on top of ArgoCD as part of this ticket

Dependencies

Resources

Adapt KeptnWorkloadInstance Controller to create KeptnEvaluation CRDs

Goal

The KeptnWorkloadInstance controller creates KeptnEvaluation CRDs.

Technical Details

The controller created in #10 should be adapted to also support running SLI/SLO validations.
For this, the status field should be extended with PreDeploymentEvaluation and PostDeploymentEvaluation.

The controller workflow should first trigger the pre-deployment tasks, then the pre-deployment evaluations; only after that is the pre-deployment status completed and the scheduler can bind the pod to a node.
Similarly, after the post-deployment tasks, the post-deployment evaluations should start.
For this, the KeptnWorkloadInstance controller creates the KeptnEvaluation CRD, watches the overallStatus field, and considers the evaluation completed when it is set to either passed or failed.

Acceptance Criteria

  • The KeptnWorkloadInstance controller creates KeptnEvaluation CRDs
  • Evaluations are part of the workflow

DoD

  • Evaluations can take place as part of the pre/post deployment steps

Dependencies

Rename WorkloadInstance to WorkloadVersion

As the current naming could lead to confusion, the WorkloadInstance type should be renamed to WorkloadVersion.

Acceptance Criteria

  • Type is renamed and changed in all occurrences
  • Documentation and samples are changed
  • @thisthat change the SVG of the architecture diagram (and post it to the repo)

Dependency

#2214

Add tracing capability for Webhook

Goal

Instrument the Webhook with the OTel SDK to export traces.

Technical Details

Use OTel Go to manually instrument the code to export the following spans as a single trace:

[Image: span diagram]

The Trace Context should be propagated into the CRDs via a common metadata field TraceContext. The context will be propagated using these immutable fields. This struct should look like the following:

type KeptnContext struct {
	TraceID    string `json:"traceparent,omitempty"`
	TraceState string `json:"tracestate,omitempty"`
}

The webhook shall use keptn/webhook as the component name.
Spans shall be created with the following OTel Attribute:

  • KeptnApp
  • KeptnWorkload
  • KeptnInstance
  • KeptnVersion

The webhook is the one that always starts a trace.
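
A minimal sketch in OTel Go of the inject side, assuming the webhook is the root of the trace and the resulting map is copied into the created CRD's TraceContext field (names are illustrative):

package webhook

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/trace"
)

// startTraceAndInject starts the root Server span and serializes its context
// into W3C traceparent/tracestate headers for the created Workload CRD.
func startTraceAndInject(ctx context.Context) (trace.Span, map[string]string) {
	tracer := otel.Tracer("keptn/webhook")
	ctx, span := tracer.Start(ctx, "admit_pod", trace.WithSpanKind(trace.SpanKindServer))

	// Assumes otel.SetTextMapPropagator(propagation.TraceContext{}) was called at startup.
	carrier := propagation.MapCarrier{}
	otel.GetTextMapPropagator().Inject(ctx, carrier)
	return span, map[string]string(carrier)
}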

Acceptance Criteria

  • TraceContext contains W3C Trace Headers
  • Webhook creates a Server Span when it starts to handle a request
  • Webhook creates a Producer Span when it creates the Workload CRD
  • Webhook injects the W3C Context into the CRDs
  • Attributes with Keptn information are added to the Span

DoD

  • The Webhook exports traces

Resources

Add alternative keptn annotations

In the future, the annotations used by the lifecycle controller should be configurable. As a first step, the lifecycle controller should be enhanced to also accept the Kubernetes recommended labels:

app.kubernetes.io/part-of -> keptn.sh/application
app.kubernetes.io/name    -> keptn.sh/service
app.kubernetes.io/version -> keptn.sh/version

Acceptance Criteria

  • Kubernetes recommended labels can be used to annotate workloads
  • The behavior is documented

re-check for otel-collector when connection is disabled

At the moment, it seems that the connection to the otel-collector gets disabled if it cannot be reached at startup.

In this case, we should check periodically (every 60-120 secs) whether the collector has become available and re-enable the connection afterward. The current workaround is restarting the pods (as stated in the examples). A sketch of such a retry loop follows.
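
A minimal sketch of such a periodic re-check, assuming the OTLP gRPC trace exporter; the endpoint handling, interval, and wiring are illustrative:

package telemetry

import (
	"context"
	"net"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// connectCollectorWithRetry probes the collector every 90s and installs a
// real tracer provider once the collector becomes reachable.
func connectCollectorWithRetry(ctx context.Context, endpoint string) {
	ticker := time.NewTicker(90 * time.Second)
	defer ticker.Stop()

	for {
		// Probe plain TCP reachability first; otlptracegrpc.New connects
		// lazily, so a successful New() alone does not prove the collector is up.
		if conn, err := net.DialTimeout("tcp", endpoint, 3*time.Second); err == nil {
			conn.Close()
			exp, err := otlptracegrpc.New(ctx,
				otlptracegrpc.WithEndpoint(endpoint),
				otlptracegrpc.WithInsecure(),
			)
			if err == nil {
				otel.SetTracerProvider(sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp)))
				return
			}
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}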

Bootstrap Application Controller

Goal

Create the skeleton for the Keptn Application Controller inside the Keptn Operator and define the Keptn Application CRD.

Technical Details

The operator implemented in #2 shall be extended with support to control Keptn Application CRDs.

Draft example of the CRD to monitor:

apiVersion: keptn.sh/v1
kind: KeptnApplication
metadata:
  name: my-super-cool-application
spec:
  services:
    - name: infrastructure
      version: 4.0
    - name: service-1
      version: 1.0
    - name: service-2
      version: 2.0
      depends:
        - type: service
          name: service-1
          version: 1.0
        - type: service
          name: infrastructure
          version: 4.0
  pre-deployment:
    - sh.keptn.event.pre-deployment-test.triggered
  post-deployment:
    - sh.keptn.event.post-deployment-test.triggered

Acceptance Criteria

  • The operator monitors Application CRDs
  • The operator tracks information about the pre-deployment status: pending, running, success/failure
  • The CRD definition is documented in the README
  • The operator emits K8s events when the pre-deployment of the Application starts and finishes.

DoD

  • The Application CRD format is documented in the README
  • The Operator controls Application CRDs
  • Events are visible in the K8s event stream

Dependencies

Bootstrap SLI controller

Goal

Extend the Keptn Operator with a controller for SLI CRDs.

Technical Details

The Controller should monitor SLIs in a format compatible with the current Keptn SLI format.
The SLI format should be part of the spec field.

Example

apiVersion: lifecycle.keptn.sh/v1alpha1
kind: SLI
metadata:
  name: service-test
spec:
  indicators:
    throughput: "builtin:service.requestCount.total:merge(0):count?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
    error_rate: "builtin:service.errors.total.count:merge(0):avg?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
    response_time_p50: "builtin:service.response.time:merge(0):percentile(50)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
    response_time_p90: "builtin:service.response.time:merge(0):percentile(90)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
    response_time_p95: "builtin:service.response.time:merge(0):percentile(95)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"

Acceptance Criteria

  • The operator monitors Keptn SLIs

Bootstrap the Mutating Webhook

Technical Details

The operator of #2 is extended with the registration of a Mutating Webhook in the Admission Controller.
A reference implementation can be found here:
https://github.com/open-feature/open-feature-operator/blob/7d099c7b72f9a7430581696218458eaee31fb0df/main.go#L90-L98

Acceptance Criteria

  • A Mutating Webhook is registered.
  • Annotations of the resource are logged at each request.
  • All requests are always allowed.

DoD

  • A Mutating Webhook is registered in the Admission Controller.
  • The operator is extended with the Webhook implementation

Dependencies

Resources

Adapt KeptnAppVersion Controller to create KeptnEvaluation CRDs

Goal

The KeptnAppVersion controller creates KeptnEvaluation CRDs.

Technical Details

The controller created in #9 should be adapted to also support running SLI/SLO validations.
For this, the KeptnAppVersionStatus field should be extended with PreDeploymentEvaluation and PostDeploymentEvaluation.

The controller workflow should first trigger the pre-deployment tasks, then the pre-deployment evaluations; only after that is the pre-deployment status completed and the workload controller can proceed.
Similarly, after the post-deployment tasks, the post-deployment evaluations should start.
For this, the KeptnAppVersion controller creates the KeptnEvaluation CRD, watches the overallStatus field, and considers the evaluation completed when it is set to either passed or failed.

Acceptance Criteria

  • The KeptnAppVersion controller creates KeptnEvaluation CRDs
  • Evaluations are part of the workflow

DoD

  • Evaluations can take place as part of the pre/post deployment steps

Dependencies

Add metric capabilities

Goal

Instrument the Keptn Lifecycle Operator with the OTel SDK to export some metrics.

Technical Details

Use OTel Go to manually instrument the code to export the following metrics:

  • Total number of deployments keptn.deployment.count as Counter
  • Total number of tasks keptn.task.count as Counter
  • Total number of apps keptn.app.count as Counter
  • Deployments duration keptn.deployments.duration as Histogram
  • Task duration keptn.task.duration as Histogram
  • App duration keptn.app.duration as Histogram
  • Number of deployments in execution keptn.deployments.active as UpDownCounter
  • Number of tasks in execution keptn.task.active as UpDownCounter
  • Number of apps in execution keptn.app.active as UpDownCounter

Each measurement should have the following dimensions reported as OTel Attribute:

  • KeptnApp
  • KeptnWorkload
  • KeptnInstance
  • KeptnVersion

For deployment metrics, also the following shall be set as dimensions:

  • namespace
  • status: failed | succeed

If a deployment fails, it would also be interesting to add the attribute phase to know when it failed: pre-checks, deployment, ...

For task metrics, the following shall also be set as dimensions (a recording sketch follows the list):

  • status: failed | succeed
  • phase: pre | post
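
A minimal sketch of recording the deployment metrics with the OTel Go metrics API; the metric names follow the list above, while the meter name, attribute keys (reused from the tracing issues), and wiring are illustrative:

package telemetry

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// recordDeployment increments keptn.deployment.count and records the
// deployment duration with the dimensions listed above.
func recordDeployment(ctx context.Context, app, workload, version, namespace, status string, seconds float64) error {
	meter := otel.Meter("keptn/operator")

	count, err := meter.Int64Counter("keptn.deployment.count")
	if err != nil {
		return err
	}
	duration, err := meter.Float64Histogram("keptn.deployments.duration")
	if err != nil {
		return err
	}

	attrs := metric.WithAttributes(
		attribute.String("keptn.deployment.app_name", app),
		attribute.String("keptn.deployment.workload", workload),
		attribute.String("keptn.deployment.version", version),
		attribute.String("namespace", namespace),
		attribute.String("status", status), // failed | succeed
	)
	count.Add(ctx, 1, attrs)
	duration.Record(ctx, seconds, attrs)
	return nil
}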

Acceptance Criteria / DoD

  • The Lifecycle Controller generates the aforementioned metrics

Resources
