
scaffolding's Introduction

Scaffolding

scaffolding's aim is to provide a framework for writing simple scripts to execute performance benchmarks, with a focus on keeping the process quick, flexible and simple.

The project is organized as follows:

  • ./toolkit: a Go package that automates simple tasks which would be too tedious or repetitive to script with other CLI tools.
  • ./scale-tests: collection of scripts for visualizing scale and performance data.
  • ./scripts: collection of bash scripts which implement commonly used/required functionality.
  • ./kustomize: collection of kustomize templates for applying commonly used manifests.
  • ./scenarios: scripts implementing benchmark runs within different scenarios.
  • ./cmapisrv-mock: a component which mocks the behavior of the Cilium Cluster Mesh API Server for scalability testing purposes.
  • ./egw-scale-utils: components for scale testing Cilium's Egress Gateway feature.

toolkit

collection of tools to assist in running performance benchmarks

Usage:
  toolkit [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  lazyget     get a thing so you don't have to
  ron         Run On Node
  verify      verify the state of things

Flags:
  -h, --help                help for toolkit
  -k, --kubeconfig string   path to kubeconfig for k8s-related commands
                            if not given will try the following (in order):
                            KUBECONFIG, ./kubeconfig, ~/.kube/config
  -v, --verbose             show debug logs

Use "toolkit [command] --help" for more information about a command.

The toolkit currently has the following subcommands:

  • lazyget, used for:
    • creating kind configurations (kind-config)
    • getting kind images based on kubernetes version (kind-image)
  • ron, used for:
    • running commands on nodes in a kubernetes cluster through the use of pods, with support for: mounting local files, creating a PVC for storing artifacts, auto-copying data out of the PVC, prefixing commands with nsenter, and automatic cleanup.
  • verify, used for:
    • verifying all pods and nodes have no failing conditions (k8s-ready)

For adding new subcommands, be sure to check out util.go, which has some commonly used utility functions ready to go.
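
For reference, here is a hedged sketch of typical invocations, assuming a built toolkit binary in the current directory; consult toolkit --help for the authoritative flags, as the exact argument syntax of each subcommand may differ from what is shown here:

./toolkit -k ./kubeconfig verify k8s-ready   # check that all pods and nodes have no failing conditions
./toolkit lazyget kind-config                # generate a kind cluster configuration
./toolkit lazyget kind-image                 # get a kind image based on a kubernetes version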

scripts

Most, if not all, of these scripts support passing -d as the first parameter, which causes the script to run set -x for verbose output:

if [ "${1}" == '-d' ]
then
    set -x
    shift 1
fi
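
For example, any of the scripts below can be traced by prepending -d (a trivial, hedged illustration; get_crane.sh is an arbitrary pick):

./scripts/get_crane.sh -d   # same download, but with set -x tracing enabled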

As a convention, the filenames of these scripts should be in snake case.

  • add_grafana_dashboard.sh: Download a Grafana dashboard from grafana.com based on an ID, and create a ConfigMap with its contents. This ConfigMap is then used as a patch to live-update the dashboards ConfigMap used by kustomize/grafana/dashboards.yaml, in order to add a dashboard to a live grafana instance. If -p is passed to the script, the live-updating behavior is suppressed, allowing just a ConfigMap containing the dashboard to be created. This is suitable for adding dashboards into kustomize/grafana/dashboards.
  • exec_with_registry.sh: Find a service with the labels app.kubernetes.io/part-of=scaffolding and app.kubernetes.io/name=registry, port-forward it to localhost on port 5000, execute a given command, then kill the port-forward. Useful for a (crane|docker|podman) push.
  • get_apiserver_url.sh: Look for a pod with a prefix of kube-apiserver in its name and return its IP and port in the format ip:port. Not very IPv6-friendly.
  • get_ciliumcli.sh: Download cilium-cli to current directory using instructions from the documentation.
  • get_cluster_cidr.sh: Find the cluster CIDR as passed to kubelets through the --cluster-cidr arg.
  • get_crane.sh: Download crane to the current directory using instructions from their documentation.
  • get_node_internal_ip.sh: Return the address for a node with the type InternalIP.
  • k8s_api_readyz.sh: Grab the current context's API server IP and CA data and make a curl request to /readyz?verbose=true to check if the API server is up. If the CA data cannot be determined, then use --insecure with curl to still allow for a request to go out.
  • retry.sh: Retry a given command, using a given delay in-between attempts. For example, retry.sh 5 echo hi will attempt to run echo hi every 5 seconds until success.
  • profile_node.sh: Profile a k8s node's userspace and kernelspace stacks, saving the output and generated FlameGraph as an artifact. Requires the k8s node has perf installed and the executing node has perl installed.
  • netperf.sh: Helper script for kicking off netperf tests, including support for parallel netperf instances. Results are output in CSV format.
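
Because each script is a small standalone building block, they compose well. A hedged sketch, following the descriptions above (retry.sh's delay-then-command interface is documented; the rest are simple downloads):

# poll the API server every 10 seconds until /readyz reports healthy
./scripts/retry.sh 10 ./scripts/k8s_api_readyz.sh

# fetch cilium-cli and crane into the current directory
./scripts/get_ciliumcli.sh
./scripts/get_crane.sh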

kustomize

This collection of kustomize templates is meant to be easy to reference in a kustomization.yaml for your needs. As an example, within a scenario's directory add:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/prometheus
- ../../kustomize/grafana

into a kustomization.yaml and execute kustomize build . | kubectl apply -f - and boom, you have prometheus and grafana. If you want to modify the deployment, just add patches. For instance, to upload node_cpu_seconds_total metrics to Grafana Cloud:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/prometheus
- ../../kustomize/grafana
patchesStrategicMerge:
- |-
    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: prometheus
      labels:
        app: prometheus
    spec:
      remoteWrite:
      - url: <MY_PROM_PUSH_URL>
        basicAuth:
          username:
            name: <MY_PROM_SECRET>
            key: username
          password:
            name: <MY_PROM_SECRET>
            key: password
        writeRelabelConfigs:
          - source_labels: 
              - "__name__"
            regex: "node_cpu_seconds_total"
            action: "keep"

Or to add a dashboard stored in ./my-cool-dashboard.json to grafana:

# my-cool-dashboard-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
data:
  my-cool-dashboard.json: |-
    <paste dashboard contents here>

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/prometheus
- ../../kustomize/grafana
patchesStrategicMerge:
- ./my-cool-dashboard-cm.yaml

By convention, each resource can be pinned to a node using NodeSelectors and role.scaffolding/<role>=true labels, which is useful when we want to dedicate a node to a certain resource, such as the netperf server. See below for specifics.
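
For example, to dedicate one node to monitoring and another to the netperf server, the corresponding labels (documented in the sections below) can be applied up front; the node names here are placeholders:

kubectl label node <monitoring-node> role.scaffolding/monitoring=true
kubectl label node <server-node> role.scaffolding/pod2pod-server=true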

prometheus

Deploys prometheus into the monitoring namespace onto any node labeled role.scaffolding/monitoring=true, accessible using the service named prometheus. This is done through the use of prometheus-operator.

Has a 'select-all' configured for ServiceMonitors.
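
A quick, hedged way to check that targets are being scraped is to port-forward the prometheus service and hit the Prometheus HTTP API (this assumes the service exposes Prometheus' default port 9090):

kubectl -n monitoring port-forward svc/prometheus 9090:9090 &
sleep 2   # give the port-forward a moment to establish
curl -s 'http://localhost:9090/api/v1/targets' | head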

grafana

Deploys grafana onto a node with the role.scaffolding/monitoring=true label into the monitoring namespace, accessible using the service named grafana.

By default, will use the prometheus deployment above as a datasource.

A ConfigMap named grafana-dashboards will have its keys mounted as files inside the grafana container at /var/lib/grafana/dashboards. To add a new dashboard, add its JSON to said ConfigMap with its filename as the key. The script at scripts/add_grafana_dashboard.sh can be used to help facilitate this process.
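
As a hedged manual alternative to the script, a new dashboard key can be merged into the live ConfigMap directly; the dashboard filename below is just an example:

# render the new key as a patch, then merge it into the live ConfigMap
kubectl -n monitoring create configmap grafana-dashboards \
  --from-file=my-cool-dashboard.json \
  --dry-run=client -o yaml > /tmp/dashboard-patch.yaml
kubectl -n monitoring patch configmap grafana-dashboards \
  --patch-file /tmp/dashboard-patch.yaml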

Three dashboards are provided out-of-the-box in kustomize/grafana/dashboards, applied as patches in kustomize/grafana/kustomization.yaml.

A dashboard provider is used to accomplish this. See the grafana docs for more information.

node-exporter

Deploys node-exporter in the monitoring namespace onto any node labeled with role.scaffolding/monitored=true. A ServiceMonitor is created to allow for collection from Prometheus. A dashboard for the exported metrics is included within the grafana kustomization (see above).

HostNetwork is enabled to allow for network metrics on the node to be collected.

monitoring-ns

Creates a namespace named 'monitoring'. Kustomize doesn't auto-create namespaces, so something needs to create the 'monitoring' ns for resources like prometheus and grafana (see above). By including this manifest, you won't have to manually create the namespace before a kustomize build . | kubectl apply -f -.

registry

Deploys an in-cluster registry in the namespace registry, available through the service named registry. This means the DNS name registry.registry.svc can be used as the URL for pushed images.

Crane is a great way to interact with this registry and can be downloaded using scripts/get_crane.sh. If you need to build a custom image and don't want to mess with pushing and downloading from a remote registry just to get it into your cluster, then this is the manifest for you!
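
A hedged sketch of that workflow: build an image locally, push it through the port-forward opened by exec_with_registry.sh, then reference it from in-cluster manifests. The image name is illustrative, and the assumption that exec_with_registry.sh takes the command as its arguments follows the description above:

./scripts/get_crane.sh
docker build -t my-bench:latest .
docker save my-bench:latest -o my-bench.tar
./scripts/exec_with_registry.sh ./crane push my-bench.tar localhost:5000/my-bench:latest
# pods can now reference the image as registry.registry.svc/my-bench:latest
# (append the registry service's port if it is not exposed on a default port)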

topologies

Sets up pod topologies for performance testing. Right now we just have one, pod2pod, and the intention is to override details of the deployment as needed within a kustomization.yaml. This is definitely subject to change, as there is probably a better way to do this which doesn't involve a lot of boilerplate.

topologies/pod2pod/base

Creates two pods for network performance testing, each via a Deployment with one replica and a NodeSelector:

  • pod2pod-client: Selects nodes with the label role.scaffolding/pod2pod-client=true.
  • pod2pod-server: Selects nodes with the label role.scaffolding/pod2pod-server=true.

Each of these deployments has a pod with a single container named main, using k8s.gcr.io/pause:3.1 as its image. To override the image for both deployments, you can use kustomize's images transformer:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/topologies/pod2pod/base
images:
- name: k8s.gcr.io/pause
  newName: <mycoolimage>

If you just want the server or the client, you can use the patchesStrategicMerge transformer as follows:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/topologies/pod2pod/base
patchesStrategicMerge:
- |-
  $patch: delete
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: pod2pod-client

topologies/pod2pod/overlays/pod

Uses pod2pod/base, but has a patch to ensure that each of the Deployments has one replica. Basically just an alias at this point.

topologies/pod2pod/overlays/service

Creates an incomplete Service that selects the pod2pod-server. You still need to fill in the service's spec with details about how you want it to function. For instance, if I want to:

  • Have pod2pod-server run httpd on port 80,
  • Expose it as a LoadBalancer service on port 80, and
  • Have pod2pod-client run an alpine container forever, for kubectl exec

I would write the following:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../kustomize/topologies/pod2pod/overlays/service
patchesStrategicMerge:
- |-
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: pod2pod-client
  spec:
    template:
      spec:
        containers:
          - name: main
            image: alpine
            command: ["sleep", "infinity"]
- |-
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: pod2pod-server
  spec:
    template:
      spec:
        containers:
          - name: main
            image: httpd
            ports:
            - containerPort: 80
              name: http
              protocol: TCP
- |-
  apiVersion: v1
  kind: Service
  metadata:
    name: pod2pod-server
  spec:
    type: LoadBalancer
    ports:
      - protocol: TCP
        port: 80
        targetPort: 80
        name: http
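
Once applied, the setup can be exercised from the client side; a hedged sketch (add -n <namespace> if the topology is not deployed in your current namespace):

# curl the server's Service from the client pod; busybox wget ships with alpine
kubectl exec deploy/pod2pod-client -- wget -qO- http://pod2pod-server/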

scenarios

Each sub-directory within the scenarios directory is meant to house resources for running any kind of performance test, using the resources within scaffolding. The idea here is that each directory has:

  • A main script for running the test(s),
  • A kustomization.yaml file for deploying the infrastructure needed for the test,
  • An artifacts directory where items produced from the test are kept,
  • A README.md describing what is going on,
  • And any other resources required.

scenarios/common.sh can be sourced as a helper; it provides common environment variables and functions (a usage sketch follows the lists below):

Environment variables:

  • SCENARIO_DIR: Absolute path to the directory of the current scenario (i.e., the cwd when common.sh is sourced)
  • ARTIFACTS: Absolute path to the scenario's artifacts directory
  • ROOT_DIR: Absolute path to the root of scaffolding
  • TOOLKIT: Absolute path to the toolkit sub-directory
  • SCRIPT: Absolute path to the scripts sub-directory
  • KUSTOMIZE: Absolute path to the kustomize sub-directory

These environment variables can be overridden if needed by setting them prior to calling the init function below.

Functions:

  • init(): Set the above environment variables, create ARTIFACTS directory, build toolkit if ARTIFACTS/toolkit does not exist.
  • init_print(): Print the above imported environment variables and functions.
  • reset_vars(): Reset the above imported environment variables.
  • build_toolkit(): Build a binary for toolkit and save it into the artifacts directory.
  • wait_ready(): Use scripts/retry.sh along with scripts/k8s_api_readyz.sh and the toolkit's verify k8s-ready command to wait until the k8s cluster is ready to go before proceeding. This is great to use after applying a built kustomize file or after provisioning a cluster.
  • wait_cilium_ready(): Call wait_ready, wait one minute for Cilium to show ready through cilium status, and then run a connectivity test. The connectivity test can be skipped by setting SKIP_CT to skip-ct.
  • breakpoint(): Wait to continue until some data comes in from STDIN (i.e., from a user).
  • env_var_or_die(): Check if the given variable is set, and if it isn't, exit with rc 1.
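
Putting it together, a hypothetical minimal scenario script might look like the sketch below; it assumes the scenario directory contains a kustomization.yaml, and that the functions behave as described above:

#!/usr/bin/env bash
set -eo pipefail

# pull in the helper environment variables and functions
source ../common.sh

init                                    # set SCENARIO_DIR, ARTIFACTS, etc. and build the toolkit if needed
init_print                              # show what was imported

kustomize build . | kubectl apply -f -  # deploy the scenario's manifests
wait_ready                              # block until the cluster settles

breakpoint                              # pause so results can be inspected before cleanup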

xdp

Demonstrates the positive CPU impact of XDP native acceleration and DSR on a load balancer. Requires three nodes: one for the load balancer, one for a netperf server, and one for Grafana and Prometheus.

Implemented within minikube for local development, but can easily be modified for other environments as needed.

Run kubectl port-forward -n monitoring svc/grafana 3000:3000 to view the node-exporter dashboard, which can be used to monitor the CPU usage of the load balancer node.

netperf regression testing

Performs latency and throughput regression testing between multiple versions of Cilium: netperf latency and throughput tests are executed on a specific version of Cilium, Cilium is then upgraded to a new version, and the tests are repeated.

Profiles will be taken on nodes. Tests are run pod-to-pod.

IPSec testing

Performs the same tests as above, with options for enabling encryption in Cilium: one can install Cilium with IPsec enabled, WireGuard enabled, or no encryption at all.

EGW masquerade delay

Executes a small-scale scalability test in a kind cluster to determine the amount of time it takes for traffic egressing from a workload pod to be masqueraded through an EGW node. Used for testing the components within the egw-scale-utils directory.

scaffolding's People

Contributors

giorio94, jtaleric, learnitall, marseel, renovate[bot], thorn3r, xmulligan


scaffolding's Issues

Running on Kind

Experiencing various issues when attempting to run on kind for local testing:

Tools node label

We don't have to worry about this in GKE, but the tools playbook may apply the scaffolding/app=tools label onto a control-plane node, which will prevent tools from being scheduled, since they don't have the node-role.kubernetes.io/master:NoSchedule toleration. I ran into this when working with a kind cluster, and had to manually apply the label and set apply_tool_label to false.

Worker node label

If no nodes in the cluster have the node-role.kubernetes.io/worker label, then the same issue may occur where a control-plane node gets accidentally labeled as a worker node.
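
A hedged manual workaround for both labels on a default kind cluster (node names follow kind's <cluster-name>-worker convention, so the name below is an assumption):

kubectl label node kind-worker node-role.kubernetes.io/worker=
kubectl label node kind-worker scaffolding/app=tools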

Metadata collection in benchmark role

This relates to the test_platform variable, set in the following lines in roles/benchmark/tasks/metadata.yml:

- name: Attempt to determine platform - GKE
  shell: |
    export KUBECONFIG={{kubeconfig}}
    kubectl get nodes $(kubectl get nodes -o custom-columns=:.metadata.name --no-headers) -o json | grep "gke.gcr.io"
  register: gke_check
  when: test_platform is undefined

- set_fact:
    test_platform: "gke"
  when: gke_check.rc == 0

If test_platform is undefined and the platform is not GKE, Attempt to determine platform - GKE will fail, causing the playbook to fail as well, since we do not include an ignore. If test_platform is defined and the platform is not GKE, then ansible will complain about gke_check not being set in the call to set_fact. Changing the when to test_platform is undefined and gke_check.rc == 0 got me through, but adding in an ignore statement might be useful as well.

Reduce Size of Dockerfile Image

There have been multiple attempts at this, but without much consistent success. From previous attempts, here is what we've learned:

  • The google-auth Python library can be used to get a GCP token for a SA
  • Creating a K8s SA is a great way to have a non-expiring token
  • The gcloud SDK is 1.0GB+
  • The Cilium CLI depends on gcloud (see 1d04b7f)

Version and feature matrix for testing

We should define our testing matrix:

GKE

Cilium Version   Kernel   Cilium Features
v1.12-rc         5.4
v1.12-rc         5.16
v1.12-rc         5.18
v1.11-latest     5.4
v1.11-latest     5.16
v1.10-latest     5.4
v1.10-latest     5.16

Similar to the above, I would like to get an idea of which Cilium features the performance work should iterate on initially.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

dockerfile
cmapisrv-mock/Dockerfile
  • docker.io/library/golang 1.23.0@sha256:613a108a4a4b1dfb6923305db791a19d088f77632317cfc3446825c54fb862cd
  • gcr.io/etcd-development/etcd v3.5.15@sha256:9a01b7da0a3cde485c03fcf58fef9b2a09c81b4926b2b7d7ae6d1e9b20a2a192
  • docker.io/library/alpine 3.20.2@sha256:0a4eaa0eecf5f8c050e5bba433f58c052be7587ee8af3e8b3910ef9ab5fbe9f5
github-actions
.github/workflows/build.yaml
  • actions/checkout v4.1.7@692973e3d937129bcbf40652eb9f2f61becf3332
  • dorny/paths-filter v3.0.2@de90cc6fb38fc0963ad72b210f1f284cd68cea36
  • docker/setup-buildx-action v3.6.1@988b5a0280414f521da01fcc63a27aeeb4b104db
  • docker/login-action v3.3.0@9780b0c442fbb1117ed29e0efdff1e18412f7567
  • actions/checkout v4.1.7@692973e3d937129bcbf40652eb9f2f61becf3332
  • docker/build-push-action v6.7.0@5cd11c3a4ced054e52742c5fd54dca954e0edd85
.github/workflows/lint.yaml
  • actions/setup-go v5.0.2@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32
  • actions/checkout v4.1.7@692973e3d937129bcbf40652eb9f2f61becf3332
  • golangci/golangci-lint-action v6.1.0@aaa42aa0628b4ae2578232a66b541047968fac86
gomod
cmapisrv-mock/go.mod
  • go 1.22.0
  • github.com/cilium/cilium v1.16.1
  • github.com/cilium/hive v0.0.0-20240816121742-535330fad6ce@535330fad6ce
  • github.com/dustinkirkland/golang-petname v0.0.0-20240428194347-eebcea082ee0@eebcea082ee0
  • github.com/sirupsen/logrus v1.9.3
  • github.com/spf13/cobra v1.8.1
  • github.com/spf13/pflag v1.0.6-0.20210604193023-d5e0c0615ace@d5e0c0615ace
  • golang.org/x/exp v0.0.0-20240808152545-0cdaa3abc0fa@0cdaa3abc0fa
  • golang.org/x/time v0.6.0
regex
.github/workflows/lint.yaml
  • go 1.23.0
  • golangci/golangci-lint v1.60.1


Add License Header to Source File

For applying Apache 2.0, we need to add a source header to each file as described in the appendix of the license. Adding an AUTHORS file would simplify things a bit and follow the pattern other Isovalent projects use. Next step from #51

Continuous profiling of CNI

We would like to profile each CNI. ParcaDev does this.

Cilium strips symbols, so we either need to build the images ourselves or have the Cilium image build system create the default Cilium images plus variants with the symbols intact.

Add Scope to Metadata

Add a new value, called 'scope', into the metadata saved in the archive directory and exported to ES; it is a label for the result. The goal is to help us sort through result data more easily.

Race condition in DeleteResourceAndWaitGone

Currently DeleteResourceAndWaitGone:

  • Opens a watch to wait for the deleted resource (in a separate goroutine)
  • Deletes the resource

This can cause a race condition where the resource is deleted but its deletion is never observed on the watch.
We should wait for the cache to sync before deleting the resource.
We can achieve this by adding a precondition that signals the sync has happened: precondition code

Comparing data-path results

Problem

We would like to compare different CNI code versions. However, comparing two separate test executions in the same region with the same instance sizes occasionally produces variable results.

To combat this, we can:

  • establish the hostNetwork performance and compare the pod network performance against it. When comparing versions, we look at the % difference from hostNetwork to pod network.
  • use a single cluster: capture results with version x.y, uninstall everything, upgrade to version x.z, and rerun the datapath tests.

Neither is perfect, and we want the tests to be reliable, so we need to test both out.

Metadata Collection Overlap in Benchmark Role

We have the metadata.yml task file, which collects metadata prior to running benchmarks. Some of the tasks within it are re-run within the datapath.yml or agent.yml playbooks. I think we should remove this overlap, so we don't have two competing methods of metadata collection.

Ex:

https://github.com/cilium/scaffolding/blob/main/roles/benchmarks/tasks/metadata.yml#L40=
https://github.com/cilium/scaffolding/blob/main/roles/benchmarks/tasks/datapath.yml#L23=

Both get the kernel version.

Kernel Playbook Idempotency

The kernel playbook has a hard time reaching a success state when running against a cluster that has already been used to perform a benchmark run. Sometimes a node's kubelet won't reach a success state in less than 10 minutes, causing GKE to auto-repair the node. It looks like we ran into https://github.com/jtaleric/tinker/tree/main/clouds/gke/kernel-swap#caveat.

We need to add functionality to the kernel playbook to clean up pods/resources in the cluster before performing the kernel swap and reboot, allowing the kubelet to come up in time. This would let us keep auto-repair enabled.

Three avenues:

  • Have the kernel playbook not run if the number of pods is over a certain amount, or if certain tool-associated namespaces are present.
  • Have the kernel playbook delete tool-associated namespaces, or all namespaces besides the default, before doing the swap. This assumes the tool playbook will run again and that we don't need any state within the tools before they are deleted.
  • Some combination of the above.

store status in elasticsearch

We should store the status of each run in ES. Today, results and metadata are stored there; however, different tools have different index structures.

One example: snafu (benchmark-wrapper) has a concept of a run_id, which allows the user to attach a descriptive string to each run, whereas kube-burner has no concept of a run_id. I am sure other tools we integrate will suffer the same issue. We could add run_id to kube-burner, but I feel it would be better for us to encode run-specific details in our own index.

{
  "timestamp": "date",
  "platform": "GKE,OpenShift,EKS,AKS",
  "k8version": "version",
  "kernel": "[ list of kernels? or just a single kernel, assuming all nodes run the same kernel ]",
  "benchmark": "what ran",
  "uuid": "id from kube-burner or benchmark-wrapper",
  "num_nodes": "number of nodes",
  "region": "what region or location the cluster was deployed in",
  "state": "pass/failed"
}

Simplify Condition Checking in toolkit

Background

In toolkit/k8s/verify.go, there's a function named CheckUnstructuredForReadyState, which takes in an unstructured kubernetes resource and checks whether it is in a good state. It does this by checking whether the resource's phase and conditions (and container statuses, if applicable) are all in a 'good' state. This gets a bit tricky, as each resource can have a different set of applicable phases and conditions. Additionally, for conditions it's not as simple as checking that they are all 'true', as some conditions are only considered 'good' when they are in a false state. For instance, Ready needs to be true, but DiskPressure needs to be false.

Problem

There are two main helper functions invoked by CheckUnstructuredForReadyState to check phases and conditions.

Both use a switch statement to check for known-goods and known-bads, and return an error if something unknown is given. This means that if we want to cover every resource available to kubernetes, we have to 'encode' all the relevant phase and condition info into these functions. This becomes a combinatorial explosion over time as more environments and resources need to be supported. Recently, support for GKE-specific NodeConditions was brought up: #60

How can these functions be simplified to avoid having to update them over time?
