kubernetes-conjur-deploy

This repository contains scripts for automating the deployment of Conjur followers to a Kubernetes or OpenShift environment. These scripts can also be used to deploy a full cluster with Master and Standbys for testing and demo purposes, but this is not recommended for a production deployment of Conjur.

This repo supports CyberArk Conjur Enterprise (formerly DAP) v10+. To deploy Conjur Open Source, please use the Conjur Open Source helm chart.


Setup

The Conjur deployment scripts pick up configuration details from local environment variables. The setup instructions below walk you through the necessary steps for configuring your environment and show you which variables need to be set before deploying.

All environment variables can be set/defined with:

  • bootstrap.env file if deploying the follower to Kubernetes or OpenShift
  • dev-bootstrap.env for all other configurations.

Edit the values per the instructions below, source the appropriate file, and run 0_check_dependencies.sh to verify.
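For example, when deploying a follower, a typical flow from the repository root might look like this (assuming you have already edited bootstrap.env):

# Load the configuration and verify that required tools and variables are present
source ./bootstrap.env
./0_check_dependencies.sh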

The Conjur appliance image can be loaded with _load_conjur_tarfile.sh. The script uses environment variables to locate the tarfile image and to determine the tag to apply once it is loaded.

Conjur Configuration

Appliance Image

You need to obtain a Docker image of the Conjur appliance and push it to an accessible Docker registry. Provide the image and tag like so:

export CONJUR_APPLIANCE_IMAGE=<tagged-docker-appliance-image>

You will also need to provide an ID for the Conjur authenticator that will later be used in Conjur policy to provide your apps with access to secrets through Conjur:

export AUTHENTICATOR_ID=<authenticator-id>

This ID should describe the cluster in which Conjur resides. For example, if you're hosting your dev environment on GKE you might use gke/dev.

Follower Seed

You will need to provide a follower seed file generated from your Conjur Master. The seed can be generated by SSH-ing into your Master and running the commands below.

NOTE: If you are running this code to deploy a follower that will run in a separate cluster from the master, you must force-generate the follower certificate manually before creating the seed; otherwise the resulting certificate will omit the in-cluster subject altname that the follower needs.

To generate a follower seed with the appropriate in-cluster subject altname for followers that are not in the same cluster as the master, first issue a certificate on the master. Skip this step if the master is colocated with the follower.

$ evoke ca issue --force <follower_external_fqdn> conjur-follower.<conjur_namespace_name>.svc.cluster.local

We now need to create the seed archive with the proper information:

$ evoke seed follower <follower_external_fqdn> > /tmp/follower-seed.tar

If you are on the same node as the master container, you can also export the seed with:

$ sudo docker exec <container_id> evoke seed follower <follower_external_fqdn> > /tmp/follower-seed.tar

Note: the exported seed file will not be copied to the host properly if you use the -t flag with the docker exec command.

Copy the resulting seed file from the Conjur master to your local filesystem and set the following environment variable to point to it:

export FOLLOWER_SEED=path/to/follower/seed/file
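For example, copying the seed from the master host over SSH might look like the following (the hostname and paths are placeholders):

# Copy the seed from the Conjur master host to the local machine
scp user@conjur-master-host:/tmp/follower-seed.tar ./follower-seed.tar
export FOLLOWER_SEED="$(pwd)/follower-seed.tar"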

The deploy scripts will copy the seed to your follower pods and use it to configure them as Conjur followers.

Important note: Make sure to delete any copies of the seed after use. It contains sensitive information and can always be regenerated on the Master.

Platform Configuration

If you are working with OpenShift, you will need to set:

export PLATFORM=openshift
export OSHIFT_CLUSTER_ADMIN_USERNAME=<name-of-cluster-admin> # system:admin in minishift
export OSHIFT_CONJUR_ADMIN_USERNAME=<name-of-conjur-namespace-admin> # developer in minishift

Otherwise, the $PLATFORM variable will default to kubernetes.

Before deploying Conjur, you must first make sure that you are connected to your chosen platform with a user that has the cluster-admin role. The user must be able to create namespaces and cluster roles.
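A quick sanity check for the required permissions (shown here with kubectl; use oc for OpenShift) might be:

# Both commands should print "yes" for a user with sufficient privileges
kubectl auth can-i create namespaces
kubectl auth can-i create clusterroles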

Conjur Namespace

Provide the name of a namespace in which to deploy Conjur:

export CONJUR_NAMESPACE_NAME=<my-namespace>

The conjur-authenticator Cluster Role

Conjur's Kubernetes authenticator requires the following privileges:

  • ["get", "list"] on "pods" for confirming a pod's namespace membership
  • ["create", "get"] on "pods/exec" for injecting a certificate into a pod

The deploy scripts include a manifest that defines the conjur-authenticator cluster role, which grants these privileges. Create the role now (note that your user will need to have the cluster-admin role to do so):

# Kubernetes
kubectl apply -f ./kubernetes/conjur-authenticator-role.yaml

# OpenShift
oc apply -f ./openshift/conjur-authenticator-role.yaml
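For reference, a minimal sketch of such a ClusterRole (the actual manifest in this repo may differ in name and details) looks like:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: conjur-authenticator
rules:
  # Confirm a pod's namespace membership
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  # Inject a certificate into a pod
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create", "get"]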

Docker Configuration

Install Docker version 17.05 or higher on your local machine if you do not already have it.

Kubernetes

You will need to provide the domain and any additional pathing for the Docker registry from which your Kubernetes cluster pulls images:

export DOCKER_REGISTRY_URL=<registry-domain>
export DOCKER_REGISTRY_PATH=<registry-domain>/<additional-pathing>

Note that the deploy scripts will push images to this registry, so you will need push access.

If you are using a private registry, you will also need to provide login credentials that are used by the deployment scripts to create a secret for pulling images:

export DOCKER_USERNAME=<your-username>
export DOCKER_PASSWORD=<your-password>
export DOCKER_EMAIL=<your-email>

Please make sure that you are logged in to the registry before deploying.
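For example, assuming the variables above are set, logging in might look like:

# Log in to the registry non-interactively using the credentials above
echo "$DOCKER_PASSWORD" | docker login "$DOCKER_REGISTRY_URL" -u "$DOCKER_USERNAME" --password-stdin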

Running Kubernetes locally

You can now deploy a local development environment for Kubernetes using Docker Desktop. See our contributing guide (CONTRIBUTING.md) to learn how!

OpenShift

If you are using OpenShift, make sure the integrated Docker registry in your OpenShift environment is available and that you have added it as an insecure registry in your local Docker engine. You must then specify the path to the OpenShift registry like so:

export DOCKER_REGISTRY_PATH=docker-registry-<registry-namespace>.<routing-domain>

Please make sure that you are logged in to the registry before deploying.
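One common way to log in to the integrated OpenShift registry is to reuse your oc session token (a sketch; your registry route may differ):

# Authenticate Docker against the integrated registry using the current OpenShift session token
docker login -u "$(oc whoami)" -p "$(oc whoami -t)" "$DOCKER_REGISTRY_PATH"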

Running OpenShift in Minishift

You can use Minishift to run OpenShift locally in a single-node cluster. Minishift provides a convenient way to test out Conjur deployments on a laptop or local machine and also provides an integrated Docker daemon from which to stage and push images into the OpenShift registry. The ./openshift subdirectory contains two files:

  • minishift.env that defines environment variables to configure Minishift, and
  • minishift-start.sh to start up Minishift. The script assumes VirtualBox as the hypervisor, but others are supported. See https://github.com/minishift/minishift for more information.

Steps to start up Minishift:

  1. Ensure VirtualBox is installed
  2. cd openshift
  3. Run ./minishift-start.sh
  4. source minishift.env to gain use of the internal Docker daemon
  5. cd ..
  6. Use dev-bootstrap.env for your variable configuration
  7. Run ./start

Usage

Deploying Conjur Follower

Ensure that bootstrap.env has the FOLLOWER_SEED variable set to the path of the seed file created above, or to a URL of the seed service.

If master key encryption is used in the cluster, CONJUR_DATA_KEY must be set to the path of a file that contains the encryption key to use when configuring the follower.

By default, the follower will store all data within the container. If FOLLOWER_USE_VOLUMES is set to true, the follower will use host volumes (not persistent volumes) for /var/log/conjur, /var/log/nginx and /var/lib/postgresql/10.

After verifying this setting, source ./bootstrap.env and then run ./start to execute the scripts necessary to have the follower deployed in your environment.
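For example, from the repository root:

# bootstrap.env should already define FOLLOWER_SEED (and CONJUR_DATA_KEY / FOLLOWER_USE_VOLUMES if needed)
source ./bootstrap.env
./start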


Test App Demo

The kubernetes-conjur-demo repo deploys test applications that retrieve secrets from Conjur and serves as a useful reference when setting up your own applications to integrate with Conjur.

Conjur cluster load balancer

If a Conjur cluster is deployed within the Kubernetes/OpenShift cluster, an external-facing load balancer is not deployed with it by default. For convenience, (kubernetes|openshift)/conjur-cluster-ext-service.yaml is provided for creating and mapping the load balancer. It can be applied with:

kubectl create -f "./$PLATFORM/conjur-cluster-ext-service.yaml"

Contributing

We welcome contributions of all kinds to this repository. For instructions on how to get started and descriptions of our development workflows, please see our contributing guide.

License

This repository is licensed under Apache License 2.0 - see LICENSE for more details.


kubernetes-conjur-deploy's Issues

developer user needs access to internal registry

To push images to the OpenShift registry, the system:admin user needs to grant access to the developer user. That's not in the deployment scripts. I wrote this; it should probably go in the 1_create_conjur_namespace.sh script.
This may grant excessive privileges so some scrutiny/testing is warranted.

$ cat grant-user-access.sh
#!/bin/bash
if [[ $# -ne 1 ]]; then
  echo "Provide name of developer user..."
  exit -1
fi
USER=$1
oc adm policy add-role-to-user system:registry $USER
oc adm policy add-role-to-user system:image-builder $USER

oc adm policy add-role-to-user admin developer -n default
oc adm policy add-role-to-user admin developer -n $CONJUR_NAMESPACE_NAME

Deploy can be performed w/ non-cluster admin user

It would be great if we could rework our deploy scripts to work with a non-cluster-admin user, since requiring cluster admin may not be a very realistic use case. We will need to figure out which steps require cluster-admin privileges and document them as prerequisites. At the very least this will likely include:

  1. loading RBAC YAML for the cluster role / role-binding
  2. enabling docker images to run as root in OpenShift (see here)

There may be more, will need a spike on this.

Deploy scripts leverage seed service workflow

As an OpenShift Operator, I want to deploy followers without maintaining a seed file, so that I can easily deploy and scale followers.

GIVEN Conjur configured to authenticate followers
WHEN I run the script
THEN followers are deployed to OpenShift

Blocked by conjurinc/appliance#746

Remove 3.3 OpenShift testing

We have decided it is no longer worth testing on the antiquated OpenShift 3.3. The Jenkinsfile configuration for 3.3 is commented out but we might as well remove it.

Image pull backoff in OC4.5 because image url is external when using start.sh

Summary

When deploying a follower to OpenShift 4.5 via start.sh, the follower deployment is created referring to the seedfetcher image by its external URL. This causes the follower pod to fail to start, as it cannot pull the image for its first init container.

Detail

The following logic determines if an internal or external url is generated:

    if ! [ -z ${TEST_PLATFORM+x} ] && [[ $TEST_PLATFORM =~ ^openshift4 ]] && [[ "$internal" == "true" ]]; then
      echo "image-registry.openshift-image-registry.svc:5000/$CONJUR_NAMESPACE_NAME/$1:$CONJUR_NAMESPACE_NAME"
    else
      echo "$DOCKER_REGISTRY_PATH/$CONJUR_NAMESPACE_NAME/$1:$CONJUR_NAMESPACE_NAME"
    fi

From: https://github.com/cyberark/kubernetes-conjur-deploy/blob/master/utils.sh#L48

TEST_PLATFORM is only set in test.sh, and test.sh is only called from the Jenkinsfile. So when a user's entrypoint is start.sh, an internal URL will not be generated.

Workaround

Add the following to bootstrap.env when using OpenShift 4.x:

TEST_PLATFORM=openshift4.5

add script to delete deployments

When bringing up a new cluster, one often needs to delete the one that isn't working. I hacked this from 3_deploy_conjur_cluster.sh and 4_create_load_balancer.sh:

$ cat delete-deployments.sh
#!/bin/bash
#set -eo pipefail

. utils.sh

announce "Deleting Conjur cluster."

set_namespace $CONJUR_NAMESPACE_NAME

conjur_appliance_image=$(platform_image "conjur-appliance")

echo "deleting main cluster"
if $cli get statefulset &>/dev/null; then # this returns non-0 if platform doesn't support statefulset
  conjur_cluster_template="./$PLATFORM/conjur-cluster-stateful.yaml"
else
  conjur_cluster_template="./$PLATFORM/conjur-cluster.yaml"
fi
sed -e "s#{{ CONJUR_APPLIANCE_IMAGE }}#$conjur_appliance_image#g" $conjur_cluster_template |
  $cli delete -f -

echo "deleting load balancer"
docker_image=$(platform_image haproxy)

sed -e "s#{{ DOCKER_IMAGE }}#$docker_image#g" "./$PLATFORM/haproxy-conjur-master.yaml" |
  $cli delete -f -

echo "deleting followers"
sed -e "s#{{ CONJUR_APPLIANCE_IMAGE }}#$conjur_appliance_image#g" "./$PLATFORM/conjur-follower.yaml" |
  sed -e "s#{{ AUTHENTICATOR_ID }}#$AUTHENTICATOR_ID#g" |
  $cli delete -f -

sleep 10

echo "Waiting for Conjur pods to terminate..."
conjur_pod_count=0
wait_for_it 300 "$cli describe po conjur-cluster | grep Status: | grep -c Running | grep -q $conjur_pod_count"

echo "Cluster deleted."

Update Postgres Version to `10.14`

The example Conjur OSS deployment uses an EOL version of Postgres 9.4. We want to update this to match the expected OSS version (10.14).

Align deploy-oss branch with master

The cyberark--kubernetes-conjur-deploy job runs as part of cyberark--secrets-provider-for-k8s, but today we are working with a different branch (deploy-oss) for it rather than with master.
We need to understand whether we want to run it for OSS as well (today it runs for DAP only).
Implementing the helm chart will take a while, so for now we want to align the deploy repo with master, since we saw it can give us great value.
We will take the helm chart as a separate task; a question has already been raised on Slack for this.
DOD:

  • conjur-deploy master keeps working as it does today + also runs against an OSS environment
  • Verify secrets-provider-for-k8s is working
  • Verify no change in documentation
  • The default behavior is running DAP tests

Tests also run against Conjur OSS

At this point we run the tests only with DAP. We should add a user input for OSS and run the tests with it, so we don't catch errors only in the consuming projects.

Merge deploy-oss branch with master

Today we are working with a specific branch because of the OSS implementation. We want to merge it with master.
**Important:**
The default needs to be DAP. Make sure the main flow wasn't changed; please see the discussion here.

DOD:

  • k8s-conjur-deploy passes
  • secrets provider is updated and working with k8s-conjur-deploy master without tags.

Repo needs a CHANGELOG

Since most other CYBR repos have CHANGELOGs, this one could use one too.

AC:

  • CHANGELOG is present in the repo
  • CHANGELOG follows keepachangelog format
  • CHANGELOG is automatically verified by build steps

Increase the timeout waiting for Conjur pods to launch

Since we are working on a different tag (deploy-oss-tag) for secrets-provider-for-k8s, we need to change the timeout for Conjur pods to launch. This fix is already in master; until we are aligned with master, we need to add it under this tag as well.

configure nodePort for haproxy to provide external access to Conjur Master service

hostname:port syntax is the expected format for the Conjur Master service. Adding a nodePort to the haproxy provides external access via a fixed port number. I like 30443 since it's easy to remember:

$ cat haproxy-conjur-master.yaml

apiVersion: v1
kind: Service
metadata:
  name: conjur-master
  labels:
    app: haproxy-conjur-master
spec:
  ports:
  - port: 443
    name: https
    nodePort: 30443

Restore role follower label to eliminate failures in kubernetes-conjur-demo

The conjurdemos/kubernetes-conjur-demo repo CI is currently failing on that repo's master. Here's a recent failure:
https://jenkins.conjur.net/job/conjurdemos--kubernetes-conjur-demo/job/master/373/
The console output from one test case shows this error:

09:54:10  + ./4_store_conjur_cert.sh
09:54:10  ++++++++++++++++++++++++++++++++++++++
09:54:10  
09:54:10  Storing Conjur cert for test app configuration.
09:54:10  
09:54:10  ++++++++++++++++++++++++++++++++++++++
09:54:10  Retrieving Conjur certificate.
09:54:10  No resources found.
09:54:11  Value added
09:54:11  Value added
09:54:11  No resources found.
09:54:11  error: expected 'exec POD_NAME COMMAND [ARG1] [ARG2] ... [ARGN]'.
09:54:11  POD_NAME and COMMAND are required arguments for the exec command
09:54:11  See 'oc exec -h' for help and examples.

The problem stems from this section of the deploy script 4_store_conjur_cert.sh:

if $cli get pods --selector role=follower --no-headers; then
  follower_pod_name=$($cli get pods --selector role=follower --no-headers | awk '{ print $1 }' | head -1)
  ssl_cert=$($cli exec $follower_pod_name -- cat /opt/conjur/etc/ssl/conjur.pem)
else
  echo "Regular follower not found. Trying to assume a decomposed follower..."
  follower_pod_name=$($cli get pods --selector role=decomposed-follower --no-headers | awk '{ print $1 }' | head -1)
  ssl_cert=$($cli exec -c "nginx" $follower_pod_name -- cat /opt/conjur/etc/ssl/cert/tls.crt)
fi

Apparently, we need the role: follower (and possibly the name: conjur-follower) pod template labels added back in for the demo scripts to be able to find the follower based on the role: follower label. It should look like this:

 template:
    metadata:
      labels:
        app: conjur-follower
        name: conjur-follower
        role: follower

Deploy DAP by default for backwards compatibility

We changed the default behaviour of this repo so that it deploys OSS by default. That change was reverted and we need to do it again. We have the deploy-oss branch restored and can continue the work from there.

We need to:

  • Deploy DAP by default and OSS with the --oss flag
  • Decide what the default value will be in kubernetes-conjur-demo & secrets-provider-for-k8s
  • Challenge the way we deploy the OSS cluster. It may be different than the way described in the docs
  • Challenge the way we handle different deployments
    • Currently both deployments are handled in the same scripts, where the differences are handled by if [[ CONJUR_DEPLOYMENT == "oss" ]] clauses. It may be more readable to have 2 sets of scripts with some duplication.
  • Unify the way we update files with text. We have some sed and some sh.yml replacements.
  • Wait for Conjur server to be ready when configuring Conjur CLI. At this point we just sleep for 45 seconds. We should wait for it properly.
  • Add tests for OSS - at this point we have tests in this repo for DAP. We should have the same for OSS so we don't catch errors only in the consuming projects
  • Update kubernetes-conjur-demo & secrets-provider-for-k8s to consume kubernetes-conjur-deploy properly and run both OSS & DAP tests.

Deploy Scripts don't work on current Minishift Version

The deploy scripts fail in 2_prepare_docker_images.sh:
https://github.com/cyberark/kubernetes-conjur-deploy/blob/master/2_prepare_docker_images.sh#L66-L79

This is because the Seed Fetcher Dockerfile uses a multi-stage build, which is not supported in the Docker version included with Minishift:

FROM cyberark/conjur-kubernetes-authenticator:latest as authenticator

See also: minishift/minishift#3278

DAP 6.2.1 cluster does not start successfully [Potential issue]

We've gotten a number of internal reports that this deploy works fine with the DAP 6.2.0 version but fails with the 6.2.1 version (the latest release). Examples include:

I'd like us to investigate to better understand the issue. Please:

  1. Try to replicate the problem (deploy a 6.2.1 cluster to our OCP environment, and validate it's functioning as intended)
  2. Assuming we can replicate the problem, try to resolve the issue.

Automation failed to finish successfully on oc311

Since 26.8, oc311 has been failing during the stop command after deleting the namespace.

Example:

[2020-08-26T14:40:47.840Z] Using project "conjur-deploy-test-eed4a7bc-b".
[2020-08-26T14:40:48.410Z] namespace "conjur-deploy-test-eed4a7bc-b" deleted
[2020-08-26T20:19:26.210Z] Sending interrupt signal to process
[2020-08-26T20:19:27.485Z] Sending interrupt signal to process
[2020-08-26T20:19:38.242Z] script returned exit code 143

DOD:

  • Verify this repo runs successfully

  • Verify the secrets-provider repo runs successfully

Unbound variable error on stop

When running integration tests in the secrets-provider-for-k8s, a Conjur environment is deployed in K8s. When an error occurs or the tests have run to completion, the environment is stopped which removes the conjur environment and any created namespaces and roles.

Cleanup cannot run to completion due to the following error:
../kubernetes-conjur-deploy-94a25c3c-0/stop: line 8: cli: unbound variable

The error points to here:
$cli delete namespace $CONJUR_NAMESPACE_NAME

Therefore, simply initializing cli (e.g. export cli=) in the code below should do the trick:

if [ $PLATFORM = 'kubernetes' ]; then
    cli=kubectl
elif [ $PLATFORM = 'openshift' ]; then
    cli=oc
fi
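
A minimal sketch of the suggested fix, initializing cli with a default so that set -u does not trip on it:

cli=""   # default value so the variable is always bound, even on unexpected platforms
if [ "$PLATFORM" = 'kubernetes' ]; then
    cli=kubectl
elif [ "$PLATFORM" = 'openshift' ]; then
    cli=oc
fi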

If we do not make this addition, the OC cluster will become overloaded.

CLI container doesn't work and returns 502 bad gateway

If you try to use the CLI container that is deployed with these scripts, you get an error like

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
error: 502 Bad Gateway

I have seen this with DAP clusters deployed both inside and outside of OpenShift 3.9.

Add CONJUR_VERSION to test app authenticator vars

We added the CONJUR_VERSION variable to the conjur-kubernetes-authenticator and it defaults to '5'. To prevent the v4 scripts from failing, we will need to set it to '4' in the test app manifests.
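In a test app manifest, that would be a container env entry along these lines (a sketch; surrounding context is illustrative):

env:
  - name: CONJUR_VERSION
    value: "4"   # defaults to '5'; set to '4' for v4 appliances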

Numbered scripts are easier to follow

We should clean up the numbered scripts (at the very least) so that they're easier to read and modify. Break them up into functions and do a pass on naming to make things more clear.

Auto Enrollment Secrets Add should be part of the script

All of the steps described here for adding secrets from K8s/OpenShift to the DAP master:
https://docs.cyberark.com/Product-Doc/OnlineHelp/AAM-DAP/Latest/en/Content/Integrations/ConjurDeployFollowers.htm?tocpath=Setup%7CConfigure%20DAP%20Followers%7C_____2#ConfigureDAPforautoenrollmentofFollowers

should be done as part of the script and not manually by the user.

Motivation:
Currently, a user who runs KCD to deploy just a follower against an existing DAP master must store K8s/OCP tokens on the master node so that the master and follower can authenticate each other and start replicating. Even though these are just commands the user needs to run, adding them to the script could be a good idea, so that a user would simply clone the repo, configure their env variables in bootstrap, and run it.

In addition, if STOP_RUNNING_ENV=TRUE and the user runs the script a second time against a cluster with the same K8s namespace as before, the script will overwrite the secrets in K8s/OCP; the secret in the DAP master will then differ from the follower's K8s secret, so communication will fail. If the script stored any newly created secret in the DAP master, users of the script would not face this problem.

Kubernetes Conjur Deploy project uses published seed fetcher image

As an OpenShift operator who needs to configure a DAP follower, I want a simple example of how to use the seed fetcher to bootstrap a follower, so that I can do the same in my environment.

GIVEN I have the cyberark/kubernetes-conjur-deploy repository cloned on my local machine
WHEN I follow the instructions in the project Readme
THEN the script uses the cyberark/dap-seed-fetcher image rather than building it locally

Migrate secrets provider repo to use the helm chart for deploying conjur in automation

We need to add support for OSS with the helm chart.
@izgeri has an e2e test using the helm chart and secretless here (in a GitHub Action): https://github.com/cyberark/conjur-oss-suite-release/blob/master/.github/workflows/e2e-tests.yml

For now our secrets provider repo will continue to work off of a branch of k8s-deploy.

DOD:

  • Working with conjur-deploy master for both DAP/OSS
  • Verify secrets-provider-for-k8s is working
  • Update documentation

Jenkins tests for OC 4.3 sometimes hang on namespace deletion

Summary

The Jenkins pipeline test for OpenShift 4.3 sometimes hangs when the scripts try to clean up
the Conjur namespace. The console logs for this failure look like this:

[2021-01-25T14:39:25.466Z] Logging in as cluster admin...
[2021-01-25T14:39:25.730Z] Login successful.
[2021-01-25T14:39:25.730Z] 
[2021-01-25T14:39:25.730Z] You have access to 63 projects, the list has been suppressed. You can list all projects with 'oc projects'
[2021-01-25T14:39:25.730Z] 
[2021-01-25T14:39:25.730Z] Using project "conjur-deploy-test-03eb13bd-9".
[2021-01-25T14:39:25.730Z] + set_namespace default
[2021-01-25T14:39:25.730Z] + [[ 1 != 1 ]]
[2021-01-25T14:39:25.730Z] ++ oc config current-context
[2021-01-25T14:39:25.730Z] + oc config set-context conjur-deploy-test-03eb13bd-9/api-openshift-43-itci-conjur-net:6443/kube:admin --namespace=default
[2021-01-25T14:39:25.992Z] + has_namespace conjur-deploy-test-03eb13bd-9
[2021-01-25T14:39:25.993Z] + oc get namespace conjur-deploy-test-03eb13bd-9
[2021-01-25T14:39:25.993Z] + true
[2021-01-25T14:39:25.993Z] + oc delete namespace conjur-deploy-test-03eb13bd-9
[2021-01-25T14:39:26.253Z] namespace "conjur-deploy-test-03eb13bd-9" deleted

           < = = = HANGING HERE = = = >

This does not happen for other versions of OpenShift.

This appears to be related to this OC 4.3 bug: https://bugzilla.redhat.com/show_bug.cgi?id=1798282
and can likely be fixed by deleting finalizers on Kubernetes services before deleting the corresponding namespace.

Steps to Reproduce

Look at an example failure log here:
https://jenkins.conjur.net/blue/organizations/jenkins/cyberark--kubernetes-conjur-deploy/detail/master/658/pipeline/22

Expected Results

Tests should not hang for namespace deletion.

Actual Results (including error logs, if applicable)

See description above.

Reproducible

  • Always
  • Sometimes
  • Non-Reproducible

Version/Tag number

Latest master.

Environment setup

OpenShift 4.3.

Additional Information

This appears to be related to this OC 4.3 bug: https://bugzilla.redhat.com/show_bug.cgi?id=1798282
and can likely be fixed by deleting finalizers on Kubernetes services before deleting the corresponding namespace.
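A possible workaround sketch for the cleanup scripts (untested; resource handling is illustrative) would be to clear service finalizers before deleting the namespace:

# Clear finalizers on all services in the Conjur namespace, then delete the namespace
for svc in $(oc get svc -n "$CONJUR_NAMESPACE_NAME" -o name); do
  oc patch "$svc" -n "$CONJUR_NAMESPACE_NAME" --type=merge -p '{"metadata":{"finalizers":[]}}'
done
oc delete namespace "$CONJUR_NAMESPACE_NAME"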

Invalid conjur version number

We should validate that the Conjur version is 4 or 5 rather than parsing the arbitrary image name to infer the version. Currently this section fails because our image name is cyberark/dap:11.2.1.

# check if CONJUR_VERSION is consistent with CONJUR_APPLIANCE_IMAGE
appliance_tag=${CONJUR_APPLIANCE_IMAGE//[A-Za-z.]*:/}
appliance_version=${appliance_tag//[.-][0-9A-Za-z.-]*/}
if [ "${appliance_version}" != "$CONJUR_VERSION" ]; then
  echo "ERROR! Your appliance does not match the specified Conjur version."
  exit 1
fi
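
A direct validation of CONJUR_VERSION (a sketch of the proposed change) could instead be:

# Validate the version value itself rather than parsing the image tag
if [ "$CONJUR_VERSION" != "4" ] && [ "$CONJUR_VERSION" != "5" ]; then
  echo "ERROR! CONJUR_VERSION must be 4 or 5."
  exit 1
fi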

RBAC Role for OpenShift should allow get/list for DeploymentConfigs

The current permissions that are included in openshift/conjur-authenticator-role.yaml do not include
get and list permissions for DeploymentConfig resources in OpenShift. These permissions are
required in order to allow users to use authn-k8s authentication that is based on DeploymentConfig
application identity (e.g. Conjur policies that include DeploymentConfig identity as an annotation).
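A sketch of the additional rule (the API group may need adjusting for older OpenShift versions) would be:

  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["get", "list"]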

Followers are configured to be distributed across nodes

As an OpenShift operator, I want followers distributed across nodes, so that I don't lose all followers if there is a hardware failure.

GIVEN an OpenShift cluster
WHEN I deploy followers into the cluster
THEN followers are spread across available nodes
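One way to achieve this in the follower pod spec is a preferred pod anti-affinity rule (a sketch; the weight and labels are illustrative):

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: conjur-follower
          topologyKey: kubernetes.io/hostname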

Standby and follower seeding encrypts key files

When seeding standbys and followers in production it's best to encrypt the key files on the master before generating seeds so that the keys from the seed are encrypted when copied to the standby / follower. The encrypted keys can be added to the keyring before configuration by adding evoke keys exec to the configuration command, i.e.:

evoke keys exec -m /conjur_files/master-key -- evoke configure standby

We should update the demo to follow this best practice.

Sample manifests include readiness probes

Our application manifests for Kubernetes and OpenShift do not include Probes to determine if the follower is healthy, and thus available to receive requests. For an operator, it's critical that followers are healthy to avoid traffic being routed to followers which cannot service requests.

GIVEN a Kubernetes or OpenShift cluster
WHEN I configure a follower using the script
AND I scale followers up by one (bringing up an un-configured follower)
THEN request traffic is not routed to the unhealthy follower

Dev Notes
Health checks in K8s/OC can be configured using Readiness Probes. Probes are part of the Pod spec. Adding a probe to a deployment should look something like:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: conjur-follower
spec:
  replicas: {{ CONJUR_FOLLOWER_COUNT }}
  template:
    metadata:
      labels:
        app: conjur-follower
        name: conjur-follower
        role: follower
    spec:
      serviceAccountName: conjur-cluster
      containers:
      - name: conjur-appliance
        image: {{ CONJUR_APPLIANCE_IMAGE }}
        imagePullPolicy: {{ IMAGE_PULL_POLICY }}
        env:
          - name: CONJUR_AUTHENTICATORS
            value: authn-k8s/{{ AUTHENTICATOR_ID }}
        ports:
        - containerPort: 443
          name: https
        - containerPort: 636
          name: ldaps
        - containerPort: 5432
          name: pg-main
        - containerPort: 5433
          name: pg-audit
        readinessProbe:
          httpGet:
            path: /health
            port: 443
          initialDelaySeconds: 15
          timeoutSeconds: 1
      imagePullSecrets:
        - name: dockerpullsecret

Note the readinessProbe pointing to the health endpoint.
