DOKS Solution Blueprints
DigitalOcean Kubernetes (DOKS) Solution Blueprints
The README specifies:
You need to make sure to NOT ADD static routes containing CIDRs which overlap with DigitalOcean REST API endpoints (including DOKS)! Doing so will affect DOKS cluster functionality (Kubelets) and/or other internal services (e.g. Crossplane).
What is the CIDR for the DigitalOcean REST API? The internal services one, I would imagine, is my VPC IP range, which I need to exclude.
Overview
Because Kubernetes is so popular nowadays, security plays a vital role. The main idea of the DOKS Supply Chain Security blueprint is to provide a starting point for developers to set up a CI/CD pipeline with integrated vulnerability scanning support. The main topics and ideas discussed revolve around supply chain security in the Kubernetes ecosystem.
In terms of tooling, we focus on Kubescape and Snyk, with a separate guide describing each tool. The accompanying examples show the user how to create a standard CI/CD workflow using GitHub Actions.
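As an illustration of the kind of workflow the guides build up to, a minimal GitHub Actions job that scans Kubernetes manifests with Kubescape might look roughly like this (a sketch only; the install method and scan target are assumptions, not the blueprint's exact workflow):

```yaml
name: supply-chain-scan
on: push

jobs:
  kubescape-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install the Kubescape CLI (install script URL is an assumption)
      - name: Install Kubescape
        run: curl -s https://raw.githubusercontent.com/kubescape/kubescape/master/install.sh | /bin/bash
      # Scan the Kubernetes manifests in the repository for misconfigurations
      - name: Scan manifests
        run: kubescape scan .
```

A Snyk-based job would follow the same shape, swapping the scan step for the Snyk CLI.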
Main topics:
Additional topics to cover:
Other enhancements and nice to haves:
The current version of the Flux CD provider from the main TF module is a little bit too old (0.2.x). Because of this, some of the Flux CD components deployed by the module are not functioning properly or are not detected at all (e.g. kustomization/flux-system).
Another thing to take into consideration is that the CLI counterpart gets updated really frequently as well. So, the TF provider and the flux CLI counterpart should not drift too far apart in version.
flux = {
  source  = "fluxcd/flux"
  version = "~> 0.8.1"
}
The suggestion is to drop the ~> constraint, which is a little bit too restrictive, and go with minimum allowed versions instead, via >= for all providers.

A NAT Gateway sits between your DOKS cluster and the public Internet. Its main role is to control egress traffic, that is, traffic going outside your DOKS cluster. Given this role, we can call it an Egress Gateway as well.
Another benefit of using an egress gateway is that all traffic exiting your cluster is routed through a single public IP, thus making ACL management easier for external services that require it.
The main idea is to have a dedicated Droplet acting as an egress gateway for our DOKS cluster. Then, static routes are added to the DOKS worker nodes, telling Linux whether or not to route traffic through the egress gateway based on the destination address. Alternatively, we can have a list of public CIDRs and route all outbound traffic via the egress gateway (some limitations apply, though).
There are two ways to approach this from a Kubernetes point of view:
The latter approach is chosen because it offers more flexibility and control over the process. We implemented a custom static routes controller to achieve this goal, called k8s-staticroute-operator.
GitHub changed the SSH public key recently. The Terraform module from this repository which provisions Flux CD relies on it.
More info can be found at fluxcd/source-controller#490.
Create source cluster 1 with a Postgres DB holding 500GB of data.
Use Velero for backup.
Create destination cluster 2, then import the data from the Postgres DB backup of cluster 1.
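The Velero step could be expressed declaratively along these lines (a sketch, assuming the database lives in a `postgres` namespace, Velero runs in the `velero` namespace, and both clusters share the same backup storage location):

```yaml
# Back up the Postgres namespace on cluster 1
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: postgres-backup
  namespace: velero
spec:
  includedNamespaces:
    - postgres
  snapshotVolumes: true
---
# Restore it on cluster 2, running against cluster 2's Velero instance
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: postgres-restore
  namespace: velero
spec:
  backupName: postgres-backup
```

The same flow can be driven imperatively with `velero backup create` and `velero restore create --from-backup`.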
Hey All!
I am trying to work through the DOKS-CI-CD tutorial, and have hit a few hard blockers.
Any recommendations on getting back on track? Should I just install the tekton and knative portions from scratch?
Create a blueprint to showcase a basic DOKS CI/CD setup using Tekton and Argo CD. Also, add Knative to the mix for running/exposing user applications painlessly. The CI/CD pipeline should be triggered automatically via webhooks whenever code changes are pushed to the user application's GitHub repository.
Hint: maybe we can leverage Knative Eventing to connect GitHub events with Tekton EventListeners and Triggers.
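For instance, the webhook wiring could end up looking something like this EventListener (a rough sketch assuming Tekton Triggers; all resource names here are hypothetical):

```yaml
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: github-listener
spec:
  serviceAccountName: tekton-triggers-sa
  triggers:
    - name: github-push
      # Filter incoming webhooks down to GitHub push events
      interceptors:
        - ref:
            name: "github"
          params:
            - name: "eventTypes"
              value: ["push"]
      # Extract fields from the payload and feed them into a pipeline template
      bindings:
        - ref: github-push-binding
      template:
        ref: build-and-deploy-template
```

The GitHub webhook would then point at the Service that Tekton creates for this listener.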
The default value for the github_ssh_pub_key variable, used in the Terraform module to populate the known hosts file for Flux CD, contains an invalid value. Currently, the variables.tf file sets the default value like this:
variable "github_ssh_pub_key" {
  description = "GitHub SSH public key"
  type        = string
  default     = "github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg="
}
Remove the redundant "github.com" part from the string. It's already prepended by the known_hosts parameter in the data resource inside the main.tf file:
data = {
  identity       = tls_private_key.main.private_key_pem
  "identity.pub" = tls_private_key.main.public_key_pem
  known_hosts    = "github.com ${var.github_ssh_pub_key}"
}
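With the redundant prefix removed, the variable default would reduce to just the key type and key material (same key string as above, minus the hostname):

```
variable "github_ssh_pub_key" {
  description = "GitHub SSH public key"
  type        = string
  default     = "ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg="
}
```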
Seems that this combination is behaving like a poison pill:
data "digitalocean_kubernetes_cluster" "primary" {
  name = var.doks_cluster_name

  depends_on = [
    digitalocean_kubernetes_cluster.primary
  ]
}
When used with the following provider:
provider "kubernetes" {
  host  = data.digitalocean_kubernetes_cluster.primary.endpoint
  token = data.digitalocean_kubernetes_cluster.primary.kube_config[0].token
  cluster_ca_certificate = base64decode(
    data.digitalocean_kubernetes_cluster.primary.kube_config[0].cluster_ca_certificate
  )
}
When you spin up a cluster for the first time, the above combination will work. But subsequent runs of terraform plan fail with:
Error: Get "http://localhost/api/v1/namespaces/flux-system": dial tcp [::1]:80: connect: connection refused
│
│ with module.doks_flux_cd.kubernetes_namespace.flux_system,
│ on .terraform/modules/doks_flux_cd/create-doks-with-terraform-flux/main.tf line 52, in resource "kubernetes_namespace" "flux_system":
│ 52: resource "kubernetes_namespace" "flux_system" {
My assumption is that it has to do with how Terraform evaluates resources, providers, data sources, etc. It seems that on subsequent runs, after the DOKS cluster is created, the depends_on condition causes the digitalocean_kubernetes_cluster data source to not re-evaluate, or to not return valid data. The kubernetes provider then defaults to localhost if it does not receive a valid Kubernetes cluster configuration from the remote.
On the other hand, we don't need to look up data using the digitalocean_kubernetes_cluster data source. The digitalocean_kubernetes_cluster resource already exposes everything we need after successful creation. Avoid the lookup via the digitalocean_kubernetes_cluster data source, and rely on the digitalocean_kubernetes_cluster resource instead.
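In other words, the fix could look roughly like this (a sketch; the attribute paths mirror those used with the data source above, read directly from the resource):

```
provider "kubernetes" {
  # Read connection details straight from the managed resource,
  # not from a separate data source lookup
  host  = digitalocean_kubernetes_cluster.primary.endpoint
  token = digitalocean_kubernetes_cluster.primary.kube_config[0].token
  cluster_ca_certificate = base64decode(
    digitalocean_kubernetes_cluster.primary.kube_config[0].cluster_ca_certificate
  )
}
```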
Hi
Thank you so much for this article; it is really what I need, but I have some issues with it.
I applied all the steps in the article you shared. I defined the route tables to include all Cloudflare IP ranges and point to the NAT gateway when a packet matches them, in order to forward traffic through the NAT gateway whenever the services connect to another cluster's API, which sits behind Cloudflare. We expected the route table to only affect outbound packets, but it started to affect inbound traffic too.
As a result, requests to the Cloudflare-enabled services' API URLs timed out. Instead of processing the packets directly at the server level, the route table forwarded all packets, inbound and outbound alike, to the NAT gateway.
Do you have any idea how to solve this issue?
Regards
Here is my public-egress-example.yaml file:
apiVersion: networking.digitalocean.com/v1
kind: StaticRoute
metadata:
  name: public-egress
spec:
  destinations:
    - "0.0.0.0/5"
    - "8.0.0.0/7"
    - "11.0.0.0/8"
    - "12.0.0.0/6"
    - "16.0.0.0/4"
    - "32.0.0.0/3"
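As a quick sanity check (not part of the blueprint), the destination list above can be exercised with Python's standard ipaddress module to see which destination IPs would be routed through the gateway; note that the VPC-local 10.0.0.0/8 range falls outside these CIDRs on purpose:

```python
import ipaddress

# CIDRs copied from the StaticRoute spec above
DESTINATIONS = [
    "0.0.0.0/5", "8.0.0.0/7", "11.0.0.0/8",
    "12.0.0.0/6", "16.0.0.0/4", "32.0.0.0/3",
]
NETWORKS = [ipaddress.ip_network(c) for c in DESTINATIONS]

def routed_via_gateway(ip: str) -> bool:
    """Return True if `ip` matches any StaticRoute destination CIDR."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)

print(routed_via_gateway("8.8.8.8"))     # True: public IP, matches 8.0.0.0/7
print(routed_via_gateway("10.108.0.5"))  # False: VPC-local, 10.0.0.0/8 is excluded
```

This makes it easy to confirm that VPC traffic keeps its direct path while the listed public ranges go through the egress gateway.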
Additional topics to complement the main DOKS-CI-CD blueprint:
Split the main TF module from the DOKS FluxCD repository into submodules for easy maintenance:
< root_module >
|
|---- < doks >
|
|---- < fluxcd >
The < doks > submodule deals with DOKS cluster initialization stuff. The < fluxcd > submodule deals with provisioning the Flux CD manifests in the GitHub repo, and the associated Kubernetes resources.
--
forProvider:
  region: nyc3
  size: s-1vcpu-1gb
  image: ubuntu-20-04-x64
  sshKeys:
    - bb:bb:bb:bb:bb:bb:bb:bb
Verify that SSH'ing into the NAT gateway works. I did this by running: ssh root@public_ip_of_NAT_GW -i .ssh/<private_ssh_key>
Verify that IP forwarding is enabled: sysctl net.ipv4.ip_forward
Expected output:
net.ipv4.ip_forward = 1
Expected output:
root@nat-gw-nyc3-new5:~# iptables -L -t nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 10.108.0.0/20 anywhere ---> You should see the VPC network range where you have both the NAT GW and the K8s cluster
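For reference, a MASQUERADE rule like the one shown above is typically created with commands along these lines (a sketch; the VPC CIDR and public interface name are assumptions that must match your environment):

```
# Masquerade traffic from the VPC range (assumed 10.108.0.0/20) going out
# through the gateway's public interface (assumed eth0)
iptables -t nat -A POSTROUTING -s 10.108.0.0/20 -o eth0 -j MASQUERADE

# Persist the rule across reboots (Debian/Ubuntu)
apt-get install -y iptables-persistent
netfilter-persistent save
```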