baremetal-runtimecfg's Introduction

runtimecfg

runtimecfg is a small utility that reads a kubeconfig and inspects the current system in order to render OpenShift bare-metal networking configuration.

Usage

runtimecfg [command]

The available commands are:

  • display: Displays the struct that contains the information for rendering
  • help: Help about any command
  • render: Renders go templates with the runtime configuration. Takes a -o/--out-dir parameter to specify where to write the rendered files.

The available flags are:

  • --api-vip: Virtual IP Address to reach the OpenShift API
  • --dns-vip: Virtual IP Address to reach an OpenShift node resolving DNS server
  • --ingress-vip: Virtual IP Address to reach the OpenShift Ingress Routers

Note that you must pass at least one VIP for the VRRP interface to be found.

Test

Run on docker (recommended)

Running tests inside Docker is consistent between machines and keeps the host environment clean.

To run the tests you need the following prerequisites:

  • make
  • docker
  • docker-compose

make docker_test

Run locally on host

Some tests require elevated capabilities (cap_net_admin, cap_net_raw). If the user doesn't have these capabilities, the tests are skipped.

Note: these tests might change the machine's networking.

make test

baremetal-runtimecfg's People

Contributors

andreaskaris, bcrochet, celebdor, creydr, cybertron, danwinship, derekhiggins, dougsland, emilienm, fherbert, jupierce, mandre, mkowalski, openshift-bot, openshift-ci-robot, openshift-ci[bot], openshift-merge-bot[bot], openshift-merge-robot, rbbratta, sadasu, thrasher-redhat, tsorya, vrutkovs, yboaron, yboaronn, yuvigold


baremetal-runtimecfg's Issues

Keepalived monitor is unable to automatically change the VRRP interface in keepalived.conf in unicast mode when an additional ovs-bridge interface is added to the node VM during bootstrap

In OCP 4.11, keepalived is configured to use unicast mode only:

In our vSphere installation process, we use the NCP CNI, which adds an additional ovs-bridge interface,
br-int, to the node VM alongside the ens192 interface. During the bootstrap stage, when the master node
VMs are created, NCP CNI creates the br-int interface and moves the node IP from ens192 to br-int.
However, the keepalived monitor fails to re-render the interface in /etc/keepalived/keepalived.conf
from ens192 to br-int when this happens. Keepalived keeps checking for VRRP advertisements on the
ens192 interface, which causes a VRRP timeout and sets the API VIP on the master nodes.
Let's look at one node VM's keepalived log:

/etc/keepalived/keepalived.conf info:
vrrp_instance piyushv6_API
state BACKUP
interface ens192
virtual_router_id 214
priority 40
advert_int 1
unicast_src_ip 192.168.61.105
unicast_peer {
192.168.61.102
}

2022-11-25T03:56:55.306531381+00:00 stderr F Fri Nov 25 03:56:55 2022: (piyushv6_API) removing VIPs.
2022-11-25T03:56:55.306582413+00:00 stderr F Fri Nov 25 03:56:55 2022: (piyushv6_INGRESS) removing VIPs.
2022-11-25T03:56:55.307107929+00:00 stderr F Fri Nov 25 03:56:55 2022: (piyushv6_API) Entering BACKUP STATE (init)

2022-11-25T03:56:55.305392412+00:00 stderr F Fri Nov 25 03:56:55 2022: (/etc/keepalived/keepalived.conf: Line 94) Truncating auth_pass to 8 characters
2022-11-25T03:56:55.306137745+00:00 stderr F Fri Nov 25 03:56:55 2022: Assigned address 192.168.61.105 for interface ens192
2022-11-25T03:56:55.306158766+00:00 stderr F Fri Nov 25 03:56:55 2022: Assigned address fe80::996f:7c4:9af4:ce2e for interface ens192

2022-11-25T03:57:27.296451138+00:00 stderr F Fri Nov 25 03:57:27 2022: Deassigned address fe80::996f:7c4:9af4:ce2e from interface ens192
2022-11-25T03:57:27.307766132+00:00 stderr F Fri Nov 25 03:57:27 2022: Netlink reflector reports IP 192.168.61.12 removed from ens192

2022-11-25T03:57:27.310881239+00:00 stderr F Fri Nov 25 03:57:27 2022: Deassigned address 192.168.61.105 from interface ens192
2022-11-25T03:57:27.329559161+00:00 stderr F Fri Nov 25 03:57:27 2022: Interface ovs-system added

2022-11-25T03:57:30.669037098+00:00 stderr F Fri Nov 25 03:57:30 2022: (piyushv6_API) Receive advertisement timeout
2022-11-25T03:57:30.669083185+00:00 stderr F Fri Nov 25 03:57:30 2022: (piyushv6_API) Entering MASTER STATE
2022-11-25T03:57:30.669083185+00:00 stderr F Fri Nov 25 03:57:30 2022: (piyushv6_API) setting VIPs. -----> Here, the 192.168.61.11 API VIP is set on master node 0 itself due to the VRRP timeout

2022-11-25T03:57:47.504979420+00:00 stderr F Fri Nov 25 03:57:47 2022: Interface br-int added

We can see that keepalived set the API VIP on the node itself; however, at this point the API server was still running only on the bootstrap node, not on any of the master nodes. So the node could no longer reach the API server via the VIP. Ultimately, the keepalived monitor fails to update /etc/keepalived/keepalived.conf when the br-int interface is added in unicast mode, because the monitor needs to talk to the API server to fetch the master node peer information in order to update the interface in keepalived.conf.

Output from `ip addr show`:
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
link/ether 00:50:56:ad:b1:af brd ff:ff:ff:ff:ff:ff
inet 192.168.61.12/32 scope global ens192
valid_lft forever preferred_lft forever
inet 192.168.61.11/32 scope global ens192 ---- the API VIP 192.168.61.11 was set on the node's ens192 interface by keepalived after the timeout
valid_lft forever preferred_lft forever
3: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b6:cc:66:7c:c2:ff brd ff:ff:ff:ff:ff:ff
8: br-int: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 00:50:56:ad:b1:af brd ff:ff:ff:ff:ff:ff
inet 192.168.61.105/24 brd 192.168.61.255 scope global dynamic noprefixroute br-int ---- node IP 192.168.61.105 has moved from ens192 to br-int
valid_lft 70427sec preferred_lft 70427sec

Let's look at the keepalived monitor log:
2022-11-25T03:57:50.560139752+00:00 stderr F time="2022-11-25T03:57:50Z" level=info msg="Monitor conf file doesn't exist" file=/etc/keepalived/unsupported-monitor.conf

2022-11-25T03:57:50.568845276+00:00 stderr F time="2022-11-25T03:57:50Z" level=error msg="Failed to retrieve API members information" kubeconfigPath=/var/lib/kubelet/kubeconfig
2022-11-25T03:57:50.568865258+00:00 stderr F time="2022-11-25T03:57:50Z" level=warning msg="Could not retrieve LB config: Get \"https://localhost:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D\": dial tcp [::1]:6443: connect: connection refused"

We can see the keepalived monitor started at 2022-11-25T03:57:50; by that time, keepalived had already entered MASTER state due to the timeout and set the VIPs on the node itself. However, the k8s API server was not yet deployed on master node0.

Why does keepalived hit the advertisement timeout after br-int is activated? Because keepalived can no longer receive VRRP advertisements on ens192, either from its unicast peers or from the bootstrap node. So it should change its interface from ens192 to br-int.

After reading the source code of the keepalived monitor and running some tests, we think the issue is related to the additional API server backend computation in unicast mode:
https://github.com/openshift/baremetal-runtimecfg/blob/master/pkg/monitor/dynkeepalived.go#L72
When computing the API server backends, the monitor needs to connect to the API server, either on the bootstrap node or the local API server.
If fetching the API server backends fails, the new config is not considered valid, which is why the keepalived monitor cannot update the keepalived config:
https://github.com/openshift/baremetal-runtimecfg/blob/master/pkg/monitor/dynkeepalived.go#L97

Therefore, we draw a conclusion:
The keepalived monitor is unable to automatically change the VRRP interface in keepalived.conf in unicast mode when an additional ovs-bridge interface is added to the node VM during the bootstrap stage. We might need to add the bootstrap node as a unicast peer in the keepalived config file during the bootstrap stage.
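The gating behavior described above can be sketched in Go. The `Config` type and `renderIfValid` helper here are hypothetical simplifications, not the monitor's real API; the point is only that a failed peer fetch marks the new config invalid, so the old keepalived.conf (with the stale ens192 interface) is never replaced:

```go
package main

import (
	"errors"
	"fmt"
)

// Config stands in for the monitor's rendered runtime configuration;
// only the fields needed for this sketch are shown (hypothetical, simplified).
type Config struct {
	VRRPInterface string
	UnicastPeers  []string
}

// errAPIUnreachable models the "connection refused" failures seen in the logs.
var errAPIUnreachable = errors.New("dial tcp: connect: connection refused")

// renderIfValid mirrors the gating behavior described above: if fetching the
// API-server backends fails, the freshly computed config is treated as
// invalid and keepalived.conf is never re-rendered, so the stale interface
// (ens192) stays in place even after the node has moved to br-int.
func renderIfValid(newCfg Config, fetchPeers func() ([]string, error)) (bool, error) {
	peers, err := fetchPeers()
	if err != nil {
		// Invalid config: skip the render and keep the old keepalived.conf.
		return false, err
	}
	newCfg.UnicastPeers = peers
	fmt.Printf("rendering keepalived.conf with interface %s and peers %v\n",
		newCfg.VRRPInterface, newCfg.UnicastPeers)
	return true, nil
}

func main() {
	rendered, err := renderIfValid(Config{VRRPInterface: "br-int"},
		func() ([]string, error) { return nil, errAPIUnreachable })
	fmt.Println("rendered:", rendered, "err:", err)
}
```

During bootstrap the fetch always fails (no API server is reachable from the node), which is exactly the deadlock the issue describes.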

Steps to reproduce the issue:

  1. Use OCP 4.11 to start installing a cluster.
  2. During installation, an ovs-bridge interface is added on the master node VMs and the node IP moves from ens192 to br-int.
  3. Keepalived sets the API VIP on the node's ens192 interface after the VRRP timeout, and the keepalived monitor is unable to update the interface in the keepalived config from ens192 to br-int.

Is there some way to fix this issue so that the keepalived monitor can automatically change the VRRP interface when an ovs-bridge interface comes up?

We found a workaround:
Manually change the interface from ens192 to br-int in keepalived.conf when the ovs-bridge is up, then restart the keepalived static pod.

We applied this workaround on another master (node1), manually changing the interface to br-int in keepalived.conf:
vrrp_instance piyushv6_API
state BACKUP
interface br-int -- manually changed from ens192 to br-int
virtual_router_id 214
priority 40
advert_int 1
unicast_src_ip 192.168.61.102
unicast_peer {
192.168.61.105
}

Then we restarted the keepalived pod.

keepalived monitor log:
2022-11-25T03:56:55.611958471+00:00 stderr F time="2022-11-25T03:56:55Z" level=info msg="Config change detected" configChangeCtr=3 current config="{{piyushv6 openshift.test 192.168.61.11 214 A AAAA 192.168.61.12 21 A AAAA 32 0 []} {123 123 123 [{piyushv6-hv79x-master-1 192.168.61.102 123} {piyushv6-hv79x-master-0 192.168.61.105 123}] } 192.168.61.102 piyushv6-hv79x-master-1 br-int [10.40.125.142] {[192.168.61.105 192.168.61.102]} true}"
..
2022-11-25T03:56:55.613872623+00:00 stderr F time="2022-11-25T03:56:55Z" level=info msg="vrrp_instance piyushv6_API {"
2022-11-25T03:56:55.613872623+00:00 stderr F time="2022-11-25T03:56:55Z" level=info msg=" state BACKUP"
2022-11-25T03:56:55.613872623+00:00 stderr F time="2022-11-25T03:56:55Z" level=info msg=" interface br-int" ---interface is changed to br-int

2022-11-25T03:56:55.614085763+00:00 stderr F time="2022-11-25T03:56:55Z" level=info msg="Runtimecfg rendering template" path=/etc/keepalived/keepalived.conf -- by this time, keepalived.conf was updated and the interface was changed to br-int
...
2022-11-25T04:11:33.997518005+00:00 stderr F time="2022-11-25T04:11:33Z" level=info msg="Failed to get master Nodes list" err="Get \"https://api-int.piyushv6.openshift.test:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D\": dial tcp 192.168.61.11:6443: connect: connection refused"
2022-11-25T04:11:33.997518005+00:00 stderr F time="2022-11-25T04:11:33Z" level=info msg="An error occurred while trying to read master nodes details from api-vip:kube-apiserver: Get \"https://api-int.piyushv6.openshift.test:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D\": dial tcp 192.168.61.11:6443: connect: connection refused"
2022-11-25T04:11:33.997565831+00:00 stderr F time="2022-11-25T04:11:33Z" level=info msg="Trying to read master nodes details from localhost:kube-apiserver"

keepalived log:
2022-11-25T03:51:30.796192643+00:00 stderr F + socat UNIX-LISTEN:/var/run/keepalived/keepalived.sock,fork 'system:bash -c msg_handler'
2022-11-25T03:56:55.614480850+00:00 stderr F The client sent: reload.

2022-11-25T03:56:55.795060493+00:00 stderr F Fri Nov 25 03:56:55 2022: Assigned address 192.168.61.102 for interface br-int

2022-11-25T04:09:42.750280967+00:00 stderr F Fri Nov 25 04:09:42 2022: (piyushv6_API) Master received advert from 192.168.61.123 with higher priority 70, ours 40 --> master node1 received VRRP from the bootstrap node and set itself to BACKUP state
2022-11-25T04:09:42.750334698+00:00 stderr F Fri Nov 25 04:09:42 2022: (piyushv6_API) Entering BACKUP STATE
2022-11-25T04:09:42.750362086+00:00 stderr F Fri Nov 25 04:09:42 2022: (piyushv6_API) removing VIPs.

We can see that, for master node1, the keepalived monitor changed the VRRP interface to br-int after restarting.
Also, keepalived was able to receive VRRP advertisements from the bootstrap node 192.168.61.123 after the interface was set to br-int, so it was able to remove the VIPs from the node.

wrong node ip address on 4.6

I'm running OpenShift 4.6 on bare metal (3 control-plane / 3 worker nodes) with one NIC holding a public IP address and a VLAN with a private IP address attached to it. All internal traffic and kubelet/crio runs on the internal VLAN (at least up until 4.6). With the latest update to 4.6, the nodeip-configuration.service only finds the public IP address (because the public IP has the default route set).
DNS is setup and all hostnames resolve to the internal IP Address.

My current solution is to run my own service that writes files in the same locations as https://github.com/openshift/baremetal-runtimecfg/blob/master/cmd/runtimecfg/node-ip.go does, but using a DNS lookup on the hostname to find the correct IP address.

Is there another/supported way of setting the correct IP address, so kubelet and crio use the intended address?
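The DNS-based workaround described above can be sketched as follows. `nodeIPFromDNS` is a hypothetical helper, not part of baremetal-runtimecfg; it simply resolves a hostname and takes the first returned address:

```go
package main

import (
	"fmt"
	"net"
)

// nodeIPFromDNS resolves hostname via DNS and returns the first address,
// mirroring the workaround described above. Hypothetical helper; not part
// of baremetal-runtimecfg.
func nodeIPFromDNS(hostname string) (net.IP, error) {
	addrs, err := net.LookupHost(hostname)
	if err != nil {
		return nil, err
	}
	if len(addrs) == 0 {
		return nil, fmt.Errorf("no addresses found for %s", hostname)
	}
	return net.ParseIP(addrs[0]), nil
}

func main() {
	// On a real node this would be called with os.Hostname(), which DNS
	// resolves to the internal address rather than the default-route one.
	ip, err := nodeIPFromDNS("localhost")
	fmt.Println(ip, err)
}
```

This sidesteps the default-route heuristic entirely: as long as DNS maps the node's hostname to the internal VLAN address, that address wins.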

FletcherChecksum8() may generate invalid VRID

Sometimes, keepalived pod may end up in a crash loop with the following logs.

Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Opening file '/etc/keepalived/keepalived.conf'.
Starting VRRP child process, pid=7
Registering Kernel netlink reflector
Registering Kernel netlink command channel
Registering gratuitous ARP shared channel
Opening file '/etc/keepalived/keepalived.conf'.
Truncating auth_pass to 8 characters
VRRP Error : VRID not valid - must be between 1 & 255. reconfigure !
Truncating auth_pass to 8 characters
VRRP_Instance(l3vvqqni-f21ca_DNS) the virtual id must be set!
Stopped
Keepalived_vrrp exited with permanent error CONFIG. Terminating
Stopping
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2

It turns out the FletcherChecksum8() function used to compute a unique VRID may return 0 for certain inputs, such as "l3vvqqni-f21ca-dns".
0 is not a valid VRID.

https://play.golang.org/p/V0i11g1XvsY
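The failure mode and one possible guard can be sketched in Go. The checksum below is a 4-bit/4-bit Fletcher-style approximation for illustration (the real FletcherChecksum8() lives in the repo's pkg/utils package and may differ in detail), and remapping 0 is one candidate fix, not necessarily the one the project adopted:

```go
package main

import "fmt"

// fletcherChecksum8 is a 4-bit/4-bit Fletcher-style checksum of the kind the
// issue describes. Approximation for illustration; see pkg/utils in the repo
// for the real function.
func fletcherChecksum8(s string) uint8 {
	var ckA, ckB uint16
	for i := 0; i < len(s); i++ {
		ckA = (ckA + uint16(s[i])) % 0xf
		ckB = (ckB + ckA) % 0xf
	}
	return uint8((ckB << 4) | ckA)
}

// vridFor guards against the invalid value: keepalived requires VRIDs in the
// range 1-255, but the checksum can come out 0 (e.g. for the empty string).
// Remapping 0 to any fixed in-range value avoids the invalid-VRID crash.
func vridFor(name string) uint8 {
	if c := fletcherChecksum8(name); c != 0 {
		return c
	}
	return 1
}

func main() {
	fmt.Println(fletcherChecksum8(""), vridFor(""))
}
```

With a guard like `vridFor`, keepalived never sees the out-of-range VRID 0 and the config parse error ("VRID not valid - must be between 1 & 255") cannot occur.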

Runtimecfg broken for ovirt

Hello,
Yesterday the oVirt OCP CI runs started failing in the night due to change #38.
When debugging the issue, we noticed that the masters that started couldn't publish their services because of:
2020-01-16T09:52:44.729133066+00:00 stderr F time="2020-01-16T09:52:44Z" level=fatal msg="Failed to find interface for specified ip" ip=192.168.201.0
We looked at the config file created by runtimecfg, /etc/mdns/config.hcl, and saw that bind_address was set to "192.168.201.0" when it should be "192.168.201.31".
We built the baremetal-runtime pod without the change and everything went smoothly.

In our environment, each master has one interface with both an IPv6 and an IPv4 address.
We initially thought that was the root cause, so we tried removing the IPv6 address (since we are not using it), deleted the conf, and restarted the pod, but the problem remained.

If you need an env to try and debug this, we recommend triggering an oVirt job.

During FCOS Boot: `Job nodeip-configuration.service/start running` appears to hang indefinitely

I also opened an issue in OKD: okd-project/okd#1477

Issue summary:

I'm seeing weird behavior with 4.12.0-0.okd-2023-01-21-055900.

During cluster install on a KVM host, a control-plane node that is booting after release-image-pivot will sometimes hang at:

Job nodeip-configuration.service/start running

If I kill the VM and restart it, then it will often get past this and continue.

This issue seems to be fairly repeatable on a KVM host. In this particular case, I am installing a 3 node cluster on a single physical host with 12 vCPU and 64GB of RAM.

These OKD clusters are built with fixed IP addresses which are configured via ignition. I use butane to create the node specific ignition files, and iPXE to serve them at boot time.

After the bootstrap node is running, one or more of the control-plane nodes will hang at:

Job nodeip-configuration.service/start running

The node does not respond to ping at this point.

It will not recover unless I virsh destroy <hung-node> && virsh start <hung-node>

It is interesting to note that on a successful node, the nodeip-configuration.service appears to run twice.

journalctl output from a successful node is attached:

journalctl.out.txt

I can repeat this fairly regularly, but not with any discernible pattern. It sometimes happens with just one node starting at a time. It sometimes happens to only one of three nodes.

It does appear to happen more frequently on a constrained network.

Version:

CentOS Stream - KVM host 5.14.0-205.el9.x86_64
OKD: 4.12.0-0.okd-2023-01-21-055900
OKD-SCOS: 4.12.0-0.okd-scos-2022-12-02-083740

Possible VRID collision?

Not sure this is really happening (I haven't observed it), but the use of a checksum algorithm to compute the VRID could lead to collisions, especially with a pool of only 255 possible values.

Would it be possible to add a check in the code to ensure the VRIDs are different?
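Such a check could look like the sketch below. The checksum is the same illustrative approximation used above (the real FletcherChecksum8() is in the repo's pkg/utils), and `detectVRIDCollisions` is a suggested pre-flight check, not existing project code:

```go
package main

import "fmt"

// fletcherChecksum8 approximates the VRID computation for illustration;
// the real function lives in the repo's pkg/utils package.
func fletcherChecksum8(s string) uint8 {
	var ckA, ckB uint16
	for i := 0; i < len(s); i++ {
		ckA = (ckA + uint16(s[i])) % 0xf
		ckB = (ckB + ckA) % 0xf
	}
	return uint8((ckB << 4) | ckA)
}

// detectVRIDCollisions groups VRRP instance names by their computed VRID so
// a pre-flight check could refuse to render a config in which two instances
// would share one. Suggestion sketch, not existing project code.
func detectVRIDCollisions(names []string) map[uint8][]string {
	byVRID := map[uint8][]string{}
	for _, n := range names {
		v := fletcherChecksum8(n)
		byVRID[v] = append(byVRID[v], n)
	}
	collisions := map[uint8][]string{}
	for v, ns := range byVRID {
		if len(ns) > 1 {
			collisions[v] = ns
		}
	}
	return collisions
}

func main() {
	fmt.Println(detectVRIDCollisions([]string{"cluster_API", "cluster_INGRESS"}))
}
```

Running such a check at render time would turn a silent on-wire VRRP conflict into an explicit configuration error.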
