submariner-io / cloud-prepare
APIs and code to prepare various cloud infrastructures for Submariner.
License: Apache License 2.0
Hi team, are there any plans to upgrade the k8s packages, such as client-go, to above v0.19?
The submariner-addon uses client-go v0.20.5, and there are some incompatibilities when I import cloud-prepare:
# github.com/submariner-io/admiral/pkg/resource
vendor/github.com/submariner-io/admiral/pkg/resource/dynamic.go:29:21: not enough arguments in call to d.client.Get
have (string, "k8s.io/apimachinery/pkg/apis/meta/v1".GetOptions)
want (context.Context, string, "k8s.io/apimachinery/pkg/apis/meta/v1".GetOptions, ...string)
vendor/github.com/submariner-io/admiral/pkg/resource/dynamic.go:38:24: not enough arguments in call to d.client.Create
have (*unstructured.Unstructured, "k8s.io/apimachinery/pkg/apis/meta/v1".CreateOptions)
The dynamic client method signatures changed from v0.19.
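To illustrate the kind of caller-side change the compiler output asks for, here is a minimal, self-contained sketch. The `GetOptions`, `contextGetter` and `fakeClient` types are hypothetical stand-ins written for this example; they are not the real client-go or admiral types, and the return type is simplified to a string.

```go
package main

import (
	"context"
	"fmt"
)

// GetOptions is a hypothetical stand-in for metav1.GetOptions.
type GetOptions struct{}

// contextGetter mirrors the shape of the context-aware signature reported in
// the compiler output ("want (context.Context, string, ..., ...string)").
type contextGetter interface {
	Get(ctx context.Context, name string, opts GetOptions, subresources ...string) (string, error)
}

// fakeClient is a hypothetical in-memory client used only for this sketch.
type fakeClient struct {
	objects map[string]string
}

func (f *fakeClient) Get(ctx context.Context, name string, _ GetOptions, _ ...string) (string, error) {
	// Honour cancellation, which is the reason the context is passed at all.
	if err := ctx.Err(); err != nil {
		return "", err
	}
	obj, ok := f.objects[name]
	if !ok {
		return "", fmt.Errorf("%q not found", name)
	}
	return obj, nil
}

// getResource shows the caller-side change: the context is threaded through
// to the client call instead of being omitted.
func getResource(ctx context.Context, c contextGetter, name string) (string, error) {
	return c.Get(ctx, name, GetOptions{})
}

// demo runs one successful lookup against the fake client.
func demo() (string, error) {
	c := &fakeClient{objects: map[string]string{"gw": "gateway-config"}}
	return getResource(context.Background(), c, "gw")
}

func main() {
	fmt.Println(demo())
}
```

The point of the sketch is only that every call site gains a leading `context.Context` argument; the real fix would be made in the code that wraps the dynamic client.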
What would you like to be removed:
Let's deprecate this mode for 0.15 and remove it on 0.16
Why is this needed:
We're not actually testing or using non-dedicated gateway mode from cloud prepare.
Furthermore, a more K8s native approach is to deploy with LB mode which actually handles the cloud related operations properly.
From a technical debt perspective, the code is just sitting there and making cloud-prepare and subctl harder to maintain. With it removed, we'll have a lighter maintenance burden and could further simplify the cloud-prepare code.
What would you like to be added:
Let's deprecate this mode for 0.15 and remove it on 0.16
Why is this needed:
We're not actually testing or using the "generic cloud prepare" mode.
Furthermore, we already have code for making sure gateways have labels in the join command, which makes much more sense. Having a duplicate code path just increases the maintenance burden without adding any value.
Additionally, this mode of cloud prepare adds no capabilities beyond what subctl join already does or can do.
From a technical debt perspective, the code is just sitting there and making cloud-prepare and subctl harder to maintain. With it removed, we'll have a lighter maintenance burden and could further simplify the cloud-prepare code.
What would you like to be added:
Use machineSetClient.List() instead of hard coding in the ocpGatewayDeployer for GCP and RHOS.
Why is this needed:
Remove hard coding
What would you like to be added:
We need support for configuring one (or more) of the worker nodes in the cluster as Submariner gateway nodes.
The following configuration has to be done for the Gateway nodes.
Note: The following issue handles the dedicated Gateway node use-case #91
To be aligned with other subctl commands, allow specifying these flags on the command line.
We now have easy insight into unit test coverage via SonarCloud. The current UT coverage for the submariner-io/cloud-prepare repo is 51.8%. We should strive to improve UT coverage, ideally initially up to at least 70%.
https://sonarcloud.io/summary/overall?id=submariner-io_cloud-prepare
https://sonarcloud.io/component_measures?id=submariner-io_cloud-prepare&metric=coverage&view=list
What would you like to be added:
Support for automated infra configuration of OpenStack-based clouds.
What would you like to be added:
Azure cloud prepare needs to have the capability to tag an existing node as a gateway node.
Why is this needed:
This avoids requiring a dedicated node as the g/w node by reusing an existing one.
What would you like to be added:
While installing dedicated Gateway nodes in some clouds, the region may not support the requested instance type.
For example, AWS us-west-1 does not support the m5n.large instance type. It would be useful if the cloud-prepare library supported a mode where it chooses the most appropriate instance type available in the region.
Currently, the signature for the API is:
NewOcpGatewayDeployer(cloud api.Cloud, msDeployer ocp.MachineSetDeployer, instanceType string)
We can modify the code to allow an empty instanceType and treat this as an option for cloud-prepare to choose the appropriate instance available in the region.
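One possible shape for that selection logic is sketched below. The `chooseInstanceType` function, the preference list and the availability map are all hypothetical; a real implementation would need to query the cloud provider for what the region actually offers, and the availability data here is illustrative only.

```go
package main

import (
	"errors"
	"fmt"
)

// chooseInstanceType returns the requested type when one is given. When the
// request is empty it picks the first preferred type offered in the region.
// Both the preference list and the availability set are hypothetical inputs.
func chooseInstanceType(requested string, preferred []string, available map[string]bool) (string, error) {
	if requested != "" {
		if !available[requested] {
			return "", fmt.Errorf("instance type %q is not available in this region", requested)
		}
		return requested, nil
	}
	for _, t := range preferred {
		if available[t] {
			return t, nil
		}
	}
	return "", errors.New("no preferred instance type is available in this region")
}

func main() {
	// Illustrative availability data only, not real cloud offerings.
	available := map[string]bool{"c5d.large": true, "m5.large": true}
	preferred := []string{"m5n.large", "c5d.large"}

	t, err := chooseInstanceType("", preferred, available)
	fmt.Println(t, err)
}
```

In this sketch an explicitly requested but unavailable type is still an error, so only the empty string opts in to automatic selection.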
What would you like to be added:
The cloud prepare implementation for GCP and Azure should allow scaling beyond the number of Zones.
Why is this needed:
Limiting the maximum number of gateway nodes to the number of zones may restrict how many g/w nodes can be spawned if the region has too few zones.
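A round-robin assignment is one simple way to lift that limit. The sketch below is an assumption about how it could work, not how the GCP or Azure code is written; `spreadGateways` and the zone names are hypothetical.

```go
package main

import "fmt"

// spreadGateways assigns the requested number of gateways to zones in
// round-robin order, so the count may exceed the number of zones.
func spreadGateways(zones []string, gateways int) map[string]int {
	perZone := make(map[string]int, len(zones))
	if len(zones) == 0 {
		return perZone
	}
	for i := 0; i < gateways; i++ {
		perZone[zones[i%len(zones)]]++
	}
	return perZone
}

func main() {
	// Five gateways over two zones: the zones end up with three and two.
	fmt.Println(spreadGateways([]string{"zone-a", "zone-b"}, 5))
}
```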
What would you like to be added:
Support for automated infra configuration of VMware vSphere-based environments.
What would you like to be added:
For certain scenarios, we need support for deploying new dedicated gateway nodes on the GCP cluster instead of choosing an existing worker node.
After running the commands in [1] on an OCP 4.11 cluster on AWS, I can still see the Submariner gateway security group in the AWS UI.
subctl version: v0.14.0-rc4
[1]
$ subctl cloud prepare aws --ocp-metadata aws-cluster-a/metadata.json
✓ Preparing AWS cloud for Submariner deployment
✓ Obtained infra ID "yb-awsa-9b2kp" and region "us-east-2" from OCP metadata file "aws-cluster-a/metadata.json"
✓ Initializing AWS connectivity
✓ Retrieving VPC ID
✓ Retrieved VPC ID vpc-0f50548218add38be
✓ Validating pre-requisites
✓ Validated pre-requisites
✓ Creating Submariner gateway security group
✓ Created Submariner gateway security group yb-awsa-9b2kp-submariner-gw-sg
✓ Adjusting public subnet yb-awsa-9b2kp-public-us-east-2a to support Submariner
✓ Adjusted public subnet yb-awsa-9b2kp-public-us-east-2a to support Submariner
✓ Deploying gateway node for public subnet yb-awsa-9b2kp-public-us-east-2a
✓ Deployed gateway node for public subnet yb-awsa-9b2kp-public-us-east-2a
✓ Retrieving VPC ID
✓ Retrieved VPC ID vpc-0f50548218add38be
✓ Validating pre-requisites
✓ Validated pre-requisites
✓ Opening port 4800 protocol udp for intra-cluster communications
✓ Opened port 4800 protocol udp for intra-cluster communications
$
$ subctl cloud cleanup aws --ocp-metadata aws-cluster-a/metadata.json
✓ Obtained infra ID "yb-awsa-9b2kp" and region "us-east-2" from OCP metadata file "aws-cluster-a/metadata.json"
✓ Initializing AWS connectivity
✓ Retrieving VPC ID
✓ Retrieved VPC ID vpc-0f50548218add38be
✓ Validating pre-requisites
✓ Validated pre-requisites
✓ Removing gateway node for public subnet yb-awsa-9b2kp-public-us-east-2a
✓ Removed gateway node for public subnet yb-awsa-9b2kp-public-us-east-2a
✓ Untagging public subnet yb-awsa-9b2kp-public-us-east-2a from supporting Submariner
✓ Untagged public subnet yb-awsa-9b2kp-public-us-east-2a from supporting Submariner
✓ Deleting Submariner gateway security group
✓ Deleted Submariner gateway security group
✓ Retrieving VPC ID
✓ Retrieved VPC ID vpc-0f50548218add38be
✓ Validating pre-requisites
✓ Validated pre-requisites
✓ Revoking intra-cluster communication permissions
✓ Revoked intra-cluster communication permissions
Following discussion with OCM consumers it seems that it would be better to extract gateway deployment to a different API.
The proposed API is:
type GatewayDeployInput struct {
	// List of ports to open externally so that Submariner can reach and be reached by other Submariners
	PublicPorts []PortSpec
	// Amount of gateways that are being deployed
	// 0 (AutoGateways) = Deploy gateways per the default deployer policy (Default if not specified)
	//
	// 1-* = Deploy the amount of gateways requested (May fail if there aren't enough public subnets)
	Gateways int
}

type GatewayDeployer interface {
	Deploy(input GatewayDeployInput, reporter Reporter) error
	Cleanup(reporter Reporter) error
}
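To make the proposal easier to discuss, here is a hedged sketch of how a caller and an implementation might look. `PortSpec` and `Reporter` are not defined in the proposal, so the versions below are simplified, hypothetical stand-ins; `fakeDeployer`, `logReporter`, the default of one gateway and the port value are also assumptions made only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// PortSpec and Reporter are simplified, hypothetical stand-ins.
type PortSpec struct {
	Port     uint16
	Protocol string
}

type Reporter interface {
	Started(message string)
	Succeeded(message string)
	Failed(err error)
}

// AutoGateways asks the deployer to apply its default policy.
const AutoGateways = 0

type GatewayDeployInput struct {
	PublicPorts []PortSpec
	Gateways    int
}

type GatewayDeployer interface {
	Deploy(input GatewayDeployInput, reporter Reporter) error
	Cleanup(reporter Reporter) error
}

// logReporter records messages so callers can inspect them.
type logReporter struct{ lines []string }

func (r *logReporter) Started(m string)   { r.lines = append(r.lines, "started: "+m) }
func (r *logReporter) Succeeded(m string) { r.lines = append(r.lines, "ok: "+m) }
func (r *logReporter) Failed(err error)   { r.lines = append(r.lines, "failed: "+err.Error()) }

// fakeDeployer is a hypothetical in-memory deployer with a fixed number of
// public subnets.
type fakeDeployer struct {
	publicSubnets int
	deployed      int
}

func (d *fakeDeployer) Deploy(input GatewayDeployInput, reporter Reporter) error {
	want := input.Gateways
	if want == AutoGateways {
		want = 1 // assumed default policy for this sketch
	}
	reporter.Started(fmt.Sprintf("deploying %d gateway(s)", want))
	if want > d.publicSubnets {
		err := errors.New("not enough public subnets")
		reporter.Failed(err)
		return err
	}
	d.deployed = want
	reporter.Succeeded("gateways deployed")
	return nil
}

func (d *fakeDeployer) Cleanup(reporter Reporter) error {
	reporter.Started("removing gateways")
	d.deployed = 0
	reporter.Succeeded("gateways removed")
	return nil
}

func main() {
	var deployer GatewayDeployer = &fakeDeployer{publicSubnets: 2}
	rep := &logReporter{}
	err := deployer.Deploy(GatewayDeployInput{
		PublicPorts: []PortSpec{{Port: 4500, Protocol: "udp"}}, // illustrative port
		Gateways:    AutoGateways,
	}, rep)
	fmt.Println(err, rep.lines)
}
```

Keeping the gateway count in the input struct means the zero value already selects the default policy, which matches the "Default if not specified" comment in the proposal.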
What would you like to be added:
Why is this needed:
The cloud-prepare project doesn’t support “Depends on” etc. It would be nice to have this, as in other Submariner projects.
What would you like to be added:
Support for automated infra configuration of Microsoft Azure-based clouds.
What happened:
Running cloud prepare for AWS via subctl, when no ~/.aws/credentials file is found, fails with:
✓ Preparing AWS cloud for Submariner deployment
✓ Obtained infra ID "mkolesni-subm-deb2-42pgb" and region "us-east-1" from OCP metadata file "mkolesni-subm-deb2/metadata.json"
✓ Initializing AWS connectivity
✗ Retrieving VPC ID
✗ Unable to retrieve the VPC ID: error describing AWS VPCs: operation error EC2: DescribeVpcs, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, request canceled, context deadline exceeded
✗ Failed to prepare AWS cloud: unable to retrieve the VPC ID: error describing AWS VPCs: operation error EC2: DescribeVpcs, failed to sign request: failed to retrieve credentials: failed to refresh cached credentials, no EC2 IMDS role found, operation error ec2imds: GetMetadata, request canceled, context deadline exceeded
subctl version: devel
What you expected to happen:
It should present a clear error message
On 0.11.2 it used to present this message:
✗ Retrieving AWS credentials from your AWS configuration
✗ failed to read AWS credentials from /root/.aws/credentials: open /root/.aws/credentials: no such file or directory
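One way to get back to a short message like the 0.11.2 one is a pre-flight check before the first API call. The sketch below is only an assumption about how that could look: `checkCredentialsFile` is a hypothetical helper, and since the AWS SDK can also pick up credentials from other sources (for example environment variables), a file check alone would not be sufficient in practice.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// checkCredentialsFile returns a short, actionable error when the given
// credentials file is missing. It is a hypothetical pre-flight helper and
// does not validate the file's contents.
func checkCredentialsFile(path string) error {
	_, err := os.Stat(path)
	switch {
	case err == nil:
		return nil
	case errors.Is(err, fs.ErrNotExist):
		return fmt.Errorf("AWS credentials file %s does not exist", path)
	default:
		return fmt.Errorf("unable to check AWS credentials file %s: %w", path, err)
	}
}

func main() {
	// Hypothetical path used only to trigger the "missing file" branch.
	fmt.Println(checkCredentialsFile("/nonexistent-sketch-dir/credentials"))
}
```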
How to reproduce it (as minimally and precisely as possible):
Install openshift on AWS using openshift-installer: ./openshift-install create cluster
Run cloud prepare: subctl cloud prepare aws
Anything else we need to know?:
Environment:
Observed on devel and on 0.12.1.
Migrate this code to use EndpointResolverV2. A basic migration strategy is outlined here. It seems we might be able to set the BaseEndpoint field, but we also set the PartitionID field on the aws.Endpoint and it's unclear how that maps to the new API. It's also unclear how to distinguish service names, as the service name param is not passed to the EndpointResolverV2 interface.
Running cloud prepare (on an OCP 4.12 install, if it matters), the preparation fails because it cannot find the network interfaces:
[root@36efcb63aedb shipyard]# subctl cloud prepare azure --ocp-metadata output/ocp-cluster2/ --auth-file ~/.azure/osServicePrincipal.json
✓ Preparing Azure cloud for Submariner deployment
✓ Obtained infra ID "mkolesni-testday-clus-6qgc4" and region "eastus" from OCP metadata file "output/ocp-cluster2/"
✓ Retrieving Azure credentials from your Azure authorization file "/root/.azure/osServicePrincipal.json"
✓ Initializing Azure connectivity
✗ Deploying gateway node
✗ Failed to open the Submariner gateway port for already existing nodes: error getting the interfaces "mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic" from resource group "mkolesni-testday-clus-6qgc4-rg": GET https://management.azure.com/subscriptions/03e5f0ef-0741-442a-bc1b-ba34ceb3f63f/resourceGroups/mkolesni-testday-clus-6qgc4-rg/providers/Microsoft.Network/networkInterfaces/mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic
--------------------------------------------------------------------------------
RESPONSE 404: 404 Not Found
ERROR CODE: ResourceNotFound
--------------------------------------------------------------------------------
{
"error": {
"code": "ResourceNotFound",
"message": "The Resource 'Microsoft.Network/networkInterfaces/mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic' under resource group 'mkolesni-testday-clus-6qgc4-rg' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"
}
}
--------------------------------------------------------------------------------
✗ Failed to prepare Azure cloud: Deployment failed : failed to open the Submariner gateway port for already existing nodes: error getting the interfaces "mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic" from resource group "mkolesni-testday-clus-6qgc4-rg": GET https://management.azure.com/subscriptions/03e5f0ef-0741-442a-bc1b-ba34ceb3f63f/resourceGroups/mkolesni-testday-clus-6qgc4-rg/providers/Microsoft.Network/networkInterfaces/mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic
--------------------------------------------------------------------------------
RESPONSE 404: 404 Not Found
ERROR CODE: ResourceNotFound
--------------------------------------------------------------------------------
{
"error": {
"code": "ResourceNotFound",
"message": "The Resource 'Microsoft.Network/networkInterfaces/mkolesni-testday-clus-fgz8k-worker-eastus2-h2r5f-nic' under resource group 'mkolesni-testday-clus-6qgc4-rg' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"
}
}
--------------------------------------------------------------------------------
subctl version: v0.15.0-m2
Now that 0.16.0-rc0 has been released, the release-0.16 branch should be configured in dependabot.
What happened:
I deployed Submariner on a cluster, then removed it, and tried to prepare the cluster again for Submariner (the gateway node had been removed).
cloud-prepare failed with
✗ InvalidGroup.Duplicate: The security group 'skitt-1-c6nr4-submariner-gw-sg' already exists for VPC 'vpc-00b8d5512b0e9a056'
status code: 400, request id: 770c894e-4bea-4ae1-abc6-c2fed115bb7e
Failed to prepare AWS cloud: InvalidGroup.Duplicate: The security group 'skitt-1-c6nr4-submariner-gw-sg' already exists for VPC 'vpc-00b8d5512b0e9a056'
status code: 400, request id: 770c894e-4bea-4ae1-abc6-c2fed115bb7e
What you expected to happen:
The cluster to be set up for Submariner.
How to reproduce it (as minimally and precisely as possible):
Run cloud-prepare, delete the gateway node, run cloud-prepare again.
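A common way to make this step safe to re-run is to treat the duplicate error as "already exists" and reuse the group. The sketch below shows that pattern against a hypothetical in-memory cloud; `errDuplicate`, `fakeCloud` and `ensureGroup` are illustrative names, not the AWS SDK or the cloud-prepare code.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicate is a hypothetical sentinel standing in for the provider's
// "InvalidGroup.Duplicate" error.
var errDuplicate = errors.New("InvalidGroup.Duplicate")

// fakeCloud is a hypothetical in-memory cloud holding security groups by name.
type fakeCloud struct {
	groups map[string]string // name -> ID
	nextID int
}

func (c *fakeCloud) createGroup(name string) (string, error) {
	if _, exists := c.groups[name]; exists {
		return "", fmt.Errorf("creating %q: %w", name, errDuplicate)
	}
	c.nextID++
	id := fmt.Sprintf("sg-%d", c.nextID)
	c.groups[name] = id
	return id, nil
}

// ensureGroup creates the group, or reuses the existing one when the provider
// reports a duplicate, so repeated runs succeed.
func ensureGroup(c *fakeCloud, name string) (string, error) {
	id, err := c.createGroup(name)
	if errors.Is(err, errDuplicate) {
		return c.groups[name], nil
	}
	return id, err
}

func main() {
	c := &fakeCloud{groups: map[string]string{}}
	first, _ := ensureGroup(c, "demo-submariner-gw-sg")
	second, err := ensureGroup(c, "demo-submariner-gw-sg")
	fmt.Println(first == second, err)
}
```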
Anything else we need to know?:
Environment:
subctl (devel), openshift-installer 4.6.6
What would you like to be added:
Cloud prepare support for RHOS to tag an existing node as g/w node
Why is this needed:
This is needed for deploying Submariner in RHOS cloud.
What happened:
In cloud-prepare for OpenStack, the security group is associated with the wrong VM when using a dedicated g/w node. Instead of the VM created for the dedicated g/w node, it gets associated with a different node.
What you expected to happen:
The security group should be associated with the newly created node that has the g/w node tag.
How to reproduce it (as minimally and precisely as possible):
Run cloud prepare for RHOS with the dedicated g/w node option set to true.
Anything else we need to know?:
Environment:
Dependabot now supports grouping dependencies so that they are upgraded together in a single PR; see https://github.blog/changelog/2023-06-30-grouped-version-updates-for-dependabot-public-beta/ for details. This would be useful for a number of dependencies, in particular AWS dependencies.
Currently we're using the openshift-machine-api user in AWS to run subctl cloud prepare/cleanup.
This user is created by the OpenShift installer and has some of the permissions we need, but is also lacking the following:
We have several alternatives to solve this:
What would you like to be added:
It would be more user friendly if we let users specify their metadata.json file and extracted the infra ID and region from there.
e.g.:
subctl cloud prepare aws --ocp-metadata /path/to/metadata.json
instead of:
subctl cloud prepare aws --infra-id <infraid> --region <region>
Alternatively we could perhaps extract it from the OpenShift installation (only if it's stored there).
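Extracting the two values could look roughly like the sketch below. The field layout (`infraID` at the top level and `region` under `aws`) is an assumption about the installer's metadata.json format, and `parseMetadata` and the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ocpMetadata models only the fields needed here. The layout is an
// assumption about the installer's metadata.json format.
type ocpMetadata struct {
	InfraID string `json:"infraID"`
	AWS     struct {
		Region string `json:"region"`
	} `json:"aws"`
}

// parseMetadata extracts the infra ID and region, failing if either is absent.
func parseMetadata(data []byte) (infraID, region string, err error) {
	var md ocpMetadata
	if err := json.Unmarshal(data, &md); err != nil {
		return "", "", fmt.Errorf("parsing OCP metadata: %w", err)
	}
	if md.InfraID == "" || md.AWS.Region == "" {
		return "", "", errors.New("OCP metadata is missing the infra ID or region")
	}
	return md.InfraID, md.AWS.Region, nil
}

func main() {
	// Sample content made up for this sketch.
	sample := []byte(`{"infraID": "demo-abc12", "aws": {"region": "us-east-1"}}`)
	fmt.Println(parseMetadata(sample))
}
```

Explicit --infra-id and --region flags could still take precedence over the file, so existing invocations keep working.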
Why is this needed:
Support disabling dedicated gateways when that's desired.
Also, we can support dedicated gateways behind a load balancer if we want, but those then need to run on the normal private network used by the other nodes; otherwise it doesn't work.
Following submariner-io/shipyard#573, we would like to get rid of project-specific Dockerfile.dapper files.
Depends on submariner-io/shipyard#575
When we deploy gateway nodes, we should make sure they're labeled as infrastructure nodes, so they don't count against customers' subscriptions.
See https://docs.openshift.com/container-platform/4.12/nodes/nodes/nodes-nodes-creating-infrastructure-nodes.html#creating-an-infra-node_creating-infrastructure-nodes; the relevant label is node-role.kubernetes.io/infra="".
When running "subctl cloud prepare osp ..." on a fresh cluster, it properly configures the gateway node, creates the necessary Security Group rules and associates the SG with the nodes. However, if one of the existing nodes is already labelled as a gateway node when we run "subctl cloud prepare osp ...", it assumes that the necessary steps are already done and does not associate the required Security Group rules with the nodes.
Workaround:
subctl-driven deployment: delete the submariner.io/gateway=true label and re-run the subctl cloud-prepare command.
submariner-addon: remove it from the ManagedCluster and re-install it.
This week’s bump to the latest Azure SDK failed:
pkg/azure/azure.go:25:2: SA1019: package github.com/Azure/azure-sdk-for-go/services/network/mgmt/2021-03-01/network is deprecated: Please note, this package has been deprecated. A replacement package is available [github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork). We strongly encourage you to upgrade to continue receiving updates. See [Migration Guide](https://aka.ms/azsdk/golang/t2/migration) for guidance on upgrading. Refer to our [deprecation policy](https://azure.github.io/azure-sdk/policies_support.html) for more details. (staticcheck)
"github.com/Azure/azure-sdk-for-go/services/network/mgmt/2021-03-01/network"
^
See https://github.com/Azure/azure-sdk-for-go/blob/main/documentation/MIGRATION_GUIDE.md for details.
What would you like to be added:
Enhance cloud prepare to support dedicated g/w node in RHOS
Why is this needed:
This is needed to support Submariner in RHOS.
What would you like to be removed:
Let's deprecate this mode for 0.15 and remove it on 0.16
Why is this needed:
We're not actually testing or using non-dedicated gateway mode from cloud prepare.
Furthermore, a more K8s native approach is to deploy with LB mode which actually handles the cloud related operations properly.
From a technical debt perspective, the code is just sitting there and making cloud-prepare harder to maintain. With it removed, we'll have a lighter maintenance burden and could further simplify the cloud-prepare code.
What happened:
In OpenStack, cloud prepare does not create the gateway node when the cluster uses the OVN-Kubernetes CNI.
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Run cloud prepare in an OpenShift cluster with the OVNKubernetes CNI.
Anything else we need to know?:
The error below is shown in the machineset controller:
E0406 15:03:56.893497 1 controller.go:326] "msg"="Reconciler error" "error"="error creating Openstack instance: error getting security groups: security group asuryanarhos-r7jdj-submariner-internal-sg not found" "controller"="machine-controller" "name"="asuryanarhos-r7jdj-submariner-gw-0-6vdcb" "namespace"="openshift-machine-api" "object"={"name":"asuryanarhos-r7jdj-submariner-gw-0-6vdcb","namespace":"openshift-machine-api"} "reconcileID"="26d333f5-695e-4213-807e-a24caf5a0b2e"
W0406 15:20:57.362743 1 controller.go:382] asuryanarhos-r7jdj-submariner-gw-0-6vdcb: failed to create machine: error creating Openstack instance: error getting security groups: security group asuryanarhos-r7jdj-submariner-internal-sg not found
Environment:
What would you like to be added:
Support for automated infra configuration of GCP-based environments.