
shipyard's People

Contributors

astoycos, aswinsuryan, billy99, davidohana, dependabot-preview[bot], dependabot[bot], dfarrell07, jaanki, maayanf24, mangelajo, maryamtahhan, maxbab, mkimuram, mkolesnik, pinikomarov, roytman, skitt, sridhargaddam, tpantelis, vthapar, yboaron


shipyard's Issues

Connector pod in E2E tests does not wait before retrying on some failures

Currently, as part of the e2e tests, as soon as the connector pod is
scheduled it tries to connect to the listener pod with a wait
interval configured in Config.ConnectionTimeout (defaults to 18 secs).

However, in some scenarios, if there is an error while accessing
the remote server, the current logic does not wait for those 18 secs
before retrying. As a result, all the CONN_TRIES attempts happen
one after the other and the test case is marked as failed.
This is seen with Globalnet jobs, where it takes time for the
ingress/egress rules to be programmed on the gateway nodes.
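
A minimal sketch of what the connector's command could do instead; the variable names (CONN_TRIES, RETRY_INTERVAL, REMOTE_IP, REMOTE_PORT, SEND_STRING) are illustrative, not necessarily what the framework uses:

 # Sleep between attempts so that later retries can succeed once the Globalnet
 # ingress/egress rules have been programmed on the gateway nodes.
 for i in $(seq "${CONN_TRIES:-7}"); do
     if echo "$SEND_STRING" | nc -w "${RETRY_INTERVAL:-18}" "$REMOTE_IP" "$REMOTE_PORT"; then
         break
     fi
     sleep "${RETRY_INTERVAL:-18}"
 done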

Example:
Waiting for the connector pod "tcp-check-pod5rfhb" to exit, returning what connector sent
INFO: Pod "tcp-check-pod5rfhb" output:
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host
nc: 169.254.2.13 (169.254.2.13:1234): No route to host

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: gopkg.in/inf.v0@&lt;version&gt;: unrecognized import path "gopkg.in/inf.v0" (parse https://gopkg.in/inf.v0?go-get=1: no go-import meta tags ())

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Lighthouse images are not pulled locally with kind when deploying from scratch

What happened:
When making a deployment from scratch (make deploy from https://github.com/submariner-io/submariner), while the Submariner-related images are not present in the local "docker images" output, the Lighthouse images (lighthouse-agent and lighthouse-coredns) are not pulled.

The lighthouse deployment pods have the "ImagePullBackOff" status.

$ kubectl --kubeconfig output/kubeconfigs/kind-config-cluster2 -n submariner-operator get pods
NAME                                            READY   STATUS             RESTARTS   AGE
submariner-gateway-zcrw9                        1/1     Running            0          109s
submariner-lighthouse-agent-ccdbc9659-t8bv5     0/1     ImagePullBackOff   0          108s
submariner-lighthouse-coredns-557485fbc-pn45l   0/1     ImagePullBackOff   0          107s
submariner-lighthouse-coredns-557485fbc-wwm89   0/1     ImagePullBackOff   0          107s
submariner-operator-6675977db7-l5nl7            1/1     Running            0          2m2s
submariner-routeagent-276tw                     1/1     Running            0          109s
submariner-routeagent-z89p5                     1/1     Running            0          108s
submariner-routeagent-zth62                     1/1     Running            0          108s

Deployments request the following images:

$ kubectl --kubeconfig output/kubeconfigs/kind-config-cluster2 -n submariner-operator describe deployment submariner-lighthouse-agent | grep -i image
    Image:      localhost:5000/lighthouse-agent:local

$ kubectl --kubeconfig output/kubeconfigs/kind-config-cluster2 -n submariner-operator describe deployment submariner-lighthouse-coredns | grep -i image
    Image:      localhost:5000/lighthouse-coredns:local

The "docker images" output (the Lighthouse images are missing):

$ docker images
REPOSITORY                                           TAG                 IMAGE ID            CREATED             SIZE
quay.io/submariner/submariner-networkplugin-syncer   dev                 3b82d8483bdc        4 minutes ago       110MB
quay.io/submariner/submariner-networkplugin-syncer   devel               3b82d8483bdc        4 minutes ago       110MB
submariner                                           master              f2814c74b8f7        11 minutes ago      800MB
quay.io/submariner/submariner-globalnet              <none>              460ad9783141        50 minutes ago      124MB
quay.io/submariner/submariner-globalnet              dev                 6b44c121f332        50 minutes ago      124MB
quay.io/submariner/submariner-globalnet              devel               6b44c121f332        50 minutes ago      124MB
localhost:5000/submariner-route-agent                local               ad6c3273b6d4        50 minutes ago      124MB
quay.io/submariner/submariner-route-agent            dev                 ad6c3273b6d4        50 minutes ago      124MB
quay.io/submariner/submariner-route-agent            devel               ad6c3273b6d4        50 minutes ago      124MB
quay.io/submariner/submariner-route-agent            <none>              1a3ee61d89bc        50 minutes ago      124MB
quay.io/submariner/submariner                        <none>              32c193e4b57b        50 minutes ago      245MB
localhost:5000/submariner                            local               c17d290c5a16        50 minutes ago      245MB
quay.io/submariner/submariner                        dev                 c17d290c5a16        50 minutes ago      245MB
quay.io/submariner/submariner                        devel               c17d290c5a16        50 minutes ago      245MB
localhost:5000/submariner-operator                   local               5dab07c69884        11 hours ago        10MB
quay.io/submariner/submariner-operator               dev                 5dab07c69884        11 hours ago        10MB
quay.io/submariner/submariner-operator               devel               5dab07c69884        11 hours ago        10MB
quay.io/submariner/shipyard-dapper-base              devel               45c574b97e56        17 hours ago        800MB
fedora                                               33                  b3048463dcef        5 days ago          175MB
registry.access.redhat.com/ubi8/ubi-minimal          latest              c103a05423dd        2 weeks ago         103MB
localhost:5000/nettest                               local               ed8d90d0ba28        3 weeks ago         24.3MB
quay.io/submariner/nettest                           dev                 ed8d90d0ba28        3 weeks ago         24.3MB
quay.io/submariner/nettest                           devel               ed8d90d0ba28        3 weeks ago         24.3MB
quay.io/submariner/submariner-networkplugin-syncer   <none>              1554c6f5835a        3 weeks ago         165MB
registry                                             2                   2d4f4b5309b1        5 months ago        26.2MB
kindest/node                                         v1.17.0             ec6ab22d89ef        10 months ago       1.23GB

If I pull the images manually, everything works.
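
For reference, the manual workaround amounts to pulling the upstream images and pushing them into the local kind registry; the quay.io source names below are an assumption based on how the other components are named:

 docker pull quay.io/submariner/lighthouse-agent:devel
 docker tag quay.io/submariner/lighthouse-agent:devel localhost:5000/lighthouse-agent:local
 docker push localhost:5000/lighthouse-agent:local
 # ...and the same for lighthouse-coredns.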

What you expected to happen:
The "make deploy" command should pull all requested images and prepare the environment.

How to reproduce it (as minimally and precisely as possible):
Delete all Submariner-related images from the local "docker images" list and run make deploy.

Environment:

  • Submariner version: v0.7.0
  • Kubectl version: Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:17:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
  • Kind version: 0.9.0
  • OS: Fedora 33
  • Kernel: Linux max 5.9.8-200.fc33.x86_64 #1 SMP Tue Nov 10 21:58:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

The docs troubleshooting guide mentions the Lighthouse image issue, but I think this should happen automatically.

Support FIPS mode on Red Hat-supported platforms

At some point we should support FIPS mode, which requires building Go binaries in such a way that they can use the system's OpenSSL libraries rather than Go's built-in crypto. This is possible using go-toolset, which is available on RHEL and can be enabled on UBI8 on hosts with an appropriate license:

 docker run -it --rm registry.access.redhat.com/ubi8/ubi:latest
 dnf install go-toolset

This currently provides Go 1.13, which is fine for our purposes.
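
A rough, hedged sketch of what the FIPS-friendly build could look like inside that container (the build target is illustrative, not Shipyard's actual build line):

 # go-toolset's Go can back its crypto with the system OpenSSL when cgo is
 # enabled, which is what FIPS mode requires.
 dnf install -y go-toolset
 CGO_ENABLED=1 go build ./...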

Provide Shipyard shared infra in all submariner-io repos

Currently, some repositories don't provide the shared Shipyard infrastructure:

It would be useful to be able to run local linting in those repos, via the shared Shipyard tooling, instead of relying on GHAs in CI.

While adding Shipyard infra to these projects, also update/refactor the (now-removed) docs about the process of adding Shipyard to repos.

Implement basic GH action

A basic action is needed for any kind deployment to work, since the default GitHub Actions VMs don't have enough free space.
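
For illustration only (not an existing Shipyard action), this is the kind of cleanup such an action could run on the default Ubuntu runners to free space before a kind deployment:

 # Remove large pre-installed toolchains the jobs don't need; the paths are the
 # usual ones on GitHub's Ubuntu runners and may change over time.
 sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc
 docker system prune -af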

Add capability to run consuming projects e2e

We need a way to run CI jobs that test that we don't break consuming projects.

One idea is to trigger such jobs in GHA by comments, e.g. "/testprojects" or something like that.

Move codegen logic to Shipyard

The codegen target is used in Submariner and was also copied into Lighthouse.

It would be better to place it in Shipyard.

Add globalnet flag to E2E Framework

The code to detect whether Globalnet is enabled resides in the submariner project. Instead of moving it to Shipyard, pass it as a flag on the go test command line, since the front-end script already knows whether Globalnet is enabled.
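
A hedged sketch of what that could look like; the -globalnet flag name is hypothetical and would have to be registered by the E2E framework via the flag package:

 # The front-end script already knows whether Globalnet is enabled, so it can
 # forward that to the test binary instead of the framework re-detecting it.
 go test ./test/e2e/... -args -globalnet=true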

RunNoConnectivityTest has very low timeout -> false positives

test/e2e/tcp/connectivity.go

func RunNoConnectivityTest(p ConnectivityTestParams) (*framework.NetworkPod, *framework.NetworkPod) {
	if p.ConnectionTimeout == 0 {
		p.ConnectionTimeout = 5

mangelajo (Member, 4 hours ago):
I guess this is the code we had, but while we are at refactors I don't want us to forget about it :)

Such a timeout is too low:

  • we create the listener with a 5 sec timeout,
  • we go to the 2nd cluster and create the connector with the same timeout,
  • but chances are that pod creation will take much longer than the connection attempts.

This leads to false positives; we should probably bump this to at least (an arbitrary) 30 seconds. I know it will make the test slower, but it will be less likely to produce false positives.

We could refine our testing images in the future for better handling of timeouts, etc.

globalnet: E2E not working

https://travis-ci.com/github/submariner-io/submariner-operator/jobs/313547481

Shipyard expects cluster X to be allocated CIDR 169.254.X.0/24. But this is incorrect, as cluster1, which acts as the broker, shouldn't be allocated any CIDR. Current allocations are:

  •  global_CIDRs['cluster2']='169.254.0.0/19'
  •  global_CIDRs['cluster3']='169.254.32.0/19'

These are being changed in submariner-io/submariner-operator#288 to

  •  global_CIDRs['cluster2']='169.254.0.0/24'
  •  global_CIDRs['cluster3']='169.254.1.0/24'

Shipyard should be modified to align with the new values. The actual CIDR depends on the globalnet-cluster-size passed at deploy-broker or join time, so either the expectation should be made dynamic, or it should use the newer values, which assume a cluster size of 256 (one /24 per cluster).
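
For illustration, assuming a cluster size of 256 (one /24 per non-broker cluster, matching the new operator defaults), the expected allocations could be generated rather than hard-coded:

 # Hypothetical sketch: cluster1 is the broker and gets no CIDR; cluster2 and
 # cluster3 get consecutive /24 blocks starting at 169.254.0.0.
 for i in 2 3; do
     echo "global_CIDRs['cluster$i']='169.254.$((i - 2)).0/24'"
 done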

Fix Polarion not reporting the Ginkgo tests' JUnit output

We want to import the junit.xml results generated by the E2E tests into Polarion.
There's a bug in the JUMP Polarion script that causes it to fail to parse the test messages into Polarion.

The required solution has two phases:

Phase 1: Create a temporary workaround that modifies the XML file so it can be read by the JUMP tool. This can be done in Python or in Bash.

Phase 2: A complete resolution in the Ginkgo test framework (Go), creating the missing element under the relevant section and submitting it as a PR (pull request).
This requires experience with the Ginkgo framework, to fix the issue and add unit tests (not related to Submariner), verify the PR, and get approval from the community.

Please go over the ticket, and try to reproduce the issue. Once reproduced, you can start fixing it.

-dp-context of E2E has different orders across projects

Shipyard adds -dp-context cluster1 -dp-context cluster2 -dp-context cluster3; this order is used by Lighthouse.

Submariner uses -dp-context cluster2 -dp-context cluster3 -dp-context cluster1.

Subctl consumes the e2e tests providing only two dp contexts.

We need to make them uniform so that:
* The Shipyard e2e script will work for all projects.
* Subctl can consume them all uniformly.

I created this issue, but probably the best option is to make them all uniform in the order Shipyard includes them today, which makes more sense.

This needs more thought.

submariner-io/lighthouse#151
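
For example, if everything adopted the order Shipyard already uses, every consumer would invoke the E2E tests with the same context list (illustrative invocation, not the exact script line):

 go test ./test/e2e/... -args -dp-context cluster1 -dp-context cluster2 -dp-context cluster3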

Add shared build target for images

Right now images are rebuilt every time, but we could rebuild them only when needed.

It would be best to put this logic in Shipyard and use it everywhere else.

E2E on submariner-operator fails (netshoot timeout setting)

submariner-io/submariner-operator#376

https://github.com/submariner-io/submariner-operator/pull/376/checks?check_run_id=671110418#step:5:8351

deployment.apps/netshoot created
Waiting for netshoot pods to be ready.
[submariner-operator]$ [cluster2] kubectl rollout status deploy/netshoot --timeout=
[submariner-operator]$ [cluster2] kubectl rollout status deploy/netshoot --timeout=
[submariner-operator]$ [cluster2] command kubectl --context=cluster2 rollout status deploy/netshoot --timeout=
[submariner-operator]$ [cluster2] kubectl --context=cluster2 rollout status deploy/netshoot --timeout=
Error: invalid argument "" for "--timeout" flag: time: invalid duration 
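
The empty value suggests the timeout variable isn't set on that code path; a minimal sketch of a guard (the variable name is illustrative, not necessarily the one the script uses):

 # Default the timeout when the caller doesn't provide one, so kubectl never
 # sees an empty --timeout= value.
 kubectl --context=cluster2 rollout status deploy/netshoot --timeout="${timeout:-5m}"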

reload-images restart=routeagent does not work consistently.

In the Submariner repo, we have a make target, "make reload-images restart=routeagent", which allows us to update the route-agent pods in the kind clusters and restart them at the same time. However, when we execute that command, not all route-agent pods get restarted.
This is because the submariner-operator overrides the changes made by the restart command shown below.

kubectl patch -n submariner-operator daemonset submariner-routeagent --type=json -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/imagePullPolicy", "value": "Always"}, {"op": "replace", "path": "/spec/template/metadata/labels/modified", "value": "1595246714"}]'

We need a more reliable way to support this target.

Old E2E namespaces should be removed or overridden

When running E2E tests (e.g. with subctl) multiple times, old E2E namespaces, which include Nginx, may be left behind for days without being deleted.
This eventually causes low disk space and memory on the cluster nodes, and then pods start to crash or get evicted:

$ kubectl get pods -n submariner-operator -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP             NODE                             NOMINATED NODE   READINESS GATES
submariner-gateway-px2b9                       0/1     Evicted   0          4m      <none>         default-cl1-k7hcq-worker-g2cnf   <none>           <none>
submariner-globalnet-2b74f                     0/1     Evicted   0          3m59s   <none>         default-cl1-k7hcq-worker-g2cnf   <none>           <none>
submariner-lighthouse-agent-6bc4766f97-lbxc2   1/1     Running   0          3h36m   10.255.0.213   default-cl1-k7hcq-worker-hctn6   <none>           <none>
submariner-lighthouse-coredns-c88f64f5-q4hfp   1/1     Running   0          3h36m   10.255.0.215   default-cl1-k7hcq-worker-hctn6   <none>           <none>
submariner-lighthouse-coredns-c88f64f5-qhhrf   1/1     Running   0          3h36m   10.255.0.214   default-cl1-k7hcq-worker-hctn6   <none>           <none>
submariner-operator-dcbdf5669-vngwt            1/1     Running   0          3h36m   10.255.0.212   default-cl1-k7hcq-worker-hctn6   <none>           <none>
submariner-routeagent-2j8bx                    1/1     Running   0          3h36m   10.166.2.69    default-cl1-k7hcq-worker-hctn6   <none>           <none>
submariner-routeagent-9w9q4                    1/1     Running   0          3h36m   10.166.2.159   default-cl1-k7hcq-master-1       <none>           <none>
submariner-routeagent-cdjhs                    1/1     Running   0          3h36m   10.166.2.92    default-cl1-k7hcq-master-2       <none>           <none>
submariner-routeagent-htcbm                    0/1     Evicted   0          3m59s   <none>         default-cl1-k7hcq-worker-g2cnf   <none>           <none>
submariner-routeagent-s6xpd                    1/1     Running   0          3h36m   10.166.2.82    default-cl1-k7hcq-master-0       <none>           <none>


Originally posted by @sridhargaddam in submariner-io/submariner#913 (comment)
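
Until this is automated, a manual cleanup along these lines could help; it assumes the E2E namespaces follow the framework's "e2e-tests-" naming prefix, which should be verified before running:

 # Delete leftover E2E namespaces (prefix assumed, verify first).
 kubectl get ns -o name | grep '^namespace/e2e-tests-' | xargs -r kubectl delete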

E2E connector pod should wait until it's ready before running any tests

Currently, in the e2e dataplane tests, we create connector and listener pods which internally use the 'nc' utility to validate TCP connectivity. For the listener pod, we wait until it's ready, but the connector pod runs the connectivity test as soon as it's deployed. While this is okay for vanilla Submariner use cases, it's prone to failures with Globalnet jobs.

Basically, when using Globalnet, one has to wait until the globalIp is annotated on the Pod/Service before cross-cluster connectivity can be validated.
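
A rough bash approximation of the missing wait (in the Go framework this would be a wait on the pod's annotations); the submariner.io/globalIp annotation name is what Globalnet uses, but treat it as an assumption here, and $CONNECTOR_POD is a placeholder:

 # Block until the connector pod has been annotated with a global IP before
 # starting the nc-based connectivity check.
 until kubectl get pod "$CONNECTOR_POD" -o jsonpath='{.metadata.annotations.submariner\.io/globalIp}' | grep -q .; do
     sleep 1
 done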

Support 2 cluster deployments

Currently, if only 2 clusters are deployed, connectivity isn't checked. Support deploying just 2 clusters and check connectivity if they have Submariner.
This is related to #136, since if we can deploy the broker on a cluster where Submariner is deployed, we can save a cluster.

Shipyard & operator (and perhaps lighthouse) stand to benefit from this.
