
cloud-image-builder's Introduction

Cloud Image Builder

This repository contains the image generation tooling for AWS, GCP, and Minikube.

Resources

Workflow

The workflow starts on a Jenkins pipeline, which fills in some environment variables and then executes the job:

  • shell/build.sh

This script executes three Ansible playbooks:

  • ${PROVIDER}-provision.yml: Creates the instance on ${PROVIDER} and waits for SSH access (CentOS 7).
  • ${PROVIDER}-setup.yml: Downloads/copies the needed resources and installs a script called first-boot.sh, which installs Kubernetes and then KubeVirt when the user creates an instance of the image (not during image generation).
  • ${PROVIDER}-mkimage.yml: Stops the instance, creates a ${PROVIDER} image from the stopped instance, deletes the source instance, and then creates a new googlecloud-sdk instance prepared to upload the image to a GCS bucket (the upload itself happens later).
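The build orchestration can be sketched roughly as follows. This is a hypothetical dry-run outline, not the actual contents of shell/build.sh; the echo stands in for the real ansible-playbook invocation, and the gcp default for PROVIDER is an assumption:

```shell
#!/bin/sh
# Hypothetical dry-run sketch of the shell/build.sh flow; the real script's
# contents may differ. PROVIDER would come from the Jenkins pipeline env vars.
set -eu

PROVIDER="${PROVIDER:-gcp}"   # e.g. gcp or aws (default here is an assumption)

cmds=""
for stage in provision setup mkimage; do
  cmd="ansible-playbook ${PROVIDER}-${stage}.yml"
  echo "$cmd"          # dry run; the real job would execute this command
  cmds="$cmds $cmd"
done
```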

Then the next step runs:

  • shell/publish.sh

This step executes one more Ansible playbook:

  • ${PROVIDER}-publish.yml: On the instance created earlier from a googlecloud-sdk base image:
    • Copies the credentials to the remote instance.
    • Exports the generated image to a GCS bucket as a tar.gz file.
    • Stops and deletes the instance.
    • The exported tar.gz file contains the raw disk of the GCE instance.
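As a hedged sketch, the GCP export step amounts to something like the following. The image and bucket names are placeholders, and the command is echoed as a dry run rather than executed:

```shell
#!/bin/sh
# Hypothetical sketch of the GCP publish step. IMAGE_NAME and GCS_BUCKET are
# placeholders, not the project's real values.
set -eu

IMAGE_NAME="${IMAGE_NAME:-kubevirt-image}"
GCS_BUCKET="${GCS_BUCKET:-example-bucket}"
DEST_URI="gs://${GCS_BUCKET}/${IMAGE_NAME}.tar.gz"

# Dry run; the playbook performs the equivalent of exporting the image's raw
# disk as a tar.gz into the bucket, then stopping and deleting the instance.
echo "gcloud compute images export --image ${IMAGE_NAME} --destination-uri ${DEST_URI}"
```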

In the AWS case, this last step changes: the YAML file propagates the AMI among several regions (listed below), and then a JSON file is generated with the AMI IDs.

AWS Regions to propagate to

  • us-west-1
  • us-east-1
  • us-east-2 (disabled)
  • us-west-2 (disabled)
  • ca-central-1 (disabled)
  • eu-west-1
  • eu-west-2 (disabled)
  • eu-west-3 (disabled)
  • eu-central-1 (disabled)
  • ap-northeast-1 (disabled)
  • ap-southeast-1
  • ap-southeast-2 (disabled)
  • ap-south-1 (disabled)
  • sa-east-1 (disabled)
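The propagation step can be sketched like this. It is a hypothetical dry run: the AMI ID, image name, and source region are assumptions, and the real pipeline would capture each ImageId from the aws ec2 copy-image JSON output into the generated file:

```shell
#!/bin/sh
# Hypothetical sketch of AMI propagation across the enabled regions above.
# SOURCE_AMI, SOURCE_REGION, and the image name are placeholders.
set -eu

SOURCE_REGION="us-east-1"
SOURCE_AMI="ami-00000000"
ENABLED_REGIONS="us-west-1 eu-west-1 ap-southeast-1"

copied=0
for region in $ENABLED_REGIONS; do
  # Dry run; the real step would execute this and record the returned AMI ID
  # into the JSON file mentioned above.
  echo "aws ec2 copy-image --source-region ${SOURCE_REGION}" \
       "--source-image-id ${SOURCE_AMI} --region ${region} --name kubevirt-image"
  copied=$((copied + 1))
done
```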

Enhancements

This repo lets the user access an instance on GCE or AWS that contains everything necessary to run through the labs, but nothing is installed until the user spins up the instance. This is important to know: the images only save time by pre-downloading some resources, while we still need to maintain many things and dedicate some resources to doing this.

TO-DO

  • Inline inventories.
  • ENV Vars for Ansible config.
  • Make ${PROVIDER}-setup.yml generic for all providers using block.
  • Consider creating generic YAML files instead of having one per provider.
  • Use Krew to install virtctl as a plugin
  • Reduce Ansible copy tasks using loops.
  • Delete the bin folder and download the binaries from the internet instead.
  • Deploy KubeVirt and CDI using the operator instead of kubevirt-ansible.

cloud-image-builder's People

Contributors

alosadagrande, codificat, iranzo, joejstuart, joeldavis84, jparrill, karmab, markllama, rwsu, tripledes

cloud-image-builder's Issues

first-boot.sh run fails on startup

On instance start, both on AWS and GCP, the first-boot.sh script fails to run.

[centos@ip-192-168-2-190 ~]$ journalctl | grep first-boot > ~/first-boot.log
[centos@ip-192-168-2-190 ~]$ more ~/first-boot.log 
Mar 26 14:52:48 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: sed: can't read playbooks/roles/kubernetes-master/vars/main.yml: No such file or directory
Mar 26 14:52:55 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: PLAY [install kubernetes] ******************************************************
...
Mar 26 15:00:03 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: TASK [find CentOS version] *****************************************************
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: changed: [localhost] => {"changed": true, "cmd": "cat /etc/redhat-release | awk '{ print $4; }'", "delta": "0:00:00.004647", "end": "2019-03-26 15:00:04.254490", "rc": 0, "start": "2019-03-26 15:00:04.249843", "stderr": "", "stderr_lines": [], "stdout": "7.6.1810", "stdout_lines": ["7.6.1810"]}
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: TASK [find Kubernetes version] *************************************************
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: changed: [localhost] => {"changed": true, "cmd": "kubectl get nodes | grep master | awk '{ print $5; }'", "delta": "0:00:00.056270", "end": "2019-03-26 15:00:04.463120", "rc": 0, "start": "2019-03-26 15:00:04.406850", "stderr": "The connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server localhost:8080 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: TASK [find KubeVirt version] ***************************************************
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: changed: [localhost] => {"changed": true, "cmd": "cat /home/centos/kubevirt-version", "delta": "0:00:00.003771", "end": "2019-03-26 15:00:04.618157", "rc": 0, "start": "2019-03-26 15:00:04.614386", "stderr": "", "stderr_lines": [], "stdout": "0.14.0", "stdout_lines": ["0.14.0"]}
Mar 26 15:00:04 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: TASK [template motd file] ******************************************************
Mar 26 15:00:05 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: changed: [localhost] => {"changed": true, "checksum": "739633db3c912086a73519ac2a47f6ef3f90bbf8", "dest": "/etc/motd", "gid": 0, "group": "root", "md5sum": "e35943e142bcbf4496a234278d5149a5", "mode": "0644", "owner": "root", "secontext": "system_u:object_r:etc_t:s0", "size": 357, "src": "/root/.ansible/tmp/ansible-tmp-1553612404.67-57418666654755/source", "state": "file", "uid": 0}
Mar 26 15:00:05 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: PLAY RECAP *********************************************************************
Mar 26 15:00:05 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: localhost                  : ok=5    changed=4    unreachable=0    failed=0
Mar 26 15:00:05 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: Note: Forwarding request to 'systemctl disable kubevirt-installer.service'.
Mar 26 15:00:05 ip-192-168-2-190.ec2.internal first-boot.sh[4290]: Removed symlink /etc/systemd/system/multi-user.target.wants/kubevirt-installer.service.
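One hedged reading of the sed error above is a working-directory problem: the path playbooks/roles/kubernetes-master/vars/main.yml is relative to the kubevirt-ansible checkout, so if systemd starts first-boot.sh somewhere else the file is not found. A defensive sketch, where ANSIBLE_DIR is an assumed location and the edit itself is only illustrated:

```shell
#!/bin/sh
# Hedged guess at the failure: make the relative vars path robust by changing
# into the checkout first. ANSIBLE_DIR is an assumption about where the
# kubevirt-ansible repo lives on the image.
set -eu

ANSIBLE_DIR="${ANSIBLE_DIR:-/home/centos/kubevirt-ansible}"
VARS_FILE="playbooks/roles/kubernetes-master/vars/main.yml"

if cd "$ANSIBLE_DIR" 2>/dev/null && [ -f "$VARS_FILE" ]; then
  status="found"
  echo "would run sed against ${VARS_FILE} in ${ANSIBLE_DIR}"
else
  status="missing"
  echo "checkout or vars file not found at ${ANSIBLE_DIR}/${VARS_FILE}"
fi
```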

Instance cleanup is timing out on GCP

From CI log https://jenkins-kubevirt.apps.ci.centos.org/blue/rest/organizations/jenkins/pipelines/cloud-image-builder/branches/PR-71/runs/2/log/?start=0,

[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] [cloud-image-builder_PR-71-JQTBPZAX4DCIWKVPBIRIXXGVDKM26HY3KRKIPXGHE6LEQA2W3BDQ] Running shell script
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] + ansible-playbook -vvv --private-key **** gcp-test-centos-cleanup.yml
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] ansible-playbook 2.6.1
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] config file = /workDir/workspace/cloud-image-builder_PR-71-JQTBPZAX4DCIWKVPBIRIXXGVDKM26HY3KRKIPXGHE6LEQA2W3BDQ/ansible.cfg
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] configured module search path = [u'/workDir/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] ansible python module location = /usr/lib/python2.7/site-packages/ansible
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] executable location = /usr/bin/ansible-playbook
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] python version = 2.7.15 (default, May 16 2018, 17:50:09) [GCC 8.1.1 20180502 (Red Hat 8.1.1-1)]
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] Using /workDir/workspace/cloud-image-builder_PR-71-JQTBPZAX4DCIWKVPBIRIXXGVDKM26HY3KRKIPXGHE6LEQA2W3BDQ/ansible.cfg as config file
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] Parsed /etc/ansible/hosts inventory source with ini plugin
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] [WARNING]: provided hosts list is empty, only localhost is available. Note
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] that the implicit localhost does not match 'all'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df]
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] PLAYBOOK: gcp-test-centos-cleanup.yml ******************************************
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] 1 plays in gcp-test-centos-cleanup.yml
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df]
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] PLAY [localhost] ***************************************************************
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] META: ran handlers
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df]
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] TASK [Delete the test instance] ************************************************
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] task path: /workDir/workspace/cloud-image-builder_PR-71-JQTBPZAX4DCIWKVPBIRIXXGVDKM26HY3KRKIPXGHE6LEQA2W3BDQ/gcp-test-centos-cleanup.yml:11
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: default
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> EXEC /bin/sh -c 'echo ~default && sleep 0'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "echo /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710" && echo ansible-tmp-1540343031.65-22375549268710="echo /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710" ) && sleep 0'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] Using module file /usr/lib/python2.7/site-packages/ansible/modules/cloud/google/gce.py
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> PUT /workDir/.ansible/tmp/ansible-local-1622PyY9bx/tmpJu_W2E TO /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710/gce.py
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> EXEC /bin/sh -c 'chmod u+x /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710/ /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710/gce.py && sleep 0'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> EXEC /bin/sh -c '/usr/bin/python2 /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710/gce.py && sleep 0'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] <127.0.0.1> EXEC /bin/sh -c 'rm -f -r /workDir/.ansible/tmp/ansible-tmp-1540343031.65-22375549268710/ > /dev/null 2>&1 && sleep 0'
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] The full traceback is:
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] Traceback (most recent call last):
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 748, in
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] main()
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 695, in main
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] module, gce, inames, number, lc_zone, state)
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 604, in change_instance_state
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] gce.destroy_node(node)
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 6428, in destroy_node
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] self.connection.async_request(request, method='DELETE')
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 787, in async_request
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] (self.timeout))
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] libcloud.common.types.LibcloudError: <LibcloudError in None 'Job did not complete in 180 seconds'>
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] fatal: [localhost]: FAILED! => {
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] "changed": false,
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] "module_stderr": "Traceback (most recent call last):\n File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 748, in \n main()\n File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 695, in main\n module, gce, inames, number, lc_zone, state)\n File "/tmp/ansible_6Xe84b/ansible_module_gce.py", line 604, in change_instance_state\n gce.destroy_node(node)\n File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 6428, in destroy_node\n self.connection.async_request(request, method='DELETE')\n File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 787, in async_request\n (self.timeout))\nlibcloud.common.types.LibcloudError: <LibcloudError in None 'Job did not complete in 180 seconds'>\n",
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] "module_stdout": "",
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] "msg": "MODULE FAILURE",
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] "rc": 1
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] }
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] to retry, use: --limit @/workDir/workspace/cloud-image-builder_PR-71-JQTBPZAX4DCIWKVPBIRIXXGVDKM26HY3KRKIPXGHE6LEQA2W3BDQ/gcp-test-centos-cleanup.retry
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df]
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] PLAY RECAP *********************************************************************
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df] localhost : ok=0 changed=0 unreachable=0 failed=1
[gcp-centos-a5e7fc18-d90e-4a3f-9362-479bbfeeb5df]
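Since libcloud's async_request gives up after 180 seconds, one possible workaround (a hedged sketch, not the project's actual fix) is to retry the deletion, or hand it to gcloud, which waits on the operation itself. Instance and zone names below are placeholders:

```shell
#!/bin/sh
# Hedged workaround sketch: retry instance deletion instead of failing on the
# first 180 s libcloud timeout. Commands are echoed as a dry run.
set -eu

INSTANCE="gcp-test-centos"
ZONE="us-central1-a"
MAX_ATTEMPTS=3

attempt=1
while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
  # Dry run; a real retry loop would execute the command, break on success,
  # and sleep between attempts.
  echo "attempt ${attempt}: gcloud compute instances delete ${INSTANCE} --zone ${ZONE} --quiet"
  attempt=$((attempt + 1))
done
```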

kubevirt install failing on multus provisioning on master builds

Here's the error I'm seeing on an EC2 node:

[centos@ip-10-0-4-184 kubevirt-ansible]$ sudo ansible-playbook playbooks/kubevirt.yml -e@vars/all.yml -e cluster=kubernetes --connection=local -i inventory-aws

PLAY [masters[0]] **********************************************************************************************************************

TASK [network-multus : include_tasks] **************************************************************************************************
Friday 05 October 2018 00:38:58 +0000 (0:00:00.071) 0:00:00.071 ********
fatal: [ip-10-0-4-184.ec2.internal]: FAILED! => {
"reason": "The field 'loop' is supposed to be a string type, however the incoming data structure is a <class 'ansible.parsing.yaml.objects.AnsibleSequence'>\n\nThe error appears to have been in '/home/centos/kubevirt-ansible/playbooks/roles/network-multus/tasks/provision.yml': line 58, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Wait until multus is running\n ^ here\n"

I believe the issue is that the image is using ansible version 2.4 and we need to upgrade to a more recent version. kubevirt-ansible has moved to 2.6.3: kubevirt/kubevirt-ansible@3f842dd
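A minimal sketch of that check, assuming the version values shown (CURRENT is a placeholder for what `ansible --version` would report on the image):

```shell
#!/bin/sh
# Hedged sketch of the proposed fix: verify the image ships a new enough
# Ansible before running kubevirt-ansible. CURRENT is a placeholder value.
set -eu

REQUIRED="2.6.3"
CURRENT="2.4.0"

# sort -V puts the smaller version first, so if REQUIRED sorts first the
# installed version is already new enough.
lowest=$(printf '%s\n%s\n' "$REQUIRED" "$CURRENT" | sort -V | head -n 1)
if [ "$lowest" != "$REQUIRED" ]; then
  echo "ansible ${CURRENT} is too old; would run: sudo pip install ansible==${REQUIRED}"
fi
```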

[labs] latest gcp image does not have virtctl binary

Transferred from: kubevirt/kubevirt.github.io#374

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

Following lab1 on a GCP instance created from the kubevirt-0.20.7-4 image, the step about starting the first VM fails because there's no virtctl binary:

$ pwd
/home/centos
$ ls -l virtctl
ls: cannot access virtctl: No such file or directory

What you expected to happen:

Lab should work as described
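A hedged workaround sketch: fetch virtctl for the matching KubeVirt release from the upstream release page. The version below is inferred from the image name in this report and may not match what /home/centos/kubevirt-version actually contains:

```shell
#!/bin/sh
# Hedged workaround sketch: download virtctl from the upstream KubeVirt
# releases. KUBEVIRT_VERSION is an assumption based on the image name.
set -eu

KUBEVIRT_VERSION="v0.20.7"
URL="https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/virtctl-${KUBEVIRT_VERSION}-linux-amd64"

# Dry run; the real fix would be: curl -L -o virtctl "$URL" && chmod +x virtctl
echo "curl -L -o virtctl ${URL}"
```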

Anything else we need to know?:

Problem deleting hostpath PVC created using CDI. Hangs.

After completing this lab http://kubevirt.io/labs/kubernetes/lab7, I proceeded to delete the fedora PVC.

[centos@ip-172-30-0-188 ~]$ kubectl delete -f pvc_fedora.yml 
persistentvolumeclaim "fedora" deleted

It reports as deleted but the command hangs, and the prompt does not return, even after an hour has passed.

I then opened a second prompt to view the PVC status and it reports:

[centos@ip-172-30-0-188 ~]$ kubectl get pvc
NAME     STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
fedora   Terminating   pvc-8dbf9c5f-d161-11e8-a817-16237247c730   10Gi       RWO            hostpath       66m

I would expect the delete to complete quickly and not result in a hang.
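A hedged diagnostic sketch: a PVC stuck in Terminating typically still carries the kubernetes.io/pvc-protection finalizer because something (for example a leftover CDI importer pod) still references it. The commands below are echoed as a dry run; removing a finalizer by hand is a last resort that can orphan data:

```shell
#!/bin/sh
# Hedged diagnostic sketch for a PVC stuck in Terminating. PVC name is taken
# from the report above; commands are only echoed here.
set -eu

PVC="fedora"

checks="kubectl get pvc ${PVC} -o jsonpath='{.metadata.finalizers}'
kubectl describe pvc ${PVC}
kubectl get pods --all-namespaces -o wide"

echo "$checks"
```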

Versions
OS: CentOS 7.5.1804 Core
Kubernetes: v1.12.1
KubeVirt: v0.8.0
CDI: v1.2.0

feedback on generated vm

  • virtctl is missing from the PATH
  • CDI is missing
  • the KubeVirt UI is missing
  • for multus, it might be helpful to create some network attachment definitions
