nuagenetworks / nuage-metroae Goto Github PK

Nuage Networks Metro Automation Engine

Home Page: http://devops.nuagenetworks.net

License: Apache License 2.0

Python 24.73% Shell 5.67% HTML 48.90% Jinja 20.68% Dockerfile 0.02%

nuage-metroae's People

jbemmel sjabasti jeanchristophelevee ani-sinha seksitya shirishbasantrai baijuw tadasburtilius bemird gikonyodils vpatel4n spamulap12 freddebacker borisux tomzhangtelus papamusk63 jopaik

nuage-metroae's Issues

Playbook to unpack Nuage binaries

The feature idea is to have a playbook that unpacks the official Nuage binaries so they can be referenced and used by the deploy playbooks.

It will take as input a directory or s3 link, and unpack the files to a particular destination directory.

It could be transformed to a role if desired, so it can be deployed against a jumphost and have src/dest variables.
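The unpack step described above can be sketched in a few lines. This is only an illustration of the idea, not the playbook itself: the function name `unpack_archives` is hypothetical, it assumes the binaries are delivered as `.tar.gz` archives in a local directory, and it does not cover the S3 case.

```python
import tarfile
from pathlib import Path


def unpack_archives(src_dir, dest_dir):
    """Extract every .tar.gz archive found in src_dir into dest_dir.

    Returns the sorted names of the archives that were unpacked.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    # Create the destination directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for archive in sorted(src.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)
        unpacked.append(archive.name)
    return unpacked
```

In a role, `src_dir` and `dest_dir` would correspond to the src/dest variables mentioned above.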

Support for Elastic search upgrade and rollback

Stats are only enabled on a single VSD even if clustered VSD

As seen in code:

VMs should have autostart enabled

Possible enhancement: Deployed VSD/VSC VMs should have autostart enabled so they start automatically after reboot of the host or a power outage.

Template nsgv.xml.j2 does not include serial parameters.

Issue:
When deploying a new NSGV, we see that we can not connect via virsh console.

Reason:
Missing information in the nsgv.xml.j2 regarding serial connectivity.

Proposal Changes:
Add the following part in the nsgv.xml.j2 template.

<console type='pty' tty='/dev/pts/16'>
  <source path='/dev/pts/16'/>
  <target type='serial' port='0'/>
  <alias name='serial0'/>
</console>

VSD HA mode causes proxy and statistics problem

This error only happens in VSD HA mode, not in standalone.

The configuration is in build_vars.yaml (see attachment).
build_vars.yaml.txt

Install playbook looks like this:

3 vsds
2 vscs
1 vstat
1 vnsutils

With the HA option you have the following proxy issues on the VMs:

Proxy Issue

On VSD:
/opt/ejabberd/bin/ejabberdctl connected_users
Proxy is not connected

On Proxy:
/var/log/vns/na.log :

[02-May-2017 23:22:13.812] [LOG] Client reconnects
[02-May-2017 23:22:13.815] [LOG] Client is connected
[02-May-2017 23:22:13.816] [LOG] Client is disconnected true { Error
at Connection.onStanza (/opt/notification/node_modules/node-xmpp-core/lib/connection.js:316:21)
at StreamParser. (/opt/notification/node_modules/node-xmpp-core/lib/connection.js:213:14)
at emitOne (events.js:96:13)
at StreamParser.emit (events.js:188:7)
at SaxLtx. (/opt/notification/node_modules/node-xmpp-core/lib/stream_parser.js:56:22)
at emitOne (events.js:96:13)
at SaxLtx.emit (events.js:188:7)
at SaxLtx._handleTagOpening (/opt/notification/node_modules/ltx/lib/sax/sax_ltx.js:30:18)
at SaxLtx.write (/opt/notification/node_modules/ltx/lib/sax/sax_ltx.js:92:26)
at StreamParser.write (/opt/notification/node_modules/node-xmpp-core/lib/stream_parser.js:125:21)
stanza:
Stanza {
name: 'stream:error',
parent: null,
attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' },
children: [ [Object] ],
nodeType: 1,
nodeName: 'error' } }

Restarting services does not solve the issue.

Statistics issue:

On the Elastic VM:
No firewall rules are added.

On VSDs:
Statistics collection is not activated on any VSD.

Recovery from VSD failure during upgrade

During the upgrade of a VSD cluster, one of the VSDs did not reach the operational state: monit reported an authentication error, and Metro failed. Metro could have provided some error handling here. After a reboot, the root user could start monit, but no services would start, and there was nothing in the install log. The workaround was to start the services, shut them down, and reboot, after which it worked. Metro could perhaps look for flags in the install log so that it complains when something is wrong. As it was, when one of the VSDs did not come up, the Ansible code had to be edited by hand to make it work.

Packages installed during upgrade must be removed

Customers are often sensitive to the applications and packages that get installed on their lab systems. Any packages that get installed during an upgrade must be removed when the upgrade is complete. Care must be taken not to uninstall a package that was already present before the upgrade.
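The rule above amounts to a set difference between the package lists taken before and after the upgrade. A minimal sketch, assuming both lists have already been collected from the target (how they are collected is not specified here, and `packages_to_remove` is a hypothetical name):

```python
def packages_to_remove(before, after):
    """Return packages present after the upgrade that were absent before it.

    Anything listed in `before` is never returned, so packages that were
    already installed are left alone.
    """
    return sorted(set(after) - set(before))
```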

VSC config.cfg syntax errors are not caught

The vsc-deploy playbook provisions the configuration on the VSCs. If there are problems with the cf1:\config.cfg (and possibly bof.cfg) file, the configuration cannot be loaded on the VSC and vsc-postdeploy will fail.

Could we add a check to make sure the configuration was loaded properly?

This is a snippet from a failed boot on my VSC:

Initializing VMM
Virtual address sharing is disabled
Time from clock is TUE FEB 21 19:24:23 2017 UTC
Initial DNS resolving preference is ipv4-only

Attempting to exec primary configuration file:
'cf1:\config.cfg' ...
System Configuration

MAJOR: CLI #1009 An error occurred while processing a CLI command -
File cf1:\config.cfg, Line 11: Command "server 0.centos.pool.ntp.org" failed.

CRITICAL: CLI #1002 The system configuration is missing or incomplete because an error occurred while processing the configuration file.
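One possible form of such a check is to scan the captured boot output for the CLI error lines shown in the snippet above. The sketch below is an assumption about how that could look: it only matches the `MAJOR:` and `CRITICAL:` line formats that appear in this log, and the name `find_config_errors` is hypothetical.

```python
import re

# Matches lines such as "MAJOR: CLI #1009 ..." and "CRITICAL: CLI #1002 ...",
# the two error formats seen in the failed-boot output above.
CONFIG_ERROR = re.compile(r"^(MAJOR|CRITICAL): CLI #\d+ .*$", re.MULTILINE)


def find_config_errors(console_output):
    """Return every CLI error line found in the boot output (empty if clean)."""
    return [match.group(0) for match in CONFIG_ERROR.finditer(console_output)]
```

A deploy step could fail early whenever the returned list is not empty, instead of waiting for vsc-postdeploy to fail.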

MAC address (nsgv_mac) is a mandatory parameter in the build_vars.yml file

Issue:
We have seen that the MAC address (nsgv_mac) is a mandatory parameter in the build_vars.yml file.
Proposal:
We think this parameter should be optional instead of mandatory.

Example of desired change:
{% if item.nsgv_mac is defined %}
nsgv_mac: '{{ item.nsgv_mac }}'
{% endif %}

Kind Regards
Guillermo

Support multiple build files

As of this writing, we support one and only one build.yml file. If a customer wants to use the same Ansible host to deploy to multiple places, they need to do something like keep multiple clones, one per deployment target, or keep multiple build.yml files that they copy over build.yml as needed. This is unwieldy.

Note that the Metro GUI may solve this for us by hiding the build.yml manipulation under the covers.

Need to separate VM name from hostname

Metro, as of v2.1.1, uses the configured hostname as the VM name on KVM. (Is there an equivalent on VMware?) Some installations may want them to be different, or we could be using Metro to upgrade an installation that was done manually.

The VM name must be optional. If it is not specified, default to the hostname.

Please implement this one asset at a time, e.g. submit a PR for VSD, then a PR for VSC, etc.

Please investigate whether this is an issue on VMware.
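The requested fallback is small enough to state as code. In this sketch the key `vmname` is a hypothetical name for the new optional setting; only `hostname` comes from the existing configuration.

```python
def vm_name_for(component):
    """Return the VM name, falling back to the hostname when vmname is unset."""
    # "vmname" is an assumed, optional key; "hostname" is always required.
    return component.get("vmname") or component["hostname"]
```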

It is not possible to build variables for mynsgvs only without myvnsutils parameters.

Issue:
We have seen that it is not possible to generate variables when you have only "mynsgvs" parameters.

Reason: The task uses an AND condition that checks for both myvnsutils AND mynsgvs parameters.

Proposal: We suggest changing the logic of the conditional statement used when filling variables for NSG and Util.

Changes:
We have put an “OR” condition instead of an “AND” in the role “build”, task “get_paths”, “VNS utility /NSGV”

Example:

- name: Register VNS variables with proper path and image locations for use in other playbooks
  set_fact:
    "{{ item.1 }}_path": "{{ rc_vns_files.results[item.0].files[0].path | dirname }}"
    "{{ item.1 }}_file_name": "{{ rc_vns_files.results[item.0].files[0].path | basename }}"
  with_indexed_items:
    - vnsutil_qcow2
    - nsgv_qcow2
  when: ( myvnsutils is defined or mynsgvs is defined ) and "'install' in vns_operations_list|default(['None'])"

Kind Regards
Guillermo

Restructure of playbooks

In the current version, the main folder contains a very large set of playbooks, which creates a certain level of complexity and confusion.
A suggestion is to restructure this so that

the main folder only contains build.yml, install_xxx.yml
a separate playbooks folder is created with an internal structure covering all the high-level playbooks that refer to the roles of this repo
- playbooks/nuage - contains pure Nuage components delivered as part of software distribution (ie VSD, VSC, VRS, VCIN, VSTAT, NSGV, VNS-Util, etc.)
- playbooks/ci - contains playbooks for the continuous integration labs.
- playbooks/openstack - contains playbooks to deploy osc and compute nodes
- playbooks/mesos - contains playbooks to deploy mesos and associated docker hosts

Feedback welcome

Need to update the test and documentation for Ansible version.

We now require Ansible 2.2 for full support. Update the version check appropriately.

Unpack should only look for files that have been configured

If build.yml doesn't include a section for nsg-v, for example, nuage-unpack role still expects to find nsg-v files and errors if they are not present. We shouldn't require files that aren't used.

VSD 4.0R6 deploy is failing

I think there has been a change in the workflow for deploying VSD for 4.0R6. We need to conditionally execute based on VSD versions.

Add build tests for target_server_type == vcenter

In our current automated test environment, we run several tests of the build role for several kvm-based scenarios. The purpose of these tests is to make sure the variables are being processed correctly. We do not have tests of vcenter-based scenarios. Please implement tests for vcenter builds for 4.0.R8 and 5.0.1.

Introduce per-port bridge settings for NSGVs

In the current version, build_vars.yml can only accommodate the same WAN/LAN bridges for every NSG defined under the mynsgvs section of the file.
It is common to have different WAN/LAN bridges on different NSGs deployed in a single sweep. Thus it is necessary to have a port <-> bridge setting in each NSGV instance.

A quick fix from @GuillermoMM was to provide the following config:

  - hostname: NSGV_BRANCH_2
    target_server_type: "kvm"
    target_server: 10.167.62.5
    bootstrap_method: zfb_external
    iso_path: '/tmp/'
    port1_bridge: br12
    port2_bridge: br10
    port3_bridge: br1nsg11

Naming was not an issue back then. It may be better to express it like this (for a 6-port NSGV):

port_bridges:
  - br12
  - br_dummy
  - br1nsg11
  - br_dummy
  - br_dummy
  - br_dummy
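To show how the list form relates to the `portN_bridge` keys used in the quick fix, here is a small sketch that expands one into the other. The function name is hypothetical and this is not how the roles currently consume the setting.

```python
def expand_port_bridges(port_bridges):
    """Map an ordered list of bridges to port1_bridge, port2_bridge, ... keys."""
    return {
        f"port{index}_bridge": bridge
        for index, bridge in enumerate(port_bridges, start=1)
    }
```

With the list form, a 3-port and a 6-port NSGV differ only in the length of the list.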

Upgrade: Support multiple interfaces on VSD

Metro, in v2.1.1, assumes that the VSD has one and only one network interface. When doing an upgrade from a configuration in which the VSD has added network interfaces, the additional interfaces will not exist when the upgrade completes.

We must add support in VSD upgrade that will ensure that the post-upgrade network configuration on VSD matches the pre-upgrade network configuration. If it has 2 NICs before the upgrade, it must have 2 after the upgrade.

Add pre-deploy support for VMware

Jenkins jobs expect to find specific versions on disk on the master node

We should change the code that supports the Jenkins jobs (in ./test/) such that we pull binary files from another location such as stratos.

Maintain consistency of backup folders names when stored on ansible deployment host

In v2.1.1, when storing backup folders on the Ansible deployment host, the folders are stored in different /tmp/ paths for different Nuage components. Consistent paths and folder names would help.

vsd-deploy: when server is vcenter, the deploy tasks run twice

Issue severity
Critical

Type
Bug

Description
When using vcenter as the server for a VSD, both non_heat.yml and vcenter.yml are executed, causing the install and all tasks to be executed twice.

Code reference
https://github.com/nuagenetworks/nuage-metro/blob/6390b882d013c35ae607a5e6646c992e1656d5dd/roles/vsd-deploy/tasks/main.yml#L2-L12

Role restructuring to support target types

From @jonasvermeulen:
---------- Forwarded message ----------
From: Jonas Vermeulen [email protected]
Date: Mon, Oct 10, 2016 at 5:28 AM
Subject: Supporting different target types
To: Brian Castelli [email protected], Philippe DELLAERT [email protected]

Hi Brian,

I'm evaluating the use of metro for deploying VSD/VSC/Proxy/NSGV on top of

OpenStack
VMware
AWS
HyperV

Unfortunately I noticed the "xxxx-deploy" playbooks and associated roles have a mixture of image manipulation tasks, image deployment tasks and inside-OS installation/configuration tasks.
As such, these roles cannot be reused when the target-type will be another hypervisor/cloud type.

My suggestion would be to use pre-tasks with conditional includes to prepare the image and deployment, before calling the role.
Example is at
https://github.com/openstack/openstack-ansible/blob/master/playbooks/os-horizon-install.yml

Another suggestion is to

include pre-deploy step as a pre-task
define the actual deploy steps as a role
include post-deploy steps as post-tasks

Philippe might have some more suggestions.
At the moment it is more of a structural change, not really changing any of the tasks, but it would of course affect how all files are laid out. So I am just looking to get your view and see what you think.
If you like, we can also discuss over the github Issue board so it becomes open to everyone's view.

Fedora - Current (25 as of today) as Ansible target

It would be great to have Fedora be a supported deployment host. Is anyone successfully using it?

NSG-Templates

With the use of Metro in VNS deployments, it is apparent there is a need for abstracting the way NSG(V)s are modeled and deployed.

Basically, Nuage VSD uses the concept of NSG Templates (nsgatewaytemplates) to model a group of NSGs. This includes the ports it has, the VSCs it talks to, the underlays it is connected to, etc. It would also be the perfect place to define what Linux bridges a virtual NSG should connect to.

As such, I propose to split up NSGV roles into

mynsgtemplates - comprises all information for each group of NSG. As such you can define templates for NSG-BR, NSG-UBR, NSG with one/dual uplink etc etc.
mynsgvs - which can then be a list of NSGVs referring to the Metro-name or the UUID of a pre-existing template. You could also define here then what bootstrap method to use.

The NSGV for AWS is already written with the concept of using an external pre-provisioned UUID, but I think the same should apply for the nsgvs.

Flag orphan vports during pre-upgrade health

During the upgrade in the lab, we found that we had vports configured for VMs that didn't exist. Before the upgrade, the health check reports 200 vports. After the upgrade it reported 150 vports. But it wasn't a mistake. 150 was the proper number. We need to enhance the vport health checks to complain when it detects vports that will not be present after the upgrade.
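The enhanced check could compare each vport against the set of VMs that actually exist. The sketch below assumes the health check already has both lists; the dictionary keys `name` and `vm_id` are hypothetical stand-ins for whatever the VSD data provides.

```python
def orphan_vports(vports, existing_vm_ids):
    """Return the names of vports whose VM is not among the existing VMs.

    vports: iterable of dicts with hypothetical keys "name" and "vm_id".
    existing_vm_ids: iterable of IDs of the VMs that actually exist.
    """
    known = set(existing_vm_ids)
    return [vport["name"] for vport in vports if vport["vm_id"] not in known]
```

A pre-upgrade health report could then list these names as vports that will not be present after the upgrade.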

Add libnetworking support for VRS deploy

From 4.0R6 onward, both the dockermon and libnetworking plugins are supported. We need to add libnetworking support to the image-unpack, build and vrs-deploy roles.

Support haproxy configuration for VSD cluster deployments

Once the VSD cluster is up, haproxy can load-balance VSD API requests. Currently Metro does not support any haproxy configuration.

NSGVs with 6 ports.

Description:
We see that for customer testing, they would like to deploy NSGV with 6 ports as well.

Proposal Change:
Is it possible to adapt the roles to include 6 ports?

Thanks a lot in advance,
Guillermo

Unzip - Elastic search backup package should only be checked when doing upgrade

Issue severity
Minor

Type
Optimisation

Description
The Elastic search package is not unpacked during nuage-unzip because it checks whether both the image package and the backup package are present.

The backup package should only be required if an upgrade is requested. During a fresh install, this package is not required.
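The intended rule can be written as a function of the requested operation. This is a sketch of the logic only: the labels `image` and `backup` and the operation names `install` and `upgrade` are assumptions, not identifiers taken from the nuage-unzip role.

```python
def required_vstat_packages(operation):
    """Return the Elastic search packages that must be present for an operation."""
    required = ["image"]
    # The backup package only matters when an upgrade has been requested.
    if operation == "upgrade":
        required.append("backup")
    return required
```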

Code reference
https://github.com/nuagenetworks/nuage-metro/blob/6390b882d013c35ae607a5e6646c992e1656d5dd/roles/nuage-unzip/tasks/main.yml#L118

Execute YAML syntax check on build_vars.yml

Syntax errors in the build_vars.yml file can be tricky to find and fix. This issue has the following tasks:

Investigate methods for doing this check. I would like to kick off the check from within the build.yml playbook, but I'm concerned that a syntax error will prevent that from happening when the vars file is loaded.
Present options with a recommended choice
Implement and test the feature

Enable upgrade-only build step

Currently the build playbook executes the nuage-unpack role. This is unnecessary for an upgrade. The problem is that the build role depends on variables set by the nuage-unpack role. Think about combining the two roles into one and making it such that we can run build without any binary files for an upgrade operation.

vcin-destroy: vcenter tasks fails because of the check on facts gathering

Issue severity
Major

Type
Bug

Description
vcin-destroy fails in a vCenter environment because of a bad check (nonexistent variable).

Error

fatal: [vcin01.phd.eu.nuagedemo.net]: FAILED! => {"failed": true, "msg": "The conditional check 'not vcin_vm_facts.failed' failed. The error was: error while evaluating conditional (not vcin_vm_facts.failed): 'dict object' has no attribute 'failed'\n\nThe error appears to have been in '/home/pdellaer/GitHub/nuage-metro/roles/vcin-destroy/tasks/vcenter.yml': line 17, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n  - name: Power off the VCIN VM\n    ^ here\n"}

Code reference
https://github.com/nuagenetworks/nuage-metro/blob/6390b882d013c35ae607a5e6646c992e1656d5dd/roles/vcin-destroy/tasks/vcenter.yml#L37
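The error above comes from reading a key that the registered result does not contain. The sketch below shows the defensive pattern in plain Python, treating a missing `failed` key as "not failed"; the function name is hypothetical, and in a `when:` condition the analogous guard would be Jinja2's `default` filter. It illustrates the failure mode and is not necessarily the fix that was applied.

```python
def lookup_failed(result):
    """Report whether a registered task result is marked as failed.

    A result without a "failed" key is treated as not failed instead of
    raising an error.
    """
    return bool(result.get("failed", False))
```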

Nuage 5.0.1 | ./metro-ansible nuage_unzip.yml fails because zipped file structure changed in 5.0.1

Hi,
Running ./metro-ansible nuage_unzip.yml fails due to a file structure change in 5.0.1.
Below is the temporary fix I used with the master branch:

1--> fix the unzip for 5.0.R1:

tar xzvf Nuage-VNS-Utils-5.0.1_4.tar.gz

md5sum -c vns-util-5.0.1_4.qcow2.md5

mv * /images/5.0.R1/unziped/vns/utils/

This fixed the unzip path for the NSG file: ncpe_centos7.qcow2

2--> VNS Utility/ NSGV path file change:
edit the file:
vi /home/nuage/metro/nuage-metro/roles/build/tasks/get_paths.yml

change the following two rows:
- { subdir: "vns/utils/", pattern: "vns-util-*.qcow2" } --> util qcow2 is not there, unzip failure!

  - { subdir: "vns/", pattern: "ncpe_centos7.qcow2" }  --> path was vns/nsg/

Thanx.
Niek van der Ven

Duplicated `destroy images dir` play in nsgv_destroy role

The nsgv-destroy role executes the same play, "Destroy the images directory", twice: the first time in the included nsgv_destroy_helper.yml and then again in kvm.yml.

vsd-deploy: 5.0.1 HA deploy fails for node 2 and 3

Issue severity
Major

Type
Bug/Enhancement

Description
When installing 5.0.1, the HA deployment fails because the user that requires pass-phrase-less SSH has changed (it is no longer root; the vsd user now requires pass-phrase-less SSH).

Error

[root@vsd02 ~]# cat /opt/vsd/logs/install.log
Info: no migration files found
/opt/vsd/vsd-deploy.sh -1 vsd01.phd.eu.nuagedemo.net -t 2 -x xmpp.phd.eu.nuagedemo.net -y
Note: Forwarding request to 'systemctl is-enabled ntpd.service'.
enabled
synchronised to NTP server (10.189.1.254) at stratum 7
   time correct to within 8130 ms
   polling server every 64 s
25-05-17 12:26:15 ERROR: fail pass-phrase-less ssh as vsd to [email protected]
Error: fail /opt/vsd/vsd-deploy.sh -1 vsd01.phd.eu.nuagedemo.net -t 2 -x xmpp.phd.eu.nuagedemo.net -y

vstat-destroy: vcenter tasks fails because of the check on facts gathering

Issue severity
Major

Type
Bug

Description
vstat-destroy fails in a vCenter environment because of a bad check (nonexistent variable).

Error

fatal: [ela01.phd.eu.nuagedemo.net]: FAILED! => {"failed": true, "msg": "The conditional check 'not vstat_vm_facts.failed' failed. The error was: error while evaluating conditional (not vstat_vm_facts.failed): 'dict object' has no attribute 'failed'\n\nThe error appears to have been in '/home/pdellaer/GitHub/nuage-metro/roles/vstat-destroy/tasks/vcenter.yml': line 17, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n  - name: Power off the Stats VM\n    ^ here\n"}

Code reference
https://github.com/nuagenetworks/nuage-metro/blob/6390b882d013c35ae607a5e6646c992e1656d5dd/roles/vstat-destroy/tasks/vcenter.yml#L37

Add pre-deploy support for HyperV

How does nsgv-predeploy handle existing NSGVs?

Hi team,
in the nsgv-predeploy role (in its kvm part) we do not check whether an NSGV with the same hostname is already defined before running the plays.

This leads to the following situation:
Suppose I provision in VSD 2 new NSGVs which happen to have the same hostnames as ones already defined on the hypervisor.
Currently the playbook will go through each step (except for defining the new VM, where we use when: inventory_hostname not in virt_vms.list_vms) and finish with a return code of 0.
So an end user won't see the real reason why the NSGs stay in a non-bootstrapped state.

I would suggest stopping the playbook immediately if one tries to define NSGVs with hostnames that are already defined.
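The suggested early stop boils down to comparing the requested hostnames with the VMs already defined on the hypervisor and failing on any overlap. A sketch of that logic, with hypothetical function names and assuming the list of defined VMs (such as `virt_vms.list_vms` above) is already available:

```python
def conflicting_hostnames(requested, defined_vms):
    """Return the requested hostnames that are already defined as VMs."""
    defined = set(defined_vms)
    return [hostname for hostname in requested if hostname in defined]


def ensure_no_conflicts(requested, defined_vms):
    """Raise an error naming every hostname that is already defined."""
    conflicts = conflicting_hostnames(requested, defined_vms)
    if conflicts:
        raise RuntimeError(
            "NSGV hostnames already defined on the hypervisor: "
            + ", ".join(conflicts)
        )
```

Failing with the list of names gives the end user the real reason up front rather than a silent return code of 0.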

Variables re-use in build.yml

Reusing variables in build.yml could eliminate fat-finger errors.
One example is the dns_domain variable, which is defined at the end of build.yml.

If dns_domain is defined explicitly, we can reuse it in the hostname variables of different components.
For example, consider the myvsds:hostname definition:

# current version
 myvsds:
      - { hostname: vsd1.example.com,
#    <cropped>


# with dns_domain re-use
 myvsds:
      - { hostname: "vsd1.{{ dns_domain }}",
#    <cropped>

# dns_domain defined explicitly
dns_domain: example.com

The same approach could apply to other variables used in build.yml, such as vsd_fqdn used in the myvscs definition.

Fix UPGRADE.md

The current UPGRADE.md file does not mention the folders/paths that must be present when running build_upgrade.yml.
The following paths are needed when performing vsd, vsc, and vstat upgrades/rollbacks:

When performing a vsd upgrade, the user must create a "migration" folder in the vsd path and place the migration scripts inside it (refer to BUILD.md on creating the vsd path).
When performing a vsc upgrade, a .tim file is required in the vsc image path.
When performing a vstat upgrade, the user must create a "backup" folder in the vstat path and place the backup scripts inside it.
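The three requirements above can also be checked mechanically before an upgrade starts. The following is a sketch under the assumption that each component has a single path on disk; the function name and the component labels `vsd`, `vsc` and `vstat` are used here for illustration only.

```python
from pathlib import Path

# Sub-folder each component needs in its path before an upgrade,
# as listed above.
REQUIRED_SUBDIRS = {"vsd": "migration", "vstat": "backup"}


def missing_upgrade_paths(component, component_path):
    """Return the paths required for upgrading a component that are absent."""
    base = Path(component_path)
    missing = []
    subdir = REQUIRED_SUBDIRS.get(component)
    if subdir and not (base / subdir).is_dir():
        missing.append(str(base / subdir))
    # A vsc upgrade needs at least one .tim file in the image path.
    if component == "vsc" and not any(base.glob("*.tim")):
        missing.append(str(base / "*.tim"))
    return missing
```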

Remove dockermon support

Currently, there is only one global var that controls whether dockermon is installed. This change would make it more flexible to choose on which VRS nodes dockermon is installed.

vns-deploy failure checking xmpp detail

From #162

We only see one thing when running the playbooks:

TASK [vns-deploy : Get output of 'show vswitch-controller xmpp-server detail'] *****************************************************
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ groups['vnsutils'] is
defined and groups['vnsutils'] }}

fatal: [metro-vsc1.nuage.stgt]: FAILED! => {"failed": true, "msg": "The conditional check 'xmpp_detail.stdout[0].find('Functional') != -1' failed. The error was: error while evaluating conditional (xmpp_detail.stdout[0].find('Functional') != -1): 'dict object' has no attribute 'stdout'"}

The playbook fails at this point, but a manual check reveals that the XMPP session is up, so we continued the playbooks after vns-deploy, i.e. with vns-postdeploy and from there everything works like a charm.
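The failure above is raised because the registered result has no `stdout` entry at all, not because the session is down. A guarded version of the same test is sketched below in plain Python; the function name is hypothetical and the result shape is inferred from the conditional in the error message.

```python
def xmpp_is_functional(result):
    """Return True only when the first stdout entry reports 'Functional'.

    A result with a missing or empty "stdout" yields False instead of an
    attribute error.
    """
    stdout = result.get("stdout") or []
    return bool(stdout) and "Functional" in stdout[0]
```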

Proposal Changes:
1.- Add an option in build_vars for ZFB = true/false
2.- In case ZFB=true, we will need a flag that indicates whether it is to be done by METRO (true/false) or by a third party.

Thanks a lot in advance.
Guillermo

Check for num of vsds configured in build - vsd_cluster

We don't support clustered stats VM (ElasticSearch)

We spin up stand-alone stats VMs only. Need to support cluster.
