cephci's People

Contributors

amarnatreddy, ameenasuhani, anrao19, ckulal, clacroix12, harshkumarrh, haruchebrolu, hkadam134, julpark-rh, ktdreyer, manasagowri, mergify[bot], mkasturi18, mohitbis, openshift-merge-bot[bot], pdhiran, pranavprakash20, psathyan, rahullepakshi, rakeshgm, shreekarss, srinivasabharath, subhashp7i, sunilkumarn417, tintumathew10, udaysk23, vamahaja, vasukulkarni, viduship, yogesh-mane

cephci's Issues

mixed lvm configs osd create failure

ceph-5vp-1586235666231-node5-osd monitor_interface=eth0 lvm_volumes="[{'data':'/dev/vdc','db':'db-lv1','db_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv1','data_vg':'vg1','db':'/dev/vde2','wal':'/dev/vde3'},{'data':'/dev/vde1'}]"
ceph-5vp-1586235666231-node4-osd monitor_interface=eth0 lvm_volumes="[{'data':'/dev/vdc','db':'db-lv1','db_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv1','data_vg':'vg1','db':'/dev/vde2','wal':'/dev/vde3'},{'data':'/dev/vde1'}]" dmcrypt='True'

Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 71d25569-5e62-4678-8083-1e24368f673c
Running command: /sbin/lvcreate --yes -l 100%FREE -n osd-block-71d25569-5e62-4678-8083-1e24368f673c
stderr: Volume group name has invalid characters
Run `lvcreate --help' for more information.
--> Was unable to complete a new OSD, will rollback changes

http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1586235666231/containerized_ceph_ansible_0.log

config_roll_over - add_osd / add_mon - the --limit approach needs to be replaced with add-osd.yml / add-mon.yml

add-osd in luminous is failing with:

"msg": The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'docker_exec_cmd'

The error appears to have been in '/usr/share/ceph-ansible/roles/ceph-mgr/tasks/common.yml': line 30, column 7, but may be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  block:
    - name: create ceph mgr keyring(s) on a mon node
      ^ here

As per the doc, we need to use add-osd.yml
Ref - cephci-run-1595989707197

flake8 new version checks F522

======================== 8 passed, 7 warnings in 0.52s =========================
py37 run-test: commands[1] | flake8
./tests/radosbench.py:54:0: F522 '...'.format(...) has unused named argument(s): name
./tests/cephfs/cephfs_utils.py:1067:0: F522 '...'.format(...) has unused named argument(s): mon_ip3
ERROR: InvocationError for command /home/travis/build/red-hat-storage/cephci/.tox/py37/bin/flake8 (exited with code 1)
___________________________________ summary ____________________________________
ERROR: py37: commands failed
The command "tox" exited with 1.

changed between travis CI builds 161 and 162

  • ignoring the check for now; the variables in the above two scripts need to be fixed and F522 re-enabled (see the illustration below)
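
For reference, a minimal illustration of what F522 flags (the variable names here are made up; the real offenders are in tests/radosbench.py and tests/cephfs/cephfs_utils.py):

    # F522: a named argument passed to str.format() that the format string
    # never references.
    bad = "mon host is {mon_ip1}".format(mon_ip1="10.8.128.1", mon_ip3="10.8.128.3")

    # Fix: drop the unused argument, or reference it in the format string.
    good = "mon hosts are {mon_ip1} and {mon_ip3}".format(
        mon_ip1="10.8.128.1", mon_ip3="10.8.128.3"
    )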

Exception in parallel execution

2020-04-22 09:46:45,669 - mita.openstack - INFO -  ... available
2020-04-22 09:46:45,670 - mita.openstack - INFO - Attaching volume ceph-rhceph-jslave1-1587562779221-node12-osd2...
2020-04-22 09:46:47,023 - mita.openstack - INFO - Successfully attached volume ceph-rhceph-jslave1-1587562779221-node12-osd2
2020-04-22 09:46:47,023 - mita.openstack - INFO - Creating 15gb of storage for: ceph-rhceph-jslave1-1587562779221-node12-osd3
2020-04-22 09:46:47,628 - mita.openstack - INFO - Waiting for volume ceph-rhceph-jslave1-1587562779221-node12-osd3 to become available
2020-04-22 09:46:47,628 - mita.openstack - INFO - Volume: ceph-rhceph-jslave1-1587562779221-node12-osd3 is in state: creating
2020-04-22 09:47:02,314 - mita.openstack - INFO - Waiting for node ceph-rhceph-jslave1-1587562779221-node10-pool to become available
2020-04-22 09:47:05,614 - mita.openstack - INFO - Failed to bring the node in running state in 0:04:00s
2020-04-22 09:47:06,571 - mita.openstack - INFO -  ... available
2020-04-22 09:47:06,571 - mita.openstack - INFO - Attaching volume ceph-rhceph-jslave1-1587562779221-node12-osd3...
2020-04-22 09:47:07,348 - mita.openstack - INFO - Successfully attached volume ceph-rhceph-jslave1-1587562779221-node12-osd3
2020-04-22 09:47:17,354 - ceph.parallel - ERROR - Exception in parallel execution
Traceback (most recent call last):
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 87, in __exit__
    for result in self:
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 105, in __next__
    resurrect_traceback(result)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 34, in resurrect_traceback
    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 21, in capture_traceback
    return func(*args, **kwargs)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/utils.py", line 81, in setup_vm_node
    ceph_nodes[node] = CephVMNode(**params)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/mita/openstack.py", line 70, in __init__
    self.create_node()
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/mita/openstack.py", line 170, in create_node
    raise NodeErrorState("Failed to bring up the node in Running state " + self.name)
AttributeError: 'CephVMNode' object has no attribute 'name'
/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/utility/utils.py:612: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  cfg = yaml.load(yml)
Traceback (most recent call last):
  File "run.py", line 611, in <module>
    rc = run(args)
  File "run.py", line 413, in run
    ceph_cluster_dict, clients = create_nodes(conf, inventory, osp_cred, run_id, service, instances_name)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/utility/retry.py", line 25, in f_retry
    return f(*args, **kwargs)
  File "run.py", line 137, in create_nodes
    ceph_vmnodes = create_ceph_nodes(cluster, inventory, osp_cred, run_id, instances_name)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/utils.py", line 75, in create_ceph_nodes
    p.spawn(setup_vm_node, node, ceph_nodes, **node_params)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 87, in __exit__
    for result in self:
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 105, in __next__
    resurrect_traceback(result)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 34, in resurrect_traceback
    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/parallel.py", line 21, in capture_traceback
    return func(*args, **kwargs)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/ceph/utils.py", line 81, in setup_vm_node
    ceph_nodes[node] = CephVMNode(**params)
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/mita/openstack.py", line 70, in __init__
    self.create_node()
  File "/home/rhceph-jslave1/workspace/rhceph-ansible-sanity-4.1-rhel7.8/mita/openstack.py", line 170, in create_node
    raise NodeErrorState("Failed to bring up the node in Running state " + self.name)
AttributeError: 'CephVMNode' object has no attribute 'name'

https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhceph-ansible-sanity-4.1-rhel7.8/36/console
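
Separately, the YAMLLoadWarning in the log above has a standard fix: use yaml.safe_load (or pass an explicit Loader) instead of bare yaml.load. A minimal sketch, with an illustrative file name:

    import yaml

    with open("osp-cred-ci-2.yaml") as yml:  # illustrative file name
        cfg = yaml.safe_load(yml)  # no YAMLLoadWarning; safe for untrusted input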

rename docker cli args

  1. rename --docker to --container; remove as many references to docker as possible to avoid confusion (see the argparse sketch below)
  2. later, update the jenkins configs
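
A minimal argparse sketch of how the rename could stay backward compatible until the jenkins configs are updated (the flag wiring here is an assumption, not cephci's actual parser):

    import argparse

    parser = argparse.ArgumentParser()
    # Accept the new flag and keep the old one as an alias; both set the same
    # destination, so jenkins jobs can migrate to --container at their own pace.
    parser.add_argument(
        "--container", "--docker",
        dest="container",
        action="store_true",
        help="deploy a containerized cluster (alias: --docker, deprecated)",
    )

    args = parser.parse_args(["--docker"])
    assert args.container is True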

selinux avc denial check

  • run the AVC denial check at the beginning and end of each TC
  • parse the audit log for denials and report the results at the end of the test suite (check the reboot TCs); have a pass/fail check at the end and point to the logs on all nodes for troubleshooting (a rough sketch follows)
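
One possible shape for the end-of-suite check (running ausearch locally here; cephci's real SSH helpers would execute this on every node):

    import subprocess

    def avc_denials_since_boot():
        """Return AVC denial lines from the audit log of the current node."""
        result = subprocess.run(
            ["ausearch", "-m", "avc", "-ts", "boot"],
            capture_output=True, text=True,
        )
        # ausearch exits non-zero when there are no matches, so key off stdout.
        return [line for line in result.stdout.splitlines() if "denied" in line]

    denials = avc_denials_since_boot()
    print("AVC check:", "FAIL" if denials else "PASS")  # on FAIL, point to node logs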

add containerized mon, playbook failed on 4.1-rhel-7

https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhceph-containerized-ceph-ansible-sanity-4.1-rhel7.8/15/console
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1587577640783/?C=M;O=A

TASK [ceph-container-common : set_fact image_repodigest_after_pulling] *********
task path: /usr/share/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:194
Wednesday 22 April 2020  15:00:02 -0400 (0:00:00.462)       0:02:33.770 ******* 

2020-04-22 14:57:47,973 - ceph.ceph - INFO - ok: [ceph-rhceph-jslave2-1587577640783-node2-monmds] => changed=false 
  ansible_facts:
    image_repodigest_after_pulling: sha256:24bf8efe8643927beb737c0a60790f41ea06d625c3c6dfd56974f7cc6b41bf65

2020-04-22 14:57:48,042 - ceph.ceph - INFO - ok: [ceph-rhceph-jslave2-1587577640783-node6-monrgw] => changed=false 
  ansible_facts:
    image_repodigest_after_pulling: sha256:24bf8efe8643927beb737c0a60790f41ea06d625c3c6dfd56974f7cc6b41bf65
fatal: [ceph-rhceph-jslave2-1587577640783-node1-monmgrinstaller]: FAILED! => 
  msg: |-
    The task includes an option with an undefined variable. The error was: list object has no element 0
  
    The error appears to be in '/usr/share/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml': line 194, column 3, but may
    be elsewhere in the file depending on the exact syntax problem.
  
    The offending line appears to be:
  
  
    - name: set_fact image_repodigest_after_pulling
      ^ here

long running commands stuck in fs suite

https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/organizations/jenkins/CEPH4.1-RHEL8-Features/detail/CEPH4.1-RHEL8-Features/1/pipeline/5/

  • some test cases get stuck forever in the fs suite at commands like "rm -rf " in the clean-up steps.

debug: initial pointers - check whether the cluster is getting filled because of more IOs; increase the VM flavor for more RAM and retry.

  • This blocks the whole pipeline of p1 jobs triggered weekly. For now, set a timer in the script to avoid blocking the remaining suite and pipeline indefinitely (see the sketch below).
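
One possible shape for that timer (the timeout value and helper name are assumptions; cephci's remote-execution layer would need the equivalent for commands run over SSH):

    import subprocess

    def run_with_timeout(cmd, timeout=3600):
        """Fail a stuck clean-up command (e.g. rm -rf) instead of blocking
        the rest of the suite and the weekly p1 pipeline."""
        try:
            return subprocess.run(cmd, timeout=timeout, check=True)
        except subprocess.TimeoutExpired as err:
            # Mark only this test case as failed; let the remaining tests proceed.
            raise RuntimeError(f"{cmd!r} exceeded {timeout}s") from err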

interop 3.3 rbd failure

3 RBD failures in rhcs3.3 (7.9rc image):

basic rbd tests

  • test:
      name: rbd cli image
      module: rbd_system.py
      config:
        test_name: cli/rbd_cli_image.py
        branch: master
      polarion-id: CEPH-83572722
      desc: CLI validation for image related commands

  • test:
      name: rbd cli snap_clone
      module: rbd_system.py
      config:
        test_name: cli/rbd_cli_snap_clone.py
        branch: master
      polarion-id: CEPH-83572725
      desc: CLI validation for snap and clone related commands

  • test:
      name: rbd cli misc
      module: rbd_system.py
      config:
        test_name: cli/rbd_cli_misc.py
        branch: master
      polarion-id: CEPH-83572724
      desc: CLI validation for miscellaneous rbd commands

podman support for rgw suite and s3 upstream tests

  • the rgw suite might need a couple of fixes in ceph-qe-scripts to pass for container deployments (both podman and docker):
    tests/sanity_rgw.py

  • tests/test_s3.py --> always fails, for baremetal too

  • will fail the jenkins p0 and cvp container jobs if any tests need to be added
    #89

containerized ceph ansible task fails with error while enabling firewalld for RHEL-7.9-20200825.n.0-Server-x86_64 and RHEL-7.9-Server-x86_64-nightly-latest on 4.1 build

The CLI used is:
python run.py --rhbuild 4.1-rhel-7 --global-conf conf/nautilus/ansible/sanity-ceph-ansible.yaml --osp-cred osp/osp-cred-ci-2.yaml --inventory conf/inventory/rhel-7.9-server-x86_64.yaml --suite suites/nautilus/ansible/sanity_containerized_ceph_ansible.yaml --log-level info --store --report-portal --add-repo http://download.eng.bos.redhat.com/rhel-7/composes/auto/ceph-4.1-rhel-7/RHCEPH-4.1-RHEL-7-20200821.ci.1 --instances-name mgowri3

The error logs for the execution are present at the paths below.
RHEL-7.9-20200825.n.0-Server-x86_64: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1598351360420/
RHEL-7.9-Server-x86_64-nightly-latest: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1598353231300/

Tried executing the systemctl enable firewalld command manually on the VM and got the same error.

AttributeError: 'CephVMNode' object has no attribute 'name'

File "/home/ymane/cephci/ceph/utils.py", line 79, in setup_vm_node
ceph_nodes[node] = CephVMNode(**params)
File "/home/ymane/cephci/mita/openstack.py", line 68, in init
self.create_node()
File "/home/ymane/cephci/mita/openstack.py", line 166, in create_node
raise NodeErrorState("Failed to bring up the node in Running state " + self.name)
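
A minimal sketch of the failure mode and a possible fix (class structure reconstructed from the traceback; parameter names are assumptions, not the actual cephci code). When the node never reaches Running, create_node() raises before self.name has been assigned, so building the error message itself raises AttributeError and masks the intended NodeErrorState:

    class NodeErrorState(Exception):
        pass

    class CephVMNode:
        def __init__(self, **params):
            # Fix sketch: assign the name before anything that can raise,
            # so error paths in create_node() can safely reference it.
            self.name = params.get("node-name", "<unknown>")
            self.create_node()

        def create_node(self):
            node_running = False  # stand-in for the real provisioning wait loop
            if not node_running:
                raise NodeErrorState(
                    "Failed to bring up the node in Running state " + self.name
                )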

if client is missing in conf, --reuse fails

A simple 3-node config with no client role is stored using --store; later, when rerunning via --reuse:

Running test: install ceph pre-requisites
Test logfile location: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1598562234549/install_ceph_pre-requisites_0.log
2020-08-27 21:03:54,727 - __main__ - INFO - Running test install_prereq.py
2020-08-27 21:03:54,727 - __main__ - ERROR - Traceback (most recent call last):
  File "run.py", line 551, in run
    ceph_cluster_dict=ceph_cluster_dict, clients=clients)
UnboundLocalError: local variable 'clients' referenced before assignment

2020-08-27 21:03:54,728 - __main__ - INFO - Test <module 'install_prereq' from '/home/vakulkar/cephci/tests/misc_env/install_prereq.py'> failed
Test <module 'install_prereq' from '/home/vakulkar/cephci/tests/misc_env/install_prereq.py'> failed
2020-08-27 21:03:54,728 - __main__ - INFO - Aborting on test failure
2020-08-27 21:03:54,728 - __main__ - INFO - 
All test logs located here: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1598562234549

All test logs located here: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1598562234549

TEST NAME                        TEST DESCRIPTION                                               DURATION                                  STATUS
install ceph pre-requisites      None                                                           0:00:00.000553                            Failed
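
A minimal repro of the bug pattern and the obvious fix (simplified; not the actual run.py code): clients is only assigned inside the branch that sets up a client role, so a conf without one hits UnboundLocalError.

    def run(has_client_role: bool):
        clients = None  # fix: default so a conf without a client role still works
        if has_client_role:
            clients = ["client.0"]  # stand-in for the real client setup
        # Without the default above, this line raises UnboundLocalError
        # when has_client_role is False.
        return clients

    assert run(False) is None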

improve and fix reuse cli arg

  • a workaround is needed for the --reuse cli arg, like setting clients=None (do an RCA on why it is needed and whether it can affect other scripts)
  • document in the README how to use --reuse with snapshots
  • improve the function to auto-pick the latest snapshot available in the rerun dir (without always needing to specify it manually); it is probably better to also retain the manual snapshot selection option (see the sketch below)
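
A sketch of the auto-pick behaviour with the manual override retained (the directory layout and function name are assumptions):

    from pathlib import Path

    def pick_snapshot(rerun_dir, manual_choice=None):
        """Use an explicitly chosen snapshot if given, else the newest one."""
        if manual_choice:
            return Path(rerun_dir) / manual_choice
        snaps = sorted(Path(rerun_dir).glob("*"), key=lambda p: p.stat().st_mtime)
        return snaps[-1] if snaps else None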

Update ansible configs in all suite yamls

  • a lot of redundant, obsolete, and deprecated configs are present in the suite yamls of luminous/nautilus. Clean them up based on the latest supported configs.
    e.g. setting the osd scenario to collocated, or having s3 keys that are unused in many suites
  • test the whole suite with the updated configs before merging

Improve cephci email template

http://post-office.corp.redhat.com/archives/cephci/2020-April/msg00196.html

@veera-raghava-reddy

  1. Subject - can we include the compose and suite name?
     CEPHCI >> RHCEPH-4.1-RHEL-8-20200421.ci.0 >> sanity_containerized_ceph_ansible
  2. Duration - format as HH:MM:SS, with the total time at the end (a formatting sketch follows this list).
  3. Test Name - is it the suite name? A few tests share the same name, "Versioning Tests".
  • If the test name is linked to a Polarion test, can we add a link to the test?
    #95
  4. In the subject, can we append the result? If 100% of the tests pass - Pass, else - Fail; this will help prioritize which test-run mail to look into first.
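
A sketch of the requested subject line and duration formatting (function names are illustrative):

    def email_subject(compose: str, suite: str, passed: int, total: int) -> str:
        """e.g. CEPHCI >> RHCEPH-4.1-RHEL-8-20200421.ci.0 >> sanity_containerized_ceph_ansible - Pass"""
        result = "Pass" if passed == total else "Fail"
        return f"CEPHCI >> {compose} >> {suite} - {result}"

    def fmt_duration(seconds: float) -> str:
        """Render durations as HH:MM:SS for the per-test rows and the total."""
        s = round(seconds)
        return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"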
