
CBT - The Ceph Benchmarking Tool

INTRODUCTION

CBT is a testing harness written in Python that can automate a variety of tasks related to testing the performance of Ceph clusters. CBT does not install Ceph packages; this is expected to be done prior to using CBT. CBT can create OSDs at the beginning of a test run, optionally recreate OSDs between test runs, or simply run against an existing cluster. CBT records system metrics with collectl, and it can optionally collect more information using a number of tools including perf, blktrace, and valgrind. In addition to basic benchmarks, CBT can also do advanced testing that includes automated OSD outages, erasure-coded pools, and cache tier configurations. The main benchmark modules are explained below.

radosbench

RADOS bench testing uses the rados binary that comes with the ceph-common package. It contains a benchmarking facility that exercises the cluster by way of librados, the low level native object storage API provided by Ceph. Currently, the RADOS bench module creates a pool for each client.
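
A minimal sketch of what a radosbench stanza in a CBT yaml file might look like is shown below. The field names and values are illustrative assumptions; check the example configurations in the repository for the authoritative set of options.

benchmarks:
  radosbench:
    time: 300                # run time in seconds
    concurrent_procs: 1      # rados bench processes per client
    concurrent_ops: [128]    # outstanding operations per process
    op_size: [4194304]       # object size in bytes
    write_only: False        # also run the read phases
    pool_profile: 'rbd'      # pool profile defined in the cluster section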

librbdfio

The librbdfio benchmark module is the simplest way of testing the block storage performance of a Ceph cluster. Recent releases of the flexible IO tester (fio) provide an RBD ioengine. This allows fio to test the block storage performance of RBD volumes without any KVM/QEMU configuration, through the userland librbd libraries. These are the same libraries used by the QEMU RBD backend, so it provides a reasonable approximation of KVM/QEMU performance.

kvmrbdfio

The kvmrbdfio benchmark uses the flexible IO tester (fio) to exercise an RBD volume that has been attached to a KVM instance. It requires that the instances be created and have RBD volumes attached before CBT is run. This module is commonly used to benchmark RBD-backed Cinder volumes that have been attached to instances created with OpenStack. Alternatively, the instances can be provisioned with tools such as Vagrant or Virtual Machine Manager.

rbdfio

The rbdfio benchmark uses the flexible IO tester (fio) to exercise an RBD volume that has been mapped to a block device using the KRBD kernel driver. This module is most relevant for simulating the data path for applications that need a block device but, for whatever reason, won't be run inside a virtual machine.
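
For illustration, the sequence this module automates looks roughly like the following; the pool and image names are placeholders.

# Map an RBD image with the kernel driver, then run fio against the block device
sudo rbd map rbdbench/test-image          # appears as /dev/rbd0
sudo fio --name=seqwrite --ioengine=libaio --direct=1 \
         --rw=write --bs=4M --iodepth=64 --runtime=300 \
         --filename=/dev/rbd0
sudo rbd unmap /dev/rbd0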

PREREQUISITES

CBT uses several libraries and tools to run:

  1. python3-yaml - A YAML library for python used for reading configuration files.
  2. python3-lxml - Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API
  3. ssh (and scp) - secure remote command execution and data transfer
  4. pdsh (and pdcp) - a parallel ssh and scp implementation
  5. ceph - A scalable distributed storage system

Note that pdsh is not packaged for RHEL 7 and CentOS 7 based distributions at this time, though the rawhide pdsh packages install and are usable. The RPMs for these packages are available here:

  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-2.31-4.fc23.x86_64.rpm
  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-rsh-2.31-4.fc23.x86_64.rpm
  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-ssh-2.31-4.fc23.x86_64.rpm
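
On a RHEL 7 or CentOS 7 host, one way to install them is directly from the URLs above, for example:

sudo yum install \
    ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-2.31-4.fc23.x86_64.rpm \
    ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-ssh-2.31-4.fc23.x86_64.rpm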

Optional tools and benchmarks can be used if desired:

  1. collectl - system data collection
  2. blktrace - block device io tracing
  3. seekwatcher - create graphs and movies from blktrace data
  4. perf - system and process profiling
  5. valgrind - runtime memory and cpu profiling of specific processes
  6. fio - benchmark suite with integrated posix, libaio, and librbd support
  7. cosbench - object storage benchmark from Intel

USER AND NODE SETUP

In addition to the above software, a number of nodes must be available to run tests. These are divided into several categories. Multiple categories can contain the same host if it is assuming multiple roles (running OSDs and a mon for instance).

  1. head - node where general ceph commands are run
  2. clients - nodes that will run benchmarks or other client tools
  3. osds - nodes where OSDs will live
  4. rgws - nodes where rgw servers will live
  5. mons - nodes where mons will live

A user may also be specified to run all remote commands. The host that is used to run cbt must be able to issue passwordless ssh commands as the specified user. This can be accomplished by creating a passwordless ssh key:

ssh-keygen -t dsa

and copying the resulting public key from ~/.ssh to the ~/.ssh/authorized_keys file on all remote hosts.
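
For example, with placeholder hostnames, the key can be generated and distributed like this:

ssh-keygen -t dsa -N ''
for host in client1 osd1 osd2 mon1; do
    ssh-copy-id <user>@$host
done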

This user must also be able to run certain commands with sudo. The easiest method to enable this is to simply enable blanket passwordless sudo access for this user, though this is only appropriate in laboratory environments. This may be accomplished by running visudo and adding something like:

# passwordless sudo for cbt
<user>    ALL=(ALL)       NOPASSWD:ALL

Where <user> is the user that will have passwordless sudo access.
Please see your OS documentation for specific details.

In addition to the above, all of the OSD and mon nodes must be added to the ssh known_hosts file on the head node for things to work properly. Otherwise, the benchmark tests will not be able to run.
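
One convenient way to do this (hostnames are placeholders) is with ssh-keyscan:

for host in client1 osd1 osd2 mon1; do
    ssh-keyscan -H $host >> ~/.ssh/known_hosts
done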

Note that the pdsh command can have difficulties if the sudoers file requires a tty. If this is the case, comment out the Defaults requiretty line using visudo.

DISK PARTITIONING

Currently CBT looks for specific partition labels in /dev/disk/by-partlabel for the Ceph OSD data and journal partitions.
At some point in the future this will be made more flexible; for now this is the expected behavior. Specifically, on each OSD host, partitions should be created with the following GPT labels:

osd-device-<num>-data
osd-device-<num>-journal

where <num> is the device index, ordered starting at 0 and ending with the last device on the system. Currently cbt assumes that all nodes in the system have the same number of devices. A script showing an example of how we create partition labels in our test lab is available here:

https://github.com/ceph/cbt/blob/master/tools/mkpartmagna.sh
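
As a sketch, the labels can be created with sgdisk; the device, partition sizes, and index below are only examples and will destroy any existing data on the device.

sudo sgdisk -n 1:0:+10G -c 1:osd-device-0-journal /dev/sdb
sudo sgdisk -n 2:0:0    -c 2:osd-device-0-data    /dev/sdb
ls /dev/disk/by-partlabel/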

CREATING A YAML FILE

CBT yaml files have a basic structure where you define a cluster and a set of benchmarks to run against it. For example, the following yaml file creates a single node cluster on a node with hostname "burnupiX". A pool profile is defined for a 1x replication pool using 256 PGs, and that pool is used to run RBD performance tests using fio with the librbd engine.

cluster:
  user: 'nhm'
  head: "burnupiX"
  clients: ["burnupiX"]
  osds: ["burnupiX"]
  mons:
    burnupiX:
      a: "127.0.0.1:6789"
  osds_per_node: 1
  fs: 'xfs'
  mkfs_opts: '-f -i size=2048'
  mount_opts: '-o inode64,noatime,logbsize=256k'
  conf_file: '/home/nhm/src/ceph-tools/cbt/newstore/ceph.conf.1osd'
  iterations: 1
  use_existing: False
  clusterid: "ceph"
  tmp_dir: "/tmp/cbt"
  pool_profiles:
    rbd:
      pg_size: 256
      pgp_size: 256
      replication: 1
benchmarks:
  librbdfio:
    time: 300
    vol_size: 16384
    mode: [read, write, randread, randwrite]
    op_size: [4194304, 2097152, 1048576]
    concurrent_procs: [1]
    iodepth: [64]
    osd_ra: [4096]
    cmd_path: '/home/nhm/src/fio/fio'
    pool_profile: 'rbd'

An associated ceph.conf.1osd file is also defined with various settings that are to be used in this test:

[global]
        osd pool default size = 1
        auth cluster required = none
        auth service required = none
        auth client required = none
        keyring = /tmp/cbt/ceph/keyring
        osd pg bits = 8  
        osd pgp bits = 8
        log to syslog = false
        log file = /tmp/cbt/ceph/log/$name.log
        public network = 192.168.10.0/24
        cluster network = 192.168.10.0/24
        rbd cache = true
        osd scrub load threshold = 0.01
        osd scrub min interval = 137438953472
        osd scrub max interval = 137438953472
        osd deep scrub interval = 137438953472
        osd max scrubs = 16
        filestore merge threshold = 40
        filestore split multiple = 8
        osd op threads = 8
        mon pg warn max object skew = 100000
        mon pg warn min per osd = 0
        mon pg warn max per osd = 32768

[mon]
        mon data = /tmp/cbt/ceph/mon.$id
        
[mon.a]
        host = burnupiX 
        mon addr = 127.0.0.1:6789

[osd.0]
        host = burnupiX
        osd data = /tmp/cbt/mnt/osd-device-0-data
        osd journal = /dev/disk/by-partlabel/osd-device-0-journal

To run this benchmark suite, cbt is launched with an output archive directory to store the results and the yaml configuration file to use:

cbt.py --archive=<archive dir> ./mytests.yaml

You can also specify the ceph.conf file to use by specifying it on the commandline:

cbt.py --archive=<archive dir> --conf=./ceph.conf.1osd ./mytests.yaml

In this way you can mix and match ceph.conf files and yaml test configuration files to create parametric sweeps of tests. A script in the tools directory called mkcephconf.py lets you automatically generate hundreds or thousands of ceph.conf files from defined ranges of different options, which can then be used with cbt as described above.
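
For example, a sweep over previously generated configuration files might be driven with a simple loop like this (paths are placeholders):

for conf in /home/nhm/conf/ceph.conf.*; do
    cbt.py --archive=/data/archive/$(basename $conf) --conf=$conf ./mytests.yaml
done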

CONCLUSION

There are many additional and powerful ways you can use cbt that are not yet covered in this document. As time goes on we will try to provide better examples and documentation for these features. For now, it's best to look at the examples, look at the code, and ask questions!

cbt's Issues

monitoring.py - collectl issue

What the script currently has:
rawdskfilt = 'cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

should be ('+' at the beginning):
rawdskfilt = '+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

It is also better to run collectl with sudo and with rotation of the log files (after midnight, for example), so:
common.pdsh(nodes, 'sudo collectl -s+mYZ -i 1:10 -r00:00,7 --rawdskfilt "%s" -F0 -f %s' % (rawdskfilt, collectl_dir))

Resources required to delete a pool affect subsequent test run

Many benchmarks delete and recreate the test pool between runs. However, the Ceph command to delete a pool returns immediately, and the work of deleting objects in the pool takes place in the background. Unfortunately, experience has shown that the disk and CPU resources used while deleting the objects are great enough to influence the test results for the subsequent run.

One way to avoid the problem is to have the cluster.rmpool() function wait until the disk and CPU utilization on the OSD nodes drops to a reasonable level before returning to the caller. I will be issuing a pull request with this change.
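
For illustration, one possible shape of such a wait is sketched below; this is not the actual pull request, and the node list format, load threshold, and timeouts are placeholders.

# Sketch: poll the 1-minute load average on the OSD nodes via pdsh until it
# drops below a threshold, or give up after a timeout.
import subprocess
import time

def wait_for_idle(nodes, max_load=0.5, interval=10, timeout=600):
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.check_output(
            ['pdsh', '-R', 'ssh', '-w', nodes, 'cat /proc/loadavg'],
            universal_newlines=True)
        # pdsh prefixes each output line with "host: ", so field 1 is the 1-minute load
        loads = [float(line.split()[1]) for line in out.splitlines() if line.strip()]
        if loads and max(loads) < max_load:
            return
        time.sleep(interval)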

wrong collectl syntax for rawdskfilt in monitoring.py

Wrong syntax in monitoring.py causes collectl not to run:

collectl -s+mYZ -i 1:10 --rawdskfilt "+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ " -F0 -f /tmp/cbt/00000000/LibrbdFio/osd_ra-00001024/op_size-00004094/concurrent_procs-004/iodepth-064/read/pool_monitoring/collectl
Quantifier follows nothing in regex; marked by <-- HERE in m/+ <-- HERE cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ / at /usr/share/collectl/formatit.ph line 235.

so instead of
rawdskfilt = '+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

it should be
rawdskfilt = 'cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

concurrent_procs - unable to change the value

Hi,
changing the concurrent_procs option value in the yaml file doesn't have any effect on cbt. It always runs with the value '3'. I found this while using the librbdfio benchmark.

THX

cbt getting idle while executing rpdcp command

cbt hangs while trying to copy collectl data by executing the rpdcp command, because it gets the nodes via
nodes = settings.getnodes('clients', 'osds', 'mons', 'rgws')
When the same hosts are defined under clients, osds, mons, and rgws, the collectl command is executed on each of them 4 times. rpdcp then cannot finish copying the files because they are still open.

radosbench fails if client cannot read admin keyring

The radosbench test uses the 'rados' tool, which requires the client admin key to access the cluster. The rados command will fail unless the user specified in the config file has read access to the client admin keyring.

A workaround exists, and that is to set "cmd_path: 'sudo /usr/bin/rados'" in the test configuration. But this is cumbersome and breaks the ability to run the command under valgrind.

OSError: [Errno 2] No such file or directory 09:45:52 - ERROR - cbt - During tests

I have installed ceph on my head and client nodes, and the ceph cluster is up and running. I have no idea which file or directory it needs!

09:41:23 - ERROR - cbt - During tests
Traceback (most recent call last):
File "./cbt.py", line 64, in main
b.initialize()
File "/root/cbt-master/benchmark/librbdfio.py", line 68, in initialize
super(LibrbdFio, self).initialize()
File "/root/cbt-master/benchmark/benchmark.py", line 44, in initialize
self.cluster.cleanup()
File "/root/cbt-master/cluster/ceph.py", line 210, in cleanup
common.pdsh(nodes, 'sudo rm -rf %s' % self.tmp_dir).communicate()
File "/root/cbt-master/common.py", line 70, in pdsh
return CheckedPopen(args,continue_if_error=continue_if_error)
File "/root/cbt-master/common.py", line 20, in init
self.popen_obj = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
File "/usr/lib64/python2.7/subprocess.py", line 711, in init
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

Create Module for All-to-All Network Tests

Use iperf for all-to-all network tests. Base the code on something vaguely like:

Client to OSD tests:

#!/bin/bash

for i in 8 9 10 11 12 13
do
    val=$((62+$i))
    pdsh -R ssh -w osd[$i] iperf -s -B 172.27.50.$val &
done

#!/bin/bash

for i in 0 1 2 3 4 5 6 7
do
    for val in 70 71 72 73 74 75
    do
        pdsh -R ssh -w client[$i] iperf -c 172.27.50.$val -f m -t 60 -P 1 > /tmp/iperf_client${i}to${val}.out &
    done
done

#!/bin/bash

for i in 8 9 10 11 12 13
do
    val=$((62+$i))
    pdsh -R ssh -w osd[$i] iperf -s -B 172.27.49.$val &
done

#!/bin/bash

for i in 8 9 10 11 12 13
do
    for val in 70 71 72 73 74 75
    do
        pdsh -R ssh -w osd[$i] iperf -c 172.27.49.$val -f m -t 60 -P 1 > /tmp/iperf_${i}to${val}.out &
    done
done

Update CBT to support RBD Erasure Coding testing

CBT currently does not support testing RBD erasure coding (Ceph 12.2.2). ceph.py needs to be updated so that when librbdfio.py calls mkpool with a data pool specified, ceph.py uses the data_pool_profile instead of the pool_profile.

Since librbdfio will append "-data" to the pool name if a user has specified a data_pool, the following could be a possible solution.

cbt/cluster/ceph.py:

def mkpool(self, name, profile_name, application, base_name=None):
    if 'data' in name:
        pool_profiles = self.config.get('data_pool_profiles', {'default': {}})
    else:
        pool_profiles = self.config.get('pool_profiles', {'default': {}})
    . . .

Concurrent take root testing

A cluster composed of nodes with a mix of SSDs and HDDs, where the SSDs are being used as OSDs and not for a cache tier. Currently, we can test either the flash take root or the HDD take root with CBT, but not both concurrently. Add the ability to have clients test pools with different take roots concurrently.

More explicit failure indication in cbt run.

When executing the cbt.py test suite, it is very hard to figure out which steps failed/passed.
My experience with this tool is very limited as I just started using it, but I see that the pdsh commands fail without any error, so it is hard to decipher why.

Also, the use_existing flag in the cluster: configuration in the yaml file should be highlighted when running against an existing cluster. Once I get through a successful execution I will create a pull request for any doc changes, if that makes sense, and for other issues as I see them.

Another issue I see is that the username and group name are assumed to be the same, which is not always the case. It might be useful to add a groups field as well.

Lastly -->
Now I think I have gotten past some of my initial hurdles and am able to execute an fio benchmark, but I am not sure what comes next.

The last step I see is:

21:30:37 - DEBUG - cbt - pdsh -R ssh -w [email protected],[email protected],[email protected] sudo chown -R behzad_dastur.behzad_dastur /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite/* 21:30:37 - DEBUG - cbt - rpdcp -f 1 -R ssh -w [email protected],[email protected],[email protected] -r /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite/* /tmp/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite

I can see logs created at:

[root@cbtvm001-d658 cbt]# ls /tmp/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/read/ collectl.b-stageosd001-r19f29-prod.acme.symcpe.net collectl.v-stagemon-002-prod.abc.acme.net output.0.v-stagemon-001-prod.abc.acme.net collectl.v-stagemon-001-prod.abc.acme.net historic_ops.out.b-stageosd001-r19f29-prod.abc.acme.net
Are there ways to visualize this data now?

OSError: [Errno 2] No such file or directory while running radosbench test

17:56:52 - DEBUG - cbt - pdsh -R ssh -w root@ceph echo 3 | sudo tee /proc/sys/vm/drop_caches
^@17:56:53 - ERROR - cbt - During tests
Traceback (most recent call last):
File "./cbt.py", line 71, in main
b.run()
File "/root/cbt/benchmark/radosbench.py", line 68, in run
self._run('write', '%s/write' % self.run_dir, '%s/write' % self.out_dir)
File "/root/cbt/benchmark/radosbench.py", line 81, in _run
rados_version_str = subprocess.check_output(["rados", "-v"])
File "/usr/lib64/python2.7/subprocess.py", line 568, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

clocksource option for fio

Adding a clocksource option for fio benchmarks would be useful so that gettimeofday can be selected when multiple fio processes are running on multiple hosts. Assuming ntp is properly configured, this would make aggregating data much easier than it would be with independent, relative timestamps.
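
A hypothetical sketch of what the option might look like in a job file, passing through fio's existing clocksource= parameter; the CBT key name here is an assumption, not an implemented option.

benchmarks:
  librbdfio:
    time: 300
    mode: [randwrite]
    clocksource: 'gettimeofday'   # hypothetical pass-through to fio's clocksource=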

librbdfio fails when client's hostname differs from identifier in CBT job file

The librbdfio benchmark creates RBDs using node names specified in the job file. However, the clients try to access the RBDs using their own "hostname -s" name. If the names are not the same then tests will fail, with output like this in the output files:

Starting 1 process
rbd engine: RBD version: 0.1.9
rbd_open failed.
fio_rbd_connect failed.

I will be submitting a fix.

OSD expansion testing

The current recovery machinery for CBT marks an OSD or group of OSDs out, then later marks it back in. It would be great if there were a way to mark an OSD or group of OSDs out, destroy them, and then add them back into the cluster.

Option to restart Ceph services

When changing Ceph parameters, some of them require Ceph services, such as the OSDs, to be restarted. CBT can only recreate a brand new cluster or use an existing cluster. It would be nice to have something like use_existing that also restarts all Ceph services and checks health before starting the test(s).
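
For reference, on systemd-based installs this can be approximated by hand today with something along these lines (a sketch, not a CBT feature):

sudo systemctl restart ceph-osd.target
sudo systemctl restart ceph-mon.target
ceph health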

population of fio files should use fio_cmd provided in yaml

In using kvmrbdfio.py to run fio on my OpenStack guests, the population of the fio files does not use the path to the fio executable (fio_cmd) provided in the yaml file. I define the fio command path ...

----------- yaml excerpt -----------
benchmarks:
  kvmrbdfio:
    fio_cmd: "/usr/local/bin/fio"
----------- end excerpt ------------

... however the populate call uses no path and relies on the fio executable being in the $PATH defined for sudo users, which excludes /usr/local/bin where my fio resides, so the population fails on the target systems.

Option to save results archive in S3

It would be great to have the ability to save the results archive to an S3 bucket instead of only on the local filesystem of the head node.

add simple cephfs support (cephfsfio)

Update CBT to include simple CephFS support:

  • Reuse the rbdfio benchmark
  • Add a new cephfsfio.py benchmark and examples
  • Update ceph.py to create MDSs

Client concurrency scaling

It would be great to have the ability to specify the clients, eg:

clients: [client1, client2, client3, client4, client5, client6]

Then also specify an array with the number of clients to use for each iteration, eg:

num_clients: [1, 2, 4, 6]

This would be useful for finding the client concurrency contention threshold for a cluster.

script hangs while performing rpdcp

Hi
while performing rpdcp the script just hangs. However, all files and folders are copied to the target ARCH directory, but the script doesn't continue with the other steps.

Any idea?


6:42:18 - DEBUG - cbt - Nodes : ceph@cbt-node1,cbt-node2,cbt-node3
16:42:18 - DEBUG - cbt - CheckedPopen continue_if_error=True args=pdsh -f 3 -R ssh -w ceph@cbt-node1,cbt-node2,cbt-node3 sudo pkill -SIGINT -f blktrace
16:42:19 - DEBUG - cbt - Nodes : ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3
16:42:19 - DEBUG - cbt - CheckedPopen continue_if_error=False args=pdsh -S -f 6 -R ssh -w ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3 sudo chown -R ceph.ceph /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read/*

16:42:19 - DEBUG - cbt - CheckedPopen continue_if_error=False args=rpdcp -f 10 -R ssh -w ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3 -r /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read/* /tmp/ARCH1/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read

Thanks

how to parse cbt results?

Hi,
just finished a benchmark of our Ceph cluster with the CBT tool and now I'm confused about how to deal with the output data. I got a large directory structure with a lot of "id" directories like:

results/00000000/id3309284191688069537/

which contain fio log files and some files in json format.

My question is whether there is any tool or simple way to parse this output data into a human-readable format, so that I would also be able to compare two or more CBT benchmark tests.

Thank you

radosbench should not depend on ceph-common to be installed on head-node

cbt checks the rados version to determine the syntax of the object size argument for rados bench. It does so locally on the head node (by running rados -v), which fails if the rados binary (from ceph-common on Red Hat systems) is not installed.
Instead, cbt should ask the first client node for the version, given that it assumes all clients have the same rados version installed.

Readahead for use_existing

When using use_existing, readahead is not set on the OSDs because they do not have the labels CBT expects. We should probably determine which devices are mounted in /var/lib/osd/* and then set their readahead values.
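
A sketch of the suggested approach; the mount point pattern and readahead value are examples and will vary by deployment.

for mnt in /var/lib/ceph/osd/*; do
    dev=$(findmnt -rn -o SOURCE --target "$mnt")
    [ -n "$dev" ] && sudo blockdev --setra 4096 "$dev"
done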

ImportError: No module named lxml.etree

Hi
I was trying to setup CBT on a fresh system and encountered the following problem

[root@ceph-node1 cbt]# python cbt.py
Traceback (most recent call last):
  File "cbt.py", line 9, in <module>
    import benchmarkfactory
  File "/root/cbt/benchmarkfactory.py", line 10, in <module>
    from benchmark.cosbench import Cosbench
  File "/root/cbt/benchmark/cosbench.py", line 8, in <module>
    import lxml.etree as ET
ImportError: No module named lxml.etree
[root@ceph-node1 cbt]#

The fix was simple: yum -y install python-lxml. I am creating this issue just for documentation purposes, and I will submit a PR to update the documentation.

Trigger deep scrubs

Ability to trigger deep scrubs to test the impact of deep scrubbing. In hammer there is supposed to be a ceph.conf tunable that allows specifying a time. If that value can be injected, then we can probably do something along the lines of:

current_time = time
schedule_time = current_time + ramp_time

ceph tell * injectargs --some_scrubbing_schedule_key $schedule_time

Can't find ceph config dump in output directories

In librbdfio.py, before every test, dump_config is called to save the ceph config into a file called ceph_settings.out. I don't think this ever completes successfully because that file is never created.

create example files

Make example test run configs for the most common workloads that can get people started.
