
CBT - The Ceph Benchmarking Tool

INTRODUCTION

CBT is a testing harness written in Python that can automate a variety of tasks related to testing the performance of Ceph clusters. CBT does not install Ceph packages; this is expected to be done prior to using CBT. CBT can create OSDs at the beginning of a test run, optionally recreate OSDs between test runs, or simply run against an existing cluster. CBT records system metrics with collectl, and it can optionally collect more information using a number of tools including perf, blktrace, and valgrind. In addition to basic benchmarks, CBT can also do advanced testing that includes automated OSD outages, erasure-coded pools, and cache tier configurations. The main benchmark modules are explained below.

radosbench

RADOS bench testing uses the rados binary that comes with the ceph-common package. It contains a benchmarking facility that exercises the cluster by way of librados, the low level native object storage API provided by Ceph. Currently, the RADOS bench module creates a pool for each client.
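
A minimal sketch of what a radosbench stanza in a CBT yaml file might look like is shown below. The field names and values are illustrative assumptions; check the example configurations in the repository for the authoritative set of options.

benchmarks:
  radosbench:
    time: 300                # run time in seconds
    concurrent_procs: 1      # rados bench processes per client
    concurrent_ops: [128]    # outstanding operations per process
    op_size: [4194304]       # object size in bytes
    write_only: False        # also run the read phases
    pool_profile: 'rbd'      # pool profile defined in the cluster section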

librbdfio

The librbdfio benchmark module is the simplest way of testing the block storage performance of a Ceph cluster. Recent releases of the flexible IO tester (fio) provide an RBD ioengine. This allows fio to test the block storage performance of RBD volumes without any KVM/QEMU configuration, through the userland librbd libraries. These are the same libraries used by the QEMU RBD backend, so it provides a reasonable approximation of KVM/QEMU performance.

kvmrbdfio

The kvmrbdfio benchmark uses the flexible IO tester (fio) to exercise an RBD volume that has been attached to a KVM instance. It requires that the instances be created and have RBD volumes attached before CBT is run. This module is commonly used to benchmark RBD-backed Cinder volumes that have been attached to instances created with OpenStack. Alternatively, the instances can be provisioned with tools such as Vagrant or Virtual Machine Manager.

rbdfio

The rbdfio benchmark uses the flexible IO tester (fio) to exercise an RBD volume that has been mapped to a block device using the KRBD kernel driver. This module is most relevant for simulating the data path for applications that need a block device but, for whatever reason, won't be run inside a virtual machine.
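
For illustration, the sequence this module automates looks roughly like the following; the pool and image names are placeholders.

# Map an RBD image with the kernel driver, then run fio against the block device
sudo rbd map rbdbench/test-image          # appears as /dev/rbd0
sudo fio --name=seqwrite --ioengine=libaio --direct=1 \
         --rw=write --bs=4M --iodepth=64 --runtime=300 \
         --filename=/dev/rbd0
sudo rbd unmap /dev/rbd0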

PREREQUISITES

CBT uses several libraries and tools to run:

  1. python3-yaml - A YAML library for python used for reading configuration files.
  2. python3-lxml - Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API
  3. ssh (and scp) - secure remote command execution and data transfer
  4. pdsh (and pdcp) - a parallel ssh and scp implementation
  5. ceph - A scalable distributed storage system

Note that pdsh is not packaged for RHEL 7 and CentOS 7 based distributions at this time, though the rawhide pdsh packages install and are usable. The RPMs for these packages are available here:

  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-2.31-4.fc23.x86_64.rpm
  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-rsh-2.31-4.fc23.x86_64.rpm
  • ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-ssh-2.31-4.fc23.x86_64.rpm
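
On a RHEL 7 or CentOS 7 host, one way to install them is directly from the URLs above, for example:

sudo yum install \
    ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-2.31-4.fc23.x86_64.rpm \
    ftp://rpmfind.net/linux/fedora/linux/releases/23/Everything/x86_64/os/Packages/p/pdsh-rcmd-ssh-2.31-4.fc23.x86_64.rpm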

Optional tools and benchmarks can be used if desired:

  1. collectl - system data collection
  2. blktrace - block device io tracing
  3. seekwatcher - create graphs and movies from blktrace data
  4. perf - system and process profiling
  5. valgrind - runtime memory and cpu profiling of specific processes
  6. fio - benchmark suite with integrated posix, libaio, and librbd support
  7. cosbench - object storage benchmark from Intel

USER AND NODE SETUP

In addition to the above software, a number of nodes must be available to run tests. These are divided into several categories. Multiple categories can contain the same host if it is assuming multiple roles (running OSDs and a mon for instance).

  1. head - node where general ceph commands are run
  2. clients - nodes that will run benchmarks or other client tools
  3. osds - nodes where OSDs will live
  4. rgws - nodes where rgw servers will live
  5. mons - nodes where mons will live

A user may also be specified to run all remote commands. The host that is used to run cbt must be able to issue passwordless ssh commands as the specified user. This can be accomplished by creating a passwordless ssh key:

ssh-keygen -t dsa

and copying the resulting public key from ~/.ssh to the ~/.ssh/authorized_keys file on all remote hosts.
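
For example, with placeholder hostnames, the key can be generated and distributed like this:

ssh-keygen -t dsa -N ''
for host in client1 osd1 osd2 mon1; do
    ssh-copy-id <user>@$host
done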

This user must also be able to run certain commands with sudo. The easiest method to enable this is to simply enable blanket passwordless sudo access for this user, though this is only appropriate in laboratory environments. This may be accomplished by running visudo and adding something like:

# passwordless sudo for cbt
<user>    ALL=(ALL)       NOPASSWD:ALL

Where <user> is the user that will have passwordless sudo access.
Please see your OS documentation for specific details.

In addition to the above, all of the OSD and mon nodes must be added to the ssh known_hosts file on the head node for things to work properly. Otherwise, the benchmark tests will not be able to run.
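
One convenient way to do this (hostnames are placeholders) is with ssh-keyscan:

for host in client1 osd1 osd2 mon1; do
    ssh-keyscan -H $host >> ~/.ssh/known_hosts
done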

Note that the pdsh command can have difficulties if the sudoers file requires a tty. If this is the case, comment out the Defaults requiretty line using visudo.

DISK PARTITIONING

Currently CBT looks for specific partition labels in /dev/disk/by-partlabel for the Ceph OSD data and journal partitions.
At some point in the future this will be made more flexible; for now this is the expected behavior. Specifically, on each OSD host, partitions should be created with the following GPT labels:

osd-device-<num>-data
osd-device-<num>-journal

where <num> is the device index, ordered starting at 0 and ending with the last device on the system. Currently cbt assumes that all nodes in the system have the same number of devices. A script showing an example of how we create partition labels in our test lab is available here:

https://github.com/ceph/cbt/blob/master/tools/mkpartmagna.sh
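
As a sketch, the labels can be created with sgdisk; the device, partition sizes, and index below are only examples and will destroy any existing data on the device.

sudo sgdisk -n 1:0:+10G -c 1:osd-device-0-journal /dev/sdb
sudo sgdisk -n 2:0:0    -c 2:osd-device-0-data    /dev/sdb
ls /dev/disk/by-partlabel/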

CREATING A YAML FILE

CBT yaml files have a basic structure where you define a cluster and a set of benchmarks to run against it. For example, the following yaml file creates a single node cluster on a node with hostname "burnupiX". A pool profile is defined for a 1x replication pool using 256 PGs, and that pool is used to run RBD performance tests using fio with the librbd engine.

cluster:
  user: 'nhm'
  head: "burnupiX"
  clients: ["burnupiX"]
  osds: ["burnupiX"]
  mons:
    burnupiX:
      a: "127.0.0.1:6789"
  osds_per_node: 1
  fs: 'xfs'
  mkfs_opts: '-f -i size=2048'
  mount_opts: '-o inode64,noatime,logbsize=256k'
  conf_file: '/home/nhm/src/ceph-tools/cbt/newstore/ceph.conf.1osd'
  iterations: 1
  use_existing: False
  clusterid: "ceph"
  tmp_dir: "/tmp/cbt"
  pool_profiles:
    rbd:
      pg_size: 256
      pgp_size: 256
      replication: 1
benchmarks:
  librbdfio:
    time: 300
    vol_size: 16384
    mode: [read, write, randread, randwrite]
    op_size: [4194304, 2097152, 1048576]
    concurrent_procs: [1]
    iodepth: [64]
    osd_ra: [4096]
    cmd_path: '/home/nhm/src/fio/fio'
    pool_profile: 'rbd'

An associated ceph.conf.1osd file is also defined with various settings that are to be used in this test:

[global]
        osd pool default size = 1
        auth cluster required = none
        auth service required = none
        auth client required = none
        keyring = /tmp/cbt/ceph/keyring
        osd pg bits = 8  
        osd pgp bits = 8
        log to syslog = false
        log file = /tmp/cbt/ceph/log/$name.log
        public network = 192.168.10.0/24
        cluster network = 192.168.10.0/24
        rbd cache = true
        osd scrub load threshold = 0.01
        osd scrub min interval = 137438953472
        osd scrub max interval = 137438953472
        osd deep scrub interval = 137438953472
        osd max scrubs = 16
        filestore merge threshold = 40
        filestore split multiple = 8
        osd op threads = 8
        mon pg warn max object skew = 100000
        mon pg warn min per osd = 0
        mon pg warn max per osd = 32768

[mon]
        mon data = /tmp/cbt/ceph/mon.$id
        
[mon.a]
        host = burnupiX 
        mon addr = 127.0.0.1:6789

[osd.0]
        host = burnupiX
        osd data = /tmp/cbt/mnt/osd-device-0-data
        osd journal = /dev/disk/by-partlabel/osd-device-0-journal

To run this benchmark suite, cbt is launched with an output archive directory to store the results and the yaml configuration file to use:

cbt.py --archive=<archive dir> ./mytests.yaml

You can also specify the ceph.conf file to use by specifying it on the commandline:

cbt.py --archive=<archive dir> --conf=./ceph.conf.1osd ./mytests.yaml

In this way you can mix and match ceph.conf files and yaml test configuration files to create parametric sweeps of tests. A script in the tools directory called mkcephconf.py lets you automatically generate hundreds or thousands of ceph.conf files from defined ranges of different options, which can then be used with cbt as described above.
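
For example, a sweep over previously generated configuration files might be driven with a simple loop like this (paths are placeholders):

for conf in /home/nhm/conf/ceph.conf.*; do
    cbt.py --archive=/data/archive/$(basename $conf) --conf=$conf ./mytests.yaml
done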

CONCLUSION

There are many additional and powerful ways you can use cbt that are not yet covered in this document. As time goes on we will try to provide better examples and documentation for these features. For now, it's best to look at the examples, look at the code, and ask questions!

cbt's Issues

monitoring.py - collectl issue

What the script currently has:
rawdskfilt = 'cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

should be ('+' at the beginning):
rawdskfilt = '+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

It is also better to run collectl with sudo and with rotation of the log files (after midnight, for example), so:
common.pdsh(nodes, 'sudo collectl -s+mYZ -i 1:10 -r00:00,7 --rawdskfilt "%s" -F0 -f %s' % (rawdskfilt, collectl_dir))

Resources required to delete a pool affect subsequent test run

Many benchmarks delete and recreate the test pool between runs. However, the Ceph command to delete a pool returns immediately, and the work of deleting objects in the pool takes place in the background. Unfortunately, experience has shown that the disk and CPU resources used while deleting the objects are great enough to influence the test results for the subsequent run.

One way to avoid the problem is to have the cluster.rmpool() function wait until the disk and CPU utilization on the OSD nodes drops to a reasonable level before returning to the caller. I will be issuing a pull request with this change.
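
For illustration, one possible shape of such a wait is sketched below; this is not the actual pull request, and the node list format, load threshold, and timeouts are placeholders.

# Sketch: poll the 1-minute load average on the OSD nodes via pdsh until it
# drops below a threshold, or give up after a timeout.
import subprocess
import time

def wait_for_idle(nodes, max_load=0.5, interval=10, timeout=600):
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.check_output(
            ['pdsh', '-R', 'ssh', '-w', nodes, 'cat /proc/loadavg'],
            universal_newlines=True)
        # pdsh prefixes each output line with "host: ", so field 1 is the 1-minute load
        loads = [float(line.split()[1]) for line in out.splitlines() if line.strip()]
        if loads and max(loads) < max_load:
            return
        time.sleep(interval)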

wrong collectl syntax for rawdskfilt in monitoring.py

Wrong syntax in monitoring.py causes collectl not to run:

collectl -s+mYZ -i 1:10 --rawdskfilt "+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ " -F0 -f /tmp/cbt/00000000/LibrbdFio/osd_ra-00001024/op_size-00004094/concurrent_procs-004/iodepth-064/read/pool_monitoring/collectl
Quantifier follows nothing in regex; marked by <-- HERE in m/+ <-- HERE cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ / at /usr/share/collectl/formatit.ph line 235.

so instead of
rawdskfilt = '+cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

it should be
rawdskfilt = 'cciss/c\d+d\d+ |hd[ab] | sd[a-z]+ |dm-\d+ |xvd[a-z] |fio[a-z]+ | vd[a-z]+ |emcpower[a-z]+ |psv\d+ |nvme[0-9]n[0-9]+p[0-9]+ '

concurrent_procs - unable to change the value

Hi,
changing the concurrent_procs option value in the yaml file doesn't have any effect on cbt. It always runs with the value '3'. I found this while using the librbdfio benchmark.

THX

cbt getting idle while executing rpdcp command

cbt hangs while trying to copy collectl data by executing the rpdcp command, because it gets the nodes via
nodes = settings.getnodes('clients', 'osds', 'mons', 'rgws')
When the same hosts are defined under clients, osds, mons, and rgws, the collectl command is executed on each of them 4 times. rpdcp then cannot finish copying the files because they are still open.

radosbench fails if client cannot read admin keyring

The radosbench test uses the 'rados' tool, which requires the client admin key to access the cluster. The rados command will fail unless the user specified in the config file has read access to the client admin keyring.

A workaround exists, and that is to set "cmd_path: 'sudo /usr/bin/rados'" in the test configuration. But this is cumbersome and breaks the ability to run the command under valgrind.

OSError: [Errno 2] No such file or directory 09:45:52 - ERROR - cbt - During tests

I have installed ceph on my head and client nodes, and the ceph cluster is up and running. I have no idea which file or directory it needs!

09:41:23 - ERROR - cbt - During tests
Traceback (most recent call last):
File "./cbt.py", line 64, in main
b.initialize()
File "/root/cbt-master/benchmark/librbdfio.py", line 68, in initialize
super(LibrbdFio, self).initialize()
File "/root/cbt-master/benchmark/benchmark.py", line 44, in initialize
self.cluster.cleanup()
File "/root/cbt-master/cluster/ceph.py", line 210, in cleanup
common.pdsh(nodes, 'sudo rm -rf %s' % self.tmp_dir).communicate()
File "/root/cbt-master/common.py", line 70, in pdsh
return CheckedPopen(args,continue_if_error=continue_if_error)
File "/root/cbt-master/common.py", line 20, in init
self.popen_obj = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
File "/usr/lib64/python2.7/subprocess.py", line 711, in init
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

Create Module for All-to-All Network Tests

Use iperf for all-to-all network tests. Base the code on something vaguely like:

Client to OSD tests:

#!/bin/bash

for i in 8 9 10 11 12 13
do
    val=$((62+$i))
    pdsh -R ssh -w osd[$i] iperf -s -B 172.27.50.$val &
done

#!/bin/bash

for i in 0 1 2 3 4 5 6 7
do
    for val in 70 71 72 73 74 75
    do
        pdsh -R ssh -w client[$i] iperf -c 172.27.50.$val -f m -t 60 -P 1 > /tmp/iperf_client${i}to${val}.out &
    done
done

#!/bin/bash

for i in 8 9 10 11 12 13
do
    val=$((62+$i))
    pdsh -R ssh -w osd[$i] iperf -s -B 172.27.49.$val &
done

#!/bin/bash

for i in 8 9 10 11 12 13
do
    for val in 70 71 72 73 74 75
    do
        pdsh -R ssh -w osd[$i] iperf -c 172.27.49.$val -f m -t 60 -P 1 > /tmp/iperf_${i}to${val}.out &
    done
done

Update CBT to support RBD Erasure Coding testing

CBT currently does not support testing RBD erasure coding (Ceph 12.2.2). ceph.py needs to be updated so that when librbdfio.py calls mkpool with a data pool specified, ceph.py uses the data_pool_profile instead of the pool_profile.

Since librbdfio will append "-data" to the pool name if a user has specified a data_pool, the following could be a possible solution.

cbt/cluster/ceph.py:

def mkpool(self, name, profile_name, application, base_name=None):
    if 'data' in name:
        pool_profiles = self.config.get('data_pool_profiles', {'default': {}})
    else:
        pool_profiles = self.config.get('pool_profiles', {'default': {}})
    . . .

Concurrent take root testing

A cluster composed of nodes with a mix of SSDs and HDDs, where the SSDs are being used as OSDs and not for a cache tier. Currently, we can test either the flash take root or the HDD take root with CBT, but not both concurrently. Add the ability to have clients test pools with different take roots concurrently.

More explicit failure indication in cbt run.

When executing the cbt.py test suite, it is very hard to figure out which steps failed/passed.
My experience with this tool is very limited as I just started using it, but I see that the pdsh commands fail without any error, so it is hard to decipher why.

Also, the use_existing flag in the cluster: configuration in the yaml file should be highlighted when running against an existing cluster. Once I get through a successful execution I will create a pull request for any doc changes, if that makes sense, and for other issues as I see them.

Another issue I see is that the username and group name are assumed to be the same, which is not always the case. It might be useful to add a groups field as well.

Lastly -->
Now I think I have gotten past some of my initial hurdles and am able to execute an fio benchmark, but I am not sure what comes next.

The last step I see is:

21:30:37 - DEBUG - cbt - pdsh -R ssh -w [email protected],[email protected],[email protected] sudo chown -R behzad_dastur.behzad_dastur /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite/* 21:30:37 - DEBUG - cbt - rpdcp -f 1 -R ssh -w [email protected],[email protected],[email protected] -r /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite/* /tmp/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/randwrite

I can see logs created at:

[root@cbtvm001-d658 cbt]# ls /tmp/00000000/LibrbdFio/osd_ra-00004096/op_size-01048576/concurrent_procs-001/iodepth-064/read/ collectl.b-stageosd001-r19f29-prod.acme.symcpe.net collectl.v-stagemon-002-prod.abc.acme.net output.0.v-stagemon-001-prod.abc.acme.net collectl.v-stagemon-001-prod.abc.acme.net historic_ops.out.b-stageosd001-r19f29-prod.abc.acme.net
Are there ways to visualize this data now?

OSError: [Errno 2] No such file or directory while running radosbench test

17:56:52 - DEBUG - cbt - pdsh -R ssh -w root@ceph echo 3 | sudo tee /proc/sys/vm/drop_caches
^@17:56:53 - ERROR - cbt - During tests
Traceback (most recent call last):
File "./cbt.py", line 71, in main
b.run()
File "/root/cbt/benchmark/radosbench.py", line 68, in run
self._run('write', '%s/write' % self.run_dir, '%s/write' % self.out_dir)
File "/root/cbt/benchmark/radosbench.py", line 81, in _run
rados_version_str = subprocess.check_output(["rados", "-v"])
File "/usr/lib64/python2.7/subprocess.py", line 568, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

clocksource option for fio

Adding a clocksource option for fio benchmarks would be useful so that gettimeofday can be selected when multiple fio processes are running on multiple hosts. Assuming ntp is properly configured, this would make aggregating data much easier than it would be with independent, relative timestamps.
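
A hypothetical sketch of what the option might look like in a job file, passing through fio's existing clocksource= parameter; the CBT key name here is an assumption, not an implemented option.

benchmarks:
  librbdfio:
    time: 300
    mode: [randwrite]
    clocksource: 'gettimeofday'   # hypothetical pass-through to fio's clocksource=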

librbdfio fails when client's hostname differs from identifier in CBT job file

The librbdfio benchmark creates RBDs using node names specified in the job file. However, the clients try to access the RBDs using their own "hostname -s" name. If the names are not the same then tests will fail, with output like this in the output files:

Starting 1 process
rbd engine: RBD version: 0.1.9
rbd_open failed.
fio_rbd_connect failed.

I will be submitting a fix.

OSD expansion testing

The current recovery machinery for CBT marks an OSD or group of OSDs out, then later marks it back in. It would be great if there were a way to mark an OSD or group of OSDs out, destroy them, and then add them back into the cluster.

Option to restart Ceph services

When changing Ceph parameters, some of them require Ceph services, such as the OSDs, to be restarted. CBT can only recreate a brand new cluster or use an existing cluster. It would be nice to have something like use_existing that also restarts all Ceph services and checks health before starting the test(s).
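
For reference, on systemd-based installs this can be approximated by hand today with something along these lines (a sketch, not a CBT feature):

sudo systemctl restart ceph-osd.target
sudo systemctl restart ceph-mon.target
ceph health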

population of fio files should use fio_cmd provided in yaml

In using kvmrbdfio.py to run fio on my OpenStack guests, the population of the fio files does not use the path to the fio executable (fio_cmd) provided in the yaml file. I define the fio command path ...

----------- yaml excerpt -----------
benchmarks:
  kvmrbdfio:
    fio_cmd: "/usr/local/bin/fio"
----------- end excerpt ------------

... however the populate call uses no path and relies on the fio executable being in the $PATH defined for sudo users, which excludes /usr/local/bin where my fio resides, so the population fails on the target systems.

Option to save results archive in S3

It would be great to have the ability to save the results archive to an S3 bucket instead of only on the local filesystem of the head node.

add simple cephfs support (cephfsfio)

Update CBT to include simple CephFS support:

  • Reuse the rbdfio benchmark
  • Add a new cephfsfio.py benchmark and examples
  • Update ceph.py to create MDSs

Client concurrency scaling

It would be great to have the ability to specify the clients, eg:

clients: [client1, client2, client3, client4, client5, client6]

Then also specify an array with the number of clients to use for each iteration, eg:

num_clients: [1, 2, 4, 6]

This would be useful for finding the client concurrency contention threshold for a cluster.

script hangs while performing rpdcp

Hi
while performing rpdcp the script just hangs. However, all files and folders are copied to the target ARCH directory, but the script doesn't continue with the other steps.

Any idea?


6:42:18 - DEBUG - cbt - Nodes : ceph@cbt-node1,cbt-node2,cbt-node3
16:42:18 - DEBUG - cbt - CheckedPopen continue_if_error=True args=pdsh -f 3 -R ssh -w ceph@cbt-node1,cbt-node2,cbt-node3 sudo pkill -SIGINT -f blktrace
16:42:19 - DEBUG - cbt - Nodes : ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3
16:42:19 - DEBUG - cbt - CheckedPopen continue_if_error=False args=pdsh -S -f 6 -R ssh -w ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3 sudo chown -R ceph.ceph /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read/*

16:42:19 - DEBUG - cbt - CheckedPopen continue_if_error=False args=rpdcp -f 10 -R ssh -w ceph@cbt-node1,ceph@cbt-node2,ceph@cbt-node3,ceph@cbt-node1,cbt-node2,cbt-node3 -r /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read/* /tmp/ARCH1/00000000/LibrbdFio/osd_ra-00004096/op_size-04194304/concurrent_procs-003/iodepth-064/read

Thanks

how to parse cbt results?

Hi,
just finished a benchmark of our Ceph cluster with the CBT tool and now I'm confused about how to deal with the output data. I got a large directory structure with a lot of "id" directories like:

results/00000000/id3309284191688069537/

which contain fio log files and some files in json format.

My question is whether there is any tool or simple way to parse this output data into a human-readable format, so that I would also be able to compare two or more CBT benchmark tests.

Thank you

radosbench should not depend on ceph-common to be installed on head-node

cbt checks the rados version to determine the syntax of the object size argument for rados bench. It does so locally on the head node (by running rados -v), which fails if the rados binary (from ceph-common on Red Hat systems) is not installed.
Instead, cbt should ask the first client node for the version, given that it assumes all clients have the same rados version installed.

Readahead for use_existing

When using use_existing, readahead is not set on the OSDs because they do not have the labels CBT expects. We should probably determine which devices are mounted in /var/lib/osd/* and then set their readahead values.
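
A sketch of the suggested approach; the mount point pattern and readahead value are examples and will vary by deployment.

for mnt in /var/lib/ceph/osd/*; do
    dev=$(findmnt -rn -o SOURCE --target "$mnt")
    [ -n "$dev" ] && sudo blockdev --setra 4096 "$dev"
done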

ImportError: No module named lxml.etree

Hi
I was trying to setup CBT on a fresh system and encountered the following problem

[root@ceph-node1 cbt]# python cbt.py
Traceback (most recent call last):
  File "cbt.py", line 9, in <module>
    import benchmarkfactory
  File "/root/cbt/benchmarkfactory.py", line 10, in <module>
    from benchmark.cosbench import Cosbench
  File "/root/cbt/benchmark/cosbench.py", line 8, in <module>
    import lxml.etree as ET
ImportError: No module named lxml.etree
[root@ceph-node1 cbt]#

The fix was simple: yum -y install python-lxml. I am creating this issue just for documentation purposes, and I will submit a PR to update the documentation.

Trigger deep scrubs

Ability to trigger deep scrubs to test the impact of deep scrubbing. In hammer there is supposed to be a ceph.conf tunable that allows specifying a time. If that value can be injected, then we can probably do something along the lines of:

current_time = time
schedule_time = current_time + ramp_time

ceph tell * injectargs --some_scrubbing_schedule_key $schedule_time

Can't find ceph config dump in output directories

In librbdfio.py, before every test, dump_config is called to save the ceph config into a file called ceph_settings.out. I don't think this ever completes successfully because that file is never created.

create example files

Make example test run configs for the most common workloads that can get people started.
