utsaslab / crashmonkey
CrashMonkey: tools for testing file-system reliability (OSDI 18)
License: Apache License 2.0
Part of the revised version of #12.
Checkpoints require support across many parts of CrashMonkey. This part is meant to give user workloads the ability to tell the CrashMonkey test harness that they want to create a checkpoint. CrashMonkey should provide both a stub binary to accomplish this task (similar to the current stubs in the user_tools directory) as well as a small API for tests subclassed from BaseTestCase.h. This utility can make use of the sockets class available in the utils/communication directory.
For checkpoints, we can assume 2 things:
The stub program or API for this part should do 2 things:
After the stub has received confirmation that the checkpoint operation has completed, it should exit with no error (for the binary) or return to the caller.
The test harness portion of CrashMonkey crashes with an index out of bounds exception when no bios are logged/transferred to user space.
Running sudo ./c_harness -f /dev/vda -d /dev/cow_ram0 -t btrfs -e 102400 tests/rename_root_to_sub.so on a build from master should trigger this bug (potentially substituting /dev/vda with an existing disk in the VM).
I'm trying to test CrashMonkey, but I get an error when trying to register a kernel module. I also changed the kernel version on some other PCs, but it shows the same results.
Please check this problem.
root@junghan-nuc:~/crashmonkey/build# ./c_harness -f /dev/vda1 -d /dev/cow_ram0 -t ext2 tests/rename_root_to_sub.so -v
running 0x7ffc10bb66f8
========== PHASE 0: Setting up CrashMonkey basics ==========
Inserting RAM disk module
Loading test case
Loading permuter
Updating dirty_expire_time_centisecs to 3000
========== PHASE 1: Creating base disk image ==========
Formatting test drive
mke2fs 1.42.13 (17-May-2015)
Discarding device blocks: done
Creating filesystem with 10240 1k blocks and 2560 inodes
Filesystem UUID: 0637732c-1f79-4ed0-84ca-125bca2fb70a
Superblock backups stored on blocks:
8193
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
Mounting test file system for pre-test setup
Running pre-test setup
Unmounting test file system after pre-test setup
Making new snapshot
cloning device /dev/cow_ram0
========== PHASE 2: Recording user workload ==========
Clearing caches
Inserting wrapper module into kernel
insmod: ERROR: could not insert module ../build/disk_wrapper.ko: Cannot allocate memory
Error inserting kernel wrapper module
rmmod: ERROR: Module cow_brd is in use
Unable to remove cow_brd device
root@junghan-nuc:~/crashmonkey/build#
Many times, the CrashMonkey test fails with an assertion error in the random permuter, which leaves it unable to decide the result of the test. The error looks like:
c_harness: permuter/RandomPermuter.cpp:307: void fs_testing::permuter::RandomPermuter::AddEpochs(const iterator&, const iterator&, const iterator&, const iterator&): Assertion 'current_res != res_end' failed.
We noticed this happens more frequently while testing for ext4 than the other file systems.
Recently, a couple of gentlemen have been using dm-log-writes and xfstests (available here) to find crash-consistency bugs. We should reproduce these bugs in CrashMonkey as well.
Currently, for each VM, only one CrashMonkey instance is running. This wastes a lot of computational power. It would be much more efficient to run X instances of CrashMonkey if there are X cores on the machine.
This would require running X wrapper devices per virtual machine. I'm not sure what kernel problems we will run into when doing this.
Since CrashMonkey has terminology that is either new or used in a new context, we should create a wiki page that defines common terms in CrashMonkey that users may not be familiar with. This will help people discuss concepts of CrashMonkey in a coherent manner.
Something appears to be off with the little RAM block device (cow_brd) that I created earlier in the project. The device mapper target has a two-part system that is already in place through the snapshot-origin and snapshot targets. There is also a library for device mapper that can be used to programmatically control device mapper targets.
The dm target has the advantage of upstream support as well as C library support in the form of libdevmapper.
It also has a bash interface, and a small script like the following can be used to rig up a simple snapshot device:
#! /bin/bash
SNAP_BASE=/dev/ram0
SNAP_BASE_NAME=snap_base
SNAP_DEV=/dev/ram1
SNAP_NAME=snap_snap
set -x
DEV_SIZE=$(blockdev --getsz "$SNAP_BASE")
echo "0 $DEV_SIZE snapshot-origin $SNAP_BASE" | dmsetup create "$SNAP_BASE_NAME"
ORIG_BASE=/dev/mapper/$SNAP_BASE_NAME
ORIG_SIZE=$(blockdev --getsz "$ORIG_BASE")
dmsetup create "$SNAP_NAME" --notable
echo "0 $ORIG_SIZE snapshot $ORIG_BASE $SNAP_DEV n 8" \
    | dmsetup load "$SNAP_NAME"
# A loaded table is inactive until resumed.
dmsetup resume "$SNAP_NAME"
dmsetup mknodes
Despite the fact that a bash interface exists to communicate with device mapper targets, I feel an implementation using the libdevmapper library would be preferable.
I made the mistake of exposing kernel bio flags to user space in earlier versions of CrashMonkey. This is not a portable design choice and needs to be fixed. Instead of directly using kernel bio flags in user space, the kernel code in CrashMonkey should translate from kernel flags to CrashMonkey specific flags. This enables portability of user space code in CrashMonkey. Another project similar to CrashMonkey already does this and has been added to the kernel. The code that accomplishes this can be found here.
At least the following flags/concepts should have their own defines in CrashMonkey:
Part of the revised version of #12.
Checkpoints require support across many parts of CrashMonkey. This part is meant to provide the ability for the disk_wrapper to actually create a checkpoint. This should be implemented as part of the ioctl created in #40.
For checkpoints, we can assume 2 things:
When a checkpoint request is received as an ioctl, the disk_wrapper should insert a new disk_write_op into the sequence to signify that a checkpoint was made. This operation should have no data, but should have flags to denote that it is a checkpoint. New flags will need to be created to signify checkpoint operations, as the current flags don't reflect that.
Insertion into the list of disk operations should be done in a thread-safe manner. It could be the case that another process is attempting to insert a write into the list, so be sure to use proper locking to ensure nothing is lost.
The checkpoint operation should appear like all the other operations in the list so that it can be transferred to user space like all the others.
So it seems there is no support for C++. I was trying to port the code to kernel 3.16, and a header included in the compilation of RandomPermuter.so (linux/stddef.h) tries to define true and false, which are keywords in C++.
A workaround is to remove the link to the kernel headers in the compilation of RandomPermuter.so and create a local file with a copy of the needed values from linux/blk_types.h. But I don't like it.
xfstests in kdave's repo has a source file for a program called fsx that is meant to perform random write/truncate/allocate operations on a single file in the file system. This would be a useful tool for the CrashMonkey team because it would allow us to quickly bootstrap random tests.
The fsx program has a few extra things the CrashMonkey team may not need. These should be removed from fsx if they are not needed, or should be created somewhere other than on the file system under test if they are useful. fsx has an algorithm to generate data, but I do not know if it generates data the CrashMonkey team can easily check for if given an offset in a file. Part of CrashMonkey's tests include checks for proper data in files, so we would like to make sure that we know what data is being written to a file where. If needed, the fsx algorithm to create data to write to a file should be modified so that it writes data CrashMonkey can easily verify.
Now that the project is getting bigger, the Makefile should be modified so that builds are cleaner.
Included code should be compiled into libraries where possible and linked where needed instead of provided directly as it is now.
Compiled code should also be placed in its own build directory instead of alongside the source files which generated it. This will make make clean much easier to define, as well as making it easier to avoid checking binaries into git.
File watches are a feature that would make checking for data consistency a lot easier on users. Therefore, CrashMonkey should support a system where a user can tell CrashMonkey what files should no longer change. These watches may be tied to certain checkpoints in the workload, or they may be something that holds through the entire workload.
For watches, we can assume a few things:
This part of the watch infrastructure gives user workloads the ability to tell CrashMonkey to watch a file. Since workloads can be run either by CrashMonkey (by subclassing BaseTest.h) or with CrashMonkey in the background, we need to provide both a stub binary and a simple API to set up watches. Watch setup should use sockets to communicate with the CrashMonkey test harness (see utils/communication/).
When the user requests a watch on a file, the stub should do the following:
The code to manage block devices is currently a part of the main harness code. Since the functionality of this code is very narrowly scoped and it is not directly related to how the test harness should be run, it should likely be moved into a utility class or module.
Part of the revised version of #12.
Checkpoints require support across many parts of CrashMonkey. This part slightly modifies how crash states are generated so that we can give user consistency tests more information about the crash state they are working with.
For checkpoints, we can assume 2 things:
When a new crash state is generated, the Permuter (or subclass) that generated the crash state should inform the CrashMonkey test harness of the most recent checkpoint passed in the bio sequence. An example of a workload, generated crash state, and checkpoint number are shown below.
workload:
+-------------+-------------+-------------+-------------+-------------+
| epoch 1 | epoch 2 | checkpoint | epoch 3 | epoch 4 |
+-------------+-------------+-------------+-------------+-------------+
generated crash state:
+-------------+-------------+-------------+-----------------+
| epoch 1 | epoch 2 | checkpoint | partial epoch 3 |
+-------------+-------------+-------------+-----------------+
returned checkpoint value: 1
Another example could be:
workload:
+-------------+-------------+-------------+-------------+-------------+
| epoch 1 | epoch 2 | checkpoint | epoch 3 | epoch 4 |
+-------------+-------------+-------------+-------------+-------------+
generated crash state:
+-------------+-----------------+
| epoch 1 | partial epoch 2 |
+-------------+-----------------+
returned checkpoint value: 0
Hi,
If I include fsync in the OperationSet list (ace/ace.py line 59) and run python ace.py -l 1 -n False -d False, ACE fails with the following error message:
Traceback (most recent call last):
File "ace.py", line 1463, in <module>
main()
File "ace.py", line 1422, in main
doPermutation(i)
File "ace.py", line 1242, in doPermutation
cur_line = buildJlang(modified_sequence[insert], length_map)
File "ace.py", line 1017, in buildJlang
ret = flat_list[2]
IndexError: list index out of range
Thanks.
Comments in the kernel indicate that the bi_sector our current logging records may not be relative to the partition we're monitoring, but relative to the entire block device (offending comment). We need to determine whether the sector being logged is relative to the partition of the device we are monitoring or to the disk itself (ex. relative to /dev/sda1 or /dev/sda).
Another system, called log-writes, performs logging similar to CrashMonkey, but uses device mapper targets instead. Eventually, we would like to move CrashMonkey over to a more standard device mapper target. However, before we do that, we would like to know what the pain points will be. We should use the dm-log-writes target to determine if the sectors logged are relative to the start of the partition being monitored or relative to the start of the block device.
An easy way to check this is to log some operations with the log-writes system and then try to replay those operations onto a device with a different number of partitions and/or with partitions of different sizes than the device logging was originally done on.
Part of the revised version of #12.
Checkpoints require support across many parts of CrashMonkey. This part is meant to provide the ability for the CrashMonkey test harness to tell the disk_wrapper to create a checkpoint. This can be implemented as an ioctl call to the disk_wrapper.
For checkpoints, we can assume 2 things:
The ioctl should be a synchronous call. The work done in the ioctl is in #39.
Monitoring the program running the workload with ptrace inside CrashMonkey would make things a little easier for users in a few ways, including:
Based on the above list, the main goals of adding ptrace to CrashMonkey should be:
Intercept the data passed to write so that it can be correlated with logged bios.
https://github.com/kdave/xfstests
Infrastructure to do the following:
We should be able to support a workflow like this:
The user's workload shouldn't have to be written in C++ inside CrashMonkey
I have a long email thread in my inbox with @ashmrtn about this. Will add the summary from that thread here later.
In short, we want to have some mechanism to know what data/metadata to expect in each crash state. The idea is to allow users to call Checkpoint, which captures the user-visible state (directory tree + data) of the file system somewhere. On a crash, we go back to the latest Checkpoint and see if we have all the data in there.
File watches are a feature that would make checking for data consistency a lot easier on users. Therefore, CrashMonkey should support a system where a user can tell CrashMonkey what files should no longer change. These watches may be tied to certain checkpoints in the workload, or they may be something that holds through the entire workload.
For watches, we can assume a few things:
This part of the watch infrastructure implements the logic for file watches. For each file watch made, the CrashMonkey test harness should checksum the data and selected metadata for the specified file. Metadata that does not change on every access (ex. file size, file type, and permissions, but not things like accessed time) should be included in the checksum. Checksums should be stored in a hashmap (or hashmap-like structure) that maps the file name to the checksum.
There is no limit on the number of times a file can be added to watches. Therefore, each filepath<->checksum hashmap should be stored according to the checkpoint that it corresponds to (ex. if a file is watched referencing checkpoint 1 and checkpoint 2 -- as two separate calls to watch -- then there should be a hashmap corresponding to checkpoint 1 watches and a hashmap corresponding to checkpoint 2 watches).
The code to insert and remove kernel modules is currently a part of the main harness code. Since the functionality of this code is very narrowly scoped and it is not directly related to how the test harness should be run, it should likely be moved into a utility class or module.
Comments in the kernel indicate that the bi_sector our current logging records may not be relative to the partition we're monitoring, but relative to the entire block device (offending comment). We need to determine whether the sector being logged is relative to the partition of the device we are monitoring or to the disk itself (ex. relative to /dev/sda1 or /dev/sda).
An easy way to check this is to log some operations with CrashMonkey and then try to replay those operations onto a device with a different number of partitions and/or with partitions of different sizes than the device logging was originally done on.
We currently have the information from a run of CrashMonkey spread across too many files and logs, which makes interpreting tests hard. Let's consolidate this into one file.
A number of bugs only occur when the file system is close to full (storage space almost fully utilized). Add this as part of the testing.
Some bugs only appear when the kernel is low on memory. Need to figure out how to add those.
Part of the revised version of #12.
Checkpoints require support across many parts of CrashMonkey. This part slightly modifies the way user tests are called so that checkpoint information is passed to user tests.
For checkpoints, we can assume 2 things:
The part of CrashMonkey that calls user tests should be modified to pass along the checkpoint number (generated by #41). BaseTestCase.h should be modified such that this is allowed.
Set it up so that we crash during xfstests and check if the file system is consistent.
This "macro-test" will yield a lot of interesting crash states.
CrashMonkey currently only works with 3.x versions of the Linux kernel. It should be updated to work with 4.x versions of the kernel as well.
User provided tests should have a set of utilities/methods that they can call into which provide them with things like the directory the test file system is mounted at and the file system size.
File watches are a feature that would make checking for data consistency a lot easier on users. Therefore, CrashMonkey should support a system where a user can tell CrashMonkey what files should no longer change. These watches may be tied to certain checkpoints in the workload, or they may be something that holds through the entire workload.
For watches, we can assume a few things:
This part of the watch infrastructure allows CrashMonkey to report errors when the files being watched change in generated crash states.
Each time a crash state is generated, CrashMonkey should examine the checkpoint for the crash state (see #41/#42) and then check all file watches referencing that checkpoint and earlier.
When "checking" a watched file, CrashMonkey should checksum the file data and selected metadata (see #44) at that path in the generated crash state. If the crash state's checksum does not match the checksum calculated when the file watch was set up, then CrashMonkey should note the error in detail (ex. "checksum for file is incorrect" -- see DataTestResult.h for an example of error strings) in a results/xResult struct (you will likely have to modify or make a new struct for this). Recording specific errors in an xResult struct will allow these errors to be printed to the log later in test harness execution.
Currently, the user-space component of CrashMonkey generates tests that run on CrashMonkey's custom kernel module. We would like to generate tests that use dm-flakey (https://www.kernel.org/doc/Documentation/device-mapper/dm-flakey.txt) since dm-flakey is already in the Linux kernel. The advantage of doing so is that the tests CrashMonkey produces can directly be added to xfstests and run by Linux kernel developers.
For example, Jayashree is now porting CrashMonkey tests to dm-flakey tests manually and adding them to xfstests: https://www.spinics.net/lists/fstests/msg10767.html. An adaptor for dm-flakey would make this automatic.
#17 and #48 brought in a log file that the failing tests are printed to. This log output contains the indices of the bios in the crash state, and the order they were written out to disk to form the crash state. This list of indices should be augmented to show whether each bio was a metadata bio or a data bio.
Metadata bios will be denoted by the META flag in the bio itself. Data bios will not have that flag.
Since the amount of data stored in the test result object is minimal at this time, we will likely have to expand the member variable that contains the crash state index information. Expanding that to have <index, data> pairs should suffice.
Sample output for this change could look like/similar to the following:
Test #26 FAILED: file missing: test file has completely disappeared
last checkpoint: 2
crash state: 0 (M), 1 (D), 2 (D), 4 (D), 3 (D), 5 (M)
Have CrashMonkey run in a "fuzzer" mode where:
Test N crash states at once, where N is the number of cores on the test machine.
The permuters should have unit tests associated with them to make sure they are working properly. These tests should include things like checking that the permuter works properly when no bios or only 1 bio is logged, etc.
Tests should be placed in the test directory in the repo.
Look at Linux kernel mailing lists and file-system specific lists such as linux-ext4 and linux-btrfs to collect bugs we could attempt to reproduce.
Users may want to exit CrashMonkey before the test harness has finished a complete run. They should be able to hit ctrl-c on the shell to kill it and expect CrashMonkey to clean up resources properly.
Most of this should just be catching the proper signal and then calling cleanup_harness() in the instance of the Tester object that harness/c_harness.cpp has.
Right now, I know that background communication sockets aren't cleaned up, kernel module(s) aren't removed, and file systems aren't unmounted (depending on when ctrl-c is hit).
To aid correlation between writes to disk and recorded bios, all logs generated by the -l flag (that's a lowercase 'L', not an uppercase 'i') should be in hex.
The goal of this is to allow users to run CrashMonkey with strace -x logging the write system calls. Then, users can directly correlate the hex strings strace logs for system calls with the data in bios recorded by CrashMonkey.
Based on experiences from some of the newer people on the CrashMonkey team, it seems that it is hard to determine why CrashMonkey failed to run properly if an error occurs. Therefore, the error messages in CrashMonkey should be updated to make it easier to understand what went wrong.
Flush operations are defined oddly. They make sure that the data in the device cache is persisted, but will not make sure the data in the request itself is persisted (link). In light of this, CrashMonkey should split flush operations that have data. The flush operation itself should end the current epoch in CrashMonkey, but the data should be placed in the next epoch, as it is not guaranteed to be persisted in the epoch the flush just ended. This should probably be done in the Permuter class when it initializes internal data structures so that this behavior is transparent to user-implemented permuters.
There's a bug in the initialization code of disk_wrapper that causes it to be unremovable (and thus forces a system restart) if the device it's supposed to pull IO scheduler flags from does not exist.
The device is stable otherwise, but this should be fixed to make the system more resilient.
This is a list of file system bugs that the current implementation of CrashMonkey cannot reproduce because we don't have the infrastructure for it. If we add the infrastructure in the future, these might be interesting to try to recreate.
The current system is a simple pass/fail return from user data consistency tests. We need to update this to allow user data consistency tests to output meaningful errors that help with finding file system bugs. Without this, it is not easy to determine why a test failed since you can only see summary information about the tests.
Currently, the user specifies how many crash states to test, and CrashMonkey keeps running tests until it reaches that number. For large tests (>= ~10 bios in a single epoch) this is fine. For smaller tests, however, unless the user manually counts how many possible crash states there are and sets the options accordingly, it will cause CrashMonkey to loop infinitely trying to find enough unique crash states to satisfy the command line argument.
CrashMonkey should be updated to reduce the number of tests it will run on small workloads so that it does not spin forever trying to generate unique crash states. It should print a message when it does this so that users are aware of this behavior.
As development goes along, I find that more and more of the flags are letters that don't really relate to what the flag actually does. It would be nice to clean these up and make them sane values that relate to what the flag actually does.