
qa-assets's Introduction

qa-assets

Bitcoin Core related blobs used for quality assurance.

Fuzz inputs

qa-assets/fuzz_seed_corpus contains one input corpus per fuzz target: one folder per target, named after the target.

Contributing inputs

For documentation on how to fuzz Bitcoin Core please see fuzzing.md.

If you want to contribute fuzz inputs, please merge the inputs before submitting a pull request. You can use the libFuzzer option -set_cover_merge=1 (recommended together with -use_value_profile=0) or the --m_dir option of the fuzz runner, test_runner.py.
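The merge step above might look roughly like this (the target name and all paths are hypothetical placeholders; adjust them to your own build and checkout):

```shell
#!/bin/sh
# Sketch of merging new fuzz inputs into an existing corpus before a PR.
# FUZZ_BIN, TARGET and the directories below are assumptions, not fixed paths.
FUZZ_BIN="${FUZZ_BIN:-./src/test/fuzz/fuzz}"
TARGET="${TARGET:-process_message}"              # hypothetical target name
CORPUS="qa-assets/fuzz_seed_corpus/$TARGET"      # existing corpus folder
NEW_INPUTS="/tmp/new_inputs/$TARGET"             # inputs you generated

if [ -x "$FUZZ_BIN" ]; then
  # Option 1: libFuzzer's set-cover merge; only inputs that add coverage
  # over the existing corpus get copied into it.
  FUZZ="$TARGET" "$FUZZ_BIN" -set_cover_merge=1 -use_value_profile=0 \
      "$CORPUS" "$NEW_INPUTS"
  # Option 2: the fuzz runner's merge mode (exact flags may differ per version):
  # test/fuzz/test_runner.py --m_dir="$NEW_INPUTS" "$CORPUS" "$TARGET"
else
  echo "fuzz binary not found at $FUZZ_BIN; nothing merged"
fi
```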

Pruning inputs

  • Over time, fuzz engines reduce inputs (produce smaller inputs that yield the same coverage), leaving the older, larger non-reduced inputs in our corpora redundant.
  • Code changes can cause inputs to lose their coverage.

To avoid corpora bloat, stale inputs and potential CI timeouts, we usually prune/minimize our corpora around the branch-off point using the delete_nonreduced_fuzz_inputs.sh script (it is recommended to run it in a fresh VM; see the documentation in the script). The script is usually run twice to ensure that the results are "somewhat" reproducible (e.g. #119 (comment)).

After pruning the corpora, coverage should not have dropped at all.

Pulling inputs from oss-fuzz

Use download_oss_fuzz_inputs.py to pull fuzz inputs from oss-fuzz.

qa-assets's People

Contributors

adamjonas, agroce, ajtowns, amitiuttarwar, chinggg, crypt-iq, darosior, dergoegge, dhruv, dunxen, evanbaer, fanquake, glozow, maflcko, marcofleon, murchandamus, practicalswift, pstratem, sipa, vasild

qa-assets's Issues

Consider removing unnecessarily large inputs which are causing excessive corpus processing runtime

The following corpus directories contain some very large input files that unnecessarily push the fuzzing runtime beyond what feels reasonable.

"Unnecessarily large" in this context means that the presence of these very large input files does not add any coverage beyond what is already achieved by the significantly smaller input files already in the corpus.

Corpus directory    Largest coverage-increasing file    Largest input file
addrman             143 118 bytes                       1 048 576 bytes
banman              49 814 bytes                        50 125 bytes
block               1 000 431 bytes                     1 048 576 bytes
prevector           709 301 bytes                       709 301 bytes
process_messages    984 807 bytes                       3 984 182 bytes
script_flags        961 741 bytes                       1 855 780 bytes
transaction         111 109 bytes                       111 872 bytes

Perhaps we should consider removing these excessively large inputs that do not add any coverage at the moment and are unlikely to do so in the future?

Increase timeout or remove valgrind CI job?

Out of all the merged/closed PRs, I think the valgrind job has passed only about five times; it always (afaict) fails due to timing out.

So maybe we should:

  • Increase the timeout? (unless the job would take way too long anyway)
  • Remove the job?
  • Only run on corpora that were changed in a PR?

CI job for verifying coverage increase

We could consider adding a CI job that checks that the coverage for newly added inputs actually goes up (this should be possible with afl-showmap). This would help with review and avoiding bloat in the corpora.

(Fuzz harnesses with low stability could be annoying here)
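A rough sketch of the idea (the binary path, input path, and the exact afl-showmap invocation are assumptions to verify locally against the AFL++ docs):

```shell
#!/bin/sh
# Toy check: record the coverage map a candidate input produces with
# afl-showmap; a CI job could diff this against the corpus baseline.
FUZZ_BIN="${FUZZ_BIN:-./src/test/fuzz/fuzz}"    # hypothetical binary path
NEW_INPUT="${NEW_INPUT:-/tmp/candidate_input}"  # hypothetical new PR input

if command -v afl-showmap >/dev/null 2>&1 && [ -x "$FUZZ_BIN" ]; then
  # -o writes one "edge_id:hit_count" line per covered edge.
  afl-showmap -q -o /tmp/candidate.map -- "$FUZZ_BIN" "$NEW_INPUT"
  echo "candidate input touches $(wc -l < /tmp/candidate.map) edges"
else
  echo "afl-showmap or fuzz binary unavailable; skipping"
fi
```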

Consider pruning corpus to speed up the Bitcoin Core CI fuzzing job?

Should we prune the corpus to get rid of large slow inputs that reach code that can be reached using smaller (and thus quicker-to-process) inputs?

The basic idea is to reduce the total size of the qa-assets inputs while keeping the same achieved coverage.

I think that might speed up the Bitcoin Core CI fuzzing job significantly.

Sounds good? Any opposition? :)

If we agree that it is a good idea I'm willing to do the job :)

Automatic check on PR coverage?

Would it be possible to have an automatic evaluation on PRs that reports how the coverage changes per PR and how many files per target are being added? It could possibly even warn if a PR reduces the coverage for a target.

fuzz_seed_corpus: sub_net_deserialize and address_deserialize don't have any fuzz tests

While rebasing and testing PR #21496, I found that the seed directories sub_net_deserialize and address_deserialize in fuzz_seed_corpus don't have corresponding fuzz targets in src/test/fuzz, and therefore an assertion fails in fuzz.cpp.

Error message:

sub_net_deserialize
fuzz: test/fuzz/fuzz.cpp:70: auto initialize()::(anonymous class)::operator()() const: Assertion "it != FuzzTargets().end()" && check' failed.
Aborted (core dumped)
address_deserialize
fuzz: test/fuzz/fuzz.cpp:70: auto initialize()::(anonymous class)::operator()() const: Assertion "it != FuzzTargets().end()" && check' failed.
Aborted (core dumped)

As a solution to this problem, I was thinking to remove these directories from fuzz_seed_corpus.
@MarcoFalke, is it possible?

unsymbolized MSAN stack traces

See https://github.com/bitcoin-core/qa-assets/runs/20812716141:

+ '[' true = true ']'
+ LD_LIBRARY_PATH=/ci_container_base/depends/x86_64-pc-linux-gnu/lib
+ test/fuzz/test_runner.py -j6 -l DEBUG /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/ --empty_min_time=60
==29805==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55ef079a6060  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1093060)
    #1 0x55ef0718b823  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x878823)
    #2 0x55ef071b9b12  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x8a6b12)
    #3 0x7fd0ab3e9d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: c289da5071a3399de893d2af81d6a30c62646e1e)
    #4 0x7fd0ab3e9e3f  (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: c289da5071a3399de893d2af81d6a30c62646e1e)
    #5 0x55ef0717e834  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x86b834)

  Member fields were destroyed
    #0 0x55ef0724925d  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x93625d)
    #1 0x55ef09320bb9  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x2a0dbb9)
    #2 0x55ef0717e664  (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x86b664)
    #3 0x7fd0ab3e9eba  (/lib/x86_64-linux-gnu/libc.so.6+0x29eba) (BuildId: c289da5071a3399de893d2af81d6a30c62646e1e)

SUMMARY: MemorySanitizer: use-of-uninitialized-value (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1093060) 
Exiting
Traceback (most recent call last):
  File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/fuzz/test_runner.py", line 382, in <module>
    main()
  File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/fuzz/test_runner.py", line 106, in main
    test_list_all = parse_test_list(fuzz_bin=os.path.join(config["environment"]["BUILDDIR"], 'src', 'test', 'fuzz', 'fuzz'))
  File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/fuzz/test_runner.py", line 369, in parse_test_list
    test_list_all = subprocess.run(
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz' returned non-zero exit status 1.

llvm-symbolizer is built and installed into /usr/bin https://cirrus-ci.com/task/5482198250291200?logs=ci#L6085:

 + update-alternatives --install /usr/bin/llvm-symbolizer llvm-symbolizer /msan/clang_build/bin/llvm-symbolizer 100
update-alternatives: using /msan/clang_build/bin/llvm-symbolizer to provide /usr/bin/llvm-symbolizer (llvm-symbolizer) in auto mode

I guess we also need to set MSAN_SYMBOLIZER_PATH.
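For example, something along these lines in the CI environment setup (the symbolizer path is taken from the log above; the exact variable precedence is something to double-check against the sanitizer docs):

```shell
# Point MSAN at the freshly built symbolizer so stack traces get symbolized.
export MSAN_SYMBOLIZER_PATH=/msan/clang_build/bin/llvm-symbolizer
# Equivalent via the generic sanitizer runtime-options mechanism:
export MSAN_OPTIONS="external_symbolizer_path=/msan/clang_build/bin/llvm-symbolizer"
```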

`utxo_total_supply` extremely slow

While I had noticed utxo_total_supply being extremely slow on a merge before, my nightly fuzzer has now been on the utxo_total_supply target for 6h51m, even though the limit on the fuzzer is 3600 seconds. It seems to me that it is either stuck in an endless loop or just extremely slow.


My […]/qa-assets-active-fuzzing/fuzz_seed_corpus/utxo_total_supply directory has 3250 seeds.

It looks to me like it hasn’t even loaded my corpus yet (see attached screenshot).

Here are two of the worst offenders:

slow-unit-ec3af16aeb3b967f9b63edc7266a71f52825c616

echo "egV+iYmJCy56C0HRX/7+cP7+/v////////7+/v7+/v7+/v7+/v78/v7+/v7+/v7+/r7+/v5cXFxcXFxcXFwpKSkpKSkpKf7+/v7+/v7+/v6+/v4pKSkpKSkpKSkpXClcXP7+JiAOWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZsDQ/J55ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWbA1PyeewVkoWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkODg4OWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZsDQ/J55ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWbA1PyeewVkoWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkODg4OWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlQWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVmwND8nnllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZsDU/J57BWShZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZDg4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVmwND8nnllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZsDU/J57BWShZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWV
lZWVlZWVlZWVlZWVlZWVlZWVBZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWf7+/v7+/v7+C0HRX/7+/v7+/v5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWbA0PyeeWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZsDQ/J55ZWVlZWVn+/v7+/v6+WVlZWVlZWVlZWVlZWVlZWVlZWVlZ/v7+/v741y4LQVlZWVlZWVlZWVlZsDQ/J57BWShZWVkODg7RDg==" | base64 -d > slow-unit-ec3af16aeb3b967f9b63edc7266a71f52825c616

/slow-unit-9c08c09c1d630fade4f699fdd1eeaa53325dcce3

echo "JiAOWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZDg4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVkZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWTJZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkODg4OWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlTWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZDg4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkgDllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZGVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkyWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZDg4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVkODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODg5ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVxcXFxcXFxcXFxcXFxcXFxcXFwIvLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8vLy8XFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcvLy8t7y8vLy8XAj19Ts7Ox0mJiYmOzs7JiYmJiYmJiYmJiYmJiYmJiYmJiYmXFxcXFxcXFxcXFxcXFxcXFxcXFxcXFxcXAi8vLy8vLy8vLy8vLy8vLy8vLy8vLxZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWQ4ODllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWV
lZWVlZWVlZWVlZWVlZWVlZWVlZWVmwOD8nnllZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZDg4OWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWbA4PyeeWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVmwNT8nnsFZKFlZWQ4ODg4=" | base64 -d > slow-repro-9c08c09c

Pruning large/slow inputs?

Some targets currently run for more than half an hour, e.g. banman or addrman_deserialize, with their slowest inputs running for a minute or more. Is it worth trying to prune these slow inputs, assuming no, or only a negligible, decrease in coverage, if the runtime can be improved by a decent percentage?

https://cirrus-ci.com/task/6079733960540160?logs=ci#L1068:

Run addrman_deserialize with args ['valgrind', '--quiet', '--error-exitcode=1', '/tmp/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', '/tmp/bitcoin-core/ci/scratch/qa-assets/fuzz_seed_corpus/addrman_deserialize']
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3456924522
INFO: Loaded 1 modules   (243081 inline 8-bit counters): 243081 [0x2736df8, 0x2772381), 
INFO: Loaded 1 PC tables (243081 PCs): 243081 [0x2772388,0x2b27c18), 
INFO:     3096 files found in /tmp/bitcoin-core/ci/scratch/qa-assets/fuzz_seed_corpus/addrman_deserialize
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
INFO: seed corpus: files: 3096 min: 1b max: 1048576b total: 128207363b rss: 251Mb
#64	pulse  cov: 1321 ft: 1729 corp: 31/980b exec/s: 21 rss: 272Mb
#128	pulse  cov: 1323 ft: 1754 corp: 33/1080b exec/s: 18 rss: 273Mb
#256	pulse  cov: 1324 ft: 1755 corp: 34/1133b exec/s: 15 rss: 273Mb
#512	pulse  cov: 1493 ft: 2023 corp: 68/3165b exec/s: 21 rss: 274Mb
#1024	pulse  cov: 2713 ft: 4888 corp: 135/9095b exec/s: 35 rss: 282Mb
#2048	pulse  cov: 2941 ft: 12633 corp: 547/156Kb exec/s: 23 rss: 284Mb
Slowest unit: 12 s:
artifact_prefix='./'; Test unit written to ./slow-unit-fa63ff08d2d73478c8e4dfb1f6d10c314c51bda2
Slowest unit: 20 s:
artifact_prefix='./'; Test unit written to ./slow-unit-2cf6dcdc90d27082d6f06c1bbcf2c9fb617c4665
Slowest unit: 29 s:
artifact_prefix='./'; Test unit written to ./slow-unit-c7043fea9ac0a00b8421f0c66b91141a47e2ce4a
Slowest unit: 77 s:
artifact_prefix='./'; Test unit written to ./slow-unit-c2c4bf148a02aebee1179dbb17b0f1c4ce69d949
#3097	INITED cov: 2960 ft: 15207 corp: 945/20Mb exec/s: 1 rss: 416Mb
#3097	DONE   cov: 2960 ft: 15207 corp: 945/20Mb lim: 760806 exec/s: 1 rss: 416Mb
Done 3097 runs in 2437 second(s)

https://cirrus-ci.com/task/6079733960540160?logs=ci#L943:

Run banman with args ['valgrind', '--quiet', '--error-exitcode=1', '/tmp/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', '/tmp/bitcoin-core/ci/scratch/qa-assets/fuzz_seed_corpus/banman']
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3396028099
INFO: Loaded 1 modules   (243081 inline 8-bit counters): 243081 [0x2736df8, 0x2772381), 
INFO: Loaded 1 PC tables (243081 PCs): 243081 [0x2772388,0x2b27c18), 
INFO:     4783 files found in /tmp/bitcoin-core/ci/scratch/qa-assets/fuzz_seed_corpus/banman
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
INFO: seed corpus: files: 4783 min: 1b max: 4194310b total: 221535216b rss: 254Mb
#128	pulse  cov: 1334 ft: 2275 corp: 46/332b exec/s: 64 rss: 276Mb
#256	pulse  cov: 1409 ft: 2700 corp: 84/735b exec/s: 64 rss: 276Mb
#512	pulse  cov: 2378 ft: 4873 corp: 157/1826b exec/s: 56 rss: 279Mb
#1024	pulse  cov: 2631 ft: 7553 corp: 306/5510b exec/s: 56 rss: 280Mb
#2048	pulse  cov: 2844 ft: 10686 corp: 585/21Kb exec/s: 51 rss: 280Mb
Slowest unit: 10 s:
artifact_prefix='./'; Test unit written to ./slow-unit-f80fda481ebe048a46dc7d56cac745553ac86462
Slowest unit: 12 s:
artifact_prefix='./'; Test unit written to ./slow-unit-3866f35faef1aa8c88ce6a52ba599c96b9773452
Slowest unit: 15 s:
artifact_prefix='./'; Test unit written to ./slow-unit-b4b646db3c8b4309d520a929dd9969cd3d266026
Slowest unit: 17 s:
artifact_prefix='./'; Test unit written to ./slow-unit-92ff80cf45bd551bcace289e5fa514f411b614cf
Slowest unit: 20 s:
artifact_prefix='./'; Test unit written to ./slow-unit-efd82da2184653f7522296ad76c71eb62d226205
Slowest unit: 26 s:
artifact_prefix='./'; Test unit written to ./slow-unit-94cebd976b987de8e532cc562064831761aa3d8d
#4096	pulse  cov: 2896 ft: 17823 corp: 1307/792Kb exec/s: 4 rss: 309Mb
Slowest unit: 32 s:
artifact_prefix='./'; Test unit written to ./slow-unit-e0627c9a132a8619dd91150e67aa5142ef7f1db0
Slowest unit: 43 s:
artifact_prefix='./'; Test unit written to ./slow-unit-79d9d700db64f0f307875e04b72d9c71e3fce554
Slowest unit: 59 s:
artifact_prefix='./'; Test unit written to ./slow-unit-329c339c4dbe81a59cd0167e0ed96c579a71144d
#4784	INITED cov: 2896 ft: 17955 corp: 1346/2709Kb exec/s: 2 rss: 392Mb
#4784	DONE   cov: 2896 ft: 17955 corp: 1346/2709Kb lim: 621890 exec/s: 2 rss: 392Mb
Done 4784 runs in 2100 second(s)

Aren’t we missing out on a lot of reductions?

Hey,

I noticed that I get a bunch of new seeds when I fuzz, but merging my results into the existing corpus only takes the ones that add new features and coverage. The seeds that reduce input size while achieving the same coverage are not adopted into the corpus. Obviously, if I merged both the existing and my new seeds into an empty third directory, they’d be traversed in increasing size order and I would keep the smallest seeds that achieve the same coverage.

Should I:

  1. merge my new seeds into the existing corpus and only upstream new features/coverage?
  2. merge everything into an empty directory, add everything that ends up in that directory to the corpus, and upstream it?

Currently, I’m following approach 1, which keeps the corpus smaller but misses out on the reductions. Following 2 would grow the corpus more quickly, but then when we squash it at the branch-off point we’d keep the best reduced seeds.
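Approach 2 can be sketched like this (paths and the target name are placeholders; libFuzzer's -merge=1 visits inputs in increasing size order, so the smallest seed per coverage feature wins):

```shell
#!/bin/sh
# Sketch of approach 2: merge the old corpus and the new seeds into an
# empty directory, keeping the smallest inputs that preserve coverage.
FUZZ_BIN="${FUZZ_BIN:-./src/test/fuzz/fuzz}"
TARGET="${TARGET:-process_message}"          # hypothetical target name
OLD="qa-assets/fuzz_seed_corpus/$TARGET"     # existing corpus
NEW="/tmp/my_seeds/$TARGET"                  # freshly fuzzed seeds
MERGED="$(mktemp -d)"                        # empty output directory

if [ -x "$FUZZ_BIN" ]; then
  # The first directory receives the merged result; the rest are sources.
  FUZZ="$TARGET" "$FUZZ_BIN" -merge=1 "$MERGED" "$OLD" "$NEW"
else
  echo "fuzz binary not found; $MERGED left empty"
fi
```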

brainstorm: Reducing the size of this repo

This repository is very large (~16 GB at the moment) and I think there are a bunch of things we could do to improve that.

  • Prune the git history; .git is currently at 4 GB. (We don't really need the history, or we could archive it to a separate repo.)
  • Compress the corpora (~6 GB with gzip).
  • Avoid large inputs / keep a separate repo for those.

The biggest downside of the size is that we pull this repo in our CI jobs (oss-fuzz as well), which adds significant overhead.

Maybe we could set up an automated mirror repo that has the compressed corpora and no git history?
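To get a feel for the compression option, a corpus directory can be packed like this (a tiny synthetic corpus stands in for the real ~16 GB one):

```shell
#!/bin/sh
# Runnable toy: pack a synthetic corpus directory into a gzipped tarball,
# roughly how a compressed mirror repo might ship the corpora.
corpus="$(mktemp -d)"
for i in 1 2 3; do
  head -c 4096 /dev/zero > "$corpus/input_$i"   # fake fuzz inputs
done

archive="$corpus.tar.gz"
tar -czf "$archive" -C "$corpus" .

raw=$(du -sk "$corpus" | cut -f1)      # KB on disk, uncompressed
packed=$(du -sk "$archive" | cut -f1)  # KB on disk, compressed
echo "raw: ${raw}KB, packed: ${packed}KB"
```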
