Comments (12)

aakarshg commented on August 19, 2024

@dry923 might be worth the time to investigate the issue when you get back and have some cycles.

aakarshg commented on August 19, 2024

@bengland2 has pointed out that the CI hasn't been rebuilding the workload images (the snafu_ci tag) in over a month, which is evident from the logs. This means the CI has been using old workload images, which could be why the tests have been passing.

aakarshg commented on August 19, 2024

The fix for this, as pointed out by @bengland2, would be to clear out the prior incarnation of the image before pushing the new one. That way, either the test fails or the image used actually reflects the source code that we want to test.
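
For illustration, a minimal sketch of what that could look like in the CI build step, assuming a podman-based build; the image name and Dockerfile path below are placeholders, not the actual repo layout:

```bash
#!/usr/bin/env bash
# Hypothetical CI step: never let a stale snafu_ci image survive a rebuild.
# IMAGE and the Dockerfile path are placeholders for illustration only.
IMAGE="quay.io/example-org/fs-drift-wrapper:snafu_ci"

# Drop any prior local copy so the build cannot silently reuse it.
podman rmi -f "$IMAGE" 2>/dev/null || true

# Rebuild from scratch and push; fail the job if either step fails.
podman build --no-cache -t "$IMAGE" -f fs_drift_wrapper/Dockerfile . || exit 1
podman push "$IMAGE" || exit 1
```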

bengland2 commented on August 19, 2024

Here is a set of changes that I believe fixes most of the problem. It isn't fully tested yet because I have to finish other PRs, but I wanted to make you aware of what I have so far:

https://github.com/bengland2/snafu/tree/ensure-benchmark-updated

I'll submit it as a PR if there is interest, but I was waiting to get Russ's input, etc.
The basic ideas are: 1) centralize as much logic as possible in ci/common.sh, and 2) never assume that a bash command succeeds. While bash "set -e" is a brute-force way to accomplish this, it also prevents you from doing things like retrying a podman push, or using an error status as input to an if conditional.
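
For illustration, a hedged sketch of the kind of helper that approach enables in ci/common.sh; the function name, retry count, and sleep interval are my own choices, not taken from the branch:

```bash
# Hypothetical helper for ci/common.sh: retry a flaky command a fixed number
# of times instead of relying on "set -e" to abort on the first failure.
retry() {
    local attempts=$1; shift
    local i
    for (( i = 1; i <= attempts; i++ )); do
        "$@" && return 0
        echo "attempt $i/$attempts failed: $*" >&2
        sleep 5
    done
    return 1
}

# Example use: retry the push and branch on its exit status explicitly.
# "$IMAGE" is assumed to have been built earlier in the CI script.
if ! retry 3 podman push "$IMAGE"; then
    echo "giving up on podman push" >&2
    exit 1
fi
```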

rsevilla87 commented on August 19, 2024

@bengland2 just merged this PR a few hours ago: #113

bengland2 commented on August 19, 2024

I didn't know whether anyone was working on it; that's why I didn't submit it as a PR. As long as the problem gets solved...

aakarshg commented on August 19, 2024

There are still quite a few issues with the CI. If you look at http://perf-sm5039-4-8.perf.lab.eng.rdu2.redhat.com:8080/job/snafu_jjb/87/consoleFull for fs-drift, you can see that the uuid cannot even be parsed (the reason is that we delete everything before we can look at the uuid), but the CI still reported fs-drift as passed!

dry923 commented on August 19, 2024

There are still quite a few issues with the CI. If you look at http://perf-sm5039-4-8.perf.lab.eng.rdu2.redhat.com:8080/job/snafu_jjb/87/consoleFull for fs-drift, you can see that the uuid cannot even be parsed (the reason is that we delete everything before we can look at the uuid), but the CI still reported fs-drift as passed!

@aakarshg so I found an issue: when the kubectl delete namespace in wait_clean fails because there is no namespace, it falls into the error function. It would still show the test as successful, since this likely happens outside of that loop (there's a wait_clean at the end of run_test.sh, which I think is where this is occurring) and the test would have really succeeded. I'm testing out a simple fix now.
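
A minimal sketch of one way to make the cleanup tolerant of a missing namespace; the function body and namespace name below are illustrative assumptions, not the actual wait_clean code:

```bash
# Hypothetical wait_clean: don't trip the error handler when the namespace
# has already been removed by an earlier cleanup pass.
wait_clean() {
    local ns=${1:-benchmark}   # namespace name is a placeholder
    # --ignore-not-found turns the delete into a no-op if the namespace is gone;
    # --wait blocks until the deletion actually finishes.
    kubectl delete namespace "$ns" --ignore-not-found=true --wait=true
}
```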

bengland2 commented on August 19, 2024

Did this have anything to do with fs-drift at all? I couldn't see it.

aakarshg commented on August 19, 2024

Nothing to do with fs-drift, @bengland2; the benchmark actually works fine. It's the CI script in ripsaw that's also affecting the CI runs in snafu.

rsevilla87 commented on August 19, 2024

The problem is that the fs-drift, fio, and smallfile ripsaw tests deploy two different CRs: one for standard I/O and another one for hostPath I/O. The tests in ripsaw work well, but snafu uses these tests with a different wrapper.

rsevilla87 commented on August 19, 2024

I figured out that wait_clean is being called twice in the I/O tests: once at the end of each test in snafu (https://github.com/cloud-bulldozer/snafu/blob/master/ci/run_ci.sh#L61) and again at the end of the I/O tests (https://github.com/cloud-bulldozer/ripsaw/blob/master/tests/test_fiod.sh#L25).

The two wait_clean functions live in different source files, but both of them delete the namespace, so the second call fails.
I have also noticed that the check_es function does not check whether all of its arguments have been passed, so when, for example, the second argument (the indices to check) is missing, it does not check any index and returns 0.
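
For illustration, a hedged sketch of the kind of argument guard check_es could grow; the parameter names below are assumptions about its interface, not the real signature:

```bash
# Hypothetical guard for check_es: refuse to report success when the index
# list is missing instead of silently checking nothing and returning 0.
check_es() {
    local uuid=$1
    shift
    local indices=("$@")   # remaining arguments are the ES indices to verify
    if [[ -z "$uuid" || ${#indices[@]} -eq 0 ]]; then
        echo "check_es: missing uuid or index list" >&2
        return 1
    fi
    # ... the existing Elasticsearch document checks would follow here ...
}
```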
