Comments (2)

jtaleric commented on August 19, 2024

Hi! I'm using this space to write about my current progress, updates, and plans. The biggest piece right now is the addition of "collectors" to the benchmark wrapper:

COLLECTORS

  • Introduces the ability to run data collection tools alongside benchmark runs

SNAFU/benchmark-wrapper is meant to run benchmarks, not to orchestrate outside tools.

  • For example, running pbench data collection alongside a fio or uperf benchmark

I think if we want pbench collection in a K8s deployment, we should orchestrate the pbench agents via Ripsaw/Benchmark-Operator.

  • Structure:

    • Added --collector and --collector-config options to run_snafu (as well as --upload)

    • An abstract collector class specifies the need for a config file, as well as the following methods (see the sketch after this list):

      • set_config_vars(config): Parses the config file, sets any specified vars/defaults, and verifies that everything that needs to be in the file is present.
      • startup(): Performs environment setup, initializes necessary processes, and begins any persistent data collection processes (for continuous data collection over the course of the benchmark).
      • start_sample(): Starts a data collection sample (for sample-based data collection tools, synced with benchmark samples). Returns the sample archive dir.
      • stop_sample(): Stops a data collection sample.
      • shutdown(): Stops persistent data collection and other processes, performs cleanup and any desired post-processing.
      • upload(): Uploads a collector's archives using a specified procedure to a desired location.
    • The necessary collector_factory, _load_collectors, and __init__ files have also been added, along with the corresponding changes to the registry

    • In run_snafu, if a collector has been specified along with a config, it will create/initialize the desired collector class and run its startup tasks. Then, before each benchmark sample, it will start a data collection sample, which is stopped once the benchmark sample ends. After the benchmark is complete, the collector shutdown/post-processing tasks are run.
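
A minimal sketch of what the abstract collector class described above might look like (the class name, constructor, and import paths are assumptions for illustration, not the actual benchmark-wrapper code):

    # Hypothetical sketch of the abstract collector interface; only the six
    # methods listed above come from the design, everything else is illustrative.
    from abc import ABC, abstractmethod

    class Collector(ABC):
        """Runs a data collection tool alongside a benchmark run."""

        def __init__(self, config_file):
            self.config_file = config_file
            self.set_config_vars(config_file)

        @abstractmethod
        def set_config_vars(self, config):
            """Parse the config file, apply defaults, and validate required entries."""

        @abstractmethod
        def startup(self):
            """Set up the environment and start persistent collection processes."""

        @abstractmethod
        def start_sample(self):
            """Start one collection sample and return the sample archive dir."""

        @abstractmethod
        def stop_sample(self):
            """Stop the current collection sample."""

        @abstractmethod
        def shutdown(self):
            """Stop persistent collection, clean up, and post-process."""

        @abstractmethod
        def upload(self):
            """Upload this collector's archives to the configured location."""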

Most of this functionality could be implemented in the Operator.

  • The upload option:

    • A user can also run snafu with the --upload option while still including collector + config
    • This will then upload the created archives for a specific collector to a location specified/authorized through a combination of the upload() method and the config file.
    • For example, this option is used for moving pbench archives to the pbench server, and auth info/locations can be specified through the config file

Should the pbench agents simply upload the data to the pbench-server or pbench-controller after collection is complete?

  • Adding a Collector:

    • A really simple process
    • Just create a collector_name dir under the main collector dir
    • Add an init file, a sample config, and a collector_name.py file that defines a Collector_name class based on the abstract collector (a minimal sketch follows this list).
    • Robert Krawitz is also currently interested in implementing his prom_extract tool as another collector
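
As a rough illustration of that layout (the directory names, import path, and "mytool" collector below are hypothetical, not part of the repo):

    # collector/mytool/mytool.py -- hypothetical new collector
    # (assumes the abstract Collector base class sketched earlier)
    from collector import Collector  # import path is an assumption

    class Mytool(Collector):
        """Minimal collector wrapping a fictional 'mytool' agent."""

        def set_config_vars(self, config):
            # Parse the sample config shipped alongside this file, set defaults,
            # and validate required entries.
            self.archive_dir = "/tmp/mytool-archives"

        def startup(self):
            # Start any persistent 'mytool' processes here.
            pass

        def start_sample(self):
            # Begin one collection sample and return its archive dir.
            return self.archive_dir

        def stop_sample(self):
            # Stop the current collection sample.
            pass

        def shutdown(self):
            # Stop persistent collection and run any post-processing.
            pass

        def upload(self):
            # Move archives to the location specified in the config file.
            pass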

I discussed this with Robert and the CNV team. Ripsaw/Benchmark-Operator already has a task to collect system metrics and send them to ES: https://github.com/cloud-bulldozer/benchmark-operator/blob/master/roles/system-metrics/tasks/main.yml

The role here could be used for the PBench collection work.

PBENCH

  • Note that all of this structure has already been written, and pbench has been fully implemented using it (the upload option still needs a bit of work).
  • To see all of this work, go to: my pbench integration PR
  • This has all been built off of upstream branch: learnitall/feature-extend-oo-structure
  • I also have a small demo in the latest tools meeting recording, and can do another whenever desired

ISSUES/TO-DO

  • Add support to specify/run multiple collectors for a given benchmark run
  • SAMPLE ISSUES (described in more detail below)
  • How to handle Dockerfiles so that images contain both the necessary collector bits and the benchmark bits

SAMPLE ISSUE

  • Currently there is no universal notion of a sample; samples are defined within each benchmark, with every benchmark having its own for-loop that iterates through the number of samples and yields results.

  • No way to sync collector samples with benchmark samples (or do any other work revolving around benchmark samples) without hard-coding hacks into each individual benchmark.

  • Instead, I would like to request that the new benchmark rewrite fix this problem by abstracting the main sample loop and introducing a universal definition of a sample:

    • Basically, instead of having a collect() method in the benchmarks that runs a for-loop over the number of samples and yields one sample result per iteration, simply have a collect_sample() method that collects one sample of the benchmark (essentially the same code minus the for-loop), and have that called in the new sample loop in run_snafu at each iteration instead (a fuller sketch follows this list).

    • That way, in the main sample loop, I could start a collector sample, run a benchmark sample, then stop the collector sample, and then the rest of the benchmark sample processing/indexing would stay the same for each iteration.

    • PSEUDO-CODE:

      • OLD:
        for data in benchmark.run():
            # (current indexing stuff)
        # ...and inside .run() there is a `for sample in range(samples):` loop
        # wrapping the code that collects each sample
      • NEW:
        for sample in range(samples):
            collector.start_sample()
            # collect_sample() now contains only the code to collect one sample,
            # with no `for sample in range(samples):` loop of its own
            data = benchmark.collect_sample()
            collector.stop_sample()
            # (current indexing stuff)
    • Would help a lot with current/future work, and also would likely help with unit testing and whatnot.

    • This is more an example of a solution than the "one perfect solution", so I'm sure people may think of better ones; I'm just looking to add this level of abstraction to allow for proper integration of things like collectors, as well as new features in the future.

    • Currently, what collectors are ACTUALLY doing is just running one collection sample over the course of the whole benchmark run, because that's all that is currently possible.
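
A slightly fuller sketch of this proposal, using hypothetical class and function names (the real benchmark-wrapper interfaces may differ):

    # Hypothetical before/after illustration of the proposed sample abstraction.

    # OLD style: each benchmark owns its own sample loop.
    class OldBenchmark:
        def __init__(self, samples):
            self.samples = samples

        def run(self):
            for sample in range(self.samples):
                # ...code that collects one sample...
                yield {"sample": sample, "result": None}

    # NEW style: the benchmark only knows how to collect a single sample.
    class NewBenchmark:
        def collect_sample(self):
            # ...same collection code, minus the for-loop...
            return {"result": None}

    # run_snafu would then own a single, universal sample loop.
    def run_samples(benchmark, collector, samples):
        for sample in range(samples):
            collector.start_sample()
            data = benchmark.collect_sample()
            collector.stop_sample()
            # ...current indexing/processing of `data` stays the same...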

@learnitall
@dry923
@rsevilla87
@mohit-sheth
@whitleykeith ;)

Any additional thoughts?


whitleykeith commented on August 19, 2024

Hi! I'm using this space to write about my current progress, updates, and plans. The biggest piece right now is the addition of "collectors" to the benchmark wrapper:

COLLECTORS

Ultimately I'm in agreement with @jtaleric on this. The benchmark-wrapper shouldn't be responsible for integrating with other in-process tools. I'm not against maybe supporting a different output format/target, but the data collection shouldn't be tied at the process level.

I think it makes sense that any "collector" would be done in a sidecar, similar to how Jaeger works: https://github.com/jaegertracing/jaeger-kubernetes

I'm not even sure I want prom_extract to be a feature of SNAFU, and it definitely shouldn't be a collector. We have kube-burner for that.

SAMPLE ISSUE

  • Currently there is no universal notion of a sample; samples are defined within each benchmark, with every benchmark having its own for-loop that iterates through the number of samples and yields results.

  • No way to sync collector samples with benchmark samples (or do any other work revolving around benchmark samples) without hard-coding hacks into each individual benchmark.

  • Instead, I would like to request that the new benchmark rewrite fix this problem by abstracting the main sample loop and introducing a universal definition of a sample:

    • Basically, instead of having a collect() method in the benchmarks that runs a for-loop over the number of samples and yields one sample result per iteration, simply have a collect_sample() method that collects one sample of the benchmark (essentially the same code minus the for-loop), and have that called in the new sample loop in run_snafu at each iteration instead.

    • That way, in the main sample loop, I could start a collector sample, run a benchmark sample, then stop the collector sample, and then the rest of the benchmark sample processing/indexing would stay the same for each iteration.

    • PSEUDO-CODE:

      • OLD:
        for data in benchmark.run():
            # (current indexing stuff)
        # ...and inside .run() there is a `for sample in range(samples):` loop
        # wrapping the code that collects each sample
      • NEW:
        for sample in range(samples):
            collector.start_sample()
            # collect_sample() now contains only the code to collect one sample,
            # with no `for sample in range(samples):` loop of its own
            data = benchmark.collect_sample()
            collector.stop_sample()
            # (current indexing stuff)
    • Would help a lot with current/future work, and also would likely help with unit testing and whatnot.

    • This is more an example of a solution than the "one perfect solution", so I'm sure people may think of better ones; I'm just looking to add this level of abstraction to allow for proper integration of things like collectors, as well as new features in the future.

    • Currently, what collectors are ACTUALLY doing is just running one collection sample over the course of the whole benchmark run, because that's all that is currently possible.

I completely agree that there should be one universal implementation for a "sample" in SNAFU, and I think your example makes perfect sense (minus wrapping it with a collector 😄 )

