Comments (4)

mjcarroll commented on June 24, 2024

> So I'm guessing you're talking about an alternative to step 2.ii, and not talking about storing the events themselves?

Yes, mostly talking about an alternative to 2.ii in this outline.

Depending on the outcome of #22, there may be potential to collapse 2.i and 2.ii into a single step, for example if ctf -> <intermediate format> turns out to be wildly more efficient than ctf -> pickled dict -> intermediate format while retaining all the same information. I don't see this as a high priority, though.

christophebedard commented on June 24, 2024

@mjcarroll I would just like to clarify a few things:

  1. tracetools_read (in this repository)
    i. Currently uses the babeltrace Python bindings to read a CTF trace from disk and return a list of events as Python dictionaries; it doesn't do anything else.
  2. tracetools_analysis (in ros-tracing/tracetools_analysis)
    i. Reads events from a CTF trace using tracetools_read and writes the dictionaries to a file (pickle). Reading from a pickle file is quicker than reading the CTF trace through babeltrace, so we only read the actual CTF trace once and then just read the pickle file afterwards. See tracetools_analysis/process.py's process() function or the load_file() function, which is usually what's used in Jupyter notebooks, as you probably know.
    ii. Processes events one by one and writes some data to pandas DataFrames. See tracetools_analysis/processor/ros2.py and tracetools_analysis/data_model/ros2.py, respectively. A single row in a DataFrame roughly corresponds to a single trace event, but at this point the trace events are abstracted away.
      a. To improve performance, it actually first writes data to normal Python lists, and then converts these lists to DataFrames once all trace events have been processed; appending to a Python list is much faster than appending to a DataFrame. (There is a sketch of this pattern right after this list.)
    iii. Then some functions are written to compare/merge/etc. DataFrames to extract high-level information. See files under tracetools_analysis/utils/.
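To make 2.i and 2.ii.a concrete, here is a rough sketch of that pipeline. read_ctf_events is a hypothetical stand-in for tracetools_read, and the _timestamp/_name dictionary keys are assumptions about the event dict layout, not the actual tracetools_analysis API:

```python
import pickle

import pandas as pd


def read_ctf_events(trace_path: str) -> list:
    # Hypothetical stand-in for tracetools_read's babeltrace-based reader;
    # real code would return one dict per trace event.
    return [{'_name': 'ros2:callback_start', '_timestamp': 0}]


def load_events(trace_path: str, cache_path: str) -> list:
    """Step 2.i: read the CTF trace once, then reuse a pickle cache."""
    try:
        with open(cache_path, 'rb') as f:
            return pickle.load(f)  # fast path: cached event dicts
    except FileNotFoundError:
        events = read_ctf_events(trace_path)  # slow path: babeltrace
        with open(cache_path, 'wb') as f:
            pickle.dump(events, f)
        return events


def events_to_dataframe(events: list) -> pd.DataFrame:
    """Step 2.ii.a: accumulate rows in plain lists, convert once at the end."""
    rows = []
    for event in events:
        # Key names are assumptions about the event dict layout.
        rows.append({'timestamp': event['_timestamp'], 'name': event['_name']})
    # A single conversion at the end is much cheaper than appending
    # row-by-row to a DataFrame inside the loop.
    return pd.DataFrame(rows)
```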

> The idea would be to read raw CTF traces into some intermediate time-series data that is well suited for analysis tasks. Further high-level APIs could be built to ingest the intermediate data.

So I'm guessing you're talking about an alternative to step 2.ii, and not talking about storing the events themselves?

Then, in parallel, we can change steps 1.i/2.i, which is more closely related to #22.

iluetkeb commented on June 24, 2024

Coming at this from the usage end, we have two different kinds of information in the CTF trace:

  1. meta-data, essentially mapping names of functions and endpoints to thread IDs and memory addresses
  2. activity data, that is, callbacks being called, messages being sent/received, etc.

Meta-data is emitted first, but due to things like the lifecycle, system modes, and more complex launch scenarios, the entire trace file has to be scanned to be sure to get everything. We usually need all meta-data for later association. For reasons of efficiency and storage size, I am assuming that we want to store meta-data separately during later stages as well; note, however, that we never measured the advantage of this, and due to things like category tables, merged storage might actually be comparable.
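As a rough illustration of that split: which events count as meta-data is an assumption here, and the tracepoint names below are just a couple of examples from ros2_tracing, not an exhaustive list.

```python
import pandas as pd

# Example classification rule; the real set of meta-data events would be larger.
METADATA_EVENTS = {'ros2:rcl_node_init', 'ros2:rcl_publisher_init'}


def split_events(events: list):
    """One full pass over the trace, so late meta-data (lifecycle
    transitions, mode changes, staged launches) is still picked up."""
    metadata, activity = [], []
    for event in events:
        (metadata if event['_name'] in METADATA_EVENTS else activity).append(event)
    # Stored separately, as suggested above; whether that actually beats
    # merged storage was never measured.
    return pd.DataFrame(metadata), pd.DataFrame(activity)
```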

In contrast, for activity data, it is often sufficient, and quite often very useful, to process just parts of it, usually temporal chunks. For example, for performance analysis we usually need to differentiate at least between phases where the system is starting up, idle, active, or shutting down. Many systems also frequently switch between active and idle.

Last but not least, memory-wise it can be necessary to load data only partially.
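Selecting such a temporal chunk from an activity DataFrame is then a one-liner (the timestamp column name is an assumption):

```python
import pandas as pd


def select_window(activity: pd.DataFrame, start_ns: int, end_ns: int) -> pd.DataFrame:
    """Keep only activity events whose timestamp falls in [start_ns, end_ns)."""
    mask = (activity['timestamp'] >= start_ns) & (activity['timestamp'] < end_ns)
    return activity[mask]
```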

I think it doesn't matter very much in practice whether we store the data before or after it has been converted into a pandas DataFrame, assuming we use one of the several storage formats that can easily be written from and loaded into pandas DataFrames (like those from Apache Arrow).
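For example, pandas can round-trip DataFrames through Arrow-backed formats directly (this requires pyarrow to be installed; the file names are arbitrary):

```python
import pandas as pd

df = pd.DataFrame({'timestamp': [1, 2, 3], 'name': ['a', 'b', 'c']})

# Feather (Arrow IPC): write from / load into a DataFrame directly.
df.to_feather('activity.feather')
df2 = pd.read_feather('activity.feather')

# Parquet also allows loading only selected columns, which helps with
# the partial-loading concern above.
df.to_parquet('activity.parquet')
timestamps_only = pd.read_parquet('activity.parquet', columns=['timestamp'])
```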

mjcarroll commented on June 24, 2024

> Meta-data is emitted first, but due to things like the lifecycle, system modes, and more complex launch scenarios, the entire trace file has to be scanned to be sure to get everything.

I was hoping, but could not find evidence, that babeltrace2 would let us filter on event type/name, so that we could first iterate over all the metadata before doing filtered views of the longer-running event data.
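I'm not aware of a built-in name-based filter component either, but the bt2 Python bindings at least expose event names while iterating, so the filtering can happen on the consumer side. A sketch, assuming babeltrace2's bt2 package (the message class name varies slightly between bt2 versions, and the trace path and name prefix are placeholders):

```python
import bt2

# Iterate over all messages and keep only meta-data-like events by name.
# The filtering happens here in Python, not inside a babeltrace2 filter
# component, so the whole trace is still read once.
msg_it = bt2.TraceCollectionMessageIterator('/path/to/ctf/trace')
metadata_events = []
for msg in msg_it:
    if type(msg) is bt2._EventMessageConst and msg.event.name.startswith('ros2:rcl_'):
        metadata_events.append({
            'name': msg.event.name,
            'timestamp': msg.default_clock_snapshot.ns_from_origin,
        })
```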
