Comments (10)

bhufmann avatar bhufmann commented on June 27, 2024

@tahini thanks for opening this tracker. It's an important topic for the TSP specification.

The advantage of sending formatted strings is that the domain-specific logic and responsibility stay in the trace server, and the client can then display the data as is. The definitions of the virtual table and of the tree model used for xy, time-graphs and data-trees currently ask for formatted strings. The disadvantage is that client-side actions that require raw values (for example sorting, timestamp synchronization and others) need to retrieve the raw data somehow. New APIs are needed that allow the client to retrieve the raw data. This could be done by having a query parameter in the APIs to request the data as raw values (e.g. in fetchTree()), or by requesting tooltips with detailed information in additional APIs. For the events table, there is a proposal (see PR #36) to provide tooltip information (properties) including the formatted timestamp as well as the raw timestamp value.

As an alternative, there is your suggestion to provide a data type hint plus a unit for the relevant data. With this, the server can send raw values along with a data type and unit, and the client can then apply the formatting. The advantage is that the client can change the formatting and can apply sorting and filtering on the client side without extra queries to the server (to the extent that client-side actions make sense). The disadvantage is that it's not trivial to define a "good" set of data types that serves most use cases and has "good" default formatters in the client. Also, each client has to implement its own formatting. Maybe the server could provide the format string and the client would just apply it. But looking at e.g. DecimalFormat or DateFormat, that seems quite challenging to handle in the client.

I tried to implement data type support for the trees with columns (e.g. data tree, time graph tree, xy tree) and had some challenges when implementing formatters that would fit various use cases. For example, in a table with multiple columns of decimal numbers, where each column has a different value range, what should be the default decimal format in the client? Also, I had a case where a data type was needed per table cell and not just per column.

I think it's important that we have this discussion and decide which route to go. The first alternative initially seems a better solution to get started and get something implemented quickly, but I wonder if alternative 2 is better in the long run.

Any thoughts?

from trace-server-protocol.

MatthewKhouzam avatar MatthewKhouzam commented on June 27, 2024

Hi, as the instigator of PR36, I would like to give some thoughts...

Let's look at a timestamp example, as this seems to cover all the cases.

examples:

  • Timestamp 150000000000 (raw)
  • Timestamp 2018/05/02 14:23:33.0002231234 (formatted)
  • Timestamp <a href="timestamp://150000000000">2018/05/02 14:23:33.0002231234</a> (both)

What do you think of option 3?
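
To make option 3 concrete, here is a minimal sketch of how a client might split such a cell back into its raw and formatted parts. It assumes the cell is delivered as an HTML anchor exactly like the example above; `TimestampLinkParser` and `split_cell` are hypothetical names for illustration and are not part of the TSP.

```python
from html.parser import HTMLParser


class TimestampLinkParser(HTMLParser):
    """Collects (raw, formatted) pairs from <a href="timestamp://..."> elements."""

    SCHEME = "timestamp://"  # scheme taken from the example in the thread

    def __init__(self):
        super().__init__()
        self.pairs = []    # list of (raw_value, formatted_text)
        self._raw = None   # raw value of the anchor currently open, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.startswith(self.SCHEME):
            self._raw = int(href[len(self.SCHEME):])
            self._text = []

    def handle_data(self, data):
        if self._raw is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._raw is not None:
            self.pairs.append((self._raw, "".join(self._text)))
            self._raw = None


def split_cell(cell: str):
    """Return (raw, formatted) for an option-3 cell, or (None, cell) for plain text."""
    parser = TimestampLinkParser()
    parser.feed(cell)
    parser.close()
    return parser.pairs[0] if parser.pairs else (None, cell)
```

A client that only wants to display the value can ignore the first element of the pair, while sorting or view synchronization can use the integer.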

MatthewKhouzam avatar MatthewKhouzam commented on June 27, 2024

Some clarifications for the hyperlink: we can look into HATEOAS.

Also, for the link, let's see possibilities:

  • <a href="timestamp://150000000000">2018/05/02 14:23:33.0002231234</a>
  • <a href="tsp://?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>
  • <a href="tsp://127.0.0.1:8080:/?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>
  • <a href="tsp://127.0.0.1:8080:/path-to-endpoint/?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>

Which level of detail is interesting? Also, how can we make it clear that the data is for other views to update, like an outputelement in Eclipse's UI?
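
Whichever level of detail is chosen, the raw value can be recovered with standard URL parsing. The sketch below is one possible client-side helper, assuming the link shapes listed above; `raw_timestamp` is a hypothetical name, and the `tsp` scheme is only a proposal in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def raw_timestamp(href: str) -> int:
    """Extract the raw timestamp from any of the href styles listed above."""
    parts = urlsplit(href)
    if parts.scheme == "timestamp":
        # timestamp://150000000000 -> the value sits where a host would normally be
        return int(parts.netloc)
    # tsp://...?timestamp=150000000000 -> the value is a query parameter
    values = parse_qs(parts.query).get("timestamp")
    if not values:
        raise ValueError(f"no timestamp in {href!r}")
    return int(values[0])
```

The host, port and path are ignored here, so the more detailed variants cost the client nothing extra to parse. They would only matter if the link were also meant to be followed as an endpoint.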

tahini avatar tahini commented on June 27, 2024

@MatthewKhouzam for the hyperlinks, those would be generated by the server, right?

You both bring up very valid points about the data, which brings even more questions. Like who is expected to do the sorting. Lea did sorting of columns in the filter-table-tree (left part of XY charts and timegraphs). And the sort was done client-side, with whatever value was received. Which is where this issue originated. But if we say all sorting is meant to be done server-side, then we can very well send only formatted strings.

But does it make sense to always require going server side (with all the overhead it implies), just to sort what is typically a couple hundred entries?

But we sometimes have more than a couple hundred entries with virtual tables, in which case it would make sense to sort server side.

I think the idea of having the possibility, via a query parameter, to request either raw or formatted data is interesting, as it would provide a good formatted default for the simplest consuming UI, while still giving room for creative ways for the client to consume the data.

So we could update the protocol so that headers are not simple strings but objects with a name, tooltip, dataType and unit (same as the Axis descriptor), and say that the data comes formatted by default as per its data type.
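
As a rough illustration of that header change, a client-side model could accept both the current plain-string headers and the proposed object form. This is only a sketch: the field names follow the comment above, while `ColumnHeader`, `header_from_json` and the `"STRING"` default are hypothetical and not taken from the TSP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnHeader:
    name: str
    tooltip: str = ""
    data_type: str = "STRING"   # hypothetical default: data arrives formatted
    unit: Optional[str] = None


def header_from_json(value) -> ColumnHeader:
    """Accept a legacy plain-string header or the proposed object form."""
    if isinstance(value, str):
        return ColumnHeader(name=value)
    return ColumnHeader(
        name=value["name"],
        tooltip=value.get("tooltip", ""),
        data_type=value.get("dataType", "STRING"),
        unit=value.get("unit"),
    )
```

Accepting both shapes would let a client keep working against servers that still send simple strings.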

Then as a second step, we could decide on how best to give the possibility to retrieve raw data, either as additional endpoints as Matthew suggests or as additional parameter in the query.

bhufmann avatar bhufmann commented on June 27, 2024

> @MatthewKhouzam for the hyperlinks, those would be generated by the server, right?
>
> You both bring up very valid points about the data, which brings even more questions. Like who is expected to do the sorting. Lea did sorting of columns in the filter-table-tree (left part of XY charts and timegraphs). And the sort was done client-side, with whatever value was received. Which is where this issue originated. But if we say all sorting is meant to be done server-side, then we can very well send only formatted strings.
>
> But does it make sense to always require going server side (with all the overhead it implies), just to sort what is typically a couple hundred entries?
>
> But we sometimes have more than a couple hundred entries with virtual tables, in which case it would make sense to sort server side.

For sorting client side, the client has to have sufficient information to do it without querying the server. First of all, the client would need to have all the entries to sort. Secondly, the client needs to know how to sort. What I mean is that sorting string values is different from sorting numbers. Sometimes the value to sort is a string but needs to be sorted differently. Consider sorting a list with "CPU0, CPU1 ... CPU16": it would be expected to sort on the CPU number, not on the string. The next question is, how does the client know when it can sort client side and when it has to query the server? That would be part of some kind of descriptor (e.g. a column descriptor).
With filtering we have similar challenges.
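
The CPU example can be sketched with a "natural" sort key that compares digit runs as numbers. This is one common technique, not something the TSP prescribes; `natural_key` is a hypothetical helper name.

```python
import re

_DIGITS = re.compile(r"(\d+)")


def natural_key(label: str):
    """Split 'CPU16' into ('cpu', 16, '') so digit runs compare as numbers."""
    parts = _DIGITS.split(label)
    # re.split with a capturing group puts the digit runs at odd indices
    return tuple(int(p) if i % 2 else p.lower() for i, p in enumerate(parts))


labels = ["CPU10", "CPU2", "CPU0", "CPU16", "CPU1"]
print(sorted(labels))                   # ['CPU0', 'CPU1', 'CPU10', 'CPU16', 'CPU2']
print(sorted(labels, key=natural_key))  # ['CPU0', 'CPU1', 'CPU2', 'CPU10', 'CPU16']
```

A client can only apply such a key if it knows the column calls for it, which is the kind of information a column descriptor would have to carry.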

> I think the idea of having the possibility, via a query parameter, to request either raw or formatted data is interesting, as it would provide a good formatted default for the simplest consuming UI, while still giving room for creative ways for the client to consume the data.

I agree. I think working with only formatted strings or only raw data is not sufficient. We probably need both. For timestamps, we need both for sure.

> So we could update the protocol so that headers are not simple strings but objects with a name, tooltip, dataType and unit (same as the Axis descriptor), and say that the data comes formatted by default as per its data type.

So, when sending formatted data, I guess the dataType would be string. For raw data, then there are different types depending on the data.

> Then as a second step, we could decide on how best to give the possibility to retrieve raw data, either as additional endpoints as Matthew suggests or as additional parameter in the query.

Using the same endpoint would have the advantage that views can display the raw data instead, or the whole data structure could be processed further in the client (e.g. a Python client). Additional endpoints provide a more directed request, i.e. give the raw values of this entry.

bhufmann avatar bhufmann commented on June 27, 2024

> Some clarifications for the hyperlink: we can look into HATEOAS.
>
> Also, for the link, let's see possibilities:
>
> * `<a href="timestamp://150000000000">2018/05/02 14:23:33.0002231234</a>`
> * `<a href="tsp://?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>`
> * `<a href="tsp://127.0.0.1:8080:/?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>`
> * `<a href="tsp://127.0.0.1:8080:/path-to-endpoint/?timestamp=150000000000">2018/05/02 14:23:33.0002231234</a>`
>
> Which level of detail is interesting? Also, how can we make it clear that the data is for other views to update, like an outputelement in Eclipse's UI?

HATEOAS is intended for the server advertising other endpoints or actions to the client that the client can use for further features. This might be interesting for some other features in our application.

Using a hyperlink to transport timestamp information (raw and formatted value) is meant for passing data from the server to the client and within the client, not for providing an endpoint that the client will use to query the server for more information. If we go that way, this would have to be specified in the TSP so that clients and servers know about it and implement it accordingly.

MatthewKhouzam avatar MatthewKhouzam commented on June 27, 2024

Follow up: the core issue:

Passing data from the trace server to the trace client

Problem:
Views cannot synchronize with each other or trace server
They need a unique key to do so
How do we convey data from the server to the client. Do we provide it in “raw” or “formatted”?
What implications do we have? Do we pass formatters? Do we get raw values? See GH issue 35 for TSP
How to enable Critical path (Good use case to test on)
Any view can react on any key of any view.
How can we get the “keys”? Defining “aspects” of a trace that are unique.

Ways forward:

  1. Raw data sent as well as formatted
     • Heavy on networks
     • Heavy on client-side memory
     • Trivial in client compute
  2. Client formats data
     • Efficient in terms of network
     • Need metadata for inter-view interactions
     • Potentially inconsistent across implementations
     • May fragment client ecosystem
     • In memory sort ()
  3. Server formats data
     • Virtual sort ()
       a. Client has raw, queries for formatted <- not inconsistent with #2
       b. Client has formatted, queries for raw
     • Less client-side compute
     • Less separation of concerns
     • One more network hop to get data

Internal nanosecond storage https://www.npmjs.com/package/timestamp-nano

Work for #2

  • Send metadata
  • Have client format
  • Update events table
  • Update Statistics
  • TSP: Send nanoseconds
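
To illustrate the "send nanoseconds, have client format" work items, here is a minimal sketch of client-side formatting that keeps the value as an integer throughout. It assumes the raw value is integer nanoseconds since the Unix epoch in UTC, which the thread does not state; `format_nanos` is a hypothetical helper, and the output pattern simply mirrors the formatted example earlier in the discussion.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def format_nanos(ns: int) -> str:
    """Format integer nanoseconds since the Unix epoch (assumed UTC) with 9 fractional digits."""
    # Integer division keeps full precision; converting ns to a float first would not.
    seconds, nanos = divmod(ns, NANOS_PER_SECOND)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole.strftime("%Y/%m/%d %H:%M:%S") + f".{nanos:09d}"


print(format_nanos(150000000000))  # 1970/01/01 00:02:30.000000000
```

The integer split matters because present-day epoch nanosecond values are larger than 2^53, so a 64-bit float (and therefore a plain JavaScript number) cannot represent each of them exactly. That is one reason each client language needs its own care here.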

Work for #3

  • Support lookup endpoints
  • Lookup from there

If we send only “raw” to the client:

Client-side formatting: a warning.
JS != Python != Java

So, #2 makes more sense architecturally but is more work today, less tomorrow, let's do #2!

tahini avatar tahini commented on June 27, 2024

I'll start working on this next week

bhufmann avatar bhufmann commented on June 27, 2024

> I'll start working on this next week

@tahini, thanks for letting us know. If you'd like to have some discussions, please let me know. I've looked at it before, and recently I was scratching my head to see how we can do similar things in the events table. In the end we'd like to have a way to transport data between server and client, and then be able to correlate the data in the client between different views (data providers).

tahini avatar tahini commented on June 27, 2024

@bhufmann I just did a PR for column data types (including the event table). I could try to make a quick prototype of this new API to see if it solves whatever you were scratching your head about. What was the exact issue? Is it about the timestamp? Or something else?
