Git Product home page Git Product logo

Comments (9)

hendrikmuhs avatar hendrikmuhs commented on July 21, 2024 1

We have discussed the underlying issue in the backend team:

The transform API parses/stores/serializes the configuration in question in a way that preserves order, this works if the REST API is used directly. This is out of the specification and inconsistent to other parts of the stack. The reason for this feature is performance. Starting with 7.15 transform has some builtin optimizations for improving performance by optimizing the group by query. That's one of the reasons why we've decided to keep the API the way it is and not change/enhance the syntax (make it order-aware or add sort criteria). We will rather spend more time on improving the optimizations. As a result we will adjust the documentation accordingly. If you are using REST it will still preserve order, but that's considered an implementation detail.

That's for the technical aspect of it. From a usability perspective I still think that consistent order helps UX. But that's not my decision.

Thanks for your time, I will close the corresponding HLRC issue.

from elasticsearch-java.

swallez avatar swallez commented on July 21, 2024

Java indeed has LinkedHashMap and its overhead is pretty minimal. However, having insertion order preservation as an implicit contract is IMHO a problem:

  • the JSON spec states that "An object is an unordered collection of zero or more name/value pairs". So this is in fact a bug in that API definition: it should have used an array if order matters (as is done for sort criteria).
  • we have a number of client libraries where the language does not have an insertion order preserving map in its standard library: this is the case of Go, C# and Rust.

We can introduce a workaround in the Java client for this specific API (by adding an annotation on the API spec that will trigger the workaround in the code generator), but we should not assume that all client libraries will be able to implement it.

from elasticsearch-java.

swallez avatar swallez commented on July 21, 2024

@hendrikmuhs after discussing this with the larger Clients Team and careful evaluation, we've decided to not implement the change you suggest.

The problem, as I mentioned previously, is that some languages have no builtin support for order-preserving maps, while others have (even as a default, like Python or Ruby). Yet all languages are compliant with the JSON spec, which is the semantic model on which the Elasticsearch API is built.

So implementing this in the Java client, even if simple and innocuous from a technical perspective, will not solve the issue for the entire range or client libraries, maintained either by Elastic or by 3rd parties.

If this lack of ordering is a blocking issue, then this API should be updated to a format that can capture order. The example of sort criteria comes to mind, which are defined as an array of objects containing a single key.

Feel free to reach out to the Clients Team if you want to discuss this further.

from elasticsearch-java.

hendrikmuhs avatar hendrikmuhs commented on July 21, 2024
  • the JSON spec states that "An object is an unordered collection of zero or more name/value pairs". So this is in fact a bug in that API definition: it should have used an array if order matters (as is done for sort criteria).

That's correct w.r.t. the JSON spec, I already mentioned that in the beginning. This is fine for clients that take JSON or a dictionary, but not for the java client:

A java client user uses a builder, he does not know and does not care whether JSON is used behind the scenes. This is an implementation detail.

I agree that other client languages require their own assessment, I can not speak for all, but e.g. the python client has a totally different interface. There I could use an ordered dict if I want to preserve order.

So implementing this in the Java client, even if simple and innocuous from a technical perspective, will not solve the issue for the entire range or client libraries, maintained either by Elastic or by 3rd parties.

Yes, I agree. I opened this issue for the java client, but not for other languages.

If this lack of ordering is a blocking issue, then this API should be updated to a format that can capture order.

I think that's the long term way to solve this, the transform API should be changed to optionally take an array of objects:

[{"key1": {...} }, {"key2": {...} }, {"key3": {...} }]

instead of:

"key1": {...}, "key2": {...}, "key3": {...}

I will discuss this with the backend team.

Note that this does not change the way a user interacts with the java client. He would still call the builder, the only difference is the way the client calls the backend afterwards.

Feel free to reach out to the Clients Team if you want to discuss this further.

I will, after some consultation with the backend team.

from elasticsearch-java.

technige avatar technige commented on July 21, 2024

Hi @hendrikmuhs

Respectfully, I beg to differ on JSON being an implementation detail. From the sole point of view of the Java client user, this may be true, but the full public product surface offered by Elastic includes not only the API offered by the clients, but also the HTTP/JSON interface itself, which can be used directly.

It is everyone's general interest to ensure that the user experience is consistent across all parts of the product surface and divergent semantics (X can be considered ordered if used via method 1, but not via method 2) increases both maintenance costs and usage complexity.

So ultimately, we cannot look at at the Java client in isolation; it has to be designed and built in line with all the other components within our product suite.

In terms of your specific case, my recommendation would be to discuss how to persist the ordering information across all layers of the system for the API you have in mind. That way, we may be able to enhance the semantics there in a way that works for all parts of the product surface equally.

from elasticsearch-java.

hendrikmuhs avatar hendrikmuhs commented on July 21, 2024

@technige

The problem appears only if the java client is used (maybe for other clients, too, that has to be investigated), the API works fine using the HTTP API. It's the client that breaks the contract as it shuffles what the user puts in.

from elasticsearch-java.

technige avatar technige commented on July 21, 2024

What contract is broken? As already mentioned above, JSON objects are defined as unordered, so any JSON parser is free to yield the values it reads in whatever order is convenient for the implementation. The only way to avoid that and retain the order in which the server happened to write them would be to write your own custom JSON parser which guarantees to retain that information.

And similarly, while it may normally do so in practice, the server itself does not guarantee that it will output the information in the payload in the same order as the user provided it. Again, this is a result of JSON having been chosen as the output format, rather than a vendor-specific variant of JSON (e.g. application/vnd.elastic.json-with-ordered-maps) .

But no contracts that exist today have been broken. It's simply that those contracts don't go far enough to cover your requirement. And so I return to my previous point: that we should instead discuss how to expose the additional semantics for the specific API.

from elasticsearch-java.

swallez avatar swallez commented on July 21, 2024

@hendrikmuhs the exact same issue will happen with our Go, C# and Rust clients, and any custom code people will write in these languages.

The Python dictionary, Ruby hash and PHP associative array happen to keep insertion order, but this is just a happy coincidence.

from elasticsearch-java.

hendrikmuhs avatar hendrikmuhs commented on July 21, 2024

Let's take this offline, it looks like gh is not the right place to discuss this, we seem to discuss different issues. I am very aware that JSON dictionaries do not define an order, I've never claimed so.

We will discuss in backend and get in touch afterwards.

from elasticsearch-java.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.