
Comments (5)

yl-lisen commented on June 2, 2024

As I see it, version_kv [left] join version_kv currently has two scenarios:
one is with aggregation, where we need to output changelog results;
the other is without aggregation, where it depends on the target table:
i. If the target table is version_kv, we only need to output the intermediate join results (without changelog).
ii. If the target table is changelog_kv, we still output changelog results.
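
To illustrate why the aggregation case needs changelog input, here is a minimal Python sketch. It is a toy model, not Proton's implementation: the class `MaxWithRetraction` and its `apply` method are hypothetical names, and the `delta` argument only mimics the role that `_tp_delta` plays in the queries discussed in this thread.

```python
from collections import Counter


class MaxWithRetraction:
    """Toy max() over changelog rows (hypothetical, for illustration only)."""

    def __init__(self):
        # value -> number of live rows currently carrying that value
        self.counts = Counter()

    def apply(self, value, delta):
        # delta is +1 for a newly joined row and -1 for a retracted one
        self.counts[value] += delta
        if self.counts[value] == 0:
            del self.counts[value]
        return max(self.counts) if self.counts else None


agg = MaxWithRetraction()
agg.apply(10, +1)            # joined row for some id arrives with v=10
agg.apply(10, -1)            # the key is updated: retract the old joined row
current = agg.apply(7, +1)   # then add the new joined row with v=7

# An append-only input never retracts 10, so a naive max over it stays stale.
naive = max([10, 7])
```

With retractions the aggregate follows the latest version of the key (`current` is 7), while the append-only input leaves the old value in play (`naive` is 10).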

from proton.

yl-lisen commented on June 2, 2024

For syntax:

-- case 1: with aggregation; the join result is changelog, and the aggregation result is append-only
select id, max(v) from kv1 join kv2 using(id) group by id;

-- case 2: with aggregation; the join result is changelog, and the aggregation result is changelog
select id, max(v), _tp_delta from kv1 join kv2 using(id) group by id emit changelog;

-- case 3: without aggregation; the join result is append-only
select id, kv1.v, kv2.v from kv1 [left] join kv2 using(id);

-- case 4: without aggregation; the join result is changelog
select id, kv1.v, kv2.v, _tp_delta from kv1 [left] join kv2 using(id) emit changelog;
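
The difference between the append-only and changelog join results (cases 3 and 4) can be sketched with a toy Python model. This is only an illustration under stated assumptions: `join_updates` and the event tuples are hypothetical, the join is a plain inner join on `id`, and the last tuple element stands in for `_tp_delta`. It does not describe how Proton executes the query.

```python
def join_updates(events, emit_changelog=False):
    """Toy keyed inner join on id (hypothetical helper, not Proton code).

    events: iterable of (side, id, v) where side is "kv1" or "kv2".
    Returns the emitted rows; with emit_changelog=True each row ends
    with a delta of +1 (new result) or -1 (retracted result).
    """
    latest = {"kv1": {}, "kv2": {}}  # latest value per key on each side
    joined = {}                      # last joined row emitted per key
    out = []
    for side, key, v in events:
        latest[side][key] = v
        if key not in latest["kv1"] or key not in latest["kv2"]:
            continue  # inner join: the other side has no match yet
        row = (key, latest["kv1"][key], latest["kv2"][key])
        if emit_changelog:
            if key in joined:
                out.append(joined[key] + (-1,))  # retract the previous result
            out.append(row + (+1,))
        else:
            out.append(row)  # append-only: just emit the new joined row
        joined[key] = row
    return out


events = [("kv1", 1, "a"), ("kv2", 1, "x"), ("kv1", 1, "b")]
append_only = join_updates(events)
changelog = join_updates(events, emit_changelog=True)
```

Here `append_only` holds the two successive joined rows for `id` 1, while `changelog` additionally contains a `-1` row retracting the first result before the updated one is emitted.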


chenziliang commented on June 2, 2024

In my mind, version_kv <left> join version_kv currently has two scenarios:
one is with aggregation, we need to output changelog results
the other is without aggregation,
i. If the target table is version_kv, we only need to join to output the intermediate results (without changelog)
ii. If the target table is changelog_kv, we still output changelog results

Right, we just need to support another emit strategy, such as EMIT UPSERT; it is the user's responsibility to pick the right emit strategy. For aggregation, we pick changelog for them. For a plain join, we pick changelog by default as well, but the user can override it with EMIT UPSERT. We don't need to consider the target stream, since that would be very complex and is sometimes hard, for example with an external target table (we don't know what it is). Does this seem good enough?


yl-lisen commented on June 2, 2024

I would rather not support another emit strategy, EMIT UPSERT: it is not a common strategy and would only be used for versioned_kv join versioned_kv.

Introducing a special emit strategy would make things more complex and harder for users to get started with.
I prefer that the default is append-only unless emit changelog is specified.

@chenziliang
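
The rule proposed here, read together with the four syntax cases earlier in the thread, can be written down as a small decision table. This is a sketch of one reading of the discussion, not Proton's planner logic; the function `output_modes` and its parameters are hypothetical.

```python
def output_modes(has_aggregation: bool, emit_changelog: bool):
    """Return (join_result_mode, query_output_mode) for the four cases.

    Hypothetical helper that only encodes the cases listed in this thread.
    """
    # An aggregation needs retractions from the join, so the join result is
    # a changelog whether or not the user wrote "emit changelog".
    if has_aggregation or emit_changelog:
        join_mode = "changelog"
    else:
        join_mode = "append-only"
    # The query output is append-only unless "emit changelog" is specified.
    output_mode = "changelog" if emit_changelog else "append-only"
    return join_mode, output_mode


case1 = output_modes(has_aggregation=True, emit_changelog=False)
case2 = output_modes(has_aggregation=True, emit_changelog=True)
case3 = output_modes(has_aggregation=False, emit_changelog=False)
case4 = output_modes(has_aggregation=False, emit_changelog=True)
```

Under this reading no extra emit strategy is needed: the only user-facing switch is whether `emit changelog` is present.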


chenziliang commented on June 2, 2024

Makes sense to me.

