Git Product home page Git Product logo

Comments (7)

amaliujia avatar amaliujia commented on June 8, 2024

@mbasmanova

I am a person who is interested in contributing to open source. Looks like this issue is a good one for a starter. May I work on this issue?

from velox.

mbasmanova avatar mbasmanova commented on June 8, 2024

Welcome, @amaliujia . Sure, feel free to pick this up. CC: @atanu1991 who is working on a related issue #153

from velox.

amaliujia avatar amaliujia commented on June 8, 2024

@mbasmanova

I want to discuss a bit on high level before start to implement.

MultiRange is a class that contains a list of Filter, with nan_allowed and null_allowed. So my current idea of the implementation is:

  1. Two MultiRange can merge only if they have the same nan_allowed and null_allowed.
  2. Two MultiRange can merge only if they have the same data type on all the Filters.
  3. If 1 and 2 are true, then simply have a nested loop to do
 for f1 in MultiRangeA.getFilters():
      for f2 in MultiRangeB.getFilters():
           f1.mergeWith(f2);

Can you give me some suggestions?

from velox.

mbasmanova avatar mbasmanova commented on June 8, 2024

Rui,

MultiRange represents and OR of multiple filters. NaN values pass only if MultiRange::nanAllowed is true. Null values pass only if nullAllowed is true.

Merging two filters is combining the filters with an AND. A value passes the new filter only if it passes both the original filters.

Here is how I would represent a merge of two MultiRange filters (with just 2 filters in each for simplicity):

(a OR b) AND (c OR d) = (a AND c) OR (a AND d) OR (b AND c) OR (b AND d)

Therefore, what you have proposed is very close.

  1. Calculate nullAllowed for the combined filter: bool bothNullAllowed = nullAllowed_ && other->testNull();
  2. Calculate nanAllowed for the combined filter: {same as above}
  3. Loop over all combinations of filters from left and right side and merge these.
  4. Construct a new filter using flags and list of filters from previous steps.

In step (3), make sure to check if a merge of two filter is kAlwaysFalse, kIsNull, kIsNotNull. In this case, the filter can be dropped from the result (null and NaN flags in the nested filters do not count for MultiRange filter; only top-level flags take effect).

Step (3) may result in an empty list of filters or a list of one. If the list is empty, return kAlwaysFalse, kIsNull, or kIsNotNull as appropriate (see nullOrFalse(bothNullAllowed) in Filter.cpp). If the list contains just one filter, return that filter after incorporating the null flag, e.g. if bothNullAllowed && !filter->testNull() return filter.mergeWith(IsNull) else return filter.

That should be it.

(Check out #233 that implements merge logic for BytesValues and BytesRanges.)

from velox.

amaliujia avatar amaliujia commented on June 8, 2024

Thanks @mbasmanova!

I prepared a WIP impl based on your suggestions: #299.

Can you comment on the PR to see if I am on the right track? I haven't added test cases and I can add those later.

Sorry to occupy too much of your time. I am a beginner so probably need more guidance at this moment (for example on how to deal with nullness in Volex)? Maybe later I could help to contribute a doc about nullness, nan, always true, always false, is null, is not null with Filter.

from velox.

stale avatar stale commented on June 8, 2024

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

from velox.

kgpai avatar kgpai commented on June 8, 2024

Looks like this issue was resolved with #299 . Closing.

from velox.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.