
Comments (3)

Akron commented on June 20, 2024

Would estimates really be more helpful than lower bounds when a timeout occurs? I made a proposal for a change in the response from Krill to make the lower bound more explicit: setting the results to -1 and adding a different result key, like total_till_time_exceeded or similar. Would that help?
Estimation could be added, but it wouldn't be very sophisticated for a VC: it would estimate only over the whole corpus.
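
To illustrate the proposal, here is a minimal sketch of how a client might read such a response. The key name `total_results` and the helper `read_total` are assumptions made up for this example, and `total_till_time_exceeded` is only the name floated above; none of this is Krill's actual response format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Total:
    exact: Optional[int]        # exact match count, if the search completed
    lower_bound: Optional[int]  # matches counted before the timeout, if any


def read_total(response: dict) -> Total:
    """Interpret a hypothetical response in which -1 marks a timed-out count."""
    total = response["total_results"]  # hypothetical key name
    if total >= 0:
        return Total(exact=total, lower_bound=None)
    # Timed out: the count reached so far is reported under a separate key.
    return Total(exact=None, lower_bound=response.get("total_till_time_exceeded"))


complete = read_total({"total_results": 1200})
timed_out = read_total({"total_results": -1, "total_till_time_exceeded": 350})
```

With this shape, a client can no longer mistake a lower bound for an exact frequency, because the two values never share a field.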

from rkorapclient.

kupietz commented on June 20, 2024

Estimating frequencies, or some other workaround, is required when the frequency query is hidden somewhere deep, as in all collocation analysis functions, but also in simple frequency queries over vectors of queries and VCs.

Lower bounds alone would render the whole API client idea useless, unless perhaps this happens rarely or can be resolved by a retry or something similar.


Akron commented on June 20, 2024

I label this as an enhancement, as estimation would be a completely different feature and, I guess, would need to be implemented on Krill's side, at least to return the necessary numbers.

As I said: it may not work well with the numbers we currently get. I am not an expert in this field of statistics, but I would assume that, to get a reasonable estimate, we would need a rough percentage of how much of the data in question we had already searched when the timeout hit, and how much is left. We can give this information for the whole index (i.e. how many documents we have passed in relation to the whole corpus), but as far as I can see, we can't give this information for a VC for now, because a VC is not balanced over the whole corpus/index.

To be able to do that, we would have to calculate the number of documents in the VC and the number of documents in the VC we have already passed (at least roughly). We could do that in a single run.
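
Given those two document counts, the simplest estimate would be a linear extrapolation of the partial match count. The sketch below is only an illustration under a strong assumption, namely that matches are spread roughly evenly across the documents of the VC; the function name and parameters are made up for this example and are not part of Krill or the client.

```python
def estimate_total(matches_so_far: int, docs_passed: int, docs_total: int) -> float:
    """Scale the partial match count by the inverse of the fraction searched.

    docs_passed and docs_total must both be counted within the VC,
    not over the whole index, since a VC is not balanced over the index.
    """
    if docs_passed <= 0 or docs_passed > docs_total:
        raise ValueError("docs_passed must be between 1 and docs_total")
    return matches_so_far * docs_total / docs_passed


# 350 matches found after passing 2,000 of the VC's 8,000 documents.
estimate = estimate_total(matches_so_far=350, docs_passed=2_000, docs_total=8_000)
```

If the documents passed before the timeout are not representative of the rest of the VC, this estimate can be far off in either direction, which is why the counts it relies on matter so much.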

I see three options for this:

  • Doing it in the first run every time. This would slow down all searches.
  • Doing it only if an "estimation" flag is set.
  • Doing it after a timeout. This, however, would render the purpose of the timeout meaningless, as the calculation in a redundant run could be quite costly.

However, this would be a Krill enhancement.

