Comments (3)
Would estimates for timed-out queries really be more helpful than lower bounds? I made a proposal for a change in the response from Krill that makes the lower bound more explicit: setting the results to -1 and adding a separate result key, like total_till_time_exceeded or similar. Would that help?
Estimation could be added, but it wouldn't be very sophisticated for virtual corpora (VCs): it would estimate only on the basis of the whole corpus.
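To make the proposed response change concrete, here is a minimal sketch of how a client might tell an exact count from a lower bound. The key `total_till_time_exceeded` is only the name floated above, and `totalResults` as the name of the count field is an assumption; neither is confirmed as Krill's actual response format.

```python
from typing import NamedTuple, Optional


class TotalInfo(NamedTuple):
    exact: Optional[int]        # exact hit count, or None if the query timed out
    lower_bound: Optional[int]  # hits counted before the timeout, if reported


def interpret_total(response: dict) -> TotalInfo:
    """Interpret a (hypothetical) response that uses -1 to signal a timeout."""
    # Assumed field name for the hit count; -1 would mean "timed out".
    total = response.get("totalResults", -1)
    if total >= 0:
        return TotalInfo(exact=total, lower_bound=None)
    # Assumed key from the proposal; it may be absent.
    return TotalInfo(exact=None,
                     lower_bound=response.get("total_till_time_exceeded"))
```

With this shape a caller can never mistake a lower bound for a real frequency, because the two values live in different fields.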
from rkorapclient.
Estimated frequencies, or some other workaround, are required wherever the frequency query is buried deep inside other functions, as in all collocation analysis functions, but also in simple frequency queries over vectors of queries and VCs.
Lower bounds alone would render the whole idea of an API client useless, unless perhaps timeouts happen rarely or can be resolved by a retry or something similar.
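The retry idea could look roughly like the sketch below. It assumes a hypothetical `fetch(query, vc)` callable that returns a `(count, timed_out)` pair; this is not an existing RKorAPClient or Krill function, and whether a retry actually succeeds depends on server behaviour.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: fetch(query, vc) -> (count, timed_out)
Fetch = Callable[[str, str], Tuple[int, bool]]


def frequency_with_retry(fetch: Fetch, query: str, vc: str,
                         max_attempts: int = 3) -> Optional[int]:
    """Return an exact count, or None if every attempt timed out."""
    for _ in range(max_attempts):
        count, timed_out = fetch(query, vc)
        if not timed_out:
            return count
    # Never pass a lower bound on as if it were a frequency.
    return None
```

Returning `None` rather than the last lower bound keeps downstream code, such as collocation scores, from silently computing on truncated counts.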
I am labelling this as an enhancement, as estimation would be a completely different feature and, I guess, would need to be implemented on Krill's side, at least to the extent of returning the necessary numbers.
As I said, it may not work well with the numbers we currently get. I am not an expert in this field of statistics, but I would assume that a reasonable estimate requires a rough percentage of how much of the data in question we had already searched when the timeout hit, and how much is left. We can give this information for the whole index (i.e. how many documents we have passed in relation to the whole corpus), but as far as I can see, we can't give it for a VC for now, because a VC is not balanced over the whole corpus/index.
To be able to do that, we would have to calculate the number of documents in the VC and the number of documents in the VC that we have already passed (at least roughly). We could do that in a single run.
I see three options for this:
- Doing it in the first run every time. This would slow down all searches.
- Doing it only if an "estimation" flag is set.
- Doing it after a timeout. This, however, would defeat the purpose of the timeout, as the calculation in a redundant run could be quite costly.
However, this would be a Krill enhancement.
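For illustration, the extrapolation discussed above amounts to scaling the hits found so far by the share of documents already passed. The sketch below assumes that both document counts refer to the VC rather than the whole index, and that hits are spread roughly evenly over the documents in search order; the function and parameter names are hypothetical.

```python
def estimate_total(hits_so_far: int, docs_passed: int, docs_total: int) -> float:
    """Linearly extrapolate a hit count from a partially searched VC.

    hits_so_far: hits counted before the timeout (the lower bound)
    docs_passed: documents of the VC already searched
    docs_total:  documents in the VC overall
    """
    if docs_total <= 0 or not 0 < docs_passed <= docs_total:
        raise ValueError("need 0 < docs_passed <= docs_total")
    # Assumes hits are evenly distributed over the documents.
    return hits_so_far * docs_total / docs_passed


# 1200 hits after passing 25,000 of 100,000 VC documents
print(estimate_total(1200, 25_000, 100_000))  # 4800.0
```

If only index-wide counts are available, the same formula applies, but the result is skewed whenever the VC's documents are unevenly spread over the index, which is the limitation described above.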
Related Issues (16)
- Memoize totalResults instead of rerequesting
- recursiveCA demo: Error in `slice_head()`
- corpusQuery should have a context size parameter to controll snippet lengths
- Collocation analysis is incompatible with R 4.3.0
- RKorAPClient on shinyapps.io, problems with OAuth2
- Bug in collocationAnalysis.R
- Add more heuristics to example search in collocationAnalysis
- Analyze syntagmatic patterns of CA results like in the CCDB
- LICENSE and LICENSE.md conflict
- Add possibility to fetch search results pages in random order
- Umlauts cannot be queried on Windows / if system locale is not UTF-8
- Allow users to (O-)authorize client applications on the fly via a browser window
- Focus queries currently do not yield all results
- Timed out query results are cached
- Collocation scores are incorrect if lemmatizeNodeQuery or lemmatizeCollocateQuery is TRUE