Comments (10)

kimchy avatar kimchy commented on May 18, 2024

Terms API: Allow getting terms for one or more fields. Closed by 5d78196.

from elasticsearch.

clintongormley avatar clintongormley commented on May 18, 2024

Could you please provide the docs for the usage of terms, so that I can add it to ElasticSearch.pm?

thanks

clint

kimchy avatar kimchy commented on May 18, 2024

The terms API accepts the following URIs:

  • GET /_terms
  • GET /{index}/_terms (where {index} can be one or more indices, with _all support)

The HTTP parameters are (either fields or field must be set):

  • fields: The fields to search on, comma separated.
  • field: The field to search on, can be multiple HTTP field parameters.
  • from: The lower bound (lex) term from which the iteration will start. Defaults to start from the first.
  • to: The upper bound (lex) term to which the iteration will end. Defaults to unbound (null).
  • fromInclusive: Should the first from (if set) be inclusive or not. Defaults to false.
  • toInclusive: Should the last to (if set) be inclusive or not. Defaults to true.
  • prefix: An optional prefix from which the terms iteration will start (in lex order).
  • regexp: An optional regular expression used to filter terms (only terms that match the regexp are returned).
  • minFreq: An optional minimum document frequency; terms below it are filtered out.
  • maxFreq: An optional maximum document frequency; terms above it are filtered out.
  • size: The number of term / doc freq pairs to return per field. Defaults to 10.
  • sort: The type of sorting for term / doc freq. Can either be "term" or "freq". Defaults to term.
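As a rough illustration of how a client might assemble a request from the parameters listed above, here is a minimal Python sketch. The helper name build_terms_url, the base URL, the comma-joined index list, and the lowercase "true"/"false" encoding of booleans are all assumptions made for the example; only the URI patterns and parameter names come from the description above.

```python
from urllib.parse import urlencode


def build_terms_url(base, fields, indices=None, params=None):
    """Build a GET URL for the terms API (hypothetical helper).

    base    -- server address, e.g. "http://localhost:9200" (assumed)
    fields  -- list of field names, sent as the comma separated "fields" parameter
    indices -- optional list of index names; joined with commas (assumed format)
    params  -- optional dict of the other parameters (from, to, size, sort, ...)
    """
    if not fields:
        raise ValueError("fields or field must be set")
    path = "/_terms" if not indices else "/" + ",".join(indices) + "/_terms"
    query = [("fields", ",".join(fields))]
    for key, value in (params or {}).items():
        # Assumption: booleans travel as lowercase strings.
        if isinstance(value, bool):
            value = "true" if value else "false"
        query.append((key, value))
    return base + path + "?" + urlencode(query)


# "from" is a Python keyword, so the parameters are passed as a dict.
url = build_terms_url(
    "http://localhost:9200",
    ["user"],
    indices=["twitter"],
    params={"from": "a", "to": "m", "size": 5, "sort": "freq"},
)
print(url)
```

Passing the parameters as a dict keeps the helper agnostic about which options the server actually understands.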

The field names support both indexName-based lookup and full path lookup (with or without a type prefix).

The results basically include a docs header, followed by an object named after each field, which holds the terms and the document frequency of each.

The only thing I am not sure about is that, currently, the term value is the JSON object name. I wonder if it makes sense to create a generic JSON object with a term field holding the value instead. What do you think?

kimchy avatar kimchy commented on May 18, 2024

Regarding my previous question, I simply added another HTTP boolean parameter called termsAsArray. It defaults to true, which means you will get an array of JSON objects, with term and docFreq as fields. This also maintains the order for parsers that are not order aware (since you can sort). If set to false, the term itself is used as the JSON object name.
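To make the two shapes concrete, here is a small Python sketch that converts between them. The thread does not show the full response envelope, so this only models the per-field part; the assumption that the name-keyed form maps each term directly to its document frequency is mine, as are the function names.

```python
def terms_as_array(terms_by_name):
    """Name-keyed form -> array form with "term" and "docFreq" fields."""
    return [{"term": term, "docFreq": freq} for term, freq in terms_by_name.items()]


def terms_as_object(terms_array):
    """Array form -> name-keyed form (assumed to map term -> docFreq)."""
    return {entry["term"]: entry["docFreq"] for entry in terms_array}


# Array form sorted by frequency: the list order carries the sort order,
# even for a parser that does not preserve object key order.
by_freq = [
    {"term": "search", "docFreq": 12},
    {"term": "elastic", "docFreq": 7},
    {"term": "lucene", "docFreq": 3},
]

keyed = terms_as_object(by_freq)
print(keyed["elastic"])
print([entry["term"] for entry in by_freq])
```

The array form is the safer default for clients because a JSON object gives no ordering guarantee once it has been parsed into a hash.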

clintongormley avatar clintongormley commented on May 18, 2024

fromInclusive: Should the first from (if set) be inclusive or not. Defaults to false.
toInclusive: Should the last to (if set) be inclusive or not. Defaults to true

You mean fromInclusive defaults to TRUE. I've renamed these exclude_from and exclude_to so that the default (unspecified) is false.

clintongormley avatar clintongormley commented on May 18, 2024

What do you mean by this:

The field names support for indexName based lookup, and full path lookup (can have
a type prefix or not).

Can you give me an example of the format?

clintongormley avatar clintongormley commented on May 18, 2024

fromInclusive: Should the first from (if set) be inclusive or not. Defaults to false.
toInclusive: Should the last to (if set) be inclusive or not. Defaults to true

Actually, these are both incorrect. Currently fromInclusive is true and toInclusive is false.

Why do you have these as different values? From the naming of from and to, I'd expect them to be inclusive, and only to exclude them if specified.

kimchy avatar kimchy commented on May 18, 2024

The idea of fromInclusive and toInclusive is to follow the usual convention of writing a for loop, something like for (i=0;i<10;i++). In this case, the from (0) is inclusive and the to (10) is not. In any case, I suggest that you follow the same wording and parameters elasticsearch uses, so you won't confuse users. We can talk about whether it makes sense to change this, but for now I suggest keeping it the same.
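The half-open convention described here can be sketched in Python as a filter over a sorted term list. This is only a model of the bound semantics, not the server's implementation; the function name and its keyword arguments are made up for the example, and the defaults mirror the for-loop convention (lower bound included, upper bound excluded).

```python
def terms_in_range(terms, lower=None, upper=None,
                   from_inclusive=True, to_inclusive=False):
    """Return the terms between lower and upper in lexicographic order."""
    selected = []
    for term in sorted(terms):
        if lower is not None:
            # Skip terms before the lower bound, or equal to it when exclusive.
            if term < lower or (term == lower and not from_inclusive):
                continue
        if upper is not None:
            # Skip terms after the upper bound, or equal to it when exclusive.
            if term > upper or (term == upper and not to_inclusive):
                continue
        selected.append(term)
    return selected


terms = ["a", "b", "c", "d"]
print(terms_in_range(terms, "b", "d"))                     # ['b', 'c']
print(terms_in_range(terms, "b", "d", to_inclusive=True))  # ['b', 'c', 'd']
```

With both flags set to true the range reads the way "from b to d" does in natural language, which is the behaviour asked for later in the thread.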

Regarding the field name, it is explained a bit here (http://www.elasticsearch.com/docs/elasticsearch/mapping/object_type/#pathType), though I should add a page that explains it explicitly. For example, if you have (person is the type of the mapping):

{ person : { name : { firstName : "...", lastName : "..." } } }

then the field name (that will match) will be either person.name.firstName, or name.firstName. If you add explicit mapping for the name object (or person), you can control the pathType.
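As an illustration of the full path lookup in this example, the following Python sketch walks a nested document and lists, for every leaf field, the name with the type prefix and the name without it. The function is hypothetical and covers only full path names; it does not model indexName-based lookup or a custom pathType.

```python
def field_names(mapping_type, doc):
    """List (name_with_type_prefix, name_without_prefix) for each leaf field."""
    def walk(node, path):
        for key, value in node.items():
            if isinstance(value, dict):
                # Descend into nested objects, extending the dotted path.
                yield from walk(value, path + [key])
            else:
                yield ".".join(path + [key])

    return [(mapping_type + "." + name, name) for name in walk(doc, [])]


person = {"name": {"firstName": "...", "lastName": "..."}}
for with_type, without_type in field_names("person", person):
    print(with_type, "or", without_type)
```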

clintongormley avatar clintongormley commented on May 18, 2024

The idea of fromInclusive and toInclusive is to follow the usually convention of writing
a for loop, something like for (i=0;i<10;i++),

OK - I didn't get that. I would say then they should be called from and until, rather than to.

In Perl (and some other dynamic languages), loops can be written more succinctly, like:

for (1..5) { }
foreach my $name (@names) { }

... both of which are inclusive. To my mind, basing the default values of fromInclusive and toInclusive on a for loop exposes implementation, rather than representing how a user might think in natural language.

Regarding the field name....

OK, I have two mappings: type_1 and type_2. Both have a field 'text', but whether I ask for terms on field 'text' or 'type_1.text', I get the same results, which doesn't seem to be what I'm asking for.

Is this what it is supposed to do?

kimchy avatar kimchy commented on May 18, 2024

No problem, makes sense. I will change the toInclusive default to true.

Regarding the field name, yes, it's not filtered by type if you prefix it with the type (which is different from using the typed field in search queries, for example). It can be implemented, but it's more difficult and would be much more expensive to perform, so for now I did not implement it.
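One way to picture this behaviour is a resolver that accepts a type prefix but discards it, so that 'text' and 'type_1.text' end up at the same field. The Python sketch below is only a mental model with a hypothetical function name, not how elasticsearch resolves names internally.

```python
def resolve_field(name, known_types):
    """Strip a leading type prefix; the type is not used for filtering."""
    prefix, _, rest = name.partition(".")
    if rest and prefix in known_types:
        return rest
    return name


types = {"type_1", "type_2"}
# Both spellings resolve to the same field, hence the identical results.
print(resolve_field("text", types))         # text
print(resolve_field("type_1.text", types))  # text
```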
