fuel-search's Issues

Returning relevance instead of score

When searching, we expect a relevance percentage as input, but so far we return a score (or, more precisely, the number of transformations).

One would expect the output to be accompanied by a relevance as well.
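
A minimal sketch of how a transformation count could be turned back into a relevance percentage. It assumes that the "number of transformations" is a Levenshtein distance, which the issue does not confirm; the function name `relevance_percentage` is hypothetical and not part of fuel-search.

```php
<?php
// Hypothetical helper, not part of fuel-search: converts an edit distance
// into a relevance percentage between 0 and 100.
function relevance_percentage(string $term, string $word): float
{
    $longest = max(strlen($term), strlen($word));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }

    // levenshtein() counts the insertions, deletions and substitutions
    // needed to turn $term into $word (assumed scoring method).
    $distance = levenshtein($term, $word);

    // The distance can never exceed the longer string's length,
    // so the result stays within 0..100.
    return (1 - $distance / $longest) * 100.0;
}

// Usage: one missing letter in a seven-letter word.
echo relevance_percentage('shampoo', 'shampo'), "\n";
```

Dividing by the longer string's length is only one possible normalisation; it has the property that identical strings give 100 and completely different strings give 0.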

Make search work properly with multiple words

At the moment, scores are calculated per occurring word. Search terms consisting of multiple words, however, should also be matched against combinations of words from the fields. The problem is that we can't simply try every possible combination, as the number of iterations could grow dramatically and make searching through the data very slow.

My first thought is to only calculate scores for combinations of words if the search term itself contains spaces. The search term "love pie" should, in addition to single words, also get scores for combinations of adjacent words. We should never skip the single words, as the space could be just another spelling mistake. With this strategy, we do ignore the possibility of people forgetting to type a space: searching for "lovepie" won't result in a score being calculated for "love pie", only for the individual words.
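
The strategy above could be sketched roughly as follows. The function name `candidate_phrases` is hypothetical, and using the number of words in the search term as the window size is an assumption; the issue only says "adjacent words".

```php
<?php
// Hypothetical helper, not part of fuel-search: lists the strings from one
// field value that would each be scored against the search term.
function candidate_phrases(string $search_term, string $field_value): array
{
    $words = preg_split('/\s+/', trim($field_value), -1, PREG_SPLIT_NO_EMPTY);

    // Single words are always scored, since a space in the search term
    // could be a spelling mistake.
    $candidates = $words;

    // Assumption: the window size equals the number of words in the term.
    $term_words = preg_split('/\s+/', trim($search_term), -1, PREG_SPLIT_NO_EMPTY);
    $window = count($term_words);

    // Only add combinations of adjacent words when the term contains spaces.
    if ($window > 1) {
        for ($i = 0; $i + $window <= count($words); $i++) {
            $candidates[] = implode(' ', array_slice($words, $i, $window));
        }
    }

    return $candidates;
}

// Usage: "love pie" adds two-word windows, "lovepie" does not.
print_r(candidate_phrases('love pie', 'we love apple pie'));
print_r(candidate_phrases('lovepie', 'we love apple pie'));
```

Because only adjacent windows are added, the number of extra candidates grows linearly with the number of words in the field rather than with the number of possible word combinations.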

Add method to search object to retrieve statistics after execution of search

As discussed in issue #2, it would be useful, especially for development purposes, to be able to get information about the search result set. I'm proposing the following syntax:

$products = Model_Product::find('all');

$products_search = Search::find('shampoo')
    ->in($products)
    ->by('name');

$found_products = $products_search->execute();
$found_products_statistics = $products_search->statistics();

My guess is that this method should return an array with the "entry_key" as key and the information about that entry as value.

offset and limit is sub-optimal

Even though the foreach loop over entry_scores breaks once the result limit has been reached, all entries before $offset are still iterated over.

By first selecting the slice of $entry_scores that will actually be used (a simple array_slice after the sorting), fewer iterations are needed to construct the final returned array.
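
A minimal sketch of the slicing idea. The function `page_of_results` and its parameters are hypothetical stand-ins for the library's internals, and sorting in ascending order assumes that a lower score (fewer transformations) is a better match.

```php
<?php
// Hypothetical stand-in for the internal paging step, not fuel-search code.
// $entries maps entry keys to entries, $entry_scores maps entry keys to scores.
function page_of_results(array $entries, array $entry_scores, int $offset, int $limit): array
{
    // Sort by score while keeping the entry keys (assumed: lower is better).
    asort($entry_scores);

    // Select only the requested window; passing true as the fourth
    // argument preserves the entry keys.
    $page = array_slice($entry_scores, $offset, $limit, true);

    // The loop now runs at most $limit times instead of $offset + $limit.
    $results = [];
    foreach (array_keys($page) as $entry_key) {
        $results[$entry_key] = $entries[$entry_key];
    }

    return $results;
}

// Usage: skip the best match and return the next two.
$entries = ['a' => 'A', 'b' => 'B', 'c' => 'C', 'd' => 'D'];
$scores  = ['a' => 3, 'b' => 0, 'c' => 2, 'd' => 1];
print_r(page_of_results($entries, $scores, 1, 2));
```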

Add "loose terms searching" to support multiple words searches

As discussed in issue #1, the search results for multiple words are currently very inaccurate. As a basis for multiple-word searching, we're starting off with loose terms searching.

This means that when people search for more than one word, the search phrase will be split on spaces. A similarity score will be calculated for each word separately, after which the scores will be added up into one overall score.

This approach should make searching for multiple words a lot more accurate than it is now. However, with this approach, the context in which the words appear will not matter. When searching for "we love pie", an entry containing "we really love apple pie" will score just as well as "love is awesome. we rule. let's eat some pie". Support for this will be added later on.
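
The loose terms idea could look roughly like this. The sketch assumes that each search word is scored by its smallest Levenshtein distance to any word in the field and that a lower total is better; neither detail is stated in the issue, and `loose_terms_score` is a hypothetical name.

```php
<?php
// Hypothetical scorer, not part of fuel-search. Lower is better;
// 0 means every search word occurs verbatim in the field.
function loose_terms_score(string $search_phrase, string $field_value): int
{
    // Split the field into lowercase words, dropping punctuation.
    $field_words = preg_split('/[^a-z0-9\']+/', strtolower($field_value), -1, PREG_SPLIT_NO_EMPTY);

    $total = 0;
    // The search phrase is split on spaces, as described above.
    foreach (explode(' ', strtolower(trim($search_phrase))) as $search_word) {
        if ($search_word === '') {
            continue; // skip the empty pieces produced by repeated spaces
        }

        // Start from the cost of matching nothing at all.
        $best = strlen($search_word);
        foreach ($field_words as $field_word) {
            $best = min($best, levenshtein($search_word, $field_word));
        }

        // Scores of the separate words are added up into one overall score.
        $total += $best;
    }

    return $total;
}

// Usage: word order and distance between the words do not affect the score.
var_dump(loose_terms_score('we love pie', 'we really love apple pie'));
var_dump(loose_terms_score('we love pie', "love is awesome. we rule. let's eat some pie"));
```

Both calls give the same score, which is exactly the limitation described above: the context in which the words appear is ignored.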
