fuzzyfinder's Introduction

fuzzyfinder

A fuzzy finder implemented in Python. It matches partial string entries against a list of strings, and works similarly to the fuzzy finder in SublimeText and Vim's Ctrl-P plugin.

Quick Start

$ pip install fuzzyfinder

or

$ easy_install fuzzyfinder

Usage

>>> from fuzzyfinder import fuzzyfinder

>>> suggestions = fuzzyfinder('abc', ['defabca', 'abcd', 'aagbec', 'xyz', 'qux'])
>>> list(suggestions)
['abcd', 'defabca', 'aagbec']

>>> # Use a user-defined function to obtain the string against which fuzzy matching is done
>>> collection = ['aa bbb', 'aca xyz', 'qx ala', 'xza az', 'bc aa', 'xy abca']
>>> suggestions = fuzzyfinder('aa', collection, accessor=lambda x: x.split()[1])
>>> list(suggestions)
['bc aa', 'qx ala', 'xy abca']

>>> suggestions = fuzzyfinder('aa', ['aac', 'aaa', 'aab', 'xyz', 'ada'])
>>> list(suggestions)
['aaa', 'aab', 'aac', 'ada']

>>> # Preserve original order of elements if matches have same rank
>>> suggestions = fuzzyfinder('aa', ['aac', 'aaa', 'aab', 'xyz', 'ada'], sort_results=False)
>>> list(suggestions)
['aac', 'aaa', 'aab', 'ada']

Features

  • Simple, easy-to-understand code.
  • No external dependencies, just the Python standard library.

How does it work

Blog post describing the algorithm: http://blog.amjith.com/fuzzyfinder-in-10-lines-of-python
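
As a rough illustration of a regex-based approach to this kind of matching, the sketch below joins the query characters with a non-greedy wildcard and ranks matches by length and then by start position. The name `simple_fuzzyfinder` is hypothetical, and this is a simplified sketch under those assumptions, not the library's actual source.

```python
import re


def simple_fuzzyfinder(query, collection):
    """Hypothetical sketch: return items that contain the query characters in order."""
    # 'abc' becomes 'a.*?b.*?c', so other characters may sit between the query characters.
    pattern = re.compile('.*?'.join(map(re.escape, query)))
    suggestions = []
    for item in collection:
        match = pattern.search(item)
        if match:
            # Rank by how compact the match is, then by how early it starts.
            suggestions.append((len(match.group()), match.start(), item))
    return [item for _, _, item in sorted(suggestions)]


print(simple_fuzzyfinder('abc', ['defabca', 'abcd', 'aagbec', 'xyz', 'qux']))
# ['abcd', 'defabca', 'aagbec']
```

Sorting on match length first is what puts tight matches such as 'abcd' ahead of spread-out ones such as 'aagbec'.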

fuzzyfinder's Issues

case_insensitive as optional argument?

I have some old code that reads:

from fuzzyfinder import fuzzyfinder
matches = fuzzyfinder(word_before_cursor, fuzzy_words, case_sensitive=True)

But I can't find any point in this project's history where case_sensitive was an option. Am I going crazy? Was case-sensitive matching the default, and was the extra parameter I was passing just doing nothing?
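
One way to check whether a callable accepts a given keyword is to inspect its signature. Note that a plain Python function without `**kwargs` raises TypeError on an unknown keyword rather than silently ignoring it. In the sketch below, `accepts_keyword` is a hypothetical helper and `stand_in_finder` is a stand-in whose parameters only mirror the keyword arguments shown in the README's usage examples; it is not the library's function.

```python
import inspect


def accepts_keyword(func, name):
    """Hypothetical helper: True if func can be called with name=... as a keyword."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-in (assumption): only the keywords that appear in the README examples.
def stand_in_finder(text, collection, accessor=lambda x: x, sort_results=True):
    return []


print(accepts_keyword(stand_in_finder, 'sort_results'))    # True
print(accepts_keyword(stand_in_finder, 'case_sensitive'))  # False
```

Running the same check against the installed `fuzzyfinder` function would show which keywords that particular version accepts.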

No regex version

Just an FYI: it's slightly faster, but at the cost of readability 😉:

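def make_searcher(query):
    # An empty query matches every item with a neutral rank.
    if len(query) == 0:
        return lambda x: (True, 0, 0)

    head = query[0]
    tail = query[1:]

    def matcher(datum):
        # Returns (matched, distance, first): distance sums the gaps between
        # consecutive matched characters; first is the index of the first one.
        running = 0
        first = datum.find(head)
        if first == -1:
            return (False, -1, -1)
        datum = datum[first+1:]

        # Look for each remaining character in what is left of the string.
        for char in tail:
            index = datum.find(char)
            if index == -1:
                return (False, -1, -1)
            running += index
            datum = datum[index+1:]

        return (True, running, first)

    return matcher


def fuzzyfinder(query, data):
    suggestions = []
    searcher = make_searcher(query)

    for item in data:
        matched, distance, first = searcher(item)
        if not matched:
            continue
        suggestions.append((distance, first, item))
    # Sort by gap distance, then first-match position, then the item itself.
    return [x for _, _, x in sorted(suggestions)]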

Test code:

collection = [
    'django_migrations.py',
    'django_admin_log.py',
    'main_generator.py',
    'migrations.py',
    'api_user.doc',
    'user_group.doc',
    'accounts.txt',
]
from timeit import timeit
timeit('fuzzyfinder("mig", collection)', setup='from __main__ import collection; from fuzzyfinder import fuzzyfinder')
          new (s)              old (s)              speedup
CPython   11.90004337999926    14.990981923001527   20.61%
PyPy      2.1479620933532715   3.060289144515991    29.64%

Highlighting feature

Hi, first of all, let me say that the code of your library is really cool because of its simplicity!

Now, I was wondering whether you know how to modify it slightly so that the results are highlighted as in SublimeText, TextMate, or similar tools.

To see what I mean, take a look at fuzzysort, which has an interface such as:

/*
WHAT: SublimeText-like Fuzzy Search
USAGE:
  fuzzysort.single('fs', 'Fuzzy Search') // {score: -16}
  fuzzysort.single('test', 'test') // {score: 0}
  fuzzysort.single('doesnt exist', 'target') // null
  fuzzysort.go('mr', ['Monitor.cpp', 'MeshRenderer.cpp'])
  // [{score: -18, target: "MeshRenderer.cpp"}, {score: -6009, target: "Monitor.cpp"}]
  fuzzysort.highlight(fuzzysort.single('fs', 'Fuzzy Search'), '<b>', '</b>')
  // <b>F</b>uzzy <b>S</b>earch
*/

So, any ideas how to add a "highlight" feature?
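
One possible direction, sketched in Python: build a regex with one capture group per query character and wrap each captured character in tags. The function `highlight` and its parameters are hypothetical and not part of fuzzyfinder; the sketch matches case-insensitively and does no HTML escaping.

```python
import re


def highlight(query, text, open_tag='<b>', close_tag='</b>'):
    """Hypothetical helper: wrap each fuzzily matched character in tags."""
    if not query:
        return text
    # One capture group per query character, e.g. 'fs' -> '(f).*?(s)'.
    pattern = '.*?'.join('({0})'.format(re.escape(ch)) for ch in query)
    match = re.search(pattern, text, re.IGNORECASE)
    if match is None:
        return None
    pieces = []
    last = 0
    for group in range(1, len(query) + 1):
        start, end = match.span(group)
        pieces.append(text[last:start])                        # unmatched text before this character
        pieces.append(open_tag + text[start:end] + close_tag)  # the matched character, wrapped
        last = end
    pieces.append(text[last:])
    return ''.join(pieces)


print(highlight('fs', 'Fuzzy Search'))
# <b>F</b>uzzy <b>S</b>earch
```

This highlights the leftmost match the regex finds, which is not necessarily the most compact one in the string.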

not good for Chinese?

suggestions = fuzzyfinder('可大讯飞', ['仅 就 第三季度 而言,虽然 科大讯飞 管理 费用 与 研发 费用 都 在 大幅 提升,但 两项 之 和 与 营收 的 比例 为 24% ,去年 同期 的 25% 还 要 低 一个 百分点 。 因此,将>三季度 扣非净利润 降低,归咎于 管理 费用 与 研发 费用 的 提升,显然 不太 恰当。'])
print(list(suggestions))
[] --- expected [科大讯飞]; only one Chinese character is different
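
The empty result is consistent with in-order character matching, where every query character has to appear in the candidate, so a single substituted character rules out a match. If tolerance to substitutions is needed, one alternative is a similarity ratio from the standard library's `difflib`, sketched below. This swaps in a different technique and is not what fuzzyfinder does; the shortened sample text, the whitespace tokenization, and the 0.6 cutoff are assumptions.

```python
import difflib

# Shortened sample of the sentence above, already segmented by spaces.
text = '虽然 科大讯飞 管理 费用 与 研发 费用 都 在 大幅 提升'
tokens = text.split()

# Rank tokens by similarity ratio and keep the best one above the cutoff.
close = difflib.get_close_matches('可大讯飞', tokens, n=1, cutoff=0.6)
print(close)
# ['科大讯飞']
```

Three of the four characters line up, giving a ratio of 0.75, which clears the cutoff.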

Disable alpha numeric sorting on suggestions

First off, this is an amazing library! Thanks so much for this. One thing that forces me to duplicate some code, though, shows up when I run the following:

suggestions = fuzzyfinder('baa', ['zbaa', 'dbaa', 'abaa'])

list(suggestions) comes back as ['abaa', 'dbaa', 'zbaa']. The list I pass in is already sorted, so I end up having to re-sort the results afterwards. Is there any way to disable this so that the results keep the order in which the list was presented?
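
The README's Usage section shows a `sort_results=False` argument for exactly this case. How such a switch can work is sketched below: sort on the rank alone and rely on Python's stable sort to keep equally ranked items in input order. The function `ranked_matches` is hypothetical; only the parameter name mirrors the README, and this is not the library's code.

```python
import re


def ranked_matches(query, collection, sort_results=True):
    """Hypothetical sketch of rank-only sorting that preserves input order on ties."""
    pattern = re.compile('.*?'.join(map(re.escape, query)))
    found = []
    for item in collection:
        match = pattern.search(item)
        if match:
            found.append((len(match.group()), match.start(), item))
    if sort_results:
        # Ties on rank fall through to alphabetical order of the item.
        ordered = sorted(found)
    else:
        # sorted() is stable, so equally ranked items keep their input order.
        ordered = sorted(found, key=lambda entry: entry[:2])
    return [item for _, _, item in ordered]


print(ranked_matches('baa', ['zbaa', 'dbaa', 'abaa']))
# ['abaa', 'dbaa', 'zbaa']
print(ranked_matches('baa', ['zbaa', 'dbaa', 'abaa'], sort_results=False))
# ['zbaa', 'dbaa', 'abaa']
```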
