brianreavis / sifter.js
A library for textually searching arrays and hashes of objects by property (or multiple properties). Designed specifically for autocomplete.
My input is sorted by date. Is it possible to set a flag to disable sort and leave the output sorted by date? Can't find it in the docs. Perhaps don't set sortField and set score to no?
Hi!
Thanks for the release, but version 0.5.0 doesn't seem to be published on npm:
In the body of the Sifter.prototype.search function are the following lines:
// generate result scoring function
fn_score = options.score || self.getScoreFunction(search);
I take this to mean we can supply a custom score function, which is pretty neat, but undocumented on the project GitHub page. It would be better to have it documented.
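Going by the quoted line, options.score appears to stand in for the function returned by getScoreFunction(search), that is, a function that takes an item and returns a number. A minimal sketch under that assumption follows. The scoreByTitlePrefix factory and the title field are hypothetical, and the sketch applies the function by hand rather than passing it to sifter.

```javascript
// Hypothetical factory: builds a scoring function for one query.
// Assumption: a score function takes an item and returns a number,
// where 0 means "no match" and larger means "better match".
function scoreByTitlePrefix(query) {
  var q = String(query).toLowerCase();
  return function (item) {
    var title = String(item.title == null ? '' : item.title).toLowerCase();
    var pos = title.indexOf(q);
    if (pos === -1) return 0;   // no match
    return pos === 0 ? 2 : 1;   // prefix matches rank above inner matches
  };
}

var score = scoreByTitlePrefix('fo');
var scores = [{ title: 'Foo' }, { title: 'Info' }, { title: 'Bar' }].map(score);
console.log(scores);
```

If the assumption holds, the returned function would be passed as the score option of search().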
Would be great to replace the native sort function with a stable algorithm.
Currently, if options.sort is false, then
search.items.sort(function(a, b) {
return b.score - a.score;
});
messes up the original order of the array in certain browsers, because the native sort is not stable.
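One common way to get stable ordering regardless of the engine is to remember each item's original index and use it as a tie-breaker. The sketch below assumes items carry a numeric score property, as in the snippet above; stableSortByScore is a hypothetical helper, not part of sifter. Engines that follow ES2019 or later already guarantee a stable Array.prototype.sort, so this mainly matters for older browsers.

```javascript
// Sort by descending score; ties keep their original relative order.
// stableSortByScore is a hypothetical helper, not part of sifter.
function stableSortByScore(items) {
  return items
    .map(function (item, index) { return { item: item, index: index }; })
    .sort(function (a, b) {
      // Fall back to the original position when the scores are equal.
      return (b.item.score - a.item.score) || (a.index - b.index);
    })
    .map(function (entry) { return entry.item; });
}

var sorted = stableSortByScore([
  { id: 'a', score: 1 },
  { id: 'b', score: 2 },
  { id: 'c', score: 1 },
  { id: 'd', score: 2 }
]);
var sortedIds = sorted.map(function (r) { return r.id; });
console.log(sortedIds);
```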
┌───────────────┬──────────────────────────────────────────────────────────────┐
│ Low │ Prototype Pollution │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Package │ minimist │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Patched in │ >=0.2.1 <1.0.0 || >=1.2.3 │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Dependency of │ selectize │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Path │ selectize > sifter > optimist > minimist │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ More info │ https://npmjs.com/advisories/1179 │
└───────────────┴──────────────────────────────────────────────────────────────┘
https://github.com/substack/node-optimist is deprecated. The author seems to have no intention of maintaining the package.
I think optimist should be replaced with yargs, which has all the same functionality - https://github.com/yargs/yargs/blob/master/docs/examples.md#even-more-shiver-me-timbers
Alternatively, optimist should be forked and the minimist version bumped to 0.2.1.
version on master builds, but version on npm does not
I'm getting errors when installing with Node 0.11.x.
npm ERR! Darwin 13.4.0
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "install" "sifter" "--save"
npm ERR! node v0.11.14
npm ERR! npm v2.0.0
npm ERR! code ELIFECYCLE
npm ERR! [email protected] install:node-gyp rebuild
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script.
npm ERR! This is most likely a problem with the microtime package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR! node-gyp rebuild
npm ERR! You can get their info via:
npm ERR! npm owner ls microtime
npm ERR! There is likely additional logging output above.
Could you add support for a later version of microtime? The author seems to have added support for 0.11.x:
wadey/node-microtime#19
The sort in sifter respects diacritics, but it doesn't support all locales. In my case, I'm specifically running into trouble sorting Chinese in a selectize element.
Seems like the way to make this work would be to switch to using localeCompare for the sorting, but then there would also need to be a way to supply a locale to sifter.
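A rough sketch of what locale-aware sorting could look like, using the standard Intl.Collator (which implements the same comparison as localeCompare but is created once per sort). The sortByFieldWithLocale helper and its locale parameter are hypothetical; sifter does not accept a locale as far as this thread shows. The ordering for a given locale, for example 'zh', depends on the locale data available in the runtime.

```javascript
// Hypothetical helper: sorts items by one string field using a locale.
// The `locale` parameter is the piece sifter would need to accept.
function sortByFieldWithLocale(items, field, locale) {
  var collator = new Intl.Collator(locale);
  return items.slice().sort(function (a, b) {
    return collator.compare(String(a[field]), String(b[field]));
  });
}

var names = sortByFieldWithLocale(
  [{ name: 'zebra' }, { name: 'éclair' }, { name: 'apple' }],
  'name',
  'en'
).map(function (item) { return item.name; });
console.log(names);
```

A plain code-point comparison would place 'éclair' after 'zebra'; the collator keeps it next to the other words starting with "e".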
The last modification is not published on NPM.
This function is especially important when searching by only a single column.
Creating my own score function is not a good solution, because it would then have to handle diacritics separately.
I need this function to be available inside selectize.js.
Hi - first off I'd like to thank you for this library.
I noticed (just today) in checking out the latest patch version of selectize 0.12.2 that this project's node-csv dependency access protocol changed from https to git a while ago.
Is there a specific reason for the change to git: instead of the authenticated https protocol?
I've read both #18 and #19 and I still don't understand why the protocol was changed.
Can it be changed back to https?
Hi. I'm using selectize with sifter support.
Diacritics matching can also be used when you switch between two languages (two keyboard layouts) and you're typing on the wrong one.
In my case, the English letters "qwertyuiop" are "йцукенгшщз" in the Russian keyboard layout. If I add these pairs of letters to DIACRITICS, it works perfectly and shows the expected search results even if I'm on the wrong keyboard layout! Very handy.
Something like this:
var DIACRITICS = {
'a': '[фaḀḁĂăÂâǍǎȺⱥȦȧẠạÄäÀàÁáĀāÃãÅåąĄÃąĄ]',
'b': '[иb␢βΒB฿𐌁ᛒ]',
'c': '[сcĆćĈĉČčĊċC̄c̄ÇçḈḉȻȼƇƈɕᴄCc]',
'd': '[вdĎďḊḋḐḑḌḍḒḓḎḏĐđD̦d̦ƉɖƊɗƋƌᵭᶁᶑȡᴅDdð]',
'e': '[уeÉéÈèÊêḘḙĚěĔĕẼẽḚḛẺẻĖėËëĒēȨȩĘęᶒɆɇȄȅẾếỀềỄễỂểḜḝḖḗḔḕȆȇẸẹỆệⱸᴇEeɘǝƏƐε]',
'f': '[аfƑƒḞḟ]',
'g': '[пgɢ₲ǤǥĜĝĞğĢģƓɠĠġ]',
'h': '[рhĤĥĦħḨḩẖẖḤḥḢḣɦʰǶƕ]',
'i': '[шiÍíÌìĬĭÎîǏǐÏïḮḯĨĩĮįĪīỈỉȈȉȊȋỊịḬḭƗɨɨ̆ᵻᶖİiIıɪIi]',
'j': '[оjȷĴĵɈɉʝɟʲ]',
'k': '[лkƘƙꝀꝁḰḱǨǩḲḳḴḵκϰ₭]',
'l': '[дlŁłĽľĻļĹĺḶḷḸḹḼḽḺḻĿŀȽƚⱠⱡⱢɫɬᶅɭȴʟLl]',
'n': '[тnŃńǸǹŇňÑñṄṅŅņṆṇṊṋṈṉN̈n̈ƝɲȠƞᵰᶇɳȵɴNnŊŋ]',
'o': '[щoØøÖöÓóÒòÔôǑǒŐőŎŏȮȯỌọƟɵƠơỎỏŌōÕõǪǫȌȍՕօ]',
'p': '[зpṔṕṖṗⱣᵽƤƥᵱ]',
'q': '[йqꝖꝗʠɊɋꝘꝙq̃]',
'r': '[кrŔŕɌɍŘřŖŗṘṙȐȑȒȓṚṛⱤɽ]',
's': '[ыsŚśṠṡṢṣꞨꞩŜŝŠšŞşȘșS̈s̈]',
't': '[еtŤťṪṫŢţṬṭƮʈȚțṰṱṮṯƬƭ]',
'u': '[гuŬŭɄʉỤụÜüÚúÙùÛûǓǔŰűŬŭƯưỦủŪūŨũŲųȔȕ∪]',
'v': '[мvṼṽṾṿƲʋꝞꝟⱱʋ]',
'w': '[цwẂẃẀẁŴŵẄẅẆẇẈẉ]',
'x': '[чxẌẍẊẋχ]',
'y': '[нyÝýỲỳŶŷŸÿỸỹẎẏỴỵɎɏƳƴ]',
'z': '[яzŹźẐẑŽžŻżẒẓẔẕƵƶ]'
};
Unfortunately there's no way to redefine the DIACRITICS variable; it's hardcoded. Is it possible to make it configurable?
P.S. I don't think Russian-English or other language pairs should be implemented in the source alongside the other diacritics; end users should configure them themselves. But it could be a new feature, in which case it would of course be another variable that can be used in asciifold().
I can prepare and send you a pull request for it.
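To make the request concrete, here is a minimal sketch of a configurable table. It is an illustration only: DEFAULT_DIACRITICS is a tiny stand-in for the hardcoded table, and buildSearchRegex and its overrides parameter are hypothetical names, not sifter's API.

```javascript
// A tiny default table standing in for sifter's hardcoded DIACRITICS.
var DEFAULT_DIACRITICS = { a: '[aÀÁÂàáâ]', e: '[eÈÉÊèéê]' };

// Hypothetical: returns a case-insensitive RegExp for `query`, expanding each
// letter into its character class. `overrides` lets callers replace or add
// classes without editing the default table.
function buildSearchRegex(query, overrides) {
  var table = Object.assign({}, DEFAULT_DIACRITICS, overrides || {});
  var pattern = String(query)
    .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')   // escape regex metacharacters
    .replace(/[a-z]/gi, function (ch) {
      return table[ch.toLowerCase()] || ch;
    });
  return new RegExp(pattern, 'i');
}

// English/Russian keyboard-layout pairs supplied by the caller.
var layout = { q: '[qй]', w: '[wц]', e: '[eу]' };
var re = buildSearchRegex('qwe', layout);
console.log(re.test('йцу'), re.test('qwe'), re.test('abc'));
```

Note that an override replaces the default class for that letter, so a caller who wants both the accented forms and the layout pair has to list both in the override.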
Hey there,
looking through your code I noticed the limited "Unicode to ASCII reduction". While this probably works for most people, there are more symbols than that. I had a similar problem with a URL slug creator function a while back; maybe this helps you extend your diacritics support: urlify.js
Cheers!
I would like to try restarting development in my spare time.
But don't expect miracles; I don't have much time.
Hi,
I have a problem loading Sifter.js with jspm because jspm does not support loading dependencies from a tarball/archive (jspm/jspm-cli#424). Although this is a jspm issue, I wonder why the dependency on node-csv is specified as "https://github.com/voodootikigod/node-csv/tarball/master". Is it possible to change this to "git://github.com/voodootikigod/node-csv"? Please see the pull request (#19).
Thanks in advance!
Best regards,
Alex
It would be great if "grade 2" scored "grade 2 smith room 311" higher than "grade 4 roberts room 214". Likewise, "room 2" should score the second result higher than the first.
Any thoughts on how best to approach this? If you have any ideas let me know, I'd be happy to submit the enhancements back in as a PR.
fields: ["title", "description"],
weights: {
"title": 2,
"description": 1
}
Only an idea at this point.
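One possible reading of the idea, sketched as a standalone function: each field contributes its weight when it matches, and the total is normalised by the sum of the weights. The weightedScore helper is hypothetical, and simple substring matching stands in for sifter's real scoring.

```javascript
// Hypothetical scoring helper; `weights` mirrors the option proposed above.
function weightedScore(item, query, fields, weights) {
  var q = String(query).toLowerCase();
  var total = 0, weightSum = 0;
  fields.forEach(function (field) {
    var w = weights[field] == null ? 1 : weights[field];   // default weight 1
    var value = String(item[field] == null ? '' : item[field]).toLowerCase();
    weightSum += w;
    if (value.indexOf(q) !== -1) total += w;
  });
  return weightSum ? total / weightSum : 0;
}

var opts = { fields: ['title', 'description'], weights: { title: 2, description: 1 } };
var inTitle = weightedScore({ title: 'Red car', description: 'fast' }, 'red', opts.fields, opts.weights);
var inDesc = weightedScore({ title: 'Truck', description: 'red paint' }, 'red', opts.fields, opts.weights);
console.log(inTitle > inDesc);
```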
Hi!
I'm missing the ability to sort on a nested property (and potentially to search on one).
Imagine the following objects:
let data = [
{name: 'John Doe', metrics: {a: 15, b:17}},
{name: 'Jane Doe', metrics: {a: 18, b:2}}
];
I want to be able to write:
let s = new Sifter(data).search('', {
fields: [],
sort: [{field: 'metrics.a', direction: 'asc'}]
});
Is there a way to do this today?
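For illustration, dot-notation sorting can be done outside the library with a small path resolver. Both getPath and sortByPath below are hypothetical helpers written for this sketch; they are not sifter functions.

```javascript
// Hypothetical helper: resolves a dot-separated path such as 'metrics.a'.
function getPath(obj, path) {
  return path.split('.').reduce(function (current, key) {
    return current == null ? undefined : current[key];
  }, obj);
}

// Sorts a copy of `items` by the value found at `path`.
// Items without that path are placed last.
function sortByPath(items, path, direction) {
  var sign = direction === 'desc' ? -1 : 1;
  return items.slice().sort(function (x, y) {
    var a = getPath(x, path), b = getPath(y, path);
    if (a === b) return 0;
    if (a === undefined) return 1;
    if (b === undefined) return -1;
    return (a < b ? -1 : 1) * sign;
  });
}

var people = [
  { name: 'John Doe', metrics: { a: 15, b: 17 } },
  { name: 'Jane Doe', metrics: { a: 18, b: 2 } }
];
var byB = sortByPath(people, 'metrics.b', 'asc').map(function (p) { return p.name; });
console.log(byB);
```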
@brianreavis Would you be open to adding more maintainers to this repo?
For example, you could ask someone among the current forks, or someone who has submitted PRs (pending or merged ones).
This is an awesome library but there are some minor things that could be polished, e.g. the security warning from minimist and possibly deprecating (or extracting) the CLI stuff.
Hey there!
I'd like to report a security issue but cannot find contact instructions on your repository.
If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.
Thank you for your consideration, and I look forward to hearing from you!
(cc @huntr-helper)
Add the ability to sort by multiple fields. The sort precedence will be based on the order of entries in the "sort" setting. Basic idea (the interface might vary):
sort: [
{field: "$score", direction: "desc"},
{field: "first_name", direction: "asc"},
{field: "last_name", direction: "asc"},
]
score: 2, first_name: "shane", last_name: "mcconkey"
score: 2, first_name: "shane", last_name: "zebra"
score: 1, first_name: "aziz", last_name: "ansari"
score: 1, first_name: "nick", last_name: "offerman"
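The precedence rule described above can be sketched as a comparator that walks the sort entries in order and returns at the first field that differs. The compareBy factory is hypothetical, and mapping the "$score" name to a plain score property is an assumption made for this example.

```javascript
// Hypothetical comparator factory for a list of {field, direction} entries.
// Earlier entries take precedence; later ones only break ties.
function compareBy(sortSpec) {
  return function (a, b) {
    for (var i = 0; i < sortSpec.length; i++) {
      // Assumption: "$score" refers to a plain `score` property on each row.
      var field = sortSpec[i].field === '$score' ? 'score' : sortSpec[i].field;
      var sign = sortSpec[i].direction === 'desc' ? -1 : 1;
      if (a[field] < b[field]) return -1 * sign;
      if (a[field] > b[field]) return 1 * sign;
    }
    return 0;
  };
}

var rows = [
  { score: 1, first_name: 'nick', last_name: 'offerman' },
  { score: 2, first_name: 'shane', last_name: 'zebra' },
  { score: 1, first_name: 'aziz', last_name: 'ansari' },
  { score: 2, first_name: 'shane', last_name: 'mcconkey' }
];
rows.sort(compareBy([
  { field: '$score', direction: 'desc' },
  { field: 'first_name', direction: 'asc' },
  { field: 'last_name', direction: 'asc' }
]));
var lastNames = rows.map(function (r) { return r.last_name; });
console.log(lastNames);
```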
I'm trying to search a simple nested object, but no results are returned.
var sifter = new Sifter([
{
lastName: "John",
firstName: "Doe",
email: "[email protected]",
society: {
id: 1,
name: "ONU"
}
}
]);
I have tried:
var result = sifter.search('onu', {
fields: ['lastName', 'firstName', 'society.name'],
});
var result = sifter.search('onu', {
fields: ['lastName', 'firstName', 'society[name]'],
});
Thank you for your reply.
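One workaround that does not depend on nested-field support is to copy the nested value onto a top-level key before building the index. This is a sketch only: the societyName key is an invented name, the email field is left out for brevity, and the Sifter call is shown as a comment rather than executed.

```javascript
// Hypothetical workaround: copy the nested value onto a top-level key
// ('societyName' is an invented field name) before building the index.
var members = [
  { lastName: 'John', firstName: 'Doe', society: { id: 1, name: 'ONU' } }
];

var flattened = members.map(function (member) {
  var copy = Object.assign({}, member);   // shallow copy; the original is untouched
  copy.societyName = member.society ? member.society.name : '';
  return copy;
});

// The flattened array could then be searched with a plain field list, e.g.
// new Sifter(flattened).search('onu', { fields: ['lastName', 'firstName', 'societyName'] });
console.log(flattened[0].societyName);
```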
So "red car" would only match explicitly "red car", not red, car, reddit, carnage, etc.
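As a rough illustration of that behaviour, a whole-phrase match can be expressed with word boundaries. The matchesPhrase helper is hypothetical, and because \b is ASCII-oriented this sketch is only reliable for unaccented Latin text.

```javascript
// Hypothetical helper: true only when `phrase` occurs as whole words, in order.
function matchesPhrase(text, phrase) {
  // Escape regex metacharacters so the phrase is matched literally.
  var escaped = String(phrase).replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  return new RegExp('\\b' + escaped + '\\b', 'i').test(String(text));
}

var phraseChecks = ['A red car', 'red', 'car', 'reddit', 'red carnage'].map(function (text) {
  return matchesPhrase(text, 'red car');
});
console.log(phraseChecks);
```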
This would make it consistent with the undocumented options.score, and would provide much simpler workarounds for some of the other issues which want to disable sorting. In that case you could just do something like options.sort = function (results) { return results; }.
Hi,
since https://github.com/brianreavis/selectize.js uses sifter.js for stripping diacritics, I would like to ask you to update the mapping table to include the "ľ" and "ĺ" characters.
We really miss them in the Slovak language, e.g. for regions such as Veľký Krtíš or the occupation "učiteľ".
var DIACRITICS = {
'a': '[aÀÁÂÃÄÅàáâãäåĀāąĄ]',
'c': '[cÇçćĆčČ]',
'd': '[dđĐďĎð]',
'e': '[eÈÉÊËèéêëěĚĒēęĘ]',
'i': '[iÌÍÎÏìíîïĪī]',
'l': '[lłŁ]',
'n': '[nÑñňŇńŃ]',
'o': '[oÒÓÔÕÕÖØòóôõöøŌō]',
'r': '[rřŘ]',
's': '[sŠšśŚ]',
't': '[tťŤ]',
'u': '[uÙÚÛÜùúûüůŮŪū]',
'y': '[yŸÿýÝ]',
'z': '[zŽžżŻźŹ]'
};
Please let me know if I should also copy this issue to Selectize.
Thanks in advance.
These diacritics from the Latvian alphabet are missing:
g -> ģĢ
k -> ķĶ
l -> ļĻ
n -> ņŅ
Can the microtime dependency be moved into devDependencies, since it's not required in the main library and it adds unnecessary node recompiles to our builds?
If you have an item with the value of 1 in the list to search, it finds a match.
But if you have a value of 0 it is not matched on.
I found this when using selectize, which uses this library to perform the sorting. I had an id that was 0 and I could not get it to match on it. (Unfortunate because 0 is an important item in my list.)
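A likely cause, stated as an assumption since the scoring code is not quoted here, is a truthiness guard that treats 0 the same as a missing value. The two functions below are simplified stand-ins written for this sketch, not sifter's actual code.

```javascript
// Assumed shape of the bug: a truthiness guard treats 0 like "no value".
function scoreValueTruthy(value, query) {
  if (!value) return 0;                        // 0, '' and null all bail out here
  return String(value).indexOf(query) !== -1 ? 1 : 0;
}

// Possible fix: only skip null and undefined.
function scoreValueNullCheck(value, query) {
  if (value === null || value === undefined) return 0;
  return String(value).indexOf(query) !== -1 ? 1 : 0;
}

var zeroScores = [scoreValueTruthy(0, '0'), scoreValueNullCheck(0, '0')];
console.log(zeroScores);
```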
async v2.6.3 uses an outdated version of lodash which has a low severity security issue (CVE-2020-8203). Async v3 does not depend on lodash at all.
Right now, the default behavior is to adjoin terms with OR. Allowing AND for better narrowing would be nice in some cases.
"conjunction": "and"
Related:
selectize/selectize.js#119
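The difference between the two modes can be sketched with a standalone matcher. The matchesQuery helper and its conjunction parameter are hypothetical and only mirror the option proposed above; plain substring matching stands in for sifter's tokenised regex search.

```javascript
// Hypothetical matcher; `conjunction` mirrors the option proposed above.
function matchesQuery(text, query, conjunction) {
  var haystack = String(text).toLowerCase();
  var tokens = String(query).toLowerCase().split(/\s+/).filter(Boolean);
  if (!tokens.length) return true;             // an empty query matches everything
  var hit = function (token) { return haystack.indexOf(token) !== -1; };
  // "and" requires every token to match; anything else behaves like "or".
  return conjunction === 'and' ? tokens.every(hit) : tokens.some(hit);
}

var orMatch = matchesQuery('grade 2 smith', 'grade roberts', 'or');
var andMatch = matchesQuery('grade 2 smith', 'grade roberts', 'and');
console.log(orMatch, andMatch);
```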
I think this line is wrong.
Line 392 in 97270b4
It should be something like
var result_hash = sifter_hash.search('switzerland europe', options);
But it looks like that will also break the test, because the id will be different: a instead of 0.
sifter.js lists "csv-parse": "^2.0.0" among its dependencies.
csv-parse was affected by a Regular Expression Denial of Service (ReDoS) vulnerability up to v4.4.6:
https://www.npmjs.com/advisories/1171
The current version of csv-parse is v4.4.7.
I suggest updating csv-parse in sifter.js because of this vulnerability.
Versions of csv-parse prior to 4.4.6 are vulnerable to Regular Expression Denial of Service. The __isInt() function contains a malformed regular expression that processes large specially-crafted input very slowly, leading to a Denial of Service. This is triggered when using the cast option.
Remediation
Upgrade to version 4.4.6 or later.
I use TypeScript pretty much everywhere these days and I often make my own definitions for libraries. I've made one for sifter and I'm wondering whether there's any interest in including it in the library? It would mean something else to update, but the API seems fairly small and stable.
I could probably manage to rustle up a PR if there's interest.
Because of the csv-parse CVE (see also #55) I looked at this library, and noticed that the library is completely self-contained, and all dependencies are only required by the sifter-binary. If the binary would be a self-contained package, the sifter library would have no dependencies at all, and wouldn't be affected by upstream security issues.
My guess is that a sizable number of sifter users are using it indirectly through selectize.js, which also uses only the library parts of this package.
Hi!
I am unable to install at all, because of the hard coded dependency of node-csv ("node-csv": "https://github.com/voodootikigod/node-csv/tarball/master").
Edit: I can't configure a proxy, I use a custom registry which can't handle hard coded urls
Best regards,
murm
Hi, I am trying to create a small autocomplete without jQuery that can handle a large number of options (500k+). About 200k+ of them start with the same word, and if I start typing that word it blocks the UI for a long time. I was wondering if there is a way to stop the search once the number of results with a score over 1.0 exceeds the limit (I get a score of 1.1 for all of them).
Alternatively, it would help to be able to stop the search manually, or after a fixed amount of time, and return only the results found before the stop was triggered.
Thanks, and great work with this library.
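A minimal sketch of the early-exit idea, written as a standalone scan rather than a change to sifter: it stops once enough matches are found or a time budget is spent. The searchWithBudget helper, its parameters, and the prefix-only matching are all assumptions made for illustration.

```javascript
// Hypothetical early-exit scan: stops once `limit` items match or the time
// budget (milliseconds) is used up, and returns the indices found so far.
function searchWithBudget(items, query, limit, budgetMs) {
  var q = String(query).toLowerCase();
  var deadline = Date.now() + budgetMs;
  var results = [];
  for (var i = 0; i < items.length; i++) {
    if (results.length >= limit) break;
    // Checking the clock on every item is wasteful, so sample it.
    if (i % 1024 === 0 && Date.now() > deadline) break;
    if (String(items[i]).toLowerCase().indexOf(q) === 0) results.push(i);
  }
  return results;
}

var options = [];
for (var n = 0; n < 200000; n++) options.push('common prefix ' + n);
var firstTen = searchWithBudget(options, 'common', 10, 50);
console.log(firstTen.length);
```

The trade-off is that the scan returns the first matches it encounters, not necessarily the best-scoring ones, which is acceptable only when many items score the same, as in the case described.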
var data = [
{
"title": "Foo",
"tags": ["foo", "bar", "baz"]
},
{
"title": "Bar",
"tags": ["bar", "qux"]
}
];
It would be great to have the ability to search the data based on the tags property. It's possible to hack this up in user-code by concatenating the tags property into a string, with a unique imploding character, e.g.
var data = [
{
"title": "Foo",
"tags": ["foo", "bar", "baz"],
"_tags": "foo_bar_baz"
},
{
"title": "Bar",
"tags": ["bar", "qux"],
"_tags": "bar_qux"
}
];
...and search through that transformed data instead, but this is neither a clean nor a stable solution.
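What the request amounts to can be sketched as a field matcher that treats an array as a list of candidate values and checks each element on its own. The fieldMatches helper is hypothetical and uses plain substring matching instead of sifter's scoring.

```javascript
// Hypothetical helper: matches when a string field contains the query, or when
// any element of an array field does.
function fieldMatches(value, query) {
  var q = String(query).toLowerCase();
  var values = Array.isArray(value) ? value : [value];
  return values.some(function (v) {
    return v != null && String(v).toLowerCase().indexOf(q) !== -1;
  });
}

var posts = [
  { title: 'Foo', tags: ['foo', 'bar', 'baz'] },
  { title: 'Bar', tags: ['bar', 'qux'] }
];
var withQux = posts
  .filter(function (p) { return fieldMatches(p.tags, 'qux'); })
  .map(function (p) { return p.title; });
console.log(withQux);
```

Unlike the concatenation hack, a query such as "o_b" cannot match across two adjacent tags here, because each element is tested separately.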