Comments (21)
Implemented!
FT.SUGDEL key str
Enjoy!
from redisearch.
Wow, that was unexpectedly fast! Thanks!
from redisearch.
Regarding your second comment: I would like that, yes, as I'm developing search functionality for a rather large-scale application and I'd like to keep everything in redis. Of course I don't want to pressure you to do anything as this is an open-source project, though your effort is greatly appreciated by me and, I believe, by many others who will come to use RediSearch. Thank you for your time. :)
P.S. How long would it take you to implement that?
from redisearch.
It should take a couple of hours, it's just a matter of finding the time. I'll try to do it this week.
It will make the deletions slower, but should keep the searches just as fast.
from redisearch.
It works! Thanks man, you just made my week!
from redisearch.
Hey. Adding something simple such as marking an entry as deleted until the tree is rebuilt is trivial, I can do it soon. As long as you're not deleting too much it should work fine.
A complete deletion with rebalancing the tree might be a bit trickier. I'll see what I can do.
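A minimal sketch of the "mark as deleted" shortcut described above, assuming a plain per-character trie. `Node`, `LazyTrie`, and every method name here are hypothetical illustrations, not RediSearch's actual data structure. It shows why tombstoned entries disappear from results but are still walked by a prefix search until the tree is rebuilt.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical illustration only; this is not RediSearch's trie.
struct Node {
    std::map<char, std::unique_ptr<Node>> children;
    bool terminal = false;  // an entry ends at this node
    bool deleted = false;   // tombstone: entry is hidden, node stays in place
};

class LazyTrie {
public:
    void add(const std::string& s) {
        Node* n = root_.get();
        for (char ch : s) {
            std::unique_ptr<Node>& child = n->children[ch];
            if (!child) child = std::make_unique<Node>();
            n = child.get();
        }
        n->terminal = true;
        n->deleted = false;  // re-adding revives a tombstoned entry
    }

    // Marks the entry as deleted without freeing or unlinking any node.
    bool remove(const std::string& s) {
        Node* n = find(s);
        if (!n || !n->terminal || n->deleted) return false;
        n->deleted = true;
        return true;
    }

    // Returns the live entries under `prefix`. If `visited` is given, it
    // receives the number of nodes walked, tombstoned ones included.
    std::vector<std::string> search(const std::string& prefix,
                                    std::size_t* visited = nullptr) const {
        std::vector<std::string> out;
        std::size_t count = 0;
        if (const Node* n = find(prefix)) {
            std::string buf = prefix;
            collect(*n, buf, out, count);
        }
        if (visited) *visited = count;
        return out;
    }

    // Rebuild: copy only live entries into a fresh trie, dropping dead nodes.
    LazyTrie rebuilt() const {
        LazyTrie fresh;
        for (const std::string& s : search("")) fresh.add(s);
        return fresh;
    }

private:
    std::unique_ptr<Node> root_ = std::make_unique<Node>();

    Node* find(const std::string& s) const {
        Node* n = root_.get();
        for (char ch : s) {
            auto it = n->children.find(ch);
            if (it == n->children.end()) return nullptr;
            n = it->second.get();
        }
        return n;
    }

    static void collect(const Node& n, std::string& buf,
                        std::vector<std::string>& out, std::size_t& visited) {
        ++visited;
        if (n.terminal && !n.deleted) out.push_back(buf);
        for (const auto& kv : n.children) {
            assert(kv.second != nullptr);  // child slots are never left empty
            buf.push_back(kv.first);
            collect(*kv.second, buf, out, visited);
            buf.pop_back();
        }
    }
};
```

In this sketch `remove` is cheap, but a prefix search still pays for every tombstoned node under the prefix. Only `rebuilt()` shrinks the tree.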
from redisearch.
Nice m8! Thanks.
from redisearch.
Just tested this and it causes a huge performance drop once keys are removed.
from redisearch.
What did you test? How many keys and how many did you delete?
It shouldn't slow things down at all, but it won't make things faster either.
from redisearch.
Oh, I should've said that sooner...
Anyway:
I tested it by first adding a million random entries, then adding a million similar entries like "asdfg(+ an integer from 1-1000000)", and then deleting those entries right after adding them all. After that, searching for "asdfg" takes forever.
from redisearch.
Is it any different from not deleting anything? It shouldn't be. Can you test that?
If you save the database and reload, you'll get only the un-deleted tree.
BTW the test case is not realistic: you're doing a prefix search that traverses the entire data set with no shortcuts, which is no different from doing KEYS * in redis.
from redisearch.
I'll share some code I based this benchmark on in a minute.
from redisearch.
I can offer a data set of all English Wikipedia entries with popularity scores; I used it for developing the module and for benchmarking.
Regarding the deletion - I can implement "real" deletion and make it way faster, but I can't whip it out in an hour like this shortcut :)
from redisearch.
Ok, so I based my benchmark on the following:
#include <cstdint>
#include <cstring>
#include <hiredis/hiredis.h>
// core::random is a separate helper library that is not shown in this thread.

// Fill the completer with 1000000 random entries of the form "tag:name".
void fill_first()
{
    redisContext* c = redisConnect("127.0.0.1", 34567);
    const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    const size_t letters = sizeof(alphabet) - 1; // exclude the trailing '\0'
    for (int i = 0; i < 1000000; i++)
    {
        int n = (core::random::uint8_get() % 24) + 1; // some random size, 1..24
        char tag[24], name[24];
        uint8_t rand_idx[48];
        core::random::get(rand_idx, n * 2); // get random bytes
        for (int j = 0; j < n; j++)
        {
            tag[j] = alphabet[rand_idx[j] % letters];      // fill both arrays with random letters
            name[j] = alphabet[rand_idx[n + j] % letters];
        }
        // %b expects a size_t length, so cast n explicitly
        void* reply = redisCommand(c, "FT.SUGADD userslex %b:%b %d", tag, (size_t)n, name, (size_t)n, i);
        if (reply) freeReplyObject(reply); // free each reply to avoid leaking memory
    }
    redisFree(c);
}

// Add 1000000 entries "<variant><i>" and delete them again right away.
void add_delete(const char* variant)
{
    redisContext* c = redisConnect("127.0.0.1", 34567);
    for (int i = 0; i < 1000000; i++)
    {
        void* reply = redisCommand(c, "FT.SUGADD userslex %s%d %d", variant, i, i);
        if (reply) freeReplyObject(reply);
    }
    for (int i = 0; i < 1000000; i++)
    {
        void* reply = redisCommand(c, "FT.SUGDEL userslex %s%d", variant, i);
        if (reply) freeReplyObject(reply);
    }
    redisFree(c);
}

// Run one fuzzy prefix search for str.
void search(const char* str)
{
    redisContext* c = redisConnect("127.0.0.1", 34567);
    void* reply = redisCommand(c, "FT.SUGGET userslex %b MAX 10 FUZZY", str, strlen(str));
    if (reply) freeReplyObject(reply);
    redisFree(c);
}

int main(int argc, char** argv)
{
    fill_first();        // fill the completer with 1000000 random lowercase entries
    search("asdfg");     // ended in 2 ms
    add_delete("asdfg"); // now add 1000000 entries of a variant string and remove those entries
    search("asdfg");     // ended in 66 ms, i.e. 33 times slower on the same key set
    search("unrel");     // ended in 0 ms, i.e. searching unrelated entries (ones that do not share the variant's characters) is unaffected
    return 0;
}
from redisearch.
Ok, there might be a simpler solution than to implement full deletion.
from redisearch.
Thanks man, I appreciate it!
from redisearch.
awesome!
from redisearch.
@mannol ok, looks like I fixed it, even though I doubt it was a real problem (I can go into greater detail on why this is very specific to the kind of data you generated).
After the fix, memory is freed, and after the first search iteration, searches are just as fast. This is what I'm getting from the benchmark, running the search 10 times before the add/delete and 10 times after:
Before:
done search in 1.225009ms!
done search in 0.723388ms!
done search in 0.798917ms!
done search in 0.774701ms!
done search in 0.768723ms!
done search in 0.827574ms!
done search in 0.806563ms!
done search in 0.664389ms!
done search in 0.649305ms!
done search in 0.597494ms!
--- Adding/Deleting!---
done POST DEL search in 11.494974ms!
done POST DEL search in 0.889031ms!
done POST DEL search in 0.825044ms!
done POST DEL search in 0.808792ms!
done POST DEL search in 0.789969ms!
done POST DEL search in 0.786663ms!
done POST DEL search in 0.791179ms!
done POST DEL search in 0.772230ms!
done POST DEL search in 0.737571ms!
done POST DEL search in 1.238047ms!
(these results are from 0.5M records, but it doesn't matter)
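For anyone who wants to reproduce this kind of "10 runs before, 10 runs after" measurement, here is a rough sketch of a timing helper. The name `time_runs` and its signature are made up for illustration, and the real benchmark harness is not shown in this thread. It times an arbitrary callable with `std::chrono::steady_clock` and prints one line per run in the same style as the output above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Hypothetical helper: runs `op` `runs` times, prints each duration and
// returns all durations in milliseconds.
std::vector<double> time_runs(const char* label, int runs,
                              const std::function<void()>& op) {
    assert(runs >= 0);
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        op();  // e.g. a lambda that issues one FT.SUGGET call
        auto stop = std::chrono::steady_clock::now();
        double elapsed =
            std::chrono::duration<double, std::milli>(stop - start).count();
        std::printf("done %s in %fms!\n", label, elapsed);
        ms.push_back(elapsed);
    }
    return ms;
}
```

Keeping the per-run values rather than only an average makes a one-off slow first run, like the first POST DEL search above, easy to spot.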
from redisearch.
BTW thanks for providing the benchmark, it saved me tons of work figuring this shit out.
from redisearch.
BTW notice that if you're doing FUZZY prefix searches, you should limit the prefix to 3 characters minimum. IIRC the module doesn't do that on its own.
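Since the module reportedly does not enforce this itself, the limit has to live in the client. Below is a small sketch of such a guard. `build_sugget_args` and `kMinFuzzyPrefix` are hypothetical names, and the argument order simply mirrors the `FT.SUGGET userslex <prefix> MAX 10 FUZZY` call used in the benchmark above. Note that the length check counts bytes, not Unicode characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical client-side threshold, taken from the advice above.
constexpr std::size_t kMinFuzzyPrefix = 3;

// Builds the argument list for an FT.SUGGET call. FUZZY is appended only
// when the caller asks for it AND the prefix has at least 3 bytes;
// shorter prefixes silently fall back to an exact prefix search.
std::vector<std::string> build_sugget_args(const std::string& key,
                                           const std::string& prefix,
                                           int max_results,
                                           bool want_fuzzy) {
    assert(max_results > 0);
    std::vector<std::string> args{"FT.SUGGET", key, prefix, "MAX",
                                  std::to_string(max_results)};
    if (want_fuzzy && prefix.size() >= kMinFuzzyPrefix) {
        args.push_back("FUZZY");
    }
    return args;
}
```

The resulting vector could then be handed to the client library, for example via hiredis `redisCommandArgv` after converting it to arrays of pointers and lengths.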
from redisearch.
Yeah, I've noticed that during benchmarking. FUZZY searching with less than 3 characters is useless for our service anyway.
from redisearch.