turian / flann-1.2 Goto Github PK
View Code? Open in Web Editor NEWFork of FLANN 1.2, Fast Library for Approximate Nearest Neighbors
Home Page: http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN
Fork of FLANN 1.2, Fast Library for Approximate Nearest Neighbors
Home Page: http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN
FLANN - Fast Library for Approximate Nearest neighbors ====================================================== This is a library for fast approximate nearest neighbor matching. See the doc/manual.pdf file for information on how to use the library. ================ READ THIS ENTIRE SECTION BEFORE USING MY CODE Forked by Joseph Turian: http://github.com/turian/flann-1.2/tree/master I have forked flann-1.2 from: http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN in order to make the following changes: * Added a DOT distance measure. This is 1-dot(x,y). This only works with KMeans! KDtrees don't work. WARNING: I don't know if this will break FLANN if your dot-product is outside the range of [0,1], but mine never are because I pre-normalize. You could alternately implement the normalization as the cosine similarity (http://en.wikipedia.org/wiki/Cosine_similarity), but I don't do that since it would make the inner loop less efficient. * KDtrees assumes that distance is non-decreasing as you accumulate over dimensions. However, dot distance is non-*INCREASING*. So I disabled branch pruning in KDtree.h (nonetheless KDtrees don't work). * KDtrees blow up with dot distance, so I disabled KDtrees in autotuning. I also keep track of a list of FLANN issues: http://github.com/turian/flann-1.2/issues ================ See Marius Muja.html for more information about what is going on. ================ > Marius: > A small detail you have to be careful about is how the kd-tree uses this > distance function. Because the kd-trees are construced one dimension at a > time they keep track of a partial distance to which they accumulate > distances across each dimension. So the following two pieces of code must > compute the same thing: > > double d = dist(a,a+10,b); > ------------ > double tmp = dist(a,a+1,b); > double d = dist(a+1,a+10,b+1,tmp); Joseph: Okay, I need your help with something simple: I want to convert the dot similarity into a distance. 1) If I negate the similarity, is this acceptable? Or will negative distance screw up FLANN? 2_ Because I have normalized my inputs, the dot product will always be in the range [0, 1]. So I could implement the dot *distance* as: 1 - dot(x, y). If I do this, though, I am not sure what I should do about the term 1. Should I add it every time, or only when acc is nonzero, or what? I include my source code: double dot_dist(Iterator1 first1, Iterator1 last1, Iterator2 first2, double acc = 0) { double dist = acc; double dot0, dot1, dot2, dot3; Iterator1 lastgroup = last1 - 3; /* Add one to the distance because we substract the dot product. */ dist += 1; /* Process 4 items with each loop for efficiency. */ while (first1 < lastgroup) { dot0 = first1[0] * first2[0]; dot1 = first1[1] * first2[1]; dot2 = first1[2] * first2[2]; dot3 = first1[3] * first2[3]; dist -= (dot0 + dot1 + dot2 + dot3); first1 += 4; first2 += 4; } /* Process last 0-3 pixels. Not needed for standard vector lengths. */ while (first1 < last1) { dot0 = *first1++ * *first2++; dist -= dot0; } return dist; }
Dot distance code currently doesn't work, I still need to fix it.
When building the index, it make be that the underlying items to be indexed are not necessarily indicative of the query distribution.
But we may actually have samples of the sort of queries that will be used. So it would be useful to use these queries to build the index, but not include in them in the index (since we never want to retrieve them).
After you build the index, you cannot pickle it for later.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.