Git Product home page Git Product logo

kd-tree-javascript's Introduction

k-d Tree JavaScript Library

A basic but super fast JavaScript implementation of the k-dimensional tree data structure.

As of version 1.01, the library is defined as an UMD module (based on https://github.com/umdjs/umd/blob/master/commonjsStrict.js).

In computer science, a k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). k-d trees are a special case of binary space partitioning trees.

Demos

  • Spiders - animated multiple nearest neighbour search
  • Google Map - show nearest 20 out of 3000 markers on mouse move
  • Colors - search color names based on color space distance
  • Mutable - dynamically add and remove nodes

Usage

Using global exports

When you include the kd-tree script via HTML, the global variables kdTree and BinaryHeap will be exported.

// Create a new tree from a list of points, a distance function, and a
// list of dimensions.
var tree = new kdTree(points, distance, dimensions);

// Query the nearest *count* neighbours to a point, with an optional
// maximal search distance.
// Result is an array with *count* elements.
// Each element is an array with two components: the searched point and
// the distance to it.
tree.nearest(point, count, [maxDistance]);

// Insert a new point into the tree. Must be consistent with previous
// contents.
tree.insert(point);

// Remove a point from the tree by reference.
tree.remove(point);

// Get an approximation of how unbalanced the tree is.
// The higher this number, the worse query performance will be.
// It indicates how many times worse it is than the optimal tree.
// Minimum is 1. Unreliable for small trees.
tree.balanceFactor();

Using RequireJS

requirejs(['path/to/kdTree.js'], function (ubilabs) {
	// Create a new tree from a list of points, a distance function, and a
	// list of dimensions.
	var tree = new ubilabs.kdTree(points, distance, dimensions);

	// Query the nearest *count* neighbours to a point, with an optional
	// maximal search distance.
	// Result is an array with *count* elements.
	// Each element is an array with two components: the searched point and
	// the distance to it.
	tree.nearest(point, count, [maxDistance]);

	// Insert a new point into the tree. Must be consistent with previous
	// contents.
	tree.insert(point);

	// Remove a point from the tree by reference.
	tree.remove(point);

	// Get an approximation of how unbalanced the tree is.
	// The higher this number, the worse query performance will be.
	// It indicates how many times worse it is than the optimal tree.
	// Minimum is 1. Unreliable for small trees.
	tree.balanceFactor();
});

Example

var points = [
  {x: 1, y: 2},
  {x: 3, y: 4},
  {x: 5, y: 6},
  {x: 7, y: 8}
];

var distance = function(a, b){
  return Math.pow(a.x - b.x, 2) +  Math.pow(a.y - b.y, 2);
}

var tree = new kdTree(points, distance, ["x", "y"]);

var nearest = tree.nearest({ x: 5, y: 5 }, 2);

console.log(nearest);

About

Developed at Ubilabs. Released under the MIT Licence.

kd-tree-javascript's People

Contributors

aemkei avatar curzonj avatar fungiboletus avatar linusvanelswijk avatar mirceapricop avatar patricknelson avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

kd-tree-javascript's Issues

Improper node placement

Using a large dataset ( > 500 ) and removing nodes at random will cause the KDTree to not be segmented correctly. This will cause nodes to not be found correctly.

  1. Create a set of random 2d points
  2. Find the single nearest point within a random 2d point.
  3. Remove the single nearest point.
  4. Repeat from step 2 until you have no more points.

Expected outcome. We are able to remove all records from the tree.

Actual: There are records left, as the tree has made some nodes "unreachable" by not placing them correctly due to the remove function.

Let me know if you want a proof of bug.

Distance metrics such as cosine similarity don't work

When using a metric such as Cosine similarity, the returned vector is almost never the one with the smallest distance value.
Example:

Dataset
var points = [
{ 'x': 61, 'y': 54, 'id': 'A' },
{ 'x': 7, 'y': 54, 'id': 'B' },
{ 'x': 62, 'y': 58, 'id': 'C' },
{ 'x': 75, 'y': 48, 'id': 'D' },
{ 'x': 76, 'y': 52, 'id': 'E' },
{ 'x': 41, 'y': 16, 'id': 'F' },
{ 'x': 35, 'y': 35, 'id': 'G' },
{ 'x': 34, 'y': 63, 'id': 'H' },
{ 'x': 22, 'y': 32, 'id': 'I' },
{ 'x': 0, 'y': 53, 'id': 'J' },
{ 'x': 13, 'y': 91, 'id': 'K' },
{ 'x': 87, 'y': 24, 'id': 'L' },
{ 'x': 63, 'y': 2, 'id': 'M' },
{ 'x': 10, 'y': 76, 'id': 'N' },
{ 'x': 85, 'y': 98, 'id': 'O' },
{ 'x': 94, 'y': 90, 'id': 'P' },
{ 'x': 52, 'y': 47, 'id': 'Q' },
{ 'x': 100, 'y': 38, 'id': 'R' },
{ 'x': 95, 'y': 47, 'id': 'S' },
{ 'x': 62, 'y': 15, 'id': 'T' },
{ 'x': 93, 'y': 66, 'id': 'U' },
{ 'x': 46, 'y': 54, 'id': 'V' },
{ 'x': 85, 'y': 99, 'id': 'W' },
{ 'x': 32, 'y': 53, 'id': 'X' },
{ 'x': 11, 'y': 37, 'id': 'Y' },
{ 'x': 0, 'y': 54, 'id': 'Z' },
{ 'x': 90, 'y': 100, 'id': 's' },
{ 'x': 84, 'y': 58, 'id': 't' },
{ 'x': 97, 'y': 35, 'id': 'u' },
{ 'x': 24, 'y': 34, 'id': 'v' },
{ 'x': 67, 'y': 70, 'id': 'w' },
{ 'x': 16, 'y': 7, 'id': 'x' },
{ 'x': 27, 'y': 73, 'id': 'a' },
{ 'x': 0, 'y': 35, 'id': 'b' },
{ 'x': 97, 'y': 47, 'id': 'c' },
{ 'x': 44, 'y': 56, 'id': 'd' },
{ 'x': 23, 'y': 11, 'id': 'e' },
{ 'x': 3, 'y': 80, 'id': 'f' },
{ 'x': 87, 'y': 27, 'id': 'g' },
{ 'x': 42, 'y': 29, 'id': 'h' },
{ 'x': 77, 'y': 72, 'id': 'i' },
{ 'x': 40, 'y': 10, 'id': 'j' },
{ 'x': 86, 'y': 91, 'id': 'k' },
{ 'x': 43, 'y': 23, 'id': 'l' },
{ 'x': 55, 'y': 13, 'id': 'm' },
{ 'x': 88, 'y': 14, 'id': 'n' },
{ 'x': 67, 'y': 22, 'id': 'o' },
{ 'x': 88, 'y': 91, 'id': 'p' },
{ 'x': 82, 'y': 33, 'id': 'q' },
{ 'x': 97, 'y': 29, 'id': 'r' }
]
Cosine similarity (note the negative sign at the end of the function, as it would normally increase with similar vectors):
    const cosineSimilarity = (vec1, vec2) => {
        let dotProduct = 0;
        let magnitude1 = 0;
        let magnitude2 = 0;

        for (const dim of Object.keys(vec1)) {
            if (typeof (vec1[dim]) != 'number') continue;

            dotProduct += vec1[dim] * vec2[dim];
            magnitude1 += Math.pow(vec1[dim], 2);
            magnitude2 += Math.pow(vec2[dim], 2);
        }

        return -dotProduct / (Math.sqrt(magnitude1) * Math.sqrt(magnitude2));
    };

Inefficient function the get the closest vectors from an array.

    const closestVectors = (vec, vectors) => {
        const distances = vectors.map(v => cosineSimilarity(vec, v));
        const min = Math.min(...distances);
        // Check if multiple with same distance
        const minIndices = distances.reduce((a, e, i) => {
            if (e === min) a.push(i);
            return a;
        }, []);
        // Return list of closest vectors
        return { vectors: minIndices.map(i => vectors[i]), distances: minIndices.map(i => distances[i]) };
    };

Testing

    const test = { x: 200, y: 123, label: "TEST" }
    var tree = new kdTree(points, cosineSimilarity, ['x', 'y']);

    const nearest = tree.nearest(test, 1);
    const nearest2 = closestVectors(test, points);

    console.log("K-D Tree")
    console.log(nearest)
    console.log("Custom")
    console.log(nearest2)

  // > K-D Tree
  // > [ [ { x: 76, y: 52, id: 'E' }, -0.998815642613221 ] ]
  // > Custom
  // > {
  //     vectors: [ { x: 75, y: 48, id: 'D' } ],
  //     distances: [ -0.999839132267399 ]
  //   }

Clearly the tree returns a vector with a slightly larger distance (0.998 vs 0.999).

A question about Google Map demo

When I download the code about "Google Map" demo and run, there is question that

Refused to execute script from 'http://localhost:63342/kd-tree-javascript-master/examples/map/data.json' because its MIME type ('application/json') is not executable, and strict MIME type checking is enabled.

I guess that the file "data.json" is not strictly a json file. So I recommand we rename the file with 'data.js', and this can solve the problem.

bug!bug!bug!

const points = [{ 'y': 2, 'x': 110 }, { 'y': 2, 'x': 222 }, { 'y': 2, 'x': 333 }, { 'y': 2, 'x': 444 }, { 'y': 2, 'x': 555 }, { 'y': 2, 'x': 666 }, { 'y': 2, 'x': 777 }, { 'y': 2, 'x': 888 }, { 'y': 2, 'x': 999 }, { 'y': 2, 'x': 1110 }]

R = 1000

            const tree = new kdTree([...points], distance, ["x", "z"]);

            const obj = {};
            
                const nearest = tree.nearest(points[0], 10, R);

// const nearest = tree.nearest(point[points.length-1],10,R)

In a line points , the first point and the last point , get the error data

Not choosing random node if multiple nodes have same distance

I originally posted this issue to knn, but I realised that the issue was due to a limitation here (#19).

If there are 2 nearest nodes with the exact same distance to the test node, the returned node will always be the last one.
Example:

var points = [
    { x: 1, y: 2, label: 'A' },
    { x: 3, y: 4, label: 'B' },
    { x: 5, y: 6, label: 'C' },
    { x: 5, y: 6, label: 'E' },
    { x: 7, y: 8, label: 'D' },
];

var distance = function (a, b) {
    return Math.pow(a.x - b.x, 2) + Math.pow(a.y - b.y, 2);
}

var tree = new kdTree(points, distance, ["x", "y"]);

var nearest = tree.nearest({ x: 5, y: 5 }, 1);

console.log(nearest);
// > [ [ { x: 5, y: 6, label: 'E' }, 1 ] ]

Both node E and node C have the same distance from the test node (distance=1), yet node E is always returned no matter what.

Expected behaviour:
If multiple nodes could be chosen as the nearest node and maxNodes is set to 1, the returned node should be chosen at random.

Load from existing tree

Functionality to recreate a KDTree index from raw Array Buffer data
Like this: KDTree.from(data)

A question about dimension parameter

I have items like that:

[{
   someField1:"someValue",
   someField2:"someValue",
   coordinate: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, ...]  //more than 100 dimensions
}, 
....
]

What I have to set in dimension parameter for correct work?

Bugs with quickly removing and inserting elements.

Hello and thank you for creating this library. I am creating a game that uses this library to store all enemy locations for fast nearest neighbor search. Each enemies location changes about once a second and so I remove them and insert them back into the graph to update (this may not be the best way to do it). The following coffeescript demonstrates some errors that are occurring.

enemyCount: () ->
  countR = (n) ->
    return 0 if n is null
    return 1 + countR(n.left) + countR(n.right)
  countR(@enemyKdTree.root)

#using axial coord hex grid distance function
distanceFun = (a,b) -> Hex::distanceTo.call a, b
@enemyKdTree = new kdTree [], distanceFun, ['q', 'r']

#create 
foo = []
for i in [0..200] by 1
  foo[i] = { q: Math.randInt(0,30), r: Math.randInt(0,30) }
  @enemyKdTree.insert foo[i]

for i in [0..200] by 1
  @enemyKdTree.remove foo[i]
  #update fields leaving obj ref the same
  foo[i].q = Math.randInt(0,30)
  foo[i].r = Math.randInt(0,30)
  @enemyKdTree.insert foo[i]

console.log enemyCount()
#This should remain 200 butgets number that are super off and different each time!
# First few runs = [285, 281, 290, 235]

The error rate being proportional to scale and being non deterministic makes it feel like a concurrency issue. Any thoughts? I'll try going through the code also.

Thanks

Add TS Support

Adding TS Typing adds many benefits such as improved IDE intellisense and general TS functionality

Proposed approach

  • Migrate the current project to use a basic implementation of TypeScript
  • Remove UMD, add UMD as a TS build target.
  • Use TS/ES6 class for the kdTree class
  • Use a class-level generic type to specify the return type for method calls. - this should not affect JS execution, but will provide a useful failure to typescript users.

I will PR for this, but I may publish an NPM package if I don't hear back within a month.

The remove method has errors (not really removing some nodes)

After having removed some nodes, they can still be found in the kdTree in some situations.

Here is an example of the remove method errors:

https://gist.github.com/santoxi/ba8d283f9a21523fa45d23d8e9c3046e

as a temporary work-around, I added a fourth argument to "nearest" to pass a predicate to filter the matched nodes, which might be also useful to add:

    this.nearest = function (point, maxNodes, maxDistance, predicate) {
			predicate = predicate || (x => true);
....
        linearDistance = metric(linearPoint, node.obj);

        if (node.right === null && node.left === null) {
          if (predicate(node.obj) && (bestNodes.size() < maxNodes || ownDistance < bestNodes.peek()[1])) {
            saveNode(node, ownDistance);
          }
          return;
        }
....
        nearestSearch(bestChild);

        if (predicate(node.obj) && (bestNodes.size() < maxNodes || ownDistance < bestNodes.peek()[1])) {
          saveNode(node, ownDistance);
        }

bug!bug!bug!

const points = [{ 'y': 2, 'x': 110 }, { 'y': 2, 'x': 222 }, { 'y': 2, 'x': 333 }, { 'y': 2, 'x': 444 }, { 'y': 2, 'x': 555 }, { 'y': 2, 'x': 666 }, { 'y': 2, 'x': 777 }, { 'y': 2, 'x': 888 }, { 'y': 2, 'x': 999 }, { 'y': 2, 'x': 1110 }]

R = 1000

            const tree = new kdTree([...points], distance, ["x", "z"]);

            const obj = {};
            
                const nearest = tree.nearest(points[0], 10, R);

// const nearest = tree.nearest(point[points.length-1],10,R)

In a line points , the first point and the last point , get the error data

I think I'm removing nodes too quickly

I've run into a problem where I do something like:

for node in nodesNotInTree {
   var closestNode = tree.nearest(node, 1)[0][0];
   tree.remove(closestNode);
}

This is an attempt to match every node in nodesNotInTree to a unique node in tree. However, for some reason, this method of removal isn't working properly. Instead, closestNode is the closest node from the original tree, instead what it should be: of the closest node in the tree that hasn't yet been accessed.

Hope that explained my problem clearly. Am I doing things in the wrong order?

Performance drop since 0.12

We've found a performance drop in our production code when migrating to node 0.12.
Here is a benchmark that reproduces it (based on the readme sample):

var kdtree = require('./kdtree'); // an npm module would be awesome ;-)
var Benchmark = require('benchmark');

var points = [
  {x: 1, y: 2},
  {x: 3, y: 4},
  {x: 5, y: 6},
  {x: 7, y: 8}
];

var distance = function(a, b){
  return Math.pow(a.x - b.x, 2) +  Math.pow(a.y - b.y, 2);
}
var tree = new kdTree(points, distance, ["x", "y"]);

var suite = new Benchmark.Suite();
suite.add('Tree#nearest', function() {
    tree.nearest({ x: 5, y: 5 }, 2);
})
.on('cycle', function(event) {
    console.log(String(event.target));
})
.run({ 'async': true });

On node 0.10.24 (tested on 0.10.39 with similar results)

Tree#nearest x 321,548 ops/sec ±1.68% (92 runs sampled)

On node 0.12.6 (tested on 0.12.0 with similar results)

Tree#nearest x 82,008 ops/sec ±3.13% (75 runs sampled)

Any idea on the matter?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.