Git Product home page Git Product logo

ddrt's Introduction

CircleCI LICENSE VERSION

:ddrt (README)

A Dynamic, Distributed R-Tree (DDRT) library written in Elixir. The 'dynamic' part of the title refers to the fact that this implementation is optimized for a high volume of update operations. Put another way, this is an R-tree best suited for use with spatial data in constant movement. The 'distributed' part refers to the fact that this library is designed to maintain a spatial index (rtree) across a cluster of distributed elixir nodes.

The library uses @derekkraan's MerkleMap and CRDT implementations to ensure reliable, "eventually consistent" distributed behavior.

The complete documentation is available on hexdocs. You can find the hex package here.

Getting Started

Installation

The package can be installed by adding ddrt to your list of dependencies in mix.exs:

def deps do
  [
    {:ddrt, "~> 0.2.1"}
  ]
end

Starting DDRT Processes

Start up a DDRT process with default values

DDRT.start_link([
  name: DDRT
  width: 6,
  verbose: false,
  seed: 0
])

Or add it to your supervision tree:

Supervisor.start_link([
  {DDRT, [
  	name: DDRT,
  	width: 6,
  	verbose: false,
  	seed: 0
  ]}
], [name: MySupervisor])

Otherwise if you're just looking to use the standalone R-tree functionality on a single machine (not a cluster of machines), you would instead use the DDRT.DynamicRtree module:

DDRT.DynamicRtree.start_link([name: DynamicRtree])

Note: all configuration parameters and public API methods are exactly the same between the DDRT and DDRT.DynamicRtree modules.

Configuration

Available configuration parameters are:

  • name: The name of the DDRT process. Defaults to DDRT
  • width: The max number of children a node may have. Defaults to 6
  • verbose: allows Logger to report console logs. (Also decreases performance). Defaults to false.
  • seed: Sets the seed value for the pseudo-random number generator which generates the unique IDs for each node in the tree. This is a deterministic process; so the same seed value will guarantee the same pseudo-random unique IDs being generated for your tree in the same order each time. Defaults to 0

Replicating your R-Tree in a cluster

First it's important to understand that distributed networking capabilities come built-in with Erlang. To get Elixir processes communicating amongst themselves over a network in general, we first have to use that fundamental Erlang networking magic to make all of the running Erlang Virtual Machines "aware" of eachother's existence on the network. In Elixir, these concepts are expressed in the Node module. One can use Node.connect/2 to make two Erlang VM nodes aware of eachother, and then Elixir processes are able to send messages to eachother on those nodes.

Connecting up the Erlang VMs in your cluster is outside of the scope of this package. There are already other libraries in Elixir designed to do exactly this. Possibly the best example is bitwalker/libcluster.

A very simple libcluster configuration for quick and easy development might look like:

## config.exs ##

use Mix.Config
config :libcluster,
topologies: [
 example: [
   strategy: Cluster.Strategy.Epmd,
   config: [hosts: [:"a@localhost", :"b@localhost"]],
 ]
]

Then you would have to pass in those same node names to iex when you start your application, like:

eduardo@ddrt $ iex --name a@localhost -S mix
iex(a@localhost)1>

eduardo@ddrt $ iex --name b@localhost -S mix
iex(b@localhost)1>

Finally, after starting DDRT on each node you would use DDRT.set_members/2 to begin communication between DDRT processes like this:

# on node A:
{:ok, _pid} = DDRT.start_link([name: DDRT])
DDRT.set_members(DDRT, [{DDRT, :b@localhost}])

# on node B:
{:ok, _pid} = DDRT.start_link([name: DDRT])
DDRT.set_members(DDRT, [{DDRT, :a@localhost}])

From here, everything done on either tree will be reflected in the other tree.

Note: it's important that you have the same configuration parameters for each DDRT process running on each connected node in your cluster.

Usage

Starts a local DDRT named :peter

iex> DDRT.start_link([name: :peter])
{:ok, #PID<0.214.0>}

Insert "Griffin" into the :peter DDRT

iex> DDRT.insert({"Griffin", [{4,5}, {6,7}]}, :peter)
{:ok,
  %{
    43143342109176739 => {["Griffin"], nil, [{4, 5}, {6, 7}]},
    :root => 43143342109176739,
    :ticket => [19125803434255161 | 82545666616502197],
    "Griffin" => {:leaf, 43143342109176739, [{4, 5}, {6, 7}]}
}}

Insert "Parker" on into the :peter DDRT

iex> DDRT.insert({"Parker",[{10,11},{16,17}]},:peter)
{:ok,
  %{
    43143342109176739 => {["Parker", "Griffin"], nil, [{4, 11}, {6, 17}]},
    :root => 43143342109176739,
    :ticket => [19125803434255161 | 82545666616502197],
    "Griffin" => {:leaf, 43143342109176739, [{4, 5}, {6, 7}]},
    "Parker" => {:leaf, 43143342109176739, [{10, 11}, {16, 17}]}
}}

Query which leaves in the :peter R-tree overlap with box [{0,7},{4,8}]

iex> DDRT.query([{0,7},{4,8}],:peter)
{:ok, ["Griffin"]}

Updates "Griffin" bounding box in the :peter R-tree

iex> DDRT.update("Griffin", [{-6,-5},{11,12}], :peter)
{:ok,
  %{
    43143342109176739 => {["Parker", "Griffin"], nil, [{-6, 11}, {6, 17}]},
    :root => 43143342109176739,
    :ticket => [19125803434255161 | 82545666616502197],
    "Griffin" => {:leaf, 43143342109176739, [{-6, -5}, {11, 12}]},
    "Parker" => {:leaf, 43143342109176739, [{10, 11}, {16, 17}]}
}}

Repeat the last query again. (This time "Griffin" is no longer within the query bounding box.)

 iex> DDRT.query([{0,7},{4,8}], :peter)
 {:ok, []}

Now lets delete both "Griffin" and "Parker" keys from the tree.

iex> DDRT.delete(["Griffin","Parker"], :peter)
{:ok,
  %{
    43143342109176739 => {[], nil, [{0, 0}, {0, 0}]},
    :root => 43143342109176739,
    :ticket => [19125803434255161 | 82545666616502197]
}}

Bounding-Box Format

[{x_min,x_max}, {y_min,y_max}]

Example:                               & & & & & y_max & & & & &
  A unit at pos x: 10, y: -12 ,        &                       &
  with x_size: 1 and y_size: 2         &                       &
  would be represented with            &          pos          &
  the following bounding box         x_min       (x,y)       x_max
  [{9.5,10.5},{-13,-11}]               &                       &
                                       &                       &
                                       &                       &
                                       & & & & & y_min & & & & &

Standalone (non-distributed) R-tree mode

If you're only interested in using an R-tree on a single machine in Elixir, you should be using the DDRT.DynamicRtree module. This module is optimized to run on a single machine, and as such the r-tree is significantly faster without the distribution overhead.

The DDRT.DynamicRtree module shares the same API and initialization options as the main DDRT module.

Benchmarks

Operating System: macOS
CPU Information: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
Number of Available Cores: 4
Available memory: 8 GB
Elixir 1.9.0
Erlang 22.0.7

Delete

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: delete all leaves of tree [1000]
Estimated total run time: 28 s

##### With input delete all leaves of tree [1000] #####
Name                       ips        average  deviation         median         99th %
map bulk                175.20        5.71 ms     ±9.18%        5.60 ms        9.47 ms
merklemap bulk           80.27       12.46 ms    ±21.27%       11.74 ms       25.37 ms
map 1 by 1                4.68      213.68 ms     ±3.12%      213.24 ms      227.16 ms
merklemap 1 by 1          1.55      643.75 ms    ±14.80%      616.84 ms      878.20 ms

Comparison: 
map bulk                175.20
merklemap bulk           80.27 - 2.18x slower +6.75 ms
map 1 by 1                4.68 - 37.44x slower +207.97 ms
merklemap 1 by 1          1.55 - 112.79x slower +638.04 ms

Update

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 10 s
memory time: 0 ns
parallel: 1
inputs: all leaves of tree [1000], all leaves of tree [100000]
Estimated total run time: 48 s

##### With input all leaves of tree [1000] #####
Name                ips        average  deviation         median         99th %
map              133.88        7.47 ms    ±22.82%        6.92 ms       14.83 ms
merklemap         65.74       15.21 ms    ±21.93%       14.18 ms       26.42 ms

Comparison: 
map              133.88
merklemap         65.74 - 2.04x slower +7.74 ms

##### With input all leaves of tree [100000] #####
Name                ips        average  deviation         median         99th %
map                0.68         1.46 s    ±15.84%         1.47 s         1.82 s
merklemap          0.33         3.01 s     ±8.23%         3.09 s         3.21 s

Comparison: 
map                0.68
merklemap          0.33 - 2.06x slower +1.55 s

Query

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: 100x100 box query, 10x10 box query, 1x1 box query, world box query
Estimated total run time: 56 s

##### With input 100x100 box query #####
Name                ips        average  deviation         median         99th %
merklemap        299.97        3.33 ms    ±28.87%        3.03 ms        6.92 ms
map              268.51        3.72 ms    ±36.46%        3.35 ms        8.57 ms

Comparison: 
merklemap        299.97
map              268.51 - 1.12x slower +0.39 ms

##### With input 10x10 box query #####
Name                ips        average  deviation         median         99th %
map              1.50 K      667.16 μs    ±37.04%         594 μs     1557.56 μs
merklemap        1.01 K      992.92 μs    ±48.86%         883 μs     2418.52 μs

Comparison: 
map              1.50 K
merklemap        1.01 K - 1.49x slower +325.76 μs

##### With input 1x1 box query #####
Name                ips        average  deviation         median         99th %
map              2.01 K      498.54 μs    ±39.28%         430 μs        1257 μs
merklemap        1.51 K      660.89 μs    ±45.08%         603 μs     1551.25 μs

Comparison: 
map              2.01 K
merklemap        1.51 K - 1.33x slower +162.34 μs

##### With input world box query #####
Name                ips        average  deviation         median         99th %
map              156.18        6.40 ms    ±18.51%        5.99 ms       10.70 ms
merklemap        152.11        6.57 ms    ±26.12%        5.92 ms       13.93 ms

Comparison: 
map              156.18
merklemap        152.11 - 1.03x slower +0.171 ms

Insert

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: 1000 leaves
Estimated total run time: 28 s

##### With input 1000 leaves #####
Name                       ips        average  deviation         median         99th %
map bulk                305.53        3.27 ms    ±39.39%        2.79 ms        7.96 ms
merklemap bulk          190.61        5.25 ms    ±62.06%        4.37 ms       17.65 ms
map 1 by 1               66.73       14.99 ms     ±4.63%       14.78 ms       19.11 ms
merklemap 1 by 1         23.00       43.48 ms    ±23.79%       39.24 ms       81.16 ms

Comparison: 
map bulk                305.53
merklemap bulk          190.61 - 1.60x slower +1.97 ms
map 1 by 1               66.73 - 4.58x slower +11.71 ms
merklemap 1 by 1         23.00 - 13.28x slower +40.21 ms

ddrt's People

Contributors

lubiepiwo avatar sikanrong avatar vhf avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

ddrt's Issues

Option to choose between Rtree or QuadTree as underlying data structure

We would really like to extend the library in such a way that the end-user can choose between an RTree or a QuadTree for the underlying data structure, while maintaining the distributed nature of the library.

It's not entirely clear when we'll have the time to make this change, but we would appreciate anybody willing to work with us to integrate this functionality into our library via a pull-request.

Crash when inserting bounding box with triangular or zero area

I've been using this lib with great success but recently I've been hitting a bug with small bounding boxes crashing DDRT.

Isolating the issue from the stacktrace and getting to a minimal repro was pretty hard but after some debugging time I got there:

This is fine:

DynamicRtree.new(type: MerkleMap)
{:ok, t} = DynamicRtree.insert({0, [{9, 10}, {9, 10}]})
{:ok, t} = DynamicRtree.insert({1, [{9, 9}, {9, 9.1}]})

fine as well:

DynamicRtree.new(type: MerkleMap)
{:ok, t} = DynamicRtree.insert({0, [{9, 9.1}, {9, 9.1}]})

but this crashes (notice the bounding box is a really small triangle):

DynamicRtree.new(type: MerkleMap)
{:ok, t} = DynamicRtree.insert({1, [{9, 9}, {9, 9.1}]})

and this crashes as well (the bounding box has an area of 0):

DynamicRtree.new(type: MerkleMap)
{:ok, t} = DynamicRtree.insert({1, [{9, 9}, {9, 9}]})

The crash looks like this:

17:46:17.597 [error] GenServer DDRT terminating
** (ArgumentError) argument error
    :erlang.tl([])
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree_impl/utils.ex:44: DDRT.DynamicRtreeImpl.Utils.combine_multiple/1
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree_impl.ex:263: anonymous fn/3 in DDRT.DynamicRtreeImpl.add_entry/3
    (elixir 1.11.2) lib/map.ex:818: Map.update!/3
    (merkle_map 0.2.0) lib/merkle_map.ex:221: MerkleMap.update!/3
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree_impl.ex:262: DDRT.DynamicRtreeImpl.add_entry/3
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree_impl.ex:246: DDRT.DynamicRtreeImpl.insertion/3
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree_impl.ex:79: DDRT.DynamicRtreeImpl.tree_insert/2
    (ddrt 0.2.1) lib/ddrt/dynamic_rtree.ex:506: DDRT.DynamicRtree.handle_call/3
    (stdlib 3.13.2) gen_server.erl:706: :gen_server.try_handle_call/4
    (stdlib 3.13.2) gen_server.erl:735: :gen_server.handle_msg/6
    (stdlib 3.13.2) proc_lib.erl:226: :proc_lib.init_p_do_apply/3

It only seems to appear when (1) the tree is empty, and (2) the bounding box is either a triangle or has a 0 area.

I gave solving this a few attempts but without success.

Here are two failing tests reproducing the issue:

    test "MerkleMap inserts a triangular bounding box without crash" do
      DynamicRtree.new(type: MerkleMap)

      {:ok, t} = DynamicRtree.insert({1, [{9, 9}, {9, 9.1}]})

      assert t == DynamicRtree.tree()

      root = t |> MerkleMap.get(:root)
      {ch, _root_ptr, root_box} = t |> MerkleMap.get(root)
      assert t |> Enum.to_list() |> length == t |> Enum.uniq() |> length
      assert length(ch) == 1
      assert root_box == [{9, 9}, {9, 9.1}]
    end

    test "MerkleMap inserts a zero area bounding box without crash" do
      DynamicRtree.new(type: MerkleMap)

      {:ok, t} = DynamicRtree.insert({1, [{9, 9}, {9, 9}]})

      assert t == DynamicRtree.tree()

      root = t |> MerkleMap.get(:root)
      {ch, _root_ptr, root_box} = t |> MerkleMap.get(root)
      assert t |> Enum.to_list() |> length == t |> Enum.uniq() |> length
      assert length(ch) == 1
      assert root_box == [{9, 9}, {9, 9}]
    end

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.