Git Product home page Git Product logo

indextree's Introduction

indextree

GitHub Actions Coverage Dependency Status Doc indextree License MIT Crates.io doc.rs

Arena based tree structure with multithreading support

This arena tree structure is using just a single Vec and numerical identifiers (indices in the vector) instead of reference counted pointers. This means there is no RefCell and mutability is handled in a way much more idiomatic to Rust through unique (&mut) access to the arena. The tree can be sent or shared across threads like a Vec. This enables general multiprocessing support like parallel tree traversals.

Example usage

use indextree::Arena;

// Create a new arena
let arena = &mut Arena::new();

// Add some new nodes to the arena
let a = arena.new_node(1);
let b = arena.new_node(2);

// Append a to b
assert!(a.append(b, arena).is_ok());
assert_eq!(b.ancestors(arena).into_iter().count(), 2);

Benchmarks

https://github.com/mooman219/generational_arena_bench

indextree's People

Contributors

clbarnes avatar coderedart avatar datatriny avatar dependabot-preview[bot] avatar jistr avatar lisoph avatar lo48576 avatar m-adoo avatar phrohdoh avatar ramn avatar raymontag avatar saschagrunert avatar seifane avatar slivering avatar teymour-aldridge avatar voidxnull avatar waywardmonkeys avatar wlinna avatar yonifeng avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

indextree's Issues

Add Arena::with_capacity

When creating large trees, inserting many nodes causes the underlying Vec to reallocate frequently.
Arena::with_capacity would simply create allocate the Vec beforehand.

fn with_capacity(capacity: usize) -> Self {
    Self {
        nodes: Vec::with_capacity(capacity),
        first_free_slot: None,
        last_free_slot: None,
    }
}

I would also suggest adding the Arena::clear function which simply clears the Vec, causing all the created NodeIds to be invalid.
This is equivalent to removing all the Nodes while retaining the Vec capacity.
Unless I forgot something, I'm rather confident that such an implementation would not cause soundness issues, e.g. with the iterators.

Iterate over Arena in Depth-first order

Hi,

Is there anyway to iterate over all the nodes in an Arena in a depth-first order (iter seems to be in storage order). At the moment, I'm tracking all the root nodeid's (during insertion) in a separate vec, and, then iterating over each of their descendants. I was wondering, if there is a better way to do this? Thanks

Prepend to the node with no children

indextree/tests/lib.rs

Lines 39 to 45 in 0e51295

#[test]
fn failure_prepend() {
let arena = &mut Arena::new();
let a = arena.new_node(1);
let b = arena.new_node(2);
assert!(!a.prepend(b, arena).is_ok());
}

I think prepend() to the leaf node should success as append() does, but currently it is treated as invalid.
Is there any reason to forbid adding the only child by prepend()?

Unsafe in get_pair_mut

I noticed there is a usage of unsafe in get_pair_mut. Do you think this could be avoided by using &[].split_at_mut()? Something along the lines of

let (xs, ys) = self.split_at_mut(max(a, b));
(&mut xs[min(a, b)], &mut ys.first)

pretty print indextree

The documentation has many instances of nicely-formatted ASCII representations of tree state.
Is there a code sample of rendering to this format? Or better yet, a debug API to do so, possibly by fmt::Display or similar?

Iterating over all nodes while modifying them

Hello,

I was wondering if there was a way to iterate over all nodes of an Arena and modify them.
My program requires that I go over all nodes, which contain a struct, and modify a value inside that struct. Order is not important I just need to be able do it on all nodes.
However there doesn't seem to be a way to get a mutable iterator even with rayon. Is this a design quirk ? Am I missing something here ?

Thanks !

Clarify order of Descendants, Traverse, and ReverseTraverse

I've had a look around and it doesn't seem like "tree order" has any meaning when it comes to graph traversal - rather, it relates to the number of children each node can have. The language in common use for tree traversal seems to be "depth first" and "breadth first", in "pre order", "post order", or "level order".

support no_std

It will be great if you can add no_std support for indextree. It won't be hard because indextree doesn't depend on any specific library right now.

feature request: `remove_children` and `detach_children`

When dealing with the tree structure, sometimes we want to remove the all children nodes of a specific node but keeping the node itself. For example, removing anything between tags <p> and </p>.

Here is a snippet demonstrating how to implement detach_children with indextree:

fn detach_children<T>(self, arena: &mut Arena<T>) {
    for child in self.children(arena) {
        let node = arena[child];
        node.parent = None;
        node.next_sibling = None;
        node.previous_sibling = None;
    }
    
    let node = arena[self];
    node.first_child = None;
    node.last_child = None;
}

I haven't tried it yet, so it may not be compiled.

NodeId::remove_subtree does not remove the whole subtree sometimes

The preorder traversal in NodeId::remove_subtree is incorrect. Upon return upwards in the tree, if a cursor's parent doesn't have a sibling, the algorithm terminates without looking at more distant ancestors which still might have some siblings that need removing.

In the following example arena, when removing node 1_2, node 1_2_2 should be marked as removed but it isn't:

arena                                                                                                                                                                                                                                                                                 
`-- 1                                                                                                                                                                                                                                                                                 
    |-- 1_1                                                                                                                                                                                                                                                                           
    |-- 1_2                                                                                                                                                                                                                                                                           
    |   |-- 1_2_1                                                                                                                                                                                                                                                                     
    |   |   `-- 1_2_1_1                                                                                                                                                                                                                                                               
    |   |       `-- 1_2_1_1_1                                                                                                                                                                                                                                                         
    |   `-- 1_2_2                                                                                                                                                                                                                                                                     
    `-- 1_3                                                                                                                                                                                                                                                                           

Pull request incoming πŸ™‚.

Accessing nodes on a given Arena

How do you get access to an iterator on an Arena? For example, if a function returns a created Arena with appended Nodes, how do I start traversing or indexing? According to the docs, Arena itself doesn't seem to have any methods that let you do this without an appended NodeId included separately.

Problems around CI stuff

Files introduced by #45.

About .gitignore:

If you’re building a non-end product, such as a rust library that other rust packages will depend on, put Cargo.lock in your .gitignore.

--- Cargo.toml vs Cargo.lock - The Cargo Book

If CI really needs the lockfile, it should be prepared by the CI script, instead of including Cargo.lock to the project.
(For example, include Cargo.lock.ci to the project and rename it to Cargo.toml before the build phase runs.)

About .rustfmt.toml:

.rustfmt.toml should have only non-default entries, or empty.
If it has unstable entries (such as indent_style), they prevents stable rustfmt from using the config, and required_verison = "1.0.3" prevents other nightly rustfmt from formatting the code.

I think using default config is the best, and then .rustfmt can be removed.

Bad NodeId.index?

You are using Λ‹self.nodes.len()` to get the next index, but if you add 2 nodes, then remove the first one, then add a node, both nodes will have the same index right ?

Iteratation through children first

Hi,
I'd like to get an iteration through children first. Here are examples:

use indextree::Arena;

pub fn test_arena() {
  let mut arena = Arena::new();
  let n1 = arena.new_node(("1", 0));
  let n1_1 = arena.new_node(("1_1", 1));
  n1.append(n1_1, &mut arena);
  let n1_2 = arena.new_node(("1_2", 1));
  n1.append(n1_2, &mut arena);
  {
    let n1_2_1 = arena.new_node(("1_2_1", 2));
    n1_2.append(n1_2_1, &mut arena);
  }
  let n1_3 = arena.new_node(("1_3", 1));
  n1.append(n1_3, &mut arena);
  {
    let n1_3_1 = arena.new_node(("1_3_1", 2));
    n1_3.append(n1_3_1, &mut arena);
    let n1_3_2 = arena.new_node(("1_3_2", 2));
    n1_3.append(n1_3_2, &mut arena);
    let n1_3_3 = arena.new_node(("1_3_3", 2));
    n1_3.append(n1_3_3, &mut arena);
  }

  let it = arena.iter();
  for a in it {
    let (v, level) = a.get();
    println!("{}", format!("{:>width$}", v, width = 4*level));
  }
}

pub fn test_arena2() {
  let mut arena = Arena::new();
  let n1 = arena.new_node(("1", 0));
  let n1_1 = arena.new_node(("1_1", 1));
  n1.append(n1_1, &mut arena);
  let n1_2 = arena.new_node(("1_2", 1));
  n1.append(n1_2, &mut arena);
  let n1_3 = arena.new_node(("1_3", 1));
  n1.append(n1_3, &mut arena);

  {
    let n1_3_1 = arena.new_node(("1_3_1", 2));
    n1_3.append(n1_3_1, &mut arena);
    let n1_3_2 = arena.new_node(("1_3_2", 2));
    n1_3.append(n1_3_2, &mut arena);
    let n1_3_3 = arena.new_node(("1_3_3", 2));
    n1_3.append(n1_3_3, &mut arena);
  }
  {
    let n1_2_1 = arena.new_node(("1_2_1", 2));
    n1_2.append(n1_2_1, &mut arena);
  }

  let it = arena.iter();
  for a in it {
    let (v, level) = a.get();
    println!("{}", format!("{:>width$}", v, width = 4*level));
  }
}

Actually I got 2 things differents:

For the first test:
1_1
1_2
1_2_1
1_3
1_3_1
1_3_2
1_3_3

For the second test:
1
1_1
1_2
1_3
1_3_1
1_3_2
1_3_3

What I want is the first result, how can I do please?

Regards

Suggestion: Panic when appending to node that has `self` node as child.

Hi,
thanks for writing this nice library.
For some reason I thought I could use indextree as a graph structure. Later I realized that this library is just for tree structures. And I got such strange behavior:

        let arena = &mut Arena::new();

        let start = arena.new_node("start");
        let a = arena.new_node("A");
        let b = arena.new_node("b");
        let c = arena.new_node("c");

        start.append(a, arena);
        a.append(start, arena);
        start.append(b, arena);
        b.append(start, arena);
        assert_eq!(start.children(arena).into_iter().count(), 2);

        a.append(c, arena);
        c.append(a, arena);

        assert_eq!(start.children(arena).into_iter().count(), 2); //FAILED: count is 1

I guess doing "proper" validation to check if I'm building graph is difficult. For example this gives infinite look:

        let arena = &mut Arena::new();

        let a = arena.new_node("a");
        let b = arena.new_node("b");
        let c = arena.new_node("c");

        a.append(b, arena);
        b.append(c, arena);
        c.append(a, arena);

        assert_eq!(a.children(arena).into_iter().count(), 1);
        assert_eq!(a.descendants(arena).into_iter().count(), 3); // infinite loop

But in my case just panic when trying to "connect" nodes to each other would give me earlier feedback that I'm using this library in a wrong way. I propose that for second append the code below should panic (I hope it's easy to check it, I didn't look at the source code)

        let arena = &mut Arena::new();

        let a = arena.new_node("a");
        let b = arena.new_node("b");

        a.append(b, arena);
        b.append(a, arena); //IMO this should panic

        assert_eq!(a.children(arena).into_iter().count(), 1);
        assert_eq!(a.descendants(arena).into_iter().count(), 2); // infinite loop

BTW, the example code on https://crates.io/crates/indextree is not compiling (assert!(a.append(b, arena).is_ok()); line)

Use `NonZeroUsize` for internal representation of `NodeId`

If NodeId uses NonZeroUsize, Option<NodeId> can be same size as NodeId.
(On the other hand, with current implementation by usize, Option<NodeId> would be twice larger than NodeId.)
Using NonZeroUsize reduces size of many types (even if user-defined) which have Option<NodeId>s.

There is a downside: the internal numbers should be decremented before they are used as vector index.

As far as I know, string-interner crate does this: use NonZeroU32 and -1 when they are used as index.

Should insertions return `Result` type?

Insertions can fail if the preconditions are not met, but they are very very simple.
Preconditions:

  • NodeId::append(), NodeId::prepend():
    • self != new_child.
  • NodeId::insert_before(), NodeId::insert_after():
    • self != new_sibling.

They are sometimes ensured by the context, and then users are sure insertions do not fail.
In such situation, it seems redundant to unwrap many results.
(See https://github.com/saschagrunert/indextree/pull/39/files#diff-e7bc1f36b79d2e2d0030318c5fe20910 for example.)

I'm not sure which is better...
Both interface (panicking or not panicking when preconditions are not met) seems reasonable.

Panic on semantic error, rather than returning error?

It seems indextree avoids panicking, even if there are semantic error (caused by broken inconsistent data).

For example:

indextree/src/lib.rs

Lines 460 to 471 in 0e51295

if let Some((self_borrow, new_child_borrow)) =
arena.nodes.get_tuple_mut(self.index0(), new_child.index0())
{
new_child_borrow.parent = Some(self);
last_child_opt =
mem::replace(&mut self_borrow.last_child, Some(new_child));
if let Some(last_child) = last_child_opt {
new_child_borrow.previous_sibling = Some(last_child);
} else {
if self_borrow.first_child.is_some() {
bail!(NodeError::FirstChildAlreadySet);
}

In this code, if self_borrow.last_child was None, self_borrow.first_child should be also None, if indextree is working correctly.
If it is not, then the data can (and may) have been broken in unexpected way, and it seems meaningless to do more operation on that broken data.
Additionally, results are returned as Fallible<_> and it does type erasure, so users might easily miss such semantic bug.

I think semantic errors should be caught by panicking (typically .expect() on Result and Option) and assert macros.

Appending to tree in parallel

Hello,

I'd like to append to my indextree arena in parallel through rayon, however I'm running into issues with the borrow checker. What's the recommended way to circumvent this?

allow getting node using usize as an index.

I need to pass around the NodeID via ffi functions. And because its an opaque rust struct, I decided to simply get the usize and use that as the node_id for the sake of simplicity.

But, NodeID also has a version, making it impossible to get the original NodeID from just the usize.

It would be nice if the arena had a function to get the current NodeID at the index usize (or None if it doesn't exist).

Parallel tree traversal

Hi there!

The README mentions parallel tree traversals over an Arena but I couldn't find any related method on the docs.

How could one do so?

Also, are there any plans to support rayon's parallel iterators ?

Great lib, btw πŸ‘

Easier access to underlying index

I need to access the underlying index of a NodeId.

Background:
I'm using indextree to implement an octree. Each node should also know which nodes are not visible from that node. My NodeData looks like this:

pub struct NodeData {
   ... 
    pub occludees: Vec<NodeId>
}

Now I need to serialize my tree in JSON format, but the problem is that I cannot access NodeId's underlying index. I also tried to use Serialize attribute (after activating deser feature), which gets me somewhere, but the index is serialized as following

{
     "index1": 51
}

Would it be possible to add an easy way to access the underlying index?
Thanks!

bug: `NodeStamp` does not work as expected when reusing nodes

First of all, I'd like to express my appreciation for your excellent work!

However, I've found a bug that NodeStamp does not work as expected when reusing nodes.

Problems

indextree/src/id.rs

Lines 40 to 64 in 3eee154

impl NodeStamp {
pub fn is_removed(self) -> bool {
self.0.is_negative()
}
pub fn as_removed(&mut self) {
debug_assert!(!self.is_removed());
self.0 = if self.0 < i16::MAX {
-self.0 - 1
} else {
-self.0
};
}
pub fn reuseable(self) -> bool {
debug_assert!(self.is_removed());
self.0 > i16::MIN
}
pub fn reuse(&mut self) -> Self {
debug_assert!(self.reuseable());
self.0 = -self.0;
*self
}
}

New nodes are allocated with NodeStamp(0). When we decide to remove it, as_removed will be called so the stamp of the removed node will be turned to -1. Then if it's reused, the stamp will become 1. If we repeat this, the values of this stamp form a sequence of...

0, -1, 1, -2, 2, ..., 32766, -32767

Here comes the problem: -32767 is still considered reusable since it's greater than i16::MIN = -32768. So if we reuse it, then the stamp will become 32767. However, if we remove it again, we'll get a stamp of -32767 since self.0 < i16::MAX is now false, so we're actually reusing a stamp!

32766, -32767, 32767, -32767, 32767, -32767...

This breaks the assumption that we should use different stamps for different nodes, and the interfaces relying on this may behave incorrectly. For example...

use indextree::Arena;

fn main() {
    let mut tree: Arena<i32> = Arena::new();

    for i in 0usize.. {
        let id = tree.new_node(42);
        assert!(!id.is_removed(&tree));

        id.remove(&mut tree);
        assert!(id.is_removed(&tree));

        let new_id = tree.new_node(42);
        assert!(!new_id.is_removed(&tree));a
        assert!(id.is_removed(&tree), "i: {i}, id: {id:?}, new_id: {new_id:?}"); // <--

        new_id.remove(&mut tree);
    }
}

This code above panics during the 16385th loop.

thread 'main' panicked at 'i: 16384, id: NodeId { index1: 1, stamp: NodeStamp(32767) }, new_id: NodeId { index1: 1, stamp: NodeStamp(32767) }', src/bin/index_tree.rs:18:9

Possible Solutions

The root cause of this problem is that reusable always return true, which is not expected. So an intuitive solution could be correctly implementing this function.

  pub fn reuseable(self) -> bool {
      debug_assert!(self.is_removed());
-     self.0 > i16::MIN
+     self.0 > i16::MIN + 1
  }

However, I suppose it's worthwhile revisiting the meaning of stamp's existence. The ultimate purpose of reusing a node is to save memory, so using i16 seems not that convincing here. A node being unresuable means that we have to allocate a new node. For example, if we simply turn the stamp into an i64, we'll have much more chances to reuse a node. The cost of 6 extra bytes between i16 and i64 is much less than allocating a new node, not to mention the memory alignment that may erase the difference.

So I'm wondering whether it's acceptable to claim that indexing with or calling methods on a removed NodeId is always considered to be an undefined behavior, so that we can remove the stamp at all? This could make node reusing always available.

How to get a node based on NodeId's private index1: NonZeroUsize?

I need an unique identifier associated with the nodes and it has to be an integer of any kind <= i/u64.

So I thought I take out the index would be a good idea:
let node_id = arena.get_node_id(node).unwrap(); let uid: usize = node_id.into();

How to put it back in again? How construct a NodeId so that I can get my Node out of the Arena again?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.