saschagrunert / indextree Goto Github PK

View Code? Open in Web Editor NEW

593.0 12.0 52.0 4.63 MB

Arena based tree 🌲 structure by using indices instead of reference counted pointers

License: MIT License

Rust 99.70% Makefile 0.30%

tree-structure arena data-structrues rust

indextree's Issues

Why doesn't arena have a way to iterate over nodeids?

How am i supposed to filter some ids from arena, for further manipulations? get_node_id searches the entire arena.

Also, how are you supposed to do memoized recursion

	fn pos(&self, date: &Date, node: NodeId, mut arena: TPtr<Arena<Body>>) -> Vec3 {
		let Self { orbit, pos, .. } = self;
		if let Some(p) = pos {
			return *p;
		}

		let a = arena.get_mut();
		let mut n = node;
		let parents = iter::from_fn(|| {
			let p = a[n].parent()?;
			n = p;
			Some(p)
		})
		.collect_vec();

		let arena = arena.clone();
		let parents = parents.into_iter().rev().fold(Vec3((0, 0, 0)), move |acc, n| a[n].get().pos(date, n, arena).sum(acc));

		orbit.pos_at_date(*date).sum(parents)
	}

can't pass arena to to a method, if you're invoking it on something inside arena.

Problems around CI stuff

Files introduced by #45.

About `.gitignore`:

If you’re building a non-end product, such as a rust library that other rust packages will depend on, put Cargo.lock in your .gitignore.

--- Cargo.toml vs Cargo.lock - The Cargo Book

If CI really needs the lockfile, it should be prepared by the CI script, instead of including Cargo.lock to the project.
(For example, include Cargo.lock.ci to the project and rename it to Cargo.toml before the build phase runs.)

About `.rustfmt.toml`:

.rustfmt.toml should have only non-default entries, or empty.
If it has unstable entries (such as indent_style), they prevents stable rustfmt from using the config, and required_verison = "1.0.3" prevents other nightly rustfmt from formatting the code.

I think using default config is the best, and then .rustfmt can be removed.

Accessing nodes on a given Arena

How do you get access to an iterator on an Arena? For example, if a function returns a created Arena with appended Nodes, how do I start traversing or indexing? According to the docs, Arena itself doesn't seem to have any methods that let you do this without an appended NodeId included separately.

Iterate over Arena in Depth-first order

Hi,

Is there anyway to iterate over all the nodes in an Arena in a depth-first order (iter seems to be in storage order). At the moment, I'm tracking all the root nodeid's (during insertion) in a separate vec, and, then iterating over each of their descendants. I was wondering, if there is a better way to do this? Thanks

NodeId::remove_subtree does not remove the whole subtree sometimes

The preorder traversal in NodeId::remove_subtree is incorrect. Upon return upwards in the tree, if a cursor's parent doesn't have a sibling, the algorithm terminates without looking at more distant ancestors which still might have some siblings that need removing.

In the following example arena, when removing node 1_2, node 1_2_2 should be marked as removed but it isn't:

arena                                                                                                                                                                                                                                                                                 
`-- 1                                                                                                                                                                                                                                                                                 
    |-- 1_1                                                                                                                                                                                                                                                                           
    |-- 1_2                                                                                                                                                                                                                                                                           
    |   |-- 1_2_1                                                                                                                                                                                                                                                                     
    |   |   `-- 1_2_1_1                                                                                                                                                                                                                                                               
    |   |       `-- 1_2_1_1_1                                                                                                                                                                                                                                                         
    |   `-- 1_2_2                                                                                                                                                                                                                                                                     
    `-- 1_3

Pull request incoming 🙂.

Suggestion: Panic when appending to node that has `self` node as child.

Hi,
thanks for writing this nice library.
For some reason I thought I could use indextree as a graph structure. Later I realized that this library is just for tree structures. And I got such strange behavior:

        let arena = &mut Arena::new();

        let start = arena.new_node("start");
        let a = arena.new_node("A");
        let b = arena.new_node("b");
        let c = arena.new_node("c");

        start.append(a, arena);
        a.append(start, arena);
        start.append(b, arena);
        b.append(start, arena);
        assert_eq!(start.children(arena).into_iter().count(), 2);

        a.append(c, arena);
        c.append(a, arena);

        assert_eq!(start.children(arena).into_iter().count(), 2); //FAILED: count is 1

I guess doing "proper" validation to check if I'm building graph is difficult. For example this gives infinite look:

        let arena = &mut Arena::new();

        let a = arena.new_node("a");
        let b = arena.new_node("b");
        let c = arena.new_node("c");

        a.append(b, arena);
        b.append(c, arena);
        c.append(a, arena);

        assert_eq!(a.children(arena).into_iter().count(), 1);
        assert_eq!(a.descendants(arena).into_iter().count(), 3); // infinite loop

But in my case just panic when trying to "connect" nodes to each other would give me earlier feedback that I'm using this library in a wrong way. I propose that for second append the code below should panic (I hope it's easy to check it, I didn't look at the source code)

        let arena = &mut Arena::new();

        let a = arena.new_node("a");
        let b = arena.new_node("b");

        a.append(b, arena);
        b.append(a, arena); //IMO this should panic

        assert_eq!(a.children(arena).into_iter().count(), 1);
        assert_eq!(a.descendants(arena).into_iter().count(), 2); // infinite loop

BTW, the example code on https://crates.io/crates/indextree is not compiling (assert!(a.append(b, arena).is_ok()); line)

Unsafe in get_pair_mut

I noticed there is a usage of unsafe in get_pair_mut. Do you think this could be avoided by using &[].split_at_mut()? Something along the lines of

let (xs, ys) = self.split_at_mut(max(a, b));
(&mut xs[min(a, b)], &mut ys.first)

How to get a node based on NodeId's private index1: NonZeroUsize?

I need an unique identifier associated with the nodes and it has to be an integer of any kind <= i/u64.

So I thought I take out the index would be a good idea:
let node_id = arena.get_node_id(node).unwrap(); let uid: usize = node_id.into();

How to put it back in again? How construct a NodeId so that I can get my Node out of the Arena again?

Implement node removal

Do you plan to implement removing node?

Collapse `prev_sybling` and `last_child` into `prev_cyclic_sybling`

See "Data structures for the document object model" http://www.aosabook.org/en/posa/parsing-xml-at-the-speed-of-light.html section for a neat trick to save one pointer per node while maintaining O(1) operations.

Iterating over all nodes while modifying them

Hello,

I was wondering if there was a way to iterate over all nodes of an Arena and modify them.
My program requires that I go over all nodes, which contain a struct, and modify a value inside that struct. Order is not important I just need to be able do it on all nodes.
However there doesn't seem to be a way to get a mutable iterator even with rayon. Is this a design quirk ? Am I missing something here ?

Thanks !

Appending to tree in parallel

Hello,

I'd like to append to my indextree arena in parallel through rayon, however I'm running into issues with the borrow checker. What's the recommended way to circumvent this?

Should insertions return `Result` type?

Insertions can fail if the preconditions are not met, but they are very very simple.
Preconditions:

NodeId::append(), NodeId::prepend():
- self != new_child.
NodeId::insert_before(), NodeId::insert_after():
- self != new_sibling.

They are sometimes ensured by the context, and then users are sure insertions do not fail.
In such situation, it seems redundant to unwrap many results.
(See https://github.com/saschagrunert/indextree/pull/39/files#diff-e7bc1f36b79d2e2d0030318c5fe20910 for example.)

I'm not sure which is better...
Both interface (panicking or not panicking when preconditions are not met) seems reasonable.

support no_std

It will be great if you can add no_std support for indextree. It won't be hard because indextree doesn't depend on any specific library right now.

Iteratation through children first

Hi,
I'd like to get an iteration through children first. Here are examples:

use indextree::Arena;

pub fn test_arena() {
  let mut arena = Arena::new();
  let n1 = arena.new_node(("1", 0));
  let n1_1 = arena.new_node(("1_1", 1));
  n1.append(n1_1, &mut arena);
  let n1_2 = arena.new_node(("1_2", 1));
  n1.append(n1_2, &mut arena);
  {
    let n1_2_1 = arena.new_node(("1_2_1", 2));
    n1_2.append(n1_2_1, &mut arena);
  }
  let n1_3 = arena.new_node(("1_3", 1));
  n1.append(n1_3, &mut arena);
  {
    let n1_3_1 = arena.new_node(("1_3_1", 2));
    n1_3.append(n1_3_1, &mut arena);
    let n1_3_2 = arena.new_node(("1_3_2", 2));
    n1_3.append(n1_3_2, &mut arena);
    let n1_3_3 = arena.new_node(("1_3_3", 2));
    n1_3.append(n1_3_3, &mut arena);
  }

  let it = arena.iter();
  for a in it {
    let (v, level) = a.get();
    println!("{}", format!("{:>width$}", v, width = 4*level));
  }
}

pub fn test_arena2() {
  let mut arena = Arena::new();
  let n1 = arena.new_node(("1", 0));
  let n1_1 = arena.new_node(("1_1", 1));
  n1.append(n1_1, &mut arena);
  let n1_2 = arena.new_node(("1_2", 1));
  n1.append(n1_2, &mut arena);
  let n1_3 = arena.new_node(("1_3", 1));
  n1.append(n1_3, &mut arena);

  {
    let n1_3_1 = arena.new_node(("1_3_1", 2));
    n1_3.append(n1_3_1, &mut arena);
    let n1_3_2 = arena.new_node(("1_3_2", 2));
    n1_3.append(n1_3_2, &mut arena);
    let n1_3_3 = arena.new_node(("1_3_3", 2));
    n1_3.append(n1_3_3, &mut arena);
  }
  {
    let n1_2_1 = arena.new_node(("1_2_1", 2));
    n1_2.append(n1_2_1, &mut arena);
  }

  let it = arena.iter();
  for a in it {
    let (v, level) = a.get();
    println!("{}", format!("{:>width$}", v, width = 4*level));
  }
}

Actually I got 2 things differents:

For the first test:
1_1
1_2
1_2_1
1_3
1_3_1
1_3_2
1_3_3

For the second test:
1
1_1
1_2
1_3
1_3_1
1_3_2
1_3_3

What I want is the first result, how can I do please?

Regards

Parallel tree traversal

Hi there!

The README mentions parallel tree traversals over an Arena but I couldn't find any related method on the docs.

How could one do so?

Also, are there any plans to support rayon's parallel iterators ?

Great lib, btw 👍

Prepend to the node with no children

indextree/tests/lib.rs

Lines 39 to 45 in 0e51295

 #[test] 

 fn failure_prepend() { 

 let arena = &mut Arena::new(); 

 let a = arena.new_node(1); 

 let b = arena.new_node(2); 

 assert!(!a.prepend(b, arena).is_ok()); 

 }

I think prepend() to the leaf node should success as append() does, but currently it is treated as invalid.
Is there any reason to forbid adding the only child by prepend()?

Add Arena::with_capacity

When creating large trees, inserting many nodes causes the underlying Vec to reallocate frequently.
Arena::with_capacity would simply create allocate the Vec beforehand.

fn with_capacity(capacity: usize) -> Self {
    Self {
        nodes: Vec::with_capacity(capacity),
        first_free_slot: None,
        last_free_slot: None,
    }
}

I would also suggest adding the Arena::clear function which simply clears the Vec, causing all the created NodeIds to be invalid.
This is equivalent to removing all the Nodes while retaining the Vec capacity.
Unless I forgot something, I'm rather confident that such an implementation would not cause soundness issues, e.g. with the iterators.

Bad NodeId.index?

You are using ˋself.nodes.len()` to get the next index, but if you add 2 nodes, then remove the first one, then add a node, both nodes will have the same index right ?

bug: `NodeStamp` does not work as expected when reusing nodes

First of all, I'd like to express my appreciation for your excellent work!

However, I've found a bug that NodeStamp does not work as expected when reusing nodes.

Problems

indextree/src/id.rs

Lines 40 to 64 in 3eee154

 impl NodeStamp { 

 pub fn is_removed(self) -> bool { 

 self.0.is_negative() 

 } 

 pub fn as_removed(&mut self) { 

 debug_assert!(!self.is_removed()); 

 self.0 = if self.0 < i16::MAX { 

 -self.0 - 1 

 } else { 

 -self.0 

 }; 

 } 

 pub fn reuseable(self) -> bool { 

 debug_assert!(self.is_removed()); 

 self.0 > i16::MIN 

 } 

 pub fn reuse(&mut self) -> Self { 

 debug_assert!(self.reuseable()); 

 self.0 = -self.0; 

 *self 

 } 

 }

New nodes are allocated with NodeStamp(0). When we decide to remove it, as_removed will be called so the stamp of the removed node will be turned to -1. Then if it's reused, the stamp will become 1. If we repeat this, the values of this stamp form a sequence of...

0, -1, 1, -2, 2, ..., 32766, -32767

Here comes the problem: -32767 is still considered reusable since it's greater than i16::MIN = -32768. So if we reuse it, then the stamp will become 32767. However, if we remove it again, we'll get a stamp of -32767 since self.0 < i16::MAX is now false, so we're actually reusing a stamp!

32766, -32767, 32767, -32767, 32767, -32767...

This breaks the assumption that we should use different stamps for different nodes, and the interfaces relying on this may behave incorrectly. For example...

use indextree::Arena;

fn main() {
    let mut tree: Arena<i32> = Arena::new();

    for i in 0usize.. {
        let id = tree.new_node(42);
        assert!(!id.is_removed(&tree));

        id.remove(&mut tree);
        assert!(id.is_removed(&tree));

        let new_id = tree.new_node(42);
        assert!(!new_id.is_removed(&tree));a
        assert!(id.is_removed(&tree), "i: {i}, id: {id:?}, new_id: {new_id:?}"); // <--

        new_id.remove(&mut tree);
    }
}

This code above panics during the 16385th loop.

thread 'main' panicked at 'i: 16384, id: NodeId { index1: 1, stamp: NodeStamp(32767) }, new_id: NodeId { index1: 1, stamp: NodeStamp(32767) }', src/bin/index_tree.rs:18:9

Possible Solutions

The root cause of this problem is that reusable always return true, which is not expected. So an intuitive solution could be correctly implementing this function.

  pub fn reuseable(self) -> bool {
      debug_assert!(self.is_removed());
-     self.0 > i16::MIN
+     self.0 > i16::MIN + 1
  }

However, I suppose it's worthwhile revisiting the meaning of stamp's existence. The ultimate purpose of reusing a node is to save memory, so using i16 seems not that convincing here. A node being unresuable means that we have to allocate a new node. For example, if we simply turn the stamp into an i64, we'll have much more chances to reuse a node. The cost of 6 extra bytes between i16 and i64 is much less than allocating a new node, not to mention the memory alignment that may erase the difference.

So I'm wondering whether it's acceptable to claim that indexing with or calling methods on a removed NodeId is always considered to be an undefined behavior, so that we can remove the stamp at all? This could make node reusing always available.

Auto-impl Eq for Arena and Node when T: Eq

Would it be possible to auto-impl / derive Eq for Node and Arena iff T: Eq?

Use `NonZeroUsize` for internal representation of `NodeId`

If NodeId uses NonZeroUsize, Option<NodeId> can be same size as NodeId.
(On the other hand, with current implementation by usize, Option<NodeId> would be twice larger than NodeId.)
Using NonZeroUsize reduces size of many types (even if user-defined) which have Option<NodeId>s.

There is a downside: the internal numbers should be decremented before they are used as vector index.

As far as I know, string-interner crate does this: use NonZeroU32 and -1 when they are used as index.

Easier access to underlying index

I need to access the underlying index of a NodeId.

Background:
I'm using indextree to implement an octree. Each node should also know which nodes are not visible from that node. My NodeData looks like this:

pub struct NodeData {
   ... 
    pub occludees: Vec<NodeId>
}

Now I need to serialize my tree in JSON format, but the problem is that I cannot access NodeId's underlying index. I also tried to use Serialize attribute (after activating deser feature), which gets me somewhere, but the index is serialized as following

{
     "index1": 51
}

Would it be possible to add an easy way to access the underlying index?
Thanks!

pretty print indextree

The documentation has many instances of nicely-formatted ASCII representations of tree state.
Is there a code sample of rendering to this format? Or better yet, a debug API to do so, possibly by fmt::Display or similar?

allow getting node using usize as an index.

I need to pass around the NodeID via ffi functions. And because its an opaque rust struct, I decided to simply get the usize and use that as the node_id for the sake of simplicity.

But, NodeID also has a version, making it impossible to get the original NodeID from just the usize.

It would be nice if the arena had a function to get the current NodeID at the index usize (or None if it doesn't exist).

Panic on semantic error, rather than returning error?

It seems indextree avoids panicking, even if there are semantic error (caused by broken inconsistent data).

For example:

indextree/src/lib.rs

Lines 460 to 471 in 0e51295

 if let Some((self_borrow, new_child_borrow)) = 

 arena.nodes.get_tuple_mut(self.index0(), new_child.index0()) 

 { 

 new_child_borrow.parent = Some(self); 

 last_child_opt = 

 mem::replace(&mut self_borrow.last_child, Some(new_child)); 

 if let Some(last_child) = last_child_opt { 

 new_child_borrow.previous_sibling = Some(last_child); 

 } else { 

 if self_borrow.first_child.is_some() { 

 bail!(NodeError::FirstChildAlreadySet); 

 }

In this code, if self_borrow.last_child was None, self_borrow.first_child should be also None, if indextree is working correctly.
If it is not, then the data can (and may) have been broken in unexpected way, and it seems meaningless to do more operation on that broken data.
Additionally, results are returned as Fallible<_> and it does type erasure, so users might easily miss such semantic bug.

I think semantic errors should be caught by panicking (typically .expect() on Result and Option) and assert macros.

feature request: `remove_children` and `detach_children`

When dealing with the tree structure, sometimes we want to remove the all children nodes of a specific node but keeping the node itself. For example, removing anything between tags <p> and </p>.

Here is a snippet demonstrating how to implement detach_children with indextree:

fn detach_children<T>(self, arena: &mut Arena<T>) {
    for child in self.children(arena) {
        let node = arena[child];
        node.parent = None;
        node.next_sibling = None;
        node.previous_sibling = None;
    }
    
    let node = arena[self];
    node.first_child = None;
    node.last_child = None;
}

I haven't tried it yet, so it may not be compiled.

Clarify order of Descendants, Traverse, and ReverseTraverse

I've had a look around and it doesn't seem like "tree order" has any meaning when it comes to graph traversal - rather, it relates to the number of children each node can have. The language in common use for tree traversal seems to be "depth first" and "breadth first", in "pre order", "post order", or "level order".

	#[test]
	fn failure_prepend() {
	let arena = &mut Arena::new();
	let a = arena.new_node(1);
	let b = arena.new_node(2);
	assert!(!a.prepend(b, arena).is_ok());
	}

	impl NodeStamp {
	pub fn is_removed(self) -> bool {
	self.0.is_negative()
	}

	pub fn as_removed(&mut self) {
	debug_assert!(!self.is_removed());
	self.0 = if self.0 < i16::MAX {
	-self.0 - 1
	} else {
	-self.0
	};
	}

	pub fn reuseable(self) -> bool {
	debug_assert!(self.is_removed());
	self.0 > i16::MIN
	}

	pub fn reuse(&mut self) -> Self {
	debug_assert!(self.reuseable());
	self.0 = -self.0;
	*self
	}
	}

	if let Some((self_borrow, new_child_borrow)) =
	arena.nodes.get_tuple_mut(self.index0(), new_child.index0())
	{
	new_child_borrow.parent = Some(self);
	last_child_opt =
	mem::replace(&mut self_borrow.last_child, Some(new_child));
	if let Some(last_child) = last_child_opt {
	new_child_borrow.previous_sibling = Some(last_child);
	} else {
	if self_borrow.first_child.is_some() {
	bail!(NodeError::FirstChildAlreadySet);
	}

saschagrunert / indextree Goto Github PK

indextree's Issues

About .gitignore:

About .rustfmt.toml:

Problems

Possible Solutions

Recommend Projects

Recommend Topics

Recommend Org

About `.gitignore`:

About `.rustfmt.toml`: