
Comments (3)

noib3 commented on September 5, 2024

Although useful, they don't seem to blend in much unicode [..] being composed of Lorem Ipsum text

When benchmarking, it really doesn't matter what the contents of the text you're editing are; you're usually only interested in figuring out how long each rope.insert or rope.delete takes. The main variable is how long the text is, not what it is. A wall of Lorem Ipsums is just as good a data set as a bunch of emojis.
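
As a deliberately naive illustration of that point, here's a minimal sketch assuming crop's byte-offset API (Rope::from, byte_len, insert); the absolute numbers are meaningless, the point is that the cost tracks how much text you insert, not what the text is:

```rust
use crop::Rope; // crop = "0.4" is an assumption; adjust to whatever version you use
use std::time::Instant;

fn bench_inserts(label: &str, chunk: &str) {
    let filler = "x".repeat(1 << 20); // ~1 MiB of ASCII filler
    let mut rope = Rope::from(filler.as_str());

    let start = Instant::now();
    for i in 0..10_000 {
        // Alternate between the start and the end of the rope: both are
        // always valid char boundaries, no matter what we've inserted so far.
        let at = if i % 2 == 0 { 0 } else { rope.byte_len() };
        rope.insert(at, chunk);
    }
    println!("{label}: {:?}", start.elapsed());
}

fn main() {
    bench_inserts("lorem", "lorem ipsum "); // 12 bytes of ASCII
    bench_inserts("emoji", "🦀🎉🚀");        // 12 bytes of emoji
}
```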

they're definitely pretty artificial

Yes, but that's because of the editing pattern. The positions of the insertions and deletions in those benchmarks are usually randomly generated: you insert a character here, then delete some there, and so on.

This is not how humans actually edit text. We usually insert or delete entire runs of characters before moving the cursor to another place in the document. We might jump around occasionally, but most of the edits are localized.

If a rope is designed with this assumption in mind (and both crop and Jumprope are) it's nice to have benchmarks that don't make you ping-pong around.
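
To make "ping-pong" vs "localized" concrete, here's a sketch of the two ways a benchmark can generate edit positions (hypothetical helpers using the rand crate, not code from any of the libraries mentioned):

```rust
use rand::Rng; // rand = "0.8" assumed

// Random pattern: every edit lands at an unrelated position.
fn random_edits(rng: &mut impl Rng, len: usize, n: usize) -> Vec<usize> {
    (0..n).map(|_| rng.gen_range(0..=len)).collect()
}

// Localized pattern: a cursor keeps "typing" where it is and only
// occasionally jumps somewhere else, which is closer to how humans edit.
fn localized_edits(rng: &mut impl Rng, len: usize, n: usize) -> Vec<usize> {
    let mut cursor = rng.gen_range(0..=len);
    (0..n)
        .map(|_| {
            if rng.gen_bool(0.02) {
                cursor = rng.gen_range(0..=len); // rare jump elsewhere
            } else {
                cursor = (cursor + 1).min(len); // keep editing in place
            }
            cursor
        })
        .collect()
}

fn main() {
    let mut rng = rand::thread_rng();
    println!("random:    {:?}", random_edits(&mut rng, 1_000, 10));
    println!("localized: {:?}", localized_edits(&mut rng, 1_000, 10));
}
```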

Luckily the author of Automerge (a CRDT library) and the author of Jumprope recorded several character-by-character editing traces of them typing out a document in real time. You can find those traces here.

I wonder if trying to standardise something for plug-n-play use, and benchmarking efforts, might work?

I already did something along these lines in this repo. There you can find a bunch of benchmarks, including those editing traces I mentioned. You can clone that repo, add your rope to the dependencies, implement the Rope trait on it and add it to this list. Then just run cargo run --release -- --bench <your_rope>.
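
To give a rough idea of how much glue is involved (the trait in the repo is the source of truth; the method names below are made up for illustration), the adapter you end up writing is a thin wrapper of this shape:

```rust
use std::ops::Range;

// Hypothetical shape of the harness trait; check the actual definition
// in the repo before implementing it for your rope.
trait Rope {
    fn from_str(s: &str) -> Self;
    fn insert(&mut self, byte_offset: usize, text: &str);
    fn delete(&mut self, byte_range: Range<usize>);
    fn byte_len(&self) -> usize;
}

// Stand-in for your actual rope type, using String just to show the glue.
struct MyRope(String);

impl Rope for MyRope {
    fn from_str(s: &str) -> Self {
        MyRope(s.to_owned())
    }
    fn insert(&mut self, byte_offset: usize, text: &str) {
        self.0.insert_str(byte_offset, text);
    }
    fn delete(&mut self, byte_range: Range<usize>) {
        self.0.replace_range(byte_range, "");
    }
    fn byte_len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut r = MyRope::from_str("hello world");
    r.insert(5, ",");
    r.delete(0..1);
    assert_eq!(r.byte_len(), 11);
}
```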


Omnikron13 commented on September 5, 2024

Although useful, they don't seem to blend in much unicode [..] being composed of Lorem Ipsum text

When benchmarking, it really doesn't matter what the contents of the text you're editing are; you're usually only interested in figuring out how long each rope.insert or rope.delete takes. The main variable is how long the text is, not what it is. A wall of Lorem Ipsums is just as good a data set as a bunch of emojis.

Is that strictly true, given that Rust strings are stored as vectors of bytes/u8, encoded as UTF-8? A string entirely composed of ASCII characters could be indexed just by the byte position in O(1) time, but a string composed of a completely random mix of characters would require checking every character width to compute the indices and result in O(n) time, surely?

I've personally been playing around with indexing various things about the character composition of the strings (character widths, positions of newlines, etc.) to improve performance of certain operations, but this then does mean the distribution affects the actual times.
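
As a rough sketch of one such index (illustrative only, not my actual implementation): a sorted list of newline byte offsets turns "which line is this byte on?" into a binary search instead of a scan. Scanning raw bytes for b'\n' stays correct with multibyte characters, because a UTF-8 continuation byte can never equal 0x0A:

```rust
// Build the index once in O(n): the byte offset of every '\n'.
fn newline_index(text: &str) -> Vec<usize> {
    text.bytes()
        .enumerate()
        .filter(|&(_, b)| b == b'\n')
        .map(|(i, _)| i)
        .collect()
}

// Line containing a byte offset: count the newlines strictly before it,
// which is an O(log n) binary search over the precomputed index.
fn line_of_byte(index: &[usize], byte_offset: usize) -> usize {
    index.partition_point(|&nl| nl < byte_offset)
}

fn main() {
    let text = "alpha\nbëta\ngamma";
    let idx = newline_index(text);
    assert_eq!(line_of_byte(&idx, 0), 0);  // inside "alpha"
    assert_eq!(line_of_byte(&idx, 7), 1);  // inside "bëta"
    assert_eq!(line_of_byte(&idx, 13), 2); // inside "gamma"
    println!("newlines at {:?}", idx);
}
```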

It's precisely when testing how well the strategies I'm experimenting with work that purely ASCII benchmarks (or ones with a non-representative distribution of newlines) aren't likely to give me particularly useful results.

they're definitely pretty artificial

Yes, but that's because of the editing pattern. The positions of the insertions and deletions in those benchmarks are usually randomly generated: you insert a character here, then delete some there, and so on.

This is not how humans actually edit text. We usually insert or delete entire runs of characters before moving the cursor to another place in the document. We might jump around occasionally, but most of the edits are localized.

If a rope is designed with this assumption in mind (and both crop and Jumprope are) it's nice to have benchmarks that don't make you ping-pong around.

Hm, might this assumption be missing some things? I've not thought too deeply on this yet, but alongside adding or removing a line, operations like find/replace that can touch an entire document aren't all that uncommon, just as an example.

I'm not sure how best to create benchmarks/test data to cover different types of edits, so that's just something I'm pondering on...

Luckily the author of Automerge (a CRDT library) and the author of Jumprope recorded several character-by-character editing traces of them typing out a document in real time. You can find those traces here.

I wonder if trying to standardise something for plug-n-play use, and benchmarking efforts, might work?

I already did something along these lines in this repo. There you can find a bunch of benchmarks, including those editing traces I mentioned. You can clone that repo, add your rope to the dependencies, implement the Rope trait on it and add it to this list. Then just run cargo run --release -- --bench <your_rope>.

Ah, I think these are exactly the sort of thing I'm thinking about, thanks. =)

I may try modifying this to use some rough unicode test files too, actually...


noib3 commented on September 5, 2024

A string entirely composed of ASCII characters could be indexed just by the byte position in O(1) time, but a string composed of a completely random mix of characters would require checking every character width to compute the indices and result in O(n) time

Yes, but that time complexity is set at "design" time, not at runtime. If you design your rope to work with byte offsets then indexing is O(1), Unicode or not. It's the user's responsibility to make sure that the byte offsets they provide are valid. And if you instead design it around char offsets then it's always O(n), ASCII or not.
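
To make that concrete with a plain &str rather than any particular rope (the offsets below are chosen just for illustration):

```rust
fn main() {
    let s = "naïve 🦀 text";

    // Byte offsets: O(1) to use, but it's the caller's job to pass a
    // valid char boundary, otherwise slicing panics.
    let byte_offset = 7; // happens to be the start of the crab emoji
    assert!(s.is_char_boundary(byte_offset));
    let rest = &s[byte_offset..];

    // Char offsets: always O(n), ASCII or not, because the n-th codepoint
    // can only be located by walking the string from the beginning.
    let char_offset = 6;
    let byte_of_char = s
        .char_indices()
        .nth(char_offset)
        .map(|(i, _)| i)
        .unwrap_or(s.len());
    assert_eq!(&s[byte_of_char..], rest);

    println!("{rest}");
}
```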

As an aside, I would advise against choosing codepoints as the indexing metric. Like you said, it brings a performance penalty over bytes, and it doesn't provide any advantages over them. The fundamental unit of text in a real editor is not a Unicode code point; it's much closer to an extended grapheme cluster. Imo bytes and EGCs are the only two sensible metrics to choose from, and (for reasons that I won't go over here) using graphemes doesn't work.
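
For a sense of how far apart the three units can be, here's a small sketch using the unicode-segmentation crate for extended grapheme clusters (treat the snippet as illustrative):

```rust
use unicode_segmentation::UnicodeSegmentation; // unicode-segmentation = "1" assumed

fn main() {
    // "é" written as 'e' followed by a combining acute accent.
    let s = "e\u{0301}";
    assert_eq!(s.len(), 3);                   // bytes
    assert_eq!(s.chars().count(), 2);         // codepoints
    assert_eq!(s.graphemes(true).count(), 1); // what a user perceives as one character

    // A ZWJ emoji sequence: one grapheme, several codepoints, many bytes.
    let family = "👨\u{200D}👩\u{200D}👧";
    assert_eq!(family.len(), 18);
    assert_eq!(family.chars().count(), 5);
    assert_eq!(family.graphemes(true).count(), 1);
}
```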

Hm, might this assumption be missing some things? I've not thought too deeply on this yet, but alongside adding or removing a line, operations like find/replace that can touch an entire document aren't all that uncommon, just as an example.

Well it's a general rule-of-thumb, not a theorem ;) There are definitely cases where it breaks down, e.g. find&replace and multiple cursors, but it's still a useful insight. As a piece of anecdotal evidence supporting this, after reimplementing crop's leaves as gap buffers instead of simple Strings I saw performance improvements of 10-20% on the editing traces.
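
To sketch why a gap buffer fits that pattern so well (a toy illustration, not crop's actual leaf implementation): inserting at or near the current cursor just writes into pre-allocated free space, and only a long cursor jump pays for shuffling bytes around.

```rust
// Toy gap buffer: the text lives in `buf` with a gap of free space at the
// cursor. Edits at the gap are cheap; moving the gap costs O(distance).
struct GapBuffer {
    buf: Vec<u8>,
    gap_start: usize,
    gap_end: usize,
}

impl GapBuffer {
    fn new(capacity: usize) -> Self {
        GapBuffer { buf: vec![0; capacity], gap_start: 0, gap_end: capacity }
    }

    // Move the gap so it starts at `pos` (a byte offset into the text).
    // This is the only operation whose cost grows with the distance moved.
    fn move_gap(&mut self, pos: usize) {
        while self.gap_start > pos {
            self.gap_start -= 1;
            self.gap_end -= 1;
            self.buf[self.gap_end] = self.buf[self.gap_start];
        }
        while self.gap_start < pos {
            self.buf[self.gap_start] = self.buf[self.gap_end];
            self.gap_start += 1;
            self.gap_end += 1;
        }
    }

    // Inserting at (or right after) the previous edit is just a copy into
    // the free space; no other bytes move.
    fn insert(&mut self, pos: usize, text: &str) {
        assert!(text.len() <= self.gap_end - self.gap_start, "toy version: no regrowing");
        self.move_gap(pos);
        self.buf[self.gap_start..self.gap_start + text.len()].copy_from_slice(text.as_bytes());
        self.gap_start += text.len();
    }

    fn text(&self) -> String {
        let mut out = Vec::with_capacity(self.buf.len());
        out.extend_from_slice(&self.buf[..self.gap_start]);
        out.extend_from_slice(&self.buf[self.gap_end..]);
        String::from_utf8(out).unwrap()
    }
}

fn main() {
    let mut gb = GapBuffer::new(64);
    gb.insert(0, "hello world");
    gb.insert(5, ",");     // the gap moves once...
    gb.insert(6, " dear"); // ...and then stays put for the next edit
    assert_eq!(gb.text(), "hello, dear world");
}
```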

I'm not sure how best to create benchmarks/test data to cover different types of edits

The traces already contain find&replace and all other kinds of edits, so I wouldn't worry about that. Of course you could record your own trace if you really wanted to. I think there's a VS Code plugin for it, but I'm not sure.

