ziglibs / diffz

Implementation of go-diff's diffmatchpatch in Zig

License: MIT License

Zig 100.00%

diffz's Issues

Allocator-independent testing

The merge conversation in #23 has demonstrated that switching to bring-your-own-allocator memory management will require some changes in testing strategy.

If I'm understanding some context correctly, ZLS is using DiffMatchPatch, and surely doing so with an arena. So for benchmarking and consistency reasons, it would be good to be able to run the tests using an arena as well.

Furthermore, it would be good to ensure that an OutOfMemory incident doesn't leak memory or improperly double-free it. std.testing has FailingAllocator for that kind of check, so that's a minimum of three allocators worth running against the test suite.

Last, it would be good to set up the tests so that they're also benchmarks, using std.Timer to collect data. That could be conditionally reported from a separate build step, and is harmless to run when the information it provides isn't necessary. This could include running the tests many more times, in order to get useful amounts of timing data.

My sketch of a solution here is pretty simple: change each of the test blocks into a pair: a function, which takes an allocator and performs the tests, and a test block, which calls that function with an allocator. The functions should initialize a Timer and return its results. That's probably the right level of granularity, but we should discuss that.
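To make the function-plus-test-block pairing concrete, here is a minimal sketch. It is not diffz code: `exerciseAllocator` is a hypothetical stand-in for a real test body, and the use of `std.time.Timer` assumes a Zig release (roughly 0.12 through 0.14) where that type is available.

```zig
const std = @import("std");

// Hypothetical stand-in for a diffz test body. It takes any allocator,
// performs its checks, and returns the elapsed time in nanoseconds.
fn exerciseAllocator(allocator: std.mem.Allocator) !u64 {
    var timer = try std.time.Timer.start();

    const buf = try allocator.alloc(u8, 64);
    defer allocator.free(buf);
    @memset(buf, 'a');
    try std.testing.expectEqual(@as(u8, 'a'), buf[63]);

    return timer.read();
}

// The test block only chooses the allocator and calls the function.
test "exercise with std.testing.allocator" {
    _ = try exerciseAllocator(std.testing.allocator);
}

// The same function, run against an arena.
test "exercise with an arena" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    _ = try exerciseAllocator(arena.allocator());
}
```

The test blocks discard the timing here; a benchmark build step could instead collect the returned values over many runs.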

How things are structured from there is less clear to me. I haven't used a failing allocator before, but it seems pretty simple: run a for (0..) |allowed_allocations| loop, which initializes a FailingAllocator to permit that many allocations, and break when we no longer catch OutOfMemory errors.
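The loop described above might look roughly like the following. This is a sketch under assumptions: `checkNoLeaksOnOom` and `twoAllocations` are hypothetical names, and the `FailingAllocator.init` call assumes Zig 0.12 or later, where it takes a config struct with a `fail_index` field. Because the failing allocator is backed by `std.testing.allocator`, anything leaked on an error path is reported when the test ends.

```zig
const std = @import("std");

// Hypothetical function under test: two allocations, both released on
// every path, including when the second allocation fails.
fn twoAllocations(allocator: std.mem.Allocator) !void {
    const a = try allocator.alloc(u8, 16);
    defer allocator.free(a);
    const b = try allocator.alloc(u8, 16);
    defer allocator.free(b);
}

// Hypothetical helper: allow 0, 1, 2, ... allocations to succeed, and
// stop at the first run that no longer returns OutOfMemory.
fn checkNoLeaksOnOom(comptime testFn: anytype) !void {
    var allowed: usize = 0;
    while (true) : (allowed += 1) {
        var failing = std.testing.FailingAllocator.init(
            std.testing.allocator,
            .{ .fail_index = allowed },
        );
        testFn(failing.allocator()) catch |err| {
            if (err == error.OutOfMemory) continue;
            return err;
        };
        break;
    }
}

test "OOM sweep over a two-allocation function" {
    try checkNoLeaksOnOom(twoAllocations);
}
```

The standard library also provides `std.testing.checkAllAllocationFailures`, which performs a similar sweep and may be preferable to a hand-written loop.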

Whether the tests should be run on both the std.testing.allocator, and an arena, every time, is less clear to me. Currently, on an M1, a test run is absolutely dominated by build time, finishing in a few hundred milliseconds when has_side_effects = true is used to allow the tests to rerun without any build changes. So some changes which bump the test running time up to a second or so wouldn't really move the needle in the usual workflow where tests are run after a build.

But it isn't obvious to me that double testing with std.testing.allocator, and an arena, is an important thing to do every time. Another option is to add the arena as a build flag, which would comptime switch from std.testing.allocator to ArenaAllocator. I also don't know what happens when you make a FailingAllocator backed by an ArenaAllocator, but the design seems pretty composable.

So that's my sketch of a plan here, let me know what you think.

Zig libraries need to manage their own memory

I was looking for a diffing library to use on a Zig project, and diffz looked like the best candidate. Even with just the diffing portion of diff-match-patch, and a lack of recent commits (it happens), I figured hey, community-supported library, it's got the part I actually need, and maybe I could take some time to continue the port.

I was disappointed to see that the library uses an arena as though it were a garbage collector. In a ziglib, functions are expected to take a generic Allocator and behave accordingly, by freeing any memory they don't return. If the user wants to use an ArenaAllocator, that's fine: free is a no-op unless the allocation happens to be the last one performed, so one ends up with the performance benefits while maintaining the flexibility which is a core competency of the language.
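As an illustration of that convention (not taken from diffz), here is a small sketch. `upperCopy` is a hypothetical function that frees its scratch memory and returns only what the caller owns, so the same code works with a leak-checking allocator and with an arena.

```zig
const std = @import("std");

// Hypothetical example: frees its temporary buffer and returns a slice
// that the caller is responsible for freeing.
fn upperCopy(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    const scratch = try allocator.alloc(u8, input.len);
    defer allocator.free(scratch);
    for (input, scratch) |c, *out| out.* = std.ascii.toUpper(c);
    return allocator.dupe(u8, scratch);
}

// With the testing allocator, a missing free would be reported as a leak.
test "upperCopy with std.testing.allocator" {
    const out = try upperCopy(std.testing.allocator, "diff");
    defer std.testing.allocator.free(out);
    try std.testing.expectEqualStrings("DIFF", out);
}

// With an arena, the caller can skip individual frees and release
// everything at once with deinit.
test "upperCopy with an arena" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    const out = try upperCopy(arena.allocator(), "diff");
    try std.testing.expectEqualStrings("DIFF", out);
}
```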

I assume this unfortunate state of affairs came to be because this version of diff-match-patch is a port from a garbage collected language, C♯ presumably. It's just a pity, because Zig provides such excellent tools for finding memory leaks, use after free, and double free, and the library has a robust test suite. Proper memory management could have been added during the port, and then this would be a real Zig library, not just a promising sketch of one.

Anyway, now there's an issue to track this, in case anyone wants to close it...

Match and patch

Tracking issue for completing the port of the library.
