
diffz's Issues

Allocator-independent testing

The merge conversation in #23 has demonstrated that switching to bring-your-own-allocator memory management will require some changes in testing strategy.

If I'm understanding some context correctly, ZLS is using DiffMatchPatch, and surely doing so with an arena. So for benchmarking and consistency reasons, it would be good to be able to run the tests using an arena as well.

Furthermore, it would be good to ensure that an OutOfMemory incident doesn't leak memory or improperly double-free it. std.testing has FailingAllocator for that kind of check, so that's a minimum of three allocators which it would be good to run the test suite against.

Last, it would be good to set up the tests so that they're also benchmarks, using std.time.Timer to collect data. That could be conditionally reported from a separate build step, and is harmless to run when the information it provides isn't necessary. This could include running the tests many more times, in order to get useful amounts of timing data.

My sketch of a solution here is pretty simple: change each of the test blocks into a pair: a function, which takes an allocator and performs the tests, and a test block, which calls that function with an allocator. The functions should initialize a Timer and return its results; that's probably the right level of granularity, but we should discuss that.
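To make the function-plus-test-block pairing concrete, here is a minimal sketch. The name `exampleCase` and its body are hypothetical placeholders (a real version would call into the diffz API rather than `dupe`), and it assumes a Zig release where `std.time.Timer` is available.

```zig
const std = @import("std");

// Hypothetical stand-in for one converted test body. It takes any
// allocator, runs its checks, and returns the elapsed time in nanoseconds.
fn exampleCase(allocator: std.mem.Allocator) !u64 {
    var timer = try std.time.Timer.start();

    // Placeholder work; the real function would exercise the library here.
    const copy = try allocator.dupe(u8, "1234abcdef");
    defer allocator.free(copy);
    try std.testing.expectEqualStrings("1234abcdef", copy);

    return timer.read();
}

// The test block only chooses the allocator and discards the timing.
test "example case with the testing allocator" {
    _ = try exampleCase(std.testing.allocator);
}
```

A benchmark step could call the same function in a loop and aggregate the returned values, while the ordinary test run ignores them.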

How things are structured from there is less clear to me. I haven't used a failing allocator before, but it seems pretty simple: run a for (0..) |allowed_allocations| loop, which initializes a FailingAllocator to permit that many allocations, and break when we no longer catch OutOfMemory errors.
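A rough sketch of that loop follows. It assumes Zig 0.12 or later, where `FailingAllocator.init` takes a config struct with a `fail_index` field (as far as I know, older releases took the index directly). `caseUnderTest` is a hypothetical placeholder. I've written it as a `while` loop because, to my understanding, a `for` loop needs at least one bounded operand.

```zig
const std = @import("std");

// Hypothetical function under test: two allocations, each released on
// every path, including the error path.
fn caseUnderTest(allocator: std.mem.Allocator) !void {
    const a = try allocator.alloc(u8, 16);
    defer allocator.free(a);
    const b = try allocator.alloc(u8, 32);
    defer allocator.free(b);
}

test "OutOfMemory at any allocation does not leak" {
    var allowed_allocations: usize = 0;
    while (true) : (allowed_allocations += 1) {
        // Fails the allocation at index `allowed_allocations`; everything
        // before it is forwarded to std.testing.allocator, which reports
        // leaks and double frees.
        var failing = std.testing.FailingAllocator.init(
            std.testing.allocator,
            .{ .fail_index = allowed_allocations },
        );
        caseUnderTest(failing.allocator()) catch |err| {
            if (err == error.OutOfMemory) continue; // allow one more next time
            return err;
        };
        break; // ran to completion without hitting the induced failure
    }
}
```

The standard library also has `std.testing.checkAllAllocationFailures`, which appears to automate much the same loop, so it may be worth checking whether it fits before hand-rolling this.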

Whether the tests should be run on both the std.testing.allocator, and an arena, every time, is less clear to me. Currently, on an M1, a test run is absolutely dominated by build time, finishing in a few hundred milliseconds when has_side_effects = true is used to allow the tests to rerun without any build changes. So some changes which bump the test running time up to a second or so wouldn't really move the needle in the usual workflow where tests are run after a build.

But it isn't obvious to me that double testing with std.testing.allocator, and an arena, is an important thing to do every time. Another option is to add the arena as a build flag, which would comptime switch from std.testing.allocator to ArenaAllocator. I also don't know what happens when you make a FailingAllocator backed by an ArenaAllocator, but the design seems pretty composable.
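For illustration, a comptime switch could look roughly like the sketch below. The constant `use_arena` is a hypothetical stand-in for a value that would come from a build option, and `runCase` is a placeholder. The second test stacks a FailingAllocator on an ArenaAllocator, which should compose because both only speak the `std.mem.Allocator` interface; the config-struct form of `init` assumes Zig 0.12 or later.

```zig
const std = @import("std");

// Hypothetical flag; in a real setup this would be supplied by build.zig.
const use_arena = false;

// Placeholder for a converted test function.
fn runCase(allocator: std.mem.Allocator) !void {
    const buf = try allocator.alloc(u8, 64);
    defer allocator.free(buf);
}

test "case with the allocator selected at comptime" {
    if (use_arena) {
        var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
        defer arena.deinit();
        try runCase(arena.allocator());
    } else {
        try runCase(std.testing.allocator);
    }
}

test "FailingAllocator backed by an arena" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    var failing = std.testing.FailingAllocator.init(
        arena.allocator(),
        .{ .fail_index = 0 }, // fail the very first allocation
    );
    try std.testing.expectError(error.OutOfMemory, runCase(failing.allocator()));
}
```

One caveat with this combination: the arena releases everything in `deinit`, so a leak on the OutOfMemory path would go unnoticed. Leak checks still need the run backed directly by std.testing.allocator.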

So that's my sketch of a plan here, let me know what you think.

Zig libraries need to manage their own memory

I was looking for a diffing library to use on a Zig project, and diffz looked like the best candidate. Even with just the diffing portion of diff-match-patch, and a lack of recent commits (it happens), I figured hey, community-supported library, it's got the part I actually need, and maybe I could take some time to continue the port.

I was disappointed to see that the library uses an arena as though it were a garbage collector. In a Zig library, functions are expected to take a generic Allocator and behave accordingly, freeing any memory they don't return. If the user wants to use an ArenaAllocator, that's fine: free is a no-op unless the allocation happens to be the last one performed, so one ends up with the performance benefits while maintaining the flexibility which is a core competency of the language.
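As a small illustration of that convention, here is a sketch of a function that accepts any Allocator, frees its scratch memory, and hands ownership of the result to the caller. `concatReversed` is a made-up example, not part of diffz.

```zig
const std = @import("std");

// Hypothetical library function: returns `a` followed by `b` reversed.
// The caller owns the returned slice; the scratch copy is freed here.
fn concatReversed(allocator: std.mem.Allocator, a: []const u8, b: []const u8) ![]u8 {
    const scratch = try allocator.dupe(u8, b);
    defer allocator.free(scratch);
    std.mem.reverse(u8, scratch);
    return std.mem.concat(allocator, u8, &[_][]const u8{ a, scratch });
}

test "same function, general-purpose allocator or arena" {
    // With the testing allocator, a forgotten free would be reported.
    const out = try concatReversed(std.testing.allocator, "ab", "cd");
    defer std.testing.allocator.free(out);
    try std.testing.expectEqualStrings("abdc", out);

    // With an arena, the caller can skip the free and release everything at once.
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    const out2 = try concatReversed(arena.allocator(), "ab", "cd");
    try std.testing.expectEqualStrings("abdc", out2);
}
```

The function itself never needs to know which kind of allocator it was given, which is the point of the convention.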

I assume this unfortunate state of affairs came to be because this version of diff-match-patch is a port from a garbage-collected language, C♯ presumably. It's just a pity, because Zig provides such excellent tools for finding memory leaks, use after free, and double free, and the library has a robust test suite. Proper memory management could have been added during the port, and then this would be a real Zig library, not just a promising sketch of one.

Anyway, now there's an issue to track this, in case anyone wants to close it...
