
Comments (8)

TerrorJack commented on May 20, 2024

We now have an improved block allocator. Previously, allocating a block group (the allocGroup*/allocBlock* functions in the RTS API, which are called by ByteArray#-related primops) always triggered the allocation of mblocks, which in turn triggered grow_memory. The consequences were:

  • Huge waste of space: a request for 4K of space gets a whole 1M mblock, and the rest is discarded.
  • grow_memory is expensive, since it re-allocates the whole linear memory.

The block allocator is now rewritten in JavaScript, which makes it convenient to do its own bookkeeping. The improvements are:

  • We no longer invoke grow_memory for every mblock allocation; the linear memory grows by a factor of at least 2 each time. It's a well-known trick for reducing amortized allocation cost.
  • When allocating a small block group (fits in a single mblock), we don't necessarily allocate a new mblock. Since most allocations are small, this increases space utilization a lot.
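
The two improvements above can be sketched roughly as follows. This is a minimal illustration, not the actual Asterius runtime code: the class name BlockAllocator, its fields, and the growCalls counter are all hypothetical, and the "grow" step only counts how often a real implementation would have to call grow_memory. The sizes follow the 4K/1M figures mentioned above. The sketch ignores freeing and any per-mblock header space.

```typescript
// Sizes taken from the figures in the comment: 4K blocks, 1M mblocks.
const BLOCK_SIZE = 4096;
const MBLOCK_SIZE = 1 << 20;
const BLOCKS_PER_MBLOCK = MBLOCK_SIZE / BLOCK_SIZE;

// Hypothetical allocator; names are illustrative only.
class BlockAllocator {
  private capacity = 0; // mblocks currently backed by linear memory
  private used = 0; // mblocks already handed out
  private currentMBlock = -1; // mblock that small groups are carved from
  private nextFreeBlock = BLOCKS_PER_MBLOCK; // "full" until a first mblock exists
  growCalls = 0; // how many times grow_memory would have been needed

  // Reserve n contiguous mblocks and return the index of the first one.
  private allocMBlocks(n: number): number {
    if (this.used + n > this.capacity) {
      // Grow by at least a factor of 2, so growth happens O(log n) times.
      this.capacity = Math.max(this.capacity * 2, this.used + n, 1);
      this.growCalls++; // a real runtime would call grow_memory here
    }
    const first = this.used;
    this.used += n;
    return first;
  }

  // Allocate a group of nBlocks blocks and return its byte offset.
  allocGroup(nBlocks: number): number {
    if (nBlocks > BLOCKS_PER_MBLOCK) {
      // Large group: give it whole mblocks of its own.
      const n = Math.ceil(nBlocks / BLOCKS_PER_MBLOCK);
      return this.allocMBlocks(n) * MBLOCK_SIZE;
    }
    if (this.nextFreeBlock + nBlocks > BLOCKS_PER_MBLOCK) {
      // The current mblock cannot fit the group; start a new one.
      this.currentMBlock = this.allocMBlocks(1);
      this.nextFreeBlock = 0;
    }
    const offset =
      this.currentMBlock * MBLOCK_SIZE + this.nextFreeBlock * BLOCK_SIZE;
    this.nextFreeBlock += nBlocks;
    return offset;
  }
}
```

In this sketch, 1000 single-block allocations touch 4 mblocks but need only 3 growth steps, whereas the old scheme would have grown the memory 1000 times.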

For bytestring stuff, the allocation overhead is now reasonably low. Other fronts are also being worked on:

  • RTS API for marshaling between ByteStrings and JavaScript ArrayBuffers, which can be called in either Haskell or JavaScript.
  • Identify and crush a bug in the generated code related to MutableByteArray# (I still suspect it's closely related to #39)

from asterius.

TerrorJack commented on May 20, 2024

There are multiple approaches to Data.ByteString support:

  1. Implement a custom version of bytestring without support for *Data.ByteString.Internal modules. Allocation happens outside of linear memory. This is the simplest approach and can be easily implemented using JavaScript FFI, but it will cause compatibility problems once we support compiling third-party Hackage packages.

  2. Introduce bytestring into the source tree and only do some minor patching to make cbits work. ByteString would still be backed by a ForeignPtr, which in turn is backed by ByteArray#/MutableByteArray#. This is closer to ideal. Considering #35, we should implement some hooks for StgCmmPrim, so that it becomes possible to insert custom logic when doing codegen for ByteArray#-related primops, and to explore and evaluate different allocation schemes: inside or outside the linear memory.

Also, text should come along with bytestring, since the two are quite tightly coupled: if radical changes happen in bytestring, they should happen in text too.

I propose a roadmap for introducing bytestring support as follows:

  1. Introduce bytestring as a boot lib, and implement unsupported cbits functions as runtime built-in functions in Asterius.Builtins.
  2. At this point, bytestring should work out of the box (provided the blobs don't blow up the linear memory). Introduce regression tests.
  3. Implement codegen patches in the ghc fork, and evaluate different allocation methods. The current virtual address space already permits logical pointers which point outside of the linear memory.
  4. When the dust settles for bytestring, bring in text too.


TerrorJack commented on May 20, 2024

Before introducing bytestring, I need to improve the block allocator a little bit. Currently, we have allocBlock* functions available in Asterius.Builtins. They are used a lot by ByteArray#-related primops, but the implementation is, well, pretty naive: they always trigger a grow_memory wasm instruction. It doesn't have to be that slow.


TerrorJack commented on May 20, 2024

Another required feature: bulk memory operators, e.g. memcpy, memmove, memset, memcmp. They are used by bytestring in two ways:

  • As foreign imported C standard library functions in Data.ByteString.Internal
  • As Cmm primops called by routines in PrimOps.cmm in rts.

Wasm has a bulk memory operations proposal, and the V8 dudes are on it, but we'll stick to something which works right now and switch later when V8 stages those instructions.
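
As a rough idea of what "something which works right now" could look like on the JavaScript side, here is a sketch of the four operations over a Uint8Array view of the linear memory. It is an assumption about one possible stopgap, not the implementation actually used in Asterius; the function signatures (taking the view as the first argument) are made up for the example. It relies only on standard typed-array methods.

```typescript
// mem is assumed to be a Uint8Array view over the wasm linear memory.

// memcpy: copy n bytes from src to dst (regions assumed not to overlap).
function memcpy(mem: Uint8Array, dst: number, src: number, n: number): void {
  mem.set(mem.subarray(src, src + n), dst);
}

// memmove: like memcpy, but explicitly safe for overlapping regions;
// copyWithin behaves as if the source were copied to a temporary first.
function memmove(mem: Uint8Array, dst: number, src: number, n: number): void {
  mem.copyWithin(dst, src, src + n);
}

// memset: fill n bytes at dst with the low 8 bits of value.
function memset(mem: Uint8Array, dst: number, value: number, n: number): void {
  mem.fill(value & 0xff, dst, dst + n);
}

// memcmp: negative, zero, or positive, comparing bytes as unsigned values.
function memcmp(mem: Uint8Array, a: number, b: number, n: number): number {
  for (let i = 0; i < n; i++) {
    const diff = mem[a + i] - mem[b + i];
    if (diff !== 0) return diff;
  }
  return 0;
}
```

Helpers like these could be exposed to the wasm module as imports and swapped for the native instructions later without changing callers.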


TerrorJack commented on May 20, 2024

Just identified a bug that ruined my weekend: the MAYBE_GC* macros in Cmm.h are used in a few places in PrimOps.cmm (e.g. stg_newPinnedByteArrayzh). The macro uses CHECK_GC to inspect the bdescr of the current nursery and yields to the GC if it decides there's no space. However, CHECK_GC doesn't make any sense with our custom heap.

Well, the fix is annoyingly simple: disable CHECK_GC entirely. We don't support handling heap overflow exceptions in Haskell yet; if the storage manager decides there's a heap overflow, it's a JavaScript error.


TerrorJack commented on May 20, 2024

bytestring is working in b1f3c0c!

  • Patching of the bytestring package is very minimal; most of the API is preserved and should work.
  • Anything involving Handle won't work; such code compiles and links fine but will crash at runtime.
  • We haven't implemented marshaling between Haskell ByteString and JavaScript ArrayBuffer yet.


TerrorJack commented on May 20, 2024

There should be a few clean-up commits that finish the following tasks before we sync back to master:

  • Improve JavaScript code generation and get rid of the weird --asterius-instance-callback flag. I thought about modularizing the runtime & generated stub code with ES6 modules, but right now we'll just implement another flag that allows users to supply another .js file which handles how AsteriusInstance is initialized and used.
  • Implement API for ArrayBuffer -> ByteString; copying is mandatory at the moment.
  • Implement API for ByteString -> ArrayBuffer; there can be a safe version which copies the region, and an unsafe version which does zero copying and simply aliases the region. Users who don't care about referential transparency may use the unsafe version.
  • Fix some out-of-sync entries in the docs.
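
The difference between the copying and aliasing variants can be illustrated on the JavaScript side as below. This is only a sketch under assumptions: the function names are hypothetical, a ByteString's payload is modeled as an (address, length) pair into the linear memory's ArrayBuffer, and the unsafe variant returns a Uint8Array view, since a standalone ArrayBuffer cannot alias part of another buffer.

```typescript
// memory is assumed to be the ArrayBuffer behind the wasm linear memory;
// addr and len describe where a ByteString's bytes live inside it.

// Safe ByteString -> ArrayBuffer: copy the region into a fresh buffer.
function copyToArrayBuffer(memory: ArrayBuffer, addr: number, len: number): ArrayBuffer {
  return memory.slice(addr, addr + len);
}

// Unsafe variant: zero-copy view that aliases the region, so later writes
// to the linear memory are visible through it (and vice versa).
function unsafeViewRegion(memory: ArrayBuffer, addr: number, len: number): Uint8Array {
  return new Uint8Array(memory, addr, len);
}

// ArrayBuffer -> ByteString: copy the bytes into linear memory at addr,
// which is assumed to have been obtained from the allocator beforehand.
function copyFromArrayBuffer(memory: ArrayBuffer, addr: number, src: ArrayBuffer): void {
  new Uint8Array(memory, addr, src.byteLength).set(new Uint8Array(src));
}
```

One practical caveat for the aliasing variant: when a WebAssembly memory grows, its previous ArrayBuffer is detached, so any view created this way stops being usable after a growth step.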

Late night espressos are efficient... alas.


TerrorJack avatar TerrorJack commented on May 20, 2024

bytestring support delivered to master via 39bc959, along with a lot of critical bugfixes & improvements.

