Git Product home page Git Product logo

rulloc's Introduction

Rulloc

Rulloc (Rust Allocator) is a general purpose memory allocator written in Rust. It implements std::alloc::Allocator and std::alloc::GlobalAlloc traits. All memory is requested from the kernel using mmap syscalls on Linux and VirtualAlloc on Windows. You can run the examples as follows:

cargo run --example standalone
cargo run --example global
cargo run --example buckets
cargo run --example aligned

Run the tests:

cargo test

Run with Miri:

cargo miri test
cargo miri run --example standalone
cargo miri run --example buckets
cargo miri run --example aligned

Global allocator example doesn't work with Miri, see examples/global.rs.

Implementation

I started this project for learning purposes, and I know the best way to make sure you understand something is explaining it to others. So there's plenty of documentation and ASCII diagrams throughout the codebase if you're interested in how memory allocators work internally. Start by reading src/allocator.rs for a quick overview and you'll be redirected to the rest of files through intra-doc links.

If you don't want to scroll through hundreds of lines of code, this is how a memory block looks like:

+----------------------------+
| pointer to next block      |   <------+
+----------------------------+          |
| pointer to prev block      |          |
+----------------------------+          |
| pointer to block region    |          |
+----------------------------+          | Block Header
| block size                 |          |
+----------------------------+          |
| is free flag (1 byte)      |          |
+----------------------------+          |
| padding (struct alignment) |   <------+
+----------------------------+
|       Block content        |   <------+
|            ...             |          |
|            ...             |          | Addressable content
|            ...             |          |
|            ...             |   <------+
+----------------------------+

All blocks belong to a memory region, which is a contiguous chunk of memory that can store multiple pages. In other words, the size of each region is a multiple of the virtual memory page size on the current platform, and each region contains a linked list of blocks:

+--------+--------------------------------------------------+
|        | +-------+    +-------+    +-------+    +-------+ |
| Region | | Block | -> | Block | -> | Block | -> | Block | |
|        | +-------+    +-------+    +-------+    +-------+ |
+--------+--------------------------------------------------+

The allocator can be configured at compile time with multiple allocation buckets of different sizes in order to reduce fragmentation. Each bucket stores a linked list of memory regions and a free list, which is basically a linked list of only free blocks:

                   Next free block in the free list            Next free block
               +--------------------------------------+   +-----------------------+
               |                                      |   |                       |
+--------+-----|------------------+      +--------+---|---|-----------------------|-----+
|        | +---|---+    +-------+ |      |        | +-|---|-+    +-------+    +---|---+ |
| Region | | Free  | -> | Block | | ---> | Region | | Free  | -> | Block | -> | Free  | |
|        | +-------+    +-------+ |      |        | +-------+    +-------+    +-------+ |
+--------+------------------------+      ---------+-------------------------------------+

And finally, to put it all together, the allocator is an array of buckets:

                                 Next Free Block                    Next Free Block
                      +------------------------------------+   +-----------------------+
                      |                                    |   |                       |
     +--------+-------|----------------+      +--------+---|---|-----------------------|-----+
     |        | +-----|-+    +-------+ |      |        | +-|---|-+    +-------+    +---|---+ |
0 -> | Region | | Free  | -> | Block | | ---> | Region | | Free  | -> | Block | -> | Free  | |
     |        | +-------+    +-------+ |      |        | +-------+    +-------+    +-------+ |
     +--------+------------------------+      +--------+-------------------------------------+

                                              Next Free Block
                                 +----------------------------------------+
                                 |                                        |
     +--------+------------------|-----+      +--------+------------------|------------------+
     |        | +-------+    +---|---+ |      |        | +-------+    +---|---+    +-------+ |
1 -> | Region | | Block | -> | Free  | | ---> | Region | | Block | -> | Free  | -> | Block | |
     |        | +-------+    +-------+ |      |        | +-------+    +-------+    +-------+ |
     +--------+------------------------+      +--------+-------------------------------------+

..............................................................................................

                                        Next Free Block
                                 +---------------------------+
                                 |                           |
     +--------+------------------|-----+      +--------+-----|-------------------------------+
     |        | +-------+    +---|---+ |      |        | +---|---+    +-------+    +-------+ |
N -> | Region | | Block | -> | Free  | | ---> | Region | | Free  | -> | Block | -> | Block | |
     |        | +-------+    +-------+ |      |        | +-------+    +-------+    +-------+ |
     +--------+------------------------+      +--------+-------------------------------------+

All these data structures are a little bit more complicated than that, but you'll have to read the source code for further details.

TODO

  • Multithreading. Right now the allocator stands behind a mutex, which is not the most efficient way of handling multiple threads. We also need better tests, maybe use loom? See src/allocator.rs for optimization ideas and low level details. Here's a quick summary:

    1. One mutex per bucket instead of one global mutex.
    2. Fixed number of allocators and load distribution between threads.
    3. One allocator per thread.
  • Reduce number of system calls for allocations and deallocations. The allocator requests just enough memory from the OS to store the user content plus headers, but that could result in calling mmap (or equivalent on other platforms) too many times. In order to avoid this, we could figure out how many extra pages to request in general, for example if we need 1 page to store our stuff plus user's stuff, we can request 10 pages instead and avoid kernel code execution for further allocations.

  • Study if it is possible to reduce the size of our headers. On 64 bit machines the size of each block header is 40 bytes and the size of each region header is 48 bytes. Region headers are not so important, but block headers are important for small allocations.

  • Memory alignment optimization. The current implementation adds padding to the beginning of the block content in order to align pointers. This is only done when the alignment constraint is greater than the pointer size on the current machine, and it's not an issue for small alignments such as 16 or 32, but it might waste too much space for alignments of 512 and beyond. One possible starting point for this issue: memory regions are already aligned to page size (usually 4096), so we could probably take advantage of that somehow.

  • Free blocks searching algorithm optimization. We do have a free list, which makes finding free blocks easier, and we also have buckets of fixed sizes, so this isn't particularly unoptimized. Except for sequential large allocations of different sizes. Whenever we receive an allocation request that we can't fit in any bucket because it's too large, we use the Dynamic Bucket, which is just a fancy name for a bucket that doesn't care about sizes and can store blocks of 1KB mixed with blocks of 1GB. For that bucket at least, we could implement a heap instead of a normal linked list for searching free blocks, or maybe just store a pointer to the biggest block so that we know immediately if we need to bother the kernel or not.

  • What should we do when mmap or VirtualAlloc fails? Panic? The allocator user doesn't know anything about the failure because std::alloc::Allocator doesn't have a return type for deallocations. Panicking doesn't seem the way to go because the program can continue normally, but how and when are we returning the memory region to the kernel then?

  • Remove std::alloc::GlobalAlloc implementation when std::alloc::Allocator becomes stable.

rulloc's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

rulloc's Issues

Contributing

Looking for newcomer support to this repo and wondering what preferences you had on precommit requirements, tests, etc.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.