You say your in-memory objects look like this: <div class="snippet-clipboard-conte

potential performance issue with packed structs about fatcache HOT 4 CLOSED

twitter commented on May 25, 2024

potential performance issue with packed structs

from fatcache.

Comments (4)

manjuraj commented on May 25, 2024

I guess, based on what you are suggesting, I can change the index definition to the following:

struct itemx {
  STAILQ_ENTRY(itemx) tqe;    /* link in index / free q */
  uint32_t            sid;    /* owner slab id */
  uint32_t            offset; /* item offset from owner slab base */
  uint64_t            cas;    /* cas */
  uint8_t             md[20]; /* sha1 message digest */
} __attribute__ ((__packed__));

This should avoid the unaligned access for cas. Note that all the index entries (struct itemx) are allocated on startup time in a contiguous machine word aligned memory location

See: https://github.com/twitter/fatcache/blob/master/src/fc_itemx.c#L113
and
See: https://github.com/twitter/fatcache/blob/master/src/fc_itemx.c#L120-L122

This means that every even entry is machine word aligned and every odd entry is not.

from fatcache.

leifwalsh commented on May 25, 2024

The problem is not with alignment but rather with cache lines. If you allocate an array of oddly-sized packed structs, some will end up crossing the boundary of a cache line (which is usually 64 bytes). If the boundary falls inside cas (or anything else you do atomic ops on), you'll see issues.

This struct is 44 bytes, and cas starts at the 16th byte offset within the struct. So for any n such that (16 + 44_n) % 64 > 56, the nth item in your array will have cas spanning two cache lines (it turns out that such values of n are 1 + 16_i for all integers i).

If you remove the __packed__ attribute, the compiler should pad out the struct to be aligned (in the array) to the width of the largest member (I think), and so the compiler will pad it out to 48 bytes and make it 8-byte aligned. Now, as long as cas is 8-byte aligned within the struct, you won't have this problem.

This is a struct declaration that should fix the problem if you ever run in to it:

struct itemx {
  STAILQ_ENTRY(itemx) tqe;    /* link in index / free q */
  uint32_t            sid;    /* owner slab id */
  uint32_t            offset; /* item offset from owner slab base */
  uint64_t            cas;    /* cas */
  uint8_t             md[20]; /* sha1 message digest */
};

from fatcache.

manjuraj commented on May 25, 2024

Using unpacked struct is unwise here, because it would reduced the number of items fatcache can index.

Here is the relevant section from README.md that talks about this:

The index entry (struct itemx) on a 64-bit system is 44 bytes in size. It is possible to further reduce index entry size to 28 bytes, if CAS is unsupported, MD5 hashing is used, and the next pointer is reduced to 4 bytes.

At this point, it is instructive to consider the relative size of fatcache's index and the on-disk data. With a 44 byte index entry, an index consuming 44 MB of memory can address 1M objects. If the average object size is 1 KB, then a 44 MB index can address 1 GB of on-disk storage - a 23x memory overcommit. If the average object size is 500 bytes, then a 44 MB index can address 500 MB of SSD - a 11x memory overcommit. Index size and object size relate in this way to determine the addressable capacity of the SSD.

from fatcache.

leifwalsh commented on May 25, 2024

Sure. You gain 9% in addressable disk for the same memory by keeping structs packed, that's pretty good. I suggest you just keep this in your back pocket for if you ever see weird performance problems.

Sent from my iPhone

On Feb 12, 2013, at 20:11, Manju Rajashekhar [email protected] wrote:

Using unpacked struct is unwise here, because it would reduced the number of items fatcache can index.

Here is the relevant section from README.md that talks about this:

The index entry (struct itemx) on a 64-bit system is 44 bytes in size. It is possible to further reduce index entry size to 28 bytes, if CAS is unsupported, MD5 hashing is used, and the next pointer is reduced to 4 bytes.

At this point, it is instructive to consider the relative size of fatcache's index and the on-disk data. With a 44 byte index entry, an index consuming 44 MB of memory can address 1M objects. If the average object size is 1 KB, then a 44 MB index can address 1 GB of on-disk storage - a 23x memory overcommit. If the average object size is 500 bytes, then a 44 MB index can address 500 MB of SSD - a 11x memory overcommit. Index size and object size relate in this way to determine the addressable capacity of the SSD.

—
Reply to this email directly or view it on GitHub.

from fatcache.

potential performance issue with packed structs about fatcache HOT 4 CLOSED

Comments (4)

Related Issues (17)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent