
Comments (4)

manjuraj commented on May 25, 2024

I guess, based on what you are suggesting, I can change the index definition to the following:

struct itemx {
  STAILQ_ENTRY(itemx) tqe;    /* link in index / free q */
  uint32_t            sid;    /* owner slab id */
  uint32_t            offset; /* item offset from owner slab base */
  uint64_t            cas;    /* cas */
  uint8_t             md[20]; /* sha1 message digest */
} __attribute__ ((__packed__));

This should avoid the unaligned access for cas. Note that all the index entries (struct itemx) are allocated at startup in one contiguous, machine-word-aligned block of memory.

See: https://github.com/twitter/fatcache/blob/master/src/fc_itemx.c#L113
and
See: https://github.com/twitter/fatcache/blob/master/src/fc_itemx.c#L120-L122

This means that every even entry is machine word aligned and every odd entry is not.
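The even/odd pattern follows from the entry size: 44 is a multiple of 4 but not of 8. Here is a minimal sketch of that arithmetic, assuming a 64-bit target where the STAILQ_ENTRY link is a single 8-byte pointer and the array base is 8-byte aligned. The constants and helper names are illustrative and are not fatcache code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout of the packed struct itemx on a 64-bit target:
 * 8 (link) + 4 (sid) + 4 (offset) + 8 (cas) + 20 (md) = 44 bytes. */
#define ITEMX_PACKED_SIZE 44u
#define ITEMX_CAS_OFFSET  16u
#define WORD_SIZE          8u

/* Byte offset of entry n's cas field from the (word-aligned) array base. */
size_t cas_offset_of_entry(size_t n)
{
    return n * ITEMX_PACKED_SIZE + ITEMX_CAS_OFFSET;
}

/* True if entry n's cas field sits on an 8-byte boundary. */
bool cas_is_word_aligned(size_t n)
{
    return cas_offset_of_entry(n) % WORD_SIZE == 0;
}
```

Under these assumptions, cas_is_word_aligned(n) holds for even n and fails for odd n, where cas lands on a 4-byte boundary instead.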

from fatcache.

leifwalsh commented on May 25, 2024

The problem is not with alignment but rather with cache lines. If you allocate an array of oddly-sized packed structs, some will end up crossing the boundary of a cache line (which is usually 64 bytes). If the boundary falls inside cas (or anything else you do atomic ops on), you'll see issues.

This struct is 44 bytes, and cas starts at byte offset 16 within the struct. So, assuming the array starts on a cache line boundary, for any n such that (16 + 44*n) % 64 > 56, the nth item in your array will have cas spanning two cache lines (it turns out that such values of n are 1 + 16*i for all non-negative integers i).
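The cache line arithmetic can be checked with a short sketch. It assumes 64-byte cache lines, a cache-line-aligned array base, and the 44-byte packed layout with cas at offset 16. The macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64u  /* assumed cache line size in bytes */
#define ENTRY_SIZE 44u  /* packed struct itemx on a 64-bit target */
#define CAS_OFFSET 16u  /* offset of cas within the entry */
#define CAS_SIZE    8u  /* sizeof(uint64_t) */

/* True if the cas field of entry n straddles a cache line boundary,
 * assuming the array base is cache-line aligned. */
bool cas_crosses_cache_line(size_t n)
{
    size_t first = n * ENTRY_SIZE + CAS_OFFSET; /* first byte of cas */
    size_t last = first + CAS_SIZE - 1;         /* last byte of cas */
    return first / CACHE_LINE != last / CACHE_LINE;
}

/* Count the entries among the first `count` whose cas crosses a line. */
size_t count_crossing_entries(size_t count)
{
    size_t crossing = 0;
    for (size_t n = 0; n < count; n++) {
        if (cas_crosses_cache_line(n)) {
            crossing++;
        }
    }
    return crossing;
}
```

The layout repeats every 16 entries (16 * 44 = 704 = 11 * 64), so with these numbers one entry in every 16 has a cas field split across two lines.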

If you remove the __packed__ attribute, the compiler should pad the struct so that its size is a multiple of the alignment of its most strictly aligned member (I think), so here it will pad it out to 48 bytes and make every array element 8-byte aligned. Now, as long as cas is 8-byte aligned within the struct, you won't have this problem.

This is a struct declaration that should fix the problem if you ever run into it:

struct itemx {
  STAILQ_ENTRY(itemx) tqe;    /* link in index / free q */
  uint32_t            sid;    /* owner slab id */
  uint32_t            offset; /* item offset from owner slab base */
  uint64_t            cas;    /* cas */
  uint8_t             md[20]; /* sha1 message digest */
};
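To compare the two layouts on a given compiler, a sketch like the following prints the sizes and the cas offset. It is an illustration rather than fatcache code: a plain pointer stands in for STAILQ_ENTRY(itemx), the struct names are hypothetical, and the 44-byte and 48-byte figures assume a 64-bit target with GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Packed variant; `next` is a stand-in for the STAILQ link,
 * which holds a single pointer to the next element. */
struct itemx_packed {
    struct itemx_packed *next;   /* stand-in for STAILQ_ENTRY(itemx) */
    uint32_t            sid;     /* owner slab id */
    uint32_t            offset;  /* item offset from owner slab base */
    uint64_t            cas;     /* cas */
    uint8_t             md[20];  /* sha1 message digest */
} __attribute__ ((__packed__));

/* Same members without the packed attribute; the compiler adds tail padding. */
struct itemx_padded {
    struct itemx_padded *next;   /* stand-in for STAILQ_ENTRY(itemx) */
    uint32_t            sid;
    uint32_t            offset;
    uint64_t            cas;
    uint8_t             md[20];
};

/* Print size, alignment and cas offset for both variants. */
void print_layouts(void)
{
    printf("packed: size=%zu align=%zu cas offset=%zu\n",
           sizeof(struct itemx_packed), (size_t)_Alignof(struct itemx_packed),
           offsetof(struct itemx_packed, cas));
    printf("padded: size=%zu align=%zu cas offset=%zu\n",
           sizeof(struct itemx_padded), (size_t)_Alignof(struct itemx_padded),
           offsetof(struct itemx_padded, cas));
}
```

Because the padded size is a multiple of 8 and cas sits at an 8-byte-aligned offset, an 8-byte field in an 8-byte-aligned array element cannot straddle a 64-byte boundary.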


manjuraj commented on May 25, 2024

Using an unpacked struct is unwise here, because it would reduce the number of items fatcache can index.

Here is the relevant section from README.md that talks about this:


The index entry (struct itemx) on a 64-bit system is 44 bytes in size. It is possible to further reduce index entry size to 28 bytes, if CAS is unsupported, MD5 hashing is used, and the next pointer is reduced to 4 bytes.

At this point, it is instructive to consider the relative size of fatcache's index and the on-disk data. With a 44 byte index entry, an index consuming 44 MB of memory can address 1M objects. If the average object size is 1 KB, then a 44 MB index can address 1 GB of on-disk storage - a 23x memory overcommit. If the average object size is 500 bytes, then a 44 MB index can address 500 MB of SSD - a 11x memory overcommit. Index size and object size relate in this way to determine the addressable capacity of the SSD.
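The overcommit figures in the README excerpt reduce to average object size divided by index entry size. A small sketch of that calculation follows, with hypothetical function names. The 23x figure corresponds to reading 1 KB as 1024 bytes (1024 / 44 is about 23.3).

```c
#include <stdint.h>

/* Bytes of on-disk data addressable by an index of `index_bytes` bytes,
 * given the per-entry size and the average object size. */
uint64_t addressable_bytes(uint64_t index_bytes, uint64_t entry_size,
                           uint64_t avg_object_size)
{
    uint64_t entries = index_bytes / entry_size; /* objects the index can hold */
    return entries * avg_object_size;
}

/* Memory overcommit: addressable disk bytes per byte of index memory. */
double overcommit_ratio(uint64_t entry_size, uint64_t avg_object_size)
{
    return (double)avg_object_size / (double)entry_size;
}
```

For example, overcommit_ratio(44, 500) is roughly 11.4, matching the 11x quoted for 500-byte objects.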


leifwalsh commented on May 25, 2024

Sure. You gain 9% in addressable disk for the same memory by keeping the structs packed, which is pretty good. I suggest you just keep this in your back pocket in case you ever see weird performance problems.
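The 9% figure is the ratio of the padded entry size to the packed one: 48 / 44 is about 1.09. As a sketch, with an illustrative helper name and the sizes assumed from the 64-bit layout discussed above:

```c
#include <stddef.h>

/* Percentage of extra entries that fit in the same memory when each
 * entry takes `packed_size` bytes instead of `padded_size` bytes. */
double packing_gain_percent(size_t packed_size, size_t padded_size)
{
    return ((double)padded_size / (double)packed_size - 1.0) * 100.0;
}
```

Calling packing_gain_percent(44, 48) gives a little over 9.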

(Email reply to Manju Rajashekhar's comment of Feb 12, 2013, 20:11.)
