
Comments (3)

jamesbornholt commented on May 24, 2024

#69 improved this by replacing one ListObjects with a HeadObject. What's left to do is to use cached state as a hint to optimize the requests here: if we already suspect something is a file (or directory), we can start with the HeadObject (or ListObjects) and only do the other request if it fails.
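To make the idea concrete, here is a minimal sketch of hint-driven request ordering. It is not Mountpoint's actual lookup code: `Kind`, `Request`, `request_order`, and `lookup` are hypothetical names, and the `send` closure stands in for a real S3 call that returns whether anything was found. It also ignores the case where a key exists both as an object and as a prefix, which a real implementation would have to handle.

```rust
/// What the cache last believed a name to be (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    File,
    Directory,
}

/// The two S3 request types a lookup can issue.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Request {
    HeadObject,
    ListObjects,
}

/// Order in which to try the requests, given an optional hint from cached state.
fn request_order(hint: Option<Kind>) -> [Request; 2] {
    match hint {
        // A suspected directory is checked with ListObjects first.
        Some(Kind::Directory) => [Request::ListObjects, Request::HeadObject],
        // A suspected file, or no hint at all, starts with HeadObject.
        Some(Kind::File) | None => [Request::HeadObject, Request::ListObjects],
    }
}

/// Tries the requests in hinted order and stops at the first one that succeeds.
/// `send` is a stand-in for the S3 client: it returns true if the request found
/// something. Returns the discovered kind (if any) and the number of requests sent.
fn lookup<F>(hint: Option<Kind>, mut send: F) -> (Option<Kind>, usize)
where
    F: FnMut(Request) -> bool,
{
    let mut sent = 0;
    for request in request_order(hint) {
        sent += 1;
        if send(request) {
            let kind = match request {
                Request::HeadObject => Kind::File,
                Request::ListObjects => Kind::Directory,
            };
            return (Some(kind), sent);
        }
    }
    (None, sent)
}
```

With a correct hint the lookup costs one request; with a wrong hint or a missing key it costs two, the same as an unhinted sequential lookup.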

from mountpoint-s3.

plundra commented on May 24, 2024

Not sure if applicable, I'm not familiar with internals of FUSE, however:

In mount.fuse(8) I see two options:

entry_timeout=T
       The timeout in seconds for which name lookups will be cached. The default is 1.0 second. For all the timeout options, it is possible to give fractions of a second as well (e.g. entry_timeout=2.8)
attr_timeout=T
       The timeout in seconds for which file/directory attributes are cached.  The default is 1.0 second.

Both sound excellent. Could mount-s3 set/use these?

We have a very static object structure, so caching would be great and work very well.
I did some measurements, and in one test 30-50% of the requests were list-type requests, which are costly in both round-trip time and money :-)

(Thanks for a very exciting and promising project btw!)


jamesbornholt commented on May 24, 2024

Yeah, we actually do set those timeouts, here:

impl Default for CacheConfig {
    fn default() -> Self {
        // We want to do as little caching as possible, but Linux filesystems behave badly when the
        // TTL is exactly zero. For example, results from `readdir` will expire immediately, and so
        // the kernel will immediately re-lookup every entry returned from `readdir`. So we apply
        // small non-zero TTLs. The goal is to be small enough that the impact on consistency is
        // minimal, but large enough that a single cache miss doesn't cause a cascading effect where
        // every other cache entry expires by the time that cache miss is serviced. We also apply a
        // longer TTL for directories, which are both less likely to change on the S3 side and
        // checked more often (for directory permissions checks).
        let file_ttl = Duration::from_millis(100);
        let dir_ttl = Duration::from_millis(1000);
        Self { file_ttl, dir_ttl }
    }
}

We set them very low because we want to preserve S3's strong consistency model by default. But we know for some workloads, the bucket doesn't change very much/at all, and so we could cache that metadata much longer. We're tracking that as a roadmap item in #255.

This issue is tracking a smaller improvement we could make: if we've listed a file/directory previously, then when the cache expires we could speculate that it's still a file/directory when we try to look it up from S3 again. I think with the way that lookup works right now, that would allow us to skip some HeadObject requests, but probably not any ListObjects requests.
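One way to picture the speculation described above is a cache entry that still reports its last known kind as a hint after its TTL has run out, instead of being treated as absent. The sketch below is an assumption-laden illustration, not the project's code: `CachedKind`, `CacheLookup`, `Entry`, and `check` are hypothetical names, and only the 100 ms and 1000 ms TTLs come from the `CacheConfig` defaults quoted earlier.

```rust
use std::time::{Duration, Instant};

/// Kind recorded when the entry was last listed (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CachedKind {
    File,
    Directory,
}

/// Result of consulting the cache in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum CacheLookup {
    /// Still within its TTL: can be served without contacting S3.
    Fresh(CachedKind),
    /// Expired: must be re-validated against S3, but the old kind suggests
    /// which request to try first.
    StaleHint(CachedKind),
}

/// A cached entry with the time it was stored (hypothetical type).
struct Entry {
    kind: CachedKind,
    stored_at: Instant,
}

impl Entry {
    /// TTLs mirror the defaults quoted above: 100 ms for files, 1000 ms for directories.
    fn ttl(&self) -> Duration {
        match self.kind {
            CachedKind::File => Duration::from_millis(100),
            CachedKind::Directory => Duration::from_millis(1000),
        }
    }

    /// Classifies the entry at time `now` as fresh or as a stale hint.
    fn check(&self, now: Instant) -> CacheLookup {
        // saturating_duration_since avoids a panic if `now` is earlier than `stored_at`.
        let age = now.saturating_duration_since(self.stored_at);
        if age < self.ttl() {
            CacheLookup::Fresh(self.kind)
        } else {
            CacheLookup::StaleHint(self.kind)
        }
    }
}
```

A `StaleHint` never answers the lookup by itself, so consistency is unchanged; it only influences which S3 request is tried first.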

