mountpoint-s3's People

Contributors

ahmarsuhail, amazon-auto, andrewatamzn, arjvik, arsh, dannycjones, dependabot[bot], dprestegard, eltociear, ericjheinz, eslrahc-swa, frostyslav, indianwhocodes, jamesbornholt, jb2cool, jchorl, jiripospisil, jorajeev, lesserwhirls, lukenels, mheshwg, monthonk, passaro, paulomigalmeida, quinnypig, rbowen, sauraank, shtripat, sum12, vladem

mountpoint-s3's Issues

Add file connector version to user agent after 's3-file-connector'

We plumbed in the mechanism to set the user agent in #60 and set the prefix to statically include s3-file-connector. We added additional version information in #75.

We should update the user agent to include the version information: the major, minor, and patch version as well as the commit hash.
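
A minimal sketch of what that could look like, assuming the crate version comes from Cargo and the commit hash is exported by a build script as an environment variable (the variable name here is made up):

// Build the user agent prefix, e.g. "s3-file-connector/0.1.0-1a2b3c4".
// CARGO_PKG_VERSION is set by Cargo; S3FC_COMMIT_HASH is a hypothetical
// variable a build script would need to export.
fn user_agent_prefix() -> String {
    format!(
        "s3-file-connector/{}-{}",
        env!("CARGO_PKG_VERSION"),
        option_env!("S3FC_COMMIT_HASH").unwrap_or("unknown"),
    )
}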

Implement file mtime, ctime, atime

Today, all files have an 'mtime' or last modified time of 1st January 1970.

We should use the creation date of the object (LastModified) for the 'mtime' (last modified time). We may also want to use this value for 'ctime' (inode change time) and 'atime' (access time), but we're not sure about this yet.
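
A minimal sketch of the mapping, assuming the client surfaces LastModified as Unix epoch seconds (that representation is an assumption for illustration):

use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Convert an object's LastModified timestamp (assumed here to be Unix epoch
// seconds) into the SystemTime we would report for mtime/ctime/atime.
fn object_times(last_modified_epoch_secs: u64) -> (SystemTime, SystemTime, SystemTime) {
    let t = UNIX_EPOCH + Duration::from_secs(last_modified_epoch_secs);
    (t, t, t) // (mtime, ctime, atime)
}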

Basic metrics

We need some basic support for metrics: mostly throughput, time-to-first-byte latency, and IOPS, I think. Maybe also FUSE request concurrency (i.e., queue depth). Not having these has been really annoying for me when debugging performance issues.

It would also be nice to plumb in the metrics the CRT S3 client has about connection pool size and usage.

Ideally we'd do all this through tracing but its metrics support isn't great, so we might end up doing something different.
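
As a sketch of the kind of counters we'd want (names and structure are placeholders, not a design), even a few atomics would already help:

use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

// Very rough aggregate metrics: bytes served, request count, and total
// time-to-first-byte, from which we can derive throughput and mean TTFB.
#[derive(Default)]
struct Metrics {
    bytes_read: AtomicU64,
    requests: AtomicU64,
    ttfb_micros: AtomicU64,
}

impl Metrics {
    fn record_read(&self, bytes: u64, ttfb: Duration) {
        self.bytes_read.fetch_add(bytes, Ordering::Relaxed);
        self.requests.fetch_add(1, Ordering::Relaxed);
        self.ttfb_micros
            .fetch_add(ttfb.as_micros() as u64, Ordering::Relaxed);
    }
}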

Send logs to file

Currently we have no way to log to a file—logs are always printed to stderr. This is particularly annoying for metrics, which are spammed to the console every few seconds. We should have a way to redirect logs to a file. This will be more important once we do #30.
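
One possible approach, sketched with the tracing-appender and tracing-subscriber crates (the log directory and file prefix are placeholders, and this isn't a decision to use those crates):

// Rough sketch: write log events to an hourly-rotated file instead of stderr.
fn init_file_logging() -> tracing_appender::non_blocking::WorkerGuard {
    let file_appender =
        tracing_appender::rolling::hourly("/var/log/s3-file-connector", "s3fc.log");
    let (writer, guard) = tracing_appender::non_blocking(file_appender);
    tracing_subscriber::fmt()
        .with_writer(writer)
        .with_ansi(false)
        .init();
    // Keep the guard alive for the lifetime of the process so buffered
    // log lines are flushed on shutdown.
    guard
}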

Auto-configure Network Interface Card (NIC) throughput

Remove the need for customers to manually configure Mountpoint for Amazon S3 to achieve high throughput for the instance type they are using

A few of us have now misconfigured the connector by not specifying a target throughput on EC2 instances with large NICs. This seems like something we should be able to autodiscover and just set correctly by default.

The CRT S3 client has the beginnings of this interface (awslabs/aws-c-s3#70), but as far as I can tell it needs to be manually invoked and only carries data for c5n.18xlarge right now.
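
As a stopgap, we could read the instance type from IMDS and look it up in a small static table of known NIC bandwidths. A sketch of the lookup (the entries are just examples; c5n.18xlarge is 100 Gbps and p4d.24xlarge is 400 Gbps, but a real table would need to cover every family or fall back sensibly):

// Map an EC2 instance type to a default target throughput in Gbps.
fn default_target_throughput_gbps(instance_type: &str) -> Option<f64> {
    match instance_type {
        "c5n.18xlarge" => Some(100.0),
        "p4d.24xlarge" => Some(400.0),
        _ => None, // unknown: fall back to a conservative default
    }
}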

FUSE tests intermittently failing due to missing tmp directory

Getting errors like the following, which go away after retry.

 fusermount3: user has no write access to mountpoint /tmp/.tmpcpODwC
Error: Failed to create FUSE session

Caused by:
    No such file or directory (os error 2)

Example failed job steps:

Hard-to-diagnose failures when bucket is in the wrong region

I tried to mount a bucket in us-west-2 but we currently hardcode us-east-1 as the region (which obviously we'll change).

I got difficult-to-diagnose errors like this:

2022-09-10T03:01:17.279560Z ERROR s3_client::s3_client::list_objects_v2: ListObjectsV2 on_list_finished_callback error error_code=14343
2022-09-10T03:01:17.279829Z ERROR s3_file_connector::fs: ListObjectsV2 failed err="on_list_finish callback error: 14343"

What's actually going on is that S3 sends back a 301:

2022-09-10T03:00:27.672712Z TRACE awscrt::S3MetaRequest: response body:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><Endpoint>bornholt-test-bucket.s3-us-west-2.amazonaws.com</Endpoint><Bucket>bornholt-test-bucket</Bucket><RequestId>DJD1JHZN4GG0QK94</RequestId><HostId>X6PE0li9I1yrUCbezGoBKSXacLmUvhrZJ7vcVsvvhP9bz2j8W8BJB3tuh06mIDpxc5vfN6E/Eao=</HostId></Error>

and the CRT S3 client apparently doesn't follow 301s.

We probably want to check the bucket at mount time via HeadObject and make sure we're in the right region for it.

Support configuring storage class for new objects/files

Creating new files in S3 Intelligent-Tiering as a default, with an option to write to S3 Standard, S3 Standard-Infrequent Access, S3 Glacier Instant Retrieval, and S3 One Zone-Infrequent Access storage classes instead

We should be able to configure the connector to support creating objects in the immediate retrieval storage classes besides S3 Standard:

  • Intelligent-Tiering
  • Standard-Infrequent Access
  • One Zone-Infrequent Access
  • Glacier Instant Retrieval
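
Mapped to the x-amz-storage-class request header values, that set would look roughly like the sketch below (the enum and method names are illustrative, not an API proposal):

// Storage classes we could allow for newly created objects, and the
// corresponding x-amz-storage-class header values.
enum StorageClass {
    Standard,
    IntelligentTiering,
    StandardIa,
    OneZoneIa,
    GlacierInstantRetrieval,
}

impl StorageClass {
    fn as_header_value(&self) -> &'static str {
        match self {
            StorageClass::Standard => "STANDARD",
            StorageClass::IntelligentTiering => "INTELLIGENT_TIERING",
            StorageClass::StandardIa => "STANDARD_IA",
            StorageClass::OneZoneIa => "ONEZONE_IA",
            StorageClass::GlacierInstantRetrieval => "GLACIER_IR",
        }
    }
}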

Implement GetObjectAttributes for S3 Client

The CRT-based client we have does not currently support GetObjectAttributes. We need this API to get an object's parts information, which is required if we want to align range GETs to multi-part upload boundaries (#73).
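
For reference, the request we'd need to issue looks roughly like this (bucket, key, and region are placeholders); the ObjectParts attribute in the response carries the part sizes and boundaries:

GET /my-object?attributes HTTP/1.1
Host: my-bucket.s3.us-west-2.amazonaws.com
x-amz-object-attributes: ObjectParts,ObjectSize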

Input/output error on high concurrency

I'm getting Input/output error when trying to list files at high concurrency (1,000 or more jobs).
Looking at the Rust debug logs, it seems like our requests are getting throttled by the server.

2022-10-28T14:05:55.107506Z DEBUG fuser::request: FUSE(  4) ino 0x0000000000000001 LOOKUP name "fio-Ji3OrnP7rSu9"    
2022-10-28T14:05:55.107609Z DEBUG lookup{req=4 ino=1 name="fio-Ji3OrnP7rSu9"}:list_objects{id=1}: s3_client::s3_client::list_objects: new request prefix="fio-Ji3OrnP7rSu9" delimiter="/" max_keys=1 continued=false
2022-10-28T14:05:55.107690Z DEBUG lookup{req=4 ino=1 name="fio-Ji3OrnP7rSu9"}:list_objects{id=2}: s3_client::s3_client::list_objects: new request prefix="fio-Ji3OrnP7rSu9/" delimiter="/" max_keys=1 continued=false
2022-10-28T14:05:55.134397Z ERROR awscrt::S3MetaRequest: id=0x7f7865003f30 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f788002cbb0, response status=503). Try to setup a retry.    
2022-10-28T14:05:55.136208Z DEBUG lookup{req=4 ino=1 name="fio-Ji3OrnP7rSu9"}:list_objects{id=2}: s3_client::s3_client: request finished request_id="E022WJ7XZMZ2B8RE" duration=28.500753ms
2022-10-28T14:05:55.136457Z DEBUG fuser::request: FUSE(  6) ino 0x0000000000000002 LOOKUP name "sequential-rw"    
2022-10-28T14:05:55.136487Z DEBUG lookup{req=6 ino=2 name="sequential-rw"}:list_objects{id=3}: s3_client::s3_client::list_objects: new request prefix="fio-Ji3OrnP7rSu9/sequential-rw" delimiter="/" max_keys=1 continued=false
2022-10-28T14:05:55.136527Z DEBUG lookup{req=6 ino=2 name="sequential-rw"}:list_objects{id=4}: s3_client::s3_client::list_objects: new request prefix="fio-Ji3OrnP7rSu9/sequential-rw/" delimiter="/" max_keys=1 continued=false
2022-10-28T14:05:55.145637Z ERROR awscrt::S3MetaRequest: id=0x7f786500ae10 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f78800013f0, response status=503). Try to setup a retry.    
2022-10-28T14:05:55.159091Z ERROR awscrt::S3MetaRequest: id=0x7f786500b6a0 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880019850, response status=503). Try to setup a retry.    
2022-10-28T14:05:55.176001Z ERROR awscrt::S3MetaRequest: id=0x7f7865003f30 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f788002cbb0, response status=503). Try to setup a retry.    
2022-10-28T14:05:55.188568Z ERROR awscrt::S3MetaRequest: id=0x7f786500b6a0 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880019850, response status=503). Try to setup a retry.    

I can also see some retry logs from the CRT, but eventually it reaches the maximum number of retries and fails.

2022-10-28T14:05:56.054628Z DEBUG lookup{req=10 ino=1 name="fio-Ji3OrnP7rSu9"}:list_objects{id=8}: s3_client::s3_client: request finished request_id="YJ35ZXWTV3HDZJ4R" duration=203.25004ms
2022-10-28T14:05:56.115138Z DEBUG lookup{req=14 ino=2 name="sequential-rw"}:list_objects{id=10}: s3_client::s3_client: request finished request_id="YJ377YWF1Y1MH7V0" duration=201.431101ms
2022-10-28T14:05:56.278022Z ERROR awscrt::S3MetaRequest: id=0x7f786500a090 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880025ed0, response status=503). Try to setup a retry.    
2022-10-28T14:05:56.345973Z ERROR awscrt::S3MetaRequest: id=0x7f786500a090 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880025ed0, response status=503). Try to setup a retry.    
2022-10-28T14:05:56.539742Z ERROR awscrt::S3MetaRequest: id=0x7f786500a090 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880025ed0, response status=503). Try to setup a retry.    
2022-10-28T14:05:56.855499Z ERROR awscrt::S3MetaRequest: id=0x7f786500a090 Meta request failed from error 14342 (Response code indicates throttling). (request=0x7f7880025ed0, response status=503). Try to setup a retry.    
2022-10-28T14:05:56.855534Z ERROR awscrt::standard-retry-strategy: token_id=0x7f786500a370: error occurred while scheduling retry: aws-c-io: AWS_IO_MAX_RETRIES_EXCEEDED, Retry cannot be attempted because the maximum number of retries has been exceeded..    
2022-10-28T14:05:56.855541Z ERROR awscrt::S3Client: id=0x5630f9c04120 Client could not retry request 0x7f7880025ed0 for meta request 0x7f786500a090 with token 0x7f786500a370 due to error 1069 (Retry cannot be attempted because the maximum number of retries has been exceeded.)    
2022-10-28T14:05:56.855569Z DEBUG lookup{req=14 ino=2 name="sequential-rw"}:list_objects{id=9}: s3_client::s3_client: request finished request_id="unknown" duration=941.89268ms

Test and fix prefetcher error handling

Right now the prefetcher just blocks forever(?) if any requests fail. We should fix that to propagate the error to whoever's waiting for the request.

Align GETs to multi-part object boundaries

It's an S3 performance best practice to align range GETs to multi-part upload boundaries. Our current prefetcher doesn't do a good job of this: it's not aware of the underlying part size, and in its current configuration:

https://github.com/awslabs/s3-file-connector/blob/875508253753e071ed532192194adc35ce607916/s3-file-connector/src/prefetch.rs#L47

it will never line up with the part boundaries because 256k is not a multiple of the part size.

We could do arbitrarily fancy things here—in principle a multi-part object could have arbitrary part boundaries—by querying the object attributes to discover the parts. That's probably overkill. But we should at least improve the common case: assume that the object's part boundaries are the part size the connector is configured with (currently 8MB), and align our range GETs to those boundaries.

Concretely for our current prefetcher, once the sequential prefetch window gets bigger than the part size:

https://github.com/awslabs/s3-file-connector/blob/875508253753e071ed532192194adc35ce607916/s3-file-connector/src/prefetch.rs#L101-L102

we should transition to aligning reads on the part boundary (possibly requiring a single weirdly-sized GET to shift the offset out to the next part boundary). Or maybe we should just change our prefetching config to be aware of the part size?
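
A rough sketch of the alignment arithmetic under that simple assumption (every part equals the configured part size):

// Given the current read offset and the configured part size, return the
// end offset (exclusive) for the next range GET so that subsequent requests
// start on a part boundary. The first request after the prefetch window
// exceeds the part size may be "weirdly sized" to catch up to a boundary.
fn aligned_request_end(offset: u64, desired_len: u64, part_size: u64) -> u64 {
    let tentative_end = offset + desired_len;
    // Round the end down to a part boundary, unless that would make the
    // request empty (we always want to make forward progress).
    let aligned_end = (tentative_end / part_size) * part_size;
    if aligned_end > offset {
        aligned_end
    } else {
        tentative_end
    }
}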

Endpoint resolution & access points

Currently our endpoint resolution logic is very naive:

format!("{}.s3.{}.amazonaws.com", bucket, self.region);

But this doesn't support:

  • Dual-stack endpoints (IPv6)
  • FIPS endpoints
  • Transfer acceleration endpoints
  • Regions that don't end in amazonaws.com (e.g., China regions)
  • Access points
  • PrivateLink endpoints

It doesn't look like the CRT has any built-in support for dealing with these, so we might need to do it ourselves.
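
A sketch of what a slightly less naive resolver could look like for the easy cases (dual-stack, FIPS, and the China partition suffix); access points and PrivateLink need proper endpoint resolution, not string formatting:

// Virtual-hosted-style endpoints for the simpler cases. Access points and
// PrivateLink endpoints are not handled here.
fn endpoint(bucket: &str, region: &str, dual_stack: bool, fips: bool) -> String {
    // China regions are a separate partition with a different DNS suffix.
    let suffix = if region.starts_with("cn-") {
        "amazonaws.com.cn"
    } else {
        "amazonaws.com"
    };
    let service = match (fips, dual_stack) {
        (true, true) => "s3-fips.dualstack",
        (true, false) => "s3-fips",
        (false, true) => "s3.dualstack",
        (false, false) => "s3",
    };
    format!("{}.{}.{}.{}", bucket, service, region, suffix)
}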

Dynamically scale FUSE session threads

We currently have a hard-coded thread count for FUSE sessions. This is a bit annoying: we need multiple threads for good throughput on multiple reader workloads, but for single-reader workloads (the most common?) having extra threads is a performance hit (~10–20% slower at 8 threads than 1 thread).

libfuse does a scaling thing where it dynamically creates threads (up to some bound) whenever it accepts a request and sees there are no other free threads. We should probably try something similar.

Alternatively, maybe we do a slightly smarter thing and have just a single thread polling the FUSE device, and let it dispatch work to a Tokio runtime. It's been a while since we benchmarked that approach, so we should give it another try.
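
A sketch of the libfuse-style policy, assuming a worker loop we control (all names here are made up for illustration):

use std::sync::atomic::{AtomicUsize, Ordering};

// Shared state for a libfuse-style dynamic worker pool.
struct WorkerPool {
    total: AtomicUsize,
    idle: AtomicUsize,
}

impl WorkerPool {
    // Called by a worker right after it accepts a FUSE request. If it was the
    // last idle worker and we're below the cap, ask the caller to spawn one more.
    fn on_request_accepted(&self, max_threads: usize, spawn: impl FnOnce()) {
        let was_idle = self.idle.fetch_sub(1, Ordering::SeqCst);
        if was_idle == 1 && self.total.load(Ordering::SeqCst) < max_threads {
            self.total.fetch_add(1, Ordering::SeqCst);
            self.idle.fetch_add(1, Ordering::SeqCst);
            spawn();
        }
    }

    // Called when the worker finishes the request and goes back to waiting.
    fn on_request_done(&self) {
        self.idle.fetch_add(1, Ordering::SeqCst);
    }
}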

Stop running on unmount

Currently, the file connector will continue to run even if the user unmounts it via the umount command. The process should catch the unmount event and exit gracefully.

Crash when reading large files

I tried to read a big file (1 GiB) and the file connector crashed with this error message:

thread '<unnamed>' panicked at 'cannot read with no non-empty chunks left', /home/ubuntu/projects/s3-file-connector/s3-client/src/prefetcher.rs:347:18
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: Os { code: 103, kind: ConnectionAborted, message: "Software caused connection abort" }

Add authentication tests

We should have integration tests with different credential providers to verify that the file connector works as expected in all the environments we support.

Custom header support

Allowing custom headers means users of the file connector can use S3 features that are driven by headers we haven't explicitly added support for yet.

This issue proposes that the file connector be able to configure an arbitrary number of static headers to be set on all S3 requests.

Readdir (list) benchmark

We have made some changes to improve the readdir operation in both s3-client and s3-file-connector, but we still don't have any way to measure the improvement (or catch regressions). We should extend our benchmarks to include some readdir workloads.

CloudWatch integration

Publishing file client metrics to Amazon CloudWatch

54b5b89 added basic metrics support, but for now we just emit them as local log events. We should support sending them to CloudWatch, perhaps via a simple statsd backend for our new metrics sink.
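
A statsd backend would be pretty small: format metric lines and send them over UDP to a local agent. A sketch (address and metric naming are placeholders):

use std::net::UdpSocket;

// Emit a single counter in the plain statsd wire format: "<name>:<value>|c".
// A CloudWatch agent or statsd daemon listening locally would forward it.
fn emit_counter(socket: &UdpSocket, name: &str, value: u64) -> std::io::Result<()> {
    let line = format!("{}:{}|c", name, value);
    socket.send_to(line.as_bytes(), "127.0.0.1:8125")?;
    Ok(())
}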

IMDSv2 requests block forever from Docker container

Currently, the only way to use our connector in a Docker container is to provide AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

It should be able to authenticate with the instance profile inherited from its EC2 host when running in Docker, or, even better, support IRSA when running as Kubernetes jobs in an EKS cluster.

To reproduce, attach an instance profile that has permission to access S3 buckets to your EC2 instance, then build a Docker image with this template:

FROM rust:1.64

# install dependencies
RUN apt-get update; \
    apt-get install -y sudo fuse awscli

# copy s3-file-connector binary file into docker image
COPY s3-file-connector /usr/local/bin/

RUN chmod 777 /usr/local/bin/s3-file-connector

RUN useradd -ms /bin/bash fuseuser && echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers && adduser fuseuser sudo

USER fuseuser
WORKDIR /home/fuseuser/

then run these commands:

docker build -t s3pluto .
docker run --rm --privileged -it --entrypoint bash  s3pluto
mkdir -p $HOME/mnt
aws s3 ls s3://your_bucket_name #ensure that you really have a permission to access the bucket
s3-file-connector $HOME/mnt your_bucket_name
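
A likely cause (not yet confirmed for this report) is IMDSv2's default hop limit of 1: the token response from the PUT request can't make the extra network hop into a bridged container, so the credentials provider hangs. If that is what's happening, raising the hop limit on the instance is a known workaround:

aws ec2 modify-instance-metadata-options --instance-id <your-instance-id> --http-put-response-hop-limit 2 --http-tokens required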

Support for mkdir

Creation of new directories with Mountpoint for Amazon S3

This ticket tracks support for the mkdir operation.

Note that we need to define the semantics of the mkdir operation. For example, is it local inode changes only?

Running in background

Other file connectors like S3FS or Goofys run in the background by default. I think we should do the same for our file connector and add a new option to allow users to run it in the foreground.
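
A minimal sketch of the classic fork-and-detach approach using the nix crate (whether we fork ourselves or pull in a daemonization crate is an open question):

use nix::unistd::{fork, setsid, ForkResult};

// Detach from the controlling terminal: fork, let the parent exit, and start
// a new session in the child, which carries on serving the mount.
fn daemonize() -> nix::Result<()> {
    match unsafe { fork()? } {
        ForkResult::Parent { .. } => std::process::exit(0),
        ForkResult::Child => {
            setsid()?;
            Ok(())
        }
    }
}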

Warn when bucket and mountpoint are swapped

ubuntu@ip-172-31-47-157:~$ ./s3-file-connector
error: The following required arguments were not provided:
    <MOUNT_POINT>
    <BUCKET_NAME>

USAGE:
    s3fc [OPTIONS] <MOUNT_POINT> <BUCKET_NAME>

For more information try --help

If you get these two positional arguments backwards, you currently get weird failures from an S3 request at startup time, because we try to validate the bucket name before the mountpoint. We should validate that the mountpoint is a valid directory before validating the bucket.

Also, I think these arguments are in the opposite order to S3FS and Goofys, so we should reverse them.
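
The first part of the fix is cheap: check the mountpoint before touching S3 at all. A sketch (the error handling is illustrative):

use std::path::Path;

// Fail fast with a clear message if the mount point isn't an existing
// directory, before we make any S3 requests that assume a valid bucket name.
fn validate_mount_point(path: &str) -> Result<(), String> {
    if Path::new(path).is_dir() {
        Ok(())
    } else {
        Err(format!("mount point {} is not a directory", path))
    }
}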

Include short commit hash in the version for all builds

All builds should include the Git commit hash (shortened) of the source code that built the binary.

This will let everyone see which version of the source code a binary was built from, even when we are not bumping the version for every change.

For example, following this change the version identifier should be one of the following: 0.1.0-<short-commit-hash>, 0.1.0+<short-commit-hash>.
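
One common way to do this is a build script that asks git for the short hash and exposes it to the crate at compile time. A sketch (the environment variable name is arbitrary):

// build.rs (sketch): embed the short commit hash at compile time.
use std::process::Command;

fn main() {
    let hash = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .and_then(|o| String::from_utf8(o.stdout).ok())
        .unwrap_or_else(|| "unknown".to_string());
    println!("cargo:rustc-env=S3FC_COMMIT_HASH={}", hash.trim());
}

The binary could then format its version as, for example, format!("{}+{}", env!("CARGO_PKG_VERSION"), env!("S3FC_COMMIT_HASH")).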

Use real uid/gid/permission masks

Right now we just have some random hardcoded uid, gid, and permissions masks. We should use real values here -- probably getuid and getgid, and then confirm we're happy with 0755 and 0644 as default permissions masks.
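
The real values are a couple of libc calls away; a sketch with the default masks suggested above:

// Use the uid/gid of the user running the connector for every inode, with
// 0755 for directories and 0644 for files as the default permissions.
fn default_ownership() -> (libc::uid_t, libc::gid_t) {
    // SAFETY: getuid/getgid have no preconditions and cannot fail.
    unsafe { (libc::getuid(), libc::getgid()) }
}

const DIR_PERM: u16 = 0o755;
const FILE_PERM: u16 = 0o644;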

Dynamically load libfuse2/3

Since fuser only uses libfuse for mount/unmount, we should be able to load the right one dynamically rather than choosing at compile time. This will be useful because AL2 is still on libfuse2 whereas most other distros have moved on to libfuse3.

Right now, fuser includes only the code for the libfuse version it is compiled against. We should include both and dynamically dispatch the mount/unmount/communication paths.
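
A sketch of the detection step using the libloading crate (the sonames below are the common ones, but file names vary by distro, and resolving the actual mount/unmount symbols is left out):

use libloading::Library;

enum FuseVersion {
    Fuse3(Library),
    Fuse2(Library),
}

// Try libfuse3 first and fall back to libfuse2; the mount/unmount symbols
// would then be resolved from whichever library loaded.
fn load_libfuse() -> Option<FuseVersion> {
    // SAFETY: loading a shared library runs its initializers.
    unsafe {
        if let Ok(lib) = Library::new("libfuse3.so.3") {
            return Some(FuseVersion::Fuse3(lib));
        }
        if let Ok(lib) = Library::new("libfuse.so.2") {
            return Some(FuseVersion::Fuse2(lib));
        }
    }
    None
}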

Run CI tests on aarch64

We should run CI tests on an aarch64 host to validate Graviton instance support. This might need to be a self-hosted runner; I don't think GitHub Actions offers ARM runners right now.

Add s3-file-connector information to user-agent header

We should add our application information to the User-Agent header so that we can identify which requests come from s3-file-connector.

It would also be good to have a way for users to personalize it. For example, S3A's default header is User-Agent: Hadoop <hadoop_version>, aws-sdk-java/<sdk_version>, and it lets users set the fs.s3a.user.agent.prefix property to add a custom prefix to the User-Agent header (https://github.com/apache/hadoop/blob/8a9bdb1edc41fd2a6a4d2ac497ee29e01d0f67aa/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md?plain=1#L990-L1003).
