
amazon-efs-tutorial's Introduction

Amazon Elastic File System (Amazon EFS)

Tutorials

Version 1.2.2

efs-t-1.2.1


© 2020 Amazon Web Services, Inc. and its affiliates. All rights reserved. This sample code is made available under the MIT-0 license. See the LICENSE file.

Errors or corrections? Email us at [email protected].


Table of Contents

Tutorials

1. Create a file system

2. Performance

3. Scale-out

4. Monitoring

5. In-cloud Transfer

6. Data science


Tutorials

These six tutorials are designed to help you better understand the performance characteristics of Amazon Elastic File System (Amazon EFS) and how parallelism, I/O size, and Amazon EC2 instance types affect file system IOPS and throughput.

1. Create a file system

This tutorial is a set of AWS CloudFormation templates that create an Amazon EFS file system and pre-load data to grow the file system so it can reach higher levels of IOPS and throughput. Throughput and IOPS on Amazon EFS scale as a file system grows, so larger file systems can achieve higher levels of both. Because file-based workloads are typically spiky, driving high levels of throughput for short periods and low levels the rest of the time, Amazon EFS is designed to burst to high throughput levels for periods of time. Amazon EFS uses a credit system to determine when file systems can burst. File systems can be monitored using Amazon CloudWatch metrics. These CloudFormation templates also create an Amazon CloudWatch dashboard, custom metrics, alarms, scheduled events, an AWS Lambda function, an SNS notification, and an Auto Scaling group to monitor and dynamically adjust alarm thresholds as the file system grows and shrinks.

Click on the link below to go to the Create a file system tutorial. Once you've finished that tutorial move on to Performance.

Tutorial Link
Create a file system

2. Performance

This tutorial is a set of scripts that will demonstrate:

  • different instance types provide different levels of network performance when accessing a file system
  • different I/O sizes (block sizes) and sync() frequencies (the rate at which data is persisted to disk) affect file system throughput
  • increasing the number of threads accessing a file system will increase IOPS and throughput
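The three effects listed above can be explored with a small experiment. The following is only a minimal sketch, not the tutorial's own scripts: `write_test` and its arguments are hypothetical names, and it assumes GNU coreutils (`dd` with `oflag=sync` and `conv=fsync`, `date` with `%N`). It writes a number of blocks of a given size with one or more concurrent writers, persisting either after every write or once at the end, and prints the elapsed time.

```shell
#!/bin/bash
# Hypothetical helper (not part of the tutorial scripts).
# Usage: write_test DIR BLOCK_SIZE COUNT SYNC_MODE [WRITERS]
#   SYNC_MODE "each" persists after every write (GNU dd oflag=sync);
#   any other value persists once at the end (GNU dd conv=fsync).
write_test() {
  local dir=$1 bs=$2 count=$3 sync_mode=$4 writers=${5:-1}
  local flag="conv=fsync" start end i
  [ "${sync_mode}" = "each" ] && flag="oflag=sync"

  start=$(date +%s%N)   # nanoseconds since the epoch (GNU date)
  for i in $(seq 1 "${writers}"); do
    # One background dd per writer, each writing to its own file.
    dd if=/dev/zero of="${dir}/write-test-${i}.bin" \
       bs="${bs}" count="${count}" "${flag}" 2>/dev/null &
  done
  wait                  # block until every writer has finished
  end=$(date +%s%N)

  echo "bs=${bs} sync=${sync_mode} writers=${writers} elapsed_ms=$(( (end - start) / 1000000 ))"
}
```

Running it against a mounted file system with different block sizes, sync modes, and writer counts (for example `write_test /mnt/efs 1M 1024 each 4`, where the mount point is an assumption) gives a rough feel for how each variable changes the elapsed time.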

Click on the link below to go to the Performance tutorial. Once you've finished that tutorial move on to Scale-out.

Tutorial Link
Performance

3. Scale-out

This tutorial is a CloudFormation template that creates an Amazon EC2 Spot Fleet and downloads objects in parallel from an Amazon S3 bucket.

Click on the link below to go to the Scale-out tutorial. Once you've finished that tutorial, move on to Monitoring.

Tutorial Link
Scale-out

4. Monitoring

This tutorial is designed to help you better understand how Amazon EFS is performing by using Amazon CloudWatch and Metric Math to monitor file system performance.
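As a rough illustration of the kind of arithmetic a Metric Math expression performs, the sketch below converts a byte count summed over a period into an average throughput in MiB/s. The function name and the sample values are assumptions made for illustration; they are not taken from the Monitoring tutorial.

```shell
#!/bin/bash
# Hypothetical helper: average throughput in MiB/s from a byte count
# summed over a period of PERIOD_SECONDS.
# Usage: throughput_mibps TOTAL_BYTES PERIOD_SECONDS
throughput_mibps() {
  local total_bytes=$1 period_seconds=$2
  # awk does the floating-point division; 1048576 bytes = 1 MiB.
  awk -v b="${total_bytes}" -v p="${period_seconds}" \
    'BEGIN { printf "%.2f\n", (b / 1048576) / p }'
}
```

For example, `throughput_mibps 6291456000 60` prints `100.00`: 6,291,456,000 bytes is 6,000 MiB, spread over 60 seconds.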

Click on the link below to go to the Monitoring tutorial.

Tutorial Link
Monitoring

5. In-cloud Transfer

The AWS DataSync In-cloud Transfer Quick Start and Scheduler creates a one-time or recurring schedule to transfer files between source and destination Amazon EFS file systems. These file systems could be in the same or different AWS regions.

Click on the link below to go to the In-cloud Transfer Quick Start.

Quick Start Link
In-cloud Transfer

6. Data science workshop

This tutorial covers how to use Amazon EFS, a highly available, highly durable, and elastic cloud-native file storage service, for data science workloads.

Click on the link below to go to the Data science tutorial.

Tutorial Link
Data science

Troubleshooting

For feedback, suggestions, or corrections, please email me at [email protected].

License

This sample code is made available under the MIT-0 license. See the LICENSE file.

amazon-efs-tutorial's People

Contributors

darrylsosborne, hyandell, perryjj-aws, shivkumarr


amazon-efs-tutorial's Issues

Nodejs8.10 not supported

We have an issue with the template. The nodejs8.10 runtime is no longer supported:
The runtime parameter of nodejs8.10 is no longer supported for creating or updating AWS Lambda functions. We recommend you use the new runtime (nodejs12.x) while creating or updating functions. (Service: AWSLambdaInternal; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: 115ba1c6-c130-43ba-9edc-ae40e8f0a8f6)

DestinationEfsFilesystemId constraint outdated

The current template in S3 and linked from https://github.com/aws-samples/amazon-efs-tutorial/tree/master/in-cloud-transfer has this parameter entry:

  DestinationEfsFilesystemId:
    AllowedPattern: ^(fs-)([a-z0-9]{8})$
    Description: Destination EFS filesystem id.
    Type: String

Unfortunately, EFS identifiers have more hex digits than that. (I suspect they fit in 8 digits at the time the template was last updated.) Pasting the ID of a perfectly usable EFS instance thus triggers errors when trying to use the template. :-)

Changing the 8 to 17 (the current number of EFS hex digits) in a local copy of the template was sufficient.
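The same check can be tried locally before editing the template. This is a minimal sketch rather than the template's actual fix: `is_valid_efs_id` is a hypothetical name, the sample IDs are made up, and the pattern assumes that both the older 8-character IDs and the 17-character IDs described above should be accepted.

```shell
#!/bin/bash
# Hypothetical validator mirroring the AllowedPattern idea: accept
# "fs-" followed by either 8 or 17 lowercase alphanumeric characters.
is_valid_efs_id() {
  local id=$1
  local pattern='^fs-([a-z0-9]{8}|[a-z0-9]{17})$'
  [[ ${id} =~ ${pattern} ]]
}
```

With this pattern, `is_valid_efs_id fs-12345678` and `is_valid_efs_id fs-0123456789abcdef0` both succeed, while an ID of any other length is rejected. Accepting both lengths, rather than only 17 as in the local fix described above, is a design choice that keeps older file systems usable.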

Cloudformation failing to create EFS resource (Solved)

I'm having the following issue during the CloudFormation deployment:

Embedded stack arn:aws:cloudformation:us-east-2:409536199879:stack/efs-create-file-system-tutorial-ElasticFileSystem-UIN3WSA8YK7G/bb9d6340-f9cc-11ea-91a7-0a7cd1635ec6 was not successfully created: The following resource(s) failed to create: [InstanceRole, ElasticFileSystemDelete].

How can I run the fpart + cpio + GNU Parallel command with SSH

The command under your point (5.9):
time parallel --will-cite -j ${threads} --pipepart --round-robin --delay .1 --block 1M -a /home/ec2-user/fpart-files-to-transfer.0 sudo "cpio -dpmL /mnt/efs/01/tutorial/parallelcpio/${instance_id}" &

How can I run it from an on-premises server? I am syncing data between a virtual machine and an EC2 instance that has EFS mounted, so I need to go over SSH.

Something like this:
time parallel --will-cite -j ${threads} --pipepart --round-robin --delay .1 --block 1M -a /home/ec2-user/fpart-files-to-transfer.0 "cpio -dpmL ec2-user@ec2-ip-address:/docker-efs/docker/parallelcpio"

Use of ASG in efs-dashboard-with-size-monitor-and-burst-credit-balance-alarms.yml

I don't understand why an ASG (and the corresponding EC2 instances) is created in https://github.com/aws-samples/amazon-efs-tutorial/blob/master/create-file-system/templates/efs-dashboard-with-size-monitor-and-burst-credit-balance-alarms.yml.

It seems the only purpose is to run the shell script below, which gathers some information about the EFS file system, creates CloudWatch alarms, and then sets the ASG desired capacity to zero.

If that is true, why can't this be a Lambda function?

#!/bin/bash -x

FILE_SYSTEM_ID=$1
WARNING_THRESHOLD_MINUTES=$2
CRITICAL_THRESHOLD_MINUTES=$3
SNS_ARN=$4

error=0

# get region
availability_zone=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
region=${availability_zone:0:-1}

# get instance id
instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

# get autoscaling group name
asg_name=$(aws autoscaling describe-auto-scaling-instances --instance-ids ${instance_id} --region ${region} --output text --query 'AutoScalingInstances[0].AutoScalingGroupName')

# get autoscaling policy arn
asg_policy_arn=$(aws autoscaling describe-policies --auto-scaling-group-name ${asg_name} --region ${region} --output text --query 'ScalingPolicies[0].PolicyARN')

# validate FILE_SYSTEM_ID send notification and exit if doesn't exist
aws efs describe-file-systems --file-system-id ${FILE_SYSTEM_ID} --region ${region} --output text --query 'FileSystems[0].[FileSystemId]'
result=$?
if [ $result -ne 0 ]; then
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. File system '${FILE_SYSTEM_ID}' does not exist.'
   exit
fi

# get current permitted throughput
count=1
while [ -z ${permitted_throughput} ] || [ ${permitted_throughput} == null ] && [ ${count} -lt 60 ]; do
   permitted_throughput=$(aws cloudwatch get-metric-statistics --namespace AWS/EFS --metric-name PermittedThroughput --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --start-time $(date --utc +%FT%TZ -d '-60 seconds') --end-time $(date --utc +%FT%TZ) --period 60 --statistics Maximum --region ${region} --output json --query 'Datapoints[0].Maximum')
   sleep 2
   count=$(expr ${count} + 1)
done

# get current burst credit balance
count=1
while [ -z ${burst_credit_balance} ] || [ ${burst_credit_balance} == null ] && [ ${count} -lt 60 ]; do
   burst_credit_balance=$(aws cloudwatch get-metric-statistics --namespace AWS/EFS --metric-name BurstCreditBalance --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --start-time $(date --utc +%FT%TZ -d '-60 seconds') --end-time $(date --utc +%FT%TZ) --period 60 --statistics Maximum --region ${region} --output json --query 'Datapoints[0].Maximum')
   sleep 2
   count=$(expr ${count} + 1)
done

# calculate new burst credit balance warning threshold
burst_credit_balance_threshold_warning=$(( ${burst_credit_balance:0:-2} - ( ( ( ${burst_credit_balance:0:-2} / ( ${permitted_throughput:0:-2} * 60 ) ) - $WARNING_THRESHOLD_MINUTES ) * ( ${permitted_throughput:0:-2} * 60 ) ) ))

# calculate new burst credit balance critical threshold
burst_credit_balance_threshold_critical=$(( ${burst_credit_balance:0:-2} - ( ( ( ${burst_credit_balance:0:-2} / ( ${permitted_throughput:0:-2} * 60 ) ) - $CRITICAL_THRESHOLD_MINUTES ) * ( ${permitted_throughput:0:-2} * 60 ) ) ))

# update warning alarm with new burst credit balance warning threshold
aws cloudwatch put-metric-alarm --alarm-name ''${FILE_SYSTEM_ID}' burst credit balance - Warning - '", !Ref 'AWS::StackName', " --alarm-description ''${FILE_SYSTEM_ID}' burst credit balance - Warning - '", !Ref 'AWS::StackName', " --actions-enabled --alarm-actions ${SNS_ARN} --metric-name BurstCreditBalance --namespace AWS/EFS --statistic Maximum --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --period 60 --evaluation-periods 5 --threshold ${burst_credit_balance_threshold_warning} --comparison-operator LessThanThreshold --treat-missing-data missing --region ${region}
result=$?
if [ $result -ne 0 ]; then
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. Check CloudWatch alarms for file system '${FILE_SYSTEM_ID}'.'
   error=$(expr ${error} + 1)
fi

# update critical alarm with new burst credit balance critical threshold
aws cloudwatch put-metric-alarm --alarm-name ''${FILE_SYSTEM_ID}' burst credit balance - Critical - '", !Ref 'AWS::StackName', " --alarm-description ''${FILE_SYSTEM_ID}' burst credit balance - Critical - '", !Ref 'AWS::StackName', " --actions-enabled --alarm-actions ${SNS_ARN} --metric-name BurstCreditBalance --namespace AWS/EFS --statistic Maximum --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --period 60 --evaluation-periods 5 --threshold ${burst_credit_balance_threshold_critical} --comparison-operator LessThanThreshold --treat-missing-data missing --region ${region}
result=$?
if [ $result -ne 0 ]; then
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. Check CloudWatch alarms for file system '${FILE_SYSTEM_ID}'.'
   error=$(expr ${error} + 1)
fi

# update burst credit balance increase threshold based
aws cloudwatch put-metric-alarm --alarm-name 'Set '${FILE_SYSTEM_ID}' burst credit balance increase threshold - '", !Ref 'AWS::StackName', " --alarm-description 'Set '${FILE_SYSTEM_ID}' burst credit balance increase threshold - '", !Ref 'AWS::StackName', " --actions-enabled --alarm-actions ${SNS_ARN} ${asg_policy_arn} --metric-name PermittedThroughput --namespace AWS/EFS --statistic Maximum --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --period 60 --evaluation-periods 5 --threshold ${permitted_throughput:0:-2} --comparison-operator GreaterThanThreshold --treat-missing-data missing --region ${region}
result=$?
if [ $result -ne 0 ]; then
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. Check CloudWatch alarms for file system '${FILE_SYSTEM_ID}'.'
   error=$(expr ${error} + 1)
fi

# update burst credit balance decrease threshold based
aws cloudwatch put-metric-alarm --alarm-name 'Set '${FILE_SYSTEM_ID}' burst credit balance decrease threshold - '", !Ref 'AWS::StackName', " --alarm-description 'Set '${FILE_SYSTEM_ID}' burst credit balance decrease threshold - '", !Ref 'AWS::StackName', " --actions-enabled --alarm-actions ${SNS_ARN} ${asg_policy_arn} --metric-name PermittedThroughput --namespace AWS/EFS --statistic Maximum --dimensions Name=FileSystemId,Value=${FILE_SYSTEM_ID} --period 60 --evaluation-periods 5 --threshold ${permitted_throughput:0:-2} --comparison-operator LessThanThreshold --treat-missing-data missing --region ${region}
result=$?
if [ $result -ne 0 ]; then
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. Check CloudWatch alarms for file system '${FILE_SYSTEM_ID}'.'
   error=$(expr ${error} + 1)
fi

# auto terminate instance - setting auto scaling group desired capacity 0
if [ $error -eq 0 ]; then
   aws autoscaling update-auto-scaling-group --auto-scaling-group-name ${asg_name} --desired-capacity 0 --region ${region}
else
   aws sns publish --topic-arn ${SNS_ARN} --region ${region} --message 'Amazon EFS burst credit balance CloudWatch alarm error. Check CloudWatch alarms for file system '${FILE_SYSTEM_ID}'.'
fi
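The threshold arithmetic in the script above can be isolated and checked on its own. The sketch below is a hypothetical helper, not part of the template; it assumes the balance and throughput have already been reduced to integers (the script does this by dropping the last two characters of each value, presumably a trailing ".0"). Because the division is integer division, the result works out to the requested number of minutes at the permitted throughput, plus whatever remainder of the balance does not fill a whole minute.

```shell
#!/bin/bash
# Hypothetical helper reproducing the script's threshold arithmetic.
#   $1 burst credit balance (integer bytes)
#   $2 permitted throughput (integer bytes per second)
#   $3 threshold in minutes
burst_credit_threshold() {
  local balance=$1 throughput=$2 minutes=$3
  local per_minute=$(( throughput * 60 ))   # bytes consumed per minute
  # Same expression as the script: balance minus the whole minutes of
  # credit that lie above the requested threshold.
  echo $(( balance - ( ( balance / per_minute ) - minutes ) * per_minute ))
}
```

With toy numbers, `burst_credit_threshold 6000000 100 10` prints `60000` (10 minutes at 6,000 bytes per minute), and `burst_credit_threshold 2000000 100 3` prints `20000` (18,000 for 3 minutes plus a remainder of 2,000).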

nodejs12.x is no longer a supported runtime

In the same vein as #11 the template got a little long in the tooth :-)

Fortunately for those of us who don't use Lambda or NodeJS very often, the error message suggested a solution: specifying nodejs18.x was enough.

  AmiInfoFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: !Sub solution-references-${AWS::Region}
        S3Key: datasync/amilookup-datasync-agent.zip
      Handler: amilookup-datasync-agent.handler
      Runtime: nodejs18.x
      Timeout: 30
      Role: !GetAtt LambdaExecutionRole.Arn

Provisioned Throughput Mode

The template doesn't cover the provisioned throughput mode. When the ThroughputMode property is set to "provisioned", the ProvisionedThroughputInMibps value must also be set. Suppose both values are taken as parameters.
How should this dynamic behavior be handled?

If they are provided as parameters and the user chooses "bursting" as the mode, ProvisionedThroughputInMibps becomes pointless. But if the value is not set, the stack can't be deployed.

Can "Rules" be of any use here?

CFT USING PRIVATE IP

This solution does not help us create the DataSync agent with a private endpoint; the agent gets a public IP assigned when we use this template. Can you share a template that creates agents with a private IP?

Incorrect 'fpart' option in performance tutorial

Section 5.9 uses the fpart command below to create the file list:

time /usr/local/bin/fpart -Z -n 1 -o /home/ec2-user/fpart-files-to-transfer .

Option -Z is not a valid fpart option; the correct one should be -z.
