
shelvery-aws-backups's Introduction

Shelvery

Shelvery creates backups in Amazon Web Services (AWS). It is primarily designed to be run as an AWS Lambda function, but it can also be installed as a regular Python package and run as a CLI tool.

Shelvery currently supports the following resource types:

  • EBS volumes
  • EC2 Instances (backups as AMIs)
  • RDS Instances
  • RDS Clusters
  • Redshift Clusters (limited support)
  • DocumentDb

The Shelvery strategy

Shelvery distinguishes itself among AWS backup tools by unifying backup and retention logic within a single class, ShelveryEngine (shelvery/engine.py). Unlike tooling that only allows linear retention periods (e.g. 28 days), Shelvery enables grandfather-father-son backup strategies, effectively allowing administrators to "keep the last 7 daily backups, but also the last 12 monthly backups, created on the 1st of each month". Supported retention levels are daily, weekly (created on Sundays), monthly (created on the 1st of each month), and yearly (created on the 1st of January). The general idea is that the further back in time you go, the less dense the backups become.
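For example, the "7 daily plus 12 monthly" policy above maps directly onto the retention configuration keys described in the Runtime Configuration section (a minimal sketch, assuming the CLI install shown below):

# keep the last 7 daily and the last 12 monthly backups
export shelvery_keep_daily_backups=7
export shelvery_keep_monthly_backups=12
shelvery ebs create_backups
shelvery ebs clean_backups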

Shelvery also supports backups across multiple accounts, without requiring AWS Organizations.

Shelvery does not handle the backup restore process.

Features

  • Create and clean EBS Volume backups
  • Create and clean RDS Instance backups
  • Create and clean RDS Cluster backups
  • Create and clean EC2 Instance backups in the form of Amazon Machine Images
  • Create and clean Redshift manual backups
  • Get notified about all operations performed by Shelvery via SNS Topic
  • Share backups with other accounts automatically
  • Copy backups shared by other AWS accounts automatically, enabling a data bunker setup
  • Copy backups to disaster recovery AWS regions
  • Multiple levels of configuration, with priorities: Resource tags, Lambda payload, Environment Variables, Config defaults

Installation and usage

As CLI

Shelvery is published to PyPI as a Python package and can be installed from there. Alternatively, you can clone this repository and deploy it as a Lambda function.

Below is an example of installing Shelvery within the python:3 Docker image, along with some optional configuration steps.

# run within a docker container (preferred, as it has a clean python3 install)
docker run --rm -it -v $HOME/.aws:/root/.aws -w /shelvery -e 'AWS_DEFAULT_REGION=us-east-1' python:3 bash

# install shelvery package
pip install shelvery

# configure (optional)
export shelvery_dr_regions=us-east-2
export shelvery_keep_daily_backups=7

# create ec2 ami backups
shelvery ec2ami create_backups

# cleanup ec2 ami backups
shelvery ec2ami clean_backups

# create ebs backups
shelvery ebs create_backups

# create rds backups
shelvery rds create_backups

# cleanup ebs backups
shelvery ebs clean_backups

# cleanup rds backups
shelvery rds clean_backups

# create rds cluster backups
shelvery rds_cluster create_backups

# cleanup rds cluster backups
shelvery rds_cluster clean_backups

# create redshift backups
shelvery redshift create_backups

# cleanup redshift cluster backups
shelvery redshift clean_backups

# create documentdb backups
shelvery docdb create_backups

# cleanup documentdb backups
shelvery docdb clean_backups

Deploy as lambda

Shelvery can be deployed as a Lambda function to AWS using the Serverless Framework. Serverless takes care of creating the necessary IAM roles. It also schedules daily backups of all supported resources at 1AM UTC, and backup cleanup at 2AM UTC. Schedules are created as CloudWatch event rules. Look at the serverless.yml file for more details. For this method of installation there is no Python package, and you'll need to clone the project. Below is an example of this process within a Node.js Docker container.

$ docker run --rm -it -v $HOME/.aws:/root/.aws -w /src node:latest bash

root@4b2d6804c8d3:/src#  git clone https://github.com/base2services/shelvery && cd shelvery
Cloning into 'shelvery'...
remote: Counting objects: 114, done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 114 (delta 19), reused 24 (delta 12), pack-reused 73
Receiving objects: 100% (114/114), 38.88 KiB | 0 bytes/s, done.
Resolving deltas: 100% (53/53), done.
Checking connectivity... done.
root@4b2d6804c8d3:/src/shelvery# npm install -g serverless
/usr/local/bin/slss -> /usr/local/lib/node_modules/serverless/bin/serverless
/usr/local/bin/serverless -> /usr/local/lib/node_modules/serverless/bin/serverless
/usr/local/bin/sls -> /usr/local/lib/node_modules/serverless/bin/serverless

> [email protected] postinstall /usr/local/lib/node_modules/serverless/node_modules/spawn-sync
> node postinstall


> [email protected] postinstall /usr/local/lib/node_modules/serverless
> node ./scripts/postinstall.js


┌───────────────────────────────────────────────────┐
│          serverless update check failed           │
│        Try running with sudo or get access        │
│       to the local update config store via        │
│ sudo chown -R $USER:$(id -gn $USER) /root/.config │
└───────────────────────────────────────────────────┘
+ [email protected]
added 296 packages in 13.228s


root@4b2d6804c8d3:/src/shelvery# sls deploy
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (56.26 KB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
.........
Serverless: Stack update finished...
Service Information
service: shelvery
stage: dev
region: us-east-1
stack: shelvery-dev
api keys:
  None
endpoints:
  None
functions:
  shelvery: shelvery-dev-shelvery
Serverless: Removing old service versions...

Delayed operations

Sharing a backup with another AWS account, or copying a backup to another region, is considered a delayed operation that is executed in a separate thread or, when running in the AWS Lambda context, in another Lambda function invocation. This makes Shelvery execution non-linear, allowing share/copy operations to fan out across a larger number of backups.

If you want to enforce linear execution (only possible when running as CLI), set the environment variable SHELVERY_MONO_THREAD=1. This ensures all shares/copies are done in a single thread, and can prolong the backup creation run, as each backup must be in the available state before it can be shared or copied.
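For example, a minimal CLI run with linear execution enforced might look like this:

# all share/copy operations run in a single thread
export SHELVERY_MONO_THREAD=1
shelvery rds create_backups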

AWS Credentials configuration

Shelvery uses boto3 as its client library to communicate with Amazon Web Services. Use any environment variables that boto3 supports to configure credentials. In the AWS Lambda environment, Shelvery picks up credentials from the IAM role that the Lambda function runs under.
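For example, when running the CLI outside of Lambda, credentials can be supplied through the standard boto3 environment variables (placeholder values shown):

export AWS_ACCESS_KEY_ID=AKIAEXAMPLE
export AWS_SECRET_ACCESS_KEY=examplesecretkey
export AWS_DEFAULT_REGION=us-east-1
# or use a named profile from ~/.aws/credentials
export AWS_PROFILE=backups
shelvery ebs create_backups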

Runtime environment

Shelvery requires Python 3.6 to run. You can run it from any server or local machine capable of interpreting Python 3.6 code, or as AWS Lambda functions. All Shelvery code is written so that it supports both CLI and Lambda execution.

Backup lifecycle and retention periods

Any backups created via Shelvery are tagged with metadata, including backup name, creation time, and backup retention type (daily, weekly, monthly or yearly). Retention follows the grandfather-father-son backup scheme. Stale backups are cleaned using the cleanup commands, according to the retention policy. The default retention policy is as follows:

  • Keep daily backups for 14 days
  • Keep weekly backups for 8 weeks
  • Keep monthly backups for 12 months
  • Keep yearly backups for 10 years

Only one type of backup is created per Shelvery run, determined by the current date. The rules are as follows:

  1. If Shelvery runs on the 1st of January, yearly backups are created
  2. If Shelvery runs on the 1st of the month, monthly backups are created
  3. If Shelvery runs on a Sunday, weekly backups are created
  4. On any other day, daily backups are created

All retention policies can be tweaked using runtime configuration, explained in the Runtime Configuration section. The retention period is calculated at cleanup time, meaning the lifetime of a backup is not fixed at creation time, but rather dynamic. This allows greater flexibility for the end user: for example, extending the daily backup policy from the last 7 to the last 14 days will preserve all backups created in the past 7 days for another 7 days.
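For example, because retention is evaluated at cleanup time, raising the daily retention before the next cleanup run immediately applies to backups that already exist (a sketch using the keys described in the Keys section below):

# existing daily backups are now kept for 14 days instead of 7
export shelvery_keep_daily_backups=14
shelvery ebs clean_backups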

What resources are being backed up

For the following resource types:

  • EC2 Volumes
  • EC2 Instances
  • RDS Instances
  • RDS Clusters
  • Redshift Clusters
  • DocumentDb

Simply add the shelvery:create_backup tag with any of the following values

  • True
  • true
  • 1

to the resource that should be backed up.

Resources that are not marked to be managed by Shelvery are skipped. Optionally, you can export the shelvery_select_entity environment variable to select a single resource; the tagging condition still applies. See the example below.
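For example, to mark an EBS volume for backup with the AWS CLI, and optionally limit a run to that single resource (the volume ID is a placeholder):

# tag the volume so shelvery picks it up on the next run
aws ec2 create-tags \
  --resources vol-0123456789abcdef0 \
  --tags Key=shelvery:create_backup,Value=true

# optionally back up only this tagged resource
export shelvery_select_entity=vol-0123456789abcdef0
shelvery ebs create_backups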

Notifications

All Shelvery operations are pushed to SNS topics, if configured (see the shelvery_sns_topic and shelvery_error_sns_topic keys in the Runtime Configuration section below), whether they succeed or fail. From the SNS topic you can send emails, subscribe custom Lambda functions, send the payload to HTTP(S) endpoints, etc. This allows monitoring systems to hook into Shelvery and alert on any errors that occur.
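For example, to publish Shelvery events to an existing SNS topic and receive them by email (topic ARN and email address are placeholders):

# point shelvery at the topic
export shelvery_sns_topic=arn:aws:sns:us-east-1:111111111111:shelvery-events

# subscribe an email address to the same topic
aws sns subscribe \
  --topic-arn arn:aws:sns:us-east-1:111111111111:shelvery-events \
  --protocol email \
  --notification-endpoint [email protected]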

Supported services

EC2

EBS backups in the form of EBS snapshots are supported, as are EC2 instance backups in the form of Amazon Machine Images (AMIs).

RDS

RDS backups are supported in the form of RDS snapshots. There is no support for SQL dumps at this point. By default, RDS snapshots are created as a copy of the last automated snapshot. Set the configuration key shelvery_rds_backup_mode to RDS_CREATE_SNAPSHOT if you wish to create a snapshot directly. Note that this may fail if the RDS instance is not in the available state.

RDS Clusters

RDS Cluster backups behave the same as RDS instance backups, just on RDS clusters (Aurora). The value of shelvery_rds_backup_mode has the same effect as for RDS instance backups.

Redshift clusters

Redshift cluster snapshots are supported to an extent. Creating and cleaning backups is fully supported, whereas copying a snapshot to a DR region (or pulling a shared snapshot from a DR account) is not supported natively through the API, but can be achieved using data transfer through S3 buckets, UNLOAD/COPY statements, and the EnableSnapshotCopy API call. These operations, however, are currently outside of Shelvery's scope.

DocumentDb

DocumentDb backups behave the same as RDS instance backups, just on DocumentDb clusters. The value of shelvery_docdb_backup_mode has the same effect as shelvery_rds_backup_mode has for RDS instance and cluster backups.

Runtime Configuration

There are multiple configuration options for the Shelvery backup engine, configurable on multiple levels. Levels with a higher priority number take precedence over those with a lower priority number. The lowest priority is the set of defaults defined in code. Look at the shelvery/runtime_config.py source file for more information.

Keys

Available configuration keys are listed below:

  • shelvery_keep_daily_backups - Number of days to retain daily backups

  • shelvery_keep_weekly_backups - Number of weeks to retain weekly backups

  • shelvery_keep_monthly_backups - Number of months to keep monthly backups

  • shelvery_keep_yearly_backups - Number of years to keep yearly backups

  • shelvery_custom_retention_types - custom retention periods in name:seconds (quarterHourly:86400) format, comma separated, empty (disabled) by default

  • shelvery_current_retention_type - custom retention period applied to current create backup process

  • shelvery_dr_regions - List of disaster recovery regions, comma separated

  • shelvery_wait_snapshot_timeout - Timeout in seconds to wait for snapshot to become available before copying it to another region or sharing with other account. Defaults to 1200 (20 minutes)

  • shelvery_share_aws_account_ids - AWS Account Ids to share backups with. Applies to both original and regional backups

  • shelvery_source_aws_account_ids - List of AWS Account Ids, comma separated, that are exposing/sharing their Shelvery backups with the account where Shelvery is running. This can be used for a DR AWS account that aggregates backups from other accounts.

  • shelvery_bucket_name_template - Template used to create bucket name. Available keys: {account_id}, {region}. Defaults to shelvery.data.{account_id}-{region}.base2tools

  • shelvery_rds_backup_mode - can be either RDS_COPY_AUTOMATED_SNAPSHOT or RDS_CREATE_SNAPSHOT. Values are self-explanatory

  • shelvery_docdb_backup_mode - can be either DOCDB_COPY_AUTOMATED_SNAPSHOT or DOCDB_CREATE_SNAPSHOT. Values are self-explanatory

  • shelvery_redshift_backup_mode - can be either REDSHIFT_COPY_AUTOMATED_SNAPSHOT or REDSHIFT_CREATE_SNAPSHOT. Values are self-explanatory

  • shelvery_lambda_max_wait_iterations - maximum number of chained calls to wait for backup availability when running in the Lambda environment. shelvery_wait_snapshot_timeout is used only in CLI mode, while this key is used only on Lambda

  • shelvery_select_entity - select only a single resource to be backed up, rather than all resources tagged with Shelvery tags. This resource still needs to have the shelvery tag on it to be backed up.

  • shelvery_sns_topic - SNS Topic to publish event messages, including error messages for failed backups

  • shelvery_error_sns_topic - SNS Topic for receiving only errors. If not set, all messages are sent to the shelvery_sns_topic topic

  • shelvery_copy_resource_tags - Copy tags from original resource [boolean]

  • shelvery_exluded_resource_tag_keys - Comma-separated list of tag keys to exclude when copying tags from the original resource

  • shelvery_sqs_queue_url - SQS queue url

  • shelvery_sqs_queue_wait_period - sqs waiting period in seconds. [int] 0-900

  • shelvery_ignore_invalid_resource_state - ignore exceptions due to the resource being in an unavailable state, such as shut down or rebooting. Default value is False. [boolean]

  • shelvery_encrypt_copy - [boolean] when copying a shared unencrypted snapshot, encrypt the snapshot. When enabled, shelvery_copy_kms_key_id must also be set.

  • shelvery_copy_kms_key_id - when copying a shared snapshot, you can specify a different KMS key from the one used to encrypt the original snapshot. Note that when copying to a new key, Shelvery requires access to both the new key and the original key. The full KMS key ARN should be supplied.

  • shelvery_reencrypt_kms_key_id - use when an RDS instance or cluster snapshot must be re-encrypted with a different KMS key than that of the existing resource. Specify the ARN of the KMS key.

Configuration Priority 0: Sensible defaults

shelvery_keep_daily_backups=14
shelvery_keep_weekly_backups=8
shelvery_keep_monthly_backups=12
shelvery_keep_yearly_backups=10
shelvery_wait_snapshot_timeout=120
shelvery_lambda_max_wait_iterations=5

Configuration Priority 1: Environment variables

$ export shelvery_rds_backup_mode=RDS_CREATE_SNAPSHOT
$ shelvery rds create_backups

Configuration Priority 2: Lambda Function payload

Any keys passed in the Lambda function event payload (if Shelvery is running in the Lambda environment) under the config key take priority over environment variables. E.g. the following payload will clean EBS backups, with the rule to delete dailies older than 5 days:

{
  "backup_type": "ebs",
  "action": "clean_backups",
  "config": {
    "shelvery_keep_daily_backups": 5
  }
}
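Assuming the function name from the Serverless deployment shown earlier (shelvery-dev-shelvery; adjust to your own deployment), such a payload can be sent with the AWS CLI:

aws lambda invoke \
  --function-name shelvery-dev-shelvery \
  --payload '{"backup_type": "ebs", "action": "clean_backups", "config": {"shelvery_keep_daily_backups": 5}}' \
  response.json
# with AWS CLI v2 you may also need: --cli-binary-format raw-in-base64-out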

Configuration Priority 3: Resource tags

Resource tags apply to retention periods and DR regions. E.g. putting the following tags on your EBS volume will ensure its daily backups are retained for 30 days and copied to us-west-1 and us-west-2. shelvery_share_aws_account_ids is not available on a resource level (Pull Requests welcome).

shelvery:config:shelvery_keep_daily_backups=30

shelvery:config:shelvery_dr_regions=us-west-1,us-west-2

The generic format for a Shelvery config tag is shelvery:config:$configkey=$configvalue.
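For example, applying the tags above to an EBS volume with the AWS CLI (the volume ID is a placeholder):

aws ec2 create-tags \
  --resources vol-0123456789abcdef0 \
  --tags '[{"Key": "shelvery:config:shelvery_keep_daily_backups", "Value": "30"}, {"Key": "shelvery:config:shelvery_dr_regions", "Value": "us-west-1,us-west-2"}]'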

Custom retention periods

Custom retention periods can be set using shelvery_custom_retention_types, formatted as name:retention_period, where the retention period is defined in seconds. Multiple periods can be set using a comma-separated list.

shelvery_custom_retention_types=quarterHourly:86400,hourly:172800

When triggering Shelvery on the desired schedule, specify the retention type using the shelvery_current_retention_type key:

shelvery_current_retention_type=quarterHourly
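Putting the two keys together, a run on a custom quarter-hourly schedule might look like this (a sketch; the scheduling itself, e.g. a cron or CloudWatch rule, is outside Shelvery):

# define the custom retention type, keeping such backups for 24 hours
export shelvery_custom_retention_types=quarterHourly:86400
# mark backups created by this run as quarterHourly
export shelvery_current_retention_type=quarterHourly
shelvery ebs create_backups
shelvery ebs clean_backups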

Multi account setup

Shelvery allows setting up a disaster recovery AWS account, with backups from the organisation's AWS accounts copied to the 'disaster recovery' account.

  1. Sharing backups with other accounts
# this runs in every source account
# e.g. running in accounts 222222222222 and 333333333333
$ export AWS_DEFAULT_PROFILE=source_account
$ export shelvery_share_aws_account_ids=111111111111
$ shelvery ebs create_backups
  2. Copying shared backups from other accounts into the current account
# this is running in destination account, let's say 111111111111
$ export AWS_DEFAULT_PROFILE=dst_account
$ export shelvery_source_aws_account_ids=222222222222,333333333333
# this command will pull backups from both accounts
$ shelvery ebs pull_shared_backups

Cross-account copying of encrypted backups

Shelvery enables the sharing and copying of encrypted backups between accounts. In order to configure this, you must share the backup's KMS key and follow the instructions in step 2 above. A new KMS key can be used when copying the shared snapshot, or you can encrypt an unencrypted snapshot when copying it to the shared account. See the example after the list below.

For instructions on how to share the KMS key, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ShareSnapshot.html

This is currently only supported for:

  • RDS clusters
  • RDS Instances
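For example, the destination (pulling) account might combine the pull configuration from step 2 with the encryption keys described in the Keys section (account IDs and the KMS key ARN are placeholders):

# running in the destination account 111111111111
export shelvery_source_aws_account_ids=222222222222
# encrypt the copy of an unencrypted shared snapshot with the key below
export shelvery_encrypt_copy=true
export shelvery_copy_kms_key_id=arn:aws:kms:us-east-1:111111111111:key/00000000-0000-0000-0000-000000000000
shelvery rds pull_shared_backups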

Waiting on backups to complete

By default, Shelvery waits by sleeping and then querying the AWS API for a complete status. If this is not your preferred method, you can offload the sleep to SQS to save costs on Lambda compute.

You can set the SQS queue URL and the wait period (in seconds) before the Lambda function is invoked to check the status of the backup. If the backup is not complete, the message is passed back to SQS to wait for the same period.

shelvery_sqs_queue_url=https://sqs.us-east-1.amazonaws.com/111111111111/shelvery
shelvery_sqs_queue_wait_period=300

Deploying with SAM template

  1. Install the AWS SAM cli

  2. Set up your AWS credentials and set the region

    export AWS_PROFILE=profile
    export AWS_DEFAULT_REGION=ap-southeast-2

    or using sts keys

    export AWS_ACCESS_KEY_ID=xxx
    export AWS_SECRET_ACCESS_KEY=xxx
    export AWS_SESSION_TOKEN=xxx
  3. Run the deploy-sam-template.sh script with the options to deploy the template in the target account.

    $ ./deploy-sam-template.sh help
    
    Usage:
      ./deploy-sam-template.sh -b my-s3-bucket -r us-west-2     # deploy latest shelvery version in the us-west-2 region
      ./deploy-sam-template.sh -b my-s3-bucket -l true -p true  # package and deploy the current git branch
    
    Options:
      -b BUCKET                     # s3 bucket to deploy the sam package to
      [-v VERSION]                  # set the shelvery version to deploy, defaults to x.x.x
      [-r REGION]                   # AWS region to deploy shelvery, if not set it will get from the aws config or environment
      [-p true] BOOLEAN             # Build and package shelvery from the current branch. Use with '-l true' to deploy the package.
      [-l true] BOOLEAN             # install shelvery from a local dist build in the ./dist/shelvery-${SHELVERY_VERSION}.tar.gz
      [-o KEY1=VALUE1,KEY2=VALUE2]  # Override cloudformation template parameters with a comma separated string of key value pairs
                                    # e.g. -o ShelveryRdsBackupMode=RDS_CREATE_SNAPSHOT,ShelveryEncryptCopy=true
      [-u UID]                      # Set the docker user id, defaults to -u 501

shelvery-aws-backups's People

Contributors

atward, base2arthur, guslington, jgp265, lews, lohgannash, rendanic, tarunmenon95, toshke, unixorn


shelvery-aws-backups's Issues

yaml error while running the code in AWS Lambda

I'm using the Lambda function in AWS to deploy the shelvery function for cross-region/cross-account backups.

While running the Lambda function to test the same, I get the following error:

START RequestId: 2989c816-0cff-4386-80a8-0fd338fdcefc Version: $LATEST
Unable to import module 'shelvery_lambda/lambda_handler': No module named 'yaml'

Not quite sure what could be missing here.

When I try to install the serverless module in the node Docker container as mentioned in the documentation, I see the following. It may or may not be related to the issue I'm facing.

root@a0488a85ac4b:/src/shelvery# npm install -g serverless
npm WARN deprecated [email protected]: request-promise-native has been deprecated because it extends the now deprecated request package, see https://github.com/request/request/issues/3142
npm WARN deprecated [email protected]: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated [email protected]: this library is no longer supported
/usr/local/bin/serverless -> /usr/local/lib/node_modules/serverless/bin/serverless.js
/usr/local/bin/sls -> /usr/local/lib/node_modules/serverless/bin/serverless.js

> [email protected] install /usr/local/lib/node_modules/serverless/node_modules/snappy
> prebuild-install || node-gyp rebuild

prebuild-install WARN install EACCES: permission denied, access '/root/.npm'
gyp WARN EACCES current user ("nobody") does not have permission to access the dev dir "/root/.cache/node-gyp/14.14.0"
gyp WARN EACCES attempting to reinstall using temporary dev dir "/usr/local/lib/node_modules/serverless/node_modules/snappy/.node-gyp"
gyp WARN install got an error, rolling back install
gyp WARN install got an error, rolling back install
gyp ERR! configure error 
gyp ERR! stack Error: EACCES: permission denied, mkdir '/usr/local/lib/node_modules/serverless/node_modules/snappy/.node-gyp'
gyp ERR! System Linux 4.19.76-linuxkit
gyp ERR! command "/usr/local/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /usr/local/lib/node_modules/serverless/node_modules/snappy
gyp ERR! node -v v14.14.0
gyp ERR! node-gyp -v v5.1.0
gyp ERR! not ok 

> [email protected] postinstall /usr/local/lib/node_modules/serverless/node_modules/protobufjs
> node scripts/postinstall


> [email protected] postinstall /usr/local/lib/node_modules/serverless
> node ./scripts/postinstall.js


   ┌───────────────────────────────────────────────────┐
   │                                                   │
   │   Serverless Framework successfully installed!    │
   │                                                   │
   │   To start your first project run 'serverless'.   │
   │                                                   │
   └───────────────────────────────────────────────────┘

npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@~2.1.2 (node_modules/serverless/node_modules/chokidar/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/serverless/node_modules/snappy):
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] install: `prebuild-install || node-gyp rebuild`
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: Exit status 1

+ [email protected]

Problem with clean_backups and long retention time

shelvery_custom_retention_types='AtlassianDaily:3456000‬'

[ERROR]	2020-06-29T08:23:00.99Z	28b226b4-9bba-4074-a789-0eef48bcb393	Error checking backup hiddeninstancename-576b5f-2020-06-27-0100-daily for cleanup: invalid literal for int() with base 10: '3456000\u202c'
Traceback (most recent call last):
  File "/var/task/shelvery/engine.py", line 290, in clean_backups
    if backup.is_stale(self, RuntimeConfig.get_custom_retention_types(self)):
  File "/var/task/shelvery/runtime_config.py", line 156, in get_custom_retention_types
    rval[parts[0]] = int(parts[1])
ValueError: invalid literal for int() with base 10: '3456000\u202c'

The output is from CloudWatch

It is working with 10800 seconds:
shelvery_custom_retention_types='AtlassianDaily:10800‬'

Any ideas what is going wrong here?

Notification mechanisms

The idea is to notify users when a backup is created or deleted. This could be done by posting to an SNS topic, a URL, Slack, etc. Ideally this would be implemented using plugins, so developers can add their own with ease.

Most importantly, this feature allows notifications for failed backup creation / deletion.

Redshift snapshots support

Support for managing redshift snapshots as shelvery backups. Should support creating encrypted backups also.

How to limit AMI creation to 1 EC2 with aws cli?

I want to create an AMI from only 1 EC2 with manual lambda execution:

aws lambda invoke --function-name ShelveryBackups --payload '{"backup_type":"ec2ami","action":"create_backups","shelvery_select_entity":"i-0cd48a32e4def53ff"}' response.json

Limiting the run with shelvery_select_entity is not working.
Is it possible and what parameter is needed?

Regards
Thorsten

shelvery fails to copy shared backup

version: 0.9.6

Failed to copy shared backup 'aurora-cluster' specified in s3://<bucket>/<key>.yaml
Traceback (most recent call last):
  File "/var/task/shelvery/engine.py", line 360, in pull_shared_backups
    new_backup_id = self.copy_shared_backup(src_account_id, shared_backup)
  File "/var/task/shelvery/rds_cluster_backup.py", line 146, in copy_shared_backup
    if RuntimeConfig.get_encrypt_copy(backup_resource.entity_resource.tags, self):
NameError: name 'backup_resource' is not defined

postinstall is missing

I try to build the container with the following commands from the README.md:

docker run --rm -it -v $HOME/.aws:/root/.aws -w /src node:latest bash
git clone https://github.com/base2services/shelvery && cd shelvery
npm install -g serverless
postinstall /usr/local/lib/node_modules/serverless/node_modules/spa

I am getting the following error:

root@bf223b649314:/src/shelvery# postinstall /usr/local/lib/node_modules/serverless/node_modules/spa
bash: postinstall: command not found

What is missing in the documentation?

ec2 ami name transformation

When creating an AMI with the ec2ami backup type, follow the rules for AMI names. If an EC2 instance has unusual naming, Shelvery will fail with the following error during the backup creation phase:

An error occurred (InvalidAMIName.Malformed) when calling the CreateImage operation: AMI names must be between 3 and 128 characters long, and may contain letters, numbers, '(', ')', '.', '-', '/' and '_'

support latest version of pyyaml

pyyaml >= 5.4

[ERROR] 2022-02-06T02:00:09.100Z    e945405a-0bbb-4cc4-b3fc-09c37b01d929    Failed to copy shared backup 'snapshot' specified in s3://bucket/path/snapshot-daily.yaml
Traceback (most recent call last):
  File "/var/task/shelvery/engine.py", line 359, in pull_shared_backups
    shared_backup = yaml.load(serialised_shared_backup)
TypeError: load() missing 1 required positional argument: 'Loader'

ability to control rds backup behaviour

choose from
a) copying latest automated backup
b) creating new backup from instance

if creating new backup from rds instance, there should be wait process to make sure instance is in available state

if copying latest automated backup, there should be wait process if there is no automated backups available (edge case with 1 day retention period)

local variable 'current_policy' referenced before assignment

Received event

{
    "backup_type": "ebs",
    "action": "create_data_buckets"
}

Stack trace

[ERROR] UnboundLocalError: local variable 'current_policy' referenced before assignment
Traceback (most recent call last):
  File "/var/task/shelvery_lambda/lambda_handler.py", line 40, in lambda_handler
    method()
  File "/var/task/shelvery/engine.py", line 417, in create_data_buckets
    if self._bucket_policy_changed(region,bucket.name):
  File "/var/task/shelvery/engine.py", line 94, in _bucket_policy_changed
    return current_policy != shelvery_bucket_policy

try:
    current_policy = client.get_bucket_policy(Bucket=bucket)['Policy']
except ClientError as error:
    if error.response["Error"]["Code"] == "NoSuchBucketPolicy":
        current_policy = None

improve integration tests and add test cases

Improve integration tests

The test cases require AWS resources to be set up and torn down each time. It might be a good idea to do this with CloudFormation so we can launch it before running the test cases. It is also then much easier to clean up at the end.

Test cases

  • s3 bucket is created with bucket policy and public settings disabled

for each of the resources [rds, rds_cluster, docdb, ebs, ec2ami]

  • create backup
  • share backup with 2nd aws account
  • pull shared backup in 2nd aws account
  • cleanup backup
  • cleanup backup in 2nd aws account

Backups should be named by resource ID instead of the Name tag

Currently, the code for creating backup names is:
name = entity_resource.tags['Name'] if 'Name' in entity_resource.tags else entity_resource.resource_id
If two resources share the same value for the Name tag, then a backup will fail to create due to a naming conflict.
This logic should be reversed: the ID should be prioritised.
I'm not sure if this will affect existing backups.

AWS EBS - Snapshot copies to another region(s)

Hello.

Can you help, please, with EBS snapshot copies to other regions? Shelvery is deployed as an AWS Lambda with SQS, but AWS has a limit of only 5 concurrent snapshot copies at a time, so only part of the snapshots were copied.

What is the better way to handle these errors?

I thought about retries using SQS but, as I understand the code, retries are not implemented. Or have I missed some configuration options?

Thanks for your answer!

handle sharing rds snapshot rate limiting

[ERROR]	2020-01-08T03:25:38.330Z	bbb8702a-2469-5f17-b45f-26a56fa1baa4	Failed to share *** (ap-southeast-2) with account ***
Traceback (most recent call last):
  File "/var/task/shelvery/engine.py", line 617, in do_share_backup
    self.share_backup_with_account(backup_region, backup_id, destination_account_id)
  File "/var/task/shelvery/rds_backup.py", line 99, in share_backup_with_account
    ValuesToAdd=[aws_account_id]
  File "/var/task/botocore/client.py", line 272, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/task/botocore/client.py", line 576, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (Throttling) when calling the ModifyDBSnapshotAttribute operation (reached max retries: 10): Rate exceeded

Recommendations and best practices

@Guslington first of all thanks for shelvery, we're starting to implement it at wildlifestudios.com and have some questions before investing more time into it:

Deployment

There seem to be two ways to deploy shelvery:

  • serverless
  • SAM template

The SAM template seems to be more updated. It creates SQS and SNS. Is this the recommended way to deploy the tool?

Deployment II

We're going to be deploying this tool in multiple AWS accounts, with the objective of sharing and copying snapshots to a centralized AWS account.

Any tips to achieve this? So far, what I see is that I'll need to write different configs for the source and destination accounts, and create a script that will deploy these for every region and account. Is this the way to go?

Known issues

Reviewed the open and closed issues, the software seems solid. Is there something I should be aware of before implementing? Any strange stuff I'll find?

Thanks!
