
This repository contains the source code for the AWS Database Blog post "Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3".

License: MIT No Attribution

amazon-athena amazon-eventbridge amazon-rds amazon-s3 amazon-sns aws-backup aws-cloudformation aws-glue aws-glue-crawler aws-kms


Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3

This repository contains a CloudFormation template that deploys a serverless, event-driven solution. The solution integrates AWS Backup with the Amazon RDS export feature to automate export tasks and lets you query the exported data with Amazon Athena without provisioning a new RDS instance or Aurora cluster.

Overview

The following diagram illustrates the architecture of the solution.

Solution Diagram

Let’s go through the steps shown in the diagram above:

  1. You create a backup plan that stores database backups in the backup vault created by this solution.
  2. In this solution, we use AWS Backup as a signal source for an EventBridge rule.
  3. The EventBridge rule triggers an AWS Lambda function, which starts an export task for the database (see the sketch after this list). This solution uses AWS Key Management Service (AWS KMS) to encrypt the database exports in Amazon S3.
  4. This solution uses Amazon Simple Storage Service (Amazon S3) to store the database exports.
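To make step 3 concrete, here is a minimal sketch of how such a Lambda handler could start the export task with boto3. The event parsing, environment variable names, and identifiers are assumptions for illustration, not the actual code shipped in this repository.

```python
# Illustrative sketch of the export-task Lambda (step 3). The event shape and
# environment variable names are assumptions, not the repository's actual code.
import os
import boto3

rds = boto3.client("rds")

def handler(event, context):
    # The EventBridge rule forwards the AWS Backup completion event; for an RDS
    # database the recovery point is a snapshot ARN (field name assumed).
    snapshot_arn = event["detail"]["resourceArn"]

    rds.start_export_task(
        ExportTaskIdentifier="export-" + event["id"],   # EventBridge event ID keeps the name unique
        SourceArn=snapshot_arn,
        S3BucketName=os.environ["EXPORT_BUCKET"],       # bucket created by the stack (assumed env var)
        IamRoleArn=os.environ["EXPORT_ROLE_ARN"],       # role allowed to write to the bucket (assumed)
        KmsKeyId=os.environ["KMS_KEY_ID"],              # KMS key that encrypts the export
        # ExportOnly=["my_schema.my_table"],            # optional: export only specific objects
    )
```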

This solution also provides an option if you don’t need to query the exported data using Athena: when deploying the CloudFormation template, you can choose to skip the creation of the resources for steps 5, 6, and 7.

  5. The EventBridge rule triggers a second Lambda function when the export task is completed. It uses Amazon Simple Notification Service (Amazon SNS) to send an email if the export task fails.
  6. The Lambda function uses AWS Glue to create a database and a crawler, and then runs the crawler (see the sketch after this list).
  7. After the crawler runs successfully, you can use Amazon Athena to query the data directly in Amazon S3.
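And a sketch for steps 5 and 6: the Lambda function triggered on export-task completion publishes to SNS on failure, or otherwise creates and starts the Glue crawler. All resource names and the event check below are illustrative assumptions.

```python
# Illustrative sketch of the export-completed Lambda (steps 5 and 6). Resource
# names and the event check are assumptions, not the repository's actual code.
import os
import boto3

glue = boto3.client("glue")
sns = boto3.client("sns")

def handler(event, context):
    message = event["detail"].get("Message", "")        # RDS event text (shape assumed)

    if "failed" in message.lower():
        sns.publish(
            TopicArn=os.environ["SNS_TOPIC_ARN"],       # topic created by the stack (assumed env var)
            Subject="RDS snapshot export failed",
            Message=message,
        )
        return

    # Create a Glue database and a crawler over the export prefix, then run it.
    glue.create_database(DatabaseInput={"Name": "rds_exports"})            # name assumed
    glue.create_crawler(
        Name="rds-export-crawler",                                          # name assumed
        Role=os.environ["GLUE_ROLE_ARN"],                                   # Glue service role (assumed env var)
        DatabaseName="rds_exports",
        Targets={"S3Targets": [{"Path": "s3://my-export-bucket/"}]},        # placeholder bucket
    )
    glue.start_crawler(Name="rds-export-crawler")
```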

Usage

To get started, create the solution resources using a CloudFormation template:

  1. Download the templates/automate-rds-aurora-export.yaml CloudFormation template to create a new stack.
  2. For Stack name, enter a name.

stack name

  3. For KMS Key Configuration, choose whether you want a new KMS key to be created as part of this solution. If you already have an existing KMS key that you want to use, choose No.
  4. If you choose No for KMS key creation, you must enter a valid KMS key ID for the solution to use. You need to configure the key users manually after the solution is deployed. Leave this field blank if you chose Yes for KMS Key Configuration.
  5. Under RDS Export Configuration, enter a valid email address to receive a notification when an S3 export task fails.
  6. Enter schema, database, or table names as a comma-separated list if you want only specific objects to be exported. Otherwise, leave this field blank to export all database objects. You can find more details about this parameter in the AWS Boto3 documentation.
  7. Choose Yes if you want the solution to make exports automatically available in Athena.
  8. Choose Next.

parameters

  9. Accept all the defaults and choose Next.
  10. Acknowledge the creation of AWS Identity and Access Management (IAM) resources and choose Submit.

submit

The stack creation starts with the status CREATE_IN_PROGRESS and takes approximately 5 minutes to complete.
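If you prefer to script the deployment instead of using the console, a boto3 sketch along these lines would work; the stack name and parameter keys are placeholders and must match the parameters defined in templates/automate-rds-aurora-export.yaml.

```python
# Hedged sketch of a scripted deployment; the ParameterKey values are
# placeholders -- use the parameter names defined in the template.
import boto3

cfn = boto3.client("cloudformation")

with open("templates/automate-rds-aurora-export.yaml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="rds-export-automation",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "NotificationEmail", "ParameterValue": "you@example.com"},  # placeholder key
    ],
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],  # acknowledges IAM resource creation
)

# Block until the stack reaches CREATE_COMPLETE (roughly five minutes).
cfn.get_waiter("stack_create_complete").wait(StackName="rds-export-automation")
```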

  11. On the Outputs tab, take note of the following resource names (the sketch below shows how to read them programmatically):
    • BackupVaultName
    • IamRoleForGlueService
    • IamRoleForLambdaBackupCompleted
    • IamRoleForLambdaExportCompleted
    • SnsTopicName

outputs
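The same values can also be read from the stack outputs programmatically; a small sketch (stack name assumed):

```python
# Read the stack outputs listed in step 11; the stack name is an assumption.
import boto3

cfn = boto3.client("cloudformation")
stack = cfn.describe_stacks(StackName="rds-export-automation")["Stacks"][0]
outputs = {o["OutputKey"]: o["OutputValue"] for o in stack["Outputs"]}

print(outputs["BackupVaultName"], outputs["SnsTopicName"])
```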

  12. If you decided to use an existing KMS key, give the IAM roles you noted in step 11 access to that key. You can do this from the AWS Management Console using either the default view or the policy view of the key policy (see the sketch after this list).
  13. Check your email inbox and choose Confirm subscription in the email from Amazon SNS. Amazon SNS opens your web browser and displays a subscription confirmation with your subscription ID.
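For step 12, one possible scripted alternative to the console is to append a key-user statement to the existing key policy. The key ID and role ARNs below are placeholders for the roles noted in step 11, and the action list is a typical key-user set; verify it against your own policy requirements before applying.

```python
# Sketch: add the solution's IAM roles as users of an existing KMS key.
# Key ID and role ARNs are placeholders; review the actions before applying.
import json
import boto3

kms = boto3.client("kms")
key_id = "1234abcd-12ab-34cd-56ef-1234567890ab"  # your existing key (placeholder)

policy = json.loads(kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"])
policy["Statement"].append({
    "Sid": "AllowRdsExportAutomationRoles",
    "Effect": "Allow",
    "Principal": {"AWS": [
        "arn:aws:iam::123456789012:role/IamRoleForLambdaBackupCompleted",  # placeholder ARNs --
        "arn:aws:iam::123456789012:role/IamRoleForLambdaExportCompleted",  # use the roles from
        "arn:aws:iam::123456789012:role/IamRoleForGlueService",            # the stack outputs
    ]},
    "Action": ["kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*",
               "kms:GenerateDataKey*", "kms:DescribeKey", "kms:CreateGrant"],
    "Resource": "*",
})
kms.put_key_policy(KeyId=key_id, PolicyName="default", Policy=json.dumps(policy))
```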

Now you’re ready to store all your RDS or Aurora database exports on Amazon S3 automatically and make them available in Athena. This solution works for any RDS or Aurora database backup taken with AWS Backup that uses the backup vault created by the CloudFormation template.

Before you use this solution, ensure your RDS instance supports exporting snapshots to Amazon S3. Tables or rows may be excluded from the export if they use incompatible data types. Review the feature limitations for RDS and Aurora, and test the data consistency between the source database and the exported data queried through Athena.
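One quick, hedged way to run such a consistency check is a row count in Athena compared against the source table; the database, table, and result location below are placeholders.

```python
# Sketch: row-count query against the crawled export data in Athena.
# Database, table, and result location are placeholders for your environment.
import boto3

athena = boto3.client("athena")

athena.start_query_execution(
    QueryString="SELECT COUNT(*) FROM my_table",                        # compare with the source table
    QueryExecutionContext={"Database": "rds_exports"},                  # Glue database name assumed
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder results bucket
)
```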

Clean up

To avoid incurring future charges, delete the resources you created (a scripted sketch covering steps 2 and 3 follows this list):

  1. On the AWS Backup console, delete the recovery points.
  2. On the Amazon S3 console, empty the S3 bucket created by the CloudFormation template to store the RDS database exports.
  3. On the AWS CloudFormation console, delete the stack that you created for the solution.
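A scripted sketch covering steps 2 and 3; the bucket and stack names are placeholders (for a versioned bucket, delete the object versions as well).

```python
# Sketch: empty the export bucket, then delete the stack. Names are placeholders.
import boto3

boto3.resource("s3").Bucket("my-export-bucket").objects.all().delete()
boto3.client("cloudformation").delete_stack(StackName="rds-export-automation")
```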
