matiri's Introduction

matiri

  • Parallel MySql backup script storing backup info in Sqlite3 database.

Features:

  • Multi-Server: Multiple MySQL servers are supported, whether they run on the same physical server or on separate ones.
  • Parallel: Each database on a server is backed up separately, in parallel (concurrency is settable; default: 3).
  • Compressed: Each database backup is compressed.
  • Checksummed: SHA256 checksums are stored for each compressed backup file and for the archive of all files.
  • Archived: All database backups are tar'ed together into a single file.
  • Recorded: Backup information is stored in a Sqlite3 database.

MySql Credentials

matiri invokes a script called mysql.sh (which must be in the same directory as the matiri script) to obtain the MySQL server host, port number, user, password, and the databases to include or exclude. A default implementation is supplied but should be modified to be more secure; for obvious security reasons, this file should be mode 700. Note that either an include list or an exclude list may be given, but not both.
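
The interface between matiri and mysql.sh is defined by the supplied default implementation; the sketch below is only a hypothetical illustration of keeping credentials in a separate, mode-700 script (the MYSQL_* names follow the configuration example used elsewhere in this project; INCLUDE_DATABASES/EXCLUDE_DATABASES are assumed names):

    #!/bin/bash
    # Hypothetical credentials script; the real mysql.sh shipped with matiri
    # defines the exact variables/output matiri expects.
    # Keep this file mode 700 so only the backup user can read the password.
    MYSQL_HOST='mysqlserver'     # MySQL server host
    MYSQL_PORT='3306'            # MySQL server port
    MYSQL_USER='backups'         # backup account
    MYSQL_PASSWORD='mypass'      # consider a more secure source than a literal
    INCLUDE_DATABASES=''         # assumed name: databases to include (or empty)
    EXCLUDE_DATABASES='test'     # assumed name: databases to exclude; only one
                                 # of include/exclude may be non-empty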

Running

  1. Alter the mysql.sh script to include one or more host, port, user, password, and included or excluded databases
  2. Alter matiri_options to modify backup destinations, concurrency, and naming conventions
  3. Start matiri
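
matiri takes its configuration from matiri_options and mysql.sh, so for unattended use it can simply be launched from cron; the schedule and paths below are only an example:

    # Example crontab entry (illustrative paths): run matiri every night at 01:30
    30 1 * * * /home/backups/matiri/matiri >> /home/backups/matiri/matiri.cron.log 2>&1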

Directory Structure

All backup files are grouped by year and month under $BASE_DESTINATION_DIR/YYYY/MM; this layout can be modified in matiri_options.
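
The layout can be reproduced with standard date formatting; a minimal sketch of how such a destination path is derived (the BASE_DESTINATION_DIR value is illustrative):

    # Per-month destination directory, e.g. /home/backups/backups/2013/11
    BASE_DESTINATION_DIR='/home/backups/backups'
    DEST_DIR="${BASE_DESTINATION_DIR}/$(date +%Y)/$(date +%m)"
    mkdir -p "${DEST_DIR}"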

Four files are produced:

  1. mysql_backup_YYYY-MM-DD_ID.tar
    • tar of database backup files (see below)
  2. mysql_backup_YYYY-MM-DD_ID.tar.sha256
    • SHA256 of #1
  3. mysql_backup_YYYY-MM-DD_ID.meta
    • Info about the backup (redundant with Sqlite3 information)
  4. mysql_backup_YYYY-MM-DD_ID.err
    • stderr output from backup process

Database Backup files

The tar file (#1 above) contains, for each database being backed up, two files:

  1. database__DBNAME.sql.gz
    • gzip of mysqldump output
  2. database__DBNAME.sql.gz.sha1
    • SHA1 of #1
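
Assuming the checksum files are written in the format produced by sha256sum/sha1sum, an archive and its contents can be verified like this (filenames are illustrative):

    # Verify the archive first, then the per-database files inside it
    sha256sum -c mysql_backup_2013-11-23_15.tar.sha256
    tar -xf mysql_backup_2013-11-23_15.tar
    sha1sum -c database__DBNAME.sql.gz.sha1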

Dependencies

  1. Sqlite3
  2. mysqldump (Tested on mysqldump Ver 10.13 Distrib 5.5.34, for Linux (x86_64))
  3. Standard Linux tools (tar, gzip, date, awk, xargs, sha1sum)

Sqlite3 Database

Default sqlite location: $BASE_DESTINATION_DIR/backups.sqlite3.db

  • Each time matiri is run, an entry in the 'backup_event' table is created for each defined server (multiple MySQL servers = multiple backup events).
  • The record is added, indicating a backup_event was started, with the 'completed' column set to -999 (not completed)
  • For each of the databases to be backed up:
    • A database record is added before the database backup starts, with the backup_event id as the foreign key 'backup_id' and the 'completed' column set to -999 (not completed).
    • If this database backup completes successfully, the record is updated with the 'completed' column set to 0 (completed), the end_time is set, the size ('bytes') and the SHA256 of the backup file are recorded.
  • If the backup event has successfully executed, the backup_event is updated with the 'completed' column set to 0 (completed), the end_time is set, the size ('bytes') and the SHA256 of the tar file are recorded.
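
A sketch of that record lifecycle, expressed as the kind of statements matiri could issue against the Sqlite3 database (ids, sizes, and checksums are placeholders; the actual SQL lives in the matiri scripts):

    DB="${BASE_DESTINATION_DIR}/backups.sqlite3.db"

    # 1. Open the backup event as "not completed" (-999)
    sqlite3 "${DB}" "INSERT INTO backup_event (completed, host, port, start_time, end_time, bytes, sha256)
                     VALUES (-999, 'localhost', 3306, datetime('now'), datetime('now'), 0, '');"

    # 2. Per database: add a row before the dump starts, keyed to the backup event
    sqlite3 "${DB}" "INSERT INTO database (completed, backup_id, database, start_time, end_time, bytes, sha256)
                     VALUES (-999, 15, 'mydb', datetime('now'), datetime('now'), 0, '');"

    # 3. On success: mark completed (0) and record end time, size, and checksum
    sqlite3 "${DB}" "UPDATE database SET completed = 0, end_time = datetime('now'),
                     bytes = 12345, sha256 = 'sha256-of-the-gz-file' WHERE id = 61;"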

Database schema:

    CREATE TABLE backup_event (id INTEGER PRIMARY KEY, completed int NOT NULL, comments text, 
           host varchar(255) NOT NULL, port int NOT NULL, 
           start_time DATETIME not null, end_time DATETIME not null, 
           user varchar(64), bytes bigint NOT NULL, file text, sha256 char(64) NOT NULL, 
           error default NULL);


    CREATE TABLE database (id INTEGER PRIMARY KEY,  completed int NOT NULL, backup_id INTEGER, 
           database varchar(255) NOT NULL, file text, 
           start_time DATETIME not null, end_time DATETIME not null, 
           bytes bigint NOT NULL, sha256 char(64) NOT NULL, 
           error default NULL, FOREIGN KEY(backup_id) REFERENCES backup_event(id));

    CREATE INDEX backup_start_time on backup_event(start_time);

    CREATE INDEX database_start_time on database(start_time);

Perusing Sqlite3 Backup Database

Using the Sqlite3 command line tool

sqlite> select * from backup_event;
id|completed|comments|host|port|start_time|end_time|user|bytes|file|sha256|error
12|0||localhost|3306|2013-11-23 17:23:15|2013-11-23 17:23:15|backups|20480|/home/newtong/backups/2013/11/mysql_backup_2013-11-23_12.tar|6f216d2a4811382b66b25480328b385bab54e7531f73bf2aa5262b00b030017c|
13|0||localhost|3306|2013-11-23 17:23:26|2013-11-23 17:23:27|backups|20480|/home/newtong/backups/2013/11/mysql_backup_2013-11-23_13.tar|da5721440c8577a3b250232ba2e901350ea9a34876212312a5b9a28206ae6d33|
14|0||localhost|3306|2013-11-23 17:23:50|2013-11-23 17:23:51|backups|20480|/home/newtong/backups/2013/11/mysql_backup_2013-11-23_14.tar|f2a9b41e4157da803d79cf385db17dad1d273e48b352eba2cd0209eaf90fa2e9|
15|0||localhost|3306|2013-11-23 17:24:08|2013-11-23 17:24:15|backups|29399040|/home/newtong/backups/2013/11/mysql_backup_2013-11-23_15.tar|8ef6dbdb3537361e48bce1d3eeb3c114d25ebf8d7eb808312384035221f20e32|

sqlite> select * from database where backup_id = 15;
id|completed|backup_id|database|file|start_time|end_time|bytes|sha256|error
61|0|15|performance_schema|database__performance_schema.gz|2013-11-23 17:24:09|2013-11-23 17:24:15|1100|2e7ea55832e3fbb62ee1370a1f0b6ffef2415aba79a129b419181195588b6c27|
62|0|15|information_schema|database__information_schema.gz|2013-11-23 17:24:09|2013-11-23 17:24:15|395504|694688b16377916f31f9dbe2a8647928a6cbb4cd5419767b3335c5ca7e5e5f37|
63|0|15|mysql|database__mysql.gz|2013-11-23 17:24:09|2013-11-23 17:24:15|142109|a72789bedfdcc73ea419750fd4904fd1c859e175f23be208b58d1c262e45eae5|
64|0|15|specify_dao_live|database__specify_dao_live.gz|2013-11-23 17:24:09|2013-11-23 17:24:15|28842782|1d91517672bae1ae39294d3e4f819e79eb6d0c4325d9c30d2d78303c34e189cd|
sqlite> 
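
The two tables can also be joined for a per-database view of a backup event; a sketch using the sqlite3 command line tool:

    sqlite3 "${BASE_DESTINATION_DIR}/backups.sqlite3.db" \
      "SELECT b.id, b.host, b.port, d.database, d.bytes, d.end_time
         FROM backup_event b JOIN database d ON d.backup_id = b.id
        WHERE b.completed = 0
        ORDER BY b.start_time DESC;"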

TODOs

  1. Command line parameters for the backup location and for dynamically setting the credentials (mysql.sh) script
  2. Better docs here and in the scripts
  3. Fix scripts to better adhere to the Google bash style guide
  4. Alternate compression: {p}bzip2 should be easy
  5. Scripts to both delete old backup files and remove the corresponding entries from the Sqlite3 db (keep them in sync)
  6. Simple web server app to peruse backup information in the Sqlite3 DB
  7. Explanation of the Sqlite3 fields stored
  8. Implement support for passwordless backups

Name

Named after the Matiri River, New Zealand. I have been very close on several occasions but have never seen the river.

Acknowledgements

Partially developed at Agriculture and Agri-Food Canada, Ottawa, Ontario.

matiri's People

Contributors

giustiziaj, gnewton, iyadkandalaft


matiri's Issues

Script to detect big deltas in size of backups

A script is needed to compare the size of the most recent backup of a database with the size of the previous backup.
Note that this refers to the size of the file generated by mysqldump, compressed and tar'ed with the other mysqldump files from the same backup event, whose size and metadata are stored in the database record.
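
A minimal sketch of such a check, comparing the bytes column of the two most recent completed rows in the database table for one database (the 20% threshold and the database name are illustrative):

    #!/bin/bash
    DB="${BASE_DESTINATION_DIR}/backups.sqlite3.db"
    sizes=( $(sqlite3 "${DB}" \
      "SELECT bytes FROM database WHERE completed = 0 AND database = 'mydb'
       ORDER BY start_time DESC LIMIT 2;") )
    new=${sizes[0]}; old=${sizes[1]}
    if [ -n "${old}" ]; then
      delta=$(( new > old ? new - old : old - new ))
      if [ $(( delta * 100 )) -gt $(( old * 20 )) ]; then
        echo "WARNING: backup size changed from ${old} to ${new} bytes (>20%)"
      fi
    fi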

Rename `database` table?

Calling the database table 'database' leads to some ambiguity/confusion in discussions. Perhaps change it to a 'database_meta' table or something less ambiguous?

crontab and matiri directory should be in the base directory of the backups

This is more of a best practice (suggested by IK): put the crontab(s) for backups and the matiri script directory in the base backup directory (in matiri: BASE_DESTINATION_DIR)

This way, the backups are completely self-contained in this directory: the cron entries, scripts, and backup archive can just be copied around.

Add reading from configuration file

Right now most of the config is in the script.

It should instead read from an external bash script, supplied on the command line, which it sources for config.
Example:

BASE_DESTINATION_DIR='/home/backups/backups'
MYSQL_USER='backups'
MYSQL_HOST='mysqlserver'
MYSQL_PASSWORD='mypass'
MYSQL_PORT='3306'
CONCURRENCY_LEVEL='3'
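
A minimal sketch of how matiri could consume such a file, sourcing a path given on the command line (this option handling is hypothetical, not something matiri currently does):

    # Hypothetical invocation: matiri /etc/matiri/backups.conf
    CONFIG_FILE="$1"
    if [ -r "${CONFIG_FILE}" ]; then
      . "${CONFIG_FILE}"   # pulls in BASE_DESTINATION_DIR, MYSQL_* and CONCURRENCY_LEVEL
    else
      echo "Cannot read config file: ${CONFIG_FILE}" >&2
      exit 1
    fi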

MySQL dump with InnoDB foreign keys might fail on loading: does not turn off foreign_key_checks

The mysqldump output may contain a table create statement with a foreign key reference to a table that has not yet been created (i.e. it is created later in the dump restore). If foreign_key_checks is not turned off, this will cause the load of the dump to fail.
A workaround is to prepend a statement that turns off the foreign key checks:

cat <(echo "SET FOREIGN_KEY_CHECKS=0;") imports.sql | mysql

from: http://stackoverflow.com/a/15938887/459050

Perhaps matiri could prepend this itself to avoid the workaround.
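
One way to do that would be to emit the statement ahead of the mysqldump output before it is compressed; a sketch (the mysqldump options shown are generic placeholders, not matiri's actual invocation):

    # Make "SET FOREIGN_KEY_CHECKS=0;" the first line of the dump, restore it at the end
    {
      echo "SET FOREIGN_KEY_CHECKS=0;"
      mysqldump -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" "${DBNAME}"
      echo "SET FOREIGN_KEY_CHECKS=1;"
    } | gzip > "database__${DBNAME}.sql.gz"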

Add INCLUDE and EXCLUDE functionality

In selecting which database(s) to back up, there are 3 modes:

  • INCLUDE and EXCLUDE are both empty: ALL
  • INCLUDE is a list of one or more strings; EXCLUDE is empty: ONLY backup those in INCLUDE list
  • EXCLUDE is a list of one or more strings; INCLUDE is empty: backup everything EXCEPT those in the EXCLUDE list

If both INCLUDE and EXCLUDE are set, the script will fail
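
The mutual-exclusion rule can be enforced with a simple check before any dump starts (variable names follow the INCLUDE/EXCLUDE wording above):

    # Fail fast if both lists are set; an empty value means "not used"
    if [ -n "${INCLUDE}" ] && [ -n "${EXCLUDE}" ]; then
      echo "ERROR: INCLUDE and EXCLUDE cannot both be set" >&2
      exit 1
    fi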

Script to remove from file system and from DB

Over time, backups are removed.
A script must support this:

  • removing the file in the FS and removing the associated records in the database table.
  • NOT removing the record in the backup_events table
  • Add a field to the backup_events table: a DateRemoved timestamp. If it is not null, the backup file has been removed (the record itself stays).
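
A sketch of that removal flow for a single backup event, following the existing schema (the DateRemoved column is the proposed addition and does not exist yet, so the sketch only notes where it would be set):

    #!/bin/bash
    # Remove one backup's files from disk and keep the Sqlite3 bookkeeping consistent
    DB="${BASE_DESTINATION_DIR}/backups.sqlite3.db"
    BACKUP_ID="$1"

    # Delete the tar recorded for this backup event, plus its sidecar files
    FILE=$(sqlite3 "${DB}" "SELECT file FROM backup_event WHERE id = ${BACKUP_ID};")
    [ -n "${FILE}" ] && rm -f "${FILE}" "${FILE}.sha256" "${FILE%.tar}.meta" "${FILE%.tar}.err"

    # Drop the per-database rows, but keep the backup_event row
    sqlite3 "${DB}" "DELETE FROM database WHERE backup_id = ${BACKUP_ID};"
    # Once a DateRemoved column exists, it would be set here, e.g.:
    # sqlite3 "${DB}" "UPDATE backup_event SET DateRemoved = datetime('now') WHERE id = ${BACKUP_ID};"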

Add short delay between mysqldump commands

When several mysqldumps are started because CONCURRENCY_LEVEL > 1, both the local machine and the remote machine may take a hit if they are all started at the same moment.

Add a small (5s-20s), settable delay between starting them.
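
A sketch of such a stagger, sleeping a random 5-20 seconds before each dump starts (the range comes from this issue; the delay would ideally be settable in matiri_options):

    # Stagger concurrent dumps: wait a random 5-20 seconds before this one starts
    STAGGER=$(( (RANDOM % 16) + 5 ))
    sleep "${STAGGER}"
    # ...then run the usual per-database mysqldump/gzip pipeline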

Change name of database dump file to have ".sql" suffix

Right now the naming convention for the database export is:

database__foobar

where foobar is the name of the database being dumped.

This is gzipped to:

database__foobar.gz

Recent use by my colleagues suggests that naming this file with a ".sql" suffix would be more useful than the current convention:

database__foobar.sql.gz

get_sha: File does not exist message

I noticed that when the script runs, there is a "File does not exist" message generated from the get_sha() function in util.sh:

File does not exist: /home/fedora/db-backups/2019/02/mysql_backup_2019-02-02_3_127.0.0.1_3306/database__127.0.0.1_3306_iah.sql.gz.sha256
util.sh: line 84: /home/fedora/db-backups/2019/02/mysql_backup_2019-02-02_3_127.0.0.1_3306/database__127.0.0.1_3306_iah.sql.gz.sha256: No such file or directory

I am not sure whether this is normal, informational output or an indication of some sort of error.

Feature request: backup rotation

I could not see whether your script supports backup rotation. That is, can a backup be expired after a certain amount of time and automatically deleted?
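
Rotation is not described above (deleting old backups is still on the TODO list), so it would be an add-on. A minimal, file-system-only sketch that prunes archives older than a retention period (90 days is illustrative; to keep the Sqlite3 db in sync it should be combined with the removal-script approach discussed above):

    # Delete backup artifacts older than 90 days
    RETENTION_DAYS=90
    find "${BASE_DESTINATION_DIR}" -type f -name 'mysql_backup_*' \
         -mtime +"${RETENTION_DAYS}" -print -delete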
