Git Product home page Git Product logo

media-backups's Introduction

Description

Over the years I've acquired a number of USB disks and USB thumb drives. I also have a large media server with pictures and music and cdrom iso images and video files that I could probably recreate, but it would be long and time consuming.

It's no long practical to back large file servers up to tape, cdroms or even DVDs. However all the old drives I have from various sources can store all my media files several times over.

Backup Method

The overall algorithm works like so:

  • Get a list of all files that need to be backed up. Done by: update-files.py
  • Split each file into a series of 64 meg chunks and copy it to the smallest free backup disk.
  • Track how many copies of each chunk exist.

Disk Management

Disk quality varies so for each disk assign the disk a confidence value. Initially I'm assigning 1 to flash drives and 2 to disks. Decide that enough copies have been made when a certain confidence threshhold has been met. For now I'm kind of partial to 6.

When a disk is full it should be removed and put offsite. So there needs to be some alerting there. The disks are encrypted so offsite storage should be fine.

These tools manage backup disk lifecycles:

  • list-disks.sh: Get a list of known backup disks. TODO: Clean up output. Highlight attached disks and report storage levels. Record last known storage levels for unattached disks.
  • add-disk.sh: Adds a new disk to the backup pool.
  • mount-disk.sh: Mounts an existing disk. TODO: Remove chunks that are no longer in use.
  • rm-disk.sh: Remove disk from storage pool. TODO: Remove chunks that were on that disk and update files.
  • umount-disk.sh: Unmount a backup disk. TODO: Record its storage levels.

TODO

  • Hardcoded to my needs. Everything needs to go into a config file.
  • Shell scripts probably need to evolve to python scripts.
  • Chunk distribution - currently do the initial backups of a chunk to a single disk. Need to add to that and distribute chunks till a confidence level of 6 is reached.
  • Compression?
  • An fsck for backup disks. Confirm checksums (filenames) match file contents.
  • Database access is really limited. How does one make sqlite db access less exclusive?
  • Should store file meta info to limit need to recalc sha1 sums.
  • Some way to request disks be removed while backup-files.py is running.
  • Store chunk size info.
  • Store disk size info.

media-backups's People

Contributors

lyda avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.