
backup-vm's People

Contributors

grmrgecko, m-beno, milkey-mouse, rugk

backup-vm's Issues

RAM snapshot backup

In order for libvirt to allow restoration, backup and restore ops would have to fully store the state of the snapshot in the backup so it can be tacked back onto the VM and switched to. I think this may be out of scope/not feasible with current libvirt API.

Right now --memory is completely broken (it doesn't even parse correctly); the flag should probably be taken out until this feature is implemented.

Integration into existing backup process

How is it intended to be integrated into an existing backup process? E.g. in my script a usual backup is done. If I wanted to include VM images in it, I would need to dump these and then back up everything.

But this script automatically executes the backup, so is it designed to have an extra repo just for VM images? (In the usual use case, I think, I want to back up both usual data and VMs.) Or may I just back up the VMs into the same repo, but with a different archive name?

Save temp snapshots in temp dir

That does not look good:

backup-vm/backup-vm.py

Lines 365 to 367 in 12b57ff

# we probably can't write the temporary snapshot to the same directory
# as the original disk, so use the default libvirt images directory
disk.snapshot_path = os.path.join("/var/lib/libvirt/images", filename)

Wouldn't it be better to save the temporary snapshots in the temp dir (/tmp), which is usually intended for that purpose?

Block copy still active: disk 'vda' not ready for pivot yet

I am having some issues with backup-vm. The error below happens to me fairly often. It seems to be more frequent for some VMs than others, which could depend on the load/IO in the guest, I guess.

Most VMs have worked on first try. Some have required 2-3 tries. But I have one that has not yet worked at all, after 5-6 tries. It has however finished vda once or twice, but then it got stuck on vdb instead, so it's a bit random on that one as well. With the plan to run backups automatically daily or weekly this is a bit of an issue.

backup progress: 100%
libvirt: error code 83: block copy still active: disk 'vda' not ready for pivot yet
Traceback (most recent call last):
  File "/usr/local/bin/backup-vm", line 11, in <module>
    load_entry_point('backup-vm==0.1.dev30+gf2d6dfd', 'console_scripts', 'backup-vm')()
  File "/usr/local/lib/python3.7/dist-packages/backup_vm-0.1.dev30+gf2d6dfd-py3.7.egg/backup_vm/backup.py", line 54, in main
    borg_failed = multi.assimilate(args.archives)
  File "/usr/local/lib/python3.7/dist-packages/backup_vm-0.1.dev30+gf2d6dfd-py3.7.egg/backup_vm/snapshot.py", line 175, in __exit__
    self.blockcommit(disks_to_backup)
  File "/usr/local/lib/python3.7/dist-packages/backup_vm-0.1.dev30+gf2d6dfd-py3.7.egg/backup_vm/snapshot.py", line 105, in blockcommit
    if self.dom.blockJobAbort(disk.target, libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) < 0:
  File "/usr/lib/python3/dist-packages/libvirt.py", line 784, in blockJobAbort
    if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
libvirt.libvirtError: block copy still active: disk 'vda' not ready for pivot yet

I don't understand whether the issue is in backup-vm or in libvirt. I found this old but recently fixed bug in Ubuntu. It's only for older releases, though. The package libvirt no longer exists in either Ubuntu or Debian.
https://launchpad.net/ubuntu/+source/libvirt/1.3.1-1ubuntu10.29

When the above happens I can see this:

virsh blockjob pgc-srtm-01 vda --info
Active Block Commit: [100 %]

virsh domblklist pgc-srtm-01
Target Source
vda /var/lib/libvirt/images/pgc-srtm-01-vda-tempsnap.qcow2
vdb /dev/pgc-kvm-04/pgc-srtm-01_srtm

I can fix the state by running these two commands:

virsh blockjob pgc-srtm-01 vda --abort
virsh blockcommit pgc-srtm-01 vda --active --verbose --pivot

Sometimes it does however leave the qcow2 in /var/lib/libvirt/images/, and also still links it in the xml. It seems to work fine to just remove the qcow2 file, edit the xml and virsh define it again.

My VMs use LVM logical volumes as the storage back-end. The system is Debian Stable (Buster), fully upgraded.

I have also experienced the error message seen in issue #15. Not sure if it's related, but I feel that it could be.

Since I can reproduce it, anything I can do to provide more information?

Otherwise, thank you for a great piece of software. I am really hoping to get it working in my environment as well. It does almost everything I wish for!

Makefile

Great project! And it would be even better with a Makefile.

How to pass additional commands to script?

AFAIK, since you call borg directly in the script, it may be hard to pass additional options to it. E.g. you cannot configure --compression, --chunker-params, etc.

Can't we use this in a more flexible way? Or is there no reason to configure it in such a way? (if so… why?)
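One possible way to make this more flexible, sketched under assumptions: everything after a literal "--" on the command line is forwarded to borg unchanged. The parser below is a simplified, hypothetical stand-in for the real backup-vm command line, and split_borg_args and borg_command are illustrative names, not functions from the project.

```python
import argparse


def split_borg_args(argv):
    """Split argv at the first "--": left side is ours, right side goes to borg."""
    if "--" in argv:
        index = argv.index("--")
        return argv[:index], argv[index + 1:]
    return list(argv), []


def borg_command(archive, disk_paths, borg_extra):
    """Build the argv for 'borg create', placing extra options before the archive."""
    return ["borg", "create"] + list(borg_extra) + [archive] + list(disk_paths)


# Hypothetical, simplified command line: one domain and one archive.
parser = argparse.ArgumentParser(prog="backup-vm")
parser.add_argument("domain")
parser.add_argument("archive")

own, extra = split_borg_args(["myvm", "repo::archive", "--", "--compression", "lz4"])
args = parser.parse_args(own)
command = borg_command(args.archive, ["/dev/vg/disk"], extra)
print(command)
```

Splitting on "--" by hand avoids the ambiguity of letting argparse guess whether a value such as lz4 belongs to an unknown option or is a positional argument.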

Execution fails for empty CD drives

If there is a cd drive with no .iso mounted, the execution of the script fails.

  File "/usr/bin/backup-vm", line 11, in <module>
    load_entry_point('backup-vm==0.1.dev17+gce32c59.d20171207', 'console_scripts', 'backup-vm')()
  File "/usr/lib/python3.6/site-packages/backup_vm-0.1.dev17+gce32c59.d20171207-py3.6.egg/backup_vm/backup.py", line 28, in main
    all_disks = set(parse.Disk.get_disks(dom))
  File "/usr/lib/python3.6/site-packages/backup_vm-0.1.dev17+gce32c59.d20171207-py3.6.egg/backup_vm/parse.py", line 187, in get_disks
    yield from {d for d in map(cls, tree.findall("devices/disk")) if d.type is not None}
  File "/usr/lib/python3.6/site-packages/backup_vm-0.1.dev17+gce32c59.d20171207-py3.6.egg/backup_vm/parse.py", line 187, in <setcomp>
    yield from {d for d in map(cls, tree.findall("devices/disk")) if d.type is not None}
  File "/usr/lib/python3.6/site-packages/backup_vm-0.1.dev17+gce32c59.d20171207-py3.6.egg/backup_vm/parse.py", line 163, in __init__
    if len(xml.find("source").attrib.items()) >= 1:
AttributeError: 'NoneType' object has no attribute 'attrib'

The issue is that the following line:

if len(xml.find("source").attrib.items()) >= 1:

is looking for attributes in the "source" entry in the XML, which does not exist in the cd-rom block when there is no .iso file loaded.
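A minimal sketch of one way to guard against the missing element, assuming the disks are parsed with xml.etree.ElementTree as the traceback suggests. The XML fragment and the disks_with_source helper are made up for illustration and are not the project's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML: one regular disk and one CD drive with no .iso loaded.
DOMAIN_XML = """
<domain>
  <devices>
    <disk type="file" device="disk">
      <source file="/var/lib/libvirt/images/example.qcow2"/>
      <target dev="vda"/>
    </disk>
    <disk type="file" device="cdrom">
      <target dev="hda"/>
    </disk>
  </devices>
</domain>
"""


def disks_with_source(tree):
    """Yield (target, source attributes) for disks that have a usable <source>."""
    for disk in tree.findall("devices/disk"):
        source = disk.find("source")
        # find() returns None when the element is absent (e.g. an empty CD drive)
        if source is None or not source.attrib:
            continue
        yield disk.find("target").get("dev"), dict(source.attrib)


disks = list(disks_with_source(ET.fromstring(DOMAIN_XML)))
print(disks)
```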

backup fails with internal error from libvirt: block name doesn't match

I have been using "backup-vm" for some qemu/libvirt VMs for a couple of days, so far mostly successfully.
Today, a backup failed with the following error:

starting backup
libvirt: error code 1: internal error: qemu block name '/dev/vg_data01/mail2-system' doesn't match expected '/var/lib/libvirt/images/mail2-sda-tempsnap.qcow2'
Traceback (most recent call last):
  File "/usr/local/bin/backup-vm", line 11, in <module>
    load_entry_point('backup-vm==0.1.dev28+g442ce38', 'console_scripts', 'backup-vm')()
  File "/usr/local/lib/python3.6/site-packages/backup_vm-0.1.dev28+g442ce38-py3.6.egg/backup_vm/backup.py", line 54, in main
    borg_failed = multi.assimilate(args.archives)
  File "/usr/local/lib/python3.6/site-packages/backup_vm-0.1.dev28+g442ce38-py3.6.egg/backup_vm/snapshot.py", line 175, in __exit__
    self.blockcommit(disks_to_backup)
  File "/usr/local/lib/python3.6/site-packages/backup_vm-0.1.dev28+g442ce38-py3.6.egg/backup_vm/snapshot.py", line 81, in blockcommit
    | libvirt.VIR_DOMAIN_BLOCK_COMMIT_SHALLOW) < 0:
  File "/usr/local/lib64/python3.6/site-packages/libvirt.py", line 701, in blockCommit
    if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirt.libvirtError: internal error: qemu block name '/dev/vg_data01/mail2-system' doesn't match expected '/var/lib/libvirt/images/mail2-sda-tempsnap.qcow2'

The first backup of this VM, a day earlier, completed without errors, so either the first backup left the VM in some state which caused problems during the next run, or there was some non-deterministic (e.g. timing-dependent) issue in the second run.

The VM (called mail2) has three disks (LVM logical volumes):

sda  /dev/vg_data01/mail2_system
sdb  /dev/vg_data01/mail2_swap
sdc  /dev/vg_data01/mail2_data

After the failed backup, the VM's disks were in the following state:

[root@sabavm1 ~]# virsh domblklist mail2
Target     Source
------------------------------------------------
sda        /var/lib/libvirt/images/mail2-sda-tempsnap.qcow2
sdb        /var/lib/libvirt/images/mail2-sdb-tempsnap.qcow2
sdc        /dev/vg_data01/mail2-data

I then tried to remove the snapshots manually, but only sdb was successful:

[root@sabavm1 ~]# virsh blockcommit mail2 sda --verbose --pivot
error: internal error: unable to find backing name for device drive-scsi0-0-0-0

[root@sabavm1 ~]# virsh blockcommit mail2 sdb --verbose --pivot
Block commit: [100 %]
Successfully pivoted

Next, I've shut the VM down and restarted it again. After I did that I was able to remove the snapshot and the status of the disks was back to normal:

[root@sabavm1 ~]# virsh blockcommit mail2 sda --verbose --pivot
Block commit: [100 %]
Successfully pivoted
[root@sabavm1 ~]# virsh domblklist mail2
Target     Source
------------------------------------------------
sda        /dev/vg_data01/mail2-system
sdb        /dev/vg_data01/mail2-swap
sdc        /dev/vg_data01/mail2-data

I wonder if this is an issue with libvirt and/or qemu (I have libvirt version 4.0.0 and qemu 2.9.0) or with "backup-vm".
What could I do to debug things further?

Automatic restore script

  • Refactor single file into module (e1b1519)
  • Basic restore functionality (aae2c0e)
  • Create non-dummy implementation of DiskLock (see libvirt source, it doesn't look like this stuff is in the Python wrapper)
    • Submit patch to that file to mailing list: s/A lock driver which locks nothing/A lock driver for virtlockd/
    • Add capability to "hotplug" disks out while being restored and back in afterwards (ask user first) (perhaps make a new issue, this isn't necessarily an 0.2 blocker)
  • Handle differences between backed up domain & current domain
    • Warn if disk sizes change
      • Logical size (what the VM sees)
      • Physical size (space on disk)?
    • Attach new disks if explicitly requested (disk removed from domain but later selected for restore)
  • Add backup-vm usage to README (6f6f54c)

Retry block commit in case of failure

Retry the commit part up to 3x, waiting 5 seconds between each try. The commit sometimes fails on my machine.
Perhaps add backup-vm --retry-merge domain option or something to run domBlockJobAbort on all existing disks.
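The retry behaviour described above could be sketched roughly as follows. The retry helper is a generic illustration; flaky_commit is a stand-in for the real commit step (an assumption, not the project's code) that fails twice and then succeeds.

```python
import time


def retry(operation, attempts=3, delay=5.0, exceptions=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions:
            if attempt == attempts:
                raise  # out of tries: let the caller see the last error
            sleep(delay)


# Stand-in for the commit step: fails on the first two calls, then succeeds.
calls = []


def flaky_commit():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("block copy still active")
    return "pivoted"


# sleep is replaced with a no-op here so the demonstration does not wait 10 seconds.
result = retry(flaky_commit, attempts=3, delay=5.0, sleep=lambda seconds: None)
print(result, len(calls))
```

In real use the exceptions argument would presumably be narrowed to libvirt.libvirtError, so that unrelated bugs are not retried.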

Python 3.5 required?

The README says that Python >=3.4 is required, however, with 3.4 I am getting this error:
  File "setup.py", line 22
    lines = [*self.format_readme(f)]
             ^
SyntaxError: can use starred expression only as assignment target

I am by no means an expert with Python, quite the opposite, and may therefore be wrong. But my Google skills say that this syntax requires Python 3.5.
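For reference, unpacking inside a list display was added by PEP 448 in Python 3.5. A small sketch of an equivalent spelling that also parses on 3.4 follows; format_readme here is a hypothetical stand-in for the generator in setup.py.

```python
def format_readme(lines):
    """Hypothetical stand-in for the generator used in setup.py."""
    for line in lines:
        yield line.rstrip()


source = ["backup-vm  \n", "=========\n"]

# Python >= 3.5 only (PEP 448):  lines = [*format_readme(source)]
# Equivalent spelling that also parses on Python 3.4:
lines = list(format_readme(source))
print(lines)
```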

Dump disk images using qemu

If a VM has complex chains of disks (e.g. you want to back up snapshots that were already created, or a disk is simply a qcow2 with another backing image before backup-vm runs), not all the content in the VM would truly be backed up, just the last overlay image.

It should be fine to recursively run qemu-img on the disks in the domain, to be sure images aren't depending on other images that should also be backed up. This option should have a CLI flag: in the case of e.g. a common fresh Debian install as the base and overlay images with different software, it would be annoying to have many copies of the base image.

A better solution might be to read the disks the same way as qemu itself, which could probably be accomplished with a simple qemu-img convert -O raw <image> -. (The image should always be exported as raw regardless of input format because borg will do its own deduplication & compression.)
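As a rough sketch of how the chain of images could be discovered, the following uses qemu-img info with its --backing-chain and --output=json options. The helper names are illustrative, and SAMPLE is a hand-written example shaped like that output, not captured from a real run.

```python
import json
import subprocess


def parse_chain(info_json):
    """Extract filenames from qemu-img's JSON list (one entry per image in the chain)."""
    return [entry["filename"] for entry in json.loads(info_json)]


def backing_chain(image_path):
    """Return image paths from the top overlay down to the base image."""
    out = subprocess.check_output(
        ["qemu-img", "info", "--backing-chain", "--output=json", image_path]
    )
    return parse_chain(out.decode())


# Illustrative sample only: an overlay on top of a base image.
SAMPLE = """[
  {"filename": "overlay.qcow2", "format": "qcow2", "backing-filename": "base.qcow2"},
  {"filename": "base.qcow2", "format": "qcow2"}
]"""

chain = parse_chain(SAMPLE)
print(chain)
```

With such a list, a CLI flag could decide whether only the first entry or the whole chain is handed to borg.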

Auto-update usage in README.rst

When restore-vm (#1) and borg-multi (#6) are both in master, remember to update README.rst with the new content.
Perhaps implement a setup.py build_usage or something, similar to borg? I don't think github includes documents from rst's include directive, so it would have to edit the document in-place. One approach would be to put comments on the lines in the README before and after the usage snippets should be auto-inserted, and generate them by iterating through the entry_points.
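The marker-comment approach could look something like the sketch below. The marker text and the replace_usage helper are assumptions for illustration. Collecting the usage text from the entry_points is left out, so the usage string passed in is a placeholder.

```python
import re

# Hypothetical markers. In reStructuredText a ".. " line that is not a
# directive is a comment, so these would not show up in the rendered README.
BEGIN = ".. BEGIN AUTO-GENERATED USAGE"
END = ".. END AUTO-GENERATED USAGE"


def replace_usage(readme_text, usage_text):
    """Replace everything between the two markers with an rst literal block."""
    indented = "\n".join(
        "    " + line if line else "" for line in usage_text.splitlines()
    )
    block = "{}\n\n::\n\n{}\n\n{}".format(BEGIN, indented, END)
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    # A function replacement keeps backslashes in the usage text literal.
    return pattern.sub(lambda _match: block, readme_text, count=1)


readme = "Title\n\n" + BEGIN + "\nold usage\n" + END + "\n\nMore\n"
updated = replace_usage(readme, "usage: backup-vm [-h]")  # placeholder usage text
print(updated)
```

Because the markers themselves are preserved, running the step again on its own output leaves the file unchanged.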

Alternative backup engines

I'm currently evaluating a couple of other backup engines which implement public key crypto e.g. https://github.com/dpc/rdedup and I'd like to adapt backup-vm to work with this.

This could either be a fork which shares some of the same code (but no longer supports borg), or a version of backup-vm which supports both backup systems (probably more work in the short term, but better in the long term). It's not really clear to me which would be preferable.

Any thoughts?

Add unit tests

Tests should probably mock libvirt instead of relying on the actual library. I'm leaning towards using the builtin unittest module instead of nose or py.test since it's supported on all python versions backup-vm is targeting (and because of my irrational bias towards included libs).
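A minimal sketch of what such a test might look like with unittest.mock, so no hypervisor is needed. The helper under test, disks_with_active_job, is hypothetical, and the dict returned by the mocked blockJobInfo is an assumed shape chosen for illustration.

```python
import unittest
from unittest import mock


def disks_with_active_job(dom, targets):
    """Hypothetical helper: return the targets that still report a block job."""
    return [target for target in targets if dom.blockJobInfo(target, 0)]


class BlockJobTest(unittest.TestCase):
    def test_only_busy_disks_are_returned(self):
        # The mock stands in for a libvirt domain object.
        dom = mock.Mock()
        dom.blockJobInfo.side_effect = lambda target, flags: (
            {"cur": 1, "end": 2} if target == "vda" else {}
        )
        self.assertEqual(disks_with_active_job(dom, ["vda", "vdb"]), ["vda"])
        dom.blockJobInfo.assert_any_call("vdb", 0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(BlockJobTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```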

As of now anything that makes it into master passes the "rigorous" "test" of my daily backup script; master will be unreliable until v1.0.

Move multi-archive handling to separate script

The multiple-archive support is probably going to be useful to more people than just the VM aspect anyway, so separating the part of the script that automatically launches multiple borg instances, calculates total progress percentage, deduplicates prompts, etc. might be good. backup-vm could still call it via subprocess. It could be called borg-multi or something.

Backup failed with permission denied; now snapshot fails

First: great project, thanks for publishing it! Now my issue:

I tried to back up a live VM and it failed with this error:

libvirt.libvirtError: internal error: unable to execute QEMU command 'block-commit': Could not reopen file: Permission denied

Then I manually deleted the snapshot image. Now it fails with this:

libvirt: error code 1: internal error: unable to execute QEMU command 'transaction': Error: Trying to create an image with the same filename as the backing file 
Failed to create domain snapshot

I was root the entire time. Not sure what to do now.
