fatcopy

Copy files that have a lot in common. A typical use case is moving a VM's disk back and forth between two hosts.

It works over SSH, together with a control master, by spawning an SSH process.
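As a rough illustration of "spawning an SSH process", the sketch below builds an `ssh` command line and starts it with pipes attached, so that a protocol can be spoken over its stdin and stdout. This is not fatcopy's actual code: the helper names `build_ssh_argv` and `spawn_remote`, the `control_path` parameter, and the contents of `remote_command` are all assumptions made for the example. `ControlMaster` and `ControlPath` are standard OpenSSH options; whether fatcopy passes them itself or relies on your `~/.ssh/config` is not stated here.

```python
import shlex
import subprocess


def build_ssh_argv(host, remote_command, control_path=None):
    """Build the argv for an ssh child that runs remote_command on host.

    Hypothetical helper: the real remote command line is not documented here.
    """
    argv = ["ssh"]
    if control_path is not None:
        # Reuse (or create) a multiplexed connection through a control socket.
        argv += ["-o", "ControlMaster=auto", "-o", f"ControlPath={control_path}"]
    # ssh hands the command to the remote shell as one string, so quote each
    # argument to keep paths containing spaces intact.
    argv += [host, shlex.join(remote_command)]
    return argv


def spawn_remote(host, remote_command, control_path=None):
    """Start ssh with stdin/stdout pipes so the caller can exchange messages."""
    return subprocess.Popen(
        build_ssh_argv(host, remote_command, control_path),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```

With a control master already running, the spawned `ssh` reuses the existing connection instead of authenticating again, which keeps repeated transfers cheap.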

Usage

⚠️ Warning: the fatcopy binary must be in the PATH on the remote host.

fatcopy /var/lib/libvirt/images/vm.qcow2 other-host:/var/lib/libvirt/images/vm.qcow2

Why can't I just use rsync?

You could use rsync with the following command:

rsync -aP --checksum --inplace -e ssh /var/lib/libvirt/images/vm.qcow2 other-host:/var/lib/libvirt/images/vm.qcow2

But as rsync's documentation states:

--inplace [...] This option is useful for transferring large files with block-based changes or appended data, and also on systems that are disk bound, not network bound. It can also help keep a copy-on-write filesystem snapshot from diverging the entire contents of a file that only has minor changes.

This utility saves bandwidth.

How does it work?

The protocol is pretty simple.

  1. Both server and client send their file size.

  2. Until the server reaches the minimum of its own size and the client's:

     • The server reads bulk_size * block_size bytes of data and sends the SHA256 hash of each block.
     • The client also reads bulk_size * block_size bytes and compares its own hashes with the received ones.
     • For each block, if the hash matches, Ok is returned; otherwise DataNeeded is sent (replies are sent in bulk).
     • If the server received any DataNeeded, it sends the corresponding slices of data.

  3. If the server has the smaller size, we are done. Otherwise the server only sends data blocks from that point on: the client has no data there, so the hashes could not match anyway.
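The steps above can be sketched as an in-memory simulation, with the "server" holding the source bytes and the "client" holding the destination bytes. This is a minimal sketch, not fatcopy's implementation: there is no network or framing, the function names and the `Reply` enum are made up for the example (only the `Ok` and `DataNeeded` names come from the description), the default `block_size` and `bulk_size` values are arbitrary, and truncating a destination that is larger than the source is an assumption.

```python
import hashlib
from enum import Enum


class Reply(Enum):
    OK = "Ok"                   # the client already has this block
    DATA_NEEDED = "DataNeeded"  # the client wants the block's bytes


def block_hashes(data, start, end, block_size):
    """SHA256 digest of each block_size slice of data[start:end]."""
    return [
        hashlib.sha256(data[off:min(off + block_size, end)]).digest()
        for off in range(start, end, block_size)
    ]


def sync(source, dest, block_size=4096, bulk_size=64):
    """Make dest equal to source; return (new_dest, payload_bytes_sent)."""
    out = bytearray(dest)
    sent = 0
    # Step 1: both sides exchange sizes and agree on the common prefix length.
    common = min(len(source), len(dest))
    pos = 0
    # Step 2: compare the common prefix one bulk at a time.
    while pos < common:
        end = min(pos + bulk_size * block_size, common)
        offsets = range(pos, end, block_size)
        server_hashes = block_hashes(source, pos, end, block_size)
        client_hashes = block_hashes(out, pos, end, block_size)
        replies = [
            Reply.OK if s == c else Reply.DATA_NEEDED
            for s, c in zip(server_hashes, client_hashes)
        ]
        # The server only transmits the blocks the client asked for.
        for off, reply in zip(offsets, replies):
            if reply is Reply.DATA_NEEDED:
                block = source[off:min(off + block_size, end)]
                out[off:off + len(block)] = block
                sent += len(block)
        pos = end
    # Step 3: past the common prefix there is nothing left to compare.
    if len(source) > common:
        tail = source[common:]
        out[common:] = tail  # raw data, no hashes
        sent += len(tail)
    else:
        del out[len(source):]  # assumption: drop any extra bytes on the client
    return bytes(out), sent
```

For example, calling `sync` on two 16 KiB buffers that differ in a single byte transfers one block of payload rather than the whole buffer; the hashes themselves (32 bytes per block) are the fixed overhead of every run.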

Some numbers

The numbers below are for sending a 1 GB file between two PCs over an SSH connection, with a 1 Gbit/s link and the default settings. Disk speed does not matter here, as the files were stored in tmpfs.

For reference, here are some baseline speeds:

Command run                                    Time (in seconds)   Speed (Mbit/s)
---------------------------------------------  ------------------  --------------
cat file | ssh host 'cat > file'               9.244               886.20
cat file > /dev/tcp/1.2.3.4/1337               9.110               899.24
rsync --inplace file host:same-file-at-99.99   10.872              753.50
rsync --inplace file host:same-file            0.054               A lot

Here are the results:

Description of remote file     Time (in seconds)   Speed (Mbit/s)
-----------------------------  ------------------  --------------
Completely random, same size   10.438              784.83
Completely random, half size   9.848               831.85
Empty                          9.218               888.70
Same at 99%                    10.471              782.36
Same at 99.9%                  9.899               827.56
Same at 99.99%                 5.542               1478.18
Same at 100%                   5.467               1498.46
Same at 99.9%, 85% size        6.026               1359.45
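The speed column appears to be an effective throughput: the full file size divided by the wall-clock time, with 1 GB counted as 1024 × 8 = 8192 Mbit. That reading is inferred from the figures themselves, not stated by the author, and the constant and function names below are made up for the example. It would also explain how some rows exceed the 1 Gbit/s link speed: blocks that already match are never sent.

```python
# Assumption inferred from the tables: 1 GB is counted as 1024 * 8 Mbit.
FILE_SIZE_MBIT = 1024 * 8


def effective_speed(seconds):
    """Effective throughput in Mbit/s for transferring the whole file."""
    return FILE_SIZE_MBIT / seconds


# The first baseline row (9.244 s) comes out at roughly 886.2 Mbit/s.
print(f"{effective_speed(9.244):.2f}")
```

Recomputed values agree with the tables to within about 0.01 Mbit/s, which is consistent with the listed times having been rounded to milliseconds.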
