Git Product home page Git Product logo

go-acp's Introduction

Advanced Copy (ACP)

Multi-buffer copy to minimise fragmentation and maximize throughput

The acp tool runs multithreaded in order to read and write at the same time. It will not read multiple files at once, but will ensure that a read is happening while a write is happening. This ensures that multi-location copies are handled faster - as there is no need to wait for write to flush before the next buffered read, and no need to wait for the read to complete before writing the next buffer.

Two job threads exist, one for reading and one for writing. When thread one is done with a file, it will read the next file already as well, while thread 2 is finishing it's write to the first file, and so forth.

The tools also allows to specify non-standard buffer size. With larger buffers, there will be much less fragmentation.

Copying dirs and files, link preservation(and not), concat multiple files into one file and other features. See help below.

Note that you need buffer-size X 2 (times two) of RAM, as for threaded copy we use 2 buffers. So if buffer-size is 1GB to minimize a very large file fragmentation, you need 2GB of RAM to run the acp

Where will multithreading help

Obviously, it won't help if you are copying a file or files on a laptop on a local disk from one directory to another.

It will help if you have 2 disks and are copying between them. It will also help if you are dealing with large disk arrays that can take multiple r/w simultanously.

It will help if you are copying between 2 network locations using your machine as a proxy.

It will help if you are copying from network or local to a network drive.

It will somewhat help if you are copying between local and network.

Where will crazy-large buffers help

This was actually done to help create large files for virtual machines. Setting buffer size to 1GB during copy to a netapp filer where these were hosted, made the virtual machines run much faster as the files were not as fragmented (single 1GB flush will attempt on most filesystems to find a continuous 1GB space if possible to minimise fragmentation).

So, cases like this one. To fight off fragmentation. Most filesystems play nice and properly handle large buffer write calls by ensuring fragmentation is minimal.

Usage

Get the binary for your OS from the build job artifacts. See https://gitlab.com/bestmethod/go-acp/-/jobs - with a download button on the right-hand side. Or enter the job and "Browse" artifacts.

$ ./acp --help
2018/06/12 16:30:29 Usage:
  acp [OPTIONS] source_file [source_file [...]] destination_file

Application Options:
  -e, --report-each       Report as each file has completed being copied (for multi-file copy using recursive or config file)
  -s, --buffer-size=      Select buffer size to use. For SSDs optimal speed is 131072 (128KB). Use multiples of (e.g. 256KB, 1MB). Bigger buffer = less fragmentation. RAM required = buffer*2. (default: 131072)
  -p, --progress=         Print progress of copy operation ever X seconds. Disabled=0. (default: 1)
  -w, --print-raw         If set, will not print progres in human-readable format (e.g. 1MB) but always in bytes.
  -m, --override-mode     If set, will override permissions of destination if it already exists. Otherwise, will preserve existing destination permissions.
  -d, --delete-first      If set, this will delete the destination file before writing. Useful for changing inode number and resetting ownership.
  -l, --preserve-symlink  If set, will preserve symlinks instead of resolving contents of files they point to. This will only work for 1:1 mapping of source to destination file. If copying multiple sources to one destination file, this will be silently ignored for that
                          copy. Existing and files at destination will not be overwritten by a new symlink, unless DeleteFirst is specified.

Help Options:
  -h, --help              Show this help message

Copy behaviour:
  * file(s) -> dir	copy files to directory
  * file(s) -> file	copy files to file, concat mode
  * file(s) -> new-name	copy files to file, concat mode
  * dir(s)  -> file	error, no really
  * dir(s)  -> dir	copy directories' contents to dir (so acp a b, will result in a/* being in b, not b/a/*)
  * dir(s)  -> new-name	copy directories' contents to dir (so acp a b, will result in a/* being in b, not b/a/*)`) 

go-acp's People

Contributors

bestmethod avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.