
Comments (7)

mmstick commented on July 17, 2024

You won't need to check for differences if just using the modified time. Any differences in the mtime of a file will instantly show that a file has changed.

from hoard.

Shadow53 commented on July 17, 2024

The thing is, I don't want something like touch myfile to count as a "difference." Whether two files differ should be based on existence and content, IMO.

So if a file exists in the hoard, does not exist on the filesystem, and would be copied to the filesystem with hoard restore, that's a "diff." Same for the opposite case, and for cases where the backed up version has different content from the one on the filesystem.
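The existence-and-content definition above can be sketched roughly as follows. This is only an illustration, not hoard's actual code: the function name `classify` and the returned labels are hypothetical, and the content check leans on Python's standard `filecmp.cmp` with `shallow=False` so that a changed mtime alone never counts as a difference.

```python
import filecmp
import os


def classify(hoard_path, system_path):
    """Label one hoard/filesystem pair (hypothetical helper, labels are made up)."""
    in_hoard = os.path.isfile(hoard_path)
    on_system = os.path.isfile(system_path)
    if not in_hoard and not on_system:
        return "missing_everywhere"
    if in_hoard and not on_system:
        # A restore would create this file on the filesystem.
        return "only_in_hoard"
    if on_system and not in_hoard:
        # A backup would create this file in the hoard.
        return "only_on_system"
    # shallow=False compares contents (after a size check) instead of
    # trusting matching stat signatures, so a bare touch is not a diff.
    if filecmp.cmp(hoard_path, system_path, shallow=False):
        return "same"
    return "content_differs"
```

Under this sketch, touching a file changes nothing: only a missing file or differing bytes produce a result other than "same".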


mmstick commented on July 17, 2024

That'd be rather intense on I/O and CPU with a lot of files.


mmstick commented on July 17, 2024

It may seem silly, but simply syncing a file that was touched sounds better than spending a lot of resources diffing files.


Shadow53 commented on July 17, 2024

You have a point. There's also some merit to the thought of at least using mtime to short-circuit a diffing process. I am not going to make a decision on this quite yet, as there are at least a couple other features that would also be affected.
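One way the mtime short-circuit could look, as a minimal sketch rather than a decided design: matching size and mtime are trusted as "unchanged", and contents are read only when the mtimes disagree. The function name `files_differ` is hypothetical, and the sketch assumes the backup copy preserves the original mtime, which this thread does not confirm for hoard.

```python
import filecmp
import os


def files_differ(a, b):
    """Return True if a and b differ, using metadata to skip reads where possible."""
    stat_a, stat_b = os.stat(a), os.stat(b)
    if stat_a.st_size != stat_b.st_size:
        # Different lengths can never have identical contents.
        return True
    if stat_a.st_mtime_ns == stat_b.st_mtime_ns:
        # Assumption: same size and same mtime means unchanged.
        # This is the short-circuit, and it can miss a same-size edit
        # whose mtime was reset to the old value.
        return False
    # The mtimes disagree (for example after a touch), so fall back to
    # comparing the actual bytes before reporting a difference.
    return not filecmp.cmp(a, b, shallow=False)
```

With this shape, a touched but otherwise identical file costs one content comparison and is still reported as unchanged, while untouched files are never read at all.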


Shadow53 commented on July 17, 2024

Some notes on comparing things in case I go down that route.

  • If I am checking whether a hoard has any differences, I can stop checking after the first difference found.
    • Search first for files existing in one location and not the other.
    • Then check file contents.
  • When comparing two files directly, byte-by-byte comparison is more efficient than computing checksums (ignoring caching), since it can stop at the first mismatch and needs no hashing.
  • It is generally faster to read a large chunk of one file at a time than to switch back and forth between files, so use a buffered reader.
    • The same goes for using multiple threads, since all of the reads still go through a single disk controller.
    • I expect hoard to more likely be used with a bunch of small files rather than a bunch of large ones, so reading entire files into memory before comparing is also an option, for some maximum size.
  • File contents only need to be checked if file lengths are the same.
  • For the purposes of determining a clean backup/restore, hoard could keep a file that caches metadata on each file -- length, checksum, etc. -- and use that for comparison, which may be faster than comparing the two files directly. That would not work in all cases, though.
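The notes above can be sketched as follows, purely as an illustration and not as hoard's implementation. The names `same_contents`, `relative_files`, `has_differences`, and the 64 KiB chunk size are all assumptions made for the example. The sketch checks existence first, then length, then compares buffered chunks, and stops at the first difference found.

```python
import os

CHUNK_SIZE = 64 * 1024  # assumed chunk size, not a tuned value


def same_contents(a, b, chunk_size=CHUNK_SIZE):
    """Compare two files chunk by chunk, stopping at the first mismatch."""
    if os.path.getsize(a) != os.path.getsize(b):
        # Contents only need checking when the lengths match.
        return False
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                # Both files reached end-of-file without a mismatch.
                return True


def relative_files(root):
    """Collect the paths of all files under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def has_differences(hoard_root, system_root):
    """Return True as soon as any difference between the two trees is found."""
    hoard_files = relative_files(hoard_root)
    system_files = relative_files(system_root)
    # Cheapest check first: a file that exists on only one side.
    if hoard_files != system_files:
        return True
    # any() stops at the first pair whose contents differ.
    return any(
        not same_contents(os.path.join(hoard_root, rel),
                          os.path.join(system_root, rel))
        for rel in sorted(hoard_files)
    )
```

Everything runs on one thread and reads one pair of files at a time, in line with the single-disk-controller note. The metadata cache idea is left out because its behavior is not pinned down here.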


Shadow53 commented on July 17, 2024

Closing in favor of #26
