Comments (7)
You won't need to check for differences if just using the modified time. Any differences in the mtime of a file will instantly show that a file has changed.
from hoard.
The thing is, I don't want something like `touch myfile` to count as a "difference." Differing files should be based on existence and content, IMO.
So if a file exists in the hoard, does not exist on the filesystem, and would be copied to the filesystem with `hoard restore`, that's a "diff." Same for the opposite case, and for cases where the backed up version has different content from the one on the filesystem.
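
A minimal sketch of that definition, assuming one backed-up path is paired with one filesystem path. The `Diff` enum and `diff_pair` function are hypothetical names for illustration, not part of hoard's actual code:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical classification of how a backed-up file differs from
/// the file on the filesystem. Not hoard's actual type.
#[derive(Debug, PartialEq, Eq)]
enum Diff {
    /// Exists in the hoard only; a restore would create it.
    OnlyInHoard,
    /// Exists on the filesystem only; a backup would create it.
    OnlyOnSystem,
    /// Exists in both places, but the bytes differ.
    ContentDiffers,
}

/// Returns `None` when the pair counts as identical under the
/// "existence and content" rule. Modification times are never consulted.
fn diff_pair(hoard_path: &Path, system_path: &Path) -> io::Result<Option<Diff>> {
    match (hoard_path.exists(), system_path.exists()) {
        (false, false) => Ok(None),
        (true, false) => Ok(Some(Diff::OnlyInHoard)),
        (false, true) => Ok(Some(Diff::OnlyOnSystem)),
        (true, true) => {
            // Naive content check: reads both files fully into memory.
            if fs::read(hoard_path)? == fs::read(system_path)? {
                Ok(None)
            } else {
                Ok(Some(Diff::ContentDiffers))
            }
        }
    }
}
```

Because only existence and bytes are compared, a file that was merely touched yields `None` here.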
from hoard.
That'd be rather intense on I/O and CPU with a lot of files.
from hoard.
It may seem silly, but simply syncing a file that was touched sounds better than spending a lot of resources diffing files.
from hoard.
You have a point. There's also some merit to the thought of at least using mtime to short-circuit a diffing process. I am not going to make a decision on this quite yet, as there are at least a couple other features that would also be affected.
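
One way the mtime short-circuit could look, as a rough sketch. The function name `content_differs` is hypothetical, and the shortcut rests on an assumption: that the backup preserves modification times, so that equal length plus equal mtime can be treated as "unchanged." If mtimes are not preserved, the check simply falls through to the full comparison:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns `true` if the two existing files should be treated as different.
/// Hypothetical helper, not hoard's actual API.
fn content_differs(a: &Path, b: &Path) -> io::Result<bool> {
    let meta_a = fs::metadata(a)?;
    let meta_b = fs::metadata(b)?;

    // Different lengths always mean different content.
    if meta_a.len() != meta_b.len() {
        return Ok(true);
    }

    // Assumption: same length and same mtime is taken to mean "unchanged",
    // which skips reading the files at all.
    if meta_a.modified()? == meta_b.modified()? {
        return Ok(false);
    }

    // The mtime differs (for example after `touch`), so only the bytes
    // can settle whether anything really changed.
    Ok(fs::read(a)? != fs::read(b)?)
}
```

The tradeoff is that a file rewritten with the same length and the same mtime would be missed, which is the price of not reading every file.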
from hoard.
Some notes on comparing things in case I go down that route.
- If I am checking whether a hoard has any differences, I can stop checking after the first difference found.
- Search first for files existing in one location and not the other.
- Then check file contents.
- When comparing files, byte-by-byte is more efficient than checksums (ignoring caching).
- It is generally faster to read a bunch of one file than it is to switch back and forth, so use a buffered reader.
- This also applies to using multiple threads, since all of them would still go through just one disk controller.
- I expect `hoard` to more likely be used with a bunch of small files rather than a bunch of large ones, so reading entire files into memory before comparing is also an option, for some maximum size.
- File contents only need to be checked if file lengths are the same.
- For the purposes of determining a clean backup/restore, could have a file that caches metadata on each file -- length, checksum, etc. -- and use that for comparison, which may be faster than comparing two files. That would not work in all cases, though.
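
Several of these notes can be sketched together: check lengths first, compare contents in large chunks, and stop at the first difference. This is an illustration under stated assumptions, not hoard's implementation. The names `files_equal`, `any_differences`, `read_full`, and the 64 KiB `CHUNK` size are all hypothetical, and the sketch reads large chunks directly instead of wrapping the files in `BufReader`, which serves the same goal of reading a lot from one file before switching to the other:

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical chunk size; large enough to avoid frequent switching
/// between the two files.
const CHUNK: usize = 64 * 1024;

/// Fills `buf` as far as possible. Returns fewer bytes than `buf.len()`
/// only when the end of the file is reached.
fn read_full(reader: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Compares two existing files on a single thread.
fn files_equal(a: &Path, b: &Path) -> io::Result<bool> {
    let mut file_a = File::open(a)?;
    let mut file_b = File::open(b)?;

    // Contents only need to be read if the lengths match.
    if file_a.metadata()?.len() != file_b.metadata()?.len() {
        return Ok(false);
    }

    let mut buf_a = vec![0u8; CHUNK];
    let mut buf_b = vec![0u8; CHUNK];
    loop {
        let n_a = read_full(&mut file_a, &mut buf_a)?;
        let n_b = read_full(&mut file_b, &mut buf_b)?;
        if n_a != n_b || buf_a[..n_a] != buf_b[..n_b] {
            return Ok(false); // first differing chunk ends the comparison
        }
        if n_a == 0 {
            return Ok(true); // both files hit EOF with no difference
        }
    }
}

/// Stops at the first pair that differs instead of checking every pair.
fn any_differences<'a>(
    pairs: impl IntoIterator<Item = (&'a Path, &'a Path)>,
) -> io::Result<bool> {
    for (a, b) in pairs {
        if !files_equal(a, b)? {
            return Ok(true);
        }
    }
    Ok(false)
}
```

This assumes the existence checks have already run, so a missing file surfaces as an I/O error rather than as a difference.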
from hoard.
Closing in favor of #26
from hoard.