jcburley / duff
Duplicate File Finder
Home Page: http://duff.sourceforge.net/
License: Other
See duffdriver.c:344 for more (report_clusters).
Level of interest? Licensing?
Possibly tolerate out-of-memory conditions by backing off some aggressive optimizations? But at least don't blindly proceed assuming it always returns a non-NULL value.
See TODO in duffdriver.c:263 for details (process_path).
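One way to read the "backing off" idea above is sketched below. It assumes the call in question is a `malloc`-style allocator, which the text does not confirm. The helper `alloc_with_backoff` and its parameters are hypothetical and not part of duff. It checks every result for NULL and retries with a smaller request before giving up.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of duff: try to allocate `want` bytes,
 * halving the request after each failure until it would drop below `min`.
 * On success, returns the buffer and stores its actual size in *got.
 * On failure, returns NULL and stores 0 in *got. */
void *alloc_with_backoff(size_t want, size_t min, size_t *got)
{
    if (min == 0)
        min = 1;                      /* guarantees the loop terminates */

    for (size_t size = want; size >= min; size /= 2) {
        void *buf = malloc(size);
        if (buf != NULL) {            /* never assume success */
            *got = size;
            return buf;
        }
    }

    *got = 0;
    return NULL;                      /* caller must handle this case */
}
```

A caller would use the size reported in `*got` to decide how much work to batch, and would treat a NULL return as a reportable error rather than proceeding.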
I rely on duff a lot and often find that the file that appears in the excess mode is not the copy of the file I wish to delete (one file has been sorted into the right subfolder, the other has not).
I try to manipulate the names of the subdirectories to force the order to be how I want, but I don't think it can be fully controlled.
How can I control which copy appears in excess mode? If there is no such control, how difficult would it be to introduce such a feature?
See duffentry.c:352 (compare_entry_contents).
Probably need to quote the results of the dirname ...
just its $file argument is now quoted; otherwise, spaces can lead to dirname thinking a template is being supplied, rather than the default.
Version 0.5 (RC1) does duplicate-entry detection via qsort for better performance; this should be the biggest single improvement reasonably possible over 0.4.
A "temporary" array of pointers to Entry structures is now used in 0.5 so qsort() can sort it for us. Not sure what else would be worth doing here...any ideas?
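The general technique described here, sorting a temporary array of pointers and then scanning neighbours, might look roughly like the sketch below. It is not duff's code: `KeyedEntry`, its `key` field, and `mark_duplicates` are hypothetical stand-ins, and the real Entry structure and comparison criteria are not shown in the text.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Entry; the real struct differs. */
typedef struct KeyedEntry {
    unsigned long key;   /* assumed identity of the entry */
    int duplicate;       /* set to 1 for all but one entry per run of equal keys */
} KeyedEntry;

/* qsort comparator: each element is a pointer to a KeyedEntry. */
int cmp_keyed_entry(const void *a, const void *b)
{
    const KeyedEntry *l = *(const KeyedEntry *const *)a;
    const KeyedEntry *r = *(const KeyedEntry *const *)b;
    return (l->key > r->key) - (l->key < r->key);
}

/* Flags duplicate entries in place. Returns the number flagged,
 * or -1 if the temporary pointer array cannot be allocated. */
long mark_duplicates(KeyedEntry *entries, size_t n)
{
    if (n == 0)
        return 0;

    KeyedEntry **tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
        tmp[i] = &entries[i];

    /* After sorting, entries with equal keys are adjacent. */
    qsort(tmp, n, sizeof *tmp, cmp_keyed_entry);

    long found = 0;
    for (size_t i = 1; i < n; i++) {
        if (tmp[i]->key == tmp[i - 1]->key) {
            tmp[i]->duplicate = 1;
            found++;
        }
    }

    free(tmp);
    return found;
}
```

Sorting pointers rather than the structures themselves leaves the original storage untouched. Because `qsort()` is not guaranteed to be stable, which entry in a run stays unflagged is unspecified in this sketch.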
duffdriver.c's cmpentryp() uses the relationship between the left and right pointers to determine an ordering if nothing else differs. The idea here is to provide a "stable" sort, so the order of output (reporting), within a set of duplicates, is the same as the input ordering.
Whether this is even needed, I don't know. In any case, I'm concerned that qsort() might be moving those pointers around. To be safe, something intrinsic to the relative Entry objects should be used -- perhaps something as simple as a monotonically increasing counter so each Entry has a unique ID.
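The counter-based tie-break suggested above could be sketched as follows. This is an illustration under assumptions, not the actual `cmpentryp()`: the `SeqEntry` type, its `size` key, the `seq` field, and `seq_entry_init` are all hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Entry; the real struct differs. */
typedef struct SeqEntry {
    long size;           /* assumed primary sort key */
    unsigned long seq;   /* unique ID, assigned in input order */
} SeqEntry;

/* Monotonically increasing counter; not thread-safe in this sketch. */
static unsigned long next_seq = 0;

void seq_entry_init(SeqEntry *e, long size)
{
    e->size = size;
    e->seq = next_seq++;
}

/* qsort comparator over an array of SeqEntry pointers.
 * Ties on the primary key are broken by seq, so the result does not
 * depend on where qsort happens to place the pointers while sorting. */
int cmp_seq_entry(const void *a, const void *b)
{
    const SeqEntry *l = *(const SeqEntry *const *)a;
    const SeqEntry *r = *(const SeqEntry *const *)b;

    if (l->size != r->size)
        return (l->size > r->size) - (l->size < r->size);

    return (l->seq > r->seq) - (l->seq < r->seq);
}
```

Since no two entries share a `seq` value, the comparator never returns 0 for distinct entries, which makes the output order fully determined and equal to input order within each group. Whether the existing pointer comparison is actually unsafe depends on which pointers it compares, which the text leaves open.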
Document that the new version passes the -t (thorough) option, since that makes sense for a script that automatically deletes "duplicates". Install it with the executable bit set (chmod +x). Document it on the duff man page.
Is this necessary?
This would mean allowing e.g. an email to be pulled apart and the individual attachments treated as (virtual) entries; ditto a tarball, a compressed file, a compressed tarball, etc.
See duffdriver.c:155 for this TODO (has_recurse_directory).
Not all changes have been documented.