
pdedup's People

Contributors

dhnunes

pdedup's Issues

Hash changes for the same file after a cp

If we take a simple file and copy it under a different name, the native Linux sha256sum reports that both files have the same content:

$ sha256sum test.log job.log
54486bc8a10b16ecbfeffcdb99c21a0af54a515495ce7cb2c6ee008664068f82 test.log
54486bc8a10b16ecbfeffcdb99c21a0af54a515495ce7cb2c6ee008664068f82 job.log
But when running pdedup, the two are reported as different files (since the hashes are different):

job-results/job-2018-06-11T14.35-4958f56/job.log fb595fa60766a6295929a97809ce830577227fe9c22835fc37ec89b9fb6dbb75
job-results/job-2018-06-11T14.35-4958f56/test.log 9b7f4cff871fd07e525da759e98601bc4dba52d8d14784a35463357a7c0f2353
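
One way to narrow this down is to compare against a digest computed from the file's bytes alone. The sketch below is not pdedup's actual code: `sha256_of_file` is a hypothetical helper and the chunk size is arbitrary. If anything other than the content (the path or file name, for example) is fed into the hash, two copies will get different digests, which would be consistent with the output above.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's content only.

    Hypothetical helper: the path is used to open the file and is
    never mixed into the digest, so copies hash identically.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For the two files above, `sha256_of_file("test.log")` and `sha256_of_file("job.log")` should both match what sha256sum prints.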

[BUG] Broken symlinks break the app with a stack trace

Hey,

When running against a directory that contains broken symbolic links, the app crashes:

$ python pdedup.py -d /home/dnunes
Walking into directory tree...
Traceback (most recent call last):
  File "pdedup.py", line 14, in <module>
    result = file.file_walker()
  File "/home/dnunes/Code/Python/pdedup/Library/filehasher.py", line 64, in file_walker
    if os.path.getsize(os.path.join(root, file)):
  File "/home/dnunes/.virtualenvs/pdedup/lib64/python3.7/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/dnunes/.steampath'
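
A possible fix, sketched under assumptions: `os.path.getsize` follows symlinks, so a link whose target is missing raises `FileNotFoundError`. Catching `OSError` around the size check lets the walk continue. The function name `iter_hashable_files` is hypothetical and only mirrors the size check shown in the traceback; it is not the real `file_walker`.

```python
import os


def iter_hashable_files(top):
    """Yield non-empty files under `top`, skipping unreadable entries.

    Hypothetical stand-in for the walker in the traceback above.
    """
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Broken symlink, or a file removed during the walk.
                continue
            if size:
                yield path
```

Skipping silently is one design choice; printing a warning for each skipped path would be another.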

[RFE] Progress indicator

When checking for duplicated files, it is not clear whether the app is still responding.
Adding a progress bar, or some other indication that the app is working, would be a good addition. Something like a quiet mode and a verbose mode.
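
A rough sketch of what this could look like, using only the standard library. The `-q`/`-v` flags and the `report_progress` helper are assumptions made for illustration; pdedup does not currently define them.

```python
import argparse
import sys


def build_parser():
    """Parser with hypothetical, mutually exclusive quiet/verbose flags."""
    parser = argparse.ArgumentParser(description="progress flags sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-q", "--quiet", action="store_true",
                       help="print results only")
    group.add_argument("-v", "--verbose", action="store_true",
                       help="print each file as it is hashed")
    return parser


def report_progress(done, total, path, args, stream=sys.stderr):
    """Write a progress message unless quiet mode is on."""
    if args.quiet:
        return
    if args.verbose:
        # One line per file.
        stream.write(f"[{done}/{total}] {path}\n")
    else:
        # Default: a single counter line, redrawn in place.
        stream.write(f"\rHashed {done}/{total} files")
    stream.flush()
```

Writing progress to stderr keeps stdout clean, so the list of hashes can still be piped or redirected.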

Hashing the empty files

The SHA results include empty files. These files are not real duplicates, they are only empty, so they all share the same hash.
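
Every zero-byte file has the same SHA-256 digest, so they all land in one group. A minimal sketch of filtering them out before grouping, assuming a hypothetical `find_duplicates` helper that takes an iterable of paths (this is not pdedup's existing API):

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(paths):
    """Group paths by content hash, ignoring empty files.

    Hypothetical helper; reads each file fully, which is fine for a
    sketch but not for very large files.
    """
    groups = defaultdict(list)
    for path in paths:
        if os.path.getsize(path) == 0:
            # Empty files all hash alike; do not report them.
            continue
        with open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        groups[digest].append(path)
    # Keep only digests shared by more than one file.
    return {d: p for d, p in groups.items() if len(p) > 1}
```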
