
scripts's People

Contributors

taltman


scripts's Issues

find-dupes.awk: edits for Linux?

Hi,

Any chance you could help me adapt your find-dupes.awk script to work on a Linux system? Based on your notes, I was able to figure out the following changes:

  • Instead of ls -lTR, use ls -l --full-time -R | grep -v ^d
  • Use md5_exec = "md5sum"
  • Change $9 to $8: file = substr($0,match($0, $8)+length($8)+1,length($0))
  • Change $2 to $1 since we are using md5sum: hash = $1

I couldn't figure out the rest, starting with the line sizes[$5], as I don't know awk. I would appreciate any help: I'm currently trying to find dupes using the md5sum approach from the stackexchange thread that you referenced, and it's still running after 1 day on 1.3TB worth of data.

Thanks in advance.
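
For reference, the size-first idea can be sketched on Linux without parsing ls output at all. This is not a port of find-dupes.awk; it is a minimal sketch that assumes GNU find, GNU uniq, md5sum, and file names without newlines or backslashes. The function name find_dupes is hypothetical. It only hashes files whose size occurs more than once, which avoids hashing most of the data when few files share a size.

```shell
# Hypothetical helper, not part of the repository.
# Usage: find_dupes DIR  -> prints "md5  path" for every file that has a duplicate.
find_dupes() {
  dir=${1:-.}
  # 1. Emit "size<TAB>path" for every regular file (GNU find -printf).
  find "$dir" -type f -printf '%s\t%p\n' |
  # 2. Keep only paths whose size has been seen more than once.
  awk -F'\t' '{
      size = $1
      path = substr($0, length(size) + 2)   # everything after the first tab
      seen[size]++
      if (seen[size] == 1)      first[size] = path
      else if (seen[size] == 2) { print first[size]; print path }
      else                      print path
  }' |
  # 3. Hash only those candidates.
  while IFS= read -r path; do
    md5sum -- "$path"
  done |
  # 4. Print every line whose first 32 characters (the MD5) repeat (GNU uniq).
  sort | uniq -w32 -D
}
```

Calling find_dupes /some/dir lists each duplicated file next to its hash, with identical files on adjacent lines.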

shorten-filenames.sh might lengthen rather than shorten path length

shasum always produces a 40-digit hexadecimal hash. If a file or folder name shorter than 40 characters is encoded, the path gets longer rather than shorter. Steps to reproduce:

mkdir -p abcdefghij/abcdefghij/abcdefghij
shorten-filenames.sh . encode 25

This results in ./abcdefghij/abcdefghij/b92ab2ae522e8b2a922b9c9b2c4fa7f677373489 which is actually longer than the original folder ./abcdefghij/abcdefghij/abcdefghij
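
A possible guard, as a sketch only: replace a name with its hash only when the hash is actually shorter. The function name shorten_name is hypothetical, and sha1sum from coreutils stands in for shasum here; how the script itself computes and stores names is not shown in this report.

```shell
# Hypothetical helper, not taken from shorten-filenames.sh.
# Prints the SHA-1 of NAME if that is shorter than NAME, otherwise NAME unchanged.
shorten_name() {
  name=$1
  # sha1sum prints "<40 hex digits>  -" for stdin; keep the digest only.
  hash=$(printf '%s' "$name" | sha1sum | cut -d' ' -f1)
  if [ "${#name}" -gt "${#hash}" ]; then
    printf '%s\n' "$hash"
  else
    printf '%s\n' "$name"
  fi
}
```

With this check, a 10-character name such as abcdefghij is left alone, while names longer than 40 characters are still replaced.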

shorten-filenames.sh - Problem with whitespaces in path

I have been using the script today. Unfortunately, it treats each word of a file name containing whitespace as a separate file and tries to stat it:

mv: cannot stat 'Recht': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'gehabt!': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'Der': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'neue': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'Skoda': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'Octavia': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'RS': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat '2013...': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat '_': No such file or directory
shorten-filenames.sh: it appears that directory . has already been shortened. Aborting.
mv: cannot stat 'rad-ab.com.mhtml': No such file or directory

The file name is "Recht gehabt! Der neue Skoda Octavia RS 2013... _ rad-ab.com.mhtml"

I tried to fix it myself, but I don't know where to look. I think there are some quotes missing somewhere in the script...
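
The symptom above is typical of unquoted variable expansions or of looping over unquoted command output. As an illustration only (the script's real loop is not shown here), the following sketch passes paths from find straight to a child shell as arguments and quotes every expansion, so names with spaces stay intact. The function name append_suffix is hypothetical.

```shell
# Hypothetical example, not taken from shorten-filenames.sh.
# Usage: append_suffix DIR SUFFIX  -> renames every regular file under DIR to NAME+SUFFIX.
# Sketch only: a real tool would also skip names that already carry the suffix.
append_suffix() {
  dir=$1
  suffix=$2
  # find hands each path to sh as one argument, so no word splitting occurs.
  find "$dir" -type f -exec sh -c '
    suffix=$1; shift
    for path in "$@"; do
      mv -- "$path" "$path$suffix"   # quoted: spaces in $path are preserved
    done' sh "$suffix" {} +
}
```

Writing mv $path $new without the double quotes is what makes the shell split "Recht gehabt! Der neue ..." into separate words.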

parameter "-E" not known by find in shorten-filenames.sh

I just tried the script shorten-filenames.sh under openSUSE 13.1. Unfortunately, I'm stuck with the error find: unknown predicate `-E', since the version of find that comes with openSUSE does not know an option "-E".
So, what does the option "-E" stand for?
Which operating system are you using to get this extra option?
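
For context: in the BSD find shipped with macOS and FreeBSD, -E makes -regex use extended regular expressions; GNU find has no -E and offers -regextype posix-extended instead. A sketch of a wrapper that picks the right form follows. The function name find_ere is hypothetical, and the detection relies on GNU find identifying itself in its --version output.

```shell
# Hypothetical wrapper, not part of shorten-filenames.sh.
# Usage: find_ere DIR REGEX  -> lists paths under DIR matching the extended regex.
find_ere() {
  dir=$1
  regex=$2
  if find --version 2>/dev/null | grep -q 'GNU findutils'; then
    # GNU find: select the regex dialect with -regextype.
    find "$dir" -regextype posix-extended -regex "$regex"
  else
    # BSD find (macOS, FreeBSD): -E switches -regex to extended syntax.
    find -E "$dir" -regex "$regex"
  fi
}
```

Note that -regex matches against the whole path, so patterns usually start with .*/ as in find_ere . '.*/a[0-9]+\.txt'.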
