I do not know fdupes, but this command seems to contain some strange characters (like U+201C LEFT DOUBLE QUOTATION MARK and U+201D RIGHT DOUBLE QUOTATION MARK). Is this really a straight copy/paste of what you have in the terminal emulator? That is what we expect when you show something within CODE tags.
fdupes output has some empty lines; that is why I added [[ $files ]] || continue to the loop, to skip them.
That should be enough to handle file names with spaces and tabs; however, it fails on file names that contain a newline.
I know it is silly and highly unlikely for file/path names to have newlines in them, but they are allowed, at least in Unix and its derivatives.
A workaround for that issue is to use null bytes as the delimiter between file names, and for the recursive part find is the right tool for the job.
If fdupes is not an option, computing a hash of each file and comparing the hashes should work for finding the duplicates.
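For reference, the line-based loop described above might look roughly like the following. This is only a sketch: the function name move_listed_files and the directory names are made up for illustration, and the fdupes invocation in the comment is an assumption, since the original command is not shown in this post. The function reads the list from stdin so it can be tried without fdupes installed.

```shell
#!/usr/bin/env bash
# Sketch only: move_listed_files and the directory names are hypothetical.
# Reads one path per line (the shape of fdupes output) from stdin.
move_listed_files() {
    local target=$1 files
    while IFS= read -r files; do
        # fdupes separates groups of duplicates with empty lines; skip them.
        [[ $files ]] || continue
        mv -v -- "$files" "$target"/
    done
}

# Assumed usage (not run here); -r recurses, -f omits the first file of each set:
#   fdupes -r -f directory_with_duplicate_files/ | move_listed_files directory_to_move_duplicate_files
```

Because the input is split on newlines, a file name that itself contains a newline is still read as two separate (nonexistent) paths.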
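# Requires bash 4+ for associative arrays.
declare -A array
while IFS= read -r -d '' file; do
    # Hash via stdin: GNU sha512sum prefixes its output with a backslash when
    # the file name it is given contains a newline or a backslash.
    read -r checksum _ < <(sha512sum < "$file")
    # A non-zero count means this checksum was already seen: a duplicate.
    if ((array[$checksum]++)); then
        mv -v -- "$file" directory_to_move_duplicate_files/
    fi
done < <(find directory_with_duplicate_files/ -type f -print0)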
The -print0 from find and the -d '' from the builtin read both handle null bytes properly, so the code above should be safe for file names that contain spaces, tabs and newlines.
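To see the difference for yourself, here is a small sketch that counts the files in a scratch directory twice, once null-delimited and once line by line. The function names and file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: count files two ways (all names here are made up).
count_files_nul() {
    local dir=$1 file n=0
    # Null-delimited: one iteration per file, whatever the name contains.
    while IFS= read -r -d '' file; do
        n=$((n + 1))
    done < <(find "$dir" -type f -print0)
    printf '%s\n' "$n"
}

count_files_lines() {
    local dir=$1 file n=0
    # Newline-delimited: a newline inside a name splits it in two.
    while IFS= read -r file; do
        n=$((n + 1))
    done < <(find "$dir" -type f)
    printf '%s\n' "$n"
}

scratch=$(mktemp -d)
touch "$scratch/plain.mp3" "$scratch/with space.mp3" "$scratch/with"$'\n'"newline.mp3"
count_files_nul "$scratch"     # prints 3
count_files_lines "$scratch"   # prints 4
rm -rf "$scratch"
```

Spaces cause no trouble in either loop; only the embedded newline makes the line-based count come out wrong.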
@hcvv, the mess in the quotes in my coded command line was bad on my part. I didn’t know what I was doing.
Regardless it was not the way to do this.
Until I read @jetchisel’s post, I was about to go through the whole thing on the web page I posted in my OP.
@jetchisel, THANKS both worked. So more tools in my Linux experiences.
So, I have my music duplicates found and moved from two ‘music’ directories to two different ‘dupes’ directories to check out both of your suggestions.
I thought this over during my sleep last night ;). How could this happen? You may already know how the " were mistreated, but I assume you used some word processor for the statement. Those tend to make text more readable for human beings by adapting it, amongst many other things like spell-checking. Thus they try to make "intelligent guesses" about what those " are for and then change them to “real” opening and closing quotation marks, depending on the language used.
For things that are to be understood primarily by computers and not by humans, better use an editor and not a word processor. Personally I use vi (the older incarnation of present-day vim), but that has a very steep learning curve. I think editors like Kate (in KDE) also do a good editing job without changing too much of what you mean (some editors are computer-language sensitive and do things like colour highlighting to help you find unclosed constructs, but that is only in the display; nothing is changed).
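Why the typographic quotes matter to the shell can be shown with a tiny sketch. The helper count_args is hypothetical; it only prints how many arguments it received.

```shell
#!/usr/bin/env bash
# count_args is a made-up helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

count_args "two words"   # straight quotes group the words: prints 1
count_args “two words”   # curly quotes are ordinary characters: prints 2
```

In the second call the shell splits on the space as usual, and the curly quotes simply become part of the two arguments.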
That line of code was copied from a terminal window. It was what I ‘thought’ would (re)move duplicate files to a directory named $MusicDupes.
Like I said, I had my head in a dark place, and didn’t know what I was doing, only what I was trying to do.
I do use vi, vim, nano, when necessary within terminals. But I am still learning their capabilities.
I usually see this kind of »best-intentions« typographical replacement of characters in content-management and blogging software (Typo3, Django, WordPress, Blogger, that kind of stuff).
While I really like (and regularly use) fancy Unicode characters, having them replaced automagically by the mentioned software can cause havoc when copy-pasting to/from source code, IDE, terminal window or text shell. My pet peeves:
double hyphens get replaced by typographically correct longer dashes (»—« for example, the Unicode em dash) and no longer work as command-line switches (»--recurse« vs »—recurse«)
backticks may get converted to accent characters, even combined into accented glyphs (`echo` to èchò)
bits of pathnames get interpreted as italic or cursive (/usr/bin to usr bin) … which is just /wrong/
separate ASCII characters get fused into ligatures (fish to ﬁsh; the former has 4 codepoints, the latter only 3 because »ﬁ« is one glyph), hard to spot sometimes
and of course, all kinds of »swiss«/«french»/‘single’/“double”/“fancy” shenanigans
Beware of well-intentioned goodies thrown at you by well-intentioned web-developers, I guess.
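One way to catch such replacements before running a pasted command is to check it for anything outside plain ASCII. The following is a minimal bash sketch; has_non_ascii and check are hypothetical helpers, not part of any tool mentioned in this thread, and the fdupes command lines are only sample strings.

```shell
#!/usr/bin/env bash
# has_non_ascii is a made-up helper: succeeds if the string has a non-ASCII byte.
has_non_ascii() {
    local LC_ALL=C   # compare byte by byte
    [[ $1 == *[![:ascii:]]* ]]
}

# check is another made-up helper that reports the verdict for one string.
check() {
    if has_non_ascii "$1"; then
        printf 'suspicious: %s\n' "$1"
    else
        printf 'plain ASCII: %s\n' "$1"
    fi
}

check 'fdupes --recurse "music"'   # double hyphen and straight quotes
check 'fdupes —recurse “music”'    # em dash and curly quotes
```

This flags every non-ASCII character, so it will also complain about legitimate accented file names; it is meant as a quick sanity check on a command line, not as a filter for data.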
I remember one earlier case here on the forums. But there the quotes were already wrong on the web site. In this case they are correct on the web site. But in any case, one should be aware of all these (as you rightly call them)
« des meilleures intentions »