[BASH] read line by line and compare

I want to read a file that looks like this:


[a md5checksum][a space character][name of the file]

After reading it, I want to split each line on the space and compare the md5 checksum of the first line to the second, the second to the third, and so on. I will sort the data beforehand, so by doing this I am aiming to detect duplicate files, even if their names, modification dates, etc. are different. After finding them, I will probably execute some other code on them.

Is there a good example you know of that would teach me how to do this, or can you provide one?
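A sketch of the adjacent-comparison approach described above. The file name `checksums.txt`, the function name, and the sample entries are all illustrative:

```shell
#!/usr/bin/env bash
# Read a sorted "md5 filename" list line by line and report lines
# whose checksum matches the previous line's checksum.

report_duplicates() {
    local prev_sum="" prev_file="" sum file
    while read -r sum file; do
        if [ "$sum" = "$prev_sum" ]; then
            echo "duplicate: $prev_file and $file"
            # place any extra handling of "$prev_file"/"$file" here
        fi
        prev_sum=$sum
        prev_file=$file
    done
}

# With a real listing you would run:  sort checksums.txt | report_duplicates
# Inline demo with three fake entries (two share a checksum):
sort <<'EOF' | report_duplicates
d41d8cd98f00b204e9800998ecf8427e empty1.txt
d41d8cd98f00b204e9800998ecf8427e empty2.txt
c4ca4238a0b923820dcc509a6f75849b one.txt
EOF
```

Because `read -r sum file` splits only on the first space, file names containing spaces still end up intact in `$file`.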

Yaşar Arabacı

Look at the options of sort involving duplicate lines and restricting the comparison fields.
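A minimal sketch of what that hint points at, assuming a hypothetical `checksums.txt` in the "md5 filename" format:

```shell
# -k1,1 restricts the sort key to the first field (the checksum);
# -u then keeps only the first line of each group of equal keys.
# With a real file you would run:  sort -k1,1 -u checksums.txt
sort -k1,1 -u <<'EOF'
d41d8cd98f00b204e9800998ecf8427e empty1.txt
d41d8cd98f00b204e9800998ecf8427e empty2.txt
c4ca4238a0b923820dcc509a6f75849b one.txt
EOF
```

This collapses lines that share a checksum, which answers the "unique list" reading of the question but not the "act on the duplicates" one.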

Do you want to write this in bash? In this case I think it would be better to use Python, for example.

I’m not sure what you are trying to accomplish. Are you just looking for a list of the filenames with no duplicates?

Try this:


for FILE in $(cat some_file | awk '{print $2}' | sort | uniq)
do
     echo ${FILE}
done

Hope this gets you started.

Good luck,
Hiatt

Er, why use a loop to read each line from the pipeline, which just prints to stdout anyway? Why not just run the pipeline?

sort -u -k 1.1,1.32 < yourFile
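One caveat: `-u` keeps a single representative per checksum and silently drops the rest, whereas the original goal is to find the duplicate files themselves. GNU `uniq` can print every member of a duplicate group instead; the file name and sample entries below are placeholders:

```shell
# -w32 compares only the first 32 characters (the md5 digest);
# -D (a GNU extension) prints all lines belonging to repeated groups.
# With a real listing:  sort checksums.txt | uniq -w32 -D
sort <<'EOF' | uniq -w32 -D
d41d8cd98f00b204e9800998ecf8427e empty1.txt
d41d8cd98f00b204e9800998ecf8427e empty2.txt
c4ca4238a0b923820dcc509a6f75849b one.txt
EOF
```

The output is exactly the lines whose checksum repeats, which is the set of files the OP wants to run further code on.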