Comparing Files on remote servers

Dear all,
I have been copying my files to a remote server, let’s call it X, for some time, and I know that these files were backed up to a server Y.

It looks like the backup process is now broken, and two or three files might not be the same on both servers (X and Y).

I was wondering if you could suggest a tool to search for file differences (compare files?) between the two servers.

I would like to thank you in advance for your help

Regards
Alex

Are these all openSUSE systems? Do you manage all three of them? How are you copying to X? You are really not very informative.

Hi,
I thought no more information would be needed, as I think this boils down to “Find a way to compare the contents of two folders”.

I was copying to X with scp and later with rsync. How the copy between X and Y was done is not known.

Regards
Alex

Even if the information turns out not to be needed in the end, you must be aware that other people around the world who are reading your post know nothing about what you have, do or see. When you want adequate help, you had better give them everything you think they might need, rather than withhold what you think they will not need.

That might be difficult, because you did not even give me the things I asked for explicitly. I had three (3) questions. You only gave something on the third one.

I am still waiting for information on the first two, and in the meantime will add a new one: How do you access system Y (login using what, other ways of access)?

alaios wrote:
> Dear all,
> I have been copying my files to a remote server, let’s call it X, for
> some time, and I know that these files were backed up to a server Y.
>
> It looks like the backup process is now broken, and two or three files
> might not be the same on both servers (X and Y).
>
> I was wondering if you could suggest a tool to search for file
> differences (compare files?) between the two servers.

Why do you want to search for the differences? Why not just tell rsync
to repeat the backup?

On 11/07/2012 02:46 PM, alaios wrote:
> I thought no more information would be needed, as I think this boils
> down to “Find a way to compare the contents of two folders”.

as an example of why more info (like that which Henk asked for) is required:

if X is openSUSE (or other Linux) and Y is openSUSE (or other Linux)
(his first question), and you have access to both (his second question)
and you are copying with rsync with switches to retain date/etc (his
third question) then you could make a little script which would ls each
and compare the two outputs with diff, to find differences… simple.

but, if X is openSUSE (or other Linux) and Y is Windows
Server/Solaris/SPARC/etc then that won’t work.

and, his fourth question: if you access via remote desktop and need a
GUI to do your server admin work, then that simple script gets . . .


dd

On 2012-11-07 14:46, alaios wrote:

> Hi,
> I thought no more information would be needed, as I think this boils
> down to “Find a way to compare the contents of two folders”.

No, an answer is impossible unless you say what operating system each
machine runs and what type of access you have on each. For example, are
they Linux, and do you have ssh access to them?


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” (Minas Tirith))

It looks like both systems run the same OS and kernel:

3.2.0-32-generic #51-Ubuntu SMP

On both systems I have access to the files through ssh. Both directories report exactly the same number of megabytes, which I think is a first indicator (even though not enough) that the files might be identical.

So what is the missing part here to prove that as well?

I would like to thank you in advance for your help

Regards
Alex

On 2012-11-09 16:26, alaios wrote:
>
> It looks like both systems run the same OS and kernel:
>
> 3.2.0-32-generic #51-Ubuntu SMP
>
>
> On both systems I have access to the files through ssh. Both directories
> report exactly the same number of megabytes, which I think is a first
> indicator (even though not enough) that the files might be identical.

Then I would log on to both machines via ssh and run a checksum on the
files, which you can then compare visually. Or, if it is a bunch of
files, create a checksum file (man md5sum) on one machine, copy it to
the other machine and run the check there.


Cheers / Saludos,

Carlos E. R.

I would like to thank you for your reply. The problem I see is that there are so many files that comparing them one by one would take ages. In that case I should either use a graphical interface (that supports ssh) or, even better, a tool that pinpoints the files that differ (because I guess that only a few files, if any, would have differences).

Thanks again
Regards

Alex

http://zuhaiblog.com/2011/02/14/using-diff-to-compare-folders-over-ssh-on-two-different-servers/

shows an example with rsync (using the --dry-run option).


PC: oS 12.2 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.8.5 | GeForce GT 420
ThinkPad E320: oS 12.2 x86_64 | i3@2.30GHz | 8GB | KDE 4.9.3 | HD 3000
eCAFE 800: oS 11.4 i586 | AMD Geode LX 800@500MHz | 512MB | lamp server

Mass actions like yours seldom call for a GUI program. They call for a program that can run from the CLI and is thus fit for batch work. A good script might be the solution, using something like ssh and sftp (over ssh) to fetch those files pair by pair, compare them and delete them again from local storage, then go on to the next pair. Maybe it runs for a few hours, but who cares, just do it at night and go for a drink and some sleep.

But for that you need somebody rather fluent in scripting and in using ssh in batch mode. And that person needs to sit at your system, because it needs a lot of trial and error. That means either you, or a good friend.

EDIT: Sorry Martin, didn’t see your suggestion. But yes, all sorts of scripted solutions come to mind.

On 2012-11-12 13:56, alaios wrote:
>
> I would like to thank you for your reply. The problem I see is that
> there are so many files that comparing them one by one would take ages.
> In that case I should either use a graphical interface (that supports
> ssh) or, even better, a tool that pinpoints the files that differ
> (because I guess that only a few files, if any, would have differences).

No, you use md5sum to generate a list of files with their checksums.
When you run the check on the other computer with that list, it
automatically checks all the filenames in it.


cer@minas-tirith:~/tmp> md5sum *
d41d8cd98f00b204e9800998ecf8427e  cp
d41d8cd98f00b204e9800998ecf8427e  env
d41d8cd98f00b204e9800998ecf8427e  export
3b05731dd9fe203b0b9d17e9ac84b143  hosts
3b05731dd9fe203b0b9d17e9ac84b143  hosts.old
8ad33d8de958b99c99f39f3323a51da8  os-en_mbr
9b6f027f10b0de58c5de910f2c47e5b6  os-en_mbr.msf

You save that output to a file, say file.md5, via copy and paste or via
redirection. On the target, you run md5sum -c file.md5:


cer@minas-tirith:~/tmp> md5sum -c file.md5
cp: OK
env: OK
export: OK
hosts: OK
hosts.old: OK
os-en_mbr: OK
os-en_mbr.msf: OK
cer@minas-tirith:~/tmp>

See? I told you this in the previous post already. I’ll leave it to you
to work out how to produce output only for bad files.
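
Since plain "md5sum *" does not descend into subdirectories, here is a minimal sketch of how the same idea could be applied to a whole tree. It assumes GNU find, sort, xargs and md5sum (as on a typical Linux system); the function names make_md5_list and check_md5_list are made up for illustration and the paths in the usage note are placeholders.

```shell
#!/bin/sh
# make_md5_list DIR: print "checksum  ./relative/path" for every regular
# file under DIR, in a stable (byte-sorted) order.
# Write the output to a file OUTSIDE of DIR, otherwise the list ends up
# containing a checksum of itself.
make_md5_list() {
    ( cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r md5sum )
}

# check_md5_list DIR LIST: verify the files under DIR against LIST.
# --quiet suppresses the "OK" lines, so only failed or unreadable files
# are printed; the exit status is non-zero if anything did not match.
check_md5_list() {
    # Turn LIST into an absolute path before changing directory.
    list=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    ( cd "$1" && md5sum -c --quiet "$list" )
}
```

On X one would run something like "make_md5_list /path/to/data > /tmp/x.md5", copy x.md5 to Y (for example with scp), and on Y run "check_md5_list /path/to/data /tmp/x.md5". Only the small list crosses the network, not the files themselves. Note that this detects files that are missing or different on Y, but not extra files that exist only on Y.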


Cheers / Saludos,

Carlos E. R.

Am 12.11.2012 13:56, schrieb alaios:
> In that case I should either use a graphical interface (that supports
> ssh) or, even better, a tool that pinpoints the files that differ

If you are looking for a GUI tool: Krusader should be able to do that.



And Carlos’ suggestion might be the easiest. In fact no real scripting is needed.

On 11/12/2012 02:56 PM, hcvv wrote:
> In fact no real scripting needed.

or as i wrote days ago, just ls X to file, ls Y to file, and then diff
the two and out pops a list of differences.

simple.


dd

Only works if it is a safe assumption that a file present means a file
intact. Anything short of a checksum (md5 or otherwise) or something
that verifies the contents like that (rsync can do this too) is going to
fall short if a single bit of the file is not identical on both sides.

Also, you’d at least want to sort the output since ordering isn’t the
same for every system/filesystem depending on a bunch of variables.
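
For what it is worth, a sorted listing could be sketched roughly like this. It assumes GNU find (for -printf); list_tree is just a made-up name, and as said above it only compares names and sizes, not contents.

```shell
#!/bin/sh
# list_tree DIR: print "relative/path<TAB>size in bytes" for every regular
# file under DIR, sorted bytewise so that the order does not depend on the
# filesystem or the locale.
list_tree() {
    ( cd "$1" && find . -type f -printf '%p\t%s\n' | LC_ALL=C sort )
}
```

Run "list_tree /path/to/data > x.list" on X and the same into y.list on Y, bring the two lists together and "diff x.list y.list". Files that are missing on one side or that differ in size show up; a file with the same size but different content does not, which is why a checksum is still needed for a real proof.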

Good luck.

On 2012-11-12 14:26, hcvv wrote:
>
> Mass actions like yours seldom call for a GUI program. They call for a
> program that can run from the CLI and is thus fit for batch work. A good
> script might be the solution, using something like ssh and sftp (over
> ssh) to fetch those files pair by pair, compare them and delete them
> again from local storage, then go on to the next pair. Maybe it runs for
> a few hours, but who cares, just do it at night and go for a drink and
> some sleep.

That requires heavy network usage, which I thought was to be avoided.
And if what you want to check is for errors in transit, you can create
new errors…


Cheers / Saludos,

Carlos E. R.

Hi, I tried the rsync solution, as md5sum cannot run recursively by itself and would probably require scripting to make it work.

The rsync dry run produces the following (I edited some lines regarding host and username):

rsync --dry-run -rvce "ssh" ./ username@host:/storage/volume325/Data
The authenticity of host 'hidden' can't be established.
ECDSA key fingerprint is 62:12:d3:f2:7e:25:05:89:f4:43:fa:11:c3:ef:62:04.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added hidden to the list of known hosts.
user@hosts's password: 
sending incremental file list
check.md5

sent 1870426 bytes  received 871 bytes  73.19 bytes/sec
total size is 1016380260676  speedup is 543142.14 (DRY RUN)


As you can see, rsync lists just one file, called check.md5 (it was created when I was trying the proposed md5 solution).
What I do not understand is why the number of bytes sent (1870426) is so large when the check.md5 file is 0 bytes.

Could you please also confirm that I understand the output of rsync correctly and that all the files are indeed alike?

Regards
Alex

Am 15.11.2012 10:56, schrieb alaios:
> What I do not understand is why the number of bytes sent (1870426) is
> so large when the check.md5 file is 0 bytes.
That means that the file check.md5 is the sole difference between the
two folders.
As for the amount of data sent: How do you think rsync can compare files
without sending data about the files you want to compare?
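
To make the check easier to read, the same dry run could be wrapped so that it prints nothing but the names of the files that differ. This is only a sketch: changed_files is a made-up name, and it relies on the rsync options -r (recursive), -c (compare by checksum), -n (dry run) and --out-format. As far as I know, with -c the data sent is the file list plus a checksum per file, which is why the byte count grows with the number of files rather than with the size of check.md5.

```shell
#!/bin/sh
# changed_files SRC DEST: print the names of the files that rsync would
# transfer from SRC to DEST when comparing by checksum. Nothing is copied
# because of -n (--dry-run). DEST may also be a remote user@host:/path.
changed_files() {
    rsync -rcn --out-format='%n' "$1"/ "$2"/
}
```

For the case above this would be called as "changed_files . username@host:/storage/volume325/Data". Empty output means rsync found no file on the source side that is missing or different on the destination; files that exist only on the destination are not reported unless --delete is added to the dry run.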

