I’ve two computers with 1000 Mbps cards: one running openSUSE 11.2 32-bit (AMD K7 over 1 GHz, 2 MB RAM) and the other openSUSE 11.3 64-bit (single processor, 3 GB). Both cards use the skge driver. The switch is an SMC 10/100/1000, and it reports a 1000 Mbps connection on both cards. The i386 machine is the server and the other is the client.
On the client I have some directories of the server mounted via NFS.
I’m testing the connection by copying large files (1 GB, for instance). The copy reaches 14 MBps at most, and most of the time runs at roughly 10 MBps, which is far below what 1000 Mbps should deliver.
Could it be a configuration problem? Is the switch not good enough? Or could it be some other kind of problem?
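For scale, converting the link rate from bits to bytes shows why those numbers are suspicious (a quick sketch; the arithmetic is mine, not from the thread):

```shell
# 1000 Mbit/s divided by 8 bits per byte = theoretical ceiling in MB/s
echo $(( 1000 / 8 ))   # prints 125
# 100 Mbit/s (Fast Ethernet) works out to 12 MB/s --
# uncomfortably close to the 10-14 MBps being observed
echo $(( 100 / 8 ))    # prints 12
```

In other words, the observed throughput is roughly what a link that has fallen back to 100 Mbps would deliver, which is worth ruling out before blaming NFS.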
I don’t think it’s a configuration problem. It may be a cable problem (you need good-quality Cat 6 cable). The cable also can’t be longer than the spec limit (100 m for 1000BASE-T) to achieve the full 1000 Mbps. However, most likely the bottleneck is the speed at which you can read/write from/to your hard drives, and that can be solved by using a fast RAID.
Bottom line: as a “Johnny Come Lately” to NFS shares, I was surprised that NFS is generally configured to run over UDP, but that configuration is suited to the era when it was last developed (approx. 2002, and mostly even earlier). As a protocol without error correction, UDP relies on the higher-level protocol (NFS) to provide that necessary function. This works only for relatively tiny transfers (a few megabytes) and on a non-congested, preferably dedicated, link. Also, because of the vintage of the technology you’re likely implementing, you’re probably configured to use a static buffer size.
Much better is to run over TCP/IP, which in most recent kernels supports dynamically resizing the TCP window buffer to accommodate conditions that tend to cause packet loss, e.g. a poor connection medium like wireless, congestion, or multiple segmented networks. Also, you’ll notice the link in my post describes the 11 or so (maybe more today) algorithms that can be selected to modify the buffer size; for your purposes you can select a method that skews towards large files over fat pipes rather than the default.
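On Linux this autotuning is controlled through sysctl. A minimal sketch of what to inspect and (optionally) raise — the buffer values below are examples, not tuned recommendations from this thread:

```shell
# check whether receive-buffer autotuning is on (1 = dynamic window resizing)
sysctl net.ipv4.tcp_moderate_rcvbuf
# see the active congestion-control algorithm and the ones available
sysctl net.ipv4.tcp_congestion_control
sysctl net.ipv4.tcp_available_congestion_control
# raise the autotuning ceilings (min / default / max, in bytes);
# the 4 MB maxima here are example values only
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"
```

Switching the algorithm is just `sysctl -w net.ipv4.tcp_congestion_control=cubic` (or another name from the available list).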
Yes, but when was the last major contribution to the implementation? According to what I read, it was approx. 2002, and there was a long gap before that (please don’t ask me to redo my investigation unless it’s important, but maybe it was in the man pages or similar?).
BTW, just expanding for the OP: although implementing dynamic buffer resizing is something of a silver bullet, it’s of course still subject to all the usual suspects that can cause a severe drop-off in throughput (link integrity, QoS, and more). But I’m guessing the extra-severe drop-off you’re seeing is the domino effect when your buffer window is overwhelmed and at least one of the networking devices has backed off to its slowest speed (10 Mb/s). This means you might also look at your NICs to see if you can lock in the highest speed only.
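A hedged sketch of locking in gigabit with ethtool (the interface name `eth0` is an assumption). Note that 1000BASE-T requires autonegotiation, so rather than disabling it you restrict what gets advertised:

```shell
# show currently negotiated and advertised modes
ethtool eth0
# advertise only 1000baseT/Full (mode mask 0x020),
# so the link cannot fall back to 10 or 100 Mbps
ethtool -s eth0 advertise 0x020
```

If the link then fails to come up at all, the cabling or the switch port can’t sustain gigabit, which is diagnostic in itself.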
OK, I’m going to try what you say, but I’m afraid the problem must be elsewhere. I’m testing the connection with sftp and smb: sftp gives a speed of about 5 MBps, far below even 100 Mbps (I suppose that’s due to the computation sftp needs to encrypt the data), and smb reaches more or less 11–12 MBps (I think smb uses TCP, doesn’t it?)
On 01/13/2011 09:06 AM, glistwan wrote:
> fperal;2277505 Wrote:
>> I’ve two computers with 1000Mbps cards, one opensuse 11.2 32 bits (AMD
>> K7 over 1Ghz and 2MB RAM) and the other Opensuse 11.3 11.3 64 bits
>> (single processor, 3GB). Both of the cards are configured with skge. The
>> swith is an SMC 10/100/1000 and it reports connection at 1000Mbps on
>> both cards. The i386 is a server and the other is a client.
>> I’ve in the client some directories of the server mounted by nfs.
>> I’m testing the connection copying long files (1GB for instance) and
>> the copy reaches 14MBps at most, and most of the time it is more or less
>> at 10MBps, which is too far from 1000Mbps.
>> It may be a problem of configuration? The switch is not good enough? or
>> may be another kind of problem ?
>> thanks in advance
> I don’t think it’s a configuration problem. It may be a cable problem
> (you need good quality cat 6 cable). Also the cable can’t be longer than
> some value (don’t know this exact value) to achieve full 1000Mbps,
> however most likely the bottleneck is the speed at which You can
>> read/write from/to Your hard drives and it can be solved by using a fast
>> RAID.
I disagree. You would need a really slow disk system to limit the throughput to these values. In fact, I can get 27 Mbps pumping data across an 802.11g wireless link into a very old HP laptop with a PATA disk, connected to my switch at 100 Mbps.
Check the connections at both ends with the ethtool utility. You can also
eliminate any disk latency by testing the throughput with iperf.
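A minimal way to run both checks (the interface name is an assumption; `aldebaran` is the server hostname used later in this thread — substitute your own):

```shell
# verify the negotiated speed/duplex on each machine (eth0 is an assumption)
ethtool eth0 | grep -E 'Speed|Duplex'

# then measure raw TCP throughput with the disks out of the picture:
# on the server
iperf -s
# on the client: 20-second test against the server
iperf -c aldebaran -t 20
```

If iperf shows several hundred Mbit/s but NFS copies stay at 10–14 MBps, the network layer is healthy and the problem is higher up.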
As I noted, subject to common-sense limits you can overcome a multitude of ills just by modifying the windowing algorithm if you’re still trying to transfer large files. Also, you need to establish a baseline somewhere: what is your “normal” network performance? You see your problems because you’re doing actual testing; are there existing problems in your network you may not be aware of?
Unless you have a line tester that can provide positive proof of line quality, the next best thing in the meantime is to run some other tests, like:
Run your tests with small files
Remove (or shut down, making sure you don’t have Wake on LAN enabled anywhere) all other machines
Reboot both machines (I don’t know whether stopping/starting network services would be sufficient)
Run file-transfer tests starting with tiny files, graduating to larger files
If you see a big improvement, add other machines back into the network one at a time. Maybe one of them is causing problems like a broadcast storm.
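Generating the graduated test files is a one-liner with dd (a sketch — paths and sizes are examples, not from the thread):

```shell
# create test files of graduated sizes under /tmp
for size_mb in 1 10 100; do
  dd if=/dev/zero of="/tmp/nfstest_${size_mb}M" bs=1M count="${size_mb}" 2>/dev/null
done
ls -l /tmp/nfstest_*M
```

Copying each of these across the mount in turn shows whether throughput degrades with file size or is uniformly slow.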
Also, if your machines are fairly capable then, subject to the level of encryption/compression being used, I doubt that would have much effect on network throughput (the bottleneck is unlikely to be a machine’s internal performance).
This article gives a nice comparison of average read/write speeds for different storage. So I guess I was wrong, and you should be able to get more than 14 MBps with your config. Maybe it’s the switch’s fault. Does it have a built-in resource monitor, or can you check for errors on its interfaces?
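To rule the disks in or out directly instead of relying on published averages, a rough local benchmark on each machine (the device name is an assumption):

```shell
# buffered sequential-read benchmark (needs root; /dev/sda is an assumption)
hdparm -t /dev/sda
# a plain local write gives a feel for sustained write speed;
# conv=fsync forces the data to disk before dd reports a rate
dd if=/dev/zero of=/tmp/disktest bs=1M count=512 conv=fsync
rm -f /tmp/disktest
```

If both machines read and write locally well above 60 MB/s, the disks are not the 10–14 MBps bottleneck.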
mount -t nfs -o rsize=2048,wsize=2048,noatime,nodev,vers=3 aldebaran:/origin /destination
takes 3’40’’ -> near 11 MBps
mount -t nfs -o rsize=32768,wsize=32768,noatime,nodev,vers=3 aldebaran:/origin /destination
takes 1’50’’ -> near 21 MBps, but that’s because it suddenly stops, as if frozen, and then continues; while actually copying it runs at more than 60 MBps most of the time. Copying a 600 MB file finishes in a few seconds, at about 60 MBps.
Using a larger size does not improve things much (1’40’’ for the 2.3 GB file with a 128K window).
But with a 32K window, copying a 600 MB file from client to server runs at only 14 MBps, unlike from server to client.
Isn’t the link symmetrical? (I’ve used 32K for both rsize and wsize, so I thought it would be.)
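One way to time each direction independently is dd straight through the mount point (a sketch — `MOUNTPOINT` is an assumption; point it at the NFS mount such as /destination, while /tmp here just keeps the commands runnable anywhere):

```shell
MOUNTPOINT=/tmp
# write path (client -> server when MOUNTPOINT is the NFS mount);
# conv=fsync makes dd wait until the data is committed before reporting speed
dd if=/dev/zero of="$MOUNTPOINT/nfs_probe" bs=1M count=64 conv=fsync
# read path (server -> client)
dd if="$MOUNTPOINT/nfs_probe" of=/dev/null bs=1M
rm -f "$MOUNTPOINT/nfs_probe"
```

Note that rsize only governs reads and wsize only governs writes, so the two directions are tuned (and can behave) independently.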
james@linux-ew60:~> iperf -s
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
[  4] local 192.168.0.153 port 5001 connected with 192.168.0.139 port 40103
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-20.0 sec  1.32 GBytes   566 Mbits/sec
Yet another useful command learned today. I used it to connect between an openSUSE 11.3 and an 11.4 computer. The command above was run on the 11.3 computer. For some reason, when I tried to reverse the roles it would not work, even though it works every time this way. Perhaps something is wrong in openSUSE 11.4, though I’m sure I don’t know what. I have 1 Gb network cards and a 1 Gb local switch.
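If the roles won’t reverse (for example, a firewall blocking port 5001 on the other box), iperf can drive the reverse direction from the same client — a sketch, with the server IP taken from the output above:

```shell
# keep the working server side as-is:   iperf -s
# from the client, -r runs the normal test and then a second test in the
# reverse direction; the client opens its own listener for the return leg
iperf -c 192.168.0.153 -r
```

A large asymmetry between the two legs would match the NFS client-to-server slowdown reported earlier in the thread.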