network connection 1000Mbps doesn't reach 1000Mbps

Hi!

I have two computers with 1000Mbps cards: one runs openSUSE 11.2 32-bit (AMD K7 over 1GHz and 2MB RAM) and the other openSUSE 11.3 64-bit (single processor, 3GB). Both cards use the skge driver. The switch is an SMC 10/100/1000 and it reports a 1000Mbps connection on both cards. The i386 machine is the server and the other is the client.

On the client I have some of the server’s directories mounted via NFS.
I’m testing the connection by copying large files (1GB for instance) and the copy reaches 14MBps at most; most of the time it is around 10MBps, which is far from 1000Mbps.

Could it be a configuration problem? Is the switch not good enough? Or might it be some other kind of problem?

thanks in advance

I don’t think it’s a configuration problem. It may be a cable problem (you need good-quality Cat 6 cable). Also, the cable can’t be longer than some value (I don’t know the exact figure) to achieve the full 1000Mbps. However, most likely the bottleneck is the speed at which you can read from and write to your hard drives, which can be solved by using a fast RAID.

Best regards,
Greg

See this recent thread, particularly ken yap and my posts

http://forums.opensuse.org/english/get-technical-help-here/network-internet/452159-11-3-nfs-client-hangs-large-file-transfer-2.html

I’ll repost the link I posted in that thread as the reference
TCP and Linux’ Pluggable Congestion Control Algorithms LG #135

Bottom line: as a “Johnny-come-lately” to NFS shares, I was surprised that NFS was/is generally configured to run over UDP, but that configuration is better suited to the era when it was last developed (approx. 2002, but mostly even earlier). Since UDP is a protocol without error correction, it relies on the higher-level protocol (NFS) to provide that necessary function. This works only for relatively tiny (a few megabytes) transfers and on a non-congested link (preferably dedicated). Also, because of the vintage of the technology you’re likely implementing, you’re probably configured to use a static buffer size.

Much better is to run NFS over TCP/IP, which in most recent kernels supports dynamic resizing of the TCP window buffer to accommodate conditions that tend to cause packet loss, e.g. a poor connection medium like wireless, congestion, or multiple segmented networks. Also, you’ll notice the link in my post describes the 11 or so (maybe more today) algorithms that can be selected to modify the buffer size; for your purposes you can obviously select a method that skews towards better large-file transfers over fat pipes than the default.
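To see what a given machine is currently using, the kernel exposes these settings under /proc/sys/net/ipv4. The sketch below is only a hedged illustration: the function name `show_tcp_tuning` and its optional directory argument are made up for this example, and which algorithms appear depends on the modules your kernel has available.

```shell
#!/bin/sh
# Print the kernel's current TCP congestion-control and buffer settings.
# The optional directory argument is hypothetical and exists only so the
# function can be pointed at a test directory; normally use the default.
show_tcp_tuning() {
    dir="${1:-/proc/sys/net/ipv4}"
    for key in tcp_congestion_control tcp_available_congestion_control \
               tcp_moderate_rcvbuf tcp_rmem tcp_wmem; do
        # Skip silently any setting this kernel does not expose.
        if [ -r "$dir/$key" ]; then
            printf '%s = %s\n' "$key" "$(cat "$dir/$key")"
        fi
    done
}

show_tcp_tuning
```

If you want to try another algorithm from the "available" list, root can set it with `sysctl -w net.ipv4.tcp_congestion_control=<name>`; whether that helps on a short LAN link is something to measure, not assume.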

HTH,
Tony

@tsu2: NFS was started in 1984 (by Sun Microsystems).

Yes, but when was the last major contribution to the implementation? According to what I read, it was approx. 2002, and there was a long gap before that (please don’t ask me to redo my investigation unless it’s important, but maybe it was in the man pages or similar?).

:)

BTW, just expanding for the OP: although implementing dynamic buffer resizing is a kind of silver bullet, it’s of course still subject to all the usual suspects that could cause a severe drop-off in throughput, like link integrity, QoS and more. But I’m guessing the extra-severe drop-off you’re seeing is the domino effect when your buffer window is overwhelmed and at least one of the networking devices has backed off to the slowest speed (10Mbps). This means you might also consider looking at your NICs to see if you can lock in the highest speed only.

Tony

OK, I’m going to try what you say, but I’m afraid the problem must be elsewhere. I’m testing the connection with sftp and smb: sftp gives a speed of about 5MBps, far below 100Mbps (I suppose it’s due to the computation sftp needs to encrypt the data), and smb reaches more or less 11-12MBps (I think smb uses TCP, doesn’t it?).

On 01/13/2011 09:06 AM, glistwan wrote:
>
> I don’t think it’s a configuration problem. It may be a cable problem
> (you need good quality cat 6 cable). Also the cable can’t be longer than
> some value (don’t know this exact value) to achieve full 1000Mbps,
> however most likely the bottleneck is the speed at which You can
> read/write from/to Your hard drives and it can be solved by using a fast
> RAID.

I disagree. You would need a really slow disk system to limit the throughput to
these values. In fact, I can get 27 Mbps pumping data across an 802.11G wireless
link into a very old HP laptop with PATA disk connected to my switch with a 100
Mbps link.

Check the connections at both ends with the ethtool utility. You can also
eliminate any disk latency by testing the throughput with iperf.
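
A minimal sketch of the ethtool check, assuming the interface is called eth0 (list yours with `ip link`); the helper name `link_summary` is made up for this example. It just filters the three lines of `ethtool` output that matter here: negotiated speed, duplex and autonegotiation state.

```shell
#!/bin/sh
# Interface name is an assumption; pass your own as the first argument.
IFACE="${1:-eth0}"

# Keep only the speed, duplex and autonegotiation lines from
# `ethtool <iface>` output read on stdin, with leading whitespace removed.
link_summary() {
    grep -E 'Speed:|Duplex:|Auto-negotiation:' | sed 's/^[[:space:]]*//'
}

# Only run the real query if ethtool is installed.
if command -v ethtool >/dev/null 2>&1; then
    ethtool "$IFACE" 2>/dev/null | link_summary
fi
```

Run it on both machines (ethtool usually needs root). If either end reports something other than 1000Mb/s and Full duplex, that is worth fixing before tuning anything else.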

As I noted, subject to common sense limits you can overcome a multitude of ills just by modifying the windowing algorithm if you’re still trying to transfer large files. Also, you need to establish a baseline somewhere – What is your “normal” network performance? You see your problems because you’re doing actual testing, are there existing problems in your network you may not be aware of?

Unless you have a line tester that can provide proof positive of line quality, the next best thing in the meantime can be to run some other tests… like

Run your tests with small files

or,

Remove (or shut down, and make sure you don’t have Wake-on-LAN enabled anywhere) all other machines
Reboot both machines (I don’t know if stop/start network services might be sufficient)
Run transfer file tests starting with tiny files, graduate to larger files
If you see a big improvement, add other machines back into the network one at a time. Maybe one of them is causing problems like a broadcast storm.

Also, if your machines are fairly capable then, depending on the level of encryption/compression being used, I doubt it would have any effect on network throughput (the bottleneck is unlikely to be a machine’s internal performance).

Tony

As lwfinger said, check the “real” speed with iperf.

  • Install iperf if needed:
zypper in iperf 

  • On the server type:
iperf -s
  • On the client type:
iperf -t 20 -c <server IP>
  • wait 20 seconds

You might have a broken switch too. Try turning it off and on, and plugging your network cables into other ports.

Hi
Ensure any auto negotiation is turned off on the systems so they are
running at full/full.

The other thing is MBps and Mbps are two different units… as in bytes
and bits…
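
Since the two units are easy to mix up, here is a small conversion sketch (1 byte = 8 bits, decimal megabytes assumed). The function names are made up for this example.

```shell
#!/bin/sh
# Convert megabytes per second to megabits per second.
mbytes_to_mbits() {
    awk -v v="$1" 'BEGIN { printf "%g\n", v * 8 }'
}

# Convert megabits per second to megabytes per second.
mbits_to_mbytes() {
    awk -v v="$1" 'BEGIN { printf "%g\n", v / 8 }'
}

mbytes_to_mbits 14     # the best NFS copy rate reported in this thread
mbits_to_mbytes 1000   # the nominal rate of a gigabit link
```

So 14MBps is 112Mbps, and a gigabit link tops out at 125MBps before protocol overhead.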


Cheers Malcolm °¿° (Linux Counter #276890)

Yes, I’m talking about MBps; 12MBps is roughly 100Mbps, but far from 1000Mbps.

regards

Yes, but that’s more than 4 times slower than 14 MBps. I’m sure that at that speed the WiFi is the limit.

Best regards,
Greg

andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5

Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 27.5 KByte (default)

[  3] local 192.168.2.3 port 43213 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  1.13 GBytes    485 Mbits/sec

Well, that’s not so bad. It’s almost half the maximum speed!

But copying a 1.1GB folder with more than 5000 files takes 2’30’’, that is, 7.5MBps; it’s even worse than copying large files.

I’m going to continue testing.

regards

It’s normal that copying a lot of small files is slower than copying one large file.

Best regards,
Greg

CTX121634 - How to Evaluate Performance of Block Type Storage Repositories for XenServer - Citrix Knowledge Center

This article gives a nice comparison of average read/write speeds for different storage. So I guess I was wrong and you should be able to get more than 14 MBps with your config. Maybe it’s the switch’s fault. Does it have a built-in resource monitor, or can you check for errors on the interfaces?
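
On the Linux side, per-interface error counters can be read from sysfs without any extra tools. This is only a sketch: the function name `iface_errors` and its optional second argument are made up for this example, and eth0 is an assumed interface name.

```shell
#!/bin/sh
# Print error-related counters for one network interface.
# $1: interface name (eth0 is an assumption)
# $2: sysfs base directory (hypothetical override, useful only for testing)
iface_errors() {
    iface="${1:-eth0}"
    base="${2:-/sys/class/net}"
    for counter in rx_errors tx_errors rx_dropped tx_dropped \
                   rx_crc_errors collisions; do
        file="$base/$iface/statistics/$counter"
        # Print only the counters that exist for this interface.
        if [ -r "$file" ]; then
            printf '%s %s\n' "$counter" "$(cat "$file")"
        fi
    done
}

iface_errors eth0
```

Counters that keep growing during a transfer (CRC errors in particular) point towards cabling, ports or the NIC rather than NFS settings.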

Best regards,
Greg

Then you should adapt the read/write buffer size in the NFS mount options on the client. Example for NFSv3:

mount -t nfs -o rsize=32768,wsize=32768,noatime,nodev,vers=3 <server>:<export> <mount point>

Try different sizes between 2K and 32K. It might make a huge difference.
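
To make the sweep less error-prone, you could generate the mount commands first and run them one at a time, unmounting and timing the same copy between runs. This sketch only prints the commands, it does not mount anything; `server:/export` and `/mnt/nfstest` are placeholders, and the function name is made up for this example.

```shell
#!/bin/sh
# Placeholders: replace with your own export and mount point.
SERVER_EXPORT="server:/export"
MOUNT_POINT="/mnt/nfstest"

# Print one NFSv3 mount command per buffer size from 2K to 32K.
nfs_sweep_commands() {
    for size in 2048 4096 8192 16384 32768; do
        echo "mount -t nfs -o rsize=$size,wsize=$size,noatime,nodev,vers=3 $SERVER_EXPORT $MOUNT_POINT"
    done
}

nfs_sweep_commands
```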

wow! Amazing!

copying a 2.3GB file from client to server

mount -t nfs -o rsize=2048,wsize=2048,noatime,nodev,vers=3 aldebaran:/origin /destination

takes 3’40’’ -> near 11MBps

mount -t nfs -o rsize=32768,wsize=32768,noatime,nodev,vers=3 aldebaran:/origin /destination

takes 1’50’’ -> near 21MBps, but that’s because it suddenly stops, as if frozen, and then continues; during the copy, most of the time it’s running at more than 60MBps. Copying a 600MB file takes a few seconds, at about 60MBps.

Using a larger size does not improve things much (1’40’’ for the 2.3GB file with a 128K window).
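
For anyone wanting to check the arithmetic behind these rates, here is a small sketch. It assumes 1GB = 1024MB, which is what the figures quoted in this thread appear to use; the function name `copy_rate` is made up for this example.

```shell
#!/bin/sh
# Average throughput in MB/s for a copy of <size in GB> that took
# <minutes> <seconds>, taking 1GB as 1024MB (an assumption).
copy_rate() {
    awk -v gb="$1" -v m="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", gb * 1024 / (m * 60 + s) }'
}

copy_rate 2.3 3 40   # 2.3GB with rsize/wsize=2048
copy_rate 2.3 1 50   # 2.3GB with rsize/wsize=32768
```

These come out at 10.7 and 21.4, matching the "near 11MBps" and "near 21MBps" above. They are averages over the whole copy, so the stalls are included.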

But with a 32K window, copying a 600MB file from client to server, instead of from server to client, runs at only 14MBps.

Isn’t the link symmetrical? (I’ve used 32K for both rsize and wsize, so I thought it would be.)

regards

Switch to TCP for NFS transport if you haven’t already.
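
To find out which transport an existing NFS mount is using, you can look at the `proto=` option in /proc/mounts. A minimal sketch follows; the function name `nfs_proto` is made up for this example.

```shell
#!/bin/sh
# Read mounts-format lines on stdin and print "<mount point> proto=<x>"
# for every NFS entry that carries a proto= option.
nfs_proto() {
    awk '$3 ~ /^nfs/ {
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^proto=/) print $2, opt[i]
    }'
}

if [ -r /proc/mounts ]; then
    nfs_proto < /proc/mounts
fi
```

If it reports `proto=udp`, unmount and mount again with `proto=tcp` added to the `-o` option list.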

james@linux-ew60:~> iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.0.153 port 5001 connected with 192.168.0.139 port 40103
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-20.0 sec  1.32 GBytes    566 Mbits/sec

Yet another useful command learned today. I used it to connect an openSUSE 11.3 computer and an openSUSE 11.4 one. The output above is from the 11.3 computer. For some reason, when I tried to reverse the roles, it would not work, even though it works every time this way. Perhaps something is wrong with openSUSE 11.4, though I am sure I don’t know what. I have 1 Gb network cards and a 1 Gb local switch.

Thank You,

some interesting things I’m learning:

On the server side
iperf -s -> it uses an 82.3KB window (default); I’ve tried changing it (smaller or larger) and the results are always worse.

On the client side:


andromeda:/home/fernando # iperf -t 20 -d -c 192.168.2.5 -w 2K
WARNING: option -d is not valid for server mode
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 4.00 KByte (WARNING: requested 2.00 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 37121 connected with 192.168.2.5 port 5001
^C[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-13.5 sec    420 MBytes    262 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 2K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 4.00 KByte (WARNING: requested 2.00 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 37129 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec    566 MBytes    238 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 4K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 8.00 KByte (WARNING: requested 4.00 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 37137 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec    744 MBytes    312 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 8K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 16.0 KByte (WARNING: requested 8.00 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 37138 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  1.03 GBytes    443 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 16K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 32.0 KByte (WARNING: requested 16.0 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 37146 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  1.06 GBytes    453 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 32K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 39775 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  1.08 GBytes    465 Mbits/sec
andromeda:/home/fernando # iperf -t 20 -c 192.168.2.5 -w 64K
------------------------------------------------------------
Client connecting to 192.168.2.5, TCP port 5001
TCP window size:   128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[  3] local 192.168.2.3 port 39783 connected with 192.168.2.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec  1.10 GBytes    472 Mbits/sec



So it’s useful to use a larger window than the default (I think it’s 4KB in NFS), but very large windows don’t improve the results any further.
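
One way to reason about why small windows hurt: a TCP sender cannot have more than one window of data in flight per round trip, so throughput is bounded by window / RTT. The sketch below computes that bound. The 0.1 ms round-trip time is purely an assumption for a small switched LAN (measure yours with `ping`), and the function name `max_mbits` is made up for this example.

```shell
#!/bin/sh
# Upper bound on TCP throughput in Mbit/s for a given window size (KByte)
# and round-trip time (ms): throughput <= window / RTT.
max_mbits() {
    awk -v kb="$1" -v rtt="$2" \
        'BEGIN { printf "%.0f\n", kb * 1024 * 8 / rtt / 1000 }'
}

max_mbits 4 0.1    # 4 KByte window, assumed 0.1 ms RTT
max_mbits 64 0.1   # 64 KByte window, assumed 0.1 ms RTT
```

With that assumed RTT, the bound for a 4 KByte window falls well below line rate, while for 64 KByte it is far above it. That would fit the pattern in the runs above, where growing the window helps a lot at first and then stops mattering, which suggests something else becomes the limit.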

On the other hand, I’ve tried reversing the test; now I run on the client:
iperf -s

and on the server:

iperf -t 20 -c 192.168.2.3 -w 32K
------------------------------------------------------------
Client connecting to 192.168.2.3, TCP port 5001
TCP window size: 64.0 KByte (WARNING: requested 32.0 KByte)
------------------------------------------------------------
[  3] local 192.168.2.5 port 35903 connected with 192.168.2.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec    455 MBytes    191 Mbits/sec


Wow! It’s only 40% of the speed in the other direction.

I think I must check the cabling

What a useful command!

regards