iowait slows system down

I have recently replaced a very old server (running 11.1 i586) with a brand-new machine with a dual-core Intel CPU and 4 GB of RAM. The new machine has 11.2 i586 installed, with all updates applied.

I soon realised that the new hardware was MUCH slower than the old one (software installation and actual load remaining the same). top shows high %wa (iowait) values, and the system load climbs to 11 where it was normally around 0.2 on the old machine. vmstat 1 shows processes being blocked waiting for harddisk I/O.
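For anyone who wants a number to go with the %wa column, here is a minimal sketch. It assumes the usual layout of the aggregate line in /proc/stat (cpu user nice system idle iowait irq softirq steal ...). The helper name iowait_pct is made up for illustration, and the two sample lines are invented, not measurements from this machine.

```shell
# iowait_pct SAMPLE0 SAMPLE1
# Each argument is one aggregate "cpu ..." line from /proc/stat.
# Prints the percentage of the elapsed CPU time that was accounted as iowait.
# (iowait_pct is a hypothetical helper, not part of any package.)
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # $1 is the "cpu" label; $2..$9 are user, nice, system, idle,
            # iowait, irq, softirq, steal. Fields beyond steal are ignored.
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            iow = $6
        }
        NR == 1 { total0 = total; iow0 = iow }
        NR == 2 {
            elapsed = total - total0
            if (elapsed > 0) printf "%.1f\n", 100 * (iow - iow0) / elapsed
            else             print "0.0"
        }'
}

# Two invented samples: 500 ticks elapsed, 230 of them spent in iowait.
iowait_pct "cpu 100 0 50 800 50 0 0 0" "cpu 150 0 70 1000 280 0 0 0"
```

On a live system the two arguments would be the output of head -n 1 /proc/stat, taken a second or so apart.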

Anyone else seeing this? Could it still be this kernel bug? https://bugzilla.kernel.org/show_bug.cgi?id=12309

If it is, then http://bugzilla.kernel.org/show_bug.cgi?id=12309#c11 might help, and would verify whether it’s the same issue.

I already tried that. The proposed solution was to add ‘elevator=as’ to the boot options. It helps a lot, bringing down the indicated (average) system load by a factor of ~20, but the far superior hardware is still doing worse than the old machine (a single-core Celeron) did. For me, the problem is not solved by using the old scheduler.

My average load is very low, but sometimes a few requests come in simultaneously (such as clients querying popper, triggered by cron) or a job hogs the CPU a bit more (such as inserting downloads into a MySQL fulltext DB), and then other processes are blocked for a few seconds. That’s why I wonder if we still have the iowait bug in kernel 2.6.31.12-0.2-desktop or if this is something different.

I am reluctant to file a bug at this time, because I do not know enough about the problem and the previous report indicated above grew to 422 comments and had to be closed because it was a complete mess, with people reporting a lot of similar but unrelated problems.

For the time being, my question is: do we (still) have the issue of bug #12309, or is this something different?

Far from knowing voodoo, I can’t point you to a bug report, but I will say I’ve seen this about on other distros. There are also reports that some have seen improvements with CK’s new BFS patch set, but I’ve also seen others say they don’t think it makes any difference.

I have to admit I think we’re talking about the same thing. It is normally described in terms of file transfers, but it seems to point to the same bug report.

As for a fix, hmm, well: up to 2.6.33.2 people are still saying it is unresponsive. I can’t really comment, mainly I guess because I don’t do much intensive I/O. Also, some seem to think it may be related to 64-bit (though I am on 64-bit), which you would seem to be confirming.

I guess if you’re up to it you could try CK’s BFS patchset.

Edit
Gentoo Link

Which kernel are you using? Default should be used in a server environment. Desktop is optimized for desktop usage.

Hi there,

Whenever I see high IO wait, I can’t help but become curious about block device throughput. Too often I’ve found a bad drive with utterly crappy performance to be at fault, and it is easy to check.

What kind of results do you get from running hdparm -tT /dev/whatever? (Bonnie++ is good too.)

Also, a quick check with smartctl -a /dev/whatever might be interesting as well (pending and reallocated sector count, etc.)

Your issue may be utterly unrelated to drive health, but then again an IO wait bottleneck can also easily be created by a failing drive.

Cheers,
LewsTherin

@LewsTherinTelemon

hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   2952 MB in  2.00 seconds = 1476.60 MB/sec
 Timing buffered disk reads:  332 MB in  3.01 seconds = 110.28 MB/sec

The device is a WDC WD10EARS-00Y5B1 and smartctl shows it to be totally healthy (as it’s supposed to be as this is a brand new disk).

@gogalthorp

uname -a
Linux xxxxxx 2.6.31.12-0.2-desktop #1 SMP PREEMPT 2010-03-16 21:25:39 +0100 i686 i686 i386 GNU/Linux

@FeatherMonkey

I’ve seen this about on other distro’s.

Me too. It seems to be a genuine kernel problem, hitting any distro.

also some seem to think it maybe related 64bit(though I am), which you would seem to be confirming.

I use i586, so it’s not related to 64bit (I think).

Do you have a reason to run 32-bit? I don’t know any tech details, but I’ve experienced some trouble with the latest distros when 32-bit versions were used on 64-bit systems. Might be kernel related, I don’t know. I would at least give the x86_64 version a try and see if the same problem exists.
Likewise, I never use the kernel-desktop version on a server.

I found that the story goes on here:

https://bugzilla.kernel.org/show_bug.cgi?id=13347

Reading the posts in this kernel bug thread, I come to these conclusions:

  1. It appears to be hardware related. ASUS motherboards (P5KPL-AM SE, P5B-Deluxe, P5K) are reported to freeze under high harddisk I/O load.

  2. This is said to improve the situation:

Revert two kernel patches:

https://bugzilla.kernel.org/show_bug.cgi?id=12309#c360

or apply this:

# echo 50 > /proc/sys/vm/vfs_cache_pressure
# echo deadline > /sys/block/DEVICE/queue/scheduler
# echo 1024 > /sys/block/DEVICE/queue/nr_requests

where DEVICE is sda or whatever. Possibly line #2 above alone can do the trick.
https://bugzilla.kernel.org/show_bug.cgi?id=13347#c20

  3. The real issue is still unresolved, and when it hits, it makes a server almost unusable.
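Regarding the scheduler switch in the list above: /sys/block/DEVICE/queue/scheduler lists the available schedulers on one line and marks the active one with square brackets, so it is easy to confirm that the echo took effect. A small sketch follows. The function name active_scheduler is hypothetical, and the sample lines are illustrative, not copied from the machine in this thread.

```shell
# active_scheduler LINE
# LINE is the content of /sys/block/DEVICE/queue/scheduler, for example
# "noop anticipatory deadline [cfq]". Prints the bracketed (active) entry.
# (active_scheduler is a made-up helper name.)
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

active_scheduler "noop anticipatory deadline [cfq]"    # prints: cfq
active_scheduler "noop anticipatory [deadline] cfq"    # prints: deadline
```

On a real system this would be called as active_scheduler "$(cat /sys/block/sda/queue/scheduler)". Values written with echo do not survive a reboot. The elevator= boot option mentioned earlier in the thread is the persistent way to pick a scheduler.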

Do you have a reason to run 32bit ?

Yes, several. One of them is crm114 (used for mail filtering here) which does not work reliably when compiled on a 64 bit system.

Try the default kernel, not the desktop one. The way time slices are allocated is different and may affect high-I/O situations.

Try the default kernel not the desktop.

I will do that. But I have a very strong feeling that the real cause of my trouble is here:

Suse 11.2 on WD HDD Green 1TB, can it work? - openSUSE Forums

My current partitions for /dev/sda:

GNU Parted 1.9.0
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) print
Model: ATA WDC WD10EARS-00Y (scsi)
Disk /dev/sda: 1953525168s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start       End          Size         Type     File system     Flags
 1      63s         4209029s     4208967s     primary  ext3            boot, type=83
 2      4209030s    843075134s   838866105s   primary  ext4            type=83
 3      843075135s  853565579s   10490445s    primary  linux-swap(v1)  type=82
 4      853565580s  1953520064s  1099954485s  primary  ext4            type=83

Pretty ugly, uuuh (should start at sector 64)
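Why sector 63 is ugly: if the drive really works in 4 KiB units internally (an assumption at this point, since parted reports 512B/512B), a partition is aligned only when its start sector is a multiple of 8, because 4096 / 512 = 8. A quick sketch that checks the four start sectors from the listing above; misalignment is a helper name invented for this example.

```shell
# misalignment START_SECTOR
# Prints how many 512-byte sectors START_SECTOR lies past the previous
# 4 KiB boundary (0 means aligned). Hypothetical helper for illustration;
# the factor 8 assumes 4 KiB internal sectors.
misalignment() {
    echo $(( $1 % 8 ))
}

# Start sectors taken from the parted listing above.
for start in 63 4209030 843075135 853565580; do
    echo "start $start: $(misalignment "$start") sectors past a 4 KiB boundary"
done
```

None of the four start sectors is a multiple of 8, so under that assumption every partition on the disk is misaligned.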

Buying latest technology hard disk drives reminds me somewhat of my sig line.

Remember to make the change under a full moon >:)

Sounds like this is the problem.

edit: totally ignore me, this is of course only for drives which have a physical sector size of 4k… duh

http://lwn.net/Articles/377895/

Remember to make the change under a full moon

I don’t know what the phase of the moon currently is, but I made the change. The problem is completely resolved. %wa is low, system load is low too and the harddisks are blazing fast.

What can I say? It was a small step for a harddisk (shifting the alignment by 512 bytes to match 4k blocks) but a giant leap for userkind.
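The arithmetic behind that small step, as a sketch. It assumes the filesystem issues 4 KiB blocks (8 logical sectors of 512 bytes) and that the drive's internal sector is 4 KiB; both are assumptions for illustration, and physical_sectors_touched is a made-up name. A block that starts at logical sector 63 straddles two internal sectors, which is commonly described as forcing the drive to read, modify and rewrite both, while the same block starting at sector 64 fits exactly into one.

```shell
# physical_sectors_touched START_SECTOR
# Number of 4 KiB internal sectors covered by one 4 KiB block
# (8 logical 512-byte sectors) that begins at START_SECTOR.
# Hypothetical helper; the sizes are assumptions stated in the text.
physical_sectors_touched() {
    first=$(( $1 / 8 ))          # internal sector holding the first logical sector
    last=$(( ($1 + 7) / 8 ))     # internal sector holding the last logical sector
    echo $(( last - first + 1 ))
}

physical_sectors_touched 63    # prints 2: the block straddles a boundary
physical_sectors_touched 64    # prints 1: the block is aligned
```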

A word of caution to any users who have one of those Western Digital Caviar Green drives designated WD10EARS, WD15EARS or WD20EARS: do not use the jumper to make them Windows XP compatible, but align the partitions to match the 4 kB sector size used internally. The drive reports 512-byte sectors, but that is not what it uses internally. Start the first partition at sector 64 (instead of the default 63) and make sure that every partition size (counted in 512-byte sectors) is divisible by 8. This is an example:

GNU Parted 2.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) print
Model: ATA WDC WD10EARS-00Y (scsi)
Disk /dev/sda: 1953525168s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start       End          Size        Type     File system     Flags
 1      64s         4112639s     4112576s    primary  ext3            boot
 2      4112640s    963899999s   959787360s  primary  ext4
 3      963900000s  974181599s   10281600s   primary  linux-swap(v1)
 4      974181600s  1953503999s  979322400s  primary  ext4

(parted) unit chs
(parted) print
Model: ATA WDC WD10EARS-00Y (scsi)
Disk /dev/sda: 121601,80,62
Sector size (logical/physical): 512B/512B
BIOS cylinder,head,sector geometry: 121601,255,63.  Each cylinder is 8225kB.
Partition Table: msdos

Number  Start      End            Type     File system     Flags
 1      0,1,1      255,254,62     primary  ext3            boot
 2      256,0,0    59999,254,62   primary  ext4
 3      60000,0,0  60639,254,62   primary  linux-swap(v1)
 4      60640,0,0  121599,254,62  primary  ext4

In my case partition #1 is mounted on /boot, #2 on / and #4 on /home. Use the latest version of parted and you will be very happy with this piece of hardware. These drives are fast and quiet, with low power consumption. The downside is that you reportedly can’t dual-boot with XP.
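To plan such a layout without doing the division by hand, something along these lines works. It is a sketch only: align_up and check_partition are names invented here, and the factor 8 assumes 4 KiB internal sectors addressed as 512-byte logical sectors.

```shell
# align_up SECTOR
# Rounds SECTOR up to the next multiple of 8 (the next 4 KiB boundary).
# Hypothetical helper for illustration.
align_up() {
    echo $(( ($1 + 7) / 8 * 8 ))
}

# check_partition START SIZE
# Succeeds when both START and SIZE (in 512-byte sectors) are multiples of 8.
# Hypothetical helper for illustration.
check_partition() {
    [ $(( $1 % 8 )) -eq 0 ] && [ $(( $2 % 8 )) -eq 0 ]
}

align_up 63    # prints 64, the start sector used for partition 1 above

# Start and size of the four partitions from the example listing (unit s).
while read -r start size; do
    if check_partition "$start" "$size"; then
        echo "start $start size $size: aligned"
    else
        echo "start $start size $size: NOT aligned"
    fi
done <<EOF
64 4112576
4112640 959787360
963900000 10281600
974181600 979322400
EOF
```

All four partitions from the example pass the check, which matches the rule of starting at sector 64 and keeping every size divisible by 8.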

Have fun!

I just have to say, wow! I haven’t seen one of these yet, but I had assumed that just setting the ‘backward compatibility jumper’ back to the ‘works with XP, loses some disk space’ mode would turn everything back and it would work as usual, with all the old software, and the old performance. That ‘assumed’ word can have a real sting in its tail…