
Thread: High CPU Wait States on Intel Core 2 Quad Machine

  1. #1

    High CPU Wait States on Intel Core 2 Quad Machine

    All

    I have a machine that pauses at random. In top, the CPU I/O wait figure is high, often 100% for several seconds. Splitting the display to show individual CPUs shows that one or more CPUs, seemingly at random, sit at or near 100% wait for seconds at a time.

    top is NOT reporting any application with high utilisation when this happens.

    I have determined via iotop that the disks are NOT under any load when this happens.

    Last night I ran memtest from the install CD; it completed 8 passes without error in 11 1/2 hours.

    My question is: does the CPU also have to wait for network I/O, or is there some sort of "buffer" between the two? This machine is accessed via ssh to run X applications on other machines, so network traffic is constantly high.
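
    For reference, the per-CPU wait figure that top displays is derived from the counters in /proc/stat, where iowait is the fifth value on each cpuN line. Below is a minimal Python sketch, not a definitive tool, that samples those counters twice and reports the iowait share per CPU. The function names are made up for illustration.

```python
import time


def parse_cpu_times(stat_text):
    """Return {cpu_name: [jiffy counters]} for each per-CPU line of /proc/stat."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 user nice system idle iowait irq softirq steal ..."
        # The aggregate "cpu" line is skipped on purpose.
        if len(fields) >= 6 and fields[0].startswith("cpu") and fields[0] != "cpu":
            times[fields[0]] = [int(v) for v in fields[1:]]
    return times


def iowait_percent(before, after):
    """Percentage of elapsed jiffies each CPU spent in iowait between two samples."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        # Use only the first 8 counters; later ones (guest) are already part of user time.
        deltas = [n - o for n, o in zip(new[:8], old[:8])]
        total = sum(deltas)
        # Field order is user, nice, system, idle, iowait, ... so iowait is index 4.
        result[cpu] = 100.0 * deltas[4] / total if total else 0.0
    return result


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return per-CPU iowait %."""
    with open(path) as f:
        first = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_times(f.read())
    return iowait_percent(first, second)
```

    Calling sample_iowait() in a loop and logging the result with a timestamp would give a record of which CPUs were stuck in wait, and when, that can be lined up against other logs afterwards.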

    Specifications.

    openSUSE 11.1 64 bit
    Kernel: (from uname) 2.6.27-29-0.1-default #1 SMP x86-64

    CPU: (from /proc/cpuinfo) Intel Core 2 Quad CPU Q8400 @ 2.66 GHz
    It sees 4 CPUs

    Memory: 8 GB

    Motherboard: Intel P5QL Pro Chipset P43.

    Disk controller: ICH10 southbridge.

    Disks: 4 SATA 1TB disks. The / partition is on an md RAID mirror across 2 disks. The other 2 disks are mounted under /mnt.
    All partitions are formatted ext3.

    Swap: is on one of the mirrored disks.

    /boot: is on the other disk of the mirror set.

    Network: is a PCIe Gigabit LAN Controller connected to a 100 Mbit switch.


    Cheers
    Jim

  2. #2
    Join Date
    Oct 2008
    Location
    Birmingham. AL
    Posts
    858

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    That's a relatively high-powered system, so you'd think it would fly. Having 8 Gig of RAM makes a big difference, too.

    Try latencytop; it's in the Build repositories. I've never tried posting a "ymp" link here; see if this works: http://software.opensuse.org/ymp/dev...latencytop.ymp. If not, go to "software.opensuse.org," click the "search" item on the left, and enter "latencytop" in the search box.

    See if latencytop will give you some idea of what's happening and post back here. I'm intrigued.

  3. #3

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Quote: Originally posted by smpoole7
    That's a relatively high-powered system, so you'd think it would fly. Having 8 Gig of RAM makes a big difference, too.
    That is why it is annoying me... I have a Core 2 Duo laptop, admittedly running 11.0, that has no such issues.

    Quote: Originally posted by smpoole7
    Try latencytop; it's in the Build repositories. I've never tried posting a "ymp" link here; see if this works: http://software.opensuse.org/ymp/dev...latencytop.ymp. If not, go to "software.opensuse.org," click the "search" item on the left, and enter "latencytop" in the search box.
    Yep the ymp link worked... Thanks for that.

    Quote: Originally posted by smpoole7
    See if latencytop will give you some idea of what's happening and post back here. I'm intrigued.
    Well, firing it up for the first time, I don't really know what I am looking at, but the first thing I see is pdflush 3524.8, md0_raid1 3524.7, kjournald 3523.4.

    I assume these are milliseconds, as that is the unit in the next window when I click on each of these entries. 3.5 seconds is a long time in computer terms.

    Whoops, the machine just paused again, and this time md0_raid1 (specifically "Raid resync kernel thread") was up at 17700 ms.

    I wonder if it is indeed a disk issue. I might check the SMART stats and see if one of the disks is having problems.
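
    Since latencytop points at the md0_raid1 resync thread, one quick check is whether the mirror is actually resyncing, which /proc/mdstat reports. A rough Python sketch follows, assuming the usual mdstat progress line ("resync = 12.6% (...) finish=...min speed=...K/sec"). The function name is made up, and other mdstat layouts (for example a delayed resync) are simply not matched.

```python
import re

# Matches the progress line md prints while an array is resyncing, recovering or being checked.
SYNC_RE = re.compile(
    r"(resync|recovery|check)\s*=\s*([\d.]+)%.*?finish=([\d.]+)min.*?speed=(\d+)K/sec"
)


def md_sync_status(mdstat_text):
    """Return {array: (action, percent_done, finish_minutes, speed_k_per_sec)}.

    Only arrays that currently show a progress line are included.
    """
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array stanza, e.g. "md0 : active raid1 ..."
            current = header.group(1)
            continue
        match = SYNC_RE.search(line)
        if match and current:
            status[current] = (
                match.group(1),
                float(match.group(2)),
                float(match.group(3)),
                int(match.group(4)),
            )
    return status


def read_md_sync_status(path="/proc/mdstat"):
    """Convenience wrapper that parses the live /proc/mdstat."""
    with open(path) as f:
        return md_sync_status(f.read())
```

    An empty result from read_md_sync_status() during a pause would suggest that no resync is in progress at that moment, which would be worth knowing before blaming the disks.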

    Thanks again for the heads up on the latencytop tool.

    Cheers
    Jim

  4. #4
    Join Date
    Oct 2008
    Location
    Birmingham. AL
    Posts
    858

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Jim,

    Might not hurt to poke around in the BIOS settings, too. Wait states are a necessary evil, given the difference in speed between the CPU(s) and the memory bus. But it's possible that with a little judicious tinkering, you could speed that up.

    Are you running software or hardware raid? The ICH10R supports hardware; is that what you have?

    I'm headed to bed for the night, but I'm still intrigued. When one shells out the bucks for a loaded system like yours, one expects a load of results.

  5. #5

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Well this has me puzzled.

    I ran some tests on the disks using the smartctl utility.

    All disks pass and no errors are reported.

    I then used hdparm -T to test the cached read performance. This also seems fine, with each disk reporting throughput of approx 1.6 GB/s.

    Even /dev/md0 reports 1.5 GB/s, and this was during a period of high wait states. I am guessing, as its output paused, that hdparm also froze and so did not account for the pause in its timing. I had top running and the wait states went to around 80% or so on 3 of the 4 processors.

    When I get a chance later tonight I am going to have a closer look at the BIOS settings. Reading the manual, I notice there is a series of overclocking settings, and I wonder whether the machine builder got a bit too keen on some of them.

    Jim

    Test results to follow (too big to fit in this post)

    ...

  6. #6

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Output from tests (Part 1):

    # hdparm -T /dev/sda

    /dev/sda:
    Timing cached reads: 3202 MB in 2.00 seconds = 1600.75 MB/sec

    /dev/sda:
    Timing cached reads: 3238 MB in 2.00 seconds = 1619.27 MB/sec

    /dev/sda:
    Timing cached reads: 3286 MB in 2.00 seconds = 1642.61 MB/sec

    /dev/sda:
    Timing cached reads: 3162 MB in 2.00 seconds = 1581.46 MB/sec

    /dev/sda:
    Timing cached reads: 3118 MB in 2.00 seconds = 1558.72 MB/sec

    /dev/sda:
    Timing cached reads: 3234 MB in 2.00 seconds = 1617.50 MB/sec


    # hdparm -T /dev/sdb

    /dev/sdb:
    Timing cached reads: 3248 MB in 2.00 seconds = 1624.35 MB/sec

    /dev/sdb:
    Timing cached reads: 3250 MB in 2.00 seconds = 1624.96 MB/sec

    /dev/sdb:
    Timing cached reads: 3240 MB in 2.00 seconds = 1619.67 MB/sec


    # hdparm -T /dev/sdc

    /dev/sdc:
    Timing cached reads: 3238 MB in 2.00 seconds = 1618.84 MB/sec

    /dev/sdc:
    Timing cached reads: 3268 MB in 2.00 seconds = 1633.55 MB/sec

    /dev/sdc:
    Timing cached reads: 3276 MB in 2.00 seconds = 1638.22 MB/sec

    /dev/sdc:
    Timing cached reads: 3258 MB in 2.00 seconds = 1629.39 MB/sec


    # hdparm -T /dev/sdd

    /dev/sdd:
    Timing cached reads: 3224 MB in 2.00 seconds = 1611.91 MB/sec

    /dev/sdd:
    Timing cached reads: 3222 MB in 2.00 seconds = 1610.95 MB/sec

    /dev/sdd:
    Timing cached reads: 3226 MB in 2.00 seconds = 1612.82 MB/sec

    /dev/sdd:
    Timing cached reads: 3258 MB in 2.00 seconds = 1629.37 MB/sec


    # hdparm -T /dev/md0

    /dev/md0:
    Timing cached reads: 3200 MB in 2.00 seconds = 1599.65 MB/sec

    /dev/md0:
    Timing cached reads: 3264 MB in 2.00 seconds = 1631.64 MB/sec

    /dev/md0:
    Timing cached reads: 3162 MB in 2.00 seconds = 1580.80 MB/sec
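
    A minimal Python sketch for averaging the hdparm figures above per device, assuming the output has been saved as text in the same layout. The function name is made up for illustration. Worth keeping in mind: per the hdparm man page, -T times reads from the Linux buffer cache without touching the disk, so these numbers say more about memory and CPU than about the drives; -t is the option that times actual device reads.

```python
import re
from collections import defaultdict

# Matches lines such as "Timing cached reads: 3202 MB in 2.00 seconds = 1600.75 MB/sec"
RATE_RE = re.compile(r"Timing cached reads:.*=\s*([\d.]+)\s*MB/sec")


def mean_cached_read_rates(hdparm_output):
    """Return {device: mean MB/sec} from repeated `hdparm -T` runs."""
    rates = defaultdict(list)
    device = None
    for raw_line in hdparm_output.splitlines():
        line = raw_line.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            # hdparm prints the device name on its own line before each result.
            device = line[:-1]
            continue
        match = RATE_RE.search(line)
        if match and device:
            rates[device].append(float(match.group(1)))
    return {dev: sum(values) / len(values) for dev, values in rates.items()}
```

    Fed the three /dev/sdb runs above, this gives a mean of roughly 1623 MB/sec, in line with the other devices.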

  7. #7

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Output of tests (Part 2):


    Code:
    # smartctl -a /dev/sda
    smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
    Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Device Model:     WDC WD10EADS-00M2B0
    Serial Number:    WD-WCAV51010226
    Firmware Version: 01.00A01
    User Capacity:    1,000,204,886,016 bytes
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   8
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Sun Oct  4 11:50:50 2009 NZDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x84) Offline data collection activity
                                            was suspended by an interrupting command from host.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 (20400) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 235) minutes.
    Conveyance self-test routine
    recommended polling time:        (   5) minutes.
    SCT capabilities:              (0x303f) SCT Status supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0027   115   112   021    Pre-fail  Always       -       7241
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       903
     10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11
    193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7205
    194 Temperature_Celsius     0x0022   118   114   000    Old_age   Always       -       29
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    
    
    # smartctl -a /dev/sdb
    smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
    Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Device Model:     WDC WD10EADS-00M2B0
    Serial Number:    WD-WCAV51028064
    Firmware Version: 01.00A01
    User Capacity:    1,000,204,886,016 bytes
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   8
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Sun Oct  4 11:52:54 2009 NZDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x84) Offline data collection activity
                                            was suspended by an interrupting command from host.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 (21600) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 248) minutes.
    Conveyance self-test routine
    recommended polling time:        (   5) minutes.
    SCT capabilities:              (0x303f) SCT Status supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0027   109   109   021    Pre-fail  Always       -       7508
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       903
     10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11
    193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7091
    194 Temperature_Celsius     0x0022   121   112   000    Old_age   Always       -       26
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.

  8. #8

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Output of tests (Part 3)

    Code:
    # smartctl -a /dev/sdc
    smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
    Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Device Model:     WDC WD10EADS-00M2B0
    Serial Number:    WD-WCAV51010266
    Firmware Version: 01.00A01
    User Capacity:    1,000,204,886,016 bytes
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   8
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Sun Oct  4 11:52:59 2009 NZDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x84) Offline data collection activity
                                            was suspended by an interrupting command from host.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 (21600) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 248) minutes.
    Conveyance self-test routine
    recommended polling time:        (   5) minutes.
    SCT capabilities:              (0x303f) SCT Status supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0027   117   117   021    Pre-fail  Always       -       7133
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       902
     10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
    193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always       -       14632
    194 Temperature_Celsius     0x0022   121   116   000    Old_age   Always       -       26
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    
    
    # smartctl -a /dev/sdd
    smartctl 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM)
    Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Device Model:     WDC WD10EADS-00M2B0
    Serial Number:    WD-WCAV51025602
    Firmware Version: 01.00A01
    User Capacity:    1,000,204,886,016 bytes
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   8
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Sun Oct  4 11:53:04 2009 NZDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x82) Offline data collection activity
                                            was completed without error.
                                            Auto Offline Data Collection: Enabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                 (19200) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 221) minutes.
    Conveyance self-test routine
    recommended polling time:        (   5) minutes.
    SCT capabilities:              (0x303f) SCT Status supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
      3 Spin_Up_Time            0x0027   107   107   021    Pre-fail  Always       -       7608
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       897
     10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
    193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2208
    194 Temperature_Celsius     0x0022   121   116   000    Old_age   Always       -       26
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
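
    With four near-identical reports, it is easier to compare the attribute tables programmatically than by eye. Here is a small Python sketch, assuming the `smartctl -a` table layout shown above (ten columns, raw value last). The function names and the watch-list of attributes are my own choices for illustration, not anything smartctl defines.

```python
# Attributes whose raw value is commonly expected to stay at 0 on a healthy disk
# (an illustrative selection, not an official list).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_smart_attributes(smartctl_output):
    """Return {attribute_name: raw_value} from the attribute table of `smartctl -a`."""
    attrs = {}
    in_table = False
    for line in smartctl_output.splitlines():
        fields = line.split()
        if fields[:2] == ["ID#", "ATTRIBUTE_NAME"]:
            in_table = True
            continue
        if not in_table:
            continue
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                # Column 10 is RAW_VALUE; skip rows where it is not a plain integer.
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                pass
        else:
            # The first line that is not an attribute row (a blank line) ends the table.
            in_table = False
    return attrs


def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is not zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) != 0}
```

    On the four reports above, nonzero_watched() would return nothing for any disk. The one attribute that differs noticeably between drives with similar power-on hours is Load_Cycle_Count, which ranges from 2208 on sdd to 14632 on sdc.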

  9. #9

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Checked the BIOS settings and all the Overclocking stuff was set to defaults of Auto. So nothing suspicious there.

    One thing the BIOS does have is a Hardware Monitor screen showing things like Fan Speeds and CPU Temps etc. CPU is running at approx 36 C, so not too hot.

    I would not worry too much about the CPU wait states being high, except that the whole machine freezes for seconds at a time, which makes it almost unusable for real work.

    Jim

  10. #10

    Re: High CPU Wait States on Intel Core 2 Quad Machine

    Even after the following changes I still cannot find what is going on.

    Running latencytop over several days and watching it during hangs, I noticed that it shows high latency during disk operations: fsync(), Writing page to disk, etc. It also mentions Page Fault, which I know as a memory operation to do with swap, but the machine has 8 GB of memory and the swap is empty, so I might be wrong there.

    Anyway I have tried the following - with results included:

    Found a mailing list post at:

    Linux-Kernel Archive: Re: Finding what is stuck...

    which mentions using:

    echo noop > /sys/block/sda/queue/scheduler

    which changes the I/O scheduler to noop instead of CFQ. I did this on all disks.

    No difference.

    changed back to CFQ
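
    When switching schedulers back and forth it is easy to lose track of what each disk is using. The kernel marks the active scheduler with square brackets when the sysfs file is read (for example "noop anticipatory deadline [cfq]"). A small Python sketch to list the active one per disk, with made-up function names and assuming the disks show up as sd* under /sys/block:

```python
import glob
import os


def active_scheduler(contents):
    """Return the bracketed (active) scheduler name from a queue/scheduler file."""
    for name in contents.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def schedulers_by_disk(pattern="/sys/block/sd*/queue/scheduler"):
    """Return {disk: active scheduler} for every sysfs file matching `pattern`."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        # .../sda/queue/scheduler -> "sda"
        disk = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            result[disk] = active_scheduler(f.read())
    return result
```

    Running schedulers_by_disk() before and after the echo would confirm that the change took effect on all four disks and was reverted afterwards.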

    =//=

    Mounted the disks as ext2

    No difference

    Mounted back as ext3

    =//=

    Reran the memory test: 7 passes in 9 hours or so.

    No Errors

    =//=

    Started the machine using the Fail Safe option

    No Difference

    Rebooted back to Normal

    =//=

    Updated BIOS from v1001 to v1004

    No difference

    =//=

    Updated the CPU microcode, downloaded from the Intel website

    No difference

    =//=

    Now I am running out of ideas...

    Jim
