system freezes are destroying my hard drive, kernel version 3.0-3.4

I’ve been trying to get my computer to stop freezing since the day I put it together. It doesn’t freeze in Windows Vista, just Linux, all distros. I’m tired of this.
When I say a system freeze, I mean a complete freeze. Ctrl-Alt-F1 doesn’t work, the LEDs on my keyboard don’t work, and if I was playing sound it loops over and over. I activated SysRq and none of the BUSIER mnemonic keys seem to do anything. I set up kdump, and it does nothing (no files in /var/crash). I’ve used Suse Automated Kernel Compile for kernels 3.0.36, 3.2.18, 3.3.7, and 3.4. None of the kernels keep it from freezing.

I keep ksysguard open and always on top so I know how much memory I have left. It’s always 1GB or less used, and I have 4GB.

I suspected my Sandy Bridge processor, but since it works fine in Windows Vista, I can’t see it being my hardware. It’s got to be either the kernel, some module, something.
I’ve suspected maybe my wireless card, but since it works fine in Windows, I can’t see it being the problem. (My wifi uses the ath5k module.) This is the same wireless card I used back when I was using openSUSE 10.0, and it never caused a crash with that, even using madwifi. Obviously the card was in a different computer at the time.

I’ve used Gnome 3, Gnome fallback, and the latest KDE with and without desktop effects. I had freezes with all of them.

So now I have the problem that I have too many bad sectors on my hard drive for linux to be happy so I had to go into disk utility and switch on the “Don’t warn if disk is failing” box.

I really like using Linux, but it’s supposed to be more stable than Windows. What gives?

Please Help Me.

On 2012-07-09 03:56, cw9000 wrote:
> So now I have the problem that I have too many bad sectors on my hard
> drive for linux to be happy so I had to go into disk utility and switch
> on the “Don’t warn if disk is failing” box.

Let’s concentrate on the disk, because one thing that can crash Linux is a
failing disk. A crashing system cannot cause bad sectors; a corrupted
filesystem, yes.

Run the SMART long test. Ask if you don’t know how.

Is Windows using the same hard disk?


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

Yes, Windows is using the same hard disk. And no, I don’t know how to run the SMART long test.

EDIT: I ran

smartctl --test=long /dev/sda

and it says it will take 203 minutes

I’m guessing that the video drivers are causing the freezing.

I have similar graphics, and it is likely to freeze whenever the brightness of the screen is changed. I configure so that brightness changes are infrequent, and that keeps it running most of the time.

I should mention how frequently it freezes. Usually it freezes within 1 day, 4 or 5 days tops. I can leave Windows Vista running forever and it won’t freeze. I haven’t noticed any freezing with brightness changes, but I have come back to my computer after the monitor turned off and it wouldn’t wake back up. Because that happened, I have

xset -dpms

set in my .bash_profile
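
Since `.bash_profile` is also read by logins that have no X session (a text console or ssh, for example), a guarded version may be safer there. This is only a small sketch; the function name `disable_dpms` is made up for this example.

```shell
# Sketch for ~/.bash_profile: only touch DPMS when an X display is reachable.
# disable_dpms is a hypothetical helper name, not a standard command.
disable_dpms() {
    # No X session (text console, ssh): nothing to do.
    [ -n "${DISPLAY:-}" ] || return 0
    # xset is not installed: nothing to do.
    command -v xset >/dev/null 2>&1 || return 0
    # Turn off monitor power management, as in the post above.
    xset -dpms
}

disable_dpms
```

With this guard, a login without X simply skips the call instead of printing an error about a display it cannot open.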

cw9000 wrote:

> system freezes are destroying my hard drive, kernel version 3.0-3.4

> I’ve been trying to get my computer to stop freezing since the day I put
> it together. It doesn’t freeze in windows vista, just linux, all
> distros. I’m tired of this.

> So now I have the problem that I have too many bad sectors on my hard
> drive for linux to be happy so I had to go into disk utility and switch
> on the “Don’t warn if disk is failing” box.

You have this backward. As Carlos says, system crashes don’t cause bad
sectors. Bad sectors can cause system crashes. So you do need to run
that long smart test that was suggested. And almost certainly, you will
need to buy a new disk.

Your system freezes may be connected with the faulty disk, or they may
be connected with your video as nrickert suggests. That is, you may have
either one or two faults.

On 2012-07-09 05:06, cw9000 wrote:

> EDIT: I ran
>
> Code:
> --------------------
> smartctl --test=long /dev/sda
> --------------------
>
> and it says it will take 203 minutes

Right. After that time, you need to issue “smartctl -A /dev/sda” to see the
results. Post them here and we will help interpret them.
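
A minimal sketch of the follow-up commands, assuming the disk is /dev/sda as above. `-A` prints the attribute table. `-l selftest` (the self-test log) and `-H` (the drive's overall health verdict) are further smartctl options not mentioned in this post, added here as a suggestion. The `run` wrapper and the `DRY_RUN` and `DEVICE` variables are hypothetical helpers for this example: by default the script only prints the commands, and it executes them only with `DRY_RUN=0` (as root).

```shell
#!/bin/sh
# Sketch only: DEVICE, DRY_RUN and run() are illustrative names,
# not part of smartmontools.
DEVICE="${DEVICE:-/dev/sda}"

# Print the command by default; execute it only when DRY_RUN=0 (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Attribute table (VALUE, WORST, THRESH and RAW_VALUE columns).
show_attributes() { run smartctl -A "$DEVICE"; }

# Log of finished self-tests, including the long test started earlier.
show_selftest_log() { run smartctl -l selftest "$DEVICE"; }

# The drive's own overall health verdict.
show_health() { run smartctl -H "$DEVICE"; }

show_attributes
show_selftest_log
show_health
```

The attribute table and the self-test log answer different questions: the first shows counters such as reallocated sectors, the second shows whether the long test itself finished without a read error.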


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

It is sounding more and more like a video problem.

In your KDE power settings, set your computer to never dim the display. Turning off the display is usually not a problem, but dimming it after an idle time is.

If I am leaving my laptop unattended for a long period, I shut it down to avoid the freeze problem.

The good news - freezes seem less frequent in 12.2 (currently testing Beta2), though they still happen.

okay here are the results of the long test:

smartctl -A /dev/sda

smartctl 5.42 2011-10-20 r3458 [i686-linux-3.1.10-1.16-desktop] (SUSE RPM)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net


=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       72073454
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       354
  5 Reallocated_Sector_Ct   0x0033   093   093   036    Pre-fail  Always       -       296
  7 Seek_Error_Rate         0x000f   076   060   030    Pre-fail  Always       -       46888432
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7757
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       354
183 Runtime_Bad_Block       0x0000   100   100   000    Old_age   Offline      -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   093   000    Old_age   Always       -       507
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   046   045    Old_age   Always       -       35 (Min/Max 35/40)
194 Temperature_Celsius     0x0022   035   054   000    Old_age   Always       -       35 (0 20 0 0 0)
195 Hardware_ECC_Recovered  0x001a   029   018   000    Old_age   Always       -       72073454
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       202993039319436
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       3889810569
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3378140532
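
To pull one number out of a table like this without reading it by eye, a small filter can help. This is a sketch that assumes the ten-column layout shown above, with RAW_VALUE as the tenth field; the function name `smart_raw_value` is made up for this example.

```shell
# Print the RAW_VALUE of one attribute from `smartctl -A` output on stdin.
# Usage (as root): smartctl -A /dev/sda | smart_raw_value 5
# smart_raw_value is a hypothetical helper name.
smart_raw_value() {
    # Match the row whose first field is the requested ID, print field 10.
    awk -v id="$1" '$1 == id && NF >= 10 { print $10; exit }'
}

# Example with two rows copied from the output above.
printf '%s\n' \
  '  5 Reallocated_Sector_Ct   0x0033   093   093   036    Pre-fail  Always       -       296' \
  '197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0' |
  smart_raw_value 5
```

For the two sample rows this prints 296, the reallocated sector count.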

On 2012-07-09 20:16, cw9000 wrote:
>
> okay here are the results of the long test:



>   === START OF READ SMART DATA SECTION ===
>   SMART Attributes Data Structure revision number: 10
>   Vendor Specific SMART Attributes with Thresholds:
>   ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

>   5 Reallocated_Sector_Ct   0x0033   093   093   036    Pre-fail  Always       -       296

>   9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7757


Back up that disk and replace it as soon as you can. It has remapped 296
sectors already. I don’t know how much margin you have left, but each
sector that goes bad means lost data and a possible crash.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

I ended up buying a new hard drive, a 2TB Western Digital. (The old one is a Seagate 1TB. If the reason my computer has been freezing is hard drive problems, then I bought a bad drive, because it’s been freezing since the beginning.) I plan on putting videos on the old one. Will simply having it plugged in to the computer cause freezing?

I was going to buy one anyway. I plan on putting openSUSE 12.2 beta 2 on it unless somebody thinks it’s a bad idea.
I used 12.1 milestone 5 before.

I took out my old pci wifi card and am using a cheap usb one for now. I also changed the “dim when idle”.

So if it is my video drivers causing the freezing, how is openSUSE 12.2 beta 2 doing a better job with it? I thought kernels and modules were outside of the distros. And can someone explain why openSUSE is currently on a 3.1.x kernel when 3.5 has already been released over at the kernel archives? I don’t know how that whole thing works. Does openSUSE have any say over the development of kernels? I mean, does openSUSE polish old kernels themselves?

As far as the results of my smartctl test go, I didn’t get an error log (“smartctl -l error /dev/sda”). It says 296 for #5, but none of the values show failure (WHEN_FAILED). And I don’t know what to make of #1; it seems high.

Thanks for any and all info

On 2012-07-10 03:46, cw9000 wrote:
>
> I ended up buying a new hard drive. A 2TB Western Digital ( this one
> is a seagate 1TB, if the reason my computer has been freezing is because
> of hard drive problems, then I bought a bad drive cause its been
> freezing since its beginning).

You could have returned it on warranty.

> I plan on putting videos on this old
> one. Will simply having it plugged in to the computer cause freezing?

Maybe. Maybe not. But I would not trust that disk for anything, for the moment.

> I was going to buy one anyway. I plan on putting opensuse 12.2 beta 2
> on it unless somebody thinks its a bad idea.

If it works for you, it is a good idea :-)

> And can someone explain why opensuse is
> currently on 3.1.x-x kernel and they have already released 3.5 over at
> the kernel archives.

Why should it be? :-))

> as far as the results of my smartctl test go, I didn’t get an error log
> ( “smartctl -l error /dev/sda” ). It says 296 for #5 but none of the
> values show failure (when failed). and I don’t know what to make of #1,
> seems high.

It is difficult to understand. The alarm fires when the normalized value goes
under the threshold (not over it). The raw values are another matter: some go
up, some go down. For the reallocated sector count, the raw value, as I
understand it, is the count itself. Your disk will remap many more sectors if
it needs to, but each failure is bad for your system.

Actually, what is really bad is if the count increases; if it is stable it
doesn’t matter that much (I had one with several bad sectors for several
years without problems). But you are having crashes → bad.
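
The threshold rule described here can be checked mechanically: an attribute counts as failed once its normalized VALUE is at or below THRESH. The sketch below assumes the `smartctl -A` column layout posted earlier in the thread (VALUE in field 4, THRESH in field 6); the function name `smart_margins` is made up for this example.

```shell
# For each attribute row on stdin, print ID, name, and VALUE minus THRESH.
# Rows with a margin of 0 or less are marked FAILING.
# Usage (as root): smartctl -A /dev/sda | smart_margins
# smart_margins is a hypothetical helper name.
smart_margins() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        margin = $4 - $6
        status = (margin <= 0) ? "FAILING" : "ok"
        printf "%s %s margin=%d %s\n", $1, $2, margin, status
    }'
}

# Row copied from the thread: VALUE 093, THRESH 036.
echo '  5 Reallocated_Sector_Ct   0x0033   093   093   036    Pre-fail  Always       -       296' |
  smart_margins
```

For that row the margin comes out as 57, so by the drive's own threshold the attribute has not failed, even though the raw count of 296 is a reason to distrust the disk.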

What you can do is write all zeroes to the full disk, and check the count,
several times, perhaps over a few weeks. If the count remains stable, you
can reuse the disk. :-)

Meanwhile, it is suspect hardware.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

With KDE, the power options in 12.2 allow you to tell it to never dim the screen. With 12.1, the best I could do was give a long time delay.

It has not completely solved the freeze problem, but it is a lot rarer.

Presumably there’s also a newer Xorg and newer video drivers.

@nrickert:

okay, thanks. I’m using Gnome mostly. In Gnome you can just uncheck the “dim to save power” under System Settings->Screen
and/or
gsettings set org.gnome.settings-daemon.plugins.power idle-dim-ac false

if either of those are in fact what you’re talking about.

@robin_listas
when you say zero out the drive you mean

dd if=/dev/zero of=/dev/sda

?
and how do you get a “count”. I’m not following

Thanks

> I was going to buy one anyway. I plan on putting opensuse 12.2 beta 2 on it unless somebody thinks its a bad idea.

I would rather try Tumbleweed. My video-related problems (Ivy Bridge) disappeared with the newer kernel when I switched to Tumbleweed (there is still one problem I have that might be video related), and I guess there was a good reason for postponing the 12.2 release. The difference in support for Sandy Bridge and Ivy Bridge graphics between kernels 3.1.x and 3.4.x is really big.

On 2012-07-10 09:16, Nikos78 wrote:
> I would rather try tumbleweed.
That is of course an option, but if it is only to get a new kernel there is no
need to go for Tumbleweed.
Adding the repo
http://download.opensuse.org/repositories/Kernel:/stable/standard
and switching the kernel to it will bring you to 3.4.4 (and soon to 3.5)
with a standard openSUSE 12.1.
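
As a rough sketch, that could look like the following on 12.1. The repo alias `kernel-stable` is an arbitrary name chosen here, and `kernel-desktop` is an assumed kernel flavor (it matches the `-desktop` suffix in the smartctl banner earlier in the thread); adjust both as needed. The `run` wrapper and `DRY_RUN` are hypothetical helpers: the commands are only printed unless `DRY_RUN=0` is set, since they need root.

```shell
#!/bin/sh
REPO_URL="http://download.opensuse.org/repositories/Kernel:/stable/standard"
REPO_ALIAS="kernel-stable"     # arbitrary alias, an assumption
KERNEL_PKG="kernel-desktop"    # assumed kernel flavor

# Print the command by default; execute it only when DRY_RUN=0 (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Add the repository with autorefresh enabled.
run zypper addrepo --refresh "$REPO_URL" "$REPO_ALIAS"

# Install the kernel package from that repository specifically.
run zypper install --from "$REPO_ALIAS" "$KERNEL_PKG"
```

The currently installed kernel normally stays available in the boot menu, which gives a way back if the newer one misbehaves.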


PC: oS 12.1 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.8.4 | GeForce GT 420
ThinkPad E320: oS 12.1 x86_64 | i3@2.30GHz | 8GB | KDE 4.8.4 | HD 3000
eCAFE 800: oS 12.1 i586 | AMD Geode LX 800@500MHz | 512MB | KDE 3.5.10

On 2012-07-10 06:16, cw9000 wrote:

> @robin_listas
> when you say zero out the drive you mean
>
> Code:
> --------------------
> dd if=/dev/zero of=/dev/sda
> --------------------

Yep.

> ?
> and how do you get a “count”. I’m not following

If you mean the “count” parameter to dd, you do not need one: you are doing
the entire disk, it will stop when it is finished. Ah, you mean the “count”
in my post. I meant the reallocated count in the output of SMART log.

Remember that the procedure is destructive! It deletes everything. It
exercises the disk and tries to find out if it is still reliable or it
keeps degrading.
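
A sketch of how the count could be tracked between wipes, assuming smartctl is run as root and the attribute table has the layout posted earlier in the thread. The log file name and the helper names `log_reallocated` and `count_is_stable` are made up for this example. The dd line is left commented out on purpose because it erases the whole disk; `bs=1M` is an addition here that only makes the write go faster.

```shell
#!/bin/sh
LOG="${LOG:-realloc.log}"    # hypothetical log file name

# Destructive step, run by hand and only on the disk you mean to wipe:
#   dd if=/dev/zero of=/dev/sda bs=1M

# Append "timestamp count" to the log; reads `smartctl -A` output on stdin.
# Usage (as root): smartctl -A /dev/sda | log_reallocated
log_reallocated() {
    # Attribute 5 is Reallocated_Sector_Ct; field 10 is its raw value.
    count=$(awk '$1 == 5 && NF >= 10 { print $10; exit }')
    [ -n "$count" ] || return 1
    echo "$(date +%Y-%m-%dT%H:%M:%S) $count" >> "$LOG"
}

# Succeeds only when every logged count is the same number.
count_is_stable() {
    awk '!seen[$2]++ { n++ } END { exit (n == 1) ? 0 : 1 }' "$LOG"
}
```

Run `log_reallocated` after each wipe; if `count_is_stable` still succeeds after several passes over a few weeks, the count has not grown.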


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

As my thread states, I’ve used kernels 3.0 - 3.4 and didn’t find a single one of them to be more stable than the other. That’s what is so frustrating. Thanks for the info, though.

@martin_helm
Thanks for the link to the kernel repo. I didn’t know it existed. You seem to always know where the repos are. :-)

On 2012-07-10 17:46, cw9000 wrote:
> @martin_helm
> Thanks for the link to the kernel repo. I didn’t know it existed. You
> seem to always know where the repos are. :-)
>
Only the ones I have a use for ;-)



I just wanted to say that I changed my hard drive, changed my wifi (now internet works a lot better), unchecked “dim screen to save power”, and changed to openSUSE 12.2. My computer couldn’t be happier, well, except for a few new things from openSUSE 12.2. My faith in Linux is restored. :-)