A strange greeting from the kernel (11.4)

On a fresh install of openSUSE 11.4 i586 (vmlinuz-2.6.37.1-1.2-desktop) the following appeared in the terminal window:

Message from syslogd@phoenix at Apr 20 13:05:41 ...
 kernel: [ 4787.436077] Uhhuh. NMI received for unknown reason 3d on CPU 0.

Message from syslogd@phoenix at Apr 20 13:05:41 ...
 kernel: [ 4787.436095] Do you have a strange power saving mode enabled?

Message from syslogd@phoenix at Apr 20 13:05:41 ...
 kernel: [ 4787.436104] Dazed and confused, but trying to continue

When this happened, YaST was running and installing texlive (which may not be related at all). What is going on here?
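
For readers wondering what the "3d" means: as far as I understand the x86 NMI handler of this kernel generation, the reason value is the status byte read from I/O port 0x61, and the kernel only recognises two sources in it, bit 7 (system error, historically memory parity) and bit 6 (I/O channel check). If neither bit is set, the "unknown reason" message is printed. The sketch below only illustrates that bit test; the constant and function names are my own, and the bit layout is an assumption based on the classic PC/AT design rather than something stated in this thread.

```python
# Bit masks for the NMI status byte read from I/O port 0x61.
# Assumption: classic PC/AT layout as tested by the x86 NMI handler.
NMI_REASON_SERR = 0x80   # system error, historically memory parity
NMI_REASON_IOCHK = 0x40  # I/O channel check raised by a device


def classify_nmi_reason(reason: int) -> str:
    """Return which known NMI source bits are set in the reason byte."""
    sources = []
    if reason & NMI_REASON_SERR:
        sources.append("SERR")
    if reason & NMI_REASON_IOCHK:
        sources.append("IOCHK")
    # Neither bit set is what the kernel reports as "unknown reason".
    return "+".join(sources) if sources else "unknown"


# The value from the message above: 0x3d has neither bit 7 nor bit 6 set.
print(classify_nmi_reason(int("3d", 16)))
```

So the number itself mostly says what the NMI was not: it was neither a reported parity error nor an I/O check.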

On 04/20/2011 01:36 PM, vodoo wrote:
> NMI received for unknown reason 3d

looks like a kernel bug, http://tinyurl.com/3n6jmav


CAVEAT: http://is.gd/bpoMD
[openSUSE 11.3 + KDE4.5.5 + Thunderbird3.1.8 via NNTP]
A Penguin Being Tickled - http://www.youtube.com/watch?v=0GILA0rrR6w

Thank you. Red Hat says:

Should be fixed in 2.6.38-rc6-git2

For openSUSE we are stuck with 2.6.37, so this would have to be backported. Uhhh

vodoo wrote:

>
> Thank you. Redhat says:
>
>> Should be fixed in 2.6.38-rc6-git2
>
> For openSUSE we are stuck with 2.6.37, so this would have to be
> backported. Uhhh
>
Why are you stuck with that? Is there something with your hardware which
stops you from updating to a newer kernel?
If you think you are stuck just because you do not know where to get a new
kernel look here:
http://download.opensuse.org/repositories/Kernel:/stable/standard

You have to check if the kernels there are new enough to include that fix (I
did not check that now).
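
A rough way to do that check by version number alone is sketched below. It assumes the fix really landed in 2.6.38-rc6-git2 as quoted above, so any final 2.6.38 or later release should contain it, while an -rc build of 2.6.38 itself is treated as uncertain. The function names are hypothetical, and a distribution kernel can carry backported patches, so a negative answer here does not prove the fix is missing.

```python
import re

# First final release after 2.6.38-rc6-git2 (the version quoted in this thread).
FIX_RELEASE = (2, 6, 38)


def base_version(release: str) -> tuple:
    """Extract the leading dotted numeric part of a kernel release string."""
    match = re.match(r"(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def may_include_fix(release: str) -> bool:
    """Judge by version number only whether a release postdates the fix."""
    base = base_version(release)[:3]
    if "-rc" in release:
        # An -rc of 2.6.38 may predate rc6-git2, so only later series count.
        return base > FIX_RELEASE
    return base >= FIX_RELEASE


print(may_include_fix("2.6.37.1-1.2-desktop"))
```

To test the running system instead of a literal string, `platform.release()` from the standard library returns the same text as `uname -r`.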


PC: oS 11.3 64 bit | Intel Core2 Quad Q8300@2.50GHz | KDE 4.6.1 | GeForce
9600 GT | 4GB Ram
Eee PC 1201n: oS 11.4 64 bit | Intel Atom 330@1.60GHz | KDE 4.6.0 | nVidia
ION | 3GB Ram

> Why are you stuck with that? Is there something with your hardware which stops you from updating to a newer kernel?

No, it’s rather something with my brainware stopping me. Let me explain: I am configuring a spare system which may eventually become a production server (for DNS, Apache, MySQL and a bit more). I want it to be rock solid and reliable. I chose 11.4 because it offers a longer remaining lifetime with updates and support.

I know that openSUSE developers strictly avoid updating to a higher kernel release during the life of a SUSE release. They must have a very good reason. 11.4 will remain with 2.6.37.x and - as far as I understand - this is because the kernel API may change with the next release, 2.6.38.x. 11.4 has been extensively tested with 2.6.37, and some apps may (or may not) break with a newer kernel.

This is my concern. Upgrading to a newer kernel may fix one problem but introduce an unknown number of new problems or cause instabilities. Am I right here, or would you recommend to “just try it”?

On 2011-04-21 10:36, vodoo wrote:
> This is my concern. Upgrading to a newer kernel may fix one problem but
> introduce an unknown number of new problems or cause instabilities. Am I
> right here, or would you recommend to “just try it”?

Yes, you are right.


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

On 2011-04-21 10:36, vodoo wrote:
> This is my concern. Upgrading to a newer kernel may fix one problem but
> introduce an unknown number of new problems or cause instabilities. Am I
> right here, or would you recommend to “just try it”?
>
I did not recommend updating or not updating. I asked why you said that you
are stuck with 2.6.37, since you gave no rationale, and showed you where to
find a newer one.
I also never update a kernel without urgent need.
What you can do is upgrade the kernel but keep the old one installed, so that
you can boot it again if something does not work as expected.
And there is also a reason why the repository for which I gave you the link
has stable in its name.
In the end you have to decide whether it is worth a try or not; nobody else
can decide that for you.


PC: oS 11.3 64 bit | Intel Core2 Quad Q8300@2.50GHz | KDE 4.6.1 | GeForce
9600 GT | 4GB Ram
Eee PC 1201n: oS 11.4 64 bit | Intel Atom 330@1.60GHz | KDE 4.6.0 | nVidia
ION | 3GB Ram

I experienced this exact problem with the early 2.6.37 kernels. Other variants (-30, -31, -20 et al.) also appeared, raising concerns of a possible hardware problem. Subsequent upgrades to the 2.6.37-39.1 kernel corrected this, and the problem has not reappeared up to and including kernel 2.6.39-rc4-9.

One reason I initially suspected a hardware problem was that the messages only occurred on PCs with RADEON graphics. The otherwise problematic Intel Arrandale showed no evidence of the problem, nor did an Intel 855GM.

As to “being stuck”, I have regularly run upgraded kernels from the Kernel/HEAD repositories (/openSUSE_11.3 and /openSUSE_11.4), with only a single problem. Do take additional care with /openSUSE_Factory!

I have the same problem on my laptop (ASUS A6KT) with openSUSE 11.4.
I did not have this problem with openSUSE 11.2.

I suspect the open source radeon driver. I have read that some cards require specific firmware. Is this correct?
My graphics card is an ATI Radeon X1600 mobile. This card is supposed to be well supported by the RADEON driver, but it clearly isn’t.
Default:

  • glxgears shows nothing (got this fixed now by using a drirc file)
  • KDE desktop effects cannot be enabled
  • movies are jerky, about 1 frame/second

When I connect a second screen to my VGA connector, I get an avalanche of these messages (multiple errors per second).

On openSUSE 11.2:

  • glxgears just worked
  • KDE desktop effects could be enabled; they were far too slow, but they worked
  • movies played smoothly, except for HD movies.

This weekend I upgraded my kernel to stable (2.6.38.xx) from the openSUSE build service (download.opensuse.org), as mentioned by martin_helm above.
The first thing I saw when booting was this error message, so the problem is not solved by upgrading to 2.6.38. Besides that, the 2.6.38 kernel did not manage to boot completely.

Any suggestions?

For us, this was simply a power supply failure; everything works now that we have replaced it. Go figure.

Various other blogs point to possible memory failures or PCI device failures. I didn’t check the current/voltage output of the power supply, but maybe it dropped too low as the unit wore out.

Bottom line is that this error message does not contain specific enough information to quickly track down the error.
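
What the message does give you is the reason byte and the CPU number, so one thing that can be done is to tally them over time and see whether they cluster around a particular event (plugging in a monitor, heavy load, and so on). Below is a minimal sketch that counts reason/CPU pairs in log lines; the regular expression is modelled on the message quoted at the top of this thread, the function name is made up for illustration, and which log file to feed it (for example /var/log/messages) is an assumption that depends on your syslog setup.

```python
import re
from collections import Counter

# Pattern modelled on the kernel message quoted at the top of this thread.
NMI_LINE = re.compile(
    r"NMI received for unknown reason ([0-9a-fA-F]{2}) on CPU (\d+)"
)


def tally_nmi_reasons(lines):
    """Count (reason, cpu) pairs across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = NMI_LINE.search(line)
        if match:
            reason = match.group(1).lower()
            cpu = int(match.group(2))
            counts[(reason, cpu)] += 1
    return counts


# Two lines in the format seen above; only the first one carries a reason code.
sample = [
    " kernel: [ 4787.436077] Uhhuh. NMI received for unknown reason 3d on CPU 0.",
    " kernel: [ 4787.436095] Do you have a strange power saving mode enabled?",
]
print(tally_nmi_reasons(sample))
```

To run it on a real log, pass an open file object instead of the sample list, since iterating over a file yields its lines.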