Kernel BUG - ASUS P5Q Pro

aruzsi · December 16, 2008, 7:40pm

Hi,

What does it mean:
kernel: BUG: unable to handle kernel paging request at 0000000000100100

4 GB RAM + 4 GB swap.
The machine could not do anything except answer ping requests on eth1 (3Com 509B).

BTW: upgraded kernel to:
kernel-default-2.6.25.18-0.2

TIA,

lwfinger · December 16, 2008, 8:26pm

aruzsi wrote:
> Hi,
>
> What does it mean:
> kernel: BUG: unable to handle kernel paging request at
> 0000000000100100

Something either needed a humongous amount of memory (~ 6GB), or it leaked
memory like crazy. In either case, there was no more virtual memory available.
You need to go back to /var/log/messages, find this line, and post the stuff
that got dumped below it. That will show what program was running when the
kernel crashed.
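
If the log is large, a small script can pull out the relevant part. This is only a minimal sketch: the function name and the default of 40 context lines are my own choices, and the marker string is simply the text from the message quoted above.

```python
from typing import Iterable, List

# Marker text taken from the BUG message quoted in this thread.
BUG_MARKER = "BUG: unable to handle kernel paging request"


def extract_bug_report(lines: Iterable[str], context: int = 40) -> List[str]:
    """Return the most recent BUG line plus up to `context` lines after it."""
    buffered = list(lines)
    # Walk backwards so that the latest crash in the log is the one returned.
    for index in range(len(buffered) - 1, -1, -1):
        if BUG_MARKER in buffered[index]:
            return buffered[index:index + 1 + context]
    # No BUG line found at all.
    return []
```

To use it, open /var/log/messages (this usually needs root) and pass the file object to extract_bug_report(). If only the BUG line itself comes back, nothing was logged after it.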

If you can remember, what had been happening prior to this crash? Were you
running any unusual programs? Had the system been up for a long time?

Larry

aruzsi · December 16, 2008, 8:56pm

As I mentioned, there is 4 GB of RAM (4x1 GB) and 4 GB of swap.

This error message is the last one in /var/log/messages.

Before the system stopped working, I was making an ISO file from SuSE 11.0 x86_64 with the following command:
dd if=/dev/sr0 of=SuSE-11.0-64.iso bs=2048
No unusual programs were running. Maybe the ncpfs-mounted Novell volume?

The system had been up for only a few days, which is not long.

TIA,

lwfinger · December 17, 2008, 12:09am

aruzsi wrote:
> As I mentioned, there is 4 GB of RAM (4x1 GB) and 4 GB of swap.
>
> This error message is the last one in /var/log/messages.
>
> Before the system stopped working, I was making an ISO file from SuSE
> 11.0 x86_64 with the following command:
> dd if=/dev/sr0 of=SuSE-11.0-64.iso bs=2048
> No unusual programs were running. Maybe the ncpfs-mounted Novell volume?
>
> The system had been up for only a few days, which is not long.

This shouldn’t have caused any problems. Have you repeated the process?

aruzsi · December 17, 2008, 9:11am

Yes, I’ve repeated the dd copy about 10 times.
No problems so far.

Yesterday I started several VNC processes to use up a lot of memory. The machine is still working …

TIA,

user · December 18, 2008, 6:22pm

aruzsi wrote:
> What does it mean:
> kernel: BUG: unable to handle kernel paging request at
> 0000000000100100

Well, it means what it says - there was a bug (malfunction) in
the kernel which manifested itself in an attempt to access address
0000000000100100, which the paging mechanism refused to handle.
(Quite rightly, I might add. That value sure looks unusual for a
kernel address.)

This sort of message is typically followed by a so-called backtrace
which tells you exactly which part of the kernel hit the bug.
That backtrace contains essential information for finding out what
went wrong and how to fix it. Without it, I could only speculate.
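
To illustrate the kind of speculation the address alone allows, here is a rough sketch. It rests on two assumptions that the backtrace would have to confirm: that this is x86_64 with kernel addresses in the upper canonical half, and that the list-poison constants are the ones from include/linux/poison.h in 2.6-era kernels. The function name is made up for this example.

```python
# Faulting address from the BUG line quoted above.
FAULT_ADDRESS = 0x0000000000100100

# Assumption: x86_64 with 48-bit virtual addresses, where kernel addresses
# sit in the upper canonical half, at or above this boundary.
KERNEL_HALF_START = 0xFFFF800000000000

# Assumption: list-poison constants as defined in include/linux/poison.h of
# 2.6-era kernels; list_del() stores them in the pointers of a removed entry.
LIST_POISON1 = 0x00100100
LIST_POISON2 = 0x00200200


def describe_address(address: int) -> str:
    """Give a rough, heuristic description of a faulting address."""
    if address in (LIST_POISON1, LIST_POISON2):
        return "list poison value (possible use of a list entry after list_del)"
    if address >= KERNEL_HALF_START:
        return "inside the kernel half of the address space"
    return "outside the kernel half of the address space"


print(hex(FAULT_ADDRESS), "->", describe_address(FAULT_ADDRESS))
```

If that guess is right, some kernel code followed a pointer of a list entry that had already been removed, but which code did so is exactly what the backtrace would tell us.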

> 4 GB RAM + 4 GB swap.
> The machine could not do anything except answer ping requests on
> eth1 (3Com 509B).

I’m not sure what you are trying to say. What exactly did you try
after the BUG message had appeared, and in which way did the
machine’s reaction differ from the correct behaviour?

> BTW: upgraded kernel to:
> kernel-default-2.6.25.18-0.2

Before or after you encountered the BUG message?

--
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany

aruzsi · December 18, 2008, 6:51pm

OK. The literal meaning was clear to me.

> This sort of message is typically followed by a so-called backtrace
> which tells you exactly which part of the kernel hit the bug.

Typically.
As I mentioned, there are no more lines in /var/log/messages after it.

>> 4 GB RAM + 4 GB swap.
>> The machine could not do anything except answer ping requests on
>> eth1 (3Com 509B).

> I’m not sure what you are trying to say. What exactly did you try
> after the BUG message had appeared, and in which way did the
> machine’s reaction differ from the correct behaviour?

The only option was to push the RESET button.
No ssh, no login on the console.
The machine still answered ping requests on eth1.

>> BTW: upgraded kernel to:
>> kernel-default-2.6.25.18-0.2

> Before or after you encountered the BUG message?

Before the BUG.

TIA,

baskitcaise · December 18, 2008, 7:52pm

aruzsi wrote:

> The only option was to push the RESET button.
> No ssh, no login on the console.
> The machine still answered ping requests on eth1.
>
>> BTW: upgraded kernel to:
>> kernel-default-2.6.25.18-0.2

Have a look in the /etc/sysconfig Editor part of YaST and enable the SysRq key option. In most circumstances (except for really serious kernel crashes) this will give you a bit of control over the shutdown:

Ctrl+Alt+SysRq+s = sync drives
Ctrl+Alt+SysRq+u = remount drives read-only
Ctrl+Alt+SysRq+b = reboot

The above will reboot the machine after syncing the drives and remounting them read-only, which will hopefully mean that there are no corrupt inodes and your disks will not need an fsck.

There are loads more options. A README is in the Documentation directory of the kernel source, or it can easily be googled for.

http://www.linuxhowtos.org/Tips%20and%20Tricks/sysrq.htm
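
The same s, u, b sequence can also be sent without the keyboard, for example from a still-working root shell, through /proc/sysrq-trigger. The sketch below is only an illustration: the function names and the dry_run switch are my own, and it defaults to a dry run because writing "b" really does reboot the machine at once.

```python
from pathlib import Path

# Standard procfs paths of the magic SysRq interface.
SYSRQ_ENABLE = Path("/proc/sys/kernel/sysrq")
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")

# The three commands from the key sequence above, in the order to send them.
GENTLE_REBOOT = [
    ("s", "sync all mounted filesystems"),
    ("u", "remount all filesystems read-only"),
    ("b", "reboot immediately"),
]


def sysrq_enabled(raw_value: str) -> bool:
    """Interpret the content of /proc/sys/kernel/sysrq: 0 means disabled."""
    return int(raw_value.strip()) != 0


def gentle_reboot(trigger: Path = SYSRQ_TRIGGER, dry_run: bool = True) -> list:
    """Send s, u, b in order; with dry_run=True only report what would be sent."""
    sent = []
    for key, meaning in GENTLE_REBOOT:
        if not dry_run:
            # Needs root; "b" reboots without any further warning.
            trigger.write_text(key)
        sent.append(f"{key}: {meaning}")
    return sent


for step in gentle_reboot():
    print(step)
```

Whether the keyboard combination is currently allowed can be checked with sysrq_enabled(SYSRQ_ENABLE.read_text()). When doing this for real, it is usually wise to pause a second or two between the steps so the sync can finish.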

HTH

--
Mark…

Nil illegitimi carborundum

aruzsi · December 18, 2008, 8:34pm

Thanks!
I hope I won’t have to use it. The machine has now been running for 2 days and more than 8 hours.
That is not very long, but it is working …

TIA,

robopensuse · December 19, 2008, 3:57pm

Good explanation.

Pages are always 4 KiB, though you can have a small number of huge pages too. In general, many small pages are allocated to fulfil memory requests, so the earlier explanation does not stack up.
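
A little arithmetic shows the scale. This is just a sketch assuming the 4 KiB page size mentioned above; the helper name is made up. It splits the reported address into a page number and an offset, and for comparison counts how many such pages a request of about 6 GiB would need.

```python
PAGE_SIZE = 4 * 1024  # 4 KiB pages, as stated above
FAULT_ADDRESS = 0x0000000000100100  # address from the BUG line in this thread


def split_address(address: int, page_size: int = PAGE_SIZE) -> tuple:
    """Split an address into (page number, offset within that page)."""
    return divmod(address, page_size)


page, offset = split_address(FAULT_ADDRESS)
print(f"page {page}, offset {offset:#x}")  # page 256, offset 0x100

# Number of 4 KiB pages needed to back a request of about 6 GiB.
pages_for_6_gib = (6 * 1024**3) // PAGE_SIZE
print(pages_for_6_gib)  # 1572864
```

So the message names one address, just above the 1 MiB mark, inside a single page. It says nothing about how much memory was being requested.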

It is useful to see whether ping and SysRq work. Apparent console lock-ups are often actually X or console issues, so checking whether you can ssh into the box is often worthwhile.
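
From another machine, a quick check beyond ping is to see whether the ssh port still accepts connections. A minimal sketch, with a made-up function name and an arbitrary 3 second timeout:

```python
import socket


def tcp_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Call it as tcp_port_open("192.0.2.10"), where the address is only a placeholder for the stuck box. A True result shows no more than a completed TCP handshake, so an actual ssh login remains the stronger test.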

But if you see a panic, the kernel has caught an error and needs a reboot, preferably as gently as possible, as explained above with the magic keys.