Use of hardware-accelerated OpenGL kills the X server, forcing a cold power cycle - 2.6.31.12-22.1-desktop

Just thought I’d report this:

OS: openSUSE 11.2
arch: x86_64
kernel: 2.6.31.12-22-default also tested with kernel: 2.6.31.12-22-desktop
nvidia driver: NVIDIA-Linux-x86_64-195.36.15-pkg2.run
card: Nvidia 8400 GS
X.Org X Server 1.6.5

motherboard: Intel TOM COVE DH55TC H55 Socket 1156 VGA DVI HDMI Out 8
Intel G6950 2.80GHz 3MB Cache Retail Box Processor
BIOS: TCIBX10H.86A.0028.2009.1201.1520 (the latest, as far as I’m aware)

Use of hardware-accelerated OpenGL functions kills the X server, forcing a cold power cycle.

Steps to reproduce:

  1. Log in and play a video using mplayer -ao alsa -vo x11 myvideo.org
    Works, but is not ideal because it cannot go full screen.
  2. Log in and play a video using mplayer -ao alsa -vo xv myvideo.org
    Causes a crash after about 5 secs of playback. Cannot ssh in remotely. Cannot change tty (Alt+F1 etc). Needs a power cycle.
    Log files are as follows:

(WW) Mar 22 21:00:41 NVIDIA(0): WAIT (2, 6, 0x8000, 0x0000f360, 0x000000d0)
(WW) Mar 22 21:00:48 NVIDIA(0): WAIT (1, 6, 0x8000, 0x0000f360, 0x000000d0)

==> /var/log/messages <==
Mar 22 21:00:51 goat kernel: [118045.247264] NVRM: Xid (0001:00): 8, Channel 00000001

==> /var/log/Xorg.0.log <==
(WW) Mar 22 21:00:51 NVIDIA(0): WAIT (2, 6, 0x8000, 0x0000f360, 0x000010b4)
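
For anyone comparing their own logs against these, here is a minimal Python sketch that pulls the NVRM Xid events out of log lines. The regular expression is modelled only on the lines quoted in this thread, and the names `parse_xid` and `find_xids` are made up for illustration, so treat it as a starting point rather than a complete parser.

```python
import re

# Pattern modelled on the NVRM lines quoted in this thread, e.g.
# "... kernel: [118045.247264] NVRM: Xid (0001:00): 8, Channel 00000001"
# (assumption: other driver versions may format the line differently).
XID_RE = re.compile(
    r"NVRM: Xid \(([0-9A-Fa-f:.]+)\): (\d+), Channel ([0-9A-Fa-f]+)"
)


def parse_xid(line):
    """Return (device, xid_number, channel) for an NVRM Xid line, else None."""
    match = XID_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)


def find_xids(lines):
    """Collect every Xid event found in an iterable of log lines."""
    return [hit for hit in map(parse_xid, lines) if hit is not None]
```

Passing an open file object for /var/log/messages to `find_xids` should list every Xid event recorded there, which makes it easier to see whether the Xid number is the same on every crash.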

The crash from step 2 can also be reproduced in the following ways:
Log in with compiz enabled. Dies after 30 secs or so (moving windows, resizing etc).
Log in with compiz disabled. Dies after 120 secs or so (no movement of windows, just an active remote ssh session).
Log in and use xbmc. Dies after around 10 secs.

Error occurs with these drivers too:

NVIDIA-Linux-x86_64-173.08-pkg2.run
NVIDIA-Linux-x86_64-173.14.05-pkg2.run
NVIDIA-Linux-x86_64-173.14.25-pkg2.run
NVIDIA-Linux-x86_64-177.80-pkg2.run
NVIDIA-Linux-x86_64-180.29-pkg2.run
NVIDIA-Linux-x86_64-185.18.14-pkg2.run
NVIDIA-Linux-x86_64-190.53-pkg2.run
NVIDIA-Linux-x86_64-190.53-pkg2.run.1
NVIDIA-Linux-x86_64-195.30-pkg2.run
NVIDIA-Linux-x86_64-195.36.15-pkg2.run

Tested with the following kernel too:
2.6.31.12-22-desktop
Error message for this kernel:


==> /var/log/Xorg.0.log <==
GetModeLine - scrn: 0 clock: 106500
GetModeLine - hdsp: 1440 hbeg: 1520 hend: 1672 httl: 1904
              vdsp: 900 vbeg: 903 vend: 909 vttl: 934 flags: 6
(WW) Mar 22 22:01:25 NVIDIA(0): WAIT (2, 6, 0x8000, 0x0000e974, 0x0000e984)



==> /var/log/messages <==
Mar 22 22:01:35 goat kernel: [  962.131257] NVRM: Xid (0001:00): 8, Channel 00000001

==> /var/log/Xorg.0.log <==
(II) Mar 22 22:01:35 NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
(II) Mar 22 22:01:35 NVIDIA(0):     recover...

Started X with logverbose 6 and saw the following:


X.Org X Server 1.6.5
Release Date: 2009-10-11
X Protocol Version 11, Revision 0
Build Operating System: openSUSE SUSE LINUX
Current Operating System: Linux goat 2.6.31.12-22-desktop #1 SMP PREEMPT 2010-03-19 14:17:13 +0100 x86_64
Build Date: 02 November 2009  12:04:43PM
 
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Mon Mar 22 22:25:07 2010
(==) Using config file: "/etc/X11/xorg.conf"
/etc/X11/xim: Checking whether an input method should be started.
sourcing /etc/sysconfig/language to get the value of INPUT_METHOD
INPUT_METHOD is not set or empty (no user selected input method).
Trying to start a default input method for the locale en_GB.UTF-8 ...
There is no default input method for the current locale.
Dummy input method "none" (do not use any fancy input method by default)

snip


yakuake: symbol lookup error: /usr/lib64/libXi.so.6: undefined symbol: XESetWireToEventCookie
<unknown program name>(5285)/: Communication problem with  "yakuake" , it probably crashed. 
Error message was:  "org.freedesktop.DBus.Error.NoReply" : " "Message did not receive a reply (timeout by message bus)" " 

snip


QPainter::setCompositionMode: Painter not active
QPainter::end: Painter not active, aborted
kscreenlocker: symbol lookup error: /usr/lib64/libXi.so.6: undefined symbol: XESetWireToEventCookie

This particular error causes the X screen to go black.
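
The "undefined symbol" messages mean that libXi.so.6 refers to a function that the runtime linker could not find in any of the libraries loaded alongside it. A minimal Python sketch for checking whether a given shared library exports a symbol is below. It uses only the standard ctypes module; the helper name `exports_symbol` is made up for illustration, and which library is supposed to provide XESetWireToEventCookie (libX11, as far as I know) is an assumption to verify on your own system.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `library` exports `symbol`.

    `library` is a path or soname such as "libX11.so.6"; passing None
    looks in the libraries already loaded into the current process.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library is missing or could not be loaded at all.
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr is enough.
    return hasattr(handle, symbol)
```

For example, `exports_symbol("libX11.so.6", "XESetWireToEventCookie")` would show whether the installed libX11 is new enough for the libXi that is installed next to it.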

Update.

Tried the following, then ran ‘test 2’ (step 2 in the first post):

acpi=off - no effect on the crash

In the ‘Device’ section of xorg.conf I added Option “UseEvents” “true”:

The crash was not immediate: about 15 secs instead of <10 secs.


==> /var/log/Xorg.0.log <==
(WW) Mar 27 10:38:16 NVIDIA(0): WAIT (2, 6, 0x8000, 0x00002508, 0x00002518)

==> /var/log/messages <==
Mar 27 10:38:27 goat kernel: [  569.951403] NVRM: Xid (0001:00): 8, Channel 00000001

In the ‘Device’ section of xorg.conf I set Option “UseEvents” “false”:

Same result as previous.

Trying to get a gdb trace, but the whole freeze thing is making it difficult.

As it stands, openSUSE 11.2 is unusable as a desktop for me on my hardware. Will keep posting the various things I try…

Set the following option in xorg.conf (Device section):

Option “NvAGP” “0”

Ran test 2 (see first post).

Same result, although this time the nvidia driver did attempt to recover (and failed).


==> /var/log/Xorg.0.log <==
GetModeLine - scrn: 0 clock: 106500
GetModeLine - hdsp: 1440 hbeg: 1520 hend: 1672 httl: 1904
              vdsp: 900 vbeg: 903 vend: 909 vttl: 934 flags: 6

(WW) Mar 27 10:54:28 NVIDIA(0): WAIT (2, 6, 0x8000, 0x0000dd84, 0x0000ea84)

(WW) Mar 27 10:54:35 NVIDIA(0): WAIT (1, 6, 0x8000, 0x0000dd84, 0x0000ea84)

==> /var/log/messages <==
Mar 27 10:54:36 goat kernel: [  306.625515] NVRM: Xid (0001:00): 8, Channel 00000001

==> /var/log/Xorg.0.log <==
(II) Mar 27 10:54:40 NVIDIA(0): The NVIDIA X driver has encountered an error; attempting to
(II) Mar 27 10:54:40 NVIDIA(0):     recover...




I know what you have been doing.

  1. You’ve installed a non-standard kernel from some repo, not knowing how it was compiled?
  2. You installed Xorg from a non-standard repo.

If you check, you will find that in /usr/lib64 there are libXi6 links, one pointing to …6.0.0, one pointing to …6.1.0
I solved this at a friend’s by disabling the Xorg repo, performing a search for ‘xorg’, and reinstalling all packages unconditionally, with the exception of libXi6, that one needs to be uninstalled.

You should also first try to get it working without an xorg.conf, just to be sure it does not conflict with HAL.

For next time: adding experimental repos takes you into a world where you’re on your own, and where problems like the one you’ve just met happen easily.

Logged in and did a mount -o remount,sync on /

Crash occurred after 5 mins here…

sync = All I/O to the file system should be done synchronously. In case of media with limited number of write cycles (e.g. some flash drives) “sync” may cause life-cycle shortening.
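
Because the machine freezes hard, anything still sitting in the write cache is lost, which is why I tried the sync remount. A lighter-weight alternative would be to force only one small marker file to disk. The Python sketch below shows the idea; the function name `write_heartbeat` and the log path in the usage note are hypothetical, and I have not verified that the last line always survives this particular freeze.

```python
import os
import time


def write_heartbeat(path, message):
    """Append a timestamped line to `path` and push it to disk before returning."""
    with open(path, "a") as log:
        log.write(f"{time.time():.3f} {message}\n")
        log.flush()             # move the line from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to commit the data to the device
```

Calling something like `write_heartbeat("/var/tmp/crash-heartbeat.log", "still alive")` once a second while running test 2 should leave a last timestamp close to the moment of the freeze, without making all I/O on / synchronous.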

I understand your post, but I only added the experimental repos (packman,x11, jeng) when I was trying to debug the initial problem.

kernel
Index of /repositories/Kernel:/openSUSE-11.2/openSUSE_11.2 - kernel-desktop-2.6.31.12-22.1.x86_64

Not sure whether the above repo is experimental, but I’ve had a cursory look in menuconfig and noticed nothing untoward.
I did try to upgrade X using the X11 repo, and quickly discovered that it installs non-stable X packages :smiley: which was fun. So I’ve removed the repo and did a zypper dup, and the packages seemed to downgrade correctly.

Removing xorg.

Did that, made sure I have a clean xorg.conf. It’s the default one created with nvidia-xconfig.

The symlinks.

This is interesting:
I do have the following:
14 Mar 20 22:37 libXi.so.6 -> libXi.so.6.1.0
Oct 19 19:51 libXi.so.6.0.0
Mar 19 19:04 libXi.so.6.1.0

I’ll try relinking to the old one and see what happens… :slight_smile:
Oh, and uninstalling libXi6.
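
For anyone who wants to check the same thing on their own machine, here is a minimal Python sketch that reports what a library symlink points to and which versioned files sit next to it. It uses only the standard os module; the helper name `describe_soname` is made up for illustration.

```python
import os


def describe_soname(directory, soname):
    """Return (symlink target or None, sorted versioned files) for `soname`."""
    link = os.path.join(directory, soname)
    # readlink gives the raw target, e.g. "libXi.so.6.1.0"
    target = os.readlink(link) if os.path.islink(link) else None
    versions = sorted(
        name for name in os.listdir(directory)
        if name.startswith(soname + ".")
    )
    return target, versions
```

On a system laid out like the listing above, `describe_soname("/usr/lib64", "libXi.so.6")` would be expected to report the 6.1.0 file as the target and both the 6.0.0 and 6.1.0 files as available versions.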

In /usr/lib64 I had the following:


libXi.so.6 -> libXi.so.6.1.0
libXi.so.6.0.0
libXi.so.6.1.0

libXi6-7.4-13.1

The above caused:

  • The openSUSE green login screen (kdm?) not to start.
  • KDE4 windows (after running startx from the command line) to have no decoration (close, maximise buttons), i.e. the top bar did not appear.

Symlinking libXi.so.6 to libXi.so.6.0.0 fixed both of the above.
Nothing to do with my problem, but I’m sure this will help someone else out :smiley:

Removing libXi6-7.4-13.1

For whatever fudging reason I had both the 64-bit and 32-bit versions installed (perhaps for some compatibility reason); removed both.

After that, test 2 took a good minute to cause the crash (with the same error messages).

c’est la vie… onwards…

Some news,

Posted this problem on the nvidia forum, and according to an NVIDIA employee it looks like a known problem: something to do with the GPU losing communication with the driver. Currently there is no fix.

Will see if I can find a workaround.

Does playing with compositing disabled (ALT+SHIFT+F12) also crash X?

I could play nicely in KDE4.something, with direct rendering enabled, if I disabled compositing. No crashes at all then.