Random crashes with nouveau and GeForce 9800 GT

Hello board,

I decided to give openSUSE 11.3 a try: I installed it some days ago and have followed several update cycles to stay close to the final release.

Unfortunately there seems to be a stability issue with the nouveau driver and my GeForce 9800 GT. The system is stable under WinXP and was rock stable under 11.1 (with the proprietary nvidia driver), so I don’t think a hardware glitch is the cause.

Symptoms: random graphical glitches in KDE, especially when drawing icons and similar elements (e.g. toolbars), and sudden freezes of the whole X system (very annoying). I cannot escape with Ctrl-Alt-Backspace (pressed twice), and I cannot switch to a tty console, so the only way to get the system back is a hard reset.

Has anybody experienced the same? Any suggestions?

This is what I recovered from /var/log/messages that might be related to the issue:

Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502860] [drm] nouveau 0000:03:00.0: PGRAPH_TRAP_TEXTURE - VM: Trapped read at 2411020700 status 00000af0 00000000 channel 3
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502863] [drm] nouveau 0000:03:00.0: magic set 0:
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502865] [drm] nouveau 0000:03:00.0:   0x00408904: 0x20087f05
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502868] [drm] nouveau 0000:03:00.0:   0x00408908: 0x24110207
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502870] [drm] nouveau 0000:03:00.0:   0x0040890c: 0x40000e00
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502873] [drm] nouveau 0000:03:00.0:   0x00408910: 0x02070000
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.502875] [drm] nouveau 0000:03:00.0: PGRAPH_TRAP_TEXTURE - TP0: Unhandled ustatus 0x00000003
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507555] [drm] nouveau 0000:03:00.0: PGRAPH_TRAP - Ch 3/4 Class 0x8297 Mthd 0x15e0 Data 0x00000000:0x00000000
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507570] [drm] nouveau 0000:03:00.0: PGRAPH_TRAP_TEXTURE - VM: Trapped read at 2411020700 status 00000af0 00000000 channel 3
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507573] [drm] nouveau 0000:03:00.0: magic set 1:
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507575] [drm] nouveau 0000:03:00.0:   0x00409904: 0x20087f05
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507578] [drm] nouveau 0000:03:00.0:   0x00409908: 0x24110207
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507580] [drm] nouveau 0000:03:00.0:   0x0040990c: 0x40000e00
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507583] [drm] nouveau 0000:03:00.0:   0x00409910: 0x02070000
Jul 13 22:00:37 linux-u3yw kernel: [ 6038.507585] [drm] nouveau 0000:03:00.0: PGRAPH_TRAP_TEXTURE - TP1: Unhandled ustatus 0x00000003
Jul 13 22:00:46 linux-u3yw kernel: [ 6047.517620] [drm] nouveau 0000:03:00.0: PFIFO_DMA_PUSHER - Ch 3
Jul 14 07:26:28 linux-u3yw kernel: [ 1482.444772] [drm] nouveau 0000:03:00.0: PFIFO_DMA_PUSHER - Ch 2
Jul 14 07:26:28 linux-u3yw kernel: [ 1482.449311] [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x155c Data 0x00000000:0x000cb280
Jul 14 07:26:28 linux-u3yw kernel: [ 1482.449319] [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
Jul 14 07:26:28 linux-u3yw kernel: [ 1482.449347] [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - Ch 2/5 Class 0x8297 Mthd 0x1564 Data 0x00000000:0x40af2000
Jul 14 07:26:28 linux-u3yw kernel: [ 1482.449352] [drm] nouveau 0000:03:00.0: PGRAPH_DATA_ERROR - INVALID_BITFIELD
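
For anyone who wants to pull the same kind of lines out of their own log, here is a minimal sketch. It assumes the messages are in a syslog-style file such as /var/log/messages and that the error names follow the PGRAPH_... / PFIFO_... pattern seen in the excerpt above. The function name nouveau_error_summary is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: count nouveau error types in a syslog-style file.
# Assumption: error names look like PGRAPH_... or PFIFO_..., as in the
# excerpt above. $1 is the log file to scan.
nouveau_error_summary() {
    grep 'nouveau' "$1" \
        | grep -oE '(PGRAPH|PFIFO)_[A-Z_]+' \
        | sort | uniq -c | sort -rn
}
```

It could be called as `nouveau_error_summary /var/log/messages` (reading that file usually needs root). The output is one line per error type, most frequent first, which makes it easier to see whether the same trap keeps recurring before each freeze.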

Thanks for any suggestions!
Greetings
BJo

Addendum

Some sysinfo:

uname -a
Linux linux-u3yw 2.6.34-12-desktop #1 SMP PREEMPT 2010-06-29 02:39:08 +0200 x86_64 x86_64 x86_64 GNU/Linux
lspci -nnk
00:00.0 RAM memory [0500]: nVidia Corporation C51 Host Bridge [10de:02f0] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.2 RAM memory [0500]: nVidia Corporation C51 Memory Controller 1 [10de:02fe] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.3 RAM memory [0500]: nVidia Corporation C51 Memory Controller 5 [10de:02f8] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.4 RAM memory [0500]: nVidia Corporation C51 Memory Controller 4 [10de:02f9] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.5 RAM memory [0500]: nVidia Corporation C51 Host Bridge [10de:02ff] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.6 RAM memory [0500]: nVidia Corporation C51 Memory Controller 3 [10de:027f] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:00.7 RAM memory [0500]: nVidia Corporation C51 Memory Controller 2 [10de:027e] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:02.0 PCI bridge [0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fc] (rev a1)
        Kernel driver in use: pcieport
00:03.0 PCI bridge [0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fd] (rev a1)
        Kernel driver in use: pcieport
00:04.0 PCI bridge [0604]: nVidia Corporation C51 PCI Express Bridge [10de:02fb] (rev a1)
        Kernel driver in use: pcieport
00:09.0 RAM memory [0500]: nVidia Corporation MCP51 Host Bridge [10de:0270] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260] (rev a3)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
00:0a.1 SMBus [0c05]: nVidia Corporation MCP51 SMBus [10de:0264] (rev a3)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: nForce2_smbus
00:0b.0 USB Controller [0c03]: nVidia Corporation MCP51 USB Controller [10de:026d] (rev a3)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: ohci_hcd
00:0b.1 USB Controller [0c03]: nVidia Corporation MCP51 USB Controller [10de:026e] (rev a3)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: ehci_hcd
00:0d.0 IDE interface [0101]: nVidia Corporation MCP51 IDE [10de:0265] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: pata_amd
00:0e.0 IDE interface [0101]: nVidia Corporation MCP51 Serial ATA Controller [10de:0266] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: sata_nv
00:0f.0 IDE interface [0101]: nVidia Corporation MCP51 Serial ATA Controller [10de:0267] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: sata_nv
00:10.0 PCI bridge [0604]: nVidia Corporation MCP51 PCI Bridge [10de:026f] (rev a2)
00:10.1 Audio device [0403]: nVidia Corporation MCP51 High Definition Audio [10de:026c] (rev a2)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: HDA Intel
00:14.0 Bridge [0680]: nVidia Corporation MCP51 Ethernet Controller [10de:0269] (rev a3)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:7252]
        Kernel driver in use: forcedeth
00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100]
00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101]
00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102]
00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103]
        Kernel driver in use: k8temp
03:00.0 VGA compatible controller [0300]: nVidia Corporation G92 [GeForce 9800 GT] [10de:0614] (rev a2)
        Kernel driver in use: nouveau
04:07.0 Network controller [0280]: RaLink RT2561/RT61 rev B 802.11g [1814:0302]
        Subsystem: D-Link System Inc AirPlus G DWL-G510 Wireless Network Adapter (Rev.C) [1186:3c09]
        Kernel driver in use: rt61pci
04:08.0 FireWire (IEEE 1394) [0c00]: VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044] (rev c0)
        Subsystem: Micro-Star International Co., Ltd. Device [1462:252d]
        Kernel driver in use: ohci1394
lsmod
Module                  Size  Used by
ip6t_LOG                5898  6 
xt_tcpudp               2859  2 
xt_pkttype              1288  3 
ipt_LOG                 6067  6 
xt_limit                2559  12 
snd_pcm_oss            53669  0 
snd_mixer_oss          19415  1 snd_pcm_oss
snd_seq                68137  0 
snd_seq_device          7834  1 snd_seq
edd                    10208  0 
af_packet              23229  4 
cpufreq_conservative    12628  0 
cpufreq_userspace       3264  0 
cpufreq_powersave       1258  0 
powernow_k8            20075  1 
mperf                   1523  1 powernow_k8
ip6t_REJECT             4828  3 
nf_conntrack_ipv6      21550  4 
ip6table_raw            1627  1 
xt_NOTRACK              1192  4 
ipt_REJECT              2672  3 
xt_state                1618  8 
iptable_raw             1686  1 
iptable_filter          1946  1 
ip6table_mangle         2036  0 
nf_conntrack_netbios_ns     1854  0 
nf_conntrack_ipv4      10379  4 
nf_conntrack           89639  5 nf_conntrack_ipv6,xt_NOTRACK,xt_state,nf_conntrack_netbios_ns,nf_conntrack_ipv4
nf_defrag_ipv4          1673  1 nf_conntrack_ipv4
ip_tables              21698  2 iptable_raw,iptable_filter
ip6table_filter         1887  1 
ip6_tables             23320  4 ip6t_LOG,ip6table_raw,ip6table_mangle,ip6table_filter
x_tables               26644  16 ip6t_LOG,xt_tcpudp,xt_pkttype,ipt_LOG,xt_limit,ip6t_REJECT,ip6table_raw,xt_NOTRACK,ipt_REJECT,xt_state,iptable_raw,iptable_filter,ip6table_mangle,ip_tables,ip6table_filter,ip6_tables
nls_iso8859_1           4729  1 
nls_cp437               6447  1 
vfat                   12114  1 
fat                    59802  1 vfat
fuse                   75897  3 
loop                   18524  0 
dm_mod                 86809  0 
snd_hda_codec_realtek   324064  1 
arc4                    1601  2 
ecb                     2495  2 
firewire_ohci          26938  0 
firewire_core          60890  1 firewire_ohci
snd_hda_intel          28461  2 
rt61pci                22741  0 
crc_itu_t               1747  2 firewire_core,rt61pci
rt2x00pci               7461  1 rt61pci
snd_hda_codec         113025  2 snd_hda_codec_realtek,snd_hda_intel
snd_hwdep               7954  1 snd_hda_codec
rt2x00lib              34850  2 rt61pci,rt2x00pci
snd_pcm               105589  3 snd_pcm_oss,snd_hda_intel,snd_hda_codec
mac80211              290013  2 rt2x00pci,rt2x00lib
ohci1394               33702  0 
cfg80211              182659  2 rt2x00lib,mac80211
ppdev                  10072  0 
edac_core              50480  0 
sg                     33348  0 
snd_timer              26828  2 snd_seq,snd_pcm
joydev                 11942  0 
rfkill                 21863  1 cfg80211
eeprom_93cx6            1893  1 rt61pci
ieee1394              104836  1 ohci1394
usb_storage            52819  0 
edac_mce_amd            9619  0 
parport_pc             37547  0 
k8temp                  4264  0 
forcedeth              59560  0 
sr_mod                 16684  0 
cdrom                  43440  1 sr_mod
pcspkr                  2222  0 
snd                    84348  14 snd_pcm_oss,snd_mixer_oss,snd_seq,snd_seq_device,snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_timer
soundcore               9003  1 snd
snd_page_alloc          9569  2 snd_hda_intel,snd_pcm
serio_raw               5318  0 
parport                40384  2 ppdev,parport_pc
i2c_nforce2             7593  0 
nouveau               553248  2 
ttm                    65906  1 nouveau
drm_kms_helper         33008  1 nouveau
drm                   221762  4 nouveau,ttm,drm_kms_helper
i2c_algo_bit            6728  1 nouveau
sd_mod                 41436  6 
button                  6989  1 nouveau
fan                     4527  0 
processor              45715  1 powernow_k8
ata_generic             3707  0 
pata_amd               12922  0 
sata_nv                25589  5 
libata                211330  3 ata_generic,pata_amd,sata_nv
scsi_mod              191748  5 sg,usb_storage,sr_mod,sd_mod,libata
thermal                20625  0 
thermal_sys            18230  3 fan,processor,thermal

So you didn’t install the nvidia driver? Why?

I observed immediate crashes (within 5 minutes) with my FX5200 and the nouveau driver, and I immediately swapped to the proprietary ‘nvidia’ driver.

Does it happen because of the activation of the desktop composite?

Do you mean that compiz thingy? Or the compositing system?

As far as I know, I didn’t activate the “Arbeitsflächeneffekte” (desktop effects), so there is no fancy stuff going on.

But as I noticed, I can sort of “reproduce” the crash once I start some video stuff. It doesn’t take long for the system to freeze then.
In normal operation (text processing, mail and the like) it takes considerably longer for the system to freeze. But it happens anyway, which is… annoying.

Greetings
BJo

So you are not using the nvidia driver as this post suggests…!?

Something I wish I could do.

Is there an easy (read: fail-safe) way of switching?
Well… or some sort of cookbook on how to achieve this?
I don’t mind the effort to get there, but what I need is a really complete description of how to do it, how to react to and solve errors that might occur, and how to roll back (except the obvious but tedious “just reinstall the whole system”).

Do you have a good pointer to some resource?
Thanks for any suggestion.
Greetings
BJo

openSUSE Graphic Card Practical Theory Guide for Users

SDB:NVIDIA drivers - openSUSE

Nvidia for Newbies part 1

Ah - that one I missed. Thank you!
Now I am stuck in indecision - should I wait for the easy click-and-go solution or should I go boldly where some have gone before? :)

But at least it looks promising - I have a choice and the chance that my problem gets solved! Things are getting better every day.
(There were days, several years ago, when I hated Linux for burning me and causing me all kinds of trouble I didn’t ask for… at least not directly. And I love seeing things get better with every release!)

Happy greetings
BJo

I have similar problems, with X randomly crashing and putting me back at the login screen. I have removed the nouveau driver and compiled the Nvidia driver against the kernel; I also have the nomodeset flag set at boot. There is nothing in my messages log or X log that shows a reason for this. I am running Gnome, with or without compiz, but even under Xfce or TWM I get the same problem. It did not happen under 11.2. I have spent the whole day trying to fix this and can see no reason why it is happening. Having given up, I am now using Ubuntu, and so far there are no crashes, so I suppose my hardware is not the culprit after all.

What sort of check did you do to ensure that you actually had the nvidia driver running? For some hardware, just using the nomodeset flag at boot is not enough. One also needs to blacklist the nouveau driver. Did you follow all the advice in the release notes?

Glad to read it’s not hardware and it’s working for you now. Note that 11.2 and 11.3 are completely different in how nvidia graphics are handled; 11.2 simply adds confidence to what you noted on Ubuntu, namely that your hardware is good.
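
One quick way to run such a check is to look at the loaded kernel modules. This is only a minimal sketch, assuming lsmod output in the same format as the listing earlier in this thread; the function name gpu_modules is hypothetical.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print which of the
# nouveau / nvidia kernel modules are currently loaded (one name per line).
# Prints nothing if neither module is loaded.
gpu_modules() {
    awk '$1 == "nouveau" || $1 == "nvidia" { print $1 }'
}
```

Usage would be `lsmod | gpu_modules`. If it prints nouveau, the open driver is still loaded. The "Kernel driver in use" line for the VGA controller in `lspci -nnk`, as posted above, gives the same information from the PCI side.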

Hi oldcpu

Yes, I followed the release notes first and added “nomodeset”; this did not work.
I tried to install the Nvidia driver, but you cannot compile it against KMS.
Using the Sysconfig Editor in YaST, I set NO_KMS_IN_INITRD to Yes.
As root, I ran mkinitrd from a terminal.

Rebooted
From init 3:
Removed nouveau using zypper rm nouveau
Installed the latest Nvidia driver
init 6
Logged in
nvidia-settings told me I was using the Nvidia driver, which is confirmed in my xorg.conf:
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"

It seems to crash only during:
Intense disc transfers (deleting 2-3 GB of data or copying data),
Installing multiple packages in YaST,
Banshee rebuilding/rescanning media.

If I just use it for browsing, it’s fine. At first I thought I had a hardware fault (mobo or disc), but then I would expect the PC to lock up, not just go black and drop back to the login prompt. Memory checked out fine and, like I say, it works fine in 11.2 and the latest version of Ubuntu.

I don’t know if the order is important, but the order you followed is NOT what I have ever recommended. Maybe it matters. Maybe it does not. I am suspicious it is the cause. Please note this URL: SDB:Configuring graphics cards - openSUSE. I tried to cover all points there.

Reading your post, I can see no evidence of your blacklisting the nouveau driver. You may need to blacklist it in the /etc/modprobe.d/50-blacklist.conf file. Please, where did I miss that in your post? You also may need to go to yast > System > /etc/sysconfig Editor > System > Kernel > NO_KMS_IN_INITRD and set it to “yes”. This takes a minute or two to save once the change is submitted. As for mkinitrd, I did NOTHING of the sort directly. OK? I just changed the setting in YaST. Maybe YaST did that, but I sure did NOT send any such direct command.

I confess your post confuses me as to what you have done.
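
The two settings mentioned above can be checked from a terminal. The following is only a rough sketch with hypothetical function names. The blacklist path is the one given in the post; that YaST stores NO_KMS_IN_INITRD in /etc/sysconfig/kernel is an assumption based on the YaST menu path, so verify it on your own system.

```shell
# Hypothetical helpers; both take the file to inspect as their only argument.

# Succeeds if the file has an uncommented "blacklist nouveau" line.
nouveau_blacklisted() {
    grep -Eq '^[[:space:]]*blacklist[[:space:]]+nouveau([[:space:]]|$)' "$1"
}

# Succeeds if the file sets NO_KMS_IN_INITRD to "yes" (quotes optional).
kms_disabled_in_initrd() {
    grep -Eq '^[[:space:]]*NO_KMS_IN_INITRD="?yes"?' "$1"
}
```

For example, `nouveau_blacklisted /etc/modprobe.d/50-blacklist.conf && echo "nouveau is blacklisted"`, and likewise `kms_disabled_in_initrd /etc/sysconfig/kernel && echo "KMS is kept out of the initrd"`. Both only read the files, so they are safe to run before changing anything.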

I don’t recall saying I had followed your guide ;). I followed the release notes displayed after the installation. You can find them in YaST; there is NO mention of blacklisting anything.

Maybe it matters. Maybe it does not. I am suspicious it is the cause. Please note this URL: SDB:Configuring graphics cards - openSUSE. I tried to cover all points there.

Thanks for the link. I have read it now and it is very thorough.

Reading your post, I can see no evidence of your blacklisting the nouveau driver. You may need to blacklist it in the /etc/modprobe.d/50-blacklist.conf file. Please, where did I miss that in your post?

I didn’t. I could not understand why I would have to blacklist it when I had already removed it from the system (“Using zypper rm nouveau”). lsmod does not show it, only the nvidia driver.

You also may need to go to yast > System > /etc/sysconfig Editor > System > Kernel > NO_KMS_IN_INITRD and set it to “yes”. This takes a minute or two to save once the change is submitted. As for mkinitrd, I did NOTHING of the sort directly. OK? I just changed the setting in YaST. Maybe YaST did that, but I sure did NOT send any such direct command.

I listed the NO_KMS_IN_INITRD setting in my previous post ;)

I confess your post confuses me as to what you have done.

Disabled KMS
Removed the nouveau driver
Installed the Nvidia driver

The screen “flashes” black and then throws me out to the login screen, just as if I had pressed Ctrl+Alt+Backspace. It does not remain black.

Really sorry if I am confusing you; I realise you are trying to help.

Needless to say, this behaviour should not be happening. How confident are you that you had a good install, i.e. that the md5sum passed (comparing the downloaded ISO to the value on the web site)? Confirm also that you burned at the slowest speed possible to a +R or -R CD/DVD (and NOT to an RW), preferably on the same burner as the one being used for the install.

I confirmed the MD5 when I burnt the disc and then let k3b verify the data. I am sure it was not burnt at the lowest speed, as you recommend, but it was burnt to a DVD+R disc.
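
The md5sum comparison can be done along these lines. This is a minimal sketch that assumes the GNU md5sum tool is installed; the function name md5_matches is hypothetical, and the expected hash has to be copied from the download page by hand.

```shell
# Hypothetical helper: compare a file's MD5 sum with an expected value.
# $1 = file to check (e.g. the downloaded ISO)
# $2 = expected hash as published on the download page
md5_matches() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}
```

A call would look like `md5_matches image.iso <hash from the web site> && echo "checksum OK"`, where image.iso stands for whatever file was downloaded. A mismatch means the download itself is damaged, independent of how the disc was burned.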

I have installed this three times; each time I have the same problem and can replicate it in each install.

Using YaST to install software causes the problem to manifest, but zypper does not.
Banshee will cause it while trying to index my media.
So does moving lots of data between discs.

If I don’t do any of the above then it’s fine.
There is nothing in my messages log or Xorg log. Is there any other log I can look at?