Frequent Total Freezes - Nouveau?

Hi, guys.

Every other day or so I get a complete and total lockup of my system. This happens in Leap 15.1, and previously in Leap 15.0. I am assuming that it is related to the nouveau drivers as it happens at various times in various applications, but invariably involves some screen activity, usually in Firefox but also in things like editing text documents. Just moving the cursor or scrolling a text document can cause it to happen out of the blue. The only thing that can be done is hit the power button and reboot. /var/log/messages shows absolutely nothing happening at the moment of the freeze and reboot that might explain the freeze.

My question is: should I buy a more current video card and install the latest Nvidia drivers? Would that be more reliable or less reliable than nouveau? Or is this some sort of known issue with KDE 5.12.8 and previous versions?

My hardware is a Ryzen 5 2600X CPU, an ASUS ROG STRIX X470 Gaming motherboard, and an older GeForce GT 740 2GB video card.

Linux linux-h2ol 4.12.14-lp151.28.10-default #1 SMP Sat Jul 13 17:59:31 UTC 2019 (0ab03b7) x86_64 x86_64 x86_64 GNU/Linux
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.12.14-lp151.28.10-default root=UUID=7df10d3e-99cf-4d67-8c19-b319d26e9d96 splash=silent resume=/dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1720070214-part3 mitigations=auto quiet
lshw -c video
       description: VGA compatible controller
       product: GK107 [GeForce GT 740]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:0b:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nouveau latency=0
       resources: irq:317 memory:f6000000-f6ffffff memory:e0000000-efffffff memory:f0000000-f1ffffff ioport:e000(size=128) memory:c0000-dffff
 hwinfo --gfxcard
13: PCI b00.0: 0300 VGA compatible controller (VGA)             
  [Created at pci.386]
  Unique ID: IluS.WF0fF5FjFN2
  Parent ID: w+J7.0TU4LKoL980
  SysFS ID: /devices/pci0000:00/0000:00:03.1/0000:0b:00.0
  SysFS BusID: 0000:0b:00.0
  Hardware Class: graphics card
  Model: "nVidia GK107 [GeForce GT 740]"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x0fc8 "GK107 [GeForce GT 740]"
  SubVendor: pci 0x3842 " Corp."
  SubDevice: pci 0x2742 
  Revision: 0xa1
  Driver: "nouveau"
  Driver Modules: "nouveau"
  Memory Range: 0xf6000000-0xf6ffffff (rw,non-prefetchable)
  Memory Range: 0xe0000000-0xefffffff (ro,non-prefetchable)
  Memory Range: 0xf0000000-0xf1ffffff (ro,non-prefetchable)
  I/O Ports: 0xe000-0xefff (rw)
  Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
  IRQ: 317 (257558 events)
  Module Alias: "pci:v000010DEd00000FC8sv00003842sd00002742bc03sc00i00"
  Driver Info #0:
    Driver Status: nouveau is active
    Driver Activation Cmd: "modprobe nouveau"
  Config Status: cfg=no, avail=yes, need=no, active=unknown
  Attached to: #32 (PCI bridge)

Primary display adapter: #13
(II) xfree86: Adding drm device (/dev/dri/card0)
    13.271] (--) PCI:*(11@0:0:0) 10de:0fc8:3842:2742 rev 161, Mem @ 0xf6000000/16777216, 0xe0000000/268435456, 0xf0000000/33554432, I/O @ 0x0000e000/128, BIOS @ 0x????????/131072
    13.271] (II) LoadModule: "glx"
    13.271] (II) Loading /usr/lib64/xorg/modules/extensions/
    13.275] (II) Module glx: vendor="X.Org Foundation"
    13.275]     compiled for 1.20.3, module version = 1.0.0
    13.275]     ABI class: X.Org Server Extension, version 10.0
    13.275] (II) Scanning /etc/X11/xorg_pci_ids directory for additional PCI ID's supported by the drivers
    13.275] (II) Scanning /etc/X11/xorg_pci_ids directory for additional PCI ID's supported by the drivers
    13.275] (==) Matched nvidia as autoconfigured driver 0
    13.275] (==) Matched nouveau as autoconfigured driver 1
    13.275] (==) Matched nv as autoconfigured driver 2
    13.275] (==) Matched modesetting as autoconfigured driver 3
    13.275] (==) Matched fbdev as autoconfigured driver 4
    13.275] (==) Matched vesa as autoconfigured driver 5
    13.275] (==) Assigned the driver to the xf86ConfigLayout
    13.275] (II) LoadModule: "nvidia"
    13.276] (WW) Warning, couldn't open module nvidia
    13.276] (EE) Failed to load module "nvidia" (module does not exist, 0)
    13.276] (II) LoadModule: "nouveau"
    13.276] (II) Loading /usr/lib64/xorg/modules/drivers/
    13.276] (II) Module nouveau: vendor="X.Org Foundation"
    13.276]     compiled for 1.20.3, module version = 1.0.15
    13.276]     Module class: X.Org Video Driver
    13.276]     ABI class: X.Org Video Driver, version 24.0
    13.276] (II) LoadModule: "nv"
    13.276] (WW) Warning, couldn't open module nv
    13.276] (EE) Failed to load module "nv" (module does not exist, 0)
    13.276] (II) LoadModule: "modesetting"
    13.276] (II) Loading /usr/lib64/xorg/modules/drivers/
    13.277] (II) Module modesetting: vendor="X.Org Foundation"
    13.277]     compiled for 1.20.3, module version = 1.20.3
    13.277]     Module class: X.Org Video Driver
    13.277]     ABI class: X.Org Video Driver, version 24.0
    13.277] (II) LoadModule: "fbdev"
    13.277] (II) Loading /usr/lib64/xorg/modules/drivers/
    13.277] (II) Module fbdev: vendor="X.Org Foundation"
    13.277]     compiled for 1.20.3, module version = 0.5.0
    13.277]     Module class: X.Org Video Driver
    13.277]     ABI class: X.Org Video Driver, version 24.0
    13.277] (II) LoadModule: "vesa"
    13.277] (II) Loading /usr/lib64/xorg/modules/drivers/
    13.277] (II) Module vesa: vendor="X.Org Foundation"
    13.277]     compiled for 1.20.3, module version = 2.4.0
    13.277]     Module class: X.Org Video Driver
    13.277]     ABI class: X.Org Video Driver, version 24.0
    13.277] (II) NOUVEAU driver 
    13.277] (II) NOUVEAU driver for NVIDIA chipset families :
    13.277]     RIVA TNT        (NV04)
    13.277]     RIVA TNT2       (NV05)
    13.277]     GeForce 256     (NV10)
    13.277]     GeForce 2       (NV11, NV15)
    13.277]     GeForce 4MX     (NV17, NV18)
    13.277]     GeForce 3       (NV20)
    13.277]     GeForce 4Ti     (NV25, NV28)
    13.277]     GeForce FX      (NV3x)
    13.277]     GeForce 6       (NV4x)
    13.277]     GeForce 7       (G7x)
    13.277]     GeForce 8       (G8x)
    13.277]     GeForce GTX 200 (NVA0)
    13.277]     GeForce GTX 400 (NVC0)
    13.277] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
    13.277] (II) FBDEV: driver for framebuffer: fbdev
    13.277] (II) VESA: driver for VESA chipsets: vesa
    13.280] (II) [drm] nouveau interface version: 1.3.1
    13.280] (WW) Falling back to old probe method for modesetting
    13.281] (WW) Falling back to old probe method for fbdev
    13.281] (II) Loading sub module "fbdevhw"
    13.281] (II) LoadModule: "fbdevhw"
    13.281] (II) Loading /usr/lib64/xorg/modules/
    13.281] (II) Module fbdevhw: vendor="X.Org Foundation"
    13.281]     compiled for 1.20.3, module version = 0.0.2
    13.281]     ABI class: X.Org Video Driver, version 24.0
    13.281] (II) Loading sub module "dri2"
    13.281] (II) LoadModule: "dri2"
    13.281] (II) Module "dri2" already built-in
    13.281] (--) NOUVEAU(0): Chipset: "NVIDIA NVE7"

Thanks in advance for any assistance.

On X I found that only the proprietary drivers installed “the hard way” resulted in perfect performance (with ForceCompositionPipeline enabled).

On Wayland, nouveau has suboptimal performance but no tearing.

The easiest way to determine if it’s something KDE related would probably be to use something else for a day or two.
Lightweight options include XFCE, icewm and openbox.

The GT740 is supported with the latest Nvidia driver 430.34…

Install via the repository version (G05) or the hard way. I have a GT710 running here on Leap 15.1 installed the hard way.

Here’s another question: Would I be better off buying an AMD GPU card such as the RX 560 (I don’t game, and don’t need any significant performance, so a 4GB card like this for $110 is more than enough for me) than sticking with the NVidia card? I’ve just done some research on this and it seems to be up in the air, although quite a few people suggest that AMD drivers tend to work better than NVidia.

No, I’d rather not go that route. Way too much of a chance of me borking the whole thing. I just did a complete reinstall, going from 15.0 to 15.1, after borking the move of my 15.0 install to my new NVMe SSD. Not wanting to risk doing that again soon. Thanks anyway.

So is your current card, the GT740, powered via a splitter cable, or does the power supply have its own power rail for the card?

Sure it’s not power related?

As I indicated, I see no freezes here, but I run GNOME. Even the nouveau driver was fine before I installed the Nvidia driver.

Again, check the power specs for the card vs your current power supply. Is the motherboard card slot PCIe 3.0?

The amdgpu driver is good; however, the newer cards will work better with a later kernel. Can you post the info on the card you’re looking at?

I’ve got an EVGA 650-watt Gold power supply going to a 6-pin PCIe connector on the card. The card is rated for a minimum 400-watt supply, so I doubt it’s a power problem. The motherboard card slot is PCIe 3.0 x16.

The specs for the Radeon RX 560 are here:

The specific card I’m looking at is here:

What I’m hearing from my research is that these cards’ advanced features require kernels past the 4.12 that openSUSE Leap 15.1 ships with. However, I don’t need advanced features. I just need a video card with drivers that don’t lock up on me every other day. So I would assume that the standard open source AMD driver would work well enough. If anyone knows different, I’d like to know.

You might already have IceWM installed. The next time you’re at the display manager (login screen) look for a drop down menu that lets you pick a different type of “session” (desktop environment or window manager).

zypper in openbox only wants to install 5 packages, then you’ll see it in that same session menu.
Within an openbox session, everything is accessed via right-clicking the non-existent desktop.
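
To see which sessions the display manager is likely to offer before you log out, you can list the session files on disk. This is a minimal sketch, assuming X sessions are registered as .desktop files under /usr/share/xsessions (the usual location, but treat it as an assumption for your install). list_sessions is a hypothetical helper name, not an openSUSE tool.

```shell
# list_sessions is a hypothetical helper, not part of openSUSE.
# It prints one session name per .desktop file found in the given directory.
list_sessions() {
    dir="${1:-/usr/share/xsessions}"   # assumed default location of X session files
    for f in "$dir"/*.desktop; do
        [ -e "$f" ] || continue        # skip the unexpanded glob when nothing matches
        basename "$f" .desktop
    done
}

list_sessions
```

If icewm or openbox shows up in that list, it should also appear in the session menu at the login screen.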

I’m 99% sure you can’t “bork the whole thing”. The remaining 1% is covered by snapper.

Are you sure you are using the nouveau DDX? Nothing you posted reports unambiguously which DDX is actually in use. The entire Xorg.0.log, or a larger section of it, would have settled that. So would output from inxi -Gxx.

Before spending money or tainting your system with proprietary NVidia software, try the other FOSS option. If you are indeed using the nouveau DDX, you can do that by removing the xf86-video-nouveau package. If not, and you are using the default DDX, modesetting, then add xf86-video-nouveau if it is not installed, or specify it explicitly via /etc/X11/xorg.conf.d/50-device.conf. The modesetting DDX is the newer-technology upstream default and the only DDX I ever use intentionally for my own NVidia GPUs, all of which are at least 6 years old. It can also be explicitly specified via 50-device.conf.
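
For anyone unsure what specifying a DDX in 50-device.conf looks like, here is a small sketch that prints a minimal Device section. device_conf is a hypothetical helper, and the identifier "DefaultDevice" is an arbitrary name chosen for illustration; only the Driver line matters for selecting the DDX.

```shell
# device_conf is a hypothetical helper; "DefaultDevice" is an arbitrary identifier.
# It prints a minimal Device section that tells X which DDX to load.
device_conf() {
    driver="${1:-modesetting}"
    printf 'Section "Device"\n'
    printf '    Identifier "DefaultDevice"\n'
    printf '    Driver "%s"\n' "$driver"
    printf 'EndSection\n'
}

# Preview the section for the modesetting DDX
device_conf modesetting
```

To apply it, the output would need to be written to /etc/X11/xorg.conf.d/50-device.conf as root, for example with `device_conf modesetting | sudo tee /etc/X11/xorg.conf.d/50-device.conf`, followed by a restart of X. Passing nouveau instead of modesetting selects the other DDX.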

Here is the output of inxi -Gxx:

inxi -Gxx
Graphics:  Card: NVIDIA GK107 [GeForce GT 740] bus-ID: 0b:00.0 chip-ID: 10de:0fc8
           Display Server: x11 (X.Org 1.20.3 ) drivers: nouveau (unloaded: modesetting,fbdev,vesa)
           Resolution: 1920x1080@60.00hz
           OpenGL: renderer: NVE7 version: 4.3 Mesa 18.3.2 Direct Render: Yes

I am unfamiliar with the term DDX, so I looked it up and hunted around the nouveau Web site. I really don’t want to get buried in the details of how the Xorg server operates if it’s not necessary. So am I using the “nouveau DDX” or not based on the inxi report? This is a brand new install of Leap 15.1 so whatever is there is what was installed by default. According to Yast, this is what is installed: xf86-video-nouveau-1.0.15-lp151.4.1
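
One way to answer that question from the Xorg log itself: a DDX that is driving a screen tags its messages with its own name and the screen number, like the `NOUVEAU(0):` line in the log excerpt posted earlier. The sketch below pulls those tags out of a log. active_ddx is a hypothetical helper name, and as far as I know the modesetting DDX tags its lines as `modeset(0)` rather than `modesetting(0)`.

```shell
# active_ddx is a hypothetical helper. It reads an Xorg log on stdin and prints
# the name tags of drivers that logged messages for screen 0.
active_ddx() {
    sed -n 's/^.*) \([A-Za-z_]*\)(0): .*$/\1/p' | sort -u
}

# Sample lines taken from the log excerpt posted earlier in this thread
active_ddx <<'EOF'
    13.277] (II) VESA: driver for VESA chipsets: vesa
    13.281] (II) Module "dri2" already built-in
    13.281] (--) NOUVEAU(0): Chipset: "NVIDIA NVE7"
EOF
```

On the sample lines this prints NOUVEAU. Against a real log it would be run as `active_ddx < /var/log/Xorg.0.log`; that path is an assumption and can differ between setups. If more than one name prints, the UnloadModule lines in the same log should show which drivers were discarded.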

Also in the process of browsing around, I find that apparently there have been issues before, as late as June. See here:

So again, I’m forced to consider whether I’d be better off upgrading to a Radeon card and using the open source AMD drivers rather than spending days trying to figure out why an older Nvidia card crashes more or less randomly. At the nouveau site, their procedure for handling hangs entails things like having a second machine connected over SSH to try to capture logs that explain what happened. Otherwise, they say explicitly, solving such a problem is “nearly impossible.” That doesn’t fill me with confidence.
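
For reference, the second-machine approach boils down to streaming the kernel log over SSH so that the last lines before a freeze end up on a machine that stays alive. This is a rough sketch, assuming sshd is running on the affected machine and that the remote user is allowed to read the kernel log (it may need root). watch_kernel_log, the host name, and the output file name are all hypothetical.

```shell
# watch_kernel_log is a hypothetical helper, run on the SECOND machine.
# It follows the kernel log of the freezing machine over SSH and keeps a local copy.
watch_kernel_log() {
    host="$1"                        # e.g. user@frozen-box (placeholder)
    out="${2:-kernel-freeze.log}"    # local file that survives the freeze
    ssh "$host" 'dmesg --follow' | tee "$out"
}

# Example (commented out because it blocks until the connection drops):
# watch_kernel_log user@frozen-box
```

Whatever the kernel managed to print before the lockup is then in the local file, even though /var/log/messages on the frozen machine never got it written to disk.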

DDX is simply an easy way to tell apart identically named drivers for the kernel and for X. It works the same way with amdgpu. Without more detail, blaming “nouveau” or “amdgpu” is blaming both kernel and X for a problem unlikely to be a fault in both.

If you don’t wish to try a possible solution that costs no money, that’s up to you. All that’s necessary to try the free alternative is ‘zypper rm xf86-video-nouveau’ (the reverse-engineered FOSS DDX for NVidia GPUs) and a restart, to give a more modern technology (the modesetting DDX) a try. Your old GeForce may well have seen better days, but I certainly wouldn’t buy a new GPU without exhausting free options first.

I fail to see how DDX is even mentioned in the output I listed, but whatever.

I will try removing the xf86-video-nouveau as you suggest and see what happens over the next several days. Thanks for the suggestion.

Where it is used, DDX provides context: it distinguishes the device-dependent X driver from the kernel driver of the same name. In inxi output, DDX as a context designator would be redundant, as the context is obviously Xorg, and space in that output is considered precious by its author.

The term modesetting differs between kernel and X. In the kernel, modesetting is a process (which usually goes by the nickname KMS) utilized by Intel, AMD and Nouveau kernel drivers, while in X, modesetting is a DDX driver name.

Following is an inxi version 3.0.35-0 example of use of modesetting DDX with 15.1 and a 10 year old NVidia GPU:

# inxi -Gxx
Graphics:  Device-1: NVIDIA GT218 [GeForce 210] vendor: driver: nouveau v: kernel bus ID: 01:00.0
           chip ID: 10de:0a65
           Display: server: X.Org 1.20.3 driver: modesetting resolution: 1920x1200~60Hz
           OpenGL: renderer: llvmpipe (LLVM 7.0 128 bits) v: 3.3 Mesa 18.3.2 compat-v: 3.1 direct render: Yes

Note that the in-use kernel driver and the in-use DDX are the only drivers mentioned, unlike earlier versions, which listed possible alternative X drivers.

I look forward to seeing the results of your testing with the modesetting DDX.

It failed. The system just locked up again ten minutes ago.

I will be buying an AMD video card and doing away with all this nouveau and Nvidia nonsense. Thanks for your assistance.

Adding to this thread my similar experience.

Leap 15.1
Nvidia 390.116 drivers installed through YAST
GTX 960 card
Dual displays

Both screens will unexpectedly go to power-save suspend and be unrecoverable. Computer can be rebooted through SSH from another machine but nothing I’ve tried will recover the suspended displays.

For an Nvidia GTX 960, driver 430.40 (release date: 2019.7.29) is available:
Maybe updating the driver will solve your problems.

Just a follow up to my situation.

I purchased a Gigabyte AMD Radeon RX 550 GV-RX550D5-2GD Rev. 2.9 video card from Newegg for $95 (plus tax and shipping). Decided I didn’t need an RX 560 as the 550 was represented as more than adequate for non-gaming use and I could save $20 or so.

Did some research, but in the end found a guy who said, “Plug it in. Reboot.” Did that. He was right. Worked first time, no other adjustment needed. Using the default amdgpu drivers. System even seems “faster” somehow, which might be just a hallucination on my part. Who knows, maybe it has something to do with my running an AMD Ryzen system.

The important thing - haven’t had a lockup on my system since I installed it (five days, a week ago, whatever.)

Previously with the nouveau drivers I was getting lockups every day, occasionally every other day, and sometimes twice a day - especially after I removed the original drivers as was suggested in this thread, which seemed to make it worse.

So much for Nvidia and nouveau.

Good deal on the card :) It should be faster with the improved GPU, and since you’re on a Ryzen.

I have the same problem, and I just want to cry instead of trying to solve a problem I have no clue about.