Today's update (4/19) broke graphical interface

Today’s update broke the graphical interface. My system boots into a character terminal. I believe it might have to do with an update of nvidia-open-driver-G06-signed-kmp-default to 550.76 for kernel 6.8.6_1-1.1. When I boot into kernel 6.8.5_1-1.5, which uses version 550.67, I get to the graphical interface with no problem. I’m running an NVIDIA GeForce RTX 3060. This is a guess on my part. However, my boot messages for the 6.8.6 version of the system end at:

<6>[    6.331016] snd_hda_codec_realtek hdaudioC0D0:    mono: mono_out=0x0
<6>[    6.331016] snd_hda_codec_realtek hdaudioC0D0:    inputs:
<6>[    6.331017] snd_hda_codec_realtek hdaudioC0D0:      Front Mic=0x19
<6>[    6.331018] snd_hda_codec_realtek hdaudioC0D0:      Rear Mic=0x18
<6>[    6.331019] snd_hda_codec_realtek hdaudioC0D0:      Line=0x1a

Whereas the 6.8.5 boot continues from that point with:

<6>[    3.818574] intel_rapl_msr: PL4 support detected.
<6>[    3.818743] intel_rapl_common: Found RAPL domain package
<6>[    3.818745] intel_rapl_common: Found RAPL domain core
<6>[    3.829099] input: HDA Intel PCH Front Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input14
<6>[    3.829141] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1f.3/sound/card0/input15
<6>[    3.829175] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1f.3/sound/card0/input16
<6>[    3.829207] input: HDA Intel PCH Line Out as /devices/pci0000:00/0000:00:1f.3/sound/card0/input17
<6>[    3.829231] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1f.3/sound/card0/input18
<6>[    3.829265] iwlwifi 0000:00:14.3: Detected Intel(R) Wi-Fi 6E AX211 160MHz, REV=0x430
<4>[    3.829312] thermal thermal_zone2: failed to read out thermal zone (-61)
<3>[    3.836490] iwlwifi 0000:00:14.3: WRT: Invalid buffer destination
<6>[    3.872293] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64  550.67  Release Build  (abuild@host)  Tue Mar 19 21:01:46 UTC 2024
<6>[    3.875885] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
<6>[    3.997653] iwlwifi 0000:00:14.3: WFPM_UMAC_PD_NOTIFICATION: 0x20
<6>[    3.997684] iwlwifi 0000:00:14.3: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
<6>[    3.997692] iwlwifi 0000:00:14.3: WFPM_AUTH_KEY_0: 0x90
<6>[    3.997701] iwlwifi 0000:00:14.3: CNVI_SCU_SEQ_DATA_DW9: 0x0
<6>[    3.998813] iwlwifi 0000:00:14.3: RFIm is deactivated, reason = 5
<6>[    3.998895] iwlwifi 0000:00:14.3: loaded PNVM version e28bb9d7
<6>[    4.015375] iwlwifi 0000:00:14.3: Detected RF GF, rfid=0x2010d000
<6>[    4.083955] iwlwifi 0000:00:14.3: base HW address: 70:d8:23:47:0b:1c
<6>[    5.457990] EXT4-fs (nvme1n1p6): mounted filesystem 510911d5-7a59-43ee-85cb-bb5aef977c23 r/w with ordered data mode. Quota mode: none.
<6>[    5.462876] EXT4-fs (nvme0n1p5): mounted filesystem 47474143-da9e-4c94-8946-f0527f328fec r/w with ordered data mode. Quota mode: none.
<6>[    5.468756] EXT4-fs (nvme0n1p6): mounted filesystem 7433fe13-2e60-400e-84f1-ff1e4494c93d r/w with ordered data mode. Quota mode: none.

You’ll note that the Nvidia Open kernel modesetting driver never gets loaded in the 6.8.6 boot.
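For anyone comparing two boots the same way, here is a minimal sketch that pulls the driver version out of the `nvidia-modeset` line of a kernel log. The function name `nvidia_modeset_version` is made up for illustration, and the pattern only assumes the message format shown in the 6.8.5 log above.

```shell
# Print the driver version from the first "nvidia-modeset: Loading ..." line
# read on stdin; print nothing if the log has no such line.
# Assumes the message format quoted in this thread (open kernel driver, 550.x).
nvidia_modeset_version() {
    sed -n 's/.*nvidia-modeset: Loading NVIDIA UNIX .*Kernel Mode Setting Driver for [^ ]*  *\([0-9][0-9.]*\) .*/\1/p' |
        head -n 1
}

# Usage (needs a readable kernel log):
#   dmesg | nvidia_modeset_version              # current boot
#   journalctl -k -b -1 | nvidia_modeset_version  # previous boot
```

Empty output for the 6.8.6 boot and a version number for the 6.8.5 boot would match what the two logs show.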

In my case, the display resolution dropped to 1024x768 and got stuck. The “Displays” panel within GNOME Settings does not allow any change. Someone needs to fix nvidia-open-driver-G06… immediately. I cannot do anything at a display resolution of 1024x768.
Thanks.

@bearymore Hi, is the 550.76 version of kernel-firmware-nvidia-gspx-G06 installed? Run `zypper se -si nvidia`

@hughsong Hi and welcome to the Forum :smile:
That’s not how it works in openSUSE :wink: If it’s an issue, the only way to reach the correct folks is via a bug report openSUSE:Submitting bug reports - openSUSE Wiki

FWIW on my test system I run the proprietary version via rpm, this is still at 550.67, on this system I use proprietary version via the run file 550.76 (still use a Tesla P4) and it’s all fine.

Is the driver for 6.8.6 installed?
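One way to answer that from the text console is to look for an NVIDIA module file under the module directory of the kernel in question. This is only a sketch: `has_nvidia_module` is a hypothetical helper, and it assumes the usual `/lib/modules/<kernel release>` layout.

```shell
# Succeed if any nvidia*.ko* file (or symlink) exists for the given kernel release.
# $1: kernel release, defaults to the running kernel (uname -r)
# $2: modules root, defaults to /lib/modules (override is mainly for testing)
has_nvidia_module() {
    kver=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    [ -n "$(find "$root/$kver" -name 'nvidia*.ko*' 2>/dev/null | head -n 1)" ]
}

# Usage:
#   has_nvidia_module && echo "module present for $(uname -r)"
```

A module file being present does not prove it loads, so this narrows the problem down rather than settling it.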

First we misuse developers for the first line help desk and then we complain that developers are too slow to do what they are responsible for - to develop …

Just a quick thing to be noted:

Repository ‘repo-non-free’ is invalid.
[NVIDIA:repo-non-free|Index of /opensuse/tumbleweed] Valid metadata not found at specified URL
History:

…as of 4/20 the repo seems to be having this issue while trying to refresh. May, or may not be related.

@Android_Gynous The OP is using the open Nvidia driver from the oss repository, I suspect a transient issue in your locale as no issues here (or yesterday) with the nvidia repo.

Here is what is installed:

S  | Name                                      | Type    | Version              | Arch   | Repository
---+-------------------------------------------+---------+----------------------+--------+----------------------
i+ | kernel-firmware-nvidia                    | package | 20240322-2.1         | noarch | Main Repository (OSS)
i+ | kernel-firmware-nvidia                    | package | 20240322-2.1         | noarch | repo-oss
i+ | kernel-firmware-nvidia-gspx-G06           | package | 550.67-1.2           | x86_64 | (System Packages)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 550.67-1.1           | x86_64 | (System Packages)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 550.54.14-1.1        | x86_64 | (System Packages)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 545.29.06-1.2        | x86_64 | (System Packages)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 545.29.06-1.1        | x86_64 | (System Packages)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 550.76-1.1           | x86_64 | Main Repository (OSS)
i+ | kernel-firmware-nvidia-gspx-G06           | package | 550.76-1.1           | x86_64 | repo-oss
i+ | libnvidia-egl-wayland1                    | package | 1.1.13-1.3           | x86_64 | Main Repository (OSS)
i+ | libnvidia-egl-wayland1                    | package | 1.1.13-1.3           | x86_64 | repo-oss
i+ | nvidia-compute-G06                        | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-compute-G06                        | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-compute-G06-32bit                  | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-compute-G06-32bit                  | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-compute-utils-G06                  | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-compute-utils-G06                  | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-gl-G06                             | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-gl-G06                             | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-gl-G06-32bit                       | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-gl-G06-32bit                       | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-open-driver-G06-signed-kmp-default | package | 550.67_k6.8.5_1-1.5  | x86_64 | (System Packages)
i+ | nvidia-open-driver-G06-signed-kmp-default | package | 550.76_k6.8.6_1-1.1  | x86_64 | Main Repository (OSS)
i+ | nvidia-open-driver-G06-signed-kmp-default | package | 550.76_k6.8.6_1-1.1  | x86_64 | repo-oss
i+ | nvidia-utils-G06                          | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-utils-G06                          | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-video-G06                          | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-video-G06                          | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | nvidia-video-G06-32bit                    | package | 550.67-20.1          | x86_64 | Nvidia Drivers
i+ | nvidia-video-G06-32bit                    | package | 550.67-20.1          | x86_64 | repo-non-free
i+ | openSUSE-repos-MicroOS-NVIDIA             | package | 20240412.89bd714-1.1 | x86_64 | Main Repository (OSS)
i+ | openSUSE-repos-MicroOS-NVIDIA             | package | 20240412.89bd714-1.1 | x86_64 | repo-oss
i+ | openSUSE-repos-Tumbleweed-NVIDIA          | package | 20240412.89bd714-1.1 | x86_64 | Main Repository (OSS)
i+ | openSUSE-repos-Tumbleweed-NVIDIA          | package | 20240412.89bd714-1.1 | x86_64 | repo-oss

What I see is that there are multiple versions of the kernel firmware. However, the drivers are all 550.67. 550.76 is only available for the kernel-firmware. I’m guessing that the .76 firmware is used with 6.8.6 but the .67 version is used with the previous kernel 6.8.5 and that might be why graphics run with 6.8.5 but not with 6.8.6.
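A listing this long is easier to read once it is reduced to one line per package and driver version. The sketch below does that for `zypper se -si nvidia` output. `list_g06_versions` is a made-up name, and the column layout is assumed to be the one shown in the table above.

```shell
# Read "zypper se -si" table output on stdin and print each installed G06
# package once per upstream driver version (the part before the first "-" or "_").
list_g06_versions() {
    awk -F'|' '
        /^i/ && $2 ~ /G06/ {
            name = $2; ver = $4
            gsub(/^ +| +$/, "", name)   # trim column padding
            gsub(/^ +| +$/, "", ver)
            sub(/[-_].*/, "", ver)      # 550.76_k6.8.6_1-1.1 becomes 550.76
            if (!seen[name, ver]++) print name, ver
        }'
}

# Usage:
#   zypper se -si nvidia | list_g06_versions
```

Run against the table above, this would list nvidia-open-driver-G06-signed-kmp-default at both 550.67 and 550.76, while the user space packages from the Nvidia repository appear only at 550.67.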

@bearymore So you have both proprietary and open installed… looks like those service files are an issue and added the Nvidia repo.
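Every package in that listing appears under two repository names, which suggests the same repository is configured twice. Here is a rough sketch for confirming that from `zypper lr -u`. The helper name `duplicate_repo_uris` is hypothetical, and it assumes the URI is the last column of the table.

```shell
# Read "zypper lr -u" table output on stdin and report every URI that is
# configured under more than one alias.
duplicate_repo_uris() {
    awk -F'|' '
        $1 ~ /^ *[0-9]+ *$/ {           # data rows start with the repo number
            uri = $NF; name = $2
            gsub(/^ +| +$/, "", uri)
            gsub(/^ +| +$/, "", name)
            if (uri in first) print uri ": " first[uri] " and " name
            else first[uri] = name
        }'
}

# Usage:
#   zypper lr -u | duplicate_repo_uris
```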

I’ve always used the proprietary drivers as free drivers were not available - hence the Nvidia repository. The utility files were on the system with the 6.8.5 kernel and it works fine. They don’t seem to be available on any other repository. The only Nvidia file that was updated in the zypper dup that hosed the system was nvidia-open-driver-G06-signed-kmp-default 550.67_k6.8.6_1. First it was deleted and then reinstalled. Right now I have version 1_1. That one change was, I think, the problem. Before that update I had 6.8.6 and it ran the graphical interface with no problem. I’d go back, but there seems to be no file available.

@bearymore Ahh, ok, well you need to remove the open drivers then add a lock for nvidia-open-driver-G06-signed-kmp-default. And you have the double up on repos…
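Spelled out as commands, that advice might look like the sketch below. It defaults to a dry run that only prints what it would do. `run` and `drop_open_kmp` are illustrative names, and `DRY_RUN` is a variable invented for this example.

```shell
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

drop_open_kmp() {
    pkg=nvidia-open-driver-G06-signed-kmp-default
    run sudo zypper remove "$pkg"    # uninstall the open kernel module package
    run sudo zypper addlock "$pkg"   # keep later updates from pulling it back in
}

# Usage:
#   drop_open_kmp              # prints the two zypper commands
#   DRY_RUN=0; drop_open_kmp   # actually runs them
```

`zypper locks` lists the active locks and `zypper removelock` undoes one if the open driver is wanted again later.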

nvidia’s open driver only contains a kernel module so the proprietary user space driver packages are actually required to be used together…

imo the real issue is that the nvidia repo isn’t always updated in time,
so the version of nvidia-open-driver-G06-signed-kmp-default can be out of sync with the other driver components. I’ve noticed this happening on my system a few times before.

idk why zypper does not block this invalid upgrade, so I have had to check this manually before every dup. Somewhat annoying.
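That manual check can be scripted. The sketch below compares only the upstream driver part of two package versions. `upstream_version` and `versions_match` are hypothetical helpers, and the version strings are assumed to follow the `550.76_k6.8.6_1-1.1` and `550.67-20.1` patterns seen in this thread.

```shell
# Strip everything from the first "-" or "_" so only the driver version remains.
upstream_version() {
    printf '%s\n' "${1%%[-_]*}"
}

# Succeed if two package versions belong to the same upstream driver release.
versions_match() {
    [ "$(upstream_version "$1")" = "$(upstream_version "$2")" ]
}

# Usage with installed packages (prints a warning when they differ):
#   kmp=$(rpm -q --qf '%{VERSION}\n' nvidia-open-driver-G06-signed-kmp-default | tail -n 1)
#   gl=$(rpm -q --qf '%{VERSION}\n' nvidia-gl-G06)
#   versions_match "$kmp" "$gl" || echo "kernel module $kmp and user space $gl are out of sync"
```

To check before a dup rather than after, the same comparison can be fed the candidate versions shown by `zypper se -s`.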

Malcolm, I think you may have solved the issue. I just eliminated all of the nvidia kernel firmware not from the nvidia repository, and added nvidia-driver-G06-kmp-default from the nvidia repository and eliminated the open one from the main repository. 6.8.6 came up with the graphical interface. However, I think lictex is correct, too. The open-driver-G06-signed-kmp-default is version 550.76. Everything from the nvidia repository is version 550.67.

I am nervous that I eliminated too much, namely all the kernel-firmware-nvidia-gspx-G06 files and the kernel-firmware-nvidia files. Here is what’s left. I hope this is a correct configuration.

S  | Name                             | Type    | Version                        | Arch   | Repository
---+----------------------------------+---------+--------------------------------+--------+---------------------------------
i+ | libnvidia-egl-wayland1           | package | 1.1.13-1.3                     | x86_64 | Main Repository (OSS)
i+ | libnvidia-egl-wayland1           | package | 1.1.13-1.3                     | x86_64 | repo-oss
i+ | nvidia-compute-G06               | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-compute-G06               | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-compute-G06-32bit         | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-compute-G06-32bit         | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-compute-utils-G06         | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-compute-utils-G06         | package | 550.67-20.1                    | x86_64 | repo-non-free
i  | nvidia-driver-G06-kmp-default    | package | 550.67_k6.7.9_1-20.1           | x86_64 | Nvidia Drivers
i  | nvidia-driver-G06-kmp-default    | package | 550.67_k6.7.9_1-20.1           | x86_64 | repo-non-free
i+ | nvidia-drivers-G06               | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-drivers-G06               | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-gl-G06                    | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-gl-G06                    | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-gl-G06-32bit              | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-gl-G06-32bit              | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-utils-G06                 | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-utils-G06                 | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-vaapi-driver              | package | 0.0.11+git20231224.f977766-4.6 | x86_64 | pallaswept (openSUSE_Tumbleweed)
i+ | nvidia-video-G06                 | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-video-G06                 | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | nvidia-video-G06-32bit           | package | 550.67-20.1                    | x86_64 | Nvidia Drivers
i+ | nvidia-video-G06-32bit           | package | 550.67-20.1                    | x86_64 | repo-non-free
i+ | openSUSE-repos-MicroOS-NVIDIA    | package | 20240412.89bd714-1.1           | x86_64 | Main Repository (OSS)
i+ | openSUSE-repos-MicroOS-NVIDIA    | package | 20240412.89bd714-1.1           | x86_64 | repo-oss
i+ | openSUSE-repos-Tumbleweed-NVIDIA | package | 20240412.89bd714-1.1           | x86_64 | Main Repository (OSS)
i+ | openSUSE-repos-Tumbleweed-NVIDIA | package | 20240412.89bd714-1.1           | x86_64 | repo-oss

@bearymore AFAIK the firmware is part of the proprietary install so those rpms are only needed to supplement the open driver. It also needs manual activation…

So I have two GPUs, one old and one new (well, I actually have three, but that’s for pci-vfio).

So I install manually via the run file and can see:


cat /proc/driver/nvidia/gpus/0000\:02\:00.0/information
Model: 		 Unknown
IRQ:   		 61
GPU UUID: 	 GPU-0f2a639c-3f7b-acc7-9d05-115ef6dcb51f
Video BIOS: 	 86.04.8c.00.10
Bus Type: 	 PCIe
DMA Size: 	 47 bits
DMA Mask: 	 0x7fffffffffff
Bus Location: 	 0000:02:00.0
Device Minor: 	 0
GPU Excluded:	 No

cat /proc/driver/nvidia/gpus/0000\:03\:00.0/information
Model: 		 NVIDIA T400
IRQ:   		 62
GPU UUID: 	 GPU-41dbedb2-a46d-4f41-4ac3-8a2dbdfea5b6
Video BIOS: 	 90.17.76.00.0b
Bus Type: 	 PCIe
DMA Size: 	 47 bits
DMA Mask: 	 0x7fffffffffff
Bus Location: 	 0000:03:00.0
Device Minor: 	 1
GPU Firmware: 	 550.76 <====
GPU Excluded:	 No

But I have manually added `options nvidia NVreg_EnableGpuFirmware=1` to a conf file.

@bearymore See Chapter 43. GSP Firmware as your gpu will not auto load this firmware anyway, just like my T400.
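In the two outputs above, the `GPU Firmware` line appears only for the card running GSP firmware. A small sketch to pull out just that field is below. `gsp_firmware_of` is a made-up name, and the field layout is assumed to match the output quoted in this post.

```shell
# Read one /proc/driver/nvidia/gpus/<bus id>/information file on stdin and
# print the "GPU Firmware" value, or "none" when the field is absent.
gsp_firmware_of() {
    awk -F': *' '
        /^GPU Firmware:/ { gsub(/[ \t]/, "", $2); print $2; found = 1 }
        END { if (!found) print "none" }'
}

# Usage, one line per GPU:
#   for f in /proc/driver/nvidia/gpus/*/information; do
#       printf '%s: ' "$f"; gsp_firmware_of < "$f"
#   done
```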

Thank you Malcolm. I have two questions left. First, is there any reason to prefer the open version of the drivers over the proprietary version or vice versa? Second, what conf file does options nvidia NVreg_EnableGpuFirmware=1 belong in and does it noticeably improve performance?

Again, thanks for all your help

@bearymore If I had a later offload gpu (I have a Tesla P4) I would give it a whirl to see. I ran it with just the Quadro T400 when it first came out and had power management issues.

My T400 is just for driving displays as opposed to a hard workout, so it probably does improve performance using the gsp firmware. You would need to do some benchmarking to see, eg blender-benchmark, maybe memtest vulkan.

I use a `/etc/modprobe.d/50-nvidia.conf` file containing:

blacklist nouveau
options nouveau modeset=0
##Enable NVIDIA GSP Firmware
options nvidia NVreg_EnableGpuFirmware=1
##Power Management
options nvidia NVreg_DynamicPowerManagement=0x02

I also run persistent power with a systemd service…
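To double-check which `nvidia` module options are actually configured across several modprobe files, something like the following sketch could help. `nvidia_module_options` is a hypothetical helper that reads modprobe.d-style text and ignores comment lines.

```shell
# Read modprobe.d-style configuration on stdin and print every option set
# for the nvidia module, one per line.
nvidia_module_options() {
    awk '$1 == "options" && $2 == "nvidia" {
        for (i = 3; i <= NF; i++) print $i
    }'
}

# Usage:
#   cat /etc/modprobe.d/*.conf | nvidia_module_options
```

After a reboot, the effective values should also show up in `/proc/driver/nvidia/params`, though I have not verified that on this particular setup.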
