Something broken about Nvidia 550.120-27.1, Gnome, switcherooctl (Optimus laptop)

With Nvidia driver 550.120-27.1 on Gnome, starting apps on the discrete GPU (Right Click + Launch using Discrete Graphics Card) doesn’t work: the app doesn’t show up in nvidia-smi the way it did with 550.107, as in:

LT-B:~ # nvidia-smi
Fri Oct  4 14:03:21 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 960M        Off |   00000000:01:00.0 Off |                  N/A |
| N/A   30C    P8             N/A /  200W |      35MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3488      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A      7789    C+G   /usr/bin/gnome-text-editor                     27MiB |
+-----------------------------------------------------------------------------------------+
LT-B:~ #
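
To repeat this check without eyeballing the table, here is a minimal sketch that pulls PID, type and process name out of the Processes table. It assumes the 550.x table layout shown above; `smi_processes` and `on_dgpu` are names made up for this example, not part of any Nvidia tool.

```shell
# Print "PID TYPE NAME" for each row of the nvidia-smi "Processes" table
# read from stdin. Assumes the 550.x table layout shown above.
smi_processes() {
    awk '$1 == "|" && $2 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $6 ~ /^[CG+]+$/ && NF >= 9 {
        print $5, $6, $7
    }'
}

# Succeed if a process whose name contains $1 is listed in the table on stdin.
on_dgpu() {
    smi_processes | awk -v name="$1" 'index($3, name) { found = 1 } END { exit !found }'
}
```

Usage would be something like `nvidia-smi | on_dgpu gnome-text-editor && echo "running on the dGPU"`.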

LibreOffice Calc refuses to use OpenCL.
I remember seeing a warning (“Couldn’t run a post-transaction script, see logs…”) during upgrade to 550.120, but couldn’t find anything meaningful in the logs.
Forced reinstall of all Nvidia packages didn’t help either.
Gnome-shell apparently used render offload and GLX apparently worked, so maybe there is some link missing or wrong in the CUDA or OpenCL areas.

I went back to 550.107 and locked the packages for the time being.
(Not really a request for help, more a heads-up to interested users)
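
For reference, locking can be done with `zypper addlock`. A small sketch follows; the function names and the `ZYPPER` override are hypothetical, added only so the commands can be dry-run, and which packages to lock is up to you.

```shell
# Lock the given packages so "zypper dup" leaves them alone.
# ZYPPER is a hypothetical override used for dry runs; by default the
# real zypper is called through sudo.
lock_pkgs() {
    ${ZYPPER:-sudo zypper} addlock "$@"
}

# Remove the locks again once a newer driver is worth trying.
unlock_pkgs() {
    ${ZYPPER:-sudo zypper} removelock "$@"
}
```

For example `lock_pkgs 'nvidia-*G06*'`, and `zypper locks` lists what is currently locked.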

Remove all the nvidia modules in /lib/modules/6.11.0-1-default/updates, then reinstall:

# Erase only the kernel module package, ignoring dependencies
sudo rpm -e nvidia-driver-G06-kmp-default --nodeps
# Install it again
sudo zypper in nvidia-driver-G06-kmp-default
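
The "remove all the nvidia modules" part is not covered by those two commands, so here is a sketch of how leftovers could be listed between the erase and the reinstall. The function name and its second parameter are assumptions for illustration, and running the check after `rpm -e` is my reading of the advice above.

```shell
# List NVIDIA module files left in a kernel's "updates" directory.
# $1 is the kernel release (defaults to the running kernel);
# $2 is the modules root, only overridden here for testing.
leftover_nvidia_modules() {
    krel=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    find "$root/$krel/updates" -type f -name 'nvidia*.ko*' 2>/dev/null | LC_ALL=C sort
}
```

If `leftover_nvidia_modules 6.11.0-1-default` still prints files after the package has been erased, those are the ones to delete by hand before reinstalling.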

How is it different from a forced reinstall (that didn’t work)?

This way was provided by the openSUSE maintainer of the Nvidia packages, Stefan.
https://bugzilla.opensuse.org/show_bug.cgi?id=1231227#c15
And it seems proven to work:
https://bugzilla.opensuse.org/show_bug.cgi?id=1231227#c19

Thanks for the reference. My problem looks a bit different, but going through the bug report I checked that with 550.107 some nvidia modules are included in the initrd:

LT-B:~ # lsinitrd /boot/initrd-6.11.0-1-default |grep nvidia
-rw-r--r--   1 root     root         1797 Sep 13 15:27 usr/lib/modprobe.d/50-nvidia-default.conf
-rw-r--r--   1 root     root           18 Aug 12 13:51 usr/lib/modprobe.d/nvidia-default.conf
-rw-r--r--   1 root     root        14037 Sep 13 15:27 usr/lib/modules/6.11.0-1-default/kernel/drivers/hid/hid-nvidia-shield.ko.zst
-rw-r--r--   1 root     root         3029 Sep 13 15:27 usr/lib/modules/6.11.0-1-default/kernel/drivers/usb/typec/altmodes/typec_nvidia.ko.zst
-rw-r--r--   1 root     root      2755432 Sep 13 15:27 usr/lib/modules/6.11.0-1-default/updates/nvidia-modeset.ko
-rw-r--r--   1 root     root      6298288 Sep 13 15:27 usr/lib/modules/6.11.0-1-default/updates/nvidia-uvm.ko
LT-B:~ #

while according to Stefan no kernel module should be included with 550.120.
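
A sketch to make that check repeatable: it filters an `lsinitrd` listing down to the driver modules under `updates/` and skips the in-tree ones such as hid-nvidia-shield. The function name is made up for this example.

```shell
# Read "lsinitrd" output on stdin and print the paths of NVIDIA driver
# modules that sit under .../updates/ (in-tree modules under .../kernel/
# and the modprobe.d files are ignored).
initrd_nvidia_modules() {
    awk '$NF ~ "/updates/nvidia[^/]*[.]ko" { print $NF }'
}
```

Run it as `lsinitrd /boot/initrd-6.11.0-1-default | initrd_nvidia_modules`; an empty result would match what Stefan describes for 550.120.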
Maybe the exact sequence of uninstall, (update initrd?), upgrade, update initrd (again?) etc. leaves some leftovers somewhere, and a full uninstall followed by a fresh install of the upgraded packages helps…
Currently 550.120 offers no gain to this system, so I’m happy with 550.107 for the time being and will check again with the next nvidia upgrade.

Just tried an upgrade to 550.120-28.1:

LT-B:~ # zypper se -si nvidia
Loading repository data...
Reading installed packages...

S  | Name                          | Type    | Version                | Arch   | Repository
---+-------------------------------+---------+------------------------+--------+----------------------
i  | libnvidia-egl-wayland1        | package | 1.1.16-1.1             | x86_64 | Main Repository (OSS)
i+ | nvidia-compute-G06            | package | 550.120-28.1           | x86_64 | NVIDIA
i+ | nvidia-compute-utils-G06      | package | 550.120-28.1           | x86_64 | NVIDIA
i+ | nvidia-driver-G06-kmp-default | package | 550.120_k6.11.0_1-28.1 | x86_64 | NVIDIA
i+ | nvidia-gl-G06                 | package | 550.120-28.1           | x86_64 | NVIDIA
i+ | nvidia-utils-G06              | package | 550.120-28.1           | x86_64 | NVIDIA
i+ | nvidia-video-G06              | package | 550.120-28.1           | x86_64 | NVIDIA
LT-B:~ #

and I’m back to the situation in the original post (no OpenCL, no right-click launch on the dGPU…).
Tried the remedy in post #2 to no avail.
Tried to rebuild the initrd in every way known to me, even removing the original file and building from scratch, and still no luck.
GL sort of works, even if glxinfo spits out something new to me:

bruno@LT-B:~> DRI_PRIME=1 glxinfo |grep renderer
glx: failed to create dri3 screen
failed to load driver: nouveau
MESA-INTEL: warning: Haswell Vulkan support is incomplete
    GLX_MESA_copy_sub_buffer, GLX_MESA_gl_interop, GLX_MESA_query_renderer, 
    GLX_MESA_query_renderer, GLX_MESA_swap_control, GLX_SGIS_multisample, 
Extended renderer info (GLX_MESA_query_renderer):
OpenGL renderer string: zink Vulkan 1.3(NVIDIA GeForce GTX 960M (NVIDIA_PROPRIETARY))
bruno@LT-B:~>
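
Side note: `DRI_PRIME=1` is the Mesa way of selecting the second GPU, which may be why nouveau and zink appear in that output. For the proprietary driver, Nvidia documents `__NV_PRIME_RENDER_OFFLOAD=1` together with `__GLX_VENDOR_LIBRARY_NAME=nvidia` for render offload. A small wrapper, where `nv_offload` is just a name chosen for this sketch:

```shell
# Run a command with Nvidia's PRIME render offload variables set.
nv_offload() {
    env __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia "$@"
}
```

Then `nv_offload glxinfo | grep "OpenGL renderer"` shows which renderer GLX picks when offload is requested that way.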

Video HW decoding apparently works.
Nvidia modules are loaded and work, but something changed between 550.107 and 550.120 that Gnome and libreoffice don’t like.

Just seen that the Nvidia driver was updated to 550.127.05, so “zypper dup” and… still broken.
But this time I tried harder, and since clinfo found no available platform because no libOpenCL1.so was found, I force-reinstalled libOpenCL1:

LT-B:~ # zypper in --force libOpenCL1

After that, clinfo found the Nvidia CUDA platform and LibreOffice Calc was able to use OpenCL again. Darktable was also able to find and use OpenCL, so I’m fine for my current needs.
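
For anyone hitting the same thing, a sketch of two quick checks. It assumes the loader shows up as `libOpenCL.so.1` in `ldconfig -p` and that `clinfo -l` prints one line starting with "Platform #" per platform; the function names are made up.

```shell
# Succeed if "ldconfig -p" output on stdin lists the OpenCL ICD loader.
has_opencl_loader() {
    grep -q 'libOpenCL\.so\.1'
}

# Count the platforms in "clinfo -l" output on stdin.
# Assumes each platform is printed as a line starting with "Platform #".
count_cl_platforms() {
    grep -c '^Platform #' || true
}
```

Use them as `ldconfig -p | has_opencl_loader || echo "loader missing"` and `clinfo -l | count_cl_platforms`. Running `rpm -V libOpenCL1` would be another way to see whether the package’s files have gone missing.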
What is still intriguing is that “Launch with Discrete Graphics Card” in Gnome is still missing.
To be doubly sure, I force-reinstalled switcheroo-control too:

LT-B:~ # zypper in --force switcherooctl

To be fair, it is possible to launch “foreign” apps like Firefox on the dGPU, but no way with “native” Gnome apps like totem, gnome-text-editor, nautilus, gnome-maps… none of them shows up as using the dGPU:

bruno@LT-B:~> nvidia-smi
Sat Oct 26 16:34:54 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 960M        Off |   00000000:01:00.0 Off |                  N/A |
| N/A   37C    P8             N/A /  200W |     216MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3437      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A      7107      C   ...b64/libreoffice/program/soffice.bin         27MiB |
|    0   N/A  N/A      8023      C   /usr/bin/darktable                             28MiB |
|    0   N/A  N/A      8225      G   /usr/lib64/firefox/firefox                    152MiB |
+-----------------------------------------------------------------------------------------+
bruno@LT-B:~>

OpenGL (glxinfo etc.) and even video decoding apparently work.
I’m not going to force-reinstall the whole Gnome DE just for the sake of testing, given that the apps I definitely need are OK now, but better insight from experts will be appreciated.

@OrsoBruno Hi, that’s because GNOME automatically detects the dGPU (Nvidia/Cuda) and should be using it for compute.

I now use the beta open driver and I also still see “Launch using Discrete Graphics Card”; the app starts but is not visible, so it needs some Wayland variable foo…

nvidia-smi
Sat Oct 26 10:48:29 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA T400                    Off |   00000000:02:00.0 Off |                  N/A |
| 50%   43C    P8             N/A /   31W |      23MiB /   2048MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2678      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A     64595    C+G   /usr/bin/gnome-text-editor                     17MiB |
+-----------------------------------------------------------------------------------------+

Versus a normal launch:

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2678      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A     64937    C+G   /usr/bin/gnome-text-editor                      2MiB |
+-----------------------------------------------------------------------------------------+

It may also be this: https://www.phoronix.com/news/GTK-4.16-Released (GTK 4.16 on Wayland, with the GSK renderer now being Vulkan)…

Also see https://docs.gtk.org/gtk4/running.html#gsk_renderer

Thanks Malcolm, that’s definitely it. When invoked with:

bruno@LT-B:~> GSK_RENDERER=ngl switcherooctl gnome-text-editor

Text Editor shows up in nvidia-smi:

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3437      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A     20248      G   gnome-text-editor                              13MiB |
+-----------------------------------------------------------------------------------------+

So nothing is broken.
But the “plain” invocation spits:

bruno@LT-B:~> gnome-text-editor
MESA-INTEL: warning: Haswell Vulkan support is incomplete
MESA-INTEL: warning: ../src/intel/vulkan_hasvk/anv_formats.c:752: FINISHME: support YUV colorspace with DRM format modifiers
MESA-INTEL: warning: ../src/intel/vulkan_hasvk/anv_formats.c:783: FINISHME: support more multi-planar formats with DRM modifiers

So everything works as intended, but maybe the new GTK 4.16 default is a bit ahead of my HW?

@OrsoBruno Yes, that would seem to be the case. You can set it as the default:

mkdir -p ~/.config/environment.d
echo "GSK_RENDERER=ngl" >> ~/.config/environment.d/gsk.conf

Logout/Login and see how that goes.

Also, Mesa 24.2.x is about to drop, so you might want to remove that setting once it’s installed and see if it helps…
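
Since `>>` appends a new line every time it is run, here is a sketch of a set/unset pair that keeps the file to one entry per key and makes the later removal easy. The function names are hypothetical; the default path is the one used above.

```shell
# Set KEY=VALUE in an environment.d file, replacing any earlier KEY line
# instead of appending a duplicate. $3 defaults to the file used above.
set_env_d() {
    key=$1
    value=$2
    file=${3:-$HOME/.config/environment.d/gsk.conf}
    mkdir -p "$(dirname "$file")"
    touch "$file"
    tmp=$(mktemp)
    grep -v "^$key=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Drop KEY again, e.g. once a newer Mesa is installed.
unset_env_d() {
    key=$1
    file=${2:-$HOME/.config/environment.d/gsk.conf}
    [ -f "$file" ] || return 0
    tmp=$(mktemp)
    grep -v "^$key=" "$file" > "$tmp" || true
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

So `set_env_d GSK_RENDERER ngl` now, and `unset_env_d GSK_RENDERER` later to go back to the GTK default; either way a logout/login is needed for it to take effect.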

Back to the old habit:

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     28947      G   /usr/bin/gnome-shell                            1MiB |
|    0   N/A  N/A     30156      G   /usr/bin/gnome-text-editor                     13MiB |
+-----------------------------------------------------------------------------------------+

Let’s see what happens with the next Mesa :wink:
