The GPU is unstable.
The GPU works right after the machine is turned on, but becomes unusable after a few minutes. Then, once the machine has gone into deep sleep mode, an error message appears.
I can no longer restart the machine, and it is impossible to shut it down via the graphical interface.
I ran the nvidia-bug-report.sh script.
Here are some excerpts from its output:
Component | Details
================================================================================
Vulkan Info | None
--------------------------------------------------------------------------------
NVIDIA SMI | NVIDIA-SMI version : 580.105.08
           | NVML version : 580.105
           | DRIVER version : 580.105.08
           | CUDA Version : 13.0
--------------------------------------------------------------------------------
NVIDIA GPU Details | NVIDIA GeForce RTX 3080 Laptop GPU, 580.105.08, 8192 MiB, 94.04.43.00.9F, 00000000:01:00.0, [N/A]
--------------------------------------------------------------------------------
NVIDIA Settings | None
--------------------------------------------------------------------------------
NVIDIA Fabric Manager | None
--------------------------------------------------------------------------------
NVIDIA Subnet Manager | None
--------------------------------------------------------------------------------
Mellanox Link | None
--------------------------------------------------------------------------------
InfiniBand Status | None
--------------------------------------------------------------------------------
InfiniBand Network Discovery | None
--------------------------------------------------------------------------------
NVIDIA MSE/NETIR Versions | None
--------------------------------------------------------------------------------
NVIDIA Switch Details | mst command not found
--------------------------------------------------------------------------------
NVIDIA NIC Details | None
--------------------------------------------------------------------------------
OS Details | None
--------------------------------------------------------------------------------
.........
*** /etc/os-release
*** ls: lrwxrwxrwx. 1 root root 21 2025-11-13 20:31:10.000000000 +0100 /etc/os-release -> ../usr/lib/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20251113"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20251113"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
# CPE 2.3 format, boo#1217921
CPE_NAME="cpe:2.3:o:opensuse:tumbleweed:20251113:*:*:*:*:*:*:*"
#CPE 2.2 format
#CPE_NAME="cpe:/o:opensuse:tumbleweed:20251113"
BUG_REPORT_URL="https://bugzilla.opensuse.org"
SUPPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo-Tumbleweed"
........
● nvidia-powerd.service - nvidia-powerd service
Loaded: loaded (/usr/lib/systemd/system/nvidia-powerd.service; enabled; preset: enabled)
Active: active (running) since Mon 2025-11-17 20:38:34 CET; 18min ago
Invocation: 082b14f58834406890ac60c028e3acc5
Main PID: 1030 (nvidia-powerd)
Tasks: 5 (limit: 37402)
CPU: 998ms
CGroup: /system.slice/nvidia-powerd.service
└─1030 /usr/bin/nvidia-powerd
nov. 17 20:38:35 dans nvidia-powerd[1030]: DBus Connection is established
nov. 17 20:38:35 dans nvidia-powerd[1030]: ERROR! DC power limits table is not supported
nov. 17 20:38:36 dans nvidia-powerd[1030]: ERROR! Failed to get SysPwrLimitGetInfo!!
nov. 17 20:38:36 dans nvidia-powerd[1030]: ERROR! Client (presumably SBIOS) has requested to disable Dynamic Boost DC controller
nov. 17 20:38:47 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
nov. 17 20:38:56 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
nov. 17 20:39:16 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
nov. 17 20:41:20 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
nov. 17 20:55:13 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
nov. 17 20:56:36 dans nvidia-powerd[1030]: ERROR! Exception NvPcfApi is not initialized
○ nvidia-persistenced.service - NVIDIA Persistence Daemon
Loaded: loaded (/usr/lib/systemd/system/nvidia-persistenced.service; disabled; preset: enabled)
Active: inactive (dead)
.........
nov. 16 10:50:04 dans kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 511
nov. 16 10:50:04 dans kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 580.105.08 Wed Oct 29 23:15:11 UTC 2025
nov. 16 10:50:05 dans kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 580.105.08 Wed Oct 29 22:15:26 UTC 2025
nov. 16 10:50:05 dans kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
nov. 16 10:50:05 dans systemd[1]: Started nvidia-powerd service.
nov. 16 10:50:05 dans nvidia-powerd[1020]: nvidia-powerd version:2.0 (build 1)
nov. 16 10:50:07 dans nvidia-powerd[1020]: DBus Connection is established
nov. 16 10:50:07 dans nvidia-powerd[1020]: ERROR! DC power limits table is not supported
nov. 16 10:50:07 dans nvidia-powerd[1020]: ERROR! Failed to get SysPwrLimitGetInfo!!
nov. 16 10:50:07 dans nvidia-powerd[1020]: ERROR! Client (presumably SBIOS) has requested to disable Dynamic Boost DC controller
nov. 16 10:50:07 dans kernel: [drm] Initialized nvidia-drm 0.0.0 for 0000:01:00.0 on minor 0
nov. 16 10:50:13 dans gnome-shell[1763]: Added device '/dev/dri/card0' (nvidia-drm) using atomic mode setting.
nov. 16 10:51:32 dans gnome-shell[3592]: Added device '/dev/dri/card0' (nvidia-drm) using atomic mode setting.
nov. 16 10:57:46 dans kernel: NVRM: GPU at PCI:0000:01:00: GPU-622a4aae-0147-82a2-9605-6c976230a1ee
.........
nov. 16 10:57:52 dans kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=8551, name=python, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) sequence 2943 (0x20800a81 0x4).
nov. 16 10:57:58 dans nvidia-powerd[1020]: ERROR! Failed to get AC Line status
nov. 16 10:57:58 dans kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=1020, name=nvidia-powerd, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) sequence 2944 (0x2080205a 0x4).
nov. 16 10:57:58 dans kernel: NVRM: Xid (PCI:0000:01:00): 154, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
nov. 16 11:18:13 dans suspend[11143]: nvidia-suspend.service
nov. 16 11:18:13 dans logger[11143]: <13>Nov 16 11:18:13 suspend: nvidia-suspend.service
nov. 16 11:18:18 dans kernel: nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress: 0x0000c67d:0 2:2:0:4040
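For anyone who wants to pull just the Xid events out of a log like the one above, here is a minimal sketch. The regular expression and the helper name `extract_xids` are my own assumptions based on the line format seen in these excerpts, not something provided by the NVIDIA tooling, and the sample lines are copied from the log above.

```python
import re

# Assumed line format, taken from the excerpts above, e.g.:
#   NVRM: Xid (PCI:0000:01:00): 119, pid=8551, name=python, Timeout ...
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+), (?P<detail>.*)"
)


def extract_xids(lines):
    """Return (pci_address, xid_code, detail) for every Xid line found."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("code")), match.group("detail"))
            )
    return events


# Sample lines copied from the journal excerpt in this post.
sample = [
    "nov. 16 10:57:52 dans kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=8551, "
    "name=python, Timeout after 6s of waiting for RPC response from GPU0 GSP! "
    "Expected function 76 (GSP_RM_CONTROL) sequence 2943 (0x20800a81 0x4).",
    "nov. 16 10:57:58 dans nvidia-powerd[1020]: ERROR! Failed to get AC Line status",
    "nov. 16 10:57:58 dans kernel: NVRM: Xid (PCI:0000:01:00): 154, GPU recovery "
    "action changed from 0x0 (None) to 0x1 (GPU Reset Required)",
]

for pci, code, detail in extract_xids(sample):
    # Print a short summary: device, Xid number, start of the message.
    print(pci, code, detail[:40])
```

The same function could be fed the lines of a saved journal dump instead of the `sample` list, which makes it easier to see in which order the Xid 119 and Xid 154 events occur.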
Any ideas?
D.

