openSUSE 15.2 and Nvidia install frustrations on Asus Zephyrus

Hi Folks,
I have been failing for a couple of weeks to install nvidia drivers on a new laptop (ASUS ROG Zephyrus 15.6, AMD Ryzen 7 4800HS, GTX 1660Ti).

I have it set up as a dual-boot and I have installed openSUSE 15.2 on it a number of different ways (bios secure boot enabled/disabled, grub secure boot enabled/disabled). Unfortunately I haven’t been keeping fastidious records to make sure I’ve tried all of the combinations.

A basic install of 15.2 works fine, but the trouble starts when I try to install the nvidia drivers. I’ve tried installing the drivers (with nouveau and nv blacklisted or not installed) using the nvidia repo (the easy way) and the .run installer (the hard way), with varying degrees of failure. It irks me that I can’t get past falling back to nouveau when all of the nvidia resources seem to be available!

I suspect my lack of expertise is the culprit here as I’m not a sysadmin/programmer. Nor am I clear on all of the implications of the newer security features. I’ve read through the SDB:NVIDIA installation notes several times and tried to follow the instructions verbatim; I’ve also read through the Nvidia README installation doc, although my eyes glazed over at several points… Is there anyone that can provide some additional guidance on where I’m going wrong?

Here are some additional details:

hwinfo --gfxcard


18: PCI 100.0: 0300 VGA compatible controller (VGA)
  [Created at pci.386]
  Unique ID: VCu0.ntnBJi4lw56
  Parent ID: mnDB.hF1hOH7Fi01
  SysFS ID: /devices/pci0000:00/0000:00:01.1/0000:01:00.0
  SysFS BusID: 0000:01:00.0
  Hardware Class: graphics card
  Model: "nVidia TU116M [GeForce GTX 1660 Ti Mobile]"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x2191 "TU116M [GeForce GTX 1660 Ti Mobile]"
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x171f 
  Revision: 0xa1
  Memory Range: 0xfb000000-0xfbffffff (rw,non-prefetchable,disabled)
  Memory Range: 0xb0000000-0xbfffffff (ro,non-prefetchable,disabled)
  Memory Range: 0xc0000000-0xc1ffffff (ro,non-prefetchable,disabled)
  I/O Ports: 0xf000-0xffff (rw,disabled)
  Memory Range: 0xfc000000-0xfc07ffff (ro,non-prefetchable,disabled)
  IRQ: 255 (no events)
  Module Alias: "pci:v000010DEd00002191sv00001043sd0000171Fbc03sc00i00"
  Driver Info #0:
    Driver Status: nouveau is not active
    Driver Activation Cmd: "modprobe nouveau"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #30 (PCI bridge)

32: PCI 500.0: 0300 VGA compatible controller (VGA)
  [Created at pci.386]
  Unique ID: Ddhb.SJNlrH_Yvq6
  Parent ID: JZZT.e+TNXSUNut3
  SysFS ID: /devices/pci0000:00/0000:00:08.1/0000:05:00.0
  SysFS BusID: 0000:05:00.0
  Hardware Class: graphics card
  Model: "ATI Renoir"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x1636 "Renoir"
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x171f 
  Revision: 0xc6
  Memory Range: 0xd0000000-0xdfffffff (ro,non-prefetchable)
  Memory Range: 0xe0000000-0xe01fffff (ro,non-prefetchable)
  I/O Ports: 0xd000-0xdfff (rw,disabled)
  Memory Range: 0xfc500000-0xfc57ffff (rw,non-prefetchable)
  IRQ: 255 (no events)
  Module Alias: "pci:v00001002d00001636sv00001043sd0000171Fbc03sc00i00"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #24 (PCI bridge)

Primary display adapter: #18

hwinfo --arch
Arch: x86_64/grub

inxi -Gxx

Graphics:  Device-1: NVIDIA TU116M [GeForce GTX 1660 Ti Mobile] vendor: ASUSTeK driver: N/A bus ID: 01:00.0 chip ID: 10de:2191 
           Device-2: Advanced Micro Devices [AMD/ATI] Renoir vendor: ASUSTeK driver: N/A bus ID: 05:00.0 chip ID: 1002:1636 
           Display: server: X.Org 1.20.3 compositor: kwin_x11 driver: ati unloaded: fbdev,modesetting,radeon,vesa 
           resolution: 1920x1080~77Hz s-dpi: 96 
           OpenGL: renderer: llvmpipe (LLVM 9.0.1 128 bits) v: 3.3 Mesa 19.3.4 compat-v: 3.1 direct render: Yes 

lspci -nnk

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116M [GeForce GTX 1660 Ti Mobile] [10de:2191] (rev a1)
    Subsystem: ASUSTeK Computer Inc. Device [1043:171f]
    Kernel modules: nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
--
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev c6)
    Subsystem: ASUSTeK Computer Inc. Device [1043:171f]
05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]

In the latest attempt to install the nvidia drivers, I removed xf86-video-nouveau and xf86-video-nv and booted with nomodeset into init 3, with Secure Boot enabled in both bios and grub2. (libdrm_nouveau2 was still installed.)

I tried installing the hard way using each of NVIDIA-Linux-x86_64-435.21.run and NVIDIA-Linux-x86_64-430.09.run (uninstalling and rebooting between each attempt). In this case I accepted most of the defaults except as follows:
register kernel module with DKMS: yes
but then the installation generated an error: “Unable to load the ‘nvidia-drm’ kernel module”

and examining the Xorg.0.log file after reboot seems to indicate the nvidia driver isn’t finding screens/devices.


[     6.442] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[     6.442] (II) FBDEV: driver for framebuffer: fbdev
[     6.442] (II) VESA: driver for VESA chipsets: vesa
[     6.444] xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
[     6.445] (EE) open /dev/dri/card0: No such file or directory
[     6.445] (WW) Falling back to old probe method for modesetting
[     6.445] (EE) open /dev/dri/card0: No such file or directory
[     6.445] (II) Loading sub module "fbdevhw"
[     6.445] (II) LoadModule: "fbdevhw"
[     6.446] (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
[     6.446] (II) Module fbdevhw: vendor="X.Org Foundation"
[     6.446]     compiled for 1.20.3, module version = 0.0.2
[     6.446]     ABI class: X.Org Video Driver, version 24.0
[     6.446] (EE) Unable to find a valid framebuffer device
[     6.446] (WW) Falling back to old probe method for fbdev
[     6.446] (II) Loading sub module "fbdevhw"
[     6.446] (II) LoadModule: "fbdevhw"
[     6.446] (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
[     6.446] (II) Module fbdevhw: vendor="X.Org Foundation"
[     6.446]     compiled for 1.20.3, module version = 0.0.2
[     6.446]     ABI class: X.Org Video Driver, version 24.0
[     6.446] (II) FBDEV(2): using default device
[     6.446] (EE) Screen 0 deleted because of no matching config section.
[     6.446] (II) UnloadModule: "modesetting"
[     6.446] (EE) Screen 0 deleted because of no matching config section.
[     6.446] (II) UnloadModule: "fbdev"
[     6.446] (II) UnloadSubModule: "fbdevhw"
[     6.446] (II) FBDEV(0): Creating default Display subsection in Screen section
    "Default Screen Section" for depth/fbbpp 24/32

Any recommendations?
Sorry for the long post and hopefully I figured out the code tags utility…

You have dual GPUs and an odd hardware setup. In most cases it is Intel + NVIDIA, which is called Optimus. In this case it is AMD plus NVIDIA. You may need to try suse-prime, which is used in the Intel + NVIDIA case to control which GPU is active. The problem is that the NVIDIA driver uses special X stack programs which are incompatible with other brands of GPU. Can you turn off the AMD GPU from the BIOS?

Thanks for your reply…
Not familiar with prime, so I looked it up. It says I can’t have Wayland or an xorg.conf (etc).
I don’t have an xorg.conf and a quick check suggests xorg.conf.d files do not have any offending assignments.
However, checking in yast2 there are a lot of Wayland references. Is that going to be a problem?
Also, I could not find any ref for turning off the AMD gpu in bios.

I also just checked the nvidia-installer.log, which I should have mentioned…
It says that it is unable to load the ‘nvidia-drm’ kernel module
Sorry, that’s probably relevant

Does anyone have any plausible solutions that will help me get the nvidia drivers installed on this system?
gogalthorp suggested it may be a conflict with the amd gpu drivers, but I have no clue how to circumvent this and suse-prime did not help.

On Wayland: normally it depends on which desktop you use. At this time Wayland on KDE + NVIDIA may not work completely well; Gnome does better.

Did you try to disable the AMD GPU?

Also, I guess you are installing the hard way; try installing from the NVIDIA repo.

Hi
@bcain, install from the nvidia run file (the hard way), see https://en.opensuse.org/SDB:NVIDIA_the_hard_way

The details depend on your desktop choice, but for sure use Xorg, not Wayland. To use the dGPU (the AMD one) on the GNOME DE, install switcheroo-control (it adds an option to use the dGPU when you right-click a desktop/menu icon), or run DRI_PRIME=1 <some app> from the command line, which should kick it into life.

thanks gogalthorp and Malcolm,
I could not find a way to disable the AMD gpu in bios and I am indeed trying to install nvidia the hard way, but now I’ve run afoul of the secure boot requirements.
I’ve seen some recommendations about turning secure boot off in bios and disabling it in the boot loader, but then others issue dire warnings about doing that.
Do I have to create some kind of special GPG key pair to move forward? I’ve not had any noteworthy experience with this level of security, or with signing/authenticating files.
This is definitely not a task for a novice who is faint of heart! I need a better cookbook than https://en.opensuse.org/SDB:NVIDIA_the_hard_way

Hi
Just disable secure boot via YaST and BIOS and move on to install the nvidia driver :wink: You can also go into the BIOS and select the efi entry for non-secure boot…
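
Before re-running the installer it can help to confirm what state Secure Boot is actually in. The following is only a minimal sketch: it assumes the mokutil tool is installed and that `mokutil --sb-state` prints a line containing "SecureBoot enabled" or "SecureBoot disabled". The helper name sb_state_from_text is hypothetical and not part of any openSUSE tooling.

```shell
# Hypothetical helper: classify the text printed by `mokutil --sb-state`.
# Reads that text on stdin and prints "enabled", "disabled" or "unknown".
sb_state_from_text() {
    text=$(cat)
    case "$text" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}

# Usage on the real machine (assumes mokutil is available):
#   mokutil --sb-state | sb_state_from_text
```

If this reports "enabled" after the change in YaST and the BIOS, the firmware setting probably did not stick and is worth re-checking before blaming the driver.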

Thanks Malcolm,
After disabling Secure Boot in bios and yast2, I ran the nvidia installation script successfully. I opted to allow it to rewrite the xorg.conf file and on reboot, all I got was a bsod.
I thought that this might be due to the xorg.conf file so I renamed it, which allowed getting to a console login.
I then checked the boot kernel params and there was a “nomodeset”, which I removed and rebooted without any effect (still getting to an init 3 login).
Any suggestions?
Cheers.
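
Whether "nomodeset" is really gone from the running kernel's command line can be checked after the reboot. This is a rough sketch, assuming the usual /proc/cmdline file; the helper name has_kernel_param is made up for illustration.

```shell
# Hypothetical helper: succeed if the given kernel parameter appears as a
# whole word in the kernel command line text read from stdin.
has_kernel_param() {
    param=$1
    # Put each parameter on its own line, then require an exact line match.
    tr -s ' \t' '\n' | grep -qxF -- "$param"
}

# Usage on the real machine:
#   has_kernel_param nomodeset < /proc/cmdline && echo "nomodeset is still set"
```

If the parameter is still reported after editing the boot entry, the change may have been made only for one boot and not saved in the bootloader configuration.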

Hi
Is the nouveau driver blacklisted? Have you run mkinitrd? You don’t need an xorg.conf file, so remove that if it exists.

Is the default set to graphical login?


systemctl set-default graphical.target

Are you using GNOME? If so ensure /etc/gdm/custom.conf is edited to set wayland to false (see the comments in the file).
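
The checks above can be scripted so they are easy to repeat after each attempt. This is a sketch under assumptions: the blacklist is expected as a plain "blacklist nouveau" line in a modprobe config file, and the GDM setting is expected as an uncommented WaylandEnable=false line. The helper names nouveau_is_blacklisted and gdm_wayland_disabled are hypothetical.

```shell
# Hypothetical helper: succeed if any of the given modprobe config files
# contains an active (uncommented) "blacklist nouveau" line.
nouveau_is_blacklisted() {
    grep -qsE '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$@"
}

# Hypothetical helper: succeed if the given GDM config file contains an
# active (uncommented) WaylandEnable=false line.
gdm_wayland_disabled() {
    grep -qsE '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$1"
}

# Usage on the real machine:
#   nouveau_is_blacklisted /etc/modprobe.d/*.conf || echo "nouveau is not blacklisted"
#   gdm_wayland_disabled /etc/gdm/custom.conf || echo "Wayland is still enabled in GDM"
#   systemctl get-default    # should print graphical.target
```

Commented-out lines (starting with #) deliberately do not count, since they have no effect.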

Hi Malcolm,
verified that nouveau was blacklisted in /etc/modprobe.d/nvidia.conf; added “options nouveau modeset=0” to the file
ran mkinitrd after nvidia installation, but just re-ran it
xorg.conf file is removed (not just renamed to xorg.conf.tmp)
ran systemctl set-default graphical.target
I am running Gnome; /etc/gdm/custom.conf did not have WaylandEnable=false enabled, so I uncommented that line

reboot
no change; still boots to console

Cheers.

Hi
From the console, can you show the output from;


/sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"
nvidia-smi

lspci -nnk output


01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116M [GeForce GTX 1660 Ti Mobile] [10de:2191] (rev a1)
    Subsystem: ASUSTeK Computer Inc. Device [1043:171f]
    Kernel driver in use: nvidia
    Kernel modules: nouveau, nvidia_drm, nvidia
--
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev c6)
    Subsystem: ASUSTeK Computer Inc. Device [1043:171f]
05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
    Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]

nvidia-smi output

Wed Dec  2 12:30:23 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   59C    P0     3W /  N/A |      0MiB /  5944MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

I also found the following framebuffer errors in Xorg.0.log – not sure if it is relevant

[   288.959] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[   288.959] (II) FBDEV: driver for framebuffer: fbdev
[   288.959] (II) VESA: driver for VESA chipsets: vesa
[   288.961] (WW) Falling back to old probe method for modesetting
[   288.961] (II) Loading sub module "fbdevhw"
[   288.961] (II) LoadModule: "fbdevhw"
[   288.962] (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
[   288.962] (II) Module fbdevhw: vendor="X.Org Foundation"
[   288.962]     compiled for 1.20.3, module version = 0.0.2
[   288.962]     ABI class: X.Org Video Driver, version 24.0
[   288.962] (EE) Unable to find a valid framebuffer device
[   288.962] (WW) Falling back to old probe method for fbdev
[   288.962] (II) Loading sub module "fbdevhw"
[   288.962] (II) LoadModule: "fbdevhw"
[   288.962] (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
[   288.962] (II) Module fbdevhw: vendor="X.Org Foundation"
[   288.962]     compiled for 1.20.3, module version = 0.0.2
[   288.962]     ABI class: X.Org Video Driver, version 24.0
[   288.962] (II) FBDEV(2): using default device
[   288.962] (II) Loading sub module "fb"
[   288.962] (II) LoadModule: "fb"
[   288.962] (II) Loading /usr/lib64/xorg/modules/libfb.so
[   288.962] (II) Module fb: vendor="X.Org Foundation"
[   288.962]     compiled for 1.20.3, module version = 1.0.0
[   288.962]     ABI class: X.Org ANSI C Emulation, version 0.4
[   288.962] (II) Loading sub module "wfb"
[   288.962] (II) LoadModule: "wfb"
[   288.962] (II) Loading /usr/lib64/xorg/modules/libwfb.so
[   288.962] (II) Module wfb: vendor="X.Org Foundation"
[   288.962]     compiled for 1.20.3, module version = 1.0.0
[   288.962]     ABI class: X.Org ANSI C Emulation, version 0.4
[   288.962] (II) Loading sub module "ramdac"
[   288.962] (II) LoadModule: "ramdac"
[   288.962] (II) Module "ramdac" already built-in
[   288.963] (EE) Screen 0 deleted because of no matching config section.
[   288.963] (II) UnloadModule: "modesetting"
[   288.963] (EE) Screen 0 deleted because of no matching config section.
[   288.963] (II) UnloadModule: "fbdev"
[   288.963] (II) UnloadSubModule: "fbdevhw"
[   288.963] (EE) 
Fatal server error:
[   288.963] (EE) Cannot run in framebuffer mode. Please specify busIDs        for all framebuffer devices
[   288.963] (EE) 
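
Long Xorg logs like the one above are easier to read when reduced to their error lines. The sketch below is one possible filter, assuming the usual log layout of an optional bracketed timestamp followed by the (EE) marker; the helper name xorg_errors is hypothetical.

```shell
# Hypothetical helper: print only the error lines of an Xorg log read on
# stdin. A line counts as an error when (EE) is its marker, i.e. it follows
# nothing but an optional timestamp, and some message text comes after it.
# Empty "(EE) " separator lines and the marker legend in the log header
# are therefore skipped.
xorg_errors() {
    grep -E '^[][ 0-9.]*\(EE\)[[:space:]]+[^[:space:]]'
}

# Usage on the real machine:
#   xorg_errors < /var/log/Xorg.0.log
```

Applied to the log above, this would leave the "Unable to find a valid framebuffer device", "Screen 0 deleted" and "Cannot run in framebuffer mode" lines.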

Hi
I suspect the issue is the AMD gpu;


05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev c6)
    Subsystem: ASUSTeK Computer Inc. Device [1043:171f]

No driver in use… If you’re going to run Leap, you need to look at installing the 5.10.x kernel from the kernel:HEAD repository.

Or, as a test create a Tumbleweed live USB and boot from that and see if things work.
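
Spotting a GPU with no "Kernel driver in use" line, as in the lspci output above, can be automated. This is a minimal sketch that parses `lspci -nnk` text; the helper name gpus_without_driver is hypothetical, and the device classes it looks for are the same three used in the egrep command earlier in the thread.

```shell
# Hypothetical helper: read `lspci -nnk` output on stdin and print the
# header line of every graphics device that has no
# "Kernel driver in use:" line in its record.
gpus_without_driver() {
    awk '
        # A line starting with a PCI address begins a new device record.
        /^([0-9a-f]+:)?[0-9a-f]+:[0-9a-f]+\.[0-9a-f] / {
            if (is_gpu && !has_driver) print dev
            dev = $0
            is_gpu = ($0 ~ /VGA compatible controller|3D controller|Display controller/)
            has_driver = 0
            next
        }
        /Kernel driver in use:/ { has_driver = 1 }
        END { if (is_gpu && !has_driver) print dev }
    '
}

# Usage on the real machine:
#   /sbin/lspci -nnk | gpus_without_driver
```

For the output posted above, this would print only the 05:00.0 Renoir line, matching the observation that the AMD GPU has no driver bound.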

Hi Malcolm,
Ok. I will investigate both of those options and I may be back.
Thanks for all of your help!
Stay healthy.
Cheers,
Brad

Hi
No worries. I actually suspect the primary device is the AMD gpu, hence no graphics… Is this an AMD CPU powered system?

Hi
AFAIK, Renoir GPUs need kernel >= 5.6, so kernel:stable should give you the right amdgpu driver…
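
Whether the running kernel meets that minimum can be checked from the release string. The following is just a sketch, assuming release strings of the form major.minor.patch-suffix as printed by `uname -r`; the helper name kernel_at_least is hypothetical, and the release strings in the usage comment are made-up examples.

```shell
# Hypothetical helper: succeed if kernel release $1 (for example the output
# of `uname -r`) is at least the major.minor version given in $2.
kernel_at_least() {
    have=$1
    want=$2
    have_major=${have%%.*}
    have_rest=${have#*.}
    have_minor=${have_rest%%[!0-9]*}   # keep only the leading digits
    want_major=${want%%.*}
    want_rest=${want#*.}
    want_minor=${want_rest%%[!0-9]*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Usage on the real machine:
#   kernel_at_least "$(uname -r)" 5.6 || echo "kernel is older than 5.6"
# Example release strings (illustrative only):
#   kernel_at_least 5.3.18-example-default 5.6   # fails
#   kernel_at_least 5.10.1-1-default 5.6         # succeeds
```

Comparing the numbers as integers avoids the trap of a plain string comparison, where "5.10" would sort before "5.6".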

This laptop is not shipped with openSUSE preinstalled.

Check in BIOS that Fast boot is disabled: https://rog.asus.com/us/support/FAQ/1044641 .

How to enter the BIOS configuration: https://www.asus.com/support/FAQ/1008829/ .

Hi Folks,
Thanks for all the feedback. I took Malcolm’s suggestion and reinstalled with TW, but I’m not sure it has improved my situation. I disabled bios Fast Boot very early on and yes, it is an AMD CPU (Ryzen 7).
I just tried the Easy Way using the nvidia repo and it appears that things work, but I’m not sure.
Here is the output of a couple of commands you’ve asked about in the past

lspci -nnk


01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116M [GeForce GTX 1660 Ti Mobile] [10de:2191] (rev a1)
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev c6)

nvidia-smi

Sun Dec 20 10:48:40 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 166...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P0    16W /  N/A |      0MiB /  5944MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Does this indicate that it at least recognizes the NVIDIA GPU, but that the GPU isn’t being used (from above: “0 GeForce GTX 166… Off” and “No running processes found”)?
When I run nvidia-settings, I get an error: “Error: Unable to load info from any available system”, which I naively interpret to indicate that I don’t have the nvidia driver in use.
BTW…I do have a graphical display on the laptop and an external HDMI monitor; however, I cannot get a 2nd external monitor to run off a USB-C port (it does work on the Windows side, so I don’t think it is a HW problem.)
Any guidance appreciated!
Cheers,
Brad
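
One way to narrow down whether the nvidia driver is in use is to list which GPU kernel modules are loaded. This is only a sketch that filters `lsmod` output; the helper name loaded_gpu_modules is hypothetical, and the set of module names it looks for is an assumption chosen for this hardware.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print the names of
# the GPU-related kernel modules that are loaded, one per line, sorted.
loaded_gpu_modules() {
    # NR > 1 skips the "Module  Size  Used by" header line.
    awk 'NR > 1 && $1 ~ /^(nvidia|nvidia_drm|nvidia_modeset|nvidia_uvm|nouveau|amdgpu)$/ { print $1 }' |
        LC_ALL=C sort
}

# Usage on the real machine:
#   lsmod | loaded_gpu_modules
```

Seeing both amdgpu and the nvidia modules listed would suggest that both drivers are loaded and that the remaining question is which GPU the X session renders on, rather than whether the driver installed at all.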