leap 42.2 "issues"

hello,

clean install of “leap 42.2” for a pretty old PC with GeForce-7900 and AMD 64 X2. xdm + wmaker.

  1. blacklisted “nouveau” and added the “official nVidia repo”:
zypper addrepo -f http://download.nvidia.com/opensuse/leap/42.2 nvidia

the driver failed to load due to the known “mtrr issues” with new (4.x.x) kernels. no X as a result…

  2. downloaded the official “NVIDIA-Linux-x86_64-304.132.run” installer, patched it to remove the “mtrr issue”, built and loaded the module. X started, but GLX/OpenGL works only for “root”. double-checked all possible links to “libgl/libglx”; no issues in Xorg.log. for regular users, “glxinfo”, “glxgears” and anything else OpenGL-related crashes, including “nvidia-settings” when the “OpenGL/GLX Information” menu entry is selected. users are in the groups “video”, “audio” and some others as required.

how to cure this? please help… “openSuSE 13.2” works just fine, but its support will soon be over… this seems to be a pure 42.2 “feature”…

  3. downloaded the current “4.8.10” kernel from “kernel.org” and built it. during/after
make modules_install

“depmod” failed to generate “modules.dep” and the other required files. manual invocation of “depmod” doesn’t help. as a result, “dracut” is unable to include the required drivers in the initrd, and the system is unable to boot with the newly compiled kernel. manual invocation of “dracut” with --add-drivers “filesystem and device required drivers” also failed (clearly it doesn’t see any modules because of the “depmod” failure).
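a quick way to confirm the symptom before blaming “dracut” is to check whether “modules.dep” exists and is non-empty for the new kernel. this is only a minimal sketch; the helper name check_modules_dep is made up for illustration, and the module directory has to match your own build’s version string.

```shell
# Hypothetical helper: report whether depmod produced a usable modules.dep
# in the given module directory (e.g. /lib/modules/<your-kernel-release>).
check_modules_dep() {
    moddir=$1
    if [ -s "$moddir/modules.dep" ]; then
        echo "ok: $moddir/modules.dep"
    else
        echo "missing or empty: $moddir/modules.dep"
        return 1
    fi
}

# Example call (the directory name depends on your local build):
#   check_modules_dep "/lib/modules/$(uname -r)"
```

if this reports a missing or empty file right after “make modules_install”, the problem is on the “depmod” side and “dracut” has nothing to work with.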

“openSuSE 13.2” - works without any flaws. please advise how to cure this case.

Thank You Very Much in advance.

Curious whether you tried the nouveau driver before you decided to blacklist it and install the nVidia drivers (or maybe you have a specific reason to use the nVidia driver?)

TSU

Hm, I’m not aware of any “known mtrr issues with new (4.x.x) kernels”.
The driver should normally work fine.

BUT:

  1. downloaded the official “NVIDIA-Linux-x86_64-304.132.run” installer, patched to remove “mtrr issue”, build, loaded. X started. GLX/OpenGL works only for “root”. double checked all possible links to “libgl/libglx”, no issues in Xorg.log. all is fine, but only “root” has a proper working OpenGL. for Users “glxinfo”, “glxgears” and anything OpenGL-related crashes, including “nvidia-settings” if we press “OpenGL/GLX Information” menu entry. Users are in groups “video”, “audio” and some others as they require.

That’s a known problem with the 304.132 driver, caused by a security fix.
nvidia is aware of it and is working on solving the problem.

See, e.g., bug 1003918 (“Nvidia 304.132 black screen on GeForce 6150SE nForce 430 (regression since 304.131?)”) and the openSUSE Forums thread “Nvidia not working for non-root user after upgrade to Leap 42.1” (Hardware).

The 304.131 version (still available for download at nvidia’s homepage) should work fine though.

THANKS! i’ll check and report here a bit later. sorry…

heh, the patch is trivial (search for “nvidia mtrr failed to build module”). most annoying is that “openSuSE-13.2” works just excellently here…

Curious whether you tried the nouveau driver before you decided to blacklist it and install the nVidia drivers (or maybe you have a specific reason to use the nVidia driver?)
just a matter of personal habit. i once lost an nVidia card to overheating. i won’t blame anyone except myself, but that’s enough reason to use the original driver if it’s available. there’s nothing else i can say.

nVidia driver 304.131 works.

short fairy-tale:

  1. download the “NVI*.run” file (driver) from nVidia’s official site;
  2. make it work if the “mtrr issue” is all you’ve got:

$ cd favorite_folder_for_nvidia
$ sh path_to/NVI*.run -x
$ cd NVI*/kernel
$ patch -p1 -i path_to/disable-mtrr.patch

patch itself:
$ cat disable-mtrr.patch


Author: Luca Boccassi <luca.boccassi@gmail.com>
Description: Disable MTRR on kernel >= 4.3
 From kernel 4.3 and newer (commit 2baa891e42d84) mtrr_add and mtrr_del are no
 longer exported. The Nvidia kernel shim still uses it as of 304.131, causing
 the module to error out when loading. Disable MTRR if running on 4.3 or greater
 until upstream fixes it.
--- a/nv-linux.h
+++ b/nv-linux.h
@@ -256,6 +256,15 @@
 #include <linux/seq_file.h>
 #endif
 
+/*
+ * As of version 304.131, os-agp.c and os-mtrr.c still use deprecated
+ * kernel APIs for mtrr which are no longer exported since 4.3, causing
+ * the module to error out when loaded.
+ */
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,3,0)
+#undef CONFIG_MTRR
+#endif
+
 #if !defined(NV_VMWARE) && defined(CONFIG_MTRR)
 #include <asm/mtrr.h>
 #endif

that’s it, folks :)
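the patch only does anything on kernels 4.3 and newer, so it can help to check the running kernel first. a minimal sketch follows; the function name needs_mtrr_patch is hypothetical, and it assumes the usual “major.minor…” layout of `uname -r`.

```shell
# Hypothetical helper: succeed (exit status 0) when the given kernel release
# is 4.3 or newer, i.e. when the #undef CONFIG_MTRR branch would be taken.
needs_mtrr_patch() {
    release=${1:-$(uname -r)}
    major=${release%%.*}            # text before the first dot
    rest=${release#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}          # leading digits of the remainder
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "${minor:-0}" -ge 3 ]; }
}

# Example call against the running kernel:
#   needs_mtrr_patch && echo "MTRR patch applies to this kernel"
```

called without an argument it looks at the running kernel; pass a release string such as 4.8.10 to check a kernel you are about to build for.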

post-install check:


> ldconfig -p | grep -i libGL.so  # check the links to point at nVidia driver


> rpm -ql `rpm -qf /usr/bin/Xorg` | grep -i libglx  # check the links to point at nVidia driver
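to see only where the libGL entries actually resolve, the `ldconfig -p` output can be filtered down to the target paths. just a sketch: libgl_paths is a made-up helper name, and it assumes the usual “name (flags) => path” layout of `ldconfig -p`.

```shell
# Hypothetical helper: read `ldconfig -p` style output on stdin and print
# only the resolved paths of the libGL entries.
libgl_paths() {
    grep -i 'libGL\.so' | sed 's/.*=> *//'
}

# Example call:
#   ldconfig -p | libgl_paths | xargs ls -l
```

the `ls -l` in the example call shows where each symlink ends up, which is the part to compare against the nVidia driver’s libraries.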

the only serious “issue” left (so far) is finding out how to build a custom kernel from sources. i’ll try to rebuild “kmod-compat” from 13.2 later. maybe that will be enough… any clues are appreciated.

Thanks!

as expected, here’s the “quick and dirty” fix for building custom (recent) kernels on “Leap-42.2” (it’s a VERY BAD “idea” to do it this way, but it works :) ):

# rpm -Uhv http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/kmod-18-2.2.2.x86_64.rpm http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/kmod-compat-18-2.2.2.x86_64.rpm http://download.opensuse.org/distribution/13.2/repo/oss/suse/x86_64/libkmod2-18-2.2.2.x86_64.rpm

after installing these packages from the 13.2 base system, “depmod” works properly and “dracut” produces an “initrd” that is capable of booting the system with a freshly compiled kernel. but…

N.B.: it’s only yet another example that the “openSuSE-13.2” base … well … works better… and is much more polished… somehow… somewhere :)
i didn’t check in depth! i can only say that right now my “oS-13.2” works with kernel-4.8.10 and the nVidia-304.132 driver without issues. “Leap-42.2” works with kernel-4.8.10 (fresh local build), but a quick compilation of nVidia-304.131 (for kernel-4.8.10) failed. it’s clear that there are no “simple solutions”, as we’re dealing with core components.

i’m a bit upset that… well… let’s hope that “openSuSE-13.2” will not be abandoned after Jan/2017, or that the core base system of “Leap” will be… at least on par with the one in “13.2”.

wolfi323 - thanks for the clue to use the 304.131 driver. maybe it’s worth mentioning that the X server options +iglx/-iglx control whether indirect GLX is allowed. or:

Section "ServerFlags"
Option "AllowIndirectGLX" "on" # or "off"  (or Option "AllowIndirectGLXProtocol" "boolean") #
EndSection

need to test for Users with nVidia-304.132

best regards and thanks for reading all of this :)

? end of story ?

after a brief look at the current kernel packages in OBS, i found that instead of “depmod” (all?) packages use “suse-module-tools”. i tried a similar approach with a fresh local build of “kernel-4.8.10” and noticed that all compressed modules are completely ignored. the result is the same: a bad “initrd” and a failure on boot unless we’re building a “pure monolith” :). hello good-old-days without initrd! what a funny time it was…

not sure how strict the requirement not to compress kernel modules is… compressed rpms win in size from such an approach, but regular users [with ssd] do not. here are the rough sizes of “system 4.4.27-2-default” vs a similar “local build with compression 4.8.10-2-86-64”:

 > du -sh /lib/modules/4*
218M    /lib/modules/4.4.27-2-default
64M     /lib/modules/4.8.10-2-86-64
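to see whether a given module tree is compressed at all, counting the plain and the compressed “.ko” files is more telling than the total size. a rough sketch; count_modules is a hypothetical helper name, and it only looks for the “.ko”, “.ko.xz” and “.ko.gz” suffixes.

```shell
# Hypothetical helper: count uncompressed (*.ko) and compressed
# (*.ko.xz, *.ko.gz) kernel modules below the given directory.
count_modules() {
    dir=$1
    plain=$(find "$dir" -type f -name '*.ko' | wc -l | tr -d ' ')
    packed=$(find "$dir" -type f \( -name '*.ko.xz' -o -name '*.ko.gz' \) | wc -l | tr -d ' ')
    echo "uncompressed=$plain compressed=$packed"
}

# Example call:
#   count_modules /lib/modules/4.4.27-2-default
```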

the “xz”-compressed “vmlinuz” is a bit smaller too, but the difference is insignificant compared to the default “bzip” compression.

is it a bug? it’s a clear regression compared with “13.2”, but it could also be “yet another feature” of “leap”. not a pleasant one, but … it’d be nice to hear from someone who knows exactly why some kernel options are restricted in “leap”… it’d also be nice to have a note about such restrictions in the “Release Notes”, at least.

back to nVidia-304.132:

OpenGL works for all users if we enable indirect GLX rendering. options in /etc/X11/xorg.conf may or may not work. it’s best to force “X” to start with the required parameter ("+iglx"). short how-to:

  1. find out how your X session is started. example with the option “+iglx” already set:

$ ps aux | grep X

> xinit /home/${USER}/.xinitrc -- /etc/X11/xinit/xserverrc :0 vt2 -auth /home/${USER}/.serverauth.1618 -nolisten tcp
> X :0 vt2 -auth /home/${USER}/.serverauth.1618 -nolisten tcp -nolisten tcp +iglx

  2. adjust the X control file by adding the required X option to the start parameters:

$ tail -n 4 /etc/X11/xinit/xserverrc

exec X $dspnum -auth $auth $args +iglx
else
    exec X $dspnum $args +iglx
fi

above i edited “/etc/X11/xinit/xserverrc” file. if you’re using display managers like “gdm”, “kdm”, etc. you may adjust their configs in a similar way.
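to verify after a restart that the option really reached the server, the X command line can be checked for “+iglx”. a small sketch; has_iglx is a made-up helper name, and the `ps` call in the example assumes the server process is named “X” (it may be “Xorg” on your setup).

```shell
# Hypothetical helper: succeed when "+iglx" is among the given arguments,
# e.g. the command line of the running X server.
has_iglx() {
    for arg in "$@"; do
        if [ "$arg" = "+iglx" ]; then
            return 0
        fi
    done
    return 1
}

# Example call (process name is an assumption, adjust as needed):
#   has_iglx $(ps -o args= -C X) && echo "indirect GLX enabled"
```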

thx

Thanks for the nice hints! Here is my additional input to reduce the work needed:

The above listed steps to get a running NVIDIA module can be reduced to one command:


#> bash NVIDIA-Linux-x86_64-304.132.run --apply-patch nv-linux.h-2.diff --target-os=Linux --target-arch=x86_64

With a small modification of the patch such that it works directly with the NVIDIA *.run file (save as nv-linux.h-2.diff):


Author: Luca Boccassi <luca.boccassi@gmail.com>
Description: Disable MTRR on kernel >= 4.3
 From kernel 4.3 and newer (commit 2baa891e42d84) mtrr_add and mtrr_del are no
 longer exported. The Nvidia kernel shim still uses it as of 304.131, causing
 the module to error out when loading. Disable MTRR if running on 4.3 or greater
 until upstream fixes it.
--- kernel/nv-linux.h
+++ kernel/nv-linux.h
@@ -256,6 +256,15 @@
 #include <linux/seq_file.h>
 #endif
 
+/*
+ * As of version 304.131, os-agp.c and os-mtrr.c still use deprecated
+ * kernel APIs for mtrr which are no longer exported since 4.3, causing
+ * the module to error out when loaded.
+ */
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,3,0)
+#undef CONFIG_MTRR
+#endif
+
 #if !defined(NV_VMWARE) && defined(CONFIG_MTRR)
 #include <asm/mtrr.h>
 #endif

nVidia’s “304.134” version works as it should (finally). the patch to disable “mtrr” is still required here, but all users are happy to get proper direct OpenGL. according to nVidia:

  • Added support for X.Org xserver ABI 23 (xorg-server 1.19)

it works :)

Good to hear that at least the OpenGL problem is fixed. :)

according to nVidia:

  • Added support for X.Org xserver ABI 23 (xorg-server 1.19)

That’s only relevant for Tumbleweed though (and future openSUSE Leap releases).
Leap 42.2 comes with X.Org server 1.18.3 (with video driver ABI 20).

And that wouldn’t affect OpenGL, but rather if the driver works at all.

FYI, the patch has been added to openSUSE’s nvidia packages in the repo a few days ago:
https://bugzilla.opensuse.org/show_bug.cgi?id=1017755

NVidia’s .run installer is still “broken” though.