Thread: Nvidia driver permission problems

  1. #1

    Nvidia driver permission problems

    Hello!

    I'm building an openSUSE Leap 15.2 installer using kiwi-ng, with the Nvidia drivers installed and included in the built disk image. After installing the image I am experiencing permission problems with Nvidia and GL: the sddm-greeter cannot be displayed, and glxinfo only works when run with sudo. Adding the sddm user and normal users to the video group works around the issue, but that is not the real solution. At least it indicates that it is "only" a permission problem. Why are the dynamic permissions (ACLs) not being assigned by logind?

    Installed Nvidia packages
    Code:
    sudo zypper se -i nvidia
    Loading repository data...
    Reading installed packages...
    
    S  | Name                      | Summary                                                               | Type
    ---+---------------------------+-----------------------------------------------------------------------+--------
    i  | nvidia-computeG05         | NVIDIA driver for computing with GPGPU                                | package
    i  | nvidia-gfxG05-kmp-default | NVIDIA graphics driver kernel module for GeForce 600 series and newer | package
    i+ | nvidia-glG05              | NVIDIA OpenGL libraries for OpenGL acceleration                       | package
    i+ | x11-video-nvidiaG05       | NVIDIA graphics driver for GeForce 600 series and newer               | package
    An extract from journalctl for the sddm-greeter
    Code:
    Apr 20 14:34:07 localhost sddm-greeter[1162]: Failed to create OpenGL context for format QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize 24, redBufferSize -1, greenBufferSize -1, blueBufferSize -1, alphaBufferSize -1, stencilBufferSize 8, samples -1, swapBehavior QSurfaceFormat::DoubleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile  QSurfaceForma>
    Apr 20 14:34:07 localhost sddm-helper[1154]: [PAM] Closing session
    Apr 20 14:34:07 localhost sddm-helper[1154]: [PAM] Ended.
    Devices file permissions
    Code:
    ls -la /dev/nv*
    crw-rw----  1 root video 195,   0 Apr 20 14:34 /dev/nvidia0
    crw-rw----+ 1 root video 195, 255 Apr 20 14:34 /dev/nvidiactl
    crw-rw----+ 1 root video 195, 254 Apr 20 14:34 /dev/nvidia-modeset
    crw-rw----+ 1 root video 239,   0 Apr 20 14:34 /dev/nvidia-uvm
    crw-rw----+ 1 root video 239,   1 Apr 20 14:34 /dev/nvidia-uvm-tools
    crw-------  1 root root   10, 144 Apr 20 14:34 /dev/nvram
    
    ls -la /dev/dri/*
    crw-rw----+ 1 root video 226,   0 Apr 20 14:34 /dev/dri/card0
    crw-rw----+ 1 root video 226, 128 Apr 20 14:34 /dev/dri/renderD128
    
    /dev/dri/by-path:
    total 0
    drwxr-xr-x 2 root root  80 Apr 20 14:34 .
    drwxr-xr-x 3 root root 100 Apr 20 14:34 ..
    lrwxrwxrwx 1 root root   8 Apr 20 14:34 pci-0000:02:00.0-card -> ../card0
    lrwxrwxrwx 1 root root  13 Apr 20 14:34 pci-0000:02:00.0-render -> ../renderD128
    Dynamic permission
    Code:
    getfacl /dev/nv*
    getfacl: Removing leading '/' from absolute path names
    # file: dev/nvidia0
    # owner: root
    # group: video
    user::rw-
    group::rw-
    other::---
    
    # file: dev/nvidiactl
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---
    
    # file: dev/nvidia-modeset
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---
    
    # file: dev/nvidia-uvm
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---
    
    # file: dev/nvidia-uvm-tools
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---
    
    # file: dev/nvram
    # owner: root
    # group: root
    user::rw-
    group::---
    other::---
    
    getfacl /dev/dri/*
    getfacl: Removing leading '/' from absolute path names
    # file: dev/dri/by-path
    # owner: root
    # group: root
    user::rwx
    group::r-x
    other::r-x
    
    # file: dev/dri/card0
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---
    
    # file: dev/dri/renderD128
    # owner: root
    # group: video
    user::rw-
    user:myuser:rw-
    group::rw-
    mask::rw-
    other::---

    Kernel modules
    Code:
    lsmod | grep nvidia
    nvidia_drm             61440  2
    nvidia_modeset       1232896  3 nvidia_drm
    nvidia_uvm           1118208  0
    nvidia              34168832  66 nvidia_uvm,nvidia_modeset
    drm_kms_helper        229376  1 nvidia_drm
    drm                   544768  5 drm_kms_helper,nvidia_drm
    
    
    /sbin/lspci -nnk | grep -iA3 vga
    02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
            Subsystem: Hewlett-Packard Company Device [103c:1097]
            Kernel driver in use: nvidia
            Kernel modules: nouveau, nvidia_drm, nvidia
    Xorg.0.log: https://pastebin.pl/view/8c9af808

    I will follow up with output from loginctl and glxinfo tomorrow; I only have access via ssh as I post this. As mentioned above, I need sudo to get output from glxinfo, otherwise I get "X Error of failed request: BadValue".

  2. #2

    Re: Nvidia driver permission problems

    I noticed that re-installing nvidia-gfxG05-kmp-default (originally installed by kiwi) solved the problem, and I could see that I now got the correct dynamic permissions.
    Code:
    getfacl /dev/nvidia0 
    getfacl: Removing leading '/' from absolute path names 
    # file: dev/nvidia0 
    # owner: root 
    # group: video 
    user::rw- 
    user:sddm:rw- 
    group::rw- 
    mask::rw- 
    other::---
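
To spot which device nodes are still missing the per-user entry, the getfacl output can be filtered. This is only a rough sketch: the function name missing_user_acl is made up for illustration, and it assumes the plain getfacl text format shown above.

```shell
# Hypothetical helper: read getfacl output on stdin and print every
# "# file:" path that has no "user:<name>:" entry for the given user.
missing_user_acl() {
  user=$1
  awk -v u="$user" '
    /^# file: / {
      # Starting a new file block: report the previous one if it had no entry.
      if (f != "" && !ok) print f
      f = substr($0, 9)   # text after "# file: "
      ok = 0
      next
    }
    index($0, "user:" u ":") == 1 { ok = 1 }
    END { if (f != "" && !ok) print f }
  '
}
```

For example, `getfacl /dev/nvidia* /dev/dri/card0 2>/dev/null | missing_user_acl sddm` should list only the nodes that did not get an ACL for sddm.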
    


    At first I tried to compare the system before and after re-installing the package, but couldn't find any differences in the places I expected. Then I checked the install script run by nvidia-gfxG05-kmp-default, and it was pretty obvious what went wrong.

    Code:
    sudo zypper install --download-only nvidia-gfxG05-kmp-default
    
    rpm -q --scripts /var/cache/zypp/packages/NVIDIA/x86_64/nvidia-gfxG05-kmp-default-460.73.01_k5.3.18_lp152.19-lp152.37.1.x86_64.rpm
    
    ...
    # Create symlinks for udev so these devices will get user ACLs by logind later (bnc#1000625)
    mkdir -p /run/udev/static_node-tags/uaccess
    mkdir -p /usr/lib/tmpfiles.d
    ln -snf /dev/nvidiactl /run/udev/static_node-tags/uaccess/nvidiactl 
    ln -snf /dev/nvidia-uvm /run/udev/static_node-tags/uaccess/nvidia-uvm
    ln -snf /dev/nvidia-uvm-tools /run/udev/static_node-tags/uaccess/nvidia-uvm-tools
    ln -snf /dev/nvidia-modeset /run/udev/static_node-tags/uaccess/nvidia-modeset
    cat >  /usr/lib/tmpfiles.d/nvidia-logind-acl-trick-G05.conf << EOF
    L /run/udev/static_node-tags/uaccess/nvidiactl - - - - /dev/nvidiactl
    L /run/udev/static_node-tags/uaccess/nvidia-uvm - - - - /dev/nvidia-uvm
    L /run/udev/static_node-tags/uaccess/nvidia-uvm-tools - - - - /dev/nvidia-uvm-tools
    L /run/udev/static_node-tags/uaccess/nvidia-modeset - - - - /dev/nvidia-modeset
    EOF
    devid=-1
    for dev in $(ls -d /sys/bus/pci/devices/*); do 
      vendorid=$(cat $dev/vendor)
      if [ "$vendorid" == "0x10de" ]; then 
        class=$(cat $dev/class)
        classid=${class%00}
        if [ "$classid" == "0x0300" -o "$classid" == "0x0302" ]; then 
          devid=$((devid+1))
          ln -snf /dev/nvidia${devid} /run/udev/static_node-tags/uaccess/nvidia${devid}
          echo "L /run/udev/static_node-tags/uaccess/nvidia${devid} - - - - /dev/nvidia${devid}" >> /usr/lib/tmpfiles.d/nvidia-logind-acl-trick-G05.conf
        fi
      fi
    done
    ...
    The command driving the for loop, $(ls -d /sys/bus/pci/devices/*), does not give the same result on the kiwi build machine as on the installed system, so the nvidia${devid} entries do not necessarily match the target's GPU. That would explain why /dev/nvidia0 was the only Nvidia node without a user ACL in my first post. Running the two steps inside that for loop manually resolves the issue (I think even the last step alone is enough if the system is rebooted afterwards). I cannot say I completely understand how this configuration solves the ACL problem for logind; I thought the rules in /lib/udev/rules.d were enough, so if someone could explain this a bit I would be grateful. Otherwise, at least the problem has now been resolved.
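
For anyone who wants to redo the loop on the installed system instead of re-installing the package, here is a minimal sketch of the same two steps wrapped in a function. It is not the package's own script: the function name regen_nvidia_uaccess and the SYSFS_PCI, UACCESS_DIR and TMPFILES_CONF variables are names I made up so the paths can be overridden for a dry run, and the duplicate check on the config line is my addition. How to hook it into a kiwi image (first-boot script, one-shot unit, ...) is not shown here.

```shell
# Sketch: recreate the uaccess symlinks and tmpfiles.d lines for
# /dev/nvidiaN based on the PCI devices of the machine it runs on.
# SYSFS_PCI, UACCESS_DIR and TMPFILES_CONF are hypothetical overrides;
# the defaults are the paths used in the package script above.
regen_nvidia_uaccess() {
  sysfs_pci=${SYSFS_PCI:-/sys/bus/pci/devices}
  uaccess_dir=${UACCESS_DIR:-/run/udev/static_node-tags/uaccess}
  conf=${TMPFILES_CONF:-/usr/lib/tmpfiles.d/nvidia-logind-acl-trick-G05.conf}
  devid=-1
  mkdir -p "$uaccess_dir"
  for dev in "$sysfs_pci"/*; do
    [ -r "$dev/vendor" ] || continue
    # 0x10de is the NVIDIA PCI vendor id
    [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
    class=$(cat "$dev/class")
    # Strip the trailing "00": 0x0300 = VGA controller, 0x0302 = 3D controller
    case "${class%00}" in
      0x0300|0x0302)
        devid=$((devid+1))
        ln -snf "/dev/nvidia${devid}" "$uaccess_dir/nvidia${devid}"
        line="L $uaccess_dir/nvidia${devid} - - - - /dev/nvidia${devid}"
        # Append only if the exact line is not already present
        grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
        ;;
    esac
  done
}
```

With the default paths it has to run as root on the installed system (`regen_nvidia_uaccess`), followed by a reboot as described above. Pointing the three variables at a scratch directory first makes it possible to check what would be written.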