Thread: CUDA 10.1 Issues, and workaround

  1. #1

    CUDA 10.1 Issues, and workaround

    I have two workstations, both of which were running Leap 15.1. I upgraded the first workstation in place to Leap 15.2, confirmed that CUDA still worked (via Blender 2.83), and took that success as justification for a clean install of Leap 15.2 on my primary workstation. However, CUDA did not work after the clean install: Blender could not identify any CUDA-capable devices, despite all other tests succeeding.

    I use Ansible to configure my workstations, so I know for certain that the configuration was consistent.

    NVIDIA-SMI output:
    Code:
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce RTX 2070    Off  | 00000000:01:00.0  On |                  N/A |
    | 29%   29C    P8    23W / 225W |    774MiB /  7974MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+
    NVCC output:
    Code:
    ~> /usr/local/cuda/bin/nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Fri_Feb__8_19:08:17_PST_2019
    Cuda compilation tools, release 10.1, V10.1.105
    Confirmed NVIDIA drivers were installed fine. Even played some Portal 2.

    Running Blender with the --debug-cycles flag produced this output:
    Code:
    I0707 00:58:13.995280 11815 blender_python.cpp:191] Debug flags initialized to:
    CPU flags:
      AVX2       : True
      AVX        : True
      SSE4.1     : True
      SSE3       : True
      SSE2       : True
      BVH layout : BVH8
      Split      : False
    CUDA flags:
      Adaptive Compile : False
    OptiX flags:
      CUDA streams : 1
    OpenCL flags:
      Device type    : ALL
      Debug          : False
      Memory limit   : 0
    ...
    I0707 00:59:02.191576 11815 device_cuda.cpp:41] CUEW initialization succeeded
    I0707 00:59:02.397126 11815 device_cuda.cpp:43] Found precompiled kernels
    CUDA cuInit: Unknown error
    I0707 00:59:03.931155 11815 device_opencl.cpp:48] CLEW initialization succeeded.


    Saw the same results in Blender 2.82 and 2.9 nightly. Again, these same tests worked FINE on the workstation that was in-place upgraded. And it rendered fine. But not on the workstation that was clean installed.

    Searching found me this discussion. Following the advice of those posters, I compiled the sample code and got the same error as them running the deviceQuery code sample:

    Code:
    ./deviceQuery Starting...
     CUDA Device Query (Runtime API) version (CUDART static linking)
    cudaGetDeviceCount returned 999
    -> unknown error
    Result = FAIL


    But this is where it gets weird. If deviceQuery is run as an elevated user ONCE, CUDA starts working correctly for non-elevated users until the next reboot.

    Code:
    sudo ./deviceQuery
    ./deviceQuery Starting...
     CUDA Device Query (Runtime API) version (CUDART static linking)
    Detected 1 CUDA Capable device(s)
    Device 0: "GeForce RTX 2070"
      CUDA Driver Version / Runtime Version          10.2 / 10.1
      CUDA Capability Major/Minor version number:    7.5
      Total amount of global memory:                 7974 MBytes (8361672704 bytes)
      (36) Multiprocessors, ( 64) CUDA Cores/MP:     2304 CUDA Cores
      GPU Max Clock rate:                            1815 MHz (1.81 GHz)
      Memory Clock rate:                             7001 Mhz
      Memory Bus Width:                              256-bit
      L2 Cache Size:                                 4194304 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  1024
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Device supports Compute Preemption:            Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1
    Result = PASS


    I experimented with disabling AppArmor, ensured my user was a member of the video group (and rebooted), and upgraded to CUDA 10.2. But I can't explain why running this as root makes it work for all users, or why this occurs on a clean install but not on an upgraded install.

    Any thoughts on how I can debug this further? I've got the workaround, but would prefer a solution.
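    One way to narrow this down, as a rough sketch rather than a confirmed diagnosis: snapshot the loaded NVIDIA kernel modules and the /dev/nvidia* device nodes before and after the elevated run, then diff the two. The helper name snapshot_nvidia and its two optional arguments are hypothetical; the arguments exist only so the function can be pointed at fixtures instead of the live /proc/modules and /dev.

```shell
#!/bin/sh
# Sketch only: snapshot_nvidia is a hypothetical helper, not part of any NVIDIA tool.
# It lists loaded kernel modules whose name starts with "nvidia" and the
# permissions/ownership of every nvidia* node in the device directory.
snapshot_nvidia() {
    modules_file="${1:-/proc/modules}"   # assumption: Linux /proc/modules format
    dev_dir="${2:-/dev}"

    echo "== modules =="
    # The first column of /proc/modules is the module name.
    awk '$1 ~ /^nvidia/ { print $1 }' "$modules_file" | sort

    echo "== device nodes =="
    for node in "$dev_dir"/nvidia*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        if [ -e "$node" ]; then
            printf '%s %s\n' \
                "$(ls -ld "$node" | awk '{ print $1, $3 ":" $4 }')" \
                "$(basename "$node")"
        fi
    done
}

# Intended use around the workaround (not executed here):
#   snapshot_nvidia > before.txt
#   sudo ./deviceQuery
#   snapshot_nvidia > after.txt
#   diff before.txt after.txt
```

    Anything that shows up only in after.txt (a module or a device node) is a candidate for what the root run sets up and the unprivileged run cannot.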

  2. #2
    Join Date
    Jun 2008
    Location
    Podunk
    Posts
    29,629
    Blog Entries
    15

    Re: CUDA 10.1 Issues, and workaround

    Hi and welcome to the Forum
    Upgrade to cuda 11, it has the later driver (as in 450.x not 440.x series), kernel and gcc support.....
    Last edited by malcolmlewis; 06-Jul-2020 at 08:38.
    Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
    SUSE SLE, openSUSE Leap/Tumbleweed (x86_64) | GNOME DE
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below... Thanks!

  3. #3

    Re: CUDA 10.1 Issues, and workaround

    Quote Originally Posted by malcolmlewis View Post
    Hi and welcome to the Forum
    Upgrade to cuda 11, it has the later driver (as in 450.x not 440.x series), kernel and gcc support.....
    CUDA 11 is still a release candidate (RC).

    To OP: try reinstalling the NVIDIA drivers and then CUDA.
    The upgrade process recommends uninstalling proprietary drivers before performing the upgrade.

  4. #4

    Re: CUDA 10.1 Issues, and workaround

    Quote Originally Posted by Svyatko View Post
    CUDA 11 is a RC.

    To OP: try to reinstall Nvidia drivers and then CUDA.
    Upgrade process recommends uninstalling proprietary drivers before performing upgrade.
    Hi
    Are you using cuda? The download for openSUSE is the 11 version......
    https://developer.nvidia.com/cuda-do...e=runfilelocal

    Code:
    nvidia-smi 
    
    Tue Jul  7 09:22:40 2020       
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 450.51       Driver Version: 450.51       CUDA Version: 11.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  GeForce GT 710      Off  | 00000000:00:03.0 N/A |                  N/A |
    | 40%   44C    P8    N/A /  N/A |     98MiB /   978MiB |     N/A      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    
    nvcc --version
    
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Wed_May__6_19:09:25_PDT_2020
    Cuda compilation tools, release 11.0, V11.0.167
    Build cuda_11.0_bu.TC445_37.28358933_0
    
    /data/applications/cuda/cuda-samples-master/Samples/deviceQuery/deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: "GeForce GT 710"
      CUDA Driver Version / Runtime Version          11.0 / 11.0
      CUDA Capability Major/Minor version number:    3.5
      Total amount of global memory:                 979 MBytes (1026490368 bytes)
      ( 1) Multiprocessors, (192) CUDA Cores/MP:     192 CUDA Cores
      GPU Max Clock rate:                            954 MHz (0.95 GHz)
      Memory Clock rate:                             800 Mhz
      Memory Bus Width:                              64-bit
      L2 Cache Size:                                 524288 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
      Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Device supports Compute Preemption:            No
      Supports Cooperative Kernel Launch:            No
      Supports MultiDevice Co-op Kernel Launch:      No
      Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 3
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 11.0, NumDevs = 1
    Result = PASS
    
    cat /etc/os-release 
    
    NAME="openSUSE Leap"
    VERSION="15.2"
    ID="opensuse-leap"
    ID_LIKE="suse opensuse"
    VERSION_ID="15.2"
    PRETTY_NAME="openSUSE Leap 15.2"
    ANSI_COLOR="0;32"
    CPE_NAME="cpe:/o:opensuse:leap:15.2"
    BUG_REPORT_URL="https://bugs.opensuse.org"

  5. #5

    Re: CUDA 10.1 Issues, and workaround

    Blender support for CUDA 11 is still quite preliminary; 10.1 and 10.2 are still the recommended versions.

    This seems to be a modprobe issue with nvidia-uvm and nvidia-uvm-tools. More information here
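    For anyone wanting to check the same thing on their own machine, here is a minimal sketch. It assumes the module shows up as nvidia_uvm in /proc/modules and that its device node is /dev/nvidia-uvm. The helper name check_uvm is hypothetical, and both arguments are optional overrides so the check can be run against fixtures.

```shell
#!/bin/sh
# Sketch only: check_uvm is a hypothetical helper, not an NVIDIA-provided tool.
# It returns 0 when the nvidia_uvm module is loaded AND its device node exists,
# and 1 otherwise, printing one status line for each check.
check_uvm() {
    modules_file="${1:-/proc/modules}"   # assumption: Linux /proc/modules format
    node="${2:-/dev/nvidia-uvm}"         # assumption: default UVM device node path
    rc=0

    # Module names in /proc/modules use underscores, followed by a space.
    if grep -q '^nvidia_uvm ' "$modules_file"; then
        echo "nvidia_uvm: loaded"
    else
        echo "nvidia_uvm: NOT loaded (try: sudo modprobe nvidia-uvm)"
        rc=1
    fi

    if [ -e "$node" ]; then
        echo "$node: present"
    else
        echo "$node: missing"
        rc=1
    fi

    return "$rc"
}

# Intended use right after a reboot, before anything has run as root:
#   check_uvm || echo "UVM is not set up for unprivileged users yet"
```

    If this fails after a fresh boot but passes after the sudo ./deviceQuery run, that would be consistent with the module only being loaded on the elevated run.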

  6. #6

    Re: CUDA 10.1 Issues, and workaround

    Quote Originally Posted by MrPendulum View Post
    Blender support for CUDA 11 is still quite preliminary, 10.1 and 10.2 still the recommended versions.

    This seems to be a modprobe issue with nvidia-uvm and nvidia-uvm-tools. More information here
    Hi
    Looks like you and the bug report have it sorted...

    My workflow for NVIDIA has always been the hard way, because the packaged drivers for Leap and Tumbleweed (my primary desktop) lag behind. For SLES and SLED I use the repos, though, since I still need older card support there (no CUDA needed). CUDA 11 has better gcc support as well...
