NV CUDA Driver and kernel 3.0.4

Greetings,

I spent about 3 hours yesterday trying to get my GeForce 210 with CUDA DevDriver running on the 3.0.4 kernel.
It was running w/o problems on 2.6.39-desktop (with dev-src for 2.6.37). [yeah, I know I should stick to what works - but I don’t like it]

After uninstalling the old drivers I tried several ways for an installation of the new ones:

  1. ./devdriver_4.0_linux_64_270.41.19.run on the new kernel (even manually navigating to the kernel-src folder)
  2. ./devdriver_4.0_linux_64_270.41.19.run on the old kernel, with new dev-src &c.
  3. XFree 280.16 driver on the new kernel with the new dev-src &c.

The results:

  3. Works now.
  2. Did not compile, telling me the source files were missing (obviously, as they were for 3.0.4 and not 2.6.39).
  1. Aborted the installation with the following (excerpt from nvidia-installer.log):
-> Performing CC sanity check with CC="cc".
-> Performing CC version check with CC="cc".
-> Kernel source path: '/lib/modules/3.0.4-2-default/source'
-> Kernel output path: '/lib/modules/3.0.4-2-default/build'
ERROR: If you are using a Linux 2.4 kernel, please make sure
       you either have configured kernel sources matching your
       kernel or the correct set of kernel headers installed
       on your system.

       If you are using a Linux 2.6 kernel, please make sure
       you have configured kernel sources matching your kernel
       installed on your system. If you specified a separate
       output directory using either the "KBUILD_OUTPUT" or
       the "O" KBUILD parameter, make sure to specify this
       directory with the SYSOUT environment variable or with
       the equivalent nvidia-installer command line option.

       Depending on where and how the kernel sources (or the
       kernel headers) were installed, you may need to specify
       their location with the SYSSRC environment variable or
       the equivalent nvidia-installer command line option.

As far as I can tell all the kernel sources are in place.

My current guess would be that the Kernel had some major change which the installer doesn’t like.

If anybody has any suggestions - I would be happy to try them out.

On 09/04/2011 08:46 AM, Aquinox wrote:
>
> Greetings,
>
> I’ve spent about 3 hours yesterday trying to get my GeForce 210 with
> CUDA DevDriver running on the 3.0.4 kernel.
> It was running w/o problems on 2.6.39-desktop (with dev-src for
> 2.6.37). [yeah, I know I should stick to what works - but I don’t like
> it]
>
> After uninstalling the old drivers I tried several ways for an
> installation of the new ones:
> 1. ./devdriver_4.0_linux_64_270.41.19.run on the new kernel (even
> manually navigating to the kernel-src folder)
> 2. ./devdriver_4.0_linux_64_270.41.19.run on the old kernel, with new
> dev-src&c.
> 3. XFree 280.16 driver on the new kernel with the new dev-src&c.
>
> The results:
>
> 3. Works now.
> 2. did not compile telling me the source files were missing (obviously,
> as they were for 3.0.4 and not 2.6.39)
> 1. Aborted the installation with the following (excerpt from
> nvidia-installer.log)
>
> Code:
> --------------------
> -> Performing CC sanity check with CC="cc".
> -> Performing CC version check with CC="cc".
> -> Kernel source path: '/lib/modules/3.0.4-2-default/source'
> -> Kernel output path: '/lib/modules/3.0.4-2-default/build'
> ERROR: If you are using a Linux 2.4 kernel, please make sure
> you either have configured kernel sources matching your
> kernel or the correct set of kernel headers installed
> on your system.

Their build system has not been updated for kernel 3.0. Somewhere in it is a
test for “2.6” in the kernel version string. When it doesn’t find it, it assumes
2.4. The test needs to be changed so that it no longer falls back to 2.4. I
extracted the NVIDIA-Linux-x86-270.41.19 driver and searched through its code,
but could not find any test of that kind. I conclude that it is in the binary of
the nvidia-installer code, which means that NVIDIA will have to make the change.

I suppose this kind of problem is why Ubuntu calls their current kernel 2.6.40,
not 3.0. If you build your own kernel, that would be an easy change to the
kernel’s makefile.
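
To illustrate the failure mode guessed at above, here is a minimal shell sketch. It is not taken from nvidia-installer; the function names naive_kernel_series and major_aware_kernel_series are made up for illustration, and the real check may look quite different.

```shell
#!/bin/sh
# Hypothetical illustration only: NOT code from nvidia-installer.
# Mimics a check that looks for "2.6" and otherwise assumes "2.4".
naive_kernel_series() {
    case "$1" in
        2.6.*) echo "2.6" ;;
        *)     echo "2.4" ;;   # anything unrecognized is treated as 2.4
    esac
}

# Variant that looks at the major number and treats 3.x like 2.6.
major_aware_kernel_series() {
    major=${1%%.*}             # text before the first dot, e.g. "3"
    case "$1" in
        2.4.*) echo "2.4" ;;
        2.6.*) echo "2.6" ;;
        *)
            # A non-numeric major makes the test fail and lands in "else".
            if [ "$major" -ge 3 ] 2>/dev/null; then
                echo "2.6"
            else
                echo "unsupported"
            fi
            ;;
    esac
}

naive_kernel_series "3.0.4-2-default"          # prints 2.4
major_aware_kernel_series "3.0.4-2-default"    # prints 2.6
```

With a check like the first function, a 3.0.4 kernel would end up on the 2.4 code path, which would match the wording of the error message in the log.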

Sounds plausible.

As for building a custom kernel, any good links? I’ve done it only once and don’t know if I did a good job back then, so if you have any links to guidelines and/or common optimizations for desktop environments feel free to give them to me.

Thanks. :)

I’m running kernel 3.0.4 on Tumbleweed, and NVIDIA’s NVIDIA-Linux-x86_64-280.11.run driver package installed without a problem. To be honest, I don’t quite get what’s happening in the posts above. From what it looks like (I’ve seen it in the past when I was messing with kernels), the installed kernel sources do not match the running kernel. Install the latest kernel by adding the /repositories/Kernel:/stable/standard repo, then make sure all kernel packages come from this repo. Download the NVIDIA driver from ftp://download.nvidia.com/XFree86/

Again, absolutely no issues installing or running.
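
A quick way to check whether the kernel tree for the running kernel is in place is sketched below. It only looks for the two paths that appear in the installer log (build and source under /lib/modules/<release>); the function name check_kernel_tree and its optional second parameter are my own invention, and passing this check does not guarantee that the tree is configured correctly.

```shell
#!/bin/sh
# check_kernel_tree RELEASE [MODULES_ROOT]
# Reports whether the "build" and "source" directories (or symlinks to
# directories) exist for RELEASE. MODULES_ROOT defaults to /lib/modules;
# the parameter exists only so the function can be pointed elsewhere.
check_kernel_tree() {
    release=$1
    root=${2:-/lib/modules}
    status=0
    for sub in build source; do
        if [ -d "$root/$release/$sub" ]; then
            echo "ok: $root/$release/$sub"
        else
            echo "missing: $root/$release/$sub"
            status=1
        fi
    done
    return $status
}

# Check the tree that belongs to the kernel that is running right now.
check_kernel_tree "$(uname -r)" || echo "kernel tree for $(uname -r) looks incomplete"
```

If both paths are reported as ok for the release printed by uname -r, a mismatch between the running kernel and the installed sources is less likely to be the cause.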

Really? I think it’s quite clear - but maybe it’s just me having sat on that so long.

To make it crystal clear:

  1. I do NOT care for the XFree86_64 driver.
  2. the XFree86_64 driver 280.13 (sry, not .16) runs OK on my system.
  3. Kernel, Sources and Syms match (3.0.4-2)

What I want to accomplish is:
Having the devdriver_4.0, which is required for GPGPU computing and GPGPU development, run on my OS 11.4 x64 with Kernel 3.0.4.

Hi
Both drivers contain the CUDA libraries. Have you installed the SDK and run some of the tests on the 280 driver?

If you extract either one and look at the readme.txt:


   o Two CUDA libraries (/usr/lib/libcuda.so.x.y.z, /usr/lib/libcuda.la);
     these libraries provide runtime support for CUDA (high-performance
     computing on the GPU) applications.

Otherwise, look at Chapter 7.0 (FAQs) of the documentation (in the html directory, after using the --extract-only option) and try this option:


sh devdriver_4.0_linux_64_270.41.19.run --add-this-kernel -aq

You’re right. The SDK examples work fine with the 280 driver.

And I think I kinda figured out where the snag in installing the devdriver_4.0 lies.
The “--add-this-kernel -aq” options did not do the trick. The result is the same error as described in my first post, except that I had to dig through the log to find out why the kernel module could not be compiled.

However, I’ve taken a look at “conftest.sh” in the kernel directory of the driver (after --extract-only).
Somewhere along the line there is indeed a check for 2.6 or 2.4 kernel.
My guess is that the file can be modified to let a 3.0.4 kernel pass, but I’m not good at shell programming.

On 09/05/2011 04:26 AM, Aquinox wrote:
>
> You’re right. The SDK examples work fine with the 280 driver.
>
> And I think I kinda figured out where the snag in installing the
> devdriver_4.0 lies.
> The “--add-this-kernel -aq” options did not do the trick, the result is
> the same error as described in my first post, except I had to look for
> it in the log to get the reason why the kernel could not be compiled.
>
> However, I’ve taken a look at “conftest.sh” in the kernel directory of
> the driver (after --extract-only).
> Somewhere along the line there is indeed a check for 2.6 or 2.4 kernel.
> My guess would be, that it is possible to modify the file to let a
> 3.0.4 kernel pass, but I’m not good at shell-programming.

The error has something to do with the CFLAGS variable used when the various
tests are compiled. The error message is:

In file included from /lib/modules/3.1.0-rc4-wl+/build/include/linux/sem.h:81:0,
                 from /lib/modules/3.1.0-rc4-wl+/build/include/linux/sched.h:72,
                 from /lib/modules/3.1.0-rc4-wl+/build/include/linux/utsname.h:35,
                 from conftest17458.c:5:
/lib/modules/3.1.0-rc4-wl+/build/include/linux/rcupdate.h: In function ‘__kfree_rcu’:
/lib/modules/3.1.0-rc4-wl+/build/include/linux/rcupdate.h:822:2: error: size of unnamed array is negative

If I get some time, I will try to chase it down. Of course, that particular
error could be a difference between 3.0 and 3.1.

I got part way, but I have no more time for the project.

Applying the patch file below will get you started. At least it knows you don’t
have a 2.4 kernel:

Index: NVIDIA-Linux-x86-270.41.19/kernel/conftest.sh
--- NVIDIA-Linux-x86-270.41.19.orig/kernel/conftest.sh
+++ NVIDIA-Linux-x86-270.41.19/kernel/conftest.sh
@@ -1497,6 +1497,12 @@ case "$6" in
         VERBOSE=$7
         FILE="linux/version.h"
         SELECTED_MAKEFILE=""
+        KERN_VER=$(uname -r | cut -b 1)
+        if [ -n "$KERN_VER" -a $KERN_VER -eq 3 ]; then
+               UTSNAME=""
+        else
+               UTSNAME="#include <linux/utsname.h>"
+        fi
 
         if [ -f $HEADERS/$FILE -o -f $OUTPUT/include/$FILE ]; then
             #
@@ -1506,7 +1512,7 @@ case "$6" in
             #
             echo "$CONFTEST_PREAMBLE
             #include <linux/version.h>
-            #include <linux/utsname.h>
+            $UTSNAME
             #if defined(TEST_2_4) && (LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0))
             #error "!KERNEL_2_4"
             #endif
Index: NVIDIA-Linux-x86-270.41.19/kernel/nv-linux.h
--- NVIDIA-Linux-x86-270.41.19.orig/kernel/nv-linux.h
+++ NVIDIA-Linux-x86-270.41.19/kernel/nv-linux.h
@@ -32,10 +32,8 @@
 # define KERNEL_2_4
 #elif LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 0)
 # error This driver does not support 2.5 kernels!
-#elif LINUX_VERSION_CODE < KERNEL_VERSION(2, 7, 0)
-# define KERNEL_2_6
 #else
-# error This driver does not support development kernels!
+# define KERNEL_2_6
 #endif
 
 #if defined(KERNEL_2_4)

I found a complete patch at http://www.nvnews.net/vbulletin/showthread.php?t=165000:

diff -urN work-OLD/kernel/conftest.sh work-NEW/kernel/conftest.sh
--- work-OLD/kernel/conftest.sh 2011-05-30 19:41:31.000000000 -0700
+++ work-NEW/kernel/conftest.sh 2011-05-30 19:47:23.000000000 -0700
@@ -76,7 +76,9 @@
 }
 
 build_cflags() {
-    BASE_CFLAGS="-D__KERNEL__ \
+    # Adding -Os optimizer option to work around rcupdate.h compiler bug, see here:
+    # http://choon.net/forum/read.php?21,82725
+    BASE_CFLAGS="-O2 -D__KERNEL__ \
 -DKBUILD_BASENAME=\"#conftest$$\" -DKBUILD_MODNAME=\"#conftest$$\" \
 -nostdinc -isystem $ISYSTEM"
 
diff -urN work-OLD/kernel/nv-linux.h work-NEW/kernel/nv-linux.h
--- work-OLD/kernel/nv-linux.h 2011-05-16 23:32:19.000000000 -0700
+++ work-NEW/kernel/nv-linux.h 2011-05-30 19:46:06.000000000 -0700
@@ -34,6 +34,9 @@
 # error This driver does not support 2.5 kernels!
 #elif LINUX_VERSION_CODE < KERNEL_VERSION(2, 7, 0)
 # define KERNEL_2_6
+#elif LINUX_VERSION_CODE >= KERNEL_VERSION(3, 0, 0)
+/* For compatibility, pretend all kernels 3.0.0 and higher are "2.6" */
+# define KERNEL_2_6
 #else
 # error This driver does not support development kernels!
 #endif
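
For anyone wondering why the unpatched nv-linux.h rejects a 3.x kernel: KERNEL_VERSION(a,b,c) is computed as (a << 16) + (b << 8) + c, so 3.0.4 is numerically larger than 2.7.0 and falls through to the "development kernels" branch. The shell sketch below redoes that arithmetic and walks the branches visible in the patch context above. The function names are hypothetical, and everything below 2.6.0 is lumped into one case because the top of the ladder is not shown in this thread.

```shell
#!/bin/sh
# Same arithmetic as the kernel's KERNEL_VERSION(a,b,c) macro.
kernel_version() {
    echo $(( ($1 << 16) + ($2 << 8) + $3 ))
}

# Branch order of the unpatched nv-linux.h, as far as it is visible
# in the patch context (hypothetical shell rendering).
classify_unpatched() {
    code=$(kernel_version "$1" "$2" "$3")
    if [ "$code" -lt "$(kernel_version 2 6 0)" ]; then
        echo "older than 2.6"
    elif [ "$code" -lt "$(kernel_version 2 7 0)" ]; then
        echo "KERNEL_2_6"
    else
        echo "error: development kernel"
    fi
}

# Same ladder with the extra ">= 3.0.0" branch from the nvnews patch.
classify_patched() {
    code=$(kernel_version "$1" "$2" "$3")
    if [ "$code" -lt "$(kernel_version 2 6 0)" ]; then
        echo "older than 2.6"
    elif [ "$code" -lt "$(kernel_version 2 7 0)" ]; then
        echo "KERNEL_2_6"
    elif [ "$code" -ge "$(kernel_version 3 0 0)" ]; then
        echo "KERNEL_2_6"
    else
        echo "error: development kernel"
    fi
}

kernel_version 2 7 0        # prints 132864
kernel_version 3 0 4        # prints 196612
classify_unpatched 3 0 4    # prints error: development kernel
classify_patched 3 0 4      # prints KERNEL_2_6
```

Both patches in this thread get 3.x onto the KERNEL_2_6 path; the nvnews one still rejects anything between 2.7.0 and 3.0.0, while the earlier one accepts everything from 2.6.0 upward.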