NVIDIA 319.32 driver error, or a BIOS problem?

I’m installing the NVIDIA GPU driver v.319.32 (for my PNY K20c GPU) under openSUSE 12.3 (kernel 3.7.10-1.1)
on a Supermicro X9SCI motherboard (BIOS v. 2.0a). The problem may be in the BIOS (the relevant /var/log/messages extracts are at the bottom of this message) or in the driver itself.

The necessary steps for removing the nouveau driver were performed, and the nouveau kernel module isn’t loaded.
But compiling the nvidia.ko kernel interface (by running the standard NVIDIA-Linux-x86_64-319.32.run installer) gives an error:

----------------(from nvidia-installer log)---------------------------
KBUILD_SRC=/usr/src/linux-3.7.10-1.1
KBUILD_EXTMOD="/tmp/selfgz1723/NVIDIA-Linux-x86_64-319.32/kernel" -f /usr/src/linux-3.7.10-1.1/Makefile
modules
test -e include/generated/autoconf.h -a -e include/config/auto.conf || (
echo >&2;
echo >&2 " ERROR: Kernel configuration is invalid.";
echo >&2 " include/generated/autoconf.h or include/config/auto.conf are missing.";
echo >&2 " Run ‘make oldconfig && make prepare’ on kernel src to fix it.";
echo >&2 ;
/bin/false)

But both files are present, and the kernel preparation steps (make oldconfig && make prepare) ran successfully.
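
The check quoted in the log only tests that two generated files exist. A minimal Python sketch of the same test is given here, so it can be run against more than one directory. The function name `missing_kbuild_files` is made up for illustration, and the default path is simply the KBUILD_SRC value from the log; if the kernel was configured in a separate build directory, that directory is the one to check, which is an assumption about the local setup.

```python
from pathlib import Path

# The two files the kernel Makefile tests for (taken from the log above).
REQUIRED = ("include/generated/autoconf.h", "include/config/auto.conf")


def missing_kbuild_files(build_dir):
    """Return the required kbuild files that do not exist under build_dir."""
    root = Path(build_dir)
    # exists() mirrors the shell's `test -e`.
    return [rel for rel in REQUIRED if not (root / rel).exists()]


if __name__ == "__main__":
    # Path copied from KBUILD_SRC in the installer log; adjust as needed.
    directory = "/usr/src/linux-3.7.10-1.1"
    missing = missing_kbuild_files(directory)
    if missing:
        print(directory, "is missing:", ", ".join(missing))
    else:
        print(directory, "has both files")
```

An empty result for the directory the build actually uses would mean the "configuration is invalid" branch should not be the reason the installer fails.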

Further on in nvidia-installer.log I see the compilation steps with warnings (and I believe that
nvidia.ko was built). After that I see these messages:

--------------------------from nvidia-installer.log ----------------------------------
NVIDIA: left KBUILD.
-> done.
-> Kernel module compilation complete.
-> Unable to determine if Secure Boot is enabled: No such file or directory
ERROR: Unable to load the kernel module 'nvidia.ko'. …

nvidia-installer.log then also contains kernel messages:

--------------------------from nvidia-installer.log ----------------------------------
-> Kernel module load error: No such device
-> Kernel messages:
… 25.286079] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 1379.760532] nvidia: module license 'NVIDIA' taints kernel.
[ 1379.760536] Disabling lock debugging due to kernel taint
[ 1379.765158] nvidia 0000:01:00.0: enabling device (0140 -> 0142)
[ 1379.765165] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[ 1379.765165] NVRM: BAR1 is 0M @ 0x0 (PCI:0000:01:00.0)
[ 1379.765166] NVRM: The system BIOS may have misconfigured your GPU.
[ 1379.765169] nvidia: probe of 0000:01:00.0 failed with error -1
[ 1379.765177] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 1379.765178] NVRM: None of the NVIDIA graphics adapters were initialized!
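
The "BAR1 is 0M @ 0x0" line can be cross-checked without the driver, because the kernel exposes each PCI device's address regions in a sysfs `resource` file (one line per region: start, end, flags, all in hex). Below is a rough sketch that parses that file. The device address 0000:01:00.0 comes from the log; the function name `parse_pci_resources` is made up for illustration. Note that BARs the device does not implement also show up as all zeros, so a zero entry alone is not proof of a failure.

```python
def parse_pci_resources(text):
    """Parse a sysfs PCI 'resource' file into (index, start, size) tuples."""
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, end, flags = (int(field, 16) for field in fields)
        # An all-zero line means nothing is assigned to this region.
        size = end - start + 1 if (start or end or flags) else 0
        regions.append((index, start, size))
    return regions


if __name__ == "__main__":
    path = "/sys/bus/pci/devices/0000:01:00.0/resource"
    try:
        with open(path) as handle:
            text = handle.read()
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
    else:
        for index, start, size in parse_pci_resources(text):
            if size == 0:
                print(f"resource {index}: empty or unassigned")
            else:
                print(f"resource {index}: {size >> 10} KiB @ {start:#x}")
```

If resource 1 prints as empty while resource 0 has an address, that would match what NVRM reports for BAR1.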

This looks like a BIOS settings problem, but the parameter that I think is relevant,
“Above 4G Decoding” in the BIOS Advanced/PCI, PCI-E submenu, is Enabled.

I’m also worried by some kernel messages from /var/log/messages:
-----------------------from /var/log/messages------
2013-07-04T01:43:43.666022+04:00 c6ws4 kernel: [    0.421559] pci 0000:00:01.0: BAR 15: can't assign mem pref (size 0x18000000)
2013-07-04T01:43:43.666024+04:00 c6ws4 kernel: [    0.421563] pci 0000:00:01.0: BAR 14: assigned [mem 0xe1000000-0xe1ffffff]
2013-07-04T01:43:43.666025+04:00 c6ws4 kernel: [    0.421566] pci 0000:00:16.1: BAR 0: assigned [mem 0xe0001000-0xe000100f 64bit]
2013-07-04T01:43:43.666026+04:00 c6ws4 kernel: [    0.421576] pci 0000:01:00.0: BAR 1: can't assign mem pref (size 0x10000000)
2013-07-04T01:43:43.666027+04:00 c6ws4 kernel: [    0.421579] pci 0000:01:00.0: BAR 3: can't assign mem pref (size 0x2000000)
2013-07-04T01:43:43.666027+04:00 c6ws4 kernel: [    0.421581] pci 0000:01:00.0: BAR 0: assigned [mem 0xe1000000-0xe1ffffff]
2013-07-04T01:43:43.666028+04:00 c6ws4 kernel: [    0.421584] pci 0000:01:00.0: BAR 6: can't assign mem pref (size 0x80000)
2013-07-04T01:43:43.666029+04:00 c6ws4 kernel: [    0.421586] pci 0000:00:01.0: PCI bridge to [bus 01]
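
The hex sizes in these messages are easier to judge in MiB: 0x10000000 is 256 MiB, 0x2000000 is 32 MiB, 0x80000 is 512 KiB, and the 0x18000000 window that the bridge 00:01.0 could not get is 384 MiB. A small sketch that pulls the failed assignments out of such log text follows. The regular expression and the `failed_bars` name are made up for illustration, and the sample lines are the three GPU lines above with the syslog prefix trimmed.

```python
import re

# Matches kernel lines such as:
#   pci 0000:01:00.0: BAR 1: can't assign mem pref (size 0x10000000)
# "can.t" accepts both the ASCII and the typographic apostrophe.
FAILED_BAR = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): can.t assign (?P<kind>[^(]+?) "
    r"\(size (?P<size>0x[0-9a-f]+)\)"
)


def failed_bars(log_text):
    """Return (device, bar, kind, size_in_bytes) for each failed assignment."""
    return [
        (m["dev"], int(m["bar"]), m["kind"], int(m["size"], 16))
        for m in FAILED_BAR.finditer(log_text)
    ]


# Sample input: the GPU lines from the log, syslog prefix removed.
SAMPLE = """\
pci 0000:01:00.0: BAR 1: can't assign mem pref (size 0x10000000)
pci 0000:01:00.0: BAR 3: can't assign mem pref (size 0x2000000)
pci 0000:01:00.0: BAR 6: can't assign mem pref (size 0x80000)
"""

if __name__ == "__main__":
    for dev, bar, kind, size in failed_bars(SAMPLE):
        print(f"{dev} BAR {bar}: {kind}, {size / 2**20:g} MiB not assigned")
```

Feeding it the full `dmesg` output instead of `SAMPLE` would list every device with an unassigned region, which helps show whether only the GPU behind bridge 00:01.0 is affected.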

The question is: is this an NVIDIA driver problem, or a hardware PCI problem in my Supermicro X9SCI motherboard?

Mikhail

DID you install the kernel source files? Have a look at my blog on the subject here: Installing the nVIDIA Video Driver the Hard Way - Blogs - openSUSE Forums

Thank You,

And do you know you can install the NVIDIA drivers from YaST? No compile is needed if you are using one of the standard openSUSE kernels.

> DID you install the kernel source files? Have a look at my blog on the subject here: Installing the nVIDIA Video Driver the Hard Way - Blogs - openSUSE Forums

The make oldconfig && make prepare steps that I ran successfully would, IMHO, have been impossible if the kernel source RPMs weren’t installed.

> And do you know you can install the NVIDIA drivers from Yast. No compile needed if you are using one of the standard openSUSE kernels.

AFAIK NVIDIA-Linux-x86_64-319.32.run should detect whether a standard kernel is present. Instead, the .run installer compiles the kernel interface!

I seem to remember something about TDR. From reading those log messages, it looks like the BIOS isn’t configuring your GPU correctly. You’ll have to go into your BIOS manually and edit the settings to match. I won’t give info about BIOS settings. It’s just something I won’t do. You can Google TDR and your NVIDIA model and cross-reference it with your BIOS. That’s as far as I am willing to go.

Thank you very much for your message!
I also thought about BIOS settings, but it looks like there are no specific settings that could influence this situation other than “Above 4G Decoding”. A PCI timeout, I believe, isn’t the source of the problem: there are no hangs, and all the “erroneous” kernel messages appear without visible delay.

As for TDR, it’s a serious idea. I haven’t encountered anything about TDR in Linux (although AFAIK TDR is present beginning with the 3.9 kernels). I didn’t find any TDR data for my Tesla K20 GPU on Google, but as I wrote above, I don’t think timeouts play a role here.

Mikhail