Kernel Panic with Nouveau Driver and NVIDIA GTX 860M (same occurs with Leap Milestone 1)

Dear Forum members,

I have been using openSUSE 13.1 successfully (professionally). I tried 13.2 about a year ago, abandoned it, and went back to 13.1.

This week I tried again to install (on bare metal) openSUSE 13.2 GNOME and Leap 42.1 GNOME on my Clevo P17SM-A with an NVIDIA GTX 860M, without success.

From dmesg:
[    0.000000] DMI: Notebook P17SM-A /P17SM-A , BIOS 4.6.5 07/28/2015

Installation is fine in both cases, but a kernel panic occurs reproducibly on restart or logout, and more randomly when running the lspci command in GNOME Terminal (sic!).

Needless to say, hardware probing in YaST ends in a kernel panic as well.

I get the following in the dmesg log (it also shows up mixed in with my login prompt on the console at startup):

nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x400700 [ IBUS ]

After a couple of sudden panics, the consequence is:

[   25.503692] systemd-journald[462]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal corrupted or uncleanly shut down, renaming and replacing.
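
For anyone who wants to pull these messages out of a kernel log, a minimal sketch might look like this. The helper name `nouveau_faults` is made up for illustration, and the pattern is modelled only on the single fault line quoted above, so other nouveau errors may well look different.

```shell
#!/bin/sh
# Hypothetical helper: print only the nouveau MMIO fault lines from a kernel
# log read on standard input. The pattern is an assumption based on the one
# message quoted in this thread.
nouveau_faults() {
    grep 'nouveau.*MMIO read of .* FAULT at'
}

# Typical use (reading the kernel log may require root):
#   dmesg | nouveau_faults
#   journalctl -k -b -1 | nouveau_faults   # kernel messages of the previous boot
```

The `journalctl` variant only helps if the journal is stored persistently, which the /var/log/journal path in the message above suggests is the case here.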

lspci -v
01:00.0 VGA compatible controller: NVIDIA Corporation GK104M [GeForce GTX 860M] (rev a1) (prog-if 00 [VGA controller])
Subsystem: CLEVO/KAPOK Computer Device 7481
Flags: bus master, fast devsel, latency 0, IRQ 49
Memory at f6000000 (32-bit, non-prefetchable) [size=16]
Memory at e0000000 (64-bit, prefetchable) [size=256]
Memory at f0000000 (64-bit, prefetchable) [size=32]
I/O ports at e000 [size=128]
Expansion ROM at f7000000 [disabled] [size=512]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel driver in use: nouveau
Kernel modules: nouveau

uname -r
3.16.7-21-desktop

GTX 860M is supported by nouveau
lsmod | grep nouveau

nouveau 1304774 0
ttm 93506 1 nouveau
i2c_algo_bit 13413 2 i915,nouveau
mxm_wmi 13021 1 nouveau
drm_kms_helper 65670 2 i915,nouveau
drm 335594 8 ttm,i915,drm_kms_helper,nouveau
button 13971 2 i915,nouveau
wmi 19193 2 mxm_wmi,nouveau
video 24419 2 i915,nouveau

The RPM packages are the usual ones, without any NVIDIA drivers:

libdrm_nouveau2 2.4.58
xf86-video-intel 2.99.916
xf86-video-nouveau 1.0.11

Help is really welcome, as I would like to move forward with the next openSUSE releases.

As I mentioned, the same nouveau stall symptoms occurred with an install of the Leap 42.1 Milestone.

Luc

Note: I could not even install the NVIDIA drivers (NVIDIA-Linux-x86_64-352.30) successfully, even after nvidia-installer blacklisted nouveau.

How did you attempt to install the NVIDIA drivers? How did they fail? Any error messages?

The system just doesn’t come back up after the reboot: I get “Oops something wrong happened. Log Out” right after the user login screen. And Ctrl+Alt+F1 does not help, because the system crashes before the disks are completely loaded and mounted.

I tried two methods:

  1. Via the NVIDIA community repository, which I subscribed to, and then via the software manager.

  2. Then I reinstalled openSUSE 13.2 and tried http://www.therandombits.com/1/how-to-install-nvidia-driver-for-gtx-970-in-opensuse-13-2/
    with NVIDIA-Linux-x86_64-352.30, after blacklisting nouveau (which seemed useless, since nvidia-installer does that anyway and still requires a reboot first).

The challenge in all those attempts is navigating a system that will crash at some point. That means reinstalling from scratch several times to collect the details to describe.
I am now thinking of using a file system without journaling (ext2), to avoid corruption on disk caused by the nouveau CPU syncing issue.

Select “recovery mode” in the boot menu (second entry in “Advanced Options”), or add “nomodeset” to the boot options (press ‘e’ and append it to the line starting with “linux” or “linuxefi”). This should give you a useable system, as it disables nouveau and uses a generic driver instead.

I guess you mean adjusting the Boot Loader options, or modifying the default GRUB config
(vim /etc/default/grub) by appending nomodeset to the GRUB_CMDLINE_LINUX_DEFAULT variable, so that the default boot acts like the recovery one?
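
The edit described here could be scripted roughly as follows. This is only a sketch under assumptions: `add_boot_param` is a hypothetical helper (not an openSUSE tool), and it expects the variable to be on a single line with a double-quoted value, as in a default /etc/default/grub.

```shell
#!/bin/sh
# Hypothetical helper: append one kernel parameter to the double-quoted
# GRUB_CMDLINE_LINUX_DEFAULT line of a GRUB defaults file.
# Assumes the variable sits on one line and uses double quotes; a line in
# any other form is left unchanged.
add_boot_param() {
    param=$1
    file=$2
    # Skip the edit when the parameter is already present on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qw -- "$param"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Insert " <param>" just before the closing quote, then write the result
    # back in place so the file keeps its owner and permissions.
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (as root, ideally after making a backup copy of the file):
#   add_boot_param nomodeset /etc/default/grub
```

After such a change the GRUB configuration still has to be regenerated as root, on openSUSE typically with `grub2-mkconfig -o /boot/grub2/grub.cfg`.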

Thanks Wolfi, you are 100% right, and this exposes my misunderstanding of the Schenker/Clevo P17SM-A architecture.

The laptop has a GPU integrated in the Haswell Chip as per
lspci -v
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller]) Subsystem: CLEVO/KAPOK

and it also has an additional dedicated NVIDIA GTX 860M GPU. So for now, we have to set aside the nouveau driver for the NVIDIA card (xf86-video-nouveau-1.0.11-1.3.x86_64) and use the Intel one, xf86-video-intel.x86_64…
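
To see at a glance which kernel driver is bound to each of the two GPUs, the output of `lspci -k` can be condensed. The following is a rough sketch: `gpu_drivers` is a made-up helper name, and matching on "VGA compatible controller" or "3D controller" is an assumption about how graphics devices are labelled (on some Optimus laptops the NVIDIA chip is listed as a 3D controller).

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` style output on standard input and
# print "<slot> <driver>" for every graphics device that has a kernel driver
# bound to it.
gpu_drivers() {
    awk '
        /^[0-9a-f]/ { slot = "" }    # an unindented line starts a new device
        /^[0-9a-f].*(VGA compatible controller|3D controller)/ { slot = $1 }
        slot != "" && /Kernel driver in use:/ { print slot, $NF; slot = "" }
    '
}

# Typical use:
#   lspci -k | gpu_drivers
```

A device with no driver bound (for example after nouveau has been blacklisted) simply does not appear in the output.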

Digging a little further, I realized that the Bumblebee project addresses the following problem and more: what if you have an advanced NVIDIA GPU that drains the battery, and you can’t disable it in the BIOS (even with the latest flashed one, BIOS 4.6.5 07/28/2015)?

“If you prefer to use graphics drivers that are directly provided by openSUSE, Bumblebee can be configured to use only the main Intel graphics card with its open source driver. This option does not require installing the proprietary Nvidia driver.” https://en.opensuse.org/SDB:NVIDIA_Bumblebee

Supposedly, the Bumblebee package makes it possible to switch to the NVIDIA GPU using either

  • the nouveau open source driver (which is stalling my 13.2 system)
    or
  • the NVIDIA proprietary drivers (automatically recompiled on kernel upgrades thanks to DKMS?)

I will install the Bumblebee package and try it with the switch:

  • nouveau (NVIDIA) versus Intel
  • NVIDIA proprietary driver versus Intel

Thanks again.

Before you install the Bumblebee packages, fully uninstall any NVIDIA driver packages you may have. The normal packages may interfere, since they modify Mesa files, and simply installing the nvidia-bumblebee package over them may leave behind the incorrect Mesa files from the regular driver, which would cause problems with the Intel GPU.

Optimus is the perfect example of a hardware kludge. It is a real pain and you must follow the instructions exactly or it won’t work :wink:

The problem is that, to date, NVIDIA has not provided direct support for their hardware mash-up on Linux.

Is Optimus, as gogalthorp said, “the perfect example of a hardware kludge”? Or is the nouveau driver temporarily lost in the jungle of NVIDIA card kludges?

I installed Bumblebee easily on a bare openSUSE 13.2, without refreshing the repos.

Bumblebee is installed in two steps:

https://en.opensuse.org/SDB:NVIDIA_Bumblebee

  1. Start with the nouveau open source driver for the NVIDIA card.
  2. Eventually move on to the NVIDIA proprietary drivers (they will be recompiled by DKMS when you upgrade the Linux kernel).

Conclusions:

  1. Bumblebee (step 1) using the nouveau driver still crashes my laptop
    with the error
    nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x400700 [ IBUS ]

(unless I modify the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub by appending nomodeset, but this means switching off the nouveau driver at boot).

  2. Bumblebee works smoothly (step 2), i.e. it runs advanced graphics processing on the discrete NVIDIA GPU on demand (command: optirun glxspheres)
    or runs regular graphics on the Intel Processor Integrated Graphics Controller. Nothing changed in /etc/default/grub.
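
One way to check which GPU actually renders in each case is to compare the OpenGL renderer string with and without `optirun`. The sketch below assumes that `glxinfo` is installed and that its output contains a line starting with "OpenGL renderer string:"; `gl_renderer` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the renderer name from glxinfo output read on
# standard input.
gl_renderer() {
    sed -n 's/^OpenGL renderer string: *//p'
}

# Typical use inside a running X session:
#   glxinfo | gl_renderer            # default path, expected to be the Intel GPU
#   optirun glxinfo | gl_renderer    # via Bumblebee, expected to be the NVIDIA GPU
```

If both commands report the same renderer, the switch to the discrete card is not taking effect.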

So I guess, for an openSUSE 13.2 Optimus laptop:
1- either install Bumblebee with the NVIDIA proprietary drivers (blacklisting nouveau),
2- or do not install Bumblebee, but modify /etc/default/grub (as above) in order to avoid nouveau for the NVIDIA GPU and run straight on the Intel card,
3- or change nothing and always boot in recovery mode…

If anyone can guide me in tracing the nouveau driver, I would be glad to help resolve this and run open source software.

Cheers,

Is Optimus, as gogalthorp said, “the perfect example of a hardware kludge”? Or is the nouveau driver temporarily lost in the jungle of NVIDIA card kludges?

Yes, it is (to some degree at least).
But what’s worse is that NVIDIA themselves don’t really support it well in their proprietary driver (there is some support by now, I think, but it’s not easy to use).

Or is the nouveau driver temporarily lost in the jungle of NVIDIA card kludges?

Note: nvidia != nouveau
nouveau is the open source driver, reverse engineered without any support (not even specifications/documentation) from NVIDIA.
And it still has severe problems, with certain chipsets at least.

  1. Bumblebee (step 1) using the nouveau driver still crashes my laptop
    with the error
    nouveau E[    PBUS][0000:01:00.0] MMIO read of 0x00000000 FAULT at 0x400700 [ IBUS ]

(unless I modify the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub by appending nomodeset, but this means switching off the nouveau driver at boot).

By default, your Optimus system runs on the Intel chip (you can only run selected applications on the NVIDIA chip, via the “optirun” command).
So you should not use “nomodeset”, as this also disables the Intel driver, causing the system/desktop to run with a generic driver (fbdev or vesa) with bad performance.

If anyone can guide me in tracing the nouveau driver, I would be glad to help resolve this and run open source software.

You should probably file a bug report for your kernel panic, either on http://bugzilla.opensuse.org/ (same username/password as here) or directly at the kernel Bugzilla.

But you should probably test with a newer kernel first; the latest stable one is available here:
http://download.opensuse.org/repositories/Kernel:/stable/standard/

Bumblebee works just fine now with Leap 42.1 and the described configuration. A real pleasure. :slight_smile: