KDE freezes with nouveau drivers

We are having a problem with KDE locking up a workstation. It is running Leap 42.1. ‘lspci -v’ gives:

VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 720] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: eVga.com. Corp. Device 2722
    Flags: bus master, fast devsel, latency 0, IRQ 123
    Memory at f6000000 (32-bit, non-prefetchable) [size=16]
    Memory at e8000000 (64-bit, prefetchable) [size=128]
    Memory at f0000000 (64-bit, prefetchable) [size=32]
    I/O ports at e000 [size=128]
    Expansion ROM at f7000000 [disabled] [size=512]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Legacy Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau
    Kernel modules: nouveau

Everything seems to run fine for hours, or a day or two, then a single desktop event (mouse click or keystroke) causes the whole desktop to freeze: the mouse pointer still moves, but mouse clicks or keystrokes do nothing (including using Ctrl-Alt-F1 to try to get to a text console). I can ssh in from another machine on the network, but doing ‘/sbin/init 3’ or ‘reboot’ from that prompt doesn’t cause the system to drop to runlevel 3 or to reboot.

The contents of Xorg.0.log running up to the freeze look like this:

[  4976.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 296611 < target_msc 296612
[ 13106.520] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 784467 < target_msc 784468
[ 13717.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 821099 < target_msc 821100
[ 14658.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 877562 < target_msc 877563
[ 15273.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 914464 < target_msc 914465
[ 16516.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 989048 < target_msc 989049
[ 18366.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 1100054 < target_msc 1100055
[ 20509.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 1228641 < target_msc 1228642
[ 20510.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 1228701 < target_msc 1228702
[ 27753.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 1663305 < target_msc 1663306
[ 31932.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 1914059 < target_msc 1914060
[ 34902.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2092269 < target_msc 2092270
[ 34907.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2092569 < target_msc 2092570
[ 36689.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2199495 < target_msc 2199496
[ 39352.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2359284 < target_msc 2359285
[ 39943.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2394746 < target_msc 2394747
[ 42018.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2519253 < target_msc 2519254
[ 42898.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2572056 < target_msc 2572057
[ 46199.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2770127 < target_msc 2770128
[ 47422.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 2843511 < target_msc 2843512
[ 51443.021] (WW) NOUVEAU(0): nouveau_dri2_flip_event_handler: Pageflip has impossible msc 3084784 < target_msc 3084785
(EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/Xorg (xorg_backtrace+0x48) [0x58b268]
(EE) 1: /usr/bin/Xorg (mieqEnqueue+0x22b) [0x56d9db]
(EE) 2: /usr/bin/Xorg (QueuePointerEvents+0x52) [0x454102]
(EE) 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fe1a4092000+0x6307) [0x7fe1a4098307]
(EE) 4: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fe1a4092000+0x6b05) [0x7fe1a4098b05]
(EE) 5: /usr/bin/Xorg (0x400000+0x7a428) [0x47a428]
(EE) 6: /usr/bin/Xorg (0x400000+0xa28d0) [0x4a28d0]
(EE) 7: /lib64/libc.so.6 (0x7fe1affe3000+0x35140) [0x7fe1b0018140]
(EE) 8: /lib64/libc.so.6 (ioctl+0x7) [0x7fe1b00c0bc7]
(EE) 9: /usr/lib64/libdrm.so.2 (drmIoctl+0x34) [0x7fe1b13ad6d4]
(EE) 10: /usr/lib64/libdrm.so.2 (drmCommandWrite+0x1e) [0x7fe1b13afd2e]
(EE) 11: /usr/lib64/libdrm_nouveau.so.2 (nouveau_bo_wait+0x7c) [0x7fe1abe6a8cc]
(EE) 12: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fe1ac06f000+0xd41f) [0x7fe1ac07c41f]
(EE) 13: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fe1ac06f000+0xddd4) [0x7fe1ac07cdd4]
(EE) 14: /usr/bin/Xorg (DRI2SwapBuffers+0x1b0) [0x55e840]
(EE) 15: /usr/bin/Xorg (0x400000+0x160113) [0x560113]
(EE) 16: /usr/bin/Xorg (0x400000+0x3d20e) [0x43d20e]
(EE) 17: /usr/bin/Xorg (0x400000+0x40feb) [0x440feb]
(EE) 18: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fe1b0004b25]
(EE) 19: /usr/bin/Xorg (0x400000+0x2c60e) [0x42c60e]
(EE) 
(EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
(EE) [mi] mieq is *NOT* the cause.  It is a victim.
(EE) [mi] EQ overflow continuing.  100 events have been dropped.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/Xorg (xorg_backtrace+0x48) [0x58b268]
(EE) 1: /usr/bin/Xorg (QueuePointerEvents+0x52) [0x454102]
(EE) 2: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fe1a4092000+0x6307) [0x7fe1a4098307]
(EE) 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fe1a4092000+0x6b05) [0x7fe1a4098b05]
(EE) 4: /usr/bin/Xorg (0x400000+0x7a428) [0x47a428]
(EE) 5: /usr/bin/Xorg (0x400000+0xa28d0) [0x4a28d0]
(EE) 6: /lib64/libc.so.6 (0x7fe1affe3000+0x35140) [0x7fe1b0018140]
(EE) 7: /lib64/libc.so.6 (ioctl+0x7) [0x7fe1b00c0bc7]
(EE) 8: /usr/lib64/libdrm.so.2 (drmIoctl+0x34) [0x7fe1b13ad6d4]
(EE) 9: /usr/lib64/libdrm.so.2 (drmCommandWrite+0x1e) [0x7fe1b13afd2e]
(EE) 10: /usr/lib64/libdrm_nouveau.so.2 (nouveau_bo_wait+0x7c) [0x7fe1abe6a8cc]
(EE) 11: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fe1ac06f000+0xd41f) [0x7fe1ac07c41f]
(EE) 12: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7fe1ac06f000+0xddd4) [0x7fe1ac07cdd4]
(EE) 13: /usr/bin/Xorg (DRI2SwapBuffers+0x1b0) [0x55e840]
(EE) 14: /usr/bin/Xorg (0x400000+0x160113) [0x560113]
(EE) 15: /usr/bin/Xorg (0x400000+0x3d20e) [0x43d20e]
(EE) 16: /usr/bin/Xorg (0x400000+0x40feb) [0x440feb]
(EE) 17: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fe1b0004b25]
(EE) 18: /usr/bin/Xorg (0x400000+0x2c60e) [0x42c60e]
(EE) 
(EE) [mi] EQ overflow continuing.  200 events have been dropped.

and further similar backtraces until:

(EE) [mi] EQ overflow continuing.  1000 events have been dropped.
(EE) [mi] No further overflow reports will be reported until the clog is cleared.
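
For anyone wanting to check their own log for the same pattern, here is a minimal sketch that counts the two kinds of message shown above and prints the last pageflip warning. The function name summarise_xorg_log is made up for illustration, and the default log path is an assumption that may not hold on every setup:

```shell
#!/bin/sh
# Summarise nouveau pageflip warnings and event-queue overflows in an Xorg log.
# The default path below is an assumption; pass another file as the first argument.
summarise_xorg_log() {
    log="${1:-/var/log/Xorg.0.log}"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    pageflips=$(grep -c 'Pageflip has impossible msc' "$log" || true)
    overflows=$(grep -c 'EQ overflow' "$log" || true)
    echo "pageflip warnings: $pageflips"
    echo "EQ overflow reports: $overflows"
    # Print the most recent pageflip warning, if there is one.
    grep 'Pageflip has impossible msc' "$log" | tail -n 1
}

# Usage:
#   summarise_xorg_log                      # current log
#   summarise_xorg_log /var/log/Xorg.0.log.old
```

Running it against the log from a frozen session (over ssh) and against a healthy one should show whether the overflow reports only appear after a freeze.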

I am reluctant to suggest that the user concerned tries the proprietary NVIDIA drivers, because two of our other users find after updating to the current NVIDIA drivers that KDE crashes immediately after logging on. My current advice to them is to use another desktop (MATE or XFCE) but this is rather unsatisfactory. Does anyone have any suggestions about how to stabilise KDE with this NVIDIA card and the nouveau driver?

Is KDE fully up to date (i.e. are the machines updated regularly)?

Does turning off compositing make a difference?

What’s the rendering re. compositing set to?

Yes, they are. This is normally my first thought as well.

> Does turning off compositing make a difference?

We are trying it now, but obviously if it does help it will be a few days before we know.

> What’s the rendering re. compositing set to?

OpenGL 2.0

Thanks,
Peter.

Try setting it to XRender
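
The backend can be changed in System Settings, but it can also be set from a shell. The following is only a sketch, assuming Plasma 5's kwriteconfig5 tool and the Backend key in the Compositing group of kwinrc. The function name set_kwin_backend and the KWRITECONFIG override are hypothetical helpers for illustration:

```shell
#!/bin/sh
# Set KWin's compositing backend in the current user's kwinrc.
# Assumption: kwriteconfig5 is the right tool on this system (Plasma 5);
# KWRITECONFIG can be set to point at a different binary.
set_kwin_backend() {
    backend="${1:-XRender}"
    "${KWRITECONFIG:-kwriteconfig5}" --file kwinrc --group Compositing --key Backend "$backend"
}

# Usage (as the desktop user, then log out and back in):
#   set_kwin_backend XRender
#   set_kwin_backend OpenGL     # to go back
```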

On Fri, 14 Oct 2016 14:56:02 GMT
pakeller <pakeller@no-mx.forums.microfocus.com> wrote:

> I am reluctant to suggest that the user concerned tries the
> proprietary NVIDIA drivers, because two of our other users find after
> updating to the current NVIDIA drivers that KDE crashes immediately
> after logging on. My current advice to them is to use another desktop
> (MATE or XFCE) but this is rather unsatisfactory. Does anyone have
> any suggestions about how to stabilise KDE with this NVIDIA card and
> the nouveau driver?

I had this problem until I updated to Qt5.7. I still get it when trying
42.2-beta as it is on Qt5.6. I added the following two 42.1 repositories
for Qt5 and Frameworks5 and switched to them:

https://en.opensuse.org/KDE_repositories#KDE_Frameworks_5_.26_Plasma_5


Graham Davis, Bracknell, Berks., UK.
openSUSE 42.1; KDE Plasma 5.8.0; Qt 5.7.0; Kernel 4.8.1;
AMD Phenom II X2 550 Processor; Sound: ATI SBx00 Azalia (Intel HDA);
Video: nVidia GeForce 210 (Driver: nouveau)

Thanks, we’ll do that. This particular workstation is only used for a couple of days a week, so if this helps it may be some time before we know for sure. OTOH, if it doesn’t help, we may find out fairly quickly next time we use it. I’ll report back when we know either way.

Interesting… unfortunately this isn’t really a practical solution for us: I know from past experience that departing from a fairly vanilla install can end up sucking a huge amount of time in user support later on. I might experiment on my own workstation like this (if I used KDE myself that is), but I can’t really manage setups like this for users that aren’t technically knowledgeable.

This KDE-related blog entry https://blog.martin-graesslin.com/blog/2015/10/some-thoughts-on-the-quality-of-plasma-5/comment-page-1/ illustrates very clearly the weakest link problem (albeit for Intel rather than NVIDIA drivers). Maybe the effort involved in getting Leap 42.2 ready means that openSUSE haven’t been able to keep everything in sync for 42.1, and things will sort themselves out in a few weeks. Proprietary drivers certainly don’t help of course.

How did you install the NVIDIA driver? If you used the NVIDIA repo there should be no issues; if you used the .run file you need to re-run it after every kernel update (this is not recommended on Leap, as there is a working repository).
I’d say add the NVIDIA repo and use the proprietary driver:

zypper ar -f ftp://download.nvidia.com/opensuse/leap/42.1/ nvidia
zypper in x11-video-nvidiaG0x

where G0x is G02, G03 or G04, depending on how old your hardware is; the G03 should work on most NVIDIA cards.
I think those users with NVIDIA issues installed the driver with the .run file from NVIDIA. First remove that driver by re-running the .run file, then add the NVIDIA repo and pull the appropriate driver. The proprietary driver is stable with Plasma 5, nouveau not so much; there’s a bug report over at freedesktop that’s still open:
https://bugs.freedesktop.org/show_bug.cgi?id=92077

If you don’t want to use the proprietary driver, do not use Plasma 5; use LXQt or LXDE, or install and use Plasma 5 with Openbox instead of kwin5.

P.S. If there are such serious issues with the NVIDIA driver you really should report them at Bugzilla, as there are a lot of openSUSE users running the NVIDIA proprietary driver:
https://bugzilla.opensuse.org

On Mon, 17 Oct 2016 15:36:02 GMT
pakeller <pakeller@no-mx.forums.microfocus.com> wrote:

> Cloddy;2796265 Wrote:
> > On Fri, 14 Oct 2016 14:56:02 GMT
> > I had this problem until I updated to Qt5.7. I still get it when
> > trying 42.2-beta as it is on Qt5.6. I added the following two 42.1
> > repositories for Qt5 and Frameworks5 and switched to them:
> >
> > http://tinyurl.com/bczda7s
> >
> >
>
> Interesting… unfortunately this isn’t really a practical solution
> for us: I know from past experience that departing from a fairly
> vanilla install can end up sucking a huge amount of time in user
> support later on. I might experiment on my own workstation like this
> (if I used KDE myself that is), but I can’t really manage setups like
> this for users that aren’t technically knowledgeable.
>
> This KDE-related blog entry http://tinyurl.com/zqwmrma illustrates
> very clearly the weakest link problem (albeit for Intel rather than
> NVIDIA drivers). Maybe the effort involved in getting Leap 42.2 ready
> means that openSUSE haven’t been able to keep everything in sync for
> 42.1, and things will sort themselves out in a few weeks. Proprietary
> drivers certainly don’t help of course.
>
>

Fair enough. I’ve also hit some other trouble with 42.1 in that my USB
mouse died so I replaced it, then my keyboard died so I replaced it.
Then both replacements started dying and needing a reboot to kickstart
them. I’ve now been back on 13.2 for a couple of days with no trouble
yet. Need some more testing to rule out - or rule in - hardware as
cause of problems. Backs up your point regarding departure from vanilla
install. :wink:


I did use the NVIDIA repo. You are right that there should be no issues, but in practice there are. This has happened before: a couple of years ago NVIDIA also released a disastrous update for their Linux drivers that messed things up big time.

> P.S. If there are such serious issues with the NVIDIA driver you really should report them at Bugzilla, as there are a lot of openSUSE users running the NVIDIA proprietary driver:
> https://bugzilla.opensuse.org

Good point, and maybe I will, but gathering enough information for a complete bug report takes time and effort, and I have a lot of other things (other than Linux support that is) to do right now. One problem at a time…

If you want to stick with open-source drivers and use Plasma 5, then do not use kwin5 as the window manager, as that’s where the bug is. Use Openbox with openbox-kde:
https://software.opensuse.org/package/openbox-kde
This will disable all compositing and special effects, but it’s lean and stable with nouveau. I have Plasma 5, LXQt and Openbox installed, and I can confirm that LXQt with Openbox and Plasma 5 with Openbox run fine with the NVIDIA hardware. I had issues with nouveau and kwin5 but I have none with the proprietary driver; I have an older GeForce 240 that runs Plasma 5 smoothly.

I should have noted that the NVIDIA driver should only be used on desktops or non-Optimus laptops. If you try to use it on an Optimus laptop, strange things will happen; on Optimus hardware (newer laptops) you need to use Bumblebee, with or without the proprietary drivers:
https://en.opensuse.org/SDB:NVIDIA_Bumblebee

Turning off compositing has fixed the crashing, and the workstation’s user is happy with the situation: he isn’t bothered about fancy desktop effects and the advanced KDE bells and whistles. Rather than distracting him from his work with further testing of the compositor, I’ll leave this issue for now.
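
For reference, compositing can also be switched off from a shell rather than through System Settings. This is a sketch only, assuming Plasma 5's kwriteconfig5 and the Enabled key in the Compositing group of kwinrc. The function name disable_kwin_compositing and the KWRITECONFIG override are hypothetical, and the change is assumed to take effect at the next login:

```shell
#!/bin/sh
# Turn off KWin compositing for the current user by writing to kwinrc.
# Assumption: kwriteconfig5 is available (Plasma 5); override via KWRITECONFIG.
disable_kwin_compositing() {
    "${KWRITECONFIG:-kwriteconfig5}" --file kwinrc --group Compositing --key Enabled false
}

# Usage (as the desktop user, then log out and back in):
#   disable_kwin_compositing
```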

As for the NVIDIA proprietary driver issue that I referred to in passing, this happens on machines that have been zypper dup-ed multiple times, from as far back as 11.4 in one case. Although there were never any major problems with the upgrades, the states of these systems are not the same as they would be from a fresh install of their currently-installed OSs. I have done a fresh install of 42.2-RC1 on one of them (in separate / and /boot volumes with a new local user) and given it a quick spin. It seems to work well. Sometime early next year I will probably install Leap 42.2 from scratch on these systems and see how the NVIDIA drivers behave then.

Thanks to all for your input,
P.