Problem with NVIDIA Quadro 4000 and nouveau

Hello,

I have an HP ProLiant system with an NVIDIA Quadro 4000 card and two screens: one on the DVI output and the other on DisplayPort 1.
I’m using KDE.

I get various crashes that lock up the system completely (keyboard and mouse stop responding), and I have to do a hard reboot.

This happens at different times: when I start a program, when I move or resize a program window, and when I try to reactivate a session.

In the last case I get this in journalctl:

Dec 14 18:58:18 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80000000 ] ch 30 [007e6ab000 kscreenlocker_g[21546]] subc 0 mthd 0000 data 00000000
Dec 14 18:58:18 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80040000 ] ch 30 [007e6ab000 kscreenlocker_g[21546]] subc 0 mthd 0000 data 00000000
Dec 14 18:58:18 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80000000 ] ch 30 [007e6ab000 kscreenlocker_g[21546]] subc 0 mthd 0000 data 00000000
Dec 14 18:58:18 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: read fault at 0000000000 engine 07 [PFIFO] client 06 [PFIFO] reason 00 [PT_NOT_PRESENT] on channel 30 [007e6ab000 kscreenlocker_g
Dec 14 18:58:18 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: fifo engine fault on channel 30, recovering...
Dec 14 18:59:05 hpprol2 kernel: nouveau 0000:0a:00.0: Xorg[1784]: nv50cal_space: -16
Dec 14 18:59:05 hpprol2 kernel: nouveau 0000:0a:00.0: Xorg[1784]: nv50cal_space: -16
Dec 14 18:59:06 hpprol2 kernel: nouveau 0000:0a:00.0: Xorg[1784]: nv50cal_space: -16
Dec 14 18:59:06 hpprol2 kernel: nouveau 0000:0a:00.0: Xorg[1784]: nv50cal_space: -16

The next errors occurred when starting Konsole:

Dec 14 06:54:15 hpprol2 kernel: audit: type=1105 audit(1450072455.966:157): pid=2446 uid=1000 auid=1000 ses=1 msg='op=PAM:session_open grantors=pam_limits,pam_unix,pam_umask,pam_systemd,pam
Dec 14 07:01:04 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80000000 ] ch 30 [007e6ab000 ksmserver[2095]] subc 0 mthd 0000 data 00000000
Dec 14 07:01:04 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80064000 ] ch 30 [007e6ab000 ksmserver[2095]] subc 0 mthd 0000 data 00000000
Dec 14 07:01:04 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: PBDMA0: 80024000 ] ch 30 [007e6ab000 ksmserver[2095]] subc 0 mthd 0000 data 00000000
Dec 14 07:01:04 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: read fault at cb412c0000 engine 07 [PFIFO] client 06 [PFIFO] reason 01 [PT_TOO_SHORT] on channel 30 [007e6ab000 ksmserver[2095]]
Dec 14 07:01:04 hpprol2 kernel: nouveau 0000:0a:00.0: fifo: fifo engine fault on channel 30, recovering...
Dec 14 07:07:16 hpprol2 audit[2092]: ANOM_ABEND auid=1000 uid=1000 gid=100 ses=1 pid=2092 comm="kactivitymanage" exe="/usr/bin/kactivitymanagerd" sig=11
Dec 14 07:07:16 hpprol2 kernel: kactivitymanage[2092]: segfault at 7f6f44e78d10 ip 00007f6f450bde21 sp 00007ffd5cc07e78 error 4 in libQt5Sql.so.5.5.1[7f6f450a8000+3f000]
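To see at a glance which processes and fault reasons show up most often, the "read fault" lines above can be tallied with a small script. This is only a rough sketch: the regular expression is derived from the two fault lines quoted in this post, not from the nouveau source, and the names `parse_faults` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the fault lines quoted above (an assumption, not an
# official format). The trailing "[pid]]" is not required, because journalctl
# may cut off long lines.
FAULT_RE = re.compile(
    r"nouveau (?P<pci>\S+): fifo: (?P<kind>\w+) fault at (?P<addr>[0-9a-f]+) "
    r".*?reason (?P<code>[0-9a-f]+) \[(?P<reason>\w+)\] "
    r"on channel (?P<chan>\d+) \[[0-9a-f]+ (?P<proc>[^\[\]\s]+)"
)


def parse_faults(lines):
    """Return one dict per nouveau fifo fault line found in `lines`."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            faults.append(match.groupdict())
    return faults


def summarize(faults):
    """Count faults per (process name, fault reason) pair."""
    return Counter((f["proc"], f["reason"]) for f in faults)
```

Feeding it the kernel messages (for example the lines printed by `journalctl -k`) and printing `summarize(parse_faults(lines))` would show whether the faults cluster around one process or one reason.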

I have already removed the contents of the .cache directory, but this doesn’t seem to solve the problem.
In some cases I first get screen corruption and can still switch to the console (Ctrl-Alt-F1). There I can run top, and I see that Xorg is using 100% of one core and xembedsniproxy is also using 100%.
Trying to kill either of these programs results in a lock-up.
Trying to return to the desktop (Ctrl-Alt-F7) locks the system.
The shutdown command starts but never finishes.

Any ideas?
Regards
Philippe

Hi,
If you are using Tumbleweed, I think it’s better to use the NVIDIA driver from
http://www.nvidia.com/object/unix.html

You need to install the kernel source and kernel devel packages,
plus make and gcc.

The first thing I would try in such a case is to use the proprietary NVIDIA driver instead of nouveau. If that also shows problems, I would start to suspect the hardware.

Hello,

Thanks for your suggestions. I’m a bit reluctant to install the proprietary NVIDIA driver on a Tumbleweed system.

I did a test with a new user and the problem does not seem to occur.
I retested with the old user after removing the ~/.cache and ~/.config directories, and the system seems to be working again without problems. Fingers crossed :)
So I think that something got corrupted in the .config directory, but what?
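One way to narrow this down, if a backup copy of the old ~/.config is still available, is to restore its entries half at a time and check after each step whether the crashes return. The sketch below shows only the halving logic. It assumes that a single entry is responsible, and `is_broken` is a hypothetical callback standing in for the manual step "restore exactly these entries, log in, and see whether the problem reproduces".

```python
def find_culprit(items, is_broken):
    """Return the one item that triggers the problem, or None.

    items:     candidate names, e.g. the entries of a backed-up ~/.config.
    is_broken: hypothetical callback; takes a list of items and returns True
               if the problem reproduces with only those items restored.
    Assumes exactly one item is responsible.
    """
    if not items or not is_broken(items):
        return None
    candidates = list(items)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_broken(half):
            # The culprit is in the first half; discard the rest.
            candidates = half
        else:
            # The first half is clean, so the culprit is in the remainder.
            candidates = candidates[len(half):]
    return candidates[0]
```

With a few dozen entries in the directory this needs only around five or six login tests instead of one per entry. If several files are broken together, the single-culprit assumption fails and the result is unreliable.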

Still, I found something strange with YaST.
As I said, I have a system with two screens (extended desktop, and I use KDE).
There are very few configuration changes: the colours of the active and inactive title bars, and a local image set as the desktop background. The theme is Breeze.

If I start YaST on the primary screen, I can run the online update without problems.
If I start YaST on the second screen and launch the online update, I see the repositories being refreshed, and then the online update disappears >:(

Journalctl output:

Dec 18 12:58:47 hpprol2 audit[25931]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=5 pid=25931 comm="y2base" exe="/usr/lib/YaST2/bin/y2base" sig=11
Dec 18 12:58:47 hpprol2 kernel: audit: type=1701 audit(1450439927.141:278): auid=1000 uid=0 gid=0 ses=5 pid=25931 comm="y2base" exe="/usr/lib/YaST2/bin/y2base" sig=11
Dec 18 13:03:08 hpprol2 systemd-coredump[26020]: Process 25931 (y2base) of user 0 dumped core.

In Var/log/zypper/trace I have:

YaST got signal 11 at file /usr/share/YaST2/clients/online_update_select.rb:141
  sender PID: 504
Liberating suppressed debugging messages:
End of suppressed debugging messages
Backtrace: (use c++filt to demangle)

followed by a hundred lines of backtrace.

I think I may open a bug report for this.

Regards
Philippe