Over the last two months I have experienced seemingly random system freezes on a Lenovo W520 (HD3000, Sandy Bridge) and a Lenovo W530 (HD4000, Ivy Bridge), both running openSUSE 12.2. I have read many forum threads and tried a few things without success. The logs are not very helpful: nothing specific stands out, and I suspect the freezes prevent the logs from being written. A hard reset is needed to get the machines going again.
I have noticed that on the current W530 system I cannot select OpenGL as the compositing type when booting with the HD4000 adapter. Using the discrete nVidia (K2000) works fine, however. I do not know whether this also happened on the W520.
I also had both machines working correctly by not installing what I thought were pertinent graphics/XServer/Mesa/KDE4 related updates after the initial install. I attempted to do the same during subsequent installs but somehow included an update that seems to produce an unstable system.
The W520 machine is no longer available for debugging as it has been replaced by the W530. I think it noteworthy that the freezes happened on both Sandy Bridge and Ivy Bridge, although I do not know how this might help to pinpoint the problem.
I have limited knowledge of the intricacies of graphics driver / XServer / Mesa interactions and am looking for help in getting to the bottom of this. Perhaps increasing the debug output or running other diagnostics would help get the machine running correctly.
On my hardware this occurs when the updates are inconsistent. By this I mean some updates come from one repo and others from another, leading to a conflict.
One way to solve this is to open YaST and then open the Repositories tab. Click on the ‘Installed (Available)’ column header until the ^ sign is shown, then see whether any package version number is in red. If so, it indicates the package is newer than required.
Go through each repository you have enabled and toggle the ‘Switch system packages’ / ‘Cancel switching’ bar to try to go to an older version without breaking any dependencies.
I do not know what your preferences are, but on a standard install I would suggest you give higher priority to the repos listed below first.
Thanks for your suggestions. I will try that. If there are changes I will observe how the system operates afterwards. I have currently switched Desktop Effects off and it seems better, but that might just be coincidence. I will get back with any results.
I have gone through all the repositories as suggested and found no inconsistencies except for some audio (packman repository) updates. I have updated the machine to the repositories and their respective priorities listed below:
I am writing this from another machine. While I was working on the W530 it froze again. However, the music I was listening to through headphones continued to play. I tried pinging the W530 from another machine but got no response, and ssh did not work either. A DVD writing process using K3b seemed to continue indefinitely: the DVD kept spinning but never finished and was unusable after a hard reset. I mention these things as they might indicate which subsystems are still operational.
Have you considered trying out Tumbleweed on one of your PCs? By doing this you would be updating to a newer kernel, desktop environment, etc.
Just make sure you have the following packages installed before installing the graphics driver:
kernel-default-devel-3.6.10-15.1.x86_64
kernel-desktop-3.6.10-15.1.x86_64
kernel-desktop-devel-3.6.10-15.1.x86_64
kernel-devel-3.6.10-15.1.noarch
kernel-source-3.6.10-15.1.noarch
kernel-syms-3.6.10-15.1.x86_64
kernel-xen-devel-3.6.10-15.1.x86_64
kernel-firmware-20120719git-2.2.noarch
(NB. I’m assuming you have a 64bit OS otherwise you need the 32bit versions)
(I do not have any recent experience with nVidia graphics,
so apologies for any inaccuracies)
You can check your kernel install with the command:
rpm -qa | grep kernel
All kernel packages should have the same version number, with the exception of kernel-firmware.
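The version check can be made a little more explicit. What follows is only a minimal sketch: `kernel_versions` is a hypothetical helper name, and it assumes package names of the form name-version-release.arch, as in the list above. It reads package names on standard input, skips kernel-firmware, and prints each distinct version-release once, so a consistent install should yield a single line.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm or zypper).
# Reads package names such as kernel-desktop-3.6.10-15.1.x86_64 on stdin
# and prints the distinct version-release strings, ignoring kernel-firmware.
kernel_versions() {
    grep '^kernel-' \
        | grep -v '^kernel-firmware-' \
        | sed -E 's/\.[^.]+$//; s/^.*-([^-]+-[^-]+)$/\1/' \
        | sort -u
}

# Usage on the affected machine (kept as a comment so the sketch has no side effects):
#   rpm -qa | kernel_versions
```

More than one line of output would mean the kernel packages come from different builds.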
Thanks again for your suggestion. I have been considering that for a while. I did attempt it but produced an unusable system, probably because of badly configured repositories. Hence my following question:
How do I need to include the Tumbleweed repository so that it correctly has priority over the distribution and update repos? Should I disable the update repos and just leave the basic distribution repo and Tumbleweed repo? Or alternatively should I just add the Tumbleweed repo with a priority higher than both the distribution and update repos?
OK. I have migrated the system to Tumbleweed. It still boots, which is a relief. I notice that the intel driver, Mesa etc. do not have rpms in the Tumbleweed repository. They are still the ones from the current updates repo. But as you pointed out, it might be a kernel issue, and the kernel has been updated on my system to 3.6.10-15.
During the switch to Tumbleweed I noticed some stubborn dependencies (incorrect ones I think) in the kernel-desktop-base rpm to the older (3.4) kernel-desktop rpm in the current updates repo. I forced the rpm install with --nodeps which seemed to do the trick. I suspect it’s an rpm package issue and not a real dependency. It should probably depend on the 3.6.10-15 kernel-desktop rpm. I have a non-working kernel option in the grub boot menu/config which might be a result of this. Anyway, the latest kernel seems to work just fine.
I have not re-installed the nVidia driver for the new kernel, but will attempt to do so now, including bumblebee for Optimus.
If the freezes with the intel driver continue, I will consider using the latest intel, Mesa, vaapi, etc from the factory repo.
I changed the Tumbleweed priority to 89 (as opposed to 99) to partially resolve the dependency issue I mentioned in the previous post. I did not see any mention of this being necessary in the Wiki, so maybe I did something slightly incorrect. Anyway, the latest kernel seems to work just fine.
Also, looking at this, I suppose the packman repo should have a priority 89 or something so that it takes precedence over the openSUSE-current repo.
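The priority change described above could be scripted roughly as follows. This is a sketch under assumptions: `set_repo_priority` and the `DRY_RUN` variable are hypothetical names introduced here, and the alias `Tumbleweed` is a placeholder, so the real alias should be checked first with `zypper lr -p`. In zypper a lower number means a higher priority, and 99 is the default.

```shell
#!/bin/sh
# Hypothetical wrapper around "zypper modifyrepo --priority".
# $1 = repository alias (placeholder, check yours with: zypper lr -p)
# $2 = priority (lower number = higher priority, default is 99)
set_repo_priority() {
    alias_name=$1
    priority=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be run
        echo "zypper modifyrepo --priority $priority $alias_name"
    else
        # Needs root privileges
        zypper modifyrepo --priority "$priority" "$alias_name"
    fi
}

# Usage (kept as a comment so the sketch has no side effects):
#   DRY_RUN=1 set_repo_priority Tumbleweed 89
```

With DRY_RUN=1 the command is only printed, which makes it easy to review the priorities for each repo before changing anything.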
Looks like you are making progress,
has this eliminated the freezing?
I’ve never dared install a package ignoring dependencies
On occasion Tumbleweed tries to load two kernels, default and desktop, so some care is needed, but I thought that had been sorted
(this seems to occur when the repo itself is being updated).
Firstly all the best for the new year - Einen guten Rutsch!
I did get freezes again with Tumbleweed, so I dug deeper. It seems the NVidia driver installation somehow adds OpenGL (and/or XRender) binaries which then get used even with the Intel driver, and this causes instability. I am not sure of this, but the output from glxinfo showed a combination of Intel and NVidia drivers running. Stupidly, I did not keep that output and hence can’t post it.
I then downgraded to the 12.2 distribution and applied updates but, most importantly, DID NOT install the NVidia drivers. So far so good: no freezes, but also no NVidia support, which is acceptable for the time being.
I don’t have enough knowledge of the XServer/driver/3D acceleration components to really dissect the problem, but based on empirical evidence this is the conclusion I came to: installing the NVidia drivers creates an unstable XServer environment when using the Intel adapter (with the Intel driver) on a Lenovo W520 and W530. What exactly the install messes up I do not know. Strangely, using the NVidia card (with the NVidia driver) never froze the system. I guess it’s compatible with its own OpenGL driver…
I am not sure that this is very useful, but at least I know what not to do for a while to keep the system stable.
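For anyone wanting to repeat the glxinfo check mentioned above, here is a minimal sketch. `gl_vendor` is a hypothetical helper name; it only assumes that glxinfo prints a line beginning with `OpenGL vendor string:`. When booted with the Intel adapter, a vendor string naming NVIDIA would point at the kind of mismatch described in this post.

```shell
#!/bin/sh
# Hypothetical helper: reads glxinfo output on stdin and prints
# the OpenGL vendor string (the first one found).
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p' | head -n 1
}

# Usage (kept as comments so the sketch has no side effects):
#   glxinfo | gl_vendor
#   glxinfo > "$HOME/glxinfo-$(date +%F).txt"   # keep a copy to post later
```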
On Optimus-based machines you cannot use the “normal” NVIDIA driver, since it replaces some X files with symlinks to its own versions, which breaks the proper functioning of the intel driver. So installing the NVIDIA driver is what causes the freezes. If you uninstall it, make sure you reinstall the xorg-x11 packages.
Those Lenovos are a little different from most Optimus systems: you have the choice of three modes, Intel only, Nvidia only and Optimus. This gives you the opportunity to use them in Intel mode when on battery and in Nvidia mode when on AC. I use my Asus EeePC1015pn this way. I also had freezes in 12.2, and as I needed a working netbook I simply returned to 12.1, so I really don’t know if this will work in 12.2.
What I do is as follows: use only the Nvidia driver from the Nvidia repo, take a backup of the file /etc/ld.so.conf.d/nvidia-gfxGO2.conf, and add the following to your /etc/init.d/boot.local:
# Find out which VGA adapters are visible in the current graphics mode
mode=$(/sbin/lspci | grep VGA)
# Make sure we are in command of /usr/X11R6/lib
if [ -e /etc/ld.so.conf.d/nvidia-gfxGO2.conf ]; then
    /bin/rm /etc/ld.so.conf.d/nvidia-gfxGO2.conf
fi
# Check graphics mode and take appropriate action
if [ "$(echo "$mode" | grep -c "Intel")" -gt 0 ]; then
    # Intel adapter visible: remove libglx.so from the updates directory
    # and libGL.so.1 from /usr/X11R6/lib
    if [ -e /usr/lib/xorg/modules/updates/extensions/libglx.so ]; then
        /bin/rm /usr/lib/xorg/modules/updates/extensions/libglx.so
    fi
    if [ -e /usr/X11R6/lib/libGL.so.1 ]; then
        /bin/rm /usr/X11R6/lib/libGL.so.1
    fi
else
    # No Intel adapter visible: link libglx.so again and run ldconfig on /usr/X11R6/lib
    /bin/ln -s /usr/lib/xorg/modules/updates/extensions/libglx.so.* /usr/lib/xorg/modules/updates/extensions/libglx.so
    /sbin/ldconfig /usr/X11R6/lib
fi
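Before adding the script to boot.local, the mode detection can be tried on its own. This is a sketch only: `graphics_mode` is a hypothetical helper name, and it mirrors the test in the script above, where any VGA line mentioning Intel selects the Intel branch and everything else falls through to the other branch.

```shell
#!/bin/sh
# Hypothetical helper: reads lspci output on stdin and reports which
# branch of the boot.local script would be taken.
graphics_mode() {
    if grep VGA | grep -q "Intel"; then
        echo "intel"
    else
        echo "non-intel"
    fi
}

# Usage (kept as a comment so the sketch has no side effects):
#   /sbin/lspci | graphics_mode
```

Running it once in each BIOS graphics mode shows what the script would do, without any files being removed or linked.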
So I guess I wasn’t completely on the wrong track when I got suspicious of the NVidia openGL driver running together with the Intel driver?
I presume you use the script you posted on a 12.1 system? It looks like it could work on 12.2. So basically the problem is the libGL / libglx library that gets mismatched?
I am tempted to try it. But I have spent so much time updating from repositories and fixing a broken system that I am a little apprehensive about trying again, with deadlines looming…
If I can, as Knurpht suggested, just uninstall the NVidia driver again in case of failure and restore to the original XOrg files and return to a stable system, then I will try in the next few days and give some more feedback. If it’s another whole “zypper dup” mission, followed by reconfiguring things - because of using a sledgehammer so to say, then I might defer a little more.
Is this being addressed in 12.3, by the way? I would really like stable and, importantly, simple NVidia support.
So the question is: Try Bumblebee (if that addresses this issue), or use your script method to shunt the openGL library in and out?
I have a stable system. I have NOT installed the NVidia driver again due to time constraints. I am currently on the standard 12.2 distribution and update repositories without any problems.
If I get a chance to try Bumblebee or the script I will post results on this thread.