I updated Tumbleweed last night, as I do almost every day. The update brought the kernel packages to version 4.14.11 and also updated the AMD and Intel microcode packages. There were no errors during the install.
When I try to boot, the system hangs completely after the messages “Reached target Switch Root” and “Starting Switch Root…”. (My system shows duplicate entries for each message during boot, as others have been posting about, in case that matters.)
If I let the computer sit, it reboots after a couple of minutes.
I downloaded the newest Tumbleweed ISO, dated 2018-01-04, and burned it to a DVD. The system boots correctly from the DVD and the DVD software seems to work. If I run the upgrade option, it wants to update the kernel, even though it’s the same version that is already installed. One thing that seems strange: when I click on the “Versions” tab for the kernel packages in the YaST software selection section, it shows that it wants to update from the DVD as well as from the update repo. If I try to disable one, both are disabled. If I re-enable one, both are re-enabled. I went ahead with the upgrade twice, with the same result as the original update on my working system: during boot it freezes and eventually reboots.
Hardware is an ASUS Crosshair VI Hero motherboard, Ryzen 1800X CPU clocked at 3.9GHz, 64GB of DDR4-2666 RAM, and an ASUS ATI Radeon RX480 graphics card with 8GB of RAM. I’ve been using this hardware since March of last year. The only issue was that I was forced to go to Tumbleweed because the kernel-default in Leap 42.3 was too old to have the amdgpu-pro drivers I need for the RX480 card and the “radeon” driver would not work with it, so I ended up with generic VESA graphics.
The system continues to boot normally into Windows 10 with no issues. I am in Windows now, writing this post.
I am at a loss as to what to do next. I can’t even get to a console login to do any troubleshooting.
I am reluctant to let the DVD do a fresh install because I believe that either the new kernel or the AMD microcode is what broke the system. It could be a lot of work that accomplishes nothing, and I have several encrypted LUKS partitions that I have to set up manually with cryptsetup after every fresh install. From what I saw in the YaST software manager on the install DVD, cryptsetup is no longer available, even in the online repos.
A couple of things:
Can you still boot a previous kernel through Advanced Options?
If not, try hitting ‘e’ at the boot menu. It will open a new screen where you can add “nopti” to the boot command line, after the “showopts” option. Please report back.
To add: If the ‘nopti’ option works around the issue, use the YaST bootloader module to add it to the kernel parameters.
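For anyone who would rather see exactly where the option ends up, here is a minimal Python sketch. It only works on a command-line string; the function name `add_kernel_param` and the sample command line are made up for illustration, and it does not touch GRUB or YaST in any way.

```python
def add_kernel_param(cmdline: str, param: str = "nopti", after: str = "showopts") -> str:
    """Return cmdline with `param` inserted right after the `after` token.

    If `after` is not present, the parameter is appended at the end.
    If `param` is already present, the command line is returned unchanged.
    """
    tokens = cmdline.split()
    if param in tokens:
        return cmdline
    # Insert directly behind the marker token, or at the end if it is missing.
    position = tokens.index(after) + 1 if after in tokens else len(tokens)
    tokens.insert(position, param)
    return " ".join(tokens)


# Hypothetical command line, not taken from any real system.
example = "root=/dev/sda2 splash=silent quiet showopts"
print(add_kernel_param(example))
```

The same string is what you would paste into the kernel parameters field of the YaST bootloader module once the workaround is confirmed.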
What is the last message on the console? I got this error here with kernel 4.14.11:
PANIC: double fault, error code: 0x0
Kernel panic - not syncing: Machine halted
It was not logged to the journal, so I only took a picture. A snapper rollback to kernel 4.14.9 solved it.
Sounds similar to https://lists.opensuse.org/opensuse-factory/2018-01/msg00091.html
Testing “nopti” and/or “nospec” is obligatory these days. In any case you should open a bug report and add the results of these tests and information about your CPU. A photo of the panic stack trace is rather useful as well.
I’m not seeing any kernel panic messages. It simply freezes. The last console message is “sp5100_tco: I/O address 0x0cd6 already in use”, but I have seen that message on most successful boots too. One thing that may be worth noting: in the two images posted in this thread, both systems are running the TeamViewer daemon, as is mine. Related?
I’ll try editing the boot menu shortly to add nopti (What is pti exactly?). I had Yast uninstall the last kernel when this new one was installed, so that’s a dead end. I’m downloading Leap 42.3 right now in Windows, just in case I’ll need it. I’ll post back in a few hours.
All the more reason for someone to finally open a bug report and provide this information. Being able to trigger the bug reliably may make debugging and resolution easier.
You should be able to disable the teamviewerd service to let your system boot. Actually, you may be able to single-step the boot process by adding systemd.confirm_spawn=yes to the kernel command line (remove “quiet” as well).
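As a rough illustration of that command-line change, here is a small Python sketch that drops “quiet” and appends systemd.confirm_spawn=yes. The function name `debug_cmdline` and the sample string are hypothetical; at the GRUB prompt you would make the same edit by hand.

```python
def debug_cmdline(cmdline: str) -> str:
    """Return a command line suited to stepping through boot interactively.

    Removes the `quiet` token so messages stay visible and adds
    systemd.confirm_spawn=yes unless a confirm_spawn setting is already there.
    """
    tokens = [token for token in cmdline.split() if token != "quiet"]
    if not any(token.startswith("systemd.confirm_spawn=") for token in tokens):
        tokens.append("systemd.confirm_spawn=yes")
    return " ".join(tokens)


# Hypothetical command line, for illustration only.
print(debug_cmdline("root=/dev/sda2 splash=silent quiet showopts"))
```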
What is pti exactly?
(Kernel) Page Table Isolation
I’ll post back in a few hours.
You may post anywhere you like of course, but those who can actually fix it are listening on bugzilla.
I am posting back just to confirm that adding nopti to the kernel command line fixed the problem.
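If you want to double-check that the running kernel really was booted with the option, a check along these lines could be run against the contents of /proc/cmdline. This is only a sketch: the function names are made up, and treating pti=off as equivalent to nopti is my assumption based on the upstream kernel parameter documentation.

```python
def pti_disabled(cmdline: str) -> bool:
    """Return True if the command line asks the kernel to turn PTI off."""
    tokens = cmdline.split()
    # `nopti` is the option used in this thread; `pti=off` is assumed to be
    # the equivalent upstream spelling.
    return "nopti" in tokens or "pti=off" in tokens


def read_running_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running Linux kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical sample string; on a live system pass read_running_cmdline().
print(pti_disabled("root=/dev/sda2 splash=silent quiet showopts nopti"))
```

It matches whole tokens only, so a parameter that merely contains the text “nopti” is not counted.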
The only problem TeamViewer has ever caused me was a very slow shutdown because TeamViewer did not want to terminate. The system put it on a 90 second timer during shutdown to force-kill it. It hasn’t done that in recent weeks though. I just mentioned TeamViewer in the earlier post because I thought it was a strange coincidence.
Thanks for all the help.
Which specific bugzilla should I report this problem to?
The openSUSE one, as this is in all probability an openSUSE kernel regression: https://bugzilla.opensuse.org/ (use the same user/password as here).
Doing it now. This may need to be reported to kernel.org bugzilla as well. While I was trying to make things work, I removed kernel-default and installed kernel-vanilla with identical results.
In another thread a user mentioned that downgrading the ucode-intel package allowed the system to boot. You may want to try that as well.