boot hangs with kernel 4.4.104

After updating my kernel to 4.4.104 I cannot boot my system. The boot gets as far as “Initializing ramdisk” and then nothing else happens. I cannot get to a regular console. Recovery mode doesn’t help: it stops at the same point. The only option is to switch the system off with the power button.
I have no problem with 4.4.103 (and earlier), and I can keep using that.

Any suggestions on how to proceed?
I don’t know which information could be relevant.

Thanks

Likely caused by the recent security fixes, I suppose.
Try the “nopti” kernel option and see if it helps. Some people experienced kernel panics without it on Tumbleweed.

There also was a bug report about 42.2’s kernel update recently, not sure it’s related to your problem though:
http://bugzilla.opensuse.org/show_bug.cgi?id=1074896

Try to boot with “nopti” and/or “nospec” kernel options. Does any combination allow you to boot?

I had the same issue on my laptop (“Kernel panic - not syncing: Attempted to kill the idle task!”) with Kernel 4.4.104-39. The former Kernel 4.4.103-36 worked fine.

I downloaded 4.4.107-1 from https://software.opensuse.org/package/kernel-default and my laptop boots again with it.

I tried “nopti” and “nospec” alone and in combination: no success

I followed the instructions in the bug to get extra output. I don’t even reach the kernel panic point. With options “earlyprintk=efi vga keep” I get


console [tty0] enabled
bootconsole [earlyefi0] disabled

It doesn’t seem to be the same problem, although it may be related.

With the option “noefi” I can boot. Am I missing something? Or is there something wrong in my configuration?
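When trying several boot options in turn, it can help to confirm which ones actually reached the kernel on a given boot. The sketch below is only an illustration: `has_kernel_opt` is a hypothetical helper name, and the option list is simply the set of options mentioned in this thread. It reads /proc/cmdline, where a running Linux system exposes the command line it was booted with.

```shell
# Hypothetical helper: succeeds if the given kernel command line
# contains the option as a whole word.
has_kernel_opt() {
    opt=$1
    cmdline=$2
    case " $cmdline " in
        *" $opt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a running system the active command line is in /proc/cmdline.
if [ -r /proc/cmdline ]; then
    for opt in nopti nospec noefi dis_ucode_ldr; do
        if has_kernel_opt "$opt" "$(cat /proc/cmdline)"; then
            echo "$opt: active"
        else
            echo "$opt: not set"
        fi
    done
fi
```

Noting this output next to each boot attempt makes it easier to report exactly which combination worked.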

One thing you can do to have some protection against future updates:
Edit /etc/zypp/zypp.conf (as root) and look for

multiversion.kernels = latest,latest-1,running

At the end of this line, add the current kernel version number, preceded by a comma.
To find the version number, use

uname -a
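The edit described above could be sketched roughly as follows. This is a preview-only illustration, not a tested procedure: `strip_flavor` and `append_kernel_version` are hypothetical helper names, and it assumes that zypp.conf expects the plain version-release (for example 4.4.103-36) without the flavor suffix such as "-default" that `uname -r` prints. The sketch only prints the modified file contents and does not change the file.

```shell
# Hypothetical helper: drop a trailing flavor suffix such as "-default"
# from a `uname -r` style string (assumption: zypp.conf wants version-release only).
strip_flavor() {
    printf '%s\n' "$1" | sed 's/-[a-z][a-z]*$//'
}

# Hypothetical helper: print the contents of a zypp.conf-style file with
# ",<version>" appended to the multiversion.kernels line.
# The file itself is not modified.
append_kernel_version() {
    conf=$1
    ver=$2
    sed "/^[[:space:]]*multiversion\.kernels[[:space:]]*=/ s/[[:space:]]*\$/,${ver}/" "$conf"
}

# Example: preview the change for the running kernel.
# append_kernel_version /etc/zypp/zypp.conf "$(strip_flavor "$(uname -r)")"
```

Once the preview looks right, the same change can be made by hand in an editor as root, which is the safer route for a system configuration file.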

More likely, the huge patch set that was merged recently has bugs in various corner cases. Open a bug report and provide this information (that nopti/nospec does not help but “noefi” does). Testing the latest kernel would also be useful (4.4.107, as mentioned in another post).

I have another suspicion - and a possible solution - as I had the exact same thing happen with my PC (with an Intel Broadwell-E Core i7 6800K CPU).

After an update on Friday to

  • kernel : 4.4.104-39.1
  • ucode-intel : 20170707-13.1

My computer didn’t boot anymore and was hanging at the “Loading initial ramdisk” screen.

After some research and experimenting, I got the tip to try disabling the loading of custom microcode with the kernel parameter “dis_ucode_ldr”!
As soon as I did this, I could boot my machine again without any problems.

This gave me the idea to check what had been updated in the microcode area, and indeed, there was that latest ucode-intel update: 20170707-13.1

After going back to an old ucode-intel (20170511-8.1) - and locking it until things get fixed - I can normally boot up without the need to use “dis_ucode_ldr”.
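For anyone wanting to do the same, the downgrade and lock could look roughly like the commands printed by the sketch below. It is a dry run only: `print_pin_cmds` is a hypothetical helper that prints the zypper commands for review rather than running them, and it assumes the older package version is still available in one of the configured repositories.

```shell
# Hypothetical helper: print (not run) the zypper commands for pinning
# a package to an older version. Run the printed commands as root after review.
print_pin_cmds() {
    pkg=$1
    ver=$2
    # --oldpackage lets zypper install a version older than the installed one
    printf 'zypper install --oldpackage %s=%s\n' "$pkg" "$ver"
    # addlock keeps later updates from replacing the pinned package
    printf 'zypper addlock %s\n' "$pkg"
}

print_pin_cmds ucode-intel 20170511-8.1
```

The lock can be lifted later with `zypper removelock ucode-intel` once a fixed microcode package is available.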

I also had a brief look at what kind of microcode updates this ucode-intel-20170707-13.1.x86_64.rpm offers. First I scanned for my CPU ID with iucode_tool:

iucode_tool: system has processor(s) with signature 0x000406f1

Then looked with iucode_tool what was included in the current ucode set:




085/001: sig 0x000406f1, pf_mask 0xef, 2017-03-01, rev 0xb000021, size 26624
085/002: sig 0x000406f1, pf_mask 0xef, 2017-03-01, rev 0xb000021, size 26624
085/003: sig 0x000406f1, pf_mask 0xef, 2017-11-18, rev 0xb000025, size 27648



It’s quite likely that the last line (with date 2017-11-18) was / is the culprit, as it seems to be a relatively new ucode.
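To compare microcode sets for one CPU more quickly, a listing in the format shown above can be filtered by signature. The sketch below is just an illustration: `ucode_for_sig` is a hypothetical helper, and it assumes the listing lines have exactly the comma-separated layout quoted in this post. It reads the listing on standard input and prints the date and revision of every entry for the given signature.

```shell
# Hypothetical helper: filter iucode_tool-style listing lines for one CPU
# signature and print "<date> <revision>" for each matching entry.
ucode_for_sig() {
    sig=$1
    awk -v sig="$sig" '
        # Match entries such as
        # "085/003: sig 0x000406f1, pf_mask 0xef, 2017-11-18, rev 0xb000025, size 27648"
        index($0, "sig " sig ",") {
            split($0, f, ", ")
            # f[3] is the date, f[4] is "rev 0x..."
            sub(/^rev /, "", f[4])
            print f[3], f[4]
        }'
}
```

Piping a saved listing through `ucode_for_sig 0x000406f1` should then show only the entries for that CPU, so two package versions can be compared side by side.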

But what really surprised me was what I found when I checked the latest official Intel package, which one can download here: https://downloadcenter.intel.com/download/27337/Linux-Processor-Microcode-Data-File
There you get the latest 20171117.tgz … and this one only offers one ucode for my CPU:

089/001: sig 0x000406f1, pf_mask 0xef, 2017-03-01, rev 0xb000021, size 26624

So, obviously there is quite some confusion (quite likely on Intel’s side) about what they have (keep) in their current ucode package … and what SUSE offers in the latest update. Maybe Intel already removed the ucode update for my CPU because they know that it created some issues.

Whatever. For the moment it seems that some CPUs - definitely mine - don’t boot (or hang) when they get treated with the latest ucode. Thus maybe - for now - it’s better to stay with an older (but stable) ucode package on CPUs which exhibit these issues.

PS … as a reference, I also wrote about this on the German opensuse-forum:
https://www.opensuse-forum.de/thread/39696-boot-probleme-mit-aktuellem-intel-microcode-auf-core-i7-6800k-broadwell-e/

There is one piece of information which I forgot to give: I am on openSUSE Leap 42.3 (but nevertheless, I suppose the problem is similar or very similar to the one reported here for 42.2).

Repeating what I just wrote in another thread - those who can actually fix it are listening on bugzilla. It is good to make possible workarounds known here for other users, but to get it fixed (even in the form of removing the offending microcode) you must file a bug report. And this bug report must be filed by you - because only you can test possible solutions offered by the developers.

OK, I created a bug report: