Kernel 4.19.1 won't boot (black screen)

https://bugzilla.opensuse.org/show_bug.cgi?id=1116106

I’m running into a major issue that seems to have been introduced with Kernel 4.19.1, which is now available in the latest openSUSE Tumbleweed snapshot.

My machine refuses to boot with this kernel. It remains stuck at a black screen right after issuing the boot command in grub2. The HDD LED flashes a few times shortly after grub2 disappears, but after that it stays off and nothing more happens. I cannot use Ctrl + Alt + Fn to switch to a different virtual console either; however, I’m still able to toggle the NumLock / CapsLock LEDs, which means the system isn’t freezing up entirely. The computer won’t respond to the power button and has to be restarted via the reset button. I’m also noticing a longer delay after the “initializing ramdisks” message, before the boot loader disappears and I’m supposed to see either the console or splash screen.

I’ve already discovered an important clue: Although the delay persists and I never see a console or splash screen during the boot process (just a black screen), Kernel 4.19.1 appears to boot successfully if I use the ‘radeon’ driver instead of ‘amdgpu’. I was able to boot it once after removing the following parameters from my kernel command line, which I normally set in order to run my GCN 2.0 card (Radeon R9 390 8GB) on the more modern driver:

radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1
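
For readers unfamiliar with these switches: they are 0/1 module options that decide which of the two drivers should claim Southern Islands (“si”, GCN 1.0) and Sea Islands (“cik”, GCN 2.0) cards. The Python sketch below only parses a command-line string and reports the intended driver per family. The default values in it (radeon claims both families unless told otherwise) are an assumption about kernels of the 4.19 era, and `parse_cmdline` / `driver_for` are hypothetical helpers, not part of any existing tool.

```python
# Assumed defaults for kernels of the 4.19 era: radeon claims SI and CIK
# cards unless the command line says otherwise.
DEFAULTS = {
    "radeon.si_support": 1,
    "radeon.cik_support": 1,
    "amdgpu.si_support": 0,
    "amdgpu.cik_support": 0,
}


def parse_cmdline(cmdline):
    """Return the four driver switches, overriding the defaults from cmdline."""
    opts = dict(DEFAULTS)
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in opts and value in ("0", "1"):
            opts[key] = int(value)
    return opts


def driver_for(family, cmdline):
    """family is 'si' (GCN 1.0) or 'cik' (GCN 2.0); returns the intended driver."""
    opts = parse_cmdline(cmdline)
    radeon = opts[f"radeon.{family}_support"]
    amdgpu = opts[f"amdgpu.{family}_support"]
    if amdgpu and not radeon:
        return "amdgpu"
    if radeon and not amdgpu:
        return "radeon"
    # Both enabled: whichever module binds first wins. Both disabled: neither.
    return "ambiguous" if radeon and amdgpu else "none"


if __name__ == "__main__":
    params = ("radeon.si_support=0 radeon.cik_support=0 "
              "amdgpu.si_support=1 amdgpu.cik_support=1")
    for family in ("si", "cik"):
        print(family, "->", driver_for(family, params))
```

On a running system the string to feed in can be read from /proc/cmdline.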

Update: I just booted my mother’s computer. It also uses openSUSE Tumbleweed and has the latest snapshot, as well as the same kernel parameters for amdgpu. Her video card is a GCN 1.0 (Radeon R7 370 2GB).

For her there are absolutely no issues when booting with 4.19.1. The delay after “initializing ramdisks” does show up (it’s even longer), however the splash screen will then display and booting is successful.

I’m wondering whether Kernel 4.19 introduces an issue in amdgpu which specifically affects GCN 2.0 video cards. If anyone here has such a card, can you confirm or refute the results I’m getting?

Search the forums and you’ll find at least one other user.

On Wed 14 Nov 2018 11:46:03 PM CST, MirceaKitsune wrote:

> Update: I just booted my mother’s computer. It also uses openSUSE
> Tumbleweed and has the latest snapshot, as well as the same kernel
> parameters for amdgpu. Her video card is a GCN 1.0 (Radeon R7 370 2GB).
>
> For her there are absolutely no issues when booting with 4.19.1. The
> delay after “initializing ramdisks” does show up (it’s even longer),
> however the splash screen will then display and booting is successful.
>
> I’m wondering whether Kernel 4.19 introduces an issue in amdgpu which
> specifically affects GCN 2.0 video cards. If anyone here has such a card,
> can you confirm or refute the results I’m getting?

Hi
For GCN 1.0 I only use amdgpu.cik_support=1 nothing else, and blacklist
radeon. Don’t have GCN 2.0, only GCN 3.0 which just uses amdgpu.

What happens if you remove the boot options and blacklist radeon only?
But does sound like a regression… Or maybe this kernel has better
GCN 2.0 support and the options/blacklist are not needed.


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
SLES 15 | GNOME Shell 3.26.2 | 4.12.14-25.25-default
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below… Thanks!

This thread? I’ve subbed to it and will follow its progress as well, thanks. Seems to be about the same thing, I’ll follow your advice there and wait a bit longer. I’m seeing 4.19.2 is already waiting in Factory, I’ll try this kernel again once that makes it in. If this doesn’t fix it either I’ll investigate deeper, and also try out the parameters malcolmlewis suggested above.

On Thu 15 Nov 2018 12:06:03 AM CST, Knurpht wrote:

> Search the forums and you’ll find at least one other user.

Hi
This is amdgpu oss driver related, not nvidia ;)


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)

Hi
OK, my card is a Mullins and GCN 2.0, and is working fine with the 4.19.1 kernel and GNOME DE…

I suggest a /etc/modprobe.d/50-radeon.conf file containing blacklist radeon, plus just the one kernel boot option (amdgpu.cik_support=1).
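
A minimal sketch of that suggestion, for anyone who wants to check their own setup: the file would hold the single `blacklist radeon` line, and the small Python check below tests whether a given piece of modprobe configuration text blacklists a module. `RADEON_CONF` and `is_blacklisted` are hypothetical names used only for illustration.

```python
# Content suggested for /etc/modprobe.d/50-radeon.conf (a single line).
RADEON_CONF = "blacklist radeon\n"


def is_blacklisted(conf_text, module):
    """True if conf_text has an uncommented 'blacklist <module>' line."""
    for line in conf_text.splitlines():
        # Ignore anything after a '#' so commented-out entries do not count.
        words = line.split("#", 1)[0].split()
        if len(words) == 2 and words[0] == "blacklist" and words[1] == module:
            return True
    return False


if __name__ == "__main__":
    print(is_blacklisted(RADEON_CONF, "radeon"))
```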

Else might be DE related…

DE related sounds like a weird possibility, since the DE comes after the login process and shouldn’t affect things so early during boot. Perhaps SDDM related at most? I think it varies with the specific type of GPU, which is the most logical reason I can imagine.

I’ll wait for Kernel 4.19.2 as I understand the drivers will be recompiled for it. If that doesn’t fix it either I’ll try different boot parameters and see what that does.

Just remember that the 4 kernel parameters I quoted in the first post work perfectly on my mother’s computer, which has the exact same TW snapshot, just with a GCN 1.0 card instead of 2.0. So the issue must be hardware related; those parameters haven’t broken for openSUSE or the new Kernel.

Hi
Could be hardware, you might want to try now to confirm and try and get some logs for your bug report… you will be asked for some files for sure…

Not just amdgpu or nvidia. I’ve got a Lenovo T61 that exhibited the same behavior after the update to 4.19.1.
My video card: Mobile GM965/GL960 Integrated Graphics Controller (primary)

It looks like as soon as the initrd loads, it immediately crashes and causes a reboot. The only way for me to recover was rolling back to a snapshot with 4.18.
I can update everything else but if I go to the 4.19 kernel, I’m toast.

Interestingly enough, I also have a desktop machine that is having no issues with 4.19.

Exactly the same issue with Kernel 4.19.2; the latest snapshot does not fix this as had been suggested. 4.19.2 must be missing some video card firmware.

Please advise on the next steps. There should be no need to change the way I blacklist radeon and whitelist amdgpu: The exact same parameters work fine on my mother’s computer on the same Kernel version and Tumbleweed snapshot, and also work fine on my machine with Kernel 4.18.5… their functionality has thus not changed. This must be hardware specific and related to the video card model.

Seems many people on various distros are reporting this exact problem:

https://forum.manjaro.org/t/problem-with-new-kernel-4-19/63778
https://forum.manjaro.org/t/cant-get-my-amdgpu-drivers-to-work-stuck-on-radeon
https://forum.manjaro.org/t/upgrade-to-4-19-disables-my-amdgpu

It’s been suggested that it is caused by the following issue:

https://bugs.freedesktop.org/show_bug.cgi?id=108260

If that is indeed the case, the issue was supposedly fixed in Kernel 4.20. That is however a long way from now, and Kernel 4.18.5 doesn’t even appear in my Advanced grub2 options menu any more! Can anyone dig up more information on this to confirm that bug is indeed the cause? And if that’s correct then is there a chance the openSUSE admins could decide to either revert Tumbleweed to 4.18.5 in the meantime or do an early emergency upgrade to 4.20.0 in the following days?

I also performed a test which was suggested on another thread. It gave me some interesting results. I booted 4.19.2 with the following kernel parameter, in addition to my default parameters from the first post:

amdgpu.dc=0

What changed: Instead of the black screen, the computer froze on the last grub2 screen (black box saying “loading initial ramdisks”). After less than a minute however, the NumLock / CapsLock leds became possible to toggle again, but most importantly the computer shut down cleanly when I pressed the power button. As others are suggesting, something must be causing X11 to not recognize or to later drop the display device.

On Tue 20 Nov 2018 02:46:03 PM CST, MirceaKitsune wrote:

> Exactly the same issue with Kernel 4.19.2; the latest snapshot does not
> fix this as had been suggested. 4.19.2 must be missing some video card
> firmware.
>
> Please advise on the next steps. There should be no need to change the
> way I blacklist radeon and whitelist amdgpu: The exact same parameters
> work fine on my mother’s computer on the same Kernel version and
> Tumbleweed snapshot, and also work fine on my machine with Kernel
> 4.18.5… their functionality has thus not changed. This must be hardware
> specific and related to the video card model.

Hi
So, I would suggest moving back to radeon and see how that goes, remove
the blacklist/whitelist(?) and the boot command line options. Disable
plymouth as well (I remove all plymouth related stuff and lock) via
boot option. See how that goes else it could be further down the food
chain with your system…

Both my AMD systems are running fine on the 4.19.2 kernel.

No firmware missing for current cards (you may see some dracut
firmware warnings about the newer cards not released yet).

Not running things like virtualbox?


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)

I’m currently on Kernel 4.19.2 with the radeon driver. This works perfectly fine. Of course I don’t want to stay on radeon instead of amdgpu, since radeon is inferior and also has some bizarre GPU crashes with image corruption on rare occasions.

Was there a kernel parameter to disable plymouth? I can try that next as it may reveal some of the error messages unless the black screen will also cover those.

I have Virtualbox installed. Of course this issue is me booting my real machine.

On Tue 20 Nov 2018 04:06:03 PM CST, MirceaKitsune wrote:

> malcolmlewis;2886767 Wrote:
> > Hi
> > So, I would suggest moving back to radeon and see how that goes,
> > remove the blacklist/whitelist(?) and the boot command line options.
> > Disable plymouth as well (I remove all plymouth related stuff and
> > lock) via boot option. See how that goes else it could be further
> > down the food chain with your system…
> >
> > Both my AMD systems are running fine on the 4.19.2 kernel.
> >
> > No firmware missing for current cards (you may see some dracut
> > firmware warnings about the newer cards not released yet).
> >
> > Not running things like virtualbox?
>
> I’m currently on Kernel 4.19.2 with the radeon driver. This works
> perfectly fine. Of course I don’t want to stay on radeon instead of
> amdgpu, since radeon is inferior and also has some bizarre GPU crashes
> with image corruption on rare occasions.
>
> Was there a kernel parameter to disable plymouth? I can try that next as
> it may reveal some of the error messages unless the black screen will
> also cover those.
>
> I have Virtualbox installed. Of course this issue is me booting my real
> machine.

Hi
At the grub screen, press the ‘e’ key to edit the entry, then add the
following to the kernel line (in my case the linuxefi line):


plymouth.enable=0 console=tty

Also remove the ‘quiet’ option from this line, then press the F10 key
to boot.
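
To make the edit concrete, here is a rough Python sketch of the same transformation applied to a kernel line as text: drop ‘quiet’, append the two options. The sample line is hypothetical (the kernel path, root device and other options are placeholders, yours will differ), and `edit_kernel_line` is just an illustrative helper; in practice you make this change by hand in the grub editor.

```python
def edit_kernel_line(line):
    """Drop 'quiet' and append the two suggested options if missing."""
    words = [w for w in line.split() if w != "quiet"]
    for extra in ("plymouth.enable=0", "console=tty"):
        if extra not in words:
            words.append(extra)
    return " ".join(words)


if __name__ == "__main__":
    # Hypothetical linuxefi line; path and root device are placeholders.
    sample = ("linuxefi /boot/vmlinuz-4.19.2-1-default "
              "root=/dev/sda2 splash=silent quiet")
    print(edit_kernel_line(sample))
```

The change made this way at the grub screen applies to that one boot only, so nothing needs to be undone afterwards.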


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)

I also have an issue with this kernel. The PC boots fine, but there is no graphical login screen, only a console login. I have an nvidia card with the proprietary driver. I was forced to fall back to 4.18.5.

As advised, I booted with the following additional parameters:

plymouth.enable=0 console=tty

Result: I could see the verbose output as the boot logs were printed to the screen. Everything goes fine up until the very last moment when the login screen (sddm) is supposed to appear, that is when the screen goes black. If I add the parameter amdgpu.dc=0 to the mix, the only difference is that instead of the screen going black I see the last image (console output) frozen on the monitor. I didn’t take a photo of that as the boot logs were perfectly normal and only indicated casual stuff (eg: describing various usb devices), there are no error messages or anything related to the display when the image gets stuck.

Hi
So one wonders if it’s sddm related for your particular setup… So, the test was with radeon or amdgpu? If amdgpu, what if you switch back to radeon?

Radeon works fine: I’m currently on 4.19.2 with it. Which I now have no choice about: Apparently the latest update decided to only keep two kernels around, which are 4.19.1 and 4.19.2… 4.18.5 was deleted from /boot and I no longer have it. If radeon didn’t work I would now be locked out of my system and having to see how to repair it. I’m hoping the openSUSE developers will be working to find a solution to this quickly.

On Tue 20 Nov 2018 10:06:03 PM CST, MirceaKitsune wrote:

> malcolmlewis;2886816 Wrote:
> > Hi
> > So one wonders if it’s sddm related for your particular setup… So,
> > the test was with radeon or amdgpu? If amdgpu, what if you switch back
> > to radeon?
>
> Radeon works fine: I’m currently on 4.19.2 with it. Which I now have no
> choice about: Apparently the latest update decided to only keep two
> kernels around, which are 4.19.1 and 4.19.2… 4.18.5 was deleted from
> /boot and I no longer have it. If radeon didn’t work I would now be
> locked out of my system and having to see how to repair it. I’m hoping
> the openSUSE developers will be working to find a solution to this
> quickly.

Hi
Yes, I mentioned that in another thread, user nrickert also has a tweak
to keep the oldest…

So have you looked at sddm logs? Temporarily change the login manager
to gdm or another one? If you set your user to autologin to skip sddm,
does the desktop turn up?


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)