Black screen on Nvidia after updating to 20260428

You are lucky.

For me, initcall_blacklist=sysfb_init doesn’t work.

Waiting for the 7.1 kernel and/or a new driver version.

Regards

Well, we are at 7.0.7 and so far nothing has changed. If there is a generally new approach to how the kernel initializes the hardware (as Lioli7k seems to have found), I suppose this will not change with a 7.1.x version. That is why the kernel maintainers need to be informed and made aware of the issue(s).
A new driver version is unlikely to have any effect, as the driver is not “started” by the kernel boot.

I have reported the issue to SUSE and it’s being worked on. Unfortunately it’s really complex and it’s still unknown what is causing it. Devs can’t fix it if they don’t know what’s broken. I’m doing what I can on my part and trying to patch the kernel myself. I’ll post progress here when I encounter anything interesting regarding the issue.

This is a really difficult issue to fix. Its behavior varies from system to system and nobody knows what is causing it yet. The Linux kernel is over 40 million lines of code. Finding a tiny needle in all that hay is by no means a three-day job.

Developers have to help you with this if they don’t do it themselves! Your effort can take ages!

I’m only gonna cover some of it to keep it brief.

  1. Packages need maintainers to be updated, tested and shipped. There are over 18000 packages available for Tumbleweed. I don’t know all the maintainers of course, but from what I understand a lot of them aren’t SUSE employees, meaning they do it for free in their free time. And not all maintainers have free time to update and test every single package they maintain.
  2. Automatic testing on real hardware is difficult. I think SUSE usually tests all updates in VMs, and VMs are most likely unaffected by this bug. If it actually was easy, SUSE would’ve had a solution in place years ago. I don’t think you want SUSE employees to test every update by hand on hundreds if not thousands of hardware combinations for a rolling release distro.

Not sure this will help but you could try

Create a file /etc/systemd/system/nvidia-reset.service containing the following…

[Unit]
Description=Startup NVIDIA reset
After=sysinit.target

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi --gpu-reset
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target

then run sudo systemctl enable nvidia-reset.service

reboot
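
If you would rather script the file creation, here is a minimal sketch. The helper name write_nvidia_reset_unit and its directory argument are made up for illustration; the unit contents are exactly the ones posted above. Whether a GPU reset this early in boot helps with the black screen at all is untested.

```shell
#!/bin/sh
# Hypothetical helper: writes the nvidia-reset.service unit from this post
# into the directory given as $1 (default: /etc/systemd/system).
# Writing to the default location requires root.
write_nvidia_reset_unit() {
    dir=${1:-/etc/systemd/system}
    cat > "$dir/nvidia-reset.service" <<'EOF'
[Unit]
Description=Startup NVIDIA reset
After=sysinit.target

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi --gpu-reset
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target
EOF
}
```

Calling write_nvidia_reset_unit with no argument as root writes to /etc/systemd/system; after that, enable the service and reboot as described above. Passing a scratch directory first lets you inspect the generated file without touching the system.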

Aside from looking at what changed in OBS, not that many things. The problem is probably somewhere within the Linux kernel code. So ideally we need people with kernel expertise.

Answers to your points:

  1. I haven’t read all the changes because there are over 15000 commits from the two-week merge window alone. What did change is the system scheduler mode. It can now randomly interrupt processes if they take too long. This might be causing issues.
  2. That is precisely what I’m working on.
  3. Hard to tell. To me it looks like that “small” number is not that small.
  4. Not that I know of (Ubuntu 26.04 and Arch Linux seem unaffected). Seems openSUSE-specific.

Testing is difficult because most of the things that matter require kernel patching. So ideally kernel + C knowledge is needed for this task.

I haven’t directly contacted the SUSE kernel devs about it, but I did ask for them to be notified. So hopefully they know about it.

My install is years old. I was wondering if this is happening to anyone on a fresh install? Apologies if this has already been mentioned in the comments.

They won’t be able to help much if I don’t know how to reproduce the issue reliably. And I have no idea what causes it. All I know is that it’s related to the graphics subsystems and that the scheduler might be acting up. The issue is that SUSE can’t reproduce it on their end.

It is happening on my partner’s computer, and we just installed Tumbleweed for the very first time two-ish days ago.

@akontsevich I need to stop you right there, take a look at the bug report, it’s not even pointing to any kernel developer (Component), so if they do not know, they will not be involved!

Yeah. I have asked to CC them but that’s about it.

@Lioli7k A post on the openSUSE kernel mailing list with a link to the bug report will likely get eyes on the issue.
https://lists.opensuse.org/archives/list/kernel@lists.opensuse.org/

Sent a mail to the kernel mailing list. Hopefully they have some ideas to try.

My install is years old (2019) and I am NOT affected, likely because of my ancient video card. I’m looking at some 3070 Ti cards. Hopefully I’ll get one to test on :upside_down_face: To me it seems likely that something (simplefb) is stepping on the GPU memory I/O region, possibly due to asynchronous loading. initcall_blacklist=sysfb_init is taking a hammer to that problem. Something like this might be a little more friendly.

When I get an affected card I’ll probably start with:

simplefb.async_probe=0
simpledrm.async_probe=0
nvidia.async_probe=0
nvidia_drm.async_probe=0

Or just disable async_probe entirely (just for testing)

module.async_probe=0
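
After rebooting with any of these, it is worth confirming that the parameter actually reached the running kernel before drawing conclusions. A minimal sketch: has_kernel_param is a made-up helper name, and the two parameters checked at the end are just examples taken from this thread. /proc/cmdline is where the running kernel exposes its command line.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole, space-separated word on the kernel
# command line. CMDLINE defaults to the contents of /proc/cmdline.
has_kernel_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which of the parameters discussed here are active on this boot.
for p in initcall_blacklist=sysfb_init module.async_probe=0; do
    if has_kernel_param "$p"; then
        echo "active:  $p"
    else
        echo "missing: $p"
    fi
done
```

The optional second argument is only there so the check can be tried against a pasted command line from someone else's report.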

Thanks to all who are helping with this issue. Thanks also for investing your time! Highly appreciated!

So, for funsies I tested these options and found that none of them work.

I tested each of the first four individually, then the final one, then I added all of the first four together, and nothing.

However, whenever I used simpledrm.async_probe=0 I got a slightly different outcome, where my screen flashed blue for 5ish seconds and then dumped me onto a black screen with a cursor. While I’m pretty sure I could swap to other TTYs (as my cursor would vanish and reappear when switching back), I couldn’t see anything on them.

This does make me mildly confused, as it seems like an odd thing to happen, so I’m posting this in the hope that it gives anyone more knowledgeable an idea.

Guess this is really kind of hard to reproduce reliably…

I’m using a 5060 Ti and the G07 driver. I recall that after the 0428 update it was partly broken. I could still get into the KDE desktop, but KWin continuously reported crashes and the desktop environment could freeze completely if I kept moving the mouse fast. Yet the update about a week later (perhaps 0506? don’t remember exactly) fixed everything.

I also tried a fresh install on an external SSD as a test, where everything works smoothly…

Just to share info that may help: I run Tumbleweed + KDE Plasma + RTX 5070 + NVIDIA open G07 595.75.

With kernel 6.19, everything was running fine.
With the kernel 7.0.1 update, I was able to log in, but the NVIDIA driver failed.
With the kernel 7.0.2 update, I was unable to boot and had a green screen.
This continued until kernel 7.0.5, and even the “initcall” workaround failed.
I tested a fresh install with GNOME, and the “initcall” workaround was successful.
Then with kernel 7.0.6 or 7.0.7, even KDE Plasma with the “initcall” workaround was successful.
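
For anyone comparing notes against this timeline, here is a small sketch that prints the running kernel and checks it against 7.0.6, the earlier of the two versions mentioned above. version_ge is a hypothetical helper and relies on sort -V from GNU coreutils; the threshold is only what was observed on this one machine, not a confirmed fix version. /proc/driver/nvidia/version is only present while the NVIDIA kernel module is loaded.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A >= version B,
# using GNU sort's version ordering (sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

running=$(uname -r)
base=${running%%-*}   # strip the package suffix after the first "-"
echo "running kernel: $running"

if version_ge "$base" 7.0.6; then
    echo "at or past 7.0.6 (where the workaround was reported to succeed)"
else
    echo "older than 7.0.6"
fi

# Print the loaded NVIDIA driver version if the module exposes it.
if [ -r /proc/driver/nvidia/version ]; then
    cat /proc/driver/nvidia/version
else
    echo "NVIDIA kernel module not loaded (or no version file)"
fi
```

Pasting that output alongside a report makes it easier to line up kernel and driver versions between affected and unaffected systems.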

Shutdown and restart are taking a minute right now, on both GNOME and KDE.