You're lucky.
For me, initcall_blacklist=sysfb_init doesn’t work.
Waiting for the 7.1 kernel and/or a new driver version.
Regards
Well, we are at 7.0.7 and so far nothing has changed. If there is a generally new approach to how the kernel initializes the HW (as Lioli7k seems to have found), I suppose this will not change with a 7.1.x version. That is why the kernel maintainers need to be informed and made aware of the issue(s).
A new driver version is unlikely to have any effect, as the driver is not “started” by the kernel boot.
I have reported the issue to SUSE and it’s being worked on. Unfortunately it’s really complex and it’s still unknown what is causing it. Devs can’t fix it if they don’t know what’s broken. I’m doing what I can on my part and trying to patch the kernel myself. I’ll be posting progress here when I encounter anything interesting regarding the issue.
This is a really difficult issue to fix. Its behavior varies from system to system and nobody knows what is causing it yet. The Linux kernel is over 40 million lines of code in size. Finding a tiny needle in all that hay is by no means a three-day job.
Developers have to help you with this if they don’t do it themselves! Your effort can take ages!
I’m only gonna cover some of it to keep it brief.
Not sure this will help, but you could try this.
Create a file /etc/systemd/system/nvidia-reset.service containing the following:
[Unit]
Description=Startup NVIDIA reset
After=sysinit.target
[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi --gpu-reset
RemainAfterExit=yes
[Install]
WantedBy=sysinit.target
then run sudo systemctl enable nvidia-reset.service
reboot
Aside from looking at what changed in OBS, not that many things. The problem is probably somewhere within the Linux kernel code, so ideally we need people with kernel expertise.
Answers to your points:
Testing is difficult because most of the things that matter require kernel patching. So ideally kernel + C knowledge is needed for this task.
I haven’t directly contacted the SUSE kernel devs about it, but I did ask for them to be notified. So hopefully they know about it.
My install is years old. I was wondering if this is happening to anyone on a fresh install? Apologies if this has already been mentioned in the comments.
They won’t be able to help much if I don’t know how to reproduce the issue reliably, and I have no idea what causes it. All I know is that it’s related to the graphical subsystems and that the scheduler might be acting up. The issue is that SUSE can’t reproduce it on their end.
It is happening on my partner’s computer, and we just installed Tumbleweed for the very first time two-ish days ago.
@akontsevich I need to stop you right there, take a look at the bug report, it’s not even pointing to any kernel developer (Component), so if they do not know, they will not be involved!
Yeah. I have asked to CC them but that’s about it.
@Lioli7k A post on the openSUSE kernel mailing list with a link to the bug report will likely get eyes on the issue.
https://lists.opensuse.org/archives/list/kernel@lists.opensuse.org/
Sent a mail to the kernel mailing list. Hopefully they have some ideas to try.
My install is years (2019) old and I am NOT affected, likely because of my ancient video card. I’m looking at some 3070 Ti cards. Hopefully I’ll get one to test on.
To me it seems likely that something (simplefb) is stepping on the GPU memory I/O region, possibly due to asynchronous loading. initcall_blacklist=sysfb_init is taking a hammer to that problem. Something like this might be a little more friendly.
When I get an affected card I’ll probably start with:
simplefb.async_probe=0
simpledrm.async_probe=0
nvidia.async_probe=0
nvidia_drm.async_probe=0
Or just disable async_probe entirely (just for testing)
module.async_probe=0
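For anyone repeating this test, it is worth confirming that the parameters actually reached the running kernel before drawing conclusions. Below is a rough sketch that checks /proc/cmdline for each one. The helper name cmdline_has is made up for illustration, and the list of parameters is just the ones mentioned in this thread. I'm assuming a GRUB2 setup, where the parameters would normally be added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and applied with sudo grub2-mkconfig -o /boot/grub2/grub.cfg; other bootloaders differ.

```shell
#!/bin/sh
# Hypothetical helper: succeed if parameter $2 appears as a whole,
# space-separated token in the command line string $1.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Check the running kernel's command line for each parameter under test.
if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    for p in initcall_blacklist=sysfb_init \
             simplefb.async_probe=0 simpledrm.async_probe=0 \
             nvidia.async_probe=0 nvidia_drm.async_probe=0 \
             module.async_probe=0; do
        if cmdline_has "$cmdline" "$p"; then
            echo "active:  $p"
        else
            echo "missing: $p"
        fi
    done
fi
```

This only shows that a parameter was passed at boot, not that the driver in question honours it.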
Thanks to all that are helping with this issue. Thanks also for investing your time! Highly appreciated!
So, for funsies, I tested these options and found that none worked.
I tested each of the first four individually, then the final one, then I added all of the first four together, and nothing.
However, whenever I used simpledrm.async_probe=0 I got a slightly different outcome, where my screen flashed blue for 5ish seconds and then dumped me onto a black screen with a cursor. While I’m pretty sure I could swap to other TTYs (as my cursor would vanish and reappear when switching back), I couldn’t see anything on them.
This does make me mildly confused, as it seems like an odd thing to happen, so the reason I’m posting this is a hope that it gives anyone more knowledgeable an idea.
Guess this is really kind of hard to reproduce reliably…
I’m using a 5060 Ti and the G07 driver. I recall that after the 0428 update it was partly broken. I could still get into the KDE desktop, but KWin kept reporting crashes and the desktop environment could freeze completely if I kept moving the mouse fast. Yet the update about a week later (perhaps 0506? don’t remember exactly) fixed everything.
I also tried a fresh install on an external SSD as a test, where everything works smoothly…
Just to share info that may help: I run Tumbleweed + KDE Plasma + RTX 5070 + NVIDIA open G07 595.75.
With kernel 6.19, everything was running fine.
With the kernel 7.0.1 update, I was able to log in, but the NVIDIA driver failed.
With the kernel 7.0.2 update, I was unable to boot and got a green screen.
This continued until kernel 7.0.5, and even the “initcall” workaround failed.
I tested a fresh install with GNOME, and the “initcall” workaround was successful.
Then with kernel 7.0.6 or 7.0.7, even KDE Plasma worked with the “initcall” workaround.
Shutdown and restart are taking a minute right now, on both GNOME and KDE.