I have a serious problem after the latest kernel upgrade: my NIC can no longer be passed through to a KVM guest.
The message I get in virt-manager is:
Error starting domain: internal error: QEMU unexpectedly closed the monitor (vm='win10'): qxl_send_events: spice-server bug: guest stopped, ignoring
2024-08-08T01:45:51.824474Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:0b:00.0","id":"hostdev0","bus":"pci.2","addr":"0x0"}: vfio 0000:0b:00.0: group 83 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
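For anyone hitting the same "group 83 is not viable" message, a minimal sketch of how to see which devices share that IOMMU group and which driver each one is bound to. It only reads the standard sysfs tree under /sys/kernel/iommu_groups; the function names are made up for illustration, and treating unbound devices as harmless is an assumption of this sketch.

```python
import os


def group_members(group, sysfs_root="/sys/kernel/iommu_groups"):
    """Map each PCI address in an IOMMU group to its bound driver (or None)."""
    devices_dir = os.path.join(sysfs_root, str(group), "devices")
    members = {}
    for addr in sorted(os.listdir(devices_dir)):
        driver_link = os.path.join(devices_dir, addr, "driver")
        if os.path.islink(driver_link):
            # The "driver" symlink points at the driver's sysfs directory.
            members[addr] = os.path.basename(os.readlink(driver_link))
        else:
            members[addr] = None  # no driver bound
    return members


def blocking_devices(members):
    """Devices still bound to a non-vfio driver.

    Assumption: devices with no driver at all are not counted as blocking.
    """
    return [addr for addr, drv in members.items()
            if drv is not None and drv != "vfio-pci"]


if __name__ == "__main__":
    group_dir = "/sys/kernel/iommu_groups/83"  # group number from the error
    if os.path.isdir(group_dir):
        found = group_members(83)
        print(found)
        print("still on a host driver:", blocking_devices(found))
```

If the second line lists anything besides the port being passed through, that would match the message QEMU prints.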
The kernel is:
Linux 192.168.88.179 6.4.0-150600.23.17-default #1 SMP PREEMPT_DYNAMIC Tue Jul 30 06:37:32 UTC 2024 (9c450d7) x86_64 x86_64 x86_64 GNU/Linux
Everything worked normally before, but after the kernel update something is not working.
P.S.
In lspci the device is enumerated properly, but in NetworkManager it shows up as an Intel chipset root port.
Yes, I have checked /etc/default/grub: the IOMMU for Intel is enabled and configured properly.
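Since /etc/default/grub only describes what the next generated boot entry should contain, it can help to confirm what the running kernel was actually booted with. A small sketch, assuming the usual parameter names (intel_iommu, amd_iommu, iommu); the helper names are hypothetical.

```python
import os


def iommu_params(cmdline):
    """Return the IOMMU-related parameters found on a kernel command line."""
    found = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in ("intel_iommu", "amd_iommu", "iommu"):
            found[key] = value
    return found


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line the current kernel was started with."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    if os.path.exists("/proc/cmdline"):
        print(iommu_params(read_running_cmdline()))
```

An empty result would mean none of these parameters were passed to the running kernel, whatever the grub defaults file says.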
@Jniko Hi and welcome to the Forum
Is there a particular reason to use vfio-pci for the network card, rather than just using it on the host as a bridge device via wicked (or NetworkManager)?
Have you configured this card (it is a separate device, not part of the motherboard?) for passthrough, with an appropriate alias file and loading the vfio modules with dracut?
Yes, I need passthrough for low-latency applications such as PTP and multicast.
Everything is configured properly and I never had a problem with Leap 15.5 and the 5.14.y kernels. I upgraded to Leap 15.6 and the 6.4.y kernels, and everything went smoothly and worked perfectly.
The error described previously started when I upgraded the kernel to the latest, 6.4.0-150600.23.17-default.
So something must be broken… two different servers suddenly showing the same behavior after the kernel upgrade cannot be a coincidence.
The Intel I350-T2 is a PCIe card and the Intel I-217 is built into the motherboard. Both show the same behavior.
After downgrading to the previous kernel, everything works perfectly.
This card is not isolated or configured with dracut. When the KVM VMs are not powered on, the card is connected to the network as a plain NIC and is not otherwise used by the host.
When I power on the KVM VMs, the card is detached from the host and passed through to the assigned VM successfully. When I power off the VM, the NIC is reassigned to the host correctly, without a single issue. (This was the case up to 6.4.0-150600.23.14-default, which worked correctly.)
@Jniko then in your bug report, describe what you're doing on the working kernel, then repeat with the failing kernel and attach the journal logs with the error(s).
Aug 09 05:42:53 virtqemud[2793]: Unable to read from monitor: Connection reset by peer
Aug 09 05:42:53 virtqemud[2793]: internal error: QEMU unexpectedly closed the monitor (vm='win10'): qxl_send_events: spice-s>
2024-08-09T02:42:53.163376Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:0>
Please ensure all devices within the iommu_group are bound to their vfio bus driver.
@malcolmlewis do you perhaps have more information about a fix for this issue?
I have also tried the quad-port Intel I-350 and see the same symptom.
If the whole IOMMU group of the card is passed through to one VM, it works fine.
If you try to attach only one port of the card, the symptom above occurs.
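To compare the working and failing kernels, it may be useful to record which IOMMU group each port lands in on both. A rough sketch using the iommu_group symlink that sysfs exposes per PCI device; the function names and the example addresses are placeholders, not taken from the affected systems.

```python
import os


def iommu_group_of(pci_addr, pci_root="/sys/bus/pci/devices"):
    """Return the IOMMU group number of a PCI device as a string, or None."""
    link = os.path.join(pci_root, pci_addr, "iommu_group")
    if not os.path.islink(link):
        return None
    # The symlink target ends in the group number, e.g. .../iommu_groups/83
    return os.path.basename(os.readlink(link))


def shared_groups(pci_addrs, pci_root="/sys/bus/pci/devices"):
    """Return {group: [addresses]} for groups holding more than one given device."""
    groups = {}
    for addr in pci_addrs:
        group = iommu_group_of(addr, pci_root)
        if group is not None:
            groups.setdefault(group, []).append(addr)
    return {g: addrs for g, addrs in groups.items() if len(addrs) > 1}


if __name__ == "__main__":
    # Placeholder addresses: substitute the ports reported by lspci.
    ports = ["0000:0b:00.0", "0000:0b:00.1"]
    print(shared_groups(ports))
```

If the ports share a group on the newer kernel but sat in separate groups on the older one, that difference would be worth including in the bug report.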
@Jniko Either use YaST Software Management and search for kernel-default, where there should be a changelog tab, or run rpm -q kernel-default --changelog | less to see the current changes…
@Jniko Hi, I noticed the bug report. So you're up to date with everything on the system(s)? There was a new kernel… not sure about the KVM parts, as I don't use it on Leap 15.6.