
SR-IOV issue on 82576 VFs

Posted: 11. Jul 2016, 09:43
by astrosmash
Hello,

***
Host OS release: Ubuntu 14.04.4 LTS with 4.7.0-rc6 x86_64 kernel
VirtualBox release: 5.0.24r108355, installed from .run installer
igb: 5.3.5.3 on 4-port Intel 82576 (8086:10e8) NIC
***

I am unable to pass Intel Corporation 82576 Virtual Function NICs through to the guest. I am using pci_stub. I have blacklisted the igbvf driver; I also tried loading it and then manually unbinding the VFs from it and binding them to pci_stub, but that made no difference. The attached VFs are simply not recognized by the guests (the guest runs Ubuntu, but I tried FreeBSD as well, and the VFs do not show up in pciconf -lv).
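
When drivers are unbound and rebound by hand like this, it helps to confirm which driver actually holds each function. A minimal sketch, assuming the usual sysfs layout where each device directory has a "driver" symlink; the function name pci_bound_driver and its optional second argument are illustrative and not part of any tool:

```shell
# Print the name of the driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:08:10.1
# $2: sysfs PCI devices directory (optional, defaults to the real one)
pci_bound_driver() {
    dev_dir="${2:-/sys/bus/pci/devices}/$1"
    if [ -e "$dev_dir/driver" ]; then
        # "driver" is a symlink to the driver's directory; keep only its name.
        basename "$(readlink -f "$dev_dir/driver")"
    else
        echo "none"
    fi
}
```

For example, pci_bound_driver 0000:08:10.1 should print pci-stub once the VF has been claimed by the stub driver.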

I am able to pass the physical ports through, though not without errors.

Code:

[  876.357493] vboxpci 0000:07:00.0: PCIRAW_POWER_ON
[  876.467546] vboxpci 0000:07:00.0: failed to attach to IOMMU, error -22

These warnings are printed for both VFs and PFs, but the passed-through PF works anyway (I assume error -22 is the EINVAL discussed in iommu/2015-January/011960.html).

These errors are also printed:

Code:

[  874.627676] pci-stub 0000:07:00.0: claimed by stub
[  874.627704] pci-stub 0000:07:00.0: enabling device (0002 -> 0003)
[  874.627775] vboxpci 0000:07:00.0: vboxPciOsDevInit
[  874.627892] vboxpci 0000:07:00.0: region 0: mmio fbfe0000+131072
[  874.627897] vboxpci 0000:07:00.0: region 1: mmio fb800000+4194304
[  874.627901] vboxpci 0000:07:00.0: region 2: pio 6000+32
[  874.627905] vboxpci 0000:07:00.0: region 3: mmio fb7f0000+16384
[  874.627945] vboxpci 0000:07:00.0: got irq 49
[  874.628554] vboxpci 0000:07:00.0: reg=0 start=fbfe0000 size=131072
[  874.628618] vboxpci 0000:07:00.0: invalid region 1
[  874.628667] vboxpci 0000:07:00.0: invalid region 3

The last three lines are then repeated after the guest has booted successfully.

Code:

# VBoxManage showvminfo --details (UUID)

...
Attached physical PCI devices:

   Host device host08:10.1 at 08:10.1 attached as 01:10.1 <----- these are 4 VFs, none of them are visible later
   Host device host08:10.3 at 08:10.3 attached as 01:10.3
   Host device host08:10.5 at 08:10.5 attached as 01:10.5
   Host device host08:10.7 at 08:10.7 attached as 01:10.7
   Host device host07:00.0 at 07:00.0 attached as 01:00.0 <----- this is Physical port
...

# VBoxManage showvminfo --details (UUID) --machinereadable | grep -i pci
AttachedHostPCI=08:10.1,01:10.1
AttachedHostPCI=08:10.3,01:10.3
AttachedHostPCI=08:10.5,01:10.5
AttachedHostPCI=08:10.7,01:10.7
AttachedHostPCI=07:00.0,01:00.0

# grep -i "device" /opt/VirtualBox/vms/VM/Logs/VBox.log | grep -i host
00:00:00.065596   DeviceName         <string>  = "host08:10.1" (cb=12)
00:00:00.065603   HostPCIDeviceNo    <integer> = 0x0000000000000010 (16)
00:00:00.065630   DeviceName         <string>  = "host08:10.3" (cb=12)
00:00:00.065636   HostPCIDeviceNo    <integer> = 0x0000000000000010 (16)
00:00:00.065663   DeviceName         <string>  = "host08:10.5" (cb=12)
00:00:00.065670   HostPCIDeviceNo    <integer> = 0x0000000000000010 (16)
00:00:00.065696   DeviceName         <string>  = "host08:10.7" (cb=12)
00:00:00.065702   HostPCIDeviceNo    <integer> = 0x0000000000000010 (16)
00:00:00.065729   DeviceName         <string>  = "host07:00.0" (cb=12)
00:00:00.065735   HostPCIDeviceNo    <integer> = 0x0000000000000000 (0)

00:00:00.400282 Attached PCI device ffff:ffff host address 08:10.1 to guest address 01:10.1
00:00:00.400436 Attached PCI device ffff:ffff host address 08:10.3 to guest address 01:10.3
00:00:00.400576 Attached PCI device ffff:ffff host address 08:10.5 to guest address 01:10.5
00:00:00.400715 Attached PCI device ffff:ffff host address 08:10.7 to guest address 01:10.7
00:00:00.505534 Attached PCI device 8086:10e8 host address 07:00.0 to guest address 01:00.0

I assume ffff:ffff is invalid. It should be 8086:10ca, as lspci -vvnn shows for my virtual functions. The PF, with ID 8086:10e8, is recognized (and then passed through) correctly.
However, the device ID is detected correctly according to dmesg:

Code:

[  874.522568] vboxpci 0000:08:10.1: vboxPciOsDevInit
[  874.522624] vboxpci 0000:08:10.1: region 0: mmio faf40000+16384
[  874.522629] vboxpci 0000:08:10.1: region 3: mmio faf60000+16384
[  874.522706] vboxpci: detected device: 8086:10ca at 08:10.3, driver pci-stub

This is the case for all four passed-through VFs: the ID is 8086:10ca in dmesg but ffff:ffff in the VM log.
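
One way to look at this mismatch from the host side is to compare the IDs the kernel reports in sysfs with the first four bytes of the device's raw configuration space. The SR-IOV specification has VFs return FFFFh for the Vendor ID and Device ID registers in their own configuration space, and Linux fills in the real IDs from the PF; whether that is what ends up in VBox.log is an assumption, not something confirmed in this thread. The function names below are illustrative:

```shell
# Print "vendor:device" as the kernel reports it in sysfs.
# $1: PCI address, e.g. 0000:08:10.1
# $2: sysfs PCI devices directory (optional, defaults to the real one)
pci_sysfs_id() {
    id_dir="${2:-/sys/bus/pci/devices}/$1"
    # The files hold values such as "0x8086"; strip the "0x" prefix.
    echo "$(sed 's/^0x//' "$id_dir/vendor"):$(sed 's/^0x//' "$id_dir/device")"
}

# Print "vendor:device" from the first four bytes of raw config space.
pci_raw_id() {
    cfg="${2:-/sys/bus/pci/devices}/$1/config"
    # od prints the bytes as hex; both IDs are little-endian 16-bit values.
    set -- $(od -An -tx1 -N4 "$cfg")
    echo "$2$1:$4$3"
}
```

If pci_sysfs_id 0000:08:10.1 prints 8086:10ca while pci_raw_id 0000:08:10.1 prints ffff:ffff, the VF itself is answering as the specification describes and the real ID has to come from the PF.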

I tried intel_iommu=pt, intel_iommu=on, and intel_iommu=on iommu=pt, but none of these made VF passthrough work. My kernel config has "CONFIG_VFIO is not set", so I use pci_stub.
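
When cycling through boot parameters like these, it is easy to lose track of what the running kernel was actually started with. A small sketch that checks the kernel command line for an exact parameter; the function name cmdline_has_param and the optional file argument are illustrative:

```shell
# Succeed if the kernel command line contains the given parameter exactly.
# $1: parameter, e.g. intel_iommu=on
# $2: command-line file (optional, defaults to /proc/cmdline)
cmdline_has_param() {
    for word in $(cat "${2:-/proc/cmdline}"); do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}
```

For example: cmdline_has_param intel_iommu=on && echo "intel_iommu=on is active".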

I tried to investigate the issue with the built-in VM debugger (section 12.1.3, "The built-in VM debugger"), but I have a headless installation with no GUI. Can I use the debugger without a GUI?

If any log extracts are required, please ask. Thanks!

-Alex

Re: SR-IOV issue on 82576 VFs

Posted: 12. Jul 2016, 01:50
by astrosmash
Just a little update.
Earlier I wrote: "Such warnings are printed both for VFs and PFs, but passed-through PF is working anyway (I assume this is EINVAL from iommu/2015-January/011960.html)."
There is a clarification at 2015/11/12/661, and my assumption seems to be correct. I installed qemu and tested passthrough with it, but neither the physical port nor a virtual one works (I previously mentioned that I was able to pass the physical port through using VBox). VBox seems to somehow ignore the EINVAL returned when the corresponding IOMMU group has more than one member. qemu does not, and simply dies with "kvm assign device failed ret -22".

I compiled the latest 4.7-rc7 kernel with VFIO support and the override_for_missing_acs_capabilities patch. Then (assuming that VBox does not use the VFIO API to map passed-through devices) I tried qemu with -device vfio-pci,host=..., but it refused outright because my physical port is not in a separate IOMMU group. To clarify, the IOMMU group containing the device I'd like to pass through has the following members:

- PES12N3A PCI Express Switch
- 2 of the 4 physical ports (so two ports share IOMMU group A with the PES12N3A, and the second pair is in group B together with another PCI address of the PES12N3A).
- Virtual functions of the corresponding PFs, if any were created.

VFIO seems to be completely useless in such a configuration. I have E5540 processors, which seem to lack proper ACS support, so unwanted devices are merged into one IOMMU group. The override_for_missing_acs_capabilities patch does not seem to work either: multiple devices are mapped to one IOMMU group regardless of its settings. I tried the "downstream" option as well as specifying bridge IDs in the "id:" parameter.
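
The grouping described above can be read directly from sysfs, which makes it quick to re-check after each change to the patch or its boot parameter. A minimal sketch, assuming the standard /sys/kernel/iommu_groups layout; the function name list_iommu_groups and its optional argument are illustrative:

```shell
# Print one "group N: address" line per device in every IOMMU group.
# $1: IOMMU groups directory (optional, defaults to /sys/kernel/iommu_groups)
list_iommu_groups() {
    groups_root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$groups_root"/*/devices/*; do
        [ -e "$dev" ] || continue   # no groups: the glob stays unexpanded
        group=${dev%/devices/*}     # strip "/devices/<address>"
        echo "group ${group##*/}: ${dev##*/}"
    done
}
```

A group that lists the switch, a physical port, and VFs together is the situation in which VFIO refuses to hand over a single member.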

If I try to bind both of the 2 physical ports to the vfio driver, qemu then tries to include the bridge as well and gets refused:

Code:

** from /sys/kernel/debug/tracing/trace , having trace_event=iommu **
qemu-system-x86-5298  [000] ....  5177.884029: attach_device_to_domain: IOMMU: device=0000:04:04.0

** from dmesg **
vfio-pci 0000:07:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.

So I seem to have the following options:
- revert to a kernel older than 4.2 and then try pci-stub, hopefully without getting EINVAL back.
- if using qemu together with the VFIO API, try moving the network card to another PCI slot so that it does not get grouped this way. (I don't really think it will work, but anyway.)
- purchase a new platform based on a newer processor with proper ACS support. I don't actually consider this an option.
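
For the first option, a script can test whether the running kernel is older than 4.2 before choosing pci-stub. A small sketch, assuming a sort that supports -V (version sort, as in GNU coreutils); the function name version_lt is illustrative:

```shell
# Succeed if version $1 sorts strictly before version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

For example: version_lt "$(uname -r)" 4.2 && echo "kernel is older than 4.2".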

I will post back if I get any updates.

Re: SR-IOV issue on 82576 VFs

Posted: 12. Jul 2016, 07:05
by astrosmash
Hello,

Solved.

I checked the patch and noticed that the quirk had not been applied to my kernel correctly. I modified the patch so that "PCI_ANY_ID, PCI_ANY_ID, pcie_acs_overrides" is added correctly, and also added "PCI_VENDOR_ID_INTEL, 0x10CA, pci_quirk_mf_endpoint_acs" to "pci_dev_acs_enabled[]" (the PCI ID of my VFs is 8086:10ca). I boot the kernel with the pcie_acs_override=id:111d:8018 parameter, where 111d:8018 is my PCI bridge. Now everything is sorted properly into separate IOMMU groups, including the VFs, so I can attach what I want to the VM without having to attach anything else. This should actually allow you to use pci-stub (since the IOMMU group no longer exceeds one member) as well as VFIO.

However:

The issue with attaching VFs to a VBox guest is not resolved (they are still recognized as ffff:ffff and are not visible from the guest, although there are no more errors). In qemu, you can use both pci-stub and vfio for SR-IOV.
With qemu, the virtual function is visible from the VM using the igbvf driver.

So once you have patched the kernel, passthrough with vfio looks like this:

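# Allow VFIO to assign devices on hosts without interrupt remapping support
echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
# Register the VF's vendor and device ID with the vfio-pci driver
echo "8086 10ca" > /sys/bus/pci/drivers/vfio-pci/new_id
# Detach the VF from its current driver
echo "0000:08:10.1" > /sys/bus/pci/devices/0000\:08\:10.1/driver/unbind
# Attach the VF to vfio-pci
echo "0000:08:10.1" > /sys/bus/pci/drivers/vfio-pci/bind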

Then pass -device vfio-pci,host=08:10.1 to qemu as usual.
You may refer to your hypervisor's manual pages for the details.
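
The manual rebinding steps above can be wrapped in a function so that they can be repeated for each VF. This is only a sketch of the same sequence: the function name bind_to_vfio and its optional second argument are illustrative, it has to run as root against the real sysfs, and it assumes the vfio-pci module is already loaded:

```shell
# Rebind one PCI device to vfio-pci, following the manual steps above.
# $1: PCI address, e.g. 0000:08:10.1
# $2: sysfs PCI bus directory (optional, defaults to /sys/bus/pci)
bind_to_vfio() {
    bus_dir="${2:-/sys/bus/pci}"
    dev_dir="$bus_dir/devices/$1"
    # Read the IDs from sysfs and strip "0x", giving e.g. "8086 10ca".
    ids="$(sed 's/^0x//' "$dev_dir/vendor") $(sed 's/^0x//' "$dev_dir/device")"
    # Registering an ID that is already known may fail; that is harmless here.
    echo "$ids" > "$bus_dir/drivers/vfio-pci/new_id" || true
    # Detach from the current driver, if any, then attach to vfio-pci.
    if [ -e "$dev_dir/driver" ]; then
        echo "$1" > "$dev_dir/driver/unbind" || return 1
    fi
    echo "$1" > "$bus_dir/drivers/vfio-pci/bind"
}
```

For example: for vf in 0000:08:10.1 0000:08:10.3; do bind_to_vfio "$vf"; done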

Regarding EINVAL: in lkml/2015/11/12/661, Alex Williamson explains why the errors are printed when LINUX_VERSION_CODE >= KERNEL_VERSION(4, 2, 0). PCI device attachment is performed by

Code:

 int rcLnx = iommu_attach_device(pData->pIommuDomain, &pPciDev->dev);

without actually checking the number of group members, but this is mostly a matter for the VBox developers.
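
From the host side, the number of members in a device's IOMMU group can be checked before attempting passthrough. A minimal sketch, assuming the device directory in sysfs has an iommu_group link; this is not VBox code, and the function name iommu_group_size is illustrative:

```shell
# Print how many devices share the IOMMU group of the given PCI device.
# $1: PCI address, e.g. 0000:07:00.0
# $2: sysfs PCI devices directory (optional, defaults to the real one)
iommu_group_size() {
    group_dir="${2:-/sys/bus/pci/devices}/$1/iommu_group/devices"
    [ -d "$group_dir" ] || return 1   # no IOMMU group for this device
    ls "$group_dir" | wc -l | tr -d ' '
}
```

A result greater than 1 for the device being passed through corresponds to the case where the attach call above is reported to return EINVAL on kernels 4.2 and later.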

Thanks.