Thanks a lot for your help and your time.
There are two topics here: Hyper-V and P-core/E-core architecture.
Hyper-V
I wondered why you mentioned that Hyper-V was active. It is not. More precisely, I installed it temporarily to test whether it could be a better alternative to VirtualBox. I concluded that it probably needs specially tweaked distros as guests, because a VM was not even able to boot the standard Ubuntu installation ISO. It looked like some other Microsoftish non-standard stuff, not really crap, but not appropriate for installing the wide variety of distros I need. So, I removed it and rebooted. All tests from this thread were performed after that reboot.
When I open the "Windows Features" configuration panel, I can see that Hyper-V is not installed. I see that "Virtual Machine Platform" and "Windows Hypervisor Platform" are still there. I do not know what they are. Admin software on top of Hyper-V? If Hyper-V itself is not installed, they should not prevent VT-x from being used by VBox. Anyway, I removed them and rebooted.
After the reboot, I emptied the Logs directory of one of the VMs and started it. In VBox.log, I see this:
00:00:00.763848 NEM: Adjusting APIC configuration from X2APIC to APIC max mode. X2APIC is not supported by the WinHvPlatform API!
00:00:00.763849 NEM: Disable Hyper-V if you need X2APIC for your guests!
00:00:00.763963 NEM:
00:00:00.763963 NEM: NEMR3Init: Snail execution mode is active!
00:00:00.763963 NEM: Note! VirtualBox is not able to run at its full potential in this execution mode.
00:00:00.763963 NEM: To see VirtualBox run at max speed you need to disable all Windows features
00:00:00.763963 NEM: making use of Hyper-V. That is a moving target, so google how and carefully
00:00:00.763963 NEM: consider the consequences of disabling these features.
00:00:00.763963 NEM:
00:00:00.763976 CPUM: No hardware-virtualization capability detected
I assume that this is why you said that Hyper-V was active. But it has been uninstalled and the host rebooted several times since then.
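To make the check of those log lines repeatable across VMs, here is a minimal sketch that scans the text of a VBox.log for the messages quoted above. The function name and the marker names are made up for illustration, and the substrings are copied from my excerpt, so other VirtualBox versions may word them differently.

```python
# Substrings copied from the VBox.log excerpt above. Assumption: other
# VirtualBox versions may phrase these messages differently.
MARKERS = {
    "nem_fallback": "Snail execution mode is active",
    "no_hw_virt": "No hardware-virtualization capability detected",
    "winhv_api": "WinHvPlatform API",
}


def scan_vbox_log(text):
    """Return {marker name: first matching log line} for each marker found."""
    found = {}
    for line in text.splitlines():
        for name, needle in MARKERS.items():
            if needle in line and name not in found:
                found[name] = line.strip()
    return found
```

Reading the file with open(path, encoding="utf-8", errors="replace").read() and passing the result to scan_vbox_log gives an empty dict when none of these messages is present.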
- How can I check that Hyper-V is really no longer active? Outside the VBox logs, I mean.
- If some remains of Hyper-V prevent VBox from using the HW virtualization features, how do I clean them up?
However, the problems I described in this thread were already there before I tried Hyper-V. I tried it precisely because of these problems with VBox. So, even if I agree that it can only be better not having Hyper-V installed, it was not the root cause of the problems in the first place.
Alder Lake P-core/E-core architecture
Looking at the P-core/E-core hybrid architecture of the 13th-generation i7-13700H (Raptor Lake, which keeps the hybrid design introduced with Alder Lake) was a good idea. I had not thought about it. I tried the same config on an older Windows 10 laptop with 4 homogeneous i7 cores. There, the same Fedora guest configuration boots in 19 seconds, headless, up to the Gnome session (autologin). So, booting Fedora headless on a Windows host is not, by itself, the cause.
In the Windows Task Manager, using the per logical processor view, CPU 0 to 11 are P-cores (6 cores with hyperthreading) and CPU 12 to 19 are E-cores (8 cores, no hyperthreading).
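For anyone who wants to log this instead of watching Task Manager, here is a small sketch that averages a per-CPU usage sample by core type. It hardcodes the layout of my machine as described above (CPU 0-11 are P-core threads, CPU 12-19 are E-cores); the function names are only illustrative, and the layout is an assumption for any other CPU.

```python
# Logical CPU layout of this particular i7-13700H as shown by Task Manager.
# Assumption: other hybrid CPUs will have a different layout.
P_CORE_CPUS = range(0, 12)   # 6 P-cores with hyperthreading
E_CORE_CPUS = range(12, 20)  # 8 E-cores, no hyperthreading


def core_type(cpu):
    """Return 'P' or 'E' for a logical CPU index on this machine."""
    if cpu in P_CORE_CPUS:
        return "P"
    if cpu in E_CORE_CPUS:
        return "E"
    raise ValueError(f"unknown logical CPU {cpu}")


def summarize(per_cpu_percent):
    """Average a list of 20 per-CPU usage percentages by core type."""
    if len(per_cpu_percent) != 20:
        raise ValueError("expected one value per logical CPU (20)")
    totals = {"P": 0.0, "E": 0.0}
    for cpu, pct in enumerate(per_cpu_percent):
        totals[core_type(cpu)] += pct
    return {
        "P": totals["P"] / len(P_CORE_CPUS),
        "E": totals["E"] / len(E_CORE_CPUS),
    }
```

The samples could come from psutil.cpu_percent(percpu=True) if the third-party psutil package is installed; the sketch itself only needs a plain list.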
I see that a VBox headless boot uses the E-cores. We can see the usage of these CPUs climbing quite fast while the P-cores stay quiet.
Interestingly, when the Fedora guest is stuck early in the boot with "soft lockup - CPU stuck" errors, we see that 4 of the E-cores are approximately 30 to 40% busy. Something in VBox seems to be looping.
Another observation which surprises me: while the VMs are configured with 6 processors, only 4 E-cores are busy. The CPU allocation also changes from time to time. Typically, CPU 12-15 are busy and 16-19 are idle. After a couple of minutes, CPU 16-19 become busy and 12-15 idle, again and again.
When starting a VM interactively from the VBox GUI, the activity moves back and forth between P-cores and E-cores. The peaks seem to run on the P-cores. I assume that this is the expected behaviour. This may explain why the headless boots, even on the distros where they work, were much slower than boots from the GUI.
Therefore, I disabled power throttling (i.e. migration to E-cores) for the most important VBox programs:
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VBoxHeadless.exe'
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VirtualBoxVM.exe'
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VirtualBox.exe'
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VBoxNetDHCP.exe'
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VBoxNetNAT.exe'
powercfg /powerthrottling disable /path 'C:\Program Files\Oracle\VirtualBox\VBoxSVC.exe'
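Since the six commands differ only in the executable name, they can also be generated. This is just a sketch: the function name is made up, and the install directory is the one from my machine, so adjust it if VirtualBox lives elsewhere.

```python
from pathlib import PureWindowsPath

# Install directory taken from the commands above (assumption: default location).
VBOX_DIR = PureWindowsPath(r"C:\Program Files\Oracle\VirtualBox")

# The same executables as in the commands above.
EXECUTABLES = [
    "VBoxHeadless.exe",
    "VirtualBoxVM.exe",
    "VirtualBox.exe",
    "VBoxNetDHCP.exe",
    "VBoxNetNAT.exe",
    "VBoxSVC.exe",
]


def powerthrottling_commands():
    """Build one 'powercfg /powerthrottling disable' argument list per executable."""
    return [
        ["powercfg", "/powerthrottling", "disable", "/path", str(VBOX_DIR / exe)]
        for exe in EXECUTABLES
    ]
```

On the Windows host, each list can be passed to subprocess.run(cmd, check=True); I expect this needs an elevated prompt, but I have not checked whether it is strictly required.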
Indeed, a headless boot now runs at the speed of light, same time as a boot from the GUI. We see that the P-cores are busy, not the E-cores.
More interestingly, the guests which always failed to boot (such as Fedora with its "CPU soft lockup" errors) now boot normally. I understand the performance improvement when moving to P-cores. But I do not understand the change of behavior. The E-cores are certainly slower than the P-cores, but they are still as performant as, or even more performant than, older laptops on which VBox works correctly. I have been using VBox for 10 or 15 years, as well as other hypervisors (VMware, Parallels, KVM/Qemu, UTM), and the first CPUs I used with VBox were certainly much slower than my current E-cores. And it worked.
So, my problem is fixed. Thank you all for the help. But the lack of a rational explanation for why some distros hang during boot when running on E-cores still worries me.