VirtualBox benchmarked
Posted: 17. Aug 2009, 02:09
by Technologov
Although a bit old, this article shows some performance bottlenecks of VirtualBox (vs. KVM and native):
http://www.phoronix.com/scan.php?page=a ... virt&num=1
Re: VirtualBox benchmarked
Posted: 17. Aug 2009, 08:51
by sandervl
Yes, it's really surprising that KVM with 8 virtual CPUs wins some benchmarks compared to a 1 CPU VirtualBox VM.

Re: VirtualBox benchmarked
Posted: 17. Aug 2009, 20:26
by ezjd
How the virtual CPU is "mapped" to real cores confuses me. On a dual-core machine, a guest will only use both cores at 100% for a heavy-load task when it has 2 virtual CPUs. With 1 virtual CPU, it either uses one core at 100% or both cores at 50%. My guess is that with 2 virtual CPUs, the guest shows up as more than one schedulable entity on the host, so the host scheduler can drive both cores at 100%.
In VMware it is different: it behaves like the VBox single-virtual-CPU case and never uses 2 cores at 100%.
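One way to poke at this from the host side (my own illustration for a Linux host; the process name and PID are placeholders and may differ per setup) is to pin the VM process to a single core and see whether the 50%/50% pattern goes away:

```shell
# Find the PID of the running VM process (the GUI VM typically runs
# as a process named "VirtualBox"):
pgrep -lf VirtualBox

# Pin that PID to host core 0 only; the single vCPU can then no longer
# be migrated between cores by the host scheduler, so it should show up
# as one core at 100% instead of two cores at 50%.
taskset -pc 0 <pid-from-above>
```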
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 10:31
by Two
I think VBox showed a very impressive performance in that test. Considering that KVM is only half a virtualizer (it only works as long as the guest agrees to being virtualized) and VBox ran with only 1 core, I find it quite impressive that VBox not only beat it in certain parts (like I/O) but was even faster than the host in the matrix test (a test where I would really like to know what went wrong).
Once multi-CPU virtualization runs stably, I no longer see a reason to work on the host in an office environment.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 10:45
by Technologov
>Considering that KVM is only half a virtualizer (it only works as long as the guest agrees to being virtualized)
Not true: KVM is a full virtualizer, just like VBox is. You are mistaking it for Xen in PV mode.
>I find it quite impressive that VBox not only beat it in certain parts (like I/O) but was even faster than the host in the matrix test (a test where I would really like to know what went wrong).
I think that is due to an incorrect guest clock.
The best way to fix it is to build a new benchmark that uses a remote clock, such as NTP, for more reliable timing.
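A sketch of what that remote-clock idea could look like (illustrative only; `pool.ntp.org`, `user@guest-vm`, and the source path are placeholders):

```shell
# Query a remote NTP server from inside the guest to see how far the
# guest clock has drifted (query only; nothing is adjusted):
ntpdate -q pool.ntp.org

# Or sidestep the guest clock entirely: start the guest's build over ssh
# and measure the elapsed time with the host's (trustworthy) clock.
time ssh user@guest-vm 'cd /path/to/src && kmk clean && kmk'
```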
ezjd: just as 2 cores can be mapped to the host, so can 8.
VirtualBox only recently started supporting SMP, and it still needs some optimization. Most other virtualizers (except MS Virtual PC) do support SMP. But in my opinion it is a "nice-to-have" feature, not really needed.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 14:42
by Two
Technologov wrote:Not true, KVM is a full virtualizer, just like VBox is. You mistake with Xen in PV mode.
Whoops, my bad. Thanks for the correction.
Technologov wrote:VirtualBox started supporting SMP only recently, and still needs some optimizations. Most other virtualizers (except MS Virtual PC) do support SMP. But in my opinion it is "nice-to-have" feature. Not really needed.
If you work on a quad-core and using all 4 cores cuts your compile time from 20 minutes down to 5, then I'd say it is more than just a "nice to have" feature.
In server environments it becomes even more important, as using, for example, 8 cores for certain calculations can speed up your servers enormously.
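For what it's worth, the 20-to-5-minute figure assumes a perfectly parallel build. A quick Amdahl's-law sketch (my own illustration, not from the benchmarks in this thread) shows how even a small serial fraction, such as linking, caps the speedup:

```shell
# speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction
# of the job and n is the number of cores.
awk 'BEGIN {
  n = 4
  p = 1.0;  printf "fully parallel:  %.2fx\n", 1 / ((1 - p) + p / n)
  p = 0.9;  printf "10%% serial work: %.2fx\n", 1 / ((1 - p) + p / n)
  p = 0.5;  printf "50%% serial work: %.2fx\n", 1 / ((1 - p) + p / n)
}'
```

So 4 cores only give the ideal 4.00x when p = 1.0; with 10% serial work the bound already drops to about 3.08x.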
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 16:00
by sej7278
Two wrote:If you work on a quad-core and using all 4 cores cuts your compile time from 20 minutes down to 5, then I'd say it is more than just a "nice to have" feature.
In server environments it becomes even more important, as using, for example, 8 cores for certain calculations can speed up your servers enormously.
sorry, that's just rubbish. i bet you believe a 2GHz core2quad runs at 8GHz too!
apart from that being rubbish even in the non-virtualised world, i found that within virtualbox, smp support seemed to slow down compilation.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 19:41
by ezjd
According to my benchmark at
http://ezjd.blogspot.com/2009/08/virtua ... albox.html, 2 virtual CPUs help improve compiling on a dual-core machine. I don't have a way to measure how much the guest clock is slowed down, but I did notice that the test finished earlier than with a single CPU. Maybe I should have used a stopwatch.
I think the bottleneck for SMP performance right now is I/O, as you can see in the tests above. A slowed-down guest clock won't make it take "longer" to finish a test. I build a whole embedded Linux distribution at work, and I didn't see much difference between a single CPU and 2 CPUs on a dual-core PC. So there is only limited performance improvement from SMP, if any, but SMP doesn't hurt performance much either.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 20:20
by sandervl
Lots of opinions here and little data, so let me add some:
Windows 7 x64 RTM host, 4 GB RAM, 3.1 GHz Core i7 (4 cores, hyperthreaded to 8 threads).
Windows 7 x64 RTM guest, 1 GB RAM, SATA disk (important, as IDE generates a lot of overhead!).
Compile VMM (VBoxVMM.dll, vmmgc.gc, vmmr0.r0). Time measured on the host.
One CPU:
* host: 49s, 47s (kmk -j1 clean debug build)
* guest: 52s, 52s (kmk -j1 clean debug build, nested paging enabled)
* guest: 99s, 98s (kmk -j1 clean debug build, nested paging disabled)
Two CPUs:
* host: 25s, 25s (kmk -j2 clean debug build)
* guest: 38s, 35s (kmk -j2 clean debug build, nested paging enabled)
* guest: 68s, 69s (kmk -j2 clean debug build, nested paging disabled)
Four CPUs:
* host: 14s, 15s (kmk -j4 clean debug build)
* guest: 19s, 20s (kmk -j4 clean debug build, nested paging enabled)
* guest: **s, **s (kmk -j4 clean debug build, nested paging disabled) -> unreliable
There's a performance bottleneck in the non-nested paging case, which I'm investigating.
Keep in mind that this compilation job is rather small, so everything ends up in the disk cache. Will try bigger jobs later.
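For readability, here is the guest-vs-host overhead implied by those numbers (averaging the two runs per row; my own arithmetic, computed with a one-liner):

```shell
# overhead = guest_time / host_time - 1, in percent, per CPU count.
awk 'BEGIN {
  host[1] = (49+47)/2; host[2] = (25+25)/2; host[4] = (14+15)/2
  np[1]   = (52+52)/2; np[2]   = (38+35)/2; np[4]   = (19+20)/2   # nested paging on
  nonp[1] = (99+98)/2; nonp[2] = (68+69)/2                        # nested paging off
  for (c in np)   printf "%s CPU(s), NP on:  +%.0f%%\n", c, (np[c]   / host[c] - 1) * 100
  for (c in nonp) printf "%s CPU(s), NP off: +%.0f%%\n", c, (nonp[c] / host[c] - 1) * 100
}'
```

With nested paging on, the 1-CPU case stays within about 8% of native, while disabling it roughly doubles the compile times.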
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 21:02
by sej7278
sandervl wrote:
Four CPUs:
* host: 14s, 15s (kmk -j4 clean debug build)
* guest: 19s, 20s (kmk -j4 clean debug build, nested paging enabled)
* guest: **s, **s (kmk -j4 clean debug build, nested paging disabled) -> unreliable
There's a performance bottleneck in the non-nested paging case, which I'm investigating.
yeah, i guess that's what i'm seeing, as i only have a core2quad so no nested paging. i still think there's a good margin for speeding up i/o somehow though.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 21:06
by vbox4me2
sandervl wrote:One CPU:
* host: 49s, 47s (kmk -j1 clean debug build)
* guest: 52s, 52s (kmk -j1 clean debug build, nested paging enabled)
* guest: 99s, 98s (kmk -j1 clean debug build, nested paging disabled)
Can I/we assume that SMP mode with 1 CPU is not the same as non-SMP (pre-v3), where 'one' single core would spread its load between cores? I'd really like to see some benchmark comparisons between pre-SMP and SMP with 1 core.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 22:19
by ezjd
ezjd wrote:According to my benchmark at
http://ezjd.blogspot.com/2009/08/virtua ... albox.html, 2 virtual CPUs help improve compiling on a dual-core machine. I don't have a way to measure how much the guest clock is slowed down, but I did notice that the test finished earlier than with a single CPU. Maybe I should have used a stopwatch.
I think the bottleneck for SMP performance right now is I/O, as you can see in the tests above. A slowed-down guest clock won't make it take "longer" to finish a test. I build a whole embedded Linux distribution at work, and I didn't see much difference between a single CPU and 2 CPUs on a dual-core PC. So there is only limited performance improvement from SMP, if any, but SMP doesn't hurt performance much either.
My test was done with nested paging on all the time. I was using the OSE version, so no SATA was available. I did test using SATA, but I didn't see much difference from IDE in the I/O test.
Does the host/guest OS matter? I did everything in Linux (Ubuntu 9.04). The only difference I can tell is that with SMP enabled, XP schedules the VBox process differently than Linux, because the guest easily eats up all cores when doing heavy tasks on an XP host.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 23:27
by sandervl
ezjd wrote:My test was done with nested paging on all the time. I was using the OSE version, so no SATA was available. I did test using SATA, but I didn't see much difference from IDE in the I/O test.
Nested paging is only supported on Intel Core i7 CPUs (for AMD, on any post-Barcelona model). The VirtualBox GUI can't yet tell you whether your CPU supports it, but a running VM will show it when you hover the mouse over the CPU icon in the VM toolbar.
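For reference, these settings can also be inspected and changed from the command line (a sketch; "MyVM" is a placeholder VM name, and the VM must be powered off for modifyvm):

```shell
# Show the host's CPU and virtualization capabilities as VirtualBox sees them:
VBoxManage list hostinfo

# Toggle nested paging and the virtual CPU count for a powered-off VM:
VBoxManage modifyvm "MyVM" --nestedpaging on
VBoxManage modifyvm "MyVM" --cpus 2
```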
ezjd wrote:Does the host/guest OS matter? I did everything in Linux (Ubuntu 9.04). The only difference I can tell is that with SMP enabled, XP schedules the VBox process differently than Linux, because the guest easily eats up all cores when doing heavy tasks on an XP host.
The guest can matter. The host shouldn't really.
Re: VirtualBox benchmarked
Posted: 18. Aug 2009, 23:30
by sandervl
vbox4me2 wrote:
Can I/we assume that SMP mode with 1 CPU is not the same as non-SMP (pre-v3), where 'one' single core would spread its load between cores? I'd really like to see some benchmark comparisons between pre-SMP and SMP with 1 core.
There is no difference between non-smp in version 3 and older versions.
Re: VirtualBox benchmarked
Posted: 20. Aug 2009, 22:04
by ezjd
sandervl wrote:The guest can matter. The host shouldn't really.
Probably I should learn more about scheduling before asking questions, but my question is: why does a single-CPU VM only use up to 100% of one core on a dual-core system, OR up to 50% of both cores at the same time? And on a single-core system, why does it only use up to 50% of the CPU? If no artificial limit is applied, I feel it is related to the host scheduler.
Another question is how a guest process is mapped to a host process. I think a guest process will be 'scheduled' twice if the mapping is not one-to-one. Actually, on the host, I don't see many more processes created when running a VM, so it sounds like an all-to-one mapping, or the new VM processes are 'hidden'?
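On the mapping question: as far as I know, each VM is a single host process, and every virtual CPU runs as a thread (an emulation thread) inside it, so the host scheduler schedules threads rather than extra processes, which is why no new processes appear. On a Linux host this can be checked with (illustrative; the process name may differ per setup):

```shell
# -L lists threads: a 2-vCPU VM shows up as one VirtualBox process with
# several threads (vCPU emulation threads plus I/O helper threads).
# The [V] bracket trick keeps grep from matching its own command line.
ps -eLf | grep "[V]irtualBox"
```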