Raw disk I/O benchmark - qemu-kvm vs vbox

comps
Posts: 1
Joined: 7. Jan 2012, 01:50
Primary OS: Debian other
VBox Version: PUEL
Guest OSses: Linux 2.6, MS Windows

Raw disk I/O benchmark - qemu-kvm vs vbox

Post by comps »

To support the suggestion of implementing virtio-blk in VirtualBox, I hereby present a disk I/O benchmark showing that virtio can be significantly faster for random I/O patterns. All the details are provided in the benchmark report below.

KVM versus VirtualBox raw disk I/O performance benchmark, January 2012
======================================================================

This is a test report of storage speed of KVM (qemu-kvm) and VirtualBox
using the following versions:

        qemu-kvm version 1.0
                from git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
        VirtualBox version 4.1.8r75467 installed using a .sh installer
                from https://www.virtualbox.org/wiki/Linux_Downloads

(qemu-kvm was compiled using default CFLAGS, i.e. no specific optimizations)

The host kernel was a vanilla Linux 3.2.0 with a custom config (the
customizations are unrelated to this benchmark).
Linux distribution used was Debian GNU/Linux 6.0.3 running on
        CPU: Intel i5-2500K, Sandy Bridge
        RAM: 2x 4GB Kingston HyperX DDR3 CL9 (KHX1333C9D3B1) = 8GB dual channel
        other hardware is unrelated

The actual 4 GiB testing drive was placed into tmpfs, therefore all
benchmark-related files / disks were stored in host's RAM.
The host had no configured swap space.

The environment used for running the virtual machines was Xorg 1.7.7 and
fluxbox 1.1.1, both from the default Debian 6.0 repository. The host itself was
not doing anything else at the time of the test (i.e. no additional load, pure
X+fluxbox, most daemons / kernel modules disabled). All tools were run as root
on the host.

The VM guest was originally intended to be a minimal custom configuration of
OpenWrt, composed using their "Image Generator" (10.03.1, "x86_generic");
however, virtio support is not available for that particular image, so the
"SystemRescueCD" live distribution (version 2.4.1) was chosen instead.
This live distribution was custom-modified to allow serial console interaction,
boot the "altker32" (3.1.5) kernel by default and start in single-user mode
(runlevel 1) to minimize memory usage.


Benchmark background
====================

The goal of this benchmark was to measure both sequential and random I/O speeds
of both qemu-kvm and VirtualBox and compare them. The technologies in question
were IDE, AHCI, SCSI and virtio. Shared Folders (VirtualBox) and 9p virtfs (KVM)
were not tested (because kernel headers are not available for sysrescd).
The specific goal behind this benchmark was to find out whether the VirtualBox
implementation of storage backends can match the performance of the virtio-blk
backend used by qemu-kvm (KVM).


Testing specifics
=================

Serial console interaction was used for actual testing, to avoid SDL/graphic
overhead and simulate a generic server application.

The guest was tuned for maximum benchmark accuracy, that is: start up with 100M
of RAM (to allow initrd extraction), then fill most of that space with tmpfs
to minimize guest-side caching during the benchmark process:
        mkdir fill/
        mount -o size=100% -t tmpfs tmpfs fill/
        dd if=/dev/zero of=fill/ramfill bs=1M count=50
(which leaves about 22MB "free", enough for "dd" with 16MB blocks)

The actual sysrescd (guest OS) ISO file was placed into tmpfs to eliminate HDD
I/O during command/lib load inside the guest. An IDE interface was chosen for
the guest cdrom drive.


General benchmark setup
=======================

For both sequential and random I/O benchmarks, a 4 GiB zero-filled file
was created in tmpfs using:
        dd if=/dev/zero of=/tmp/bigfile bs=1G count=4
        losetup /dev/loop1 /tmp/bigfile
(the loop overhead is there because VirtualBox is unable to use raw files)

For direct disk access from VirtualBox, a disk.vmdk file was created using:
        VBoxManage internalcommands createrawvmdk \
                -filename disk.vmdk -rawdisk /dev/loop1
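The report does not show how disk.vmdk was then attached to the VM; with stock
VBoxManage it would presumably be done along these lines (the VM and controller
names here are made up for illustration):
        VBoxManage storagectl "benchvm" --name "SATA" --add sata
        VBoxManage storageattach "benchvm" --storagectl "SATA" \
                --port 0 --device 0 --type hdd --medium disk.vmdk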

Both qemu-kvm and VirtualBox were optimized (using command-line flags /
settings) for maximum raw performance as much as possible (e.g. "IO APIC" was
disabled in VirtualBox, since a hint in the GUI suggests it costs performance).
Any guest-side addons / additions were NOT INSTALLED, since they do not help
raw disk I/O performance.
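For reference, the "IO APIC" setting mentioned above can also be changed from
the command line; a hypothetical equivalent of the GUI checkbox (VM name made
up) is:
        VBoxManage modifyvm "benchvm" --ioapic off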

Please note that, for the sake of accuracy, "Use host I/O cache" was disabled
in VirtualBox and the equivalent "cache=none" option was used for qemu-kvm.
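In VirtualBox the host I/O cache is a per-controller setting, so switching it
off from the command line would look something like this (VM and controller
names again made up):
        VBoxManage storagectl "benchvm" --name "SATA" --hostiocache off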

As for guest parameters, the 100MB of RAM was already mentioned. Note that this
is "100MB" as reported by the VirtualBox GUI, which the guest sees as 93057024
bytes; with qemu-kvm and -m 100M the guest sees 93110272 bytes (a difference of
about 52KB).
Only 1 CPU core was used for the guest.

Arguments used for qemu-kvm were passed in the "old" (non -device) form,
similar to:
        qemu-kvm -enable-kvm -nographic -boot order=d,menu=off \
                -m 100M -serial pty \
                -cdrom /tmp/myrescd.iso \
                -drive if=ide,media=disk,cache=none,aio=native,file=/dev/loop1
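The report does not list the -drive lines for the other interfaces; with the
same old-style syntax they would presumably differ only in the if= value, e.g.:
        -drive if=scsi,media=disk,cache=none,aio=native,file=/dev/loop1
        -drive if=virtio,media=disk,cache=none,aio=native,file=/dev/loop1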

And as mentioned earlier, both virtualization tools were run without
graphical output, that is -nographic for qemu-kvm and vboxheadless for VBox.
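A headless VirtualBox start is typically just (VM name made up):
        VBoxHeadless --startvm "benchvm"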


Specific benchmarking tools / ways
==================================

For the actual timing, the "time" command was used and its "real" (wall-clock)
time noted.

For sequential I/O, "dd" tool was used as
        dd if=/dev/sda of=/dev/null bs=16M
        dd if=/dev/zero of=/dev/sda bs=16M
where "/dev/sda" has 4294967296 bytes (4 GiB).
Returned "user" time is the total time of reading / writing 4 GiB of data,
and from that, the hand-calculated speed of "MBs per second" was noted.
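As a worked example of that calculation (the numbers are invented purely for
illustration, not taken from the results below): if reading the whole 4 GiB
disk took 5.6 seconds of "real" time, the noted value would be
4096 MiB / 5.6 s ≈ 731 MiB/s.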

For random I/O, a custom small C program was used. This tool basically does
        1) select a random LBA sector from [0, MAX]
        2) seek to it
        3) read (or write) $blocksize bytes
        4) goto 1
until it is interrupted by a pre-set alarm(), which defaults to 60 seconds.
Two values were reported: the total number of blocks read and the average
number of "blocks read per second" - only the second value was noted.
(MAX = (4 * 1024^3 / 512) - 1, i.e. the index of the last 512-byte sector
of the 4 GiB disk)
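The source of that C program is not included in the report; a minimal sketch of
the described loop, assuming a fixed 512-byte block size, a hard-coded default
device path and pread() doing the seek + read in a single call, could look like
this:
        /* sketch of the described random-seek benchmark; names, defaults
         * and the RNG choice are assumptions, not the original code */
        #include <fcntl.h>
        #include <signal.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        #define SECTOR  512
        #define SECONDS 60                      /* default alarm() timeout */

        static volatile sig_atomic_t done = 0;
        static void on_alarm(int sig) { (void)sig; done = 1; }

        int main(int argc, char **argv)
        {
                const char *dev = (argc > 1) ? argv[1] : "/dev/sda";
                long long max = 4LL * 1024 * 1024 * 1024 / SECTOR - 1;
                long long blocks = 0;
                char buf[SECTOR];

                /* the write variant would use O_WRONLY and pwrite() instead */
                int fd = open(dev, O_RDONLY);
                if (fd < 0) { perror("open"); return 1; }

                signal(SIGALRM, on_alarm);
                alarm(SECONDS);

                while (!done) {
                        /* 1) pick a random LBA in [0, MAX] */
                        long long lba = (long long)((double)rand() / RAND_MAX * max);
                        /* 2) + 3) seek to it and read one block */
                        if (pread(fd, buf, sizeof buf, lba * SECTOR) < 0)
                                break;
                        blocks++;               /* 4) repeat until the alarm fires */
                }

                printf("%lld blocks, %.1f blocks/sec\n",
                       blocks, (double)blocks / SECONDS);
                close(fd);
                return 0;
        }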

Each test was performed 5 times, doing
        echo 3 > /proc/sys/vm/drop_caches
before each pass (although that should have minimal impact due to guest memory
being already saturated thanks to "ramfill" (see above)).
The final value was calculated as the arithmetic mean of those 5 values.

(note: "sync" was not needed as the guest had almost no RAM for writeback)

The above echo was executed on the host as well, just before each VM startup;
the VM was re-started for EACH individual series of 5 passes
(i.e. for each test).


Raw benchmark results
=====================

=== SEQUENTIAL I/O ===

reported READ speeds in MiB/s (first part of the table),
with average host CPU usage across all 4 cores in % (second part):
           |   IDE   |   SCSI  |  VIRTIO |  "SATA" |  "SAS"  
-------------------------------------------------------------
qemu-kvm   |  766.52 |  893.78 | 1097.01 |    -    |    -    
VirtualBox |  547.01 |  773.76 |    -    |  681.48 |  783.92 
-------------------------------------------------------------
qemu-kvm   |      27 |      27 |      28 |    -    |    -    
VirtualBox |      33 |      35 |    -    |      30 |      34 
-------------------------------------------------------------

reported WRITE speeds in MiB/s (first part of the table),
with average host CPU usage across all 4 cores in % (second part):
           |   IDE   |   SCSI  |  VIRTIO |  "SATA" |  "SAS"  
-------------------------------------------------------------
qemu-kvm   |  700.46 |  818.58 |  696.34 |    -    |    -    
VirtualBox |  458.27 |  556.40 |    -    |  574.86 |  554.55 
-------------------------------------------------------------
qemu-kvm   |      26 |      29 |      19 |    -    |    -    
VirtualBox |      32 |      29 |    -    |      29 |      29 
-------------------------------------------------------------

=== RANDOM I/O ===

reported READ seek speeds in seeks/second (first part of the table),
with average host CPU usage across all 4 cores in % (second part):
           |   IDE   |   SCSI  |  VIRTIO |  "SATA" |  "SAS"  
-------------------------------------------------------------
qemu-kvm   |  8015.5 | 14608.3 | 26896.7 |    -    |    -    
VirtualBox |  6722.9 | 12730.8 |    -    | 11313.4 | 12692.0 
-------------------------------------------------------------
qemu-kvm   |      27 |      31 |      31 |    -    |    -    
VirtualBox |      29 |      31 |    -    |      30 |      31 
-------------------------------------------------------------

reported WRITE seek speeds in seeks/second (first part of the table),
with average host CPU usage across all 4 cores in % (second part):
           |   IDE   |   SCSI  |  VIRTIO |  "SATA" |  "SAS"  
-------------------------------------------------------------
qemu-kvm   |  3029.9 |  5478.9 | 10220.2 |    -    |    -    
VirtualBox |  2616.6 |  7330.0 |    -    |  4450.9 |  7285.9 
-------------------------------------------------------------
qemu-kvm   |      27 |      31 |      33 |    -    |    -    
VirtualBox |      28 |      30 |    -    |      29 |      30 
-------------------------------------------------------------


Benchmark summary, conclusion
=============================

In all of these benchmarks, "real" (wall-clock) time was measured rather than
"sys" or "user" time, because the benchmark was intended to express the actual
VM speed, including the CPU time consumed by interface emulation / translation.
CPU usage on the host was mostly "1 core at 100%" (25% of the total), with the
additional load most likely caused by emulation.

Deviations between the 5 runs of each test were not large: about 5-15 MiB/s
for sequential I/O and 30-80 seeks/second for random I/O.

The significant reduction in write seek rates might be caused by tmpfs, which
seems to use 4K blocks for files and thus needs to do a read-modify-write cycle
for each 512-byte write().

As for the conclusion, this benchmarker is not going to say "qemu-kvm is
overall faster than VirtualBox", although that would in fact be true. He is
rather going to say "VirtualBox could benefit from virtio".

It still all boils down to specific use cases - VirtualBox is somewhat slower
in raw I/O performance, however it can make up for it in other areas - be that
Guest Additions, 2D/3D acceleration or something else.
Furthermore, you won't notice the performance difference on a slow magnetic
drive.

This benchmark doesn't claim to be 100% accurate, however its author configured
both virtualization tools to the best of his knowledge.

michaln
Oracle Corporation
Posts: 2973
Joined: 19. Dec 2007, 15:45
Primary OS: MS Windows 7
VBox Version: PUEL
Guest OSses: Any and all

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by michaln »

That benchmark, although quite interesting, sadly has no practical relevance. There are two serious problems which make it an ultimately irrelevant exercise:

1) No customer I know of is using RAM for storing virtual disks. Optimizing for this unrealistic case would be a waste of time.

2) The benchmark in no way simulates a real server load where the guest actually does something with the data (be it a database server, web server, or whatnot).
powerhouse
Posts: 1
Joined: 11. Apr 2012, 01:05

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by powerhouse »

Interesting benchmark. While it may not represent an everyday situation, I am actually going to use a RAM disk for application cache files.

With regard to "normal" HD performance, is it possible to draw conclusions from this benchmark? Or would HD performance be much more influenced by the configuration and driver that one selects? Also, would this benchmark somehow reflect SSD performance?
RandyTate
Posts: 1
Joined: 2. Jan 2013, 21:09

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by RandyTate »

Wow, that's excellent research! Saves me a lot of time, actually. Thank you.


I'm not sure why someone would claim it's not relevant. Maybe in a typical scenario the disks would be the bottleneck for this kind of benchmark.
But even if the disks were the bottleneck, the overhead in vbox is quite apparent, and that does consume resources that might have gone to better use elsewhere, especially in VPS scenarios where the CPU is shared by many users. Hopefully the vbox devs get wind of this and view it as an opportunity for improvement.
Rusty X
Posts: 4
Joined: 25. Jan 2014, 00:06

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by Rusty X »

Interesting results, but could the benchmark be repeated with a real disk instead of RAM, say a SATA3 or M.2 SSD?
That would be way more interesting. Given the fact that virtio-pci doesn't support discard, vbox might actually be a better alternative, if only its I/O performance can be proven to be on par with that of virtio.
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by socratis »

Interesting reincarnation of a 5+ year old, well-buried thread. It's also interesting to have a look at the original commenters. Three of them from 2012 have only one comment (the one in this thread), and the one that's still around (a developer) said:
michaln wrote:No customer I know of is using RAM for storing virtual disks. Optimizing for this unrealistic case would be a waste of time.
or else: "No one is paying for an almost-never-used scenario? Not a priority."
Which I think kind of sums it up nicely.
Rusty X wrote:Interesting results, but could the benchmark be repeated with "XYZ"? That would be way more interesting.
The first benchmark was conducted by a user, I assume because they wanted to test their specific setup. I can only assume you could do the same, right?
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Re: Raw disk I/O benchmark - qemu-kvm vs vbox

Post by mpack »

Given that the OP posted once, 5 years ago, and hasn't been back since, I suspect that he isn't going to be repeating any benchmarks. So, I suggest that you test it yourself. One note: using raw disk reduces performance, it does not increase it.