Virtual partition vs. raw harddisk access and effects on performance

Discussions related to using VirtualBox on Linux hosts.
frankvw
Posts: 4
Joined: 9. Jul 2019, 14:22

Post by frankvw »

On my previous Linux 12.04 host I ran Windows XP as a guest OS, using VB's standard virtual harddisk mechanism (i.e. the virtualized harddisk lives in a file on the host OS filesystem). On my new Linux 18.04 host I'm now running Windows 10 as a guest OS, using raw harddisk access for the guest. In other words, I have a separate NTFS partition on my harddisk (separate from the ext4 partitions used by the Linux host OS) which W10 accesses directly rather than as a virtualized disk living in a file on the Linux file system. In the Windows Task Manager I see that Disk 0 (C:) is labelled "VBOX HARDDISK".

I had expected harddisk performance to be better this way than what I had seen previously. It's not.

Instead I see that the heavy harddisk activity generated by Windows 10 significantly slows down the host OS as soon as the latter attempts to perform harddisk I/O. I suspect this is because I have two different OSes trying to access the same physical disk, each unaware of the other.

I now realize that, when using standard virtualized harddisks, the physical I/O applies to a file on the host OS (in this case Linux). That means Linux is the only OS involved in accessing the physical harddisk, and it is therefore able to optimize the various I/O operations to achieve optimum performance and minimal wasted head motion.

My question: when using raw harddisk access for the guest OS, is the host OS still aware of the guest OS harddisk I/O, and is it still capable of optimizing it? In other words, does "raw" access from the guest environment still go through an optimizing abstraction layer in the host OS, so that the host OS is aware of the guest's physical harddisk I/O and puts it through the same optimization mechanisms as used by the host for regular harddisk access? The fact that in the Windows environment the harddisk is labelled "VBOX HARDDISK" suggests this might be the case; however, this type of harddisk access being known as "raw" suggests the opposite.

Or have I shot myself in the foot from a performance standpoint by having the guest OS performing raw disk access on the same physical device as the host OS is using, rather than allow the host OS to optimize the guest's I/O by treating it as regular file I/O on the host filesystem?
socratis
Site Moderator
Posts: 27329
Joined: 22. Oct 2010, 11:03
Primary OS: Mac OS X other
VBox Version: PUEL
Guest OSses: Win(*>98), Linux*, OSX>10.5
Location: Greece

Re: Virtual partition vs. raw harddisk access and effects on performance

Post by socratis »

frankvw wrote:I had expected harddisk performance to be better this way than what I had seen previously
Why? You're missing the host's I/O cache...
frankvw wrote:Instead I see that the significant amount of harddisk activity generated by Windows 10
That's what Win10 does, even in real life. My physical Win10 can take even up to an hour with the disk activity pinned at 100%. Then it settles. Usually it's the Defender, the Optimization, the .NET, the Installer, everyone and their cousin. Check what's going on inside your guest, and who's doing what...
frankvw wrote:In other words, does "raw" access from the guest environment still go through an optimizing abstraction layer in the host OS in order to make the host OS aware of the guest's physical harddisk I/O and put it through the same optimization mechanisms as use by the host I/O for regular harddisk access?
My understanding is that "No, it doesn't". It has full rein over the hard disk and no optimization/cache options. There may be some low-level calls involved where the host OS gets called, but that would be at a disk/sector level, not at a file-optimized reading level.
frankvw wrote:Or have I shot myself in the foot from a performance standpoint by having the guest OS performing raw disk access on the same physical device as the host OS is using, rather than allow the host OS to optimize the guest's I/O by treating it as regular file I/O on the host filesystem?
Could be. I haven't done any benchmarks, nor do I know or remember of anyone who has.
Do NOT send me Personal Messages (PMs) for troubleshooting, they are simply deleted.
Do NOT reply with the "QUOTE" button, please use the "POST REPLY", at the bottom of the form.
If you obfuscate any information requested, I will obfuscate my response. These are virtual UUIDs, not real ones.

Post by frankvw »

socratis wrote:
frankvw wrote:I had expected harddisk performance to be better this way than what I had seen previously
Why? You're missing the host's I/O cache...
But instead of going through an additional virtualization layer and having to deal with two file systems, the I/O now goes straight to and from the physical partition. Or at least that was my thought.

Windows 10's atrocious disk I/O load aside (on which I agree with you), I came across a section in the user guide that says:
With VirtualBox, this type of access is called “raw hard disk access”; it allows a guest operating system to access its virtual hard disk much more quickly than with disk images, since data does not have to pass through two file systems (the one in the guest and the one on the host).
However, I have now come across several benchmarks that indicate that isn't true.

So it looks like I have indeed shot myself in the foot. I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
mpack
Site Moderator
Posts: 39134
Joined: 4. Sep 2008, 17:09
Primary OS: MS Windows 10
VBox Version: PUEL
Guest OSses: Mostly XP

Post by mpack »

frankvw wrote:
With VirtualBox, this type of access is called “raw hard disk access”; it allows a guest operating system to access its virtual hard disk much more quickly than with disk images, since data does not have to pass through two file systems (the one in the guest and the one on the host).
Maybe on some obscure system that might once have been true. I don't understand how, though. No benchmark citations, I'll bet.

In answer to your question: "raw disk" access is still just as virtual as ever. It is raw in the sense of bypassing the host filesystem and host partitions. Put simply it calls the IOCTL level functions instead of the filesystem API. It does NOT mean direct hardware access (how could it??), and it does NOT equate to fast. On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower.

The fact that the guest OS might implement its own DMA based caching does not mean that you have no problem: all I/O in a VM is (by definition) simulated using the host CPU, so DMA is not real DMA.

Post by socratis »

frankvw wrote:I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
  • Use Disk2VHD, a Microsoft tool, to save your physical disk in the VHD format. Save it to an external drive that's big enough, or to your Linux partition if you have the space.
  • Convert the VHD to VDI, which is native to VirtualBox. You can either read ch. 8.24 (VBoxManage clonemedium) or, even easier, use the nice CloneVDI utility by 'mpack' to do so.
  • Create a new VM as close to the original as possible. When prompted for a disk, choose the VDI you just created.
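As a rough sketch of the conversion step in the list above, using the same clonemedium syntax quoted elsewhere in this thread: the file names win10.vhd and win10.vdi are hypothetical placeholders, and the DRY_RUN switch is just an illustration device so the command can be inspected before anything is written to disk.

```shell
#!/bin/sh
# Sketch only: convert a Disk2VHD image to VDI with VBoxManage.
# win10.vhd and win10.vdi are hypothetical file names.

convert_vhd_to_vdi() {
    src=$1
    dst=$2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command so it can be checked first.
        printf 'VBoxManage clonemedium "%s" "%s" --format VDI\n' "$src" "$dst"
    else
        # DRY_RUN=0: actually run the conversion.
        VBoxManage clonemedium "$src" "$dst" --format VDI
    fi
}

convert_vhd_to_vdi "win10.vhd" "win10.vdi"
```

Run it once as is to see the command, then again with DRY_RUN=0 once the paths look right. Make sure the target filesystem has room for the new image first.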
mpack wrote:On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower.
Thanks Don for putting in "scientific terms" what I was 80% sure about... ;)

Post by socratis »

frankvw wrote:I'll see if there is a quick and relatively painless way to migrate back to the standard VDI (file based) storage.
Another way, which a developer just told me about and which I was not aware of, is that cloning a rawdisk VMDK would also work! That is, use:
    VBoxManage clonemedium "<YourVMDK>" "<YourNewVDI>" --format VDI
That should create a VDI that reflects the contents of your partition. I've not tried it myself, I was just told that this is a possibility, which actually makes a lot of sense...

Post by frankvw »

mpack wrote:"raw disk" access is still just as virtual as ever. It is raw in the sense of bypassing the host filesystem and host partitions. Put simply it calls the IOCTL level functions instead of the filesystem API. It does NOT mean direct hardware access (how could it??), and it does NOT equate to fast. On the contrary, on any modern OS that has a DMA driven caching scheme supporting the main filesystem, you can expect raw disk access (i.e. sector level basic CPU controlled access) to be substantially slower. The fact that the guest OS might implement its own DMA based caching does not mean that you have no problem: all I/O in a VM is (by definition) simulated using the host CPU, so DMA is not real DMA.
Thank you for clearing that up! This is exactly what I wanted to know, and it is consistent with my perception (not having run any decent benchmarks) on my current system. I'll migrate back to VDI.

Post by frankvw »

socratis wrote:Another way, which a developer just told me about and which I was not aware of, is that cloning a rawdisk VMDK would also work! That is, use:
    VBoxManage clonemedium "<YourVMDK>" "<YourNewVDI>" --format VDI
That should create a VDI that reflects the contents of your partition. I've not tried it myself, I was just told that this is a possibility, which actually makes a lot of sense...
I'm going to give that a try tonight. I've got enough room on my host FS so this should be fairly risk-free. Thank you, sir!

Post by mpack »

Well, it would certainly work, but it's a pretty horrible way to clone a Windows drive. It would create a fully "dumb" image, where a 500GB drive creates a 500GB image - no smart discarding of unused sectors or pagefile.

I would rather run Disk2VHD from inside the VM.
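If a plain sector-for-sector clone has already been made, one commonly used way to shrink it afterwards is to zero the free space inside the guest and then compact the VDI from the host. The sketch below covers only the host-side step. It assumes the zeroing has already been done in the guest (on Windows, for example, with Sysinternals' sdelete -z), win10.vdi is a hypothetical file name, and nobody in this thread has tested it. Compaction only reclaims zeroed blocks, so it will not discard the pagefile the way a smart copier does.

```shell
#!/bin/sh
# Sketch only: compact a dynamically allocated VDI after the guest's
# free space has been zeroed. win10.vdi is a hypothetical file name.

compact_vdi() {
    vdi=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: print the command so it can be checked first.
        printf 'VBoxManage modifymedium disk "%s" --compact\n' "$vdi"
    else
        # DRY_RUN=0: actually compact the image (VM must be powered off).
        VBoxManage modifymedium disk "$vdi" --compact
    fi
}

compact_vdi "win10.vdi"
```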

Post by socratis »

Sure, for Windows systems. But it's certainly a "trick" that I didn't know was available for a rawdisk VMDK. Plus it's cross-platform, so it may have its uses for other OSes that don't speak Windoze... ;)

Post by mpack »

There are bound to be smart equivalents of Disk2VHD for other OSes; I expect even Clonezilla has that feature these days. It shouldn't ever be necessary to fall back on dumb copiers.